1124607 Commits

Author SHA1 Message Date
4c78782e2e net/mlx5e: kTLS, Check ICOSQ WQE size in advance
Instead of WARNing in runtime when TLS offload WQEs posted to ICOSQ are
over the hardware limit, check their size before enabling TLS RX
offload, and block the offload if the condition fails. It also allows to
drop a u16 field from struct mlx5e_icosq.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:36:37 -07:00
21a0502d59 net/mlx5e: Use the aligned max TX MPWQE size
TX MPWQE size is limited to the cacheline-aligned maximum. Use the same
value for the stop room and the capability check.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:36:36 -07:00
e3c4c496dc net/mlx5e: Fix a typo in mlx5e_xdp_mpwqe_is_full
Fix a typo in the function name: mpqwe -> mpwqe (stands for multi-packet
work queue element).

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:36:36 -07:00
527918e9cc net/mlx5e: Use mlx5e_stop_room_for_max_wqe where appropriate
mlx5e_alloc_xdpsq calculates sq->stop_room internally, but there is
already a function for that: mlx5e_stop_room_for_max_wqe. This commit
makes use of this function.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:36:36 -07:00
ed5c92ff0f net/mlx5e: Let mlx5e_get_sw_max_sq_mpw_wqebbs accept mdev
To shorten and simplify code, let mlx5e_get_sw_max_sq_mpw_wqebbs accept
mdev and derive max SQ WQEBBs from it. Also rename the function to a
more generic name mlx5e_get_max_sq_aligned_wqebbs, because the following
patches will use it in non-MPWQE contexts.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:36:35 -07:00
44f4fd03b5 net/mlx5e: Validate striding RQ before enabling XDP
Currently, the driver can silently fall back to legacy RQ after enabling
XDP, even if striding RQ was active before. It happens when PAGE_SIZE is
bigger than the maximum supported stride size. This commit changes this
behavior to more straightforward: if an operation (enabling XDP) doesn't
support the current parameters (striding RQ mode), it fails.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:36:35 -07:00
7e49abb1e3 net/mlx5e: Make mlx5e_verify_rx_mpwqe_strides static
mlx5e_verify_rx_mpwqe_strides is only used in en/params.c, so it can be
made static and removed from en/params.h.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:36:35 -07:00
665f29de4c net/mlx5e: Remove unused fields from datapath structs
No need to keep max_sq_wqebbs in mlx5e_txqsq and mlx5e_xdpsq, as it's
only used when allocating the queues. Removing an extra field reduces
the struct size.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:36:34 -07:00
f060ccc2af net/mlx5e: Convert mlx5e_get_max_sq_wqebbs to u8
The return value of mlx5e_get_max_sq_wqebbs is clamped down to
MLX5_SEND_WQE_MAX_WQEBBS = 16, which fits into u8. This commit changes
the return type of this function to u8 for stricter type safety.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:36:34 -07:00
40b72108f9 net/mlx5: Add the log_min_mkey_entity_size capability
Add the capability that will allow the driver to determine the minimal
MTT page size to be able to map the smallest possible pages in XSK. The
older firmwares that don't have this capability default to 12 (i.e.
4096-byte pages).

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:36:33 -07:00
0d5bfebf74 Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Saeed Mahameed says:

====================
updates from mlx5-next 2022-09-24

Updates form mlx5-next including[1]:

1) HW definitions and support for NPPS clock settings.

2) various cleanups

3) Enable hash mode by default for all NICs

4) page tracker and advanced virtualization HW definitions for vfio

[1] https://lore.kernel.org/netdev/20220907233636.388475-1-saeed@kernel.org/

* 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
  net/mlx5: Remove from FPGA IFC file not-needed definitions
  net/mlx5: Remove unused structs
  net/mlx5: Remove unused functions
  net/mlx5: detect and enable bypass port select flow table
  net/mlx5: Lag, enable hash mode by default for all NICs
  net/mlx5: Lag, set active ports if support bypass port select flow table
  RDMA/mlx5: Don't set tx affinity when lag is in hash mode
  net/mlx5: add IFC bits for bypassing port select flow table
  net/mlx5: Add support for NPPS with real time mode
  net/mlx5: Expose NPPS related registers
  net/mlx5: Query ADV_VIRTUALIZATION capabilities
  net/mlx5: Introduce ifc bits for page tracker
  RDMA/mlx5: Move function mlx5_core_query_ib_ppcnt() to mlx5_ib
====================

Link: https://lore.kernel.org/all/20220927201906.234015-1-saeed@kernel.org/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:20:49 -07:00
d4ddeefa64 net: sunhme: Fix undersized zeroing of quattro->happy_meals
Just use kzalloc instead.

Fixes: d6f1e89bdbb8 ("sunhme: Return an ERR_PTR from quattro_pci_find")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Sean Anderson <seanga2@gmail.com>
Link: https://lore.kernel.org/r/20220928004157.279731-1-seanga2@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:18:42 -07:00
f45892f750 net: wwan: iosm: Use skb_put_data() instead of skb_put/memcpy pair
Use skb_put_data() instead of skb_put() and memcpy(), which is clear.

Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com>
Reviewed-by: M Chetan Kumar <m.chetan.kumar@intel.com>
Link: https://lore.kernel.org/r/20220927023254.30342-1-shangxiaojing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:18:03 -07:00
8278ddb161 Merge branch 'rework-resource-allocation-in-felix-dsa-driver'
Vladimir Oltean says:

====================
Rework resource allocation in Felix DSA driver

The Felix DSA driver controls NXP variations of Microchip switches.
Colin Foster is trying to add support in this driver for "genuine"
Microchip hardware, but some of the NXP-isms in this driver need to go
away before that happens cleanly.
https://patchwork.kernel.org/project/netdevbpf/cover/20220926002928.2744638-1-colin.foster@in-advantage.com/

The starting point was Colin's patch 08/14 "net: dsa: felix: update
init_regmap to be string-based", and this continues to be the central
theme here, but things are done differently.

In short (full explanations are in patches), the goal is for MFD-based
switches like Colin's SPI-controlled VSC7512 to be able to request a
regmap that was created 100% externally (by drivers/mfd/ocelot-core.c)
in a very simple way, that does not create dependencies on other
modules. That is dev_get_regmap(), and as input it wants a string, for
the resource name. So we rework the resource allocation in this driver
to be based on string names provided by the specific instantiation (in
Colin's case, ocelot_ext.c).

Patch set was boot-tested on NXP LS1028A.
====================

Link: https://lore.kernel.org/r/20220927191521.1578084-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:15:30 -07:00
1109b97b61 net: dsa: felix: update regmap requests to be string-based
Existing felix DSA drivers (vsc9959, vsc9953) are all switches that were
integrated in NXP SoCs, which makes them a bit unusual compared to the
usual Microchip branded Ocelot switches.

To be precise, looking at
Documentation/devicetree/bindings/net/mscc,vsc7514-switch.yaml, one can
see 21 memory regions for the "switch" node, and these correspond to the
"targets" of the switch IP, which are spread throughout the guts of that
SoC's memory space.

In NXP integrations, those targets still exist, but they were condensed
within a single memory region, with no other peripheral in between them,
so it made more sense for the driver to ioremap the entire memory space
of the switch, and then find the targets within that memory space via
some offsets hardcoded in the driver.

The effect of this design decision is that now, the felix driver expects
hardware instantiations to provide their own resource definitions, which
is kind of odd when considering a typical device (those are retrieved
from 'reg' properties in the device tree, using platform_get_resource()
or similar).

Allow other hardware instantiations that share the felix driver to not
provide a hardcoded array of resources in the future. Instead, make the
common denominator based on which regmaps are created be just the
resource "names". Each instantiation comes with its own array of names
that are mandatory for it, and with an optional array of resources.

So we split the resources in 2 arrays, one is what's requested and the
other is what's provided. There is one pool of provided resources, in
felix->info->resources (of length felix->info->num_resources). There are
2 different ways of requesting a resource. One is by enum ocelot_target
(this handles the global regmaps), and one is by int port (this handles
the per-port ones).

For the existing vsc9959 and vsc9953, it would be a bit stupid to
request something that's not provided, given that the 2 arrays are both
defined in the same place.

The advantage is that we can now modify felix_request_regmap_by_name()
to make felix->info->resources[] optional, and if absent, the
implementation can call dev_get_regmap() and this is something that is
compatible with MFD.

Co-developed-by: Colin Foster <colin.foster@in-advantage.com>
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:15:02 -07:00
044d447a80 net: dsa: felix: use DEFINE_RES_MEM_NAMED for resources
Use less verbose resource definitions in vsc9959 and vsc9953. This also
sets IORESOURCE_MEM in the constant array of resources, so we don't have
to do this from felix_init_structs() - in fact, in the future, we may
even support IORESOURCE_REG resources.

Note that this macro takes start and length as argument, and we had
start and end before. So transform end into length.

While at it, sort the resources according to their offset.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:14:56 -07:00
8f66c64bfc net: dsa: felix: remove felix_info :: init_regmap
It turns out that the idea of having a customizable implementation of a
regmap creation from a resource is not exactly useful. The idea was for
the new MFD-based VSC7512 driver to use something that creates a SPI
regmap from a resource. But there are problems in actually getting those
resources (it involves getting them from MFD).

To avoid all that, we'll be getting resources by name, so this custom
init_regmap() method won't be needed. Remove it.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:14:56 -07:00
1382ba68a0 net: dsa: felix: remove felix_info :: imdio_base
This address is only relevant for the vsc9959, which is a PCIe device
that holds its switch registers in a different PCIe BAR compared to the
registers for the internal MDIO controller.

Hide this aspect from the common felix driver and move the
pci_resource_start() call to the only place that needs it, which is in
vsc9959_mdio_bus_alloc().

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:14:55 -07:00
5fc080de89 net: dsa: felix: remove felix_info :: imdio_res
The imdio_res is used only by vsc9959, which references its own
vsc9959_imdio_res through the common felix_info->imdio_res pointer.
Since the common code doesn't care about this resource (and it can't be
part of the common array of resources, either, because it belongs in a
different PCI BAR), just reference it directly.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:14:55 -07:00
b48b89f9c1 net: drop the weight argument from netif_napi_add
We tell driver developers to always pass NAPI_POLL_WEIGHT
as the weight to netif_napi_add(). This may be confusing
to newcomers, drop the weight argument, those who really
need to tweak the weight can use netif_napi_add_weight().

Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for CAN
Link: https://lore.kernel.org/r/20220927132753.750069-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 18:57:14 -07:00
5456262d2b net: Fix incorrect address comparison when searching for a bind2 bucket
The v6_rcv_saddr and rcv_saddr are inside a union in the
'struct inet_bind2_bucket'.  When searching a bucket by following the
bhash2 hashtable chain, eg. inet_bind2_bucket_match, it is only using
the sk->sk_family and there is no way to check if the inet_bind2_bucket
has a v6 or v4 address in the union.  This leads to an uninit-value
KMSAN report in [0] and also potentially incorrect matches.

This patch fixes it by adding a family member to the inet_bind2_bucket
and then tests 'sk->sk_family != tb->family' before matching
the sk's address to the tb's address.

Cc: Joanne Koong <joannelkoong@gmail.com>
Fixes: 28044fc1d495 ("net: Add a bhash2 table hashed by port and address")
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Tested-by: Alexander Potapenko <glider@google.com>
Link: https://lore.kernel.org/r/20220927002544.3381205-1-kafai@fb.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 18:52:17 -07:00
30b172ee56 Merge branch 'mptcp-mptcp-support-for-tcp_fastopen_connect'
Mat Martineau says:

====================
mptcp: MPTCP support for TCP_FASTOPEN_CONNECT

RFC 8684 appendix B describes how to use TCP Fast Open with MPTCP. This
series allows TFO use with MPTCP using the TCP_FASTOPEN_CONNECT socket
option. The scope here is limited to the initiator of the connection -
support for MSG_FASTOPEN and the listener side of the connection will be
in a separate series. The preexisting TCP fastopen code does most of the
work, so these changes mostly involve plumbing MPTCP through to those
TCP functions.

Patch 1 changes the MPTCP socket option code to pass the
TCP_FASTOPEN_CONNECT option through to the initial unconnected subflow.

Patch 2 exports the existing tcp_sendmsg_fastopen() function from tcp.c

Patch 3 adds the call to tcp_sendmsg_fastopen() from the MPTCP send
function.

Patch 4 modifies mptcp_poll() to handle the deferred TFO connection.
====================

Link: https://lore.kernel.org/r/20220926232739.76317-1-mathew.j.martineau@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 18:52:11 -07:00
a42cf9d182 mptcp: poll allow write call before actual connect
If fastopen is used, poll must allow a first write that will trigger
the SYN+data

Similar to what is done in tcp_poll().

Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Benjamin Hesmans <benjamin.hesmans@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 18:52:03 -07:00
d98a82a6af mptcp: handle defer connect in mptcp_sendmsg
When TCP_FASTOPEN_CONNECT has been set on the socket before a connect,
the defer flag is set and must be handled when sendmsg is called.

This is similar to what is done in tcp_sendmsg_locked().

Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Co-developed-by: Benjamin Hesmans <benjamin.hesmans@tessares.net>
Signed-off-by: Benjamin Hesmans <benjamin.hesmans@tessares.net>
Signed-off-by: Dmytro Shytyi <dmytro@shytyi.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 18:52:03 -07:00
3242abeb8d tcp: export tcp_sendmsg_fastopen
It will be used to support TCP FastOpen with MPTCP in the following
commit.

Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Co-developed-by: Dmytro Shytyi <dmytro@shytyi.net>
Signed-off-by: Dmytro Shytyi <dmytro@shytyi.net>
Signed-off-by: Benjamin Hesmans <benjamin.hesmans@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 18:52:03 -07:00
54635bd047 mptcp: add TCP_FASTOPEN_CONNECT socket option
Set the option for the first subflow only. For the other subflows TFO
can't be used because a mapping would be needed to cover the data in the
SYN.

Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Benjamin Hesmans <benjamin.hesmans@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 18:52:03 -07:00
63a8bf8556 netns: Replace zero-length array with DECLARE_FLEX_ARRAY() helper
Zero-length arrays are deprecated and we are moving towards adopting
C99 flexible-array members, instead. So, replace zero-length arrays
declarations in anonymous union with the new DECLARE_FLEX_ARRAY()
helper macro.

This helper allows for flexible-array members in unions.

Link: https://github.com/KSPP/linux/issues/193
Link: https://github.com/KSPP/linux/issues/225
Link: https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/YzIvfGXxfjdXmIS3@work
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 18:51:47 -07:00
578b054684 Merge branch 'shrink-struct-ubuf_info'
Pavel Begunkov says:

====================
shrink struct ubuf_info

struct ubuf_info is large but not all fields are needed for all
cases. We have limited space in io_uring for it and large ubuf_info
prevents some struct embedding, even though we use only a subset
of the fields. It's also not very clean trying to use this typeless
extra space.

Shrink struct ubuf_info to only necessary fields used in generic paths,
namely ->callback, ->refcnt and ->flags, which take only 16 bytes. And
make MSG_ZEROCOPY and some other users to embed it into a larger struct
ubuf_info_msgzc mimicking the former ubuf_info.

Note, xen/vhost may also have some cleaning on top by creating
new structs containing ubuf_info but with proper types.
====================

Link: https://lore.kernel.org/r/cover.1663892211.git.asml.silence@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 18:51:28 -07:00
e7d2b51016 net: shrink struct ubuf_info
We can benefit from a smaller struct ubuf_info, so leave only mandatory
fields and let users to decide how they want to extend it. Convert
MSG_ZEROCOPY to struct ubuf_info_msgzc and remove duplicated fields.
This reduces the size from 48 bytes to just 16.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 18:51:25 -07:00
dfff202be5 vhost/net: use struct ubuf_info_msgzc
struct ubuf_info will be changed, use ubuf_info_msgzc instead.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 18:51:24 -07:00
b63ca3e822 xen/netback: use struct ubuf_info_msgzc
struct ubuf_info will be changed, use ubuf_info_msgzc instead.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 18:51:24 -07:00
6eaab4dfdd net: introduce struct ubuf_info_msgzc
We're going to split struct ubuf_info and leave there only
mandatory fields. Users are free to extend it. Add struct
ubuf_info_msgzc, which will be an extended version for MSG_ZEROCOPY and
some other users. It duplicates of struct ubuf_info for now and will be
removed in a couple of patches.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 18:51:24 -07:00
929a6cdfae Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next
Florian Westphal says:

====================
netfilter fix for net-next

This is a late bug fix for the *net-next* tree to make nftables
"fib" expression play nice with VRF devices.

This was broken since day 1 (v4.10) so I don't see a compelling reason
to push this via net at the last minute.

* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
  netfilter: nft_fib: Fix for rpath check with VRF devices
====================

Link: https://lore.kernel.org/r/20220928113908.4525-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 09:46:36 -07:00
2a8a7c0eaa netfilter: nft_fib: Fix for rpath check with VRF devices
Analogous to commit b575b24b8eee3 ("netfilter: Fix rpfilter
dropping vrf packets by mistake") but for nftables fib expression:
Add special treatment of VRF devices so that typical reverse path
filtering via 'fib saddr . iif oif' expression works as expected.

Fixes: f6d0cbcf09c50 ("netfilter: nf_tables: add fib expression")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Florian Westphal <fw@strlen.de>
2022-09-28 13:33:26 +02:00
b9a5cbf8ba Merge branch 'sfc-tc-offload'
Edward Cree says:

====================
sfc: bare bones TC offload

This series begins the work of supporting TC flower offload on EF100 NICs.
This is the absolute minimum viable TC implementation to get traffic to
 VFs and allow them to be tested; it supports no match fields besides
 ingress port, no actions besides mirred and drop, and no stats.
More matches, actions, and counters will be added in subsequent patches.

Changed in v2:
 - Add missing 'static' on declarations (kernel test robot, sparse)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-28 09:43:22 +01:00
d902e1a737 sfc: bare bones TC offload on EF100
This is the absolute minimum viable TC implementation to get traffic to
 VFs and allow them to be tested; it supports no match fields besides
 ingress port, no actions besides mirred and drop, and no stats.
Example usage:
    tc filter add dev $PF parent ffff: flower skip_sw \
        action mirred egress mirror dev $VFREP
    tc filter add dev $VFREP parent ffff: flower skip_sw \
        action mirred egress redirect dev $PF
 gives a VF unfiltered access to the network out the physical port ($PF
 acts here as a physical port representor).
More matches, actions, and counters will be added in subsequent patches.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-28 09:43:22 +01:00
7ce3e235f2 sfc: interrogate MAE capabilities at probe time
Different versions of EF100 firmware and FPGA bitstreams support different
 matching capabilities in the Match-Action Engine.  Probe for these at
 start of day; subsequent patches will validate TC offload requests
 against the reported capabilities.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-28 09:43:22 +01:00
f54a28a211 sfc: add a hashtable for offloaded TC rules
Nothing inserts into this table yet, but we have code to remove rules
 on FLOW_CLS_DESTROY or at driver teardown time, in both cases also
 attempting to remove the corresponding hardware rules.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-28 09:43:22 +01:00
7c9d266d8f sfc: optional logging of TC offload errors
TC offload support will involve complex limitations on what matches and
 actions a rule can do, in some cases potentially depending on rules
 already offloaded.  So add an ethtool private flag "log-tc-errors" which
 controls reporting the reasons for un-offloadable TC rules at NETIF_INFO.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-28 09:43:22 +01:00
5b2e12d51b sfc: bind indirect blocks for TC offload on EF100
Bind indirect blocks for recognised tunnel netdevices.
Currently these connect to a stub efx_tc_flower() that only returns
 -EOPNOTSUPP; subsequent patches will implement flower offloads to the
 Match-Action Engine.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-28 09:43:22 +01:00
9dc0cad203 sfc: bind blocks for TC offload on EF100
Bind direct blocks for the MAE-admin PF and each VF representor.
Currently these connect to a stub efx_tc_flower() that only returns
 -EOPNOTSUPP; subsequent patches will implement flower offloads to the
 Match-Action Engine.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-28 09:43:22 +01:00
c87e4ad1d3 net: ethernet: rmnet: Replace zero-length array with DECLARE_FLEX_ARRAY() helper
Zero-length arrays are deprecated and we are moving towards adopting
C99 flexible-array members, instead. So, replace zero-length arrays
declarations in anonymous union with the new DECLARE_FLEX_ARRAY()
helper macro.

This helper allows for flexible-array members in unions.

Link: https://github.com/KSPP/linux/issues/193
Link: https://github.com/KSPP/linux/issues/221
Link: https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-28 09:40:31 +01:00
8fff09effb net: sched: act_bpf: simplify code logic in tcf_bpf_init()
Both is_bpf and is_ebpf are boolean types, so
(!is_bpf && !is_ebpf) || (is_bpf && is_ebpf) can be reduced to
is_bpf == is_ebpf in tcf_bpf_init().

Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-28 09:38:56 +01:00
6a1dc68eb1 Merge branch 'lan966x-qos'
Horatiu Vultur says:

====================
net: lan966x: Add tbf, cbs, ets support

Add support for offloading QoS features with tc command to lan966x.
The offloaded Qos features are tbf, cbs and ets.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-28 09:36:28 +01:00
29aaf3d40e net: lan966x: Add offload support for ets
Add ets qdisc which allows to mix strict priority with bandwidth-sharing
bands. The ets qdisc needs to be attached as root qdisc.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-28 09:36:28 +01:00
21ce14a8e7 net: lan966x: Add offload support for cbs
Lan966x switch supports credit based shaper in hardware according to
IEEE Std 802.1Q-2018 Section 8.6.8.2. Add support for cbs configuration
on egress port of lan966x switch.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-28 09:36:28 +01:00
94644b6d72 net: lan966x: Add offload support for tbf
The tbf qdisc allows to attach a shaper on traffic egress on a port or
on a queue. On port they are attached directly to the root and on queue
they are attached on one of the classes of the parent qdisc.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-28 09:36:28 +01:00
2ae3cb58b9 Merge branch 'tc-testing-qdisc'
Zhengchao Shao says:

====================
net: add tc-testing qdisc test cases

For this patchset, test cases of the qdisc modules are added to the
tc-testing test suite.

Last, thanks to Victor for testing and suggestion.

After a test case is added locally, the test result is as follows:

./tdc.py -c atm
ok 1 7628 - Create ATM with default setting
ok 2 390a - Delete ATM with valid handle
ok 3 32a0 - Show ATM class
ok 4 6310 - Dump ATM stats

./tdc.py -c choke
ok 1 8937 - Create CHOKE with default setting
ok 2 48c0 - Create CHOKE with min packet setting
ok 3 38c1 - Create CHOKE with max packet setting
ok 4 234a - Create CHOKE with ecn setting
ok 5 4380 - Create CHOKE with burst setting
ok 6 48c7 - Delete CHOKE with valid handle
ok 7 4398 - Replace CHOKE with min setting
ok 8 0301 - Change CHOKE with limit setting

./tdc.py -c codel
ok 1 983a - Create CODEL with default setting
ok 2 38aa - Create CODEL with limit packet setting
ok 3 9178 - Create CODEL with target setting
ok 4 78d1 - Create CODEL with interval setting
ok 5 238a - Create CODEL with ecn setting
ok 6 939c - Create CODEL with ce_threshold setting
ok 7 8380 - Delete CODEL with valid handle
ok 8 289c - Replace CODEL with limit setting
ok 9 0648 - Change CODEL with limit setting

./tdc.py -c etf
ok 1 34ba - Create ETF with default setting
ok 2 438f - Create ETF with delta nanos setting
ok 3 9041 - Create ETF with deadline_mode setting
ok 4 9a0c - Create ETF with skip_sock_check setting
ok 5 2093 - Delete ETF with valid handle

./tdc.py -c fq
ok 1 983b - Create FQ with default setting
ok 2 38a1 - Create FQ with limit packet setting
ok 3 0a18 - Create FQ with flow_limit setting
ok 4 2390 - Create FQ with quantum setting
ok 5 845b - Create FQ with initial_quantum setting
ok 6 9398 - Create FQ with maxrate setting
ok 7 342c - Create FQ with nopacing setting
ok 8 6391 - Create FQ with refill_delay setting
ok 9 238b - Create FQ with low_rate_threshold setting
ok 10 7582 - Create FQ with orphan_mask setting
ok 11 4894 - Create FQ with timer_slack setting
ok 12 324c - Create FQ with ce_threshold setting
ok 13 424a - Create FQ with horizon time setting
ok 14 89e1 - Create FQ with horizon_cap setting
ok 15 32e1 - Delete FQ with valid handle
ok 16 49b0 - Replace FQ with limit setting
ok 17 9478 - Change FQ with limit setting

./tdc.py -c gred
ok 1 8942 - Create GRED with default setting
ok 2 5783 - Create GRED with grio setting
ok 3 8a09 - Create GRED with limit setting
ok 4 48cb - Create GRED with ecn setting
ok 5 763a - Change GRED setting
ok 6 8309 - Show GRED class

./tdc.py -c hhf
ok 1 4812 - Create HHF with default setting
ok 2 8a92 - Create HHF with limit setting
ok 3 3491 - Create HHF with quantum setting
ok 4 ba04 - Create HHF with reset_timeout setting
ok 5 4238 - Create HHF with admit_bytes setting
ok 6 839f - Create HHF with evict_timeout setting
ok 7 a044 - Create HHF with non_hh_weight setting
ok 8 32f9 - Change HHF with limit setting
ok 9 385e - Show HHF class

./tdc.py -c pfifo_fast
ok 1 900c - Create pfifo_fast with default setting
ok 2 7470 - Dump pfifo_fast stats
ok 3 b974 - Replace pfifo_fast with different handle
ok 4 3240 - Delete pfifo_fast with valid handle
ok 5 4385 - Delete pfifo_fast with invalid handle

./tdc.py -c plug
ok 1 3289 - Create PLUG with default setting
ok 2 0917 - Create PLUG with block setting
ok 3 483b - Create PLUG with release setting
ok 4 4995 - Create PLUG with release_indefinite setting
ok 5 389c - Create PLUG with limit setting
ok 6 384a - Delete PLUG with valid handle
ok 7 439a - Replace PLUG with limit setting
ok 8 9831 - Change PLUG with limit setting

./tdc.py -c sfb
ok 1 3294 - Create SFB with default setting
ok 2 430a - Create SFB with rehash setting
ok 3 3410 - Create SFB with db setting
ok 4 49a0 - Create SFB with limit setting
ok 5 1241 - Create SFB with max setting
ok 6 3249 - Create SFB with target setting
ok 7 30a9 - Create SFB with increment setting
ok 8 239a - Create SFB with decrement setting
ok 9 9301 - Create SFB with penalty_rate setting
ok 10 2a01 - Create SFB with penalty_burst setting
ok 11 3209 - Change SFB with rehash setting
ok 12 5447 - Show SFB class

./tdc.py -c sfq
ok 1 7482 - Create SFQ with default setting
ok 2 c186 - Create SFQ with limit setting
ok 3 ae23 - Create SFQ with perturb setting
ok 4 a430 - Create SFQ with quantum setting
ok 5 4539 - Create SFQ with divisor setting
ok 6 b089 - Create SFQ with flows setting
ok 7 99a0 - Create SFQ with depth setting
ok 8 7389 - Create SFQ with headdrop setting
ok 9 6472 - Create SFQ with redflowlimit setting
ok 10 8929 - Show SFQ class

./tdc.py -c skbprio
ok 1 283e - Create skbprio with default setting
ok 2 c086 - Create skbprio with limit setting
ok 3 6733 - Change skbprio with limit setting
ok 4 2958 - Show skbprio class

./tdc.py -c taprio
ok 1 ba39 - Add taprio Qdisc to multi-queue device (8 queues)
ok 2 9462 - Add taprio Qdisc with multiple sched-entry
ok 3 8d92 - Add taprio Qdisc with txtime-delay
ok 4 d092 - Delete taprio Qdisc with valid handle
ok 5 8471 - Show taprio class
ok 6 0a85 - Add taprio Qdisc to single-queue device

./tdc.py -c tbf
ok 1 6430 - Create TBF with default setting
ok 2 0518 - Create TBF with mtu setting
ok 3 320a - Create TBF with peakrate setting
ok 4 239b - Create TBF with latency setting
ok 5 c975 - Create TBF with overhead setting
ok 6 948c - Create TBF with linklayer setting
ok 7 3549 - Replace TBF with mtu
ok 8 f948 - Change TBF with latency time
ok 9 2348 - Show TBF class

./tdc.py -c teql
ok 1 84a0 - Create TEQL with default setting
ok 2 7734 - Create TEQL with multiple device
ok 3 34a9 - Delete TEQL with valid handle
ok 4 6289 - Show TEQL stats

---
v3: add config
v2: modify subject prefix
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-28 08:32:55 +01:00
cc62fbe114 selftests/tc-testing: add selftests for teql qdisc
Test 84a0: Create TEQL with default setting
Test 7734: Create TEQL with multiple device
Test 34a9: Delete TEQL with valid handle
Test 6289: Show TEQL stats

Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-28 08:32:55 +01:00
10835be3f0 selftests/tc-testing: add selftests for tbf qdisc
Test 6430: Create TBF with default setting
Test 0518: Create TBF with mtu setting
Test 320a: Create TBF with peakrate setting
Test 239b: Create TBF with latency setting
Test c975: Create TBF with overhead setting
Test 948c: Create TBF with linklayer setting
Test 3549: Replace TBF with mtu
Test f948: Change TBF with latency time
Test 2348: Show TBF class

Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-28 08:32:55 +01:00