DCCP is virtually never used, so no need to use space in struct net for it.
Put the pernet ipv4/v6 socket in the dccp ipv4/ipv6 modules instead.
Signed-off-by: Florian Westphal <fw@strlen.de>
Link: https://lore.kernel.org/r/20210408174502.1625-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
napi_disable() is subject to an hangup, when the threaded
mode is enabled and the napi is under heavy traffic.
If the relevant napi has been scheduled and the napi_disable()
kicks in before the next napi_threaded_wait() completes - so
that the latter quits due to the napi_disable_pending() condition,
the existing code leaves the NAPI_STATE_SCHED bit set and the
napi_disable() loop waiting for such bit will hang.
This patch addresses the issue by dropping the NAPI_STATE_DISABLE
bit test in napi_thread_wait(). The later napi_threaded_poll()
iteration will take care of clearing the NAPI_STATE_SCHED.
This also addresses a related problem reported by Jakub:
before this patch a napi_disable()/napi_enable() pair killed
the napi thread, effectively disabling the threaded mode.
On the patched kernel napi_disable() simply stops scheduling
the relevant thread.
v1 -> v2:
- let the main napi_thread_poll() loop clear the SCHED bit
Reported-by: Jakub Kicinski <kuba@kernel.org>
Fixes: 29863d41bb ("net: implement threaded-able napi poll loop support")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/883923fa22745a9589e8610962b7dc59df09fb1f.1617981844.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
nlh is being checked for validtity two times when it is dereferenced in
this function. Check for validity again when updating the flags through
nlh pointer to make the dereferencing safe.
CC: <stable@vger.kernel.org>
Addresses-Coverity: ("NULL pointer dereference")
Signed-off-by: Muhammad Usama Anjum <musamaanjum@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Proper support for BCM4330 and BMC4334
- Various improvements for firmware download of Intel controllers
- Update management interface revision to 20
- Support for AOSP HCI vendor commands
- Initial Virtio support
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
-----BEGIN PGP SIGNATURE-----
iQJNBAABCAA3FiEE7E6oRXp8w05ovYr/9JCA4xAyCykFAmBvL7UZHGx1aXoudm9u
LmRlbnR6QGludGVsLmNvbQAKCRD0kIDjEDILKU+eD/4qGiwnwQVia4hjnQ5PZ0c/
sBmZvzzZE43RNRvy8gAwcVLypLaiIkEYml4ruVdF6RGoqaDA6kRB7D0tdk8EtZc4
6h72LX0zefnyZNj/Fcl6+dKuq7bSHjdOK0G+oXFmpvYfScPsI6Oq0ntpYOFYwUE0
cDdeZyb/sW4f6L5AXUgLjD9AIjzlBOJEU5HcRL/uY+E/fH1oe757/IhUDerXjTIh
9EhY28mrI7KeoEcJ1NIgvjPnRCxPRLAKgrzc8NXkUsLeNZSIywrwjgRfTvCZ+qQf
g/MmNv+n7yFlUzA7Fu2d8YAgRtgF9nZc45geqOSHqcSZURORSs0CzEoYinqwT7f0
xdMBi/UoSzJsLksgv+OfXOyoQ4lPNJ47pcH/edmT/ZJ/Ai2yTqUUSODHQ3MLC4Dp
kQj34thQvfTf5b6+9HwKmBjlfMK0QGPqWjp6dth2bzos/ugurDP+XimAv0urvFz+
NkES5kCMUs/1z7Yh+Nj1MyU3Wfdwol0wOh+PXBTAw2fxSy6kmKYGyLw7HEdWegB0
K2w/ZS58g0EGK89ZSiqlZKmXtAd22dQqS6JlZje+9YTtEtRWu+rAF1dddS8vJDjl
Rq3VSQ1B6Qw0rdGpyfxdgm/sWwq5S1YgQQcAZgVaeo9ZiIhfscVD5v5L0Ol6k9ms
3eCK/2brsX0IaV61eVnoOw==
=jiCz
-----END PGP SIGNATURE-----
Merge tag 'for-net-next-2021-04-08' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Luiz Augusto von Dentz says:
====================
bluetooth-next pull request for net-next:
- Proper support for BCM4330 and BMC4334
- Various improvements for firmware download of Intel controllers
- Update management interface revision to 20
- Support for AOSP HCI vendor commands
- Initial Virtio support
====================
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann says:
====================
pull-request: bpf 2021-04-08
The following pull-request contains BPF updates for your *net* tree.
We've added 4 non-merge commits during the last 2 day(s) which contain
a total of 4 files changed, 31 insertions(+), 10 deletions(-).
The main changes are:
1) Validate and reject invalid JIT branch displacements, from Piotr Krysiuk.
2) Fix incorrect unhash restore as well as fwd_alloc memory accounting in
sock map, from John Fastabend.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
* S1G beacon validation
* potential leak in nl80211
* fast-RX confusion with 4-addr mode
* erroneous WARN_ON that userspace can trigger
* wrong time units in virt_wifi
* rfkill userspace API breakage
* TXQ AC confusing that led to traffic stopped forever
* connection monitoring time after/before confusion
* netlink beacon head validation buffer overrun
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAmBvFsAACgkQB8qZga/f
l8QJRA//f7HTL6lgRMck229bKoKUO9ICRGifYdAZfIwFBzNZguBPwxiUeIZ0rFWf
LDJgee+QsKv7YkVoUXYX0K7vA9DXbadcDWgQqB8pJrxjpibvxUtF2fA0YPEftrLu
1cCl9dgtJHaZF+SZKbEEUpDz19gZtSMTRdbJXLyhGWcZ+itXX6cFoa8hCWMxwWVi
cftZJl1lXqJVzJTr/4/lENt51dwvha/z4jZodnTl2PZjOFrho4/L5UbJ7FlTmbg4
FiI9fpI+VCd6VJO/o5g8v6i7QMI0yJ1eMjuFYcOmDzagJqxD6diU31U/hpi94DGI
Fw2RCWTMk2S8ol6SszvyIvt5GD6YM28MFJo69uBd6U30QjynvxDbHBqzpbd23Sqe
34QlHOfn+WbMK1DOAAEeTrYeRsm9dem9DSkAW0uf9YrTQF0E3HyJAMX6Zgd1h3Ic
lWk3f/h9GiD4u/cK9vhISylu5e12F84sYND1Nk5yeMN/ihzyWAQGQiQN7pS2crvX
JBzv2obMqiUeW6LPtaiPzCBrkdTnQ8kgHwuuagmKhLwF2vDVyQJEitmeVGHl/FdV
O+89KodIVGHXd17cK9ddRieg8Fp8ecKCqxv/+aNN7Rsda/6mQPGTaVSFk/++FI6Q
h8NMWi5qVqHShYWcF8Q1FMdPwBOeW/pZs5iqKYhp+60qTGDc9LU=
=rGEc
-----END PGP SIGNATURE-----
Merge tag 'mac80211-for-net-2021-04-08.2' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes berg says:
====================
Various small fixes:
* S1G beacon validation
* potential leak in nl80211
* fast-RX confusion with 4-addr mode
* erroneous WARN_ON that userspace can trigger
* wrong time units in virt_wifi
* rfkill userspace API breakage
* TXQ AC confusing that led to traffic stopped forever
* connection monitoring time after/before confusion
* netlink beacon head validation buffer overrun
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
- for kerneldoc in batadv_priv, by Linus Luessing
- drop unused header preempt.h, by Sven Eckelmann
- Fix misspelled "wont", by Sven Eckelmann
-----BEGIN PGP SIGNATURE-----
iQJKBAABCgA0FiEE1ilQI7G+y+fdhnrfoSvjmEKSnqEFAmBu7mIWHHN3QHNpbW9u
d3VuZGVybGljaC5kZQAKCRChK+OYQpKeoTwhEACATSUDT505T88n3BFqXRhX3/gV
F40bFT+Q0SSo5a5iJFBY2KgSJF+cAJ4L16F+CCYhfI0Ju7lRwFqEg/J6wDRG/MpH
Cjb4Prn2N0nD/YeR3bPbUiyiSUDlDVmjDXqaQ03FghzzbXrONyVaT5xBsHb1p7CV
cCyMJIPzM9FjM8qUXTh10MmHt0LvYIL/2JKp0W1FlOHST3GIkmX32S5CAhOZQD6N
gYUCZ+bQqEox5TRYw0PajZIZXYs8xuPgcRNAzGgESAlewlWC8xUy2L6wOfAF6v3J
89dT+1maUxv8l+fftBChSOvKx7tHvpJ2W8XPLKtZNeO9qDFUtZ0EKMCGEMtgvDEd
peLSOCGCp0/75rC+Z8NocUaQqjFicLbmZo5s57bi368S0BOmTO+NBYZM2ftyDnfz
QpZenbzDi9TawcVmSF7v+QYoMZJkr3qYQ+/nH4DKJN4u3FdfRDp3bs7s64qDj9er
s4ufATNsh7ybwQl6BjRyeanG3j8hYywZObnydr26iVol6Nafo45CYFq1z8SzU7EI
1/10eEwxgNUO3HFm+8vCvpkUvDu3W10XROLvd28rQuzG5+bMPnYQ2ZlUstRASftb
XlNH2JPez/ZUKmftpbqOiQN1UDQ+h8RnPb7dMmKnn6gp5KrJ7yvXKU9C7tz/BYh+
546TfEVyyPTmSlawYw==
=dg37
-----END PGP SIGNATURE-----
Merge tag 'batadv-next-pullrequest-20210408' of git://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
This cleanup patchset includes the following patches:
- for kerneldoc in batadv_priv, by Linus Luessing
- drop unused header preempt.h, by Sven Eckelmann
- Fix misspelled "wont", by Sven Eckelmann
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Setting iftoken can fail for several different reasons but there
and there was no report to user as to the cause. Add netlink
extended errors to the processing of the request.
This requires adding additional argument through rtnl_af_ops
set_link_af callback.
Reported-by: Hongren Zheng <li@zenithal.me>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
With recent changes that separated action module load from action
initialization tcf_action_init() function error handling code was modified
to manually release the loaded modules if loading/initialization of any
further action in same batch failed. For the case when all modules
successfully loaded and some of the actions were initialized before one of
them failed in init handler. In this case for all previous actions the
module will be released twice by the error handler: First time by the loop
that manually calls module_put() for all ops, and second time by the action
destroy code that puts the module after destroying the action.
Reproduction:
$ sudo tc actions add action simple sdata \"2\" index 2
$ sudo tc actions add action simple sdata \"1\" index 1 \
action simple sdata \"2\" index 2
RTNETLINK answers: File exists
We have an error talking to the kernel
$ sudo tc actions ls action simple
total acts 1
action order 0: Simple <"2">
index 2 ref 1 bind 0
$ sudo tc actions flush action simple
$ sudo tc actions ls action simple
$ sudo tc actions add action simple sdata \"2\" index 2
Error: Failed to load TC action module.
We have an error talking to the kernel
$ lsmod | grep simple
act_simple 20480 -1
Fix the issue by modifying module reference counting handling in action
initialization code:
- Get module reference in tcf_idr_create() and put it in tcf_idr_release()
instead of taking over the reference held by the caller.
- Modify users of tcf_action_init_1() to always release the module
reference which they obtain before calling init function instead of
assuming that created action takes over the reference.
- Finally, modify tcf_action_init_1() to not release the module reference
when overwriting existing action as this is no longer necessary since both
upper and lower layers obtain and manage their own module references
independently.
Fixes: d349f99768 ("net_sched: fix RTNL deadlock again caused by request_module()")
Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Action init code increments reference counter when it changes an action.
This is the desired behavior for cls API which needs to obtain action
reference for every classifier that points to action. However, act API just
needs to change the action and releases the reference before returning.
This sequence breaks when the requested action doesn't exist, which causes
act API init code to create new action with specified index, but action is
still released before returning and is deleted (unless it was referenced
concurrently by cls API).
Reproduction:
$ sudo tc actions ls action gact
$ sudo tc actions change action gact drop index 1
$ sudo tc actions ls action gact
Extend tcf_action_init() to accept 'init_res' array and initialize it with
action->ops->init() result. In tcf_action_add() remove pointers to created
actions from actions array before passing it to tcf_action_put_many().
Fixes: cae422f379 ("net: sched: use reference counting action init")
Reported-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 6855e8213e.
Following commit in series fixes the issue without introducing regression
in error rollback of tcf_action_destroy().
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the beacon head attribute (NL80211_ATTR_BEACON_HEAD)
is too short to even contain the frame control field,
we access uninitialized data beyond the buffer. Fix this
by checking the minimal required size first. We used to
do this until S1G support was added, where the fixed
data portion has a different size.
Reported-and-tested-by: syzbot+72b99dcf4607e8c770f3@syzkaller.appspotmail.com
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Fixes: 1d47f1198d ("nl80211: correctly validate S1G beacon head")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Link: https://lore.kernel.org/r/20210408154518.d9b06d39b4ee.Iff908997b2a4067e8d456b3cb96cab9771d252b8@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
If any of the cipher schemes specified by the driver are invalid, bail
out and fail the registration rather than just warning. Otherwise, we
might later crash when we try to use the invalid cipher scheme, e.g.
if the hdr_len is (significantly) less than the pn_offs + pn_len, we'd
have an out-of-bounds access in RX validation.
Fixes: 2475b1cc0d ("mac80211: add generic cipher scheme support")
Link: https://lore.kernel.org/r/20210408143149.38a3a13a1b19.I6b7f5790fa0958ed8049cf02ac2a535c61e9bc96@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
After channel switch, we should consider any beacon with a
CSA IE as a new switch. If the CSA IE is a leftover from
before the switch that the AP forgot to remove, we'll get
a CSA-to-Self.
This caused issues in iwlwifi where the firmware saw a beacon
with a CSA-to-Self with mode = 1 on the new channel after a
switch. The firmware considered this a new switch and closed
its queues. Since the beacon didn't change between before and
after the switch, we wouldn't handle it (the CRC is the same)
and we wouldn't let the firmware open its queues again or
disconnect if the CSA IE stays for too long.
Clear the CRC valid state after we switch to make sure that
we handle the beacon and handle the CSA IE as required.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Link: https://lore.kernel.org/r/20210408143124.b9e68aa98304.I465afb55ca2c7d59f7bf610c6046a1fd732b4c28@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Some drivers, for example mt76, use the skb priority field, and
expects that to be consistent with the skb queue mapping. On some
frame injection code paths that was not true, and it broke frame
injection. Now the skb queue mapping is set according to the skb
priority value when the frame is injected. The skb priority value
is also derived from the frame data for all frame types, as it
was done prior to commit dbd50a851c (only allocate one queue
when using iTXQs). Fixes frame injection with the mt76 driver on
MT7610E chipset.
Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Link: https://lore.kernel.org/r/20210401164455.978245-1-johan.almbladh@anyfinetworks.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Some HW/driver can support passing ethernet rx decap frames and
raw 802.11 frames for the monitor interface concurrently and
via separate RX calls to mac80211. Packets going to the monitor
interface(s) would be in 802.11 format and thus not have the
RX_FLAG_8023 set, and 802.11 format monitoring frames should have
RX_FLAG_ONLY_MONITOR set.
Drivers doing such can enable the SUPPORTS_CONC_MON_RX_DECAP to
allow using ethernet decap offload while a monitor interface is
active, currently RX decapsulation offload gets disabled when a
monitor interface is added.
Signed-off-by: Sriram R <srirrama@codeaurora.org>
Link: https://lore.kernel.org/r/1617068116-32253-1-git-send-email-srirrama@codeaurora.org
[add proper documentation, rewrite commit message]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
rfkill now allows to report a reason for the hw_rfkill state.
Allow cfg80211 drivers to specify this reason.
Keep the current API to use the default reason
(RFKILL_HARD_BLOCK_SIGNAL).
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Link: https://lore.kernel.org/r/20210322204633.102581-4-emmanuel.grumbach@intel.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Some controllers don't support the Simple Pairing Options feature that
can indicate the support for P-192 and P-256 public key validation.
However they might support the Microsoft vendor extension that can
indicate the validiation capability as well.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
The le_scan_{int,window}_adv_monitor settings have not been set with a
sensible default.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
In some cases (depending on the driver, but it's true e.g. for
iwlwifi) we're using an internal TXQ for management packets,
mostly to simplify the code and to have a place to queue them.
However, it appears that in certain cases we can confuse the
code and management frames are dropped, which is certainly not
what we want.
Short-circuit the processing of management frames. To keep the
impact minimal, only put them on the frags queue and check the
tid == management only for doing that and to skip the airtime
fairness checks, if applicable.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20210319232800.0e876c800866.Id2b66eb5a17f3869b776c39b5ca713272ea09d5d@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The variable result is being assigned a value that is never
read and it is being updated later with a new value. The
initialization is redundant and can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20210328213729.65819-1-colin.king@canonical.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
GCC reports the following warning with W=1:
net/mac80211/rc80211_minstrel_ht.c:871:34: warning:
variable 'mg' set but not used [-Wunused-but-set-variable]
871 | struct minstrel_mcs_group_data *mg;
| ^~
This variable is not used in function , this commit
remove it to fix the warning.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Link: https://lore.kernel.org/r/20210326024843.987941-1-weiyongjun1@huawei.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
spinlock can be initialized automatically with DEFINE_SPINLOCK()
rather than explicitly calling spin_lock_init().
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Qiheng Lin <linqiheng@huawei.com>
Link: https://lore.kernel.org/r/20210325143854.13186-1-linqiheng@huawei.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
spinlock can be initialized automatically with DEFINE_SPINLOCK()
rather than explicitly calling spin_lock_init().
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Guobin Huang <huangguobin4@huawei.com>
Link: https://lore.kernel.org/r/1617711116-49370-1-git-send-email-huangguobin4@huawei.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
A WARN_ON(wdev->conn) would trigger in cfg80211_sme_connect(), if multiple
send_msg(NL80211_CMD_CONNECT) system calls are made from the userland, which
should be anticipated and handled by the wireless driver. Remove this WARN_ON()
to prevent kernel panic if kernel is configured to "panic_on_warn".
Bug reported by syzbot.
Reported-by: syzbot+5f9392825de654244975@syzkaller.appspotmail.com
Signed-off-by: Du Cheng <ducheng2@gmail.com>
Link: https://lore.kernel.org/r/20210407162756.6101-1-ducheng2@gmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The incorrect timeout check caused probing to happen when it did
not need to happen. This in turn caused tx performance drop
for around 5 seconds in ath10k-ct driver. Possibly that tx drop
is due to a secondary issue, but fixing the probe to not happen
when traffic is running fixes the symptom.
Signed-off-by: Ben Greear <greearb@candelatech.com>
Fixes: 9abf4e4983 ("mac80211: optimize station connection monitor")
Acked-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20210330230749.14097-1-greearb@candelatech.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Normally, TXQs have
txq->tid = tid;
txq->ac = ieee80211_ac_from_tid(tid);
However, the special management TXQ actually has
txq->tid = IEEE80211_NUM_TIDS; // 16
txq->ac = IEEE80211_AC_VO;
This makes sense, but ieee80211_ac_from_tid(16) is the same
as ieee80211_ac_from_tid(0) which is just IEEE80211_AC_BE.
Now, normally this is fine. However, if the netdev queues
were stopped, then the code in ieee80211_tx_dequeue() will
propagate the stop from the interface (vif->txqs_stopped[])
if the AC 2 (ieee80211_ac_from_tid(txq->tid)) is marked as
stopped. On wake, however, __ieee80211_wake_txqs() will wake
the TXQ if AC 0 (txq->ac) is woken up.
If a driver stops all queues with ieee80211_stop_tx_queues()
and then wakes them again with ieee80211_wake_tx_queues(),
the ieee80211_wake_txqs() tasklet will run to resync queue
and TXQ state. If all queues were woken, then what'll happen
is that _ieee80211_wake_txqs() will run in order of HW queues
0-3, typically (and certainly for iwlwifi) corresponding to
ACs 0-3, so it'll call __ieee80211_wake_txqs() for each AC in
order 0-3.
When __ieee80211_wake_txqs() is called for AC 0 (VO) that'll
wake up the management TXQ (remember its tid is 16), and the
driver's wake_tx_queue() will be called. That tries to get a
frame, which will immediately *stop* the TXQ again, because
now we check against AC 2, and AC 2 hasn't yet been marked as
woken up again in sdata->vif.txqs_stopped[] since we're only
in the __ieee80211_wake_txqs() call for AC 0.
Thus, the management TXQ will never be started again.
Fix this by checking txq->ac directly instead of calculating
the AC as ieee80211_ac_from_tid(txq->tid).
Fixes: adf8ed01e4 ("mac80211: add an optional TXQ for other PS-buffered frames")
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20210323210500.bf4d50afea4a.I136ffde910486301f8818f5442e3c9bf8670a9c4@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Recompiling with the new extended version of struct rfkill_event
broke systemd in *two* ways:
- It used "sizeof(struct rfkill_event)" to read the event, but
then complained if it actually got something != 8, this broke
it on new kernels (that include the updated API);
- It used sizeof(struct rfkill_event) to write a command, but
didn't implement the intended expansion protocol where the
kernel returns only how many bytes it accepted, and errored
out due to the unexpected smaller size on kernels that didn't
include the updated API.
Even though systemd has now been fixed, that fix may not be always
deployed, and other applications could potentially have similar
issues.
As such, in the interest of avoiding regressions, revert the
default API "struct rfkill_event" back to the original size.
Instead, add a new "struct rfkill_event_ext" that extends it by
the new field, and even more clearly document that applications
should be prepared for extensions in two ways:
* write might only accept fewer bytes on older kernels, and
will return how many to let userspace know which data may
have been ignored;
* read might return anything between 8 (the original size) and
whatever size the application sized its buffer at, indicating
how much event data was supported by the kernel.
Perhaps that will help avoid such issues in the future and we
won't have to come up with another version of the struct if we
ever need to extend it again.
Applications that want to take advantage of the new field will
have to be modified to use struct rfkill_event_ext instead now,
which comes with the danger of them having already been updated
to use it from 'struct rfkill_event', but I found no evidence
of that, and it's still relatively new.
Cc: stable@vger.kernel.org # 5.11
Reported-by: Takashi Iwai <tiwai@suse.de>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM/Clang v12.0.0-r4 (x86-64)
Link: https://lore.kernel.org/r/20210319232510.f1a139cfdd9c.Ic5c7c9d1d28972059e132ea653a21a427c326678@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
In some race conditions, with more clients and traffic configuration,
below crash is seen when making the interface down. sta->fast_rx wasn't
cleared when STA gets removed from 4-addr AP_VLAN interface. The crash is
due to try accessing 4-addr AP_VLAN interface's net_device (fast_rx->dev)
which has been deleted already.
Resolve this by clearing sta->fast_rx pointer when STA removes
from a 4-addr VLAN.
[ 239.449529] Unable to handle kernel NULL pointer dereference at virtual address 00000004
[ 239.449531] pgd = 80204000
...
[ 239.481496] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.4.60 #227
[ 239.481591] Hardware name: Generic DT based system
[ 239.487665] task: be05b700 ti: be08e000 task.ti: be08e000
[ 239.492360] PC is at get_rps_cpu+0x2d4/0x31c
[ 239.497823] LR is at 0xbe08fc54
...
[ 239.778574] [<80739740>] (get_rps_cpu) from [<8073cb10>] (netif_receive_skb_internal+0x8c/0xac)
[ 239.786722] [<8073cb10>] (netif_receive_skb_internal) from [<8073d578>] (napi_gro_receive+0x48/0xc4)
[ 239.795267] [<8073d578>] (napi_gro_receive) from [<c7b83e8c>] (ieee80211_mark_rx_ba_filtered_frames+0xbcc/0x12d4 [mac80211])
[ 239.804776] [<c7b83e8c>] (ieee80211_mark_rx_ba_filtered_frames [mac80211]) from [<c7b84d4c>] (ieee80211_rx_napi+0x7b8/0x8c8 [mac8
0211])
[ 239.815857] [<c7b84d4c>] (ieee80211_rx_napi [mac80211]) from [<c7f63d7c>] (ath11k_dp_process_rx+0x7bc/0x8c8 [ath11k])
[ 239.827757] [<c7f63d7c>] (ath11k_dp_process_rx [ath11k]) from [<c7f5b6c4>] (ath11k_dp_service_srng+0x2c0/0x2e0 [ath11k])
[ 239.838484] [<c7f5b6c4>] (ath11k_dp_service_srng [ath11k]) from [<7f55b7dc>] (ath11k_ahb_ext_grp_napi_poll+0x20/0x84 [ath11k_ahb]
)
[ 239.849419] [<7f55b7dc>] (ath11k_ahb_ext_grp_napi_poll [ath11k_ahb]) from [<8073ce1c>] (net_rx_action+0xe0/0x28c)
[ 239.860945] [<8073ce1c>] (net_rx_action) from [<80324868>] (__do_softirq+0xe4/0x228)
[ 239.871269] [<80324868>] (__do_softirq) from [<80324c48>] (irq_exit+0x98/0x108)
[ 239.879080] [<80324c48>] (irq_exit) from [<8035c59c>] (__handle_domain_irq+0x90/0xb4)
[ 239.886114] [<8035c59c>] (__handle_domain_irq) from [<8030137c>] (gic_handle_irq+0x50/0x94)
[ 239.894100] [<8030137c>] (gic_handle_irq) from [<803024c0>] (__irq_svc+0x40/0x74)
Signed-off-by: Seevalamuthu Mariappan <seevalam@codeaurora.org>
Link: https://lore.kernel.org/r/1616163532-3881-1-git-send-email-seevalam@codeaurora.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Stefan Schmidt says:
====================
pull-request: ieee802154 for net 2021-04-07
An update from ieee802154 for your *net* tree.
Most of these are coming from the flood of syzkaller reports
lately got for the ieee802154 subsystem. There are likely to
come more for this, but this is a good batch to get out for now.
Alexander Aring created a patchset to avoid llsec handling on a
monitor interface, which we do not support.
Alex Shi removed a unused macro.
Pavel Skripkin fixed another protection fault found by syzkaller.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Lanes field is missing for ETHTOOL_LINK_MODE_10000baseR_FEC_BIT
link mode and it causes a failure when trying to set
'speed 10000 lanes 1' on Spectrum-2 machines when autoneg is set to on.
Add the lanes parameter for ETHTOOL_LINK_MODE_10000baseR_FEC_BIT
link mode.
Fixes: c8907043c6 ("ethtool: Get link mode in use instead of speed and duplex parameters")
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some drivers clear the 'ethtool_link_ksettings' struct in their
get_link_ksettings() callback, before populating it with actual values.
Such drivers will set the new 'link_mode' field to zero, resulting in
user space receiving wrong link mode information given that zero is a
valid value for the field.
Another problem is that some drivers (notably tun) can report random
values in the 'link_mode' field. This can result in a general protection
fault when the field is used as an index to the 'link_mode_params' array
[1].
This happens because such drivers implement their set_link_ksettings()
callback by simply overwriting their private copy of
'ethtool_link_ksettings' struct with the one they get from the stack,
which is not always properly initialized.
Fix these problems by removing 'link_mode' from 'ethtool_link_ksettings'
and instead have drivers call ethtool_params_from_link_mode() with the
current link mode. The function will derive the link parameters (e.g.,
speed) from the link mode and fill them in the 'ethtool_link_ksettings'
struct.
v3:
* Remove link_mode parameter and derive the link parameters in
the driver instead of passing link_mode parameter to ethtool
and derive it there.
v2:
* Introduce 'cap_link_mode_supported' instead of adding a
validity field to 'ethtool_link_ksettings' struct.
[1]
general protection fault, probably for non-canonical address 0xdffffc00f14cc32c: 0000 [#1] PREEMPT SMP KASAN
KASAN: probably user-memory-access in range [0x000000078a661960-0x000000078a661967]
CPU: 0 PID: 8452 Comm: syz-executor360 Not tainted 5.11.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__ethtool_get_link_ksettings+0x1a3/0x3a0 net/ethtool/ioctl.c:446
Code: b7 3e fa 83 fd ff 0f 84 30 01 00 00 e8 16 b0 3e fa 48 8d 3c ed 60 d5 69 8a 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 03
+38 d0 7c 08 84 d2 0f 85 b9
RSP: 0018:ffffc900019df7a0 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: ffff888026136008 RCX: 0000000000000000
RDX: 00000000f14cc32c RSI: ffffffff873439ca RDI: 000000078a661960
RBP: 00000000ffff8880 R08: 00000000ffffffff R09: ffff88802613606f
R10: ffffffff873439bc R11: 0000000000000000 R12: 0000000000000000
R13: ffff88802613606c R14: ffff888011d0c210 R15: ffff888011d0c210
FS: 0000000000749300(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000004b60f0 CR3: 00000000185c2000 CR4: 00000000001506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
linkinfo_prepare_data+0xfd/0x280 net/ethtool/linkinfo.c:37
ethnl_default_notify+0x1dc/0x630 net/ethtool/netlink.c:586
ethtool_notify+0xbd/0x1f0 net/ethtool/netlink.c:656
ethtool_set_link_ksettings+0x277/0x330 net/ethtool/ioctl.c:620
dev_ethtool+0x2b35/0x45d0 net/ethtool/ioctl.c:2842
dev_ioctl+0x463/0xb70 net/core/dev_ioctl.c:440
sock_do_ioctl+0x148/0x2d0 net/socket.c:1060
sock_ioctl+0x477/0x6a0 net/socket.c:1177
vfs_ioctl fs/ioctl.c:48 [inline]
__do_sys_ioctl fs/ioctl.c:753 [inline]
__se_sys_ioctl fs/ioctl.c:739 [inline]
__x64_sys_ioctl+0x193/0x200 fs/ioctl.c:739
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: c8907043c6 ("ethtool: Get link mode in use instead of speed and duplex parameters")
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Here is only one place where we want to specify new_ifindex. In all
other cases, callers pass 0 as new_ifindex. It looks reasonable to add a
low-level function with new_ifindex and to convert
dev_change_net_namespace to a static inline wrapper.
Fixes: eeb85a14ee ("net: Allow to specify ifindex when device is moved to another namespace")
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
In this case, we don't need to check that new_ifindex is positive in
validate_linkmsg.
Fixes: eeb85a14ee ("net: Allow to specify ifindex when device is moved to another namespace")
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
These patches fix a series of spelling errors in net/tipc module.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reset MAC header in HSR Tx path. This is needed, because direct packet
transmission, e.g. by specifying PACKET_QDISC_BYPASS does not reset the MAC
header.
This has been observed using the following setup:
|$ ip link add name hsr0 type hsr slave1 lan0 slave2 lan1 supervision 45 version 1
|$ ifconfig hsr0 up
|$ ./test hsr0
The test binary is using mmap'ed sockets and is specifying the
PACKET_QDISC_BYPASS socket option.
This patch resolves the following warning on a non-patched kernel:
|[ 112.725394] ------------[ cut here ]------------
|[ 112.731418] WARNING: CPU: 1 PID: 257 at net/hsr/hsr_forward.c:560 hsr_forward_skb+0x484/0x568
|[ 112.739962] net/hsr/hsr_forward.c:560: Malformed frame (port_src hsr0)
The warning can be safely removed, because the other call sites of
hsr_forward_skb() make sure that the skb is prepared correctly.
Fixes: d346a3fae3 ("packet: introduce PACKET_QDISC_BYPASS socket option")
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current Linux carries echo-ed ADD_ADDR over pure TCP ACKs, so there is no
need to add a DSS element that would fit only ADD_ADDR with IPv4 address.
Drop the DSS from echo-ed ADD_ADDR, regardless of the IP version.
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The length of the IPv4 address is 4 octets and IPv6 is 16. That's the only
difference between add_addr_generate_hmac and add_addr6_generate_hmac.
This patch dropped the duplicate code and unify them into one.
Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since the type of the address family in struct mptcp_options_received
became sa_family_t, we should set AF_INET/AF_INET6 to it, instead of
using MPTCP_ADDR_IPVERSION_4/6.
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch added a new struct mptcp_addr_info member addr in struct
mptcp_options_received, and dropped the original family, addr_id, addr,
addr6 and port fields in it. Then we can pass the parameter mp_opt.addr
directly to mptcp_pm_add_addr_received and mptcp_pm_add_addr_echoed.
Since the port number became big-endian now, use htons to convert the
incoming port number to it. Also use ntohs to convert it when passing
it to add_addr_generate_hmac or printing it out.
Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since the family field was added in struct mptcp_out_options, no need to
use OPTION_MPTCP_ADD_ADDR6 to identify the IPv6 address. Drop it.
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch moved the mptcp_addr_info struct from protocol.h to mptcp.h,
added a new struct mptcp_addr_info member addr in struct mptcp_out_options,
and dropped the original addr, addr6, addr_id and port fields in it. Then
we can use opts->addr to get the adding address from PM directly using
mptcp_pm_add_addr_signal.
Since the port number became big-endian now, use ntohs to convert it
before sending it out with the ADD_ADDR suboption. Also convert it
when passing it to add_addr_generate_hmac or printing it out.
Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch moved the flags and ifindex fields from struct mptcp_addr_info
to struct mptcp_pm_addr_entry. Add the flags and ifindex values as two new
parameters to __mptcp_subflow_connect.
In mptcp_pm_create_subflow_or_signal_addr, pass the local address entry's
flags and ifindex fields to __mptcp_subflow_connect.
In mptcp_pm_nl_add_addr_received, just pass two zeros to it.
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case of rs failure in rds_send_remove_from_sock(), the 'rm' resource
is freed and later under spinlock, causing potential use-after-free.
Set the free pointer to NULL to avoid undefined behavior.
Signed-off-by: Aditya Pakki <pakki001@umn.edu>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The member 'tx_lpi_timer' is defined with __u32 datatype in the ethtool
header file. Hence, we should use ethnl_update_u32() in set_eee ops.
Fixes: fd77be7bd4 ("ethtool: set EEE settings with EEE_SET request")
Cc: <stable@vger.kernel.org> # 5.10.x
Cc: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following batch contains Netfilter/IPVS updates for your net-next tree:
1) Simplify log infrastructure modularity: Merge ipv4, ipv6, bridge,
netdev and ARP families to nf_log_syslog.c. Add module softdeps.
This fixes a rare deadlock condition that might occur when log
module autoload is required. From Florian Westphal.
2) Moves part of netfilter related pernet data from struct net to
net_generic() infrastructure. All of these users can be modules,
so if they are not loaded there is no need to waste space. Size
reduction is 7 cachelines on x86_64, also from Florian.
2) Update nftables audit support to report events once per table,
to get it aligned with iptables. From Richard Guy Briggs.
3) Check for stale routes from the flowtable garbage collector path.
This is fixing IPv6 which breaks due missing check for the dst_cookie.
4) Add a nfnl_fill_hdr() function to simplify netlink + nfnetlink
headers setup.
5) Remove documentation on several statified functions.
6) Remove printk on netns creation for the FTP IPVS tracker,
from Florian Westphal.
7) Remove unnecessary nf_tables_destroy_list_lock spinlock
initialization, from Yang Yingliang.
7) Remove a duplicated forward declaration in ipset,
from Wan Jiabing.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Incorrect accounting fwd_alloc can result in a warning when the socket
is torn down,
[18455.319240] WARNING: CPU: 0 PID: 24075 at net/core/stream.c:208 sk_stream_kill_queues+0x21f/0x230
[...]
[18455.319543] Call Trace:
[18455.319556] inet_csk_destroy_sock+0xba/0x1f0
[18455.319577] tcp_rcv_state_process+0x1b4e/0x2380
[18455.319593] ? lock_downgrade+0x3a0/0x3a0
[18455.319617] ? tcp_finish_connect+0x1e0/0x1e0
[18455.319631] ? sk_reset_timer+0x15/0x70
[18455.319646] ? tcp_schedule_loss_probe+0x1b2/0x240
[18455.319663] ? lock_release+0xb2/0x3f0
[18455.319676] ? __release_sock+0x8a/0x1b0
[18455.319690] ? lock_downgrade+0x3a0/0x3a0
[18455.319704] ? lock_release+0x3f0/0x3f0
[18455.319717] ? __tcp_close+0x2c6/0x790
[18455.319736] ? tcp_v4_do_rcv+0x168/0x370
[18455.319750] tcp_v4_do_rcv+0x168/0x370
[18455.319767] __release_sock+0xbc/0x1b0
[18455.319785] __tcp_close+0x2ee/0x790
[18455.319805] tcp_close+0x20/0x80
This currently happens because on redirect case we do skb_set_owner_r()
with the original sock. This increments the fwd_alloc memory accounting
on the original sock. Then on redirect we may push this into the queue
of the psock we are redirecting to. When the skb is flushed from the
queue we give the memory back to the original sock. The problem is if
the original sock is destroyed/closed with skbs on another psocks queue
then the original sock will not have a way to reclaim the memory before
being destroyed. Then above warning will be thrown
sockA sockB
sk_psock_strp_read()
sk_psock_verdict_apply()
-- SK_REDIRECT --
sk_psock_skb_redirect()
skb_queue_tail(psock_other->ingress_skb..)
sk_close()
sock_map_unref()
sk_psock_put()
sk_psock_drop()
sk_psock_zap_ingress()
At this point we have torn down our own psock, but have the outstanding
skb in psock_other. Note that SK_PASS doesn't have this problem because
the sk_psock_drop() logic releases the skb, its still associated with
our psock.
To resolve lets only account for sockets on the ingress queue that are
still associated with the current socket. On the redirect case we will
check memory limits per 6fa9201a89, but will omit fwd_alloc accounting
until skb is actually enqueued. When the skb is sent via skb_send_sock_locked
or received with sk_psock_skb_ingress memory will be claimed on psock_other.
Fixes: 6fa9201a89 ("bpf, sockmap: Avoid returning unneeded EAGAIN when redirecting to self")
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/161731444013.68884.4021114312848535993.stgit@john-XPS-13-9370
Li Shuang found a NULL pointer dereference crash in her testing:
[] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
[] RIP: 0010:tipc_crypto_rcv_complete+0xc8/0x7e0 [tipc]
[] Call Trace:
[] <IRQ>
[] tipc_crypto_rcv+0x2d9/0x8f0 [tipc]
[] tipc_rcv+0x2fc/0x1120 [tipc]
[] tipc_udp_recv+0xc6/0x1e0 [tipc]
[] udpv6_queue_rcv_one_skb+0x16a/0x460
[] udp6_unicast_rcv_skb.isra.35+0x41/0xa0
[] ip6_protocol_deliver_rcu+0x23b/0x4c0
[] ip6_input+0x3d/0xb0
[] ipv6_rcv+0x395/0x510
[] __netif_receive_skb_core+0x5fc/0xc40
This is caused by NULL returned by tipc_aead_get(), and then crashed when
dereferencing it later in tipc_crypto_rcv_complete(). This might happen
when tipc_crypto_rcv_complete() is called by two threads at the same time:
the tmp attached by tipc_crypto_key_attach() in one thread may be released
by the one attached by that in the other thread.
This patch is to fix it by incrementing the tmp's refcnt before attaching
it instead of calling tipc_aead_get() after attaching it.
Fixes: fc1b6d6de2 ("tipc: introduce TIPC encryption & authentication")
Reported-by: Li Shuang <shuali@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Userspace sends tcp connection (sock) destroy on network switch
i.e switching the default network of the device between multiple
networks(Cellular/Wifi/Ethernet).
Kernel though doesn't send reset for the connections in SYN-SENT state
and these connections continue to remain.
Even as per RFC 793, there is no hard rule to not send RST on ABORT in
this state.
Modify tcp_abort and tcp_disconnect behavior to send RST for connections
in syn-sent state to avoid lingering connections on network switch.
Signed-off-by: Manoj Basapathi <manojbm@codeaurora.org>
Signed-off-by: Sauvik Saha <ssaha@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
These comments in udp_bpf_update_proto() are copied from the
original TCP code and apparently do not apply to UDP. Just
remove them.
Reported-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210403052715.13854-1-xiyou.wangcong@gmail.com
When drivers indicate support for AOSP vendor extension, initialize them
and read its capabilities.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
syzbot found general protection fault in crypto_destroy_tfm()[1].
It was caused by wrong clean up loop in llsec_key_alloc().
If one of the tfm array members is in IS_ERR() range it will
cause general protection fault in clean up function [1].
Call Trace:
crypto_free_aead include/crypto/aead.h:191 [inline] [1]
llsec_key_alloc net/mac802154/llsec.c:156 [inline]
mac802154_llsec_key_add+0x9e0/0xcc0 net/mac802154/llsec.c:249
ieee802154_add_llsec_key+0x56/0x80 net/mac802154/cfg.c:338
rdev_add_llsec_key net/ieee802154/rdev-ops.h:260 [inline]
nl802154_add_llsec_key+0x3d3/0x560 net/ieee802154/nl802154.c:1584
genl_family_rcv_msg_doit+0x228/0x320 net/netlink/genetlink.c:739
genl_family_rcv_msg net/netlink/genetlink.c:783 [inline]
genl_rcv_msg+0x328/0x580 net/netlink/genetlink.c:800
netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2502
genl_rcv+0x24/0x40 net/netlink/genetlink.c:811
netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline]
netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338
netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927
sock_sendmsg_nosec net/socket.c:654 [inline]
sock_sendmsg+0xcf/0x120 net/socket.c:674
____sys_sendmsg+0x6e8/0x810 net/socket.c:2350
___sys_sendmsg+0xf3/0x170 net/socket.c:2404
__sys_sendmsg+0xe5/0x1b0 net/socket.c:2433
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xae
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Reported-by: syzbot+9ec037722d2603a9f52e@syzkaller.appspotmail.com
Acked-by: Alexander Aring <aahringo@redhat.com>
Link: https://lore.kernel.org/r/20210304152125.1052825-1-paskripkin@gmail.com
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
This patch forbids to add llsec seclevel for monitor interfaces which we
don't support yet. Otherwise we will access llsec mib which isn't
initialized for monitors.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Link: https://lore.kernel.org/r/20210405003054.256017-14-aahringo@redhat.com
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
This patch stops dumping llsec seclevels for monitors which we don't
support yet. Otherwise we will access llsec mib which isn't initialized
for monitors.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Link: https://lore.kernel.org/r/20210405003054.256017-13-aahringo@redhat.com
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
This patch forbids to del llsec devkey for monitor interfaces which we
don't support yet. Otherwise we will access llsec mib which isn't
initialized for monitors.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Link: https://lore.kernel.org/r/20210405003054.256017-12-aahringo@redhat.com
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
This patch forbids to add llsec devkey for monitor interfaces which we
don't support yet. Otherwise we will access llsec mib which isn't
initialized for monitors.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Link: https://lore.kernel.org/r/20210405003054.256017-11-aahringo@redhat.com
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
This patch stops dumping llsec devkeys for monitors which we don't support
yet. Otherwise we will access llsec mib which isn't initialized for
monitors.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Link: https://lore.kernel.org/r/20210405003054.256017-10-aahringo@redhat.com
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
This patch forbids to del llsec dev for monitor interfaces which we
don't support yet. Otherwise we will access llsec mib which isn't
initialized for monitors.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Link: https://lore.kernel.org/r/20210405003054.256017-9-aahringo@redhat.com
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
This patch forbids to add llsec dev for monitor interfaces which we
don't support yet. Otherwise we will access llsec mib which isn't
initialized for monitors.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Link: https://lore.kernel.org/r/20210405003054.256017-8-aahringo@redhat.com
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
This patch stops dumping llsec devs for monitors which we don't support
yet. Otherwise we will access llsec mib which isn't initialized for
monitors.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Link: https://lore.kernel.org/r/20210405003054.256017-7-aahringo@redhat.com
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
This patch forbids to del llsec key for monitor interfaces which we
don't support yet. Otherwise we will access llsec mib which isn't
initialized for monitors.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Link: https://lore.kernel.org/r/20210405003054.256017-6-aahringo@redhat.com
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
This patch forbids to add llsec key for monitor interfaces which we
don't support yet. Otherwise we will access llsec mib which isn't
initialized for monitors.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Link: https://lore.kernel.org/r/20210405003054.256017-5-aahringo@redhat.com
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
This patch stops dumping llsec keys for monitors which we don't support
yet. Otherwise we will access llsec mib which isn't initialized for
monitors.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Link: https://lore.kernel.org/r/20210405003054.256017-4-aahringo@redhat.com
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
own_address_type has to changed to 0x02 and 0x03 only when
HCI_ENABLE_LL_PRIVACY flag is set.
Signed-off-by: Sathish Narasimman <sathish.narasimman@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
We set hdev->cur_adv_instance in the adv param MGMT request to allow the
callback to the hci param request to set the tx power to the correct
instance. Now that the callbacks use the advertising handle from the hci
request (as they should), this workaround is no longer necessary.
Furthermore, this change resolves a race condition that is more
prevalent when using the extended advertising MGMT calls - if
hdev->cur_adv_instance is set in the params request, then when the data
request is called, we believe our new instance is already active. This
treats it as an update and immediately schedules the instance with the
controller, which has a potential race with the software rotation adv
update. By not setting hdev->cur_adv_instance too early, the new
instance is queued as it should be, to be used when the rotation comes
around again.
This change is tested on harrison peak to confirm that it resolves the
race condition on registration, and that there is no regression in
single- and multi-advertising automated tests.
Reviewed-by: Miao-chen Chou <mcchou@chromium.org>
Signed-off-by: Daniel Winkler <danielwinkler@google.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Some extended advertising hci command complete events are still using
hdev->cur_adv_instance to map the request to the correct advertisement
handle. However, with extended advertising, "current instance" doesn't
make sense as we can have multiple concurrent advertisements. This
change switches these command complete handlers to use the advertising
handle from the request/event, to ensure we will always use the correct
advertising handle regardless of the state of hdev->cur_adv_instance.
This change is tested on hatch and kefka chromebooks and run through
single- and multi-advertising automated tests to confirm callbacks
report tx power to the correct advertising handle, etc.
Reviewed-by: Miao-chen Chou <mcchou@chromium.org>
Signed-off-by: Daniel Winkler <danielwinkler@google.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Use the correct print format. Printing an unsigned int value should use %u
instead of %d. For details, please read document:
Documentation/core-api/printk-formats.rst
Signed-off-by: Kai Ye <yekai13@huawei.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
dwork struct is large (>128 byte) and not needed when conntrack module
is not loaded.
Place it in net_generic data instead. The struct net dwork member is now
obsolete and will be removed in a followup patch.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
No need to keep this in struct net, place it in the net_generic data.
The sysctl pointer is removed from struct net in a followup patch.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This moves all nf_tables pernet data from struct net to a net_generic
extension, with the exception of the gencursor.
The latter is used in the data path and also outside of the nf_tables
core. All others are only used from the configuration plane.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
ebtables currently uses net->xt.tables[BRIDGE], but upcoming
patch will move net->xt.tables away from struct net.
To avoid exposing x_tables internals to ebtables, use a private list
instead.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This allows followup patch to remove the defrag_ipv4 member from struct
net. It also allows to auto-remove the hooks later on by adding a
_disable() function. This will be done later in a follow patch series.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This allows followup patch to remove these members from struct net.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
reduce size of struct net and make this self-contained.
The member in struct net is kept to minimize changes to struct net
layout, it will be removed in a separate patch.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
No need to place it in struct net, nfnetlink is a module and usage
doesn't occur in fastpath.
Also remove rcu usage:
Not a single reader of net->nfnl uses rcu accessors.
When exit_batch callbacks are executed the net namespace is already dead
so no calls to these functions are possible anymore (else we'd get NULL
deref crash too).
If the module is removed, then modules that call any of those functions
have been removed too so no calls to nfnl functions are possible either.
The nfnl and nfl_stash pointers in struct net are no longer used, they
will be removed in a followup patch to minimize changes to struct net
(causes rebuild for entire network stack).
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This removes the only reference of net->nfnl outside of the nfnetlink
module. This allows to move net->nfnl to net_generic infra.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
KMSAN found uninitialized value at batadv_tt_prepare_tvlv_local_data()
[1], for commit ced72933a5 ("batman-adv: use CRC32C instead of CRC16
in TT code") inserted 'reserved' field into "struct batadv_tvlv_tt_data"
and commit 7ea7b4a142 ("batman-adv: make the TT CRC logic VLAN
specific") moved that field to "struct batadv_tvlv_tt_vlan_data" but left
that field uninitialized.
[1] https://syzkaller.appspot.com/bug?id=07f3e6dba96f0eb3cabab986adcd8a58b9bdbe9d
Reported-by: syzbot <syzbot+50ee810676e6a089487b@syzkaller.appspotmail.com>
Tested-by: syzbot <syzbot+50ee810676e6a089487b@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Fixes: ced72933a5 ("batman-adv: use CRC32C instead of CRC16 in TT code")
Fixes: 7ea7b4a142 ("batman-adv: make the TT CRC logic VLAN specific")
Acked-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, we can specify ifindex on link creation. This change allows
to specify ifindex when a device is moved to another network namespace.
Even now, a device ifindex can be changed if there is another device
with the same ifindex in the target namespace. So this change doesn't
introduce completely new behavior, it adds more control to the process.
CRIU users want to restore containers with pre-created network devices.
A user will provide network devices and instructions where they have to
be restored, then CRIU will restore network namespaces and move devices
into them. The problem is that devices have to be restored with the same
indexes that they have before C/R.
Cc: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
Suggested-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These patches fix a series of spelling errors in net/nfc module.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Found by virtue of ipv6 raw sockets not honouring the per-socket
IP{,V6}_FREEBIND setting.
Based on hits found via:
git grep '[.]ip_nonlocal_bind'
We fix both raw ipv6 sockets to honour IP{,V6}_FREEBIND and IP{,V6}_TRANSPARENT,
and we fix sctp sockets to honour IP{,V6}_TRANSPARENT (they already honoured
FREEBIND), and not just the ipv6 'ip_nonlocal_bind' sysctl.
The helper is defined as:
static inline bool ipv6_can_nonlocal_bind(struct net *net, struct inet_sock *inet) {
return net->ipv6.sysctl.ip_nonlocal_bind || inet->freebind || inet->transparent;
}
so this change only widens the accepted opt-outs and is thus a clean bugfix.
I'm not entirely sure what 'fixes' tag to add, since this is AFAICT an ancient bug,
but IMHO this should be applied to stable kernels as far back as possible.
As such I'm adding a 'fixes' tag with the commit that originally added the helper,
which happened in 4.19. Backporting to older LTS kernels (at least 4.9 and 4.14)
would presumably require open-coding it or backporting the helper as well.
Other possibly relevant commits:
v4.18-rc6-1502-g83ba4645152d net: add helpers checking if socket can be bound to nonlocal address
v4.18-rc6-1431-gd0c1f01138c4 net/ipv6: allow any source address for sendmsg pktinfo with ip_nonlocal_bind
v4.14-rc5-271-gb71d21c274ef sctp: full support for ipv6 ip_nonlocal_bind & IP_FREEBIND
v4.7-rc7-1883-g9b9742022888 sctp: support ipv6 nonlocal bind
v4.1-12247-g35a256fee52c ipv6: Nonlocal bind
Cc: Lorenzo Colitti <lorenzo@google.com>
Fixes: 83ba464515 ("net: add helpers checking if socket can be bound to nonlocal address")
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Reviewed-By: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
'struct ovs_zone_limit' has more members than initialized in
ovs_ct_limit_get_default_limit(). The rest of the memory is a random
kernel stack content that ends up being sent to userspace.
Fix that by using designated initializer that will clear all
non-specified fields.
Fixes: 11efd5cb04 ("openvswitch: Support conntrack zone limit")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix kernel-doc warning:
Documentation/networking/tipc:66: /home/sfr/next/next/net/tipc/name_table.c
:558: WARNING: Unexpected indentation.
Documentation/networking/tipc:66: /home/sfr/next/next/net/tipc/name_table.c
:559: WARNING: Block quote ends without a blank line; unexpected unindent.
Due to blank line missing.
Fixes: 908148bc50 ("tipc: refactor tipc_sendmsg() and tipc_lookup_anycast()")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Link: https://lore.kernel.org/netdev/20210318172255.63185609@canb.auug.org.au/
Signed-off-by: Wu XiangCheng <bobwxc@email.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
struct ip6_sf_socklist and ipv6_mc_socklist are per-socket MLD data.
These data are protected by rtnl lock, socket lock, and RCU.
So, when these are used, it verifies whether rtnl lock is acquired or not.
ip6_mc_msfget() is called by do_ipv6_getsockopt().
But caller doesn't acquire rtnl lock.
So, when these data are used in the ip6_mc_msfget() lockdep warns about it.
But accessing these is actually safe because socket lock was acquired by
do_ipv6_getsockopt().
So, it changes lockdep annotation from rtnl lock to socket lock.
(rtnl_dereference -> sock_dereference)
Locking graph for mld data is like below:
When writing mld data:
do_ipv6_setsockopt()
rtnl_lock
lock_sock
(mld functions)
idev->mc_lock(if per-interface mld data is modified)
When reading mld data:
do_ipv6_getsockopt()
lock_sock
ip6_mc_msfget()
Splat looks like:
=============================
WARNING: suspicious RCU usage
5.12.0-rc4+ #503 Not tainted
-----------------------------
net/ipv6/mcast.c:610 suspicious rcu_dereference_protected() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
1 lock held by mcast-listener-/923:
#0: ffff888007958a70 (sk_lock-AF_INET6){+.+.}-{0:0}, at:
ipv6_get_msfilter+0xaf/0x190
stack backtrace:
CPU: 1 PID: 923 Comm: mcast-listener- Not tainted 5.12.0-rc4+ #503
Call Trace:
dump_stack+0xa4/0xe5
ip6_mc_msfget+0x553/0x6c0
? ipv6_sock_mc_join_ssm+0x10/0x10
? lockdep_hardirqs_on_prepare+0x3e0/0x3e0
? mark_held_locks+0xb7/0x120
? lockdep_hardirqs_on_prepare+0x27c/0x3e0
? __local_bh_enable_ip+0xa5/0xf0
? lock_sock_nested+0x82/0xf0
ipv6_get_msfilter+0xc3/0x190
? compat_ipv6_get_msfilter+0x300/0x300
? lock_downgrade+0x690/0x690
do_ipv6_getsockopt.isra.6.constprop.13+0x1809/0x29e0
? do_ipv6_mcast_group_source+0x150/0x150
? register_lock_class+0x1750/0x1750
? kvm_sched_clock_read+0x14/0x30
? sched_clock+0x5/0x10
? sched_clock_cpu+0x18/0x170
? find_held_lock+0x3a/0x1c0
? lock_downgrade+0x690/0x690
? ipv6_getsockopt+0xdb/0x1b0
ipv6_getsockopt+0xdb/0x1b0
[ ... ]
Fixes: 88e2ca3080 ("mld: convert ifmcaddr6 to RCU")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
'skb_push()'/'skb_postpush_rcsum()' can be replaced by an equivalent
'skb_push_rcsum()' which is less verbose.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since 4f16d25c68 ("netfilter: nftables: add nft_parse_register_load()
and use it") and 345023b0db ("netfilter: nftables: add
nft_parse_register_store() and use it"), the following functions are not
exported symbols anymore:
- nft_parse_register()
- nft_validate_register_load()
- nft_validate_register_store()
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The first argument of a WARN_ONCE() is a condition. This WARN_ONCE()
will only print the table name, and is potentially problematic if the
table name has a %s in it.
Fixes: c520292f29 ("audit: log nftables configuration change events once per table")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This causes dmesg spew during normal operation, so remove this.
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Julian Anastasov <ja@ssi.bg>
Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The MPTCP reset option allows to carry a mptcp-specific error code that
provides more information on the nature of a connection reset.
Reset option data received gets stored in the subflow context so it can
be sent to userspace via the 'subflow closed' netlink event.
When a subflow is closed, the desired error code that should be sent to
the peer is also placed in the subflow context structure.
If a reset is sent before subflow establishment could complete, e.g. on
HMAC failure during an MP_JOIN operation, the mptcp skb extension is
used to store the reset information.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently we explicitly check for the first subflow being
NULL in a couple of places, even if we don't need any
special actions in such scenario.
Just drop the unneeded checks, to avoid confusion.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We are not currently tracking the active MPTCP connection
attempts. Let's add the related counters.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the MPTCP protocol is unable to create a new token,
the socket fallback to plain TCP, let's keep track
of such events via a specific MIB.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 'unlocked_driver_cb' struct field in 'bo' is not being initialized
in tcf_block_offload_init(). The uninitialized 'unlocked_driver_cb'
will be used when calling unlocked_driver_cb(). So initialize 'bo' to
zero to avoid the issue.
Addresses-Coverity: ("Uninitialized scalar variable")
Fixes: 0fdcf78d59 ("net: use flow_indr_dev_setup_offload()")
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexei Starovoitov says:
====================
pull-request: bpf-next 2021-04-01
The following pull-request contains BPF updates for your *net-next* tree.
We've added 68 non-merge commits during the last 7 day(s) which contain
a total of 70 files changed, 2944 insertions(+), 1139 deletions(-).
The main changes are:
1) UDP support for sockmap, from Cong.
2) Verifier merge conflict resolution fix, from Daniel.
3) xsk selftests enhancements, from Maciej.
4) Unstable helpers aka kernel func calling, from Martin.
5) Batches ops for LPM map, from Pedro.
6) Fix race in bpf_get_local_storage, from Yonghong.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes kbuild findings:
smatch warnings:
net/bluetooth/smp.c:1633 smp_user_confirm_reply() warn: variable
dereferenced before check 'conn' (see line 1631)
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
There is a possibility where HCI_INQUIRY flag is set but we still
send HCI_OP_INQUIRY anyway.
Such a case can be reproduced by connecting to an LE device while
active scanning. When the device is discovered, we initiate a
connection, stop LE Scan, and send Discovery MGMT with status
disabled, but we don't cancel the inquiry.
Signed-off-by: Archie Pusaka <apusaka@chromium.org>
Reviewed-by: Sonny Sasaka <sonnysasaka@chromium.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
1. Add space when needed;
2. Block comments style fix;
3. Move open brace '{' following function definitions to the next line;
4. Remove unnecessary braces '{}' for single statement blocks.
Signed-off-by: Meng Yu <yumeng18@huawei.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
void function return statements are not generally useful.
Signed-off-by: Meng Yu <yumeng18@huawei.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This change reverts commit ad98dd3705 ("mptcp: provide subflow aware
release function"). The latter introduced a deadlock spotted by
syzkaller and is not needed anymore after the previous commit.
Fixes: ad98dd3705 ("mptcp: provide subflow aware release function")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Unrolling mcast state at msk dismantel time is bug prone, as
syzkaller reported:
======================================================
WARNING: possible circular locking dependency detected
5.11.0-syzkaller #0 Not tainted
------------------------------------------------------
syz-executor905/8822 is trying to acquire lock:
ffffffff8d678fe8 (rtnl_mutex){+.+.}-{3:3}, at: ipv6_sock_mc_close+0xd7/0x110 net/ipv6/mcast.c:323
but task is already holding lock:
ffff888024390120 (sk_lock-AF_INET6){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1600 [inline]
ffff888024390120 (sk_lock-AF_INET6){+.+.}-{0:0}, at: mptcp6_release+0x57/0x130 net/mptcp/protocol.c:3507
which lock already depends on the new lock.
Instead we can simply forbit any mcast-related setsockopt
Fixes: 717e79c867 ("mptcp: Add setsockopt()/getsockopt() socket operations")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
struct smc_clc_msg_local is declared twice. One is declared at
301st line. The blew one is not needed. Remove the duplicate.
Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Acked-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Support for UDP_GRO was added in the past but the implementation for
getsockopt was missed which did lead to an error when we tried to
retrieve the setting for UDP_GRO. This patch adds the missing switch
case for UDP_GRO
Fixes: e20cf8d3f1 ("udp: implement GRO for plain UDP sockets.")
Signed-off-by: Norman Maurer <norman_maurer@apple.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The logic in rt6_age_examine_exception is confusing. The commit is
to refactor the code.
Signed-off-by: Xu Jia <xujia39@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When enabling a bearer by name, we don't sanity check its name with
higher slot in bearer list. This may have the effect that the name
of an already enabled bearer bypasses the check.
To fix the above issue, we just perform an extra checking with all
existing bearers.
Fixes: cb30a63384 ("tipc: refactor function tipc_enable_bearer()")
Cc: stable@vger.kernel.org
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now UDP supports sockmap and redirection, we can safely update
the sock type checks for it accordingly.
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210331023237.41094-15-xiyou.wangcong@gmail.com
We have to implement udp_bpf_recvmsg() to replace the ->recvmsg()
to retrieve skmsg from ingress_msg.
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210331023237.41094-14-xiyou.wangcong@gmail.com
Although these two functions are only used by TCP, they are not
specific to TCP at all, both operate on skmsg and ingress_msg,
so fit in net/core/skmsg.c very well.
And we will need them for non-TCP, so rename and move them to
skmsg.c and export them to modules.
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210331023237.41094-13-xiyou.wangcong@gmail.com
This is similar to tcp_read_sock(), except we do not need
to worry about connections, we just need to retrieve skb
from UDP receive queue.
Note, the return value of ->read_sock() is unused in
sk_psock_verdict_data_ready(), and UDP still does not
support splice() due to lack of ->splice_read(), so users
can not reach udp_read_sock() directly.
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210331023237.41094-12-xiyou.wangcong@gmail.com
Currently sockmap calls into each protocol to update the struct
proto and replace it. This certainly won't work when the protocol
is implemented as a module, for example, AF_UNIX.
Introduce a new ops sk->sk_prot->psock_update_sk_prot(), so each
protocol can implement its own way to replace the struct proto.
This also helps get rid of symbol dependencies on CONFIG_INET.
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210331023237.41094-11-xiyou.wangcong@gmail.com
Reusing BPF_SK_SKB_STREAM_VERDICT is possible but its name is
confusing and more importantly we still want to distinguish them
from user-space. So we can just reuse the stream verdict code but
introduce a new type of eBPF program, skb_verdict. Users are not
allowed to attach stream_verdict and skb_verdict programs to the
same map.
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210331023237.41094-10-xiyou.wangcong@gmail.com
sock_map_link() passes down map progs, but it is confusing
to see both map progs and psock progs. Make the map progs
more obvious by retrieving it directly with sock_map_progs()
inside sock_map_link(). Now it is aligned with
sock_map_link_no_progs() too.
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210331023237.41094-8-xiyou.wangcong@gmail.com
The RCU callback sk_psock_destroy() only queues work psock->gc,
so we can just switch to rcu work to simplify the code.
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210331023237.41094-6-xiyou.wangcong@gmail.com
We do not have to lock the sock to avoid losing sk_socket,
instead we can purge all the ingress queues when we close
the socket. Sending or receiving packets after orphaning
socket makes no sense.
We do purge these queues when psock refcnt reaches zero but
here we want to purge them explicitly in sock_map_close().
There are also some nasty race conditions on testing bit
SK_PSOCK_TX_ENABLED and queuing/canceling the psock work,
we can expand psock->ingress_lock a bit to protect them too.
As noticed by John, we still have to lock the psock->work,
because the same work item could be running concurrently on
different CPU's.
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210331023237.41094-5-xiyou.wangcong@gmail.com
We only have skb_send_sock_locked() which requires callers
to use lock_sock(). Introduce a variant skb_send_sock()
which locks on its own, callers do not need to lock it
any more. This will save us from adding a ->sendmsg_locked
for each protocol.
To reuse the code, pass function pointers to __skb_send_sock()
and build skb_send_sock() and skb_send_sock_locked() on top.
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20210331023237.41094-4-xiyou.wangcong@gmail.com
Currently we rely on lock_sock to protect ingress_msg,
it is too big for this, we can actually just use a spinlock
to protect this list like protecting other skb queues.
__tcp_bpf_recvmsg() is still special because of peeking,
it still has to use lock_sock.
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210331023237.41094-3-xiyou.wangcong@gmail.com
Currently we purge the ingress_skb queue only when psock
refcnt goes down to 0, so locking the queue is not necessary,
but in order to be called during ->close, we have to lock it
here.
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210331023237.41094-2-xiyou.wangcong@gmail.com
xdp_return_frame() may be called outside of NAPI context to return
xdpf back to page_pool. xdp_return_frame() calls __xdp_return() with
napi_direct = false. For page_pool memory model, __xdp_return() calls
xdp_return_frame_no_direct() unconditionally and below false negative
kernel BUG throw happened under preempt-rt build:
[ 430.450355] BUG: using smp_processor_id() in preemptible [00000000] code: modprobe/3884
[ 430.451678] caller is __xdp_return+0x1ff/0x2e0
[ 430.452111] CPU: 0 PID: 3884 Comm: modprobe Tainted: G U E 5.12.0-rc2+ #45
Changes in v2:
- This patch fixes the issue by making xdp_return_frame_no_direct() is
only called if napi_direct = true, as recommended for better by
Jesper Dangaard Brouer. Thanks!
Fixes: 2539650fad ("xdp: Helpers for disabling napi_direct of xdp_return_frame")
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
My previous commits added a dev_hold() in tunnels ndo_init(),
but forgot to remove it from special functions setting up fallback tunnels.
Fallback tunnels do call their respective ndo_init()
This leads to various reports like :
unregister_netdevice: waiting for ip6gre0 to become free. Usage count = 2
Fixes: 48bb569726 ("ip6_tunnel: sit: proper dev_{hold|put} in ndo_[un]init methods")
Fixes: 6289a98f08 ("sit: proper dev_{hold|put} in ndo_[un]init methods")
Fixes: 40cb881b5a ("ip6_vti: proper dev_{hold|put} in ndo_[un]init methods")
Fixes: 7f700334be ("ip6_gre: proper dev_{hold|put} in ndo_[un]init methods")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the missing destroy_workqueue() before return from
tipc_crypto_start() in the error handling case.
Fixes: 1ef6f7c939 ("tipc: add automatic session key exchange")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Steffen Klassert says:
====================
pull request (net): ipsec 2021-03-31
1) Fix ipv4 pmtu checks for xfrm anf vti interfaces.
From Eyal Birger.
2) There are situations where the socket passed to
xfrm_output_resume() is not the same as the one
attached to the skb. Use the socket passed to
xfrm_output_resume() to avoid lookup failures
when xfrm is used with VRFs.
From Evan Nimmo.
3) Make the xfrm_state_hash_generation sequence counter per
network namespace because but its write serialization
lock is also per network namespace. Write protection
is insufficient otherwise.
From Ahmed S. Darwish.
4) Fixup sctp featue flags when used with esp offload.
From Xin Long.
5) xfrm BEET mode doesn't support fragments for inner packets.
This is a limitation of the protocol, so no fix possible.
Warn at least to notify the user about that situation.
From Xin Long.
6) Fix NULL pointer dereference on policy lookup when
namespaces are uses in combination with esp offload.
7) Fix incorrect transformation on esp offload when
packets get segmented at layer 3.
8) Fix some user triggered usages of WARN_ONCE in
the xfrm compat layer.
From Dmitry Safonov.
Please pull or let me know if there are problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The XArray interface is easier for this driver to use. Also fixes a
bug reported by the improper use of GFP_ATOMIC.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
In rds_message_map_pages, the rm is freed by rds_message_put(rm).
But rm is still used by rm->data.op_sg in return value.
My patch assigns ERR_CAST(rm->data.op_sg) to err before the rm is
freed to avoid the uaf.
Fixes: 7dba92037b ("net/rds: Use ERR_PTR for rds_message_alloc_sgs()")
Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add FEC API to netlink.
This is not a 1-to-1 conversion.
FEC settings already depend on link modes to tell user which
modes are supported. Take this further an use link modes for
manual configuration. Old struct ethtool_fecparam is still
used to talk to the drivers, so we need to translate back
and forth. We can revisit the internal API if number of FEC
encodings starts to grow.
Enforce only one active FEC bit (by using a bit position
rather than another mask).
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
After a short network outage, the dst_entry is timed out and put
in DST_OBSOLETE_DEAD. We are in this code because arp reply comes
from this neighbour after network recovers. There is a potential
race condition that dst_entry is still in DST_OBSOLETE_DEAD.
With that, another neighbour lookup causes more harm than good.
In best case all packets in arp_queue are lost. This is
counterproductive to the original goal of finding a better path
for those packets.
I observed a worst case with 4.x kernel where a dst_entry in
DST_OBSOLETE_DEAD state is associated with loopback net_device.
It leads to an ethernet header with all zero addresses.
A packet with all zero source MAC address is quite deadly with
mac80211, ath9k and 802.11 block ack. It fails
ieee80211_find_sta_by_ifaddr in ath9k (xmit.c). Ath9k flushes tx
queue (ath_tx_complete_aggr). BAW (block ack window) is not
updated. BAW logic is damaged and ath9k transmission is disabled.
Signed-off-by: Tong Zhu <zhutong@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a helper function to set up the netlink and nfnetlink headers.
Update existing codebase to use it.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch adds a helper function to calculate the base sequence number
field that is stored in the nfnetlink header. Use the helper function
whenever possible.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The spinlock nf_tables_destroy_list_lock is initialized statically.
It is unnecessary to initialize by spin_lock_init().
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Move dst_check() to the garbage collector path. Stale routes trigger the
flow entry teardown state which makes affected flows go back to the
classic forwarding path to re-evaluate flow offloading.
IPv6 requires the dst cookie to work, store it in the flow_tuple,
otherwise dst_check() always fails.
Fixes: e5075c0bad ("netfilter: flowtable: call dst_check() to fall back to classic forwarding")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
modprobe calls from the nf_logger_find_get() API causes deadlock in very
special cases because they occur with the nf_tables transaction mutex held.
In the specific case of nf_log, deadlock is via:
A nf_tables -> transaction mutex -> nft_log -> modprobe -> nf_log_syslog \
-> pernet_ops rwsem -> wait for C
B netlink event -> rtnl_mutex -> nf_tables transaction mutex -> wait for A
C close() -> ip6mr_sk_done -> rtnl_mutex -> wait for B
Earlier patch added NFLOG/xt_LOG module softdeps to avoid the need to load
the backend module during a transaction.
For nft_log we would have to add a softdep for both nfnetlink_log or
nf_log_syslog, since we do not know in advance which of the two backends
are going to be configured.
This defers the modprobe op until after the transaction mutex is released.
Tested-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
xt_LOG has no direct dependency on the syslog-based logger, it relies
on the nf_log core to probe the requested backend.
Now that all syslog-based loggers reside in the same module, we can
just add a soft dependency on nf_log_syslog and let modprobe take
care of it.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Remove nf_log_common. Now that all per-af modules have been merged
there is no longer a need to provide a helper module.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Provide bridge log support from nf_log_syslog.
After the merge there is no need to load the "real packet loggers",
all of them now reside in the same module.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch added a new function mptcp_nl_remove_id_zero_address to
remove the id 0 address.
In this function, traverse all the existing msk sockets to find the
msk matched the input IP address. Then fill the removing list with
id 0, and pass it to mptcp_pm_remove_addr and mptcp_pm_remove_subflow.
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Suggested-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are some duplicate code in mptcp_pm_nl_rm_addr_received and
mptcp_pm_nl_rm_subflow_received. This patch unifies them into a new
function named mptcp_pm_nl_rm_addr_or_subflow. In it, use the input
parameter rm_type to identify it's now removing an address or a subflow.
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There's only one subflow involving the non-zero id address, but there
may be multi subflows involving the id 0 address.
Here's an example:
local_id=0, remote_id=0
local_id=1, remote_id=0
local_id=0, remote_id=1
If the removing address id is 0, all the subflows involving the id 0
address need to be removed.
In mptcp_pm_nl_rm_addr_received/mptcp_pm_nl_rm_subflow_received, the
"break" prevents the iteration to the next subflow, so this patch
dropped them.
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sysctl_icmp_echo_enable_probe is an u8.
ipv4_net_table entry should use
.maxlen = sizeof(u8).
.proc_handler = proc_dou8vec_minmax,
Fixes: f1b8fa9fa5 ("net: add sysctl for enabling RFC 8335 PROBE messages")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Andreas Roeseler <andreas.a.roeseler@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently the UDP protocol delivers GSO_FRAGLIST packets to
the sockets without the expected segmentation.
This change addresses the issue introducing and maintaining
a couple of new fields to explicitly accept SKB_GSO_UDP_L4
or GSO_FRAGLIST packets. Additionally updates udp_unexpected_gso()
accordingly.
UDP sockets enabling UDP_GRO stil keep accept_udp_fraglist
zeroed.
v1 -> v2:
- use 2 bits instead of a whole GSO bitmask (Willem)
Fixes: 9fd1ff5d2a ("udp: Support UDP fraglist GRO/GSO.")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After the previous patch, the stack can do L4 UDP aggregation
on top of a UDP tunnel.
In such scenario, udp{4,6}_gro_complete will be called twice. This function
will enter its is_flist branch immediately, even though that is only
correct on the second call, as GSO_FRAGLIST is only relevant for the
inner packet.
Instead, we need to try first UDP tunnel-based aggregation, if the GRO
packet requires that.
This patch changes udp{4,6}_gro_complete to skip the frag list processing
when while encap_mark == 1, identifying processing of the outer tunnel
header.
Additionally, clears the field in udp_gro_complete() so that we can enter
the frag list path on the next round, for the inner header.
v1 -> v2:
- hopefully clarified the commit message
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If NETIF_F_GRO_FRAGLIST or NETIF_F_GRO_UDP_FWD are enabled, and there
are UDP tunnels available in the system, udp_gro_receive() could end-up
doing L4 aggregation (either SKB_GSO_UDP_L4 or SKB_GSO_FRAGLIST) at
the outer UDP tunnel level for packets effectively carrying and UDP
tunnel header.
That could cause inner protocol corruption. If e.g. the relevant
packets carry a vxlan header, different vxlan ids will be ignored/
aggregated to the same GSO packet. Inner headers will be ignored, too,
so that e.g. TCP over vxlan push packets will be held in the GRO
engine till the next flush, etc.
Just skip the SKB_GSO_UDP_L4 and SKB_GSO_FRAGLIST code path if the
current packet could land in a UDP tunnel, and let udp_gro_receive()
do GRO via udp_sk(sk)->gro_receive.
The check implemented in this patch is broader than what is strictly
needed, as the existing UDP tunnel could be e.g. configured on top of
a different device: we could end-up skipping GRO at-all for some packets.
Anyhow, that is a very thin corner case and covering it will add quite
a bit of complexity.
v1 -> v2:
- hopefully clarify the commit message
Fixes: 9fd1ff5d2a ("udp: Support UDP fraglist GRO/GSO.")
Fixes: 36707061d6 ("udp: allow forwarding of plain (non-fraglisted) UDP GRO packets")
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When UDP packets generated locally by a socket with UDP_SEGMENT
traverse the following path:
UDP tunnel(xmit) -> veth (segmentation) -> veth (gro) ->
UDP tunnel (rx) -> UDP socket (no UDP_GRO)
ip_summed will be set to CHECKSUM_PARTIAL at creation time and
such checksum mode will be preserved in the above path up to the
UDP tunnel receive code where we have:
__iptunnel_pull_header() -> skb_pull_rcsum() ->
skb_postpull_rcsum() -> __skb_postpull_rcsum()
The latter will convert the skb to CHECKSUM_NONE.
The UDP GSO packet will be later segmented as part of the rx socket
receive operation, and will present a CHECKSUM_NONE after segmentation.
Additionally the segmented packets UDP CB still refers to the original
GSO packet len. Overall that causes unexpected/wrong csum validation
errors later in the UDP receive path.
We could possibly address the issue with some additional checks and
csum mangling in the UDP tunnel code. Since the issue affects only
this UDP receive slow path, let's set a suitable csum status there.
Note that SKB_GSO_UDP_L4 or SKB_GSO_FRAGLIST packets lacking an UDP
encapsulation present a valid checksum when landing to udp_queue_rcv_skb(),
as the UDP checksum has been validated by the GRO engine.
v2 -> v3:
- even more verbose commit message and comments
v1 -> v2:
- restrict the csum update to the packets strictly needing them
- hopefully clarify the commit message and code comments
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The TODO file here has not been updated from 2005, and the function
development described in the file have been implemented or abandoned.
Its existence will mislead developers seeking to view outdated information.
Signed-off-by: Wang Qing <wangqing@vivo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The TODO file here has not been updated for 13 years, and the function
development described in the file have been implemented or abandoned.
Its existence will mislead developers seeking to view outdated information.
Signed-off-by: Wang Qing <wangqing@vivo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
/proc/net/nf_conntrack shows icmpv6 as unknown.
Fixes: 09ec82f5af ("netfilter: conntrack: remove protocol name from l4proto struct")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Provide netdev family support from the nf_log_syslog module.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This removes the nf_log_ipv6 module, the functionality is now
provided by nf_log_syslog.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
similar to previous change: nf_log_syslog now covers ARP logging
as well, the old nf_log_arp module is removed.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Netfilter has multiple log modules:
nf_log_arp
nf_log_bridge
nf_log_ipv4
nf_log_ipv6
nf_log_netdev
nfnetlink_log
nf_log_common
With the exception of nfnetlink_log (packet is sent to userspace for
dissection/logging), all of them log to the kernel ringbuffer.
This is the first part of a series to merge all modules except
nfnetlink_log into a single module: nf_log_syslog.
This allows to reduce code. After the series, only two log modules remain:
nfnetlink_log and nf_log_syslog. The latter provides the same
functionality as the old per-af log modules.
This renames nf_log_ipv4 to nf_log_syslog.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Currently the mentioned helper can end-up freeing the socket wmem
without waking-up any processes waiting for more write memory.
If the partially orphaned skb is attached to an UDP (or raw) socket,
the lack of wake-up can hang the user-space.
Even for TCP sockets not calling the sk destructor could have bad
effects on TSQ.
Address the issue using skb_orphan to release the sk wmem before
setting the new sock_efree destructor. Additionally bundle the
whole ownership update in a new helper, so that later other
potential users could avoid duplicate code.
v1 -> v2:
- use skb_orphan() instead of sort of open coding it (Eric)
- provide an helper for the ownership change (Eric)
Fixes: f6ba8d33cf ("netem: fix skb_orphan_partial()")
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sch_htb: fix null pointer dereference on a null new_q
Currently if new_q is null, the null new_q pointer will be
dereference when 'q->offload' is true. Fix this by adding
a braces around htb_parent_to_leaf_offload() to avoid it.
Addresses-Coverity: ("Dereference after null check")
Fixes: d03b195b5a ("sch_htb: Hierarchical QoS hardware offload")
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
qrtr_tx_wait does not check for radix_tree_insert failure, causing
the 'flow' object to be unreferenced after qrtr_tx_wait return. Fix
that by releasing flow on radix_tree_insert failure.
Fixes: 5fdeb0d372 ("net: qrtr: Implement outgoing flow control")
Reported-by: syzbot+739016799a89c530b32a@syzkaller.appspotmail.com
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Modify the icmp_rcv function to check PROBE messages and call icmp_echo
if a PROBE request is detected.
Modify the existing icmp_echo function to respond ot both ping and PROBE
requests.
This was tested using a custom modification to the iputils package and
wireshark. It supports IPV4 probing by name, ifindex, and probing by
both IPV4 and IPV6 addresses. It currently does not support responding
to probes off the proxy node (see RFC 8335 Section 2).
The modification to the iputils package is still in development and can
be found here: https://github.com/Juniper-Clinic-2020/iputils.git. It
supports full sending functionality of PROBE requests, but currently
does not parse the response messages, which is why Wireshark is required
to verify the sent and recieved PROBE messages. The modification adds
the ``-e'' flag to the command which allows the user to specify the
interface identifier to query the probed host. An example usage would be
<./ping -4 -e 1 [destination]> to send a PROBE request of ifindex 1 to the
destination node.
Signed-off-by: Andreas Roeseler <andreas.a.roeseler@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add ipv6_dev_find to ipv6_stub to allow lookup of net_devices by IPV6
address in net/ipv4/icmp.c.
Signed-off-by: Andreas Roeseler <andreas.a.roeseler@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Modify the ping_supported function to support PROBE message types. This
allows tools such as the ping command in the iputils package to be
modified to send PROBE requests through the existing framework for
sending ping requests.
Signed-off-by: Andreas Roeseler <andreas.a.roeseler@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Section 8 of RFC 8335 specifies potential security concerns of
responding to PROBE requests, and states that nodes that support PROBE
functionality MUST be able to enable/disable responses and that
responses MUST be disabled by default
Signed-off-by: Andreas Roeseler <andreas.a.roeseler@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, action creation using ACT API in replace mode is buggy.
When invoking for non-existent action index 42,
tc action replace action bpf obj foo.o sec <xyz> index 42
kernel creates the action, fills up the netlink response, and then just
deletes the action after notifying userspace.
tc action show action bpf
doesn't list the action.
This happens due to the following sequence when ovr = 1 (replace mode)
is enabled:
tcf_idr_check_alloc is used to atomically check and either obtain
reference for existing action at index, or reserve the index slot using
a dummy entry (ERR_PTR(-EBUSY)).
This is necessary as pointers to these actions will be held after
dropping the idrinfo lock, so bumping the reference count is necessary
as we need to insert the actions, and notify userspace by dumping their
attributes. Finally, we drop the reference we took using the
tcf_action_put_many call in tcf_action_add. However, for the case where
a new action is created due to free index, its refcount remains one.
This when paired with the put_many call leads to the kernel setting up
the action, notifying userspace of its creation, and then tearing it
down. For existing actions, the refcount is still held so they remain
unaffected.
Fortunately due to rtnl_lock serialization requirement, such an action
with refcount == 1 will not be concurrently deleted by anything else, at
best CLS API can move its refcount up and down by binding to it after it
has been published from tcf_idr_insert_many. Since refcount is atleast
one until put_many call, CLS API cannot delete it. Also __tcf_action_put
release path already ensures deterministic outcome (either new action
will be created or existing action will be reused in case CLS API tries
to bind to action concurrently) due to idr lock serialization.
We fix this by making refcount of newly created actions as 2 in ACT API
replace mode. A relaxed store will suffice as visibility is ensured only
after the tcf_idr_insert_many call.
Note that in case of creation or overwriting using CLS API only (i.e.
bind = 1), overwriting existing action object is not allowed, and any
such request is silently ignored (without error).
The refcount bump that occurs in tcf_idr_check_alloc call there for
existing action will pair with tcf_exts_destroy call made from the
owner module for the same action. In case of action creation, there
is no existing action, so no tcf_exts_destroy callback happens.
This means no code changes for CLS API.
Fixes: cae422f379 ("net: sched: use reference counting action init")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Calling ncsi_stop_channel_monitor from channel_monitor is a guaranteed
deadlock on SMP because stop calls del_timer_sync on the timer that
invoked channel_monitor as its timer function.
Recognise the inherent race of marking the monitor disabled before
deleting the timer by just returning if enable was cleared. After
a timeout (the default case -- reset to START when response received)
just mark the monitor.enabled false.
If the channel has an entry on the channel_queue list, or if the
state is not ACTIVE or INACTIVE, then warn and mark the timer stopped
and don't restart, as the locking is broken somehow.
Fixes: 0795fb2021 ("net/ncsi: Stop monitor if channel times out or is inactive")
Signed-off-by: Milton Miller <miltonm@us.ibm.com>
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
checkpatch started to complain about the mispelling of:
CHECK: 'wont' may be misspelled - perhaps 'won't'?
#459: FILE: ./net/batman-adv/bat_iv_ogm.c:459:
+ * - the resulting packet wont be bigger than
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Replace WARN_ONCE() that can be triggered from userspace with
pr_warn_once(). Those still give user a hint what's the issue.
I've left WARN()s that are not possible to trigger with current
code-base and that would mean that the code has issues:
- relying on current compat_msg_min[type] <= xfrm_msg_min[type]
- expected 4-byte padding size difference between
compat_msg_min[type] and xfrm_msg_min[type]
- compat_policy[type].len <= xfrma_policy[type].len
(for every type)
Reported-by: syzbot+834ffd1afc7212eb8147@syzkaller.appspotmail.com
Fixes: 5f3eea6b7e ("xfrm/compat: Attach xfrm dumps to 64=>32 bit translator")
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: netdev@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
pahole currently only generates the btf_id for external function and
ftrace-able function. Some functions in the bpf_tcp_ca_kfunc_ids
are static (e.g. cubictcp_init). Thus, unless CONFIG_DYNAMIC_FTRACE
is set, btf_ids for those functions will not be generated and the
compilation fails during resolve_btfids.
This patch limits those functions to CONFIG_DYNAMIC_FTRACE. I will
address the pahole generation in a followup and then remove the
CONFIG_DYNAMIC_FTRACE limitation.
Fixes: e78aea8b21 ("bpf: tcp: Put some tcp cong functions in allowlist for bpf-tcp-cc")
Reported-by: Cong Wang <xiyou.wangcong@gmail.com>
Reported-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210329221357.834438-1-kafai@fb.com
tcp_min_tso_segs is now stored in u8, so max value is 255.
255 limit is enforced by proc_dou8vec_minmax().
We can therefore remove the gso_max_segs variable.
Fixes: 47996b489bdc ("tcp: convert elligible sysctls to u8")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After adopting CONFIG_PCPU_DEV_REFCNT=n option, syzbot was able to trigger
a warning [1]
Issue here is that:
- all dev_put() should be paired with a corresponding prior dev_hold().
- A driver doing a dev_put() in its ndo_uninit() MUST also
do a dev_hold() in its ndo_init(), only when ndo_init()
is returning 0.
Otherwise, register_netdevice() would call ndo_uninit()
in its error path and release a refcount too soon.
Fixes: 919067cc84 ("net: add CONFIG_PCPU_DEV_REFCNT")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After adopting CONFIG_PCPU_DEV_REFCNT=n option, syzbot was able to trigger
a warning [1]
Issue here is that:
- all dev_put() should be paired with a corresponding dev_hold(),
and vice versa.
- A driver doing a dev_put() in its ndo_uninit() MUST also
do a dev_hold() in its ndo_init(), only when ndo_init()
is returning 0.
Otherwise, register_netdevice() would call ndo_uninit()
in its error path and release a refcount too soon.
ip6_gre for example (among others problematic drivers)
has to use dev_hold() in ip6gre_tunnel_init_common()
instead of from ip6gre_newlink_common(), covering
both ip6gre_tunnel_init() and ip6gre_tap_init()/
Note that ip6gre_tunnel_init_common() is not called from
ip6erspan_tap_init() thus we also need to add a dev_hold() there,
as ip6erspan_tunnel_uninit() does call dev_put()
[1]
refcount_t: decrement hit 0; leaking memory.
WARNING: CPU: 0 PID: 8422 at lib/refcount.c:31 refcount_warn_saturate+0xbf/0x1e0 lib/refcount.c:31
Modules linked in:
CPU: 1 PID: 8422 Comm: syz-executor854 Not tainted 5.12.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:refcount_warn_saturate+0xbf/0x1e0 lib/refcount.c:31
Code: 1d 6a 5a e8 09 31 ff 89 de e8 8d 1a ab fd 84 db 75 e0 e8 d4 13 ab fd 48 c7 c7 a0 e1 c1 89 c6 05 4a 5a e8 09 01 e8 2e 36 fb 04 <0f> 0b eb c4 e8 b8 13 ab fd 0f b6 1d 39 5a e8 09 31 ff 89 de e8 58
RSP: 0018:ffffc900018befd0 EFLAGS: 00010282
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff88801ef19c40 RSI: ffffffff815c51f5 RDI: fffff52000317dec
RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000000
R10: ffffffff815bdf8e R11: 0000000000000000 R12: ffff888018cf4568
R13: ffff888018cf4c00 R14: ffff8880228f2000 R15: ffffffff8d659b80
FS: 00000000014eb300(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055d7bf2b3138 CR3: 0000000014933000 CR4: 00000000001506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
__refcount_dec include/linux/refcount.h:344 [inline]
refcount_dec include/linux/refcount.h:359 [inline]
dev_put include/linux/netdevice.h:4135 [inline]
ip6gre_tunnel_uninit+0x3d7/0x440 net/ipv6/ip6_gre.c:420
register_netdevice+0xadf/0x1500 net/core/dev.c:10308
ip6gre_newlink_common.constprop.0+0x158/0x410 net/ipv6/ip6_gre.c:1984
ip6gre_newlink+0x275/0x7a0 net/ipv6/ip6_gre.c:2017
__rtnl_newlink+0x1062/0x1710 net/core/rtnetlink.c:3443
rtnl_newlink+0x64/0xa0 net/core/rtnetlink.c:3491
rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5553
netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2502
netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline]
netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338
netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927
sock_sendmsg_nosec net/socket.c:654 [inline]
sock_sendmsg+0xcf/0x120 net/socket.c:674
____sys_sendmsg+0x6e8/0x810 net/socket.c:2350
___sys_sendmsg+0xf3/0x170 net/socket.c:2404
__sys_sendmsg+0xe5/0x1b0 net/socket.c:2433
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
Fixes: 919067cc84 ("net: add CONFIG_PCPU_DEV_REFCNT")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We fix a warning from the htmldoc tool and an indentation error reported
by smatch. There are no functional changes in this commit.
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the if(skb_peek(arrvq) == skb) branch, it calls __skb_dequeue(arrvq) to get
the skb by skb = skb_peek(arrvq). Then __skb_dequeue() unlinks the skb from arrvq
and returns the skb which equals to skb_peek(arrvq). After __skb_dequeue(arrvq)
finished, the skb is freed by kfree_skb(__skb_dequeue(arrvq)) in the first time.
Unfortunately, the same skb is freed in the second time by kfree_skb(skb) after
the branch completed.
My patch removes kfree_skb() in the if(skb_peek(arrvq) == skb) branch, because
this skb will be freed by kfree_skb(skb) finally.
Fixes: cb1b728096 ("tipc: eliminate race condition at multicast reception")
Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
If PHY is not available on DSA port (described at devicetree but absent or
failed to detect) then kernel prints warning after 3700 secs:
[ 3707.948771] ------------[ cut here ]------------
[ 3707.948784] Type was not set for devlink port.
[ 3707.948894] WARNING: CPU: 1 PID: 17 at net/core/devlink.c:8097 0xc083f9d8
We should unregister the devlink port as a user port and
re-register it as an unused port before executing "continue" in case of
dsa_port_setup error.
Fixes: 86f8b1c01a ("net: dsa: Do not make user port errors fatal")
Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since commit f5223e9eee ("can: extend sockaddr_can to include j1939
members") the sockaddr_can has been extended in size and a new
CAN_REQUIRED_SIZE macro has been introduced to calculate the protocol
specific needed size.
The ABI for the msg_name and msg_namelen has not been adapted to the
new CAN_REQUIRED_SIZE macro for the other CAN protocols which leads to
a problem when an existing binary reads the (increased) struct
sockaddr_can in msg_name.
Fixes: e057dd3fc2 ("can: add ISO 15765-2:2016 transport protocol")
Reported-by: Richard Weinberger <richard@nod.at>
Acked-by: Kurt Van Dijck <dev.kurt@vandijck-laurijssen.be>
Link: https://lore.kernel.org/linux-can/1135648123.112255.1616613706554.JavaMail.zimbra@nod.at/T/#t
Link: https://lore.kernel.org/r/20210325125850.1620-2-socketcan@hartkopp.net
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Since commit f5223e9eee ("can: extend sockaddr_can to include j1939
members") the sockaddr_can has been extended in size and a new
CAN_REQUIRED_SIZE macro has been introduced to calculate the protocol
specific needed size.
The ABI for the msg_name and msg_namelen has not been adapted to the
new CAN_REQUIRED_SIZE macro for the other CAN protocols which leads to
a problem when an existing binary reads the (increased) struct
sockaddr_can in msg_name.
Fixes: f5223e9eee ("can: extend sockaddr_can to include j1939 members")
Reported-by: Richard Weinberger <richard@nod.at>
Tested-by: Richard Weinberger <richard@nod.at>
Acked-by: Kurt Van Dijck <dev.kurt@vandijck-laurijssen.be>
Link: https://lore.kernel.org/linux-can/1135648123.112255.1616613706554.JavaMail.zimbra@nod.at/T/#t
Link: https://lore.kernel.org/r/20210325125850.1620-1-socketcan@hartkopp.net
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Commit 94579ac3f6 ("xfrm: Fix double ESP trailer insertion in IPsec
crypto offload.") added a XFRM_XMIT flag to avoid duplicate ESP trailer
insertion on HW offload. This flag is set on the secpath that is shared
amongst segments. This lead to a situation where some segments are
not transformed correctly when segmentation happens at layer 3.
Fix this by using private skb extensions for segmented and hw offloaded
ESP packets.
Fixes: 94579ac3f6 ("xfrm: Fix double ESP trailer insertion in IPsec crypto offload.")
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
There is a typo in the bbr function, s/even/event/.
This patch fixes it.
Fixes: e78aea8b21 ("bpf: tcp: Put some tcp cong functions in allowlist for bpf-tcp-cc")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210329003213.2274210-1-kafai@fb.com
Opportunity for min()
Generated by: scripts/coccinelle/misc/minmax.cocci
CC: Denis Efremov <efremov@linux.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: kernel test robot <lkp@intel.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the following W=1 kernel build warning(s):
net/nfc/digital_core.c:473: warning: expecting prototype for start_poll operation(). Prototype was for digital_start_poll() instead
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the following W=1 kernel build warning(s):
net/ipv6/ip6_tunnel.c:401: warning: expecting prototype for parse_tvl_tnl_enc_lim(). Prototype was for ip6_tnl_parse_tlv_enc_lim() instead
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the following W=1 kernel build warning(s):
net/9p/client.c:133: warning: expecting prototype for parse_options(). Prototype was for parse_opts() instead
net/9p/client.c:269: warning: expecting prototype for p9_req_alloc(). Prototype was for p9_tag_alloc() instead
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the following W=1 kernel build warning(s):
net/9p/trans_fd.c:881: warning: expecting prototype for p9_mux_destroy(). Prototype was for p9_conn_destroy() instead
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the following W=1 kernel build warning(s):
net/9p/error.c:207: warning: expecting prototype for errstr2errno(). Prototype was for p9_errstr2errno() instead
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the following W=1 kernel build warning(s):
net/core/netevent.c:45: warning: expecting prototype for netevent_unregister_notifier(). Prototype was for unregister_netevent_notifier() instead
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the following W=1 kernel build warning(s):
net/core/dev_addr_lists.c:732: warning: expecting prototype for dev_uc_flush(). Prototype was for dev_uc_init() instead
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the following W=1 kernel build warning(s):
net/netlabel/netlabel_mgmt.c:78: warning: expecting prototype for netlbl_mgmt_add(). Prototype was for netlbl_mgmt_add_common() instead
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>