1140918 Commits

Author SHA1 Message Date
1a5ead77c9 ASoC: amd: yc: Add Razer Blade 14 2022 into DMI table
[ Upstream commit 68506a173dd700c2bd794dcc3489edcdb8ee35c6 ]

Razer Blade 14 (2022) - RZ09-0427 needs the quirk to enable the built in microphone

Signed-off-by: Wim Van Boven <wimvanboven@gmail.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20221216081828.12382-1-wimvanboven@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:28 +01:00
c658a6d550 ASoC: SOF: Add FW state to debugfs
[ Upstream commit 9a9134fd56f6ba614ff7b2b3b0bac0bf1d0dc0c9 ]

Allow system health detection mechanisms to check the FW state, this
will allow them to check if the FW is in its "crashed" state going
forward to help automatically diagnose driver state.

Signed-off-by: Curtis Malainey <cujomalainey@chromium.org>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Link: https://lore.kernel.org/r/20221220125629.8469-4-peter.ujfalusi@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:28 +01:00
4e491a1373 ASoC: SOF: pm: Always tear down pipelines before DSP suspend
[ Upstream commit d185e0689abc98ef55fb7a7d75aa0c48a0ed5838 ]

When the DSP is suspended while the firmware is in the crashed state, we
skip tearing down the pipelines. This means that the widget reference
counts will not get to reset to 0 before suspend. This will lead to
errors with resuming audio after system resume. To fix this, invoke the
tear_down_all_pipelines op before skipping to DSP suspend.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Reviewed-by: Curtis Malainey <cujomalainey@chromium.org>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Link: https://lore.kernel.org/r/20221220125629.8469-3-peter.ujfalusi@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:27 +01:00
7af7cb349c ASoC: SOF: pm: Set target state earlier
[ Upstream commit 6f95eec6fb89e195dbdf30de65553c7fc57d9372 ]

If the DSP crashes before the system suspends, the setting of target state
will be skipped because the firmware state will no longer be
SOF_FW_BOOT_COMPLETE. This leads to the incorrect assumption that the
DSP should suspend to D0I3 instead of suspending to D3. To fix this,
set the target_state before we skip to DSP suspend even when the DSP has
crashed.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Reviewed-by: Curtis Malainey <cujomalainey@chromium.org>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Link: https://lore.kernel.org/r/20221220125629.8469-2-peter.ujfalusi@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:27 +01:00
c9f0e54a25 scsi: iscsi: Fix multiple iSCSI session unbind events sent to userspace
[ Upstream commit a3be19b91ea7121d388084e8c07f5b1b982eb40c ]

It was observed that the kernel would potentially send
ISCSI_KEVENT_UNBIND_SESSION multiple times. Introduce 'target_state' in
iscsi_cls_session() to make sure session will send only one unbind session
event.

This introduces a regression wrt. the issue fixed in commit 13e60d3ba287
("scsi: iscsi: Report unbind session event when the target has been
removed"). If iscsid dies for any reason after sending an unbind session to
kernel, once iscsid is restarted, the kernel's ISCSI_KEVENT_UNBIND_SESSION
event is lost and userspace is then unable to logout. However, the session
is actually in invalid state (its target_id is INVALID) so iscsid should
not sync this session during restart.

Consequently we need to check the session's target state during iscsid
restart.  If session is in unbound state, do not sync this session and
perform session teardown. This is OK because once a session is unbound, we
can not recover it any more (mainly because its target id is INVALID).

Signed-off-by: Wenchao Hao <haowenchao@huawei.com>
Link: https://lore.kernel.org/r/20221126010752.231917-1-haowenchao@huawei.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Wu Bo <wubo40@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:27 +01:00
dc706d7787 tcp: fix rate_app_limited to default to 1
[ Upstream commit 300b655db1b5152d6101bcb6801d50899b20c2d6 ]

The initial default value of 0 for tp->rate_app_limited was incorrect,
since a flow is indeed application-limited until it first sends
data. Fixing the default to be 1 is generally correct but also
specifically will help user-space applications avoid using the initial
tcpi_delivery_rate value of 0 that persists until the connection has
some non-zero bandwidth sample.

Fixes: eb8329e0a04d ("tcp: export data delivery rate")
Suggested-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David Morley <morleyd@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Tested-by: David Morley <morleyd@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:27 +01:00
cefa85480a bnxt: Do not read past the end of test names
[ Upstream commit d3e599c090fc6977331150c5f0a69ab8ce87da21 ]

Test names were being concatenated based on a offset beyond the end of
the first name, which tripped the buffer overflow detection logic:

 detected buffer overflow in strnlen
 [...]
 Call Trace:
 bnxt_ethtool_init.cold+0x18/0x18

Refactor struct hwrm_selftest_qlist_output to use an actual array,
and adjust the concatenation to use snprintf() rather than a series of
strncat() calls.

Reported-by: Niklas Cassel <Niklas.Cassel@wdc.com>
Link: https://lore.kernel.org/lkml/Y8F%2F1w1AZTvLglFX@x1-carbon/
Tested-by: Niklas Cassel <Niklas.Cassel@wdc.com>
Fixes: eb51365846bc ("bnxt_en: Add basic ethtool -t selftest support.")
Cc: Michael Chan <michael.chan@broadcom.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:27 +01:00
aebf7e6270 net: stmmac: enable all safety features by default
[ Upstream commit fdfc76a116b5e9d3e98e6c96fe83b42d011d21d4 ]

In the original implementation of dwmac5
commit 8bf993a5877e ("net: stmmac: Add support for DWMAC5 and implement Safety Features")
all safety features were enabled by default.

Later it seems some implementations didn't have support for all the
features, so in
commit 5ac712dcdfef ("net: stmmac: enable platform specific safety features")
the safety_feat_cfg structure was added to the callback and defined for
some platforms to selectively enable these safety features.

The problem is that only certain platforms were given that software
support. If the automotive safety package bit is set in the hardware
features register the safety feature callback is called for the platform,
and for platforms that didn't get a safety_feat_cfg defined this results
in the following NULL pointer dereference:

[    7.933303] Call trace:
[    7.935812]  dwmac5_safety_feat_config+0x20/0x170 [stmmac]
[    7.941455]  __stmmac_open+0x16c/0x474 [stmmac]
[    7.946117]  stmmac_open+0x38/0x70 [stmmac]
[    7.950414]  __dev_open+0x100/0x1dc
[    7.954006]  __dev_change_flags+0x18c/0x204
[    7.958297]  dev_change_flags+0x24/0x6c
[    7.962237]  do_setlink+0x2b8/0xfa4
[    7.965827]  __rtnl_newlink+0x4ec/0x840
[    7.969766]  rtnl_newlink+0x50/0x80
[    7.973353]  rtnetlink_rcv_msg+0x12c/0x374
[    7.977557]  netlink_rcv_skb+0x5c/0x130
[    7.981500]  rtnetlink_rcv+0x18/0x2c
[    7.985172]  netlink_unicast+0x2e8/0x340
[    7.989197]  netlink_sendmsg+0x1a8/0x420
[    7.993222]  ____sys_sendmsg+0x218/0x280
[    7.997249]  ___sys_sendmsg+0xac/0x100
[    8.001103]  __sys_sendmsg+0x84/0xe0
[    8.004776]  __arm64_sys_sendmsg+0x24/0x30
[    8.008983]  invoke_syscall+0x48/0x114
[    8.012840]  el0_svc_common.constprop.0+0xcc/0xec
[    8.017665]  do_el0_svc+0x38/0xb0
[    8.021071]  el0_svc+0x2c/0x84
[    8.024212]  el0t_64_sync_handler+0xf4/0x120
[    8.028598]  el0t_64_sync+0x190/0x194

Go back to the original behavior, if the automotive safety package
is found to be supported in hardware enable all the features unless
safety_feat_cfg is passed in saying this particular platform only
supports a subset of the features.

Fixes: 5ac712dcdfef ("net: stmmac: enable platform specific safety features")
Reported-by: Ning Cai <ncai@quicinc.com>
Signed-off-by: Andrew Halaney <ahalaney@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:27 +01:00
2846a7412f thermal: core: call put_device() only after device_register() fails
[ Upstream commit 6c54b7bc8a31ce0f7cc7f8deef05067df414f1d8 ]

put_device() shouldn't be called before a prior call to
device_register(). __thermal_cooling_device_register() doesn't follow
that properly and needs fixing. Also
thermal_cooling_device_destroy_sysfs() is getting called unnecessarily
on few error paths.

Fix all this by placing the calls at the right place.

Based on initial work done by Caleb Connolly.

Fixes: 4748f9687caa ("thermal: core: fix some possible name leaks in error paths")
Fixes: c408b3d1d9bb ("thermal: Validate new state in cur_state_store()")
Reported-by: Caleb Connolly <caleb.connolly@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Frank Rowand <frowand.list@gmail.com>
Reviewed-by: Yang Yingliang <yangyingliang@huawei.com>
Tested-by: Caleb Connolly <caleb.connolly@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:27 +01:00
d4dbbbbb24 thermal/core: fix error code in __thermal_cooling_device_register()
[ Upstream commit e49a1e1ee078aee21006192076a8d93335e0daa9 ]

Return an error pointer if ->get_max_state() fails.  The current code
returns NULL which will cause an oops in the callers.

Fixes: c408b3d1d9bb ("thermal: Validate new state in cur_state_store()")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Stable-dep-of: 6c54b7bc8a31 ("thermal: core: call put_device() only after device_register() fails")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:26 +01:00
80bb3b901a thermal: Validate new state in cur_state_store()
[ Upstream commit c408b3d1d9bbc7de5fb0304fea424ef2539da616 ]

In cur_state_store(), the new state of the cooling device is received
from user-space and is not validated by the thermal core but the same is
left for the individual drivers to take care of. Apart from duplicating
the code it leaves possibility for introducing bugs where a driver may
not do it right.

Lets make the thermal core check the new state itself and store the max
value in the cooling device structure.

Link: https://lore.kernel.org/all/Y0ltRJRjO7AkawvE@kili/
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Stable-dep-of: 6c54b7bc8a31 ("thermal: core: call put_device() only after device_register() fails")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:26 +01:00
3896f78b22 net: dsa: microchip: ksz9477: port map correction in ALU table entry register
[ Upstream commit 6c977c5c2e4c5d8ad1b604724cc344e38f96fe9b ]

ALU table entry 2 register in KSZ9477 have bit positions reserved for
forwarding port map. This field is referred in ksz9477_fdb_del() for
clearing forward port map and alu table.

But current fdb_del refer ALU table entry 3 register for accessing forward
port map. Update ksz9477_fdb_del() to get forward port map from correct
alu table entry register.

With this bug, issue can be observed while deleting static MAC entries.
Delete any specific MAC entry using "bridge fdb del" command. This should
clear all the specified MAC entries. But it is observed that entries with
self static alone are retained.

Tested on LAN9370 EVB since ksz9477_fdb_del() is used common across
LAN937x and KSZ series.

Fixes: b987e98e50ab ("dsa: add DSA switch driver for Microchip KSZ9477")
Signed-off-by: Rakesh Sankaranarayanan <rakesh.sankaranarayanan@microchip.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20230118174735.702377-1-rakesh.sankaranarayanan@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:26 +01:00
248f6a70ea selftests/net: toeplitz: fix race on tpacket_v3 block close
[ Upstream commit 903848249a781d76d59561d51676c95b3a4d7162 ]

Avoid race between process wakeup and tpacket_v3 block timeout.

The test waits for cfg_timeout_msec for packets to arrive. Packets
arrive in tpacket_v3 rings, which pass packets ("frames") to the
process in batches ("blocks"). The sk waits for req3.tp_retire_blk_tov
msec to release a block.

Set the block timeout lower than the process waiting time, else
the process may find that no block has been released by the time it
scans the socket list. Convert to a ring of more than one, smaller,
blocks with shorter timeouts. Blocks must be page aligned, so >= 64KB.

Fixes: 5ebfb4cc3048 ("selftests/net: toeplitz test")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20230118151847.4124260-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:26 +01:00
c804752148 driver core: Fix test_async_probe_init saves device in wrong array
[ Upstream commit 9be182da0a7526f1b9a3777a336f83baa2e64d23 ]

In test_async_probe_init, second set of asynchronous devices are saved
in sync_dev[sync_id], which should be async_dev[async_id].
This makes these devices not unregistered when exit.

> modprobe test_async_driver_probe && \
> modprobe -r test_async_driver_probe && \
> modprobe test_async_driver_probe
 ...
> sysfs: cannot create duplicate filename '/devices/platform/test_async_driver.4'
> kobject_add_internal failed for test_async_driver.4 with -EEXIST,
  don't try to register things with the same name in the same directory.

Fixes: 57ea974fb871 ("driver core: Rewrite test_async_driver_probe to cover serialization and NUMA affinity")
Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com>
Link: https://lore.kernel.org/r/20221125063541.241328-1-chenzhongjin@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:26 +01:00
cfc7462ff8 w1: fix WARNING after calling w1_process()
[ Upstream commit 36225a7c72e9e3e1ce4001b6ce72849f5c9a2d3b ]

I got the following WARNING message while removing driver(ds2482):

------------[ cut here ]------------
do not call blocking ops when !TASK_RUNNING; state=1 set at [<000000002d50bfb6>] w1_process+0x9e/0x1d0 [wire]
WARNING: CPU: 0 PID: 262 at kernel/sched/core.c:9817 __might_sleep+0x98/0xa0
CPU: 0 PID: 262 Comm: w1_bus_master1 Tainted: G                 N 6.1.0-rc3+ #307
RIP: 0010:__might_sleep+0x98/0xa0
Call Trace:
 exit_signals+0x6c/0x550
 do_exit+0x2b4/0x17e0
 kthread_exit+0x52/0x60
 kthread+0x16d/0x1e0
 ret_from_fork+0x1f/0x30

The state of task is set to TASK_INTERRUPTIBLE in loop in w1_process(),
set it to TASK_RUNNING when it breaks out of the loop to avoid the
warning.

Fixes: 3c52e4e62789 ("W1: w1_process, block or sleep")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20221205101558.3599162-1-yangyingliang@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:26 +01:00
8bc7d87067 w1: fix deadloop in __w1_remove_master_device()
[ Upstream commit 25d5648802f12ae486076ceca5d7ddf1fef792b2 ]

I got a deadloop report while doing device(ds2482) add/remove test:

  [  162.241881] w1_master_driver w1_bus_master1: Waiting for w1_bus_master1 to become free: refcnt=1.
  [  163.272251] w1_master_driver w1_bus_master1: Waiting for w1_bus_master1 to become free: refcnt=1.
  [  164.296157] w1_master_driver w1_bus_master1: Waiting for w1_bus_master1 to become free: refcnt=1.
  ...

__w1_remove_master_device() can't return, because the dev->refcnt is not zero.

w1_add_master_device()			|
  w1_alloc_dev()			|
    atomic_set(&dev->refcnt, 2)		|
  kthread_run()				|
					|__w1_remove_master_device()
					|  kthread_stop()
  // KTHREAD_SHOULD_STOP is set,	|
  // threadfn(w1_process) won't be	|
  // called.				|
  kthread()				|
					|  // refcnt will never be 0, it's deadloop.
					|  while (atomic_read(&dev->refcnt)) {...}

After calling w1_add_master_device(), w1_process() is not really
invoked, before w1_process() starting, if kthread_stop() is called
in __w1_remove_master_device(), w1_process() will never be called,
the refcnt can not be decreased, then it causes deadloop in remove
function because of non-zero refcnt.

We need to make sure w1_process() is really started, so move the
set refcnt into w1_process() to fix this problem.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20221205080434.3149205-1-yangyingliang@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:26 +01:00
e75485fc58 device property: fix of node refcount leak in fwnode_graph_get_next_endpoint()
[ Upstream commit 39af728649b05e88a2b40e714feeee6451c3f18e ]

The 'parent' returned by fwnode_graph_get_port_parent()
with refcount incremented when 'prev' is not NULL, it
needs be put when finish using it.

Because the parent is const, introduce a new variable to
store the returned fwnode, then put it before returning
from fwnode_graph_get_next_endpoint().

Fixes: b5b41ab6b0c1 ("device property: Check fwnode->secondary in fwnode_graph_get_next_endpoint()")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-and-tested-by: Daniel Scally <djrscally@gmail.com>
Link: https://lore.kernel.org/r/20221123022542.2999510-1-yangyingliang@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:26 +01:00
13ba563c2c ptdma: pt_core_execute_cmd() should use spinlock
[ Upstream commit 95e5fda3b5f9ed8239b145da3fa01e641cf5d53c ]

The interrupt handler (pt_core_irq_handler()) of the ptdma
driver can be called from interrupt context. The code flow
in this function can lead down to pt_core_execute_cmd() which
will attempt to grab a mutex, which is not appropriate in
interrupt context and ultimately leads to a kernel panic.
The fix here changes this mutex to a spinlock, which has
been verified to resolve the issue.

Fixes: fa5d823b16a9 ("dmaengine: ptdma: Initial driver for the AMD PTDMA")
Signed-off-by: Eric Pilmore <epilmore@gigaio.com>
Link: https://lore.kernel.org/r/20230119033907.35071-1-epilmore@gigaio.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:25 +01:00
0c5213ce6e usb: dwc3: fix extcon dependency
[ Upstream commit 7d80dbd708c18c683dd34f79b600a05307707ce8 ]

The dwc3 core support now links against the extcon subsystem,
so it cannot be built-in when extcon is a loadable module:

arm-linux-gnueabi-ld: drivers/usb/dwc3/core.o: in function `dwc3_get_extcon':
core.c:(.text+0x572): undefined reference to `extcon_get_edev_by_phandle'
arm-linux-gnueabi-ld: core.c:(.text+0x596): undefined reference to `extcon_get_extcon_dev'
arm-linux-gnueabi-ld: core.c:(.text+0x5ea): undefined reference to `extcon_find_edev_by_node'

There was already a Kconfig dependency in the dual-role support,
but this is now needed for the entire dwc3 driver.

It is still possible to build dwc3 without extcon, but this
prevents it from being set to built-in when extcon is a loadable
module.

Fixes: d182c2e1bc92 ("usb: dwc3: Don't switch OTG -> peripheral if extcon is present")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Link: https://lore.kernel.org/r/20230118090147.2126563-1-arnd@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:25 +01:00
6df50d9469 tcp: avoid the lookup process failing to get sk in ehash table
[ Upstream commit 3f4ca5fafc08881d7a57daa20449d171f2887043 ]

While one cpu is working on looking up the right socket from ehash
table, another cpu is done deleting the request socket and is about
to add (or is adding) the big socket from the table. It means that
we could miss both of them, even though it has little chance.

Let me draw a call trace map of the server side.
   CPU 0                           CPU 1
   -----                           -----
tcp_v4_rcv()                  syn_recv_sock()
                            inet_ehash_insert()
                            -> sk_nulls_del_node_init_rcu(osk)
__inet_lookup_established()
                            -> __sk_nulls_add_node_rcu(sk, list)

Notice that the CPU 0 is receiving the data after the final ack
during 3-way shakehands and CPU 1 is still handling the final ack.

Why could this be a real problem?
This case is happening only when the final ack and the first data
receiving by different CPUs. Then the server receiving data with
ACK flag tries to search one proper established socket from ehash
table, but apparently it fails as my map shows above. After that,
the server fetches a listener socket and then sends a RST because
it finds a ACK flag in the skb (data), which obeys RST definition
in RFC 793.

Besides, Eric pointed out there's one more race condition where it
handles tw socket hashdance. Only by adding to the tail of the list
before deleting the old one can we avoid the race if the reader has
already begun the bucket traversal and it would possibly miss the head.

Many thanks to Eric for great help from beginning to end.

Fixes: 5e0724d027f0 ("tcp/dccp: fix hashdance race for passive sessions")
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/lkml/20230112065336.41034-1-kerneljasonxing@gmail.com/
Link: https://lore.kernel.org/r/20230118015941.1313-1-kerneljasonxing@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:25 +01:00
4888f9fc03 nvme-pci: fix timeout request state check
[ Upstream commit 1c5842085851f786eba24a39ecd02650ad892064 ]

Polling the completion can progress the request state to IDLE, either
inline with the completion, or through softirq. Either way, the state
may not be COMPLETED, so don't check for that. We only care if the state
isn't IN_FLIGHT.

This is fixing an issue where the driver aborts an IO that we just
completed. Seeing the "aborting" message instead of "polled" is very
misleading as to where the timeout problem resides.

Fixes: bf392a5dc02a9b ("nvme-pci: Remove tag from process cq")
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:25 +01:00
58bdab02b2 net: sched: gred: prevent races when adding offloads to stats
[ Upstream commit 339346d49ae0859fe19b860998867861d37f1a76 ]

Naresh reports seeing a warning that gred is calling
u64_stats_update_begin() with preemption enabled.
Arnd points out it's coming from _bstats_update().

We should be holding the qdisc lock when writing
to stats, they are also updated from the datapath.

Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Link: https://lore.kernel.org/all/CA+G9fYsTr9_r893+62u6UGD3dVaCE-kN9C-Apmb2m=hxjc1Cqg@mail.gmail.com/
Fixes: e49efd5288bd ("net: sched: gred: support reporting stats from offloads")
Link: https://lore.kernel.org/r/20230113044137.1383067-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:25 +01:00
b0b029ee04 drm/amd/display: fix issues with driver unload
[ Upstream commit e433adc60f7f847e734c56246b09291532f29b6d ]

Currently, we run into a number of WARN()s when attempting to unload the
amdgpu driver (e.g. using "modprobe -r amdgpu"). These all stem from
calling drm_encoder_cleanup() too early. So, to fix this we can stop
calling drm_encoder_cleanup() from amdgpu_dm_fini() and instead have it
be called from amdgpu_dm_encoder_destroy(). Also, we don't need to free
in amdgpu_dm_encoder_destroy() since mst_encoders[] isn't explicitly
allocated by the slab allocator.

Fixes: f74367e492ba ("drm/amdgpu/display: create fake mst encoders ahead of time (v4)")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:25 +01:00
8f4764c0d4 phy: phy-can-transceiver: Skip warning if no "max-bitrate"
[ Upstream commit bc30c15f275484f9b9fe27c2fa0895f3022d9943 ]

According to the DT bindings, the "max-bitrate" property is optional.
However, when it is not present, a warning is printed.
Fix this by adding a missing check for -EINVAL.

Fixes: a4a86d273ff1b6f7 ("phy: phy-can-transceiver: Add support for generic CAN transceiver driver")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/88e158f97dd52ebaa7126cd9631f34764b9c0795.1674037334.git.geert+renesas@glider.be
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:25 +01:00
567128076d dmaengine: tegra: Fix memory leak in terminate_all()
[ Upstream commit a7a7ee6f5a019ad72852c001abbce50d35e992f2 ]

Terminate vdesc when terminating an ongoing transfer.
This will ensure that the vdesc is present in the desc_terminated list
The descriptor will be freed later in desc_free_list().

This fixes the memory leaks which can happen when terminating an
ongoing transfer.

Fixes: ee17028009d4 ("dmaengine: tegra: Add tegra gpcdma driver")
Signed-off-by: Akhil R <akhilrajeev@nvidia.com>
Link: https://lore.kernel.org/r/20230118115801.15210-1-akhilrajeev@nvidia.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:24 +01:00
f7a57ef457 dmaengine: xilinx_dma: call of_node_put() when breaking out of for_each_child_of_node()
[ Upstream commit 596b53ccc36a546ab28e8897315c5b4d1d5a0200 ]

Since for_each_child_of_node() will increase the refcount of node, we need
to call of_node_put() manually when breaking out of the iteration.

Fixes: 9cd4360de609 ("dma: Add Xilinx AXI Video Direct Memory Access Engine driver support")
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Acked-by: Peter Korsgaard <peter@korsgaard.com>
Link: https://lore.kernel.org/r/20221122021612.1908866-1-liushixin2@huawei.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:24 +01:00
a8a518ff3b cifs: fix potential deadlock in cache_refresh_path()
[ Upstream commit 9fb0db40513e27537fde63287aea920b60557a69 ]

Avoid getting DFS referral from an exclusive lock in
cache_refresh_path() because the tcon IPC used for getting the
referral could be disconnected and thus causing a deadlock as shown
below:

task A                       task B
======                       ======
cifs_demultiplex_thread()    dfs_cache_find()
 cifs_handle_standard()       cache_refresh_path()
  reconnect_dfs_server()       down_write()
   dfs_cache_noreq_find()       get_dfs_referral()
    down_read() <- deadlock      smb2_get_dfs_refer()
                                  SMB2_ioctl()
				   cifs_send_recv()
				    compound_send_recv()
				     wait_for_response()

where task A cannot wake up task B because it is blocked on
down_read() due to the exclusive lock held in cache_refresh_path() and
therefore not being able to make progress.

Fixes: c9f711039905 ("cifs: keep referral server sessions alive")
Reviewed-by: Aurélien Aptel <aurelien.aptel@gmail.com>
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:24 +01:00
d5fb544b4c drm/i915/selftests: Unwind hugepages to drop wakeref on error
[ Upstream commit 93eea624526fc7d070cdae463408665824075f54 ]

Make sure that upon error after we have acquired the wakeref we do
release it again.

v2: add another missing "goto out_wf"(Andi).

Fixes: 027c38b4121e ("drm/i915/selftests: Grab the runtime pm in shrink_thp")
Cc: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Chris Wilson <chris.p.wilson@linux.intel.com>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230117123234.26487-1-nirmoy.das@intel.com
(cherry picked from commit 14ec40a88210151296fff3e981c1a7196ad9bf55)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:24 +01:00
07bc32e53c HID: betop: check shape of output reports
[ Upstream commit 3782c0d6edf658b71354a64d60aa7a296188fc90 ]

betopff_init() only checks the total sum of the report counts for each
report field to be at least 4, but hid_betopff_play() expects 4 report
fields.
A device advertising an output report with one field and 4 report counts
would pass the check but crash the kernel with a NULL pointer dereference
in hid_betopff_play().

Fixes: 52cd7785f3cd ("HID: betop: add drivers/hid/hid-betopff.c")
Signed-off-by: Pietro Borrello <borrello@diag.uniroma1.it>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:24 +01:00
b717cf5a6b l2tp: prevent lockdep issue in l2tp_tunnel_register()
[ Upstream commit b9fb10d131b8c84af9bb14e2078d5c63600c7dea ]

lockdep complains with the following lock/unlock sequence:

     lock_sock(sk);
     write_lock_bh(&sk->sk_callback_lock);
[1]  release_sock(sk);
[2]  write_unlock_bh(&sk->sk_callback_lock);

We need to swap [1] and [2] to fix this issue.

Fixes: 0b2c59720e65 ("l2tp: close all race conditions in l2tp_tunnel_register()")
Reported-by: syzbot+bbd35b345c7cab0d9a08@syzkaller.appspotmail.com
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/netdev/20230114030137.672706-1-xiyou.wangcong@gmail.com/T/#m1164ff20628671b0f326a24cb106ab3239c70ce3
Cc: Cong Wang <cong.wang@bytedance.com>
Cc: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:24 +01:00
c81fcd4e49 virtio-net: correctly enable callback during start_xmit
[ Upstream commit d71ebe8114b4bf622804b810f5e274069060a174 ]

Commit a7766ef18b33("virtio_net: disable cb aggressively") enables
virtqueue callback via the following statement:

        do {
		if (use_napi)
			virtqueue_disable_cb(sq->vq);

		free_old_xmit_skbs(sq, false);

	} while (use_napi && kick &&
               unlikely(!virtqueue_enable_cb_delayed(sq->vq)));

When NAPI is used and kick is false, the callback won't be enabled
here. And when the virtqueue is about to be full, the tx will be
disabled, but we still don't enable tx interrupt which will cause a TX
hang. This could be observed when using pktgen with burst enabled.

TO be consistent with the logic that tries to disable cb only for
NAPI, fixing this by trying to enable delayed callback only when NAPI
is enabled when the queue is about to be full.

Fixes: a7766ef18b33 ("virtio_net: disable cb aggressively")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Tested-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:24 +01:00
74a76b8017 net: macb: fix PTP TX timestamp failure due to packet padding
[ Upstream commit 7b90f5a665acd46efbbfa677a3a3a18d01ad6487 ]

PTP TX timestamp handling was observed to be broken with this driver
when using the raw Layer 2 PTP encapsulation. ptp4l was not receiving
the expected TX timestamp after transmitting a packet, causing it to
enter a failure state.

The problem appears to be due to the way that the driver pads packets
which are smaller than the Ethernet minimum of 60 bytes. If headroom
space was available in the SKB, this caused the driver to move the data
back to utilize it. However, this appears to cause other data references
in the SKB to become inconsistent. In particular, this caused the
ptp_one_step_sync function to later (in the TX completion path) falsely
detect the packet as a one-step SYNC packet, even when it was not, which
caused the TX timestamp to not be processed when it should be.

Using the headroom for this purpose seems like an unnecessary complexity
as this is not a hot path in the driver, and in most cases it appears
that there is sufficient tailroom to not require using the headroom
anyway. Remove this usage of headroom to prevent this inconsistency from
occurring and causing other problems.

Fixes: 653e92a9175e ("net: macb: add support for padding and fcs computation")
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Claudiu Beznea <claudiu.beznea@microchip.com> # on SAMA7G5
Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:24 +01:00
142d644fd2 dmaengine: Fix double increment of client_count in dma_chan_get()
[ Upstream commit f3dc1b3b4750851a94212dba249703dd0e50bb20 ]

The first time dma_chan_get() is called for a channel the channel
client_count is incorrectly incremented twice for public channels,
first in balance_ref_count(), and again prior to returning. This
results in an incorrect client count which will lead to the
channel resources not being freed when they should be. A simple
 test of repeated module load and unload of async_tx on a Dell
 Power Edge R7425 also shows this resulting in a kref underflow
 warning.

[  124.329662] async_tx: api initialized (async)
[  129.000627] async_tx: api initialized (async)
[  130.047839] ------------[ cut here ]------------
[  130.052472] refcount_t: underflow; use-after-free.
[  130.057279] WARNING: CPU: 3 PID: 19364 at lib/refcount.c:28
refcount_warn_saturate+0xba/0x110
[  130.065811] Modules linked in: async_tx(-) rfkill intel_rapl_msr
intel_rapl_common amd64_edac edac_mce_amd ipmi_ssif kvm_amd dcdbas kvm
mgag200 drm_shmem_helper acpi_ipmi irqbypass drm_kms_helper ipmi_si
syscopyarea sysfillrect rapl pcspkr ipmi_devintf sysimgblt fb_sys_fops
k10temp i2c_piix4 ipmi_msghandler acpi_power_meter acpi_cpufreq vfat
fat drm fuse xfs libcrc32c sd_mod t10_pi sg ahci crct10dif_pclmul
libahci crc32_pclmul crc32c_intel ghash_clmulni_intel igb megaraid_sas
i40e libata i2c_algo_bit ccp sp5100_tco dca dm_mirror dm_region_hash
dm_log dm_mod [last unloaded: async_tx]
[  130.117361] CPU: 3 PID: 19364 Comm: modprobe Kdump: loaded Not
tainted 5.14.0-185.el9.x86_64 #1
[  130.126091] Hardware name: Dell Inc. PowerEdge R7425/02MJ3T, BIOS
1.18.0 01/17/2022
[  130.133806] RIP: 0010:refcount_warn_saturate+0xba/0x110
[  130.139041] Code: 01 01 e8 6d bd 55 00 0f 0b e9 72 9d 8a 00 80 3d
26 18 9c 01 00 75 85 48 c7 c7 f8 a3 03 9d c6 05 16 18 9c 01 01 e8 4a
bd 55 00 <0f> 0b e9 4f 9d 8a 00 80 3d 01 18 9c 01 00 0f 85 5e ff ff ff
48 c7
[  130.157807] RSP: 0018:ffffbf98898afe68 EFLAGS: 00010286
[  130.163036] RAX: 0000000000000000 RBX: ffff9da06028e598 RCX: 0000000000000000
[  130.170172] RDX: ffff9daf9de26480 RSI: ffff9daf9de198a0 RDI: ffff9daf9de198a0
[  130.177316] RBP: ffff9da7cddf3970 R08: 0000000000000000 R09: 00000000ffff7fff
[  130.184459] R10: ffffbf98898afd00 R11: ffffffff9d9e8c28 R12: ffff9da7cddf1970
[  130.191596] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[  130.198739] FS:  00007f646435c740(0000) GS:ffff9daf9de00000(0000)
knlGS:0000000000000000
[  130.206832] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  130.212586] CR2: 00007f6463b214f0 CR3: 00000008ab98c000 CR4: 00000000003506e0
[  130.219729] Call Trace:
[  130.222192]  <TASK>
[  130.224305]  dma_chan_put+0x10d/0x110
[  130.227988]  dmaengine_put+0x7a/0xa0
[  130.231575]  __do_sys_delete_module.constprop.0+0x178/0x280
[  130.237157]  ? syscall_trace_enter.constprop.0+0x145/0x1d0
[  130.242652]  do_syscall_64+0x5c/0x90
[  130.246240]  ? exc_page_fault+0x62/0x150
[  130.250178]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  130.255243] RIP: 0033:0x7f6463a3f5ab
[  130.258830] Code: 73 01 c3 48 8b 0d 75 a8 1b 00 f7 d8 64 89 01 48
83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00
00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 45 a8 1b 00 f7 d8 64 89
01 48
[  130.277591] RSP: 002b:00007fff22f972c8 EFLAGS: 00000206 ORIG_RAX:
00000000000000b0
[  130.285164] RAX: ffffffffffffffda RBX: 000055b6786edd40 RCX: 00007f6463a3f5ab
[  130.292303] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 000055b6786edda8
[  130.299443] RBP: 000055b6786edd40 R08: 0000000000000000 R09: 0000000000000000
[  130.306584] R10: 00007f6463b9eac0 R11: 0000000000000206 R12: 000055b6786edda8
[  130.313731] R13: 0000000000000000 R14: 000055b6786edda8 R15: 00007fff22f995f8
[  130.320875]  </TASK>
[  130.323081] ---[ end trace eff7156d56b5cf25 ]---

cat /sys/class/dma/dma0chan*/in_use would get the wrong result.
2
2
2

Fixes: d2f4f99db3e9 ("dmaengine: Rework dma_chan_get")
Signed-off-by: Koba Ko <koba.ko@canonical.com>
Reviewed-by: Jie Hai <haijie1@huawei.com>
Test-by: Jie Hai <haijie1@huawei.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Tested-by: Joel Savitz <jsavitz@redhat.com>
Link: https://lore.kernel.org/r/20221201030050.978595-1-koba.ko@canonical.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:23 +01:00
f5e58b546c drm/panfrost: fix GENERIC_ATOMIC64 dependency
[ Upstream commit 6437a549ae178a3f5a5c03e983f291ebcdc2bbc7 ]

On ARMv5 and earlier, a randconfig build can still run into

WARNING: unmet direct dependencies detected for IOMMU_IO_PGTABLE_LPAE
  Depends on [n]: IOMMU_SUPPORT [=y] && (ARM [=y] || ARM64 || COMPILE_TEST [=y]) && !GENERIC_ATOMIC64 [=y]
  Selected by [y]:
  - DRM_PANFROST [=y] && HAS_IOMEM [=y] && DRM [=y] && (ARM [=y] || ARM64 || COMPILE_TEST [=y] && !GENERIC_ATOMIC64 [=y]) && MMU [=y]

Rework the dependencies to always require a working cmpxchg64.

Fixes: db594ba3fcf9 ("drm/panfrost: depend on !GENERIC_ATOMIC64 when using COMPILE_TEST")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230117164456.1591901-1-arnd@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:23 +01:00
459b413821 net: mlx5: eliminate anonymous module_init & module_exit
[ Upstream commit 2c1e1b949024989e20907b84e11a731a50778416 ]

Eliminate anonymous module_init() and module_exit(), which can lead to
confusion or ambiguity when reading System.map, crashes/oops/bugs,
or an initcall_debug log.

Give each of these init and exit functions unique driver-specific
names to eliminate the anonymous names.

Example 1: (System.map)
 ffffffff832fc78c t init
 ffffffff832fc79e t init
 ffffffff832fc8f8 t init

Example 2: (initcall_debug log)
 calling  init+0x0/0x12 @ 1
 initcall init+0x0/0x12 returned 0 after 15 usecs
 calling  init+0x0/0x60 @ 1
 initcall init+0x0/0x60 returned 0 after 2 usecs
 calling  init+0x0/0x9a @ 1
 initcall init+0x0/0x9a returned 0 after 74 usecs

Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Eli Cohen <eli@mellanox.com>
Cc: Saeed Mahameed <saeedm@nvidia.com>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: linux-rdma@vger.kernel.org
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:23 +01:00
d0453c63d3 net/mlx5: E-switch, Fix switchdev mode after devlink reload
[ Upstream commit 7c83d1f4c5adae9583e7fca1e3e830d6b061522d ]

The cited commit removes eswitch mode none. So after devlink reload
in switchdev mode, eswitch mode is not changed. But actually eswitch
is disabled during devlink reload.

Fix it by setting eswitch mode to legacy when disabling eswitch
which is called by reload_down.

Fixes: f019679ea5f2 ("net/mlx5: E-switch, Remove dependency between sriov and eswitch mode")
Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:23 +01:00
a03b6ef0f6 net/mlx5e: Set decap action based on attr for sample
[ Upstream commit ffa99b534732f90077f346c62094cab3d1ccddce ]

Currently decap action is set based on tunnel_id. That means it is
set unconditionally. But for decap, ct and sample actions, decap is
done before ct. No need to decap again in sample.

And the actions are set correctly when parsing. So set decap action
based on attr instead of tunnel_id.

Fixes: 2741f2230905 ("net/mlx5e: TC, Support sample offload action for tunneled traffic")
Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:23 +01:00
ad2732630f net/mlx5e: QoS, Fix wrongfully setting parent_element_id on MODIFY_SCHEDULING_ELEMENT
[ Upstream commit 4ddf77f9bc76092d268bd3af447d60d9cc62b652 ]

According to HW spec parent_element_id field should be reserved (0x0) when calling
MODIFY_SCHEDULING_ELEMENT command.

This patch remove the wrong initialization of reserved field, parent_element_id, on
mlx5_qos_update_node.

Fixes: 214baf22870c ("net/mlx5e: Support HTB offload")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:23 +01:00
0680225a55 net/mlx5: E-switch, Fix setting of reserved fields on MODIFY_SCHEDULING_ELEMENT
[ Upstream commit f51471d1935ce1f504fce6c115ce3bfbc32032b0 ]

According to HW spec element_type, element_attributes and parent_element_id fields
should be reserved (0x0) when calling MODIFY_SCHEDULING_ELEMENT command.

This patch remove initialization of these fields when calling the command.

Fixes: bd77bf1cb595 ("net/mlx5: Add SRIOV VF max rate configuration support")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:23 +01:00
d8da8d96ff net/mlx5e: Avoid false lock dependency warning on tc_ht even more
[ Upstream commit 5aa56105930374928d567744595fd7ac525d0688 ]

The cited commit changed class of tc_ht internal mutex in order to avoid
false lock dependency with fs_core node and flow_table hash table
structures. However, hash table implementation internally also includes a
workqueue task with its own lockdep map which causes similar bogus lockdep
splat[0]. Fix it by also adding dedicated class for hash table workqueue
work structure of tc_ht.

[0]:

[ 1139.672465] ======================================================
[ 1139.673552] WARNING: possible circular locking dependency detected
[ 1139.674635] 6.1.0_for_upstream_debug_2022_12_12_17_02 #1 Not tainted
[ 1139.675734] ------------------------------------------------------
[ 1139.676801] modprobe/5998 is trying to acquire lock:
[ 1139.677726] ffff88811e7b93b8 (&node->lock){++++}-{3:3}, at: down_write_ref_node+0x7c/0xe0 [mlx5_core]
[ 1139.679662]
               but task is already holding lock:
[ 1139.680703] ffff88813c1f96a0 (&tc_ht_lock_key){+.+.}-{3:3}, at: rhashtable_free_and_destroy+0x38/0x6f0
[ 1139.682223]
               which lock already depends on the new lock.

[ 1139.683640]
               the existing dependency chain (in reverse order) is:
[ 1139.684887]
               -> #2 (&tc_ht_lock_key){+.+.}-{3:3}:
[ 1139.685975]        __mutex_lock+0x12c/0x14b0
[ 1139.686659]        rht_deferred_worker+0x35/0x1540
[ 1139.687405]        process_one_work+0x7c2/0x1310
[ 1139.688134]        worker_thread+0x59d/0xec0
[ 1139.688820]        kthread+0x28f/0x330
[ 1139.689444]        ret_from_fork+0x1f/0x30
[ 1139.690106]
               -> #1 ((work_completion)(&ht->run_work)){+.+.}-{0:0}:
[ 1139.691250]        __flush_work+0xe8/0x900
[ 1139.691915]        __cancel_work_timer+0x2ca/0x3f0
[ 1139.692655]        rhashtable_free_and_destroy+0x22/0x6f0
[ 1139.693472]        del_sw_flow_table+0x22/0xb0 [mlx5_core]
[ 1139.694592]        tree_put_node+0x24c/0x450 [mlx5_core]
[ 1139.695686]        tree_remove_node+0x6e/0x100 [mlx5_core]
[ 1139.696803]        mlx5_destroy_flow_table+0x187/0x690 [mlx5_core]
[ 1139.698017]        mlx5e_tc_nic_cleanup+0x2f8/0x400 [mlx5_core]
[ 1139.699217]        mlx5e_cleanup_nic_rx+0x2b/0x210 [mlx5_core]
[ 1139.700397]        mlx5e_detach_netdev+0x19d/0x2b0 [mlx5_core]
[ 1139.701571]        mlx5e_suspend+0xdb/0x140 [mlx5_core]
[ 1139.702665]        mlx5e_remove+0x89/0x190 [mlx5_core]
[ 1139.703756]        auxiliary_bus_remove+0x52/0x70
[ 1139.704492]        device_release_driver_internal+0x3c1/0x600
[ 1139.705360]        bus_remove_device+0x2a5/0x560
[ 1139.706080]        device_del+0x492/0xb80
[ 1139.706724]        mlx5_rescan_drivers_locked+0x194/0x6a0 [mlx5_core]
[ 1139.707961]        mlx5_unregister_device+0x7a/0xa0 [mlx5_core]
[ 1139.709138]        mlx5_uninit_one+0x5f/0x160 [mlx5_core]
[ 1139.710252]        remove_one+0xd1/0x160 [mlx5_core]
[ 1139.711297]        pci_device_remove+0x96/0x1c0
[ 1139.722721]        device_release_driver_internal+0x3c1/0x600
[ 1139.723590]        unbind_store+0x1b1/0x200
[ 1139.724259]        kernfs_fop_write_iter+0x348/0x520
[ 1139.725019]        vfs_write+0x7b2/0xbf0
[ 1139.725658]        ksys_write+0xf3/0x1d0
[ 1139.726292]        do_syscall_64+0x3d/0x90
[ 1139.726942]        entry_SYSCALL_64_after_hwframe+0x46/0xb0
[ 1139.727769]
               -> #0 (&node->lock){++++}-{3:3}:
[ 1139.728698]        __lock_acquire+0x2cf5/0x62f0
[ 1139.729415]        lock_acquire+0x1c1/0x540
[ 1139.730076]        down_write+0x8e/0x1f0
[ 1139.730709]        down_write_ref_node+0x7c/0xe0 [mlx5_core]
[ 1139.731841]        mlx5_del_flow_rules+0x6f/0x610 [mlx5_core]
[ 1139.732982]        __mlx5_eswitch_del_rule+0xdd/0x560 [mlx5_core]
[ 1139.734207]        mlx5_eswitch_del_offloaded_rule+0x14/0x20 [mlx5_core]
[ 1139.735491]        mlx5e_tc_rule_unoffload+0x104/0x2b0 [mlx5_core]
[ 1139.736716]        mlx5e_tc_unoffload_fdb_rules+0x10c/0x1f0 [mlx5_core]
[ 1139.738007]        mlx5e_tc_del_fdb_flow+0xc3c/0xfa0 [mlx5_core]
[ 1139.739213]        mlx5e_tc_del_flow+0x146/0xa20 [mlx5_core]
[ 1139.740377]        _mlx5e_tc_del_flow+0x38/0x60 [mlx5_core]
[ 1139.741534]        rhashtable_free_and_destroy+0x3be/0x6f0
[ 1139.742351]        mlx5e_tc_ht_cleanup+0x1b/0x30 [mlx5_core]
[ 1139.743512]        mlx5e_cleanup_rep_tx+0x4a/0xe0 [mlx5_core]
[ 1139.744683]        mlx5e_detach_netdev+0x1ca/0x2b0 [mlx5_core]
[ 1139.745860]        mlx5e_netdev_change_profile+0xd9/0x1c0 [mlx5_core]
[ 1139.747098]        mlx5e_netdev_attach_nic_profile+0x1b/0x30 [mlx5_core]
[ 1139.748372]        mlx5e_vport_rep_unload+0x16a/0x1b0 [mlx5_core]
[ 1139.749590]        __esw_offloads_unload_rep+0xb1/0xd0 [mlx5_core]
[ 1139.750813]        mlx5_eswitch_unregister_vport_reps+0x409/0x5f0 [mlx5_core]
[ 1139.752147]        mlx5e_rep_remove+0x62/0x80 [mlx5_core]
[ 1139.753293]        auxiliary_bus_remove+0x52/0x70
[ 1139.754028]        device_release_driver_internal+0x3c1/0x600
[ 1139.754885]        driver_detach+0xc1/0x180
[ 1139.755553]        bus_remove_driver+0xef/0x2e0
[ 1139.756260]        auxiliary_driver_unregister+0x16/0x50
[ 1139.757059]        mlx5e_rep_cleanup+0x19/0x30 [mlx5_core]
[ 1139.758207]        mlx5e_cleanup+0x12/0x30 [mlx5_core]
[ 1139.759295]        mlx5_cleanup+0xc/0x49 [mlx5_core]
[ 1139.760384]        __x64_sys_delete_module+0x2b5/0x450
[ 1139.761166]        do_syscall_64+0x3d/0x90
[ 1139.761827]        entry_SYSCALL_64_after_hwframe+0x46/0xb0
[ 1139.762663]
               other info that might help us debug this:

[ 1139.763925] Chain exists of:
                 &node->lock --> (work_completion)(&ht->run_work) --> &tc_ht_lock_key

[ 1139.765743]  Possible unsafe locking scenario:

[ 1139.766688]        CPU0                    CPU1
[ 1139.767399]        ----                    ----
[ 1139.768111]   lock(&tc_ht_lock_key);
[ 1139.768704]                                lock((work_completion)(&ht->run_work));
[ 1139.769869]                                lock(&tc_ht_lock_key);
[ 1139.770770]   lock(&node->lock);
[ 1139.771326]
                *** DEADLOCK ***

[ 1139.772345] 2 locks held by modprobe/5998:
[ 1139.772994]  #0: ffff88813c1ff0e8 (&dev->mutex){....}-{3:3}, at: device_release_driver_internal+0x8d/0x600
[ 1139.774399]  #1: ffff88813c1f96a0 (&tc_ht_lock_key){+.+.}-{3:3}, at: rhashtable_free_and_destroy+0x38/0x6f0
[ 1139.775822]
               stack backtrace:
[ 1139.776579] CPU: 3 PID: 5998 Comm: modprobe Not tainted 6.1.0_for_upstream_debug_2022_12_12_17_02 #1
[ 1139.777935] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[ 1139.779529] Call Trace:
[ 1139.779992]  <TASK>
[ 1139.780409]  dump_stack_lvl+0x57/0x7d
[ 1139.781015]  check_noncircular+0x278/0x300
[ 1139.781687]  ? print_circular_bug+0x460/0x460
[ 1139.782381]  ? rcu_read_lock_sched_held+0x3f/0x70
[ 1139.783121]  ? lock_release+0x487/0x7c0
[ 1139.783759]  ? orc_find.part.0+0x1f1/0x330
[ 1139.784423]  ? mark_lock.part.0+0xef/0x2fc0
[ 1139.785091]  __lock_acquire+0x2cf5/0x62f0
[ 1139.785754]  ? register_lock_class+0x18e0/0x18e0
[ 1139.786483]  lock_acquire+0x1c1/0x540
[ 1139.787093]  ? down_write_ref_node+0x7c/0xe0 [mlx5_core]
[ 1139.788195]  ? lockdep_hardirqs_on_prepare+0x3f0/0x3f0
[ 1139.788978]  ? register_lock_class+0x18e0/0x18e0
[ 1139.789715]  down_write+0x8e/0x1f0
[ 1139.790292]  ? down_write_ref_node+0x7c/0xe0 [mlx5_core]
[ 1139.791380]  ? down_write_killable+0x220/0x220
[ 1139.792080]  ? find_held_lock+0x2d/0x110
[ 1139.792713]  down_write_ref_node+0x7c/0xe0 [mlx5_core]
[ 1139.793795]  mlx5_del_flow_rules+0x6f/0x610 [mlx5_core]
[ 1139.794879]  __mlx5_eswitch_del_rule+0xdd/0x560 [mlx5_core]
[ 1139.796032]  ? __esw_offloads_unload_rep+0xd0/0xd0 [mlx5_core]
[ 1139.797227]  ? xa_load+0x11a/0x200
[ 1139.797800]  ? __xa_clear_mark+0xf0/0xf0
[ 1139.798438]  mlx5_eswitch_del_offloaded_rule+0x14/0x20 [mlx5_core]
[ 1139.799660]  mlx5e_tc_rule_unoffload+0x104/0x2b0 [mlx5_core]
[ 1139.800821]  mlx5e_tc_unoffload_fdb_rules+0x10c/0x1f0 [mlx5_core]
[ 1139.802049]  ? mlx5_eswitch_get_uplink_priv+0x25/0x80 [mlx5_core]
[ 1139.803260]  mlx5e_tc_del_fdb_flow+0xc3c/0xfa0 [mlx5_core]
[ 1139.804398]  ? __cancel_work_timer+0x1c2/0x3f0
[ 1139.805099]  ? mlx5e_tc_unoffload_from_slow_path+0x460/0x460 [mlx5_core]
[ 1139.806387]  mlx5e_tc_del_flow+0x146/0xa20 [mlx5_core]
[ 1139.807481]  _mlx5e_tc_del_flow+0x38/0x60 [mlx5_core]
[ 1139.808564]  rhashtable_free_and_destroy+0x3be/0x6f0
[ 1139.809336]  ? mlx5e_tc_del_flow+0xa20/0xa20 [mlx5_core]
[ 1139.809336]  ? mlx5e_tc_del_flow+0xa20/0xa20 [mlx5_core]
[ 1139.810455]  mlx5e_tc_ht_cleanup+0x1b/0x30 [mlx5_core]
[ 1139.811552]  mlx5e_cleanup_rep_tx+0x4a/0xe0 [mlx5_core]
[ 1139.812655]  mlx5e_detach_netdev+0x1ca/0x2b0 [mlx5_core]
[ 1139.813768]  mlx5e_netdev_change_profile+0xd9/0x1c0 [mlx5_core]
[ 1139.814952]  mlx5e_netdev_attach_nic_profile+0x1b/0x30 [mlx5_core]
[ 1139.816166]  mlx5e_vport_rep_unload+0x16a/0x1b0 [mlx5_core]
[ 1139.817336]  __esw_offloads_unload_rep+0xb1/0xd0 [mlx5_core]
[ 1139.818507]  mlx5_eswitch_unregister_vport_reps+0x409/0x5f0 [mlx5_core]
[ 1139.819788]  ? mlx5_eswitch_uplink_get_proto_dev+0x30/0x30 [mlx5_core]
[ 1139.821051]  ? kernfs_find_ns+0x137/0x310
[ 1139.821705]  mlx5e_rep_remove+0x62/0x80 [mlx5_core]
[ 1139.822778]  auxiliary_bus_remove+0x52/0x70
[ 1139.823449]  device_release_driver_internal+0x3c1/0x600
[ 1139.824240]  driver_detach+0xc1/0x180
[ 1139.824842]  bus_remove_driver+0xef/0x2e0
[ 1139.825504]  auxiliary_driver_unregister+0x16/0x50
[ 1139.826245]  mlx5e_rep_cleanup+0x19/0x30 [mlx5_core]
[ 1139.827322]  mlx5e_cleanup+0x12/0x30 [mlx5_core]
[ 1139.828345]  mlx5_cleanup+0xc/0x49 [mlx5_core]
[ 1139.829382]  __x64_sys_delete_module+0x2b5/0x450
[ 1139.830119]  ? module_flags+0x300/0x300
[ 1139.830750]  ? task_work_func_match+0x50/0x50
[ 1139.831440]  ? task_work_cancel+0x20/0x20
[ 1139.832088]  ? lockdep_hardirqs_on_prepare+0x273/0x3f0
[ 1139.832873]  ? syscall_enter_from_user_mode+0x1d/0x50
[ 1139.833661]  ? trace_hardirqs_on+0x2d/0x100
[ 1139.834328]  do_syscall_64+0x3d/0x90
[ 1139.834922]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
[ 1139.835700] RIP: 0033:0x7f153e71288b
[ 1139.836302] Code: 73 01 c3 48 8b 0d 9d 75 0e 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 6d 75 0e 00 f7 d8 64 89 01 48
[ 1139.838866] RSP: 002b:00007ffe0a3ed938 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[ 1139.840020] RAX: ffffffffffffffda RBX: 0000564c2cbf8220 RCX: 00007f153e71288b
[ 1139.841043] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 0000564c2cbf8288
[ 1139.842072] RBP: 0000564c2cbf8220 R08: 0000000000000000 R09: 0000000000000000
[ 1139.843094] R10: 00007f153e7a3ac0 R11: 0000000000000206 R12: 0000564c2cbf8288
[ 1139.844118] R13: 0000000000000000 R14: 0000564c2cbf7ae8 R15: 00007ffe0a3efcb8

Fixes: 9ba33339c043 ("net/mlx5e: Avoid false lock depenency warning on tc_ht")
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:22 +01:00
e401ecbc37 net: ipa: disable ipa interrupt during suspend
[ Upstream commit 9ec9b2a30853ba843b70ea16f196e5fe3327be5f ]

The IPA interrupt can fire when pm_runtime is disabled due to it racing
with the PM suspend/resume code. This causes a splat in the interrupt
handler when it tries to call pm_runtime_get().

Explicitly disable the interrupt in our ->suspend callback, and
re-enable it in ->resume to avoid this. If there is an interrupt pending
it will be handled after resuming. The interrupt is a wake_irq, as a
result even when disabled if it fires it will cause the system to wake
from suspend as well as cancel any suspend transition that may be in
progress. If there is an interrupt pending, the ipa_isr_thread handler
will be called after resuming.

Fixes: 1aac309d3207 ("net: ipa: use autosuspend")
Signed-off-by: Caleb Connolly <caleb.connolly@linaro.org>
Reviewed-by: Alex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20230115175925.465918-1-caleb.connolly@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:22 +01:00
17511bd848 Bluetooth: Fix possible deadlock in rfcomm_sk_state_change
[ Upstream commit 1d80d57ffcb55488f0ec0b77928d4f82d16b6a90 ]

syzbot reports a possible deadlock in rfcomm_sk_state_change [1].
While rfcomm_sock_connect acquires the sk lock and waits for
the rfcomm lock, rfcomm_sock_release could have the rfcomm
lock and hit a deadlock for acquiring the sk lock.
Here's a simplified flow:

rfcomm_sock_connect:
  lock_sock(sk)
  rfcomm_dlc_open:
    rfcomm_lock()

rfcomm_sock_release:
  rfcomm_sock_shutdown:
    rfcomm_lock()
    __rfcomm_dlc_close:
        rfcomm_k_state_change:
	  lock_sock(sk)

This patch drops the sk lock before calling rfcomm_dlc_open to
avoid the possible deadlock and holds sk's reference count to
prevent use-after-free after rfcomm_dlc_open completes.

Reported-by: syzbot+d7ce59...@syzkaller.appspotmail.com
Fixes: 1804fdf6e494 ("Bluetooth: btintel: Combine setting up MSFT extension")
Link: https://syzkaller.appspot.com/bug?extid=d7ce59b06b3eb14fd218 [1]

Signed-off-by: Ying Hsu <yinghsu@chromium.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:22 +01:00
9a7c3befef Bluetooth: hci_event: Fix Invalid wait context
[ Upstream commit e9d50f76fe1f7f6f251114919247445fb5cb3734 ]

This fixes the following trace caused by attempting to lock
cmd_sync_work_lock while holding the rcu_read_lock:

kworker/u3:2/212 is trying to lock:
ffff888002600910 (&hdev->cmd_sync_work_lock){+.+.}-{3:3}, at:
hci_cmd_sync_queue+0xad/0x140
other info that might help us debug this:
context-{4:4}
4 locks held by kworker/u3:2/212:
 #0: ffff8880028c6530 ((wq_completion)hci0#2){+.+.}-{0:0}, at:
 process_one_work+0x4dc/0x9a0
 #1: ffff888001aafde0 ((work_completion)(&hdev->rx_work)){+.+.}-{0:0},
 at: process_one_work+0x4dc/0x9a0
 #2: ffff888002600070 (&hdev->lock){+.+.}-{3:3}, at:
 hci_cc_le_set_cig_params+0x64/0x4f0
 #3: ffffffffa5994b00 (rcu_read_lock){....}-{1:2}, at:
 hci_cc_le_set_cig_params+0x2f9/0x4f0

Fixes: 26afbd826ee3 ("Bluetooth: Add initial implementation of CIS connections")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:22 +01:00
2539cbc625 Bluetooth: ISO: Fix possible circular locking dependency
[ Upstream commit 6a5ad251b7cdb990a3705428aef408433f05614a ]

This attempts to fix the following trace:

kworker/u3:1/184 is trying to acquire lock:
ffff888001888130 (sk_lock-AF_BLUETOOTH-BTPROTO_ISO){+.+.}-{0:0}, at:
iso_connect_cfm+0x2de/0x690

but task is already holding lock:
ffff8880028d1c20 (&conn->lock){+.+.}-{2:2}, at:
iso_connect_cfm+0x265/0x690

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&conn->lock){+.+.}-{2:2}:
       lock_acquire+0x176/0x3d0
       _raw_spin_lock+0x2a/0x40
       __iso_sock_close+0x1dd/0x4f0
       iso_sock_release+0xa0/0x1b0
       sock_close+0x5e/0x120
       __fput+0x102/0x410
       task_work_run+0xf1/0x160
       exit_to_user_mode_prepare+0x170/0x180
       syscall_exit_to_user_mode+0x19/0x50
       do_syscall_64+0x4e/0x90
       entry_SYSCALL_64_after_hwframe+0x62/0xcc

-> #0 (sk_lock-AF_BLUETOOTH-BTPROTO_ISO){+.+.}-{0:0}:
       check_prev_add+0xfc/0x1190
       __lock_acquire+0x1e27/0x2750
       lock_acquire+0x176/0x3d0
       lock_sock_nested+0x32/0x80
       iso_connect_cfm+0x2de/0x690
       hci_cc_le_setup_iso_path+0x195/0x340
       hci_cmd_complete_evt+0x1ae/0x500
       hci_event_packet+0x38e/0x7c0
       hci_rx_work+0x34c/0x980
       process_one_work+0x5a5/0x9a0
       worker_thread+0x89/0x6f0
       kthread+0x14e/0x180
       ret_from_fork+0x22/0x30

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&conn->lock);
                               lock(sk_lock-AF_BLUETOOTH-BTPROTO_ISO);
                               lock(&conn->lock);
  lock(sk_lock-AF_BLUETOOTH-BTPROTO_ISO);

 *** DEADLOCK ***

Fixes: ccf74f2390d6 ("Bluetooth: Add BTPROTO_ISO socket type")
Fixes: f764a6c2c1e4 ("Bluetooth: ISO: Add broadcast support")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:22 +01:00
c524f9561c Bluetooth: ISO: Avoid circular locking dependency
[ Upstream commit 241f51931c35085449502c10f64fb3ecd6e02171 ]

This attempts to avoid circular locking dependency between sock_lock
and hdev_lock:

WARNING: possible circular locking dependency detected
6.0.0-rc7-03728-g18dd8ab0a783 #3 Not tainted
------------------------------------------------------
kworker/u3:2/53 is trying to acquire lock:
ffff888000254130 (sk_lock-AF_BLUETOOTH-BTPROTO_ISO){+.+.}-{0:0}, at:
iso_conn_del+0xbd/0x1d0
but task is already holding lock:
ffffffff9f39a080 (hci_cb_list_lock){+.+.}-{3:3}, at:
hci_le_cis_estabilished_evt+0x1b5/0x500
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (hci_cb_list_lock){+.+.}-{3:3}:
       __mutex_lock+0x10e/0xfe0
       hci_le_remote_feat_complete_evt+0x17f/0x320
       hci_event_packet+0x39c/0x7d0
       hci_rx_work+0x2bf/0x950
       process_one_work+0x569/0x980
       worker_thread+0x2a3/0x6f0
       kthread+0x153/0x180
       ret_from_fork+0x22/0x30
-> #1 (&hdev->lock){+.+.}-{3:3}:
       __mutex_lock+0x10e/0xfe0
       iso_connect_cis+0x6f/0x5a0
       iso_sock_connect+0x1af/0x710
       __sys_connect+0x17e/0x1b0
       __x64_sys_connect+0x37/0x50
       do_syscall_64+0x43/0x90
       entry_SYSCALL_64_after_hwframe+0x62/0xcc
-> #0 (sk_lock-AF_BLUETOOTH-BTPROTO_ISO){+.+.}-{0:0}:
       __lock_acquire+0x1b51/0x33d0
       lock_acquire+0x16f/0x3b0
       lock_sock_nested+0x32/0x80
       iso_conn_del+0xbd/0x1d0
       iso_connect_cfm+0x226/0x680
       hci_le_cis_estabilished_evt+0x1ed/0x500
       hci_event_packet+0x39c/0x7d0
       hci_rx_work+0x2bf/0x950
       process_one_work+0x569/0x980
       worker_thread+0x2a3/0x6f0
       kthread+0x153/0x180
       ret_from_fork+0x22/0x30
other info that might help us debug this:
Chain exists of:
  sk_lock-AF_BLUETOOTH-BTPROTO_ISO --> &hdev->lock --> hci_cb_list_lock
 Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(hci_cb_list_lock);
                               lock(&hdev->lock);
                               lock(hci_cb_list_lock);
  lock(sk_lock-AF_BLUETOOTH-BTPROTO_ISO);
 *** DEADLOCK ***
4 locks held by kworker/u3:2/53:
 #0: ffff8880021d9130 ((wq_completion)hci0#2){+.+.}-{0:0}, at:
 process_one_work+0x4ad/0x980
 #1: ffff888002387de0 ((work_completion)(&hdev->rx_work)){+.+.}-{0:0},
 at: process_one_work+0x4ad/0x980
 #2: ffff888001ac0070 (&hdev->lock){+.+.}-{3:3}, at:
 hci_le_cis_estabilished_evt+0xc3/0x500
 #3: ffffffff9f39a080 (hci_cb_list_lock){+.+.}-{3:3}, at:
 hci_le_cis_estabilished_evt+0x1b5/0x500

Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Stable-dep-of: 6a5ad251b7cd ("Bluetooth: ISO: Fix possible circular locking dependency")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:22 +01:00
8ac6043bd3 Bluetooth: hci_sync: fix memory leak in hci_update_adv_data()
[ Upstream commit 1ed8b37cbaf14574c779064ef1372af62e8ba6aa ]

When hci_cmd_sync_queue() failed in hci_update_adv_data(), inst_ptr is
not freed, which will cause memory leak, convert to use ERR_PTR/PTR_ERR
to pass the instance to callback so no memory needs to be allocated.

Fixes: 651cd3d65b0f ("Bluetooth: convert hci_update_adv_data to hci_sync")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:22 +01:00
f51a825b9f Bluetooth: hci_conn: Fix memory leaks
[ Upstream commit 3aa21311f36d8a2730c7ccef37235e951f23927b ]

When hci_cmd_sync_queue() failed in hci_le_terminate_big() or
hci_le_big_terminate(), the memory pointed by variable d is not freed,
which will cause memory leak. Add release process to error path.

Fixes: eca0ae4aea66 ("Bluetooth: Add initial implementation of BIS connections")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:21 +01:00
ed818fd8c5 Bluetooth: Fix a buffer overflow in mgmt_mesh_add()
[ Upstream commit 2185e0fdbb2137f22a9dd9fcbf6481400d56299b ]

Smatch Warning:
net/bluetooth/mgmt_util.c:375 mgmt_mesh_add() error: __memcpy()
'mesh_tx->param' too small (48 vs 50)

Analysis:

'mesh_tx->param' is array of size 48. This is the destination.
u8 param[sizeof(struct mgmt_cp_mesh_send) + 29]; // 19 + 29 = 48.

But in the caller 'mesh_send' we reject only when len > 50.
len > (MGMT_MESH_SEND_SIZE + 31) // 19 + 31 = 50.

Fixes: b338d91703fa ("Bluetooth: Implement support for Mesh")
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Signed-off-by: Brian Gix <brian.gix@intel.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:21 +01:00
5d947222e4 netfilter: conntrack: handle tcp challenge acks during connection reuse
[ Upstream commit c410cb974f2ba562920ecb8492ee66945dcf88af ]

When a connection is re-used, following can happen:
[ connection starts to close, fin sent in either direction ]
 > syn   # initator quickly reuses connection
 < ack   # peer sends a challenge ack
 > rst   # rst, sequence number == ack_seq of previous challenge ack
 > syn   # this syn is expected to pass

Problem is that the rst will fail window validation, so it gets
tagged as invalid.

If ruleset drops such packets, we get repeated syn-retransmits until
initator gives up or peer starts responding with syn/ack.

Before the commit indicated in the "Fixes" tag below this used to work:

The challenge-ack made conntrack re-init state based on the challenge
ack itself, so the following rst would pass window validation.

Add challenge-ack support: If we get ack for syn, record the ack_seq,
and then check if the rst sequence number matches the last ack number
seen in reverse direction.

Fixes: c7aab4f17021 ("netfilter: nf_conntrack_tcp: re-init for syn packets only")
Reported-by: Michal Tesar <mtesar@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:21 +01:00
bb03899b9b usb: gadget: f_fs: Ensure ep0req is dequeued before free_request
[ Upstream commit ce405d561b020e5a46340eb5146805a625dcacee ]

As per the documentation, function usb_ep_free_request guarantees
the request will not be queued or no longer be re-queued (or
otherwise used). However, with the current implementation it
doesn't make sure that the request in ep0 isn't reused.

Fix this by dequeuing the ep0req on functionfs_unbind before
freeing the request to align with the definition.

Fixes: ddf8abd25994 ("USB: f_fs: the FunctionFS driver")
Signed-off-by: Udipto Goswami <quic_ugoswami@quicinc.com>
Tested-by: Krishna Kurapati <quic_kriskura@quicinc.com>
Link: https://lore.kernel.org/r/20221215052906.8993-3-quic_ugoswami@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:21 +01:00