Go to file
Zhang Wensheng 8191b6cd9a driver core: fix potential deadlock in __driver_attach
[ Upstream commit 70fe758352cafdee72a7b13bf9db065f9613ced8 ]

In __driver_attach function, There are also AA deadlock problem,
like the commit b232b02bf3c2 ("driver core: fix deadlock in
__device_attach").

stack like commit b232b02bf3c2 ("driver core: fix deadlock in
__device_attach").
list below:
    In __driver_attach function, The lock holding logic is as follows:
    ...
    __driver_attach
    if (driver_allows_async_probing(drv))
      device_lock(dev)      // get lock dev
        async_schedule_dev(__driver_attach_async_helper, dev); // func
          async_schedule_node
            async_schedule_node_domain(func)
              entry = kzalloc(sizeof(struct async_entry), GFP_ATOMIC);
              /* when fail or work limit, sync to execute func, but
                 __driver_attach_async_helper will get lock dev as
                 will, which will lead to A-A deadlock.  */
              if (!entry || atomic_read(&entry_count) > MAX_WORK) {
                func;
              else
                queue_work_node(node, system_unbound_wq, &entry->work)
      device_unlock(dev)

    As above show, when it is allowed to do async probes, because of
    out of memory or work limit, async work is not be allowed, to do
    sync execute instead. it will lead to A-A deadlock because of
    __driver_attach_async_helper getting lock dev.

Reproduce:
and it can be reproduce by make the condition
(if (!entry || atomic_read(&entry_count) > MAX_WORK)) untenable, like
below:

[  370.785650] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
[  370.787154] task:swapper/0       state:D stack:    0 pid:    1 ppid:
0 flags:0x00004000
[  370.788865] Call Trace:
[  370.789374]  <TASK>
[  370.789841]  __schedule+0x482/0x1050
[  370.790613]  schedule+0x92/0x1a0
[  370.791290]  schedule_preempt_disabled+0x2c/0x50
[  370.792256]  __mutex_lock.isra.0+0x757/0xec0
[  370.793158]  __mutex_lock_slowpath+0x1f/0x30
[  370.794079]  mutex_lock+0x50/0x60
[  370.794795]  __device_driver_lock+0x2f/0x70
[  370.795677]  ? driver_probe_device+0xd0/0xd0
[  370.796576]  __driver_attach_async_helper+0x1d/0xd0
[  370.797318]  ? driver_probe_device+0xd0/0xd0
[  370.797957]  async_schedule_node_domain+0xa5/0xc0
[  370.798652]  async_schedule_node+0x19/0x30
[  370.799243]  __driver_attach+0x246/0x290
[  370.799828]  ? driver_allows_async_probing+0xa0/0xa0
[  370.800548]  bus_for_each_dev+0x9d/0x130
[  370.801132]  driver_attach+0x22/0x30
[  370.801666]  bus_add_driver+0x290/0x340
[  370.802246]  driver_register+0x88/0x140
[  370.802817]  ? virtio_scsi_init+0x116/0x116
[  370.803425]  scsi_register_driver+0x1a/0x30
[  370.804057]  init_sd+0x184/0x226
[  370.804533]  do_one_initcall+0x71/0x3a0
[  370.805107]  kernel_init_freeable+0x39a/0x43a
[  370.805759]  ? rest_init+0x150/0x150
[  370.806283]  kernel_init+0x26/0x230
[  370.806799]  ret_from_fork+0x1f/0x30

To fix the deadlock, move the async_schedule_dev outside device_lock,
as we can see, in async_schedule_node_domain, the parameter of
queue_work_node is system_unbound_wq, so it can accept concurrent
operations. which will also not change the code logic, and will
not lead to deadlock.

Fixes: ef0ff68351 ("driver core: Probe devices asynchronously instead of the driver")
Signed-off-by: Zhang Wensheng <zhangwensheng5@huawei.com>
Link: https://lore.kernel.org/r/20220622074327.497102-1-zhangwensheng5@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-25 11:17:52 +02:00
arch crypto: arm64/gcm - Select AEAD for GHASH_ARM64_CE 2022-08-25 11:17:41 +02:00
block blk-mq: don't create hctx debugfs dir until q->debugfs_dir is created 2022-08-25 11:17:36 +02:00
certs certs/blacklist_hashes.c: fix const confusion in certs blacklist 2022-06-22 14:11:22 +02:00
crypto crypto: drbg - make reseeding from get_random_bytes() synchronous 2022-06-22 14:11:18 +02:00
Documentation x86: Handle idle=nomwait cmdline properly for x86_idle 2022-08-25 11:17:28 +02:00
drivers driver core: fix potential deadlock in __driver_attach 2022-08-25 11:17:52 +02:00
fs fs: check FMODE_LSEEK to control internal pipe splicing 2022-08-25 11:17:44 +02:00
include can: error: specify the values of data[5..7] of CAN error frames 2022-08-25 11:17:47 +02:00
init random: handle latent entropy and command line from random_init() 2022-06-22 14:11:17 +02:00
ipc ipc/mqueue: use get_tree_nodev() in mqueue_get_tree() 2022-06-14 18:11:41 +02:00
kernel nohz/full, sched/rt: Fix missed tick-reenabling bug in dequeue_task_rt() 2022-08-25 11:17:36 +02:00
lib locking/refcount: Consolidate implementations of refcount_t 2022-07-29 17:14:17 +02:00
LICENSES LICENSES: Rename other to deprecated 2019-05-03 06:34:32 -06:00
mm mm/mremap: hold the rmap lock in write mode when moving page table entries. 2022-08-25 11:17:20 +02:00
net dccp: put dccp_qpolicy_full() and dccp_qpolicy_push() in the same lock 2022-08-25 11:17:49 +02:00
samples samples/kretprobes: Fix return value if register_kretprobe() failed 2021-11-17 09:48:39 +01:00
scripts modpost: fix section mismatch check for exported init/exit sections 2022-06-29 08:58:49 +02:00
security selinux: Add boundary check in put_entry() 2022-08-25 11:17:32 +02:00
sound ALSA: hda/realtek: Add quirk for another Asus K42JZ model 2022-08-25 11:17:21 +02:00
tools selftests/bpf: fix a test for snprintf() overflow 2022-08-25 11:17:45 +02:00
usr initramfs: restore default compression behavior 2020-04-08 09:08:38 +02:00
virt KVM: Don't null dereference ops->destroy 2022-08-11 12:57:52 +02:00
.clang-format clang-format: Update with the latest for_each macro list 2019-08-31 10:00:51 +02:00
.cocciconfig
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes
.gitignore Modules updates for v5.4 2019-09-22 10:34:46 -07:00
.mailmap ARM: SoC fixes 2019-11-10 13:41:59 -08:00
COPYING COPYING: use the new text with points to the license files 2018-03-23 12:41:45 -06:00
CREDITS MAINTAINERS: Remove Simon as Renesas SoC Co-Maintainer 2019-10-10 08:12:51 -07:00
Kbuild kbuild: do not descend to ./Kbuild when cleaning 2019-08-21 21:03:58 +09:00
Kconfig docs: kbuild: convert docs to ReST and rename to *.rst 2019-06-14 14:21:21 -06:00
MAINTAINERS MAINTAINERS: co-maintain random.c 2022-06-22 14:11:05 +02:00
Makefile Makefile: link with -z noexecstack --no-warn-rwx-segments 2022-08-25 11:17:17 +02:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.