android_kernel_xiaomi_sm8450/block
Omar Sandoval 3bc6d0f8b7 blk-rq-qos: fix crash on rq_qos_wait vs. rq_qos_wake_function race
commit e972b08b91ef48488bae9789f03cfedb148667fb upstream.

We're seeing crashes from rq_qos_wake_function that look like this:

  BUG: unable to handle page fault for address: ffffafe180a40084
  #PF: supervisor write access in kernel mode
  #PF: error_code(0x0002) - not-present page
  PGD 100000067 P4D 100000067 PUD 10027c067 PMD 10115d067 PTE 0
  Oops: Oops: 0002 [#1] PREEMPT SMP PTI
  CPU: 17 UID: 0 PID: 0 Comm: swapper/17 Not tainted 6.12.0-rc3-00013-geca631b8fe80 #11
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
  RIP: 0010:_raw_spin_lock_irqsave+0x1d/0x40
  Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 54 9c 41 5c fa 65 ff 05 62 97 30 4c 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 0a 4c 89 e0 41 5c c3 cc cc cc cc 89 c6 e8 2c 0b 00
  RSP: 0018:ffffafe180580ca0 EFLAGS: 00010046
  RAX: 0000000000000000 RBX: ffffafe180a3f7a8 RCX: 0000000000000011
  RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffffafe180a40084
  RBP: 0000000000000000 R08: 00000000001e7240 R09: 0000000000000011
  R10: 0000000000000028 R11: 0000000000000888 R12: 0000000000000002
  R13: ffffafe180a40084 R14: 0000000000000000 R15: 0000000000000003
  FS:  0000000000000000(0000) GS:ffff9aaf1f280000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: ffffafe180a40084 CR3: 000000010e428002 CR4: 0000000000770ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  PKRU: 55555554
  Call Trace:
   <IRQ>
   try_to_wake_up+0x5a/0x6a0
   rq_qos_wake_function+0x71/0x80
   __wake_up_common+0x75/0xa0
   __wake_up+0x36/0x60
   scale_up.part.0+0x50/0x110
   wb_timer_fn+0x227/0x450
   ...

So rq_qos_wake_function() calls wake_up_process(data->task), which calls
try_to_wake_up(), which faults in raw_spin_lock_irqsave(&p->pi_lock).

p comes from data->task, and data comes from the waitqueue entry, which
is stored on the waiter's stack in rq_qos_wait(). Analyzing the core
dump with drgn, I found that the waiter had already woken up and moved
on to a completely unrelated code path, clobbering what was previously
data->task. Meanwhile, the waker was passing the clobbered garbage in
data->task to wake_up_process(), leading to the crash.

What's happening is that in between rq_qos_wake_function() deleting the
waitqueue entry and calling wake_up_process(), rq_qos_wait() is finding
that it already got a token and returning. The race looks like this:

rq_qos_wait()                           rq_qos_wake_function()
==============================================================
prepare_to_wait_exclusive()
                                        data->got_token = true;
                                        list_del_init(&curr->entry);
if (data.got_token)
        break;
finish_wait(&rqw->wait, &data.wq);
  ^- returns immediately because
     list_empty_careful(&wq_entry->entry)
     is true
... return, go do something else ...
                                        wake_up_process(data->task)
                                          (NO LONGER VALID!)-^

Normally, finish_wait() is supposed to synchronize against the waker.
But, as noted above, it is returning immediately because the waitqueue
entry has already been removed from the waitqueue.

The bug is that rq_qos_wake_function() is accessing the waitqueue entry
AFTER deleting it. Note that autoremove_wake_function() wakes the waiter
and THEN deletes the waitqueue entry, which is the proper order.

Fix it by swapping the order. We also need to use
list_del_init_careful() to match the list_empty_careful() in
finish_wait().

Fixes: 38cfb5a45e ("blk-wbt: improve waking of tasks")
Cc: stable@vger.kernel.org
Signed-off-by: Omar Sandoval <osandov@fb.com>
Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/d3bee2463a67b1ee597211823bf7ad3721c26e41.1729014591.git.osandov@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-22 15:39:26 +02:00
..
partitions block: fix potential invalid pointer dereference in blk_add_partition 2024-10-17 15:07:43 +02:00
badblocks.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
bfq-cgroup.c block, bfq: fix uaf for bfqq in bic_set_bfqq() 2023-03-17 08:45:14 +01:00
bfq-iosched.c block, bfq: don't break merge chain in bfq_split_bfqq() 2024-10-17 15:07:43 +02:00
bfq-iosched.h bfq: Get rid of __bio_blkcg() usage 2022-06-09 10:21:31 +02:00
bfq-wf2q.c bfq: fix blkio cgroup leakage v4 2020-08-18 07:48:08 -07:00
bio-integrity.c block: initialize integrity buffer to zero before writing it to media 2024-09-12 11:06:41 +02:00
bio.c block: prevent an integer overflow in bvec_try_merge_hw_page 2024-02-23 08:42:09 +01:00
blk-cgroup-rwstat.c blk-cgroup: Fix the recursive blkg rwstat 2021-03-30 14:31:48 +02:00
blk-cgroup-rwstat.h blk-cgroup: separate out blkg_rwstat under CONFIG_BLK_CGROUP_RWSTAT 2019-11-07 12:28:13 -07:00
blk-cgroup.c blk-cgroup: fix missing pd_online_fn() while activating policy 2023-02-06 07:56:15 +01:00
blk-core.c block: fix use-after-free of q->q_usage_counter 2023-10-10 21:53:36 +02:00
blk-crypto-fallback.c block: rename generic_make_request to submit_bio_noacct 2020-07-01 07:27:24 -06:00
blk-crypto-internal.h blk-mq: release crypto keyslot before reporting I/O complete 2023-05-17 11:47:32 +02:00
blk-crypto.c blk-crypto: make blk_crypto_evict_key() more robust 2023-05-17 11:47:32 +02:00
blk-exec.c block: add a blk_account_io_merge_bio helper 2020-05-27 05:21:23 -06:00
blk-flush.c block: Fix fsync always failed if once failed 2022-01-27 10:54:30 +01:00
blk-integrity.c block: remove the blk_flush_integrity call in blk_integrity_unregister 2024-09-12 11:06:41 +02:00
blk-ioc.c block: remove retry loop in ioc_release_fn() 2020-07-16 10:22:15 -06:00
blk-iocost.c blk_iocost: fix more out of bound shifts 2024-10-17 15:08:10 +02:00
blk-iolatency.c blk-iolatency: Fix inflight count imbalances and IO hangs on offline 2022-06-09 10:21:29 +02:00
blk-lib.c block: add a bdev_is_partition helper 2020-09-25 08:18:57 -06:00
blk-map.c block-map: add __GFP_ZERO flag for alloc_page in function bio_copy_kern 2022-05-12 12:25:45 +02:00
blk-merge.c blk-mq: release crypto keyslot before reporting I/O complete 2023-05-17 11:47:32 +02:00
blk-mq-cpumap.c blk-mq: remove the calling of local_memory_node() 2020-10-20 07:08:17 -06:00
blk-mq-debugfs-zoned.c block: Cleanup license notice 2019-01-17 21:21:40 -07:00
blk-mq-debugfs.c blk-mq: don't create hctx debugfs dir until q->debugfs_dir is created 2022-08-21 15:15:35 +02:00
blk-mq-debugfs.h blk-mq: no need to check return value of debugfs_create functions 2019-06-13 03:00:30 -06:00
blk-mq-pci.c block: Fix blk_mq_*_map_queues() kernel-doc headers 2019-05-31 15:12:34 -06:00
blk-mq-rdma.c block: Fix blk_mq_*_map_queues() kernel-doc headers 2019-05-31 15:12:34 -06:00
blk-mq-sched.c blk-mq: correct stale comment of .get_budget 2023-03-11 16:39:15 +01:00
blk-mq-sched.h block-5.10-2020-10-12 2020-10-13 12:12:44 -07:00
blk-mq-sysfs.c blk-mq: fix possible memleak when register 'hctx' failed 2023-01-14 10:16:19 +01:00
blk-mq-tag.c blk-mq: avoid to iterate over stale request 2021-09-30 10:11:05 +02:00
blk-mq-tag.h blk-mq: clear stale request in tags->rq[] before freeing one request pool 2021-07-14 16:55:58 +02:00
blk-mq-virtio.c blk-mq: Fix typo in comment 2020-03-17 20:55:21 +01:00
blk-mq.c blk-mq: fix IO hang from sbitmap wakeup race 2024-02-23 08:42:14 +01:00
blk-mq.h blk-mq: grab rq->refcount before calling ->fn in blk_mq_tagset_busy_iter 2021-07-14 16:55:58 +02:00
blk-pm.c scsi: block: pm: Always set request queue runtime active in blk_post_runtime_resume() 2022-01-27 10:54:08 +01:00
blk-pm.h scsi: block: Do not accept any requests while suspended 2021-01-12 20:18:17 +01:00
blk-rq-qos.c blk-rq-qos: fix crash on rq_qos_wait vs. rq_qos_wake_function race 2024-10-22 15:39:26 +02:00
blk-rq-qos.h block: fix race between adding/removing rq qos and normal IO 2021-07-14 16:56:00 +02:00
blk-settings.c block: Clear zone limits for a non-zoned stacked queue 2024-04-13 12:58:07 +02:00
blk-stat.c block: prevent division by zero in blk_rq_stat_sum() 2024-04-13 12:59:49 +02:00
blk-stat.h block: deactivate blk_stat timer in wbt_disable_default() 2018-12-12 06:47:51 -07:00
blk-sysfs.c block: introduce zone_write_granularity limit 2024-04-13 12:58:07 +02:00
blk-throttle.c blk-throttle: fix lockdep warning of "cgroup_mutex or RCU read lock required!" 2023-12-20 15:44:31 +01:00
blk-timeout.c block: blk-timeout: delete duplicated word 2020-07-31 16:29:47 -06:00
blk-wbt.c blk-wbt: fix that 'rwb->wc' is always set to 1 in wbt_init() 2022-10-30 09:41:19 +01:00
blk-wbt.h blk-wbt: introduce a new disable state to prevent false positive by rwb_enabled() 2021-07-14 16:56:12 +02:00
blk-zoned.c blk-zoned: allow BLKREPORTZONE without CAP_SYS_ADMIN 2021-09-18 13:40:06 +02:00
blk.h block: bump max plugged deferred size from 16 to 32 2021-11-18 14:03:57 +01:00
bounce.c block: make bio_crypt_clone() able to fail 2020-10-05 10:47:43 -06:00
bsg-lib.c block: drop double zeroing 2020-09-23 09:18:13 -06:00
bsg.c scsi: bsg: Remove support for SCSI_IOCTL_SEND_COMMAND 2021-09-18 13:40:11 +02:00
cmdline-parser.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
elevator.c block/wbt: fix negative inflight counter when remove scsi device 2022-02-23 12:01:04 +01:00
genhd.c block: Suppress uevent for hidden device when removed 2021-03-30 14:31:52 +02:00
ioctl.c block/ioctl: prefer different overflow check 2024-07-05 09:12:33 +02:00
ioprio.c block: fix ioprio_get(IOPRIO_WHO_PGRP) vs setuid(2) 2021-12-14 11:32:40 +01:00
Kconfig blk-wbt: Remove obsolete multiqueue I/O scheduling comment 2020-09-01 16:49:26 -06:00
Kconfig.iosched treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
keyslot-manager.c blk-crypto: make blk_crypto_evict_key() more robust 2023-05-17 11:47:32 +02:00
kyber-iosched.c kyber: fix out of bounds access when preempted 2021-05-19 10:13:13 +02:00
Makefile blk-mq: merge blk-softirq.c into blk-mq.c 2020-06-24 09:15:56 -06:00
mq-deadline.c block: return ELEVATOR_DISCARD_MERGE if possible 2021-09-15 09:50:28 +02:00
opal_proto.h block: sed-opal: handle empty atoms when parsing response 2024-03-26 18:21:46 -04:00
scsi_ioctl.c drivers-5.10-2020-10-12 2020-10-13 13:04:41 -07:00
sed-opal.c block: sed-opal: handle empty atoms when parsing response 2024-03-26 18:21:46 -04:00
t10-pi.c block: Allow t10-pi to be modular 2020-01-06 20:59:04 -07:00