android_rvh_blk_allocated_queue_init: Allocate specific request_queue
information and save the pointer address in the ANDROID_OEM_DATA field.
The allocation process may be scheduled, so a restricted hook function
is used
android_rvh_blk_flush_plug_list: Flush the customized plug list. During
this process, the scheduled queue_rq will be called to process the
request, so a restricted hook function is used
Bug: 319582497
Bug: 327716807
Change-Id: I0af3915de899b678ffd4f207cac2e35a744936b8
Signed-off-by: hao lv <hao.lv5@transsion.com>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmUlq8kACgkQONu9yGCS
aT5GiA//fiURwpUcawIhvgYewMVp+ovJ+mpX5IT+bMbW9Ur0sBhtiiU+WDNYxMru
34xbSQ/+o2a6N2tmK1JF7o76e2sHw/aRgaoDHkN5oEG+lbRH7TdCv6O0QRFAthcd
sJL+SX/GclcKW0ZHDjJX9Wt5Lq3gqVYlqJlCsw6gI/1JrQTxStrSQh7yRbrYSqpY
wGWEq19IrE/ToZFTBuPEEvlBswszGrI88lVtjvRzIdczQVyFLAoEQ2GNPWl3XNBh
ygGnwiHjk3a+QhZ30evIv2LX+tlGmpLy7gdLDsdZF7RfEkNHQ92IgaHvFDs8JqDg
QnRE8KCrC2V45OIQRRnA5NVtD3LBYM0bUhbqqLiNvTMiSIBWge4efJwxyYcTTfkX
MTmbo9z/bIVFdpgCQtneRw3eUyfbRKQ1cUvtmkuXIVLzvZUQaVMpXVZ6pz864E54
3nJrl2HJtIdJsRX5M4unL+AXNLRoJUbfb4hbzAD0Tg8Wbdgrn7vL/z6JmIzA2ssQ
+R/52ghimOThGTUbCi2pJx/cpKhegkJEJ7+JwUhS9L9ybA93g/bD0n9zy6JXpd/H
Cct0JWbiukbDp1CTLQ6Qm9TK5HANW2fXMHoR3H5ltPojNZwN7/pgYqN6ppjtBKVe
gA3k8KYkZoXjbF6VS1B5Y83wJ+H+39Luk/DSmm1ZNvYPHxmz+q0=
=2MQy
-----END PGP SIGNATURE-----
Merge 5.10.198 into android12-5.10-lts
Changes in 5.10.198
NFS: Use the correct commit info in nfs_join_page_group()
NFS/pNFS: Report EINVAL errors from connect() to the server
SUNRPC: Mark the cred for revalidation if the server rejects it
tracing: Increase trace array ref count on enable and filter files
ata: ahci: Drop pointless VPRINTK() calls and convert the remaining ones
ata: libahci: clear pending interrupt status
ext4: remove the 'group' parameter of ext4_trim_extent
ext4: add new helper interface ext4_try_to_trim_range()
ext4: scope ret locally in ext4_try_to_trim_range()
ext4: change s_last_trim_minblks type to unsigned long
ext4: mark group as trimmed only if it was fully scanned
ext4: replace the traditional ternary conditional operator with with max()/min()
ext4: move setting of trimmed bit into ext4_try_to_trim_range()
ext4: do not let fstrim block system suspend
tracing: Have event inject files inc the trace array ref count
netfilter: nf_tables: integrate pipapo into commit protocol
netfilter: nf_tables: don't skip expired elements during walk
netfilter: nf_tables: GC transaction API to avoid race with control plane
netfilter: nf_tables: adapt set backend to use GC transaction API
netfilter: nft_set_hash: mark set element as dead when deleting from packet path
netfilter: nf_tables: remove busy mark and gc batch API
netfilter: nf_tables: don't fail inserts if duplicate has expired
netfilter: nf_tables: fix GC transaction races with netns and netlink event exit path
netfilter: nf_tables: GC transaction race with netns dismantle
netfilter: nf_tables: GC transaction race with abort path
netfilter: nf_tables: use correct lock to protect gc_list
netfilter: nf_tables: defer gc run if previous batch is still pending
netfilter: nft_set_rbtree: skip sync GC for new elements in this transaction
netfilter: nft_set_rbtree: use read spinlock to avoid datapath contention
netfilter: nft_set_pipapo: stop GC iteration if GC transaction allocation fails
netfilter: nft_set_hash: try later when GC hits EAGAIN on iteration
netfilter: nf_tables: fix memleak when more than 255 elements expired
ASoC: meson: spdifin: start hw on dai probe
netfilter: nf_tables: disallow element removal on anonymous sets
bpf: Avoid deadlock when using queue and stack maps from NMI
selftests/tls: Add {} to avoid static checker warning
selftests: tls: swap the TX and RX sockets in some tests
ASoC: imx-audmix: Fix return error with devm_clk_get()
i40e: Fix VF VLAN offloading when port VLAN is configured
ipv4: fix null-deref in ipv4_link_failure
powerpc/perf/hv-24x7: Update domain value check
dccp: fix dccp_v4_err()/dccp_v6_err() again
platform/x86: intel_scu_ipc: Check status after timeout in busy_loop()
platform/x86: intel_scu_ipc: Check status upon timeout in ipc_wait_for_interrupt()
platform/x86: intel_scu_ipc: Don't override scu in intel_scu_ipc_dev_simple_command()
platform/x86: intel_scu_ipc: Fail IPC send if still busy
x86/srso: Fix srso_show_state() side effect
x86/srso: Fix SBPB enablement for spec_rstack_overflow=off
net: hns3: only enable unicast promisc when mac table full
net: hns3: add 5ms delay before clear firmware reset irq source
net: bridge: use DEV_STATS_INC()
team: fix null-ptr-deref when team device type is changed
netfilter: ipset: Fix race between IPSET_CMD_CREATE and IPSET_CMD_SWAP
seqlock: avoid -Wshadow warnings
seqlock: Rename __seqprop() users
seqlock: Prefix internal seqcount_t-only macros with a "do_"
locking/seqlock: Do the lockdep annotation before locking in do_write_seqcount_begin_nested()
bnxt_en: Flush XDP for bnxt_poll_nitroa0()'s NAPI
net: rds: Fix possible NULL-pointer dereference
gpio: tb10x: Fix an error handling path in tb10x_gpio_probe()
i2c: mux: demux-pinctrl: check the return value of devm_kstrdup()
netfilter: nf_tables: unregister flowtable hooks on netns exit
netfilter: nf_tables: double hook unregistration in netns path
Input: i8042 - rename i8042-x86ia64io.h to i8042-acpipnpio.h
Input: i8042 - add quirk for TUXEDO Gemini 17 Gen1/Clevo PD70PN
mmc: renesas_sdhi: probe into TMIO after SCC parameters have been setup
mmc: renesas_sdhi: populate SCC pointer at the proper place
mmc: tmio: support custom irq masks
mmc: renesas_sdhi: register irqs before registering controller
media: venus: core: Add io base variables for each block
media: venus: hfi,pm,firmware: Convert to block relative addressing
media: venus: hfi: Define additional 6xx registers
media: venus: core: Add differentiator IS_V6(core)
media: venus: hfi: Add a 6xx boot logic
media: venus: hfi_venus: Write to VIDC_CTRL_INIT after unmasking interrupts
netfilter: use actual socket sk for REJECT action
netfilter: nft_exthdr: Support SCTP chunks
netfilter: nf_tables: add and use nft_sk helper
netfilter: nf_tables: add and use nft_thoff helper
netfilter: nft_exthdr: break evaluation if setting TCP option fails
netfilter: exthdr: add support for tcp option removal
netfilter: nft_exthdr: Fix non-linear header modification
ata: libata: Rename link flag ATA_LFLAG_NO_DB_DELAY
ata: ahci: Add support for AMD A85 FCH (Hudson D4)
ata: ahci: Rename board_ahci_mobile
ata: ahci: Add Elkhart Lake AHCI controller
btrfs: reset destination buffer when read_extent_buffer() gets invalid range
MIPS: Alchemy: only build mmc support helpers if au1xmmc is enabled
bus: ti-sysc: Use fsleep() instead of usleep_range() in sysc_reset()
bus: ti-sysc: Fix missing AM35xx SoC matching
clk: tegra: fix error return case for recalc_rate
ARM: dts: omap: correct indentation
ARM: dts: ti: omap: Fix bandgap thermal cells addressing for omap3/4
ARM: dts: motorola-mapphone: Configure lower temperature passive cooling
ARM: dts: motorola-mapphone: Add 1.2GHz OPP
ARM: dts: motorola-mapphone: Drop second ti,wlcore compatible value
ARM: dts: am335x: Guardian: Update beeper label
ARM: dts: Unify pwm-omap-dmtimer node names
ARM: dts: ti: omap: motorola-mapphone: Fix abe_clkctrl warning on boot
bus: ti-sysc: Fix SYSC_QUIRK_SWSUP_SIDLE_ACT handling for uart wake-up
power: supply: ucs1002: fix error code in ucs1002_get_property()
xtensa: add default definition for XCHAL_HAVE_DIV32
xtensa: iss/network: make functions static
xtensa: boot: don't add include-dirs
xtensa: boot/lib: fix function prototypes
gpio: pmic-eic-sprd: Add can_sleep flag for PMIC EIC chip
i2c: npcm7xx: Fix callback completion ordering
dma-debug: don't call __dma_entry_alloc_check_leak() under free_entries_lock
parisc: sba: Fix compile warning wrt list of SBA devices
parisc: iosapic.c: Fix sparse warnings
parisc: drivers: Fix sparse warning
parisc: irq: Make irq_stack_union static to avoid sparse warning
scsi: qedf: Add synchronization between I/O completions and abort
selftests/ftrace: Correctly enable event in instance-event.tc
ring-buffer: Avoid softlockup in ring_buffer_resize()
selftests: fix dependency checker script
ring-buffer: Do not attempt to read past "commit"
platform/mellanox: mlxbf-bootctl: add NET dependency into Kconfig
scsi: pm80xx: Use phy-specific SAS address when sending PHY_START command
scsi: pm80xx: Avoid leaking tags when processing OPC_INB_SET_CONTROLLER_CONFIG command
ata: libata-eh: do not clear ATA_PFLAG_EH_PENDING in ata_eh_reset()
spi: nxp-fspi: reset the FLSHxCR1 registers
bpf: Clarify error expectations from bpf_clone_redirect
media: vb2: frame_vector.c: replace WARN_ONCE with a comment
powerpc/watchpoints: Disable preemption in thread_change_pc()
ncsi: Propagate carrier gain/loss events to the NCSI controller
fbdev/sh7760fb: Depend on FB=y
perf build: Define YYNOMEM as YYNOABORT for bison < 3.81
sched/cpuacct: Fix user/system in shown cpuacct.usage*
sched/cpuacct: Fix charge percpu cpuusage
sched/cpuacct: Optimize away RCU read lock
cgroup: Fix suspicious rcu_dereference_check() usage warning
ACPI: Check StorageD3Enable _DSD property in ACPI code
nvme-pci: factor the iod mempool creation into a helper
nvme-pci: factor out a nvme_pci_alloc_dev helper
nvme-pci: do not set the NUMA node of device if it has none
watchdog: iTCO_wdt: No need to stop the timer in probe
watchdog: iTCO_wdt: Set NO_REBOOT if the watchdog is not already running
netfilter: nft_exthdr: Search chunks in SCTP packets only
netfilter: nft_exthdr: Fix for unsafe packet data read
nvme-pci: always return an ERR_PTR from nvme_pci_alloc_dev
smack: Record transmuting in smk_transmuted
smack: Retrieve transmuting information in smack_inode_getsecurity()
Smack:- Use overlay inode label in smack_inode_copy_up()
Revert "tty: n_gsm: fix UAF in gsm_cleanup_mux"
serial: 8250_port: Check IRQ data before use
nilfs2: fix potential use after free in nilfs_gccache_submit_read_data()
netfilter: nf_tables: disallow rule removal from chain binding
ALSA: hda: Disable power save for solving pop issue on Lenovo ThinkCentre M70q
ata: libata-scsi: ignore reserved bits for REPORT SUPPORTED OPERATION CODES
i2c: i801: unregister tco_pdev in i801_probe() error path
Revert "SUNRPC dont update timeout value on connection reset"
proc: nommu: /proc/<pid>/maps: release mmap read lock
ring-buffer: Update "shortest_full" in polling
btrfs: properly report 0 avail for very full file systems
bpf: Fix BTF_ID symbol generation collision
bpf: Fix BTF_ID symbol generation collision in tools/
net: thunderbolt: Fix TCPv6 GSO checksum calculation
ata: libata-core: Fix ata_port_request_pm() locking
ata: libata-core: Fix port and device removal
ata: libata-core: Do not register PM operations for SAS ports
ata: libata-sata: increase PMP SRST timeout to 10s
fs: binfmt_elf_efpic: fix personality for ELF-FDPIC
spi: spi-zynqmp-gqspi: Fix runtime PM imbalance in zynqmp_qspi_probe
spi: zynqmp-gqspi: fix clock imbalance on probe failure
NFS: Cleanup unused rpc_clnt variable
NFS: rename nfs_client_kset to nfs_kset
NFSv4: Fix a state manager thread deadlock regression
ring-buffer: remove obsolete comment for free_buffer_page()
ring-buffer: Fix bytes info in per_cpu buffer stats
drm/mediatek: Fix backport issue in mtk_drm_gem_prime_vmap()
rbd: move rbd_dev_refresh() definition
rbd: decouple header read-in from updating rbd_dev->header
rbd: decouple parent info read-in from updating rbd_dev
rbd: take header_rwsem in rbd_dev_refresh() only when updating
block: fix use-after-free of q->q_usage_counter
Revert "clk: imx: pll14xx: dynamically configure PLL for 393216000/361267200Hz"
Revert "PCI: qcom: Disable write access to read only registers for IP v2.3.3"
scsi: zfcp: Fix a double put in zfcp_port_enqueue()
qed/red_ll2: Fix undefined behavior bug in struct qed_ll2_info
wifi: mwifiex: Fix tlv_buf_left calculation
net: replace calls to sock->ops->connect() with kernel_connect()
net: prevent rewrite of msg_name in sock_sendmsg()
arm64: Add Cortex-A520 CPU part definition
ubi: Refuse attaching if mtd's erasesize is 0
wifi: iwlwifi: dbg_ini: fix structure packing
wifi: mwifiex: Fix oob check condition in mwifiex_process_rx_packet
bpf: Fix tr dereferencing
drivers/net: process the result of hdlc_open() and add call of hdlc_close() in uhdlc_close()
wifi: mt76: mt76x02: fix MT76x0 external LNA gain handling
regmap: rbtree: Fix wrong register marked as in-cache when creating new node
ima: Finish deprecation of IMA_TRUSTED_KEYRING Kconfig
scsi: target: core: Fix deadlock due to recursive locking
ima: rework CONFIG_IMA dependency block
NFSv4: Fix a nfs4_state_manager() race
modpost: add missing else to the "of" check
net: fix possible store tearing in neigh_periodic_work()
ipv4, ipv6: Fix handling of transhdrlen in __ip{,6}_append_data()
net: dsa: mv88e6xxx: Avoid EEPROM timeout when EEPROM is absent
net: usb: smsc75xx: Fix uninit-value access in __smsc75xx_read_reg
net: nfc: llcp: Add lock when modifying device list
net: ethernet: ti: am65-cpsw: Fix error code in am65_cpsw_nuss_init_tx_chns()
netfilter: handle the connecting collision properly in nf_conntrack_proto_sctp
netfilter: nf_tables: nft_set_rbtree: fix spurious insertion failure
net: stmmac: dwmac-stm32: fix resume on STM32 MCU
tipc: fix a potential deadlock on &tx->lock
tcp: fix quick-ack counting to count actual ACKs of new data
tcp: fix delayed ACKs for MSS boundary condition
sctp: update transport state when processing a dupcook packet
sctp: update hb timer immediately after users change hb_interval
cpupower: add Makefile dependencies for install targets
dm zoned: free dmz->ddev array in dmz_put_zoned_devices
RDMA/core: Require admin capabilities to set system parameters
of: dynamic: Fix potential memory leak in of_changeset_action()
IB/mlx4: Fix the size of a buffer in add_port_entries()
gpio: aspeed: fix the GPIO number passed to pinctrl_gpio_set_config()
gpio: pxa: disable pinctrl calls for MMP_GPIO
RDMA/cma: Initialize ib_sa_multicast structure to 0 when join
RDMA/cma: Fix truncation compilation warning in make_cma_ports
RDMA/uverbs: Fix typo of sizeof argument
RDMA/siw: Fix connection failure handling
RDMA/mlx5: Fix NULL string error
parisc: Restore __ldcw_align for PA-RISC 2.0 processors
netfilter: nf_tables: fix kdoc warnings after gc rework
netfilter: nftables: exthdr: fix 4-byte stack OOB write
mmc: renesas_sdhi: only reset SCC when its pointer is populated
xen/events: replace evtchn_rwlock with RCU
Linux 5.10.198
Change-Id: Iabfdf919ae63e41a565e523087d800ebc20e5448
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
commit d36a9ea5e7766961e753ee38d4c331bbe6ef659b upstream.
For blk-mq, queue release handler is usually called after
blk_mq_freeze_queue_wait() returns. However, the
q_usage_counter->release() handler may not be run yet at that time, so
this can cause a use-after-free.
Fix the issue by moving percpu_ref_exit() into blk_free_queue_rcu().
Since ->release() is called with rcu read lock held, it is agreed that
the race should be covered in caller per discussion from the two links.
Reported-by: Zhang Wensheng <zhangwensheng@huaweicloud.com>
Reported-by: Zhong Jinghua <zhongjinghua@huawei.com>
Link: https://lore.kernel.org/linux-block/Y5prfOjyyjQKUrtH@T590/T/#u
Link: https://lore.kernel.org/lkml/Y4%2FmzMd4evRg9yDi@fedora/
Cc: Hillf Danton <hdanton@sina.com>
Cc: Yu Kuai <yukuai3@huawei.com>
Cc: Dennis Zhou <dennis@kernel.org>
Fixes: 2b0d3d3e4f ("percpu_ref: reduce memory footprint of percpu_ref in fast path")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20221215021629.74870-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Saranya Muruganandam <saranyamohan@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9cd1e566676bbcb8a126acd921e4e194e6339603 upstream.
Once all I/O using a blk_crypto_key has completed, filesystems can call
blk_crypto_evict_key(). However, the block layer currently doesn't call
blk_crypto_put_keyslot() until the request is being freed, which happens
after upper layers have been told (via bio_endio()) the I/O has
completed. This causes a race condition where blk_crypto_evict_key()
can see 'slot_refs != 0' without there being an actual bug.
This makes __blk_crypto_evict_key() hit the
'WARN_ON_ONCE(atomic_read(&slot->slot_refs) != 0)' and return without
doing anything, eventually causing a use-after-free in
blk_crypto_reprogram_all_keys(). (This is a very rare bug and has only
been seen when per-file keys are being used with fscrypt.)
There are two options to fix this: either release the keyslot before
bio_endio() is called on the request's last bio, or make
__blk_crypto_evict_key() ignore slot_refs. Let's go with the first
solution, since it preserves the ability to report bugs (via
WARN_ON_ONCE) where a key is evicted while still in-use.
Fixes: a892c8d52c ("block: Inline encryption support for blk-mq")
Cc: stable@vger.kernel.org
Reviewed-by: Nathan Huckleberry <nhuck@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20230315183907.53675-2-ebiggers@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Once all I/O using a blk_crypto_key has completed, filesystems can call
blk_crypto_evict_key(). However, the block layer currently doesn't call
blk_crypto_put_keyslot() until the request is being freed, which happens
after upper layers have been told (via bio_endio()) the I/O has
completed. This causes a race condition where blk_crypto_evict_key()
can see 'slot_refs != 0' without there being an actual bug.
This makes __blk_crypto_evict_key() hit the
'WARN_ON_ONCE(atomic_read(&slot->slot_refs) != 0)' and return without
doing anything, eventually causing a use-after-free in
blk_crypto_reprogram_all_keys(). (This is a very rare bug and has only
been seen when per-file keys are being used with fscrypt.)
There are two options to fix this: either release the keyslot before
bio_endio() is called on the request's last bio, or make
__blk_crypto_evict_key() ignore slot_refs. Let's go with the first
solution, since it preserves the ability to report bugs (via
WARN_ON_ONCE) where a key is evicted while still in-use.
Fixes: a892c8d52c ("block: Inline encryption support for blk-mq")
Cc: stable@vger.kernel.org
Reviewed-by: Nathan Huckleberry <nhuck@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20230315183907.53675-2-ebiggers@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Bug: 270098322
(cherry picked from commit 9cd1e566676bbcb8a126acd921e4e194e6339603
https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/log/?h=for-next)
Change-Id: Ic2c2426db7693a06901c7893d481471f30de03b2
Signed-off-by: Eric Biggers <ebiggers@google.com>
Changes in 5.10.166
clk: generalize devm_clk_get() a bit
clk: Provide new devm_clk helpers for prepared and enabled clocks
memory: atmel-sdramc: Fix missing clk_disable_unprepare in atmel_ramc_probe()
memory: mvebu-devbus: Fix missing clk_disable_unprepare in mvebu_devbus_probe()
ARM: dts: imx6ul-pico-dwarf: Use 'clock-frequency'
ARM: dts: imx7d-pico: Use 'clock-frequency'
ARM: dts: imx6qdl-gw560x: Remove incorrect 'uart-has-rtscts'
arm64: dts: imx8mm-beacon: Fix ecspi2 pinmux
ARM: imx: add missing of_node_put()
HID: intel_ish-hid: Add check for ishtp_dma_tx_map
EDAC/highbank: Fix memory leak in highbank_mc_probe()
firmware: arm_scmi: Harden shared memory access in fetch_response
firmware: arm_scmi: Harden shared memory access in fetch_notification
tomoyo: fix broken dependency on *.conf.default
RDMA/core: Fix ib block iterator counter overflow
IB/hfi1: Reject a zero-length user expected buffer
IB/hfi1: Reserve user expected TIDs
IB/hfi1: Fix expected receive setup error exit issues
IB/hfi1: Immediately remove invalid memory from hardware
IB/hfi1: Remove user expected buffer invalidate race
affs: initialize fsdata in affs_truncate()
PM: AVS: qcom-cpr: Fix an error handling path in cpr_probe()
phy: ti: fix Kconfig warning and operator precedence
ARM: dts: at91: sam9x60: fix the ddr clock for sam9x60
amd-xgbe: TX Flow Ctrl Registers are h/w ver dependent
amd-xgbe: Delay AN timeout during KR training
bpf: Fix pointer-leak due to insufficient speculative store bypass mitigation
phy: rockchip-inno-usb2: Fix missing clk_disable_unprepare() in rockchip_usb2phy_power_on()
net: nfc: Fix use-after-free in local_cleanup()
net: wan: Add checks for NULL for utdm in undo_uhdlc_init and unmap_si_regs
gpio: mxc: Always set GPIOs used as interrupt source to INPUT mode
wifi: rndis_wlan: Prevent buffer overflow in rndis_query_oid
net/sched: sch_taprio: fix possible use-after-free
l2tp: Serialize access to sk_user_data with sk_callback_lock
l2tp: Don't sleep and disable BH under writer-side sk_callback_lock
l2tp: convert l2tp_tunnel_list to idr
l2tp: close all race conditions in l2tp_tunnel_register()
net: usb: sr9700: Handle negative len
net: mdio: validate parameter addr in mdiobus_get_phy()
HID: check empty report_list in hid_validate_values()
HID: check empty report_list in bigben_probe()
net: stmmac: fix invalid call to mdiobus_get_phy()
HID: revert CHERRY_MOUSE_000C quirk
usb: gadget: f_fs: Prevent race during ffs_ep0_queue_wait
usb: gadget: f_fs: Ensure ep0req is dequeued before free_request
net: mlx5: eliminate anonymous module_init & module_exit
drm/panfrost: fix GENERIC_ATOMIC64 dependency
dmaengine: Fix double increment of client_count in dma_chan_get()
net: macb: fix PTP TX timestamp failure due to packet padding
l2tp: prevent lockdep issue in l2tp_tunnel_register()
HID: betop: check shape of output reports
dmaengine: xilinx_dma: call of_node_put() when breaking out of for_each_child_of_node()
nvme-pci: fix timeout request state check
tcp: avoid the lookup process failing to get sk in ehash table
w1: fix deadloop in __w1_remove_master_device()
w1: fix WARNING after calling w1_process()
driver core: Fix test_async_probe_init saves device in wrong array
net: dsa: microchip: ksz9477: port map correction in ALU table entry register
tcp: fix rate_app_limited to default to 1
scsi: iscsi: Fix multiple iSCSI session unbind events sent to userspace
cpufreq: Add Tegra234 to cpufreq-dt-platdev blocklist
kcsan: test: don't put the expect array on the stack
ASoC: fsl_micfil: Correct the number of steps on SX controls
drm: Add orientation quirk for Lenovo ideapad D330-10IGL
s390/debug: add _ASM_S390_ prefix to header guard
cpufreq: armada-37xx: stop using 0 as NULL pointer
ASoC: fsl_ssi: Rename AC'97 streams to avoid collisions with AC'97 CODEC
ASoC: fsl-asoc-card: Fix naming of AC'97 CODEC widgets
spi: spidev: remove debug messages that access spidev->spi without locking
KVM: s390: interrupt: use READ_ONCE() before cmpxchg()
scsi: hisi_sas: Set a port invalid only if there are no devices attached when refreshing port id
platform/x86: touchscreen_dmi: Add info for the CSL Panther Tab HD
platform/x86: asus-nb-wmi: Add alternate mapping for KEY_SCREENLOCK
lockref: stop doing cpu_relax in the cmpxchg loop
Revert "selftests/bpf: check null propagation only neither reg is PTR_TO_BTF_ID"
netfilter: conntrack: do not renew entry stuck in tcp SYN_SENT state
x86: ACPI: cstate: Optimize C3 entry on AMD CPUs
fs: reiserfs: remove useless new_opts in reiserfs_remount
sysctl: add a new register_sysctl_init() interface
kernel/panic: move panic sysctls to its own file
panic: unset panic_on_warn inside panic()
ubsan: no need to unset panic_on_warn in ubsan_epilogue()
kasan: no need to unset panic_on_warn in end_report()
exit: Add and use make_task_dead.
objtool: Add a missing comma to avoid string concatenation
hexagon: Fix function name in die()
h8300: Fix build errors from do_exit() to make_task_dead() transition
csky: Fix function name in csky_alignment() and die()
ia64: make IA64_MCA_RECOVERY bool instead of tristate
panic: Separate sysctl logic from CONFIG_SMP
exit: Put an upper limit on how often we can oops
exit: Expose "oops_count" to sysfs
exit: Allow oops_limit to be disabled
panic: Consolidate open-coded panic_on_warn checks
panic: Introduce warn_limit
panic: Expose "warn_count" to sysfs
docs: Fix path paste-o for /sys/kernel/warn_count
exit: Use READ_ONCE() for all oops/warn limit reads
Bluetooth: hci_sync: cancel cmd_timer if hci_open failed
xhci: Set HCD flag to defer primary roothub registration
scsi: hpsa: Fix allocation size for scsi_host_alloc()
module: Don't wait for GOING modules
tracing: Make sure trace_printk() can output as soon as it can be used
trace_events_hist: add check for return value of 'create_hist_field'
ftrace/scripts: Update the instructions for ftrace-bisect.sh
cifs: Fix oops due to uncleared server->smbd_conn in reconnect
KVM: x86/vmx: Do not skip segment attributes if unusable bit is set
thermal: intel: int340x: Protect trip temperature from concurrent updates
ARM: 9280/1: mm: fix warning on phys_addr_t to void pointer assignment
EDAC/device: Respect any driver-supplied workqueue polling value
EDAC/qcom: Do not pass llcc_driv_data as edac_device_ctl_info's pvt_info
units: Add Watt units
units: Add SI metric prefix definitions
i2c: designware: Use DIV_ROUND_CLOSEST() macro
i2c: designware: use casting of u64 in clock multiplication to avoid overflow
netlink: prevent potential spectre v1 gadgets
net: fix UaF in netns ops registration error path
netfilter: nft_set_rbtree: Switch to node list walk for overlap detection
netfilter: nft_set_rbtree: skip elements in transaction from garbage collection
netlink: annotate data races around nlk->portid
netlink: annotate data races around dst_portid and dst_group
netlink: annotate data races around sk_state
ipv4: prevent potential spectre v1 gadget in ip_metrics_convert()
ipv4: prevent potential spectre v1 gadget in fib_metrics_match()
netfilter: conntrack: fix vtag checks for ABORT/SHUTDOWN_COMPLETE
netrom: Fix use-after-free of a listening socket.
net/sched: sch_taprio: do not schedule in taprio_reset()
sctp: fail if no bound addresses can be used for a given scope
net: ravb: Fix possible hang if RIS2_QFF1 happen
thermal: intel: int340x: Add locking to int340x_thermal_get_trip_type()
net/tg3: resolve deadlock in tg3_reset_task() during EEH
net: mdio-mux-meson-g12a: force internal PHY off on mux switch
tools: gpio: fix -c option of gpio-event-mon
Revert "Input: synaptics - switch touchpad on HP Laptop 15-da3001TU to RMI mode"
nouveau: explicitly wait on the fence in nouveau_bo_move_m2mf
nfsd: Ensure knfsd shuts down when the "nfsd" pseudofs is unmounted
Revert "selftests/ftrace: Update synthetic event syntax errors"
block: fix and cleanup bio_check_ro
x86/i8259: Mark legacy PIC interrupts with IRQ_LEVEL
netfilter: conntrack: unify established states for SCTP paths
perf/x86/amd: fix potential integer overflow on shift of a int
clk: Fix pointer casting to prevent oops in devm_clk_release()
Linux 5.10.166
Change-Id: Ibf582f7504221c6ee1648da95c49b45e3678708c
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
commit 57e95e4670d1126c103305bcf34a9442f49f6d6a upstream.
Don't use a WARN_ON when printing a potentially user triggered
condition. Also don't print the partno when the block device name
already includes it, and use the %pg specifier to simplify printing
the block device name.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20220304180105.409765-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmGgq10ACgkQONu9yGCS
aT6KDA/+N9ysKF4cH2zdUMhDAkjCKB3YqTEsJxSfGBkJu2wuncAEEtrKy9jxC+lv
fz6BE1tduit/IQIhGTCXJlfe9NIxwU87f2v5JlHnYeXg4dz72c+Ei236l7ZvkSNE
ii8/ikHGvbbhKv+BTgcRg7jVUMMy6eEpS6iJwMNLB/sHROjZXPogFoiYjbO+Jzc5
0jTciMZv6r4yNhrdHBjhWHe6ZhB94H//Jy8MVYk37NGc5EbJmrMN83GM5ceSmhOZ
PgxxyrVTv+SdGm0XViyK+94HXWGQHLXQF+Nsu3YEZfnNI+HNSPQKTqBLPM1hV7Ak
h+IYW6VHPmcBmQzEdSA67uMKJayKtEwpkqO6aLRcj/NIThRiZoznbrtZOoGSXaU1
0MzQRPum76GA5/SVGgtB8FrE6kcFm74eq82mXvUD+rgCp0HTbIpYQK9ZKSmbFOkv
fYjcpWHZ8PEmffMbtIlVKSffVxcUILoNuQwnr21NGiRUrd54DhNPgVahmCKnvUTb
847bGU/wQJPIF/2SO1rdpaA9MrPqZ/9sMEX3nSdx7xS8D+h2wfJqAJkLq0KBYt7R
sbsXbfqbri893VHBo2YUqby3+7x3uNr118SjyiA8zpHHJpTBrVVImxSnW1z626HT
KNJU4MSulLs+settJKAw1PHGRIGuW5TGSGF94p5LcsZDM2uIabU=
=Wg4d
-----END PGP SIGNATURE-----
Merge 5.10.82 into android12-5.10-lts
Changes in 5.10.82
arm64: zynqmp: Do not duplicate flash partition label property
arm64: zynqmp: Fix serial compatible string
ARM: dts: sunxi: Fix OPPs node name
arm64: dts: allwinner: h5: Fix GPU thermal zone node name
arm64: dts: allwinner: a100: Fix thermal zone node name
staging: wfx: ensure IRQ is ready before enabling it
ARM: dts: NSP: Fix mpcore, mmc node names
scsi: lpfc: Fix list_add() corruption in lpfc_drain_txq()
arm64: dts: rockchip: Disable CDN DP on Pinebook Pro
arm64: dts: hisilicon: fix arm,sp805 compatible string
RDMA/bnxt_re: Check if the vlan is valid before reporting
bus: ti-sysc: Add quirk handling for reinit on context lost
bus: ti-sysc: Use context lost quirk for otg
usb: musb: tusb6010: check return value after calling platform_get_resource()
usb: typec: tipd: Remove WARN_ON in tps6598x_block_read
ARM: dts: ux500: Skomer regulator fixes
staging: rtl8723bs: remove possible deadlock when disconnect (v2)
ARM: BCM53016: Specify switch ports for Meraki MR32
arm64: dts: qcom: msm8998: Fix CPU/L2 idle state latency and residency
arm64: dts: qcom: ipq6018: Fix qcom,controlled-remotely property
arm64: dts: freescale: fix arm,sp805 compatible string
ASoC: SOF: Intel: hda-dai: fix potential locking issue
clk: imx: imx6ul: Move csi_sel mux to correct base register
ASoC: nau8824: Add DMI quirk mechanism for active-high jack-detect
scsi: advansys: Fix kernel pointer leak
ALSA: intel-dsp-config: add quirk for APL/GLK/TGL devices based on ES8336 codec
firmware_loader: fix pre-allocated buf built-in firmware use
ARM: dts: omap: fix gpmc,mux-add-data type
usb: host: ohci-tmio: check return value after calling platform_get_resource()
ARM: dts: ls1021a: move thermal-zones node out of soc/
ARM: dts: ls1021a-tsn: use generic "jedec,spi-nor" compatible for flash
ALSA: ISA: not for M68K
tty: tty_buffer: Fix the softlockup issue in flush_to_ldisc
MIPS: sni: Fix the build
scsi: scsi_debug: Fix out-of-bound read in resp_readcap16()
scsi: scsi_debug: Fix out-of-bound read in resp_report_tgtpgs()
scsi: target: Fix ordered tag handling
scsi: target: Fix alua_tg_pt_gps_count tracking
iio: imu: st_lsm6dsx: Avoid potential array overflow in st_lsm6dsx_set_odr()
powerpc/5200: dts: fix memory node unit name
ARM: dts: qcom: fix memory and mdio nodes naming for RB3011
ALSA: gus: fix null pointer dereference on pointer block
powerpc/dcr: Use cmplwi instead of 3-argument cmpli
powerpc/8xx: Fix Oops with STRICT_KERNEL_RWX without DEBUG_RODATA_TEST
sh: check return code of request_irq
maple: fix wrong return value of maple_bus_init().
f2fs: fix up f2fs_lookup tracepoints
f2fs: fix to use WHINT_MODE
sh: fix kconfig unmet dependency warning for FRAME_POINTER
sh: math-emu: drop unused functions
sh: define __BIG_ENDIAN for math-emu
f2fs: compress: disallow disabling compress on non-empty compressed file
f2fs: fix incorrect return value in f2fs_sanity_check_ckpt()
clk: ingenic: Fix bugs with divided dividers
clk/ast2600: Fix soc revision for AHB
clk: qcom: gcc-msm8996: Drop (again) gcc_aggre1_pnoc_ahb_clk
mips: BCM63XX: ensure that CPU_SUPPORTS_32BIT_KERNEL is set
sched/core: Mitigate race cpus_share_cache()/update_top_cache_domain()
perf/x86/vlbr: Add c->flags to vlbr event constraints
blkcg: Remove extra blkcg_bio_issue_init
tracing/histogram: Do not copy the fixed-size char array field over the field size
perf bpf: Avoid memory leak from perf_env__insert_btf()
perf bench futex: Fix memory leak of perf_cpu_map__new()
perf tests: Remove bash construct from record+zstd_comp_decomp.sh
drm/nouveau: hdmigv100.c: fix corrupted HDMI Vendor InfoFrame
net-zerocopy: Copy straggler unaligned data for TCP Rx. zerocopy.
net-zerocopy: Refactor skb frag fast-forward op.
tcp: Fix uninitialized access in skb frags array for Rx 0cp.
tracing: Add length protection to histogram string copies
net: ipa: disable HOLB drop when updating timer
net: bnx2x: fix variable dereferenced before check
bnxt_en: reject indirect blk offload when hw-tc-offload is off
tipc: only accept encrypted MSG_CRYPTO msgs
net: reduce indentation level in sk_clone_lock()
sock: fix /proc/net/sockstat underflow in sk_clone_lock()
net/smc: Make sure the link_id is unique
iavf: Fix return of set the new channel count
iavf: check for null in iavf_fix_features
iavf: free q_vectors before queues in iavf_disable_vf
iavf: Fix failure to exit out from last all-multicast mode
iavf: prevent accidental free of filter structure
iavf: validate pointers
iavf: Fix for the false positive ASQ/ARQ errors while issuing VF reset
iavf: Fix for setting queues to 0
MIPS: generic/yamon-dt: fix uninitialized variable error
mips: bcm63xx: add support for clk_get_parent()
mips: lantiq: add support for clk_get_parent()
platform/x86: hp_accel: Fix an error handling path in 'lis3lv02d_probe()'
net/mlx5e: nullify cq->dbg pointer in mlx5_debug_cq_remove()
net/mlx5: Lag, update tracker when state change event received
net/mlx5: E-Switch, Change mode lock from mutex to rw semaphore
net/mlx5: E-Switch, return error if encap isn't supported
scsi: core: sysfs: Fix hang when device state is set via sysfs
net: sched: act_mirred: drop dst for the direction from egress to ingress
net: dpaa2-eth: fix use-after-free in dpaa2_eth_remove
net: virtio_net_hdr_to_skb: count transport header in UFO
i40e: Fix correct max_pkt_size on VF RX queue
i40e: Fix NULL ptr dereference on VSI filter sync
i40e: Fix changing previously set num_queue_pairs for PFs
i40e: Fix ping is lost after configuring ADq on VF
i40e: Fix warning message and call stack during rmmod i40e driver
i40e: Fix creation of first queue by omitting it if is not power of two
i40e: Fix display error code in dmesg
NFC: reorganize the functions in nci_request
NFC: reorder the logic in nfc_{un,}register_device
net: nfc: nci: Change the NCI close sequence
NFC: add NCI_UNREG flag to eliminate the race
e100: fix device suspend/resume
KVM: PPC: Book3S HV: Use GLOBAL_TOC for kvmppc_h_set_dabr/xdabr()
pinctrl: qcom: sdm845: Enable dual edge errata
perf/x86/intel/uncore: Fix filter_tid mask for CHA events on Skylake Server
perf/x86/intel/uncore: Fix IIO event constraints for Skylake Server
s390/kexec: fix return code handling
net: stmmac: dwmac-rk: Fix ethernet on rk3399 based devices
arm64: vdso32: suppress error message for 'make mrproper'
tun: fix bonding active backup with arp monitoring
hexagon: export raw I/O routines for modules
hexagon: clean up timer-regs.h
tipc: check for null after calling kmemdup
ipc: WARN if trying to remove ipc object which is absent
mm: kmemleak: slob: respect SLAB_NOLEAKTRACE flag
x86/hyperv: Fix NULL deref in set_hv_tscchange_cb() if Hyper-V setup fails
powerpc/8xx: Fix pinned TLBs with CONFIG_STRICT_KERNEL_RWX
scsi: qla2xxx: Fix mailbox direction flags in qla2xxx_get_adapter_id()
s390/kexec: fix memory leak of ipl report buffer
block: Check ADMIN before NICE for IOPRIO_CLASS_RT
KVM: nVMX: don't use vcpu->arch.efer when checking host state on nested state load
udf: Fix crash after seekdir
net: stmmac: socfpga: add runtime suspend/resume callback for stratix10 platform
btrfs: fix memory ordering between normal and ordered work functions
parisc/sticon: fix reverse colors
cfg80211: call cfg80211_stop_ap when switch from P2P_GO type
drm/amd/display: Update swizzle mode enums
drm/udl: fix control-message timeout
drm/nouveau: Add a dedicated mutex for the clients list
drm/nouveau: use drm_dev_unplug() during device removal
drm/nouveau: clean up all clients on device removal
drm/i915/dp: Ensure sink rate values are always valid
drm/amdgpu: fix set scaling mode Full/Full aspect/Center not works on vga and dvi connectors
scsi: ufs: core: Fix task management completion
scsi: ufs: core: Fix task management completion timeout race
hugetlbfs: flush TLBs correctly after huge_pmd_unshare
RDMA/netlink: Add __maybe_unused to static inline in C file
selinux: fix NULL-pointer dereference when hashtab allocation fails
ASoC: DAPM: Cover regression by kctl change notification fix
usb: max-3421: Use driver data instead of maintaining a list of bound devices
ice: Delete always true check of PF pointer
fs: export an inode_update_time helper
btrfs: update device path inode time instead of bd_inode
x86/Kconfig: Fix an unused variable error in dell-smm-hwmon
ALSA: hda: hdac_ext_stream: fix potential locking issues
ALSA: hda: hdac_stream: fix potential locking issue in snd_hdac_stream_assign()
Revert "perf: Rework perf_event_exit_event()"
Linux 5.10.82
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I56e067875dafc27c2e86fc3b8c47abb3296c6a18
[ Upstream commit b781d8db580c058ecd54ed7d5dde7f8270b25f5b ]
KASAN reports a use-after-free report when doing block test:
==================================================================
[10050.967049] BUG: KASAN: use-after-free in
submit_bio_checks+0x1539/0x1550
[10050.977638] Call Trace:
[10050.978190] dump_stack+0x9b/0xce
[10050.979674] print_address_description.constprop.6+0x3e/0x60
[10050.983510] kasan_report.cold.9+0x22/0x3a
[10050.986089] submit_bio_checks+0x1539/0x1550
[10050.989576] submit_bio_noacct+0x83/0xc80
[10050.993714] submit_bio+0xa7/0x330
[10050.994435] mpage_readahead+0x380/0x500
[10050.998009] read_pages+0x1c1/0xbf0
[10051.002057] page_cache_ra_unbounded+0x4c2/0x6f0
[10051.007413] do_page_cache_ra+0xda/0x110
[10051.008207] force_page_cache_ra+0x23d/0x3d0
[10051.009087] page_cache_sync_ra+0xca/0x300
[10051.009970] generic_file_buffered_read+0xbea/0x2130
[10051.012685] generic_file_read_iter+0x315/0x490
[10051.014472] blkdev_read_iter+0x113/0x1b0
[10051.015300] aio_read+0x2ad/0x450
[10051.023786] io_submit_one+0xc8e/0x1d60
[10051.029855] __se_sys_io_submit+0x125/0x350
[10051.033442] do_syscall_64+0x2d/0x40
[10051.034156] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[10051.048733] Allocated by task 18598:
[10051.049482] kasan_save_stack+0x19/0x40
[10051.050263] __kasan_kmalloc.constprop.1+0xc1/0xd0
[10051.051230] kmem_cache_alloc+0x146/0x440
[10051.052060] mempool_alloc+0x125/0x2f0
[10051.052818] bio_alloc_bioset+0x353/0x590
[10051.053658] mpage_alloc+0x3b/0x240
[10051.054382] do_mpage_readpage+0xddf/0x1ef0
[10051.055250] mpage_readahead+0x264/0x500
[10051.056060] read_pages+0x1c1/0xbf0
[10051.056758] page_cache_ra_unbounded+0x4c2/0x6f0
[10051.057702] do_page_cache_ra+0xda/0x110
[10051.058511] force_page_cache_ra+0x23d/0x3d0
[10051.059373] page_cache_sync_ra+0xca/0x300
[10051.060198] generic_file_buffered_read+0xbea/0x2130
[10051.061195] generic_file_read_iter+0x315/0x490
[10051.062189] blkdev_read_iter+0x113/0x1b0
[10051.063015] aio_read+0x2ad/0x450
[10051.063686] io_submit_one+0xc8e/0x1d60
[10051.064467] __se_sys_io_submit+0x125/0x350
[10051.065318] do_syscall_64+0x2d/0x40
[10051.066082] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[10051.067455] Freed by task 13307:
[10051.068136] kasan_save_stack+0x19/0x40
[10051.068931] kasan_set_track+0x1c/0x30
[10051.069726] kasan_set_free_info+0x1b/0x30
[10051.070621] __kasan_slab_free+0x111/0x160
[10051.071480] kmem_cache_free+0x94/0x460
[10051.072256] mempool_free+0xd6/0x320
[10051.072985] bio_free+0xe0/0x130
[10051.073630] bio_put+0xab/0xe0
[10051.074252] bio_endio+0x3a6/0x5d0
[10051.074984] blk_update_request+0x590/0x1370
[10051.075870] scsi_end_request+0x7d/0x400
[10051.076667] scsi_io_completion+0x1aa/0xe50
[10051.077503] scsi_softirq_done+0x11b/0x240
[10051.078344] blk_mq_complete_request+0xd4/0x120
[10051.079275] scsi_mq_done+0xf0/0x200
[10051.080036] virtscsi_vq_done+0xbc/0x150
[10051.080850] vring_interrupt+0x179/0x390
[10051.081650] __handle_irq_event_percpu+0xf7/0x490
[10051.082626] handle_irq_event_percpu+0x7b/0x160
[10051.083527] handle_irq_event+0xcc/0x170
[10051.084297] handle_edge_irq+0x215/0xb20
[10051.085122] asm_call_irq_on_stack+0xf/0x20
[10051.085986] common_interrupt+0xae/0x120
[10051.086830] asm_common_interrupt+0x1e/0x40
==================================================================
Bio will be checked at beginning of submit_bio_noacct(). If bio needs
to be throttled, it will start the timer and stop submit bio directly.
Bio will submit in blk_throtl_dispatch_work_fn() when the timer expires.
But in the current process, if bio is throttled, it will still set bio
issue->value by blkcg_bio_issue_init(). This is redundant and may cause
the above use-after-free.
CPU0 CPU1
submit_bio
submit_bio_noacct
submit_bio_checks
blk_throtl_bio()
<=mod_timer(&sq->pending_timer
blk_throtl_dispatch_work_fn
submit_bio_noacct() <= bio have
throttle tag, will throw directly
and bio issue->value will be set
here
bio_endio()
bio_put()
bio_free() <= free this bio
blkcg_bio_issue_init(bio)
<= bio has been freed and
will lead to UAF
return BLK_QC_T_NONE
Fix this by remove extra blkcg_bio_issue_init.
Fixes: e439bedf6b (blkcg: consolidate bio_issue_init() to be a part of core)
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Link: https://lore.kernel.org/r/20211112093354.3581504-1-qiulaibin@huawei.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
This patch will export some tracepoints so that vendor modules can use them.
Export these tracepoint functions to track IO data flow for performance tuning.
Bug: 205648026
Change-Id: Ia37b8f99b2d940cecce46c8bc24f724c14450517
Signed-off-by: Yang Yang <yang.yang@vivo.com>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmE9pRsACgkQONu9yGCS
aT4GqRAAzjCqjeEQ+JgWzP5N9YXXqzQI6s6PNqyEZpQiQF3tak12WZh02/zYhyZf
nLiJgB/RitgPcbvSeRLSV9gYbAhL0HGI0DZ7wA821esA3+ywuS/21D2g49YuU0Cy
mpTh4w1wm/dES4sZchhIMN+fzx81T/xHyiHkuA+wAU+o/bn4Qf+eCevKe2+8D2UC
vr5Oc0k+RvAF16pFRd+MKSR0JNNPEpS962KnRFYUk47hjMk/aH4FhQCSP+nqXcJW
3WuI+FR/UTltGMDe4/WIwlxQxsf4wMOMwEwRi0v5oyhwC4oCpZD3FlcNPJ81J6MS
GkNukV/mvIKZ/IVD2F3YY34nHLE2RMkGKMbYSc9ONYWPEgGFnxINnXQZonPkazEN
IA7GSPjxS4ToxETmDVVErIGQx/iCYnp9FJk+wqgrJ3z/e/qKZpt8C7RUDGJFzKjC
WeRDFHtvdZT1WcmiUfrXz8qcbf2e1dPorYJ5tRi7hztrEx4XzwARSGGeW7H5rXJT
50Q1oGL5IHj5lBe4BMbI59v1jQxzt0Tyyenv0xq/Jfrnmk0IrmUo0DiGZGO9Zxko
as+HR3NNGl27uvlmz+bZ+ztL1DoXvJpkaGoCXTwBimg+/TAT9cSA+bFhI9+lvigx
Ez6vInwcWkL6Tk5DlCeR7H0dBlKFGxktWHN5+qs0gSZoXPXZfK8=
=AflS
-----END PGP SIGNATURE-----
Merge 5.10.64 into android12-5.10-lts
Changes in 5.10.64
igmp: Add ip_mc_list lock in ip_check_mc_rcu
USB: serial: mos7720: improve OOM-handling in read_mos_reg()
net: ll_temac: Remove left-over debug message
mm/page_alloc: speed up the iteration of max_order
net: kcov: don't select SKB_EXTENSIONS when there is no NET
serial: 8250: 8250_omap: Fix unused variable warning
net: linux/skbuff.h: combine SKB_EXTENSIONS + KCOV handling
tty: drop termiox user definitions
Revert "r8169: avoid link-up interrupt issue on RTL8106e if user enables ASPM"
x86/events/amd/iommu: Fix invalid Perf result due to IOMMU PMC power-gating
blk-mq: fix kernel panic during iterating over flush request
blk-mq: fix is_flush_rq
netfilter: nftables: avoid potential overflows on 32bit arches
netfilter: nf_tables: initialize set before expression setup
netfilter: nftables: clone set element expression template
blk-mq: clearing flush request reference in tags->rqs[]
ALSA: usb-audio: Add registration quirk for JBL Quantum 800
usb: host: xhci-rcar: Don't reload firmware after the completion
usb: gadget: tegra-xudc: fix the wrong mult value for HS isoc or intr
usb: mtu3: restore HS function when set SS/SSP
usb: mtu3: use @mult for HS isoc or intr
usb: mtu3: fix the wrong HS mult value
xhci: fix even more unsafe memory usage in xhci tracing
xhci: fix unsafe memory usage in xhci tracing
x86/reboot: Limit Dell Optiplex 990 quirk to early BIOS versions
PCI: Call Max Payload Size-related fixup quirks early
Linux 5.10.64
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I2269075a6d5eb6121b6e42a28d4f3fd0c252695c
commit c2da19ed50554ce52ecbad3655c98371fe58599f upstream.
For fixing use-after-free during iterating over requests, we grabbed
request's refcount before calling ->fn in commit 2e315dc07df0 ("blk-mq:
grab rq->refcount before calling ->fn in blk_mq_tagset_busy_iter").
Turns out this way may cause kernel panic when iterating over one flush
request:
1) old flush request's tag is just released, and this tag is reused by
one new request, but ->rqs[] isn't updated yet
2) the flush request can be re-used for submitting one new flush command,
so blk_rq_init() is called at the same time
3) meantime blk_mq_queue_tag_busy_iter() is called, and old flush request
is retrieved from ->rqs[tag]; when blk_mq_put_rq_ref() is called,
flush_rq->end_io may not be updated yet, so NULL pointer dereference
is triggered in blk_mq_put_rq_ref().
Fix the issue by calling refcount_set(&flush_rq->ref, 1) after
flush_rq->end_io is set. So far the only other caller of blk_rq_init() is
scsi_ioctl_reset() in which the request doesn't enter block IO stack and
the request reference count isn't used, so the change is safe.
Fixes: 2e315dc07df0 ("blk-mq: grab rq->refcount before calling ->fn in blk_mq_tagset_busy_iter")
Reported-by: "Blank-Burian, Markus, Dr." <blankburian@uni-muenster.de>
Tested-by: "Blank-Burian, Markus, Dr." <blankburian@uni-muenster.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/20210811142624.618598-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Cc: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit "block: Do not accept any requests while suspended" broke the UFS
driver. In the upstream kernel this has been fixed by commit b294ff3e3449
("scsi: ufs: core: Enable power management for wlun"). Backporting that
commit or backporting the entire v5.14-rc1 UFS driver is too risky.
Hence revert the block layer patch that is incompatible with the v5.10
UFS driver power management code.
This reverts commit d55d15a332.
Bug: 193181075
Change-Id: Ic50d4e1df98d7ed393bf9797787225ae22e5d7a3
Signed-off-by: Bart Van Assche <bvanassche@google.com>
[ Upstream commit 52abca64fd9410ea6c9a3a74eab25663b403d7da ]
blk_queue_enter() accepts BLK_MQ_REQ_PM requests independent of the runtime
power management state. Now that SCSI domain validation no longer depends
on this behavior, modify the behavior of blk_queue_enter() as follows:
- Do not accept any requests while suspended.
- Only process power management requests while suspending or resuming.
Submitting BLK_MQ_REQ_PM requests to a device that is runtime suspended
causes runtime-suspended devices not to resume as they should. The request
which should cause a runtime resume instead gets issued directly, without
resuming the device first. Of course the device can't handle it properly,
the I/O fails, and the device remains suspended.
The problem is fixed by checking that the queue's runtime-PM status isn't
RPM_SUSPENDED before allowing a request to be issued, and queuing a
runtime-resume request if it is. In particular, the inline
blk_pm_request_resume() routine is renamed blk_pm_resume_queue() and the
code is unified by merging the surrounding checks into the routine. If the
queue isn't set up for runtime PM, or there currently is no restriction on
allowed requests, the request is allowed. Likewise if the BLK_MQ_REQ_PM
flag is set and the status isn't RPM_SUSPENDED. Otherwise a runtime resume
is queued and the request is blocked until conditions are more suitable.
[ bvanassche: modified commit message and removed Cc: stable because
without the previous patches from this series this patch would break
parallel SCSI domain validation + introduced queue_rpm_status() ]
Link: https://lore.kernel.org/r/20201209052951.16136-9-bvanassche@acm.org
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Can Guo <cang@codeaurora.org>
Cc: Stanley Chu <stanley.chu@mediatek.com>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reported-and-tested-by: Martin Kepplinger <martin.kepplinger@puri.sm>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a4d34da715e3cb7e0741fe603dcd511bed067e00 ]
Remove flag RQF_PREEMPT and BLK_MQ_REQ_PREEMPT since these are no longer
used by any kernel code.
Link: https://lore.kernel.org/r/20201209052951.16136-8-bvanassche@acm.org
Cc: Can Guo <cang@codeaurora.org>
Cc: Stanley Chu <stanley.chu@mediatek.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Martin Kepplinger <martin.kepplinger@puri.sm>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0854bcdcdec26aecdc92c303816f349ee1fba2bc ]
Introduce the BLK_MQ_REQ_PM flag. This flag makes the request allocation
functions set RQF_PM. This is the first step towards removing
BLK_MQ_REQ_PREEMPT.
Link: https://lore.kernel.org/r/20201209052951.16136-3-bvanassche@acm.org
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Stanley Chu <stanley.chu@mediatek.com>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Can Guo <cang@codeaurora.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
A zoned device with limited resources to open or activate zones may
return an error when the host exceeds those limits. The same command may
be successful if retried later, but the host needs to wait for specific
zone states before it should expect a retry to succeed. Have the block
layer provide an appropriate status for these conditions so applications
can distinuguish this error for special handling.
Cc: linux-api@vger.kernel.org
Cc: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
syzbot is reporting unkillable task [1], for the caller is failing to
handle a corrupted filesystem image which attempts to access beyond
the end of the device. While we need to fix the caller, flooding the
console with handle_bad_sector() message is unlikely useful.
[1] https://syzkaller.appspot.com/bug?id=f1f49fb971d7a3e01bd8ab8cff2ff4572ccf3092
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
blk_crypto_rq_bio_prep() assumes its gfp_mask argument always includes
__GFP_DIRECT_RECLAIM, so that the mempool_alloc() will always succeed.
However, blk_crypto_rq_bio_prep() might be called with GFP_ATOMIC via
setup_clone() in drivers/md/dm-rq.c.
This case isn't currently reachable with a bio that actually has an
encryption context. However, it's fragile to rely on this. Just make
blk_crypto_rq_bio_prep() able to fail.
Suggested-by: Satya Tangirala <satyat@google.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: Satya Tangirala <satyat@google.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Add QUEUE_FLAG_NOWAIT to allow a block device to advertise support for
REQ_NOWAIT. Bio-based devices may set QUEUE_FLAG_NOWAIT where
applicable.
Update QUEUE_FLAG_MQ_DEFAULT to include QUEUE_FLAG_NOWAIT. Also
update submit_bio_checks() to verify it is set for REQ_NOWAIT bios.
Reported-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Just checking SB_I_CGROUPWB for cgroup writeback support is enough.
Either the file system allocates its own bdi (e.g. btrfs), in which case
it is known to support cgroup writeback, or the bdi comes from the block
layer, which always supports cgroup writeback.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Set up a readahead size by default, as very few users have a good
reason to change it. This means code, ecryptfs, and orangefs now
set up the values while they were previously missing it, while ubifs,
mtd and vboxsf manually set it to 0 to avoid readahead.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Acked-by: David Sterba <dsterba@suse.com> [btrfs]
Acked-by: Richard Weinberger <richard@nod.at> [ubifs, mtd]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
These functions can be used to enable iostat for partitions on devices
like md, bcache.
Signed-off-by: Song Liu <songliubraving@fb.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The per-hctx nr_active value can no longer be used to fairly assign a share
of tag depth per request queue for when using a shared sbitmap, as it does
not consider that the tags are shared tags over all hctx's.
For this case, record the nr_active_requests per request_queue, and make
the judgement based on that value.
Co-developed-with: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Tested-by: Don Brace<don.brace@microsemi.com> #SCSI resv cmds patches used
Tested-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
If WRITE_ZERO/WRITE_SAME operation is not supported by the storage,
blk_cloned_rq_check_limits() will return IO error which will cause
device-mapper to fail the paths.
Instead, if the queue limit is set to 0, return BLK_STS_NOTSUPP.
BLK_STS_NOTSUPP will be ignored by device-mapper and will not fail the
paths.
Suggested-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Ritika Srivastava <ritika.srivastava@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
It's better to move bio merge related functions into blk-merge.c,
which contains all merge related functions.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
If a driver leaves the limit settings as the defaults, then we don't
initialize bdi->io_pages. This means that file systems may need to
work around bdi->io_pages == 0, which is somewhat messy.
Initialize the default value just like we do for ->ra_pages.
Cc: stable@vger.kernel.org
Fixes: 9491ae4aad ("mm: don't cap request size based on read-ahead setting")
Reported-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl8m7asQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgplrCD/0S17kio+k4cOJDGwl88WoJw+QiYmM5019k
decZ1JymQvV1HXRmlcZiEAu0hHDD0FoovSRrw7II3gw3GouETmYQM62f6ZTpDeMD
CED/fidnfULAkPaI6h+bj3jyI0cEuujG/R47rGSQEkIIr3RttqKZUzVkB9KN+KMw
+OBuXZtMIoFFEVJ91qwC2dm2qHLqOn1/5MlT59knso/xbPOYOXsFQpGiACJqF97x
6qSSI8uGE+HZqvL2OLWPDBbLEJhrq+dzCgxln5VlvLele4UcRhOdonUb7nUwEKCe
zwvtXzz16u1D1b8bJL4Kg5bGqyUAQUCSShsfBJJxh6vTTULiHyCX5sQaai1OEB16
4dpBL9E+nOUUix4wo9XBY0/KIYaPWg5L1CoEwkAXqkXPhFvNUucsC0u6KvmzZR3V
1OogVTjl6GhS8uEVQjTKNshkTIC9QHEMXDUOHtINDCb/sLU+ANXU5UpvsuzZ9+kt
KGc4mdyCwaKBq4YW9sVwhhq/RHLD4AUtWZiUVfOE+0cltCLJUNMbQsJ+XrcYaQnm
W4zz22Rep+SJuQNVcCW/w7N2zN3yB6gC1qeroSLvzw4b5el2TdFp+BcgVlLHK+uh
xjsGNCq++fyzNk7vvMZ5hVq4JGXYjza7AiP5HlQ8nqdiPUKUPatWCBqUm9i9Cz/B
n+0dlYbRwQ==
=2vmy
-----END PGP SIGNATURE-----
Merge tag 'for-5.9/io_uring-20200802' of git://git.kernel.dk/linux-block
Pull io_uring updates from Jens Axboe:
"Lots of cleanups in here, hardening the code and/or making it easier
to read and fixing bugs, but a core feature/change too adding support
for real async buffered reads. With the latter in place, we just need
buffered write async support and we're done relying on kthreads for
the fast path. In detail:
- Cleanup how memory accounting is done on ring setup/free (Bijan)
- sq array offset calculation fixup (Dmitry)
- Consistently handle blocking off O_DIRECT submission path (me)
- Support proper async buffered reads, instead of relying on kthread
offload for that. This uses the page waitqueue to drive retries
from task_work, like we handle poll based retry. (me)
- IO completion optimizations (me)
- Fix race with accounting and ring fd install (me)
- Support EPOLLEXCLUSIVE (Jiufei)
- Get rid of the io_kiocb unionizing, made possible by shrinking
other bits (Pavel)
- Completion side cleanups (Pavel)
- Cleanup REQ_F_ flags handling, and kill off many of them (Pavel)
- Request environment grabbing cleanups (Pavel)
- File and socket read/write cleanups (Pavel)
- Improve kiocb_set_rw_flags() (Pavel)
- Tons of fixes and cleanups (Pavel)
- IORING_SQ_NEED_WAKEUP clear fix (Xiaoguang)"
* tag 'for-5.9/io_uring-20200802' of git://git.kernel.dk/linux-block: (127 commits)
io_uring: flip if handling after io_setup_async_rw
fs: optimise kiocb_set_rw_flags()
io_uring: don't touch 'ctx' after installing file descriptor
io_uring: get rid of atomic FAA for cq_timeouts
io_uring: consolidate *_check_overflow accounting
io_uring: fix stalled deferred requests
io_uring: fix racy overflow count reporting
io_uring: deduplicate __io_complete_rw()
io_uring: de-unionise io_kiocb
io-wq: update hash bits
io_uring: fix missing io_queue_linked_timeout()
io_uring: mark ->work uninitialised after cleanup
io_uring: deduplicate io_grab_files() calls
io_uring: don't do opcode prep twice
io_uring: clear IORING_SQ_NEED_WAKEUP after executing task works
io_uring: batch put_task_struct()
tasks: add put_task_struct_many()
io_uring: return locked and pinned page accounting
io_uring: don't miscount pinned memory
io_uring: don't open-code recv kbuf managment
...
If blk_mq_submit_bio flushes the plug list, bios for other disks can
show up on current->bio_list. As that doesn't involve any stacking of
block device it is entirely harmless and we should not warn about
this case.
Fixes: ff93ea0ce7 ("block: shortcut __submit_bio_noacct for blk-mq drivers")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
bio_alloc_bioset references current->bio_list[1], so we need to
initialize it for the blk-mq submission path as well.
Fixes: ff93ea0ce7 ("block: shortcut __submit_bio_noacct for blk-mq drivers")
Reported-by: Qian Cai <cai@lca.pw>
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Now that submit_bio_noacct has a decent blk-mq fast path there is no
more need for this bypass.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
For blk-mq drivers bios can only be inserted for the same queue. So
bypass the complicated sorting logic in __submit_bio_noacct with
a blk-mq simpler submission helper.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Split out a __submit_bio_noacct helper for the actual de-recursion
algorithm, and simplify the loop by using a continue when we can't
enter the queue for a bio.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
generic_make_request has always been very confusingly misnamed, so rename
it to submit_bio_noacct to make it clear that it is submit_bio minus
accounting and a few checks.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The make_request_fn is a little weird in that it sits directly in
struct request_queue instead of an operation vector. Replace it with
a block_device_operations method called submit_bio (which describes much
better what it does). Also remove the request_queue argument to it, as
the queue can be derived pretty trivially from the bio.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The variable is only used once, so just open code the bio_sector()
there.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
All registers disks must have a valid queue pointer, so don't bother to
log a warning for that case.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The "generic_make_request: " prefix has no value, and will soon become
stale.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
blkcg_bio_issue_check is a giant inline function that does three entirely
different things. Factor out the blk-cgroup related bio initalization
into a new helper, and the open code the sequence in the only caller,
relying on the fact that all the actual functionality is stubbed out for
non-cgroup builds.
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We were only creating the request_queue debugfs_dir only
for make_request block drivers (multiqueue), but never for
request-based block drivers. We did this as we were only
creating non-blktrace additional debugfs files on that directory
for make_request drivers. However, since blktrace *always* creates
that directory anyway, we special-case the use of that directory
on blktrace. Other than this being an eye-sore, this exposes
request-based block drivers to the same debugfs fragile
race that used to exist with make_request block drivers
where if we start adding files onto that directory we can later
run a race with a double removal of dentries on the directory
if we don't deal with this carefully on blktrace.
Instead, just simplify things by always creating the request_queue
debugfs_dir on request_queue registration. Rename the mutex also to
reflect the fact that this is used outside of the blktrace context.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Commit dc9edc44de ("block: Fix a blk_exit_rl() regression") merged on
v4.12 moved the work behind blk_release_queue() into a workqueue after a
splat floated around which indicated some work on blk_release_queue()
could sleep in blk_exit_rl(). This splat would be possible when a driver
called blk_put_queue() or blk_cleanup_queue() (which calls blk_put_queue()
as its final call) from an atomic context.
blk_put_queue() decrements the refcount for the request_queue kobject, and
upon reaching 0 blk_release_queue() is called. Although blk_exit_rl() is
now removed through commit db6d995235 ("block: remove request_list code")
on v5.0, we reserve the right to be able to sleep within
blk_release_queue() context.
The last reference for the request_queue must not be called from atomic
context. *When* the last reference to the request_queue reaches 0 varies,
and so let's take the opportunity to document when that is expected to
happen and also document the context of the related calls as best as
possible so we can avoid future issues, and with the hopes that the
synchronous request_queue removal sticks.
We revert back to synchronous request_queue removal because asynchronous
removal creates a regression with expected userspace interaction with
several drivers. An example is when removing the loopback driver, one
uses ioctls from userspace to do so, but upon return and if successful,
one expects the device to be removed. Likewise if one races to add another
device the new one may not be added as it is still being removed. This was
expected behavior before and it now fails as the device is still present
and busy still. Moving to asynchronous request_queue removal could have
broken many scripts which relied on the removal to have been completed if
there was no error. Document this expectation as well so that this
doesn't regress userspace again.
Using asynchronous request_queue removal however has helped us find
other bugs. In the future we can test what could break with this
arrangement by enabling CONFIG_DEBUG_KOBJECT_RELEASE.
While at it, update the docs with the context expectations for the
request_queue / gendisk refcount decrement, and make these
expectations explicit by using might_sleep().
Fixes: dc9edc44de ("block: Fix a blk_exit_rl() regression")
Suggested-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Nicolai Stange <nstange@suse.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: yu kuai <yukuai3@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Let us clarify the context under which the helpers to increment the
refcount for the gendisk and request_queue can be called under. We
make this explicit on the places where we may sleep with might_sleep().
We don't address the decrement context yet, as that needs some extra
work and fixes, but will be addressed in the next patch.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This adds documentation for the gendisk / request_queue refcount
helpers.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Provide a way for the caller to specify that IO should be marked
with REQ_NOWAIT to avoid blocking on allocation.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl7VOwMQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpoR7EADAlz3TCkb4wwuHytTBDrm6gVDdsJ9zUfQW
Cl2ASLtufA8PWZUCEI3vhFyOe6P5e+ZZ0O2HjljSevmHyogCaRYXFYVfbWKcQKuk
AcxiTgnYNevh8KbGLfJY1WL4eXsY+C3QUGivg35cCgrx+kr9oDaHMeqA9Tm1plyM
FSprDBoSmHPqRxiV/1gnr8uXLX6K7i/fHzwmKgySMhavum7Ma8W3wdAGebzvQwrO
SbFSuJVgz06e4B1Fzr/wSvVNUE/qW/KqfGuQKIp7VQFIywbgG7TgRMHjE1FSnpnh
gn+BfL+O5gc0sTvcOTGOE0SRWWwLx961WNg8Azq08l3fzsxLA6h8/AnoDf3i+QMA
rHmLpWZIic2xPSvjaFHX3/V9ITyGYeAMpAR77EL+4ivWrKv5JrBhnSLDt1fKILdg
5elxm7RDI+C4nCP4xuTlVCy5gCd6gwjgytKj+NUWhNq1WiGAD0B54SSiV+SbCSH6
Om2f5trcxz8E4pqWcf0k3LjFapVKRNV8v/+TmVkCdRPBl3y9P0h0wFTkkcEquqnJ
y7Yq6efdWviRCnX5w/r/yj0qBuk4xo5hMVsPmlthCWtnBm+xZQ6LwMRcq4HQgZgR
2SYNscZ3OFMekHssH7DvY4DAy1J+n83ims+KzbScbLg2zCZjh/scQuv38R5Eh9WZ
rCS8c+T7Ig==
=HYf4
-----END PGP SIGNATURE-----
Merge tag 'for-5.8/block-2020-06-01' of git://git.kernel.dk/linux-block
Pull block updates from Jens Axboe:
"Core block changes that have been queued up for this release:
- Remove dead blk-throttle and blk-wbt code (Guoqing)
- Include pid in blktrace note traces (Jan)
- Don't spew I/O errors on wouldblock termination (me)
- Zone append addition (Johannes, Keith, Damien)
- IO accounting improvements (Konstantin, Christoph)
- blk-mq hardware map update improvements (Ming)
- Scheduler dispatch improvement (Salman)
- Inline block encryption support (Satya)
- Request map fixes and improvements (Weiping)
- blk-iocost tweaks (Tejun)
- Fix for timeout failing with error injection (Keith)
- Queue re-run fixes (Douglas)
- CPU hotplug improvements (Christoph)
- Queue entry/exit improvements (Christoph)
- Move DMA drain handling to the few drivers that use it (Christoph)
- Partition handling cleanups (Christoph)"
* tag 'for-5.8/block-2020-06-01' of git://git.kernel.dk/linux-block: (127 commits)
block: mark bio_wouldblock_error() bio with BIO_QUIET
blk-wbt: rename __wbt_update_limits to wbt_update_limits
blk-wbt: remove wbt_update_limits
blk-throttle: remove tg_drain_bios
blk-throttle: remove blk_throtl_drain
null_blk: force complete for timeout request
blk-mq: drain I/O when all CPUs in a hctx are offline
blk-mq: add blk_mq_all_tag_iter
blk-mq: open code __blk_mq_alloc_request in blk_mq_alloc_request_hctx
blk-mq: use BLK_MQ_NO_TAG in more places
blk-mq: rename BLK_MQ_TAG_FAIL to BLK_MQ_NO_TAG
blk-mq: move more request initialization to blk_mq_rq_ctx_init
blk-mq: simplify the blk_mq_get_request calling convention
blk-mq: remove the bio argument to ->prepare_request
nvme: force complete cancelled requests
blk-mq: blk-mq: provide forced completion method
block: fix a warning when blkdev.h is included for !CONFIG_BLOCK builds
block: blk-crypto-fallback: remove redundant initialization of variable err
block: reduce part_stat_lock() scope
block: use __this_cpu_add() instead of access by smp_processor_id()
...
Patch series "Change readahead API", v11.
This series adds a readahead address_space operation to replace the
readpages operation. The key difference is that pages are added to the
page cache as they are allocated (and then looked up by the filesystem)
instead of passing them on a list to the readpages operation and having
the filesystem add them to the page cache. It's a net reduction in code
for each implementation, more efficient than walking a list, and solves
the direct-write vs buffered-read problem reported by yu kuai at
http://lkml.kernel.org/r/20200116063601.39201-1-yukuai3@huawei.com
The only unconverted filesystems are those which use fscache. Their
conversion is pending Dave Howells' rewrite which will make the
conversion substantially easier. This should be completed by the end of
the year.
I want to thank the reviewers/testers; Dave Chinner, John Hubbard, Eric
Biggers, Johannes Thumshirn, Dave Sterba, Zi Yan, Christoph Hellwig and
Miklos Szeredi have done a marvellous job of providing constructive
criticism.
These patches pass an xfstests run on ext4, xfs & btrfs with no
regressions that I can tell (some of the tests seem a little flaky
before and remain flaky afterwards).
This patch (of 25):
The readahead code is part of the page cache so should be found in the
pagemap.h file. force_page_cache_readahead is only used within mm, so
move it to mm/internal.h instead. Remove the parameter names where they
add no value, and rename the ones which were actively misleading.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Cc: Chao Yu <yuchao0@huawei.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Gao Xiang <gaoxiang25@huawei.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Link: http://lkml.kernel.org/r/20200414150233.24495-1-willy@infradead.org
Link: http://lkml.kernel.org/r/20200414150233.24495-2-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This reverts commit c58c1f8343.
io_uring does do the right thing for this case, and we're still returning
-EAGAIN to userspace for the cases we don't support. Revert this change
to avoid doing endless spins of resubmits.
Cc: stable@vger.kernel.org # v5.6
Reported-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We only need the stats lock (aka preempt_disable()) for updating the
states, not for looking up or dropping the hd_struct reference.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Jens Axboe <axboe@kernel.dk>