-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmV5528ACgkQONu9yGCS
aT7+DxAAl/t4oGT1Di8mIhCfqsezTj/SQ6HAMaAFpKjGXdgTBX9QavwTHp35qLpv
xZtibdl611oEDT2B3//cvJu9Qs8sJGlDWxpJF8fkF/NED2EHLRwUAOmhc4fyBCBq
01GfPRefU9G5E0Nw2g3o07dD0otKQzENh74iAzUr/cyju5TBgS4NC7CljiD6GXP8
DtCcbz2UMmHF5icyasjw7WoCqKaWUn7KLqU3RyQfmDgnh3z0vSXKs17CFoNk1+R6
EAkZJkrUsIDO3N6aRkJGvwtbEFDws4S7onpcthckXt6IF/OQ+h8LcHvyTpblVeWH
qVBXmj1QZD+SwfP5qVtM+RwHHTE3GVJI3+MVTInu0vzgEz6lhNPILzWwso+AIhHM
+SBZkx4/pO2LbrSgqn+NYWRcBTn4fXNURPd+zLXNJ1ZSnhamWm7kuFK8B36lz0s8
CB5ngLld1wg6m2NVPneqAeEcD54msGd18iNZvnxfx3c28sb27RgLqLJQKdSrtFMH
tZj0+3KGkYHzWQmt07iLmg1DPAmZSNzMW7MDJRHlqDCKS2gBKO/LKxWiri2WPh+o
IvgrfGzEJsyYXi5+JcGC9+RkoPcGaous4VA2kWFJSjXiXPrJC4jknQm3mtdMiTf3
V9DEOMvot9ELGI9RODaNsk59C2RhOuVimm2XWV9n88UcX38reAQ=
=gOna
-----END PGP SIGNATURE-----
Merge 5.4.264 into android11-5.4-lts
Changes in 5.4.264
hrtimers: Push pending hrtimers away from outgoing CPU earlier
netfilter: ipset: fix race condition between swap/destroy and kernel side add/del/test
tg3: Move the [rt]x_dropped counters to tg3_napi
tg3: Increment tx_dropped in tg3_tso_bug()
kconfig: fix memory leak from range properties
drm/amdgpu: correct chunk_ptr to a pointer to chunk.
of: base: Add of_get_cpu_state_node() to get idle states for a CPU node
ACPI/IORT: Make iort_get_device_domain IRQ domain agnostic
ACPI/IORT: Make iort_msi_map_rid() PCI agnostic
of/iommu: Make of_map_rid() PCI agnostic
of/irq: make of_msi_map_get_device_domain() bus agnostic
of/irq: Make of_msi_map_rid() PCI bus agnostic
of: base: Fix some formatting issues and provide missing descriptions
of: Fix kerneldoc output formatting
of: Add missing 'Return' section in kerneldoc comments
of: dynamic: Fix of_reconfig_get_state_change() return value documentation
ipv6: fix potential NULL deref in fib6_add()
hv_netvsc: rndis_filter needs to select NLS
net: arcnet: Fix RESET flag handling
net: arcnet: com20020 fix error handling
arcnet: restoring support for multiple Sohard Arcnet cards
ipv4: ip_gre: Avoid skb_pull() failure in ipgre_xmit()
net: hns: fix fake link up on xge port
netfilter: xt_owner: Fix for unsafe access of sk->sk_socket
tcp: do not accept ACK of bytes we never sent
bpf: sockmap, updating the sg structure should also update curr
RDMA/bnxt_re: Correct module description string
hwmon: (acpi_power_meter) Fix 4.29 MW bug
ASoC: wm_adsp: fix memleak in wm_adsp_buffer_populate
tracing: Fix a warning when allocating buffered events fails
scsi: be2iscsi: Fix a memleak in beiscsi_init_wrb_handle()
ARM: imx: Check return value of devm_kasprintf in imx_mmdc_perf_init
ARM: dts: imx: make gpt node name generic
ARM: dts: imx7: Declare timers compatible with fsl,imx6dl-gpt
ALSA: pcm: fix out-of-bounds in snd_pcm_state_names
nilfs2: prevent WARNING in nilfs_sufile_set_segment_usage()
tracing: Always update snapshot buffer size
tracing: Fix incomplete locking when disabling buffered events
tracing: Fix a possible race when disabling buffered events
packet: Move reference count in packet_sock to atomic_long_t
arm64: dts: mediatek: mt7622: fix memory node warning check
arm64: dts: mediatek: mt8173-evb: Fix regulator-fixed node names
perf/core: Add a new read format to get a number of lost samples
perf: Fix perf_event_validate_size()
gpiolib: sysfs: Fix error handling on failed export
mmc: core: add helpers mmc_regulator_enable/disable_vqmmc
mmc: sdhci-sprd: Fix vqmmc not shutting down after the card was pulled
usb: gadget: f_hid: fix report descriptor allocation
parport: Add support for Brainboxes IX/UC/PX parallel cards
usb: typec: class: fix typec_altmode_put_partner to put plugs
ARM: PL011: Fix DMA support
serial: sc16is7xx: address RX timeout interrupt errata
serial: 8250_omap: Add earlycon support for the AM654 UART controller
x86/CPU/AMD: Check vendor in the AMD microcode callback
KVM: s390/mm: Properly reset no-dat
nilfs2: fix missing error check for sb_set_blocksize call
io_uring/af_unix: disable sending io_uring over sockets
netlink: don't call ->netlink_bind with table lock held
genetlink: add CAP_NET_ADMIN test for multicast bind
psample: Require 'CAP_NET_ADMIN' when joining "packets" group
drop_monitor: Require 'CAP_SYS_ADMIN' when joining "events" group
tools headers UAPI: Sync linux/perf_event.h with the kernel sources
Revert "btrfs: add dmesg output for first mount and last unmount of a filesystem"
cifs: Fix non-availability of dedup breaking generic/304
smb: client: fix potential NULL deref in parse_dfs_referrals()
devcoredump : Serialize devcd_del work
devcoredump: Send uevent once devcd is ready
Linux 5.4.264
Change-Id: I32d19db2a0ff0cf6d061fa9c8ca527d0b61dd158
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
commit 705318a99a138c29a512a72c3e0043b3cd7f55f4 upstream.
File reference cycles have caused lots of problems for io_uring
in the past, and it still doesn't work exactly right and races with
unix_stream_read_generic(). The safest fix would be to completely
disallow sending io_uring files via sockets via SCM_RIGHT, so there
are no possible cycles invloving registered files and thus rendering
SCM accounting on the io_uring side unnecessary.
Cc: <stable@vger.kernel.org>
Fixes: 0091bfc81741b ("io_uring/af_unix: defer registered files gc to io_uring release")
Reported-and-suggested-by: Jann Horn <jannh@google.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/c716c88321939156909cfa1bd8b0faaf1c804103.1701868795.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmVyyWIACgkQONu9yGCS
aT6Y8A//QJPg7pguCawsJGrem3a5dvhi9scNMmfuhKZOKS73JEmt4yudB9IOUjIX
1c1aBcJo5yYMZq5L9mhXnlgkgqENxE9fI45FtMdwoKiriEQ0w9OBLlfZuKN9lwzC
tyIigaGE5DD3SqL8e/04LNmMPPdolM38lJ368fYaD3T4d7LfwK0qHJFL8dSg4OFQ
VaePViMFgbodjtSXoERNjVLaNtSlQDQytiWHMiQX2uf6CIIRbm+zFHn2Se1mUgh3
WGT9JfXZ+achPw6OLhSIjwL+7vowhn3eRETq4zGkkNSK+rmB6W7zjPhou4SYsmc+
FAYXvalmhQWWjlmIyZzO7GIVtgx19VuEYB8h5KLvp6DXQ0h0wCBOGgsfIT4icbgW
wO0R+toWYY3Y79OLRGiMjiL9b60njJYnrm7JrheRD+BIm2jva+Tb7UxhC6QDMfH6
a8fya8iJDNZWggwpx67JUANdMO8e+2rS4ttNxW0gTZSHhyEjo1HXctKBEmmtXk4s
HGNV5xUniPnzrP8rduNqePG5B6c3wqOHUwj45L4scGmeC0DzW7E8EBgkHfRcU6CG
ik9z5nQeDikREfK7cp8OSFtLaEBWSIX57XwHWDTMVPDGTN8EQ6eI7vTnQH3xOhA8
VWFfwcU6avROM/ih7eJ+X4JvuDKcAGTPeD6oF3II0MLPK2m7ZmE=
=p/ty
-----END PGP SIGNATURE-----
Merge 5.4.263 into android11-5.4-lts
Changes in 5.4.263
driver core: Release all resources during unbind before updating device links
RDMA/irdma: Prevent zero-length STAG registration
PCI: keystone: Drop __init from ks_pcie_add_pcie_{ep,port}()
afs: Make error on cell lookup failure consistent with OpenAFS
drm/panel: simple: Fix Innolux G101ICE-L01 bus flags
drm/panel: simple: Fix Innolux G101ICE-L01 timings
ata: pata_isapnp: Add missing error check for devm_ioport_map()
drm/rockchip: vop: Fix color for RGB888/BGR888 format on VOP full
HID: core: store the unique system identifier in hid_device
HID: fix HID device resource race between HID core and debugging support
ipv4: Correct/silence an endian warning in __ip_do_redirect
net: usb: ax88179_178a: fix failed operations during ax88179_reset
arm/xen: fix xen_vcpu_info allocation alignment
amd-xgbe: handle corner-case during sfp hotplug
amd-xgbe: handle the corner-case during tx completion
amd-xgbe: propagate the correct speed and duplex status
net: axienet: Fix check for partial TX checksum
afs: Return ENOENT if no cell DNS record can be found
afs: Fix file locking on R/O volumes to operate in local mode
nvmet: remove unnecessary ctrl parameter
nvmet: nul-terminate the NQNs passed in the connect command
MIPS: KVM: Fix a build warning about variable set but not used
ext4: add a new helper to check if es must be kept
ext4: factor out __es_alloc_extent() and __es_free_extent()
ext4: use pre-allocated es in __es_insert_extent()
ext4: use pre-allocated es in __es_remove_extent()
ext4: using nofail preallocation in ext4_es_remove_extent()
ext4: using nofail preallocation in ext4_es_insert_delayed_block()
ext4: using nofail preallocation in ext4_es_insert_extent()
ext4: fix slab-use-after-free in ext4_es_insert_extent()
ext4: make sure allocate pending entry not fail
arm64: cpufeature: Extract capped perfmon fields
KVM: arm64: limit PMU version to PMUv3 for ARMv8.1
ACPI: resource: Skip IRQ override on ASUS ExpertBook B1402CVA
bcache: replace a mistaken IS_ERR() by IS_ERR_OR_NULL() in btree_gc_coalesce()
s390/dasd: protect device queue against concurrent access
USB: serial: option: add Luat Air72*U series products
hv_netvsc: Fix race of register_netdevice_notifier and VF register
hv_netvsc: Mark VF as slave before exposing it to user-mode
dm-delay: fix a race between delay_presuspend and delay_bio
bcache: check return value from btree_node_alloc_replacement()
bcache: prevent potential division by zero error
USB: serial: option: add Fibocom L7xx modules
USB: serial: option: fix FM101R-GL defines
USB: serial: option: don't claim interface 4 for ZTE MF290
USB: dwc2: write HCINT with INTMASK applied
usb: dwc3: set the dma max_seg_size
USB: dwc3: qcom: fix resource leaks on probe deferral
USB: dwc3: qcom: fix wakeup after probe deferral
io_uring: fix off-by one bvec index
pinctrl: avoid reload of p state in list iteration
firewire: core: fix possible memory leak in create_units()
mmc: block: Do not lose cache flush during CQE error recovery
ALSA: hda: Disable power-save on KONTRON SinglePC
ALSA: hda/realtek: Headset Mic VREF to 100%
ALSA: hda/realtek: Add supported ALC257 for ChromeOS
dm-verity: align struct dm_verity_fec_io properly
dm verity: don't perform FEC for failed readahead IO
bcache: revert replacing IS_ERR_OR_NULL with IS_ERR
powerpc: Don't clobber f0/vs0 during fp|altivec register save
btrfs: add dmesg output for first mount and last unmount of a filesystem
btrfs: fix off-by-one when checking chunk map includes logical address
btrfs: send: ensure send_fd is writable
btrfs: make error messages more clear when getting a chunk map
Input: xpad - add HyperX Clutch Gladiate Support
ipv4: igmp: fix refcnt uaf issue when receiving igmp query packet
net: stmmac: xgmac: Disable FPE MMC interrupts
ravb: Fix races between ravb_tx_timeout_work() and net related ops
net: ravb: Use pm_runtime_resume_and_get()
net: ravb: Start TX queues after HW initialization succeeded
smb3: fix touch -h of symlink
s390/mm: fix phys vs virt confusion in mark_kernel_pXd() functions family
s390/cmma: fix detection of DAT pages
mtd: cfi_cmdset_0001: Support the absence of protection registers
mtd: cfi_cmdset_0001: Byte swap OTP info
fbdev: stifb: Make the STI next font pointer a 32-bit signed offset
ima: annotate iint mutex to avoid lockdep false positive warnings
ovl: skip overlayfs superblocks at global sync
ima: detect changes to the backing overlay file
scsi: qla2xxx: Simplify the code for aborting SCSI commands
scsi: core: Introduce the scsi_cmd_to_rq() function
scsi: qla2xxx: Use scsi_cmd_to_rq() instead of scsi_cmnd.request
scsi: qla2xxx: Fix system crash due to bad pointer access
cpufreq: imx6q: don't warn for disabling a non-existing frequency
cpufreq: imx6q: Don't disable 792 Mhz OPP unnecessarily
mmc: cqhci: Increase recovery halt timeout
mmc: cqhci: Warn of halt or task clear failure
mmc: cqhci: Fix task clearing in CQE error recovery
mmc: core: convert comma to semicolon
mmc: block: Retry commands in CQE error recovery
Linux 5.4.263
Change-Id: I5187b50207d7ed37d7448664448409ed75106ea1
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
commit d6fef34ee4d102be448146f24caf96d7b4a05401 upstream.
If the offset equals the bv_len of the first registered bvec, then the
request does not include any of that first bvec. Skip it so that drivers
don't have to deal with a zero length bvec, which was observed to break
NVMe's PRP list creation.
Cc: stable@vger.kernel.org
Fixes: bd11b3a391 ("io_uring: don't use iov_iter_advance() for fixed buffers")
Signed-off-by: Keith Busch <kbusch@kernel.org>
Link: https://lore.kernel.org/r/20231120221831.2646460-1-kbusch@meta.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Changes in 5.4.245
cdc_ncm: Implement the 32-bit version of NCM Transfer Block
net: cdc_ncm: Deal with too low values of dwNtbOutMaxSize
power: supply: bq27xxx: After charger plug in/out wait 0.5s for things to stabilize
power: supply: core: Refactor power_supply_set_input_current_limit_from_supplier()
power: supply: bq24190: Call power_supply_changed() after updating input current
fs: fix undefined behavior in bit shift for SB_NOUSER
net/mlx5: devcom only supports 2 ports
net/mlx5: Devcom, serialize devcom registration
cdc_ncm: Fix the build warning
io_uring: always grab lock in io_cancel_async_work()
io_uring: don't drop completion lock before timer is fully initialized
io_uring: have io_kill_timeout() honor the request references
bluetooth: Add cmd validity checks at the start of hci_sock_ioctl()
binder: fix UAF caused by faulty buffer cleanup
ipv{4,6}/raw: fix output xfrm lookup wrt protocol
netfilter: ctnetlink: Support offloaded conntrack entry deletion
Linux 5.4.245
Change-Id: I25e786ed304f80b6ccb3896a8b8d2f16384f0cd6
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
No upstream commit exists for this patch.
Don't free the request unconditionally, if the request is issued async
then someone else may be holding a submit reference to it.
Reported-and-tested-by: Lee Jones <lee@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
No upstream commit exists for this patch.
If we drop the lock right after adding it to the timeout list, then
someone attempting to kill timeouts will find it in an indeterminate
state. That means that cancelation could attempt to cancel and remove
a timeout, and then io_timeout() proceeds to init and add the timer
afterwards.
Ensure the timeout request is fully setup before we drop the
completion lock, which guards cancelation as well.
Reported-and-tested-by: Lee Jones <lee@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
No upstream commit exists for this patch.
It's not necessarily safe to check the task_list locklessly, remove
this micro optimization and always grab task_lock before deeming it
empty.
Reported-and-tested-by: Lee Jones <lee@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmNZGM8ACgkQONu9yGCS
aT6cjQ/+JSj2g4OKD3WLjhnyy3+GJC7GdHvD8dvMkX/DNW+DD+Ja32O00Jfwi7F1
NMP/AglR4Y5aL3LvCyBR3SLj7Hq8pGOLYpLT8FxtFf7NSCXumUZmnLjRCDUqzovE
W1ObC5EIJ1WMArZc28ECq5EGqOLuqiRcZyel4yDM71ttJ6AglEgOvhGIZMDDEaIh
7rTKgplaU0rNwiOrh16PwUjXVd7AW3dkVCN+Mog96hgkrfokCTVj00QHy2DxEFV4
JKrmrQBSwK36Db02k1+V2kpaKzVflPA1ZHAPee9SfJG50kfEoOOvjg9Yo0csMvqV
LbYXiDhd04oF37Gf73PNhQyFVdyJYZstw1BOO5M/etYN9CNEGrWC1jR3XculxPdx
oIN5Cy+9jBBAJOMxMi7Zx2ZSnacaSlKQq1faVFyv9ekA53HFKPKHUwy4jOGcM/rR
yJw0r+IkCSYv4zTzUc2XM5n+3PXCBtXnrG7yVsihZiHxt4MZvQ5+J/aI88L8vOYa
5mkt8hQ75cZmWiCQOzR2TcVwy/FoPoGlKUWZIO8XYCDLVNgUyqSyTPhe7+9AU7HK
rKHTktX7BJ/202xRypqc4tRuOhRZ3W3Htzq9Dmhf0so61D9Ayzrdm7/eiNto+1ru
nU+V4I740is9x1CMyUU30pHretuhUdz0cuhgpwHeiF2ki/21J6A=
=JFUC
-----END PGP SIGNATURE-----
Merge 5.4.220 into android11-5.4-lts
Changes in 5.4.220
ALSA: oss: Fix potential deadlock at unregistration
ALSA: rawmidi: Drop register_mutex in snd_rawmidi_free()
ALSA: usb-audio: Fix potential memory leaks
ALSA: usb-audio: Fix NULL dererence at error path
ALSA: hda/realtek: remove ALC289_FIXUP_DUAL_SPK for Dell 5530
ALSA: hda/realtek: Correct pin configs for ASUS G533Z
ALSA: hda/realtek: Add quirk for ASUS GV601R laptop
ALSA: hda/realtek: Add Intel Reference SSID to support headset keys
mtd: rawnand: atmel: Unmap streaming DMA mappings
cifs: destage dirty pages before re-reading them for cache=none
cifs: Fix the error length of VALIDATE_NEGOTIATE_INFO message
iio: dac: ad5593r: Fix i2c read protocol requirements
iio: pressure: dps310: Refactor startup procedure
iio: pressure: dps310: Reset chip after timeout
usb: add quirks for Lenovo OneLink+ Dock
can: kvaser_usb: Fix use of uninitialized completion
can: kvaser_usb_leaf: Fix overread with an invalid command
can: kvaser_usb_leaf: Fix TX queue out of sync after restart
can: kvaser_usb_leaf: Fix CAN state after restart
mmc: sdhci-sprd: Fix minimum clock limit
fs: dlm: fix race between test_bit() and queue_work()
fs: dlm: handle -EBUSY first in lock arg validation
HID: multitouch: Add memory barriers
quota: Check next/prev free block number after reading from quota file
ASoC: wcd9335: fix order of Slimbus unprepare/disable
regulator: qcom_rpm: Fix circular deferral regression
RISC-V: Make port I/O string accessors actually work
parisc: fbdev/stifb: Align graphics memory size to 4MB
riscv: Allow PROT_WRITE-only mmap()
riscv: Pass -mno-relax only on lld < 15.0.0
UM: cpuinfo: Fix a warning for CONFIG_CPUMASK_OFFSTACK
PCI: Sanitise firmware BAR assignments behind a PCI-PCI bridge
powerpc/boot: Explicitly disable usage of SPE instructions
fbdev: smscufx: Fix use-after-free in ufx_ops_open()
btrfs: fix race between quota enable and quota rescan ioctl
f2fs: increase the limit for reserve_root
f2fs: fix to do sanity check on destination blkaddr during recovery
f2fs: fix to do sanity check on summary info
nilfs2: fix use-after-free bug of struct nilfs_root
jbd2: wake up journal waiters in FIFO order, not LIFO
ext4: avoid crash when inline data creation follows DIO write
ext4: fix null-ptr-deref in ext4_write_info
ext4: make ext4_lazyinit_thread freezable
ext4: place buffer head allocation before handle start
livepatch: fix race between fork and KLP transition
ftrace: Properly unset FTRACE_HASH_FL_MOD
ring-buffer: Allow splice to read previous partially read pages
ring-buffer: Have the shortest_full queue be the shortest not longest
ring-buffer: Check pending waiters when doing wake ups as well
ring-buffer: Fix race between reset page and reading page
media: cedrus: Set the platform driver data earlier
KVM: x86/emulator: Fix handing of POP SS to correctly set interruptibility
KVM: nVMX: Unconditionally purge queued/injected events on nested "exit"
KVM: VMX: Drop bits 31:16 when shoving exception error code into VMCS
gcov: support GCC 12.1 and newer compilers
drm/nouveau: fix a use-after-free in nouveau_gem_prime_import_sg_table()
selinux: use "grep -E" instead of "egrep"
tracing: Disable interrupt or preemption before acquiring arch_spinlock_t
userfaultfd: open userfaultfds with O_RDONLY
sh: machvec: Use char[] for section boundaries
ARM: 9247/1: mm: set readonly for MT_MEMORY_RO with ARM_LPAE
nfsd: Fix a memory leak in an error handling path
wifi: ath10k: add peer map clean up for peer delete in ath10k_sta_state()
wifi: mac80211: allow bw change during channel switch in mesh
bpftool: Fix a wrong type cast in btf_dumper_int
x86/resctrl: Fix to restore to original value when re-enabling hardware prefetch register
wifi: rtl8xxxu: tighten bounds checking in rtl8xxxu_read_efuse()
spi: qup: add missing clk_disable_unprepare on error in spi_qup_resume()
spi: qup: add missing clk_disable_unprepare on error in spi_qup_pm_resume_runtime()
wifi: rtl8xxxu: Fix skb misuse in TX queue selection
bpf: btf: fix truncated last_member_type_id in btf_struct_resolve
wifi: rtl8xxxu: gen2: Fix mistake in path B IQ calibration
net: fs_enet: Fix wrong check in do_pd_setup
bpf: Ensure correct locking around vulnerable function find_vpid()
x86/microcode/AMD: Track patch allocation size explicitly
spi/omap100k:Fix PM disable depth imbalance in omap1_spi100k_probe
netfilter: nft_fib: Fix for rpath check with VRF devices
spi: s3c64xx: Fix large transfers with DMA
vhost/vsock: Use kvmalloc/kvfree for larger packets.
mISDN: fix use-after-free bugs in l1oip timer handlers
sctp: handle the error returned from sctp_auth_asoc_init_active_key
tcp: fix tcp_cwnd_validate() to not forget is_cwnd_limited
net: rds: don't hold sock lock when cancelling work from rds_tcp_reset_callbacks()
bnx2x: fix potential memory leak in bnx2x_tpa_stop()
net/ieee802154: reject zero-sized raw_sendmsg()
once: add DO_ONCE_SLOW() for sleepable contexts
net: mvpp2: fix mvpp2 debugfs leak
drm: bridge: adv7511: fix CEC power down control register offset
drm/mipi-dsi: Detach devices when removing the host
platform/chrome: fix double-free in chromeos_laptop_prepare()
platform/chrome: fix memory corruption in ioctl
platform/x86: msi-laptop: Fix old-ec check for backlight registering
platform/x86: msi-laptop: Fix resource cleanup
drm: fix drm_mipi_dbi build errors
drm/bridge: megachips: Fix a null pointer dereference bug
ASoC: rsnd: Add check for rsnd_mod_power_on
ALSA: hda: beep: Simplify keep-power-at-enable behavior
drm/omap: dss: Fix refcount leak bugs
mmc: au1xmmc: Fix an error handling path in au1xmmc_probe()
ASoC: eureka-tlv320: Hold reference returned from of_find_xxx API
drm/msm/dpu: index dpu_kms->hw_vbif using vbif_idx
ALSA: dmaengine: increment buffer pointer atomically
mmc: wmt-sdmmc: Fix an error handling path in wmt_mci_probe()
ASoC: wm8997: Fix PM disable depth imbalance in wm8997_probe
ASoC: wm5110: Fix PM disable depth imbalance in wm5110_probe
ASoC: wm5102: Fix PM disable depth imbalance in wm5102_probe
ALSA: hda/hdmi: Don't skip notification handling during PM operation
memory: pl353-smc: Fix refcount leak bug in pl353_smc_probe()
memory: of: Fix refcount leak bug in of_get_ddr_timings()
soc: qcom: smsm: Fix refcount leak bugs in qcom_smsm_probe()
soc: qcom: smem_state: Add refcounting for the 'state->of_node'
ARM: dts: turris-omnia: Fix mpp26 pin name and comment
ARM: dts: kirkwood: lsxl: fix serial line
ARM: dts: kirkwood: lsxl: remove first ethernet port
ARM: dts: exynos: correct s5k6a3 reset polarity on Midas family
ARM: Drop CMDLINE_* dependency on ATAGS
ARM: dts: exynos: fix polarity of VBUS GPIO of Origen
iio: adc: at91-sama5d2_adc: fix AT91_SAMA5D2_MR_TRACKTIM_MAX
iio: adc: at91-sama5d2_adc: check return status for pressure and touch
iio: adc: at91-sama5d2_adc: lock around oversampling and sample freq
iio: inkern: only release the device node when done with it
iio: ABI: Fix wrong format of differential capacitance channel ABI.
clk: meson: Hold reference returned by of_get_parent()
clk: oxnas: Hold reference returned by of_get_parent()
clk: berlin: Add of_node_put() for of_get_parent()
clk: tegra: Fix refcount leak in tegra210_clock_init
clk: tegra: Fix refcount leak in tegra114_clock_init
clk: tegra20: Fix refcount leak in tegra20_clock_init
HSI: omap_ssi: Fix refcount leak in ssi_probe
HSI: omap_ssi_port: Fix dma_map_sg error check
media: exynos4-is: fimc-is: Add of_node_put() when breaking out of loop
tty: xilinx_uartps: Fix the ignore_status
media: xilinx: vipp: Fix refcount leak in xvip_graph_dma_init
RDMA/rxe: Fix "kernel NULL pointer dereference" error
RDMA/rxe: Fix the error caused by qp->sk
misc: ocxl: fix possible refcount leak in afu_ioctl()
dyndbg: fix module.dyndbg handling
dyndbg: let query-modname override actual module name
mtd: devices: docg3: check the return value of devm_ioremap() in the probe
RDMA/siw: Always consume all skbuf data in sk_data_ready() upcall.
ata: fix ata_id_sense_reporting_enabled() and ata_id_has_sense_reporting()
ata: fix ata_id_has_devslp()
ata: fix ata_id_has_ncq_autosense()
ata: fix ata_id_has_dipm()
mtd: rawnand: meson: fix bit map use in meson_nfc_ecc_correct()
md/raid5: Ensure stripe_fill happens on non-read IO with journal
xhci: Don't show warning for reinit on known broken suspend
usb: gadget: function: fix dangling pnp_string in f_printer.c
drivers: serial: jsm: fix some leaks in probe
tty: serial: fsl_lpuart: disable dma rx/tx use flags in lpuart_dma_shutdown
phy: qualcomm: call clk_disable_unprepare in the error handling
staging: vt6655: fix some erroneous memory clean-up loops
firmware: google: Test spinlock on panic path to avoid lockups
serial: 8250: Fix restoring termios speed after suspend
scsi: libsas: Fix use-after-free bug in smp_execute_task_sg()
fsi: core: Check error number after calling ida_simple_get
mfd: intel_soc_pmic: Fix an error handling path in intel_soc_pmic_i2c_probe()
mfd: fsl-imx25: Fix an error handling path in mx25_tsadc_setup_irq()
mfd: lp8788: Fix an error handling path in lp8788_probe()
mfd: lp8788: Fix an error handling path in lp8788_irq_init() and lp8788_irq_init()
mfd: fsl-imx25: Fix check for platform_get_irq() errors
mfd: sm501: Add check for platform_driver_register()
clk: mediatek: mt8183: mfgcfg: Propagate rate changes to parent
dmaengine: ioat: stop mod_timer from resurrecting deleted timer in __cleanup()
spmi: pmic-arb: correct duplicate APID to PPID mapping logic
clk: bcm2835: fix bcm2835_clock_rate_from_divisor declaration
clk: ti: dra7-atl: Fix reference leak in of_dra7_atl_clk_probe
clk: ast2600: BCLK comes from EPLL
mailbox: bcm-ferxrm-mailbox: Fix error check for dma_map_sg
powerpc/math_emu/efp: Include module.h
powerpc/sysdev/fsl_msi: Add missing of_node_put()
powerpc/pci_dn: Add missing of_node_put()
powerpc/powernv: add missing of_node_put() in opal_export_attrs()
x86/hyperv: Fix 'struct hv_enlightened_vmcs' definition
powerpc/64s: Fix GENERIC_CPU build flags for PPC970 / G5
powerpc: Fix SPE Power ISA properties for e500v1 platforms
cgroup/cpuset: Enable update_tasks_cpumask() on top_cpuset
iommu/omap: Fix buffer overflow in debugfs
crypto: akcipher - default implementation for setting a private key
crypto: ccp - Release dma channels before dmaengine unrgister
iommu/iova: Fix module config properly
kbuild: remove the target in signal traps when interrupted
crypto: cavium - prevent integer overflow loading firmware
f2fs: fix race condition on setting FI_NO_EXTENT flag
ACPI: video: Add Toshiba Satellite/Portege Z830 quirk
MIPS: BCM47XX: Cast memcmp() of function to (void *)
powercap: intel_rapl: fix UBSAN shift-out-of-bounds issue
thermal: intel_powerclamp: Use get_cpu() instead of smp_processor_id() to avoid crash
NFSD: Return nfserr_serverfault if splice_ok but buf->pages have data
wifi: brcmfmac: fix invalid address access when enabling SCAN log level
bpftool: Clear errno after libcap's checks
openvswitch: Fix double reporting of drops in dropwatch
openvswitch: Fix overreporting of drops in dropwatch
tcp: annotate data-race around tcp_md5sig_pool_populated
wifi: ath9k: avoid uninit memory read in ath9k_htc_rx_msg()
xfrm: Update ipcomp_scratches with NULL when freed
wifi: brcmfmac: fix use-after-free bug in brcmf_netdev_start_xmit()
Bluetooth: L2CAP: initialize delayed works at l2cap_chan_create()
Bluetooth: hci_sysfs: Fix attempting to call device_add multiple times
can: bcm: check the result of can_send() in bcm_can_tx()
wifi: rt2x00: don't run Rt5592 IQ calibration on MT7620
wifi: rt2x00: set correct TX_SW_CFG1 MAC register for MT7620
wifi: rt2x00: set VGC gain for both chains of MT7620
wifi: rt2x00: set SoC wmac clock register
wifi: rt2x00: correctly set BBP register 86 for MT7620
net: If sock is dead don't access sock's sk_wq in sk_stream_wait_memory
Bluetooth: L2CAP: Fix user-after-free
r8152: Rate limit overflow messages
drm/nouveau/nouveau_bo: fix potential memory leak in nouveau_bo_alloc()
drm: Use size_t type for len variable in drm_copy_field()
drm: Prevent drm_copy_field() to attempt copying a NULL pointer
drm/amd/display: fix overflow on MIN_I64 definition
drm/vc4: vec: Fix timings for VEC modes
drm: panel-orientation-quirks: Add quirk for Anbernic Win600
platform/x86: msi-laptop: Change DMI match / alias strings to fix module autoloading
drm/amdgpu: fix initial connector audio value
mmc: sdhci-msm: add compatible string check for sdm670
ARM: dts: imx7d-sdb: config the max pressure for tsc2046
ARM: dts: imx6q: add missing properties for sram
ARM: dts: imx6dl: add missing properties for sram
ARM: dts: imx6qp: add missing properties for sram
ARM: dts: imx6sl: add missing properties for sram
ARM: dts: imx6sll: add missing properties for sram
ARM: dts: imx6sx: add missing properties for sram
btrfs: scrub: try to fix super block errors
clk: zynqmp: Fix stack-out-of-bounds in strncpy`
media: cx88: Fix a null-ptr-deref bug in buffer_prepare()
clk: zynqmp: pll: rectify rate rounding in zynqmp_pll_round_rate
scsi: 3w-9xxx: Avoid disabling device if failing to enable it
nbd: Fix hung when signal interrupts nbd_start_device_ioctl()
power: supply: adp5061: fix out-of-bounds read in adp5061_get_chg_type()
staging: vt6655: fix potential memory leak
ata: libahci_platform: Sanity check the DT child nodes number
bcache: fix set_at_max_writeback_rate() for multiple attached devices
HID: roccat: Fix use-after-free in roccat_read()
md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d
usb: host: xhci: Fix potential memory leak in xhci_alloc_stream_info()
usb: musb: Fix musb_gadget.c rxstate overflow bug
Revert "usb: storage: Add quirk for Samsung Fit flash"
staging: rtl8723bs: fix a potential memory leak in rtw_init_cmd_priv()
nvme: copy firmware_rev on each init
nvmet-tcp: add bounds check on Transfer Tag
usb: idmouse: fix an uninit-value in idmouse_open
clk: bcm2835: Make peripheral PLLC critical
perf intel-pt: Fix segfault in intel_pt_print_info() with uClibc
io_uring/af_unix: defer registered files gc to io_uring release
net: ieee802154: return -EINVAL for unknown addr type
Revert "net/ieee802154: reject zero-sized raw_sendmsg()"
net/ieee802154: don't warn zero-sized raw_sendmsg()
ext4: continue to expand file system when the target size doesn't reach
md: Replace snprintf with scnprintf
efi: libstub: drop pointless get_memory_map() call
inet: fully convert sk->sk_rx_dst to RCU rules
thermal: intel_powerclamp: Use first online CPU as control_cpu
Linux 5.4.220
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I91859d6b79f44ab654cb0c88d0d6c9c46f62131b
[ upstream commit 0091bfc81741b8d3aeb3b7ab8636f911b2de6e80 ]
Instead of putting io_uring's registered files in unix_gc() we want it
to be done by io_uring itself. The trick here is to consider io_uring
registered files for cycle detection but not actually putting them down.
Because io_uring can't register other ring instances, this will remove
all refs to the ring file triggering the ->release path and clean up
with io_ring_ctx_free().
Cc: stable@vger.kernel.org
Fixes: 6b06314c47 ("io_uring: add file set registration")
Reported-and-tested-by: David Bouman <dbouman03@gmail.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
[axboe: add kerneldoc comment to skb, fold in skb leak fix]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit fc78b2fc21.
This breaks the Android api and for now, does not seem to be necessary
due to the lack of io_uring users in this kernel branch. If io_uring
starts to be used more, it can be brought back in a ABI-safe way.
Bug: 161946584
Bug: 248008710
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I2696bd5e1ad61d3ab0e8d06f4ffe46718bb05845
Older kernels lack io_uring POLLFREE handling. As only affected files
are signalfd and android binder the safest option would be to disable
polling those files via io_uring and hope there are no users.
Fixes: 221c5eb233 ("io_uring: add support for IORING_OP_POLL")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is a bunch of cases where we can grab req->fs but not put it, this
can be used to cause a controllable overflow with further implications.
Release req->fs in the request free path and make sure we zero the field
to be sure we don't do it twice.
Fixes: cac68d12c5 ("io_uring: grab ->fs as part of async offload")
Reported-by: Bing-Jhong Billy Jheng <billy@starlabs.sg>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
No upstream commit, this is a fix to a stable 5.4 specific backport.
The intention of backport commit cac68d12c5 ("io_uring: grab ->fs as part
of async offload") as found in the stable 5.4 tree was to make
io_sq_wq_submit_work() to switch the workqueue task's ->fs over to the
submitting task's one during the IO operation.
However, due to a small logic error, this change turned out to not have any
actual effect. From a high level, the relevant code in
io_sq_wq_submit_work() looks like
old_fs_struct = current->fs;
do {
...
if (req->fs != current->fs && current->fs != old_fs_struct) {
task_lock(current);
if (req->fs)
current->fs = req->fs;
else
current->fs = old_fs_struct;
task_unlock(current);
}
...
} while (req);
The if condition is supposed to cover the case that current->fs doesn't
match what's needed for processing the request, but observe how it fails
to ever evaluate to true due to the second clause:
current->fs != old_fs_struct will be false in the first iteration as per
the initialization of old_fs_struct and because this prevents current->fs
from getting replaced, the same follows inductively for all subsequent
iterations.
Fix said if condition such that
- if req->fs is set and doesn't match current->fs, the latter will be
switched to the former
- or if req->fs is unset, the switch back to the initial old_fs_struct
will be made, if necessary.
While at it, also correct the condition for the ->fs related cleanup right
before the return of io_sq_wq_submit_work(): currently, old_fs_struct is
restored only if it's non-NULL. It is always non-NULL though and thus, the
if-condition is rendundant. Supposedly, the motivation had been to optimize
and avoid switching current->fs back to the initial old_fs_struct in case
it is found to have the desired value already. Make it so.
Cc: stable@vger.kernel.org # v5.4
Fixes: cac68d12c5 ("io_uring: grab ->fs as part of async offload")
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If we queue work in io_poll_wake(), it will leads to list double
add. So we should add the list when the callback func is the
io_sq_wq_submit_work.
The following oops was seen:
list_add double add: new=ffff9ca6a8f1b0e0, prev=ffff9ca62001cee8,
next=ffff9ca6a8f1b0e0.
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:31!
Call Trace:
<IRQ>
io_poll_wake+0xf3/0x230
__wake_up_common+0x91/0x170
__wake_up_common_lock+0x7a/0xc0
io_commit_cqring+0xea/0x280
? blkcg_iolatency_done_bio+0x2b/0x610
io_cqring_add_event+0x3e/0x60
io_complete_rw+0x58/0x80
dio_complete+0x106/0x250
blk_update_request+0xa0/0x3b0
blk_mq_end_request+0x1a/0x110
blk_mq_complete_request+0xd0/0xe0
nvme_irq+0x129/0x270 [nvme]
__handle_irq_event_percpu+0x7b/0x190
handle_irq_event_percpu+0x30/0x80
handle_irq_event+0x3c/0x60
handle_edge_irq+0x91/0x1e0
do_IRQ+0x4d/0xd0
common_interrupt+0xf/0xf
Fixes: 1c4404efcf ("io_uring: make sure async workqueue is canceled on exit")
Reported-by: Jiachen Zhang <zhangjiachen.jaycee@bytedance.com>
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If the process 0 has been initialized io_uring is complete, and
then fork process 1. If process 1 exits and it leads to delete
all reqs from the task_list. If we kill process 0. We will not
send SIGINT signal to the kworker. So we can not remove the req
from the task_list. The io_sq_wq_submit_work() can do that for
us.
Fixes: 1c4404efcf ("io_uring: make sure async workqueue is canceled on exit")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The store to req->flags and load req->work_task should not be
reordering in io_cancel_async_work(). We should make sure that
either we store REQ_F_CANCE flag to req->flags or we see the
req->work_task setted in io_sq_wq_submit_work().
Fixes: 1c4404efcf ("io_uring: make sure async workqueue is canceled on exit")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The commit
1c4404efcf2c0> ("<io_uring: make sure async workqueue is canceled on exit>")
doesn't solve the resource leak problem totally! When kworker is doing a
io task for the io_uring, The process which submitted the io task has
received a SIGKILL signal from the user. Then the io_cancel_async_work
function could have sent a SIGINT signal to the kworker, but the judging
condition is wrong. So it doesn't send a SIGINT signal to the kworker,
then caused the resource leaking problem.
Why the juding condition is wrong? The process is a multi-threaded process,
we call the thread of the process which has submitted the io task Thread1.
So the req->task is the current macro of the Thread1. when all the threads
of the process have done exit procedure, the last thread will call the
io_cancel_async_work, but the last thread may not the Thread1, so the task
is not equal and doesn't send the SIGINT signal. To fix this bug, we alter
the task attribute of the req with struct files_struct. And check the files
instead.
Fixes: 1c4404efcf ("io_uring: make sure async workqueue is canceled on exit")
Signed-off-by: Yinyin Zhu <zhuyinyin@bytedance.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bd74048108c179cea0ff52979506164c80f29da7 upstream.
If we hit an earlier error path in io_uring_create(), then we will have
accounted memory, but not set ctx->{sq,cq}_entries yet. Then when the
ring is torn down in error, we use those values to unaccount the memory.
Ensure we set the ctx entries before we're able to hit a potential error
path.
Cc: stable@vger.kernel.org
Reported-by: Tomáš Chaloupka <chalucha@gmail.com>
Tested-by: Tomáš Chaloupka <chalucha@gmail.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b36200f543ff07a1cb346aa582349141df2c8068 ]
rings_size() sets sq_offset to the total size of the rings (the returned
value which is used for memory allocation). This is wrong: sq array should
be located within the rings, not after them. Set sq_offset to where it
should be.
Fixes: 75b28affdd ("io_uring: allocate the two rings together")
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Hristo Venev <hristo@venev.name>
Cc: io-uring@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
the commit <a4d61e66ee4a> ("<io_uring: prevent re-read of sqe->opcode>")
caused another vulnerability. After io_get_req(), the sqe_submit struct
in req is not initialized, but the following code defaults that
req->submit.opcode is available.
Signed-off-by: Liu Yong <pkfxxxing@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
when ctx->sqo_mm is zero, io_sq_wq_submit_work() frees 'req'
without deleting it from 'task_list'. After that, 'req' is
accessed in io_ring_ctx_wait_and_kill() which lead to
a use-after-free.
Signed-off-by: Guoyu Huang <hgy5945@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Liu reports that he can trigger a NULL pointer dereference with
IORING_OP_SENDMSG, by changing the sqe->opcode after we've validated
that the previous opcode didn't need a file and didn't assign one.
Ensure we validate and read the opcode only once.
Reported-by: Liu Yong <pkfxxxing@gmail.com>
Tested-by: Liu Yong <pkfxxxing@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Track async work items that we queue, so we can safely cancel them
if the ring is closed or the process exits. Newer kernels handle
this automatically with io-wq, but the old workqueue based setup needs
a bit of special help to get there.
There's no upstream variant of this, as that would require backporting
all the io-wq changes from 5.5 and on. Hence I made a one-off that
ensures that we don't leak memory if we have async work items that
need active cancelation (like socket IO).
Reported-by: Agarwal, Anchal <anchalag@amazon.com>
Tested-by: Agarwal, Anchal <anchalag@amazon.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit a8c73c1a614f6da6c0b04c393f87447e28cb6de4 upstream.
Use kvfree() to free the pages and vmas, since they are allocated by
kvmalloc_array() in a loop.
Fixes: d4ef647510 ("io_uring: avoid page allocation warnings")
Signed-off-by: Denis Efremov <efremov@linux.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200605093203.40087-1-efremov@linux.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4ed734b0d0913e566a9d871e15d24eb240f269f7 upstream.
With the previous fixes for number of files open checking, I added some
debug code to see if we had other spots where we're checking rlimit()
against the async io-wq workers. The only one I found was file size
checking, which we should also honor.
During write and fallocate prep, store the max file size and override
that for the current ask if we're in io-wq worker context.
Cc: stable@vger.kernel.org # 5.1+
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c336e992cb1cb1db9ee608dfb30342ae781057ab upstream.
We already checked this limit when the file was opened, and we keep it
open in the file table. Hence when we added unit_inflight to the count
we want to register, we're doubly accounting these files. This results
in -EMFILE for file registration, if we're at half the limit.
Cc: stable@vger.kernel.org # v5.1+
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d876836204897b6d7d911f942084f69a1e9d5c4d upstream.
We must set MSG_CMSG_COMPAT if we're in compatability mode, otherwise
the iovec import for these commands will not do the right thing and fail
the command with -EINVAL.
Found by running the test suite compiled as 32-bit.
Cc: stable@vger.kernel.org
Fixes: aa1fa28fc7 ("io_uring: add support for recvmsg()")
Fixes: 0fa03c624d ("io_uring: add support for sendmsg()")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commits 9392a27d88b9 and ff002b30181d ]
Ensure that the async work grabs ->fs from the queueing task if the
punted commands needs to do lookups.
We don't have these two commits in 5.4-stable:
ff002b30181d30cdfbca316dadd099c3ca0d739c
9392a27d88b9707145d713654eb26f0c29789e50
because they don't apply with the rework that was done in how io_uring
handles offload. Since there's no io-wq in 5.4, it doesn't make sense to
do two patches. I'm attaching my port of the two for 5.4-stable, it's
been tested. Please queue it up for the next 5.4-stable, thanks!
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 7143b5ac5750f404ff3a594b34fdf3fc2f99f828 upstream.
This patch drops 'cur_mm' before calling cond_resched(), to prevent
the sq_thread from spinning even when the user process is finished.
Before this patch, if the user process ended without closing the
io_uring fd, the sq_thread continues to spin until the
'sq_thread_idle' timeout ends.
In the worst case where the 'sq_thread_idle' parameter is bigger than
INT_MAX, the sq_thread will spin forever.
Fixes: 6c271ce2f1 ("io_uring: add submission polling")
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c7849be9cc2dd2754c48ddbaca27c2de6d80a95d upstream.
Since commit a3a0e43fd7 ("io_uring: don't enter poll loop if we have
CQEs pending"), if we already events pending, we won't enter poll loop.
In case SETUP_IOPOLL and SETUP_SQPOLL are both enabled, if app has
been terminated and don't reap pending events which are already in cq
ring, and there are some reqs in poll_list, io_sq_thread will enter
__io_iopoll_check(), and find pending events, then return, this loop
will never have a chance to exit.
I have seen this issue in fio stress tests, to fix this issue, let
io_sq_thread call io_iopoll_getevents() with argument 'min' being zero,
and remove __io_iopoll_check().
Fixes: a3a0e43fd7 ("io_uring: don't enter poll loop if we have CQEs pending")
Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 73e08e711d9c1d79fae01daed4b0e1fee5f8a275 upstream.
This ends up being too restrictive for tasks that willingly fork and
share the ring between forks. Andres reports that this breaks his
postgresql work. Since we're close to 5.5 release, revert this change
for now.
Cc: stable@vger.kernel.org
Fixes: 44d282796f81 ("io_uring: only allow submit from owning task")
Reported-by: Andres Freund <andres@anarazel.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 44d282796f81eb1debc1d7cb53245b4cb3214cb5 upstream.
If the credentials or the mm doesn't match, don't allow the task to
submit anything on behalf of this ring. The task that owns the ring can
pass the file descriptor to another task, but we don't want to allow
that task to submit an SQE that then assumes the ring mm and creds if
it needs to go async.
Cc: stable@vger.kernel.org
Suggested-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7c504e65206a4379ff38fe41d21b32b6c2c3e53e ]
There is no reliable way to submit and wait in a single syscall, as
io_submit_sqes() may under-consume sqes (in case of an early error).
Then it will wait for not-yet-submitted requests, deadlocking the user
in most cases.
Don't wait/poll if can't submit all sqes
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit eb065d301e8c83643367bdb0898becc364046bda ]
We currently rely on the ring destroy on cleaning things up in case of
failure, but io_allocate_scq_urings() can leave things half initialized
if only parts of it fails.
Be nice and return with either everything setup in success, or return an
error with things nicely cleaned up.
Reported-by: syzbot+0d818c0d39399188f393@syzkaller.appspotmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
There's an issue with deferred requests through drain, where if we do
need to defer, we're not copying over the sqe_submit state correctly.
This can result in using uninitialized data when we then later go and
submit the deferred request, like this check in __io_submit_sqe():
if (unlikely(s->index >= ctx->sq_entries))
return -EINVAL;
with 's' being uninitialized, we can randomly fail this check. Fix this
by copying sqe_submit state when we defer a request.
Because it was fixed as part of a cleanup series in mainline, before
anyone realized we had this issue. That removed the separate states
of ->index vs ->submit.sqe. That series is not something I was
comfortable putting into stable, hence the much simpler addition.
Here's the patch in the series that fixes the same issue:
commit cf6fd4bd559ee61a4454b161863c8de6f30f8dca
Author: Pavel Begunkov <asml.silence@gmail.com>
Date: Mon Nov 25 23:14:39 2019 +0300
io_uring: inline struct sqe_submit
Reported-by: Andres Freund <andres@anarazel.de>
Reported-by: Tomáš Chaloupka
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit aa4c3967756c6c576a38a23ac511be211462a6b7 upstream.
Christophe reports that current master fails building on powerpc with
this error:
CC fs/io_uring.o
fs/io_uring.c: In function ‘loop_rw_iter’:
fs/io_uring.c:1628:21: error: implicit declaration of function ‘kmap’
[-Werror=implicit-function-declaration]
iovec.iov_base = kmap(iter->bvec->bv_page)
^
fs/io_uring.c:1628:19: warning: assignment makes pointer from integer
without a cast [-Wint-conversion]
iovec.iov_base = kmap(iter->bvec->bv_page)
^
fs/io_uring.c:1643:4: error: implicit declaration of function ‘kunmap’
[-Werror=implicit-function-declaration]
kunmap(iter->bvec->bv_page);
^
which is caused by a missing highmem.h include. Fix it by including
it.
Fixes: 311ae9e159d8 ("io_uring: fix dead-hung for non-iter fixed rw")
Reported-by: Christophe Leroy <christophe.leroy@c-s.fr>
Tested-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 441cdbd5449b4923cd413d3ba748124f91388be9 upstream.
We should never return -ERESTARTSYS to userspace, transform it into
-EINTR.
Cc: stable@vger.kernel.org # v5.3+
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 311ae9e159d81a1ec1cf645daf40b39ae5a0bd84 upstream.
Read/write requests to devices without implemented read/write_iter
using fixed buffers can cause general protection fault, which totally
hangs a machine.
io_import_fixed() initialises iov_iter with bvec, but loop_rw_iter()
accesses it as iovec, dereferencing random address.
kmap() page by page in this case
Cc: stable@vger.kernel.org
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 181e448d8709e517c9c7b523fcd209f24eb38ca7 ]
If we don't inherit the original task creds, then we can confuse users
like fuse that pass creds in the request header. See link below on
identical aio issue.
Link: https://lore.kernel.org/linux-fsdevel/26f0d78e-99ca-2f1b-78b9-433088053a61@scylladb.com/T/#u
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
A test case was reported where two linked reads with registered buffers
failed the second link always. This is because we set the expected value
of a request in req->result, and if we don't get this result, then we
fail the dependent links. For some reason the registered buffer import
returned -ERROR/0, while the normal import returns -ERROR/length. This
broke linked commands with registered buffers.
Fix this by making io_import_fixed() correctly return the mapped length.
Cc: stable@vger.kernel.org # v5.3
Reported-by: 李通洲 <carter.li@eoitek.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
For timeout requests io_uring tries to grab a file with specified fd,
which is usually stdin/fd=0.
Update io_op_needs_file()
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Currently we make sequence == 0 be the same as sequence == 1, but that's
not super useful if the intent is really to have a timeout that's just
a pure timeout.
If the user passes in sqe->off == 0, then don't apply any sequence logic
to the request, let it purely be driven by the timeout specified.
Reported-by: 李通洲 <carter.li@eoitek.com>
Reviewed-by: 李通洲 <carter.li@eoitek.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We use io_kiocb->result == -EAGAIN as a way to know if we need to
re-submit a polled request, as -EAGAIN reporting happens out-of-line
for IO submission failures. This field is cleared when we originally
allocate the request, but it isn't reset when we retry the submission
from async context. This can cause issues where we think something
needs a re-issue, but we're really just reading stale data.
Reset ->result whenever we re-prep a request for polled submission.
Cc: stable@vger.kernel.org
Fixes: 9e645e1105 ("io_uring: add support for sqe links")
Reported-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
syzkaller reported an issue where it looks like a malicious app can
trigger a use-after-free of reading the ctx ->sq_array and ->rings
value right after having installed the ring fd in the process file
table.
Defer ring fd installation until after we're done reading those
values.
Fixes: 75b28affdd ("io_uring: allocate the two rings together")
Reported-by: syzbot+6f03d895a6cd0d06187f@syzkaller.appspotmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
io_queue_link_head() owns shadow_req after taking it as an argument.
By not freeing it in case of an error, it can leak the request along
with taken ctx->refs.
Reviewed-by: Jackie Liu <liuyun01@kylinos.cn>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>