Changes in 6.1.78
ext4: regenerate buddy after block freeing failed if under fc replay
dmaengine: fsl-dpaa2-qdma: Fix the size of dma pools
dmaengine: ti: k3-udma: Report short packet errors
dmaengine: fsl-qdma: Fix a memory leak related to the status queue DMA
dmaengine: fsl-qdma: Fix a memory leak related to the queue command DMA
phy: renesas: rcar-gen3-usb2: Fix returning wrong error code
dmaengine: fix is_slave_direction() return false when DMA_DEV_TO_DEV
phy: ti: phy-omap-usb2: Fix NULL pointer dereference for SRP
cifs: failure to add channel on iface should bump up weight
drm/msms/dp: fixed link clock divider bits be over written in BPC unknown case
drm/msm/dp: return correct Colorimetry for DP_TEST_DYNAMIC_RANGE_CEA case
drm/msm/dpu: check for valid hw_pp in dpu_encoder_helper_phys_cleanup
net: stmmac: xgmac: fix handling of DPP safety error for DMA channels
wifi: mac80211: fix waiting for beacons logic
netdevsim: avoid potential loop in nsim_dev_trap_report_work()
net: atlantic: Fix DMA mapping for PTP hwts ring
selftests: net: cut more slack for gro fwd tests.
selftests: net: avoid just another constant wait
tunnels: fix out of bounds access when building IPv6 PMTU error
atm: idt77252: fix a memleak in open_card_ubr0
octeontx2-pf: Fix a memleak otx2_sq_init
hwmon: (aspeed-pwm-tacho) mutex for tach reading
hwmon: (coretemp) Fix out-of-bounds memory access
hwmon: (coretemp) Fix bogus core_id to attr name mapping
inet: read sk->sk_family once in inet_recv_error()
drm/i915/gvt: Fix uninitialized variable in handle_mmio()
rxrpc: Fix response to PING RESPONSE ACKs to a dead call
tipc: Check the bearer type before calling tipc_udp_nl_bearer_add()
af_unix: Call kfree_skb() for dead unix_(sk)->oob_skb in GC.
ppp_async: limit MRU to 64K
selftests: cmsg_ipv6: repeat the exact packet
netfilter: nft_compat: narrow down revision to unsigned 8-bits
netfilter: nft_compat: reject unused compat flag
netfilter: nft_compat: restrict match/target protocol to u16
drm/amd/display: Implement bounds check for stream encoder creation in DCN301
netfilter: nft_ct: reject direction for ct id
netfilter: nft_set_pipapo: store index in scratch maps
netfilter: nft_set_pipapo: add helper to release pcpu scratch area
netfilter: nft_set_pipapo: remove scratch_aligned pointer
fs/ntfs3: Fix an NULL dereference bug
scsi: core: Move scsi_host_busy() out of host lock if it is for per-command
blk-iocost: Fix an UBSAN shift-out-of-bounds warning
fs: dlm: don't put dlm_local_addrs on heap
mtd: parsers: ofpart: add workaround for #size-cells 0
ALSA: usb-audio: Add delay quirk for MOTU M Series 2nd revision
ALSA: usb-audio: Add a quirk for Yamaha YIT-W12TX transmitter
ALSA: usb-audio: add quirk for RODE NT-USB+
USB: serial: qcserial: add new usb-id for Dell Wireless DW5826e
USB: serial: option: add Fibocom FM101-GL variant
USB: serial: cp210x: add ID for IMST iM871A-USB
usb: dwc3: host: Set XHCI_SG_TRB_CACHE_SIZE_QUIRK
usb: host: xhci-plat: Add support for XHCI_SG_TRB_CACHE_SIZE_QUIRK
hrtimer: Report offline hrtimer enqueue
Input: i8042 - fix strange behavior of touchpad on Clevo NS70PU
Input: atkbd - skip ATKBD_CMD_SETLEDS when skipping ATKBD_CMD_GETID
io_uring/net: fix sr->len for IORING_OP_RECV with MSG_WAITALL and buffers
Revert "ASoC: amd: Add new dmi entries for acp5x platform"
vhost: use kzalloc() instead of kmalloc() followed by memset()
RDMA/irdma: Fix support for 64k pages
f2fs: add helper to check compression level
block: treat poll queue enter similarly to timeouts
clocksource: Skip watchdog check for large watchdog intervals
net: stmmac: xgmac: use #define for string constants
ALSA: usb-audio: Sort quirk table entries
net: stmmac: xgmac: fix a typo of register name in DPP safety handling
netfilter: nft_set_rbtree: skip end interval element from gc
Linux 6.1.78
Change-Id: Iba16875d4cb88deffea077cf69495f9fe447ea23
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmWy7o0ACgkQONu9yGCS
aT76JA/9Gh3VNSLG35LaLyq3xGd827N6DPsMzeFHi+MGSyPVg0auE77QkHD/gZl9
KynmBmz2+9DSoFxymWAS9oEPM8d/vw87AMuSTTct3GKkjEeUcj9lbeOEzgZydXX8
cJSXvcCeKE3FESU/YbQKxo0N+r7tUDmnCR0edss5/FpYni3jPdg7jdESzGhiCHXj
r5rjrTE6h7Z/d+2kaKqlheL4o4OkV0YwnFnU2gC3MOOvLmgvXdOVQQsyaZ+WgSAN
0JS0Q6Xk1xyYWx8iFaLGWIs1pUsQPKxIiRG3N/1KmXITopf2Pu68Yy7ST+YryDkO
nLcNrr3gsQxrM6MYnEhLzlxs3H1KuAVxJ4Y/dNqJnDxn0OJjcY3repwempz5Sxtk
0OLDOsCICAiMHeF8rYIGhm09WdowLz0EH+sqadIGqWKzW/BcXqD+r9mpF1lwk1ZL
FJLgLmtOaG4amI46lEUHQ6ujN7Oad3gLYzudq2zKLeqonSIjm1TuDoMRvHWFsspO
5i9I0x7Vlo3PqCl7kkKVL9PvVHx6BXJGFShABJqa9ao/oHxkOWuIt26pxUoLUN3P
7Wa5WnfdlDd9nR3VGHcVe2ncuRmEfuriYpXvItJ7/KJKyIPkGoPehAh+vbZMoEy0
DwhtD9PPsTlnUufbcZdHavYA1E4y/uXDMOIGB+ERpsTdXh9DwEo=
=2XHn
-----END PGP SIGNATURE-----
Merge 6.1.75 into android14-6.1-lts
Changes in 6.1.75
x86/lib: Fix overflow when counting digits
x86/mce/inject: Clear test status value
EDAC/thunderx: Fix possible out-of-bounds string access
powerpc: remove checks for binutils older than 2.25
powerpc: add crtsavres.o to always-y instead of extra-y
powerpc/44x: select I2C for CURRITUCK
powerpc/pseries/memhp: Fix access beyond end of drmem array
selftests/powerpc: Fix error handling in FPU/VMX preemption tests
powerpc/powernv: Add a null pointer check to scom_debug_init_one()
powerpc/powernv: Add a null pointer check in opal_event_init()
powerpc/powernv: Add a null pointer check in opal_powercap_init()
powerpc/imc-pmu: Add a null pointer check in update_events_in_group()
spi: spi-zynqmp-gqspi: fix driver kconfig dependencies
mtd: rawnand: Increment IFC_TIMEOUT_MSECS for nand controller response
ACPI: video: check for error while searching for backlight device parent
ACPI: LPIT: Avoid u32 multiplication overflow
KEYS: encrypted: Add check for strsep
platform/x86/intel/vsec: Enhance and Export intel_vsec_add_aux()
platform/x86/intel/vsec: Support private data
platform/x86/intel/vsec: Use mutex for ida_alloc() and ida_free()
platform/x86/intel/vsec: Fix xa_alloc memory leak
of: Add of_property_present() helper
cpufreq: Use of_property_present() for testing DT property presence
cpufreq: scmi: process the result of devm_of_clk_add_hw_provider()
calipso: fix memory leak in netlbl_calipso_add_pass()
efivarfs: force RO when remounting if SetVariable is not supported
efivarfs: Free s_fs_info on unmount
spi: sh-msiof: Enforce fixed DTDL for R-Car H3
ACPI: LPSS: Fix the fractional clock divider flags
ACPI: extlog: Clear Extended Error Log status when RAS_CEC handled the error
kunit: debugfs: Fix unchecked dereference in debugfs_print_results()
mtd: Fix gluebi NULL pointer dereference caused by ftl notifier
selinux: Fix error priority for bind with AF_UNSPEC on PF_INET6 socket
crypto: virtio - Handle dataq logic with tasklet
crypto: sa2ul - Return crypto_aead_setkey to transfer the error
crypto: ccp - fix memleak in ccp_init_dm_workarea
crypto: af_alg - Disallow multiple in-flight AIO requests
crypto: safexcel - Add error handling for dma_map_sg() calls
crypto: sahara - remove FLAGS_NEW_KEY logic
crypto: sahara - fix cbc selftest failure
crypto: sahara - fix ahash selftest failure
crypto: sahara - fix processing requests with cryptlen < sg->length
crypto: sahara - fix error handling in sahara_hw_descriptor_create()
crypto: hisilicon/qm - save capability registers in qm init process
crypto: hisilicon/zip - add zip comp high perf mode configuration
crypto: hisilicon/qm - add a function to set qm algs
crypto: hisilicon/hpre - save capability registers in probe process
crypto: hisilicon/sec2 - save capability registers in probe process
crypto: hisilicon/zip - save capability registers in probe process
pstore: ram_core: fix possible overflow in persistent_ram_init_ecc()
erofs: fix memory leak on short-lived bounced pages
fs: indicate request originates from old mount API
gfs2: Fix kernel NULL pointer dereference in gfs2_rgrp_dump
crypto: virtio - Wait for tasklet to complete on device remove
crypto: sahara - avoid skcipher fallback code duplication
crypto: sahara - handle zero-length aes requests
crypto: sahara - fix ahash reqsize
crypto: sahara - fix wait_for_completion_timeout() error handling
crypto: sahara - improve error handling in sahara_sha_process()
crypto: sahara - fix processing hash requests with req->nbytes < sg->length
crypto: sahara - do not resize req->src when doing hash operations
crypto: scomp - fix req->dst buffer overflow
csky: fix arch_jump_label_transform_static override
blocklayoutdriver: Fix reference leak of pnfs_device_node
NFSv4.1/pnfs: Ensure we handle the error NFS4ERR_RETURNCONFLICT
SUNRPC: fix _xprt_switch_find_current_entry logic
pNFS: Fix the pnfs block driver's calculation of layoutget size
wifi: plfxlc: check for allocation failure in plfxlc_usb_wreq_async()
wifi: rtw88: fix RX filter in FIF_ALLMULTI flag
bpf, lpm: Fix check prefixlen before walking trie
bpf: Add crosstask check to __bpf_get_stack
wifi: ath11k: Defer on rproc_get failure
wifi: libertas: stop selecting wext
ARM: dts: qcom: apq8064: correct XOADC register address
net/ncsi: Fix netlink major/minor version numbers
firmware: ti_sci: Fix an off-by-one in ti_sci_debugfs_create()
firmware: meson_sm: populate platform devices from sm device tree data
wifi: rtlwifi: rtl8821ae: phy: fix an undefined bitwise shift behavior
arm64: dts: ti: k3-am62a-main: Fix GPIO pin count in DT nodes
arm64: dts: ti: k3-am65-main: Fix DSS irq trigger type
selftests/bpf: Fix erroneous bitmask operation
md: synchronize flush io with array reconfiguration
bpf: enforce precision of R0 on callback return
ARM: dts: qcom: sdx65: correct SPMI node name
arm64: dts: qcom: sc7180: Make watchdog bark interrupt edge triggered
arm64: dts: qcom: sc7280: Mark some nodes as 'reserved'
arm64: dts: qcom: sc7280: Make watchdog bark interrupt edge triggered
arm64: dts: qcom: sdm845: Make watchdog bark interrupt edge triggered
arm64: dts: qcom: sm8150: Make watchdog bark interrupt edge triggered
arm64: dts: qcom: sm8250: Make watchdog bark interrupt edge triggered
arm64: dts: qcom: sc8280xp: Make watchdog bark interrupt edge triggered
arm64: dts: qcom: sm6350: Make watchdog bark interrupt edge triggered
rcu-tasks: Provide rcu_trace_implies_rcu_gp()
bpf: add percpu stats for bpf_map elements insertions/deletions
bpf: Add map and need_defer parameters to .map_fd_put_ptr()
bpf: Defer the free of inner map when necessary
selftests/net: specify the interface when do arping
bpf: fix check for attempt to corrupt spilled pointer
scsi: fnic: Return error if vmalloc() failed
arm64: dts: qcom: qrb5165-rb5: correct LED panic indicator
arm64: dts: qcom: sdm845-db845c: correct LED panic indicator
arm64: dts: qcom: sm8350: Fix DMA0 address
arm64: dts: qcom: sc7280: Fix up GPU SIDs
arm64: dts: qcom: sc7280: Mark Adreno SMMU as DMA coherent
arm64: dts: qcom: sc7280: fix usb_2 wakeup interrupt types
wifi: mt76: mt7921s: fix workqueue problem causes STA association fail
bpf: Fix verification of indirect var-off stack access
arm64: dts: hisilicon: hikey970-pmic: fix regulator cells properties
dt-bindings: media: mediatek: mdp3: correct RDMA and WROT node with generic names
arm64: dts: mediatek: mt8183: correct MDP3 DMA-related nodes
wifi: mt76: mt7921: fix country count limitation for CLC
selftests/bpf: Relax time_tai test for equal timestamps in tai_forward
block: Set memalloc_noio to false on device_add_disk() error path
arm64: dts: renesas: white-hawk-cpu: Fix missing serial console pin control
arm64: dts: imx8mm: Reduce GPU to nominal speed
scsi: hisi_sas: Replace with standard error code return value
scsi: hisi_sas: Rollback some operations if FLR failed
scsi: hisi_sas: Correct the number of global debugfs registers
ARM: dts: stm32: don't mix SCMI and non-SCMI board compatibles
selftests/net: fix grep checking for fib_nexthop_multiprefix
ipmr: support IP_PKTINFO on cache report IGMP msg
virtio/vsock: fix logic which reduces credit update messages
dma-mapping: clear dev->dma_mem to NULL after freeing it
soc: qcom: llcc: Fix dis_cap_alloc and retain_on_pc configuration
arm64: dts: qcom: sm8150-hdk: fix SS USB regulators
block: add check of 'minors' and 'first_minor' in device_add_disk()
arm64: dts: qcom: sc7280: Mark SDHCI hosts as cache-coherent
arm64: dts: qcom: ipq6018: fix clock rates for GCC_USB0_MOCK_UTMI_CLK
arm64: dts: qcom: ipq6018: improve pcie phy pcs reg table
arm64: dts: qcom: ipq6018: Use lowercase hex
arm64: dts: qcom: ipq6018: Pad addresses to 8 hex digits
arm64: dts: qcom: ipq6018: Fix up indentation
wifi: rtlwifi: add calculate_bit_shift()
wifi: rtlwifi: rtl8188ee: phy: using calculate_bit_shift()
wifi: rtlwifi: rtl8192c: using calculate_bit_shift()
wifi: rtlwifi: rtl8192cu: using calculate_bit_shift()
wifi: rtlwifi: rtl8192ce: using calculate_bit_shift()
wifi: rtlwifi: rtl8192de: using calculate_bit_shift()
wifi: rtlwifi: rtl8192ee: using calculate_bit_shift()
wifi: rtlwifi: rtl8192se: using calculate_bit_shift()
wifi: iwlwifi: mvm: set siso/mimo chains to 1 in FW SMPS request
wifi: iwlwifi: mvm: send TX path flush in rfkill
netfilter: nf_tables: mark newset as dead on transaction abort
Bluetooth: Fix bogus check for re-auth no supported with non-ssp
Bluetooth: btmtkuart: fix recv_buf() return value
block: make BLK_DEF_MAX_SECTORS unsigned
null_blk: don't cap max_hw_sectors to BLK_DEF_MAX_SECTORS
bpf: sockmap, fix proto update hook to avoid dup calls
sctp: support MSG_ERRQUEUE flag in recvmsg()
sctp: fix busy polling
net/sched: act_ct: fix skb leak and crash on ooo frags
mlxbf_gige: Fix intermittent no ip issue
mlxbf_gige: Enable the GigE port in mlxbf_gige_open
ip6_tunnel: fix NEXTHDR_FRAGMENT handling in ip6_tnl_parse_tlv_enc_lim()
ARM: davinci: always select CONFIG_CPU_ARM926T
Revert "drm/tidss: Annotate dma-fence critical section in commit path"
Revert "drm/omapdrm: Annotate dma-fence critical section in commit path"
drm/panfrost: Really power off GPU cores in panfrost_gpu_power_off()
RDMA/usnic: Silence uninitialized symbol smatch warnings
RDMA/hns: Fix inappropriate err code for unsupported operations
drm/panel-elida-kd35t133: hold panel in reset for unprepare
drm/nouveau/fence:: fix warning directly dereferencing a rcu pointer
drm/bridge: tpd12s015: Drop buggy __exit annotation for remove function
drm/tilcdc: Fix irq free on unload
media: pvrusb2: fix use after free on context disconnection
media: mtk-jpegdec: export jpeg decoder functions
media: mtk-jpeg: Remove cancel worker in mtk_jpeg_remove to avoid the crash of multi-core JPEG devices
media: verisilicon: Hook the (TRY_)DECODER_CMD stateless ioctls
media: rkvdec: Hook the (TRY_)DECODER_CMD stateless ioctls
drm/bridge: Fix typo in post_disable() description
f2fs: fix to avoid dirent corruption
drm/radeon/r600_cs: Fix possible int overflows in r600_cs_check_reg()
drm/radeon/r100: Fix integer overflow issues in r100_cs_track_check()
drm/radeon: check return value of radeon_ring_lock()
drm/tidss: Move reset to the end of dispc_init()
drm/tidss: Return error value from from softreset
drm/tidss: Check for K2G in in dispc_softreset()
drm/tidss: Fix dss reset
ASoC: cs35l33: Fix GPIO name and drop legacy include
ASoC: cs35l34: Fix GPIO name and drop legacy include
drm/msm/mdp4: flush vblank event on disable
drm/msm/dsi: Use pm_runtime_resume_and_get to prevent refcnt leaks
drm/drv: propagate errors from drm_modeset_register_all()
ASoC: Intel: glk_rt5682_max98357a: fix board id mismatch
drm/panfrost: Ignore core_mask for poweroff and disable PWRTRANS irq
drm/radeon: check the alloc_workqueue return value in radeon_crtc_init()
drm/radeon/dpm: fix a memleak in sumo_parse_power_table
drm/radeon/trinity_dpm: fix a memleak in trinity_parse_power_table
drm/bridge: cdns-mhdp8546: Fix use of uninitialized variable
drm/bridge: tc358767: Fix return value on error case
media: cx231xx: fix a memleak in cx231xx_init_isoc
RDMA/hns: Fix memory leak in free_mr_init()
clk: qcom: gpucc-sm8150: Update the gpu_cc_pll1 config
media: imx-mipi-csis: Fix clock handling in remove()
media: dt-bindings: media: rkisp1: Fix the port description for the parallel interface
media: rkisp1: Fix media device memory leak
drm/panel: st7701: Fix AVCL calculation
f2fs: fix to wait on block writeback for post_read case
f2fs: fix to check compress file in f2fs_move_file_range()
f2fs: fix to update iostat correctly in f2fs_filemap_fault()
media: dvbdev: drop refcount on error path in dvb_device_open()
media: dvb-frontends: m88ds3103: Fix a memory leak in an error handling path of m88ds3103_probe()
clk: renesas: rzg2l-cpg: Reuse code in rzg2l_cpg_reset()
clk: renesas: rzg2l: Check reset monitor registers
drm/msm/dpu: Set input_sel bit for INTF
drm/msm/dpu: Drop enable and frame_count parameters from dpu_hw_setup_misr()
drm/mediatek: Return error if MDP RDMA failed to enable the clock
drm/mediatek: Fix underrun in VDO1 when switches off the layer
drm/amdgpu/debugfs: fix error code when smc register accessors are NULL
drm/amd/pm: fix a double-free in si_dpm_init
drivers/amd/pm: fix a use-after-free in kv_parse_power_table
gpu/drm/radeon: fix two memleaks in radeon_vm_init
drm/amd/pm: fix a double-free in amdgpu_parse_extended_power_table
f2fs: fix to check return value of f2fs_recover_xattr_data
dt-bindings: clock: Update the videocc resets for sm8150
clk: qcom: videocc-sm8150: Update the videocc resets
clk: qcom: videocc-sm8150: Add missing PLL config property
drivers: clk: zynqmp: calculate closest mux rate
drivers: clk: zynqmp: update divider round rate logic
watchdog: set cdev owner before adding
watchdog/hpwdt: Only claim UNKNOWN NMI if from iLO
watchdog: bcm2835_wdt: Fix WDIOC_SETTIMEOUT handling
watchdog: rti_wdt: Drop runtime pm reference count when watchdog is unused
clk: si5341: fix an error code problem in si5341_output_clk_set_rate
drm/mediatek: dp: Add phy_mtk_dp module as pre-dependency
accel/habanalabs: fix information leak in sec_attest_info()
clk: fixed-rate: fix clk_hw_register_fixed_rate_with_accuracy_parent_hw
pwm: stm32: Use regmap_clear_bits and regmap_set_bits where applicable
pwm: stm32: Use hweight32 in stm32_pwm_detect_channels
pwm: stm32: Fix enable count for clk in .probe()
ASoC: rt5645: Drop double EF20 entry from dmi_platform_data[]
ALSA: scarlett2: Add missing error check to scarlett2_config_save()
ALSA: scarlett2: Add missing error check to scarlett2_usb_set_config()
ALSA: scarlett2: Allow passing any output to line_out_remap()
ALSA: scarlett2: Add missing error checks to *_ctl_get()
ALSA: scarlett2: Add clamp() in scarlett2_mixer_ctl_put()
mmc: sdhci_am654: Fix TI SoC dependencies
mmc: sdhci_omap: Fix TI SoC dependencies
IB/iser: Prevent invalidating wrong MR
drm/amdkfd: Confirm list is non-empty before utilizing list_first_entry in kfd_topology.c
drm/amd/pm/smu7: fix a memleak in smu7_hwmgr_backend_init
kselftest/alsa - mixer-test: fix the number of parameters to ksft_exit_fail_msg()
kselftest/alsa - mixer-test: Fix the print format specifier warning
ksmbd: validate the zero field of packet header
of: Fix double free in of_parse_phandle_with_args_map
fbdev: imxfb: fix left margin setting
of: unittest: Fix of_count_phandle_with_args() expected value message
selftests/bpf: Add assert for user stacks in test_task_stack
keys, dns: Fix size check of V1 server-list header
binder: fix async space check for 0-sized buffers
binder: fix unused alloc->free_async_space
mips/smp: Call rcutree_report_cpu_starting() earlier
Input: atkbd - use ab83 as id when skipping the getid command
xen-netback: don't produce zero-size SKB frags
binder: fix race between mmput() and do_exit()
clocksource/drivers/timer-ti-dm: Fix make W=n kerneldoc warnings
powerpc/64s: Increase default stack size to 32KB
tick-sched: Fix idle and iowait sleeptime accounting vs CPU hotplug
usb: phy: mxs: remove CONFIG_USB_OTG condition for mxs_phy_is_otg_host()
usb: dwc: ep0: Update request status in dwc3_ep0_stall_restart
Revert "usb: dwc3: Soft reset phy on probe for host"
Revert "usb: dwc3: don't reset device side if dwc3 was configured as host-only"
usb: chipidea: wait controller resume finished for wakeup irq
usb: cdns3: fix uvc failure work since sg support enabled
usb: cdns3: fix iso transfer error when mult is not zero
usb: cdns3: Fix uvc fail when DMA cross 4k boundery since sg enabled
Revert "usb: typec: class: fix typec_altmode_put_partner to put plugs"
usb: typec: class: fix typec_altmode_put_partner to put plugs
usb: mon: Fix atomicity violation in mon_bin_vma_fault
serial: core: fix sanitizing check for RTS settings
serial: core: make sure RS485 cannot be enabled when it is not supported
serial: 8250_bcm2835aux: Restore clock error handling
serial: core, imx: do not set RS485 enabled if it is not supported
serial: imx: Ensure that imx_uart_rs485_config() is called with enabled clock
serial: 8250_exar: Set missing rs485_supported flag
serial: omap: do not override settings for RS485 support
drm/vmwgfx: Fix possible invalid drm gem put calls
drm/vmwgfx: Keep a gem reference to user bos in surfaces
ALSA: oxygen: Fix right channel of capture volume mixer
ALSA: hda/relatek: Enable Mute LED on HP Laptop 15s-fq2xxx
ALSA: hda/realtek: Enable mute/micmute LEDs and limit mic boost on HP ZBook
ALSA: hda/realtek: Enable headset mic on Lenovo M70 Gen5
ksmbd: validate mech token in session setup
ksmbd: fix UAF issue in ksmbd_tcp_new_connection()
ksmbd: only v2 leases handle the directory
io_uring/rw: ensure io->bytes_done is always initialized
fbdev: flush deferred work in fb_deferred_io_fsync()
fbdev: flush deferred IO before closing
scsi: ufs: core: Simplify power management during async scan
scsi: target: core: add missing file_{start,end}_write()
scsi: mpi3mr: Refresh sdev queue depth after controller reset
scsi: mpi3mr: Block PEL Enable Command on Controller Reset and Unrecoverable State
drm/amd: Enable PCIe PME from D3
block: add check that partition length needs to be aligned with block size
block: Fix iterating over an empty bio with bio_for_each_folio_all
netfilter: nf_tables: check if catch-all set element is active in next generation
pwm: jz4740: Don't use dev_err_probe() in .request()
pwm: Fix out-of-bounds access in of_pwm_single_xlate()
md/raid1: Use blk_opf_t for read and write operations
rootfs: Fix support for rootfstype= when root= is given
Bluetooth: Fix atomicity violation in {min,max}_key_size_set
bpf: Fix re-attachment branch in bpf_tracing_prog_attach
LoongArch: Fix and simplify fcsr initialization on execve()
iommu/arm-smmu-qcom: Add missing GMU entry to match table
iommu/dma: Trace bounce buffer usage when mapping buffers
wifi: mt76: fix broken precal loading from MTD for mt7915
wifi: rtlwifi: Remove bogus and dangerous ASPM disable/enable code
wifi: rtlwifi: Convert LNKCTL change to PCIe cap RMW accessors
wifi: mwifiex: configure BSSID consistently when starting AP
Revert "net: rtnetlink: Enslave device before bringing it up"
cxl/port: Fix decoder initialization when nr_targets > interleave_ways
PCI/P2PDMA: Remove reference to pci_p2pdma_map_sg()
PCI: dwc: endpoint: Fix dw_pcie_ep_raise_msix_irq() alignment support
PCI: mediatek: Clear interrupt status before dispatching handler
x86/kvm: Do not try to disable kvmclock if it was not enabled
KVM: arm64: vgic-v4: Restore pending state on host userspace write
KVM: arm64: vgic-its: Avoid potential UAF in LPI translation cache
iio: adc: ad7091r: Pass iio_dev to event handler
HID: wacom: Correct behavior when processing some confidence == false touches
serial: sc16is7xx: add check for unsupported SPI modes during probe
serial: sc16is7xx: set safe default SPI clock frequency
ARM: 9330/1: davinci: also select PINCTRL
mfd: syscon: Fix null pointer dereference in of_syscon_register()
leds: aw2013: Select missing dependency REGMAP_I2C
mfd: intel-lpss: Fix the fractional clock divider flags
mips: dmi: Fix early remap on MIPS32
mips: Fix incorrect max_low_pfn adjustment
riscv: Check if the code to patch lies in the exit section
riscv: Fix module_alloc() that did not reset the linear mapping permissions
riscv: Fix set_memory_XX() and set_direct_map_XX() by splitting huge linear mappings
riscv: Fix set_direct_map_default_noflush() to reset _PAGE_EXEC
riscv: Fixed wrong register in XIP_FIXUP_FLASH_OFFSET macro
MIPS: Alchemy: Fix an out-of-bound access in db1200_dev_setup()
MIPS: Alchemy: Fix an out-of-bound access in db1550_dev_setup()
power: supply: cw2015: correct time_to_empty units in sysfs
power: supply: bq256xx: fix some problem in bq256xx_hw_init
serial: 8250: omap: Don't skip resource freeing if pm_runtime_resume_and_get() failed
libapi: Add missing linux/types.h header to get the __u64 type on io.h
base/node.c: initialize the accessor list before registering
acpi: property: Let args be NULL in __acpi_node_get_property_reference
software node: Let args be NULL in software_node_get_reference_args
serial: imx: fix tx statemachine deadlock
selftests/sgx: Fix uninitialized pointer dereference in error path
selftests/sgx: Fix uninitialized pointer dereferences in encl_get_entry
selftests/sgx: Include memory clobber for inline asm in test enclave
selftests/sgx: Skip non X86_64 platform
iio: adc: ad9467: fix reset gpio handling
iio: adc: ad9467: don't ignore error codes
iio: adc: ad9467: fix scale setting
perf header: Fix one memory leakage in perf_event__fprintf_event_update()
perf hisi-ptt: Fix one memory leakage in hisi_ptt_process_auxtrace_event()
perf genelf: Set ELF program header addresses properly
tty: change tty_write_lock()'s ndelay parameter to bool
tty: early return from send_break() on TTY_DRIVER_HARDWARE_BREAK
tty: don't check for signal_pending() in send_break()
tty: use 'if' in send_break() instead of 'goto'
usb: cdc-acm: return correct error code on unsupported break
spmi: mtk-pmif: Serialize PMIF status check and command submission
vdpa: Fix an error handling path in eni_vdpa_probe()
nvmet-tcp: Fix a kernel panic when host sends an invalid H2C PDU length
nvmet-tcp: fix a crash in nvmet_req_complete()
perf env: Avoid recursively taking env->bpf_progs.lock
cxl/region: fix x9 interleave typo
apparmor: avoid crash when parsed profile name is empty
usb: xhci-mtk: fix a short packet issue of gen1 isoc-in transfer
serial: imx: Correct clock error message in function probe()
nvmet: re-fix tracing strncpy() warning
nvme: trace: avoid memcpy overflow warning
nvmet-tcp: Fix the H2C expected PDU len calculation
PCI: keystone: Fix race condition when initializing PHYs
PCI: mediatek-gen3: Fix translation window size calculation
ASoC: mediatek: sof-common: Add NULL check for normal_link string
s390/pci: fix max size calculation in zpci_memcpy_toio()
net: qualcomm: rmnet: fix global oob in rmnet_policy
net: ethernet: ti: am65-cpsw: Fix max mtu to fit ethernet frames
amt: do not use overwrapped cb area
net: phy: micrel: populate .soft_reset for KSZ9131
mptcp: mptcp_parse_option() fix for MPTCPOPT_MP_JOIN
mptcp: strict validation before using mp_opt->hmac
mptcp: use OPTION_MPTCP_MPJ_SYNACK in subflow_finish_connect()
mptcp: use OPTION_MPTCP_MPJ_SYN in subflow_check_req()
mptcp: refine opt_mp_capable determination
block: ensure we hold a queue reference when using queue limits
udp: annotate data-races around up->pending
net: ravb: Fix dma_addr_t truncation in error case
dt-bindings: gpio: xilinx: Fix node address in gpio
drm/amdkfd: Use resource_size() helper function
drm/amdkfd: fixes for HMM mem allocation
net: stmmac: ethtool: Fixed calltrace caused by unbalanced disable_irq_wake calls
bpf: Reject variable offset alu on PTR_TO_FLOW_KEYS
net: dsa: vsc73xx: Add null pointer check to vsc73xx_gpio_probe
LoongArch: BPF: Prevent out-of-bounds memory access
mptcp: relax check on MPC passive fallback
netfilter: nf_tables: reject invalid set policy
netfilter: nft_limit: do not ignore unsupported flags
netfilter: nfnetlink_log: use proper helper for fetching physinif
netfilter: nf_queue: remove excess nf_bridge variable
netfilter: propagate net to nf_bridge_get_physindev
netfilter: bridge: replace physindev with physinif in nf_bridge_info
netfilter: nf_tables: do not allow mismatch field size and set key length
netfilter: nf_tables: skip dead set elements in netlink dump
netfilter: nf_tables: reject NFT_SET_CONCAT with not field length description
ipvs: avoid stat macros calls from preemptible context
kdb: Fix a potential buffer overflow in kdb_local()
ethtool: netlink: Add missing ethnl_ops_begin/complete
loop: fix the the direct I/O support check when used on top of block devices
mlxsw: spectrum_acl_erp: Fix error flow of pool allocation failure
selftests: mlxsw: qos_pfc: Adjust the test to support 8 lanes
ipv6: mcast: fix data-race in ipv6_mc_down / mld_ifc_work
i2c: s3c24xx: fix read transfers in polling mode
i2c: s3c24xx: fix transferring more than one message in polling mode
block: Remove special-casing of compound pages
riscv: Fix wrong usage of lm_alias() when splitting a huge linear mapping
Revert "KEYS: encrypted: Add check for strsep"
arm64: dts: armada-3720-turris-mox: set irq type for RTC
Revert "Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d""
Linux 6.1.75
Change-Id: I60398ecc9a2e50206fd9d25c0d6c9ad6e1ca71a0
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
commit 72bd80252feeb3bef8724230ee15d9f7ab541c6e upstream.
If we use IORING_OP_RECV with provided buffers and pass in '0' as the
length of the request, the length is retrieved from the selected buffer.
If MSG_WAITALL is also set and we get a short receive, then we may hit
the retry path which decrements sr->len and increments the buffer for
a retry. However, the length is still zero at this point, which means
that sr->len now becomes huge and import_ubuf() will cap it to
MAX_RW_COUNT and subsequently return -EFAULT for the range as a whole.
Fix this by always assigning sr->len once the buffer has been selected.
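
The wrap-around described above can be sketched in user-space C. This is only
an illustrative model, not the kernel code: the helper names
(retry_len_unfixed, retry_len_fixed, cap_len), the SKETCH_MAX_RW_COUNT macro,
and the choice of unsigned int for the length are all assumptions made for the
example.

```c
#include <limits.h>

/* Hypothetical stand-in for the kernel's MAX_RW_COUNT
 * (INT_MAX rounded down to a 4 KiB page boundary). */
#define SKETCH_MAX_RW_COUNT ((unsigned int)INT_MAX & ~4095u)

/* Pre-fix model: the request passed 0 as its length and the selected
 * buffer's length was never copied in, so the retry path subtracts the
 * bytes already received from 0 and the unsigned value wraps. */
static unsigned int retry_len_unfixed(unsigned int buf_len, unsigned int received)
{
    unsigned int sr_len = 0;   /* still the caller-supplied 0 */

    (void)buf_len;             /* buffer length is ignored in this model */
    sr_len -= received;        /* wraps to a huge value */
    return sr_len;
}

/* Post-fix model: the length is assigned once the buffer has been
 * selected, so the retry path sees the real remaining byte count. */
static unsigned int retry_len_fixed(unsigned int buf_len, unsigned int received)
{
    unsigned int sr_len = buf_len;

    sr_len -= received;
    return sr_len;
}

/* Model of the cap that the text attributes to import_ubuf(). */
static unsigned int cap_len(unsigned int len)
{
    return len > SKETCH_MAX_RW_COUNT ? SKETCH_MAX_RW_COUNT : len;
}
```

With a 4096-byte buffer and a 100-byte short receive, retry_len_unfixed()
yields UINT_MAX - 99, which cap_len() clamps to SKETCH_MAX_RW_COUNT, while
retry_len_fixed() yields the expected 3996 remaining bytes.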
Cc: stable@vger.kernel.org
Fixes: 7ba89d2af1 ("io_uring: ensure recv and recvmsg handle MSG_WAITALL correctly")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Backmerge the latest android14-6.1 changes into the lts branch to keep
up to date. Contains the following commits:
* 3578913b2e UPSTREAM: net/rose: Fix Use-After-Free in rose_ioctl
* 8fbed1ea00 UPSTREAM: ida: Fix crash in ida_free when the bitmap is empty
* 6ce5bb744e ANDROID: GKI: Update symbol list for mtk
* 7cbad58851 Reapply "perf: Disallow mis-matched inherited group reads"
* 067a03c44e ANDROID: GKI: Add Pasa symbol list
* b6be1a36f7 FROMGIT: mm: memcg: don't periodically flush stats when memcg is disabled
* d0e2d333f9 ANDROID: Update the ABI symbol list
* 10558542a1 ANDROID: sched: export update_misfit_status symbol
* a0b3b39898 ANDROID: GKI: Add ASR KMI symbol list
* 599710db0f FROMGIT: usb: dwc3: gadget: Fix NULL pointer dereference in dwc3_gadget_suspend
* 9265fa90c1 FROMLIST: usb: core: Prevent null pointer dereference in update_port_device_state
* 2730733d54 ANDROID: gki_defconfig: Enable CONFIG_NVME_MULTIPATH
* 4f668f5682 BACKPORT: irqchip/gic-v3: Work around affinity issues on ASR8601
* 473a871315 BACKPORT: irqchip/gic-v3: Improve affinity helper
* 6c32acf537 UPSTREAM: sched/fair: Limit sched slice duration
* 7088d250bf ANDROID: Update the ABI symbol list
* c249740414 ANDROID: idle_inject: Export function symbols
* 990d341477 ANDROID: Update the ABI symbol list
* be92a6a1b4 ANDROID: GKI: Remove CONFIG_MEDIA_CEC_RC
* fa9ac43f16 BACKPORT: usb: host: xhci: Avoid XHCI resume delay if SSUSB device is not present
* f27fc6ba23 Merge "Merge tag 'android14-6.1.68_r00' into branch 'android14-6.1'" into android14-6.1
|\
| * 0177cfb2a2 Merge tag 'android14-6.1.68_r00' into branch 'android14-6.1'
* c96cea1a3c ANDROID: Update the ABI symbol list
* c2fbc12180 ANDROID: uid_sys_stats: Drop CONFIG_UID_SYS_STATS_DEBUG logic
* 90bd30bdef ANDROID: Update the ABI symbol list
* 3280560843 ANDROID: Update the ABI symbol list
* 427210e440 UPSTREAM: usb: gadget: uvc: Remove nested locking
* 9267e267be ANDROID: uid_sys_stats: Fully initialize uid_entry_tmp value
* 2d3f0c9d41 ANDROID: Roll back some code to fix system_server registers psi trigger failed.
* bd77c97c76 UPSTREAM: usb: gadget: uvc: Fix use are free during STREAMOFF
* 21c71a7d0e ANDROID: GKI: Add symbol list for Nothing
* aba5a3fe09 ANDROID: Enable CONFIG_LAZY_RCU in x86 gki_defconfig
* 204160394a ANDROID: fuse-bpf: Fix the issue of abnormal lseek system calls
* 947708f1ff ANDROID: ABI: Update symbol list for imx
* 7eedea7abf BACKPORT: PM: sleep: Fix possible deadlocks in core system-wide PM code
* e1a20dd9ff UPSTREAM: async: Introduce async_schedule_dev_nocall()
* e4b0e14f83 UPSTREAM: async: Split async_schedule_node_domain()
* 6b4c816d17 FROMGIT: BACKPORT: mm: update mark_victim tracepoints fields
* d97ea65296 ANDROID: Enable CONFIG_LAZY_RCU in arm64 gki_defconfig
* 90d68cedd1 FROMLIST: rcu: Provide a boot time parameter to control lazy RCU
* a079cc5876 ANDROID: rcu: Add a minimum time for marking boot as completed
* ffe09c06a8 UPSTREAM: rcu: Disable laziness if lazy-tracking says so
* d07488d26e UPSTREAM: rcu: Track laziness during boot and suspend
* 4316bd568b UPSTREAM: net: Use call_rcu_hurry() for dst_release()
* b9427245f0 UPSTREAM: workqueue: Make queue_rcu_work() use call_rcu_hurry()
* 72fdf7f606 UPSTREAM: percpu-refcount: Use call_rcu_hurry() for atomic switch
* ced65a053b UPSTREAM: io_uring: use call_rcu_hurry if signaling an eventfd
* 84c8157d06 UPSTREAM: rcu: Update synchronize_rcu_mult() comment for call_rcu_hurry()
* 3751416eeb UPSTREAM: scsi/scsi_error: Use call_rcu_hurry() instead of call_rcu()
* 52193e9489 UPSTREAM: rcu/rcutorture: Use call_rcu_hurry() where needed
* 83f8ba569f UPSTREAM: rcu/rcuscale: Use call_rcu_hurry() for async reader test
* 9b625f4978 UPSTREAM: rcu/sync: Use call_rcu_hurry() instead of call_rcu
* c570c8fea3 BACKPORT: rcu: Shrinker for lazy rcu
* 4957579439 UPSTREAM: rcu: Refactor code a bit in rcu_nocb_do_flush_bypass()
* 66a832fe38 UPSTREAM: rcu: Make call_rcu() lazy to save power
* 4fb09fb4f7 UPSTREAM: rcu: Fix missing nocb gp wake on rcu_barrier()
* 64c59ad2c3 UPSTREAM: rcu: Fix late wakeup when flush of bypass cblist happens
* 0799ace265 ANDROID: Update the ABI symbol list
* 65db2f8ed3 ANDROID: GKI: add GKI symbol list for Exynosauto SoC
* cfe8cce4e8 UPSTREAM: coresight: tmc: Don't enable TMC when it's not ready.
* 899194d7e9 UPSTREAM: netfilter: nf_tables: bail out on mismatching dynset and set expressions
* e6712ed4f0 ANDROID: ABI: Update oplus symbol list
* 24bb8fc82e ANDROID: vendor_hooks: add hooks in driver/android/binder.c
* 55930b39ca ANDROID: GKI: Update honda symbol list for xt_LOG
* 3160b69e20 ANDROID: GKI: Update honda symbol list for ebt filter
* 4dc7f98815 ANDROID: GKI: Update honda symbol list for ebtables
* 39a0823340 ANDROID: GKI: Update honda symbol list for net scheduler
* dd0098bdb4 ANDROID: GKI: Update honda symbol list for led-trigger
* 66a20ed4b8 ANDROID: GKI: Add initial symbol list for honda
* 28dbe4d613 ANDROID: GKI: add symbols to ABI
* 97100e867e FROMGIT: usb: dwc: ep0: Update request status in dwc3_ep0_stall_restart
* 36248a15a7 FROMGIT: usb: dwc3: set pm runtime active before resume common
Change-Id: I8d9586a94c3182cd365d1e3b651a7552c7c9949b
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
commit 0a535eddbe0dc1de4386046ab849f08aeb2f8faf upstream.
If IOSQE_ASYNC is set and we fail importing an iovec for a readv or
writev request, then we leave ->bytes_done uninitialized and hence the
eventual failure CQE posted can potentially have a random res value
rather than the expected -EINVAL.
Setup ->bytes_done before potentially failing, so we have a consistent
value if we fail the request early.
Cc: stable@vger.kernel.org
Reported-by: xingwei lee <xrivendell7@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
io_uring uses call_rcu in the case it needs to signal an eventfd as a
result of an eventfd signal, since recursing eventfd signals are not
allowed. This should be calling the new call_rcu_hurry API to not delay
the signal.
Signed-off-by: Dylan Yudaken <dylany@meta.com>
Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Acked-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Link: https://lore.kernel.org/r/20221215184138.795576-1-dylany@meta.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
(cherry picked from commit 44a84da45272b3f4beb90025a64cfbde18f1aef0)
Bug: 258241771
Signed-off-by: Joel Fernandes <joelaf@google.com>
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4909038
Reviewed-by: Vineeth Pillai <vineethrp@google.com>
Reviewed-by: Ross Zwisler <zwisler@google.com>
Reviewed-by: Sean Paul <sean@poorly.run>
Signed-off-by: Qais Yousef <qyousef@google.com>
Change-Id: Iec189c9ce0a95ccacda81f58bf7d49a575a6ab3f
[ Upstream commit b841b901c452d92610f739a36e54978453528876 ]
Declare MSG_SPLICE_PAGES, an internal sendmsg() flag, that hints to a
network protocol that it should splice pages from the source iterator
rather than copying the data if it can. This flag is added to a list that
is cleared by sendmsg syscalls on entry.
This is intended as a replacement for the ->sendpage() op, allowing a way
to splice in several multipage folios in one go.
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: a0002127cd74 ("udp: move udp->no_check6_tx to udp->udp_flags")
Signed-off-by: Sasha Levin <sashal@kernel.org>
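As a rough illustration of the "cleared on entry" rule in the preceding commit message, the sketch below masks an internal-only flag out of flags supplied by a caller. The names DEMO_MSG_DONTWAIT, DEMO_MSG_SPLICE_PAGES, DEMO_INTERNAL_FLAGS and demo_sanitize_flags(), and the bit values, are made up for this example and are not the kernel's definitions.

```c
#include <assert.h> /* used by the check in the usage note */

/* Hypothetical bit values, chosen for this example only. */
#define DEMO_MSG_DONTWAIT      0x40u
#define DEMO_MSG_SPLICE_PAGES  0x8000000u

/* Flags that only internal callers may set (assumption: just one). */
#define DEMO_INTERNAL_FLAGS    (DEMO_MSG_SPLICE_PAGES)

/* Strip internal-only bits from flags that arrived from userspace. */
static unsigned int demo_sanitize_flags(unsigned int user_flags)
{
    return user_flags & ~DEMO_INTERNAL_FLAGS;
}
```

A caller could check the behaviour with assert(demo_sanitize_flags(DEMO_MSG_DONTWAIT | DEMO_MSG_SPLICE_PAGES) == DEMO_MSG_DONTWAIT). Masking at the entry point means protocol code may treat the flag as trustworthy whenever it sees it set.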
commit 7644b1a1c9a7ae8ab99175989bfc8676055edb46 upstream.
We could race with SQ thread exit, and if we do, we'll hit a NULL pointer
dereference when the thread is cleared. Grab the SQPOLL data lock before
attempting to get the task cpu and pid for fdinfo, this ensures we have a
stable view of it.
Bug: 309790656
Cc: stable@vger.kernel.org
Link: https://bugzilla.kernel.org/show_bug.cgi?id=218032
Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 9236d2ea64)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: I044e0285d4535440606ff593230b873e3145db91
commit f7b32e785042d2357c5abc23ca6db1b92c91a070 upstream.
Callers of mutex_unlock() have to make sure that the mutex stays alive
for the whole duration of the function call. For io_uring that means
that the following pattern is not valid unless we ensure that the
context outlives the mutex_unlock() call.
mutex_lock(&ctx->uring_lock);
req_put(req); // typically via io_req_task_submit()
mutex_unlock(&ctx->uring_lock);
Most contexts are fine: io-wq pins requests, syscalls hold the file,
task works are taking ctx references and so on. However, the task work
fallback path doesn't follow the rule.
Cc: <stable@vger.kernel.org>
Fixes: 04fc6c802d ("io_uring: save ctx put/get for task_work submit")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/io-uring/CAG48ez3xSoYb+45f1RLtktROJrpiDQ1otNvdR+YLQf7m+Krj5Q@mail.gmail.com/
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
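The lifetime rule described in the preceding commit message can be sketched in userspace with a toy reference count. Everything here (struct demo_ctx, demo_ctx_get(), demo_ctx_put(), demo_unsafe(), demo_safe()) is hypothetical and only models the pattern; a boolean stands in for the mutex and no memory is actually freed.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a context; not a kernel type. */
struct demo_ctx {
    int  refs;     /* reference count keeping the context alive */
    bool locked;   /* models ctx->uring_lock being held */
    bool alive;    /* cleared when the last reference is dropped */
    bool bad_free; /* set if the context died while still locked */
};

static void demo_ctx_get(struct demo_ctx *ctx)
{
    ctx->refs++;
}

static void demo_ctx_put(struct demo_ctx *ctx)
{
    assert(ctx->refs > 0);
    if (--ctx->refs == 0) {
        ctx->bad_free = ctx->locked; /* a later unlock would touch dead memory */
        ctx->alive = false;
    }
}

/* Unsafe: the request holds the last reference and drops it under the lock. */
static void demo_unsafe(struct demo_ctx *ctx)
{
    ctx->locked = true;
    demo_ctx_put(ctx);   /* models req_put(req) */
    ctx->locked = false; /* in real code this would be a use-after-free */
}

/* Safe: pin the context first so it outlives the unlock. */
static void demo_safe(struct demo_ctx *ctx)
{
    demo_ctx_get(ctx);
    ctx->locked = true;
    demo_ctx_put(ctx);   /* models req_put(req) */
    ctx->locked = false;
    demo_ctx_put(ctx);   /* last reference dropped only after unlock */
}
```

Starting from refs == 1, demo_unsafe() leaves bad_free set while demo_safe() does not, which is the difference between the task work fallback path and the other contexts listed above.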
commit 705318a99a138c29a512a72c3e0043b3cd7f55f4 upstream.
File reference cycles have caused lots of problems for io_uring
in the past, and it still doesn't work exactly right and races with
unix_stream_read_generic(). The safest fix would be to completely
disallow sending io_uring files over sockets via SCM_RIGHTS, so there
are no possible cycles involving registered files, thus rendering
SCM accounting on the io_uring side unnecessary.
Cc: <stable@vger.kernel.org>
Fixes: 0091bfc817 ("io_uring/af_unix: defer registered files gc to io_uring release")
Reported-and-suggested-by: Jann Horn <jannh@google.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/c716c88321939156909cfa1bd8b0faaf1c804103.1701868795.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d6fef34ee4d102be448146f24caf96d7b4a05401 upstream.
If the offset equals the bv_len of the first registered bvec, then the
request does not include any of that first bvec. Skip it so that drivers
don't have to deal with a zero length bvec, which was observed to break
NVMe's PRP list creation.
Cc: stable@vger.kernel.org
Fixes: bd11b3a391 ("io_uring: don't use iov_iter_advance() for fixed buffers")
Signed-off-by: Keith Busch <kbusch@kernel.org>
Link: https://lore.kernel.org/r/20231120221831.2646460-1-kbusch@meta.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
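A minimal sketch of the skip rule from the preceding commit message, using a hypothetical struct demo_seg in place of a registered bvec and a made-up helper demo_seek(). The point is the ">=" comparison: an offset equal to the first segment's length moves on to the next segment instead of leaving a zero-length one at the front. This is not the kernel's import code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical segment type standing in for a registered bvec. */
struct demo_seg {
    size_t len;
};

/*
 * Find the first segment that actually holds data at `offset`.
 * A segment is skipped when offset >= its length. Returns the segment
 * index and stores the remaining offset inside that segment in *inner.
 * The walk stops at the last segment; bounds checks are the caller's job.
 */
static size_t demo_seek(const struct demo_seg *segs, size_t nsegs,
                        size_t offset, size_t *inner)
{
    size_t i = 0;

    assert(nsegs > 0);
    while (i < nsegs - 1 && offset >= segs[i].len) {
        offset -= segs[i].len;
        i++;
    }
    *inner = offset;
    return i;
}
```

With three 4096-byte segments, an offset of 4096 resolves to segment 1 with an inner offset of 0, while an offset of 4095 stays in segment 0.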
commit 8479063f1fbee201a8739130e816cc331b675838 upstream.
In order for `AT_EMPTY_PATH` to work as expected, the fact
that the user wants that behavior needs to make it to `getname_flags`
or it will return ENOENT.
Fixes: cf30da90bc ("io_uring: add support for IORING_OP_LINKAT")
Cc: <stable@vger.kernel.org>
Link: https://github.com/axboe/liburing/issues/995
Signed-off-by: Charles Mirabile <cmirabil@redhat.com>
Link: https://lore.kernel.org/r/20231120105545.1209530-1-cmirabil@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f8f9ab2d98116e79d220f1d089df7464ad4e026d upstream.
io_uring does non-blocking connection attempts, which can yield some
unexpected results if a connect request is re-attempted by an
application. This is equivalent to the following sync syscall sequence:
sock = socket(AF_INET, SOCK_STREAM | SOCK_NONBLOCK, IPPROTO_TCP);
connect(sock, &addr, sizeof(addr));
ret == -1 and errno == EINPROGRESS expected here. Now poll for POLLOUT
on sock, and when that returns, we expect the socket to be connected.
But if we follow that procedure with:
connect(sock, &addr, sizeof(addr));
you'd expect ret == -1 and errno == EISCONN here, but you actually get
ret == 0. If we attempt the connection one more time, then we get EISCONN
as expected.
io_uring used to do this, but turns out that bluetooth fails with EBADFD
if you attempt to re-connect. Also looks like EISCONN _could_ occur with
this sequence.
Retain the ->in_progress logic, but work-around a potential EISCONN or
EBADFD error and only in those cases look at the sock_error(). This
should work in general and avoid the odd sequence of a repeated connect
request returning success when the socket is already connected.
This is all a side effect of the socket state being in a CONNECTING
state when we get EINPROGRESS, and only a re-connect or other related
operation will turn that into CONNECTED.
Cc: stable@vger.kernel.org
Fixes: 3fb1bd6881 ("io_uring/net: handle -EINPROGRESS correct for IORING_OP_CONNECT")
Link: https://github.com/axboe/liburing/issues/980
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
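The sync syscall sequence quoted in the preceding commit message can be tried from userspace with a sketch like the one below. It assumes a Linux system with a usable loopback interface; struct demo_result and demo_connect_sequence() are names invented for this example. It only records what each connect() call returned and makes no claim about the values seen on a given kernel.

```c
#include <arpa/inet.h>
#include <assert.h> /* used by the check in the usage note */
#include <errno.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Return value and errno of each connect() call in the sequence. */
struct demo_result {
    int ret[3];
    int err[3];
};

/*
 * Call connect() three times against a loopback listener, polling for
 * POLLOUT after the first call if it reported EINPROGRESS.
 * Returns 0 if the sequence ran, -1 if the sockets could not be set up.
 */
static int demo_connect_sequence(struct demo_result *res)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    struct pollfd pfd;
    int lfd, sock, i;

    lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0; /* let the kernel pick a free port */
    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(lfd, 1) < 0 ||
        getsockname(lfd, (struct sockaddr *)&addr, &len) < 0) {
        close(lfd);
        return -1;
    }

    sock = socket(AF_INET, SOCK_STREAM | SOCK_NONBLOCK, IPPROTO_TCP);
    if (sock < 0) {
        close(lfd);
        return -1;
    }

    for (i = 0; i < 3; i++) {
        errno = 0;
        res->ret[i] = connect(sock, (struct sockaddr *)&addr, sizeof(addr));
        res->err[i] = res->ret[i] < 0 ? errno : 0;
        if (i == 0 && res->ret[0] < 0 && res->err[0] == EINPROGRESS) {
            pfd.fd = sock;
            pfd.events = POLLOUT;
            poll(&pfd, 1, 1000); /* wait up to one second */
        }
    }

    close(sock);
    close(lfd);
    return 0;
}
```

After a successful run, ret[1] and err[1] show what the second connect() reported and ret[2] and err[2] what the third one did, which are the two cases the commit message contrasts. A cautious caller would first check something like assert(res.ret[0] == 0 || res.err[0] == EINPROGRESS).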
[ Upstream commit f74c746e476b9dad51448b9a9421aae72b60e25f ]
nbufs tracks the number of buffers and not the last bgid. In 16-bit, we
have 2^16 valid buffers, but the check mistakenly rejects the last
bid. Let's fix it to make the interface consistent with the
documentation.
Fixes: ddf0322db7 ("io_uring: add IORING_OP_PROVIDE_BUFFERS")
Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
Link: https://lore.kernel.org/r/20231005000531.30800-3-krisman@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ab69838e7c75b0edb699c1a8f42752b30333c46f ]
Commit 3851d25c75 ("io_uring: check for rollover of buffer ID when
providing buffers") introduced a check to prevent wrapping the BID
counter when sqe->off is provided, but it's off-by-one too
restrictive, rejecting the last possible BID (65534).
i.e., the following fails with -EINVAL.
io_uring_prep_provide_buffers(sqe, addr, size, 0xFFFF, 0, 0);
Fixes: 3851d25c75 ("io_uring: check for rollover of buffer ID when providing buffers")
Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
Link: https://lore.kernel.org/r/20231005000531.30800-2-krisman@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
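The off-by-one described in the two preceding commit messages comes down to simple arithmetic on a 16-bit ID space. The sketch below is an illustration only: DEMO_MAX_BIDS, demo_range_ok_strict() and demo_range_ok() are hypothetical names, and the strict variant is merely one way of writing a bound that rejects the io_uring_prep_provide_buffers() call quoted above.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_MAX_BIDS (1u << 16) /* 16-bit buffer IDs: 0 .. 65535 */

/* Too strict: refuses ranges that end at or near the top of the ID space. */
static bool demo_range_ok_strict(unsigned int first_bid, unsigned int nbufs)
{
    assert(first_bid < DEMO_MAX_BIDS && nbufs <= DEMO_MAX_BIDS);
    return first_bid + nbufs < DEMO_MAX_BIDS - 1;
}

/* Accepts every range that fits: IDs first_bid .. first_bid + nbufs - 1. */
static bool demo_range_ok(unsigned int first_bid, unsigned int nbufs)
{
    assert(first_bid < DEMO_MAX_BIDS && nbufs <= DEMO_MAX_BIDS);
    return first_bid + nbufs <= DEMO_MAX_BIDS;
}
```

For 0xFFFF buffers starting at ID 0 the last ID is 65534, so the range is valid; the strict variant rejects it and the relaxed one accepts it. A range of 0x10000 buffers starting at ID 1 would need ID 65536 and is rejected by both.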
[ Upstream commit 1939316bf988f3e49a07d9c4dd6f660bf4daa53d ]
->ki_pos value is unreliable in such cases. For an obvious example,
consider O_DSYNC write - we feed the data to page cache and start IO,
then we make sure it's completed. Update of ->ki_pos is dealt with
by the first part; failure in the second ends up with negative value
returned _and_ ->ki_pos left advanced as if sync had been successful.
In the same situation write(2) does not advance the file position
at all.
Reviewed-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 7644b1a1c9a7ae8ab99175989bfc8676055edb46 upstream.
We could race with SQ thread exit, and if we do, we'll hit a NULL pointer
dereference when the thread is cleared. Grab the SQPOLL data lock before
attempting to get the task cpu and pid for fdinfo, this ensures we have a
stable view of it.
Cc: stable@vger.kernel.org
Link: https://bugzilla.kernel.org/show_bug.cgi?id=218032
Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit a52d4f657568d6458e873f74a9602e022afe666f upstream.
This is unionized with the actual link flags, so they can of course be
set and they will be evaluated further down. If not we fail any LINKAT
that has to set option flags.
Fixes: cf30da90bc ("io_uring: add support for IORING_OP_LINKAT")
Cc: stable@vger.kernel.org
Reported-by: Thomas Leonard <talex5@gmail.com>
Link: https://github.com/axboe/liburing/issues/955
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c21a8027ad8a68c340d0d58bf1cc61dcb0bc4d2f upstream.
When using selected buffer feature, io_uring delays data iter setup
until later. If io_setup_async_msg() is called before that, it might see
an iterator that is not set up correctly. Pre-init nr_segs and judge from
its state whether we are repointing.
Cc: stable@vger.kernel.org
Reported-by: syzbot+a4c6e5ef999b68b26ed1@syzkaller.appspotmail.com
Fixes: 0455d4ccec ("io_uring: add POLL_FIRST support for send/sendmsg and recv/recvmsg")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/0000000000002770be06053c7757@google.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
From: Jens Axboe <axboe@kernel.dk>
[ upstream commit ebdfefc09c6de7897962769bd3e63a2ff443ebf5 ]
If we setup the ring with SQPOLL, then that polling thread has its
own io-wq setup. This means that if the application uses
IORING_REGISTER_IOWQ_AFF to set the io-wq affinity, we should not be
setting it for the invoking task, but rather the sqpoll task.
Add an sqpoll helper that parks the thread and updates the affinity,
and use that one if we're using SQPOLL.
Fixes: fe76421d1d ("io_uring: allow user configurable IO thread CPU affinity")
Cc: stable@vger.kernel.org # 5.10+
Link: https://github.com/axboe/liburing/discussions/884
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ upstream commit 45500dc4e01c167ee063f3dcc22f51ced5b2b1e9 ]
io-wq will retry iopoll even when it failed with -EAGAIN. If that
races with task exit, which sets TIF_NOTIFY_SIGNAL for all its workers,
such workers might potentially infinitely spin retrying iopoll again and
again and each time failing on some allocation / waiting / etc. Don't
keep spinning if io-wq is dying.
Fixes: 561fb04a6a ("io_uring: replace workqueue usage with io-wq")
Cc: stable@vger.kernel.org
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
From: Dylan Yudaken <dylany@meta.com>
[ upstream commit 515e26961295bee9da5e26916c27739dca6c10e1 ]
This is no longer needed after commit aa1df3a360 ("io_uring: fix CQE
reordering"), since all reordering is now taken care of.
This reverts commit cbd2574854 ("io_uring: fix multishot accept
ordering").
Signed-off-by: Dylan Yudaken <dylany@meta.com>
Link: https://lore.kernel.org/r/20221107125236.260132-2-dylany@meta.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
From: Dylan Yudaken <dylany@meta.com>
[ upstream commit c06c6c5d276707e04cedbcc55625e984922118aa ]
This is required for the failure case (io_req_complete_failed) and is
missing.
The alternative would be to only lock in the failure path, however all of
the non-error paths in io_poll_check_events that do not return
IOU_POLL_NO_ACTION end up locking anyway. The only extraneous lock would
be for the multishot poll overflowing the CQE ring, however multishot poll
would probably benefit from being locked as it will allow completions to
be batched.
So it seems reasonable to lock always.
Signed-off-by: Dylan Yudaken <dylany@meta.com>
Link: https://lore.kernel.org/r/20221124093559.3780686-3-dylany@meta.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dc314886cb3d0e4ab2858003e8de2917f8a3ccbd upstream.
Don't keep spinning iopoll with a signal set. It'll eventually return
back, e.g. by virtue of need_resched(), but it's not a nice user
experience.
Cc: stable@vger.kernel.org
Fixes: def596e955 ("io_uring: support for IO polling")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/eeba551e82cad12af30c3220125eb6cb244cc94c.1691594339.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit cfdbaa3a291d6fd2cb4a1a70d74e63b4abc2f5ec ]
cq_extra is protected by ->completion_lock, which io_get_sqe() misses.
The bug is harmless as it doesn't happen in real life: it requires an
invalid SQ index array and racing with submission, and it only messes up
userspace, i.e. stalls request execution, which will be cleaned up on
ring destruction.
Fixes: 15641e4270 ("io_uring: don't cache number of dropped SQEs")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/66096d54651b1a60534bb2023f2947f09f50ef73.1691538547.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Vidra Jonas reported issues on parisc with libuv which then triggers
build errors with cmake. Debugging shows that those issues stem from
io_uring().
I was not able to easily pull in upstream commits directly, so here
is IMHO the least invasive manual backport of the following upstream
commits to fix the cache aliasing issues on parisc on kernel 6.1
with io_uring:
56675f8b9f9b ("io_uring/parisc: Adjust pgoff in io_uring mmap() for parisc")
32832a407a71 ("io_uring: Fix io_uring mmap() by using architecture-provided get_unmapped_area()")
d808459b2e31 ("io_uring: Adjust mapping wrt architecture aliasing requirements")
With this patch kernel 6.1 has all relevant mmap changes and is
identical to kernel 6.5 with regard to mmap() in io_uring.
Signed-off-by: Helge Deller <deller@gmx.de>
Reported-by: Vidra.Jonas@seznam.cz
Link: https://lore.kernel.org/linux-parisc/520.NvTX.6mXZpmfh4Ju.1awpAS@seznam.cz/
Cc: Sam James <sam@gentoo.org>
Cc: John David Anglin <dave.anglin@bell.net>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit e12d7a46f65ae4b7d58a5e0c1cbfa825cf8d830d upstream.
If the target ring is configured with IOPOLL, then we always need to hold
the target ring uring_lock before posting CQEs. We could just grab it
unconditionally, but since we don't expect many target rings to be of this
type, make grabbing the uring_lock conditional on the ring type.
Link: https://lore.kernel.org/io-uring/Y8krlYa52%2F0YGqkg@ip-172-31-85-199.ec2.internal/
Reported-by: Xingyuan Mo <hdthky0@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 423d5081d0451faa59a707e57373801da5b40141 upstream.
In preparation for needing them somewhere else, move them and get rid of
the unused 'issue_flags' for the unlock side.
No functional changes in this patch.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 172113101641cf1f9628c528ec790cb809f2b704 upstream.
Extract a helper called io_msg_install_complete() from io_msg_send_fd(),
will be used later.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/1500ca1054cc4286a3ee1c60aacead57fcdfa02a.1670384893.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 11373026f2960390d5e330df4e92735c4265c440 upstream.
We don't need to take both uring_locks at once, msg_ring can be split in
two parts, first getting a file from the filetable of the first ring and
then installing it into the second one.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/a80ecc2bc99c3b3f2cf20015d618b7c51419a797.1670384893.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 72dbde0f2afbe4af8e8595a89c650ae6b9d9c36f upstream.
O_TMPFILE is actually __O_TMPFILE|O_DIRECTORY. This means that the old
check for whether RESOLVE_CACHED can be used would incorrectly think
that O_DIRECTORY could not be used with RESOLVE_CACHED.
Cc: stable@vger.kernel.org # v5.12+
Fixes: 3a81fd0204 ("io_uring: enable LOOKUP_CACHED path resolution for filename lookups")
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Link: https://lore.kernel.org/r/20230807-resolve_cached-o_tmpfile-v3-1-e49323e1ef6f@cyphar.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
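The flag overlap mentioned in the preceding commit message is also visible from userspace. The sketch below assumes Linux with glibc, where defining _GNU_SOURCE before the first include makes <fcntl.h> expose O_TMPFILE; demo_has_tmpfile_naive() and demo_has_tmpfile() are hypothetical helpers. It shows the general pitfall of testing a multi-bit flag with a plain AND and is not the check used in the kernel.

```c
#define _GNU_SOURCE /* must precede all includes to expose O_TMPFILE */
#include <assert.h> /* used by the checks in the usage note */
#include <fcntl.h>
#include <stdbool.h>

/* Naive test: true whenever any O_TMPFILE bit is set, which includes
 * a plain O_DIRECTORY open. */
static bool demo_has_tmpfile_naive(int flags)
{
    return (flags & O_TMPFILE) != 0;
}

/* Stricter test: true only when every O_TMPFILE bit is set. */
static bool demo_has_tmpfile(int flags)
{
    return (flags & O_TMPFILE) == O_TMPFILE;
}
```

Here assert(demo_has_tmpfile_naive(O_DIRECTORY)) holds even though no temporary file was requested, while assert(!demo_has_tmpfile(O_DIRECTORY)) shows the stricter comparison telling the two apart.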
commit 5498bf28d8f2bd63a46ad40f4427518615fb793f upstream.
It's racy to read ->cached_cq_tail without taking proper measures
(usually grabbing ->completion_lock) as timeout requests with CQE
offsets do; however, they have never had good semantics for when
they start counting. Annotate racy reads with data_race().
Reported-by: syzbot+cb265db2f3f3468ef436@syzkaller.appspotmail.com
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/4de3685e185832a92a572df2be2c735d2e21a83d.1684506056.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 7b72d661f1f2f950ab8c12de7e2bc48bdac8ed69 upstream.
A previous commit made all cqring waits marked as iowait, as a way to
improve performance for short schedules with pending IO. However, for
use cases that have a special reaper thread that does nothing but
wait on events on the ring, this causes a cosmetic issue where we
now have one core marked as being "busy" with 100% iowait.
While this isn't a grave issue, it is confusing to users. Rather than
always mark us as being in iowait, gate setting of current->in_iowait
to 1 by whether or not the waiting task has pending requests.
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/io-uring/CAMEGJJ2RxopfNQ7GNLhr7X9=bHXKo+G5OOe0LUq=+UgLXsv1Xg@mail.gmail.com/
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217699
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217700
Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Reported-by: Phil Elwell <phil@raspberrypi.com>
Tested-by: Andres Freund <andres@anarazel.de>
Fixes: 8a796565cec3 ("io_uring: Use io_schedule* in cqring wait")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6adc2272aaaf84f34b652cf77f770c6fcc4b8336 ]
The check being unconditional may lead to unwanted denials reported by
LSMs when a process has the capability granted by DAC, but denied by an
LSM. In the case of SELinux such denials are a problem, since they can't
be effectively filtered out via the policy and when not silenced, they
produce noise that may hide a true problem or an attack.
Since not having the capability merely means that the created io_uring
context will be accounted against the current user's RLIMIT_MEMLOCK
limit, we can disable auditing of denials for this check by using
ns_capable_noaudit() instead of capable().
Fixes: 2b188cc1bb ("Add io_uring IO interface")
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2193317
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Link: https://lore.kernel.org/r/20230718115607.65652-1-omosnace@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit a9be202269580ca611c6cebac90eaf1795497800 upstream.
io-wq assumes that an issue is blocking, but it may not be if the
request type has asked for a non-blocking attempt. If we get
-EAGAIN for that case, then we need to treat it as a final result
and not retry or arm poll for it.
Cc: stable@vger.kernel.org # 5.10+
Link: https://github.com/axboe/liburing/issues/897
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 8a796565cec3601071cbbd27d6304e202019d014 upstream.
I observed poor performance of io_uring compared to synchronous IO. That
turns out to be caused by deeper CPU idle states entered with io_uring,
due to io_uring using plain schedule(), whereas synchronous IO uses
io_schedule().
The losses due to this are substantial. On my cascade lake workstation,
t/io_uring from the fio repository e.g. yields regressions between 20%
and 40% with the following command:
./t/io_uring -r 5 -X0 -d 1 -s 1 -c 1 -p 0 -S$use_sync -R 0 /mnt/t2/fio/write.0.0
This is repeatable with different filesystems, using raw block devices
and using different block devices.
Use io_schedule_prepare() / io_schedule_finish() in
io_cqring_wait_schedule() to address the difference.
After that using io_uring is on par or surpassing synchronous IO (using
registered files etc makes it reliably win, but arguably is a less fair
comparison).
There are other calls to schedule() in io_uring/, but none immediately
jump out to be similarly situated, so I did not touch them. Similarly,
it's possible that mutex_lock_io() should be used, but it's not clear if
there are cases where that matters.
Cc: stable@vger.kernel.org # 5.10+
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: io-uring@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Andres Freund <andres@anarazel.de>
Link: https://lore.kernel.org/r/20230707162007.194068-1-andres@anarazel.de
[axboe: minor style fixup]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4826c59453b3b4677d6bf72814e7ababdea86949 upstream.
When the ring exits, cleanup is done and the final cancelation and
waiting on completions is done by io_ring_exit_work. That function is
invoked by kworker, which doesn't take any signals. Because of that, it
doesn't really matter if we wait for completions in TASK_INTERRUPTIBLE
or TASK_UNINTERRUPTIBLE state. However, it does matter to the hung task
detection checker!
Normally we expect cancelations and completions to happen rather
quickly. Some test cases, however, will exit the ring and park the
owning task stopped (eg via SIGSTOP). If the owning task needs to run
task_work to complete requests, then io_ring_exit_work won't make any
progress until the task is runnable again. Hence io_ring_exit_work can
trigger the hung task detection, which is particularly problematic if
panic-on-hung-task is enabled.
As the ring exit doesn't take signals to begin with, have it wait
interruptibly rather than uninterruptibly. io_uring has a separate
stuck-exit warning that triggers independently anyway, so we're not
really missing anything by making this switch.
Cc: stable@vger.kernel.org # 5.10+
Link: https://lore.kernel.org/r/b0e4aaef-7088-56ce-244c-976edeac0e66@kernel.dk
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 26fed83653d0154704cadb7afc418f315c7ac1f0 ]
Rather than assign the user pointer to msghdr->msg_control, assign it
to msghdr->msg_control_user to make sparse happy. They are in a union
so the end result is the same, but let's avoid new sparse warnings and
squash this one.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202306210654.mDMcyMuB-lkp@intel.com/
Fixes: cac9e4418f4c ("io_uring/net: save msghdr->msg_control for retries")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit ef7dfac51d8ed961b742218f526bd589f3900a59 upstream.
We selectively grab the ctx->uring_lock for poll update/removal, but
we really should grab it from the start to fully synchronize with
linked timeouts. Normally this is indeed the case, but if requests
are forced async by the application, we don't fully cover removal
and timer disarm within the uring_lock.
Make this simpler by having consistent locking state for poll removal.
Cc: stable@vger.kernel.org # 6.1+
Reported-by: Querijn Voet <querijnqyn@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 78d0d2063bab954d19a1696feae4c7706a626d48 upstream.
We cannot sanely handle partial retries for recvmsg if we have cmsg
attached. If we don't, then we'd just be overwriting the initial cmsg
header on retries. Alternatively we could increment and handle this
appropriately, but it doesn't seem worth the complication.
Move the MSG_WAITALL check into the non-multishot case while at it,
since MSG_WAITALL is explicitly disabled for multishot anyway.
Link: https://lore.kernel.org/io-uring/0b0d4411-c8fd-4272-770b-e030af6919a0@kernel.dk/
Cc: stable@vger.kernel.org # 5.10+
Reported-by: Stefan Metzmacher <metze@samba.org>
Reviewed-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b1dc492087db0f2e5a45f1072a743d04618dd6be upstream.
If we have cmsg attached AND we transferred partial data at least, clear
msg_controllen on retry so we don't attempt to send that again.
Cc: stable@vger.kernel.org # 5.10+
Fixes: cac9e4418f4c ("io_uring/net: save msghdr->msg_control for retries")
Reported-by: Stefan Metzmacher <metze@samba.org>
Reviewed-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cac9e4418f4cbd548ccb065b3adcafe073f7f7d2 upstream.
If the application sets ->msg_control and we have to later retry this
command, or if it got queued with IOSQE_ASYNC to begin with, then we
need to retain the original msg_control value. This is due to the net
stack overwriting this field with an in-kernel pointer, to copy it
in. Hitting that path for the second time will now fail the copy from
user, as it's attempting to copy from a non-user address.
Cc: stable@vger.kernel.org # 5.10+
Link: https://github.com/axboe/liburing/issues/880
Reported-and-tested-by: Marek Majkowski <marek@cloudflare.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
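The save-and-restore idea from the preceding commit message can be modelled with a toy request structure. All names below (struct demo_msghdr, struct demo_req, demo_lower_layer(), demo_prep(), demo_issue()) are hypothetical; the "lower layer" simply overwrites the pointer the way the commit message says the net stack does.

```c
#include <assert.h>
#include <stddef.h>

/* Toy message header; only the control pointer matters here. */
struct demo_msghdr {
    void *control; /* user pointer on entry */
};

/* Toy request state that survives across retries. */
struct demo_req {
    struct demo_msghdr msg;
    void *saved_control; /* original user pointer */
};

static char demo_kernel_copy[16];

/* Models a lower layer that replaces the pointer with its own copy. */
static void demo_lower_layer(struct demo_msghdr *msg)
{
    msg->control = demo_kernel_copy;
}

/* Record the user pointer once, when the request is prepared. */
static void demo_prep(struct demo_req *req, void *user_control)
{
    req->msg.control = user_control;
    req->saved_control = user_control;
}

/* Restore the user pointer before every attempt, then issue. */
static void demo_issue(struct demo_req *req)
{
    req->msg.control = req->saved_control;
    assert(req->msg.control != demo_kernel_copy);
    demo_lower_layer(&req->msg);
}
```

Without the restore step in demo_issue(), a second attempt would start from the overwritten pointer, which is the failure the commit message describes for retried or IOSQE_ASYNC requests.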
[ Upstream commit 533ab73f5b5c95dcb4152b52d5482abcc824c690 ]
The sq thread actively releases CPU resources by calling the
cond_resched() and schedule() interfaces when it is idle. Therefore,
more resources are available for other threads to run.
There is a problem in the sq thread: it does not always unlock sqd->lock
before releasing CPU resources. This leaves other threads waiting on
sqd->lock for a long time. For example, the following interfaces all
require sqd->lock: io_sq_offload_create(), io_register_iowq_max_workers()
and io_ring_exit_work().
Unlocking sqd->lock before the sq thread releases CPU resources improves
responsiveness, because user requests that need the lock no longer wait
on an idle sq thread.
Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Wenwen Chen <wenwen.chen@samsung.com>
Link: https://lore.kernel.org/r/20230525082626.577862-1-wenwen.chen@samsung.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 953c37e066f05a3dca2d74643574b8dfe8a83983 ]
We use array_index_nospec() for registered buffer indexes, but don't use
it while poking into rsrc tags. Fix that.
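For readers unfamiliar with the helper, the idea behind index clamping can
be sketched in userspace C. The mask formula below follows the generic,
architecture-independent approach, but the sketch_* names and the tag
lookup are hypothetical. Unlike the kernel helper, this sketch does
nothing to stop the compiler from turning the mask back into a branch, so
it illustrates the arithmetic only and gives no speculation guarantee.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SKETCH_BITS_PER_LONG (sizeof(long) * CHAR_BIT)

/*
 * Returns ~0UL when index < size and 0 otherwise, without a branch.
 * Assumes index and size are both <= LONG_MAX and that right-shifting a
 * negative long is an arithmetic shift (implementation-defined in C).
 */
static unsigned long sketch_index_mask(unsigned long index, unsigned long size)
{
    return (unsigned long)(~(long)(index | (size - 1UL - index)) >>
                           (SKETCH_BITS_PER_LONG - 1));
}

/* Clamps an out-of-range index to 0 and leaves valid indexes unchanged. */
static unsigned long sketch_index_nospec(unsigned long index, unsigned long size)
{
    return index & sketch_index_mask(index, size);
}

/* Hypothetical tag lookup: bounds check first, then clamp before use. */
static unsigned long long sketch_get_tag(const unsigned long long *tags,
                                         unsigned long nr, unsigned long idx)
{
    assert(tags != NULL);
    if (idx >= nr)
        return 0;
    idx = sketch_index_nospec(idx, nr);
    return tags[idx];
}
```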
Fixes: 634d00df5e ("io_uring: add full-fledged dynamic buffers support")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/f02fafc5a9c0dd69be2b0618c38831c078232ff0.1681395792.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b4a72c0589fdea6259720375426179888969d6a2 ]
When removing provided buffers, io_buffer structs are not being disposed
of, leading to a memory leak. They can't be freed individually, because
they are allocated in page-sized groups. They need to be added to some
free list instead, such as io_buffers_cache. All callers already hold
the lock protecting it, apart from when destroying buffers, so the lock
had to be extended there.
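The free-list idea can be sketched as follows. This is a userspace model
under stated assumptions: sketch_buf, sketch_ctx, SKETCH_GROUP and the
helper functions are hypothetical, locking is omitted, and the group
allocations are not tracked for final teardown as a real implementation
would need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_GROUP 8   /* hypothetical: entries per group allocation */

struct sketch_buf {
    struct sketch_buf *next;   /* link in the free cache */
};

struct sketch_ctx {
    struct sketch_buf *cache;  /* free entries available for reuse */
    size_t nr_allocs;          /* group allocations made so far */
};

/* Refill the cache with one group; entries are never freed singly. */
static int sketch_refill(struct sketch_ctx *ctx)
{
    struct sketch_buf *grp = calloc(SKETCH_GROUP, sizeof(*grp));

    if (!grp)
        return -1;
    for (size_t i = 0; i < SKETCH_GROUP; i++) {
        grp[i].next = ctx->cache;
        ctx->cache = &grp[i];
    }
    ctx->nr_allocs++;
    return 0;
}

/* Take an entry from the cache, allocating a new group only when empty. */
static struct sketch_buf *sketch_get(struct sketch_ctx *ctx)
{
    struct sketch_buf *b;

    assert(ctx != NULL);
    if (!ctx->cache && sketch_refill(ctx) < 0)
        return NULL;
    b = ctx->cache;
    ctx->cache = b->next;
    b->next = NULL;
    return b;
}

/* Removing a buffer returns its entry to the cache instead of leaking it. */
static void sketch_put(struct sketch_ctx *ctx, struct sketch_buf *b)
{
    b->next = ctx->cache;
    ctx->cache = b;
}
```

With entries returned through sketch_put(), repeated provide/remove cycles
reuse the same group rather than allocating a new one each time.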
Fixes: cc3cec8367 ("io_uring: speedup provided buffer handling")
Signed-off-by: Wojciech Lukowicz <wlukowicz01@gmail.com>
Link: https://lore.kernel.org/r/20230401195039.404909-2-wlukowicz01@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c0921e51dab767ef5adf6175c4a0ba3c6e1074a3 ]
When a request to remove buffers is submitted, and the number to be
removed is larger than the number available in the specified buffer
group, the CQE result will be the number of removed buffers + 1, which
is 1 more than it should be.
Previously, the head was part of the list and it got removed after the
loop, so the increment was needed. Now, the head is not an element of
the list, so the increment shouldn't be there anymore.
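The counting rule can be illustrated with a small userspace sketch. It
assumes a singly linked list whose head is a sentinel rather than a
buffer; sketch_buf, sketch_add() and sketch_remove() are hypothetical
names, not the kernel's list helpers.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_buf {
    struct sketch_buf *next;
};

/* Insert a buffer right after the sentinel head. */
static void sketch_add(struct sketch_buf *head, struct sketch_buf *b)
{
    b->next = head->next;
    head->next = b;
}

/*
 * Remove up to nbufs buffers and return how many were removed.
 * The head is a sentinel and is never counted, so there is no "+ 1"
 * after the loop.
 */
static unsigned sketch_remove(struct sketch_buf *head, unsigned nbufs)
{
    unsigned removed = 0;

    assert(head != NULL);
    while (removed < nbufs && head->next != NULL) {
        struct sketch_buf *b = head->next;

        head->next = b->next;
        b->next = NULL;
        removed++;
    }
    return removed;
}
```

Asking for more buffers than the group holds returns exactly the number
that were present in this model.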
Fixes: dbc7d452e7 ("io_uring: manage provided buffers strictly ordered")
Signed-off-by: Wojciech Lukowicz <wlukowicz01@gmail.com>
Link: https://lore.kernel.org/r/20230401195039.404909-2-wlukowicz01@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>