-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmbCv24ACgkQONu9yGCS
aT7lNRAAzP2lSCUHROaMTldoQdahqoWqwFSiMI9p32HYLTerpg1GHVsi1IUvD+pv
zhmUG9w+ACbSbZ9337G61FeEDCIBzgqaIXLCtbK2Be9nWMa9I1ZtMSFUKoSmVJBw
YbrI/UOscJmAf44G6DeMp+N+/S2o7INK463u51SYjufo/zhFF8KsYElm23p06kgn
lTkkUAoo9mSVvEr64zbjwLrWyBWTlcvYH/xrkWeJWXl+hBv0K5Ig9IBm0sc0DSQR
fErADzDLFkmD9pduZbMwbzUUzC8ST41KKjTgClaHQhSMeoLoWT8CJM5Swwds4XVE
JkoClkqnj3+stYFpLFm9UUgZ12wu/9slzgRCN6fTraSNT8gE9F9BRJXFGL+3S5OO
oHKZYEEPTZDsD3PihgufJ4Ft27+KpMUzAgQUmVH/y47wrVJ2pf4fCK8LKT0MbjBi
pjZaDRCxwo1aORL3+jYJBVRecrNqQ0DhacYOKznhb2KKeaHojIwLaE6k/W/0Q8U5
1uMYv+NJ3LWDNzGcNUTCfNtuDELOpkp24Xc8RN0MK2iMMMyfjMpgKssjSBZtz0QW
NH0UVpfiWKECKH+m03NeFnYdMuK8/VyM8vatkcemz0FfgJP2UazeiVwSujfS2r2S
0TtsCMPP3kgKa9mAnni7lQs4wkG+OTNDNZqbuDqFZ1rHUS2Usrg=
=8i2e
-----END PGP SIGNATURE-----
Merge 5.10.224 into android12-5.10-lts
Changes in 5.10.224
EDAC/skx_common: Add new ADXL components for 2-level memory
EDAC, i10nm: make skx_common.o a separate module
platform/chrome: cros_ec_debugfs: fix wrong EC message version
hfsplus: fix to avoid false alarm of circular locking
x86/of: Return consistent error type from x86_of_pci_irq_enable()
x86/pci/intel_mid_pci: Fix PCIBIOS_* return code handling
x86/pci/xen: Fix PCIBIOS_* return code handling
x86/platform/iosf_mbi: Convert PCIBIOS_* return codes to errnos
hwmon: (adt7475) Fix default duty on fan is disabled
pwm: stm32: Always do lazy disabling
hwmon: (max6697) Fix underflow when writing limit attributes
hwmon: (max6697) Fix swapped temp{1,8} critical alarms
arm64: dts: qcom: sdm845: add power-domain to UFS PHY
soc: qcom: rpmh-rsc: Ensure irqs aren't disabled by rpmh_rsc_send_data() callers
arm64: dts: qcom: msm8996: specify UFS core_clk frequencies
soc: qcom: pdr: protect locator_addr with the main mutex
soc: qcom: pdr: fix parsing of domains lists
arm64: dts: rockchip: Increase VOP clk rate on RK3328
ARM: dts: imx6qdl-kontron-samx6i: move phy reset into phy-node
ARM: dts: imx6qdl-kontron-samx6i: fix PHY reset
ARM: dts: imx6qdl-kontron-samx6i: fix board reset
ARM: dts: imx6qdl-kontron-samx6i: fix SPI0 chip selects
ARM: dts: imx6qdl-kontron-samx6i: fix PCIe reset polarity
arm64: dts: mediatek: mt8183-kukui: Drop bogus output-enable property
arm64: dts: mediatek: mt7622: fix "emmc" pinctrl mux
arm64: dts: amlogic: gx: correct hdmi clocks
m68k: atari: Fix TT bootup freeze / unexpected (SCU) interrupt messages
x86/xen: Convert comma to semicolon
m68k: cmpxchg: Fix return value for default case in __arch_xchg()
ARM: pxa: spitz: use gpio descriptors for audio
ARM: spitz: fix GPIO assignment for backlight
firmware: turris-mox-rwtm: Fix checking return value of wait_for_completion_timeout()
firmware: turris-mox-rwtm: Initialize completion before mailbox
wifi: brcmsmac: LCN PHY code is used for BCM4313 2G-only device
selftests/bpf: Fix prog numbers in test_sockmap
net: esp: cleanup esp_output_tail_tcp() in case of unsupported ESPINTCP
net/smc: Allow SMC-D 1MB DMB allocations
net/smc: set rmb's SG_MAX_SINGLE_ALLOC limitation only when CONFIG_ARCH_NO_SG_CHAIN is defined
selftests/bpf: Check length of recv in test_sockmap
lib: objagg: Fix general protection fault
mlxsw: spectrum_acl_erp: Fix object nesting warning
mlxsw: spectrum_acl_bloom_filter: Make mlxsw_sp_acl_bf_key_encode() more flexible
mlxsw: spectrum_acl: Fix ACL scale regression and firmware errors
ath11k: dp: stop rx pktlog before suspend
wifi: ath11k: fix wrong handling of CCMP256 and GCMP ciphers
wifi: cfg80211: fix typo in cfg80211_calculate_bitrate_he()
wifi: cfg80211: handle 2x996 RU allocation in cfg80211_calculate_bitrate_he()
net: fec: Refactor: #define magic constants
net: fec: Fix FEC_ECR_EN1588 being cleared on link-down
ipvs: Avoid unnecessary calls to skb_is_gso_sctp
netfilter: nf_tables: rise cap on SELinux secmark context
perf/x86/intel/pt: Fix pt_topa_entry_for_page() address calculation
perf: Fix perf_aux_size() for greater-than 32-bit size
perf: Prevent passing zero nr_pages to rb_alloc_aux()
qed: Improve the stack space of filter_config()
wifi: virt_wifi: avoid reporting connection success with wrong SSID
gss_krb5: Fix the error handling path for crypto_sync_skcipher_setkey
wifi: virt_wifi: don't use strlen() in const context
selftests/bpf: Close fd in error path in drop_on_reuseport
bpf: annotate BTF show functions with __printf
bna: adjust 'name' buf size of bna_tcb and bna_ccb structures
bpf: Eliminate remaining "make W=1" warnings in kernel/bpf/btf.o
selftests: forwarding: devlink_lib: Wait for udev events after reloading
xdp: fix invalid wait context of page_pool_destroy()
drm/panel: boe-tv101wum-nl6: If prepare fails, disable GPIO before regulators
drm/panel: boe-tv101wum-nl6: Check for errors on the NOP in prepare()
media: dvb-usb: Fix unexpected infinite loop in dvb_usb_read_remote_control()
media: imon: Fix race getting ictx->lock
saa7134: Unchecked i2c_transfer function result fixed
media: uvcvideo: Allow entity-defined get_info and get_cur
media: uvcvideo: Override default flags
media: renesas: vsp1: Fix _irqsave and _irq mix
media: renesas: vsp1: Store RPF partition configuration per RPF instance
leds: trigger: Unregister sysfs attributes before calling deactivate()
perf report: Fix condition in sort__sym_cmp()
drm/etnaviv: fix DMA direction handling for cached RW buffers
drm/qxl: Add check for drm_cvt_mode
Revert "leds: led-core: Fix refcount leak in of_led_get()"
ext4: fix infinite loop when replaying fast_commit
media: venus: flush all buffers in output plane streamoff
mfd: omap-usb-tll: Use struct_size to allocate tll
xprtrdma: Rename frwr_release_mr()
xprtrdma: Fix rpcrdma_reqs_reset()
SUNRPC: avoid soft lockup when transmitting UDP to reachable server.
ext4: avoid writing unitialized memory to disk in EA inodes
sparc64: Fix incorrect function signature and add prototype for prom_cif_init
SUNRPC: Fixup gss_status tracepoint error output
PCI: Fix resource double counting on remove & rescan
coresight: Fix ref leak when of_coresight_parse_endpoint() fails
Input: qt1050 - handle CHIP_ID reading error
RDMA/mlx4: Fix truncated output warning in mad.c
RDMA/mlx4: Fix truncated output warning in alias_GUID.c
RDMA/rxe: Don't set BTH_ACK_MASK for UC or UD QPs
ASoC: max98088: Check for clk_prepare_enable() error
mtd: make mtd_test.c a separate module
RDMA/device: Return error earlier if port in not valid
Input: elan_i2c - do not leave interrupt disabled on suspend failure
MIPS: Octeron: remove source file executable bit
powerpc/xmon: Fix disassembly CPU feature checks
macintosh/therm_windtunnel: fix module unload.
RDMA/hns: Fix missing pagesize and alignment check in FRMR
bnxt_re: Fix imm_data endianness
netfilter: ctnetlink: use helper function to calculate expect ID
net: dsa: mv88e6xxx: Limit chip-wide frame size config to CPU ports
net: dsa: b53: Limit chip-wide jumbo frame config to CPU ports
pinctrl: rockchip: update rk3308 iomux routes
pinctrl: core: fix possible memory leak when pinctrl_enable() fails
pinctrl: single: fix possible memory leak when pinctrl_enable() fails
pinctrl: ti: ti-iodelay: Drop if block with always false condition
pinctrl: ti: ti-iodelay: fix possible memory leak when pinctrl_enable() fails
pinctrl: freescale: mxs: Fix refcount of child
fs/proc/task_mmu: indicate PM_FILE for PMD-mapped file THP
fs/nilfs2: remove some unused macros to tame gcc
nilfs2: avoid undefined behavior in nilfs_cnt32_ge macro
rtc: interface: Add RTC offset to alarm after fix-up
dt-bindings: thermal: correct thermal zone node name limit
tick/broadcast: Make takeover of broadcast hrtimer reliable
net: netconsole: Disable target before netpoll cleanup
af_packet: Handle outgoing VLAN packets without hardware offloading
ipv6: take care of scope when choosing the src addr
sched/fair: set_load_weight() must also call reweight_task() for SCHED_IDLE tasks
char: tpm: Fix possible memory leak in tpm_bios_measurements_open()
media: venus: fix use after free in vdec_close
hfs: fix to initialize fields of hfs_inode_info after hfs_alloc_inode()
ext2: Verify bitmap and itable block numbers before using them
drm/gma500: fix null pointer dereference in cdv_intel_lvds_get_modes
drm/gma500: fix null pointer dereference in psb_intel_lvds_get_modes
scsi: qla2xxx: Fix optrom version displayed in FDMI
drm/amd/display: Check for NULL pointer
sched/fair: Use all little CPUs for CPU-bound workloads
apparmor: use kvfree_sensitive to free data->data
task_work: s/task_work_cancel()/task_work_cancel_func()/
task_work: Introduce task_work_cancel() again
udf: Avoid using corrupted block bitmap buffer
m68k: amiga: Turn off Warp1260 interrupts during boot
ext4: check dot and dotdot of dx_root before making dir indexed
ext4: make sure the first directory block is not a hole
wifi: mwifiex: Fix interface type change
leds: ss4200: Convert PCIBIOS_* return codes to errnos
jbd2: make jbd2_journal_get_max_txn_bufs() internal
KVM: VMX: Split out the non-virtualization part of vmx_interrupt_blocked()
tools/memory-model: Fix bug in lock.cat
hwrng: amd - Convert PCIBIOS_* return codes to errnos
PCI: hv: Return zero, not garbage, when reading PCI_INTERRUPT_PIN
PCI: rockchip: Use GPIOD_OUT_LOW flag while requesting ep_gpio
binder: fix hang of unregistered readers
dev/parport: fix the array out-of-bounds risk
scsi: qla2xxx: Return ENOBUFS if sg_cnt is more than one for ELS cmds
f2fs: fix to don't dirty inode for readonly filesystem
clk: davinci: da8xx-cfgchip: Initialize clk_init_data before use
ubi: eba: properly rollback inside self_check_eba
decompress_bunzip2: fix rare decompression failure
kbuild: Fix '-S -c' in x86 stack protector scripts
kobject_uevent: Fix OOB access within zap_modalias_env()
devres: Fix devm_krealloc() wasting memory
rtc: cmos: Fix return value of nvmem callbacks
scsi: qla2xxx: During vport delete send async logout explicitly
scsi: qla2xxx: Fix for possible memory corruption
scsi: qla2xxx: Fix flash read failure
scsi: qla2xxx: Complete command early within lock
scsi: qla2xxx: validate nvme_local_port correctly
perf/x86/intel/pt: Fix topa_entry base length
perf/x86/intel/pt: Fix a topa_entry base address calculation
rtc: isl1208: Fix return value of nvmem callbacks
watchdog/perf: properly initialize the turbo mode timestamp and rearm counter
platform: mips: cpu_hwmon: Disable driver on unsupported hardware
RDMA/iwcm: Fix a use-after-free related to destroying CM IDs
selftests/sigaltstack: Fix ppc64 GCC build
rbd: don't assume rbd_is_lock_owner() for exclusive mappings
MIPS: ip30: ip30-console: Add missing include
MIPS: Loongson64: env: Hook up Loongsson-2K
drm/panfrost: Mark simple_ondemand governor as softdep
rbd: rename RBD_LOCK_STATE_RELEASING and releasing_wait
rbd: don't assume RBD_LOCK_STATE_LOCKED for exclusive mappings
Bluetooth: btusb: Add RTL8852BE device 0489:e125 to device tables
Bluetooth: btusb: Add Realtek RTL8852BE support ID 0x13d3:0x3591
nilfs2: handle inconsistent state in nilfs_btnode_create_block()
io_uring/io-wq: limit retrying worker initialisation
kernel: rerun task_work while freezing in get_signal()
kdb: address -Wformat-security warnings
kdb: Use the passed prompt in kdb_position_cursor()
jfs: Fix array-index-out-of-bounds in diFree
um: time-travel: fix time-travel-start option
f2fs: fix start segno of large section
libbpf: Fix no-args func prototype BTF dumping syntax
dma: fix call order in dmam_free_coherent
MIPS: SMP-CPS: Fix address for GCR_ACCESS register for CM3 and later
ipv4: Fix incorrect source address in Record Route option
net: bonding: correctly annotate RCU in bond_should_notify_peers()
netfilter: nft_set_pipapo_avx2: disable softinterrupts
tipc: Return non-zero value from tipc_udp_addr2str() on error
net: stmmac: Correct byte order of perfect_match
net: nexthop: Initialize all fields in dumped nexthops
bpf: Fix a segment issue when downgrading gso_size
mISDN: Fix a use after free in hfcmulti_tx()
apparmor: Fix null pointer deref when receiving skb during sock creation
powerpc: fix a file leak in kvm_vcpu_ioctl_enable_cap()
lirc: rc_dev_get_from_fd(): fix file leak
ASoC: Intel: use soc_intel_is_byt_cr() only when IOSF_MBI is reachable
ceph: fix incorrect kmalloc size of pagevec mempool
nvme: split command copy into a helper
nvme-pci: add missing condition check for existence of mapped data
fs: don't allow non-init s_user_ns for filesystems without FS_USERNS_MOUNT
powerpc/configs: Update defconfig with now user-visible CONFIG_FSL_IFC
fuse: name fs_context consistently
fuse: verify {g,u}id mount options correctly
sysctl: always initialize i_uid/i_gid
ext4: factor out a common helper to query extent map
ext4: check the extent status again before inserting delalloc block
soc: xilinx: move PM_INIT_FINALIZE to zynqmp_pm_domains driver
drivers: soc: xilinx: check return status of get_api_version()
driver core: Cast to (void *) with __force for __percpu pointer
devres: Fix memory leakage caused by driver API devm_free_percpu()
genirq: Allow the PM device to originate from irq domain
irqchip/imx-irqsteer: Constify irq_chip struct
irqchip/imx-irqsteer: Add runtime PM support
irqchip/imx-irqsteer: Handle runtime power management correctly
remoteproc: imx_rproc: ignore mapping vdev regions
remoteproc: imx_rproc: Fix ignoring mapping vdev regions
remoteproc: imx_rproc: Skip over memory region when node value is NULL
drm/nouveau: prime: fix refcount underflow
drm/vmwgfx: Fix overlay when using Screen Targets
sched: act_ct: take care of padding in struct zones_ht_key
net/iucv: fix use after free in iucv_sock_close()
net/mlx5e: Add a check for the return value from mlx5_port_set_eth_ptys
ipv6: fix ndisc_is_useropt() handling for PIO
riscv/mm: Add handling for VM_FAULT_SIGSEGV in mm_fault_error()
platform/chrome: cros_ec_proto: Lock device when updating MKBP version
HID: wacom: Modify pen IDs
protect the fetch of ->fd[fd] in do_dup2() from mispredictions
ALSA: usb-audio: Correct surround channels in UAC1 channel map
ALSA: hda/realtek: Add quirk for Acer Aspire E5-574G
net: usb: sr9700: fix uninitialized variable use in sr_mdio_read
r8169: don't increment tx_dropped in case of NETDEV_TX_BUSY
mptcp: fix duplicate data handling
netfilter: ipset: Add list flush to cancel_gc
genirq: Allow irq_chip registration functions to take a const irq_chip
irqchip/mbigen: Fix mbigen node address layout
x86/mm: Fix pti_clone_pgtable() alignment assumption
x86/mm: Fix pti_clone_entry_text() for i386
sctp: move hlist_node and hashent out of sctp_ep_common
sctp: Fix null-ptr-deref in reuseport_add_sock().
net: usb: qmi_wwan: fix memory leak for not ip packets
net: linkwatch: use system_unbound_wq
Bluetooth: l2cap: always unlock channel in l2cap_conless_channel()
net: dsa: bcm_sf2: Fix a possible memory leak in bcm_sf2_mdio_register()
l2tp: fix lockdep splat
net: fec: Stop PPS on driver remove
rcutorture: Fix rcu_torture_fwd_cb_cr() data race
md: do not delete safemode_timer in mddev_suspend
md/raid5: avoid BUG_ON() while continue reshape after reassembling
clocksource/drivers/sh_cmt: Address race condition for clock events
ACPI: battery: create alarm sysfs attribute atomically
ACPI: SBS: manage alarm sysfs attribute through psy core
selftests/bpf: Fix send_signal test with nested CONFIG_PARAVIRT
PCI: Add Edimax Vendor ID to pci_ids.h
udf: prevent integer overflow in udf_bitmap_free_blocks()
wifi: nl80211: don't give key data to userspace
btrfs: fix bitmap leak when loading free space cache on duplicate entry
drm/amdgpu: Fix the null pointer dereference to ras_manager
drm/amdgpu/pm: Fix the null pointer dereference in apply_state_adjust_rules
media: uvcvideo: Ignore empty TS packets
media: uvcvideo: Fix the bandwdith quirk on USB 3.x
jbd2: avoid memleak in jbd2_journal_write_metadata_buffer
s390/sclp: Prevent release of buffer in I/O
SUNRPC: Fix a race to wake a sync task
sched/cputime: Fix mul_u64_u64_div_u64() precision for cputime
ext4: fix wrong unit use in ext4_mb_find_by_goal
arm64: cpufeature: Force HWCAP to be based on the sysreg visible to user-space
arm64: Add Neoverse-V2 part
arm64: cputype: Add Cortex-X4 definitions
arm64: cputype: Add Neoverse-V3 definitions
arm64: errata: Add workaround for Arm errata 3194386 and 3312417
arm64: cputype: Add Cortex-X3 definitions
arm64: cputype: Add Cortex-A720 definitions
arm64: cputype: Add Cortex-X925 definitions
arm64: errata: Unify speculative SSBS errata logic
arm64: errata: Expand speculative SSBS workaround
arm64: cputype: Add Cortex-X1C definitions
arm64: cputype: Add Cortex-A725 definitions
arm64: errata: Expand speculative SSBS workaround (again)
i2c: smbus: Improve handling of stuck alerts
ASoC: codecs: wsa881x: Correct Soundwire ports mask
i2c: smbus: Send alert notifications to all devices if source not found
bpf: kprobe: remove unused declaring of bpf_kprobe_override
kprobes: Fix to check symbol prefixes correctly
spi: spi-fsl-lpspi: Fix scldiv calculation
ALSA: usb-audio: Re-add ScratchAmp quirk entries
drm/client: fix null pointer dereference in drm_client_modeset_probe
ALSA: line6: Fix racy access to midibuf
ALSA: hda: Add HP MP9 G4 Retail System AMS to force connect list
ALSA: hda/hdmi: Yet more pin fix for HP EliteDesk 800 G4
usb: vhci-hcd: Do not drop references before new references are gained
USB: serial: debug: do not echo input by default
usb: gadget: core: Check for unset descriptor
usb: gadget: u_serial: Set start_delayed during suspend
scsi: ufs: core: Fix hba->last_dme_cmd_tstamp timestamp updating logic
tick/broadcast: Move per CPU pointer access into the atomic section
ntp: Clamp maxerror and esterror to operating range
driver core: Fix uevent_show() vs driver detach race
ntp: Safeguard against time_constant overflow
scsi: mpt3sas: Remove scsi_dma_map() error messages
scsi: mpt3sas: Avoid IOMMU page faults on REPORT ZONES
irqchip/meson-gpio: support more than 8 channels gpio irq
irqchip/meson-gpio: Convert meson_gpio_irq_controller::lock to 'raw_spinlock_t'
serial: core: check uartclk for zero to avoid divide by zero
irqchip/xilinx: Fix shift out of bounds
genirq/irqdesc: Honor caller provided affinity in alloc_desc()
power: supply: axp288_charger: Fix constant_charge_voltage writes
power: supply: axp288_charger: Round constant_charge_voltage writes down
tracing: Fix overflow in get_free_elt()
padata: Fix possible divide-by-0 panic in padata_mt_helper()
x86/mtrr: Check if fixed MTRRs exist before saving them
drm/bridge: analogix_dp: properly handle zero sized AUX transactions
drm/mgag200: Set DDC timeout in milliseconds
mptcp: sched: check both directions for backup
mptcp: distinguish rcv vs sent backup flag in requests
mptcp: fix NL PM announced address accounting
mptcp: mib: count MPJ with backup flag
mptcp: export local_address
mptcp: pm: fix backup support in signal endpoints
samples: Add fs error monitoring example
samples: Make fs-monitor depend on libc and headers
Add gitignore file for samples/fanotify/ subdirectory
Fix gcc 4.9 build issue in 5.10.y
PCI/DPC: Fix use-after-free on concurrent DPC and hot-removal
netfilter: nf_tables: set element extended ACK reporting support
netfilter: nf_tables: use timestamp to check for set element timeout
netfilter: nf_tables: allow clone callbacks to sleep
netfilter: nf_tables: prefer nft_chain_validate
drm/i915/gem: Fix Virtual Memory mapping boundaries calculation
powerpc: Avoid nmi_enter/nmi_exit in real mode interrupt.
arm64: cpufeature: Fix the visibility of compat hwcaps
media: uvcvideo: Use entity get_cur in uvc_ctrl_set
exec: Fix ToCToU between perm check and set-uid/gid usage
nvme/pci: Add APST quirk for Lenovo N60z laptop
vdpa: Make use of PFN_PHYS/PFN_UP/PFN_DOWN helper macro
vhost-vdpa: switch to use vmf_insert_pfn() in the fault handler
wifi: cfg80211: restrict NL80211_ATTR_TXQ_QUANTUM values
ARM: dts: imx6qdl-kontron-samx6i: fix phy-mode
media: Revert "media: dvb-usb: Fix unexpected infinite loop in dvb_usb_read_remote_control()"
Linux 5.10.224
Change-Id: I7cd19d506c4c86df918a280598946060a494a161
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
commit f409530e4db9dd11b88cb7703c97c8f326ff6566 upstream.
Re-introduce task_work_cancel(), this time to cancel an actual callback
and not *any* callback pointing to a given function. This is going to be
needed for perf events event freeing.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240621091601.18227-3-frederic@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 68cbd415dd4b9c5b9df69f0f091879e56bf5907a upstream.
A proper task_work_cancel() API that actually cancels a callback and not
*any* callback pointing to a given function is going to be needed for
perf events event freeing. Do the appropriate rename to prepare for
that.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240621091601.18227-2-frederic@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A previous commit changed the arguments to task_work_cancel_match(), but
didn't document all of them.
Bug: 254441685
Link: https://lkml.kernel.org/r/93938bff-baa3-4091-85f5-784aae297a07@kernel.dk
Fixes: c7aab1a7c52b ("task_work: add helper for more targeted task_work canceling")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202309120307.zis3yQGe-lkp@intel.com/
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit 4653e5dd04cb869526477a76b87d0aa1a5c65101)
Signed-off-by: Lee Jones <joneslee@google.com>
Change-Id: Ic85484d66e03f94307f6d1ea5ab32c9a182428c5
[ Upstream commit c7aab1a7c52b82d9afd7e03c398eb03dc2aa0507 ]
The only exported helper we have right now is task_work_cancel(), which
cancels any task_work from a given task where func matches the queued
work item. This is a bit too coarse for some use cases. Add a
task_work_cancel_match() that allows to more specifically target
individual work items outside of purely the callback function used.
task_work_cancel() can be trivially implemented on top of that, hence do
so.
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Change-Id: Ia33480d209b26d433a3ca196972d6931aa4f8dde
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit ed30050329)
Bug: 268174392
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
[ Upstream commit 114518eb6430b832d2f9f5a008043b913ccf0e24 ]
If the arch supports TIF_NOTIFY_SIGNAL, then use that for TWA_SIGNAL as
it's more efficient than using the signal delivery method. This is
especially true on threaded applications, where ->sighand is shared across
threads, but it's also lighter weight on non-shared cases.
io_uring is a heavy consumer of TWA_SIGNAL based task_work. A test with
threads shows a nice improvement running an io_uring based echo server.
stock kernel:
0.01% <= 0.1 milliseconds
95.86% <= 0.2 milliseconds
98.27% <= 0.3 milliseconds
99.71% <= 0.4 milliseconds
100.00% <= 0.5 milliseconds
100.00% <= 0.6 milliseconds
100.00% <= 0.7 milliseconds
100.00% <= 0.8 milliseconds
100.00% <= 0.9 milliseconds
100.00% <= 1.0 milliseconds
100.00% <= 1.1 milliseconds
100.00% <= 2 milliseconds
100.00% <= 3 milliseconds
100.00% <= 3 milliseconds
1378930.00 requests per second
~1600% CPU
1.38M requests/second, and all 16 CPUs are maxed out.
patched kernel:
0.01% <= 0.1 milliseconds
98.24% <= 0.2 milliseconds
99.47% <= 0.3 milliseconds
99.99% <= 0.4 milliseconds
100.00% <= 0.5 milliseconds
100.00% <= 0.6 milliseconds
100.00% <= 0.7 milliseconds
100.00% <= 0.8 milliseconds
100.00% <= 0.9 milliseconds
100.00% <= 1.2 milliseconds
1666111.38 requests per second
~1450% CPU
1.67M requests/second, and we're no longer just hammering on the sighand
lock. The original reporter states:
"For 5.7.15 my benchmark achieves 1.6M qps and system cpu is at ~80%.
for 5.7.16 or later it achieves only 1M qps and the system cpu is is
at ~100%"
with the only difference there being that TWA_SIGNAL is used
unconditionally in 5.7.16, since it's required to be able to handle the
inability to run task_work if the application is waiting in the kernel
already on an event that needs task_work run to be satisfied. Also see
commit 0ba9c9edcd.
Reported-by: Roman Gershman <romger@amazon.com>
Change-Id: I80788634e6b91012ebf94e0ab6bf897c99f9f732
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/20201026203230.386348-5-axboe@kernel.dk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit eb42e7b304)
Bug: 268174392
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
[ Upstream commit c7aab1a7c52b82d9afd7e03c398eb03dc2aa0507 ]
The only exported helper we have right now is task_work_cancel(), which
cancels any task_work from a given task where func matches the queued
work item. This is a bit too coarse for some use cases. Add a
task_work_cancel_match() that allows to more specifically target
individual work items outside of purely the callback function used.
task_work_cancel() can be trivially implemented on top of that, hence do
so.
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 114518eb6430b832d2f9f5a008043b913ccf0e24 ]
If the arch supports TIF_NOTIFY_SIGNAL, then use that for TWA_SIGNAL as
it's more efficient than using the signal delivery method. This is
especially true on threaded applications, where ->sighand is shared across
threads, but it's also lighter weight on non-shared cases.
io_uring is a heavy consumer of TWA_SIGNAL based task_work. A test with
threads shows a nice improvement running an io_uring based echo server.
stock kernel:
0.01% <= 0.1 milliseconds
95.86% <= 0.2 milliseconds
98.27% <= 0.3 milliseconds
99.71% <= 0.4 milliseconds
100.00% <= 0.5 milliseconds
100.00% <= 0.6 milliseconds
100.00% <= 0.7 milliseconds
100.00% <= 0.8 milliseconds
100.00% <= 0.9 milliseconds
100.00% <= 1.0 milliseconds
100.00% <= 1.1 milliseconds
100.00% <= 2 milliseconds
100.00% <= 3 milliseconds
100.00% <= 3 milliseconds
1378930.00 requests per second
~1600% CPU
1.38M requests/second, and all 16 CPUs are maxed out.
patched kernel:
0.01% <= 0.1 milliseconds
98.24% <= 0.2 milliseconds
99.47% <= 0.3 milliseconds
99.99% <= 0.4 milliseconds
100.00% <= 0.5 milliseconds
100.00% <= 0.6 milliseconds
100.00% <= 0.7 milliseconds
100.00% <= 0.8 milliseconds
100.00% <= 0.9 milliseconds
100.00% <= 1.2 milliseconds
1666111.38 requests per second
~1450% CPU
1.67M requests/second, and we're no longer just hammering on the sighand
lock. The original reporter states:
"For 5.7.15 my benchmark achieves 1.6M qps and system cpu is at ~80%.
for 5.7.16 or later it achieves only 1M qps and the system cpu is is
at ~100%"
with the only difference there being that TWA_SIGNAL is used
unconditionally in 5.7.16, since it's required to be able to handle the
inability to run task_work if the application is waiting in the kernel
already on an event that needs task_work run to be satisfied. Also see
commit 0ba9c9edcd.
Reported-by: Roman Gershman <romger@amazon.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/20201026203230.386348-5-axboe@kernel.dk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Why record task_work_add() call stack? Syzbot reports many use-after-free
issues for task_work, see [1]. After seeing the free stack and the
current auxiliary stack, we think they are useless, we don't know where
the work was registered. This work may be the free call stack, so we miss
the root cause and don't solve the use-after-free.
Add the task_work_add() call stack into the KASAN auxiliary stack in order
to improve KASAN reports. It helps programmers solve use-after-free
issues.
[1]: https://groups.google.com/g/syzkaller-bugs/search?q=kasan%20use-after-free%20task_work_run
Link: https://lkml.kernel.org/r/20210316024410.19967-1-walter-zh.wu@mediatek.com
Signed-off-by: Walter Wu <walter-zh.wu@mediatek.com>
Suggested-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
(cherry picked from commit 357e2e021b3a5c473b43a5a4d752139564bf27b8
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git akpm)
Bug: 182930667
Signed-off-by: Alexander Potapenko <glider@google.com>
Change-Id: I38b2e1856ba9605bcdf0fb4fd4a7031596c8fe4a
A previous commit changed the notification mode from true/false to an
int, allowing notify-no, notify-yes, or signal-notify. This was
backwards compatible in the sense that any existing true/false user
would translate to either 0 (on notification sent) or 1, the latter
which mapped to TWA_RESUME. TWA_SIGNAL was assigned a value of 2.
Clean this up properly, and define a proper enum for the notification
mode. Now we have:
- TWA_NONE. This is 0, same as before the original change, meaning no
notification requested.
- TWA_RESUME. This is 1, same as before the original change, meaning
that we use TIF_NOTIFY_RESUME.
- TWA_SIGNAL. This uses TIF_SIGPENDING/JOBCTL_TASK_WORK for the
notification.
Clean up all the callers, switching their 0/1/false/true to using the
appropriate TWA_* mode for notifications.
Fixes: e91b481623 ("task_work: teach task_work_add() to do signal_wake_up()")
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
If JOBCTL_TASK_WORK is already set on the targeted task, then we need
not go through {lock,unlock}_task_sighand() to set it again and queue
a signal wakeup. This is safe as we're checking it _after_ adding the
new task_work with cmpxchg().
The ordering is as follows:
task_work_add() get_signal()
--------------------------------------------------------------
STORE(task->task_works, new_work); STORE(task->jobctl);
mb(); mb();
LOAD(task->jobctl); LOAD(task->task_works);
This speeds up TWA_SIGNAL handling quite a bit, which is important now
that io_uring is relying on it for all task_work deliveries.
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jann Horn <jannh@google.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
So that the target task will exit the wait_event_interruptible-like
loop and call task_work_run() asap.
The patch turns "bool notify" into 0,TWA_RESUME,TWA_SIGNAL enum, the
new TWA_SIGNAL flag implies signal_wake_up(). However, it needs to
avoid the race with recalc_sigpending(), so the patch also adds the
new JOBCTL_TASK_WORK bit included in JOBCTL_PENDING_MASK.
TODO: once this patch is merged we need to change all current users
of task_work_add(notify = true) to use TWA_RESUME.
Cc: stable@vger.kernel.org # v5.7
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
As Peter pointed out, task_work() can avoid ->pi_lock and cmpxchg()
if task->task_works == NULL && !PF_EXITING.
And in fact the only reason why task_work_run() needs ->pi_lock is
the possible race with task_work_cancel(), we can optimize this code
and make the locking more clear.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
READ_ONCE() now has an implicit smp_read_barrier_depends() call, so it
can be used instead of lockless_dereference() without any change in
semantics.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1508840570-22169-4-git-send-email-will.deacon@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
There is no agreed-upon definition of spin_unlock_wait()'s semantics,
and it appears that all callers could do just as well with a lock/unlock
pair. This commit therefore replaces the spin_unlock_wait() call in
task_work_run() with a spin_lock_irq() and a spin_unlock_irq() aruond
the cmpxchg() dequeue loop. This should be safe from a performance
perspective because ->pi_lock is local to the task and because calls to
the other side of the race, task_work_cancel(), should be rare.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Change task_work_cancel() to use lockless_dereference(), this is what
the code really wants but we didn't have this helper when it was
written.
Also add the fast-path task->task_works == NULL check, in the likely
case this task has no pending works and we can avoid
spin_lock(task->pi_lock).
While at it, change other users of ACCESS_ONCE() to use READ_ONCE().
Link: http://lkml.kernel.org/r/20160610150042.GA13868@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Andrea Parri <parri.andrea@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With the modified semantics of spin_unlock_wait() a number of
explicit barriers can be removed. Also update the comment for the
do_exit() usecase, as that was somewhat stale/obscure.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
In commit f341861fb0 ("task_work: add a scheduling point in
task_work_run()") I fixed a latency problem adding a cond_resched()
call.
Later, commit ac3d0da8f3 added yet another loop to reverse a list,
bringing back the latency spike :
I've seen in some cases this loop taking 275 ms, if for example a
process with 2,000,000 files is killed.
We could add yet another cond_resched() in the reverse loop, or we
can simply remove the reversal, as I do not think anything
would depend on order of task_work_add() submitted works.
Fixes: ac3d0da8f3 ("task_work: Make task_work_add() lockless")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Maciej Żenczykowski <maze@google.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ed3e694d "move exit_task_work() past exit_files() et.al" destroyed
the add/exit synchronization we had, the caller itself should ensure
task_work_add() can't race with the exiting task.
However, this is not convenient/simple, and the only user which tries
to do this is buggy (see the next patch). Unless the task is current,
there is simply no way to do this in general.
Change exit_task_work()->task_work_run() to use the dummy "work_exited"
entry to let task_work_add() know it should fail.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20120826191211.GA4228@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Change task_work's to use llist-like code to avoid pi_lock
in task_work_add(), this makes it useable under rq->lock.
task_work_cancel() and task_work_run() still use pi_lock
to synchronize with each other.
(This is in preparation for a deadlock fix.)
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20120826191209.GA4221@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
It seems commit 4a9d4b024a ("switch fput to task_work_add") re-
introduced the problem addressed in 944be0b224 ("close_files(): add
scheduling point")
If a server process with a lot of files (say 2 million tcp sockets) is
killed, we can spend a lot of time in task_work_run() and trigger a soft
lockup.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It doesn't matter on normal return to userland path (we'll recheck the
NOTIFY_RESUME flag anyway), but in case of exit_task_work() we'll
need that as soon as we get callbacks capable of triggering more
task_work_add().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
task_work and rcu_head are identical now; merge them (calling the result
struct callback_head, rcu_head #define'd to it), kill separate allocation
in security/keys since we can just use cred->rcu now.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
layout based on Oleg's suggestion; single-linked list,
task->task_works points to the last element, forward pointer
from said last element points to head. I'd still prefer
much more regular scheme with two pointers in task_work,
but...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Provide a simple mechanism that allows running code in the (nonatomic)
context of the arbitrary task.
The caller does task_work_add(task, task_work) and this task executes
task_work->func() either from do_notify_resume() or from do_exit(). The
callback can rely on PF_EXITING to detect the latter case.
"struct task_work" can be embedded in another struct, still it has "void
*data" to handle the most common/simple case.
This allows us to kill the ->replacement_session_keyring hack, and
potentially this can have more users.
Performance-wise, this adds 2 "unlikely(!hlist_empty())" checks into
tracehook_notify_resume() and do_exit(). But at the same time we can
remove the "replacement_session_keyring != NULL" checks from
arch/*/signal.c and exit_creds().
Note: task_work_add/task_work_run abuses ->pi_lock. This is only because
this lock is already used by lookup_pi_state() to synchronize with
do_exit() setting PF_EXITING. Fortunately the scope of this lock in
task_work.c is really tiny, and the code is unlikely anyway.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Alexander Gordeev <agordeev@redhat.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: David Smith <dsmith@redhat.com>
Cc: "Frank Ch. Eigler" <fche@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>