Commit Graph

365 Commits

Author SHA1 Message Date
Greg Kroah-Hartman
2ebd481b31 Merge 5.10.221 into android12-5.10-lts
Changes in 5.10.221
	tracing/selftests: Fix kprobe event name test for .isra. functions
	null_blk: Print correct max open zones limit in null_init_zoned_dev()
	wifi: mac80211: mesh: Fix leak of mesh_preq_queue objects
	wifi: mac80211: Fix deadlock in ieee80211_sta_ps_deliver_wakeup()
	wifi: cfg80211: pmsr: use correct nla_get_uX functions
	wifi: iwlwifi: mvm: revert gen2 TX A-MPDU size to 64
	wifi: iwlwifi: dbg_ini: move iwl_dbg_tlv_free outside of debugfs ifdef
	wifi: iwlwifi: mvm: check n_ssids before accessing the ssids
	wifi: iwlwifi: mvm: don't read past the mfuart notifcation
	wifi: mac80211: correctly parse Spatial Reuse Parameter Set element
	net/ncsi: add NCSI Intel OEM command to keep PHY up
	net/ncsi: Simplify Kconfig/dts control flow
	net/ncsi: Fix the multi thread manner of NCSI driver
	ipv6: sr: block BH in seg6_output_core() and seg6_input_core()
	net: sched: sch_multiq: fix possible OOB write in multiq_tune()
	vxlan: Fix regression when dropping packets due to invalid src addresses
	tcp: count CLOSE-WAIT sockets for TCP_MIB_CURRESTAB
	net/sched: taprio: always validate TCA_TAPRIO_ATTR_PRIOMAP
	ptp: Fix error message on failed pin verification
	af_unix: Annotate data-race of sk->sk_state in unix_inq_len().
	af_unix: Annotate data-races around sk->sk_state in unix_write_space() and poll().
	af_unix: Annotate data-races around sk->sk_state in sendmsg() and recvmsg().
	af_unix: Annotate data-races around sk->sk_state in UNIX_DIAG.
	af_unix: Annotate data-race of net->unx.sysctl_max_dgram_qlen.
	af_unix: Use unix_recvq_full_lockless() in unix_stream_connect().
	af_unix: Use skb_queue_len_lockless() in sk_diag_show_rqlen().
	af_unix: Annotate data-race of sk->sk_shutdown in sk_diag_fill().
	ipv6: fix possible race in __fib6_drop_pcpu_from()
	usb: gadget: f_fs: Fix race between aio_cancel() and AIO request complete
	drm/amd/display: Handle Y carry-over in VCP X.Y calculation
	serial: sc16is7xx: replace hardcoded divisor value with BIT() macro
	serial: sc16is7xx: fix bug in sc16is7xx_set_baud() when using prescaler
	mmc: davinci: Don't strip remove function when driver is builtin
	selftests/mm: compaction_test: fix incorrect write of zero to nr_hugepages
	selftests/mm: conform test to TAP format output
	selftests/mm: compaction_test: fix bogus test success on Aarch64
	btrfs: fix leak of qgroup extent records after transaction abort
	nilfs2: Remove check for PageError
	nilfs2: return the mapped address from nilfs_get_page()
	nilfs2: fix nilfs_empty_dir() misjudgment and long loop on I/O errors
	USB: class: cdc-wdm: Fix CPU lockup caused by excessive log messages
	mei: me: release irq in mei_me_pci_resume error path
	jfs: xattr: fix buffer overflow for invalid xattr
	xhci: Set correct transferred length for cancelled bulk transfers
	xhci: Apply reset resume quirk to Etron EJ188 xHCI host
	xhci: Apply broken streams quirk to Etron EJ188 xHCI host
	scsi: mpt3sas: Avoid test/set_bit() operating in non-allocated memory
	powerpc/uaccess: Fix build errors seen with GCC 13/14
	Input: try trimming too long modalias strings
	SUNRPC: return proper error from gss_wrap_req_priv
	gpio: tqmx86: fix typo in Kconfig label
	HID: core: remove unnecessary WARN_ON() in implement()
	gpio: tqmx86: store IRQ trigger type and unmask status separately
	iommu/amd: Introduce pci segment structure
	iommu/amd: Fix sysfs leak in iommu init
	iommu: Return right value in iommu_sva_bind_device()
	HID: logitech-dj: Fix memory leak in logi_dj_recv_switch_to_dj_mode()
	drm/vmwgfx: 3D disabled should not effect STDU memory limits
	net: sfp: Always call `sfp_sm_mod_remove()` on remove
	net: hns3: add cond_resched() to hns3 ring buffer init process
	liquidio: Adjust a NULL pointer handling path in lio_vf_rep_copy_packet
	drm/komeda: check for error-valued pointer
	drm/bridge/panel: Fix runtime warning on panel bridge release
	tcp: fix race in tcp_v6_syn_recv_sock()
	net/mlx5e: Fix features validation check for tunneled UDP (non-VXLAN) packets
	Bluetooth: L2CAP: Fix rejecting L2CAP_CONN_PARAM_UPDATE_REQ
	netfilter: ipset: Fix race between namespace cleanup and gc in the list:set type
	net: stmmac: replace priv->speed with the portTransmitRate from the tc-cbs parameters
	net/ipv6: Fix the RT cache flush via sysctl using a previous delay
	ionic: fix use after netif_napi_del()
	iio: adc: ad9467: fix scan type sign
	iio: dac: ad5592r: fix temperature channel scaling value
	iio: imu: inv_icm42600: delete unneeded update watermark call
	drivers: core: synchronize really_probe() and dev_uevent()
	drm/exynos/vidi: fix memory leak in .get_modes()
	drm/exynos: hdmi: report safe 640x480 mode as a fallback when no EDID found
	vmci: prevent speculation leaks by sanitizing event in event_deliver()
	fs/proc: fix softlockup in __read_vmcore
	ocfs2: use coarse time for new created files
	ocfs2: fix races between hole punching and AIO+DIO
	PCI: rockchip-ep: Remove wrong mask on subsys_vendor_id
	dmaengine: axi-dmac: fix possible race in remove()
	remoteproc: k3-r5: Do not allow core1 to power up before core0 via sysfs
	intel_th: pci: Add Granite Rapids support
	intel_th: pci: Add Granite Rapids SOC support
	intel_th: pci: Add Sapphire Rapids SOC support
	intel_th: pci: Add Meteor Lake-S support
	intel_th: pci: Add Lunar Lake support
	nilfs2: fix potential kernel bug due to lack of writeback flag waiting
	tick/nohz_full: Don't abuse smp_call_function_single() in tick_setup_device()
	serial: 8250_pxa: Configure tx_loadsz to match FIFO IRQ level
	hugetlb_encode.h: fix undefined behaviour (34 << 26)
	mptcp: ensure snd_una is properly initialized on connect
	mptcp: pm: inc RmAddr MIB counter once per RM_ADDR ID
	mptcp: pm: update add_addr counters after connect
	remoteproc: k3-r5: Jump to error handling labels in start/stop errors
	greybus: Fix use-after-free bug in gb_interface_release due to race condition.
	usb-storage: alauda: Check whether the media is initialized
	i2c: at91: Fix the functionality flags of the slave-only interface
	i2c: designware: Fix the functionality flags of the slave-only interface
	zap_pid_ns_processes: clear TIF_NOTIFY_SIGNAL along with TIF_SIGPENDING
	padata: Disable BH when taking works lock on MT path
	rcutorture: Fix rcu_torture_one_read() pipe_count overflow comment
	rcutorture: Fix invalid context warning when enable srcu barrier testing
	block/ioctl: prefer different overflow check
	selftests/bpf: Prevent client connect before server bind in test_tc_tunnel.sh
	selftests/bpf: Fix flaky test btf_map_in_map/lookup_update
	batman-adv: bypass empty buckets in batadv_purge_orig_ref()
	wifi: ath9k: work around memset overflow warning
	af_packet: avoid a false positive warning in packet_setsockopt()
	drop_monitor: replace spin_lock by raw_spin_lock
	scsi: qedi: Fix crash while reading debugfs attribute
	kselftest: arm64: Add a null pointer check
	netpoll: Fix race condition in netpoll_owner_active
	HID: Add quirk for Logitech Casa touchpad
	ACPI: video: Add backlight=native quirk for Lenovo Slim 7 16ARH7
	Bluetooth: ath3k: Fix multiple issues reported by checkpatch.pl
	drm/amd/display: Exit idle optimizations before HDCP execution
	ASoC: Intel: sof_sdw: add JD2 quirk for HP Omen 14
	drm/lima: add mask irq callback to gp and pp
	drm/lima: mask irqs in timeout path before hard reset
	powerpc/pseries: Enforce hcall result buffer validity and size
	powerpc/io: Avoid clang null pointer arithmetic warnings
	power: supply: cros_usbpd: provide ID table for avoiding fallback match
	iommu/arm-smmu-v3: Free MSIs in case of ENOMEM
	f2fs: remove clear SB_INLINECRYPT flag in default_options
	usb: misc: uss720: check for incompatible versions of the Belkin F5U002
	udf: udftime: prevent overflow in udf_disk_stamp_to_time()
	PCI/PM: Avoid D3cold for HP Pavilion 17 PC/1972 PCIe Ports
	MIPS: Octeon: Add PCIe link status check
	serial: exar: adding missing CTI and Exar PCI ids
	MIPS: Routerboard 532: Fix vendor retry check code
	mips: bmips: BCM6358: make sure CBR is correctly set
	tracing: Build event generation tests only as modules
	cipso: fix total option length computation
	netrom: Fix a memory leak in nr_heartbeat_expiry()
	ipv6: prevent possible NULL deref in fib6_nh_init()
	ipv6: prevent possible NULL dereference in rt6_probe()
	xfrm6: check ip6_dst_idev() return value in xfrm6_get_saddr()
	netns: Make get_net_ns() handle zero refcount net
	qca_spi: Make interrupt remembering atomic
	net/sched: act_api: rely on rcu in tcf_idr_check_alloc
	net/sched: act_api: fix possible infinite loop in tcf_idr_check_alloc()
	tipc: force a dst refcount before doing decryption
	net/sched: act_ct: set 'net' pointer when creating new nf_flow_table
	sched: act_ct: add netns into the key of tcf_ct_flow_table
	net: stmmac: No need to calculate speed divider when offload is disabled
	virtio_net: checksum offloading handling fix
	netfilter: ipset: Fix suspicious rcu_dereference_protected()
	net: usb: rtl8150 fix unintiatilzed variables in rtl8150_get_link_ksettings
	regulator: core: Fix modpost error "regulator_get_regmap" undefined
	dmaengine: ioat: switch from 'pci_' to 'dma_' API
	dmaengine: ioat: Drop redundant pci_enable_pcie_error_reporting()
	dmaengine: ioatdma: Fix leaking on version mismatch
	dmaengine: ioat: use PCI core macros for PCIe Capability
	dmaengine: ioatdma: Fix error path in ioat3_dma_probe()
	dmaengine: ioatdma: Fix kmemleak in ioat_pci_probe()
	dmaengine: ioatdma: Fix missing kmem_cache_destroy()
	ACPICA: Revert "ACPICA: avoid Info: mapping multiple BARs. Your kernel is fine."
	RDMA/mlx5: Add check for srq max_sge attribute
	ALSA: hda/realtek: Limit mic boost on N14AP7
	drm/radeon: fix UBSAN warning in kv_dpm.c
	gcov: add support for GCC 14
	kcov: don't lose track of remote references during softirqs
	i2c: ocores: set IACK bit after core is enabled
	dt-bindings: i2c: google,cros-ec-i2c-tunnel: correct path to i2c-controller schema
	drm/amd/display: revert Exit idle optimizations before HDCP execution
	ARM: dts: samsung: smdkv310: fix keypad no-autorepeat
	ARM: dts: samsung: exynos4412-origen: fix keypad no-autorepeat
	ARM: dts: samsung: smdk4412: fix keypad no-autorepeat
	rtlwifi: rtl8192de: Style clean-ups
	wifi: rtlwifi: rtl8192de: Fix 5 GHz TX power
	pmdomain: ti-sci: Fix duplicate PD referrals
	knfsd: LOOKUP can return an illegal error value
	spmi: hisi-spmi-controller: Do not override device identifier
	bcache: fix variable length array abuse in btree_iter
	tracing: Add MODULE_DESCRIPTION() to preemptirq_delay_test
	x86/cpu/vfm: Add new macros to work with (vendor/family/model) values
	x86/cpu: Fix x86_match_cpu() to match just X86_VENDOR_INTEL
	r8169: remove unneeded memory barrier in rtl_tx
	r8169: improve rtl_tx
	r8169: improve rtl8169_start_xmit
	r8169: remove nr_frags argument from rtl_tx_slots_avail
	r8169: remove not needed check in rtl8169_start_xmit
	r8169: Fix possible ring buffer corruption on fragmented Tx packets.
	Revert "kheaders: substituting --sort in archive creation"
	kheaders: explicitly define file modes for archived headers
	perf/core: Fix missing wakeup when waiting for context reference
	PCI: Add PCI_ERROR_RESPONSE and related definitions
	x86/amd_nb: Check for invalid SMN reads
	cifs: missed ref-counting smb session in find
	smb: client: fix deadlock in smb2_find_smb_tcon()
	ACPI: Add quirks for AMD Renoir/Lucienne CPUs to force the D3 hint
	ACPI: x86: Add a quirk for Dell Inspiron 14 2-in-1 for StorageD3Enable
	ACPI: x86: Add another system to quirk list for forcing StorageD3Enable
	ACPI: x86: utils: Add Cezanne to the list for forcing StorageD3Enable
	ACPI: x86: utils: Add Picasso to the list for forcing StorageD3Enable
	ACPI: x86: Force StorageD3Enable on more products
	Input: ili210x - fix ili251x_read_touch_data() return value
	pinctrl: fix deadlock in create_pinctrl() when handling -EPROBE_DEFER
	pinctrl: rockchip: fix pinmux bits for RK3328 GPIO2-B pins
	pinctrl: rockchip: fix pinmux bits for RK3328 GPIO3-B pins
	pinctrl/rockchip: separate struct rockchip_pin_bank to a head file
	pinctrl: rockchip: use dedicated pinctrl type for RK3328
	pinctrl: rockchip: fix pinmux reset in rockchip_pmx_set
	drm/amdgpu: fix UBSAN warning in kv_dpm.c
	netfilter: nf_tables: validate family when identifying table via handle
	SUNRPC: Fix null pointer dereference in svc_rqst_free()
	SUNRPC: Fix a NULL pointer deref in trace_svc_stats_latency()
	SUNRPC: Fix svcxdr_init_decode's end-of-buffer calculation
	SUNRPC: Fix svcxdr_init_encode's buflen calculation
	nfsd: hold a lighter-weight client reference over CB_RECALL_ANY
	ASoC: fsl-asoc-card: set priv->pdev before using it
	net: dsa: microchip: fix initial port flush problem
	net: phy: micrel: add Microchip KSZ 9477 to the device table
	xdp: Move the rxq_info.mem clearing to unreg_mem_model()
	xdp: Allow registering memory model without rxq reference
	xdp: Remove WARN() from __xdp_reg_mem_model()
	sparc: fix old compat_sys_select()
	sparc: fix compat recv/recvfrom syscalls
	parisc: use correct compat recv/recvfrom syscalls
	netfilter: nf_tables: fully validate NFT_DATA_VALUE on store to data registers
	drm/panel: ilitek-ili9881c: Fix warning with GPIO controllers that sleep
	mtd: partitions: redboot: Added conversion of operands to a larger type
	bpf: Add a check for struct bpf_fib_lookup size
	net/iucv: Avoid explicit cpumask var allocation on stack
	net/dpaa2: Avoid explicit cpumask var allocation on stack
	ALSA: emux: improve patch ioctl data validation
	media: dvbdev: Initialize sbuf
	soc: ti: wkup_m3_ipc: Send NULL dummy message instead of pointer message
	drm/radeon/radeon_display: Decrease the size of allocated memory
	nvme: fixup comment for nvme RDMA Provider Type
	drm/panel: simple: Add missing display timing flags for KOE TX26D202VM0BWA
	gpio: davinci: Validate the obtained number of IRQs
	gpiolib: cdev: Disallow reconfiguration without direction (uAPI v1)
	x86: stop playing stack games in profile_pc()
	ocfs2: fix DIO failure due to insufficient transaction credits
	mmc: sdhci-pci: Convert PCIBIOS_* return codes to errnos
	mmc: sdhci: Do not invert write-protect twice
	mmc: sdhci: Do not lock spinlock around mmc_gpio_get_ro()
	counter: ti-eqep: enable clock at probe
	iio: adc: ad7266: Fix variable checking bug
	iio: chemical: bme680: Fix pressure value output
	iio: chemical: bme680: Fix calibration data variable
	iio: chemical: bme680: Fix overflows in compensate() functions
	iio: chemical: bme680: Fix sensor data read operation
	net: usb: ax88179_178a: improve link status logs
	usb: gadget: printer: SS+ support
	usb: gadget: printer: fix races against disable
	usb: musb: da8xx: fix a resource leak in probe()
	usb: atm: cxacru: fix endpoint checking in cxacru_bind()
	serial: 8250_omap: Implementation of Errata i2310
	tty: mcf: MCF54418 has 10 UARTS
	net: can: j1939: Initialize unused data in j1939_send_one()
	net: can: j1939: recover socket queue on CAN bus error during BAM transmission
	net: can: j1939: enhanced error handling for tightly received RTS messages in xtp_rx_rts_session_new
	kbuild: Install dtb files as 0644 in Makefile.dtbinst
	csky, hexagon: fix broken sys_sync_file_range
	hexagon: fix fadvise64_64 calling conventions
	drm/nouveau/dispnv04: fix null pointer dereference in nv17_tv_get_ld_modes
	drm/i915/gt: Fix potential UAF by revoke of fence registers
	drm/nouveau/dispnv04: fix null pointer dereference in nv17_tv_get_hd_modes
	batman-adv: Don't accept TT entries for out-of-spec VIDs
	ata: ahci: Clean up sysfs file on error
	ata: libata-core: Fix double free on error
	ftruncate: pass a signed offset
	syscalls: fix compat_sys_io_pgetevents_time64 usage
	mtd: spinand: macronix: Add support for serial NAND flash
	pwm: stm32: Refuse too small period requests
	nfs: Leave pages in the pagecache if readpage failed
	ipv6: annotate some data-races around sk->sk_prot
	ipv6: Fix data races around sk->sk_prot.
	tcp: Fix data races around icsk->icsk_af_ops.
	drivers: fix typo in firmware/efi/memmap.c
	efi: Correct comment on efi_memmap_alloc
	efi: memmap: Move manipulation routines into x86 arch tree
	efi: xen: Set EFI_PARAVIRT for Xen dom0 boot on all architectures
	efi/x86: Free EFI memory map only when installing a new one.
	KVM: arm64: vgic-v4: Make the doorbell request robust w.r.t preemption
	ARM: dts: rockchip: rk3066a: add #sound-dai-cells to hdmi node
	arm64: dts: rockchip: Add sound-dai-cells for RK3368
	xdp: xdp_mem_allocator can be NULL in trace_mem_connect().
	serial: 8250_omap: Fix Errata i2310 with RX FIFO level check
	tracing/net_sched: NULL pointer dereference in perf_trace_qdisc_reset()
	Linux 5.10.221

Change-Id: Icac1c62fcbda5102be7ea031121f28d6fee36875
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2024-07-17 09:08:09 +00:00
Greg Kroah-Hartman
88eb084d18 Revert "Merge 5.10.220 into android12-5.10-lts"
This reverts commit 87a7f35a24, reversing
changes made to 640645c85b.

5.10.220 is a bunch of vfs and nfs changes that are not needed in
Android systems, so revert the whole lot all at once, except for the
version number bump.

Change-Id: If28dc2231f27d326d3730716f23545dd0a2cdc75
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2024-07-16 16:33:14 +00:00
Greg Kroah-Hartman
87a7f35a24 Merge 5.10.220 into android12-5.10-lts
Changes in 5.10.220
	SUNRPC: Rename svc_encode_read_payload()
	NFSD: Invoke svc_encode_result_payload() in "read" NFSD encoders
	NFSD: A semicolon is not needed after a switch statement.
	nfsd/nfs3: remove unused macro nfsd3_fhandleres
	NFSD: Clean up the show_nf_may macro
	NFSD: Remove extra "0x" in tracepoint format specifier
	NFSD: Add SPDX header for fs/nfsd/trace.c
	nfsd: Fix error return code in nfsd_file_cache_init()
	SUNRPC: Add xdr_set_scratch_page() and xdr_reset_scratch_buffer()
	SUNRPC: Prepare for xdr_stream-style decoding on the server-side
	NFSD: Add common helpers to decode void args and encode void results
	NFSD: Add tracepoints in nfsd_dispatch()
	NFSD: Add tracepoints in nfsd4_decode/encode_compound()
	NFSD: Replace the internals of the READ_BUF() macro
	NFSD: Replace READ* macros in nfsd4_decode_access()
	NFSD: Replace READ* macros in nfsd4_decode_close()
	NFSD: Replace READ* macros in nfsd4_decode_commit()
	NFSD: Change the way the expected length of a fattr4 is checked
	NFSD: Replace READ* macros that decode the fattr4 size attribute
	NFSD: Replace READ* macros that decode the fattr4 acl attribute
	NFSD: Replace READ* macros that decode the fattr4 mode attribute
	NFSD: Replace READ* macros that decode the fattr4 owner attribute
	NFSD: Replace READ* macros that decode the fattr4 owner_group attribute
	NFSD: Replace READ* macros that decode the fattr4 time_set attributes
	NFSD: Replace READ* macros that decode the fattr4 security label attribute
	NFSD: Replace READ* macros that decode the fattr4 umask attribute
	NFSD: Replace READ* macros in nfsd4_decode_fattr()
	NFSD: Replace READ* macros in nfsd4_decode_create()
	NFSD: Replace READ* macros in nfsd4_decode_delegreturn()
	NFSD: Replace READ* macros in nfsd4_decode_getattr()
	NFSD: Replace READ* macros in nfsd4_decode_link()
	NFSD: Relocate nfsd4_decode_opaque()
	NFSD: Add helpers to decode a clientid4 and an NFSv4 state owner
	NFSD: Add helper for decoding locker4
	NFSD: Replace READ* macros in nfsd4_decode_lock()
	NFSD: Replace READ* macros in nfsd4_decode_lockt()
	NFSD: Replace READ* macros in nfsd4_decode_locku()
	NFSD: Replace READ* macros in nfsd4_decode_lookup()
	NFSD: Add helper to decode NFSv4 verifiers
	NFSD: Add helper to decode OPEN's createhow4 argument
	NFSD: Add helper to decode OPEN's openflag4 argument
	NFSD: Replace READ* macros in nfsd4_decode_share_access()
	NFSD: Replace READ* macros in nfsd4_decode_share_deny()
	NFSD: Add helper to decode OPEN's open_claim4 argument
	NFSD: Replace READ* macros in nfsd4_decode_open()
	NFSD: Replace READ* macros in nfsd4_decode_open_confirm()
	NFSD: Replace READ* macros in nfsd4_decode_open_downgrade()
	NFSD: Replace READ* macros in nfsd4_decode_putfh()
	NFSD: Replace READ* macros in nfsd4_decode_read()
	NFSD: Replace READ* macros in nfsd4_decode_readdir()
	NFSD: Replace READ* macros in nfsd4_decode_remove()
	NFSD: Replace READ* macros in nfsd4_decode_rename()
	NFSD: Replace READ* macros in nfsd4_decode_renew()
	NFSD: Replace READ* macros in nfsd4_decode_secinfo()
	NFSD: Replace READ* macros in nfsd4_decode_setattr()
	NFSD: Replace READ* macros in nfsd4_decode_setclientid()
	NFSD: Replace READ* macros in nfsd4_decode_setclientid_confirm()
	NFSD: Replace READ* macros in nfsd4_decode_verify()
	NFSD: Replace READ* macros in nfsd4_decode_write()
	NFSD: Replace READ* macros in nfsd4_decode_release_lockowner()
	NFSD: Replace READ* macros in nfsd4_decode_cb_sec()
	NFSD: Replace READ* macros in nfsd4_decode_backchannel_ctl()
	NFSD: Replace READ* macros in nfsd4_decode_bind_conn_to_session()
	NFSD: Add a separate decoder to handle state_protect_ops
	NFSD: Add a separate decoder for ssv_sp_parms
	NFSD: Add a helper to decode state_protect4_a
	NFSD: Add a helper to decode nfs_impl_id4
	NFSD: Add a helper to decode channel_attrs4
	NFSD: Replace READ* macros in nfsd4_decode_create_session()
	NFSD: Replace READ* macros in nfsd4_decode_destroy_session()
	NFSD: Replace READ* macros in nfsd4_decode_free_stateid()
	NFSD: Replace READ* macros in nfsd4_decode_getdeviceinfo()
	NFSD: Replace READ* macros in nfsd4_decode_layoutcommit()
	NFSD: Replace READ* macros in nfsd4_decode_layoutget()
	NFSD: Replace READ* macros in nfsd4_decode_layoutreturn()
	NFSD: Replace READ* macros in nfsd4_decode_secinfo_no_name()
	NFSD: Replace READ* macros in nfsd4_decode_sequence()
	NFSD: Replace READ* macros in nfsd4_decode_test_stateid()
	NFSD: Replace READ* macros in nfsd4_decode_destroy_clientid()
	NFSD: Replace READ* macros in nfsd4_decode_reclaim_complete()
	NFSD: Replace READ* macros in nfsd4_decode_fallocate()
	NFSD: Replace READ* macros in nfsd4_decode_nl4_server()
	NFSD: Replace READ* macros in nfsd4_decode_copy()
	NFSD: Replace READ* macros in nfsd4_decode_copy_notify()
	NFSD: Replace READ* macros in nfsd4_decode_offload_status()
	NFSD: Replace READ* macros in nfsd4_decode_seek()
	NFSD: Replace READ* macros in nfsd4_decode_clone()
	NFSD: Replace READ* macros in nfsd4_decode_xattr_name()
	NFSD: Replace READ* macros in nfsd4_decode_setxattr()
	NFSD: Replace READ* macros in nfsd4_decode_listxattrs()
	NFSD: Make nfsd4_ops::opnum a u32
	NFSD: Replace READ* macros in nfsd4_decode_compound()
	NFSD: Remove macros that are no longer used
	nfsd: only call inode_query_iversion in the I_VERSION case
	nfsd: simplify nfsd4_change_info
	nfsd: minor nfsd4_change_attribute cleanup
	nfsd4: don't query change attribute in v2/v3 case
	Revert "nfsd4: support change_attr_type attribute"
	nfsd: add a new EXPORT_OP_NOWCC flag to struct export_operations
	nfsd: allow filesystems to opt out of subtree checking
	nfsd: close cached files prior to a REMOVE or RENAME that would replace target
	exportfs: Add a function to return the raw output from fh_to_dentry()
	nfsd: Fix up nfsd to ensure that timeout errors don't result in ESTALE
	nfsd: Set PF_LOCAL_THROTTLE on local filesystems only
	nfsd: Record NFSv4 pre/post-op attributes as non-atomic
	exec: Don't open code get_close_on_exec
	exec: Move unshare_files to fix posix file locking during exec
	exec: Simplify unshare_files
	exec: Remove reset_files_struct
	kcmp: In kcmp_epoll_target use fget_task
	bpf: In bpf_task_fd_query use fget_task
	proc/fd: In proc_fd_link use fget_task
	Revert "fget: clarify and improve __fget_files() implementation"
	file: Rename __fcheck_files to files_lookup_fd_raw
	file: Factor files_lookup_fd_locked out of fcheck_files
	file: Replace fcheck_files with files_lookup_fd_rcu
	file: Rename fcheck lookup_fd_rcu
	file: Implement task_lookup_fd_rcu
	proc/fd: In tid_fd_mode use task_lookup_fd_rcu
	kcmp: In get_file_raw_ptr use task_lookup_fd_rcu
	file: Implement task_lookup_next_fd_rcu
	proc/fd: In proc_readfd_common use task_lookup_next_fd_rcu
	proc/fd: In fdinfo seq_show don't use get_files_struct
	file: Merge __fd_install into fd_install
	file: In f_dupfd read RLIMIT_NOFILE once.
	file: Merge __alloc_fd into alloc_fd
	file: Rename __close_fd to close_fd and remove the files parameter
	file: Replace ksys_close with close_fd
	inotify: Increase default inotify.max_user_watches limit to 1048576
	fs/lockd: convert comma to semicolon
	NFSD: Fix sparse warning in nfssvc.c
	NFSD: Restore NFSv4 decoding's SAVEMEM functionality
	SUNRPC: Make trace_svc_process() display the RPC procedure symbolically
	SUNRPC: Display RPC procedure names instead of proc numbers
	SUNRPC: Move definition of XDR_UNIT
	NFSD: Update GETATTR3args decoder to use struct xdr_stream
	NFSD: Update ACCESS3arg decoder to use struct xdr_stream
	NFSD: Update READ3arg decoder to use struct xdr_stream
	NFSD: Update WRITE3arg decoder to use struct xdr_stream
	NFSD: Update READLINK3arg decoder to use struct xdr_stream
	NFSD: Fix returned READDIR offset cookie
	NFSD: Add helper to set up the pages where the dirlist is encoded
	NFSD: Update READDIR3args decoders to use struct xdr_stream
	NFSD: Update COMMIT3arg decoder to use struct xdr_stream
	NFSD: Update the NFSv3 DIROPargs decoder to use struct xdr_stream
	NFSD: Update the RENAME3args decoder to use struct xdr_stream
	NFSD: Update the LINK3args decoder to use struct xdr_stream
	NFSD: Update the SETATTR3args decoder to use struct xdr_stream
	NFSD: Update the CREATE3args decoder to use struct xdr_stream
	NFSD: Update the MKDIR3args decoder to use struct xdr_stream
	NFSD: Update the SYMLINK3args decoder to use struct xdr_stream
	NFSD: Update the MKNOD3args decoder to use struct xdr_stream
	NFSD: Update the NFSv2 GETATTR argument decoder to use struct xdr_stream
	NFSD: Update the NFSv2 READ argument decoder to use struct xdr_stream
	NFSD: Update the NFSv2 WRITE argument decoder to use struct xdr_stream
	NFSD: Update the NFSv2 READLINK argument decoder to use struct xdr_stream
	NFSD: Add helper to set up the pages where the dirlist is encoded
	NFSD: Update the NFSv2 READDIR argument decoder to use struct xdr_stream
	NFSD: Update NFSv2 diropargs decoding to use struct xdr_stream
	NFSD: Update the NFSv2 RENAME argument decoder to use struct xdr_stream
	NFSD: Update the NFSv2 LINK argument decoder to use struct xdr_stream
	NFSD: Update the NFSv2 SETATTR argument decoder to use struct xdr_stream
	NFSD: Update the NFSv2 CREATE argument decoder to use struct xdr_stream
	NFSD: Update the NFSv2 SYMLINK argument decoder to use struct xdr_stream
	NFSD: Remove argument length checking in nfsd_dispatch()
	NFSD: Update the NFSv2 GETACL argument decoder to use struct xdr_stream
	NFSD: Add an xdr_stream-based decoder for NFSv2/3 ACLs
	NFSD: Update the NFSv2 SETACL argument decoder to use struct xdr_stream
	NFSD: Update the NFSv2 ACL GETATTR argument decoder to use struct xdr_stream
	NFSD: Update the NFSv2 ACL ACCESS argument decoder to use struct xdr_stream
	NFSD: Clean up after updating NFSv2 ACL decoders
	NFSD: Update the NFSv3 GETACL argument decoder to use struct xdr_stream
	NFSD: Update the NFSv2 SETACL argument decoder to use struct xdr_stream
	NFSD: Clean up after updating NFSv3 ACL decoders
	nfsd: remove unused stats counters
	nfsd: protect concurrent access to nfsd stats counters
	nfsd: report per-export stats
	nfsd4: simplify process_lookup1
	nfsd: simplify process_lock
	nfsd: simplify nfsd_renew
	nfsd: rename lookup_clientid->set_client
	nfsd: refactor set_client
	nfsd: find_cpntf_state cleanup
	nfsd: remove unused set_client argument
	nfsd: simplify nfsd4_check_open_reclaim
	nfsd: cstate->session->se_client -> cstate->clp
	NFSv4_2: SSC helper should use its own config.
	nfs: use change attribute for NFS re-exports
	nfsd: skip some unnecessary stats in the v4 case
	inotify, memcg: account inotify instances to kmemcg
	module: unexport find_module and module_mutex
	module: use RCU to synchronize find_module
	kallsyms: refactor {,module_}kallsyms_on_each_symbol
	kallsyms: only build {,module_}kallsyms_on_each_symbol when required
	fs: add file and path permissions helpers
	namei: introduce struct renamedata
	NFSD: Extract the svcxdr_init_encode() helper
	NFSD: Update the GETATTR3res encoder to use struct xdr_stream
	NFSD: Update the NFSv3 ACCESS3res encoder to use struct xdr_stream
	NFSD: Update the NFSv3 LOOKUP3res encoder to use struct xdr_stream
	NFSD: Update the NFSv3 wccstat result encoder to use struct xdr_stream
	NFSD: Update the NFSv3 READLINK3res encoder to use struct xdr_stream
	NFSD: Update the NFSv3 READ3res encode to use struct xdr_stream
	NFSD: Update the NFSv3 WRITE3res encoder to use struct xdr_stream
	NFSD: Update the NFSv3 CREATE family of encoders to use struct xdr_stream
	NFSD: Update the NFSv3 RENAMEv3res encoder to use struct xdr_stream
	NFSD: Update the NFSv3 LINK3res encoder to use struct xdr_stream
	NFSD: Update the NFSv3 FSSTAT3res encoder to use struct xdr_stream
	NFSD: Update the NFSv3 FSINFO3res encoder to use struct xdr_stream
	NFSD: Update the NFSv3 PATHCONF3res encoder to use struct xdr_stream
	NFSD: Update the NFSv3 COMMIT3res encoder to use struct xdr_stream
	NFSD: Add a helper that encodes NFSv3 directory offset cookies
	NFSD: Count bytes instead of pages in the NFSv3 READDIR encoder
	NFSD: Update the NFSv3 READDIR3res encoder to use struct xdr_stream
	NFSD: Update NFSv3 READDIR entry encoders to use struct xdr_stream
	NFSD: Remove unused NFSv3 directory entry encoders
	NFSD: Reduce svc_rqst::rq_pages churn during READDIR operations
	NFSD: Update the NFSv2 stat encoder to use struct xdr_stream
	NFSD: Update the NFSv2 attrstat encoder to use struct xdr_stream
	NFSD: Update the NFSv2 diropres encoder to use struct xdr_stream
	NFSD: Update the NFSv2 READLINK result encoder to use struct xdr_stream
	NFSD: Update the NFSv2 READ result encoder to use struct xdr_stream
	NFSD: Update the NFSv2 STATFS result encoder to use struct xdr_stream
	NFSD: Add a helper that encodes NFSv3 directory offset cookies
	NFSD: Count bytes instead of pages in the NFSv2 READDIR encoder
	NFSD: Update the NFSv2 READDIR result encoder to use struct xdr_stream
	NFSD: Update the NFSv2 READDIR entry encoder to use struct xdr_stream
	NFSD: Remove unused NFSv2 directory entry encoders
	NFSD: Add an xdr_stream-based encoder for NFSv2/3 ACLs
	NFSD: Update the NFSv2 GETACL result encoder to use struct xdr_stream
	NFSD: Update the NFSv2 SETACL result encoder to use struct xdr_stream
	NFSD: Update the NFSv2 ACL GETATTR result encoder to use struct xdr_stream
	NFSD: Update the NFSv2 ACL ACCESS result encoder to use struct xdr_stream
	NFSD: Clean up after updating NFSv2 ACL encoders
	NFSD: Update the NFSv3 GETACL result encoder to use struct xdr_stream
	NFSD: Update the NFSv3 SETACL result encoder to use struct xdr_stream
	NFSD: Clean up after updating NFSv3 ACL encoders
	NFSD: Add a tracepoint to record directory entry encoding
	NFSD: Clean up NFSDDBG_FACILITY macro
	nfsd: helper for laundromat expiry calculations
	nfsd: Log client tracking type log message as info instead of warning
	nfsd: Fix typo "accesible"
	nfsd: COPY with length 0 should copy to end of file
	nfsd: don't ignore high bits of copy count
	nfsd: report client confirmation status in "info" file
	SUNRPC: Export svc_xprt_received()
	UAPI: nfsfh.h: Replace one-element array with flexible-array member
	NFSD: Use DEFINE_SPINLOCK() for spinlock
	fsnotify: allow fsnotify_{peek,remove}_first_event with empty queue
	Revert "fanotify: limit number of event merge attempts"
	fanotify: reduce event objectid to 29-bit hash
	fanotify: mix event info and pid into merge key hash
	fsnotify: use hash table for faster events merge
	fanotify: limit number of event merge attempts
	fanotify: configurable limits via sysfs
	fanotify: support limited functionality for unprivileged users
	fanotify_user: use upper_32_bits() to verify mask
	nfsd: remove unused function
	nfsd: removed unused argument in nfsd_startup_generic()
	nfsd: hash nfs4_files by inode number
	nfsd: track filehandle aliasing in nfs4_files
	nfsd: reshuffle some code
	nfsd: grant read delegations to clients holding writes
	nfsd: Fix fall-through warnings for Clang
	NFSv4.2: Remove ifdef CONFIG_NFSD from NFSv4.2 client SSC code.
	NFS: fix nfs_fetch_iversion()
	fanotify: fix permission model of unprivileged group
	NFSD: Add an RPC authflavor tracepoint display helper
	NFSD: Add nfsd_clid_cred_mismatch tracepoint
	NFSD: Add nfsd_clid_verf_mismatch tracepoint
	NFSD: Remove trace_nfsd_clid_inuse_err
	NFSD: Add nfsd_clid_confirmed tracepoint
	NFSD: Add nfsd_clid_reclaim_complete tracepoint
	NFSD: Add nfsd_clid_destroyed tracepoint
	NFSD: Add a couple more nfsd_clid_expired call sites
	NFSD: Add tracepoints for SETCLIENTID edge cases
	NFSD: Add tracepoints for EXCHANGEID edge cases
	NFSD: Constify @fh argument of knfsd_fh_hash()
	NFSD: Capture every CB state transition
	NFSD: Drop TRACE_DEFINE_ENUM for NFSD4_CB_<state> macros
	NFSD: Add cb_lost tracepoint
	NFSD: Adjust cb_shutdown tracepoint
	NFSD: Enhance the nfsd_cb_setup tracepoint
	NFSD: Add an nfsd_cb_lm_notify tracepoint
	NFSD: Add an nfsd_cb_offload tracepoint
	NFSD: Replace the nfsd_deleg_break tracepoint
	NFSD: Add an nfsd_cb_probe tracepoint
	NFSD: Remove the nfsd_cb_work and nfsd_cb_done tracepoints
	NFSD: Update nfsd_cb_args tracepoint
	nfsd: Prevent truncation of an unlinked inode from blocking access to its directory
	nfsd: move some commit_metadata()s outside the inode lock
	NFSD add vfs_fsync after async copy is done
	NFSD: delay unmount source's export after inter-server copy completed.
	nfsd: move fsnotify on client creation outside spinlock
	nfsd4: Expose the callback address and state of each NFS4 client
	nfsd: fix kernel test robot warning in SSC code
	NFSD: Fix error return code in nfsd4_interssc_connect()
	nfsd: rpc_peeraddr2str needs rcu lock
	lockd: Remove stale comments
	lockd: Create a simplified .vs_dispatch method for NLM requests
	lockd: Common NLM XDR helpers
	lockd: Update the NLMv1 void argument decoder to use struct xdr_stream
	lockd: Update the NLMv1 TEST arguments decoder to use struct xdr_stream
	lockd: Update the NLMv1 LOCK arguments decoder to use struct xdr_stream
	lockd: Update the NLMv1 CANCEL arguments decoder to use struct xdr_stream
	lockd: Update the NLMv1 UNLOCK arguments decoder to use struct xdr_stream
	lockd: Update the NLMv1 nlm_res arguments decoder to use struct xdr_stream
	lockd: Update the NLMv1 SM_NOTIFY arguments decoder to use struct xdr_stream
	lockd: Update the NLMv1 SHARE arguments decoder to use struct xdr_stream
	lockd: Update the NLMv1 FREE_ALL arguments decoder to use struct xdr_stream
	lockd: Update the NLMv1 void results encoder to use struct xdr_stream
	lockd: Update the NLMv1 TEST results encoder to use struct xdr_stream
	lockd: Update the NLMv1 nlm_res results encoder to use struct xdr_stream
	lockd: Update the NLMv1 SHARE results encoder to use struct xdr_stream
	lockd: Update the NLMv4 void arguments decoder to use struct xdr_stream
	lockd: Update the NLMv4 TEST arguments decoder to use struct xdr_stream
	lockd: Update the NLMv4 LOCK arguments decoder to use struct xdr_stream
	lockd: Update the NLMv4 CANCEL arguments decoder to use struct xdr_stream
	lockd: Update the NLMv4 UNLOCK arguments decoder to use struct xdr_stream
	lockd: Update the NLMv4 nlm_res arguments decoder to use struct xdr_stream
	lockd: Update the NLMv4 SM_NOTIFY arguments decoder to use struct xdr_stream
	lockd: Update the NLMv4 SHARE arguments decoder to use struct xdr_stream
	lockd: Update the NLMv4 FREE_ALL arguments decoder to use struct xdr_stream
	lockd: Update the NLMv4 void results encoder to use struct xdr_stream
	lockd: Update the NLMv4 TEST results encoder to use struct xdr_stream
	lockd: Update the NLMv4 nlm_res results encoder to use struct xdr_stream
	lockd: Update the NLMv4 SHARE results encoder to use struct xdr_stream
	nfsd: remove redundant assignment to pointer 'this'
	NFSD: Prevent a possible oops in the nfs_dirent() tracepoint
	nfsd: fix NULL dereference in nfs3svc_encode_getaclres
	kernel/pid.c: remove static qualifier from pidfd_create()
	kernel/pid.c: implement additional checks upon pidfd_create() parameters
	fanotify: minor cosmetic adjustments to fid labels
	fanotify: introduce a generic info record copying helper
	fanotify: add pidfd support to the fanotify API
	fsnotify: replace igrab() with ihold() on attach connector
	fsnotify: count s_fsnotify_inode_refs for attached connectors
	fsnotify: count all objects with attached connectors
	fsnotify: optimize the case of no marks of any type
	NFSD: Clean up splice actor
	SUNRPC: Add svc_rqst_replace_page() API
	NFSD: Batch release pages during splice read
	NFSD: remove vanity comments
	sysctl: introduce new proc handler proc_dobool
	lockd: change the proc_handler for nsm_use_hostnames
	nlm: minor nlm_lookup_file argument change
	nlm: minor refactoring
	lockd: update nlm_lookup_file reexport comment
	Keep read and write fds with each nlm_file
	nfs: don't atempt blocking locks on nfs reexports
	lockd: don't attempt blocking locks on nfs reexports
	nfs: don't allow reexport reclaims
	SUNRPC: Add svc_rqst::rq_auth_stat
	SUNRPC: Set rq_auth_stat in the pg_authenticate() callout
	SUNRPC: Eliminate the RQ_AUTHERR flag
	NFS: Add a private local dispatcher for NFSv4 callback operations
	NFS: Remove unused callback void decoder
	fsnotify: fix sb_connectors leak
	NLM: Fix svcxdr_encode_owner()
	nfsd: Fix a warning for nfsd_file_close_inode
	fsnotify: pass data_type to fsnotify_name()
	fsnotify: pass dentry instead of inode data
	fsnotify: clarify contract for create event hooks
	fsnotify: Don't insert unmergeable events in hashtable
	fanotify: Fold event size calculation to its own function
	fanotify: Split fsid check from other fid mode checks
	inotify: Don't force FS_IN_IGNORED
	fsnotify: Add helper to detect overflow_event
	fsnotify: Add wrapper around fsnotify_add_event
	fsnotify: Retrieve super block from the data field
	fsnotify: Protect fsnotify_handle_inode_event from no-inode events
	fsnotify: Pass group argument to free_event
	fanotify: Support null inode event in fanotify_dfid_inode
	fanotify: Allow file handle encoding for unhashed events
	fanotify: Encode empty file handle when no inode is provided
	fanotify: Require fid_mode for any non-fd event
	fsnotify: Support FS_ERROR event type
	fanotify: Reserve UAPI bits for FAN_FS_ERROR
	fanotify: Pre-allocate pool of error events
	fanotify: Support enqueueing of error events
	fanotify: Support merging of error events
	fanotify: Wrap object_fh inline space in a creator macro
	fanotify: Add helpers to decide whether to report FID/DFID
	fanotify: WARN_ON against too large file handles
	fanotify: Report fid info for file related file system errors
	fanotify: Emit generic error info for error event
	fanotify: Allow users to request FAN_FS_ERROR events
	SUNRPC: Trace calls to .rpc_call_done
	NFSD: Optimize DRC bucket pruning
	NFSD: move filehandle format declarations out of "uapi".
	NFSD: drop support for ancient filehandles
	NFSD: simplify struct nfsfh
	NFSD: Initialize pointer ni with NULL and not plain integer 0
	NFSD: Have legacy NFSD WRITE decoders use xdr_stream_subsegment()
	SUNRPC: Replace the "__be32 *p" parameter to .pc_decode
	SUNRPC: Change return value type of .pc_decode
	NFSD: Save location of NFSv4 COMPOUND status
	SUNRPC: Replace the "__be32 *p" parameter to .pc_encode
	SUNRPC: Change return value type of .pc_encode
	nfsd: update create verifier comment
	NFSD:fix boolreturn.cocci warning
	nfsd4: remove obselete comment
	NFSD: Fix exposure in nfsd4_decode_bitmap()
	NFSD: Fix READDIR buffer overflow
	fsnotify: clarify object type argument
	fsnotify: separate mark iterator type from object type enum
	fanotify: introduce group flag FAN_REPORT_TARGET_FID
	fsnotify: generate FS_RENAME event with rich information
	fanotify: use macros to get the offset to fanotify_info buffer
	fanotify: use helpers to parcel fanotify_info buffer
	fanotify: support secondary dir fh and name in fanotify_info
	fanotify: record old and new parent and name in FAN_RENAME event
	fanotify: record either old name new name or both for FAN_RENAME
	fanotify: report old and/or new parent+name in FAN_RENAME event
	fanotify: wire up FAN_RENAME event
	exit: Implement kthread_exit
	exit: Rename module_put_and_exit to module_put_and_kthread_exit
	NFSD: Fix sparse warning
	NFSD: handle errors better in write_ports_addfd()
	SUNRPC: change svc_get() to return the svc.
	SUNRPC/NFSD: clean up get/put functions.
	SUNRPC: stop using ->sv_nrthreads as a refcount
	nfsd: make nfsd_stats.th_cnt atomic_t
	SUNRPC: use sv_lock to protect updates to sv_nrthreads.
	NFSD: narrow nfsd_mutex protection in nfsd thread
	NFSD: Make it possible to use svc_set_num_threads_sync
	SUNRPC: discard svo_setup and rename svc_set_num_threads_sync()
	NFSD: simplify locking for network notifier.
	lockd: introduce nlmsvc_serv
	lockd: simplify management of network status notifiers
	lockd: move lockd_start_svc() call into lockd_create_svc()
	lockd: move svc_exit_thread() into the thread
	lockd: introduce lockd_put()
	lockd: rename lockd_create_svc() to lockd_get()
	SUNRPC: move the pool_map definitions (back) into svc.c
	SUNRPC: always treat sv_nrpools==1 as "not pooled"
	lockd: use svc_set_num_threads() for thread start and stop
	NFS: switch the callback service back to non-pooled.
	NFSD: Remove be32_to_cpu() from DRC hash function
	NFSD: Fix inconsistent indenting
	NFSD: simplify per-net file cache management
	NFSD: Combine XDR error tracepoints
	nfsd: improve stateid access bitmask documentation
	NFSD: De-duplicate nfsd4_decode_bitmap4()
	nfs: block notification on fs with its own ->lock
	nfsd4: add refcount for nfsd4_blocked_lock
	NFSD: Fix zero-length NFSv3 WRITEs
	nfsd: map EBADF
	nfsd: Add errno mapping for EREMOTEIO
	nfsd: Retry once in nfsd_open on an -EOPENSTALE return
	NFSD: Clean up nfsd_vfs_write()
	NFSD: De-duplicate net_generic(SVC_NET(rqstp), nfsd_net_id)
	NFSD: De-duplicate net_generic(nf->nf_net, nfsd_net_id)
	nfsd: Add a tracepoint for errors in nfsd4_clone_file_range()
	NFSD: Write verifier might go backwards
	NFSD: Clean up the nfsd_net::nfssvc_boot field
	NFSD: Rename boot verifier functions
	NFSD: Trace boot verifier resets
	Revert "nfsd: skip some unnecessary stats in the v4 case"
	NFSD: Move fill_pre_wcc() and fill_post_wcc()
	nfsd: fix crash on COPY_NOTIFY with special stateid
	fanotify: remove variable set but not used
	lockd: fix server crash on reboot of client holding lock
	lockd: fix failure to cleanup client locks
	NFSD: Fix the behavior of READ near OFFSET_MAX
	NFSD: Fix ia_size underflow
	NFSD: Fix NFSv3 SETATTR/CREATE's handling of large file sizes
	NFSD: COMMIT operations must not return NFS?ERR_INVAL
	NFSD: Deprecate NFS_OFFSET_MAX
	nfsd: Add support for the birth time attribute
	NFSD: De-duplicate hash bucket indexing
	NFSD: Skip extra computation for RC_NOCACHE case
	NFSD: Streamline the rare "found" case
	SUNRPC: Remove the .svo_enqueue_xprt method
	SUNRPC: Merge svc_do_enqueue_xprt() into svc_enqueue_xprt()
	SUNRPC: Remove svo_shutdown method
	SUNRPC: Rename svc_create_xprt()
	SUNRPC: Rename svc_close_xprt()
	SUNRPC: Remove svc_shutdown_net()
	NFSD: Remove svc_serv_ops::svo_module
	NFSD: Move svc_serv_ops::svo_function into struct svc_serv
	NFSD: Remove CONFIG_NFSD_V3
	NFSD: Clean up _lm_ operation names
	nfsd: fix using the correct variable for sizeof()
	fsnotify: fix merge with parent's ignored mask
	fsnotify: optimize FS_MODIFY events with no ignored masks
	fsnotify: remove redundant parameter judgment
	SUNRPC: Return true/false (not 1/0) from bool functions
	nfsd: Fix a write performance regression
	nfsd: Clean up nfsd_file_put()
	fanotify: do not allow setting dirent events in mask of non-dir
	fs/lock: documentation cleanup. Replace inode->i_lock with flc_lock.
	inotify: move control flags from mask to mark flags
	fsnotify: pass flags argument to fsnotify_alloc_group()
	fsnotify: make allow_dups a property of the group
	fsnotify: create helpers for group mark_mutex lock
	inotify: use fsnotify group lock helpers
	nfsd: use fsnotify group lock helpers
	dnotify: use fsnotify group lock helpers
	fsnotify: allow adding an inode mark without pinning inode
	fanotify: create helper fanotify_mark_user_flags()
	fanotify: factor out helper fanotify_mark_update_flags()
	fanotify: implement "evictable" inode marks
	fanotify: use fsnotify group lock helpers
	fanotify: enable "evictable" inode marks
	fsnotify: introduce mark type iterator
	fsnotify: consistent behavior for parent not watching children
	fanotify: fix incorrect fmode_t casts
	NFSD: Clean up nfsd_splice_actor()
	NFSD: add courteous server support for thread with only delegation
	NFSD: add support for share reservation conflict to courteous server
	NFSD: move create/destroy of laundry_wq to init_nfsd and exit_nfsd
	fs/lock: add helper locks_owner_has_blockers to check for blockers
	fs/lock: add 2 callbacks to lock_manager_operations to resolve conflict
	NFSD: add support for lock conflict to courteous server
	NFSD: Show state of courtesy client in client info
	NFSD: Clean up nfsd3_proc_create()
	NFSD: Avoid calling fh_drop_write() twice in do_nfsd_create()
	NFSD: Refactor nfsd_create_setattr()
	NFSD: Refactor NFSv3 CREATE
	NFSD: Refactor NFSv4 OPEN(CREATE)
	NFSD: Remove do_nfsd_create()
	NFSD: Clean up nfsd_open_verified()
	NFSD: Instantiate a struct file when creating a regular NFSv4 file
	NFSD: Remove dprintk call sites from tail of nfsd4_open()
	NFSD: Fix whitespace
	NFSD: Move documenting comment for nfsd4_process_open2()
	NFSD: Trace filecache opens
	NFSD: Clean up the show_nf_flags() macro
	SUNRPC: Use RMW bitops in single-threaded hot paths
	nfsd: Unregister the cld notifier when laundry_wq create failed
	nfsd: Fix null-ptr-deref in nfsd_fill_super()
	nfsd: destroy percpu stats counters after reply cache shutdown
	NFSD: Modernize nfsd4_release_lockowner()
	NFSD: Add documenting comment for nfsd4_release_lockowner()
	NFSD: nfsd_file_put() can sleep
	NFSD: Fix potential use-after-free in nfsd_file_put()
	SUNRPC: Optimize xdr_reserve_space()
	fanotify: refine the validation checks on non-dir inode mask
	NFS: restore module put when manager exits.
	NFSD: Decode NFSv4 birth time attribute
	lockd: set fl_owner when unlocking files
	lockd: fix nlm_close_files
	fs: inotify: Fix typo in inotify comment
	fanotify: prepare for setting event flags in ignore mask
	fanotify: cleanups for fanotify_mark() input validations
	fanotify: introduce FAN_MARK_IGNORE
	fsnotify: Fix comment typo
	nfsd: eliminate the NFSD_FILE_BREAK_* flags
	SUNRPC: Fix xdr_encode_bool()
	NLM: Defend against file_lock changes after vfs_test_lock()
	NFSD: Fix space and spelling mistake
	nfsd: remove redundant assignment to variable len
	NFSD: Demote a WARN to a pr_warn()
	NFSD: Report filecache LRU size
	NFSD: Report count of calls to nfsd_file_acquire()
	NFSD: Report count of freed filecache items
	NFSD: Report average age of filecache items
	NFSD: Add nfsd_file_lru_dispose_list() helper
	NFSD: Refactor nfsd_file_gc()
	NFSD: Refactor nfsd_file_lru_scan()
	NFSD: Report the number of items evicted by the LRU walk
	NFSD: Record number of flush calls
	NFSD: Zero counters when the filecache is re-initialized
	NFSD: Hook up the filecache stat file
	NFSD: WARN when freeing an item still linked via nf_lru
	NFSD: Trace filecache LRU activity
	NFSD: Leave open files out of the filecache LRU
	NFSD: Fix the filecache LRU shrinker
	NFSD: Never call nfsd_file_gc() in foreground paths
	NFSD: No longer record nf_hashval in the trace log
	NFSD: Remove lockdep assertion from unhash_and_release_locked()
	NFSD: nfsd_file_unhash can compute hashval from nf->nf_inode
	NFSD: Refactor __nfsd_file_close_inode()
	NFSD: nfsd_file_hash_remove can compute hashval
	NFSD: Remove nfsd_file::nf_hashval
	NFSD: Replace the "init once" mechanism
	NFSD: Set up an rhashtable for the filecache
	NFSD: Convert the filecache to use rhashtable
	NFSD: Clean up unused code after rhashtable conversion
	NFSD: Separate tracepoints for acquire and create
	NFSD: Move nfsd_file_trace_alloc() tracepoint
	NFSD: NFSv4 CLOSE should release an nfsd_file immediately
	NFSD: Ensure nf_inode is never dereferenced
	NFSD: refactoring v4 specific code to a helper in nfs4state.c
	NFSD: keep track of the number of v4 clients in the system
	NFSD: limit the number of v4 clients to 1024 per 1GB of system memory
	nfsd: silence extraneous printk on nfsd.ko insertion
	NFSD: Optimize nfsd4_encode_operation()
	NFSD: Optimize nfsd4_encode_fattr()
	NFSD: Clean up SPLICE_OK in nfsd4_encode_read()
	NFSD: Add an nfsd4_read::rd_eof field
	NFSD: Optimize nfsd4_encode_readv()
	NFSD: Simplify starting_len
	NFSD: Use xdr_pad_size()
	NFSD: Clean up nfsd4_encode_readlink()
	NFSD: Fix strncpy() fortify warning
	NFSD: nfserrno(-ENOMEM) is nfserr_jukebox
	NFSD: Shrink size of struct nfsd4_copy_notify
	NFSD: Shrink size of struct nfsd4_copy
	NFSD: Reorder the fields in struct nfsd4_op
	NFSD: Make nfs4_put_copy() static
	NFSD: Replace boolean fields in struct nfsd4_copy
	NFSD: Refactor nfsd4_cleanup_inter_ssc() (1/2)
	NFSD: Refactor nfsd4_cleanup_inter_ssc() (2/2)
	NFSD: Refactor nfsd4_do_copy()
	NFSD: Remove kmalloc from nfsd4_do_async_copy()
	NFSD: Add nfsd4_send_cb_offload()
	NFSD: Move copy offload callback arguments into a separate structure
	NFSD: drop fh argument from alloc_init_deleg
	NFSD: verify the opened dentry after setting a delegation
	NFSD: introduce struct nfsd_attrs
	NFSD: set attributes when creating symlinks
	NFSD: add security label to struct nfsd_attrs
	NFSD: add posix ACLs to struct nfsd_attrs
	NFSD: change nfsd_create()/nfsd_symlink() to unlock directory before returning.
	NFSD: always drop directory lock in nfsd_unlink()
	NFSD: only call fh_unlock() once in nfsd_link()
	NFSD: reduce locking in nfsd_lookup()
	NFSD: use explicit lock/unlock for directory ops
	NFSD: use (un)lock_inode instead of fh_(un)lock for file operations
	NFSD: discard fh_locked flag and fh_lock/fh_unlock
	lockd: detect and reject lock arguments that overflow
	NFSD: fix regression with setting ACLs.
	nfsd_splice_actor(): handle compound pages
	NFSD: move from strlcpy with unused retval to strscpy
	lockd: move from strlcpy with unused retval to strscpy
	NFSD enforce filehandle check for source file in COPY
	NFSD: remove redundant variable status
	nfsd: Avoid some useless tests
	nfsd: Propagate some error code returned by memdup_user()
	NFSD: Increase NFSD_MAX_OPS_PER_COMPOUND
	NFSD: Protect against send buffer overflow in NFSv2 READDIR
	NFSD: Protect against send buffer overflow in NFSv3 READDIR
	NFSD: Protect against send buffer overflow in NFSv2 READ
	NFSD: Protect against send buffer overflow in NFSv3 READ
	NFSD: drop fname and flen args from nfsd_create_locked()
	NFSD: Fix handling of oversized NFSv4 COMPOUND requests
	nfsd: clean up mounted_on_fileid handling
	nfsd: remove nfsd4_prepare_cb_recall() declaration
	NFSD: Add tracepoints to report NFSv4 callback completions
	NFSD: Add a mechanism to wait for a DELEGRETURN
	NFSD: Refactor nfsd_setattr()
	NFSD: Make nfsd4_setattr() wait before returning NFS4ERR_DELAY
	NFSD: Make nfsd4_rename() wait before returning NFS4ERR_DELAY
	NFSD: Make nfsd4_remove() wait before returning NFS4ERR_DELAY
	NFSD: keep track of the number of courtesy clients in the system
	NFSD: add shrinker to reap courtesy clients on low memory condition
	SUNRPC: Parametrize how much of argsize should be zeroed
	NFSD: Reduce amount of struct nfsd4_compoundargs that needs clearing
	NFSD: Refactor common code out of dirlist helpers
	NFSD: Use xdr_inline_decode() to decode NFSv3 symlinks
	NFSD: Clean up WRITE arg decoders
	NFSD: Clean up nfs4svc_encode_compoundres()
	NFSD: Remove "inline" directives on op_rsize_bop helpers
	NFSD: Remove unused nfsd4_compoundargs::cachetype field
	NFSD: Pack struct nfsd4_compoundres
	nfsd: use DEFINE_PROC_SHOW_ATTRIBUTE to define nfsd_proc_ops
	nfsd: use DEFINE_SHOW_ATTRIBUTE to define export_features_fops and supported_enctypes_fops
	nfsd: use DEFINE_SHOW_ATTRIBUTE to define client_info_fops
	nfsd: use DEFINE_SHOW_ATTRIBUTE to define nfsd_reply_cache_stats_fops
	nfsd: use DEFINE_SHOW_ATTRIBUTE to define nfsd_file_cache_stats_fops
	NFSD: Rename the fields in copy_stateid_t
	NFSD: Cap rsize_bop result based on send buffer size
	nfsd: only fill out return pointer on success in nfsd4_lookup_stateid
	nfsd: fix comments about spinlock handling with delegations
	nfsd: make nfsd4_run_cb a bool return function
	nfsd: extra checks when freeing delegation stateids
	fs/notify: constify path
	fsnotify: remove unused declaration
	fanotify: Remove obsoleted fanotify_event_has_path()
	nfsd: fix nfsd_file_unhash_and_dispose
	nfsd: rework hashtable handling in nfsd_do_file_acquire
	NFSD: unregister shrinker when nfsd_init_net() fails
	nfsd: fix net-namespace logic in __nfsd_file_cache_purge
	nfsd: fix use-after-free in nfsd_file_do_acquire tracepoint
	nfsd: put the export reference in nfsd4_verify_deleg_dentry
	NFSD: Fix reads with a non-zero offset that don't end on a page boundary
	filelock: add a new locks_inode_context accessor function
	lockd: use locks_inode_context helper
	nfsd: use locks_inode_context helper
	NFSD: Simplify READ_PLUS
	NFSD: Remove redundant assignment to variable host_err
	NFSD: Finish converting the NFSv2 GETACL result encoder
	NFSD: Finish converting the NFSv3 GETACL result encoder
	nfsd: ignore requests to disable unsupported versions
	nfsd: move nfserrno() to vfs.c
	nfsd: allow disabling NFSv2 at compile time
	exportfs: use pr_debug for unreachable debug statements
	NFSD: Pass the target nfsd_file to nfsd_commit()
	NFSD: Revert "NFSD: NFSv4 CLOSE should release an nfsd_file immediately"
	NFSD: Add an NFSD_FILE_GC flag to enable nfsd_file garbage collection
	NFSD: Flesh out a documenting comment for filecache.c
	NFSD: Clean up nfs4_preprocess_stateid_op() call sites
	NFSD: Trace stateids returned via DELEGRETURN
	NFSD: Trace delegation revocations
	NFSD: Use const pointers as parameters to fh_ helpers
	NFSD: Update file_hashtbl() helpers
	NFSD: Clean up nfsd4_init_file()
	NFSD: Add a nfsd4_file_hash_remove() helper
	NFSD: Clean up find_or_add_file()
	NFSD: Refactor find_file()
	NFSD: Use rhashtable for managing nfs4_file objects
	NFSD: Fix licensing header in filecache.c
	nfsd: remove the pages_flushed statistic from filecache
	nfsd: reorganize filecache.c
	nfsd: fix up the filecache laundrette scheduling
	NFSD: Add an nfsd_file_fsync tracepoint
	lockd: set other missing fields when unlocking files
	nfsd: return error if nfs4_setacl fails
	NFSD: Use struct_size() helper in alloc_session()
	lockd: set missing fl_flags field when retrieving args
	lockd: ensure we use the correct file descriptor when unlocking
	lockd: fix file selection in nlmsvc_cancel_blocked
	NFSD: pass range end to vfs_fsync_range() instead of count
	NFSD: refactoring courtesy_client_reaper to a generic low memory shrinker
	NFSD: add support for sending CB_RECALL_ANY
	NFSD: add delegation reaper to react to low memory condition
	NFSD: Use only RQ_DROPME to signal the need to drop a reply
	NFSD: Avoid clashing function prototypes
	nfsd: rework refcounting in filecache
	nfsd: fix handling of cached open files in nfsd4_open codepath
	Revert "SUNRPC: Use RMW bitops in single-threaded hot paths"
	NFSD: Use set_bit(RQ_DROPME)
	NFSD: fix use-after-free in nfsd4_ssc_setup_dul()
	NFSD: register/unregister of nfsd-client shrinker at nfsd startup/shutdown time
	NFSD: replace delayed_work with work_struct for nfsd_client_shrinker
	nfsd: don't free files unconditionally in __nfsd_file_cache_purge
	nfsd: don't destroy global nfs4_file table in per-net shutdown
	NFSD: enhance inter-server copy cleanup
	nfsd: allow nfsd_file_get to sanely handle a NULL pointer
	nfsd: clean up potential nfsd_file refcount leaks in COPY codepath
	NFSD: fix leaked reference count of nfsd4_ssc_umount_item
	nfsd: don't hand out delegation on setuid files being opened for write
	NFSD: fix problems with cleanup on errors in nfsd4_copy
	nfsd: fix courtesy client with deny mode handling in nfs4_upgrade_open
	nfsd: don't fsync nfsd_files on last close
	NFSD: copy the whole verifier in nfsd_copy_write_verifier
	NFSD: Protect against filesystem freezing
	lockd: set file_lock start and end when decoding nlm4 testargs
	nfsd: don't replace page in rq_pages if it's a continuation of last page
	NFSD: Avoid calling OPDESC() with ops->opnum == OP_ILLEGAL
	nfsd: call op_release, even when op_func returns an error
	nfsd: don't open-code clear_and_wake_up_bit
	nfsd: NFSD_FILE_KEY_INODE only needs to find GC'ed entries
	nfsd: simplify test_bit return in NFSD_FILE_KEY_FULL comparator
	nfsd: don't kill nfsd_files because of lease break error
	nfsd: add some comments to nfsd_file_do_acquire
	nfsd: don't take/put an extra reference when putting a file
	nfsd: update comment over __nfsd_file_cache_purge
	nfsd: allow reaping files still under writeback
	NFSD: Convert filecache to rhltable
	nfsd: simplify the delayed disposal list code
	NFSD: Fix problem of COMMIT and NFS4ERR_DELAY in infinite loop
	nfsd: make a copy of struct iattr before calling notify_change
	nfsd: fix double fget() bug in __write_ports_addfd()
	lockd: drop inappropriate svc_get() from locked_get()
	NFSD: Add an nfsd4_encode_nfstime4() helper
	nfsd: Fix creation time serialization order
	nfsd: don't allow nfsd threads to be signalled.
	nfsd: Simplify code around svc_exit_thread() call in nfsd()
	nfsd: separate nfsd_last_thread() from nfsd_put()
	Documentation: Add missing documentation for EXPORT_OP flags
	NFSD: fix possible oops when nfsd/pool_stats is closed.
	nfsd: call nfsd_last_thread() before final nfsd_put()
	nfsd: drop the nfsd_put helper
	nfsd: fix RELEASE_LOCKOWNER
	nfsd: don't take fi_lock in nfsd_break_deleg_cb()
	nfsd: don't call locks_release_private() twice concurrently
	nfsd: Fix a regression in nfsd_setattr()
	Linux 5.10.220

Change-Id: I589ec5e63d1f985ab69f9755b9a87330627d44c5
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2024-07-16 15:58:54 +00:00
Arnd Bergmann
84bf6b64a1 ftruncate: pass a signed offset
commit 4b8e88e563b5f666446d002ad0dc1e6e8e7102b0 upstream.

The old ftruncate() syscall, using the 32-bit off_t misses a sign
extension when called in compat mode on 64-bit architectures.  As a
result, passing a negative length accidentally succeeds in truncating
to file size between 2GiB and 4GiB.

Changing the type of the compat syscall to the signed compat_off_t
changes the behavior so it instead returns -EINVAL.

The native entry point, the truncate() syscall and the corresponding
loff_t based variants are all correct already and do not suffer
from this mistake.

Fixes: 3f6d078d4a ("fix compat truncate/ftruncate")
Reviewed-by: Christian Brauner <brauner@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-07-05 09:12:55 +02:00
Chuck Lever
c20097329d NFSD: Instantiate a struct file when creating a regular NFSv4 file
[ Upstream commit fb70bf124b051d4ded4ce57511dfec6d3ebf2b43 ]

There have been reports of races that cause NFSv4 OPEN(CREATE) to
return an error even though the requested file was created. NFSv4
does not provide a status code for this case.

To mitigate some of these problems, reorganize the NFSv4
OPEN(CREATE) logic to allocate resources before the file is actually
created, and open the new file while the parent directory is still
locked.

Two new APIs are added:

+ Add an API that works like nfsd_file_acquire() but does not open
the underlying file. The OPEN(CREATE) path can use this API when it
already has an open file.

+ Add an API that is kin to dentry_open(). NFSD needs to create a
file and grab an open "struct file *" atomically. The
alloc_empty_file() has to be done before the inode create. If it
fails (for example, because the NFS server has exceeded its
max_files limit), we avoid creating the file and can still return
an error to the NFS client.

BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=382
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: JianHong Yin <jiyin@redhat.com>
[ cel: backported to 5.10.y, prior to idmapped mounts ]
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-21 14:53:43 +02:00
Christian Brauner
b0fa673c8c fs: add file and path permissions helpers
[ Upstream commit 02f92b3868a1b34ab98464e76b0e4e060474ba10 ]

Add two simple helpers to check permissions on a file and path
respectively and convert over some callers. It simplifies quite a few
codepaths and also reduces the churn in later patches quite a bit.
Christoph also correctly points out that this makes codepaths (e.g.
ioctls) way easier to follow that would otherwise have to do more
complex argument passing than necessary.

Link: https://lore.kernel.org/r/20210121131959.646623-4-christian.brauner@ubuntu.com
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Suggested-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-21 14:52:58 +02:00
Eric W. Biederman
6d256a904c file: Rename __close_fd to close_fd and remove the files parameter
[ Upstream commit 8760c909f54a82aaa6e76da19afe798a0c77c3c3 ]

The function __close_fd was added to support binder[1].  Now that
binder has been fixed to no longer need __close_fd[2] all calls
to __close_fd pass current->files.

Therefore transform the files parameter into a local variable
initialized to current->files, and rename __close_fd to close_fd to
reflect this change, and keep it in sync with the similar changes to
__alloc_fd, and __fd_install.

This removes the need for callers to care about the extra care that
needs to be take if anything except current->files is passed, by
limiting the callers to only operation on current->files.

[1] 483ce1d4b8 ("take descriptor-related part of close() to file.c")
[2] 44d8047f1d ("binder: use standard functions to allocate fds")
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
v1: https://lkml.kernel.org/r/20200817220425.9389-17-ebiederm@xmission.com
Link: https://lkml.kernel.org/r/20201120231441.29911-21-ebiederm@xmission.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-21 14:52:49 +02:00
Greg Kroah-Hartman
df0f5bd7a8 This is the 5.10.190 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmTWBikACgkQONu9yGCS
 aT7+sBAAgFSUVSAURxIgHfi1XvXnGI/eZI/0ub8AqkWRdMUVbpqXR0hsq1/KP5sj
 zz6l4QTfFyGm5EQPv3dEDi3Nztsb2dQEaJTIKQV6N4kY133fn6sYIxQE9Rb1jHcp
 UoI/WbiOAwRqKd5glsPANByen6U4871Y1h2drBgMXYzP1fVEhPCVxz1XhAyZlVTw
 n94774DCiSXaVUt82r5IPOV2HfZAnik4BThCD3oE1v9qN2ugow51GA11R9Mf49lR
 mthPqsMhOHl+IPgB+3VaL+DCDDbs6b+izBCNSy5Jdupqsd22RuuLE3a8gSgZSoYB
 3NrIqwOPUuVX3yUsvaTfklXMDU8RyKGEAIK/njW2NSFv8ccMa1lymhXnrkjSd7AP
 BdHVfpB9k3WJQt6ElMfH6S8KszWFyLgolsT7cG3osi7VgsbDuGW30ApTl7WI7nDB
 QgoM3X2FfYYZ8uc7+fT0PU8tk098VqB0c4PQbigN7i0l0J6o1PPrg4Ke9++qaSAs
 PZ+rUha4+9whgjjyF05sWSScgwo0Az4qEx5r1Igf3lfDOUnfIPy3cKd+O1zpjrt5
 6ZO3CQkYM4IO8mP5ps2t4rJvZuPfAVBi3Ndgtt/JbLlkxbkuhNzTE1TKA7UOZRAW
 RK6YlzhHsR5LQEOZb6pVJtO2HTaJSBjTrp4W+xj/e6k51z+I4Co=
 =RCz+
 -----END PGP SIGNATURE-----

Merge 5.10.190 into android12-5.10-lts

Changes in 5.10.190
	KVM: s390: pv: fix index value of replaced ASCE
	io_uring: don't audit the capability check in io_uring_create()
	gpio: tps68470: Make tps68470_gpio_output() always set the initial value
	btrfs: fix race between quota disable and relocation
	btrfs: fix extent buffer leak after tree mod log failure at split_node()
	i2c: Delete error messages for failed memory allocations
	i2c: Improve size determinations
	i2c: nomadik: Remove unnecessary goto label
	i2c: nomadik: Use devm_clk_get_enabled()
	i2c: nomadik: Remove a useless call in the remove function
	PCI/ASPM: Return 0 or -ETIMEDOUT from pcie_retrain_link()
	PCI/ASPM: Factor out pcie_wait_for_retrain()
	PCI/ASPM: Avoid link retraining race
	dlm: cleanup plock_op vs plock_xop
	dlm: rearrange async condition return
	fs: dlm: interrupt posix locks only when process is killed
	drm/ttm: add ttm_bo_pin()/ttm_bo_unpin() v2
	drm/ttm: never consider pinned BOs for eviction&swap
	tracing: Show real address for trace event arguments
	pwm: meson: Simplify duplicated per-channel tracking
	pwm: meson: fix handling of period/duty if greater than UINT_MAX
	ext4: fix to check return value of freeze_bdev() in ext4_shutdown()
	phy: qcom-snps: Use dev_err_probe() to simplify code
	phy: qcom-snps: correct struct qcom_snps_hsphy kerneldoc
	phy: qcom-snps-femto-v2: keep cfg_ahb_clk enabled during runtime suspend
	phy: qcom-snps-femto-v2: properly enable ref clock
	media: staging: atomisp: select V4L2_FWNODE
	i40e: Fix an NULL vs IS_ERR() bug for debugfs_create_dir()
	net: phy: marvell10g: fix 88x3310 power up
	net: hns3: reconstruct function hclge_ets_validate()
	net: hns3: fix wrong bw weight of disabled tc issue
	vxlan: move to its own directory
	vxlan: calculate correct header length for GPE
	phy: hisilicon: Fix an out of bounds check in hisi_inno_phy_probe()
	ethernet: atheros: fix return value check in atl1e_tso_csum()
	ipv6 addrconf: fix bug where deleting a mngtmpaddr can create a new temporary address
	tcp: Reduce chance of collisions in inet6_hashfn().
	ice: Fix memory management in ice_ethtool_fdir.c
	bonding: reset bond's flags when down link is P2P device
	team: reset team's flags when down link is P2P device
	platform/x86: msi-laptop: Fix rfkill out-of-sync on MSI Wind U100
	netfilter: nft_set_rbtree: fix overlap expiration walk
	netfilter: nftables: add helper function to validate set element data
	netfilter: nf_tables: skip immediate deactivate in _PREPARE_ERROR
	netfilter: nf_tables: disallow rule addition to bound chain via NFTA_RULE_CHAIN_ID
	net/sched: mqprio: refactor nlattr parsing to a separate function
	net/sched: mqprio: add extack to mqprio_parse_nlattr()
	net/sched: mqprio: Add length check for TCA_MQPRIO_{MAX/MIN}_RATE64
	benet: fix return value check in be_lancer_xmit_workarounds()
	tipc: check return value of pskb_trim()
	tipc: stop tipc crypto on failure in tipc_node_create
	RDMA/mlx4: Make check for invalid flags stricter
	drm/msm/dpu: drop enum dpu_core_perf_data_bus_id
	drm/msm/adreno: Fix snapshot BINDLESS_DATA size
	RDMA/mthca: Fix crash when polling CQ for shared QPs
	drm/msm: Fix IS_ERR_OR_NULL() vs NULL check in a5xx_submit_in_rb()
	ASoC: fsl_spdif: Silence output on stop
	block: Fix a source code comment in include/uapi/linux/blkzoned.h
	dm raid: fix missing reconfig_mutex unlock in raid_ctr() error paths
	dm raid: clean up four equivalent goto tags in raid_ctr()
	dm raid: protect md_stop() with 'reconfig_mutex'
	ata: pata_ns87415: mark ns87560_tf_read static
	ring-buffer: Fix wrong stat of cpu_buffer->read
	tracing: Fix warning in trace_buffered_event_disable()
	Revert "usb: gadget: tegra-xudc: Fix error check in tegra_xudc_powerdomain_init()"
	USB: gadget: Fix the memory leak in raw_gadget driver
	serial: qcom-geni: drop bogus runtime pm state update
	serial: 8250_dw: Preserve original value of DLF register
	serial: sifive: Fix sifive_serial_console_setup() section
	USB: serial: option: support Quectel EM060K_128
	USB: serial: option: add Quectel EC200A module support
	USB: serial: simple: add Kaufmann RKS+CAN VCP
	USB: serial: simple: sort driver entries
	can: gs_usb: gs_can_close(): add missing set of CAN state to CAN_STATE_STOPPED
	Revert "usb: dwc3: core: Enable AutoRetry feature in the controller"
	usb: dwc3: pci: skip BYT GPIO lookup table for hardwired phy
	usb: dwc3: don't reset device side if dwc3 was configured as host-only
	usb: ohci-at91: Fix the unhandle interrupt when resume
	USB: quirks: add quirk for Focusrite Scarlett
	usb: xhci-mtk: set the dma max_seg_size
	Revert "usb: xhci: tegra: Fix error check"
	Documentation: security-bugs.rst: update preferences when dealing with the linux-distros group
	Documentation: security-bugs.rst: clarify CVE handling
	staging: ks7010: potential buffer overflow in ks_wlan_set_encode_ext()
	tty: n_gsm: fix UAF in gsm_cleanup_mux
	ALSA: hda/relatek: Enable Mute LED on HP 250 G8
	hwmon: (nct7802) Fix for temp6 (PECI1) processed even if PECI1 disabled
	btrfs: check for commit error at btrfs_attach_transaction_barrier()
	file: always lock position for FMODE_ATOMIC_POS
	nfsd: Remove incorrect check in nfsd4_validate_stateid
	tpm_tis: Explicitly check for error code
	irq-bcm6345-l1: Do not assume a fixed block to cpu mapping
	irqchip/gic-v4.1: Properly lock VPEs when doing a directLPI invalidation
	KVM: VMX: Invert handling of CR0.WP for EPT without unrestricted guest
	KVM: VMX: Fold ept_update_paging_mode_cr0() back into vmx_set_cr0()
	KVM: nVMX: Do not clear CR3 load/store exiting bits if L1 wants 'em
	KVM: VMX: Don't fudge CR0 and CR4 for restricted L2 guest
	staging: rtl8712: Use constants from <linux/ieee80211.h>
	staging: r8712: Fix memory leak in _r8712_init_xmit_priv()
	btrfs: check if the transaction was aborted at btrfs_wait_for_commit()
	virtio-net: fix race between set queues and probe
	s390/dasd: fix hanging device after quiesce/resume
	ASoC: wm8904: Fill the cache for WM8904_ADC_TEST_0 register
	ceph: never send metrics if disable_send_metrics is set
	dm cache policy smq: ensure IO doesn't prevent cleaner policy progress
	drm/ttm: make ttm_bo_unpin more defensive
	ACPI: processor: perflib: Use the "no limit" frequency QoS
	ACPI: processor: perflib: Avoid updating frequency QoS unnecessarily
	cpufreq: intel_pstate: Drop ACPI _PSS states table patching
	selftests: mptcp: depend on SYN_COOKIES
	io_uring: treat -EAGAIN for REQ_F_NOWAIT as final for io-wq
	ASoC: cs42l51: fix driver to properly autoload with automatic module loading
	kprobes/x86: Fix fall-through warnings for Clang
	x86/kprobes: Do not decode opcode in resume_execution()
	x86/kprobes: Retrieve correct opcode for group instruction
	x86/kprobes: Identify far indirect JMP correctly
	x86/kprobes: Use int3 instead of debug trap for single-step
	x86/kprobes: Fix to identify indirect jmp and others using range case
	x86/kprobes: Move 'inline' to the beginning of the kprobe_is_ss() declaration
	x86/kprobes: Update kcb status flag after singlestepping
	x86/kprobes: Fix JNG/JNLE emulation
	io_uring: gate iowait schedule on having pending requests
	perf: Fix function pointer case
	loop: Select I/O scheduler 'none' from inside add_disk()
	arm64: dts: imx8mn-var-som: add missing pull-up for onboard PHY reset pinmux
	word-at-a-time: use the same return type for has_zero regardless of endianness
	KVM: s390: fix sthyi error handling
	wifi: cfg80211: Fix return value in scan logic
	net/mlx5: DR, fix memory leak in mlx5dr_cmd_create_reformat_ctx
	net/mlx5e: fix return value check in mlx5e_ipsec_remove_trailer()
	bpf: Add length check for SK_DIAG_BPF_STORAGE_REQ_MAP_FD parsing
	rtnetlink: let rtnl_bridge_setlink checks IFLA_BRIDGE_MODE length
	net: dsa: fix value check in bcm_sf2_sw_probe()
	perf test uprobe_from_different_cu: Skip if there is no gcc
	net: sched: cls_u32: Fix match key mis-addressing
	mISDN: hfcpci: Fix potential deadlock on &hc->lock
	net: annotate data-races around sk->sk_max_pacing_rate
	net: add missing READ_ONCE(sk->sk_rcvlowat) annotation
	net: add missing READ_ONCE(sk->sk_sndbuf) annotation
	net: add missing READ_ONCE(sk->sk_rcvbuf) annotation
	net: add missing data-race annotations around sk->sk_peek_off
	net: add missing data-race annotation for sk_ll_usec
	net/sched: cls_u32: No longer copy tcf_result on update to avoid use-after-free
	net/sched: cls_fw: No longer copy tcf_result on update to avoid use-after-free
	net/sched: cls_route: No longer copy tcf_result on update to avoid use-after-free
	bpf: sockmap: Remove preempt_disable in sock_map_sk_acquire
	net: ll_temac: Switch to use dev_err_probe() helper
	net: ll_temac: fix error checking of irq_of_parse_and_map()
	net: netsec: Ignore 'phy-mode' on SynQuacer in DT mode
	net: dcb: choose correct policy to parse DCB_ATTR_BCN
	s390/qeth: Don't call dev_close/dev_open (DOWN/UP)
	ip6mr: Fix skb_under_panic in ip6mr_cache_report()
	vxlan: Fix nexthop hash size
	net/mlx5: fs_core: Make find_closest_ft more generic
	net/mlx5: fs_core: Skip the FTs in the same FS_TYPE_PRIO_CHAINS fs_prio
	tcp_metrics: fix addr_same() helper
	tcp_metrics: annotate data-races around tm->tcpm_stamp
	tcp_metrics: annotate data-races around tm->tcpm_lock
	tcp_metrics: annotate data-races around tm->tcpm_vals[]
	tcp_metrics: annotate data-races around tm->tcpm_net
	tcp_metrics: fix data-race in tcpm_suck_dst() vs fastopen
	scsi: zfcp: Defer fc_rport blocking until after ADISC response
	libceph: fix potential hang in ceph_osdc_notify()
	USB: zaurus: Add ID for A-300/B-500/C-700
	ceph: defer stopping mdsc delayed_work
	exfat: use kvmalloc_array/kvfree instead of kmalloc_array/kfree
	exfat: release s_lock before calling dir_emit()
	mtd: spinand: toshiba: Fix ecc_get_status
	mtd: rawnand: meson: fix OOB available bytes for ECC
	arm64: dts: stratix10: fix incorrect I2C property for SCL signal
	net: tun_chr_open(): set sk_uid from current_fsuid()
	net: tap_open(): set sk_uid from current_fsuid()
	bpf: Disable preemption in bpf_event_output
	open: make RESOLVE_CACHED correctly test for O_TMPFILE
	drm/ttm: check null pointer before accessing when swapping
	file: reinstate f_pos locking optimization for regular files
	tracing: Fix sleeping while atomic in kdb ftdump
	fs/sysv: Null check to prevent null-ptr-deref bug
	Bluetooth: L2CAP: Fix use-after-free in l2cap_sock_ready_cb
	net: usbnet: Fix WARNING in usbnet_start_xmit/usb_submit_urb
	fs: Protect reconfiguration of sb read-write from racing writes
	ext2: Drop fragment support
	mtd: rawnand: omap_elm: Fix incorrect type in assignment
	mtd: rawnand: fsl_upm: Fix an off-by one test in fun_exec_op()
	powerpc/mm/altmap: Fix altmap boundary check
	selftests/rseq: check if libc rseq support is registered
	selftests/rseq: Play nice with binaries statically linked against glibc 2.35+
	soundwire: bus: add better dev_dbg to track complete() calls
	soundwire: bus: pm_runtime_request_resume on peripheral attachment
	soundwire: fix enumeration completion
	PM / wakeirq: support enabling wake-up irq after runtime_suspend called
	PM: sleep: wakeirq: fix wake irq arming
	exfat: speed up iterate/lookup by fixing start point of traversing cluster chain
	exfat: support dynamic allocate bh for exfat_entry_set_cache
	exfat: check if filename entries exceeds max filename length
	mt76: move band capabilities in mt76_phy
	mt76: mt7615: Fix fall-through warnings for Clang
	wifi: mt76: mt7615: do not advertise 5 GHz on first phy of MT7615D (DBDC)
	ARM: dts: imx: add usb alias
	ARM: dts: imx6sll: fixup of operating points
	ARM: dts: nxp/imx6sll: fix wrong property name in usbphy node
	x86/CPU/AMD: Do not leak quotient data after a division by 0
	Linux 5.10.190

Fix up build problem in ext4 due to merge of ed3d841f2f ("ext4: fix to
check return value of freeze_bdev() in ext4_shutdown()") conflicting
with a previous block layer core change coming in through the f2fs tree
in the past, that is not upstream, but ANDROID specific.

Change-Id: Ib95e59ce8ba653bcc791802735afafcd26bd996f
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-08-25 12:26:58 +00:00
Aleksa Sarai
c9c78b91c7 open: make RESOLVE_CACHED correctly test for O_TMPFILE
commit a0fc452a5d7fed986205539259df1d60546f536c upstream.

O_TMPFILE is actually __O_TMPFILE|O_DIRECTORY. This means that the old
fast-path check for RESOLVE_CACHED would reject all users passing
O_DIRECTORY with -EAGAIN, when in fact the intended test was to check
for __O_TMPFILE.

Cc: stable@vger.kernel.org # v5.12+
Fixes: 99668f618062 ("fs: expose LOOKUP_CACHED through openat2() RESOLVE_CACHED")
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Message-Id: <20230806-resolve_cached-o_tmpfile-v1-1-7ba16308465e@cyphar.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-11 11:57:53 +02:00
Greg Kroah-Hartman
df23049a96 This is the 5.10.176 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmQa9NgACgkQONu9yGCS
 aT4Iew//X/3+Bpiu+FyaYe0NZ4I95rQvNh4fG6wXCFd/PVbCRpxVOAKQ91GnkU+D
 iMeuGBPqkpPhHvesRybsq0u8GmJ+fJj58+fgy1ABI7UzkWihzNDu1n2RntYmuRvl
 TEEsAIS+6/lhVKosDhyYcXAL5eT8F06zFOI9HspWRe+lYoRBIQyykcLgZQwt5mBX
 qyKAFkvhH0Z77ATiID5alRkVArgi/t3qBUANTrJ7LqOlhtY42EOS0Sp7wpZWskqI
 7Mpb6pfODsOq5d+6zNvZzdrtMaKRBal0Inxj2+zLEYdSv+xbTqp4Cb6UI18gJTA7
 zsvItAzTRxp+7KiZVS2HP3uMRRV4lQ5HxgMJhSsONHSSRh7ndhkW7NQq/o/dRFm2
 IgVf1beHk2pE+LN0Plf2oQCOMV8h/vQRZLCejoQEbFy6oNQ6bA4btJaXZnfluqDb
 KXONyDqXZ3uX3DSrKO4pCNCTsm5JhinkFHhO125kjSkPp/k2YWXdnBftQT1mWPYf
 dbWu1z/E+3qvObedwNn+icuu/MUznZMTYwDOD31tJp+1iEBgeBQWI+IRaIaWbDyD
 dxSoV8cScNZz+X4M70EFlwJMYL/VcIzDljeH2EA3CImDycDH0tspo6z8Z+xFhsrg
 D1wshmaT9XkSEJ92xDMw82B/1noOati75HpkUW1W/PKTqvjH/uU=
 =/t/A
 -----END PGP SIGNATURE-----

Merge 5.10.176 into android12-5.10-lts

Changes in 5.10.176
	xfrm: Allow transport-mode states with AF_UNSPEC selector
	drm/panfrost: Don't sync rpm suspension after mmu flushing
	cifs: Move the in_send statistic to __smb_send_rqst()
	drm/meson: fix 1px pink line on GXM when scaling video overlay
	clk: HI655X: select REGMAP instead of depending on it
	docs: Correct missing "d_" prefix for dentry_operations member d_weak_revalidate
	scsi: mpt3sas: Fix NULL pointer access in mpt3sas_transport_port_add()
	ALSA: hda: Match only Intel devices with CONTROLLER_IN_GPU()
	netfilter: nft_nat: correct length for loading protocol registers
	netfilter: nft_masq: correct length for loading protocol registers
	netfilter: nft_redir: correct length for loading protocol registers
	netfilter: nft_redir: correct value of inet type `.maxattrs`
	scsi: core: Fix a comment in function scsi_host_dev_release()
	scsi: core: Fix a procfs host directory removal regression
	tcp: tcp_make_synack() can be called from process context
	nfc: pn533: initialize struct pn533_out_arg properly
	ipvlan: Make skb->skb_iif track skb->dev for l3s mode
	i40e: Fix kernel crash during reboot when adapter is in recovery mode
	net/smc: fix NULL sndbuf_desc in smc_cdc_tx_handler()
	qed/qed_dev: guard against a possible division by zero
	net: tunnels: annotate lockless accesses to dev->needed_headroom
	net: phy: smsc: bail out in lan87xx_read_status if genphy_read_status fails
	nfc: st-nci: Fix use after free bug in ndlc_remove due to race condition
	net/smc: fix deadlock triggered by cancel_delayed_work_syn()
	net: usb: smsc75xx: Limit packet length to skb->len
	drm/bridge: Fix returned array size name for atomic_get_input_bus_fmts kdoc
	null_blk: Move driver into its own directory
	block: null_blk: Fix handling of fake timeout request
	nvme: fix handling single range discard request
	nvmet: avoid potential UAF in nvmet_req_complete()
	block: sunvdc: add check for mdesc_grab() returning NULL
	ice: xsk: disable txq irq before flushing hw
	net: dsa: mv88e6xxx: fix max_mtu of 1492 on 6165, 6191, 6220, 6250, 6290
	ipv4: Fix incorrect table ID in IOCTL path
	net: usb: smsc75xx: Move packet length check to prevent kernel panic in skb_pull
	net/iucv: Fix size of interrupt data
	selftests: net: devlink_port_split.py: skip test if no suitable device available
	qed/qed_mng_tlv: correctly zero out ->min instead of ->hour
	ethernet: sun: add check for the mdesc_grab()
	hwmon: (adt7475) Display smoothing attributes in correct order
	hwmon: (adt7475) Fix masking of hysteresis registers
	hwmon: (xgene) Fix use after free bug in xgene_hwmon_remove due to race condition
	hwmon: (ina3221) return prober error code
	hwmon: (ucd90320) Add minimum delay between bus accesses
	hwmon: tmp512: drop of_match_ptr for ID table
	hwmon: (adm1266) Set `can_sleep` flag for GPIO chip
	media: m5mols: fix off-by-one loop termination error
	mmc: atmel-mci: fix race between stop command and start of next command
	jffs2: correct logic when creating a hole in jffs2_write_begin
	ext4: fail ext4_iget if special inode unallocated
	ext4: fix task hung in ext4_xattr_delete_inode
	drm/amdkfd: Fix an illegal memory access
	sh: intc: Avoid spurious sizeof-pointer-div warning
	drm/amd/display: fix shift-out-of-bounds in CalculateVMAndRowBytes
	ext4: fix possible double unlock when moving a directory
	tty: serial: fsl_lpuart: skip waiting for transmission complete when UARTCTRL_SBK is asserted
	serial: 8250_em: Fix UART port type
	firmware: xilinx: don't make a sleepable memory allocation from an atomic context
	interconnect: fix mem leak when freeing nodes
	tracing: Make splice_read available again
	tracing: Check field value in hist_field_name()
	tracing: Make tracepoint lockdep check actually test something
	cifs: Fix smb2_set_path_size()
	KVM: nVMX: add missing consistency checks for CR0 and CR4
	ALSA: hda: intel-dsp-config: add MTL PCI id
	ALSA: hda/realtek: Fix the speaker output on Samsung Galaxy Book2 Pro
	drm/shmem-helper: Remove another errant put in error path
	mptcp: avoid setting TCP_CLOSE state twice
	ftrace: Fix invalid address access in lookup_rec() when index is 0
	mm/userfaultfd: propagate uffd-wp bit when PTE-mapping the huge zeropage
	mmc: sdhci_am654: lower power-on failed message severity
	fbdev: stifb: Provide valid pixelclock and add fb_check_var() checks
	cpuidle: psci: Iterate backwards over list in psci_pd_remove()
	x86/mce: Make sure logged MCEs are processed after sysfs update
	x86/mm: Fix use of uninitialized buffer in sme_enable()
	drm/i915: Don't use stolen memory for ring buffers with LLC
	drm/i915/active: Fix misuse of non-idle barriers as fence trackers
	io_uring: avoid null-ptr-deref in io_arm_poll_handler
	s390/ipl: add missing intersection check to ipl_report handling
	PCI: Unify delay handling for reset and resume
	PCI/DPC: Await readiness of secondary bus after reset
	xfs: don't assert fail on perag references on teardown
	xfs: purge dquots after inode walk fails during quotacheck
	xfs: don't leak btree cursor when insrec fails after a split
	xfs: remove XFS_PREALLOC_SYNC
	xfs: fallocate() should call file_modified()
	xfs: set prealloc flag in xfs_alloc_file_space()
	xfs: use setattr_copy to set vfs inode attributes
	fs: add mode_strip_sgid() helper
	fs: move S_ISGID stripping into the vfs_*() helpers
	attr: add in_group_or_capable()
	fs: move should_remove_suid()
	attr: add setattr_should_drop_sgid()
	attr: use consistent sgid stripping checks
	fs: use consistent setgid checks in is_sxid()
	xfs: remove xfs_setattr_time() declaration
	HID: core: Provide new max_buffer_size attribute to over-ride the default
	HID: uhid: Over-ride the default maximum data buffer value with our own
	Linux 5.10.176

Change-Id: Icd45189f4182c749d1758c13e18705abb4ea9c5a
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-03-24 16:03:04 +00:00
Amir Goldstein
0e9dbde96c attr: use consistent sgid stripping checks
commit ed5a7047d2011cb6b2bf84ceb6680124cc6a7d95 upstream.

[backported to 5.10.y, prior to idmapped mounts]

Currently setgid stripping in file_remove_privs()'s should_remove_suid()
helper is inconsistent with other parts of the vfs. Specifically, it only
raises ATTR_KILL_SGID if the inode is S_ISGID and S_IXGRP but not if the
inode isn't in the caller's groups and the caller isn't privileged over the
inode although we require this already in setattr_prepare() and
setattr_copy() and so all filesystem implement this requirement implicitly
because they have to use setattr_{prepare,copy}() anyway.

But the inconsistency shows up in setgid stripping bugs for overlayfs in
xfstests (e.g., generic/673, generic/683, generic/685, generic/686,
generic/687). For example, we test whether suid and setgid stripping works
correctly when performing various write-like operations as an unprivileged
user (fallocate, reflink, write, etc.):

echo "Test 1 - qa_user, non-exec file $verb"
setup_testfile
chmod a+rws $junk_file
commit_and_check "$qa_user" "$verb" 64k 64k

The test basically creates a file with 6666 permissions. While the file has
the S_ISUID and S_ISGID bits set it does not have the S_IXGRP set. On a
regular filesystem like xfs what will happen is:

sys_fallocate()
-> vfs_fallocate()
   -> xfs_file_fallocate()
      -> file_modified()
         -> __file_remove_privs()
            -> dentry_needs_remove_privs()
               -> should_remove_suid()
            -> __remove_privs()
               newattrs.ia_valid = ATTR_FORCE | kill;
               -> notify_change()
                  -> setattr_copy()

In should_remove_suid() we can see that ATTR_KILL_SUID is raised
unconditionally because the file in the test has S_ISUID set.

But we also see that ATTR_KILL_SGID won't be set because while the file
is S_ISGID it is not S_IXGRP (see above) which is a condition for
ATTR_KILL_SGID being raised.

So by the time we call notify_change() we have attr->ia_valid set to
ATTR_KILL_SUID | ATTR_FORCE. Now notify_change() sees that
ATTR_KILL_SUID is set and does:

ia_valid = attr->ia_valid |= ATTR_MODE
attr->ia_mode = (inode->i_mode & ~S_ISUID);

which means that when we call setattr_copy() later we will definitely
update inode->i_mode. Note that attr->ia_mode still contains S_ISGID.

Now we call into the filesystem's ->setattr() inode operation which will
end up calling setattr_copy(). Since ATTR_MODE is set we will hit:

if (ia_valid & ATTR_MODE) {
        umode_t mode = attr->ia_mode;
        vfsgid_t vfsgid = i_gid_into_vfsgid(mnt_userns, inode);
        if (!vfsgid_in_group_p(vfsgid) &&
            !capable_wrt_inode_uidgid(mnt_userns, inode, CAP_FSETID))
                mode &= ~S_ISGID;
        inode->i_mode = mode;
}

and since the caller in the test is neither capable nor in the group of the
inode the S_ISGID bit is stripped.

But assume the file isn't suid then ATTR_KILL_SUID won't be raised which
has the consequence that neither the setgid nor the suid bits are stripped
even though it should be stripped because the inode isn't in the caller's
groups and the caller isn't privileged over the inode.

If overlayfs is in the mix things become a bit more complicated and the bug
shows up more clearly. When e.g., ovl_setattr() is hit from
ovl_fallocate()'s call to file_remove_privs() then ATTR_KILL_SUID and
ATTR_KILL_SGID might be raised but because the check in notify_change() is
questioning the ATTR_KILL_SGID flag again by requiring S_IXGRP for it to be
stripped the S_ISGID bit isn't removed even though it should be stripped:

sys_fallocate()
-> vfs_fallocate()
   -> ovl_fallocate()
      -> file_remove_privs()
         -> dentry_needs_remove_privs()
            -> should_remove_suid()
         -> __remove_privs()
            newattrs.ia_valid = ATTR_FORCE | kill;
            -> notify_change()
               -> ovl_setattr()
                  // TAKE ON MOUNTER'S CREDS
                  -> ovl_do_notify_change()
                     -> notify_change()
                  // GIVE UP MOUNTER'S CREDS
     // TAKE ON MOUNTER'S CREDS
     -> vfs_fallocate()
        -> xfs_file_fallocate()
           -> file_modified()
              -> __file_remove_privs()
                 -> dentry_needs_remove_privs()
                    -> should_remove_suid()
                 -> __remove_privs()
                    newattrs.ia_valid = attr_force | kill;
                    -> notify_change()

The fix for all of this is to make file_remove_privs()'s
should_remove_suid() helper to perform the same checks as we already
require in setattr_prepare() and setattr_copy() and have notify_change()
not pointlessly requiring S_IXGRP again. It doesn't make any sense in the
first place because the caller must calculate the flags via
should_remove_suid() anyway which would raise ATTR_KILL_SGID.

While we're at it we move should_remove_suid() from inode.c to attr.c
where it belongs with the rest of the iattr helpers. Especially since it
returns ATTR_KILL_S{G,U}ID flags. We also rename it to
setattr_should_drop_suidgid() to better reflect that it indicates both
setuid and setgid bit removal and also that it returns attr flags.

Running xfstests with this doesn't report any regressions. We should really
try and use consistent checks.

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-22 13:30:08 +01:00
Greg Kroah-Hartman
8596b99884 Merge 5.10.162 into android12-5.10-lts
Changes in 5.10.162
	kernel: provide create_io_thread() helper
	iov_iter: add helper to save iov_iter state
	saner calling conventions for unlazy_child()
	fs: add support for LOOKUP_CACHED
	fix handling of nd->depth on LOOKUP_CACHED failures in try_to_unlazy*
	Make sure nd->path.mnt and nd->path.dentry are always valid pointers
	fs: expose LOOKUP_CACHED through openat2() RESOLVE_CACHED
	tools headers UAPI: Sync openat2.h with the kernel sources
	net: provide __sys_shutdown_sock() that takes a socket
	net: add accept helper not installing fd
	signal: Add task_sigpending() helper
	fs: make do_renameat2() take struct filename
	file: Rename __close_fd_get_file close_fd_get_file
	fs: provide locked helper variant of close_fd_get_file()
	entry: Add support for TIF_NOTIFY_SIGNAL
	task_work: Use TIF_NOTIFY_SIGNAL if available
	x86: Wire up TIF_NOTIFY_SIGNAL
	arc: add support for TIF_NOTIFY_SIGNAL
	arm64: add support for TIF_NOTIFY_SIGNAL
	m68k: add support for TIF_NOTIFY_SIGNAL
	nios32: add support for TIF_NOTIFY_SIGNAL
	parisc: add support for TIF_NOTIFY_SIGNAL
	powerpc: add support for TIF_NOTIFY_SIGNAL
	mips: add support for TIF_NOTIFY_SIGNAL
	s390: add support for TIF_NOTIFY_SIGNAL
	um: add support for TIF_NOTIFY_SIGNAL
	sh: add support for TIF_NOTIFY_SIGNAL
	openrisc: add support for TIF_NOTIFY_SIGNAL
	csky: add support for TIF_NOTIFY_SIGNAL
	hexagon: add support for TIF_NOTIFY_SIGNAL
	microblaze: add support for TIF_NOTIFY_SIGNAL
	arm: add support for TIF_NOTIFY_SIGNAL
	xtensa: add support for TIF_NOTIFY_SIGNAL
	alpha: add support for TIF_NOTIFY_SIGNAL
	c6x: add support for TIF_NOTIFY_SIGNAL
	h8300: add support for TIF_NOTIFY_SIGNAL
	ia64: add support for TIF_NOTIFY_SIGNAL
	nds32: add support for TIF_NOTIFY_SIGNAL
	riscv: add support for TIF_NOTIFY_SIGNAL
	sparc: add support for TIF_NOTIFY_SIGNAL
	ia64: don't call handle_signal() unless there's actually a signal queued
	ARC: unbork 5.11 bootup: fix snafu in _TIF_NOTIFY_SIGNAL handling
	alpha: fix TIF_NOTIFY_SIGNAL handling
	task_work: remove legacy TWA_SIGNAL path
	kernel: remove checking for TIF_NOTIFY_SIGNAL
	coredump: Limit what can interrupt coredumps
	kernel: allow fork with TIF_NOTIFY_SIGNAL pending
	entry/kvm: Exit to user mode when TIF_NOTIFY_SIGNAL is set
	arch: setup PF_IO_WORKER threads like PF_KTHREAD
	arch: ensure parisc/powerpc handle PF_IO_WORKER in copy_thread()
	x86/process: setup io_threads more like normal user space threads
	kernel: stop masking signals in create_io_thread()
	kernel: don't call do_exit() for PF_IO_WORKER threads
	task_work: add helper for more targeted task_work canceling
	io_uring: import 5.15-stable io_uring
	signal: kill JOBCTL_TASK_WORK
	task_work: unconditionally run task_work from get_signal()
	net: remove cmsg restriction from io_uring based send/recvmsg calls
	Revert "proc: don't allow async path resolution of /proc/thread-self components"
	Revert "proc: don't allow async path resolution of /proc/self components"
	eventpoll: add EPOLL_URING_WAKE poll wakeup flag
	eventfd: provide a eventfd_signal_mask() helper
	io_uring: pass in EPOLL_URING_WAKE for eventfd signaling and wakeups
	Linux 5.10.162

Change-Id: I50a7b8bc8d38fac612113281b218cf5323b0af5e
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-02-01 16:13:18 +00:00
Jens Axboe
5683caa735 fs: expose LOOKUP_CACHED through openat2() RESOLVE_CACHED
[ Upstream commit 99668f618062816ca7ba639b007eb145b9d3d41e ]

Now that we support non-blocking path resolution internally, expose it
via openat2() in the struct open_how ->resolve flags. This allows
applications using openat2() to limit path resolution to the extent that
it is already cached.

If the lookup cannot be satisfied in a non-blocking manner, openat2(2)
will return -1/-EAGAIN.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-04 11:39:17 +01:00
Greg Kroah-Hartman
d483eed85f ANDROID: GKI: set vfs-only exports into their own namespace
We have namespaces, so use them for all vfs-exported namespaces so that
filesystems can use them, but not anything else.

Some in-kernel drivers that do direct filesystem accesses (because they
serve up files) are also allowed access to these symbols to keep 'make
allmodconfig' builds working properly, but it is not needed for Android
kernel images.

Bug: 157965270
Bug: 210074446
Cc: Matthias Maennich <maennich@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Iaf6140baf3a18a516ab2d5c3966235c42f3f70de
2022-01-11 09:30:47 +01:00
Greg Kroah-Hartman
2df0fb4a4b This is the 5.10.50 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmDu+1UACgkQONu9yGCS
 aT7jQRAAuLDi7ejk3JUameYFMzVXGAUE6yPs392/lWJzey7IBf+2uLqz4FzqqUHp
 U1GkEKJVaCacEfi0+rpi7BxNFljUdZdg/F/P68ARtAWPvwqAeJ4QIh5u3A682UUO
 1M5h6e5/oY9F4kQIb5Kot04avqOeR6lTqrkA8jeP5h43ngyLWuS2d+5oOGmbCukS
 UgEaCC6CiKjcN51UUTj/fXMQ0X4IDHP5pD8rWwH0IvK0i7gduvk744un8LVB6aW1
 rNV88C3BEFFtkPQh2XySnXM5Ok8kYlhFoTDsqlpeAX7pA8hiUPYBoRzTg0MJtPZn
 N1L/Yqhvxmn5xs9HAw7mDOo8E8NWXzsT5FvZVaBeiCgtdKmcPszylXqmSt1oiOb0
 /EmkCWmlbG/3qWql24+LU4XP36iVPx32HQxAgg2XbnlNU5o0E1y2F98p6p/3JSWX
 NAjHtmg/MxueFQ+w8bDzhO8YzYn1dIU3V3qaXRvtpODrmaSYW+bwCyPtSjXe3/vL
 604zb3dOg9+tD/gKqfRb/UPMu24nNll8M/gnSRci05/thmIxwtYudPwoLNSejDqr
 e+a8vejISfIyp41XrpYQbUeKs1WOA+A7vgx6CZrT791afiT+6UgC/ecQfg1NFxhs
 8ayWpocaIszxyXxVGro1rfwZeQmTlbTCZ5wVdpn9sDPZfI7epts=
 =FCrA
 -----END PGP SIGNATURE-----

Merge 5.10.50 into android12-5.10-lts

Changes in 5.10.50
	Bluetooth: hci_qca: fix potential GPF
	Bluetooth: btqca: Don't modify firmware contents in-place
	Bluetooth: Remove spurious error message
	ALSA: usb-audio: fix rate on Ozone Z90 USB headset
	ALSA: usb-audio: Fix OOB access at proc output
	ALSA: firewire-motu: fix stream format for MOTU 8pre FireWire
	ALSA: usb-audio: scarlett2: Fix wrong resume call
	ALSA: intel8x0: Fix breakage at ac97 clock measurement
	ALSA: hda/realtek: fix mute/micmute LEDs for HP ProBook 450 G8
	ALSA: hda/realtek: fix mute/micmute LEDs for HP ProBook 445 G8
	ALSA: hda/realtek: fix mute/micmute LEDs for HP ProBook 630 G8
	ALSA: hda/realtek: Add another ALC236 variant support
	ALSA: hda/realtek: fix mute/micmute LEDs for HP EliteBook x360 830 G8
	ALSA: hda/realtek: Improve fixup for HP Spectre x360 15-df0xxx
	ALSA: hda/realtek: Fix bass speaker DAC mapping for Asus UM431D
	ALSA: hda/realtek: Apply LED fixup for HP Dragonfly G1, too
	ALSA: hda/realtek: fix mute/micmute LEDs for HP EliteBook 830 G8 Notebook PC
	media: dvb-usb: fix wrong definition
	Input: usbtouchscreen - fix control-request directions
	net: can: ems_usb: fix use-after-free in ems_usb_disconnect()
	usb: gadget: eem: fix echo command packet response issue
	usb: renesas-xhci: Fix handling of unknown ROM state
	USB: cdc-acm: blacklist Heimann USB Appset device
	usb: dwc3: Fix debugfs creation flow
	usb: typec: Add the missed altmode_id_remove() in typec_register_altmode()
	xhci: solve a double free problem while doing s4
	gfs2: Fix underflow in gfs2_page_mkwrite
	gfs2: Fix error handling in init_statfs
	ntfs: fix validity check for file name attribute
	selftests/lkdtm: Avoid needing explicit sub-shell
	copy_page_to_iter(): fix ITER_DISCARD case
	iov_iter_fault_in_readable() should do nothing in xarray case
	Input: joydev - prevent use of not validated data in JSIOCSBTNMAP ioctl
	crypto: nx - Fix memcpy() over-reading in nonce
	crypto: ccp - Annotate SEV Firmware file names
	arm_pmu: Fix write counter incorrect in ARMv7 big-endian mode
	ARM: dts: ux500: Fix LED probing
	ARM: dts: at91: sama5d4: fix pinctrl muxing
	btrfs: send: fix invalid path for unlink operations after parent orphanization
	btrfs: compression: don't try to compress if we don't have enough pages
	btrfs: clear defrag status of a root if starting transaction fails
	ext4: cleanup in-core orphan list if ext4_truncate() failed to get a transaction handle
	ext4: fix kernel infoleak via ext4_extent_header
	ext4: fix overflow in ext4_iomap_alloc()
	ext4: return error code when ext4_fill_flex_info() fails
	ext4: correct the cache_nr in tracepoint ext4_es_shrink_exit
	ext4: remove check for zero nr_to_scan in ext4_es_scan()
	ext4: fix avefreec in find_group_orlov
	ext4: use ext4_grp_locked_error in mb_find_extent
	can: bcm: delay release of struct bcm_op after synchronize_rcu()
	can: gw: synchronize rcu operations before removing gw job entry
	can: isotp: isotp_release(): omit unintended hrtimer restart on socket release
	can: j1939: j1939_sk_init(): set SOCK_RCU_FREE to call sk_destruct() after RCU is done
	can: peak_pciefd: pucan_handle_status(): fix a potential starvation issue in TX path
	mac80211: remove iwlwifi specific workaround that broke sta NDP tx
	SUNRPC: Fix the batch tasks count wraparound.
	SUNRPC: Should wake up the privileged task firstly.
	bus: mhi: Wait for M2 state during system resume
	mm/gup: fix try_grab_compound_head() race with split_huge_page()
	perf/smmuv3: Don't trample existing events with global filter
	KVM: nVMX: Handle split-lock #AC exceptions that happen in L2
	KVM: PPC: Book3S HV: Workaround high stack usage with clang
	KVM: x86/mmu: Treat NX as used (not reserved) for all !TDP shadow MMUs
	KVM: x86/mmu: Use MMU's role to detect CR4.SMEP value in nested NPT walk
	s390/cio: dont call css_wait_for_slow_path() inside a lock
	s390: mm: Fix secure storage access exception handling
	f2fs: Prevent swap file in LFS mode
	clk: agilex/stratix10/n5x: fix how the bypass_reg is handled
	clk: agilex/stratix10: remove noc_clk
	clk: agilex/stratix10: fix bypass representation
	rtc: stm32: Fix unbalanced clk_disable_unprepare() on probe error path
	iio: frequency: adf4350: disable reg and clk on error in adf4350_probe()
	iio: light: tcs3472: do not free unallocated IRQ
	iio: ltr501: mark register holding upper 8 bits of ALS_DATA{0,1} and PS_DATA as volatile, too
	iio: ltr501: ltr559: fix initialization of LTR501_ALS_CONTR
	iio: ltr501: ltr501_read_ps(): add missing endianness conversion
	iio: accel: bma180: Fix BMA25x bandwidth register values
	serial: mvebu-uart: fix calculation of clock divisor
	serial: sh-sci: Stop dmaengine transfer in sci_stop_tx()
	serial_cs: Add Option International GSM-Ready 56K/ISDN modem
	serial_cs: remove wrong GLOBETROTTER.cis entry
	ath9k: Fix kernel NULL pointer dereference during ath_reset_internal()
	ssb: sdio: Don't overwrite const buffer if block_write fails
	rsi: Assign beacon rate settings to the correct rate_info descriptor field
	rsi: fix AP mode with WPA failure due to encrypted EAPOL
	tracing/histograms: Fix parsing of "sym-offset" modifier
	tracepoint: Add tracepoint_probe_register_may_exist() for BPF tracing
	seq_buf: Make trace_seq_putmem_hex() support data longer than 8
	powerpc/stacktrace: Fix spurious "stale" traces in raise_backtrace_ipi()
	loop: Fix missing discard support when using LOOP_CONFIGURE
	evm: Execute evm_inode_init_security() only when an HMAC key is loaded
	evm: Refuse EVM_ALLOW_METADATA_WRITES only if an HMAC key is loaded
	fuse: Fix crash in fuse_dentry_automount() error path
	fuse: Fix crash if superblock of submount gets killed early
	fuse: Fix infinite loop in sget_fc()
	fuse: ignore PG_workingset after stealing
	fuse: check connected before queueing on fpq->io
	fuse: reject internal errno
	thermal/cpufreq_cooling: Update offline CPUs per-cpu thermal_pressure
	spi: Make of_register_spi_device also set the fwnode
	Add a reference to ucounts for each cred
	staging: media: rkvdec: fix pm_runtime_get_sync() usage count
	media: marvel-ccic: fix some issues when getting pm_runtime
	media: mdk-mdp: fix pm_runtime_get_sync() usage count
	media: s5p: fix pm_runtime_get_sync() usage count
	media: am437x: fix pm_runtime_get_sync() usage count
	media: sh_vou: fix pm_runtime_get_sync() usage count
	media: mtk-vcodec: fix PM runtime get logic
	media: s5p-jpeg: fix pm_runtime_get_sync() usage count
	media: sunxi: fix pm_runtime_get_sync() usage count
	media: sti/bdisp: fix pm_runtime_get_sync() usage count
	media: exynos4-is: fix pm_runtime_get_sync() usage count
	media: exynos-gsc: fix pm_runtime_get_sync() usage count
	spi: spi-loopback-test: Fix 'tx_buf' might be 'rx_buf'
	spi: spi-topcliff-pch: Fix potential double free in pch_spi_process_messages()
	spi: omap-100k: Fix the length judgment problem
	regulator: uniphier: Add missing MODULE_DEVICE_TABLE
	sched/core: Initialize the idle task with preemption disabled
	hwrng: exynos - Fix runtime PM imbalance on error
	crypto: nx - add missing MODULE_DEVICE_TABLE
	media: sti: fix obj-$(config) targets
	media: cpia2: fix memory leak in cpia2_usb_probe
	media: cobalt: fix race condition in setting HPD
	media: hevc: Fix dependent slice segment flags
	media: pvrusb2: fix warning in pvr2_i2c_core_done
	media: imx: imx7_mipi_csis: Fix logging of only error event counters
	crypto: qat - check return code of qat_hal_rd_rel_reg()
	crypto: qat - remove unused macro in FW loader
	crypto: qce: skcipher: Fix incorrect sg count for dma transfers
	arm64: perf: Convert snprintf to sysfs_emit
	sched/fair: Fix ascii art by relpacing tabs
	media: i2c: ov2659: Use clk_{prepare_enable,disable_unprepare}() to set xvclk on/off
	media: bt878: do not schedule tasklet when it is not setup
	media: em28xx: Fix possible memory leak of em28xx struct
	media: hantro: Fix .buf_prepare
	media: cedrus: Fix .buf_prepare
	media: v4l2-core: Avoid the dangling pointer in v4l2_fh_release
	media: bt8xx: Fix a missing check bug in bt878_probe
	media: st-hva: Fix potential NULL pointer dereferences
	crypto: hisilicon/sec - fixup 3des minimum key size declaration
	Makefile: fix GDB warning with CONFIG_RELR
	media: dvd_usb: memory leak in cinergyt2_fe_attach
	memstick: rtsx_usb_ms: fix UAF
	mmc: sdhci-sprd: use sdhci_sprd_writew
	mmc: via-sdmmc: add a check against NULL pointer dereference
	spi: meson-spicc: fix a wrong goto jump for avoiding memory leak.
	spi: meson-spicc: fix memory leak in meson_spicc_probe
	crypto: shash - avoid comparing pointers to exported functions under CFI
	media: dvb_net: avoid speculation from net slot
	media: siano: fix device register error path
	media: imx-csi: Skip first few frames from a BT.656 source
	hwmon: (max31790) Report correct current pwm duty cycles
	hwmon: (max31790) Fix pwmX_enable attributes
	drivers/perf: fix the missed ida_simple_remove() in ddr_perf_probe()
	KVM: PPC: Book3S HV: Fix TLB management on SMT8 POWER9 and POWER10 processors
	btrfs: fix error handling in __btrfs_update_delayed_inode
	btrfs: abort transaction if we fail to update the delayed inode
	btrfs: sysfs: fix format string for some discard stats
	btrfs: don't clear page extent mapped if we're not invalidating the full page
	btrfs: disable build on platforms having page size 256K
	locking/lockdep: Fix the dep path printing for backwards BFS
	lockding/lockdep: Avoid to find wrong lock dep path in check_irq_usage()
	KVM: s390: get rid of register asm usage
	regulator: mt6358: Fix vdram2 .vsel_mask
	regulator: da9052: Ensure enough delay time for .set_voltage_time_sel
	media: Fix Media Controller API config checks
	ACPI: video: use native backlight for GA401/GA502/GA503
	HID: do not use down_interruptible() when unbinding devices
	EDAC/ti: Add missing MODULE_DEVICE_TABLE
	ACPI: processor idle: Fix up C-state latency if not ordered
	hv_utils: Fix passing zero to 'PTR_ERR' warning
	lib: vsprintf: Fix handling of number field widths in vsscanf
	Input: goodix - platform/x86: touchscreen_dmi - Move upside down quirks to touchscreen_dmi.c
	platform/x86: touchscreen_dmi: Add an extra entry for the upside down Goodix touchscreen on Teclast X89 tablets
	platform/x86: touchscreen_dmi: Add info for the Goodix GT912 panel of TM800A550L tablets
	ACPI: EC: Make more Asus laptops use ECDT _GPE
	block_dump: remove block_dump feature in mark_inode_dirty()
	blk-mq: grab rq->refcount before calling ->fn in blk_mq_tagset_busy_iter
	blk-mq: clear stale request in tags->rq[] before freeing one request pool
	fs: dlm: cancel work sync othercon
	random32: Fix implicit truncation warning in prandom_seed_state()
	open: don't silently ignore unknown O-flags in openat2()
	drivers: hv: Fix missing error code in vmbus_connect()
	fs: dlm: fix memory leak when fenced
	ACPICA: Fix memory leak caused by _CID repair function
	ACPI: bus: Call kobject_put() in acpi_init() error path
	ACPI: resources: Add checks for ACPI IRQ override
	block: fix race between adding/removing rq qos and normal IO
	platform/x86: asus-nb-wmi: Revert "Drop duplicate DMI quirk structures"
	platform/x86: asus-nb-wmi: Revert "add support for ASUS ROG Zephyrus G14 and G15"
	platform/x86: toshiba_acpi: Fix missing error code in toshiba_acpi_setup_keyboard()
	nvme-pci: fix var. type for increasing cq_head
	nvmet-fc: do not check for invalid target port in nvmet_fc_handle_fcp_rqst()
	EDAC/Intel: Do not load EDAC driver when running as a guest
	PCI: hv: Add check for hyperv_initialized in init_hv_pci_drv()
	cifs: improve fallocate emulation
	ACPI: EC: trust DSDT GPE for certain HP laptop
	clocksource: Retry clock read if long delays detected
	clocksource: Check per-CPU clock synchronization when marked unstable
	tpm_tis_spi: add missing SPI device ID entries
	ACPI: tables: Add custom DSDT file as makefile prerequisite
	HID: wacom: Correct base usage for capacitive ExpressKey status bits
	cifs: fix missing spinlock around update to ses->status
	mailbox: qcom: Use PLATFORM_DEVID_AUTO to register platform device
	block: fix discard request merge
	kthread_worker: fix return value when kthread_mod_delayed_work() races with kthread_cancel_delayed_work_sync()
	ia64: mca_drv: fix incorrect array size calculation
	writeback, cgroup: increment isw_nr_in_flight before grabbing an inode
	spi: Allow to have all native CSs in use along with GPIOs
	spi: Avoid undefined behaviour when counting unused native CSs
	media: venus: Rework error fail recover logic
	media: s5p_cec: decrement usage count if disabled
	media: hantro: do a PM resume earlier
	crypto: ixp4xx - dma_unmap the correct address
	crypto: ixp4xx - update IV after requests
	crypto: ux500 - Fix error return code in hash_hw_final()
	sata_highbank: fix deferred probing
	pata_rb532_cf: fix deferred probing
	media: I2C: change 'RST' to "RSET" to fix multiple build errors
	sched/uclamp: Fix wrong implementation of cpu.uclamp.min
	sched/uclamp: Fix locking around cpu_util_update_eff()
	kbuild: Fix objtool dependency for 'OBJECT_FILES_NON_STANDARD_<obj> := n'
	pata_octeon_cf: avoid WARN_ON() in ata_host_activate()
	evm: fix writing <securityfs>/evm overflow
	x86/elf: Use _BITUL() macro in UAPI headers
	crypto: sa2ul - Fix leaks on failure paths with sa_dma_init()
	crypto: sa2ul - Fix pm_runtime enable in sa_ul_probe()
	crypto: ccp - Fix a resource leak in an error handling path
	media: rc: i2c: Fix an error message
	pata_ep93xx: fix deferred probing
	locking/lockdep: Reduce LOCKDEP dependency list
	media: rkvdec: Fix .buf_prepare
	media: exynos4-is: Fix a use after free in isp_video_release
	media: au0828: fix a NULL vs IS_ERR() check
	media: tc358743: Fix error return code in tc358743_probe_of()
	media: gspca/gl860: fix zero-length control requests
	m68k: atari: Fix ATARI_KBD_CORE kconfig unmet dependency warning
	media: siano: Fix out-of-bounds warnings in smscore_load_firmware_family2()
	regulator: fan53880: Fix vsel_mask setting for FAN53880_BUCK
	crypto: nitrox - fix unchecked variable in nitrox_register_interrupts
	crypto: omap-sham - Fix PM reference leak in omap sham ops
	crypto: x86/curve25519 - fix cpu feature checking logic in mod_exit
	crypto: sm2 - remove unnecessary reset operations
	crypto: sm2 - fix a memory leak in sm2
	mmc: usdhi6rol0: fix error return code in usdhi6_probe()
	arm64: consistently use reserved_pg_dir
	arm64/mm: Fix ttbr0 values stored in struct thread_info for software-pan
	media: subdev: remove VIDIOC_DQEVENT_TIME32 handling
	media: s5p-g2d: Fix a memory leak on ctx->fh.m2m_ctx
	hwmon: (lm70) Use device_get_match_data()
	hwmon: (lm70) Revert "hwmon: (lm70) Add support for ACPI"
	hwmon: (max31722) Remove non-standard ACPI device IDs
	hwmon: (max31790) Fix fan speed reporting for fan7..12
	KVM: nVMX: Sync all PGDs on nested transition with shadow paging
	KVM: nVMX: Ensure 64-bit shift when checking VMFUNC bitmap
	KVM: nVMX: Don't clobber nested MMU's A/D status on EPTP switch
	KVM: x86/mmu: Fix return value in tdp_mmu_map_handle_target_level()
	perf/arm-cmn: Fix invalid pointer when access dtc object sharing the same IRQ number
	KVM: arm64: Don't zero the cycle count register when PMCR_EL0.P is set
	regulator: hi655x: Fix pass wrong pointer to config.driver_data
	btrfs: clear log tree recovering status if starting transaction fails
	x86/sev: Make sure IRQs are disabled while GHCB is active
	x86/sev: Split up runtime #VC handler for correct state tracking
	sched/rt: Fix RT utilization tracking during policy change
	sched/rt: Fix Deadline utilization tracking during policy change
	sched/uclamp: Fix uclamp_tg_restrict()
	lockdep: Fix wait-type for empty stack
	lockdep/selftests: Fix selftests vs PROVE_RAW_LOCK_NESTING
	spi: spi-sun6i: Fix chipselect/clock bug
	crypto: nx - Fix RCU warning in nx842_OF_upd_status
	psi: Fix race between psi_trigger_create/destroy
	media: v4l2-async: Clean v4l2_async_notifier_add_fwnode_remote_subdev
	media: video-mux: Skip dangling endpoints
	PM / devfreq: Add missing error code in devfreq_add_device()
	ACPI: PM / fan: Put fan device IDs into separate header file
	block: avoid double io accounting for flush request
	nvme-pci: look for StorageD3Enable on companion ACPI device instead
	ACPI: sysfs: Fix a buffer overrun problem with description_show()
	mark pstore-blk as broken
	clocksource/drivers/timer-ti-dm: Save and restore timer TIOCP_CFG
	extcon: extcon-max8997: Fix IRQ freeing at error path
	ACPI: APEI: fix synchronous external aborts in user-mode
	blk-wbt: introduce a new disable state to prevent false positive by rwb_enabled()
	blk-wbt: make sure throttle is enabled properly
	ACPI: Use DEVICE_ATTR_<RW|RO|WO> macros
	ACPI: bgrt: Fix CFI violation
	cpufreq: Make cpufreq_online() call driver->offline() on errors
	blk-mq: update hctx->dispatch_busy in case of real scheduler
	ocfs2: fix snprintf() checking
	dax: fix ENOMEM handling in grab_mapping_entry()
	mm/debug_vm_pgtable/basic: add validation for dirtiness after write protect
	mm/debug_vm_pgtable/basic: iterate over entire protection_map[]
	mm/debug_vm_pgtable: ensure THP availability via has_transparent_hugepage()
	swap: fix do_swap_page() race with swapoff
	mm/shmem: fix shmem_swapin() race with swapoff
	mm: memcg/slab: properly set up gfp flags for objcg pointer array
	mm: page_alloc: refactor setup_per_zone_lowmem_reserve()
	mm/page_alloc: fix counting of managed_pages
	xfrm: xfrm_state_mtu should return at least 1280 for ipv6
	drm/bridge/sii8620: fix dependency on extcon
	drm/bridge: Fix the stop condition of drm_bridge_chain_pre_enable()
	drm/amd/dc: Fix a missing check bug in dm_dp_mst_detect()
	drm/ast: Fix missing conversions to managed API
	video: fbdev: imxfb: Fix an error message
	net: mvpp2: Put fwnode in error case during ->probe()
	net: pch_gbe: Propagate error from devm_gpio_request_one()
	pinctrl: renesas: r8a7796: Add missing bias for PRESET# pin
	pinctrl: renesas: r8a77990: JTAG pins do not have pull-down capabilities
	drm/vmwgfx: Mark a surface gpu-dirty after the SVGA3dCmdDXGenMips command
	drm/vmwgfx: Fix cpu updates of coherent multisample surfaces
	net: qrtr: ns: Fix error return code in qrtr_ns_init()
	clk: meson: g12a: fix gp0 and hifi ranges
	net: ftgmac100: add missing error return code in ftgmac100_probe()
	drm: rockchip: set alpha_en to 0 if it is not used
	drm/rockchip: cdn-dp-core: add missing clk_disable_unprepare() on error in cdn_dp_grf_write()
	drm/rockchip: dsi: move all lane config except LCDC mux to bind()
	drm/rockchip: lvds: Fix an error handling path
	drm/rockchip: cdn-dp: fix sign extension on an int multiply for a u64 result
	mptcp: fix pr_debug in mptcp_token_new_connect
	mptcp: generate subflow hmac after mptcp_finish_join()
	RDMA/srp: Fix a recently introduced memory leak
	RDMA/rtrs-clt: Check state of the rtrs_clt_sess before reading its stats
	RDMA/rtrs: Do not reset hb_missed_max after re-connection
	RDMA/rtrs-srv: Fix memory leak of unfreed rtrs_srv_stats object
	RDMA/rtrs-srv: Fix memory leak when having multiple sessions
	RDMA/rtrs-clt: Check if the queue_depth has changed during a reconnection
	RDMA/rtrs-clt: Fix memory leak of not-freed sess->stats and stats->pcpu_stats
	ehea: fix error return code in ehea_restart_qps()
	clk: tegra30: Use 300MHz for video decoder by default
	xfrm: remove the fragment check for ipv6 beet mode
	net/sched: act_vlan: Fix modify to allow 0
	RDMA/core: Sanitize WQ state received from the userspace
	drm/pl111: depend on CONFIG_VEXPRESS_CONFIG
	RDMA/rxe: Fix failure during driver load
	drm/pl111: Actually fix CONFIG_VEXPRESS_CONFIG depends
	drm/vc4: hdmi: Fix error path of hpd-gpios
	clk: vc5: fix output disabling when enabling a FOD
	drm: qxl: ensure surf.data is ininitialized
	tools/bpftool: Fix error return code in do_batch()
	ath10k: go to path err_unsupported when chip id is not supported
	ath10k: add missing error return code in ath10k_pci_probe()
	wireless: carl9170: fix LEDS build errors & warnings
	ieee802154: hwsim: Fix possible memory leak in hwsim_subscribe_all_others
	clk: imx8mq: remove SYS PLL 1/2 clock gates
	wcn36xx: Move hal_buf allocation to devm_kmalloc in probe
	ssb: Fix error return code in ssb_bus_scan()
	brcmfmac: fix setting of station info chains bitmask
	brcmfmac: correctly report average RSSI in station info
	brcmfmac: Fix a double-free in brcmf_sdio_bus_reset
	brcmsmac: mac80211_if: Fix a resource leak in an error handling path
	cw1200: Revert unnecessary patches that fix unreal use-after-free bugs
	ath11k: Fix an error handling path in ath11k_core_fetch_board_data_api_n()
	ath10k: Fix an error code in ath10k_add_interface()
	ath11k: send beacon template after vdev_start/restart during csa
	netlabel: Fix memory leak in netlbl_mgmt_add_common
	RDMA/mlx5: Don't add slave port to unaffiliated list
	netfilter: nft_exthdr: check for IPv6 packet before further processing
	netfilter: nft_osf: check for TCP packet before further processing
	netfilter: nft_tproxy: restrict support to TCP and UDP transport protocols
	RDMA/rxe: Fix qp reference counting for atomic ops
	selftests/bpf: Whitelist test_progs.h from .gitignore
	xsk: Fix missing validation for skb and unaligned mode
	xsk: Fix broken Tx ring validation
	bpf: Fix libelf endian handling in resolv_btfids
	RDMA/rtrs-srv: Set minimal max_send_wr and max_recv_wr
	samples/bpf: Fix Segmentation fault for xdp_redirect command
	samples/bpf: Fix the error return code of xdp_redirect's main()
	mt76: fix possible NULL pointer dereference in mt76_tx
	mt76: mt7615: fix NULL pointer dereference in tx_prepare_skb()
	net: ethernet: aeroflex: fix UAF in greth_of_remove
	net: ethernet: ezchip: fix UAF in nps_enet_remove
	net: ethernet: ezchip: fix error handling
	vrf: do not push non-ND strict packets with a source LLA through packet taps again
	net: sched: add barrier to ensure correct ordering for lockless qdisc
	tls: prevent oversized sendfile() hangs by ignoring MSG_MORE
	netfilter: nf_tables_offload: check FLOW_DISSECTOR_KEY_BASIC in VLAN transfer logic
	pkt_sched: sch_qfq: fix qfq_change_class() error path
	xfrm: Fix xfrm offload fallback fail case
	iwlwifi: increase PNVM load timeout
	rtw88: 8822c: fix lc calibration timing
	vxlan: add missing rcu_read_lock() in neigh_reduce()
	ip6_tunnel: fix GRE6 segmentation
	net/ipv4: swap flow ports when validating source
	net: ti: am65-cpsw-nuss: Fix crash when changing number of TX queues
	tc-testing: fix list handling
	ieee802154: hwsim: Fix memory leak in hwsim_add_one
	ieee802154: hwsim: avoid possible crash in hwsim_del_edge_nl()
	bpf: Fix null ptr deref with mixed tail calls and subprogs
	drm/msm: Fix error return code in msm_drm_init()
	drm/msm/dpu: Fix error return code in dpu_mdss_init()
	mac80211: remove iwlwifi specific workaround NDPs of null_response
	net: bcmgenet: Fix attaching to PYH failed on RPi 4B
	ipv6: exthdrs: do not blindly use init_net
	can: j1939: j1939_sk_setsockopt(): prevent allocation of j1939 filter for optlen == 0
	bpf: Do not change gso_size during bpf_skb_change_proto()
	i40e: Fix error handling in i40e_vsi_open
	i40e: Fix autoneg disabling for non-10GBaseT links
	i40e: Fix missing rtnl locking when setting up pf switch
	Revert "ibmvnic: remove duplicate napi_schedule call in open function"
	ibmvnic: set ltb->buff to NULL after freeing
	ibmvnic: free tx_pool if tso_pool alloc fails
	RDMA/cma: Protect RMW with qp_mutex
	net: macsec: fix the length used to copy the key for offloading
	net: phy: mscc: fix macsec key length
	net: atlantic: fix the macsec key length
	ipv6: fix out-of-bound access in ip6_parse_tlv()
	e1000e: Check the PCIm state
	net: dsa: sja1105: fix NULL pointer dereference in sja1105_reload_cbs()
	bpfilter: Specify the log level for the kmsg message
	RDMA/cma: Fix incorrect Packet Lifetime calculation
	gve: Fix swapped vars when fetching max queues
	Revert "be2net: disable bh with spin_lock in be_process_mcc"
	Bluetooth: mgmt: Fix slab-out-of-bounds in tlv_data_is_valid
	Bluetooth: Fix not sending Set Extended Scan Response
	Bluetooth: Fix Set Extended (Scan Response) Data
	Bluetooth: Fix handling of HCI_LE_Advertising_Set_Terminated event
	clk: actions: Fix UART clock dividers on Owl S500 SoC
	clk: actions: Fix SD clocks factor table on Owl S500 SoC
	clk: actions: Fix bisp_factor_table based clocks on Owl S500 SoC
	clk: actions: Fix AHPPREDIV-H-AHB clock chain on Owl S500 SoC
	clk: qcom: clk-alpha-pll: fix CAL_L write in alpha_pll_fabia_prepare
	clk: si5341: Wait for DEVICE_READY on startup
	clk: si5341: Avoid divide errors due to bogus register contents
	clk: si5341: Check for input clock presence and PLL lock on startup
	clk: si5341: Update initialization magic
	writeback: fix obtain a reference to a freeing memcg css
	net: lwtunnel: handle MTU calculation in forwading
	net: sched: fix warning in tcindex_alloc_perfect_hash
	net: tipc: fix FB_MTU eat two pages
	RDMA/mlx5: Don't access NULL-cleared mpi pointer
	RDMA/core: Always release restrack object
	MIPS: Fix PKMAP with 32-bit MIPS huge page support
	staging: fbtft: Rectify GPIO handling
	staging: fbtft: Don't spam logs when probe is deferred
	ASoC: rt5682: Disable irq on shutdown
	rcu: Invoke rcu_spawn_core_kthreads() from rcu_spawn_gp_kthread()
	serial: fsl_lpuart: don't modify arbitrary data on lpuart32
	serial: fsl_lpuart: remove RTSCTS handling from get_mctrl()
	serial: 8250_omap: fix a timeout loop condition
	tty: nozomi: Fix a resource leak in an error handling function
	mwifiex: re-fix for unaligned accesses
	iio: adis_buffer: do not return ints in irq handlers
	iio: adis16400: do not return ints in irq handlers
	iio: adis16475: do not return ints in irq handlers
	iio: accel: bma180: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: accel: bma220: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: accel: hid: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: accel: kxcjk-1013: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: accel: mxc4005: Fix overread of data and alignment issue.
	iio: accel: stk8312: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: accel: stk8ba50: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: adc: ti-ads1015: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: adc: vf610: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: gyro: bmg160: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: humidity: am2315: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: prox: srf08: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: prox: pulsed-light: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: prox: as3935: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: magn: hmc5843: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: magn: bmc150: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: light: isl29125: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: light: tcs3414: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: light: tcs3472: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: chemical: atlas: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: cros_ec_sensors: Fix alignment of buffer in iio_push_to_buffers_with_timestamp()
	iio: potentiostat: lmp91000: Fix alignment of buffer in iio_push_to_buffers_with_timestamp()
	ASoC: rk3328: fix missing clk_disable_unprepare() on error in rk3328_platform_probe()
	ASoC: hisilicon: fix missing clk_disable_unprepare() on error in hi6210_i2s_startup()
	backlight: lm3630a_bl: Put fwnode in error case during ->probe()
	ASoC: rsnd: tidyup loop on rsnd_adg_clk_query()
	Input: hil_kbd - fix error return code in hil_dev_connect()
	perf scripting python: Fix tuple_set_u64()
	mtd: partitions: redboot: seek fis-index-block in the right node
	mtd: rawnand: arasan: Ensure proper configuration for the asserted target
	staging: mmal-vchiq: Fix incorrect static vchiq_instance.
	char: pcmcia: error out if 'num_bytes_read' is greater than 4 in set_protocol()
	firmware: stratix10-svc: Fix a resource leak in an error handling path
	tty: nozomi: Fix the error handling path of 'nozomi_card_init()'
	leds: class: The -ENOTSUPP should never be seen by user space
	leds: lm3532: select regmap I2C API
	leds: lm36274: Put fwnode in error case during ->probe()
	leds: lm3692x: Put fwnode in any case during ->probe()
	leds: lm3697: Don't spam logs when probe is deferred
	leds: lp50xx: Put fwnode in error case during ->probe()
	scsi: FlashPoint: Rename si_flags field
	scsi: iscsi: Flush block work before unblock
	mfd: mp2629: Select MFD_CORE to fix build error
	mfd: rn5t618: Fix IRQ trigger by changing it to level mode
	fsi: core: Fix return of error values on failures
	fsi: scom: Reset the FSI2PIB engine for any error
	fsi: occ: Don't accept response from un-initialized OCC
	fsi/sbefifo: Clean up correct FIFO when receiving reset request from SBE
	fsi/sbefifo: Fix reset timeout
	visorbus: fix error return code in visorchipset_init()
	iommu/amd: Fix extended features logging
	s390/irq: select HAVE_IRQ_EXIT_ON_IRQ_STACK
	s390: enable HAVE_IOREMAP_PROT
	s390: appldata depends on PROC_SYSCTL
	selftests: splice: Adjust for handler fallback removal
	iommu/dma: Fix IOVA reserve dma ranges
	ASoC: max98373-sdw: use first_hw_init flag on resume
	ASoC: rt1308-sdw: use first_hw_init flag on resume
	ASoC: rt5682-sdw: use first_hw_init flag on resume
	ASoC: rt700-sdw: use first_hw_init flag on resume
	ASoC: rt711-sdw: use first_hw_init flag on resume
	ASoC: rt715-sdw: use first_hw_init flag on resume
	ASoC: rt5682: fix getting the wrong device id when the suspend_stress_test
	ASoC: rt5682-sdw: set regcache_cache_only false before reading RT5682_DEVICE_ID
	ASoC: mediatek: mtk-btcvsd: Fix an error handling path in 'mtk_btcvsd_snd_probe()'
	usb: gadget: f_fs: Fix setting of device and driver data cross-references
	usb: dwc2: Don't reset the core after setting turnaround time
	eeprom: idt_89hpesx: Put fwnode in matching case during ->probe()
	eeprom: idt_89hpesx: Restore printing the unsupported fwnode name
	thunderbolt: Bond lanes only when dual_link_port != NULL in alloc_dev_default()
	iio: adc: at91-sama5d2: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: adc: hx711: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: adc: mxs-lradc: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: adc: ti-ads8688: Fix alignment of buffer in iio_push_to_buffers_with_timestamp()
	iio: magn: rm3100: Fix alignment of buffer in iio_push_to_buffers_with_timestamp()
	iio: light: vcnl4000: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	ASoC: fsl_spdif: Fix error handler with pm_runtime_enable
	staging: gdm724x: check for buffer overflow in gdm_lte_multi_sdu_pkt()
	staging: gdm724x: check for overflow in gdm_lte_netif_rx()
	staging: rtl8712: fix error handling in r871xu_drv_init
	staging: rtl8712: fix memory leak in rtl871x_load_fw_cb
	coresight: core: Fix use of uninitialized pointer
	staging: mt7621-dts: fix pci address for PCI memory range
	serial: 8250: Actually allow UPF_MAGIC_MULTIPLIER baud rates
	iio: light: vcnl4035: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	iio: prox: isl29501: Fix buffer alignment in iio_push_to_buffers_with_timestamp()
	ASoC: cs42l42: Correct definition of CS42L42_ADC_PDN_MASK
	of: Fix truncation of memory sizes on 32-bit platforms
	mtd: rawnand: marvell: add missing clk_disable_unprepare() on error in marvell_nfc_resume()
	habanalabs: Fix an error handling path in 'hl_pci_probe()'
	scsi: mpt3sas: Fix error return value in _scsih_expander_add()
	soundwire: stream: Fix test for DP prepare complete
	phy: uniphier-pcie: Fix updating phy parameters
	phy: ti: dm816x: Fix the error handling path in 'dm816x_usb_phy_probe()
	extcon: sm5502: Drop invalid register write in sm5502_reg_data
	extcon: max8997: Add missing modalias string
	powerpc/powernv: Fix machine check reporting of async store errors
	ASoC: atmel-i2s: Fix usage of capture and playback at the same time
	configfs: fix memleak in configfs_release_bin_file
	ASoC: Intel: sof_sdw: add SOF_RT715_DAI_ID_FIX for AlderLake
	ASoC: fsl_spdif: Fix unexpected interrupt after suspend
	leds: as3645a: Fix error return code in as3645a_parse_node()
	leds: ktd2692: Fix an error handling path
	selftests/ftrace: fix event-no-pid on 1-core machine
	serial: 8250: 8250_omap: Disable RX interrupt after DMA enable
	serial: 8250: 8250_omap: Fix possible interrupt storm on K3 SoCs
	powerpc: Offline CPU in stop_this_cpu()
	powerpc/papr_scm: Properly handle UUID types and API
	powerpc/64s: Fix copy-paste data exposure into newly created tasks
	powerpc/papr_scm: Make 'perf_stats' invisible if perf-stats unavailable
	ALSA: firewire-lib: Fix 'amdtp_domain_start()' when no AMDTP_OUT_STREAM stream is found
	serial: mvebu-uart: do not allow changing baudrate when uartclk is not available
	serial: mvebu-uart: correctly calculate minimal possible baudrate
	arm64: dts: marvell: armada-37xx: Fix reg for standard variant of UART
	vfio/pci: Handle concurrent vma faults
	mm/pmem: avoid inserting hugepage PTE entry with fsdax if hugepage support is disabled
	mm/huge_memory.c: remove dedicated macro HPAGE_CACHE_INDEX_MASK
	mm/huge_memory.c: add missing read-only THP checking in transparent_hugepage_enabled()
	mm/huge_memory.c: don't discard hugepage if other processes are mapping it
	mm/hugetlb: use helper huge_page_order and pages_per_huge_page
	mm/hugetlb: remove redundant check in preparing and destroying gigantic page
	hugetlb: remove prep_compound_huge_page cleanup
	include/linux/huge_mm.h: remove extern keyword
	mm/z3fold: fix potential memory leak in z3fold_destroy_pool()
	mm/z3fold: use release_z3fold_page_locked() to release locked z3fold page
	lib/math/rational.c: fix divide by zero
	selftests/vm/pkeys: fix alloc_random_pkey() to make it really, really random
	selftests/vm/pkeys: handle negative sys_pkey_alloc() return code
	selftests/vm/pkeys: refill shadow register after implicit kernel write
	perf llvm: Return -ENOMEM when asprintf() fails
	csky: fix syscache.c fallthrough warning
	csky: syscache: Fixup duplicate cache flush
	exfat: handle wrong stream entry size in exfat_readdir()
	scsi: fc: Correct RHBA attributes length
	scsi: target: cxgbit: Unmap DMA buffer before calling target_execute_cmd()
	mailbox: qcom-ipcc: Fix IPCC mbox channel exhaustion
	fscrypt: don't ignore minor_hash when hash is 0
	fscrypt: fix derivation of SipHash keys on big endian CPUs
	tpm: Replace WARN_ONCE() with dev_err_once() in tpm_tis_status()
	erofs: fix error return code in erofs_read_superblock()
	block: return the correct bvec when checking for gaps
	io_uring: fix blocking inline submission
	mmc: block: Disable CMDQ on the ioctl path
	mmc: vub3000: fix control-request direction
	media: exynos4-is: remove a now unused integer
	scsi: core: Retry I/O for Notify (Enable Spinup) Required error
	crypto: qce - fix error return code in qce_skcipher_async_req_handle()
	s390: preempt: Fix preempt_count initialization
	cred: add missing return error code when set_cred_ucounts() failed
	iommu/dma: Fix compile warning in 32-bit builds
	powerpc/preempt: Don't touch the idle task's preempt_count during hotplug
	Linux 5.10.50

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Iec4eab24ea8eb5a6d79739a1aec8432d93a8f82c
2021-07-14 17:35:23 +02:00
Christian Brauner
019d04f914 open: don't silently ignore unknown O-flags in openat2()
[ Upstream commit cfe80306a0dd6d363934913e47c3f30d71b721e5 ]

The new openat2() syscall verifies that no unknown O-flag values are
set and returns an error to userspace if they are while the older open
syscalls like open() and openat() simply ignore unknown flag values:

  #define O_FLAG_CURRENTLY_INVALID (1 << 31)
  struct open_how how = {
          .flags = O_RDONLY | O_FLAG_CURRENTLY_INVALID,
          .resolve = 0,
  };

  /* fails */
  fd = openat2(-EBADF, "/dev/null", &how, sizeof(how));

  /* succeeds */
  fd = openat(-EBADF, "/dev/null", O_RDONLY | O_FLAG_CURRENTLY_INVALID);

However, openat2() silently truncates the upper 32 bits meaning:

  #define O_FLAG_CURRENTLY_INVALID_LOWER32 (1 << 31)
  #define O_FLAG_CURRENTLY_INVALID_UPPER32 (1 << 40)

  struct open_how how_lowe32 = {
          .flags = O_RDONLY | O_FLAG_CURRENTLY_INVALID_LOWER32,
  };

  struct open_how how_upper32 = {
          .flags = O_RDONLY | O_FLAG_CURRENTLY_INVALID_UPPER32,
  };

  /* fails */
  fd = openat2(-EBADF, "/dev/null", &how_lower32, sizeof(how_lower32));

  /* succeeds */
  fd = openat2(-EBADF, "/dev/null", &how_upper32, sizeof(how_upper32));

Fix this by preventing the immediate truncation in build_open_flags().

There's a snafu here though stripping FMODE_* directly from flags would
cause the upper 32 bits to be truncated as well due to integer promotion
rules since FMODE_* is unsigned int, O_* are signed ints (yuck).

In addition, struct open_flags currently defines flags to be 32 bit
which is reasonable. If we simply were to bump it to 64 bit we would
need to change a lot of code preemptively which doesn't seem worth it.
So simply add a compile-time check verifying that all currently known
O_* flags are within the 32 bit range and fail to build if they aren't
anymore.

This change shouldn't regress old open syscalls since they silently
truncate any unknown values anyway. It is a tiny semantic change for
openat2() but it is very unlikely people pass ing > 32 bit unknown flags
and the syscall is relatively new too.

Link: https://lore.kernel.org/r/20210528092417.3942079-3-brauner@kernel.org
Cc: Christoph Hellwig <hch@lst.de>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Reported-by: Richard Guy Briggs <rgb@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Aleksa Sarai <cyphar@cyphar.com>
Reviewed-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14 16:55:59 +02:00
Kuan-Ying Lee
a7a3b31d58 ANDROID: syscall_check: add vendor hook for open syscall
Through this vendor hook, we can get the timing to check
current running task for the validation of its credential
and open operation.

Bug: 191291287

Signed-off-by: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
Change-Id: Ia644ceb02dbc230ee1d25cad3630c2c3f908e41a
2021-07-09 13:48:44 +00:00
Collin Fijalkovich
28b4b1588e FROMLIST: mm, thp: Relax the VM_DENYWRITE constraint on file-backed THPs
Transparent huge pages are supported for read-only non-shmem files,
but are only used for vmas with VM_DENYWRITE. This condition ensures that
file THPs are protected from writes while an application is running
(ETXTBSY).  Any existing file THPs are then dropped from the page cache
when a file is opened for write in do_dentry_open(). Since sys_mmap
ignores MAP_DENYWRITE, this constrains the use of file THPs to vmas
produced by execve().

Systems that make heavy use of shared libraries (e.g. Android) are unable
to apply VM_DENYWRITE through the dynamic linker, preventing them from
benefiting from the resultant reduced contention on the TLB.

This patch reduces the constraint on file THPs allowing use with any
executable mapping from a file not opened for write (see
inode_is_open_for_write()). It also introduces additional conditions to
ensure that files opened for write will never be backed by file THPs.

Restricting the use of THPs to executable mappings eliminates the risk that
a read-only file later opened for write would encounter significant
latencies due to page cache truncation.

The ld linker flag '-z max-page-size=(hugepage size)' can be used to
produce executables with the necessary layout. The dynamic linker must
map these file's segments at a hugepage size aligned vma for the mapping to
be backed with THPs.

Comparison of the performance characteristics of 4KB and 2MB-backed
libraries follows; the Android dex2oat tool was used to AOT compile an
example application on a single ARM core.

4KB Pages:
==========

count              event_name            # count / runtime
598,995,035,942    cpu-cycles            # 1.800861 GHz
 81,195,620,851    raw-stall-frontend    # 244.112 M/sec
347,754,466,597    iTLB-loads            # 1.046 G/sec
  2,970,248,900    iTLB-load-misses      # 0.854122% miss rate

Total test time: 332.854998 seconds.

2MB Pages:
==========

count              event_name            # count / runtime
592,872,663,047    cpu-cycles            # 1.800358 GHz
 76,485,624,143    raw-stall-frontend    # 232.261 M/sec
350,478,413,710    iTLB-loads            # 1.064 G/sec
    803,233,322    iTLB-load-misses      # 0.229182% miss rate

Total test time: 329.826087 seconds

A check of /proc/$(pidof dex2oat64)/smaps shows THPs in use:

/apex/com.android.art/lib64/libart.so
FilePmdMapped:      4096 kB

/apex/com.android.art/lib64/libart-compiler.so
FilePmdMapped:      2048 kB

Bug: 158135888
Link: https://lore.kernel.org/patchwork/patch/1408266/

Reviewed-by: William Kucharski <william.kucharski@oracle.com>
Acked-by: Hugh Dickins <hughd@google.com>
Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Collin Fijalkovich <cfijalkovich@google.com>
Change-Id: I75c693a4b4e7526d374ef2c010bde3094233eef2
2021-04-28 18:41:35 +00:00
Alistair Delva
cf8f7947f2 ANDROID: Add filp_open_block() for zram
We currently plan to disallow use of filp_open() from drivers in GKI,
however the ZRAM driver still needs it. Add a new GKI-only variant of
filp_open() which only permits a block device to be opened, which can
be exported instead. This keeps ZRAM working but cuts down on drivers
that attempt to open and write files in kernel mode.

Bug: 179220339
Change-Id: Id696b4aaf204b0499ce0a1b6416648670236e570
Signed-off-by: Alistair Delva <adelva@google.com>
2021-02-03 18:27:38 +00:00
Aleksa Sarai
aa606ebab1 openat2: reject RESOLVE_BENEATH|RESOLVE_IN_ROOT
commit 398840f8bb935d33c64df4ec4fed77a7d24c267d upstream.

This was an oversight in the original implementation, as it makes no
sense to specify both scoping flags to the same openat2(2) invocation
(before this patch, the result of such an invocation was equivalent to
RESOLVE_IN_ROOT being ignored).

This is a userspace-visible ABI change, but the only user of openat2(2)
at the moment is LXC which doesn't specify both flags and so no
userspace programs will break as a result.

Fixes: fddb5d430a ("open: introduce openat2(2) syscall")
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Cc: <stable@vger.kernel.org> # v5.6+
Link: https://lore.kernel.org/r/20201027235044.5240-2-cyphar@cyphar.com
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-30 11:54:24 +01:00
Kees Cook
633fb6ac39 exec: move S_ISREG() check earlier
The execve(2)/uselib(2) syscalls have always rejected non-regular files.
Recently, it was noticed that a deadlock was introduced when trying to
execute pipes, as the S_ISREG() test was happening too late.  This was
fixed in commit 73601ea5b7 ("fs/open.c: allow opening only regular files
during execve()"), but it was added after inode_permission() had already
run, which meant LSMs could see bogus attempts to execute non-regular
files.

Move the test into the other inode type checks (which already look for
other pathological conditions[1]).  Since there is no need to use
FMODE_EXEC while we still have access to "acc_mode", also switch the test
to MAY_EXEC.

Also include a comment with the redundant S_ISREG() checks at the end of
execve(2)/uselib(2) to note that they are present to avoid any mistakes.

My notes on the call path, and related arguments, checks, etc:

do_open_execat()
    struct open_flags open_exec_flags = {
        .open_flag = O_LARGEFILE | O_RDONLY | __FMODE_EXEC,
        .acc_mode = MAY_EXEC,
        ...
    do_filp_open(dfd, filename, open_flags)
        path_openat(nameidata, open_flags, flags)
            file = alloc_empty_file(open_flags, current_cred());
            do_open(nameidata, file, open_flags)
                may_open(path, acc_mode, open_flag)
		    /* new location of MAY_EXEC vs S_ISREG() test */
                    inode_permission(inode, MAY_OPEN | acc_mode)
                        security_inode_permission(inode, acc_mode)
                vfs_open(path, file)
                    do_dentry_open(file, path->dentry->d_inode, open)
                        /* old location of FMODE_EXEC vs S_ISREG() test */
                        security_file_open(f)
                        open()

[1] https://lore.kernel.org/lkml/202006041910.9EF0C602@keescook/

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Eric Biggers <ebiggers3@gmail.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Link: http://lkml.kernel.org/r/20200605160013.3954297-3-keescook@chromium.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12 10:58:01 -07:00
Linus Torvalds
e1ec517e18 Merge branch 'hch.init_path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull init and set_fs() cleanups from Al Viro:
 "Christoph's 'getting rid of ksys_...() uses under KERNEL_DS' series"

* 'hch.init_path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (50 commits)
  init: add an init_dup helper
  init: add an init_utimes helper
  init: add an init_stat helper
  init: add an init_mknod helper
  init: add an init_mkdir helper
  init: add an init_symlink helper
  init: add an init_link helper
  init: add an init_eaccess helper
  init: add an init_chmod helper
  init: add an init_chown helper
  init: add an init_chroot helper
  init: add an init_chdir helper
  init: add an init_rmdir helper
  init: add an init_unlink helper
  init: add an init_umount helper
  init: add an init_mount helper
  init: mark create_dev as __init
  init: mark console_on_rootfs as __init
  init: initialize ramdisk_execute_command at compile time
  devtmpfs: refactor devtmpfsd()
  ...
2020-08-07 09:40:34 -07:00
Christoph Hellwig
eb9d7d390e init: add an init_eaccess helper
Add a simple helper to check if a file exists based on kernel space file
name and switch the early init code over to it.  Note that this
theoretically changes behavior as it always is based on the effective
permissions.  But during early init that doesn't make a difference.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-31 08:17:53 +02:00
Christoph Hellwig
1097742efc init: add an init_chmod helper
Add a simple helper to chmod with a kernel space file name and switch
the early init code over to it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-31 08:17:53 +02:00
Christoph Hellwig
b873498f99 init: add an init_chown helper
Add a simple helper to chown with a kernel space file name and switch
the early init code over to it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-31 08:17:52 +02:00
Christoph Hellwig
4b7ca5014c init: add an init_chroot helper
Add a simple helper to chroot with a kernel space file name and switch
the early init code over to it.  Remove the now unused ksys_chroot.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-31 08:17:52 +02:00
Christoph Hellwig
db63f1e315 init: add an init_chdir helper
Add a simple helper to chdir with a kernel space file name and switch
the early init code over to it.  Remove the now unused ksys_chdir.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-31 08:17:52 +02:00
Christoph Hellwig
b25ba7c3c9 fs: remove ksys_fchmod
Fold it into the only remaining caller.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-31 08:16:01 +02:00
Christoph Hellwig
166e07c37c fs: remove ksys_open
Just open code it in the two callers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-31 08:16:00 +02:00
Christoph Hellwig
9e96c8c0e9 fs: add a vfs_fchmod helper
Add a helper for struct file based chmode operations.  To be used by
the initramfs code soon.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-16 15:33:04 +02:00
Christoph Hellwig
c04011fe8c fs: add a vfs_fchown helper
Add a helper for struct file based chown operations.  To be used by
the initramfs code soon.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-16 15:33:00 +02:00
Christian Brauner
60997c3d45
close_range: add CLOSE_RANGE_UNSHARE
One of the use-cases of close_range() is to drop file descriptors just before
execve(). This would usually be expressed in the sequence:

unshare(CLONE_FILES);
close_range(3, ~0U);

as pointed out by Linus it might be desirable to have this be a part of
close_range() itself under a new flag CLOSE_RANGE_UNSHARE.

This expands {dup,unshare)_fd() to take a max_fds argument that indicates the
maximum number of file descriptors to copy from the old struct files. When the
user requests that all file descriptors are supposed to be closed via
close_range(min, max) then we can cap via unshare_fd(min) and hence don't need
to do any of the heavy fput() work for everything above min.

The patch makes it so that if CLOSE_RANGE_UNSHARE is requested and we do in
fact currently share our file descriptor table we create a new private copy.
We then close all fds in the requested range and finally after we're done we
install the new fd table.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-06-17 00:07:38 +02:00
Christian Brauner
278a5fbaed
open: add close_range()
This adds the close_range() syscall. It allows to efficiently close a range
of file descriptors up to all file descriptors of a calling task.

I was contacted by FreeBSD as they wanted to have the same close_range()
syscall as we proposed here. We've coordinated this and in the meantime, Kyle
was fast enough to merge close_range() into FreeBSD already in April:
https://reviews.freebsd.org/D21627
https://svnweb.freebsd.org/base?view=revision&revision=359836
and the current plan is to backport close_range() to FreeBSD 12.2 (cf. [2])
once its merged in Linux too. Python is in the process of switching to
close_range() on FreeBSD and they are waiting on us to merge this to switch on
Linux as well: https://bugs.python.org/issue38061

The syscall came up in a recent discussion around the new mount API and
making new file descriptor types cloexec by default. During this
discussion, Al suggested the close_range() syscall (cf. [1]). Note, a
syscall in this manner has been requested by various people over time.

First, it helps to close all file descriptors of an exec()ing task. This
can be done safely via (quoting Al's example from [1] verbatim):

        /* that exec is sensitive */
        unshare(CLONE_FILES);
        /* we don't want anything past stderr here */
        close_range(3, ~0U);
        execve(....);

The code snippet above is one way of working around the problem that file
descriptors are not cloexec by default. This is aggravated by the fact that
we can't just switch them over without massively regressing userspace. For
a whole class of programs having an in-kernel method of closing all file
descriptors is very helpful (e.g. demons, service managers, programming
language standard libraries, container managers etc.).
(Please note, unshare(CLONE_FILES) should only be needed if the calling
task is multi-threaded and shares the file descriptor table with another
thread in which case two threads could race with one thread allocating file
descriptors and the other one closing them via close_range(). For the
general case close_range() before the execve() is sufficient.)

Second, it allows userspace to avoid implementing closing all file
descriptors by parsing through /proc/<pid>/fd/* and calling close() on each
file descriptor. From looking at various large(ish) userspace code bases
this or similar patterns are very common in:
- service managers (cf. [4])
- libcs (cf. [6])
- container runtimes (cf. [5])
- programming language runtimes/standard libraries
  - Python (cf. [2])
  - Rust (cf. [7], [8])
As Dmitry pointed out there's even a long-standing glibc bug about missing
kernel support for this task (cf. [3]).
In addition, the syscall will also work for tasks that do not have procfs
mounted and on kernels that do not have procfs support compiled in. In such
situations the only way to make sure that all file descriptors are closed
is to call close() on each file descriptor up to UINT_MAX or RLIMIT_NOFILE,
OPEN_MAX trickery (cf. comment [8] on Rust).

The performance is striking. For good measure, comparing the following
simple close_all_fds() userspace implementation that is essentially just
glibc's version in [6]:

static int close_all_fds(void)
{
        int dir_fd;
        DIR *dir;
        struct dirent *direntp;

        dir = opendir("/proc/self/fd");
        if (!dir)
                return -1;
        dir_fd = dirfd(dir);
        while ((direntp = readdir(dir))) {
                int fd;
                if (strcmp(direntp->d_name, ".") == 0)
                        continue;
                if (strcmp(direntp->d_name, "..") == 0)
                        continue;
                fd = atoi(direntp->d_name);
                if (fd == dir_fd || fd == 0 || fd == 1 || fd == 2)
                        continue;
                close(fd);
        }
        closedir(dir);
        return 0;
}

to close_range() yields:
1. closing 4 open files:
   - close_all_fds(): ~280 us
   - close_range():    ~24 us

2. closing 1000 open files:
   - close_all_fds(): ~5000 us
   - close_range():   ~800 us

close_range() is designed to allow for some flexibility. Specifically, it
does not simply always close all open file descriptors of a task. Instead,
callers can specify an upper bound.
This is e.g. useful for scenarios where specific file descriptors are
created with well-known numbers that are supposed to be excluded from
getting closed.
For extra paranoia close_range() comes with a flags argument. This can e.g.
be used to implement extension. Once can imagine userspace wanting to stop
at the first error instead of ignoring errors under certain circumstances.
There might be other valid ideas in the future. In any case, a flag
argument doesn't hurt and keeps us on the safe side.

From an implementation side this is kept rather dumb. It saw some input
from David and Jann but all nonsense is obviously my own!
- Errors to close file descriptors are currently ignored. (Could be changed
  by setting a flag in the future if needed.)
- __close_range() is a rather simplistic wrapper around __close_fd().
  My reasoning behind this is based on the nature of how __close_fd() needs
  to release an fd. But maybe I misunderstood specifics:
  We take the files_lock and rcu-dereference the fdtable of the calling
  task, we find the entry in the fdtable, get the file and need to release
  files_lock before calling filp_close().
  In the meantime the fdtable might have been altered so we can't just
  retake the spinlock and keep the old rcu-reference of the fdtable
  around. Instead we need to grab a fresh reference to the fdtable.
  If my reasoning is correct then there's really no point in fancyfying
  __close_range(): We just need to rcu-dereference the fdtable of the
  calling task once to cap the max_fd value correctly and then go on
  calling __close_fd() in a loop.

/* References */
[1]: https://lore.kernel.org/lkml/20190516165021.GD17978@ZenIV.linux.org.uk/
[2]: 9e4f2f3a6b/Modules/_posixsubprocess.c (L220)
[3]: https://sourceware.org/bugzilla/show_bug.cgi?id=10353#c7
[4]: 5238e95759/src/basic/fd-util.c (L217)
[5]: ddf4b77e11/src/lxc/start.c (L236)
[6]: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/grantpt.c;h=2030e07fa6e652aac32c775b8c6e005844c3c4eb;hb=HEAD#l17
     Note that this is an internal implementation that is not exported.
     Currently, libc seems to not provide an exported version of this
     because of missing kernel support to do this.
     Note, in a recent patch series Florian made grantpt() a nop thereby
     removing the code referenced here.
[7]: https://github.com/rust-lang/rust/issues/12148
[8]: 5f47c0613e/src/libstd/sys/unix/process2.rs (L303-L308)
     Rust's solution is slightly different but is equally unperformant.
     Rust calls getdtablesize() which is a glibc library function that
     simply returns the current RLIMIT_NOFILE or OPEN_MAX values. Rust then
     goes on to call close() on each fd. That's obviously overkill for most
     tasks. Rarely, tasks - especially non-demons - hit RLIMIT_NOFILE or
     OPEN_MAX.
     Let's be nice and assume an unprivileged user with RLIMIT_NOFILE set
     to 1024. Even in this case, there's a very high chance that in the
     common case Rust is calling the close() syscall 1021 times pointlessly
     if the task just has 0, 1, and 2 open.

Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Kyle Evans <self@kyle-evans.net>
Cc: Jann Horn <jannh@google.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Dmitry V. Levin <ldv@altlinux.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: linux-api@vger.kernel.org
2020-06-17 00:05:19 +02:00
Linus Torvalds
94709049fb Merge branch 'akpm' (patches from Andrew)
Merge updates from Andrew Morton:
 "A few little subsystems and a start of a lot of MM patches.

  Subsystems affected by this patch series: squashfs, ocfs2, parisc,
  vfs. With mm subsystems: slab-generic, slub, debug, pagecache, gup,
  swap, memcg, pagemap, memory-failure, vmalloc, kasan"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (128 commits)
  kasan: move kasan_report() into report.c
  mm/mm_init.c: report kasan-tag information stored in page->flags
  ubsan: entirely disable alignment checks under UBSAN_TRAP
  kasan: fix clang compilation warning due to stack protector
  x86/mm: remove vmalloc faulting
  mm: remove vmalloc_sync_(un)mappings()
  x86/mm/32: implement arch_sync_kernel_mappings()
  x86/mm/64: implement arch_sync_kernel_mappings()
  mm/ioremap: track which page-table levels were modified
  mm/vmalloc: track which page-table levels were modified
  mm: add functions to track page directory modifications
  s390: use __vmalloc_node in stack_alloc
  powerpc: use __vmalloc_node in alloc_vm_stack
  arm64: use __vmalloc_node in arch_alloc_vmap_stack
  mm: remove vmalloc_user_node_flags
  mm: switch the test_vmalloc module to use __vmalloc_node
  mm: remove __vmalloc_node_flags_caller
  mm: remove both instances of __vmalloc_node_flags
  mm: remove the prot argument to __vmalloc_node
  mm: remove the pgprot argument to __vmalloc
  ...
2020-06-02 12:21:36 -07:00
Jeff Layton
735e4ae5ba vfs: track per-sb writeback errors and report them to syncfs
Patch series "vfs: have syncfs() return error when there are writeback
errors", v6.

Currently, syncfs does not return errors when one of the inodes fails to
be written back.  It will return errors based on the legacy AS_EIO and
AS_ENOSPC flags when syncing out the block device fails, but that's not
particularly helpful for filesystems that aren't backed by a blockdev.
It's also possible for a stray sync to lose those errors.

The basic idea in this set is to track writeback errors at the
superblock level, so that we can quickly and easily check whether
something bad happened without having to fsync each file individually.
syncfs is then changed to reliably report writeback errors after they
occur, much in the same fashion as fsync does now.

This patch (of 2):

Usually we suggest that applications call fsync when they want to ensure
that all data written to the file has made it to the backing store, but
that can be inefficient when there are a lot of open files.

Calling syncfs on the filesystem can be more efficient in some
situations, but the error reporting doesn't currently work the way most
people expect.  If a single inode on a filesystem reports a writeback
error, syncfs won't necessarily return an error.  syncfs only returns an
error if __sync_blockdev fails, and on some filesystems that's a no-op.

It would be better if syncfs reported an error if there were any
writeback failures.  Then applications could call syncfs to see if there
are any errors on any open files, and could then call fsync on all of
the other descriptors to figure out which one failed.

This patch adds a new errseq_t to struct super_block, and has
mapping_set_error also record writeback errors there.

To report those errors, we also need to keep an errseq_t in struct file
to act as a cursor.  This patch adds a dedicated field for that purpose,
which slots nicely into 4 bytes of padding at the end of struct file on
x86_64.

An earlier version of this patch used an O_PATH file descriptor to cue
the kernel that the open file should track the superblock error and not
the inode's writeback error.

I think that API is just too weird though.  This is simpler and should
make syncfs error reporting "just work" even if someone is multiplexing
fsync and syncfs on the same fds.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Andres Freund <andres@anarazel.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: David Howells <dhowells@redhat.com>
Link: http://lkml.kernel.org/r/20200428135155.19223-1-jlayton@kernel.org
Link: http://lkml.kernel.org/r/20200428135155.19223-2-jlayton@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02 10:59:05 -07:00
Miklos Szeredi
c8ffd8bcdd vfs: add faccessat2 syscall
POSIX defines faccessat() as having a fourth "flags" argument, while the
linux syscall doesn't have it.  Glibc tries to emulate AT_EACCESS and
AT_SYMLINK_NOFOLLOW, but AT_EACCESS emulation is broken.

Add a new faccessat(2) syscall with the added flags argument and implement
both flags.

The value of AT_EACCESS is defined in glibc headers to be the same as
AT_REMOVEDIR.  Use this value for the kernel interface as well, together
with the explanatory comment.

Also add AT_EMPTY_PATH support, which is not documented by POSIX, but can
be useful and is trivial to implement.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-05-14 16:44:25 +02:00
Miklos Szeredi
9470451505 vfs: split out access_override_creds()
Split out a helper that overrides the credentials in preparation for
actually doing the access check.

This prepares for the next patch that optionally disables the creds
override.

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-05-14 16:44:24 +02:00
Linus Torvalds
9c577491b9 Merge branch 'work.dotdot1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs pathwalk sanitizing from Al Viro:
 "Massive pathwalk rewrite and cleanups.

  Several iterations have been posted; hopefully this thing is getting
  readable and understandable now. Pretty much all parts of pathname
  resolutions are affected...

  The branch is identical to what has sat in -next, except for commit
  message in "lift all calls of step_into() out of follow_dotdot/
  follow_dotdot_rcu", crediting Qian Cai for reporting the bug; only
  commit message changed there."

* 'work.dotdot1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (69 commits)
  lookup_open(): don't bother with fallbacks to lookup+create
  atomic_open(): no need to pass struct open_flags anymore
  open_last_lookups(): move complete_walk() into do_open()
  open_last_lookups(): lift O_EXCL|O_CREAT handling into do_open()
  open_last_lookups(): don't abuse complete_walk() when all we want is unlazy
  open_last_lookups(): consolidate fsnotify_create() calls
  take post-lookup part of do_last() out of loop
  link_path_walk(): sample parent's i_uid and i_mode for the last component
  __nd_alloc_stack(): make it return bool
  reserve_stack(): switch to __nd_alloc_stack()
  pick_link(): take reserving space on stack into a new helper
  pick_link(): more straightforward handling of allocation failures
  fold path_to_nameidata() into its only remaining caller
  pick_link(): pass it struct path already with normal refcounting rules
  fs/namei.c: kill follow_mount()
  non-RCU analogue of the previous commit
  helper for mount rootwards traversal
  follow_dotdot(): be lazy about changing nd->path
  follow_dotdot_rcu(): be lazy about changing nd->path
  follow_dotdot{,_rcu}(): massage loops
  ...
2020-04-02 12:30:08 -07:00
Al Viro
d9a9f4849f cifs_atomic_open(): fix double-put on late allocation failure
several iterations of ->atomic_open() calling conventions ago, we
used to need fput() if ->atomic_open() failed at some point after
successful finish_open().  Now (since 2016) it's not needed -
struct file carries enough state to make fput() work regardless
of the point in struct file lifecycle and discarding it on
failure exits in open() got unified.  Unfortunately, I'd missed
the fact that we had an instance of ->atomic_open() (cifs one)
that used to need that fput(), as well as the stale comment in
finish_open() demanding such late failure handling.  Trivially
fixed...

Fixes: fe9ec8291f "do_last(): take fput() on error after opening to out:"
Cc: stable@kernel.org # v4.7+
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-03-12 18:25:20 -04:00
Al Viro
31d1726d72 make build_open_flags() treat O_CREAT | O_EXCL as implying O_NOFOLLOW
O_CREAT | O_EXCL means "-EEXIST if we run into a trailing symlink".
As it is, we might or might not have LOOKUP_FOLLOW in op->intent
in that case - that depends upon having O_NOFOLLOW in open flags.
It doesn't matter, since we won't be checking it in that case -
do_last() bails out earlier.

However, making sure it's not set (i.e. acting as if we had an explicit
O_NOFOLLOW) makes the behaviour more explicit and allows to reorder the
check for O_CREAT | O_EXCL in do_last() with the call of step_into()
immediately following it.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-02-27 14:43:56 -05:00
Jens Axboe
35cb6d54c1 fs: make build_open_flags() available internally
This is a prep patch for supporting non-blocking open from io_uring.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20 17:01:53 -07:00
Aleksa Sarai
fddb5d430a open: introduce openat2(2) syscall
/* Background. */
For a very long time, extending openat(2) with new features has been
incredibly frustrating. This stems from the fact that openat(2) is
possibly the most famous counter-example to the mantra "don't silently
accept garbage from userspace" -- it doesn't check whether unknown flags
are present[1].

This means that (generally) the addition of new flags to openat(2) has
been fraught with backwards-compatibility issues (O_TMPFILE has to be
defined as __O_TMPFILE|O_DIRECTORY|[O_RDWR or O_WRONLY] to ensure old
kernels gave errors, since it's insecure to silently ignore the
flag[2]). All new security-related flags therefore have a tough road to
being added to openat(2).

Userspace also has a hard time figuring out whether a particular flag is
supported on a particular kernel. While it is now possible with
contemporary kernels (thanks to [3]), older kernels will expose unknown
flag bits through fcntl(F_GETFL). Giving a clear -EINVAL during
openat(2) time matches modern syscall designs and is far more
fool-proof.

In addition, the newly-added path resolution restriction LOOKUP flags
(which we would like to expose to user-space) don't feel related to the
pre-existing O_* flag set -- they affect all components of path lookup.
We'd therefore like to add a new flag argument.

Adding a new syscall allows us to finally fix the flag-ignoring problem,
and we can make it extensible enough so that we will hopefully never
need an openat3(2).

/* Syscall Prototype. */
  /*
   * open_how is an extensible structure (similar in interface to
   * clone3(2) or sched_setattr(2)). The size parameter must be set to
   * sizeof(struct open_how), to allow for future extensions. All future
   * extensions will be appended to open_how, with their zero value
   * acting as a no-op default.
   */
  struct open_how { /* ... */ };

  int openat2(int dfd, const char *pathname,
              struct open_how *how, size_t size);

/* Description. */
The initial version of 'struct open_how' contains the following fields:

  flags
    Used to specify openat(2)-style flags. However, any unknown flag
    bits or otherwise incorrect flag combinations (like O_PATH|O_RDWR)
    will result in -EINVAL. In addition, this field is 64-bits wide to
    allow for more O_ flags than currently permitted with openat(2).

  mode
    The file mode for O_CREAT or O_TMPFILE.

    Must be set to zero if flags does not contain O_CREAT or O_TMPFILE.

  resolve
    Restrict path resolution (in contrast to O_* flags they affect all
    path components). The current set of flags are as follows (at the
    moment, all of the RESOLVE_ flags are implemented as just passing
    the corresponding LOOKUP_ flag).

    RESOLVE_NO_XDEV       => LOOKUP_NO_XDEV
    RESOLVE_NO_SYMLINKS   => LOOKUP_NO_SYMLINKS
    RESOLVE_NO_MAGICLINKS => LOOKUP_NO_MAGICLINKS
    RESOLVE_BENEATH       => LOOKUP_BENEATH
    RESOLVE_IN_ROOT       => LOOKUP_IN_ROOT

open_how does not contain an embedded size field, because it is of
little benefit (userspace can figure out the kernel open_how size at
runtime fairly easily without it). It also only contains u64s (even
though ->mode arguably should be a u16) to avoid having padding fields
which are never used in the future.

Note that as a result of the new how->flags handling, O_PATH|O_TMPFILE
is no longer permitted for openat(2). As far as I can tell, this has
always been a bug and appears to not be used by userspace (and I've not
seen any problems on my machines by disallowing it). If it turns out
this breaks something, we can special-case it and only permit it for
openat(2) but not openat2(2).

After input from Florian Weimer, the new open_how and flag definitions
are inside a separate header from uapi/linux/fcntl.h, to avoid problems
that glibc has with importing that header.

/* Testing. */
In a follow-up patch there are over 200 selftests which ensure that this
syscall has the correct semantics and will correctly handle several
attack scenarios.

In addition, I've written a userspace library[4] which provides
convenient wrappers around openat2(RESOLVE_IN_ROOT) (this is necessary
because no other syscalls support RESOLVE_IN_ROOT, and thus lots of care
must be taken when using RESOLVE_IN_ROOT'd file descriptors with other
syscalls). During the development of this patch, I've run numerous
verification tests using libpathrs (showing that the API is reasonably
usable by userspace).

/* Future Work. */
Additional RESOLVE_ flags have been suggested during the review period.
These can be easily implemented separately (such as blocking auto-mount
during resolution).

Furthermore, there are some other proposed changes to the openat(2)
interface (the most obvious example is magic-link hardening[5]) which
would be a good opportunity to add a way for userspace to restrict how
O_PATH file descriptors can be re-opened.

Another possible avenue of future work would be some kind of
CHECK_FIELDS[6] flag which causes the kernel to indicate to userspace
which openat2(2) flags and fields are supported by the current kernel
(to avoid userspace having to go through several guesses to figure it
out).

[1]: https://lwn.net/Articles/588444/
[2]: https://lore.kernel.org/lkml/CA+55aFyyxJL1LyXZeBsf2ypriraj5ut1XkNDsunRBqgVjZU_6Q@mail.gmail.com
[3]: commit 629e014bb8 ("fs: completely ignore unknown open flags")
[4]: https://sourceware.org/bugzilla/show_bug.cgi?id=17523
[5]: https://lore.kernel.org/lkml/20190930183316.10190-2-cyphar@cyphar.com/
[6]: https://youtu.be/ggD-eb3yPVs

Suggested-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-18 09:19:18 -05:00
Linus Torvalds
2be7d348fe Revert "vfs: properly and reliably lock f_pos in fdget_pos()"
This reverts commit 0be0ee7181.

I was hoping it would be benign to switch over entirely to FMODE_STREAM,
and we'd have just a couple of small fixups we'd need, but it looks like
we're not quite there yet.

While it worked fine on both my desktop and laptop, they are fairly
similar in other respects, and run mostly the same loads.  Kenneth
Crudup reports that it seems to break both his vmware installation and
the KDE upower service.  In both cases apparently leading to timeouts
due to waitinmg for the f_pos lock.

There are a number of character devices in particular that definitely
want stream-like behavior, but that currently don't get marked as
streams, and as a result get the exclusion between concurrent
read()/write() on the same file descriptor.  Which doesn't work well for
them.

The most obvious example if this is /dev/console and /dev/tty, which use
console_fops and tty_fops respectively (and ptmx_fops for the pty master
side).  It may be that it's just this that causes problems, but we
clearly weren't ready yet.

Because there's a number of other likely common cases that don't have
llseek implementations and would seem to act as stream devices:

  /dev/fuse		(fuse_dev_operations)
  /dev/mcelog		(mce_chrdev_ops)
  /dev/mei0		(mei_fops)
  /dev/net/tun		(tun_fops)
  /dev/nvme0		(nvme_dev_fops)
  /dev/tpm0		(tpm_fops)
  /proc/self/ns/mnt	(ns_file_operations)
  /dev/snd/pcm*		(snd_pcm_f_ops[])

and while some of these could be trivially automatically detected by the
vfs layer when the character device is opened by just noticing that they
have no read or write operations either, it often isn't that obvious.

Some character devices most definitely do use the file position, even if
they don't allow seeking: the firmware update code, for example, uses
simple_read_from_buffer() that does use f_pos, but doesn't allow seeking
back and forth.

We'll revisit this when there's a better way to detect the problem and
fix it (possibly with a coccinelle script to do more of the FMODE_STREAM
annotations).

Reported-by: Kenneth R. Crudup <kenny@panix.com>
Cc: Kirill Smelkov <kirr@nexedi.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-26 11:34:06 -08:00
Linus Torvalds
0be0ee7181 vfs: properly and reliably lock f_pos in fdget_pos()
fdget_pos() is used by file operations that will read and update f_pos:
things like "read()", "write()" and "lseek()" (but not, for example,
"pread()/pwrite" that get their file positions elsewhere).

However, it had two separate escape clauses for this, because not
everybody wants or needs serialization of the file position.

The first and most obvious case is the "file descriptor doesn't have a
position at all", ie a stream-like file.  Except we didn't actually use
FMODE_STREAM, but instead used FMODE_ATOMIC_POS.  The reason for that
was that FMODE_STREAM didn't exist back in the days, but also that we
didn't want to mark all the special cases, so we only marked the ones
that _required_ position atomicity according to POSIX - regular files
and directories.

The case one was intentionally lazy, but now that we _do_ have
FMODE_STREAM we could and should just use it.  With the change to use
FMODE_STREAM, there are no remaining uses for FMODE_ATOMIC_POS, and all
the code to set it is deleted.

Any cases where we don't want the serialization because the driver (or
subsystem) doesn't use the file position should just be updated to do
"stream_open()".  We've done that for all the obvious and common
situations, we may need a few more.  Quoting Kirill Smelkov in the
original FMODE_STREAM thread (see link below for full email):

 "And I appreciate if people could help at least somehow with "getting
  rid of mixed case entirely" (i.e. always lock f_pos_lock on
  !FMODE_STREAM), because this transition starts to diverge from my
  particular use-case too far. To me it makes sense to do that
  transition as follows:

   - convert nonseekable_open -> stream_open via stream_open.cocci;
   - audit other nonseekable_open calls and convert left users that
     truly don't depend on position to stream_open;
   - extend stream_open.cocci to analyze alloc_file_pseudo as well (this
     will cover pipes and sockets), or maybe convert pipes and sockets
     to FMODE_STREAM manually;
   - extend stream_open.cocci to analyze file_operations that use
     no_llseek or noop_llseek, but do not use nonseekable_open or
     alloc_file_pseudo. This might find files that have stream semantic
     but are opened differently;
   - extend stream_open.cocci to analyze file_operations whose
     .read/.write do not use ppos at all (independently of how file was
     opened);
   - ...
   - after that remove FMODE_ATOMIC_POS and always take f_pos_lock if
     !FMODE_STREAM;
   - gather bug reports for deadlocked read/write and convert missed
     cases to FMODE_STREAM, probably extending stream_open.cocci along
     the road to catch similar cases

  i.e. always take f_pos_lock unless a file is explicitly marked as
  being stream, and try to find and cover all files that are streams"

We have not done the "extend stream_open.cocci to analyze
alloc_file_pseudo" as well, but the previous commit did manually handle
the case of pipes and sockets.

The other case where we can avoid locking f_pos is the "this file
descriptor only has a single user and it is us, and thus there is no
need to lock it".

The second test was correct, although a bit subtle and worth just
re-iterating here.  There are two kinds of other sources of references
to the same file descriptor: file descriptors that have been explicitly
shared across fork() or with dup(), and file tables having elevated
reference counts due to threading (or explicit file sharing with
clone()).

The first case would have incremented the file count explicitly, and in
the second case the previous __fdget() would have incremented it for us
and set the FDPUT_FPUT flag.

But in both cases the file count would be greater than one, so the
"file_count(file) > 1" test catches both situations.  Also note that if
file_count is 1, that also means that no other thread can have access to
the file table, so there also cannot be races with concurrent calls to
dup()/fork()/clone() that would increment the file count any other way.

Link: https://lore.kernel.org/linux-fsdevel/20190413184404.GA13490@deco.navytux.spb.ru
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Eic Dumazet <edumazet@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Marco Elver <elver@google.com>
Cc: Andrea Parri <parri.andrea@gmail.com>
Cc: Paul McKenney <paulmck@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-25 10:01:53 -08:00
Denis Efremov
7159d54418 fs: remove unlikely() from WARN_ON() condition
"unlikely(WARN_ON(x))" is excessive. WARN_ON() already uses unlikely()
internally.

Link: http://lkml.kernel.org/r/20190829165025.15750-5-efremov@linux.com
Signed-off-by: Denis Efremov <efremov@linux.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-26 10:10:30 -07:00
Song Liu
09d91cda0e mm,thp: avoid writes to file with THP in pagecache
In previous patch, an application could put part of its text section in
THP via madvise().  These THPs will be protected from writes when the
application is still running (TXTBSY).  However, after the application
exits, the file is available for writes.

This patch avoids writes to file THP by dropping page cache for the file
when the file is open for write.  A new counter nr_thps is added to struct
address_space.  In do_dentry_open(), if the file is open for write and
nr_thps is non-zero, we drop page cache for the whole file.

Link: http://lkml.kernel.org/r/20190801184244.3169074-8-songliubraving@fb.com
Signed-off-by: Song Liu <songliubraving@fb.com>
Reported-by: kbuild test robot <lkp@intel.com>
Acked-by: Rik van Riel <riel@surriel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:11 -07:00
Linus Torvalds
d7852fbd0f access: avoid the RCU grace period for the temporary subjective credentials
It turns out that 'access()' (and 'faccessat()') can cause a lot of RCU
work because it installs a temporary credential that gets allocated and
freed for each system call.

The allocation and freeing overhead is mostly benign, but because
credentials can be accessed under the RCU read lock, the freeing
involves a RCU grace period.

Which is not a huge deal normally, but if you have a lot of access()
calls, this causes a fair amount of seconday damage: instead of having a
nice alloc/free patterns that hits in hot per-CPU slab caches, you have
all those delayed free's, and on big machines with hundreds of cores,
the RCU overhead can end up being enormous.

But it turns out that all of this is entirely unnecessary.  Exactly
because access() only installs the credential as the thread-local
subjective credential, the temporary cred pointer doesn't actually need
to be RCU free'd at all.  Once we're done using it, we can just free it
synchronously and avoid all the RCU overhead.

So add a 'non_rcu' flag to 'struct cred', which can be set by users that
know they only use it in non-RCU context (there are other potential
users for this).  We can make it a union with the rcu freeing list head
that we need for the RCU case, so this doesn't need any extra storage.

Note that this also makes 'get_current_cred()' clear the new non_rcu
flag, in case we have filesystems that take a long-term reference to the
cred and then expect the RCU delayed freeing afterwards.  It's not
entirely clear that this is required, but it makes for clear semantics:
the subjective cred remains non-RCU as long as you only access it
synchronously using the thread-local accessors, but you _can_ use it as
a generic cred if you want to.

It is possible that we should just remove the whole RCU markings for
->cred entirely.  Only ->real_cred is really supposed to be accessed
through RCU, and the long-term cred copies that nfs uses might want to
explicitly re-enable RCU freeing if required, rather than have
get_current_cred() do it implicitly.

But this is a "minimal semantic changes" change for the immediate
problem.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Paul E. McKenney <paulmck@linux.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Jan Glauber <jglauber@marvell.com>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Jayachandran Chandrasekharan Nair <jnair@marvell.com>
Cc: Greg KH <greg@kroah.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-24 10:12:09 -07:00
Thomas Gleixner
457c899653 treewide: Add SPDX license identifier for missed files
Add SPDX license identifiers to all files which:

 - Have no license information of any form

 - Have EXPORT_.*_SYMBOL_GPL inside which was used in the
   initial scan/conversion to ignore the file

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

  GPL-2.0-only

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21 10:50:45 +02:00
Kirill Smelkov
438ab720c6 vfs: pass ppos=NULL to .read()/.write() of FMODE_STREAM files
This amends commit 10dce8af34 ("fs: stream_open - opener for
stream-like files so that read and write can run simultaneously without
deadlock") in how position is passed into .read()/.write() handler for
stream-like files:

Rasmus noticed that we currently pass 0 as position and ignore any position
change if that is done by a file implementation. This papers over bugs if ppos
is used in files that declare themselves as being stream-like as such bugs will
go unnoticed. Even if a file implementation is correctly converted into using
stream_open, its read/write later could be changed to use ppos and even though
that won't be working correctly, that bug might go unnoticed without someone
doing wrong behaviour analysis. It is thus better to pass ppos=NULL into
read/write for stream-like files as that don't give any chance for ppos usage
bugs because it will oops if ppos is ever used inside .read() or .write().

Note 1: rw_verify_area, new_sync_{read,write} needs to be updated
because they are called by vfs_read/vfs_write & friends before
file_operations .read/.write .

Note 2: if file backend uses new-style .read_iter/.write_iter, position
is still passed into there as non-pointer kiocb.ki_pos . Currently
stream_open.cocci (semantic patch added by 10dce8af34) ignores files
whose file_operations has *_iter methods.

Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Kirill Smelkov <kirr@nexedi.com>
2019-05-06 17:46:52 +03:00
Kirill Smelkov
10dce8af34 fs: stream_open - opener for stream-like files so that read and write can run simultaneously without deadlock
Commit 9c225f2655 ("vfs: atomic f_pos accesses as per POSIX") added
locking for file.f_pos access and in particular made concurrent read and
write not possible - now both those functions take f_pos lock for the
whole run, and so if e.g. a read is blocked waiting for data, write will
deadlock waiting for that read to complete.

This caused regression for stream-like files where previously read and
write could run simultaneously, but after that patch could not do so
anymore. See e.g. commit 581d21a2d0 ("xenbus: fix deadlock on writes
to /proc/xen/xenbus") which fixes such regression for particular case of
/proc/xen/xenbus.

The patch that added f_pos lock in 2014 did so to guarantee POSIX thread
safety for read/write/lseek and added the locking to file descriptors of
all regular files. In 2014 that thread-safety problem was not new as it
was already discussed earlier in 2006.

However even though 2006'th version of Linus's patch was adding f_pos
locking "only for files that are marked seekable with FMODE_LSEEK (thus
avoiding the stream-like objects like pipes and sockets)", the 2014
version - the one that actually made it into the tree as 9c225f2655 -
is doing so irregardless of whether a file is seekable or not.

See

    https://lore.kernel.org/lkml/53022DB1.4070805@gmail.com/
    https://lwn.net/Articles/180387
    https://lwn.net/Articles/180396

for historic context.

The reason that it did so is, probably, that there are many files that
are marked non-seekable, but e.g. their read implementation actually
depends on knowing current position to correctly handle the read. Some
examples:

	kernel/power/user.c		snapshot_read
	fs/debugfs/file.c		u32_array_read
	fs/fuse/control.c		fuse_conn_waiting_read + ...
	drivers/hwmon/asus_atk0110.c	atk_debugfs_ggrp_read
	arch/s390/hypfs/inode.c		hypfs_read_iter
	...

Despite that, many nonseekable_open users implement read and write with
pure stream semantics - they don't depend on passed ppos at all. And for
those cases where read could wait for something inside, it creates a
situation similar to xenbus - the write could be never made to go until
read is done, and read is waiting for some, potentially external, event,
for potentially unbounded time -> deadlock.

Besides xenbus, there are 14 such places in the kernel that I've found
with semantic patch (see below):

	drivers/xen/evtchn.c:667:8-24: ERROR: evtchn_fops: .read() can deadlock .write()
	drivers/isdn/capi/capi.c:963:8-24: ERROR: capi_fops: .read() can deadlock .write()
	drivers/input/evdev.c:527:1-17: ERROR: evdev_fops: .read() can deadlock .write()
	drivers/char/pcmcia/cm4000_cs.c:1685:7-23: ERROR: cm4000_fops: .read() can deadlock .write()
	net/rfkill/core.c:1146:8-24: ERROR: rfkill_fops: .read() can deadlock .write()
	drivers/s390/char/fs3270.c:488:1-17: ERROR: fs3270_fops: .read() can deadlock .write()
	drivers/usb/misc/ldusb.c:310:1-17: ERROR: ld_usb_fops: .read() can deadlock .write()
	drivers/hid/uhid.c:635:1-17: ERROR: uhid_fops: .read() can deadlock .write()
	net/batman-adv/icmp_socket.c:80:1-17: ERROR: batadv_fops: .read() can deadlock .write()
	drivers/media/rc/lirc_dev.c:198:1-17: ERROR: lirc_fops: .read() can deadlock .write()
	drivers/leds/uleds.c:77:1-17: ERROR: uleds_fops: .read() can deadlock .write()
	drivers/input/misc/uinput.c:400:1-17: ERROR: uinput_fops: .read() can deadlock .write()
	drivers/infiniband/core/user_mad.c:985:7-23: ERROR: umad_fops: .read() can deadlock .write()
	drivers/gnss/core.c:45:1-17: ERROR: gnss_fops: .read() can deadlock .write()

In addition to the cases above another regression caused by f_pos
locking is that now FUSE filesystems that implement open with
FOPEN_NONSEEKABLE flag, can no longer implement bidirectional
stream-like files - for the same reason as above e.g. read can deadlock
write locking on file.f_pos in the kernel.

FUSE's FOPEN_NONSEEKABLE was added in 2008 in a7c1b990f7 ("fuse:
implement nonseekable open") to support OSSPD. OSSPD implements /dev/dsp
in userspace with FOPEN_NONSEEKABLE flag, with corresponding read and
write routines not depending on current position at all, and with both
read and write being potentially blocking operations:

See

    https://github.com/libfuse/osspd
    https://lwn.net/Articles/308445

    https://github.com/libfuse/osspd/blob/14a9cff0/osspd.c#L1406
    https://github.com/libfuse/osspd/blob/14a9cff0/osspd.c#L1438-L1477
    https://github.com/libfuse/osspd/blob/14a9cff0/osspd.c#L1479-L1510

Corresponding libfuse example/test also describes FOPEN_NONSEEKABLE as
"somewhat pipe-like files ..." with read handler not using offset.
However that test implements only read without write and cannot exercise
the deadlock scenario:

    https://github.com/libfuse/libfuse/blob/fuse-3.4.2-3-ga1bff7d/example/poll.c#L124-L131
    https://github.com/libfuse/libfuse/blob/fuse-3.4.2-3-ga1bff7d/example/poll.c#L146-L163
    https://github.com/libfuse/libfuse/blob/fuse-3.4.2-3-ga1bff7d/example/poll.c#L209-L216

I've actually hit the read vs write deadlock for real while implementing
my FUSE filesystem where there is /head/watch file, for which open
creates separate bidirectional socket-like stream in between filesystem
and its user with both read and write being later performed
simultaneously. And there it is semantically not easy to split the
stream into two separate read-only and write-only channels:

    https://lab.nexedi.com/kirr/wendelin.core/blob/f13aa600/wcfs/wcfs.go#L88-169

Let's fix this regression. The plan is:

1. We can't change nonseekable_open to include &~FMODE_ATOMIC_POS -
   doing so would break many in-kernel nonseekable_open users which
   actually use ppos in read/write handlers.

2. Add stream_open() to kernel to open stream-like non-seekable file
   descriptors. Read and write on such file descriptors would never use
   nor change ppos. And with that property on stream-like files read and
   write will be running without taking f_pos lock - i.e. read and write
   could be running simultaneously.

3. With semantic patch search and convert to stream_open all in-kernel
   nonseekable_open users for which read and write actually do not
   depend on ppos and where there is no other methods in file_operations
   which assume @offset access.

4. Add FOPEN_STREAM to fs/fuse/ and open in-kernel file-descriptors via
   steam_open if that bit is present in filesystem open reply.

   It was tempting to change fs/fuse/ open handler to use stream_open
   instead of nonseekable_open on just FOPEN_NONSEEKABLE flags, but
   grepping through Debian codesearch shows users of FOPEN_NONSEEKABLE,
   and in particular GVFS which actually uses offset in its read and
   write handlers

	https://codesearch.debian.net/search?q=-%3Enonseekable+%3D
	https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1080
	https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1247-1346
	https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1399-1481

   so if we would do such a change it will break a real user.

5. Add stream_open and FOPEN_STREAM handling to stable kernels starting
   from v3.14+ (the kernel where 9c225f2655 first appeared).

   This will allow to patch OSSPD and other FUSE filesystems that
   provide stream-like files to return FOPEN_STREAM | FOPEN_NONSEEKABLE
   in their open handler and this way avoid the deadlock on all kernel
   versions. This should work because fs/fuse/ ignores unknown open
   flags returned from a filesystem and so passing FOPEN_STREAM to a
   kernel that is not aware of this flag cannot hurt. In turn the kernel
   that is not aware of FOPEN_STREAM will be < v3.14 where just
   FOPEN_NONSEEKABLE is sufficient to implement streams without read vs
   write deadlock.

This patch adds stream_open, converts /proc/xen/xenbus to it and adds
semantic patch to automatically locate in-kernel places that are either
required to be converted due to read vs write deadlock, or that are just
safe to be converted because read and write do not use ppos and there
are no other funky methods in file_operations.

Regarding semantic patch I've verified each generated change manually -
that it is correct to convert - and each other nonseekable_open instance
left - that it is either not correct to convert there, or that it is not
converted due to current stream_open.cocci limitations.

The script also does not convert files that should be valid to convert,
but that currently have .llseek = noop_llseek or generic_file_llseek for
unknown reason despite file being opened with nonseekable_open (e.g.
drivers/input/mousedev.c)

Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Yongzhi Pan <panyongzhi@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Tejun Heo <tj@kernel.org>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Julia Lawall <Julia.Lawall@lip6.fr>
Cc: Nikolaus Rath <Nikolaus@rath.org>
Cc: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Kirill Smelkov <kirr@nexedi.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-06 07:01:55 -10:00