Commit Graph

156 Commits

Author SHA1 Message Date
Greg Kroah-Hartman
e92b643b4b Merge 5.10.211 into android12-5.10-lts
Changes in 5.10.211
	net/sched: Retire CBQ qdisc
	net/sched: Retire ATM qdisc
	net/sched: Retire dsmark qdisc
	smb: client: fix OOB in receive_encrypted_standard()
	smb: client: fix potential OOBs in smb2_parse_contexts()
	smb: client: fix parsing of SMB3.1.1 POSIX create context
	sched/rt: sysctl_sched_rr_timeslice show default timeslice after reset
	userfaultfd: fix mmap_changing checking in mfill_atomic_hugetlb
	zonefs: Improve error handling
	sched/rt: Fix sysctl_sched_rr_timeslice intial value
	sched/rt: Disallow writing invalid values to sched_rt_period_us
	scsi: target: core: Add TMF to tmr_list handling
	dmaengine: shdma: increase size of 'dev_id'
	dmaengine: fsl-qdma: increase size of 'irq_name'
	wifi: cfg80211: fix missing interfaces when dumping
	wifi: mac80211: fix race condition on enabling fast-xmit
	fbdev: savage: Error out if pixclock equals zero
	fbdev: sis: Error out if pixclock equals zero
	spi: hisi-sfc-v3xx: Return IRQ_NONE if no interrupts were detected
	ahci: asm1166: correct count of reported ports
	ahci: add 43-bit DMA address quirk for ASMedia ASM1061 controllers
	ext4: avoid allocating blocks from corrupted group in ext4_mb_try_best_found()
	ext4: avoid allocating blocks from corrupted group in ext4_mb_find_by_goal()
	dmaengine: ti: edma: Add some null pointer checks to the edma_probe
	regulator: pwm-regulator: Add validity checks in continuous .get_voltage
	nvmet-tcp: fix nvme tcp ida memory leak
	ASoC: sunxi: sun4i-spdif: Add support for Allwinner H616
	spi: sh-msiof: avoid integer overflow in constants
	netfilter: conntrack: check SCTP_CID_SHUTDOWN_ACK for vtag setting in sctp_new
	nvme-fc: do not wait in vain when unloading module
	nvmet-fcloop: swap the list_add_tail arguments
	nvmet-fc: release reference on target port
	nvmet-fc: abort command when there is no binding
	ext4: correct the hole length returned by ext4_map_blocks()
	Input: i8042 - add Fujitsu Lifebook U728 to i8042 quirk table
	efi: runtime: Fix potential overflow of soft-reserved region size
	efi: Don't add memblocks for soft-reserved memory
	hwmon: (coretemp) Enlarge per package core count limit
	scsi: lpfc: Use unsigned type for num_sge
	firewire: core: send bus reset promptly on gap count error
	virtio-blk: Ensure no requests in virtqueues before deleting vqs.
	pmdomain: renesas: r8a77980-sysc: CR7 must be always on
	ARM: dts: BCM53573: Drop nonexistent "default-off" LED trigger
	irqchip/mips-gic: Don't touch vl_map if a local interrupt is not routable
	ARM: dts: imx: Set default tuning step for imx6sx usdhc
	ASoC: fsl_micfil: register platform component before registering cpu dai
	media: av7110: prevent underflow in write_ts_to_decoder()
	hvc/xen: prevent concurrent accesses to the shared ring
	hsr: Avoid double remove of a node.
	x86/uaccess: Implement macros for CMPXCHG on user addresses
	seccomp: Invalidate seccomp mode to catch death failures
	block: ataflop: fix breakage introduced at blk-mq refactoring
	powerpc/watchpoint: Workaround P10 DD1 issue with VSX-32 byte instructions
	powerpc/watchpoints: Annotate atomic context in more places
	cifs: add a warning when the in-flight count goes negative
	mtd: spinand: macronix: Add support for MX35LFxGE4AD
	ASoC: Intel: boards: harden codec property handling
	ASoC: Intel: boards: get codec device with ACPI instead of bus search
	ASoC: Intel: bytcr_rt5651: Drop reference count of ACPI device after use
	task_stack, x86/cea: Force-inline stack helpers
	btrfs: tree-checker: check for overlapping extent items
	btrfs: introduce btrfs_lookup_match_dir
	btrfs: unify lookup return value when dir entry is missing
	btrfs: do not pin logs too early during renames
	lan743x: fix for potential NULL pointer dereference with bare card
	platform/x86: intel-vbtn: Support for tablet mode on HP Pavilion 13 x360 PC
	iwlwifi: mvm: do more useful queue sync accounting
	iwlwifi: mvm: write queue_sync_state only for sync
	jbd2: remove redundant buffer io error checks
	jbd2: recheck chechpointing non-dirty buffer
	jbd2: Fix wrongly judgement for buffer head removing while doing checkpoint
	x86: drop bogus "cc" clobber from __try_cmpxchg_user_asm()
	erofs: fix lz4 inplace decompression
	IB/hfi1: Fix sdma.h tx->num_descs off-by-one error
	s390/cio: fix invalid -EBUSY on ccw_device_start
	dm-crypt: don't modify the data when using authenticated encryption
	KVM: arm64: vgic-its: Test for valid IRQ in MOVALL handler
	KVM: arm64: vgic-its: Test for valid IRQ in its_sync_lpi_pending_table()
	gtp: fix use-after-free and null-ptr-deref in gtp_genl_dump_pdp()
	PCI/MSI: Prevent MSI hardware interrupt number truncation
	l2tp: pass correct message length to ip6_append_data
	ARM: ep93xx: Add terminator to gpiod_lookup_table
	Revert "x86/ftrace: Use alternative RET encoding"
	x86/text-patching: Make text_gen_insn() play nice with ANNOTATE_NOENDBR
	x86/ibt,paravirt: Use text_gen_insn() for paravirt_patch()
	x86/ftrace: Use alternative RET encoding
	x86/returnthunk: Allow different return thunks
	Revert "x86/alternative: Make custom return thunk unconditional"
	x86/alternative: Make custom return thunk unconditional
	usb: cdns3: fixed memory use after free at cdns3_gadget_ep_disable()
	usb: cdns3: fix memory double free when handle zero packet
	usb: gadget: ncm: Avoid dropping datagrams of properly parsed NTBs
	usb: roles: fix NULL pointer issue when put module's reference
	usb: roles: don't get/set_role() when usb_role_switch is unregistered
	mptcp: fix lockless access in subflow ULP diag
	IB/hfi1: Fix a memleak in init_credit_return
	RDMA/bnxt_re: Return error for SRQ resize
	RDMA/srpt: Support specifying the srpt_service_guid parameter
	RDMA/qedr: Fix qedr_create_user_qp error flow
	arm64: dts: rockchip: set num-cs property for spi on px30
	RDMA/srpt: fix function pointer cast warnings
	bpf, scripts: Correct GPL license name
	scsi: jazz_esp: Only build if SCSI core is builtin
	nouveau: fix function cast warnings
	ipv4: properly combine dev_base_seq and ipv4.dev_addr_genid
	ipv6: properly combine dev_base_seq and ipv6.dev_addr_genid
	afs: Increase buffer size in afs_update_volume_status()
	ipv6: sr: fix possible use-after-free and null-ptr-deref
	packet: move from strlcpy with unused retval to strscpy
	net: dev: Convert sa_data to flexible array in struct sockaddr
	s390: use the correct count for __iowrite64_copy()
	tls: rx: jump to a more appropriate label
	tls: rx: drop pointless else after goto
	tls: stop recv() if initial process_rx_list gave us non-DATA
	netfilter: nf_tables: set dormant flag on hook register failure
	drm/syncobj: make lockdep complain on WAIT_FOR_SUBMIT v3
	drm/syncobj: call drm_syncobj_fence_add_wait when WAIT_AVAILABLE flag is set
	drm/amd/display: Fix memory leak in dm_sw_fini()
	block: ataflop: more blk-mq refactoring fixes
	fs/aio: Restrict kiocb_set_cancel_fn() to I/O submitted via libaio
	arp: Prevent overflow in arp_req_get().
	ext4: regenerate buddy after block freeing failed if under fc replay
	Linux 5.10.211

Note, this merges away the following commit:
	a0180e940c ("erofs: fix lz4 inplace decompression")
as it conflicted too badly with the existing erofs changes in this
branch that are not upstream.  If it is needed, it can be brought back
in the future in a safe way.

Change-Id: I432a4a0964e0708d2cd337872ad75d57cbf92cce
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2024-04-16 15:38:10 +00:00
Kees Cook
1dd3dc3892 seccomp: Invalidate seccomp mode to catch death failures
[ Upstream commit 495ac3069a6235bfdf516812a2a9b256671bbdf9 ]

If seccomp tries to kill a process, it should never see that process
again. To enforce this proactively, switch the mode to something
impossible. If encountered: WARN, reject all syscalls, and attempt to
kill the process again even harder.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Will Drewry <wad@chromium.org>
Fixes: 8112c4f140 ("seccomp: remove 2-phase API")
Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-03-01 13:16:46 +01:00
Greg Kroah-Hartman
b558262fdc This is the 5.10.60 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmEcsVUACgkQONu9yGCS
 aT7r0xAAtwJSVAboFmw/R7pyRzsveDo8KdTATgOKRRmGk/fJIodSHXwPTcPtKSRO
 k09OnEU777t67+3pfS6PA2lNCWhE+Z3PlFVn6b8G8xR1o14vpwNg3aiGNGSqd8aL
 edA4TrqTBb7nu8Bhg1nPdSiKSK2UTPDYlBo+jF6j6YU1dVe/yMy2D4bVOGotPxTn
 WVFmlmEtHDN22q67rEjLAU+wvtn/nRJle/0b2gj32BB+CJqLyjHKH+aUmk9shMfP
 CCkzX+CjhEOz4Sooh0v3aYsyJ4N6AiCAaT4SYy/pnT1RV/b9aw4f+ddrg3xo8SnI
 2C9/i2EzsZc2UnzoFld1R15YrdZfIWOLaiU6enIis5ziBDuxZ5nd5VOHV99BzhUH
 nEY65Ob3Wr8LVSgFTo61uk5R0z0WjQMwSvW1uaauxMUQN7Q2645VJsPomPNlmp8I
 pNBSLj5uz76cxh+ciZNS4HgtwXNUHolhRHl5nvHxMKH0i1+9xQ1vKs1sKMpDYGFe
 2lqg8eGe8lGrwpg0w4oKHyVP2fgj7yk3FvU3/pXX8BleolmQ3xkwkztofKaMorJP
 s9eQIw/2U3j/xBqhXoIbTjBuXl+5MjfbOG4xeK0v/oohrDw0IvT7KuO6TYUXhlbh
 CniUBfItbgGzXyQiwgwgEbS8upz+nK6IDdNEAV5dGn6gG6w4ajk=
 =/0MF
 -----END PGP SIGNATURE-----

Merge 5.10.60 into android12-5.10-lts

Changes in 5.10.60
	iio: adc: ti-ads7950: Ensure CS is deasserted after reading channels
	iio: adis: set GPIO reset pin direction
	iio: humidity: hdc100x: Add margin to the conversion time
	iio: adc: Fix incorrect exit of for-loop
	ASoC: amd: Fix reference to PCM buffer address
	ASoC: xilinx: Fix reference to PCM buffer address
	ASoC: uniphier: Fix reference to PCM buffer address
	ASoC: tlv320aic31xx: Fix jack detection after suspend
	ASoC: intel: atom: Fix reference to PCM buffer address
	i2c: dev: zero out array used for i2c reads from userspace
	cifs: create sd context must be a multiple of 8
	scsi: lpfc: Move initialization of phba->poll_list earlier to avoid crash
	seccomp: Fix setting loaded filter count during TSYNC
	net: ethernet: ti: cpsw: fix min eth packet size for non-switch use-cases
	ARC: fp: set FPU_STATUS.FWE to enable FPU_STATUS update on context switch
	ceph: reduce contention in ceph_check_delayed_caps()
	ACPI: NFIT: Fix support for virtual SPA ranges
	libnvdimm/region: Fix label activation vs errors
	drm/amd/display: Remove invalid assert for ODM + MPC case
	drm/amd/display: use GFP_ATOMIC in amdgpu_dm_irq_schedule_work
	drm/amdgpu: don't enable baco on boco platforms in runpm
	ieee802154: hwsim: fix GPF in hwsim_set_edge_lqi
	ieee802154: hwsim: fix GPF in hwsim_new_edge_nl
	pinctrl: mediatek: Fix fallback behavior for bias_set_combo
	ASoC: cs42l42: Correct definition of ADC Volume control
	ASoC: cs42l42: Don't allow SND_SOC_DAIFMT_LEFT_J
	ASoC: SOF: Intel: hda-ipc: fix reply size checking
	ASoC: cs42l42: Fix inversion of ADC Notch Switch control
	ASoC: cs42l42: Remove duplicate control for WNF filter frequency
	netfilter: nf_conntrack_bridge: Fix memory leak when error
	pinctrl: tigerlake: Fix GPIO mapping for newer version of software
	ASoC: cs42l42: Fix LRCLK frame start edge
	net: dsa: mt7530: add the missing RxUnicast MIB counter
	net: mvvp2: fix short frame size on s390
	platform/x86: pcengines-apuv2: Add missing terminating entries to gpio-lookup tables
	libbpf: Fix probe for BPF_PROG_TYPE_CGROUP_SOCKOPT
	bpf: Fix integer overflow involving bucket_size
	net: phy: micrel: Fix link detection on ksz87xx switch"
	ppp: Fix generating ifname when empty IFLA_IFNAME is specified
	net/smc: fix wait on already cleared link
	net: sched: act_mirred: Reset ct info when mirror/redirect skb
	ice: Prevent probing virtual functions
	ice: don't remove netdev->dev_addr from uc sync list
	iavf: Set RSS LUT and key in reset handle path
	psample: Add a fwd declaration for skbuff
	bareudp: Fix invalid read beyond skb's linear data
	net/mlx5: Synchronize correct IRQ when destroying CQ
	net/mlx5: Fix return value from tracer initialization
	drm/meson: fix colour distortion from HDR set during vendor u-boot
	net: dsa: microchip: Fix ksz_read64()
	net: dsa: microchip: ksz8795: Fix VLAN filtering
	net: Fix memory leak in ieee802154_raw_deliver
	net: igmp: fix data-race in igmp_ifc_timer_expire()
	net: dsa: lan9303: fix broken backpressure in .port_fdb_dump
	net: dsa: lantiq: fix broken backpressure in .port_fdb_dump
	net: dsa: sja1105: fix broken backpressure in .port_fdb_dump
	net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry
	net: bridge: fix flags interpretation for extern learn fdb entries
	net: bridge: fix memleak in br_add_if()
	net: linkwatch: fix failure to restore device state across suspend/resume
	tcp_bbr: fix u32 wrap bug in round logic if bbr_init() called after 2B packets
	net: igmp: increase size of mr_ifc_count
	drm/i915: Only access SFC_DONE when media domain is not fused off
	xen/events: Fix race in set_evtchn_to_irq
	vsock/virtio: avoid potential deadlock when vsock device remove
	nbd: Aovid double completion of a request
	arm64: efi: kaslr: Fix occasional random alloc (and boot) failure
	efi/libstub: arm64: Force Image reallocation if BSS was not reserved
	efi/libstub: arm64: Relax 2M alignment again for relocatable kernels
	powerpc/kprobes: Fix kprobe Oops happens in booke
	x86/tools: Fix objdump version check again
	genirq: Provide IRQCHIP_AFFINITY_PRE_STARTUP
	x86/msi: Force affinity setup before startup
	x86/ioapic: Force affinity setup before startup
	x86/resctrl: Fix default monitoring groups reporting
	genirq/msi: Ensure deactivation on teardown
	genirq/timings: Prevent potential array overflow in __irq_timings_store()
	PCI/MSI: Enable and mask MSI-X early
	PCI/MSI: Mask all unused MSI-X entries
	PCI/MSI: Enforce that MSI-X table entry is masked for update
	PCI/MSI: Enforce MSI[X] entry updates to be visible
	PCI/MSI: Do not set invalid bits in MSI mask
	PCI/MSI: Correct misleading comments
	PCI/MSI: Use msi_mask_irq() in pci_msi_shutdown()
	PCI/MSI: Protect msi_desc::masked for multi-MSI
	powerpc/smp: Fix OOPS in topology_init()
	efi/libstub: arm64: Double check image alignment at entry
	KVM: VMX: Use current VMCS to query WAITPKG support for MSR emulation
	KVM: nVMX: Use vmx_need_pf_intercept() when deciding if L0 wants a #PF
	vboxsf: Add vboxsf_[create|release]_sf_handle() helpers
	vboxsf: Add support for the atomic_open directory-inode op
	ceph: add some lockdep assertions around snaprealm handling
	ceph: clean up locking annotation for ceph_get_snap_realm and __lookup_snap_realm
	ceph: take snap_empty_lock atomically with snaprealm refcount change
	vmlinux.lds.h: Handle clang's module.{c,d}tor sections
	KVM: nSVM: avoid picking up unsupported bits from L2 in int_ctl (CVE-2021-3653)
	KVM: nSVM: always intercept VMLOAD/VMSAVE when nested (CVE-2021-3656)
	net: dsa: microchip: Fix probing KSZ87xx switch with DT node for host port
	net: dsa: microchip: ksz8795: Fix PVID tag insertion
	net: dsa: microchip: ksz8795: Reject unsupported VLAN configuration
	net: dsa: microchip: ksz8795: Fix VLAN untagged flag change on deletion
	net: dsa: microchip: ksz8795: Use software untagging on CPU port
	Linux 5.10.60

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I7d55aed1883b31ba2d9b8dfad4bc33e1efbcbd2f
2021-08-27 17:14:51 +02:00
Hsuan-Chi Kuo
561d13128b seccomp: Fix setting loaded filter count during TSYNC
commit b4d8a58f8dcfcc890f296696cadb76e77be44b5f upstream.

The desired behavior is to set the caller's filter count to thread's.
This value is reported via /proc, so this fixes the inaccurate count
exposed to userspace; it is not used for reference counting, etc.

Signed-off-by: Hsuan-Chi Kuo <hsuanchikuo@gmail.com>
Link: https://lore.kernel.org/r/20210304233708.420597-1-hsuanchikuo@gmail.com
Co-developed-by: Wiktor Garbacz <wiktorg@google.com>
Signed-off-by: Wiktor Garbacz <wiktorg@google.com>
Link: https://lore.kernel.org/lkml/20210810125158.329849-1-wiktorg@google.com
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Fixes: c818c03b66 ("seccomp: Report number of loaded filters in /proc/$pid/status")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-08-18 08:59:06 +02:00
Greg Kroah-Hartman
c5d480cd47 Merge 5.10.42 into android12-5.10
Changes in 5.10.42
	ALSA: hda/realtek: the bass speaker can't output sound on Yoga 9i
	ALSA: hda/realtek: Headphone volume is controlled by Front mixer
	ALSA: hda/realtek: Chain in pop reduction fixup for ThinkStation P340
	ALSA: hda/realtek: fix mute/micmute LEDs for HP 855 G8
	ALSA: hda/realtek: fix mute/micmute LEDs and speaker for HP Zbook G8
	ALSA: hda/realtek: fix mute/micmute LEDs and speaker for HP Zbook Fury 15 G8
	ALSA: hda/realtek: fix mute/micmute LEDs and speaker for HP Zbook Fury 17 G8
	ALSA: usb-audio: scarlett2: Fix device hang with ehci-pci
	ALSA: usb-audio: scarlett2: Improve driver startup messages
	cifs: set server->cipher_type to AES-128-CCM for SMB3.0
	NFSv4: Fix a NULL pointer dereference in pnfs_mark_matching_lsegs_return()
	iommu/vt-d: Fix sysfs leak in alloc_iommu()
	perf intel-pt: Fix sample instruction bytes
	perf intel-pt: Fix transaction abort handling
	perf scripts python: exported-sql-viewer.py: Fix copy to clipboard from Top Calls by elapsed Time report
	perf scripts python: exported-sql-viewer.py: Fix Array TypeError
	perf scripts python: exported-sql-viewer.py: Fix warning display
	proc: Check /proc/$pid/attr/ writes against file opener
	net: hso: fix control-request directions
	net/sched: fq_pie: re-factor fix for fq_pie endless loop
	net/sched: fq_pie: fix OOB access in the traffic path
	netfilter: nft_set_pipapo_avx2: Add irq_fpu_usable() check, fallback to non-AVX2 version
	mac80211: assure all fragments are encrypted
	mac80211: prevent mixed key and fragment cache attacks
	mac80211: properly handle A-MSDUs that start with an RFC 1042 header
	cfg80211: mitigate A-MSDU aggregation attacks
	mac80211: drop A-MSDUs on old ciphers
	mac80211: add fragment cache to sta_info
	mac80211: check defrag PN against current frame
	mac80211: prevent attacks on TKIP/WEP as well
	mac80211: do not accept/forward invalid EAPOL frames
	mac80211: extend protection against mixed key and fragment cache attacks
	ath10k: add CCMP PN replay protection for fragmented frames for PCIe
	ath10k: drop fragments with multicast DA for PCIe
	ath10k: drop fragments with multicast DA for SDIO
	ath10k: drop MPDU which has discard flag set by firmware for SDIO
	ath10k: Fix TKIP Michael MIC verification for PCIe
	ath10k: Validate first subframe of A-MSDU before processing the list
	ath11k: Clear the fragment cache during key install
	dm snapshot: properly fix a crash when an origin has no snapshots
	drm/amd/pm: correct MGpuFanBoost setting
	drm/amdgpu/vcn1: add cancel_delayed_work_sync before power gate
	drm/amdkfd: correct sienna_cichlid SDMA RLC register offset error
	drm/amdgpu/vcn2.0: add cancel_delayed_work_sync before power gate
	drm/amdgpu/vcn2.5: add cancel_delayed_work_sync before power gate
	drm/amdgpu/jpeg2.0: add cancel_delayed_work_sync before power gate
	selftests/gpio: Use TEST_GEN_PROGS_EXTENDED
	selftests/gpio: Move include of lib.mk up
	selftests/gpio: Fix build when source tree is read only
	kgdb: fix gcc-11 warnings harder
	Documentation: seccomp: Fix user notification documentation
	seccomp: Refactor notification handler to prepare for new semantics
	serial: core: fix suspicious security_locked_down() call
	misc/uss720: fix memory leak in uss720_probe
	thunderbolt: usb4: Fix NVM read buffer bounds and offset issue
	thunderbolt: dma_port: Fix NVM read buffer bounds and offset issue
	KVM: X86: Fix vCPU preempted state from guest's point of view
	KVM: arm64: Prevent mixed-width VM creation
	mei: request autosuspend after sending rx flow control
	staging: iio: cdc: ad7746: avoid overwrite of num_channels
	iio: gyro: fxas21002c: balance runtime power in error path
	iio: dac: ad5770r: Put fwnode in error case during ->probe()
	iio: adc: ad7768-1: Fix too small buffer passed to iio_push_to_buffers_with_timestamp()
	iio: adc: ad7124: Fix missbalanced regulator enable / disable on error.
	iio: adc: ad7124: Fix potential overflow due to non sequential channel numbers
	iio: adc: ad7923: Fix undersized rx buffer.
	iio: adc: ad7793: Add missing error code in ad7793_setup()
	iio: adc: ad7192: Avoid disabling a clock that was never enabled.
	iio: adc: ad7192: handle regulator voltage error first
	serial: 8250: Add UART_BUG_TXRACE workaround for Aspeed VUART
	serial: 8250_dw: Add device HID for new AMD UART controller
	serial: 8250_pci: Add support for new HPE serial device
	serial: 8250_pci: handle FL_NOIRQ board flag
	USB: trancevibrator: fix control-request direction
	Revert "irqbypass: do not start cons/prod when failed connect"
	USB: usbfs: Don't WARN about excessively large memory allocations
	drivers: base: Fix device link removal
	serial: tegra: Fix a mask operation that is always true
	serial: sh-sci: Fix off-by-one error in FIFO threshold register setting
	serial: rp2: use 'request_firmware' instead of 'request_firmware_nowait'
	USB: serial: ti_usb_3410_5052: add startech.com device id
	USB: serial: option: add Telit LE910-S1 compositions 0x7010, 0x7011
	USB: serial: ftdi_sio: add IDs for IDS GmbH Products
	USB: serial: pl2303: add device id for ADLINK ND-6530 GC
	thermal/drivers/intel: Initialize RW trip to THERMAL_TEMP_INVALID
	usb: dwc3: gadget: Properly track pending and queued SG
	usb: gadget: udc: renesas_usb3: Fix a race in usb3_start_pipen()
	usb: typec: mux: Fix matching with typec_altmode_desc
	net: usb: fix memory leak in smsc75xx_bind
	Bluetooth: cmtp: fix file refcount when cmtp_attach_device fails
	fs/nfs: Use fatal_signal_pending instead of signal_pending
	NFS: fix an incorrect limit in filelayout_decode_layout()
	NFS: Fix an Oopsable condition in __nfs_pageio_add_request()
	NFS: Don't corrupt the value of pg_bytes_written in nfs_do_recoalesce()
	NFSv4: Fix v4.0/v4.1 SEEK_DATA return -ENOTSUPP when set NFS_V4_2 config
	drm/meson: fix shutdown crash when component not probed
	net/mlx5e: reset XPS on error flow if netdev isn't registered yet
	net/mlx5e: Fix multipath lag activation
	net/mlx5e: Fix error path of updating netdev queues
	{net,vdpa}/mlx5: Configure interface MAC into mpfs L2 table
	net/mlx5e: Fix nullptr in add_vlan_push_action()
	net/mlx5: Set reformat action when needed for termination rules
	net/mlx5e: Fix null deref accessing lag dev
	net/mlx4: Fix EEPROM dump support
	net/mlx5: Set term table as an unmanaged flow table
	SUNRPC in case of backlog, hand free slots directly to waiting task
	Revert "net:tipc: Fix a double free in tipc_sk_mcast_rcv"
	tipc: wait and exit until all work queues are done
	tipc: skb_linearize the head skb when reassembling msgs
	spi: spi-fsl-dspi: Fix a resource leak in an error handling path
	netfilter: flowtable: Remove redundant hw refresh bit
	net: dsa: mt7530: fix VLAN traffic leaks
	net: dsa: fix a crash if ->get_sset_count() fails
	net: dsa: sja1105: update existing VLANs from the bridge VLAN list
	net: dsa: sja1105: use 4095 as the private VLAN for untagged traffic
	net: dsa: sja1105: error out on unsupported PHY mode
	net: dsa: sja1105: add error handling in sja1105_setup()
	net: dsa: sja1105: call dsa_unregister_switch when allocating memory fails
	net: dsa: sja1105: fix VL lookup command packing for P/Q/R/S
	i2c: s3c2410: fix possible NULL pointer deref on read message after write
	i2c: mediatek: Disable i2c start_en and clear intr_stat brfore reset
	i2c: i801: Don't generate an interrupt on bus reset
	i2c: sh_mobile: Use new clock calculation formulas for RZ/G2E
	afs: Fix the nlink handling of dir-over-dir rename
	perf jevents: Fix getting maximum number of fds
	nvmet-tcp: fix inline data size comparison in nvmet_tcp_queue_response
	mptcp: avoid error message on infinite mapping
	mptcp: drop unconditional pr_warn on bad opt
	mptcp: fix data stream corruption
	platform/x86: hp_accel: Avoid invoking _INI to speed up resume
	gpio: cadence: Add missing MODULE_DEVICE_TABLE
	Revert "crypto: cavium/nitrox - add an error message to explain the failure of pci_request_mem_regions"
	Revert "media: usb: gspca: add a missed check for goto_low_power"
	Revert "ALSA: sb: fix a missing check of snd_ctl_add"
	Revert "serial: max310x: pass return value of spi_register_driver"
	serial: max310x: unregister uart driver in case of failure and abort
	Revert "net: fujitsu: fix a potential NULL pointer dereference"
	net: fujitsu: fix potential null-ptr-deref
	Revert "net/smc: fix a NULL pointer dereference"
	net/smc: properly handle workqueue allocation failure
	Revert "net: caif: replace BUG_ON with recovery code"
	net: caif: remove BUG_ON(dev == NULL) in caif_xmit
	Revert "char: hpet: fix a missing check of ioremap"
	char: hpet: add checks after calling ioremap
	Revert "ALSA: gus: add a check of the status of snd_ctl_add"
	Revert "ALSA: usx2y: Fix potential NULL pointer dereference"
	Revert "isdn: mISDNinfineon: fix potential NULL pointer dereference"
	isdn: mISDNinfineon: check/cleanup ioremap failure correctly in setup_io
	Revert "ath6kl: return error code in ath6kl_wmi_set_roam_lrssi_cmd()"
	ath6kl: return error code in ath6kl_wmi_set_roam_lrssi_cmd()
	Revert "isdn: mISDN: Fix potential NULL pointer dereference of kzalloc"
	isdn: mISDN: correctly handle ph_info allocation failure in hfcsusb_ph_info
	Revert "dmaengine: qcom_hidma: Check for driver register failure"
	dmaengine: qcom_hidma: comment platform_driver_register call
	Revert "libertas: add checks for the return value of sysfs_create_group"
	libertas: register sysfs groups properly
	Revert "ASoC: cs43130: fix a NULL pointer dereference"
	ASoC: cs43130: handle errors in cs43130_probe() properly
	Revert "media: dvb: Add check on sp8870_readreg"
	media: dvb: Add check on sp8870_readreg return
	Revert "media: gspca: mt9m111: Check write_bridge for timeout"
	media: gspca: mt9m111: Check write_bridge for timeout
	Revert "media: gspca: Check the return value of write_bridge for timeout"
	media: gspca: properly check for errors in po1030_probe()
	Revert "net: liquidio: fix a NULL pointer dereference"
	net: liquidio: Add missing null pointer checks
	Revert "brcmfmac: add a check for the status of usb_register"
	brcmfmac: properly check for bus register errors
	btrfs: return whole extents in fiemap
	scsi: ufs: ufs-mediatek: Fix power down spec violation
	scsi: BusLogic: Fix 64-bit system enumeration error for Buslogic
	openrisc: Define memory barrier mb
	scsi: pm80xx: Fix drives missing during rmmod/insmod loop
	btrfs: release path before starting transaction when cloning inline extent
	btrfs: do not BUG_ON in link_to_fixup_dir
	platform/x86: hp-wireless: add AMD's hardware id to the supported list
	platform/x86: intel_punit_ipc: Append MODULE_DEVICE_TABLE for ACPI
	platform/x86: touchscreen_dmi: Add info for the Mediacom Winpad 7.0 W700 tablet
	SMB3: incorrect file id in requests compounded with open
	drm/amd/display: Disconnect non-DP with no EDID
	drm/amd/amdgpu: fix refcount leak
	drm/amdgpu: Fix a use-after-free
	drm/amd/amdgpu: fix a potential deadlock in gpu reset
	drm/amdgpu: stop touching sched.ready in the backend
	platform/x86: touchscreen_dmi: Add info for the Chuwi Hi10 Pro (CWI529) tablet
	block: fix a race between del_gendisk and BLKRRPART
	linux/bits.h: fix compilation error with GENMASK
	net: netcp: Fix an error message
	net: dsa: fix error code getting shifted with 4 in dsa_slave_get_sset_count
	interconnect: qcom: bcm-voter: add a missing of_node_put()
	interconnect: qcom: Add missing MODULE_DEVICE_TABLE
	ASoC: cs42l42: Regmap must use_single_read/write
	net: stmmac: Fix MAC WoL not working if PHY does not support WoL
	net: ipa: memory region array is variable size
	vfio-ccw: Check initialized flag in cp_init()
	spi: Assume GPIO CS active high in ACPI case
	net: really orphan skbs tied to closing sk
	net: packetmmap: fix only tx timestamp on request
	net: fec: fix the potential memory leak in fec_enet_init()
	chelsio/chtls: unlock on error in chtls_pt_recvmsg()
	net: mdio: thunder: Fix a double free issue in the .remove function
	net: mdio: octeon: Fix some double free issues
	cxgb4/ch_ktls: Clear resources when pf4 device is removed
	openvswitch: meter: fix race when getting now_ms.
	tls splice: check SPLICE_F_NONBLOCK instead of MSG_DONTWAIT
	net: sched: fix packet stuck problem for lockless qdisc
	net: sched: fix tx action rescheduling issue during deactivation
	net: sched: fix tx action reschedule issue with stopped queue
	net: hso: check for allocation failure in hso_create_bulk_serial_device()
	net: bnx2: Fix error return code in bnx2_init_board()
	bnxt_en: Include new P5 HV definition in VF check.
	bnxt_en: Fix context memory setup for 64K page size.
	mld: fix panic in mld_newpack()
	net/smc: remove device from smcd_dev_list after failed device_add()
	gve: Check TX QPL was actually assigned
	gve: Update mgmt_msix_idx if num_ntfy changes
	gve: Add NULL pointer checks when freeing irqs.
	gve: Upgrade memory barrier in poll routine
	gve: Correct SKB queue index validation.
	iommu/virtio: Add missing MODULE_DEVICE_TABLE
	net: hns3: fix incorrect resp_msg issue
	net: hns3: put off calling register_netdev() until client initialize complete
	iommu/vt-d: Use user privilege for RID2PASID translation
	cxgb4: avoid accessing registers when clearing filters
	staging: emxx_udc: fix loop in _nbu2ss_nuke()
	ASoC: cs35l33: fix an error code in probe()
	bpf, offload: Reorder offload callback 'prepare' in verifier
	bpf: Set mac_len in bpf_skb_change_head
	ixgbe: fix large MTU request from VF
	ASoC: qcom: lpass-cpu: Use optional clk APIs
	scsi: libsas: Use _safe() loop in sas_resume_port()
	net: lantiq: fix memory corruption in RX ring
	ipv6: record frag_max_size in atomic fragments in input path
	ALSA: usb-audio: scarlett2: snd_scarlett_gen2_controls_create() can be static
	net: ethernet: mtk_eth_soc: Fix packet statistics support for MT7628/88
	sch_dsmark: fix a NULL deref in qdisc_reset()
	net: hsr: fix mac_len checks
	MIPS: alchemy: xxs1500: add gpio-au1000.h header file
	MIPS: ralink: export rt_sysc_membase for rt2880_wdt.c
	net: zero-initialize tc skb extension on allocation
	net: mvpp2: add buffer header handling in RX
	i915: fix build warning in intel_dp_get_link_status()
	samples/bpf: Consider frame size in tx_only of xdpsock sample
	net: hns3: check the return of skb_checksum_help()
	bpftool: Add sock_release help info for cgroup attach/prog load command
	SUNRPC: More fixes for backlog congestion
	Revert "Revert "ALSA: usx2y: Fix potential NULL pointer dereference""
	net: hso: bail out on interrupt URB allocation failure
	scripts/clang-tools: switch explicitly to Python 3
	neighbour: Prevent Race condition in neighbour subsytem
	usb: core: reduce power-on-good delay time of root hub
	Linux 5.10.42

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I05d98d1355a080e0951b4b2ae77f0a9ccb6dfc5d
2021-06-03 18:47:38 +02:00
Sargun Dhillon
b71781c589 seccomp: Refactor notification handler to prepare for new semantics
commit ddc473916955f7710d1eb17c1273d91c8622a9fe upstream.

This refactors the user notification code to have a do / while loop around
the completion condition. This has a small change in semantic, in that
previously we ignored addfd calls upon wakeup if the notification had been
responded to, but instead with the new change we check for an outstanding
addfd calls prior to returning to userspace.

Rodrigo Campos also identified a bug that can result in addfd causing
an early return, when the supervisor didn't actually handle the
syscall [1].

[1]: https://lore.kernel.org/lkml/20210413160151.3301-1-rodrigo@kinvolk.io/

Fixes: 7cf97b1254 ("seccomp: Introduce addfd ioctl to seccomp user notifier")
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Acked-by: Tycho Andersen <tycho@tycho.pizza>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Tested-by: Rodrigo Campos <rodrigo@kinvolk.io>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210517193908.3113-3-sargun@sargun.me
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-03 09:00:31 +02:00
Greg Kroah-Hartman
d8c7f0a3cd Merge 5.10.20 into android12-5.10
Changes in 5.10.20
	vmlinux.lds.h: add DWARF v5 sections
	vdpa/mlx5: fix param validation in mlx5_vdpa_get_config()
	debugfs: be more robust at handling improper input in debugfs_lookup()
	debugfs: do not attempt to create a new file before the filesystem is initalized
	scsi: libsas: docs: Remove notify_ha_event()
	scsi: qla2xxx: Fix mailbox Ch erroneous error
	kdb: Make memory allocations more robust
	w1: w1_therm: Fix conversion result for negative temperatures
	PCI: qcom: Use PHY_REFCLK_USE_PAD only for ipq8064
	PCI: Decline to resize resources if boot config must be preserved
	virt: vbox: Do not use wait_event_interruptible when called from kernel context
	bfq: Avoid false bfq queue merging
	ALSA: usb-audio: Fix PCM buffer allocation in non-vmalloc mode
	MIPS: vmlinux.lds.S: add missing PAGE_ALIGNED_DATA() section
	vmlinux.lds.h: Define SANTIZER_DISCARDS with CONFIG_GCOV_KERNEL=y
	random: fix the RNDRESEEDCRNG ioctl
	ALSA: pcm: Call sync_stop at disconnection
	ALSA: pcm: Assure sync with the pending stop operation at suspend
	ALSA: pcm: Don't call sync_stop if it hasn't been stopped
	drm/i915/gt: One more flush for Baytrail clear residuals
	ath10k: Fix error handling in case of CE pipe init failure
	Bluetooth: btqcomsmd: Fix a resource leak in error handling paths in the probe function
	Bluetooth: hci_uart: Fix a race for write_work scheduling
	Bluetooth: Fix initializing response id after clearing struct
	arm64: dts: renesas: beacon kit: Fix choppy Bluetooth Audio
	arm64: dts: renesas: beacon: Fix audio-1.8V pin enable
	ARM: dts: exynos: correct PMIC interrupt trigger level on Artik 5
	ARM: dts: exynos: correct PMIC interrupt trigger level on Monk
	ARM: dts: exynos: correct PMIC interrupt trigger level on Rinato
	ARM: dts: exynos: correct PMIC interrupt trigger level on Spring
	ARM: dts: exynos: correct PMIC interrupt trigger level on Arndale Octa
	ARM: dts: exynos: correct PMIC interrupt trigger level on Odroid XU3 family
	arm64: dts: exynos: correct PMIC interrupt trigger level on TM2
	arm64: dts: exynos: correct PMIC interrupt trigger level on Espresso
	memory: mtk-smi: Fix PM usage counter unbalance in mtk_smi ops
	Bluetooth: hci_qca: Fix memleak in qca_controller_memdump
	staging: vchiq: Fix bulk userdata handling
	staging: vchiq: Fix bulk transfers on 64-bit builds
	arm64: dts: qcom: msm8916-samsung-a5u: Fix iris compatible
	net: stmmac: dwmac-meson8b: fix enabling the timing-adjustment clock
	bpf: Add bpf_patch_call_args prototype to include/linux/bpf.h
	bpf: Avoid warning when re-casting __bpf_call_base into __bpf_call_base_args
	firmware: arm_scmi: Fix call site of scmi_notification_exit
	arm64: dts: allwinner: A64: properly connect USB PHY to port 0
	arm64: dts: allwinner: H6: properly connect USB PHY to port 0
	arm64: dts: allwinner: Drop non-removable from SoPine/LTS SD card
	arm64: dts: allwinner: H6: Allow up to 150 MHz MMC bus frequency
	arm64: dts: allwinner: A64: Limit MMC2 bus frequency to 150 MHz
	arm64: dts: qcom: msm8916-samsung-a2015: Fix sensors
	cpufreq: brcmstb-avs-cpufreq: Free resources in error path
	cpufreq: brcmstb-avs-cpufreq: Fix resource leaks in ->remove()
	arm64: dts: rockchip: rk3328: Add clock_in_out property to gmac2phy node
	ACPICA: Fix exception code class checks
	usb: gadget: u_audio: Free requests only after callback
	arm64: dts: qcom: sdm845-db845c: Fix reset-pin of ov8856 node
	soc: qcom: socinfo: Fix an off by one in qcom_show_pmic_model()
	soc: ti: pm33xx: Fix some resource leak in the error handling paths of the probe function
	staging: media: atomisp: Fix size_t format specifier in hmm_alloc() debug statemenet
	Bluetooth: drop HCI device reference before return
	Bluetooth: Put HCI device if inquiry procedure interrupts
	memory: ti-aemif: Drop child node when jumping out loop
	ARM: dts: Configure missing thermal interrupt for 4430
	usb: dwc2: Do not update data length if it is 0 on inbound transfers
	usb: dwc2: Abort transaction after errors with unknown reason
	usb: dwc2: Make "trimming xfer length" a debug message
	staging: rtl8723bs: wifi_regd.c: Fix incorrect number of regulatory rules
	x86/MSR: Filter MSR writes through X86_IOC_WRMSR_REGS ioctl too
	arm64: dts: renesas: beacon: Fix EEPROM compatible value
	can: mcp251xfd: mcp251xfd_probe(): fix errata reference
	ARM: dts: armada388-helios4: assign pinctrl to LEDs
	ARM: dts: armada388-helios4: assign pinctrl to each fan
	arm64: dts: armada-3720-turris-mox: rename u-boot mtd partition to a53-firmware
	opp: Correct debug message in _opp_add_static_v2()
	Bluetooth: btusb: Fix memory leak in btusb_mtk_wmt_recv
	soc: qcom: ocmem: don't return NULL in of_get_ocmem
	arm64: dts: msm8916: Fix reserved and rfsa nodes unit address
	arm64: dts: meson: fix broken wifi node for Khadas VIM3L
	iwlwifi: mvm: set enabled in the PPAG command properly
	ARM: s3c: fix fiq for clang IAS
	optee: simplify i2c access
	staging: wfx: fix possible panic with re-queued frames
	ARM: at91: use proper asm syntax in pm_suspend
	ath10k: Fix suspicious RCU usage warning in ath10k_wmi_tlv_parse_peer_stats_info()
	ath10k: Fix lockdep assertion warning in ath10k_sta_statistics
	ath11k: fix a locking bug in ath11k_mac_op_start()
	soc: aspeed: snoop: Add clock control logic
	iwlwifi: mvm: fix the type we use in the PPAG table validity checks
	iwlwifi: mvm: store PPAG enabled/disabled flag properly
	iwlwifi: mvm: send stored PPAG command instead of local
	iwlwifi: mvm: assign SAR table revision to the command later
	iwlwifi: mvm: don't check if CSA event is running before removing
	bpf_lru_list: Read double-checked variable once without lock
	iwlwifi: pnvm: set the PNVM again if it was already loaded
	iwlwifi: pnvm: increment the pointer before checking the TLV
	ath9k: fix data bus crash when setting nf_override via debugfs
	selftests/bpf: Convert test_xdp_redirect.sh to bash
	ibmvnic: Set to CLOSED state even on error
	bnxt_en: reverse order of TX disable and carrier off
	bnxt_en: Fix devlink info's stored fw.psid version format.
	xen/netback: fix spurious event detection for common event case
	dpaa2-eth: fix memory leak in XDP_REDIRECT
	net: phy: consider that suspend2ram may cut off PHY power
	net/mlx5e: Don't change interrupt moderation params when DIM is enabled
	net/mlx5e: Change interrupt moderation channel params also when channels are closed
	net/mlx5: Fix health error state handling
	net/mlx5e: Replace synchronize_rcu with synchronize_net
	net/mlx5e: kTLS, Use refcounts to free kTLS RX priv context
	net/mlx5: Disable devlink reload for multi port slave device
	net/mlx5: Disallow RoCE on multi port slave device
	net/mlx5: Disallow RoCE on lag device
	net/mlx5: Disable devlink reload for lag devices
	net/mlx5e: CT: manage the lifetime of the ct entry object
	net/mlx5e: Check tunnel offload is required before setting SWP
	mac80211: fix potential overflow when multiplying to u32 integers
	libbpf: Ignore non function pointer member in struct_ops
	bpf: Fix an unitialized value in bpf_iter
	bpf, devmap: Use GFP_KERNEL for xdp bulk queue allocation
	bpf: Fix bpf_fib_lookup helper MTU check for SKB ctx
	selftests: mptcp: fix ACKRX debug message
	tcp: fix SO_RCVLOWAT related hangs under mem pressure
	net: axienet: Handle deferred probe on clock properly
	cxgb4/chtls/cxgbit: Keeping the max ofld immediate data size same in cxgb4 and ulds
	b43: N-PHY: Fix the update of coef for the PHY revision >= 3case
	bpf: Clear subreg_def for global function return values
	ibmvnic: add memory barrier to protect long term buffer
	ibmvnic: skip send_request_unmap for timeout reset
	net: dsa: felix: perform teardown in reverse order of setup
	net: dsa: felix: don't deinitialize unused ports
	net: phy: mscc: adding LCPLL reset to VSC8514
	net: amd-xgbe: Reset the PHY rx data path when mailbox command timeout
	net: amd-xgbe: Fix NETDEV WATCHDOG transmit queue timeout warning
	net: amd-xgbe: Reset link when the link never comes back
	net: amd-xgbe: Fix network fluctuations when using 1G BELFUSE SFP
	net: mvneta: Remove per-cpu queue mapping for Armada 3700
	net: enetc: fix destroyed phylink dereference during unbind
	tty: convert tty_ldisc_ops 'read()' function to take a kernel pointer
	tty: implement read_iter
	fbdev: aty: SPARC64 requires FB_ATY_CT
	drm/gma500: Fix error return code in psb_driver_load()
	gma500: clean up error handling in init
	drm/fb-helper: Add missed unlocks in setcmap_legacy()
	drm/panel: mantix: Tweak init sequence
	drm/vc4: hdmi: Take into account the clock doubling flag in atomic_check
	crypto: sun4i-ss - linearize buffers content must be kept
	crypto: sun4i-ss - fix kmap usage
	crypto: arm64/aes-ce - really hide slower algos when faster ones are enabled
	hwrng: ingenic - Fix a resource leak in an error handling path
	media: allegro: Fix use after free on error
	kcsan: Rewrite kcsan_prandom_u32_max() without prandom_u32_state()
	drm: rcar-du: Fix PM reference leak in rcar_cmm_enable()
	drm: rcar-du: Fix crash when using LVDS1 clock for CRTC
	drm: rcar-du: Fix the return check of of_parse_phandle and of_find_device_by_node
	drm/amdgpu: Fix macro name _AMDGPU_TRACE_H_ in preprocessor if condition
	MIPS: c-r4k: Fix section mismatch for loongson2_sc_init
	MIPS: lantiq: Explicitly compare LTQ_EBU_PCC_ISTAT against 0
	drm/virtio: make sure context is created in gem open
	drm/fourcc: fix Amlogic format modifier masks
	media: ipu3-cio2: Build only for x86
	media: i2c: ov5670: Fix PIXEL_RATE minimum value
	media: imx: Unregister csc/scaler only if registered
	media: imx: Fix csc/scaler unregister
	media: mtk-vcodec: fix error return code in vdec_vp9_decode()
	media: camss: missing error code in msm_video_register()
	media: vsp1: Fix an error handling path in the probe function
	media: em28xx: Fix use-after-free in em28xx_alloc_urbs
	media: media/pci: Fix memleak in empress_init
	media: tm6000: Fix memleak in tm6000_start_stream
	media: aspeed: fix error return code in aspeed_video_setup_video()
	ASoC: cs42l56: fix up error handling in probe
	ASoC: qcom: qdsp6: Move frontend AIFs to q6asm-dai
	evm: Fix memleak in init_desc
	crypto: bcm - Rename struct device_private to bcm_device_private
	sched/fair: Avoid stale CPU util_est value for schedutil in task dequeue
	drm/sun4i: tcon: fix inverted DCLK polarity
	media: imx7: csi: Fix regression for parallel cameras on i.MX6UL
	media: imx7: csi: Fix pad link validation
	media: ti-vpe: cal: fix write to unallocated memory
	MIPS: properly stop .eh_frame generation
	MIPS: Compare __SYNC_loongson3_war against 0
	drm/tegra: Fix reference leak when pm_runtime_get_sync() fails
	drm/amdgpu: toggle on DF Cstate after finishing xgmi injection
	bsg: free the request before return error code
	macintosh/adb-iop: Use big-endian autopoll mask
	drm/amd/display: Fix 10/12 bpc setup in DCE output bit depth reduction.
	drm/amd/display: Fix HDMI deep color output for DCE 6-11.
	media: software_node: Fix refcounts in software_node_get_next_child()
	media: lmedm04: Fix misuse of comma
	media: vidtv: psi: fix missing crc for PMT
	media: atomisp: Fix a buffer overflow in debug code
	media: qm1d1c0042: fix error return code in qm1d1c0042_init()
	media: cx25821: Fix a bug when reallocating some dma memory
	media: mtk-vcodec: fix argument used when DEBUG is defined
	media: pxa_camera: declare variable when DEBUG is defined
	media: uvcvideo: Accept invalid bFormatIndex and bFrameIndex values
	sched/eas: Don't update misfit status if the task is pinned
	f2fs: compress: fix potential deadlock
	ASoC: qcom: lpass-cpu: Remove bit clock state check
	ASoC: SOF: Intel: hda: cancel D0i3 work during runtime suspend
	perf/arm-cmn: Fix PMU instance naming
	perf/arm-cmn: Move IRQs when migrating context
	mtd: parser: imagetag: fix error codes in bcm963xx_parse_imagetag_partitions()
	crypto: talitos - Work around SEC6 ERRATA (AES-CTR mode data size error)
	crypto: talitos - Fix ctr(aes) on SEC1
	drm/nouveau: bail out of nouveau_channel_new if channel init fails
	mm: proc: Invalidate TLB after clearing soft-dirty page state
	ata: ahci_brcm: Add back regulators management
	ASoC: cpcap: fix microphone timeslot mask
	ASoC: codecs: add missing max_register in regmap config
	mtd: parsers: afs: Fix freeing the part name memory in failure
	f2fs: fix to avoid inconsistent quota data
	drm/amdgpu: Prevent shift wrapping in amdgpu_read_mask()
	f2fs: fix a wrong condition in __submit_bio
	ASoC: qcom: Fix typo error in HDMI regmap config callbacks
	KVM: nSVM: Don't strip host's C-bit from guest's CR3 when reading PDPTRs
	drm/mediatek: Check if fb is null
	Drivers: hv: vmbus: Avoid use-after-free in vmbus_onoffer_rescind()
	ASoC: Intel: sof_sdw: add missing TGL_HDMI quirk for Dell SKU 0A5E
	ASoC: Intel: sof_sdw: add missing TGL_HDMI quirk for Dell SKU 0A3E
	locking/lockdep: Avoid unmatched unlock
	ASoC: qcom: lpass: Fix i2s ctl register bit map
	ASoC: rt5682: Fix panic in rt5682_jack_detect_handler happening during system shutdown
	ASoC: SOF: debug: Fix a potential issue on string buffer termination
	btrfs: clarify error returns values in __load_free_space_cache
	btrfs: fix double accounting of ordered extent for subpage case in btrfs_invalidapge
	KVM: x86: Restore all 64 bits of DR6 and DR7 during RSM on x86-64
	s390/zcrypt: return EIO when msg retry limit reached
	drm/vc4: hdmi: Move hdmi reset to bind
	drm/vc4: hdmi: Fix register offset with longer CEC messages
	drm/vc4: hdmi: Fix up CEC registers
	drm/vc4: hdmi: Restore cec physical address on reconnect
	drm/vc4: hdmi: Compute the CEC clock divider from the clock rate
	drm/vc4: hdmi: Update the CEC clock divider on HSM rate change
	drm/lima: fix reference leak in lima_pm_busy
	drm/dp_mst: Don't cache EDIDs for physical ports
	hwrng: timeriomem - Fix cooldown period calculation
	crypto: ecdh_helper - Ensure 'len >= secret.len' in decode_key()
	io_uring: fix possible deadlock in io_uring_poll
	nvmet-tcp: fix receive data digest calculation for multiple h2cdata PDUs
	nvmet-tcp: fix potential race of tcp socket closing accept_work
	nvme-multipath: set nr_zones for zoned namespaces
	nvmet: remove extra variable in identify ns
	nvmet: set status to 0 in case for invalid nsid
	ASoC: SOF: sof-pci-dev: add missing Up-Extreme quirk
	ima: Free IMA measurement buffer on error
	ima: Free IMA measurement buffer after kexec syscall
	ASoC: simple-card-utils: Fix device module clock
	fs/jfs: fix potential integer overflow on shift of a int
	jffs2: fix use after free in jffs2_sum_write_data()
	ubifs: Fix memleak in ubifs_init_authentication
	ubifs: replay: Fix high stack usage, again
	ubifs: Fix error return code in alloc_wbufs()
	irqchip/imx: IMX_INTMUX should not default to y, unconditionally
	smp: Process pending softirqs in flush_smp_call_function_from_idle()
	drm/amdgpu/display: remove hdcp_srm sysfs on device removal
	capabilities: Don't allow writing ambiguous v3 file capabilities
	HSI: Fix PM usage counter unbalance in ssi_hw_init
	power: supply: cpcap: Add missing IRQF_ONESHOT to fix regression
	clk: meson: clk-pll: fix initializing the old rate (fallback) for a PLL
	clk: meson: clk-pll: make "ret" a signed integer
	clk: meson: clk-pll: propagate the error from meson_clk_pll_set_rate()
	selftests/powerpc: Make the test check in eeh-basic.sh posix compliant
	regulator: qcom-rpmh-regulator: add pm8009-1 chip revision
	arm64: dts: qcom: qrb5165-rb5: fix pm8009 regulators
	quota: Fix memory leak when handling corrupted quota file
	i2c: iproc: handle only slave interrupts which are enabled
	i2c: iproc: update slave isr mask (ISR_MASK_SLAVE)
	i2c: iproc: handle master read request
	spi: cadence-quadspi: Abort read if dummy cycles required are too many
	clk: sunxi-ng: h6: Fix CEC clock
	clk: renesas: r8a779a0: Remove non-existent S2 clock
	clk: renesas: r8a779a0: Fix parent of CBFUSA clock
	HID: core: detect and skip invalid inputs to snto32()
	RDMA/siw: Fix handling of zero-sized Read and Receive Queues.
	dmaengine: fsldma: Fix a resource leak in the remove function
	dmaengine: fsldma: Fix a resource leak in an error handling path of the probe function
	dmaengine: owl-dma: Fix a resource leak in the remove function
	dmaengine: hsu: disable spurious interrupt
	mfd: bd9571mwv: Use devm_mfd_add_devices()
	power: supply: cpcap-charger: Fix missing power_supply_put()
	power: supply: cpcap-battery: Fix missing power_supply_put()
	power: supply: cpcap-charger: Fix power_supply_put on null battery pointer
	fdt: Properly handle "no-map" field in the memory region
	of/fdt: Make sure no-map does not remove already reserved regions
	RDMA/rtrs: Extend ibtrs_cq_qp_create
	RDMA/rtrs-srv: Release lock before call into close_sess
	RDMA/rtrs-srv: Use sysfs_remove_file_self for disconnect
	RDMA/rtrs-clt: Set mininum limit when create QP
	RDMA/rtrs: Call kobject_put in the failure path
	RDMA/rtrs-srv: Fix missing wr_cqe
	RDMA/rtrs-clt: Refactor the failure cases in alloc_clt
	RDMA/rtrs-srv: Init wr_cnt as 1
	power: reset: at91-sama5d2_shdwc: fix wkupdbc mask
	rtc: s5m: select REGMAP_I2C
	dmaengine: idxd: set DMA channel to be private
	power: supply: fix sbs-charger build, needs REGMAP_I2C
	clocksource/drivers/ixp4xx: Select TIMER_OF when needed
	clocksource/drivers/mxs_timer: Add missing semicolon when DEBUG is defined
	spi: imx: Don't print error on -EPROBEDEFER
	RDMA/mlx5: Use the correct obj_id upon DEVX TIR creation
	IB/mlx5: Add mutex destroy call to cap_mask_mutex mutex
	clk: sunxi-ng: h6: Fix clock divider range on some clocks
	platform/chrome: cros_ec_proto: Use EC_HOST_EVENT_MASK not BIT
	platform/chrome: cros_ec_proto: Add LID and BATTERY to default mask
	regulator: axp20x: Fix reference cout leak
	watch_queue: Drop references to /dev/watch_queue
	certs: Fix blacklist flag type confusion
	regulator: s5m8767: Fix reference count leak
	spi: atmel: Put allocated master before return
	regulator: s5m8767: Drop regulators OF node reference
	power: supply: axp20x_usb_power: Init work before enabling IRQs
	power: supply: smb347-charger: Fix interrupt usage if interrupt is unavailable
	regulator: core: Avoid debugfs: Directory ... already present! error
	isofs: release buffer head before return
	watchdog: intel-mid_wdt: Postpone IRQ handler registration till SCU is ready
	auxdisplay: ht16k33: Fix refresh rate handling
	objtool: Fix error handling for STD/CLD warnings
	objtool: Fix retpoline detection in asm code
	objtool: Fix ".cold" section suffix check for newer versions of GCC
	scsi: lpfc: Fix ancient double free
	iommu: Switch gather->end to the inclusive end
	IB/umad: Return EIO in case of when device disassociated
	IB/umad: Return EPOLLERR in case of when device disassociated
	KVM: PPC: Make the VMX instruction emulation routines static
	powerpc/47x: Disable 256k page size
	powerpc/time: Enable sched clock for irqtime
	mmc: owl-mmc: Fix a resource leak in an error handling path and in the remove function
	mmc: sdhci-sprd: Fix some resource leaks in the remove function
	mmc: usdhi6rol0: Fix a resource leak in the error handling path of the probe
	mmc: renesas_sdhi_internal_dmac: Fix DMA buffer alignment from 8 to 128-bytes
	ARM: 9046/1: decompressor: Do not clear SCTLR.nTLSMD for ARMv7+ cores
	i2c: qcom-geni: Store DMA mapping data in geni_i2c_dev struct
	amba: Fix resource leak for drivers without .remove
	iommu: Move iotlb_sync_map out from __iommu_map
	iommu: Properly pass gfp_t in _iommu_map() to avoid atomic sleeping
	IB/mlx5: Return appropriate error code instead of ENOMEM
	IB/cm: Avoid a loop when device has 255 ports
	tracepoint: Do not fail unregistering a probe due to memory failure
	rtc: zynqmp: depend on HAS_IOMEM
	perf tools: Fix DSO filtering when not finding a map for a sampled address
	perf vendor events arm64: Fix Ampere eMag event typo
	RDMA/rxe: Fix coding error in rxe_recv.c
	RDMA/rxe: Fix coding error in rxe_rcv_mcast_pkt
	RDMA/rxe: Correct skb on loopback path
	spi: stm32: properly handle 0 byte transfer
	mfd: altera-sysmgr: Fix physical address storing more
	mfd: wm831x-auxadc: Prevent use after free in wm831x_auxadc_read_irq()
	powerpc/pseries/dlpar: handle ibm, configure-connector delay status
	powerpc/8xx: Fix software emulation interrupt
	clk: qcom: gcc-msm8998: Fix Alpha PLL type for all GPLLs
	kunit: tool: fix unit test cleanup handling
	kselftests: dmabuf-heaps: Fix Makefile's inclusion of the kernel's usr/include dir
	RDMA/hns: Fixed wrong judgments in the goto branch
	RDMA/siw: Fix calculation of tx_valid_cpus size
	RDMA/hns: Fix type of sq_signal_bits
	RDMA/hns: Disable RQ inline by default
	clk: divider: fix initialization with parent_hw
	spi: pxa2xx: Fix the controller numbering for Wildcat Point
	powerpc/uaccess: Avoid might_fault() when user access is enabled
	powerpc/kuap: Restore AMR after replaying soft interrupts
	regulator: qcom-rpmh: fix pm8009 ldo7
	clk: aspeed: Fix APLL calculate formula from ast2600-A2
	selftests/ftrace: Update synthetic event syntax errors
	perf symbols: Use (long) for iterator for bfd symbols
	regulator: bd718x7, bd71828, Fix dvs voltage levels
	spi: dw: Avoid stack content exposure
	spi: Skip zero-length transfers in spi_transfer_one_message()
	printk: avoid prb_first_valid_seq() where possible
	perf symbols: Fix return value when loading PE DSO
	nfsd: register pernet ops last, unregister first
	svcrdma: Hold private mutex while invoking rdma_accept()
	ceph: fix flush_snap logic after putting caps
	RDMA/hns: Fixes missing error code of CMDQ
	RDMA/ucma: Fix use-after-free bug in ucma_create_uevent
	RDMA/rtrs-srv: Fix stack-out-of-bounds
	RDMA/rtrs: Only allow addition of path to an already established session
	RDMA/rtrs-srv: fix memory leak by missing kobject free
	RDMA/rtrs-srv-sysfs: fix missing put_device
	RDMA/rtrs-srv: Do not pass a valid pointer to PTR_ERR()
	Input: sur40 - fix an error code in sur40_probe()
	perf record: Fix continue profiling after draining the buffer
	perf intel-pt: Fix missing CYC processing in PSB
	perf intel-pt: Fix premature IPC
	perf intel-pt: Fix IPC with CYC threshold
	perf test: Fix unaligned access in sample parsing test
	Input: elo - fix an error code in elo_connect()
	sparc64: only select COMPAT_BINFMT_ELF if BINFMT_ELF is set
	sparc: fix led.c driver when PROC_FS is not enabled
	Input: zinitix - fix return type of zinitix_init_touch()
	ARM: 9065/1: OABI compat: fix build when EPOLL is not enabled
	misc: eeprom_93xx46: Fix module alias to enable module autoprobe
	phy: rockchip-emmc: emmc_phy_init() always return 0
	phy: cadence-torrent: Fix error code in cdns_torrent_phy_probe()
	misc: eeprom_93xx46: Add module alias to avoid breaking support for non device tree users
	PCI: rcar: Always allocate MSI addresses in 32bit space
	soundwire: cadence: fix ACK/NAK handling
	pwm: rockchip: Enable APB clock during register access while probing
	pwm: rockchip: rockchip_pwm_probe(): Remove superfluous clk_unprepare()
	pwm: rockchip: Eliminate potential race condition when probing
	PCI: xilinx-cpm: Fix reference count leak on error path
	VMCI: Use set_page_dirty_lock() when unregistering guest memory
	PCI: Align checking of syscall user config accessors
	mei: hbm: call mei_set_devstate() on hbm stop response
	drm/msm: Fix MSM_INFO_GET_IOVA with carveout
	drm/msm/dsi: Correct io_start for MSM8994 (20nm PHY)
	drm/msm/mdp5: Fix wait-for-commit for cmd panels
	drm/msm: Fix race of GPU init vs timestamp power management.
	drm/msm: Fix races managing the OOB state for timestamp vs timestamps.
	drm/msm/dp: trigger unplug event in msm_dp_display_disable
	vfio/iommu_type1: Populate full dirty when detach non-pinned group
	vfio/iommu_type1: Fix some sanity checks in detach group
	vfio-pci/zdev: fix possible segmentation fault issue
	ext4: fix potential htree index checksum corruption
	phy: USB_LGM_PHY should depend on X86
	coresight: etm4x: Skip accessing TRCPDCR in save/restore
	nvmem: core: Fix a resource leak on error in nvmem_add_cells_from_of()
	nvmem: core: skip child nodes not matching binding
	soundwire: bus: use sdw_update_no_pm when initializing a device
	soundwire: bus: use sdw_write_no_pm when setting the bus scale registers
	soundwire: export sdw_write/read_no_pm functions
	soundwire: bus: fix confusion on device used by pm_runtime
	misc: fastrpc: fix incorrect usage of dma_map_sgtable
	remoteproc/mediatek: acknowledge watchdog IRQ after handled
	regmap: sdw: use _no_pm functions in regmap_read/write
	ext: EXT4_KUNIT_TESTS should depend on EXT4_FS instead of selecting it
	mailbox: sprd: correct definition of SPRD_OUTBOX_FIFO_FULL
	device-dax: Fix default return code of range_parse()
	PCI: pci-bridge-emul: Fix array overruns, improve safety
	PCI: cadence: Fix DMA range mapping early return error
	i40e: Fix flow for IPv6 next header (extension header)
	i40e: Add zero-initialization of AQ command structures
	i40e: Fix overwriting flow control settings during driver loading
	i40e: Fix addition of RX filters after enabling FW LLDP agent
	i40e: Fix VFs not created
	Take mmap lock in cacheflush syscall
	nios2: fixed broken sys_clone syscall
	i40e: Fix add TC filter for IPv6
	octeontx2-af: Fix an off by one in rvu_dbg_qsize_write()
	pwm: iqs620a: Fix overflow and optimize calculations
	vfio/type1: Use follow_pte()
	ice: report correct max number of TCs
	ice: Account for port VLAN in VF max packet size calculation
	ice: Fix state bits on LLDP mode switch
	ice: update the number of available RSS queues
	net: stmmac: fix CBS idleslope and sendslope calculation
	net/mlx4_core: Add missed mlx4_free_cmd_mailbox()
	PCI: rockchip: Make 'ep-gpios' DT property optional
	vxlan: move debug check after netdev unregister
	wireguard: device: do not generate ICMP for non-IP packets
	wireguard: kconfig: use arm chacha even with no neon
	ocfs2: fix a use after free on error
	mm: memcontrol: fix NR_ANON_THPS accounting in charge moving
	mm: memcontrol: fix slub memory accounting
	mm/memory.c: fix potential pte_unmap_unlock pte error
	mm/hugetlb: fix potential double free in hugetlb_register_node() error path
	mm/hugetlb: suppress wrong warning info when alloc gigantic page
	mm/compaction: fix misbehaviors of fast_find_migrateblock()
	r8169: fix jumbo packet handling on RTL8168e
	NFSv4: Fixes for nfs4_bitmask_adjust()
	KVM: SVM: Intercept INVPCID when it's disabled to inject #UD
	KVM: x86/mmu: Expand collapsible SPTE zap for TDP MMU to ZONE_DEVICE and HugeTLB pages
	arm64: Add missing ISB after invalidating TLB in __primary_switch
	i2c: brcmstb: Fix brcmstd_send_i2c_cmd condition
	i2c: exynos5: Preserve high speed master code
	mm,thp,shmem: make khugepaged obey tmpfs mount flags
	mm: fix memory_failure() handling of dax-namespace metadata
	mm/rmap: fix potential pte_unmap on an not mapped pte
	proc: use kvzalloc for our kernel buffer
	csky: Fix a size determination in gpr_get()
	scsi: bnx2fc: Fix Kconfig warning & CNIC build errors
	scsi: sd: sd_zbc: Don't pass GFP_NOIO to kvcalloc
	block: reopen the device in blkdev_reread_part
	ide/falconide: Fix module unload
	scsi: sd: Fix Opal support
	blk-settings: align max_sectors on "logical_block_size" boundary
	soundwire: intel: fix possible crash when no device is detected
	ACPI: property: Fix fwnode string properties matching
	ACPI: configfs: add missing check after configfs_register_default_group()
	cpufreq: ACPI: Set cpuinfo.max_freq directly if max boost is known
	HID: logitech-dj: add support for keyboard events in eQUAD step 4 Gaming
	HID: wacom: Ignore attempts to overwrite the touch_max value from HID
	Input: raydium_ts_i2c - do not send zero length
	Input: xpad - add support for PowerA Enhanced Wired Controller for Xbox Series X|S
	Input: joydev - prevent potential read overflow in ioctl
	Input: i8042 - add ASUS Zenbook Flip to noselftest list
	media: mceusb: Fix potential out-of-bounds shift
	USB: serial: option: update interface mapping for ZTE P685M
	usb: musb: Fix runtime PM race in musb_queue_resume_work
	usb: dwc3: gadget: Fix setting of DEPCFG.bInterval_m1
	usb: dwc3: gadget: Fix dep->interval for fullspeed interrupt
	USB: serial: ftdi_sio: fix FTX sub-integer prescaler
	USB: serial: pl2303: fix line-speed handling on newer chips
	USB: serial: mos7840: fix error code in mos7840_write()
	USB: serial: mos7720: fix error code in mos7720_write()
	phy: lantiq: rcu-usb2: wait after clock enable
	ALSA: fireface: fix to parse sync status register of latter protocol
	ALSA: hda: Add another CometLake-H PCI ID
	ALSA: hda/hdmi: Drop bogus check at closing a stream
	ALSA: hda/realtek: modify EAPD in the ALC886
	ALSA: hda/realtek: Quirk for HP Spectre x360 14 amp setup
	MIPS: Ingenic: Disable HPTLB for D0 XBurst CPUs too
	MIPS: Support binutils configured with --enable-mips-fix-loongson3-llsc=yes
	MIPS: VDSO: Use CLANG_FLAGS instead of filtering out '--target='
	Revert "MIPS: Octeon: Remove special handling of CONFIG_MIPS_ELF_APPENDED_DTB=y"
	Revert "bcache: Kill btree_io_wq"
	bcache: Give btree_io_wq correct semantics again
	bcache: Move journal work to new flush wq
	Revert "drm/amd/display: Update NV1x SR latency values"
	drm/amd/display: Add FPU wrappers to dcn21_validate_bandwidth()
	drm/amd/display: Remove Assert from dcn10_get_dig_frontend
	drm/amd/display: Add vupdate_no_lock interrupts for DCN2.1
	drm/amdkfd: Fix recursive lock warnings
	drm/amdgpu: Set reference clock to 100Mhz on Renoir (v2)
	drm/nouveau/kms: handle mDP connectors
	drm/modes: Switch to 64bit maths to avoid integer overflow
	drm/sched: Cancel and flush all outstanding jobs before finish.
	drm/panel: kd35t133: allow using non-continuous dsi clock
	drm/rockchip: Require the YTR modifier for AFBC
	ASoC: siu: Fix build error by a wrong const prefix
	selinux: fix inconsistency between inode_getxattr and inode_listsecurity
	erofs: initialized fields can only be observed after bit is set
	tpm_tis: Fix check_locality for correct locality acquisition
	tpm_tis: Clean up locality release
	KEYS: trusted: Fix incorrect handling of tpm_get_random()
	KEYS: trusted: Fix migratable=1 failing
	KEYS: trusted: Reserve TPM for seal and unseal operations
	btrfs: do not cleanup upper nodes in btrfs_backref_cleanup_node
	btrfs: do not warn if we can't find the reloc root when looking up backref
	btrfs: add asserts for deleting backref cache nodes
	btrfs: abort the transaction if we fail to inc ref in btrfs_copy_root
	btrfs: fix reloc root leak with 0 ref reloc roots on recovery
	btrfs: splice remaining dirty_bg's onto the transaction dirty bg list
	btrfs: handle space_info::total_bytes_pinned inside the delayed ref itself
	btrfs: account for new extents being deleted in total_bytes_pinned
	btrfs: fix extent buffer leak on failure to copy root
	drm/i915/gt: Flush before changing register state
	drm/i915/gt: Correct surface base address for renderclear
	crypto: arm64/sha - add missing module aliases
	crypto: aesni - prevent misaligned buffers on the stack
	crypto: michael_mic - fix broken misalignment handling
	crypto: sun4i-ss - checking sg length is not sufficient
	crypto: sun4i-ss - IV register does not work on A10 and A13
	crypto: sun4i-ss - handle BigEndian for cipher
	crypto: sun4i-ss - initialize need_fallback
	soc: samsung: exynos-asv: don't defer early on not-supported SoCs
	soc: samsung: exynos-asv: handle reading revision register error
	seccomp: Add missing return in non-void function
	arm64: ptrace: Fix seccomp of traced syscall -1 (NO_SYSCALL)
	misc: rtsx: init of rts522a add OCP power off when no card is present
	drivers/misc/vmw_vmci: restrict too big queue size in qp_host_alloc_queue
	pstore: Fix typo in compression option name
	dts64: mt7622: fix slow sd card access
	arm64: dts: agilex: fix phy interface bit shift for gmac1 and gmac2
	staging/mt7621-dma: mtk-hsdma.c->hsdma-mt7621.c
	staging: gdm724x: Fix DMA from stack
	staging: rtl8188eu: Add Edimax EW-7811UN V2 to device table
	floppy: reintroduce O_NDELAY fix
	media: i2c: max9286: fix access to unallocated memory
	media: ir_toy: add another IR Droid device
	media: ipu3-cio2: Fix mbus_code processing in cio2_subdev_set_fmt()
	media: marvell-ccic: power up the device on mclk enable
	media: smipcie: fix interrupt handling and IR timeout
	x86/virt: Eat faults on VMXOFF in reboot flows
	x86/reboot: Force all cpus to exit VMX root if VMX is supported
	x86/fault: Fix AMD erratum #91 errata fixup for user code
	x86/entry: Fix instrumentation annotation
	powerpc/prom: Fix "ibm,arch-vec-5-platform-support" scan
	rcu: Pull deferred rcuog wake up to rcu_eqs_enter() callers
	rcu/nocb: Perform deferred wake up before last idle's need_resched() check
	kprobes: Fix to delay the kprobes jump optimization
	arm64: Extend workaround for erratum 1024718 to all versions of Cortex-A55
	iommu/arm-smmu-qcom: Fix mask extraction for bootloader programmed SMRs
	arm64: kexec_file: fix memory leakage in create_dtb() when fdt_open_into() fails
	arm64: uprobe: Return EOPNOTSUPP for AARCH32 instruction probing
	arm64 module: set plt* section addresses to 0x0
	arm64: spectre: Prevent lockdep splat on v4 mitigation enable path
	riscv: Disable KSAN_SANITIZE for vDSO
	watchdog: qcom: Remove incorrect usage of QCOM_WDT_ENABLE_IRQ
	watchdog: mei_wdt: request stop on unregister
	coresight: etm4x: Handle accesses to TRCSTALLCTLR
	mtd: spi-nor: sfdp: Fix last erase region marking
	mtd: spi-nor: sfdp: Fix wrong erase type bitmask for overlaid region
	mtd: spi-nor: core: Fix erase type discovery for overlaid region
	mtd: spi-nor: core: Add erase size check for erase command initialization
	mtd: spi-nor: hisi-sfc: Put child node np on error path
	fs/affs: release old buffer head on error path
	seq_file: document how per-entry resources are managed.
	x86: fix seq_file iteration for pat/memtype.c
	mm: memcontrol: fix swap undercounting in cgroup2
	mm: memcontrol: fix get_active_memcg return value
	hugetlb: fix update_and_free_page contig page struct assumption
	hugetlb: fix copy_huge_page_from_user contig page struct assumption
	mm/vmscan: restore zone_reclaim_mode ABI
	mm, compaction: make fast_isolate_freepages() stay within zone
	KVM: nSVM: fix running nested guests when npt=0
	nvmem: qcom-spmi-sdam: Fix uninitialized pdev pointer
	module: Ignore _GLOBAL_OFFSET_TABLE_ when warning for undefined symbols
	mmc: sdhci-esdhc-imx: fix kernel panic when remove module
	mmc: sdhci-pci-o2micro: Bug fix for SDR104 HW tuning failure
	powerpc/32: Preserve cr1 in exception prolog stack check to fix build error
	powerpc/kexec_file: fix FDT size estimation for kdump kernel
	powerpc/32s: Add missing call to kuep_lock on syscall entry
	spmi: spmi-pmic-arb: Fix hw_irq overflow
	mei: fix transfer over dma with extended header
	mei: me: emmitsburg workstation DID
	mei: me: add adler lake point S DID
	mei: me: add adler lake point LP DID
	gpio: pcf857x: Fix missing first interrupt
	mfd: gateworks-gsc: Fix interrupt type
	printk: fix deadlock when kernel panic
	exfat: fix shift-out-of-bounds in exfat_fill_super()
	zonefs: Fix file size of zones in full condition
	kcmp: Support selection of SYS_kcmp without CHECKPOINT_RESTORE
	thermal: cpufreq_cooling: freq_qos_update_request() returns < 0 on error
	cpufreq: qcom-hw: drop devm_xxx() calls from init/exit hooks
	cpufreq: intel_pstate: Change intel_pstate_get_hwp_max() argument
	cpufreq: intel_pstate: Get per-CPU max freq via MSR_HWP_CAPABILITIES if available
	proc: don't allow async path resolution of /proc/thread-self components
	s390/vtime: fix inline assembly clobber list
	virtio/s390: implement virtio-ccw revision 2 correctly
	um: mm: check more comprehensively for stub changes
	um: defer killing userspace on page table update failures
	irqchip/loongson-pch-msi: Use bitmap_zalloc() to allocate bitmap
	f2fs: fix out-of-repair __setattr_copy()
	f2fs: enforce the immutable flag on open files
	f2fs: flush data when enabling checkpoint back
	sparc32: fix a user-triggerable oops in clear_user()
	spi: fsl: invert spisel_boot signal on MPC8309
	spi: spi-synquacer: fix set_cs handling
	gfs2: fix glock confusion in function signal_our_withdraw
	gfs2: Don't skip dlm unlock if glock has an lvb
	gfs2: Lock imbalance on error path in gfs2_recover_one
	gfs2: Recursive gfs2_quota_hold in gfs2_iomap_end
	dm: fix deadlock when swapping to encrypted device
	dm table: fix iterate_devices based device capability checks
	dm table: fix DAX iterate_devices based device capability checks
	dm table: fix zoned iterate_devices based device capability checks
	dm writecache: fix performance degradation in ssd mode
	dm writecache: return the exact table values that were set
	dm writecache: fix writing beyond end of underlying device when shrinking
	dm era: Recover committed writeset after crash
	dm era: Update in-core bitset after committing the metadata
	dm era: Verify the data block size hasn't changed
	dm era: Fix bitset memory leaks
	dm era: Use correct value size in equality function of writeset tree
	dm era: Reinitialize bitset cache before digesting a new writeset
	dm era: only resize metadata in preresume
	drm/i915: Reject 446-480MHz HDMI clock on GLK
	kgdb: fix to kill breakpoints on initmem after boot
	ipv6: silence compilation warning for non-IPV6 builds
	net: icmp: pass zeroed opts from icmp{,v6}_ndo_send before sending
	wireguard: selftests: test multiple parallel streams
	wireguard: queueing: get rid of per-peer ring buffers
	net: sched: fix police ext initialization
	net: qrtr: Fix memory leak in qrtr_tun_open
	net_sched: fix RTNL deadlock again caused by request_module()
	ARM: dts: aspeed: Add LCLK to lpc-snoop
	Linux 5.10.20

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I3fbcecd9413ce212dac68d5cc800c9457feba56a
2021-03-07 12:33:33 +01:00
Paul Cercueil
b506450ce3 seccomp: Add missing return in non-void function
commit 04b38d012556199ba4c31195940160e0c44c64f0 upstream.

We don't actually care about the value, since the kernel will panic
before that; but a value should nonetheless be returned, otherwise the
compiler will complain.

Fixes: 8112c4f140 ("seccomp: remove 2-phase API")
Cc: stable@vger.kernel.org # 4.7+
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210111172839.640914-1-paul@crapouillou.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-04 11:38:32 +01:00
Jann Horn
6212e56b38 UPSTREAM: seccomp: Remove bogus __user annotations
Buffers that are passed to read_actions_logged() and write_actions_logged()
are in kernel memory; the sysctl core takes care of copying from/to
userspace.

Fixes: 32927393dc ("sysctl: pass kernel pointers to ->proc_handler")
Reviewed-by: Tyler Hicks <code@tyhicks.com>
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20201120170545.1419332-1-jannh@google.com
(cherry picked from commit fab686eb0307121e7a2890b6d6c57edd2457863d)
Signed-off-by: Jeff Vander Stoep <jeffv@google.com>
Change-Id: I378d1eb73ff01928d3255191ed4443cd7b87c720
Bug: 176068146
2020-12-21 18:48:41 +00:00
YiFei Zhu
2d660e9770 UPSTREAM: seccomp/cache: Add "emulator" to check if filter is constant allow
SECCOMP_CACHE will only operate on syscalls that do not access
any syscall arguments or instruction pointer. To facilitate
this we need a static analyser to know whether a filter will
return allow regardless of syscall arguments for a given
architecture number / syscall number pair. This is implemented
here with a pseudo-emulator, and stored in a per-filter bitmap.

In order to build this bitmap at filter attach time, each filter is
emulated for every syscall (under each possible architecture), and
checked for any accesses of struct seccomp_data that are not the "arch"
nor "nr" (syscall) members. If only "arch" and "nr" are examined, and
the program returns allow, then we can be sure that the filter must
return allow independent from syscall arguments.

Nearly all seccomp filters are built from these cBPF instructions:

BPF_LD  | BPF_W    | BPF_ABS
BPF_JMP | BPF_JEQ  | BPF_K
BPF_JMP | BPF_JGE  | BPF_K
BPF_JMP | BPF_JGT  | BPF_K
BPF_JMP | BPF_JSET | BPF_K
BPF_JMP | BPF_JA
BPF_RET | BPF_K
BPF_ALU | BPF_AND  | BPF_K

Each of these instructions are emulated. Any weirdness or loading
from a syscall argument will cause the emulator to bail.

The emulation is also halted if it reaches a return. In that case,
if it returns an SECCOMP_RET_ALLOW, the syscall is marked as good.

Emulator structure and comments are from Kees [1] and Jann [2].

Emulation is done at attach time. If a filter depends on more
filters, and if the dependee does not guarantee to allow the
syscall, then we skip the emulation of this syscall.

[1] https://lore.kernel.org/lkml/20200923232923.3142503-5-keescook@chromium.org/
[2] https://lore.kernel.org/lkml/CAG48ez1p=dR_2ikKq=xVxkoGg0fYpTBpkhJSv1w-6BG=76PAvw@mail.gmail.com/

Suggested-by: Jann Horn <jannh@google.com>
Signed-off-by: YiFei Zhu <yifeifz2@illinois.edu>
Reviewed-by: Jann Horn <jannh@google.com>
Co-developed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/71c7be2db5ee08905f41c3be5c1ad6e2601ce88f.1602431034.git.yifeifz2@illinois.edu
(cherry picked from commit 8e01b51a31a1e08e2c3e8fcc0ef6790441be2f61)
Signed-off-by: Jeff Vander Stoep <jeffv@google.com>
Change-Id: I5047f7f0d6502e5de6c047743f1053fda3025a6e
Bug: 176068146
2020-12-21 18:46:50 +00:00
YiFei Zhu
f89fef0eee UPSTREAM: seccomp/cache: Lookup syscall allowlist bitmap for fast path
The overhead of running Seccomp filters has been part of some past
discussions [1][2][3]. Oftentimes, the filters have a large number
of instructions that check syscall numbers one by one and jump based
on that. Some users chain BPF filters which further enlarge the
overhead. A recent work [6] comprehensively measures the Seccomp
overhead and shows that the overhead is non-negligible and has a
non-trivial impact on application performance.

We observed some common filters, such as docker's [4] or
systemd's [5], will make most decisions based only on the syscall
numbers, and as past discussions considered, a bitmap where each bit
represents a syscall makes most sense for these filters.

The fast (common) path for seccomp should be that the filter permits
the syscall to pass through, and failing seccomp is expected to be
an exceptional case; it is not expected for userspace to call a
denylisted syscall over and over.

When it can be concluded that an allow must occur for the given
architecture and syscall pair (this determination is introduced in
the next commit), seccomp will immediately allow the syscall,
bypassing further BPF execution.

Each architecture number has its own bitmap. The architecture
number in seccomp_data is checked against the defined architecture
number constant before proceeding to test the bit against the
bitmap with the syscall number as the index of the bit in the
bitmap, and if the bit is set, seccomp returns allow. The bitmaps
are all clear in this patch and will be initialized in the next
commit.

When only one architecture exists, the check against architecture
number is skipped, suggested by Kees Cook [7].

[1] https://lore.kernel.org/linux-security-module/c22a6c3cefc2412cad00ae14c1371711@huawei.com/T/
[2] https://lore.kernel.org/lkml/202005181120.971232B7B@keescook/T/
[3] https://github.com/seccomp/libseccomp/issues/116
[4] ae0ef82b90/profiles/seccomp/default.json
[5] 6743a1caf4/src/shared/seccomp-util.c (L270)
[6] Draco: Architectural and Operating System Support for System Call Security
    https://tianyin.github.io/pub/draco.pdf, MICRO-53, Oct. 2020
[7] https://lore.kernel.org/bpf/202010091614.8BB0EB64@keescook/

Co-developed-by: Dimitrios Skarlatos <dskarlat@cs.cmu.edu>
Signed-off-by: Dimitrios Skarlatos <dskarlat@cs.cmu.edu>
Signed-off-by: YiFei Zhu <yifeifz2@illinois.edu>
Reviewed-by: Jann Horn <jannh@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/10f91a367ec4fcdea7fc3f086de3f5f13a4a7436.1602431034.git.yifeifz2@illinois.edu
(cherry picked from commit f9d480b6ffbeb336bf7f6ce44825c00f61b3abae)A
Signed-off-by: Jeff Vander Stoep <jeffv@google.com>
Change-Id: I50b6682e17dc6e91b5e92017361200d722282825
Bug: 176068146
2020-12-21 18:46:39 +00:00
Mickaël Salaün
fb14528e44 seccomp: Set PF_SUPERPRIV when checking capability
Replace the use of security_capable(current_cred(), ...) with
ns_capable_noaudit() which set PF_SUPERPRIV.

Since commit 98f368e9e2 ("kernel: Add noaudit variant of
ns_capable()"), a new ns_capable_noaudit() helper is available.  Let's
use it!

Cc: Jann Horn <jannh@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Tyler Hicks <tyhicks@linux.microsoft.com>
Cc: Will Drewry <wad@chromium.org>
Cc: stable@vger.kernel.org
Fixes: e2cfabdfd0 ("seccomp: add system call filtering using BPF")
Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
Reviewed-by: Jann Horn <jannh@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20201030123849.770769-3-mic@digikod.net
2020-11-17 12:53:22 -08:00
Jann Horn
dfe719fef0 seccomp: Make duplicate listener detection non-racy
Currently, init_listener() tries to prevent adding a filter with
SECCOMP_FILTER_FLAG_NEW_LISTENER if one of the existing filters already
has a listener. However, this check happens without holding any lock that
would prevent another thread from concurrently installing a new filter
(potentially with a listener) on top of the ones we already have.

Theoretically, this is also a data race: The plain load from
current->seccomp.filter can race with concurrent writes to the same
location.

Fix it by moving the check into the region that holds the siglock to guard
against concurrent TSYNC.

(The "Fixes" tag points to the commit that introduced the theoretical
data race; concurrent installation of another filter with TSYNC only
became possible later, in commit 51891498f2 ("seccomp: allow TSYNC and
USER_NOTIF together").)

Fixes: 6a21cc50f0 ("seccomp: add a return code to trap to userspace")
Reviewed-by: Tycho Andersen <tycho@tycho.pizza>
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20201005014401.490175-1-jannh@google.com
2020-10-08 13:17:47 -07:00
Denis Efremov
2d9ca267a9 seccomp: Use current_pt_regs() instead of task_pt_regs(current)
As described in commit a3460a5974 ("new helper: current_pt_regs()"):
- arch versions are "optimized versions".
- some architectures have task_pt_regs() working only for traced tasks
  blocked on signal delivery. current_pt_regs() needs to work for *all*
  processes.

In preparation for adding a coccinelle rule for using current_*(), instead
of raw accesses to current members, modify seccomp_do_user_notification(),
__seccomp_filter(), __secure_computing() to use current_pt_regs().

Signed-off-by: Denis Efremov <efremov@linux.com>
Link: https://lore.kernel.org/r/20200824125921.488311-1-efremov@linux.com
[kees: Reworded commit log, add comment to populate_seccomp_data()]
Signed-off-by: Kees Cook <keescook@chromium.org>
2020-09-08 16:26:45 -07:00
Rich Felker
4d671d922d seccomp: kill process instead of thread for unknown actions
Asynchronous termination of a thread outside of the userspace thread
library's knowledge is an unsafe operation that leaves the process in
an inconsistent, corrupt, and possibly unrecoverable state. In order
to make new actions that may be added in the future safe on kernels
not aware of them, change the default action from
SECCOMP_RET_KILL_THREAD to SECCOMP_RET_KILL_PROCESS.

Signed-off-by: Rich Felker <dalias@libc.org>
Link: https://lore.kernel.org/r/20200829015609.GA32566@brightrain.aerifal.cx
[kees: Fixed up coredump selection logic to match]
Signed-off-by: Kees Cook <keescook@chromium.org>
2020-09-08 12:00:49 -07:00
Tycho Andersen
e839317900 seccomp: don't leave dangling ->notif if file allocation fails
Christian and Kees both pointed out that this is a bit sloppy to open-code
both places, and Christian points out that we leave a dangling pointer to
->notif if file allocation fails. Since we check ->notif for null in order
to determine if it's ok to install a filter, this means people won't be
able to install a filter if the file allocation fails for some reason, even
if they subsequently should be able to.

To fix this, let's hoist this free+null into its own little helper and use
it.

Reported-by: Kees Cook <keescook@chromium.org>
Reported-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Tycho Andersen <tycho@tycho.pizza>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Link: https://lore.kernel.org/r/20200902140953.1201956-1-tycho@tycho.pizza
Signed-off-by: Kees Cook <keescook@chromium.org>
2020-09-08 11:30:16 -07:00
Tycho Andersen
a566a9012a seccomp: don't leak memory when filter install races
In seccomp_set_mode_filter() with TSYNC | NEW_LISTENER, we first initialize
the listener fd, then check to see if we can actually use it later in
seccomp_may_assign_mode(), which can fail if anyone else in our thread
group has installed a filter and caused some divergence. If we can't, we
partially clean up the newly allocated file: we put the fd, put the file,
but don't actually clean up the *memory* that was allocated at
filter->notif. Let's clean that up too.

To accomplish this, let's hoist the actual "detach a notifier from a
filter" code to its own helper out of seccomp_notify_release(), so that in
case anyone adds stuff to init_listener(), they only have to add the
cleanup code in one spot. This does a bit of extra locking and such on the
failure path when the filter is not attached, but it's a slow failure path
anyway.

Fixes: 51891498f2 ("seccomp: allow TSYNC and USER_NOTIF together")
Reported-by: syzbot+3ad9614a12f80994c32e@syzkaller.appspotmail.com
Signed-off-by: Tycho Andersen <tycho@tycho.pizza>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Link: https://lore.kernel.org/r/20200902014017.934315-1-tycho@tycho.pizza
Signed-off-by: Kees Cook <keescook@chromium.org>
2020-09-08 11:19:50 -07:00
Sargun Dhillon
7cf97b1254 seccomp: Introduce addfd ioctl to seccomp user notifier
The current SECCOMP_RET_USER_NOTIF API allows for syscall supervision over
an fd. It is often used in settings where a supervising task emulates
syscalls on behalf of a supervised task in userspace, either to further
restrict the supervisee's syscall abilities or to circumvent kernel
enforced restrictions the supervisor deems safe to lift (e.g. actually
performing a mount(2) for an unprivileged container).

While SECCOMP_RET_USER_NOTIF allows for the interception of any syscall,
only a certain subset of syscalls could be correctly emulated. Over the
last few development cycles, the set of syscalls which can't be emulated
has been reduced due to the addition of pidfd_getfd(2). With this we are
now able to, for example, intercept syscalls that require the supervisor
to operate on file descriptors of the supervisee such as connect(2).

However, syscalls that cause new file descriptors to be installed can not
currently be correctly emulated since there is no way for the supervisor
to inject file descriptors into the supervisee. This patch adds a
new addfd ioctl to remove this restriction by allowing the supervisor to
install file descriptors into the intercepted task. By implementing this
feature via seccomp the supervisor effectively instructs the supervisee
to install a set of file descriptors into its own file descriptor table
during the intercepted syscall. This way it is possible to intercept
syscalls such as open() or accept(), and install (or replace, like
dup2(2)) the supervisor's resulting fd into the supervisee. One
replacement use-case would be to redirect the stdout and stderr of a
supervisee into log file descriptors opened by the supervisor.

The ioctl handling is based on the discussions[1] of how Extensible
Arguments should interact with ioctls. Instead of building size into
the addfd structure, make it a function of the ioctl command (which
is how sizes are normally passed to ioctls). To support forward and
backward compatibility, just mask out the direction and size, and match
everything. The size (and any future direction) checks are done along
with copy_struct_from_user() logic.

As a note, the seccomp_notif_addfd structure is laid out based on 8-byte
alignment without requiring packing as there have been packing issues
with uapi highlighted before[2][3]. Although we could overload the
newfd field and use -1 to indicate that it is not to be used, doing
so requires changing the size of the fd field, and introduces struct
packing complexity.

[1]: https://lore.kernel.org/lkml/87o8w9bcaf.fsf@mid.deneb.enyo.de/
[2]: https://lore.kernel.org/lkml/a328b91d-fd8f-4f27-b3c2-91a9c45f18c0@rasmusvillemoes.dk/
[3]: https://lore.kernel.org/lkml/20200612104629.GA15814@ircssh-2.c.rugged-nimbus-611.internal

Cc: Christoph Hellwig <hch@lst.de>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Tycho Andersen <tycho@tycho.ws>
Cc: Jann Horn <jannh@google.com>
Cc: Robert Sesek <rsesek@google.com>
Cc: Chris Palmer <palmer@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-api@vger.kernel.org
Suggested-by: Matt Denton <mpdenton@google.com>
Link: https://lore.kernel.org/r/20200603011044.7972-4-sargun@sargun.me
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Reviewed-by: Will Drewry <wad@chromium.org>
Co-developed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
2020-07-14 16:29:42 -07:00
Kees Cook
fe4bfff86e seccomp: Use -1 marker for end of mode 1 syscall list
The terminator for the mode 1 syscalls list was a 0, but that could be
a valid syscall number (e.g. x86_64 __NR_read). By luck, __NR_read was
listed first and the loop construct would not test it, so there was no
bug. However, this is fragile. Replace the terminator with -1 instead,
and make the variable name for mode 1 syscall lists more descriptive.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Will Drewry <wad@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
2020-07-10 16:01:52 -07:00
Kees Cook
47e33c05f9 seccomp: Fix ioctl number for SECCOMP_IOCTL_NOTIF_ID_VALID
When SECCOMP_IOCTL_NOTIF_ID_VALID was first introduced it had the wrong
direction flag set. While this isn't a big deal as nothing currently
enforces these bits in the kernel, it should be defined correctly. Fix
the define and provide support for the old command until it is no longer
needed for backward compatibility.

Fixes: 6a21cc50f0 ("seccomp: add a return code to trap to userspace")
Signed-off-by: Kees Cook <keescook@chromium.org>
2020-07-10 16:01:52 -07:00
Kees Cook
e68f9d49dd seccomp: Use pr_fmt
Avoid open-coding "seccomp: " prefixes for pr_*() calls.

Signed-off-by: Kees Cook <keescook@chromium.org>
2020-07-10 16:01:52 -07:00
Christian Brauner
99cdb8b9a5 seccomp: notify about unused filter
We've been making heavy use of the seccomp notifier to intercept and
handle certain syscalls for containers. This patch allows a syscall
supervisor listening on a given notifier to be notified when a seccomp
filter has become unused.

A container is often managed by a singleton supervisor process the
so-called "monitor". This monitor process has an event loop which has
various event handlers registered. If the user specified a seccomp
profile that included a notifier for various syscalls then we also
register a seccomp notify even handler. For any container using a
separate pid namespace the lifecycle of the seccomp notifier is bound to
the init process of the pid namespace, i.e. when the init process exits
the filter must be unused.

If a new process attaches to a container we force it to assume a seccomp
profile. This can either be the same seccomp profile as the container
was started with or a modified one. If the attaching process makes use
of the seccomp notifier we will register a new seccomp notifier handler
in the monitor's event loop. However, when the attaching process exits
we can't simply delete the handler since other child processes could've
been created (daemons spawned etc.) that have inherited the seccomp
filter and so we need to keep the seccomp notifier fd alive in the event
loop. But this is problematic since we don't get a notification when the
seccomp filter has become unused and so we currently never remove the
seccomp notifier fd from the event loop and just keep accumulating fds
in the event loop. We've had this issue for a while but it has recently
become more pressing as more and larger users make use of this.

To fix this, we introduce a new "users" reference counter that tracks any
tasks and dependent filters making use of a filter. When a notifier is
registered waiting tasks will be notified that the filter is now empty
by receiving a (E)POLLHUP event.

The concept in this patch introduces is the same as for signal_struct,
i.e. reference counting for life-cycle management is decoupled from
reference counting taks using the object. There's probably some trickery
possible but the second counter is just the correct way of doing this
IMHO and has precedence.

Cc: Tycho Andersen <tycho@tycho.ws>
Cc: Kees Cook <keescook@chromium.org>
Cc: Matt Denton <mpdenton@google.com>
Cc: Sargun Dhillon <sargun@sargun.me>
Cc: Jann Horn <jannh@google.com>
Cc: Chris Palmer <palmer@google.com>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Robert Sesek <rsesek@google.com>
Cc: Jeffrey Vander Stoep <jeffv@google.com>
Cc: Linux Containers <containers@lists.linux-foundation.org>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Link: https://lore.kernel.org/r/20200531115031.391515-3-christian.brauner@ubuntu.com
Signed-off-by: Kees Cook <keescook@chromium.org>
2020-07-10 16:01:51 -07:00
Christian Brauner
76194c4e83 seccomp: Lift wait_queue into struct seccomp_filter
Lift the wait_queue from struct notification into struct seccomp_filter.
This is cleaner overall and lets us avoid having to take the notifier
mutex in the future for EPOLLHUP notifications since we need to neither
read nor modify the notifier specific aspects of the seccomp filter. In
the exit path I'd very much like to avoid having to take the notifier mutex
for each filter in the task's filter hierarchy.

Cc: Tycho Andersen <tycho@tycho.ws>
Cc: Kees Cook <keescook@chromium.org>
Cc: Matt Denton <mpdenton@google.com>
Cc: Sargun Dhillon <sargun@sargun.me>
Cc: Jann Horn <jannh@google.com>
Cc: Chris Palmer <palmer@google.com>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Robert Sesek <rsesek@google.com>
Cc: Jeffrey Vander Stoep <jeffv@google.com>
Cc: Linux Containers <containers@lists.linux-foundation.org>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
2020-07-10 16:01:51 -07:00
Christian Brauner
3a15fb6ed9 seccomp: release filter after task is fully dead
The seccomp filter used to be released in free_task() which is called
asynchronously via call_rcu() and assorted mechanisms. Since we need
to inform tasks waiting on the seccomp notifier when a filter goes empty
we will notify them as soon as a task has been marked fully dead in
release_task(). To not split seccomp cleanup into two parts, move
filter release out of free_task() and into release_task() after we've
unhashed struct task from struct pid, exited signals, and unlinked it
from the threadgroups' thread list. We'll put the empty filter
notification infrastructure into it in a follow up patch.

This also renames put_seccomp_filter() to seccomp_filter_release() which
is a more descriptive name of what we're doing here especially once
we've added the empty filter notification mechanism in there.

We're also NULL-ing the task's filter tree entrypoint which seems
cleaner than leaving a dangling pointer in there. Note that this shouldn't
need any memory barriers since we're calling this when the task is in
release_task() which means it's EXIT_DEAD. So it can't modify its seccomp
filters anymore. You can also see this from the point where we're calling
seccomp_filter_release(). It's after __exit_signal() and at this point,
tsk->sighand will already have been NULLed which is required for
thread-sync and filter installation alike.

Cc: Tycho Andersen <tycho@tycho.ws>
Cc: Kees Cook <keescook@chromium.org>
Cc: Matt Denton <mpdenton@google.com>
Cc: Sargun Dhillon <sargun@sargun.me>
Cc: Jann Horn <jannh@google.com>
Cc: Chris Palmer <palmer@google.com>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Robert Sesek <rsesek@google.com>
Cc: Jeffrey Vander Stoep <jeffv@google.com>
Cc: Linux Containers <containers@lists.linux-foundation.org>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Link: https://lore.kernel.org/r/20200531115031.391515-2-christian.brauner@ubuntu.com
Signed-off-by: Kees Cook <keescook@chromium.org>
2020-07-10 16:01:51 -07:00
Christian Brauner
b707ddee11 seccomp: rename "usage" to "refs" and document
Naming the lifetime counter of a seccomp filter "usage" suggests a
little too strongly that its about tasks that are using this filter
while it also tracks other references such as the user notifier or
ptrace. This also updates the documentation to note this fact.

We'll be introducing an actual usage counter in a follow-up patch.

Cc: Tycho Andersen <tycho@tycho.ws>
Cc: Kees Cook <keescook@chromium.org>
Cc: Matt Denton <mpdenton@google.com>
Cc: Sargun Dhillon <sargun@sargun.me>
Cc: Jann Horn <jannh@google.com>
Cc: Chris Palmer <palmer@google.com>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Robert Sesek <rsesek@google.com>
Cc: Jeffrey Vander Stoep <jeffv@google.com>
Cc: Linux Containers <containers@lists.linux-foundation.org>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Link: https://lore.kernel.org/r/20200531115031.391515-1-christian.brauner@ubuntu.com
Signed-off-by: Kees Cook <keescook@chromium.org>
2020-07-10 16:01:51 -07:00
Sargun Dhillon
9f87dcf14b seccomp: Add find_notification helper
This adds a helper which can iterate through a seccomp_filter to
find a notification matching an ID. It removes several replicated
chunks of code.

Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Reviewed-by: Tycho Andersen <tycho@tycho.ws>
Cc: Matt Denton <mpdenton@google.com>
Cc: Kees Cook <keescook@google.com>,
Cc: Jann Horn <jannh@google.com>,
Cc: Robert Sesek <rsesek@google.com>,
Cc: Chris Palmer <palmer@google.com>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Tycho Andersen <tycho@tycho.ws>
Link: https://lore.kernel.org/r/20200601112532.150158-1-sargun@sargun.me
Signed-off-by: Kees Cook <keescook@chromium.org>
2020-07-10 16:01:51 -07:00
Kees Cook
c818c03b66 seccomp: Report number of loaded filters in /proc/$pid/status
A common question asked when debugging seccomp filters is "how many
filters are attached to your process?" Provide a way to easily answer
this question through /proc/$pid/status with a "Seccomp_filters" line.

Signed-off-by: Kees Cook <keescook@chromium.org>
2020-07-10 16:01:51 -07:00
Christoph Hellwig
32927393dc sysctl: pass kernel pointers to ->proc_handler
Instead of having all the sysctl handlers deal with user pointers, which
is rather hairy in terms of the BPF interaction, copy the input to and
from  userspace in common code.  This also means that the strings are
always NUL-terminated by the common code, making the API a little bit
safer.

As most handler just pass through the data to one of the common handlers
a lot of the changes are mechnical.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-04-27 02:07:40 -04:00
Linus Torvalds
29d9f30d4c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from David Miller:
 "Highlights:

   1) Fix the iwlwifi regression, from Johannes Berg.

   2) Support BSS coloring and 802.11 encapsulation offloading in
      hardware, from John Crispin.

   3) Fix some potential Spectre issues in qtnfmac, from Sergey
      Matyukevich.

   4) Add TTL decrement action to openvswitch, from Matteo Croce.

   5) Allow paralleization through flow_action setup by not taking the
      RTNL mutex, from Vlad Buslov.

   6) A lot of zero-length array to flexible-array conversions, from
      Gustavo A. R. Silva.

   7) Align XDP statistics names across several drivers for consistency,
      from Lorenzo Bianconi.

   8) Add various pieces of infrastructure for offloading conntrack, and
      make use of it in mlx5 driver, from Paul Blakey.

   9) Allow using listening sockets in BPF sockmap, from Jakub Sitnicki.

  10) Lots of parallelization improvements during configuration changes
      in mlxsw driver, from Ido Schimmel.

  11) Add support to devlink for generic packet traps, which report
      packets dropped during ACL processing. And use them in mlxsw
      driver. From Jiri Pirko.

  12) Support bcmgenet on ACPI, from Jeremy Linton.

  13) Make BPF compatible with RT, from Thomas Gleixnet, Alexei
      Starovoitov, and your's truly.

  14) Support XDP meta-data in virtio_net, from Yuya Kusakabe.

  15) Fix sysfs permissions when network devices change namespaces, from
      Christian Brauner.

  16) Add a flags element to ethtool_ops so that drivers can more simply
      indicate which coalescing parameters they actually support, and
      therefore the generic layer can validate the user's ethtool
      request. Use this in all drivers, from Jakub Kicinski.

  17) Offload FIFO qdisc in mlxsw, from Petr Machata.

  18) Support UDP sockets in sockmap, from Lorenz Bauer.

  19) Fix stretch ACK bugs in several TCP congestion control modules,
      from Pengcheng Yang.

  20) Support virtual functiosn in octeontx2 driver, from Tomasz
      Duszynski.

  21) Add region operations for devlink and use it in ice driver to dump
      NVM contents, from Jacob Keller.

  22) Add support for hw offload of MACSEC, from Antoine Tenart.

  23) Add support for BPF programs that can be attached to LSM hooks,
      from KP Singh.

  24) Support for multiple paths, path managers, and counters in MPTCP.
      From Peter Krystad, Paolo Abeni, Florian Westphal, Davide Caratti,
      and others.

  25) More progress on adding the netlink interface to ethtool, from
      Michal Kubecek"

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2121 commits)
  net: ipv6: rpl_iptunnel: Fix potential memory leak in rpl_do_srh_inline
  cxgb4/chcr: nic-tls stats in ethtool
  net: dsa: fix oops while probing Marvell DSA switches
  net/bpfilter: remove superfluous testing message
  net: macb: Fix handling of fixed-link node
  net: dsa: ksz: Select KSZ protocol tag
  netdevsim: dev: Fix memory leak in nsim_dev_take_snapshot_write
  net: stmmac: add EHL 2.5Gbps PCI info and PCI ID
  net: stmmac: add EHL PSE0 & PSE1 1Gbps PCI info and PCI ID
  net: stmmac: create dwmac-intel.c to contain all Intel platform
  net: dsa: bcm_sf2: Support specifying VLAN tag egress rule
  net: dsa: bcm_sf2: Add support for matching VLAN TCI
  net: dsa: bcm_sf2: Move writing of CFP_DATA(5) into slicing functions
  net: dsa: bcm_sf2: Check earlier for FLOW_EXT and FLOW_MAC_EXT
  net: dsa: bcm_sf2: Disable learning for ASP port
  net: dsa: b53: Deny enslaving port 7 for 7278 into a bridge
  net: dsa: b53: Prevent tagged VLAN on port 7 for 7278
  net: dsa: b53: Restore VLAN entries upon (re)configuration
  net: dsa: bcm_sf2: Fix overflow checks
  hv_netvsc: Remove unnecessary round_up for recv_completion_cnt
  ...
2020-03-31 17:29:33 -07:00
Sven Schnelle
3db81afd99 seccomp: Add missing compat_ioctl for notify
Executing the seccomp_bpf testsuite under a 64-bit kernel with 32-bit
userland (both s390 and x86) doesn't work because there's no compat_ioctl
handler defined. Add the handler.

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Fixes: 6a21cc50f0 ("seccomp: add a return code to trap to userspace")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200310123332.42255-1-svens@linux.ibm.com
Signed-off-by: Kees Cook <keescook@chromium.org>
2020-03-29 21:10:51 -07:00
Tycho Andersen
51891498f2 seccomp: allow TSYNC and USER_NOTIF together
The restriction introduced in 7a0df7fbc1 ("seccomp: Make NEW_LISTENER and
TSYNC flags exclusive") is mostly artificial: there is enough information
in a seccomp user notification to tell which thread triggered a
notification. The reason it was introduced is because TSYNC makes the
syscall return a thread-id on failure, and NEW_LISTENER returns an fd, and
there's no way to distinguish between these two cases (well, I suppose the
caller could check all fds it has, then do the syscall, and if the return
value was an fd that already existed, then it must be a thread id, but
bleh).

Matthew would like to use these two flags together in the Chrome sandbox
which wants to use TSYNC for video drivers and NEW_LISTENER to proxy
syscalls.

So, let's fix this ugliness by adding another flag, TSYNC_ESRCH, which
tells the kernel to just return -ESRCH on a TSYNC error. This way,
NEW_LISTENER (and any subsequent seccomp() commands that want to return
positive values) don't conflict with each other.

Suggested-by: Matthew Denton <mpdenton@google.com>
Signed-off-by: Tycho Andersen <tycho@tycho.ws>
Link: https://lore.kernel.org/r/20200304180517.23867-1-tycho@tycho.ws
Signed-off-by: Kees Cook <keescook@chromium.org>
2020-03-04 14:48:54 -08:00
David Miller
3d9f773cf2 bpf: Use bpf_prog_run_pin_on_cpu() at simple call sites.
All of these cases are strictly of the form:

	preempt_disable();
	BPF_PROG_RUN(...);
	preempt_enable();

Replace this with bpf_prog_run_pin_on_cpu() which wraps BPF_PROG_RUN()
with:

	migrate_disable();
	BPF_PROG_RUN(...);
	migrate_enable();

On non RT enabled kernels this maps to preempt_disable/enable() and on RT
enabled kernels this solely prevents migration, which is sufficient as
there is no requirement to prevent reentrancy to any BPF program from a
preempting task. The only requirement is that the program stays on the same
CPU.

Therefore, this is a trivially correct transformation.

The seccomp loop does not need protection over the loop. It only needs
protection per BPF filter program

[ tglx: Converted to bpf_prog_run_pin_on_cpu() ]

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200224145643.691493094@linutronix.de
2020-02-24 16:20:09 -08:00
Sargun Dhillon
2882d53c9c seccomp: Check that seccomp_notif is zeroed out by the user
This patch is a small change in enforcement of the uapi for
SECCOMP_IOCTL_NOTIF_RECV ioctl. Specifically, the datastructure which
is passed (seccomp_notif) must be zeroed out. Previously any of its
members could be set to nonsense values, and we would ignore it.

This ensures all fields are set to their zero value.

Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com>
Reviewed-by: Aleksa Sarai <cyphar@cyphar.com>
Acked-by: Tycho Andersen <tycho@tycho.ws>
Link: https://lore.kernel.org/r/20191229062451.9467-2-sargun@sargun.me
Fixes: 6a21cc50f0 ("seccomp: add a return code to trap to userspace")
Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2020-01-02 13:03:45 -08:00
Christian Brauner
fb3c5386b3 seccomp: add SECCOMP_USER_NOTIF_FLAG_CONTINUE
This allows the seccomp notifier to continue a syscall. A positive
discussion about this feature was triggered by a post to the
ksummit-discuss mailing list (cf. [3]) and took place during KSummit
(cf. [1]) and again at the containers/checkpoint-restore
micro-conference at Linux Plumbers.

Recently we landed seccomp support for SECCOMP_RET_USER_NOTIF (cf. [4])
which enables a process (watchee) to retrieve an fd for its seccomp
filter. This fd can then be handed to another (usually more privileged)
process (watcher). The watcher will then be able to receive seccomp
messages about the syscalls having been performed by the watchee.

This feature is heavily used in some userspace workloads. For example,
it is currently used to intercept mknod() syscalls in user namespaces
aka in containers.
The mknod() syscall can be easily filtered based on dev_t. This allows
us to only intercept a very specific subset of mknod() syscalls.
Furthermore, mknod() is not possible in user namespaces toto coelo and
so intercepting and denying syscalls that are not in the whitelist on
accident is not a big deal. The watchee won't notice a difference.

In contrast to mknod(), a lot of other syscall we intercept (e.g.
setxattr()) cannot be easily filtered like mknod() because they have
pointer arguments. Additionally, some of them might actually succeed in
user namespaces (e.g. setxattr() for all "user.*" xattrs). Since we
currently cannot tell seccomp to continue from a user notifier we are
stuck with performing all of the syscalls in lieu of the container. This
is a huge security liability since it is extremely difficult to
correctly assume all of the necessary privileges of the calling task
such that the syscall can be successfully emulated without escaping
other additional security restrictions (think missing CAP_MKNOD for
mknod(), or MS_NODEV on a filesystem etc.). This can be solved by
telling seccomp to resume the syscall.

One thing that came up in the discussion was the problem that another
thread could change the memory after userspace has decided to let the
syscall continue which is a well known TOCTOU with seccomp which is
present in other ways already.
The discussion showed that this feature is already very useful for any
syscall without pointer arguments. For any accidentally intercepted
non-pointer syscall it is safe to continue.
For syscalls with pointer arguments there is a race but for any cautious
userspace and the main usec cases the race doesn't matter. The notifier
is intended to be used in a scenario where a more privileged watcher
supervises the syscalls of lesser privileged watchee to allow it to get
around kernel-enforced limitations by performing the syscall for it
whenever deemed save by the watcher. Hence, if a user tricks the watcher
into allowing a syscall they will either get a deny based on
kernel-enforced restrictions later or they will have changed the
arguments in such a way that they manage to perform a syscall with
arguments that they would've been allowed to do anyway.
In general, it is good to point out again, that the notifier fd was not
intended to allow userspace to implement a security policy but rather to
work around kernel security mechanisms in cases where the watcher knows
that a given action is safe to perform.

/* References */
[1]: https://linuxplumbersconf.org/event/4/contributions/560
[2]: https://linuxplumbersconf.org/event/4/contributions/477
[3]: https://lore.kernel.org/r/20190719093538.dhyopljyr5ns33qx@brauner.io
[4]: commit 6a21cc50f0 ("seccomp: add a return code to trap to userspace")

Co-developed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Reviewed-by: Tycho Andersen <tycho@tycho.ws>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Will Drewry <wad@chromium.org>
CC: Tyler Hicks <tyhicks@canonical.com>
Link: https://lore.kernel.org/r/20190920083007.11475-2-christian.brauner@ubuntu.com
Signed-off-by: Kees Cook <keescook@chromium.org>
2019-10-10 14:45:51 -07:00
Eric W. Biederman
a89e9b8abf signal: Remove the signal number and task parameters from force_sig_info
force_sig_info always delivers to the current task and the signal
parameter always matches info.si_signo.  So remove those parameters to
make it a simpler less error prone interface, and to make it clear
that none of the callers are doing anything clever.

This guarantees that force_sig_info will not grow any new buggy
callers that attempt to call force_sig on a non-current task, or that
pass an signal number that does not match info.si_signo.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2019-05-29 09:31:44 -05:00
Linus Torvalds
02aff8db64 audit/stable-5.2 PR 20190507
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAlzRrzoUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXNc7hAApgsi+3Jf9i29mgrKdrTciZ35TegK
 C8pTlOIndpBcmdwDakR50/PgfMHdHll8M9TReVNEjbe0S+Ww5GTE7eWtL3YqoPC2
 MuXEqcriz6UNi5Xma6vCZrDznWLXkXnzMDoDoYGDSoKuUYxef0fuqxDBnERM60Ht
 s52+0XvR5ZseBw7I1KIv/ix2fXuCGq6eCdqassm0rvLPQ7bq6nWzFAlNXOLud303
 DjIWu6Op2EL0+fJSmG+9Z76zFjyEbhMIhw5OPDeH4eO3pxX29AIv0m0JlI7ZXxfc
 /VVC3r5G4WrsWxwKMstOokbmsQxZ5pB3ZaceYpco7U+9N2e3SlpsNM9TV+Y/0ac/
 ynhYa//GK195LpMXx1BmWmLpjBHNgL8MvQkVTIpDia0GT+5sX7+haDxNLGYbocmw
 A/mR+KM2jAU3QzNseGh6c659j3K4tbMIFMNxt7pUBxVPLafcccNngFGTpzCwu5GU
 b7y4d21g6g/3Irj14NYU/qS8dTjW0rYrCMDquTpxmMfZ2xYuSvQmnBw91NQzVBp2
 98L2/fsUG3yOa5MApgv+ryJySsIM+SW+7leKS5tjy/IJINzyPEZ85l3o8ck8X4eT
 nohpKc/ELmeyi3omFYq18ecvFf2YRS5jRnz89i9q65/3ESgGiC0wyGOhNTvjvsyv
 k4jT0slIK614aGk=
 =p8Fp
 -----END PGP SIGNATURE-----

Merge tag 'audit-pr-20190507' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit

Pull audit updates from Paul Moore:
 "We've got a reasonably broad set of audit patches for the v5.2 merge
  window, the highlights are below:

   - The biggest change, and the source of all the arch/* changes, is
     the patchset from Dmitry to help enable some of the work he is
     doing around PTRACE_GET_SYSCALL_INFO.

     To be honest, including this in the audit tree is a bit of a
     stretch, but it does help move audit a little further along towards
     proper syscall auditing for all arches, and everyone else seemed to
     agree that audit was a "good" spot for this to land (or maybe they
     just didn't want to merge it? dunno.).

   - We can now audit time/NTP adjustments.

   - We continue the work to connect associated audit records into a
     single event"

* tag 'audit-pr-20190507' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: (21 commits)
  audit: fix a memory leak bug
  ntp: Audit NTP parameters adjustment
  timekeeping: Audit clock adjustments
  audit: purge unnecessary list_empty calls
  audit: link integrity evm_write_xattrs record to syscall event
  syscall_get_arch: add "struct task_struct *" argument
  unicore32: define syscall_get_arch()
  Move EM_UNICORE to uapi/linux/elf-em.h
  nios2: define syscall_get_arch()
  nds32: define syscall_get_arch()
  Move EM_NDS32 to uapi/linux/elf-em.h
  m68k: define syscall_get_arch()
  hexagon: define syscall_get_arch()
  Move EM_HEXAGON to uapi/linux/elf-em.h
  h8300: define syscall_get_arch()
  c6x: define syscall_get_arch()
  arc: define syscall_get_arch()
  Move EM_ARCOMPACT and EM_ARCV2 to uapi/linux/elf-em.h
  audit: Make audit_log_cap and audit_copy_inode static
  audit: connect LOGIN record to its syscall record
  ...
2019-05-07 19:06:04 -07:00
Linus Torvalds
78ee8b1b9b Merge branch 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "Just a few bugfixes and documentation updates"

* 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  seccomp: fix up grammar in comment
  Revert "security: inode: fix a missing check for securityfs_create_file"
  Yama: mark function as static
  security: inode: fix a missing check for securityfs_create_file
  keys: safe concurrent user->{session,uid}_keyring access
  security: don't use RCU accessors for cred->session_keyring
  Yama: mark local symbols as static
  LSM: lsm_hooks.h: fix documentation format
  LSM: fix documentation for the shm_* hooks
  LSM: fix documentation for the sem_* hooks
  LSM: fix documentation for the msg_queue_* hooks
  LSM: fix documentation for the audit_* hooks
  LSM: fix documentation for the path_chmod hook
  LSM: fix documentation for the socket_getpeersec_dgram hook
  LSM: fix documentation for the task_setscheduler hook
  LSM: fix documentation for the socket_post_create hook
  LSM: fix documentation for the syslog hook
  LSM: fix documentation for sb_copy_data hook
2019-05-07 08:39:54 -07:00
Linus Torvalds
83a50840e7 seccomp use-after-free fix
- Add logic for making some seccomp flags exclusive (Tycho)
 - Update selftests for exclusivity testing (Kees)
 -----BEGIN PGP SIGNATURE-----
 Comment: Kees Cook <kees@outflux.net>
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAlzHVl0WHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJqZuD/wK/PccncrPcBVtyFwWVVPj1HaM
 97icUgcbzC2mgpGmIDj5lZwpzXjvSlvkLenwcX+QEO0BfRbomUtcFqiMo3GMsHE3
 JMJDQ4r+eQLZX2r/f0rgJ+yS80DzpgF4PjLbC2kcDXdVTNUBetafwq4tfP1wEYbE
 Fumw64hjJidvahKUlJh94xQzatBFSA6gzPcWCn6VbFKDIQ/Zu1zMvWPxsVqOEAol
 rNSW5qFlxHI35znMg2/5tfZ8Z9bbemYcYDwlWwCZkNcoRBfs5rpgFhYuE5o5qYZT
 ndQQnfv24HoH0Q1zMq67uLdcPwVzg8VQjKQiZr9QWhKfSsFi8mtd00/yvqm9z/Hy
 1gwHv6bSzmfNyPYoFCTHKrMutUKy9aUHBdPjXdjOOD6V30QWbCETUHQ+Ipkq7qCm
 YbIhJL/FRHF2BAFU7uT+2/xob9JGD80n5nYZtZDdBx0zgDZb5xTuSN8fi8jVf+Ye
 so6Zwu64OdcAt+AGIl0Q3f+bCBYnjLF1Ec14TfJgOZAuw1fdsi8uAsFBV+aHu7tP
 SsDqDLCcY6p98x7AlFpEf4pN4oIC7kWOMFdJH7dK9pNeh4Q6Omf0vpHY6tAxC8yX
 LsFcimfKgJnlGPoqLN04Aq3K5Qj55lcpNv8RbQ5YuKujzhHH3/yltNCWSR59TFsz
 anZKkfzZckEdoJ9vSg==
 =12Pp
 -----END PGP SIGNATURE-----

Merge tag 'seccomp-v5.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull seccomp fixes from Kees Cook:
 "Syzbot found a use-after-free bug in seccomp due to flags that should
  not be allowed to be used together.

  Tycho fixed this, I updated the self-tests, and the syzkaller PoC has
  been running for several days without triggering KASan (before this
  fix, it would reproduce). These patches have also been in -next for
  almost a week, just to be sure.

   - Add logic for making some seccomp flags exclusive (Tycho)

   - Update selftests for exclusivity testing (Kees)"

* tag 'seccomp-v5.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  seccomp: Make NEW_LISTENER and TSYNC flags exclusive
  selftests/seccomp: Prepare for exclusive seccomp flags
2019-04-29 13:24:34 -07:00
Tycho Andersen
7a0df7fbc1 seccomp: Make NEW_LISTENER and TSYNC flags exclusive
As the comment notes, the return codes for TSYNC and NEW_LISTENER
conflict, because they both return positive values, one in the case of
success and one in the case of error. So, let's disallow both of these
flags together.

While this is technically a userspace break, all the users I know
of are still waiting on me to land this feature in libseccomp, so I
think it'll be safe. Also, at present my use case doesn't require
TSYNC at all, so this isn't a big deal to disallow. If someone
wanted to support this, a path forward would be to add a new flag like
TSYNC_AND_LISTENER_YES_I_UNDERSTAND_THAT_TSYNC_WILL_JUST_RETURN_EAGAIN,
but the use cases are so different I don't see it really happening.

Finally, it's worth noting that this does actually fix a UAF issue: at the
end of seccomp_set_mode_filter(), we have:

        if (flags & SECCOMP_FILTER_FLAG_NEW_LISTENER) {
                if (ret < 0) {
                        listener_f->private_data = NULL;
                        fput(listener_f);
                        put_unused_fd(listener);
                } else {
                        fd_install(listener, listener_f);
                        ret = listener;
                }
        }
out_free:
        seccomp_filter_free(prepared);

But if ret > 0 because TSYNC raced, we'll install the listener fd and then
free the filter out from underneath it, causing a UAF when the task closes
it or dies. This patch also switches the condition to be simply if (ret),
so that if someone does add the flag mentioned above, they won't have to
remember to fix this too.

Reported-by: syzbot+b562969adb2e04af3442@syzkaller.appspotmail.com
Fixes: 6a21cc50f0 ("seccomp: add a return code to trap to userspace")
CC: stable@vger.kernel.org # v5.0+
Signed-off-by: Tycho Andersen <tycho@tycho.ws>
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: James Morris <jamorris@linux.microsoft.com>
2019-04-25 15:55:58 -07:00
Tycho Andersen
6beff00b79 seccomp: fix up grammar in comment
This sentence is kind of a train wreck anyway, but at least dropping the
extra pronoun helps somewhat.

Signed-off-by: Tycho Andersen <tycho@tycho.ws>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2019-04-23 16:21:12 -07:00
Steven Rostedt (Red Hat)
b35f549df1 syscalls: Remove start and number from syscall_get_arguments() args
At Linux Plumbers, Andy Lutomirski approached me and pointed out that the
function call syscall_get_arguments() implemented in x86 was horribly
written and not optimized for the standard case of passing in 0 and 6 for
the starting index and the number of system calls to get. When looking at
all the users of this function, I discovered that all instances pass in only
0 and 6 for these arguments. Instead of having this function handle
different cases that are never used, simply rewrite it to return the first 6
arguments of a system call.

This should help out the performance of tracing system calls by ptrace,
ftrace and perf.

Link: http://lkml.kernel.org/r/20161107213233.754809394@goodmis.org

Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Dave Martin <dave.martin@arm.com>
Cc: "Dmitry V. Levin" <ldv@altlinux.org>
Cc: x86@kernel.org
Cc: linux-snps-arc@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-c6x-dev@linux-c6x.org
Cc: uclinux-h8-devel@lists.sourceforge.jp
Cc: linux-hexagon@vger.kernel.org
Cc: linux-ia64@vger.kernel.org
Cc: linux-mips@vger.kernel.org
Cc: nios2-dev@lists.rocketboards.org
Cc: openrisc@lists.librecores.org
Cc: linux-parisc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-riscv@lists.infradead.org
Cc: linux-s390@vger.kernel.org
Cc: linux-sh@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Cc: linux-um@lists.infradead.org
Cc: linux-xtensa@linux-xtensa.org
Cc: linux-arch@vger.kernel.org
Acked-by: Paul Burton <paul.burton@mips.com> # MIPS parts
Acked-by: Max Filippov <jcmvbkbc@gmail.com> # For xtensa changes
Acked-by: Will Deacon <will.deacon@arm.com> # For the arm64 bits
Reviewed-by: Thomas Gleixner <tglx@linutronix.de> # for x86
Reviewed-by: Dmitry V. Levin <ldv@altlinux.org>
Reported-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-04-05 09:26:43 -04:00
Dmitry V. Levin
16add41164 syscall_get_arch: add "struct task_struct *" argument
This argument is required to extend the generic ptrace API with
PTRACE_GET_SYSCALL_INFO request: syscall_get_arch() is going
to be called from ptrace_request() along with syscall_get_nr(),
syscall_get_arguments(), syscall_get_error(), and
syscall_get_return_value() functions with a tracee as their argument.

The primary intent is that the triple (audit_arch, syscall_nr, arg1..arg6)
should describe what system call is being called and what its arguments
are.

Reverts: 5e937a9ae9 ("syscall_get_arch: remove useless function arguments")
Reverts: 1002d94d30 ("syscall.h: fix doc text for syscall_get_arch()")
Reviewed-by: Andy Lutomirski <luto@kernel.org> # for x86
Reviewed-by: Palmer Dabbelt <palmer@sifive.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Acked-by: Paul Burton <paul.burton@mips.com> # MIPS parts
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Acked-by: Kees Cook <keescook@chromium.org> # seccomp parts
Acked-by: Mark Salter <msalter@redhat.com> # for the c6x bit
Cc: Elvira Khabirova <lineprinter@altlinux.org>
Cc: Eugene Syromyatnikov <esyr@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: x86@kernel.org
Cc: linux-alpha@vger.kernel.org
Cc: linux-snps-arc@lists.infradead.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-c6x-dev@linux-c6x.org
Cc: uclinux-h8-devel@lists.sourceforge.jp
Cc: linux-hexagon@vger.kernel.org
Cc: linux-ia64@vger.kernel.org
Cc: linux-m68k@lists.linux-m68k.org
Cc: linux-mips@vger.kernel.org
Cc: nios2-dev@lists.rocketboards.org
Cc: openrisc@lists.librecores.org
Cc: linux-parisc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-riscv@lists.infradead.org
Cc: linux-s390@vger.kernel.org
Cc: linux-sh@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Cc: linux-um@lists.infradead.org
Cc: linux-xtensa@linux-xtensa.org
Cc: linux-arch@vger.kernel.org
Cc: linux-audit@redhat.com
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-03-20 21:12:36 -04:00
Linus Torvalds
ae5906ceee Merge branch 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:

 - Extend LSM stacking to allow sharing of cred, file, ipc, inode, and
   task blobs. This paves the way for more full-featured LSMs to be
   merged, and is specifically aimed at LandLock and SARA LSMs. This
   work is from Casey and Kees.

 - There's a new LSM from Micah Morton: "SafeSetID gates the setid
   family of syscalls to restrict UID/GID transitions from a given
   UID/GID to only those approved by a system-wide whitelist." This
   feature is currently shipping in ChromeOS.

* 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (62 commits)
  keys: fix missing __user in KEYCTL_PKEY_QUERY
  LSM: Update list of SECURITYFS users in Kconfig
  LSM: Ignore "security=" when "lsm=" is specified
  LSM: Update function documentation for cap_capable
  security: mark expected switch fall-throughs and add a missing break
  tomoyo: Bump version.
  LSM: fix return value check in safesetid_init_securityfs()
  LSM: SafeSetID: add selftest
  LSM: SafeSetID: remove unused include
  LSM: SafeSetID: 'depend' on CONFIG_SECURITY
  LSM: Add 'name' field for SafeSetID in DEFINE_LSM
  LSM: add SafeSetID module that gates setid calls
  LSM: add SafeSetID module that gates setid calls
  tomoyo: Allow multiple use_group lines.
  tomoyo: Coding style fix.
  tomoyo: Swicth from cred->security to task_struct->security.
  security: keys: annotate implicit fall throughs
  security: keys: annotate implicit fall throughs
  security: keys: annotate implicit fall through
  capabilities:: annotate implicit fall through
  ...
2019-03-07 11:44:01 -08:00
Alexei Starovoitov
e80d02dd76 seccomp, bpf: disable preemption before calling into bpf prog
All BPF programs must be called with preemption disabled.

Fixes: 568f196756 ("bpf: check that BPF programs run with preemption disabled")
Reported-by: syzbot+8bf19ee2aa580de7a2a7@syzkaller.appspotmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-22 00:14:19 +01:00
James Morris
9624d5c9c7 Linux 5.0-rc3
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAlxFDv0eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGBPsH/3Ij47fut8kwxGSX
 Tmx7Y+VYftRiKSwK3+HxsCvde3scqfkxAukb3HeJDzZdpnouT0k4nqUYQabAANi/
 MdaO+NSBRp/NjzZcpFG9QAroIQ2G2sRQ4E8ldFcNmdsjZWlUfKIHPfYHzvvc06L4
 MhvdkpMa/p51Jz9egQs0kfSvrb6fh4OEDTI19/aaGR0oJBhoGhLrqTI+vdYhMiyO
 wWtUXgZfsmlCBdAQLRh04CxGTc/32VApoB/SwP9sF+xD3gcL0mPFNKUociio6K2Y
 a7u7yuzUKvVwuafVgX9QT+f+je5/5u+WFsG/26cfXzizZoNWW5oDl3sBD3hRNkvt
 J13lB1w=
 =ch+/
 -----END PGP SIGNATURE-----

Merge tag 'v5.0-rc3' into next-general

Sync to Linux 5.0-rc3 to pull in the VFS changes which impacted a lot
of the LSM code.
2019-01-22 14:33:10 -08:00
Tycho Andersen
a811dc6155 seccomp: fix UAF in user-trap code
On the failure path, we do an fput() of the listener fd if the filter fails
to install (e.g. because of a TSYNC race that's lost, or if the thread is
killed, etc.). fput() doesn't actually release the fd, it just ads it to a
work queue. Then the thread proceeds to free the filter, even though the
listener struct file has a reference to it.

To fix this, on the failure path let's set the private data to null, so we
know in ->release() to ignore the filter.

Reported-by: syzbot+981c26489b2d1c6316ba@syzkaller.appspotmail.com
Fixes: 6a21cc50f0 ("seccomp: add a return code to trap to userspace")
Signed-off-by: Tycho Andersen <tycho@tycho.ws>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <james.morris@microsoft.com>
2019-01-15 09:43:12 -08:00
Micah Morton
c1a85a00ea LSM: generalize flag passing to security_capable
This patch provides a general mechanism for passing flags to the
security_capable LSM hook. It replaces the specific 'audit' flag that is
used to tell security_capable whether it should log an audit message for
the given capability check. The reason for generalizing this flag
passing is so we can add an additional flag that signifies whether
security_capable is being called by a setid syscall (which is needed by
the proposed SafeSetID LSM).

Signed-off-by: Micah Morton <mortonm@chromium.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <james.morris@microsoft.com>
2019-01-10 14:16:06 -08:00
Tycho Andersen
319deec7db seccomp: fix poor type promotion
sparse complains,

kernel/seccomp.c:1172:13: warning: incorrect type in assignment (different base types)
kernel/seccomp.c:1172:13:    expected restricted __poll_t [usertype] ret
kernel/seccomp.c:1172:13:    got int
kernel/seccomp.c:1173:13: warning: restricted __poll_t degrades to integer

Instead of assigning this to ret, since we don't use this anywhere, let's
just test it against 0 directly.

Signed-off-by: Tycho Andersen <tycho@tycho.ws>
Reported-by: 0day robot <lkp@intel.com>
Fixes: 6a21cc50f0 ("seccomp: add a return code to trap to userspace")
Signed-off-by: Kees Cook <keescook@chromium.org>
2018-12-13 16:49:01 -08:00
Tycho Andersen
6a21cc50f0 seccomp: add a return code to trap to userspace
This patch introduces a means for syscalls matched in seccomp to notify
some other task that a particular filter has been triggered.

The motivation for this is primarily for use with containers. For example,
if a container does an init_module(), we obviously don't want to load this
untrusted code, which may be compiled for the wrong version of the kernel
anyway. Instead, we could parse the module image, figure out which module
the container is trying to load and load it on the host.

As another example, containers cannot mount() in general since various
filesystems assume a trusted image. However, if an orchestrator knows that
e.g. a particular block device has not been exposed to a container for
writing, it want to allow the container to mount that block device (that
is, handle the mount for it).

This patch adds functionality that is already possible via at least two
other means that I know about, both of which involve ptrace(): first, one
could ptrace attach, and then iterate through syscalls via PTRACE_SYSCALL.
Unfortunately this is slow, so a faster version would be to install a
filter that does SECCOMP_RET_TRACE, which triggers a PTRACE_EVENT_SECCOMP.
Since ptrace allows only one tracer, if the container runtime is that
tracer, users inside the container (or outside) trying to debug it will not
be able to use ptrace, which is annoying. It also means that older
distributions based on Upstart cannot boot inside containers using ptrace,
since upstart itself uses ptrace to monitor services while starting.

The actual implementation of this is fairly small, although getting the
synchronization right was/is slightly complex.

Finally, it's worth noting that the classic seccomp TOCTOU of reading
memory data from the task still applies here, but can be avoided with
careful design of the userspace handler: if the userspace handler reads all
of the task memory that is necessary before applying its security policy,
the tracee's subsequent memory edits will not be read by the tracer.

Signed-off-by: Tycho Andersen <tycho@tycho.ws>
CC: Kees Cook <keescook@chromium.org>
CC: Andy Lutomirski <luto@amacapital.net>
CC: Oleg Nesterov <oleg@redhat.com>
CC: Eric W. Biederman <ebiederm@xmission.com>
CC: "Serge E. Hallyn" <serge@hallyn.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
CC: Christian Brauner <christian@brauner.io>
CC: Tyler Hicks <tyhicks@canonical.com>
CC: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
Signed-off-by: Kees Cook <keescook@chromium.org>
2018-12-11 16:28:41 -08:00
Tycho Andersen
a5662e4d81 seccomp: switch system call argument type to void *
The const qualifier causes problems for any code that wants to write to the
third argument of the seccomp syscall, as we will do in a future patch in
this series.

The third argument to the seccomp syscall is documented as void *, so
rather than just dropping the const, let's switch everything to use void *
as well.

I believe this is safe because of 1. the documentation above, 2. there's no
real type information exported about syscalls anywhere besides the man
pages.

Signed-off-by: Tycho Andersen <tycho@tycho.ws>
CC: Kees Cook <keescook@chromium.org>
CC: Andy Lutomirski <luto@amacapital.net>
CC: Oleg Nesterov <oleg@redhat.com>
CC: Eric W. Biederman <ebiederm@xmission.com>
CC: "Serge E. Hallyn" <serge@hallyn.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
CC: Christian Brauner <christian@brauner.io>
CC: Tyler Hicks <tyhicks@canonical.com>
CC: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
Signed-off-by: Kees Cook <keescook@chromium.org>
2018-12-11 16:28:41 -08:00