Commit Graph

41626 Commits

Author SHA1 Message Date
Kan Liang
023036e389 perf/core: Fix the same task check in perf_event_set_output
[ Upstream commit 24d3ae2f37d8bc3c14b31d353c5d27baf582b6a6 ]

The same task check in perf_event_set_output has some potential issues
for some usages.

With the current perf code, there is a problem when perf_event_open() is
used to direct multiple samples into the same mmap’d memory while the
events are attached to the same process:
https://lore.kernel.org/all/92645262-D319-4068-9C44-2409EF44888E@gmail.com/
This happens because event->ctx is not ready when
perf_event_set_output() is invoked from perf_event_open().

Besides the above issue, before the commit bd2756811766 ("perf: Rewrite
core context handling"), perf record could error out when sampling with
a hardware event and a software event, as below.
 $ perf record -e cycles,dummy --per-thread ls
 failed to mmap with 22 (Invalid argument)
That's because, prior to that commit, a hardware event and a software
event came from different task contexts.

The problem has likely existed for a long time, since commit c3f00c7027
("perf: Separate find_get_context() from event initialization").

The task struct is stored in the event->hw.target for each per-thread
event. It is a more reliable way to determine whether two events are
attached to the same task.
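
The check described above can be sketched in user-space C as follows. This
is only an illustration of the idea, not the kernel patch itself: the
struct layouts and the same_task_ok() helper are hypothetical stand-ins,
and only the comparison of the hw.target pointers is taken from the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct task { int pid; };

struct hw_event { struct task *target; };	/* models event->hw */

struct event {
	int cpu;			/* -1 models a per-thread event */
	struct hw_event hw;
};

/*
 * Return true when redirecting 'event' into the buffer of 'output_event'
 * passes the same-task rule. A per-CPU output buffer is not subject to
 * the rule; a per-thread one requires both events to target the same task.
 */
static bool same_task_ok(const struct event *event,
			 const struct event *output_event)
{
	assert(event != NULL && output_event != NULL);

	if (output_event->cpu != -1)
		return true;

	/* Compare the stored task pointers instead of the event contexts. */
	return output_event->hw.target == event->hw.target;
}
```

Because the target pointer is stored in the event itself, the comparison
does not depend on event->ctx having been set up yet.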

The event->hw.target was also introduced several years ago by the
commit 50f16a8bf9 ("perf: Remove type specific target pointers"). It
can be used not only to fix the issue in the current code, but also
to backport the fix to older kernels.

Note: The event->hw.target was introduced later than commit
c3f00c7027. The patch may not apply to kernels between commit
c3f00c7027 and commit 50f16a8bf9. Anybody who wants to backport
this to a kernel from that period may have to find another solution.

Fixes: c3f00c7027 ("perf: Separate find_get_context() from event initialization")
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lkml.kernel.org/r/20230322202449.512091-1-kan.liang@linux.intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-04-13 16:55:32 +02:00
Greg Kroah-Hartman
a0f3313ef9 This is the 6.1.23 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmQumsIACgkQONu9yGCS
 aT4yfBAAwaDPXomEa+DY6pkQEE7WPVtIkeO+sQIo7bWHunTDilTLRFeDUJ4THydT
 CnhhlGsBUt8KGeWgSR6hHeTl/c+b+AcBan5k5BBufUGrsDn/XV8QIEyKWhbLIEja
 qWehpogs7BJLg2dFRqTfHQEOhLht1jCmC99tfEozEG4zRudmdS3Z2DbRypfEHshc
 oGOC1Jzg4MLPfB+lCwKNrVMBlR2n/73P7mTUCu/Dc9+DUbm+GtqvsPuGT2LxVyY7
 kkNgGzvdxQQCqtK5X6zyoU61gepsobf6c6kHjBucn8mhaYURT5ndfV9VqLWkDYE7
 71iH0oY5fg2NgbMtQpbA10MokjijFp46I4QxzG/RVl2ZN2pbCFNm5aNIBCwBbF2k
 lN6hwJc1nbTi696o29o1osm+yju3347HCAWC8s+DAszXiquihiUeJBwuCfa1c+Gy
 GhdATa3nNQ/8D0gWULr/kl7DvlgpSpYrbEQGVG2gH6tdsAZt2iKYUtGLFjvDN+fw
 CoMpq2OZTX5afM7AxTX00f5lGmbXhD+T9a+pS9AXhPqKcGv1tt0Gso8dn7cpWpj5
 LxhIE9dK5F1/tI+wPE+8t80CukqQHfoCQ24YO8mfUKmlInwjGd1Hque+ihKJo7ZW
 W5CXlZJJVvpVk9BxMNaYHKfSE+U6G7hYabEAzJXR3fz9vGfoTII=
 =rz/i
 -----END PGP SIGNATURE-----

Merge 6.1.23 into android14-6.1

Changes in 6.1.23
	thunderbolt: Limit USB3 bandwidth of certain Intel USB4 host routers
	cifs: update ip_addr for ses only for primary chan setup
	cifs: prevent data race in cifs_reconnect_tcon()
	cifs: avoid race conditions with parallel reconnects
	zonefs: Reorganize code
	zonefs: Simplify IO error handling
	zonefs: Reduce struct zonefs_inode_info size
	zonefs: Separate zone information from inode information
	zonefs: Fix error message in zonefs_file_dio_append()
	fsverity: don't drop pagecache at end of FS_IOC_ENABLE_VERITY
	kernel: kcsan: kcsan_test: build without structleak plugin
	kcsan: avoid passing -g for test
	btrfs: rename BTRFS_FS_NO_OVERCOMMIT to BTRFS_FS_ACTIVE_ZONE_TRACKING
	btrfs: zoned: count fresh BG region as zone unusable
	net: ethernet: ti: am65-cpsw/cpts: Fix CPTS release action
	riscv: ftrace: Fixup panic by disabling preemption
	ARM: dts: aspeed: p10bmc: Update battery node name
	drm/msm/dpu: Refactor sc7280_pp location
	drm/msm/dpu: correct sm8250 and sm8350 scaler
	drm/msm/disp/dpu: fix sc7280_pp base offset
	tty: serial: fsl_lpuart: switch to new dmaengine_terminate_* API
	tty: serial: fsl_lpuart: fix race on RX DMA shutdown
	tracing: Add .percent suffix option to histogram values
	tracing: Add .graph suffix option to histogram value
	tracing: Do not let histogram values have some modifiers
	net: mscc: ocelot: fix stats region batching
	arm64: efi: Set NX compat flag in PE/COFF header
	cifs: fix missing unload_nls() in smb2_reconnect()
	xfrm: Zero padding when dumping algos and encap
	ASoC: codecs: tx-macro: Fix for KASAN: slab-out-of-bounds
	ASoC: Intel: avs: max98357a: Explicitly define codec format
	ASoC: Intel: avs: da7219: Explicitly define codec format
	ASoC: Intel: avs: ssm4567: Remove nau8825 bits
	ASoC: Intel: avs: nau8825: Adjust clock control
	zstd: Fix definition of assert()
	ACPI: video: Add backlight=native DMI quirk for Dell Vostro 15 3535
	ASoC: SOF: ipc3: Check for upper size limit for the received message
	ASoC: SOF: ipc4-topology: Fix incorrect sample rate print unit
	ASoC: SOF: Intel: pci-tng: revert invalid bar size setting
	ASoC: SOF: IPC4: update gain ipc msg definition to align with fw
	md: avoid signed overflow in slot_store()
	x86/PVH: obtain VGA console info in Dom0
	drm/amdkfd: Fix BO offset for multi-VMA page migration
	drm/amdkfd: fix a potential double free in pqm_create_queue
	drm/amdkfd: fix potential kgd_mem UAFs
	net: hsr: Don't log netdev_err message on unknown prp dst node
	ALSA: asihpi: check pao in control_message()
	ALSA: hda/ca0132: fixup buffer overrun at tuning_ctl_set()
	fbdev: tgafb: Fix potential divide by zero
	ACPI: tools: pfrut: Check if the input of level and type is in the right numeric range
	sched_getaffinity: don't assume 'cpumask_size()' is fully initialized
	nvme-pci: add NVME_QUIRK_BOGUS_NID for Lexar NM620
	drm/amdkfd: Fixed kfd_process cleanup on module exit.
	net/mlx5e: Lower maximum allowed MTU in XSK to match XDP prerequisites
	fbdev: nvidia: Fix potential divide by zero
	fbdev: intelfb: Fix potential divide by zero
	fbdev: lxfb: Fix potential divide by zero
	fbdev: au1200fb: Fix potential divide by zero
	tools/power turbostat: Fix /dev/cpu_dma_latency warnings
	tools/power turbostat: fix decoding of HWP_STATUS
	tracing: Fix wrong return in kprobe_event_gen_test.c
	btrfs: fix uninitialized variable warning in btrfs_update_block_group
	btrfs: use temporary variable for space_info in btrfs_update_block_group
	mtd: rawnand: meson: initialize struct with zeroes
	mtd: nand: mxic-ecc: Fix mxic_ecc_data_xfer_wait_for_completion() when irq is used
	ca8210: Fix unsigned mac_len comparison with zero in ca8210_skb_tx()
	riscv/kvm: Fix VM hang in case of timer delta being zero.
	mips: bmips: BCM6358: disable RAC flush for TP1
	ALSA: usb-audio: Fix recursive locking at XRUN during syncing
	PCI: dwc: Fix PORT_LINK_CONTROL update when CDM check enabled
	platform/x86: think-lmi: add missing type attribute
	platform/x86: think-lmi: use correct possible_values delimiters
	platform/x86: think-lmi: only display possible_values if available
	platform/x86: think-lmi: Add possible_values for ThinkStation
	platform/surface: aggregator: Add missing fwnode_handle_put()
	mtd: rawnand: meson: invalidate cache on polling ECC bit
	SUNRPC: fix shutdown of NFS TCP client socket
	sfc: ef10: don't overwrite offload features at NIC reset
	scsi: megaraid_sas: Fix crash after a double completion
	scsi: mpt3sas: Don't print sense pool info twice
	net: dsa: realtek: fix out-of-bounds access
	ptp_qoriq: fix memory leak in probe()
	net: dsa: microchip: ksz8: fix ksz8_fdb_dump()
	net: dsa: microchip: ksz8: fix ksz8_fdb_dump() to extract all 1024 entries
	net: dsa: microchip: ksz8: fix offset for the timestamp filed
	net: dsa: microchip: ksz8: ksz8_fdb_dump: avoid extracting ghost entry from empty dynamic MAC table.
	net: dsa: microchip: ksz8863_smi: fix bulk access
	net: dsa: microchip: ksz8: fix MDB configuration with non-zero VID
	r8169: fix RTL8168H and RTL8107E rx crc error
	regulator: Handle deferred clk
	net/net_failover: fix txq exceeding warning
	net: stmmac: don't reject VLANs when IFF_PROMISC is set
	drm/i915/tc: Fix the ICL PHY ownership check in TC-cold state
	platform/x86/intel/pmc: Alder Lake PCH slp_s0_residency fix
	can: bcm: bcm_tx_setup(): fix KMSAN uninit-value in vfs_write
	s390/vfio-ap: fix memory leak in vfio_ap device driver
	ACPI: bus: Rework system-level device notification handling
	loop: LOOP_CONFIGURE: send uevents for partitions
	net: mvpp2: classifier flow fix fragmentation flags
	net: mvpp2: parser fix QinQ
	net: mvpp2: parser fix PPPoE
	smsc911x: avoid PHY being resumed when interface is not up
	ice: Fix ice_cfg_rdma_fltr() to only update relevant fields
	ice: add profile conflict check for AVF FDIR
	ice: fix invalid check for empty list in ice_sched_assoc_vsi_to_agg()
	ALSA: ymfpci: Create card with device-managed snd_devm_card_new()
	ALSA: ymfpci: Fix BUG_ON in probe function
	net: ipa: compute DMA pool size properly
	i40e: fix registers dump after run ethtool adapter self test
	bnxt_en: Fix reporting of test result in ethtool selftest
	bnxt_en: Fix typo in PCI id to device description string mapping
	bnxt_en: Add missing 200G link speed reporting
	net: dsa: mv88e6xxx: Enable IGMP snooping on user ports only
	net: ethernet: mtk_eth_soc: fix flow block refcounting logic
	net: ethernet: mtk_eth_soc: add missing ppe cache flush when deleting a flow
	pinctrl: ocelot: Fix alt mode for ocelot
	Input: xpad - fix incorrectly applied patch for MAP_PROFILE_BUTTON
	iommu/vt-d: Allow zero SAGAW if second-stage not supported
	Input: i8042 - add TUXEDO devices to i8042 quirk tables for partial fix
	Input: alps - fix compatibility with -funsigned-char
	Input: focaltech - use explicitly signed char type
	cifs: prevent infinite recursion in CIFSGetDFSRefer()
	cifs: fix DFS traversal oops without CONFIG_CIFS_DFS_UPCALL
	Input: i8042 - add quirk for Fujitsu Lifebook A574/H
	Input: goodix - add Lenovo Yoga Book X90F to nine_bytes_report DMI table
	btrfs: fix deadlock when aborting transaction during relocation with scrub
	btrfs: fix race between quota disable and quota assign ioctls
	btrfs: scan device in non-exclusive mode
	zonefs: Do not propagate iomap_dio_rw() ENOTBLK error to user space
	block/io_uring: pass in issue_flags for uring_cmd task_work handling
	io_uring/poll: clear single/double poll flags on poll arming
	io_uring/rsrc: fix rogue rsrc node grabbing
	io_uring: fix poll/netmsg alloc caches
	vmxnet3: use gro callback when UPT is enabled
	zonefs: Always invalidate last cached page on append write
	dm: fix __send_duplicate_bios() to always allow for splitting IO
	can: j1939: prevent deadlock by moving j1939_sk_errqueue()
	xen/netback: don't do grant copy across page boundary
	net: phy: dp83869: fix default value for tx-/rx-internal-delay
	modpost: Fix processing of CRCs on 32-bit build machines
	pinctrl: amd: Disable and mask interrupts on resume
	pinctrl: at91-pio4: fix domain name assignment
	platform/x86: ideapad-laptop: Stop sending KEY_TOUCHPAD_TOGGLE
	powerpc: Don't try to copy PPR for task with NULL pt_regs
	powerpc/pseries/vas: Ignore VAS update for DLPAR if copy/paste is not enabled
	powerpc/64s: Fix __pte_needs_flush() false positive warning
	NFSv4: Fix hangs when recovering open state after a server reboot
	ALSA: hda/conexant: Partial revert of a quirk for Lenovo
	ALSA: usb-audio: Fix regression on detection of Roland VS-100
	ALSA: hda/realtek: Add quirks for some Clevo laptops
	ALSA: hda/realtek: Add quirk for Lenovo ZhaoYang CF4620Z
	xtensa: fix KASAN report for show_stack
	rcu: Fix rcu_torture_read ftrace event
	dt-bindings: mtd: jedec,spi-nor: Document CPOL/CPHA support
	s390/uaccess: add missing earlyclobber annotations to __clear_user()
	s390: reintroduce expoline dependence to scripts
	drm/etnaviv: fix reference leak when mmaping imported buffer
	drm/amdgpu: allow more APUs to do mode2 reset when go to S4
	drm/amd/display: Add DSC Support for Synaptics Cascaded MST Hub
	drm/amd/display: Take FEC Overhead into Timeslot Calculation
	drm/i915/gem: Flush lmem contents after construction
	drm/i915/dpt: Treat the DPT BO as a framebuffer
	drm/i915: Disable DC states for all commits
	drm/i915: Move CSC load back into .color_commit_arm() when PSR is enabled on skl/glk
	KVM: arm64: PMU: Fix GET_ONE_REG for vPMC regs to return the current value
	KVM: arm64: Disable interrupts while walking userspace PTs
	net: dsa: mv88e6xxx: read FID when handling ATU violations
	net: dsa: mv88e6xxx: replace ATU violation prints with trace points
	net: dsa: mv88e6xxx: replace VTU violation prints with trace points
	selftests/bpf: Test btf dump for struct with padding only fields
	libbpf: Fix BTF-to-C converter's padding logic
	selftests/bpf: Add few corner cases to test padding handling of btf_dump
	libbpf: Fix btf_dump's packed struct determination
	usb: ucsi: Fix ucsi->connector race
	drm/amdkfd: Get prange->offset after svm_range_vram_node_new
	hsr: ratelimit only when errors are printed
	x86/PVH: avoid 32-bit build warning when obtaining VGA console info
	Revert "cpuidle, intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE *again*"
	Linux 6.1.23

Change-Id: I15af3697170567c4678bcc9c2380d80e7cef5bc9
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-04-06 14:14:07 +00:00
Anton Gusev
089d656992 tracing: Fix wrong return in kprobe_event_gen_test.c
[ Upstream commit bc4f359b3b607daac0290d0038561237a86b38cb ]

Overwriting the error code with the deletion result may cause the
function to return 0 despite encountering an error. Commit b111545d26
("tracing: Remove the useless value assignment in
test_create_synth_event()") solves a similar issue by
returning the original error code, so this patch does the same.
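
The pattern can be illustrated with a small, self-contained C sketch. The
helpers enable_event() and delete_event() are hypothetical stand-ins, not
functions from kprobe_event_gen_test.c; the point is only how the cleanup
path treats the saved error code.

```c
#include <errno.h>

/* Hypothetical stand-ins for a step that may fail and for its cleanup. */
static int enable_event(int fail) { return fail ? -EINVAL : 0; }
static int delete_event(void)     { return 0; }	/* cleanup succeeds */

/* Buggy shape: the cleanup result overwrites the original error. */
static int setup_buggy(int fail)
{
	int ret = enable_event(fail);

	if (ret)
		goto delete;
	return 0;
delete:
	ret = delete_event();	/* ret is now 0, the error is lost */
	return ret;
}

/* Fixed shape: clean up, but return the original error code. */
static int setup_fixed(int fail)
{
	int ret = enable_event(fail);

	if (ret)
		goto delete;
	return 0;
delete:
	delete_event();		/* best-effort cleanup, ret is kept */
	return ret;
}
```

With a failing step, setup_buggy() reports success while setup_fixed()
returns the error to the caller.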

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Link: https://lore.kernel.org/linux-trace-kernel/20230131075818.5322-1-aagusev@ispras.ru

Signed-off-by: Anton Gusev <aagusev@ispras.ru>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-04-06 12:10:41 +02:00
Linus Torvalds
c8943cf3ab sched_getaffinity: don't assume 'cpumask_size()' is fully initialized
[ Upstream commit 6015b1aca1a233379625385feb01dd014aca60b5 ]

The getaffinity() system call uses 'cpumask_size()' to decide how big
the CPU mask is - so far so good.  It is indeed the allocation size of a
cpumask.

But the code also assumes that the whole allocation is initialized
without actually doing so itself.  That's wrong, because we might have
fixed-size allocations (making copying and clearing more efficient), but
not all of it is then necessarily used if 'nr_cpu_ids' is smaller.

Having checked other users of 'cpumask_size()', they all seem to be ok,
either using it purely for the allocation size, or explicitly zeroing
the cpumask before using the size in bytes to copy it.

See for example the ublk_ctrl_get_queue_affinity() function that uses
the proper 'zalloc_cpumask_var()' to make sure that the whole mask is
cleared, whether the storage is on the stack or if it was an external
allocation.

Fix this by just zeroing the allocation before using it.  Do the same
for the compat version of sched_getaffinity(), which had the same logic.

Also, for consistency, make sched_getaffinity() use 'cpumask_bits()' to
access the bits.  For a cpumask_var_t, it ends up being a pointer to the
same data either way, but it's just a good idea to treat it like you
would a 'cpumask_t'.  The compat case already did that.
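
As a rough user-space model of the fix (not the kernel code): assume a
fixed allocation of MASK_BYTES bytes, of which only the first nr_cpus bits
are ever written. MASK_BYTES and get_affinity() are names made up for this
sketch. Zeroing the allocation up front means that copying the full size
never exposes uninitialized bytes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MASK_BYTES 64	/* hypothetical fixed size, playing the role of cpumask_size() */

/*
 * Fill 'out' (MASK_BYTES long) with a mask whose first 'nr_cpus' bits are
 * set. Returns 0 on success, -1 if the allocation fails.
 */
static int get_affinity(unsigned char *out, unsigned int nr_cpus)
{
	unsigned char *mask;
	unsigned int cpu;

	assert(out != NULL);

	/* calloc() zeroes the whole allocation, not just the used part. */
	mask = calloc(1, MASK_BYTES);
	if (!mask)
		return -1;

	for (cpu = 0; cpu < nr_cpus && cpu < MASK_BYTES * 8; cpu++)
		mask[cpu / 8] |= (unsigned char)(1u << (cpu % 8));

	/* The copy uses the allocation size, so every byte must be defined. */
	memcpy(out, mask, MASK_BYTES);
	free(mask);
	return 0;
}
```

Had malloc() been used instead of calloc(), the bytes beyond the used bits
would have been indeterminate at the time of the copy.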

Reported-by: Ryan Roberts <ryan.roberts@arm.com>
Link: https://lore.kernel.org/lkml/7d026744-6bd6-6827-0471-b5e8eae0be3f@arm.com/
Cc: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-04-06 12:10:40 +02:00
Steven Rostedt (Google)
39cd75f2f3 tracing: Do not let histogram values have some modifiers
[ Upstream commit e0213434fe3e4a0d118923dc98d31e7ff1cd9e45 ]

Histogram values cannot be strings, stacktraces, graphs, symbols,
syscalls, or grouped in buckets or log. Return an error if a value is set
to do so.

Note, the histogram code was not prepared to handle these modifiers for
histograms and caused a bug.
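
A validation of this kind could look roughly like the sketch below. The
flag names and values are hypothetical and do not match the kernel's
HIST_FIELD_FL_* definitions; the rejected set simply mirrors the list in
the text above.

```c
#include <errno.h>

/* Hypothetical modifier flags; the real kernel names and values differ. */
enum {
	FL_STRING     = 1u << 0,
	FL_STACKTRACE = 1u << 1,
	FL_GRAPH      = 1u << 2,
	FL_SYM        = 1u << 3,
	FL_SYSCALL    = 1u << 4,
	FL_BUCKET     = 1u << 5,
	FL_LOG2       = 1u << 6,
	FL_OTHER      = 1u << 7,	/* any modifier not in the list above */
};

/* Modifiers that the text above says a histogram value must not carry. */
#define FL_INVALID_FOR_VALUE \
	(FL_STRING | FL_STACKTRACE | FL_GRAPH | FL_SYM | \
	 FL_SYSCALL | FL_BUCKET | FL_LOG2)

/* Return 0 if the flags are acceptable for a value, -EINVAL otherwise. */
static int check_value_flags(unsigned int flags)
{
	return (flags & FL_INVALID_FOR_VALUE) ? -EINVAL : 0;
}
```

Rejecting the trigger when it is written means the print path never sees a
value field it was not prepared to handle.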

Mark Rutland reported:

 # echo 'p:copy_to_user __arch_copy_to_user n=$arg2' >> /sys/kernel/tracing/kprobe_events
 # echo 'hist:keys=n:vals=hitcount.buckets=8:sort=hitcount' > /sys/kernel/tracing/events/kprobes/copy_to_user/trigger
 # cat /sys/kernel/tracing/events/kprobes/copy_to_user/hist
[  143.694628] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[  143.695190] Mem abort info:
[  143.695362]   ESR = 0x0000000096000004
[  143.695604]   EC = 0x25: DABT (current EL), IL = 32 bits
[  143.695889]   SET = 0, FnV = 0
[  143.696077]   EA = 0, S1PTW = 0
[  143.696302]   FSC = 0x04: level 0 translation fault
[  143.702381] Data abort info:
[  143.702614]   ISV = 0, ISS = 0x00000004
[  143.702832]   CM = 0, WnR = 0
[  143.703087] user pgtable: 4k pages, 48-bit VAs, pgdp=00000000448f9000
[  143.703407] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
[  143.704137] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
[  143.704714] Modules linked in:
[  143.705273] CPU: 0 PID: 133 Comm: cat Not tainted 6.2.0-00003-g6fc512c10a7c #3
[  143.706138] Hardware name: linux,dummy-virt (DT)
[  143.706723] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  143.707120] pc : hist_field_name.part.0+0x14/0x140
[  143.707504] lr : hist_field_name.part.0+0x104/0x140
[  143.707774] sp : ffff800008333a30
[  143.707952] x29: ffff800008333a30 x28: 0000000000000001 x27: 0000000000400cc0
[  143.708429] x26: ffffd7a653b20260 x25: 0000000000000000 x24: ffff10d303ee5800
[  143.708776] x23: ffffd7a6539b27b0 x22: ffff10d303fb8c00 x21: 0000000000000001
[  143.709127] x20: ffff10d303ec2000 x19: 0000000000000000 x18: 0000000000000000
[  143.709478] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[  143.709824] x14: 0000000000000000 x13: 203a6f666e692072 x12: 6567676972742023
[  143.710179] x11: 0a230a6d6172676f x10: 000000000000002c x9 : ffffd7a6521e018c
[  143.710584] x8 : 000000000000002c x7 : 7f7f7f7f7f7f7f7f x6 : 000000000000002c
[  143.710915] x5 : ffff10d303b0103e x4 : ffffd7a653b20261 x3 : 000000000000003d
[  143.711239] x2 : 0000000000020001 x1 : 0000000000000001 x0 : 0000000000000000
[  143.711746] Call trace:
[  143.712115]  hist_field_name.part.0+0x14/0x140
[  143.712642]  hist_field_name.part.0+0x104/0x140
[  143.712925]  hist_field_print+0x28/0x140
[  143.713125]  event_hist_trigger_print+0x174/0x4d0
[  143.713348]  hist_show+0xf8/0x980
[  143.713521]  seq_read_iter+0x1bc/0x4b0
[  143.713711]  seq_read+0x8c/0xc4
[  143.713876]  vfs_read+0xc8/0x2a4
[  143.714043]  ksys_read+0x70/0xfc
[  143.714218]  __arm64_sys_read+0x24/0x30
[  143.714400]  invoke_syscall+0x50/0x120
[  143.714587]  el0_svc_common.constprop.0+0x4c/0x100
[  143.714807]  do_el0_svc+0x44/0xd0
[  143.714970]  el0_svc+0x2c/0x84
[  143.715134]  el0t_64_sync_handler+0xbc/0x140
[  143.715334]  el0t_64_sync+0x190/0x194
[  143.715742] Code: a9bd7bfd 910003fd a90153f3 aa0003f3 (f9400000)
[  143.716510] ---[ end trace 0000000000000000 ]---
Segmentation fault

Link: https://lkml.kernel.org/r/20230302020810.559462599@goodmis.org

Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Fixes: c6afad49d1 ("tracing: Add hist trigger 'sym' and 'sym-offset' modifiers")
Reported-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-04-06 12:10:36 +02:00
Masami Hiramatsu (Google)
8ebeea1052 tracing: Add .graph suffix option to histogram value
[ Upstream commit a2c54256dec7510477e2b4f4db187e638f7cac37 ]

Add the .graph suffix which shows the bar graph of the histogram value.

For example, the following shows a bar graph of the runtime histogram
for each task.

------
  # cd /sys/kernel/debug/tracing/
  # echo hist:keys=pid:vals=runtime.graph:sort=pid > \
   events/sched/sched_stat_runtime/trigger
  # sleep 10
  # cat events/sched/sched_stat_runtime/hist
 # event histogram
 #
 # trigger info: hist:keys=pid:vals=hitcount,runtime.graph:sort=pid:size=2048 [active]
 #

 { pid:         14 } hitcount:          2  runtime:
 { pid:         16 } hitcount:          8  runtime:
 { pid:         26 } hitcount:          1  runtime:
 { pid:         57 } hitcount:          3  runtime:
 { pid:         61 } hitcount:         20  runtime: ###
 { pid:         66 } hitcount:          2  runtime:
 { pid:         70 } hitcount:          3  runtime:
 { pid:         72 } hitcount:          2  runtime:
 { pid:        145 } hitcount:         14  runtime: ####################
 { pid:        152 } hitcount:          5  runtime: #######
 { pid:        153 } hitcount:          2  runtime: ####

 Totals:
     Hits: 62
     Entries: 11
     Dropped: 0
-------
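
One way to produce such bars is to scale each value against the largest
one. The sketch below assumes a maximum bar width of 20 characters, which
matches the longest bar in the output above; BAR_WIDTH and fill_bar() are
names invented for this illustration and are not taken from the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BAR_WIDTH 20	/* assumed maximum bar length */

/*
 * Write a NUL-terminated '#' bar for 'val' scaled against 'max' into 'buf',
 * which must hold at least BAR_WIDTH + 1 bytes. Returns the bar length.
 * Note: val * BAR_WIDTH can overflow for extremely large values; a real
 * implementation would have to guard against that.
 */
static size_t fill_bar(char *buf, unsigned long long val,
		       unsigned long long max)
{
	size_t len = 0;

	assert(buf != NULL);

	if (max != 0) {
		if (val > max)
			val = max;
		len = (size_t)(val * BAR_WIDTH / max);
	}

	memset(buf, '#', len);
	buf[len] = '\0';
	return len;
}
```

Under this scaling, the entry with the largest value gets the full 20
characters, and small values round down to an empty bar.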

Link: https://lore.kernel.org/linux-trace-kernel/166610813953.56030.10944148382315789485.stgit@devnote2

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Tested-by: Tom Zanussi <zanussi@kernel.org>
Stable-dep-of: e0213434fe3e ("tracing: Do not let histogram values have some modifiers")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-04-06 12:10:36 +02:00
Masami Hiramatsu (Google)
93454d1a30 tracing: Add .percent suffix option to histogram values
[ Upstream commit abaa5258ce5e5887a9de049f50a85dc023391a1c ]

Add .percent suffix option to show the histogram values in percentage.
This feature is useful when we need to understand the overall trend
for the histograms of large values.
E.g. this shows the runtime percentage for each task.

------
  # cd /sys/kernel/debug/tracing/
  # echo hist:keys=pid:vals=hitcount,runtime.percent:sort=pid > \
    events/sched/sched_stat_runtime/trigger
  # sleep 10
  # cat events/sched/sched_stat_runtime/hist
 # event histogram
 #
 # trigger info: hist:keys=pid:vals=hitcount,runtime.percent:sort=pid:size=2048 [active]
 #

 { pid:          8 } hitcount:          7  runtime (%):   4.14
 { pid:         14 } hitcount:          5  runtime (%):   3.69
 { pid:         16 } hitcount:         11  runtime (%):   3.41
 { pid:         61 } hitcount:         41  runtime (%):  19.75
 { pid:         65 } hitcount:          4  runtime (%):   1.48
 { pid:         70 } hitcount:          6  runtime (%):   3.60
 { pid:         72 } hitcount:          2  runtime (%):   1.10
 { pid:        144 } hitcount:         10  runtime (%):  32.01
 { pid:        151 } hitcount:          8  runtime (%):  22.66
 { pid:        152 } hitcount:          2  runtime (%):   8.10

 Totals:
     Hits: 96
     Entries: 10
     Dropped: 0
-----
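
A percentage with two decimals, as in the output above, can be computed
with integer arithmetic only. The following is a minimal sketch under that
assumption; format_percent() is a hypothetical helper, not the kernel's
implementation.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format 'val' as a percentage of 'total' with two decimals into 'buf'.
 * Returns the snprintf() result, or -1 when 'total' is zero.
 * Note: val * 10000 can overflow for very large values; a real
 * implementation would need to handle that case.
 */
static int format_percent(char *buf, size_t size,
			  unsigned long long val, unsigned long long total)
{
	unsigned long long pct;

	assert(buf != NULL);

	if (total == 0)
		return -1;

	/* pct is in hundredths of a percent, truncated toward zero. */
	pct = val * 10000ULL / total;

	return snprintf(buf, size, "%llu.%02llu", pct / 100, pct % 100);
}
```

Avoiding floating point matters here because kernel code generally cannot
use it; the scaled integer is split into its whole and fractional parts
only when printing.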

Link: https://lore.kernel.org/linux-trace-kernel/166610813077.56030.4238090506973562347.stgit@devnote2

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Tested-by: Tom Zanussi <zanussi@kernel.org>
Stable-dep-of: e0213434fe3e ("tracing: Do not let histogram values have some modifiers")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-04-06 12:10:36 +02:00
Marco Elver
bae092f587 kcsan: avoid passing -g for test
[ Upstream commit 5eb39cde1e2487ba5ec1802dc5e58a77e700d99e ]

Nathan reported that when building with GNU as and a version of clang that
defaults to DWARF5, the assembler will complain with:

  Error: non-constant .uleb128 is not supported

This is because `-g` defaults to the compiler debug info default. If the
assembler does not support some of the directives used, the above errors
occur. To fix, remove the explicit passing of `-g`.

All the test wants is that stack traces print valid function names, and
debug info is not required for that. (I currently cannot recall why I
added the explicit `-g`.)

Link: https://lkml.kernel.org/r/20230316224705.709984-2-elver@google.com
Fixes: 1fe84fd4a4 ("kcsan: Add test suite")
Signed-off-by: Marco Elver <elver@google.com>
Reported-by: Nathan Chancellor <nathan@kernel.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-04-06 12:10:35 +02:00
Anders Roxell
01f3150cc7 kernel: kcsan: kcsan_test: build without structleak plugin
[ Upstream commit 6fcd4267a840d0536b8e5334ad5f31e4105fce85 ]

Building kcsan_test with the structleak plugin enabled makes the stack
frame size grow.

kernel/kcsan/kcsan_test.c:704:1: error: the frame size of 3296 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]

Turn off the structleak plugin checks for kcsan_test.

Link: https://lkml.kernel.org/r/20221128104358.2660634-1-anders.roxell@linaro.org
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Marco Elver <elver@google.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Gow <davidgow@google.com>
Cc: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Stable-dep-of: 5eb39cde1e24 ("kcsan: avoid passing -g for test")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-04-06 12:10:34 +02:00
Greg Kroah-Hartman
7f4246b044 Revert "ANDROID: kernel: Add restricted vendor hook in creds"
This reverts commit 1abc68878a.

The hooks added in that commit:
	android_rvh_commit_creds
	android_rvh_exit_creds
	android_rvh_override_creds
	android_rvh_revert_creds
are not used by anyone, so remove them.

If they are needed in the future, they can be resubmitted for review.

Bug: 181639260
Cc: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
Change-Id: I1b371d6b0be827e39c5163e2ed2134307d81730a
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-03-31 18:25:45 +00:00
Greg Kroah-Hartman
db50ac4d0a Merge 6.1.22 into android14-6.1
Changes in 6.1.22
	interconnect: qcom: osm-l3: fix icc_onecell_data allocation
	interconnect: qcom: sm8450: switch to qcom_icc_rpmh_* function
	interconnect: qcom: qcm2290: Fix MASTER_SNOC_BIMC_NRT
	perf/core: Fix perf_output_begin parameter is incorrectly invoked in perf_event_bpf_output
	perf: fix perf_event_context->time
	tracing/hwlat: Replace sched_setaffinity with set_cpus_allowed_ptr
	drm/amd/display: Include virtual signal to set k1 and k2 values
	drm/amd/display: fix k1 k2 divider programming for phantom streams
	drm/amd/display: Remove OTG DIV register write for Virtual signals.
	mptcp: refactor passive socket initialization
	mptcp: use the workqueue to destroy unaccepted sockets
	mptcp: fix UaF in listener shutdown
	drm/amd/display: Fix DP MST sinks removal issue
	arm64: dts: qcom: sm8450: Mark UFS controller as cache coherent
	power: supply: bq24190: Fix use after free bug in bq24190_remove due to race condition
	power: supply: da9150: Fix use after free bug in da9150_charger_remove due to race condition
	arm64: dts: imx8dxl-evk: Disable hibernation mode of AR8031 for EQOS
	arm64: dts: imx8dxl-evk: Fix eqos phy reset gpio
	ARM: dts: imx6sll: e70k02: fix usbotg1 pinctrl
	ARM: dts: imx6sll: e60k02: fix usbotg1 pinctrl
	ARM: dts: imx6sl: tolino-shine2hd: fix usbotg1 pinctrl
	arm64: dts: imx8mn: specify #sound-dai-cells for SAI nodes
	arm64: dts: imx93: add missing #address-cells and #size-cells to i2c nodes
	NFS: Fix /proc/PID/io read_bytes for buffered reads
	xsk: Add missing overflow check in xdp_umem_reg
	iavf: fix inverted Rx hash condition leading to disabled hash
	iavf: fix non-tunneled IPv6 UDP packet type and hashing
	iavf: do not track VLAN 0 filters
	intel/igbvf: free irq on the error path in igbvf_request_msix()
	igbvf: Regard vf reset nack as success
	igc: fix the validation logic for taprio's gate list
	i2c: imx-lpi2c: check only for enabled interrupt flags
	i2c: mxs: ensure that DMA buffers are safe for DMA
	i2c: hisi: Only use the completion interrupt to finish the transfer
	scsi: scsi_dh_alua: Fix memleak for 'qdata' in alua_activate()
	nfsd: don't replace page in rq_pages if it's a continuation of last page
	net: dsa: b53: mmap: fix device tree support
	net: usb: smsc95xx: Limit packet length to skb->len
	efi/libstub: smbios: Use length member instead of record struct size
	qed/qed_sriov: guard against NULL derefs from qed_iov_get_vf_info
	xirc2ps_cs: Fix use after free bug in xirc2ps_detach
	net: phy: Ensure state transitions are processed from phy_stop()
	net: mdio: fix owner field for mdio buses registered using device-tree
	net: mdio: fix owner field for mdio buses registered using ACPI
	net: stmmac: Fix for mismatched host/device DMA address width
	thermal/drivers/mellanox: Use generic thermal_zone_get_trip() function
	mlxsw: core_thermal: Fix fan speed in maximum cooling state
	drm/i915: Print return value on error
	drm/i915/fbdev: lock the fbdev obj before vma pin
	drm/i915/guc: Rename GuC register state capture node to be more obvious
	drm/i915/guc: Fix missing ecodes
	drm/i915/gt: perform uc late init after probe error injection
	net: qcom/emac: Fix use after free bug in emac_remove due to race condition
	net: usb: lan78xx: Limit packet length to skb->len
	net/ps3_gelic_net: Fix RX sk_buff length
	net/ps3_gelic_net: Use dma_mapping_error
	octeontx2-vf: Add missing free for alloc_percpu
	bootconfig: Fix testcase to increase max node
	keys: Do not cache key in task struct if key is requested from kernel thread
	ice: check if VF exists before mode check
	iavf: fix hang on reboot with ice
	i40e: fix flow director packet filter programming
	bpf: Adjust insufficient default bpf_jit_limit
	net/mlx5e: Set uplink rep as NETNS_LOCAL
	net/mlx5e: Block entering switchdev mode with ns inconsistency
	net/mlx5: Fix steering rules cleanup
	net/mlx5e: Overcome slow response for first macsec ASO WQE
	net/mlx5: Read the TC mapping of all priorities on ETS query
	net/mlx5: E-Switch, Fix an Oops in error handling code
	net: dsa: tag_brcm: legacy: fix daisy-chained switches
	atm: idt77252: fix kmemleak when rmmod idt77252
	erspan: do not use skb_mac_header() in ndo_start_xmit()
	net/sonic: use dma_mapping_error() for error check
	nvme-tcp: fix nvme_tcp_term_pdu to match spec
	mlxsw: spectrum_fid: Fix incorrect local port type
	hvc/xen: prevent concurrent accesses to the shared ring
	ksmbd: add low bound validation to FSCTL_SET_ZERO_DATA
	ksmbd: add low bound validation to FSCTL_QUERY_ALLOCATED_RANGES
	ksmbd: fix possible refcount leak in smb2_open()
	Bluetooth: hci_sync: Resume adv with no RPA when active scan
	Bluetooth: hci_core: Detect if an ACL packet is in fact an ISO packet
	Bluetooth: btusb: Remove detection of ISO packets over bulk
	Bluetooth: ISO: fix timestamped HCI ISO data packet parsing
	Bluetooth: Remove "Power-on" check from Mesh feature
	gve: Cache link_speed value from device
	net: asix: fix modprobe "sysfs: cannot create duplicate filename"
	net: dsa: mt7530: move enabling disabling core clock to mt7530_pll_setup()
	net: dsa: mt7530: move lowering TRGMII driving to mt7530_setup()
	net: dsa: mt7530: move setting ssc_delta to PHY_INTERFACE_MODE_TRGMII case
	net: mdio: thunder: Add missing fwnode_handle_put()
	drm/amd/display: Set dcn32 caps.seamless_odm
	Bluetooth: btqcomsmd: Fix command timeout after setting BD address
	Bluetooth: L2CAP: Fix responding with wrong PDU type
	Bluetooth: btsdio: fix use after free bug in btsdio_remove due to unfinished work
	Bluetooth: mgmt: Fix MGMT add advmon with RSSI command
	Bluetooth: HCI: Fix global-out-of-bounds
	platform/chrome: cros_ec_chardev: fix kernel data leak from ioctl
	entry: Fix noinstr warning in __enter_from_user_mode()
	perf/x86/amd/core: Always clear status for idx
	entry/rcu: Check TIF_RESCHED _after_ delayed RCU wake-up
	hwmon: fix potential sensor registration fail if of_node is missing
	hwmon (it87): Fix voltage scaling for chips with 10.9mV ADCs
	scsi: qla2xxx: Synchronize the IOCB count to be in order
	scsi: qla2xxx: Perform lockless command completion in abort path
	smb3: lower default deferred close timeout to address perf regression
	smb3: fix unusable share after force unmount failure
	uas: Add US_FL_NO_REPORT_OPCODES for JMicron JMS583Gen 2
	thunderbolt: Use scale field when allocating USB3 bandwidth
	thunderbolt: Call tb_check_quirks() after initializing adapters
	thunderbolt: Add quirk to disable CLx
	thunderbolt: Fix memory leak in margining
	thunderbolt: Disable interrupt auto clear for rings
	thunderbolt: Add missing UNSET_INBOUND_SBTX for retimer access
	thunderbolt: Use const qualifier for `ring_interrupt_index`
	thunderbolt: Rename shadowed variables bit to interrupt_bit and auto_clear_bit
	ASoC: amd: yp: Add OMEN by HP Gaming Laptop 16z-n000 to quirks
	ASoC: amd: yc: Add DMI entries to support HP OMEN 16-n0xxx (8A43)
	ACPI: x86: Drop quirk for HP Elitebook
	ACPI: x86: utils: Add Cezanne to the list for forcing StorageD3Enable
	riscv: Bump COMMAND_LINE_SIZE value to 1024
	drm/cirrus: NULL-check pipe->plane.state->fb in cirrus_pipe_update()
	HID: cp2112: Fix driver not registering GPIO IRQ chip as threaded
	ca8210: fix mac_len negative array access
	HID: logitech-hidpp: Add support for Logitech MX Master 3S mouse
	HID: intel-ish-hid: ipc: Fix potential use-after-free in work function
	m68k: mm: Fix systems with memory at end of 32-bit address space
	m68k: Only force 030 bus error if PC not in exception table
	selftests/bpf: check that modifier resolves after pointer
	scsi: target: iscsi: Fix an error message in iscsi_check_key()
	scsi: qla2xxx: Add option to disable FC2 Target support
	scsi: hisi_sas: Check devm_add_action() return value
	scsi: ufs: core: Add soft dependency on governor_simpleondemand
	scsi: lpfc: Check kzalloc() in lpfc_sli4_cgn_params_read()
	scsi: lpfc: Avoid usage of list iterator variable after loop
	scsi: mpi3mr: Driver unload crashes host when enhanced logging is enabled
	scsi: mpi3mr: Wait for diagnostic save during controller init
	scsi: mpi3mr: NVMe command size greater than 8K fails
	scsi: mpi3mr: Bad drive in topology results kernel crash
	scsi: storvsc: Handle BlockSize change in Hyper-V VHD/VHDX file
	platform/x86: int3472: Add GPIOs to Surface Go 3 Board data
	net: usb: cdc_mbim: avoid altsetting toggling for Telit FE990
	net: usb: qmi_wwan: add Telit 0x1080 composition
	drm/amd/display: Update clock table to include highest clock setting
	sh: sanitize the flags on sigreturn
	drm/amdgpu: Fix call trace warning and hang when removing amdgpu device
	drm/amd: Fix initialization mistake for NBIO 7.3.0
	net/sched: act_mirred: better wording on protection against excessive stack growth
	act_mirred: use the backlog for nested calls to mirred ingress
	cifs: lock chan_lock outside match_session
	cifs: append path to open_enter trace event
	cifs: do not poll server interfaces too regularly
	cifs: empty interface list when server doesn't support query interfaces
	cifs: dump pending mids for all channels in DebugData
	cifs: print session id while listing open files
	cifs: fix dentry lookups in directory handle cache
	x86/fpu/xstate: Prevent false-positive warning in __copy_xstate_uabi_buf()
	selftests/x86/amx: Add a ptrace test
	scsi: core: Add BLIST_SKIP_VPD_PAGES for SKhynix H28U74301AMR
	usb: misc: onboard-hub: add support for Microchip USB2517 USB 2.0 hub
	usb: dwc2: drd: fix inconsistent mode if role-switch-default-mode="host"
	usb: dwc2: fix a devres leak in hw_enable upon suspend resume
	usb: gadget: u_audio: don't let userspace block driver unbind
	btrfs: zoned: fix btrfs_can_activate_zone() to support DUP profile
	Bluetooth: Fix race condition in hci_cmd_sync_clear
	efi: sysfb_efi: Fix DMI quirks not working for simpledrm
	mm/slab: Fix undefined init_cache_node_node() for NUMA and !SMP
	fscrypt: destroy keyring after security_sb_delete()
	fsverity: Remove WQ_UNBOUND from fsverity read workqueue
	lockd: set file_lock start and end when decoding nlm4 testargs
	arm64: dts: imx8mm-nitrogen-r2: fix WM8960 clock name
	igb: revert rtnl_lock() that causes deadlock
	dm thin: fix deadlock when swapping to thin device
	usb: typec: tcpm: fix create duplicate source-capabilities file
	usb: typec: tcpm: fix warning when handle discover_identity message
	usb: cdns3: Fix issue with using incorrect PCI device function
	usb: cdnsp: Fixes issue with redundant Status Stage
	usb: cdnsp: changes PCI Device ID to fix conflict with CNDS3 driver
	usb: chipdea: core: fix return -EINVAL if request role is the same with current role
	usb: chipidea: core: fix possible concurrent when switch role
	usb: dwc3: gadget: Add 1ms delay after end transfer command without IOC
	usb: ucsi: Fix NULL pointer deref in ucsi_connector_change()
	usb: ucsi_acpi: Increase the command completion timeout
	mm: kfence: fix using kfence_metadata without initialization in show_object()
	kfence: avoid passing -g for test
	io_uring/net: avoid sending -ECONNABORTED on repeated connection requests
	io_uring/rsrc: fix null-ptr-deref in io_file_bitmap_get()
	Revert "kasan: drop skip_kasan_poison variable in free_pages_prepare"
	test_maple_tree: add more testing for mas_empty_area()
	maple_tree: fix mas_skip_node() end slot detection
	ksmbd: fix wrong signingkey creation when encryption is AES256
	ksmbd: set FILE_NAMED_STREAMS attribute in FS_ATTRIBUTE_INFORMATION
	ksmbd: don't terminate inactive sessions after a few seconds
	ksmbd: return STATUS_NOT_SUPPORTED on unsupported smb2.0 dialect
	ksmbd: return unsupported error on smb1 mount
	wifi: mac80211: fix qos on mesh interfaces
	nilfs2: fix kernel-infoleak in nilfs_ioctl_wrap_copy()
	drm/bridge: lt8912b: return EPROBE_DEFER if bridge is not found
	drm/amd/display: fix wrong index used in dccg32_set_dpstreamclk
	drm/meson: fix missing component unbind on bind errors
	drm/amdgpu/nv: Apply ASPM quirk on Intel ADL + AMD Navi
	drm/i915/active: Fix missing debug object activation
	drm/i915: Preserve crtc_state->inherited during state clearing
	drm/amdgpu: skip ASIC reset for APUs when go to S4
	drm/amdgpu: reposition the gpu reset checking for reuse
	riscv: mm: Fix incorrect ASID argument when flushing TLB
	riscv: Handle zicsr/zifencei issues between clang and binutils
	tee: amdtee: fix race condition in amdtee_open_session
	firmware: arm_scmi: Fix device node validation for mailbox transport
	arm64: dts: qcom: sc7280: Mark PCIe controller as cache coherent
	arm64: dts: qcom: sm8150: Fix the iommu mask used for PCIe controllers
	soc: qcom: llcc: Fix slice configuration values for SC8280XP
	mm/ksm: fix race with VMA iteration and mm_struct teardown
	bus: imx-weim: fix branch condition evaluates to a garbage value
	i2c: xgene-slimpro: Fix out-of-bounds bug in xgene_slimpro_i2c_xfer()
	dm stats: check for and propagate alloc_percpu failure
	dm crypt: add cond_resched() to dmcrypt_write()
	dm crypt: avoid accessing uninitialized tasklet
	sched/fair: sanitize vruntime of entity being placed
	sched/fair: Sanitize vruntime of entity being migrated
	drm/amdkfd: introduce dummy cache info for property asic
	drm/amdkfd: Fix the warning of array-index-out-of-bounds
	drm/amdkfd: add GC 11.0.4 KFD support
	drm/amdkfd: Fix the memory overrun
	Linux 6.1.22

Change-Id: Id13b4655dbfb59c29a0b8953e5e0cda3703f1879
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-03-31 08:15:39 +00:00
Vincent Guittot
388c4c1d12 sched/fair: Sanitize vruntime of entity being migrated
commit a53ce18cacb477dd0513c607f187d16f0fa96f71 upstream.

Commit 829c1651e9c4 ("sched/fair: sanitize vruntime of entity being placed")
fixes an overflow bug, but ignores the case where se->exec_start is reset
after a migration.

To fix this case, we delay the reset of se->exec_start until after
placing the entity, which uses se->exec_start to detect long-sleeping tasks.

In order to take into account a possible divergence between the clock_task
of 2 rqs, we increase the threshold to around 104 days.
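
The "around 104 days" figure can be checked with a little arithmetic. The
sketch below assumes the threshold is 2^63 ns divided by a nice-0 load
weight of 1024; both constants are assumptions made for illustration and
are not stated in this commit message.

```python
# Assumed constants (not stated in the commit message above).
S64_RANGE_NS = 1 << 63        # half of the 64-bit range, in nanoseconds
NICE_0_WEIGHT = 1024          # assumed scaled-down nice-0 load weight
NSEC_PER_SEC = 1_000_000_000
SEC_PER_DAY = 24 * 60 * 60


def threshold_days(range_ns: int = S64_RANGE_NS,
                   weight: int = NICE_0_WEIGHT) -> float:
    """Return the sleep-time cutoff in days for the assumed constants."""
    cutoff_ns = range_ns // weight
    return cutoff_ns / NSEC_PER_SEC / SEC_PER_DAY


if __name__ == "__main__":
    print(f"cutoff is about {threshold_days():.2f} days")
```

Under these assumptions the cutoff lands between 104 and 105 days, which
agrees with the figure quoted above.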

Fixes: 829c1651e9c4 ("sched/fair: sanitize vruntime of entity being placed")
Originally-by: Zhang Qiao <zhangqiao22@huawei.com>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Zhang Qiao <zhangqiao22@huawei.com>
Link: https://lore.kernel.org/r/20230317160810.107988-1-vincent.guittot@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-30 12:49:30 +02:00
Zhang Qiao
e427be6889 sched/fair: sanitize vruntime of entity being placed
commit 829c1651e9c4a6f78398d3e67651cef9bb6b42cc upstream.

When a scheduling entity is placed onto cfs_rq, its vruntime is pulled
to the base level (around cfs_rq->min_vruntime), so that the entity
doesn't gain extra boost when placed backwards.

However, if the entity being placed wasn't executed for a long time, its
vruntime may get too far behind (e.g. while cfs_rq was executing a
low-weight hog), which can invert the vruntime comparison due to s64
overflow.  This results in the entity being placed with its original
vruntime way forwards, so that it will effectively never get to the cpu.

To prevent that, ignore the vruntime of the entity being placed if it
didn't execute for much longer than the characteristic scheduler time
scale.
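
The inverted comparison can be reproduced in a small model. This is a
sketch only: to_s64(), max_vruntime() and place() are hypothetical helpers
that emulate a wrapping 64-bit signed difference in Python, not the
kernel's code.

```python
MASK64 = (1 << 64) - 1


def to_s64(x: int) -> int:
    """Interpret the low 64 bits of x as a signed two's-complement value."""
    x &= MASK64
    return x - (1 << 64) if x >= (1 << 63) else x


def max_vruntime(a: int, b: int) -> int:
    """Pick the later of two u64 vruntimes via a wrapping signed difference."""
    return b if to_s64(b - a) > 0 else a


def place(se_vruntime: int, base: int) -> int:
    """Pull a lagging entity up to base, as the placement step intends."""
    return max_vruntime(se_vruntime, base)


if __name__ == "__main__":
    base = (1 << 63) + 1000
    # Modest lag: the entity is pulled up to the base level.
    print(place(base - 10_000, base) == base)
    # Lag of more than 2^63: the difference wraps negative, so the stale
    # vruntime is kept and looks as if it were far ahead of the base.
    print(place(500, base) == 500)
```

In the second call the entity keeps its original vruntime, which is the
situation the paragraph above describes.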

[rkagan: formatted, adjusted commit log, comments, cutoff value]
Signed-off-by: Zhang Qiao <zhangqiao22@huawei.com>
Co-developed-by: Roman Kagan <rkagan@amazon.de>
Signed-off-by: Roman Kagan <rkagan@amazon.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20230130122216.3555094-1-rkagan@amazon.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-30 12:49:30 +02:00
Frederic Weisbecker
d716ea059c entry/rcu: Check TIF_RESCHED _after_ delayed RCU wake-up
[ Upstream commit b416514054810cf2d2cc348ae477cea619b64da7 ]

RCU sometimes needs to perform a delayed wake up for specific kthreads
handling offloaded callbacks (RCU_NOCB).  These wakeups are performed
by timers and upon entry to idle (also to guest and to user on nohz_full).

However the delayed wake-up on kernel exit is actually performed after
the thread flags are fetched towards the fast path check for work to
do on exit to user. As a result, and if there is no other pending work
to do upon that kernel exit, the current task will resume to userspace
with TIF_RESCHED set and the pending wake up ignored.

Fix this by fetching the thread flags _after_ the delayed RCU-nocb
kthread wake-up.
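
The ordering problem can be illustrated with a toy model. Everything here
is hypothetical (the Task class, the NEED_RESCHED bit value, the function
names); it only shows why a flags snapshot taken before the wake-up misses
a reschedule request raised by that wake-up.

```python
NEED_RESCHED = 0x1  # hypothetical flag bit for this model


class Task:
    def __init__(self):
        self.flags = 0
        self.pending_wakeup = True  # a deferred wake-up is queued

    def flush_deferred_wakeup(self):
        """Do the delayed wake-up; waking a kthread may request a resched."""
        if self.pending_wakeup:
            self.pending_wakeup = False
            self.flags |= NEED_RESCHED


def exit_flags_read_first(task):
    """Buggy order: snapshot the flags, then flush the wake-up."""
    snapshot = task.flags
    task.flush_deferred_wakeup()
    return snapshot


def exit_flags_read_last(task):
    """Fixed order: flush the wake-up, then snapshot the flags."""
    task.flush_deferred_wakeup()
    return task.flags


if __name__ == "__main__":
    print(bool(exit_flags_read_first(Task()) & NEED_RESCHED))
    print(bool(exit_flags_read_last(Task()) & NEED_RESCHED))
```

With the buggy order the exit path sees no work to do even though the
task's flags now carry the reschedule request.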

Fixes: 47b8ff194c ("entry: Explicitly flush pending rcuog wakeup before last rescheduling point")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20230315194349.10798-3-joel@joelfernandes.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-30 12:49:13 +02:00
Josh Poimboeuf
adfc7aaa0d entry: Fix noinstr warning in __enter_from_user_mode()
[ Upstream commit f87d28673b71b35b248231a2086f9404afbb7f28 ]

__enter_from_user_mode() is triggering noinstr warnings with
CONFIG_DEBUG_PREEMPT due to its call of preempt_count_add() via
ct_state().

The preemption disable isn't needed as interrupts are already disabled.
And the context_tracking_enabled() check in ct_state() also isn't needed
as that's already being done by the CT_WARN_ON().

Just use __ct_state() instead.

Fixes the following warnings:

  vmlinux.o: warning: objtool: enter_from_user_mode+0xba: call to preempt_count_add() leaves .noinstr.text section
  vmlinux.o: warning: objtool: syscall_enter_from_user_mode+0xf9: call to preempt_count_add() leaves .noinstr.text section
  vmlinux.o: warning: objtool: syscall_enter_from_user_mode_prepare+0xc7: call to preempt_count_add() leaves .noinstr.text section
  vmlinux.o: warning: objtool: irqentry_enter_from_user_mode+0xba: call to preempt_count_add() leaves .noinstr.text section

Fixes: 171476775d ("context_tracking: Convert state to atomic_t")
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/d8955fa6d68dc955dda19baf13ae014ae27926f5.1677369694.git.jpoimboe@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-30 12:49:13 +02:00
Daniel Borkmann
9cda812c76 bpf: Adjust insufficient default bpf_jit_limit
[ Upstream commit 10ec8ca8ec1a2f04c4ed90897225231c58c124a7 ]

We've seen recent AWS EKS (Kubernetes) user reports like the following:

  After upgrading EKS nodes from v20230203 to v20230217 on our 1.24 EKS
  clusters after a few days a number of the nodes have containers stuck
  in ContainerCreating state or liveness/readiness probes reporting the
  following error:

    Readiness probe errored: rpc error: code = Unknown desc = failed to
    exec in container: failed to start exec "4a11039f730203ffc003b7[...]":
    OCI runtime exec failed: exec failed: unable to start container process:
    unable to init seccomp: error loading seccomp filter into kernel:
    error loading seccomp filter: errno 524: unknown

  However, we had not been seeing this issue on previous AMIs and it only
  started to occur on v20230217 (following the upgrade from kernel 5.4 to
  5.10) with no other changes to the underlying cluster or workloads.

  We tried the suggestions from that issue (sysctl net.core.bpf_jit_limit=452534528)
  which helped to immediately allow containers to be created and probes to
  execute but after approximately a day the issue returned and the value
  returned by cat /proc/vmallocinfo | grep bpf_jit | awk '{s+=$2} END {print s}'
  was steadily increasing.

I tested the bpf tree to observe bpf_jit_charge_modmem and
bpf_jit_uncharge_modmem, the sizes passed in, as well as bpf_jit_current
under a tcpdump BPF filter, seccomp BPF and native (e)BPF programs. The
behavior all looks sane and expected; that is, nothing is "leaking" from
an upstream perspective.

The bpf_jit_limit knob was originally added in order to avoid a situation
where unprivileged applications loading BPF programs (e.g. seccomp BPF
policies) consume all the module memory space via BPF JIT such that loading
of kernel modules would be prevented. The default limit was defined back in
2018 and while good enough back then, we are generally seeing far more BPF
consumers today.

Adjust the limit for the BPF JIT pool from originally 1/4 to now 1/2 of the
module memory space to better reflect today's needs and avoid more users
running into potentially hard to debug issues.
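
As a rough illustration of the change from 1/4 to 1/2, and of what the awk
one-liner quoted above adds up, here is a minimal sketch. The page size,
the 128 MiB module space and the sample vmallocinfo lines are all made up
for the example, and the helper names are hypothetical.

```python
PAGE_SIZE = 4096  # assumed page size


def round_up(value: int, align: int) -> int:
    """Round value up to the next multiple of align."""
    return (value + align - 1) // align * align


def jit_limit(module_space: int, shift: int) -> int:
    """Limit as module_space / 2**shift, rounded up to a whole page."""
    return round_up(module_space >> shift, PAGE_SIZE)


def jit_bytes_in_use(vmallocinfo_text: str) -> int:
    """Sum field 2 of every line mentioning bpf_jit, like the awk command."""
    total = 0
    for line in vmallocinfo_text.splitlines():
        if "bpf_jit" in line:
            total += int(line.split()[1])
    return total


# Made-up lines in the general shape of /proc/vmallocinfo output.
SAMPLE = """\
0xffffffffc0000000-0xffffffffc0002000    8192 bpf_jit_alloc_exec+0xe/0x10 pages=1 vmalloc
0xffffffffc0002000-0xffffffffc0005000   12288 load_module+0x1a2/0x2b0 pages=2 vmalloc
0xffffffffc0005000-0xffffffffc0007000    8192 bpf_jit_alloc_exec+0xe/0x10 pages=1 vmalloc
"""

if __name__ == "__main__":
    space = 128 << 20  # hypothetical 128 MiB module memory space
    print("old limit:", jit_limit(space, 2))
    print("new limit:", jit_limit(space, 1))
    print("in use:   ", jit_bytes_in_use(SAMPLE))
```

Comparing the in-use total against the limit over time is one way to see
whether a node is approaching the point where JIT allocations start to fail.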

Fixes: fdadd04931 ("bpf: fix bpf_jit_limit knob for PAGE_SIZE >= 64K")
Reported-by: Stephen Haynes <sh@synk.net>
Reported-by: Lefteris Alexakis <lefteris.alexakis@kpn.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://github.com/awslabs/amazon-eks-ami/issues/1179
Link: https://github.com/awslabs/amazon-eks-ami/issues/1219
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20230320143725.8394-1-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-30 12:49:08 +02:00
Costa Shulyupin
a1f4880655 tracing/hwlat: Replace sched_setaffinity with set_cpus_allowed_ptr
[ Upstream commit 71c7a30442b724717a30d5e7d1662ba4904eb3d4 ]

There is a problem with the behavior of hwlat in a container,
resulting in incorrect output. A warning message is generated:
"cpumask changed while in round-robin mode, switching to mode none",
and the tracing_cpumask is ignored. This issue arises because
the kernel thread, hwlatd, is not a part of the container, and
the function sched_setaffinity is unable to locate it using its PID.
Additionally, the task_struct of hwlatd is already known.
Ultimately, the function set_cpus_allowed_ptr achieves
the same outcome as sched_setaffinity, but employs task_struct
instead of PID.
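
The difference between the two lookups can be modelled in a few lines.
This is a toy sketch with hypothetical classes and function names; it
stands in for the idea of a PID lookup that is scoped to a namespace
versus an operation on an object the caller already holds.

```python
class Task:
    def __init__(self, name):
        self.name = name
        self.cpus_allowed = None


class PidNamespace:
    """Hypothetical stand-in: maps the PIDs visible in a namespace to tasks."""

    def __init__(self):
        self.tasks = {}

    def find_task(self, pid):
        return self.tasks.get(pid)


def set_affinity_by_pid(ns, pid, mask):
    """Lookup-based path: fails when the PID is not visible in ns."""
    task = ns.find_task(pid)
    if task is None:
        return False
    task.cpus_allowed = mask
    return True


def set_affinity_by_task(task, mask):
    """Direct path: no lookup, so the caller's namespace does not matter."""
    task.cpus_allowed = mask
    return True


if __name__ == "__main__":
    init_ns, container_ns = PidNamespace(), PidNamespace()
    hwlatd = Task("hwlatd")
    init_ns.tasks[1234] = hwlatd        # visible only in the initial namespace
    print(set_affinity_by_pid(container_ns, 1234, {1}))  # lookup fails
    print(set_affinity_by_task(hwlatd, {1}))             # succeeds
```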

Test case:

  # cd /sys/kernel/tracing
  # echo 0 > tracing_on
  # echo round-robin > hwlat_detector/mode
  # echo hwlat > current_tracer
  # unshare --fork --pid bash -c 'echo 1 > tracing_on'
  # dmesg -c

Actual behavior:

[573502.809060] hwlat_detector: cpumask changed while in round-robin mode, switching to mode none

Link: https://lore.kernel.org/linux-trace-kernel/20230316144535.1004952-1-costa.shul@redhat.com

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Fixes: 0330f7aa8e ("tracing: Have hwlat trace migrate across tracing_cpumask CPUs")
Signed-off-by: Costa Shulyupin <costa.shul@redhat.com>
Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-30 12:48:59 +02:00
Song Liu
d496185c25 perf: fix perf_event_context->time
[ Upstream commit baf1b12a67f5b24f395baca03e442ce27cab0c18 ]

Time readers rely on perf_event_context->[time|timestamp|timeoffset] to get
accurate time_enabled and time_running for an event. The difference between
ctx->timestamp and ctx->time is the amount of time during which the context is not
enabled. __update_context_time(ctx, false) is used to increase timestamp,
but not time. Therefore, it should only be called in ctx_sched_in() when
EVENT_TIME was not enabled.
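
The relationship between the three fields can be sketched with a toy
model. The Ctx class and update_context_time() below are hypothetical and
only mirror the behavior described in this message: the timestamp always
moves to "now", while time advances only when the adv argument is true.

```python
class Ctx:
    """Toy model of the three fields named above; not the kernel structure."""

    def __init__(self, now):
        self.time = 0
        self.timestamp = now
        self.timeoffset = self.time - self.timestamp


def update_context_time(ctx, now, adv):
    """Move timestamp to now; advance time by the elapsed span only if adv."""
    if adv:
        ctx.time += now - ctx.timestamp
    ctx.timestamp = now
    ctx.timeoffset = ctx.time - ctx.timestamp


if __name__ == "__main__":
    ctx = Ctx(now=0)
    update_context_time(ctx, 50, adv=True)    # enabled for 50 units
    update_context_time(ctx, 80, adv=False)   # disabled for 30 units
    update_context_time(ctx, 100, adv=True)   # enabled for 20 more units
    print(ctx.time, ctx.timestamp - ctx.time)
```

In this model, calling the update with adv=False while the context is in
fact enabled would count enabled time as disabled time, which is why the
message restricts that call to the case where EVENT_TIME was not enabled.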

Fixes: 09f5e7dc7a ("perf: Fix perf_event_read_local() time")
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/r/20230313171608.298734-1-song@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-30 12:48:59 +02:00
Yang Jihong
ff8137727a perf/core: Fix perf_output_begin parameter is incorrectly invoked in perf_event_bpf_output
[ Upstream commit eb81a2ed4f52be831c9fb879752d89645a312c13 ]

syzkaller reports a KASAN stack-out-of-bounds issue.
The call trace is as follows:
  dump_stack+0x9c/0xd3
  print_address_description.constprop.0+0x19/0x170
  __kasan_report.cold+0x6c/0x84
  kasan_report+0x3a/0x50
  __perf_event_header__init_id+0x34/0x290
  perf_event_header__init_id+0x48/0x60
  perf_output_begin+0x4a4/0x560
  perf_event_bpf_output+0x161/0x1e0
  perf_iterate_sb_cpu+0x29e/0x340
  perf_iterate_sb+0x4c/0xc0
  perf_event_bpf_event+0x194/0x2c0
  __bpf_prog_put.constprop.0+0x55/0xf0
  __cls_bpf_delete_prog+0xea/0x120 [cls_bpf]
  cls_bpf_delete_prog_work+0x1c/0x30 [cls_bpf]
  process_one_work+0x3c2/0x730
  worker_thread+0x93/0x650
  kthread+0x1b8/0x210
  ret_from_fork+0x1f/0x30

Commit 267fb27352 ("perf: Reduce stack usage of perf_output_begin()")
uses the on-stack struct perf_sample_data of the caller function.

However, perf_event_bpf_output passes an incorrect parameter, treating
small-sized data (struct perf_bpf_event) as large-sized data
(struct perf_sample_data), which causes a memory overwrite in
__perf_event_header__init_id.

Fixes: 267fb27352 ("perf: Reduce stack usage of perf_output_begin()")
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20230314044735.56551-1-yangjihong1@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-30 12:48:59 +02:00
Greg Kroah-Hartman
345103eb06 Revert "Revert "wait: Return number of exclusive waiters awaken""
This reverts commit 2b47e2bee0.

It was preserving the ABI, but that is not needed anymore at this point
in time.

Change-Id: I7efd4dd7abde9f5baa37cb3731aa40b7ff94d2bb
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-03-30 10:37:08 +00:00
Ramji Jiyani
0d5b95acb6 ANDROID: GKI: Multi arch exports protection support
ABI is being implemented for x86_64, making it necessary
to support protected exports header file generation for
the GKI modules for multiple architectures.

Enable support to select required inputs based on the ARCH
to generate gki_module_protected_exports.h during kernel
build.

Inputs for generating gki_module_protected_exports.h are:

ARCH = arm64:
ABI Protected exports list: abi_gki_protected_exports_aarch64
Protected GKI modules list: gki_aarch64_protected_modules

ARCH = x86_64:
ABI Protected exports list: abi_gki_protected_exports_x86_64
Protected GKI modules list: gki_x86_64_protected_modules
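
The selection described above amounts to a lookup keyed on ARCH. The
sketch below is illustrative only: the file names are the ones listed in
this message, while select_inputs() and its error behavior for other
architectures are assumptions, not the build system's actual logic.

```python
# ARCH -> (ABI protected exports list, protected GKI modules list)
INPUTS = {
    "arm64": ("abi_gki_protected_exports_aarch64",
              "gki_aarch64_protected_modules"),
    "x86_64": ("abi_gki_protected_exports_x86_64",
               "gki_x86_64_protected_modules"),
}


def select_inputs(arch: str):
    """Return (exports_list, modules_list) for arch; raise if unsupported."""
    try:
        return INPUTS[arch]
    except KeyError:
        raise ValueError(f"no protected-exports inputs for ARCH={arch}") from None


if __name__ == "__main__":
    exports, modules = select_inputs("x86_64")
    print(exports, modules)
```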

Test: TH
Test: Manual verification of the generated header file
Test: bazel run //common:kernel_aarch64_abi_update_protected_exports
Bug: 151893768
Change-Id: Ic4bcb2732199b71a7973b5ce4c852bcd95d37131
Signed-off-by: Ramji Jiyani <ramjiyani@google.com>
2023-03-29 23:11:03 +00:00
Greg Kroah-Hartman
88df355018 Revert "ANDROID: module: Add vendor hooks"
This reverts commit 60e6687899.

The hooks added in it, android_rvh_set_module_core_rw_nx,
android_rvh_set_module_init_rw_nx,
android_rvh_set_module_permit_before_init, and
android_rvh_set_module_permit_after_init, are not used by any vendor
symbol list, so remove them as they are unused.

Bug: 248994334
Cc: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
Change-Id: I39d02510916e2a645526f7d3bfaa3e4066901a3e
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-03-29 08:05:07 +00:00
Greg Kroah-Hartman
d14ac9ddc6 Merge 6.1.21 into android14-6.1
Changes in 6.1.21
	xfrm: Allow transport-mode states with AF_UNSPEC selector
	drm/virtio: Pass correct device to dma_sync_sgtable_for_device()
	drm/msm/gem: Prevent blocking within shrinker loop
	drm/panfrost: Don't sync rpm suspension after mmu flushing
	fbdev: chipsfb: Fix error codes in chipsfb_pci_init()
	cifs: Move the in_send statistic to __smb_send_rqst()
	drm/meson: fix 1px pink line on GXM when scaling video overlay
	clk: HI655X: select REGMAP instead of depending on it
	ASoC: SOF: Intel: MTL: Fix the device description
	ASoC: SOF: Intel: HDA: Fix device description
	ASoC: SOF: Intel: SKL: Fix device description
	ASOC: SOF: Intel: pci-tgl: Fix device description
	ASoC: SOF: ipc4-topology: set dmic dai index from copier
	docs: Correct missing "d_" prefix for dentry_operations member d_weak_revalidate
	scsi: mpt3sas: Fix NULL pointer access in mpt3sas_transport_port_add()
	scsi: mpi3mr: Fix throttle_groups memory leak
	scsi: mpi3mr: Fix config page DMA memory leak
	scsi: mpi3mr: Fix mpi3mr_hba_port memory leak in mpi3mr_remove()
	scsi: mpi3mr: Fix sas_hba.phy memory leak in mpi3mr_remove()
	scsi: mpi3mr: Return proper values for failures in firmware init path
	scsi: mpi3mr: Fix memory leaks in mpi3mr_init_ioc()
	scsi: mpi3mr: ioctl timeout when disabling/enabling interrupt
	scsi: mpi3mr: Fix expander node leak in mpi3mr_remove()
	ALSA: hda: Match only Intel devices with CONTROLLER_IN_GPU()
	netfilter: nft_nat: correct length for loading protocol registers
	netfilter: nft_masq: correct length for loading protocol registers
	netfilter: nft_redir: correct length for loading protocol registers
	netfilter: nft_redir: correct value of inet type `.maxattrs`
	scsi: core: Add BLIST_NO_VPD_SIZE for some VDASD
	scsi: core: Fix a procfs host directory removal regression
	ftrace,kcfi: Define ftrace_stub_graph conditionally
	tcp: tcp_make_synack() can be called from process context
	vdpa/mlx5: should not activate virtq object when suspended
	wifi: nl80211: fix NULL-ptr deref in offchan check
	wifi: cfg80211: fix MLO connection ownership
	selftests: fix LLVM build for i386 and x86_64
	nfc: pn533: initialize struct pn533_out_arg properly
	ipvlan: Make skb->skb_iif track skb->dev for l3s mode
	i40e: Fix kernel crash during reboot when adapter is in recovery mode
	vhost-vdpa: free iommu domain after last use during cleanup
	vdpa_sim: not reset state in vdpasim_queue_ready
	vdpa_sim: set last_used_idx as last_avail_idx in vdpasim_queue_ready
	PCI: s390: Fix use-after-free of PCI resources with per-function hotplug
	drm/i915/psr: Use calculated io and fast wake lines
	drm/i915/sseu: fix max_subslices array-index-out-of-bounds access
	net/smc: fix NULL sndbuf_desc in smc_cdc_tx_handler()
	qed/qed_dev: guard against a possible division by zero
	net: dsa: mt7530: remove now incorrect comment regarding port 5
	net: dsa: mt7530: set PLL frequency and trgmii only when trgmii is used
	block: do not reverse request order when flushing plug list
	loop: Fix use-after-free issues
	net: tunnels: annotate lockless accesses to dev->needed_headroom
	net: phy: smsc: bail out in lan87xx_read_status if genphy_read_status fails
	tcp: Fix bind() conflict check for dual-stack wildcard address.
	nfc: st-nci: Fix use after free bug in ndlc_remove due to race condition
	mlxsw: spectrum: Fix incorrect parsing depth after reload
	net/smc: fix deadlock triggered by cancel_delayed_work_syn()
	net: usb: smsc75xx: Limit packet length to skb->len
	drm/bridge: Fix returned array size name for atomic_get_input_bus_fmts kdoc
	powerpc/mm: Fix false detection of read faults
	block: null_blk: Fix handling of fake timeout request
	nvme: fix handling single range discard request
	nvmet: avoid potential UAF in nvmet_req_complete()
	block: sunvdc: add check for mdesc_grab() returning NULL
	net/mlx5e: Fix macsec ASO context alignment
	net/mlx5e: Don't cache tunnel offloads capability
	net/mlx5: Fix setting ec_function bit in MANAGE_PAGES
	net/mlx5: Disable eswitch before waiting for VF pages
	net/mlx5e: Support Geneve and GRE with VF tunnel offload
	net/mlx5: E-switch, Fix wrong usage of source port rewrite in split rules
	net/mlx5: E-switch, Fix missing set of split_count when forward to ovs internal port
	net/mlx5e: Fix cleanup null-ptr deref on encap lock
	net/mlx5: Set BREAK_FW_WAIT flag first when removing driver
	veth: Fix use after free in XDP_REDIRECT
	ice: xsk: disable txq irq before flushing hw
	net: dsa: don't error out when drivers return ETH_DATA_LEN in .port_max_mtu()
	net: dsa: mv88e6xxx: fix max_mtu of 1492 on 6165, 6191, 6220, 6250, 6290
	ravb: avoid PHY being resumed when interface is not up
	sh_eth: avoid PHY being resumed when interface is not up
	ipv4: Fix incorrect table ID in IOCTL path
	net: usb: smsc75xx: Move packet length check to prevent kernel panic in skb_pull
	net: atlantic: Fix crash when XDP is enabled but no program is loaded
	net/iucv: Fix size of interrupt data
	i825xx: sni_82596: use eth_hw_addr_set()
	selftests: net: devlink_port_split.py: skip test if no suitable device available
	qed/qed_mng_tlv: correctly zero out ->min instead of ->hour
	net: dsa: microchip: fix RGMII delay configuration on KSZ8765/KSZ8794/KSZ8795
	ethernet: sun: add check for the mdesc_grab()
	bonding: restore IFF_MASTER/SLAVE flags on bond enslave ether type change
	bonding: restore bond's IFF_SLAVE flag if a non-eth dev enslave fails
	hwmon: (adt7475) Display smoothing attributes in correct order
	hwmon: (adt7475) Fix masking of hysteresis registers
	hwmon: (xgene) Fix use after free bug in xgene_hwmon_remove due to race condition
	hwmon: (ina3221) return prober error code
	hwmon: (ucd90320) Add minimum delay between bus accesses
	hwmon: tmp512: drop of_match_ptr for ID table
	kconfig: Update config changed flag before calling callback
	hwmon: (adm1266) Set `can_sleep` flag for GPIO chip
	hwmon: (ltc2992) Set `can_sleep` flag for GPIO chip
	media: m5mols: fix off-by-one loop termination error
	mmc: atmel-mci: fix race between stop command and start of next command
	soc: mediatek: mtk-svs: keep svs alive if CONFIG_DEBUG_FS not supported
	jffs2: correct logic when creating a hole in jffs2_write_begin
	rust: arch/um: Disable FP/SIMD instruction to match x86
	ext4: fail ext4_iget if special inode unallocated
	ext4: update s_journal_inum if it changes after journal replay
	ext4: fix task hung in ext4_xattr_delete_inode
	drm/amdkfd: Fix an illegal memory access
	net/9p: fix bug in client create for .L
	LoongArch: Only call get_timer_irq() once in constant_clockevent_init()
	sh: intc: Avoid spurious sizeof-pointer-div warning
	drm/amdgpu: fix ttm_bo calltrace warning in psp_hw_fini
	drm/amd/display: fix shift-out-of-bounds in CalculateVMAndRowBytes
	ext4: fix possible double unlock when moving a directory
	Revert "tty: serial: fsl_lpuart: adjust SERIAL_FSL_LPUART_CONSOLE config dependency"
	tty: serial: fsl_lpuart: skip waiting for transmission complete when UARTCTRL_SBK is asserted
	serial: 8250_em: Fix UART port type
	serial: 8250_fsl: fix handle_irq locking
	serial: 8250: ASPEED_VUART: select REGMAP instead of depending on it
	firmware: xilinx: don't make a sleepable memory allocation from an atomic context
	memory: tegra: fix interconnect registration race
	memory: tegra20-emc: fix interconnect registration race
	memory: tegra124-emc: fix interconnect registration race
	memory: tegra30-emc: fix interconnect registration race
	drm/ttm: Fix a NULL pointer dereference
	s390/ipl: add missing intersection check to ipl_report handling
	interconnect: fix icc_provider_del() error handling
	interconnect: fix provider registration API
	interconnect: imx: fix registration race
	interconnect: fix mem leak when freeing nodes
	interconnect: qcom: osm-l3: fix registration race
	interconnect: qcom: rpm: fix probe child-node error handling
	interconnect: qcom: rpm: fix registration race
	interconnect: qcom: rpmh: fix probe child-node error handling
	interconnect: qcom: rpmh: fix registration race
	interconnect: qcom: msm8974: fix registration race
	interconnect: exynos: fix node leak in probe PM QoS error path
	interconnect: exynos: fix registration race
	md: select BLOCK_LEGACY_AUTOLOAD
	cifs: generate signkey for the channel that's reconnecting
	tracing: Make splice_read available again
	tracing: Check field value in hist_field_name()
	tracing: Make tracepoint lockdep check actually test something
	cifs: Fix smb2_set_path_size()
	KVM: SVM: Fix a benign off-by-one bug in AVIC physical table mask
	KVM: SVM: Modify AVIC GATag to support max number of 512 vCPUs
	KVM: nVMX: add missing consistency checks for CR0 and CR4
	ALSA: hda: intel-dsp-config: add MTL PCI id
	ALSA: hda/realtek: Fix the speaker output on Samsung Galaxy Book2 Pro
	ALSA: hda/realtek: fix speaker, mute/micmute LEDs not work on a HP platform
	Revert "riscv: mm: notify remote harts about mmu cache updates"
	riscv: asid: Fixup stale TLB entry cause application crash
	drm/shmem-helper: Remove another errant put in error path
	drm/sun4i: fix missing component unbind on bind errors
	drm/i915/active: Fix misuse of non-idle barriers as fence trackers
	drm/i915/dg2: Add HDMI pixel clock frequencies 267.30 and 319.89 MHz
	drm/amdgpu: Don't resume IOMMU after incomplete init
	drm/amd/pm: Fix sienna cichlid incorrect OD volage after resume
	drm/amd/pm: bump SMU 13.0.4 driver_if header version
	drm/amd/display: Do not set DRR on pipe Commit
	drm/amd/display: disconnect MPCC only on OTG change
	mptcp: fix possible deadlock in subflow_error_report
	mptcp: add ro_after_init for tcp{,v6}_prot_override
	mptcp: avoid setting TCP_CLOSE state twice
	mptcp: fix lockdep false positive in mptcp_pm_nl_create_listen_socket()
	ftrace: Fix invalid address access in lookup_rec() when index is 0
	ocfs2: fix data corruption after failed write
	nvme-pci: add NVME_QUIRK_BOGUS_NID for Netac NV3000
	ice: avoid bonding causing auxiliary plug/unplug under RTNL lock
	vp_vdpa: fix the crash in hot unplug with vp_vdpa
	mm/userfaultfd: propagate uffd-wp bit when PTE-mapping the huge zeropage
	mm: teach mincore_hugetlb about pte markers
	powerpc/64: Set default CPU in Kconfig
	powerpc/boot: Don't always pass -mcpu=powerpc when building 32-bit uImage
	mmc: sdhci_am654: lower power-on failed message severity
	fbdev: stifb: Provide valid pixelclock and add fb_check_var() checks
	trace/hwlat: Do not wipe the contents of per-cpu thread data
	trace/hwlat: Do not start per-cpu thread if it is already running
	ACPI: PPTT: Fix to avoid sleep in the atomic context when PPTT is absent
	net: phy: nxp-c45-tja11xx: fix MII_BASIC_CONFIG_REV bit
	fbdev: Fix incorrect page mapping clearance at fb_deferred_io_release()
	cpuidle: psci: Iterate backwards over list in psci_pd_remove()
	ASoC: Intel: soc-acpi: fix copy-paste issue in topology names
	ASoC: qcom: q6prm: fix incorrect clk_root passed to ADSP
	x86/mce: Make sure logged MCEs are processed after sysfs update
	x86/mm: Fix use of uninitialized buffer in sme_enable()
	x86/resctrl: Clear staged_config[] before and after it is used
	powerpc: Pass correct CPU reference to assembler
	virt/coco/sev-guest: Check SEV_SNP attribute at probe time
	virt/coco/sev-guest: Simplify extended guest request handling
	virt/coco/sev-guest: Remove the disable_vmpck label in handle_guest_request()
	virt/coco/sev-guest: Carve out the request issuing logic into a helper
	virt/coco/sev-guest: Do some code style cleanups
	virt/coco/sev-guest: Convert the sw_exit_info_2 checking to a switch-case
	virt/coco/sev-guest: Add throttling awareness
	io_uring/msg_ring: let target know allocated index
	perf: Fix check before add_event_to_groups() in perf_group_detach()
	powerpc: Disable CPU unknown by CLANG when CC_IS_CLANG
	powerpc/64: Replace -mcpu=e500mc64 by -mcpu=e5500
	Linux 6.1.21

Change-Id: I4b7f6e01381c0c121c9e89e51071ea60f1f7e29a
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-03-24 08:47:17 +00:00
Greg Kroah-Hartman
a22c3a8790 Merge 6.1.20 into android14-6.1
Changes in 6.1.20
	fs: prevent out-of-bounds array speculation when closing a file descriptor
	btrfs: fix unnecessary increment of read error stat on write error
	btrfs: fix percent calculation for bg reclaim message
	io_uring/uring_cmd: ensure that device supports IOPOLL
	erofs: fix wrong kunmap when using LZMA on HIGHMEM platforms
	perf inject: Fix --buildid-all not to eat up MMAP2
	fork: allow CLONE_NEWTIME in clone3 flags
	RISC-V: Stop emitting attributes
	x86/CPU/AMD: Disable XSAVES on AMD family 0x17
	drm/amdgpu: fix error checking in amdgpu_read_mm_registers for soc15
	drm/amdgpu: fix error checking in amdgpu_read_mm_registers for soc21
	drm/amdgpu: fix error checking in amdgpu_read_mm_registers for nv
	drm/display: Don't block HDR_OUTPUT_METADATA on unknown EOTF
	drm/connector: print max_requested_bpc in state debugfs
	staging: rtl8723bs: Fix key-store index handling
	staging: rtl8723bs: Pass correct parameters to cfg80211_get_bss()
	ext4: fix cgroup writeback accounting with fs-layer encryption
	ext4: fix RENAME_WHITEOUT handling for inline directories
	ext4: fix another off-by-one fsmap error on 1k block filesystems
	ext4: move where set the MAY_INLINE_DATA flag is set
	ext4: fix WARNING in ext4_update_inline_data
	ext4: zero i_disksize when initializing the bootloader inode
	HID: core: Provide new max_buffer_size attribute to over-ride the default
	HID: uhid: Over-ride the default maximum data buffer value with our own
	nfc: change order inside nfc_se_io error path
	KVM: VMX: Reset eVMCS controls in VP assist page during hardware disabling
	KVM: VMX: Don't bother disabling eVMCS static key on module exit
	KVM: x86: Move guts of kvm_arch_init() to standalone helper
	KVM: VMX: Do _all_ initialization before exposing /dev/kvm to userspace
	fs: dlm: fix log of lowcomms vs midcomms
	fs: dlm: add midcomms init/start functions
	fs: dlm: start midcomms before scand
	fs: dlm: remove send repeat remove handling
	fs: dlm: use packet in dlm_mhandle
	fd: dlm: trace send/recv of dlm message and rcom
	fs: dlm: fix use after free in midcomms commit
	fs: dlm: use WARN_ON_ONCE() instead of WARN_ON()
	fs: dlm: be sure to call dlm_send_queue_flush()
	fs: dlm: fix race setting stop tx flag
	udf: Fix off-by-one error when discarding preallocation
	bus: mhi: ep: Power up/down MHI stack during MHI RESET
	bus: mhi: ep: Change state_lock to mutex
	Input: exc3000 - properly stop timer on shutdown
	ipmi:ssif: Remove rtc_us_timer
	ipmi:ssif: Increase the message retry time
	ipmi:ssif: Add a timer between request retries
	spi: intel: Check number of chip selects after reading the descriptor
	drm/i915: Introduce intel_panel_init_alloc()
	drm/i915: Do panel VBT init early if the VBT declares an explicit panel type
	drm/i915: Populate encoder->devdata for DSI on icl+
	block: Revert "block: Do not reread partition table on exclusively open device"
	block: fix scan partition for exclusively open device again
	riscv: Add header include guards to insn.h
	scsi: core: Remove the /proc/scsi/${proc_name} directory earlier
	ext4: Fix possible corruption when moving a directory
	cifs: improve checking of DFS links over STATUS_OBJECT_NAME_INVALID
	drm/nouveau/kms/nv50: fix nv50_wndw_new_ prototype
	drm/msm: Fix potential invalid ptr free
	drm/msm/a5xx: fix setting of the CP_PREEMPT_ENABLE_LOCAL register
	drm/msm/a5xx: fix highest bank bit for a530
	drm/msm/a5xx: fix the emptyness check in the preempt code
	drm/msm/a5xx: fix context faults during ring switch
	bgmac: fix *initial* chip reset to support BCM5358
	nfc: fdp: add null check of devm_kmalloc_array in fdp_nci_i2c_read_device_properties
	powerpc: dts: t1040rdb: fix compatible string for Rev A boards
	tls: rx: fix return value for async crypto
	drm/msm/dpu: disable features unsupported by QCM2290
	ila: do not generate empty messages in ila_xlat_nl_cmd_get_mapping()
	net: lan966x: Fix port police support using tc-matchall
	selftests: nft_nat: ensuring the listening side is up before starting the client
	netfilter: nft_last: copy content when cloning expression
	netfilter: nft_quota: copy content when cloning expression
	net: tls: fix possible race condition between do_tls_getsockopt_conf() and do_tls_setsockopt_conf()
	net: use indirect calls helpers for sk_exit_memory_pressure()
	perf stat: Fix counting when initial delay configured
	net: lan78xx: fix accessing the LAN7800's internal phy specific registers from the MAC driver
	net: caif: Fix use-after-free in cfusbl_device_notify()
	ice: copy last block omitted in ice_get_module_eeprom()
	bpf, sockmap: Fix an infinite loop error when len is 0 in tcp_bpf_recvmsg_parser()
	drm/msm/dpu: fix len of sc7180 ctl blocks
	drm/msm/dpu: drop DPU_DIM_LAYER from MIXER_MSM8998_MASK
	drm/msm/dpu: fix clocks settings for msm8998 SSPP blocks
	drm/msm/dpu: clear DSPP reservations in rm release
	net: stmmac: add to set device wake up flag when stmmac init phy
	net: phylib: get rid of unnecessary locking
	bnxt_en: Avoid order-5 memory allocation for TPA data
	netfilter: ctnetlink: revert to dumping mark regardless of event type
	netfilter: tproxy: fix deadlock due to missing BH disable
	m68k: mm: Move initrd phys_to_virt handling after paging_init()
	btrfs: fix extent map logging bit not cleared for split maps after dropping range
	bpf, test_run: fix &xdp_frame misplacement for LIVE_FRAMES
	btf: fix resolving BTF_KIND_VAR after ARRAY, STRUCT, UNION, PTR
	net: phy: smsc: fix link up detection in forced irq mode
	net: ethernet: mtk_eth_soc: fix RX data corruption issue
	net: tls: fix device-offloaded sendpage straddling records
	scsi: megaraid_sas: Update max supported LD IDs to 240
	scsi: sd: Fix wrong zone_write_granularity value during revalidate
	netfilter: conntrack: adopt safer max chain length
	platform: mellanox: select REGMAP instead of depending on it
	platform: x86: MLX_PLATFORM: select REGMAP instead of depending on it
	block: fix wrong mode for blkdev_put() from disk_scan_partitions()
	NFSD: Protect against filesystem freezing
	ice: Fix DSCP PFC TLV creation
	ethernet: ice: avoid gcc-9 integer overflow warning
	net/smc: fix fallback failed while sendmsg with fastopen
	octeontx2-af: Unlock contexts in the queue context cache in case of fault detection
	SUNRPC: Fix a server shutdown leak
	net: dsa: mt7530: permit port 5 to work without port 6 on MT7621 SoC
	af_unix: fix struct pid leaks in OOB support
	erofs: Revert "erofs: fix kvcalloc() misuse with __GFP_NOFAIL"
	riscv: Use READ_ONCE_NOCHECK in imprecise unwinding stack mode
	RISC-V: Don't check text_mutex during stop_machine
	drm/amdgpu: fix return value check in kfd
	ext4: Fix deadlock during directory rename
	drm/amdgpu/soc21: don't expose AV1 if VCN0 is harvested
	drm/amdgpu/soc21: Add video cap query support for VCN_4_0_4
	adreno: Shutdown the GPU properly
	drm/msm/adreno: fix runtime PM imbalance at unbind
	watch_queue: fix IOC_WATCH_QUEUE_SET_SIZE alloc error paths
	tpm/eventlog: Don't abort tpm_read_log on faulty ACPI address
	MIPS: Fix a compilation issue
	powerpc/64: Don't recurse irq replay
	powerpc/iommu: fix memory leak with using debugfs_lookup()
	clk: renesas: rcar-gen3: Disable R-Car H3 ES1.*
	powerpc/bpf/32: Only set a stack frame when necessary
	powerpc/64: Fix task_cpu in early boot when booting non-zero cpuid
	powerpc/64: Move paca allocation to early_setup()
	powerpc/kcsan: Exclude udelay to prevent recursive instrumentation
	alpha: fix R_ALPHA_LITERAL reloc for large modules
	macintosh: windfarm: Use unsigned type for 1-bit bitfields
	PCI: Add SolidRun vendor ID
	scripts: handle BrokenPipeError for python scripts
	media: ov5640: Fix analogue gain control
	media: rc: gpio-ir-recv: add remove function
	drm/amd/display: Allow subvp on vactive pipes that are 2560x1440@60
	drm/amd/display: adjust MALL size available for DCN32 and DCN321
	filelocks: use mount idmapping for setlease permission check
	Revert "bpf, test_run: fix &xdp_frame misplacement for LIVE_FRAMES"
	UML: define RUNTIME_DISCARD_EXIT
	Linux 6.1.20

Change-Id: I2f92629ce02bc07295fea17b16f9bb567916a285
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-03-23 08:23:43 +00:00
Budimir Markovic
529546ea28 perf: Fix check before add_event_to_groups() in perf_group_detach()
commit fd0815f632c24878e325821943edccc7fde947a2 upstream.

Events should only be added to a groups rb tree if they have not been
removed from their context by list_del_event(). Since remove_on_exec
made it possible to call list_del_event() on individual events before
they are detached from their group, perf_group_detach() should check each
sibling's attach_state before calling add_event_to_groups() on it.
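
The check described above can be illustrated with a minimal user-space sketch. The struct, the flag name ATTACH_CONTEXT_SKETCH and the in_groups_tree field are hypothetical stand-ins for illustration, not the kernel's definitions; the sketch only shows the idea of consulting each sibling's attach state before re-adding it.

```c
/* Hypothetical flag: the event is still linked into its context. */
#define ATTACH_CONTEXT_SKETCH 0x01u

/* Simplified stand-in for a perf event; not the kernel structure. */
struct event_sketch {
    unsigned int attach_state;
    int in_groups_tree;     /* 1 once the event was re-added as a singleton */
};

/*
 * Model of detaching a group: every sibling becomes a standalone event,
 * but only siblings that are still attached to the context are re-added.
 * Returns the number of siblings that were re-added.
 */
int detach_group_sketch(struct event_sketch *siblings, int n)
{
    int added = 0;

    for (int i = 0; i < n; i++) {
        /* Skip siblings that were already removed from the context. */
        if (!(siblings[i].attach_state & ATTACH_CONTEXT_SKETCH))
            continue;
        siblings[i].in_groups_tree = 1;
        added++;
    }
    return added;
}
```

In this model, a sibling whose attach state was cleared earlier (for example by an exec-time removal) is left out of the tree instead of being inserted again.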

Fixes: 2e498d0a74 ("perf: Add support for event removal on exec")
Signed-off-by: Budimir Markovic <markovicbudimir@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/ZBFzvQV9tEqoHEtH@gentoo
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-22 13:34:06 +01:00
Tero Kristo
ac1d15d58d trace/hwlat: Do not start per-cpu thread if it is already running
commit 08697bca9bbba15f2058fdbd9f970bd5f6a8a2e8 upstream.

The hwlatd tracer will end up starting multiple per-cpu threads with
the following script:

    #!/bin/sh
    cd /sys/kernel/debug/tracing
    echo 0 > tracing_on
    echo hwlat > current_tracer
    echo per-cpu > hwlat_detector/mode
    echo 100000 > hwlat_detector/width
    echo 200000 > hwlat_detector/window
    echo 1 > tracing_on

To fix the issue, check if the hwlatd thread for the cpu is already
running, before starting a new one. Along with the previous patch, this
avoids running multiple instances of the same CPU thread on the system.
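
As a rough illustration of the "already running" check, the following user-space sketch keeps one flag per CPU and refuses to start a second thread for the same CPU. All names here (per_cpu_sketch, start_cpu_thread_sketch, NR_CPUS_SKETCH) are made up for the example; the real tracer tracks a kthread pointer in per-cpu data.

```c
#define NR_CPUS_SKETCH 4

/* Hypothetical per-CPU bookkeeping; the kernel stores a kthread pointer. */
struct cpu_data_sketch {
    int thread_running;
};

struct cpu_data_sketch per_cpu_sketch[NR_CPUS_SKETCH];
int threads_started;    /* counts how many threads the model has created */

/* Start the per-CPU thread unless one is already running on that CPU. */
int start_cpu_thread_sketch(int cpu)
{
    if (cpu < 0 || cpu >= NR_CPUS_SKETCH)
        return -1;

    /* Already running: nothing to do, and not an error. */
    if (per_cpu_sketch[cpu].thread_running)
        return 0;

    per_cpu_sketch[cpu].thread_running = 1;
    threads_started++;
    return 0;
}
```

Calling start_cpu_thread_sketch() twice for the same CPU leaves threads_started at 1. This only holds if the per-CPU data is not cleared between the two calls, which is the point of the previous patch.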

Link: https://lore.kernel.org/all/20230302113654.2984709-1-tero.kristo@linux.intel.com/
Link: https://lkml.kernel.org/r/20230310100451.3948583-3-tero.kristo@linux.intel.com

Cc: stable@vger.kernel.org
Fixes: f46b16520a ("trace/hwlat: Implement the per-cpu mode")
Signed-off-by: Tero Kristo <tero.kristo@linux.intel.com>
Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-22 13:34:04 +01:00
Tero Kristo
a78eab86e2 trace/hwlat: Do not wipe the contents of per-cpu thread data
commit 4c42f5f0d1dd20bddd9f940beb1e6ccad60c4498 upstream.

Do not wipe the contents of the per-cpu kthread data when starting the
tracer, as this will completely forget about already running instances
and can later start new additional per-cpu threads.

Link: https://lore.kernel.org/all/20230302113654.2984709-1-tero.kristo@linux.intel.com/
Link: https://lkml.kernel.org/r/20230310100451.3948583-2-tero.kristo@linux.intel.com

Cc: stable@vger.kernel.org
Fixes: f46b16520a ("trace/hwlat: Implement the per-cpu mode")
Signed-off-by: Tero Kristo <tero.kristo@linux.intel.com>
Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-22 13:34:04 +01:00
Chen Zhongjin
4f84f31f63 ftrace: Fix invalid address access in lookup_rec() when index is 0
commit ee92fa443358f4fc0017c1d0d325c27b37802504 upstream.

KASAN reported the following problem:

 BUG: KASAN: use-after-free in lookup_rec
 Read of size 8 at addr ffff000199270ff0 by task modprobe
 CPU: 2 Comm: modprobe
 Call trace:
  kasan_report
  __asan_load8
  lookup_rec
  ftrace_location
  arch_check_ftrace_location
  check_kprobe_address_safe
  register_kprobe

When checking pg->records[pg->index - 1].ip in lookup_rec(), it can get a
pg which is newly added to ftrace_pages_start in ftrace_process_locs().
Before the first pg->index++, index is 0 and accessing pg->records[-1].ip
will cause this problem.

Don't check the ip when pg->index is 0.
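
A simplified user-space sketch of the guard follows. The structures, INSN_SIZE_SKETCH and the linear scan are assumptions made for illustration (the kernel uses its own page and record types and a binary search); only the ordering of the checks is the point.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct rec_sketch {
    unsigned long ip;
};

struct page_sketch {
    struct page_sketch *next;
    struct rec_sketch *records;
    int index;              /* number of valid entries in records[] */
};

#define INSN_SIZE_SKETCH 4UL    /* assumed instruction size for this sketch */

/* Return the first record whose ip lies in [start, end], or NULL. */
struct rec_sketch *lookup_rec_sketch(struct page_sketch *head,
                                     unsigned long start, unsigned long end)
{
    for (struct page_sketch *pg = head; pg; pg = pg->next) {
        /*
         * Test index first: an empty page has no records[index - 1],
         * and short-circuit evaluation keeps us from reading it.
         */
        if (pg->index == 0 ||
            end < pg->records[0].ip ||
            start >= pg->records[pg->index - 1].ip + INSN_SIZE_SKETCH)
            continue;

        for (int i = 0; i < pg->index; i++) {
            if (pg->records[i].ip >= start && pg->records[i].ip <= end)
                return &pg->records[i];
        }
    }
    return NULL;
}
```

With a freshly added page (index 0, no records yet) at the head of the list, the loop skips it and continues with the next page instead of reading records[-1].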

Link: https://lore.kernel.org/linux-trace-kernel/20230309080230.36064-1-chenzhongjin@huawei.com

Cc: stable@vger.kernel.org
Fixes: 9644302e33 ("ftrace: Speed up search by skipping pages by address")
Suggested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-22 13:34:02 +01:00
Steven Rostedt (Google)
3968bb946a tracing: Check field value in hist_field_name()
commit 9f116f76fa8c04c81aef33ad870dbf9a158e5b70 upstream.

The function hist_field_name() cannot handle being passed a NULL field
parameter. It should never be NULL, but due to a previous bug, NULL was
passed to the function and the kernel crashed due to a NULL dereference.
Mark Rutland reported this to me on IRC.

The bug was fixed, but to prevent future bugs from crashing the kernel,
check the field and add a WARN_ON() if it is NULL.
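
The defensive pattern can be sketched in user space as below. The struct and function names are hypothetical, and the fprintf() call merely stands in for the kernel's warning macro.

```c
#include <stdio.h>

/* Hypothetical stand-in for a histogram field. */
struct hist_field_sketch {
    const char *name;
};

/* Return the field name, or an empty string if the field is missing. */
const char *field_name_sketch(const struct hist_field_sketch *field)
{
    if (!field) {
        /* Should never happen; warn loudly instead of dereferencing NULL. */
        fprintf(stderr, "warning: NULL field passed to field_name_sketch()\n");
        return "";
    }
    return field->name ? field->name : "";
}
```

A caller that passes NULL by mistake gets an empty name and a warning rather than a crash.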

Link: https://lkml.kernel.org/r/20230302020810.762384440@goodmis.org

Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Reported-by: Mark Rutland <mark.rutland@arm.com>
Fixes: c6afad49d1 ("tracing: Add hist trigger 'sym' and 'sym-offset' modifiers")
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-22 13:33:59 +01:00
Sung-hun Kim
192dcbf573 tracing: Make splice_read available again
commit e400be674a1a40e9dcb2e95f84d6c1fd2d88f31d upstream.

Since commit 36e2c7421f ("fs: don't allow splice read/write
without explicit ops") was applied to the kernel, splice() and
sendfile() calls on the trace file (/sys/kernel/debug/tracing/trace)
return EINVAL.

This patch restores these system calls by initializing splice_read
in file_operations of the trace file. This patch only enables such
functionalities for the read case.

Link: https://lore.kernel.org/linux-trace-kernel/20230314013707.28814-1-sfoon.kim@samsung.com

Cc: stable@vger.kernel.org
Fixes: 36e2c7421f ("fs: don't allow splice read/write without explicit ops")
Signed-off-by: Sung-hun Kim <sfoon.kim@samsung.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-22 13:33:59 +01:00
Matthias Männich
a58297e0d2 Merge "Merge 6.1.18 into android14-6.1" into android14-6.1
2023-03-22 10:14:03 +00:00
Greg Kroah-Hartman
d956976040 Merge 6.1.18 into android14-6.1
Changes in 6.1.18
	net/sched: Retire tcindex classifier
	auxdisplay: hd44780: Fix potential memory leak in hd44780_remove()
	fs/jfs: fix shift exponent db_agl2size negative
	driver: soc: xilinx: fix memory leak in xlnx_add_cb_for_notify_event()
	f2fs: don't rely on F2FS_MAP_* in f2fs_iomap_begin
	f2fs: fix to avoid potential deadlock
	objtool: Fix memory leak in create_static_call_sections()
	soc: mediatek: mtk-pm-domains: Allow mt8186 ADSP default power on
	memory: renesas-rpc-if: Split-off private data from struct rpcif
	memory: renesas-rpc-if: Move resource acquisition to .probe()
	soc: mediatek: mtk-svs: Enable the IRQ later
	pwm: sifive: Always let the first pwm_apply_state succeed
	pwm: stm32-lp: fix the check on arr and cmp registers update
	f2fs: introduce trace_f2fs_replace_atomic_write_block
	f2fs: correct i_size change for atomic writes
	f2fs: clear atomic_write_task in f2fs_abort_atomic_write()
	soc: mediatek: mtk-svs: restore default voltages when svs_init02() fail
	soc: mediatek: mtk-svs: reset svs when svs_resume() fail
	soc: mediatek: mtk-svs: Use pm_runtime_resume_and_get() in svs_init01()
	fs: f2fs: initialize fsdata in pagecache_write()
	f2fs: allow set compression option of files without blocks
	f2fs: fix to abort atomic write only during do_exist()
	um: vector: Fix memory leak in vector_config
	ubi: ensure that VID header offset + VID header size <= alloc, size
	ubifs: Fix build errors as symbol undefined
	ubifs: Fix memory leak in ubifs_sysfs_init()
	ubifs: Rectify space budget for ubifs_symlink() if symlink is encrypted
	ubifs: Rectify space budget for ubifs_xrename()
	ubifs: Fix wrong dirty space budget for dirty inode
	ubifs: do_rename: Fix wrong space budget when target inode's nlink > 1
	ubifs: Reserve one leb for each journal head while doing budget
	ubi: Fix use-after-free when volume resizing failed
	ubi: Fix unreferenced object reported by kmemleak in ubi_resize_volume()
	ubifs: Fix memory leak in alloc_wbufs()
	ubi: Fix possible null-ptr-deref in ubi_free_volume()
	ubifs: Re-statistic cleaned znode count if commit failed
	ubifs: dirty_cow_znode: Fix memleak in error handling path
	ubifs: ubifs_writepage: Mark page dirty after writing inode failed
	ubifs: ubifs_releasepage: Remove ubifs_assert(0) to valid this process
	ubi: fastmap: Fix missed fm_anchor PEB in wear-leveling after disabling fastmap
	ubi: Fix UAF wear-leveling entry in eraseblk_count_seq_show()
	ubi: ubi_wl_put_peb: Fix infinite loop when wear-leveling work failed
	f2fs: fix to avoid potential memory corruption in __update_iostat_latency()
	soc: qcom: stats: Populate all subsystem debugfs files
	ext4: use ext4_fc_tl_mem in fast-commit replay path
	ext4: don't show commit interval if it is zero
	netfilter: nf_tables: allow to fetch set elements when table has an owner
	x86: um: vdso: Add '%rcx' and '%r11' to the syscall clobber list
	um: virtio_uml: free command if adding to virtqueue failed
	um: virtio_uml: mark device as unregistered when breaking it
	um: virtio_uml: move device breaking into workqueue
	um: virt-pci: properly remove PCI device from bus
	f2fs: synchronize atomic write aborts
	watchdog: rzg2l_wdt: Issue a reset before we put the PM clocks
	watchdog: rzg2l_wdt: Handle TYPE-B reset for RZ/V2M
	watchdog: at91sam9_wdt: use devm_request_irq to avoid missing free_irq() in error path
	watchdog: Fix kmemleak in watchdog_cdev_register
	watchdog: pcwd_usb: Fix attempting to access uninitialized memory
	watchdog: sbsa_wdog: Make sure the timeout programming is within the limits
	netfilter: ctnetlink: fix possible refcount leak in ctnetlink_create_conntrack()
	netfilter: conntrack: fix rmmod double-free race
	netfilter: ip6t_rpfilter: Fix regression with VRF interfaces
	netfilter: ebtables: fix table blob use-after-free
	netfilter: xt_length: use skb len to match in length_mt6
	netfilter: ctnetlink: make event listener tracking global
	netfilter: x_tables: fix percpu counter block leak on error path when creating new netns
	ptp: vclock: use mutex to fix "sleep on atomic" bug
	drm/i915: move a Kconfig symbol to unbreak the menu presentation
	ipv6: Add lwtunnel encap size of all siblings in nexthop calculation
	octeontx2-pf: Recalculate UDP checksum for ptp 1-step sync packet
	net: sunhme: Fix region request
	sctp: add a refcnt in sctp_stream_priorities to avoid a nested loop
	octeontx2-pf: Use correct struct reference in test condition
	net: fix __dev_kfree_skb_any() vs drop monitor
	9p/xen: fix version parsing
	9p/xen: fix connection sequence
	9p/rdma: unmap receive dma buffer in rdma_request()/post_recv()
	spi: tegra210-quad: Fix validate combined sequence
	mlx5: fix skb leak while fifo resync and push
	mlx5: fix possible ptp queue fifo use-after-free
	net/mlx5: ECPF, wait for VF pages only after disabling host PFs
	net/mlx5e: Verify flow_source cap before using it
	net/mlx5: Geneve, Fix handling of Geneve object id as error code
	ext4: fix incorrect options show of original mount_opt and extend mount_opt2
	nfc: fix memory leak of se_io context in nfc_genl_se_io
	net/sched: transition act_pedit to rcu and percpu stats
	net/sched: act_pedit: fix action bind logic
	net/sched: act_mpls: fix action bind logic
	net/sched: act_sample: fix action bind logic
	net: dsa: seville: ignore mscc-miim read errors from Lynx PCS
	net: dsa: felix: fix internal MDIO controller resource length
	ARM: dts: spear320-hmi: correct STMPE GPIO compatible
	tcp: tcp_check_req() can be called from process context
	vc_screen: modify vcs_size() handling in vcs_read()
	spi: tegra210-quad: Fix iterator outside loop
	rtc: sun6i: Always export the internal oscillator
	genirq/ipi: Fix NULL pointer deref in irq_data_get_affinity_mask()
	scsi: ipr: Work around fortify-string warning
	scsi: mpi3mr: Fix an issue found by KASAN
	scsi: mpi3mr: Use number of bits to manage bitmap sizes
	rtc: allow rtc_read_alarm without read_alarm callback
	io_uring: fix size calculation when registering buf ring
	loop: loop_set_status_from_info() check before assignment
	ASoC: adau7118: don't disable regulators on device unbind
	ASoC: apple: mca: Fix final status read on SERDES reset
	ASoC: apple: mca: Fix SERDES reset sequence
	ASoC: apple: mca: Improve handling of unavailable DMA channels
	nvme: bring back auto-removal of deleted namespaces during sequential scan
	nvme-tcp: don't access released socket during error recovery
	nvme-fabrics: show well known discovery name
	ASoC: zl38060 add gpiolib dependency
	ASoC: mediatek: mt8195: add missing initialization
	thermal: intel: quark_dts: fix error pointer dereference
	thermal: intel: BXT_PMIC: select REGMAP instead of depending on it
	tracing: Add NULL checks for buffer in ring_buffer_free_read_page()
	kernel/printk/index.c: fix memory leak with using debugfs_lookup()
	firmware/efi sysfb_efi: Add quirk for Lenovo IdeaPad Duet 3
	bootconfig: Increase max nodes of bootconfig from 1024 to 8192 for DCC support
	mfd: arizona: Use pm_runtime_resume_and_get() to prevent refcnt leak
	IB/hfi1: Update RMT size calculation
	iommu/amd: Fix error handling for pdev_pri_ats_enable()
	PCI/ACPI: Account for _S0W of the target bridge in acpi_pci_bridge_d3()
	media: uvcvideo: Remove format descriptions
	media: uvcvideo: Handle cameras with invalid descriptors
	media: uvcvideo: Handle errors from calls to usb_string
	media: uvcvideo: Quirk for autosuspend in Logitech B910 and C910
	media: uvcvideo: Silence memcpy() run-time false positive warnings
	USB: fix memory leak with using debugfs_lookup()
	cacheinfo: Fix shared_cpu_map to handle shared caches at different levels
	staging: emxx_udc: Add checks for dma_alloc_coherent()
	tty: fix out-of-bounds access in tty_driver_lookup_tty()
	tty: serial: fsl_lpuart: disable the CTS when send break signal
	serial: sc16is7xx: setup GPIO controller later in probe
	mei: bus-fixup:upon error print return values of send and receive
	tools/iio/iio_utils:fix memory leak
	bus: mhi: ep: Fix the debug message for MHI_PKT_TYPE_RESET_CHAN_CMD cmd
	iio: accel: mma9551_core: Prevent uninitialized variable in mma9551_read_status_word()
	iio: accel: mma9551_core: Prevent uninitialized variable in mma9551_read_config_word()
	media: uvcvideo: Add GUID for BGRA/X 8:8:8:8
	soundwire: bus_type: Avoid lockdep assert in sdw_drv_probe()
	PCI: loongson: Prevent LS7A MRRS increases
	staging: pi433: fix memory leak with using debugfs_lookup()
	USB: dwc3: fix memory leak with using debugfs_lookup()
	USB: chipidea: fix memory leak with using debugfs_lookup()
	USB: ULPI: fix memory leak with using debugfs_lookup()
	USB: uhci: fix memory leak with using debugfs_lookup()
	USB: sl811: fix memory leak with using debugfs_lookup()
	USB: fotg210: fix memory leak with using debugfs_lookup()
	USB: isp116x: fix memory leak with using debugfs_lookup()
	USB: isp1362: fix memory leak with using debugfs_lookup()
	USB: gadget: gr_udc: fix memory leak with using debugfs_lookup()
	USB: gadget: bcm63xx_udc: fix memory leak with using debugfs_lookup()
	USB: gadget: lpc32xx_udc: fix memory leak with using debugfs_lookup()
	USB: gadget: pxa25x_udc: fix memory leak with using debugfs_lookup()
	USB: gadget: pxa27x_udc: fix memory leak with using debugfs_lookup()
	usb: host: xhci: mvebu: Iterate over array indexes instead of using pointer math
	USB: ene_usb6250: Allocate enough memory for full object
	usb: uvc: Enumerate valid values for color matching
	usb: gadget: uvc: Make bSourceID read/write
	PCI: Align extra resources for hotplug bridges properly
	PCI: Take other bus devices into account when distributing resources
	PCI: Distribute available resources for root buses, too
	tty: pcn_uart: fix memory leak with using debugfs_lookup()
	misc: vmw_balloon: fix memory leak with using debugfs_lookup()
	drivers: base: component: fix memory leak with using debugfs_lookup()
	drivers: base: dd: fix memory leak with using debugfs_lookup()
	kernel/fail_function: fix memory leak with using debugfs_lookup()
	PCI: loongson: Add more devices that need MRRS quirk
	PCI: Add ACS quirk for Wangxun NICs
	PCI: pciehp: Add Qualcomm quirk for Command Completed erratum
	phy: rockchip-typec: Fix unsigned comparison with less than zero
	RDMA/cma: Distinguish between sockaddr_in and sockaddr_in6 by size
	iommu: Attach device group to old domain in error path
	soundwire: cadence: Remove wasted space in response_buf
	soundwire: cadence: Drain the RX FIFO after an IO timeout
	net: tls: avoid hanging tasks on the tx_lock
	x86/resctl: fix scheduler confusion with 'current'
	vDPA/ifcvf: decouple hw features manipulators from the adapter
	vDPA/ifcvf: decouple config space ops from the adapter
	vDPA/ifcvf: alloc the mgmt_dev before the adapter
	vDPA/ifcvf: decouple vq IRQ releasers from the adapter
	vDPA/ifcvf: decouple config IRQ releaser from the adapter
	vDPA/ifcvf: decouple vq irq requester from the adapter
	vDPA/ifcvf: decouple config/dev IRQ requester and vectors allocator from the adapter
	vDPA/ifcvf: ifcvf_request_irq works on ifcvf_hw
	vDPA/ifcvf: manage ifcvf_hw in the mgmt_dev
	vDPA/ifcvf: allocate the adapter in dev_add()
	drm/display/dp_mst: Add drm_atomic_get_old_mst_topology_state()
	drm/display/dp_mst: Fix down/up message handling after sink disconnect
	drm/display/dp_mst: Fix down message handling after a packet reception error
	drm/display/dp_mst: Fix payload addition on a disconnected sink
	drm/i915/dp_mst: Add the MST topology state for modesetted CRTCs
	drm/i915: Fix system suspend without fbdev being initialized
	media: uvcvideo: Fix race condition with usb_kill_urb
	io_uring: fix two assignments in if conditions
	io_uring/poll: allow some retries for poll triggering spuriously
	arm64: efi: Make efi_rt_lock a raw_spinlock
	arm64: mte: Fix/clarify the PG_mte_tagged semantics
	arm64: Reset KASAN tag in copy_highpage with HW tags only
	usb: gadget: uvc: fix missing mutex_unlock() if kstrtou8() fails
	Linux 6.1.18

Change-Id: Icb8e56528d481a17780bdd517c69efa9e76b94c0
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-03-21 08:22:15 +00:00
Suren Baghdasaryan
ac86382170 ANDROID: Revert "psi: allow unprivileged users with CAP_SYS_RESOURCE to write psi files"
This reverts commit 6db12ee045.

In Android, system_server registers psi trigger to detect memory
pressure. This commit requires processes registering new triggers to
have CAP_SYS_RESOURCE capability, which system_server does not have.
Reverting this change until a solution can be found to fix the breakage
of functionality in Android T using 5.15 kernels.

Bug: 243781242
Bug: 244148051
Reported-by: liuhailong <liuhailong@oppo.com>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: If6c8580af8734f3b765d48c782a536aad357e6f0
2023-03-20 20:20:07 +00:00
David Disseldorp
587a6fda90 watch_queue: fix IOC_WATCH_QUEUE_SET_SIZE alloc error paths
[ Upstream commit 03e1d60e177eedbd302b77af4ea5e21b5a7ade31 ]

The watch_queue_set_size() allocation error paths return the ret value
set via the prior pipe_resize_ring() call, which will always be zero.

As a result, IOC_WATCH_QUEUE_SET_SIZE callers such as "keyctl watch"
fail to detect kernel wqueue->notes allocation failures and proceed to
KEYCTL_WATCH_KEY, with any notifications subsequently lost.
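
The error-path pattern can be shown with a small user-space sketch. resize_ring_stub(), set_size_sketch() and the simulate_alloc_failure parameter are hypothetical names for illustration; the point is only that a return variable left at 0 by an earlier successful call has to be set explicitly before jumping to the error label.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the ring resize step; always succeeds here. */
int resize_ring_stub(unsigned int nr_slots)
{
    (void)nr_slots;
    return 0;
}

/*
 * Resize, then allocate a notes buffer. On allocation failure the
 * function must not return the stale 0 left in ret by the resize step.
 */
int set_size_sketch(unsigned int nr_notes, int simulate_alloc_failure,
                    void **notes_out)
{
    void *notes;
    int ret;

    ret = resize_ring_stub(nr_notes);
    if (ret < 0)
        return ret;

    notes = simulate_alloc_failure ? NULL : calloc(nr_notes, 128);
    if (!notes) {
        ret = -ENOMEM;      /* without this line the caller would see 0 */
        goto error;
    }

    *notes_out = notes;
    return 0;

error:
    return ret;
}
```

A caller that checks the return value can now tell that the buffer was never allocated and stop before registering a watch.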

Fixes: c73be61ced ("pipe: Add general notification queue support")
Signed-off-by: David Disseldorp <ddiss@suse.de>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17 08:50:30 +01:00
Lorenz Bauer
f0c8306c1a btf: fix resolving BTF_KIND_VAR after ARRAY, STRUCT, UNION, PTR
[ Upstream commit 9b459804ff9973e173fabafba2a1319f771e85fa ]

btf_datasec_resolve contains a bug that causes the following BTF
to fail loading:

    [1] DATASEC a size=2 vlen=2
        type_id=4 offset=0 size=1
        type_id=7 offset=1 size=1
    [2] INT (anon) size=1 bits_offset=0 nr_bits=8 encoding=(none)
    [3] PTR (anon) type_id=2
    [4] VAR a type_id=3 linkage=0
    [5] INT (anon) size=1 bits_offset=0 nr_bits=8 encoding=(none)
    [6] TYPEDEF td type_id=5
    [7] VAR b type_id=6 linkage=0

This error message is printed during btf_check_all_types:

    [1] DATASEC a size=2 vlen=2
        type_id=7 offset=1 size=1 Invalid type

By tracing btf_*_resolve we can pinpoint the problem:

    btf_datasec_resolve(depth: 1, type_id: 1, mode: RESOLVE_TBD) = 0
        btf_var_resolve(depth: 2, type_id: 4, mode: RESOLVE_TBD) = 0
            btf_ptr_resolve(depth: 3, type_id: 3, mode: RESOLVE_PTR) = 0
        btf_var_resolve(depth: 2, type_id: 4, mode: RESOLVE_PTR) = 0
    btf_datasec_resolve(depth: 1, type_id: 1, mode: RESOLVE_PTR) = -22

The last invocation of btf_datasec_resolve should invoke btf_var_resolve
by means of env_stack_push; instead, it returns EINVAL. The reason is that
env_stack_push is never executed for the second VAR.

    if (!env_type_is_resolve_sink(env, var_type) &&
        !env_type_is_resolved(env, var_type_id)) {
        env_stack_set_next_member(env, i + 1);
        return env_stack_push(env, var_type, var_type_id);
    }

env_type_is_resolve_sink() changes its behaviour based on resolve_mode.
For RESOLVE_PTR, we can simplify the if condition to the following:

    (btf_type_is_modifier() || btf_type_is_ptr) && !env_type_is_resolved()

Since we're dealing with a VAR the clause evaluates to false. This is
not sufficient to trigger the bug however. The log output and EINVAL
are only generated if btf_type_id_size() fails.

    if (!btf_type_id_size(btf, &type_id, &type_size)) {
        btf_verifier_log_vsi(env, v->t, vsi, "Invalid type");
        return -EINVAL;
    }

Most types are sized, so for example a VAR referring to an INT is not a
problem. The bug is only triggered if a VAR points at a modifier. Since
we skipped btf_var_resolve that modifier was also never resolved, which
means that btf_resolved_type_id returns 0 aka VOID for the modifier.
This in turn causes btf_type_id_size to return NULL, triggering EINVAL.

To summarise, the following conditions are necessary:

- VAR pointing at PTR, STRUCT, UNION or ARRAY
- Followed by a VAR pointing at TYPEDEF, VOLATILE, CONST, RESTRICT or
  TYPE_TAG

The fix is to reset resolve_mode to RESOLVE_TBD before attempting to
resolve a VAR from a DATASEC.
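
The effect of the reset can be modelled with a toy sketch. The enums and helpers below are simplified assumptions, not the verifier's code: the RESOLVE_PTR rule follows the simplified condition quoted above, while the RESOLVE_TBD rule (only types that need no resolution are sinks) is an assumption about the default mode.

```c
#include <stdbool.h>

/* Toy subset of BTF kinds and resolve modes; names are illustrative. */
enum kind_sketch { K_INT, K_PTR, K_TYPEDEF, K_STRUCT, K_ARRAY, K_VAR };
enum mode_sketch { MODE_TBD, MODE_PTR };

static bool is_modifier_sketch(enum kind_sketch k)
{
    return k == K_TYPEDEF;
}

/* Assumed: everything except a plain INT needs resolving in this model. */
static bool needs_resolve_sketch(enum kind_sketch k)
{
    return k != K_INT;
}

/* A "sink" is a type the resolver does not descend into in this mode. */
bool is_resolve_sink_sketch(enum mode_sketch mode, enum kind_sketch k)
{
    switch (mode) {
    case MODE_TBD:
        return !needs_resolve_sketch(k);
    case MODE_PTR:
        return !is_modifier_sketch(k) && k != K_PTR;
    }
    return false;
}

/* Would the DATASEC pass push a not-yet-resolved VAR member? */
bool datasec_pushes_var_sketch(enum mode_sketch inherited_mode, bool reset_mode)
{
    enum mode_sketch mode = reset_mode ? MODE_TBD : inherited_mode;

    return !is_resolve_sink_sketch(mode, K_VAR);
}
```

In this model, a mode of MODE_PTR left over from resolving the first VAR makes the second VAR look like a sink, so it is never pushed. Resetting the mode first makes it eligible again.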

Fixes: 1dc9285184 ("bpf: kernel side support for BTF Var and DataSec")
Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
Link: https://lore.kernel.org/r/20230306112138.155352-2-lmb@isovalent.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17 08:50:26 +01:00
Tobias Klauser
120f7a9287 fork: allow CLONE_NEWTIME in clone3 flags
commit a402f1e35313fc7ce2ca60f543c4402c2c7c3544 upstream.

Currently, calling clone3() with CLONE_NEWTIME in clone_args->flags
fails with -EINVAL. This is because CLONE_NEWTIME intersects with
CSIGNAL. However, CSIGNAL was deprecated when clone3 was introduced in
commit 7f192e3cd3 ("fork: add clone3"), allowing re-use of that part
of clone flags.

Fix this by explicitly allowing CLONE_NEWTIME in clone3_args_valid. This
is also in line with the respective check in check_unshare_flags, which
allows CLONE_NEWTIME for unshare().
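
The overlap can be seen from the flag values alone. The sketch below is
illustrative and is not the code of clone3_args_valid: the EX_ constants
copy the values of CSIGNAL (0x000000ff) and CLONE_NEWTIME (0x00000080)
from the uapi sched.h header, and flags_valid_before() and
flags_valid_after() are hypothetical helpers that model only the CSIGNAL
part of the check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the uapi values, renamed to avoid header clashes. */
#define EX_CSIGNAL       0x000000ffULL /* low byte: exit signal mask */
#define EX_CLONE_NEWTIME 0x00000080ULL /* new time namespace */

/* CLONE_NEWTIME lies inside the CSIGNAL mask; checked at compile time. */
_Static_assert((EX_CLONE_NEWTIME & EX_CSIGNAL) != 0,
               "CLONE_NEWTIME intersects CSIGNAL");

/* Hypothetical model of the old check: any CSIGNAL bit is rejected. */
static bool flags_valid_before(uint64_t flags)
{
    return (flags & EX_CSIGNAL) == 0;
}

/* Hypothetical model of the fixed check: CLONE_NEWTIME is carved out. */
static bool flags_valid_after(uint64_t flags)
{
    return (flags & (EX_CSIGNAL & ~EX_CLONE_NEWTIME)) == 0;
}
```

With these helpers, EX_CLONE_NEWTIME is rejected by flags_valid_before()
and accepted by flags_valid_after(), while any other bit in the low byte
is still rejected by both.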

Fixes: 769071ac9f ("ns: Introduce Time Namespace")
Cc: Andrey Vagin <avagin@openvz.org>
Cc: Christian Brauner <brauner@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-17 08:50:14 +01:00
Greg Kroah-Hartman
9154eb052f Revert "Revert "sched/psi: Stop relying on timer_pending() for poll_work rescheduling""
This reverts commit 02bdd918e6. It was
preserving the ABI, but that is not needed anymore at this point in
time.

Change-Id: I486cebed8ec0f91985d117eed3e1069d6160e267
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-03-15 16:41:08 +00:00
Greg Kroah-Hartman
5b483d8a04 Merge changes I95ce33fb,I03723a9f,I4b1cf7f1,I6e17c9b3,I446172f8, ... into android14-6.1
* changes:
  Merge 6.1.17 into android14-6.1
  ANDROID: update abi definition due to io_uring changes.
  UPSTREAM: Revert "blk-cgroup: dropping parent refcount after pd_free_fn() is done"
  UPSTREAM: Revert "blk-cgroup: synchronize pd_free_fn() from blkg_free_workfn() and blkcg_deactivate_policy()"
  Revert "kobject: modify kobject_get_path() to take a const *"
  Revert "wait: Return number of exclusive waiters awaken"
  Revert "sbitmap: Use single per-bitmap counting to wake up queued tags"
  Revert "sbitmap: correct wake_batch recalculation to avoid potential IO hung"
  Revert "sbitmap: Advance the queue index before waking up a queue"
  Revert "sbitmap: Try each queue to wake up at least one waiter"
  Revert "HID: retain initial quirks set up when creating HID devices"
  Merge 6.1.16 into android14-6.1
2023-03-14 17:38:04 +00:00
Sangmoon Kim
c5ea4db533 ANDROID: power: add vendor hooks for try_to_freeze fail
Add hooks to gather data of unfrozen tasks and summarize it
with other information.

Bug: 273189923

Signed-off-by: Sangmoon Kim <sangmoon.kim@samsung.com>
Change-Id: I61da3d253bd9959c6f06e09c9a35c4b242cedafe
(cherry picked from commit 2232e3fc85534a176e7f8bdfe8c56820d10dc111)
2023-03-13 20:34:25 +00:00
Sangmoon Kim
8635a09118 ANDROID: softlockup: add vendor hook for a softlockup task
Add hook to gather data of softlockup and summarize it with
other information.

Bug: 273189923

Signed-off-by: Sangmoon Kim <sangmoon.kim@samsung.com>
Change-Id: I5263bbd573c3fa4b4c981ac26c943721ce09506d
(cherry picked from commit 5cc613a916fdd4285ba5118a0d3063a32c31fbcb)
2023-03-13 20:34:25 +00:00
Greg Kroah-Hartman
2b47e2bee0 Revert "wait: Return number of exclusive waiters awaken"
This reverts commit d710b1e91b.

It breaks the ABI right now, but will be brought back at the next ABI
break as it will be needed for Android systems.

Bug: 161946584
Change-Id: I1dcf1f311ce25059466d000543b020fb33d237b4
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-03-13 18:51:55 +00:00
Greg Kroah-Hartman
2cb73a87e4 Merge 6.1.16 into android14-6.1
Changes in 6.1.16
	HID: asus: use spinlock to protect concurrent accesses
	HID: asus: use spinlock to safely schedule workers
	powerpc/mm: Rearrange if-else block to avoid clang warning
	ata: ahci: Revert "ata: ahci: Add Tiger Lake UP{3,4} AHCI controller"
	ARM: OMAP2+: Fix memory leak in realtime_counter_init()
	arm64: dts: qcom: qcs404: use symbol names for PCIe resets
	arm64: dts: qcom: msm8996-tone: Fix USB taking 6 minutes to wake up
	arm64: dts: qcom: sm8150-kumano: Panel framebuffer is 2.5k instead of 4k
	arm64: dts: qcom: sm6350: Fix up the ramoops node
	arm64: dts: qcom: sm6125: Reorder HSUSB PHY clocks to match bindings
	arm64: dts: qcom: sm6125-seine: Clean up gpio-keys (volume down)
	arm64: dts: imx8m: Align SoC unique ID node unit address
	ARM: zynq: Fix refcount leak in zynq_early_slcr_init
	arm64: dts: mediatek: mt8195: Add power domain to U3PHY1 T-PHY
	arm64: dts: mediatek: mt8183: Fix systimer 13 MHz clock description
	arm64: dts: mediatek: mt8192: Fix systimer 13 MHz clock description
	arm64: dts: mediatek: mt8195: Fix systimer 13 MHz clock description
	arm64: dts: mediatek: mt8186: Fix systimer 13 MHz clock description
	arm64: dts: qcom: sdm845-db845c: fix audio codec interrupt pin name
	x86/acpi/boot: Do not register processors that cannot be onlined for x2APIC
	arm64: dts: qcom: sc7180: correct SPMI bus address cells
	arm64: dts: qcom: sc7280: correct SPMI bus address cells
	arm64: dts: qcom: sc8280xp: correct SPMI bus address cells
	arm64: dts: qcom: sc8280xp: Vote for CX in USB controllers
	arm64: dts: meson-gxl: jethub-j80: Fix WiFi MAC address node
	arm64: dts: meson-gxl: jethub-j80: Fix Bluetooth MAC node name
	arm64: dts: meson-axg: jethub-j1xx: Fix MAC address node names
	arm64: dts: meson-gx: Fix Ethernet MAC address unit name
	arm64: dts: meson-g12a: Fix internal Ethernet PHY unit name
	arm64: dts: meson-gx: Fix the SCPI DVFS node name and unit address
	cpuidle, intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE *again*
	arm64: dts: ti: k3-am62: Enable SPI nodes at the board level
	arm64: dts: ti: k3-am62-main: Fix clocks for McSPI
	arm64: tegra: Fix duplicate regulator on Jetson TX1
	arm64: dts: msm8992-bullhead: add memory hole region
	arm64: dts: qcom: msm8992-bullhead: Fix cont_splash_mem size
	arm64: dts: qcom: msm8992-bullhead: Disable dfps_data_mem
	arm64: dts: qcom: ipq8074: correct USB3 QMP PHY-s clock output names
	arm64: dts: qcom: ipq8074: fix Gen2 PCIe QMP PHY
	arm64: dts: qcom: ipq8074: fix Gen3 PCIe QMP PHY
	arm64: dts: qcom: ipq8074: correct Gen2 PCIe ranges
	arm64: dts: qcom: ipq8074: fix Gen3 PCIe node
	arm64: dts: qcom: ipq8074: correct PCIe QMP PHY output clock names
	arm64: dts: meson: remove CPU opps below 1GHz for G12A boards
	ARM: OMAP1: call platform_device_put() in error case in omap1_dm_timer_init()
	arm64: dts: mediatek: mt8192: Mark scp_adsp clock as broken
	ARM: bcm2835_defconfig: Enable the framebuffer
	ARM: s3c: fix s3c64xx_set_timer_source prototype
	arm64: dts: ti: k3-j7200: Fix wakeup pinmux range
	ARM: dts: exynos: correct wr-active property in Exynos3250 Rinato
	ARM: imx: Call ida_simple_remove() for ida_simple_get
	arm64: dts: amlogic: meson-gx: fix SCPI clock dvfs node name
	arm64: dts: amlogic: meson-axg: fix SCPI clock dvfs node name
	arm64: dts: amlogic: meson-gx: add missing SCPI sensors compatible
	arm64: dts: amlogic: meson-axg-jethome-jethub-j1xx: fix supply name of USB controller node
	arm64: dts: amlogic: meson-gxl-s905d-sml5442tw: drop invalid clock-names property
	arm64: dts: amlogic: meson-gx: add missing unit address to rng node name
	arm64: dts: amlogic: meson-gxl-s905w-jethome-jethub-j80: fix invalid rtc node name
	arm64: dts: amlogic: meson-axg-jethome-jethub-j1xx: fix invalid rtc node name
	arm64: dts: amlogic: meson-gxl: add missing unit address to eth-phy-mux node name
	arm64: dts: amlogic: meson-gx-libretech-pc: fix update button name
	arm64: dts: amlogic: meson-sm1-bananapi-m5: fix adc keys node names
	arm64: dts: amlogic: meson-gxl-s905d-phicomm-n1: fix led node name
	arm64: dts: amlogic: meson-gxbb-kii-pro: fix led node name
	arm64: dts: amlogic: meson-sm1-odroid-hc4: fix active fan thermal trip
	locking/rwsem: Disable preemption in all down_read*() and up_read() code paths
	arm64: dts: renesas: beacon-renesom: Fix gpio expander reference
	arm64: dts: meson: radxa-zero: allow usb otg mode
	arm64: dts: meson: bananapi-m5: switch VDDIO_C pin to OPEN_DRAIN
	ARM: dts: sun8i: nanopi-duo2: Fix regulator GPIO reference
	ublk_drv: remove nr_aborted_queues from ublk_device
	ublk_drv: don't probe partitions if the ubq daemon isn't trusted
	ARM: dts: imx7s: correct iomuxc gpr mux controller cells
	sbitmap: remove redundant check in __sbitmap_queue_get_batch
	sbitmap: Use single per-bitmap counting to wake up queued tags
	sbitmap: correct wake_batch recalculation to avoid potential IO hung
	arm64: dts: mt8195: Fix CPU map for single-cluster SoC
	arm64: dts: mt8192: Fix CPU map for single-cluster SoC
	arm64: dts: mt8186: Fix CPU map for single-cluster SoC
	arm64: dts: mediatek: mt7622: Add missing pwm-cells to pwm node
	arm64: dts: mediatek: mt8186: Fix watchdog compatible
	arm64: dts: mediatek: mt8195: Fix watchdog compatible
	arm64: dts: mediatek: mt7986: Fix watchdog compatible
	ARM: dts: stm32: Update part number NVMEM description on stm32mp131
	blk-mq: avoid sleep in blk_mq_alloc_request_hctx
	blk-mq: remove stale comment for blk_mq_sched_mark_restart_hctx
	blk-mq: wait on correct sbitmap_queue in blk_mq_mark_tag_wait
	blk-mq: Fix potential io hung for shared sbitmap per tagset
	blk-mq: correct stale comment of .get_budget
	arm64: dts: qcom: msm8996: support using GPLL0 as kryocc input
	arm64: dts: qcom: msm8996 switch from RPM_SMD_BB_CLK1 to RPM_SMD_XO_CLK_SRC
	arm64: dts: qcom: sm8350: drop incorrect cells from serial
	arm64: dts: qcom: sm8450: drop incorrect cells from serial
	arm64: dts: qcom: msm8992-lg-bullhead: Correct memory overlaps with the SMEM and MPSS memory regions
	arm64: dts: qcom: msm8953: correct TLMM gpio-ranges
	arm64: dts: qcom: msm8992-*: Fix up comments
	arm64: dts: qcom: msm8992-lg-bullhead: Enable regulators
	s390/dasd: Fix potential memleak in dasd_eckd_init()
	sched/rt: pick_next_rt_entity(): check list_entry
	perf/x86/intel/ds: Fix the conversion from TSC to perf time
	x86/perf/zhaoxin: Add stepping check for ZXC
	KEYS: asymmetric: Fix ECDSA use via keyctl uapi
	block: ublk: check IO buffer based on flag need_get_data
	arm64: dts: qcom: pmk8350: Specify PBS register for PON
	arm64: dts: qcom: pmk8350: Use the correct PON compatible
	erofs: relinquish volume with mutex held
	block: sync mixed merged request's failfast with 1st bio's
	block: Fix io statistics for cgroup in throttle path
	block: bio-integrity: Copy flags when bio_integrity_payload is cloned
	block: use proper return value from bio_failfast()
	wifi: mt76: mt7915: add missing of_node_put()
	wifi: mt76: mt7921s: fix slab-out-of-bounds access in sdio host
	wifi: mt76: mt7915: check return value before accessing free_block_num
	wifi: mt76: mt7915: drop always true condition of __mt7915_reg_addr()
	wifi: mt76: mt7915: fix unintended sign extension of mt7915_hw_queue_read()
	wifi: mt76: fix coverity uninit_use_in_call in mt76_connac2_reverse_frag0_hdr_trans()
	wifi: rsi: Fix memory leak in rsi_coex_attach()
	wifi: rtlwifi: rtl8821ae: don't call kfree_skb() under spin_lock_irqsave()
	wifi: rtlwifi: rtl8188ee: don't call kfree_skb() under spin_lock_irqsave()
	wifi: rtlwifi: rtl8723be: don't call kfree_skb() under spin_lock_irqsave()
	wifi: iwlegacy: common: don't call dev_kfree_skb() under spin_lock_irqsave()
	wifi: libertas: fix memory leak in lbs_init_adapter()
	wifi: rtl8xxxu: don't call dev_kfree_skb() under spin_lock_irqsave()
	wifi: rtw89: 8852c: rfk: correct DACK setting
	wifi: rtw89: 8852c: rfk: correct DPK settings
	wifi: rtlwifi: Fix global-out-of-bounds bug in _rtl8812ae_phy_set_txpower_limit()
	libbpf: Fix btf__align_of() by taking into account field offsets
	wifi: ipw2x00: don't call dev_kfree_skb() under spin_lock_irqsave()
	wifi: ipw2200: fix memory leak in ipw_wdev_init()
	wifi: wilc1000: fix potential memory leak in wilc_mac_xmit()
	wifi: wilc1000: add missing unregister_netdev() in wilc_netdev_ifc_init()
	wifi: brcmfmac: fix potential memory leak in brcmf_netdev_start_xmit()
	wifi: brcmfmac: unmap dma buffer in brcmf_msgbuf_alloc_pktid()
	wifi: libertas_tf: don't call kfree_skb() under spin_lock_irqsave()
	wifi: libertas: if_usb: don't call kfree_skb() under spin_lock_irqsave()
	wifi: libertas: main: don't call kfree_skb() under spin_lock_irqsave()
	wifi: libertas: cmdresp: don't call kfree_skb() under spin_lock_irqsave()
	wifi: wl3501_cs: don't call kfree_skb() under spin_lock_irqsave()
	libbpf: Fix invalid return address register in s390
	crypto: x86/ghash - fix unaligned access in ghash_setkey()
	ACPICA: Drop port I/O validation for some regions
	genirq: Fix the return type of kstat_cpu_irqs_sum()
	rcu-tasks: Improve comments explaining tasks_rcu_exit_srcu purpose
	rcu-tasks: Remove preemption disablement around srcu_read_[un]lock() calls
	rcu-tasks: Fix synchronize_rcu_tasks() VS zap_pid_ns_processes()
	lib/mpi: Fix buffer overrun when SG is too long
	crypto: ccp - Avoid page allocation failure warning for SEV_GET_ID2
	platform/chrome: cros_ec_typec: Update port DP VDO
	ACPICA: nsrepair: handle cases without a return value correctly
	selftests/xsk: print correct payload for packet dump
	selftests/xsk: print correct error codes when exiting
	arm64/cpufeature: Fix field sign for DIT hwcap detection
	kselftest/arm64: Fix syscall-abi for systems without 128 bit SME
	workqueue: Protects wq_unbound_cpumask with wq_pool_attach_mutex
	s390/early: fix sclp_early_sccb variable lifetime
	s390/vfio-ap: fix an error handling path in vfio_ap_mdev_probe_queue()
	x86/signal: Fix the value returned by strict_sas_size()
	thermal/drivers/tsens: Drop msm8976-specific defines
	thermal/drivers/tsens: Sort out msm8976 vs msm8956 data
	thermal/drivers/tsens: fix slope values for msm8939
	thermal/drivers/tsens: limit num_sensors to 9 for msm8939
	wifi: rtw89: fix potential leak in rtw89_append_probe_req_ie()
	wifi: rtw89: Add missing check for alloc_workqueue
	wifi: rtl8xxxu: Fix memory leaks with RTL8723BU, RTL8192EU
	wifi: orinoco: check return value of hermes_write_wordrec()
	thermal/drivers/imx_sc_thermal: Drop empty platform remove function
	thermal/drivers/imx_sc_thermal: Fix the loop condition
	wifi: ath9k: htc_hst: free skb in ath9k_htc_rx_msg() if there is no callback function
	wifi: ath9k: hif_usb: clean up skbs if ath9k_hif_usb_rx_stream() fails
	wifi: ath9k: Fix potential stack-out-of-bounds write in ath9k_wmi_rsp_callback()
	wifi: ath11k: Fix memory leak in ath11k_peer_rx_frag_setup
	wifi: cfg80211: Fix extended KCK key length check in nl80211_set_rekey_data()
	ACPI: battery: Fix missing NUL-termination with large strings
	selftests/bpf: Fix build errors if CONFIG_NF_CONNTRACK=m
	crypto: ccp - Failure on re-initialization due to duplicate sysfs filename
	crypto: essiv - Handle EBUSY correctly
	crypto: seqiv - Handle EBUSY correctly
	powercap: fix possible name leak in powercap_register_zone()
	x86/microcode: Add a parameter to microcode_check() to store CPU capabilities
	x86/microcode: Check CPU capabilities after late microcode update correctly
	x86/microcode: Adjust late loading result reporting message
	selftests/bpf: Use consistent build-id type for liburandom_read.so
	selftests/bpf: Fix vmtest static compilation error
	crypto: xts - Handle EBUSY correctly
	leds: led-class: Add missing put_device() to led_put()
	s390/bpf: Add expoline to tail calls
	wifi: iwlwifi: mei: fix compilation errors in rfkill()
	kselftest/arm64: Fix enumeration of systems without 128 bit SME
	can: rcar_canfd: Fix R-Car V3U GAFLCFG field accesses
	selftests/bpf: Initialize tc in xdp_synproxy
	crypto: ccp - Flush the SEV-ES TMR memory before giving it to firmware
	bpftool: profile online CPUs instead of possible
	wifi: mt76: mt7915: call mt7915_mcu_set_thermal_throttling() only after init_work
	wifi: mt76: mt7915: fix memory leak in mt7915_mcu_exit
	wifi: mt76: mt7915: fix WED TxS reporting
	wifi: mt76: add memory barrier to SDIO queue kick
	wifi: mt76: mt7921: fix error code of return in mt7921_acpi_read
	net/mlx5: Enhance debug print in page allocation failure
	irqchip: Fix refcount leak in platform_irqchip_probe
	irqchip/alpine-msi: Fix refcount leak in alpine_msix_init_domains
	irqchip/irq-mvebu-gicp: Fix refcount leak in mvebu_gicp_probe
	irqchip/ti-sci: Fix refcount leak in ti_sci_intr_irq_domain_probe
	s390/mem_detect: fix detect_memory() error handling
	s390/vmem: fix empty page tables cleanup under KASAN
	s390/boot: cleanup decompressor header files
	s390/mem_detect: rely on diag260() if sclp_early_get_memsize() fails
	s390/boot: fix mem_detect extended area allocation
	net: add sock_init_data_uid()
	tun: tun_chr_open(): correctly initialize socket uid
	tap: tap_open(): correctly initialize socket uid
	OPP: fix error checking in opp_migrate_dentry()
	cpufreq: davinci: Fix clk use after free
	Bluetooth: hci_conn: Refactor hci_bind_bis() since it always succeeds
	Bluetooth: L2CAP: Fix potential user-after-free
	Bluetooth: hci_qca: get wakeup status from serdev device handle
	net: ipa: generic command param fix
	s390: vfio-ap: tighten the NIB validity check
	s390/ap: fix status returned by ap_aqic()
	s390/ap: fix status returned by ap_qact()
	libbpf: Fix alen calculation in libbpf_nla_dump_errormsg()
	xen/grant-dma-iommu: Implement a dummy probe_device() callback
	rds: rds_rm_zerocopy_callback() correct order for list_add_tail()
	crypto: rsa-pkcs1pad - Use akcipher_request_complete
	m68k: /proc/hardware should depend on PROC_FS
	RISC-V: time: initialize hrtimer based broadcast clock event device
	clocksource/drivers/riscv: Patch riscv_clock_next_event() jump before first use
	wifi: iwl3945: Add missing check for create_singlethread_workqueue
	wifi: iwl4965: Add missing check for create_singlethread_workqueue()
	wifi: mwifiex: fix loop iterator in mwifiex_update_ampdu_txwinsize()
	selftests/bpf: Fix out-of-srctree build
	ACPI: resource: Add IRQ overrides for MAINGEAR Vector Pro 2 models
	ACPI: resource: Do IRQ override on all TongFang GMxRGxx
	crypto: octeontx2 - Fix objects shared between several modules
	crypto: crypto4xx - Call dma_unmap_page when done
	wifi: mac80211: move color collision detection report in a delayed work
	wifi: mac80211: make rate u32 in sta_set_rate_info_rx()
	wifi: mac80211: fix non-MLO station association
	wifi: mac80211: Don't translate MLD addresses for multicast
	wifi: mac80211: avoid u32_encode_bits() warning
	wifi: mac80211: fix off-by-one link setting
	tools/lib/thermal: Fix thermal_sampling_exit()
	thermal/drivers/hisi: Drop second sensor hi3660
	selftests/bpf: Fix map_kptr test.
	wifi: mac80211: pass 'sta' to ieee80211_rx_data_set_sta()
	bpf: Zeroing allocated object from slab in bpf memory allocator
	selftests/bpf: Fix xdp_do_redirect on s390x
	can: esd_usb: Move mislocated storage of SJA1000_ECC_SEG bits in case of a bus error
	can: esd_usb: Make use of can_change_state() and relocate checking skb for NULL
	xsk: check IFF_UP earlier in Tx path
	LoongArch, bpf: Use 4 instructions for function address in JIT
	bpf: Fix global subprog context argument resolution logic
	irqchip/irq-brcmstb-l2: Set IRQ_LEVEL for level triggered interrupts
	irqchip/irq-bcm7120-l2: Set IRQ_LEVEL for level triggered interrupts
	net/smc: fix potential panic dues to unprotected smc_llc_srv_add_link()
	net/smc: fix application data exception
	selftests/net: Interpret UDP_GRO cmsg data as an int value
	l2tp: Avoid possible recursive deadlock in l2tp_tunnel_register()
	net: bcmgenet: fix MoCA LED control
	net: lan966x: Fix possible deadlock inside PTP
	net/mlx4_en: Introduce flexible array to silence overflow warning
	selftest: fib_tests: Always cleanup before exit
	sefltests: netdevsim: wait for devlink instance after netns removal
	drm: Fix potential null-ptr-deref due to drmm_mode_config_init()
	drm/fourcc: Add missing big-endian XRGB1555 and RGB565 formats
	drm/bridge: ti-sn65dsi83: Fix delay after reset deassert to match spec
	drm: mxsfb: DRM_IMX_LCDIF should depend on ARCH_MXC
	drm: mxsfb: DRM_MXSFB should depend on ARCH_MXS || ARCH_MXC
	drm/bridge: megachips: Fix error handling in i2c_register_driver()
	drm/vkms: Fix memory leak in vkms_init()
	drm/vkms: Fix null-ptr-deref in vkms_release()
	drm/vc4: dpi: Fix format mapping for RGB565
	drm: tidss: Fix pixel format definition
	gpu: ipu-v3: common: Add of_node_put() for reference returned by of_graph_get_port_by_id()
	drm/vc4: drop all currently held locks if deadlock happens
	hwmon: (ftsteutates) Fix scaling of measurements
	drm/msm/dpu: check for null return of devm_kzalloc() in dpu_writeback_init()
	drm/msm/hdmi: Add missing check for alloc_ordered_workqueue
	pinctrl: qcom: pinctrl-msm8976: Correct function names for wcss pins
	pinctrl: stm32: Fix refcount leak in stm32_pctrl_get_irq_domain
	pinctrl: rockchip: Fix refcount leak in rockchip_pinctrl_parse_groups
	drm/vc4: hvs: Set AXI panic modes
	drm/vc4: hvs: SCALER_DISPBKGND_AUTOHS is only valid on HVS4
	drm/vc4: hvs: Correct interrupt masking bit assignment for HVS5
	drm/vc4: hvs: Fix colour order for xRGB1555 on HVS5
	drm/vc4: hdmi: Correct interlaced timings again
	drm/msm: clean event_thread->worker in case of an error
	drm/panel-edp: fix name for IVO product id 854b
	scsi: qla2xxx: Fix exchange oversubscription
	scsi: qla2xxx: Fix exchange oversubscription for management commands
	scsi: qla2xxx: edif: Fix clang warning
	ASoC: fsl_sai: initialize is_dsp_mode flag
	drm/bridge: tc358767: Set default CLRSIPO count
	drm/msm/adreno: Fix null ptr access in adreno_gpu_cleanup()
	ALSA: hda/ca0132: minor fix for allocation size
	drm/amdgpu: Use the sched from entity for amdgpu_cs trace
	drm/msm/gem: Add check for kmalloc
	drm/msm/dpu: Disallow unallocated resources to be returned
	drm/bridge: lt9611: fix sleep mode setup
	drm/bridge: lt9611: fix HPD reenablement
	drm/bridge: lt9611: fix polarity programming
	drm/bridge: lt9611: fix programming of video modes
	drm/bridge: lt9611: fix clock calculation
	drm/bridge: lt9611: pass a pointer to the of node
	regulator: tps65219: use IS_ERR() to detect an error pointer
	drm/mipi-dsi: Fix byte order of 16-bit DCS set/get brightness
	drm: exynos: dsi: Fix MIPI_DSI*_NO_* mode flags
	drm/msm/dsi: Allow 2 CTRLs on v2.5.0
	scsi: ufs: exynos: Fix DMA alignment for PAGE_SIZE != 4096
	drm/msm/dpu: sc7180: add missing WB2 clock control
	drm/msm: use strscpy instead of strncpy
	drm/msm/dpu: Add check for cstate
	drm/msm/dpu: Add check for pstates
	drm/msm/mdp5: Add check for kzalloc
	habanalabs: bugs fixes in timestamps buff alloc
	pinctrl: bcm2835: Remove of_node_put() in bcm2835_of_gpio_ranges_fallback()
	pinctrl: mediatek: Initialize variable pullen and pullup to zero
	pinctrl: mediatek: Initialize variable *buf to zero
	gpu: host1x: Fix mask for syncpoint increment register
	gpu: host1x: Don't skip assigning syncpoints to channels
	drm/tegra: firewall: Check for is_addr_reg existence in IMM check
	pinctrl: renesas: rzg2l: Fix configuring the GPIO pins as interrupts
	drm/msm/dpu: set pdpu->is_rt_pipe early in dpu_plane_sspp_atomic_update()
	drm/mediatek: dsi: Reduce the time of dsi from LP11 to sending cmd
	drm/mediatek: Use NULL instead of 0 for NULL pointer
	drm/mediatek: Drop unbalanced obj unref
	drm/mediatek: mtk_drm_crtc: Add checks for devm_kcalloc
	drm/mediatek: Clean dangling pointer on bind error path
	ASoC: soc-compress.c: fixup private_data on snd_soc_new_compress()
	dt-bindings: display: mediatek: Fix the fallback for mediatek,mt8186-disp-ccorr
	gpio: vf610: connect GPIO label to dev name
	ASoC: topology: Properly access value coming from topology file
	spi: dw_bt1: fix MUX_MMIO dependencies
	ASoC: mchp-spdifrx: fix controls which rely on rsr register
	ASoC: mchp-spdifrx: fix return value in case completion times out
	ASoC: mchp-spdifrx: fix controls that works with completion mechanism
	ASoC: mchp-spdifrx: disable all interrupts in mchp_spdifrx_dai_remove()
	dm: improve shrinker debug names
	regmap: apply reg_base and reg_downshift for single register ops
	ASoC: rsnd: fixup #endif position
	ASoC: mchp-spdifrx: Fix uninitialized use of mr in mchp_spdifrx_hw_params()
	ASoC: dt-bindings: meson: fix gx-card codec node regex
	regulator: tps65219: use generic set_bypass()
	hwmon: (asus-ec-sensors) add missing mutex path
	hwmon: (ltc2945) Handle error case in ltc2945_value_store
	ALSA: hda: Fix the control element identification for multiple codecs
	drm/amdgpu: fix enum odm_combine_mode mismatch
	scsi: mpt3sas: Fix a memory leak
	scsi: aic94xx: Add missing check for dma_map_single()
	HID: multitouch: Add quirks for flipped axes
	HID: retain initial quirks set up when creating HID devices
	ASoC: qcom: q6apm-lpass-dai: unprepare stream if its already prepared
	ASoC: qcom: q6apm-dai: fix race condition while updating the position pointer
	ASoC: qcom: q6apm-dai: Add SNDRV_PCM_INFO_BATCH flag
	ASoC: codecs: lpass: register mclk after runtime pm
	ASoC: codecs: lpass: fix incorrect mclk rate
	drm/amd/display: don't call dc_interrupt_set() for disabled crtcs
	HID: logitech-hidpp: Hard-code HID++ 1.0 fast scroll support
	spi: bcm63xx-hsspi: Fix multi-bit mode setting
	hwmon: (mlxreg-fan) Return zero speed for broken fan
	ASoC: tlv320adcx140: fix 'ti,gpio-config' DT property init
	dm: remove flush_scheduled_work() during local_exit()
	nfs4trace: fix state manager flag printing
	NFS: fix disabling of swap
	spi: synquacer: Fix timeout handling in synquacer_spi_transfer_one()
	ASoC: soc-dapm.h: fixup warning struct snd_pcm_substream not declared
	HID: bigben: use spinlock to protect concurrent accesses
	HID: bigben_worker() remove unneeded check on report_field
	HID: bigben: use spinlock to safely schedule workers
	hid: bigben_probe(): validate report count
	ALSA: hda/hdmi: Register with vga_switcheroo on Dual GPU Macbooks
	drm/shmem-helper: Fix locking for drm_gem_shmem_get_pages_sgt()
	NFSD: enhance inter-server copy cleanup
	NFSD: fix leaked reference count of nfsd4_ssc_umount_item
	nfsd: fix race to check ls_layouts
	nfsd: clean up potential nfsd_file refcount leaks in COPY codepath
	NFSD: fix problems with cleanup on errors in nfsd4_copy
	nfsd: fix courtesy client with deny mode handling in nfs4_upgrade_open
	nfsd: don't fsync nfsd_files on last close
	NFSD: copy the whole verifier in nfsd_copy_write_verifier
	cifs: Fix lost destroy smbd connection when MR allocate failed
	cifs: Fix warning and UAF when destroy the MR list
	cifs: use tcon allocation functions even for dummy tcon
	gfs2: jdata writepage fix
	perf llvm: Fix inadvertent file creation
	leds: led-core: Fix refcount leak in of_led_get()
	leds: is31fl319x: Wrap mutex_destroy() for devm_add_action_or_rest()
	leds: simatic-ipc-leds-gpio: Make sure we have the GPIO providing driver
	tools/tracing/rtla: osnoise_hist: use total duration for average calculation
	perf inject: Use perf_data__read() for auxtrace
	perf intel-pt: Do not try to queue auxtrace data on pipe
	perf test bpf: Skip test if kernel-debuginfo is not present
	perf tools: Fix auto-complete on aarch64
	sparc: allow PM configs for sparc32 COMPILE_TEST
	selftests: find echo binary to use -ne options
	selftests/ftrace: Fix bash specific "==" operator
	selftests: use printf instead of echo -ne
	perf record: Fix segfault with --overwrite and --max-size
	printf: fix errname.c list
	perf tests stat_all_metrics: Change true workload to sleep workload for system wide check
	objtool: add UACCESS exceptions for __tsan_volatile_read/write
	mfd: cs5535: Don't build on UML
	mfd: pcf50633-adc: Fix potential memleak in pcf50633_adc_async_read()
	dmaengine: idxd: Set traffic class values in GRPCFG on DSA 2.0
	RDMA/erdma: Fix refcount leak in erdma_mmap
	dmaengine: HISI_DMA should depend on ARCH_HISI
	RDMA/hns: Fix refcount leak in hns_roce_mmap
	iio: light: tsl2563: Do not hardcode interrupt trigger type
	usb: gadget: fusb300_udc: free irq on the error path in fusb300_probe()
	i2c: designware: fix i2c_dw_clk_rate() return size to be u32
	soundwire: cadence: Don't overflow the command FIFOs
	driver core: fix potential null-ptr-deref in device_add()
	kobject: modify kobject_get_path() to take a const *
	kobject: Fix slab-out-of-bounds in fill_kobj_path()
	alpha/boot/tools/objstrip: fix the check for ELF header
	media: uvcvideo: Check for INACTIVE in uvc_ctrl_is_accessible()
	media: uvcvideo: Implement mask for V4L2_CTRL_TYPE_MENU
	media: uvcvideo: Refactor uvc_ctrl_mappings_uvcXX
	media: uvcvideo: Refactor power_line_frequency_controls_limited
	coresight: etm4x: Fix accesses to TRCSEQRSTEVR and TRCSEQSTR
	coresight: cti: Prevent negative values of enable count
	coresight: cti: Add PM runtime call in enable_store
	usb: typec: intel_pmc_mux: Don't leak the ACPI device reference count
	PCI/IOV: Enlarge virtfn sysfs name buffer
	PCI: switchtec: Return -EFAULT for copy_to_user() errors
	PCI: endpoint: pci-epf-vntb: Clean up kernel_doc warning
	PCI: endpoint: pci-epf-vntb: Add epf_ntb_mw_bar_clear() num_mws kernel-doc
	hwtracing: hisi_ptt: Only add the supported devices to the filters list
	tty: serial: fsl_lpuart: disable Rx/Tx DMA in lpuart32_shutdown()
	tty: serial: fsl_lpuart: clear LPUART Status Register in lpuart32_shutdown()
	serial: tegra: Add missing clk_disable_unprepare() in tegra_uart_hw_init()
	Revert "char: pcmcia: cm4000_cs: Replace mdelay with usleep_range in set_protocol"
	eeprom: idt_89hpesx: Fix error handling in idt_init()
	applicom: Fix PCI device refcount leak in applicom_init()
	firmware: stratix10-svc: add missing gen_pool_destroy() in stratix10_svc_drv_probe()
	firmware: stratix10-svc: fix error handle while alloc/add device failed
	VMCI: check context->notify_page after call to get_user_pages_fast() to avoid GPF
	mei: pxp: Use correct macros to initialize uuid_le
	misc/mei/hdcp: Use correct macros to initialize uuid_le
	misc: fastrpc: Fix an error handling path in fastrpc_rpmsg_probe()
	driver core: fix resource leak in device_add()
	driver core: location: Free struct acpi_pld_info *pld before return false
	drivers: base: transport_class: fix possible memory leak
	drivers: base: transport_class: fix resource leak when transport_add_device() fails
	firmware: dmi-sysfs: Fix null-ptr-deref in dmi_sysfs_register_handle
	fotg210-udc: Add missing completion handler
	dmaengine: dw-edma: Fix missing src/dst address of interleaved xfers
	fpga: microchip-spi: move SPI I/O buffers out of stack
	fpga: microchip-spi: rewrite status polling in a time measurable way
	usb: early: xhci-dbc: Fix a potential out-of-bound memory access
	tty: serial: fsl_lpuart: Fix the wrong RXWATER setting for rx dma case
	RDMA/cxgb4: add null-ptr-check after ip_dev_find()
	usb: musb: mediatek: don't unregister something that wasn't registered
	usb: gadget: configfs: Restrict symlink creation is UDC already binded
	phy: mediatek: remove temporary variable @mask_
	PCI: mt7621: Delay phy ports initialization
	iommu: dart: Add suspend/resume support
	iommu: dart: Support >64 stream IDs
	iommu/dart: Fix apple_dart_device_group for PCI groups
	iommu/vt-d: Set No Execute Enable bit in PASID table entry
	power: supply: remove faulty cooling logic
	RDMA/cxgb4: Fix potential null-ptr-deref in pass_establish()
	usb: max-3421: Fix setting of I/O pins
	RDMA/irdma: Cap MSIX used to online CPUs + 1
	serial: fsl_lpuart: fix RS485 RTS polariy inverse issue
	tty: serial: imx: Handle RS485 DE signal active high
	tty: serial: imx: disable Ageing Timer interrupt request irq
	driver core: fw_devlink: Add DL_FLAG_CYCLE support to device links
	driver core: fw_devlink: Don't purge child fwnode's consumer links
	driver core: fw_devlink: Allow marking a fwnode link as being part of a cycle
	driver core: fw_devlink: Consolidate device link flag computation
	driver core: fw_devlink: Improve check for fwnode with no device/driver
	driver core: fw_devlink: Make cycle detection more robust
	mtd: mtdpart: Don't create platform device that'll never probe
	usb: host: fsl-mph-dr-of: reuse device_set_of_node_from_dev
	dmaengine: dw-edma: Fix readq_ch() return value truncation
	PCI: Fix dropping valid root bus resources with .end = zero
	phy: rockchip-typec: fix tcphy_get_mode error case
	PCI: qcom: Fix host-init error handling
	iw_cxgb4: Fix potential NULL dereference in c4iw_fill_res_cm_id_entry()
	iommu: Fix error unwind in iommu_group_alloc()
	iommu/amd: Do not identity map v2 capable device when snp is enabled
	dmaengine: sf-pdma: pdma_desc memory leak fix
	dmaengine: dw-axi-dmac: Do not dereference NULL structure
	dmaengine: ptdma: check for null desc before calling pt_cmd_callback
	iommu/vt-d: Fix error handling in sva enable/disable paths
	iommu/vt-d: Allow to use flush-queue when first level is default
	RDMA/rxe: cleanup some error handling in rxe_verbs.c
	RDMA/rxe: Fix missing memory barriers in rxe_queue.h
	IB/hfi1: Fix math bugs in hfi1_can_pin_pages()
	IB/hfi1: Fix sdma.h tx->num_descs off-by-one errors
	Revert "remoteproc: qcom_q6v5_mss: map/unmap metadata region before/after use"
	remoteproc: qcom_q6v5_mss: Use a carveout to authenticate modem headers
	media: ti: cal: fix possible memory leak in cal_ctx_create()
	media: platform: ti: Add missing check for devm_regulator_get
	media: imx: imx7-media-csi: fix missing clk_disable_unprepare() in imx7_csi_init()
	powerpc: Remove linker flag from KBUILD_AFLAGS
	s390/vdso: Drop '-shared' from KBUILD_CFLAGS_64
	builddeb: clean generated package content
	media: max9286: Fix memleak in max9286_v4l2_register()
	media: ov2740: Fix memleak in ov2740_init_controls()
	media: ov5675: Fix memleak in ov5675_init_controls()
	media: ov5640: Fix soft reset sequence and timings
	media: ov5640: Handle delays when no reset_gpio set
	media: mc: Get media_device directly from pad
	media: i2c: ov772x: Fix memleak in ov772x_probe()
	media: i2c: imx219: Split common registers from mode tables
	media: i2c: imx219: Fix binning for RAW8 capture
	media: platform: mtk-mdp3: Fix return value check in mdp_probe()
	media: camss: csiphy-3ph: avoid undefined behavior
	media: platform: mtk-mdp3: remove unused VIDEO_MEDIATEK_VPU config
	media: platform: mtk-mdp3: fix Kconfig dependencies
	media: v4l2-jpeg: correct the skip count in jpeg_parse_app14_data
	media: v4l2-jpeg: ignore the unknown APP14 marker
	media: hantro: Fix JPEG encoder ENUM_FRMSIZE on RK3399
	media: imx-jpeg: Apply clk_bulk api instead of operating specific clk
	media: amphion: correct the unspecified color space
	media: drivers/media/v4l2-core/v4l2-h264 : add detection of null pointers
	media: rc: Fix use-after-free bugs caused by ene_tx_irqsim()
	media: atomisp: Only set default_run_mode on first open of a stream/asd
	media: i2c: ov7670: 0 instead of -EINVAL was returned
	media: usb: siano: Fix use after free bugs caused by do_submit_urb
	media: saa7134: Use video_unregister_device for radio_dev
	rpmsg: glink: Avoid infinite loop on intent for missing channel
	rpmsg: glink: Release driver_override
	ARM: OMAP2+: omap4-common: Fix refcount leak bug
	arm64: dts: qcom: msm8996: Add additional A2NoC clocks
	udf: Define EFSCORRUPTED error code
	context_tracking: Fix noinstr vs KASAN
	exit: Detect and fix irq disabled state in oops
	ARM: dts: exynos: Use Exynos5420 compatible for the MIPI video phy
	fs: Use CHECK_DATA_CORRUPTION() when kernel bugs are detected
	blk-iocost: fix divide by 0 error in calc_lcoefs()
	blk-cgroup: dropping parent refcount after pd_free_fn() is done
	blk-cgroup: synchronize pd_free_fn() from blkg_free_workfn() and blkcg_deactivate_policy()
	trace/blktrace: fix memory leak with using debugfs_lookup()
	btrfs: scrub: improve tree block error reporting
	arm64: zynqmp: Enable hs termination flag for USB dwc3 controller
	cpuidle, intel_idle: Fix CPUIDLE_FLAG_INIT_XSTATE
	x86/fpu: Don't set TIF_NEED_FPU_LOAD for PF_IO_WORKER threads
	cpuidle: drivers: firmware: psci: Dont instrument suspend code
	cpuidle: lib/bug: Disable rcu_is_watching() during WARN/BUG
	perf/x86/intel/uncore: Add Meteor Lake support
	wifi: ath9k: Fix use-after-free in ath9k_hif_usb_disconnect()
	wifi: ath11k: fix monitor mode bringup crash
	wifi: brcmfmac: Fix potential stack-out-of-bounds in brcmf_c_preinit_dcmds()
	rcu: Make RCU_LOCKDEP_WARN() avoid early lockdep checks
	rcu: Suppress smp_processor_id() complaint in synchronize_rcu_expedited_wait()
	srcu: Delegate work to the boot cpu if using SRCU_SIZE_SMALL
	rcu-tasks: Make rude RCU-Tasks work well with CPU hotplug
	rcu-tasks: Handle queue-shrink/callback-enqueue race condition
	wifi: ath11k: debugfs: fix to work with multiple PCI devices
	thermal: intel: Fix unsigned comparison with less than zero
	timers: Prevent union confusion from unexpected restart_syscall()
	x86/bugs: Reset speculation control settings on init
	bpftool: Always disable stack protection for BPF objects
	wifi: brcmfmac: ensure CLM version is null-terminated to prevent stack-out-of-bounds
	wifi: mt7601u: fix an integer underflow
	inet: fix fast path in __inet_hash_connect()
	ice: restrict PTP HW clock freq adjustments to 100, 000, 000 PPB
	ice: add missing checks for PF vsi type
	ACPI: Don't build ACPICA with '-Os'
	bpf, docs: Fix modulo zero, division by zero, overflow, and underflow
	thermal: intel: intel_pch: Add support for Wellsburg PCH
	clocksource: Suspend the watchdog temporarily when high read latency detected
	crypto: hisilicon: Wipe entire pool on error
	net: bcmgenet: Add a check for oversized packets
	m68k: Check syscall_trace_enter() return code
	s390/mm,ptdump: avoid Kasan vs Memcpy Real markers swapping
	netfilter: nf_tables: NULL pointer dereference in nf_tables_updobj()
	can: isotp: check CAN address family in isotp_bind()
	gcc-plugins: drop -std=gnu++11 to fix GCC 13 build
	tools/power/x86/intel-speed-select: Add Emerald Rapid quirk
	wifi: mt76: dma: free rx_head in mt76_dma_rx_cleanup
	ACPI: video: Fix Lenovo Ideapad Z570 DMI match
	net/mlx5: fw_tracer: Fix debug print
	coda: Avoid partial allocation of sig_inputArgs
	uaccess: Add minimum bounds check on kernel buffer size
	s390/idle: mark arch_cpu_idle() noinstr
	time/debug: Fix memory leak with using debugfs_lookup()
	PM: domains: fix memory leak with using debugfs_lookup()
	PM: EM: fix memory leak with using debugfs_lookup()
	Bluetooth: Fix issue with Actions Semi ATS2851 based devices
	Bluetooth: btusb: Add new PID/VID 0489:e0f2 for MT7921
	Bluetooth: btusb: Add VID:PID 13d3:3529 for Realtek RTL8821CE
	wifi: rtw89: debug: avoid invalid access on RTW89_DBG_SEL_MAC_30
	hv_netvsc: Check status in SEND_RNDIS_PKT completion message
	s390/kfence: fix page fault reporting
	devlink: Fix TP_STRUCT_entry in trace of devlink health report
	scm: add user copy checks to put_cmsg()
	drm: panel-orientation-quirks: Add quirk for Lenovo Yoga Tab 3 X90F
	drm: panel-orientation-quirks: Add quirk for DynaBook K50
	drm/amd/display: Reduce expected sdp bandwidth for dcn321
	drm/amd/display: Revert Reduce delay when sink device not able to ACK 00340h write
	drm/amd/display: Fix potential null-deref in dm_resume
	drm/omap: dsi: Fix excessive stack usage
	HID: Add Mapping for System Microphone Mute
	drm/tiny: ili9486: Do not assume 8-bit only SPI controllers
	drm/amd/display: Defer DIG FIFO disable after VID stream enable
	drm/radeon: free iio for atombios when driver shutdown
	drm/amd: Avoid BUG() for case of SRIOV missing IP version
	drm/amdkfd: Page aligned memory reserve size
	scsi: lpfc: Fix use-after-free KFENCE violation during sysfs firmware write
	Revert "fbcon: don't lose the console font across generic->chip driver switch"
	drm/amd: Avoid ASSERT for some message failures
	drm: amd: display: Fix memory leakage
	drm/amd/display: fix mapping to non-allocated address
	HID: uclogic: Add frame type quirk
	HID: uclogic: Add battery quirk
	HID: uclogic: Add support for XP-PEN Deco Pro SW
	HID: uclogic: Add support for XP-PEN Deco Pro MW
	drm/msm/dsi: Add missing check for alloc_ordered_workqueue
	drm: rcar-du: Add quirk for H3 ES1.x pclk workaround
	drm: rcar-du: Fix setting a reserved bit in DPLLCR
	drm/drm_print: correct format problem
	drm/amd/display: Set hvm_enabled flag for S/G mode
	habanalabs: extend fatal messages to contain PCI info
	habanalabs: fix bug in timestamps registration code
	docs/scripts/gdb: add necessary make scripts_gdb step
	drm/msm/dpu: Add DSC hardware blocks to register snapshot
	ASoC: soc-compress: Reposition and add pcm_mutex
	ASoC: kirkwood: Iterate over array indexes instead of using pointer math
	regulator: max77802: Bounds check regulator id against opmode
	regulator: s5m8767: Bounds check id indexing into arrays
	Revert "drm/amdgpu: TA unload messages are not actually sent to psp when amdgpu is uninstalled"
	drm/amd/display: fix FCLK pstate change underflow
	gfs2: Improve gfs2_make_fs_rw error handling
	hwmon: (coretemp) Simplify platform device handling
	hwmon: (nct6775) Directly call ASUS ACPI WMI method
	hwmon: (nct6775) B650/B660/X670 ASUS boards support
	pinctrl: at91: use devm_kasprintf() to avoid potential leaks
	drm/amd/display: Do not commit pipe when updating DRR
	scsi: snic: Fix memory leak with using debugfs_lookup()
	scsi: ufs: core: Fix device management cmd timeout flow
	HID: logitech-hidpp: Don't restart communication if not necessary
	drm/amd/display: Enable P-state validation checks for DCN314
	drm: panel-orientation-quirks: Add quirk for Lenovo IdeaPad Duet 3 10IGL5
	drm/amd/display: Disable HUBP/DPP PG on DCN314 for now
	dm thin: add cond_resched() to various workqueue loops
	dm cache: add cond_resched() to various workqueue loops
	nfsd: zero out pointers after putting nfsd_files on COPY setup error
	nfsd: don't hand out delegation on setuid files being opened for write
	cifs: prevent data race in smb2_reconnect()
	drm/shmem-helper: Revert accidental non-GPL export
	driver core: fw_devlink: Avoid spurious error message
	wifi: rtl8xxxu: fixing transmisison failure for rtl8192eu
	scsi: mpt3sas: Remove usage of dma_get_required_mask() API
	firmware: coreboot: framebuffer: Ignore reserved pixel color bits
	block: don't allow multiple bios for IOCB_NOWAIT issue
	block: clear bio->bi_bdev when putting a bio back in the cache
	block: be a bit more careful in checking for NULL bdev while polling
	rtc: pm8xxx: fix set-alarm race
	ipmi: ipmb: Fix the MODULE_PARM_DESC associated to 'retry_time_ms'
	ipmi:ssif: resend_msg() cannot fail
	ipmi_ssif: Rename idle state and check
	io_uring: Replace 0-length array with flexible array
	io_uring: use user visible tail in io_uring_poll()
	io_uring: handle TIF_NOTIFY_RESUME when checking for task_work
	io_uring: add a conditional reschedule to the IOPOLL cancelation loop
	io_uring: add reschedule point to handle_tw_list()
	io_uring/rsrc: disallow multi-source reg buffers
	io_uring: remove MSG_NOSIGNAL from recvmsg
	io_uring: fix fget leak when fs don't support nowait buffered read
	s390/extmem: return correct segment type in __segment_load()
	s390: discard .interp section
	s390/kprobes: fix irq mask clobbering on kprobe reenter from post_handler
	s390/kprobes: fix current_kprobe never cleared after kprobes reenter
	KVM: s390: disable migration mode when dirty tracking is disabled
	cifs: Fix uninitialized memory read in smb3_qfs_tcon()
	cifs: Fix uninitialized memory reads for oparms.mode
	cifs: fix mount on old smb servers
	cifs: introduce cifs_io_parms in smb2_async_writev()
	cifs: split out smb3_use_rdma_offload() helper
	cifs: don't try to use rdma offload on encrypted connections
	cifs: Check the lease context if we actually got a lease
	cifs: return a single-use cfid if we did not get a lease
	scsi: mpi3mr: Fix missing mrioc->evtack_cmds initialization
	scsi: mpi3mr: Fix issues in mpi3mr_get_all_tgt_info()
	scsi: mpi3mr: Remove unnecessary memcpy() to alltgt_info->dmi
	btrfs: hold block group refcount during async discard
	locking/rwsem: Prevent non-first waiter from spinning in down_write() slowpath
	ksmbd: fix wrong data area length for smb2 lock request
	ksmbd: do not allow the actual frame length to be smaller than the rfc1002 length
	ksmbd: fix possible memory leak in smb2_lock()
	torture: Fix hang during kthread shutdown phase
	ARM: dts: exynos: correct HDMI phy compatible in Exynos4
	io_uring: mark task TASK_RUNNING before handling resume/task work
	hfs: fix missing hfs_bnode_get() in __hfs_bnode_create
	fs: hfsplus: fix UAF issue in hfsplus_put_super
	exfat: fix reporting fs error when reading dir beyond EOF
	exfat: fix unexpected EOF while reading dir
	exfat: redefine DIR_DELETED as the bad cluster number
	exfat: fix inode->i_blocks for non-512 byte sector size device
	fs: dlm: don't set stop rx flag after node reset
	fs: dlm: move sending fin message into state change handling
	fs: dlm: send FIN ack back in right cases
	f2fs: fix information leak in f2fs_move_inline_dirents()
	f2fs: retry to update the inode page given data corruption
	f2fs: fix cgroup writeback accounting with fs-layer encryption
	f2fs: fix kernel crash due to null io->bio
	ocfs2: fix defrag path triggering jbd2 ASSERT
	ocfs2: fix non-auto defrag path not working issue
	fs/cramfs/inode.c: initialize file_ra_state
	selftests/landlock: Skip overlayfs tests when not supported
	selftests/landlock: Test ptrace as much as possible with Yama
	udf: Truncate added extents on failed expansion
	udf: Do not bother merging very long extents
	udf: Do not update file length for failed writes to inline files
	udf: Preserve link count of system files
	udf: Detect system inodes linked into directory hierarchy
	udf: Fix file corruption when appending just after end of preallocated extent
	md: don't update recovery_cp when curr_resync is ACTIVE
	RDMA/siw: Fix user page pinning accounting
	KVM: Destroy target device if coalesced MMIO unregistration fails
	KVM: VMX: Fix crash due to uninitialized current_vmcs
	KVM: Register /dev/kvm as the _very_ last thing during initialization
	KVM: x86: Purge "highest ISR" cache when updating APICv state
	KVM: x86: Blindly get current x2APIC reg value on "nodecode write" traps
	KVM: x86: Don't inhibit APICv/AVIC on xAPIC ID "change" if APIC is disabled
	KVM: x86: Don't inhibit APICv/AVIC if xAPIC ID mismatch is due to 32-bit ID
	KVM: SVM: Flush the "current" TLB when activating AVIC
	KVM: SVM: Process ICR on AVIC IPI delivery failure due to invalid target
	KVM: SVM: Don't put/load AVIC when setting virtual APIC mode
	KVM: x86: Inject #GP if WRMSR sets reserved bits in APIC Self-IPI
	KVM: x86: Inject #GP on x2APIC WRMSR that sets reserved bits 63:32
	KVM: SVM: Fix potential overflow in SEV's send|receive_update_data()
	KVM: SVM: hyper-v: placate modpost section mismatch error
	selftests: x86: Fix incorrect kernel headers search path
	x86/virt: Force GIF=1 prior to disabling SVM (for reboot flows)
	x86/crash: Disable virt in core NMI crash handler to avoid double shootdown
	x86/reboot: Disable virtualization in an emergency if SVM is supported
	x86/reboot: Disable SVM, not just VMX, when stopping CPUs
	x86/kprobes: Fix __recover_optprobed_insn check optimizing logic
	x86/kprobes: Fix arch_check_optimized_kprobe check within optimized_kprobe range
	x86/microcode/amd: Remove load_microcode_amd()'s bsp parameter
	x86/microcode/AMD: Add a @cpu parameter to the reloading functions
	x86/microcode/AMD: Fix mixed steppings support
	x86/speculation: Allow enabling STIBP with legacy IBRS
	Documentation/hw-vuln: Document the interaction between IBRS and STIBP
	virt/sev-guest: Return -EIO if certificate buffer is not large enough
	brd: mark as nowait compatible
	brd: return 0/-error from brd_insert_page()
	brd: check for REQ_NOWAIT and set correct page allocation mask
	ima: fix error handling logic when file measurement failed
	ima: Align ima_file_mmap() parameters with mmap_file LSM hook
	selftests/powerpc: Fix incorrect kernel headers search path
	selftests/ftrace: Fix eprobe syntax test case to check filter support
	selftests: sched: Fix incorrect kernel headers search path
	selftests: core: Fix incorrect kernel headers search path
	selftests: pid_namespace: Fix incorrect kernel headers search path
	selftests: arm64: Fix incorrect kernel headers search path
	selftests: clone3: Fix incorrect kernel headers search path
	selftests: pidfd: Fix incorrect kernel headers search path
	selftests: membarrier: Fix incorrect kernel headers search path
	selftests: kcmp: Fix incorrect kernel headers search path
	selftests: media_tests: Fix incorrect kernel headers search path
	selftests: gpio: Fix incorrect kernel headers search path
	selftests: filesystems: Fix incorrect kernel headers search path
	selftests: user_events: Fix incorrect kernel headers search path
	selftests: ptp: Fix incorrect kernel headers search path
	selftests: sync: Fix incorrect kernel headers search path
	selftests: rseq: Fix incorrect kernel headers search path
	selftests: move_mount_set_group: Fix incorrect kernel headers search path
	selftests: mount_setattr: Fix incorrect kernel headers search path
	selftests: perf_events: Fix incorrect kernel headers search path
	selftests: ipc: Fix incorrect kernel headers search path
	selftests: futex: Fix incorrect kernel headers search path
	selftests: drivers: Fix incorrect kernel headers search path
	selftests: dmabuf-heaps: Fix incorrect kernel headers search path
	selftests: vm: Fix incorrect kernel headers search path
	selftests: seccomp: Fix incorrect kernel headers search path
	irqdomain: Fix association race
	irqdomain: Fix disassociation race
	irqdomain: Look for existing mapping only once
	irqdomain: Drop bogus fwspec-mapping error handling
	irqdomain: Refactor __irq_domain_alloc_irqs()
	irqdomain: Fix mapping-creation race
	irqdomain: Fix domain registration race
	crypto: qat - fix out-of-bounds read
	mm/damon/paddr: fix missing folio_put()
	ALSA: ice1712: Do not left ice->gpio_mutex locked in aureon_add_controls()
	ALSA: hda/realtek: Add quirk for HP EliteDesk 800 G6 Tower PC
	jbd2: fix data missing when reusing bh which is ready to be checkpointed
	ext4: optimize ea_inode block expansion
	ext4: refuse to create ea block when umounted
	cxl/pmem: Fix nvdimm registration races
	mtd: spi-nor: sfdp: Fix index value for SCCR dwords
	mtd: spi-nor: spansion: Consider reserved bits in CFR5 register
	mtd: spi-nor: Fix shift-out-of-bounds in spi_nor_set_erase_type
	dm: send just one event on resize, not two
	dm: add cond_resched() to dm_wq_work()
	dm: add cond_resched() to dm_wq_requeue_work()
	wifi: rtw88: use RTW_FLAG_POWERON flag to prevent to power on/off twice
	wifi: rtl8xxxu: Use a longer retry limit of 48
	wifi: ath11k: allow system suspend to survive ath11k
	wifi: cfg80211: Fix use after free for wext
	wifi: cfg80211: Set SSID if it is not already set
	cpuidle: add ARCH_SUSPEND_POSSIBLE dependencies
	qede: fix interrupt coalescing configuration
	thermal: intel: powerclamp: Fix cur_state for multi package system
	dm flakey: fix logic when corrupting a bio
	dm cache: free background tracker's queued work in btracker_destroy
	dm flakey: don't corrupt the zero page
	dm flakey: fix a bug with 32-bit highmem systems
	hwmon: (peci/cputemp) Fix off-by-one in coretemp_label allocation
	hwmon: (nct6775) Fix incorrect parenthesization in nct6775_write_fan_div()
	ARM: dts: qcom: sdx65: Add Qcom SMMU-500 as the fallback for IOMMU node
	ARM: dts: qcom: sdx55: Add Qcom SMMU-500 as the fallback for IOMMU node
	ARM: dts: exynos: correct TMU phandle in Exynos4210
	ARM: dts: exynos: correct TMU phandle in Exynos4
	ARM: dts: exynos: correct TMU phandle in Odroid XU3 family
	ARM: dts: exynos: correct TMU phandle in Exynos5250
	ARM: dts: exynos: correct TMU phandle in Odroid XU
	ARM: dts: exynos: correct TMU phandle in Odroid HC1
	arm64: mm: hugetlb: Disable HUGETLB_PAGE_OPTIMIZE_VMEMMAP
	fuse: add inode/permission checks to fileattr_get/fileattr_set
	rbd: avoid use-after-free in do_rbd_add() when rbd_dev_create() fails
	ceph: update the time stamps and try to drop the suid/sgid
	regulator: core: Use ktime_get_boottime() to determine how long a regulator was off
	panic: fix the panic_print NMI backtrace setting
	mm/hwpoison: convert TTU_IGNORE_HWPOISON to TTU_HWPOISON
	alpha: fix FEN fault handling
	dax/kmem: Fix leak of memory-hotplug resources
	mips: fix syscall_get_nr
	media: ipu3-cio2: Fix PM runtime usage_count in driver unbind
	remoteproc/mtk_scp: Move clk ops outside send_lock
	docs: gdbmacros: print newest record
	mm: memcontrol: deprecate charge moving
	mm/thp: check and bail out if page in deferred queue already
	ktest.pl: Give back console on Ctrt^C on monitor
	kprobes: Fix to handle forcibly unoptimized kprobes on freeing_list
	ktest.pl: Fix missing "end_monitor" when machine check fails
	ktest.pl: Add RUN_TIMEOUT option with default unlimited
	memory tier: release the new_memtier in find_create_memory_tier()
	ring-buffer: Handle race between rb_move_tail and rb_check_pages
	tools/bootconfig: fix single & used for logical condition
	tracing/eprobe: Fix to add filter on eprobe description in README file
	iommu/amd: Add a length limitation for the ivrs_acpihid command-line parameter
	iommu/amd: Improve page fault error reporting
	scsi: aacraid: Allocate cmd_priv with scsicmd
	scsi: qla2xxx: Fix link failure in NPIV environment
	scsi: qla2xxx: Check if port is online before sending ELS
	scsi: qla2xxx: Fix DMA-API call trace on NVMe LS requests
	scsi: qla2xxx: Remove unintended flag clearing
	scsi: qla2xxx: Fix erroneous link down
	scsi: qla2xxx: Remove increment of interface err cnt
	scsi: ses: Don't attach if enclosure has no components
	scsi: ses: Fix slab-out-of-bounds in ses_enclosure_data_process()
	scsi: ses: Fix possible addl_desc_ptr out-of-bounds accesses
	scsi: ses: Fix possible desc_ptr out-of-bounds accesses
	scsi: ses: Fix slab-out-of-bounds in ses_intf_remove()
	RISC-V: add a spin_shadow_stack declaration
	riscv: Avoid enabling interrupts in die()
	riscv: mm: fix regression due to update_mmu_cache change
	riscv: jump_label: Fixup unaligned arch_static_branch function
	riscv, mm: Perform BPF exhandler fixup on page fault
	riscv: ftrace: Remove wasted nops for !RISCV_ISA_C
	riscv: ftrace: Reduce the detour code size to half
	MIPS: DTS: CI20: fix otg power gpio
	PCI/PM: Observe reset delay irrespective of bridge_d3
	PCI: Unify delay handling for reset and resume
	PCI: hotplug: Allow marking devices as disconnected during bind/unbind
	PCI: Avoid FLR for AMD FCH AHCI adapters
	PCI/DPC: Await readiness of secondary bus after reset
	bus: mhi: ep: Only send -ENOTCONN status if client driver is available
	bus: mhi: ep: Move chan->lock to the start of processing queued ch ring
	bus: mhi: ep: Save channel state locally during suspend and resume
	iommu/vt-d: Avoid superfluous IOTLB tracking in lazy mode
	iommu/vt-d: Fix PASID directory pointer coherency
	vfio/type1: exclude mdevs from VFIO_UPDATE_VADDR
	vfio/type1: prevent underflow of locked_vm via exec()
	vfio/type1: track locked_vm per dma
	vfio/type1: restore locked_vm
	drm/amd: Fix initialization for nbio 7.5.1
	drm/i915/quirks: Add inverted backlight quirk for HP 14-r206nv
	drm/radeon: Fix eDP for single-display iMac11,2
	drm/i915: Don't use stolen memory for ring buffers with LLC
	drm/i915: Don't use BAR mappings for ring buffers with LLC
	drm/gud: Fix UBSAN warning
	drm/edid: fix AVI infoframe aspect ratio handling
	drm/edid: fix parsing of 3D modes from HDMI VSDB
	qede: avoid uninitialized entries in coal_entry array
	brd: use radix_tree_maybe_preload instead of radix_tree_preload
	sbitmap: Advance the queue index before waking up a queue
	wait: Return number of exclusive waiters awaken
	sbitmap: Try each queue to wake up at least one waiter
	kbuild: Port silent mode detection to future gnu make.
	net: avoid double iput when sock_alloc_file fails
	Linux 6.1.16

Change-Id: I705caf70ee547e6d55f38d133bdcd50713aed745
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-03-13 15:45:34 +00:00
Vincent Donnefort
34411786c7 ANDROID: ring-buffer: Fix ring_buffer_read_page for external writers
No shortcut is possible for reading a page without removing it from
the ring-buffer. The reader needs to be moved and its timestamp
updated.

Bug: 249050813
Change-Id: I80fbc1e265500e419278346e2973df2488b7e8b3
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
2023-03-13 13:36:34 +00:00
Greg Kroah-Hartman
29d53c4c5a kernel/fail_function: fix memory leak with using debugfs_lookup()
[ Upstream commit 2bb3669f576559db273efe49e0e69f82450efbca ]

When calling debugfs_lookup() the result must have dput() called on it,
otherwise the memory will leak over time.  To make things simpler, just
call debugfs_lookup_and_remove() instead which handles all of the logic
at once.
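
The reference-count rule described above can be illustrated with a small userspace model. This is only a sketch: `struct fake_dentry`, `fake_lookup()`, `fake_put()`, `fake_remove()`, `leaky_remove()` and `lookup_and_remove()` are hypothetical names invented for this illustration, not the kernel's debugfs implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted debugfs dentry. */
struct fake_dentry {
	int refcount;	/* references currently held by callers */
	int removed;	/* set once the entry has been removed */
};

/* Like a lookup: hands back the entry with one extra reference. */
static struct fake_dentry *fake_lookup(struct fake_dentry *d)
{
	if (d == NULL || d->removed)
		return NULL;
	d->refcount++;
	return d;
}

/* Like a put: drops one reference taken by a lookup. */
static void fake_put(struct fake_dentry *d)
{
	if (d == NULL)
		return;
	assert(d->refcount > 0);
	d->refcount--;
}

/* Removal alone does not drop the reference taken by the lookup. */
static void fake_remove(struct fake_dentry *d)
{
	if (d != NULL)
		d->removed = 1;
}

/* Leaky pattern: lookup + remove, no put. Returns leftover references. */
int leaky_remove(struct fake_dentry *d)
{
	fake_remove(fake_lookup(d));
	return d->refcount;
}

/* Combined helper: lookup, remove, then put, all in one call. */
int lookup_and_remove(struct fake_dentry *d)
{
	struct fake_dentry *found = fake_lookup(d);

	fake_remove(found);
	fake_put(found);
	return d->refcount;
}
```

In this model the leaky path leaves one reference behind on every call, while the combined helper returns the count to where it started.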

Cc: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20230202151633.2310897-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-11 13:55:39 +01:00
Greg Kroah-Hartman
c578a68ffc kernel/printk/index.c: fix memory leak with using debugfs_lookup()
[ Upstream commit 55bf243c514553e907efcf2bda92ba090eca8c64 ]

When calling debugfs_lookup() the result must have dput() called on it,
otherwise the memory will leak over time.  To make things simpler, just
call debugfs_lookup_and_remove() instead which handles all of the logic
at once.

Cc: Chris Down <chris@chrisdown.name>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20230202151411.2308576-1-gregkh@linuxfoundation.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-11 13:55:32 +01:00
Jia-Ju Bai
2072332c04 tracing: Add NULL checks for buffer in ring_buffer_free_read_page()
[ Upstream commit 3e4272b9954094907f16861199728f14002fcaf6 ]

A previous commit, 7433632c9f, added checks in ring_buffer_wake_waiters()
because buffer, buffer->buffers and buffer->buffers[cpu] can be NULL there.

However, in the same call stack, these variables are also used in
ring_buffer_free_read_page():

tracing_buffers_release()
  ring_buffer_wake_waiters(iter->array_buffer->buffer)
    cpu_buffer = buffer->buffers[cpu] -> Add checks by previous commit
  ring_buffer_free_read_page(iter->array_buffer->buffer)
    cpu_buffer = buffer->buffers[cpu] -> No check

Thus, to avoid possible null-pointer dereferences, the related checks
should be added.
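
The kind of guard meant here can be sketched in userspace as follows. The types and the function `model_free_read_page()` are hypothetical simplifications made up for this illustration; the range check on `cpu` is an extra assumption of the model and is not taken from the commit.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the ring buffer structures. */
struct model_cpu_buffer {
	int free_pages;	/* pages handed back to this per-CPU buffer */
};

struct model_buffer {
	int cpus;				/* number of slots in buffers[] */
	struct model_cpu_buffer **buffers;	/* may be NULL, as may any slot */
};

/*
 * Returns 0 when the page was handed back, -1 when any pointer on the
 * path buffer -> buffers -> buffers[cpu] is missing.
 */
int model_free_read_page(struct model_buffer *buffer, int cpu)
{
	if (buffer == NULL || buffer->buffers == NULL)
		return -1;
	/* Range check is an assumption of this model only. */
	if (cpu < 0 || cpu >= buffer->cpus || buffer->buffers[cpu] == NULL)
		return -1;

	buffer->buffers[cpu]->free_pages++;
	return 0;
}
```

Each level of the pointer chain is tested before it is dereferenced, so a caller that passes a partially initialised buffer gets an error instead of a crash.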

These results were reported by a static analysis tool that I designed.

Link: https://lkml.kernel.org/r/20230113125501.760324-1-baijiaju1990@gmail.com

Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-11 13:55:32 +01:00
Sergey Shtylyov
926aef60ea genirq/ipi: Fix NULL pointer deref in irq_data_get_affinity_mask()
[ Upstream commit feabecaff5902f896531dde90646ca5dfa9d4f7d ]

If ipi_send_{mask|single}() is called with an invalid interrupt number, all
the local variables there will be NULL. ipi_send_verify() which is invoked
from these functions does not verify its 'data' parameter, resulting in a
kernel oops in irq_data_get_affinity_mask() as the passed NULL pointer gets
dereferenced.

Add a missing NULL pointer check in ipi_send_verify()...

Found by Linux Verification Center (linuxtesting.org) with the SVACE static
analysis tool.

Fixes: 3b8e29a82d ("genirq: Implement ipi_send_mask/single()")
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/b541232d-c2b6-1fe9-79b4-a7129459e4d0@omp.ru
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-11 13:55:29 +01:00
Sangmoon Kim
1e6d82d241 ANDROID: Re-apply vendor hooks for information of blocked tasks
This reverts commit 66330b896c (Revert "ANDROID: vendor_hooks:
add waiting information for blocked tasks")

The original patch was reverted to resolve merge issues
with 5.18-rc1. This patch adds the vendor hooks back for their
original purpose.

Bug: 271799327

Signed-off-by: Sangmoon Kim <sangmoon.kim@samsung.com>
Change-Id: I86b9b7dd553b7b6a5930ace6280ecd66dc5dc4df
2023-03-10 18:16:02 +00:00
Gabriel Krisman Bertazi
d710b1e91b wait: Return number of exclusive waiters awaken
commit ee7dc86b6d3e3b86c2c487f713eda657850de238 upstream.

Sbitmap code will need to know how many waiters were actually woken for
its batched wakeups implementation.  Return the number of woken
exclusive waiters from __wake_up() to facilitate that.
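
A minimal userspace sketch of the idea, assuming a simplified queue: `struct model_waiter` and `model_wake_up()` are hypothetical names for this illustration and do not reproduce the kernel's wait queue code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified waiter; not the kernel's wait queue entry. */
struct model_waiter {
	bool exclusive;	/* exclusive waiters are woken in limited numbers */
	bool awake;	/* set once the waiter has been woken */
};

/*
 * Wake waiters in queue order and stop as soon as nr_exclusive exclusive
 * waiters have been woken. A value of zero or less places no limit,
 * because the counter never matches it. Returns how many exclusive
 * waiters were actually woken, which can be fewer than requested.
 */
int model_wake_up(struct model_waiter *queue, size_t len, int nr_exclusive)
{
	int woken_exclusive = 0;

	for (size_t i = 0; i < len; i++) {
		if (queue[i].awake)
			continue;	/* nothing to wake here */
		queue[i].awake = true;
		if (queue[i].exclusive && ++woken_exclusive == nr_exclusive)
			break;
	}
	return woken_exclusive;
}
```

A caller that asked for a batch of wakeups can compare the return value with the requested count to see how many wakeups remain undelivered.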

Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20221115224553.23594-3-krisman@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-10 09:34:34 +01:00
Masami Hiramatsu (Google)
4aa7389400 tracing/eprobe: Fix to add filter on eprobe description in README file
commit 133921530c42960c07d25d12677f9e131a2b0cdf upstream.

Add a description of the eprobe filter to the README file. This is
required to identify whether the kernel supports the eprobe filter.

Link: https://lore.kernel.org/all/167309833728.640500.12232259238201433587.stgit@devnote3/

Fixes: 752be5c5c9 ("tracing/eprobe: Add eprobe filter support")
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-10 09:34:27 +01:00
Mukesh Ojha
9674390ac5 ring-buffer: Handle race between rb_move_tail and rb_check_pages
commit 8843e06f67b14f71c044bf6267b2387784c7e198 upstream.

There appears to be a data race between ring_buffer writes and the
integrity check. That is, RB_FLAG of head_page is being updated while, at
the same time, RB_FLAG is cleared by the integrity check in rb_check_pages():

  rb_check_pages()            rb_handle_head_page():
  --------                    --------
  rb_head_page_deactivate()
                              rb_head_page_set_normal()
  rb_head_page_activate()

We do an integrity test of the list to check whether it is corrupted, and
that is still worth doing. So, let's refactor rb_check_pages() such that
we no longer clear and set the flag during the list sanity check.

[1] and [2] are the reproducer and the crash report, respectively.

1:
``` read_trace.sh
  while true;
  do
    # the "trace" file is closed after read
    head -1 /sys/kernel/tracing/trace > /dev/null
  done
```
``` repro.sh
  sysctl -w kernel.panic_on_warn=1
  # the function tracer will write enough data into ring_buffer
  echo function > /sys/kernel/tracing/current_tracer
  ./read_trace.sh &
  ./read_trace.sh &
  ./read_trace.sh &
  ./read_trace.sh &
  ./read_trace.sh &
  ./read_trace.sh &
  ./read_trace.sh &
  ./read_trace.sh &
```

2:
------------[ cut here ]------------
WARNING: CPU: 9 PID: 62 at kernel/trace/ring_buffer.c:2653
rb_move_tail+0x450/0x470
Modules linked in:
CPU: 9 PID: 62 Comm: ksoftirqd/9 Tainted: G        W          6.2.0-rc6+
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
RIP: 0010:rb_move_tail+0x450/0x470
Code: ff ff 4c 89 c8 f0 4d 0f b1 02 48 89 c2 48 83 e2 fc 49 39 d0 75 24
83 e0 03 83 f8 02 0f 84 e1 fb ff ff 48 8b 57 10 f0 ff 42 08 <0f> 0b 83
f8 02 0f 84 ce fb ff ff e9 db
RSP: 0018:ffffb5564089bd00 EFLAGS: 00000203
RAX: 0000000000000000 RBX: ffff9db385a2bf81 RCX: ffffb5564089bd18
RDX: ffff9db281110100 RSI: 0000000000000fe4 RDI: ffff9db380145400
RBP: ffff9db385a2bf80 R08: ffff9db385a2bfc0 R09: ffff9db385a2bfc2
R10: ffff9db385a6c000 R11: ffff9db385a2bf80 R12: 0000000000000000
R13: 00000000000003e8 R14: ffff9db281110100 R15: ffffffffbb006108
FS:  0000000000000000(0000) GS:ffff9db3bdcc0000(0000)
knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005602323024c8 CR3: 0000000022e0c000 CR4: 00000000000006e0
Call Trace:
 <TASK>
 ring_buffer_lock_reserve+0x136/0x360
 ? __do_softirq+0x287/0x2df
 ? __pfx_rcu_softirq_qs+0x10/0x10
 trace_function+0x21/0x110
 ? __pfx_rcu_softirq_qs+0x10/0x10
 ? __do_softirq+0x287/0x2df
 function_trace_call+0xf6/0x120
 0xffffffffc038f097
 ? rcu_softirq_qs+0x5/0x140
 rcu_softirq_qs+0x5/0x140
 __do_softirq+0x287/0x2df
 run_ksoftirqd+0x2a/0x30
 smpboot_thread_fn+0x188/0x220
 ? __pfx_smpboot_thread_fn+0x10/0x10
 kthread+0xe7/0x110
 ? __pfx_kthread+0x10/0x10
 ret_from_fork+0x2c/0x50
 </TASK>
---[ end trace 0000000000000000 ]---

[ crash report and test reproducer credit goes to Zheng Yejian]

Link: https://lore.kernel.org/linux-trace-kernel/1676376403-16462-1-git-send-email-quic_mojha@quicinc.com

Cc: <mhiramat@kernel.org>
Cc: stable@vger.kernel.org
Fixes: 1039221cc2 ("ring-buffer: Do not disable recording when there is an iterator")
Reported-by: Zheng Yejian <zhengyejian1@huawei.com>
Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-10 09:34:27 +01:00
Masami Hiramatsu (Google)
a467e3e04d kprobes: Fix to handle forcibly unoptimized kprobes on freeing_list
commit 4fbd2f83fda0ca44a2ec6421ca3508b355b31858 upstream.

Since forcibly unoptimized kprobes will be put on the freeing_list directly
in the unoptimize_kprobe(), do_unoptimize_kprobes() must continue to check
the freeing_list even if unoptimizing_list is empty.

This bug can happen if a kprobe is put in an instruction which is in the
middle of the jump-replaced instruction sequence of an optprobe, *and* the
optprobe is recently unregistered and queued on unoptimizing_list.
In this case, the optprobe will be unoptimized forcibly (that is, immediately)
and put on the freeing_list, expecting that it will be handled in
do_unoptimize_kprobes().
But if there are no other optprobes on the unoptimizing_list, the current code
returns from do_unoptimize_kprobes() early and does not handle the
optprobe which is on the freeing_list. Then the optprobe will hit the
WARN_ON_ONCE() in do_free_cleaned_kprobes(), because it was not handled
in the latter loop of do_unoptimize_kprobes().

To solve this issue, do not return from do_unoptimize_kprobes() immediately
even if unoptimizing_list is empty.

Moreover, this change affects another case. kill_optimized_kprobes() expects
that kprobe_optimizer() will just free the optprobe on the freeing_list.
So I changed it to just do list_move() to the freeing_list if optprobes are on
the unoptimizing_list. And do_unoptimize_kprobes() will skip
arch_disarm_kprobe() if the probe on the freeing_list has the gone flag.

Link: https://lore.kernel.org/all/Y8URdIfVr3pq2X8w@xpf.sh.intel.com/
Link: https://lore.kernel.org/all/167448024501.3253718.13037333683110512967.stgit@devnote3/

Fixes: e4add24778 ("kprobes: Fix optimize_kprobe()/unoptimize_kprobe() cancellation logic")
Reported-by: Pengfei Xu <pengfei.xu@intel.com>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: stable@vger.kernel.org
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-10 09:34:27 +01:00
Dan Williams
6b60250d8a dax/kmem: Fix leak of memory-hotplug resources
commit e686c32590f40bffc45f105c04c836ffad3e531a upstream.

While experimenting with CXL region removal the following corruption of
/proc/iomem appeared.

Before:
f010000000-f04fffffff : CXL Window 0
  f010000000-f02fffffff : region4
    f010000000-f02fffffff : dax4.0
      f010000000-f02fffffff : System RAM (kmem)

After (modprobe -r cxl_test):
f010000000-f02fffffff : **redacted binary garbage**
  f010000000-f02fffffff : System RAM (kmem)

...and testing further the same is visible with persistent memory
assigned to kmem:

Before:
480000000-243fffffff : Persistent Memory
  480000000-57e1fffff : namespace3.0
  580000000-243fffffff : dax3.0
    580000000-243fffffff : System RAM (kmem)

After (ndctl disable-region all):
480000000-243fffffff : Persistent Memory
  580000000-243fffffff : ***redacted binary garbage***
    580000000-243fffffff : System RAM (kmem)

The corrupted data is from a use-after-free of the "dax4.0" and "dax3.0"
resources, and it also shows that the "System RAM (kmem)" resource is
not being removed. The bug does not appear after "modprobe -r kmem", it
requires the parent of "dax4.0" and "dax3.0" to be removed which
re-parents the leaked "System RAM (kmem)" instances. Those in turn
reference the freed resource as a parent.

First up for the fix: release_mem_region_adjustable() needs to
reliably delete the resource inserted by add_memory_driver_managed().
That is thwarted by a check for IORESOURCE_SYSRAM that predates the
dax/kmem driver, from commit:

65c7878413 ("kernel, resource: check for IORESOURCE_SYSRAM in release_mem_region_adjustable")

That appears to be working around the behavior of HMM's
"MEMORY_DEVICE_PUBLIC" facility that has since been deleted. With that
check removed the "System RAM (kmem)" resource gets removed, but
corruption still occurs occasionally because the "dax" resource is not
reliably removed.

The dax range information is freed before the device is unregistered, so
the driver can not reliably recall (another use after free) what it is
meant to release. Lastly if that use after free got lucky, the driver
was covering up the leak of "System RAM (kmem)" due to its use of
release_resource() which detaches, but does not free, child resources.
The switch to remove_resource() forces remove_memory() to be responsible
for the deletion of the resource added by add_memory_driver_managed().

Fixes: c2f3011ee6 ("device-dax: add an allocation interface for device-dax instances")
Cc: <stable@vger.kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/167653656244.3147810.5705900882794040229.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-10 09:34:25 +01:00
Guilherme G. Piccoli
e6737d9772 panic: fix the panic_print NMI backtrace setting
commit b905039e428d639adeebb719b76f98865ea38d4d upstream.

Commit 8d470a45d1 ("panic: add option to dump all CPUs backtraces in
panic_print") introduced a setting for the "panic_print" kernel parameter
to allow users to request an NMI backtrace on panic.  The problem is that the
panic_print handling happens after the secondary CPUs are already
disabled, hence this option ended up being kind of a no-op - the kernel skips
the NMI trace on idling CPUs, which is the case for offline CPUs.

Fix it by checking the NMI backtrace bit in the panic_print prior to the
CPU disabling function.
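
The ordering problem can be modelled in a few lines of user-space C. This is
a sketch under stated assumptions, not the kernel's panic() path: struct sys,
stop_secondary_cpus(), nmi_backtrace_all(), panic_old() and panic_fixed() are
hypothetical, and the value used for PANIC_PRINT_ALL_CPU_BT is assumed for
the sketch only.

```c
#include <assert.h>
#include <stddef.h>

/* Bit value assumed for this sketch only. */
#define PANIC_PRINT_ALL_CPU_BT 0x00000040UL

/* Hypothetical model of the machine state during a panic. */
struct sys {
    int online_cpus;   /* CPUs that can still respond to an NMI */
    int traced_cpus;   /* CPUs that produced a backtrace */
};

static void stop_secondary_cpus(struct sys *s)
{
    assert(s != NULL);
    s->online_cpus = 1;  /* only the panicking CPU keeps running */
}

static void nmi_backtrace_all(struct sys *s)
{
    assert(s != NULL);
    s->traced_cpus = s->online_cpus;  /* stopped CPUs are skipped */
}

/* Old ordering: the backtrace is requested after the CPUs are stopped. */
static void panic_old(struct sys *s, unsigned long panic_print)
{
    stop_secondary_cpus(s);
    if (panic_print & PANIC_PRINT_ALL_CPU_BT)
        nmi_backtrace_all(s);
}

/* Fixed ordering: check the bit before stopping the secondary CPUs. */
static void panic_fixed(struct sys *s, unsigned long panic_print)
{
    if (panic_print & PANIC_PRINT_ALL_CPU_BT)
        nmi_backtrace_all(s);
    stop_secondary_cpus(s);
}
```

In this model, panic_old() on a four-CPU system traces only one CPU, while
panic_fixed() traces all four.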

Link: https://lkml.kernel.org/r/20230226160838.414257-1-gpiccoli@igalia.com
Fixes: 8d470a45d1 ("panic: add option to dump all CPUs backtraces in panic_print")
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Cc: <stable@vger.kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Feng Tang <feng.tang@intel.com>
Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Michael Kelley <mikelley@microsoft.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-10 09:34:25 +01:00
Marc Zyngier
d2bea57888 irqdomain: Fix domain registration race
commit 8932c32c3053accd50702b36e944ac2016cd103c upstream.

Hierarchical domains created using irq_domain_create_hierarchy() are
currently added to the domain list before having been fully initialised.

This specifically means that a racing allocation request might fail to
allocate irq data for the inner domains of a hierarchy in case the
parent domain pointer has not yet been set up.

Note that this is not really any issue for irqchip drivers that are
registered early (e.g. via IRQCHIP_DECLARE() or IRQCHIP_ACPI_DECLARE())
but could potentially cause trouble with drivers that are registered
later (e.g. modular drivers using IRQCHIP_PLATFORM_DRIVER_BEGIN(),
gpiochip drivers, etc.).

Fixes: afb7da83b9 ("irqdomain: Introduce helper function irq_domain_add_hierarchy()")
Cc: stable@vger.kernel.org      # 3.19
Signed-off-by: Marc Zyngier <maz@kernel.org>
[ johan: add commit message ]
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230213104302.17307-8-johan+linaro@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-10 09:34:19 +01:00
Johan Hovold
b89b0c737d irqdomain: Fix mapping-creation race
commit 601363cc08da25747feb87c55573dd54de91d66a upstream.

Parallel probing of devices that share interrupts (e.g. when a driver
uses asynchronous probing) can currently result in two mappings being
created for the same hardware interrupt due to missing serialisation.

Make sure to hold the irq_domain_mutex when creating mappings so that
looking for an existing mapping before creating a new one is done
atomically.
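
The lookup-then-create pattern described above can be sketched in user space
as follows. This is an illustration only, with a pthread mutex standing in
for irq_domain_mutex; the table, MAX_HWIRQ, next_virq and create_mapping()
are hypothetical names and not the irqdomain API.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_HWIRQ 16  /* hypothetical table size for the sketch */

static pthread_mutex_t domain_mutex = PTHREAD_MUTEX_INITIALIZER;
static unsigned int mapping[MAX_HWIRQ];  /* 0 means "no mapping yet" */
static unsigned int next_virq = 1;

/*
 * Return the virq mapped to hwirq, creating the mapping if needed.
 * The lookup and the creation happen under the same lock, so two
 * concurrent callers for the same hwirq cannot both create a mapping.
 */
static unsigned int create_mapping(unsigned int hwirq)
{
    unsigned int virq;

    assert(hwirq < MAX_HWIRQ);

    pthread_mutex_lock(&domain_mutex);
    virq = mapping[hwirq];
    if (virq == 0) {
        virq = next_virq++;
        mapping[hwirq] = virq;
    }
    pthread_mutex_unlock(&domain_mutex);

    return virq;
}
```

If the lookup were done before taking the lock, two callers could both see
"no mapping" and each allocate a virq for the same hardware interrupt.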

Fixes: 765230b5f0 ("driver-core: add asynchronous probing support for drivers")
Fixes: b62b2cf575 ("irqdomain: Fix handling of type settings for existing mappings")
Link: https://lore.kernel.org/r/YuJXMHoT4ijUxnRb@hovoldconsulting.com
Cc: stable@vger.kernel.org      # 4.8
Cc: Dmitry Torokhov <dtor@chromium.org>
Cc: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Hsin-Yi Wang <hsinyi@chromium.org>
Tested-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230213104302.17307-7-johan+linaro@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-10 09:34:19 +01:00
Johan Hovold
1c89f39e75 irqdomain: Refactor __irq_domain_alloc_irqs()
commit d55f7f4c58c07beb5050a834bf57ae2ede599c7e upstream.

Refactor __irq_domain_alloc_irqs() so that it can be called internally
while holding the irq_domain_mutex.

This will be used to fix a shared-interrupt mapping race, hence the
Fixes tag.

Fixes: b62b2cf575 ("irqdomain: Fix handling of type settings for existing mappings")
Cc: stable@vger.kernel.org      # 4.8
Tested-by: Hsin-Yi Wang <hsinyi@chromium.org>
Tested-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230213104302.17307-6-johan+linaro@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-10 09:34:19 +01:00
Johan Hovold
1b4aa065ea irqdomain: Drop bogus fwspec-mapping error handling
commit e3b7ab025e931accdc2c12acf9b75c6197f1c062 upstream.

In case a newly allocated IRQ ever ends up not having any associated
struct irq_data it would not even be possible to dispose of the mapping.

Replace the bogus disposal with a WARN_ON().

This will also be used to fix a shared-interrupt mapping race, hence the
CC-stable tag.

Fixes: 1e2a7d7849 ("irqdomain: Don't set type when mapping an IRQ")
Cc: stable@vger.kernel.org      # 4.8
Tested-by: Hsin-Yi Wang <hsinyi@chromium.org>
Tested-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230213104302.17307-4-johan+linaro@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-10 09:34:19 +01:00
Johan Hovold
b6655a4910 irqdomain: Look for existing mapping only once
commit 6e6f75c9c98d2d246d90411ff2b6f0cd271f4cba upstream.

Avoid looking for an existing mapping twice when creating a new mapping
using irq_create_fwspec_mapping() by factoring out the actual allocation
which is shared with irq_create_mapping_affinity().

The new helper function will also be used to fix a shared-interrupt
mapping race, hence the Fixes tag.

Fixes: b62b2cf575 ("irqdomain: Fix handling of type settings for existing mappings")
Cc: stable@vger.kernel.org      # 4.8
Tested-by: Hsin-Yi Wang <hsinyi@chromium.org>
Tested-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230213104302.17307-5-johan+linaro@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-10 09:34:19 +01:00
Johan Hovold
deb243ca05 irqdomain: Fix disassociation race
commit 3f883c38f5628f46b30bccf090faec054088e262 upstream.

The global irq_domain_mutex is held when mapping interrupts from
non-hierarchical domains but currently not when disposing them.

This specifically means that updates of the domain mapcount are racy
(currently only used for statistics in debugfs).

Make sure to hold the global irq_domain_mutex also when disposing
mappings from non-hierarchical domains.

Fixes: 9dc6be3d41 ("genirq/irqdomain: Add map counter")
Cc: stable@vger.kernel.org      # 4.13
Tested-by: Hsin-Yi Wang <hsinyi@chromium.org>
Tested-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230213104302.17307-3-johan+linaro@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-10 09:34:19 +01:00
Johan Hovold
33bf92b1d0 irqdomain: Fix association race
commit b06730a571a9ff1ba5bd6b20bf9e50e5a12f1ec6 upstream.

The sanity check for an already mapped virq is done outside of the
irq_domain_mutex-protected section which means that an (unlikely) racing
association may not be detected.

Fix this by factoring out the association implementation, which will
also be used in a follow-on change to fix a shared-interrupt mapping
race.

Fixes: ddaf144c61 ("irqdomain: Refactor irq_domain_associate_many()")
Cc: stable@vger.kernel.org      # 3.11
Tested-by: Hsin-Yi Wang <hsinyi@chromium.org>
Tested-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230213104302.17307-2-johan+linaro@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-10 09:34:19 +01:00
Yang Jihong
57d9df9187 x86/kprobes: Fix arch_check_optimized_kprobe check within optimized_kprobe range
commit f1c97a1b4ef709e3f066f82e3ba3108c3b133ae6 upstream.

When arch_prepare_optimized_kprobe calculates the jump destination address,
it copies the original instructions from a jmp-optimized kprobe (see
__recover_optprobed_insn), and the calculation is based on the length of the
original instruction.

arch_check_optimized_kprobe does not check KPROBE_FLAG_OPTIMIZED when
checking whether a jmp-optimized kprobe exists.
As a result, setup_detour_execution may jump to a range that has been
overwritten by the jump destination address, resulting in an invalid opcode error.

For example, assume that two kprobes are registered at addresses
<func+9> and <func+11> in the "func" function.
The original code of "func" function is as follows:

   0xffffffff816cb5e9 <+9>:     push   %r12
   0xffffffff816cb5eb <+11>:    xor    %r12d,%r12d
   0xffffffff816cb5ee <+14>:    test   %rdi,%rdi
   0xffffffff816cb5f1 <+17>:    setne  %r12b
   0xffffffff816cb5f5 <+21>:    push   %rbp

1. Register the kprobe for <func+11>; assume it is kp1 and its corresponding optimized_kprobe is op1.
  After the optimization, "func" code changes to:

   0xffffffff816cc079 <+9>:     push   %r12
   0xffffffff816cc07b <+11>:    jmp    0xffffffffa0210000
   0xffffffff816cc080 <+16>:    incl   0xf(%rcx)
   0xffffffff816cc083 <+19>:    xchg   %eax,%ebp
   0xffffffff816cc084 <+20>:    (bad)
   0xffffffff816cc085 <+21>:    push   %rbp

Now op1->flags == KPROBE_FLAG_OPTIMIZED;

2. Register the kprobe for <func+9>; assume it is kp2 and its corresponding optimized_kprobe is op2.

register_kprobe(kp2)
  register_aggr_kprobe
    alloc_aggr_kprobe
      __prepare_optimized_kprobe
        arch_prepare_optimized_kprobe
          __recover_optprobed_insn    // copy original bytes from kp1->optinsn.copied_insn,
                                      // jump address = <func+14>

3. disable kp1:

disable_kprobe(kp1)
  __disable_kprobe
    ...
    if (p == orig_p || aggr_kprobe_disabled(orig_p)) {
      ret = disarm_kprobe(orig_p, true)       // add op1 in unoptimizing_list, not unoptimized
      orig_p->flags |= KPROBE_FLAG_DISABLED;  // op1->flags ==  KPROBE_FLAG_OPTIMIZED | KPROBE_FLAG_DISABLED
    ...

4. unregister kp2
__unregister_kprobe_top
  ...
  if (!kprobe_disabled(ap) && !kprobes_all_disarmed) {
    optimize_kprobe(op)
      ...
      if (arch_check_optimized_kprobe(op) < 0) // because op1 has KPROBE_FLAG_DISABLED, this does not return
        return;
      p->kp.flags |= KPROBE_FLAG_OPTIMIZED;   //  now op2 has KPROBE_FLAG_OPTIMIZED
  }

"func" code now is:

   0xffffffff816cc079 <+9>:     int3
   0xffffffff816cc07a <+10>:    push   %rsp
   0xffffffff816cc07b <+11>:    jmp    0xffffffffa0210000
   0xffffffff816cc080 <+16>:    incl   0xf(%rcx)
   0xffffffff816cc083 <+19>:    xchg   %eax,%ebp
   0xffffffff816cc084 <+20>:    (bad)
   0xffffffff816cc085 <+21>:    push   %rbp

5. If "func" is called, the int3 handler calls setup_detour_execution:

  if (p->flags & KPROBE_FLAG_OPTIMIZED) {
    ...
    regs->ip = (unsigned long)op->optinsn.insn + TMPL_END_IDX;
    ...
  }

The code for the destination address is

   0xffffffffa021072c:  push   %r12
   0xffffffffa021072e:  xor    %r12d,%r12d
   0xffffffffa0210731:  jmp    0xffffffff816cb5ee <func+14>

However, <func+14> is not a valid start instruction address. As a result, an error occurs.

Link: https://lore.kernel.org/all/20230216034247.32348-3-yangjihong1@huawei.com/

Fixes: f66c0447cc ("kprobes: Set unoptimized flag after unoptimizing code")
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Cc: stable@vger.kernel.org
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-10 09:34:13 +01:00
Yang Jihong
1a3439f548 x86/kprobes: Fix __recover_optprobed_insn check optimizing logic
commit 868a6fc0ca2407622d2833adefe1c4d284766c4c upstream.

Since the following commit:

  commit f66c0447cc ("kprobes: Set unoptimized flag after unoptimizing code")

modified the update timing of the KPROBE_FLAG_OPTIMIZED, an optimized_kprobe
may be in the optimizing or unoptimizing state when op.kp->flags
has KPROBE_FLAG_OPTIMIZED and op->list is not empty.

The __recover_optprobed_insn check logic is incorrect: a kprobe in the
unoptimizing state may be incorrectly determined as unoptimized.
As a result, incorrect instructions are copied.

The optprobe_queued_unopt function needs to be exported for invoking in
arch directory.

Link: https://lore.kernel.org/all/20230216034247.32348-2-yangjihong1@huawei.com/

Fixes: f66c0447cc ("kprobes: Set unoptimized flag after unoptimizing code")
Cc: stable@vger.kernel.org
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-10 09:34:13 +01:00
Joel Fernandes (Google)
77837a24bc torture: Fix hang during kthread shutdown phase
commit d52d3a2bf408ff86f3a79560b5cce80efb340239 upstream.

During rcutorture shutdown, the rcu_torture_cleanup() function calls
torture_cleanup_begin(), which sets the fullstop global variable to
FULLSTOP_RMMOD. This causes the rcutorture threads for readers and
fakewriters to exit all of their "while" loops and start shutting down.

They then call torture_kthread_stopping(), which in turn waits for
kthread_stop() to be called.  However, rcu_torture_cleanup() has
not yet called kthread_stop() on those threads, and before it gets a
chance to do so, multiple instances of torture_kthread_stopping() invoke
schedule_timeout_interruptible(1) in a tight loop.  Tracing confirms that
TIMER_SOFTIRQ can then continuously execute timer callbacks.  If that
TIMER_SOFTIRQ preempts the task executing rcu_torture_cleanup(), that
task might never invoke kthread_stop().

This commit improves this situation by increasing the timeout passed to
schedule_timeout_interruptible() from one jiffy to 1/20th of a second.
This change prevents TIMER_SOFTIRQ from monopolizing its CPU, thus
allowing rcu_torture_cleanup() to carry out the needed kthread_stop()
invocations.  Testing has shown 100 runs of TREE07 passing reliably,
as opposed to the tens-of-percent failure rates seen beforehand.
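
The effect of the longer timeout on the timer wakeup rate is simple
arithmetic, sketched below. The tick rate HZ = 1000 is an assumption for the
example (real kernels configure HZ at build time), and wakeups_per_second()
is a hypothetical helper, not part of the torture-test code.

```c
#include <assert.h>

/* Assumed tick rate for the example; real kernels set HZ at build time. */
#define HZ 1000UL

/* Timer wakeups per second for one thread looping on the given timeout. */
static unsigned long wakeups_per_second(unsigned long timeout_jiffies)
{
    assert(timeout_jiffies > 0);
    return HZ / timeout_jiffies;
}
```

With these assumptions, a one-jiffy timeout gives 1000 wakeups per second
per waiting thread, while a timeout of HZ / 20 jiffies gives 20.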

Cc: Paul McKenney <paulmck@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Zhouyi Zhou <zhouzhouyi@gmail.com>
Cc: <stable@vger.kernel.org> # 6.0.x
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Tested-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-10 09:34:07 +01:00
Waiman Long
35ab0cadbc locking/rwsem: Prevent non-first waiter from spinning in down_write() slowpath
commit b613c7f31476c44316bfac1af7cac714b7d6bef9 upstream.

A non-first waiter can potentially spin in the for loop of
rwsem_down_write_slowpath() without sleeping but fail to acquire the
lock even if the rwsem is free if the following sequence happens:

  Non-first RT waiter    First waiter      Lock holder
  -------------------    ------------      -----------
  Acquire wait_lock
  rwsem_try_write_lock():
    Set handoff bit if RT or
      wait too long
    Set waiter->handoff_set
  Release wait_lock
                         Acquire wait_lock
                         Inherit waiter->handoff_set
                         Release wait_lock
                                           Clear owner
                                           Release lock
  if (waiter.handoff_set) {
    rwsem_spin_on_owner();
    if (OWNER_NULL)
      goto trylock_again;
  }
  trylock_again:
  Acquire wait_lock
  rwsem_try_write_lock():
     if (first->handoff_set && (waiter != first))
        return false;
  Release wait_lock

A non-first waiter cannot really acquire the rwsem even if it mistakenly
believes that it can spin on OWNER_NULL value. If that waiter happens
to be an RT task running on the same CPU as the first waiter, it can
block the first waiter from acquiring the rwsem leading to live lock.
Fix this problem by making sure that a non-first waiter cannot spin in
the slowpath loop without sleeping.

Fixes: d257cc8cb8 ("locking/rwsem: Make handoff bit handling more consistent")
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Mukesh Ojha <quic_mojha@quicinc.com>
Reviewed-by: Mukesh Ojha <quic_mojha@quicinc.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230126003628.365092-2-longman@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-10 09:34:06 +01:00
Greg Kroah-Hartman
5100c4efc3 PM: EM: fix memory leak with using debugfs_lookup()
[ Upstream commit a0e8c13ccd6a9a636d27353da62c2410c4eca337 ]

When calling debugfs_lookup() the result must have dput() called on it,
otherwise the memory will leak over time.  To make things simpler, just
call debugfs_lookup_and_remove() instead which handles all of the logic
at once.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10 09:33:53 +01:00
Greg Kroah-Hartman
15cffd01ed time/debug: Fix memory leak with using debugfs_lookup()
[ Upstream commit 5b268d8abaec6cbd4bd70d062e769098d96670aa ]

When calling debugfs_lookup() the result must have dput() called on it,
otherwise the memory will leak over time.  To make things simpler, just
call debugfs_lookup_and_remove() instead which handles all of the logic at
once.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20230202151214.2306822-1-gregkh@linuxfoundation.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10 09:33:52 +01:00
Feng Tang
856dbac0a8 clocksource: Suspend the watchdog temporarily when high read latency detected
[ Upstream commit b7082cdfc464bf9231300605d03eebf943dda307 ]

Bugs have been reported on 8-socket x86 machines in which the TSC was
wrongly disabled when the system is under heavy workload.

 [ 818.380354] clocksource: timekeeping watchdog on CPU336: hpet wd-wd read-back delay of 1203520ns
 [ 818.436160] clocksource: wd-tsc-wd read-back delay of 181880ns, clock-skew test skipped!
 [ 819.402962] clocksource: timekeeping watchdog on CPU338: hpet wd-wd read-back delay of 324000ns
 [ 819.448036] clocksource: wd-tsc-wd read-back delay of 337240ns, clock-skew test skipped!
 [ 819.880863] clocksource: timekeeping watchdog on CPU339: hpet read-back delay of 150280ns, attempt 3, marking unstable
 [ 819.936243] tsc: Marking TSC unstable due to clocksource watchdog
 [ 820.068173] TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.
 [ 820.092382] sched_clock: Marking unstable (818769414384, 1195404998)
 [ 820.643627] clocksource: Checking clocksource tsc synchronization from CPU 267 to CPUs 0,4,25,70,126,430,557,564.
 [ 821.067990] clocksource: Switched to clocksource hpet

This can be reproduced by running memory intensive 'stream' tests,
or some of the stress-ng subcases such as 'ioport'.

The reason for these issues is that when the system is under heavy load, the
read latency of the clocksources can be very high.  Even lightweight TSC
reads can show high latencies, and latencies are much worse for external
clocksources such as HPET or the ACPI PM timer.  These latencies can
result in false-positive clocksource-unstable determinations.

These issues were initially reported by a customer running on a production
system, and this problem was reproduced on several generations of Xeon
servers, especially when running the stress-ng test.  These Xeon servers
were not production systems, but they did have the latest steppings
and firmware.

Given that the clocksource watchdog is a continual diagnostic check with
frequency of twice a second, there is no need to rush it when the system
is under heavy load.  Therefore, when high clocksource read latencies
are detected, suspend the watchdog timer for 5 minutes.
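
The rescheduling policy can be sketched as follows. This is a user-space
illustration, not the clocksource watchdog code: the half-second interval and
the five-minute suspension come from the text above, while struct watchdog,
watchdog_reschedule() and the MAX_READ_DELAY_NS threshold are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WD_INTERVAL_MS    500ULL                    /* twice a second */
#define WD_SUSPEND_MS     (5ULL * 60ULL * 1000ULL)  /* five minutes */
#define MAX_READ_DELAY_NS 100000ULL                 /* hypothetical threshold */

/* Hypothetical watchdog state: when the next check should run. */
struct watchdog {
    uint64_t next_run_ms;
};

/*
 * Pick the next check time. A high read-back delay means the system is
 * busy, so back off for a long time instead of retrying and risking a
 * false "unstable" verdict.
 */
static void watchdog_reschedule(struct watchdog *wd, uint64_t now_ms,
                                uint64_t read_delay_ns)
{
    assert(wd != NULL);

    if (read_delay_ns > MAX_READ_DELAY_NS)
        wd->next_run_ms = now_ms + WD_SUSPEND_MS;
    else
        wd->next_run_ms = now_ms + WD_INTERVAL_MS;
}
```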

Signed-off-by: Feng Tang <feng.tang@intel.com>
Acked-by: Waiman Long <longman@redhat.com>
Cc: John Stultz <jstultz@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Stephen Boyd <sboyd@kernel.org>
Cc: Feng Tang <feng.tang@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10 09:33:50 +01:00
Jann Horn
3a43a366ec timers: Prevent union confusion from unexpected restart_syscall()
[ Upstream commit 9f76d59173d9d146e96c66886b671c1915a5c5e5 ]

The nanosleep syscalls use the restart_block mechanism, with a quirk:
The `type` and `rmtp`/`compat_rmtp` fields are set up unconditionally on
syscall entry, while the rest of the restart_block is only set up in the
unlikely case that the syscall is actually interrupted by a signal (or
pseudo-signal) that doesn't have a signal handler.

If the restart_block was set up by a previous syscall (futex(...,
FUTEX_WAIT, ...) or poll()) and hasn't been invalidated somehow since then,
this will clobber some of the union fields used by futex_wait_restart() and
do_restart_poll().

If userspace afterwards wrongly calls the restart_syscall syscall,
futex_wait_restart()/do_restart_poll() will read struct fields that have
been clobbered.

This doesn't actually lead to anything particularly interesting because
none of the union fields contain trusted kernel data, and
futex(..., FUTEX_WAIT, ...) and poll() aren't syscalls where it makes much
sense to apply seccomp filters to their arguments.

So the current consequences are just of the "if userspace does bad stuff,
it can damage itself, and that's not a problem" flavor.

But still, it seems like a hazard for future developers, so invalidate the
restart_block when partly setting it up in the nanosleep syscalls.
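
The hazard and the invalidation can be modelled with a small union. This is
a simplified sketch, not the kernel's struct restart_block: the field layout,
ERR_INTR, and every function below are hypothetical, and the sketch assumes
that "invalidating" means pointing the restart handler at one that refuses
to restart.

```c
#include <assert.h>
#include <stddef.h>

struct restart_block;
typedef long (*restart_fn)(struct restart_block *);

/* Simplified stand-in: one handler plus per-syscall data sharing storage. */
struct restart_block {
    restart_fn fn;
    union {
        struct { unsigned int *uaddr; unsigned int val; } futex;
        struct { int type; void *rmtp; } nanosleep;
    } u;
};

#define ERR_INTR (-4L)  /* stands in for an "interrupted" error code */

/* Handler used when there is nothing valid to restart. */
static long no_restart(struct restart_block *rb)
{
    (void)rb;
    return ERR_INTR;
}

/* Handler that trusts the futex view of the union. */
static long futex_restart(struct restart_block *rb)
{
    return (long)rb->u.futex.val;
}

static void futex_wait_setup(struct restart_block *rb, unsigned int *uaddr,
                             unsigned int val)
{
    assert(rb != NULL);
    rb->u.futex.uaddr = uaddr;
    rb->u.futex.val = val;
    rb->fn = futex_restart;
}

/* Partial setup on syscall entry: invalidate first, then write the union. */
static void nanosleep_entry(struct restart_block *rb, void *rmtp)
{
    assert(rb != NULL);
    rb->fn = no_restart;  /* without this, fn would still be futex_restart */
    rb->u.nanosleep.type = 1;
    rb->u.nanosleep.rmtp = rmtp;
}

static long restart_syscall(struct restart_block *rb)
{
    return rb->fn(rb);
}
```

Without the invalidation in nanosleep_entry(), a later restart_syscall()
would run futex_restart() on union fields that nanosleep has overwritten.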

Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20230105134403.754986-1-jannh@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10 09:33:49 +01:00
Zqiang
5a2153b83c rcu-tasks: Handle queue-shrink/callback-enqueue race condition
[ Upstream commit a4fcfbee8f6274f9b3f9a71dd5b03e6772ce33f3 ]

The rcu_tasks_need_gpcb() function determines whether or not: (1) There are
callbacks needing another grace period, (2) There are callbacks ready
to be invoked, and (3) It would be a good time to shrink back down to a
single-CPU callback list.  This third case is interesting because some
other CPU might be adding new callbacks, which might suddenly make this
a very bad time to be shrinking.

This is currently handled by requiring call_rcu_tasks_generic() to
enqueue callbacks under the protection of rcu_read_lock() and requiring
rcu_tasks_need_gpcb() to wait for an RCU grace period to elapse before
finalizing the transition.  This works well in practice.

Unfortunately, the current code assumes that a grace period whose end is
detected by the poll_state_synchronize_rcu() in the second "if" condition
actually ended before the earlier code counted the callbacks queued on
CPUs other than CPU 0 (local variable "ncbsnz").  Given the current code,
it is possible that a long-delayed call_rcu_tasks_generic() invocation
will queue a callback on a non-zero CPU after these CPUs have had their
callbacks counted and zero has been stored to ncbsnz.  Such a callback
would trigger the WARN_ON_ONCE() in the second "if" statement.

To see this, consider the following sequence of events:

o	CPU 0 invokes rcu_tasks_one_gp(), and counts fewer than
	rcu_task_collapse_lim callbacks.  It sees at least one
	callback queued on some other CPU, thus setting ncbsnz
	to a non-zero value.

o	CPU 1 invokes call_rcu_tasks_generic() and loads 42 from
	->percpu_enqueue_lim.  It therefore decides to enqueue its
	callback onto CPU 1's callback list, but is delayed.

o	CPU 0 sees that rcu_task_cb_adjust is non-zero and that the number
	of callbacks does not exceed rcu_task_collapse_lim.  It therefore
	checks percpu_enqueue_lim, and sees that its value is greater
	than the value one.  CPU 0 therefore starts the shift back
	to a single callback list.  It sets ->percpu_enqueue_lim to 1,
	but CPU 1 has already read the old value of 42.  It also gets
	a grace-period state value from get_state_synchronize_rcu().

o	CPU 0 sees that ncbsnz is non-zero in its second "if" statement,
	so it declines to finalize the shrink operation.

o	CPU 0 again invokes rcu_tasks_one_gp(), and counts fewer than
	rcu_task_collapse_lim callbacks.  It also sees that there are
	no callbacks queued on any other CPU, and thus sets ncbsnz to zero.

o	CPU 1 resumes execution and enqueues its callback onto its own
	list.  This invalidates the value of ncbsnz.

o	CPU 0 sees that rcu_task_cb_adjust is non-zero and that the number
	of callbacks does not exceed rcu_task_collapse_lim.  It therefore
	checks percpu_enqueue_lim, but sees that its value is already
	unity.  It therefore does not get a new grace-period state value.

o	CPU 0 sees that rcu_task_cb_adjust is non-zero, ncbsnz is zero,
	and that poll_state_synchronize_rcu() says that the grace period
	has completed.  It therefore finalizes the shrink operation,
	setting ->percpu_dequeue_lim to the value one.

o	CPU 0 does a debug check, scanning the other CPUs' callback lists.
	It sees that CPU 1's list has a callback, so it (rightly)
	triggers the WARN_ON_ONCE().  After all, the new value of
	->percpu_dequeue_lim says to not bother looking at CPU 1's
	callback list, which means that this callback will never be
	invoked.  This can result in hangs and maybe even OOMs.

Based on long experience with rcutorture, this is an extremely
low-probability race condition, but it really can happen, especially in
preemptible kernels or within guest OSes.

This commit therefore checks for completion of the grace period
before counting callbacks.  With this change, in the above failure
scenario CPU 0 would know not to prematurely end the shrink operation
because the grace period would not have completed before the count
operation started.
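
The reordering can be illustrated with a toy, single-threaded replay of
the scenario above.  This is only a sketch: every name in it is
hypothetical, and it assumes (as the scenario implies) that the grace
period cannot end until the delayed enqueue on CPU 1 has landed.  It is
not the kernel's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the state that the scenario talks about. */
struct toy_state {
	int cbs_on_cpu1;	/* callbacks queued on the non-zero CPU */
	bool gp_done;		/* what a grace-period poll would report */
	int dequeue_lim;	/* stand-in for ->percpu_dequeue_lim */
};

/* CPU 1's delayed enqueue.  Model assumption: the grace period can
 * only end once this enqueue has landed. */
static void toy_delayed_enqueue(struct toy_state *s)
{
	s->cbs_on_cpu1++;
	s->gp_done = true;
}

/* One pass of the shrink check.  Returns true if a callback ends up
 * stranded on a list that will no longer be scanned. */
static bool toy_shrink_pass(struct toy_state *s, bool poll_before_count)
{
	bool gpdone = false;
	int ncbsnz;

	assert(s != NULL);

	if (poll_before_count)
		gpdone = s->gp_done;	/* new order: poll, then count */

	ncbsnz = s->cbs_on_cpu1;	/* count callbacks on other CPUs */
	toy_delayed_enqueue(s);		/* the race window */

	if (!poll_before_count)
		gpdone = s->gp_done;	/* old order: count, then poll */

	if (gpdone && ncbsnz == 0)
		s->dequeue_lim = 1;	/* finalize the shrink */

	return s->dequeue_lim == 1 && s->cbs_on_cpu1 > 0;
}
```

In this model, polling after the count finalizes the shrink with a stale
count and strands the callback, whereas polling before the count leaves
the shrink pending for a later pass.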

[ paulmck: Adjust grace-period end rather than adding RCU reader. ]
[ paulmck: Avoid spurious WARN_ON_ONCE() with ->percpu_dequeue_lim check. ]

Signed-off-by: Zqiang <qiang1.zhang@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10 09:33:48 +01:00
Zqiang
94ed8ac1bb rcu-tasks: Make rude RCU-Tasks work well with CPU hotplug
[ Upstream commit ea5c8987fef20a8cca07e428aa28bc64649c5104 ]

The synchronize_rcu_tasks_rude() function invokes rcu_tasks_rude_wait_gp()
to wait one rude RCU-tasks grace period.  The rcu_tasks_rude_wait_gp()
function in turn checks if there is only a single online CPU.  If so, it
will immediately return, because a call to synchronize_rcu_tasks_rude()
is by definition a grace period on a single-CPU system.  (We could
have blocked!)

Unfortunately, this check uses num_online_cpus() without synchronization,
which can result in too-short grace periods.  To see this, consider the
following scenario:

        CPU0                                   CPU1 (going offline)
                                          migration/1 task:
                                      cpu_stopper_thread
                                       -> take_cpu_down
                                          -> _cpu_disable
                                           (dec __num_online_cpus)
                                          ->cpuhp_invoke_callback
                                                preempt_disable
                                                access old_data0
           task1
 del old_data0                                  .....
 synchronize_rcu_tasks_rude()
 task1 schedule out
 ....
 task2 schedule in
 rcu_tasks_rude_wait_gp()
     ->__num_online_cpus == 1
       ->return
 ....
 task1 schedule in
 ->free old_data0
                                                preempt_enable

When CPU1 decrements __num_online_cpus, its value becomes 1.  However,
CPU1 has not finished going offline, and will take one last trip through
the scheduler and the idle loop before it actually stops executing
instructions.  Because synchronize_rcu_tasks_rude() is mostly used for
tracing, and because both the scheduler and the idle loop can be traced,
this means that CPU0's prematurely ended grace period might disrupt the
tracing on CPU1.  Given that this disruption might include CPU1 executing
instructions in memory that was just now freed (and maybe reallocated),
this is a matter of some concern.

This commit therefore removes that problematic single-CPU check from the
rcu_tasks_rude_wait_gp() function.  This dispenses with the single-CPU
optimization, but there is no evidence indicating that this optimization
is important.  In addition, synchronize_rcu_tasks_generic() contains a
similar optimization (albeit only for early boot), which also splats.
(As in exactly why are you invoking synchronize_rcu_tasks_rude() so
early in boot, anyway???)

It is OK for the synchronize_rcu_tasks_rude() function's check to be
unsynchronized because the only time that this check can evaluate to
true is when there is only a single CPU running with preemption
disabled.

While in the area, this commit also fixes a minor bug in which a
call to synchronize_rcu_tasks_rude() would instead be attributed to
synchronize_rcu_tasks().

[ paulmck: Add "synchronize_" prefix and "()" suffix. ]

Signed-off-by: Zqiang <qiang1.zhang@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10 09:33:48 +01:00
Pingfan Liu
2c4d26dad7 srcu: Delegate work to the boot cpu if using SRCU_SIZE_SMALL
[ Upstream commit 7f24626d6dd844bfc6d1f492d214d29c86d02550 ]

Commit 994f706872 ("srcu: Make Tree SRCU able to operate without
snp_node array") assumes that cpu 0 is always online.  However, there
really are situations when some other CPU is the boot CPU, for example,
when booting a kdump kernel with the maxcpus=1 boot parameter.

On PowerPC, the kdump kernel can hang as follows:
...
[    1.740036] systemd[1]: Hostname set to <xyz.com>
[  243.686240] INFO: task systemd:1 blocked for more than 122 seconds.
[  243.686264]       Not tainted 6.1.0-rc1 #1
[  243.686272] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  243.686281] task:systemd         state:D stack:0     pid:1     ppid:0      flags:0x00042000
[  243.686296] Call Trace:
[  243.686301] [c000000016657640] [c000000016657670] 0xc000000016657670 (unreliable)
[  243.686317] [c000000016657830] [c00000001001dec0] __switch_to+0x130/0x220
[  243.686333] [c000000016657890] [c000000010f607b8] __schedule+0x1f8/0x580
[  243.686347] [c000000016657940] [c000000010f60bb4] schedule+0x74/0x140
[  243.686361] [c0000000166579b0] [c000000010f699b8] schedule_timeout+0x168/0x1c0
[  243.686374] [c000000016657a80] [c000000010f61de8] __wait_for_common+0x148/0x360
[  243.686387] [c000000016657b20] [c000000010176bb0] __flush_work.isra.0+0x1c0/0x3d0
[  243.686401] [c000000016657bb0] [c0000000105f2768] fsnotify_wait_marks_destroyed+0x28/0x40
[  243.686415] [c000000016657bd0] [c0000000105f21b8] fsnotify_destroy_group+0x68/0x160
[  243.686428] [c000000016657c40] [c0000000105f6500] inotify_release+0x30/0xa0
[  243.686440] [c000000016657cb0] [c0000000105751a8] __fput+0xc8/0x350
[  243.686452] [c000000016657d00] [c00000001017d524] task_work_run+0xe4/0x170
[  243.686464] [c000000016657d50] [c000000010020e94] do_notify_resume+0x134/0x140
[  243.686478] [c000000016657d80] [c00000001002eb18] interrupt_exit_user_prepare_main+0x198/0x270
[  243.686493] [c000000016657de0] [c00000001002ec60] syscall_exit_prepare+0x70/0x180
[  243.686505] [c000000016657e10] [c00000001000bf7c] system_call_vectored_common+0xfc/0x280
[  243.686520] --- interrupt: 3000 at 0x7fffa47d5ba4
[  243.686528] NIP:  00007fffa47d5ba4 LR: 0000000000000000 CTR: 0000000000000000
[  243.686538] REGS: c000000016657e80 TRAP: 3000   Not tainted  (6.1.0-rc1)
[  243.686548] MSR:  800000000000d033 <SF,EE,PR,ME,IR,DR,RI,LE>  CR: 42044440  XER: 00000000
[  243.686572] IRQMASK: 0
[  243.686572] GPR00: 0000000000000006 00007ffffa606710 00007fffa48e7200 0000000000000000
[  243.686572] GPR04: 0000000000000002 000000000000000a 0000000000000000 0000000000000001
[  243.686572] GPR08: 000001000c172dd0 0000000000000000 0000000000000000 0000000000000000
[  243.686572] GPR12: 0000000000000000 00007fffa4ff4bc0 0000000000000000 0000000000000000
[  243.686572] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[  243.686572] GPR20: 0000000132dfdc50 000000000000000e 0000000000189375 0000000000000000
[  243.686572] GPR24: 00007ffffa606ae0 0000000000000005 000001000c185490 000001000c172570
[  243.686572] GPR28: 000001000c172990 000001000c184850 000001000c172e00 00007fffa4fedd98
[  243.686683] NIP [00007fffa47d5ba4] 0x7fffa47d5ba4
[  243.686691] LR [0000000000000000] 0x0
[  243.686698] --- interrupt: 3000
[  243.686708] INFO: task kworker/u16:1:24 blocked for more than 122 seconds.
[  243.686717]       Not tainted 6.1.0-rc1 #1
[  243.686724] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  243.686733] task:kworker/u16:1   state:D stack:0     pid:24    ppid:2      flags:0x00000800
[  243.686747] Workqueue: events_unbound fsnotify_mark_destroy_workfn
[  243.686758] Call Trace:
[  243.686762] [c0000000166736e0] [c00000004fd91000] 0xc00000004fd91000 (unreliable)
[  243.686775] [c0000000166738d0] [c00000001001dec0] __switch_to+0x130/0x220
[  243.686788] [c000000016673930] [c000000010f607b8] __schedule+0x1f8/0x580
[  243.686801] [c0000000166739e0] [c000000010f60bb4] schedule+0x74/0x140
[  243.686814] [c000000016673a50] [c000000010f699b8] schedule_timeout+0x168/0x1c0
[  243.686827] [c000000016673b20] [c000000010f61de8] __wait_for_common+0x148/0x360
[  243.686840] [c000000016673bc0] [c000000010210840] __synchronize_srcu.part.0+0xa0/0xe0
[  243.686855] [c000000016673c30] [c0000000105f2c64] fsnotify_mark_destroy_workfn+0xc4/0x1a0
[  243.686868] [c000000016673ca0] [c000000010174ea8] process_one_work+0x2a8/0x570
[  243.686882] [c000000016673d40] [c000000010175208] worker_thread+0x98/0x5e0
[  243.686895] [c000000016673dc0] [c0000000101828d4] kthread+0x124/0x130
[  243.686908] [c000000016673e10] [c00000001000cd40] ret_from_kernel_thread+0x5c/0x64
[  366.566274] INFO: task systemd:1 blocked for more than 245 seconds.
[  366.566298]       Not tainted 6.1.0-rc1 #1
[  366.566305] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  366.566314] task:systemd         state:D stack:0     pid:1     ppid:0      flags:0x00042000
[  366.566329] Call Trace:
...

The above splat occurs because PowerPC really does use maxcpus=1
instead of nr_cpus=1 in the kernel command line.  Consequently, the
(quite possibly non-zero) kdump CPU is the only online CPU in the kdump
kernel.  SRCU unconditionally queues sdp->work on CPU 0, for which no
worker thread has been created, so sdp->work will never be executed and
__synchronize_srcu() will never complete.

This commit therefore replaces CPU ID 0 with get_boot_cpu_id() in key
places in Tree SRCU.  Since the CPU indicated by get_boot_cpu_id()
is guaranteed to be online, this avoids the above splat.
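
The effect of the substitution can be sketched in user space.  The
structure and helpers below are hypothetical stand-ins (the toy boot_cpu
field plays the role of get_boot_cpu_id()); this is an illustration of
the idea, not the Tree SRCU code.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 8

/* Hypothetical stand-ins for the online mask and the boot CPU ID. */
struct toy_sys {
	bool online[TOY_NR_CPUS];
	int boot_cpu;
};

/* Old behavior: always target CPU 0, whether or not it is online. */
static int toy_work_cpu_old(const struct toy_sys *sys)
{
	(void)sys;
	return 0;
}

/* New behavior: target the boot CPU, which is assumed to be online. */
static int toy_work_cpu_new(const struct toy_sys *sys)
{
	assert(sys->online[sys->boot_cpu]);
	return sys->boot_cpu;
}

/* Work queued on an offline CPU has no worker thread to run it. */
static bool toy_work_will_run(const struct toy_sys *sys, int cpu)
{
	return cpu >= 0 && cpu < TOY_NR_CPUS && sys->online[cpu];
}
```

With only CPU 3 online, as might happen in a kdump kernel booted with
maxcpus=1, the old choice picks a CPU that never runs the work, while
the new choice picks CPU 3.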

Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: rcu@vger.kernel.org
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10 09:33:48 +01:00
Paul E. McKenney
05737bd85c rcu: Suppress smp_processor_id() complaint in synchronize_rcu_expedited_wait()
[ Upstream commit 2d7f00b2f01301d6e41fd4a28030dab0442265be ]

The normal grace period's RCU CPU stall warnings are invoked from the
scheduling-clock interrupt handler, and can thus invoke smp_processor_id()
with impunity, which allows them to directly invoke dump_cpu_task().
In contrast, the expedited grace period's RCU CPU stall warnings are
invoked from process context, which causes the dump_cpu_task() function's
calls to smp_processor_id() to complain bitterly in debug kernels.

This commit therefore causes synchronize_rcu_expedited_wait() to disable
preemption around its call to dump_cpu_task().

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10 09:33:48 +01:00
Peter Zijlstra
b78434f6ee cpuidle: lib/bug: Disable rcu_is_watching() during WARN/BUG
[ Upstream commit 5a5d7e9badd2cb8065db171961bd30bd3595e4b6 ]

In order to avoid WARN/BUG from generating nested or even recursive
warnings, force rcu_is_watching() true during
WARN/lockdep_rcu_suspicious().

Notably, things like unwinding the stack can trigger rcu_dereference()
warnings, which then trigger more unwinding, which then triggers more
warnings, and so on.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230126151323.408156109@infradead.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10 09:33:47 +01:00
Greg Kroah-Hartman
3036f5f5ae trace/blktrace: fix memory leak with using debugfs_lookup()
[ Upstream commit 83e8864fee26f63a7435e941b7c36a20fd6fe93e ]

When calling debugfs_lookup() the result must have dput() called on it,
otherwise the memory will leak over time.  To make things simpler, just
call debugfs_lookup_and_remove() instead which handles all of the logic
at once.
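
The leak pattern can be modeled in user space with a reference counter.
All names below are hypothetical toy stand-ins: toy_lookup() models
debugfs_lookup(), toy_put() models dput(), and toy_lookup_and_remove()
models the combined helper.  This is a sketch of the reference-counting
idea only, not the debugfs implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a dentry. */
struct toy_dentry {
	int lookup_refs;	/* references handed out by toy_lookup() */
	int removed;
};

/* Models a lookup: returns the entry holding one extra reference. */
static struct toy_dentry *toy_lookup(struct toy_dentry *d)
{
	if (d == NULL || d->removed)
		return NULL;
	d->lookup_refs++;
	return d;
}

/* Models dropping the reference that a lookup returned. */
static void toy_put(struct toy_dentry *d)
{
	assert(d != NULL && d->lookup_refs > 0);
	d->lookup_refs--;
}

/* Models removal; it does not drop the caller's lookup reference. */
static void toy_remove(struct toy_dentry *d)
{
	if (d != NULL)
		d->removed = 1;
}

/* Leaky pattern: the looked-up reference is never put. */
static void toy_teardown_leaky(struct toy_dentry *d)
{
	toy_remove(toy_lookup(d));
}

/* Combined pattern: look up, remove, then drop the reference. */
static void toy_lookup_and_remove(struct toy_dentry *d)
{
	struct toy_dentry *found = toy_lookup(d);

	if (found == NULL)
		return;
	toy_remove(found);
	toy_put(found);
}
```

After the leaky teardown one lookup reference remains outstanding; after
the combined helper none does.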

Cc: Jens Axboe <axboe@kernel.dk>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: linux-block@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-trace-kernel@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20230202141956.2299521-1-gregkh@linuxfoundation.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10 09:33:46 +01:00
Nicholas Piggin
711bd1b553 exit: Detect and fix irq disabled state in oops
[ Upstream commit 001c28e57187570e4b5aa4492c7a957fb6d65d7b ]

If a task oopses with irqs disabled, this can cause various cascading
problems in the oops path such as sleep-from-invalid warnings, and
potentially worse.

Since commit 0258b5fd7c ("coredump: Limit coredumps to a single
thread group"), the unconditional irq enable in coredump_task_exit()
will "fix" the irq state to be enabled early in do_exit(), so currently
this may not be triggerable, but that is coincidental and fragile.

Detect and fix the irqs_disabled() condition in the oops path before
calling do_exit(), similarly to the way in_atomic() is handled.

Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Link: https://lore.kernel.org/lkml/20221004094401.708299-1-npiggin@gmail.com/
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10 09:33:45 +01:00
Peter Zijlstra
982f8ef8ec context_tracking: Fix noinstr vs KASAN
[ Upstream commit 0e26e1de0032779e43929174339429c16307a299 ]

Low level noinstr context-tracking code is calling out to instrumented
code on KASAN:

  vmlinux.o: warning: objtool: __ct_user_enter+0x72: call to __kasan_check_write() leaves .noinstr.text section
  vmlinux.o: warning: objtool: __ct_user_exit+0x47: call to __kasan_check_write() leaves .noinstr.text section

Use even lower level atomic methods to avoid the instrumentation.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230112195542.458034262@infradead.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10 09:33:45 +01:00
Andrii Nakryiko
bb1cc7fc3e bpf: Fix global subprog context argument resolution logic
[ Upstream commit d384dce281ed1b504fae2e279507827638d56fa3 ]

KPROBE program's user-facing context type is defined as typedef
bpf_user_pt_regs_t. This leads to a problem when trying to pass a
kprobe/uprobe/usdt context argument into a global subprog, as the kernel
always strips away mods and typedefs of the user-supplied type, but takes
the expected type from bpf_ctx_convert as is, which causes a mismatch.

Current way to work around this is to define a fake struct with the same
name as expected typedef:

  struct bpf_user_pt_regs_t {};

  __noinline my_global_subprog(struct bpf_user_pt_regs_t *ctx) { ... }

This patch fixes the issue by resolving expected type, if it's not
a struct. It still leaves the above work-around working for backwards
compatibility.

Fixes: 91cc1a9974 ("bpf: Annotate context types")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/20230216045954.3002473-2-andrii@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10 09:33:06 +01:00
Hou Tao
678ea18d62 bpf: Zeroing allocated object from slab in bpf memory allocator
[ Upstream commit 997849c4b969034e225153f41026657def66d286 ]

Currently the freed element in bpf memory allocator may be immediately
reused, for htab map the reuse will reinitialize special fields in map
value (e.g., bpf_spin_lock), but lookup procedure may still access
these special fields, and it may lead to hard-lockup as shown below:

 NMI backtrace for cpu 16
 CPU: 16 PID: 2574 Comm: htab.bin Tainted: G             L     6.1.0+ #1
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
 RIP: 0010:queued_spin_lock_slowpath+0x283/0x2c0
 ......
 Call Trace:
  <TASK>
  copy_map_value_locked+0xb7/0x170
  bpf_map_copy_value+0x113/0x3c0
  __sys_bpf+0x1c67/0x2780
  __x64_sys_bpf+0x1c/0x20
  do_syscall_64+0x30/0x60
  entry_SYSCALL_64_after_hwframe+0x46/0xb0
 ......
  </TASK>

For htab map, just like the preallocated case, there is no need to
initialize these special fields in map value again once these fields
have been initialized. For preallocated htab map, these fields are
initialized through __GFP_ZERO in bpf_map_area_alloc(), so do the
similar thing for non-preallocated htab in bpf memory allocator. And
there is no need to use __GFP_ZERO for per-cpu bpf memory allocator,
because __alloc_percpu_gfp() does it implicitly.

Fixes: 0fd7c5d433 ("bpf: Optimize call_rcu in non-preallocated hash map.")
Signed-off-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/r/20230215082132.3856544-2-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10 09:33:06 +01:00
Lai Jiangshan
0903111d67 workqueue: Protects wq_unbound_cpumask with wq_pool_attach_mutex
[ Upstream commit 99c621ef243bda726fb8d982a274ded96570b410 ]

When unbind_workers() reads wq_unbound_cpumask to set the affinity of
freshly-unbound kworkers, it only holds wq_pool_attach_mutex. This isn't
sufficient as wq_unbound_cpumask is only protected by wq_pool_mutex.

Make wq_unbound_cpumask protected with wq_pool_attach_mutex and also
remove the need for the temporary saved_cpumask.

Fixes: 10a5a651e3 ("workqueue: Restrict kworker in the offline CPU pool running on housekeeping CPUs")
Reported-by: Valentin Schneider <vschneid@redhat.com>
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10 09:32:53 +01:00
Frederic Weisbecker
62030a4915 rcu-tasks: Fix synchronize_rcu_tasks() VS zap_pid_ns_processes()
[ Upstream commit 28319d6dc5e2ffefa452c2377dd0f71621b5bff0 ]

RCU Tasks and PID-namespace unshare can interact in do_exit() in a
complicated circular dependency:

1) TASK A calls unshare(CLONE_NEWPID), this creates a new PID namespace
   that every subsequent child of TASK A will belong to. But TASK A
   doesn't itself belong to that new PID namespace.

2) TASK A forks() and creates TASK B. TASK A stays attached to its PID
   namespace (let's say PID_NS1) and TASK B is the first task belonging
   to the new PID namespace created by unshare()  (let's call it PID_NS2).

3) Since TASK B is the first task attached to PID_NS2, it becomes the
   PID_NS2 child reaper.

4) TASK A forks() again and creates TASK C which gets attached to PID_NS2.
   Note how TASK C has TASK A as a parent (belonging to PID_NS1) but has
   TASK B (belonging to PID_NS2) as a pid_namespace child_reaper.

5) TASK B exits and since it is the child reaper for PID_NS2, it has to
   kill all other tasks attached to PID_NS2, and wait for all of them to
   die before getting reaped itself (zap_pid_ns_processes()).

6) TASK A calls synchronize_rcu_tasks() which leads to
   synchronize_srcu(&tasks_rcu_exit_srcu).

7) TASK B is waiting for TASK C to get reaped. But TASK B is under a
   tasks_rcu_exit_srcu SRCU critical section (exit_notify() is between
   exit_tasks_rcu_start() and exit_tasks_rcu_finish()), blocking TASK A.

8) TASK C exits and since TASK A is its parent, it waits for it to reap
   TASK C, but it can't because TASK A waits for TASK B that waits for
   TASK C.

Pid_namespace semantics can hardly be changed at this point. But the
coverage of tasks_rcu_exit_srcu can be reduced instead.

The current task is assumed not to be concurrently reapable at this
stage of exit_notify() and therefore tasks_rcu_exit_srcu can be
temporarily relaxed without breaking its constraints, providing a way
out of the deadlock scenario.

[ paulmck: Fix build failure by adding additional declaration. ]

Fixes: 3f95aa81d2 ("rcu: Make TASKS_RCU handle tasks that are almost done exiting")
Reported-by: Pengfei Xu <pengfei.xu@intel.com>
Suggested-by: Boqun Feng <boqun.feng@gmail.com>
Suggested-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Suggested-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Eric W . Biederman <ebiederm@xmission.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10 09:32:52 +01:00
Frederic Weisbecker
24f259ce3a rcu-tasks: Remove preemption disablement around srcu_read_[un]lock() calls
[ Upstream commit 44757092958bdd749775022f915b7ac974384c2a ]

Ever since the following commit:

	5a41344a3d ("srcu: Simplify __srcu_read_unlock() via this_cpu_dec()")

SRCU no longer relies on preemption being disabled in order to
modify the per-CPU counter. And even then it used to be done from the API
itself.

Therefore and after checking further, it appears to be safe to remove
the preemption disablement around __srcu_read_[un]lock() in
exit_tasks_rcu_start() and exit_tasks_rcu_finish().

Suggested-by: Boqun Feng <boqun.feng@gmail.com>
Suggested-by: Paul E. McKenney <paulmck@kernel.org>
Suggested-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Stable-dep-of: 28319d6dc5e2 ("rcu-tasks: Fix synchronize_rcu_tasks() VS zap_pid_ns_processes()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10 09:32:51 +01:00
Frederic Weisbecker
a2b0cda452 rcu-tasks: Improve comments explaining tasks_rcu_exit_srcu purpose
[ Upstream commit e4e1e8089c5fd948da12cb9f4adc93821036945f ]

Make sure we don't need to look again into the depths of git blame in
order not to miss a subtle part about how rcu-tasks is dealing with
exiting tasks.

Suggested-by: Boqun Feng <boqun.feng@gmail.com>
Suggested-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Suggested-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Stable-dep-of: 28319d6dc5e2 ("rcu-tasks: Fix synchronize_rcu_tasks() VS zap_pid_ns_processes()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10 09:32:51 +01:00
Pietro Borrello
6b4fcc4e8a sched/rt: pick_next_rt_entity(): check list_entry
[ Upstream commit 7c4a5b89a0b5a57a64b601775b296abf77a9fe97 ]

Commit 326587b840 ("sched: fix goto retry in pick_next_task_rt()")
removed any path which could make pick_next_rt_entity() return NULL.
However, BUG_ON(!rt_se) in _pick_next_task_rt() (the only caller of
pick_next_rt_entity()) still checks the error condition, which can
never happen, since list_entry() never returns NULL.
Remove the BUG_ON check, and instead emit a warning in the only
possible error condition here: the queue being empty which should
never happen.
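
Why the NULL check was dead code can be shown with a user-space sketch.
The toy list type and macro below are hypothetical stand-ins for the
kernel's list helpers: an entry pointer is computed by subtracting a
member offset from the list pointer, so it is never NULL, and the empty
queue has to be detected explicitly instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal circular doubly linked list. */
struct toy_list_head {
	struct toy_list_head *next, *prev;
};

/* Pointer arithmetic only: the result is never NULL for a valid ptr. */
#define toy_list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct toy_rt_entity {
	int prio;
	struct toy_list_head run_list;
};

static void toy_list_init(struct toy_list_head *h)
{
	h->next = h;
	h->prev = h;
}

static int toy_list_empty(const struct toy_list_head *h)
{
	return h->next == h;
}

/* Check for the empty queue first; only then convert to an entry. */
static struct toy_rt_entity *toy_pick_next(struct toy_list_head *queue)
{
	assert(queue != NULL);
	if (toy_list_empty(queue))
		return NULL;
	return toy_list_entry(queue->next, struct toy_rt_entity, run_list);
}
```

On an empty queue, applying the entry macro directly would yield a bogus
non-NULL pointer derived from the list head itself, which is why the
sketch tests for emptiness before converting.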

Fixes: 326587b840 ("sched: fix goto retry in pick_next_task_rt()")
Signed-off-by: Pietro Borrello <borrello@diag.uniroma1.it>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Phil Auld <pauld@redhat.com>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20230128-list-entry-null-check-sched-v3-1-b1a71bd1ac6b@diag.uniroma1.it
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10 09:32:45 +01:00
Waiman Long
fd38b56f3a locking/rwsem: Disable preemption in all down_read*() and up_read() code paths
[ Upstream commit 3f5245538a1964ae186ab7e1636020a41aa63143 ]

Commit:

  91d2a812df ("locking/rwsem: Make handoff writer optimistically spin on owner")

... assumes that when the owner field is changed to NULL, the lock will
become free soon. But commit:

  48dfb5d256 ("locking/rwsem: Disable preemption while trying for rwsem lock")

... disabled preemption when acquiring rwsem for write.

However, preemption has not yet been disabled when acquiring a read lock
on a rwsem.  So a reader can add RWSEM_READER_BIAS to count and, before
setting owner to signal a reader, get preempted by an RT task which then
spins in the writer slowpath because owner remains NULL, leading to a
livelock.

One easy way to fix this problem is to disable preemption at all the
down_read*() and up_read() code paths as implemented in this patch.

Fixes: 91d2a812df ("locking/rwsem: Make handoff writer optimistically spin on owner")
Reported-by: Mukesh Ojha <quic_mojha@quicinc.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230126003628.365092-3-longman@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10 09:32:41 +01:00
Neill Kapron
bd772a54d2 Revert "exit: Remove profile_task_exit & profile_munmap"
This reverts commit 2d4bcf886e, which
removed hooks required for Android uid_sys_stats.c.

Bug: 219790626
Change-Id: Icea00bbf9abe2fb17312b25f2d575d29aa360999
[nkapron: resolve conflict with 2873cd31a2 exit: Remove profile_handoff_task]
Signed-off-by: Neill Kapron <nkapron@google.com>
2023-03-09 23:13:08 +00:00
YOUNGJIN JOO
b97cdd877a ANDROID: printk: export symbol for tracepoint_console
Initial kernel bootup logs get overwritten after the system has been
running for a long time, and there are debugging scenarios where
the initial ~100s of bootup logs are needed.
'tracepoint_console' helps achieve this.

'tracepoint_console' replaces 'android_vh_log_buf' and
'android_vh_logbuf_pr_cong' vendor hooks in previous GKI kernels.

Bug: 271373835
Change-Id: If68801ba584e8e71e3e7aa16c64a5588c1f5a114
Signed-off-by: YOUNGJIN JOO <youngjin79.joo@samsung.com>
2023-03-03 17:15:34 +00:00
Greg Kroah-Hartman
e1300f4942 This is the 6.1.15 stable release

Merge 6.1.15 into android14-6.1

Changes in 6.1.15
	Fix XFRM-I support for nested ESP tunnels
	arm64: dts: rockchip: reduce thermal limits on rk3399-pinephone-pro
	arm64: dts: rockchip: drop unused LED mode property from rk3328-roc-cc
	ARM: dts: rockchip: add power-domains property to dp node on rk3288
	arm64: dts: rockchip: add missing #interrupt-cells to rk356x pcie2x1
	arm64: dts: rockchip: fix probe of analog sound card on rock-3a
	HID: elecom: add support for TrackBall 056E:011C
	HID: Ignore battery for Elan touchscreen on Asus TP420IA
	ACPI: NFIT: fix a potential deadlock during NFIT teardown
	pinctrl: amd: Fix debug output for debounce time
	btrfs: send: limit number of clones and allocated memory size
	arm64: dts: rockchip: align rk3399 DMC OPP table with bindings
	ASoC: rt715-sdca: fix clock stop prepare timeout issue
	IB/hfi1: Assign npages earlier
	powerpc: Don't select ARCH_WANTS_NO_INSTR
	ASoC: SOF: amd: Fix for handling spurious interrupts from DSP
	ARM: dts: stihxxx-b2120: fix polarity of reset line of tsin0 port
	neigh: make sure used and confirmed times are valid
	HID: core: Fix deadloop in hid_apply_multiplier.
	ASoC: codecs: es8326: Fix DTS properties reading
	HID: Ignore battery for ELAN touchscreen 29DF on HP
	selftests: ocelot: tc_flower_chains: make test_vlan_ingress_modify() more comprehensive
	x86/cpu: Add Lunar Lake M
	PM: sleep: Avoid using pr_cont() in the tasks freezing code
	bpf: bpf_fib_lookup should not return neigh in NUD_FAILED state
	net: Remove WARN_ON_ONCE(sk->sk_forward_alloc) from sk_stream_kill_queues().
	vc_screen: don't clobber return value in vcs_read
	drm/amd/display: Move DCN314 DOMAIN power control to DMCUB
	drm/amd/display: Fix race condition in DPIA AUX transfer
	usb: dwc3: pci: add support for the Intel Meteor Lake-M
	USB: serial: option: add support for VW/Skoda "Carstick LTE"
	usb: gadget: u_serial: Add null pointer check in gserial_resume
	arm64: dts: uniphier: Fix property name in PXs3 USB node
	usb: typec: pd: Remove usb_suspend_supported sysfs from sink PDO
	drm/amd/display: Properly reuse completion structure
	attr: add in_group_or_capable()
	fs: move should_remove_suid()
	attr: add setattr_should_drop_sgid()
	attr: use consistent sgid stripping checks
	fs: use consistent setgid checks in is_sxid()
	scripts/tags.sh: fix incompatibility with PCRE2
	USB: core: Don't hold device lock while reading the "descriptors" sysfs file
	Linux 6.1.15

Change-Id: I2489d74e0905d26c0afb69f1036cb43890bec060
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-03-03 12:21:45 +00:00
Rafael J. Wysocki
f2173508b1 PM: sleep: Avoid using pr_cont() in the tasks freezing code
commit a449dfbfc0894676ad0aa1873383265047529e3a upstream.

Using pr_cont() in the tasks freezing code related to system-wide
suspend and hibernation is problematic, because the continuation
messages printed there are susceptible to interspersing with other
unrelated messages which results in output that is hard to
understand.

Address this issue by modifying try_to_freeze_tasks() to print
messages that don't require continuations and adjusting its
callers accordingly.

Reported-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-03 11:52:23 +01:00
Will McVicker
d834db9f2c ANDROID: modules: re-introduce the MODULE_SCMVERSION config
Config MODULE_SCMVERSION introduces a new module attribute --
`scmversion` -- which can be used to identify a given module's SCM
version.  This is very useful for developers that update their kernel
independently from their kernel modules or vice-versa since the SCM
version provided by UTS_RELEASE (`uname -r`) will now differ from the
module's vermagic attribute.

For example, we have a CI setup that tests new kernel changes on the
hikey960 and db845c devices without updating their kernel modules. When
these tests fail, we need to be able to identify the exact device
configuration the test was using. By including MODULE_SCMVERSION, we can
identify the exact kernel and modules' SCM versions for debugging the
failures.

Additionally, by exposing the SCM version via the sysfs node
/sys/module/MODULENAME/scmversion, one can also verify the SCM versions
of the modules loaded from the initramfs. Currently, modinfo can only
retrieve module attributes from the module's ko on disk and not from the
actual module that is loaded in RAM.

You can retrieve the SCM version in two ways:

1) By using modinfo:
    > modinfo -F scmversion MODULENAME
2) By module sysfs node:
    > cat /sys/module/MODULENAME/scmversion
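
The sysfs route can also be used from a program. The following is a minimal sketch, assuming the node layout described above; the function name read_module_scmversion() and the sysfs_root parameter are hypothetical and exist only so the lookup can be pointed at a test directory.

```c
#include <stdio.h>
#include <string.h>

/*
 * Read the SCM version of a loaded module from its sysfs node.
 * sysfs_root is normally "/sys/module".
 * Returns 0 on success and -1 if the node is missing or unreadable,
 * which is what to expect for a module that does not expose scmversion.
 */
int read_module_scmversion(const char *sysfs_root, const char *module,
                           char *out, size_t out_len)
{
    char path[512];
    FILE *f;
    int n;

    n = snprintf(path, sizeof(path), "%s/%s/scmversion", sysfs_root, module);
    if (n < 0 || (size_t)n >= sizeof(path) || out_len == 0)
        return -1;

    f = fopen(path, "r");
    if (!f)
        return -1;

    if (!fgets(out, (int)out_len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);

    out[strcspn(out, "\n")] = '\0';  /* strip the trailing newline */
    return 0;
}
```

Unlike modinfo, which reads the .ko on disk, this reads the value exported by the module that is actually loaded.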

Bug: 180027765
Link: https://lore.kernel.org/all/20210121213641.3477522-1-willmcvicker@google.com/
Signed-off-by: Will McVicker <willmcvicker@google.com>
Change-Id: Ib7c72c72f95c4545adb7cd4e842729557039ce3a
2023-03-01 01:47:13 +00:00
Greg Kroah-Hartman
02bdd918e6 Revert "sched/psi: Stop relying on timer_pending() for poll_work rescheduling"
This reverts commit afec25854c which is
commit 710ffe671e014d5ccbcff225130a178b088ef090 upstream.

It breaks the kernel ABI, but will be brought back in the near future
when an ABI break is allowed.

Bug: 161946584
Cc: Suren Baghdasaryan <surenb@google.com>
Change-Id: If05233e46a6d8baf11e53d4fbdb8ac3fcc5e7d0a
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-02-28 12:00:57 +00:00
Greg Kroah-Hartman
f9cc3a7058 This is the 6.1.14 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmP54jIACgkQONu9yGCS
 aT5fDRAAjFsMbxMrru7XL9In9tJM51bQdVADhYkVm3QmehDEFhcKryKn/WH8zJGb
 /r5aOhErgOMb52IjjTMDiUP7VmNjdfCMkL8JrWyBPZ5ZGMvxdGdaUeer5Q+n4HG/
 v1ES8T5vrZDFn4jKfbz2hK9adjOgjCry2oqnxqOyNN6b9kSdLY34mqGqUbfMsgkQ
 ZUJvLevwY2AKKMCjz3DQpxCDLMnfrCVvt7swGIFFmehtlYfrSf/HJgWpBNtoGvm0
 QRJY4yAb3EqQDv4AcEr8mO7QgH9IBKoMsSNuUO0Q14Pqg5cklMC4Mfc6ENlvw59I
 KchWeophErmdVOT7s6UnOxb4vygvCXI5Gf5eSg9K4esOjpQarOEKJDc5D6jidFnu
 O2xrF+RPzIhl/ud2NnnJ9uSs4mM63guVwW7QwxR7427dgYbJYDvErwskFzNayISV
 6kkBbus0AK7KxBVvZmOY/wUBTh93CS9gqVfoKO96IRpBOxkH9NLwgqdQmgOqRHB8
 e4SzvrjJlnLEXdTPDGSr0nzHKh735ab7H/xVeB64qBVwKClifpD882HuYgT4cxl+
 A0G+vbYGB1Ijdy3O7QQx6AQtp5S474vpsB8WWeL33U25JfEcTL03SMZ0rh5tdJgN
 v/5gY+txCjo42Kdp4BiY2GdBf8FdGEqLB/ELCpg/zYc7YZW4wUA=
 =4fvq
 -----END PGP SIGNATURE-----

Merge 6.1.14 into android14-6.1

Changes in 6.1.14
	drm/etnaviv: don't truncate physical page address
	wifi: ath11k: fix warning in dma_free_coherent() of memory chunks while recovery
	wifi: rtl8xxxu: gen2: Turn on the rate control
	drm/edid: Fix minimum bpc supported with DSC1.2 for HDMI sink
	clk: mxl: Switch from direct readl/writel based IO to regmap based IO
	clk: mxl: Remove redundant spinlocks
	clk: mxl: Add option to override gate clks
	clk: mxl: Fix a clk entry by adding relevant flags
	powerpc: dts: t208x: Mark MAC1 and MAC2 as 10G
	clk: mxl: syscon_node_to_regmap() returns error pointers
	sched/psi: Stop relying on timer_pending() for poll_work rescheduling
	random: always mix cycle counter in add_latent_entropy()
	scsi: libsas: Add smp_ata_check_ready_type()
	scsi: hisi_sas: Fix SATA devices missing issue during I_T nexus reset
	spi: mediatek: Enable irq when pdata is ready
	docs: perf: Fix PMU instance name of hisi-pcie-pmu
	KVM: x86: Fail emulation during EMULTYPE_SKIP on any exception
	KVM: SVM: Skip WRMSR fastpath on VM-Exit if next RIP isn't valid
	KVM: VMX: Execute IBPB on emulated VM-exit when guest has IBRS
	can: kvaser_usb: hydra: help gcc-13 to figure out cmd_len
	powerpc: dts: t208x: Disable 10G on MAC1 and MAC2
	spi: mediatek: Enable irq before the spi registration
	drm/i915: Remove __maybe_unused from mtl_info
	KVM: x86: fix deadlock for KVM_XEN_EVTCHN_RESET
	selftests: kvm: move declaration at the beginning of main()
	powerpc/64s/radix: Fix RWX mapping with relocated kernel
	nfp: ethtool: support reporting link modes
	nfp: ethtool: fix the bug of setting unsupported port speed
	uaccess: Add speculation barrier to copy_from_user()
	x86/alternatives: Introduce int3_emulate_jcc()
	x86/alternatives: Teach text_poke_bp() to patch Jcc.d32 instructions
	x86/static_call: Add support for Jcc tail-calls
	Bluetooth: btusb: Add more device IDs for WCN6855
	riscv: remove special treatment for the link order of head.o
	arm64: remove special treatment for the link order of head.o
	arch: fix broken BuildID for arm64 and riscv
	powerpc/vmlinux.lds: Define RUNTIME_DISCARD_EXIT
	powerpc/vmlinux.lds: Don't discard .rela* for relocatable builds
	s390: define RUNTIME_DISCARD_EXIT to fix link error with GNU ld < 2.36
	sh: define RUNTIME_DISCARD_EXIT
	wifi: mwifiex: Add missing compatible string for SD8787
	audit: update the mailing list in MAINTAINERS
	platform/x86/amd/pmf: Add depends on CONFIG_POWER_SUPPLY
	platform/x86: nvidia-wmi-ec-backlight: Add force module parameter
	ext4: Fix function prototype mismatch for ext4_feat_ktype
	randstruct: disable Clang 15 support
	bpf: add missing header file include
	Linux 6.1.14

Change-Id: I704196855630a372d2c6564ab68bf7f7968b889e
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-02-25 15:37:47 +00:00
Linus Torvalds
de41a146f9 bpf: add missing header file include
commit f3dd0c53370e70c0f9b7e931bbec12916f3bb8cc upstream.

Commit 74e19ef0ff80 ("uaccess: Add speculation barrier to
copy_from_user()") built fine on x86-64 and arm64, and that's the extent
of my local build testing.

It turns out those got the <linux/nospec.h> include incidentally through
other header files (<linux/kvm_host.h> in particular), but that was not
true of other architectures, resulting in build errors

  kernel/bpf/core.c: In function ‘___bpf_prog_run’:
  kernel/bpf/core.c:1913:3: error: implicit declaration of function ‘barrier_nospec’

so just make sure to explicitly include the proper <linux/nospec.h>
header file to make everybody see it.

Fixes: 74e19ef0ff80 ("uaccess: Add speculation barrier to copy_from_user()")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Viresh Kumar <viresh.kumar@linaro.org>
Reported-by: Huacai Chen <chenhuacai@loongson.cn>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-25 11:25:43 +01:00
Dave Hansen
684db631a1 uaccess: Add speculation barrier to copy_from_user()
commit 74e19ef0ff8061ef55957c3abd71614ef0f42f47 upstream.

The results of "access_ok()" can be mis-speculated.  The result is that
you can end up speculatively:

	if (access_ok(from, size))
		// Right here

even for bad from/size combinations.  On first glance, it would be ideal
to just add a speculation barrier to "access_ok()" so that its results
can never be mis-speculated.

But there are lots of system calls just doing access_ok() via
"copy_to_user()" and friends (example: fstat() and friends).  Those are
generally not problematic because they do not _consume_ data from
userspace other than the pointer.  They are also very quick and common
system calls that should not be needlessly slowed down.

"copy_from_user()" on the other hand uses a user-controlled pointer and
is frequently followed up with code that might affect caches.  Take
something like this:

	if (!copy_from_user(&kernelvar, uptr, size))
		do_something_with(kernelvar);

If userspace passes in an evil 'uptr' that *actually* points to a kernel
address, and then do_something_with() has cache (or other)
side-effects, it could allow userspace to infer kernel data values.

Add a barrier to the common copy_from_user() code to prevent
mis-speculated values which happen after the copy.

Also add a stub for architectures that do not define barrier_nospec().
This makes the macro usable in generic code.

Since the barrier is now usable in generic code, the x86 #ifdef in the
BPF code can also go away.
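
The two pieces described above can be sketched in userspace as follows. This is a simplified model under stated assumptions, not the kernel code: struct user_range, range_ok() and model_copy_from_user() are hypothetical stand-ins for the real address check and copy routine, and the barrier_nospec() stub here is an empty statement just like the fallback for architectures without one.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Fallback stub: when no speculation barrier is provided, define an empty
 * one so generic code can call barrier_nospec() unconditionally.
 */
#ifndef barrier_nospec
#define barrier_nospec() do { } while (0)
#endif

/* Hypothetical model of the permitted "user" address range. */
struct user_range {
    uintptr_t start;
    size_t    len;
};

/* Stand-in for access_ok(): is [from, from + n) inside the range? */
static int range_ok(const struct user_range *r, const void *from, size_t n)
{
    uintptr_t p = (uintptr_t)from;

    return p >= r->start && n <= r->len && p - r->start <= r->len - n;
}

/*
 * Simplified model of copy_from_user(): returns the number of bytes NOT
 * copied, so 0 means success.
 */
size_t model_copy_from_user(const struct user_range *r, void *to,
                            const void *from, size_t n)
{
    if (!range_ok(r, from, n))
        return n;

    /*
     * On hardware with a real barrier, this keeps later code from running
     * speculatively as if the range check had passed when it did not.
     */
    barrier_nospec();
    memcpy(to, from, n);
    return 0;
}
```

Placing the barrier in the copy-from path only, rather than in the range check itself, leaves the cheap copy-to-user style calls untouched, which is the trade-off the message argues for.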

Reported-by: Jordy Zomer <jordyzomer@google.com>
Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>   # BPF bits
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-25 11:25:41 +01:00
Suren Baghdasaryan
afec25854c sched/psi: Stop relying on timer_pending() for poll_work rescheduling
[ Upstream commit 710ffe671e014d5ccbcff225130a178b088ef090 ]

Psi polling mechanism is trying to minimize the number of wakeups to
run psi_poll_work and is currently relying on timer_pending() to detect
when this work is already scheduled. This provides a window of opportunity
for psi_group_change to schedule an immediate psi_poll_work after
poll_timer_fn got called but before psi_poll_work could reschedule itself.
Below is the depiction of this entire window:

poll_timer_fn
  wake_up_interruptible(&group->poll_wait);

psi_poll_worker
  wait_event_interruptible(group->poll_wait, ...)
  psi_poll_work
    psi_schedule_poll_work
      if (timer_pending(&group->poll_timer)) return;
      ...
      mod_timer(&group->poll_timer, jiffies + delay);

Prior to 461daba06b we used to rely on poll_scheduled atomic which was
reset and set back inside psi_poll_work and therefore this race window
was much smaller.
The larger window causes an increased number of wakeups, and our partners
report a visible power regression of ~10mA after applying 461daba06b.
Bring back the poll_scheduled atomic and make this race window even
narrower by resetting poll_scheduled only when we reach polling expiration
time. This does not completely eliminate the possibility of extra wakeups
caused by a race with psi_group_change; however, it will limit it to the
worst case scenario of one extra wakeup per every tracking window (0.5s
in the worst case).
This patch also ensures correct ordering between clearing poll_scheduled
flag and obtaining changed_states using memory barrier. Correct ordering
between updating changed_states and setting poll_scheduled is ensured by
atomic_xchg operation.
By tracing the number of immediate rescheduling attempts performed by
psi_group_change and the number of these attempts being blocked due to
psi monitor being already active, we can assess the effects of this change:

Before the patch:
                                           Run#1    Run#2      Run#3
Immediate reschedules attempted:           684365   1385156    1261240
Immediate reschedules blocked:             682846   1381654    1258682
Immediate reschedules (delta):             1519     3502       2558
Immediate reschedules (% of attempted):    0.22%    0.25%      0.20%

After the patch:
                                           Run#1    Run#2     Run#3
Immediate reschedules attempted:           882244   770298    426218
Immediate reschedules blocked:             881996   769796    426074
Immediate reschedules (delta):             248      502       144
Immediate reschedules (% of attempted):    0.03%    0.07%     0.03%

The number of non-blocked immediate reschedules dropped from 0.22-0.25%
to 0.03-0.07%. The drop is attributed to the decrease in the race window
size and the fact that we allow this race only when psi monitors reach
polling window expiration time.
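
The poll_scheduled scheme described above can be approximated in userspace with C11 atomics. This is only a sketch: struct poll_group, timer_arms and the three functions are hypothetical names, the timer is replaced by a counter, and the default sequentially consistent atomics stand in for the explicit memory barrier mentioned in the message.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical, stripped-down stand-in for the psi group state. */
struct poll_group {
    atomic_int poll_scheduled;  /* 1 while a poll is already scheduled */
    atomic_int timer_arms;      /* counts how often the timer was armed */
};

void poll_group_init(struct poll_group *g)
{
    atomic_init(&g->poll_scheduled, 0);
    atomic_init(&g->timer_arms, 0);
}

/*
 * Try to schedule poll work. atomic_exchange() sets the flag and returns
 * the previous value in one step, so only the caller that flips 0 -> 1
 * arms the timer; every other caller sees 1 and backs off.
 */
bool schedule_poll_work(struct poll_group *g)
{
    if (atomic_exchange(&g->poll_scheduled, 1))
        return false;                   /* already scheduled: no extra wakeup */

    atomic_fetch_add(&g->timer_arms, 1);    /* stands in for arming the timer */
    return true;
}

/*
 * Called once the polling window has expired: clear the flag so that a
 * later state change can schedule a fresh poll.
 */
void polling_window_expired(struct poll_group *g)
{
    atomic_store(&g->poll_scheduled, 0);
}
```

Because the flag is cleared only at window expiration, repeated scheduling attempts inside one window are rejected, which is the behaviour the "blocked" counts in the tables reflect.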

Fixes: 461daba06b ("psi: eliminate kthread_worker from psi trigger scheduling mechanism")
Reported-by: Kathleen Chang <yt.chang@mediatek.com>
Reported-by: Wenju Xu <wenju.xu@mediatek.com>
Reported-by: Jonathan Chen <jonathan.jmchen@mediatek.com>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Chengming Zhou <zhouchengming@bytedance.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Tested-by: SH Chen <show-hong.chen@mediatek.com>
Link: https://lore.kernel.org/r/20221028194541.813985-1-surenb@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-25 11:25:39 +01:00
Greg Kroah-Hartman
dafc2fae4d This is the 6.1.13 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmP2A8MACgkQONu9yGCS
 aT7GrhAAky2nTRG9J0oPxh5Eu7wNKmjqDWNj9c6it3iGHpb+tfOY+LfPXHmWz0kX
 NoaNYGZGD8SDbkmwrSOmFB1Q/0OZ4/aIwM7Kwcw72UJVvrlsKx1HwiJjXKk809ZL
 bVlLUQzFTwyVIYcvjXQ8CuBHwBinLc3qkcyYGgbS8bseR4pDuxwoToDwAxk1d/0j
 ozWuzUKhSdYHYIUrk3papUro2UpF+Kb7KFpNiVo2wMaZM7en2XK3khCt8TuojH6c
 DXL+KZ/HbB8Ig1PWLaw2/6o4ispNy6bz7CJx6oDiOILR+le8xZA5WTdkXT3ovjyr
 LxutmPTTw6PxextIyVRblJWzXNcjdlV552U4gnnngcWn6wg4D4otqYnYvTaAUc+u
 sQnwrlQFxB2KfFKLNepGAy7klQJsYP3eadjDgGXP9TSmuUvUYRaNr6h0XukbyYkc
 kx2+Tw51NMKEqhgnaiKDN8AZEDTuLu5F4+NrUertxlb3PWeRRMRYVGJ1uw0KJg6t
 d5eniCB00SaaqN6M68u/hRYRi3gnwIsU7DitEpqejqwzskMpgegMFvebmCwORiq3
 D+FD4EHOlztIToXhmEOXp0cz8fs27MuWmq4GkSwXvJuq+id5cQFdDN5GeLgNdAvH
 Kiu/Y+DY6ObW31tAQ1Jjp20L2RaWWvubrCBGeIqiDzUWmCohsks=
 =TXvc
 -----END PGP SIGNATURE-----

Merge 6.1.13 into android14-6.1

Changes in 6.1.13
	mptcp: sockopt: make 'tcp_fastopen_connect' generic
	mptcp: fix locking for setsockopt corner-case
	mptcp: deduplicate error paths on endpoint creation
	mptcp: fix locking for in-kernel listener creation
	btrfs: move the auto defrag code to defrag.c
	btrfs: lock the inode in shared mode before starting fiemap
	ASoC: amd: yc: Add DMI support for new acer/emdoor platforms
	ASoC: SOF: sof-audio: start with the right widget type
	ALSA: usb-audio: Add FIXED_RATE quirk for JBL Quantum610 Wireless
	ASoC: Intel: sof_rt5682: always set dpcm_capture for amplifiers
	ASoC: Intel: sof_cs42l42: always set dpcm_capture for amplifiers
	ASoC: Intel: sof_nau8825: always set dpcm_capture for amplifiers
	ASoC: Intel: sof_ssp_amp: always set dpcm_capture for amplifiers
	selftests/bpf: Verify copy_register_state() preserves parent/live fields
	ALSA: hda: Do not unset preset when cleaning up codec
	ASoC: amd: yc: Add Xiaomi Redmi Book Pro 15 2022 into DMI table
	bpf, sockmap: Don't let sock_map_{close,destroy,unhash} call itself
	ASoC: cs42l56: fix DT probe
	tools/virtio: fix the vringh test for virtio ring changes
	vdpa: ifcvf: Do proper cleanup if IFCVF init fails
	net/rose: Fix to not accept on connected socket
	selftest: net: Improve IPV6_TCLASS/IPV6_HOPLIMIT tests apparmor compatibility
	net: stmmac: do not stop RX_CLK in Rx LPI state for qcs404 SoC
	powerpc/64: Fix perf profiling asynchronous interrupt handlers
	fscache: Use clear_and_wake_up_bit() in fscache_create_volume_work()
	drm/nouveau/devinit/tu102-: wait for GFW_BOOT_PROGRESS == COMPLETED
	net: ethernet: mtk_eth_soc: Avoid truncating allocation
	net: sched: sch: Bounds check priority
	s390/decompressor: specify __decompress() buf len to avoid overflow
	nvme-fc: fix a missing queue put in nvmet_fc_ls_create_association
	nvme: clear the request_queue pointers on failure in nvme_alloc_admin_tag_set
	nvme: clear the request_queue pointers on failure in nvme_alloc_io_tag_set
	drm/amd/display: Add missing brackets in calculation
	drm/amd/display: Adjust downscaling limits for dcn314
	drm/amd/display: Unassign does_plane_fit_in_mall function from dcn3.2
	drm/amd/display: Reset DMUB mailbox SW state after HW reset
	drm/amdgpu: enable HDP SD for gfx 11.0.3
	drm/amdgpu: Enable vclk dclk node for gc11.0.3
	drm/amd/display: Properly handle additional cases where DCN is not supported
	platform/x86: touchscreen_dmi: Add Chuwi Vi8 (CWI501) DMI match
	ceph: move mount state enum to super.h
	ceph: blocklist the kclient when receiving corrupted snap trace
	selftests: mptcp: userspace: fix v4-v6 test in v6.1
	of: reserved_mem: Have kmemleak ignore dynamically allocated reserved mem
	kasan: fix Oops due to missing calls to kasan_arch_is_ready()
	mm: shrinkers: fix deadlock in shrinker debugfs
	aio: fix mremap after fork null-deref
	vmxnet3: move rss code block under eop descriptor
	fbdev: Fix invalid page access after closing deferred I/O devices
	drm: Disable dynamic debug as broken
	drm/amd/amdgpu: fix warning during suspend
	drm/amd/display: Fail atomic_check early on normalize_zpos error
	drm/vmwgfx: Stop accessing buffer objects which failed init
	drm/vmwgfx: Do not drop the reference to the handle too soon
	mmc: jz4740: Work around bug on JZ4760(B)
	mmc: meson-gx: fix SDIO mode if cap_sdio_irq isn't set
	mmc: sdio: fix possible resource leaks in some error paths
	mmc: mmc_spi: fix error handling in mmc_spi_probe()
	ALSA: hda: Fix codec device field initializan
	ALSA: hda/conexant: add a new hda codec SN6180
	ALSA: hda/realtek - fixed wrong gpio assigned
	ALSA: hda/realtek: fix mute/micmute LEDs don't work for a HP platform.
	ALSA: hda/realtek: Enable mute/micmute LEDs and speaker support for HP Laptops
	ata: ahci: Add Tiger Lake UP{3,4} AHCI controller
	ata: libata-core: Disable READ LOG DMA EXT for Samsung MZ7LH
	sched/psi: Fix use-after-free in ep_remove_wait_queue()
	hugetlb: check for undefined shift on 32 bit architectures
	nilfs2: fix underflow in second superblock position calculations
	mm/MADV_COLLAPSE: set EAGAIN on unexpected page refcount
	mm/filemap: fix page end in filemap_get_read_batch
	mm/migrate: fix wrongly apply write bit after mkdirty on sparc64
	gpio: sim: fix a memory leak
	freezer,umh: Fix call_usermode_helper_exec() vs SIGKILL
	coredump: Move dump_emit_page() to kill unused warning
	Revert "mm: Always release pages to the buddy allocator in memblock_free_late()."
	net: Fix unwanted sign extension in netdev_stats_to_stats64()
	revert "squashfs: harden sanity check in squashfs_read_xattr_id_table"
	drm/vc4: crtc: Increase setup cost in core clock calculation to handle extreme reduced blanking
	drm/vc4: Fix YUV plane handling when planes are in different buffers
	drm/i915/gen11: Wa_1408615072/Wa_1407596294 should be on GT list
	ice: fix lost multicast packets in promisc mode
	ixgbe: allow to increase MTU to 3K with XDP enabled
	i40e: add double of VLAN header when computing the max MTU
	net: bgmac: fix BCM5358 support by setting correct flags
	net: ethernet: ti: am65-cpsw: Add RX DMA Channel Teardown Quirk
	sctp: sctp_sock_filter(): avoid list_entry() on possibly empty list
	net/sched: tcindex: update imperfect hash filters respecting rcu
	ice: xsk: Fix cleaning of XDP_TX frames
	dccp/tcp: Avoid negative sk_forward_alloc by ipv6_pinfo.pktoptions.
	net/usb: kalmia: Don't pass act_len in usb_bulk_msg error path
	net/sched: act_ctinfo: use percpu stats
	net: openvswitch: fix possible memory leak in ovs_meter_cmd_set()
	net: stmmac: fix order of dwmac5 FlexPPS parametrization sequence
	bnxt_en: Fix mqprio and XDP ring checking logic
	tracing: Make trace_define_field_ext() static
	net: stmmac: Restrict warning on disabling DMA store and fwd mode
	net: use a bounce buffer for copying skb->mark
	tipc: fix kernel warning when sending SYN message
	net: mpls: fix stale pointer if allocation fails during device rename
	igb: conditionalize I2C bit banging on external thermal sensor support
	igb: Fix PPS input and output using 3rd and 4th SDP
	ixgbe: add double of VLAN header when computing the max MTU
	ipv6: Fix datagram socket connection with DSCP.
	ipv6: Fix tcp socket connection with DSCP.
	mm/gup: add folio to list when folio_isolate_lru() succeed
	mm: extend max struct page size for kmsan
	i40e: Add checking for null for nlmsg_find_attr()
	net/sched: tcindex: search key must be 16 bits
	nvme-tcp: stop auth work after tearing down queues in error recovery
	nvme-rdma: stop auth work after tearing down queues in error recovery
	KVM: x86/pmu: Disable vPMU support on hybrid CPUs (host PMUs)
	kvm: initialize all of the kvm_debugregs structure before sending it to userspace
	perf/x86: Refuse to export capabilities for hybrid PMUs
	alarmtimer: Prevent starvation by small intervals and SIG_IGN
	nvme-pci: refresh visible attrs for cmb attributes
	ASoC: SOF: Intel: hda-dai: fix possible stream_tag leak
	net: sched: sch: Fix off by one in htb_activate_prios()
	Linux 6.1.13

Change-Id: I8a1e4175939c14f726c545001061b95462566386
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-02-22 12:32:41 +00:00
Thomas Gleixner
70fdd9831a alarmtimer: Prevent starvation by small intervals and SIG_IGN
commit d125d1349abeb46945dc5e98f7824bf688266f13 upstream.

syzbot reported a RCU stall which is caused by setting up an alarmtimer
with a very small interval and ignoring the signal. The reproducer arms the
alarm timer with a relative expiry of 8ns and an interval of 9ns. Not a
problem per se, but that's an issue when the signal is ignored because then
the timer is immediately rearmed because there is no way to delay that
rearming to the signal delivery path.  See posix_timer_fn() and commit
58229a1899 ("posix-timers: Prevent softirq starvation by small intervals
and SIG_IGN") for details.

The reproducer does not set SIG_IGN explicitly, but it sets up the timer's
signal with SIGCONT. That has the same effect as explicitly setting
SIG_IGN for a signal as SIGCONT is ignored if there is no handler set and
the task is not ptraced.

The log clearly shows that:

   [pid  5102] --- SIGCONT {si_signo=SIGCONT, si_code=SI_TIMER, si_timerid=0, si_overrun=316014, si_int=0, si_ptr=NULL} ---

It works because the tasks are traced and therefore the signal is queued so
the tracer can see it, which delays the restart of the timer to the signal
delivery path. But then the tracer is killed:

   [pid  5087] kill(-5102, SIGKILL <unfinished ...>
   ...
   ./strace-static-x86_64: Process 5107 detached

and after it's gone the stall can be observed:

   syzkaller login: [   79.439102][    C0] hrtimer: interrupt took 68471 ns
   [  184.460538][    C1] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
   ...
   [  184.658237][    C1] rcu: Stack dump where RCU GP kthread last ran:
   [  184.664574][    C1] Sending NMI from CPU 1 to CPUs 0:
   [  184.669821][    C0] NMI backtrace for cpu 0
   [  184.669831][    C0] CPU: 0 PID: 5108 Comm: syz-executor192 Not tainted 6.2.0-rc6-next-20230203-syzkaller #0
   ...
   [  184.670036][    C0] Call Trace:
   [  184.670041][    C0]  <IRQ>
   [  184.670045][    C0]  alarmtimer_fired+0x327/0x670

posix_timer_fn() prevents that by checking whether the interval for
timers which have the signal ignored is smaller than a jiffy and
artificially delaying it by shifting the next expiry out by a jiffy. That's
accurate vs. the overrun accounting, but slightly inaccurate
vs. timer_gettime(2).

The comment in that function says what needs to be done and there was a fix
available for the regular userspace induced SIG_IGN mechanism, but that did
not work due to the implicit ignore for SIGCONT and similar signals. This
needs to be worked on, but for now the only available workaround is to do
exactly what posix_timer_fn() does:

Increase the interval of self-rearming timers, which have their signal
ignored, to at least a jiffy.
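
The rule stated above can be modelled as a small pure function. This is a sketch of the stated rule only, not of the kernel's actual rearm code; rearm_interval_ns() and its parameters are hypothetical, and hz (the tick rate) must be nonzero.

```c
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Return the interval to use when rearming a periodic timer.
 * One tick lasts NSEC_PER_SEC / hz nanoseconds. Only timers whose signal
 * is ignored are stretched to a full tick; all others keep the interval
 * they asked for.
 */
uint64_t rearm_interval_ns(uint64_t interval_ns, bool signal_ignored,
                           unsigned int hz)
{
    uint64_t tick_ns = NSEC_PER_SEC / hz;

    if (signal_ignored && interval_ns < tick_ns)
        return tick_ns;
    return interval_ns;
}
```

With the reproducer's 9ns interval and a 1000 Hz tick, an ignored signal would stretch the interval to 1000000ns, so the timer can fire at most once per tick instead of rearming back to back.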

Interestingly this has been fixed before via commit ff86bf0c65
("alarmtimer: Rate limit periodic intervals") already, but that fix got
lost in a later rework.

Reported-by: syzbot+b9564ba6e8e00694511b@syzkaller.appspotmail.com
Fixes: f2c45807d3 ("alarmtimer: Switch over to generic set/get/rearm routine")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: John Stultz <jstultz@google.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/87k00q1no2.ffs@tglx
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-22 12:59:55 +01:00
Steven Rostedt (Google)
ac6e733f81 tracing: Make trace_define_field_ext() static
commit 70b5339caf847b8b6097b6dfab0c5a99b40713c8 upstream.

trace_define_field_ext() is not used outside of trace_events.c, it should
be static.

Link: https://lore.kernel.org/oe-kbuild-all/202302130750.679RaRog-lkp@intel.com/

Fixes: b6c7abd1c28a ("tracing: Fix TASK_COMM_LEN in trace event format file")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-22 12:59:53 +01:00
Peter Zijlstra
7f9f6c54da freezer,umh: Fix call_usermode_helper_exec() vs SIGKILL
commit eedeb787ebb53de5c5dcf7b7b39d01bf1b0f037d upstream.

Tetsuo-San noted that commit f5d39b0208 ("freezer,sched: Rewrite
core freezer logic") broke call_usermodehelper_exec() for the KILLABLE
case.

Specifically it was missed that the second, unconditional,
wait_for_completion() was not optional and ensures the on-stack
completion is unused before going out-of-scope.

Fixes: f5d39b0208 ("freezer,sched: Rewrite core freezer logic")
Reported-by: syzbot+6cd18e123583550cf469@syzkaller.appspotmail.com
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Debugged-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/Y90ar35uKQoUrLEK@hirez.programming.kicks-ass.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-22 12:59:50 +01:00
Munehisa Kamata
c6879a4dce sched/psi: Fix use-after-free in ep_remove_wait_queue()
commit c2dbe32d5db5c4ead121cf86dabd5ab691fb47fe upstream.

If a non-root cgroup gets removed when there is a thread that registered a
trigger and is polling on a pressure file within the cgroup, the polling
waitqueue gets freed in the following path:

 do_rmdir
   cgroup_rmdir
     kernfs_drain_open_files
       cgroup_file_release
         cgroup_pressure_release
           psi_trigger_destroy

However, the polling thread still has a reference to the pressure file and
will access the freed waitqueue when the file is closed or upon exit:

 fput
   ep_eventpoll_release
     ep_free
       ep_remove_wait_queue
         remove_wait_queue

This results in use-after-free as pasted below.

The fundamental problem here is that cgroup_file_release() (and
consequently waitqueue's lifetime) is not tied to the file's real lifetime.
Using wake_up_pollfree() here might be less than ideal, but it is in line
with the comment at commit 42288cb44c ("wait: add wake_up_pollfree()")
since the waitqueue's lifetime is not tied to the file's lifetime, so this
can be considered another special case. While this would be fixable by somehow
making cgroup_file_release() be tied to the fput(), it would require
sizable refactoring at cgroups or higher layer which might be more
justifiable if we identify more cases like this.

  BUG: KASAN: use-after-free in _raw_spin_lock_irqsave+0x60/0xc0
  Write of size 4 at addr ffff88810e625328 by task a.out/4404

	CPU: 19 PID: 4404 Comm: a.out Not tainted 6.2.0-rc6 #38
	Hardware name: Amazon EC2 c5a.8xlarge/, BIOS 1.0 10/16/2017
	Call Trace:
	<TASK>
	dump_stack_lvl+0x73/0xa0
	print_report+0x16c/0x4e0
	kasan_report+0xc3/0xf0
	kasan_check_range+0x2d2/0x310
	_raw_spin_lock_irqsave+0x60/0xc0
	remove_wait_queue+0x1a/0xa0
	ep_free+0x12c/0x170
	ep_eventpoll_release+0x26/0x30
	__fput+0x202/0x400
	task_work_run+0x11d/0x170
	do_exit+0x495/0x1130
	do_group_exit+0x100/0x100
	get_signal+0xd67/0xde0
	arch_do_signal_or_restart+0x2a/0x2b0
	exit_to_user_mode_prepare+0x94/0x100
	syscall_exit_to_user_mode+0x20/0x40
	do_syscall_64+0x52/0x90
	entry_SYSCALL_64_after_hwframe+0x63/0xcd
	</TASK>

 Allocated by task 4404:

	kasan_set_track+0x3d/0x60
	__kasan_kmalloc+0x85/0x90
	psi_trigger_create+0x113/0x3e0
	pressure_write+0x146/0x2e0
	cgroup_file_write+0x11c/0x250
	kernfs_fop_write_iter+0x186/0x220
	vfs_write+0x3d8/0x5c0
	ksys_write+0x90/0x110
	do_syscall_64+0x43/0x90
	entry_SYSCALL_64_after_hwframe+0x63/0xcd

 Freed by task 4407:

	kasan_set_track+0x3d/0x60
	kasan_save_free_info+0x27/0x40
	____kasan_slab_free+0x11d/0x170
	slab_free_freelist_hook+0x87/0x150
	__kmem_cache_free+0xcb/0x180
	psi_trigger_destroy+0x2e8/0x310
	cgroup_file_release+0x4f/0xb0
	kernfs_drain_open_files+0x165/0x1f0
	kernfs_drain+0x162/0x1a0
	__kernfs_remove+0x1fb/0x310
	kernfs_remove_by_name_ns+0x95/0xe0
	cgroup_addrm_files+0x67f/0x700
	cgroup_destroy_locked+0x283/0x3c0
	cgroup_rmdir+0x29/0x100
	kernfs_iop_rmdir+0xd1/0x140
	vfs_rmdir+0xfe/0x240
	do_rmdir+0x13d/0x280
	__x64_sys_rmdir+0x2c/0x30
	do_syscall_64+0x43/0x90
	entry_SYSCALL_64_after_hwframe+0x63/0xcd

Fixes: 0e94682b73 ("psi: introduce psi monitor")
Signed-off-by: Munehisa Kamata <kamatam@amazon.com>
Signed-off-by: Mengchi Cheng <mengcc@amazon.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/lkml/20230106224859.4123476-1-kamatam@amazon.com/
Link: https://lore.kernel.org/r/20230214212705.4058045-1-kamatam@amazon.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-22 12:59:49 +01:00
Chris Goldsworthy
116e1532b9 ANDROID: arm64/mm: Add command line option to make ZONE_DMA32 empty
ZONE_DMA32 is enabled by default on android14-6.1, yet it is not
needed for all devices, nor is it desirable to have if not needed. For
instance, if a partner in GKI 1.0 did not use ZONE_DMA32, memory can
be lower for ZONE_NORMAL relative to older targets, such that memory
would run out more quickly in ZONE_NORMAL leading kswapd to be invoked
unnecessarily.

Correspondingly, provide a means of making ZONE_DMA32 empty via the
kernel command line when it is compiled in via CONFIG_ZONE_DMA32.

P.S. The following two patches are squashed into this one,
1. bf96382 ("ANDROID: dma-direct: Make DMA32 disablement work for CONFIG_NUMA")
2. 135406c ("ANDROID: dma-direct: Document disable_dma32")

Bug: 199917449
Bug: 268587627
Change-Id: I70ec76914b92e518d61a61072f0b3cb41cb28646
Signed-off-by: Chris Goldsworthy <quic_cgoldswo@quicinc.com>
Signed-off-by: Sudarshan Rajagopalan <quic_sudaraja@quicinc.com>
Signed-off-by: Chinwen Chang <chinwen.chang@mediatek.com>
2023-02-16 22:07:29 +00:00
Greg Kroah-Hartman
b6010109cf Merge 6.1.12 into android14-6.1
Changes in 6.1.12
	hv_netvsc: Allocate memory in netvsc_dma_map() with GFP_ATOMIC
	btrfs: limit device extents to the device size
	btrfs: zlib: zero-initialize zlib workspace
	ALSA: hda/realtek: Add Positivo N14KP6-TG
	ALSA: emux: Avoid potential array out-of-bound in snd_emux_xg_control()
	ALSA: hda/realtek: Fix the speaker output on Samsung Galaxy Book2 Pro 360
	ALSA: hda/realtek: Enable mute/micmute LEDs on HP Elitebook, 645 G9
	ALSA: hda/realtek: Add quirk for ASUS UM3402 using CS35L41
	ALSA: hda/realtek: fix mute/micmute LEDs don't work for a HP platform.
	Revert "PCI/ASPM: Save L1 PM Substates Capability for suspend/resume"
	Revert "PCI/ASPM: Refactor L1 PM Substates Control Register programming"
	tracing: Fix poll() and select() do not work on per_cpu trace_pipe and trace_pipe_raw
	of/address: Return an error when no valid dma-ranges are found
	can: j1939: do not wait 250 ms if the same addr was already claimed
	HID: logitech: Disable hi-res scrolling on USB
	xfrm: compat: change expression for switch in xfrm_xlate64
	IB/hfi1: Restore allocated resources on failed copyout
	xfrm/compat: prevent potential spectre v1 gadget in xfrm_xlate32_attr()
	IB/IPoIB: Fix legacy IPoIB due to wrong number of queues
	xfrm: annotate data-race around use_time
	RDMA/irdma: Fix potential NULL-ptr-dereference
	RDMA/usnic: use iommu_map_atomic() under spin_lock()
	xfrm: fix bug with DSCP copy to v6 from v4 tunnel
	of: Make OF framebuffer device names unique
	net: phylink: move phy_device_free() to correctly release phy device
	bonding: fix error checking in bond_debug_reregister()
	net: macb: Perform zynqmp dynamic configuration only for SGMII interface
	net: phy: meson-gxl: use MMD access dummy stubs for GXL, internal PHY
	ionic: clean interrupt before enabling queue to avoid credit race
	ionic: refactor use of ionic_rx_fill()
	ionic: missed doorbell workaround
	cpufreq: qcom-hw: Fix cpufreq_driver->get() for non-LMH systems
	uapi: add missing ip/ipv6 header dependencies for linux/stddef.h
	net: microchip: sparx5: fix PTP init/deinit not checking all ports
	HID: amd_sfh: if no sensors are enabled, clean up
	drm/i915: Don't do the WM0->WM1 copy w/a if WM1 is already enabled
	drm/virtio: exbuf->fence_fd unmodified on interrupted wait
	cpuset: Call set_cpus_allowed_ptr() with appropriate mask for task
	nvidiafb: detect the hardware support before removing console.
	ice: Do not use WQ_MEM_RECLAIM flag for workqueue
	ice: Fix disabling Rx VLAN filtering with port VLAN enabled
	ice: switch: fix potential memleak in ice_add_adv_recipe()
	net: dsa: mt7530: don't change PVC_EG_TAG when CPU port becomes VLAN-aware
	net: mscc: ocelot: fix VCAP filters not matching on MAC with "protocol 802.1Q"
	net/mlx5e: Update rx ring hw mtu upon each rx-fcs flag change
	net/mlx5: Bridge, fix ageing of peer FDB entries
	net/mlx5e: Fix crash unsetting rx-vlan-filter in switchdev mode
	net/mlx5e: IPoIB, Show unknown speed instead of error
	net/mlx5: Store page counters in a single array
	net/mlx5: Expose SF firmware pages counter
	net/mlx5: fw_tracer, Clear load bit when freeing string DBs buffers
	net/mlx5: fw_tracer, Zero consumer index when reloading the tracer
	net/mlx5: Serialize module cleanup with reload and remove
	igc: Add ndo_tx_timeout support
	net: ethernet: mtk_eth_soc: fix wrong parameters order in __xdp_rxq_info_reg()
	txhash: fix sk->sk_txrehash default
	selftests: Fix failing VXLAN VNI filtering test
	rds: rds_rm_zerocopy_callback() use list_first_entry()
	net: mscc: ocelot: fix all IPv6 getting trapped to CPU when PTP timestamping is used
	selftests: forwarding: lib: quote the sysctl values
	arm64: dts: rockchip: fix input enable pinconf on rk3399
	arm64: dts: rockchip: set sdmmc0 speed to sd-uhs-sdr50 on rock-3a
	ALSA: pci: lx6464es: fix a debug loop
	riscv: stacktrace: Fix missing the first frame
	arm64: dts: mediatek: mt8195: Fix vdosys* compatible strings
	ASoC: tas5805m: rework to avoid scheduling while atomic.
	ASoC: tas5805m: add missing page switch.
	ASoC: fsl_sai: fix getting version from VERID
	ASoC: topology: Return -ENOMEM on memory allocation failure
	clk: microchip: mpfs-ccc: Use devm_kasprintf() for allocating formatted strings
	pinctrl: mediatek: Fix the drive register definition of some Pins
	pinctrl: aspeed: Fix confusing types in return value
	pinctrl: single: fix potential NULL dereference
	spi: dw: Fix wrong FIFO level setting for long xfers
	pinctrl: aspeed: Revert "Force to disable the function's signal"
	pinctrl: intel: Restore the pins that used to be in Direct IRQ mode
	cifs: Fix use-after-free in rdata->read_into_pages()
	net: USB: Fix wrong-direction WARNING in plusb.c
	mptcp: do not wait for bare sockets' timeout
	mptcp: be careful on subflow status propagation on errors
	selftests: mptcp: allow more slack for slow test-case
	selftests: mptcp: stop tests earlier
	btrfs: simplify update of last_dir_index_offset when logging a directory
	btrfs: free device in btrfs_close_devices for a single device filesystem
	usb: core: add quirk for Alcor Link AK9563 smartcard reader
	usb: typec: altmodes/displayport: Fix probe pin assign check
	cxl/region: Fix null pointer dereference for resetting decoder
	cxl/region: Fix passthrough-decoder detection
	clk: ingenic: jz4760: Update M/N/OD calculation algorithm
	pinctrl: qcom: sm8450-lpass-lpi: correct swr_rx_data group
	drm/amd/pm: add SMU 13.0.7 missing GetPptLimit message mapping
	ceph: flush cap releases when the session is flushed
	nvdimm: Support sizeof(struct page) > MAX_STRUCT_PAGE_SIZE
	riscv: Fixup race condition on PG_dcache_clean in flush_icache_pte
	riscv: kprobe: Fixup misaligned load text
	powerpc/64s/interrupt: Fix interrupt exit race with security mitigation switch
	drm/amdgpu: Use the TGID for trace_amdgpu_vm_update_ptes
	tracing: Fix TASK_COMM_LEN in trace event format file
	rtmutex: Ensure that the top waiter is always woken up
	arm64: dts: meson-gx: Make mmc host controller interrupts level-sensitive
	arm64: dts: meson-g12-common: Make mmc host controller interrupts level-sensitive
	arm64: dts: meson-axg: Make mmc host controller interrupts level-sensitive
	Fix page corruption caused by racy check in __free_pages
	arm64: efi: Force the use of SetVirtualAddressMap() on eMAG and Altra Max machines
	drm/amd/pm: bump SMU 13.0.0 driver_if header version
	drm/amdgpu: Add unique_id support for GC 11.0.1/2
	drm/amd/pm: bump SMU 13.0.7 driver_if header version
	drm/amdgpu/fence: Fix oops due to non-matching drm_sched init/fini
	drm/amdgpu/smu: skip pptable init under sriov
	drm/amd/display: properly handling AGP aperture in vm setup
	drm/amd/display: fix cursor offset on rotation 180
	drm/i915: Move fd_install after last use of fence
	drm/i915: Initialize the obj flags for shmem objects
	drm/i915: Fix VBT DSI DVO port handling
	x86/speculation: Identify processors vulnerable to SMT RSB predictions
	KVM: x86: Mitigate the cross-thread return address predictions bug
	Documentation/hw-vuln: Add documentation for Cross-Thread Return Predictions
	Linux 6.1.12

Change-Id: I4deaf57516f3e7b40e728d473986fa355a11fc37
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-02-16 16:46:43 +00:00
Greg Kroah-Hartman
a42f6e7d0a Revert "ANDROID: sched/cpuset: Add vendor hook to change tasks affinity"
This reverts commit 8ecd88d9d3 as it is
broken with regards to upstream changes made in 6.1.12.

If this is still needed, it can be brought back in a way that works
properly based on the changes made upstream.

Bug: 174125747
Cc: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
Cc: Sai Harshini Nimmala <quic_snimmala@quicinc.com>
Change-Id: Ic3163351faabbecbce688a87215f79ca3b5d6188
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-02-16 10:25:45 +00:00
Wander Lairson Costa
446ac8dd89 rtmutex: Ensure that the top waiter is always woken up
commit db370a8b9f67ae5f17e3d5482493294467784504 upstream.

Let L1 and L2 be two spinlocks.

Let T1 be a task holding L1 and blocked on L2. T1, currently, is the top
waiter of L2.

Let T2 be the task holding L2.

Let T3 be a task trying to acquire L1.

The following events will lead to a state in which the wait queue of L2
isn't empty, but no task actually holds the lock.

T1                T2                                  T3
==                ==                                  ==

                                                      spin_lock(L1)
                                                      | raw_spin_lock(L1->wait_lock)
                                                      | rtlock_slowlock_locked(L1)
                                                      | | task_blocks_on_rt_mutex(L1, T3)
                                                      | | | orig_waiter->lock = L1
                                                      | | | orig_waiter->task = T3
                                                      | | | raw_spin_unlock(L1->wait_lock)
                                                      | | | rt_mutex_adjust_prio_chain(T1, L1, L2, orig_waiter, T3)
                  spin_unlock(L2)                     | | | |
                  | rt_mutex_slowunlock(L2)           | | | |
                  | | raw_spin_lock(L2->wait_lock)    | | | |
                  | | wakeup(T1)                      | | | |
                  | | raw_spin_unlock(L2->wait_lock)  | | | |
                                                      | | | | waiter = T1->pi_blocked_on
                                                      | | | | waiter == rt_mutex_top_waiter(L2)
                                                      | | | | waiter->task == T1
                                                      | | | | raw_spin_lock(L2->wait_lock)
                                                      | | | | dequeue(L2, waiter)
                                                      | | | | update_prio(waiter, T1)
                                                      | | | | enqueue(L2, waiter)
                                                      | | | | waiter != rt_mutex_top_waiter(L2)
                                                      | | | | L2->owner == NULL
                                                      | | | | wakeup(T1)
                                                      | | | | raw_spin_unlock(L2->wait_lock)
T1 wakes up
T1 != top_waiter(L2)
schedule_rtlock()

If the deadline of T1 is updated before the call to update_prio(), and the
new deadline is greater than the deadline of the second top waiter, then
after the requeue, T1 is no longer the top waiter, and the wrong task is
woken up which will then go back to sleep because it is not the top waiter.

This can be reproduced in PREEMPT_RT with stress-ng:

while true; do
    stress-ng --sched deadline --sched-period 1000000000 \
    	    --sched-runtime 800000000 --sched-deadline \
    	    1000000000 --mmapfork 23 -t 20
done

A similar issue was pointed out by Thomas versus the cases where the top
waiter drops out early due to a signal or timeout, which is a general issue
for all regular rtmutex use cases, e.g. futex.

The problematic code is in rt_mutex_adjust_prio_chain():

    	// Save the top waiter before dequeue/enqueue
	prerequeue_top_waiter = rt_mutex_top_waiter(lock);

	rt_mutex_dequeue(lock, waiter);
	waiter_update_prio(waiter, task);
	rt_mutex_enqueue(lock, waiter);

	// Lock has no owner?
	if (!rt_mutex_owner(lock)) {
	   	// Top waiter changed
  ---->		if (prerequeue_top_waiter != rt_mutex_top_waiter(lock))
  ---->			wake_up_state(waiter->task, waiter->wake_state);

This only takes the case into account where @waiter is the new top waiter
due to the requeue operation.

But it fails to handle the case where @waiter is no longer the top
waiter due to the requeue operation.

Ensure that the new top waiter is woken up in all cases so it can take
over the ownerless lock.
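
The choice of which task to wake can be illustrated with a small user-space
model. This is a sketch under stated assumptions, not the kernel change
itself: struct toy_waiter, toy_top_waiter() and toy_requeue_and_pick() are
hypothetical names, and the wait queue is modelled as a plain array ordered
by deadline (smaller means more urgent).

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, not kernel code: a waiter with a deadline (smaller is more urgent). */
struct toy_waiter {
	int task_id;
	unsigned long deadline;
};

/* Return the waiter with the earliest deadline, or NULL if there is none. */
static struct toy_waiter *toy_top_waiter(struct toy_waiter *w, size_t n)
{
	struct toy_waiter *top = NULL;

	for (size_t i = 0; i < n; i++)
		if (!top || w[i].deadline < top->deadline)
			top = &w[i];
	return top;
}

/*
 * Model the requeue of `waiter` with a new deadline on an ownerless lock and
 * return the id of the task to wake, or -1 if the top waiter did not change.
 * fixed == 0 mimics the old behaviour (always wake the requeued waiter);
 * fixed == 1 wakes whichever waiter is on top after the requeue.
 */
static int toy_requeue_and_pick(struct toy_waiter *w, size_t n,
				struct toy_waiter *waiter,
				unsigned long new_deadline, int fixed)
{
	struct toy_waiter *before = toy_top_waiter(w, n);
	struct toy_waiter *after;

	waiter->deadline = new_deadline;	/* dequeue + update_prio + enqueue */
	after = toy_top_waiter(w, n);

	if (before == after)
		return -1;
	return fixed ? after->task_id : waiter->task_id;
}
```

With two waiters (task 1 at deadline 100, task 2 at deadline 200), moving
task 1 to deadline 300 makes the old behaviour pick task 1, which is no
longer the top waiter, while the fixed behaviour picks task 2.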

[ tglx: Amend changelog, add Fixes tag ]

Fixes: c014ef69b3 ("locking/rtmutex: Add wake_state to rt_mutex_waiter")
Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230117172649.52465-1-wander@redhat.com
Link: https://lore.kernel.org/r/20230202123020.14844-1-wander@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14 19:11:54 +01:00
Yafang Shao
386a8d694f tracing: Fix TASK_COMM_LEN in trace event format file
commit b6c7abd1c28a63ad633433d037ee15a1bc3023ba upstream.

After commit 3087c61ed2 ("tools/testing/selftests/bpf: replace open-coded 16 with TASK_COMM_LEN"),
the content of the format file under
/sys/kernel/tracing/events/task/task_newtask was changed from
  field:char comm[16];    offset:12;    size:16;    signed:0;
to
  field:char comm[TASK_COMM_LEN];    offset:12;    size:16;    signed:0;

John reported that this change breaks older versions of perfetto.
Then Mathieu pointed out that this behavioral change was caused by the
use of __stringify(_len), which happens to work on macros, but not on enum
labels. And he also gave the suggestion on how to fix it:
  :One possible solution to make this more robust would be to extend
  :struct trace_event_fields with one more field that indicates the length
  :of an array as an actual integer, without storing it in its stringified
  :form in the type, and do the formatting in f_show where it belongs.

After this change, the result is as follows:
$ cat /sys/kernel/tracing/events/task/task_newtask/format
        field:char comm[16];    offset:12;      size:16;        signed:0;
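
The difference between stringifying a macro and an enum label can be
reproduced outside the kernel. The following is a minimal sketch: STRINGIFY,
COMM_LEN_MACRO and COMM_LEN_ENUM are hypothetical stand-ins chosen for the
illustration, not the kernel's own definitions.

```c
#include <assert.h>
#include <string.h>

/* Two-level stringification, in the style of the kernel's __stringify(). */
#define STRINGIFY_1(x) #x
#define STRINGIFY(x) STRINGIFY_1(x)

#define COMM_LEN_MACRO 16		/* hypothetical macro constant */
enum { COMM_LEN_ENUM = 16 };		/* hypothetical enum label, same value */

/* The preprocessor expands the macro before stringifying it... */
static const char *macro_text(void)
{
	return STRINGIFY(COMM_LEN_MACRO);
}

/* ...but it knows nothing about enums, so the label's name is kept. */
static const char *enum_text(void)
{
	return STRINGIFY(COMM_LEN_ENUM);
}
```

Here macro_text() yields "16" while enum_text() yields "COMM_LEN_ENUM",
which is the same effect that put the label name into the format file.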

Link: https://lore.kernel.org/lkml/Y+QaZtz55LIirsUO@google.com/
Link: https://lore.kernel.org/linux-trace-kernel/20230210155921.4610-1-laoar.shao@gmail.com/
Link: https://lore.kernel.org/linux-trace-kernel/20230212151303.12353-1-laoar.shao@gmail.com

Cc: stable@vger.kernel.org
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Kajetan Puchalski <kajetan.puchalski@arm.com>
CC: Qais Yousef <qyousef@layalina.io>
Fixes: 3087c61ed2 ("tools/testing/selftests/bpf: replace open-coded 16 with TASK_COMM_LEN")
Reported-by: John Stultz <jstultz@google.com>
Debugged-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Suggested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14 19:11:54 +01:00
Will Deacon
c47c2b173d cpuset: Call set_cpus_allowed_ptr() with appropriate mask for task
[ Upstream commit 7a2127e66a00e073db8d90f9aac308f4a8a64226 ]

set_cpus_allowed_ptr() will fail with -EINVAL if the requested
affinity mask is not a subset of the task_cpu_possible_mask() for the
task being updated. Consequently, on a heterogeneous system with cpusets
spanning the different CPU types, updates to the cgroup hierarchy can
silently fail to update task affinities when the effective affinity
mask for the cpuset is expanded.

For example, consider an arm64 system with 4 CPUs, where CPUs 2-3 are
the only cores capable of executing 32-bit tasks. Attaching a 32-bit
task to a cpuset containing CPUs 0-2 will correctly affine the task to
CPU 2. Extending the cpuset to CPUs 0-3, however, will fail to extend
the affinity mask of the 32-bit task because update_tasks_cpumask() will
pass the full 0-3 mask to set_cpus_allowed_ptr().

Extend update_tasks_cpumask() to take a temporary 'cpumask' parameter
and use it to mask the 'effective_cpus' mask with the possible mask for
each task being updated.
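
The 4-CPU example above can be worked through with plain bitmasks. This is
an illustrative model only: toy_cpumask, toy_set_cpus_allowed() and
toy_update_task() are hypothetical names and do not use the kernel cpumask
API; bit N of a mask stands for CPU N.

```c
#include <assert.h>
#include <errno.h>

/* Toy model: bit N of a mask stands for CPU N. Not the kernel cpumask API. */
typedef unsigned int toy_cpumask;

/* Mimic the subset rule described above: reject masks outside `possible`. */
static int toy_set_cpus_allowed(toy_cpumask *affinity, toy_cpumask requested,
				toy_cpumask possible)
{
	if (requested & ~possible)
		return -EINVAL;
	*affinity = requested;
	return 0;
}

/* Fixed update: intersect the cpuset's mask with the task's possible mask first. */
static int toy_update_task(toy_cpumask *affinity, toy_cpumask effective_cpus,
			   toy_cpumask possible)
{
	return toy_set_cpus_allowed(affinity, effective_cpus & possible, possible);
}
```

With a possible mask of 0xC (CPUs 2-3) and a cpuset extended to 0xF
(CPUs 0-3), passing 0xF directly is rejected with -EINVAL and the task stays
on CPU 2, whereas the masked update succeeds and yields 0xC.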

Fixes: 431c69fac0 ("cpuset: Honour task_cpu_possible_mask() in guarantee_online_cpus()")
Signed-off-by: Will Deacon <will@kernel.org>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-14 19:11:45 +01:00
Shiju Jose
0a3e60b3fe tracing: Fix poll() and select() do not work on per_cpu trace_pipe and trace_pipe_raw
commit 3e46d910d8acf94e5360126593b68bf4fee4c4a1 upstream.

poll() and select() on per_cpu trace_pipe and trace_pipe_raw do not work
since kernel 6.1-rc6. This issue is seen after the commit
42fb0a1e84 ("tracing/ring-buffer: Have
polling block on watermark").

This issue was first detected and reported when testing the CXL error
events in rasdaemon, and was also verified using a test application for poll()
and select().

The issue occurs in the per_cpu case when ring_buffer_poll_wait(), in
kernel/trace/ring_buffer.c, is called with buffer_percent > 0 and then waits until
that percentage of pages is available. The default value of buffer_percent, set in
kernel/trace/trace.c, is 50.

As a fix, allow a userspace application to set buffer_percent to 0 through
buffer_percent_fops, so that the task wakes up as soon as data is added
to any of the specific cpu buffers.
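
The effect of the watermark on a waiting reader can be pictured with a
simplified model. This is a sketch based only on the description above, not
the code in kernel/trace/ring_buffer.c; toy_should_wake() and its parameters
are hypothetical, and the exact comparison the kernel uses may differ.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Simplified model of the wake-up decision; all names are hypothetical.
 * A waiter is woken once at least `buffer_percent` percent of the pages hold
 * data; with buffer_percent == 0, any data at all is enough.
 */
static bool toy_should_wake(unsigned int dirty_pages, unsigned int total_pages,
			    unsigned int buffer_percent, bool has_data)
{
	if (buffer_percent == 0)
		return has_data;
	return dirty_pages * 100 >= buffer_percent * total_pages;
}
```

In this model, one dirty page out of ten does not wake a waiter at the
default of 50, but any data does once buffer_percent is 0.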

Link: https://lore.kernel.org/linux-trace-kernel/20230202182309.742-2-shiju.jose@huawei.com

Cc: <mhiramat@kernel.org>
Cc: <mchehab@kernel.org>
Cc: <linux-edac@vger.kernel.org>
Cc: stable@vger.kernel.org
Fixes: 42fb0a1e84 ("tracing/ring-buffer: Have polling block on watermark")
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14 19:11:42 +01:00
Neeraj Upadhyay
613832cac6 ANDROID: irq: manage: Export irq_do_set_affinity symbol
Vendor kernel modules may implement irq balancers, which take the
desc lock of an irq and then, based on the current affinity mask or
affinity hint, reconfigure the affinity of that irq.
For example, consider an irq whose affinity is broken, i.e. all
the cpus in its affinity mask have gone offline. For such irqs,
we might want to reset the affinity when the original set of
affined cpus comes back online. desc->affinity_hint can be used
to figure out the original affinity. So, the sequence for doing
this becomes:

desc = irq_to_desc(i);
raw_spin_lock(&desc->lock);
affinity = desc->affinity_hint;
raw_spin_unlock(&desc->lock);
irq_set_affinity_hint(i, affinity);

Here, we need to release the desc lock before calling the exported
api irq_set_affinity_hint(). This creates a window, after
unlocking the desc lock and before calling irq_set_affinity_hint(),
in which this setting can race with other irq_set_affinity_hint()
callers. So, export the irq_do_set_affinity() symbol to provide an
api which can be called with the desc lock held.
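
The benefit of a setter that can run with the lock held can be shown with a
single-threaded toy model. Everything below is hypothetical (struct
toy_irq_desc, toy_lock(), toy_do_set_affinity() and so on); it only models
the locking shape and does not call any kernel function.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for an irq descriptor; not the kernel's struct irq_desc. */
struct toy_irq_desc {
	bool locked;
	unsigned int affinity;		/* bit N set: CPU N may handle the irq */
	unsigned int affinity_hint;	/* originally requested CPUs */
};

static void toy_lock(struct toy_irq_desc *d)
{
	assert(!d->locked);
	d->locked = true;
}

static void toy_unlock(struct toy_irq_desc *d)
{
	assert(d->locked);
	d->locked = false;
}

/* Models a setter that expects the caller to already hold the lock. */
static void toy_do_set_affinity(struct toy_irq_desc *d, unsigned int mask)
{
	assert(d->locked);
	d->affinity = mask;
}

/* Read the hint and apply it inside one critical section: no window in between. */
static void toy_restore_affinity(struct toy_irq_desc *d)
{
	toy_lock(d);
	toy_do_set_affinity(d, d->affinity_hint);
	toy_unlock(d);
}
```

Because the hint is read and applied without dropping the lock, no other
caller can change it between the two steps.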

Bug: 187157600
Change-Id: Ifad88bfaa1e7eec09c3fe5a9dd7d1d421362b41e
Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org>
(cherry picked from commit 9f7014a6d21b6f650df4ba43869649ff37aa8c75)
Signed-off-by: Guru Das Srinagesh <quic_gurus@quicinc.com>
2023-02-14 10:07:59 -08:00
Jing-Ting Wu
cc50e0da70 ANDROID: sched: add vendor hook to PELT multiplier
Add a vendor hook at sched_pelt_multiplier for
performance tuning.

Bug: 268491135

Change-Id: I10e3436a986dd5dd7d375460922407666f27739d
Signed-off-by: Jing-Ting Wu <Jing-Ting.Wu@mediatek.com>
2023-02-10 18:55:55 +00:00
Greg Kroah-Hartman
c747c01851 This is the 6.1.11 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmPkywAACgkQONu9yGCS
 aT42Kw/9FFrdwv29yND651dPIglYKgO0Oz27/LFNGqst1A/G1ITzfs/94NSRr+9j
 uvwmBLbC+n/OXYavliBVWlPaYUCLqoFSfR+q953yz/UT0803E8BUvQ8NN8O7lsg7
 hfbWJaASxt5puy2pBFypeWM+OXoVOvUBj3VhbgtUwwcYLPuYafj9rCAytdIIf5fr
 RKWBLfx7As4OJ+Hb3KNkolTkFDTfV5+zqCAc9Ko474d1bpRnF15UdQN8Kkinr2+O
 YNGTvDT8jR8eAk/9PiCNrG7DEMSKaczP8n/ap6PikD/KnK7ShtCLwZztLnmu65g1
 vZG+cnEda8FuY3Ms03UrHhKqzMzBY/vslzBNMBTNmDsr+b7ilhffAYXPKS8s7xrg
 bJjmfzfITFAjXrml25enVO0V9RtTxv6E07U7SnDrLsvE2KBFZfUR/3Xl70bVBb0S
 db60kmEoq3XHHtoVySOHlfihVHSy02V9dlFcLOYMQsDHsGVsRXOR87g6d7+rJS3h
 hYWz5YxMLJUr2qn2836DPBnX9Ix0VjDx+X2fB4bNYzKc1dMlgzbpYrhk9LEOUDsx
 emJuqZskjkLby9Bw36N3eHW3fKPOFrwpYwPWYJHdWx1mmFSNdV6MdfEtZXpuEkFJ
 iFyJPeeODGadoiznnXTaBFfhozRj+B6FXrY6pkF+WMoSt8ZlZpM=
 =vu7j
 -----END PGP SIGNATURE-----

Merge 6.1.11 into android14-6.1

Changes in 6.1.11
	firewire: fix memory leak for payload of request subaction to IEC 61883-1 FCP region
	bus: sunxi-rsb: Fix error handling in sunxi_rsb_init()
	arm64: dts: imx8m-venice: Remove incorrect 'uart-has-rtscts'
	arm64: dts: freescale: imx8dxl: fix sc_pwrkey's property name linux,keycode
	ASoC: amd: acp-es8336: Drop reference count of ACPI device after use
	ASoC: Intel: bytcht_es8316: Drop reference count of ACPI device after use
	ASoC: Intel: bytcr_rt5651: Drop reference count of ACPI device after use
	ASoC: Intel: bytcr_rt5640: Drop reference count of ACPI device after use
	ASoC: Intel: bytcr_wm5102: Drop reference count of ACPI device after use
	ASoC: Intel: sof_es8336: Drop reference count of ACPI device after use
	ASoC: Intel: avs: Implement PCI shutdown
	bpf: Fix off-by-one error in bpf_mem_cache_idx()
	bpf: Fix a possible task gone issue with bpf_send_signal[_thread]() helpers
	ALSA: hda/via: Avoid potential array out-of-bound in add_secret_dac_path()
	bpf: Fix to preserve reg parent/live fields when copying range info
	selftests/filesystems: grant executable permission to run_fat_tests.sh
	ASoC: SOF: ipc4-mtrace: prevent underflow in sof_ipc4_priority_mask_dfs_write()
	bpf: Add missing btf_put to register_btf_id_dtor_kfuncs
	media: v4l2-ctrls-api.c: move ctrl->is_new = 1 to the correct line
	bpf, sockmap: Check for any of tcp_bpf_prots when cloning a listener
	arm64: dts: imx8mm: Fix pad control for UART1_DTE_RX
	arm64: dts: imx8mm-verdin: Do not power down eth-phy
	drm/vc4: hdmi: make CEC adapter name unique
	drm/ssd130x: Init display before the SSD130X_DISPLAY_ON command
	scsi: Revert "scsi: core: map PQ=1, PDT=other values to SCSI_SCAN_TARGET_PRESENT"
	bpf: Fix the kernel crash caused by bpf_setsockopt().
	ALSA: memalloc: Workaround for Xen PV
	vhost/net: Clear the pending messages when the backend is removed
	copy_oldmem_kernel() - WRITE is "data source", not destination
	WRITE is "data source", not destination...
	READ is "data destination", not source...
	zcore: WRITE is "data source", not destination...
	memcpy_real(): WRITE is "data source", not destination...
	fix iov_iter_bvec() "direction" argument
	fix 'direction' argument of iov_iter_{init,bvec}()
	fix "direction" argument of iov_iter_kvec()
	use less confusing names for iov_iter direction initializers
	vhost-scsi: unbreak any layout for response
	ice: Prevent set_channel from changing queues while RDMA active
	qede: execute xdp_do_flush() before napi_complete_done()
	virtio-net: execute xdp_do_flush() before napi_complete_done()
	dpaa_eth: execute xdp_do_flush() before napi_complete_done()
	dpaa2-eth: execute xdp_do_flush() before napi_complete_done()
	skb: Do mix page pool and page referenced frags in GRO
	sfc: correctly advertise tunneled IPv6 segmentation
	net: phy: dp83822: Fix null pointer access on DP83825/DP83826 devices
	net: wwan: t7xx: Fix Runtime PM initialization
	block, bfq: replace 0/1 with false/true in bic apis
	block, bfq: fix uaf for bfqq in bic_set_bfqq()
	netrom: Fix use-after-free caused by accept on already connected socket
	fscache: Use wait_on_bit() to wait for the freeing of relinquished volume
	platform/x86/amd/pmf: update to auto-mode limits only after AMT event
	platform/x86/amd/pmf: Add helper routine to update SPS thermals
	platform/x86/amd/pmf: Fix to update SPS default pprof thermals
	platform/x86/amd/pmf: Add helper routine to check pprof is balanced
	platform/x86/amd/pmf: Fix to update SPS thermals when power supply change
	platform/x86/amd/pmf: Ensure mutexes are initialized before use
	platform/x86: thinkpad_acpi: Fix thinklight LED brightness returning 255
	drm/i915/guc: Fix locking when searching for a hung request
	drm/i915: Fix request ref counting during error capture & debugfs dump
	drm/i915: Fix up locking around dumping requests lists
	drm/i915/adlp: Fix typo for reference clock
	net/tls: tls_is_tx_ready() checked list_entry
	ALSA: firewire-motu: fix unreleased lock warning in hwdep device
	netfilter: br_netfilter: disable sabotage_in hook after first suppression
	block: ublk: extending queue_size to fix overflow
	kunit: fix kunit_test_init_section_suites(...)
	squashfs: harden sanity check in squashfs_read_xattr_id_table
	maple_tree: should get pivots boundary by type
	sctp: do not check hb_timer.expires when resetting hb_timer
	net: phy: meson-gxl: Add generic dummy stubs for MMD register access
	drm/panel: boe-tv101wum-nl6: Ensure DSI writes succeed during disable
	ip/ip6_gre: Fix changing addr gen mode not generating IPv6 link local address
	ip/ip6_gre: Fix non-point-to-point tunnel not generating IPv6 link local address
	riscv: kprobe: Fixup kernel panic when probing an illegal position
	igc: return an error if the mac type is unknown in igc_ptp_systim_to_hwtstamp()
	octeontx2-af: Fix devlink unregister
	can: j1939: fix errant WARN_ON_ONCE in j1939_session_deactivate
	can: raw: fix CAN FD frame transmissions over CAN XL devices
	can: mcp251xfd: mcp251xfd_ring_set_ringparam(): assign missing tx_obj_num_coalesce_irq
	ata: libata: Fix sata_down_spd_limit() when no link speed is reported
	selftests: net: udpgso_bench_rx: Fix 'used uninitialized' compiler warning
	selftests: net: udpgso_bench_rx/tx: Stop when wrong CLI args are provided
	selftests: net: udpgso_bench: Fix racing bug between the rx/tx programs
	selftests: net: udpgso_bench_tx: Cater for pending datagrams zerocopy benchmarking
	virtio-net: Keep stop() to follow mirror sequence of open()
	net: openvswitch: fix flow memory leak in ovs_flow_cmd_new
	efi: fix potential NULL deref in efi_mem_reserve_persistent
	rtc: sunplus: fix format string for printing resource
	certs: Fix build error when PKCS#11 URI contains semicolon
	kbuild: modinst: Fix build error when CONFIG_MODULE_SIG_KEY is a PKCS#11 URI
	i2c: designware-pci: Add new PCI IDs for AMD NAVI GPU
	i2c: mxs: suppress probe-deferral error message
	scsi: target: core: Fix warning on RT kernels
	x86/aperfmperf: Erase stale arch_freq_scale values when disabling frequency invariance readings
	perf/x86/intel: Add Emerald Rapids
	perf/x86/intel/cstate: Add Emerald Rapids
	scsi: iscsi_tcp: Fix UAF during logout when accessing the shost ipaddress
	scsi: iscsi_tcp: Fix UAF during login when accessing the shost ipaddress
	i2c: rk3x: fix a bunch of kernel-doc warnings
	Revert "gfs2: stop using generic_writepages in gfs2_ail1_start_one"
	x86/build: Move '-mindirect-branch-cs-prefix' out of GCC-only block
	platform/x86: dell-wmi: Add a keymap for KEY_MUTE in type 0x0010 table
	platform/x86: hp-wmi: Handle Omen Key event
	platform/x86: gigabyte-wmi: add support for B450M DS3H WIFI-CF
	platform/x86/amd: pmc: Disable IRQ1 wakeup for RN/CZN
	net/x25: Fix to not accept on connected socket
	drm/amd/display: Fix timing not changning when freesync video is enabled
	bcache: Silence memcpy() run-time false positive warnings
	iio: adc: stm32-dfsdm: fill module aliases
	usb: dwc3: qcom: enable vbus override when in OTG dr-mode
	usb: gadget: f_fs: Fix unbalanced spinlock in __ffs_ep0_queue_wait
	vc_screen: move load of struct vc_data pointer in vcs_read() to avoid UAF
	fbcon: Check font dimension limits
	cgroup/cpuset: Fix wrong check in update_parent_subparts_cpumask()
	hv_netvsc: Fix missed pagebuf entries in netvsc_dma_map/unmap()
	ARM: dts: imx7d-smegw01: Fix USB host over-current polarity
	net: qrtr: free memory on error path in radix_tree_insert()
	can: isotp: split tx timer into transmission and timeout
	can: isotp: handle wait_event_interruptible() return values
	watchdog: diag288_wdt: do not use stack buffers for hardware data
	watchdog: diag288_wdt: fix __diag288() inline assembly
	ALSA: hda/realtek: Add Acer Predator PH315-54
	ALSA: hda/realtek: fix mute/micmute LEDs, speaker don't work for a HP platform
	ASoC: codecs: wsa883x: correct playback min/max rates
	ASoC: SOF: sof-audio: unprepare when swidget->use_count > 0
	ASoC: SOF: sof-audio: skip prepare/unprepare if swidget is NULL
	ASoC: SOF: keep prepare/unprepare widgets in sink path
	efi: Accept version 2 of memory attributes table
	rtc: efi: Enable SET/GET WAKEUP services as optional
	iio: hid: fix the retval in accel_3d_capture_sample
	iio: hid: fix the retval in gyro_3d_capture_sample
	iio: adc: xilinx-ams: fix devm_krealloc() return value check
	iio: adc: berlin2-adc: Add missing of_node_put() in error path
	iio: imx8qxp-adc: fix irq flood when call imx8qxp_adc_read_raw()
	iio:adc:twl6030: Enable measurements of VUSB, VBAT and others
	iio: light: cm32181: Fix PM support on system with 2 I2C resources
	iio: imu: fxos8700: fix ACCEL measurement range selection
	iio: imu: fxos8700: fix incomplete ACCEL and MAGN channels readback
	iio: imu: fxos8700: fix IMU data bits returned to user space
	iio: imu: fxos8700: fix map label of channel type to MAGN sensor
	iio: imu: fxos8700: fix swapped ACCEL and MAGN channels readback
	iio: imu: fxos8700: fix incorrect ODR mode readback
	iio: imu: fxos8700: fix failed initialization ODR mode assignment
	iio: imu: fxos8700: remove definition FXOS8700_CTRL_ODR_MIN
	iio: imu: fxos8700: fix MAGN sensor scale and unit
	nvmem: brcm_nvram: Add check for kzalloc
	nvmem: sunxi_sid: Always use 32-bit MMIO reads
	nvmem: qcom-spmi-sdam: fix module autoloading
	parisc: Fix return code of pdc_iodc_print()
	parisc: Replace hardcoded value with PRIV_USER constant in ptrace.c
	parisc: Wire up PTRACE_GETREGS/PTRACE_SETREGS for compat case
	riscv: disable generation of unwind tables
	Revert "mm: kmemleak: alloc gray object for reserved region with direct map"
	mm: multi-gen LRU: fix crash during cgroup migration
	mm: hugetlb: proc: check for hugetlb shared PMD in /proc/PID/smaps
	mm: memcg: fix NULL pointer in mem_cgroup_track_foreign_dirty_slowpath()
	usb: gadget: f_uac2: Fix incorrect increment of bNumEndpoints
	usb: typec: ucsi: Don't attempt to resume the ports before they exist
	usb: gadget: udc: do not clear gadget driver.bus
	kernel/irq/irqdomain.c: fix memory leak with using debugfs_lookup()
	HV: hv_balloon: fix memory leak with using debugfs_lookup()
	x86/debug: Fix stack recursion caused by wrongly ordered DR7 accesses
	fpga: m10bmc-sec: Fix probe rollback
	fpga: stratix10-soc: Fix return value check in s10_ops_write_init()
	mm/uffd: fix pte marker when fork() without fork event
	mm/swapfile: add cond_resched() in get_swap_pages()
	mm/khugepaged: fix ->anon_vma race
	mm, mremap: fix mremap() expanding for vma's with vm_ops->close()
	mm/MADV_COLLAPSE: catch !none !huge !bad pmd lookups
	highmem: round down the address passed to kunmap_flush_on_unmap()
	ia64: fix build error due to switch case label appearing next to declaration
	Squashfs: fix handling and sanity checking of xattr_ids count
	maple_tree: fix mas_empty_area_rev() lower bound validation
	migrate: hugetlb: check for hugetlb shared PMD in node migration
	dma-buf: actually set signaling bit for private stub fences
	serial: stm32: Merge hard IRQ and threaded IRQ handling into single IRQ handler
	drm/i915: Avoid potential vm use-after-free
	drm/i915: Fix potential bit_17 double-free
	drm/amd: Fix initialization for nbio 4.3.0
	drm/amd/pm: drop unneeded dpm features disablement for SMU 13.0.4/11
	drm/amdgpu: update wave data type to 3 for gfx11
	nvmem: core: initialise nvmem->id early
	nvmem: core: remove nvmem_config wp_gpio
	nvmem: core: fix cleanup after dev_set_name()
	nvmem: core: fix registration vs use race
	nvmem: core: fix device node refcounting
	nvmem: core: fix cell removal on error
	nvmem: core: fix return value
	phy: qcom-qmp-combo: fix runtime suspend
	serial: 8250_dma: Fix DMA Rx completion race
	serial: 8250_dma: Fix DMA Rx rearm race
	platform/x86/amd: pmc: add CONFIG_SERIO dependency
	ASoC: SOF: sof-audio: prepare_widgets: Check swidget for NULL on sink failure
	iio:adc:twl6030: Enable measurement of VAC
	powerpc/64s/radix: Fix crash with unaligned relocated kernel
	powerpc/64s: Fix local irq disable when PMIs are disabled
	powerpc/imc-pmu: Revert nest_init_lock to being a mutex
	fs/ntfs3: Validate attribute data and valid sizes
	ovl: Use "buf" flexible array for memcpy() destination
	f2fs: initialize locks earlier in f2fs_fill_super()
	fbdev: smscufx: fix error handling code in ufx_usb_probe
	f2fs: fix to do sanity check on i_extra_isize in is_alive()
	wifi: brcmfmac: Check the count value of channel spec to prevent out-of-bounds reads
	gfs2: Cosmetic gfs2_dinode_{in,out} cleanup
	gfs2: Always check inode size of inline inodes
	bpf: Skip invalid kfunc call in backtrack_insn
	Linux 6.1.11

Change-Id: I69722bc9711b91f2fca18de59746ada373f64c5e
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-02-09 13:29:55 +00:00
Hao Sun
74eec8266f bpf: Skip invalid kfunc call in backtrack_insn
commit d3178e8a434b58678d99257c0387810a24042fb6 upstream.

The verifier skips invalid kfunc call in check_kfunc_call(), which
would be captured in fixup_kfunc_call() if such insn is not eliminated
by dead code elimination. However, this can lead to the following
warning in backtrack_insn(), also see [1]:

  ------------[ cut here ]------------
  verifier backtracking bug
  WARNING: CPU: 6 PID: 8646 at kernel/bpf/verifier.c:2756 backtrack_insn
  kernel/bpf/verifier.c:2756
	__mark_chain_precision kernel/bpf/verifier.c:3065
	mark_chain_precision kernel/bpf/verifier.c:3165
	adjust_reg_min_max_vals kernel/bpf/verifier.c:10715
	check_alu_op kernel/bpf/verifier.c:10928
	do_check kernel/bpf/verifier.c:13821 [inline]
	do_check_common kernel/bpf/verifier.c:16289
  [...]

So make backtracking conservative with this by returning ENOTSUPP.

  [1] https://lore.kernel.org/bpf/CACkBjsaXNceR8ZjkLG=dT3P=4A8SBsg0Z5h5PWLryF5=ghKq=g@mail.gmail.com/

Reported-by: syzbot+4da3ff23081bafe74fc2@syzkaller.appspotmail.com
Signed-off-by: Hao Sun <sunhao.th@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20230104014709.9375-1-sunhao.th@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-09 11:28:27 +01:00
Greg Kroah-Hartman
cf1c917bf1 kernel/irq/irqdomain.c: fix memory leak with using debugfs_lookup()
commit d83d7ed260283560700d4034a80baad46620481b upstream.

When calling debugfs_lookup() the result must have dput() called on it,
otherwise the memory will leak over time.  To make things simpler, just
call debugfs_lookup_and_remove() instead which handles all of the logic
at once.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable <stable@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230202151554.2310273-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-09 11:28:21 +01:00
Waiman Long
a2ab7f2cf5 cgroup/cpuset: Fix wrong check in update_parent_subparts_cpumask()
commit e5ae8803847b80fe9d744a3174abe2b7bfed222a upstream.

It was found that the check to see if a partition could use up all
the cpus from the parent cpuset in update_parent_subparts_cpumask()
was incorrect. As a result, it is possible to leave parent with no
effective cpu left even if there are tasks in the parent cpuset. This
can lead to system panic as reported in [1].

Fix this problem by updating the check so that enabling the partition
fails if parent's effective_cpus is a subset of the child's cpus_allowed.
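
The check described above can be modelled in user space as a plain bitmask
subset test. This is only a sketch: cpumask_model, mask_subset() and
partition_enable_allowed() are hypothetical names, a 64-bit integer stands
in for a kernel cpumask, and the "parent has tasks" condition is an
assumption taken from the description of the failure, not from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a kernel cpumask: one bit per CPU. */
typedef uint64_t cpumask_model;

/* True when every CPU set in a is also set in b. */
static bool mask_subset(cpumask_model a, cpumask_model b)
{
	return (a & ~b) == 0;
}

/*
 * Sketch of the corrected check: refuse to enable the child partition
 * when it would take every effective CPU of a parent that still has
 * tasks (assumed condition), leaving those tasks with no CPU to run on.
 */
static bool partition_enable_allowed(cpumask_model parent_effective,
				     cpumask_model child_allowed,
				     bool parent_has_tasks)
{
	if (parent_has_tasks && mask_subset(parent_effective, child_allowed))
		return false;
	return true;
}
```

With parent and child sharing the same cpu list and a task in the parent,
the sketch rejects the request, which mirrors the test case mentioned in
this commit.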

Also record the error code when an error happens in update_prstate()
and add a test case where the parent partition and the child have the
same cpu list and the parent has tasks. Enabling the partition in the
child will fail in this case.

[1] https://www.spinics.net/lists/cgroups/msg36254.html

Fixes: f0af1bfc27 ("cgroup/cpuset: Relax constraints to partition & cpus changes")
Cc: stable@vger.kernel.org # v6.1
Reported-by: Srinivas Pandruvada <srinivas.pandruvada@intel.com>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-09 11:28:15 +01:00
Al Viro
5a19095103 use less confusing names for iov_iter direction initializers
[ Upstream commit de4eda9de2d957ef2d6a8365a01e26a435e958cb ]

READ/WRITE proved to be actively confusing - the meanings are
"data destination, as used with read(2)" and "data source, as
used with write(2)", but people keep interpreting those as
"we read data from it" and "we write data to it", i.e. exactly
the wrong way.

Call them ITER_DEST and ITER_SOURCE - at least that is harder
to misinterpret...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Stable-dep-of: 6dd88fd59da8 ("vhost-scsi: unbreak any layout for response")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-09 11:28:04 +01:00
Kui-Feng Lee
3331d34160 bpf: Fix the kernel crash caused by bpf_setsockopt().
[ Upstream commit 5416c9aea8323583e8696f0500b6142dfae80821 ]

The kernel crash was caused by a BPF program attached to the
"lsm_cgroup/socket_sock_rcv_skb" hook, which performed a call to
`bpf_setsockopt()` in order to set the TCP_NODELAY flag as an
example. Flags like TCP_NODELAY can prompt the kernel to flush a
socket's outgoing queue, and this hook
"lsm_cgroup/socket_sock_rcv_skb" is frequently triggered by
softirqs. The issue was that in certain circumstances, when
`tcp_write_xmit()` was called to flush the queue, it would also allow
BH (bottom-half) to run. This could lead to our program attempting to
flush the same socket recursively, which caused a `skbuff` to be
unlinked twice.

`security_sock_rcv_skb()` is triggered by `tcp_filter()`. This occurs
before the sock ownership is checked in `tcp_v4_rcv()`. Consequently,
if a bpf program runs on `security_sock_rcv_skb()` while under softirq
conditions, it may not possess the lock needed for `bpf_setsockopt()`,
thus presenting an issue.

The patch fixes this issue by ensuring that a BPF program attached to
the "lsm_cgroup/socket_sock_rcv_skb" hook is not allowed to call
`bpf_setsockopt()`.

The differences from v1 are
 - changing commit log to explain holding the lock of the sock,
 - emphasizing that TCP_NODELAY is not the only flag, and
 - adding the fixes tag.

v1: https://lore.kernel.org/bpf/20230125000244.1109228-1-kuifeng@meta.com/

Signed-off-by: Kui-Feng Lee <kuifeng@meta.com>
Fixes: 9113d7e48e ("bpf: expose bpf_{g,s}etsockopt to lsm cgroup")
Link: https://lore.kernel.org/r/20230127001732.4162630-1-kuifeng@meta.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-09 11:28:02 +01:00
Jiri Olsa
d5c7a2ab5e bpf: Add missing btf_put to register_btf_id_dtor_kfuncs
[ Upstream commit 74bc3a5acc82f020d2e126f56c535d02d1e74e37 ]

We take the BTF reference before we register dtors and we need
to put it back when it's done.

We probably won't see a problem with kernel BTF, but module BTF
would stay loaded (because of the extra ref) even when its module
is removed.

Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Fixes: 5ce937d613 ("bpf: Populate pairs of btf_id and destructor kfunc in btf")
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230120122148.1522359-1-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-09 11:28:01 +01:00
Eduard Zingerman
7c7652ffa9 bpf: Fix to preserve reg parent/live fields when copying range info
[ Upstream commit 71f656a50176915d6813751188b5758daa8d012b ]

Register range information is copied in several places. The intent is
to transfer range/id information from one register/stack spill to
another. Currently this is done using direct register assignment, e.g.:

static void find_equal_scalars(..., struct bpf_reg_state *known_reg)
{
	...
	struct bpf_reg_state *reg;
	...
			*reg = *known_reg;
	...
}

However, such assignments also copy the following bpf_reg_state fields:

struct bpf_reg_state {
	...
	struct bpf_reg_state *parent;
	...
	enum bpf_reg_liveness live;
	...
};

Copying of these fields is accidental and incorrect, as could be
demonstrated by the following example:

     0: call ktime_get_ns()
     1: r6 = r0
     2: call ktime_get_ns()
     3: r7 = r0
     4: if r0 > r6 goto +1             ; r0 & r6 are unbound thus generated
                                       ; branch states are identical
     5: *(u64 *)(r10 - 8) = 0xdeadbeef ; 64-bit write to fp[-8]
    --- checkpoint ---
     6: r1 = 42                        ; r1 marked as written
     7: *(u8 *)(r10 - 8) = r1          ; 8-bit write, fp[-8] parent & live
                                       ; overwritten
     8: r2 = *(u64 *)(r10 - 8)
     9: r0 = 0
    10: exit

This example is unsafe because 64-bit write to fp[-8] at (5) is
conditional, thus not all bytes of fp[-8] are guaranteed to be set
when it is read at (8). However, currently the example passes
verification.

First, the execution path 1-10 is examined by verifier.
Suppose that a new checkpoint is created by is_state_visited() at (6).
After checkpoint creation:
- r1.parent points to checkpoint.r1,
- fp[-8].parent points to checkpoint.fp[-8].
At (6) the r1.live is set to REG_LIVE_WRITTEN.
At (7) the fp[-8].parent is set to r1.parent and fp[-8].live is set to
REG_LIVE_WRITTEN, because of the following code called in
check_stack_write_fixed_off():

static void save_register_state(struct bpf_func_state *state,
				int spi, struct bpf_reg_state *reg,
				int size)
{
	...
	state->stack[spi].spilled_ptr = *reg;  // <--- parent & live copied
	if (size == BPF_REG_SIZE)
		state->stack[spi].spilled_ptr.live |= REG_LIVE_WRITTEN;
	...
}

Note the intent to mark the stack spill as written only if 8 bytes are
spilled to a slot; however, this intent is defeated by the 'live' field copy.
At (8) the checkpoint.fp[-8] should be marked as REG_LIVE_READ but
this does not happen:
- fp[-8] in a current state is already marked as REG_LIVE_WRITTEN;
- fp[-8].parent points to checkpoint.r1, parentage chain is used by
  mark_reg_read() to mark checkpoint states.
At (10) the verification is finished for path 1-10 and jump 4-6 is
examined. The checkpoint.fp[-8] never gets REG_LIVE_READ mark and this
spill is pruned from the cached states by clean_live_states(). Hence
verifier state obtained via path 1-4,6 is deemed identical to one
obtained via path 1-6 and the program is marked as safe.

Note: the example should be executed with BPF_F_TEST_STATE_FREQ flag
set to force creation of intermediate verifier states.

This commit revisits the locations where bpf_reg_state instances are
copied and replaces the direct copies with a call to a function
copy_register_state(dst, src) that preserves 'parent' and 'live'
fields of the 'dst'.
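
A user-space sketch of the helper described above, assuming a reduced
register model: struct reg_state_model and enum liveness_model are
hypothetical stand-ins for the kernel's bpf_reg_state and bpf_reg_liveness,
and only a few range fields are modelled. The point is the save/restore of
'parent' and 'live' around the structure assignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical liveness marks; values are illustrative only. */
enum liveness_model { LIVE_NONE = 0, LIVE_READ = 1, LIVE_WRITTEN = 4 };

/* Hypothetical, reduced model of a verifier register state. */
struct reg_state_model {
	uint64_t umin, umax;            /* range information */
	uint32_t id;                    /* scalar id */
	struct reg_state_model *parent; /* parentage chain link */
	enum liveness_model live;       /* liveness marks */
};

/* Copy range/id information from src but keep dst's parent and live. */
static void copy_register_state(struct reg_state_model *dst,
				const struct reg_state_model *src)
{
	struct reg_state_model *parent;
	enum liveness_model live;

	assert(dst != NULL && src != NULL);
	parent = dst->parent;
	live = dst->live;
	*dst = *src;          /* copies everything, including parent/live */
	dst->parent = parent; /* restore the destination's own chain link */
	dst->live = live;     /* restore the destination's own marks */
}
```

In terms of the example, a spill of r1 into fp[-8] done this way leaves
fp[-8].parent pointing at checkpoint.fp[-8] rather than checkpoint.r1.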

Fixes: 679c782de1 ("bpf/verifier: per-register parent pointers")
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230106142214.1040390-2-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-09 11:28:01 +01:00
Yonghong Song
6a199d556c bpf: Fix a possible task gone issue with bpf_send_signal[_thread]() helpers
[ Upstream commit bdb7fdb0aca8b96cef9995d3a57e251c2289322f ]

In current bpf_send_signal() and bpf_send_signal_thread() helper
implementation, irq_work is used to handle nmi context. Hao Sun
reported in [1] that the current task at the entry of the helper
might be gone during irq_work callback processing. To fix the issue,
a reference is acquired for the current task before enqueuing into
the irq_work so that the queued task is still available during
irq_work callback processing.
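
The idea of pinning the task before deferring the work can be illustrated
with a small user-space model. Everything here is hypothetical
(task_model, signal_work_model, task_get(), task_put(), queue_signal(),
run_signal_work()); it is not the kernel's task_struct or irq_work API,
only a sketch of the reference-counting pattern the fix relies on.

```c
#include <stddef.h>

/* Hypothetical stand-in for a refcounted task. */
struct task_model {
	int refs;  /* outstanding references */
	int freed; /* set once the last reference is dropped */
};

static struct task_model *task_get(struct task_model *t)
{
	t->refs++;
	return t;
}

static void task_put(struct task_model *t)
{
	if (--t->refs == 0)
		t->freed = 1;
}

/* Hypothetical deferred-work record carrying the target task. */
struct signal_work_model {
	struct task_model *task;
	int sig;
};

/* Enqueue side: take a reference so the task outlives the deferral. */
static void queue_signal(struct signal_work_model *w,
			 struct task_model *cur, int sig)
{
	w->task = task_get(cur);
	w->sig = sig;
}

/*
 * Callback side: use the task, then drop the reference taken at enqueue.
 * Returns 1 if the task was still valid when the callback ran.
 */
static int run_signal_work(struct signal_work_model *w)
{
	int alive = !w->task->freed;

	task_put(w->task);
	w->task = NULL;
	return alive;
}
```

If the task exits between enqueue and callback, the extra reference keeps
the model's object valid until run_signal_work() has finished with it.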

  [1] https://lore.kernel.org/bpf/20230109074425.12556-1-sunhao.th@gmail.com/

Fixes: 8b401f9ed2 ("bpf: implement bpf_send_signal() helper")
Tested-by: Hao Sun <sunhao.th@gmail.com>
Reported-by: Hao Sun <sunhao.th@gmail.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20230118204815.3331855-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-09 11:28:00 +01:00
Hou Tao
c32efcf9ff bpf: Fix off-by-one error in bpf_mem_cache_idx()
[ Upstream commit 36024d023d139a0c8b552dc3b7f4dc7b4c139e8f ]

According to the definition of sizes[NUM_CACHES], when the size passed
to bpf_mem_cache_idx() is 256, it should return 6 instead of 7.
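
To make the expected index concrete, here is a user-space sketch. The
contents of the sizes[] table below are an assumption (they are not quoted
in this commit message), and fls_model(), cache_idx_by_scan() and
cache_idx_by_fls() are hypothetical helpers, not the kernel functions. The
scan picks the smallest bucket that fits; the closed form only covers
sizes above 192 and is checked against the scan.

```c
#include <stddef.h>

#define NUM_CACHES 11

/* Assumed bucket sizes; index 6 holds the 256-byte cache. */
static const size_t sizes[NUM_CACHES] = {
	96, 192, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096
};

/* Position of the highest set bit, counting from 1; 0 for x == 0. */
static int fls_model(size_t x)
{
	int bit = 0;

	while (x) {
		bit++;
		x >>= 1;
	}
	return bit;
}

/* Reference: index of the smallest bucket that can hold size, or -1. */
static int cache_idx_by_scan(size_t size)
{
	int best = -1;

	if (size == 0)
		return -1;
	for (int i = 0; i < NUM_CACHES; i++)
		if (sizes[i] >= size && (best < 0 || sizes[i] < sizes[best]))
			best = i;
	return best;
}

/* Closed form for 192 < size <= 4096 under the assumed table. */
static int cache_idx_by_fls(size_t size)
{
	if (size <= 192 || size > 4096)
		return -1; /* outside the range this sketch models */
	return fls_model(size - 1) - 2;
}
```

Under these assumptions a size of 256 maps to index 6 with either helper,
while subtracting 1 instead of 2 in the closed form would give 7, the
off-by-one this commit describes.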

Fixes: 7c8199e24f ("bpf: Introduce any context BPF specific memory allocator.")
Signed-off-by: Hou Tao <houtao1@huawei.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20230118084630.3750680-1-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-09 11:28:00 +01:00
Greg Kroah-Hartman
be71152643 This is the 6.1.10 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmPgpwAACgkQONu9yGCS
 aT7EZhAAmhKEM2VUV115LmojNeeP26BKwVNsVWZbqc6IEi8xK2KMJQrlhFNOz8iG
 i1Poa5poCO2ZELEUwPTUkcw5X7mMxd9ik9L0lfFmvcXNsf5FWd1+KQzMMCMy6fY1
 aI3jW+7wZGT60ABnc6dZUU9v74Rh7ULlokFr3Os6qJXAD+uXLKEkf0mpgXN86jvN
 Kahl21UVzddHna7+HJXOBxx+LJbD0K7yMhKwGVC5uiV2Dt0BATvn6vFqEsZBchEB
 gunfiLc1TT8Tye04WHpHgeMtqedpdZqAxdMeAMSyt1ZMSMmHgYxHhma1oRsA3YNF
 UGhEwsI0Jz3lWtUQNem5o9fYfSn2H2g4lkNU6BnlPBRFno0xuaJ/vgJIbhnZ9g+b
 toLiLHkAP8AAgacOQ11HV+RVcNWFdIThVzqEqG7MEENy7gXtFAPIOBwF+CvLWX/k
 RoYUFfffm803Sxl2/SUBp8uI5YArVSZ5oGdFWk3ogegdBQ4cC1TIGTO8ppE0CZpL
 6PL59tVwMdymm6zDEEZUfILygflCsvLCsEcAF69kaRhRMzKaJf9WYx81IoZ9atFu
 Ck3U2jl2b7acEI0kdRlmFgpSyc79yV26osoVst2QsEfKUs3eVQ759rgjt4tvq9qP
 CUlUbieluvuF+W559hAJPr2XvXFOhsvYn5RUYC0+E3ahdMdYagA=
 =itLu
 -----END PGP SIGNATURE-----

Merge 6.1.10 into android14-6.1

Changes in 6.1.10
	ARM: dts: imx: Fix pca9547 i2c-mux node name
	ARM: dts: vf610: Fix pca9548 i2c-mux node names
	arm64: dts: freescale: Fix pca954x i2c-mux node names
	arm64: dts: imx8mq-thor96: fix no-mmc property for SDHCI
	firmware: arm_scmi: Clear stale xfer->hdr.status
	bpf: Skip task with pid=1 in send_signal_common()
	erofs/zmap.c: Fix incorrect offset calculation
	mac80211: Fix MLO address translation for multiple bss case
	arm64: dts: msm8994-angler: fix the memory map
	ARM: omap1: fix building gpio15xx
	kselftest: Fix error message for unconfigured LLVM builds
	erofs: clean up parsing of fscache related options
	blk-cgroup: fix missing pd_online_fn() while activating policy
	LoongArch: Get frame info in unwind_start() when regs is not available
	ACPI: video: Add backlight=native DMI quirk for Acer Aspire 4810T
	block: fix hctx checks for batch allocation
	s390: workaround invalid gcc-11 out of bounds read warning
	HID: uclogic: Add support for XP-PEN Deco 01 V2
	HID: playstation: sanity check DualSense calibration data.
	dmaengine: imx-sdma: Fix a possible memory leak in sdma_transfer_init
	gpiolib: acpi: Allow ignoring wake capability on pins that aren't in _AEI
	cifs: fix return of uninitialized rc in dfs_cache_update_tgthint()
	nvme-apple: only reset the controller when RTKit is running
	gpiolib: acpi: Add a ignore wakeup quirk for Clevo NL5xRU
	gpiolib-acpi: Don't set GPIOs for wakeup in S3 mode
	net: fix NULL pointer in skb_segment_list
	rust: print: avoid evaluating arguments in `pr_*` macros in `unsafe` blocks
	net: mctp: purge receive queues on sk destruction
	Linux 6.1.10

Change-Id: I38cb1a3d3f619094d9a248c59bccbe4a56c6d70e
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-02-06 10:42:24 +00:00
Hao Sun
1283a01b6e bpf: Skip task with pid=1 in send_signal_common()
[ Upstream commit a3d81bc1eaef48e34dd0b9b48eefed9e02a06451 ]

The following kernel panic can be triggered when a task with pid=1 attaches
a prog that attempts to send a killing signal to itself (see also [1] for
more details):

  Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
  CPU: 3 PID: 1 Comm: systemd Not tainted 6.1.0-09652-g59fe41b5255f #148
  Call Trace:
  <TASK>
  __dump_stack lib/dump_stack.c:88 [inline]
  dump_stack_lvl+0x100/0x178 lib/dump_stack.c:106
  panic+0x2c4/0x60f kernel/panic.c:275
  do_exit.cold+0x63/0xe4 kernel/exit.c:789
  do_group_exit+0xd4/0x2a0 kernel/exit.c:950
  get_signal+0x2460/0x2600 kernel/signal.c:2858
  arch_do_signal_or_restart+0x78/0x5d0 arch/x86/kernel/signal.c:306
  exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
  exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203
  __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
  syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
  do_syscall_64+0x44/0xb0 arch/x86/entry/common.c:86
  entry_SYSCALL_64_after_hwframe+0x63/0xcd

So skip task with pid=1 in bpf_send_signal_common() to avoid the panic.
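
As a rough user-space illustration of such a guard: send_signal_model() is
a hypothetical function, the pid is passed in explicitly instead of being
read from the current task, and returning -EPERM for the skipped case is an
assumption, since the message above only says the task is skipped.

```c
#include <errno.h>

/*
 * Hypothetical model of the guard: refuse to signal pid 1 so that init
 * cannot be killed through this path; accept every other target.
 */
static int send_signal_model(int target_pid, int sig)
{
	(void)sig; /* the signal number does not affect the guard */

	if (target_pid == 1)
		return -EPERM; /* assumed error code for the skipped case */
	return 0;
}
```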

  [1] https://lore.kernel.org/bpf/20221222043507.33037-1-sunhao.th@gmail.com

Signed-off-by: Hao Sun <sunhao.th@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/20230106084838.12690-1-sunhao.th@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-06 08:06:31 +01:00
Greg Kroah-Hartman
936f394ef7 This is the 6.1.9 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmPaFzoACgkQONu9yGCS
 aT6Y7Q//bOQ+QfUsJ9oi0hCQpC4L4REaM/WpqyWFn+/75KB4KDZ7IGaHAZ8UZSPQ
 DwZ0aoIAapQyAL7Q5WUDnG51Q07Xi4NfWPHNlz1FqAKdJu2D8uAmYP9I6M0JpEbg
 nV5ki8UXETkIu7EnfS7+5MjHLt99DaA+W0Z1J+qqXONRoszELUNfMdTZMoqVX5Vx
 gqmSpHmySt2mhSr8k4Inx5OvhF6pZ9mQVq0baUEieAcyaRXSRBBLTtOgntcYyq+R
 aAoCV5E+lLDZVkjntc6wKtTECD6zegfXCBqZdxQ1RUt5SBTn7K2XnGqQt+V3UbeH
 5kFwUngvnpGDQeS8VuzWo+yGBLu0cp6PShP329SbO5o0bY8qRxiWfr37sxfMq/yh
 F947AjG2wWouCK4xle68/O6GvZNLtKJI1Z0MihpFKmeLbvL0S88rkSnhwjPQ5qBe
 kK8RfUATLKkl6XoTyJT/v/o+/tlAuHj3txrH3zsB0MQWuuxBkZ1JAAnmDnBCcvIJ
 BAr6HFRFr6kTfcREnMKkWr2EXO98DGrk0Eg9FTedm1F4RSL8iGQenTXNmRMhSxFv
 /MtF0sRwkstI+v7EINmmK+wNJeye03WjmWDjJVxIqOwfmGC5EfCGhGV4CfmdnBsE
 N18DZMZ5oc9ft/zmH9Pi/vJUlwRHDS52uQ3r7K3TYXHHveT62FE=
 =8rzU
 -----END PGP SIGNATURE-----

Merge 6.1.9 into android14-6.1

Changes in 6.1.9
	memory: tegra: Remove clients SID override programming
	memory: atmel-sdramc: Fix missing clk_disable_unprepare in atmel_ramc_probe()
	memory: mvebu-devbus: Fix missing clk_disable_unprepare in mvebu_devbus_probe()
	arm64: dts: qcom: sc8280xp: fix primary USB-DP PHY reset
	dmaengine: qcom: gpi: Set link_rx bit on GO TRE for rx operation
	dmaengine: ti: k3-udma: Do conditional decrement of UDMA_CHAN_RT_PEER_BCNT_REG
	soc: imx: imx8mp-blk-ctrl: enable global pixclk with HDMI_TX_PHY PD
	arm64: dts: imx8mp-phycore-som: Remove invalid PMIC property
	ARM: dts: imx6ul-pico-dwarf: Use 'clock-frequency'
	ARM: dts: imx7d-pico: Use 'clock-frequency'
	ARM: dts: imx6qdl-gw560x: Remove incorrect 'uart-has-rtscts'
	arm64: dts: verdin-imx8mm: fix dahlia audio playback
	arm64: dts: imx8mm-beacon: Fix ecspi2 pinmux
	arm64: dts: verdin-imx8mm: fix dev board audio playback
	arm64: dts: imx93-11x11-evk: correct clock and strobe pad setting
	ARM: imx: add missing of_node_put()
	soc: imx: imx8mp-blk-ctrl: don't set power device name
	arm64: dts: imx8mp: Fix missing GPC Interrupt
	arm64: dts: imx8mp: Fix power-domain typo
	arm64: dts: imx8mp-evk: pcie0-refclk cosmetic cleanup
	HID: intel_ish-hid: Add check for ishtp_dma_tx_map
	arm64: dts: imx8mm-venice-gw7901: fix USB2 controller OC polarity
	soc: imx8m: Fix incorrect check for of_clk_get_by_name()
	reset: ti-sci: honor TI_SCI_PROTOCOL setting when not COMPILE_TEST
	reset: uniphier-glue: Fix possible null-ptr-deref
	EDAC/highbank: Fix memory leak in highbank_mc_probe()
	firmware: arm_scmi: Harden shared memory access in fetch_response
	firmware: arm_scmi: Harden shared memory access in fetch_notification
	firmware: arm_scmi: Fix virtio channels cleanup on shutdown
	interconnect: qcom: msm8996: Provide UFS clocks to A2NoC
	interconnect: qcom: msm8996: Fix regmap max_register values
	HID: amd_sfh: Fix warning unwind goto
	tomoyo: fix broken dependency on *.conf.default
	RDMA/rxe: Fix inaccurate constants in rxe_type_info
	RDMA/rxe: Prevent faulty rkey generation
	erofs: fix kvcalloc() misuse with __GFP_NOFAIL
	arm64: dts: marvell: AC5/AC5X: Fix address for UART1
	RDMA/core: Fix ib block iterator counter overflow
	IB/hfi1: Reject a zero-length user expected buffer
	IB/hfi1: Reserve user expected TIDs
	IB/hfi1: Fix expected receive setup error exit issues
	IB/hfi1: Immediately remove invalid memory from hardware
	IB/hfi1: Remove user expected buffer invalidate race
	affs: initialize fsdata in affs_truncate()
	PM: AVS: qcom-cpr: Fix an error handling path in cpr_probe()
	arm64: dts: qcom: msm8992: Don't use sfpb mutex
	arm64: dts: qcom: msm8992-libra: Fix the memory map
	kbuild: export top-level LDFLAGS_vmlinux only to scripts/Makefile.vmlinux
	kbuild: fix 'make modules' error when CONFIG_DEBUG_INFO_BTF_MODULES=y
	phy: ti: fix Kconfig warning and operator precedence
	drm/msm/gpu: Fix potential double-free
	NFSD: fix use-after-free in nfsd4_ssc_setup_dul()
	ARM: dts: at91: sam9x60: fix the ddr clock for sam9x60
	drm/vc4: bo: Fix drmm_mutex_init memory hog
	phy: usb: sunplus: Fix potential null-ptr-deref in sp_usb_phy_probe()
	bpf: hash map, avoid deadlock with suitable hash mask
	amd-xgbe: TX Flow Ctrl Registers are h/w ver dependent
	amd-xgbe: Delay AN timeout during KR training
	bpf: Fix pointer-leak due to insufficient speculative store bypass mitigation
	drm/vc4: bo: Fix unused variable warning
	phy: rockchip-inno-usb2: Fix missing clk_disable_unprepare() in rockchip_usb2phy_power_on()
	net: nfc: Fix use-after-free in local_cleanup()
	net: wan: Add checks for NULL for utdm in undo_uhdlc_init and unmap_si_regs
	net: enetc: avoid deadlock in enetc_tx_onestep_tstamp()
	net: lan966x: add missing fwnode_handle_put() for ports node
	sch_htb: Avoid grafting on htb_destroy_class_offload when destroying htb
	gpio: mxc: Protect GPIO irqchip RMW with bgpio spinlock
	gpio: mxc: Always set GPIOs used as interrupt source to INPUT mode
	wifi: rndis_wlan: Prevent buffer overflow in rndis_query_oid
	pinctrl: rockchip: fix reading pull type on rk3568
	net: stmmac: Fix queue statistics reading
	net/sched: sch_taprio: fix possible use-after-free
	l2tp: convert l2tp_tunnel_list to idr
	l2tp: close all race conditions in l2tp_tunnel_register()
	net: usb: sr9700: Handle negative len
	net: mdio: validate parameter addr in mdiobus_get_phy()
	HID: check empty report_list in hid_validate_values()
	HID: check empty report_list in bigben_probe()
	net: stmmac: fix invalid call to mdiobus_get_phy()
	pinctrl: rockchip: fix mux route data for rk3568
	ARM: dts: stm32: Fix qspi pinctrl phandle for stm32mp15xx-dhcor-som
	ARM: dts: stm32: Fix qspi pinctrl phandle for stm32mp15xx-dhcom-som
	ARM: dts: stm32: Fix qspi pinctrl phandle for stm32mp157c-emstamp-argon
	ARM: dts: stm32: Fix qspi pinctrl phandle for stm32mp151a-prtt1l
	HID: revert CHERRY_MOUSE_000C quirk
	block/rnbd-clt: fix wrong max ID in ida_alloc_max
	usb: ucsi: Ensure connector delayed work items are flushed
	usb: gadget: f_fs: Prevent race during ffs_ep0_queue_wait
	usb: gadget: f_fs: Ensure ep0req is dequeued before free_request
	netfilter: conntrack: handle tcp challenge acks during connection reuse
	Bluetooth: Fix a buffer overflow in mgmt_mesh_add()
	Bluetooth: hci_conn: Fix memory leaks
	Bluetooth: hci_sync: fix memory leak in hci_update_adv_data()
	Bluetooth: ISO: Avoid circular locking dependency
	Bluetooth: ISO: Fix possible circular locking dependency
	Bluetooth: hci_event: Fix Invalid wait context
	Bluetooth: Fix possible deadlock in rfcomm_sk_state_change
	net: ipa: disable ipa interrupt during suspend
	net/mlx5e: Avoid false lock dependency warning on tc_ht even more
	net/mlx5: E-switch, Fix setting of reserved fields on MODIFY_SCHEDULING_ELEMENT
	net/mlx5e: QoS, Fix wrongfully setting parent_element_id on MODIFY_SCHEDULING_ELEMENT
	net/mlx5e: Set decap action based on attr for sample
	net/mlx5: E-switch, Fix switchdev mode after devlink reload
	net: mlx5: eliminate anonymous module_init & module_exit
	drm/panfrost: fix GENERIC_ATOMIC64 dependency
	dmaengine: Fix double increment of client_count in dma_chan_get()
	net: macb: fix PTP TX timestamp failure due to packet padding
	virtio-net: correctly enable callback during start_xmit
	l2tp: prevent lockdep issue in l2tp_tunnel_register()
	HID: betop: check shape of output reports
	drm/i915/selftests: Unwind hugepages to drop wakeref on error
	cifs: fix potential deadlock in cache_refresh_path()
	dmaengine: xilinx_dma: call of_node_put() when breaking out of for_each_child_of_node()
	dmaengine: tegra: Fix memory leak in terminate_all()
	phy: phy-can-transceiver: Skip warning if no "max-bitrate"
	drm/amd/display: fix issues with driver unload
	net: sched: gred: prevent races when adding offloads to stats
	nvme-pci: fix timeout request state check
	tcp: avoid the lookup process failing to get sk in ehash table
	usb: dwc3: fix extcon dependency
	ptdma: pt_core_execute_cmd() should use spinlock
	device property: fix of node refcount leak in fwnode_graph_get_next_endpoint()
	w1: fix deadloop in __w1_remove_master_device()
	w1: fix WARNING after calling w1_process()
	driver core: Fix test_async_probe_init saves device in wrong array
	selftests/net: toeplitz: fix race on tpacket_v3 block close
	net: dsa: microchip: ksz9477: port map correction in ALU table entry register
	thermal: Validate new state in cur_state_store()
	thermal/core: fix error code in __thermal_cooling_device_register()
	thermal: core: call put_device() only after device_register() fails
	net: stmmac: enable all safety features by default
	bnxt: Do not read past the end of test names
	tcp: fix rate_app_limited to default to 1
	scsi: iscsi: Fix multiple iSCSI session unbind events sent to userspace
	ASoC: SOF: pm: Set target state earlier
	ASoC: SOF: pm: Always tear down pipelines before DSP suspend
	ASoC: SOF: Add FW state to debugfs
	ASoC: amd: yc: Add Razer Blade 14 2022 into DMI table
	spi: cadence: Fix busy cycles calculation
	cpufreq: CPPC: Add u64 casts to avoid overflowing
	cpufreq: Add Tegra234 to cpufreq-dt-platdev blocklist
	ASoC: mediatek: mt8186: support rt5682s_max98360
	ASoC: mediatek: mt8186: Add machine support for max98357a
	ASoC: amd: yc: Add ASUS M5402RA into DMI table
	ASoC: support machine driver with max98360
	kcsan: test: don't put the expect array on the stack
	cpufreq: Add SM6375 to cpufreq-dt-platdev blocklist
	ASoC: fsl_micfil: Correct the number of steps on SX controls
	drm/msm/a6xx: Avoid gx gbit halt during rpm suspend
	net: usb: cdc_ether: add support for Thales Cinterion PLS62-W modem
	drm: Add orientation quirk for Lenovo ideapad D330-10IGL
	s390/debug: add _ASM_S390_ prefix to header guard
	s390: expicitly align _edata and _end symbols on page boundary
	xen/pvcalls: free active map buffer on pvcalls_front_free_map
	perf/x86/cstate: Add Meteor Lake support
	perf/x86/msr: Add Meteor Lake support
	perf/x86/msr: Add Emerald Rapids
	perf/x86/intel/uncore: Add Emerald Rapids
	nolibc: fix fd_set type
	tools/nolibc: Fix S_ISxxx macros
	tools/nolibc: fix missing includes causing build issues at -O0
	tools/nolibc: prevent gcc from making memset() loop over itself
	cpufreq: armada-37xx: stop using 0 as NULL pointer
	ASoC: fsl_ssi: Rename AC'97 streams to avoid collisions with AC'97 CODEC
	ASoC: fsl-asoc-card: Fix naming of AC'97 CODEC widgets
	ACPI: resource: Skip IRQ override on Asus Expertbook B2402CBA
	drm/amdkfd: Add sync after creating vram bo
	drm/amdkfd: Fix NULL pointer error for GC 11.0.1 on mGPU
	cifs: fix potential memory leaks in session setup
	spi: spidev: remove debug messages that access spidev->spi without locking
	KVM: s390: interrupt: use READ_ONCE() before cmpxchg()
	scsi: hisi_sas: Use abort task set to reset SAS disks when discovered
	scsi: hisi_sas: Set a port invalid only if there are no devices attached when refreshing port id
	r8152: add vendor/device ID pair for Microsoft Devkit
	platform/x86: touchscreen_dmi: Add info for the CSL Panther Tab HD
	platform/x86: asus-nb-wmi: Add alternate mapping for KEY_CAMERA
	platform/x86: asus-nb-wmi: Add alternate mapping for KEY_SCREENLOCK
	platform/x86: asus-wmi: Add quirk wmi_ignore_fan
	platform/x86: asus-wmi: Ignore fan on E410MA
	platform/x86: simatic-ipc: correct name of a model
	platform/x86: simatic-ipc: add another model
	lockref: stop doing cpu_relax in the cmpxchg loop
	ata: pata_cs5535: Don't build on UML
	firmware: coreboot: Check size of table entry and use flex-array
	btrfs: zoned: enable metadata over-commit for non-ZNS setup
	Revert "selftests/bpf: check null propagation only neither reg is PTR_TO_BTF_ID"
	arm64: efi: Recover from synchronous exceptions occurring in firmware
	arm64: efi: Avoid workqueue to check whether EFI runtime is live
	arm64: efi: Account for the EFI runtime stack in stack unwinder
	Bluetooth: hci_sync: cancel cmd_timer if hci_open failed
	drm/i915: Allow panel fixed modes to have differing sync polarities
	drm/i915: Allow alternate fixed modes always for eDP
	drm/amdgpu: complete gfxoff allow signal during suspend without delay
	io_uring/msg_ring: fix remote queue to disabled ring
	wifi: mac80211: Proper mark iTXQs for resumption
	wifi: mac80211: Fix iTXQ AMPDU fragmentation handling
	sched/fair: Check if prev_cpu has highest spare cap in feec()
	sched/uclamp: Fix a uninitialized variable warnings
	vfio/type1: Respect IOMMU reserved regions in vfio_test_domain_fgsp()
	scsi: hpsa: Fix allocation size for scsi_host_alloc()
	kvm/vfio: Fix potential deadlock on vfio group_lock
	nfsd: don't free files unconditionally in __nfsd_file_cache_purge
	module: Don't wait for GOING modules
	ftrace: Export ftrace_free_filter() to modules
	tracing: Make sure trace_printk() can output as soon as it can be used
	trace_events_hist: add check for return value of 'create_hist_field'
	ftrace/scripts: Update the instructions for ftrace-bisect.sh
	cifs: Fix oops due to uncleared server->smbd_conn in reconnect
	ksmbd: add max connections parameter
	ksmbd: do not sign response to session request for guest login
	ksmbd: downgrade ndr version error message to debug
	ksmbd: limit pdu length size according to connection status
	ovl: fix tmpfile leak
	ovl: fail on invalid uid/gid mapping at copy up
	io_uring/net: cache provided buffer group value for multishot receives
	KVM: x86/vmx: Do not skip segment attributes if unusable bit is set
	KVM: arm64: GICv4.1: Fix race with doorbell on VPE activation/deactivation
	scsi: ufs: core: Fix devfreq deadlocks
	riscv: fix -Wundef warning for CONFIG_RISCV_BOOT_SPINWAIT
	thermal: intel: int340x: Protect trip temperature from concurrent updates
	regulator: dt-bindings: samsung,s2mps14: add lost samsung,ext-control-gpios
	ipv6: fix reachability confirmation with proxy_ndp
	ARM: 9280/1: mm: fix warning on phys_addr_t to void pointer assignment
	EDAC/device: Respect any driver-supplied workqueue polling value
	EDAC/qcom: Do not pass llcc_driv_data as edac_device_ctl_info's pvt_info
	platform/x86: thinkpad_acpi: Fix profile modes on Intel platforms
	drm/display/dp_mst: Correct the kref of port.
	drm/amd/pm: add missing AllowIHInterrupt message mapping for SMU13.0.0
	drm/amdgpu: remove unconditional trap enable on add gfx11 queues
	drm/amdgpu/display/mst: Fix mst_state->pbn_div and slot count assignments
	drm/amdgpu/display/mst: limit payload to be updated one by one
	drm/amdgpu/display/mst: update mst_mgr relevant variable when long HPD
	io_uring: inline io_req_task_work_add()
	io_uring: inline __io_req_complete_post()
	io_uring: hold locks for io_req_complete_failed
	io_uring: use io_req_task_complete() in timeout
	io_uring: remove io_req_tw_post_queue
	io_uring: inline __io_req_complete_put()
	net: mana: Fix IRQ name - add PCI and queue number
	io_uring: always prep_async for drain requests
	i2c: designware: use casting of u64 in clock multiplication to avoid overflow
	i2c: designware: Fix unbalanced suspended flag
	drm/drm_vma_manager: Add drm_vma_node_allow_once()
	drm/i915: Fix a memory leak with reused mmap_offset
	iavf: fix temporary deadlock and failure to set MAC address
	iavf: schedule watchdog immediately when changing primary MAC
	netlink: prevent potential spectre v1 gadgets
	net: fix UaF in netns ops registration error path
	net: fec: Use page_pool_put_full_page when freeing rx buffers
	nvme: simplify transport specific device attribute handling
	nvme: consolidate setting the tagset flags
	nvme-fc: fix initialization order
	drm/i915/selftest: fix intel_selftest_modify_policy argument types
	ACPI: video: Add backlight=native DMI quirk for HP Pavilion g6-1d80nr
	ACPI: video: Add backlight=native DMI quirk for HP EliteBook 8460p
	ACPI: video: Add backlight=native DMI quirk for Asus U46E
	netfilter: nft_set_rbtree: Switch to node list walk for overlap detection
	netfilter: nft_set_rbtree: skip elements in transaction from garbage collection
	netlink: annotate data races around nlk->portid
	netlink: annotate data races around dst_portid and dst_group
	netlink: annotate data races around sk_state
	ipv4: prevent potential spectre v1 gadget in ip_metrics_convert()
	ipv4: prevent potential spectre v1 gadget in fib_metrics_match()
	net: dsa: microchip: fix probe of I2C-connected KSZ8563
	net: ethernet: adi: adin1110: Fix multicast offloading
	netfilter: conntrack: fix vtag checks for ABORT/SHUTDOWN_COMPLETE
	netrom: Fix use-after-free of a listening socket.
	platform/x86: asus-wmi: Fix kbd_dock_devid tablet-switch reporting
	platform/x86: apple-gmux: Move port defines to apple-gmux.h
	platform/x86: apple-gmux: Add apple_gmux_detect() helper
	ACPI: video: Fix apple gmux detection
	tracing/osnoise: Use built-in RCU list checking
	net/sched: sch_taprio: do not schedule in taprio_reset()
	sctp: fail if no bound addresses can be used for a given scope
	riscv/kprobe: Fix instruction simulation of JALR
	nvme: fix passthrough csi check
	gpio: mxc: Unlock on error path in mxc_flip_edge()
	gpio: ep93xx: Fix port F hwirq numbers in handler
	net: ravb: Fix lack of register setting after system resumed for Gen3
	net: ravb: Fix possible hang if RIS2_QFF1 happen
	net: mctp: add an explicit reference from a mctp_sk_key to sock
	net: mctp: move expiry timer delete to unhash
	net: mctp: hold key reference when looking up a general key
	net: mctp: mark socks as dead on unhash, prevent re-add
	thermal: intel: int340x: Add locking to int340x_thermal_get_trip_type()
	riscv: Move call to init_cpu_topology() to later initialization stage
	net/tg3: resolve deadlock in tg3_reset_task() during EEH
	tsnep: Fix TX queue stop/wake for multiple queues
	net: mdio-mux-meson-g12a: force internal PHY off on mux switch
	Partially revert "perf/arm-cmn: Optimise DTC counter accesses"
	block: ublk: move ublk_chr_class destroying after devices are removed
	treewide: fix up files incorrectly marked executable
	tools: gpio: fix -c option of gpio-event-mon
	Fix up more non-executable files marked executable
	Revert "mm/compaction: fix set skip in fast_find_migrateblock"
	Revert "Input: synaptics - switch touchpad on HP Laptop 15-da3001TU to RMI mode"
	Input: i8042 - add Clevo PCX0DX to i8042 quirk table
	x86/sev: Add SEV-SNP guest feature negotiation support
	acpi: Fix suspend with Xen PV
	dt-bindings: riscv: fix underscore requirement for multi-letter extensions
	dt-bindings: riscv: fix single letter canonical order
	x86/i8259: Mark legacy PIC interrupts with IRQ_LEVEL
	dt-bindings: i2c: renesas,rzv2m: Fix SoC specific string
	netfilter: conntrack: unify established states for SCTP paths
	perf/x86/amd: fix potential integer overflow on shift of a int
	amdgpu: fix build on non-DCN platforms.
	Linux 6.1.9

Change-Id: I750dee519337922880b87841f6732565961c6b0a
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-02-01 09:12:59 +00:00
Chuang Wang
250cec4b26 tracing/osnoise: Use built-in RCU list checking
[ Upstream commit 685b64e4d6da4be8b4595654a57db663b3d1dfc2 ]

list_for_each_entry_rcu() has built-in RCU and lock checking.

Pass cond argument to list_for_each_entry_rcu() to silence false lockdep
warning when CONFIG_PROVE_RCU_LIST is enabled.

Execute as follows:

 [tracing]# echo osnoise > current_tracer
 [tracing]# echo 1 > tracing_on
 [tracing]# echo 0 > tracing_on

The trace_types_lock is held when osnoise_tracer_stop() or
timerlat_tracer_stop() are called in the non-RCU read side section.
So, pass lockdep_is_held(&trace_types_lock) to silence false lockdep
warning.

Link: https://lkml.kernel.org/r/20221227023036.784337-1-nashuiliang@gmail.com

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Fixes: dae181349f ("tracing/osnoise: Support a list of trace_array *tr")
Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Chuang Wang <nashuiliang@gmail.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:46 +01:00
Natalia Petrova
b4e7e81b4f trace_events_hist: add check for return value of 'create_hist_field'
commit 8b152e9150d07a885f95e1fd401fc81af202d9a4 upstream.

Function 'create_hist_field' is called recursively at
trace_events_hist.c:1954 and can return NULL, so its return value must
be checked to avoid a NULL pointer dereference.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Link: https://lkml.kernel.org/r/20230111120409.4111-1-n.petrova@fintech.ru

Cc: stable@vger.kernel.org
Fixes: 30350d65ac ("tracing: Add variable support to hist triggers")
Signed-off-by: Natalia Petrova <n.petrova@fintech.ru>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-01 08:34:37 +01:00
Steven Rostedt (Google)
198c83963f tracing: Make sure trace_printk() can output as soon as it can be used
commit 3bb06eb6e9acf7c4a3e1b5bc87aed398ff8e2253 upstream.

Currently trace_printk() can be used as soon as early_trace_init() is
called from start_kernel(). But if a crash happens, and
"ftrace_dump_on_oops" is set on the kernel command line, all you get will
be:

  [    0.456075]   <idle>-0         0dN.2. 347519us : Unknown type 6
  [    0.456075]   <idle>-0         0dN.2. 353141us : Unknown type 6
  [    0.456075]   <idle>-0         0dN.2. 358684us : Unknown type 6

This is because the trace_printk() event (type 6) hasn't been registered
yet. That gets done via an early_initcall(), which may be early, but not
early enough.

Instead of registering the trace_printk() event (and other ftrace events,
which are not trace events) via an early_initcall(), have them registered at
the same time that trace_printk() can be used. This way, if there is a
crash before early_initcall(), then the trace_printk()s will actually be
useful.

Link: https://lkml.kernel.org/r/20230104161412.019f6c55@gandalf.local.home

Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Fixes: e725c731e3 ("tracing: Split tracing initialization into two for early initialization")
Reported-by: "Joel Fernandes (Google)" <joel@joelfernandes.org>
Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-01 08:34:37 +01:00
Mark Rutland
e5ae9b5a65 ftrace: Export ftrace_free_filter() to modules
commit 8be9fbd5345da52f4a74f7f81d55ff9fa0a2958e upstream.

Setting filters on an ftrace ops results in some memory being allocated
for the filter hashes, which must be freed before the ops can be freed.
This can be done by removing every individual element of the hash by
calling ftrace_set_filter_ip() or ftrace_set_filter_ips() with `remove`
set, but this is somewhat error prone as it's easy to forget to remove
an element.

Make it easier to clean this up by exporting ftrace_free_filter(), which
can be used to clean up all of the filter hashes after an ftrace_ops has
been unregistered.

Using this, fix the ftrace-direct* samples to free hashes prior to being
unloaded. All other code either removes individual filters explicitly or
is built-in and already calls ftrace_free_filter().

Link: https://lkml.kernel.org/r/20230103124912.2948963-3-mark.rutland@arm.com

Cc: stable@vger.kernel.org
Cc: Florent Revest <revest@chromium.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Fixes: e1067a07cf ("ftrace/samples: Add module to test multi direct modify interface")
Fixes: 5fae941b9a ("ftrace/samples: Add multi direct interface test module")
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-01 08:34:37 +01:00
Petr Pavlu
14f4d81f64 module: Don't wait for GOING modules
commit 0254127ab977e70798707a7a2b757c9f3c971210 upstream.

During a system boot, it can happen that the kernel receives a burst of
requests to insert the same module but loading it eventually fails
during its init call. For instance, udev can make a request to insert
a frequency module for each individual CPU when another frequency module
is already loaded which causes the init function of the new module to
return an error.

Since commit 6e6de3dee5 ("kernel/module.c: Only return -EEXIST for
modules that have finished loading"), the kernel waits for modules in
MODULE_STATE_GOING state to finish unloading before making another
attempt to load the same module.

This creates unnecessary work in the described scenario and delays the
boot. In the worst case, it can prevent udev from loading drivers for
other devices and might cause timeouts of services waiting on them and
subsequently a failed boot.

This patch attempts a different solution for the problem 6e6de3dee5
was trying to solve. Rather than waiting for the unloading to complete,
it returns a different error code (-EBUSY) for modules in the GOING
state. This should avoid the error situation that was described in
6e6de3dee5 (user space attempting to load a dependent module because
the -EEXIST error code would suggest to user space that the first module
had been loaded successfully), while avoiding the delay situation too.
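
The error-code choice described above can be sketched in user-space C.
This is only an illustration, not the kernel code: the enum, its
constants and duplicate_load_result() are hypothetical names, and the
"wait and re-check" handling of modules that are still loading is an
assumption about the surrounding logic.

```c
#include <errno.h>

/* Hypothetical stand-ins for the kernel's module states. */
enum mod_state { MOD_LIVE, MOD_COMING, MOD_GOING, MOD_UNFORMED };

/*
 * Decide what a load request should do when a module with the same
 * name already exists. Returns 0 to mean "wait and re-check", or a
 * negative errno to report to user space.
 */
int duplicate_load_result(enum mod_state existing)
{
	switch (existing) {
	case MOD_COMING:
	case MOD_UNFORMED:
		return 0;        /* assumed: still loading, so wait; it may yet fail */
	case MOD_LIVE:
		return -EEXIST;  /* finished loading successfully */
	case MOD_GOING:
	default:
		return -EBUSY;   /* being torn down: report busy instead of waiting */
	}
}
```

With this split, user space only ever sees -EEXIST for a module that
really finished loading, and a GOING module no longer stalls the caller.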

This has been tested on linux-next since December 2022 and passes
all kmod selftests except test 0009 with module compression enabled
but it has been confirmed that this issue has existed and has gone
unnoticed since prior to this commit and can also be reproduced without
module compression with a simple usleep(5000000) on tools/modprobe.c [0].
These failures are caused by hitting the kernel mod_concurrent_max and can
happen either due to a self-inflicted kernel module auto-load DoS somehow
or on a system with a large CPU count where each CPU incorrectly triggers
many module auto-loads. Both of those issues need to be fixed in-kernel.

[0] https://lore.kernel.org/all/Y9A4fiobL6IHp%2F%2FP@bombadil.infradead.org/

Fixes: 6e6de3dee5 ("kernel/module.c: Only return -EEXIST for modules that have finished loading")
Co-developed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
Cc: stable@vger.kernel.org
Reviewed-by: Petr Mladek <pmladek@suse.com>
[mcgrof: enhance commit log with testing and kmod test result interpretation ]
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-01 08:34:37 +01:00
Qais Yousef
b811432fc5 sched/uclamp: Fix a uninitialized variable warnings
[ Upstream commit e26fd28db82899be71b4b949527373d0a6be1e65 ]

Addresses the following warnings:

> config: riscv-randconfig-m031-20221111
> compiler: riscv64-linux-gcc (GCC) 12.1.0
>
> smatch warnings:
> kernel/sched/fair.c:7263 find_energy_efficient_cpu() error: uninitialized symbol 'util_min'.
> kernel/sched/fair.c:7263 find_energy_efficient_cpu() error: uninitialized symbol 'util_max'.

Fixes: 244226035a1f ("sched/uclamp: Fix fits_capacity() check in feec()")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Qais Yousef (Google) <qyousef@layalina.io>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20230112122708.330667-2-qyousef@layalina.io
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:36 +01:00
Pierre Gondois
390eb5433e sched/fair: Check if prev_cpu has highest spare cap in feec()
[ Upstream commit ad841e569f5c88e3332b32a000f251f33ff32187 ]

When evaluating the CPU candidates in the perf domain (pd) containing
the previously used CPU (prev_cpu), find_energy_efficient_cpu()
evaluates the energy of the pd:
- without the task (base_energy)
- with the task placed on prev_cpu (if the task fits)
- with the task placed on the CPU with the highest spare capacity,
  prev_cpu being excluded from this set

If prev_cpu is already the CPU with the highest spare capacity,
max_spare_cap_cpu will be the CPU with the second highest spare
capacity.

On an Arm64 Juno-r2, with a workload of 10 tasks at a 10% duty cycle,
when prev_cpu and max_spare_cap_cpu are both valid candidates,
prev_spare_cap > max_spare_cap in ~82% of the cases.
Thus, about 82% of the time, the energy of the pd when placing the task
on max_spare_cap_cpu is computed with no possible positive outcome.

Do not consider max_spare_cap_cpu as a valid candidate if
prev_spare_cap > max_spare_cap.
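
The rule above can be sketched as a small predicate. This is a
user-space illustration only: eval_max_spare_cap_cpu() is a hypothetical
helper, and the convention that a negative CPU id means "no candidate
found" is an assumption.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: returns true when the pd energy with the task
 * placed on max_spare_cap_cpu is worth computing.
 * Assumption: a negative CPU id means no candidate was found.
 */
bool eval_max_spare_cap_cpu(int max_spare_cap_cpu,
			    unsigned long max_spare_cap,
			    unsigned long prev_spare_cap)
{
	/* Skip the candidate when prev_cpu already has more spare capacity. */
	return max_spare_cap_cpu >= 0 && !(prev_spare_cap > max_spare_cap);
}
```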

Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20221006081052.3862167-2-pierre.gondois@arm.com
Stable-dep-of: e26fd28db828 ("sched/uclamp: Fix a uninitialized variable warnings")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:36 +01:00
Max Filippov
d5cb15095a kcsan: test: don't put the expect array on the stack
[ Upstream commit 5b24ac2dfd3eb3e36f794af3aa7f2828b19035bd ]

Size of the 'expect' array in the __report_matches is 1536 bytes, which
is exactly the default frame size warning limit of the xtensa
architecture.
As a result, allmodconfig xtensa kernel builds with a gcc that does not
support compiler plugins (which would otherwise push the said
warning limit to 2K) fail with the following message:

  kernel/kcsan/kcsan_test.c:257:1: error: the frame size of 1680 bytes
    is larger than 1536 bytes

Fix it by dynamically allocating the 'expect' array.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Marco Elver <elver@google.com>
Tested-by: Marco Elver <elver@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:29 +01:00
Luis Gerhorst
b0c89ef025 bpf: Fix pointer-leak due to insufficient speculative store bypass mitigation
[ Upstream commit e4f4db47794c9f474b184ee1418f42e6a07412b6 ]

To mitigate Spectre v4, 2039f26f3a ("bpf: Fix leakage due to
insufficient speculative store bypass mitigation") inserts lfence
instructions after 1) initializing a stack slot and 2) spilling a
pointer to the stack.

However, this does not cover cases where a stack slot is first
initialized with a pointer (subject to sanitization) but then
overwritten with a scalar (not subject to sanitization because
the slot was already initialized). In this case, the second write
may be subject to speculative store bypass (SSB) creating a
speculative pointer-as-scalar type confusion. This allows the
program to subsequently leak the numerical pointer value using,
for example, a branch-based cache side channel.

To fix this, also sanitize scalars if they write a stack slot
that previously contained a pointer. Assuming that pointer-spills
are only generated by LLVM on register-pressure, the performance
impact on most real-world BPF programs should be small.

The following unprivileged BPF bytecode drafts a minimal exploit
and the mitigation:

  [...]
  // r6 = 0 or 1 (scalar, unknown user input)
  // r7 = accessible ptr for side channel
  // r10 = frame pointer (fp), to be leaked
  //
  r9 = r10 # fp alias to encourage ssb
  *(u64 *)(r9 - 8) = r10 // fp[-8] = ptr, to be leaked
  // lfence added here because of pointer spill to stack.
  //
  // Omitted: Dummy bpf_ringbuf_output() here to train alias predictor
  // for no r9-r10 dependency.
  //
  *(u64 *)(r10 - 8) = r6 // fp[-8] = scalar, overwrites ptr
  // 2039f26f3a: no lfence added because stack slot was not STACK_INVALID,
  // store may be subject to SSB
  //
  // fix: also add an lfence when the slot contained a ptr
  //
  r8 = *(u64 *)(r9 - 8)
  // r8 = architecturally a scalar, speculatively a ptr
  //
  // leak ptr using branch-based cache side channel:
  r8 &= 1 // choose bit to leak
  if r8 == 0 goto SLOW // no mispredict
  // architecturally dead code if input r6 is 0,
  // only executes speculatively iff ptr bit is 1
  r8 = *(u64 *)(r7 + 0) # encode bit in cache (0: slow, 1: fast)
SLOW:
  [...]

After running this, the program can time the access to *(r7 + 0) to
determine whether the chosen pointer bit was 0 or 1. Repeat this 64
times to recover the whole address on amd64.

In summary, sanitization can only be skipped if one scalar is
overwritten with another scalar. Scalar-confusion due to speculative
store bypass cannot lead to invalid accesses because the pointer
bounds deduced during verification are enforced using branchless
logic. See 979d63d50c ("bpf: prevent out of bounds speculation on
pointer arithmetic") for details.
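
The resulting rule can be sketched as a predicate over the previous slot
content and the value being stored. This is a user-space illustration,
not the verifier code: enum slot_content and store_needs_sanitize() are
hypothetical names.

```c
#include <stdbool.h>

/* Hypothetical model of what a stack slot held before the store. */
enum slot_content { SLOT_UNINIT, SLOT_SCALAR, SLOT_POINTER };

/*
 * Returns true when the store must be followed by a barrier.
 * Only a scalar overwriting another scalar may skip sanitization;
 * initializing a slot, spilling a pointer, and overwriting a pointer
 * with a scalar all need it.
 */
bool store_needs_sanitize(enum slot_content prev, bool storing_pointer)
{
	return !(prev == SLOT_SCALAR && !storing_pointer);
}
```

The case added by this fix is store_needs_sanitize(SLOT_POINTER, false),
which was previously skipped because the slot was already initialized.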

Do not make the mitigation depend on !env->allow_{uninit_stack,ptr_leaks}
because speculative leaks are likely unexpected if these were enabled.
For example, leaking the address to a protected log file may be acceptable
while disabling the mitigation might unintentionally leak the address
into the cached-state of a map that is accessible to unprivileged
processes.

Fixes: 2039f26f3a ("bpf: Fix leakage due to insufficient speculative store bypass mitigation")
Signed-off-by: Luis Gerhorst <gerhorst@cs.fau.de>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Henriette Hofmeier <henriette.hofmeier@rub.de>
Link: https://lore.kernel.org/bpf/edc95bad-aada-9cfc-ffe2-fa9bb206583c@cs.fau.de
Link: https://lore.kernel.org/bpf/20230109150544.41465-1-gerhorst@cs.fau.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:17 +01:00
Tonghao Zhang
084e6764dc bpf: hash map, avoid deadlock with suitable hash mask
[ Upstream commit 9f907439dc80e4a2fcfb949927b36c036468dbb3 ]

The deadlock may still occur when the map is accessed in both NMI and
non-NMI context, because in NMI we may still access the same bucket but
with a different map_locked index.

For example, on the same CPU, .max_entries = 2, we update the hash map,
with key = 4, while running bpf prog in NMI nmi_handle(), to update
hash map with key = 20, so it will have the same bucket index but have
different map_locked index.

To fix this issue, mask the hash again using the smaller of the two masks.
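
The index mismatch and the fix can be sketched in user-space C. This is
an illustration under stated assumptions: the lock count of 8 and all
function names are hypothetical stand-ins, and the hash values used in
the note below are made up (they are not the hashes of keys 4 and 20).

```c
#include <stdint.h>

#define MAP_LOCK_COUNT 8u                     /* assumed number of map_locked slots */
#define MAP_LOCK_MASK  (MAP_LOCK_COUNT - 1u)

/* Bucket index: the hash masked by the bucket count (a power of two). */
uint32_t bucket_index(uint32_t hash, uint32_t n_buckets)
{
	return hash & (n_buckets - 1u);
}

/* Before the fix: the lock slot ignores how many buckets exist. */
uint32_t lock_index_old(uint32_t hash)
{
	return hash & MAP_LOCK_MASK;
}

/* After the fix: use the smaller mask so equal buckets share a slot. */
uint32_t lock_index_new(uint32_t hash, uint32_t n_buckets)
{
	uint32_t bucket_mask = n_buckets - 1u;
	uint32_t mask = bucket_mask < MAP_LOCK_MASK ? bucket_mask : MAP_LOCK_MASK;

	return hash & mask;
}
```

With two buckets, the made-up hashes 2 and 4 both land in bucket 0. The
old scheme gives them lock slots 2 and 4, so the NMI path is not
rejected by the per-CPU map_locked check and spins on the bucket lock
already held. The new scheme maps both to slot 0.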

Fixes: 20b6cc34ea ("bpf: Avoid hashtab deadlock with map_locked")
Signed-off-by: Tonghao Zhang <tong@infragraf.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Song Liu <song@kernel.org>
Cc: Yonghong Song <yhs@fb.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Hou Tao <houtao1@huawei.com>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/r/20230111092903.92389-1-tong@infragraf.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01 08:34:09 +01:00
Ramji Jiyani
a62d07b412 ANDROID: GKI: Fix symbol list wildcard
Update target dependencies for the vendor symbol list to
generate gki_module_unprotected.h with an extra _ for
the wildcard. This makes sure that all abi_gki_aarch64_<vendor>
files are depended on, but not abi_gki_aarch64.xml.

Bug: 261722616
Bug: 232430739
Test: TH
Fixes: 13e6a16651 ("ANDROID: GKI: Header generation fix and improvements")
Change-Id: Ic414492b1fcae14d41df234e73d0fd4601e33523
Signed-off-by: Ramji Jiyani <ramjiyani@google.com>
2023-02-01 01:39:47 +00:00
Huang Yiwei
3e4fa5265c ANDROID: hung_task: Add vendor hook for hung task detect
Add a vendor hook for hung task detection, so we can decide which
threads need to be checked, avoiding false alarms. The NULL
tracehook is used to indicate that one check cycle has finished, so
additional checks can be done after each hung task check cycle.

Bug: 188684133
Change-Id: I5d7dfeb071cbfda8121134c38a458202aaa3a8c6
Signed-off-by: Huang Yiwei <quic_hyiwei@quicinc.com>
2023-01-30 11:14:35 +08:00
Vincent Donnefort
346f750327 ANDROID: KVM: arm64: RAW interface to the nVHE hyp tracing
This interface is intended to be used by userspace tools to store a raw
version of events. In that case, the kernel does not decode anything.

Bug: 229972309
Change-Id: Ib1fca21a34a308ad1361240ef598033ecab3b4ad
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
2023-01-27 09:00:12 +00:00
Vincent Donnefort
9b99000d9b ANDROID: timekeeping: Export the boot clock in snapshots
The boot clock is interesting for tracing purposes as it doesn't stop on
device suspend. Exporting it is intended to let the nVHE hypervisor for
the arm64 architecture "replicate" that clock and allow event
synchronization with the host. Replicating it requires knowing the
current slope.

Bug: 229972309
Change-Id: Iefb6ffc433dac82297401f9acdff9758cc1b6a89
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
2023-01-27 09:00:12 +00:00
Vincent Donnefort
5ecbcb61e1 ANDROID: ring-buffer: Introducing external writer support
The ring buffer is convenient: it has a page granularity and its format
is already supported by userspace tools such as trace-cmd. It is a
natural solution to store events that would come from outside the kernel
such as a hypervisor.

In that case, where a writer is external to the kernel, the latter would
only be responsible for the allocation and to read back the ring buffer.

The allocation is done with the newly introduced function which just
needs a size and a set of callbacks (notice only the overwrite mode is
supported at the moment):

  ring_buffer_alloc_ext(unsigned long size,
                        struct ring_buffer_ext_cb *cb)

The callbacks given to this allocator enable communication with the
external writer:

  (*swap_reader)(int cpu):    Ask the writer to swap the current reader
                              page with the head.

  (*update_footers)(int cpu): Ask the writer to update material in the
                              page footers.

Each page from the ring buffer has indeed a footer in which statistics
and page status can be retrieved. This allows the kernel to update its
view on the ring buffer, following a reader page swap or a footers
update.

After the trace_buffer is allocated, a helper serializes the relevant
information into a structure that can be easily sent to the external
writer:

  trace_buffer_pack(struct trace_buffer *trace_buffer,
                    struct trace_buffer_pack *pack)

The footer and pack description can be found in the newly introduced
header file include/linux/ring_buffer_ext.h.

When the kernel is writing to the ring buffer, it can quite easily wake
up the reader. That's not the case when the writer is external. A
new function allows polling for reading the ring buffer:

  ring_buffer_poke(struct trace_buffer *buffer, int cpu)

A ring-buffer allocated for an external writer will forbid any writing
(the whole design of the ring buffer mandates a single writer) and will
also prevent extending or extracting pages.

When I presented this work at the Tracing Summit, rosted@ told me he saw
some overlap with an idea he had for mapping the tracing buffers into
userspace. Together we designed a solution that would enable both
features. Development of that new design has already started, but
adopting it would impose a significant revamp of this patchset, which
would then not make it into Android14. Nothing here is technically
wrong, but sending it to LKML wouldn't make sense, as I already know
this isn't as "reusable" as the version agreed upon.

Bug: 229972309
Change-Id: Iafcc1e2683a7460c94de3db116878c303601df64
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
2023-01-27 09:00:12 +00:00
Vincent Donnefort
54c734c8ed ANDROID: ring-buffer: Expose buffer_data_page material
In preparation for allowing ring-buffer compliant pages to be written
outside of ring_buffer.c, move struct buffer_data_page and the
timestamp encoding functions to the header.

When I presented this work at the Tracing Summit, rosted@ told me he saw
some overlap with an idea he had for mapping the tracing buffers into
userspace. Together we designed a solution that would enable both
features. Development of that new design has already started, but
adopting it would impose a significant revamp of this patchset, which
would then not make it into Android14. Nothing here is technically
wrong, but sending it to LKML wouldn't make sense, as I already know
this isn't as "reusable" as the version agreed upon.

Bug: 229972309
Change-Id: Icf3329bd899a3dd91279d1bbadaf2dc4e243455c
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
2023-01-27 09:00:12 +00:00
Greg Kroah-Hartman
ee921ef7b4 Merge 6.1.8 into android14-6.1
Changes in 6.1.8
	dma-buf: fix dma_buf_export init order v2
	btrfs: fix trace event name typo for FLUSH_DELAYED_REFS
	wifi: iwlwifi: fw: skip PPAG for JF
	pNFS/filelayout: Fix coalescing test for single DS
	selftests/bpf: check null propagation only neither reg is PTR_TO_BTF_ID
	net: ethernet: marvell: octeontx2: Fix uninitialized variable warning
	tools/virtio: initialize spinlocks in vring_test.c
	vdpa/mlx5: Return error on vlan ctrl commands if not supported
	vdpa/mlx5: Avoid using reslock in event_handler
	vdpa/mlx5: Avoid overwriting CVQ iotlb
	virtio_pci: modify ENOENT to EINVAL
	vduse: Validate vq_num in vduse_validate_config()
	vdpa_sim_net: should not drop the multicast/broadcast packet
	net/ethtool/ioctl: return -EOPNOTSUPP if we have no phy stats
	r8169: move rtl_wol_enable_rx() and rtl_prepare_power_down()
	r8169: fix dmar pte write access is not set error
	bpf: keep a reference to the mm, in case the task is dead.
	RDMA/srp: Move large values to a new enum for gcc13
	selftests: net: fix cmsg_so_mark.sh test hang
	btrfs: always report error in run_one_delayed_ref()
	x86/asm: Fix an assembler warning with current binutils
	f2fs: let's avoid panic if extent_tree is not created
	perf/x86/rapl: Treat Tigerlake like Icelake
	cifs: fix race in assemble_neg_contexts()
	memblock tests: Fix compilation error.
	perf/x86/rapl: Add support for Intel Meteor Lake
	perf/x86/rapl: Add support for Intel Emerald Rapids
	of: fdt: Honor CONFIG_CMDLINE* even without /chosen node, take 2
	fbdev: omapfb: avoid stack overflow warning
	Bluetooth: hci_sync: Fix use HCI_OP_LE_READ_BUFFER_SIZE_V2
	Bluetooth: hci_qca: Fix driver shutdown on closed serdev
	wifi: brcmfmac: fix regression for Broadcom PCIe wifi devices
	wifi: mac80211: fix MLO + AP_VLAN check
	wifi: mac80211: reset multiple BSSID options in stop_ap()
	wifi: mac80211: sdata can be NULL during AMPDU start
	wifi: mac80211: fix initialization of rx->link and rx->link_sta
	nommu: fix memory leak in do_mmap() error path
	nommu: fix do_munmap() error path
	nommu: fix split_vma() map_count error
	proc: fix PIE proc-empty-vm, proc-pid-vm tests
	Add exception protection processing for vd in axi_chan_handle_err function
	LoongArch: Add HWCAP_LOONGARCH_CPUCFG to elf_hwcap
	zonefs: Detect append writes at invalid locations
	nilfs2: fix general protection fault in nilfs_btree_insert()
	mm/shmem: restore SHMEM_HUGE_DENY precedence over MADV_COLLAPSE
	hugetlb: unshare some PMDs when splitting VMAs
	mm/khugepaged: fix collapse_pte_mapped_thp() to allow anon_vma
	serial: stm32: Merge hard IRQ and threaded IRQ handling into single IRQ handler
	Revert "serial: stm32: Merge hard IRQ and threaded IRQ handling into single IRQ handler"
	xhci-pci: set the dma max_seg_size
	usb: xhci: Check endpoint is valid before dereferencing it
	xhci: Fix null pointer dereference when host dies
	xhci: Add update_hub_device override for PCI xHCI hosts
	xhci: Add a flag to disable USB3 lpm on a xhci root port level.
	usb: acpi: add helper to check port lpm capability using acpi _DSM
	xhci: Detect lpm incapable xHC USB3 roothub ports from ACPI tables
	prlimit: do_prlimit needs to have a speculation check
	USB: serial: option: add Quectel EM05-G (GR) modem
	USB: serial: option: add Quectel EM05-G (CS) modem
	USB: serial: option: add Quectel EM05-G (RS) modem
	USB: serial: option: add Quectel EC200U modem
	USB: serial: option: add Quectel EM05CN (SG) modem
	USB: serial: option: add Quectel EM05CN modem
	staging: vchiq_arm: fix enum vchiq_status return types
	USB: misc: iowarrior: fix up header size for USB_DEVICE_ID_CODEMERCS_IOW100
	usb: misc: onboard_hub: Invert driver registration order
	usb: misc: onboard_hub: Move 'attach' work to the driver
	misc: fastrpc: Fix use-after-free and race in fastrpc_map_find
	misc: fastrpc: Don't remove map on creater_process and device_release
	misc: fastrpc: Fix use-after-free race condition for maps
	usb: core: hub: disable autosuspend for TI TUSB8041
	comedi: adv_pci1760: Fix PWM instruction handling
	ACPI: PRM: Check whether EFI runtime is available
	mmc: sunxi-mmc: Fix clock refcount imbalance during unbind
	mmc: sdhci-esdhc-imx: correct the tuning start tap and step setting
	mm/hugetlb: fix PTE marker handling in hugetlb_change_protection()
	mm/hugetlb: fix uffd-wp handling for migration entries in hugetlb_change_protection()
	mm/hugetlb: pre-allocate pgtable pages for uffd wr-protects
	mm/userfaultfd: enable writenotify while userfaultfd-wp is enabled for a VMA
	mm/MADV_COLLAPSE: don't expand collapse when vm_end is past requested end
	btrfs: add extra error messages to cover non-ENOMEM errors from device_add_list()
	btrfs: fix missing error handling when logging directory items
	btrfs: fix directory logging due to race with concurrent index key deletion
	btrfs: add missing setup of log for full commit at add_conflicting_inode()
	btrfs: do not abort transaction on failure to write log tree when syncing log
	btrfs: do not abort transaction on failure to update log root
	btrfs: qgroup: do not warn on record without old_roots populated
	btrfs: fix invalid leaf access due to inline extent during lseek
	btrfs: fix race between quota rescan and disable leading to NULL pointer deref
	cifs: do not include page data when checking signature
	thunderbolt: Disable XDomain lane 1 only in software connection manager
	thunderbolt: Use correct function to calculate maximum USB3 link rate
	thunderbolt: Do not report errors if on-board retimers are found
	thunderbolt: Do not call PM runtime functions in tb_retimer_scan()
	riscv: dts: sifive: fu740: fix size of pcie 32bit memory
	bpf: restore the ebpf program ID for BPF_AUDIT_UNLOAD and PERF_BPF_EVENT_PROG_UNLOAD
	tty: serial: qcom-geni-serial: fix slab-out-of-bounds on RX FIFO buffer
	tty: fix possible null-ptr-defer in spk_ttyio_release
	pktcdvd: check for NULL returna fter calling bio_split_to_limits()
	io_uring/poll: don't reissue in case of poll race on multishot request
	mptcp: explicitly specify sock family at subflow creation time
	mptcp: netlink: respect v4/v6-only sockets
	selftests: mptcp: userspace: validate v4-v6 subflows mix
	USB: gadgetfs: Fix race between mounting and unmounting
	USB: serial: cp210x: add SCALANCE LPE-9000 device id
	usb: cdns3: remove fetched trb from cache before dequeuing
	usb: host: ehci-fsl: Fix module alias
	usb: musb: fix error return code in omap2430_probe()
	usb: typec: tcpm: Fix altmode re-registration causes sysfs create fail
	usb: typec: altmodes/displayport: Add pin assignment helper
	usb: typec: altmodes/displayport: Fix pin assignment calculation
	usb: gadget: g_webcam: Send color matching descriptor per frame
	USB: gadget: Add ID numbers to configfs-gadget driver names
	usb: gadget: f_ncm: fix potential NULL ptr deref in ncm_bitrate()
	usb-storage: apply IGNORE_UAS only for HIKSEMI MD202 on RTL9210
	arm64: dts: imx8mp: correct usb clocks
	dt-bindings: phy: g12a-usb2-phy: fix compatible string documentation
	dt-bindings: phy: g12a-usb3-pcie-phy: fix compatible string documentation
	serial: pch_uart: Pass correct sg to dma_unmap_sg()
	dmaengine: lgm: Move DT parsing after initialization
	dmaengine: tegra210-adma: fix global intr clear
	dmaengine: idxd: Let probe fail when workqueue cannot be enabled
	dmaengine: idxd: Prevent use after free on completion memory
	dmaengine: idxd: Do not call DMX TX callbacks during workqueue disable
	serial: amba-pl011: fix high priority character transmission in rs486 mode
	serial: atmel: fix incorrect baudrate setup
	serial: exar: Add support for Sealevel 7xxxC serial cards
	gsmi: fix null-deref in gsmi_get_variable
	mei: bus: fix unlink on bus in error path
	mei: me: add meteor lake point M DID
	VMCI: Use threaded irqs instead of tasklets
	ARM: dts: qcom: apq8084-ifc6540: fix overriding SDHCI
	ARM: omap1: fix !ARCH_OMAP1_ANY link failures
	drm/amdgpu: fix amdgpu_job_free_resources v2
	drm/amdgpu: allow multipipe policy on ASICs with one MEC
	drm/amdgpu: Correct the power calcultion for Renior/Cezanne.
	drm/i915: re-disable RC6p on Sandy Bridge
	drm/i915/display: Check source height is > 0
	drm/i915: Allow switching away via vga-switcheroo if uninitialized
	drm/i915: Remove unused variable
	drm/amd/display: Fix set scaling doesn's work
	drm/amd/display: Calculate output_color_space after pixel encoding adjustment
	drm/amd/display: Fix COLOR_SPACE_YCBCR2020_TYPE matrix
	drm/amd/display: disable S/G display on DCN 3.1.5
	drm/amd/display: disable S/G display on DCN 3.1.4
	cifs: reduce roundtrips on create/qinfo requests
	fs/ntfs3: Fix attr_punch_hole() null pointer derenference
	arm64: efi: Execute runtime services from a dedicated stack
	efi: rt-wrapper: Add missing include
	panic: Separate sysctl logic from CONFIG_SMP
	exit: Put an upper limit on how often we can oops
	exit: Expose "oops_count" to sysfs
	exit: Allow oops_limit to be disabled
	panic: Consolidate open-coded panic_on_warn checks
	panic: Introduce warn_limit
	panic: Expose "warn_count" to sysfs
	docs: Fix path paste-o for /sys/kernel/warn_count
	exit: Use READ_ONCE() for all oops/warn limit reads
	x86/fpu: Use _Alignof to avoid undefined behavior in TYPE_ALIGN
	drm/amdgpu/discovery: enable soc21 common for GC 11.0.4
	drm/amdgpu/discovery: enable gmc v11 for GC 11.0.4
	drm/amdgpu/discovery: enable gfx v11 for GC 11.0.4
	drm/amdgpu/discovery: enable mes support for GC v11.0.4
	drm/amdgpu: set GC 11.0.4 family
	drm/amdgpu/discovery: set the APU flag for GC 11.0.4
	drm/amdgpu: add gfx support for GC 11.0.4
	drm/amdgpu: add gmc v11 support for GC 11.0.4
	drm/amdgpu/discovery: add PSP IP v13.0.11 support
	drm/amdgpu/pm: enable swsmu for SMU IP v13.0.11
	drm/amdgpu: add smu 13 support for smu 13.0.11
	drm/amdgpu/pm: add GFXOFF control IP version check for SMU IP v13.0.11
	drm/amdgpu/soc21: add mode2 asic reset for SMU IP v13.0.11
	drm/amdgpu/pm: use the specific mailbox registers only for SMU IP v13.0.4
	drm/amdgpu/discovery: enable nbio support for NBIO v7.7.1
	drm/amdgpu: enable PSP IP v13.0.11 support
	drm/amdgpu: enable GFX IP v11.0.4 CG support
	drm/amdgpu: enable GFX Power Gating for GC IP v11.0.4
	drm/amdgpu: enable GFX Clock Gating control for GC IP v11.0.4
	drm/amdgpu: add tmz support for GC 11.0.1
	drm/amdgpu: add tmz support for GC IP v11.0.4
	drm/amdgpu: correct MEC number for gfx11 APUs
	octeontx2-pf: Avoid use of GFP_KERNEL in atomic context
	net/ulp: use consistent error code when blocking ULP
	octeontx2-pf: Fix the use of GFP_KERNEL in atomic context on rt
	net/mlx5: fix missing mutex_unlock in mlx5_fw_fatal_reporter_err_work()
	block: mq-deadline: Rename deadline_is_seq_writes()
	Revert "wifi: mac80211: fix memory leak in ieee80211_if_add()"
	soc: qcom: apr: Make qcom,protection-domain optional again
	Linux 6.1.8

Change-Id: I35d5b5a1ed4822eddb2fc8b29b323b36f7d11926
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2023-01-26 12:13:04 +00:00
Kees Cook
a18417e27e exit: Use READ_ONCE() for all oops/warn limit reads
commit 7535b832c6399b5ebfc5b53af5c51dd915ee2538 upstream.

Use a temporary variable to take full advantage of READ_ONCE() behavior.
Without this, the report (and even the test) might be out of sync with
the initial test.
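
A minimal userspace sketch of the temporary-variable pattern described above. It uses a C11 relaxed atomic load as a stand-in for the kernel's READ_ONCE(); the names demo_limit and demo_limit_hit are hypothetical and do not come from the patch.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a sysctl-backed limit that another thread may change. */
static _Atomic unsigned int demo_limit = 10000;

/*
 * Returns true when count has reached the limit. The limit is loaded exactly
 * once into a local, so the comparison and the message see the same value
 * even if demo_limit is rewritten concurrently. A limit of 0 disables the check.
 */
static bool demo_limit_hit(unsigned int count, char *msg, size_t len)
{
	unsigned int limit = atomic_load_explicit(&demo_limit, memory_order_relaxed);

	if (limit && count >= limit) {
		snprintf(msg, len, "count %u reached limit %u", count, limit);
		return true;
	}
	return false;
}
```

Reading demo_limit a second time inside snprintf() instead of reusing the local would reintroduce the mismatch the commit message warns about.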

Reported-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/lkml/Y5x7GXeluFmZ8E0E@hirez.programming.kicks-ass.net
Fixes: 9fc9e278a5c0 ("panic: Introduce warn_limit")
Fixes: d4ccd54d28d3 ("exit: Put an upper limit on how often we can oops")
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Jann Horn <jannh@google.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Marco Elver <elver@google.com>
Cc: tangmeng <tangmeng@uniontech.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-24 07:24:41 +01:00
Kees Cook
72c93f9897 panic: Expose "warn_count" to sysfs
commit 8b05aa26336113c4cea25f1c333ee8cd4fc212a6 upstream.

Since Warn count is now tracked and is a fairly interesting signal, add
the entry /sys/kernel/warn_count to expose it to userspace.

Cc: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: tangmeng <tangmeng@uniontech.com>
Cc: "Guilherme G. Piccoli" <gpiccoli@igalia.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20221117234328.594699-6-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-24 07:24:41 +01:00
Kees Cook
f53b6dda4d panic: Introduce warn_limit
commit 9fc9e278a5c0b708eeffaf47d6eb0c82aa74ed78 upstream.

Like oops_limit, add warn_limit for limiting the number of warnings when
panic_on_warn is not set.

Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: tangmeng <tangmeng@uniontech.com>
Cc: "Guilherme G. Piccoli" <gpiccoli@igalia.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: linux-doc@vger.kernel.org
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20221117234328.594699-5-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-24 07:24:41 +01:00
Kees Cook
13aa82f007 panic: Consolidate open-coded panic_on_warn checks
commit 79cc1ba7badf9e7a12af99695a557e9ce27ee967 upstream.

Several run-time checkers (KASAN, UBSAN, KFENCE, KCSAN, sched) roll
their own warnings, and each check "panic_on_warn". Consolidate this
into a single function so that future instrumentation can be added in
a single location.
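
The consolidation can be pictured with a small userspace sketch. All names here (demo_panic_on_warn, demo_check_panic_on_warn, the two demo reporters) are hypothetical, and the "panic" is reduced to a counter and a message so the code can run outside the kernel.

```c
#include <stdbool.h>
#include <stdio.h>

static bool demo_panic_on_warn;	/* stand-in for the panic_on_warn setting */
static int demo_panics;		/* how often the sketch would have panicked */

/* The one place where the flag is tested; new instrumentation goes here. */
static void demo_check_panic_on_warn(const char *origin)
{
	if (demo_panic_on_warn) {
		demo_panics++;
		fprintf(stderr, "%s: panic_on_warn set\n", origin);
	}
}

/* Two hypothetical checkers that no longer test the flag themselves. */
static void demo_kasan_report(void)
{
	demo_check_panic_on_warn("KASAN");
}

static void demo_ubsan_report(void)
{
	demo_check_panic_on_warn("UBSAN");
}
```

With this shape, a later change such as counting warnings against a limit only has to touch demo_check_panic_on_warn().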

Cc: Marco Elver <elver@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ben Segall <bsegall@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Gow <davidgow@google.com>
Cc: tangmeng <tangmeng@uniontech.com>
Cc: Jann Horn <jannh@google.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: "Guilherme G. Piccoli" <gpiccoli@igalia.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: kasan-dev@googlegroups.com
Cc: linux-mm@kvack.org
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Marco Elver <elver@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Link: https://lore.kernel.org/r/20221117234328.594699-4-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-24 07:24:41 +01:00
Kees Cook
e0738725bb exit: Allow oops_limit to be disabled
commit de92f65719cd672f4b48397540b9f9eff67eca40 upstream.

In preparation for keeping oops_limit logic in sync with warn_limit,
have oops_limit == 0 disable checking the Oops counter.

Cc: Jann Horn <jannh@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: linux-doc@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-24 07:24:41 +01:00
Kees Cook
46cacd7913 exit: Expose "oops_count" to sysfs
commit 9db89b41117024f80b38b15954017fb293133364 upstream.

Since Oops count is now tracked and is a fairly interesting signal, add
the entry /sys/kernel/oops_count to expose it to userspace.

Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Jann Horn <jannh@google.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20221117234328.594699-3-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-24 07:24:41 +01:00
Jann Horn
767997ef5d exit: Put an upper limit on how often we can oops
commit d4ccd54d28d3c8598e2354acc13e28c060961dbb upstream.

Many Linux systems are configured to not panic on oops; but allowing an
attacker to oops the system **really** often can make even bugs that look
completely unexploitable exploitable (like NULL dereferences and such) if
each crash elevates a refcount by one or a lock is taken in read mode, and
this causes a counter to eventually overflow.

The most interesting counters for this are 32 bits wide (like open-coded
refcounts that don't use refcount_t). (The ldsem reader count on 32-bit
platforms is just 16 bits, but probably nobody cares about 32-bit platforms
that much nowadays.)

So let's panic the system if the kernel is constantly oopsing.

The speed of oopsing 2^32 times probably depends on several factors, like
how long the stack trace is and which unwinder you're using; an empirically
important one is whether your console is showing a graphical environment or
a text console that oopses will be printed to.
In a quick single-threaded benchmark, it looks like oopsing in a vfork()
child with a very short stack trace only takes ~510 microseconds per run
when a graphical console is active; but switching to a text console that
oopses are printed to slows it down around 87x, to ~45 milliseconds per
run.
(Adding more threads makes this faster, but the actual oops printing
happens under &die_lock on x86, so you can maybe speed this up by a factor
of around 2 and then any further improvement gets eaten up by lock
contention.)

It looks like it would take around 8-12 days to overflow a 32-bit counter
with repeated oopsing on a multi-core X86 system running a graphical
environment; both me (in an X86 VM) and Seth (with a distro kernel on
normal hardware in a standard configuration) got numbers in that ballpark.
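
As a rough cross-check of these figures, the arithmetic can be written out as below. The helper name is hypothetical; the inputs are the per-oops timings quoted above and the assumed factor-of-2 gain from extra threads.

```c
/*
 * Days needed to oops `count` times when one oops takes `usec_per_oops`
 * microseconds and parallelism gives an overall `speedup`.
 */
static double demo_days_to_overflow(double count, double usec_per_oops,
				    double speedup)
{
	double seconds = count * usec_per_oops / 1e6 / speedup;

	return seconds / 86400.0;	/* 86400 seconds per day */
}
```

With count = 4294967296 (2^32) and 510 microseconds per oops this gives about 25.4 days single-threaded and about 12.7 days with a speedup of 2, close to the upper end of the measured ballpark. At 45 milliseconds per oops on a text console, the single-threaded figure grows to roughly 2237 days.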

12 days aren't *that* short on a desktop system, and you'd likely need much
longer on a typical server system (assuming that people don't run graphical
desktop environments on their servers), and this is a *very* noisy and
violent approach to exploiting the kernel; and it also seems to take orders
of magnitude longer on some machines, probably because stuff like EFI
pstore will slow it down a ton if that's active.

Signed-off-by: Jann Horn <jannh@google.com>
Link: https://lore.kernel.org/r/20221107201317.324457-1-jannh@google.com
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20221117234328.594699-2-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-24 07:24:41 +01:00
Kees Cook
acc767cc70 panic: Separate sysctl logic from CONFIG_SMP
commit 9360d035a579d95d1e76c471061b9065b18a0eb1 upstream.

In preparation for adding more sysctls directly in kernel/panic.c, split
CONFIG_SMP from the logic that adds sysctls.

Cc: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: tangmeng <tangmeng@uniontech.com>
Cc: "Guilherme G. Piccoli" <gpiccoli@igalia.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20221117234328.594699-1-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-24 07:24:41 +01:00
Paul Moore
8de08b0c44 bpf: restore the ebpf program ID for BPF_AUDIT_UNLOAD and PERF_BPF_EVENT_PROG_UNLOAD
commit ef01f4e25c1760920e2c94f1c232350277ace69b upstream.

When changing the ebpf program put() routines to support being called
from within IRQ context the program ID was reset to zero prior to
calling the perf event and audit UNLOAD record generators, which
resulted in problems as the ebpf program ID was bogus (always zero).
This patch addresses this problem by removing an unnecessary call to
bpf_prog_free_id() in __bpf_prog_offload_destroy() and adjusting
__bpf_prog_put() to only call bpf_prog_free_id() after audit and perf
have finished their bpf program unload tasks in
bpf_prog_put_deferred().  For the record, no one can determine, or
remember, why it was necessary to free the program ID, and remove it
from the IDR, prior to executing bpf_prog_put_deferred();
regardless, both Stanislav and Alexei agree that the approach in this
patch should be safe.

It is worth noting that when moving the bpf_prog_free_id() call, the
do_idr_lock parameter was forced to true, as the ebpf devs determined
that do_idr_lock should always be true.  The
do_idr_lock parameter will be removed in a follow-up patch, but it
was kept here to keep the patch small in an effort to ease any stable
backports.

I also modified the bpf_audit_prog() logic used to associate the
AUDIT_BPF record with other associated records, e.g. @ctx != NULL.
Instead of keying off the operation, it now keys off the execution
context, e.g. '!in_irq() && !irqs_disabled()', which is much more
appropriate and should help better connect the UNLOAD operations with
the associated audit state (other audit records).

Cc: stable@vger.kernel.org
Fixes: d809e134be ("bpf: Prepare bpf_prog_put() to be called from irq context.")
Reported-by: Burn Alting <burn.alting@iinet.net.au>
Reported-by: Jiri Olsa <olsajiri@gmail.com>
Suggested-by: Stanislav Fomichev <sdf@google.com>
Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20230106154400.74211-1-paul@paul-moore.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-24 07:24:37 +01:00
Greg Kroah-Hartman
91185568c9 prlimit: do_prlimit needs to have a speculation check
commit 739790605705ddcf18f21782b9c99ad7d53a8c11 upstream.

do_prlimit() adds the user-controlled resource value to a pointer that
will subsequently be dereferenced.  In order to help prevent this
codepath from being used as a Spectre "gadget", a barrier needs to be
added after checking the range.
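
One kernel idiom for this kind of barrier is to clamp the index with a branch-free mask right after the range check (the kernel provides array_index_nospec() for that purpose); the message above does not say which barrier the patch uses, so the following is only a userspace imitation of the idea. All demo_* names are hypothetical, and the mask computation assumes that size fits in a long and that right-shifting a negative long is an arithmetic shift, as it is with GCC and Clang.

```c
#include <stddef.h>

/* All ones when index < size, zero otherwise, computed without a branch. */
static unsigned long demo_index_mask(unsigned long index, unsigned long size)
{
	return (unsigned long)(~(long)(index | (size - 1UL - index)) >>
			       (sizeof(long) * 8 - 1));
}

/* Forces an out-of-range index to 0 so it cannot steer a speculative load. */
static unsigned long demo_index_nospec(unsigned long index, unsigned long size)
{
	return index & demo_index_mask(index, size);
}

/* Range check first, then clamp, then use the value as an array index. */
static int demo_get_limit(const long *limits, unsigned long nlimits,
			  unsigned long resource, long *out)
{
	if (resource >= nlimits)
		return -1;
	resource = demo_index_nospec(resource, nlimits);
	*out = limits[resource];
	return 0;
}
```

The clamp looks redundant after the range check, but it makes the array access depend on data rather than on a branch the CPU might mispredict.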

Reported-by: Jordy Zomer <jordyzomer@google.com>
Tested-by: Jordy Zomer <jordyzomer@google.com>
Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-24 07:24:34 +01:00
Kui-Feng Lee
6c27fc1574 bpf: keep a reference to the mm, in case the task is dead.
[ Upstream commit 7ff94f276f8ea05df82eb115225e9b26f47a3347 ]

Fix the system crash that happens when a task iterator travels through
the VMAs of tasks.

In task iterators, we used to access mm by following the pointer on
the task_struct; however, the death of a task will clear the pointer,
even though we still hold the task_struct.  That can cause an
unexpected null-pointer crash when an iterator is visiting a
task that dies during the visit.  Keeping a reference to mm on the
iterator ensures we always have a valid pointer to mm.
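
The lifetime problem can be illustrated with a single-threaded userspace sketch. The demo_* types and helpers are hypothetical, and the plain integer reference count is a simplification of what a kernel would need.

```c
#include <stdlib.h>

struct demo_mm { int refs; };
struct demo_task { struct demo_mm *mm; };
struct demo_iter { struct demo_task *task; struct demo_mm *mm; };

static struct demo_mm *demo_mm_get(struct demo_mm *mm)
{
	if (mm)
		mm->refs++;
	return mm;
}

static void demo_mm_put(struct demo_mm *mm)
{
	if (mm && --mm->refs == 0)
		free(mm);
}

/* The iterator takes its own reference instead of re-reading task->mm later. */
static void demo_iter_start(struct demo_iter *it, struct demo_task *task)
{
	it->task = task;
	it->mm = demo_mm_get(task->mm);
}

/* Task death clears the task's pointer and drops only the task's reference. */
static void demo_task_exit(struct demo_task *task)
{
	struct demo_mm *mm = task->mm;

	task->mm = NULL;
	demo_mm_put(mm);
}

static void demo_iter_stop(struct demo_iter *it)
{
	demo_mm_put(it->mm);
	it->mm = NULL;
}
```

After demo_task_exit() the task's mm pointer is NULL, yet the iterator's it->mm stays valid until demo_iter_stop() drops the last reference.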

Co-developed-by: Song Liu <song@kernel.org>
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Kui-Feng Lee <kuifeng@meta.com>
Reported-by: Nathan Slingerland <slinger@meta.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20221216221855.4122288-2-kuifeng@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-24 07:24:31 +01:00