android_kernel_asus_sm8350/kernel
Petr Pavlu 6f2e50961f tracing: Fix incomplete locking when disabling buffered events
commit 7fed14f7ac9cf5e38c693836fe4a874720141845 upstream.

The following warning appears when using buffered events:

[  203.556451] WARNING: CPU: 53 PID: 10220 at kernel/trace/ring_buffer.c:3912 ring_buffer_discard_commit+0x2eb/0x420
[...]
[  203.670690] CPU: 53 PID: 10220 Comm: stress-ng-sysin Tainted: G            E      6.7.0-rc2-default #4 56e6d0fcf5581e6e51eaaecbdaec2a2338c80f3a
[  203.670704] Hardware name: Intel Corp. GROVEPORT/GROVEPORT, BIOS GVPRCRB1.86B.0016.D04.1705030402 05/03/2017
[  203.670709] RIP: 0010:ring_buffer_discard_commit+0x2eb/0x420
[  203.735721] Code: 4c 8b 4a 50 48 8b 42 48 49 39 c1 0f 84 b3 00 00 00 49 83 e8 01 75 b1 48 8b 42 10 f0 ff 40 08 0f 0b e9 fc fe ff ff f0 ff 47 08 <0f> 0b e9 77 fd ff ff 48 8b 42 10 f0 ff 40 08 0f 0b e9 f5 fe ff ff
[  203.735734] RSP: 0018:ffffb4ae4f7b7d80 EFLAGS: 00010202
[  203.735745] RAX: 0000000000000000 RBX: ffffb4ae4f7b7de0 RCX: ffff8ac10662c000
[  203.735754] RDX: ffff8ac0c750be00 RSI: ffff8ac10662c000 RDI: ffff8ac0c004d400
[  203.781832] RBP: ffff8ac0c039cea0 R08: 0000000000000000 R09: 0000000000000000
[  203.781839] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[  203.781842] R13: ffff8ac10662c000 R14: ffff8ac0c004d400 R15: ffff8ac10662c008
[  203.781846] FS:  00007f4cd8a67740(0000) GS:ffff8ad798880000(0000) knlGS:0000000000000000
[  203.781851] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  203.781855] CR2: 0000559766a74028 CR3: 00000001804c4000 CR4: 00000000001506f0
[  203.781862] Call Trace:
[  203.781870]  <TASK>
[  203.851949]  trace_event_buffer_commit+0x1ea/0x250
[  203.851967]  trace_event_raw_event_sys_enter+0x83/0xe0
[  203.851983]  syscall_trace_enter.isra.0+0x182/0x1a0
[  203.851990]  do_syscall_64+0x3a/0xe0
[  203.852075]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
[  203.852090] RIP: 0033:0x7f4cd870fa77
[  203.982920] Code: 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 66 90 b8 89 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e9 43 0e 00 f7 d8 64 89 01 48
[  203.982932] RSP: 002b:00007fff99717dd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000089
[  203.982942] RAX: ffffffffffffffda RBX: 0000558ea1d7b6f0 RCX: 00007f4cd870fa77
[  203.982948] RDX: 0000000000000000 RSI: 00007fff99717de0 RDI: 0000558ea1d7b6f0
[  203.982957] RBP: 00007fff99717de0 R08: 00007fff997180e0 R09: 00007fff997180e0
[  203.982962] R10: 00007fff997180e0 R11: 0000000000000246 R12: 00007fff99717f40
[  204.049239] R13: 00007fff99718590 R14: 0000558e9f2127a8 R15: 00007fff997180b0
[  204.049256]  </TASK>

For instance, it can be triggered by running these two commands in
parallel:

 $ while true; do
    echo hist:key=id.syscall:val=hitcount > \
      /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger;
  done
 $ stress-ng --sysinfo $(nproc)

The warning indicates that the current ring_buffer_per_cpu is not in the
committing state. It happens because the active ring_buffer_event
doesn't actually come from the ring_buffer_per_cpu but is allocated from
trace_buffered_event.

The bug is in function trace_buffered_event_disable() where the
following normally happens:

* The code invokes disable_trace_buffered_event() via
  smp_call_function_many() and follows it by synchronize_rcu(). This
  increments the per-CPU variable trace_buffered_event_cnt on each
  target CPU and grants trace_buffered_event_disable() the exclusive
  access to the per-CPU variable trace_buffered_event.

* Maintenance is performed on trace_buffered_event, all per-CPU event
  buffers get freed.

* The code invokes enable_trace_buffered_event() via
  smp_call_function_many(). This decrements trace_buffered_event_cnt and
  releases the access to trace_buffered_event.

A problem is that smp_call_function_many() runs a given function on all
target CPUs except on the current one. The following can then occur:

* Task X executing trace_buffered_event_disable() runs on CPU 0.

* The control reaches synchronize_rcu() and the task gets rescheduled on
  another CPU 1.

* The RCU synchronization finishes. At this point,
  trace_buffered_event_disable() has the exclusive access to all
  trace_buffered_event variables except trace_buffered_event[CPU0]
  because trace_buffered_event_cnt[CPU0] is never incremented and if the
  buffer is currently unused, remains set to 0.

* A different task Y is scheduled on CPU 0 and hits a trace event. The
  code in trace_event_buffer_lock_reserve() sees that
  trace_buffered_event_cnt[CPU0] is set to 0 and decides the use the
  buffer provided by trace_buffered_event[CPU0].

* Task X continues its execution in trace_buffered_event_disable(). The
  code incorrectly frees the event buffer pointed by
  trace_buffered_event[CPU0] and resets the variable to NULL.

* Task Y writes event data to the now freed buffer and later detects the
  created inconsistency.

The issue is observable since commit dea499781a11 ("tracing: Fix warning
in trace_buffered_event_disable()") which moved the call of
trace_buffered_event_disable() in __ftrace_event_enable_disable()
earlier, prior to invoking call->class->reg(.. TRACE_REG_UNREGISTER ..).
The underlying problem in trace_buffered_event_disable() is however
present since the original implementation in commit 0fc1b09ff1
("tracing: Use temp buffer when filtering events").

Fix the problem by replacing the two smp_call_function_many() calls with
on_each_cpu_mask() which invokes a given callback on all CPUs.

Link: https://lore.kernel.org/all/20231127151248.7232-2-petr.pavlu@suse.com/
Link: https://lkml.kernel.org/r/20231205161736.19663-2-petr.pavlu@suse.com

Cc: stable@vger.kernel.org
Fixes: 0fc1b09ff1 ("tracing: Use temp buffer when filtering events")
Fixes: dea499781a11 ("tracing: Fix warning in trace_buffered_event_disable()")
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-13 18:18:14 +01:00
..
bpf bpf: Fix precision tracking for BPF_ALU | BPF_TO_BE | BPF_END 2023-11-28 16:50:18 +00:00
cgroup cgroup: Remove duplicates in cgroup v1 tasks file 2023-10-25 11:53:19 +02:00
configs
debug kgdb: Flush console before entering kgdb on panic 2023-11-28 16:50:16 +00:00
dma treewide: Remove uninitialized_var() usage 2023-06-09 10:29:01 +02:00
events perf/core: Bail out early if the request AUX area is out of bound 2023-11-28 16:50:13 +00:00
gcov gcov: add support for checksum field 2023-01-18 11:41:42 +01:00
irq genirq/generic_chip: Make irq_remove_generic_chip() irqdomain aware 2023-11-28 16:50:18 +00:00
livepatch livepatch: fix race between fork and KLP transition 2022-10-26 13:22:18 +02:00
locking locking/ww_mutex/test: Fix potential workqueue corruption 2023-11-28 16:50:13 +00:00
power PM: hibernate: Clean up sync_read handling in snapshot_write_next() 2023-11-28 16:50:19 +00:00
printk printk: fix return value of printk.devkmsg __setup handler 2022-04-15 14:18:08 +02:00
rcu rcu: Suppress smp_processor_id() complaint in synchronize_rcu_expedited_wait() 2023-03-11 16:43:54 +01:00
sched sched/fair: Don't balance task to its current running CPU 2023-07-27 08:37:42 +02:00
time hrtimers: Push pending hrtimers away from outgoing CPU earlier 2023-12-13 18:18:09 +01:00
trace tracing: Fix incomplete locking when disabling buffered events 2023-12-13 18:18:14 +01:00
.gitignore kbuild: update config_data.gz only when the content of .config is changed 2021-05-11 14:04:16 +02:00
acct.c acct: fix potential integer overflow in encode_comp_t() 2023-01-18 11:41:34 +01:00
async.c treewide: Remove uninitialized_var() usage 2023-06-09 10:29:01 +02:00
audit_fsnotify.c audit: fix potential double free on error path from fsnotify_add_inode_mark 2022-09-05 10:27:38 +02:00
audit_tree.c audit: move put_tree() to avoid trim_trees refcount underflow and UAF 2021-09-03 10:08:16 +02:00
audit_watch.c audit: don't WARN_ON_ONCE(!current->mm) in audit_exe_compare() 2023-11-28 16:50:18 +00:00
audit.c treewide: Remove uninitialized_var() usage 2023-06-09 10:29:01 +02:00
audit.h audit: log AUDIT_TIME_* records only from rules 2022-04-15 14:18:04 +02:00
auditfilter.c audit: fix a net reference leak in audit_list_rules_send() 2020-06-22 09:30:59 +02:00
auditsc.c audit: fix possible soft lockup in __audit_inode_child() 2023-09-23 10:59:46 +02:00
backtracetest.c treewide: Replace DECLARE_TASKLET() with DECLARE_TASKLET_OLD() 2023-04-20 12:07:32 +02:00
bounds.c
capability.c
compat.c sched_getaffinity: don't assume 'cpumask_size()' is fully initialized 2023-04-05 11:16:42 +02:00
configs.c kernel/configs: Replace GPL boilerplate code with SPDX identifier 2019-07-30 18:34:15 +02:00
context_tracking.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
cpu_pm.c kernel/cpu_pm: Fix uninitted local in cpu_pm 2020-06-22 09:31:22 +02:00
cpu.c hrtimers: Push pending hrtimers away from outgoing CPU earlier 2023-12-13 18:18:09 +01:00
crash_core.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 230 2019-06-19 17:09:06 +02:00
crash_dump.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
cred.c keys: Fix request_key() cache 2020-01-17 19:48:42 +01:00
delayacct.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 25 2019-05-21 11:52:39 +02:00
dma.c
exec_domain.c
exit.c treewide: Remove uninitialized_var() usage 2023-06-09 10:29:01 +02:00
extable.c kernel/extable.c: use address-of operator on section symbols 2023-06-09 10:29:01 +02:00
fail_function.c kernel/fail_function: fix memory leak with using debugfs_lookup() 2023-03-11 16:44:15 +01:00
fork.c kernel/fork: beware of __put_task_struct() calling context 2023-09-23 11:00:03 +02:00
freezer.c Revert "libata, freezer: avoid block device removal while system is frozen" 2019-10-06 09:11:37 -06:00
futex.c treewide: Remove uninitialized_var() usage 2023-06-09 10:29:01 +02:00
gen_kheaders.sh kbuild: add variables for compression tools 2020-09-03 11:27:10 +02:00
groups.c
hung_task.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
iomem.c mm/nvdimm: add is_ioremap_addr and use that to check ioremap address 2019-07-12 11:05:40 -07:00
irq_work.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
jump_label.c jump_label: Don't warn on __exit jump entries 2019-08-29 15:10:10 +01:00
kallsyms.c kallsyms: Refactor kallsyms_show_value() to take cred 2020-07-16 08:16:44 +02:00
kcmp.c exec: Transform exec_update_mutex into a rw_semaphore 2021-01-09 13:44:55 +01:00
Kconfig.freezer treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
Kconfig.hz treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
Kconfig.locks treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
Kconfig.preempt sched/rt, Kconfig: Unbreak def/oldconfig with CONFIG_PREEMPT=y 2019-07-22 18:05:11 +02:00
kcov.c
kexec_core.c kexec: fix a memory leak in crash_shrink_memory() 2023-07-27 08:37:10 +02:00
kexec_elf.c kexec_elf: support 32 bit ELF files 2019-09-06 23:58:44 +02:00
kexec_file.c kexec: support purgatories with .text.hot sections 2023-06-21 15:44:10 +02:00
kexec_internal.h
kexec.c kexec_load: Disable at runtime if the kernel is locked down 2019-08-19 21:54:15 -07:00
kheaders.c kheaders: Use array declaration instead of char 2023-05-17 11:35:33 +02:00
kmod.c kmod: make request_module() return an error when autoloading is disabled 2020-04-17 10:50:22 +02:00
kprobes.c x86/kprobes: Fix arch_check_optimized_kprobe check within optimized_kprobe range 2023-03-11 16:44:02 +01:00
ksysfs.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 170 2019-05-30 11:26:39 -07:00
kthread.c kthread: Fix PF_KTHREAD vs to_kthread() race 2021-09-12 08:56:39 +02:00
latencytop.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441 2019-06-05 17:37:17 +02:00
Makefile kbuild: update config_data.gz only when the content of .config is changed 2021-05-11 14:04:16 +02:00
module_signature.c module: harden ELF info handling 2021-04-07 14:47:38 +02:00
module_signing.c module: harden ELF info handling 2021-04-07 14:47:38 +02:00
module-internal.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 36 2019-05-24 17:27:11 +02:00
module.c modules: only allow symbol_get of EXPORT_SYMBOL_GPL modules 2023-09-23 10:59:36 +02:00
notifier.c kernel/notifier.c: intercept duplicate registrations to avoid infinite loops 2020-10-01 13:17:23 +02:00
nsproxy.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441 2019-06-05 17:37:17 +02:00
padata.c crypto: pcrypt - Fix hungtask for PADATA_RESET 2023-11-28 16:50:14 +00:00
panic.c exit: Use READ_ONCE() for all oops/warn limit reads 2023-02-06 07:52:50 +01:00
params.c lockdown: Lock down module params that specify hardware parameters (eg. ioport) 2019-08-19 21:54:16 -07:00
pid_namespace.c memcg: enable accounting for pids in nested pid namespaces 2021-09-22 12:26:37 +02:00
pid.c kernel/pid.c: convert struct pid count to refcount_t 2019-07-16 19:23:24 -07:00
profile.c profiling: fix shift too large makes kernel panic 2022-08-25 11:18:02 +02:00
ptrace.c ptrace: Reimplement PTRACE_KILL by always sending SIGKILL 2022-06-14 18:11:24 +02:00
range.c
reboot.c kernel/reboot: emergency_restart: Set correct system_state 2023-11-28 16:50:19 +00:00
relay.c relayfs: fix out-of-bounds access in relay_file_read 2023-05-17 11:35:58 +02:00
resource.c /dev/mem: Revoke mappings when a driver claims the region 2020-06-24 17:50:35 +02:00
rseq.c signal: Remove task parameter from force_sig 2019-05-27 09:36:28 -05:00
seccomp.c seccomp: Invalidate seccomp mode to catch death failures 2022-02-16 12:52:53 +01:00
signal.c signal handling: don't use BUG_ON() for debugging 2022-07-21 20:59:27 +02:00
smp.c smp: Fix offline cpu check in flush_smp_call_function_queue() 2022-04-20 09:19:39 +02:00
smpboot.c kthread: Extract KTHREAD_IS_PER_CPU 2021-02-07 15:35:49 +01:00
smpboot.h
softirq.c Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-07-08 11:01:13 -07:00
stackleak.c
stacktrace.c stacktrace: Don't skip first entry on noncurrent tasks 2019-11-04 21:19:25 +01:00
stop_machine.c stop_machine: Avoid potential race behaviour 2019-10-17 12:47:12 +02:00
sys_ni.c kernel/sys_ni: add compat entry for fadvise64_64 2022-09-05 10:27:38 +02:00
sys.c prlimit: do_prlimit needs to have a speculation check 2023-01-24 07:17:59 +01:00
sysctl_binary.c
sysctl-test.c kernel/sysctl-test: Add null pointer test for sysctl.c:proc_dointvec() 2020-10-01 13:17:10 +02:00
sysctl.c mm: allow a controlled amount of unfairness in the page lock 2023-08-30 16:27:26 +02:00
task_work.c
taskstats.c taskstats: fix data-race 2020-01-09 10:19:54 +01:00
test_kprobes.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 25 2019-05-21 11:52:39 +02:00
torture.c torture: Remove exporting of internal functions 2019-08-01 14:30:22 -07:00
tracepoint.c tracepoint: Add tracepoint_probe_register_may_exist() for BPF tracing 2021-07-14 16:53:08 +02:00
tsacct.c taskstats: Cleanup the use of task->exit_code 2022-02-23 11:59:57 +01:00
ucount.c proc/sysctl: add shared variables for range check 2019-07-18 17:08:07 -07:00
uid16.c
uid16.h
umh.c usermodehelper: reset umask to default before executing user process 2020-10-14 10:32:58 +02:00
up.c smp: Fix smp_call_function_single_async prototype 2021-05-14 09:44:33 +02:00
user_namespace.c Keyrings namespacing 2019-07-08 19:36:47 -07:00
user-return-notifier.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
user.c Keyrings namespacing 2019-07-08 19:36:47 -07:00
utsname_sysctl.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441 2019-06-05 17:37:17 +02:00
utsname.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441 2019-06-05 17:37:17 +02:00
watchdog_hld.c watchdog/perf: more properly prevent false positives with turbo modes 2023-07-27 08:37:10 +02:00
watchdog.c watchdog: export lockup_detector_reconfigure 2022-08-25 11:18:37 +02:00
workqueue_internal.h
workqueue.c workqueue: Override implicit ordered attribute in workqueue_apply_unbound_cpumask() 2023-10-25 11:53:18 +02:00