TP_ARGS is not used anywhere in trace.h nor trace_entries.h
Firstly, I left just #undef TP_ARGS and had no errors - remove it.
Link: http://lkml.kernel.org/r/1446576560-14085-1-git-send-email-0x7f454c46@gmail.com
Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Now that max_stack_lock is a global variable, it requires a naming
convention that is unlikely to collide. Rename it to the same naming
convention that the other stack_trace variables have.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
A stack frame may be used in a different way depending on cpu architecture.
Thus it is not always appropriate to slurp the stack contents, as current
check_stack() does, in order to calcurate a stack index (height) at a given
function call. At least not on arm64.
In addition, there is a possibility that we will mistakenly detect a stale
stack frame which has not been overwritten.
This patch makes check_stack() a weak function so as to later implement
arch-specific version.
Link: http://lkml.kernel.org/r/1446182741-31019-5-git-send-email-takahiro.akashi@linaro.org
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
The verbose() printer dumps the verifier state to user space, so let gcc
take care to check calls to verbose() for (future) errors. make with W=1
correctly suggests: function might be possible candidate for 'gnu_printf'
format attribute [-Wsuggest-attribute=format].
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This work adds support for "persistent" eBPF maps/programs. The term
"persistent" is to be understood that maps/programs have a facility
that lets them survive process termination. This is desired by various
eBPF subsystem users.
Just to name one example: tc classifier/action. Whenever tc parses
the ELF object, extracts and loads maps/progs into the kernel, these
file descriptors will be out of reach after the tc instance exits.
So a subsequent tc invocation won't be able to access/relocate on this
resource, and therefore maps cannot easily be shared, f.e. between the
ingress and egress networking data path.
The current workaround is that Unix domain sockets (UDS) need to be
instrumented in order to pass the created eBPF map/program file
descriptors to a third party management daemon through UDS' socket
passing facility. This makes it a bit complicated to deploy shared
eBPF maps or programs (programs f.e. for tail calls) among various
processes.
We've been brainstorming on how we could tackle this issue and various
approches have been tried out so far, which can be read up further in
the below reference.
The architecture we eventually ended up with is a minimal file system
that can hold map/prog objects. The file system is a per mount namespace
singleton, and the default mount point is /sys/fs/bpf/. Any subsequent
mounts within a given namespace will point to the same instance. The
file system allows for creating a user-defined directory structure.
The objects for maps/progs are created/fetched through bpf(2) with
two new commands (BPF_OBJ_PIN/BPF_OBJ_GET). I.e. a bpf file descriptor
along with a pathname is being passed to bpf(2) that in turn creates
(we call it eBPF object pinning) the file system nodes. Only the pathname
is being passed to bpf(2) for getting a new BPF file descriptor to an
existing node. The user can use that to access maps and progs later on,
through bpf(2). Removal of file system nodes is being managed through
normal VFS functions such as unlink(2), etc. The file system code is
kept to a very minimum and can be further extended later on.
The next step I'm working on is to add dump eBPF map/prog commands
to bpf(2), so that a specification from a given file descriptor can
be retrieved. This can be used by things like CRIU but also applications
can inspect the meta data after calling BPF_OBJ_GET.
Big thanks also to Alexei and Hannes who significantly contributed
in the design discussion that eventually let us end up with this
architecture here.
Reference: https://lkml.org/lkml/2015/10/15/925
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We currently have duplicated cleanup code in bpf_prog_put() and
bpf_prog_put_rcu() cleanup paths. Back then we decided that it was
not worth it to make it a common helper called by both, but with
the recent addition of resource charging, we could have avoided
the fix in commit ac00737f4e81 ("bpf: Need to call bpf_prog_uncharge_memlock
from bpf_prog_put") if we would have had only a single, common path.
We can simplify it further by assigning aux->prog only once during
allocation time.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a bpf_map_get() function that we're going to use later on and
align/clean the remaining helpers a bit so that we have them a bit
more consistent:
- __bpf_map_get() and __bpf_prog_get() that both work on the fd
struct, check whether the descriptor is eBPF and return the
pointer to the map/prog stored in the private data.
Also, we can return f.file->private_data directly, the function
signature is enough of a documentation already.
- bpf_map_get() and bpf_prog_get() that both work on u32 user fd,
call their respective __bpf_map_get()/__bpf_prog_get() variants,
and take a reference.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since we're going to use anon_inode_getfd() invocations in more than just
the current places, make a helper function for both, so that we only need
to pass a map/prog pointer to the helper itself in order to get a fd. The
new helpers are called bpf_map_new_fd() and bpf_prog_new_fd().
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make ftrace_event_is_function() return bool to improve readability
due to this particular function only using either one or zero as its
return value.
No functional change.
Link: http://lkml.kernel.org/r/1443537816-5788-9-git-send-email-bywxiaobai@163.com
Signed-off-by: Yaowei Bai <bywxiaobai@163.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Make is_legal_op() return bool to improve readability due to this particular
function only using either one or zero as its return value.
No functional change.
Link: http://lkml.kernel.org/r/1443537816-5788-8-git-send-email-bywxiaobai@163.com
Signed-off-by: Yaowei Bai <bywxiaobai@163.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Make rb_event_is_commit() return bool to improve readability
due to this particular function only using either one or zero as its
return value.
No functional change.
Link: http://lkml.kernel.org/r/1443537816-5788-7-git-send-email-bywxiaobai@163.com
Signed-off-by: Yaowei Bai <bywxiaobai@163.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Make rb_is_reader_page() return bool to improve readability due to this
particular function only using either true or false as its return value.
No functional change.
Link: http://lkml.kernel.org/r/1443537816-5788-4-git-send-email-bywxiaobai@163.com
Signed-off-by: Yaowei Bai <bywxiaobai@163.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
This patch makes report_latency return bool due to this
particular function only using either one or zero as its
return value.
No functional change.
Link: http://lkml.kernel.org/r/1443537816-5788-3-git-send-email-bywxiaobai@163.com
Signed-off-by: Yaowei Bai <bywxiaobai@163.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
This patch makes report_latency return bool to improve readability,
indicating whether this new latency should be reported/recorded.
No functional change.
Link: http://lkml.kernel.org/r/1443537816-5788-2-git-send-email-bywxiaobai@163.com
Signed-off-by: Yaowei Bai <bywxiaobai@163.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Update instancd_rmdir to use tracefs_remove_recursive instead of
debugfs_remove_recursive.This was left in the transition from debugfs
to tracefs.
Link: http://lkml.kernel.org/r/1445169490-18315-2-git-send-email-hello.wjx@gmail.com
Cc: stable@vger.kernel.org # 4.1+
Fixes: 8434dc9340cd2 ("tracing: Convert the tracing facility over to use tracefs")
Signed-off-by: Jiaxing Wang <hello.wjx@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
There's no need to record the time tracepoints take when tracing is off.
This is because:
1) We cannot see these records since ring_buffer record is off at that
moment.
2) If tracing is off and benchmark tracepoint is enabled, the time
tracepoint takes is fewer than the same situation when tracing is on,
since the tracepoints need to be wrote into ring_buffer, it would
take more time. If turn on tracing at this moment, the average and
standard deviation cannot exactly present the time that tracepoints
take to write data into ring_buffer.
Link: http://lkml.kernel.org/r/1445947933-27955-1-git-send-email-zhang.chunyan@linaro.org
Signed-off-by: Chunyan Zhang <zhang.chunyan@linaro.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
For the case where pids are already in set_event_pid, and one is added or
removed then each CPU should be checked to make sure that the new or old pid
is on or not on a CPU.
For example:
# echo 123 >> set_event_pid
or
# echo '!123' >> set_event_pid
Link: http://lkml.kernel.org/r/20151030061643.GA19480@cac
Suggested-by: Jiaxing Wang <hello.wjx@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* pm-sleep:
PM / hibernate: fix a comment typo
input: i8042: Avoid resetting controller on system suspend/resume
PM / PCI / ACPI: Kick devices that might have been reset by firmware
PM / sleep: Add flags to indicate platform firmware involvement
PM / sleep: Drop pm_request_idle() from pm_generic_complete()
PCI / PM: Avoid resuming more devices during system suspend
PM / wakeup: wakeup_source_create: use kstrdup_const
PM / sleep: Report interrupt that caused system wakeup
Pull memremap fix from Dan Williams:
"The new memremap() api introduced in the 4.3 cycle to unify/replace
ioremap_cache() and ioremap_wt() is mishandling the highmem case.
This patch has received a build success notification from a
0day-kbuild-robot run and has received an ack from Ard"
From the commit message:
"The impact of this bug is low for now since the pmem driver is the
only user of memremap(), but this is important to fix before more
conversions to memremap arrive in 4.4"
* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
memremap: fix highmem support
This is really about simplifying the double xchg patterns into
a single cmpxchg, with the same logic. Other than the immediate
cleanup, there are some subtleties this change deals with:
(i) While the load of the old bt is fully ordered wrt everything,
ie:
old_bt = xchg(&q->blk_trace, bt); [barrier]
if (old_bt)
(void) xchg(&q->blk_trace, old_bt); [barrier]
blk_trace could still be changed between the xchg and the old_bt
load. Note that this description is merely theoretical and afaict
very small, but doing everything in a single context with cmpxchg
closes this potential race.
(ii) Ordering guarantees are obviously kept with cmpxchg.
(iii) Gets rid of the hacky-by-nature (void)xchg pattern.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
eviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
css_task_iter_next() checked @it->cur_task before grabbing
css_set_lock and assumed that the result won't change afterwards;
however, tasks could leave the cgroup being iterated terminating the
iterator before css_task_lock is acquired. If this happens,
css_task_iter_next() tries to calculate the current task from NULL
cg_list pointer leading to the following oops.
BUG: unable to handle kernel paging request at fffffffffffff7d0
IP: [<ffffffff810d5f22>] css_task_iter_next+0x42/0x80
...
CPU: 4 PID: 6391 Comm: JobQDisp2 Not tainted 4.0.9-22_fbk4_rc3_81616_ge8d9cb6 #1
Hardware name: Quanta Freedom/Winterfell, BIOS F03_3B08 03/04/2014
task: ffff880868e46400 ti: ffff88083404c000 task.ti: ffff88083404c000
RIP: 0010:[<ffffffff810d5f22>] [<ffffffff810d5f22>] css_task_iter_next+0x42/0x80
RSP: 0018:ffff88083404fd28 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88083404fd68 RCX: ffff8804697fb8b0
RDX: fffffffffffff7c0 RSI: ffff8803b7dff800 RDI: ffffffff822c0278
RBP: ffff88083404fd38 R08: 0000000000017160 R09: ffff88046f4070c0
R10: ffffffff810d61f7 R11: 0000000000000293 R12: ffff880863bf8400
R13: ffff88046b87fd80 R14: 0000000000000000 R15: ffff88083404fe58
FS: 00007fa0567e2700(0000) GS:ffff88046f900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffffffffffff7d0 CR3: 0000000469568000 CR4: 00000000001406e0
Stack:
0000000000000246 0000000000000000 ffff88083404fde8 ffffffff810d6248
ffff88083404fd68 0000000000000000 ffff8803b7dff800 000001ef000001ee
0000000000000000 0000000000000000 ffff880863bf8568 0000000000000000
Call Trace:
[<ffffffff810d6248>] cgroup_pidlist_start+0x258/0x550
[<ffffffff810cf66d>] cgroup_seqfile_start+0x1d/0x20
[<ffffffff8121f8ef>] kernfs_seq_start+0x5f/0xa0
[<ffffffff811cab76>] seq_read+0x166/0x380
[<ffffffff812200fd>] kernfs_fop_read+0x11d/0x180
[<ffffffff811a7398>] __vfs_read+0x18/0x50
[<ffffffff811a745d>] vfs_read+0x8d/0x150
[<ffffffff811a756f>] SyS_read+0x4f/0xb0
[<ffffffff818d4772>] system_call_fastpath+0x12/0x17
Fix it by moving the termination condition check inside css_set_lock.
@it->cur_task is now cleared after being put and @it->task_pos is
tested for termination instead of @it->cset_pos as they indicate the
same condition and @it->task_pos is what's being dereferenced.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Calvin Owens <calvinowens@fb.com>
Fixes: ed27b9f7a17d ("cgroup: don't hold css_set_rwsem across css task iteration")
Acked-by: Zefan Li <lizefan@huawei.com>
This patch adds support for dumping a process' (classic BPF) seccomp
filters via ptrace.
PTRACE_SECCOMP_GET_FILTER allows the tracer to dump the user's classic BPF
seccomp filters. addr should be an integer which represents the ith seccomp
filter (0 is the most recently installed filter). data should be a struct
sock_filter * with enough room for the ith filter, or NULL, in which case
the filter is not saved. The return value for this command is the number of
BPF instructions the program represents, or negative in the case of errors.
Command specific errors are ENOENT: which indicates that there is no ith
filter in this seccomp tree, and EMEDIUMTYPE, which indicates that the ith
filter was not installed as a classic BPF filter.
A caveat with this approach is that there is no way to get explicitly at
the heirarchy of seccomp filters, and users need to memcmp() filters to
decide which are inherited. This means that a task which installs two of
the same filter can potentially confuse users of this interface.
v2: * make save_orig const
* check that the orig_prog exists (not necessary right now, but when
grows eBPF support it will be)
* s/n/filter_off and make it an unsigned long to match ptrace
* count "down" the tree instead of "up" when passing a filter offset
v3: * don't take the current task's lock for inspecting its seccomp mode
* use a 0x42** constant for the ptrace command value
v4: * don't copy to userspace while holding spinlocks
v5: * add another condition to WARN_ON
v6: * rebase on net-next
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Kees Cook <keescook@chromium.org>
CC: Will Drewry <wad@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
CC: Andy Lutomirski <luto@amacapital.net>
CC: Pavel Emelyanov <xemul@parallels.com>
CC: Serge E. Hallyn <serge.hallyn@ubuntu.com>
CC: Alexei Starovoitov <ast@kernel.org>
CC: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
someone finally caught it thanks to Peter Z's additional checks.
Cheers,
Rusty.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWLfFDAAoJENkgDmzRrbjxhFQP/jK2tSn8I2kDsd2sNZelXor+
uQGKBpws8c2qY/MXru7HwlAX4L6Qlk6hCm0FyI9ZTqPBfR9sxjqFzOzWjcwFYrHa
EXPPzJcGBR1voK/n/e+J/xHFQgXuIVKiOtmz1PZOp8Hs4zO3SftT4+MuTCpuaC0x
79P0crJ4lV8FKFc+5pjWXceWRIG5XQuIbmDjNLOycuiNad7mtrEnR4xbMQBxD1Hf
P1xT4pKckwVcSvRNw8St25ov65N/l/CPlDtYXov4eay4/Jd4wTFRSipe9v70tOkL
vUGFGwa76ZM9X9y1Q85l2oQEkKMffSl82JEXyDrGPITwlYX0qRGOGfXPL8ihgcFi
Gd3abBdI68HQidkwPNBJmMdSEX8q3ygzFeFOLYNdmeklhqnKRZHvDvbok1poBnSf
EJbE6Cv7eKY9e74lu60NAf76XB48oL/BvEfdW1l4p/8GGxc0EYc6XGjjeHLsQffT
7yMRzEp99nX4JQoSjeDRroLbCCAzzoAcogJnlrnYFywzfVkyWUU3sPpbMFxvRDSW
Xuk/8Ii15AYwXnbQOpojtIJAi9f6F8CzIbtsO5J9cFju8lFUzUYpwC0Lm/pPoxQT
PqgDqVvS3wBOaL7iEA1Sy615qUCxu6eIKK6gqnFs1hK9sTg/aSUF+Oz668rQ/a/E
2XvWiB2C9NErM5IHtQdh
=8AgP
-----END PGP SIGNATURE-----
Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull module preemption fix from Rusty Russell:
"Turns out we should have always been disabling preemption here;
someone finally caught it thanks to Peter Z's additional checks"
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
module: Fix locking in symbol_put_addr()
exported perf symbols are GPL only, mark eBPF helper functions
used in tracing as GPL only as well.
Suggested-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix safety checks for bpf_perf_event_read():
- only non-inherited events can be added to perf_event_array map
(do this check statically at map insertion time)
- dynamically check that event is local and !pmu->count
Otherwise buggy bpf program can cause kernel splat.
Also fix error path after perf_event_attrs()
and remove redundant 'extern'.
Fixes: 35578d798400 ("bpf: Implement function bpf_perf_event_read() that get the selected hardware PMU conuter")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently memremap checks if the range is "System RAM" and returns the
kernel linear address. This is broken for highmem platforms where a
range may be "System RAM", but is not part of the kernel linear mapping.
Fallback to ioremap_cache() in these cases, to let the arch code attempt
to handle it.
Note that ARM ioremap will WARN when attempting to remap ram, and in
that case the caller needs to be fixed. For this reason, existing
ioremap_cache() usages for ARM are already trained to avoid attempts to
remap ram.
The impact of this bug is low for now since the pmem driver is the only
user of memremap(), but this is important to fix before more conversions
to memremap arrive in 4.4.
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
p_start() and p_stop() are seq_file functions that match. Teach sparse to
know that rcu_read_lock_sched() that is taken by p_start() is released by
p_stop.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
My tests found that if a task is running but not filtered when set_event_pid
is modified, then it can still be traced.
Call on_each_cpu() to check if the current running task should be filtered
and update the per cpu flags of tr->data appropriately.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Add the necessary hooks to use the pids loaded in set_event_pid to filter
all the events enabled in the tracing instance that match the pids listed.
Two probes are added to both sched_switch and sched_wakeup tracepoints to be
called before other probes are called and after the other probes are called.
The first is used to set the necessary flags to let the probes know to test
if they should be traced or not.
The sched_switch pre probe will set the "ignore_pid" flag if neither the
previous or next task has a matching pid.
The sched_switch probe will set the "ignore_pid" flag if the next task
does not match the matching pid.
The pre probe allows for probes tracing sched_switch to be traced if
necessary.
The sched_wakeup pre probe will set the "ignore_pid" flag if neither the
current task nor the wakee task has a matching pid.
The sched_wakeup post probe will set the "ignore_pid" flag if the current
task does not have a matching pid.
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Create a tracing directory called set_event_pid, which currently has no
function, but will be used to filter all events for the tracing instance or
the pids that are added to the file.
The reason no functionality is added with this commit is that this commit
focuses on the creation and removal of the pids in a safe manner. And tests
can be made against this change to make sure things are correct before
hooking features to the list of pids.
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
In order to guarantee that a probe will be called before other probes that
are attached to a tracepoint, there needs to be a mechanism to provide
priority of one probe over the others.
Adding a prio field to the struct tracepoint_func, which lets the probes be
sorted by the priority set in the structure. If no priority is specified,
then a priority of 10 is given (this is a macro, and perhaps may be changed
in the future).
Now probes may be added to affect other probes that are attached to a
tracepoint with a guaranteed order.
One use case would be to allow tracing of tracepoints be able to filter by
pid. A special (higher priority probe) may be added to the sched_switch
tracepoint and set the necessary flags of the other tracepoints to notify
them if they should be traced or not. In case a tracepoint is enabled at the
sched_switch tracepoint too, the order of the two are not random.
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Pull scheduler fixes from Ingo Molnar:
"Misc fixes all around the map: an instrumentation fix, a nohz
usability fix, a lockdep annotation fix and two task group scheduling
fixes"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/core: Add missing lockdep_unpin() annotations
sched/deadline: Fix migration of SCHED_DEADLINE tasks
nohz: Revert "nohz: Set isolcpus when nohz_full is set"
sched/fair: Update task group's load_avg after task migration
sched/fair: Fix overly small weight for interactive group entities
sched, tracing: Stop/start critical timings around the idle=poll idle loop
Merge fixes from Andrew Morton:
"9 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
ocfs2/dlm: unlock lockres spinlock before dlm_lockres_put
fault-inject: fix inverted interval/probability values in printk
lib/Kconfig.debug: disable -Wframe-larger-than warnings with KASAN=y
mm: make sendfile(2) killable
thp: use is_zero_pfn() only after pte_present() check
mailmap: update Javier Martinez Canillas' email
MAINTAINERS: add Sergey as zsmalloc reviewer
mm: cma: fix incorrect type conversion for size during dma allocation
kmod: don't run async usermode helper as a child of kworker thread
call_usermodehelper_exec_sync() does fork() + wait() with "unignored"
SIGCHLD. What we have missed is that this worker thread can have other
children previously forked by call_usermodehelper_exec_work() without
UMH_WAIT_PROC. If such a child exits in between it becomes a zombie
because auto-reaping only works if SIGCHLD is ignored, and nobody can
reap it (unless/until this worker thread exits too).
Change the !UMH_WAIT_PROC case to use CLONE_PARENT.
Note: this is only first step. All PF_KTHREAD tasks, even created by
kernel_thread() should have ->parent == kthreadd by default.
Fixes: bb304a5c6fc63d8506c ("kmod: handle UMH_WAIT_PROC from system unbound workqueue")
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is an incremental fix for a patch previously pulled from tip
irq/for-arm.
* 'irq/for-arm' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Make the cpuhotplug migration code less noisy
pstore doesn't support unregistering yet. It was marked as TODO.
This patch adds some code to fix it:
1) Add functions to unregister kmsg/console/ftrace/pmsg.
2) Add a function to free compression buffer.
3) Unmap the memory and free it.
4) Add a function to unregister pstore filesystem.
Signed-off-by: Geliang Tang <geliangtang@163.com>
Acked-by: Kees Cook <keescook@chromium.org>
[Removed __exit annotation from ramoops_remove(). Reported by Arnd Bergmann]
Signed-off-by: Tony Luck <tony.luck@intel.com>
This helper is used to send raw data from eBPF program into
special PERF_TYPE_SOFTWARE/PERF_COUNT_SW_BPF_OUTPUT perf_event.
User space needs to perf_event_open() it (either for one or all cpus) and
store FD into perf_event_array (similar to bpf_perf_event_read() helper)
before eBPF program can send data into it.
Today the programs triggered by kprobe collect the data and either store
it into the maps or print it via bpf_trace_printk() where latter is the debug
facility and not suitable to stream the data. This new helper replaces
such bpf_trace_printk() usage and allows programs to have dedicated
channel into user space for post-processing of the raw data collected.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of WARN_ON in perf_event_output() on unpaded raw samples,
pad them automatically.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The original arm code has a pr_debug() statement for the case where
the irq chip has no set_affinity() callback. That's sufficient for
debugging and we really don't want to spam dmesg with useless warnings
for the normal case.
Fixes: f1e0bb0ad473: "genirq: Introduce generic irq migration for cpu hotunplug"
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Requested-by: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Yang Yingliang <yangyingliang@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Hanjun Guo <hanjun.guo@linaro.org>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Currently we see this in "git status" if we build in the source dir:
Untracked files:
(use "git add <file>..." to include in what will be committed)
certs/x509_certificate_list
It looks like it used to live in kernel/ so we squash that .gitignore
entry at the same time. I didn't bother to dig through git history to
see when it moved, since it is just a minor annoyance at most.
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: keyrings@linux-nfs.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Both start_branch_trace() and stop_branch_trace() are used in only one
location, and are both static. As they are small functions there is no
need to keep them separated out.
Link: http://lkml.kernel.org/r/1445000689-32596-1-git-send-email-0x7f454c46@gmail.com
Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
It is helpful when the crashkernel cmdline parsing routines
actually say which character is the unrecognized one. Make them
do so.
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Young <dyoung@redhat.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Baoquan He <bhe@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Salter <msalter@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: WANG Chao <chaowang@redhat.com>
Cc: jerry_hoemann@hp.com
Link: http://lkml.kernel.org/r/1445246268-26285-8-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>