11024 Commits

Author SHA1 Message Date
3c1a427954 perf annotate s390: Fix perf annotate error -95 (4.10 regression)
since 4.10 perf annotate exits on s390 with an "unknown error -95".
Turns out that commit 786c1b51844d ("perf annotate: Start supporting
cross arch annotation") added a hard requirement for architecture
support when objdump is used but only provided x86 and arm support.
Meanwhile power was added so lets add s390 as well.

While at it make sure to implement the branch and jump types.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Andreas Krebbel <krebbel@linux.vnet.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-s390 <linux-s390@vger.kernel.org>
Cc: stable@kernel.org # v4.10+
Fixes: 786c1b51844 "perf annotate: Start supporting cross arch annotation"
Link: http://lkml.kernel.org/r/1491465112-45819-2-git-send-email-borntraeger@de.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-04-07 12:33:10 -03:00
89c0a36130 selftests/bpf: fix merge conflict
fix artifact of merge resolution

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-06 12:21:59 -07:00
6f14f443d3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Mostly simple cases of overlapping changes (adding code nearby,
a function whose name changes, for example).

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-06 08:24:51 -07:00
af0e54619d selftests: add a generic testsuite for ethernet device
This patch add a generic testsuite for testing ethernet network device driver.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05 08:30:11 -07:00
99094a5e94 perf annotate: Fix missing number of samples for source_line_samples
The option 'show-total-period' works fine without a option '-l'.  But if
running 'perf annotate --stdio -l --show-total-period', you can see a
problem showing only zero '0' for number of samples.

Before:
    $ perf annotate --stdio -l --show-total-period
...
       0 :        400816:       push   %rbp
       0 :        400817:       mov    %rsp,%rbp
       0 :        40081a:       mov    %edi,-0x24(%rbp)
       0 :        40081d:       mov    %rsi,-0x30(%rbp)
       0 :        400821:       mov    -0x24(%rbp),%eax
       0 :        400824:       mov    -0x30(%rbp),%rdx
       0 :        400828:       mov    (%rdx),%esi
       0 :        40082a:       mov    $0x0,%edx
...

The reason is it was missed to set number of samples of
source_line_samples, so set it ordinarily.

After:
    $ perf annotate --stdio -l --show-total-period
...
       3 :        400816:       push   %rbp
       4 :        400817:       mov    %rsp,%rbp
       0 :        40081a:       mov    %edi,-0x24(%rbp)
       0 :        40081d:       mov    %rsi,-0x30(%rbp)
       1 :        400821:       mov    -0x24(%rbp),%eax
       2 :        400824:       mov    -0x30(%rbp),%rdx
       0 :        400828:       mov    (%rdx),%esi
       1 :        40082a:       mov    $0x0,%edx
...

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Martin Liska <mliska@suse.cz>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 0c4a5bcea460 ("perf annotate: Display total number of samples with --show-total-period")
Link: http://lkml.kernel.org/r/1490703125-13643-1-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-04-04 21:08:00 -03:00
9c0899f157 perf tools: Don't die on a print function
Trying to remove die() calls from library functions, postponing exiting
to the tool main code.

Link: http://lkml.kernel.org/n/tip-ackxq5nqe39gunln3tkczs42@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-04-04 12:11:07 -03:00
f05082b547 perf tools: Handle allocation failures gracefully
The callers of perf_read_values__enlarge_counters() already propagate
errors, so just print some debug diagnostics and handle allocation
failures gracefully, not trying to do silly things like 'a =
realloc(a)'.

Link: http://lkml.kernel.org/n/tip-nsmmh7uzpg35rzcl9nq7yztp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-04-04 12:05:37 -03:00
427748068a perf tools: Remove die() call
We can just use the exit() right after the branch calling die().

Link: http://lkml.kernel.org/n/tip-90athn06d7atf2jkpfvq1iic@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-04-04 11:36:22 -03:00
696ced4fb1 tracing/kprobes: expose maxactive for kretprobe in kprobe_events
When a kretprobe is installed on a kernel function, there is a maximum
limit of how many calls in parallel it can catch (aka "maxactive"). A
kernel module could call register_kretprobe() and initialize maxactive
(see example in samples/kprobes/kretprobe_example.c).

But that is not exposed to userspace and it is currently not possible to
choose maxactive when writing to /sys/kernel/debug/tracing/kprobe_events

The default maxactive can be as low as 1 on single-core with a
non-preemptive kernel. This is too low and we need to increase it not
only for recursive functions, but for functions that sleep or resched.

This patch updates the format of the command that can be written to
kprobe_events so that maxactive can be optionally specified.

I need this for a bpf program attached to the kretprobe of
inet_csk_accept, which can sleep for a long time.

This patch includes a basic selftest:

> # ./ftracetest -v  test.d/kprobe/
> === Ftrace unit tests ===
> [1] Kprobe dynamic event - adding and removing	[PASS]
> [2] Kprobe dynamic event - busy event check	[PASS]
> [3] Kprobe dynamic event with arguments	[PASS]
> [4] Kprobes event arguments with types	[PASS]
> [5] Kprobe dynamic event with function tracer	[PASS]
> [6] Kretprobe dynamic event with arguments	[PASS]
> [7] Kretprobe dynamic event with maxactive	[PASS]
>
> # of passed:  7
> # of failed:  0
> # of unresolved:  0
> # of untested:  0
> # of unsupported:  0
> # of xfailed:  0
> # of undefined(test bug):  0

BugLink: https://github.com/iovisor/bcc/issues/1072
Link: http://lkml.kernel.org/r/1491215782-15490-1-git-send-email-alban@kinvolk.io

Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Alban Crequy <alban@kinvolk.io>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-04-04 10:32:03 -04:00
f3eda8f573 Merge branch 'perf/uncore-json-updates-1' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc into perf/core
Pull perf/core improvements from Andi Kleen:

This pull requests contains updates to the Intel PMU events JSON files,
plus two one liner code fixes for the JSON files (also appended as patch)

The most remarkable change is support for Sandy Bridge to Skylake
client uncore event list support.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-04-04 11:02:47 -03:00
f5a70801b7 perf sdt powerpc: Add argument support
SDT marker argument is in N@OP format. Here OP is arch dependent
component. Add powerpc logic to parse OP and convert it to uprobe
compatible format.

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexis Berlemont <alexis.berlemont@gmail.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170328094754.3156-4-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-04-04 10:36:59 -03:00
7f75540ff2 Linux 4.11-rc5
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJY4ZYkAAoJEHm+PkMAQRiGsq4H/R4PMXDoe2XhSSk7IoT97pXV
 /A8np/scAPjzEgYUidbb54OSqWwsPRuPGWONTFeSrE2u0L4wln/REI91jg7QetLq
 IisncExlYeJ/XQ+iO0ZZh9fLbqwIlEJFdSXmyIFr3m/TBxe8a61C8j93oNgM1tHT
 yuwzlq7c3sLq2hsmUG2HyL2kJsEfRasv4Rk0yhFuti12zVsBoTW4qmZuMauq+gdf
 f7cSYgiHhPTdb2o+azg5O7uYNHaQQBxdUMlIuhhYtVOUq+pFDO23SLHSFIW2NwOm
 Zn5R6CFSrLsCw0Bx0v8Xlc151QUbaRK4h9lhUhkBr6d3uNShU1NQ9JojpSvYwBo=
 =vP6E
 -----END PGP SIGNATURE-----

Merge tag 'v4.11-rc5' into x86/mm, to refresh the branch

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-03 16:36:32 +02:00
3782161362 selftests/bpf: add l4 load balancer test based on sched_cls
this l4lb demo is a comprehensive test case for LLVM codegen and
kernel verifier. It's using fully inlined jhash(), complex packet
parsing and multiple map lookups of different types to stress
llvm and verifier.
The map sizes, map population and test vectors are artificial to
exercise different paths through the bpf program.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-01 12:45:57 -07:00
8d48f5e427 selftests/bpf: add a test for basic XDP functionality
add C test for xdp_adjust_head(), packet rewrite and map lookups

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-01 12:45:57 -07:00
6882804c91 selftests/bpf: add a test for overlapping packet range checks
add simple C test case for llvm and verifier range check fix from
commit b1977682a385 ("bpf: improve verifier packet range checks")

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-01 12:45:57 -07:00
dd26b7f54a tools/lib/bpf: expose bpf_program__set_type()
expose bpf_program__set_type() to set program type

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-01 12:45:57 -07:00
3084887378 tools/lib/bpf: add support for BPF_PROG_TEST_RUN command
add support for BPF_PROG_TEST_RUN command to libbpf.a

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-01 12:45:57 -07:00
02ea80b185 bpf: add various verifier test cases for self-tests
Add a couple of test cases, for example, probing for xadd on a spilled
pointer to packet and map_value_adj register, various other map_value_adj
tests including the unaligned load/store, and trying out pointer arithmetic
on map_value_adj register itself. For the unaligned load/store, we need
to figure out whether the architecture has efficient unaligned access and
need to mark affected tests accordingly.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-01 12:36:37 -07:00
fcc309e618 perf/core improvements and fixes:
New features:
 
 - Beautify the statx syscall arguments in 'perf trace' (Arnaldo Carvalho de Melo)
 
     e.g.:
 
   System wide strace like session:
 
   # trace -e statx
    16612.967 ( 0.028 ms): statx/4562 statx(dfd: CWD, filename: /tmp/statx, flags: SYMLINK_NOFOLLOW, mask: TYPE|MODE|NLINK|UID|GID|ATIME|MTIME|CTIME|INO|SIZE|BLOCKS|BTIME, buffer: 0x7ffef195d660) = 0
    36050.891 ( 0.007 ms): statx/4576 statx(dfd: CWD, filename: /etc/passwd, flags: SYMLINK_NOFOLLOW|STATX_DONT_SYNC, mask: BTIME, buffer: 0x7ffda9bf50f0) = 0
   ^C#
 
 User visible:
 
 - Handle unpaired raw_syscalls:sys_exit events in 'perf trace', i.e. we
   shouldn't try to calculate duration or print the timestamp for a missing
   matching raw_syscalls:sys_enter (Arnaldo Carvalho de Melo)
 
 - Do not print "cycles: 0" in perf report LBR lines in platforms not
   supporting 'cycles', such as Intel's Broadwell (Jin Yao)
 
 - Handle missing $HOME env var (Jiri Olsa)
 
 - Map 8-bit registers (al, bl, etc), not supported in uprobes_events, to
   the next best thing (ax, bx, etc) supported (Ravi Bangoria)
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJY3pg+AAoJENZQFvNTUqpAZTMP/RXyPMW/7FGe/fAflngG5EjZ
 CM2C6WvaBLxfAI5vAeQ56ik5IYLvtjPCiWP0jBs71gNpsotH+YpaTfHDyG6tBjJG
 /hYnP2RX0oMVitn9fpLbYKYEH2KecfbNADUZxEAB9nfiVtZ4wGwWC/djzMwDyvXz
 tEx+LHxkmx2zEz6bSaysDj8uMnZreM4etgwu09XLpkGseSPxyDEArleqObEXKw5B
 R2FR9nINPv7YKlq/C0ZBMI9qKJwb534qGaceb5ZqMfTZw5mnqGbcUEcyNf1J1lJN
 SFFPSOm75ViMDM7bWq1g2gipx92o163o+78cf83KrWWm/Hz6B1T3pYqq6FciA9tJ
 ZALQVvP5U1wEhiaTNt75R2PTZxLmMfE2mY/1RM42DT8VD3Awof1lCzuKmVo00Ike
 dfaI8vUYd27RN0P/nqS+GDgI0XtxAEE/El3xgBNdqBmSR4W0eted2c9rj2PKTl7x
 /R7gbVEjhqk5J9uZKYBE2SGbbyF5itymFImqNYE3D+fVlUYHrdvNfUkL9n9/OraA
 9o1vyaYprTMBvgZzFp7ydpkpwPPi0pXzgypabuzV6GnAlFf1bxXBQHOhovz6pGuv
 Ffb0cka/9N//NdJolHBi4qe92iGDDlCY5oq+8O2/4kfMtR8g2Mg9Er8axhGXzVTq
 8jMAt7SroCIgbkHerTdl
 =cpqq
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo-4.12-20170331' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

New features:

- Beautify the statx syscall arguments in 'perf trace' (Arnaldo Carvalho de Melo)

    e.g.:

  System wide strace like session:

  # trace -e statx
   16612.967 ( 0.028 ms): statx/4562 statx(dfd: CWD, filename: /tmp/statx, flags: SYMLINK_NOFOLLOW, mask: TYPE|MODE|NLINK|UID|GID|ATIME|MTIME|CTIME|INO|SIZE|BLOCKS|BTIME, buffer: 0x7ffef195d660) = 0
   36050.891 ( 0.007 ms): statx/4576 statx(dfd: CWD, filename: /etc/passwd, flags: SYMLINK_NOFOLLOW|STATX_DONT_SYNC, mask: BTIME, buffer: 0x7ffda9bf50f0) = 0
  ^C#

User visible changes:

- Handle unpaired raw_syscalls:sys_exit events in 'perf trace', i.e. we
  shouldn't try to calculate duration or print the timestamp for a missing
  matching raw_syscalls:sys_enter (Arnaldo Carvalho de Melo)

- Do not print "cycles: 0" in perf report LBR lines in platforms not
  supporting 'cycles', such as Intel's Broadwell (Jin Yao)

- Handle missing $HOME env var (Jiri Olsa)

- Map 8-bit registers (al, bl, etc), not supported in uprobes_events, to
  the next best thing (ax, bx, etc) supported (Ravi Bangoria)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-01 12:43:40 +02:00
fd5cead23f perf trace: Beautify statx syscall 'flag' and 'mask' arguments
To test it, build samples/statx/test_statx, which I did as:

  $ make headers_install
  $ cc -I ~/git/linux/usr/include samples/statx/test-statx.c -o /tmp/statx

And then use perf trace on it:

  # perf trace -e statx /tmp/statx /etc/passwd
  statx(/etc/passwd) = 0
  results=7ff
    Size: 3496            Blocks: 8          IO Block: 4096    regular file
  Device: fd:00           Inode: 280156      Links: 1
  Access: (0644/-rw-r--r--)  Uid:     0   Gid:     0
  Access: 2017-03-29 16:01:01.650073438-0300
  Modify: 2017-03-10 16:25:14.156479354-0300
  Change: 2017-03-10 16:25:14.171479328-0300
     0.000 ( 0.007 ms): statx/30648 statx(dfd: CWD, filename: 0x7ef503f4, flags: SYMLINK_NOFOLLOW, mask: TYPE|MODE|NLINK|UID|GID|ATIME|MTIME|CTIME|INO|SIZE|BLOCKS|BTIME, buffer: 0x7fff7ef4eb10) = 0
  #

Using the test-stat.c options to change the mask:

  # perf trace -e statx /tmp/statx -O /etc/passwd > /dev/null
     0.000 ( 0.008 ms): statx/30745 statx(dfd: CWD, filename: 0x3a0753f4, flags: SYMLINK_NOFOLLOW, mask: BTIME, buffer: 0x7ffd3a0735c0) = 0
  #
  # perf trace -e statx /tmp/statx -A /etc/passwd > /dev/null
     0.000 ( 0.010 ms): statx/30757 statx(dfd: CWD, filename: 0xa94e63f4, flags: SYMLINK_NOFOLLOW|NO_AUTOMOUNT, mask: TYPE|MODE|NLINK|UID|GID|ATIME|MTIME|CTIME|INO|SIZE|BLOCKS|BTIME, buffer: 0x7ffea94e49d0) = 0
  #
  # trace --no-inherit -e statx /tmp/statx -F /etc/passwd > /dev/null
     0.000 ( 0.011 ms): statx(dfd: CWD, filename: 0x3b02d3f3, flags: SYMLINK_NOFOLLOW|STATX_FORCE_SYNC, mask: TYPE|MODE|NLINK|UID|GID|ATIME|MTIME|CTIME|INO|SIZE|BLOCKS|BTIME, buffer: 0x7ffd3b02c850) = 0
  #
  # trace --no-inherit -e statx /tmp/statx -F -L /etc/passwd > /dev/null
     0.000 ( 0.008 ms): statx(dfd: CWD, filename: 0x15cff3f3, flags: STATX_FORCE_SYNC, mask: TYPE|MODE|NLINK|UID|GID|ATIME|MTIME|CTIME|INO|SIZE|BLOCKS|BTIME, buffer: 0x7fff15cfdda0) = 0
  #
  # trace --no-inherit -e statx /tmp/statx -D -O /etc/passwd > /dev/null
     0.000 ( 0.009 ms): statx(dfd: CWD, filename: 0xfa37f3f3, flags: SYMLINK_NOFOLLOW|STATX_DONT_SYNC, mask: BTIME, buffer: 0x7ffffa37da20) = 0
  #

Adding a probe to get the filename collected as well:

  # perf probe 'vfs_getname=getname_flags:72 pathname=result->name:string'
  Added new event:
    probe:vfs_getname    (on getname_flags:72 with pathname=result->name:string)

  You can now use it in all perf tools, such as:

	  perf record -e probe:vfs_getname -aR sleep 1

  # trace --no-inherit -e statx /tmp/statx -D -O /etc/passwd > /dev/null
     0.169 ( 0.007 ms): statx(dfd: CWD, filename: /etc/passwd, flags: SYMLINK_NOFOLLOW|STATX_DONT_SYNC, mask: BTIME, buffer: 0x7ffda9bf50f0) = 0
  #

Same technique could be used to collect and beautify the result put in
the 'buffer' argument.

Finally do a system wide 'perf trace' session looking for any use of statx,
then run the test proggie with various flags:

  # trace -e statx
   16612.967 ( 0.028 ms): statx/4562 statx(dfd: CWD, filename: /tmp/statx, flags: SYMLINK_NOFOLLOW, mask: TYPE|MODE|NLINK|UID|GID|ATIME|MTIME|CTIME|INO|SIZE|BLOCKS|BTIME, buffer: 0x7ffef195d660) = 0
   33064.447 ( 0.011 ms): statx/4569 statx(dfd: CWD, filename: /tmp/statx, flags: SYMLINK_NOFOLLOW|STATX_FORCE_SYNC, mask: TYPE|MODE|NLINK|UID|GID|ATIME|MTIME|CTIME|INO|SIZE|BLOCKS|BTIME, buffer: 0x7ffc5484c790) = 0
   36050.891 ( 0.023 ms): statx/4576 statx(dfd: CWD, filename: /tmp/statx, flags: SYMLINK_NOFOLLOW, mask: BTIME, buffer: 0x7ffeb18b66e0) = 0
   38039.889 ( 0.023 ms): statx/4584 statx(dfd: CWD, filename: /tmp/statx, flags: SYMLINK_NOFOLLOW, mask: TYPE|MODE|NLINK|UID|GID|ATIME|MTIME|CTIME|INO|SIZE|BLOCKS|BTIME, buffer: 0x7fff1db0ea90) = 0
  ^C#

This one also starts moving the beautifiers from files directly included
in builtin-trace.c to separate objects + a beauty.h header with
prototypes, so that we can add test cases in tools/perf/tests/ to fire
syscalls with various arguments and then get them intercepted as
syscalls:sys_enter_foo or raw_syscalls:sys_enter + sys_exit to then
format and check that the formatted output is the one we expect.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: David Ahern <dsahern@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-xvzw8eynffvez5czyzidhrno@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-31 14:42:31 -03:00
3e00cbe889 perf tools: Do not fail in case of empty HOME env variable
Currently we fail in the following case:

  $ unset HOME
  $ ./perf record ls
  $ echo $?
  255

It's because the config code init fails due to a missing HOME variable
value. Fix this by skipping the user config init if there's no HOME
variable value.

Reported-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170330144637.7468-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-31 11:26:04 -03:00
67ef28794d tools include uapi: Grab copies of stat.h and fcntl.h
We will need it to build tools/perf/trace/beauty/statx.h.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-nin41ve2fa63lrfbdr6x57yr@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-31 11:26:03 -03:00
3401e8d1e1 perf vendor events intel: Add missing space in json descriptions
Add a missing space in the JSON description after the uncore unit

Before:

perf list
...
  unc_arb_coh_trk_requests.all
       [Unit: uncore_arbNumber of entries allocated. Account for Any type: e.g. Snoop, Core aperture, etc]
...

After:

  unc_arb_coh_trk_requests.all
       [Unit: uncore_arb Number of entries allocated. Account for Any type: e.g. Snoop, Core aperture, etc]

Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/n/tip-p989c7x9kaiy2bnkmgpo6cvt@git.kernel.org
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2017-03-30 13:35:50 -07:00
af34cb4fad perf vendor events intel: Add uncore_arb JSON support
The JSON lists call the box iMPH-U, while perf calls it arb.
Add conversion support to json to convert the unit properly.

Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/n/tip-stq5ly95z2qioggp9bfaqe0h@git.kernel.org
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2017-03-30 13:35:41 -07:00
92c6de0f10 perf vendor events intel: Add uncore events for Skylake client
Add V25 of Skylake uncore events

Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/n/tip-00qmcrmq183x2qrj59g92fma@git.kernel.org
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2017-03-30 13:35:32 -07:00
092a95d416 perf vendor events intel: Add uncore events for Broadwell client
Add V18 of Broadwell uncore events

Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/n/tip-xlbguqdzho7l3qn7di40a7av@git.kernel.org
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2017-03-30 13:35:23 -07:00
0585c6265e perf vendor events intel: Add uncore events for Haswell client
Add V25 of Haswell uncore events

Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/n/tip-133r1do7vvssoyszxgx174hj@git.kernel.org
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2017-03-30 13:35:15 -07:00
bccdcb2a77 perf vendor events intel: Add uncore events for Ivy Bridge client
Add V18 of Ivy Bridge uncore events

Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/n/tip-299k76asec5rwp0i86qygnnt@git.kernel.org
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2017-03-30 13:35:01 -07:00
80432c7311 perf vendor events intel: Add uncore events for Sandy Bridge client
Add V15 of Sandy Bridge uncore events

Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/n/tip-2qkwutpwljdue8jmwk3xqdbl@git.kernel.org
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2017-03-30 13:34:15 -07:00
9c4e2e2589 perf vendor events intel: Add missing UNC_M_DCLOCKTICKS for Broadwell DE uncore
An earlier update removed the UNC_M_CLOCKTICKS event for Broadwell DE.
But Metric events were still referring to it.
This adds it back under a different name from the event list,
and also fixes up the Metric events to use the new name.

Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/n/tip-zxxzg4g5nr93o7np00vgqqwm@git.kernel.org
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2017-03-30 13:32:25 -07:00
a596a877fd perf utils: Fix spelling mistake: "Invalud" -> "Invalid"
Trivial fix to spelling mistake in pr_debug message.

Signed-off-by: Colin King <colin.king@canonical.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Krister Johansen <kjlx@templeofstupid.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-janitors@vger.kernel.org
Link: http://lkml.kernel.org/r/20170330095440.19444-1-colin.king@canonical.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-30 11:09:42 -03:00
c69f203df3 Merge branch 'linus' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-30 09:48:58 +02:00
fd2b297514 perf trace: Handle unpaired raw_syscalls:sys_exit event
Which may happen when we start a tracing session and a thread is waiting
for something like "poll" to return, in which case we better print "?"
both for the syscall entry timestamp and for the duration.

E.g.:

Tracing existing mutt session:

  # perf trace -p `pidof mutt`
          ? (     ?   ): mutt/17135  ... [continued]: poll()) = 1
      0.027 ( 0.013 ms): mutt/17135 read(buf: 0x7ffcb3c42cef, count: 1) = 1
      0.047 ( 0.008 ms): mutt/17135 poll(ufds: 0x7ffcb3c42c50, nfds: 1, timeout_msecs: 1000) = 1
      0.059 ( 0.008 ms): mutt/17135 read(buf: 0x7ffcb3c42cef, count: 1) = 1
  <SNIP>

Before it would print a large number because we'd do:

  ttrace->entry_time - trace->base_time

And entry_time would be 0, while base_time would be the timestamp for
the first event 'perf trace' reads, oops.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Claudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-wbcb93ofva2qdjd5ltn5eeqq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-29 17:16:58 -03:00
c1dfcfad58 perf report: Drop cycles 0 for LBR print
For some platforms, for example Broadwell, it doesn't support cycles
for LBR. But the perf always prints cycles:0, it's not necessary.

The patch refactors the LBR info print code and drops the cycles:0.

For example: perf report --branch-history --no-children --stdio

On Broadwell:
--0.91%--__random_r random_r.c:394 (iterations:2)
          __random_r random_r.c:360 (predicted:0.0%)
          __random_r random_r.c:380 (predicted:0.0%)
          __random_r random_r.c:357

On Skylake:
--1.07%--main div.c:39 (predicted:52.4% cycles:1 iterations:17)
          main div.c:44 (predicted:52.4% cycles:1)
          main div.c:42 (cycles:2)
          compute_flag div.c:28 (cycles:2)
          compute_flag div.c:27 (cycles:1)
          rand rand.c:28 (cycles:1)
          rand rand.c:28 (cycles:1)
          __random random.c:298 (cycles:1)
          __random random.c:297 (cycles:1)
          __random random.c:295 (cycles:1)
          __random random.c:295 (cycles:1)
          __random random.c:295 (cycles:1)

Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
	Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/1489046786-10061-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-28 16:20:59 -03:00
d451a205da perf/sdt/x86: Move OP parser to tools/perf/arch/x86/
SDT marker argument is in N@OP format. N is the size of argument and OP
is the actual assembly operand. OP is arch dependent component and hence
it's parsing logic also should be placed under tools/perf/arch/.

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexis Berlemont <alexis.berlemont@gmail.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170328094754.3156-3-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-28 12:25:30 -03:00
2d01ecc580 perf/sdt/x86: Add renaming logic for (missing) 8 bit registers
I found couple of events using al, bl, cl and dl registers for argument.
These are not directly accepted by uprobe_events and thus needs to be
mapped to ax, bx, cx and dx respectively.

Few ex,

  /usr/bin/qemu-system-s390x
    css_adapter_interrupt: 1@%bl
    css_chpid_add: 1@%cl 1@%sil 1@%dl
    dma_bdrv_io: 8@%rbx 8@%rbp -8@%r14 1@%al

  /usr/bin/postgres
    buffer__read__done: ... -1@-bash -1@%al
    buffer__read__start: ... -1@%al

I don't find any sdt events using ah, bh,... registers. But I also don't
see any reason to not use them, so there might be rare events using
these registers, and if so, perf should have a renaming logic for them
too.

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexis Berlemont <alexis.berlemont@gmail.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170328094754.3156-2-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-28 12:24:56 -03:00
c68677014b perf tools: Remove support for command aliases
This came from 'git', but isn't documented anywhere in
tools/perf/Documentation/, looks like baggage we can do without, ditch
it.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-e7uwkn60t4hmlnwj99ba4t2s@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-28 11:19:59 -03:00
e5c1ff1475 Linux 4.11-rc4
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJY2C9qAAoJEHm+PkMAQRiGaBQIAIGzdlZ6ImiP6zoukrRv7qUr
 44ITm0lsBiL85QGedhQQL+Y9UqwUmlqgFqnH0Gr8YHNbLJWXzdjGbl5aVo4KjASq
 104NLUDXtPww/xZdH4wJMzhuwucYwZOUyDOjOr0ak3cGxOE2xjNjHMZXxWUf20GO
 EpRr6WhV1DUAvAdjdNa9KlcOjMluNpMLLyL1CFLjrkkArrWAyqOURKHAb6ZLghfv
 iZV1qJTVPyYGpnlI3kuEgu2GuDjxqpoNLSr3wHyEHm/pBPEl7MX6zPbzcegBV8TY
 cRRlXo4notdsuknmSNcj0hHuTQvw1kl7BhieLKVsnCyCIM6jjX4TSQZFutmbzwM=
 =5iRl
 -----END PGP SIGNATURE-----

Backmerge tag 'v4.11-rc4' into drm-next

Linux 4.11-rc4

The i915 GVT team need the rc4 code to base some more code on.
2017-03-28 17:34:19 +10:00
3906a13a6b perf/core improvements and fixes:
New features:
 
 - Handle inline functions in callchains (Jin Yao)
 
 - Enable sorting by srcline as key (Milian Wolff)
 
 Fixes:
 
 - Fix no_size logic in addr_filter__resolve_kernel_syms() in the
   auxtrace code (Adrian Hunter)
 
 - Fix some thread refcount leaks in 'perf trace' (Arnaldo Carvalho de Melo)
 
 - Fix divide by zero when calculating percent for an event in a group in
   the annotate by source line code (Taeung Song)
 
 - build-id files now aren't anymore symlinks, their parent directories
   are, so readlink the later (Taeung Song)
 
 - Assorted fixes for null termination problems, mostly related to
   readlink, detected by valgrind (Tommi Rantala)
 
 Infrastructure:
 
 - Make vfs_getname probe point logic in 'perf trace' more robust
   wrt length of pathname (Arnaldo Carvalho de Melo)
 
 - Remove unused 'prefix' parameter from builtins main functions (Arnaldo Carvalho de Melo)
 
 - Show 'perf list sdt' option in man page (Ravi Bangoria)
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJY2WNUAAoJENZQFvNTUqpAZ3AQAIn/Q+Y665oP57RbikedeifL
 He8vdMUkD/haRo0atbvuu5tRrwiRUabkUa6GKPHNCDl8GUD6UbkztUirL4Cq4v9s
 7ONbCHXzaPnPZbDbl/W7Yx4vADow3YMR9EyNkL8/i2ApZqMCPQ9mUBhxJlSDp7RY
 agYcOugUlYuvHsKVX59fTyvTAq8btfyFQTqhJ+NPddcxsyR5jam9XxxvgMURdFJr
 h6OLO9wqCxlMctqlGXU+6tpqiAR+bp8UZgzDKwabGR4mZR+uLBYGf0FUQz52vf2A
 83ufaZ5UrQUsSnVeYXBPW+i8+Ixu8pEOFDMDcSpk/wQXunLlN52LmuatSCkPBEV1
 jFth8SX3IAX349hpaRBNuLk5UuqS6NKBztYzlaVsKMpuIw4hRPVE3VvqKefZD/hx
 Vdlr1v6fPXMcRUcc3lFFiVCIvs0hRV4IDDIimGjJHf8dm+GFMHH+bk+tfiSQAlmZ
 q3aSKMImUM3vlD01E4BmTVr4IEZHTd3mv0Ml+nbQGNj6Bu2364eBsFRnNHJWwGmt
 c9tcnmeRv6JzrmprVXMuOUyyTcml+b5/vincEEmTxUdbxCbYFkQS3JzPxfpxqFI/
 zM5rlJJ9KKWXmwD6OgUoXT5IUzq4BuIVyJ3DxwuL2rrQggsv0zORxQtVduY+IJSj
 ZD/Qu7SOiFfnAFM6kLwP
 =Lm/M
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo-4.12-20170327' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

New features:

 - Handle inline functions in callchains (Jin Yao)

 - Enable sorting by srcline as key (Milian Wolff)

Fixes:

 - Fix no_size logic in addr_filter__resolve_kernel_syms() in the
   auxtrace code (Adrian Hunter)

 - Fix some thread refcount leaks in 'perf trace' (Arnaldo Carvalho de Melo)

 - Fix divide by zero when calculating percent for an event in a group in
   the annotate by source line code (Taeung Song)

 - build-id files now aren't anymore symlinks, their parent directories
   are, so readlink the later (Taeung Song)

 - Assorted fixes for null termination problems, mostly related to
   readlink, detected by valgrind (Tommi Rantala)

Infrastructure changes:

 - Make vfs_getname probe point logic in 'perf trace' more robust
   wrt length of pathname (Arnaldo Carvalho de Melo)

 - Remove unused 'prefix' parameter from builtins main functions (Arnaldo Carvalho de Melo)

 - Show 'perf list sdt' option in man page (Ravi Bangoria)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-28 07:44:43 +02:00
d652f4bbca Merge branch 'perf/urgent' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-28 07:44:25 +02:00
55f77128e7 perf utils: Readlink /proc/self/exe to find the perf binary
Simplification: it is easier to open /proc/self/exe than /proc/$pid/exe.

Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170322130624.21881-7-tommi.t.rantala@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-27 15:37:54 -03:00
d4b364df5f perf utils: Null terminate buf in read_ftrace_printk()
Ensure that the string that we read from the data file is null terminated.

Valgrind was complaining:

  ==31357== Invalid read of size 1
  ==31357==    at 0x4EC8C1: __strtok_r_1c (string2.h:200)
  ==31357==    by 0x4EC8C1: parse_ftrace_printk (trace-event-parse.c:161)
  ==31357==    by 0x4F82A8: read_ftrace_printk (trace-event-read.c:204)
  ==31357==    by 0x4F82A8: trace_report (trace-event-read.c:468)
  ==31357==    by 0x4CD552: process_tracing_data (header.c:1576)
  ==31357==    by 0x4D3397: perf_file_section__process (header.c:2705)
  ==31357==    by 0x4D3397: perf_header__process_sections (header.c:2488)
  ==31357==    by 0x4D3397: perf_session__read_header (header.c:2925)
  ==31357==    by 0x4E71E2: perf_session__open (session.c:32)
  ==31357==    by 0x4E71E2: perf_session__new (session.c:139)
  ==31357==    by 0x429F5D: cmd_annotate (builtin-annotate.c:472)
  ==31357==    by 0x497150: run_builtin (perf.c:359)
  ==31357==    by 0x428CE0: handle_internal_command (perf.c:421)
  ==31357==    by 0x428CE0: run_argv (perf.c:467)
  ==31357==    by 0x428CE0: main (perf.c:614)
  ==31357==  Address 0x8ac0efb is 0 bytes after a block of size 1,963 alloc'd
  ==31357==    at 0x4C2DB9D: malloc (vg_replace_malloc.c:299)
  ==31357==    by 0x4F827B: read_ftrace_printk (trace-event-read.c:195)
  ==31357==    by 0x4F827B: trace_report (trace-event-read.c:468)
  ==31357==    by 0x4CD552: process_tracing_data (header.c:1576)
  ==31357==    by 0x4D3397: perf_file_section__process (header.c:2705)
  ==31357==    by 0x4D3397: perf_header__process_sections (header.c:2488)
  ==31357==    by 0x4D3397: perf_session__read_header (header.c:2925)
  ==31357==    by 0x4E71E2: perf_session__open (session.c:32)
  ==31357==    by 0x4E71E2: perf_session__new (session.c:139)
  ==31357==    by 0x429F5D: cmd_annotate (builtin-annotate.c:472)
  ==31357==    by 0x497150: run_builtin (perf.c:359)
  ==31357==    by 0x428CE0: handle_internal_command (perf.c:421)
  ==31357==    by 0x428CE0: run_argv (perf.c:467)
  ==31357==    by 0x428CE0: main (perf.c:614)

Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170322130624.21881-6-tommi.t.rantala@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-27 15:37:35 -03:00
b7126ef786 perf utils: use sizeof(buf) - 1 in readlink() call
Ensure that we have space for the null byte in buf.

Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170322130624.21881-5-tommi.t.rantala@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-27 15:36:27 -03:00
0e6ba11511 perf tests: Do not assume that readlink() returns a null terminated string
Ensure that the string in buf is null terminated.

Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170322130624.21881-4-tommi.t.rantala@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-27 15:35:56 -03:00
5a2342111c perf buildid: Do not assume that readlink() returns a null terminated string
Valgrind was complaining:

  $ valgrind ./perf list >/dev/null
  ==11643== Memcheck, a memory error detector
  ==11643== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
  ==11643== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
  ==11643== Command: ./perf list
  ==11643==
  ==11643== Conditional jump or move depends on uninitialised value(s)
  ==11643==    at 0x4C30620: rindex (vg_replace_strmem.c:199)
  ==11643==    by 0x49DAA9: build_id_cache__origname (build-id.c:198)
  ==11643==    by 0x49E1C7: build_id_cache__valid_id (build-id.c:222)
  ==11643==    by 0x49E1C7: build_id_cache__list_all (build-id.c:507)
  ==11643==    by 0x4B9C8F: print_sdt_events (parse-events.c:2067)
  ==11643==    by 0x4BB0B3: print_events (parse-events.c:2313)
  ==11643==    by 0x439501: cmd_list (builtin-list.c:53)
  ==11643==    by 0x497150: run_builtin (perf.c:359)
  ==11643==    by 0x428CE0: handle_internal_command (perf.c:421)
  ==11643==    by 0x428CE0: run_argv (perf.c:467)
  ==11643==    by 0x428CE0: main (perf.c:614)
  [...]

Additionally, a zero length result from readlink() is not very interesting.

Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170322130624.21881-3-tommi.t.rantala@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-27 15:35:06 -03:00
2ccc220238 perf buildid: Do not update SDT cache with null filename
Valgrind was complaining:

  ==2633== Syscall param open(filename) points to unaddressable byte(s)
  ==2633==    at 0x5281CC0: __open_nocancel (syscall-template.S:84)
  ==2633==    by 0x537D38: open (fcntl2.h:53)
  ==2633==    by 0x537D38: get_sdt_note_list (symbol-elf.c:2017)
  ==2633==    by 0x5396FD: probe_cache__scan_sdt (probe-file.c:700)
  ==2633==    by 0x49EA2C: build_id_cache__add_sdt_cache (build-id.c:625)
  ==2633==    by 0x49EA2C: build_id_cache__add_s (build-id.c:697)
  ==2633==    by 0x49EE72: build_id_cache__add_b (build-id.c:717)
  ==2633==    by 0x49EE72: dso__cache_build_id (build-id.c:782)
  ==2633==    by 0x49F190: __dsos__cache_build_ids (build-id.c:793)
  ==2633==    by 0x49F190: machine__cache_build_ids (build-id.c:801)
  ==2633==    by 0x49F190: perf_session__cache_build_ids (build-id.c:815)
  ==2633==    by 0x4CD4F2: write_build_id (header.c:165)
  ==2633==    by 0x4D26F7: do_write_feat (header.c:2296)
  ==2633==    by 0x4D26F7: perf_header__adds_write (header.c:2335)
  ==2633==    by 0x4D26F7: perf_session__write_header (header.c:2414)
  ==2633==    by 0x43B324: __cmd_record (builtin-record.c:1154)
  ==2633==    by 0x43B324: cmd_record (builtin-record.c:1839)
  ==2633==    by 0x455A07: __cmd_record (builtin-kmem.c:1868)
  ==2633==    by 0x455A07: cmd_kmem (builtin-kmem.c:1944)
  ==2633==    by 0x497150: run_builtin (perf.c:359)
  ==2633==    by 0x428CE0: handle_internal_command (perf.c:421)
  ==2633==    by 0x428CE0: run_argv (perf.c:467)
  ==2633==    by 0x428CE0: main (perf.c:614)
  ==2633==  Address 0x0 is not stack'd, malloc'd or (recently) free'd

Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tommi Rantala <tommi.t.rantala@nokia.com>
Link: http://lkml.kernel.org/r/20170322130624.21881-2-tommi.t.rantala@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-27 15:33:36 -03:00
2e933b1274 perf annotate: Fix a bug of division by zero when calculating percent
Currently perf-annotate with --print-line can print
-nan(0x8000000000000) because of division by zero when calculating
percent. The division by zero happens when a sum of samples is zero in
symbol__get_source_line(), so fix it.

For example:

After running 'perf record' like below,

    $ perf record -e "{cycles,page-faults,branch-misses}" ./a.out

Before:

    $ perf annotate --stdio -l

  Sorted summary for file /home/taeung/workspace/a.out
  ----------------------------------------------

   32.89    -nan    7.04 a.c:38
   25.14    -nan    0.00 a.c:34
   16.26    -nan   56.34 a.c:31
   15.88    -nan    1.41 a.c:37
    5.67    -nan    0.00 a.c:39
    1.13    -nan   35.21 a.c:26
    0.95    -nan    0.00 a.c:44
    0.57    -nan    0.00 a.c:32
   Percent                 |      Source code & Disassembly of a.out for cycles (529 samples)
  -----------------------------------------------------------------------------------------
                         :
  ...

   a.c:26    0.57    -nan    4.23 :         40081a:       mov    %edi,-0x24(%rbp)
   a.c:26    0.00    -nan    9.86 :         40081d:       mov    %rsi,-0x30(%rbp)

  ...

However, if a sum of samples is zero (e.g. 'page-faults'),
skip calculating percent.

After:

    $ perf annotate --stdio -l

  Sorted summary for file /home/taeung/workspace/a.out
  ----------------------------------------------

   32.89    0.00    7.04 a.c:38
   25.14    0.00    0.00 a.c:34
   16.26    0.00   56.34 a.c:31
   15.88    0.00    1.41 a.c:37
    5.67    0.00    0.00 a.c:39
    1.13    0.00   35.21 a.c:26
    0.95    0.00    0.00 a.c:44
    0.57    0.00    0.00 a.c:32
   Percent                 |      Source code & Disassembly of old for cycles (529 samples)
  -----------------------------------------------------------------------------------------
                         :
  ...

  a.c:26    0.57    0.00    4.23 :         40081a:       mov    %edi,-0x24(%rbp)
  a.c:26    0.00    0.00    9.86 :         40081d:       mov    %rsi,-0x30(%rbp)

  ...

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1490598638-13947-3-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-27 15:04:56 -03:00
6ebd2547dd perf annotate: Fix a bug following symbolic link of a build-id file
It is wrong way to read link name from a build-id file.  Because a
build-id file is not anymore a symbolic link but build-id directory of
it is symbolic link, so fix it.

For example, if build-id file name gotten from
dso__build_id_filename() is as below,

  /root/.debug/.build-id/4f/75c7d197c951659d1c1b8b5fd49bcdf8f3f8b1/elf

To correctly read link name of build-id, use the build-id dir path that
is a symbolic link, instead of the above build-id file name like below.

  /root/.debug/.build-id/4f/75c7d197c951659d1c1b8b5fd49bcdf8f3f8b1

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1490598638-13947-2-git-send-email-treeze.taeung@gmail.com
Fixes: 01412261d994 ("perf buildid-cache: Use path/to/bin/buildid/elf instead of path/to/bin/buildid")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-27 14:58:20 -03:00
5dfa210e40 perf report: Enable sorting by srcline as key
Often it is interesting to know how costly a given source line is in
total. Previously, one had to build these sums manually based on all
addresses that pointed to the same source line. This patch introduces
srcline as a sort key, which will do the aggregation for us.

Paired with the recent addition of showing inline frames, this makes
perf report much more useful for many C++ work loads.

The following shows the new feature in action. First, let's show the
status quo output when we sort by address. The result contains many hist
entries that generate the same output:

  ~~~~~~~~~~~~~~~~
  $ perf report --stdio --inline -g address
  # Children      Self  Command       Shared Object        Symbol
  # ........  ........  ............  ...................  .........................................
  #
      99.89%    35.34%  cpp-inlining  cpp-inlining         [.] main
            |
            |--64.55%--main complex:655
            |          /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
            |          /usr/include/c++/6.3.1/complex:664 (inline)
            |          |
            |          |--60.31%--hypot +20
            |          |          |
            |          |          |--8.52%--__hypot_finite +273
            |          |          |
            |          |          |--7.32%--__hypot_finite +411
...
             --35.34%--_start +4194346
                       __libc_start_main +241
                       |
                       |--6.65%--main random.tcc:3326
                       |          /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.h:185 (inline)
                       |
                       |--2.70%--main random.tcc:3326
                       |          /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.h:185 (inline)
                       |
                       |--1.69%--main random.tcc:3326
                       |          /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.h:185 (inline)
  ...
  ~~~~~~~~~~~~~~~~

With this patch and `-g srcline` we instead get the following output:

  ~~~~~~~~~~~~~~~~
  $ perf report --stdio --inline -g srcline
  # Children      Self  Command       Shared Object        Symbol
  # ........  ........  ............  ...................  .........................................
  #
      99.89%    35.34%  cpp-inlining  cpp-inlining         [.] main
            |
            |--64.55%--main complex:655
            |          /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
            |          /usr/include/c++/6.3.1/complex:664 (inline)
            |          |
            |          |--64.02%--hypot
            |          |          |
            |          |           --59.81%--__hypot_finite
            |          |
            |           --0.53%--cabs
            |
             --35.34%--_start
                       __libc_start_main
                       |
                       |--12.48%--main random.tcc:3326
                       |          /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
                       |          /usr/include/c++/6.3.1/bits/random.h:185 (inline)
  ...
  ~~~~~~~~~~~~~~~~

Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Yao Jin <yao.jin@linux.intel.com>
Link: http://lkml.kernel.org/r/20170318214928.9047-1-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-27 12:13:28 -03:00
0d3eb0b778 perf report: Show inline stack for browser mode
If the address belongs to an inlined function, the source information
back to the first non-inlined function will be printed.

For example:

1. Show inlined function name
   perf report -g function --inline

-    0.69%     0.00%  inline   ld-2.23.so           [.] dl_main
   - dl_main
        0.56% _dl_relocate_object
         _dl_relocate_object (inline)
         elf_dynamic_do_Rela (inline)

2. Show the file/line information
   perf report -g address --inline

-    0.69%     0.00%  inline   ld-2.23.so           [.] _dl_start
     _dl_start rtld.c:307
      /build/glibc-GKVZIf/glibc-2.23/elf/rtld.c:413 (inline)
   + _dl_sysdep_start dl-sysdep.c:250

Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/1490474069-15823-6-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-27 12:12:59 -03:00