30384 Commits

Author SHA1 Message Date
cfd7092c31 perf test session topology: Fix test to skip the test in guest environment
The session topology test fails in powerpc pSeries platform.

Test logs:

  <<>>
  Session topology : FAILED!
  <<>>

This testcases tests cpu topology by checking the core_id and socket_id
stored in perf_env from perf session. The data from perf session is
compared with the cpu topology information from
"/sys/devices/system/cpu/cpuX/topology" like core_id,
physical_package_id.

In case of virtual environment, detail like physical_package_id is
restricted to be exposed. Hence physical_package_id is set to -1. The
testcase fails on such platforms since socket_id can't be fetched from
topology info.

Skip the testcase in powerpc if physical_package_id returns -1.

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>---
Tested-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20220511114959.84002-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-21 14:52:33 -03:00
f8ac1c4784 perf bench numa: Address compiler error on s390
The compilation on s390 results in this error:

  # make DEBUG=y bench/numa.o
  ...
  bench/numa.c: In function ‘__bench_numa’:
  bench/numa.c:1749:81: error: ‘%d’ directive output may be truncated
              writing between 1 and 11 bytes into a region of size between
              10 and 20 [-Werror=format-truncation=]
  1749 |        snprintf(tname, sizeof(tname), "process%d:thread%d", p, t);
                                                               ^~
  ...
  bench/numa.c:1749:64: note: directive argument in the range
                 [-2147483647, 2147483646]
  ...
  #

The maximum length of the %d replacement is 11 characters because of the
negative sign.  Therefore extend the array by two more characters.

Output after:

  # make  DEBUG=y bench/numa.o > /dev/null 2>&1; ll bench/numa.o
  -rw-r--r-- 1 root root 418320 May 19 09:11 bench/numa.o
  #

Fixes: 3aff8ba0a4c9c919 ("perf bench numa: Avoid possible truncation when using snprintf()")
Suggested-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20220520081158.2990006-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-21 14:45:19 -03:00
caaaa55477 perf test: Avoid shell test description infinite loop
for_each_shell_test() is already strict in expecting tests to be files
and executable. It is sometimes possible when it iterates over all files
that it finds one that is executable and lacks a newline character. When
this happens the loop never terminates as it doesn't check for EOF.

Add the EOF check to make this loop at least bounded by the file size.

If the description is returned as NULL then also skip the test.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Marco Elver <elver@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220517204144.645913-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-21 14:45:19 -03:00
01b28e4a58 perf regs x86: Fix arch__intr_reg_mask() for the hybrid platform
The X86 specific arch__intr_reg_mask() is to check whether the kernel
and hardware can collect XMM registers. But it doesn't work on some
hybrid platform.

Without the patch on ADL-N:

  $ perf record -I?
  available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10
  R11 R12 R13 R14 R15

The config of the test event doesn't contain the PMU information. The
kernel may fail to initialize it on the correct hybrid PMU and return
the wrong non-supported information.

Add the PMU information into the config for the hybrid platform. The
same register set is supported among different hybrid PMUs. Checking
the first available one is good enough.

With the patch on ADL-N:

  $ perf record -I?
  available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10
  R11 R12 R13 R14 R15 XMM0 XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 XMM9
  XMM10 XMM11 XMM12 XMM13 XMM14 XMM15

Fixes: 6466ec14aaf44ff1 ("perf regs x86: Add X86 specific arch__intr_reg_mask()")
Reported-by: Ammy Yi <ammy.yi@intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220518145125.1494156-1-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-21 14:45:19 -03:00
451ed8058c perf test: Fix "all PMU test" to skip hv_24x7/hv_gpci tests on powerpc
"perf all PMU test" picks the input events from "perf list --raw-dump
pmu" list and runs "perf stat -e" for each of the event in the list. In
case of powerpc, the PowerVM environment supports events from hv_24x7
and hv_gpci PMU which is of example format like below:

- hv_24x7/CPM_ADJUNCT_INST,domain=?,core=?/
- hv_gpci/event,partition_id=?/

The value for "?" needs to be filled in depending on system and
respective event. CPM_ADJUNCT_INST needs have core value and domain
value. hv_gpci event needs partition_id.  Similarly, there are other
events for hv_24x7 and hv_gpci having "?" in event format. Hence skip
these events on powerpc platform since values like partition_id, domain
is specific to system and event.

Fixes: 3d5ac9effcc640d5 ("perf test: Workload test of all PMUs")
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Link: https://lore.kernel.org/r/20220520101236.17249-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-21 14:45:06 -03:00
027bbb884b KVM: x86/speculation: Disable Fill buffer clear within guests
The enumeration of MD_CLEAR in CPUID(EAX=7,ECX=0).EDX{bit 10} is not an
accurate indicator on all CPUs of whether the VERW instruction will
overwrite fill buffers. FB_CLEAR enumeration in
IA32_ARCH_CAPABILITIES{bit 17} covers the case of CPUs that are not
vulnerable to MDS/TAA, indicating that microcode does overwrite fill
buffers.

Guests running in VMM environments may not be aware of all the
capabilities/vulnerabilities of the host CPU. Specifically, a guest may
apply MDS/TAA mitigations when a virtual CPU is enumerated as vulnerable
to MDS/TAA even when the physical CPU is not. On CPUs that enumerate
FB_CLEAR_CTRL the VMM may set FB_CLEAR_DIS to skip overwriting of fill
buffers by the VERW instruction. This is done by setting FB_CLEAR_DIS
during VMENTER and resetting on VMEXIT. For guests that enumerate
FB_CLEAR (explicitly asking for fill buffer clear capability) the VMM
will not use FB_CLEAR_DIS.

Irrespective of guest state, host overwrites CPU buffers before VMENTER
to protect itself from an MMIO capable guest, as part of mitigation for
MMIO Stale Data vulnerabilities.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
2022-05-21 12:41:35 +02:00
5180218615 x86/speculation/mmio: Enumerate Processor MMIO Stale Data bug
Processor MMIO Stale Data is a class of vulnerabilities that may
expose data after an MMIO operation. For more details please refer to
Documentation/admin-guide/hw-vuln/processor_mmio_stale_data.rst

Add the Processor MMIO Stale Data bug enumeration. A microcode update
adds new bits to the MSR IA32_ARCH_CAPABILITIES, define them.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
2022-05-21 12:14:30 +02:00
6c3f5bec9b ARM:
* Correctly expose GICv3 support even if no irqchip is created
   so that userspace doesn't observe it changing pointlessly
   (fixing a regression with QEMU)
 
 * Don't issue a hypercall to set the id-mapped vectors when
   protected mode is enabled (fix for pKVM in combination with
   CPUs affected by Spectre-v3a)
 
 x86: Five oneliners, of which the most interesting two are:
 
 * a NULL pointer dereference on INVPCID executed with
   paging disabled, but only if KVM is using shadow paging
 
 * an incorrect bsearch comparison function which could truncate
   the result and apply PMU event filtering incorrectly.  This one
   comes with a selftests update too.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmKH1qYUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroMadgf9E1u5skRjtv+RWPbfs/v3irnirY/L
 x5TaVb2yiPahNH5qgFL2xnJ9jCcCNlxxn5uKpEAi0JFrqc6uCS0Rh1TPfqEN0lLt
 5PGJD2JSKXAWVRkObS3j5iZuQX4ZvDRY53eSQv6pdcU+evjTq1H5WZ83uciqo0J5
 xilKEtUIpJ9o0ELw9BjAd3vlRlOPpveHq+48DJN7cO0L/eju9Lz9kqJQTE7WQato
 SsmpXPNIaSlk3U3yWAfOYgzyVkZQW/JiKS++TfVr5VQMppbOI6bxo3UfDAygiA9e
 9KZAWrwoXqDMNp9756Y6lfT7g8PblnXgOvTXa/cV+ypaeAuuTU/iBSLwxQ==
 =gWsP
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "ARM:

   - Correctly expose GICv3 support even if no irqchip is created so
     that userspace doesn't observe it changing pointlessly (fixing a
     regression with QEMU)

   - Don't issue a hypercall to set the id-mapped vectors when protected
     mode is enabled (fix for pKVM in combination with CPUs affected by
     Spectre-v3a)

  x86 (five oneliners, of which the most interesting two are):

   - a NULL pointer dereference on INVPCID executed with paging
     disabled, but only if KVM is using shadow paging

   - an incorrect bsearch comparison function which could truncate the
     result and apply PMU event filtering incorrectly. This one comes
     with a selftests update too"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86/mmu: fix NULL pointer dereference on guest INVPCID
  KVM: x86: hyper-v: fix type of valid_bank_mask
  KVM: Free new dirty bitmap if creating a new memslot fails
  KVM: eventfd: Fix false positive RCU usage warning
  selftests: kvm/x86: Verify the pmu event filter matches the correct event
  selftests: kvm/x86: Add the helper function create_pmu_event_filter
  kvm: x86/pmu: Fix the compare function used by the pmu event filter
  KVM: arm64: Don't hypercall before EL2 init
  KVM: arm64: vgic-v3: Consistently populate ID_AA64PFR0_EL1.GIC
  KVM: x86/mmu: Update number of zapped pages even if page list is stable
2022-05-20 20:34:59 -10:00
90a039fd19 selftests/bpf: add tests verifying unprivileged bpf behaviour
tests load/attach bpf prog with maps, perfbuf and ringbuf, pinning
them.  Then effective caps are dropped and we verify we can

- pick up the pin
- create ringbuf/perfbuf
- get ringbuf/perfbuf events, carry out map update, lookup and delete
- create a link

Negative testing also ensures

- BPF prog load fails
- BPF map create fails
- get fd by id fails
- get next id fails
- query fails
- BTF load fails

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/1652970334-30510-3-git-send-email-alan.maguire@oracle.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-05-20 19:54:34 -07:00
538aaf9b23 selftests: Add test for timing a bind request to a port with a populated bhash entry
This test populates the bhash table for a given port with
MAX_THREADS * MAX_CONNECTIONS sockets, and then times how long
a bind request on the port takes.

When populating the bhash table, we create the sockets and then bind
the sockets to the same address and port (SO_REUSEADDR and SO_REUSEPORT
are set). When timing how long a bind on the port takes, we bind on a
different address without SO_REUSEPORT set. We do not set SO_REUSEPORT
because we are interested in the case where the bind request does not
go through the tb->fastreuseport path, which is fragile (eg
tb->fastreuseport path does not work if binding with a different uid).

To run the test locally, I did:
* ulimit -n 65535000
* ip addr add 2001:0db8:0:f101::1 dev eth0
* ./bind_bhash_test 443

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-20 18:16:24 -07:00
5feba47273 selftests: fib_nexthops: Make ping timeout configurable
Commit 49bb39bddad2 ("selftests: fib_nexthops: Make the test more robust")
increased the timeout of ping commands to 5 seconds, to make the test
more robust. Make the timeout configurable using '-w' argument to allow
user to change it depending on the system that runs the test. Some systems
suffer from slow forwarding performance, so they may need to change the
timeout.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20220519070921.3559701-1-amcohen@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-20 17:21:37 -07:00
2dc323b1c4 selftests/bpf: Remove filtered subtests from output
Currently filtered subtests show up in the output as skipped.

Before:
$ sudo ./test_progs -t log_fixup/missing_map
 #94 /1     log_fixup/bad_core_relo_trunc_none:SKIP
 #94 /2     log_fixup/bad_core_relo_trunc_partial:SKIP
 #94 /3     log_fixup/bad_core_relo_trunc_full:SKIP
 #94 /4     log_fixup/bad_core_relo_subprog:SKIP
 #94 /5     log_fixup/missing_map:OK
 #94        log_fixup:OK
Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED

After:
$ sudo ./test_progs -t log_fixup/missing_map
 #94 /5     log_fixup/missing_map:OK
 #94        log_fixup:OK
Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Mykola Lysenko <mykolal@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220520061303.4004808-1-mykolal@fb.com
2022-05-20 16:25:29 -07:00
fa37686065 selftests/bpf: Fix subtest number formatting in test_progs
Remove weird spaces around / while preserving proper
indentation

Signed-off-by: Mykola Lysenko <mykolal@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Daniel Müller <deso@posteo.net>
Link: https://lore.kernel.org/bpf/20220520070144.10312-1-mykolal@fb.com
2022-05-20 16:23:14 -07:00
b23316aabf selftests/bpf: Add missing trampoline program type to trampoline_count test
Currently the trampoline_count test doesn't include any fmod_ret bpf
programs, fix it to make the test cover all possible trampoline program
types.

Since fmod_ret bpf programs can't be attached to __set_task_comm function,
as it's neither whitelisted for error injection nor a security hook, change
it to bpf_modify_return_test.

This patch also does some other cleanups such as removing duplicate code,
dropping inconsistent comments, etc.

Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220519150610.601313-1-ytcoode@gmail.com
2022-05-20 16:12:14 -07:00
4f90d034bb selftests/bpf: Verify first of struct mptcp_sock
This patch verifies the 'first' struct member of struct mptcp_sock, which
points to the first subflow of msk. Save 'sk' in mptcp_storage, and verify
it with 'first' in verify_msk().

v5:
 - Use ASSERT_EQ() instead of a manual comparison + log (Andrii).

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Link: https://lore.kernel.org/bpf/20220519233016.105670-8-mathew.j.martineau@linux.intel.com
2022-05-20 15:36:48 -07:00
ccc090f469 selftests/bpf: Verify ca_name of struct mptcp_sock
This patch verifies another member of struct mptcp_sock, ca_name. Add a
new function get_msk_ca_name() to read the sysctl tcp_congestion_control
and verify it in verify_msk().

v3: Access the sysctl through the filesystem to avoid compatibility
    issues with the busybox sysctl command.

v4: use ASSERT_* instead of CHECK_FAIL (Andrii)

v5: use ASSERT_STRNEQ() instead of strncmp() (Andrii)

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Link: https://lore.kernel.org/bpf/20220519233016.105670-7-mathew.j.martineau@linux.intel.com
2022-05-20 15:36:08 -07:00
0266223467 selftests/bpf: Verify token of struct mptcp_sock
This patch verifies the struct member token of struct mptcp_sock. Add a
new member token in struct mptcp_storage to store the token value of the
msk socket got by bpf_skc_to_mptcp_sock(). Trace the kernel function
mptcp_pm_new_connection() by using bpf fentry prog to obtain the msk token
and save it in a global bpf variable. Pass the variable to verify_msk() to
verify it with the token saved in socket_storage_map.

v4:
 - use ASSERT_* instead of CHECK_FAIL (Andrii)
 - skip the test if 'ip mptcp monitor' is not supported (Mat)

v5:
 - Drop 'ip mptcp monitor', trace mptcp_pm_new_connection instead (Martin)
 - Use ASSERT_EQ (Andrii)

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Link: https://lore.kernel.org/bpf/20220519233016.105670-6-mathew.j.martineau@linux.intel.com
2022-05-20 15:35:00 -07:00
3bc48b56e3 selftests/bpf: Test bpf_skc_to_mptcp_sock
This patch extends the MPTCP test base, to test the new helper
bpf_skc_to_mptcp_sock().

Define struct mptcp_sock in bpf_tcp_helpers.h, use bpf_skc_to_mptcp_sock
to get the msk socket in progs/mptcp_sock.c and store the infos in
socket_storage_map.

Get the infos from socket_storage_map in prog_tests/mptcp.c. Add a new
function verify_msk() to verify the infos of MPTCP socket, and rename
verify_sk() to verify_tsk() to verify TCP socket only.

v2: Add CONFIG_MPTCP check for clearer error messages

v4:
 - use ASSERT_* instead of CHECK_FAIL (Andrii)
 - drop bpf_mptcp_helpers.h (Andrii)

v5:
 - some 'ASSERT_*' were replaced in the next commit by mistake.
 - Drop CONFIG_MPTCP (Martin)
 - Use ASSERT_EQ (Andrii)

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Link: https://lore.kernel.org/bpf/20220519233016.105670-5-mathew.j.martineau@linux.intel.com
2022-05-20 15:34:39 -07:00
8039d35321 selftests/bpf: Add MPTCP test base
This patch adds a base for MPTCP specific tests.

It is currently limited to the is_mptcp field in case of plain TCP
connection because there is no easy way to get the subflow sk from a msk
in userspace. This implies that we cannot lookup the sk_storage attached
to the subflow sk in the sockops program.

v4:
 - add copyright 2022 (Andrii)
 - use ASSERT_* instead of CHECK_FAIL (Andrii)
 - drop SEC("version") (Andrii)
 - use is_mptcp in tcp_sock, instead of bpf_tcp_sock (Martin & Andrii)

v5:
 - Drop connect_to_mptcp_fd (Martin)
 - Use BPF test skeleton (Andrii)
 - Use ASSERT_EQ (Andrii)
 - Drop the 'msg' parameter of verify_sk

Co-developed-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Nicolas Rybowski <nicolas.rybowski@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Link: https://lore.kernel.org/bpf/20220519233016.105670-4-mathew.j.martineau@linux.intel.com
2022-05-20 15:33:23 -07:00
d3294cb1e0 selftests/bpf: Enable CONFIG_IKCONFIG_PROC in config
CONFIG_IKCONFIG_PROC is required by BPF selftests, otherwise we get
errors like this:

 libbpf: failed to open system Kconfig
 libbpf: failed to load object 'kprobe_multi'
 libbpf: failed to load BPF skeleton 'kprobe_multi': -22

It's because /proc/config.gz is opened in bpf_object__read_kconfig_file()
in tools/lib/bpf/libbpf.c:

        file = gzopen("/proc/config.gz", "r");

So this patch enables CONFIG_IKCONFIG and CONFIG_IKCONFIG_PROC in
tools/testing/selftests/bpf/config.

Suggested-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220519233016.105670-3-mathew.j.martineau@linux.intel.com
2022-05-20 15:29:00 -07:00
3bc253c2e6 bpf: Add bpf_skc_to_mptcp_sock_proto
This patch implements a new struct bpf_func_proto, named
bpf_skc_to_mptcp_sock_proto. Define a new bpf_id BTF_SOCK_TYPE_MPTCP,
and a new helper bpf_skc_to_mptcp_sock(), which invokes another new
helper bpf_mptcp_sock_from_subflow() in net/mptcp/bpf.c to get struct
mptcp_sock from a given subflow socket.

v2: Emit BTF type, add func_id checks in verifier.c and bpf_trace.c,
remove build check for CONFIG_BPF_JIT
v5: Drop EXPORT_SYMBOL (Martin)

Co-developed-by: Nicolas Rybowski <nicolas.rybowski@tessares.net>
Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Nicolas Rybowski <nicolas.rybowski@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220519233016.105670-2-mathew.j.martineau@linux.intel.com
2022-05-20 15:29:00 -07:00
7aa424e02a selftests/bpf: Fix some bugs in map_lookup_percpu_elem testcase
comments from Andrii Nakryiko, details in here:
https://lore.kernel.org/lkml/20220511093854.411-1-zhoufeng.zf@bytedance.com/T/

use /* */ instead of //
use libbpf_num_possible_cpus() instead of sysconf(_SC_NPROCESSORS_ONLN)
use 8 bytes for value size
fix memory leak
use ASSERT_EQ instead of ASSERT_OK
add bpf_loop to fetch values on each possible CPU

Fixes: ed7c13776e20c74486b0939a3c1de984c5efb6aa ("selftests/bpf: add test case for bpf_map_lookup_percpu_elem")
Signed-off-by: Feng Zhou <zhoufeng.zf@bytedance.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220518025053.20492-1-zhoufeng.zf@bytedance.com
2022-05-20 15:07:41 -07:00
c71159648c KVM: s390: selftest: Test suppression indication on key prot exception
Check that suppression is not indicated on injection of a key checked
protection exception caused by a memop after it already modified guest
memory, as that violates the definition of suppression.

Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Link: https://lore.kernel.org/r/20220512131019.2594948-3-scgl@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2022-05-20 16:38:42 +02:00
cbac924200 selftests: drivers/s390x: Add uvdevice tests
Adds some selftests to test ioctl error paths of the uv-uapi.
The Kconfig S390_UV_UAPI must be selected and the Ultravisor facility
must be available. The test can be executed by non-root, however, the
uvdevice special file /dev/uv must be accessible for reading and
writing which may imply root privileges.

  ./test-uv-device
  TAP version 13
  1..6
  # Starting 6 tests from 3 test cases.
  #  RUN           uvio_fixture.att.fault_ioctl_arg ...
  #            OK  uvio_fixture.att.fault_ioctl_arg
  ok 1 uvio_fixture.att.fault_ioctl_arg
  #  RUN           uvio_fixture.att.fault_uvio_arg ...
  #            OK  uvio_fixture.att.fault_uvio_arg
  ok 2 uvio_fixture.att.fault_uvio_arg
  #  RUN           uvio_fixture.att.inval_ioctl_cb ...
  #            OK  uvio_fixture.att.inval_ioctl_cb
  ok 3 uvio_fixture.att.inval_ioctl_cb
  #  RUN           uvio_fixture.att.inval_ioctl_cmd ...
  #            OK  uvio_fixture.att.inval_ioctl_cmd
  ok 4 uvio_fixture.att.inval_ioctl_cmd
  #  RUN           attest_fixture.att_inval_request ...
  #            OK  attest_fixture.att_inval_request
  ok 5 attest_fixture.att_inval_request
  #  RUN           attest_fixture.att_inval_addr ...
  #            OK  attest_fixture.att_inval_addr
  ok 6 attest_fixture.att_inval_addr
  # PASSED: 6 / 6 tests passed.
  # Totals: pass:6 fail:0 xfail:0 xpass:0 skip:0 error:0

Signed-off-by: Steffen Eiden <seiden@linux.ibm.com>
Acked-by: Janosch Frank <frankja@linux.ibm.com>
Message-Id: <20220510144724.3321985-3-seiden@linux.ibm.com>
Link: https://lore.kernel.org/kvm/20220510144724.3321985-3-seiden@linux.ibm.com/
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2022-05-20 16:38:42 +02:00
e0e14cdff3 perf parse-events: Move slots event for the hybrid platform too
The commit 94dbfd6781a0e87b ("perf parse-events: Architecture specific
leader override") introduced a feature to reorder the slots event to
fulfill the restriction of the perf metrics topdown group. But the
feature doesn't work on the hybrid machine.

  $ perf stat -e "{cpu_core/instructions/,cpu_core/slots/,cpu_core/topdown-retiring/}" -a sleep 1

   Performance counter stats for 'system wide':

       <not counted>      cpu_core/instructions/
       <not counted>      cpu_core/slots/
     <not supported>      cpu_core/topdown-retiring/

         1.002871801 seconds time elapsed

A hybrid platform has a different PMU name for the core PMUs, while
current perf hard code the PMU name "cpu".

Introduce a new function to check whether the system supports the perf
metrics feature. The result is cached for the future usage.

For X86, the core PMU name always has "cpu" prefix.

With the patch:

  $ perf stat -e "{cpu_core/instructions/,cpu_core/slots/,cpu_core/topdown-retiring/}" -a sleep 1

   Performance counter stats for 'system wide':

          76,337,010      cpu_core/slots/
          10,416,809      cpu_core/instructions/
          11,692,372      cpu_core/topdown-retiring/

         1.002805453 seconds time elapsed

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220518143900.1493980-5-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-20 11:13:37 -03:00
e7d1374ed5 perf parse-events: Support different format of the topdown event name
The evsel->name may have a different format for a topdown event, a pure
topdown name (e.g., topdown-fe-bound), or a PMU name + a topdown name
(e.g., cpu/topdown-fe-bound/). The cpu/topdown-fe-bound/ kind format
isn't supported by the arch_evlist__leader(). This format is a very
common format for a hybrid platform, which requires specifying the PMU
name for each event.

Without the patch,

  $ perf stat -e '{instructions,slots,cpu/topdown-fe-bound/}' -a sleep 1

   Performance counter stats for 'system wide':

       <not counted>      instructions
       <not counted>      slots
     <not supported>      cpu/topdown-fe-bound/

         1.003482041 seconds time elapsed

  Some events weren't counted. Try disabling the NMI watchdog:
          echo 0 > /proc/sys/kernel/nmi_watchdog
          perf stat ...
          echo 1 > /proc/sys/kernel/nmi_watchdog
  The events in group usually have to be from the same PMU. Try reorganizing the group.

With the patch,

  $ perf stat -e '{instructions,slots,cpu/topdown-fe-bound/}' -a sleep 1

  Performance counter stats for 'system wide':

         157,383,996      slots
          25,011,711      instructions
          27,441,686      cpu/topdown-fe-bound/

         1.003530890 seconds time elapsed

Fixes: bc355822f0d9623b ("perf parse-events: Move slots only with topdown")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220518143900.1493980-4-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-20 11:12:57 -03:00
e8f4f794d7 perf stat: Always keep perf metrics topdown events in a group
If any member in a group has a different cpu mask than the other
members, the current perf stat disables group. when the perf metrics
topdown events are part of the group, the below <not supported> error
will be triggered.

  $ perf stat -e "{slots,topdown-retiring,uncore_imc_free_running_0/dclk/}" -a sleep 1
  WARNING: grouped events cpus do not match, disabling group:
    anon group { slots, topdown-retiring, uncore_imc_free_running_0/dclk/ }

   Performance counter stats for 'system wide':

         141,465,174      slots
     <not supported>      topdown-retiring
       1,605,330,334      uncore_imc_free_running_0/dclk/

The perf metrics topdown events must always be grouped with a slots
event as leader.

Factor out evsel__remove_from_group() to only remove the regular events
from the group.

Remove evsel__must_be_in_group(), since no one use it anymore.

With the patch, the topdown events aren't broken from the group for the
splitting.

  $ perf stat -e "{slots,topdown-retiring,uncore_imc_free_running_0/dclk/}" -a sleep 1
  WARNING: grouped events cpus do not match, disabling group:
    anon group { slots, topdown-retiring, uncore_imc_free_running_0/dclk/ }

   Performance counter stats for 'system wide':

         346,110,588      slots
         124,608,256      topdown-retiring
       1,606,869,976      uncore_imc_free_running_0/dclk/

         1.003877592 seconds time elapsed

Fixes: a9a1790247bdcf3b ("perf stat: Ensure group is defined on top of the same cpu mask")
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220518143900.1493980-3-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-20 11:11:13 -03:00
39d5f412da perf evsel: Fixes topdown events in a weak group for the hybrid platform
The patch ("perf evlist: Keep topdown counters in weak group") fixes the
perf metrics topdown event issue when the topdown events are in a weak
group on a non-hybrid platform. However, it doesn't work for the hybrid
platform.

  $./perf stat -e '{cpu_core/slots/,cpu_core/topdown-bad-spec/,
  cpu_core/topdown-be-bound/,cpu_core/topdown-fe-bound/,
  cpu_core/topdown-retiring/,cpu_core/branch-instructions/,
  cpu_core/branch-misses/,cpu_core/bus-cycles/,cpu_core/cache-misses/,
  cpu_core/cache-references/,cpu_core/cpu-cycles/,cpu_core/instructions/,
  cpu_core/mem-loads/,cpu_core/mem-stores/,cpu_core/ref-cycles/,
  cpu_core/cache-misses/,cpu_core/cache-references/}:W' -a sleep 1

  Performance counter stats for 'system wide':

       751,765,068      cpu_core/slots/                        (84.07%)
   <not supported>      cpu_core/topdown-bad-spec/
   <not supported>      cpu_core/topdown-be-bound/
   <not supported>      cpu_core/topdown-fe-bound/
   <not supported>      cpu_core/topdown-retiring/
        12,398,197      cpu_core/branch-instructions/          (84.07%)
         1,054,218      cpu_core/branch-misses/                (84.24%)
       539,764,637      cpu_core/bus-cycles/                   (84.64%)
            14,683      cpu_core/cache-misses/                 (84.87%)
         7,277,809      cpu_core/cache-references/             (77.30%)
       222,299,439      cpu_core/cpu-cycles/                   (77.28%)
        63,661,714      cpu_core/instructions/                 (84.85%)
                 0      cpu_core/mem-loads/                    (77.29%)
        12,271,725      cpu_core/mem-stores/                   (77.30%)
       542,241,102      cpu_core/ref-cycles/                   (84.85%)
             8,854      cpu_core/cache-misses/                 (76.71%)
         7,179,013      cpu_core/cache-references/             (76.31%)

         1.003245250 seconds time elapsed

A hybrid platform has a different PMU name for the core PMUs, while
the current perf hard code the PMU name "cpu".

The evsel->pmu_name can be used to replace the "cpu" to fix the issue.
For a hybrid platform, the pmu_name must be non-NULL. Because there are
at least two core PMUs. The PMU has to be specified.
For a non-hybrid platform, the pmu_name may be NULL. Because there is
only one core PMU, "cpu". For a NULL pmu_name, we can safely assume that
it is a "cpu" PMU.

In case other PMUs also define the "slots" event, checking the PMU type
as well.

With the patch,

  $ perf stat -e '{cpu_core/slots/,cpu_core/topdown-bad-spec/,
  cpu_core/topdown-be-bound/,cpu_core/topdown-fe-bound/,
  cpu_core/topdown-retiring/,cpu_core/branch-instructions/,
  cpu_core/branch-misses/,cpu_core/bus-cycles/,cpu_core/cache-misses/,
  cpu_core/cache-references/,cpu_core/cpu-cycles/,cpu_core/instructions/,
  cpu_core/mem-loads/,cpu_core/mem-stores/,cpu_core/ref-cycles/,
  cpu_core/cache-misses/,cpu_core/cache-references/}:W' -a sleep 1

  Performance counter stats for 'system wide':

     766,620,266   cpu_core/slots/                                        (84.06%)
      73,172,129   cpu_core/topdown-bad-spec/ #    9.5% bad speculation   (84.06%)
     193,443,341   cpu_core/topdown-be-bound/ #    25.0% backend bound    (84.06%)
     403,940,929   cpu_core/topdown-fe-bound/ #    52.3% frontend bound   (84.06%)
     102,070,237   cpu_core/topdown-retiring/ #    13.2% retiring         (84.06%)
      12,364,429   cpu_core/branch-instructions/                          (84.03%)
       1,080,124   cpu_core/branch-misses/                                (84.24%)
     564,120,383   cpu_core/bus-cycles/                                   (84.65%)
          36,979   cpu_core/cache-misses/                                 (84.86%)
       7,298,094   cpu_core/cache-references/                             (77.30%)
     227,174,372   cpu_core/cpu-cycles/                                   (77.31%)
      63,886,523   cpu_core/instructions/                                 (84.87%)
               0   cpu_core/mem-loads/                                    (77.31%)
      12,208,782   cpu_core/mem-stores/                                   (77.31%)
     566,409,738   cpu_core/ref-cycles/                                   (84.87%)
          23,118   cpu_core/cache-misses/                                 (76.71%)
       7,212,602   cpu_core/cache-references/                             (76.29%)

       1.003228667 seconds time elapsed

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220518143900.1493980-2-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-20 11:09:41 -03:00
92d579ea32 perf stat: Fix and validate CPU map inputs in synthetic PERF_RECORD_STAT events
Stat events can come from disk and so need a degree of validation. They
contain a CPU which needs looking up via CPU map to access a counter.

Add the CPU to index translation, alongside validity checking.

Discussion thread:

  https://lore.kernel.org/linux-perf-users/CAP-5=fWQR=sCuiSMktvUtcbOLidEpUJLCybVF6=BRvORcDOq+g@mail.gmail.com/

Fixes: 7ac0089d138f80dc ("perf evsel: Pass cpu not cpu map index to synthesize")
Reported-by: Michael Petlan <mpetlan@redhat.com>
Suggested-by: Michael Petlan <mpetlan@redhat.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Dave Marchevsky <davemarchevsky@fb.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Lv Ruyi <lv.ruyi@zte.com.cn>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: netdev@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Yonghong Song <yhs@fb.com>
Link: http://lore.kernel.org/lkml/20220519032005.1273691-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-20 10:54:06 -03:00
0ae065a5d2 perf build: Fix check for btf__load_from_kernel_by_id() in libbpf
Avi Kivity reported a problem where the __weak
btf__load_from_kernel_by_id() in tools/perf/util/bpf-event.c was being
used and it called btf__get_from_id() in tools/lib/bpf/btf.c that in
turn called back to btf__load_from_kernel_by_id(), resulting in an
endless loop.

Fix this by adding a feature test to check if
btf__load_from_kernel_by_id() is available when building perf with
LIBBPF_DYNAMIC=1, and if not then provide the fallback to the old
btf__get_from_id(), that doesn't call back to btf__load_from_kernel_by_id()
since at that time it didn't exist at all.

Tested on Fedora 35 where we have libbpf-devel 0.4.0 with LIBBPF_DYNAMIC
where we don't have btf__load_from_kernel_by_id() and thus its feature
test fail, not defining HAVE_LIBBPF_BTF__LOAD_FROM_KERNEL_BY_ID:

  $ cat /tmp/build/perf-urgent/feature/test-libbpf-btf__load_from_kernel_by_id.make.output
  test-libbpf-btf__load_from_kernel_by_id.c: In function ‘main’:
  test-libbpf-btf__load_from_kernel_by_id.c:6:16: error: implicit declaration of function ‘btf__load_from_kernel_by_id’ [-Werror=implicit-function-declaration]
      6 |         return btf__load_from_kernel_by_id(20151128, NULL);
        |                ^~~~~~~~~~~~~~~~~~~~~~~~~~~
  cc1: all warnings being treated as errors
  $

  $ nm /tmp/build/perf-urgent/perf | grep btf__load_from_kernel_by_id
  00000000005ba180 T btf__load_from_kernel_by_id
  $

  $ objdump --disassemble=btf__load_from_kernel_by_id -S /tmp/build/perf-urgent/perf

  /tmp/build/perf-urgent/perf:     file format elf64-x86-64
  <SNIP>
  00000000005ba180 <btf__load_from_kernel_by_id>:
  #include "record.h"
  #include "util/synthetic-events.h"

  #ifndef HAVE_LIBBPF_BTF__LOAD_FROM_KERNEL_BY_ID
  struct btf *btf__load_from_kernel_by_id(__u32 id)
  {
    5ba180:	55                   	push   %rbp
    5ba181:	48 89 e5             	mov    %rsp,%rbp
    5ba184:	48 83 ec 10          	sub    $0x10,%rsp
    5ba188:	64 48 8b 04 25 28 00 	mov    %fs:0x28,%rax
    5ba18f:	00 00
    5ba191:	48 89 45 f8          	mov    %rax,-0x8(%rbp)
    5ba195:	31 c0                	xor    %eax,%eax
         struct btf *btf;
  #pragma GCC diagnostic push
  #pragma GCC diagnostic ignored "-Wdeprecated-declarations"
         int err = btf__get_from_id(id, &btf);
    5ba197:	48 8d 75 f0          	lea    -0x10(%rbp),%rsi
    5ba19b:	e8 a0 57 e5 ff       	call   40f940 <btf__get_from_id@plt>
    5ba1a0:	89 c2                	mov    %eax,%edx
  #pragma GCC diagnostic pop

         return err ? ERR_PTR(err) : btf;
    5ba1a2:	48 98                	cltq
    5ba1a4:	85 d2                	test   %edx,%edx
    5ba1a6:	48 0f 44 45 f0       	cmove  -0x10(%rbp),%rax
  }
  <SNIP>

Fixes: 218e7b775d368f38 ("perf bpf: Provide a weak btf__load_from_kernel_by_id() for older libbpf versions")
Reported-by: Avi Kivity <avi@scylladb.com>
Link: https://lore.kernel.org/linux-perf-users/f0add43b-3de5-20c5-22c4-70aff4af959f@scylladb.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/linux-perf-users/YobjjFOblY4Xvwo7@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-20 09:45:41 -03:00
c41ef29cc1 selftests: kvm/x86: Verify the pmu event filter matches the correct event
Add a test to demonstrate that when the guest programs an event select
it is matched correctly in the pmu event filter and not inadvertently
filtered.  This could happen on AMD if the high nybble[1] in the event
select gets truncated away only leaving the bottom byte[2] left for
matching.

This is a contrived example used for the convenience of demonstrating
this issue, however, this can be applied to event selects 0x28A (OC
Mode Switch) and 0x08A (L1 BTB Correction), where 0x08A could end up
being denied when the event select was only set up to deny 0x28A.

[1] bits 35:32 in the event select register and bits 11:8 in the event
    select.
[2] bits 7:0 in the event select register and bits 7:0 in the event
    select.

Signed-off-by: Aaron Lewis <aaronlewis@google.com>
Message-Id: <20220517051238.2566934-3-aaronlewis@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-05-20 07:06:55 -04:00
04baa2233d selftests: kvm/x86: Add the helper function create_pmu_event_filter
Add a helper function that creates a pmu event filter given an event
list.  Currently, a pmu event filter can only be created with the same
hard coded event list.  Add a way to create one given a different event
list.

Also, rename make_pmu_event_filter to alloc_pmu_event_filter to clarify
it's purpose given the introduction of create_pmu_event_filter.

No functional changes intended.

Signed-off-by: Aaron Lewis <aaronlewis@google.com>
Message-Id: <20220517051238.2566934-2-aaronlewis@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-05-20 07:06:55 -04:00
22682a07ac objtool: Fix objtool regression on x32 systems
Commit c087c6e7b551 ("objtool: Fix type of reloc::addend") failed to
appreciate cross building from ILP32 hosts, where 'int' == 'long' and
the issue persists.

As such, use s64/int64_t/Elf64_Sxword for this field and suffer the
pain that is ISO C99 printf formats for it.

Fixes: c087c6e7b551 ("objtool: Fix type of reloc::addend")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
[peterz: reword changelog, s/long long/s64/]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/alpine.LRH.2.02.2205161041260.11556@file01.intranet.prod.int.rdu2.redhat.com
2022-05-20 12:45:30 +02:00
ead165fa10 objtool: Fix symbol creation
Nathan reported objtool failing with the following messages:

  warning: objtool: no non-local symbols !?
  warning: objtool: gelf_update_symshndx: invalid section index

The problem is due to commit 4abff6d48dbc ("objtool: Fix code relocs
vs weak symbols") failing to consider the case where an object would
have no non-local symbols.

The problem that commit tries to address is adding a STB_LOCAL symbol
to the symbol table in light of the ELF spec's requirement that:

  In each symbol table, all symbols with STB_LOCAL binding preced the
  weak and global symbols.  As ``Sections'' above describes, a symbol
  table section's sh_info section header member holds the symbol table
  index for the first non-local symbol.

The approach taken is to find this first non-local symbol, move that
to the end and then re-use the freed spot to insert a new local symbol
and increment sh_info.

Except it never considered the case of object files without global
symbols and got a whole bunch of details wrong -- so many in fact that
it is a wonder it ever worked :/

Specifically:

 - It failed to re-hash the symbol on the new index, so a subsequent
   find_symbol_by_index() would not find it at the new location and a
   query for the old location would now return a non-deterministic
   choice between the old and new symbol.

 - It failed to appreciate that the GElf wrappers are not a valid disk
   format (it works because GElf is basically Elf64 and we only
   support x86_64 atm.)

 - It failed to fully appreciate how horrible the libelf API really is
   and got the gelf_update_symshndx() call pretty much completely
   wrong; with the direct consequence that if inserting a second
   STB_LOCAL symbol would require moving the same STB_GLOBAL symbol
   again it would completely come unstuck.

Write a new elf_update_symbol() function that wraps all the magic
required to update or create a new symbol at a given index.

Specifically, gelf_update_sym*() require an @ndx argument that is
relative to the @data argument; this means you have to manually
iterate the section data descriptor list and update @ndx.

Fixes: 4abff6d48dbc ("objtool: Fix code relocs vs weak symbols")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/YoPCTEYjoPqE4ZxB@hirez.programming.kicks-ass.net
2022-05-20 12:42:06 +02:00
dba90d6fb8 KVM: selftests: riscv: Remove unneeded semicolon
Fix the following coccicheck warnings:

./tools/testing/selftests/kvm/lib/riscv/processor.c:353:3-4: Unneeded
semicolon.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2022-05-20 09:08:58 +05:30
ac6c85e962 KVM: selftests: riscv: Improve unexpected guest trap handling
Currently, we simply hang using "while (1) ;" upon any unexpected
guest traps because the default guest trap handler is guest_hang().

The above approach is not useful to anyone because KVM selftests
users will only see a hung application upon any unexpected guest
trap.

This patch improves unexpected guest trap handling for KVM RISC-V
selftests by doing the following:
1) Return to host user-space
2) Dump VCPU registers
3) Die using TEST_ASSERT(0, ...)

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Tested-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2022-05-20 09:08:56 +05:30
2ba18161d4 selftests: mptcp: add MP_FAIL reset testcase
Add the multiple subflows test case for MP_FAIL, to test the MP_FAIL
reset case. Use the test_linkfail value to make 1024KB test files.

Invoke reset_with_fail() to use 'iptables' and 'tc action pedit' rules
to produce the bit flips to trigger the checksum failures on ns2eth2.
Add delays on ns2eth1 to make sure more data can translate on ns2eth2.

The check_invert flag is enabled in reset_with_fail(), so this test
prints out the inverted bytes, instead of the file mismatch errors.

Invoke pedit_action_pkts() to get the numbers of the packets edited
by the tc pedit actions, and print this numbers to the output.

Co-developed-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-19 20:05:07 -07:00
d7e6f58360 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
drivers/net/ethernet/mellanox/mlx5/core/main.c
  b33886971dbc ("net/mlx5: Initialize flow steering during driver probe")
  40379a0084c2 ("net/mlx5_fpga: Drop INNOVA TLS support")
  f2b41b32cde8 ("net/mlx5: Remove ipsec_ops function table")
https://lore.kernel.org/all/20220519040345.6yrjromcdistu7vh@sx1/
  16d42d313350 ("net/mlx5: Drain fw_reset when removing device")
  8324a02c342a ("net/mlx5: Add exit route when waiting for FW")
https://lore.kernel.org/all/20220519114119.060ce014@canb.auug.org.au/

tools/testing/selftests/net/mptcp/mptcp_join.sh
  e274f7154008 ("selftests: mptcp: add subflow limits test-cases")
  b6e074e171bc ("selftests: mptcp: add infinite map testcase")
  5ac1d2d63451 ("selftests: mptcp: Add tests for userspace PM type")
https://lore.kernel.org/all/20220516111918.366d747f@canb.auug.org.au/

net/mptcp/options.c
  ba2c89e0ea74 ("mptcp: fix checksum byte order")
  1e39e5a32ad7 ("mptcp: infinite mapping sending")
  ea66758c1795 ("tcp: allow MPTCP to update the announced window")
https://lore.kernel.org/all/20220519115146.751c3a37@canb.auug.org.au/

net/mptcp/pm.c
  95d686517884 ("mptcp: fix subflow accounting on close")
  4d25247d3ae4 ("mptcp: bypass in-kernel PM restrictions for non-kernel PMs")
https://lore.kernel.org/all/20220516111435.72f35dca@canb.auug.org.au/

net/mptcp/subflow.c
  ae66fb2ba6c3 ("mptcp: Do TCP fallback on early DSS checksum failure")
  0348c690ed37 ("mptcp: add the fallback check")
  f8d4bcacff3b ("mptcp: infinite mapping receiving")
https://lore.kernel.org/all/20220519115837.380bb8d4@canb.auug.org.au/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-19 11:23:59 -07:00
d16495a982 libbpf: remove bpf_create_map*() APIs
To test API removal, get rid of bpf_create_map*() APIs. Perf defines
__weak implementation of bpf_map_create() that redirects to old
bpf_create_map() and that seems to compile and run fine.

Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220518185915.3529475-4-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-05-19 09:03:31 -07:00
e2371b1632 libbpf: start 1.0 development cycle
Start libbpf 1.0 development cycle by adding LIBBPF_1.0.0 section to
libbpf.map file and marking all current symbols as local. As we remove
all the deprecated APIs we'll populate global list before the final 1.0
release.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220518185915.3529475-3-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-05-19 09:03:31 -07:00
056431ae4d libbpf: fix up global symbol counting logic
Add the same negative ABS filter that we use in VERSIONED_SYM_COUNT to
filter out ABS symbols like LIBBPF_0.8.0.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220518185915.3529475-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-05-19 09:03:31 -07:00
fcfbc93cc3 cxl/port: Reuse 'struct cxl_hdm' context for hdm init
The port driver maps component registers for port operations. Reuse that
mapping for HDM Decoder Capability setup / enable. Move
devm_cxl_setup_hdm() before cxl_hdm_decode_init() and plumb @cxlhdm
through the hdm init helpers.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165291691712.1426646.14336397551571515480.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-05-19 08:50:42 -07:00
92804edb11 cxl/pci: Drop @info argument to cxl_hdm_decode_init()
Now that nothing external to cxl_hdm_decode_init() considers
'struct cxl_endpoint_dvec_info' move it internal to
cxl_hdm_decode_init().

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165291690612.1426646.7866084245521113414.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-05-19 08:50:41 -07:00
a12562bb70 cxl/mem: Merge cxl_dvsec_ranges() and cxl_hdm_decode_init()
In preparation for changing how the driver handles 'mem_enable' in the CXL
DVSEC control register. Merge the contents of cxl_hdm_decode_init() into
cxl_dvsec_ranges() and rename the combined function cxl_hdm_decode_init().
The possible cleanups and fixes that result from this merge are saved for a
follow-on change.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/165291690027.1426646.10249756632415633752.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-05-19 08:50:41 -07:00
14d7887407 cxl/mem: Consolidate CXL DVSEC Range enumeration in the core
In preparation for fixing the setting of the 'mem_enabled' bit in CXL
DVSEC Control register, move all CXL DVSEC range enumeration into the
same source file.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165291688886.1426646.15046138604010482084.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-05-19 08:50:41 -07:00
2e4ba0ec97 cxl/pci: Move cxl_await_media_ready() to the core
Allow cxl_await_media_ready() to be mocked for testing purposes rather
than carrying the maintenance burden of an indirect function call in the
mainline driver.

With the move cxl_await_media_ready() can no longer reuse the mailbox
timeout override, so add a media_ready_timeout module parameter to the
core to backfill.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165291688340.1426646.4755627801983775011.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-05-19 08:50:41 -07:00
d904c8cc03 Networking fixes for 5.18-rc8, including fixes from can, xfrm and
netfilter subtrees.
 
 Notably this reverts a recent TCP/DCCP netns-related change
 to address a possible UaF.
 
 Current release - regressions:
   - tcp: revert "tcp/dccp: get rid of inet_twsk_purge()"
 
   - xfrm: set dst dev to blackhole_netdev instead of loopback_dev in ifdown
 
 Previous releases - regressions:
   - netfilter: flowtable: fix TCP flow teardown
 
   - can: revert "can: m_can: pci: use custom bit timings for Elkhart Lake"
 
   - xfrm: check encryption module availability consistency
 
   - eth: vmxnet3: fix possible use-after-free bugs in vmxnet3_rq_alloc_rx_buf()
 
   - eth: mlx5: initialize flow steering during driver probe
 
   - eth: ice: fix crash when writing timestamp on RX rings
 
 Previous releases - always broken:
   - mptcp: fix checksum byte order
 
   - eth: lan966x: fix assignment of the MAC address
 
   - eth: mlx5: remove HW-GRO from reported features
 
   - eth: ftgmac100: disable hardware checksum on AST2600
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmKGAYYSHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkrt8P/2GyYNQT7q0h3Plsxc/m1tIUCPiERROE
 zIU0R2QVc64xpkMISeVb3YYpa3eqhtQsNWgt7Xsr1NRXBmyx60dvGpS81w8Gnxuo
 ruA7SxnH6OA0usviiYPmeGP9emvCEkO5YRW5kxl1Cpum19yNxjfZKJ6ARk0IDp/D
 C1S91PYtF9s25Yytrlpv9lVVBvTHQxg2EQocZHxO+7/j2O8jJP/NAYltpVaRNC2W
 gLcOWTAujrjAfpdsBhJsWXv4dTCQOAgnIXYP9P1JdFMAZtkXoYQUjaXP7dsaAXHw
 iE9FBRkqDKVhj94CxR6VPOSo0kVvOuBfkc1eJeZ74lvahkHBq4EyiVCo6/JhNQTd
 /bi/mTeUlI9yYyu/j9lMDy4CwOuiB69Dl4vNR/G5C1rF7l1vQkZr50pnD96MePwu
 9fR5+ipZsDhj5c77OMiraqnnOyWXVtD2YCZCCw80a9/aWG4zxcIDtnNQIfqAACvx
 0wNgG2bPSKRablytep1Qs84Vvupaa1cC2eTBbA+6LzQqk3CR9/YMUSD6MXitxQyD
 RJYbm5QMqdW2QH8zE21E+8wzIPeN9m66lJFppuntuB+I/CHWAnP/CmdbWysR3FQ+
 5ZisPh4PUqb1VIzGKUbym/D9FB20Vc8zq6oQa8LqiIOODUrxQMg3F2O43OWsYsn3
 TDNCwo5BQ/Z8
 =C848
 -----END PGP SIGNATURE-----

Merge tag 'net-5.18-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from can, xfrm and netfilter subtrees.

  Notably this reverts a recent TCP/DCCP netns-related change to address
  a possible UaF.

  Current release - regressions:

   - tcp: revert "tcp/dccp: get rid of inet_twsk_purge()"

   - xfrm: set dst dev to blackhole_netdev instead of loopback_dev in
     ifdown

  Previous releases - regressions:

   - netfilter: flowtable: fix TCP flow teardown

   - can: revert "can: m_can: pci: use custom bit timings for Elkhart
     Lake"

   - xfrm: check encryption module availability consistency

   - eth: vmxnet3: fix possible use-after-free bugs in
     vmxnet3_rq_alloc_rx_buf()

   - eth: mlx5: initialize flow steering during driver probe

   - eth: ice: fix crash when writing timestamp on RX rings

  Previous releases - always broken:

   - mptcp: fix checksum byte order

   - eth: lan966x: fix assignment of the MAC address

   - eth: mlx5: remove HW-GRO from reported features

   - eth: ftgmac100: disable hardware checksum on AST2600"

* tag 'net-5.18-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (50 commits)
  net: bridge: Clear offload_fwd_mark when passing frame up bridge interface.
  ptp: ocp: change sysfs attr group handling
  selftests: forwarding: fix missing backslash
  netfilter: nf_tables: disable expression reduction infra
  netfilter: flowtable: move dst_check to packet path
  netfilter: flowtable: fix TCP flow teardown
  net: ftgmac100: Disable hardware checksum on AST2600
  igb: skip phy status check where unavailable
  nfc: pn533: Fix buggy cleanup order
  mptcp: Do TCP fallback on early DSS checksum failure
  mptcp: fix checksum byte order
  net: af_key: check encryption module availability consistency
  net: af_key: add check for pfkey_broadcast in function pfkey_process
  net/mlx5: Drain fw_reset when removing device
  net/mlx5e: CT: Fix setting flow_source for smfs ct tuples
  net/mlx5e: CT: Fix support for GRE tuples
  net/mlx5e: Remove HW-GRO from reported features
  net/mlx5e: Properly block HW GRO when XDP is enabled
  net/mlx5e: Properly block LRO when XDP is enabled
  net/mlx5e: Block rx-gro-hw feature in switchdev mode
  ...
2022-05-19 05:50:29 -10:00
dc6a7effb4 lkdtm updates for -next
- Test for new usercopy memory regions
 - avoid GCC 12 warnings
 - update expected CONFIGs for selftests
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmKEGMoWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJjWvEACa7AYibyhPaikabil0r+qnU5Vr
 GBbRZbs7+lzzOL7UAJ8VRwNnw3sqsaEKo7phqJRqYvo/etzUGUNtIUFCPMHpw15R
 sICXpedImOjeVjMYs4FCccEKGlIz0DMXKhNVvMT/IgLANiSwTMjCUGAVWVIf6OtO
 oXPl5nE3ZIyNYhMPv7hpJ2DB+eEsKqpKv0xkAiZnGpWFhJF0Zex7wb88QAWWO1BP
 po02NdnvA/f6PIYdpMBlm3UjpGSCD3PfiH+grjZGilGMgVcFoVQJdSSvGPhip6/G
 3DVmkSZZe4ldRePn0aJKrwm8bZnqfXgYlu7IKi7y9PSmD8URAtC5PkmnGwkcYxq7
 esA9NPJFflzeNxxB6HbH3Dq1DOXiGUi3vQ6SiNieGOtNnCgynrmOtnRrjumG8sZg
 hdcuXbL0TCOkvy5jQVaLcZfUbH4/hLg1x5qZ2GOl2UKZ69BM0812IQkXOQqBGbKV
 bQvjflwvBvZ/reuBNfQz10BDTnHL60jFjCNkA5K6+/iziat0/D2JfQ8dDk6I/ywn
 rXrFnZmXio0WF4lSbEp54l+a1gcNz3Ngi1VJnzDwRifykfIXGFkp8gnWOFymyJ9Z
 A4W0+yjDkwVsyQcZktS5r9MXhOiEJY+V15k0Rj6mOLRK67sqtgpYbNYZahiyHux4
 91DKbOLQlVN7562GYg==
 =CiSx
 -----END PGP SIGNATURE-----

Merge tag 'lkdtm-next' of https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux into char-misc-next

Kees writes:

lkdtm updates for -next

- Test for new usercopy memory regions
- avoid GCC 12 warnings
- update expected CONFIGs for selftests

* tag 'lkdtm-next' of https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  lkdtm/heap: Hide allocation size from -Warray-bounds
  selftests/lkdtm: Add configs for stackleak and "after free" tests
  lkdtm/usercopy: Check vmalloc and >0-order folios
  lkdtm/usercopy: Rename "heap" to "slab"
  lkdtm: cfi: Fix type width for masking PAC bits
2022-05-19 17:18:55 +02:00
cb4487d2b4 tools/thermal: remove unneeded semicolon
Fix the following coccicheck warnings:

./tools/thermal/thermometer/thermometer.c:147:3-4: Unneeded semicolon.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220427030619.81556-2-jiapeng.chong@linux.alibaba.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2022-05-19 12:11:52 +02:00
f21b57eb12 tools/lib/thermal: remove unneeded semicolon
Fix the following coccicheck warnings:

./tools/lib/thermal/commands.c:215:2-3: Unneeded semicolon.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220427030619.81556-1-jiapeng.chong@linux.alibaba.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2022-05-19 12:11:52 +02:00