Go to file
Daniel Borkmann a4bbab27c4 bpf: Adjust insufficient default bpf_jit_limit
[ Upstream commit 10ec8ca8ec1a2f04c4ed90897225231c58c124a7 ]

We've seen recent AWS EKS (Kubernetes) user reports like the following:

  After upgrading EKS nodes from v20230203 to v20230217 on our 1.24 EKS
  clusters after a few days a number of the nodes have containers stuck
  in ContainerCreating state or liveness/readiness probes reporting the
  following error:

    Readiness probe errored: rpc error: code = Unknown desc = failed to
    exec in container: failed to start exec "4a11039f730203ffc003b7[...]":
    OCI runtime exec failed: exec failed: unable to start container process:
    unable to init seccomp: error loading seccomp filter into kernel:
    error loading seccomp filter: errno 524: unknown

  However, we had not been seeing this issue on previous AMIs and it only
  started to occur on v20230217 (following the upgrade from kernel 5.4 to
  5.10) with no other changes to the underlying cluster or workloads.

  We tried the suggestions from that issue (sysctl net.core.bpf_jit_limit=452534528)
  which helped to immediately allow containers to be created and probes to
  execute but after approximately a day the issue returned and the value
  returned by cat /proc/vmallocinfo | grep bpf_jit | awk '{s+=$2} END {print s}'
  was steadily increasing.

I tested bpf tree to observe bpf_jit_charge_modmem, bpf_jit_uncharge_modmem
their sizes passed in as well as bpf_jit_current under tcpdump BPF filter,
seccomp BPF and native (e)BPF programs, and the behavior all looks sane
and expected, that is nothing "leaking" from an upstream perspective.

The bpf_jit_limit knob was originally added in order to avoid a situation
where unprivileged applications loading BPF programs (e.g. seccomp BPF
policies) consuming all the module memory space via BPF JIT such that loading
of kernel modules would be prevented. The default limit was defined back in
2018 and while good enough back then, we are generally seeing far more BPF
consumers today.

Adjust the limit for the BPF JIT pool from originally 1/4 to now 1/2 of the
module memory space to better reflect today's needs and avoid more users
running into potentially hard to debug issues.

Fixes: fdadd04931 ("bpf: fix bpf_jit_limit knob for PAGE_SIZE >= 64K")
Reported-by: Stephen Haynes <sh@synk.net>
Reported-by: Lefteris Alexakis <lefteris.alexakis@kpn.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://github.com/awslabs/amazon-eks-ami/issues/1179
Link: https://github.com/awslabs/amazon-eks-ami/issues/1219
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20230320143725.8394-1-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-04-05 11:23:35 +02:00
arch ARM: dts: imx6sl: tolino-shine2hd: fix usbotg1 pinctrl 2023-04-05 11:23:32 +02:00
block block, bfq: fix uaf for bfqq in bic_set_bfqq() 2023-03-17 08:45:14 +01:00
certs certs/blacklist_hashes.c: fix const confusion in certs blacklist 2022-06-22 14:13:17 +02:00
crypto crypto: rsa-pkcs1pad - Use akcipher_request_complete 2023-03-11 16:39:27 +01:00
Documentation attr: use consistent sgid stripping checks 2023-03-22 13:30:08 +01:00
drivers net/ps3_gelic_net: Use dma_mapping_error 2023-04-05 11:23:34 +02:00
fs xfs: remove xfs_setattr_time() declaration 2023-03-22 13:30:08 +01:00
include net: mdio: fix owner field for mdio buses registered using device-tree 2023-04-05 11:23:34 +02:00
init init/Kconfig: fix CC_HAS_ASM_GOTO_TIED_OUTPUT test with dash 2022-12-02 17:40:03 +01:00
io_uring io_uring: avoid null-ptr-deref in io_arm_poll_handler 2023-03-22 13:30:05 +01:00
ipc ipc/sem: Fix dangling sem_array access in semtimedop race 2022-12-08 11:24:00 +01:00
kernel bpf: Adjust insufficient default bpf_jit_limit 2023-04-05 11:23:35 +02:00
lib printf: fix errname.c list 2023-03-11 16:39:40 +01:00
LICENSES LICENSES/deprecated: add Zlib license text 2020-09-16 14:33:49 +02:00
mm mm/userfaultfd: propagate uffd-wp bit when PTE-mapping the huge zeropage 2023-03-22 13:30:04 +01:00
net xsk: Add missing overflow check in xdp_umem_reg 2023-04-05 11:23:32 +02:00
samples samples: vfio-mdev: Fix missing pci_disable_device() in mdpy_fb_probe() 2023-01-14 10:16:01 +01:00
scripts scripts: handle BrokenPipeError for python scripts 2023-03-17 08:45:15 +01:00
security keys: Do not cache key in task struct if key is requested from kernel thread 2023-04-05 11:23:34 +02:00
sound ALSA: hda/realtek: Fix the speaker output on Samsung Galaxy Book2 Pro 2023-03-22 13:30:04 +01:00
tools bootconfig: Fix testcase to increase max node 2023-04-05 11:23:34 +02:00
usr usr/include/Makefile: add linux/nfc.h to the compile-test coverage 2022-02-01 17:25:48 +01:00
virt KVM: Register /dev/kvm as the _very_ last thing during initialization 2023-04-05 11:23:30 +02:00
.clang-format RDMA 5.10 pull request 2020-10-17 11:18:18 -07:00
.cocciconfig scripts: add Linux .cocciconfig for coccinelle 2016-07-22 12:13:39 +02:00
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for dts files 2019-12-04 19:44:11 -08:00
.gitignore kbuild: generate Module.symvers only when vmlinux exists 2021-05-19 10:12:59 +02:00
.mailmap mailmap: add two more addresses of Uwe Kleine-König 2020-12-06 10:19:07 -08:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS MAINTAINERS: Move Jason Cooper to CREDITS 2020-11-30 10:20:34 +01:00
Kbuild kbuild: rename hostprogs-y/always to hostprogs/always-y 2020-02-04 01:53:07 +09:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS audit: update the mailing list in MAINTAINERS 2023-02-25 11:55:04 +01:00
Makefile Linux 5.10.176 2023-03-22 13:30:08 +01:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.