Merge keystone/android14-6.1-keystone-qcom-release.6.1.25 (bd65f1b) into qcom-6.1

* refs/heads/tmp-bd65f1b:
  ANDROID: uid_sys_stats: Use llist for deferred work
  UPSTREAM: usb: typec: ucsi: Fix command cancellation
  ANDROID: GKI: update symbol list file for xiaomi
  UPSTREAM: erofs: avoid infinite loop in z_erofs_do_read_page() when reading beyond EOF
  UPSTREAM: erofs: avoid useless loops in z_erofs_pcluster_readmore() when reading beyond EOF
  UPSTREAM: erofs: Fix detection of atomic context
  UPSTREAM: erofs: fix compact 4B support for 16k block size
  UPSTREAM: erofs: kill hooked chains to avoid loops on deduplicated compressed images
  UPSTREAM: erofs: fix potential overflow calculating xattr_isize
  UPSTREAM: erofs: stop parsing non-compact HEAD index if clusterofs is invalid
  UPSTREAM: erofs: initialize packed inode after root inode is assigned
  ANDROID: GKI: Update ABI for zsmalloc fixes
  BACKPORT: zsmalloc: fix races between modifications of fullness and isolated
  UPSTREAM: zsmalloc: consolidate zs_pool's migrate_lock and size_class's locks
  ANDROID: consolidate.fragment: Enable slub debug in consolidate-fragment
  BACKPORT: FROMGIT: mm: handle faults that merely update the accessed bit under the VMA lock
  FROMLIST: mm: Allow fault_dirty_shared_page() to be called under the VMA lock
  FROMGIT: mm: handle swap and NUMA PTE faults under the VMA lock
  FROMGIT: mm: run the fault-around code under the VMA lock
  FROMGIT: mm: move FAULT_FLAG_VMA_LOCK check down from do_fault()
  FROMGIT: mm: move FAULT_FLAG_VMA_LOCK check down in handle_pte_fault()
  BACKPORT: FROMGIT: mm: handle some PMD faults under the VMA lock
  BACKPORT: FROMGIT: mm: handle PUD faults under the VMA lock
  FROMGIT: mm: move FAULT_FLAG_VMA_LOCK check from handle_mm_fault()
  BACKPORT: FROMGIT: mm: allow per-VMA locks on file-backed VMAs
  FROMGIT: mm: remove CONFIG_PER_VMA_LOCK ifdefs
  FROMGIT: mm: fix a lockdep issue in vma_assert_write_locked
  FROMGIT: mm: handle userfaults under VMA lock
  FROMGIT: mm: handle swap page faults under per-VMA lock
  FROMGIT: mm: change folio_lock_or_retry to use vm_fault directly
  BACKPORT: FROMGIT: mm: drop per-VMA lock when returning VM_FAULT_RETRY or VM_FAULT_COMPLETED
  BACKPORT: FROMGIT: mm: move vma locking out of vma_prepare and dup_anon_vma
  BACKPORT: FROMGIT: mm: always lock new vma before inserting into vma tree
  FROMGIT: mm: lock vma explicitly before doing vm_flags_reset and vm_flags_reset_once
  FROMGIT: mm: replace mmap with vma write lock assertions when operating on a vma
  FROMGIT: mm: for !CONFIG_PER_VMA_LOCK equate write lock assertion for vma and mmap
  FROMGIT: mm: don't drop VMA locks in mm_drop_all_locks()
  BACKPORT: riscv: mm: try VMA lock-based page fault handling first
  BACKPORT: FROMGIT: mm: enable page walking API to lock vmas during the walk
  BACKPORT: mm: lock VMA in dup_anon_vma() before setting ->anon_vma
  UPSTREAM: mm: fix memory ordering for mm_lock_seq and vm_lock_seq
  FROMGIT: usb: host: ehci-sched: try to turn on io watchdog as long as periodic_count > 0
  FROMGIT: BACKPORT: usb: ehci: add workaround for chipidea PORTSC.PEC bug
  UPSTREAM: tty: n_gsm: fix UAF in gsm_cleanup_mux
  UPSTREAM: mm/mmap: Fix extra maple tree write
  FROMGIT: Multi-gen LRU: skip CMA pages when they are not eligible
  UPSTREAM: mm: skip CMA pages when they are not available
  UPSTREAM: dma-buf: fix an error pointer vs NULL bug
  UPSTREAM: dma-buf: keep the signaling time of merged fences v3
  UPSTREAM: netfilter: nf_tables: skip bound chain on rule flush
  UPSTREAM: net/sched: sch_qfq: account for stab overhead in qfq_enqueue
  UPSTREAM: net/sched: sch_qfq: refactor parsing of netlink parameters
  UPSTREAM: netfilter: nft_set_pipapo: fix improper element removal
  ANDROID: Add checkpatch target.
  UPSTREAM: USB: Gadget: core: Help prevent panic during UVC unconfigure
  ANDROID: GKI: Update symbols to symbol list
  ANDROID: vendor_hook: fix the error record position of mutex
  ANDROID: ABI: add allowed list for galaxy
  ANDROID: gfp: add __GFP_CMA in gfpflag_names
  ANDROID: ABI: Update to fix slab-out-of-bounds in xhci_vendor_get_ops
  ANDROID: usb: host: fix slab-out-of-bounds in xhci_vendor_get_ops
  ANDROID: GKI: update pixel symbol list for xhci
  FROMGIT: fs: drop_caches: draining pages before dropping caches
  ANDROID: GKI: update symbol list file for xiaomi
  ANDROID: uid_sys_stats: Use a single work for deferred updates
  ANDROID: ABI: Update symbol for Exynos SoC
  ANDROID: GKI: Add symbols to symbol list for vivo
  ANDROID: vendor_hooks: Add tune scan type hook in get_scan_count()
  FROMGIT: BACKPORT: Multi-gen LRU: Fix can_swap in lru_gen_look_around()
  FROMGIT: Multi-gen LRU: Avoid race in inc_min_seq()
  FROMGIT: Multi-gen LRU: Fix per-zone reclaim
  ANDROID: ABI: update symbol list for galaxy
  ANDROID: oplus: Update the ABI xml and symbol list
  ANDROID: vendor_hooks: Add hooks for lookaround
  ANDROID: ABI: Update STG ABI to format version 2
  ANDROID: ABI: Update symbol list for imx
  FROMGIT: erofs: fix wrong primary bvec selection on deduplicated extents
  UPSTREAM: media: Add ABGR64_12 video format
  BACKPORT: media: Add BGR48_12 video format
  UPSTREAM: media: Add YUV48_12 video format
  UPSTREAM: media: Add Y212 v4l2 format info
  UPSTREAM: media: Add Y210, Y212 and Y216 formats
  UPSTREAM: media: Add Y012 video format
  UPSTREAM: media: Add P012 and P012M video format
  ANDROID: GKI: Create symbol files in include/config
  ANDROID: fuse-bpf: Use stored bpf for create_open
  ANDROID: fuse-bpf: Add bpf to negative fuse_dentry
  ANDROID: fuse-bpf: Check inode not null
  ANDROID: fuse-bpf: Fix flock test compile error
  ANDROID: fuse-bpf: Add partial ioctl support
  ANDROID: ABI: Update oplus symbol list
  UPSTREAM: mm/mempolicy: Take VMA lock before replacing policy
  BACKPORT: mm: lock_vma_under_rcu() must check vma->anon_vma under vma lock
  BACKPORT: FROMGIT: irqchip/gic-v3: Workaround for GIC-700 erratum 2941627
  ANDROID: GKI: update xiaomi symbol list
  UPSTREAM: mm: lock newly mapped VMA with corrected ordering
  UPSTREAM: fork: lock VMAs of the parent process when forking
  UPSTREAM: mm: lock newly mapped VMA which can be modified after it becomes visible
  UPSTREAM: mm: lock a vma before stack expansion
  ANDROID: GKI: bring back find_extend_vma()
  BACKPORT: mm: always expand the stack with the mmap write lock held
  BACKPORT: execve: expand new process stack manually ahead of time
  ANDROID: abi_gki_aarch64_qcom: ufshcd_mcq_poll_cqe_lock
  UPSTREAM: mm: make find_extend_vma() fail if write lock not held
  UPSTREAM: powerpc/mm: convert coprocessor fault to lock_mm_and_find_vma()
  UPSTREAM: mm/fault: convert remaining simple cases to lock_mm_and_find_vma()
  UPSTREAM: arm/mm: Convert to using lock_mm_and_find_vma()
  UPSTREAM: riscv/mm: Convert to using lock_mm_and_find_vma()
  UPSTREAM: mips/mm: Convert to using lock_mm_and_find_vma()
  UPSTREAM: powerpc/mm: Convert to using lock_mm_and_find_vma()
  BACKPORT: arch/arm64/mm/fault: Fix undeclared variable error in do_page_fault()
  BACKPORT: arm64/mm: Convert to using lock_mm_and_find_vma()
  UPSTREAM: mm: make the page fault mmap locking killable
  ANDROID: Inherit "user-aware property" across rtmutex.
  BACKPORT: blk-crypto: use dynamic lock class for blk_crypto_profile::lock
  ANDROID: ABI: update symbol list for Xclipse GPU
  ANDROID: drm/ttm: export ttm_tt_unpopulate()
  ANDROID: GKI: Add ABI symbol list(devlink) for MTK
  ANDROID: devlink: Select CONFIG_NET_DEVLINK in Kconfig.gki
  ANDROID: KVM: arm64: Fix memory ordering for pKVM module callbacks
  BACKPORT: mm: introduce new 'lock_mm_and_find_vma()' page fault helper
  BACKPORT: maple_tree: fix potential out-of-bounds access in mas_wr_end_piv()
  UPSTREAM: x86/smp: Cure kexec() vs. mwait_play_dead() breakage
  UPSTREAM: x86/smp: Use dedicated cache-line for mwait_play_dead()
  UPSTREAM: x86/smp: Remove pointless wmb()s from native_stop_other_cpus()
  UPSTREAM: x86/smp: Dont access non-existing CPUID leaf
  UPSTREAM: x86/smp: Make stop_other_cpus() more robust
  UPSTREAM: x86/microcode/AMD: Load late on both threads too
  BACKPORT: mm, hwpoison: when copy-on-write hits poison, take page offline
  UPSTREAM: mm, hwpoison: try to recover from copy-on write faults
  BACKPORT: mm/mmap: Fix error return in do_vmi_align_munmap()
  BACKPORT: mm/mmap: Fix error path in do_vmi_align_munmap()
  UPSTREAM: HID: logitech-hidpp: add HIDPP_QUIRK_DELAYED_INIT for the T651.
  UPSTREAM: HID: hidraw: fix data race on device refcount
  UPSTREAM: can: isotp: isotp_sendmsg(): fix return error fix on TX path
  UPSTREAM: fbdev: fix potential OOB read in fast_imageblit()
  ANDROID: GKI: add function symbols for unisoc
  ANDROID: cgroup: Cleanup android_rvh_cgroup_force_kthread_migration
  UPSTREAM: net/sched: cls_fw: Fix improper refcount update leads to use-after-free
  UPSTREAM: netfilter: nf_tables: fix chain binding transaction logic
  ANDROID: abi_gki_aarch64_qcom: update abi
  UPSTREAM: fs/ntfs3: Check fields while reading
  ANDROID: GKI: Update abi_gki_aarch64_qcom
  ANDROID: ABI: Update pixel symbol list
  ANDROID: GKI: Move GKI module headers to generated includes
  ANDROID: set kmi_symbol_list_add_only for Kleaf builds.
  ANDROID: GKI: Add Android ABI padding to wwan_port_ops
  ANDROID: GKI: Add Android ABI padding to wwan_ops
  ANDROID: update symbol list for unisoc regmap vendor hook
  ANDROID: GKI: Update mtk ABI symbol list
  UPSTREAM: media: dvb-core: Fix kernel WARNING for blocking operation in wait_event*()
  ANDROID: abi_gki_aarch64_qcom: Update QCOM symbol list
  ANDROID: ABI: Update pixel symbol list
  ANDROID: GKI: add ABI symbol for xiaomi
  ANDROID: vendor_hooks: add vendor hook to support SAGT
  FROMLIST: fuse: revalidate: don't invalidate if interrupted
  ANDROID: GKI: Update pixel symbol list for thermal
  ANDROID: thermal: Add vendor thermal genl check
  ANDROID: GKI: Update the pixel symbol list
  ANDROID: GKI: Update protected exports
  FROMGIT: mm: add missing VM_FAULT_RESULT_TRACE name for VM_FAULT_COMPLETED
  FROMGIT: swap: remove remnants of polling from read_swap_cache_async
  UPSTREAM: io_uring/poll: serialize poll linked timer start with poll removal

Change-Id: Ib4aaa987f777d4cdb0897af78aecb19aaee8d68b
Upstream-Build: ks_qcom-android14-6.1-keystone-qcom-release@10801570 UKQ2.230913.001
Signed-off-by: jianzhou <quic_jianzhou@quicinc.com>

@ -6,6 +6,7 @@ load("//build/bazel_common_rules/dist:dist.bzl", "copy_to_dist_dir")
load("//build/kernel/kleaf:common_kernels.bzl", "define_common_kernels")
load(
"//build/kernel/kleaf:kernel.bzl",
"checkpatch",
"ddk_headers",
"kernel_abi",
"kernel_build",
@ -40,6 +41,11 @@ _GKI_X86_64_MAKE_GOALS = [
"modules",
]
checkpatch(
name = "checkpatch",
checkpatch_pl = "scripts/checkpatch.pl",
)
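(For reference: with this rule in place, a Kleaf checkout would typically run the checker as "tools/bazel run //common:checkpatch"; the exact target label depends on where this package sits and is an assumption here, not part of this diff.)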
write_file(
name = "gki_system_dlkm_modules",
out = "android/gki_system_dlkm_modules",
@ -76,6 +82,7 @@ define_common_kernels(target_configs = {
"kmi_symbol_list_strict_mode": True,
"module_implicit_outs": COMMON_GKI_MODULES_LIST,
"kmi_symbol_list": "android/abi_gki_aarch64",
"kmi_symbol_list_add_only": True,
"additional_kmi_symbol_lists": [":aarch64_additional_kmi_symbol_lists"],
"protected_exports_list": "android/abi_gki_protected_exports_aarch64",
"protected_modules_list": "android/gki_aarch64_protected_modules",
@ -90,6 +97,7 @@ define_common_kernels(target_configs = {
"kmi_symbol_list_strict_mode": False,
"module_implicit_outs": COMMON_GKI_MODULES_LIST,
"kmi_symbol_list": "android/abi_gki_aarch64",
"kmi_symbol_list_add_only": True,
"additional_kmi_symbol_lists": [":aarch64_additional_kmi_symbol_lists"],
"protected_exports_list": "android/abi_gki_protected_exports_aarch64",
"protected_modules_list": "android/gki_aarch64_protected_modules",


@ -257,12 +257,45 @@ the second byte and Y'\ :sub:`7-0` in the third byte.
- The padding bits contain undefined values that must be ignored by all
applications and drivers.
The next table lists the packed YUV 4:4:4 formats with 12 bits per component.
Each component is expanded to 16 bits, with the data in the high bits and zero
padding in the low bits, stored in little-endian order; one pixel occupies 6 bytes.
.. flat-table:: Packed YUV 4:4:4 Image Formats (12bpc)
:header-rows: 1
:stub-columns: 0
* - Identifier
- Code
- Byte 1-0
- Byte 3-2
- Byte 5-4
- Byte 7-6
- Byte 9-8
- Byte 11-10
* .. _V4L2-PIX-FMT-YUV48-12:
- ``V4L2_PIX_FMT_YUV48_12``
- 'Y312'
- Y'\ :sub:`0`
- Cb\ :sub:`0`
- Cr\ :sub:`0`
- Y'\ :sub:`1`
- Cb\ :sub:`1`
- Cr\ :sub:`1`
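As an illustration of this packing, a hypothetical C helper (not part of the kernel sources):

#include <stdint.h>

/* Pack one 12-bit Y'CbCr 4:4:4 sample into the 6-byte V4L2_PIX_FMT_YUV48_12
 * layout: each component sits in the high 12 bits of a 16-bit little-endian
 * word, with zeros in the 4 low bits. */
static void pack_yuv48_12(uint8_t *out, uint16_t y, uint16_t cb, uint16_t cr)
{
	const uint16_t c[3] = { y, cb, cr };

	for (int i = 0; i < 3; i++) {
		uint16_t v = (uint16_t)(c[i] << 4);	/* data in bits 15-4 */
		out[2 * i]     = (uint8_t)(v & 0xff);	/* low byte first (little endian) */
		out[2 * i + 1] = (uint8_t)(v >> 8);
	}
}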
4:2:2 Subsampling
=================
These formats, commonly referred to as YUYV or YUY2, subsample the chroma
components horizontally by 2, storing 2 pixels in 4 bytes.
components horizontally by 2, storing 2 pixels in a container. The container
is 32-bits for 8-bit formats, and 64-bits for 10+-bit formats.
The packed YUYV formats with more than 8 bits per component are stored as four
16-bit little-endian words. Each word's most significant bits contain one
component, and the least significant bits are zero padding.
.. raw:: latex
@ -270,7 +303,7 @@ components horizontally by 2, storing 2 pixels in 4 bytes.
.. tabularcolumns:: |p{3.4cm}|p{1.2cm}|p{0.8cm}|p{0.8cm}|p{0.8cm}|p{0.8cm}|p{0.8cm}|p{0.8cm}|p{0.8cm}|p{0.8cm}|
.. flat-table:: Packed YUV 4:2:2 Formats
.. flat-table:: Packed YUV 4:2:2 Formats in 32-bit container
:header-rows: 1
:stub-columns: 0
@ -337,6 +370,46 @@ components horizontally by 2, storing 2 pixels in 4 bytes.
- Y'\ :sub:`3`
- Cb\ :sub:`2`
.. tabularcolumns:: |p{3.4cm}|p{1.2cm}|p{0.8cm}|p{0.8cm}|p{0.8cm}|p{0.8cm}|p{0.8cm}|p{0.8cm}|p{0.8cm}|p{0.8cm}|
.. flat-table:: Packed YUV 4:2:2 Formats in 64-bit container
:header-rows: 1
:stub-columns: 0
* - Identifier
- Code
- Word 0
- Word 1
- Word 2
- Word 3
* .. _V4L2-PIX-FMT-Y210:
- ``V4L2_PIX_FMT_Y210``
- 'Y210'
- Y'\ :sub:`0` (bits 15-6)
- Cb\ :sub:`0` (bits 15-6)
- Y'\ :sub:`1` (bits 15-6)
- Cr\ :sub:`0` (bits 15-6)
* .. _V4L2-PIX-FMT-Y212:
- ``V4L2_PIX_FMT_Y212``
- 'Y212'
- Y'\ :sub:`0` (bits 15-4)
- Cb\ :sub:`0` (bits 15-4)
- Y'\ :sub:`1` (bits 15-4)
- Cr\ :sub:`0` (bits 15-4)
* .. _V4L2-PIX-FMT-Y216:
- ``V4L2_PIX_FMT_Y216``
- 'Y216'
- Y'\ :sub:`0` (bits 15-0)
- Cb\ :sub:`0` (bits 15-0)
- Y'\ :sub:`1` (bits 15-0)
- Cr\ :sub:`0` (bits 15-0)
.. raw:: latex
\normalsize
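A matching sketch for the 64-bit-container formats (hypothetical helper; assumes a little-endian host so that uint16_t stores match the byte layout above):

#include <stdint.h>

/* Pack a Y210 macropixel (two luma samples sharing one Cb/Cr pair) into
 * four 16-bit words: 10-bit data in bits 15-6, zero padding in bits 5-0.
 * Y212 and Y216 only change the shift (4 and 0 respectively). */
static void pack_y210(uint16_t out[4], uint16_t y0, uint16_t cb,
		      uint16_t y1, uint16_t cr)
{
	out[0] = (uint16_t)(y0 << 6);
	out[1] = (uint16_t)(cb << 6);
	out[2] = (uint16_t)(y1 << 6);
	out[3] = (uint16_t)(cr << 6);
}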


@ -762,6 +762,48 @@ nomenclature that instead use the order of components as seen in a 24- or
\normalsize
12 Bits Per Component
==============================
These formats store an RGB triplet in six or eight bytes, with 12 bits per component.
Each component is expanded to 16 bits, with the data in the high bits and zero
padding in the low bits, stored in little-endian order.
.. raw:: latex
\small
.. flat-table:: RGB Formats With 12 Bits Per Component
:header-rows: 1
* - Identifier
- Code
- Byte 1-0
- Byte 3-2
- Byte 5-4
- Byte 7-6
* .. _V4L2-PIX-FMT-BGR48-12:
- ``V4L2_PIX_FMT_BGR48_12``
- 'B312'
- B\ :sub:`15-4`
- G\ :sub:`15-4`
- R\ :sub:`15-4`
-
* .. _V4L2-PIX-FMT-ABGR64-12:
- ``V4L2_PIX_FMT_ABGR64_12``
- 'B412'
- B\ :sub:`15-4`
- G\ :sub:`15-4`
- R\ :sub:`15-4`
- A\ :sub:`15-4`
.. raw:: latex
\normalsize
Deprecated RGB Formats
======================


@ -103,6 +103,17 @@ are often referred to as greyscale formats.
- ...
- ...
* .. _V4L2-PIX-FMT-Y012:
- ``V4L2_PIX_FMT_Y012``
- 'Y012'
- Y'\ :sub:`0`\ [3:0] `0000`
- Y'\ :sub:`0`\ [11:4]
- ...
- ...
- ...
* .. _V4L2-PIX-FMT-Y14:
- ``V4L2_PIX_FMT_Y14``
@ -146,3 +157,7 @@ are often referred to as greyscale formats.
than 16 bits. For example, 10 bits per pixel uses values in the range 0 to
1023. For the IPU3_Y10 format 25 pixels are packed into 32 bytes, which
leaves the 6 most significant bits of the last byte padded with 0.
Y012 and Y12 differ in where the padding sits: Y012 places its data in the 12
high bits, with zero padding in the 4 low bits, whereas Y12 carries its padding
in the most significant bits of the 16-bit word.


@ -123,6 +123,20 @@ All components are stored with the same number of bits per component.
- Cb, Cr
- Yes
- 4x4 tiles
* - V4L2_PIX_FMT_P012
- 'P012'
- 12
- 4:2:0
- Cb, Cr
- Yes
- Linear
* - V4L2_PIX_FMT_P012M
- 'PM12'
- 12
- 4:2:0
- Cb, Cr
- No
- Linear
* - V4L2_PIX_FMT_NV16
- 'NV16'
- 8
@ -586,6 +600,86 @@ Data in the 10 high bits, zeros in the 6 low bits, arranged in little endian ord
- Cb\ :sub:`11`
- Cr\ :sub:`11`
.. _V4L2-PIX-FMT-P012:
.. _V4L2-PIX-FMT-P012M:
P012 and P012M
--------------
P012 is like NV12 with 12 bits per component, expanded to 16 bits.
Data in the 12 high bits, zeros in the 4 low bits, arranged in little endian order.
.. flat-table:: Sample 4x4 P012 Image
:header-rows: 0
:stub-columns: 0
* - start + 0:
- Y'\ :sub:`00`
- Y'\ :sub:`01`
- Y'\ :sub:`02`
- Y'\ :sub:`03`
* - start + 8:
- Y'\ :sub:`10`
- Y'\ :sub:`11`
- Y'\ :sub:`12`
- Y'\ :sub:`13`
* - start + 16:
- Y'\ :sub:`20`
- Y'\ :sub:`21`
- Y'\ :sub:`22`
- Y'\ :sub:`23`
* - start + 24:
- Y'\ :sub:`30`
- Y'\ :sub:`31`
- Y'\ :sub:`32`
- Y'\ :sub:`33`
* - start + 32:
- Cb\ :sub:`00`
- Cr\ :sub:`00`
- Cb\ :sub:`01`
- Cr\ :sub:`01`
* - start + 40:
- Cb\ :sub:`10`
- Cr\ :sub:`10`
- Cb\ :sub:`11`
- Cr\ :sub:`11`
.. flat-table:: Sample 4x4 P012M Image
:header-rows: 0
:stub-columns: 0
* - start0 + 0:
- Y'\ :sub:`00`
- Y'\ :sub:`01`
- Y'\ :sub:`02`
- Y'\ :sub:`03`
* - start0 + 8:
- Y'\ :sub:`10`
- Y'\ :sub:`11`
- Y'\ :sub:`12`
- Y'\ :sub:`13`
* - start0 + 16:
- Y'\ :sub:`20`
- Y'\ :sub:`21`
- Y'\ :sub:`22`
- Y'\ :sub:`23`
* - start0 + 24:
- Y'\ :sub:`30`
- Y'\ :sub:`31`
- Y'\ :sub:`32`
- Y'\ :sub:`33`
* -
* - start1 + 0:
- Cb\ :sub:`00`
- Cr\ :sub:`00`
- Cb\ :sub:`01`
- Cr\ :sub:`01`
* - start1 + 8:
- Cb\ :sub:`10`
- Cr\ :sub:`10`
- Cb\ :sub:`11`
- Cr\ :sub:`11`
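To make the offsets in these tables concrete, a hypothetical C helper for the single-plane P012 layout (assumes bytesperline == width * 2, i.e. no row padding):

#include <stddef.h>

/* Byte offset of the luma sample at (x, y): two bytes per sample. */
static size_t p012_y_offset(unsigned int x, unsigned int y, unsigned int width)
{
	return (size_t)y * width * 2 + (size_t)x * 2;
}

/* Byte offset of the interleaved Cb/Cr pair covering (x, y): the chroma
 * data starts after width * height * 2 bytes of luma and is subsampled
 * 2x2, four bytes per Cb/Cr pair. For the 4x4 image above this yields
 * start + 32 for Cb00, matching the table. */
static size_t p012_cbcr_offset(unsigned int x, unsigned int y,
			       unsigned int width, unsigned int height)
{
	size_t chroma_base = (size_t)width * height * 2;

	return chroma_base + (size_t)(y / 2) * width * 2 + (size_t)(x / 2) * 4;
}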
Fully Planar YUV Formats
========================


@ -1,2 +1,2 @@
71b43c3e005a15b83c564a835361e2b7aa56e086
android14-6.1-2023-07_r7
f580df859bb06948e26f249d348a74348c237271
android14-6.1-2023-08_r3

File diff suppressed because it is too large.


@ -35,6 +35,7 @@
class_create_file_ns
class_find_device
class_remove_file_ns
cleancache_register_ops
__const_udelay
copy_from_kernel_nofault
cpu_hwcaps
@ -99,6 +100,8 @@
__free_pages
free_pages
free_pages_exact
fsnotify
__fsnotify_parent
generic_file_read_iter
generic_mii_ioctl
generic_perform_write
@ -149,6 +152,8 @@
kasan_flag_enabled
kasprintf
kernel_cpustat
kernel_neon_begin
kernel_neon_end
kernfs_find_and_get_ns
kfree
__kfree_skb
@ -164,6 +169,7 @@
kobject_put
kstrdup
kstrtoint
kstrtos16
kstrtouint
kstrtoull
kthread_create_on_node
@ -257,6 +263,7 @@
register_reboot_notifier
register_restart_handler
register_syscore_ops
regulator_get_current_limit
remove_cpu
rtc_class_open
rtc_read_time
@ -277,6 +284,9 @@
single_open
single_release
skb_copy_ubufs
smpboot_register_percpu_thread
smpboot_unregister_percpu_thread
snd_soc_add_card_controls
snd_soc_find_dai
snd_soc_info_volsw_sx
snd_soc_put_volsw_sx
@ -285,6 +295,7 @@
sprintf
sscanf
__stack_chk_fail
stack_trace_save_regs
stpcpy
strcmp
strim
@ -306,6 +317,12 @@
system_long_wq
system_unbound_wq
sys_tz
tcp_register_congestion_control
tcp_reno_cong_avoid
tcp_reno_ssthresh
tcp_reno_undo_cwnd
tcp_slow_start
tcp_unregister_congestion_control
time64_to_tm
__traceiter_android_rvh_arm64_serror_panic
__traceiter_android_rvh_die_kernel_fault
@ -339,6 +356,7 @@
__traceiter_android_vh_try_to_freeze_todo
__traceiter_android_vh_try_to_freeze_todo_unfrozen
__traceiter_android_vh_watchdog_timer_softlockup
__traceiter_android_vh_wq_lockup_pool
__traceiter_block_rq_insert
__traceiter_console
__traceiter_hrtimer_expire_entry
@ -380,6 +398,7 @@
__tracepoint_android_vh_try_to_freeze_todo
__tracepoint_android_vh_try_to_freeze_todo_unfrozen
__tracepoint_android_vh_watchdog_timer_softlockup
__tracepoint_android_vh_wq_lockup_pool
__tracepoint_block_rq_insert
__tracepoint_console
__tracepoint_hrtimer_expire_entry
@ -399,6 +418,7 @@
up_write
usb_alloc_dev
usb_gstrings_attach
usb_set_configuration
usbnet_get_endpoints
usbnet_link_change
usb_set_device_state


@ -416,6 +416,7 @@
device_release_driver
device_remove_bin_file
device_remove_file
device_remove_file_self
device_rename
__device_reset
device_set_of_node_from_dev
@ -429,6 +430,22 @@
_dev_info
__dev_kfree_skb_any
__dev_kfree_skb_irq
devlink_alloc_ns
devlink_flash_update_status_notify
devlink_fmsg_binary_pair_nest_end
devlink_fmsg_binary_pair_nest_start
devlink_fmsg_binary_put
devlink_free
devlink_health_report
devlink_health_reporter_create
devlink_health_reporter_destroy
devlink_health_reporter_priv
devlink_health_reporter_state_update
devlink_priv
devlink_region_create
devlink_region_destroy
devlink_register
devlink_unregister
dev_load
devm_add_action
__devm_alloc_percpu


@ -86,6 +86,7 @@
tcf_exts_validate
tcf_queue_work
__traceiter_android_rvh_post_init_entity_util_avg
__traceiter_android_rvh_rtmutex_force_update
__traceiter_android_vh_account_process_tick_gran
__traceiter_android_vh_account_task_time
__traceiter_android_vh_do_futex
@ -99,11 +100,6 @@
__traceiter_android_vh_record_pcpu_rwsem_starttime
__traceiter_android_vh_record_rtmutex_lock_starttime
__traceiter_android_vh_record_rwsem_lock_starttime
__tracepoint_android_vh_record_mutex_lock_starttime
__tracepoint_android_vh_record_pcpu_rwsem_starttime
__tracepoint_android_vh_record_rtmutex_lock_starttime
__tracepoint_android_vh_record_rwsem_lock_starttime
__trace_puts
__traceiter_android_vh_alter_mutex_list_add
__traceiter_android_vh_binder_free_proc
__traceiter_android_vh_binder_has_work_ilocked
@ -121,8 +117,11 @@
__traceiter_android_vh_binder_thread_release
__traceiter_android_vh_binder_wait_for_work
__traceiter_android_vh_cgroup_set_task
__traceiter_android_vh_check_folio_look_around_ref
__traceiter_android_vh_dup_task_struct
__traceiter_android_vh_exit_signal
__traceiter_android_vh_look_around
__traceiter_android_vh_look_around_migrate_folio
__traceiter_android_vh_mem_cgroup_id_remove
__traceiter_android_vh_mem_cgroup_css_offline
__traceiter_android_vh_mem_cgroup_css_online
@ -136,6 +135,7 @@
__traceiter_android_vh_cleanup_old_buffers_bypass
__traceiter_android_vh_dm_bufio_shrink_scan_bypass
__traceiter_android_vh_mutex_unlock_slowpath
__traceiter_android_vh_rtmutex_waiter_prio
__traceiter_android_vh_rwsem_can_spin_on_owner
__traceiter_android_vh_rwsem_opt_spin_finish
__traceiter_android_vh_rwsem_opt_spin_start
@ -143,6 +143,7 @@
__traceiter_android_vh_sched_stat_runtime_rt
__traceiter_android_vh_shrink_node_memcgs
__traceiter_android_vh_sync_txn_recvd
__traceiter_android_vh_task_blocks_on_rtmutex
__traceiter_block_bio_queue
__traceiter_block_getrq
__traceiter_block_rq_complete
@ -156,7 +157,9 @@
__traceiter_sched_stat_wait
__traceiter_sched_waking
__traceiter_task_rename
__traceiter_android_vh_test_clear_look_around_ref
__tracepoint_android_rvh_post_init_entity_util_avg
__tracepoint_android_rvh_rtmutex_force_update
__tracepoint_android_vh_account_process_tick_gran
__tracepoint_android_vh_account_task_time
__tracepoint_android_vh_alter_mutex_list_add
@ -176,6 +179,7 @@
__tracepoint_android_vh_binder_thread_release
__tracepoint_android_vh_binder_wait_for_work
__tracepoint_android_vh_cgroup_set_task
__tracepoint_android_vh_check_folio_look_around_ref
__tracepoint_android_vh_do_futex
__tracepoint_android_vh_dup_task_struct
__tracepoint_android_vh_exit_signal
@ -191,6 +195,8 @@
__tracepoint_android_vh_futex_wake_traverse_plist
__tracepoint_android_vh_futex_wake_up_q_finish
__tracepoint_android_vh_irqtime_account_process_tick
__tracepoint_android_vh_look_around
__tracepoint_android_vh_look_around_migrate_folio
__tracepoint_android_vh_mutex_can_spin_on_owner
__tracepoint_android_vh_mutex_opt_spin_finish
__tracepoint_android_vh_mutex_opt_spin_start
@ -198,6 +204,11 @@
__tracepoint_android_vh_cleanup_old_buffers_bypass
__tracepoint_android_vh_dm_bufio_shrink_scan_bypass
__tracepoint_android_vh_mutex_unlock_slowpath
__tracepoint_android_vh_record_mutex_lock_starttime
__tracepoint_android_vh_record_pcpu_rwsem_starttime
__tracepoint_android_vh_record_rtmutex_lock_starttime
__tracepoint_android_vh_record_rwsem_lock_starttime
__tracepoint_android_vh_rtmutex_waiter_prio
__tracepoint_android_vh_rwsem_can_spin_on_owner
__tracepoint_android_vh_rwsem_opt_spin_finish
__tracepoint_android_vh_rwsem_opt_spin_start
@ -205,6 +216,8 @@
__tracepoint_android_vh_sched_stat_runtime_rt
__tracepoint_android_vh_shrink_node_memcgs
__tracepoint_android_vh_sync_txn_recvd
__tracepoint_android_vh_task_blocks_on_rtmutex
__tracepoint_android_vh_test_clear_look_around_ref
__tracepoint_block_bio_queue
__tracepoint_block_getrq
__tracepoint_block_rq_complete
@ -218,6 +231,7 @@
__tracepoint_sched_stat_wait
__tracepoint_sched_waking
__tracepoint_task_rename
__trace_puts
try_to_free_mem_cgroup_pages
typec_mux_get_drvdata
unregister_memory_notifier
@ -227,3 +241,4 @@
wait_for_completion_killable_timeout
wakeup_source_remove
wq_worker_comm
zero_pfn


@ -369,15 +369,19 @@
devm_clk_put
devm_device_add_group
devm_device_add_groups
devm_device_remove_group
__devm_drm_dev_alloc
devm_drm_panel_bridge_add_typed
devm_extcon_dev_allocate
devm_extcon_dev_register
devm_free_irq
devm_fwnode_gpiod_get_index
devm_fwnode_pwm_get
devm_gen_pool_create
devm_gpiochip_add_data_with_key
devm_gpiod_get
devm_gpiod_get_array
devm_gpiod_get_index_optional
devm_gpiod_get_optional
devm_gpiod_put_array
devm_gpio_request
@ -396,6 +400,7 @@
devm_kmemdup
devm_kstrdup
devm_kstrdup_const
devm_led_classdev_register_ext
devm_mfd_add_devices
devm_nvmem_register
__devm_of_phy_provider_register
@ -410,6 +415,7 @@
devm_platform_ioremap_resource
devm_platform_ioremap_resource_byname
devm_power_supply_register
devm_pwm_get
devm_regmap_add_irq_chip
__devm_regmap_init
__devm_regmap_init_i2c
@ -742,6 +748,7 @@
extcon_register_notifier
extcon_set_property
extcon_set_property_capability
extcon_set_property_sync
extcon_set_state_sync
extcon_unregister_notifier
fasync_helper
@ -962,8 +969,10 @@
int_to_scsilun
iomem_resource
iommu_alloc_resv_region
iommu_attach_device
iommu_attach_device_pasid
iommu_attach_group
iommu_detach_device
iommu_detach_device_pasid
iommu_device_register
iommu_device_sysfs_add
@ -1124,6 +1133,7 @@
kvmalloc_node
led_classdev_register_ext
led_classdev_unregister
led_init_default_state_get
__list_add_valid
__list_del_entry_valid
list_sort
@ -1505,6 +1515,7 @@
__put_task_struct
put_unused_fd
put_vaddr_frames
pwm_apply_state
queue_delayed_work_on
queue_work_on
radix_tree_delete_item
@ -1607,6 +1618,7 @@
regulator_map_voltage_linear
regulator_notifier_call_chain
regulator_put
regulator_set_active_discharge_regmap
regulator_set_voltage
regulator_set_voltage_sel_regmap
regulator_unregister
@ -1998,10 +2010,17 @@
__traceiter_device_pm_callback_end
__traceiter_device_pm_callback_start
__traceiter_gpu_mem_total
__traceiter_hrtimer_expire_entry
__traceiter_hrtimer_expire_exit
__traceiter_irq_handler_entry
__traceiter_irq_handler_exit
__traceiter_mmap_lock_acquire_returned
__traceiter_mmap_lock_released
__traceiter_mmap_lock_start_locking
__traceiter_sched_switch
__traceiter_suspend_resume
__traceiter_workqueue_execute_end
__traceiter_workqueue_execute_start
trace_output_call
__tracepoint_android_rvh_typec_tcpci_get_vbus
__tracepoint_android_vh_cpu_idle_enter
@ -2027,12 +2046,19 @@
__tracepoint_device_pm_callback_end
__tracepoint_device_pm_callback_start
__tracepoint_gpu_mem_total
__tracepoint_hrtimer_expire_entry
__tracepoint_hrtimer_expire_exit
__tracepoint_irq_handler_entry
__tracepoint_irq_handler_exit
__tracepoint_mmap_lock_acquire_returned
__tracepoint_mmap_lock_released
__tracepoint_mmap_lock_start_locking
tracepoint_probe_register
tracepoint_probe_unregister
__tracepoint_sched_switch
__tracepoint_suspend_resume
__tracepoint_workqueue_execute_end
__tracepoint_workqueue_execute_start
trace_print_array_seq
trace_print_bitmask_seq
trace_print_flags_seq
@ -2264,6 +2290,9 @@
__xfrm_state_destroy
xfrm_state_lookup_byspi
xfrm_stateonly_find
xhci_address_device
xhci_bus_resume
xhci_bus_suspend
xhci_gen_setup
xhci_init_driver
xhci_resume


@ -1550,6 +1550,7 @@
iommu_group_get_iommudata
iommu_group_put
iommu_group_ref_get
iommu_group_remove_device
iommu_group_set_iommudata
iommu_iova_to_phys
iommu_map
@ -3282,6 +3283,7 @@
__traceiter_android_rvh_after_dequeue_task
__traceiter_android_rvh_after_enqueue_task
__traceiter_android_rvh_audio_usb_offload_disconnect
__traceiter_android_rvh_before_do_sched_yield
__traceiter_android_rvh_build_perf_domains
__traceiter_android_rvh_can_migrate_task
__traceiter_android_rvh_check_preempt_tick
@ -3425,6 +3427,7 @@
__tracepoint_android_rvh_after_dequeue_task
__tracepoint_android_rvh_after_enqueue_task
__tracepoint_android_rvh_audio_usb_offload_disconnect
__tracepoint_android_rvh_before_do_sched_yield
__tracepoint_android_rvh_build_perf_domains
__tracepoint_android_rvh_can_migrate_task
__tracepoint_android_rvh_check_preempt_tick


@ -574,6 +574,8 @@
skb_unlink
sk_error_report
sk_free
snd_ctl_find_id
snd_info_get_line
snprintf
sock_alloc_send_pskb
sock_create_kern
@ -714,6 +716,7 @@
__traceiter_android_vh_get_thermal_zone_device
__traceiter_android_vh_modify_thermal_request_freq
__traceiter_android_vh_modify_thermal_target_freq
__traceiter_android_vh_regmap_update
__traceiter_android_vh_scheduler_tick
__traceiter_android_vh_thermal_power_cap
__traceiter_android_vh_thermal_register
@ -792,6 +795,7 @@
__tracepoint_android_vh_get_thermal_zone_device
__tracepoint_android_vh_modify_thermal_request_freq
__tracepoint_android_vh_modify_thermal_target_freq
__tracepoint_android_vh_regmap_update
__tracepoint_android_vh_scheduler_tick
__tracepoint_android_vh_thermal_power_cap
__tracepoint_android_vh_thermal_register
@ -1576,6 +1580,11 @@
spi_controller_suspend
spi_finalize_current_transfer
# required by sprd-audio-codec.ko
regulator_register
snd_pcm_rate_bit_to_rate
snd_pcm_rate_to_rate_bit
# required by sprd-bc1p2.ko
kthread_flush_worker
__kthread_init_worker
@ -1660,14 +1669,18 @@
drm_poll
drm_read
drm_release
drm_send_event_timestamp_locked
drm_vblank_init
mipi_dsi_host_register
mipi_dsi_host_unregister
mipi_dsi_set_maximum_return_packet_size
of_drm_find_bridge
of_get_drm_display_mode
of_graph_get_port_by_id
of_graph_get_remote_node
__platform_register_drivers
platform_unregister_drivers
regmap_get_reg_stride
# required by sprd-iommu.ko
iommu_device_register
@ -1759,6 +1772,9 @@
devm_watchdog_register_device
watchdog_init_timeout
# required by sprdbt_tty.ko
tty_port_link_device
# required by sysdump.ko
android_rvh_probe_register
input_close_device


@ -419,6 +419,7 @@
__traceiter_android_vh_try_to_freeze_todo
__traceiter_android_vh_try_to_freeze_todo_unfrozen
__traceiter_android_vh_try_to_unmap_one
__traceiter_android_vh_tune_scan_type
__traceiter_android_vh_ufs_check_int_errors
__traceiter_android_vh_ufs_clock_scaling
__traceiter_android_vh_ufs_compl_command
@ -588,6 +589,7 @@
__tracepoint_android_vh_try_to_unmap_one
__tracepoint_android_vh_try_to_freeze_todo
__tracepoint_android_vh_try_to_freeze_todo_unfrozen
__tracepoint_android_vh_tune_scan_type
__tracepoint_android_vh_ufs_check_int_errors
__tracepoint_android_vh_ufs_clock_scaling
__tracepoint_android_vh_ufs_compl_command


@ -218,6 +218,12 @@
kernfs_path_from_node
blkcg_activate_policy
#required by mq-deadline module
blk_mq_debugfs_rq_show
seq_list_start
seq_list_next
__blk_mq_debugfs_rq_show
#required by metis.ko module
__traceiter_android_vh_rwsem_read_wait_start
__traceiter_android_vh_rwsem_write_wait_start
@ -306,3 +312,23 @@
__tracepoint_android_vh_rmqueue_smallest_bypass
__traceiter_android_vh_free_one_page_bypass
__tracepoint_android_vh_free_one_page_bypass
# required by SAGT module
__traceiter_android_rvh_before_do_sched_yield
__tracepoint_android_rvh_before_do_sched_yield
#required by minetwork.ko
sock_wake_async
bpf_map_put
bpf_map_inc
__dev_direct_xmit
napi_busy_loop
int_active_memcg
bpf_redirect_info
dma_need_sync
page_pool_put_page_bulk
build_skb_around
#required by xm_ispv4_pcie.ko
pci_ioremap_bar
pci_disable_pcie_error_reporting


@ -336,12 +336,10 @@ wpan_phy_new
wpan_phy_register
wpan_phy_unregister
wwan_create_port
wwan_get_debugfs_dir
wwan_port_get_drvdata
wwan_port_rx
wwan_port_txoff
wwan_port_txon
wwan_put_debugfs_dir
wwan_register_ops
wwan_remove_port
wwan_unregister_ops


@ -336,12 +336,10 @@ wpan_phy_new
wpan_phy_register
wpan_phy_unregister
wwan_create_port
wwan_get_debugfs_dir
wwan_port_get_drvdata
wwan_port_rx
wwan_port_txoff
wwan_port_txon
wwan_put_debugfs_dir
wwan_register_ops
wwan_remove_port
wwan_unregister_ops


@ -28,6 +28,7 @@ config ALPHA
select GENERIC_SMP_IDLE_THREAD
select HAVE_ARCH_AUDITSYSCALL
select HAVE_MOD_ARCH_SPECIFIC
select LOCK_MM_AND_FIND_VMA
select MODULES_USE_ELF_RELA
select ODD_RT_SIGACTION
select OLD_SIGSUSPEND


@ -119,20 +119,12 @@ do_page_fault(unsigned long address, unsigned long mmcsr,
flags |= FAULT_FLAG_USER;
perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
retry:
mmap_read_lock(mm);
vma = find_vma(mm, address);
vma = lock_mm_and_find_vma(mm, address, regs);
if (!vma)
goto bad_area;
if (vma->vm_start <= address)
goto good_area;
if (!(vma->vm_flags & VM_GROWSDOWN))
goto bad_area;
if (expand_stack(vma, address))
goto bad_area;
goto bad_area_nosemaphore;
/* Ok, we have a good vm_area for this memory access, so
we can handle it. */
good_area:
si_code = SEGV_ACCERR;
if (cause < 0) {
if (!(vma->vm_flags & VM_EXEC))
@ -189,6 +181,7 @@ do_page_fault(unsigned long address, unsigned long mmcsr,
bad_area:
mmap_read_unlock(mm);
bad_area_nosemaphore:
if (user_mode(regs))
goto do_sigsegv;
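For context, the helper all of these call sites are converted to (added later in this merge by "BACKPORT: mm: introduce new 'lock_mm_and_find_vma()' page fault helper") behaves roughly as sketched below. This is a simplified paraphrase of the generic mm/memory.c version as first introduced, not the exact backported code; follow-up patches in the series move the stack expansion under the mmap write lock:

struct vm_area_struct *lock_mm_and_find_vma(struct mm_struct *mm,
					    unsigned long addr,
					    struct pt_regs *regs)
{
	struct vm_area_struct *vma;

	if (!mmap_read_trylock(mm)) {
		/* Kernel-mode faults may only come from exception-listed code. */
		if (regs && !user_mode(regs) &&
		    !search_exception_tables(instruction_pointer(regs)))
			return NULL;
		mmap_read_lock(mm);
	}

	vma = find_vma(mm, addr);
	if (vma && vma->vm_start <= addr)
		return vma;

	/* Try growing a VM_GROWSDOWN stack VMA down to addr. */
	if (!vma || !(vma->vm_flags & VM_GROWSDOWN) || expand_stack(vma, addr)) {
		/* Failure returns with the lock dropped, hence the
		 * bad_area_nosemaphore labels in the converted callers. */
		mmap_read_unlock(mm);
		return NULL;
	}
	return vma;
}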


@ -41,6 +41,7 @@ config ARC
select HAVE_PERF_EVENTS
select HAVE_SYSCALL_TRACEPOINTS
select IRQ_DOMAIN
select LOCK_MM_AND_FIND_VMA
select MODULES_USE_ELF_RELA
select OF
select OF_EARLY_FLATTREE


@ -113,15 +113,9 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
retry:
mmap_read_lock(mm);
vma = find_vma(mm, address);
vma = lock_mm_and_find_vma(mm, address, regs);
if (!vma)
goto bad_area;
if (unlikely(address < vma->vm_start)) {
if (!(vma->vm_flags & VM_GROWSDOWN) || expand_stack(vma, address))
goto bad_area;
}
goto bad_area_nosemaphore;
/*
* vm_area is good, now check permissions for this memory access
@ -161,6 +155,7 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
bad_area:
mmap_read_unlock(mm);
bad_area_nosemaphore:
/*
* Major/minor page fault accounting
* (in case of retry we only land here once)


@ -122,6 +122,7 @@ config ARM
select HAVE_UID16
select HAVE_VIRT_CPU_ACCOUNTING_GEN
select IRQ_FORCED_THREADING
select LOCK_MM_AND_FIND_VMA
select MODULES_USE_ELF_REL
select NEED_DMA_MAP_STATE
select OF_EARLY_FLATTREE if OF


@ -231,37 +231,11 @@ static inline bool is_permission_fault(unsigned int fsr)
return false;
}
static vm_fault_t __kprobes
__do_page_fault(struct mm_struct *mm, unsigned long addr, unsigned int flags,
unsigned long vma_flags, struct pt_regs *regs)
{
struct vm_area_struct *vma = find_vma(mm, addr);
if (unlikely(!vma))
return VM_FAULT_BADMAP;
if (unlikely(vma->vm_start > addr)) {
if (!(vma->vm_flags & VM_GROWSDOWN))
return VM_FAULT_BADMAP;
if (addr < FIRST_USER_ADDRESS)
return VM_FAULT_BADMAP;
if (expand_stack(vma, addr))
return VM_FAULT_BADMAP;
}
/*
* ok, we have a good vm_area for this memory access, check the
* permissions on the VMA allow for the fault which occurred.
*/
if (!(vma->vm_flags & vma_flags))
return VM_FAULT_BADACCESS;
return handle_mm_fault(vma, addr & PAGE_MASK, flags, regs);
}
static int __kprobes
do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
{
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma;
int sig, code;
vm_fault_t fault;
unsigned int flags = FAULT_FLAG_DEFAULT;
@ -300,31 +274,21 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, addr);
/*
* As per x86, we may deadlock here. However, since the kernel only
* validly references user space from well defined areas of the code,
* we can bug out early if this is from code which shouldn't.
*/
if (!mmap_read_trylock(mm)) {
if (!user_mode(regs) && !search_exception_tables(regs->ARM_pc))
goto no_context;
retry:
mmap_read_lock(mm);
} else {
/*
* The above down_read_trylock() might have succeeded in
* which case, we'll have missed the might_sleep() from
* down_read()
*/
might_sleep();
#ifdef CONFIG_DEBUG_VM
if (!user_mode(regs) &&
!search_exception_tables(regs->ARM_pc))
goto no_context;
#endif
vma = lock_mm_and_find_vma(mm, addr, regs);
if (unlikely(!vma)) {
fault = VM_FAULT_BADMAP;
goto bad_area;
}
fault = __do_page_fault(mm, addr, flags, vm_flags, regs);
/*
* ok, we have a good vm_area for this memory access, check the
* permissions on the VMA allow for the fault which occurred.
*/
if (!(vma->vm_flags & vm_flags))
fault = VM_FAULT_BADACCESS;
else
fault = handle_mm_fault(vma, addr & PAGE_MASK, flags, regs);
/* If we need to retry but a fatal signal is pending, handle the
* signal first. We do not need to release the mmap_lock because
@ -355,6 +319,7 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
if (likely(!(fault & (VM_FAULT_ERROR | VM_FAULT_BADMAP | VM_FAULT_BADACCESS))))
return 0;
bad_area:
/*
* If we are in kernel mode at this point, we
* have no context to handle this fault with.


@ -216,6 +216,7 @@ config ARM64
select IRQ_DOMAIN
select IRQ_FORCED_THREADING
select KASAN_VMALLOC if KASAN
select LOCK_MM_AND_FIND_VMA
select MODULES_USE_ELF_RELA
select NEED_DMA_MAP_STATE
select NEED_SG_DMA_LENGTH


@ -3,7 +3,7 @@
# CONFIG_BITFIELD_KUNIT is not set
# CONFIG_BITS_TEST is not set
CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y
CONFIG_CMDLINE="console=ttyMSM0,115200n8 kasan.stacktrace=off stack_depot_disable=off page_owner=on no_hash_pointers panic_on_taint=0x20 page_pinner=on"
CONFIG_CMDLINE="console=ttyMSM0,115200n8 kasan.stacktrace=off stack_depot_disable=off page_owner=on no_hash_pointers panic_on_taint=0x20 page_pinner=on slub_debug=FZP,zs_handle,zspage;FZPU"
CONFIG_DEBUG_ATOMIC_SLEEP=y
CONFIG_DEBUG_IRQFLAGS=y
CONFIG_DEBUG_KMEMLEAK=y


@ -39,7 +39,12 @@ static bool (*default_trap_handler)(struct kvm_cpu_context *host_ctxt);
int __pkvm_register_host_smc_handler(bool (*cb)(struct kvm_cpu_context *))
{
return cmpxchg(&default_host_smc_handler, NULL, cb) ? -EBUSY : 0;
/*
* Paired with smp_load_acquire(&default_host_smc_handler) in
* handle_host_smc(). Ensure memory stores happening during a pKVM module
* init are observed before executing the callback.
*/
return cmpxchg_release(&default_host_smc_handler, NULL, cb) ? -EBUSY : 0;
}
int __pkvm_register_default_trap_handler(bool (*cb)(struct kvm_cpu_context *))
@ -1376,7 +1381,7 @@ static void handle_host_smc(struct kvm_cpu_context *host_ctxt)
handled = kvm_host_psci_handler(host_ctxt);
if (!handled)
handled = kvm_host_ffa_handler(host_ctxt);
if (!handled && READ_ONCE(default_host_smc_handler))
if (!handled && smp_load_acquire(&default_host_smc_handler))
handled = default_host_smc_handler(host_ctxt);
if (!handled)
__kvm_hyp_host_forward_smc(host_ctxt);
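The ordering pattern used throughout these pKVM fixes, reduced to a minimal sketch (illustrative only, not the pKVM code itself): the release semantics on the cmpxchg that publishes the callback pair with the acquire semantics on the load, so a reader that sees the pointer is guaranteed to also see every store the module made before registering.

static bool (*handler)(struct kvm_cpu_context *);

/* Writer (module init): set up state, then publish the callback.
 * cmpxchg_release() orders all prior stores before the pointer
 * becomes visible. */
static int register_handler(bool (*cb)(struct kvm_cpu_context *))
{
	return cmpxchg_release(&handler, NULL, cb) ? -EBUSY : 0;
}

/* Reader (SMC/fault path): smp_load_acquire() orders the pointer load
 * before anything the callback subsequently reads. */
static bool call_handler(struct kvm_cpu_context *ctxt)
{
	bool (*cb)(struct kvm_cpu_context *) = smp_load_acquire(&handler);

	return cb ? cb(ctxt) : false;
}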


@ -28,14 +28,19 @@ struct kvm_host_psci_config __ro_after_init kvm_host_psci_config;
static void (*pkvm_psci_notifier)(enum pkvm_psci_notification, struct kvm_cpu_context *);
static void pkvm_psci_notify(enum pkvm_psci_notification notif, struct kvm_cpu_context *host_ctxt)
{
if (READ_ONCE(pkvm_psci_notifier))
if (smp_load_acquire(&pkvm_psci_notifier))
pkvm_psci_notifier(notif, host_ctxt);
}
#ifdef CONFIG_MODULES
int __pkvm_register_psci_notifier(void (*cb)(enum pkvm_psci_notification, struct kvm_cpu_context *))
{
return cmpxchg(&pkvm_psci_notifier, NULL, cb) ? -EBUSY : 0;
/*
* Paired with smp_load_acquire(&pkvm_psci_notifier) in
* pkvm_psci_notify(). Ensure memory stores happening during a pKVM module
* init are observed before executing the callback.
*/
return cmpxchg_release(&pkvm_psci_notifier, NULL, cb) ? -EBUSY : 0;
}
#endif


@ -35,7 +35,8 @@ static inline void __hyp_putx4n(unsigned long x, int n)
static inline bool hyp_serial_enabled(void)
{
return !!READ_ONCE(__hyp_putc);
/* Paired with __pkvm_register_serial_driver()'s cmpxchg */
return !!smp_load_acquire(&__hyp_putc);
}
void hyp_puts(const char *s)
@ -64,5 +65,10 @@ void hyp_putc(char c)
int __pkvm_register_serial_driver(void (*cb)(char))
{
return cmpxchg(&__hyp_putc, NULL, cb) ? -EBUSY : 0;
/*
* Paired with smp_load_acquire(&__hyp_putc) in
* hyp_serial_enabled(). Ensure memory stores happening during a pKVM
* module init are observed before executing the callback.
*/
return cmpxchg_release(&__hyp_putc, NULL, cb) ? -EBUSY : 0;
}


@ -502,27 +502,14 @@ static void do_bad_area(unsigned long far, unsigned long esr,
#define VM_FAULT_BADMAP 0x010000
#define VM_FAULT_BADACCESS 0x020000
static vm_fault_t __do_page_fault(struct mm_struct *mm, unsigned long addr,
static vm_fault_t __do_page_fault(struct mm_struct *mm,
struct vm_area_struct *vma, unsigned long addr,
unsigned int mm_flags, unsigned long vm_flags,
struct pt_regs *regs)
{
struct vm_area_struct *vma = find_vma(mm, addr);
if (unlikely(!vma))
return VM_FAULT_BADMAP;
/*
* Ok, we have a good vm_area for this memory access, so we can handle
* it.
*/
if (unlikely(vma->vm_start > addr)) {
if (!(vma->vm_flags & VM_GROWSDOWN))
return VM_FAULT_BADMAP;
if (expand_stack(vma, addr))
return VM_FAULT_BADMAP;
}
/*
* Check that the permissions on the VMA allow for the fault which
* occurred.
*/
@ -554,9 +541,7 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
unsigned long vm_flags;
unsigned int mm_flags = FAULT_FLAG_DEFAULT;
unsigned long addr = untagged_addr(far);
#ifdef CONFIG_PER_VMA_LOCK
struct vm_area_struct *vma;
#endif
if (kprobe_page_fault(regs, esr))
return 0;
@ -614,7 +599,6 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, addr);
#ifdef CONFIG_PER_VMA_LOCK
if (!(mm_flags & FAULT_FLAG_USER))
goto lock_mmap;
@ -627,7 +611,8 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
goto lock_mmap;
}
fault = handle_mm_fault(vma, addr, mm_flags | FAULT_FLAG_VMA_LOCK, regs);
vma_end_read(vma);
if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
vma_end_read(vma);
if (!(fault & VM_FAULT_RETRY)) {
count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
@ -642,32 +627,15 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
return 0;
}
lock_mmap:
#endif /* CONFIG_PER_VMA_LOCK */
/*
* As per x86, we may deadlock here. However, since the kernel only
* validly references user space from well defined areas of the code,
* we can bug out early if this is from code which shouldn't.
*/
if (!mmap_read_trylock(mm)) {
if (!user_mode(regs) && !search_exception_tables(regs->pc))
goto no_context;
retry:
mmap_read_lock(mm);
} else {
/*
* The above mmap_read_trylock() might have succeeded in which
* case, we'll have missed the might_sleep() from down_read().
*/
might_sleep();
#ifdef CONFIG_DEBUG_VM
if (!user_mode(regs) && !search_exception_tables(regs->pc)) {
mmap_read_unlock(mm);
goto no_context;
}
#endif
vma = lock_mm_and_find_vma(mm, addr, regs);
if (unlikely(!vma)) {
fault = VM_FAULT_BADMAP;
goto done;
}
fault = __do_page_fault(mm, addr, mm_flags, vm_flags, regs);
fault = __do_page_fault(mm, vma, addr, mm_flags, vm_flags, regs);
/* Quick path to respond to signals */
if (fault_signal_pending(fault, regs)) {
@ -686,9 +654,7 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
}
mmap_read_unlock(mm);
#ifdef CONFIG_PER_VMA_LOCK
done:
#endif
/*
* Handle the "normal" (no error) case first.
*/


@ -96,6 +96,7 @@ config CSKY
select HAVE_RSEQ
select HAVE_STACKPROTECTOR
select HAVE_SYSCALL_TRACEPOINTS
select LOCK_MM_AND_FIND_VMA
select MAY_HAVE_SPARSE_IRQ
select MODULES_USE_ELF_RELA if MODULES
select OF


@ -97,13 +97,12 @@ static inline void mm_fault_error(struct pt_regs *regs, unsigned long addr, vm_f
BUG();
}
static inline void bad_area(struct pt_regs *regs, struct mm_struct *mm, int code, unsigned long addr)
static inline void bad_area_nosemaphore(struct pt_regs *regs, struct mm_struct *mm, int code, unsigned long addr)
{
/*
* Something tried to access memory that isn't in our memory map.
* Fix it, but check if it's kernel or user first.
*/
mmap_read_unlock(mm);
/* User mode accesses just cause a SIGSEGV */
if (user_mode(regs)) {
do_trap(regs, SIGSEGV, code, addr);
@ -238,20 +237,9 @@ asmlinkage void do_page_fault(struct pt_regs *regs)
if (is_write(regs))
flags |= FAULT_FLAG_WRITE;
retry:
mmap_read_lock(mm);
vma = find_vma(mm, addr);
vma = lock_mm_and_find_vma(mm, addr, regs);
if (unlikely(!vma)) {
bad_area(regs, mm, code, addr);
return;
}
if (likely(vma->vm_start <= addr))
goto good_area;
if (unlikely(!(vma->vm_flags & VM_GROWSDOWN))) {
bad_area(regs, mm, code, addr);
return;
}
if (unlikely(expand_stack(vma, addr))) {
bad_area(regs, mm, code, addr);
bad_area_nosemaphore(regs, mm, code, addr);
return;
}
@ -259,11 +247,11 @@ asmlinkage void do_page_fault(struct pt_regs *regs)
* Ok, we have a good vm_area for this memory access, so
* we can handle it.
*/
good_area:
code = SEGV_ACCERR;
if (unlikely(access_error(regs, vma))) {
bad_area(regs, mm, code, addr);
mmap_read_unlock(mm);
bad_area_nosemaphore(regs, mm, code, addr);
return;
}


@ -28,6 +28,7 @@ config HEXAGON
select GENERIC_SMP_IDLE_THREAD
select STACKTRACE_SUPPORT
select GENERIC_CLOCKEVENTS_BROADCAST
select LOCK_MM_AND_FIND_VMA
select MODULES_USE_ELF_RELA
select GENERIC_CPU_DEVICES
select ARCH_WANT_LD_ORPHAN_WARN


@ -57,21 +57,10 @@ void do_page_fault(unsigned long address, long cause, struct pt_regs *regs)
perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
retry:
mmap_read_lock(mm);
vma = find_vma(mm, address);
if (!vma)
goto bad_area;
vma = lock_mm_and_find_vma(mm, address, regs);
if (unlikely(!vma))
goto bad_area_nosemaphore;
if (vma->vm_start <= address)
goto good_area;
if (!(vma->vm_flags & VM_GROWSDOWN))
goto bad_area;
if (expand_stack(vma, address))
goto bad_area;
good_area:
/* Address space is OK. Now check access rights. */
si_code = SEGV_ACCERR;
@ -140,6 +129,7 @@ void do_page_fault(unsigned long address, long cause, struct pt_regs *regs)
bad_area:
mmap_read_unlock(mm);
bad_area_nosemaphore:
if (user_mode(regs)) {
force_sig_fault(SIGSEGV, si_code, (void __user *)address);
return;


@ -110,10 +110,12 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re
* register backing store that needs to expand upwards, in
* this case vma will be null, but prev_vma will be non-null
*/
if (( !vma && prev_vma ) || (address < vma->vm_start) )
goto check_expansion;
if (( !vma && prev_vma ) || (address < vma->vm_start) ) {
vma = expand_stack(mm, address);
if (!vma)
goto bad_area_nosemaphore;
}
good_area:
code = SEGV_ACCERR;
/* OK, we've got a good vm_area for this memory area. Check the access permissions: */
@ -174,35 +176,9 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re
mmap_read_unlock(mm);
return;
check_expansion:
if (!(prev_vma && (prev_vma->vm_flags & VM_GROWSUP) && (address == prev_vma->vm_end))) {
if (!vma)
goto bad_area;
if (!(vma->vm_flags & VM_GROWSDOWN))
goto bad_area;
if (REGION_NUMBER(address) != REGION_NUMBER(vma->vm_start)
|| REGION_OFFSET(address) >= RGN_MAP_LIMIT)
goto bad_area;
if (expand_stack(vma, address))
goto bad_area;
} else {
vma = prev_vma;
if (REGION_NUMBER(address) != REGION_NUMBER(vma->vm_start)
|| REGION_OFFSET(address) >= RGN_MAP_LIMIT)
goto bad_area;
/*
* Since the register backing store is accessed sequentially,
* we disallow growing it by more than a page at a time.
*/
if (address > vma->vm_end + PAGE_SIZE - sizeof(long))
goto bad_area;
if (expand_upwards(vma, address))
goto bad_area;
}
goto good_area;
bad_area:
mmap_read_unlock(mm);
bad_area_nosemaphore:
if ((isr & IA64_ISR_SP)
|| ((isr & IA64_ISR_NA) && (isr & IA64_ISR_CODE_MASK) == IA64_ISR_CODE_LFETCH))
{


@ -107,6 +107,7 @@ config LOONGARCH
select HAVE_VIRT_CPU_ACCOUNTING_GEN if !SMP
select IRQ_FORCED_THREADING
select IRQ_LOONGARCH_CPU
select LOCK_MM_AND_FIND_VMA
select MMU_GATHER_MERGE_VMAS if MMU
select MODULES_USE_ELF_RELA if MODULES
select NEED_PER_CPU_EMBED_FIRST_CHUNK


@ -166,22 +166,18 @@ static void __kprobes __do_page_fault(struct pt_regs *regs,
perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
retry:
mmap_read_lock(mm);
vma = find_vma(mm, address);
if (!vma)
goto bad_area;
if (vma->vm_start <= address)
goto good_area;
if (!(vma->vm_flags & VM_GROWSDOWN))
goto bad_area;
if (!expand_stack(vma, address))
goto good_area;
vma = lock_mm_and_find_vma(mm, address, regs);
if (unlikely(!vma))
goto bad_area_nosemaphore;
goto good_area;
/*
* Something tried to access memory that isn't in our memory map..
* Fix it, but check if it's kernel or user first..
*/
bad_area:
mmap_read_unlock(mm);
bad_area_nosemaphore:
do_sigsegv(regs, write, address, si_code);
return;


@ -105,8 +105,9 @@ int do_page_fault(struct pt_regs *regs, unsigned long address,
if (address + 256 < rdusp())
goto map_err;
}
if (expand_stack(vma, address))
goto map_err;
vma = expand_stack(mm, address);
if (!vma)
goto map_err_nosemaphore;
/*
* Ok, we have a good vm_area for this memory access, so
@ -193,10 +194,12 @@ int do_page_fault(struct pt_regs *regs, unsigned long address,
goto send_sig;
map_err:
mmap_read_unlock(mm);
map_err_nosemaphore:
current->thread.signo = SIGSEGV;
current->thread.code = SEGV_MAPERR;
current->thread.faddr = address;
goto send_sig;
return send_fault_sig(regs);
acc_err:
current->thread.signo = SIGSEGV;


@ -192,8 +192,9 @@ void do_page_fault(struct pt_regs *regs, unsigned long address,
&& (kernel_mode(regs) || !store_updates_sp(regs)))
goto bad_area;
}
if (expand_stack(vma, address))
goto bad_area;
vma = expand_stack(mm, address);
if (!vma)
goto bad_area_nosemaphore;
good_area:
code = SEGV_ACCERR;


@ -93,6 +93,7 @@ config MIPS
select HAVE_VIRT_CPU_ACCOUNTING_GEN if 64BIT || !SMP
select IRQ_FORCED_THREADING
select ISA if EISA
select LOCK_MM_AND_FIND_VMA
select MODULES_USE_ELF_REL if MODULES
select MODULES_USE_ELF_RELA if MODULES && 64BIT
select PERF_USE_VMALLOC


@ -99,21 +99,13 @@ static void __do_page_fault(struct pt_regs *regs, unsigned long write,
perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
retry:
mmap_read_lock(mm);
vma = find_vma(mm, address);
vma = lock_mm_and_find_vma(mm, address, regs);
if (!vma)
goto bad_area;
if (vma->vm_start <= address)
goto good_area;
if (!(vma->vm_flags & VM_GROWSDOWN))
goto bad_area;
if (expand_stack(vma, address))
goto bad_area;
goto bad_area_nosemaphore;
/*
* Ok, we have a good vm_area for this memory access, so
* we can handle it..
*/
good_area:
si_code = SEGV_ACCERR;
if (write) {


@ -16,6 +16,7 @@ config NIOS2
select HAVE_ARCH_TRACEHOOK
select HAVE_ARCH_KGDB
select IRQ_DOMAIN
select LOCK_MM_AND_FIND_VMA
select MODULES_USE_ELF_RELA
select OF
select OF_EARLY_FLATTREE


@ -86,27 +86,14 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long cause,
perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
if (!mmap_read_trylock(mm)) {
if (!user_mode(regs) && !search_exception_tables(regs->ea))
goto bad_area_nosemaphore;
retry:
mmap_read_lock(mm);
}
vma = find_vma(mm, address);
vma = lock_mm_and_find_vma(mm, address, regs);
if (!vma)
goto bad_area;
if (vma->vm_start <= address)
goto good_area;
if (!(vma->vm_flags & VM_GROWSDOWN))
goto bad_area;
if (expand_stack(vma, address))
goto bad_area;
goto bad_area_nosemaphore;
/*
* Ok, we have a good vm_area for this memory access, so
* we can handle it..
*/
good_area:
code = SEGV_ACCERR;
switch (cause) {


@ -127,8 +127,9 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long address,
if (address + PAGE_SIZE < regs->sp)
goto bad_area;
}
if (expand_stack(vma, address))
goto bad_area;
vma = expand_stack(mm, address);
if (!vma)
goto bad_area_nosemaphore;
/*
* Ok, we have a good vm_area for this memory access, so


@ -288,15 +288,19 @@ void do_page_fault(struct pt_regs *regs, unsigned long code,
retry:
mmap_read_lock(mm);
vma = find_vma_prev(mm, address, &prev_vma);
if (!vma || address < vma->vm_start)
goto check_expansion;
if (!vma || address < vma->vm_start) {
if (!prev_vma || !(prev_vma->vm_flags & VM_GROWSUP))
goto bad_area;
vma = expand_stack(mm, address);
if (!vma)
goto bad_area_nosemaphore;
}
/*
* Ok, we have a good vm_area for this memory access. We still need to
* check the access permissions.
*/
good_area:
if ((vma->vm_flags & acc_type) != acc_type)
goto bad_area;
@ -342,17 +346,13 @@ void do_page_fault(struct pt_regs *regs, unsigned long code,
mmap_read_unlock(mm);
return;
check_expansion:
vma = prev_vma;
if (vma && (expand_stack(vma, address) == 0))
goto good_area;
/*
* Something tried to access memory that isn't in our memory map..
*/
bad_area:
mmap_read_unlock(mm);
bad_area_nosemaphore:
if (user_mode(regs)) {
int signo, si_code;
@ -444,7 +444,7 @@ handle_nadtlb_fault(struct pt_regs *regs)
{
unsigned long insn = regs->iir;
int breg, treg, xreg, val = 0;
struct vm_area_struct *vma, *prev_vma;
struct vm_area_struct *vma;
struct task_struct *tsk;
struct mm_struct *mm;
unsigned long address;
@ -480,7 +480,7 @@ handle_nadtlb_fault(struct pt_regs *regs)
/* Search for VMA */
address = regs->ior;
mmap_read_lock(mm);
vma = find_vma_prev(mm, address, &prev_vma);
vma = vma_lookup(mm, address);
mmap_read_unlock(mm);
/*
@ -489,7 +489,6 @@ handle_nadtlb_fault(struct pt_regs *regs)
*/
acc_type = (insn & 0x40) ? VM_WRITE : VM_READ;
if (vma
&& address >= vma->vm_start
&& (vma->vm_flags & acc_type) == acc_type)
val = 1;
}


@ -257,6 +257,7 @@ config PPC
select IRQ_DOMAIN
select IRQ_FORCED_THREADING
select KASAN_VMALLOC if KASAN && MODULES
select LOCK_MM_AND_FIND_VMA
select MMU_GATHER_PAGE_SIZE
select MMU_GATHER_RCU_TABLE_FREE
select MMU_GATHER_MERGE_VMAS


@ -410,6 +410,7 @@ static int kvmppc_memslot_page_merge(struct kvm *kvm,
ret = H_STATE;
break;
}
vma_start_write(vma);
/* Copy vm_flags to avoid partial modifications in ksm_madvise */
vm_flags = vma->vm_flags;
ret = ksm_madvise(vma, vma->vm_start, vma->vm_end,


@ -143,6 +143,7 @@ static int subpage_walk_pmd_entry(pmd_t *pmd, unsigned long addr,
static const struct mm_walk_ops subpage_walk_ops = {
.pmd_entry = subpage_walk_pmd_entry,
.walk_lock = PGWALK_WRLOCK_VERIFY,
};
static void subpage_mark_vma_nohuge(struct mm_struct *mm, unsigned long addr,


@ -33,19 +33,11 @@ int copro_handle_mm_fault(struct mm_struct *mm, unsigned long ea,
if (mm->pgd == NULL)
return -EFAULT;
mmap_read_lock(mm);
ret = -EFAULT;
vma = find_vma(mm, ea);
vma = lock_mm_and_find_vma(mm, ea, NULL);
if (!vma)
goto out_unlock;
if (ea < vma->vm_start) {
if (!(vma->vm_flags & VM_GROWSDOWN))
goto out_unlock;
if (expand_stack(vma, ea))
goto out_unlock;
}
return -EFAULT;
ret = -EFAULT;
is_write = dsisr & DSISR_ISSTORE;
if (is_write) {
if (!(vma->vm_flags & VM_WRITE))


@ -84,11 +84,6 @@ static int __bad_area(struct pt_regs *regs, unsigned long address, int si_code)
return __bad_area_nosemaphore(regs, address, si_code);
}
static noinline int bad_area(struct pt_regs *regs, unsigned long address)
{
return __bad_area(regs, address, SEGV_MAPERR);
}
static noinline int bad_access_pkey(struct pt_regs *regs, unsigned long address,
struct vm_area_struct *vma)
{
@ -474,7 +469,6 @@ static int ___do_page_fault(struct pt_regs *regs, unsigned long address,
if (is_exec)
flags |= FAULT_FLAG_INSTRUCTION;
#ifdef CONFIG_PER_VMA_LOCK
if (!(flags & FAULT_FLAG_USER))
goto lock_mmap;
@ -494,7 +488,8 @@ static int ___do_page_fault(struct pt_regs *regs, unsigned long address,
}
fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
vma_end_read(vma);
if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
vma_end_read(vma);
if (!(fault & VM_FAULT_RETRY)) {
count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
@ -506,7 +501,6 @@ static int ___do_page_fault(struct pt_regs *regs, unsigned long address,
return user_mode(regs) ? 0 : SIGBUS;
lock_mmap:
#endif /* CONFIG_PER_VMA_LOCK */
/* When running in the kernel we expect faults to occur only to
* addresses in user space. All other faults represent errors in the
@ -515,40 +509,12 @@ static int ___do_page_fault(struct pt_regs *regs, unsigned long address,
* we will deadlock attempting to validate the fault against the
* address space. Luckily the kernel only validly references user
* space from well defined areas of code, which are listed in the
* exceptions table.
*
* As the vast majority of faults will be valid we will only perform
* the source reference check when there is a possibility of a deadlock.
* Attempt to lock the address space, if we cannot we then validate the
* source. If this is invalid we can skip the address space check,
* thus avoiding the deadlock.
* exceptions table. lock_mm_and_find_vma() handles that logic.
*/
if (unlikely(!mmap_read_trylock(mm))) {
if (!is_user && !search_exception_tables(regs->nip))
return bad_area_nosemaphore(regs, address);
retry:
mmap_read_lock(mm);
} else {
/*
* The above down_read_trylock() might have succeeded in
* which case we'll have missed the might_sleep() from
* down_read():
*/
might_sleep();
}
vma = find_vma(mm, address);
vma = lock_mm_and_find_vma(mm, address, regs);
if (unlikely(!vma))
return bad_area(regs, address);
if (unlikely(vma->vm_start > address)) {
if (unlikely(!(vma->vm_flags & VM_GROWSDOWN)))
return bad_area(regs, address);
if (unlikely(expand_stack(vma, address)))
return bad_area(regs, address);
}
return bad_area_nosemaphore(regs, address);
if (unlikely(access_pkey_error(is_write, is_exec,
(error_code & DSISR_KEYFAULT), vma)))
@ -584,9 +550,7 @@ static int ___do_page_fault(struct pt_regs *regs, unsigned long address,
mmap_read_unlock(current->mm);
#ifdef CONFIG_PER_VMA_LOCK
done:
#endif
if (unlikely(fault & VM_FAULT_ERROR))
return mm_fault_error(regs, address, fault);
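
The powerpc hunk above, and the riscv, sh, sparc, x86 and xtensa hunks that follow, all collapse the same open-coded trylock / exception-table / find_vma / expand-stack sequence into lock_mm_and_find_vma(). A minimal compilable model of the logic that helper centralizes; struct fields and stub bodies here are placeholders, only the control flow mirrors the kernel helper:

#include <stdbool.h>
#include <stddef.h>

struct pt_regs { unsigned long ip; };
struct mm_struct { int dummy; };
struct vm_area_struct { unsigned long vm_start; bool grows_down; };

/* Stubs standing in for the real kernel primitives. */
static bool mmap_read_trylock(struct mm_struct *mm) { (void)mm; return true; }
static void mmap_read_lock(struct mm_struct *mm) { (void)mm; }
static void mmap_read_unlock(struct mm_struct *mm) { (void)mm; }
static bool user_mode(struct pt_regs *regs) { (void)regs; return true; }
static bool search_exception_tables(unsigned long ip) { (void)ip; return false; }
static struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr)
{ (void)mm; (void)addr; return NULL; }
static int expand_stack_locked(struct vm_area_struct *vma, unsigned long addr)
{ (void)vma; (void)addr; return 0; }

struct vm_area_struct *lock_mm_and_find_vma(struct mm_struct *mm,
					    unsigned long addr,
					    struct pt_regs *regs)
{
	struct vm_area_struct *vma;

	/* Only pay for the exception-table search when trylock fails. */
	if (!mmap_read_trylock(mm)) {
		if (!user_mode(regs) && !search_exception_tables(regs->ip))
			return NULL;	/* fault from unexpected kernel code */
		mmap_read_lock(mm);
	}

	vma = find_vma(mm, addr);
	if (vma && vma->vm_start <= addr)
		return vma;		/* direct hit, mmap_lock still held */

	/* A gap is only valid directly below a grows-down stack VMA. */
	if (vma && vma->grows_down && expand_stack_locked(vma, addr) == 0)
		return vma;

	mmap_read_unlock(mm);		/* miss: caller gets NULL, unlocked */
	return NULL;
}

This is also why the arch handlers above lose their bad_area() wrappers: on failure the helper returns with mmap_lock already dropped, so only bad_area_nosemaphore() remains meaningful.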


@ -39,6 +39,7 @@ config RISCV
select ARCH_SUPPORTS_DEBUG_PAGEALLOC if MMU
select ARCH_SUPPORTS_HUGETLBFS if MMU
select ARCH_SUPPORTS_PAGE_TABLE_CHECK if MMU
select ARCH_SUPPORTS_PER_VMA_LOCK if MMU
select ARCH_USE_MEMTEST
select ARCH_USE_QUEUED_RWLOCKS
select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT if MMU
@ -113,6 +114,7 @@ config RISCV
select HAVE_RSEQ
select IRQ_DOMAIN
select IRQ_FORCED_THREADING
select LOCK_MM_AND_FIND_VMA
select MODULES_USE_ELF_RELA if MODULES
select MODULE_SECTIONS if MODULES
select OF


@ -83,13 +83,13 @@ static inline void mm_fault_error(struct pt_regs *regs, unsigned long addr, vm_f
BUG();
}
static inline void bad_area(struct pt_regs *regs, struct mm_struct *mm, int code, unsigned long addr)
static inline void
bad_area_nosemaphore(struct pt_regs *regs, int code, unsigned long addr)
{
/*
* Something tried to access memory that isn't in our memory map.
* Fix it, but check if it's kernel or user first.
*/
mmap_read_unlock(mm);
/* User mode accesses just cause a SIGSEGV */
if (user_mode(regs)) {
do_trap(regs, SIGSEGV, code, addr);
@ -99,6 +99,15 @@ static inline void bad_area(struct pt_regs *regs, struct mm_struct *mm, int code
no_context(regs, addr);
}
static inline void
bad_area(struct pt_regs *regs, struct mm_struct *mm, int code,
unsigned long addr)
{
mmap_read_unlock(mm);
bad_area_nosemaphore(regs, code, addr);
}
static inline void vmalloc_fault(struct pt_regs *regs, int code, unsigned long addr)
{
pgd_t *pgd, *pgd_k;
@ -280,24 +289,40 @@ asmlinkage void do_page_fault(struct pt_regs *regs)
flags |= FAULT_FLAG_WRITE;
else if (cause == EXC_INST_PAGE_FAULT)
flags |= FAULT_FLAG_INSTRUCTION;
if (!(flags & FAULT_FLAG_USER))
goto lock_mmap;
vma = lock_vma_under_rcu(mm, addr);
if (!vma)
goto lock_mmap;
if (unlikely(access_error(cause, vma))) {
vma_end_read(vma);
goto lock_mmap;
}
fault = handle_mm_fault(vma, addr, flags | FAULT_FLAG_VMA_LOCK, regs);
if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
vma_end_read(vma);
if (!(fault & VM_FAULT_RETRY)) {
count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
goto done;
}
count_vm_vma_lock_event(VMA_LOCK_RETRY);
if (fault_signal_pending(fault, regs)) {
if (!user_mode(regs))
no_context(regs, addr);
return;
}
lock_mmap:
retry:
mmap_read_lock(mm);
vma = find_vma(mm, addr);
vma = lock_mm_and_find_vma(mm, addr, regs);
if (unlikely(!vma)) {
tsk->thread.bad_cause = cause;
bad_area(regs, mm, code, addr);
return;
}
if (likely(vma->vm_start <= addr))
goto good_area;
if (unlikely(!(vma->vm_flags & VM_GROWSDOWN))) {
tsk->thread.bad_cause = cause;
bad_area(regs, mm, code, addr);
return;
}
if (unlikely(expand_stack(vma, addr))) {
tsk->thread.bad_cause = cause;
bad_area(regs, mm, code, addr);
bad_area_nosemaphore(regs, code, addr);
return;
}
@ -305,7 +330,6 @@ asmlinkage void do_page_fault(struct pt_regs *regs)
* Ok, we have a good vm_area for this memory access, so
* we can handle it.
*/
good_area:
code = SEGV_ACCERR;
if (unlikely(access_error(cause, vma))) {
@ -346,6 +370,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs)
mmap_read_unlock(mm);
done:
if (unlikely(fault & VM_FAULT_ERROR)) {
tsk->thread.bad_cause = cause;
mm_fault_error(regs, addr, fault);
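
This riscv conversion follows the same per-VMA-lock fast path the powerpc, s390 and x86 hunks in this merge use: try lock_vma_under_rcu() first, fall back to the mmap_lock path only when that fails. A compact userspace model of the contract, with stubs replacing the real MM internals; note that after the "drop per-VMA lock on RETRY/COMPLETED" backport, vma_end_read() is only called when neither bit is set, because the core already released the lock otherwise:

#include <stdbool.h>
#include <stddef.h>

#define VM_FAULT_RETRY		0x1u	/* values here are illustrative */
#define VM_FAULT_COMPLETED	0x2u

struct mm_struct { int dummy; };
struct vm_area_struct { int dummy; };

static struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
						 unsigned long addr)
{ (void)mm; (void)addr; return NULL; }
static void vma_end_read(struct vm_area_struct *vma) { (void)vma; }
static bool access_error(struct vm_area_struct *vma) { (void)vma; return false; }
static unsigned int handle_mm_fault_vma_locked(struct vm_area_struct *vma,
					       unsigned long addr)
{ (void)vma; (void)addr; return 0; }

/* Returns true if the fault was fully handled without taking mmap_lock. */
static bool try_vma_locked_fault(struct mm_struct *mm, unsigned long addr)
{
	struct vm_area_struct *vma = lock_vma_under_rcu(mm, addr);
	unsigned int fault;

	if (!vma)
		return false;			/* fall back to mmap_lock */
	if (access_error(vma)) {
		vma_end_read(vma);
		return false;
	}
	fault = handle_mm_fault_vma_locked(vma, addr);
	if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
		vma_end_read(vma);		/* core kept the lock for us */
	return !(fault & VM_FAULT_RETRY);
}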


@ -102,6 +102,7 @@ static const struct mm_walk_ops pageattr_ops = {
.pmd_entry = pageattr_pmd_entry,
.pte_entry = pageattr_pte_entry,
.pte_hole = pageattr_pte_hole,
.walk_lock = PGWALK_RDLOCK,
};
static int __set_memory(unsigned long addr, int numpages, pgprot_t set_mask,


@ -403,7 +403,6 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access)
access = VM_WRITE;
if (access == VM_WRITE)
flags |= FAULT_FLAG_WRITE;
#ifdef CONFIG_PER_VMA_LOCK
if (!(flags & FAULT_FLAG_USER))
goto lock_mmap;
vma = lock_vma_under_rcu(mm, address);
@ -414,7 +413,8 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access)
goto lock_mmap;
}
fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
vma_end_read(vma);
if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
vma_end_read(vma);
if (!(fault & VM_FAULT_RETRY)) {
count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
goto out;
@ -426,7 +426,6 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access)
goto out;
}
lock_mmap:
#endif /* CONFIG_PER_VMA_LOCK */
mmap_read_lock(mm);
gmap = NULL;
@ -453,8 +452,9 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access)
if (unlikely(vma->vm_start > address)) {
if (!(vma->vm_flags & VM_GROWSDOWN))
goto out_up;
if (expand_stack(vma, address))
goto out_up;
vma = expand_stack(mm, address);
if (!vma)
goto out;
}
/*


@ -2510,6 +2510,7 @@ static int thp_split_walk_pmd_entry(pmd_t *pmd, unsigned long addr,
static const struct mm_walk_ops thp_split_walk_ops = {
.pmd_entry = thp_split_walk_pmd_entry,
.walk_lock = PGWALK_WRLOCK_VERIFY,
};
static inline void thp_split_mm(struct mm_struct *mm)
@ -2554,6 +2555,7 @@ static int __zap_zero_pages(pmd_t *pmd, unsigned long start,
static const struct mm_walk_ops zap_zero_walk_ops = {
.pmd_entry = __zap_zero_pages,
.walk_lock = PGWALK_WRLOCK,
};
/*
@ -2655,6 +2657,7 @@ static const struct mm_walk_ops enable_skey_walk_ops = {
.hugetlb_entry = __s390_enable_skey_hugetlb,
.pte_entry = __s390_enable_skey_pte,
.pmd_entry = __s390_enable_skey_pmd,
.walk_lock = PGWALK_WRLOCK,
};
int s390_enable_skey(void)
@ -2692,6 +2695,7 @@ static int __s390_reset_cmma(pte_t *pte, unsigned long addr,
static const struct mm_walk_ops reset_cmma_walk_ops = {
.pte_entry = __s390_reset_cmma,
.walk_lock = PGWALK_WRLOCK,
};
void s390_reset_cmma(struct mm_struct *mm)
@ -2728,6 +2732,7 @@ static int s390_gather_pages(pte_t *ptep, unsigned long addr,
static const struct mm_walk_ops gather_pages_ops = {
.pte_entry = s390_gather_pages,
.walk_lock = PGWALK_RDLOCK,
};
/*


@ -56,6 +56,7 @@ config SUPERH
select HAVE_STACKPROTECTOR
select HAVE_SYSCALL_TRACEPOINTS
select IRQ_FORCED_THREADING
select LOCK_MM_AND_FIND_VMA
select MODULES_USE_ELF_RELA
select NEED_SG_DMA_LENGTH
select NO_DMA if !MMU && !DMA_COHERENT


@ -439,21 +439,9 @@ asmlinkage void __kprobes do_page_fault(struct pt_regs *regs,
}
retry:
mmap_read_lock(mm);
vma = find_vma(mm, address);
vma = lock_mm_and_find_vma(mm, address, regs);
if (unlikely(!vma)) {
bad_area(regs, error_code, address);
return;
}
if (likely(vma->vm_start <= address))
goto good_area;
if (unlikely(!(vma->vm_flags & VM_GROWSDOWN))) {
bad_area(regs, error_code, address);
return;
}
if (unlikely(expand_stack(vma, address))) {
bad_area(regs, error_code, address);
bad_area_nosemaphore(regs, error_code, address);
return;
}
@ -461,7 +449,6 @@ asmlinkage void __kprobes do_page_fault(struct pt_regs *regs,
* Ok, we have a good vm_area for this memory access, so
* we can handle it..
*/
good_area:
if (unlikely(access_error(error_code, vma))) {
bad_area_access_error(regs, error_code, address);
return;


@ -56,6 +56,7 @@ config SPARC32
select DMA_DIRECT_REMAP
select GENERIC_ATOMIC64
select HAVE_UID16
select LOCK_MM_AND_FIND_VMA
select OLD_SIGACTION
select ZONE_DMA


@ -143,28 +143,19 @@ asmlinkage void do_sparc_fault(struct pt_regs *regs, int text_fault, int write,
if (pagefault_disabled() || !mm)
goto no_context;
if (!from_user && address >= PAGE_OFFSET)
goto no_context;
perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
retry:
mmap_read_lock(mm);
if (!from_user && address >= PAGE_OFFSET)
goto bad_area;
vma = find_vma(mm, address);
vma = lock_mm_and_find_vma(mm, address, regs);
if (!vma)
goto bad_area;
if (vma->vm_start <= address)
goto good_area;
if (!(vma->vm_flags & VM_GROWSDOWN))
goto bad_area;
if (expand_stack(vma, address))
goto bad_area;
goto bad_area_nosemaphore;
/*
* Ok, we have a good vm_area for this memory access, so
* we can handle it..
*/
good_area:
code = SEGV_ACCERR;
if (write) {
if (!(vma->vm_flags & VM_WRITE))
@ -318,17 +309,9 @@ static void force_user_fault(unsigned long address, int write)
code = SEGV_MAPERR;
mmap_read_lock(mm);
vma = find_vma(mm, address);
vma = lock_mm_and_find_vma(mm, address, regs);
if (!vma)
goto bad_area;
if (vma->vm_start <= address)
goto good_area;
if (!(vma->vm_flags & VM_GROWSDOWN))
goto bad_area;
if (expand_stack(vma, address))
goto bad_area;
good_area:
goto bad_area_nosemaphore;
code = SEGV_ACCERR;
if (write) {
if (!(vma->vm_flags & VM_WRITE))
@ -347,6 +330,7 @@ static void force_user_fault(unsigned long address, int write)
return;
bad_area:
mmap_read_unlock(mm);
bad_area_nosemaphore:
__do_fault_siginfo(code, SIGSEGV, tsk->thread.kregs, address);
return;


@ -383,8 +383,9 @@ asmlinkage void __kprobes do_sparc64_fault(struct pt_regs *regs)
goto bad_area;
}
}
if (expand_stack(vma, address))
goto bad_area;
vma = expand_stack(mm, address);
if (!vma)
goto bad_area_nosemaphore;
/*
* Ok, we have a good vm_area for this memory access, so
* we can handle it..
@ -482,8 +483,9 @@ asmlinkage void __kprobes do_sparc64_fault(struct pt_regs *regs)
* Fix it, but check if it's kernel or user first..
*/
bad_area:
insn = get_fault_insn(regs, insn);
mmap_read_unlock(mm);
bad_area_nosemaphore:
insn = get_fault_insn(regs, insn);
handle_kernel_fault:
do_kernel_fault(regs, si_code, fault_code, insn, address);


@ -47,14 +47,15 @@ int handle_page_fault(unsigned long address, unsigned long ip,
vma = find_vma(mm, address);
if (!vma)
goto out;
else if (vma->vm_start <= address)
if (vma->vm_start <= address)
goto good_area;
else if (!(vma->vm_flags & VM_GROWSDOWN))
if (!(vma->vm_flags & VM_GROWSDOWN))
goto out;
else if (is_user && !ARCH_IS_STACKGROW(address))
goto out;
else if (expand_stack(vma, address))
if (is_user && !ARCH_IS_STACKGROW(address))
goto out;
vma = expand_stack(mm, address);
if (!vma)
goto out_nosemaphore;
good_area:
*code_out = SEGV_ACCERR;


@ -272,6 +272,7 @@ config X86
select HAVE_GENERIC_VDSO
select HOTPLUG_SMT if SMP
select IRQ_FORCED_THREADING
select LOCK_MM_AND_FIND_VMA
select NEED_PER_CPU_EMBED_FIRST_CHUNK
select NEED_PER_CPU_PAGE_FIRST_CHUNK
select NEED_SG_DMA_LENGTH


@ -96,4 +96,6 @@ static inline bool intel_cpu_signatures_match(unsigned int s1, unsigned int p1,
extern u64 x86_read_arch_cap_msr(void);
extern struct cpumask cpus_stop_mask;
#endif /* _ASM_X86_CPU_H */


@ -132,6 +132,8 @@ void wbinvd_on_cpu(int cpu);
int wbinvd_on_all_cpus(void);
void cond_wakeup_cpu0(void);
void smp_kick_mwait_play_dead(void);
void native_smp_send_reschedule(int cpu);
void native_send_call_func_ipi(const struct cpumask *mask);
void native_send_call_func_single_ipi(int cpu);


@ -705,7 +705,7 @@ static enum ucode_state apply_microcode_amd(int cpu)
rdmsr(MSR_AMD64_PATCH_LEVEL, rev, dummy);
/* need to apply patch? */
if (rev >= mc_amd->hdr.patch_id) {
if (rev > mc_amd->hdr.patch_id) {
ret = UCODE_OK;
goto out;
}
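
The switch from ">=" to ">" means a core whose current revision equals the patch header no longer short-circuits with UCODE_OK before the apply step; the plausible motivation, matching the upstream "load late on both threads too" change, is that an SMT sibling reporting the already-applied revision still needs the patch loaded on it. A toy comparison of the two rules (the revision value is an arbitrary example):

#include <stdbool.h>
#include <stdio.h>

static bool skip_patch_old(unsigned int rev, unsigned int patch_id)
{
	return rev >= patch_id;		/* equal revision: skip */
}

static bool skip_patch_new(unsigned int rev, unsigned int patch_id)
{
	return rev > patch_id;		/* equal revision: apply again */
}

int main(void)
{
	unsigned int rev = 0x0a201016, patch_id = 0x0a201016;

	printf("old rule skips:      %d\n", skip_patch_old(rev, patch_id));
	printf("new rule re-applies: %d\n", !skip_patch_new(rev, patch_id));
	return 0;
}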


@ -744,15 +744,26 @@ bool xen_set_default_idle(void)
}
#endif
struct cpumask cpus_stop_mask;
void __noreturn stop_this_cpu(void *dummy)
{
struct cpuinfo_x86 *c = this_cpu_ptr(&cpu_info);
unsigned int cpu = smp_processor_id();
local_irq_disable();
/*
* Remove this CPU:
* Remove this CPU from the online mask and disable it
* unconditionally. This might be redundant in case that the reboot
* vector was handled late and stop_other_cpus() sent an NMI.
*
* According to SDM and APM NMIs can be accepted even after soft
* disabling the local APIC.
*/
set_cpu_online(smp_processor_id(), false);
set_cpu_online(cpu, false);
disable_local_APIC();
mcheck_cpu_clear(this_cpu_ptr(&cpu_info));
mcheck_cpu_clear(c);
/*
* Use wbinvd on processors that support SME. This provides support
@ -766,8 +777,17 @@ void __noreturn stop_this_cpu(void *dummy)
* Test the CPUID bit directly because the machine might've cleared
* X86_FEATURE_SME due to cmdline options.
*/
if (cpuid_eax(0x8000001f) & BIT(0))
if (c->extended_cpuid_level >= 0x8000001f && (cpuid_eax(0x8000001f) & BIT(0)))
native_wbinvd();
/*
* This brings a cache line back and dirties it, but
* native_stop_other_cpus() will overwrite cpus_stop_mask after it
* observed that all CPUs reported stop. This write will invalidate
* the related cache line on this CPU.
*/
cpumask_clear_cpu(cpu, &cpus_stop_mask);
for (;;) {
/*
* Use native_halt() so that memory contents don't change


@ -21,12 +21,14 @@
#include <linux/interrupt.h>
#include <linux/cpu.h>
#include <linux/gfp.h>
#include <linux/kexec.h>
#include <asm/mtrr.h>
#include <asm/tlbflush.h>
#include <asm/mmu_context.h>
#include <asm/proto.h>
#include <asm/apic.h>
#include <asm/cpu.h>
#include <asm/idtentry.h>
#include <asm/nmi.h>
#include <asm/mce.h>
@ -146,34 +148,47 @@ static int register_stop_handler(void)
static void native_stop_other_cpus(int wait)
{
unsigned long flags;
unsigned long timeout;
unsigned int cpu = smp_processor_id();
unsigned long flags, timeout;
if (reboot_force)
return;
/*
* Use an own vector here because smp_call_function
* does lots of things not suitable in a panic situation.
*/
/* Only proceed if this is the first CPU to reach this code */
if (atomic_cmpxchg(&stopping_cpu, -1, cpu) != -1)
return;
/* For kexec, ensure that offline CPUs are out of MWAIT and in HLT */
if (kexec_in_progress)
smp_kick_mwait_play_dead();
/*
* We start by using the REBOOT_VECTOR irq.
* The irq is treated as a sync point to allow critical
* regions of code on other cpus to release their spin locks
* and re-enable irqs. Jumping straight to an NMI might
* accidentally cause deadlocks with further shutdown/panic
* code. By syncing, we give the cpus up to one second to
* finish their work before we force them off with the NMI.
* 1) Send an IPI on the reboot vector to all other CPUs.
*
* The other CPUs should react on it after leaving critical
* sections and re-enabling interrupts. They might still hold
* locks, but there is nothing which can be done about that.
*
* 2) Wait for all other CPUs to report that they reached the
* HLT loop in stop_this_cpu()
*
* 3) If #2 timed out send an NMI to the CPUs which did not
* yet report
*
* 4) Wait for all other CPUs to report that they reached the
* HLT loop in stop_this_cpu()
*
* #3 can obviously race against a CPU reaching the HLT loop late.
* That CPU will have reported already and the "have all CPUs
* reached HLT" condition will be true despite the fact that the
* other CPU is still handling the NMI. Again, there is no
* protection against that as "disabled" APICs still respond to
* NMIs.
*/
if (num_online_cpus() > 1) {
/* did someone beat us here? */
if (atomic_cmpxchg(&stopping_cpu, -1, safe_smp_processor_id()) != -1)
return;
/* sync above data before sending IRQ */
wmb();
cpumask_copy(&cpus_stop_mask, cpu_online_mask);
cpumask_clear_cpu(cpu, &cpus_stop_mask);
if (!cpumask_empty(&cpus_stop_mask)) {
apic_send_IPI_allbutself(REBOOT_VECTOR);
/*
@ -183,24 +198,22 @@ static void native_stop_other_cpus(int wait)
* CPUs reach shutdown state.
*/
timeout = USEC_PER_SEC;
while (num_online_cpus() > 1 && timeout--)
while (!cpumask_empty(&cpus_stop_mask) && timeout--)
udelay(1);
}
/* if the REBOOT_VECTOR didn't work, try with the NMI */
if (num_online_cpus() > 1) {
if (!cpumask_empty(&cpus_stop_mask)) {
/*
* If NMI IPI is enabled, try to register the stop handler
* and send the IPI. In any case try to wait for the other
* CPUs to stop.
*/
if (!smp_no_nmi_ipi && !register_stop_handler()) {
/* Sync above data before sending IRQ */
wmb();
pr_emerg("Shutting down cpus with NMI\n");
apic_send_IPI_allbutself(NMI_VECTOR);
for_each_cpu(cpu, &cpus_stop_mask)
apic->send_IPI(cpu, NMI_VECTOR);
}
/*
* Don't wait longer than 10 ms if the caller didn't
@ -208,7 +221,7 @@ static void native_stop_other_cpus(int wait)
* one or more CPUs do not reach shutdown state.
*/
timeout = USEC_PER_MSEC * 10;
while (num_online_cpus() > 1 && (wait || timeout--))
while (!cpumask_empty(&cpus_stop_mask) && (wait || timeout--))
udelay(1);
}
@ -216,6 +229,12 @@ static void native_stop_other_cpus(int wait)
disable_local_APIC();
mcheck_cpu_clear(this_cpu_ptr(&cpu_info));
local_irq_restore(flags);
/*
* Ensure that the cpus_stop_mask cache lines are invalidated on
* the other CPUs. See comment vs. SME in stop_this_cpu().
*/
cpumask_clear(&cpus_stop_mask);
}
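
native_stop_other_cpus() now tracks stragglers with cpus_stop_mask instead of sampling num_online_cpus(): every CPU positively reports reaching the HLT loop by clearing its own bit, and the NMI fallback targets exactly the bits still set. A userspace model of that report-and-wait handshake, with pthreads and one atomic word standing in for CPUs and the cpumask:

#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define NCPUS 4

static atomic_uint cpus_stop_mask;

static void *cpu_thread(void *arg)
{
	unsigned int cpu = (unsigned int)(uintptr_t)arg;

	/* ...leave critical sections, disable "interrupts"... */
	atomic_fetch_and(&cpus_stop_mask, ~(1u << cpu));  /* report stopped */
	for (;;)
		sleep(1);				  /* the HLT loop */
	return NULL;
}

int main(void)
{
	unsigned int cpu;
	int timeout = 1000000;		/* ~1s in 1us steps, as above */
	pthread_t t;

	/* "CPU 0" stops the rest: populate the mask, then kick them. */
	atomic_store(&cpus_stop_mask, ((1u << NCPUS) - 1) & ~1u);
	for (cpu = 1; cpu < NCPUS; cpu++) {
		pthread_create(&t, NULL, cpu_thread, (void *)(uintptr_t)cpu);
		pthread_detach(t);
	}

	while (atomic_load(&cpus_stop_mask) && timeout--)
		usleep(1);

	/* the kernel would now send NMIs to whichever bits remain set */
	printf("stragglers: %#x\n", atomic_load(&cpus_stop_mask));
	return 0;
}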
/*


@ -53,6 +53,7 @@
#include <linux/tboot.h>
#include <linux/gfp.h>
#include <linux/cpuidle.h>
#include <linux/kexec.h>
#include <linux/numa.h>
#include <linux/pgtable.h>
#include <linux/overflow.h>
@ -99,6 +100,20 @@ EXPORT_PER_CPU_SYMBOL(cpu_die_map);
DEFINE_PER_CPU_READ_MOSTLY(struct cpuinfo_x86, cpu_info);
EXPORT_PER_CPU_SYMBOL(cpu_info);
struct mwait_cpu_dead {
unsigned int control;
unsigned int status;
};
#define CPUDEAD_MWAIT_WAIT 0xDEADBEEF
#define CPUDEAD_MWAIT_KEXEC_HLT 0x4A17DEAD
/*
* Cache line aligned data for mwait_play_dead(). Separate on purpose so
* that it's unlikely to be touched by other CPUs.
*/
static DEFINE_PER_CPU_ALIGNED(struct mwait_cpu_dead, mwait_cpu_dead);
/* Logical package management. We might want to allocate that dynamically */
unsigned int __max_logical_packages __read_mostly;
EXPORT_SYMBOL(__max_logical_packages);
@ -155,6 +170,10 @@ static void smp_callin(void)
{
int cpuid;
/* Mop up eventual mwait_play_dead() wreckage */
this_cpu_write(mwait_cpu_dead.status, 0);
this_cpu_write(mwait_cpu_dead.control, 0);
/*
* If waken up by an INIT in an 82489DX configuration
* cpu_callout_mask guarantees we don't get here before
@ -1746,10 +1765,10 @@ EXPORT_SYMBOL_GPL(cond_wakeup_cpu0);
*/
static inline void mwait_play_dead(void)
{
struct mwait_cpu_dead *md = this_cpu_ptr(&mwait_cpu_dead);
unsigned int eax, ebx, ecx, edx;
unsigned int highest_cstate = 0;
unsigned int highest_subcstate = 0;
void *mwait_ptr;
int i;
if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||
@ -1784,12 +1803,9 @@ static inline void mwait_play_dead(void)
(highest_subcstate - 1);
}
/*
* This should be a memory location in a cache line which is
* unlikely to be touched by other processors. The actual
* content is immaterial as it is not actually modified in any way.
*/
mwait_ptr = &current_thread_info()->flags;
/* Set up state for the kexec() hack below */
md->status = CPUDEAD_MWAIT_WAIT;
md->control = CPUDEAD_MWAIT_WAIT;
wbinvd();
@ -1802,16 +1818,63 @@ static inline void mwait_play_dead(void)
* case where we return around the loop.
*/
mb();
clflush(mwait_ptr);
clflush(md);
mb();
__monitor(mwait_ptr, 0, 0);
__monitor(md, 0, 0);
mb();
__mwait(eax, 0);
if (READ_ONCE(md->control) == CPUDEAD_MWAIT_KEXEC_HLT) {
/*
* Kexec is about to happen. Don't go back into mwait() as
* the kexec kernel might overwrite text and data including
* page tables and stack. So mwait() would resume when the
* monitor cache line is written to and then the CPU goes
* south due to overwritten text, page tables and stack.
*
* Note: This does _NOT_ protect against a stray MCE, NMI,
* SMI. They will resume execution at the instruction
* following the HLT instruction and run into the problem
* which this is trying to prevent.
*/
WRITE_ONCE(md->status, CPUDEAD_MWAIT_KEXEC_HLT);
while(1)
native_halt();
}
cond_wakeup_cpu0();
}
}
/*
* Kick all "offline" CPUs out of mwait on kexec(). See comment in
* mwait_play_dead().
*/
void smp_kick_mwait_play_dead(void)
{
u32 newstate = CPUDEAD_MWAIT_KEXEC_HLT;
struct mwait_cpu_dead *md;
unsigned int cpu, i;
for_each_cpu_andnot(cpu, cpu_present_mask, cpu_online_mask) {
md = per_cpu_ptr(&mwait_cpu_dead, cpu);
/* Does it sit in mwait_play_dead() ? */
if (READ_ONCE(md->status) != CPUDEAD_MWAIT_WAIT)
continue;
/* Wait up to 5ms */
for (i = 0; READ_ONCE(md->status) != newstate && i < 1000; i++) {
/* Bring it out of mwait */
WRITE_ONCE(md->control, newstate);
udelay(5);
}
if (READ_ONCE(md->status) != newstate)
pr_err_once("CPU%u is stuck in mwait_play_dead()\n", cpu);
}
}
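
The mwait_play_dead()/smp_kick_mwait_play_dead() pair is a two-word handshake: control is the monitored cache line the kexec path writes, status is the parked CPU's acknowledgment before it commits to the HLT loop and never touches soon-to-be-overwritten memory again. A runnable model, with a spin loop standing in for MONITOR/MWAIT:

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <unistd.h>

#define CPUDEAD_MWAIT_WAIT	0xDEADBEEFu
#define CPUDEAD_MWAIT_KEXEC_HLT	0x4A17DEADu

struct mwait_cpu_dead {
	atomic_uint control;
	atomic_uint status;
};

static struct mwait_cpu_dead md;

static void *parked_cpu(void *arg)
{
	(void)arg;
	atomic_store(&md.status, CPUDEAD_MWAIT_WAIT);
	atomic_store(&md.control, CPUDEAD_MWAIT_WAIT);

	/* mwait would sleep here until the monitored word is written */
	while (atomic_load(&md.control) != CPUDEAD_MWAIT_KEXEC_HLT)
		;
	atomic_store(&md.status, CPUDEAD_MWAIT_KEXEC_HLT);	/* ack */
	for (;;)
		sleep(1);		/* native_halt() loop */
	return NULL;
}

int main(void)
{
	pthread_t t;
	int i;

	pthread_create(&t, NULL, parked_cpu, NULL);
	pthread_detach(t);
	usleep(1000);			/* let it park */

	/* the kick: write control, wait up to ~5ms for the status reply */
	for (i = 0; atomic_load(&md.status) != CPUDEAD_MWAIT_KEXEC_HLT &&
		    i < 1000; i++) {
		atomic_store(&md.control, CPUDEAD_MWAIT_KEXEC_HLT);
		usleep(5);
	}
	printf("parked CPU %s\n",
	       atomic_load(&md.status) == CPUDEAD_MWAIT_KEXEC_HLT ?
	       "reached HLT" : "is stuck");
	return 0;
}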
void hlt_play_dead(void)
{
if (__this_cpu_read(cpu_info.x86) >= 4)


@ -901,12 +901,6 @@ __bad_area(struct pt_regs *regs, unsigned long error_code,
__bad_area_nosemaphore(regs, error_code, address, pkey, si_code);
}
static noinline void
bad_area(struct pt_regs *regs, unsigned long error_code, unsigned long address)
{
__bad_area(regs, error_code, address, 0, SEGV_MAPERR);
}
static inline bool bad_area_access_from_pkeys(unsigned long error_code,
struct vm_area_struct *vma)
{
@ -1355,7 +1349,6 @@ void do_user_addr_fault(struct pt_regs *regs,
}
#endif
#ifdef CONFIG_PER_VMA_LOCK
if (!(flags & FAULT_FLAG_USER))
goto lock_mmap;
@ -1368,7 +1361,8 @@ void do_user_addr_fault(struct pt_regs *regs,
goto lock_mmap;
}
fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
vma_end_read(vma);
if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
vma_end_read(vma);
if (!(fault & VM_FAULT_RETRY)) {
count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
@ -1385,53 +1379,11 @@ void do_user_addr_fault(struct pt_regs *regs,
return;
}
lock_mmap:
#endif /* CONFIG_PER_VMA_LOCK */
/*
* Kernel-mode access to the user address space should only occur
* on well-defined single instructions listed in the exception
* tables. But, an erroneous kernel fault occurring outside one of
* those areas which also holds mmap_lock might deadlock attempting
* to validate the fault against the address space.
*
* Only do the expensive exception table search when we might be at
* risk of a deadlock. This happens if we
* 1. Failed to acquire mmap_lock, and
* 2. The access did not originate in userspace.
*/
if (unlikely(!mmap_read_trylock(mm))) {
if (!user_mode(regs) && !search_exception_tables(regs->ip)) {
/*
* Fault from code in kernel from
* which we do not expect faults.
*/
bad_area_nosemaphore(regs, error_code, address);
return;
}
retry:
mmap_read_lock(mm);
} else {
/*
* The above down_read_trylock() might have succeeded in
* which case we'll have missed the might_sleep() from
* down_read():
*/
might_sleep();
}
vma = find_vma(mm, address);
vma = lock_mm_and_find_vma(mm, address, regs);
if (unlikely(!vma)) {
bad_area(regs, error_code, address);
return;
}
if (likely(vma->vm_start <= address))
goto good_area;
if (unlikely(!(vma->vm_flags & VM_GROWSDOWN))) {
bad_area(regs, error_code, address);
return;
}
if (unlikely(expand_stack(vma, address))) {
bad_area(regs, error_code, address);
bad_area_nosemaphore(regs, error_code, address);
return;
}
@ -1439,7 +1391,6 @@ void do_user_addr_fault(struct pt_regs *regs,
* Ok, we have a good vm_area for this memory access, so
* we can handle it..
*/
good_area:
if (unlikely(access_error(error_code, vma))) {
bad_area_access_error(regs, error_code, address, vma);
return;
@ -1487,9 +1438,7 @@ void do_user_addr_fault(struct pt_regs *regs,
}
mmap_read_unlock(mm);
#ifdef CONFIG_PER_VMA_LOCK
done:
#endif
if (likely(!(fault & VM_FAULT_ERROR)))
return;


@ -49,6 +49,7 @@ config XTENSA
select HAVE_SYSCALL_TRACEPOINTS
select HAVE_VIRT_CPU_ACCOUNTING_GEN
select IRQ_DOMAIN
select LOCK_MM_AND_FIND_VMA
select MODULES_USE_ELF_RELA
select PERF_USE_VMALLOC
select TRACE_IRQFLAGS_SUPPORT


@ -130,23 +130,14 @@ void do_page_fault(struct pt_regs *regs)
perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
retry:
mmap_read_lock(mm);
vma = find_vma(mm, address);
vma = lock_mm_and_find_vma(mm, address, regs);
if (!vma)
goto bad_area;
if (vma->vm_start <= address)
goto good_area;
if (!(vma->vm_flags & VM_GROWSDOWN))
goto bad_area;
if (expand_stack(vma, address))
goto bad_area;
goto bad_area_nosemaphore;
/* Ok, we have a good vm_area for this memory access, so
* we can handle it..
*/
good_area:
code = SEGV_ACCERR;
if (is_write) {
@ -205,6 +196,7 @@ void do_page_fault(struct pt_regs *regs)
*/
bad_area:
mmap_read_unlock(mm);
bad_area_nosemaphore:
if (user_mode(regs)) {
current->thread.bad_vaddr = address;
current->thread.error_code = is_write;


@ -79,7 +79,18 @@ int blk_crypto_profile_init(struct blk_crypto_profile *profile,
unsigned int slot_hashtable_size;
memset(profile, 0, sizeof(*profile));
/*
* profile->lock of an underlying device can nest inside profile->lock
* of a device-mapper device, so use a dynamic lock class to avoid
* false-positive lockdep reports.
*/
#ifdef CONFIG_LOCKDEP
lockdep_register_key(&profile->lockdep_key);
__init_rwsem(&profile->lock, "&profile->lock", &profile->lockdep_key);
#else
init_rwsem(&profile->lock);
#endif
if (num_slots == 0)
return 0;
@ -89,7 +100,7 @@ int blk_crypto_profile_init(struct blk_crypto_profile *profile,
profile->slots = kvcalloc(num_slots, sizeof(profile->slots[0]),
GFP_KERNEL);
if (!profile->slots)
return -ENOMEM;
goto err_destroy;
profile->num_slots = num_slots;
@ -443,6 +454,9 @@ void blk_crypto_profile_destroy(struct blk_crypto_profile *profile)
{
if (!profile)
return;
#ifdef CONFIG_LOCKDEP
lockdep_unregister_key(&profile->lockdep_key);
#endif
kvfree(profile->slot_hashtable);
kvfree_sensitive(profile->slots,
sizeof(profile->slots[0]) * profile->num_slots);


@ -88,6 +88,8 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(android_vh_do_send_sig_info);
EXPORT_TRACEPOINT_SYMBOL_GPL(android_vh_mutex_wait_start);
EXPORT_TRACEPOINT_SYMBOL_GPL(android_vh_mutex_wait_finish);
EXPORT_TRACEPOINT_SYMBOL_GPL(android_vh_mutex_init);
EXPORT_TRACEPOINT_SYMBOL_GPL(android_vh_task_blocks_on_rtmutex);
EXPORT_TRACEPOINT_SYMBOL_GPL(android_vh_rtmutex_waiter_prio);
EXPORT_TRACEPOINT_SYMBOL_GPL(android_vh_rtmutex_wait_start);
EXPORT_TRACEPOINT_SYMBOL_GPL(android_vh_rtmutex_wait_finish);
EXPORT_TRACEPOINT_SYMBOL_GPL(android_vh_mutex_opt_spin_start);
@ -313,3 +315,8 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(android_vh_rmqueue_smallest_bypass);
EXPORT_TRACEPOINT_SYMBOL_GPL(android_vh_free_one_page_bypass);
EXPORT_TRACEPOINT_SYMBOL_GPL(android_vh_regmap_update);
EXPORT_TRACEPOINT_SYMBOL_GPL(android_vh_enable_thermal_genl_check);
EXPORT_TRACEPOINT_SYMBOL_GPL(android_vh_check_folio_look_around_ref);
EXPORT_TRACEPOINT_SYMBOL_GPL(android_vh_look_around);
EXPORT_TRACEPOINT_SYMBOL_GPL(android_vh_look_around_migrate_folio);
EXPORT_TRACEPOINT_SYMBOL_GPL(android_vh_test_clear_look_around_ref);
EXPORT_TRACEPOINT_SYMBOL_GPL(android_vh_tune_scan_type);


@ -4299,7 +4299,7 @@ static const struct hid_device_id hidpp_devices[] = {
{ /* wireless touchpad T651 */
HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_LOGITECH,
USB_DEVICE_ID_LOGITECH_T651),
.driver_data = HIDPP_QUIRK_CLASS_WTP },
.driver_data = HIDPP_QUIRK_CLASS_WTP | HIDPP_QUIRK_DELAYED_INIT },
{ /* Mouse Logitech Anywhere MX */
LDJ_DEVICE(0x1017), .driver_data = HIDPP_QUIRK_HI_RES_SCROLL_1P0 },
{ /* Mouse logitech M560 */


@ -272,7 +272,12 @@ static int hidraw_open(struct inode *inode, struct file *file)
goto out;
}
down_read(&minors_rwsem);
/*
* Technically not writing to the hidraw_table but a write lock is
* required to protect the device refcount. This is symmetrical to
* hidraw_release().
*/
down_write(&minors_rwsem);
if (!hidraw_table[minor] || !hidraw_table[minor]->exist) {
err = -ENODEV;
goto out_unlock;
@ -301,7 +306,7 @@ static int hidraw_open(struct inode *inode, struct file *file)
spin_unlock_irqrestore(&hidraw_table[minor]->list_lock, flags);
file->private_data = list;
out_unlock:
up_read(&minors_rwsem);
up_write(&minors_rwsem);
out:
if (err < 0)
kfree(list);


@ -485,8 +485,8 @@ static void do_fault(struct work_struct *work)
flags |= FAULT_FLAG_REMOTE;
mmap_read_lock(mm);
vma = find_extend_vma(mm, address);
if (!vma || address < vma->vm_start)
vma = vma_lookup(mm, address);
if (!vma)
/* failed to get a vma in the right range */
goto out;


@ -203,7 +203,7 @@ iommu_sva_handle_iopf(struct iommu_fault *fault, void *data)
mmap_read_lock(mm);
vma = find_extend_vma(mm, prm->addr);
vma = vma_lookup(mm, prm->addr);
if (!vma)
/* Unmapped area */
goto out_put_mm;


@ -293,14 +293,22 @@ static int dvb_frontend_get_event(struct dvb_frontend *fe,
}
if (events->eventw == events->eventr) {
int ret;
struct wait_queue_entry wait;
int ret = 0;
if (flags & O_NONBLOCK)
return -EWOULDBLOCK;
ret = wait_event_interruptible(events->wait_queue,
dvb_frontend_test_event(fepriv, events));
init_waitqueue_entry(&wait, current);
add_wait_queue(&events->wait_queue, &wait);
while (!dvb_frontend_test_event(fepriv, events)) {
wait_woken(&wait, TASK_INTERRUPTIBLE, 0);
if (signal_pending(current)) {
ret = -ERESTARTSYS;
break;
}
}
remove_wait_queue(&events->wait_queue, &wait);
if (ret < 0)
return ret;
}
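
The rework exists because wait_event_interruptible() evaluates its condition with the task already in TASK_INTERRUPTIBLE, and dvb_frontend_test_event() takes a mutex, a blocking operation that is not allowed in that state. The open-coded loop checks the condition while runnable and relies on the wait_woken() flag so a wakeup landing between the check and the sleep is not lost. A userspace sketch of the same idea, with polling standing in for the real waitqueue sleep:

#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t events_lock = PTHREAD_MUTEX_INITIALIZER;
static int eventw, eventr;		/* ring indices, as in the driver */
static atomic_bool woken;

static bool test_event(void)		/* may block: takes a mutex */
{
	bool ret;

	pthread_mutex_lock(&events_lock);
	ret = eventw != eventr;
	pthread_mutex_unlock(&events_lock);
	return ret;
}

static void *producer(void *arg)
{
	(void)arg;
	usleep(10000);
	pthread_mutex_lock(&events_lock);
	eventw++;			/* queue an event */
	pthread_mutex_unlock(&events_lock);
	atomic_store(&woken, true);	/* wake_up() analogue */
	return NULL;
}

int main(void)
{
	pthread_t t;

	pthread_create(&t, NULL, producer, NULL);
	while (!test_event()) {
		/* sleep only if no wakeup arrived since the last check */
		if (!atomic_exchange(&woken, false))
			usleep(100);
	}
	pthread_join(t, NULL);
	printf("event consumed\n");
	return 0;
}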


@ -252,12 +252,16 @@ const struct v4l2_format_info *v4l2_format_info(u32 format)
{ .format = V4L2_PIX_FMT_RGB565, .pixel_enc = V4L2_PIXEL_ENC_RGB, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
{ .format = V4L2_PIX_FMT_RGB555, .pixel_enc = V4L2_PIXEL_ENC_RGB, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
{ .format = V4L2_PIX_FMT_BGR666, .pixel_enc = V4L2_PIXEL_ENC_RGB, .mem_planes = 1, .comp_planes = 1, .bpp = { 4, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
{ .format = V4L2_PIX_FMT_BGR48_12, .pixel_enc = V4L2_PIXEL_ENC_RGB, .mem_planes = 1, .comp_planes = 1, .bpp = { 6, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
{ .format = V4L2_PIX_FMT_ABGR64_12, .pixel_enc = V4L2_PIXEL_ENC_RGB, .mem_planes = 1, .comp_planes = 1, .bpp = { 8, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
/* YUV packed formats */
{ .format = V4L2_PIX_FMT_YUYV, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 2, .vdiv = 1 },
{ .format = V4L2_PIX_FMT_YVYU, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 2, .vdiv = 1 },
{ .format = V4L2_PIX_FMT_UYVY, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 2, .vdiv = 1 },
{ .format = V4L2_PIX_FMT_VYUY, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 2, .vdiv = 1 },
{ .format = V4L2_PIX_FMT_Y212, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 1, .comp_planes = 1, .bpp = { 4, 0, 0, 0 }, .hdiv = 2, .vdiv = 1 },
{ .format = V4L2_PIX_FMT_YUV48_12, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 1, .comp_planes = 1, .bpp = { 6, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
/* YUV planar formats */
{ .format = V4L2_PIX_FMT_NV12, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 1, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .hdiv = 2, .vdiv = 2 },
@ -267,6 +271,7 @@ const struct v4l2_format_info *v4l2_format_info(u32 format)
{ .format = V4L2_PIX_FMT_NV24, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 1, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .hdiv = 1, .vdiv = 1 },
{ .format = V4L2_PIX_FMT_NV42, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 1, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .hdiv = 1, .vdiv = 1 },
{ .format = V4L2_PIX_FMT_P010, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 1, .comp_planes = 2, .bpp = { 2, 2, 0, 0 }, .hdiv = 2, .vdiv = 1 },
{ .format = V4L2_PIX_FMT_P012, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 1, .comp_planes = 2, .bpp = { 2, 4, 0, 0 }, .hdiv = 2, .vdiv = 2 },
{ .format = V4L2_PIX_FMT_YUV410, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 1, .comp_planes = 3, .bpp = { 1, 1, 1, 0 }, .hdiv = 4, .vdiv = 4 },
{ .format = V4L2_PIX_FMT_YVU410, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 1, .comp_planes = 3, .bpp = { 1, 1, 1, 0 }, .hdiv = 4, .vdiv = 4 },
@ -292,6 +297,7 @@ const struct v4l2_format_info *v4l2_format_info(u32 format)
{ .format = V4L2_PIX_FMT_NV21M, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 2, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .hdiv = 2, .vdiv = 2 },
{ .format = V4L2_PIX_FMT_NV16M, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 2, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .hdiv = 2, .vdiv = 1 },
{ .format = V4L2_PIX_FMT_NV61M, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 2, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .hdiv = 2, .vdiv = 1 },
{ .format = V4L2_PIX_FMT_P012M, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 2, .comp_planes = 2, .bpp = { 2, 4, 0, 0 }, .hdiv = 2, .vdiv = 2 },
/* Bayer RGB formats */
{ .format = V4L2_PIX_FMT_SBGGR8, .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 1, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },


@ -1304,11 +1304,14 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
case V4L2_PIX_FMT_BGRX32: descr = "32-bit XBGR 8-8-8-8"; break;
case V4L2_PIX_FMT_RGBA32: descr = "32-bit RGBA 8-8-8-8"; break;
case V4L2_PIX_FMT_RGBX32: descr = "32-bit RGBX 8-8-8-8"; break;
case V4L2_PIX_FMT_BGR48_12: descr = "12-bit Depth BGR"; break;
case V4L2_PIX_FMT_ABGR64_12: descr = "12-bit Depth BGRA"; break;
case V4L2_PIX_FMT_GREY: descr = "8-bit Greyscale"; break;
case V4L2_PIX_FMT_Y4: descr = "4-bit Greyscale"; break;
case V4L2_PIX_FMT_Y6: descr = "6-bit Greyscale"; break;
case V4L2_PIX_FMT_Y10: descr = "10-bit Greyscale"; break;
case V4L2_PIX_FMT_Y12: descr = "12-bit Greyscale"; break;
case V4L2_PIX_FMT_Y012: descr = "12-bit Greyscale (bits 15-4)"; break;
case V4L2_PIX_FMT_Y14: descr = "14-bit Greyscale"; break;
case V4L2_PIX_FMT_Y16: descr = "16-bit Greyscale"; break;
case V4L2_PIX_FMT_Y16_BE: descr = "16-bit Greyscale BE"; break;
@ -1347,6 +1350,7 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
case V4L2_PIX_FMT_YUV420: descr = "Planar YUV 4:2:0"; break;
case V4L2_PIX_FMT_HI240: descr = "8-bit Dithered RGB (BTTV)"; break;
case V4L2_PIX_FMT_M420: descr = "YUV 4:2:0 (M420)"; break;
case V4L2_PIX_FMT_YUV48_12: descr = "12-bit YUV 4:4:4 Packed"; break;
case V4L2_PIX_FMT_NV12: descr = "Y/UV 4:2:0"; break;
case V4L2_PIX_FMT_NV21: descr = "Y/VU 4:2:0"; break;
case V4L2_PIX_FMT_NV16: descr = "Y/UV 4:2:2"; break;
@ -1354,6 +1358,7 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
case V4L2_PIX_FMT_NV24: descr = "Y/UV 4:4:4"; break;
case V4L2_PIX_FMT_NV42: descr = "Y/VU 4:4:4"; break;
case V4L2_PIX_FMT_P010: descr = "10-bit Y/UV 4:2:0"; break;
case V4L2_PIX_FMT_P012: descr = "12-bit Y/UV 4:2:0"; break;
case V4L2_PIX_FMT_NV12_4L4: descr = "Y/UV 4:2:0 (4x4 Linear)"; break;
case V4L2_PIX_FMT_NV12_16L16: descr = "Y/UV 4:2:0 (16x16 Linear)"; break;
case V4L2_PIX_FMT_NV12_32L32: descr = "Y/UV 4:2:0 (32x32 Linear)"; break;
@ -1364,6 +1369,7 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
case V4L2_PIX_FMT_NV61M: descr = "Y/VU 4:2:2 (N-C)"; break;
case V4L2_PIX_FMT_NV12MT: descr = "Y/UV 4:2:0 (64x32 MB, N-C)"; break;
case V4L2_PIX_FMT_NV12MT_16X16: descr = "Y/UV 4:2:0 (16x16 MB, N-C)"; break;
case V4L2_PIX_FMT_P012M: descr = "12-bit Y/UV 4:2:0 (N-C)"; break;
case V4L2_PIX_FMT_YUV420M: descr = "Planar YUV 4:2:0 (N-C)"; break;
case V4L2_PIX_FMT_YVU420M: descr = "Planar YVU 4:2:0 (N-C)"; break;
case V4L2_PIX_FMT_YUV422M: descr = "Planar YUV 4:2:2 (N-C)"; break;
@ -1448,6 +1454,9 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
case V4L2_PIX_FMT_NV12M_8L128: descr = "NV12M (8x128 Linear)"; break;
case V4L2_PIX_FMT_NV12_10BE_8L128: descr = "10-bit NV12 (8x128 Linear, BE)"; break;
case V4L2_PIX_FMT_NV12M_10BE_8L128: descr = "10-bit NV12M (8x128 Linear, BE)"; break;
case V4L2_PIX_FMT_Y210: descr = "10-bit YUYV Packed"; break;
case V4L2_PIX_FMT_Y212: descr = "12-bit YUYV Packed"; break;
case V4L2_PIX_FMT_Y216: descr = "16-bit YUYV Packed"; break;
default:
/* Compressed formats */


@ -19,6 +19,7 @@
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/list.h>
#include <linux/llist.h>
#include <linux/mm.h>
#include <linux/proc_fs.h>
#include <linux/profile.h>
@ -629,7 +630,6 @@ static const struct proc_ops uid_procstat_fops = {
};
struct update_stats_work {
struct work_struct work;
uid_t uid;
#ifdef CONFIG_UID_SYS_STATS_DEBUG
struct task_struct *task;
@ -637,38 +637,46 @@ struct update_stats_work {
struct task_io_accounting ioac;
u64 utime;
u64 stime;
struct llist_node node;
};
static LLIST_HEAD(work_usw);
static void update_stats_workfn(struct work_struct *work)
{
struct update_stats_work *usw =
container_of(work, struct update_stats_work, work);
struct update_stats_work *usw, *t;
struct uid_entry *uid_entry;
struct task_entry *task_entry __maybe_unused;
struct llist_node *node;
rt_mutex_lock(&uid_lock);
uid_entry = find_uid_entry(usw->uid);
if (!uid_entry)
goto exit;
uid_entry->utime += usw->utime;
uid_entry->stime += usw->stime;
node = llist_del_all(&work_usw);
llist_for_each_entry_safe(usw, t, node, node) {
uid_entry = find_uid_entry(usw->uid);
if (!uid_entry)
goto next;
uid_entry->utime += usw->utime;
uid_entry->stime += usw->stime;
#ifdef CONFIG_UID_SYS_STATS_DEBUG
task_entry = find_task_entry(uid_entry, usw->task);
if (!task_entry)
goto exit;
add_uid_tasks_io_stats(task_entry, &usw->ioac,
UID_STATE_DEAD_TASKS);
task_entry = find_task_entry(uid_entry, usw->task);
if (!task_entry)
goto next;
add_uid_tasks_io_stats(task_entry, &usw->ioac,
UID_STATE_DEAD_TASKS);
#endif
__add_uid_io_stats(uid_entry, &usw->ioac, UID_STATE_DEAD_TASKS);
exit:
__add_uid_io_stats(uid_entry, &usw->ioac, UID_STATE_DEAD_TASKS);
next:
#ifdef CONFIG_UID_SYS_STATS_DEBUG
put_task_struct(usw->task);
#endif
kfree(usw);
}
rt_mutex_unlock(&uid_lock);
#ifdef CONFIG_UID_SYS_STATS_DEBUG
put_task_struct(usw->task);
#endif
kfree(usw);
}
static DECLARE_WORK(update_stats_work, update_stats_workfn);
static int process_notifier(struct notifier_block *self,
unsigned long cmd, void *v)
@ -687,7 +695,6 @@ static int process_notifier(struct notifier_block *self,
usw = kmalloc(sizeof(struct update_stats_work), GFP_KERNEL);
if (usw) {
INIT_WORK(&usw->work, update_stats_workfn);
usw->uid = uid;
#ifdef CONFIG_UID_SYS_STATS_DEBUG
usw->task = get_task_struct(task);
@ -698,7 +705,8 @@ static int process_notifier(struct notifier_block *self,
*/
usw->ioac = task->ioac;
task_cputime_adjusted(task, &usw->utime, &usw->stime);
schedule_work(&usw->work);
llist_add(&usw->node, &work_usw);
schedule_work(&update_stats_work);
}
return NOTIFY_OK;
}
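
The uid_sys_stats rework replaces one work item per exiting task with a single static work plus a lock-free llist: process_notifier pushes with llist_add(), which needs no lock and is safe from the exit path, and the worker drains everything with one llist_del_all() before taking uid_lock once for the whole batch. The same two primitives modeled with C11 atomics (a Treiber-stack push and a single-exchange drain):

#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

struct update_stats_work {
	long utime, stime;
	struct update_stats_work *next;
};

static _Atomic(struct update_stats_work *) work_usw;

static void push_work(long utime, long stime)	/* llist_add() analogue */
{
	struct update_stats_work *usw = malloc(sizeof(*usw));

	usw->utime = utime;
	usw->stime = stime;
	usw->next = atomic_load(&work_usw);
	while (!atomic_compare_exchange_weak(&work_usw, &usw->next, usw))
		;	/* on failure, usw->next was refreshed; retry */
}

static void drain_work(void)			/* llist_del_all() analogue */
{
	struct update_stats_work *usw =
		atomic_exchange(&work_usw, NULL);  /* grab the whole list */

	while (usw) {
		struct update_stats_work *next = usw->next;

		printf("utime=%ld stime=%ld\n", usw->utime, usw->stime);
		free(usw);
		usw = next;
	}
}

int main(void)
{
	push_work(1, 2);
	push_work(3, 4);
	drain_work();	/* drains in LIFO order, like the kernel llist */
	return 0;
}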


@ -2508,8 +2508,10 @@ static void gsm_cleanup_mux(struct gsm_mux *gsm, bool disc)
gsm->has_devices = false;
}
for (i = NUM_DLCI - 1; i >= 0; i--)
if (gsm->dlci[i])
if (gsm->dlci[i]) {
gsm_dlci_release(gsm->dlci[i]);
gsm->dlci[i] = NULL;
}
mutex_unlock(&gsm->mutex);
/* Now wipe the queues */
tty_ldisc_flush(gsm->tty);
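
The UAF fix is the classic release-and-clear pattern: gsm->dlci[i] is set to NULL in the same step as gsm_dlci_release(), so any later pass over the table sees NULL rather than a freed DLCI. In miniature:

#include <stdio.h>
#include <stdlib.h>

struct dlci { int addr; };
struct mux { struct dlci *dlci[4]; };

static void dlci_release(struct dlci *d) { free(d); }

static void cleanup_mux(struct mux *gsm)
{
	for (int i = 3; i >= 0; i--)
		if (gsm->dlci[i]) {
			dlci_release(gsm->dlci[i]);
			gsm->dlci[i] = NULL;	/* the one-line UAF fix */
		}
}

int main(void)
{
	struct mux gsm = { .dlci = { malloc(sizeof(struct dlci)) } };

	cleanup_mux(&gsm);
	cleanup_mux(&gsm);			/* safe to run again now */
	printf("dlci[0] = %p\n", (void *)gsm.dlci[0]);
	return 0;
}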


@ -188,11 +188,10 @@ EXPORT_SYMBOL_GPL(xhci_plat_register_vendor_ops);
static int xhci_vendor_init(struct xhci_hcd *xhci)
{
struct xhci_vendor_ops *ops = xhci_vendor_get_ops(xhci);
struct xhci_plat_priv *priv = xhci_to_priv(xhci);
struct xhci_vendor_ops *ops = NULL;
if (xhci_plat_vendor_overwrite.vendor_ops)
ops = priv->vendor_ops = xhci_plat_vendor_overwrite.vendor_ops;
ops = xhci->vendor_ops = xhci_plat_vendor_overwrite.vendor_ops;
if (ops && ops->vendor_init)
return ops->vendor_init(xhci);
@ -202,12 +201,11 @@ static int xhci_vendor_init(struct xhci_hcd *xhci)
static void xhci_vendor_cleanup(struct xhci_hcd *xhci)
{
struct xhci_vendor_ops *ops = xhci_vendor_get_ops(xhci);
struct xhci_plat_priv *priv = xhci_to_priv(xhci);
if (ops && ops->vendor_cleanup)
ops->vendor_cleanup(xhci);
priv->vendor_ops = NULL;
xhci->vendor_ops = NULL;
}
static int xhci_plat_probe(struct platform_device *pdev)


@ -13,7 +13,6 @@
struct xhci_plat_priv {
const char *firmware_name;
unsigned long long quirks;
struct xhci_vendor_ops *vendor_ops;
struct xhci_vendor_data *vendor_data;
int (*plat_setup)(struct usb_hcd *);
void (*plat_start)(struct usb_hcd *);


@ -25,7 +25,6 @@
#include "xhci-trace.h"
#include "xhci-debugfs.h"
#include "xhci-dbgcap.h"
#include "xhci-plat.h"
#define DRIVER_AUTHOR "Sarah Sharp"
#define DRIVER_DESC "'eXtensible' Host Controller (xHC) Driver"
@ -4517,7 +4516,7 @@ static int __maybe_unused xhci_change_max_exit_latency(struct xhci_hcd *xhci,
struct xhci_vendor_ops *xhci_vendor_get_ops(struct xhci_hcd *xhci)
{
return xhci_to_priv(xhci)->vendor_ops;
return xhci->vendor_ops;
}
EXPORT_SYMBOL_GPL(xhci_vendor_get_ops);


@ -1941,7 +1941,9 @@ struct xhci_hcd {
void *dbc;
ANDROID_KABI_RESERVE(1);
/* Used for bug 194461020 */
ANDROID_KABI_USE(1, struct xhci_vendor_ops *vendor_ops);
ANDROID_KABI_RESERVE(2);
ANDROID_KABI_RESERVE(3);
ANDROID_KABI_RESERVE(4);
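
ANDROID_KABI_USE(1, ...) re-purposes a previously reserved padding slot for the new vendor_ops pointer without changing the struct's size or any field offset, which is what keeps the GKI ABI stable while xhci_plat_priv loses the field. A simplified re-implementation of the idea; the real macros additionally enforce size checks:

#include <stdint.h>
#include <stdio.h>

#define KABI_RESERVE(n)		uint64_t kabi_reserved##n
#define KABI_USE(n, new_field)	union { uint64_t kabi_reserved##n; new_field; }

struct xhci_vendor_ops;			/* opaque here */

struct xhci_hcd_v1 {			/* original GKI layout */
	void *dbc;
	KABI_RESERVE(1);
	KABI_RESERVE(2);
};

struct xhci_hcd_v2 {			/* after the patch above */
	void *dbc;
	KABI_USE(1, struct xhci_vendor_ops *vendor_ops);
	KABI_RESERVE(2);
};

int main(void)
{
	printf("same size: %d\n",
	       sizeof(struct xhci_hcd_v1) == sizeof(struct xhci_hcd_v2));
	return 0;
}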


@ -132,10 +132,8 @@ static int ucsi_exec_command(struct ucsi *ucsi, u64 cmd)
if (ret)
return ret;
if (cci & UCSI_CCI_BUSY) {
ucsi->ops->async_write(ucsi, UCSI_CANCEL, NULL, 0);
return -EBUSY;
}
if (cmd != UCSI_CANCEL && cci & UCSI_CCI_BUSY)
return ucsi_exec_command(ucsi, UCSI_CANCEL);
if (!(cci & UCSI_CCI_COMMAND_COMPLETE))
return -EIO;
@ -149,6 +147,11 @@ static int ucsi_exec_command(struct ucsi *ucsi, u64 cmd)
return ucsi_read_error(ucsi);
}
if (cmd == UCSI_CANCEL && cci & UCSI_CCI_CANCEL_COMPLETE) {
ret = ucsi_acknowledge_command(ucsi);
return ret ? ret : -EBUSY;
}
return UCSI_CCI_LENGTH(cci);
}
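
The reworked ucsi_exec_command() turns a BUSY completion into a proper cancellation exchange: a non-cancel command that comes back busy recurses once to issue UCSI_CANCEL, and a completed cancel is acknowledged before -EBUSY is returned to the original caller. The control flow in isolation, with a faked PPM and hypothetical CCI bit positions:

#include <stdio.h>

#define UCSI_CANCEL		0x01ULL
#define UCSI_GET_CAPABILITY	0x06ULL
#define CCI_BUSY		(1u << 26)	/* hypothetical layout */
#define CCI_CANCEL_COMPLETE	(1u << 25)	/* hypothetical layout */

static unsigned int ppm_send(unsigned long long cmd)	/* fake PPM */
{
	static int busy = 1;

	if (cmd == UCSI_CANCEL) {
		busy = 0;
		return CCI_CANCEL_COMPLETE;
	}
	return busy ? CCI_BUSY : 0;
}

static int ucsi_exec_command(unsigned long long cmd)
{
	unsigned int cci = ppm_send(cmd);

	if (cmd != UCSI_CANCEL && (cci & CCI_BUSY))
		return ucsi_exec_command(UCSI_CANCEL);	/* cancel properly */

	if (cmd == UCSI_CANCEL && (cci & CCI_CANCEL_COMPLETE))
		return -16;	/* acknowledge, then report -EBUSY */

	return 0;
}

int main(void)
{
	printf("busy first try: %d\n", ucsi_exec_command(UCSI_GET_CAPABILITY));
	printf("after cancel:   %d\n", ucsi_exec_command(UCSI_GET_CAPABILITY));
	return 0;
}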


@ -189,7 +189,7 @@ static void fast_imageblit(const struct fb_image *image, struct fb_info *p,
u32 fgx = fgcolor, bgx = bgcolor, bpp = p->var.bits_per_pixel;
u32 ppw = 32/bpp, spitch = (image->width + 7)/8;
u32 bit_mask, eorx, shift;
const char *s = image->data, *src;
const u8 *s = image->data, *src;
u32 *dst;
const u32 *tab;
size_t tablen;
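
The char-to-u8 change guards against sign extension: fast_imageblit() shifts and masks the source bitmap bytes, and on ABIs where plain char is signed, bytes >= 0x80 become negative values whose right shift is implementation-defined and typically drags in stray high bits. A short demonstration of the hazard:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	char s = (char)0xAA;	/* plain char: signed on most ABIs */
	uint8_t u = 0xAA;

	printf("char >> 1: %#x\n", (unsigned int)(s >> 1)); /* typically 0xffffffd5 */
	printf("u8   >> 1: %#x\n", (unsigned int)(u >> 1)); /* always 0x55 */
	return 0;
}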


@ -4136,6 +4136,7 @@ include/trace/events/cma.h
include/trace/events/compaction.h
include/trace/events/cpuhp.h
include/trace/events/devfreq.h
include/trace/events/devlink.h
include/trace/events/dma_fence.h
include/trace/events/erofs.h
include/trace/events/error_report.h
@ -5665,6 +5666,7 @@ net/core/dev.c
net/core/dev.h
net/core/dev_addr_lists.c
net/core/dev_ioctl.c
net/core/devlink.c
net/core/dst.c
net/core/dst_cache.c
net/core/fib_notifier.c


@ -315,10 +315,10 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec,
* Grow the stack manually; some architectures have a limit on how
* far ahead a user-space access may be in order to grow the stack.
*/
if (mmap_read_lock_killable(mm))
if (mmap_write_lock_killable(mm))
return -EINTR;
vma = find_extend_vma(mm, bprm->p);
mmap_read_unlock(mm);
vma = find_extend_vma_locked(mm, bprm->p);
mmap_write_unlock(mm);
if (!vma)
return -EFAULT;


@ -10,6 +10,7 @@
#include <linux/writeback.h>
#include <linux/sysctl.h>
#include <linux/gfp.h>
#include <linux/swap.h>
#include "internal.h"
/* A global variable is a bit ugly, but it keeps the code simple */
@ -59,6 +60,7 @@ int drop_caches_sysctl_handler(struct ctl_table *table, int write,
static int stfu;
if (sysctl_drop_caches & 1) {
lru_add_drain_all();
iterate_supers(drop_pagecache_sb, NULL);
count_vm_event(DROP_PAGECACHE);
}


@ -154,6 +154,7 @@ struct erofs_sb_info {
/* what we really care is nid, rather than ino.. */
erofs_nid_t root_nid;
erofs_nid_t packed_nid;
/* used for statfs, f_files - f_favail */
u64 inos;
@ -310,7 +311,7 @@ struct erofs_inode {
unsigned char datalayout;
unsigned char inode_isize;
unsigned short xattr_isize;
unsigned int xattr_isize;
unsigned int xattr_shared_count;
unsigned int *xattr_shared_xattrs;


@ -381,17 +381,7 @@ static int erofs_read_superblock(struct super_block *sb)
#endif
sbi->islotbits = ilog2(sizeof(struct erofs_inode_compact));
sbi->root_nid = le16_to_cpu(dsb->root_nid);
#ifdef CONFIG_EROFS_FS_ZIP
sbi->packed_inode = NULL;
if (erofs_sb_has_fragments(sbi) && dsb->packed_nid) {
sbi->packed_inode =
erofs_iget(sb, le64_to_cpu(dsb->packed_nid));
if (IS_ERR(sbi->packed_inode)) {
ret = PTR_ERR(sbi->packed_inode);
goto out;
}
}
#endif
sbi->packed_nid = le64_to_cpu(dsb->packed_nid);
sbi->inos = le64_to_cpu(dsb->inos);
sbi->build_time = le64_to_cpu(dsb->build_time);
@ -800,6 +790,16 @@ static int erofs_fc_fill_super(struct super_block *sb, struct fs_context *fc)
erofs_shrinker_register(sb);
/* sb->s_umount is already locked, SB_ACTIVE and SB_BORN are not set */
#ifdef CONFIG_EROFS_FS_ZIP
if (erofs_sb_has_fragments(sbi) && sbi->packed_nid) {
sbi->packed_inode = erofs_iget(sb, sbi->packed_nid);
if (IS_ERR(sbi->packed_inode)) {
err = PTR_ERR(sbi->packed_inode);
sbi->packed_inode = NULL;
return err;
}
}
#endif
err = erofs_init_managed_cache(sb);
if (err)
return err;


@ -355,20 +355,6 @@ int __init z_erofs_init_zip_subsystem(void)
enum z_erofs_pclustermode {
Z_EROFS_PCLUSTER_INFLIGHT,
/*
* The current pclusters was the tail of an exist chain, in addition
* that the previous processed chained pclusters are all decided to
* be hooked up to it.
* A new chain will be created for the remaining pclusters which are
* not processed yet, so different from Z_EROFS_PCLUSTER_FOLLOWED,
* the next pcluster cannot reuse the whole page safely for inplace I/O
* in the following scenario:
* ________________________________________________________________
* | tail (partial) page | head (partial) page |
* | (belongs to the next pcl) | (belongs to the current pcl) |
* |_______PCLUSTER_FOLLOWED______|________PCLUSTER_HOOKED__________|
*/
Z_EROFS_PCLUSTER_HOOKED,
/*
* a weak form of Z_EROFS_PCLUSTER_FOLLOWED, the difference is that it
* could be dispatched into bypass queue later due to uptodated managed
@ -386,8 +372,8 @@ enum z_erofs_pclustermode {
* ________________________________________________________________
* | tail (partial) page | head (partial) page |
* | (of the current cl) | (of the previous collection) |
* | PCLUSTER_FOLLOWED or | |
* |_____PCLUSTER_HOOKED__|___________PCLUSTER_FOLLOWED____________|
* | | |
* |__PCLUSTER_FOLLOWED___|___________PCLUSTER_FOLLOWED____________|
*
* [ (*) the above page can be used as inplace I/O. ]
*/
@ -400,7 +386,7 @@ struct z_erofs_decompress_frontend {
struct z_erofs_bvec_iter biter;
struct page *candidate_bvpage;
struct z_erofs_pcluster *pcl, *tailpcl;
struct z_erofs_pcluster *pcl;
z_erofs_next_pcluster_t owned_head;
enum z_erofs_pclustermode mode;
@ -589,19 +575,7 @@ static void z_erofs_try_to_claim_pcluster(struct z_erofs_decompress_frontend *f)
return;
}
/*
* type 2, link to the end of an existing open chain, be careful
* that its submission is controlled by the original attached chain.
*/
if (*owned_head != &pcl->next && pcl != f->tailpcl &&
cmpxchg(&pcl->next, Z_EROFS_PCLUSTER_TAIL,
*owned_head) == Z_EROFS_PCLUSTER_TAIL) {
*owned_head = Z_EROFS_PCLUSTER_TAIL;
f->mode = Z_EROFS_PCLUSTER_HOOKED;
f->tailpcl = NULL;
return;
}
/* type 3, it belongs to a chain, but it isn't the end of the chain */
/* type 2, it belongs to an ongoing chain */
f->mode = Z_EROFS_PCLUSTER_INFLIGHT;
}
@ -662,9 +636,6 @@ static int z_erofs_register_pcluster(struct z_erofs_decompress_frontend *fe)
goto err_out;
}
}
/* used to check tail merging loop due to corrupted images */
if (fe->owned_head == Z_EROFS_PCLUSTER_TAIL)
fe->tailpcl = pcl;
fe->owned_head = &pcl->next;
fe->pcl = pcl;
return 0;
@ -685,7 +656,6 @@ static int z_erofs_collector_begin(struct z_erofs_decompress_frontend *fe)
/* must be Z_EROFS_PCLUSTER_TAIL or pointed to previous pcluster */
DBG_BUGON(fe->owned_head == Z_EROFS_PCLUSTER_NIL);
DBG_BUGON(fe->owned_head == Z_EROFS_PCLUSTER_TAIL_CLOSED);
if (!(map->m_flags & EROFS_MAP_META)) {
grp = erofs_find_workgroup(fe->inode->i_sb,
@ -704,10 +674,6 @@ static int z_erofs_collector_begin(struct z_erofs_decompress_frontend *fe)
if (ret == -EEXIST) {
mutex_lock(&fe->pcl->lock);
/* used to check tail merging loop due to corrupted images */
if (fe->owned_head == Z_EROFS_PCLUSTER_TAIL)
fe->tailpcl = fe->pcl;
z_erofs_try_to_claim_pcluster(fe);
} else if (ret) {
return ret;
@ -887,10 +853,9 @@ static int z_erofs_do_read_page(struct z_erofs_decompress_frontend *fe,
* those chains are handled asynchronously thus the page cannot be used
* for inplace I/O or bvpage (should be processed in a strict order.)
*/
tight &= (fe->mode >= Z_EROFS_PCLUSTER_HOOKED &&
fe->mode != Z_EROFS_PCLUSTER_FOLLOWED_NOINPLACE);
tight &= (fe->mode > Z_EROFS_PCLUSTER_FOLLOWED_NOINPLACE);
cur = end - min_t(unsigned int, offset + end - map->m_la, end);
cur = end - min_t(erofs_off_t, offset + end - map->m_la, end);
if (!(map->m_flags & EROFS_MAP_MAPPED)) {
zero_user_segment(page, cur, end);
goto next_part;
@ -1013,9 +978,11 @@ static void z_erofs_do_decompressed_bvec(struct z_erofs_decompress_backend *be,
struct z_erofs_bvec *bvec)
{
struct z_erofs_bvec_item *item;
unsigned int pgnr;
if (!((bvec->offset + be->pcl->pageofs_out) & ~PAGE_MASK)) {
unsigned int pgnr;
if (!((bvec->offset + be->pcl->pageofs_out) & ~PAGE_MASK) &&
(bvec->end == PAGE_SIZE ||
bvec->offset + bvec->end == be->pcl->length)) {
pgnr = (bvec->offset + be->pcl->pageofs_out) >> PAGE_SHIFT;
DBG_BUGON(pgnr >= be->nr_pages);
@ -1268,11 +1235,7 @@ static void z_erofs_decompress_queue(const struct z_erofs_decompressqueue *io,
LIST_HEAD_INIT(be.decompressed_secondary_bvecs),
};
z_erofs_next_pcluster_t owned = io->head;
while (owned != Z_EROFS_PCLUSTER_TAIL_CLOSED) {
/* impossible that 'owned' equals Z_EROFS_WORK_TPTR_TAIL */
DBG_BUGON(owned == Z_EROFS_PCLUSTER_TAIL);
/* impossible that 'owned' equals Z_EROFS_PCLUSTER_NIL */
while (owned != Z_EROFS_PCLUSTER_TAIL) {
DBG_BUGON(owned == Z_EROFS_PCLUSTER_NIL);
be.pcl = container_of(owned, struct z_erofs_pcluster, next);
@ -1289,7 +1252,7 @@ static void z_erofs_decompressqueue_work(struct work_struct *work)
container_of(work, struct z_erofs_decompressqueue, u.work);
struct page *pagepool = NULL;
DBG_BUGON(bgq->head == Z_EROFS_PCLUSTER_TAIL_CLOSED);
DBG_BUGON(bgq->head == Z_EROFS_PCLUSTER_TAIL);
z_erofs_decompress_queue(bgq, &pagepool);
erofs_release_pages(&pagepool);
kvfree(bgq);
@ -1317,7 +1280,7 @@ static void z_erofs_decompress_kickoff(struct z_erofs_decompressqueue *io,
if (atomic_add_return(bios, &io->pending_bios))
return;
/* Use (kthread_)work and sync decompression for atomic contexts only */
if (in_atomic() || irqs_disabled()) {
if (!in_task() || irqs_disabled() || rcu_read_lock_any_held()) {
#ifdef CONFIG_EROFS_FS_PCPU_KTHREAD
struct kthread_worker *worker;
@ -1481,7 +1444,7 @@ jobqueue_init(struct super_block *sb,
q->eio = false;
}
q->sb = sb;
q->head = Z_EROFS_PCLUSTER_TAIL_CLOSED;
q->head = Z_EROFS_PCLUSTER_TAIL;
return q;
}
@ -1513,11 +1476,7 @@ static void move_to_bypass_jobqueue(struct z_erofs_pcluster *pcl,
z_erofs_next_pcluster_t *const submit_qtail = qtail[JQ_SUBMIT];
z_erofs_next_pcluster_t *const bypass_qtail = qtail[JQ_BYPASS];
DBG_BUGON(owned_head == Z_EROFS_PCLUSTER_TAIL_CLOSED);
if (owned_head == Z_EROFS_PCLUSTER_TAIL)
owned_head = Z_EROFS_PCLUSTER_TAIL_CLOSED;
WRITE_ONCE(pcl->next, Z_EROFS_PCLUSTER_TAIL_CLOSED);
WRITE_ONCE(pcl->next, Z_EROFS_PCLUSTER_TAIL);
WRITE_ONCE(*submit_qtail, owned_head);
WRITE_ONCE(*bypass_qtail, &pcl->next);
@ -1584,15 +1543,11 @@ static void z_erofs_submit_queue(struct z_erofs_decompress_frontend *f,
unsigned int i = 0;
bool bypass = true;
/* no possible 'owned_head' equals the following */
DBG_BUGON(owned_head == Z_EROFS_PCLUSTER_TAIL_CLOSED);
DBG_BUGON(owned_head == Z_EROFS_PCLUSTER_NIL);
pcl = container_of(owned_head, struct z_erofs_pcluster, next);
owned_head = READ_ONCE(pcl->next);
/* close the main owned chain at first */
owned_head = cmpxchg(&pcl->next, Z_EROFS_PCLUSTER_TAIL,
Z_EROFS_PCLUSTER_TAIL_CLOSED);
if (z_erofs_is_inline_pcluster(pcl)) {
move_to_bypass_jobqueue(pcl, qtail, owned_head);
continue;
@ -1736,7 +1691,7 @@ static void z_erofs_pcluster_readmore(struct z_erofs_decompress_frontend *f,
}
cur = map->m_la + map->m_llen - 1;
while (cur >= end) {
while ((cur >= end) && (cur < i_size_read(inode))) {
pgoff_t index = cur >> PAGE_SHIFT;
struct page *page;


@ -94,11 +94,8 @@ struct z_erofs_pcluster {
/* let's avoid the valid 32-bit kernel addresses */
/* the chained workgroup has't submitted io (still open) */
/* the end of a chain of pclusters */
#define Z_EROFS_PCLUSTER_TAIL ((void *)0x5F0ECAFE)
/* the chained workgroup has already submitted io */
#define Z_EROFS_PCLUSTER_TAIL_CLOSED ((void *)0x5F0EDEAD)
#define Z_EROFS_PCLUSTER_NIL (NULL)
struct z_erofs_decompressqueue {


@ -211,6 +211,10 @@ static int legacy_load_cluster_from_disk(struct z_erofs_maprecorder *m,
if (advise & Z_EROFS_VLE_DI_PARTIAL_REF)
m->partialref = true;
m->clusterofs = le16_to_cpu(di->di_clusterofs);
if (m->clusterofs >= 1 << vi->z_logical_clusterbits) {
DBG_BUGON(1);
return -EFSCORRUPTED;
}
m->pblk = le32_to_cpu(di->di_u.blkaddr);
break;
default:
@ -269,7 +273,7 @@ static int unpack_compacted_index(struct z_erofs_maprecorder *m,
u8 *in, type;
bool big_pcluster;
if (1 << amortizedshift == 4)
if (1 << amortizedshift == 4 && lclusterbits <= 14)
vcnt = 2;
else if (1 << amortizedshift == 2 && lclusterbits == 12)
vcnt = 16;
@ -371,7 +375,6 @@ static int compacted_load_cluster_from_disk(struct z_erofs_maprecorder *m,
{
struct inode *const inode = m->inode;
struct erofs_inode *const vi = EROFS_I(inode);
const unsigned int lclusterbits = vi->z_logical_clusterbits;
const erofs_off_t ebase = ALIGN(iloc(EROFS_I_SB(inode), vi->nid) +
vi->inode_isize + vi->xattr_isize, 8) +
sizeof(struct z_erofs_map_header);
@ -380,9 +383,6 @@ static int compacted_load_cluster_from_disk(struct z_erofs_maprecorder *m,
unsigned int amortizedshift;
erofs_off_t pos;
if (lclusterbits != 12)
return -EOPNOTSUPP;
if (lcn >= totalidx)
return -EINVAL;


@ -198,33 +198,39 @@ static struct page *get_arg_page(struct linux_binprm *bprm, unsigned long pos,
int write)
{
struct page *page;
struct vm_area_struct *vma = bprm->vma;
struct mm_struct *mm = bprm->mm;
int ret;
unsigned int gup_flags = FOLL_FORCE;
#ifdef CONFIG_STACK_GROWSUP
if (write) {
ret = expand_downwards(bprm->vma, pos);
if (ret < 0)
/*
* Avoid relying on expanding the stack down in GUP (which
* does not work for STACK_GROWSUP anyway), and just do it
* by hand ahead of time.
*/
if (write && pos < vma->vm_start) {
mmap_write_lock(mm);
ret = expand_downwards(vma, pos);
if (unlikely(ret < 0)) {
mmap_write_unlock(mm);
return NULL;
}
#endif
if (write)
gup_flags |= FOLL_WRITE;
}
mmap_write_downgrade(mm);
} else
mmap_read_lock(mm);
/*
* We are doing an exec(). 'current' is the process
* doing the exec and bprm->mm is the new process's mm.
* doing the exec and 'mm' is the new process's mm.
*/
mmap_read_lock(bprm->mm);
ret = get_user_pages_remote(bprm->mm, pos, 1, gup_flags,
ret = get_user_pages_remote(mm, pos, 1,
write ? FOLL_WRITE : 0,
&page, NULL, NULL);
mmap_read_unlock(bprm->mm);
mmap_read_unlock(mm);
if (ret <= 0)
return NULL;
if (write)
acct_arg_size(bprm, vma_pages(bprm->vma));
acct_arg_size(bprm, vma_pages(vma));
return page;
}
@ -854,7 +860,7 @@ int setup_arg_pages(struct linux_binprm *bprm,
stack_base = vma->vm_start - stack_expand;
#endif
current->mm->start_stack = bprm->p;
ret = expand_stack(vma, stack_base);
ret = expand_stack_locked(vma, stack_base);
if (ret)
ret = -EFAULT;


@ -208,6 +208,7 @@ int fuse_create_open_backing(
struct file *file, unsigned int flags, umode_t mode)
{
struct fuse_inode *dir_fuse_inode = get_fuse_inode(dir);
struct fuse_dentry *fuse_entry = get_fuse_dentry(entry);
struct fuse_dentry *dir_fuse_dentry = get_fuse_dentry(entry->d_parent);
struct dentry *backing_dentry = NULL;
struct inode *inode = NULL;
@ -239,29 +240,28 @@ int fuse_create_open_backing(
if (err)
goto out;
if (get_fuse_dentry(entry)->backing_path.dentry)
path_put(&get_fuse_dentry(entry)->backing_path);
get_fuse_dentry(entry)->backing_path = (struct path) {
if (fuse_entry->backing_path.dentry)
path_put(&fuse_entry->backing_path);
fuse_entry->backing_path = (struct path) {
.mnt = dir_fuse_dentry->backing_path.mnt,
.dentry = backing_dentry,
};
path_get(&get_fuse_dentry(entry)->backing_path);
path_get(&fuse_entry->backing_path);
if (d_inode)
target_nodeid = get_fuse_inode(d_inode)->nodeid;
inode = fuse_iget_backing(dir->i_sb, target_nodeid,
get_fuse_dentry(entry)->backing_path.dentry->d_inode);
if (IS_ERR(inode)) {
err = PTR_ERR(inode);
fuse_entry->backing_path.dentry->d_inode);
if (!inode) {
err = -EIO;
goto out;
}
if (get_fuse_inode(inode)->bpf)
bpf_prog_put(get_fuse_inode(inode)->bpf);
get_fuse_inode(inode)->bpf = dir_fuse_inode->bpf;
if (get_fuse_inode(inode)->bpf)
bpf_prog_inc(dir_fuse_inode->bpf);
get_fuse_inode(inode)->bpf = fuse_entry->bpf;
fuse_entry->bpf = NULL;
newent = d_splice_alias(inode, entry);
if (IS_ERR(newent)) {
@@ -269,10 +269,12 @@ int fuse_create_open_backing(
 		goto out;
 	}
 	inode = NULL;
 
+	entry = newent ? newent : entry;
 	err = finish_open(file, entry, fuse_open_file_backing);
 out:
 	iput(inode);
 	dput(backing_dentry);
 	return err;
 }
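
Note what changed hands above: the inode's bpf program now comes from the negative dentry rather than from the parent directory, and the handover is a pointer move rather than refcount churn. An illustrative helper capturing the idiom (transfer_bpf_prog is a made-up name, not a kernel function):

static void transfer_bpf_prog(struct fuse_inode *fi, struct fuse_dentry *fd)
{
    if (fi->bpf)
        bpf_prog_put(fi->bpf);	/* drop whatever the inode held */
    fi->bpf = fd->bpf;		/* steal the dentry's reference */
    fd->bpf = NULL;		/* the dentry no longer owns it */
}

Because the reference moves instead of being duplicated, no bpf_prog_inc()/bpf_prog_put() pair is needed at the handover, and fuse_dentry_release() will not double-put a program that has already migrated to an inode.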
@@ -966,6 +968,19 @@ void *fuse_file_write_iter_finalize(struct fuse_bpf_args *fa,
 	return ERR_PTR(fwio->ret);
 }
 
+long fuse_backing_ioctl(struct file *file, unsigned int command, unsigned long arg, int flags)
+{
+	struct fuse_file *ff = file->private_data;
+	long ret;
+
+	if (flags & FUSE_IOCTL_COMPAT)
+		ret = -ENOTTY;
+	else
+		ret = vfs_ioctl(ff->backing_file, command, arg);
+
+	return ret;
+}
+
 int fuse_file_flock_backing(struct file *file, int cmd, struct file_lock *fl)
 {
 	struct fuse_file *ff = file->private_data;
@@ -1225,61 +1240,62 @@ int fuse_handle_bpf_prog(struct fuse_entry_bpf *feb, struct inode *parent,
 struct dentry *fuse_lookup_finalize(struct fuse_bpf_args *fa, struct inode *dir,
 			struct dentry *entry, unsigned int flags)
 {
-	struct fuse_dentry *fd;
-	struct dentry *bd;
-	struct inode *inode, *backing_inode;
-	struct inode *d_inode = entry->d_inode;
+	struct fuse_dentry *fuse_entry;
+	struct dentry *backing_entry;
+	struct inode *inode = NULL, *backing_inode;
+	struct inode *entry_inode = entry->d_inode;
 	struct fuse_entry_out *feo = fa->out_args[0].value;
 	struct fuse_entry_bpf_out *febo = fa->out_args[1].value;
-	struct fuse_entry_bpf *feb = container_of(febo, struct fuse_entry_bpf, out);
+	struct fuse_entry_bpf *feb = container_of(febo, struct fuse_entry_bpf,
+						  out);
 	int error = -1;
 	u64 target_nodeid = 0;
-	struct dentry *ret;
+	struct dentry *ret = NULL;
 
-	fd = get_fuse_dentry(entry);
-	if (!fd) {
+	fuse_entry = get_fuse_dentry(entry);
+	if (!fuse_entry) {
 		ret = ERR_PTR(-EIO);
 		goto out;
 	}
 
-	bd = fd->backing_path.dentry;
-	if (!bd) {
+	backing_entry = fuse_entry->backing_path.dentry;
+	if (!backing_entry) {
 		ret = ERR_PTR(-ENOENT);
 		goto out;
 	}
 
-	backing_inode = bd->d_inode;
-	if (!backing_inode) {
-		ret = 0;
-		goto out;
-	}
+	if (entry_inode)
+		target_nodeid = get_fuse_inode(entry_inode)->nodeid;
 
-	if (d_inode)
-		target_nodeid = get_fuse_inode(d_inode)->nodeid;
+	backing_inode = backing_entry->d_inode;
+	if (backing_inode)
+		inode = fuse_iget_backing(dir->i_sb, target_nodeid,
+					  backing_inode);
 
-	inode = fuse_iget_backing(dir->i_sb, target_nodeid, backing_inode);
-	if (IS_ERR(inode)) {
-		ret = ERR_PTR(PTR_ERR(inode));
-		goto out;
-	}
-
-	error = fuse_handle_bpf_prog(feb, dir, &get_fuse_inode(inode)->bpf);
+	error = inode ?
+		fuse_handle_bpf_prog(feb, dir, &get_fuse_inode(inode)->bpf) :
+		fuse_handle_bpf_prog(feb, dir, &fuse_entry->bpf);
 	if (error) {
 		ret = ERR_PTR(error);
 		goto out;
 	}
 
-	error = fuse_handle_backing(feb, &get_fuse_inode(inode)->backing_inode, &fd->backing_path);
-	if (error) {
-		ret = ERR_PTR(error);
-		goto out;
+	if (inode) {
+		error = fuse_handle_backing(feb,
+				&get_fuse_inode(inode)->backing_inode,
+				&fuse_entry->backing_path);
+		if (error) {
+			ret = ERR_PTR(error);
+			goto out;
+		}
+
+		get_fuse_inode(inode)->nodeid = feo->nodeid;
+		ret = d_splice_alias(inode, entry);
+		if (!IS_ERR(ret))
+			inode = NULL;
 	}
 
-	get_fuse_inode(inode)->nodeid = feo->nodeid;
-	ret = d_splice_alias(inode, entry);
 out:
+	iput(inode);
 	if (feb->backing_file)
 		fput(feb->backing_file);
 	return ret;
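
The reworked exit path leans on a common VFS ownership idiom: d_splice_alias() consumes the inode reference when it instantiates the dentry or returns an alias, so the local pointer is cleared to disarm the unconditional iput() at the out label, and iput(NULL) is a no-op for every earlier goto. A minimal sketch of the idiom (splice_and_settle is an illustrative name):

static struct dentry *splice_and_settle(struct inode *inode,
					struct dentry *entry)
{
    struct dentry *ret;

    ret = d_splice_alias(inode, entry);	/* consumes inode on success */
    if (!IS_ERR(ret))
        inode = NULL;			/* dcache owns the reference now */

    iput(inode);			/* no-op once ownership moved */
    return ret;
}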

--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c

@@ -321,7 +321,7 @@ static int fuse_dentry_revalidate(struct dentry *entry, unsigned int flags)
 			spin_unlock(&fi->lock);
 		}
 		kfree(forget);
-		if (ret == -ENOMEM)
+		if (ret == -ENOMEM || ret == -EINTR)
 			goto out;
 		if (ret || fuse_invalid_attr(&outarg.attr) ||
 		    fuse_stale_inode(inode, outarg.generation, &outarg.attr))
@@ -364,9 +364,14 @@ static void fuse_dentry_release(struct dentry *dentry)
 {
 	struct fuse_dentry *fd = dentry->d_fsdata;
 
+#ifdef CONFIG_FUSE_BPF
 	if (fd && fd->backing_path.dentry)
 		path_put(&fd->backing_path);
+
+	if (fd && fd->bpf)
+		bpf_prog_put(fd->bpf);
+#endif
 	kfree_rcu(fd, rcu);
 }
 #endif
@@ -504,7 +509,6 @@ int fuse_lookup_name(struct super_block *sb, u64 nodeid, const struct qstr *name
 	if (name->len > FUSE_NAME_MAX)
 		goto out;
 
-
 	forget = fuse_alloc_forget();
 	err = -ENOMEM;
 	if (!forget)
@@ -523,32 +527,34 @@ int fuse_lookup_name(struct super_block *sb, u64 nodeid, const struct qstr *name
 		err = -ENOENT;
 		if (!entry)
-			goto out_queue_forget;
+			goto out_put_forget;
 
 		err = -EINVAL;
 		backing_file = bpf_arg.backing_file;
 		if (!backing_file)
-			goto out_queue_forget;
+			goto out_put_forget;
 
 		if (IS_ERR(backing_file)) {
 			err = PTR_ERR(backing_file);
-			goto out_queue_forget;
+			goto out_put_forget;
 		}
 
 		backing_inode = backing_file->f_inode;
 		*inode = fuse_iget_backing(sb, outarg->nodeid, backing_inode);
 		if (!*inode)
-			goto out;
+			goto out_put_forget;
 
 		err = fuse_handle_backing(&bpf_arg,
 					  &get_fuse_inode(*inode)->backing_inode,
 					  &get_fuse_dentry(entry)->backing_path);
-		if (err)
-			goto out;
-
-		err = fuse_handle_bpf_prog(&bpf_arg, NULL, &get_fuse_inode(*inode)->bpf);
-		if (err)
-			goto out;
+		if (!err)
+			err = fuse_handle_bpf_prog(&bpf_arg, NULL,
+						   &get_fuse_inode(*inode)->bpf);
+		if (err) {
+			iput(*inode);
+			*inode = NULL;
+			goto out_put_forget;
+		}
 	} else
 #endif
 	{
@@ -568,9 +574,6 @@ int fuse_lookup_name(struct super_block *sb, u64 nodeid, const struct qstr *name
 	}
 	err = -ENOMEM;
-#ifdef CONFIG_FUSE_BPF
-out_queue_forget:
-#endif
 	if (!*inode && outarg->nodeid) {
 		fuse_queue_forget(fm->fc, forget, outarg->nodeid, 1);
 		goto out;
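
The relabelling above changes what an early failure does with the preallocated forget request: instead of queueing a FORGET for a nodeid that was never fully instantiated, the error paths now just release the request. A sketch of the assumed label body (out_put_forget itself is not shown in this hunk; kfree(forget) is an assumption based on the label name and the matching upstream fuse convention):

out_put_forget:
	kfree(forget);	/* assumed: free rather than queue the FORGET */
out:
	return err;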

--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h

@@ -76,7 +76,13 @@ struct fuse_dentry {
 	union {
 		u64 time;
 		struct rcu_head rcu;
 	};
+
+#ifdef CONFIG_FUSE_BPF
 	struct path backing_path;
+
+	/* bpf program *only* set for negative dentries */
+	struct bpf_prog *bpf;
+#endif
 };
 
 static inline struct fuse_dentry *get_fuse_dentry(const struct dentry *entry)
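
The comment in the struct states the invariant; the hunks in backing.c and dir.c enforce it. Pulling the pieces together (a lifecycle summary in illustrative code, not a kernel excerpt; fuse_dentry_drop_bpf is a made-up helper):

/*
 * fuse_dentry->bpf lifecycle, as implied by the hunks above:
 *   lookup : fuse_lookup_finalize() parks the program on the dentry
 *            when there is no inode to attach it to (negative entry)
 *   create : fuse_create_open_backing() moves it to the new inode and
 *            clears the dentry's pointer (reference transfer)
 *   release: fuse_dentry_release() puts any program still parked here
 */
static void fuse_dentry_drop_bpf(struct fuse_dentry *fd)
{
    if (fd->bpf)
        bpf_prog_put(fd->bpf);	/* only negative dentries reach this */
    fd->bpf = NULL;
}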
@@ -1664,6 +1670,8 @@ int fuse_file_write_iter_backing(struct fuse_bpf_args *fa,
 void *fuse_file_write_iter_finalize(struct fuse_bpf_args *fa,
 		struct kiocb *iocb, struct iov_iter *from);
+
+long fuse_backing_ioctl(struct file *file, unsigned int command, unsigned long arg, int flags);
 int fuse_file_flock_backing(struct file *file, int cmd, struct file_lock *fl);
 ssize_t fuse_backing_mmap(struct file *file, struct vm_area_struct *vma);

--- a/fs/fuse/ioctl.c
+++ b/fs/fuse/ioctl.c

@@ -353,6 +353,15 @@ long fuse_ioctl_common(struct file *file, unsigned int cmd,
 	if (fuse_is_bad(inode))
 		return -EIO;
 
+#ifdef CONFIG_FUSE_BPF
+	{
+		struct fuse_file *ff = file->private_data;
+
+		/* TODO - this is simply passthrough, not a proper BPF filter */
+		if (ff->backing_file)
+			return fuse_backing_ioctl(file, cmd, arg, flags);
+	}
+#endif
 	return fuse_do_ioctl(file, cmd, arg, flags);
 }
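
Read together with the fuse_backing_ioctl() hunk earlier, the dispatch is: a file with a backing file skips the FUSE protocol entirely and re-issues the ioctl on the lower file, while 32-bit compat callers get -ENOTTY because the passthrough performs no argument translation. A combined sketch (fuse_ioctl_dispatch is an illustrative name for the two cooperating pieces):

static long fuse_ioctl_dispatch(struct file *file, unsigned int cmd,
				unsigned long arg, unsigned int flags)
{
    struct fuse_file *ff = file->private_data;

    if (ff->backing_file) {
        if (flags & FUSE_IOCTL_COMPAT)
            return -ENOTTY;			/* no compat translation */
        return vfs_ioctl(ff->backing_file, cmd, arg);
    }
    return fuse_do_ioctl(file, cmd, arg, flags);	/* normal FUSE path */
}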

Some files were not shown because too many files have changed in this diff.