android_kernel_samsung_sm8650

Author	SHA1	Message	Date
Vincent Palomares	4744b3a4ed	ANDROID: GKI: Expose device async to userspace Setting CONFIG_PM_ADVANCED_DEBUG=y to expose device async fields to userspace, allowing to fine-tune the suspend/resume path. Bug: 235135485 Change-Id: I75060e88ce0c1e199aa8740f446a2c0f8167f3d7 Signed-off-by: Vincent Palomares <paillon@google.com>	2024-04-29 20:04:23 +00:00
Suzuki K Poulose	08cc4037cf	FROMGIT: coresight: etm4x: Fix access to resource selector registers Resource selector pair 0 is always implemented and reserved. We must not touch it, even during save/restore for CPU Idle. Rest of the driver is well behaved. Fix the offending ones. Reported-by: Yabin Cui <yabinc@google.com> Fixes: `f188b5e76a` ("coresight: etm4x: Save/restore state across CPU low power states") Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Tested-by: Yabin Cui <yabinc@google.com> Reviewed-by: Mike Leach <mike.leach@linaro.org> Link: https://lore.kernel.org/r/20240412142702.2882478-5-suzuki.poulose@arm.com Bug: 335234033 (cherry picked from commit d6fc00d0f640d6010b51054aa8b0fd191177dbc9 https://git.kernel.org/pub/scm/linux/kernel/git/coresight/linux.git next) Change-Id: I5f3385cb269969a299402fa258b30ab43e95805f Signed-off-by: Yabin Cui <yabinc@google.com>	2024-04-26 12:54:24 -07:00
Suzuki K Poulose	7ff054397a	FROMGIT: coresight: etm4x: Safe access for TRCQCLTR ETM4x implements TRCQCLTR only when the Q elements are supported and the Q element filtering is supported (TRCIDR0.QFILT). Access to the register otherwise could be fatal. Fix this by tracking the availability, like the others. Fixes: `f188b5e76a` ("coresight: etm4x: Save/restore state across CPU low power states") Reported-by: Yabin Cui <yabinc@google.com> Reviewed-by: Mike Leach <mike.leach@linaro.org> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Tested-by: Yabin Cui <yabinc@google.com> Link: https://lore.kernel.org/r/20240412142702.2882478-4-suzuki.poulose@arm.com Bug: 335234033 (cherry picked from commit 46bf8d7cd8530eca607379033b9bc4ac5590a0cd https://git.kernel.org/pub/scm/linux/kernel/git/coresight/linux.git next) Change-Id: Id848fa14ba8003149f76b5ca54562593f6164150 Signed-off-by: Yabin Cui <yabinc@google.com>	2024-04-26 12:54:24 -07:00
Suzuki K Poulose	f401cce7d9	FROMGIT: coresight: etm4x: Do not save/restore Data trace control registers ETM4x doesn't support Data trace on A class CPUs. As such do not access the Data trace control registers during CPU idle. This could cause problems for ETE. While at it, remove all references to the Data trace control registers. Fixes: `f188b5e76a` ("coresight: etm4x: Save/restore state across CPU low power states") Reported-by: Yabin Cui <yabinc@google.com> Reviewed-by: Mike Leach <mike.leach@linaro.org> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Tested-by: Yabin Cui <yabinc@google.com> Link: https://lore.kernel.org/r/20240412142702.2882478-3-suzuki.poulose@arm.com Bug: 335234033 (cherry picked from commit 5eb3a0c2c52368cb9902e9a6ea04888e093c487d https://git.kernel.org/pub/scm/linux/kernel/git/coresight/linux.git next) Change-Id: I06977d86aa2d876d166db0fac8fbccf48fd07229 Signed-off-by: Yabin Cui <yabinc@google.com>	2024-04-26 12:54:24 -07:00
Suzuki K Poulose	d9604db041	FROMGIT: coresight: etm4x: Do not hardcode IOMEM access for register restore When we restore the register state for ETM4x, while coming back from CPU idle, we hardcode IOMEM access. This is wrong and could blow up for an ETM with system instructions access (and for ETE). Fixes: `f5bd523690` ("coresight: etm4x: Convert all register accesses") Reported-by: Yabin Cui <yabinc@google.com> Reviewed-by: Mike Leach <mike.leach@linaro.org> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Tested-by: Yabin Cui <yabinc@google.com> Link: https://lore.kernel.org/r/20240412142702.2882478-2-suzuki.poulose@arm.com Bug: 335234033 (cherry picked from commit 1e7ba33fa591de1cf60afffcabb45600b3607025 https://git.kernel.org/pub/scm/linux/kernel/git/coresight/linux.git next) Change-Id: Id2ea066374933de51a90f1fca8304338b741845d Signed-off-by: Yabin Cui <yabinc@google.com>	2024-04-26 12:54:24 -07:00
Norihiko Hama	fa87a072a7	ANDROID: GKI: Update honda symbol list for led-trigger Add some missing symbols required for led-trigger 2 function symbol(s) added 'u32* led_get_default_pattern(struct led_classdev*, unsigned int*)' 'void led_set_brightness(struct led_classdev*, unsigned int)' Bug: 333795249 Change-Id: I9935592d63175a2328c2b8a95556fd3ee6898fdd Signed-off-by: Norihiko Hama <Norihiko.Hama@alpsalpine.com>	2024-04-24 22:45:35 +00:00
xieliujie	c61278bb70	ANDROID: GKI: Update symbols to symbol list Update symbols for vendor hooks of reader optimistic spin. 4 function symbol(s) added 'int __traceiter_android_vh_rwsem_direct_rsteal(void*, struct rw_semaphore*, bool*)' 'int __traceiter_android_vh_rwsem_optimistic_rspin(void*, struct rw_semaphore*, long*, bool*)' 'bool osq_lock(struct optimistic_spin_queue*)' 'void osq_unlock(struct optimistic_spin_queue*)' 2 variable symbol(s) added 'struct tracepoint __tracepoint_android_vh_rwsem_direct_rsteal' 'struct tracepoint __tracepoint_android_vh_rwsem_optimistic_rspin' Bug: 331742151 Change-Id: I6603ec88f84a9a8adb30b802ba2fdd9b0dc8a016 Signed-off-by: xieliujie <xieliujie@oppo.com>	2024-04-24 10:51:56 +08:00
xieliujie	260bfad693	ANDROID: vendor_hook: Add hooks to support reader optimistic spin in rwsem Since upstream commit `617f3ef951` ("locking/rwsem: Remove reader optimistic spinning"), vendors have seen increased contention and blocking on rwsems. There are attempts to actively fix this upstream: https://lore.kernel.org/lkml/20240406081126.8030-1-bongkyu7.kim@samsung.com/ But in the meantime, provide vendorhooks so that vendors can implement their own optimistic spin routine. In doing so, vendors see improvements in cold launch times on important apps. Bug: 331742151 Change-Id: I7466413de9ee1293e86f73880931235d7a9142ac Signed-off-by: xieliujie <xieliujie@oppo.com> [jstultz: Rewrote commit message] Signed-off-by: John Stultz <jstultz@google.com>	2024-04-24 10:30:39 +08:00
Michal Luczaj	d0c6724b0f	UPSTREAM: af_unix: Fix garbage collector racing against connect() [ Upstream commit 47d8ac011fe1c9251070e1bd64cb10b48193ec51 ] Garbage collector does not take into account the risk of embryo getting enqueued during the garbage collection. If such embryo has a peer that carries SCM_RIGHTS, two consecutive passes of scan_children() may see a different set of children. Leading to an incorrectly elevated inflight count, and then a dangling pointer within the gc_inflight_list. sockets are AF_UNIX/SOCK_STREAM S is an unconnected socket L is a listening in-flight socket bound to addr, not in fdtable V's fd will be passed via sendmsg(), gets inflight count bumped connect(S, addr) sendmsg(S, [V]); close(V) __unix_gc() ---------------- ------------------------- ----------- NS = unix_create1() skb1 = sock_wmalloc(NS) L = unix_find_other(addr) unix_state_lock(L) unix_peer(S) = NS // V count=1 inflight=0 NS = unix_peer(S) skb2 = sock_alloc() skb_queue_tail(NS, skb2[V]) // V became in-flight // V count=2 inflight=1 close(V) // V count=1 inflight=1 // GC candidate condition met for u in gc_inflight_list: if (total_refs == inflight_refs) add u to gc_candidates // gc_candidates={L, V} for u in gc_candidates: scan_children(u, dec_inflight) // embryo (skb1) was not // reachable from L yet, so V's // inflight remains unchanged __skb_queue_tail(L, skb1) unix_state_unlock(L) for u in gc_candidates: if (u.inflight) scan_children(u, inc_inflight_move_tail) // V count=1 inflight=2 (!) If there is a GC-candidate listening socket, lock/unlock its state. This makes GC wait until the end of any ongoing connect() to that socket. After flipping the lock, a possibly SCM-laden embryo is already enqueued. And if there is another embryo coming, it can not possibly carry SCM_RIGHTS. At this point, unix_inflight() can not happen because unix_gc_lock is already taken. Inflight graph remains unaffected. Bug: 336226035 Fixes: `1fd05ba5a2` ("[AF_UNIX]: Rewrite garbage collector, fixes race.") Signed-off-by: Michal Luczaj <mhal@rbox.co> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20240409201047.1032217-1-mhal@rbox.co Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 507cc232ffe53a352847893f8177d276c3b532a9) Signed-off-by: Lee Jones <joneslee@google.com> Change-Id: If321f78b8b3220f5a1caea4b5e9450f1235b0770	2024-04-22 16:23:05 -07:00
Kuniyuki Iwashima	94c88f80ff	UPSTREAM: af_unix: Do not use atomic ops for unix_sk(sk)->inflight. [ Upstream commit 97af84a6bba2ab2b9c704c08e67de3b5ea551bb2 ] When touching unix_sk(sk)->inflight, we are always under spin_lock(&unix_gc_lock). Let's convert unix_sk(sk)->inflight to the normal unsigned long. Bug: 336226035 Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240123170856.41348-3-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: 47d8ac011fe1 ("af_unix: Fix garbage collector racing against connect()") Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 301fdbaa0bba4653570f07789909939f977a7620) Signed-off-by: Lee Jones <joneslee@google.com> Change-Id: I0d965d5f2a863d798c06de9f21d0467f256b538e	2024-04-22 16:22:57 -07:00
Lokesh Gidra	3dfddcb9c2	ANDROID: GKI: fix ABI breakage in struct userfaultfd_ctx The following two commits move 'userfaultfd_ctx' struct from fs/userfaultfd.c to header file and then add a rw_semaphore to it. The ABI is broken by the change. However, given that the type should be private and not accessed by vendor modules, use some GENKSYMS #define magic to preserve the CRC. Also update the .stg file for offset adjustment within 'userfaultfd_ctx'. 5e4c24a57b0c ("userfaultfd: protect mmap_changing with rw_sem in userfaulfd_ctx") f91e6b41dd11 ("userfaultfd: move userfaultfd_ctx struct to header file") Bug: 320478828 Change-Id: I5f97ff34dd8c88fe3d18c4dc902452488ba28cbd Signed-off-by: Lokesh Gidra <lokeshgidra@google.com>	2024-04-22 18:09:14 +00:00
Lokesh Gidra	8dd482be44	UPSTREAM: userfaultfd: fix deadlock warning when locking src and dst VMAs Use down_read_nested() to avoid the warning. Link: https://lkml.kernel.org/r/20240321235818.125118-1-lokeshgidra@google.com Fixes: 867a43a34ff8 ("userfaultfd: use per-vma locks in userfaultfd operations") Reported-by: syzbot+49056626fe41e01f2ba7@syzkaller.appspotmail.com Signed-off-by: Lokesh Gidra <lokeshgidra@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Brian Geffon <bgeffon@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Hillf Danton <hdanton@sina.com> Cc: Jann Horn <jannh@google.com> [Bug #2] Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Nicolas Geoffray <ngeoffray@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> (cherry picked from commit 30af24facf0aed12dec23bdf6eac6a907f88306a) Bug: 320478828 Change-Id: I56d7e33878d6248bba28e1e4204e2b9005d87e4d Signed-off-by: Lokesh Gidra <lokeshgidra@google.com>	2024-04-22 18:09:14 +00:00
Lokesh Gidra	ce2896c0c6	BACKPORT: userfaultfd: use per-vma locks in userfaultfd operations All userfaultfd operations, except write-protect, opportunistically use per-vma locks to lock vmas. On failure, attempt again inside mmap_lock critical section. Write-protect operation requires mmap_lock as it iterates over multiple vmas. Link: https://lkml.kernel.org/r/20240215182756.3448972-5-lokeshgidra@google.com Signed-off-by: Lokesh Gidra <lokeshgidra@google.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Brian Geffon <bgeffon@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Nicolas Geoffray <ngeoffray@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Tim Murray <timmurray@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> (cherry picked from commit 867a43a34ff8a38772212045262b2c9b77807ea3) Conflicts: mm/userfaultfd.c 1. Resolve conflict in validate_dst_vma() due to absence of range_in_vma(). 2. Use 'page' instead of 'folio' for BUG_ON on copy_from_user() failure in COPY ioctl. 3. Resolve conflict around mfill_file_over_size(). 4. Resolve conflict in comment for __mcopy_atomic_hugetlb() due to function name change. 5. Resolve conflict due to use of 'flags' instead of 'mode' in __mcopy_atomic_hugetlb(). 6. Use find_vma() and validate_dst_vma() in mwriteprotect_range() instead of find_dst_vma(). Bug: 320478828 Change-Id: I6d5b7101218cb1b11329108c3f31f12bb1caebc6 Signed-off-by: Lokesh Gidra <lokeshgidra@google.com>	2024-04-22 18:09:14 +00:00
Lokesh Gidra	daf0b0fc4a	BACKPORT: mm: add vma_assert_locked() for !CONFIG_PER_VMA_LOCK vma_assert_locked() is needed to replace mmap_assert_locked() once we start using per-vma locks in userfaultfd operations. In !CONFIG_PER_VMA_LOCK case when mm is locked, it implies that the given VMA is locked. Link: https://lkml.kernel.org/r/20240215182756.3448972-4-lokeshgidra@google.com Signed-off-by: Lokesh Gidra <lokeshgidra@google.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Brian Geffon <bgeffon@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Nicolas Geoffray <ngeoffray@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Tim Murray <timmurray@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> (cherry picked from commit 32af81af2f6f4c23b1b4ff68410e91da660af102) Conflicts: include/linux/mm.h 1. lock_vma_under_rcu() definition in !CONFIG_PER_VMA_LOCK case doesn't exist in 6.1. Resolved cherry-pick conflict due to that. Bug: 320478828 Change-Id: I76d414cd08c3d696d3886921a7e27cf94fd17b76 Signed-off-by: Lokesh Gidra <lokeshgidra@google.com>	2024-04-22 18:09:14 +00:00
Lokesh Gidra	a5b6040d5c	BACKPORT: userfaultfd: protect mmap_changing with rw_sem in userfaulfd_ctx Increments and loads to mmap_changing are always in mmap_lock critical section. This ensures that if userspace requests event notification for non-cooperative operations (e.g. mremap), userfaultfd operations don't occur concurrently. This can be achieved by using a separate read-write semaphore in userfaultfd_ctx such that increments are done in write-mode and loads in read-mode, thereby eliminating the dependency on mmap_lock for this purpose. This is a preparatory step before we replace mmap_lock usage with per-vma locks in fill/move ioctls. Link: https://lkml.kernel.org/r/20240215182756.3448972-3-lokeshgidra@google.com Signed-off-by: Lokesh Gidra <lokeshgidra@google.com> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Brian Geffon <bgeffon@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Nicolas Geoffray <ngeoffray@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Tim Murray <timmurray@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> (cherry picked from commit 5e4c24a57b0c126686534b5b159a406c5dd02400) Conflicts: fs/userfaultfd.c include/linux/userfaultfd_k.h mm/userfaultfd.c 1. Functions passing control from fs/userfaultfd.c to mm/userfaultfd.c were renamed after 6.1. a. Replace mfill_atomic_copy() with mcopy_atomic() b. Replace mfill_atomic_zeropage() with mfill_zeropage() c. Replace mfill_atomic_continue() with mcopy_continue() d. Replace mfill_atomic() with __mcopy_atomic() e. Replace mfill_atomic_hugetlb() with __mcopy_atomic_hugetlb() 2. uffd flags were unified into a single parameter after 6.1. Replace 'flags' with 'mcopy_mode' and 'mode'. 3. Fetch dst_mm from dst_vma in __mcopy_atomic_hugetlb(). Bug: 320478828 Change-Id: I77615c36a0c891801c9eb9de3609df4e7f125c39 Signed-off-by: Lokesh Gidra <lokeshgidra@google.com>	2024-04-22 18:09:14 +00:00
Lokesh Gidra	6b5ee039a1	BACKPORT: userfaultfd: move userfaultfd_ctx struct to header file Patch series "per-vma locks in userfaultfd", v7. Performing userfaultfd operations (like copy/move etc.) in critical section of mmap_lock (read-mode) causes significant contention on the lock when operations requiring the lock in write-mode are taking place concurrently. We can use per-vma locks instead to significantly reduce the contention issue. Android runtime's Garbage Collector uses userfaultfd for concurrent compaction. mmap-lock contention during compaction potentially causes jittery experience for the user. During one such reproducible scenario, we observed the following improvements with this patch-set: - Wall clock time of compaction phase came down from ~3s to <500ms - Uninterruptible sleep time (across all threads in the process) was ~10ms (none in mmap_lock) during compaction, instead of >20s This patch (of 4): Move the struct to userfaultfd_k.h to be accessible from mm/userfaultfd.c. There are no other changes in the struct. This is required to prepare for using per-vma locks in userfaultfd operations. Link: https://lkml.kernel.org/r/20240215182756.3448972-1-lokeshgidra@google.com Link: https://lkml.kernel.org/r/20240215182756.3448972-2-lokeshgidra@google.com Signed-off-by: Lokesh Gidra <lokeshgidra@google.com> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Brian Geffon <bgeffon@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Nicolas Geoffray <ngeoffray@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Tim Murray <timmurray@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> (cherry picked from commit f91e6b41dd11daffb138e3afdb4804aefc3d4e1b) Conflicts: include/linux/userfaultfd_k.h 1. Retain 'sysctl_unprivileged_userfaultfd' global variable. Bug: 320478828 Change-Id: Iebaae028d5e793dd50342b141c1d46b79026834a Signed-off-by: Lokesh Gidra <lokeshgidra@google.com>	2024-04-22 18:09:14 +00:00
Lokesh Gidra	ac96edb501	BACKPORT: userfaultfd: fix mmap_changing checking in mfill_atomic_hugetlb In mfill_atomic_hugetlb(), mmap_changing isn't being checked again if we drop mmap_lock and reacquire it. When the lock is not held, mmap_changing could have been incremented. This is also inconsistent with the behavior in mfill_atomic(). Link: https://lkml.kernel.org/r/20240117223729.1444522-1-lokeshgidra@google.com Fixes: `df2cc96e77` ("userfaultfd: prevent non-cooperative events vs mcopy_atomic races") Signed-off-by: Lokesh Gidra <lokeshgidra@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Brian Geffon <bgeffon@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Nicolas Geoffray <ngeoffray@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> (cherry picked from commit 67695f18d55924b2013534ef3bdc363bc9e14605) Conflicts: mm/userfaultfd.c 1. Update mfill_atomic_hugetlb() parameters to pass 'wp_copy' and 'mode' instead of 'flags'. Bug: 320478828 Change-Id: I11ef09b2b8e477c32cc731205fd48b25bcbd020f Signed-off-by: Lokesh Gidra <lokeshgidra@google.com>	2024-04-22 18:09:14 +00:00
Suren Baghdasaryan	51eab7ecc4	BACKPORT: selftests/mm: add separate UFFDIO_MOVE test for PMD splitting Add a test for UFFDIO_MOVE ioctl operating on a hugepage which has to be split because destination is marked with MADV_NOHUGEPAGE. With this we cover all 3 cases: normal page move, hugepage move, hugepage splitting before move. Link: https://lkml.kernel.org/r/20231230025636.2477429-1-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Brian Geffon <bgeffon@google.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Nicolas Geoffray <ngeoffray@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: ZhangPeng <zhangpeng362@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> (cherry picked from commit a5b7620bab81f16e8bbb04f4aea94c4c7feb0d77) Conflicts: tools/testing/selftests/mm/uffd-unit-tests.c tools/testing/selftests/vm/userfaultfd.c 1. Add request_src_hugepages() to enable THP on src 2. Add madvise() to enable THP on dst in request_hugepages() 3. Add request_split_hugepages() to enable THP on src and disable on dst 4. Change return type of uffd_move_pmd_split_test() to int Bug: 274911254 Change-Id: I21147a5b7f3e8bbe2befa8bff536e62826e9f6e3 Signed-off-by: Lokesh Gidra <lokeshgidra@google.com>	2024-04-22 18:09:14 +00:00
Suren Baghdasaryan	f152691515	BACKPORT: selftests/mm: add UFFDIO_MOVE ioctl test Add tests for new UFFDIO_MOVE ioctl which uses uffd to move source into destination buffer while checking the contents of both after the move. After the operation the content of the destination buffer should match the original source buffer's content while the source buffer should be zeroed. Separate tests are designed for PMD aligned and unaligned cases because they utilize different code paths in the kernel. Link: https://lkml.kernel.org/r/20231206103702.3873743-6-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Brian Geffon <bgeffon@google.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Nicolas Geoffray <ngeoffray@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: ZhangPeng <zhangpeng362@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> (cherry picked from commit a2bf6a9ca80532b75f8f8b6a1cd75ef7e5150576) Conflicts: tools/testing/selftests/mm/uffd-common.c tools/testing/selftests/mm/uffd-common.h tools/testing/selftests/mm/uffd-unit-tests.c tools/testing/selftests/vm/userfaultfd.c 1. Removed errmsg parameter from prevent_hugepages() and post_hugepages() 2. Removed uffd_test_args parameter from uffd_move_* functions 3. Added uffd_test_case_ops parameter in uffd_move_test_common() 4. Added userfaultfd_move_test() for all 'move' tests, called from userfaultfd_stress() 5. Added 'test_uffdio_move' global bool variable, which is set to true only when testing anon mappings 6. Added call to uffd_test_ctx_init() and uffd_test_ctx_clear() in uffd_move_test_common() 7. Replaced uffd_args with uffd_stats 8. Converted return type of uffd_move_test() and uffd_move_pmd_test() to `int` 9. Added uffd_test_page_fault_handler as uffd_args doesn't exist. uffd_poll_thread() checks if it is NULL then calls uffd_handle_page_fault(). 10. Replaced uffd_register() (isn't defined on 6.1) with UFFDIO_REGISTER ioctl call 11. Added printf() calls to log when the test is starting and finishing. 12. Change return type of uffd_move_test_common() to int. Bug: 274911254 Change-Id: I1c68445d9c64533aab0ba27c2e010347d0807981 Signed-off-by: Lokesh Gidra <lokeshgidra@google.com>	2024-04-22 18:09:14 +00:00
Suren Baghdasaryan	a5d504c067	BACKPORT: selftests/mm: add uffd_test_case_ops to allow test case-specific operations Currently each test can specify unique operations using uffd_test_ops, however these operations are per-memory type and not per-test. Add uffd_test_case_ops which each test case can customize for its own needs regardless of the memory type being used. Pre- and post-allocation operations are added, some of which will be used in the next patch to implement test-specific operations like madvise after memory is allocated but before it is accessed. Link: https://lkml.kernel.org/r/20231206103702.3873743-5-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Brian Geffon <bgeffon@google.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Nicolas Geoffray <ngeoffray@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: ZhangPeng <zhangpeng362@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> (cherry picked from commit e8a422408ba9760e2640ca57e4b79c3dd7f48bd2) Conflicts: tools/testing/selftests/mm/uffd-common.c tools/testing/selftests/mm/uffd-common.h tools/testing/selftests/mm/uffd-unit-tests.c tools/testing/selftests/vm/userfaultfd.c 1. Userfaultfd selftest was split into separate uffd-* files and moved to selftests/mm. 2. In 6.1 there is no mechanism to run individual unit-tests. All unit-tests are run after stress test. Consequently, the tests are not abstracted using 'uffd_test_case_t'. Therefore, added 'uffd_test_case_ops' as a parameter to uffd_test_ctx_init(). Bug: 274911254 Change-Id: I6480abf1709ca717d9baad5047bf675852f10726 Signed-off-by: Lokesh Gidra <lokeshgidra@google.com>	2024-04-22 18:09:14 +00:00
Suren Baghdasaryan	ee72d5a7d9	BACKPORT: selftests/mm: call uffd_test_ctx_clear at the end of the test uffd_test_ctx_clear() is being called from uffd_test_ctx_init() to unmap areas used in the previous test run. This approach is problematic because while unmapping areas uffd_test_ctx_clear() uses page_size and nr_pages which might differ from one test run to another. Fix this by calling uffd_test_ctx_clear() after each test is done. Link: https://lkml.kernel.org/r/20231206103702.3873743-4-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Axel Rasmussen <axelrasmussen@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Brian Geffon <bgeffon@google.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Nicolas Geoffray <ngeoffray@google.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: ZhangPeng <zhangpeng362@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> (cherry picked from commit 1c8d39fa7b63dcbb77af7b0325fdc519c35fe618) Conflicts: tools/testing/selftests/mm/uffd-common.c tools/testing/selftests/mm/uffd-common.h tools/testing/selftests/mm/uffd-stress.c tools/testing/selftests/mm/uffd-unit-tests.c tools/testing/selftests/vm/userfaultfd.c 1. Userfaultfd selftest was split into separate files and as a consequence the code moved from selftests/vm/userfaultfd.c to selftests/mm/uffd_* files. Bug: 274911254 Change-Id: Ic224c3965a645342dc0f41e743d3c072b7bb852e Signed-off-by: Lokesh Gidra <lokeshgidra@google.com>	2024-04-22 18:09:14 +00:00
Lokesh Gidra	abd6748ba6	UPSTREAM: userfaultfd: fix return error if mmap_changing is non-zero in MOVE ioctl To be consistent with other uffd ioctl's returning EAGAIN when mmap_changing is detected, we should change UFFDIO_MOVE to do the same. Link: https://lkml.kernel.org/r/20240117223922.1445327-1-lokeshgidra@google.com Signed-off-by: Lokesh Gidra <lokeshgidra@google.com> Acked-by: Suren Baghdasaryan <surenb@google.com> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Brian Geffon <bgeffon@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Nicolas Geoffray <ngeoffray@google.com> Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> (cherry picked from commit 6ca03f1bb5a7427a66df62c954b3500a4255cdb9) Bug: 274911254 Change-Id: I3499e3987bf72d3d7a307165e0f9d1ed6d2b0611 Signed-off-by: Lokesh Gidra <lokeshgidra@google.com>	2024-04-22 18:09:14 +00:00
Lokesh Gidra	4f658d7723	BACKPORT: userfaultfd: change src_folio after ensuring it's unpinned in UFFDIO_MOVE Commit d7a08838ab74 ("mm: userfaultfd: fix unexpected change to src_folio when UFFDIO_MOVE fails") moved the src_folio->{mapping, index} changing to after clearing the page-table and ensuring that it's not pinned. This avoids failure of swapout+migration and possibly memory corruption. However, the commit missed fixing it in the huge-page case. Link: https://lkml.kernel.org/r/20240404171726.2302435-1-lokeshgidra@google.com Fixes: adef440691ba ("userfaultfd: UFFDIO_MOVE uABI") Signed-off-by: Lokesh Gidra <lokeshgidra@google.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Nicolas Geoffray <ngeoffray@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> (cherry picked from commit c0205eaf3af9f5db14d4b5ee4abacf4a583c3c50) Conflicts: mm/huge_memory.c 1. Replace folio_move_anon_rmap() with page_move_anon_rmap(). Bug: 274911254 Change-Id: I15a07ea22de7ae38ed20320a73c995c7c48ef42b Signed-off-by: Lokesh Gidra <lokeshgidra@google.com>	2024-04-22 18:09:14 +00:00
Qi Zheng	bfb4b24b64	BACKPORT: mm: userfaultfd: fix unexpected change to src_folio when UFFDIO_MOVE fails After ptep_clear_flush(), if we find that src_folio is pinned we will fail UFFDIO_MOVE and put src_folio back to src_pte entry, but the change to src_folio->{mapping,index} is not restored in this process. This is not what we expected, so fix it. This can cause the rmap for that page to be invalid, possibly resulting in memory corruption. At least swapout+migration would no longer work, because we might fail to locate the mappings of that folio. Link: https://lkml.kernel.org/r/20240222080815.46291-1-zhengqi.arch@bytedance.com Fixes: adef440691ba ("userfaultfd: UFFDIO_MOVE uABI") Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> (cherry picked from commit d7a08838ab74652f2b53fee9763f0178278c3a4b) Conflicts: mm/userfaultfd.c 1. Replace folio_move_anon_rmap() with page_move_anon_rmap(). Bug: 274911254 Change-Id: Ie4bf5785244271ab233c6230ed71460fd571bd1a Signed-off-by: Lokesh Gidra <lokeshgidra@google.com>	2024-04-22 18:09:14 +00:00
Suren Baghdasaryan	6ecd08eaf4	BACKPORT: userfaultfd: handle zeropage moves by UFFDIO_MOVE Current implementation of UFFDIO_MOVE fails to move zeropages and returns EBUSY when it encounters one. We can handle them by mapping a zeropage at the destination and clearing the mapping at the source. This is done both for ordinary and for huge zeropages. Link: https://lkml.kernel.org/r/20240131175618.2417291-1-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/r/202401300107.U8iMAkTl-lkp@intel.com/ Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Brian Geffon <bgeffon@google.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Nicolas Geoffray <ngeoffray@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: ZhangPeng <zhangpeng362@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> (cherry picked from commit eb1521dad8f391d3f3b88f589db27a288b55b8ed) Conflicts: mm/huge_memory.c 1. Replace folio_move_anon_rmap() with page_move_anon_rmap(). 2. Remove vma parameter in pmd_mkwrite() calls. Bug: 274911254 Change-Id: I271aa365bb3930e7e480d5749b44863eeca74dda Signed-off-by: Lokesh Gidra <lokeshgidra@google.com>	2024-04-22 18:09:14 +00:00
Suren Baghdasaryan	e275c2b743	UPSTREAM: userfaultfd: avoid huge_zero_page in UFFDIO_MOVE While testing UFFDIO_MOVE ioctl, syzbot triggered VM_BUG_ON_PAGE caused by a call to PageAnonExclusive() with a huge_zero_page as a parameter. UFFDIO_MOVE does not yet handle zeropages and returns EBUSY when one is encountered. Add an early huge_zero_page check in the PMD move path to avoid this situation. Link: https://lkml.kernel.org/r/20240112013935.1474648-1-surenb@google.com Fixes: adef440691ba ("userfaultfd: UFFDIO_MOVE uABI") Reported-by: syzbot+705209281e36404998f6@syzkaller.appspotmail.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> (cherry picked from commit 5d4747a6cc8e78ce74742d557fc9b7697fcacc95) Bug: 274911254 Change-Id: I7096b02b3a5b101e049608703ee77179d469a434 Signed-off-by: Lokesh Gidra <lokeshgidra@google.com>	2024-04-22 18:09:14 +00:00
Suren Baghdasaryan	60c5a0e023	UPSTREAM: userfaultfd: fix move_pages_pte() splitting folio under RCU read lock While testing the split PMD path with lockdep enabled I've got an "Invalid wait context" error caused by split_huge_page_to_list() trying to lock anon_vma->rwsem while inside RCU read section. The issue is due to move_pages_pte() calling split_folio() under RCU read lock. Fix this by unmapping the PTEs and exiting RCU read section before splitting the folio and then retrying. The same retry pattern is used when locking the folio or anon_vma in this function. After splitting the large folio we unlock and release it because after the split the old folio might not be the one that contains the src_addr. Link: https://lkml.kernel.org/r/20240102233256.1077959-1-surenb@google.com Fixes: adef440691ba ("userfaultfd: UFFDIO_MOVE uABI") Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Peter Xu <peterx@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Brian Geffon <bgeffon@google.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Liam R. 
Howlett <Liam.Howlett@oracle.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Nicolas Geoffray <ngeoffray@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: ZhangPeng <zhangpeng362@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> (cherry picked from commit 982ae058b2f08f576e4f3d4055f8916ba789f3d4) Bug: 274911254 Change-Id: I382c6631d821b0ed26d9b15afa78a417dafaeb2e Signed-off-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Lokesh Gidra <lokeshgidra@google.com>	2024-04-22 18:09:14 +00:00
Andrea Arcangeli	5025ad140e	BACKPORT: userfaultfd: UFFDIO_MOVE uABI Implement the uABI of UFFDIO_MOVE ioctl. UFFDIO_COPY performs ~20% better than UFFDIO_MOVE when the application needs pages to be allocated [1]. However, with UFFDIO_MOVE, if pages are available (in userspace) for recycling, as is usually the case in heap compaction algorithms, then we can avoid the page allocation and memcpy (done by UFFDIO_COPY). Also, since the pages are recycled in the userspace, we avoid the need to release (via madvise) the pages back to the kernel [2]. We see over 40% reduction (on a Google pixel 6 device) in the compacting thread's completion time by using UFFDIO_MOVE vs. UFFDIO_COPY. This was measured using a benchmark that emulates a heap compaction implementation using userfaultfd (to allow concurrent accesses by application threads). More details of the usecase are explained in [2]. Furthermore, UFFDIO_MOVE enables moving swapped-out pages without touching them within the same vma. Today, it can only be done by mremap, however it forces splitting the vma. [1] https://lore.kernel.org/all/1425575884-2574-1-git-send-email-aarcange@redhat.com/ [2] https://lore.kernel.org/linux-mm/CA+EESO4uO84SSnBhArH4HvLNhaUQ5nZKNKXqxRCyjniNVjp0Aw@mail.gmail.com/ Update for the ioctl_userfaultfd(2) manpage: UFFDIO_MOVE (Since Linux xxx) Move a continuous memory chunk into the userfault registered range and optionally wake up the blocked thread. 
The source and destination addresses and the number of bytes to move are specified by the src, dst, and len fields of the uffdio_move structure pointed to by argp: struct uffdio_move { __u64 dst; /* Destination of move */ __u64 src; /* Source of move */ __u64 len; /* Number of bytes to move */ __u64 mode; /* Flags controlling behavior of move */ __s64 move; /* Number of bytes moved, or negated error */ }; The following value may be bitwise ORed in mode to change the behavior of the UFFDIO_MOVE operation: UFFDIO_MOVE_MODE_DONTWAKE Do not wake up the thread that waits for page-fault resolution UFFDIO_MOVE_MODE_ALLOW_SRC_HOLES Allow holes in the source virtual range that is being moved. When not specified, the holes will result in ENOENT error. When specified, the holes will be accounted as successfully moved memory. This is mostly useful to move hugepage aligned virtual regions without knowing if there are transparent hugepages in the regions or not, but preventing the risk of having to split the hugepage during the operation. The move field is used by the kernel to return the number of bytes that was actually moved, or an error (a negated errno-style value). If the value returned in move doesn't match the value that was specified in len, the operation fails with the error EAGAIN. The move field is output-only; it is not read by the UFFDIO_MOVE operation. The operation may fail for various reasons. Usually, remapping of pages that are not exclusive to the given process fail; once KSM might deduplicate pages or fork() COW-shares pages during fork() with child processes, they are no longer exclusive. Further, the kernel might only perform lightweight checks for detecting whether the pages are exclusive, and return -EBUSY in case that check fails. To make the operation more likely to succeed, KSM should be disabled, fork() should be avoided or MADV_DONTFORK should be configured for the source VMA before fork(). This ioctl(2) operation returns 0 on success. 
In this case, the entire area was moved. On error, -1 is returned and errno is set to indicate the error. Possible errors include: EAGAIN The number of bytes moved (i.e., the value returned in the move field) does not equal the value that was specified in the len field. EINVAL Either dst or len was not a multiple of the system page size, or the range specified by src and len or dst and len was invalid. EINVAL An invalid bit was specified in the mode field. ENOENT The source virtual memory range has unmapped holes and UFFDIO_MOVE_MODE_ALLOW_SRC_HOLES is not set. EEXIST The destination virtual memory range is fully or partially mapped. EBUSY The pages in the source virtual memory range are either pinned or not exclusive to the process. The kernel might only perform lightweight checks for detecting whether the pages are exclusive. To make the operation more likely to succeed, KSM should be disabled, fork() should be avoided or MADV_DONTFORK should be configured for the source virtual memory area before fork(). ENOMEM Allocating memory needed for the operation failed. ESRCH The target process has exited at the time of a UFFDIO_MOVE operation. Link: https://lkml.kernel.org/r/20231206103702.3873743-3-surenb@google.com Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Suren Baghdasaryan <surenb@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Brian Geffon <bgeffon@google.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Liam R. 
Howlett <Liam.Howlett@oracle.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Nicolas Geoffray <ngeoffray@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: ZhangPeng <zhangpeng362@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> (cherry picked from commit adef440691bab824e39c1b17382322d195e1fab0) Conflicts: mm/huge_memory.c mm/userfaultfd.c 1. Add vma parameter in mmu_notifier_range_init() calls. 2. Replace folio_move_anon_rmap() with page_move_anon_rmap(). 3. Remove vma parameter in pmd_mkwrite() calls. 4. Replace pte_offset_map_nolock() with pte_offset_map()+pte_lockptr() combo. 5. Remove VM_SHADOW_STACK in vma_move_compatible(). 6. Replace pmdp_get_lockless() with pmd_read_atomic(). Bug: 274911254 Change-Id: I1116f15a395f1a8bac176906f7f9c2411e59dc54 Signed-off-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Lokesh Gidra <lokeshgidra@google.com>	2024-04-22 18:09:14 +00:00
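
The manpage text in the UFFDIO_MOVE uABI commit above says the `move` field returns either the number of bytes moved or a negated error, and that a short move fails with EAGAIN. A minimal sketch of how a caller might interpret that field and prepare a retry is shown below. It is an illustration only: `struct uffdio_move_sketch`, `enum move_outcome`, `classify_move()` and `advance_past_moved()` are hypothetical names that are not part of the uABI, and the struct is a local stand-in that only mirrors the field layout quoted in the commit message. Real code would include `<linux/userfaultfd.h>` and issue the ioctl itself.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for struct uffdio_move with the field layout quoted in the
 * commit message. The "_sketch" suffix marks it as hypothetical; real code
 * would use the definition from <linux/userfaultfd.h>. */
struct uffdio_move_sketch {
    uint64_t dst;   /* Destination of move */
    uint64_t src;   /* Source of move */
    uint64_t len;   /* Number of bytes to move */
    uint64_t mode;  /* Flags controlling behavior of move */
    int64_t  move;  /* Number of bytes moved, or negated error */
};

/* Hypothetical outcome labels used only by this sketch. */
enum move_outcome { MOVE_COMPLETE, MOVE_PARTIAL, MOVE_ERROR };

/* Interpret the output-only move field after the ioctl has returned. */
static enum move_outcome classify_move(const struct uffdio_move_sketch *req)
{
    if (req->move < 0)
        return MOVE_ERROR;      /* negated errno-style value */
    if ((uint64_t)req->move == req->len)
        return MOVE_COMPLETE;   /* the entire area was moved */
    return MOVE_PARTIAL;        /* fewer bytes than len were moved */
}

/* After a partial move, skip the bytes already moved so that the same
 * request structure can be submitted again for the remainder. */
static void advance_past_moved(struct uffdio_move_sketch *req)
{
    assert(req->move >= 0 && (uint64_t)req->move <= req->len);
    uint64_t done = (uint64_t)req->move;

    req->dst += done;
    req->src += done;
    req->len -= done;
    req->move = 0;
}
```

A caller would typically loop: submit the request, call `classify_move()`, and on `MOVE_PARTIAL` call `advance_past_moved()` before resubmitting. Whether retrying is appropriate for a given error is a policy decision this sketch does not make.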
Andrea Arcangeli	25db7c13d8	UPSTREAM: mm/rmap: support move to different root anon_vma in folio_move_anon_rmap() Patch series "userfaultfd move option", v6. This patch series introduces UFFDIO_MOVE feature to userfaultfd, which has long been implemented and maintained by Andrea in his local tree [1], but was not upstreamed due to lack of use cases where this approach would be better than allocating a new page and copying the contents. Previous upstreaming attempts could be found at [6] and [7]. UFFDIO_COPY performs ~20% better than UFFDIO_MOVE when the application needs pages to be allocated [2]. However, with UFFDIO_MOVE, if pages are available (in userspace) for recycling, as is usually the case in heap compaction algorithms, then we can avoid the page allocation and memcpy (done by UFFDIO_COPY). Also, since the pages are recycled in the userspace, we avoid the need to release (via madvise) the pages back to the kernel [3]. We see over 40% reduction (on a Google pixel 6 device) in the compacting thread's completion time by using UFFDIO_MOVE vs. UFFDIO_COPY. This was measured using a benchmark that emulates a heap compaction implementation using userfaultfd (to allow concurrent accesses by application threads). More details of the usecase are explained in [3]. Furthermore, UFFDIO_MOVE enables moving swapped-out pages without touching them within the same vma. Today, it can only be done by mremap, however it forces splitting the vma. TODOs for follow-up improvements: - cross-mm support. Known differences from single-mm and missing pieces: - memcg recharging (might need to isolate pages in the process) - mm counters - cross-mm deposit table moves - cross-mm test - document the address space where src and dest reside in struct uffdio_move - TLB flush batching. Will require extensive changes to PTL locking in move_pages_pte(). OTOH that might let us reuse parts of mremap code. 
This patch (of 5): For now, folio_move_anon_rmap() was only used to move a folio to a different anon_vma after fork(), whereby the root anon_vma stayed unchanged. For that, it was sufficient to hold the folio lock when calling folio_move_anon_rmap(). However, we want to make use of folio_move_anon_rmap() to move folios between VMAs that have a different root anon_vma. As folio_referenced() performs an RMAP walk without holding the folio lock but only holding the anon_vma in read mode, holding the folio lock is insufficient. When moving to an anon_vma with a different root anon_vma, we'll have to hold both, the folio lock and the anon_vma lock in write mode. Consequently, whenever we succeeded in folio_lock_anon_vma_read() to read-lock the anon_vma, we have to re-check if the mapping was changed in the meantime. If that was the case, we have to retry. Note that folio_move_anon_rmap() must only be called if the anon page is exclusive to a process, and must not be called on KSM folios. This is a preparation for UFFDIO_MOVE, which will hold the folio lock, the anon_vma lock in write mode, and the mmap_lock in read mode. Link: https://lkml.kernel.org/r/20231206103702.3873743-1-surenb@google.com Link: https://lkml.kernel.org/r/20231206103702.3873743-2-surenb@google.com Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Suren Baghdasaryan <surenb@google.com> Acked-by: Peter Xu <peterx@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Brian Geffon <bgeffon@google.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: kernel-team@android.com Cc: Liam R. 
Howlett <Liam.Howlett@oracle.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Nicolas Geoffray <ngeoffray@google.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: ZhangPeng <zhangpeng362@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> (cherry picked from commit 880a99b60d467eefd96322e27b0a8c0b805dfa43) Bug: 274911254 Change-Id: Iad9619c0273e050af26356f66ae9fc88b56d68bd Signed-off-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Lokesh Gidra <lokeshgidra@google.com>	2024-04-22 18:09:14 +00:00
Nikhil V	503add1843	ANDROID: PM: hibernate: Encryption support with compression Currently only the uncompressed hibernation snapshot image is encrypted before being written to the swap partition. Extend the encryption support for compression enabled scenarios as well. Bug: 335581841 Change-Id: Ida781b727f56b664a67e2887a4db3d6b355dafdb Signed-off-by: Nikhil V <quic_nprakash@quicinc.com>	2024-04-22 11:05:17 +05:30
Nikhil V	3e99ae28ea	ANDROID: abi_gki_aarch64_qcom: Update symbol list Add android_vh_hibernate_save_cmp_len, android_vh_hibernated_do_mem_alloc symbols to support compression with hibernation. Symbols added: __traceiter_android_vh_hibernate_save_cmp_len __traceiter_android_vh_hibernated_do_mem_alloc __tracepoint_android_vh_hibernate_save_cmp_len __tracepoint_android_vh_hibernated_do_mem_alloc Bug: 335581841 Change-Id: I99e704bd54f220bac180c5bbfec48da44359f27e Signed-off-by: Nikhil V <quic_nprakash@quicinc.com>	2024-04-22 11:04:47 +05:30
Nikhil V	8f08ea0d59	ANDROID: vendor_hooks: Add hooks to support hibernation In case of hibernation with compression enabled, 'n' number of pages will be compressed to 'x' number of pages before being written to the disk. Keep a note of these compressed block counts so that bootloader can directly read 'x' pages and pass it on to the decompressor. An array will be maintained which will hold the count of these compressed blocks and later on written to the disk as part of the hibernation image save process. The vendor hook '__tracepoint_android_vh_hibernated_do_mem_alloc' does the required memory allocations, for example, the array which is dynamically allocated based on the snapshot image size so as to hold the compressed block counts etc. This memory is later freed as part of PM_POST_HIBERNATION notifier call. The vendor hook '__tracepoint_android_vh_hibernate_save_cmp_len' saves the compressed block counts to the array which is later written to the disk. Bug: 335581841 Change-Id: I574b641e2d9f4cd503c7768a66a7be3142c2686b Signed-off-by: Nikhil V <quic_nprakash@quicinc.com>	2024-04-22 11:04:11 +05:30
Nikhil V	e7e8932600	ANDROID: gki_defconfig: Sync gki_defconfig After applying commit `990d3701d0` ("BACKPORT: PM: hibernate: Move to crypto APIs for LZO compression"), CRYPTO_LZO is selected by default. Sync the gki_defconfig accordingly. This doesn't add any functional change. Bug: 335581841 Change-Id: Iafa7211245f6d66a96f8f0030e2574c7a220d3a4 Signed-off-by: Nikhil V <quic_nprakash@quicinc.com>	2024-04-22 11:02:55 +05:30
Nikhil V	54c2418b76	UPSTREAM: PM: hibernate: Support to select compression algorithm Currently the default compression algorithm is selected based on compile time options. Introduce a module parameter "hibernate.compressor" to override this behaviour. Different compression algorithms have different characteristics and hibernation may benefit when it uses any of these algorithms, especially when a secondary algorithm(LZ4) offers better decompression speeds over a default algorithm(LZO), which in turn reduces hibernation image restore time. Users can override the default algorithm in two ways: 1) Passing "hibernate.compressor" as kernel command line parameter. Usage: LZO: hibernate.compressor=lzo LZ4: hibernate.compressor=lz4 2) Specifying the algorithm at runtime. Usage: LZO: echo lzo > /sys/module/hibernate/parameters/compressor LZ4: echo lz4 > /sys/module/hibernate/parameters/compressor Currently LZO and LZ4 are the supported algorithms. LZO is the default compression algorithm used with hibernation. Bug: 335581841 Signed-off-by: Nikhil V <quic_nprakash@quicinc.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> (cherry picked from commit 3fec6e5961b77af6a952b77f5c2ea26f7513b216) Change-Id: I3c787939c8d37dfeb2c164de69078d303411f938 Signed-off-by: Nikhil V <quic_nprakash@quicinc.com>	2024-04-22 10:56:17 +05:30
Nikhil V	76c7e9747b	UPSTREAM: PM: hibernate: Add support for LZ4 compression for hibernation Extend the support for LZ4 compression to be used with hibernation. The main idea is that different compression algorithms have different characteristics and hibernation may benefit when it uses any of these algorithms: a default algorithm, having higher compression rate but is slower(compression/decompression) and a secondary algorithm, that is faster(compression/decompression) but has lower compression rate. LZ4 algorithm has better decompression speeds over LZO. This reduces the hibernation image restore time. As per test results (LZO vs. LZ4): Size before Compression(bytes): 682696704 vs. 682393600; Size after Compression(bytes): 146502402 vs. 155993547; Decompression Rate: 335.02 MB/s vs. 501.05 MB/s; Restore time: 4.4s vs. 3.8s. LZO is the default compression algorithm used for hibernation. Enable CONFIG_HIBERNATION_COMP_LZ4 to set the default compressor as LZ4. Bug: 335581841 Signed-off-by: Nikhil V <quic_nprakash@quicinc.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> (cherry picked from commit 8bc29736357e7f9a6bd0d16b57b5612197e1924b) Change-Id: I640d834bb626e9a139e41740d4bef7548d5c6401 Signed-off-by: Nikhil V <quic_nprakash@quicinc.com>	2024-04-22 10:55:24 +05:30
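
The test results quoted in the LZ4 commit above give raw sizes and times but not the derived ratios. The small sketch below shows one way to work them out from those figures. The helper names `compressed_fraction()` and `relative_reduction()` are hypothetical and exist only for this illustration; the inputs are the numbers from the commit message, nothing is measured here.

```c
#include <assert.h>
#include <stdint.h>

/* Fraction of the original image size that remains after compression
 * (smaller means the algorithm compressed more). */
static double compressed_fraction(uint64_t bytes_before, uint64_t bytes_after)
{
    assert(bytes_before > 0);
    return (double)bytes_after / (double)bytes_before;
}

/* Relative reduction of a candidate value against a baseline,
 * for example restore time with LZ4 against restore time with LZO. */
static double relative_reduction(double baseline, double candidate)
{
    assert(baseline > 0.0);
    return (baseline - candidate) / baseline;
}
```

With the quoted figures, `compressed_fraction(682696704, 146502402)` comes to roughly 0.21 for LZO and `compressed_fraction(682393600, 155993547)` to roughly 0.23 for LZ4, while `relative_reduction(4.4, 3.8)` puts the restore time saving at about 14%. This matches the commit's description of LZ4 as compressing slightly less but restoring faster.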
Nikhil V	990d3701d0	BACKPORT: PM: hibernate: Move to crypto APIs for LZO compression Currently for hibernation, LZO is the only compression algorithm available and uses the existing LZO library calls. However, there is no flexibility to switch to other algorithms which provides better results. The main idea is that different compression algorithms have different characteristics and hibernation may benefit when it uses alternate algorithms. By moving to crypto based APIs, it lays a foundation to use other compression algorithms for hibernation. There are no functional changes introduced by this approach. Bug: 335581841 Signed-off-by: Nikhil V <quic_nprakash@quicinc.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> (cherry picked from commit a06c6f5d3cc90b3b070d7b99979d57238db77a86) Change-Id: I8d15262f9823219d291b84eab28b2ec44474dad4 [quic_nprakash: Resolved minor conflicts in kernel/power/(power.h,swap.c)] Signed-off-by: Nikhil V <quic_nprakash@quicinc.com>	2024-04-22 10:54:46 +05:30
Nikhil V	d224d17a14	BACKPORT: PM: hibernate: Rename lzo* to make it generic Renaming lzo* to generic names, except for lzo_xxx() APIs. This is used in the next patch where we move to crypto based APIs for compression. There are no functional changes introduced by this approach. Bug: 335581841 Signed-off-by: Nikhil V <quic_nprakash@quicinc.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> (cherry picked from commit 89a807625f9701154167bf6bf136adfa1be4d849) Change-Id: I8e2032658132965bed4c7b24b7409ae7a1bfa6cd [quic_nprakash: Resolved minor conflicts in kernel/power/swap.c] Signed-off-by: Nikhil V <quic_nprakash@quicinc.com>	2024-04-22 10:54:11 +05:30
Youngmin Nam	dcb09569bb	ANDROID: ABI: Update symbol list for Exynos SoC There are no new symbols to be added to GKI symbol list. We simply update our symbol list. Bug: 335537438 Change-Id: Iae0594ff776853df2b38eb3215a5378a03995c40 Signed-off-by: Youngmin Nam <youngmin.nam@samsung.com>	2024-04-18 14:01:15 +09:00
Andre Ding	692e3553d2	ANDROID: abi_gki_aarch64_qcom: Update symbol list Symbols updated to QCOM abi symbol list for Marvell Phy and Mdio Bus Mux: ethnl_cable_test_amplitude ethnl_cable_test_pulse ethnl_cable_test_step genphy_check_and_restart_aneg genphy_read_status_fixed of_mdio_find_bus phy_config_aneg phy_gbit_fibre_features Bug: 335414016 Change-Id: I1fa732935cb1a33bcde782fdfe38f5b3fcfae4cb Signed-off-by: Andre Ding <quic_shuangxi@quicinc.com>	2024-04-17 20:36:33 +00:00
Yongqiang Niu	8943be7d1b	BACKPORT: mtk-mmsys: Change mtk-mmsys & mtk-mutex to modules Change mtk-mmsys & mtk-mutex to modules for gki Bug: 335112842 (cherry picked from commit a7596e62dac7318456c1aa9af5bfccf0f8e6ad7e https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/ master) Signed-off-by: Yongqiang Niu <yongqiang.niu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20221118063018.13520-1-yongqiang.niu@mediatek.com Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com> Change-Id: I5a0789d26b188ca576a92df34af5940f539f1ac3	2024-04-17 20:28:21 +00:00
AngeloGioacchino Del Regno	34e8dc4ed0	BACKPORT: clk: mediatek: Split configuration options for MT8186 clock drivers When building clock drivers for MT8186, some may want to build in only some of them to, for example, get CPUFreq up faster, and some may want to leave out some clock drivers entirely as a machine may not need the Warp Engine or the camera ISP (hence, their clock drivers). Split the various clock drivers in their own configuration options, keeping MT8186 configuration options consistent with other MediaTek SoCs. While at it, also allow building the remaining clock drivers as modules by switching COMMON_CLK_MT8186 to tristate. Bug: 335112842 (cherry picked from commit 5baf38e06a570a2a4ed471a996aff6d6ba69cceb https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/ master) Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Chen-Yu Tsai <wenst@chromium.org> Link: https://lore.kernel.org/r/20230306140543.1813621-47-angelogioacchino.delregno@collabora.com Signed-off-by: Stephen Boyd <sboyd@kernel.org> Change-Id: Id9dad0e1d56fcae3bb9302b7db63be31afc91d5d	2024-04-17 20:28:21 +00:00
AngeloGioacchino Del Regno	a5ce14670a	BACKPORT: clk: mediatek: Add MODULE_LICENSE() where missing In order to successfully build clock drivers as modules it is required to declare a module license: add it where missing. While at it, also change the MODULE_LICENSE text from "GPL v2" to "GPL" (which means the same) on clk-mt7981-eth.c. Bug: 335112842 (cherry picked from commit a451da86cf6d10e94372d20622ec41aac9ec00b5 https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/ master) Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Chen-Yu Tsai <wenst@chromium.org> Tested-by: Miles Chen <miles.chen@mediatek.com> Tested-by: Chen-Yu Tsai <wenst@chromium.org> # MT8183, MT8192, MT8195 Chromebooks Link: https://lore.kernel.org/r/20230306140543.1813621-38-angelogioacchino.delregno@collabora.com Signed-off-by: Stephen Boyd <sboyd@kernel.org> Change-Id: Id78b0caa4520350049ae6162481c381eb3309897	2024-04-17 20:28:21 +00:00
Ken Huang	4bfe25d0b6	ANDROID: Update the ABI symbol list Adding the following symbols: - devm_drm_bridge_add Bug: 333511135 Change-Id: I4f815c00515c1f2660032e584778edf1c2c41da4 Signed-off-by: Ken Huang <kenbshuang@google.com>	2024-04-17 20:24:33 +00:00
Bart Van Assche	24edb63b85	Reapply "ANDROID: block: Add support for filesystem requests and small segments" This reverts commit I764adf995cae6b485d4d98e410c78128a88647e0. Bug: 333812722 Bug: 308663717 Bug: 319125789 Change-Id: Ie7f03b9d1ab67a33ccbc4311ba26cd746e1aaa22 Signed-off-by: Bart Van Assche <bvanassche@acm.org> [jyescas@google.com: Call blk_segments() in blk_mq_submit_bio() for the case when request is defined or it is null] Signed-off-by: Juan Yescas <jyescas@google.com>	2024-04-17 18:38:54 +00:00
Michael Wu	141ebdcb28	UPSTREAM: usb:typec:tcpm:support double Rp to Vbus cable as sink The USB Type-C Cable and Connector Specification defines the wire connections for the USB Type-C to USB 2.0 Standard-A cable assembly (Release 2.2, Chapter 3.5.2). The Notes say that Pin A5 (CC) of the USB Type-C plug shall be connected to Vbus through a resistor Rp. However, there is a large number of non-standard cables with double Rp connected to Vbus, produced by UGREEN, circulating on the market, and they can easily affect the normal operation of the state machine, especially when CC1 and CC2 are pulled up at the same time. In fact, we can regard those cables as sink to avoid abnormal state. Messages as follows: [ 58.900212] VBUS on [ 59.265433] CC1: 0 -> 3, CC2: 0 -> 3 [state TOGGLING, polarity 0, connected] [ 62.623308] CC1: 3 -> 0, CC2: 3 -> 0 [state TOGGLING, polarity 0, disconnected] [ 62.625006] VBUS off [ 62.625012] VBUS VSAFE0V Bug: 335057705 Change-Id: I415db22b0012ace9535039bc4c8e5ec113482e33 Signed-off-by: Michael Wu <michael@allwinnertech.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20230920063030.66312-1-michael@allwinnertech.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Frank Wang <frank.wang@rock-chips.com> (cherry picked from commit dbc1defec1aa7d8d80da3ea9e3ddafbcfca8f822)	2024-04-17 15:33:19 +00:00
John Scheible	8672a5ee4d	ANDROID: Update the ABI symbol list Adding the following symbols: - devm_pm_runtime_enable Bug: 335356311 Change-Id: Iecd45183cead8807974bb2a065c48aab86e47e89 Signed-off-by: John Scheible <johnscheible@google.com>	2024-04-16 21:17:14 -07:00
Will McVicker	089d1b8f6d	ANDROID: Add known structs used by modules to KMI This adds `struct dwc3` and `struct kernel_all_info` to the KMI via fake GKI symbols as we know some partners are using these in their out-of-tree drivers. This ensures that future changes to these structs will not break partner builds. Bug: 332277393 Bug: 236036821 Change-Id: Ifa1ac6b71d58415339a63f16a79c1f713dda789f Signed-off-by: Will McVicker <willmcvicker@google.com>	2024-04-16 13:49:35 -07:00
Pablo Neira Ayuso	77fec6cefe	UPSTREAM: netfilter: nf_tables: release mutex after nft_gc_seq_end from abort path commit 0d459e2ffb541841714839e8228b845458ed3b27 upstream. The commit mutex should not be released during the critical section between nft_gc_seq_begin() and nft_gc_seq_end(), otherwise, async GC worker could collect expired objects and get the released commit lock within the same GC sequence. nf_tables_module_autoload() temporarily releases the mutex to load module dependencies, then it goes back to replay the transaction again. Move it at the end of the abort phase after nft_gc_seq_end() is called. Bug: 332996726 Cc: stable@vger.kernel.org Fixes: 720344340fb9 ("netfilter: nf_tables: GC transaction race with abort path") Reported-by: Kuan-Ting Chen <hexrabbit@devco.re> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 8038ee3c3e5b59bcd78467686db5270c68544e30) Signed-off-by: Lee Jones <joneslee@google.com> Change-Id: I637389421d8eca5ab59a41bd1a4b70432440034c	2024-04-15 11:20:28 +00:00
Pablo Neira Ayuso	e27468009d	UPSTREAM: netfilter: nf_tables: release batch on table validation from abort path commit a45e6889575c2067d3c0212b6bc1022891e65b91 upstream. Unlike early commit path stage which triggers a call to abort, an explicit release of the batch is required on abort, otherwise mutex is released and commit_list remains in place. Add WARN_ON_ONCE to ensure commit_list is empty from the abort path before releasing the mutex. After this patch, commit_list is always assumed to be empty before grabbing the mutex, therefore `03c1f1ef15` ("netfilter: Cleanup nft_net->module_list from nf_tables_exit_net()") only needs to release the pending modules for registration. Bug: 332996726 Cc: stable@vger.kernel.org Fixes: `c0391b6ab8` ("netfilter: nf_tables: missing validation from the abort path") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit b0b36dcbe0f24383612e5e62bd48df5a8107f7fc) Signed-off-by: Lee Jones <joneslee@google.com> Change-Id: I38f9b05ac4eadd1d2b7b306cccaf0aeacb61b57a	2024-04-15 11:20:28 +00:00
Pablo Neira Ayuso	26f2c9be9e	UPSTREAM: netfilter: nf_tables: mark set as dead when unbinding anonymous set with timeout commit 552705a3650bbf46a22b1adedc1b04181490fc36 upstream. While the rhashtable set gc runs asynchronously, a race allows it to collect elements from anonymous sets with timeouts while it is being released from the commit path. Mingi Cho originally reported this issue in a different path in 6.1.x with a pipapo set with low timeouts which is not possible upstream since 7395dfacfff6 ("netfilter: nf_tables: use timestamp to check for set element timeout"). Fix this by setting on the dead flag for anonymous sets to skip async gc in this case. According to 08e4c8c5919f ("netfilter: nf_tables: mark newset as dead on transaction abort"), Florian plans to accelerate abort path by releasing objects via workqueue, therefore, this sets on the dead flag for abort path too. Bug: 329205787 Cc: stable@vger.kernel.org Fixes: 5f68718b34a5 ("netfilter: nf_tables: GC transaction API to avoid race with control plane") Reported-by: Mingi Cho <mgcho.minic@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 406b0241d0eb598a0b330ab20ae325537d8d8163) Signed-off-by: Lee Jones <joneslee@google.com> Change-Id: I6170493c267e020c50a739150f8c421deb635b35	2024-04-15 10:33:08 +00:00

1159019 Commits