We have namespaces, so use them for all vfs-exported namespaces so that
filesystems can use them, but not anything else.
Some in-kernel drivers that do direct filesystem accesses (because they
serve up files) are also allowed access to these symbols to keep 'make
allmodconfig' builds working properly, but it is not needed for Android
kernel images.
Bug: 157965270
Bug: 210074446
Cc: Matthias Maennich <maennich@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Iaf6140baf3a18a516ab2d5c3966235c42f3f70de
[ Upstream commit ec6af094ea28f0f2dda1a6a33b14cd57e36a9755 ]
Packet sockets may switch ring versions. Avoid misinterpreting state
between versions, whose fields share a union. rx_owner_map is only
allocated with a packet ring (pg_vec) and both are swapped together.
If pg_vec is NULL, meaning no packet ring was allocated, then neither
was rx_owner_map. And the field may be old state from a tpacket_v3.
Bug: 213464034
Fixes: 61fad6816f ("net/packet: tpacket_rcv: avoid a producer race condition")
Reported-by: Syzbot <syzbot+1ac0994a0a0c55151121@syzkaller.appspotmail.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20211215143937.106178-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Aaron Ding <aaronding@google.com>
Change-Id: Ifd09717336bafe2a3e20389f7f7eb7b95d19e8cd
The MODULE_IMPORT_NS() macro does not allow defined strings to work
properly with it, so add a layer of indirection to allow this to happen.
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Matthias Maennich <maennich@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20220108140657.3361237-1-gregkh@linuxfoundation.org
Bug: 210074446
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ibd64ba139912ea10e81ac22490831129b23a31e1
Commit c3a6cf19e6 ("export: avoid code duplication in
include/linux/export.h") broke the ability for a defined string to be
used as a namespace value. Fix this up by adding another layer of
indirection to preserve the previous functionality.
Fixes: c3a6cf19e6 ("export: avoid code duplication in include/linux/export.h")
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Quentin Perret <qperret@google.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Matthias Maennich <maennich@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20220108140415.3360088-1-gregkh@linuxfoundation.org
Bug: 210074446
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ie43aa24f64b55cd1d70161c906b0ef32610430aa
We want to add some hooks in the binder module so that we can reduce
block time until binder thread is available
Here are what new hooks do for:
1、android_vh_binder_looper_state_registered: choose a binder thread(do proc work) as a low-level thread.Only this thread has power to excute background binder transaction.
2、android_vh_binder_thread_read: let binder thread do works which come from
our list.
3、android_vh_binder_free_proc: free some pointers and variable.
4、android_vh_binder_thread_release: free the list that we create before.
5、android_vh_binder_has_work_ilocked: to check if our list has work.
6、android_vh_binder_read_done: because of we add hook in binder_has_work_ilocked,
binder_has_work_ilocked may return true, so we try to wake up low-level thread immediately.
Bug: 212483521
Change-Id: Ic40f452cc4dcf8fc85422e23e6f1a7ad77547309
Signed-off-by: Liujie Xie <xieliujie@oppo.com>
To allow process_mrelease to reap targeted mm in parallel with exit_mmap
mark the victim with MMF_OOM_VICTIM flag.
Bug: 189803002
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I89cf5f8fbeeb18b93a340b9ebe7f200837ebe846
With exit_mmap holding mmap_write_lock during free_pgtables call,
process_mrelease does not need to elevate mm->mm_users in order to
prevent exit_mmap from destrying pagetables while __oom_reap_task_mm
is walking the VMA tree. The change prevents process_mrelease from
calling the last mmput, which can lead to waiting for IO completion
in exit_aio.
Fixes: 337546e83fc7 ("mm/oom_kill.c: prevent a race between process_mrelease and exit_mmap")
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Link: https://lore.kernel.org/all/20211124235906.14437-2-surenb@google.com/
Bug: 130172058
Bug: 189803002
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I1e2728e0c477af9cc20e9e0b715ee67dee760618
oom-reaper and process_mrelease system call should protect against
races with exit_mmap which can destroy page tables while they
walk the VMA tree. oom-reaper protects from that race by setting
MMF_OOM_VICTIM and by relying on exit_mmap to set MMF_OOM_SKIP
before taking and releasing mmap_write_lock. process_mrelease has
to elevate mm->mm_users to prevent such race. Both oom-reaper and
process_mrelease hold mmap_read_lock when walking the VMA tree.
The locking rules and mechanisms could be simpler if exit_mmap takes
mmap_write_lock while executing destructive operations such as
free_pgtables.
Change exit_mmap to hold the mmap_write_lock when calling
free_pgtables. Operations like unmap_vmas() and unlock_range() are not
destructive and could run under mmap_read_lock but for simplicity we
take one mmap_write_lock during almost the entire operation. Note
also that because oom-reaper checks VM_LOCKED flag, unlock_range()
should not be allowed to race with it.
In most cases this lock should be uncontended. Previously, Kirill
reported ~4% regression caused by a similar change [1]. We reran the
same test and although the individual results are quite noisy, the
percentiles show lower regression with 1.6% being the worst case [2].
The change allows oom-reaper and process_mrelease to execute safely
under mmap_read_lock without worries that exit_mmap might destroy page
tables from under them.
[1] https://lore.kernel.org/all/20170725141723.ivukwhddk2voyhuc@node.shutemov.name/
[2] https://lore.kernel.org/all/CAJuCfpGC9-c9P40x7oy=jy5SphMcd0o0G_6U1-+JAziGKG6dGA@mail.gmail.com/
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Link: https://lore.kernel.org/all/20211124235906.14437-1-surenb@google.com/
Bug: 130172058
Bug: 189803002
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: Ic87272d09a0b68a1b0e968e8f1a1510fd6fc776a
Race between process_mrelease and exit_mmap, where free_pgtables is
called while __oom_reap_task_mm is in progress, leads to kernel crash
during pte_offset_map_lock call. oom-reaper avoids this race by setting
MMF_OOM_VICTIM flag and causing exit_mmap to take and release
mmap_write_lock, blocking it until oom-reaper releases mmap_read_lock.
Reusing MMF_OOM_VICTIM for process_mrelease would be the simplest way to
fix this race, however that would be considered a hack. Fix this race
by elevating mm->mm_users and preventing exit_mmap from executing until
process_mrelease is finished. Patch slightly refactors the code to
adapt for a possible mmget_not_zero failure.
This fix has considerable negative impact on process_mrelease
performance and will likely need later optimization.
Link: https://lkml.kernel.org/r/20211022014658.263508-1-surenb@google.com
Fixes: 884a7e5964e0 ("mm: introduce process_mrelease system call")
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Christian Brauner <christian@brauner.io>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 337546e83fc7e50917f44846beee936abb9c9f1f)
Bug: 189803002
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I7cf9c869faa7b746995a94ea93f6a617104385aa
In modern systems it's not unusual to have a system component monitoring
memory conditions of the system and tasked with keeping system memory
pressure under control. One way to accomplish that is to kill
non-essential processes to free up memory for more important ones.
Examples of this are Facebook's OOM killer daemon called oomd and
Android's low memory killer daemon called lmkd.
For such system component it's important to be able to free memory quickly
and efficiently. Unfortunately the time process takes to free up its
memory after receiving a SIGKILL might vary based on the state of the
process (uninterruptible sleep), size and OPP level of the core the
process is running. A mechanism to free resources of the target process
in a more predictable way would improve system's ability to control its
memory pressure.
Introduce process_mrelease system call that releases memory of a dying
process from the context of the caller. This way the memory is freed in a
more controllable way with CPU affinity and priority of the caller. The
workload of freeing the memory will also be charged to the caller. The
operation is allowed only on a dying process.
After previous discussions [1, 2, 3] the decision was made [4] to
introduce a dedicated system call to cover this use case.
The API is as follows,
int process_mrelease(int pidfd, unsigned int flags);
DESCRIPTION
The process_mrelease() system call is used to free the memory of
an exiting process.
The pidfd selects the process referred to by the PID file
descriptor.
(See pidfd_open(2) for further information)
The flags argument is reserved for future use; currently, this
argument must be specified as 0.
RETURN VALUE
On success, process_mrelease() returns 0. On error, -1 is
returned and errno is set to indicate the error.
ERRORS
EBADF pidfd is not a valid PID file descriptor.
EAGAIN Failed to release part of the address space.
EINTR The call was interrupted by a signal; see signal(7).
EINVAL flags is not 0.
EINVAL The memory of the task cannot be released because the
process is not exiting, the address space is shared
with another live process or there is a core dump in
progress.
ENOSYS This system call is not supported, for example, without
MMU support built into Linux.
ESRCH The target process does not exist (i.e., it has terminated
and been waited on).
[1] https://lore.kernel.org/lkml/20190411014353.113252-3-surenb@google.com/
[2] https://lore.kernel.org/linux-api/20201113173448.1863419-1-surenb@google.com/
[3] https://lore.kernel.org/linux-api/20201124053943.1684874-3-surenb@google.com/
[4] https://lore.kernel.org/linux-api/20201223075712.GA4719@lst.de/
Link: https://lkml.kernel.org/r/20210809185259.405936-1-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jan Engelhardt <jengelh@inai.de>
Cc: Tim Murray <timmurray@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 884a7e5964e06ed93c7771c0d7cf19c09a8946f1)
Bug: 189803002
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I60d37051acaeff1b7eb7d10aeca23dfa1f2469a3
Add a new fips140_lab_util command 'show_invalid_inputs' which uses
AF_ALG to call some crypto algorithms with invalid parameters to show
that they fail. This is needed to meet a new requirement we've received
from the lab. This requirement is vague, but a representative sample of
algorithms and inputs appears to be acceptable.
For this to work, AF_ALG needs to be enabled in the kernel. This makes
fips140_lab_util start depending on a custom kernel build, not just on a
custom fips140 module build as was the case before. However, the lab
testing was going to need custom boot images anyway once fips140.ko is
included in the normal builds, since the production build of fips140.ko
won't have CONFIG_CRYPTO_FIPS140_MOD_EVAL_TESTING=y. AF_ALG is also
needed to do the Jitter RNG entropy analysis properly, and the
AF_ALG-enabled kernel can also be reused for ACVP testing.
Bug: 188620248
Change-Id: I69054eab5005fc3ca0ea081760877f73ea229f5b
Signed-off-by: Eric Biggers <ebiggers@google.com>
(cherry picked from commit 04e49b41be57bbc668e39a2bb65fa6022a22deba)
fips140_lab_test doesn't really do any tests per se, but rather is a
utility program that dumps some output. The actual "test" is when the
lab checks the output; we aren't allowed to check it ourselves.
We also need to add some new functionality, which would work well as
sub-commands. Also, the original idea was that this was just sample
code which the lab would modify, but that's not actually happening.
Therefore, rename fips140_lab_test to fips140_lab_util, and refactor its
functionality into sub-commands 'show_module_version' and
'show_service_indicators'. This fits better with what is needed.
Bug: 188620248
Change-Id: I7da84a139283f185f79b8d866547151169f26415
Signed-off-by: Eric Biggers <ebiggers@google.com>
(cherry picked from commit 6ed33b82eaf8352574ba9ac7cff351a678fbe8e6)
A new API, rproc_set_firmware() is added to allow the remoteproc platform
drivers and remoteproc client drivers to be able to configure a custom
firmware name that is different from the default name used during
remoteproc registration. This function is being introduced to provide
a kernel-level equivalent of the current sysfs interface to remoteproc
client drivers, and can only change firmwares when the remoteproc is
offline. This allows some remoteproc drivers to choose different firmwares
at runtime based on the functionality the remote processor is providing.
The TI PRU Ethernet driver will be an example of such usage as it
requires to use different firmwares for different supported protocols.
Also, update the firmware_store() function used by the sysfs interface
to reuse this function to avoid code duplication.
Bug: 213024513
Change-Id: Ie365179ac296c43c7c5c54b46f9f9f7587d5d263
Reviewed-by: Rishabh Bhatnagar <rishabhb@codeaurora.org>
Signed-off-by: Suman Anna <s-anna@ti.com>
Link: https://lore.kernel.org/r/20201121032042.6195-1-s-anna@ti.com
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
(cherry picked from commit 4c1ad562d303526b5d9b49f5e0d72da13ef78dec
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master)
Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com>
In __arm_v7s_alloc_table function:
iommu call kmem_cache_alloc to allocate page table, this function
allocate memory may fail, when kmem_cache_alloc fails to allocate
table, call virt_to_phys will be abnomal and return unexpected phys
and goto out_free, then call kmem_cache_free to release table will
trigger KE, __get_free_pages and free_pages have similar problem,
so add error handle for page table allocation failure.
Fixes: 29859aeb8a ("iommu/io-pgtable-arm-v7s: Abort allocation when table address overflows the PTE")
Signed-off-by: Yunfei Wang <yf.wang@mediatek.com>
Cc: <stable@vger.kernel.org> # 5.10.*
Acked-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20211207113315.29109-1-yf.wang@mediatek.com
Signed-off-by: Will Deacon <will@kernel.org>
Bug: 210958369
(cherry picked from commit a556cfe4cabc6d79cbb7733f118bbb420b376fe6
https://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git for-joerg/arm-smmu/updates)
Signed-off-by: Yunfei Wang <yf.wang@mediatek.com>
Change-Id: I6435903336d1e15b5a57d08c284b3d3d66ea985d
commit ef6c8d6ccf0c1dccdda092ebe8782777cd7803c9 upstream.
When SCTP handles an INIT chunk, it calls for example:
sctp_sf_do_5_1B_init
sctp_verify_init
sctp_verify_param
sctp_process_init
sctp_process_param
handling of SCTP_PARAM_SET_PRIMARY
sctp_verify_init() wasn't doing proper size validation and neither the
later handling, allowing it to work over the chunk itself, possibly being
uninitialized memory.
Bug: 197154735
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Aaron Ding <aaronding@google.com>
Change-Id: I032230924ead7a03dfb3101e9cd4d48e36bfc616
commit b6ffe7671b24689c09faa5675dd58f93758a97ae upstream.
In one of the fallbacks that SCTP has for identifying an association for an
incoming packet, it looks for AddIp chunk (from ASCONF) and take a peek.
Thing is, at this stage nothing was validating that the chunk actually had
enough content for that, allowing the peek to happen over uninitialized
memory.
Similar check already exists in actual asconf handling in
sctp_verify_asconf().
Bug: 197154735
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Aaron Ding <aaronding@google.com>
Change-Id: Ibfe53fc724143423353ed6b2984d2508ee4fc457
[ Upstream commit 30e29a9a2bc6a4888335a6ede968b75cd329657a ]
In prealloc_elems_and_freelist(), the multiplication to calculate the
size passed to bpf_map_area_alloc() could lead to an integer overflow.
As a result, out-of-bounds write could occur in pcpu_freelist_populate()
as reported by KASAN:
[...]
[ 16.968613] BUG: KASAN: slab-out-of-bounds in pcpu_freelist_populate+0xd9/0x100
[ 16.969408] Write of size 8 at addr ffff888104fc6ea0 by task crash/78
[ 16.970038]
[ 16.970195] CPU: 0 PID: 78 Comm: crash Not tainted 5.15.0-rc2+ #1
[ 16.970878] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
[ 16.972026] Call Trace:
[ 16.972306] dump_stack_lvl+0x34/0x44
[ 16.972687] print_address_description.constprop.0+0x21/0x140
[ 16.973297] ? pcpu_freelist_populate+0xd9/0x100
[ 16.973777] ? pcpu_freelist_populate+0xd9/0x100
[ 16.974257] kasan_report.cold+0x7f/0x11b
[ 16.974681] ? pcpu_freelist_populate+0xd9/0x100
[ 16.975190] pcpu_freelist_populate+0xd9/0x100
[ 16.975669] stack_map_alloc+0x209/0x2a0
[ 16.976106] __sys_bpf+0xd83/0x2ce0
[...]
The possibility of this overflow was originally discussed in [0], but
was overlooked.
Fix the integer overflow by changing elem_size to u64 from u32.
[0] https://lore.kernel.org/bpf/728b238e-a481-eb50-98e9-b0f430ab01e7@gmail.com/
Bug: 202511260
Fixes: 557c0c6e7d ("bpf: convert stackmap to pre-allocation")
Signed-off-by: Tatsuhiko Yasumatsu <th.yasumatsu@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210930135545.173698-1-th.yasumatsu@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Aaron Ding <aaronding@google.com>
Change-Id: I45de17135336ce329b539d3e9e95fdcddafb2b00
We want to use this hook to record the sleeping time due to Futex
Bug: 210947226
Signed-off-by: Liujie Xie <xieliujie@oppo.com>
Change-Id: I637f889dce42937116d10979e0c40fddf96cd1a2
Add a tracehook to allow callers into dma_alloc_contiguous() to make
use of the built-in CMA area if the caller has addressing limitations;
this provides a means of allocating from memory whose bounds are
restricted to the lower 4 GB of memory, without having to enable DMA32
(assuming the default CMA area has been restricted to the appropriate
address ranges).
Leaf changes summary: 1 artifact changed
Changed leaf types summary: 0 leaf type changed
Removed/Changed/Added functions summary: 0 Removed, 0 Changed, 0 Added function
Removed/Changed/Added variables summary: 0 Removed, 0 Changed, 1 Added variable
1 Added variable:
[A] 'tracepoint __tracepoint_android_vh_subpage_dma_contig_alloc'
Bug: 199917449
Change-Id: Ia86fb416376bca231405b06ab27b0674c8fe3e14
Signed-off-by: Chris Goldsworthy <quic_cgoldswo@quicinc.com>
Currently the standard memory allocator (snd_dma_malloc_pages*())
passes the byte size to allocate as is. Most of the backends
allocates real pages, hence the actual allocations are aligned in page
size. However, the genalloc doesn't seem assuring the size alignment,
hence it may result in the access outside the buffer when the whole
memory pages are exposed via mmap.
For avoiding such inconsistencies, this patch makes the allocation
size always to be aligned in page size.
Note that, after this change, snd_dma_buffer.bytes field contains the
aligned size, not the originally requested size. This value is also
used for releasing the pages in return.
BUG: 209931573
cherry picked from commit 5c1733e33c888a3cb7f576564d8ad543d5ad4a9e
Change-Id: Ib65f0e29b87d55e13006c7416793a4539d376cc8
Reviewed-by: Lars-Peter Clausen <lars@metafoo.de>
Link: https://lore.kernel.org/r/20201218145625.2045-2-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Denis Hsu <denis.hsu@mediatek.com>
mmu_notifier_trylock definition for CONFIG_MMU_NOTIFIER=n configuration
has not been modified from the older version. Correct that mistake.
Fixes: 6971350406 ("ANDROID: fix mmu_notifier race caused by not taking mmap_lock during SPF")
Bug: 161210518
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I71b8644bd2864b6ed98a7ff9c15a99fbd4c5a6c5
Szymon rightly pointed out that the previous check for the endpoint
direction in bRequestType was not looking at only the bit involved, but
rather the whole value. Normally this is ok, but for some request
types, bits other than bit 8 could be set and the check for the endpoint
length could not stall correctly.
Fix that up by only checking the single bit.
Fixes: 153a2d7e3350 ("USB: gadget: detect too-big endpoint 0 requests")
Cc: Felipe Balbi <balbi@kernel.org>
Reported-by: Szymon Heidrich <szymon.heidrich@gmail.com>
Link: https://lore.kernel.org/r/20211214184621.385828-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit f08adf5add9a071160c68bb2a61d697f39ab0758
https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git usb-linus)
Bug: 210292376
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I7e708b2b94433009c87f697346e0515d93454f48
When a kernel thread calls dma_buf_put() to release the last reference
to a dma-buf, fput_many() defers calling the release callback to a
workqueue. This means that if the same kernel thread later calls
dma_heap_buffer_alloc(), it has no guarantee that the memory from the
prior free is available, leading to random failures. As a short-term
workaround, call flush_delayed_fput() to ensure the free completes
synchronously.
Leaf changes summary: 1 artifact changed
Changed leaf types summary: 0 leaf type changed
Removed/Changed/Added functions summary: 0 Removed, 0 Changed, 1 Added function
Removed/Changed/Added variables summary: 0 Removed, 0 Changed, 0 Added variable
1 Added function:
[A] 'function void flush_delayed_fput()'
Bug: 210598057
Change-Id: Id936aa0bcd410b23b12f4b922b676aa61a358b4c
Signed-off-by: Patrick Daly <quic_pdaly@quicinc.com>
To prevent ABI breakage, move mm->mmu_notifier_lock into
mm->notifier_subscriptions and allocate mm->notifier_subscriptions
during mm creation in mmu_notifier_subscriptions_init. This results
in additional 176 bytes allocated for each mm, but prevents ABI breakage.
mmu_notifier_subscriptions_hdr structure is introduced at the beginning
of mmu_notifier_subscriptions to keep mmu_notifier_subscriptions hidden
and prevent its type CRC from changing when used in other structures.
Bug: 161210518
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I6f435708d642b70b22e0243c8b33108c208ce5bb
percpu_rw_semaphore changes to allow calling percpu_free_rwsem in atomic
context cause ABI breakage. Introduce percpu_free_rwsem_atomic wrapper
and change percpu_rwsem_destroy to use it in order to keep
percpu_rw_semaphore struct intact and fix ABI breakage.
Bug: 161210518
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I198a6381fb48059f2aaa2ec38b8c1e5e5e936bb0
When pagefaults are handled speculatively,the pair of
mmu_notifier_invalidate_range_start/mmu_notifier_invalidate_range_end
calls happen without mmap_lock being taken. This enables the following
race:
mmu_notifier_invalidate_range_start
mmap_write_lock
mmu_notifier_register
mmap_write_unlock
mmu_notifier_invalidate_range_end
In this case mmu_notifier_invalidate_range_end will see a new
subscriber not seen at the time of mmu_notifier_invalidate_range_start
and will call ops->invalidate_range_end for that subscriber without
the matching ops->invalidate_range_start, creating imbalance.
Fix this by introducing a new mm->mmu_notifier_lock percpu_rw_semaphore
to synchronize mmu_notifier_invalidate_range_start/
mmu_notifier_invalidate_range_end with mmu_notifier_register when
handling pagefaults speculatively without holding mmap_lock.
percpu_rw_semaphore is used instead of rw_semaphore to prevent cache
line bouncing in the pagefault path.
Fixes: 86ee4a531e ("FROMLIST: x86/mm: add speculative pagefault handling")
Bug: 161210518
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I9c363b2348efcad19818f93b010abf956870ab55
Handle restore and freeze notifications from the PM core. Expose
these to individual virtio drivers that can quiesce and resume vq
operations.
Signed-off-by: Yurii Danilovskyi <glyd@opensynergy.com>
Signed-off-by: Mikhail Golubev <Mikhail.Golubev@opensynergy.com>
Bug: 141626390
Link: https://lore.kernel.org/all/20211214161124.GA202691@opensynergy.com/
Change-Id: Ie53a16991b10c02ac125a55c4bbf04d89f0a365e
Signed-off-by: Mikhail Golubev <Mikhail.Golubev@opensynergy.com>
We assume that stateful devices can maintain their state while
suspended. And for this reason they don't have a freeze callback. If
such a device is reset during resume, the device state/context will be
lost on the device side. And the virtual device will stop working.
Signed-off-by: Anton Yakovlev <Anton.Yakovlev@opensynergy.com>
Signed-off-by: Mikhail Golubev <mikhail.golubev@opensynergy.com>
Bug: 180046477
Link: https://lore.kernel.org/all/20211214163249.GA253555@opensynergy.com/
Change-Id: I20410a5af8f73eebba1986965c347288ee07c0ab
Signed-off-by: Mikhail Golubev <Mikhail.Golubev@opensynergy.com>
Android OTA failed due to SBI_NEED_FSCK flag when pinning the file. Let's avoid
it since we can do in-place-updates.
Bug: 210593661
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
(cherry picked from commit 70da2736a4138b86a12873d33fefbb495e22e6f8
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs.git dev)
Signed-off-by: Huang Jianan <huangjianan@oppo.com>
Change-Id: I3fd33c984417c10b38e23de6cec017b03d588945
When sysfs_slab_add failed, we shouldn't call debugfs_slab_add() for s
because s will be freed soon. And slab_debugfs_fops will use s later
leading to a use-after-free.
Link: https://lkml.kernel.org/r/20210916123920.48704-5-linmiaohe@huawei.com
Fixes: 64dd68497be7 ("mm: slub: move sysfs slab alloc/free interfaces to debugfs")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Bharata B Rao <bharata@linux.ibm.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Faiyaz Mohammed <faiyazm@codeaurora.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 209932470
(cherry picked from commit 67823a544414def2a36c212abadb55b23bcda00c)
Signed-off-by: Yee Lee <yee.lee@mediatek.com>
Change-Id: I0287b3c9d9ee919f9404143f9b7d8b9c27bafe87
If object's reuse is delayed, it will be excluded from the reconstructed
freelist. But we forgot to adjust the cnt accordingly. So there will
be a mismatch between reconstructed freelist depth and cnt. This will
lead to free_debug_processing() complaining about freelist count or a
incorrect slub inuse count.
Link: https://lkml.kernel.org/r/20210916123920.48704-3-linmiaohe@huawei.com
Fixes: c3895391df ("kasan, slub: fix handling of kasan_slab_free hook")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Bharata B Rao <bharata@linux.ibm.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Faiyaz Mohammed <faiyazm@codeaurora.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 209932470
(cherry picked from commit 899447f669da76cc3605665e1a95ee877bc464cc)
Signed-off-by: Yee Lee <yee.lee@mediatek.com>
Change-Id: I6811ce42332472baca8d3fddb3662609125fb1e2
Patch series "Fixups for slub".
This series contains various bug fixes for slub. We fix memoryleak,
use-afer-free, NULL pointer dereferencing and so on in slub. More
details can be found in the respective changelogs.
This patch (of 5):
It's possible that __seq_open_private() will return NULL. So we should
check it before using lest dereferencing NULL pointer. And in error
paths, we forgot to release private buffer via seq_release_private().
Memory will leak in these paths.
Link: https://lkml.kernel.org/r/20210916123920.48704-1-linmiaohe@huawei.com
Link: https://lkml.kernel.org/r/20210916123920.48704-2-linmiaohe@huawei.com
Fixes: 64dd68497be7 ("mm: slub: move sysfs slab alloc/free interfaces to debugfs")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Faiyaz Mohammed <faiyazm@codeaurora.org>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Bharata B Rao <bharata@linux.ibm.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 209932470
(cherry picked from commit 2127d22509aec3a83dffb2a3c736df7ba747a7ce)
Signed-off-by: Yee Lee <yee.lee@mediatek.com>
Change-Id: Id9cec52a846a7e05e1495033ff4b9a1a6bc615b0
Slub has a static spinlock protected bitmap for marking which objects are on
freelist when it wants to list them, for situations where dynamically
allocating such map can lead to recursion or locking issues, and on-stack
bitmap would be too large.
The handlers of debugfs files alloc_traces and free_traces also currently use this
shared bitmap, but their syscall context makes it straightforward to allocate a
private map before entering locked sections, so switch these processing paths
to use a private bitmap.
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Bug: 209932470
(cherry picked from commit b3fd64e1451b5efd94aa0ebc755e02558e6f3ca1)
Change-Id: I5fbf34e0d828d1c8b5e81e3679f81b70ce1fc8bc
Signed-off-by: Yee Lee <yee.lee@mediatek.com>
In this patch(https://patchwork.freedesktop.org/patch/310349),
it add a new IOCTL to support dma-buf user to set debug name.
But it also added a limitation of this IOCTL, it needs the
attachments of dmabuf should be empty, otherwise it will fail.
For the original series, the idea was that allowing name change
mid-use could confuse the users about the dma-buf.
However, the rest of the series also makes sure each dma-buf have a unique
inode(https://patchwork.freedesktop.org/patch/310387/), and any accounting
should probably use that, without relying on the name as much.
So, removing this restriction will let dma-buf userspace users to use it
more comfortably and without any side effect.
Signed-off-by: Guangming Cao <Guangming.Cao@mediatek.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/lkml/20211009024733.65676-1-guangming.cao@mediatek.com/T/
Bug: 209090315
(cherry picked from commit e73c317efbf9a6ab2d1c18eff8343958ab6df73a
https://anongit.freedesktop.org/git/drm/drm-misc.git drm-misc)
Change-Id: Ic163a92d002608c72a0c96854922ad16e0c14b06
Signed-off-by: Yunfei Wang <yf.wang@mediatek.com>
After we start to do core soft reset while usb role switch,
the phy init is invoked at every switch to device mode, but
its counter part de-init is missing, this causes the actual
phy init can not be done when we really want to re-init phy
like system resume, because the counter maintained by phy
core is not 0. considering phy init is actually redundant for
role switch, so move out the phy init from core soft reset to
dwc3 core init where is the only place required.
Fixes: f88359e1588b ("usb: dwc3: core: Do core softreset when switch mode")
Cc: <stable@vger.kernel.org>
Tested-by: faqiang.zhu <faqiang.zhu@nxp.com>
Tested-by: John Stultz <john.stultz@linaro.org> #HiKey960
Acked-by: Felipe Balbi <balbi@kernel.org>
Signed-off-by: Li Jun <jun.li@nxp.com>
Link: https://lore.kernel.org/r/1631068099-13559-1-git-send-email-jun.li@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bug: 194108974
(cherry picked from commit 8cfac9a6744fcb143cb3e94ce002f09fd17fadbb)
Change-Id: I47b3de1b3d56aecc235b89b1d8b9f34961068636
Signed-off-by: Jindong Yue <jindong.yue@nxp.com>
Only TDs with status TD_CLEARING_CACHE will be given back after
cache is cleared with a set TR deq command.
xhci_invalidate_cached_td() failed to set the TD_CLEARING_CACHE status
for some cancelled TDs as it assumed an endpoint only needs to clear the
TD it stopped on.
This isn't always true. For example with streams enabled an endpoint may
have several stream rings, each stopping on a different TDs.
Note that if an endpoint has several stream rings, the current code
will still only clear the cache of the stream pointed to by the last
cancelled TD in the cancel list.
This patch only focus on making sure all canceled TDs are given back,
avoiding hung task after device removal.
Another fix to solve clearing the caches of all stream rings with
cancelled TDs is needed, but not as urgent.
This issue was simultanously discovered and debugged by
by Tao Wang, with a slightly different fix proposal.
Fixes: 674f8438c121 ("xhci: split handling halted endpoints into two steps")
Cc: <stable@vger.kernel.org> #5.12
Reported-by: Tao Wang <wat@codeaurora.org>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20210820123503.2605901-4-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bug: 209501020
(cherry picked from commit 94f339147fc3eb9edef7ee4ef6e39c569c073753)
Change-Id: Ie7d39365e00b54154be2fd9ca05b5600bd18850d
Signed-off-by: Wesley Cheng <quic_wcheng@quicinc.com>
If add_memory_subsection() is called with a size of
memory_block_size_bytes, it calls into add_memory(), which declares
the region as system ram, and adds it to the buddy allocator. This
is inconsistent with the behavior of add_memory_subsection() for
other sizes, for which it does not add the memory to buddy and
instead reserves it for the caller's private use.
Bug: 210008865
Fixes: 417ac617ea ("ANDROID: mm/memory_hotplug: implement {add/remove}_memory_subsection")
Change-Id: Iefb69b0b4e96af670d0e65c325a9538d14b460e3
Signed-off-by: Patrick Daly <quic_pdaly@quicinc.com>
Currently, the UVC function is activated when open on the corresponding
v4l2 device is called. On another open the activation of the function
fails since the deactivation counter in `usb_function_activate` equals
0. However the error is not returned to userspace since the open of the
v4l2 device is successful.
On a close the function is deactivated (since deactivation counter still
equals 0) and the video is disabled in `uvc_v4l2_release`, although the
UVC application potentially is streaming.
Move activation of UVC function to subscription on UVC_EVENT_SETUP
because there we can guarantee for a userspace application utilizing
UVC. Block subscription on UVC_EVENT_SETUP while another application
already is subscribed to it, indicated by `bool func_connected` in
`struct uvc_device`. Extend the `struct uvc_file_handle` with member
`bool is_uvc_app_handle` to tag it as the handle used by the userspace
UVC application.
With this a process is able to check capabilities of the v4l2 device
without deactivating the function for the actual UVC application.
Reviewed-By: Michael Tretter <m.tretter@pengutronix.de>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Thomas Haemmerle <thomas.haemmerle@wolfvision.net>
Signed-off-by: Michael Tretter <m.tretter@pengutronix.de>
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Acked-by: Felipe Balbi <balbi@kernel.org>
Link: https://lore.kernel.org/r/20211003201355.24081-1-m.grzeschik@pengutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bug: 209496225
Change-Id: I17944b520d6cc29f86dd6b64b257c0d3185cb69a
(cherry picked from commit 72ee48ee8925446eaeda8e4ef3f2eb16b4a93d2a)
Signed-off-by: Dan Vacura <w36195@motorola.com>
commit 50252e4b5e989ce64555c7aef7516bdefc2fea72 upstream.
signalfd_poll() and binder_poll() are special in that they use a
waitqueue whose lifetime is the current task, rather than the struct
file as is normally the case. This is okay for blocking polls, since a
blocking poll occurs within one task; however, non-blocking polls
require another solution. This solution is for the queue to be cleared
before it is freed, by sending a POLLFREE notification to all waiters.
Unfortunately, only eventpoll handles POLLFREE. A second type of
non-blocking poll, aio poll, was added in kernel v4.18, and it doesn't
handle POLLFREE. This allows a use-after-free to occur if a signalfd or
binder fd is polled with aio poll, and the waitqueue gets freed.
Fix this by making aio poll handle POLLFREE.
A patch by Ramji Jiyani <ramjiyani@google.com>
(https://lore.kernel.org/r/20211027011834.2497484-1-ramjiyani@google.com)
tried to do this by making aio_poll_wake() always complete the request
inline if POLLFREE is seen. However, that solution had two bugs.
First, it introduced a deadlock, as it unconditionally locked the aio
context while holding the waitqueue lock, which inverts the normal
locking order. Second, it didn't consider that POLLFREE notifications
are missed while the request has been temporarily de-queued.
The second problem was solved by my previous patch. This patch then
properly fixes the use-after-free by handling POLLFREE in a
deadlock-free way. It does this by taking advantage of the fact that
freeing of the waitqueue is RCU-delayed, similar to what eventpoll does.
Fixes: 2c14fa838c ("aio: implement IOCB_CMD_POLL")
Cc: <stable@vger.kernel.org> # v4.18+
Link: https://lore.kernel.org/r/20211209010455.42744-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bug: 185125206
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I748544276cf2fe214097751507d9c0ee4e3d3475