[ Upstream commit 2507003a1d10917c9158077bf6030719d02c941e ]
lock_is_held_type(, 1) detects acquired read locks. It only recognized
locks acquired with lock_acquire_shared(). Read locks acquired with
lock_acquire_shared_recursive() are not recognized because a `2' is
stored as the read value.
Rework the check to additionally recognise lock's read value one and two
as a read held lock.
Fixes: e918188611 ("locking: More accurate annotations for read_lock()")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Boqun Feng <boqun.feng@gmail.com>
Acked-by: Waiman Long <longman@redhat.com>
Link: https://lkml.kernel.org/r/20210903084001.lblecrvz4esl4mrr@linutronix.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5bd4f20de8acad37dbb3154feb34dbc36d506c02 ]
When kmem_cache_zalloc in virtio_gpu_get_vbuf fails, it will return
an error code. But none of its callers checks this error code, and
a core dump will take place.
Considering many of its callers can't handle such error, I add
a __GFP_NOFAIL flag when calling kmem_cache_zalloc to make sure
it won't fail, and delete those unused error handlings.
Fixes: dc5698e80c ("Add virtio gpu driver.")
Signed-off-by: Yuntao Liu <liuyuntao10@huawei.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20210828104321.3410312-1-liuyuntao10@huawei.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e4f868191138975f2fdf2f37c11318b47db4acc9 ]
The hardware sets the TMUWCF bit back to 0 when the TMU write
combiner flush completes so we should be checking for that instead
of the L2TFLS bit.
v2 (Melissa Wen):
- Add Signed-off-by and Fixes tags.
- Change the error message for the timeout to be more clear.
Fixes spurious Vulkan CTS failures in:
dEQP-VK.binding_model.descriptorset_random.*
Fixes: d223f98f02 ("drm/v3d: Add support for compute shader dispatch.")
Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210915100507.3945-1-itoral@igalia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9af9dcf11bda3e2c0e24c1acaacb8685ad974e93 ]
The asm_cpu_bringup_and_idle() function is required to push the return
value on the stack in order to make ORC happy, but the only reason
objtool doesn't complain is because of a happy accident.
The thing is that asm_cpu_bringup_and_idle() doesn't return, so
validate_branch() never terminates and falls through to the next
function, which in the normal case is the hypercall_page. And that, as
it happens, is 4095 NOPs and a RET.
Make asm_cpu_bringup_and_idle() terminate on it's own, by making the
function it calls as a dead-end. This way we no longer rely on what
code happens to come after.
Fixes: c3881eb58d ("x86/xen: Make the secondary CPU idle tasks reliable")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lore.kernel.org/r/20210624095147.693801717@infradead.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5ad74d39c51dd41b3c819f4f5396655f0629b4fd ]
The current definition of 2W burst length is invalid.
This patch fixes it. Current downstream DEU driver doesn't
use DMA. An incorrect burst length value doesn't cause any
errors. This patch also adds other burst length values.
Fixes: dfec1a827d ("MIPS: Lantiq: Add DMA support")
Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f0b2b2df5423fb369ac762c77900bc7765496d58 ]
The sync_sched_exp_online_cleanup() checks to see if RCU needs
an expedited quiescent state from the incoming CPU, sending it
an IPI if so. Before sending IPI, it checks whether expedited
qs need has been already requested for the incoming CPU, by
checking rcu_data.cpu_no_qs.b.exp for the current cpu, on which
sync_sched_exp_online_cleanup() is running. This works for the
case where incoming CPU is same as self. However, for the case
where incoming CPU is different from self, expedited request
won't get marked, which can potentially delay reporting of
expedited quiescent state for the incoming CPU.
Fixes: e015a34112 ("rcu: Avoid self-IPI in sync_sched_exp_online_cleanup()")
Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 49d8a5606428ca0962d09050a5af81461ff90fbb ]
Before freeing struct sco_conn, all delayed timeout work should be
cancelled. Otherwise, sco_sock_timeout could potentially use the
sco_conn after it has been freed.
Additionally, sco_conn.timeout_work should be initialized when the
connection is allocated, not when the channel is added. This is
because an sco_conn can create channels with multiple sockets over its
lifetime, which happens if sockets are released but the connection
isn't deleted.
Fixes: ba316be1b6a0 ("Bluetooth: schedule SCO timeouts with delayed_work")
Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b7b1d02fc43925a4d569ec221715db2dfa1ce4f5 ]
The internal stream state sets the timeout to 120 seconds 2 seconds
after the creation of the flow, attach this internal stream state to the
IPS_ASSURED flag for consistent event reporting.
Before this patch:
[NEW] udp 17 30 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 [UNREPLIED] src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282
[UPDATE] udp 17 30 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282
[UPDATE] udp 17 30 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282 [ASSURED]
[DESTROY] udp 17 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282 [ASSURED]
Note IPS_ASSURED for the flow not yet in the internal stream state.
after this update:
[NEW] udp 17 30 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 [UNREPLIED] src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282
[UPDATE] udp 17 30 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282
[UPDATE] udp 17 120 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282 [ASSURED]
[DESTROY] udp 17 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282 [ASSURED]
Before this patch, short-lived UDP flows never entered IPS_ASSURED, so
they were already candidate flow to be deleted by early_drop under
stress.
Before this patch, IPS_ASSURED is set on regardless the internal stream
state, attach this internal stream state to IPS_ASSURED.
packet #1 (original direction) enters NEW state
packet #2 (reply direction) enters ESTABLISHED state, sets on IPS_SEEN_REPLY
paclet #3 (any direction) sets on IPS_ASSURED (if 2 seconds since the
creation has passed by).
Reported-by: Maciej Żenczykowski <zenczykowski@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 66e29fcda1824f0427966fbee2bd2c85bf362c82 ]
With idle polling, IPIs are not sent when a CPU idle, but queued
and run later from do_idle(). The default kgdb_call_nmi_hook()
implementation gets the pointer to struct pt_regs from get_irq_reqs(),
which doesn't work in that case because it was not called from the
IPI interrupt handler. Fix it by defining our own kgdb_roundup()
function which sents an IPI_ENTER_KGDB. When that IPI is received
on the target CPU kgdb_nmicallback() is called.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8e0ba125c2bf1030af3267058019ba86da96863f ]
With 64 bit kernels unwind_special() is not working because
it compares the pc to the address of the function descriptor.
Add a helper function that compares pc with the dereferenced
address. This fixes all of the backtraces on my c8000. Without
this changes, a lot of backtraces are missing in kdb or the
show-all-tasks command from /proc/sysrq-trigger.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9cc2fa4f4a92ccc6760d764e7341be46ee8aaaa1 ]
The function end_of_stack() returns a pointer to the last entry of a
stack. For architectures like parisc where the stack grows upwards
return the pointer to the highest address in the stack.
Without this change I faced a crash on parisc, because the stackleak
functionality wrote STACKLEAK_POISON to the lowest address and thus
overwrote the first 4 bytes of the task_struct which included the
TIF_FLAGS.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f35dcaa0a8a29188ed61083d153df1454cf89d08 ]
close_range() test type conflicts with close_range() library call in
x86_64-linux-gnu/bits/unistd_ext.h. Fix it by changing the name to
core_close_range().
gcc -g -I../../../../usr/include/ close_range_test.c -o ../tools/testing/selftests/core/close_range_test
In file included from close_range_test.c:16:
close_range_test.c:57:6: error: conflicting types for ‘close_range’; have ‘void(struct __test_metadata *)’
57 | TEST(close_range)
| ^~~~~~~~~~~
../kselftest_harness.h:181:21: note: in definition of macro ‘__TEST_IMPL’
181 | static void test_name(struct __test_metadata *_metadata); \
| ^~~~~~~~~
close_range_test.c:57:1: note: in expansion of macro ‘TEST’
57 | TEST(close_range)
| ^~~~
In file included from /usr/include/unistd.h:1204,
from close_range_test.c:13:
/usr/include/x86_64-linux-gnu/bits/unistd_ext.h:56:12: note: previous declaration of ‘close_range’ with type ‘int(unsigned int, unsigned int, int)’
56 | extern int close_range (unsigned int __fd, unsigned int __max_fd,
| ^~~~~~~~~~~
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 285f68afa8b20f752b0b7194d54980b5e0e27b75 ]
The following issue is observed with CONFIG_DEBUG_PREEMPT when KVM loads:
KVM: vmx: using Hyper-V Enlightened VMCS
BUG: using smp_processor_id() in preemptible [00000000] code: systemd-udevd/488
caller is set_hv_tscchange_cb+0x16/0x80
CPU: 1 PID: 488 Comm: systemd-udevd Not tainted 5.15.0-rc5+ #396
Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.0 12/17/2019
Call Trace:
dump_stack_lvl+0x6a/0x9a
check_preemption_disabled+0xde/0xe0
? kvm_gen_update_masterclock+0xd0/0xd0 [kvm]
set_hv_tscchange_cb+0x16/0x80
kvm_arch_init+0x23f/0x290 [kvm]
kvm_init+0x30/0x310 [kvm]
vmx_init+0xaf/0x134 [kvm_intel]
...
set_hv_tscchange_cb() can get preempted in between acquiring
smp_processor_id() and writing to HV_X64_MSR_REENLIGHTENMENT_CONTROL. This
is not an issue by itself: HV_X64_MSR_REENLIGHTENMENT_CONTROL is a
partition-wide MSR and it doesn't matter which particular CPU will be
used to receive reenlightenment notifications. The only real problem can
(in theory) be observed if the CPU whose id was acquired with
smp_processor_id() goes offline before we manage to write to the MSR,
the logic in hv_cpu_die() won't be able to reassign it correctly.
Reported-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: https://lore.kernel.org/r/20211012155005.1613352-1-vkuznets@redhat.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8a27ca39478270e07baf9c09aa0c99709769ba03 ]
For packets originating from hardware scan, the channel and band is
included in the buffer descriptor (bd->rf_band & bd->rx_ch).
For 2Ghz band the channel value is directly reported in the 4-bit
rx_ch field. For 5Ghz band, the rx_ch field contains a mapping
index (given the 4-bit limitation).
The reserved0 value field is also used to extend 4-bit mapping to
5-bit mapping to support more than 16 5Ghz channels.
This change adds correct reporting of the frequency/band, that is
used in scan mechanism. And is required for 5Ghz hardware scan
support.
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Tested-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1634554678-7993-1-git-send-email-loic.poulain@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ca9b8f56ec089d3a436050afefd17b7237301f47 ]
Fix the missing clk_disable_unprepare() before return
from bcm_qspi_probe() in the error handling case.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20211018073413.2029081-1-yangyingliang@huawei.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8ef9dc0f14ba6124c62547a4fdc59b163d8b864e ]
We got the following lockdep splat while running fstests (specifically
btrfs/003 and btrfs/020 in a row) with the new rc. This was uncovered
by 87579e9b7d8d ("loop: use worker per cgroup instead of kworker") which
converted loop to using workqueues, which comes with lockdep
annotations that don't exist with kworkers. The lockdep splat is as
follows:
WARNING: possible circular locking dependency detected
5.14.0-rc2-custom+ #34 Not tainted
------------------------------------------------------
losetup/156417 is trying to acquire lock:
ffff9c7645b02d38 ((wq_completion)loop0){+.+.}-{0:0}, at: flush_workqueue+0x84/0x600
but task is already holding lock:
ffff9c7647395468 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x41/0x650 [loop]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #5 (&lo->lo_mutex){+.+.}-{3:3}:
__mutex_lock+0xba/0x7c0
lo_open+0x28/0x60 [loop]
blkdev_get_whole+0x28/0xf0
blkdev_get_by_dev.part.0+0x168/0x3c0
blkdev_open+0xd2/0xe0
do_dentry_open+0x163/0x3a0
path_openat+0x74d/0xa40
do_filp_open+0x9c/0x140
do_sys_openat2+0xb1/0x170
__x64_sys_openat+0x54/0x90
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #4 (&disk->open_mutex){+.+.}-{3:3}:
__mutex_lock+0xba/0x7c0
blkdev_get_by_dev.part.0+0xd1/0x3c0
blkdev_get_by_path+0xc0/0xd0
btrfs_scan_one_device+0x52/0x1f0 [btrfs]
btrfs_control_ioctl+0xac/0x170 [btrfs]
__x64_sys_ioctl+0x83/0xb0
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #3 (uuid_mutex){+.+.}-{3:3}:
__mutex_lock+0xba/0x7c0
btrfs_rm_device+0x48/0x6a0 [btrfs]
btrfs_ioctl+0x2d1c/0x3110 [btrfs]
__x64_sys_ioctl+0x83/0xb0
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #2 (sb_writers#11){.+.+}-{0:0}:
lo_write_bvec+0x112/0x290 [loop]
loop_process_work+0x25f/0xcb0 [loop]
process_one_work+0x28f/0x5d0
worker_thread+0x55/0x3c0
kthread+0x140/0x170
ret_from_fork+0x22/0x30
-> #1 ((work_completion)(&lo->rootcg_work)){+.+.}-{0:0}:
process_one_work+0x266/0x5d0
worker_thread+0x55/0x3c0
kthread+0x140/0x170
ret_from_fork+0x22/0x30
-> #0 ((wq_completion)loop0){+.+.}-{0:0}:
__lock_acquire+0x1130/0x1dc0
lock_acquire+0xf5/0x320
flush_workqueue+0xae/0x600
drain_workqueue+0xa0/0x110
destroy_workqueue+0x36/0x250
__loop_clr_fd+0x9a/0x650 [loop]
lo_ioctl+0x29d/0x780 [loop]
block_ioctl+0x3f/0x50
__x64_sys_ioctl+0x83/0xb0
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
other info that might help us debug this:
Chain exists of:
(wq_completion)loop0 --> &disk->open_mutex --> &lo->lo_mutex
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&lo->lo_mutex);
lock(&disk->open_mutex);
lock(&lo->lo_mutex);
lock((wq_completion)loop0);
*** DEADLOCK ***
1 lock held by losetup/156417:
#0: ffff9c7647395468 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x41/0x650 [loop]
stack backtrace:
CPU: 8 PID: 156417 Comm: losetup Not tainted 5.14.0-rc2-custom+ #34
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
Call Trace:
dump_stack_lvl+0x57/0x72
check_noncircular+0x10a/0x120
__lock_acquire+0x1130/0x1dc0
lock_acquire+0xf5/0x320
? flush_workqueue+0x84/0x600
flush_workqueue+0xae/0x600
? flush_workqueue+0x84/0x600
drain_workqueue+0xa0/0x110
destroy_workqueue+0x36/0x250
__loop_clr_fd+0x9a/0x650 [loop]
lo_ioctl+0x29d/0x780 [loop]
? __lock_acquire+0x3a0/0x1dc0
? update_dl_rq_load_avg+0x152/0x360
? lock_is_held_type+0xa5/0x120
? find_held_lock.constprop.0+0x2b/0x80
block_ioctl+0x3f/0x50
__x64_sys_ioctl+0x83/0xb0
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f645884de6b
Usually the uuid_mutex exists to protect the fs_devices that map
together all of the devices that match a specific uuid. In rm_device
we're messing with the uuid of a device, so it makes sense to protect
that here.
However in doing that it pulls in a whole host of lockdep dependencies,
as we call mnt_may_write() on the sb before we grab the uuid_mutex, thus
we end up with the dependency chain under the uuid_mutex being added
under the normal sb write dependency chain, which causes problems with
loop devices.
We don't need the uuid mutex here however. If we call
btrfs_scan_one_device() before we scratch the super block we will find
the fs_devices and not find the device itself and return EBUSY because
the fs_devices is open. If we call it after the scratch happens it will
not appear to be a valid btrfs file system.
We do not need to worry about other fs_devices modifying operations here
because we're protected by the exclusive operations locking.
So drop the uuid_mutex here in order to fix the lockdep splat.
A more detailed explanation from the discussion:
We are worried about rm and scan racing with each other, before this
change we'll zero the device out under the UUID mutex so when scan does
run it'll make sure that it can go through the whole device scan thing
without rm messing with us.
We aren't worried if the scratch happens first, because the result is we
don't think this is a btrfs device and we bail out.
The only case we are concerned with is we scratch _after_ scan is able
to read the superblock and gets a seemingly valid super block, so lets
consider this case.
Scan will call device_list_add() with the device we're removing. We'll
call find_fsid_with_metadata_uuid() and get our fs_devices for this
UUID. At this point we lock the fs_devices->device_list_mutex. This is
what protects us in this case, but we have two cases here.
1. We aren't to the device removal part of the RM. We found our device,
and device name matches our path, we go down and we set total_devices
to our super number of devices, which doesn't affect anything because
we haven't done the remove yet.
2. We are past the device removal part, which is protected by the
device_list_mutex. Scan doesn't find the device, it goes down and
does the
if (fs_devices->opened)
return -EBUSY;
check and we bail out.
Nothing about this situation is ideal, but the lockdep splat is real,
and the fix is safe, tho admittedly a bit scary looking.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ copy more from the discussion ]
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 44bee215f72f13874c0e734a0712c2e3264c0108 ]
Fix a warning reported by smatch that ret could be returned without
initialized. The dedupe operations are supposed to to return 0 for a 0
length range but the caller does not pass olen == 0. To keep this
behaviour and also fix the warning initialize ret to 0.
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Sidong Yang <realwakka@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3d730ee686800d71ecc5c3cb8460dcdcdeaf38a3 ]
Let GK45 not go into BIOS for determining the AC power state.
The BIOS wrongly returns 0, so hardcode the power state to 1.
The mini PC GK45 by Besstar Tech Lld. (aka Kodlix) just runs
off AC. It does not include any batteries. Nevertheless BIOS
reports AC off:
root@kodlix:/usr/src/linux# cat /sys/class/power_supply/ADP1/online
0
root@kodlix:/usr/src/linux# modprobe acpi_dbg
root@kodlix:/usr/src/linux# tools/power/acpi/acpidbg
- find _PSR
\_SB.PCI0.SBRG.H_EC.ADP1._PSR Method 000000009283cee8 001 Args 0 Len 001C Aml 00000000f54e5f67
- execute \_SB.PCI0.SBRG.H_EC.ADP1._PSR
Evaluating \_SB.PCI0.SBRG.H_EC.ADP1._PSR
Evaluation of \_SB.PCI0.SBRG.H_EC.ADP1._PSR returned object 00000000dc08c187, external buffer length 18
[Integer] = 0000000000000000
that should be
[Integer] = 0000000000000001
Signed-off-by: Stefan Schaeckeler <schaecsn@gmx.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8c9c296adfae9ea05f655d69e9f6e13daa86fb4a ]
The VRF driver invokes netfilter for output+postrouting hooks so that users
can create rules that check for 'oif $vrf' rather than lower device name.
This is a problem when NAT rules are configured.
To avoid any conntrack involvement in round 1, tag skbs as 'untracked'
to prevent conntrack from picking them up.
This gets cleared before the packet gets handed to the ip stack so
conntrack will be active on the second iteration.
One remaining issue is that a rule like
output ... oif $vrfname notrack
won't propagate to the second round because we can't tell
'notrack set via ruleset' and 'notrack set by vrf driver' apart.
However, this isn't a regression: the 'notrack' removal happens
instead of unconditional nf_reset_ct().
I'd also like to avoid leaking more vrf specific conditionals into the
netfilter infra.
For ingress, conntrack has already been done before the packet makes it
to the vrf driver, with this patch egress does connection tracking with
lower/physical device as well.
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 345dac33f58894a56d17b92a41be10e16585ceff ]
When configuring the kernel for big-endian, we set either BE-8 or BE-32
based on the CPU architecture level. Until linux-4.4, we did not have
any ARMv7-M platform allowing big-endian builds, but now i.MX/Vybrid
is in that category, adn we get a build error because of this:
arch/arm/kernel/module-plts.c: In function 'get_module_plt':
arch/arm/kernel/module-plts.c:60:46: error: implicit declaration of function '__opcode_to_mem_thumb32' [-Werror=implicit-function-declaration]
This comes down to picking the wrong default, ARMv7-M uses BE8
like ARMv7-A does. Changing the default gets the kernel to compile
and presumably works.
https://lore.kernel.org/all/1455804123-2526139-2-git-send-email-arnd@arndb.de/
Tested-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7427f3bb49d81525b7dd1d0f7c5f6bbc752e6f0e ]
So far, glock_hash_walk took a reference on each glock it iterated over, and it
was the examiner's responsibility to drop those references. Dropping the final
reference to a glock can sleep and the examiners are called in a RCU critical
section with spin locks held, so examiners that didn't need the extra reference
had to drop it asynchronously via gfs2_glock_queue_put or similar. This wasn't
done correctly in thaw_glock which did call gfs2_glock_put, and not at all in
dump_glock_func.
Change glock_hash_walk to not take glock references at all. That way, the
examiners that don't need them won't have to bother with slow asynchronous
puts, and the examiners that do need references can take them themselves.
Reported-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 486408d690e130c3adacf816754b97558d715f46 ]
In gfs2_inode_lookup and gfs2_create_inode, we're calling
gfs2_cancel_delete_work which currently cancels any remote delete work
(delete_work_func) synchronously. This means that if the work is
currently running, it will wait for it to finish. We're doing this to
pevent a previous instance of an inode from having any influence on the
next instance.
However, delete_work_func uses gfs2_inode_lookup internally, and we can
end up in a deadlock when delete_work_func gets interrupted at the wrong
time. For example,
(1) An inode's iopen glock has delete work queued, but the inode
itself has been evicted from the inode cache.
(2) The delete work is preempted before reaching gfs2_inode_lookup.
(3) Another process recreates the inode (gfs2_create_inode). It tries
to cancel any outstanding delete work, which blocks waiting for
the ongoing delete work to finish.
(4) The delete work calls gfs2_inode_lookup, which blocks waiting for
gfs2_create_inode to instantiate and unlock the new inode =>
deadlock.
It turns out that when the delete work notices that its inode has been
re-instantiated, it will do nothing. This means that it's safe to
cancel the delete work asynchronously. This prevents the kind of
deadlock described above.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 61e18ce7348bfefb5688a8bcd4b4d6b37c0f9b2a ]
When addr_gen_mode is set to IN6_ADDR_GEN_MODE_NONE, the link-local addr
should not be generated. But it isn't the case for GRE (as well as GRE6)
and SIT tunnels. Make it so that tunnels consider the addr_gen_mode,
especially for IN6_ADDR_GEN_MODE_NONE.
Do this in add_v4_addrs() to cover both GRE and SIT only if the addr
scope is link.
Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
Acked-by: Antonio Quartulli <a@unstable.cc>
Link: https://lore.kernel.org/r/20211020200618.467342-1-ssuryaextr@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b3ea5d56f212ad81328c82454829a736197ebccc ]
Currently the stacktrace on clang compiled arm kernel uses the 'lr'
register to find the first frame address from pt_regs. However, that
is wrong after calling another function, because the 'lr' register
is used by 'bl' instruction and never be recovered.
As same as gcc arm kernel, directly use the frame pointer (r11) of
the pt_regs to find the first frame address.
Note that this fixes kretprobe stacktrace issue only with
CONFIG_UNWINDER_FRAME_POINTER=y. For the CONFIG_UNWINDER_ARM,
we need another fix.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f91488ee15bd3cac467e2d6a361fc2d34d1052ae ]
syzbot is reporting kernel panic at smk_cipso_doi() due to memory
allocation fault injection [1]. The reason for need to use panic() was
not explained. But since no fix was proposed for 18 months, for now
let's use __GFP_NOFAIL for utilizing syzbot resource on other bugs.
Link: https://syzkaller.appspot.com/bug?extid=89731ccb6fec15ce1c22 [1]
Reported-by: syzbot <syzbot+89731ccb6fec15ce1c22@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e5322b9ab5f63536c41301150b7ce64605ce52cc ]
Just like we have default SMPS mode as dynamic in powersave,
we should not enable RX-diversity in powersave, to reduce
power consumption when connected to a non-MIMO AP.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20211017113927.fc896bc5cdaa.I1d11da71b8a5cbe921a37058d5f578f1b14a2023@changeid
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c3867ab5924b7a9a0b4a117902a08669d8be7c21 ]
get_warnings_count() does fclose() using File * returned from popen().
Fix it to call pclose() as it should.
tools/testing/selftests/kvm/x86_64/mmio_warning_test
x86_64/mmio_warning_test.c: In function ‘get_warnings_count’:
x86_64/mmio_warning_test.c:87:9: warning: ‘fclose’ called on pointer returned from a mismatched allocation function [-Wmismatched-dealloc]
87 | fclose(f);
| ^~~~~~~~~
x86_64/mmio_warning_test.c:84:13: note: returned from ‘popen’
84 | f = popen("dmesg | grep \"WARNING:\" | wc -l", "r");
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2b81a5f015199f3d585ce710190a9e87714d3c1e ]
When reading the partition table on initial scan hits an I/O error the
I/O will hang with the scan_mutex held:
[<0>] do_read_cache_page+0x49b/0x790
[<0>] read_part_sector+0x39/0xe0
[<0>] read_lba+0xf9/0x1d0
[<0>] efi_partition+0xf1/0x7f0
[<0>] bdev_disk_changed+0x1ee/0x550
[<0>] blkdev_get_whole+0x81/0x90
[<0>] blkdev_get_by_dev+0x128/0x2e0
[<0>] device_add_disk+0x377/0x3c0
[<0>] nvme_mpath_set_live+0x130/0x1b0 [nvme_core]
[<0>] nvme_mpath_add_disk+0x150/0x160 [nvme_core]
[<0>] nvme_alloc_ns+0x417/0x950 [nvme_core]
[<0>] nvme_validate_or_alloc_ns+0xe9/0x1e0 [nvme_core]
[<0>] nvme_scan_work+0x168/0x310 [nvme_core]
[<0>] process_one_work+0x231/0x420
and trying to delete the controller will deadlock as it tries to grab
the scan mutex:
[<0>] nvme_mpath_clear_ctrl_paths+0x25/0x80 [nvme_core]
[<0>] nvme_remove_namespaces+0x31/0xf0 [nvme_core]
[<0>] nvme_do_delete_ctrl+0x4b/0x80 [nvme_core]
As we're now properly ordering the namespace list there is no need to
hold the scan_mutex in nvme_mpath_clear_ctrl_paths() anymore.
And we always need to kick the requeue list as the path will be marked
as unusable and I/O will be requeued _without_ a current path.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2351ead99ce9164fb42555aee3f96af84c4839e9 ]
When removing a port, all its controllers are being removed, but there
are queues on the port that doesn't belong to any controller (during
connection time). This causes a use-after-free bug for any command
that dereferences req->port (like in nvmet_alloc_ctrl). Those queues
should be destroyed before freeing the port via configfs. Destroy
the remaining queues after the accept_work was cancelled guarantees
that no new queue will be created.
Signed-off-by: Israel Rukshin <israelr@nvidia.com>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fcf73a804c7d6bbf0ea63531c6122aa363852e04 ]
When removing a port, all its controllers are being removed, but there
are queues on the port that doesn't belong to any controller (during
connection time). This causes a use-after-free bug for any command
that dereferences req->port (like in nvmet_alloc_ctrl). Those queues
should be destroyed before freeing the port via configfs. Destroy the
remaining queues after the RDMA-CM was destroyed guarantees that no
new queue will be created.
Signed-off-by: Israel Rukshin <israelr@nvidia.com>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e3e19dcc4c416d65f99f13d55be2b787f8d0050e ]
When a port is removed through configfs, any connected controllers
are starting teardown flow asynchronously and can still send commands.
This causes a use-after-free bug for any command that dereferences
req->port (like in nvmet_parse_io_cmd).
To fix this, wait for all the teardown scheduled works to complete
(like release_work at rdma/tcp drivers). This ensures there are no
active controllers when the port is eventually removed.
Signed-off-by: Israel Rukshin <israelr@nvidia.com>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1ecda6393db4be44aba27a243e648dc98c9b92e3 ]
The mailbox is initialized after the interrupt handler is installed. As
the firmware is loaded and started even later, it should not happen that
the interrupt occurs without the mailbox being initialized.
As the Linux Driver Verification project (linuxtesting.org) keeps
reporting this as an error, add a check to ignore interrupts before the
mailbox is initialized to fix this potential null pointer dereference.
Reported-by: Yuri Savinykh <s02190703@gse.cs.msu.ru>
Reported-by: Nadezda Lutovinova <lutovinova@ispras.ru>
Signed-off-by: Michael Tretter <m.tretter@pengutronix.de>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 037057a5a979c7eeb2ee5d12cf4c24b805192c75 ]
This check is meant to catch cases where a requeue is attempted on a
request that is still inserted. It's never really been useful to catch any
misuse, and now it's actively wrong. Outside of that, this should not be a
BUG_ON() to begin with.
Remove the check as it's now causing active harm, as requeue off the plug
path will trigger it even though the request state is just fine.
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Link: https://lore.kernel.org/linux-block/CAHj4cs80zAUc2grnCZ015-2Rvd-=gXRfB_dFKy=RTm+wRo09HQ@mail.gmail.com/
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d45dac0732a287fc371a23f257cce04e65627947 ]
The bounds check on datalen is off-by-one, so fix it.
Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7ce1bb83a14019f8c396d57ec704d19478747716 ]
If CONFIG_CFI_CLANG=y, attempting to read an event histogram will cause
the kernel to panic due to failed CFI check.
1. echo 'hist:keys=common_pid' >> events/sched/sched_switch/trigger
2. cat events/sched/sched_switch/hist
3. kernel panics on attempting to read hist
This happens because the sort() function expects a generic
int (*)(const void *, const void *) pointer for the compare function.
To prevent this CFI failure, change tracing map cmp_entries_* function
signatures to match this.
Also, fix the build error reported by the kernel test robot [1].
[1] https://lore.kernel.org/r/202110141140.zzi4dRh4-lkp@intel.com/
Link: https://lkml.kernel.org/r/20211014045217.3265162-1-kaleshsingh@google.com
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d25302e46592c97d29f70ccb1be558df31a9a360 ]
Some unfriendly component, such as dpdk, write the same mask to
unbound kworker cpumask again and again. Every time it write to
this interface some work is queue to cpu, even though the mask
is same with the original mask.
So, fix it by return success and do nothing if the cpumask is
equal with the old one.
Signed-off-by: Mengen Sun <mengensun@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4f8d7abaa413c34da9d751289849dbfb7c977d05 ]
This might matter, for example, if the underlying type of enum xz_check
was a signed char. In such a case the validation wouldn't have caught an
unsupported header. I don't know if this problem can occur in the kernel
on any arch but it's still good to fix it because some people might copy
the XZ code to their own projects from Linux instead of the upstream
XZ Embedded repository.
This change may increase the code size by a few bytes. An alternative
would have been to use an unsigned int instead of enum xz_check but
using an enumeration looks cleaner.
Link: https://lore.kernel.org/r/20211010213145.17462-3-xiang@kernel.org
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 83d3c4f22a36d005b55f44628f46cc0d319a75e8 ]
With valid files, the safety margin described in lib/decompress_unxz.c
ensures that these buffers cannot overlap. But if the uncompressed size
of the input is larger than the caller thought, which is possible when
the input file is invalid/corrupt, the buffers can overlap. Obviously
the result will then be garbage (and usually the decoder will return
an error too) but no other harm will happen when such an over-run occurs.
This change only affects uncompressed LZMA2 chunks and so this
should have no effect on performance.
Link: https://lore.kernel.org/r/20211010213145.17462-2-xiang@kernel.org
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8b9e2291e355a0eafdd5b1e21a94a6659f24b351 ]
When the in memory flag is changed, we need to persist the change in the
rdev superblock flags. This is needed for "writemostly" and "failfast".
Reviewed-by: Li Feng <fengli@smartx.com>
Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>