Pull libata fixes from Tejun Heo:
"Dan found a really old bug where libata hotplug code wasn't sanitizing
index value from userland and may end up indexing with a negative
number. It is scary but fortunately can only be triggered by root.
Other than that, minor fixes"
* 'for-4.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
libata: fix a couple of doc build warnings
libata: array underflow in ata_find_dev()
ata: sata_rcar: add gen[23] fallback compatibility strings
libata: remove unused rc in ata_eh_handle_port_resume
libata: Cleanup ata_read_log_page()
ata: fix gemini Kconfig dependencies
This patch fixes values of the EPLL K coefficient and changes
the EPLL output frequency values to match exactly what is
possible to achieve with given M, P, S, K coefficients.
This allows to avoid rounding errors and unexpected frequency
being set with clk_set_rate(), due to recalc_rate returning
different values than the PLL rate specified in the
exynos5420_epll_24mhz_tbl table. E.g. this prevents a case
where two consecutive clk_set_rate() calls with same argument
result in different PLL output frequency.
The PLL output frequencies have been calculated with formula:
f = fxtal * (M * 2^16 + K) / (P * 2^S) / 2^16
where fxtal = 24000000.
Fixes: 9842452acd ("clk: samsung: exynos542x: Add EPLL rate table")
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
While working on enabling queued rwlock on SPARC, found this following
code in include/asm-generic/qrwlock.h which uses CONFIG_CPU_BIG_ENDIAN
to clear a byte.
static inline u8 *__qrwlock_write_byte(struct qrwlock *lock)
{
return (u8 *)lock + 3 * IS_BUILTIN(CONFIG_CPU_BIG_ENDIAN);
}
Problem is many of the fixed big endian architectures don't define
CPU_BIG_ENDIAN and clears the wrong byte.
Define CPU_BIG_ENDIAN for parisc architecture to fix it.
Signed-off-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Allow i2c-versatile to be enabled for ARM MPS platforms.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
The kerneldoc comments for a couple of functions in drivers/ata/libata-eh.c
had fallen behind the current implementation, resulting in these doc build
warnings:
./drivers/ata/libata-eh.c:1449: warning: No description found for parameter 'link'
./drivers/ata/libata-eh.c:1449: warning: Excess function parameter 'ap' description in 'ata_eh_done'
./drivers/ata/libata-eh.c:1590: warning: No description found for parameter 'qc'
./drivers/ata/libata-eh.c:1590: warning: Excess function parameter 'dev' description in 'ata_eh_request_sense'
Update the comments and make the warnings go away.
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Tejun Heo <tj@kernel.org>
At least the Acer Iconia Tab8 / aka W1-810 uses 1MiHz instead of
1MHz for one of its busses, fix this up to 1MHz instead of failing
the probe of that bus.
This fixes the accelerometer on the Acer Iconia Tab8 not working.
Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
When we refuse to probe due to an invalid clock frequency, log
the frequency which is causing this error.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
The PH16 pin has a function with mux id 0x5, which is the DET pin of the
"sim" (smart card reader) IP block.
This function is missing in old versions of A10/A20 SoCs' datasheets and
user manuals, so it's also missing in the old drivers. The newest A10
Datasheet V1.70 and A20 Datasheet V1.41 contain this pin function, and
it's discovered during implementing R40 pinctrl driver.
Add it to the driver. As we now merged A20 pinctrl driver to the A10
one, we need to only fix the A10 driver now.
Fixes: f2821b1ca3a2 ("pinctrl: sunxi: Move Allwinner A10 pinctrl
driver to a driver of its own")
Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Reviewed-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
For now empty ID table is not allowed with ACPI and prevents driver to
be probed.
Add a check to allow empty ID table.
This introduces a helper i2c_acpi_match_device().
Note, we rename some static function in i2c-core-acpi.c to distinguish
with public API.
Fixes: da10c06a044b ("i2c: Make I2C ID tables non-mandatory for DT'ed devices")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Rajmohan Mani <rajmohan.mani@intel.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
[wsa: needed to get some drivers probed again]
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
According to pinctrl assignment for Pro4, each definition of USB#2 and
USB#3 are as follows.
184: USB2VBUS
185: USB2OD
186: USB2ID
187: USB3VBUS
188: USB3OD
USB#2 has an additional pin "USB2ID", but the chip doesn't use this pin
while in host-mode. Considering this pin, the pin definitions for USB#3
should be {187, 188}.
Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Acked-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
The value argument of lp87565_gpio_direction_output() means output level
rather than gpio direction.
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Reviewed-by: Keerthy <j-keerthy@ti.com>
Tested-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
On one of my test machines nhi_mailbox_cmd() called from icm_suspend()
times out and returnes an error which then is propagated to the
caller and causes the entire system suspend to be aborted which isn't
very useful.
Instead of aborting system suspend, print the error into the log
and continue.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Michael Jamet <michael.jamet@intel.com>
The watchdog soft-NMI exception stack setup loads a stack pointer
twice, which is an obvious error. It ends up using the system reset
interrupt (true-NMI) stack, which is also a bug because the watchdog
could be preempted by a system reset interrupt that overwrites the
NMI stack.
Change the soft-NMI to use the "emergency stack". The current kernel
stack is not used, because of the longer-term goal to prevent
asynchronous stack access using soft-disable.
Fixes: 2104180a5369 ("powerpc/64s: implement arch-specific hardlockup watchdog")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJZapWhAAoJEHm+PkMAQRiGKb0IAJM6b7SbWaw69Og7+qiFB+zZ
xp29iXqbE9fPISC6a5BRQV1ONjeDM6opGixGHqGC8Hla6k2IYz25VDNoF8wd0MXN
cz/Ih20vd3C5afxXGe5cTT8lsPAlV0mWXxForlu6j8jPeL62FPfq6RhEkw7AcrYL
yfYy3k3qSdOrrvBdII0WAAUi46UfIs+we9BQgbsMbkHOiqV2K0MOrzKE84Xbgepq
RAy2xg6P4b4+hTx8xTrYc1MXwpnqjRc0oJ08gdmiwW3AOOU7LxYFn7zDkLPWi9Rr
g4x6r4YhBTGxT4wNvovLIiqd9QFs//dMCuPWYwEtTICG48umIqqq24beQ0mvCdg=
=08Ic
-----END PGP SIGNATURE-----
Merge tag 'v4.13-rc1' into fixes
The fixes branch is based off a random pre-rc1 commit, because we had
some fixes that needed to go in before rc1 was released.
However we now need to fix some code that went in after that point, but
before rc1, so merge rc1 to get that code into fixes so we can fix it!
This patch fixes an issue in the translation table code potentially
leading to a TT Request + Response storm. The issue may occur for nodes
involving BLA and an inconsistent configuration of the batman-adv AP
isolation feature. However, since the new multicast optimizations, a
single, malformed packet may lead to a mesh-wide, persistent
Denial-of-Service, too.
The issue occurs because nodes are currently OR-ing the TT sync flags of
all originators announcing a specific MAC address via the
translation table. When an intermediate node now receives a TT Request
and wants to answer this on behalf of the destination node, then this
intermediate node now responds with an altered flag field and broken
CRC. The next OGM of the real destination will lead to a CRC mismatch
and triggering a TT Request and Response again.
Furthermore, the OR-ing is currently never undone as long as at least
one originator announcing the according MAC address remains, leading to
the potential persistency of this issue.
This patch fixes this issue by storing the flags used in the CRC
calculation on a a per TT orig entry basis to be able to respond with
the correct, original flags in an intermediate TT Response for one
thing. And to be able to correctly unset sync flags once all nodes
announcing a sync flag vanish for another.
Fixes: e9c00136a475 ("batman-adv: fix tt_global_entries flags update")
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Acked-by: Antonio Quartulli <a@unstable.cc>
[sw: typo in commit message]
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Since kernel 4.11 the thread and irq stacks on parisc randomly overflow
the default size of 16k. The reason why stack usage suddenly grew is yet
unknown.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # 4.11+
Signed-off-by: Helge Deller <deller@gmx.de>
In testing James' patch to drivers/parisc/pdc_stable.c, I hit the BUG
statement in flush_cache_range() during a system shutdown:
kernel BUG at arch/parisc/kernel/cache.c:595!
CPU: 2 PID: 6532 Comm: kworker/2:0 Not tainted 4.13.0-rc2+ #1
Workqueue: events free_ioctx
IAOQ[0]: flush_cache_range+0x144/0x148
IAOQ[1]: flush_cache_page+0x0/0x1a8
RP(r2): flush_cache_range+0xec/0x148
Backtrace:
[<00000000402910ac>] unmap_page_range+0x84/0x880
[<00000000402918f4>] unmap_single_vma+0x4c/0x60
[<0000000040291a18>] zap_page_range_single+0x110/0x160
[<0000000040291c34>] unmap_mapping_range+0x174/0x1a8
[<000000004026ccd8>] truncate_pagecache+0x50/0xa8
[<000000004026cd84>] truncate_setsize+0x54/0x70
[<000000004033d534>] put_aio_ring_file+0x44/0xb0
[<000000004033d5d8>] aio_free_ring+0x38/0x140
[<000000004033d714>] free_ioctx+0x34/0xa8
[<00000000401b0028>] process_one_work+0x1b8/0x4d0
[<00000000401b04f4>] worker_thread+0x1b4/0x648
[<00000000401b9128>] kthread+0x1b0/0x208
[<0000000040150020>] end_fault_vector+0x20/0x28
[<0000000040639518>] nf_ip_reroute+0x50/0xa8
[<0000000040638ed0>] nf_ip_route+0x10/0x78
[<0000000040638c90>] xfrm4_mode_tunnel_input+0x180/0x1f8
CPU: 2 PID: 6532 Comm: kworker/2:0 Not tainted 4.13.0-rc2+ #1
Workqueue: events free_ioctx
Backtrace:
[<0000000040163bf0>] show_stack+0x20/0x38
[<0000000040688480>] dump_stack+0xa8/0x120
[<0000000040163dc4>] die_if_kernel+0x19c/0x2b0
[<0000000040164d0c>] handle_interruption+0xa24/0xa48
This patch modifies flush_cache_range() to handle non current contexts.
In as much as this occurs infrequently, the simplest approach is to
flush the entire cache when this happens.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org # 4.9+
Signed-off-by: Helge Deller <deller@gmx.de>
For some odd reason, it forces a byte-by-byte copy of each field. A
plain old swap() on most of these fields would be more efficient. We
do need to retain the memswap of i_data however as that field is an array.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
For Lustre, if ea_inode fails in hash validation but passes parent
inode and generation checks, it won't be added to the cache as well
as the error "-EFSCORRUPTED" should be cleared, otherwise it will
cause "Structure needs cleaning" when running getfattr command.
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-9723
Cc: stable@vger.kernel.org
Fixes: dec214d00e0d78a08b947d7dccdfdb84407a9f4d
Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Reviewed-by: tahsin@google.com
When new directory 'DIR1' is created in a directory 'DIR0' with SGID bit
set, DIR1 is expected to have SGID bit set (and owning group equal to
the owning group of 'DIR0'). However when 'DIR0' also has some default
ACLs that 'DIR1' inherits, setting these ACLs will result in SGID bit on
'DIR1' to get cleared if user is not member of the owning group.
Fix the problem by moving posix_acl_update_mode() out of
__ext4_set_acl() into ext4_set_acl(). That way the function will not be
called when inheriting ACLs which is what we want as it prevents SGID
bit clearing and the mode has been properly set by posix_acl_create()
anyway.
Fixes: 073931017b49d9458aa351605b43a7e34598caef
CC: stable@vger.kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
When changing a file's acl mask, __ext4_set_acl() will first set the group
bits of i_mode to the value of the mask, and only then set the actual
extended attribute representing the new acl.
If the second part fails (due to lack of space, for example) and the file
had no acl attribute to begin with, the system will from now on assume
that the mask permission bits are actual group permission bits, potentially
granting access to the wrong users.
Prevent this by only changing the inode mode after the acl has been set.
Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Two variables in ext4_inode_info, i_reserved_meta_blocks and
i_allocated_meta_blocks, are unused. Removing them saves a little
memory per in-memory inode and cleans up clutter in several tracepoints.
Adjust tracepoint output from ext4_alloc_da_blocks() for consistency
and fix a typo and whitespace near these changes.
Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Commit 914f82a32d0268847 "ext4: refactor direct IO code" deleted
ext4_ext_direct_IO(), but references to that function remain in
comments. Update them to refer to ext4_direct_IO_write().
Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Reviewed-by: Jan Kara <jack@suse.cz>
Pull x86 fixes from Thomas Gleixner:
"A small set of x86 fixes:
- prevent the kernel from using the EFI reboot method when EFI is
disabled.
- two patches addressing clang issues"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/boot: Disable the address-of-packed-member compiler warning
x86/efi: Fix reboot_mode when EFI runtime services are disabled
x86/boot: #undef memcpy() et al in string.c
Pull scheduler fixes from Thomas Gleixner:
"Two patches addressing build warnings caused by inconsistent kernel
doc comments"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/wait: Clean up some documentation warnings
sched/core: Fix some documentation build warnings
Pull perf fixes from Thomas Gleixner:
"A couple of fixes for performance counters and kprobes:
- a series of small patches which make the uncore performance
counters on Skylake server systems work correctly
- add a missing instruction slot release to the failure path of
kprobes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
kprobes/x86: Release insn_slot in failure path
perf/x86/intel/uncore: Fix missing marker for skx_uncore_cha_extra_regs
perf/x86/intel/uncore: Fix SKX CHA event extra regs
perf/x86/intel/uncore: Remove invalid Skylake server CHA filter field
perf/x86/intel/uncore: Fix Skylake server CHA LLC_LOOKUP event umask
perf/x86/intel/uncore: Fix Skylake server PCU PMU event format
perf/x86/intel/uncore: Fix Skylake UPI PMU event masks
Pull irq fix from Thomas Gleixner:
"Fix for a regression caused by the conversion of x86 to the generic
hotplug code.
Instead of doing a plain single line revert, this adds a pile of
comments so the semantics of the force argument are clear"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq/cpuhotplug: Revert "Set force affinity flag on hotplug migration"
ACPI HID for Hisilicon Hip07/08 should be HISI02A1/2,
not HISI0A21/2, HISI02A1/2 was tested ok but was modified
by the stupid typo when upstream the patches (by me),
correct them to the right IDs (matching the IDs in
drivers/i2c/busses/i2c-designware-platdrv.c).
Fixes: 6e14cf361a0c (ACPI / APD: Add clock frequency for Hisilicon Hip07/08 I2C controller)
Reported-by: Tao Tian <tiantao6@huawei.com>
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
After commit f8475cef9008 "x86: use common aperfmperf_khz_on_cpu() to
calculate KHz using APERF/MPERF" the scaling_cur_freq policy attribute
in sysfs only behaves as expected on x86 with APERF/MPERF registers
available when it is read from at least twice in a row. The value
returned by the first read may not be meaningful, because the
computations in there use cached values from the previous iteration
of aperfmperf_snapshot_khz() which may be stale.
To prevent that from happening, modify arch_freq_get_on_cpu() to
call aperfmperf_snapshot_khz() twice, with a short delay between
these calls, if the previous invocation of aperfmperf_snapshot_khz()
was too far back in the past (specifically, more that 1s ago).
Also, as pointed out by Doug Smythies, aperf_delta is limited now
and the multiplication of it by cpu_khz won't overflow, so simplify
the s->khz computations too.
Fixes: f8475cef9008 "x86: use common aperfmperf_khz_on_cpu() to calculate KHz using APERF/MPERF"
Reported-by: Doug Smythies <dsmythies@telus.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
bpf_prog_size(prog->len) is not the correct length we want to dump
back to user space. The code in bpf_prog_get_info_by_fd() uses this
to copy prog->insnsi to user space, but bpf_prog_size(prog->len) also
includes the size of struct bpf_prog itself plus program instructions
and is usually used either in context of accounting or for bpf_prog_alloc()
et al, thus we copy out of bounds in bpf_prog_get_info_by_fd()
potentially. Use the correct bpf_prog_insn_size() instead.
Fixes: 1e2709769086 ("bpf: Add BPF_OBJ_GET_INFO_BY_FD")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When using CONFIG_UBSAN_SANITIZE_ALL, the TCP code produces a
false-positive warning:
net/ipv4/tcp_output.c: In function 'tcp_connect':
net/ipv4/tcp_output.c:2207:40: error: array subscript is below array bounds [-Werror=array-bounds]
tp->chrono_stat[tp->chrono_type - 1] += now - tp->chrono_start;
^~
net/ipv4/tcp_output.c:2207:40: error: array subscript is below array bounds [-Werror=array-bounds]
tp->chrono_stat[tp->chrono_type - 1] += now - tp->chrono_start;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~
I have opened a gcc bug for this, but distros have already shipped
compilers with this problem, and it's not clear yet whether there is
a way for gcc to avoid the warning. As the problem is related to the
bitfield access, this introduces a temporary variable to store the old
enum value.
I did not notice this warning earlier, since UBSAN is disabled when
building with COMPILE_TEST, and that was always turned on in both
allmodconfig and randconfig tests.
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81601
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Two fixes for for brcmfmac, the crash was reported by two people
already so it's a high priority fix.
brcmfmac
* fix a crash in skb headroom handling in v4.13-rc1
* fix a memory leak due to a merge error in v4.6
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJZe0L2AAoJEG4XJFUm622bCiYIAKXCyEV2CCfNgloFcPmElvPR
HvmYDzxQeJVEjCYjzYLpaBK6rfJtMuBhmHfSrCTSd6YulWOUoUy6IRMi7bgoZe2C
5cccqZl3uU8fG34ib6jKvp6ofx5DX3yFMQQa0toY27VQTkY46AT6yx9UMn4Mi+bQ
p0skV/1gwf0IPGfZQkpSemhosmtEaNLNMiqAJlOQrsi2b9YYcBYTW1svnr/yB0UM
7pl0+zUVE+Ul/eJHb82JeJaRrNYLRZ7KD5bMlV9OoDO/Rlu3933fNE/+jWtuuTVG
VnNLyhEI1CR767CppOhraDVkGvINg+EewCouWQ9ZCUWxTcVuFxAew+QkI6foS2A=
=uMUa
-----END PGP SIGNATURE-----
Merge tag 'wireless-drivers-for-davem-2017-07-28' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers
Kalle Valo says:
====================
wireless-drivers fixes for 4.13
Two fixes for for brcmfmac, the crash was reported by two people
already so it's a high priority fix.
brcmfmac
* fix a crash in skb headroom handling in v4.13-rc1
* fix a memory leak due to a merge error in v4.6
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Trivial fix to spelling mistake in printk message
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Groups of BFQ queues are represented by generic entities in BFQ. When
a queue belonging to a parent entity is deactivated, the parent entity
may need to be deactivated too, in case the deactivated queue was the
only active queue for the parent entity. This deactivation may need to
be propagated upwards if the entity belongs, in its turn, to a further
higher-level entity, and so on. In particular, the upward propagation
of deactivation stops at the first parent entity that remains active
even if one of its child entities has been deactivated.
To decide whether the last non-deactivation condition holds for a
parent entity, BFQ checks whether the field next_in_service is still
not NULL for the parent entity, after the deactivation of one of its
child entity. If it is not NULL, then there are certainly other active
entities in the parent entity, and deactivations can stop.
Unfortunately, this check misses a corner case: if in_service_entity
is not NULL, then next_in_service may happen to be NULL, although the
parent entity is evidently active. This happens if: 1) the entity
pointed by in_service_entity is the only active entity in the parent
entity, and 2) according to the definition of next_in_service, the
in_service_entity cannot be considered as next_in_service. See the
comments on the definition of next_in_service for details on this
second point.
Hitting the above corner case causes crashes.
To address this issue, this commit:
1) Extends the above check on only next_in_service to controlling both
next_in_service and in_service_entity (if any of them is not NULL,
then no further deactivation is performed)
2) Improves the (important) comments on how next_in_service is defined
and updated; in particular it fixes a few rather obscure paragraphs
Reported-by: Eric Wheeler <bfq-sched@lists.ewheeler.net>
Reported-by: Rick Yiu <rick_yiu@htc.com>
Reported-by: Tom X Nguyen <tom81094@gmail.com>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Tested-by: Eric Wheeler <bfq-sched@lists.ewheeler.net>
Tested-by: Rick Yiu <rick_yiu@htc.com>
Tested-by: Laurentiu Nicola <lnicola@dend.ro>
Tested-by: Tom X Nguyen <tom81094@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
BFQ implements hierarchical scheduling by representing each group of
queues with a generic parent entity. For each parent entity, BFQ
maintains an in_service_entity pointer: if one of the child entities
happens to be in service, in_service_entity points to it. The
resetting of these pointers happens only on queue expirations: when
the in-service queue is expired, i.e., stops to be the queue in
service, BFQ resets all in_service_entity pointers along the
parent-entity path from this queue to the root entity.
Functions handling the scheduling of entities assume, naturally, that
in-service entities are active, i.e., have pending I/O requests (or,
as a special case, even if they have no pending requests, they are
expected to receive a new request very soon, with the scheduler idling
the storage device while waiting for such an event). Unfortunately,
the above resetting scheme of the in_service_entity pointers may cause
this assumption to be violated. For example, the in-service queue may
happen to remain without requests because of a request merge. In this
case the queue does become idle, and all related data structures are
updated accordingly. But in_service_entity still points to the queue
in the parent entity. This inconsistency may even propagate to
higher-level parent entities, if they happen to become idle as well,
as a consequence of the leaf queue becoming idle. For this queue and
parent entities, scheduling functions have an undefined behaviour,
and, as reported, may easily lead to kernel crashes or hangs.
This commit addresses this issue by simply resetting the
in_service_entity field also when it is detected to point to an entity
becoming idle (regardless of why the entity becomes idle).
Reported-by: Laurentiu Nicola <lnicola@dend.ro>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Tested-by: Laurentiu Nicola <lnicola@dend.ro>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
err in bpf_prog_get_info_by_fd() still holds 0 at that time from prior
check_uarg_tail_zero() check. Explicitly return -EFAULT instead, so
user space can be notified of buggy behavior.
Fixes: 1e2709769086 ("bpf: Add BPF_OBJ_GET_INFO_BY_FD")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When an early demuxed packet reaches __udp6_lib_lookup_skb(), the
sk reference is retrieved and used, but the relevant reference
count is leaked and the socket destructor is never called.
Beyond leaking the sk memory, if there are pending UDP packets
in the receive queue, even the related accounted memory is leaked.
In the long run, this will cause persistent forward allocation errors
and no UDP skbs (both ipv4 and ipv6) will be able to reach the
user-space.
Fix this by explicitly accessing the early demux reference before
the lookup, and properly decreasing the socket reference count
after usage.
Also drop the skb_steal_sock() in __udp6_lib_lookup_skb(), and
the now obsoleted comment about "socket cache".
The newly added code is derived from the current ipv4 code for the
similar path.
v1 -> v2:
fixed the __udp6_lib_rcv() return code for resubmission,
as suggested by Eric
Reported-by: Sam Edwards <CFSworks@gmail.com>
Reported-by: Marc Haber <mh+netdev@zugschlus.de>
Fixes: 5425077d73e0 ("net: ipv6: Add early demux handler for UDP unicast")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For SGMII/RGMII/QSGMII interfaces when physical link goes down
while traffic is high is resulting in underflow condition being set
on that specific BGX's LMAC. Which assets a backpresure and VNIC stops
transmitting packets.
This is due to BGX being disabled in link status change callback while
packet is in transit. This patch fixes this issue by not disabling BGX
but instead just disables packet Rx and Tx.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 809ecb9bca6a9424ccd392d67e368160f8b76c92. Since it
was reported to break vhost_net. We want to cache used event and use
it to check for notification. The assumption was that guest won't move
the event idx back, but this could happen in fact when 16 bit index
wraps around after 64K entries.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJZee4OAAoJEEg/ir3gV/o+/J0H/1Ko7FROlb5J1fRr32R309vH
fG7GikFdqiYMRshWzqBsNsbZRaxZmgifWv87d9w6vgW/BpsXuvGXAqqcMgOqChRI
bevTXaq9vwpgqkUuUuFkfRgsp496+ADAwF1aXH/LHN9EyVOqnteZgqugBurgyNcZ
wC6sl613KUPTuaV8O6jNEne/4BrIonz+OQCJU+IT2H6OuJzuGgLNSWvnM4ugOVeX
IQUuG9mV4qKh0srJwP4CsVk59vdJitdqqE0paxJrWLx5wTwY23vOshwBc6ZqAdRM
/KZtCIFyR+ez8d+ZkZ5L2UDdUSkFedawAxD7nreARmG/ZRa8RtrpnVyElcnHPS8=
=rIWI
-----END PGP SIGNATURE-----
Merge tag 'mlx5-fixes-2017-07-27-V2' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
Mellanox, mlx5 fixes 2017-07-27
This series contains some misc fixes to the mlx5 driver.
Please pull and let me know if there's any problem.
V1->V2:
- removed redundant braces
for -stable:
4.7
net/mlx5: Fix command bad flow on command entry allocation failure
4.9
net/mlx5: Consider tx_enabled in all modes on remap
net/mlx5e: Fix outer_header_zero() check size
4.10
net/mlx5: Fix mlx5_add_flow_rules call with correct num of dests
4.11
net/mlx5: Fix mlx5_ifc_mtpps_reg_bits structure size
net/mlx5e: Add field select to MTPPS register
net/mlx5e: Fix broken disable 1PPS flow
net/mlx5e: Change 1PPS out scheme
net/mlx5e: Add missing support for PTP_CLK_REQ_PPS request
net/mlx5e: Fix wrong delay calculation for overflow check scheduling
net/mlx5e: Schedule overflow check work to mlx5e workqueue
4.12
net/mlx5: Fix command completion after timeout access invalid structure
net/mlx5e: IPoIB, Modify add/remove underlay QPN flows
I hope this is not too much, but most of the patches do apply cleanly on -stable.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
IPv6 tunnels use sizeof(struct in6_addr) as dev->addr_len,
but in many places especially bonding, we use struct sockaddr
to copy and set mac addr, this could lead to stack out-of-bounds
access.
Fix it by using a larger address storage like bonding.
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Historically, dev_ifsioc() uses struct sockaddr as mac
address definition, this is why dev_set_mac_address()
accepts a struct sockaddr pointer as input but now we
have various types of mac addresse whose lengths
are up to MAX_ADDR_LEN, longer than struct sockaddr,
and saved in dev->addr_len.
It is too late to fix dev_ifsioc() due to API
compatibility, so just reject those larger than
sizeof(struct sockaddr), otherwise we would read
and use some random bytes from kernel stack.
Fortunately, only a few IPv6 tunnel devices have addr_len
larger than sizeof(struct sockaddr) and they don't support
ndo_set_mac_addr(). But with team driver, in lb mode, they
can still be enslaved to a team master and make its mac addr
length as the same.
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simon Horman reported that Koelsch and Lager hang during boot, and
bisected this to commit 1c3c5eab171590f8 ("sched/core: Enable
might_sleep() and smp_processor_id() checks early").
The da9063/da9210 regulator quirk for R-Car Gen2 boards uses a bus
notifier, and unregisters the notifier when it is no longer needed.
However, a notifier must not be unregistered from within the call chain.
This bug went unnoticed, as blocking_notifier_chain_unregister() didn't
take the semaphore during early boot. The aforementioned commit changed
that behavior, leading to a deadlock.
Fix this by removing the call to bus_unregister_notifier(), and keeping
local completion state instead.
Reported-by: Simon Horman <horms+renesas@verge.net.au>
Fixes: 663fbb52159cca6f ("ARM: shmobile: R-Car Gen2: Add da9063/da9210 regulator quirk")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Currently building kernel for xtensa core with aliasing WT cache fails
with the following messages:
mm/memory.c:2152: undefined reference to `flush_dcache_page'
mm/memory.c:2332: undefined reference to `local_flush_cache_page'
mm/memory.c:1919: undefined reference to `local_flush_cache_range'
mm/memory.c:4179: undefined reference to `copy_to_user_page'
mm/memory.c:4183: undefined reference to `copy_from_user_page'
This happens because implementation of these functions is only compiled
when data cache is WB, which looks wrong: even when data cache doesn't
need flushing it still needs invalidation. The functions like
__flush_[invalidate_]dcache_* are correctly defined for both WB and WT
caches (and even if they weren't that'd still be ok, just slower).
Fix this by providing the same implementation of the above functions for
both WB and WT cache.
Cc: stable@vger.kernel.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>