The ct_state field was initially added as an 8-bit field, however six of
the bits are already being used and use cases are already starting to
appear that may push the limits of this field. This patch extends the
field to 32 bits while retaining the internal representation of 8 bits.
This should cover forward compatibility of the ABI for the foreseeable
future.
This patch also reorders the OVS_CS_F_* bits to be sequential.
Suggested-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These comments hadn't caught up to their implementations, fix them.
Fixes: 7f8a436eaa2c "openvswitch: Add conntrack action"
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conntrack LABELS (plural) are exposed by conntrack; rename the OVS name
for these to be consistent with conntrack.
Fixes: c2ac667 "openvswitch: Allow matching on conntrack label"
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I’m using the compilation flag -Werror=old-style-declaration, which
requires that the “inline” word would come at the beginning of the code
line.
$ make drivers/net/ethernet/intel/e1000e/e1000e.ko
...
include/net/inet_timewait_sock.h:116:1: error: ‘inline’ is not at
beginning of declaration [-Werror=old-style-declaration]
static void inline inet_twsk_schedule(struct inet_timewait_sock *tw, int
timeo)
include/net/inet_timewait_sock.h:121:1: error: ‘inline’ is not at
beginning of declaration [-Werror=old-style-declaration]
static void inline inet_twsk_reschedule(struct inet_timewait_sock *tw,
int timeo)
Fixes: ed2e92394589 ("tcp/dccp: fix timewait races in timer handling")
Signed-off-by: Raanan Avargil <raanan.avargil@intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull IOVA fixes from David Woodhouse:
"The main fix here is the first one, fixing the over-allocation of
size-aligned requests. The other patches simply make the existing
IOVA code available to users other than the Intel VT-d driver, with no
functional change.
I concede the latter really *should* have been submitted during the
merge window, but since it's basically risk-free and people are
waiting to build on top of it and it's my fault I didn't get it in, I
(and they) would be grateful if you'd take it"
* git://git.infradead.org/intel-iommu:
iommu: Make the iova library a module
iommu: iova: Export symbols
iommu: iova: Move iova cache management to the iova library
iommu/iova: Avoid over-allocating when size-aligned
Merge misc fixes from Andrew Morton:
"12 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
dmapool: fix overflow condition in pool_find_page()
thermal: avoid division by zero in power allocator
memcg: remove pcp_counter_lock
kprobes: use _do_fork() in samples to make them work again
drivers/input/joystick/Kconfig: zhenhua.c needs BITREVERSE
memcg: make mem_cgroup_read_stat() unsigned
memcg: fix dirty page migration
dax: fix NULL pointer in __dax_pmd_fault()
mm: hugetlbfs: skip shared VMAs when unmapping private pages to satisfy a fault
mm/slab: fix unexpected index mapping result of kmalloc_size(INDEX_NODE+1)
userfaultfd: remove kernel header include from uapi header
arch/x86/include/asm/efi.h: fix build failure
- intel_idle driver fixup for the recently added Skylake chips
support (Len Brown).
- Operating Performance Points (OPP) library fix related to the
recently added support for new DT bindings and a fix for a typo
in a comment (Viresh Kumar, Stephen Boyd).
- ACPI EC driver fix for a recently introduced memory leak in an
error code path (Lv Zheng).
- ACPI PCI IRQ management fix for the issue where an ISA IRQ is
shared with a PCI device which requires it to be configured in a
different way and may cause an interrupt storm to happen as a
result with an extra ACPI SCI IRQ handling simplification on top
of it (Jiang Liu).
- Update of the PCI power management documentation that became
outdated and started to actively confuse the readers to make
it actually reflect the code (Rafael J Wysocki).
- turbostat fixes including an IVB Xeon regression fix (related to
the --debug command line option), Skylake adjustment for the TSC
running at a frequency that doesn't match the base one exactly,
and a Knights Landing quirk to account for the fact that it only
updates APERF and MPERF every 1024 clock cycles plus bumping up
the turbostat version number (Len Brown, Hubert Chrzaniuk).
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJWDag4AAoJEILEb/54YlRxy/UQAJa39EC2IQd+PrMlgMx3cp2N
ssotwuQiQ0jL2V/qc36wfzgu3A5k0ldHHQGbgX0f/z9LjD+zLsZiPtHj27LrNtG5
J9DgViLh9vut4XEsLlzj8W2z1OcTyAmZyTIiVeFlj/zM517oeXKVYMX2RuhHQk0r
lwDI/hc1rtpUkdN7gkT9DqyO32r1LgNkDt6+ubRr/qrYVhYPXSrp4k9wxnr9j1Bx
0G9bvCz8ETTclRPcfToGU9P86snk5FS3veSm231ioABdry7BxhTZHjQKSZyjuvx4
l8YedxBc0ks7yyeN9lvWPbNSpHLjhYen+d9q1koQsHJYb+gWJ/KbSGu3kfg0bPDj
Rzh1u76ak7MOYpkn+95MRhzIiFxG3IhUoqYhIGGyCNFGAJgPfFos2IJTISAxSmTE
ebCyFEX07AdhjHac4RyRCnMVavZthgLyXHwXiNqG9gdW9aOEzN65svH2LLMBiKcH
IGRCsjom1uCUT0y1gy3R7q1nTCi112IcXwvAziX7QKCNOxLIH8HJNiraVcyl2vY5
BbDyTOQ7VboviWWSQ09+bQFq4CAhe4b9+nR4XhvHO9F0ffxBujBoCwjjFQY+yJIH
9nYaYyUynpi1m0Y1AwlrI8wgVLDfNEE6UU63clHQ2PoOFfDDE+/5I/l3yuWubo0I
cUtW1RVEgDaa61ehyFuS
=ELup
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-4.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI fixes from Rafael Wysocki:
"These are fixes mostly, for a few changes made in this cycle (the
intel_idle driver, the OPP library, the ACPI EC driver, turbostat) and
for some issues that have just been discovered (ACPI PCI IRQ
management, PCI power management documentation, turbostat), with a
couple of cleanups on top of them.
Specifics:
- intel_idle driver fixup for the recently added Skylake chips
support (Len Brown).
- Operating Performance Points (OPP) library fix related to the
recently added support for new DT bindings and a fix for a typo in
a comment (Viresh Kumar, Stephen Boyd).
- ACPI EC driver fix for a recently introduced memory leak in an
error code path (Lv Zheng).
- ACPI PCI IRQ management fix for the issue where an ISA IRQ is
shared with a PCI device which requires it to be configured in a
different way and may cause an interrupt storm to happen as a
result with an extra ACPI SCI IRQ handling simplification on top of
it (Jiang Liu).
- Update of the PCI power management documentation that became
outdated and started to actively confuse the readers to make it
actually reflect the code (Rafael J Wysocki).
- turbostat fixes including an IVB Xeon regression fix (related to
the --debug command line option), Skylake adjustment for the TSC
running at a frequency that doesn't match the base one exactly, and
a Knights Landing quirk to account for the fact that it only
updates APERF and MPERF every 1024 clock cycles plus bumping up the
turbostat version number (Len Brown, Hubert Chrzaniuk)"
* tag 'pm+acpi-4.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
tools/power turbosat: update version number
tools/power turbostat: SKL: Adjust for TSC difference from base frequency
tools/power turbostat: KNL workaround for %Busy and Avg_MHz
tools/power turbostat: IVB Xeon: fix --debug regression
ACPI / PCI: Remove duplicated penalty on SCI IRQ
ACPI, PCI, irq: Do not share PCI IRQ with ISA IRQ
ACPI / EC: Fix a memory leak issue in acpi_ec_query()
PM / OPP: Fix typo modifcation -> modification
PCI / PM: Update runtime PM documentation for PCI devices
PM / OPP: of_property_count_u32_elems() can return errors
intel_idle: Skylake Client Support - updated
Pull networking fixes from David Miller:
1) Fix regression in SKB partial checksum handling, from Pravin B
Shalar.
2) Fix VLAN inside of VXLAN handling in i40e driver, from Jesse
Brandeburg.
3) Cure softlockups during accept() in SCTP, from Karl Heiss.
4) MSG_PEEK should return multiple SKBs worth of data in AF_UNIX, from
Aaron Conole.
5) IPV6 erroneously ignores output interface specifier in lookup key for
route lookups, fix from David Ahern.
6) In Marvell DSA driver, forward unknown frames to CPU port, from
Andrew Lunn.
7) Mission flow flag initializations in some code paths, from David
Ahern.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
net: Initialize flow flags in input path
net: dsa: fix preparation of a port STP update
testptp: Silence compiler warnings on ppc64
net/mlx4: Handle return codes in mlx4_qp_attach_common
dsa: mv88e6xxx: Enable forwarding for unknown to the CPU port
skbuff: Fix skb checksum partial check.
net: ipv6: Add RT6_LOOKUP_F_IFACE flag if oif is set
net sysfs: Print link speed as signed integer
bna: fix error handling
af_unix: return data from multiple SKBs on recv() with MSG_PEEK flag
af_unix: Convert the unix_sk macro to an inline function for type safety
net: sctp: Don't use 64 kilobyte lookup table for four elements
l2tp: protect tunnel->del_work by ref_count
net/ibm/emac: bump version numbers for correct work with ethtool
sctp: Prevent soft lockup when sctp_accept() is called during a timeout event
sctp: Whitespace fix
i40e/i40evf: check for stopped admin queue
i40e: fix VLAN inside VXLAN
r8169: fix handling rtl_readphy result
net: hisilicon: fix handling platform_get_irq result
Commit 733a572e66d2 ("memcg: make mem_cgroup_read_{stat|event}() iterate
possible cpus instead of online") removed the last use of the per memcg
pcp_counter_lock but forgot to remove the variable.
Kill the vestigial variable.
Signed-off-by: Greg Thelen <gthelen@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The problem starts with a file backed dirty page which is charged to a
memcg. Then page migration is used to move oldpage to newpage.
Migration:
- copies the oldpage's data to newpage
- clears oldpage.PG_dirty
- sets newpage.PG_dirty
- uncharges oldpage from memcg
- charges newpage to memcg
Clearing oldpage.PG_dirty decrements the charged memcg's dirty page
count.
However, because newpage is not yet charged, setting newpage.PG_dirty
does not increment the memcg's dirty page count. After migration
completes newpage.PG_dirty is eventually cleared, often in
account_page_cleaned(). At this time newpage is charged to a memcg so
the memcg's dirty page count is decremented which causes underflow
because the count was not previously incremented by migration. This
underflow causes balance_dirty_pages() to see a very large unsigned
number of dirty memcg pages which leads to aggressive throttling of
buffered writes by processes in non root memcg.
This issue:
- can harm performance of non root memcg buffered writes.
- can report too small (even negative) values in
memory.stat[(total_)dirty] counters of all memcg, including the root.
To avoid polluting migrate.c with #ifdef CONFIG_MEMCG checks, introduce
page_memcg() and set_page_memcg() helpers.
Test:
0) setup and enter limited memcg
mkdir /sys/fs/cgroup/test
echo 1G > /sys/fs/cgroup/test/memory.limit_in_bytes
echo $$ > /sys/fs/cgroup/test/cgroup.procs
1) buffered writes baseline
dd if=/dev/zero of=/data/tmp/foo bs=1M count=1k
sync
grep ^dirty /sys/fs/cgroup/test/memory.stat
2) buffered writes with compaction antagonist to induce migration
yes 1 > /proc/sys/vm/compact_memory &
rm -rf /data/tmp/foo
dd if=/dev/zero of=/data/tmp/foo bs=1M count=1k
kill %
sync
grep ^dirty /sys/fs/cgroup/test/memory.stat
3) buffered writes without antagonist, should match baseline
rm -rf /data/tmp/foo
dd if=/dev/zero of=/data/tmp/foo bs=1M count=1k
sync
grep ^dirty /sys/fs/cgroup/test/memory.stat
(speed, dirty residue)
unpatched patched
1) 841 MB/s 0 dirty pages 886 MB/s 0 dirty pages
2) 611 MB/s -33427456 dirty pages 793 MB/s 0 dirty pages
3) 114 MB/s -33427456 dirty pages 891 MB/s 0 dirty pages
Notice that unpatched baseline performance (1) fell after
migration (3): 841 -> 114 MB/s. In the patched kernel, post
migration performance matches baseline.
Fixes: c4843a7593a9 ("memcg: add per cgroup dirty page accounting")
Signed-off-by: Greg Thelen <gthelen@google.com>
Reported-by: Dave Hansen <dave.hansen@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: <stable@vger.kernel.org> [4.2+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As include/uapi/linux/userfaultfd.h is a user visible header file, it
should not include kernel-exclusive header files.
So trying to build the userfaultfd test program from the selftests
directory fails, since it contains a reference to linux/compiler.h. As
it turns out, that header is not really needed there, so we can simply
remove it to fix that issue.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Fixes for mlx5 related issues
- Fixes for ipoib multicast handling
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWCfALAAoJELgmozMOVy/dc+MQAKoD6echYpTkWE0otMuHQcYf
zMaVVots+JdRKpA6OqHYQHgKGA80z21BpnjGYwcwB5zB1zPrJwz4vxwGlOBHt01T
xLBReFgSKyJlgOWLXKfPx4bXUdivOBKm203wY0dh+/dC/VROGYoiXYTmSDsfsuKa
8OXT1kWgzRVLtqwqj5GSkgWvtFZ28CjKh6d9egjqcj9tpbh2UupQDZzMyOtZ52X6
Nz/Vo3u4T7qjzlhHOlCwHCDw+97x0yvmvLY1mWweGPfKOnxtXjkzQmTQEpyzU5Mo
EwcqJucrBnmjbLAIBMrbR1mzTUQeD4dHz1jx+EzWE0lVnRL3twe1UaY40176sNlm
aCBA4bIOQ242r3IJ++ss15ol1k5hu7PYKRn9Q8d2sSbQGcSnCHe/YOutQQ+FTEFG
yE9xiLL+pgT8koauROnxg66E3HDM78NGTpjP3EuG4r2Qwa1iFANPfDB6kikuv8bO
rG3qUJcloEPvfatZY+h5QC4UCoB0/W1DAhlfzE3tPBYPmhSEgQDfEOzXTKDakeF0
VB903bYrOL3CVOun4I7fLrDc1leVeiAUKqO2orZs3qIpRWvAKyV/VjolAusMv2+F
/4xPyh95AEMTFfmZogOCofQFk3eOnkWpLdrVTYCKy3i6NVBoy2wHldrl+LuCAN/m
r/DNRBmazShashbeU6wg
=8+cX
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull rdma fixes from Doug Ledford:
- Fixes for mlx5 related issues
- Fixes for ipoib multicast handling
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
IB/ipoib: increase the max mcast backlog queue
IB/ipoib: Make sendonly multicast joins create the mcast group
IB/ipoib: Expire sendonly multicast joins
IB/mlx5: Remove pa_lkey usages
IB/mlx5: Remove support for IB_DEVICE_LOCAL_DMA_LKEY
IB/iser: Add module parameter for always register memory
xprtrdma: Replace global lkey with lkey local to PD
Earlier patch 6ae459bda tried to detect void ckecksum partial
skb by comparing pull length to checksum offset. But it does
not work for all cases since checksum-offset depends on
updates to skb->data.
Following patch fixes it by validating checksum start offset
after skb-data pointer is updated. Negative value of checksum
offset start means there is no need to checksum.
Fixes: 6ae459bda ("skbuff: Fix skb checksum flag on skb pull")
Reported-by: Andrew Vagin <avagin@odin.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As suggested by Eric Dumazet this change replaces the
#define with a static inline function to enjoy
complaints by the compiler when misusing the API.
Signed-off-by: Aaron Conole <aconole@bytheb.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull RCU fixes from Paul E. McKenney, for two regressions
introduced in this merge window:
- Fix bug with recent GCCs.
- Fix false positive lockdep splat.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull SCSI target fixes from Nicholas Bellinger:
"This includes a iser-target series from Jenny + Sagi @ Mellanox that
addresses the few remaining active I/O shutdown bugs, along with a
patch to support zero-copy for immediate data payloads that gives a
nice performance improvement for small block WRITEs.
Also included are some recent >= v4.2 regression bug-fixes. The most
notable is a RCU conversion regression for SPC-3 PR registrations, and
recent removal of obsolete RFC-3720 markers that introduced a login
regression bug with MSFT iSCSI initiators.
Thanks to everyone who has been testing + reporting bugs for v4.x"
* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
iscsi-target: Avoid OFMarker + IFMarker negotiation
target: Make TCM_WRITE_PROTECT failure honor D_SENSE bit
target: Fix target_sense_desc_format NULL pointer dereference
target: Propigate backend read-only to core_tpg_add_lun
target: Fix PR registration + APTPL RCU conversion regression
iser-target: Skip data copy if all the command data comes as immediate
iser-target: Change the recv buffers posting logic
iser-target: Fix pending connections handling in target stack shutdown sequnce
iser-target: Remove np_ prefix from isert_np members
iser-target: Remove unused variables
iser-target: Put the reference on commands waiting for unsol data
iser-target: remove command with state ISTATE_REMOVE
Pull networking fixes from David Miller:
1) When we run a tap on netlink sockets, we have to copy mmap'd SKBs
instead of cloning them. From Daniel Borkmann.
2) When converting classical BPF into eBPF, fix the setting of the
source reg to BPF_REG_X. From Tycho Andersen.
3) Fix igmpv3/mldv2 report parsing in the bridge multicast code, from
Linus Lussing.
4) Fix dst refcounting for ipv6 tunnels, from Martin KaFai Lau.
5) Set NLM_F_REPLACE flag properly when replacing ipv6 routes, from
Roopa Prabhu.
6) Add some new cxgb4 PCI device IDs, from Hariprasad Shenai.
7) Fix headroom tests and SKB leaks in ipv6 fragmentation code, from
Florian Westphal.
8) Check DMA mapping errors in bna driver, from Ivan Vecera.
9) Several 8139cp bug fixes (dev_kfree_skb_any in interrupt context,
misclearing of interrupt status in TX timeout handler, etc.) from
David Woodhouse.
10) In tipc, reset SKB header pointer after skb_linearize(), from Erik
Hugne.
11) Fix autobind races et al. in netlink code, from Herbert Xu with
help from Tejun Heo and others.
12) Missing SET_NETDEV_DEV in sunvnet driver, from Sowmini Varadhan.
13) Fix various races in timewait timer and reqsk_queue_hadh_req, from
Eric Dumazet.
14) Fix array overruns in mac80211, from Johannes Berg and Dan
Carpenter.
15) Fix data race in rhashtable_rehash_one(), from Dmitriy Vyukov.
16) Fix race between poll_one_napi and napi_disable, from Neil Horman.
17) Fix byte order in geneve tunnel port config, from John W Linville.
18) Fix handling of ARP replies over lightweight tunnels, from Jiri
Benc.
19) We can loop when fib rule dumps cross multiple SKBs, fix from Wilson
Kok and Roopa Prabhu.
20) Several reference count handling bug fixes in the PHY/MDIO layer
from Russel King.
21) Fix lockdep splat in ppp_dev_uninit(), from Guillaume Nault.
22) Fix crash in icmp_route_lookup(), from David Ahern.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits)
net: Fix panic in icmp_route_lookup
net: update docbook comment for __mdiobus_register()
ppp: fix lockdep splat in ppp_dev_uninit()
net: via/Kconfig: GENERIC_PCI_IOMAP required if PCI not selected
phy: marvell: add link partner advertised modes
net: fix net_device refcounting
phy: add phy_device_remove()
phy: fixed-phy: properly validate phy in fixed_phy_update_state()
net: fix phy refcounting in a bunch of drivers
of_mdio: fix MDIO phy device refcounting
phy: add proper phy struct device refcounting
phy: fix mdiobus module safety
net: dsa: fix of_mdio_find_bus() device refcount leak
phy: fix of_mdio_find_bus() device refcount leak
ip6_tunnel: Reduce log level in ip6_tnl_err() to debug
ip6_gre: Reduce log level in ip6gre_err() to debug
fib_rules: fix fib rule dumps across multiple skbs
bnx2x: byte swap rss_key to comply to Toeplitz specs
net: revert "net_sched: move tp->root allocation into fw_init()"
lwtunnel: remove source and destination UDP port config option
...
Avoid IRQs occupied by ISA IRQs when allocating IRQs for PCI link devices,
otherwise it may cause interrupt storm due to incompatible pin attributes.
This issue was triggered on a KVM virtual machine, which
1) uses IRQ9 for SCI in high level mode.
2) defines an PCI interrupt link device (LNKS) with IRQ9 as the only
possible irq.
3) has an PCI device referring to link device LNKS.
So it causes interrupt storm when enabling the PCI device because PCI IRQ
works in low level mode.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull another cgroup fix from Tejun Heo:
"The cgroup writeback support got inadvertently enabled for traditional
hierarchies revealing two regressions which are currently being worked
on. It shouldn't have been enabled on traditional hierarchies, so
disable it on them. This is enough to make the regressions go away
for people who aren't experimenting with cgroup"
* 'for-4.3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup, writeback: don't enable cgroup writeback on traditional hierarchies
Highlights include:
Stable patches:
- fix v4.2 SEEK on files over 2 gigs
- Fix a layout segment reference leak when pNFS I/O falls back to inband I/O.
- Fix recovery of recalled read delegations
Bugfixes:
- Fix a case where NFSv4 fails to send CLOSE after a server reboot
- Fix sunrpc to wait for connections to complete before retrying
- Fix sunrpc races between transport connect/disconnect and shutdown
- Fix an infinite loop when layoutget fail with BAD_STATEID
- nfs/filelayout: Fix NULL reference caused by double freeing of fh_array
- Fix a bogus WARN_ON_ONCE() in O_DIRECT when layout commit_through_mds is set
- Fix layoutreturn/close ordering issues.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWBWKdAAoJEGcL54qWCgDy2rUP/iIWUQSpUPfKKw7xquQUQe4j
ci4nFxpJ/zhKj1u7x3wrkxZAXcEooYo+ZJ7ayzROKcfQL/sUSWGbSLdr3mqrQynv
b0SDmnJK9V+CdBQrA+Jp5UGQxcumpMxsAfqVznT0qkf/wDp44DCVgDz5Aj8cRbWU
6xPfMgVLEnXiId9IgKqg3sJ2NmvMZXuI9sHM6hp6OzRmQDjTcx+LgRz7tnQHgaEk
zGz8R6eDm3OA0wfApqZwJ6JY793HsDdy30W9L0Yi2PVGXfzwoEB8AqgLVwSDIY1B
5hG5zn3tg9PSz9vhJ7M2h4AgFHdB3w3XGdJUafwqZEeqEIagw1iFCWlMyo/lE2dG
G7oob9Jiiwxjc3RDWn2wGaafymrrWZwl2nYzC4O3UvJ3hVJ0mEl1iJagK1m8LzfN
fmnP7tTyPuoOXkzDogZ0YI3FrngO6430PoR2hUPkS1yce/a+IV0HQEmXbSDSwN80
1d9zyC9TnPj6rFjZeaGxGK17BpkC0oIQCPq4OSJB4396wzAwMqoJjJVVWWeAK4UC
PxzoXqAAaBFguSsDbuBMcXgiuUw/7DIZ/pdzsWSiCFgocgF5ZdJdieCNtGk0nbLM
37R7HCauF93JDrkpUMKPnLXScb2IbEh31pFtKzptJYKwMxEiScXXiP3NE9hfX65i
2zLkl2aBvd154RvVKNbp
=GdeV
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-4.3-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
"Highlights include:
Stable patches:
- fix v4.2 SEEK on files over 2 gigs
- Fix a layout segment reference leak when pNFS I/O falls back to inband I/O.
- Fix recovery of recalled read delegations
Bugfixes:
- Fix a case where NFSv4 fails to send CLOSE after a server reboot
- Fix sunrpc to wait for connections to complete before retrying
- Fix sunrpc races between transport connect/disconnect and shutdown
- Fix an infinite loop when layoutget fail with BAD_STATEID
- nfs/filelayout: Fix NULL reference caused by double freeing of fh_array
- Fix a bogus WARN_ON_ONCE() in O_DIRECT when layout commit_through_mds is set
- Fix layoutreturn/close ordering issues"
* tag 'nfs-for-4.3-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFS41: make close wait for layoutreturn
NFS: Skip checking ds_cinfo.buckets when lseg's commit_through_mds is set
NFSv4.x/pnfs: Don't try to recover stateids twice in layoutget
NFSv4: Recovery of recalled read delegations is broken
NFS: Fix an infinite loop when layoutget fail with BAD_STATEID
NFS: Do cleanup before resetting pageio read/write to mds
SUNRPC: xs_sock_mark_closed() does not need to trigger socket autoclose
SUNRPC: Lock the transport layer on shutdown
nfs/filelayout: Fix NULL reference caused by double freeing of fh_array
SUNRPC: Ensure that we wait for connections to complete before retrying
SUNRPC: drop null test before destroy functions
nfs: fix v4.2 SEEK on files over 2 gigs
SUNRPC: Fix races between socket connection and destroy code
nfs: fix pg_test page count calculation
Failing to send a CLOSE if file is opened WRONLY and server reboots on a 4.x mount
Commit 96249d70dd70 ("IB/core: Guarantee that a local_dma_lkey
is available") allows ULPs that make use of the local dma key to keep
working as before by allocating a DMA MR with local permissions and
converted these consumers to use the MR associated with the PD
rather then device->local_dma_lkey.
ConnectIB has some known issues with memory registration
using the local_dma_lkey (SEND, RDMA, RECV seems to work ok).
Thus don't expose support for it (remove device->local_dma_lkey
setting), and take advantage of the above commit such that no regression
is introduced to working systems.
The local_dma_lkey support will be restored in CX4 depending on FW
capability query.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This patch adds a DF_READ_ONLY flag that is used by IBLOCK to
signal when a backend has been set to read-only mode, in order
to propigate read-only status up to core_tpg_add_lun() for all
future LUN fabric exports.
With this is place, existing emulation for reporting read-only
in spc_emulate_modesense() and normal transport_lookup_cmd_lun()
TCM_WRITE_PROTECTED status checking just works as expected.
Reported-by: Joeue Deng <joeue404@gmail.com>
Reported-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Add a phy_device_remove() function to complement phy_device_register(),
which undoes the effects of phy_device_register() by removing the phy
device from visibility, but not freeing it.
This allows these details to be moved out of the mdio bus code into
the phy code where this action belongs.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Re-implement the mdiobus module refcounting to ensure that we actually
ensure that the mdiobus module code does not go away while we might call
into it.
The old scheme using bus->dev.driver was buggy, because bus->dev is a
class device which never has a struct device_driver associated with it,
and hence the associated code trying to obtain a refcount did nothing
useful.
Instead, take the approach that other subsystems do: pass the module
when calling mdiobus_register(), and record that in the mii_bus struct.
When we need to increment the module use count in the phy code, use
this stored pointer. When the phy is deteched, drop the module
refcount, remembering that the phy device might go away at that point.
This doesn't stop the mii_bus going away while there are in-use phys -
it merely stops the underlying code vanishing.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull thermal management fixes from Zhang Rui:
- Power allocator governor changes to allow binding on thermal zones
with missing power estimates information. From Javi Merino.
- Add compile test flags on thermal drivers that allow it without
producing compilation errors. From Eduardo Valentin.
- Fixes around memory allocation on cpu_cooling. From Javi Merino.
- Fix on db8500 cpufreq code to allow autoload. From Luis de
Bethencourt.
- Maintainer entries for cpu cooling device
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
thermal: power_allocator: exit early if there are no cooling devices
thermal: power_allocator: don't require tzp to be present for the thermal zone
thermal: power_allocator: relax the requirement of two passive trip points
thermal: power_allocator: relax the requirement of a sustainable_power in tzp
thermal: Add a function to get the minimum power
thermal: cpu_cooling: free power table on error or when unregistering
thermal: cpu_cooling: don't call kcalloc() under rcu_read_lock
thermal: db8500_cpufreq_cooling: Fix module autoload for OF platform driver
thermal: cpu_cooling: Add MAINTAINERS entry
thermal: ti-soc: Kconfig fix to avoid menu showing wrongly
thermal: ti-soc: allow compile test
thermal: qcom_spmi: allow compile test
thermal: exynos: allow compile test
thermal: armada: allow compile test
thermal: dove: allow compile test
thermal: kirkwood: allow compile test
thermal: rockchip: allow compile test
thermal: spear: allow compile test
thermal: hisi: allow compile test
thermal: Fix thermal_zone_of_sensor_register to match documentation
Merge misc fixes from Andrew Morton:
"15 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
ocfs2/dlm: fix deadlock when dispatch assert master
membarrier: clean up selftest
vmscan: fix sane_reclaim helper for legacy memcg
lib/iommu-common.c: do not try to deref a null iommu->lazy_flush() pointer when n < pool->hint
x86, efi, kasan: #undef memset/memcpy/memmove per arch
mm: migrate: hugetlb: putback destination hugepage to active list
mm, dax: VMA with vm_ops->pfn_mkwrite wants to be write-notified
userfaultfd: register uapi generic syscall (aarch64)
userfaultfd: selftest: don't error out if pthread_mutex_t isn't identical
userfaultfd: selftest: return an error if BOUNCE_VERIFY fails
userfaultfd: selftest: avoid my_bcmp false positives with powerpc
userfaultfd: selftest: only warn if __NR_userfaultfd is undefined
userfaultfd: selftest: headers fixup
userfaultfd: selftests: vm: pick up sanitized kernel headers
userfaultfd: revert "userfaultfd: waitqueue: add nr wake parameter to __wake_up_locked_key"
The UDP tunnel config is asymmetric wrt. to the ports used. The source and
destination ports from one direction of the tunnel are not related to the
ports of the other direction. We need to be able to respond to ARP requests
using the correct ports without involving routing.
As the consequence, UDP ports need to be fixed property of the tunnel
interface and cannot be set per route. Remove the ability to set ports per
route. This is still okay to do, as no kernel has been released with these
attributes yet.
Note that the ability to specify source and destination ports is preserved
for other users of the lwtunnel API which don't use routes for tunnel key
specification (like openvswitch).
If in the future we rework ARP handling to allow port specification, the
attributes can be added back.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
When using ip lwtunnels, the additional data for xmit (basically, the actual
tunnel to use) are carried in ip_tunnel_info either in dst->lwtstate or in
metadata dst. When replying to ARP requests, we need to send the reply to
the same tunnel the request came from. This means we need to construct
proper metadata dst for ARP replies.
We could perform another route lookup to get a dst entry with the correct
lwtstate. However, this won't always ensure that the outgoing tunnel is the
same as the incoming one, and it won't work anyway for IPv4 duplicate
address detection.
The only thing to do is to "reverse" the ip_tunnel_info.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
VXLAN device can receive skb with checksum partial. But the checksum
offset could be in outer header which is pulled on receive. This results
in negative checksum offset for the skb. Such skb can cause the assert
failure in skb_checksum_help(). Following patch fixes the bug by setting
checksum-none while pulling outer header.
Following is the kernel panic msg from old kernel hitting the bug.
------------[ cut here ]------------
kernel BUG at net/core/dev.c:1906!
RIP: 0010:[<ffffffff81518034>] skb_checksum_help+0x144/0x150
Call Trace:
<IRQ>
[<ffffffffa0164c28>] queue_userspace_packet+0x408/0x470 [openvswitch]
[<ffffffffa016614d>] ovs_dp_upcall+0x5d/0x60 [openvswitch]
[<ffffffffa0166236>] ovs_dp_process_packet_with_key+0xe6/0x100 [openvswitch]
[<ffffffffa016629b>] ovs_dp_process_received_packet+0x4b/0x80 [openvswitch]
[<ffffffffa016c51a>] ovs_vport_receive+0x2a/0x30 [openvswitch]
[<ffffffffa0171383>] vxlan_rcv+0x53/0x60 [openvswitch]
[<ffffffffa01734cb>] vxlan_udp_encap_recv+0x8b/0xf0 [openvswitch]
[<ffffffff8157addc>] udp_queue_rcv_skb+0x2dc/0x3b0
[<ffffffff8157b56f>] __udp4_lib_rcv+0x1cf/0x6c0
[<ffffffff8157ba7a>] udp_rcv+0x1a/0x20
[<ffffffff8154fdbd>] ip_local_deliver_finish+0xdd/0x280
[<ffffffff81550128>] ip_local_deliver+0x88/0x90
[<ffffffff8154fa7d>] ip_rcv_finish+0x10d/0x370
[<ffffffff81550365>] ip_rcv+0x235/0x300
[<ffffffff8151ba1d>] __netif_receive_skb+0x55d/0x620
[<ffffffff8151c360>] netif_receive_skb+0x80/0x90
[<ffffffff81459935>] virtnet_poll+0x555/0x6f0
[<ffffffff8151cd04>] net_rx_action+0x134/0x290
[<ffffffff810683d8>] __do_softirq+0xa8/0x210
[<ffffffff8162fe6c>] call_softirq+0x1c/0x30
[<ffffffff810161a5>] do_softirq+0x65/0xa0
[<ffffffff810687be>] irq_exit+0x8e/0xb0
[<ffffffff81630733>] do_IRQ+0x63/0xe0
[<ffffffff81625f2e>] common_interrupt+0x6e/0x6e
Reported-by: Anupam Chanda <achanda@vmware.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
inode_cgwb_enabled() gates cgroup writeback support. If it returns
true, each inode is attached to the corresponding memory domain which
gets mapped to io domain. It currently only tests whether the
filesystem and bdi support cgroup writeback; however, cgroup writeback
support doesn't work on traditional hierarchies and thus it should
also test whether memcg and iocg are on the default hierarchy.
This caused traditional hierarchy setups to hit the cgroup writeback
path inadvertently and ended up creating separate writeback domains
for each memcg and mapping them all to the root iocg uncovering a
couple issues in the cgroup writeback path.
cgroup writeback was never meant to be enabled on traditional
hierarchies. Make inode_cgwb_enabled() test whether both memcg and
iocg are on the default hierarchy.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Artem Bityutskiy <dedekind1@gmail.com>
Reported-by: Dexuan Cui <decui@microsoft.com>
Link: http://lkml.kernel.org/g/1443012552.19983.209.camel@gmail.com
Link: http://lkml.kernel.org/g/f30d4a6aa8a546ff88f73021d026a453@SIXPR30MB031.064d.mgd.msft.net
A disappointingly large collection of fixes for SPI issues, though
almost all in drivers (and there mainly the newly added Mediatek
driver) and the core fixes are documentation and error handling. The
driver fixes are all of the usual important if you see them variety.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJWBDUFAAoJECTWi3JdVIfQir4H/RIAsfP5QqSzjQHmeZk1eHJv
FIzUiHARK5Jy3ORXVCh3dh8JB67NF9Qa4p/fDcHhk0DFju+HTyvUHmChzGPwEw86
0lRv2PKhHu9e7vJG8IZXKbZKeBT9RtrVe8yQ7SLmQ+z0VxoVFaQwkWVKotzpL8wZ
YCOYGAtmxXvqWDiGuhzqG7RVLKW6vj8xz2BFqm5Gf6O32RpV9wFiNp2EtF8Hu+On
sMEqFWDqMbqIwUhcPKRI9+Zhj1TkzwNUawE+EgD4ydYVndYxSpqtn1veFc1Bv1xo
1FbntlDu/AzlPqtIFBzWLZNUxcwW+qKSjOCFlyCs+k1l6CEf+AoZAm1TsL+rVTw=
=3QsX
-----END PGP SIGNATURE-----
Merge tag 'spi-fix-v4.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A disappointingly large collection of fixes for SPI issues, though
almost all in drivers (and there mainly the newly added Mediatek
driver) and the core fixes are documentation and error handling.
The driver fixes are all of the usual 'important if you see them'
variety"
* tag 'spi-fix-v4.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: xtensa-xtfpga: fix register endianness
spi: meson: Fix module autoload for OF platform driver
spi: mediatek: fix wrong error return value on probe
spi: fix kernel-doc warnings in spi.h
spi: spidev: fix possible NULL dereference
spi: atmel: remove warning when !CONFIG_PM_SLEEP
spi: bcm2835: BUG: fix wrong use of PAGE_MASK
spi: mediatek: fix spi cs polarity error
spi: Fix documentation of spi_alloc_master()
spi: spi-pxa2xx: Check status register to determine if SSSR_TINT is disabled
spi: Mediatek: Document devicetree bindings update for spi bus
spi: mediatek: fix spi clock usage error
spi: mediatek: remove clk_disable_unprepare()
Drivers might call napi_disable while not holding the napi instance poll_lock.
In those instances, its possible for a race condition to exist between
poll_one_napi and napi_disable. That is to say, poll_one_napi only tests the
NAPI_STATE_SCHED bit to see if there is work to do during a poll, and as such
the following may happen:
CPU0 CPU1
ndo_tx_timeout napi_poll_dev
napi_disable poll_one_napi
test_and_set_bit (ret 0)
test_bit (ret 1)
reset adapter napi_poll_routine
If the adapter gets a tx timeout without a napi instance scheduled, its possible
for the adapter to think it has exclusive access to the hardware (as the napi
instance is now scheduled via the napi_disable call), while the netpoll code
thinks there is simply work to do. The result is parallel hardware access
leading to corrupt data structures in the driver, and a crash.
Additionaly, there is another, more critical race between netpoll and
napi_disable. The disabled napi state is actually identical to the scheduled
state for a given napi instance. The implication being that, if a napi instance
is disabled, a netconsole instance would see the napi state of the device as
having been scheduled, and poll it, likely while the driver was dong something
requiring exclusive access. In the case above, its fairly clear that not having
the rings in a state ready to be polled will cause any number of crashes.
The fix should be pretty easy. netpoll uses its own bit to indicate that that
the napi instance is in a state of being serviced by netpoll (NAPI_STATE_NPSVC).
We can just gate disabling on that bit as well as the sched bit. That should
prevent netpoll from conducting a napi poll if we convert its set bit to a
test_and_set_bit operation to provide mutual exclusion
Change notes:
V2)
Remove a trailing whtiespace
Resubmit with proper subject prefix
V3)
Clean up spacing nits
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: jmaxwell@redhat.com
Tested-by: jmaxwell@redhat.com
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the userfaultfd syscalls to uapi asm-generic, it was tested with
postcopy live migration on aarch64 with both 4k and 64k pagesize
kernels.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Shuah Khan <shuahkh@osg.samsung.com>
Cc: Thierry Reding <treding@nvidia.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This reverts commit 51360155eccb907ff8635bd10fc7de876408c2e0 and adapts
fs/userfaultfd.c to use the old version of that function.
It didn't look robust to call __wake_up_common with "nr == 1" when we
absolutely require wakeall semantics, but we've full control of what we
insert in the two waitqueue heads of the blocked userfaults. No
exclusive waitqueue risks to be inserted into those two waitqueue heads
so we can as well stick to "nr == 1" of the old code and we can rely
purely on the fact no waitqueue inserted in one of the two waitqueue
heads we must enforce as wakeall, has wait->flags WQ_FLAG_EXCLUSIVE set.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Shuah Khan <shuahkh@osg.samsung.com>
Cc: Thierry Reding <treding@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull cgroup fixes from Tejun Heo:
"The threadgroup locking changes which went in during 4.2 devel cycle
added write locking of a percpu_rwsem in cgroup task migration path;
unfortunately, that involved expedited rcu syncing which turned out to
be too slow and heavy for certain workloads. The patchset which is
dependent on this one didn't get committed during that devel cycle, so
these two patches can be reverted safely.
Oleg reworked percpu_rwsem for 4.4 so that the writer path is a lot
lighter. The reported issue goes away with Oleg's reworked
percpu_rwsem and I'll reapply these patches on the for-4.4 branch so
that they can land together with Oleg's changes"
* 'for-4.3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
Revert "sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem"
Revert "cgroup: simplify threadgroup locking"
When creating a timewait socket, we need to arm the timer before
allowing other cpus to find it. The signal allowing cpus to find
the socket is setting tw_refcnt to non zero value.
As we set tw_refcnt in __inet_twsk_hashdance(), we therefore need to
call inet_twsk_schedule() first.
This also means we need to remove tw_refcnt changes from
inet_twsk_schedule() and let the caller handle it.
Note that because we use mod_timer_pinned(), we have the guarantee
the timer wont expire before we set tw_refcnt as we run in BH context.
To make things more readable I introduced inet_twsk_reschedule() helper.
When rearming the timer, we can use mod_timer_pending() to make sure
we do not rearm a canceled timer.
Note: This bug can possibly trigger if packets of a flow can hit
multiple cpus. This does not normally happen, unless flow steering
is broken somehow. This explains this bug was spotted ~5 months after
its introduction.
A similar fix is needed for SYN_RECV sockets in reqsk_queue_hash_req(),
but will be provided in a separate patch for proper tracking.
Fixes: 789f558cfb36 ("tcp/dccp: get rid of central timewait timer")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Ying Cai <ycai@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patch contains Netfilter fixes for your net tree, they are:
1) nf_log_unregister() should only set to NULL the logger that is being
unregistered, instead of everything else. Patch from Florian Westphal.
2) Fix a crash when accessing physoutdev from PREROUTING in br_netfilter.
This is partially reverting the patch to shrink nf_bridge_info to 32 bytes.
Also from Florian.
3) Use existing match/target extensions in the internal nft_compat extension
lists when the extension is family unspecific (ie. NFPROTO_UNSPEC).
4) Wait for rcu grace period before leaving nf_log_unregister().
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Man page of ip-route(8) says following about route types:
unreachable - these destinations are unreachable. Packets are dis‐
carded and the ICMP message host unreachable is generated. The local
senders get an EHOSTUNREACH error.
blackhole - these destinations are unreachable. Packets are dis‐
carded silently. The local senders get an EINVAL error.
prohibit - these destinations are unreachable. Packets are discarded
and the ICMP message communication administratively prohibited is
generated. The local senders get an EACCES error.
In the inet6 address family, this was correct, except the local senders
got ENETUNREACH error instead of EHOSTUNREACH in case of unreachable route.
In the inet address family, all three route types generated ICMP message
net unreachable, and the local senders got ENETUNREACH error.
In both address families all three route types now behave consistently
with documentation.
Signed-off-by: Nikola Forró <nforro@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Code like this in inline functions confuses some recent versions of gcc:
const int n = const-expr;
whatever_t array[n];
For more details, see:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67055#c13
This compiler bug results in the following failure after 114b7fd4b (rcu:
Create rcu_sync infrastructure):
In file included from include/linux/rcupdate.h:429:0,
from include/linux/rcu_sync.h:5,
from kernel/rcu/sync.c:1:
include/linux/rcutiny.h: In function 'rcu_barrier_sched':
include/linux/rcutiny.h:55:20: internal compiler error: Segmentation fault
static inline void rcu_barrier_sched(void)
This commit therefore eliminates the constant local variable in favor of
direct use of the expression.
Reported-and-tested-by: Mark Salter <msalter@redhat.com>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
- Fix a memory allocation size in the devfreq core (Xiaolong Ye).
- Fix a mistake in the exynos-ppmu DT binding (Javier Martinez
Canillas).
- Add support for PPMUv2 ((Platform Performance Monitoring Unit
version 2.0) on the Exynos5433 SoCs (Chanwoo Choi).
- Fix a type casting bug in the Exynos PPMU code (MyungJoo Ham).
- Assorted devfreq code cleanups and optimizations (Javi Merino,
MyungJoo Ham, Viresh Kumar).
- Fix up the ACPI cpufreq driver to use a more lightweight way
to get to its private data in the ->get() callback (Rafael J
Wysocki).
- Fix a CONFIG_ prefix bug in one of the ACPI drivers and make
the ACPI subsystem use IS_ENABLED() instead of #ifdefs in
function bodies (Sudeep Holla).
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJV/IoLAAoJEILEb/54YlRxoWEP/3P7xbQ0FETVcuY8TqqauH0r
AgmXqz9XYKgLwBEVnElVZE7fFu+5u5UiwQYv1wGePfR5tXQ9gxFYjh7RxmnFQVPV
nJaPWO00ukiy8BugPnLepoEiJe+feoyAqyGoEmbaJPFiqt6lwVDY6UPatZLu9cP5
/9oD8623hpvk73w9FRYBMY4cXewSl5QbzRUc9NhSlWGmZOOokWFsG7aKletVyWHa
LQJ5JbrsNaR4y1mt6eIwKMNWeO+N/C8e45fcVOGjAR9H7C5b3eNByoTsKoFjEvn9
4yuIkJu8X4hFDVVH8J359hAYKNvdiHLmTMcUAmufO/BH3M2Bc2SwBdnLbg3xnWMF
xQkSYZT2E+uiYPOy/l98s75rN1bFj5MH+zy0dz6Mkz/lWSMh5CNYZoVH42h1h24S
fYKO4KmxZmf+Ufx+zv27shvi4I322bJ/AEPLYFv6LWETHeHYegF+YOlLiDdUQ0B/
UjLVkj/nLhToyG02Q5VlYGtOY3XBNOLuYM+3gMLiCdFRjsUN9d7FawZtbVGlF6ej
58uZgDIGeRaacEvgAUWSfqHf1BT6l/YUekz/CfdTLCOxnPMEzBC5zDxdZ/P4HoRV
BEKuUYPr4dEt/miKM7vCPop2C50o3dqXbD+MyzN0XuxURsxyAcmLiNdzi8lLorRt
aalkUNTfoKuo7Oje4AB4
=+ZST
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-4.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI updates from Rafael Wysocki:
"Included are: a somewhat late devfreq update which however is mostly
fixes and cleanups with one new thing only (the PPMUv2 support on
Exynos5433), an ACPI cpufreq driver fixup and two ACPI core cleanups
related to preprocessor directives.
Specifics:
- Fix a memory allocation size in the devfreq core (Xiaolong Ye).
- Fix a mistake in the exynos-ppmu DT binding (Javier Martinez
Canillas).
- Add support for PPMUv2 ((Platform Performance Monitoring Unit
version 2.0) on the Exynos5433 SoCs (Chanwoo Choi).
- Fix a type casting bug in the Exynos PPMU code (MyungJoo Ham).
- Assorted devfreq code cleanups and optimizations (Javi Merino,
MyungJoo Ham, Viresh Kumar).
- Fix up the ACPI cpufreq driver to use a more lightweight way to get
to its private data in the ->get() callback (Rafael J Wysocki).
- Fix a CONFIG_ prefix bug in one of the ACPI drivers and make the
ACPI subsystem use IS_ENABLED() instead of #ifdefs in function
bodies (Sudeep Holla)"
* tag 'pm+acpi-4.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: acpi-cpufreq: Use cpufreq_cpu_get_raw() in ->get()
ACPI: Eliminate CONFIG_.*{, _MODULE} #ifdef in favor of IS_ENABLED()
ACPI: int340x_thermal: add missing CONFIG_ prefix
PM / devfreq: Fix incorrect type issue.
PM / devfreq: tegra: Update governor to use devfreq_update_stats()
PM / devfreq: comments for get_dev_status usage updated
PM / devfreq: drop comment about thermal setting max_freq
PM / devfreq: cache the last call to get_dev_status()
PM / devfreq: Drop unlikely before IS_ERR(_OR_NULL)
PM / devfreq: exynos-ppmu: bit-wise operation bugfix.
PM / devfreq: exynos-ppmu: Update documentation to support PPMUv2
PM / devfreq: exynos-ppmu: Add the support of PPMUv2 for Exynos5433
PM / devfreq: event: Remove incorrect property in exynos-ppmu DT binding
- Batch of minor fixups to the new hfi1 driver
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJV/DRlAAoJELgmozMOVy/ddPYP/2PTkdkFzyX/+pl0gc99NnkM
HAF/1+qaDBT5VdQx0da7UN1HMS0yUGJo6BIK5cDGb28qSmey0wyRpGiFHDJzSODv
Vz/uDz2pPJjc2kEll+wxPnJYDsZAK5pZ3M4bNUdqlKVEkT4QFzdTgfg1WCTMjVlt
VKwQB1x2dxLhCoZ6xxQr4PIVcwTmN7AoBzBZC+iywsvQL7T11e0zzEVDnLJy1CPG
iaVwGw3+dlGetl6zT8Db5fG7mN8LiLS3PyF1nW6v6lToznLesFzDPSDbKxh9Feqb
He4Pmm+CvCbjJ/KZQ1AU1jR1ciSza17iR+3s5TKF71oaH+4a9kC4pmCT5faSBdUu
RuWclexIJACyis537QLyK0ymLlRJOHip2c6k87ZJlu5Ce4L2LZIwhRlubxfld54K
lp+wv7lmHuLnGKvBvRZP01rNx0/bn4fdH/ZMhgnsppLpZ/9dvtWhDgkPhe+FEYIy
YNBIAR54zhtIKQuxBiTrVOtt7HczxXCgevJ6/VxJbWF0JwtRSzsKosRovnhf3UDT
XnnnJbjTryIM7oprWynUiY1z4e59C/0u2nayMYWd8btk47Ml+fpiRCh+CW1RJEol
Liscnyb6ox0TCb1jajTCO4C3cfH3djTAf5WlSaTzjafwa2ikRfkwsu8o3dPZJgnG
GqDrapAWCeZurwdR0GCJ
=hsxv
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull rdma fixes from Doug Ledford:
"The new hfi1 driver in staging/rdma has had a number of fixup patches
since being added to the tree. This is the first batch of those fixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
IB/hfi: Properly set permissions for user device files
IB/hfi1: mask vs shift confusion
IB/hfi1: clean up some defines
IB/hfi1: info leak in get_ctxt_info()
IB/hfi1: fix a locking bug
IB/hfi1: checking for NULL instead of IS_ERR
IB/hfi1: fix sdma_descq_cnt parameter parsing
IB/hfi1: fix copy_to/from_user() error handling
IB/hfi1: fix pstateinfo from returning improperly byteswapped value
Pull libnvdimm fixes from Dan Williams:
- a boot regression (since v4.2) fix for some ARM configurations from
Tyler
- regression (since v4.1) fixes for mkfs.xfs on a DAX enabled device
from Jeff. These are tagged for -stable.
- a pair of locking fixes from Axel that are hidden from lockdep since
they involve device_lock(). The "btt" one is tagged for -stable, the
other only applies to the new "pfn" mechanism in v4.3.
- a fix for the pmem ->rw_page() path to use wmb_pmem() from Ross.
* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
mm: fix type cast in __pfn_to_phys()
pmem: add proper fencing to pmem_rw_page()
libnvdimm: pfn_devs: Fix locking in namespace_store
libnvdimm: btt_devs: Fix locking in namespace_store
blockdev: don't set S_DAX for misaligned partitions
dax: fix O_DIRECT I/O to the last block of a blockdev
Pull block updates from Jens Axboe:
"This is a bit bigger than it should be, but I could (did) not want to
send it off last week due to both wanting extra testing, and expecting
a fix for the bounce regression as well. In any case, this contains:
- Fix for the blk-merge.c compilation warning on gcc 5.x from me.
- A set of back/front SG gap merge fixes, from me and from Sagi.
This ensures that we honor SG gapping for integrity payloads as
well.
- Two small fixes for null_blk from Matias, fixing a leak and a
capacity propagation issue.
- A blkcg fix from Tejun, fixing a NULL dereference.
- A fast clone optimization from Ming, fixing a performance
regression since the arbitrarily sized bio's were introduced.
- Also from Ming, a regression fix for bouncing IOs"
* 'for-linus' of git://git.kernel.dk/linux-block:
block: fix bounce_end_io
block: blk-merge: fast-clone bio when splitting rw bios
block: blkg_destroy_all() should clear q->root_blkg and ->root_rl.blkg
block: Copy a user iovec if it includes gaps
block: Refuse adding appending a gapped integrity page to a bio
block: Refuse request/bio merges with gaps in the integrity payload
block: Check for gaps on front and back merges
null_blk: fix wrong capacity when bs is not 512 bytes
null_blk: fix memory leak on cleanup
block: fix bogus compiler warnings in blk-merge.c
The various definitions of __pfn_to_phys() have been consolidated to
use a generic macro in include/asm-generic/memory_model.h. This hit
mainline in the form of 012dcef3f058 "mm: move __phys_to_pfn and
__pfn_to_phys to asm/generic/memory_model.h". When the generic macro
was implemented the type cast to phys_addr_t was dropped which caused
boot regressions on ARM platforms with more than 4GB of memory and
LPAE enabled.
It was suggested to use PFN_PHYS() defined in include/linux/pfn.h
as provides the correct logic and avoids further duplication.
Reported-by: kernelci.org bot <bot@kernelci.org>
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Tyler Baker <tyler.baker@linaro.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* pm-cpufreq:
cpufreq: acpi-cpufreq: Use cpufreq_cpu_get_raw() in ->get()
* pm-devfreq:
PM / devfreq: Fix incorrect type issue.
PM / devfreq: tegra: Update governor to use devfreq_update_stats()
PM / devfreq: comments for get_dev_status usage updated
PM / devfreq: drop comment about thermal setting max_freq
PM / devfreq: cache the last call to get_dev_status()
PM / devfreq: Drop unlikely before IS_ERR(_OR_NULL)
PM / devfreq: exynos-ppmu: bit-wise operation bugfix.
PM / devfreq: exynos-ppmu: Update documentation to support PPMUv2
PM / devfreq: exynos-ppmu: Add the support of PPMUv2 for Exynos5433
PM / devfreq: event: Remove incorrect property in exynos-ppmu DT binding