722724 Commits

Author SHA1 Message Date
7a4fa29106 net: sched: Add TCA_HW_OFFLOAD
Qdiscs can be offloaded to HW, but current implementation isn't uniform.
Instead, qdiscs either pass information about offload status via their
TCA_OPTIONS or omit it altogether.

Introduce a new attribute - TCA_HW_OFFLOAD that would form a uniform
uAPI for the offloading status of qdiscs.

Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 13:35:36 -05:00
0a0606970f Merge branch 'aquantia-fixes'
Igor Russkikh says:

====================
net: aquantia: Atlantic driver 12/2017 updates

The patchset contains important hardware fix for machines with large MRRS
and couple of improvement in stats and capabilities reporting

patch v3:
 - Fixed patch #7 after Andrew's finding. NIC level stats actually
   have to be cleaned only on hw struct creation (and this is done
   in kzalloc). On each hwinit we only have to reset link state
   to make sure hw stats update will not increment nic stats during init.

patch v2:
 - split into more detailed commits

Comment from David on wrong defines case will be submitted separately later
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 12:46:43 -05:00
d4c242d4ba net: aquantia: Increment driver version
Add a suffix to distinguish kernel mainline version and aquantia releases

Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 12:46:42 -05:00
98bc036de4 net: aquantia: Fix typo in ethtool statistics names
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 12:46:42 -05:00
f3e2778429 net: aquantia: Update hw counters on hw init
On very first start we should read out current HW counter values
to make diff based calculations later.
This also should be done each time NIC gets down/up or wakes up
after sleep state. We reset link state explicitly to prevent diffs
from being summed this first time.

Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 12:46:42 -05:00
fdb4a0830e net: aquantia: Improve link state and statistics check interval callback
Reduce timeout from 2 secs to 1 sec. If link is down,
reduce it to 500msec. This speeds up link detection.

Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 12:46:42 -05:00
45cc1c7ad4 net: aquantia: Fill in multicast counter in ndev stats from hardware
This metric comes from HW and is also diff-calculated, like other counters

Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 12:46:42 -05:00
9f8a2203a5 net: aquantia: Fill ndev stat couters from hardware
Originally they were filled from ring sw counters.
These sometimes incorrectly calculate byte and packet amounts
when using LRO/LSO and jumboframes. Filling ndev counters from
hardware makes them precise.

Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 12:46:42 -05:00
be08d839d9 net: aquantia: Extend stat counters to 64bit values
Device hardware provides only 32bit counters. Using these directly
causes byte counters to overflow soon. A separate nic level structure
with 64 bit counters is now used to collect incrementally all the stats
and report these counters to ethtool stats and ndev stats.

Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 12:46:41 -05:00
1e36616151 net: aquantia: Fix hardware DMA stream overload on large MRRS
Systems with large MRRS on device (2K, 4K) with high data rates and/or
large MTU, atlantic observes DMA packet buffer overflow. On some systems
that causes PCIe transaction errors, hardware NMIs or datapath freeze.
This patch
1) Limits MRRS from device side to 2K (thats maximum our hardware supports)
2) Limit maximum size of outstanding TX DMA data read requests. This makes
hardware buffers running fine.

Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 12:46:41 -05:00
e4d02ca04c net: aquantia: Fix actual speed capabilities reporting
Different hardware device Ids correspond to different maximum speed
available. Extra checks were added for devices D108 and D109 to
remove unsupported speeds from these device capabilities list.

Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 12:46:41 -05:00
35b99dffc3 sock: free skb in skb_complete_tx_timestamp on error
skb_complete_tx_timestamp must ingest the skb it is passed. Call
kfree_skb if the skb cannot be enqueued.

Fixes: b245be1f4db1 ("net-timestamp: no-payload only sysctl")
Fixes: 9ac25fc06375 ("net: fix socket refcounting in skb_complete_tx_timestamp()")
Reported-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 11:30:36 -05:00
d9356edc44 Merge branch 's390-fixes'
Julian Wiedmann says:

====================
s390/qeth: fixes 2017-12-13

some more patches for 4.15, that fix multiple issues with IP Takeover
configuration in qeth.
Please queue them up for stable kernels as well (4.9 and newer).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 11:29:44 -05:00
02f510f326 s390/qeth: update takeover IPs after configuration change
Any modification to the takeover IP-ranges requires that we re-evaluate
which IP addresses are takeover-eligible. Otherwise we might do takeover
for some addresses when we no longer should, or vice-versa.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 11:29:43 -05:00
8a03a3692b s390/qeth: lock IP table while applying takeover changes
Modifying the flags of an IP addr object needs to be protected against
eg. concurrent removal of the same object from the IP table.

Fixes: 5f78e29ceebf ("qeth: optimize IP handling in rx_mode callback")
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 11:29:43 -05:00
b22d73d668 s390/qeth: don't apply takeover changes to RXIP
When takeover is switched off, current code clears the 'TAKEOVER' flag on
all IPs. But the flag is also used for RXIP addresses, and those should
not be affected by the takeover mode.
Fix the behaviour by consistenly applying takover logic to NORMAL
addresses only.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 11:29:43 -05:00
7fbd9493f0 s390/qeth: apply takeover changes when mode is toggled
Just as for an explicit enable/disable, toggling the takeover mode also
requires that the IP addresses get updated. Otherwise all IPs that were
added to the table before the mode-toggle, get registered with the old
settings.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 11:29:42 -05:00
0f546ffcd0 Here are some batman-adv bugfixes:
- Initialize the fragment headers, by Sven Eckelmann
 
  - Fix a NULL check in BATMAN V, by Sven Eckelmann
 
  - Fix kernel doc for the time_setup() change, by Sven Eckelmann
 
  - Use the right lock in BATMAN IV OGM Update, by Sven Eckelmann
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEE1ilQI7G+y+fdhnrfoSvjmEKSnqEFAlozsZQWHHN3QHNpbW9u
 d3VuZGVybGljaC5kZQAKCRChK+OYQpKeoZ1uD/94DURJf4b8kgUDjgMfuKZZyafH
 c4N8SI+3+gZExtgXbb48f91a9S5T7qIAm46c7nHNJpTLEhc/ucup3c06i8VjBc2r
 Tz37jnnjrHzfbw0p10zpLkdDlU0mLtEQBpNGQxru8h2x3gs2RBfDy/yX7MzzAp4m
 maZ3ub0wPAtlRKQ4OywTStP9ZP7BXj4/puCRrn3kPIKDc1Ahwh4dS4r4euSvvR6z
 2Qkfnpb3sbzh1Sa526GNAwgEo6hpssOugQyCIZUK4YJh1OfjnS+xJHWmfBkLw7q9
 OfkuwAjkZMivrcxK/lvdMJ6WltQUyWVkKxDuA6iAtwRV4tX5W5b+9OEToZlHuPid
 11bDXDfmWTfnpYM7uTT7tRbjFpvJ9X4ftUNHHtlL+h+VRz8Uf0xcteBIi7FxHHm5
 fRqMLj5J0WU5rVf2vKTz2XIi5dXWgy/bkoPzalMc8UxWVsKthr34UZmcPRDVvCja
 bwad1H+EEelx4tvPeWTEEhdhLFvalmfgRG3i/paqr547ETllmO462kdUTOmxDnvT
 76GZVsLeHBPrnZi/tqc230EyoZijiM/l41twOZPjb0cklvOEFYR0vs/BUfmvdM/3
 xlUF+CBaVnHcY3HDn7ywVICWP6jnG+LOQ4tOL6jjsKiqbEpPHRHuQ++MnZ2uRUgs
 5e/yBVxdEjo9RoLVsg==
 =US7g
 -----END PGP SIGNATURE-----

Merge tag 'batadv-net-for-davem-20171215' of git://git.open-mesh.org/linux-merge

Simon Wunderlich says:

====================
Here are some batman-adv bugfixes:

 - Initialize the fragment headers, by Sven Eckelmann

 - Fix a NULL check in BATMAN V, by Sven Eckelmann

 - Fix kernel doc for the time_setup() change, by Sven Eckelmann

 - Use the right lock in BATMAN IV OGM Update, by Sven Eckelmann
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 11:02:11 -05:00
fccff08628 mlxsw: spectrum: Disable MAC learning for ovs port
Learning is currently enabled for ports which are OVS slaves -
even though OVS doesn't need this indication.
Since we're not associating a fid with the port, HW would continuously
notify driver of learned [& aged] MACs which would be logged as errors.

Fixes: 2b94e58df58c ("mlxsw: spectrum: Allow ports to work under OVS master")
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 10:47:36 -05:00
8c8f67a46f Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2017-12-13

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Addition of explicit scheduling points to map alloc/free
   in order to avoid having to hold the CPU for too long,
   from Eric.

2) Fixing of a corruption in overlapping perf_event_output
   calls from different BPF prog types on the same CPU out
   of different contexts, from Daniel.

3) Fallout fixes for recent correction of broken uapi for
   BPF_PROG_TYPE_PERF_EVENT. um had a missing asm header
   that needed to be pulled in from asm-generic and for
   BPF selftests the asm-generic include did not work,
   so similar asm include scheme was adapted for that
   problematic header that perf is having with other
   header files under tools, from Daniel.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13 17:30:04 -05:00
f6e168b4a1 Merge branch 'mlx4-misc-fixes'
Tariq Toukan says:

====================
mlx4 misc fixes

This patchset contains misc bug fixes from the team
to the mlx4 Core and Eth drivers.

Patch 1 by Eugenia fixes an MTU issue in selftest.
Patch 2 by Eran fixes an accounting issue in the resource tracker.
Patch 3 by Eran fixes a race condition that causes counter inconsistency.

Series generated against net commit:
200809716aed fou: fix some member types in guehdr

v2:
Patch 2: Add reviewer credit, rephrase commit message.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13 16:38:37 -05:00
5a1647c391 net/mlx4_en: Fill all counters under one call of stats lock
Before this patch, the stats_lock was acquired twice. In between the
locks Driver sent command to gather some more statistics (per priority
and counter statistics). If the stats lock was acquired by get
statistics NDO in between we would have report out of sync counters.

Fix this by collecting all stats from Firmware in advance and then
fill the Software structs under one lock.

Fixes: 0b131561a7d6 ("net/mlx4_en: Add Flow control statistics display via ethtool")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13 16:38:37 -05:00
0bb9fc4f54 net/mlx4_core: Fix wrong calculation of free counters
The field res_free indicates the total number of counters which are
available for allocation (reserved and unreserved). Fixed a bug where
the reserved counters were subtracted from res_free before any
allocation was performed.

Before this fix, free counters which were not reserved could not be
allocated.

Fixes: 9de92c60beaa ("net/mlx4_core: Adjust counter grant policy in the resource tracker")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13 16:38:36 -05:00
78034f5fdd net/mlx4_en: Fix selftest for small MTUs
Set the minimal MTU threshold for running loopback selftest.
MTU should be big enough to include packet payload, NET_IP_ALIGN,
Ethernet headers and preamble length.

Fixes: e7c1c2c46201 ("mlx4_en: Added self diagnostics test implementation")
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13 16:38:36 -05:00
de9c4e06bb net: phy: marvell: avoid configuring fiber page for SGMII-to-Copper
When in SGMII-to-Copper mode, the fiber page is used for the MAC facing
link, and does not require configuration of the fiber auto-negotiation
settings.  Avoid trying.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13 16:10:54 -05:00
53c64870d0 dwc-xlgmac: Add co-maintainer
Jose Abreu will join to maintain dwc-xlgmac.
He will help with new feature development for
this driver. Thanks Jose and welcome on board!

Signed-off-by: Jie Deng <jiedeng@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13 16:09:20 -05:00
4688eb7cf3 tcp: refresh tcp_mstamp from timers callbacks
Only the retransmit timer currently refreshes tcp_mstamp

We should do the same for delayed acks and keepalives.

Even if RFC 7323 does not request it, this is consistent to what linux
did in the past, when TS values were based on jiffies.

Fixes: 385e20706fac ("tcp: use tp->tcp_mstamp in output path")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Mike Maloney <maloney@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by:  Mike Maloney <maloney@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13 16:04:04 -05:00
9ee11bd03c tcp: fix potential underestimation on rcv_rtt
When ms timestamp is used, current logic uses 1us in
tcp_rcv_rtt_update() when the real rcv_rtt is within 1 - 999us.
This could cause rcv_rtt underestimation.
Fix it by always using a min value of 1ms if ms timestamp is used.

Fixes: 645f4c6f2ebd ("tcp: switch rcv_rtt_est and rcvq_space to high resolution timestamps")
Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13 16:01:17 -05:00
c009cb842f skge: remove redundunt free_irq under spinlock
The code to handle multi-port SKGE boards was freeing IRQ
twice. The first one was under lock and might sleep.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13 15:47:00 -05:00
3b3397e203 net: phy: meson-gxl: make function meson_gxl_read_status static
The function meson_gxl_read_status is local to the source and does
not need to be in global scope, so make it static.

Cleans up sparse warning:
symbol 'meson_gxl_read_status' was not declared. Should it be static?

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13 15:04:40 -05:00
94a5ef1b77 of_mdio / mdiobus: ensure mdio devices have fwnode correctly populated
Ensure that all mdio devices populate the struct device fwnode pointer
as well as the of_node pointer to allow drivers that wish to use
fwnode APIs to work.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13 15:01:47 -05:00
f5e64032a7 net: phy: fix resume handling
When a PHY has the BMCR_PDOWN bit set, it may decide to ignore writes
to other registers, or reset the registers to power-on defaults.
Micrel PHYs do this for their interrupt registers.

The current structure of phylib tries to enable interrupts before
resuming (and releasing) the BMCR_PDOWN bit.  This fails, causing
Micrel PHYs to stop working after a suspend/resume sequence if they
are using interrupts.

Fix this by ensuring that the PHY driver resume methods do not take
the phydev->lock mutex themselves, but the callers of phy_resume()
take that lock.  This then allows us to move the call to phy_resume()
before we enable interrupts in phy_start().

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13 15:00:10 -05:00
cd8165c3d5 ARM: dts: vf610-zii-dev: use XAUI for DSA link ports
Use XAUI rather than XGMII for DSA link ports, as this is the interface
mode that the switches actually use. XAUI is the 4 lane bus with clock
per direction, whereas XGMII is a 32 bit bus with clock.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13 14:59:15 -05:00
2e51a8dc7f net: dsa: allow XAUI phy interface mode
XGMII is a 32-bit bus plus two clock signals per direction.  XAUI is
four serial lanes per direction.  The 88e6190 supports XAUI but not
XGMII as it doesn't have enough pins.  The same is true of 88e6176.

Match on PHY_INTERFACE_MODE_XAUI for the XAUI port type, but keep
accepting XGMII for backwards compatibility.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13 14:59:15 -05:00
6e266610eb hippi: Fix a Fix a possible sleep-in-atomic bug in rr_close
The driver may sleep under a spinlock.
The function call path is:
rr_close (acquire the spinlock)
  free_irq --> may sleep

To fix it, free_irq is moved to the place without holding the spinlock.

This bug is found by my static analysis tool(DSAC) and checked by my code review.

Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13 14:52:57 -05:00
d6da83813f Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The follow patchset contains Netfilter fixes for your net tree,
they are:

1) Fix compilation warning in x_tables with clang due to useless
   redundant reassignment, from Colin Ian King.

2) Add bugtrap to net_exit to catch uninitialized lists, patch
   from Vasily Averin.

3) Fix out of bounds memory reads in H323 conntrack helper, this
   comes with an initial patch to remove replace the obscure
   CHECK_BOUND macro as a dependency. From Eric Sesterhenn.

4) Reduce retransmission timeout when window is 0 in TCP conntrack,
   from Florian Westphal.

6) ctnetlink clamp timeout to INT_MAX if timeout is too large,
   otherwise timeout wraps around and it results in killing the
   entry that is being added immediately.

7) Missing CAP_NET_ADMIN checks in cthelper and xt_osf, due to
   no netns support. From Kevin Cernekee.

8) Missing maximum number of instructions checks in xt_bpf, patch
   from Jann Horn.

9) With no CONFIG_PROC_FS ipt_CLUSTERIP compilation breaks,
   patch from Arnd Bergmann.

10) Missing netlink attribute policy in nftables exthdr, from
    Florian Westphal.

11) Enable conntrack with IPv6 MASQUERADE rules, as a357b3f80bc8
    should have done in first place, from Konstantin Khlebnikov.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13 14:12:20 -05:00
2a9ee696c7 net: ethernet: arc: fix error handling in emac_rockchip_probe
If clk_set_rate() fails, we should disable clk before return.
Found by Linux Driver Verification project (linuxtesting.org).

Changes since v2 [1]:
* Merged with latest code changes

Changes since v1:
Update made thanks to David's review, much appreciated David.
* Improved inconsistent failure handling of clock rate setting
* For completeness of usecase, added arc_emac_probe error handling

Signed-off-by: Branislav Radocaj <branislav@radocaj.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13 13:57:06 -05:00
aceef61ee5 net: qmi_wwan: add Sierra EM7565 1199:9091
Sierra Wireless EM7565 is an Qualcomm MDM9x50 based M.2 modem.
The USB id is added to qmi_wwan.c to allow QMI communication
with the EM7565.

Signed-off-by: Sebastian Sjoholm <ssjoholm@mac.com>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13 13:55:09 -05:00
a46182b002 net: igmp: Use correct source address on IGMPv3 reports
Closing a multicast socket after the final IPv4 address is deleted
from an interface can generate a membership report that uses the
source IP from a different interface.  The following test script, run
from an isolated netns, reproduces the issue:

    #!/bin/bash

    ip link add dummy0 type dummy
    ip link add dummy1 type dummy
    ip link set dummy0 up
    ip link set dummy1 up
    ip addr add 10.1.1.1/24 dev dummy0
    ip addr add 192.168.99.99/24 dev dummy1

    tcpdump -U -i dummy0 &
    socat EXEC:"sleep 2" \
        UDP4-DATAGRAM:239.101.1.68:8889,ip-add-membership=239.0.1.68:10.1.1.1 &

    sleep 1
    ip addr del 10.1.1.1/24 dev dummy0
    sleep 5
    kill %tcpdump

RFC 3376 specifies that the report must be sent with a valid IP source
address from the destination subnet, or from address 0.0.0.0.  Add an
extra check to make sure this is the case.

Signed-off-by: Kevin Cernekee <cernekee@chromium.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13 13:51:27 -05:00
c545a945d0 tipc: eliminate potential memory leak
In the function tipc_sk_mcast_rcv() we call refcount_dec(&skb->users)
on received sk_buffers. Since the reference counter might hit zero at
this point, we have a potential memory leak.

We fix this by replacing refcount_dec() with kfree_skb().

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13 13:44:36 -05:00
83593010d3 net: remove duplicate includes
These duplicate includes have been found with scripts/checkincludes.pl but
they have been removed manually to avoid removing false positives.

Signed-off-by: Pravin Shedge <pravin.shedge4linux@gmail.com>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13 13:18:46 -05:00
b5476022bb ipv4: igmp: guard against silly MTU values
IPv4 stack reacts to changes to small MTU, by disabling itself under
RTNL.

But there is a window where threads not using RTNL can see a wrong
device mtu. This can lead to surprises, in igmp code where it is
assumed the mtu is suitable.

Fix this by reading device mtu once and checking IPv4 minimal MTU.

This patch adds missing IPV4_MIN_MTU define, to not abuse
ETH_MIN_MTU anymore.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13 13:13:58 -05:00
b9b312a7a4 ipv6: mcast: better catch silly mtu values
syzkaller reported crashes in IPv6 stack [1]

Xin Long found that lo MTU was set to silly values.

IPv6 stack reacts to changes to small MTU, by disabling itself under
RTNL.

But there is a window where threads not using RTNL can see a wrong
device mtu. This can lead to surprises, in mld code where it is assumed
the mtu is suitable.

Fix this by reading device mtu once and checking IPv6 minimal MTU.

[1]
 skbuff: skb_over_panic: text:0000000010b86b8d len:196 put:20
 head:000000003b477e60 data:000000000e85441e tail:0xd4 end:0xc0 dev:lo
 ------------[ cut here ]------------
 kernel BUG at net/core/skbuff.c:104!
 invalid opcode: 0000 [#1] SMP KASAN
 Dumping ftrace buffer:
    (ftrace buffer empty)
 Modules linked in:
 CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.15.0-rc2-mm1+ #39
 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
 Google 01/01/2011
 RIP: 0010:skb_panic+0x15c/0x1f0 net/core/skbuff.c:100
 RSP: 0018:ffff8801db307508 EFLAGS: 00010286
 RAX: 0000000000000082 RBX: ffff8801c517e840 RCX: 0000000000000000
 RDX: 0000000000000082 RSI: 1ffff1003b660e61 RDI: ffffed003b660e95
 RBP: ffff8801db307570 R08: 1ffff1003b660e23 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff85bd4020
 R13: ffffffff84754ed2 R14: 0000000000000014 R15: ffff8801c4e26540
 FS:  0000000000000000(0000) GS:ffff8801db300000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000463610 CR3: 00000001c6698000 CR4: 00000000001406e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  <IRQ>
  skb_over_panic net/core/skbuff.c:109 [inline]
  skb_put+0x181/0x1c0 net/core/skbuff.c:1694
  add_grhead.isra.24+0x42/0x3b0 net/ipv6/mcast.c:1695
  add_grec+0xa55/0x1060 net/ipv6/mcast.c:1817
  mld_send_cr net/ipv6/mcast.c:1903 [inline]
  mld_ifc_timer_expire+0x4d2/0x770 net/ipv6/mcast.c:2448
  call_timer_fn+0x23b/0x840 kernel/time/timer.c:1320
  expire_timers kernel/time/timer.c:1357 [inline]
  __run_timers+0x7e1/0xb60 kernel/time/timer.c:1660
  run_timer_softirq+0x4c/0xb0 kernel/time/timer.c:1686
  __do_softirq+0x29d/0xbb2 kernel/softirq.c:285
  invoke_softirq kernel/softirq.c:365 [inline]
  irq_exit+0x1d3/0x210 kernel/softirq.c:405
  exiting_irq arch/x86/include/asm/apic.h:540 [inline]
  smp_apic_timer_interrupt+0x16b/0x700 arch/x86/kernel/apic/apic.c:1052
  apic_timer_interrupt+0xa9/0xb0 arch/x86/entry/entry_64.S:920

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13 13:13:15 -05:00
6b782f43d3 Revert "ravb: add workaround for clock when resuming with WoL enabled"
This reverts commit fbf3d034f2ff6264183cfa6845770e8cc2a986c8.

As of commit 560869100b99a3da ("clk: renesas: cpg-mssr: Restore module
clocks during resume"), the workaround is no longer needed.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13 11:18:40 -05:00
9147efcbe0 bpf: add schedule points to map alloc/free
While using large percpu maps, htab_map_alloc() can hold
cpu for hundreds of ms.

This patch adds cond_resched() calls to percpu alloc/free
call sites, all running in process context.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-12 15:27:22 -08:00
9dbd2d948e Merge branch 'bpf-misc-fixes'
Daniel Borkmann says:

====================
Couple of outstanding fixes for BPF tree: 1) fixes a perf RB
corruption, 2) and 3) fixes a few build issues from the recent
bpf_perf_event.h uapi corrections. Thanks!
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-12 09:52:09 -08:00
720f228e8d bpf: fix broken BPF selftest build
At least on x86_64, the kernel's BPF selftests seemed to have stopped
to build due to 618e165b2a8e ("selftests/bpf: sync kernel headers and
introduce arch support in Makefile"):

  [...]
  In file included from test_verifier.c:29:0:
  ../../../include/uapi/linux/bpf_perf_event.h:11:32:
     fatal error: asm/bpf_perf_event.h: No such file or directory
   #include <asm/bpf_perf_event.h>
                                ^
  compilation terminated.
  [...]

While pulling in tools/arch/*/include/uapi/asm/bpf_perf_event.h seems
to work fine, there's no automated fall-back logic right now that would
do the same out of tools/include/uapi/asm-generic/bpf_perf_event.h. The
usual convention today is to add a include/[uapi/]asm/ equivalent that
would pull in the correct arch header or generic one as fall-back, all
ifdef'ed based on compiler target definition. It's similarly done also
in other cases such as tools/include/asm/barrier.h, thus adapt the same
here.

Fixes: 618e165b2a8e ("selftests/bpf: sync kernel headers and introduce arch support in Makefile")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-12 09:51:12 -08:00
a23f06f06d bpf: fix build issues on um due to mising bpf_perf_event.h
Since c895f6f703ad ("bpf: correct broken uapi for
BPF_PROG_TYPE_PERF_EVENT program type") um (uml) won't build
on i386 or x86_64:

  [...]
    CC      init/main.o
  In file included from ../include/linux/perf_event.h:18:0,
                   from ../include/linux/trace_events.h:10,
                   from ../include/trace/syscall.h:7,
                   from ../include/linux/syscalls.h:82,
                   from ../init/main.c:20:
  ../include/uapi/linux/bpf_perf_event.h:11:32: fatal error:
  asm/bpf_perf_event.h: No such file or directory #include
  <asm/bpf_perf_event.h>
  [...]

Lets add missing bpf_perf_event.h also to um arch. This seems
to be the only one still missing.

Fixes: c895f6f703ad ("bpf: correct broken uapi for BPF_PROG_TYPE_PERF_EVENT program type")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Suggested-by: Richard Weinberger <richard@sigma-star.at>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Richard Weinberger <richard@sigma-star.at>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-12 09:51:12 -08:00
283ca526a9 bpf: fix corruption on concurrent perf_event_output calls
When tracing and networking programs are both attached in the
system and both use event-output helpers that eventually call
into perf_event_output(), then we could end up in a situation
where the tracing attached program runs in user context while
a cls_bpf program is triggered on that same CPU out of softirq
context.

Since both rely on the same per-cpu perf_sample_data, we could
potentially corrupt it. This can only ever happen in a combination
of the two types; all tracing programs use a bpf_prog_active
counter to bail out in case a program is already running on
that CPU out of a different context. XDP and cls_bpf programs
by themselves don't have this issue as they run in the same
context only. Therefore, split both perf_sample_data so they
cannot be accessed from each other.

Fixes: 20b9d7ac4852 ("bpf: avoid excessive stack usage for perf_sample_data")
Reported-by: Alexei Starovoitov <ast@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Song Liu <songliubraving@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-12 09:51:12 -08:00
30791ac419 tcp md5sig: Use skb's saddr when replying to an incoming segment
The MD5-key that belongs to a connection is identified by the peer's
IP-address. When we are in tcp_v4(6)_reqsk_send_ack(), we are replying
to an incoming segment from tcp_check_req() that failed the seq-number
checks.

Thus, to find the correct key, we need to use the skb's saddr and not
the daddr.

This bug seems to have been there since quite a while, but probably got
unnoticed because the consequences are not catastrophic. We will call
tcp_v4_reqsk_send_ack only to send a challenge-ACK back to the peer,
thus the connection doesn't really fail.

Fixes: 9501f9722922 ("tcp md5sig: Let the caller pass appropriate key for tcp_v{4,6}_do_calc_md5_hash().")
Signed-off-by: Christoph Paasch <cpaasch@apple.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-12 11:15:42 -05:00