Flex array is used to allocate hash buckets which is type struct
hlist_head, but we use `struct hlist_head *` to calculate
array size. Since hlist_head is of size pointer it works fine.
Following patch use correct type.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
git silently included an extra hunk in vport_cmd_set() during
automatic merging. This code is unreachable so it does not actually
introduce a problem but it is clearly incorrect.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Neil Brown reports that with libertas, my recent cfg80211
SME changes in commit ceca7b7121
("cfg80211: separate internal SME implementation") broke
libertas suspend because it we now asked it to disconnect
while already disconnected.
The problematic change is in cfg80211_disconnect() as it
previously checked the SME state and now calls the driver
disconnect operation unconditionally.
Fix this by checking if there's a current_bss indicating
a connection, and do nothing if not.
Reported-and-tested-by: Neil Brown <neilb@suse.de>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
There are a few places which check nl80211hdr_put() for an ERR_PTR
but actually it returns NULL on error and never error values. In
nl80211_testmode_dump() the return wasn't checked at all so I have
added one.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
[some whitespace changes]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
skb->sk socket can be of AF_INET or AF_INET6 address family. Thus we
always have to make sure we a referring to the correct interpretation
of skb->sk.
We only depend on header defines to query the mtu, so we don't introduce
a new dependency to ipv6 by this change.
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
In xfrm4 and xfrm6 we need to take care about sockets of the other
address family. This could happen because a 6in4 or 4in6 tunnel could
get protected by ipsec.
Because we don't want to have a run-time dependency on ipv6 when only
using ipv4 xfrm we have to embed a pointer to the correct local_error
function in xfrm_state_afinet and look it up when returning an error
depending on the socket address family.
Thanks to vi0ss for the great bug report:
<https://bugzilla.kernel.org/show_bug.cgi?id=58691>
v2:
a) fix two more unsafe interpretations of skb->sk as ipv6 socket
(xfrm6_local_dontfrag and __xfrm6_output)
v3:
a) add an EXPORT_SYMBOL_GPL(xfrm_local_error) to fix a link error when
building ipv6 as a module (thanks to Steffen Klassert)
Reported-by: <vi0oss@gmail.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Fix the iproute2 command `bridge vlan show`, after switching from
rtgenmsg to ifinfomsg.
Let's start with a little history:
Feb 20: Vlad Yasevich got his VLAN-aware bridge patchset included in
the 3.9 merge window.
In the kernel commit 6cbdceeb, he added attribute support to
bridge GETLINK requests sent with rtgenmsg.
Mar 6th: Vlad got this iproute2 reference implementation of the bridge
vlan netlink interface accepted (iproute2 9eff0e5c)
Apr 25th: iproute2 switched from using rtgenmsg to ifinfomsg (63338dca)
http://patchwork.ozlabs.org/patch/239602/http://marc.info/?t=136680900700007
Apr 28th: Linus released 3.9
Apr 30th: Stephen released iproute2 3.9.0
The `bridge vlan show` command haven't been working since the switch to
ifinfomsg, or in a released version of iproute2. Since the kernel side
only supports rtgenmsg, which iproute2 switched away from just prior to
the iproute2 3.9.0 release.
I haven't been able to find any documentation, about neither rtgenmsg
nor ifinfomsg, and in which situation to use which, but kernel commit
88c5b5ce seams to suggest that ifinfomsg should be used.
Fixing this in kernel will break compatibility, but I doubt that anybody
have been using it due to this bug in the user space reference
implementation, at least not without noticing this bug. That said the
functionality is still fully functional in 3.9, when reversing iproute2
commit 63338dca.
This could also be fixed in iproute2, but thats an ugly patch that would
reintroduce rtgenmsg in iproute2, and from searching in netdev it seams
like rtgenmsg usage is discouraged. I'm assuming that the only reason
that Vlad implemented the kernel side to use rtgenmsg, was because
iproute2 was using it at the time.
Signed-off-by: Asbjoern Sloth Toennesen <ast@fiberby.net>
Reviewed-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using inner-id for tunnel id is not safe in some rare cases.
E.g. packets coming from multiple sources entering same tunnel
can have same id. Therefore on tunnel packet receive we
could have packets from two different stream but with same
source and dst IP with same ip-id which could confuse ip packet
reassembly.
Following patch reverts optimization from commit
490ab08127 (IP_GRE: Fix IP-Identification.)
CC: Jarno Rajahalme <jrajahalme@nicira.com>
CC: Ansis Atteka <aatteka@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When dumping generic netlink families, only the first dump call
is locked with genl_lock(), which protects the list of families,
and thus subsequent calls can access the data without locking,
racing against family addition/removal. This can cause a crash.
Fix it - the locking needs to be conditional because the first
time around it's already locked.
A similar bug was reported to me on an old kernel (3.4.47) but
the exact scenario that happened there is no longer possible,
on those kernels the first round wasn't locked either. Looking
at the current code I found the race described above, which had
also existed on the old kernel.
Cc: stable@vger.kernel.org
Reported-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Probably this one is quite unlikely to be triggered, but it's more safe
to do the call_rcu() at the end after we have dropped the reference on
the asoc and freed sctp packet chunks. The reason why is because in
sctp_transport_destroy_rcu() the transport is being kfree()'d, and if
we're unlucky enough we could run into corrupted pointers. Probably
that's more of theoretical nature, but it's safer to have this simple fix.
Introduced by commit 8c98653f ("sctp: sctp_close: fix release of bindings
for deferred call_rcu's"). I also did the 8c98653f regression test and
it's fine that way.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The SCTP Quick failover draft [1] section 5.1, point 5 says that the cwnd
should be 1 MTU. So, instead of 1, set it to 1 MTU.
[1] https://tools.ietf.org/html/draft-nishida-tsvwg-sctp-failover-05
Reported-by: Karl Heiss <kheiss@gmail.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- reassign pointers to data after skb reallocation to avoid kernel paging errors
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.20 (GNU/Linux)
iQIcBAABCAAGBQJSBqldAAoJEADl0hg6qKeO9eQQAKUod+Or+ldfBU5b6OvkDFAr
FjSMxVgGS/YFSrD81WLnIGTW/ujz8VrVM7zrUrKJZfsVtBJw5/6hTTV5M/vl6FqQ
k+q6L6R1EqR9qRxqFhxD4u6+Zscc3+2KvS2rAqL8PQmJq0STn4h+Lk3J38DrKF9l
dBn28FUcX4ON/D5uI0AZU/iM/OxTgWLxhYhFbSjxPnAhJEyIii+D6yrBLNN1SZDx
McAKvJ78Tf9rEWt11KBDs6uNtEOSFy0v6hJVkZEiX941rkGOiShzPcU8FCICEaie
EGQisQvVYho+ABHofHcMUkMvXwrtLT0WgiIbxRKS9DJdMBOuu99v/zX2T0thZLUK
l2GvQZQ6/hS/iMNSQvXhxemyZoAxvjxkSBMwrK/BVv5XLUM+sOo+Og4/ADZviPaY
suFwbCDtdr4jzeqBdVxn6mZGosVLtaNM74xpc+cxm7BwmeB2XL22S3sR2WRwsurU
AV+bU2NUpRWGg+rsmQCyUrfD/mGZuiyR12Z7BTYA4KoDPkeh/fpGJZS77aRTgSm5
kIXl1aCtRdw31mebdjrNU4Cp6oOnQ3A2jzb8nKoiBUiKPkEZ16h8QB5ktLjsYrKL
VFQChSWHUwlgzS3xBu3BVQOZ8WwlEV6z6D3TX9RAlhQqjGbk9OujbQJjmBVwLQdH
wWPzO9LSQ9cLYFqAnBof
=QkbK
-----END PGP SIGNATURE-----
Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge
Included change:
- reassign pointers to data after skb reallocation to avoid kernel paging errors
Signed-off-by: David S. Miller <davem@davemloft.net>
There are several functions which might reallocate skb data. Currently
some places keep reusing their old ethhdr pointer regardless of whether
they became invalid after such a reallocation or not. This potentially
leads to kernel paging errors.
This patch fixes these by refetching the ethdr pointer after the
potential reallocations.
Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Pablo Neira Ayuso says:
====================
The following patchset contains four netfilter fixes, they are:
* Fix possible invalid access and mangling of the TCPMSS option in
xt_TCPMSS. This was spotted by Julian Anastasov.
* Fix possible off by one access and mangling of the TCP packet in
xt_TCPOPTSTRIP, also spotted by Julian Anastasov.
* Fix possible information leak due to missing initialization of one
padding field of several structures that are included in nfqueue and
nflog netlink messages, from Dan Carpenter.
* Fix TCP window tracking with Fast Open, from Yuchung Cheng.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently the conntrack checks if the ending sequence of a packet
falls within the observed receive window. However it does so even
if it has not observe any packet from the remote yet and uses an
uninitialized receive window (td_maxwin).
If a connection uses Fast Open to send a SYN-data packet which is
dropped afterward in the network. The subsequent SYNs retransmits
will all fail this check and be discarded, leading to a connection
timeout. This is because the SYN retransmit does not contain data
payload so
end == initial sequence number (isn) + 1
sender->td_end == isn + syn_data_len
receiver->td_maxwin == 0
The fix is to only apply this check after td_maxwin is initialized.
Reported-by: Michael Chan <mcfchan@stanford.edu>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Rename mib counter from "low latency" to "busy poll"
v1 also moved the counter to the ip MIB (suggested by Shawn Bohrer)
Eric Dumazet suggested that the current location is better.
So v2 just renames the counter to fit the new naming convention.
Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Same behavior than 802.1q : finds the encapsulated protocol and
skip 32bit header.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix ipgre_header() (header_ops->create) to return the correct
amount of bytes pushed. Most callers of dev_hard_header() seem
to care only if it was success, but af_packet.c uses it as
offset to the skb to copy from userspace only once. In practice
this fixes packet socket sendto()/sendmsg() to gre tunnels.
Regression introduced in c544193214
("GRE: Refactor GRE tunneling code.")
Cc: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Timo Teräs <timo.teras@iki.fi>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case a subtree did not match we currently stop backtracking and return
NULL (root table from fib_lookup). This could yield in invalid routing
table lookups when using subtrees.
Instead continue to backtrack until a valid subtree or node is found
and return this match.
Also remove unneeded NULL check.
Reported-by: Teco Boot <teco@inf-net.nl>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Cc: David Lamparter <equinox@diac24.net>
Cc: <boutier@pps.univ-paris-diderot.fr>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
If rpcbind causes our connection to the AF_LOCAL socket to close after
we've registered a service, then we want to be careful about reconnecting
since the mount namespace may have changed.
By simply refusing to reconnect the AF_LOCAL socket in the case of
unregister, we avoid the need to somehow save the mount namespace. While
this may lead to some services not unregistering properly, it should
be safe.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Nix <nix@esperi.org.uk>
Cc: Jeff Layton <jlayton@redhat.com>
Cc: stable@vger.kernel.org # 3.9.x
While investigating about strange increase of retransmit rates
on hosts ~24 days after boot, Van found hystart was disabled
if ca->epoch_start was 0, as following condition is true
when tcp_time_stamp high order bit is set.
(s32)(tcp_time_stamp - ca->epoch_start) < HZ
Quoting Van :
At initialization & after every loss ca->epoch_start is set to zero so
I believe that the above line will turn off hystart as soon as the 2^31
bit is set in tcp_time_stamp & hystart will stay off for 24 days.
I think we've observed that cubic's restart is too aggressive without
hystart so this might account for the higher drop rate we observe.
Diagnosed-by: Van Jacobson <vanj@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
br_sysfs_if.c is for sysfs attributes of bridge ports, while br_sysfs_br.c
is for sysfs attributes of bridge itself. Correct the comment here.
Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 17a6e9f1aa ("tcp_cubic: fix clock dependency") added an
overflow error in bictcp_update() in following code :
/* change the unit from HZ to bictcp_HZ */
t = ((tcp_time_stamp + msecs_to_jiffies(ca->delay_min>>3) -
ca->epoch_start) << BICTCP_HZ) / HZ;
Because msecs_to_jiffies() being unsigned long, compiler does
implicit type promotion.
We really want to constrain (tcp_time_stamp - ca->epoch_start)
to a signed 32bit value, or else 't' has unexpected high values.
This bugs triggers an increase of retransmit rates ~24 days after
boot [1], as the high order bit of tcp_time_stamp flips.
[1] for hosts with HZ=1000
Big thanks to Van Jacobson for spotting this problem.
Diagnosed-by: Van Jacobson <vanj@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no need for the kernel to time out the AF_LOCAL connection to
the rpcbind socket, and doing so is problematic because when it is
time to reconnect, our process may no longer be using the same mount
namespace.
Reported-by: Nix <nix@esperi.org.uk>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Jeff Layton <jlayton@redhat.com>
Cc: stable@vger.kernel.org # 3.9.x
Currently we are reading an uninitialized value for the max_delay
variable when snooping an MLD query message of invalid length and would
update our timers with that.
Fixing this by simply ignoring such broken MLD queries (just like we do
for IGMP already).
This is a regression introduced by:
"bridge: disable snooping if there is no querier" (b00589af3b)
Reported-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
AddressSanitizer [1] dynamic checker pointed a potential
out of bound access in leaf_walk_rcu()
We could allocate one more slot in tnode_new() to leave the prefetch()
in-place but it looks not worth the pain.
Bug added in commit 82cfbb0085 ("[IPV4] fib_trie: iterator recode")
[1] :
https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernel
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dev->ndo_neigh_setup() might need some of the values of neigh_parms, so
populate them before calling it.
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 91657eafb ("xfrm: take net hdr len into account for esp payload
size calculation") introduced a possible interger overflow in
esp{4,6}_get_mtu() handlers in case of x->props.mode equals
XFRM_MODE_TUNNEL. Thus, the following expression will overflow
unsigned int net_adj;
...
<case ipv{4,6} XFRM_MODE_TUNNEL>
net_adj = 0;
...
return ((mtu - x->props.header_len - crypto_aead_authsize(esp->aead) -
net_adj) & ~(align - 1)) + (net_adj - 2);
where (net_adj - 2) would be evaluated as <foo> + (0 - 2) in an unsigned
context. Fix it by simply removing brackets as those operations here
do not need to have special precedence.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Benjamin Poirier <bpoirier@suse.de>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Benjamin Poirier <bpoirier@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlan devices are LLTX and don't update their own trans_start, so if
dev_trans_start has to be called with a vlan device then 0 or a stale
value will be returned. Currently the bonding is the only such user, and
it's needed for proper arp monitoring when the slaves are vlans.
Fix this by extracting the vlan's real device trans_start.
Suggested-by: David Miller <davem@davemloft.net>
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Acked-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sometimes we might have stacked vlans on top of each other, and we're
interested in the first non-vlan real device on the path, so transform
vlan_dev_real_dev to go over the stacked vlans and extract the first
non-vlan device.
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Drop the semicolon at the end of the list_for_each_entry loop header.
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
These structs have a "_pad" member. Also the "phw" structs have an 8
byte "hw_addr[]" array but sometimes only the first 6 bytes are
initialized.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pull networking fixes from David Miller:
1) Don't ignore user initiated wireless regulatory settings on cards
with custom regulatory domains, from Arik Nemtsov.
2) Fix length check of bluetooth information responses, from Jaganath
Kanakkassery.
3) Fix misuse of PTR_ERR in btusb, from Adam Lee.
4) Handle rfkill properly while iwlwifi devices are offline, from
Emmanuel Grumbach.
5) Fix r815x devices DMA'ing to stack buffers, from Hayes Wang.
6) Kernel info leak in ATM packet scheduler, from Dan Carpenter.
7) 8139cp doesn't check for DMA mapping errors, from Neil Horman.
8) Fix bridge multicast code to not snoop when no querier exists,
otherwise mutlicast traffic is lost. From Linus Lüssing.
9) Avoid soft lockups in fib6_run_gc(), from Michal Kubecek.
10) Fix races in automatic address asignment on ipv6, which can result
in incorrect lifetime assignments. From Jiri Benc.
11) Cure build bustage when CONFIG_NET_LL_RX_POLL is not set and rename
it CONFIG_NET_RX_BUSY_POLL to eliminate the last reference to the
original naming of this feature. From Cong Wang.
12) Fix crash in TIPC when server socket creation fails, from Ying Xue.
13) macvlan_changelink() silently succeeds when it shouldn't, from
Michael S Tsirkin.
14) HTB packet scheduler can crash due to sign extension, fix from
Stephen Hemminger.
15) With the cable unplugged, r8169 prints out a message every 10
seconds, make it netif_dbg() instead of netif_warn(). From Peter
Wu.
16) Fix memory leak in rtm_to_ifaddr(), from Daniel Borkmann.
17) sis900 gets spurious TX queue timeouts due to mismanagement of link
carrier state, from Denis Kirjanov.
18) Validate somaxconn sysctl to make sure it fits inside of a u16.
From Roman Gushchin.
19) Fix MAC address filtering on qlcnic, from Shahed Shaikh.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (68 commits)
qlcnic: Fix for flash update failure on 83xx adapter
qlcnic: Fix link speed and duplex display for 83xx adapter
qlcnic: Fix link speed display for 82xx adapter
qlcnic: Fix external loopback test.
qlcnic: Removed adapter series name from warning messages.
qlcnic: Free up memory in error path.
qlcnic: Fix ingress MAC learning
qlcnic: Fix MAC address filter issue on 82xx adapter
net: ethernet: davinci_emac: drop IRQF_DISABLED
netlabel: use domain based selectors when address based selectors are not available
net: check net.core.somaxconn sysctl values
sis900: Fix the tx queue timeout issue
net: rtm_to_ifaddr: free ifa if ifa_cacheinfo processing fails
r8169: remove "PHY reset until link up" log spam
net: ethernet: cpsw: drop IRQF_DISABLED
htb: fix sign extension bug
macvlan: handle set_promiscuity failures
macvlan: better mode validation
tipc: fix oops when creating server socket fails
net: rename CONFIG_NET_LL_RX_POLL to CONFIG_NET_RX_BUSY_POLL
...
Pull nfsd bugfixes from Bruce Fields:
"Most of this is due to a screwup on my part -- some gss-proxy crashes
got fixed before the merge window but somehow never made it out of a
temporary git repo on my laptop...."
* 'for-3.11' of git://linux-nfs.org/~bfields/linux:
svcrpc: set cr_gss_mech from gss-proxy as well as legacy upcall
svcrpc: fix kfree oops in gss-proxy code
svcrpc: fix gss-proxy xdr decoding oops
svcrpc: fix gss_rpc_upcall create error
NFSD/sunrpc: avoid deadlock on TCP connection due to memory pressure.
NetLabel has the ability to selectively assign network security labels
to outbound traffic based on either the LSM's "domain" (different for
each LSM), the network destination, or a combination of both. Depending
on the type of traffic, local or forwarded, and the type of traffic
selector, domain or address based, different hooks are used to label the
traffic; the goal being minimal overhead.
Unfortunately, there is a bug such that a system using NetLabel domain
based traffic selectors does not correctly label outbound local traffic
that is not assigned to a socket. The issue is that in these cases
the associated NetLabel hook only looks at the address based selectors
and not the domain based selectors. This patch corrects this by
checking both the domain and address based selectors so that the correct
labeling is applied, regardless of the configuration type.
In order to acomplish this fix, this patch also simplifies some of the
NetLabel domainhash structures to use a more common outbound traffic
mapping type: struct netlbl_dommap_def. This simplifies some of the code
in this patch and paves the way for further simplifications in the
future.
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It's possible to assign an invalid value to the net.core.somaxconn
sysctl variable, because there is no checks at all.
The sk_max_ack_backlog field of the sock structure is defined as
unsigned short. Therefore, the backlog argument in inet_listen()
shouldn't exceed USHRT_MAX. The backlog argument in the listen() syscall
is truncated to the somaxconn value. So, the somaxconn value shouldn't
exceed 65535 (USHRT_MAX).
Also, negative values of somaxconn are meaningless.
before:
$ sysctl -w net.core.somaxconn=256
net.core.somaxconn = 256
$ sysctl -w net.core.somaxconn=65536
net.core.somaxconn = 65536
$ sysctl -w net.core.somaxconn=-100
net.core.somaxconn = -100
after:
$ sysctl -w net.core.somaxconn=256
net.core.somaxconn = 256
$ sysctl -w net.core.somaxconn=65536
error: "Invalid argument" setting key "net.core.somaxconn"
$ sysctl -w net.core.somaxconn=-100
error: "Invalid argument" setting key "net.core.somaxconn"
Based on a prior patch from Changli Gao.
Signed-off-by: Roman Gushchin <klamm@yandex-team.ru>
Reported-by: Changli Gao <xiaosuo@gmail.com>
Suggested-by: Eric Dumazet <edumazet@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 5c766d642 ("ipv4: introduce address lifetime") leaves the ifa
resource that was allocated via inet_alloc_ifa() unfreed when returning
the function with -EINVAL. Thus, free it first via inet_free_ifa().
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Reviewed-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
When userspace passes a large priority value
the assignment of the unsigned value hopt->prio
to signed int cl->prio causes cl->prio to become negative and the
comparison is with TC_HTB_NUMPRIO is always false.
The result is that HTB crashes by referencing outside
the array when processing packets. With this patch the large value
wraps around like other values outside the normal range.
See: https://bugzilla.kernel.org/show_bug.cgi?id=60669
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eliezer renames several *ll_poll to *busy_poll, but forgets
CONFIG_NET_LL_RX_POLL, so in case of confusion, rename it too.
Cc: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There's a race in IPv6 automatic addess assignment. The address is created
with zero lifetime when it's added to various address lists. Before it gets
assigned the correct lifetime, there's a window where a new address may be
configured. This causes the semi-initiated address to be deleted in
addrconf_verify.
This was discovered as a reference leak caused by concurrent run of
__ipv6_ifa_notify for both RTM_NEWADDR and RTM_DELADDR with the same
address.
Fix this by setting the lifetime before the address is added to
inet6_addr_lst.
A few notes:
1. In addrconf_prefix_rcv, by setting update_lft to zero, the
if (update_lft) { ... } condition is no longer executed for newly
created addresses. This is okay, as the ifp fields are set in
ipv6_add_addr now and ipv6_ifa_notify is called (and has been called)
through addrconf_dad_start.
2. The removal of the whole block under ifp->lock in inet6_addr_add is okay,
too, as tstamp is initialized to jiffies in ipv6_add_addr.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
As pointed out by Eric Dumazet, net->ipv6.ip6_rt_last_gc should
hold the last time garbage collector was run so that we should
update it whenever fib6_run_gc() calls fib6_clean_all(), not only
if we got there from ip6_dst_gc().
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
On a high-traffic router with many processors and many IPv6 dst
entries, soft lockup in fib6_run_gc() can occur when number of
entries reaches gc_thresh.
This happens because fib6_run_gc() uses fib6_gc_lock to allow
only one thread to run the garbage collector but ip6_dst_gc()
doesn't update net->ipv6.ip6_rt_last_gc until fib6_run_gc()
returns. On a system with many entries, this can take some time
so that in the meantime, other threads pass the tests in
ip6_dst_gc() (ip6_rt_last_gc is still not updated) and wait for
the lock. They then have to run the garbage collector one after
another which blocks them for quite long.
Resolve this by replacing special value ~0UL of expire parameter
to fib6_run_gc() by explicit "force" parameter to choose between
spin_lock_bh() and spin_trylock_bh() and call fib6_run_gc() with
force=false if gc_thresh is reached but not max_size.
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
The change made to rsc_parse() in
0dc1531aca "svcrpc: store gss mech in
svc_cred" should also have been propagated to the gss-proxy codepath.
This fixes a crash in the gss-proxy case.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Uninitialized stack data was being used as the destination for memcpy's.
Longer term we'll just delete some of this code; all we're doing is
skipping over xdr that we don't care about.
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Since we enabled auto-tuning for sunrpc TCP connections we do not
guarantee that there is enough write-space on each connection to
queue a reply.
If memory pressure causes the window to shrink too small, the request
throttling in sunrpc/svc will not accept any requests so no more requests
will be handled. Even when pressure decreases the window will not
grow again until data is sent on the connection.
This means we get a deadlock: no requests will be handled until there
is more space, and no space will be allocated until a request is
handled.
This can be simulated by modifying svc_tcp_has_wspace to inflate the
number of byte required and removing the 'svc_sock_setbufsize' calls
in svc_setup_socket.
I found that multiplying by 16 was enough to make the requirement
exceed the default allocation. With this modification in place:
mount -o vers=3,proto=tcp 127.0.0.1:/home /mnt
would block and eventually time out because the nfs server could not
accept any requests.
This patch relaxes the request throttling to always allow at least one
request through per connection. It does this by checking both
sk_stream_min_wspace() and xprt->xpt_reserved
are zero.
The first is zero when the TCP transmit queue is empty.
The second is zero when there are no RPC requests being processed.
When both of these are zero the socket is idle and so one more
request can safely be allowed through.
Applying this patch allows the above mount command to succeed cleanly.
Tracing shows that the allocated write buffer space quickly grows and
after a few requests are handled, the extra tests are no longer needed
to permit further requests to be processed.
The main purpose of request throttling is to handle the case when one
client is slow at collecting replies and the send queue gets full of
replies that the client hasn't acknowledged (at the TCP level) yet.
As we only change behaviour when the send queue is empty this main
purpose is still preserved.
Reported-by: Ben Myers <bpm@sgi.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Fix a possible off by one access since optlen()
touches opt[offset+1] unsafely when i == tcp_hdrlen(skb) - 1.
This patch replaces tcp_hdrlen() by the local variable tcp_hdrlen
that stores the TCP header length, to save some cycles.
Reported-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Make sure the packet has enough room for the TCP header and
that it is not malformed.
While at it, store tcph->doff*4 in a variable, as it is used
several times.
This patch also fixes a possible off by one in case of malformed
TCP options.
Reported-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
If there is no querier on a link then we won't get periodic reports and
therefore won't be able to learn about multicast listeners behind ports,
potentially leading to lost multicast packets, especially for multicast
listeners that joined before the creation of the bridge.
These lost multicast packets can appear since c5c2326059
("bridge: Add multicast_querier toggle and disable queries by default")
in particular.
With this patch we are flooding multicast packets if our querier is
disabled and if we didn't detect any other querier.
A grace period of the Maximum Response Delay of the querier is added to
give multicast responses enough time to arrive and to be learned from
before disabling the flooding behaviour again.
Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The "pvc" struct has a hole after pvc.sap_family which is not cleared.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
1) Fix association failures not triggering a connect-failure event in
cfg80211, from Johannes Berg.
2) Eliminate a potential NULL deref with older iptables tools when
configuring xt_socket rules, from Eric Dumazet.
3) Missing RTNL locking in wireless regulatory code, from Johannes
Berg.
4) Fix OOPS caused by firmware loading races in ath9k_htc, from Alexey
Khoroshilov.
5) Fix usb URB leak in usb_8dev CAN driver, also from Alexey
Khoroshilov.
6) VXLAN namespace teardown fails to unregister devices, from Stephen
Hemminger.
7) Fix multicast settings getting dropped by firmware in qlcnic driver,
from Sucheta Chakraborty.
8) Add sysctl range enforcement for tcp_syn_retries, from Michal Tesar.
9) Fix a nasty bug in bridging where an active timer would get
reinitialized with a setup_timer() call. From Eric Dumazet.
10) Fix use after free in new mlx5 driver, from Dan Carpenter.
11) Fix freed pointer reference in ipv6 multicast routing on namespace
cleanup, from Hannes Frederic Sowa.
12) Some usbnet drivers report TSO and SG in their feature set, but the
usbnet layer doesn't really support them. From Eric Dumazet.
13) Fix crash on EEH errors in tg3 driver, from Gavin Shan.
14) Drop cb_lock when requesting modules in genetlink, from Stanislaw
Gruszka.
15) Kernel stack leaks in cbq scheduler and af_key pfkey messages, from
Dan Carpenter.
16) FEC driver erroneously signals NETDEV_TX_BUSY on transmit leading to
endless loops, from Uwe Kleine-König.
17) Fix hangs from loading mvneta driver, from Arnaud Patard.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (84 commits)
mlx5: fix error return code in mlx5_alloc_uuars()
mvneta: Try to fix mvneta when compiled as module
mvneta: Fix hang when loading the mvneta driver
atl1c: Fix misuse of netdev_alloc_skb in refilling rx ring
genetlink: fix usage of NLM_F_EXCL or NLM_F_REPLACE
af_key: more info leaks in pfkey messages
net/fec: Don't let ndo_start_xmit return NETDEV_TX_BUSY without link
net_sched: Fix stack info leak in cbq_dump_wrr().
igb: fix vlan filtering in promisc mode when not in VT mode
ixgbe: Fix Tx Hang issue with lldpad on 82598EB
genetlink: release cb_lock before requesting additional module
net: fec: workaround stop tx during errata ERR006358
qlcnic: Fix diagnostic interrupt test for 83xx adapters.
qlcnic: Fix setting Guest VLAN
qlcnic: Fix operation type and command type.
qlcnic: Fix initialization of work function.
Revert "atl1c: Fix misuse of netdev_alloc_skb in refilling rx ring"
atl1c: Fix misuse of netdev_alloc_skb in refilling rx ring
net/tg3: Fix warning from pci_disable_device()
net/tg3: Fix kernel crash
...
In case the AP has different regulatory information than we do,
it can happen that we connect to an AP based on e.g. the world
roaming regulatory data, and then update our database with the
AP's country information disables the channel the AP is using.
If this happens on an HT AP, the bandwidth tracking code will
hit the WARN_ON() and disconnect. Since that's not very useful,
ignore the channel-disable flag in bandwidth tracking.
Cc: stable@vger.kernel.org
Reported-by: Chris Wright <chrisw@sous-sol.org>
Tested-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
When a P2P GO interface goes down, cfg80211 doesn't properly
tear it down, leading to warnings later. Add the GO interface
type to the enumeration to tear it down like AP interfaces.
Otherwise, we leave it pending and mac80211's state can get
very confused, leading to warnings later.
Cc: stable@vger.kernel.org
Reported-by: Ilan Peer <ilan.peer@intel.com>
Tested-by: Ilan Peer <ilan.peer@intel.com>
Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
While we're connected, the AP shouldn't change the primary channel
in the HT information. We checked this, and dropped the connection
if it did change it.
Unfortunately, this is causing problems on some APs, e.g. on the
Netgear WRT610NL: the beacons seem to always contain a bad channel
and if we made a connection using a probe response (correct data)
we drop the connection immediately and can basically not connect
properly at all.
Work around this by ignoring the HT primary channel information in
beacons if we're already connected.
Also print out more verbose messages in the other situations to
help diagnose similar bugs quicker in the future.
Cc: stable@vger.kernel.org [3.10]
Acked-by: Andy Isaacson <adi@hexapodia.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
TX status notification can get lost, or the frames could
get stuck on the queue, so don't wait for the callback
from the driver forever and instead time out after half
a second.
Cc: stable@vger.kernel.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
We have:
- A build failure fix for the NCI SPI transport layer due to a
missing CRC_CCITT Kconfig dependency.
- A netlink command rename: CMD_FW_UPLOAD was merged during the 3.11
merge window but the typical terminology for loading a firmware to a
target is firmware download rather than upload. In order to avoid any
confusion in a file exported to userspace, we rename this command into
CMD_FW_DOWNLOAD.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJR+E9jAAoJEIqAPN1PVmxKzkwP/3dy9wpwQG7f8FLv61IhbhhQ
8gVqY3BX1RJdez1vH5MvqkTK6U3SmlQvtJM8pIPyfPXyR1+Af5AxqQh3vjTP3xUG
PuMtmQOlz5OJ6ErttxZtYERVtrhFkasMVmVqKrN9ptItPADfOmeC0/hyoEnoYsWQ
HrhZn1lYsf98zmEbNS2KoRcZUVLClbg4xosTktTVaz56jIGVuM8MAch+FS+tJhhl
av0MX/VZvAUllSnlWWDmt0Lh9isJOLOMtqIRj6PLBAp2ra9sPNO5TlZ4lz2og2gx
zesVhBBLyiF9oluuQj/FJft+s5Khcm0R9W969raL5SvehWY77wHoY76ZqHMUE2Qv
7RPUvFRfOA5LvKJM8MduJ8fMf830mZWD7cByhIfUxtWQZumwPfn2Mbl3xNkPLFZB
L2x13SwGjU+PdCo70+ybgr8zUvYIxiVULwq5xFynvXJNSpOujIe3nPdQb7QtK8C0
4d9OudAHmfHsW93PMBE+Zki8i8GDLTR3DOQoXIRi7oPR+EVL2JDsBQvnauXhdSap
mp9iyuoqAYjgc6e2o8coVqViXWbKmBEa9n7NKrX3dPrI9e5F67WChAyehBCu9KV3
zZxruhEJBw6PLmIGDETk1XIVd9G6rfMBswnDfSJBjjG5PrUh6Xbfwa1y+KiRKqCh
FG+IvbfHWZRmdeFX3U4P
=p4r5
-----END PGP SIGNATURE-----
Merge tag 'nfc-fixes-3.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-fixes
Samuel Ortiz <sameo@linux.intel.com> says:
'This is the second NFC fixes pull request for 3.11.
We have:
- A build failure fix for the NCI SPI transport layer due to a
missing CRC_CCITT Kconfig dependency.
- A netlink command rename: CMD_FW_UPLOAD was merged during the 3.11
merge window but the typical terminology for loading a firmware to a
target is firmware download rather than upload. In order to avoid any
confusion in a file exported to userspace, we rename this command into
CMD_FW_DOWNLOAD."
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Commit "3d9646d mac80211: fix channel selection bug" introduced a possible
infinite loop by moving the out target above the chandef_downgrade
while loop. When we downgrade to NL80211_CHAN_WIDTH_20_NOHT, we jump
back up to re-run the while loop...indefinitely. Replace goto with
break and carry on. This may not be sufficient to connect to the AP,
but will at least keep the cpu from livelocking. Thanks to Derek Atkins
as an extra pair of debugging eyes.
Cc: stable@kernel.org
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Currently, it is not possible to use neither NLM_F_EXCL nor
NLM_F_REPLACE from genetlink. This is due to this checking in
genl_family_rcv_msg:
if (nlh->nlmsg_flags & NLM_F_DUMP)
NLM_F_DUMP is NLM_F_MATCH|NLM_F_ROOT. Thus, if NLM_F_EXCL or
NLM_F_REPLACE flag is set, genetlink believes that you're
requesting a dump and it calls the .dumpit callback.
The solution that I propose is to refine this checking to
make it stricter:
if ((nlh->nlmsg_flags & NLM_F_DUMP) == NLM_F_DUMP)
And given the combination NLM_F_REPLACE and NLM_F_EXCL does
not make sense to me, it removes the ambiguity.
There was a patch that tried to fix this some time ago (0ab03c2
netlink: test for all flags of the NLM_F_DUMP composite) but it
tried to resolve this ambiguity in *all* existing netlink subsystems,
not only genetlink. That patch was reverted since it broke iproute2,
which is using NLM_F_ROOT to request the dump of the routing cache.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is inspired by a5cc68f3d6 "af_key: fix info leaks in notify
messages". There are some struct members which don't get initialized
and could disclose small amounts of private information.
Acked-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Loading a firmware into a target is typically called firmware
download, not firmware upload. So we rename the netlink API to
NFC_CMD_FW_DOWNLOAD in order to avoid any terminology confusion from
userspace.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
This is similar to the race Linus had reported, but in this case
it's an older bug: nl80211_prepare_wdev_dump() uses the wiphy
index in cb->args[0] as it is and thus parses the message over
and over again instead of just once because 0 is the first valid
wiphy index. Similar code in nl80211_testmode_dump() correctly
offsets the wiphy_index by 1, do that here as well.
Cc: stable@vger.kernel.org
Reported-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Make sure the reserved fields, and padding (if any), are
fully initialized.
Based upon a patch by Dan Carpenter and feedback from
Joe Perches.
Signed-off-by: David S. Miller <davem@davemloft.net>
In certain circumstances, such as an HCI driver using __hci_cmd_sync_ev
with HCI_EV_CMD_COMPLETE as the expected completion event there is the
chance that hci_event_packet will call hci_req_cmd_complete twice (once
for the explicitly looked after event and another time in the actual
handler of cmd_complete).
In the case of __hci_cmd_sync_ev this introduces a race where the first
call wakes up the blocking __hci_cmd_sync_ev and lets it complete.
However, by the time that a second __hci_cmd_sync_ev call is already in
progress the second hci_req_cmd_complete call (from the previous
operation) will wake up the blocking function prematurely and cause it
to fail, as witnessed by the following log:
[ 639.232195] hci_rx_work: hci0 Event packet
[ 639.232201] hci_req_cmd_complete: opcode 0xfc8e status 0x00
[ 639.232205] hci_sent_cmd_data: hci0 opcode 0xfc8e
[ 639.232210] hci_req_sync_complete: hci0 result 0x00
[ 639.232220] hci_cmd_complete_evt: hci0 opcode 0xfc8e
[ 639.232225] hci_req_cmd_complete: opcode 0xfc8e status 0x00
[ 639.232228] __hci_cmd_sync_ev: hci0 end: err 0
[ 639.232234] __hci_cmd_sync_ev: hci0
[ 639.232238] hci_req_add_ev: hci0 opcode 0xfc8e plen 250
[ 639.232242] hci_prepare_cmd: skb len 253
[ 639.232246] hci_req_run: length 1
[ 639.232250] hci_sent_cmd_data: hci0 opcode 0xfc8e
[ 639.232255] hci_req_sync_complete: hci0 result 0x00
[ 639.232266] hci_cmd_work: hci0 cmd_cnt 1 cmd queued 1
[ 639.232271] __hci_cmd_sync_ev: hci0 end: err 0
[ 639.232276] Bluetooth: hci0 sending Intel patch command (0xfc8e) failed (-61)
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
None of the BlueFRITZ! devices with manufacurer ID 31 (AVM Berlin)
support HCI_Read_Local_Supported_Commands. It is safe to use the
manufacturer ID (instead of e.g. a USB ID specific quirk) because the
company never created any newer controllers.
< HCI Command: Read Local Supported Comm.. (0x04|0x0002) plen 0 [hci0] 0.210014
> HCI Event: Command Status (0x0f) plen 4 [hci0] 0.217361
Read Local Supported Commands (0x04|0x0002) ncmd 1
Status: Unknown HCI Command (0x01)
Reported-by: Jörg Esser <jackfritt@boh.de>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Tested-by: Jörg Esser <jackfritt@boh.de>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Without this patch, the fields app_solicit, gc_thresh1, gc_thresh2,
gc_thresh3, proxy_qlen, ucast_solicit, mcast_solicit could have
assumed negative values when setting large numbers.
Signed-off-by: Francesco Fusco <ffusco@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If hci_dev_open() is called after hci_register_dev() added the device to
the hci_dev_list but before the workqueue are created we could run into a
NULL pointer dereference (see below).
This bug is very unlikely to happen, systems using bluetoothd to
manage their bluetooth devices will never see this happen.
BUG: unable to handle kernel NULL pointer dereference
0100
IP: [<ffffffff81077502>] __queue_work+0x32/0x3d0
(...)
Call Trace:
[<ffffffff81077be5>] queue_work_on+0x45/0x50
[<ffffffffa016e8ff>] hci_req_run+0xbf/0xf0 [bluetooth]
[<ffffffffa01709b0>] ? hci_init2_req+0x720/0x720 [bluetooth]
[<ffffffffa016ea06>] __hci_req_sync+0xd6/0x1c0 [bluetooth]
[<ffffffff8108ee10>] ? try_to_wake_up+0x2b0/0x2b0
[<ffffffff8150e3f0>] ? usb_autopm_put_interface+0x30/0x40
[<ffffffffa016fad5>] hci_dev_open+0x275/0x2e0 [bluetooth]
[<ffffffffa0182752>] hci_sock_ioctl+0x1f2/0x3f0 [bluetooth]
[<ffffffff815c6050>] sock_do_ioctl+0x30/0x70
[<ffffffff815c75f9>] sock_ioctl+0x79/0x2f0
[<ffffffff811a8046>] do_vfs_ioctl+0x96/0x560
[<ffffffff811a85a1>] SyS_ioctl+0x91/0xb0
[<ffffffff816d989d>] system_call_fastpath+0x1a/0x1f
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
The length check is invalid since the length varies with type of
info response.
This was introduced by the commit cb3b3152b2
Because of this, l2cap info rsp is not handled and command reject is sent.
> ACL data: handle 11 flags 0x02 dlen 16
L2CAP(s): Info rsp: type 2 result 0
Extended feature mask 0x00b8
Enhanced Retransmission mode
Streaming mode
FCS Option
Fixed Channels
< ACL data: handle 11 flags 0x00 dlen 10
L2CAP(s): Command rej: reason 0
Command not understood
Cc: stable@vger.kernel.org
Signed-off-by: Jaganath Kanakkassery <jaganath.k@samsung.com>
Signed-off-by: Chan-Yeol Park <chanyeol.park@samsung.com>
Acked-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
The current regdomain was not always set by the core. This causes
cards with a custom regulatory domain to ignore user initiated changes
if done before the card was registered.
Signed-off-by: Arik Nemtsov <arik@wizery.com>
Acked-by: Luis R. Rodriguez <mcgrof@do-not-panic.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
John W. Linville says:
====================
This is another batch of fixes intended for the 3.11 stream. FWIW,
this is the first request with fixes from the mac80211 and iwlwifi
trees as well.
Regarding the mac80211 bits, Johannes says:
"Here I have a fix for RSSI thresholds in mesh, two minstrel fixes from
Felix, an nl80211 fix from Michal and four various fixes I did myself."
As for the iwlwifi bits, Johannes says:
"Here I have a fix for debugfs directory creation (causing a spurious
error message), two scanning fixes from David Spinadel, an LED fix and
two patches related to a BA session problem that eventually caused
firmware crashes from Emmanuel and a small BT fix for older devices as
well as a workaround for a firmware problem with APs with very small
beacon intervals from myself."
Along with those:
Arend van Spriel addresses a lock-up and a NULL pointer dereference
in brcmfmac.
Daniel Drake fixes an unhandled interrupt during device tear down
in mwifiex.
Larry Finger corrects a wil6210 build error.
Oleksij Rempel fixes two ath9k_htc problems related to keeping the
driver and firmware in sync.
Solomon Peachy gives us a cw1200 fix to avoid an oops in monitor mode.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
build_skb() specifies that the data parameter must come from a kmalloc'd
area, this is only true if frag_size equals 0, because then build_skb()
will use kzsize(data) to figure out the actual data size. Update the
comment to reflect that special condition.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With the <= max condition in the for loop, it will be always go 1
element further than needed. If the condition for the while loop is
never met, then max is MAX_STAT_DEPTH, and for loop will walk off the
end of nodesizes[].
Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
My commit:
commit 12e7f51702
Author: Stanislaw Gruszka <sgruszka@redhat.com>
Date: Thu Feb 28 10:55:26 2013 +0100
mac80211: cleanup generic suspend/resume procedures
removed check for deleting MONITOR and AP_VLAN when suspend. That can
cause a crash (i.e. in iwlagn_mac_remove_interface()) since we remove
interface in the driver that we did not add before.
Reference:
http://marc.info/?l=linux-kernel&m=137391815113860&w=2
Bisected-by: Ortwin Glück <odi@odi.ch>
Reported-and-tested-by: Ortwin Glück <odi@odi.ch>
Cc: stable@vger.kernel.org # 3.10
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Pablo Neira Ayuso says:
====================
The following patchset contains Netfilter fixes for your net tree,
they are:
* Fix potential NULL dereference in the socket match if revision 0
is used, from Eric Dumazet.
* Fix missing expectation NAT initialization that results in dumping
the NAT part via ctnetlink, thus leading to problems in expectation
synchronization through conntrackd, from myself.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is intended to avoid the buffering to non-assoc mesh STA
and also to avoid the triggering of frame to non-assoc mesh STA which
could cause kernel panic in specific hw.
One of the examples, is kernel panic happens to ath9k if user space
inserts the mesh STA and not proceed with the SAE and AMPE, and later
the same mesh STA is detected again. The sta_state of the mesh STA remains
at IEEE80211_STA_NONE and if the ieee80211_sta_ps_deliver_wakeup is called
and subsequently the ath_tx_aggr_wakeup, the kernel panic due to
ath_tx_node_init is not called before to initialize the require data
structures.
This issue is reported by Cedric Voncken before.
http://www.spinics.net/lists/linux-wireless/msg106342.html
[<831ea6b4>] ath_tx_aggr_wakeup+0x44/0xcc [ath9k]
[<83084214>] ieee80211_sta_ps_deliver_wakeup+0xb8/0x208 [mac80211]
[<830b9824>] ieee80211_mps_sta_status_update+0x94/0x108 [mac80211]
[<83099398>] ieee80211_sta_ps_transition+0xc94/0x34d8 [mac80211]
[<8022399c>] nf_iterate+0x98/0x104
[<8309bb60>] ieee80211_sta_ps_transition+0x345c/0x34d8 [mac80211]
Signed-off-by: Chun-Yeow Yeoh <yeohchunyeow@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
- Fix a regression against NFSv4 FreeBSD servers when creating a new file
- Fix another regression in rpc_client_register()
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (GNU/Linux)
iQIcBAABAgAGBQJR6ZKMAAoJEGcL54qWCgDyQX8P/19LKLNKcL+y2zVGjLbXMTq0
TpyWdBO0ux7QcqnPEDg+Jpvu62IowYiKTtaSOXtHb5BNjQMBo2RKw3B0eMBoCp/z
6gHmQRD2hMgqwBxBwHceV+dNwueCUiZW7GqaaNh6/3bpGQefegdONnLEifuPogEu
oZmEuiVrGDfITEF7D4k5+shXCQN4eNH0LFuIQo4XXdCqmK6PwvOsidZ7YwHVC3Mg
/Jzda2YsCxHj8kPi1xb9skPPAn6g4kdfYfyr/xSY7IviPixrkg/nEEK1b8xHU81e
a0dd0Yx5kq6fR8LsBvQCHdj2m7doHM15jf5Np5G7VnnaWEjB2y+QftkxWc9lCNU3
t2fr9YVD7ZG/GGNSFePHAHmBY0OqDB1Htp4vcwEQfzX6CAR3Hel82WVvut62Z6m4
G5qHjwdqUFhmRN//SWlDpEqSn+pbeCvPhQS60ayN0TLivRsscm/I4yA75odAnn9b
4su1IcUpqeJGeV6yDyMUqbx4kYZFyCZg/DNkThXiTKOs47A7ogSS9ev2fTB/V+jd
rroNHNd/U508ze9D6D4ai9vR78uUp4wKNSSBZMCkBtNh0uSApOTgyGVhertB1EKS
vgAr4T1tc+9t+0qg1Sb+hbKyBM/KaS5zUrPn+APHPoBXPh5PSVBzeNJkpxHRw/V0
ZxkEgSQKLZSXYb5ab770
=XE+7
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-3.11-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
- Fix a regression against NFSv4 FreeBSD servers when creating a new
file
- Fix another regression in rpc_client_register()
* tag 'nfs-for-3.11-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFSv4: Fix a regression against the FreeBSD server
SUNRPC: Fix another issue with rpc_client_register()
commit 9f00b2e7cf ("bridge: only expire the mdb entry when query is
received") added a nasty bug as an active timer can be reinitialized.
setup_timer() must be done once, no matter how many time mod_timer()
is called. br_multicast_new_group() is the right place to do this.
Reported-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Diagnosed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Limit the min/max value passed to the
/proc/sys/net/ipv4/tcp_syn_retries.
Signed-off-by: Michal Tesar <mtesar@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
kbuild test robot found following error:
net/built-in.o: In function `nci_spi_send':
>> spi.c:(.text+0x19a76f): undefined reference to `crc_ccitt'
Add CRC_CCITT module to Kconfig to fix it
Reported-by: kbuild test robot.
Signed-off-by: Frederic Danis <frederic.danis@linux.intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Pull networking fixes from David Miller:
"A couple interesting SKB fragment handling fixes, plus the usual small
bits here and there:
1) Fix 64-bit divide build failure on 32-bit platforms in mlx5, from
Tim Gardner.
2) Get rid of a stupid reimplementation on "%*phC" in our sysfs MAC
address printing helper.
3) Fix NETIF_F_SG capability advertisement in hyperv driver, if the
device can't do checksumming offloads then it shouldn't say it can
do SG either. From Haiyang Zhang.
4) bgmac needs to depend on PHYLIB, from Hauke Mehrtens.
5) Don't leak DMA mappings on mapping failures, from Neil Horman.
6) We need to reset the transport header of SKBs in ipv4 before we
attempt to perform early socket demux, just like ipv6 does. From
Eric Dumazet.
7) Add missing locking on vxlan device removal, from Stephen
Hemminger.
8) xen-netfront has to make two passes over an SKB to prepare it for
transfer. One pass calculates the number of slots needed, the
second massages the SKB and fills the slots. Unfortunately, the
first pass doesn't calculate the number of slots properly so we
can end up trying to build a MAX_SKB_FRAGS + 1 SKB which doesn't
work out so well. Fix from Jan Beulich with help and discussion
with several others.
9) Fix a similar problem in tun and macvtap, which have to split up
scatter-gather elements at PAGE_SIZE boundaries. Don't do
zerocopy if it would result in a > MAX_SKB_FRAGS skb. Fixes from
Jason Wang.
10) On receive, once we've decoded the VLAN state completely, clear
skb->vlan_tci. Otherwise demuxed tunnels underneath can trigger
the VLAN code again, corrupting the packet. Fix from Eric
Dumazet"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
vlan: fix a race in egress prio management
vlan: mask vlan prio bits
macvtap: do not zerocopy if iov needs more pages than MAX_SKB_FRAGS
tuntap: do not zerocopy if iov needs more pages than MAX_SKB_FRAGS
pkt_sched: sch_qfq: remove a source of high packet delay/jitter
xen-netfront: pull on receive skb may need to happen earlier
vxlan: add necessary locking on device removal
hyperv: Fix the NETIF_F_SG flag setting in netvsc
net: Fix sysfs_format_mac() code duplication.
be2net: Fix to avoid hardware workaround when not needed
macvtap: do not assume 802.1Q when send vlan packets
macvtap: fix the missing ret value of TUNSETQUEUE
ipv4: set transport header earlier
mlx5 core: Fix __udivdi3 when compiling for 32 bit arches
bgmac: add dependency to phylib
net/irda: fixed style issues in irlan_eth
ethtool: fixed trailing statements in ethtool
ndisc: bool initializations should use true and false
atl1e: unmap partially mapped skb on dma error and free skb
egress_priority_map[] hash table updates are protected by rtnl,
and we never remove elements until device is dismantled.
We have to make sure that before inserting an new element in hash table,
all its fields are committed to memory or else another cpu could
find corrupt values and crash.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit 48cc32d38a
("vlan: don't deliver frames for unknown vlans to protocols")
Florian made sure we set pkt_type to PACKET_OTHERHOST
if the vlan id is set and we could find a vlan device for this
particular id.
But we also have a problem if prio bits are set.
Steinar reported an issue on a router receiving IPv6 frames with a
vlan tag of 4000 (id 0, prio 2), and tunneled into a sit device,
because skb->vlan_tci is set.
Forwarded frame is completely corrupted : We can see (8100:4000)
being inserted in the middle of IPv6 source address :
16:48:00.780413 IP6 2001:16d8:8100:4000:ee1c:0:9d9:bc87 >
9f94:4d95:2001:67c:29f4::: ICMP6, unknown icmp6 type (0), length 64
0x0000: 0000 0029 8000 c7c3 7103 0001 a0ae e651
0x0010: 0000 0000 ccce 0b00 0000 0000 1011 1213
0x0020: 1415 1617 1819 1a1b 1c1d 1e1f 2021 2223
0x0030: 2425 2627 2829 2a2b 2c2d 2e2f 3031 3233
It seems we are not really ready to properly cope with this right now.
We can probably do better in future kernels :
vlan_get_ingress_priority() should be a netdev property instead of
a per vlan_dev one.
For stable kernels, lets clear vlan_tci to fix the bugs.
Reported-by: Steinar H. Gunderson <sesse@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
QFQ+ inherits from QFQ a design choice that may cause a high packet
delay/jitter and a severe short-term unfairness. As QFQ, QFQ+ uses a
special quantity, the system virtual time, to track the service
provided by the ideal system it approximates. When a packet is
dequeued, this quantity must be incremented by the size of the packet,
divided by the sum of the weights of the aggregates waiting to be
served. Tracking this sum correctly is a non-trivial task, because, to
preserve tight service guarantees, the decrement of this sum must be
delayed in a special way [1]: this sum can be decremented only after
that its value would decrease also in the ideal system approximated by
QFQ+. For efficiency, QFQ+ keeps track only of the 'instantaneous'
weight sum, increased and decreased immediately as the weight of an
aggregate changes, and as an aggregate is created or destroyed (which,
in its turn, happens as a consequence of some class being
created/destroyed/changed). However, to avoid the problems caused to
service guarantees by these immediate decreases, QFQ+ increments the
system virtual time using the maximum value allowed for the weight
sum, 2^10, in place of the dynamic, instantaneous value. The
instantaneous value of the weight sum is used only to check whether a
request of weight increase or a class creation can be satisfied.
Unfortunately, the problems caused by this choice are worse than the
temporary degradation of the service guarantees that may occur, when a
class is changed or destroyed, if the instantaneous value of the
weight sum was used to update the system virtual time. In fact, the
fraction of the link bandwidth guaranteed by QFQ+ to each aggregate is
equal to the ratio between the weight of the aggregate and the sum of
the weights of the competing aggregates. The packet delay guaranteed
to the aggregate is instead inversely proportional to the guaranteed
bandwidth. By using the maximum possible value, and not the actual
value of the weight sum, QFQ+ provides each aggregate with the worst
possible service guarantees, and not with service guarantees related
to the actual set of competing aggregates. To see the consequences of
this fact, consider the following simple example.
Suppose that only the following aggregates are backlogged, i.e., that
only the classes in the following aggregates have packets to transmit:
one aggregate with weight 10, say A, and ten aggregates with weight 1,
say B1, B2, ..., B10. In particular, suppose that these aggregates are
always backlogged. Given the weight distribution, the smoothest and
fairest service order would be:
A B1 A B2 A B3 A B4 A B5 A B6 A B7 A B8 A B9 A B10 A B1 A B2 ...
QFQ+ would provide exactly this optimal service if it used the actual
value for the weight sum instead of the maximum possible value, i.e.,
11 instead of 2^10. In contrast, since QFQ+ uses the latter value, it
serves aggregates as follows (easy to prove and to reproduce
experimentally):
A B1 B2 B3 B4 B5 B6 B7 B8 B9 B10 A A A A A A A A A A B1 B2 ... B10 A A ...
By replacing 10 with N in the above example, and by increasing N, one
can increase at will the maximum packet delay and the jitter
experienced by the classes in aggregate A.
This patch addresses this issue by just using the above
'instantaneous' value of the weight sum, instead of the maximum
possible value, when updating the system virtual time. After the
instantaneous weight sum is decreased, QFQ+ may deviate from the ideal
service for a time interval in the order of the time to serve one
maximum-size packet for each backlogged class. The worst-case extent
of the deviation exhibited by QFQ+ during this time interval [1] is
basically the same as of the deviation described above (but, without
this patch, QFQ+ suffers from such a deviation all the time). Finally,
this patch modifies the comment to the function qfq_slot_insert, to
make it coherent with the fact that the weight sum used by QFQ+ can
now be lower than the maximum possible value.
[1] P. Valente, "Extending WF2Q+ to support a dynamic traffic mix",
Proceedings of AAA-IDEA'05, June 2005.
Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull phase two of __cpuinit removal from Paul Gortmaker:
"With the __cpuinit infrastructure removed earlier, this group of
commits only removes the function/data tagging that was done with the
various (now no-op) __cpuinit related prefixes.
Now that the dust has settled with yesterday's v3.11-rc1, there
hopefully shouldn't be any new users leaking back in tree, but I think
we can leave the harmless no-op stubs there for a release as a
courtesy to those who still have out of tree stuff and weren't paying
attention.
Although the commits are against the recent tag to allow for minor
context refreshes for things like yesterday's v3.11-rc1~ slab content,
the patches have been largely unchanged for weeks, aside from such
trivial updates.
For detail junkies, the largely boring and mostly irrelevant history
of the patches can be viewed at:
http://git.kernel.org/cgit/linux/kernel/git/paulg/cpuinit-delete.git
If nothing else, I guess it does at least demonstrate the level of
involvement required to shepherd such a treewide change to completion.
This is the same repository of patches that has been applied to the
end of the daily linux-next branches for the past several weeks"
* 'cpuinit_phase2' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (28 commits)
block: delete __cpuinit usage from all block files
drivers: delete __cpuinit usage from all remaining drivers files
kernel: delete __cpuinit usage from all core kernel files
rcu: delete __cpuinit usage from all rcu files
net: delete __cpuinit usage from all net files
acpi: delete __cpuinit usage from all acpi files
hwmon: delete __cpuinit usage from all hwmon files
cpufreq: delete __cpuinit usage from all cpufreq files
clocksource+irqchip: delete __cpuinit usage from all related files
x86: delete __cpuinit usage from all x86 files
score: delete __cpuinit usage from all score files
xtensa: delete __cpuinit usage from all xtensa files
openrisc: delete __cpuinit usage from all openrisc files
m32r: delete __cpuinit usage from all m32r files
hexagon: delete __cpuinit usage from all hexagon files
frv: delete __cpuinit usage from all frv files
cris: delete __cpuinit usage from all cris files
metag: delete __cpuinit usage from all metag files
tile: delete __cpuinit usage from all tile files
sh: delete __cpuinit usage from all sh files
...
Pull nfsd bugfixes from Bruce Fields:
"Just three minor bugfixes"
* 'for-3.11' of git://linux-nfs.org/~bfields/linux:
svcrdma: underflow issue in decode_write_list()
nfsd4: fix minorversion support interface
lockd: protect nlm_blocked access in nlmsvc_retry_blocked
It's just a duplicate implementation of "%*phC". Thanks to Joe
Perches for showing that we had exactly this support in the
lib/vsprintf.c code already.
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 45f00f99d6 ("ipv4: tcp: clean up tcp_v4_early_demux()") added a
performance regression for non GRO traffic, basically disabling
IP early demux.
IPv6 stack resets transport header in ip6_rcv() before calling
IP early demux in ip6_rcv_finish(), while IPv4 does this only in
ip_local_deliver_finish(), _after_ IP early demux.
GRO traffic happened to enable IP early demux because transport header
is also set in inet_gro_receive()
Instead of reverting the faulty commit, we can make IPv4/IPv6 behave the
same : transport_header should be set in ip_rcv() instead of
ip_local_deliver_finish()
ip_local_deliver_finish() can also use skb_network_header_len() which is
faster than ip_hdrlen()
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>