CCID3 performance depends much on the accuracy of RTT samples. If RTT
samples grow too large, performance can be catastrophically poor.
To limit the amount of possible damage in such cases, the patch
* introduces an upper limit which identifies a maximum `sane' RTT value;
* uses a macro to enforce this upper limit.
Using a macro was given preference, since it is necessary to identify the
calling function in the warning message. Since exceeding this threshold
identifies a critical condition, DCCP_CRIT is used and not DCCP_WARN.
Many thanks to Ian McDonald for collaboration on this issue.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
In both the sender and the receiver it is possible that the stored
RTT value is accessed before an actual RTT estimate has been computed.
This patch
* initialises the sender RTT to 0
- the sender always accesses the RTT in ccid3_hc_tx_packet_sent
- the RTT is further needed for the window counter algorithm
* replaces the receiver initialisation of 5msec with 0
- which has the same effect and removes an `XXX'
- the RTT value is needed in ccid3_hc_rx_packet_recv as rtt_prev
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
The function ccid3_hc_tx_insert_options only does a redundant no-op,
as the operation
DCCP_SKB_CB(skb)->dccpd_ccval = hctx->ccid3hctx_last_win_count;
is already performed _unconditionally_ in ccid3_hc_tx_send_packet.
Since there is further no current need for this function, it is removed
entirely. Since furthermore, there is actually no present need for the
entire interface function ccid_hc_tx_insert_options, it was decided to
remove it also, to clean up the interface.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This adds a (debug) warning message which is triggered whenever a packet is
discarded due to send failure.
It also adds a conditional, so that an interruption during dccp_wait_for_ccid
is not treated as a `BUG': the rationale is that interruptions are external,
whereas bug warnings are concerned with the internals.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This is an optimisation to reduce CPU load. The received feedback is now
only directed to the active CCID component, without requiring processing
also by the inactive one.
As a consequence, a similar test in ccid3.c is now redundant and is
also removed.
Justification:
Currently DCCP works as a unidirectional service, i.e. a listening server
is not at the same time a connecting client.
As far as I can see, several modifications are necessary until that
becomes possible.
At the present time, received feedback is both fed to the rx/tx CCID
modules. In unidirectional service, only one of these is active at any
one time.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
In migrating towards using the newer functions scaled_div/scaled_div32
for TFRC computations mapped from floating-point onto integer arithmetic,
this completes the last stage of modifications.
In particular, the overflow case for computing X_calc is circumvented by
* breaking the computation into two stages
* the first stage, res = (s*1E6)/R, cannot overflow due to use of u64
* in the second stage, res = (res*1E6)/f, overflow on u32 is avoided due
to (i) returning UINT_MAX in this case (which is logically appropriate)
and (ii) issuing a warning message into the system log (since very likely
there is a problem somewhere else with the parameters)
Lastly, all such scaling operations are now exported into tfrc.h, since
actually this form of scaled computation is specific to TFRC and not to CCID3.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Problem:
Most target types in the CCID3 code are u32, so subtle conversion errors
can occur if signed time calculations yield negative results: the original
values are lost in the conversion to unsigned, calculation errors go undetected.
This patch therefore
* sets all critical time types from unsigned to suseconds_t
* avoids comparison between signed/unsigned via type-casting
* provides ample warning messages in case time calculations are negative
These warning messages can be removed at a later stage when the code
has undergone more testing.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This simplifies the calculation of a value p for a given fval when the
first loss interval is computed (RFC 3448, 6.3.1). It makes use of the
two new functions scaled_div/scaled_div32 to provide overflow protection.
Additionally, protection against divide-by-zero is extended - in this
case the function will return the maximally possible value of p=100%.
Background:
The maximum fval, f(100%), is approximately 244, i.e. the scaled value of fval
should never exceed 244E6, which fits easily into u32. The problem is the scaling
by 10^6, since additionally R(TT) is in microseconds.
This is resolved by breaking the division into two stages: the first stage
computes fval=(s*10^6)/R, stores that into u64; the second stage computes
fval = (fval*10^6)/X_recv and complains if overflow is reached for u32.
This case is safe since the TFRC reverse-lookup routine then returns p=100%.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This replaces the remaining uses of usecs_div with scaled_div32, which
internally uses 64bit division and produces a warning on overflow.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This patch
* resolves a bug where packets smaller than 32/64 bytes resulted in sending rates of 0
* supports all sending rates from 1/64 bytes/second up to 4Gbyte/second
* simplifies the present overflow problems in calculations
Current sending rate X and the cached value X_recv of the receiver-estimated
sending rate are both scaled by 64 (2^6) in order to
* cope with low sending rates (minimally 1 byte/second)
* allow upgrading to use a packets-per-second implementation of CCID 3
* avoid calculation errors due to integer arithmetic cut-off
The patch implements a revised strategy from
http://www.mail-archive.com/dccp@vger.kernel.org/msg01040.html
The only difference with regard to that strategy is that t_ipi is already
used in the calculation of the nofeedback timeout, which saves one division.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This fixes
1) a bug in the recomputation of the sending rate by the nofeedback
timer when no feedback at all has so far been sent by the receiver:
min_t was used instead of max_t, which is wrong (cf. RFC 3448, p. 10);
2) an error in the computation of larger initial windows: instead of
min(... max()) (cf. RFC 4342, 5.), the code had used max(... max()).
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This performs two optimisations for the recomputation of the sending rate.
1) Currently the target sending rate X_calc is recalculated whenever
a) the nofeedback timer expires, or
b) a feedback packet is received.
In the (a) case, recomputing X_calc is redundant, since
* the parameters p and RTT do not change in between the
reception of feedback packets;
* the parameter X_recv is either modified from received
feedback or via the nofeedback timer;
* a test (`p == 0') in the nofeedback timer avoids using
a stale/undefined value of X_calc if p was previously 0.
2) The nofeedback timer now only recomputes a timestamp when p == 0.
This is according to step (4) of [RFC 3448, 4.3] and avoids
unnecessarily determining a timestamp.
A debug statement about not updating X is also removed - it helps very
little in debugging and just clutters the logs.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This patch follows a suggestion by Ian McDonald and ensures that in
the current code the value of p can not exceed 100%. Such a value is
illegal and would consequently cause a bug condition in tfrc_calc_x().
The receiver case is also tested, and a warning message is added.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
It simplifies waiting for the CCID module to signal that a packet
is ready to be sent. Other simplifications flow on from this such as
removing constants.
As a result of this EAGAIN is not returned any more by dccp_wait_for_ccid
(which would otherwise lead to unnecessarily discarding the packet in
dccp_write_xmit).
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Fix ieee80211-softmac compile problem where it's using schedule_work() on a
delayed_work struct.
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: "John W. Linville" <linville@tuxdriver.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- turn intermediate classes into leaves again when their
last child is deleted (struct htb_class changed)
Signed-off-by: Jarek Poplawski <jarkao2@o2.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
- deactivating of active classes when q.qlen drops to zero
(cbq_drop)
- a redundant instruction removed from cbq_deactivate_class
PS: probably htb_deactivate in htb_delete and
cbq_deactivate_class in cbq_delete are also
redundant now.
Signed-off-by: Jarek Poplawski <jarkao2@o2.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Back in 2.4 arp requests that were recevied by netpoll were processed
in netconsole_receive_skb, where they were responded to using the src
mac of the request sender. In the 2.6 kernel arp_reply is responsible
for this function, but instead of using the src mac address of the
incomming request, the stored mac address that was registered for the
netconsole application is used. While this is usually ok, it can lead
to failures in netpoll in some situations (specifically situations
where a network may have two gateways, as arp requests from one may be
responded to using the mac address of the other). This patch reverts
the behavior to what we had in 2.4, in which all arp requests are sent
back using the src address of the request sender.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Chris Lalancette <clalance@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Only the callsign but not the SSID part of an AX.25 address is ASCII
based but Linux by initializes the SSID which should be just a 4-bit
number from ASCII anyway.
Fix that and convert the code to use a shared constant for both default
addresses. While at it, use the same style for null_ax25_address also.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The hard header cache is in the main output path, so using
seqlock instead of reader/writer lock should reduce overhead.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is the grungy swap all the occurrences in the right places patch that
goes with the updates. At this point we have the same functionality as
before (except that sgttyb() returns speeds not zero) and are ready to
begin turning new stuff on providing nobody reports lots of bugs
If you are a tty driver author converting an out of tree driver the only
impact should be termios->ktermios name changes for the speed/property
setting functions from your upper layers.
If you are implementing your own TCGETS function before then your driver
was broken already and its about to get a whole lot more painful for you so
please fix it 8)
Also fill in c_ispeed/ospeed on init for most devices, although the current
code will do this for you anyway but I'd like eventually to lose that extra
paranoia
[akpm@osdl.org: bluetooth fix]
[mp3@de.ibm.com: sclp fix]
[mp3@de.ibm.com: warning fix for tty3270]
[hugh@veritas.com: fix tty_ioctl powerpc build]
[jdike@addtoit.com: uml: fix ->set_termios declaration]
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Martin Peschke <mp3@de.ibm.com>
Acked-by: Peter Oberparleiter <oberpar@de.ibm.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This is the core of the switch to the new framework. I've split it from the
driver patches which are mostly search/replace and would encourage people to
give this one a good hard stare.
The references to BOTHER and ISHIFT are the termios values that must be
defined by a platform once it wants to turn on "new style" ioctl support. The
code patches here ensure that providing
1. The termios overlays the ktermios in memory
2. The only new kernel only fields are c_ispeed/c_ospeed (or none)
the existing behaviour is retained. This is true for the patches at this
point in time.
Future patches will define BOTHER, ISHIFT and enable newer termios structures
for each architecture, and once they are all done some of the ifdefs also
vanish.
[akpm@osdl.org: warning fix]
[akpm@osdl.org: IRDA fix]
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
We currently insert socket dentries into the global dentry hashtable. This
is suboptimal because there is currently no way these entries can be used
for a lookup(). (/proc/xxx/fd/xxx uses a different mechanism). Inserting
them in dentry hashtable slows dcache lookups.
To let __dpath() still work correctly (ie not adding a " (deleted)") after
dentry name, we do :
- Right after d_alloc(), pretend they are hashed by clearing the
DCACHE_UNHASHED bit.
- Call d_instantiate() instead of d_add() : dentry is not inserted in
hash table.
__dpath() & friends work as intended during dentry lifetime.
- At dismantle time, once dput() must clear the dentry, setting again
DCACHE_UNHASHED bit inside the custom d_delete() function provided by
socket code, so that dput() can just kill_it.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There was lots of #ifdef noise in the kernel due to hotcpu_notifier(fn,
prio) not correctly marking 'fn' as used in the !HOTPLUG_CPU case, and thus
generating compiler warnings of unused symbols, hence forcing people to add
#ifdefs.
the compiler can skip truly unused functions just fine:
text data bss dec hex filename
1624412 728710 3674856 6027978 5bfaca vmlinux.before
1624412 728710 3674856 6027978 5bfaca vmlinux.after
[akpm@osdl.org: topology.c fix]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Name some of the remaning 'old_style_spin_init' locks
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Stick NFS sockets in their own class to avoid some lockdep warnings. NFS
sockets are never exposed to user-space, and will hence not trigger certain
code paths that would otherwise pose deadlock scenarios.
[akpm@osdl.org: cleanups]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Steven Dickson <SteveD@redhat.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Acked-by: Neil Brown <neilb@suse.de>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
[ Fixed patch corruption by quilt, pointed out by Peter Zijlstra ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Move process freezing functions from include/linux/sched.h to freezer.h, so
that modifications to the freezer or the kernel configuration don't require
recompiling just about everything.
[akpm@osdl.org: fix ueagle driver]
Signed-off-by: Nigel Cunningham <nigel@suspend2.net>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Replace all uses of kmem_cache_t with struct kmem_cache.
The patch was generated using the following script:
#!/bin/sh
#
# Replace one string by another in all the kernel sources.
#
set -e
for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do
quilt add $file
sed -e "1,\$s/$1/$2/g" $file >/tmp/$$
mv /tmp/$$ $file
quilt refresh
done
The script was run like this
sh replace kmem_cache_t "struct kmem_cache"
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
SLAB_KERNEL is an alias of GFP_KERNEL.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
SLAB_ATOMIC is an alias of GFP_ATOMIC
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The patch (as824b) makes percpu_free() ignore NULL arguments, as one would
expect for a deallocation routine. (Note that free_percpu is #defined as
percpu_free in include/linux/percpu.h.) A few callers are updated to remove
now-unneeded tests for NULL. A few other callers already seem to assume
that passing a NULL pointer to percpu_free() is okay!
The patch also removes an unnecessary NULL check in percpu_depopulate().
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Node-aware allocation of skbs for the receive path.
Details:
- __alloc_skb gets a new node argument and cals the node-aware
slab functions with it.
- netdev_alloc_skb passed the node number it gets from dev_to_node
to it, everyone else passes -1 (any node)
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Christoph Lameter <clameter@engr.sgi.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix non-ANSI function declaration:
net/netfilter/nf_conntrack_core.c:1096:25: warning: non-ANSI function declaration of function 'nf_conntrack_flush'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As per Ralf Baechle's observations, the schedule_work() call
should give enough of a memory barrier, so the explicit one
here is totally unnecessary.
Signed-off-by: David S. Miller <davem@davemloft.net>
I believe all the below memory barriers only matter on SMP so
therefore the smp_* variant of the barrier should be used.
I'm wondering if the barrier in net/ipv4/inet_timewait_sock.c should be
dropped entirely. schedule_work's implementation currently implies a
memory barrier and I think sane semantics of schedule_work() should imply
a memory barrier, as needed so the caller shouldn't have to worry.
It's not quite obvious why the barrier in net/packet/af_packet.c is
needed; maybe it should be implied through flush_dcache_page?
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We grab a reference to the route's inetpeer entry but
forget to release it in xfrm4_dst_destroy().
Bug discovered by Kazunori MIYAZAWA <kazunori@miyazawa.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Disables auditing in ipsec when CONFIG_AUDITSYSCALL is
disabled in the kernel.
Also includes a bug fix for xfrm_state.c as a result of
original ipsec audit patch.
Signed-off-by: Joy Latten <latten@austin.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
An audit message occurs when an ipsec SA
or ipsec policy is created/deleted.
Signed-off-by: Joy Latten <latten@austin.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We must reserve SAR + MAX_HEADER bytes for IrLMP to fit in.
Patch from Jeet Chaudhuri <jeetlinux@yahoo.co.in>
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The command flags for dump and do were swapped..
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
When user builds IPv6 header and send it through raw socket, kernel
tries to release unlocked sock. (Kernel log shows
"BUG: bad unlock balance detected" with enabled debug option.)
The lock is held only for non-hdrincl sock in this function
then this patch fix to do nothing about lock for hdrincl one.
Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The commit "[IPV6]: Use kmemdup" (commit-id:
af879cc704) broke IPv6 fragments.
Bug was spotted by Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the first fw classifier is initialized, there is a small window
between the ->init() and ->change() calls, during which the classifier
is active but not entirely set up and tp->root is still NULL (->init()
does nothing).
When a packet is queued during this window a NULL pointer dereference
occurs in fw_classify() when trying to dereference head->mask;
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The attached patch resolves an issue where a IP DNATed packet with a
martian source is forwarded while it's better to drop it. It also
resolves messages complaining about ip forwarding being disabled while
it's actually enabled. Thanks to lepton <ytht.net@gmail.com> for
reporting this problem.
This is probably a candidate for the -stable release.
Signed-off-by: Bart De Schuymer <bdschuym@pandora.be>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The original code continues loop to find expectation in list if the master
conntrack of the found expectation is unconfirmed. But it never success
in that case, because nf_conntrack_expect_related() never insert
clashed expectation to the list.
This stops loop in that case.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
In compat mode, matches and targets valid hooks checks always successful due
to not initialized e->comefrom field yet. This patch separates this checks from
translation code and moves them after mark_source_chains() call, where these
marks are initialized.
Signed-off-by: Dmitry Mishin <dim@openvz.org>
Signed-off-by; Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 590bdf7fd2 introduced a regression
in match/target hook validation. mark_source_chains builds a bitmask
for each rule representing the hooks it can be reached from, which is
then used by the matches and targets to make sure they are only called
from valid hooks. The patch moved the match/target specific validation
before the mark_source_chains call, at which point the mask is always zero.
This patch returns back to the old order and moves the standard checks
to mark_source_chains. This allows to get rid of a special case for
standard targets as a nice side-effect.
Signed-off-by: Dmitry Mishin <dim@openvz.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This change optimizes the dumping of Security policies.
1) Before this change ..
speedopolis:~# time ./ip xf pol
real 0m22.274s
user 0m0.000s
sys 0m22.269s
2) Turn off sub-policies
speedopolis:~# ./ip xf pol
real 0m13.496s
user 0m0.000s
sys 0m13.493s
i suppose the above is to be expected
3) With this change ..
speedopolis:~# time ./ip x policy
real 0m7.901s
user 0m0.008s
sys 0m7.896s
Currently the behaviour of disable_xfrm is inconsistent between
locally generated and forwarded packets. For locally generated
packets disable_xfrm disables the policy lookup if it is set on
the output device, for forwarded traffic however it looks at the
input device. This makes it impossible to disable xfrm on all
devices but a dummy device and use normal routing to direct
traffic to that device.
Always use the output device when checking disable_xfrm.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch moves command capabilities to command flags. Other than
being cleaner, saves several bytes.
We increment the nlctrl version so as to signal to user space that
to not expect the attributes. We will try to be careful
not to do this too often ;->
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
These appear to be deprecated. Removing them also gets rid of some sparse
noise.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Clean-up:
The RPC client currently creates some sysctls that are specific to the
socket transport. Move those entirely into xprtsock.c.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Over time we will want to add some specific init and cleanup logic for the
xprtsock implementation. Add stub routines for initialization and exit
processing.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Clean-up: hch suggested that the RPC client shouldn't pollute the name
space used by the generic skb manipulation routines in net/core/skbuff.c.
Rename a couple of types in xdr.h to adhere to this convention.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Clean-up: eliminate xs_tcp_copy_data -- it's exactly the same logic as the
common routine skb_read_bits. The UDP and TCP socket read code now share
the same routine for copying data into an xdr_buf.
Now that skb_read_bits() is exported, rename it to avoid confusing it with
a generic skb_* function. As these functions are XDR-specific, they should
not have names that suggest they are of generic use. Also rename
skb_read_and_csum_bits() to be consistent.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
For now we will assume that all transports will use the address format
buffers in the rpc_xprt struct to store their addresses. Change
rpc_peer2str() to be a generic routine to handle this, and get rid of the
print_address() op in the rpc_xprt_ops vector.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Move the three fields for saving socket callback functions out of the
rpc_xprt structure and into a private data structure maintained in
net/sunrpc/xprtsock.c.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Move the socket-specific buffer size parameters for UDP sockets to a
private data structure maintained in net/sunrpc/xprtsock.c.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Move the socket-specific connection management fields out of the generic
rpc_xprt structure into a private data structure maintained in
net/sunrpc/xprtsock.c.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Move "XPRT_LAST_FRAG" and friends from xprt.h into xprtsock.c, and rename
them to use the naming scheme in use in xprtsock.c.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Move the TCP receive state variables from the generic rpc_xprt structure to
a private structure maintained inside net/sunrpc/xprtsock.c.
Also rename a function/variable pair to refer to RPC fragment headers
instead of record markers, to be consistent with types defined in
sunrpc/*.h.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The "sock" and "inet" fields are socket-specific. Move them to a private
data structure maintained entirely within net/sunrpc/xprtsock.c
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
When setting up a new transport instance, allocate enough memory for an
rpc_xprt and a private area. As part of the same memory allocation, it
will be easy to find one, given a pointer to the other.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
We're currently not actually using seed or seed_init.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The sealalg is checked in several places, giving the impression it could be
either SEAL_ALG_NONE or SEAL_ALG_DES. But in fact SEAL_ALG_NONE seems to
be sufficient only for making mic's, and all the contexts we get must be
capable of wrapping as well. So the sealalg must be SEAL_ALG_DES. As
with signalg, just check for the right value on the downcall and ignore it
otherwise. Similarly, tighten expectations for the sealalg on incoming
tokens, in case we do support other values eventually.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Remove some unnecessary goto labels; clean up some return values; etc.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
We're doing some pointless translation between krb5 constants and kernel
crypto string names.
Also clean up some related spkm3 code as necessary.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Previous changes reveal some obvious cruft.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
We also only ever receive one value of the signalg, so let's not pretend
otherwise
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
We designed the krb5 context import without completely understanding the
context. Now it's clear that there are a number of fields that we ignore,
or that we depend on having one single value.
In particular, we only support one value of signalg currently; so let's
check the signalg field in the downcall (in case we decide there's
something else we could support here eventually), but ignore it otherwise.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This updates the spkm3 code to bring it up to date with our current
understanding of the spkm3 spec.
In doing so, we're changing the downcall format used by gssd in the spkm3 case,
which will cause an incompatilibity with old userland spkm3 support. Since the
old code a) didn't implement the protocol correctly, and b) was never
distributed except in the form of some experimental patches from the citi web
site, we're assuming this is OK.
We do detect the old downcall format and print warning (and fail). We also
include a version number in the new downcall format, to be used in the
future in case any further change is required.
In some more detail:
- fix integrity support
- removed dependency on NIDs. instead OIDs are used
- known OID values for algorithms added.
- fixed some context fields and types
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Since process_xdr_buf() is useful outside of the kerberos-specific code, we
move it to net/sunrpc/xdr.c, export it, and rename it in keeping with xdr_*
naming convention of xdr.c.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This code is never called from interrupt context; it's always run by either
a user thread or rpciod. So KM_SKB_SUNRPC_DATA is inappropriate here.
Thanks to Aimé Le Rouzic for capturing an oops which showed the kernel
taking an interrupt while we were in this piece of code, resulting in a
nested kmap_atomic(.,KM_SKB_SUNRPC_DATA) call from
xdr_partial_copy_from_skb().
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Dumping all this data to the logs is wasteful (even when debugging is turned
off), and creates too much output to be useful when it's turned on.
Fix a minor style bug or two while we're at it.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Don't wake up bind waiters if a task finds that another task is already
trying to bind.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Change the location where the rpc_xprt structure is allocated so each
transport implementation can allocate a private area from the same
chunk of memory.
Note also that xprt->ops->destroy, rather than xprt_destroy, is now
responsible for freeing rpc_xprt when the transport is destroyed.
Test plan:
Connectathon.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
All internal RPC client operations should no longer depend on the BKL,
however lockd and NFS callbacks may still require it.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Use RCU to ensure that we can safely call rpc_finish_wakeup after we've
called __rpc_do_wake_up_task. If not, there is a theoretical race, in which
the rpc_task finishes executing, and gets freed first.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The sunrpc scheduler contains a race condition that can let an RPC
task end up being neither running nor on any wait queue. The race takes
place between rpc_make_runnable (called from rpc_wake_up_task) and
__rpc_execute under the following condition:
First __rpc_execute calls tk_action which puts the task on some wait
queue. The task is dequeued by another process before __rpc_execute
continues its execution. While executing rpc_make_runnable exactly after
setting the task `running' bit and before clearing the `queued' bit
__rpc_execute picks up execution, clears `running' and subsequently
both functions fall through, both under the false assumption somebody
else took the job.
Swapping rpc_test_and_set_running with rpc_clear_queued in
rpc_make_runnable fixes that hole. This introduces another possible
race condition that can be handled by checking for `queued' after
setting the `running' bit.
Bug noticed on a 4-way x86_64 system under XEN with an NFSv4 server
on the same physical machine, apparently one of the few ways to hit
this race condition at all.
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Christophe Saout <christophe@saout.de>
Signed-off-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Routine ieee80211softmac_wx_set_mlme has one return that fails
to release a mutex acquired at entry.
Signed-off-by: Maxime Austruy <maxime@tralhalla.org>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
In 2.6.19 a deauthentication from the AP doesn't start a
reassociation by the softmac code. It appears that
mac->associnfo.associating must be set and the
ieee80211softmac_assoc_work function must be scheduled. This patch
fixes that.
Signed-off-by: Ulrich Kunitz <kune@deine-taler.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Conflicts:
drivers/ata/libata-scsi.c
include/linux/libata.h
Futher merge of Linus's head and compilation fixups.
Signed-Off-By: David Howells <dhowells@redhat.com>
Conflicts:
drivers/infiniband/core/iwcm.c
drivers/net/chelsio/cxgb2.c
drivers/net/wireless/bcm43xx/bcm43xx_main.c
drivers/net/wireless/prism54/islpci_eth.c
drivers/usb/core/hub.h
drivers/usb/input/hid-core.c
net/core/netpoll.c
Fix up merge failures with Linus's head and fix new compilation failures.
Signed-Off-By: David Howells <dhowells@redhat.com>
Since we never checked the ->family value of templates
before, many applications simply leave it at zero.
Detect this and fix it up to be the pol->family value.
Also, do not clobber xp->family while reading in templates,
that is not necessary.
Signed-off-by: David S. Miller <davem@davemloft.net>
This replaces the linear search algorithm for reverse lookup with
binary search.
It has the advantage of better scalability: O(log2(N)) instead of O(N).
This means that the average number of iterations is reduced from 250
(linear search if each value appears equally likely) down to at most 9.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This patch deprecates the existing use of an arbitrary value TFRC_SMALLEST_P
for low-threshold values of p. This avoids masking low-resolution errors.
Instead, the code now checks against real boundaries (implemented by preceding
patch) and provides warnings whenever a real value falls below the threshold.
If such messages are observed, it is a better solution to take this as an
indication that the lookup table needs to be re-engineered.
Changelog:
----------
This patch
* makes handling all TFRC resolution errors local to the TFRC library
* removes unnecessary test whether X_calc is 'infinity' due to p==0 -- this
condition is already caught by tfrc_calc_x()
* removes setting ccid3hctx_p = TFRC_SMALLEST_P in ccid3_hc_tx_packet_recv
since this is now done by the TFRC library
* updates BUG_ON test in ccid3_hc_tx_no_feedback_timer to take into account
that p now is either 0 (and then X_calc is irrelevant), or it is > 0; since
the handling of TFRC_SMALLEST_P is now taken care of in the tfrc library
Justification:
--------------
The TFRC code uses a lookup table which has a bounded resolution.
The lowest possible value of the loss event rate `p' which can be
resolved is currently 0.0001. Substituting this lower threshold for
p when p is less than 0.0001 results in a huge, exponentially-growing
error. The error can be computed by the following formula:
(f(0.0001) - f(p))/f(p) * 100 for p < 0.0001
Currently the solution is to use an (arbitrary) value
TFRC_SMALLEST_P = 40 * 1E-6 = 0.00004
and to consider all values below this value as `virtually zero'. Due to
the exponentially growing resolution error, this is not a good idea, since
it hides the fact that the table can not resolve practically occurring cases.
Already at p == TFRC_SMALLEST_P, the error is as high as 58.19%!
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This
* adds documentation about the lowest resolution that is possible within
the bounds of the current lookup table
* defines a constant TFRC_SMALLEST_P which defines this resolution
* issues a warning if a given value of p is below resolution
* combines two previously adjacent if-blocks of nearly identical
structure into one
This patch does not change the algorithm as such.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
1) For the forward X_calc lookup, it
* protects effectively against RTT=0 (this case is possible), by
returning the maximal lookup value instead of just setting it to 1
* reformulates the array-bounds exceeded condition: this only happens
if p is greater than 1E6 (due to the scaling)
* the case of negative indices can now with certainty be excluded,
since documentation shows that the formulas are within bounds
* additional protection against p = 0 (would give divide-by-zero)
2) For the reverse lookup, it warns against
* protects against exceeding array bounds
* now returns 0 if f(p) = 0, due to function definition
* warns about minimal resolution error and returns the smallest table
value instead of p=0 [this would mask congestion conditions]
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This fixes the following small error in tfrc_calc_x_reverse_lookup.
1) The table is generated by the following equations:
lookup[index][0] = g((index+1) * 1000000/TFRC_CALC_X_ARRSIZE);
lookup[index][1] = g((index+1) * TFRC_CALC_X_SPLIT/TFRC_CALC_X_ARRSIZE);
where g(q) is 1E6 * f(q/1E6)
2) The reverse lookup assigns an entry in lookup[index][small]
3) This index needs to match the above, i.e.
* if small=0 then
p = (index + 1) * 1000000/TFRC_CALC_X_ARRSIZE
* if small=1 then
p = (index+1) * TFRC_CALC_X_SPLIT/TFRC_CALC_X_ARRSIZE
These are exactly the changes that the patch makes; previously the code did
not conform to the way the lookup table was generated (this difference resulted
in a mean error of about 1.12%).
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This adds documentation for the TCP Reno throughput equation which is at
the heart of the TFRC sending rate / loss rate calculations.
It spells out precisely how the values were determined and what they mean.
The equations were derived through reverse engineering and found to be
fully accurate (verified using test programs).
This patch does not change any code.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This avoids a (harmless) warning message being printed at the DCCP server
(the receiver of a DCCP half connection).
Incoming packets are both directed to
* ccid_hc_rx_packet_recv() for the server half
* ccid_hc_tx_packet_recv() for the client half
The message gets printed since on a server the client half is currently not
sending data packets.
This is resolved for the moment by checking the DCCP-role first. In future
times (bidirectional DCCP connections), this test may have to be more
sophisticated.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
The main object of this patch is the following bug:
==> In ccid3_hc_tx_packet_recv, the parameters p and X_recv were updated
_after_ the send rate was calculated. This is clearly an error and is
resolved by re-ordering statements.
In addition,
* r_sample is converted from u32 to long to check whether the time difference
was negative (it would otherwise be converted to a large u32 value)
* protection against RTT=0 (this is possible) is provided in a further patch
* t_elapsed is also converted to long, to match the type of r_sample
* adds a a more debugging information regarding current send rates
* various trivial comment/documentation updates
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This bug resulted in ccid3_hc_tx_send_packet returning negative
delay values, which in turn triggered silently dequeueing packets in
dccp_write_xmit. As a result, only a few out of the submitted packets made
it at all onto the network. Occasionally, when dccp_wait_for_ccid was
involved, this also triggered a bug warning since ccid3_hc_tx_send_packet
returned a negative value (which in reality was a negative delay value).
The cause for this bug lies in the comparison
if (delay >= hctx->ccid3hctx_delta)
return delay / 1000L;
The type of `delay' is `long', that of ccid3hctx_delta is `u32'. When comparing
negative long values against u32 values, the test returned `true' whenever delay
was smaller than 0 (meaning the packet was overdue to send).
The fix is by casting, subtracting, and then testing the difference with
regard to 0.
This has been tested and shown to work.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
The TFRC nofeedback timer normally expires after the maximum of 4
RTTs and twice the current send interval (RFC 3448, 4.3). On LANs
with a small RTT this can mean a high processing load and reduced
performance, since then the nofeedback timer is triggered very
frequently.
This patch provides a configuration option to set the bound for the
nofeedback timer, using as default 100 milliseconds.
By setting the configuration option to 0, strict RFC 3448 behaviour
can be enforced for the nofeedback timer.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
aevents can not uniquely identify an SA. We break the ABI with this
patch, but consensus is that since it is not yet utilized by any
(known) application then it is fine (better do it now than later).
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
To use ipv6_find_hdr(), IP6_NF_IPTABLES is necessary.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Binderman's icc logs:
net/rose/rose_route.c(399): remark #593: variable "err" was set but never used
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- move EXPORT_SYMBOL next to exported symbol
- use EXPORT_SYMBOL_GPL since this is what the original code used
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Also remove the references to "new connection tracking" from Kconfig.
After some short stabilization period of the new connection tracking
helpers/NAT code the old one will be removed.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add IPv4 and IPv6 capable nf_conntrack port of the TFTP conntrack/NAT helper.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add IPv4 and IPv6 capable nf_conntrack port of the SIP conntrack/NAT helper.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add nf_conntrack port of the PPtP conntrack/NAT helper. Since there seems
to be no IPv6-capable PPtP implementation the helper only support IPv4.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add nf_conntrack port of the NetBIOS name service conntrack helper.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add nf_conntrack port of the IRC conntrack/NAT helper. Since DCC doesn't
support IPv6 yet, the helper is still IPv4 only.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add IPv4 and IPv6 capable nf_conntrack port of the H.323 conntrack/NAT helper.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add IPv4 and IPv6 capable nf_conntrack port of the Amanda conntrack/NAT helper.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Expectation address masks need to be differently initialized depending
on the address family, create helper function to avoid cluttering up
the code too much.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add FTP NAT helper.
Split out from Jozsef's big nf_nat patch with a few small fixes by myself.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add NAT support for nf_conntrack. Joint work of Jozsef Kadlecsik,
Yasuyuki Kozakai, Martin Josefsson and myself.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Improve the connection tracking selection (well, the user experience,
not really the aesthetics) by offering one option to enable connection
tracking and a choice between the implementations.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some helpers (namely H.323) manually assign further helpers to expected
connections. This is not possible with nf_conntrack anymore since we
need to know whether a helper is used at allocation time.
Handle the helper assignment centrally, which allows to perform the
correct allocation and as a nice side effect eliminates the need
for the H.323 helper to fiddle with nf_conntrack_lock.
Mid term the allocation scheme really needs to be redesigned since
we do both the helper and expectation lookup _twice_ for every new
connection.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Resync with Al Viro's ip_conntrack annotations and fix a missed
spot in ip_nat_proto_icmp.c.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adding the alignment to the size doesn't make any sense, what it
should do is align the size of the conntrack structure to the
alignment requirements of the helper structure and return an
aligned pointer in nfct_help().
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
NF_CONNTRACK_PROC_COMPAT depends on NF_CONNTRACK_IPV4, not NF_CONNTRACK.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Accept -1 as delimiter to abort parsing without an error at the first
unknown character. This is needed by the upcoming nf_conntrack SIP
helper, where addresses are delimited by either '\r' or '\n' characters.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Not returning -EINVAL, because someone might want to use the value
zero in some future gact_prob algorithm?
Signed-off-by: Kim Nordlund <kim.nordlund@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove assumption that generic netlink commands cannot have dump
completion callbacks.
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
All that remained was skb_migrate() and that was overkill
for what the two call sites were trying to do.
Signed-off-by: David S. Miller <davem@davemloft.net>
The tc actions increased the size of struct tc_police, which broke
compatibility with old iproute binaries since both the act_police
and the old NET_CLS_POLICE code check for an exact size match.
Since the new members are not even used, the simple fix is to also
accept the size of the old structure. Dumping is not affected since
old userspace will receive a bigger structure, which is handled fine.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes the following unused EXPORT_SYMBOL's:
- sch_api.c: qdisc_lookup
- sch_generic.c: __netdev_watchdog_up
- sch_generic.c: noop_qdisc_ops
- sch_generic.c: qdisc_alloc
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
... and pass just repl->name to translate_table()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
We can check newinfo->hook_entry[...] instead.
Kill unused argument.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Check for valid_hooks is redundant (newinfo->hook_entry[i] will
be NULL if bit i is not set). Kill it, kill unused arguments.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since newinfo->hook_table[] already has been set up, we can switch to using
it instead of repl->{hook_info,valid_hooks}.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Take intialization of ->hook_entry[...], ->entries_size and ->nentries
over there, pull the check for empty chains into the end of that sucker.
Now it's self-contained, so we can move it up in the very beginning of
translate_table() *and* we can rely on ->hook_entry[] being properly
transliterated after it.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
It's easier to expand the iterator here *and* we'll be able to move all
uses of ebt_replace from translate_table() into this one.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Split ebt_check_entry_size_and_hooks() in two parts - one that does
sanity checks on pointers (basically, checks that we can safely
use iterator from now on) and the rest of it (looking into details
of entry).
The loop applying ebt_check_entry_size_and_hooks() is split in two.
Populating newinfo->hook_entry[] is done in the first part.
Unused arguments killed.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
No need to revisit a chain we'd already finished with during
the check for current hook. It's either instant loop (which
we'd just detected) or a duplicate work.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need that for iterator to work; existing check had been too weak.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to verify that
a) we are not too close to the end of buffer to dereference
b) next entry we'll be checking won't be _before_ our
While we are at it, don't subtract unrelated pointers...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch contains the following possible cleanups:
- make the following needlessly global functions statis:
- ipv4/tcp.c: __tcp_alloc_md5sig_pool()
- ipv4/tcp_ipv4.c: tcp_v4_reqsk_md5_lookup()
- ipv4/udplite.c: udplite_rcv()
- ipv4/udplite.c: udplite_err()
- make the following needlessly global structs static:
- ipv4/tcp_ipv4.c: tcp_request_sock_ipv4_ops
- ipv4/tcp_ipv4.c: tcp_sock_ipv4_specific
- ipv6/tcp_ipv6.c: tcp_request_sock_ipv6_ops
- net/ipv{4,6}/udplite.c: remove inline's from static functions
(gcc should know best when to inline them)
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Miika Komu <miika@iki.fi>
Signed-off-by: Diego Beltrami <Diego.Beltrami@hiit.fi>
Signed-off-by: Kazunori Miyazawa <miyazawa@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Miika Komu <miika@iki.fi>
Signed-off-by: Diego Beltrami <Diego.Beltrami@hiit.fi>
Signed-off-by: Kazunori Miyazawa <miyazawa@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
It just obfuscates the code and adds limited value. And as Adrian
Bunk noticed, it lacked Kconfig help text too, so just kill it.
Signed-off-by: David S. Miller <davem@davemloft.net>
When peeking at the next packet in a child qdisc by calling dequeue/requeue,
the upper qdisc qlen counter may get out of sync in case the requeue fails.
The qdisc and the child qdisc both have their counter decremented, but since
no packet is given to the upper qdisc it won't decrement its counter itself.
requeue should not fail, so this is mostly for "correctness".
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert HTB to use qdisc_tree_decrease_len() and add a callback
for deactivating a class when its child queue becomes empty.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert HFSC to use qdisc_tree_decrease_len() and add a callback
for deactivating a class when its child queue becomes empty.
All queue purging goes through hfsc_purge_queue(), which is used in
three cases: grafting, class creation (when a leaf class is turned
into an intermediate class by attaching a new class) and class
deletion. In all cases qdisc_tree_decrease_len() is needed.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert the "simple" qdiscs to use qdisc_tree_decrease_qlen() where
necessary:
- all graft operations
- destruction of old child qdiscs in prio, red and tbf change operation
- purging of queue in sfq change operation
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are multiple problems related to qlen adjustment that can lead
to an upper qdisc getting out of sync with the real number of packets
queued, leading to endless dequeueing attempts by the upper layer code.
All qdiscs must maintain an accurate q.qlen counter. There are basically
two groups of operations affecting the qlen: operations that propagate
down the tree (enqueue, dequeue, requeue, drop, reset) beginning at the
root qdisc and operations only affecting a subtree or single qdisc
(change, graft, delete class). Since qlen changes during operations from
the second group don't propagate to ancestor qdiscs, their qlen values
become desynchronized.
This patch adds a function to propagate qlen changes up the qdisc tree,
optionally calling a callback function to perform qdisc-internal
maintenance when the child qdisc becomes empty. The follow-up patches
will convert all qdiscs to use this function where necessary.
Noticed by Timo Steinbach <tsteinbach@astaro.com>.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Set parent classids in default qdiscs to allow walking up the tree
from outside the qdiscs. This is needed by the next patch.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
qlen adjustment should happen immediately in ->delete and not in the
class destroy function because the reference count will not hit zero in
->delete (sch_api holds a reference) but in ->put. Since the qdisc
lock is released between deletion of the class and final destruction
this creates an externally visible error in the qlen counter.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for the ranged tag (tag type #5) to the CIPSOv4 protocol.
The ranged tag allows for seven, or eight if zero is the lowest category,
category ranges to be specified in a CIPSO option. Each range is specified by
two unsigned 16 bit fields, each with a maximum value of 65534. The two values
specify the start and end of the category range; if the start of the category
range is zero then it is omitted.
See Documentation/netlabel/draft-ietf-cipso-ipsecurity-01.txt for more details.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
Add support for the enumerated tag (tag type #2) to the CIPSOv4 protocol.
The enumerated tag allows for 15 categories to be specified in a CIPSO option,
where each category is an unsigned 16 bit field with a maximum value of 65534.
See Documentation/netlabel/draft-ietf-cipso-ipsecurity-01.txt for more details.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
The original NetLabel category bitmap was a straight char bitmap which worked
fine for the initial release as it only supported 240 bits due to limitations
in the CIPSO restricted bitmap tag (tag type 0x01). This patch converts that
straight char bitmap into an extensibile/sparse bitmap in order to lay the
foundation for other CIPSO tag types and protocols.
This patch also has a nice side effect in that all of the security attributes
passed by NetLabel into the LSM are now in a format which is in the host's
native byte/bit ordering which makes the LSM specific code much simpler; look
at the changes in security/selinux/ss/ebitmap.c as an example.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
The attached patch adds --snat-arp support, which makes it possible to
change the source mac address in both the mac header and the arp header
with one rule.
Signed-off-by: Bart De Schuymer <bdschuym@pandora.be>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Add new NFLOG target to allow use of nfnetlink_log for both IPv4 and IPv6.
Currently we have two (unsupported by userspace) hacks in the LOG and ULOG
targets to optionally call to the nflog API. They lack a few features,
namely the IPv4 and IPv6 LOG targets can not specify a number of arguments
related to nfnetlink_log, while the ULOG target is only available for IPv4.
Remove those hacks and add a clean way to use nfnetlink_log.
Signed-off-by: Patrick McHardy <kaber@trash.net>
| NEW | UPDATE | DESTROY |
----------------------------------------|
tuples | Y | Y | Y |
status | Y | Y | N |
timeout | Y | Y | N |
protoinfo | S | S | N |
helper | S | S | N |
mark | S | S | N |
counters | F | F | Y |
Leyend:
Y: yes
N: no
S: iif the field is set
F: iif overflow
This patch also replace IPCT_HELPINFO by IPCT_HELPER since we want to
track the helper assignation process, not the changes in the private
information held by the helper.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Check that status flags are available in the netlink message received
to create a new conntrack.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
The NAT handling of the SIP helper has a few problems:
- Request headers are only mangled in the reply direction, From/To headers
not at all, which can lead to authentication failures with DNAT in case
the authentication domain is the IP address
- Contact headers in responses are only mangled for REGISTER responses
- Headers may be mangled even though they contain addresses not
participating in the connection, like alternative addresses
- Packets are droppen when domain names are used where the helper expects
IP addresses
This patch takes a different approach, instead of fixed rules what field
to mangle to what content, it adds symetric mapping of From/To/Via/Contact
headers, which allows to deal properly with echoed addresses in responses
and foreign addresses not belonging to the connection.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Not every header has a shortcut, so make them optional instead
of searching for the same string twice.
Signed-off-by: Patrick McHardy <kaber@trash.net>
- Use enum for header field enumeration
- Use numerical value instead of pointer to header info structure to
identify headers, unexport ct_sip_hdrs
- group SIP and SDP entries in header info structure
- remove double forward declaration of ct_sip_get_info
Signed-off-by: Patrick McHardy <kaber@trash.net>
The NAT helpr hooks are protected by RCU, but all of the
conntrack helpers test and use the global pointers instead
of copying them first using rcu_dereference()
Also replace synchronize_net() by synchronize_rcu() for clarity
since sychronizing only with packet receive processing is
insufficient to prevent races.
Signed-off-by: Patrick McHardy <kaber@trash.net>
We usually uses 'xxx_find_get' for function which increments
reference count.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch adds /proc/net/ip_conntrack, /proc/net/ip_conntrack_expect and
/proc/net/stat/ip_conntrack files to keep old programs using them working.
The /proc/net/ip_conntrack and /proc/net/ip_conntrack_expect files show only
IPv4 entries, the /proc/net/stat/ip_conntrack shows global statistics.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Add helper functions for sysctl registration with optional instantiating
of common path elements (like net/netfilter) and use it for support for
automatic registation of conntrack protocol sysctls.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Only update the conntrack timer if there's been at least HZ jiffies since
the last update. Reduces the number of del_timer/add_timer cycles from one
per packet to one per connection per second (plus once for each state change
of a connection)
Should handle timer wraparounds and connection timeout changes.
Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Remove unused struct list_head from struct nf_conntrack_l3proto and
nf_conntrack_l4proto as all protocols are kept in arrays, not linked
lists.
Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Remove the usage of ASSERT_READ_LOCK/ASSERT_WRITE_LOCK in nf_conntrack,
it didn't do anything, it was just an empty define and it uglified the code.
Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Add some more sanity checks when registering/unregistering l3/l4 protocols.
Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Rename 'struct nf_conntrack_protocol' to 'struct nf_conntrack_l4proto' in
order to help distinguish it from 'struct nf_conntrack_l3proto'. It gets
rather confusing with 'nf_conntrack_protocol'.
Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Place rarely written variables in the read-mostly section by using
__read_mostly
Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch splits out L3/L4 protocol handling into its own file
nf_conntrack_proto.c
Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch splits out the event cache into its own file
nf_conntrack_ecache.c
Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch splits out handling of helpers into its own file
nf_conntrack_helper.c
Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch splits out expectation handling into its own file
nf_conntrack_expect.c
Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch implements a suggestion by Ian McDonald and
1) Avoids tests against negative packet lengths by using unsigned int
for packet payload lengths in the CCID send_packet()/packet_sent() routines
2) As a consequence, it removes an now unnecessary test with regard to `len > 0'
in ccid3_hc_tx_packet_sent: that condition is always true, since
* negative packet lengths are avoided
* ccid3_hc_tx_send_packet flags an error whenever the payload length is 0.
As a consequence, ccid3_hc_tx_packet_sent is never called as all errors
returned by ccid_hc_tx_send_packet are caught in dccp_write_xmit
3) Removes the third argument of ccid_hc_tx_send_packet (the `len' parameter),
since it is currently always set to skb->len. The code is updated with regard
to this parameter change.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This implements the larger-initial-windows feature for CCID 3, as described in
section 5 of RFC 4342. When the first feedback packet arrives, the sender can
send up to 2..4 packets per RTT, instead of just one.
The patch further
* reduces the number of timestamping calls by passing the timestamp value
(which is computed in one of the calling functions anyway) as argument
* renames one constant with a very long name into one which is shorter and
resembles the one in RFC 3448 (t_mbi)
* simplifies some of the min_t/max_t cases where both `x', `y' have the same
type
Commiter note: renamed TFRC_t_mbi to TFRC_T_MBI, to follow Linux coding style.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
To reflect the fact that this now is of no effect, not making apps
stop working, just be warned in the system log.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This removes and cleans up unused variables and structures which have become
unnecessary following the introduction of the EWMA patch to automatically track
the CCID 3 receiver/sender packet sizes `s'.
It deprecates the PACKET_SIZE socket option by returning an error code and
printing a deprecation warning if an application tries to read or write this
socket option.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This corrects the setting of the nofeedback timer with regard to RFC
3448 - previously it was not set to max(4*R, 2*s/X) as specified. Using
the maximum of 1 second as upper bound (as it was done before) can have
detrimental effects, especially if R is small.
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This is in response to a request sent earlier by Eric W. Biederman
and replaces all sysctl numbers for net.dccp.default with CTL_UNNUMBERED.
It has been tested to compile and to work.
Commiter note: I've removed the use of CTL_UNNUMBERED, not setting .ctl_name
sets it to 0, that is the what CTL_UNNUMBERED is, reason is
to avoid unneeded source code cluttering.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This patch
* removes setting t_RTO in ccid3_hc_tx_init (per [RFC 3448, 4.2], t_RTO is
undefined until feedback has been received);
* makes some trivial changes (updates of comments);
* performs a small optimisation by exploiting that the feedback timeout
uses the value of t_ipi. The way it is done is safe, because the timeouts
appear after the changes to t_ipi, ensuring that up-to-date values are used;
* in ccid3_hc_tx_packet_recv, moves the t_rto statement closer to the calculation
of the next_tmout. This makes the code clearer to read and is also safe, since
t_rto is not updated until the next call of ccid3_hc_tx_packet_recv, and is not
read by the functions called via ccid_wait_for_ccid();
* removes a `max' statement in sk_reset_timer, this is not needed since the timeout
value is always greater than 1E6 microseconds.
* adds `XXX'es to highlight that currently the nofeedback timer is set
in a non-standard way
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This patch:
* consolidates updating of parameters (t_nom, t_ipi, t_delta) which
need to be updated at the same time, since they are inter-dependent
* removes two inline functions which are no longer needed as a result of
the above consolidation
* resolves a FIXME regarding the re-calculation of t_ipi within the nofeedback
timer, in the state where no feedback has previously been received
* ties updating these parameters to updating the sending rate X, exploiting
that all three parameters in turn depend on X; and using a small optimisation
which can reduce the number of required instructions: only update the three
parameters when X really changes
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This patch concerns updating the value of the nofeedback timer when no feedback
has been received so far.
Since in this case the value of R is still undefined according to [RFC 3448,
4.2], we can not perform step (3) of [RFC 3448, 4.3]. A clarification is
provided in [RFC 4342, sec. 5], which states that in these cases the nofeedback
timer (still) expires "after two seconds".
Many thanks to Ian McDonald for pointing this out and providing the
clarification.
The patch
* implements [RFC 4342, sec. 5] with regard to the above case
* consolidates handling timer restart by
- adding an appropriate jump label and
- initialising the timeout value
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Might as well make flush notifier prettier when subpolicy used
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch consolidates set/getsockopt code between UDP(-Lite) v4 and 6. The
justification is that UDP(-Lite) is a transport-layer protocol and therefore
the socket option code (at least in theory) should be AF-independent.
Furthermore, there is the following code reduplication:
* do_udp{,v6}_getsockopt is 100% identical between v4 and v6
* do_udp{,v6}_setsockopt is identical up to the following differerence
--v4 in contrast to v4 additionally allows the experimental encapsulation
types UDP_ENCAP_ESPINUDP and UDP_ENCAP_ESPINUDP_NON_IKE
--the remainder is identical between v4 and v6
I believe that this difference is of little relevance.
The advantages in not duplicating twice almost completely identical code.
The patch further simplifies the interface of udp{,v6}_push_pending_frames,
since for the second argument (struct udp_sock *up) it always holds that
up = udp_sk(sk); where sk is the first function argument.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
IPv4, IPv6, and DECNet all use struct rta_cacheinfo in a similiar
way, therefore rtnl_put_cacheinfo() is added to reuse code.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The destination PID is passed directly to netlink_unicast()
respectively netlink_multicast().
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
This considers the case - ACK received while no packet has been sent
so far. Resolved by printing a (rate-limited) warning message.
Further removes an unnecessary BUG_ON in ccid3_hc_tx_packet_recv,
received feedback on a terminating connection is simply ignored.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This patch removes a switch statement which is redundant since,
* nothing is done in states TFRC_SSTATE_NO_SENT/TFRC_SSTATE_NO_FBACK
* it is impossible that the function is called in the state TFRC_SSTATE_TERM, since
--the function is called, in dccp_write_xmit, after ccid3_hc_tx_send_packet
--if ccid3_hc_tx_send_packet is called in state TFRC_SSTATE_TERM, it returns
-EINVAL, which means that ccid3_hc_tx_packet_sent will not be called
(compare dccp_write_xmit)
--> therefore, this case is logically impossible
* the remaining state is TFRC_SSTATE_FBACK which conditionally updates t_ipi, t_nom,
and t_delta. This is a no-op, since
--t_ipi only changes when feedback is received
--however, when feedback arrives via ccid3_hc_tx_packet_recv, there is an identical
code block which performs the same set of operations
--performing the same set of operations again in ccid3_hc_tx_packet_sent therefore
does not change anything, since between the time of receiving the last feedback
(and therefore update of t_ipi, t_nom, and t_delta), the value of t_ipi has not
changed
--since t_ipi has not changed, the values of t_delta and t_nom also do not change,
they depend fully on t_ipi
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This resolves an `XXX' in ccid3_hc_tx_send_packet().
The function is only called on Data and DataAck packets and returns a negative
result on zero-sized messages. This is a reasonable policy since CCID 3 is a
congestion-control module and congestion control on zero-sized Data(Ack)
packets is in a way pathological.
The patch uses a more suitable error code for this case, it returns the Posix.1
code `EBADMSG' ("Not a data message") instead of `ENOTCONN'.
As a result of ignoring zero-sized packets, a the condition for a warning
"First packet is data" in ccid3_hc_tx_packet_sent is always satisfied; this
message has been removed since it will always be printed.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This makes some logically equivalent simplifications, by replacing
rc - values plus goto's with direct return statements.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This patch performs a simplifying (performance) optimisation:
In each call of the inline function ccid3_calc_new_t_ipi(), the state is
tested against TFRC_SSTATE_NO_FBACK. This is expensive when the function
is called very often. A simpler solution, implemented by this patch, is
to adapt the control flow.
Background:
Now that we can stuff bigger ack vectors into options.
Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Ack vectors grow proportional to the window size. If an ack vector does not fit
into a single option, it must be spread across multiple options. This patch
will allow for windows to grow larger.
Committer note: Simplified the patch a bit, original algorithm kept.
Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Commiter note:
This was split from Andrea's original patch, in the process I changed the type
of the ackvec index fields to u16 instead of to int and haven't folded
dccp_ackvec_parse with dccp_ackvec_check_rcv_ackno.
Next patch will actually do the insertion of more than one ackvec per packet,
using, initially, up to a max of 2 ackvecs as per Andrea's original patch, then
I'll work on support for larger ackvecs, be it using a sysctl or using
setsockopt.
Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Commiter note: original patch was splitted.
Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Extends the netlink interface to support the __le16 type and
converts address addition, deletion and, dumping to use the
new netlink interface.
Fixes multiple occasions of possible illegal memory references
due to not validated netlink attributes.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Log an error if the remote tunnel endpoint is unable to handle
tunneled packets.
Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow link-local tunnel endpoints if the underlying link is defined.
Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Doing the mandatory tunnel endpoint checks when the tunnel is set up
isn't enough as interfaces can go up or down and addresses can be
added or deleted after this. The checks need to be done realtime when
the tunnel is processing a packet.
Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
A logic bug in tunnel lookup could result in duplicate tunnels when
changing an existing device.
Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Based on Jamal's patch but compiled and even tested. :-)
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
... into anonymous union of __wsum and __u32 (csum and csum_offset resp.)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
... and switch the damn checksum update to something saner
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
... so caller can use ->ipaddr instead of ->ipaddr_h
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
... switched to taking and returning pointers to net-endian
sctp_addr resp. Together, since the only user of sctp_find_unmatch_addr()
just passes its value to sctp_make_asconf_update_ip().
sctp_make_asconf_update_ip() is actually endian-agnostic.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
both are done in one go since almost always we have result of
the latter immediately passed to the former. Possibly non-obvious
note: sctp_process_param() is endian-agnostic
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ditto for its only caller (sctp_endpoint_is_peeled_off)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Along with it, statics in input.c that end up calling it
(__sctp_lookup_association, sctp_lookup_association,
__sctp_rcv_init_lookup, __sctp_rcv_lookup). Callers
are adjusted.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
The only caller (__sctp_rcv_lookup_endpoint()) also switched,
its caller adjusted
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Its only use happens on the same host, when it gets quoted back to
us. So we are free to flip to net-endian and avoid extra PITA.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
switched to taking a pointer to net-endian sctp_addr
and a net-endian port number. Instances and callers
adjusted; interestingly enough, the only calls are
direct calls of specific instances - the method is not
used at all.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
instances of ->cmp_addr() are fine with switching both arguments
to net-endian; callers other than in sctp_cmp_addr_exact() (both
as ->cmp_addr(...) and direct calls of instances) adjusted;
sctp_cmp_addr_exact() switched to net-endian itself and adjustment
is done in its callers
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
No actual modifications of method instances are needed -
they don't look at port numbers. Switch callers...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add sctp_chunk->source, sctp_sockaddr_entry->a, sctp_transport->ipaddr
and sctp_transport->saddr, maintain them as net-endian mirrors of
their host-endian counterparts.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Part 1: rename sctp_chunk->source, sctp_sockaddr_entry->a,
sctp_transport->ipaddr and sctp_transport->saddr (to ..._h)
The next patch will reintroduce these fields and keep them as
net-endian mirrors of the original (renamed) ones. Split in
two patches to make sure that we hadn't forgotten any instanes.
Later in the series we'll eliminate uses of host-endian variants
(basically switching users to net-endian counterparts as we
progress through that mess). Then host-endian ones will die.
Other embedded host-endian sctp_addr will be easier to switch
directly, so we leave them alone for now.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Again, invalid sockaddr passed to userland - host-endiand sin_port.
Potential leak, again, but less dramatic than in previous case.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
a) struct sockaddr_storage * passed to sctp_ulpevent_make_peer_addr_change()
actually points at union sctp_addr field in a structure. Then that sucker
gets copied to userland, with whatever junk we might have there.
b) it's actually having host-endian sin_port.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
It expects (and gets) laddr with net-endian sin_port. And then it calls
sctp_bind_addr_match(), which *does* care about port numbers in case of
ipv6 and expects them to be host-endian.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
That's going to be a long series. Introduced temporary helpers
doing copy-and-convert for sctp_addr; they are used to kill
flip-in-place in global data structures and will be used
to gradually push host-endian uses of sctp_addr out of existence.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
also always get __be16 protocol error; switch to SCTP_PERR()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
argument stored for SCTP_CMD_INIT_FAILED is always __be16
(protocol error). Introduced new field and accessor for
it (SCTP_PERR()); switched to their use (from SCTP_U32() and
.u32)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch makes two needlessly global functions static.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make copy_to_user_policy_type take a type instead a policy and
fix its users to pass the type
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
Removes dependency on buggy rta_buf, fixes a memory corruption bug due to
a unvalidated netlink attribute, and simplifies the code.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
This one got lost on the way from Ian to Gerrit to me, fix it.
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This removes a non-referenced variable.
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This makes the code of the dccp_probe module more portable.
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This adds documentation to the CCID 3 rx/tx socket fields, plus some
minor re-formatting.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This reaps the benefit of the earlier patch, which changed the type of
CCID 3 states to use enums, in that many conditions are now simplified
and the number of possible (unexpected) values is greatly reduced.
In a few instances, this also allowed to simplify pre-conditions; where
care has been taken to retain logical equivalence.
[DCCP]: Introduce a consistent BUG/WARN message scheme
This refines the existing set of DCCP messages so that
* BUG(), BUG_ON(), WARN_ON() have meaningful DCCP-specific counterparts
* DCCP_CRIT (for severe warnings) is not rate-limited
* DCCP_WARN() is introduced as rate-limited wrapper
Using these allows a faster and cleaner transition to their original
counterparts once the code has matured into a full DCCP implementation.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Previously the transmit queue was unbounded.
This patch:
* puts a limit on transmit queue length
and sends back EAGAIN if the buffer is full
* sets the TX queue length to a sensible default
* implements tx buffer sysctls for DCCP
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This adds a CCID3 debug option to the configuration menu
which is missing in Kconfig, but already used by the code.
CCID 2 already provides such an entry.
To enable debugging, set CONFIG_IP_DCCP_CCID3_DEBUG=y
NOTE: The use of ccid3_{t,r}x_state_name is safe, since
now only enum values can appear.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This patch
* makes debugging (when configured) work both for static / module build
* provides generic debugging macros for use in other DCCP / CCID modules
* adds missing information about debug parameters to Kconfig
* performs some code tidy-up
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
The audit_enabled flag is used to signal when syscall auditing is to be
performed. While NetLabel uses a Netlink interface instead of syscalls, it is
reasonable to consider the NetLabel Netlink interface as a form of syscall so
pay attention to the audit_enabled flag when generating audit messages in
NetLabel.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
The cipso_v4_doi_search() function behaves the same as cipso_v4_doi_getdef()
but is a local, static function so use it whenever possibile in the CIPSOv4
code base.
Signed-of-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
The CIPSOv4 translated tag #1 mapping does not always return the correct error
code if the desired mapping does not exist; instead of returning -EPERM it
returns -ENOSPC indicating that the buffer is not large enough to hold the
translated value. This was caused by failing to check a specific error
condition. This patch fixes this so that unknown mappings return
-EPERM which is consistent with the rest of the related CIPSOv4 code.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
While the original CIPSOv4 code had provisions for multiple tag types the
implementation was not as great as it could be, pushing a lot of non-tag
specific processing into the tag specific code blocks. This patch fixes that
issue making it easier to support multiple tag types in the future.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
Currently the CIPSOv4 engine does not do any sort of checking when a new DOI
definition is added. The tags are still verified but only as a side effect of
normal NetLabel operation (packet processing, socket labeling, etc.) which
would cause application errors due to the faulty configuration. This patch
adds tag checking when new DOI definition are added allowing us to catch these
configuration problems when they happen.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
Right now the NetLabel code always jumps into the CIPSOv4 layer to determine if
a CIPSO IP option is present. However, we can do this check directly in the
NetLabel code by making use of the CIPSO_V4_OPTEXIST() macro which should save
us a function call in the common case of not having a CIPSOv4 option present.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
The existing netlbl_lsm_secattr struct required the LSM to check all of the
fields to determine if any security attributes were present resulting in a lot
of work in the common case of no attributes. This patch adds a 'flags' field
which is used to indicate which attributes are present in the structure; this
should allow the LSM to do a quick comparison to determine if the structure
holds any security attributes.
Example:
if (netlbl_lsm_secattr->flags)
/* security attributes present */
else
/* NO security attributes present */
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
Currently the NetLabel unlabeled packet accept flag is an atomic type and it
is checked for every non-NetLabel packet which comes into the system but rarely
ever changed. This patch changes this flag to a normal integer and protects it
with RCU locking.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
These are code optimizations which are relevant when dealing with large
windows. They are not coded the way I would like to, but they do the job for
the short-term. This patch should be more neat.
Commiter note: Changed the seqno comparisions to use {after,before}48 to handle
wrapping.
Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Spotted by Ian McDonald, tentatively fixed by Gerrit Renker:
http://www.mail-archive.com/dccp%40vger.kernel.org/msg00599.html
Rewritten not to unroll sk_receive_skb, in the common case, i.e. no lock
debugging, its optimized away.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This patch tackles the following problem:
* the ccid3_hc_{t,r}x_sock define ccid3hc{t,r}x_state as `u8', but
in reality there can only be a few, pre-defined enum names
* this necessitates addiditional checking for unexpected values
which would otherwise be caught by the compiler
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
udp_push_pending_frames is only referenced within
net/ipv4/udp.c and hence can remain static.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is a revision of the previously submitted patch, which alters
the way files are organized and compiled in the following manner:
* UDP and UDP-Lite now use separate object files
* source file dependencies resolved via header files
net/ipv{4,6}/udp_impl.h
* order of inclusion files in udp.c/udplite.c adapted
accordingly
[NET/IPv4]: Support for the UDP-Lite protocol (RFC 3828)
This patch adds support for UDP-Lite to the IPv4 stack, provided as an
extension to the existing UDPv4 code:
* generic routines are all located in net/ipv4/udp.c
* UDP-Lite specific routines are in net/ipv4/udplite.c
* MIB/statistics support in /proc/net/snmp and /proc/net/udplite
* shared API with extensions for partial checksum coverage
[NET/IPv6]: Extension for UDP-Lite over IPv6
It extends the existing UDPv6 code base with support for UDP-Lite
in the same manner as per UDPv4. In particular,
* UDPv6 generic and shared code is in net/ipv6/udp.c
* UDP-Litev6 specific extensions are in net/ipv6/udplite.c
* MIB/statistics support in /proc/net/snmp6 and /proc/net/udplite6
* support for IPV6_ADDRFORM
* aligned the coding style of protocol initialisation with af_inet6.c
* made the error handling in udpv6_queue_rcv_skb consistent;
to return `-1' on error on all error cases
* consolidation of shared code
[NET]: UDP-Lite Documentation and basic XFRM/Netfilter support
The UDP-Lite patch further provides
* API documentation for UDP-Lite
* basic xfrm support
* basic netfilter support for IPv4 and IPv6 (LOG target)
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
RTM_GETPREFIX is completely unused and is thus removed.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
By replacing the current method of exporting the device configuration
which included allocating a temporary buffer, copying ipv6_devconf
into it and copying that buffer into the message with a method that
uses nla_reserve() allowing to copy the device configuration directly
into the skb data buffer, a GFP_ATOMIC allocation could be removed.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Just some mis-placed ifdefs:
net/ipv4/tcp_minisocks.c: In function ‘tcp_twsk_destructor’:
net/ipv4/tcp_minisocks.c:364: warning: unused variable ‘twsk’
net/ipv6/tcp_ipv6.c:1846: warning: ‘tcp_sock_ipv6_specific’ defined but not used
net/ipv6/tcp_ipv6.c:1877: warning: ‘tcp_sock_ipv6_mapped_specific’ defined but not used
Signed-off-by: David S. Miller <davem@davemloft.net>
By modyfing genlmsg_put() to take a genl_family and by adding
genlmsg_put_reply() the process of constructing the netlink
and generic netlink headers is simplified.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A generic netlink user has no interest in knowing how to
address the source of the original request.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The beast had a long and not very happy history. At one
point, a friend (netdump) had asked that he open up a little.
Well, the friend was long gone now, and the beast had
this dangling piece hanging (netpoll_queue).
It wasn't hard to stitch the netpoll_queue back in
where it belonged and make everything tidy.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
The netpoll beast was still not happy. If the beast got
clogged pipes, it tended to stare blankly off in space
for a long time.
The problem couldn't be completely fixed because the
beast talked with irq's disabled. But it could be made
less painful and shorter.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
When the netpoll beast got busy, he tended to babble.
Instead of talking out of his large mouth as normal,
he tended to try to snort out other orifices. This lead
to words (skbs) ending up in odd places (like NIT) that
he did not intend.
The normal way of talking wouldn't work, but he could
at least change to using the same tone all the time.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
The beast was not always healthy. When it was sick,
it tended to be laconic and not tell anyone the real problem.
A few small changes had it telling the world about its
problems, if they really wanted to hear.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
When the netpoll beast got really busy, it tended to clog
things, so it stored them for later. But the beast was putting
all it's skb's in one basket. This was bad because maybe some
pipes were clogged and others were not.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
After looking harder, Steve noticed that the netpoll
beast leaked a little every time it shutdown for a nap.
Not a big leak, but a nuisance kind of thing.
He took out his refcount duct tape and patched the
leak. It was overkill since there was already other
locking in that area, but it looked clean and wouldn't
attract fleas.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>