Commit Graph

8839 Commits

Author SHA1 Message Date
Stephen Hemminger
b1153f29ee netlink: make socket filters work on netlink
Make socket filters work for netlink unicast and notifications.
This is useful for applications like Zebra that get overrun with
messages that are then ignored.

Note: netlink messages are in host byte order, but packet filter
state machine operations are done as network byte order.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 15:46:12 -07:00
Phil Oester
12b101555f [IPV4]: Fix null dereference in ip_defrag
Been seeing occasional panics in my testing of 2.6.25-rc in ip_defrag.
Offending line in ip_defrag is here:

	net = skb->dev->nd_net

where dev is NULL.  Bisected the problem down to commit
ac18e7509e ([NETNS][FRAGS]: Make the
inet_frag_queue lookup work in namespaces).  

Below patch (idea from Patrick McHardy) fixes the problem for me.

Signed-off-by: Phil Oester <kernel@linuxace.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 15:01:50 -07:00
Linus Torvalds
7d3628b230 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (46 commits)
  [NET] ifb: set separate lockdep classes for queue locks
  [IPV6] KCONFIG: Fix description about IPV6_TUNNEL.
  [TCP]: Fix shrinking windows with window scaling
  netpoll: zap_completion_queue: adjust skb->users counter
  bridge: use time_before() in br_fdb_cleanup()
  [TG3]: Fix build warning on sparc32.
  MAINTAINERS: bluez-devel is subscribers-only
  audit: netlink socket can be auto-bound to pid other than current->pid (v2)
  [NET]: Fix permissions of /proc/net
  [SCTP]: Fix a race between module load and protosw access
  [NETFILTER]: ipt_recent: sanity check hit count
  [NETFILTER]: nf_conntrack_h323: logical-bitwise & confusion in process_setup()
  [RT2X00] drivers/net/wireless/rt2x00/rt2x00dev.c: remove dead code, fix warning
  [IPV4]: esp_output() misannotations
  [8021Q]: vlan_dev misannotations
  xfrm: ->eth_proto is __be16
  [IPV4]: ipv4_is_lbcast() misannotations
  [SUNRPC]: net/* NULL noise
  [SCTP]: fix misannotated __sctp_rcv_asconf_lookup()
  [PKT_SCHED]: annotate cls_u32
  ...
2008-03-21 07:57:45 -07:00
Daniel Lezcano
6f8b13bcb3 [NETNS][IPV6] tcp6 - make proc per namespace
Make the proc for tcp6 to be per namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 04:14:45 -07:00
Daniel Lezcano
0c96d8c50b [NETNS][IPV6] udp6 - make proc per namespace
The proc init/exit functions take a new network namespace parameter in
order to register/unregister /proc/net/udp6 for a namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 04:14:17 -07:00
Daniel Lezcano
f40c8174d3 [NETNS][IPV4] tcp - make proc handle the network namespaces
This patch, like udp proc, makes the proc functions to take care of
which namespace the socket belongs.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 04:13:54 -07:00
Daniel Lezcano
8d9f1744ca [NETNS][IPV6] tcp - assign the netns for timewait sockets
Copy the network namespace from the socket to the timewait socket.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 04:12:54 -07:00
Daniel Lezcano
a91275eff4 [NETNS][IPV6] udp - make proc handle the network namespace
This patch makes the common udp proc functions to take care of which
socket they should show taking into account the namespace it belongs.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 04:11:58 -07:00
Daniel Lezcano
ea82edf704 [NETNS][IPV6] mcast - fix compilation warning when procfs is not compiled in
When CONFIG_PROC_FS=no, the out_sock_create label is not used because
the code using it is disabled and that leads to a warning at compile
time.

This patch fix that by making a specific function to initialize proc
for igmp6, and remove the annoying CONFIG_PROC_FS sections in
init/exit function.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 04:10:53 -07:00
Peter P Waskiewicz Jr
82cc1a7a56 [NET]: Add per-connection option to set max TSO frame size
Update: My mailer ate one of Jarek's feedback mails...  Fixed the
parameter in netif_set_gso_max_size() to be u32, not u16.  Fixed the
whitespace issue due to a patch import botch.  Changed the types from
u32 to unsigned int to be more consistent with other variables in the
area.  Also brought the patch up to the latest net-2.6.26 tree.

Update: Made gso_max_size container 32 bits, not 16.  Moved the
location of gso_max_size within netdev to be less hotpath.  Made more
consistent names between the sock and netdev layers, and added a
define for the max GSO size.

Update: Respun for net-2.6.26 tree.

Update: changed max_gso_frame_size and sk_gso_max_size from signed to
unsigned - thanks Stephen!

This patch adds the ability for device drivers to control the size of
the TSO frames being sent to them, per TCP connection.  By setting the
netdevice's gso_max_size value, the socket layer will set the GSO
frame size based on that value.  This will propogate into the TCP
layer, and send TSO's of that size to the hardware.

This can be desirable to help tune the bursty nature of TSO on a
per-adapter basis, where one may have 1 GbE and 10 GbE devices
coexisting in a system, one running multiqueue and the other not, etc.

This can also be desirable for devices that cannot support full 64 KB
TSO's, but still want to benefit from some level of segmentation
offloading.

Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 03:43:19 -07:00
David S. Miller
a25606c845 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 2008-03-21 03:42:24 -07:00
YOSHIFUJI Hideaki
38fe999e22 [IPV6] KCONFIG: Fix description about IPV6_TUNNEL.
Based on notice from "Colin" <colins@sjtu.edu.cn>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-20 16:13:58 -07:00
Patrick McHardy
607bfbf2d5 [TCP]: Fix shrinking windows with window scaling
When selecting a new window, tcp_select_window() tries not to shrink
the offered window by using the maximum of the remaining offered window
size and the newly calculated window size. The newly calculated window
size is always a multiple of the window scaling factor, the remaining
window size however might not be since it depends on rcv_wup/rcv_nxt.
This means we're effectively shrinking the window when scaling it down.


The dump below shows the problem (scaling factor 2^7):

- Window size of 557 (71296) is advertised, up to 3111907257:

IP 172.2.2.3.33000 > 172.2.2.2.33000: . ack 3111835961 win 557 <...>

- New window size of 514 (65792) is advertised, up to 3111907217, 40 bytes
  below the last end:

IP 172.2.2.3.33000 > 172.2.2.2.33000: . 3113575668:3113577116(1448) ack 3111841425 win 514 <...>

The number 40 results from downscaling the remaining window:

3111907257 - 3111841425 = 65832
65832 / 2^7 = 514
65832 % 2^7 = 40

If the sender uses up the entire window before it is shrunk, this can have
chaotic effects on the connection. When sending ACKs, tcp_acceptable_seq()
will notice that the window has been shrunk since tcp_wnd_end() is before
tp->snd_nxt, which makes it choose tcp_wnd_end() as sequence number.
This will fail the receivers checks in tcp_sequence() however since it
is before it's tp->rcv_wup, making it respond with a dupack.

If both sides are in this condition, this leads to a constant flood of
ACKs until the connection times out.

Make sure the window is never shrunk by aligning the remaining window to
the window scaling factor.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-20 16:11:27 -07:00
Jarek Poplawski
8a455b087c netpoll: zap_completion_queue: adjust skb->users counter
zap_completion_queue() retrieves skbs from completion_queue where they have
zero skb->users counter.  Before dev_kfree_skb_any() it should be non-zero
yet, so it's increased now.

Reported-and-tested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-20 16:07:27 -07:00
Fabio Checconi
2bec008ca9 bridge: use time_before() in br_fdb_cleanup()
In br_fdb_cleanup() next_timer and this_timer are in jiffies, so they
should be compared using the time_after() macro.

Signed-off-by: Fabio Checconi <fabio@gandalf.sssup.it>
Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-20 15:54:58 -07:00
Vlad Yasevich
270637abff [SCTP]: Fix a race between module load and protosw access
There is a race is SCTP between the loading of the module
and the access by the socket layer to the protocol functions.
In particular, a list of addresss that SCTP maintains is
not initialized prior to the registration with the protosw.
Thus it is possible for a user application to gain access
to SCTP functions before everything has been initialized.
The problem shows up as odd crashes during connection
initializtion when we try to access the SCTP address list.

The solution is to refactor how we do registration and
initialize the lists prior to registering with the protosw.
Care must be taken since the address list initialization
depends on some other pieces of SCTP initialization.  Also
the clean-up in case of failure now also needs to be refactored.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Acked-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-20 15:17:14 -07:00
Daniel Hokka Zakrisson
d0ebf13359 [NETFILTER]: ipt_recent: sanity check hit count
If a rule using ipt_recent is created with a hit count greater than
ip_pkt_list_tot, the rule will never match as it cannot keep track
of enough timestamps. This patch makes ipt_recent refuse to create such
rules.

With ip_pkt_list_tot's default value of 20, the following can be used
to reproduce the problem.

nc -u -l 0.0.0.0 1234 &
for i in `seq 1 100`; do echo $i | nc -w 1 -u 127.0.0.1 1234; done

This limits it to 20 packets:
iptables -A OUTPUT -p udp --dport 1234 -m recent --set --name test \
         --rsource
iptables -A OUTPUT -p udp --dport 1234 -m recent --update --seconds \
         60 --hitcount 20 --name test --rsource -j DROP

While this is unlimited:
iptables -A OUTPUT -p udp --dport 1234 -m recent --set --name test \
         --rsource
iptables -A OUTPUT -p udp --dport 1234 -m recent --update --seconds \
         60 --hitcount 21 --name test --rsource -j DROP

With the patch the second rule-set will throw an EINVAL.

Reported-by: Sean Kennedy <skennedy@vcn.com>
Signed-off-by: Daniel Hokka Zakrisson <daniel@hozac.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-20 15:07:10 -07:00
Roel Kluin
6aebb9b280 [NETFILTER]: nf_conntrack_h323: logical-bitwise & confusion in process_setup()
logical-bitwise & confusion

Signed-off-by: Roel Kluin <12o3l@tiscali.nl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-20 15:06:23 -07:00
Trond Myklebust
c7c350e92a Merge branch 'hotfixes' into devel 2008-03-19 17:59:44 -04:00
Ingo Molnar
6f3d09291b sched, net: socket wakeups are sync
'sync' wakeups are a hint towards the scheduler that (certain)
networking related wakeups likely create coupling between tasks.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-03-19 04:27:53 +01:00
Robert P. J. Day
938b93adb2 [NET]: Add debugging names to __RW_LOCK_UNLOCKED macros.
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-18 00:59:23 -07:00
David S. Miller
577f99c1d0 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/wireless/rt2x00/rt2x00dev.c
	net/8021q/vlan_dev.c
2008-03-18 00:37:55 -07:00
David S. Miller
2f633928cb Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 2008-03-17 23:44:31 -07:00
Al Viro
5e226e4d90 [IPV4]: esp_output() misannotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-17 22:50:23 -07:00
Al Viro
9534f035ee [8021Q]: vlan_dev misannotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-17 22:49:48 -07:00
Al Viro
27724426a9 [SUNRPC]: net/* NULL noise
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-17 22:48:03 -07:00
Al Viro
bc92dd194d [SCTP]: fix misannotated __sctp_rcv_asconf_lookup()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-17 22:47:32 -07:00
Al Viro
0382b9c354 [PKT_SCHED]: annotate cls_u32
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-17 22:46:46 -07:00
Al Viro
e6f1cebf71 [NET] endianness noise: INADDR_ANY
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-17 22:44:53 -07:00
Trond Myklebust
98a8e32394 SUNRPC: Add a helper rpcauth_lookup_generic_cred()
The NFSv4 protocol allows clients to negotiate security protocols on the
fly in the case where an administrator on the server changes the export
settings and/or in the case where we may have a filesystem migration event.

Instead of having the NFS client code cache credentials that are tied to a
particular AUTH method it is therefore preferable to have a generic credential
that can be converted into whatever AUTH is in use by the RPC client when
the read/write/sillyrename/... is put on the wire.

We do this by means of the new "generic" credential, which basically just
caches the minimal information that is needed to look up an RPCSEC_GSS,
AUTH_SYS, or AUTH_NULL credential.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-14 13:42:49 -04:00
Trond Myklebust
5c691044ec SUNRPC: Add an rpc_credop callback for binding a credential to an rpc_task
We need the ability to treat 'generic' creds specially, since they want to
bind instances of the auth cred instead of binding themselves.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-14 13:42:41 -04:00
Trond Myklebust
9a559efd41 SUNRPC: Add a generic RPC credential
Add an rpc credential that is not tied to any particular auth mechanism,
but that can be cached by NFS, and later used to look up a cred for
whichever auth mechanism that turns out to be valid when the RPC call is
being made.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-14 13:42:38 -04:00
Trond Myklebust
4ccda2cdd8 SUNRPC: Clean up rpcauth_bindcred()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-14 13:42:35 -04:00
Trond Myklebust
af09383577 SUNRPC: Fix RPCAUTH_LOOKUP_ROOTCREDS
The current RPCAUTH_LOOKUP_ROOTCREDS flag only works for AUTH_SYS
authentication, and then only as a special case in the code. This patch
removes the auth_sys special casing, and replaces it with generic code.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-14 13:42:32 -04:00
Trond Myklebust
25337fdc85 SUNRPC: Fix a bug in rpcauth_lookup_credcache()
The hash bucket is for some reason always being set to zero.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-14 13:42:29 -04:00
Adrian Bunk
d9357136ac the scheduled rc80211-simple.c removal
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-13 16:02:31 -04:00
Adrian Bunk
7524d7d6de the scheduled ieee80211 softmac removal
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-13 16:02:31 -04:00
Linus Torvalds
609eb39c8d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (47 commits)
  [SCTP]: Fix local_addr deletions during list traversals.
  net: fix build with CONFIG_NET=n
  [TCP]: Prevent sending past receiver window with TSO (at last skb)
  rt2x00: Add new D-Link USB ID
  rt2x00: never disable multicast because it disables broadcast too
  libertas: fix the 'compare command with itself' properly
  drivers/net/Kconfig: fix whitespace for GELIC_WIRELESS entry
  [NETFILTER]: nf_queue: don't return error when unregistering a non-existant handler
  [NETFILTER]: nfnetlink_queue: fix EPERM when binding/unbinding and instance 0 exists
  [NETFILTER]: nfnetlink_log: fix EPERM when binding/unbinding and instance 0 exists
  [NETFILTER]: nf_conntrack: replace horrible hack with ksize()
  [NETFILTER]: nf_conntrack: add \n to "expectation table full" message
  [NETFILTER]: xt_time: fix failure to match on Sundays
  [NETFILTER]: nfnetlink_log: fix computation of netlink skb size
  [NETFILTER]: nfnetlink_queue: fix computation of allocated size for netlink skb.
  [NETFILTER]: nfnetlink: fix ifdef in nfnetlink_compat.h
  [NET]: include <linux/types.h> into linux/ethtool.h for __u* typedef
  [NET]: Make /proc/net a symlink on /proc/self/net (v3)
  RxRPC: fix rxrpc_recvmsg()'s returning of msg_name
  net/enc28j60: oops fix
  ...
2008-03-12 13:08:09 -07:00
Tom Tucker
3fedb3c5a8 SVCRDMA: Fix erroneous BUG_ON in send_write
The assertion that checks for sge context overflow is
incorrectly hard-coded to 32. This causes a kernel bug
check when using big-data mounts. Changed the BUG_ON to
use the computed value RPCSVC_MAXPAGES.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-12 12:37:34 -07:00
Tom Tucker
c48cbb405c SVCRDMA: Add xprt refs to fix close/unmount crash
RDMA connection shutdown on an SMP machine can cause a kernel crash due
to the transport close path racing with the I/O tasklet.

Additional transport references were added as follows:
- A reference when on the DTO Q to avoid having the transport
  deleted while queued for I/O.
- A reference while there is a QP able to generate events.
- A reference until the DISCONNECTED event is received on the CM ID

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-12 12:37:34 -07:00
David S. Miller
ba73d4c84a Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6.26 2008-03-11 19:17:18 -07:00
Chidambar 'ilLogict' Zinnoury
22626216c4 [SCTP]: Fix local_addr deletions during list traversals.
Since the lists are circular, we need to explicitely tag
the address to be deleted since we might end up freeing
the list head instead.  This fixes some interesting SCTP
crashes.

Signed-off-by: Chidambar 'ilLogict' Zinnoury <illogict@online.fr>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-11 18:05:02 -07:00
Ilpo Järvinen
5ea3a74806 [TCP]: Prevent sending past receiver window with TSO (at last skb)
With TSO it was possible to send past the receiver window when the skb
to be sent was the last in the write queue while the receiver window
is the limiting factor. One can notice that there's a loophole in the
tcp_mss_split_point that lacked a receiver window check for the
tcp_write_queue_tail() if also cwnd was smaller than the full skb.

Noticed by Thomas Gleixner <tglx@linutronix.de> in form of "Treason
uncloaked! Peer ... shrinks window .... Repaired."  messages (the peer
didn't actually shrink its window as the message suggests, we had just
sent something past it without a permission to do so).

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Tested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-11 17:55:27 -07:00
Patrick McHardy
94be1a3f36 [NETFILTER]: nf_queue: don't return error when unregistering a non-existant handler
Commit ce7663d84:

[NETFILTER]: nfnetlink_queue: don't unregister handler of other subsystem

changed nf_unregister_queue_handler to return an error when attempting to
unregister a queue handler that is not identical to the one passed in.
This is correct in case we really do have a different queue handler already
registered, but some existing userspace code always does an unbind before
bind and aborts if that fails, so try to be nice and return success in
that case.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-10 16:45:05 -07:00
Patrick McHardy
914afea84e [NETFILTER]: nfnetlink_queue: fix EPERM when binding/unbinding and instance 0 exists
Similar to the nfnetlink_log problem, nfnetlink_queue incorrectly
returns -EPERM when binding or unbinding to an address family and
queueing instance 0 exists and is owned by a different process. Unlike
nfnetlink_log it previously completes the operation, but it is still
incorrect.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-10 16:44:36 -07:00
Patrick McHardy
b7047a1c88 [NETFILTER]: nfnetlink_log: fix EPERM when binding/unbinding and instance 0 exists
When binding or unbinding to an address family, the res_id is usually set
to zero. When logging instance 0 already exists and is owned by a different
process, this makes nfunl_recv_config return -EPERM without performing
the bind operation.

Since no operation on the foreign logging instance itself was requested,
this is incorrect. Move bind/unbind commands before the queue instance
permissions checks.

Also remove an incorrect comment.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-10 16:44:13 -07:00
Pekka Enberg
019f692ea7 [NETFILTER]: nf_conntrack: replace horrible hack with ksize()
There's a horrible slab abuse in net/netfilter/nf_conntrack_extend.c
that can be replaced with a call to ksize().

Cc: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-10 16:43:41 -07:00
Alexey Dobriyan
3d89e9cf36 [NETFILTER]: nf_conntrack: add \n to "expectation table full" message
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-10 16:43:10 -07:00
Jan Engelhardt
4f4c9430cf [NETFILTER]: xt_time: fix failure to match on Sundays
From: Andrew Schulman <andrex@alumni.utexas.net>

xt_time_match() in net/netfilter/xt_time.c in kernel 2.6.24 never
matches on Sundays. On my host I have a rule like

    iptables -A OUTPUT -m time --weekdays Sun -j REJECT

and it never matches. The problem is in localtime_2(), which uses

    r->weekday = (4 + r->dse) % 7;

to map the epoch day onto a weekday in {0,...,6}. In particular this
gives 0 for Sundays. But 0 has to be wrong; a weekday of 0 can never
match. xt_time_match() has

    if (!(info->weekdays_match & (1 << current_time.weekday)))
        return false;

and when current_time.weekday = 0, the result of the & is always
zero, even when info->weekdays_match = XT_TIME_ALL_WEEKDAYS = 0xFE.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-10 16:42:40 -07:00
Eric Leblond
7000d38d61 [NETFILTER]: nfnetlink_log: fix computation of netlink skb size
This patch is similar to nfnetlink_queue fixes. It fixes the computation
of skb size by using NLMSG_SPACE instead of NLMSG_ALIGN.

Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-10 16:42:04 -07:00
Eric Leblond
cabaa9bfb0 [NETFILTER]: nfnetlink_queue: fix computation of allocated size for netlink skb.
Size of the netlink skb was wrongly computed because the formula was using
NLMSG_ALIGN instead of NLMSG_SPACE. NLMSG_ALIGN does not add the room for
netlink header as NLMSG_SPACE does. This was causing a failure of message
building in some cases.

On my test system, all messages for packets in range [8*k+41, 8*k+48] where k
is an integer were invalid and the corresponding packets were dropped.

Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-10 16:41:43 -07:00
Trond Myklebust
9446389ef6 Merge commit 'origin' into devel 2008-03-08 11:49:24 -05:00
Johannes Berg
e5f98f2df9 mac80211: don't call conf_tx under RCU lock
Reinette pointed out that with the sta_info RCU-ification
the behaviour here changed and the conf_tx callback is
now invoked under RCU read lock. That is not necessary so
this patch restores the original behaviour

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Tested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-07 16:02:59 -05:00
Linus Torvalds
4c1aa6f8b9 Merge branch 'hotfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'hotfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
  NFS: Fix dentry revalidation for NFSv4 referrals and mountpoint crossings
  NFS: Fix the fsid revalidation in nfs_update_inode()
  SUNRPC: Fix a nfs4 over rdma transport oops
  NFS: Fix an f_mode/f_flags confusion in fs/nfs/write.c
2008-03-07 12:08:07 -08:00
Tom Talpey
ee1a2c564f SUNRPC: Fix a nfs4 over rdma transport oops
Prevent an RPC oops when freeing a dynamically allocated RDMA
buffer, used in certain special-case large metadata operations.

Signed-off-by: Tom Talpey <tmt@netapp.com>
Signed-off-by: James Lentini <jlentini@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-07 14:35:32 -05:00
Daniel Lezcano
b8ad0cbc58 [NETNS][IPV6] mcast - handle several network namespace
This patch make use of the network namespace information at the right
places to handle the multicast for several network namespaces.  It
makes the socket control to be per namespace too.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-07 11:16:55 -08:00
Daniel Lezcano
e504799276 [NETNS][IPV6] tcp6 - handle several network namespace
We have the right network namespace at the right place now.
So make use of this information to make tcp6 per network namespace

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-07 11:16:26 -08:00
Daniel Lezcano
93ec926b07 [NETNS][IPV6] tcp6 - make socket control per namespace
Instead of having a tcp6_socket global to all the namespace, there is
tcp6 socket control per namespace. That is consistent with which
namespace sent a RST and allows to pass the socket to the underlying
function to retrieve the network namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-07 11:16:02 -08:00
Daniel Lezcano
1762f7e88e [NETNS][IPV6] ndisc - make socket control per namespace
Make ndisc socket control per namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-07 11:15:34 -08:00
Daniel Lezcano
a18bc6959d [NETNS][IPV6] ndisc - make ndisc handle multiple network namespaces
Make ndisc handle multiple network namespaces: 
Remove references to init_net, add network namespace parameters and add 
pernet_operations for ndisc

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-07 11:14:49 -08:00
Daniel Lezcano
8a3edd800d [NETNS][IPV6] fix some missing namespace
This patch adds some missing namespace

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-07 11:14:16 -08:00
David S. Miller
db8dac20d5 [UDP]: Revert udplite and code split.
This reverts commit db1ed684f6 ("[IPV6]
UDP: Rename IPv6 UDP files."), commit
8be8af8fa4 ("[IPV4] UDP: Move
IPv4-specific bits to other file.") and commit
e898d4db27 ("[UDP]: Allow users to
configure UDP-Lite.").

First, udplite is of such small cost, and it is a core protocol just
like TCP and normal UDP are.

We spent enormous amounts of effort to make udplite share as much code
with core UDP as possible.  All of that work is less valuable if we're
just going to slap a config option on udplite support.

It is also causing build failures, as reported on linux-next, showing
that the changeset was not tested very well.  In fact, this is the
second build failure resulting from the udplite change.

Finally, the config options provided was a bool, instead of a modular
option.  Meaning the udplite code does not even get build tested
by allmodconfig builds, and furthermore the user is not presented
with a reasonable modular build option which is particularly needed
by distribution vendors.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-06 16:22:02 -08:00
Allan Stephens
ba0fa45994 [TIPC]: Update version to 1.6.3
This patch updates TIPC's version number to 1.6.3.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-06 15:08:40 -08:00
Allan Stephens
c0cb7ef086 [TIPC]: Enhancements to message header writing
This patch makes two enhancements to the routine used to
set bit fields within a TIPC message header:

 1) It now ignores any bits of the new field value that are not
    covered by the mask being used.  (Previously, if the new value
    exceeded the size of the mask the extra bits could corrupt
    other fields in the message header word being updated.)

 2) The code has been optimized to minimize the number of run-time
    endianness conversion operations by leveraging the fact that the
    mask (and, in some cases, the value as well) is constant and the
    necessary conversion can be performed by the compiler.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-06 15:08:10 -08:00
Allan Stephens
37695420a2 [TIPC]: Use correct bitmask when setting version
This patch ensures that the 3-bit version field of the TIPC
message header is masked correctly when written into a message.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-06 15:07:42 -08:00
Allan Stephens
06d82c9191 [TIPC]: Minor cleanup of message header code
This patch eliminates some unused or duplicate message header
symbols, and fixes up the comments and/or location of a few
other symbols.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-06 15:06:55 -08:00
Allan Stephens
0e0609bbd2 [TIPC]: Eliminate "sparse" symbol warnings
This patch eliminates warnings about undeclared symbols.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-06 15:06:06 -08:00
Allan Stephens
e247a8f5d0 [TIPC]: Add argument validation for shutdown()
This patch validates that the "how" argument to shutdown()
is SHUT_RDWR, since this is the only form that TIPC supports.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-06 15:05:38 -08:00
Allan Stephens
8c8696553a [TIPC]: Removal of message header option code
This patch removes code associated with optional, user-specified
fields of the TIPC message header.  Such fields were never
utilized by TIPC, and have now been removed from the protocol
specification.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-06 15:05:07 -08:00
Johannes Berg
69d3b6f491 mac80211: fix hardware scan completion
The mac80211 MLME requires restarting timers after a scan
completes but this wasn't done when hardware scan offload
was added, so add it now.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Tested-by: Bill Moss <bmoss@clemson.edu>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 16:40:54 -05:00
Luis Carlos Cobo
2a8ca29a88 mac80211: fix mesh_path and sta_info get_by_idx functions
Skip properly entries whose dev does not match.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 16:40:54 -05:00
Luis Carlos Cobo
a00de5d08b mac80211: path IE fields macros, fix alignment problems and clean up
Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 16:40:53 -05:00
Luis Carlos Cobo
b4e08ea141 mac80211: add PLINK_ prefix and kernel doc to enum plink_state
Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 16:40:52 -05:00
Luis Carlos Cobo
cfa22c716f mac80211: always force mesh_path deletions
Postponing the deletion is not really useful anymore.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 16:40:51 -05:00
Luis Carlos Cobo
89a1ad6990 mac80211: delete mesh_path timer on mesh_path removal
This avoids dereferencing a no longer existing struct mesh_path.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 16:40:51 -05:00
Luis Carlos Cobo
aa2b592843 mac80211: clean up use of endianness conversion functions
Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 16:40:50 -05:00
Luis Carlos Cobo
4f5d4c4da8 mac80211: breakdown mesh network attributes in different extra fields for wext
Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 16:40:50 -05:00
Luis Carlos Cobo
3b091cd494 mac80211: move comment to better location
Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 16:40:49 -05:00
Luis Carlos Cobo
1d1b535969 mac80211: fix incorrect parenthesis
Pointed out by Johannes Berg.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 16:40:49 -05:00
Luis Carlos Cobo
37659ff8e1 mac80211: fix mesh endianness sparse warnings and unmark it as broken
This patch fixes all the mesh related endianness warnings reported by sparse. As
they were the reason why Johannes marked mesh as BROKEN, that flag has been
removed.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 16:40:48 -05:00
Johannes Berg
96c46546e2 mac80211: always insert key into list
Today I hit one of my new WARN_ONs in the mac80211 code because
a key wasn't being freed correctly. After wondering for a while
I finally tracked it to the fact that STA keys aren't added to
the per-sdata key list correctly, they are supposed to always be
on that list, not just for default keys. This patch fixes that.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:47 -05:00
Johannes Berg
03e4497ebe mac80211: fix sta_info mesh timer bug
I noticed a bug I introduced when mesh is enabled: sta_info_destroy()
will end up calling cancel_timer() on a timer that has never been
initialized because the timer is only initialized in mesh_plink_alloc(),
not in sta_info_alloc(). This patch moves the initialization of all mesh
related fields into sta_info_alloc(), adds a bit of sanity checking to
the cfg80211 handlers and sta_info_insert() and makes mesh_plink_alloc()
a static helper function that is only used from the mesh plink code.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:47 -05:00
Johannes Berg
dbbea6713d mac80211: add documentation book
Quite a while ago I started this book. The required kernel-doc
patches have since gone into the tree so it is now possible to
build the book in mainline.

The actual documentation is still rather incomplete and not all
things are linked into the book, but this enables us to edit
the documentation collaboratively, hopefully driver authors can
add documentation based on their experience with mac80211.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:47 -05:00
Johannes Berg
7c8076bd8b mac80211: don't clear next_hop in path reclaim
Luis pointed out that this path is going to be freed right
away anyway so there's no point in assigning next_hop.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:47 -05:00
Johannes Berg
44213b5e13 mac80211: remove STA entries when taking down interface
When we take down an interface, we need to remove the STA info
items that belong to it because otherwise we might invoke a
sta_notify() callback in the driver when we later delete the
STA entries, but in that case the driver will already have
removed its knowledge of the interface they belonged to leading
to confusion. Also, we could invoke the set_tim() callback after
the driver removed its knowledge of the interface, which can
lead to a crash if it requests a beacon with a then-invalid vif
pointer!

A side effect of this patch is that, because it was easier, it
disallows changing the WDS peer while an interface is up. Should
that actually be necessary, it can be added back, but the WDS
peer STA entry may not be added while the interface is UP so for
now I've simplified the WDS peer's STA entry lifetime management.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:47 -05:00
Johannes Berg
693b1bbcc4 mac80211: clean up sta_info and document locking
This patch cleans up the sta_info struct and documents how
each set of variables is locked. Notably, flags locking is
completely missing. It also adds kernel-doc for some (but
not all yet) members of the struct.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:47 -05:00
Johannes Berg
73651ee639 mac80211: split sta_info_add
sta_info_add() has two functions: allocating a station info
structure and inserting it into the hash table/list. Splitting
these two functions allows allocating with GFP_KERNEL in many
places instead of GFP_ATOMIC which is now required by the RCU
protection. Additionally, in many places RCU protection is now
no longer needed at all because between sta_info_alloc() and
sta_info_insert() the caller owns the structure.

This fixes a few race conditions with setting initial flags
and similar, but not all (see comments in ieee80211_sta.c and
cfg.c). More documentation on the existing races will be in
a follow-up patch.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:47 -05:00
Johannes Berg
d0709a6518 mac80211: RCU-ify STA info structure access
This makes access to the STA hash table/list use RCU to protect
against freeing of items. However, it's not a true RCU, the
copy step is missing: whenever somebody changes a STA item it
is simply updated. This is an existing race condition that is
now somewhat understandable.

This patch also fixes the race key freeing vs. STA destruction
by making sure that sta_info_destroy() is always called under
RTNL and frees the key.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:46 -05:00
Johannes Berg
5cf121c3cd mac80211: split ieee80211_txrx_data
Split it into ieee80211_tx_data and ieee80211_rx_data to clarify
usage/flag usage and remove the stupid union thing.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:46 -05:00
Johannes Berg
7495883bdd mac80211: reorder a few fields in sta_info
Three __le16s followed by an enum (int) leave a two-byte hole
of padding which we can use for two of the other fields.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:46 -05:00
Johannes Berg
42096b634f mac80211: fix kernel-doc comment for mesh_plink_deactivate
Accidentally copied in a __mesh_plink_deactivate, noticed by Luis
Carlos Cobo.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:46 -05:00
Johannes Berg
d6d1a5a709 mac80211: clean up mesh RX path a bit more
Moves another ifdef into the sta_info header file in favour of
compiling more code even w/o CONFIG_MAC80211_MESH.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:46 -05:00
Johannes Berg
c1edd987a4 mac80211: export mesh_plink_broken
This needs to be exported because rate control algorithms
can be modular.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:45 -05:00
Johannes Berg
5c142e8db4 mac80211: clarify mesh Kconfig
This clarifies that the mesh networking code is currently
based on Draft 1.08 of the 802.11 Mesh Networking amendment.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:44 -05:00
Johannes Berg
ff59dc76e6 mac80211: add missing "break" statement in mesh code
This inserts a missing break statement which, if hit, would cause
the code to fall-through and unlock a spinlock twice. Noticed via
sparse's "lock count wrong in basic block" warning and careful
code inspection.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:43 -05:00
Johannes Berg
2f5ce793c0 mac80211: enable mesh in Kconfig
Currently marked BROKEN because of endianness problems.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:43 -05:00
Johannes Berg
dc0b0f7d1e mac80211: mesh hwmp locking fixes
This fixes missing unlocks noticed by sparse.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:42 -05:00
Johannes Berg
902acc7896 mac80211: clean up mesh code
Various cleanups, reducing the #ifdef mess and other things.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:42 -05:00
Luis Carlos Cobo
f7a9214437 mac80211: complete the mesh (interface handling) code
This completes the mesh interface handling code and a few other
bits about the mac80211 module.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:42 -05:00
Luis Carlos Cobo
c5dd9c2bd0 mac80211: mesh path and mesh peer configuration
This adds code to allow adding mesh interfaces and configuring
mesh peers etc. Also, it adds code for station dumping.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:42 -05:00
Luis Carlos Cobo
9f42f60705 mac80211: mesh statistics and config through debugfs
This patch contains the debugfs code for mesh statistics and configuration
parameters. Please note that generic support for r/w debugfs attributes has been
added.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:42 -05:00
Luis Carlos Cobo
050ac52cbe mac80211: code for on-demand Hybrid Wireless Mesh Protocol
This file implements the on-demand Hybrid Wireless Mesh Protocol, at this moment
using hop-count as the metric. When no mesh path exists for a given destination
or the mesh path is not active, frames addressed to that destination will be
queued and a Path Request frame will be sent. Queued frames will be sent when
the path is resolved (usually after reception of a Path Response) or discarded
if discovery times out. Path Requests will also be sent to refresh paths that
are being used and are close to expiring.

Path Errors are sent when a path discovery process triggered by the attempt to
forward a frame originated in a different mesh point times out. Path Errors are
also sent when a peer link is determined to be unreachable because of high error
rates.

Multiple destination support in Path Requests and Path Errors and precursors
have not been implemented yet.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:42 -05:00
Luis Carlos Cobo
eb2b9311fd mac80211: mesh path table implementation
The mesh path table associates destinations with the next hop to reach them. The
table is a hash of linked lists protected by rcu mechanisms. Every mesh path
contains a lock to protect the mesh path state.

Each outgoing mesh frame requires a look up into this table. Therefore, the
table it has been designed so it is not necessary to hold any lock to find the
appropriate next hop.

If the path is determined to be active within a rcu context we can safely
dereference mpath->next_hop->addr, since it holds a reference to the sta
next_hop. After a mesh path has been set active for the first time it next_hop
must always point to a valid sta.  If this is not possible the mpath must be
deleted or replaced in a RCU safe fashion.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:42 -05:00
Luis Carlos Cobo
c3896d2ca4 mac80211: mesh peer link implementation
This file implements mesh discovery and peer link establishment support using
the mesh peer link table provided in mesh_plinktbl.c.

Secure peer links have not been implemented yet.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:41 -05:00
Luis Carlos Cobo
f709fc696d mac80211: mesh changes to the MLME
This includes support for mesh network scanning. The ugly code in
ieee80211_sta_scan_result() is my approach to work around wext. This has been
tested with wireless tools version 29 and works as expected (the new interface
mode is just not shown).

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:41 -05:00
Luis Carlos Cobo
ee3858551a mac80211: mesh data structures and first mesh changes
Includes integration in struct sta_info of mesh peer link elements, previously
on their own mesh peer link table.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:41 -05:00
Luis Carlos Cobo
33b64eb2b1 mac80211: support for mesh interfaces in mac80211 data path
This changes the TX/RX paths in mac80211 to support mesh interfaces.
This code will be cleaned up later again before being enabled.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:41 -05:00
Luis Carlos Cobo
2e3c873682 mac80211: support functions for mesh
The two important features coded in mesh.c are:

Recently Multicast Cache: in on-demand HWMP, multicast traffic is retransmitted
by every receiving node. Even though a mesh TTL counter avoids infinite loops,
it is also necessary to avoid traffic explosion by keeping a cache of multicast
mesh frame that have been received recently. With this feature, maximum number
of retransmissions of a multicast frame for the case of N nodes within the range
of each other would be N. Without it, the maximum number of retransmissions
would be in the order of N^(MESH_TTL - 1).

Code to support mesh tables.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:41 -05:00
Luis Carlos Cobo
ccf80ddfe4 mac80211: mesh function and data structures definitions
Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:41 -05:00
Johannes Berg
6032f934c8 mac80211: add mesh interface type
This adds the mesh interface type.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:41 -05:00
Luis Carlos Cobo
2ec600d672 nl80211/cfg80211: support for mesh, sta dumping
Added support for mesh id and mesh path operation as well as
station structure dumping.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:41 -05:00
Harvey Harrison
0dc47877a3 net: replace remaining __FUNCTION__ occurrences
__FUNCTION__ is gcc-specific, use __func__

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-05 20:47:47 -08:00
David Howells
1ff82fe002 RxRPC: fix rxrpc_recvmsg()'s returning of msg_name
Fix rxrpc_recvmsg() to return msg_name correctly.  We shouldn't
overwrite the *msg struct, but should rather write into msg->msg_name
(there's a '&' unary operator that shouldn't be there).

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-05 18:53:55 -08:00
Tobias Klauser
a4e2acf01a bluetooth: make bnep_sock_cleanup() return void
bnep_sock_cleanup() always returns 0 and its return value isn't used
anywhere in the code.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-05 18:47:40 -08:00
Tobias Klauser
04005dd9ae bluetooth: Make hci_sock_cleanup() return void
hci_sock_cleanup() always returns 0 and its return value isn't used
anywhere in the code.

Compile-tested with 'make allyesconfig && make net/bluetooth/bluetooth.ko'

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
2008-03-05 18:47:03 -08:00
Dave Young
147e2d5983 bluetooth: hci_core: defer hci_unregister_sysfs()
Alon Bar-Lev reports:

 Feb 16 23:41:33 alon1 usb 3-1: configuration #1 chosen from 1 choice
Feb 16 23:41:33 alon1 BUG: unable to handle kernel NULL pointer  
dereference at virtual address 00000008
Feb 16 23:41:33 alon1 printing eip: c01b2db6 *pde = 00000000
Feb 16 23:41:33 alon1 Oops: 0000 [#1] PREEMPT
Feb 16 23:41:33 alon1 Modules linked in: ppp_deflate zlib_deflate  
zlib_inflate bsd_comp ppp_async rfcomm l2cap hci_usb vmnet(P)  
vmmon(P) tun radeon drm autofs4 ipv6 aes_generic crypto_algapi  
ieee80211_crypt_ccmp nf_nat_irc nf_nat_ftp nf_conntrack_irc  
nf_conntrack_ftp ipt_MASQUERADE iptable_nat nf_nat ipt_REJECT  
xt_tcpudp ipt_LOG xt_limit xt_state nf_conntrack_ipv4 nf_conntrack  
iptable_filter ip_tables x_tables snd_pcm_oss snd_mixer_oss  
snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device  
bluetooth ppp_generic slhc ioatdma dca cfq_iosched cpufreq_powersave  
cpufreq_ondemand cpufreq_conservative acpi_cpufreq freq_table uinput  
fan af_packet nls_cp1255 nls_iso8859_1 nls_utf8 nls_base pcmcia  
snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm nsc_ircc snd_timer  
ipw2200 thinkpad_acpi irda snd ehci_hcd yenta_socket uhci_hcd  
psmouse ieee80211 soundcore intel_agp hwmon rsrc_nonstatic pcspkr  
e1000 crc_ccitt snd_page_alloc i2c_i801 ieee80211_crypt pcmcia_core  
agpgart thermal bat!
tery nvram rtc sr_mod ac sg firmware_class button processor cdrom  
unix usbcore evdev ext3 jbd ext2 mbcache loop ata_piix libata sd_mod  
scsi_mod
Feb 16 23:41:33 alon1
Feb 16 23:41:33 alon1 Pid: 4, comm: events/0 Tainted: P         
(2.6.24-gentoo-r2 #1)
Feb 16 23:41:33 alon1 EIP: 0060:[<c01b2db6>] EFLAGS: 00010282 CPU: 0
Feb 16 23:41:33 alon1 EIP is at sysfs_get_dentry+0x26/0x80
Feb 16 23:41:33 alon1 EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX:  
f48a2210
Feb 16 23:41:33 alon1 ESI: f72eb900 EDI: f4803ae0 EBP: f4803ae0 ESP:  
f7c49efc
Feb 16 23:41:33 alon1 hcid[7004]: HCI dev 0 registered
Feb 16 23:41:33 alon1 DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
Feb 16 23:41:33 alon1 Process events/0 (pid: 4, ti=f7c48000  
task=f7c3efc0 task.ti=f7c48000)
Feb 16 23:41:33 alon1 Stack: f7cb6140 f4822668 f7e71e10 c01b304d  
ffffffff ffffffff fffffffe c030ba9c
Feb 16 23:41:33 alon1 f7cb6140 f4822668 f6da6720 f7cb6140 f4822668  
f6da6720 c030ba8e c01ce20b
Feb 16 23:41:33 alon1 f6e9dd00 c030ba8e f6da6720 f6e9dd00 f6e9dd00  
00000000 f4822600 00000000
Feb 16 23:41:33 alon1 Call Trace:
Feb 16 23:41:33 alon1 [<c01b304d>] sysfs_move_dir+0x3d/0x1f0
Feb 16 23:41:33 alon1 [<c01ce20b>] kobject_move+0x9b/0x120
Feb 16 23:41:33 alon1 [<c0241711>] device_move+0x51/0x110
Feb 16 23:41:33 alon1 [<f9aaed80>] del_conn+0x0/0x70 [bluetooth]
Feb 16 23:41:33 alon1 [<f9aaed99>] del_conn+0x19/0x70 [bluetooth]
Feb 16 23:41:33 alon1 [<c012c1a1>] run_workqueue+0x81/0x140
Feb 16 23:41:33 alon1 [<c02c0c88>] schedule+0x168/0x2e0
Feb 16 23:41:33 alon1 [<c012fc70>] autoremove_wake_function+0x0/0x50
Feb 16 23:41:33 alon1 [<c012c9cb>] worker_thread+0x9b/0xf0
Feb 16 23:41:33 alon1 [<c012fc70>] autoremove_wake_function+0x0/0x50
Feb 16 23:41:33 alon1 [<c012c930>] worker_thread+0x0/0xf0
Feb 16 23:41:33 alon1 [<c012f962>] kthread+0x42/0x70
Feb 16 23:41:33 alon1 [<c012f920>] kthread+0x0/0x70
Feb 16 23:41:33 alon1 [<c0104c2f>] kernel_thread_helper+0x7/0x18
Feb 16 23:41:33 alon1 =======================
Feb 16 23:41:33 alon1 Code: 26 00 00 00 00 57 89 c7 a1 50 1b 3a c0  
56 53 8b 70 38 85 f6 74 08 8b 0e 85 c9 74 58 ff 06 8b 56 50 39 fa 74  
47 89 fb eb 02 89 c3 <8b> 43 08 39 c2 75 f7 8b 46 08 83 c0 68 e8 98  
e7 10 00 8b 43 10
Feb 16 23:41:33 alon1 EIP: [<c01b2db6>] sysfs_get_dentry+0x26/0x80  
SS:ESP 0068:f7c49efc
Feb 16 23:41:33 alon1 ---[ end trace aae864e9592acc1d ]---

Defer hci_unregister_sysfs because hci device could be destructed
while hci conn devices still there.

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Tested-by: Stefan Seyfried <seife@suse.de>
Acked-by: Alon Bar-Lev <alon.barlev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
2008-03-05 18:45:59 -08:00
Eric Dumazet
ee6b967301 [IPV4]: Add 'rtable' field in struct sk_buff to alias 'dst' and avoid casts
(Anonymous) unions can help us to avoid ugly casts.

A common cast it the (struct rtable *)skb->dst one.

Defining an union like  :
union {
     struct dst_entry *dst;
     struct rtable *rtable;
};
permits to use skb->rtable in place.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-05 18:30:47 -08:00
Neil Horman
219b99a9ed [SCTP]: Bring MAX_BURST socket option into ietf API extension compliance
Brings max_burst socket option set/get into line with the latest ietf
socket extensions api draft, while maintaining backwards
compatibility.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-05 13:44:46 -08:00
Gui Jianfeng
140ee9603c SCTP: Fix chunk parameter processing bug
If an address family is not listed in "Supported Address Types"
parameter(INIT Chunk), but the packet is sent by that family, this
address family should be considered as supported by peer.  Otherwise,
an error condition will occur. For instance, if kernel receives an
IPV6 SCTP INIT chunk with "Support Address Types" parameter which
indicates just supporting IPV4 Address family. Kernel will reply an
IPV6 SCTP INIT ACK packet, but the source ipv6 address in ipv6 header
will be vacant. This is not correct.

refer to RFC4460 as following:
      IMPLEMENTATION NOTE: If an SCTP endpoint lists in the 'Supported
      Address Types' parameter either IPv4 or IPv6, but uses the other
      family for sending the packet containing the INIT chunk, or if it
      also lists addresses of the other family in the INIT chunk, then
      the address family that is not listed in the 'Supported Address
      Types' parameter SHOULD also be considered as supported by the
      receiver of the INIT chunk.  The receiver of the INIT chunk SHOULD
      NOT respond with any kind of error indication.

Here is a fix to comply to RFC.

Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-05 13:43:32 -08:00
Daniel Lezcano
a05c44f6d5 [IPV6]: Remove commented lines.
Remove commented lines from netns patchset.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-05 12:37:29 -08:00
David S. Miller
255333c1db Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	net/mac80211/rc80211_pid_algo.c
2008-03-05 12:26:41 -08:00
Benjamin Thery
9a43b709a2 [NETNS][IPV6] icmp6 - make icmpv6_socket per namespace
This patch make the changes necessary to support network namespaces in
ICMPv6.

Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-05 10:49:18 -08:00
Daniel Lezcano
da6bb5c0c5 [NETNS][IPV6] ip6_input - enable ipv6_rcv to handle several network namespace
The different subsystem of ipv6 are ready for namespaces, so let's
activate it for ipv6_rcv.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-05 10:48:56 -08:00
Daniel Lezcano
c20121ae87 [NETNS][IPV6] route6 - pass always a valid socket to ip6_dst_lookup
The ip6_dst_lookup receive a socket as parameter. In some part of the code
it is called with a NULL socket parameter. We want to rely on the socket
to retrieve the network namespace, so we always pass a valid socket in all
cases.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-05 10:48:35 -08:00
Daniel Lezcano
4591db4f37 [NETNS][IPV6] route6 - add netns parameter to ip6_route_output
Add an netns parameter to ip6_route_output. That will allow to access
to the right routing table for outgoing traffic.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-05 10:48:10 -08:00
Benjamin Thery
6fda735005 [NETNS][IPV6] addrconf - make addrconf per namespace
All the infrastructure to propagate the network namespace information
is ready. Make use of it.

There is a special case here between the initial network namespace and
the other namespaces:

* When ipv6 is initialized at boot time (aka in the init_net), it
registers to the notifier callback. So addrconf_notify will be called
as many time as there are network devices setup on the system and the
function will add ipv6 addresses to the network devices. But the first
device which needs to have its ipv6 address setup is the loopback,
unfortunatly this is not the case. So the loopback address is setup
manually in the ipv6 init function.

* With the network namespace, this ordering problem does not appears
because notifier is already setup and active, so as soon as we
register the loopback the ipv6 address is setup and it will be the
first device.

Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-05 10:47:47 -08:00
Daniel Lezcano
af2849377e [NETNS][IPV6] addrconf - Pass the proper network namespace parameters to addrconf
This patch propagates the network namespace pointer to the address
configuration routines which need it, which means adding a new
parameter to these functions, and make them use it instead of using
the initial network namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-05 10:46:57 -08:00
Daniel Lezcano
300bf591de [NETNS][IPV6] proc - protect snmp6 from non-init_net calls
This patchset avoids creation of the /proc entry for snmp6 when
the call is made from a network namespace different from the init_net.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-05 10:46:31 -08:00
Benjamin Thery
075de93957 [NETNS][IPV6] af_inet6 - allow socket creation per namespace
Allow creation of IPv6 raw and datagram sockets in network namespaces
other than init_net.

Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-05 10:45:59 -08:00
Benjamin Thery
94911fe317 [NETNS][IPV6] Move sysctl initialization later on in the IPv6 init sequence
This patch moves initialization of IPv6 sysctl stuff at the end of
IPv6 initialization.

This will be helpful for network namespaces where some sysctl entries
depend on per-namespace variables, that need to be allocated and
initialized before they are referenced by sysctl.

Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-05 10:45:36 -08:00
Stephen Hemminger
dea75bdfa5 [IPCONFIG]: The kernel gets no IP from some DHCP servers
From: Stephen Hemminger <shemminger@linux-foundation.org>

Based upon a patch by Marcel Wappler:
 
   This patch fixes a DHCP issue of the kernel: some DHCP servers
   (i.e.  in the Linksys WRT54Gv5) are very strict about the contents
   of the DHCPDISCOVER packet they receive from clients.
 
   Table 5 in RFC2131 page 36 requests the fields 'ciaddr' and
   'siaddr' MUST be set to '0'.  These DHCP servers ignore Linux
   kernel's DHCP discovery packets with these two fields set to
   '255.255.255.255' (in contrast to popular DHCP clients, such as
   'dhclient' or 'udhcpc').  This leads to a not booting system.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-04 17:03:49 -08:00
David S. Miller
3123e666ea Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6 2008-03-04 16:44:01 -08:00
Stefano Brivio
1d60ab0574 rc80211-pid: fix rate adjustment
Merge rate_control_pid_shift_adjust() to rate_control_pid_adjust_rate()
in order to make the learning algorithm aware of constraints on rates. Also
add some comments and rename variables.

This fixes a bug which prevented 802.11b/g non-AP STAs from working with
802.11b only AP STAs.

This patch was originally destined for 2.6.26, and is being backported
to fix a user reported problem in post-2.6.24 kernels.

Signed-off-by: Stefano Brivio <stefano.brivio@polimi.it>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-04 18:36:35 -05:00
Herbert Xu
ed58dd41f3 [ESP]: Add select on AUTHENC
Now the ESP uses the AEAD interface even for algorithms which are
not combined mode, we need to select CONFIG_CRYPTO_AUTHENC as
otherwise only combined mode algorithms will work.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-04 14:29:21 -08:00
Sangtae Ha
6b3d626321 [TCP]: TCP cubic v2.2
We have updated CUBIC to fix some issues with slow increase in large
BDP networks. We also improved its convergence speed. The fix is in
fact very simple -- the window increase limit of smax during the
window probing phase (i.e., convex growth phase) is removed. We found
that this does not affect TCP friendliness, but only improves its
scalability. We have run some tests in our lab and also over the
Internet path from NCSU to Japan. These results can be seen from the
following page:

http://netsrv.csc.ncsu.edu/wiki/index.php/Intra_protocol_fairness_testing_with_linux-2.6.23.9
http://netsrv.csc.ncsu.edu/wiki/index.php/RTT_fairness_testing_with_linux-2.6.23.9
http://netsrv.csc.ncsu.edu/wiki/index.php/TCP_friendliness_testing_with_linux-2.6.23.9

Signed-off-by: Sangtae Ha <sha2@ncsu.edu>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-04 14:17:41 -08:00
Daniel Lezcano
7019b78e14 [NETNS][IPV6] route6 - Make ip6_dst_gc simpler
This patches improves the readibility of the ip6_dst_gc() routine.
It simplifies long lines which grow a lot due to the introduction
of network namespaces support.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Acked-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-04 13:50:14 -08:00
Benjamin Thery
6891a346c3 [NETNS][IPV6] route6 - make garbage collection work with multiple network namespaces
This patch makes the necessary changes to make IPv6 dst_entry garbage
collection work with multiple network namespaces.

In ip6_dst_gc(), static local variables are now declared
per-namespace.

Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-04 13:49:47 -08:00
Benjamin Thery
f2fc6a5458 [NETNS][IPV6] route6 - move ip6_dst_ops inside the network namespace
The ip6_dst_ops is moved inside the network namespace structure.  All
references to this structure are now relative to the initial network
namespace.

Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-04 13:49:23 -08:00
Daniel Lezcano
9a7ec3a94d [NETNS][IPV6] route6 - dynamically allocate ip6_dst_ops
ip6_dst_ops is dynamically allocated in init and exit functions.  That
provides the ability to do multiple instanciations of this structure.

This will be needed for network namespaces, indeed dst_ops stores data
that are required to be per namespace: entries and gc_thresh.

Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-04 13:48:53 -08:00
Daniel Lezcano
8ed6778967 [NETNS][IPV6] rt6_info - move rt6_info structure inside the namespace
The rt6_info structures are moved inside the network namespace
structure. All references to these structures are now relative to the
initial network namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-04 13:48:30 -08:00
Daniel Lezcano
bdb3289f73 [NETNS][IPV6] rt6_info - make rt6_info accessed as a pointer
This patch make mindless changes and prepares the code to use dynamic
allocation for rt6_info structure. The code accesses the rt6_info
structure as a pointer instead of a global static variable.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-04 13:48:10 -08:00
Daniel Lezcano
5578689a4e [NETNS][IPV6] route6 - make route6 per namespace
This patch makes the routing engine use the network namespaces to
access routing informations: Add a network namespace parameter to
ipv6_route_ioctl and propagate the network namespace value to all the
routing code that have not yet been changed.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-04 13:47:47 -08:00
Daniel Lezcano
7b4da53229 [NETNS][IPV6] route6 - Pass the network namespace parameter to rt6_purge_dflt_routers
Add a network namespace parameter to rt6_purge_dflt_routers.  This is
needed to call fib6_get_table with the appropriate network namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-04 13:47:14 -08:00
Daniel Lezcano
efa2cea0d9 [NETNS][IPV6] route6 - Pass network namespace to rt6_add_route_info and rt6_get_route_info
Add a network namespace parameter to rt6_add_route_info() and
rt6_get_route_info to enable them to handle multiple network
namespaces.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-04 13:46:48 -08:00
Daniel Lezcano
69ddb80562 [NETNS][IPV6] route6 - Make proc entry /proc/net/rt6_stats per namespace
Make the proc entry /proc/net/rt6_stats work in all network namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-04 13:46:23 -08:00
Daniel Lezcano
606a2b4862 [NETNS][IPV6] route6 - Pass the network namespace parameter to rt6_lookup
Add a network namespace parameter to rt6_lookup().

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-04 13:45:59 -08:00
Daniel Lezcano
cdb1876192 [NETNS][IPV6] route6 - create route6 proc files for the namespace
Make /proc/net/ipv6_route and /proc/net/rt6_stats to be per namespace.
These proc files are now created when the network namespace is
initialized.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-04 13:45:33 -08:00
David S. Miller
d9452e9f81 [NETPOLL]: Revert two bogus cleanups that broke netconsole.
Based upon a report by Andrew Morton and code analysis done
by Jarek Poplawski.

This reverts 33f807ba0d ("[NETPOLL]:
Kill NETPOLL_RX_DROP, set but never tested.")  and
c7b6ea24b4 ("[NETPOLL]: Don't need
rx_flags.").

The rx_flags did get tested for zero vs. non-zero and therefore we do
need those tests and that code which sets NETPOLL_RX_DROP et al.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-04 12:28:49 -08:00
Timo Teras
83321d6b98 [AF_KEY]: Dump SA/SP entries non-atomically
Stop dumping of entries when af_key socket receive queue is getting
full and continue it later when there is more room again.

This fixes dumping of large databases. Currently the entries not
fitting into the receive queue are just dropped (including the
end-of-dump message) which can confuse applications.

Signed-off-by: Timo Teras <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 23:40:12 -08:00
Matthias Kaehlcke
26bad2c05e [TIPC]: Convert tsock->sem in a mutex
The semaphore tsock->sem is used as mutex, convert it to the mutex API

Signed-off-by: Matthias Kaehlcke <matthias@kaehlcke.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 23:35:53 -08:00
Benjamin Thery
c572872f89 [NETNS][IPV6] rt6_stats - make the stats per network namespace
The rt6_stats is now per namespace with this patch. It is allocated
when a network namespace is created and freed when the network
namespace exits and references are relative to the network namespace.

Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 23:34:17 -08:00
Daniel Lezcano
6cc118bd50 [NETNS][IPV6] rt6_stats - dynamically allocate the routes statistics
This patch allocates the rt6_stats struct dynamically when the fib6 is
initialized. That provides the ability to create several instances of
this structure for the network namespaces.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 23:33:43 -08:00
Daniel Lezcano
dcabb819a6 [NETNS][IPV6] fib6_rules - handle several network namespaces
The fib6_rules_ops is moved to the network namespace structure.  All
references are changed to have it relatively to it.

Each time a network namespace is created a new fib6_rules_ops is
allocated, initialized and stored into the network namespace
structure.

The common part of the fib rules is namespace aware, so it is quite
easy to retrieve the network namespace from the rules and use it in
the different callbacks.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 23:33:08 -08:00
Daniel Lezcano
eb5564b853 [NETNS][IPV6] fib6 rule - dynamic allocation of the rules struct ops
The fib6_rules_ops structure is dynamically allocated, so that allows
to make several instances of it per network namespace.

The global static fib6_rules_ops structure is renamed to
fib6_rules_ops_template in order to quickly memcopy it for the
structure initialization.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 23:32:30 -08:00
Benjamin Thery
ec7d43c291 [NETNS][IPV6] ip6_fib - clean node use namespace
The fib6_clean_node function should have the network namespace it is
working on. The fib6_cleaner_t structure is extended with the network
namespace field to be passed to the fib6_clean_node function.

The different functions calling the fib6_clean_node function are
extended with the netns parameter when needed to propagate the netns
pointer.

Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 23:31:57 -08:00
Daniel Lezcano
63152fc0de [NETNS][IPV6] ip6_fib - gc timer per namespace
Move the timer initialization at the network namespace creation and
store the network namespace in the timer argument.

That enables multiple timers (one per network namespace) to do garbage
collecting.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 23:31:11 -08:00
Daniel Lezcano
450d19f8ab [NETNS][IPV6] ip6_fib - dynamically allocate gc-timer
The ip6_fib_timer gc timer is dynamically allocated and initialized in
the ip6 fib init function. There are no more references to a static
global variable. That will allow to make multiple instance of the
garbage collecting timer and make them per namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 23:29:33 -08:00
Daniel Lezcano
5b7c931dff [NETNS][IPV6] ip6_fib - add net to gc timer parameter
The fib tables are now relative to the network namespace. When the
garbage collector timer expires, we must have a network namespace
parameter in order to retrieve the tables. For now this is the
init_net, but we should be able to have a timer per namespace and use
the timer callback parameter to pass the network namespace from the
expired timer.

The timer callback, fib6_run_gc, is actually used to be called
synchronously by some functions and asynchronously when the timer
expires.

When the timer expires, the delay specified for fib6_run_gc parameter
is always zero. So, I changed fib6_run_gc to not be a timer callback
but a function called by the timer callback and I added a timer
callback where its work is just to retrieve from the data arg of the
timer the network namespace and call fib6_run_gc with zero expiring
time and the network namespace parameters. That makes the code cleaner
for the fib6_run_gc callers.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 23:28:58 -08:00
Daniel Lezcano
f3db48517f [NETNS][IPV6] ip6_fib - fib6_clean_all handle several network namespaces
The function fib6_clean_all takes the network namespace as
parameter. That allows to flush the routes related to a specific
network namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 23:27:06 -08:00
Daniel Lezcano
58f09b78b7 [NETNS][IPV6] ip6_fib - make it per network namespace
The fib table for ipv6 are moved to the network namespace structure.
All references to them are made relatively to the network namespace.

All external calls to the ip6_fib functions taking the network
namespace parameter are made using the init_net variable, so the
ip6_fib engine is ready for the namespaces but the callers not yet.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 23:25:27 -08:00
Daniel Lezcano
e0b85590bc [NETNS][IPV6] ip6_fib - dynamically allocate the fib tables
This patch changes the fib6 tables to be dynamically allocated.  That
provides the ability to make several instances of them when a new
network namespace is created.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 23:24:31 -08:00
YOSHIFUJI Hideaki
4192717807 [IPV6] MCAST: Use standard path for sending MLD/MLDv2 messages.
This is changing the paths for sending MLD/MLDv2 messages
from dev_queue_xmit() to standard dst_output().

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-04 15:18:24 +09:00
YOSHIFUJI Hideaki
3b00944c5c [IPV6]: Make ndisc_dst_alloc() common for later use.
For later use, this patch is renaming ndisc_dst_alloc()
(and related function/structures) to icmp6_dst_alloc()
(and so on).  This patch also removing unused function-
pointer argument for it.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-04 15:18:24 +09:00
YOSHIFUJI Hideaki
95e41e93e1 [IPV6]: Make ndisc_flow_init() common for later use.
For later use, this patch is renaming ndisc_flow_init() to
icmpv6_flow_init() and putting it in common place.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-04 15:18:24 +09:00
YOSHIFUJI Hideaki
5e5f3f0f80 [IPV6] ADDRCONF: Convert ipv6_get_saddr() to ipv6_dev_get_saddr().
Since most users of ipv6_get_saddr() pass non-NULL as
dst argument, use ipv6_dev_get_saddr() directly.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-04 15:18:23 +09:00
YOSHIFUJI Hideaki
5ee0910509 [IPV6] SYSCTL: complete initialization for sysctl table in subsystem code.
Move initialization bits for subsystem sysctl tables to
appropriate functions.
 - route
 - icmp

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-04 15:18:23 +09:00
YOSHIFUJI Hideaki
662397fd7a [IPV6]: Move packet_type{} related bits to af_inet6.c.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-04 15:18:23 +09:00
YOSHIFUJI Hideaki
db1ed684f6 [IPV6] UDP: Rename IPv6 UDP files.
Rename net/ipv6/udp.c to net/ipv6/udp_ipv6.c
Rename net/ipv6/udplite.c to net/ipv6/udplite_ipv6.c.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-04 15:18:22 +09:00
YOSHIFUJI Hideaki
8be8af8fa4 [IPV4] UDP: Move IPv4-specific bits to other file.
Move IPv4-specific UDP bits from net/ipv4/udp.c into (new) net/ipv4/udp_ipv4.c.
Rename net/ipv4/udplite.c to net/ipv4/udplite_ipv4.c.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-04 15:18:22 +09:00
YOSHIFUJI Hideaki
cf80efc27d [IPV4]: Fix size description of CONFIG_INET.
CONFIG_INET now enlarges about 400KB, not 140KB.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-04 15:18:22 +09:00
YOSHIFUJI Hideaki
e898d4db27 [UDP]: Allow users to configure UDP-Lite.
Let's give users an option for disabling UDP-Lite (~4K).

old:
|    text	   data	    bss	    dec	    hex	filename
|  286498	  12432	   6072	 305002	  4a76a	net/ipv4/built-in.o
|  193830	   8192	   3204	 205226	  321aa	net/ipv6/ipv6.o

new (without UDP-Lite):
|    text	   data	    bss	    dec	    hex	filename
|  284086	  12136	   5432	 301654	  49a56	net/ipv4/built-in.o
|  191835	   7832	   3076	 202743	  317f7	net/ipv6/ipv6.o

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-04 15:18:22 +09:00
Glenn Griffin
c6aefafb7e [TCP]: Add IPv6 support to TCP SYN cookies
Updated to incorporate Eric's suggestion of using a per cpu buffer
rather than allocating on the stack.  Just a two line change, but will
resend in it's entirety.

Signed-off-by: Glenn Griffin <ggriffin.kernel@gmail.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-04 15:18:21 +09:00
Eric Dumazet
11baab7ac3 [TCP]: lower stack usage in cookie_hash() function
400 bytes allocated on stack might be a litle bit too much. Using a
per_cpu var is more friendly.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-04 15:18:21 +09:00
Pavel Emelyanov
988b705077 [ARP]: Introduce the arp_hdr_len helper.
There are some place, that calculate the ARP header length. These
calculations are correct, but 
 a) some operate with "magic" constants,
 b) enlarge the code length (sometimes at the cost of coding style),
 c) are not informative from the first glance.

The proposal is to introduce a helper, that includes all the good
sides of these calculations.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 12:20:57 -08:00
Dave Young
8e8440f535 [BLUETOOTH]: l2cap info_timer delete fix in hci_conn_del
When the l2cap info_timer is active the info_state will be set to
L2CAP_INFO_FEAT_MASK_REQ_SENT, and it will be unset after the timer is
deleted or timeout triggered.

Here in l2cap_conn_del only call del_timer_sync when the info_state is
set to L2CAP_INFO_FEAT_MASK_REQ_SENT.

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 12:18:55 -08:00
Frank Blaschka
7e36763b2c [NET]: Fix race in generic address resolution.
neigh_update sends skb from neigh->arp_queue while neigh_timer_handler
has increased skbs refcount and calls solicit with the
skb. neigh_timer_handler should not increase skbs refcount but make a
copy of the skb and do solicit with the copy.

Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 12:16:04 -08:00
Heiko Carstens
c3d84a4dd2 iucv: fix build error on !SMP
Since a5fbb6d106
"KVM: fix !SMP build error" smp_call_function isn't a define anymore
that folds into nothing but a define that calls up_smp_call_function
with all parameters. Hence we cannot #ifdef out the unused code
anymore...
This seems to be the preferred method, so do this for s390 as well.

net/iucv/iucv.c: In function 'iucv_cleanup_queue':
net/iucv/iucv.c:657: error: '__iucv_cleanup_queue' undeclared

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 12:12:33 -08:00
Ilpo Järvinen
d152a7d88a [TCP]: Must count fack_count also when skipping
It makes fackets_out to grow too slowly compared with the
real write queue.

This shouldn't cause those BUG_TRAP(packets <= tp->packets_out)
to trigger but how knows how such inconsistent fackets_out
affects here and there around TCP when everything is nowadays
assuming accurate fackets_out. So lets see if this silences
them all.

Reported by Guillaume Chazarain <guichaz@gmail.com>.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 12:10:16 -08:00
Alexey Dobriyan
8ed7edce82 ipv6: fix inet6_init/icmpv6_cleanup sections mismatch
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 12:02:54 -08:00
Denis V. Lunev
7cd04fa7e3 [TCP]: Merge exit paths in tcp_v4_conn_request.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 11:59:32 -08:00
Denis V. Lunev
f0fd56ed40 [SCTP]: seq_printf format warning. (fixed)
sctp_association->hbinterval is unsigned long. Replace %8d with %8lu.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 11:55:54 -08:00
Denis V. Lunev
da7ef338a2 [IPV4]: skb->dst can't be NULL in ip_options_echo.
ip_options_echo is called on the packet input path after the initial
routing. The dst entry on the packet is cleared only in the several
very specific places and immidiately assigned back (may be new).

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 11:50:10 -08:00
Denis V. Lunev
1d1c8d13c4 [ICMP]: Section conflict between icmp_sk_init/icmp_sk_exit.
Functions from __exit section should not be called from ones in __init
section. Fix this conflict.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 14:15:19 -08:00
David S. Miller
4a80f27889 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6.26 2008-02-29 13:41:25 -08:00
Johannes Berg
e486182907 mac80211: fix key replacing, hw accel
Even though I thought about it a lot and had also tested it, some
of my recent changes in the key code broke replacing keys, making
the kernel oops because a key is removed from a list while not on
it.

This patch fixes that using the list as an indication whether or
not the key is on it (an empty list means it's not on any list.)

Also, this patch fixes hw accel enabling, the check for not doing
hw accel when the interface is down was lost and is restored by
this.

Additionally, move adding the key to the list into the function
__ieee80211_key_replace() for more consistency.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:42:07 -05:00
Johannes Berg
db4d1169d0 mac80211: split ieee80211_key_alloc/free
In order to RCU-ify sta_info, we need to be able to allocate
a key without linking it to an sdata/sta structure (because
allocation cannot be done in an rcu critical section). This
patch splits up ieee80211_key_alloc() and updates all users
appropriately.

While at it, this patch fixes a number of race conditions
such as finally making key replacement atomic, unfortunately
at the expense of more complex code.

Note that this patch documents /existing/ bugs with sta info
and key interaction, there is currently a race condition
when a sta info is freed without holding the RTNL. This will
finally be fixed by a followup patch.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:42:04 -05:00
Johannes Berg
6f48422a29 mac80211: remove STA infos last_ack stuff
These things aren't used and the only possible use is within
rate control algorithms, however those can, if they need it,
keep track of it in their private data. last_ack_ms isn't
even updated so completely useless.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:42:02 -05:00
Johannes Berg
e6a5ddf208 mac80211: safely free beacon in ieee80211_if_reinit
If ieee80211_if_reinit() is called from ieee80211_unregister_hw()
then it is possible that the driver will still request a beacon
(it is allowed to until ieee80211_unregister_hw() has returned.)
This means we need to use an RCU-protected write to the beacon
information even in this function.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:42:00 -05:00
Johannes Berg
fba4a1e637 mac80211: fix IBSS code
This patch fixes two errors introduced by

    commit 19d35612f3cd7f60dd9174c0100584e21f5a1025
    Author: Bruno Randolf <bruno@thinktube.com>
    Date:   Mon Feb 18 11:21:36 2008 +0900

        mac80211: enable IBSS merging

The first error is an endianness problem that sparse found and
the second is a build failure when CONFIG_MAC80211_IBSS_DEBUG
is not set.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Bruno Randolf <bruno@thinktube.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:41:41 -05:00
Johannes Berg
f3af89d1aa mac80211: fix debugfs_sta print_mac() warning
When print_mac() was marked as __pure to avoid emitting a function
call in pr_debug() scenarios, a warning in this code surfaced since
it relies on the fact that the buffer is modified and doesn't use
the return value. This patch makes it use the return value instead.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reported-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:41:38 -05:00
Johannes Berg
665e8aafb4 mac80211: Disallow concurrent IBSS/STA mode interfaces
Disallow having more than one IBSS interface up at any time
because of beacon distribution issues, and for now also disallow
having more than one IBSS/STA interface up at the same time
because we use the master interface's BSS struct which would
be completely corrupted when we have more than one up.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:41:36 -05:00
Johannes Berg
43ba7e958f mac80211: atomically check whether STA exists already
When a STA structure is added, it is often checked whether it
already exists before adding it. This, however, isn't done
atomically so there is a race condition that could lead to two
STA structures being added with the same MAC address. This
patch changes sta_info_add() to return an ERR_PTR in case
of failure and adds the failure mode -EEXIST when the STA
already exists.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:41:34 -05:00
Johannes Berg
d46e144b65 mac80211: rework TX filtered frame code
This reworks the code for TX filtered frames, splitting it out to
a new function to handle those cases, making the clear instruction
a flag and renaming a few things to be easier to understand and
less Atheros hardware specific. Finally, it also makes the comments
explain more.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:41:32 -05:00
Pavel Roskin
d97cf01576 mac80211: fix incorrect use of CONFIG_MAC80211_IBSS_DEBUG
Configuration variables are only available to the preprocessor

Signed-off-by: Pavel Roskin <proski@gnu.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:27 -05:00
Johannes Berg
004c872e78 mac80211: consolidate TIM handling code
This consolidates all TIM handling code to avoid re-introducing
errors with the bitmap/set_tim order and to reduce code. While
reading the code I noticed a possible problem so I also added
a comment about that.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:26 -05:00
Johannes Berg
836341a704 mac80211: remove sta TIM flag, fix expiry TIM handling
The TIM flag that is kept in each station's info is completely
useless, there's no code (aside from the debugfs display code)
checking it, hence it can be removed. While doing that, I noticed
that the TIM handling is broken when buffered frames expire, so
fix that.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:26 -05:00
Johannes Berg
d2259243a1 mac80211: invoke set_tim() callback after setting own TIM info
Drivers should be allowed to simply get a complete new beacon when
set_tim() is invoked (and set_tim() is required for drivers that
just want a beacon template!), so we need to update our own TIM
bitmap before calling set_tim() so that getting the beacon will
now get an already updated beacon.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:26 -05:00
Tomas Winkler
b46b4ee034 wireless: update US regulatory domain
This patch adds channels to US regulatory domain

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:25 -05:00
Johannes Berg
4a9a66e9a8 mac80211: convert sta_info.pspoll to a flag
This doesn't really need to be a full int variable since it's
just a flag to indicate a PS-poll is in progress.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:13 -05:00
Bruno Randolf
9d9bf77d16 mac80211: enable IBSS merging
enable IBSS cell merging. if an IBSS beacon with the same channel, same ESSID
and a TSF higher than the local TSF (mactime) is received, we have to join its
BSSID. while this might not be immediately apparent from reading the 802.11
standard it is compliant and necessary to make IBSS mode functional in many
cases. most drivers have a similar behaviour.

* move the relevant code section (previously only containing debug code) down
to the end of the function, so we can reuse the bss structure.

* we have to compare the mactime (TSF at the time of packet receive) rather
than the current TSF. since mactime is defined as the time the first data
symbol arrived we add the time until byte 24 where the timestamp resides, since
this is how the beacon timestamp is defined. as some some drivers are not able
to give a reliable mactime we fall back to use the current TSF, which will be
enough to catch most (but not all) cases where an IBSS merge is necessary.

* in IBSS mode we want to allow beacons to override probe response info so we
can correctly do merges.

* we don't only configure beacons based on scan results, so change that
message.

* to enable this we have to let all beacons thru in IBSS mode, even if they
have a different BSSID.

Signed-off-by: Bruno Randolf <bruno@thinktube.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:12 -05:00
Bruno Randolf
a607268a0d mac80211: move function ieee80211_sta_join_ibss()
this moves ieee80211_sta_join_ibss() up for the next patch (ibss merge).

Signed-off-by: Bruno Randolf <bruno@thinktube.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:12 -05:00
S.Çağlar Onur
ab46623ec1 net/mac80211/: Use time_* macros
The functions time_before, time_before_eq, time_after, and time_after_eq are more robust for comparing jiffies against other values.

So following patch implements usage of the time_after() macro, defined at linux/jiffies.h, which deals with wrapping correctly

Cc: linux-wireless@vger.kernel.org
Signed-off-by: S.Çağlar Onur <caglar@pardus.org.tr>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:10 -05:00
Johannes Berg
ac2bf3242e mac80211: fix ecw2cw brain-damage
This brain-damaged code just bothers me, fix it.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:10 -05:00
Johannes Berg
3330d7be70 mac80211: give burst time in txop rather than 0.1msec units
This changes mac80211 to pass the burst time to conf_tx in txop
units rather than 0.1msec units. 0.1msec units are only required
by atheros hardware (according to current driver support), all
other drivers do other calculations or require the txop value.
Therefore, it results in fewer calculations and more precision
if we just pass the txop value through to the driver.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:07 -05:00
Johannes Berg
96d510566e mac80211: defer master netdev allocation to ieee80211_register_hw
When we want to go multiqueue, we will need to know the number of
queues the hardware has for registering the master netdev. This
number is only available in ieee80211_register_hw() rather than
ieee80211_alloc_hw(), so defer allocation of the master device to
ieee80211_register_hw().

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:06 -05:00
Ron Rindjunsky
a9af2013ca mac80211: adjustable number of bits for qdisc pool
This fix allows to control the number of bits that qdiscs book keeping
can be done for with respect to the qdisc pool

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:05 -05:00
Michael Wu
3d30d949cf mac80211: Add cooked monitor mode support
This adds "cooked" monitor mode to mac80211. A monitor interface
in "cooked" mode will see all frames that mac80211 has not used
internally.

Signed-off-by: Michael Wu <flamingice@sourmilk.net>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:03 -05:00
Johannes Berg
8944b79fe9 mac80211: move some code into ieee80211_invoke_rx_handlers
There is some duplicated code that sits in front of each function
call to ieee80211_invoke_rx_handlers() that can very well be part
of that function if it gets slightly different arguments.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:03 -05:00
Johannes Berg
589052904a mac80211: remove "dynamic" RX/TX handlers
It doesn't really make sense to have extra pointers to the RX/TX
handler arrays instead of just using the arrays directly, that
also allows us to make them static.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:03 -05:00
Johannes Berg
2c9745e568 mac80211: clean up some things in the RX path
Uninline ieee80211_invoke_rx_handlers to save .text space,
make the code more readable in some places and remove the
"optimisation" that is hit only very few times and unclear
to start with.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:03 -05:00
Michael Wu
8cc9a73914 mac80211: Use monitor configuration flags
Take advantage of the monitor configuration flags now provided by cfg80211.

Signed-off-by: Michael Wu <flamingice@sourmilk.net>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:02 -05:00
Michael Wu
66f7ac50ed nl80211: Add monitor interface configuration flags
This allows precise control over what a monitor interface shows.

Signed-off-by: Michael Wu <flamingice@sourmilk.net>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:02 -05:00
Johannes Berg
e4c26add88 mac80211: split RX_DROP
Some instances of RX_DROP mean that the frame was useless,
others mean that the frame should be visible in userspace
on "cooked" monitor interfaces. This patch splits up RX_DROP
and changes each instance appropriately.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:02 -05:00
Johannes Berg
9ae54c8463 mac80211: split ieee80211_txrx_result
The _DROP result will need to be split in the RX path but not
in the TX path, so for preparation split up the type into two
types, one for RX and one for TX. Also make sure (via sparse)
that they cannot be confused.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:01 -05:00
Ivo van Doorn
406f2388cc wireless: Fix WARN_ON() with ieee802.11b
When the driver registers a IEEE80211_BAND_2GHZ band,
it can either be 802.11b or 802.11g. But when 802.11b rates
are registered "want" will be 3 (since 4 rates are being registered,
and each of those 4 rates will decrease "want").
Since this is a correct situation, there is no need to trigger
a WARN_ON() for this.

Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:19:33 -05:00
Johannes Berg
aac09fbf82 wireless: fix ERP rate flags
In the rate API patch I accidentally reverted the test for
ERP rates, this fixes it. All rates except 1, 2, 5.5 and 11
MBit are ERP rates, not those.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:19:33 -05:00
Stefano Brivio
b7c50de92e rc80211-pid: fix rate adjustment
Merge rate_control_pid_shift_adjust() to rate_control_pid_adjust_rate()
in order to make the learning algorithm aware of constraints on rates. Also
add some comments and rename variables.

This fixes a bug which prevented 802.11b/g non-AP STAs from working with
802.11b only AP STAs.

Signed-off-by: Stefano Brivio <stefano.brivio@polimi.it>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:19:33 -05:00
Johannes Berg
238814fd9a mac80211: remove port control enable switch, clean up sta flags
This patch removes the 802.1X port acess control enable flag
since it is not required. Instead, set the authorized flag for
each station that we normally communicate with (WDS peers, IBSS
peers and APs we're associated to) and require hostapd to set
the authorized flag for all stations when port control is not
enabled.

Also, since I was working in that area, this documents station
flags and removes the unused "permanent" one.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:19:33 -05:00
Johannes Berg
69d464d593 mac80211: fix scan band off-by-one error
When checking for the next band to advance to, there
was an off-by-one error that could lead to an access
to an invalid array index. Additionally, the later
check for scan_band >= IEEE80211_NUM_BANDS is not
required since that will never be true.

This also improves the comments related to that code.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:19:32 -05:00
Johannes Berg
ee688b000d nl80211: export hardware bitrate/channel capabilities
This makes nl80211 export the hardware bitrate/channel capabilities
as registered in a wiphy.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:19:32 -05:00
Johannes Berg
8318d78a44 cfg80211 API for channels/bitrates, mac80211 and driver conversion
This patch creates new cfg80211 wiphy API for channel and bitrate
registration and converts mac80211 and drivers to the new API. The
old mac80211 API is completely ripped out. All drivers (except ath5k)
are updated to the new API, in many cases I expect that optimisations
can be done.

Along with the regulatory code I've also ripped out the
IEEE80211_HW_DEFAULT_REG_DOMAIN_CONFIGURED flag, I believe it to be
unnecessary if the hardware simply gives us whatever channels it wants
to support and we then enable/disable them as required, which is pretty
much required for travelling.

Additionally, the patch adds proper "basic" rate handling for STA
mode interface, AP mode interface will have to have new API added
to allow userspace to set the basic rate set, currently it'll be
empty... However, the basic rate handling will need to be moved to
the BSS conf stuff.

I do expect there to be bugs in this, especially wrt. transmit
power handling where I'm basically clueless about how it should work.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:19:32 -05:00
Johannes Berg
38f3714d66 mac80211: dissolve pre-rx handlers
These handlers do not really return a status and the compiler
can do a much better job when they're simply static functions
that it can inline if appropriate. Also makes the code shorter.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:19:21 -05:00
Ron Rindjunsky
d92684e660 mac80211: A-MPDU Tx add delBA from recipient support
This patch adds the ability to handle delBA from recipient to initiator
during an A-MPDU session

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:19:18 -05:00
Ron Rindjunsky
eb2ba62ee5 mac80211: A-MPDU add debugfs support
This patch adds A-MPDU status report per STA to the debugfs.
The option to de/activate A-MPDU through debugfs is also present.

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:19:17 -05:00
Ron Rindjunsky
fe3bf0f59e mac80211: A-MPDU Tx MLME data initialization
This patch initialize A-MPDU MLME data for Tx sessions.

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:19:17 -05:00
Ron Rindjunsky
9e72349237 mac80211: A-MPDU Tx adding qdisc support
This patch allows qdisc support in A-MPDU Tx. a method to
handle QoS <-> TID switches is present in this patch.

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:19:17 -05:00
Ron Rindjunsky
eadc8d9e90 mac80211: A-MPDU Tx adding basic functionality
This patch adds the following abilities to mac80211:
 - start A-MPDU Tx session
 - stop A-MPDU Tx session
 - call backs to start/stop A-MPDU Tx session
 - sending addBA request
 - processing addBA response

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:19:14 -05:00
Ron Rindjunsky
80656c2031 mac80211: A-MPDU Tx add MLME structures
This patch adds the needed structures to describe the Tx aggregation MLME
per STA
new:
 - struct tid_ampdu_tx: TID aggregation information (Tx)
changed:
 - struct sta_ampdu_mlme: Tx aggregation information per TID and
			  dialog token creator were added
 - struct sta_info: tid_to_tx_q added for tid<->tx queue mapping

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:19:13 -05:00
Ron Rindjunsky
0df3ef45a3 mac80211: A-MPDU Tx add session's and low level driver's API
This patch adds the API for 3 stages in A-MPDU Tx session flow:
- request mac80211 to start/stop A-MPDU Tx session for specific TID. such a
  request should be issued by a load aware element, either mac80211 itself
  or external element.
- requests by mac80211 to low-level driver to start/stop Tx aggregation.
  notice that low level driver responds now with Starting Sequence Number.
- async feedback by low-level to mac80211 to inform that HW is ready for
  next A-MPDU Tx state.
Changes in API to Rx A-MPDU were also made, reflected in iwlwifi changes as
well.

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:19:13 -05:00
Johannes Berg
7d185b8bb1 mac80211: allow sending multicast frames through virtual ports
When reworking the port access control code, I forgot multicast frames
and those are now always rejected because the destination station is
not known. This changes the code to allow through multicast frames and
also avoid the sta hash lookup (which is bound to fail) for them.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:19:09 -05:00
Tomas Winkler
b220525696 mac80211: set assoc flag to bss_conf
Only BSS_CHANGED_ASSOC was set in the 'changed' bitmask. Assignment  to
bss_conf.assoc was absent.
This patch assign value to  bss_conf.assoc according the association state.

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:19:08 -05:00
Pavel Emelyanov
95a363582b [NET]: Use existing device list walker for /proc/dev_mcast.
The seq_file_operations' dev_mc_seq_xxx callbacks do the same thing as
the dev_seq_xxx ones do, but skip the SEQ_START_TOKEN.

So use the existing exported dev_seq_xxx calls and handle the
SEQ_START_TOKEN in the dev_mc_seq_show().

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:44:14 -08:00
Denis V. Lunev
fd80eb942a [INET]: Remove struct dst_entry *dst from request_sock_ops.rtx_syn_ack.
It looks like dst parameter is used in this API due to historical
reasons.  Actually, it is really used in the direct call to
tcp_v4_send_synack only.  So, create a wrapper for tcp_v4_send_synack
and remove dst from rtx_syn_ack.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:43:03 -08:00
Neil Horman
58fbbed4fb [SCTP]: extend exported data in /proc/net/sctp/assoc
RFC 3873 specifies several MIB objects that can't be obtained by the
current data set exported by /proc/sys/net/sctp/assoc.  This patch
adds the missing pieces of data that allow us to compute all the
objects in the sctpAssocTable object.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:40:56 -08:00
Pavel Emelyanov
665bba1087 [NETFILTER/RXRPC]: Don't use seq_release_private where inappropriate.
Some netfilter code and rxrpc one use seq_open() to open
a proc file, but seq_release_private to release one.

This is harmless, but ambiguous.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:39:17 -08:00
Pavel Emelyanov
c20932d2c9 [ATALK/DECNET]: Use seq_open_private in appletalk and decnet.
These two also perform manual seq_open_private, so patch them both at
once. But unlike ATM code, these already use the seq_release_private,
so I splitted this patch from the previous one.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:38:24 -08:00
Pavel Emelyanov
9a8c09e73b [ATM]: Use seq_open/release_privade instead of manual manipulations.
lec_seq_open/lec_seq_release and __vcc_seq_open/vcc_seq_release
do seq_open/release_private's job.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:37:02 -08:00
David S. Miller
45af1754bc [NET]: sk_release_kernel needs to be exported to modules
Fixes:

ERROR: "sk_release_kernel" [net/ipv6/ipv6.ko] undefined!

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:33:19 -08:00
Pavel Emelyanov
459eea7410 [SCTP]: Use proc_create to setup de->proc_fops.
In addition to commit 160f17 ("[SCTP]: Use proc_create() to setup
->proc_fops first") use proc_create in two more places.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:24:45 -08:00
Denis V. Lunev
98c6d1b261 [NETNS]: Make icmpv6_sk per namespace.
All preparations are done. Now just add a hook to perform an
initialization on namespace startup and replace icmpv6_sk macro with
proper inline call.  Actual namespace the packet belongs too will be
passed later along with the one for the routing.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:21:22 -08:00
Denis V. Lunev
4a6ad7a141 [NETNS]: Make icmp_sk per namespace.
All preparations are done. Now just add a hook to perform an
initialization on namespace startup and replace icmp_sk macro with
proper inline call.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:19:58 -08:00
Denis V. Lunev
5c8cafd65e [NETNS]: icmp(v6)_sk should not pin a namespace.
So, change icmp(v6)_sk creation/disposal to the scheme used in the
netlink for rtnl, i.e. create a socket in the context of the init_net
and assign the namespace without getting a referrence later.

Also use sk_release_kernel instead of sock_release to properly destroy
such sockets.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:19:22 -08:00
Denis V. Lunev
edf0208702 [NET]: Make netlink_kernel_release publically available as sk_release_kernel.
This staff will be needed for non-netlink kernel sockets, which should
also not pin a namespace like tcp_socket and icmp_socket.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:18:32 -08:00
Denis V. Lunev
9dfbec1fb2 [NETLINK]: No need for a separate __netlink_release call.
Merge it to netlink_kernel_release.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:17:56 -08:00
Denis V. Lunev
79c9115953 [ICMP]: Allocate data for __icmp(v6)_sk dynamically.
Own __icmp(v6)_sk should be present in each namespace. So, it should be
allocated dynamically. Though, alloc_percpu does not fit the case as it
implies additional dereferrence for no bonus.

Allocate data for pointers just like __percpu_alloc_mask does and place
pointers to struct sock into this array.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:17:11 -08:00
Denis V. Lunev
405666db84 [ICMP]: Pass proper ICMP socket into icmp(v6)_xmit_(un)lock.
We have to get socket lock inside icmp(v6)_xmit_lock/unlock. The socket
is get from global variable now. When this code became namespaces, one
should pass a namespace and get socket from it.

Though, above is useless. Socket is available in the caller, just pass
it inside. This saves a bit of code now and saves more later.

add/remove: 0/0 grow/shrink: 1/3 up/down: 1/-169 (-168)
function                                     old     new   delta
icmp_rcv                                     718     719      +1
icmpv6_rcv                                  2343    2303     -40
icmp_send                                   1566    1518     -48
icmp_reply                                   549     468     -81

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:16:46 -08:00
Denis V. Lunev
b7e729c4b4 [ICMP]: Store sock rather than socket for ICMP flow control.
Basically, there is no difference, what to store: socket or sock. Though,
sock looks better as there will be 1 less dereferrence on the fast path.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:16:08 -08:00
Denis V. Lunev
1e3cf6834e [ICMP]: Optimize icmp_socket usage.
Use this macro only once in a function to save a bit of space.

add/remove: 0/0 grow/shrink: 0/3 up/down: 0/-98 (-98)
function                                     old     new   delta
icmp_reply                                   562     561      -1
icmp_push_reply                              305     258     -47
icmp_init                                    273     223     -50

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:15:42 -08:00
Denis V. Lunev
a5710d6582 [ICMP]: Add return code to icmp_init.
icmp_init could fail and this is normal for namespace other than initial.
So, the panic should be triggered only on init_net initialization path.

Additionally create rollback path for icmp_init as a separate function.
It will also be used later during namespace destruction.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:14:50 -08:00
Denis V. Lunev
9b0f976f27 [INET]: Remove struct net_proto_family* from _init calls.
struct net_proto_family* is not used in icmp[v6]_init, ndisc_init,
igmp_init and tcp_v4_init. Remove it.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:13:15 -08:00
Wang Chen
5e47879f49 [IRDA]: Use proc_create() to setup ->proc_fops first
Use proc_create() to make sure that ->proc_fops be setup before gluing
PDE to main tree.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 10:34:45 -08:00
Trond Myklebust
cdd0972945 Merge branch 'cleanups' into next 2008-02-28 23:48:05 -08:00
Trond Myklebust
5e4424af9a SUNRPC: Remove now-redundant RCU-safe rpc_task free path
Now that we've tightened up the locking rules for RPC queue wakeups, we can
remove the RCU-safe kfree calls...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-02-28 23:26:28 -08:00
Trond Myklebust
ff2d7db848 SUNRPC: Ensure that we read all available tcp data
Don't stop until we run out of data, or we hit an error.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-02-28 23:26:27 -08:00
Trond Myklebust
f5fb7b06e4 SUNRPC: Eliminate the now-redundant rpc_start_wakeup()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-02-28 23:26:25 -08:00
Trond Myklebust
eb276c0e10 SUNRPC: Switch tasks to using the rpc_waitqueue's timer function
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-02-28 23:26:21 -08:00
Trond Myklebust
36df9aae31 SUNRPC: Add a timer function to wait queues.
This is designed to replace the timeout timer in the individual rpc_tasks.
By putting the timer function in the wait queue, we will eventually be able
to reduce the total number of timers in use by the RPC subsystem.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-02-28 23:21:59 -08:00
Trond Myklebust
f6a1cc8930 SUNRPC: Add a (empty for the moment) destructor for rpc_wait_queues
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-02-28 23:17:27 -08:00
Sangtae Ha
0bc8c7bf9e [TCP]: BIC web page link is corrected.
Signed-off-by: Sangtae Ha <sha2@ncsu.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 22:14:32 -08:00
Timo Teras
4c563f7669 [XFRM]: Speed up xfrm_policy and xfrm_state walking
Change xfrm_policy and xfrm_state walking algorithm from O(n^2) to O(n).
This is achieved adding the entries to one more list which is used
solely for walking the entries.

This also fixes some races where the dump can have duplicate or missing
entries when the SPD/SADB is modified during an ongoing dump.

Dumping SADB with 20000 entries using "time ip xfrm state" the sys
time dropped from 1.012s to 0.080s.

Signed-off-by: Timo Teras <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 21:31:08 -08:00
Adrian Bunk
1e04d53070 [IPV6]: Unexport ip6_find_1stfragopt
This patch removes the no longer used 
EXPORT_SYMBOL_GPL(ip6_find_1stfragopt).

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 21:27:35 -08:00
Juha-Matti Tapio
99cd07a537 [IPV6]: Fix source address selection for ORCHID addresses
Skip the prefix length matching in source address selection for
orchid -> non-orchid addresses.

Overlay Routable Cryptographic Hash IDentifiers (RFC 4843,
2001:10::/28) are currenty not globally reachable. Without this
check a host with an ORCHID address can end up preferring those over
regular addresses when talking to other regular hosts in the 2001::/16
range thus breaking non-orchid connections.

Signed-off-by: Juha-Matti Tapio <jmtapio@verkkotelakka.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 20:55:46 -08:00
Juha-Matti Tapio
5fe47b8a65 [IPV6]: Add ORCHID prefix to address label table
Add a new label for Overlay Routable Cryptographic Hash Identifiers
(RFC 4843) prefix 2001:10::/28 to help proper source address
selection.

ORCHID addresses are used by for example Host Identity Protocol. They are 
global and routable, but they currently need support from both endpoints 
and therefore mixing regular and ORCHID addresses for source and 
destination is a bad idea in general case.

Signed-off-by: Juha-Matti Tapio <jmtapio@verkkotelakka.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 20:55:02 -08:00
Denis V. Lunev
c4544c7243 [NETNS]: Process inet_select_addr inside a namespace.
The context is available from a network device passed in.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 20:52:54 -08:00
Denis V. Lunev
3776c8891a [NETNS]: Enable IPv4 address manipulations inside namespace.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 20:52:25 -08:00
Denis V. Lunev
1937504dd1 [NETNS]: Enable all routing manipulation via netlink inside namespace.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 20:52:04 -08:00
Denis V. Lunev
e5b13cb10d [NETNS]: Process devinet ioctl in the correct namespace.
Add namespace parameter to devinet_ioctl and locate device inside it for
state changes.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 20:51:43 -08:00
Denis V. Lunev
73b3871165 [NETNS]: Register /proc/net/rt_cache for each namespace.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 20:51:18 -08:00
Denis V. Lunev
a75e936f2f [NETNS]: Process /proc/net/rt_cache inside a namespace.
Show routing cache for a particular namespace only.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 20:50:55 -08:00
Denis V. Lunev
642d631811 [IPV4]: rt_cache_get_next should take rt_genid into account.
In the other case /proc/net/rt_cache will look inconsistent in respect to
genid.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 20:50:33 -08:00
Denis V. Lunev
317805b8f8 [NETNS]: Process ip_rt_redirect in the correct namespace.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 20:50:06 -08:00
Denis V. Lunev
9de8f76d20 [NETNS]: DST cleanup routines should be called inside namespace.
Device inside the namespace can be started and downed. So, active routing
cache should be cleaned up on device stop.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 20:49:44 -08:00
Denis V. Lunev
be162d6288 [NETNS]: Enable inetdev_event notifier.
After all these preparations it is time to enable main IPv4 device
initialization routine inside namespace. It is safe do this now.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 20:49:13 -08:00
Denis V. Lunev
2430aa85de [NETNS]: Disable multicaststing configuration inside non-initial namespace.
Do not calls hooks from device notifiers and disallow configuration from
ioctl/netlink layer.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 20:48:49 -08:00
Denis V. Lunev
0c65babd6c [NETNS]: Default arp parameters lookup.
Default ARP parameters should be findable regardless of the context.
Required to make inetdev_event working.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 20:48:25 -08:00
Denis V. Lunev
4ab438fcd7 [NETNS]: Register neighbour table parameters in the correct namespace.
neigh_sysctl_register should register sysctl entries inside correct namespace
to avoid naming conflict. Typical example is a loopback. Entries for it
present in all namespaces.

Required to make inetdev_event working.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 20:48:01 -08:00
Denis V. Lunev
6133fb1aa1 [NETNS]: Disable inetaddr notifiers in namespaces other than initial.
ip_fib_init is kept enabled. It is already namespace-aware.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 20:46:17 -08:00
Denis V. Lunev
6fc68624e5 [NETFILTER]: Consolidate masq_inet_event and masq_device_event.
They do exactly the same job.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 20:45:41 -08:00
Denis V. Lunev
5811769c78 [IPV4]: Remove check for ifa->ifa_dev != NULL.
This is a callback registered to inet address notifier chain.
The check is useless as:
- ifa->ifa_dev is always != NULL
- similar checks are abscent in all other notifiers.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 20:45:00 -08:00
Wang Chen
2335f8ec27 [X25]: Use proc_create() to setup ->proc_fops first
Use proc_create() to make sure that ->proc_fops be setup before gluing
PDE to main tree.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 14:16:33 -08:00
Wang Chen
6d37a15816 [WANROUTER]: Use proc_create() to setup ->proc_fops first
Use proc_create() to make sure that ->proc_fops be setup before gluing
PDE to main tree.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 14:15:56 -08:00
Wang Chen
d45fba3625 [8021Q]: Use proc_create() to setup ->proc_fops first
Use proc_create() to make sure that ->proc_fops be setup before gluing
PDE to main tree.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 14:14:58 -08:00
Wang Chen
770207208e [IPV4]: Use proc_create() to setup ->proc_fops first
Use proc_create() to make sure that ->proc_fops be setup before gluing
PDE to main tree.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 14:14:25 -08:00
Wang Chen
4436f4cbfa [IPV6]: Use proc_create() to setup ->proc_fops first
Use proc_create() to make sure that ->proc_fops be setup before gluing
PDE to main tree.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 14:13:46 -08:00
Wang Chen
160f17e345 [SCTP]: Use proc_create() to setup ->proc_fops first
Use proc_create() to make sure that ->proc_fops be setup before gluing
PDE to main tree.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 14:13:16 -08:00
Wang Chen
25296d599c [PKTGEN]: Use proc_create() to setup ->proc_fops first
Use proc_create() to make sure that ->proc_fops be setup before gluing
PDE to main tree.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 14:11:49 -08:00
Wang Chen
46ecf0b994 [NEIGHBOUR]: Use proc_create() to setup ->proc_fops first
Use proc_create() to make sure that ->proc_fops be setup before gluing
PDE to main tree.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 14:10:51 -08:00
Wang Chen
7e02180998 [LLC]: Use proc_create() to setup ->proc_fops first
Use proc_create() to make sure that ->proc_fops be setup before gluing
PDE to main tree.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 14:08:54 -08:00
Wang Chen
1e15dc981d [IPX]: Use proc_create() to setup ->proc_fops first
Use proc_create() to make sure that ->proc_fops be setup before gluing
PDE to main tree.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 14:06:14 -08:00
Wang Chen
2ce8f047d5 [SUNRPC]: Use proc_create() to setup ->proc_fops first
Use proc_create() to make sure that ->proc_fops be setup before gluing
PDE to main tree.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 14:00:59 -08:00
David S. Miller
64758bd792 Merge branch 'pending' of master.kernel.org:/pub/scm/linux/kernel/git/vxy/lksctp-dev 2008-02-28 13:56:37 -08:00
Wang Chen
16e297b358 [ATM]: Use proc_create() to setup ->proc_fops first
Use proc_create() to make sure that ->proc_fops be setup before gluing
PDE to main tree.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 13:55:45 -08:00
Vlad Yasevich
7e8616d8e7 [SCTP]: Update AUTH structures to match declarations in draft-16.
The new SCTP socket api (draft 16) updates the AUTH API structures.
We never exported these since we knew they would change.
Update the rest to match the draft.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2008-02-28 16:45:04 -05:00
Vlad Yasevich
b40db68468 [SCTP]: Incorrect length was used in SCTP_*_AUTH_CHUNKS socket option
The chunks are stored inside a parameter structure in the kernel
and when we copy them to the user, we need to account for
the parameter header.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2008-02-28 16:45:01 -05:00
Neil Horman
15efbe7639 [SCTP]: Clean up naming conventions of sctp protocol/address family registration
I noticed while looking into some odd behavior in sctp, that the variable
name sctp_pf_inet6_specific was used twice to represent two different
pieces of data (its both a structure name and a pointer to that type of
structure), which is confusing to say the least, and potentially dangerous
depending on the variable scope.  This patch cleans that up, and makes the
protocol and address family registration names in SCTP more regular,
increasing readability.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>

 ipv6.c     |   12 ++++++------
 protocol.c |   12 ++++++------
 2 files changed, 12 insertions(+), 12 deletions(-)
2008-02-28 16:41:05 -05:00
Wang Chen
ed2b5b474e [APPLETALK]: Use proc_create() to setup ->proc_fops first
As Davem mentioned in his recently patch
(d9595a7b9c)
that the procfs visibility should occur after
the ->proc_fops are setup.

And also, Alexey provide proc_create() to make
sure that ->proc_fops is setup before gluing PDE
to main tree.

We use proc_create().

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 12:53:32 -08:00
Herbert Xu
21e43188f2 [IPCOMP]: Disable BH on output when using shared tfm
Because we use shared tfm objects in order to conserve memory,
(each tfm requires 128K of vmalloc memory), BH needs to be turned
off on output as that can occur in process context.

Previously this was done implicitly by the xfrm output code.
That was lost when it became lockless.  So we need to add the
BH disabling to IPComp directly.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 11:23:17 -08:00
David S. Miller
60717f7e76 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6 2008-02-28 11:03:29 -08:00
Johannes Berg
03147dfc8a mac80211: fix kmalloc vs. net_ratelimit
The "goto end;" part definitely must not be rate limited.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-28 09:13:10 -05:00
Vlad Yasevich
b90a137d30 [SCTP]: Correctly set the length of sctp_assoc_change notification
sctp_assoc_change notification may contain the data from a received
ABORT chunk.  Set the length correctly to account for that.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2008-02-27 16:40:55 -05:00
Jan Engelhardt
6556874dc3 [NETFILTER]: xt_conntrack: fix IPv4 address comparison
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-27 12:20:41 -08:00
Jan Engelhardt
d61f89e941 [NETFILTER]: xt_conntrack: fix missing boolean clamping
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-27 12:09:05 -08:00
Patrick McHardy
4e29e9ec7e [NETFILTER]: nf_conntrack: fix smp_processor_id() in preemptible code warning
Since we're using RCU for the conntrack hash now, we need to avoid
getting preempted or interrupted by BHs while changing the stats.

Fixes warning reported by Tilman Schmidt <tilman@imap.cc> when using
preemptible RCU:

[   48.180297] BUG: using smp_processor_id() in preemptible [00000000] code: ntpdate/3562
[   48.180297] caller is __nf_conntrack_find+0x9b/0xeb [nf_conntrack]
[   48.180297] Pid: 3562, comm: ntpdate Not tainted 2.6.25-rc2-mm1-testing #1
[   48.180297]  [<c02015b9>] debug_smp_processor_id+0x99/0xb0
[   48.180297]  [<fac643a7>] __nf_conntrack_find+0x9b/0xeb [nf_conntrack]

Tested-by: Tilman Schmidt <tilman@imap.cc>
Tested-by: Christian Casteyde <casteyde.christian@free.fr> [Bugzilla #10097]

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-27 12:07:47 -08:00
YOSHIFUJI Hideaki
3bdfe7ec08 [IPV6] SYSCTL: Fix possible memory leakage in error path.
In error path, we do need to free memory just allocated.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-27 12:06:38 -08:00
Pavel Emelyanov
b37d428b24 [INET]: Don't create tunnels with '%' in name.
Four tunnel drivers (ip_gre, ipip, ip6_tunnel and sit) can receive a
pre-defined name for a device from the userspace.  Since these drivers
call the register_netdevice() (rtnl_lock, is held), which does _not_
generate the device's name, this name may contain a '%' character.

Not sure how bad is this to have a device with a '%' in its name, but
all the other places either use the register_netdev(), which call the
dev_alloc_name(), or explicitly call the dev_alloc_name() before
registering, i.e. do not allow for such names.

This had to be prior to the commit 34cc7b, but I forgot to number the
patches and this one got lost, sorry.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-26 23:51:04 -08:00
David S. Miller
d9595a7b9c [AF_KEY]: Fix oops by converting to proc_net_*().
To make sure the procfs visibility occurs after the ->proc_fs ops are
setup, use proc_net_fops_create() and proc_net_remove().

This also fixes an OOPS after module unload in that the name string
for remove was wrong, so it wouldn't actually be removed.  That bug
was introduced by commit 61145aa1a1
("[KEY]: Clean up proc files creation a bit.")

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-26 22:23:31 -08:00
Bjorn Mork
148f97292e [IPV4]: Reset scope when changing address
This bug did bite at least one user, who did have to resort to rebooting
the system after an "ifconfig eth0 127.0.0.1" typo.

Deleting the address and adding a new is a less intrusive workaround.
But I still beleive this is a bug that should be fixed.  Some way or
another.

Another possibility would be to remove the scope mangling based on
address.  This will always be incomplete (are 127/8 the only address
space with host scope requirements?)

We set the scope to RT_SCOPE_HOST if an IPv4 interface is configured
with a loopback address (127/8).  The scope is never reset, and will
remain set to RT_SCOPE_HOST after changing the address. This patch
resets the scope if the address is changed again, to restore normal
functionality.

Signed-off-by: Bjorn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-26 18:42:41 -08:00
Benjamin Thery
f1243c2db6 [IPV6]: Add missing initializations of the new nl_info.nl_net field
Add some more missing initializations of the new nl_info.nl_net field
in IPv6 stack. This field will be used when network namespaces are
fully supported.

Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-26 18:42:37 -08:00
Thomas Gleixner
3ab2273175 bluetooth: delete timer in l2cap_conn_del()
Delete a possibly armed timer before kfree'ing the connection object.

Solves: http://lkml.org/lkml/2008/2/15/514

Reported-by:Quel Qun <kelk1@comcast.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-26 17:42:56 -08:00
Trond Myklebust
5d00837b90 SUNRPC: Run rpc timeout functions as callbacks instead of in softirqs
An audit of the current RPC timeout functions shows that they don't really
ever need to run in the softirq context. As long as the softirq is
able to signal that the wakeup is due to a timeout (which it can do by
setting task->tk_status to -ETIMEDOUT) then the callback functions can just
run as standard task->tk_callback functions (in the rpciod/process
context).

The only possible border-line case would be xprt_timer() for the case of
UDP, when the callback is used to reduce the size of the transport
congestion window. In testing, however, the effect of moving that update
to a callback would appear to be minor.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-02-25 21:40:44 -08:00
Trond Myklebust
fda1393938 SUNRPC: Convert users of rpc_wake_up_task to use rpc_wake_up_queued_task
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-02-25 21:40:42 -08:00
Trond Myklebust
96ef13b283 SUNRPC: Add a new helper rpc_wake_up_queued_task()
In all cases where we currently use rpc_wake_up_task(), we almost always
know on which waitqueue the rpc_task is actually sleeping. This will allows
us to simplify the queue locking in a future patch.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-02-25 21:40:40 -08:00
Trond Myklebust
fde95c7554 SUNRPC: Clean up rpc_run_timer()
All RPC timeout callback functions are expected to wake the task up. We can
enforce this by moving the wakeup back into rpc_run_timer.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-02-25 21:40:39 -08:00
Trond Myklebust
32bfb5c0f4 SUNRPC: Allow the rpc_release() callback to be run on another workqueue
A lot of the work done by the rpc_release() callback is inappropriate for
rpciod as it will often involve things like starting a new rpc call in
order to clean up state after an interrupted NFSv4 open() call, or
calls to mntput(), etc.

This patch allows the caller of rpc_run_task() to specify that the
rpc_release callback should run on a different workqueue than the default
rpciod_workqueue.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-02-25 21:40:34 -08:00
Harvey Harrison
5f2f40a92e tipc: fix integer as NULL pointer sparse warnings in tipc
net/tipc/cluster.c:145:2: warning: Using plain integer as NULL pointer
net/tipc/link.c:3254:36: warning: Using plain integer as NULL pointer
net/tipc/ref.c:151:15: warning: Using plain integer as NULL pointer
net/tipc/zone.c:85:2: warning: Using plain integer as NULL pointer

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-24 18:38:31 -08:00
Linus Torvalds
bdc0894289 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (37 commits)
  [NETFILTER]: fix ebtable targets return
  [IP_TUNNEL]: Don't limit the number of tunnels with generic name explicitly.
  [NET]: Restore sanity wrt. print_mac().
  [NEIGH]: Fix race between neighbor lookup and table's hash_rnd update.
  [RTNL]: Validate hardware and broadcast address attribute for RTM_NEWLINK
  tg3: ethtool phys_id default
  [BNX2]: Update version to 1.7.4.
  [BNX2]: Disable parallel detect on an HP blade.
  [BNX2]: More 5706S link down workaround.
  ssb: Fix support for PCI devices behind a SSB->PCI bridge
  zd1211rw: fix sparse warnings
  rtl818x: fix sparse warnings
  ssb: Fix pcicore cardbus mode
  ssb: Make the GPIO API reentrancy safe
  ssb: Fix the GPIO API
  ssb: Fix watchdog access for devices without a chipcommon
  ssb: Fix serial console on new bcm47xx devices
  ath5k: Fix build warnings on some 64-bit platforms.
  WDEV, ath5k, don't return int from bool function
  WDEV: ath5k, fix lock imbalance
  ...
2008-02-23 21:07:10 -08:00
Joonwoo Park
1b04ab4597 [NETFILTER]: fix ebtable targets return
The function ebt_do_table doesn't take NF_DROP as a verdict from the targets.

Signed-off-by: Joonwoo Park <joonwpark81@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-23 20:22:27 -08:00
Pavel Emelyanov
34cc7ba639 [IP_TUNNEL]: Don't limit the number of tunnels with generic name explicitly.
Use the added dev_alloc_name() call to create tunnel device name,
rather than iterate in a hand-made loop with an artificial limit.

Thanks Patrick for noticing this.

[ The way this works is, when the device is actually registered,
  the generic code noticed the '%' in the name and invokes
  dev_alloc_name() to fully resolve the name.  -DaveM ]

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-23 20:19:20 -08:00
David S. Miller
55b01e8681 [NET]: Restore sanity wrt. print_mac().
MAC_FMT had only one user and we tried to get rid of
that, but this created more problems than it solved.

As a result, this reverts three commits:

235365f3aa ("net/8021q/vlan_dev.c: Use
print_mac."), fea5fa875e ("[NET]: Remove
MAC_FMT"), and 8f789c4844 ("[NET]:
Elminate spurious print_mac() calls.")

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-23 20:09:11 -08:00
Pavel Emelyanov
bc4bf5f38c [NEIGH]: Fix race between neighbor lookup and table's hash_rnd update.
The neigh_hash_grow() may update the tbl->hash_rnd value, which 
is used in all tbl->hash callbacks to calculate the hashval.

Two lookup routines may race with this, since they call the 
->hash callback without the tbl->lock held. Since the hash_rnd
is changed with this lock write-locked moving the calls to ->hash
under this lock read-locked closes this gap.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-23 19:57:02 -08:00
Thomas Graf
1840bb13c2 [RTNL]: Validate hardware and broadcast address attribute for RTM_NEWLINK
RTM_NEWLINK allows for already existing links to be modified. For this
purpose do_setlink() is called which expects address attributes with a
payload length of at least dev->addr_len. This patch adds the necessary
validation for the RTM_NEWLINK case.

The address length for links to be created is not checked for now as the
actual attribute length is used when copying the address to the netdevice
structure. It might make sense to report an error if less than addr_len
bytes are provided but enforcing this might break drivers trying to be
smart with not transmitting all zero addresses.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-23 19:54:36 -08:00
Rafael J. Wysocki
3a2d5b7001 PM: Introduce PM_EVENT_HIBERNATE callback state
During the last step of hibernation in the "platform" mode (with the
help of ACPI) we use the suspend code, including the devices'
->suspend() methods, to prepare the system for entering the ACPI S4
system sleep state.

But at least for some devices the operations performed by the
->suspend() callback in that case must be different from its operations
during regular suspend.

For this reason, introduce the new PM event type PM_EVENT_HIBERNATE and
pass it to the device drivers' ->suspend() methods during the last phase
of hibernation, so that they can distinguish this case and handle it as
appropriate.  Modify the drivers that handle PM_EVENT_SUSPEND in a
special way and need to handle PM_EVENT_HIBERNATE in the same way.

These changes are necessary to fix a hibernation regression related
to the i915 driver (ref. http://lkml.org/lkml/2008/2/22/488).

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Tested-by: Jeff Chua <jeff.chua.linux@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-23 10:40:04 -08:00
Pavel Emelyanov
5216a8e70e Wrap buffers used for rpc debug printks into RPC_IFDEBUG
Sorry for the noise, but here's the v3 of this compilation fix :)

There are some places, which declare the char buf[...] on the stack
to push it later into dprintk(). Since the dprintk sometimes (if the
CONFIG_SYSCTL=n) becomes an empty do { } while (0) stub, these buffers
cause gcc to produce appropriate warnings.

Wrap these buffers with RPC_IFDEBUG macro, as Trond proposed, to
compile them out when not needed.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-02-21 18:42:29 -05:00
Denis V. Lunev
da12f7356d [NETNS]: Namespace leak in pneigh_lookup.
release_net is missed on the error path in pneigh_lookup.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-20 00:26:16 -08:00
Pavel Emelyanov
5f31886ff0 [SCTP]: Pick up an orphaned sctp_sockets_allocated counter.
This counter is currently write-only.

Drawing an analogy with the similar tcp counter, I think
that this one should be pointed by the sockets_allocated
members of sctp_prot and sctpv6_prot.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-20 00:23:01 -08:00
Jan Engelhardt
27ecb1ff0a [NETFILTER]: xt_iprange: fix subtraction-based comparison
The host address parts need to be converted to host-endian first
before arithmetic makes any sense on them.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-19 17:20:06 -08:00
Jan Engelhardt
7d9904c260 [NETFILTER]: xt_hashlimit: remove unneeded struct member
By allocating ->hinfo, we already have the needed indirection to cope
with the per-cpu xtables struct match_entry.

[Patrick: do this now before the revision 1 struct is used by userspace]

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-19 17:19:44 -08:00
Joonwoo Park
eb1197bc0e [NETFILTER]: Fix incorrect use of skb_make_writable
http://bugzilla.kernel.org/show_bug.cgi?id=9920
The function skb_make_writable returns true or false.

Signed-off-by: Joonwoo Park <joonwpark81@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-19 17:18:47 -08:00
Pavel Emelyanov
f449b3b54d [NETFILTER]: xt_u32: drop the actually unused variable from u32_match_it
The int ret variable is used only to trigger the BUG_ON() after
the skb_copy_bits() call, so check the call failure directly
and drop the variable.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-19 17:18:20 -08:00
Patrick McHardy
e2b58a67b9 [NETFILTER]: {ip,ip6,nfnetlink}_queue: fix SKB_LINEAR_ASSERT when mangling packet data
As reported by Tomas Simonaitis <tomas.simonaitis@gmail.com>,
inserting new data in skbs queued over {ip,ip6,nfnetlink}_queue
triggers a SKB_LINEAR_ASSERT in skb_put().

Going back through the git history, it seems this bug is present since
at least 2.6.12-rc2, probably even since the removal of
skb_linearize() for netfilter.

Linearize non-linear skbs through skb_copy_expand() when enlarging
them.  Tested by Thomas, fixes bugzilla #9933.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-19 17:17:52 -08:00
Adrian Bunk
94cb1503c7 ipv4/fib_hash.c: fix NULL dereference
Unless I miss a guaranteed relation between between "f" and
"new_fa->fa_info" this patch is required for fixing a NULL dereference
introduced by commit a6501e080c ("[IPV4]
FIB_HASH: Reduce memory needs and speedup lookups") and spotted by the
Coverity checker.

Eric Dumazet says:

	Hum, you are right, kmem_cache_free() doesnt allow a NULL
	object, like kfree() does.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-19 16:28:54 -08:00
Adrian Bunk
15e29b8b05 net/9p/trans_virtio.c: kmalloc() enough memory
The Coverity checker spotted that less memory than required was
allocated.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-19 16:25:30 -08:00
Thomas Graf
76e87306c2 [RTNL]: Add missing link netlink attribute policy definitions
IFLA_LINK is no longer a write-only attribute on the kernel side and
must thus be validated. Same goes for the newly introduced
IFLA_LINKINFO.

Fixes undefined behaviour if either of the attributes are not well
formed.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-19 16:12:08 -08:00
Jorge Boncompte [DTI2]
12aa343add [NET]: Messed multicast lists after dev_mc_sync/unsync
Commit a0a400d79e ("[NET]: dev_mcast:
add multicast list synchronization helpers") from you introduced a new
field "da_synced" to struct dev_addr_list that is not properly
initialized to 0. So when any of the current users (8021q, macvlan,
mac80211) calls dev_mc_sync/unsync they mess the address list for both
devices.

The attached patch fixed it for me and avoid future problems.

Signed-off-by: Jorge Boncompte [DTI2] <jorge@dti2.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-19 14:17:04 -08:00
Linus Torvalds
07ce198a1e Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (60 commits)
  [NIU]: Bump driver version and release date.
  [NIU]: Fix BMAC alternate MAC address indexing.
  net: fix kernel-doc warnings in header files
  [IPV6]: Use BUG_ON instead of if + BUG in fib6_del_route.
  [IPV6]: dst_entry leak in ip4ip6_err. (resend)
  bluetooth: do not move child device other than rfcomm
  bluetooth: put hci dev after del conn
  [NET]: Elminate spurious print_mac() calls.
  [BLUETOOTH] hci_sysfs.c: Kill build warning.
  [NET]: Remove MAC_FMT
  net/8021q/vlan_dev.c: Use print_mac.
  [XFRM]: Fix ordering issue in xfrm_dst_hash_transfer().
  [BLUETOOTH] net/bluetooth/hci_core.c: Use time_* macros
  [IPV6]: Fix hardcoded removing of old module code
  [NETLABEL]: Move some initialization code into __init section.
  [NETLABEL]: Shrink the genl-ops registration code.
  [AX25] ax25_out: check skb for NULL in ax25_kick()
  [TCP]: Fix tcp_v4_send_synack() comment
  [IPV4]: fix alignment of IP-Config output
  Documentation: fix tcp.txt
  ...
2008-02-19 07:52:45 -08:00
Pavel Emelyanov
2df96af03d [IPV6]: Use BUG_ON instead of if + BUG in fib6_del_route.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-18 20:50:42 -08:00
Denis V. Lunev
9937ded8e4 [IPV6]: dst_entry leak in ip4ip6_err. (resend)
The result of the ip_route_output is not assigned to skb. This means that
- it is leaked
- possible OOPS below dereferrencing skb->dst
- no ICMP message for this case

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-18 20:49:36 -08:00
Dave Young
8ac62dc773 bluetooth: do not move child device other than rfcomm
hci conn child devices other than rfcomm tty should not be moved here.
This is my lost, thanks for Barnaby's reporting and testing.

Signed-off-by: Dave Young <hidave.darkstar@gmail.com> 
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-18 20:45:41 -08:00
Dave Young
0cd63c8089 bluetooth: put hci dev after del conn
Move hci_dev_put to del_conn to avoid hci dev going away before hci conn.

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-18 20:44:01 -08:00
David S. Miller
988d0093f9 [BLUETOOTH] hci_sysfs.c: Kill build warning.
net/bluetooth/hci_sysfs.c: In function ‘del_conn’:
net/bluetooth/hci_sysfs.c:339: warning: suggest parentheses around assignment used as truth value

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-18 00:20:50 -08:00
Joe Perches
235365f3aa net/8021q/vlan_dev.c: Use print_mac.
Remove direct use of MAC_FMT

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-17 23:34:54 -08:00
YOSHIFUJI Hideaki
b791160b5a [XFRM]: Fix ordering issue in xfrm_dst_hash_transfer().
Keep ordering of policy entries with same selector in
xfrm_dst_hash_transfer().

Issue should not appear in usual cases because multiple policy entries
with same selector are basically not allowed so far.  Bug was pointed
out by Sebastien Decugis <sdecugis@hongo.wide.ad.jp>.

We could convert bydst from hlist to list and use list_add_tail()
instead.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Acked-by: Sebastien Decugis <sdecugis@hongo.wide.ad.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-17 23:29:30 -08:00
S.Çağlar Onur
82453021b8 [BLUETOOTH] net/bluetooth/hci_core.c: Use time_* macros
The functions time_before, time_before_eq, time_after, and
time_after_eq are more robust for comparing jiffies against other
values.

So following patch implements usage of the time_after() macro, defined
at linux/jiffies.h, which deals with wrapping correctly

Signed-off-by: S.Çağlar Onur <caglar@pardus.org.tr>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-17 23:25:57 -08:00
Wang Chen
5ee46e562c [IPV6]: Fix hardcoded removing of old module code
Rusty hardcoded the old module code.
We can remove it now.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-17 22:34:53 -08:00
Pavel Emelyanov
05705e4e11 [NETLABEL]: Move some initialization code into __init section.
Everything that is called from netlbl_init() can be marked with
__init. This moves 620 bytes from .text section to .text.init one.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-17 22:33:57 -08:00
Pavel Emelyanov
227c43c3bc [NETLABEL]: Shrink the genl-ops registration code.
Turning them to array and registration in a loop saves
80 lines of code and ~300 bytes from text section.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-17 22:33:16 -08:00
Jarek Poplawski
f47b7257c7 [AX25] ax25_out: check skb for NULL in ax25_kick()
According to some OOPS reports ax25_kick tries to clone NULL skbs
sometimes. It looks like a race with ax25_clear_queues(). Probably
there is no need to add more than a simple check for this yet.
Another report suggested there are probably also cases where ax25
->paclen == 0 can happen in ax25_output(); this wasn't confirmed
during testing but let's leave this debugging check for some time.

Reported-and-tested-by: Jann Traschewski <jann@gmx.de>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-17 22:31:19 -08:00
Kris Katterjohn
9bf1d83e7e [TCP]: Fix tcp_v4_send_synack() comment
Signed-off-by: Kris Katterjohn <katterjohn@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-17 22:29:19 -08:00
Uwe Kleine-Koenig
9c00409a2a [IPV4]: fix alignment of IP-Config output
Make the indented lines aligned in the output (not in the code).

Signed-off-by: Uwe Kleine-Koenig <Uwe.Kleine-Koenig@digi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-17 22:28:32 -08:00
Julia Lawall
d6584f3a08 net/9p/trans_virtio.c: Use BUG_ON
if (...) BUG(); should be replaced with BUG_ON(...) when the test has no
side-effects to allow a definition of BUG_ON that drops the code completely.

The semantic patch that makes this change is as follows:
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@ disable unlikely @ expression E,f; @@

(
  if (<... f(...) ...>) { BUG(); }
|
- if (unlikely(E)) { BUG(); }
+ BUG_ON(E);
)

@@ expression E,f; @@

(
  if (<... f(...) ...>) { BUG(); }
|
- if (E) { BUG(); }
+ BUG_ON(E);
)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-17 18:42:53 -08:00
Julia Lawall
163e3cb7da net/rxrpc: Use BUG_ON
if (...) BUG(); should be replaced with BUG_ON(...) when the test has no
side-effects to allow a definition of BUG_ON that drops the code completely.

The semantic patch that makes this change is as follows:
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@ disable unlikely @ expression E,f; @@

(
  if (<... f(...) ...>) { BUG(); }
|
- if (unlikely(E)) { BUG(); }
+ BUG_ON(E);
)

@@ expression E,f; @@

(
  if (<... f(...) ...>) { BUG(); }
|
- if (E) { BUG(); }
+ BUG_ON(E);
)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-17 18:42:03 -08:00
David S. Miller
9ff5660746 Revert "[NDISC]: Fix race in generic address resolution"
This reverts commit 69cc64d8d9.

It causes recursive locking in IPV6 because unlike other
neighbour layer clients, it even needs neighbour cache
entries to send neighbour soliciation messages :-(

We'll have to find another way to fix this race.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-17 18:39:54 -08:00
David S. Miller
93b2d4a208 Revert "[RTNETLINK]: Send a single notification on device state changes."
This reverts commit 45b5035482.

It break locking around dev->link_mode as well as cause
other bootup problems.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-17 18:35:07 -08:00
David S. Miller
42fe95cae5 Merge branch 'fixes' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6 2008-02-15 15:56:47 -08:00
Michael Buesch
ceffefd15a mac80211: Fix initial hardware configuration
On the initial device-open we need to defer the hardware reconfiguration
after we incremented the open_count, because the hw_config checks this flag
and won't call the lowlevel driver in case it is zero.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-15 13:44:19 -05:00
Linus Torvalds
f6866fecd6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (82 commits)
  [NET]: Make sure sockets implement splice_read
  netconsole: avoid null pointer dereference at show_local_mac()
  [IPV6]: Fix reversed local_df test in ip6_fragment
  [XFRM]: Avoid bogus BUG() when throwing new policy away.
  [AF_KEY]: Fix bug in spdadd
  [NETFILTER] nf_conntrack_proto_tcp.c: Mistyped state corrected.
  net: xfrm statistics depend on INET
  [NETFILTER]: make secmark_tg_destroy() static
  [INET]: Unexport inet_listen_wlock
  [INET]: Unexport __inet_hash_connect
  [NET]: Improve cache line coherency of ingress qdisc
  [NET]: Fix race in dev_close(). (Bug 9750)
  [IPSEC]: Fix bogus usage of u64 on input sequence number
  [RTNETLINK]: Send a single notification on device state changes.
  [NETLABLE]: Hide netlbl_unlabel_audit_addr6 under ifdef CONFIG_IPV6.
  [NETLABEL]: Don't produce unused variables when IPv6 is off.
  [NETLABEL]: Compilation for CONFIG_AUDIT=n case.
  [GENETLINK]: Relax dances with genl_lock.
  [NETLABEL]: Fix lookup logic of netlbl_domhsh_search_def.
  [IPV6]: remove unused method declaration (net/ndisc.h).
  ...
2008-02-15 07:33:07 -08:00
Rémi Denis-Courmont
997b37da15 [NET]: Make sure sockets implement splice_read
Fixes a segmentation fault when trying to splice from a non-TCP socket.

Signed-off-by: Rémi Denis-Courmont <rdenis@simphalempin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-15 02:35:45 -08:00
Herbert Xu
b5c15fc004 [IPV6]: Fix reversed local_df test in ip6_fragment
I managed to reverse the local_df test when forward-porting this
patch so it actually makes things worse by never fragmenting at
all.

Thanks to David Stevens for testing and reporting this bug.

Bill Fink pointed out that the local_df setting is also the wrong
way around.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-14 23:49:37 -08:00
Jan Blunck
1d957f9bf8 Introduce path_put()
* Add path_put() functions for releasing a reference to the dentry and
  vfsmount of a struct path in the right order

* Switch from path_release(nd) to path_put(&nd->path)

* Rename dput_path() to path_put_conditional()

[akpm@linux-foundation.org: fix cifs]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: <linux-fsdevel@vger.kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Steven French <sfrench@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-14 21:13:33 -08:00
Jan Blunck
4ac9137858 Embed a struct path into struct nameidata instead of nd->{dentry,mnt}
This is the central patch of a cleanup series. In most cases there is no good
reason why someone would want to use a dentry for itself. This series reflects
that fact and embeds a struct path into nameidata.

Together with the other patches of this series
- it enforced the correct order of getting/releasing the reference count on
  <dentry,vfsmount> pairs
- it prepares the VFS for stacking support since it is essential to have a
  struct path in every place where the stack can be traversed
- it reduces the overall code size:

without patch series:
   text    data     bss     dec     hex filename
5321639  858418  715768 6895825  6938d1 vmlinux

with patch series:
   text    data     bss     dec     hex filename
5320026  858418  715768 6894212  693284 vmlinux

This patch:

Switch from nd->{dentry,mnt} to nd->path.{dentry,mnt} everywhere.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix cifs]
[akpm@linux-foundation.org: fix smack]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-14 21:13:33 -08:00
YOSHIFUJI Hideaki
073a371987 [XFRM]: Avoid bogus BUG() when throwing new policy away.
From: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

When we destory a new policy entry, we need to tell
xfrm_policy_destroy() explicitly that the entry is not
alive yet.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-14 14:52:38 -08:00
Kazunori MIYAZAWA
a4d6b8af1e [AF_KEY]: Fix bug in spdadd
This patch fix a BUG when adding spds which have same selector.

Signed-off-by: Kazunori MIYAZAWA <kazunori@miyazawa.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-14 14:51:38 -08:00
Jozsef Kadlecsik
d0c1fd7a8f [NETFILTER] nf_conntrack_proto_tcp.c: Mistyped state corrected.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-14 14:50:21 -08:00
Paul Mundt
0f4bda005f net: xfrm statistics depend on INET
net/built-in.o: In function `xfrm_policy_init':
/home/pmundt/devel/git/sh-2.6.25/net/xfrm/xfrm_policy.c:2338: undefined reference to `snmp_mib_init'

snmp_mib_init() is only built in if CONFIG_INET is set.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-14 14:48:45 -08:00
Trond Myklebust
cbc2005925 SUNRPC: Declare as const the rpc_message arguments to rpc_call_sync/async
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-02-14 11:17:24 -05:00
Adrian Bunk
f51f5ec690 [NETFILTER]: make secmark_tg_destroy() static
This patch makes the needlessly global secmark_tg_destroy() static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-13 17:41:39 -08:00
Adrian Bunk
324b57619b [INET]: Unexport inet_listen_wlock
This patch removes the no longer used EXPORT_SYMBOL(inet_listen_wlock).

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-13 17:40:25 -08:00
Adrian Bunk
74da4d34e4 [INET]: Unexport __inet_hash_connect
This patch removes the unused EXPORT_SYMBOL_GPL(__inet_hash_connect).

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-13 17:39:34 -08:00
Randy Dunlap
bc2cda1ebd docbook: make a networking book and fix a few errors
Move networking (core and drivers) docbook to its own networking book.
Fix a few kernel-doc errors in header and source files.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Neil Brown <neilb@suse.de>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-13 16:21:19 -08:00
Randy Dunlap
65b6e42cdc docbook: sunrpc filenames and notation fixes
Use updated file list for docbook files and
fix kernel-doc warnings in sunrpc:
Warning(linux-2.6.24-git12//net/sunrpc/rpc_pipe.c:689): No description found for parameter 'rpc_client'
Warning(linux-2.6.24-git12//net/sunrpc/rpc_pipe.c:765): No description found for parameter 'flags'
Warning(linux-2.6.24-git12//net/sunrpc/clnt.c:584): No description found for parameter 'tk_ops'
Warning(linux-2.6.24-git12//net/sunrpc/clnt.c:618): No description found for parameter 'bufsize'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Neil Brown <neilb@suse.de>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-13 16:21:19 -08:00
Harvey Harrison
b5606c2d44 remove final fastcall users
fastcall always expands to empty, remove it.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-13 16:21:18 -08:00
Matti Linnanvuori
d8b2a4d21e [NET]: Fix race in dev_close(). (Bug 9750)
There is a race in Linux kernel file net/core/dev.c, function dev_close.
The function calls function dev_deactivate, which calls function
dev_watchdog_down that deletes the watchdog timer. However, after that, a
driver can call netif_carrier_ok, which calls function
__netdev_watchdog_up that can add the watchdog timer again. Function
unregister_netdevice calls function dev_shutdown that traps the bug
!timer_pending(&dev->watchdog_timer). Moving dev_deactivate after
netif_running() has been cleared prevents function netif_carrier_on
from calling __netdev_watchdog_up and adding the watchdog timer again.

Signed-off-by: Matti Linnanvuori <mattilinnanvuori@yahoo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-12 23:11:16 -08:00
Herbert Xu
b318e0e4ef [IPSEC]: Fix bogus usage of u64 on input sequence number
Al Viro spotted a bogus use of u64 on the input sequence number which
is big-endian.  This patch fixes it by giving the input sequence number
its own member in the xfrm_skb_cb structure.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-12 22:50:35 -08:00
Laszlo Attila Toth
45b5035482 [RTNETLINK]: Send a single notification on device state changes.
In do_setlink() a single notification is sent at the end of the
function if any modification occured. If the address has been changed,
another notification is sent.

Both of them is required because originally only the NETDEV_CHANGEADDR
notification was sent and although device state change implies address
change, some programs may expect the original notification. It remains
for compatibity.

If set_operstate() is called from do_setlink(), it doesn't send a
notification, only if it is called from rtnl_create_link() as earlier.

Signed-off-by: Laszlo Attila Toth <panther@balabit.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-12 22:42:09 -08:00
Pavel Emelyanov
370125f0a4 [NETLABLE]: Hide netlbl_unlabel_audit_addr6 under ifdef CONFIG_IPV6.
This one is called from under this config only, so move
it in the same place.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-12 22:38:06 -08:00
Pavel Emelyanov
56628b1d89 [NETLABEL]: Don't produce unused variables when IPv6 is off.
Some code declares variables on the stack, but uses them
under #ifdef CONFIG_IPV6, so thay become unused when ipv6
is off. Fortunately, they are used in a switch's case
branches, so the fix is rather simple.

Is it OK from coding style POV to add braces inside "cases",
or should I better avoid such style and rework the patch?

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-12 22:37:19 -08:00
Pavel Emelyanov
94de7feb2d [NETLABEL]: Compilation for CONFIG_AUDIT=n case.
The audit_log_start() will expand into an empty do { } while (0)
construction and the audit_ctx becomes unused.

The solution: push current->audit_context into audit_log_start()
directly, since it is not required in any other place in the 
calling function.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-12 22:35:37 -08:00
Pavel Emelyanov
910d6c320c [GENETLINK]: Relax dances with genl_lock.
The genl_unregister_family() calls the genl_unregister_mc_groups(), 
which takes and releases the genl_lock and then locks and releases
this lock itself.

Relax this behavior, all the more so the genl_unregister_mc_groups() 
is called from genl_unregister_family() only.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-12 22:16:33 -08:00
Pavel Emelyanov
4c3a0a254e [NETLABEL]: Fix lookup logic of netlbl_domhsh_search_def.
Currently, if the call to netlbl_domhsh_search succeeds the
return result will still be NULL.

Fix that, by returning the found entry (if any).

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-12 22:15:14 -08:00
Urs Thuermann
fee54fa517 [NET]: Fix comment for skb_pull_rcsum
Fix comment for skb_pull_rcsum

Signed-off-by: Urs Thuermann <urs@isnogud.escape.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-12 22:03:25 -08:00
Herbert Xu
28a89453b1 [IPV6]: Fix IPsec datagram fragmentation
This is a long-standing bug in the IPsec IPv6 code that breaks
when we emit a IPsec tunnel-mode datagram packet.  The problem
is that the code the emits the packet assumes the IPv6 stack
will fragment it later, but the IPv6 stack assumes that whoever
is emitting the packet is going to pre-fragment the packet.

In the long term we need to fix both sides, e.g., to get the
datagram code to pre-fragment as well as to get the IPv6 stack
to fragment locally generated tunnel-mode packet.

For now this patch does the second part which should make it
work for the IPsec host case.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-12 18:07:27 -08:00
David S. Miller
69cc64d8d9 [NDISC]: Fix race in generic address resolution
Frank Blaschka provided the bug report and the initial suggested fix
for this bug.  He also validated this version of this fix.

The problem is that the access to neigh->arp_queue is inconsistent, we
grab references when dropping the lock lock to call
neigh->ops->solicit() but this does not prevent other threads of
control from trying to send out that packet at the same time causing
corruptions because both code paths believe they have exclusive access
to the skb.

The best option seems to be to hold the write lock on neigh->lock
during the ->solicit() call.  I looked at all of the ndisc_ops
implementations and this seems workable.  The only case that needs
special care is the IPV4 ARP implementation of arp_solicit().  It
wants to take neigh->lock as a reader to protect the header entry in
neigh->ha during the emission of the soliciation.  We can simply
remove the read lock calls to take care of that since holding the lock
as a writer at the caller providers a superset of the protection
afforded by the existing read locking.

The rest of the ->solicit() implementations don't care whether the
neigh is locked or not.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-12 17:54:17 -08:00
Jarek Poplawski
e848b583e0 [AX25] ax25_ds_timer: use mod_timer instead of add_timer
This patch changes current use of: init_timer(), add_timer()
and del_timer() to setup_timer() with mod_timer(), which
should be safer anyway.

Reported-by: Jann Traschewski <jann@gmx.de>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-12 17:53:34 -08:00
Jarek Poplawski
21fab4a86a [AX25] ax25_timer: use mod_timer instead of add_timer
According to one of Jann's OOPS reports it looks like
BUG_ON(timer_pending(timer)) triggers during add_timer()
in ax25_start_t1timer(). This patch changes current use
of: init_timer(), add_timer() and del_timer() to
setup_timer() with mod_timer(), which should be safer
anyway.

Reported-by: Jann Traschewski <jann@gmx.de>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-12 17:53:33 -08:00
Jarek Poplawski
4de211f1a2 [AX25] ax25_route: make ax25_route_lock BH safe
> =================================
> [ INFO: inconsistent lock state ]
> 2.6.24-dg8ngn-p02 #1
> ---------------------------------
> inconsistent {softirq-on-W} -> {in-softirq-R} usage.
> linuxnet/3046 [HC0[0]:SC1[2]:HE1:SE0] takes:
>  (ax25_route_lock){--.+}, at: [<f8a0cfb7>] ax25_get_route+0x18/0xb7 [ax25]
> {softirq-on-W} state was registered at:
...

This lockdep report shows that ax25_route_lock is taken for reading in
softirq context, and for writing in process context with BHs enabled.
So, to make this safe, all write_locks in ax25_route.c are changed to
_bh versions.

Reported-by: Jann Traschewski <jann@gmx.de>,
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-12 17:53:32 -08:00
Jarek Poplawski
1105b5d1d4 [AX25] af_ax25: remove sock lock in ax25_info_show()
This lockdep warning:

> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.24 #3
> -------------------------------------------------------
> swapper/0 is trying to acquire lock:
>  (ax25_list_lock){-+..}, at: [<f91dd3b1>] ax25_destroy_socket+0x171/0x1f0 [ax25]
>
> but task is already holding lock:
>  (slock-AF_AX25){-+..}, at: [<f91dbabc>] ax25_std_heartbeat_expiry+0x1c/0xe0 [ax25]
>
> which lock already depends on the new lock.
...

shows that ax25_list_lock and slock-AF_AX25 are taken in different
order: ax25_info_show() takes slock (bh_lock_sock(ax25->sk)) while
ax25_list_lock is held, so reversely to other functions. To fix this
the sock lock should be moved to ax25_info_start(), and there would
be still problem with breaking ax25_list_lock (it seems this "proper"
order isn't optimal yet). But, since it's only for reading proc info
it seems this is not necessary (e.g.  ax25_send_to_raw() does similar
reading without this lock too).

So, this patch removes sock lock to avoid deadlock possibility; there
is also used sock_i_ino() function, which reads sk_socket under proper
read lock. Additionally printf format of this i_ino is changed to %lu.

Reported-by: Bernard Pidoux F6BVP <f6bvp@free.fr>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-12 17:53:31 -08:00
Stephen Hemminger
8315f5d80a fib_trie: /proc/net/route performance improvement
Use key/offset caching to change /proc/net/route (use by iputils route)
from O(n^2) to O(n). This improves performance from 30sec with 160,000
routes to 1sec.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-12 17:53:31 -08:00
Stephen Hemminger
ec28cf738d fib_trie: handle empty tree
This fixes possible problems when trie_firstleaf() returns NULL
to trie_leafindex().

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-12 17:53:30 -08:00
David S. Miller
e4f8b5d4ed [IPV4]: Remove IP_TOS setting privilege checks.
Various RFCs have all sorts of things to say about the CS field of the
DSCP value.  In particular they try to make the distinction between
values that should be used by "user applications" and things like
routing daemons.

This seems to have influenced the CAP_NET_ADMIN check which exists for
IP_TOS socket option settings, but in fact it has an off-by-one error
so it wasn't allowing CS5 which is meant for "user applications" as
well.

Further adding to the inconsistency and brokenness here, IPV6 does not
validate the DSCP values specified for the IPV6_TCLASS socket option.

The real actual uses of these TOS values are system specific in the
final analysis, and these RFC recommendations are just that, "a
recommendation".  In fact the standards very purposefully use
"SHOULD" and "SHOULD NOT" when describing how these values can be
used.

In the final analysis the only clean way to provide consistency here
is to remove the CAP_NET_ADMIN check.  The alternatives just don't
work out:

1) If we add the CAP_NET_ADMIN check to ipv6, this can break existing
   setups.

2) If we just fix the off-by-one error in the class comparison in
   IPV4, certain DSCP values can be used in IPV6 but not IPV4 by
   default.  So people will just ask for a sysctl asking to
   override that.

I checked several other freely available kernel trees and they
do not make any privilege checks in this area like we do.  For
the BSD stacks, this goes back all the way to Stevens Volume 2
and beyond.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-12 17:53:29 -08:00
Linus Torvalds
0c0d61ca93 Merge branch 'for-linus' of git://linux-nfs.org/~bfields/linux
* 'for-linus' of git://linux-nfs.org/~bfields/linux:
  SUNPRC: Fix printk format warning
  nfsd: clean up svc_reserve_auth()
  NLM: don't requeue block if it was invalidated while GRANT_MSG was in flight
  NLM: don't reattempt GRANT_MSG when there is already an RPC in flight
  NLM: have server-side RPC clients default to soft RPC tasks
  NLM: set RPC_CLNT_CREATE_NOPING for NLM RPC clients
2008-02-11 09:19:47 -08:00
Roland Dreier
bb50c8012c SUNPRC: Fix printk format warning
net/sunrpc/xprtrdma/svc_rdma_sendto.c:160: warning: format '%llx'
expects type 'long long unsigned int', but argument 3 has type 'u64'

Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-10 18:11:22 -05:00
David S. Miller
30ddb159ff [PKT_SCHED] ematch: Fix build warning.
Commit 954415e33e ("[PKT_SCHED] ematch:
tcf_em_destroy robustness") removed a cast on em->data when
passing it to kfree(), but em->data is an integer type that can
hold pointers as well as other values so the cast is necessary.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-10 03:48:15 -08:00
Jarek Poplawski
21347456ab [NET_SCHED] sch_htb: htb_requeue fix
htb_requeue() enqueues skbs for which htb_classify() returns NULL.
This is wrong because such skbs could be handled by NET_CLS_ACT code,
and the decision could be different than earlier in htb_enqueue().
So htb_requeue() is changed to work and look more like htb_enqueue().

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-09 23:44:00 -08:00
Rami Rosen
238fc7eac8 [IPV6]: Replace using the magic constant "1024" with IP6_RT_PRIO_USER for fc_metric.
This patch replaces the explicit usage of the magic constant "1024"
with IP6_RT_PRIO_USER in the IPV6 tree.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-09 23:43:11 -08:00
Stephen Hemminger
954415e33e [PKT_SCHED] ematch: tcf_em_destroy robustness
Make the code in tcf_em_tree_destroy more robust and cleaner:
 * Don't need to cast pointer to kfree() or avoid passing NULL.
 * After freeing the tree, clear the pointer to avoid possible problems
from repeated free.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-09 23:26:53 -08:00
Stephen Hemminger
ed7af3b350 [PKT_SCHED]: deinline functions in meta match
A couple of functions in meta match don't need to be inline.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-09 23:26:17 -08:00
Pavel Emelyanov
8ff65b4603 [SCTP]: Convert sctp_dbg_objcnt to seq files.
This makes the code use a good proc API and the text ~50 bytes shorter.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-09 23:24:58 -08:00
Pavel Emelyanov
3f5340a67e [SCTP]: Use snmp_fold_field instead of a homebrew analogue.
SCPT already depends in INET, so this doesn't create additional
dependencies.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-09 23:23:44 -08:00
Denis V. Lunev
cd557bc1c1 [IGMP]: Optimize kfree_skb in igmp_rcv.
Merge error paths inside igmp_rcv.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-09 23:22:26 -08:00
Pavel Emelyanov
bd2f747658 [KEY]: Convert net/pfkey to use seq files.
The seq files API disposes the caller of the difficulty of
checking file position, the length of data to produce and
the size of provided buffer.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-09 23:20:06 -08:00
Pavel Emelyanov
61145aa1a1 [KEY]: Clean up proc files creation a bit.
Mainly this removes ifdef-s from inside the ipsec_pfkey_init.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-09 23:19:14 -08:00
Stephen Hemminger
268bcca1e7 [PKT_SCHED] ematch: oops from uninitialized variable (resend)
Setting up a meta match causes a kernel OOPS because of uninitialized
elements in tree.

[   37.322381] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[   37.322381] IP: [<ffffffff883fc717>] :em_meta:em_meta_destroy+0x17/0x80

[   37.322381] Call Trace:
[   37.322381]  [<ffffffff803ec83d>] tcf_em_tree_destroy+0x2d/0xa0
[   37.322381]  [<ffffffff803ecc8c>] tcf_em_tree_validate+0x2dc/0x4a0
[   37.322381]  [<ffffffff803f06d2>] nla_parse+0x92/0xe0
[   37.322381]  [<ffffffff883f9672>] :cls_basic:basic_change+0x202/0x3c0
[   37.322381]  [<ffffffff802a3917>] kmem_cache_alloc+0x67/0xa0
[   37.322381]  [<ffffffff803ea221>] tc_ctl_tfilter+0x3b1/0x580
[   37.322381]  [<ffffffff803dffd0>] rtnetlink_rcv_msg+0x0/0x260
[   37.322381]  [<ffffffff803ee944>] netlink_rcv_skb+0x74/0xa0
[   37.322381]  [<ffffffff803dffc8>] rtnetlink_rcv+0x18/0x20
[   37.322381]  [<ffffffff803ee6c3>] netlink_unicast+0x263/0x290
[   37.322381]  [<ffffffff803cf276>] __alloc_skb+0x96/0x160
[   37.322381]  [<ffffffff803ef014>] netlink_sendmsg+0x274/0x340
[   37.322381]  [<ffffffff803c7c3b>] sock_sendmsg+0x12b/0x140
[   37.322381]  [<ffffffff8024de90>] autoremove_wake_function+0x0/0x30
[   37.322381]  [<ffffffff8024de90>] autoremove_wake_function+0x0/0x30
[   37.322381]  [<ffffffff803c7c3b>] sock_sendmsg+0x12b/0x140
[   37.322381]  [<ffffffff80288611>] zone_statistics+0xb1/0xc0
[   37.322381]  [<ffffffff803c7e5e>] sys_sendmsg+0x20e/0x360
[   37.322381]  [<ffffffff803c7411>] sockfd_lookup_light+0x41/0x80
[   37.322381]  [<ffffffff8028d04b>] handle_mm_fault+0x3eb/0x7f0
[   37.322381]  [<ffffffff8020c2fb>] system_call_after_swapgs+0x7b/0x80

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-09 03:47:19 -08:00
David S. Miller
ab1ecbabb1 Merge branch 'pending' of master.kernel.org:/pub/scm/linux/kernel/git/vxy/lksctp-dev 2008-02-09 03:44:25 -08:00
Linus Torvalds
3668805a54 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits)
  [IPSEC] flow: reorder "struct flow_cache_entry" and remove SLAB_HWCACHE_ALIGN
  [DECNET] ROUTE: remove unecessary alignment
  [IPSEC]: Add support for aes-ctr.
  [ISDN]: fix section mismatch warning in enpci_card_msg
  [TIPC]: declare proto_ops structures as 'const'.
  [TIPC]: Kill unused static inline (x5)
  [TC]: oops in em_meta
  [IPV6] Minor cleanup: remove unused definitions in net/ip6_fib.h
  [IPV6] Minor clenup: remove two unused definitions in net/ip6_route.h
  [AF_IUCV]: defensive programming of iucv_callback_txdone
  [AF_IUCV]: broken send_skb_q results in endless loop
  [IUCV]: wrong irq-disabling locking at module load time
  [CAN]: Minor clean-ups
  [CAN]: Move proto_{,un}register() out of spin-locked region
  [CAN]: Clean up module auto loading
  [IPSEC] flow: Remove an unnecessary ____cacheline_aligned
  [IPV4]: route: fix crash ip_route_input
  [NETFILTER]: xt_iprange: add missing #include
  [NETFILTER]: xt_iprange: fix typo in address family
  [NETFILTER]: nf_conntrack: fix ct_extend ->move operation
  ...
2008-02-08 09:27:06 -08:00
Pavel Emelyanov
cbdc738732 namespaces: mark NET_NS with "depends on NAMESPACES"
There's already an option controlling the net namespaces cloning code, so make
it work the same way as all the other namespaces' options.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: "David S. Miller" <davem@davemloft.net>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Cedric Le Goater <clg@fr.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Kirill Korotaev <dev@sw.ru>
Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:23 -08:00
Eric Dumazet
dd5a1843d5 [IPSEC] flow: reorder "struct flow_cache_entry" and remove SLAB_HWCACHE_ALIGN
1) We can shrink sizeof(struct flow_cache_entry) by 8 bytes on 64bit arches.
2) No need to align these structures to hardware cache lines, this only waste 
   ram for very litle gain.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 23:30:42 -08:00
Eric Dumazet
fca09fb732 [DECNET] ROUTE: remove unecessary alignment
Same alignment requirement was removed on IP route cache in the past.

This alignment actually has bad effect on 32 bit arches, uniprocessor,
since sizeof(dn_rt_hash_bucket) is forced to 8 bytes instead of 4.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 23:29:57 -08:00
Joy Latten
405137d16f [IPSEC]: Add support for aes-ctr.
The below patch allows IPsec to use CTR mode with AES encryption
algorithm. Tested this using setkey in ipsec-tools.

Signed-off-by: Joy Latten <latten@austin.ibm.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 23:11:56 -08:00
Florian Westphal
bca65eae39 [TIPC]: declare proto_ops structures as 'const'.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 18:18:01 -08:00
Ilpo Järvinen
86121fe5b4 [TIPC]: Kill unused static inline (x5)
All these static inlines are unused:

in_own_zone     1 (net/tipc/addr.h)
msg_dataoctet   1 (net/tipc/msg.h)
msg_direct      1 (include/net/tipc/tipc_msg.h)
msg_options     1 (include/net/tipc/tipc_msg.h)
tipc_nmap_get   1 (net/tipc/bcast.h)

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 18:17:13 -08:00
Stephen Hemminger
04f217aca4 [TC]: oops in em_meta
If userspace passes a unknown match index into em_meta, then
em_meta_change will return an error and the data for the match will
not be set. This then causes an null pointer dereference when the
cleanup is done in the error path via tcf_em_tree_destroy. Since the
tree structure comes kzalloc, it is initialized to NULL.

Discovered when testing a new version of tc command against an
accidental older kernel.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 18:13:00 -08:00
Ursula Braun
f2a77991a9 [AF_IUCV]: defensive programming of iucv_callback_txdone
The loop in iucv_callback_txdone presumes existence of an entry
with msg->tag in the send_skb_q list. In error cases this
assumption might be wrong and might cause an endless loop.
Loop is rewritten to guarantee loop end in case of missing
msg->tag entry in send_skb_q.

Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 18:07:44 -08:00
Ursula Braun
d44447229e [AF_IUCV]: broken send_skb_q results in endless loop
A race has been detected in iucv_callback_txdone().
skb_unlink has to be done inside the locked area.

In addition checkings for successful allocations are inserted.

Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 18:07:19 -08:00
Ursula Braun
435bc9dfc6 [IUCV]: wrong irq-disabling locking at module load time
Linux may hang when running af_iucv socket programs concurrently
with a load of module netiucv. iucv_register() tries to take the
iucv_table_lock with spin_lock_irq. This conflicts with
iucv_connect() which has a need for an smp_call_function while
holding the iucv_table_lock.
Solution: use bh-disabling locking in iucv_register()

Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 18:06:52 -08:00
Urs Thuermann
a219994bf5 [CAN]: Minor clean-ups
Remove unneeded variable.
Rename local variable error to err like in all other places.
Some white-space changes.

Signed-off-by: Urs Thuermann <urs.thuermann@volkswagen.de>
Signed-off-by: Oliver Hartkopp <oliver.hartkopp@volkswagen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 18:05:04 -08:00
Urs Thuermann
a2fea5f19f [CAN]: Move proto_{,un}register() out of spin-locked region
The implementation of proto_register() has changed so that it can now
sleep.  The call to proto_register() must be moved out of the
spin-locked region.

Signed-off-by: Urs Thuermann <urs.thuermann@volkswagen.de>
Signed-off-by: Oliver Hartkopp <oliver.hartkopp@volkswagen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 18:04:45 -08:00
Urs Thuermann
5423dd67bd [CAN]: Clean up module auto loading
Remove local char array to construct module name.
Don't call request_module() when CONFIG_KMOD is not set.

Signed-off-by: Urs Thuermann <urs.thuermann@volkswagen.de>
Signed-off-by: Oliver Hartkopp <oliver.hartkopp@volkswagen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 18:04:21 -08:00
Eric Dumazet
5f58a5c872 [IPSEC] flow: Remove an unnecessary ____cacheline_aligned
We use a percpu variable named flow_hash_info, which holds 12 bytes.

It is currently marked as ____cacheline_aligned, which makes linker
skip space to properly align this variable.

Before :
c065cc90 D per_cpu__softnet_data
c065cd00 d per_cpu__flow_tables
<Here, hole of 124 bytes>
c065cd80 d per_cpu__flow_hash_info
<Here, hole of 116 bytes>
c065ce00 d per_cpu__flow_flush_tasklets
c065ce14 d per_cpu__rt_cache_stat


This alignement is quite unproductive, and removing it reduces the
size of percpu data (by 240 bytes on my x86 machine), and improves
performance (flow_tables & flow_hash_info can share a single cache
line)

After patch :
c065cc04 D per_cpu__softnet_data
c065cc4c d per_cpu__flow_tables
c065cc50 d per_cpu__flow_hash_info
c065cc5c d per_cpu__flow_flush_tasklets
c065cc70 d per_cpu__rt_cache_stat

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 18:03:18 -08:00
Patrick McHardy
4136cd523e [IPV4]: route: fix crash ip_route_input
ip_route_me_harder() may call ip_route_input() with skbs that don't
have skb->dev set for skbs rerouted in LOCAL_OUT and TCP resets
generated by the REJECT target, resulting in a crash when dereferencing
skb->dev->nd_net. Since ip_route_input() has an input device argument,
it seems correct to use that one anyway.

Bug introduced in b5921910a1 (Routing cache virtualization).

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 17:58:20 -08:00
Jan Engelhardt
5da621f1c5 [NETFILTER]: xt_iprange: add missing #include
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 17:57:11 -08:00
Patrick McHardy
d9d17578d9 [NETFILTER]: xt_iprange: fix typo in address family
The family for iprange_mt4 should be AF_INET, not AF_INET6.
Noticed by Jiri Moravec <jim.lkml@gmail.com>.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 17:56:49 -08:00
Patrick McHardy
86577c661b [NETFILTER]: nf_conntrack: fix ct_extend ->move operation
The ->move operation has two bugs:

- It is called with the same extension as source and destination,
  so it doesn't update the new extension.

- The address of the old extension is calculated incorrectly,
  instead of (void *)ct->ext + ct->ext->offset[i] it uses
  ct->ext + ct->ext->offset[i].

Fixes a crash on x86_64 reported by Chuck Ebbert <cebbert@redhat.com>
and Thomas Woerner <twoerner@redhat.com>.

Tested-by: Thomas Woerner <twoerner@redhat.com>

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 17:56:34 -08:00
Jozsef Kadlecsik
b2155e7f70 [NETFILTER]: nf_conntrack: TCP conntrack reopening fix
TCP connection tracking in netfilter did not handle TCP reopening
properly: active close was taken into account for one side only and
not for any side, which is fixed now. The patch includes more comments
to explain the logic how the different cases are handled.
The bug was discovered by Jeff Chua.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 17:54:56 -08:00
David Howells
e231c2ee64 Convert ERR_PTR(PTR_ERR(p)) instances to ERR_CAST(p)
Convert instances of ERR_PTR(PTR_ERR(p)) to ERR_CAST(p) using:

perl -spi -e 's/ERR_PTR[(]PTR_ERR[(](.*)[)][)]/ERR_CAST(\1)/' `grep -rl 'ERR_PTR[(]*PTR_ERR' fs crypto net security`

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-07 08:42:26 -08:00
Vlad Yasevich
5f9646c3d9 [SCTP]: Make sure the chunk is off the transmitted list prior to freeing.
In a few instances, we need to remove the chunk from the transmitted list
prior to freeing it.  This is because the free code doesn't do that any
more and so we need to do it manually.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2008-02-06 21:27:39 -05:00
Wei Yongjun
a869981423 [SCTP]: Fix kernel panic while received ASCONF chunk with bad serial number
While recevied ASCONF chunk with serial number less then needed, kernel
will treat this chunk as a retransmitted ASCONF chunk and find cached
ASCONF-ACK chunk used sctp_assoc_lookup_asconf_ack(). But this function
will always return NO-NULL. So response with cached ASCONF-ACKs chunk
will cause kernel panic.
In function sctp_assoc_lookup_asconf_ack(), if the cached ASCONF-ACKs
list asconf_ack_list is empty, or if the serial being requested does not
exists, the function as it currectly stands returns the actuall
list_head asoc->asconf_ack_list, this is not a cache ASCONF-ACK chunk
but a bogus pointer.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2008-02-06 21:27:39 -05:00
Vlad Yasevich
b46ae36de4 [SCTP]: Set ports in every address returned by sctp_getladdrs()
Thomas Dreibholz has reported that port numbers are not filled
in the results of sctp_getladdrs() when the socket was bound
to an ephemeral port.  This is only true, if the address was
not specified either.  So, fill in the port number correctly.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2008-02-06 21:27:39 -05:00
Vlad Yasevich
c068be5491 [SCTP]: Correctly reap SSNs when processing FORWARD_TSN chunk
When we recieve a FORWARD_TSN chunk, we need to reap
all the queued fast-forwarded chunks from the ordering queue
 However, if we don't have them queued, we need to see if
the next expected one is there as well.  If it is, start
deliver from that point instead of waiting for the next
chunk to arrive.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2008-02-06 21:26:26 -05:00
Andrew Morton
7276744354 9p: fix p9_printfcall export
ERROR: "p9_printfcall" [net/9p/9pnet_virtio.ko] undefined!

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-02-06 19:25:01 -06:00
Eric Van Hensbergen
8a0dc95fd9 9p: transport API reorganization
This merges the mux.c (including the connection interface) with trans_fd
in preparation for transport API changes.  Ultimately, trans_fd will need
to be rewritten to clean it up and simplify the implementation, but this
reorganization is viewed as the first step.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-02-06 19:25:03 -06:00
Eric Van Hensbergen
f39335453f 9p: add remove function to trans_virtio
Request from rusty:
Just cleaning up patches for 2.6.25 merge, and noticed that
net/9p/trans_virtio.c doesn't have a remove function.  This will crash when
removing the module (console doesn't have one because it can't really be
removed).

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-02-06 19:25:04 -06:00
Anthony Liguori
dea7bbb603 9p: Convert semaphore to spinlock for p9_idpool
When booting from v9fs, down_interruptible in p9_idpool_get() triggered a BUG
as it was being called with IRQs disabled.  A spinlock seems like the right
thing to be using since the idr functions go out of their way not to sleep.

This patch eliminates the BUG by converting the semaphore to a spinlock.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Acked-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-02-06 19:25:04 -06:00
Eric Van Hensbergen
7c7d90f2dd 9p: Fix soft lockup in virtio transport
This fixes a poorly placed spinlock which could result in a
soft lockup condition.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-02-06 19:25:07 -06:00
Eric Van Hensbergen
e2735b7720 9p: block-based virtio client
This replaces the console-based virto client with a block-based
client using a single request queue.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-02-06 19:25:58 -06:00
Eric Van Hensbergen
043aba403e 9p: create transport rpc cut-thru
Add a new transport function which allows a cut-thru directly to
the transport instead of processing request through the mux if the
cut-thru exists.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-02-06 19:25:09 -06:00
Martin Stava
afcf0c13ae 9p: fix bug in p9_clone_stat
This patch fixes a bug in the copying of 9P
stat information where string references
weren't being updated properly.

Signed-off-by: Martin Sava <martin.stava@gmail.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-02-06 19:20:44 -06:00
Sven Wegener
9c1ca6e68a ipvs: Make wrr "no available servers" error message rate-limited
No available servers is more an error message than something informational. It
should also be rate-limited, else we're going to flood our logs on a busy
director, if all real servers are out of order with a weight of zero.

Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-05 20:00:10 -08:00
David S. Miller
a29961b33b Merge branch 'fixes' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6 2008-02-05 19:58:05 -08:00
Patrick McHardy
9ec138101f [NET_SCHED]: cls_flow: support classification based on VLAN tag
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-05 16:21:04 -08:00
Patrick McHardy
4f25049106 [NET_SCHED]: cls_flow: fix key mask validity check
Since we're using fls(), we need to check whether the value is
non-zero first.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-05 16:19:59 -08:00
Patrick McHardy
0ea9d70df8 [NET_SCHED]: em_meta: fix compile warning
net/sched/em_meta.c: In function 'meta_int_vlan_tag':
net/sched/em_meta.c:179: warning: 'tag' may be used uninitialized in this function

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-05 16:19:33 -08:00
Michael Buesch
03ac7a8141 mac80211: Is not EXPERIMENTAL anymore
Remove the EXPERIMENTAL dependency, as the existing mac80211
features are stable.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-05 14:35:47 -05:00
Linus Torvalds
3d412f60b7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits)
  [PKT_SCHED]: vlan tag match
  [NET]: Add if_addrlabel.h to sanitized headers.
  [NET] rtnetlink.c: remove no longer used functions
  [ICMP]: Restore pskb_pull calls in receive function
  [INET]: Fix accidentally broken inet(6)_hash_connect's port offset calculations.
  [NET]: Remove further references to net-modules.txt
  bluetooth rfcomm tty: destroy before tty_close()
  bluetooth: blacklist another Broadcom BCM2035 device
  drivers/bluetooth/btsdio.c: fix double-free
  drivers/bluetooth/bpa10x.c: fix memleak
  bluetooth: uninlining
  bluetooth: hidp_process_hid_control remove unnecessary parameter dealing
  tun: impossible to deassert IFF_ONE_QUEUE or IFF_NO_PI
  hamradio: fix dmascc section mismatch
  [SCTP]: Fix kernel panic while received AUTH chunk with BAD shared key identifier
  [SCTP]: Fix kernel panic while received AUTH chunk while enabled auth
  [IPV4]: Formatting fix for /proc/net/fib_trie.
  [IPV6]: Fix sysctl compilation error.
  [NET_SCHED]: Add #ifdef CONFIG_NET_EMATCH in net/sched/cls_flow.c (latest git broken build)
  [IPV4]: Fix compile error building without CONFIG_FS_PROC
  ...
2008-02-05 10:09:07 -08:00
Paul Moore
eda61d32e8 NetLabel: introduce a new kernel configuration API for NetLabel
Add a new set of configuration functions to the NetLabel/LSM API so that
LSMs can perform their own configuration of the NetLabel subsystem without
relying on assistance from userspace.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: James Morris <jmorris@namei.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-05 09:44:20 -08:00
Vlad Yasevich
01f2d38498 [SCTP]: Kill silly inlines in ulpqueue.c
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2008-02-05 10:59:30 -05:00
Vlad Yasevich
0eca8fee0c [SCTP]: Do not increase rwnd when reading partial notification.
When a user reads a partial notification message, do not
update rwnd since notifications must not be counted towards
receive window.

Tested-by: Oliver Roll <mail@oliroll.de>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2008-02-05 10:59:30 -05:00
Vlad Yasevich
60c778b259 [SCTP]: Stop claiming that this is a "reference implementation"
I was notified by Randy Stewart that lksctp claims to be
"the reference implementation".  First of all, "the
refrence implementation" was the original implementation
of SCTP in usersapce written ty Randy and a few others.
Second, after looking at the definiton of 'reference implementation',
we don't really meet the requirements.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2008-02-05 10:59:07 -05:00
Stephen Hemminger
3113e88c3c [PKT_SCHED]: vlan tag match
Provide a way to use tc filters on vlan tag even if tag is buried in
skb due to hardware acceleration.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-05 03:20:13 -08:00
Adrian Bunk
03245ce2f0 [NET] rtnetlink.c: remove no longer used functions
This patch removes the following no longer used functions:
- rtattr_parse()
- rtattr_strlcpy()
- __rtattr_parse_nested_compat()

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-05 03:17:22 -08:00
Herbert Xu
8cf229437f [ICMP]: Restore pskb_pull calls in receive function
Somewhere along the development of my ICMP relookup patch the header
length check went AWOL on the non-IPsec path.  This patch restores the
check.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-05 03:15:50 -08:00
Pavel Emelyanov
5d8c0aa943 [INET]: Fix accidentally broken inet(6)_hash_connect's port offset calculations.
The port offset calculations depend on the protocol family, but, as
Adrian noticed, I broke this logic with the commit

	5ee31fc1ec
	[INET]: Consolidate inet(6)_hash_connect.

Return this logic back, by passing the port offset directly into the
consolidated function.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Noticed-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-05 03:14:44 -08:00
Dave Young
93d807401c bluetooth rfcomm tty: destroy before tty_close()
rfcomm dev could be deleted in tty_hangup, so we must not call
rfcomm_dev_del again to prevent from destroying rfcomm dev before tty
close.

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-05 03:12:06 -08:00
Andrew Morton
91f5cca3d1 bluetooth: uninlining
Remove all those inlines which were either a) unneeded or b) increased code
size.

          text    data     bss     dec     hex filename
before:   6997      74       8    7079    1ba7 net/bluetooth/hidp/core.o
after:    6492      74       8    6574    19ae net/bluetooth/hidp/core.o

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-05 03:07:58 -08:00
Dave Young
eff001e35a bluetooth: hidp_process_hid_control remove unnecessary parameter dealing
According to the bluetooth HID spec v1.0 chapter 7.4.2

"This code requests a major state change in a BT-HID device.  A HID_CONTROL
request does not generate a HANDSHAKE response."

"A HID_CONTROL packet with a parameter of VIRTUAL_CABLE_UNPLUG is the only
HID_CONTROL packet a device can send to a host.  A host will ignore all other
packets."

So in the hidp_precess_hid_control function, we just need to deal with the
UNLUG packet.

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-05 03:07:14 -08:00
Wei Yongjun
7cc08b55fc [SCTP]: Fix kernel panic while received AUTH chunk with BAD shared key identifier
If SCTP-AUTH is enabled, received AUTH chunk with BAD shared key 
identifier will cause kernel panic.

Test as following:
step1: enabled /proc/sys/net/sctp/auth_enable
step 2:  connect  to SCTP server with auth capable. Association is 
established between endpoints. Then send a AUTH chunk with a bad 
shareid, SCTP server will kernel panic after received that AUTH chunk.

SCTP client                   SCTP server
  INIT         ---------->  
    (with auth capable)
               <----------    INIT-ACK
                              (with auth capable)
  COOKIE-ECHO  ---------->
               <----------    COOKIE-ACK
  AUTH         ---------->


AUTH chunk is like this:
  AUTH chunk
    Chunk type: AUTH (15)
    Chunk flags: 0x00
    Chunk length: 28
    Shared key identifier: 10
    HMAC identifier: SHA-1 (1)
    HMAC: 0000000000000000000000000000000000000000

The assignment of NULL to key can safely be removed, since key_for_each 
(which is just list_for_each_entry under the covers does an initial 
assignment to key anyway).

If the endpoint_shared_keys list is empty, or if the key_id being 
requested does not exist, the function as it currently stands returns 
the actuall list_head (in this case endpoint_shared_keys.  Since that 
list_head isn't surrounded by an actuall data structure, the last 
iteration through list_for_each_entry will do a container_of on key, and 
we wind up returning a bogus pointer, instead of NULL, as we should.

> Neil Horman wrote:
>> On Tue, Jan 22, 2008 at 05:29:20PM +0900, Wei Yongjun wrote:
>>
>> FWIW, Ack from me.  The assignment of NULL to key can safely be 
>> removed, since
>> key_for_each (which is just list_for_each_entry under the covers does 
>> an initial
>> assignment to key anyway).
>> If the endpoint_shared_keys list is empty, or if the key_id being 
>> requested does
>> not exist, the function as it currently stands returns the actuall 
>> list_head (in
>> this case endpoint_shared_keys.  Since that list_head isn't 
>> surrounded by an
>> actuall data structure, the last iteration through 
>> list_for_each_entry will do a
>> container_of on key, and we wind up returning a bogus pointer, 
>> instead of NULL,
>> as we should.  Wei's patch corrects that.
>>
>> Regards
>> Neil
>>
>> Acked-by: Neil Horman <nhorman@tuxdriver.com>
>>
>
> Yep, the patch is correct.
>
> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
>
> -vlad
>

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-05 03:03:06 -08:00
Wei Yongjun
d2f19fa13e [SCTP]: Fix kernel panic while received AUTH chunk while enabled auth
If STCP is started while /proc/sys/net/sctp/auth_enable is set 0 and
association is established between endpoints. Then if
/proc/sys/net/sctp/auth_enable is set 1, a received AUTH chunk will
cause kernel panic.

Test as following:
step 1: echo 0> /proc/sys/net/sctp/auth_enable
step 2:

   SCTP client                  SCTP server
      INIT          --------->
                    <---------   INIT-ACK
      COOKIE-ECHO   --------->
                    <---------   COOKIE-ACK
step 3:
    echo 1> /proc/sys/net/sctp/auth_enable
step 4:
   SCTP client                  SCTP server
       AUTH        ----------->  Kernel Panic


This patch fix this probleam to treat AUTH chunk as unknow chunk if peer 
has initialized with no auth capable.

> Sorry for the delay.  Was on vacation without net access.
>
> Wei Yongjun wrote:
>>
>>
>> This patch fix this probleam to treat AUTH chunk as unknow chunk if 
>> peer has initialized with no auth capable.
>>
>> Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
>
> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
>
>>

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-05 03:02:26 -08:00
Denis V. Lunev
b9c4d82a85 [IPV4]: Formatting fix for /proc/net/fib_trie.
The line in the /proc/net/fib_trie for route with TOS specified
- has extra \n at the end
- does not have a space after route scope
like below.
           |-- 1.1.1.1
              /32 universe UNICASTtos =1

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-05 02:58:45 -08:00
Rami Rosen
0aead54347 [NET_SCHED]: Add #ifdef CONFIG_NET_EMATCH in net/sched/cls_flow.c (latest git broken build)
The 2.6 latest git build was broken when using the following
configuration options:
CONFIG_NET_EMATCH=n
CONFIG_NET_CLS_FLOW=y

with the following error:
net/sched/cls_flow.c: In function 'flow_dump':
net/sched/cls_flow.c:598: error: 'struct tcf_ematch_tree' has no
member named 'hdr'
make[2]: *** [net/sched/cls_flow.o] Error 1
make[1]: *** [net/sched] Error 2
make: *** [net] Error 2


see the recent post by Li Zefan:
  http://www.spinics.net/lists/netdev/msg54434.html

The reason for this crash is that struct tcf_ematch_tree
(net/pkt_cls.h) is empty when CONFIG_NET_EMATCH is not defined.

When CONFIG_NET_EMATCH is defined, the tcf_ematch_tree structure
indeed holds a struct tcf_ematch_tree_hdr (hdr) as flow_dump()
expects.

This patch adds #ifdef CONFIG_NET_EMATCH in flow_dump to avoid this.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-05 02:56:48 -08:00
Adrian Bunk
322c8a3c36 [IPSEC] xfrm4_beet_input(): fix an if()
A bug every C programmer makes at some point in time...

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-05 02:51:39 -08:00
Linus Torvalds
93890b71a3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: (25 commits)
  virtio: balloon driver
  virtio: Use PCI revision field to indicate virtio PCI ABI version
  virtio: PCI device
  virtio_blk: implement naming for vda-vdz,vdaa-vdzz,vdaaa-vdzzz
  virtio_blk: Dont waste major numbers
  virtio_blk: provide getgeo
  virtio_net: parametrize the napi_weight for virtio receive queue.
  virtio: free transmit skbs when notified, not on next xmit.
  virtio: flush buffers on open
  virtnet: remove double ether_setup
  virtio: Allow virtio to be modular and used by modules
  virtio: Use the sg_phys convenience function.
  virtio: Put the virtio under the virtualization menu
  virtio: handle interrupts after callbacks turned off
  virtio: reset function
  virtio: populate network rings in the probe routine, not open
  virtio: Tweak virtio_net defines
  virtio: Net header needs hdr_len
  virtio: remove unused id field from struct virtio_blk_outhdr
  virtio: clarify NO_NOTIFY flag usage
  ...
2008-02-04 08:00:54 -08:00
Linus Torvalds
f5bb3a5e9d Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (79 commits)
  Jesper Juhl is the new trivial patches maintainer
  Documentation: mention email-clients.txt in SubmittingPatches
  fs/binfmt_elf.c: spello fix
  do_invalidatepage() comment typo fix
  Documentation/filesystems/porting fixes
  typo fixes in net/core/net_namespace.c
  typo fix in net/rfkill/rfkill.c
  typo fixes in net/sctp/sm_statefuns.c
  lib/: Spelling fixes
  kernel/: Spelling fixes
  include/scsi/: Spelling fixes
  include/linux/: Spelling fixes
  include/asm-m68knommu/: Spelling fixes
  include/asm-frv/: Spelling fixes
  fs/: Spelling fixes
  drivers/watchdog/: Spelling fixes
  drivers/video/: Spelling fixes
  drivers/ssb/: Spelling fixes
  drivers/serial/: Spelling fixes
  drivers/scsi/: Spelling fixes
  ...
2008-02-04 07:58:52 -08:00
Rusty Russell
18445c4d50 virtio: explicit enable_cb/disable_cb rather than callback return.
It seems that virtio_net wants to disable callbacks (interrupts) before
calling netif_rx_schedule(), so we can't use the return value to do so.

Rename "restart" to "cb_enable" and introduce "cb_disable" hook: callback
now returns void, rather than a boolean.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04 23:49:58 +11:00
Rusty Russell
a586d4f601 virtio: simplify config mechanism.
Previously we used a type/len pair within the config space, but this
seems overkill.  We now simply define a structure which represents the
layout in the config space: the config space can now only be extended
at the end.

The main driver-visible changes:
1) We indicate what fields are present with an explicit feature bit.
2) Virtqueues are explicitly numbered, and not in the config space.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04 23:49:57 +11:00
Rusty Russell
f35d9d8aae virtio: Implement skb_partial_csum_set, for setting partial csums on untrusted packets.
Use it in virtio_net (replacing buggy version there), it's also going
to be used by TAP for partial csum support.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: David S. Miller <davem@davemloft.net>
2008-02-04 23:49:56 +11:00
Oliver Pinter
53379e57a7 typo fixes in net/core/net_namespace.c
Signed-off-by: Oliver Pinter <oliver.pntr@gmail.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
2008-02-03 17:56:48 +02:00
Oliver Pinter
1dbede8714 typo fix in net/rfkill/rfkill.c
Signed-off-by: Oliver Pinter <oliver.pntr@gmail.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
2008-02-03 17:55:45 +02:00
Oliver Pinter
ac461a0330 typo fixes in net/sctp/sm_statefuns.c
Signed-off-by: Oliver Pinter <oliver.pntr@gmail.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
2008-02-03 17:52:41 +02:00
David S. Miller
a80f509f4a Merge branch 'fixes' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6 2008-02-03 04:43:34 -08:00
Arnaldo Carvalho de Melo
ab1e0a13d7 [SOCK] proto: Add hashinfo member to struct proto
This way we can remove TCP and DCCP specific versions of

sk->sk_prot->get_port: both v4 and v6 use inet_csk_get_port
sk->sk_prot->hash:     inet_hash is directly used, only v6 need
                       a specific version to deal with mapped sockets
sk->sk_prot->unhash:   both v4 and v6 use inet_hash directly

struct inet_connection_sock_af_ops also gets a new member, bind_conflict, so
that inet_csk_get_port can find the per family routine.

Now only the lookup routines receive as a parameter a struct inet_hashtable.

With this we further reuse code, reducing the difference among INET transport
protocols.

Eventually work has to be done on UDP and SCTP to make them share this
infrastructure and get as a bonus inet_diag interfaces so that iproute can be
used with these protocols.

net-2.6/net/ipv4/inet_hashtables.c:
  struct proto			     |   +8
  struct inet_connection_sock_af_ops |   +8
 2 structs changed
  __inet_hash_nolisten               |  +18
  __inet_hash                        | -210
  inet_put_port                      |   +8
  inet_bind_bucket_create            |   +1
  __inet_hash_connect                |   -8
 5 functions changed, 27 bytes added, 218 bytes removed, diff: -191

net-2.6/net/core/sock.c:
  proto_seq_show                     |   +3
 1 function changed, 3 bytes added, diff: +3

net-2.6/net/ipv4/inet_connection_sock.c:
  inet_csk_get_port                  |  +15
 1 function changed, 15 bytes added, diff: +15

net-2.6/net/ipv4/tcp.c:
  tcp_set_state                      |   -7
 1 function changed, 7 bytes removed, diff: -7

net-2.6/net/ipv4/tcp_ipv4.c:
  tcp_v4_get_port                    |  -31
  tcp_v4_hash                        |  -48
  tcp_v4_destroy_sock                |   -7
  tcp_v4_syn_recv_sock               |   -2
  tcp_unhash                         | -179
 5 functions changed, 267 bytes removed, diff: -267

net-2.6/net/ipv6/inet6_hashtables.c:
  __inet6_hash |   +8
 1 function changed, 8 bytes added, diff: +8

net-2.6/net/ipv4/inet_hashtables.c:
  inet_unhash                        | +190
  inet_hash                          | +242
 2 functions changed, 432 bytes added, diff: +432

vmlinux:
 16 functions changed, 485 bytes added, 492 bytes removed, diff: -7

/home/acme/git/net-2.6/net/ipv6/tcp_ipv6.c:
  tcp_v6_get_port                    |  -31
  tcp_v6_hash                        |   -7
  tcp_v6_syn_recv_sock               |   -9
 3 functions changed, 47 bytes removed, diff: -47

/home/acme/git/net-2.6/net/dccp/proto.c:
  dccp_destroy_sock                  |   -7
  dccp_unhash                        | -179
  dccp_hash                          |  -49
  dccp_set_state                     |   -7
  dccp_done                          |   +1
 5 functions changed, 1 bytes added, 242 bytes removed, diff: -241

/home/acme/git/net-2.6/net/dccp/ipv4.c:
  dccp_v4_get_port                   |  -31
  dccp_v4_request_recv_sock          |   -2
 2 functions changed, 33 bytes removed, diff: -33

/home/acme/git/net-2.6/net/dccp/ipv6.c:
  dccp_v6_get_port                   |  -31
  dccp_v6_hash                       |   -7
  dccp_v6_request_recv_sock          |   +5
 3 functions changed, 5 bytes added, 38 bytes removed, diff: -33

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-03 04:28:52 -08:00
Linus Torvalds
63e9b66e29 Merge branch 'for-linus' of git://linux-nfs.org/~bfields/linux
* 'for-linus' of git://linux-nfs.org/~bfields/linux: (100 commits)
  SUNRPC: RPC program information is stored in unsigned integers
  SUNRPC: Move exported symbol definitions after function declaration part 2
  NLM: tear down RPC clients in nlm_shutdown_hosts
  SUNRPC: spin svc_rqst initialization to its own function
  nfsd: more careful input validation in nfsctl write methods
  lockd: minor log message fix
  knfsd: don't bother mapping putrootfh enoent to eperm
  rdma: makefile
  rdma: ONCRPC RDMA protocol marshalling
  rdma: SVCRDMA sendto
  rdma: SVCRDMA recvfrom
  rdma: SVCRDMA Core Transport Services
  rdma: SVCRDMA Transport Module
  rdma: SVCRMDA Header File
  svc: Add svc_xprt_names service to replace svc_sock_names
  knfsd: Support adding transports by writing portlist file
  svc: Add svc API that queries for a transport instance
  svc: Add /proc/sys/sunrpc/transport files
  svc: Add transport hdr size for defer/revisit
  svc: Move the xprt independent code to the svc_xprt.c file
  ...
2008-02-02 14:31:28 +11:00
Chuck Lever
ea339d46b9 SUNRPC: RPC program information is stored in unsigned integers
Clean up: When looping over RPC version and procedure numbers, use
unsigned index variables.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 17:01:31 -05:00
Trond Myklebust
d2f7e79e3b SUNRPC: Move exported symbol definitions after function declaration part 2
Do it for the server code...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 17:01:24 -05:00
Jeff Layton
0113ab3464 SUNRPC: spin svc_rqst initialization to its own function
Move the initialzation in __svc_create_thread that happens prior to
thread creation to a new function. Export the function to allow
services to have better control over the svc_rqst structs.

Also rearrange the rqstp initialization to prevent NULL pointer
dereferences in svc_exit_thread in case allocations fail.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:15 -05:00
Tom Tucker
4b8449af75 rdma: makefile
Add the svcrdma module to the xprtrdma makefile.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:14 -05:00
Tom Tucker
ef1eac0a3f rdma: ONCRPC RDMA protocol marshalling
This logic parses the ONCRDMA protocol headers that
precede the actual RPC header. It is placed in a separate
file to keep all protocol aware code in a single place.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:14 -05:00
Tom Tucker
c06b540a54 rdma: SVCRDMA sendto
This file implements the RDMA transport sendto function. A RPC reply
on an RDMA transport consists of some number of RDMA_WRITE requests
followed by an RDMA_SEND request. The sendto function parses the
ONCRPC RDMA reply header to determine how to send the reply back to
the client. The send queue is sized so as to be able to send complete
replies for requests in most cases.  In the event that there are not
enough SQ WR slots to reply, e.g.  big data, the send will block the
NFSD thread. The I/O callback functions in svc_rdma_transport.c that
reap WR completions wake any waiters blocked on the SQ. In general,
the goal is not to block NFSD threads and the has_wspace method
stall requests when the SQ is nearly full.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:14 -05:00
Tom Tucker
d5b31be682 rdma: SVCRDMA recvfrom
This file implements the RDMA transport recvfrom function. The function
dequeues work reqeust completion contexts from an I/O list that it shares
with the I/O tasklet in svc_rdma_transport.c. For ONCRPC RDMA, an RPC may
not be complete when it is received. Instead, the RDMA header that precedes
the RPC message informs the transport where to get the RPC data from on
the client and where to place it in the RPC message before it is delivered
to the server. The svc_rdma_recvfrom function therefore, parses this RDMA
header and issues any necessary RDMA operations to fetch the remainder of
the RPC from the client.

Special handling is required when the request involves an RDMA_READ.
In this case, recvfrom submits the RDMA_READ requests to the underlying
transport driver and then returns 0. When the transport
completes the last RDMA_READ for the request, it enqueues it on a
read completion queue and enqueues the transport. The recvfrom code
favors this queue over the regular DTO queue when satisfying reads.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:14 -05:00
Tom Tucker
377f9b2f45 rdma: SVCRDMA Core Transport Services
This file implements the core transport data management and I/O
path. The I/O path for RDMA involves receiving callbacks on interrupt
context. Since all the svc transport locks are _bh locks we enqueue the
transport on a list, schedule a tasklet to dequeue data indications from
the RDMA completion queue. The tasklet in turn takes _bh locks to
enqueue receive data indications on a list for the transport. The
svc_rdma_recvfrom transport function dequeues data from this list in an
NFSD thread context.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:14 -05:00
Tom Tucker
ef7fbf59e6 rdma: SVCRDMA Transport Module
This file implements the RDMA transport module initialization and
termination logic and registers the transport sysctl variables.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:14 -05:00
Tom Tucker
9571af18fa svc: Add svc_xprt_names service to replace svc_sock_names
Create a transport independent version of the svc_sock_names function.

The toclose capability of the svc_sock_names service can be implemented
using the svc_xprt_find and svc_xprt_close services.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:14 -05:00
Tom Tucker
a217813f90 knfsd: Support adding transports by writing portlist file
Update the write handler for the portlist file to allow creating new
listening endpoints on a transport. The general form of the string is:

<transport_name><space><port number>

For example:

echo "tcp 2049" > /proc/fs/nfsd/portlist

This is intended to support the creation of a listening endpoint for
RDMA transports without adding #ifdef code to the nfssvc.c file.

Transports can also be removed as follows:

'-'<transport_name><space><port number>

For example:

echo "-tcp 2049" > /proc/fs/nfsd/portlist

Attempting to add a listener with an invalid transport string results
in EPROTONOSUPPORT and a perror string of "Protocol not supported".

Attempting to remove an non-existent listener (.e.g. bad proto or port)
results in ENOTCONN and a perror string of
"Transport endpoint is not connected"

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:13 -05:00
Tom Tucker
7fcb98d58c svc: Add svc API that queries for a transport instance
Add a new svc function that allows a service to query whether a
transport instance has already been created. This is used in lockd
to determine whether or not a transport needs to be created when
a lockd instance is brought up.

Specifying 0 for the address family or port is effectively a wild-card,
and will result in matching the first transport in the service's list
that has a matching class name.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:13 -05:00
Tom Tucker
dc9a16e49d svc: Add /proc/sys/sunrpc/transport files
Add a file that when read lists the set of registered svc
transports.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:13 -05:00
Tom Tucker
260c1d1298 svc: Add transport hdr size for defer/revisit
Some transports have a header in front of the RPC header. The current
defer/revisit processing considers only the iov_len and arg_len to
determine how much to back up when saving the original request
to revisit. Add a field to the rqstp structure to save the size
of the transport header so svc_defer can correctly compute
the start of a request.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:13 -05:00
Tom Tucker
0f0257eaa5 svc: Move the xprt independent code to the svc_xprt.c file
This functionally trivial patch moves all of the transport independent
functions from the svcsock.c file to the transport independent svc_xprt.c
file.

In addition the following formatting changes were made:
- White space cleanup
- Function signatures on single line
- The inline directive was removed
- Lines over 80 columns were reformatted
- The term 'socket' was changed to 'transport' in comments
- The SMP comment was moved and updated.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:13 -05:00
Tom Tucker
18d19f949d svc: Make svc_check_conn_limits xprt independent
The svc_check_conn_limits function only manipulates xprt fields. Change references
to svc_sock->sk_xprt to svc_xprt directly.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:13 -05:00
Tom Tucker
57b1d3baba svc: Removing remaining references to rq_sock in rqstp
This functionally empty patch removes rq_sock and unamed union
from rqstp structure.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:13 -05:00
Tom Tucker
4e5caaa5f2 svc: Move create logic to common code
Move the svc transport list logic into common transport creation code.
Refactor this code path to make the flow of control easier to read.

Move the setting and clearing of the BUSY_BIT during transport creation
to common code.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:13 -05:00
Tom Tucker
9f8bfae693 svc: Make svc_age_temp_sockets svc_age_temp_transports
This function is transport independent. Change it to use svc_xprt directly
and change it's name to reflect this.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:13 -05:00
Tom Tucker
c36adb2a7f svc: Make svc_recv transport neutral
All of the transport field and functions used by svc_recv are now
transport independent. Change the svc_recv function to use the svc_xprt
structure directly instead of the transport specific svc_sock structure.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:12 -05:00
Tom Tucker
eab996d4ac svc: Make svc_sock_release svc_xprt_release
The svc_sock_release function only touches transport independent fields.
Change the function to manipulate svc_xprt directly instead of the transport
dependent svc_sock structure.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:12 -05:00
Tom Tucker
9dbc240f19 svc: Move the sockaddr information to svc_xprt
This patch moves the transport sockaddr to the svc_xprt
structure.  Convenience functions are added to set and
get the local and remote addresses of a transport from
the transport provider as well as determine the length
of a sockaddr.

A transport is responsible for setting the xpt_local
and xpt_remote addresses in the svc_xprt structure as
part of transport creation and xpo_accept processing. This
cannot be done in a generic way and in fact varies
between TCP, UDP and RDMA. A set of xpo_ functions
(e.g. getlocalname, getremotename) could have been
added but this would have resulted in additional
caching and copying of the addresses around.  Note that
the xpt_local address should also be set on listening
endpoints; for TCP/RDMA this is done as part of
endpoint creation.

For connected transports like TCP and RDMA, the addresses
never change and can be set once and copied into the
rqstp structure for each request. For UDP, however, the
local and remote addresses may change for each request. In
this case, the address information is obtained from the
UDP recvmsg info and copied into the rqstp structure from
there.

A svc_xprt_local_port function was also added that returns
the local port given a transport. This is used by
svc_create_xprt when returning the port associated with
a newly created transport, and later when creating a
generic find transport service to check if a service is
already listening on a given port.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:12 -05:00
Tom Tucker
8c7b0172a1 svc: Make deferral processing xprt independent
This patch moves the transport independent sk_deferred list to the svc_xprt
structure and updates the svc_deferred_req structure to keep pointers to
svc_xprt's directly. The deferral processing code is also moved out of the
transport dependent recvfrom functions and into the generic svc_recv path.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:12 -05:00
Tom Tucker
def13d7401 svc: Move the authinfo cache to svc_xprt.
Move the authinfo cache to svc_xprt. This allows both the TCP and RDMA
transports to share this logic. A flag bit is used to determine if
auth information is to be cached or not. Previously, this code looked
at the transport protocol.

I've also changed the spin_lock/unlock logic so that a lock is not taken for
transports that are not caching auth info.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:12 -05:00
Tom Tucker
4bc6c497b2 svc: Remove sk_lastrecv
With the implementation of the new mark and sweep algorithm for shutting
down old connections, the sk_lastrecv field is no longer needed.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:12 -05:00
Tom Tucker
6bc5ab1367 svc: Move accept call to svc_xprt_received to common code
Now that the svc_xprt_received function handles transports, the call
to svc_xprt_received in the xpo_tcp_accept function can be moved to
common code.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:12 -05:00
Tom Tucker
a6046f71f2 svc: Change svc_sock_received to svc_xprt_received and export it
All fields touched by svc_sock_received are now transport independent.
Change it to use svc_xprt directly. This function is called from
transport dependent code, so export it.

Update the comment to clearly state the rules for calling this function.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:12 -05:00
Tom Tucker
a50fea26b9 svc: Make svc_send transport neutral
Move the sk_mutex field to the transport independent svc_xprt structure.
Now all the fields that svc_send touches are transport neutral. Change the
svc_send function to use the transport independent svc_xprt directly instead
of the transport dependent svc_sock structure.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:11 -05:00
Tom Tucker
f6150c3cab svc: Make the enqueue service transport neutral and export it.
The svc_sock_enqueue function is now transport independent since all of
the fields it touches have been moved to the transport independent svc_xprt
structure. Change the function to use the svc_xprt structure directly
instead of the transport specific svc_sock structure.

Transport specific data-ready handlers need to call this function, so
export it.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:11 -05:00
Tom Tucker
7a90e8cc21 svc: Move sk_reserved to svc_xprt
This functionally trivial patch moves the sk_reserved field to the
transport independent svc_xprt structure.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:11 -05:00
Tom Tucker
7a18208383 svc: Make close transport independent
Move sk_list and sk_ready to svc_xprt. This involves close because these
lists are walked by svcs when closing all their transports. So I combined
the moving of these lists to svc_xprt with making close transport independent.

The svc_force_sock_close has been changed to svc_close_all and takes a list
as an argument. This removes some svc internals knowledge from the svcs.

This code races with module removal and transport addition.

Thanks to Simon Holm Thøgersen for a compile fix.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Simon Holm Thøgersen <odie@cs.aau.dk>
2008-02-01 16:42:11 -05:00
Tom Tucker
bb5cf160b2 svc: Move sk_server and sk_pool to svc_xprt
This is another incremental change that moves transport independent
fields from svc_sock to the svc_xprt structure. The changes
should be functionally null.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:11 -05:00
Tom Tucker
02fc6c3618 svc: Move sk_flags to the svc_xprt structure
This functionally trivial change moves the transport independent sk_flags
field to the transport independent svc_xprt structure.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:11 -05:00
Tom Tucker
e1b3157f97 svc: Change sk_inuse to a kref
Change the atomic_t reference count to a kref and move it to the
transport indepenent svc_xprt structure. Change the reference count
wrapper names to be generic.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:11 -05:00
Tom Tucker
d7c9f1ed97 svc: Change services to use new svc_create_xprt service
Modify the various kernel RPC svcs to use the svc_create_xprt service.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:09 -05:00
Tom Tucker
b700cbb11f svc: Add a generic transport svc_create_xprt function
The svc_create_xprt function is a transport independent version
of the svc_makesock function.

Since transport instance creation contains transport dependent and
independent components, add an xpo_create transport function. The
transport implementation of this function allocates the memory for the
endpoint, implements the transport dependent initialization logic, and
calls svc_xprt_init to initialize the transport independent field (svc_xprt)
in it's data structure.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:09 -05:00
Tom Tucker
f9f3cc4fae svc: Move connection limit checking to its own function
Move the code that poaches connections when the connection limit is hit
to a subroutine to make the accept logic path easier to follow. Since this
is in the new connection path, it should not be a performance issue.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:09 -05:00
Tom Tucker
44a6995b32 svc: Remove unnecessary call to svc_sock_enqueue
The svc_tcp_accept function calls svc_sock_enqueue after setting the
SK_CONN bit. This doesn't actually do anything because the SK_BUSY bit
is still set. The call is unnecessary anyway because the generic code in
svc_recv calls svc_sock_received after calling the accept function.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:09 -05:00
Tom Tucker
38a417cc99 svc: Add xpo_accept transport function
Previously, the accept logic looked into the socket state to determine
whether to call accept or recv when data-ready was indicated on an endpoint.
Since some transports don't use sockets, this logic now uses a flag
bit (SK_LISTENER) to identify listening endpoints. A transport function
(xpo_accept) allows each transport to define its own accept processing.
A transport's initialization logic is reponsible for setting the
SK_LISTENER bit. I didn't see any way to do this in transport independent
logic since the passive side of a UDP connection doesn't listen and
always recv's.

In the svc_recv function, if the SK_LISTENER bit is set, the transport
xpo_accept function is called to handle accept processing.

Note that all functions are defined even if they don't make sense
for a given transport. For example, accept doesn't mean anything for
UDP. The function is defined anyway and bug checks if called. The
UDP transport should never set the SK_LISTENER bit.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:08 -05:00
Tom Tucker
d7979ae4a0 svc: Move close processing to a single place
Close handling was duplicated in the UDP and TCP recvfrom
methods. This code has been moved to the transport independent
svc_recv function.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:08 -05:00
Tom Tucker
323bee32e9 svc: Add a transport function that checks for write space
In order to avoid blocking a service thread, the receive side checks
to see if there is sufficient write space to reply to the request.
Each transport has a different mechanism for determining if there is
enough write space to reply.

The code that checked for write space was coupled with code that
checked for CLOSE and CONN. These checks have been broken out into
separate statements to make the code easier to read.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:08 -05:00
Tom Tucker
e831fe65b1 svc: Add xpo_prep_reply_hdr
Some transports add fields to the RPC header for replies, e.g. the TCP
record length. This function is called when preparing the reply header
to allow each transport to add whatever fields it requires.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:08 -05:00
Tom Tucker
755cceaba7 svc: Add per-transport delete functions
Add transport specific xpo_detach and xpo_free functions. The xpo_detach
function causes the transport to stop delivering data-ready events
and enqueing the transport for I/O.

The xpo_free function frees all resources associated with the particular
transport instance.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:08 -05:00
Tom Tucker
5148bf4ebc svc: Add transport specific xpo_release function
The svc_sock_release function releases pages allocated to a thread. For
UDP this frees the receive skb. For RDMA it will post a receive WR
and bump the client credit count.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:08 -05:00
Tom Tucker
5d137990f5 svc: Move sk_sendto and sk_recvfrom to svc_xprt_class
The sk_sendto and sk_recvfrom are function pointers that allow svc_sock
to be used for both UDP and TCP. Move these function pointers to the
svc_xprt_ops structure.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:08 -05:00
Tom Tucker
490231558e svc: Add a max payload value to the transport
The svc_max_payload function currently looks at the socket type
to determine the max payload. Add a max payload value to svc_xprt_class
so it can be returned directly.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:08 -05:00
Tom Tucker
360d873864 svc: Make svc_sock the tcp/udp transport
Make TCP and UDP svc_sock transports, and register them
with the svc transport core.

A transport type (svc_sock) has an svc_xprt as its first member,
and calls svc_xprt_init to initialize this field.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:07 -05:00
Tom Tucker
1d8206b97a svc: Add an svc transport class
The transport class (svc_xprt_class) represents a type of transport, e.g.
udp, tcp, rdma.  A transport class has a unique name and a set of transport
operations kept in the svc_xprt_ops structure.

A transport class can be dynamically registered and unregisterd. The
svc_xprt_class represents the module that implements the transport
type and keeps reference counts on the module to avoid unloading while
there are active users.

The endpoint (svc_xprt) is a generic, transport independent endpoint that can
be used to send and receive data for an RPC service. It inherits it's
operations from the transport class.

A transport driver module registers and unregisters itself with svc sunrpc
by calling svc_reg_xprt_class, and svc_unreg_xprt_class respectively.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:07 -05:00
J. Bruce Fields
cb5c7d668e svcrpc: ensure gss DESTROY tokens free contexts from cache
If we don't do this then we'll end up with a pointless unusable context
sitting in the cache until the time the original context would have
expired.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:07 -05:00
J. Bruce Fields
b39c18fce0 sunrpc: gss: simplify rsi_parse logic
Make an obvious simplification that removes a few lines and some
unnecessary indentation; no change in behavior.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:07 -05:00
J. Bruce Fields
980e5a40a4 nfsd: fix rsi_cache reference count leak
For some reason we haven't been put()'ing the reference count here.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:07 -05:00
J. Bruce Fields
dbf847ecb6 knfsd: allow cache_register to return error on failure
Newer server features such as nfsv4 and gss depend on proc to work, so a
failure to initialize the proc files they need should be treated as
fatal.

Thanks to Andrew Morton for style fix and compile fix in case where
CONFIG_NFSD_V4 is undefined.

Cc: Andrew Morton <akpm@linux-foundation.org>
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:05 -05:00
J. Bruce Fields
ffe9386b6e nfsd: move cache proc (un)registration to separate function
Just some minor cleanup.

Also I don't see much point in trying to register further proc entries
if initial entries fail; so just stop trying in that case.

Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:04 -05:00
J. Bruce Fields
df95a9d4fb knfsd: cache unregistration needn't return error
There's really nothing much the caller can do if cache unregistration
fails.  And indeed, all any caller does in this case is print an error
and continue.  So just return void and move the printk's inside
cache_unregister.

Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:04 -05:00
J. Bruce Fields
a490c681cb knfsd: fix cache.c comment
The path here must be left over from some earlier draft; fix it.  And do
some more minor cleanup while we're there.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:03 -05:00
Chuck Lever
e5cff482c7 SUNRPC: Use unsigned string lengths in xdr_decode_string_inplace
XDR strings, opaques, and net objects should all use unsigned lengths.
To wit, RFC 4506 says:

4.2.  Unsigned Integer

   An XDR unsigned integer is a 32-bit datum that encodes a non-negative
   integer in the range [0,4294967295].

 ...

4.11.  String

   The standard defines a string of n (numbered 0 through n-1) ASCII
   bytes to be the number n encoded as an unsigned integer (as described
   above), and followed by the n bytes of the string.

After this patch, xdr_decode_string_inplace now matches the other XDR
string and array helpers that take a string length argument.  See:

xdr_encode_opaque_fixed, xdr_encode_opaque, xdr_encode_array

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Acked-By: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:02 -05:00
Chuck Lever
01b2969a85 SUNRPC: Prevent length underflow in read_flush()
Make sure we compare an unsigned length to an unsigned count in
read_flush().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:02 -05:00
Johannes Berg
3eadf5f4f6 mac80211: fix initialisation error path
The error handling in ieee80211_init() is broken when any of
the built-in rate control algorithms fail to initialise, fix
it and rename the error labels.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-01 16:15:40 -05:00
Johannes Berg
f0b9205cfb mac80211 rate control: fix section mismatch
When the rate control algorithms are built-in, their exit
functions can be called from mac80211's init function so
they cannot be marked __exit.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Stefano Brivio <stefano.brivio@polimi.it>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-01 16:14:30 -05:00
Johannes Berg
6feeb8aad7 mac80211: make alignment warning optional
Driver authors should be aware of the alignment requirements, but
not everybody cares about the warning. This patch makes it depend
on a new Kconfig symbol MAC80211_DEBUG_PACKET_ALIGNMENT which can
be enabled regardless of MAC80211_DEBUG and is recommended for
driver authors (only). This also restricts the warning to data
packets so other packets need not be realigned to not trigger the
warning.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-01 16:12:24 -05:00
Klaus Heinrich Kiwi
7759db8277 [AUDIT] Add uid, gid fields to ANOM_PROMISCUOUS message
Changes the ANOM_PROMISCUOUS message to include uid and gid fields,
making it consistent with other AUDIT_ANOM_ messages and in the
format the userspace is expecting.

Signed-off-by: Klaus Heinrich Kiwi <klausk@br.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
2008-02-01 14:25:10 -05:00
Eric Paris
4746ec5b01 [AUDIT] add session id to audit messages
In order to correlate audit records to an individual login add a session
id.  This is incremented every time a user logs in and is included in
almost all messages which currently output the auid.  The field is
labeled ses=  or oses=

Signed-off-by: Eric Paris <eparis@redhat.com>
2008-02-01 14:06:51 -05:00
Al Viro
0c11b9428f [PATCH] switch audit_get_loginuid() to task_struct *
all callers pass something->audit_context

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-02-01 14:04:59 -05:00
Denis V. Lunev
4814bdbd59 [NETNS]: Lookup in FIB semantic hashes taking into account the namespace.
The namespace is not available in the fib_sync_down_addr, add it as a
parameter.

Looking up a device by the pointer to it is OK. Looking up using a
result from fib_trie/fib_hash table lookup is also safe. No need to
fix that at all.  So, just fix lookup by address and insertion to the
hash table path.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:41 -08:00
Denis V. Lunev
7462bd744e [NETNS]: Add a namespace mark to fib_info.
This is required to make fib_info lookups namespace aware. In the
other case initial namespace devices are marked as dead in the local
routing table during other namespace stop.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:40 -08:00
Denis V. Lunev
85326fa54b [IPV4]: fib_sync_down rework.
fib_sync_down can be called with an address and with a device. In
reality it is called either with address OR with a device. The
codepath inside is completely different, so lets separate it into two
calls for these two cases.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:39 -08:00
Denis V. Lunev
4b8aa9abee [NETNS]: Process interface address manipulation routines in the namespace.
The namespace is available when required except rtm_to_ifaddr. Add
namespace argument to it.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:39 -08:00
Denis V. Lunev
7b2185747c [IPV4]: Small style cleanup of the error path in rtm_to_ifaddr.
Remove error code assignment inside brackets on failure. The code
looks better if the error is assigned before condition check. Also,
the compiler treats this better.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:38 -08:00
Denis V. Lunev
dce5cbeec3 [IPV4]: Fix memory leak on error path during FIB initialization.
net->ipv4.fib_table_hash is not freed when fib4_rules_init failed.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:37 -08:00
Pavel Emelyanov
3ed5df445e [NETFILTER]: Ipv6-related xt_hashlimit compilation fix.
The hashlimit_ipv6_mask() is called from under IP6_NF_IPTABLES config
option, but is not under it by itself.

gcc warns us about it :) :
net/netfilter/xt_hashlimit.c:473: warning: "hashlimit_ipv6_mask" defined but not used

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:36 -08:00
Patrick McHardy
e5dfb81518 [NET_SCHED]: Add flow classifier
Add new "flow" classifier, which is meant to extend the SFQ hashing
capabilities without hard-coding new hash functions and also allows
deterministic mappings of keys to classes, replacing some out of tree
iptables patches like IPCLASSIFY (maps IPs to classes), IPMARK (maps
IPs to marks, with fw filters to classes), ...

Some examples:

- Classic SFQ hash:

  tc filter add ... flow hash \
  	keys src,dst,proto,proto-src,proto-dst divisor 1024

- Classic SFQ hash, but using information from conntrack to work properly in
  combination with NAT:

  tc filter add ... flow hash \
  	keys nfct-src,nfct-dst,proto,nfct-proto-src,nfct-proto-dst divisor 1024

- Map destination IPs of 192.168.0.0/24 to classids 1-257:

  tc filter add ... flow map \
  	key dst addend -192.168.0.0 divisor 256

- alternatively:

  tc filter add ... flow map \
  	key dst and 0xff

- similar, but reverse ordered:

  tc filter add ... flow map \
  	key dst and 0xff xor 0xff

Perturbation is currently not supported because we can't reliable kill the
timer on destruction.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:36 -08:00
Patrick McHardy
94de78d195 [NET_SCHED]: sch_sfq: make internal queues visible as classes
Add support for dumping statistics and make internal queues visible as
classes.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:35 -08:00
Patrick McHardy
7d2681a6ff [NET_SCHED]: sch_sfq: add support for external classifiers
Add support for external classifiers to allow using different flow
hash functions similar to ESFQ. When no classifier is attached the
built-in hash is used as before.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:34 -08:00
Patrick McHardy
5239008b0d [NET_SCHED]: Constify struct tcf_ext_map
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:34 -08:00
Dave Young
5396c9356e [BLUETOOTH]: Fix bugs in previous conn add/del workqueue changes.
Jens Axboe noticed that we were queueing &conn->work on both btaddconn
and keventd_wq.

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:33 -08:00
Adrian Bunk
30a50cc566 [TCP]: Unexport sysctl_tcp_tso_win_divisor
This patch removes the no longer used
EXPORT_SYMBOL(sysctl_tcp_tso_win_divisor).

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:32 -08:00
Adrian Bunk
0027ba8434 [IPV4]: Make struct ipv4_devconf static.
struct ipv4_devconf can now become static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:31 -08:00
Adrian Bunk
81a21eb4ec [TR] net/802/tr.c: sysctl_tr_rif_timeout static
sysctl_tr_rif_timeout can now become static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:31 -08:00
Masahide NAKAMURA
9472c9ef64 [XFRM]: Fix statistics.
o Outbound sequence number overflow error status
  is counted as XfrmOutStateSeqError.
o Additionaly, it changes inbound sequence number replay
  error name from XfrmInSeqOutOfWindow to XfrmInStateSeqError
  to apply name scheme above.
o Inbound IPv4 UDP encapsuling type mismatch error is wrongly
  mapped to XfrmInStateInvalid then this patch fiex the error
  to XfrmInStateMismatch.

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:30 -08:00
Adrian Bunk
5255dc6e14 [XFRM]: Remove unused exports.
This patch removes the following no longer used EXPORT_SYMBOL's:
- xfrm_input.c: xfrm_parse_spi
- xfrm_state.c: xfrm_replay_check
- xfrm_state.c: xfrm_replay_advance

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:29 -08:00
Roel Kluin
cc8fd14dca [PKT_SCHED] sch_teql.c: Duplicate IFF_BROADCAST in FMASK, remove 2nd.
Signed-off-by: Roel Kluin <12o3l@tiscali.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:29 -08:00
Eric Dumazet
29e75252da [IPV4] route cache: Introduce rt_genid for smooth cache invalidation
Current ip route cache implementation is not suited to large caches.

We can consume a lot of CPU when cache must be invalidated, since we
currently need to evict all cache entries, and this eviction is
sometimes asynchronous. min_delay & max_delay can somewhat control this
asynchronism behavior, but whole thing is a kludge, regularly triggering
infamous soft lockup messages. When entries are still in use, this also
consumes a lot of ram, filling dst_garbage.list.

A better scheme is to use a generation identifier on each entry,
so that cache invalidation can be performed by changing the table
identifier, without having to scan all entries.
No more delayed flushing, no more stalling when secret_interval expires.

Invalidated entries will then be freed at GC time (controled by
ip_rt_gc_timeout or stress), or when an invalidated entry is found
in a chain when an insert is done.
Thus we keep a normal equilibrium.

This patch :
- renames rt_hash_rnd to rt_genid (and makes it an atomic_t)
- Adds a new rt_genid field to 'struct rtable' (filling a hole on 64bit)
- Checks entry->rt_genid at appropriate places :
2008-01-31 19:28:27 -08:00
Jesse Brandeburg
174ce04831 [PKTGEN]: pktgen should not print info that it is spinning
when using pktgen to send delay packets the module prints repeatedly
to the kernel log:

sleeping for X
sleeping for X
...

This is probably just a debugging item left in and should not be
enabled for regular use of the module.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:26 -08:00
Patrick McHardy
72eb7bd269 [NET_SCHED]: sch_ingress: remove netfilter support
Since the old policer code is gone, TC actions are needed for policing.
The ingress qdisc can get packets directly from netif_receive_skb()
in case TC actions are enabled or through netfilter otherwise, but
since without TC actions there is no policer the only thing it actually
does is count packets.

Remove the netfilter support and always require TC actions.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:25 -08:00
Chris Leech
e83a2ea850 [VLAN]: set_rx_mode support for unicast address list
Reuse the existing logic for multicast list synchronization for the
unicast address list. The core of dev_mc_sync/unsync are split out as
__dev_addr_sync/unsync and moved from dev_mcast.c to dev.c.  These are
then used to implement dev_unicast_sync/unsync as well.

I'm working on cleaning up Intel's FCoE stack, which generates new MAC
addresses from the fibre channel device id assigned by the fabric as
per the current draft specification in T11.  When using such a
protocol in a VLAN environment it would be nice to not always be
forced into promiscuous mode, assuming the underlying Ethernet driver
supports multiple unicast addresses as well.

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-01-31 19:28:24 -08:00
Shan Wei
16ca3f9130 [TCP]: Fix a bug in strategy_allowed_congestion_control
In strategy_allowed_congestion_control of the 2.6.24 kernel, when
sysctl_string return 1 on success,it should call
tcp_set_allowed_congestion_control to set the allowed congestion
control.But, it don't.  the sysctl_string return 1 on success,
otherwise return negative, never return 0.The patch fix the problem.

Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:23 -08:00
Stephen Hemminger
71d67e666e [IPV4] fib_trie: rescan if key is lost during dump
Normally during a dump the key of the last dumped entry is used for
continuation, but since lock is dropped it might be lost. In that case
fallback to the old counter based N^2 behaviour.  This means the dump
will end up skipping some routes which matches what FIB_HASH does.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:23 -08:00
Rami Rosen
9fe7c712fc [PKTGEN]: Remove an unused definition in pktgen.c.
- Remove an unused definition (LAT_BUCKETS_MAX) in net/core/pktgen.c.
- Remove the corresponding comment.
- The LAT_BUCKETS_MAX seems to have to do with a patch from a long
time ago which was not applied (Ben Greear), which dealt with latency
counters.

See, for example : http://oss.sgi.com/archives/netdev/2002-09/msg00184.html

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:22 -08:00
Jim Paris
23717795be [IPV6]: Update MSS even if MTU is unchanged.
This is needed because in ndisc.c, we have:

  static void ndisc_router_discovery(struct sk_buff *skb)
  {
  // ...
  	if (ndopts.nd_opts_mtu) {
  // ...
  			if (rt)
  				rt->u.dst.metrics[RTAX_MTU-1] = mtu;

  			rt6_mtu_change(skb->dev, mtu);
  // ...
  }

Since the mtu is set directly here, rt6_mtu_change_route thinks that
it is unchanged, and so it fails to update the MSS accordingly.  This
patch lets rt6_mtu_change_route still update MSS if old_mtu == new_mtu.

Signed-off-by: Jim Paris <jim@jtan.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:21 -08:00
Pavel Emelyanov
fa4d3c6210 [NETNS]: Udp sockets per-net lookup.
Add the net parameter to udp_get_port family of calls and
udp_lookup one and use it to filter sockets.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:21 -08:00
Pavel Emelyanov
d86e0dac2c [NETNS]: Tcp-v6 sockets per-net lookup.
Add a net argument to inet6_lookup and propagate it further.
Actually, this is tcp-v6 implementation of what was done for
tcp-v4 sockets in a previous patch.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:20 -08:00
Pavel Emelyanov
c67499c0e7 [NETNS]: Tcp-v4 sockets per-net lookup.
Add a net argument to inet_lookup and propagate it further
into lookup calls. Plus tune the __inet_check_established.

The dccp and inet_diag, which use that lookup functions
pass the init_net into them.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:19 -08:00
Pavel Emelyanov
941b1d22cc [NETNS]: Make bind buckets live in net namespaces.
This tags the inet_bind_bucket struct with net pointer,
initializes it during creation and makes a filtering
during lookup.

A better hashfn, that takes the net into account is to
be done in the future, but currently all bind buckets
with similar port will be in one hash chain.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:18 -08:00
Pavel Emelyanov
5ee31fc1ec [INET]: Consolidate inet(6)_hash_connect.
These two functions are the same except for what they call
to "check_established" and "hash" for a socket.

This saves half-a-kilo for ipv4 and ipv6.

 add/remove: 1/0 grow/shrink: 1/4 up/down: 582/-1128 (-546)
 function                                     old     new   delta
 __inet_hash_connect                            -     577    +577
 arp_ignore                                   108     113      +5
 static.hint                                    8       4      -4
 rt_worker_func                               376     372      -4
 inet6_hash_connect                           584      25    -559
 inet_hash_connect                            586      25    -561

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:17 -08:00
Pavel Emelyanov
535174efbe [IPV6]: Introduce the INET6_TW_MATCH macro.
We have INET_MATCH, INET_TW_MATCH and INET6_MATCH to test sockets and
twbuckets for matching, but ipv6 twbuckets are tested manually.

Here's the INET6_TW_MATCH to help with it.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:17 -08:00
Patrick McHardy
22e0e62cd0 [NETFILTER]: xt_iprange: fix sparse warnings
CHECK   net/netfilter/xt_iprange.c
net/netfilter/xt_iprange.c:104:19: warning: restricted degrades to integer
net/netfilter/xt_iprange.c:104:37: warning: restricted degrades to integer
net/netfilter/xt_iprange.c:104:19: warning: restricted degrades to integer
net/netfilter/xt_iprange.c:104:37: warning: restricted degrades to integer
net/netfilter/xt_iprange.c:104:19: warning: restricted degrades to integer
net/netfilter/xt_iprange.c:104:37: warning: restricted degrades to integer
net/netfilter/xt_iprange.c:104:19: warning: restricted degrades to integer
net/netfilter/xt_iprange.c:104:37: warning: restricted degrades to integer

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:16 -08:00
Patrick McHardy
969d71089f [NETFILTER]: nf_nat: fix sparse warning
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:15 -08:00
Patrick McHardy
9e232495de [NETFILTER]: nf_conntrack: fix sparse warning
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:15 -08:00
Patrick McHardy
c392a74018 [NETFILTER]: {ip,ip6}_queue: fix build error
Reported by Ingo Molnar:

 net/built-in.o: In function `ip_queue_init':
 ip_queue.c:(.init.text+0x322c): undefined reference to `net_ipv4_ctl_path'

Fix the build error and also handle CONFIG_PROC_FS=n properly.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:14 -08:00
Jan Engelhardt
32948588ac [NETFILTER]: nf_conntrack: annotate l3protos with const
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:13 -08:00
Jan Engelhardt
7cc3864d39 [NETFILTER]: nf_{conntrack,nat}_icmp: constify and annotate
Constify a few data tables use const qualifiers on variables where
possible in the nf_conntrack_icmp* sources.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:12 -08:00
Jan Engelhardt
dc35dc5a4c [NETFILTER]: nf_{conntrack,nat}_proto_gre: annotate with const
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:12 -08:00
Jan Engelhardt
da3f13c95a [NETFILTER]: nf_{conntrack,nat}_proto_udp{,lite}: annotate with const
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:11 -08:00
Jan Engelhardt
82f568fc2f [NETFILTER]: nf_{conntrack,nat}_proto_tcp: constify and annotate TCP modules
Constify a few data tables use const qualifiers on variables where
possible in the nf_*_proto_tcp sources.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:10 -08:00
Jan Engelhardt
02e23f4057 [NETFILTER]: nf_conntrack_sane: annotate SANE helper with const
Annotate nf_conntrack_sane variables with const qualifier and remove
a few casts.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:10 -08:00
Jan Engelhardt
9ddd0ed050 [NETFILTER]: nf_{conntrack,nat}_pptp: annotate PPtP helper with const
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:09 -08:00
Jan Engelhardt
de24b4ebb8 [NETFILTER]: nf_{conntrack,nat}_tftp: annotate TFTP helper with const
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:08 -08:00
Jan Engelhardt
13f7d63c29 [NETFILTER]: nf_{conntrack,nat}_sip: annotate SIP helper with const
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:08 -08:00
Jan Engelhardt
905e3e8ec5 [NETFILTER]: nf_conntrack_h323: constify and annotate H.323 helper
Constify data tables (predominantly in nf_conntrack_h323_types.c, but
also a few in nf_conntrack_h323_asn1.c) and use const qualifiers on
variables where possible in the h323 sources.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:07 -08:00
Alexey Dobriyan
3cb609d57c [NETFILTER]: x_tables: create per-netns /proc/net/*_tables_*
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:06 -08:00
Alexey Dobriyan
715cf35ac9 [NETFILTER]: x_tables: netns propagation for /proc/net/*_tables_names
Propagate netns together with AF down to ->start/->next/->stop
iterators. Choose table based on netns and AF for showing.

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:05 -08:00
Alexey Dobriyan
025d93d148 [NETFILTER]: x_tables: semi-rewrite of /proc/net/foo_tables_*
There are many small but still wrong things with /proc/net/*_tables_*
so I decided to do overhaul simultaneously making it more suitable for
per-netns /proc/net/*_tables_* implementation.

Fix
a) xt_get_idx() duplicating now standard seq_list_start/seq_list_next
   iterators
b) tables/matches/targets list was chosen again and again on every ->next
c) multiple useless "af >= NPROTO" checks -- we simple don't supply invalid
   AFs there and registration function should BUG_ON instead.

   Regardless, the one in ->next() is the most useless -- ->next doesn't
   run at all if ->start fails.
d) Don't use mutex_lock_interruptible() -- it can fail and ->stop is
   executed even if ->start failed, so unlock without lock is possible.

As side effect, streamline code by splitting xt_tgt_ops into xt_target_ops,
xt_matches_ops, xt_tables_ops.

xt_tables_ops hooks will be changed by per-netns code. Code of
xt_matches_ops, xt_target_ops is identical except the list chosen for
iterating, but I think consolidating code for two files not worth it
given "<< 16" hacks needed for it.

[Patrick: removed unused enum in x_tables.c]

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:05 -08:00
Jan Engelhardt
09e410def6 [NETFILTER]: xt_hashlimit match, revision 1
Introduces the xt_hashlimit match revision 1. It adds support for
kernel-level inversion and grouping source and/or destination IP
addresses, allowing to limit on a per-subnet basis. While this would
technically obsolete xt_limit, xt_hashlimit is a more expensive due
to the hashbucketing.

Kernel-level inversion: Previously you had to do user-level inversion:

	iptables -N foo
	iptables -A foo -m hashlimit --hashlimit(-upto) 5/s -j RETURN
	iptables -A foo -j DROP
	iptables -A INPUT -j foo

now it is simpler:

	iptables -A INPUT -m hashlimit --hashlimit-over 5/s -j DROP

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:04 -08:00
Ilpo Järvinen
d33b7c06bd [NETFILTER]: nf_conntrack: kill unused static inline (do_iter)
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:03 -08:00
Ilpo Järvinen
a38201e3c9 [NETFILTER]: ipt_CLUSTERIP: kill clusterip_config_entry_get
It's unused static inline.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:02 -08:00
Eric Leblond
a83099a60f [NETFILTER]: nf_conntrack_netlink: transmit mark during all events
The following feature was submitted some months ago. It forces the dump
of mark during the connection destruction event. The induced load is
quiet small and the patch is usefull to provide an easy way to filter
event on user side without having to keep an hash in userspace.

Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:02 -08:00
Jan Engelhardt
1f807d6eb3 [NETFILTER]: nf_conntrack_h323: clean up code a bit
-total: 81 errors, 3 warnings, 876 lines checked
+total: 44 errors, 3 warnings, 876 lines checked

There is still work to be done, but that's for another patch.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:01 -08:00
Patrick McHardy
02502f6224 [NETFILTER]: nf_nat: switch rwlock to spinlock
Since we're using RCU, all users of nf_nat_lock take a write_lock.
Switch it to a spinlock.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:00 -08:00
Patrick McHardy
4d354c5782 [NETFILTER]: nf_nat: use RCU for bysource hash
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:00 -08:00
Patrick McHardy
c88130bcd5 [NETFILTER]: nf_conntrack: naming unification
Rename all "conntrack" variables to "ct" for more consistency and
avoiding some overly long lines.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:59 -08:00
Patrick McHardy
76eb946040 [NETFILTER]: nf_conntrack: don't inline early_drop()
early_drop() is only called *very* rarely, unfortunately gcc inlines it
into the hotpath because there is only a single caller. Explicitly mark
it noinline.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:58 -08:00
Patrick McHardy
0794935e21 [NETFILTER]: nf_conntrack: optimize hash_conntrack()
Avoid calling jhash three times and hash the entire tuple in one go.

  __hash_conntrack | -485 # 760 -> 275, # inlines: 3 -> 1, size inlines: 717 -> 252
 1 function changed, 485 bytes removed

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:57 -08:00
Patrick McHardy
ba419aff2c [NETFILTER]: nf_conntrack: optimize __nf_conntrack_find()
Ignoring specific entries in __nf_conntrack_find() is only needed by NAT
for nf_conntrack_tuple_taken(). Remove it from __nf_conntrack_find()
and make nf_conntrack_tuple_taken() search the hash itself.

Saves 54 bytes of text in the hotpath on x86_64:

  __nf_conntrack_find      |  -54 # 321 -> 267, # inlines: 3 -> 2, size inlines: 181 -> 127
  nf_conntrack_tuple_taken | +305 # 15 -> 320, lexblocks: 0 -> 3, # inlines: 0 -> 3, size inlines: 0 -> 181
  nf_conntrack_find_get    |   -2 # 90 -> 88
 3 functions changed, 305 bytes added, 56 bytes removed, diff: +249

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:55 -08:00
Patrick McHardy
f8ba1affa1 [NETFILTER]: nf_conntrack: switch rwlock to spinlock
With the RCU conversion only write_lock usages of nf_conntrack_lock are
left (except one read_lock that should actually use write_lock in the
H.323 helper). Switch to a spinlock.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:54 -08:00
Patrick McHardy
76507f69c4 [NETFILTER]: nf_conntrack: use RCU for conntrack hash
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:54 -08:00
Patrick McHardy
7d0742da1c [NETFILTER]: nf_conntrack_expect: use RCU for expectation hash
Use RCU for expectation hash. This doesn't buy much for conntrack
runtime performance, but allows to reduce the use of nf_conntrack_lock
for /proc and nf_netlink_conntrack.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:53 -08:00
Patrick McHardy
c52fbb410b [NETFILTER]: nf_conntrack_core: avoid taking nf_conntrack_lock in nf_conntrack_alter_reply
The conntrack is unconfirmed, so we have an exclusive reference, which
means that the write_lock is definitely unneeded. A read_lock used to
be needed for the helper lookup, but since we're using RCU for helpers
now rcu_read_lock is enough.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:52 -08:00
Patrick McHardy
58a3c9bb0c [NETFILTER]: nf_conntrack: use RCU for conntrack helpers
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:51 -08:00
Patrick McHardy
47d9504543 [NETFILTER]: nf_conntrack: fix accounting with fixed timeouts
Don't skip accounting for conntracks with the FIXED_TIMEOUT bit.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:51 -08:00
Patrick McHardy
1d670fdc8c [NETFILTER]: nf_conntrack_netlink: fix unbalanced locking
Properly drop nf_conntrack_lock on tuple parsing error.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:50 -08:00
Patrick McHardy
99fa5f5397 [NETFILTER]: nf_conntrack_ipv6: fix sparse warnings
CHECK   net/ipv6/netfilter/nf_conntrack_reasm.c
  net/ipv6/netfilter/nf_conntrack_reasm.c:77:18: warning: symbol 'nf_ct_ipv6_sysctl_table' was not declared. Should it be static?
  net/ipv6/netfilter/nf_conntrack_reasm.c:586:16: warning: symbol 'nf_ct_frag6_gather' was not declared. Should it be static?
  net/ipv6/netfilter/nf_conntrack_reasm.c:662:6: warning: symbol 'nf_ct_frag6_output' was not declared. Should it be static?
  net/ipv6/netfilter/nf_conntrack_reasm.c:683:5: warning: symbol 'nf_ct_frag6_kfree_frags' was not declared. Should it be static?
  net/ipv6/netfilter/nf_conntrack_reasm.c:698:5: warning: symbol 'nf_ct_frag6_init' was not declared. Should it be static?
  net/ipv6/netfilter/nf_conntrack_reasm.c:717:6: warning: symbol 'nf_ct_frag6_cleanup' was not declared. Should it be static?

Based on patch by Stephen Hemminger with suggestions by Yasuyuki KOZAKAI.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:49 -08:00
Patrick McHardy
b0a6363c24 [NETFILTER]: {ip,arp,ip6}_tables: fix sparse warnings in compat code
CHECK   net/ipv4/netfilter/ip_tables.c
net/ipv4/netfilter/ip_tables.c:1453:8: warning: incorrect type in argument 3 (different signedness)
net/ipv4/netfilter/ip_tables.c:1453:8:    expected int *size
net/ipv4/netfilter/ip_tables.c:1453:8:    got unsigned int [usertype] *size
net/ipv4/netfilter/ip_tables.c:1458:44: warning: incorrect type in argument 3 (different signedness)
net/ipv4/netfilter/ip_tables.c:1458:44:    expected int *size
net/ipv4/netfilter/ip_tables.c:1458:44:    got unsigned int [usertype] *size
net/ipv4/netfilter/ip_tables.c:1603:2: warning: incorrect type in argument 2 (different signedness)
net/ipv4/netfilter/ip_tables.c:1603:2:    expected unsigned int *i
net/ipv4/netfilter/ip_tables.c:1603:2:    got int *<noident>
net/ipv4/netfilter/ip_tables.c:1627:8: warning: incorrect type in argument 3 (different signedness)
net/ipv4/netfilter/ip_tables.c:1627:8:    expected int *size
net/ipv4/netfilter/ip_tables.c:1627:8:    got unsigned int *size
net/ipv4/netfilter/ip_tables.c:1634:40: warning: incorrect type in argument 3 (different signedness)
net/ipv4/netfilter/ip_tables.c:1634:40:    expected int *size
net/ipv4/netfilter/ip_tables.c:1634:40:    got unsigned int *size
net/ipv4/netfilter/ip_tables.c:1653:8: warning: incorrect type in argument 5 (different signedness)
net/ipv4/netfilter/ip_tables.c:1653:8:    expected unsigned int *i
net/ipv4/netfilter/ip_tables.c:1653:8:    got int *<noident>
net/ipv4/netfilter/ip_tables.c:1666:2: warning: incorrect type in argument 2 (different signedness)
net/ipv4/netfilter/ip_tables.c:1666:2:    expected unsigned int *i
net/ipv4/netfilter/ip_tables.c:1666:2:    got int *<noident>
  CHECK   net/ipv4/netfilter/arp_tables.c
net/ipv4/netfilter/arp_tables.c:1285:40: warning: incorrect type in argument 3 (different signedness)
net/ipv4/netfilter/arp_tables.c:1285:40:    expected int *size
net/ipv4/netfilter/arp_tables.c:1285:40:    got unsigned int *size
net/ipv4/netfilter/arp_tables.c:1543:44: warning: incorrect type in argument 3 (different signedness)
net/ipv4/netfilter/arp_tables.c:1543:44:    expected int *size
net/ipv4/netfilter/arp_tables.c:1543:44:    got unsigned int [usertype] *size
  CHECK   net/ipv6/netfilter/ip6_tables.c
net/ipv6/netfilter/ip6_tables.c:1481:8: warning: incorrect type in argument 3 (different signedness)
net/ipv6/netfilter/ip6_tables.c:1481:8:    expected int *size
net/ipv6/netfilter/ip6_tables.c:1481:8:    got unsigned int [usertype] *size
net/ipv6/netfilter/ip6_tables.c:1486:44: warning: incorrect type in argument 3 (different signedness)
net/ipv6/netfilter/ip6_tables.c:1486:44:    expected int *size
net/ipv6/netfilter/ip6_tables.c:1486:44:    got unsigned int [usertype] *size
net/ipv6/netfilter/ip6_tables.c:1631:2: warning: incorrect type in argument 2 (different signedness)
net/ipv6/netfilter/ip6_tables.c:1631:2:    expected unsigned int *i
net/ipv6/netfilter/ip6_tables.c:1631:2:    got int *<noident>
net/ipv6/netfilter/ip6_tables.c:1655:8: warning: incorrect type in argument 3 (different signedness)
net/ipv6/netfilter/ip6_tables.c:1655:8:    expected int *size
net/ipv6/netfilter/ip6_tables.c:1655:8:    got unsigned int *size
net/ipv6/netfilter/ip6_tables.c:1662:40: warning: incorrect type in argument 3 (different signedness)
net/ipv6/netfilter/ip6_tables.c:1662:40:    expected int *size
net/ipv6/netfilter/ip6_tables.c:1662:40:    got unsigned int *size
net/ipv6/netfilter/ip6_tables.c:1680:8: warning: incorrect type in argument 5 (different signedness)
net/ipv6/netfilter/ip6_tables.c:1680:8:    expected unsigned int *i
net/ipv6/netfilter/ip6_tables.c:1680:8:    got int *<noident>
net/ipv6/netfilter/ip6_tables.c:1693:2: warning: incorrect type in argument 2 (different signedness)
net/ipv6/netfilter/ip6_tables.c:1693:2:    expected unsigned int *i
net/ipv6/netfilter/ip6_tables.c:1693:2:    got int *<noident>

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:49 -08:00
Patrick McHardy
855304af29 [NETFILTER]: ipt_recent: fix sparse warnings
net/ipv4/netfilter/ipt_recent.c:215:17: warning: symbol 't' shadows an earlier one
net/ipv4/netfilter/ipt_recent.c:179:22: originally declared here
net/ipv4/netfilter/ipt_recent.c:322:13: warning: context imbalance in 'recent_seq_start' - wrong count at exit
net/ipv4/netfilter/ipt_recent.c:354:13: warning: context imbalance in 'recent_seq_stop' - unexpected unlock

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:48 -08:00
Stephen Hemminger
dc64d02ba8 [NETFILTER]: nf_conntrack_h3223: sparse fixes
Sparse complains when a function is not really static. Putting static
on the function prototype is not enough.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:47 -08:00
Stephen Hemminger
f4f6fb714f [NETFILTER]: more sparse fixes
Some lock annotations, and make initializers static.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:46 -08:00
Stephen Hemminger
2f0d2f1039 [NETFILTER]: conntrack: get rid of sparse warnings
Teach sparse about locking here, and fix signed/unsigned warnings.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:46 -08:00
Stephen Hemminger
4e26fe2681 [NETFILTER]: nfnetlink_log: sparse warning fixes
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:45 -08:00
Stephen Hemminger
96eb24d770 [NETFILTER]: nf_conntrack: sparse warnings
The hashtable size is really unsigned so sparse complains when you pass
a signed integer.  Change all uses to make it consistent.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:44 -08:00
Stephen Hemminger
06aa10728e [NETFILTER]: nf_nat_snmp: sparse warning
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:44 -08:00
Jan Engelhardt
edc26f7aaa [NETFILTER]: xt_owner: allow matching UID/GID ranges
Add support for ranges to the new revision. This doesn't affect
compatibility since the new revision was not released yet.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:43 -08:00
Jan Engelhardt
37c08387fc [NETFILTER]: xt_TCPMSS: consider reverse route's MTU in clamp-to-pmtu
The TCPMSS target in Xtables should consider the MTU of the reverse
route on forwarded packets as part of the path MTU.

Point in case: IN=ppp0, OUT=eth0. MSS set to 1460 in spite of MTU of
ppp0 being 1392.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:42 -08:00
Alexey Dobriyan
df200969b1 [NETFILTER]: netns: put table module on netns stop
When number of entries exceeds number of initial entries, foo-tables code
will pin table module. But during table unregister on netns stop,
that additional pin was forgotten.

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:41 -08:00
Alexey Dobriyan
9ea0cb2601 [NETFILTER]: arp_tables: per-netns arp_tables FILTER
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:41 -08:00
Alexey Dobriyan
79df341ab6 [NETFILTER]: arp_tables: netns preparation
* Propagate netns from userspace.
* arpt_register_table() registers table in supplied netns.

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:40 -08:00
Alexey Dobriyan
8280aa6182 [NETFILTER]: ip6_tables: per-netns IPv6 FILTER, MANGLE, RAW
Now it's possible to list and manipulate per-netns ip6tables rules.
Filtering decisions are based on init_net's table so far.

P.S.: remove init_net check in inet6_create() to see the effect

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:39 -08:00
Alexey Dobriyan
336b517fdc [NETFILTER]: ip6_tables: netns preparation
* Propagate netns from userspace down to xt_find_table_lock()
* Register ip6 tables in netns (modules still use init_net)

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:39 -08:00
Alexey Dobriyan
9335f047fe [NETFILTER]: ip_tables: per-netns FILTER, MANGLE, RAW
Now, iptables show and configure different set of rules in different
netnss'. Filtering decisions are still made by consulting only
init_net's set.

Changes are identical except naming so no splitting.

P.S.: one need to remove init_net checks in nf_sockopt.c and inet_create()
      to see the effect.

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:38 -08:00
Alexey Dobriyan
34bd137ba7 [NETFILTER]: ip_tables: propagate netns from userspace
.. all the way down to table searching functions.

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:37 -08:00
Alexey Dobriyan
44d34e721e [NETFILTER]: x_tables: return new table from {arp,ip,ip6}t_register_table()
Typical table module registers xt_table structure (i.e. packet_filter)
and link it to list during it. We can't use one template for it because
corresponding list_head will become corrupted. We also can't unregister
with template because it wasn't changed at all and thus doesn't know in
which list it is.

So, we duplicate template at the very first step of table registration.
Table modules will save it for use during unregistration time and actual
filtering.

Do it at once to not screw bisection.

P.S.: renaming i.e. packet_filter => __packet_filter is temporary until
      full netnsization of table modules is done.

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:36 -08:00
Alexey Dobriyan
8d87005207 [NETFILTER]: x_tables: per-netns xt_tables
In fact all we want is per-netns set of rules, however doing that will
unnecessary complicate routines such as ipt_hook()/ipt_do_table, so
make full xt_table array per-netns.

Every user stubbed with init_net for a while.

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:35 -08:00
Alexey Dobriyan
a98da11d88 [NETFILTER]: x_tables: change xt_table_register() return value convention
Switch from 0/-E to ptr/PTR_ERR convention.

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:35 -08:00
Jan Engelhardt
30083c9500 [NETFILTER]: ebtables: mark matches, targets and watchers __read_mostly
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:34 -08:00
Jan Engelhardt
f776c4cda4 [NETFILTER]: ebtables: Update modules' descriptions
Update the MODULES_DESCRIPTION() tags for all Ebtables modules.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:33 -08:00
Jan Engelhardt
abfdf1c489 [NETFILTER]: ebtables: remove casts, use consts
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:33 -08:00
Jan Engelhardt
b41649989c [NETFILTER]: xt_conntrack: add port and direction matching
Extend the xt_conntrack match revision 1 by port matching (all four
{orig,repl}{src,dst}) and by packet direction matching.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:31 -08:00
Patrick McHardy
41d0cdedd5 [NETFILTER]: nfnetlink_log: fix typo
It should use htonl for the GID, not htons.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:30 -08:00
Patrick McHardy
2fd8e526f4 [NETFILTER]: bridge netfilter: remove nf_bridge_info read-only netoutdev member
Before the removal of the deferred output hooks, netoutdev was used in
case of VLANs on top of a bridge to store the VLAN device, so the
deferred hooks would see the correct output device. This isn't
necessary anymore since we're calling the output hooks for the correct
device directly in the IP stack.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:29 -08:00
Patrick McHardy
d44caf88e8 [NETFILTER]: nf_nat: remove double bysource hash initialization
The hash table is already initialized by nf_ct_alloc_hashtable().

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:28 -08:00
Jan Engelhardt
ecb6f85e11 [NETFILTER]: Use const in struct xt_match, xt_target, xt_table
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:28 -08:00
Eric Dumazet
ca7c48ca97 [NETFILTER]: Supress some sparse warnings
CHECK   net/netfilter/nf_conntrack_expect.c
net/netfilter/nf_conntrack_expect.c:429:13: warning: context imbalance in 'exp_seq_start' - wrong count at exit
net/netfilter/nf_conntrack_expect.c:441:13: warning: context imbalance in 'exp_seq_stop' - unexpected unlock
  CHECK   net/netfilter/nf_log.c
net/netfilter/nf_log.c:105:13: warning: context imbalance in 'seq_start' - wrong count at exit
net/netfilter/nf_log.c:125:13: warning: context imbalance in 'seq_stop' - unexpected unlock
  CHECK   net/netfilter/nfnetlink_queue.c
net/netfilter/nfnetlink_queue.c:363:7: warning: symbol 'size' shadows an earlier one
net/netfilter/nfnetlink_queue.c:217:9: originally declared here
net/netfilter/nfnetlink_queue.c:847:13: warning: context imbalance in 'seq_start' - wrong count at exit
net/netfilter/nfnetlink_queue.c:859:13: warning: context imbalance in 'seq_stop' - unexpected unlock

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:27 -08:00
Denis V. Lunev
3046d76746 [RAW]: Wrong content of the /proc/net/raw6.
The address of IPv6 raw sockets was shown in the wrong format, from
IPv4 ones.  The problem has been introduced by the commit
42a73808ed ("[RAW]: Consolidate proc
interface.")

Thanks to Adrian Bunk who originally noticed the problem.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:26 -08:00
Denis V. Lunev
8cd850efa4 [RAW]: Cleanup IPv4 raw_seq_show.
There is no need to use 128 bytes on the stack at all. Clean the code
in the IPv6 style.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:25 -08:00
Denis V. Lunev
377cf82d66 [RAW]: Family check in the /proc/net/raw[6] is extra.
Different hashtables are used for IPv6 and IPv4 raw sockets, so no
need to check the socket family in the iterator over hashtables. Clean
this out.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:24 -08:00
Herbert Xu
b1641064a3 [IPCOMP]: Fix reception of incompressible packets
I made a silly typo by entering IPPROTO_IP (== 0) instead of
IPPROTO_IPIP (== 4).  This broke the reception of incompressible
packets.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:24 -08:00
Eric Dumazet
e242297055 [NET]: should explicitely initialize atomic_t field in struct dst_ops
All but one struct dst_ops static initializations miss explicit
initialization of entries field.

As this field is atomic_t, we should use ATOMIC_INIT(0), and not
rely on atomic_t implementation.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:23 -08:00
Ilpo Järvinen
ad1984e844 [TCP]: NewReno must count every skb while marking losses
NewReno should add cnt per skb (as with FACK) instead of depending on
SACKED_ACKED bits which won't be set with it at all.  Effectively,
NewReno should always exists after the first iteration anyway (or
immediately if there's already head in lost_out.

This was fixed earlier in net-2.6.25 but got reverted among other
stuff and I didn't notice that this is still necessary (actually
wasn't even considering this case while trying to figure out the
reports because I lived with different kind of code than it in reality
was).

This should solve the WARN_ONs in TCP code that as a result of this
triggered multiple times in every place we check for this invariant.

Special thanks to Dave Young <hidave.darkstar@gmail.com> and Krishna
Kumar2 <krkumar2@in.ibm.com> for trying with my debug patches.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Tested-by: Dave Young <hidave.darkstar@gmail.com>
Tested-by: Krishna Kumar2 <krkumar2@in.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:22 -08:00
Pavel Emelyanov
23fe18669e [NETNS]: Fix race between put_net() and netlink_kernel_create().
The comment about "race free view of the set of network
namespaces" was a bit hasty. Look (there even can be only
one CPU, as discovered by Alexey Dobriyan and Denis Lunev):

put_net()
  if (atomic_dec_and_test(&net->refcnt))
    /* true */
      __put_net(net);
        queue_work(...);

/*
 * note: the net now has refcnt 0, but still in
 * the global list of net namespaces
 */

== re-schedule ==

register_pernet_subsys(&some_ops);
  register_pernet_operations(&some_ops);
    (*some_ops)->init(net);
      /*
       * we call netlink_kernel_create() here
       * in some places
       */
      netlink_kernel_create();
         sk_alloc();
            get_net(net); /* refcnt = 1 */
         /*
          * now we drop the net refcount not to
          * block the net namespace exit in the
          * future (or this can be done on the
          * error path)
          */
         put_net(sk->sk_net);
             if (atomic_dec_and_test(&...))
                   /*
                    * true. BOOOM! The net is
                    * scheduled for release twice
                    */

When thinking on this problem, I decided, that getting and
putting the net in init callback is wrong. If some init
callback needs to have a refcount-less reference on the struct
net, _it_ has to be careful himself, rather than relying on
the infrastructure to handle this correctly.

In case of netlink_kernel_create(), the problem is that the
sk_alloc() gets the given namespace, but passing the info
that we don't want to get it inside this call is too heavy.

Instead, I propose to crate the socket inside an init_net
namespace and then re-attach it to the desired one right
after the socket is created.

After doing this, we also have to be careful on error paths
not to drop the reference on the namespace, we didn't get
the one on.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Denis Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:22 -08:00
Eric Dumazet
533cb5b0a6 [XFRM]: constify 'struct xfrm_type'
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:20 -08:00
Benjamin Thery
2216b48376 [NETNS]: Add missing initialization of nl_info.nl_net in rtm_to_fib6_config()
Add missing initialization of the new nl_info.nl_net field in
rtm_to_fib6_config(). This will be needed the store network namespace
associated to the fib6_config struct.

Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:20 -08:00
Laszlo Attila Toth
4a19ec5800 [NET]: Introducing socket mark socket option.
A userspace program may wish to set the mark for each packets its send
without using the netfilter MARK target. Changing the mark can be used
for mark based routing without netfilter or for packet filtering.

It requires CAP_NET_ADMIN capability.

Signed-off-by: Laszlo Attila Toth <panther@balabit.hu>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:19 -08:00
Jan Engelhardt
036c2e27bc [AF_RXRPC]: constify function pointer tables
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:18 -08:00
Dave Young
b6c0632105 [BLUETOOTH]: Add conn add/del workqueues to avoid connection fail.
The bluetooth hci_conn sysfs add/del executed in the default
workqueue.  If the del_conn is executed after the new add_conn with
same target, add_conn will failed with warning of "same kobject name".

Here add btaddconn & btdelconn workqueues, flush the btdelconn
workqueue in the add_conn function to avoid the issue.

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:12 -08:00
Herbert Xu
2614fa59fa [IPCOMP]: Fetch nexthdr before ipch is destroyed
When I moved the nexthdr setting out of IPComp I accidently moved
the reading of ipch->nexthdr after the decompression.  Unfortunately
this means that we'd be reading from a stale ipch pointer which
doesn't work very well.

This patch moves the reading up so that we get the correct nexthdr
value.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:11 -08:00
Julian Anastasov
936f6f8e1b [IPV4] fib_trie: apply fixes from fib_hash
Update fib_trie with some fib_hash fixes:
- check for duplicate alternative routes for prefix+tos+priority when
replacing route
- properly insert by matching tos together with priority
- fix alias walking to use list_for_each_entry_continue for insertion
and deletion when fa_head is not NULL
- copy state from fa to new_fa on replace (not a problem for now)
- additionally, avoid replacement without error if new route is same,
as Joonwoo Park suggests.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:10 -08:00
Julian Anastasov
c18865f392 [IPV4] fib: fix route replacement, fib_info is shared
fib_info can be shared by many route prefixes but we don't want
duplicate alternative routes for a prefix+tos+priority. Last change
was not correct to check fib_treeref because it accounts usage from
other prefixes. Additionally, avoid replacement without error if new
route is same, as Joonwoo Park suggests.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:10 -08:00
Wei Yongjun
ec9dbb1c3e [SCTP]: Fix miss of report unrecognized HMAC Algorithm parameter
This patch fix miss of check for report unrecognized HMAC Algorithm
parameter.  When AUTH is disabled, goto fall through path to report
unrecognized parameter, else, just break

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:09 -08:00
Arnaldo Carvalho de Melo
8cf8e5a67f [INET_DIAG]: Fix inet_diag_lock_handler error path.
Fixes: http://bugzilla.kernel.org/show_bug.cgi?id=9825

The inet_diag_lock_handler function uses ERR_PTR to encode errors but
its callers were testing against NULL.

This only happens when the only inet_diag modular user, DCCP, is not
built into the kernel or available as a module.

Also there was a problem with not dropping the mutex lock when a handler
was not found, also fixed in this patch.

This caused an OOPS and ss would then hang on subsequent calls, as
&inet_diag_table_mutex was being left locked.

Thanks to spike at ml.yaroslavl.ru for report it after trying 'ss -d'
on a kernel that doesn't have DCCP available.

This bug was introduced in cset
d523a328fb ("Fix inet_diag dead-lock
regression"), after 2.6.24-rc3, so just 2.6.24 seems to be affected.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:08 -08:00
Herbert Xu
29ffe1a5c5 [INET]: Prevent out-of-sync truesize on ip_fragment slow path
When ip_fragment has to hit the slow path the value of skb->truesize
may go out of sync because we would have updated it without changing
the packet length.  This violates the constraints on truesize.

This patch postpones the update of skb->truesize to prevent this.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:07 -08:00
maximilian attems
1987e7b485 [AX25]: Kill ax25_bind() user triggable printk.
on the last run overlooked that sfuzz triggable message.
move the message to the corresponding comment.

Signed-off-by: maximilian attems <max@stro.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:06 -08:00
maximilian attems
a44562e4d3 [AX25]: Beautify x25_init() version printk.
kill ref to old version and dup Linux.

Signed-off-by: maximilian attems <max@stro.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:06 -08:00
Ilpo Järvinen
8674204a49 [NET] 9p: kill dead static inline buf_put_string
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Acked-by: Eric Van Hensbergen <ericvh@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:05 -08:00
Herbert Xu
1a6509d991 [IPSEC]: Add support for combined mode algorithms
This patch adds support for combined mode algorithms with GCM being
the first algorithm supported.

Combined mode algorithms can be added through the xfrm_user interface
using the new algorithm payload type XFRMA_ALG_AEAD.  Each algorithms
is identified by its name and the ICV length.

For the purposes of matching algorithms in xfrm_tmpl structures,
combined mode algorithms occupy the same name space as encryption
algorithms.  This is in line with how they are negotiated using IKE.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:03 -08:00
Herbert Xu
6fbf2cb774 [IPSEC]: Allow async algorithms
Now that ESP uses authenc we can turn on the support for async
algorithms in IPsec.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:02 -08:00
Herbert Xu
38320c70d2 [IPSEC]: Use crypto_aead and authenc in ESP
This patch converts ESP to use the crypto_aead interface and in particular
the authenc algorithm.  This lays the foundations for future support of
combined mode algorithms.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:27:02 -08:00
Iñaky Pérez-González
303d9bf6bb rfkill: add the WiMAX radio type
Teach rfkill about wimax radios.

Had to define a KEY_WIMAX as a 'key for disabling only wimax radios',
as other radio technologies have. This makes sense as hardware has
specific keys for disabling specific radios.

The RFKILL enabling part is, otherwise, a copy and paste of any other
radio technology.

Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:26:46 -08:00
Ron Rindjunsky
8b6bbe7538 mac80211: fixing null qos data frames check for reordering buffer
This patch fixes a wrong condition for null qos data frames, causing us to
drop data frames needed for reordering as well.

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:26:38 -08:00
Johannes Berg
691ba2346d mac80211: fix alignment warning
When I introduced the alignment warning I forgot the A-MSDU case which
has a different requirement because each frame contains 14-byte 802.3
headers in front of the IP payload. This patch moves the alignment
warning to a place where we know whether we're dealing with an A-MSDU
frame and adjusts it accordingly.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:26:34 -08:00
Linus Torvalds
75659ca0c1 Merge branch 'task_killable' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc
* 'task_killable' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc: (22 commits)
  Remove commented-out code copied from NFS
  NFS: Switch from intr mount option to TASK_KILLABLE
  Add wait_for_completion_killable
  Add wait_event_killable
  Add schedule_timeout_killable
  Use mutex_lock_killable in vfs_readdir
  Add mutex_lock_killable
  Use lock_page_killable
  Add lock_page_killable
  Add fatal_signal_pending
  Add TASK_WAKEKILL
  exit: Use task_is_*
  signal: Use task_is_*
  sched: Use task_contributes_to_load, TASK_ALL and TASK_NORMAL
  ptrace: Use task_is_*
  power: Use task_is_*
  wait: Use TASK_NORMAL
  proc/base.c: Use task_is_*
  proc/array.c: Use TASK_REPORT
  perfmon: Use task_is_*
  ...

Fixed up conflicts in NFS/sunrpc manually..
2008-02-01 11:45:47 +11:00
Linus Torvalds
44c3b59102 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6:
  security: compile capabilities by default
  selinux: make selinux_set_mnt_opts() static
  SELinux: Add warning messages on network denial due to error
  SELinux: Add network ingress and egress control permission checks
  NetLabel: Add auditing to the static labeling mechanism
  NetLabel: Introduce static network labels for unlabeled connections
  SELinux: Allow NetLabel to directly cache SIDs
  SELinux: Enable dynamic enable/disable of the network access checks
  SELinux: Better integration between peer labeling subsystems
  SELinux: Add a new peer class and permissions to the Flask definitions
  SELinux: Add a capabilities bitmap to SELinux policy version 22
  SELinux: Add a network node caching mechanism similar to the sel_netif_*() functions
  SELinux: Only store the network interface's ifindex
  SELinux: Convert the netif code to use ifindex values
  NetLabel: Add IP address family information to the netlbl_skbuff_getattr() function
  NetLabel: Add secid token support to the NetLabel secattr struct
  NetLabel: Consolidate the LSM domain mapping/hashing locks
  NetLabel: Cleanup the LSM domain hash functions
  NetLabel: Remove unneeded RCU read locks
2008-01-31 09:32:24 +11:00
Linus Torvalds
dd430ca20c Merge git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86
* git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86: (890 commits)
  x86: fix nodemap_size according to nodeid bits
  x86: fix overlap between pagetable with bss section
  x86: add PCI IDs to k8topology_64.c
  x86: fix early_ioremap pagetable ops
  x86: use the same pgd_list for PAE and 64-bit
  x86: defer cr3 reload when doing pud_clear()
  x86: early boot debugging via FireWire (ohci1394_dma=early)
  x86: don't special-case pmd allocations as much
  x86: shrink some ifdefs in fault.c
  x86: ignore spurious faults
  x86: remove nx_enabled from fault.c
  x86: unify fault_32|64.c
  x86: unify fault_32|64.c with ifdefs
  x86: unify fault_32|64.c by ifdef'd function bodies
  x86: arch/x86/mm/init_32.c printk fixes
  x86: arch/x86/mm/init_32.c cleanup
  x86: arch/x86/mm/init_64.c printk fixes
  x86: unify ioremap
  x86: fixes some bugs about EFI memory map handling
  x86: use reboot_type on EFI 32
  ...
2008-01-31 00:40:09 +11:00
Linus Torvalds
44c45eb911 Make !NETFILTER_ADVANCED enable IP6_NF_MATCH_IPV6HEADER
We want IPV6HEADER matching for the non-advanced default netfilter
configuration, since it's part of the standard netfilter setup of at
least some distributions (eg Fedora).

Otherwise NETFILTER_ADVANCED loses much of its point, since even
non-advanced users would have to enable all the advanced options just to
get a working IPv6 netfilter setup.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-31 00:26:10 +11:00
travis@sgi.com
df3825c56d x86: change NR_CPUS arrays in numa_64
Change the following static arrays sized by NR_CPUS to
per_cpu data variables:

	char cpu_to_node_map[NR_CPUS];

Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:33:11 +01:00
Trond Myklebust
34f5b4662b SUNRPC: Don't bother changing the sigmask for asynchronous RPC calls
The caller will never sleep in rpc_execute, so don't bother setting the
sigmask.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:10 -05:00
Chuck Lever
afc881124b SUNRPC: rpcb_getport_sync() passes incorrect address size to rpc_create()
The variable "sin" is a pointer, so sizeof(sin) is the size of a pointer,
not the size of thing that sin points to.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:09 -05:00
Chuck Lever
67d6021362 SUNRPC: Clean up block comment preceding rpcb_getport_sync()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:09 -05:00
Chuck Lever
f1ec08cb94 SUNRPC: Use appropriate argument types in rpcb client
Clean up: Follow recommendations of Chapter 5 of Documentation/CodingStyle
and use "u32" instead of "__u32" for types in definitions that are not
shared with user space.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:09 -05:00
Chuck Lever
b91e101fca SUNRPC: rpcb_getport_sync() should use built-in hostname generator
rpc_create() can already fill in the hostname with a string representation
of the server's IP address, so remove redundant logic in in
rpcb_getport_sync() that does that.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:09 -05:00
Chuck Lever
33e01dc7f5 SUNRPC: Clean up functions that free address_strings array
Clean up: document the rule (kfree) and the exceptions
(RPC_DISPLAY_PROTO and RPC_DISPLAY_NETID) when freeing the objects in
a transport's address_strings array.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:08 -05:00
Trond Myklebust
86d61d8638 SUNRPC: Fix up constant string declarations in struct rpcbind_args
...and eliminate an unnecessary cast.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:06 -05:00
Chuck Lever
b454ae9060 SUNRPC: fewer conditionals in the format_ip_address routines
Clean up: have the set up routines explicitly pass the strings to be used
for the transport name and NETID.  This removes a number of conditionals
and dependencies on rpc_xprt.prot, which is overloaded.

Tighten up type checking on the address_strings array while we're at it.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:04 -05:00
Chuck Lever
7df089952f SUNRPC: Fix use of copy_to_user() in gss_pipe_upcall()
The gss_pipe_upcall() function expects the copy_to_user() function to
return a negative error value if the call fails, but copy_to_user()
returns an unsigned long number of bytes that couldn't be copied.

Can rpc_pipefs actually retry a partially completed upcall read?  If
not, then gss_pipe_upcall() should punt any partial read, just like the
upcall logic in net/sunrpc/cache.c.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:00 -05:00
Trond Myklebust
ba7392bb37 SUNRPC: Add support for per-client timeout values
In order to be able to support setting the timeo and retrans parameters on
a per-mountpoint basis, we move the rpc_timeout structure into the
rpc_clnt.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:59 -05:00
Trond Myklebust
2881ae74e6 SUNRPC: Clean up the transport timeout initialisation
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:58 -05:00
Trond Myklebust
698b6d088e SUNRPC: cleanup for rpc_new_client()
There is no reason why we shouldn't just pass the rpc_create_args.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:58 -05:00
Chuck Lever
0fb2b7e945 SUNRPC: Move universal address definitions to global header
Universal addresses are defined in RFC 1833 and clarified in RFC 3530.  We
need to use them in several places in the NFS and RPC clients, so move the
relevant definition and block comment to an appropriate global include
file.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:50 -05:00
Chuck Lever
0a48f5d70f SUNRPC: RPC version numbers are u32
Clean up: use correct type for RPC version numbers in rpcbind client.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:50 -05:00
Chuck Lever
9f6ad26d2a SUNRPC: Fix socket address handling in rpcb_clnt
Make sure rpcb_clnt passes the correct address length to rpc_create().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:50 -05:00
Chuck Lever
510deb0d70 SUNRPC: rpc_create() default hostname should support AF_INET6 addresses
If the ULP doesn't pass a hostname string to rpc_create(), it manufactures
one based on the passed-in address.  Be smart enough to handle an AF_INET6
address properly in this case.

Move the default servername logic before the xprt_create_transport() call
to simplify error handling in rpc_create().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:49 -05:00
Chuck Lever
9b45b74ce2 SUNRPC: Remove an unneeded implicit type cast when calling rpc_depopulate()
The two arguments of rpc_depopulate() that pass in inode numbers should use
the same type as inode->i_ino: unsigned long.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:43 -05:00
Chuck Lever
322e2efe62 SUNRPC: temp var should match return type of xdr_skb_read_actor
The return type of xdr_skb_read_actor functions is size_t.  This fixes a
nit I unwittingly overlooked in commit dd456471.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:43 -05:00
Chuck Lever
5d40a8a525 SUNRPC: Check a return result
Minor: Replace an empty if statement with a debugging dprintk.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Thomas Talpey <Thomas.Talpey@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:42 -05:00
Chuck Lever
d4b37ff735 SUNRPC: Fix an unnecessary implicit type cast in rpcrdma_count_chunks()
Nit: rl_nchunks is an unsigned integer, so pass it into
rpcrdma_count_chunks() via an unsigned integer argument.  This eliminates
a harmless mixed sign comparison in rpcrdma_count_chunks()

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Thomas Talpey <Thomas.Talpey@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:42 -05:00
Chuck Lever
2a428b2b8f SUNRPC: Prevent mixed sign comparisons in rpcrdma_convert_iovs()
Keep the type of the buffer position the same during iovec conversion to
reduce the likelihood of unexpected results from comparisons and length
computations.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Thomas Talpey <Thomas.Talpey@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:41 -05:00
Trond Myklebust
a4a874990c SUNRPC: Cleanup to remove the last users of the RPC_WAITQ declaration
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:41 -05:00
Trond Myklebust
47fe064831 SUNRPC: Unexport rpc_init_task() and rpc_execute()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:40 -05:00
Trond Myklebust
e8f5d77c80 SUNRPC: allow the caller of rpc_run_task to preallocate the struct rpc_task
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:38 -05:00
Trond Myklebust
b5627943ab SUNRPC: Remove the now unused function rpc_call_setup()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:35 -05:00
Trond Myklebust
5138fde011 NFS/SUNRPC: Convert all users of rpc_call_setup()
Replace use of rpc_call_setup() with rpc_init_task(), and in cases where we
need to initialise task->tk_action, with rpc_call_start().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:32 -05:00
Trond Myklebust
b3ef8b3bb9 SUNRPC: Allow rpc_init_task() to initialise the rpc_task->tk_msg
In preparation for the removal of rpc_call_setup().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:31 -05:00
Trond Myklebust
77de2c590e SUNRPC: Add a helper rpc_call_start() that initialises task->tk_action
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:31 -05:00
Trond Myklebust
5085925902 SUNRPC: Mask signals across the call to rpc_call_setup() in rpc_run_task
To ensure that the RPCSEC_GSS upcall is performed with the correct sigmask.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:31 -05:00
Trond Myklebust
3ff7576dda SUNRPC: Clean up the initialisation of priority queue scheduling info.
We want the default scheduling priority (priority == 0) to remain
RPC_PRIORITY_NORMAL.

Also ensure that the priority wait queue scheduling is per process id
instead of sometimes being per thread, and sometimes being per inode.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:30 -05:00
Trond Myklebust
c970aa85e7 SUNRPC: Clean up rpc_run_task
Make it use the new task initialiser structure instead of acting as a
wrapper.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:30 -05:00
Trond Myklebust
84115e1cd4 SUNRPC: Cleanup of rpc_task initialisation
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:30 -05:00
Trond Myklebust
e8914c65f7 SUNRPC: Restrict sunrpc client exports
The sunrpc client exports are not meant to be part of any official kernel
API: they can change at the drop of a hat. Mark them as internal functions
using EXPORT_SYMBOL_GPL.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:28 -05:00
Trond Myklebust
a6eaf8bdf9 SUNRPC: Move exported declarations to the function declarations
Do this for all RPC client related functions and XDR functions.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:28 -05:00
J. Bruce Fields
93a44a75b9 sunrpc: document the rpc_pipefs kernel api
Add kerneldoc comments for the rpc_pipefs.c functions that are exported.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:27 -05:00
Trond Myklebust
663b8858dd SUNRPC: Reconnect immediately whenever the server isn't refusing it.
If we've disconnected from the server, rather than the other way round,
then it makes little sense to wait 3 seconds before reconnecting.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:27 -05:00
Trond Myklebust
62da3b2488 SUNRPC: Rename xprt_disconnect()
xprt_disconnect() should really only be called when the transport shutdown
is completed, and it is time to wake up any pending tasks. Rename it to
xprt_disconnect_done() in order to reflect the semantical change.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:27 -05:00
Trond Myklebust
3ebb067d92 SUNRPC: Make call_status()/call_decode() call xprt_force_disconnect()
Move the calls to xprt_disconnect() over to xprt_force_disconnect() in
order to enable the transport layer to manage the state of the
XPRT_CONNECTED flag.
Ditto in xs_tcp_read_fraghdr().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:26 -05:00
Trond Myklebust
7272dcd31d SUNRPC: xprt_autoclose() should not call xprt_disconnect()
The transport layer should do that itself whenever appropriate.

Note that the RDMA transport already assumes that it needs to call
xprt_disconnect in xprt_rdma_close().
For TCP sockets, we want to call xprt_disconnect() only after the
connection has been closed by both ends.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:26 -05:00
Trond Myklebust
e06799f958 SUNRPC: Use shutdown() instead of close() when disconnecting a TCP socket
By using shutdown() rather than close() we allow the RPC client to wait
for the TCP close handshake to complete before we start trying to reconnect
using the same port.
We use shutdown(SHUT_WR) only instead of shutting down both directions,
however we wait until the server has closed the connection on its side.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:26 -05:00
Trond Myklebust
ef80367071 SUNRPC: TCP clear XPRT_CLOSE_WAIT when the socket is closed for writes
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:25 -05:00
Trond Myklebust
3b948ae5be SUNRPC: Allow the client to detect if the TCP connection is closed
Add an xprt->state bit to enable the TCP ->state_change() method to signal
whether or not the TCP connection is in the process of closing down.
This will to be used by the reconnection logic in a separate patch.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:25 -05:00
Trond Myklebust
67a391d72c SUNRPC: Fix TCP rebinding logic
Currently the TCP rebinding logic assumes that if we're not using a
reserved port, then we don't need to reconnect on the same port if a
disconnection event occurs. This breaks most RPC duplicate reply cache
implementations.

Also take into account the fact that xprt_min_resvport and
xprt_max_resvport may change while we're reconnecting, since the user may
change them at any time via the sysctls. Ensure that we check the port
boundaries every time we loop in xs_bind4/xs_bind6. Also ensure that if the
boundaries change, we only scan the ports a maximum of 2 times.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:25 -05:00
Trond Myklebust
66af1e5585 SUNRPC: Fix a race in xs_tcp_state_change()
When scheduling the autoclose RPC call, we want to ensure that we don't
race against the test_bit() call in xprt_clear_locked().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:24 -05:00
Paul Moore
13541b3ada NetLabel: Add auditing to the static labeling mechanism
This patch adds auditing support to the NetLabel static labeling mechanism.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-01-30 08:17:29 +11:00
Paul Moore
8cc44579d1 NetLabel: Introduce static network labels for unlabeled connections
Most trusted OSs, with the exception of Linux, have the ability to specify
static security labels for unlabeled networks.  This patch adds this ability to
the NetLabel packet labeling framework.

If the NetLabel subsystem is called to determine the security attributes of an
incoming packet it first checks to see if any recognized NetLabel packet
labeling protocols are in-use on the packet.  If none can be found then the
unlabled connection table is queried and based on the packets incoming
interface and address it is matched with a security label as configured by the
administrator using the netlabel_tools package.  The matching security label is
returned to the caller just as if the packet was explicitly labeled using a
labeling protocol.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-01-30 08:17:28 +11:00
Paul Moore
d621d35e57 SELinux: Enable dynamic enable/disable of the network access checks
This patch introduces a mechanism for checking when labeled IPsec or SECMARK
are in use by keeping introducing a configuration reference counter for each
subsystem.  In the case of labeled IPsec, whenever a labeled SA or SPD entry
is created the labeled IPsec/XFRM reference count is increased and when the
entry is removed it is decreased.  In the case of SECMARK, when a SECMARK
target is created the reference count is increased and later decreased when the
target is removed.  These reference counters allow SELinux to quickly determine
if either of these subsystems are enabled.

NetLabel already has a similar mechanism which provides the netlbl_enabled()
function.

This patch also renames the selinux_relabel_packet_permission() function to
selinux_secmark_relabel_packet_permission() as the original name and
description were misleading in that they referenced a single packet label which
is not the case.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-01-30 08:17:26 +11:00
Paul Moore
75e22910cf NetLabel: Add IP address family information to the netlbl_skbuff_getattr() function
In order to do any sort of IP header inspection of incoming packets we need to
know which address family, AF_INET/AF_INET6/etc., it belongs to and since the
sk_buff structure does not store this information we need to pass along the
address family separate from the packet itself.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-01-30 08:17:20 +11:00
Paul Moore
16efd45435 NetLabel: Add secid token support to the NetLabel secattr struct
This patch adds support to the NetLabel LSM secattr struct for a secid token
and a type field, paving the way for full LSM/SELinux context support and
"static" or "fallback" labels.  In addition, this patch adds a fair amount
of documentation to the core NetLabel structures used as part of the
NetLabel kernel API.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-01-30 08:17:19 +11:00
Paul Moore
1c3fad936a NetLabel: Consolidate the LSM domain mapping/hashing locks
Currently we use two separate spinlocks to protect both the hash/mapping table
and the default entry.  This could be considered a bit foolish because it adds
complexity without offering any real performance advantage.  This patch
removes the dedicated default spinlock and protects the default entry with the
hash/mapping table spinlock.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-01-30 08:17:18 +11:00
Paul Moore
b64397e0b4 NetLabel: Cleanup the LSM domain hash functions
The NetLabel/LSM domain hash table search function used an argument to specify
if the default entry should be returned if an exact match couldn't be found in
the hash table.  This is a bit against the kernel's style so make two separate
functions to represent the separate behaviors.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-01-30 08:17:17 +11:00
Paul Moore
c783f1ce57 NetLabel: Remove unneeded RCU read locks
This patch removes some unneeded RCU read locks as we can treat the reads as
"safe" even without RCU.  It also converts the NetLabel configuration refcount
from a spinlock protected u32 into atomic_t to be more consistent with the rest
of the kernel.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-01-30 08:17:16 +11:00
YOSHIFUJI Hideaki
85040bcb46 [IPV6] ADDRLABEL: Fix double free on label deletion.
If an entry is being deleted because it has only one reference, 
we immediately delete it and blindly register the rcu handler for it,
This results in oops by double freeing that object.

This patch fixes it by consolidating the code paths for the deletion;
let its rcu handler delete the object if it has no more reference.

Bug was found by Mitsuru Chinen <mitch@linux.vnet.ibm.com>

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:46:02 -08:00
Stephen Hemminger
ac97f75faa [IPV4] fib_trie: remove unneeded NULL check
Since fib_route_seq_show now uses hlist_for_each_entry(), the leaf
info can not be NULL.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:26 -08:00
Stephen Hemminger
f638a2f057 [IPV4] fib_trie: More whitespace cleanup.
Remove extra blank lines.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:25 -08:00
Patrick McHardy
7a9c1bd409 [NET_SCHED]: Use nla_policy for attribute validation in ematches
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:24 -08:00
Patrick McHardy
53b2bf3f8a [NET_SCHED]: Use nla_policy for attribute validation in actions
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:23 -08:00
Patrick McHardy
6fa8c0144b [NET_SCHED]: Use nla_policy for attribute validation in classifiers
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:23 -08:00
Patrick McHardy
27a3421e48 [NET_SCHED]: Use nla_policy for attribute validation in packet schedulers
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:22 -08:00
Patrick McHardy
5feb5e1aaa [NET_SCHED]: sch_api: introduce constant for rate table size
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:21 -08:00
Patrick McHardy
1587bac49f [NET_SCHED]: Use typeful attribute parsing helpers
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:21 -08:00
Patrick McHardy
24beeab539 [NET_SCHED]: Use typeful attribute construction helpers
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:20 -08:00
Patrick McHardy
57e1c487a4 [NET_SCHED]: Use NLA_PUT_STRING for string dumping
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:19 -08:00
Patrick McHardy
4b3550ef53 [NET_SCHED]: Use nla_nest_start/nla_nest_end
Use nla_nest_start/nla_nest_end for dumping nested attributes.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:18 -08:00
Patrick McHardy
cee63723b3 [NET_SCHED]: Propagate nla_parse return value
nla_parse() returns more detailed errno codes, propagate them back on
error.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:18 -08:00
Patrick McHardy
ab27cfb85c [NET_SCHED]: act_api: use PTR_ERR in tcf_action_init/tcf_action_get
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:17 -08:00
Patrick McHardy
c96c9471dd [NET_SCHED]: act_api: use nlmsg_parse
Convert open-coded nlmsg_parse to use the real function.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:16 -08:00
Patrick McHardy
6d834e04e5 [NET_SCHED]: act_api: fix netlink API conversion bug
Fix two invalid attribute accesses, indices start at 1 with the new
netlink API.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:15 -08:00
Patrick McHardy
b03f467200 [NET_SCHED]: sch_netem: use nla_parse_nested_compat
Replace open coded equivalent of nla_parse_nested_compat().

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:15 -08:00
Patrick McHardy
f5e5cb7553 [NET_SCHED]: sch_atm: fix format string warning
Fix format string warning introduces by the netlink API conversion:

net/sched/sch_atm.c:250: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'int'.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:14 -08:00
Denis V. Lunev
dde1bc0e6f [NETNS]: Add namespace for ICMP replying code.
All needed API is done, the namespace is available when required from
the device on the DST entry from the incoming packet. So, just replace
init_net with proper namespace.

Other protocols will follow.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:13 -08:00
Denis V. Lunev
b5921910a1 [NETNS]: Routing cache virtualization.
Basically, this piece looks relatively easy. Namespace is already
available on the dst entry via device and the device is safe to
dereferrence. Compare it with one of a searcher and skip entry if
appropriate.

The only exception is ip_rt_frag_needed. So, add namespace parameter to it.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:13 -08:00
Patrick McHardy
7ba699c604 [NET_SCHED]: Convert actions from rtnetlink to new netlink API
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:11 -08:00
Patrick McHardy
add93b610a [NET_SCHED]: Convert classifiers from rtnetlink to new netlink API
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:11 -08:00
Patrick McHardy
1e90474c37 [NET_SCHED]: Convert packet schedulers from rtnetlink to new netlink API
Convert packet schedulers to use the netlink API. Unfortunately a gradual
conversion is not possible without breaking compilation in the middle or
adding lots of casts, so this patch converts them all in one step. The
patch has been mostly generated automatically with some minor edits to
at least allow seperate conversion of classifiers and actions.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:10 -08:00
Patrick McHardy
01480e1cf5 [NETLINK]: Add nla_append()
Used to append data to a message without a header or padding.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:09 -08:00
Patrick McHardy
2eb9d75c72 [NET_SCHED]: mark classifier ops __read_mostly
Additionally remove unnecessary NULL initilizations of the next pointer.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:08 -08:00
Patrick McHardy
62e3ba1b55 [NET_SCHED]: Move EXPORT_SYMBOL next to exported symbol
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:07 -08:00
Denis V. Lunev
f206351a50 [NETNS]: Add namespace parameter to ip_route_output_key.
Needed to propagate it down to the ip_route_output_flow.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:07 -08:00
Denis V. Lunev
f1b050bf7a [NETNS]: Add namespace parameter to ip_route_output_flow.
Needed to propagate it down to the __ip_route_output_key.

Signed_off_by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:06 -08:00
Denis V. Lunev
611c183ebc [NETNS]: Add namespace parameter to __ip_route_output_key.
This is only required to propagate it down to the
ip_route_output_slow.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:05 -08:00
Denis V. Lunev
b40afd0e5c [NETNS]: Add namespace parameter to ip_route_output_slow.
This function needs a net namespace to lookup devices, fib tables,
etc. in, so pass it there.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:05 -08:00
Denis V. Lunev
1ab352768f [NETNS]: Add namespace parameter to ip_dev_find.
in_dev_find() need a namespace to pass it to fib_get_table(), so add
an argument.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:04 -08:00
Denis V. Lunev
010278ec4c [NETNS]: Add netns parameter to fib_select_default.
Currently fib_select_default calls fib_get_table() with the
init_net. Prepare it to provide a correct namespace to lookup default
route.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:03 -08:00
Denis V. Lunev
64c2d53829 [IPV4]: Consolidate fib_select_default.
The difference in the implementation of the fib_select_default when
CONFIG_IP_MULTIPLE_TABLES is (not) defined looks
negligible. Consolidate it and place into fib_frontend.c.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:02 -08:00
Stephen Hemminger
d5ce8a0e97 [IPV4] fib_trie: avoid rescan on dump
This converts dumping (and flushing) of large route tables form O(N^2)
to O(N). If the route dump took multiple pages then the dump routine
gets called again. The old code kept track of location by counter, the
new code instead uses the last key.

This is a really big win ( 0.3 sec vs 12 sec) for big route tables.

One side effect is that if the table changes during the dump, then the
last key will not be found, and we will return -EBUSY.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:01 -08:00
Stephen Hemminger
9195bef7fb [IPV4] fib_trie: avoid extra search on delete
Get rid of extra search that made route deletion O(n).

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:00 -08:00
Stephen Hemminger
a88ee22925 [IPV4] fib_trie: dump table in sorted order
It is easier with TRIE to dump the data traversal rather than
interating over every possible prefix. This saves some time and makes
the dump come out in sorted order.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:00 -08:00
Stephen Hemminger
82cfbb0085 [IPV4] fib_trie: iterator recode
Remove the complex loop structure of nextleaf() and replace it with a
simpler tree walker. This improves the performance and is much
cleaner.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:59 -08:00
Stephen Hemminger
64347f786d [IPV4] fib_trie: dump message multiple part flag
Match fib_hash, and set NLM_F_MULTI to handle multiple part messages.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:58 -08:00
Stephen Hemminger
1328042e26 [IPV4] fib_trie: use hash list
The code to dump can use the existing hash chain rather than doing
repeated lookup.

Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:58 -08:00
Stephen Hemminger
936722922f [IPV4] fib_trie: compute size when needed
Compute the number of prefixes when needed, rather than doing bookeeping.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:57 -08:00
Stephen Hemminger
a07f5f508a [IPV4] fib_trie: style cleanup
Style cleanups:
      * make check_leaf return -1 or plen, rather than by reference
      * Get rid of #ifdef that is always set
      * split out embedded function calls in if statements.
      * checkpatch warnings

Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:56 -08:00
Stephen Hemminger
bc3c8c1e02 [IPV4] fib_trie: put leaf nodes in a slab cache
This improves locality for operations that touch all the leaves.  Save
space since these entries don't need to be hardware cache aligned.

Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:56 -08:00
Jan Engelhardt
a1f6f5a768 [AF_X25]: constify function pointer tables
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:55 -08:00
Ross Burton
91cde6f7d2 [IrDA]: LMP discovery timer not started by default
By default, LMP sets up a 3 seconds timer for discovery.
We don't need it until discovery is set to 1.

Signed-off-by: Ross Burton <ross@openedhand.com>
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:54 -08:00
Masakazu Mokuno
4248d2f811 WEXT: remove unused variable
As event_type_pk_size[] is not used,  Remove it.

Signed-off-by: Masakazu Mokuno <mokuno@sm.sony.co.jp>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:10:48 -08:00
Ron Rindjunsky
71ebb4aac8 mac80211: fix rx flow sparse errors, make functions static
This patch adds static declarations to functions in the Rx flow in order to
eliminate sparse errors

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:10:46 -08:00
Stefano Brivio
c09c7237ea rc80211-pid: fix last_sample initialization
Fix last_sample initialization. kzalloc'ing the per-STA data wasn't enough,
as jiffies can assume negative values as well. This fixes a bug which
prevented correct PID operation for a while after booting.

Thanks to Michael Buesch for reporting this.

Reported-and-tested-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: Stefano Brivio <stefano.brivio@polimi.it>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:10:43 -08:00
Johannes Berg
f99b751fca mac80211: fix RCU locking in __ieee80211_rx_handle_packet
Commit c7a51bda ("mac80211: restructure __ieee80211_rx") extracted
__ieee80211_rx_handle_packet out of __ieee80211_rx and hence changed
the locking rules for __ieee80211_rx_handle_packet(), it is now
invoked under RCU lock. There is, however, one instance left where
it contains an rcu_read_unlock() in an error path, which is a bug.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:10:43 -08:00
Guy Cohen
a8bdf29c6c mac80211: Assign correct TID for local bridged packets
This patch assigns correct TID to frames transmitted between
two stations in the same BSS in AP mode.
The problem is that skb->protocol is not set to ETH_P_IP and it is wrong
to use that field at this stage.
The fix compares the LLC/Protocol headers explicitly to check if the
encapsulated frame is IP frame

Signed-off-by: Guy Cohen <guy.cohen@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:10:41 -08:00
Eric Dumazet
69a73829db [DST]: shrinks sizeof(struct rtable) by 64 bytes on x86_64
On x86_64, sizeof(struct rtable) is 0x148, which is rounded up to
0x180 bytes by SLAB allocator.

We can reduce this to exactly 0x140 bytes, without alignment overhead,
and store 12 struct rtable per PAGE instead of 10.

rate_tokens is currently defined as an "unsigned long", while its
content should not exceed 6*HZ. It can safely be converted to an
unsigned int.

Moving tclassid right after rate_tokens to fill the 4 bytes hole
permits to save 8 bytes on 'struct dst_entry', which finally permits
to save 8 bytes on 'struct rtable'

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:41 -08:00
Pavel Emelyanov
81566e8322 [NETNS][FRAGS]: Make the pernet subsystem for fragments.
On namespace start we mainly prepare the ctl variables.

When the namespace is stopped we have to kill all the fragments that
point to this namespace.  The inet_frags_exit_net() handles it.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:40 -08:00
Pavel Emelyanov
3140c25c82 [NETNS][FRAGS]: Make the LRU list per namespace.
The inet_frags.lru_list is used for evicting only, so we have
to make it per-namespace, to evict only those fragments, who's
namespace exceeded its high threshold, but not the whole hash.
Besides, this helps to avoid long loops  in evictor.

The spinlock is not per-namespace because it protects the
hash table as well, which is global.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:39 -08:00
Pavel Emelyanov
3b4bc4a2bf [NETNS][FRAGS]: Isolate the secret interval from namespaces.
Since we have one hashtable to lookup the fragment, having
different secret_interval-s for hash rebuild doesn't make
sense, so move this one to inet_frags.

The inet_frags_ctl becomes empty after this, so remove it.
The appropriate ctl table is kept read-only in namespaces.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:39 -08:00
Pavel Emelyanov
e31e0bdc7e [NETNS][FRAGS]: Make thresholds work in namespaces.
This is the same as with the timeout variable.

Currently, after exceeding the high threshold _all_
the fragments are evicted, but it will be fixed in
later patch.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:38 -08:00
Pavel Emelyanov
b2fd5321dd [NETNS][FRAGS]: Make the net.ipv4.ipfrag_timeout work in namespaces.
Move it to the netns_frags, adjust the usage and
make the appropriate ctl table writable.

Now fragment, that live in different namespaces can
live for different times.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:37 -08:00
Pavel Emelyanov
e4a2d5c2bc [NETNS][FRAGS]: Duplicate sysctl tables for new namespaces.
Each namespace has to have own tables to tune their
different parameters, so duplicate the tables and
register them.

All the tables in sub-namespaces are temporarily made
read-only.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:37 -08:00
Pavel Emelyanov
6ddc082223 [NETNS][FRAGS]: Make the mem counter per-namespace.
This is also simple, but introduces more changes, since
then mem counter is altered in more places.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:36 -08:00
Pavel Emelyanov
e5a2bb842c [NETNS][FRAGS]: Make the nqueues counter per-namespace.
This is simple - just move the variable from struct inet_frags
to struct netns_frags and adjust the usage appropriately.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:35 -08:00
Pavel Emelyanov
ac18e7509e [NETNS][FRAGS]: Make the inet_frag_queue lookup work in namespaces.
Since fragment management code is consolidated, we cannot have the
pointer from inet_frag_queue to struct net, since we must know what
king of fragment this is.

So, I introduce the netns_frags structure. This one is currently
empty, but will be eventually filled with per-namespace
attributes. Each inet_frag_queue is tagged with this one.

The conntrack_reasm is not "netns-izated", so it has one static
netns_frags instance to keep working in init namespace.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:34 -08:00
Pavel Emelyanov
8d8354d2fb [NETNS][FRAGS]: Move ctl tables around.
This is a preparation for sysctl netns-ization.
Move the ctl tables to the files, where the tuning
variables reside. Plus make the helpers to register
the tables.

This will simplify the later patches and will keep
similar things closer to each other.

ipv4, ipv6 and conntrack_reasm are patched differently,
but the result is all the tables are in appropriate files.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:34 -08:00
YOSHIFUJI Hideaki
61cf46ad58 [IPV6] NDISC: Sparse: Use different variable name for local use.
Fix the following sparse warnings:
| net/ipv6/ndisc.c:1300:21: warning: symbol 'opt' shadows an earlier one
| net/ipv6/ndisc.c:1078:7: originally declared here

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-01-28 15:10:28 -08:00
YOSHIFUJI Hideaki
5d5619b40c [IPV6] ADDRCONF: Sparse: Make inet6_dump_addr() code paths more straight-forward.
Fix the following sparse warning:
| net/ipv6/addrconf.c:3384:2: warning: context imbalance in 'inet6_dump_addr' - different lock contexts for basic block

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-01-28 15:10:28 -08:00
YOSHIFUJI Hideaki
2334ecbdb2 [IPV6]: Sparse: Declare non-static ipv6_{route,icmp,frag}_sysctl_init() in header.
Fix the following sparse warnings:
| net/ipv6/route.c:2491:18: warning: symbol 'ipv6_route_sysctl_init' was not declared. Should it be static?
| net/ipv6/icmp.c:922:18: warning: symbol 'ipv6_icmp_sysctl_init' was not declared. Should it be static?
| net/ipv6/reassembly.c:628:6: warning: symbol 'ipv6_frag_sysctl_init' was not declared. Should it be static?

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-01-28 15:10:27 -08:00
YOSHIFUJI Hideaki
40fee36e11 [IPV6] ADDRLABEL: Sparse: Make several functions static.
Fix following sparse warnings:
| net/ipv6/addrlabel.c:172:25: warning: symbol 'ip6addrlbl_alloc' was not declared. Should it be static?
| net/ipv6/addrlabel.c:219:5: warning: symbol '__ip6addrlbl_add' was not declared. Should it be static?
| net/ipv6/addrlabel.c:260:5: warning: symbol 'ip6addrlbl_add' was not declared. Should it be static?
| net/ipv6/addrlabel.c:285:5: warning: symbol '__ip6addrlbl_del' was not declared. Should it be static?
| net/ipv6/addrlabel.c:311:5: warning: symbol 'ip6addrlbl_del' was not declared. Should it be static?

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-01-28 15:10:26 -08:00
YOSHIFUJI Hideaki
5e8b9df6e8 [IPV6] UDPLITE: Sparse: Declare non-static symbols in header.
Fix the following sparse warnings:
| net/ipv6/udplite.c:45:14: warning: symbol 'udplitev6_prot' was not declared. Should it be static?
| net/ipv6/udplite.c:80:12: warning: symbol 'udplitev6_init' was not declared. Should it be static?
| net/ipv6/udplite.c:99:6: warning: symbol 'udplitev6_exit' was not declared. Should it be static?

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-01-28 15:10:26 -08:00
YOSHIFUJI Hideaki
77d0d350e9 [IPV6] UDP,UDPLITE: Sparse: {__udp6_lib,udp,udplite}_err() are of void.
Fix following sparse warnings:
| net/ipv6/udp.c:262:2: warning: returning void-valued expression
| net/ipv6/udplite.c:29:2: warning: returning void-valued expression

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-01-28 15:10:25 -08:00
YOSHIFUJI Hideaki
fc80be87dc [IPV4] UDP,UDPLITE: Sparse: {__udp4_lib,udp,udplite}_err() are of void.
Fix following sparse warnings:
| net/ipv4/udp.c:421:2: warning: returning void-valued expression
| net/ipv4/udplite.c:38:2: warning: returning void-valued expression

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-01-28 15:10:24 -08:00
Denis V. Lunev
ecfdc8c542 [NETNS]: Pass correct namespace in ip_rt_get_source.
ip_rt_get_source is the infamous place for which dst_ifdown kludges
have been implemented. This means that rt->u.dst.dev can be safely
dereferrenced obtain nd_net.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:23 -08:00
Denis V. Lunev
84a885f449 [NETNS]: Pass correct namespace in ip_route_input_slow.
The packet on the input path always has a referrence to an input
network device it is passed from. Extract network namespace from it.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:22 -08:00
Denis V. Lunev
86167a377f [NETNS]: Pass correct namespace in context fib_check_nh.
Correct network namespace is already used in fib_check_nh. Re-work its
usage for better readability and pass into fib_lookup &
inetdev_by_index.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:21 -08:00
Denis V. Lunev
5b707aaae4 [NETNS]: Pass correct namespace in fib_validate_source.
Correct network namespace is available inside fib_validate_source. It
can be obtained from the device passed in. The device is not NULL as
in_device is obtained from it just above.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:21 -08:00
Denis V. Lunev
7fee0ca237 [NETNS]: Add netns parameter to inetdev_by_index.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:20 -08:00
Denis V. Lunev
da0e28cb68 [NETNS]: Add netns parameter to fib_lookup.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:19 -08:00
Stephen Hemminger
ba93ef7465 [IPV4]: ipmr sparse warnings
Get rid of some of the sparse warnings.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:18 -08:00
Stephen Hemminger
dd329bfa96 [IPV4]: igmp sparse warnings
Partial sparse warning fix.  The other conditional locking
is too much for sparse to handle.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:18 -08:00
Stephen Hemminger
1402c8519a [VLAN]: sparse warning fix
Minor sparse warning fix.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:10:17 -08:00
Ron Rindjunsky
66dcb6bdc5 mac80211: A-MPDU Rx remove stop_rx_ba_session warning print
This patch removes a warning print from ieee80211_sta_stop_rx_ba_session
in case the tid is inactive when interface goes down.

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:10:10 -08:00
Ron Rindjunsky
5b3d71d90e mac80211: A-MPDU Rx stop aggregation on proper dev
This patch adds a check to insure that Rx A-MPDU will be stopped only
for the proper device.

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:10:09 -08:00
Ivo van Doorn
0f7054e32f mac80211: Initialize vif pointer
Before calling update_beacon() mac80211 must
initialize the control.vif pointer so it can
be used by the driver to determine which
interface is trying to send the beacon.

v2: ieee80211_beacon_get() should also initialize the
vif pointer since it can be called by mac80211 internally
before calling config_interface().

Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:10:08 -08:00
Johannes Berg
471b3efdfc mac80211: add unified BSS configuration
This patch (based on Ron Rindjunsky's) creates a framework for
a unified way to pass BSS configuration to drivers that require
the information, e.g. for implementing power save mode.

This patch introduces new ieee80211_bss_conf structure that is
passed to the driver via the new bss_info_changed() callback
when the BSS configuration changes.

This new BSS configuration infrastructure adds the following
new features:
 * drivers are notified of their association AID
 * drivers are notified of association status

and replaces the erp_ie_changed() callback. The patch also does
the relevant driver updates for the latter change.

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:09:43 -08:00
Michael Wu
2bc454b0b3 mac80211: Fix rate reporting regression
Mattias Nissler's "clean up rate selection" patch incorrectly changes
the behavior of txrate setting in sta_info. This patch backs out parts
of the rate selection consolidation in order to fix this issue for now.

Signed-off-by: Michael Wu <flamingice@sourmilk.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:09:42 -08:00
Johannes Berg
4fd6931ebe mac80211: implement cfg80211 station handling
This implements station handling from userspace via cfg80211
in mac80211.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:09:39 -08:00
Johannes Berg
5dfdaf58d6 mac80211: add beacon configuration via cfg80211
This patch implements the cfg80211 hooks for configuring beaconing
on an access point interface in mac80211. While doing so, it fixes
a number of races that could badly crash the machine when the
beacon is changed while being requested by the driver.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:09:38 -08:00
Johannes Berg
51fb61e76d mac80211: move interface type to vif structure
Drivers that support mixed AP/STA operation may well need to
know the type of a virtual interface when iterating over them.
The easiest way to support that is to move the interface type
variable into the vif structure.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:09:37 -08:00
Johannes Berg
32bfd35d4b mac80211: dont use interface indices in drivers
This patch gets rid of the if_id stuff where possible in favour of
a new per-virtual-interface structure "struct ieee80211_vif". This
structure is located at the end of the per-interface structure and
contains a variable length driver-use data area.

This has two advantages:
 * removes the need to look up interfaces by if_id, this is better
   for working with network namespaces and performance
 * allows drivers to store and retrieve per-interface data without
   having to allocate own lists/hash tables

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:09:36 -08:00
Al Viro
8524f59d47 ieee80211: beacon->capability is little-endian
It's only a debugging printk, so it went unnoticed; still, the
fix is trivial, so...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:08:48 -08:00
Al Viro
d9e94d5647 ieee80211: fix misannotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:08:48 -08:00
Al Viro
c414e84b22 ieee80211softmac_auth_resp() fix
The struct ieee8021_auth * passed to it comes straight from skb->data
without any conversions; members of the struct are little-endian, so
we'd better take that into account when doing switch by auth->algorithm,
etc.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:08:47 -08:00
Al Viro
b16f13d00c several missing cpu_to_le16() in ieee80211softmac_capabilities()
on some codepaths we forgot to convert to little-endian as we do on the
rest of them and as the caller expects from us.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:08:46 -08:00
Al Viro
8fffc15dc7 eliminate byteswapping in struct ieee80211_qos_parameters
Make it match the on-the-wire endianness, eliminate byteswapping.
The only driver that used this sucker (ipw2200) updated.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-01-28 15:08:45 -08:00
Jan Engelhardt
1e637c74b0 [IPV4]: Enable use of 240/4 address space.
This short patch modifies the IPv4 networking to enable use of the
240.0.0.0/4 (aka "class-E") address space as propsed in the internet
draft draft-fuller-240space-00.txt.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:44 -08:00
Jarek Poplawski
96750162b5 [NET] gen_estimator: gen_replace_estimator() cosmetic changes
White spaces etc. are changed in gen_replace_estimator() to make it
similar to others in a file.

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:43 -08:00
Stephen Hemminger
72348a424f [PKT_SCHED] net: add sparse annotation to ptype_seq_start/stop
Get rid of some more sparse warnings.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:42 -08:00
Stephen Hemminger
aa767bfea4 [PKT_SCHED] net classifier: style cleanup's
Classifier code cleanup. Get rid of printk wrapper, and fix whitespace
and other style stuff reported by checkpatch

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:42 -08:00
Stephen Hemminger
786a90366f [PKT_SCHED] sch_atm: style cleanup
ATM scheduler clean house:
  * get rid of printk and qdisc_priv() wrapper
  * split some assignment in if() statements
  * whitespace and line breaks.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:41 -08:00
Stephen Hemminger
9d127fbdd2 [PKT_SCHED] dsmark: checkpatch warning cleanup
Get rid of all style things checkpatch warns about, indentation and
whitespace.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:40 -08:00
Stephen Hemminger
4c30719f4f [PKT_SCHED] dsmark: handle cloned and non-linear skb's
Make dsmark work properly with non-linear and cloned skb's
Before modifying the header, it needs to check that skb header is
writeable.

Note: this makes the assumption, that if it queues a good skb
then a good skb will come out of the embedded qdisc.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:40 -08:00
David S. Miller
5b0ac72bc5 [PKT_SCHED] dsmark: Use hweight32() instead of convoluted loop.
Based upon a patch by Stephen Hemminger and suggestions
from Patrick McHardy.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:39 -08:00
Stephen Hemminger
81da99ed71 [PKT_SCHED] dsmark: get rid of wrappers
Remove extraneous macro wrappers for printk and qdisc_priv.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:38 -08:00
Stephen Hemminger
d20b3109e9 [IPV6]: addrconf sparse warnings
Get rid of a couple of sparse warnings in IPV6 addrconf code.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:37 -08:00
Patrick McHardy
13a0a096e5 [NET_SCHED]: kill obsolete NET_CLS_POLICE option
The code is already gone for about half a year, the config option
has been kept around to select the replacement options for easier
upgrades. This seems long enough, people upgrading from older
kernels will have to reconfigure a lot anyway.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:37 -08:00
Pavel Emelyanov
91b4f95475 [VLAN]: Move protocol determination to seperate function
I think, that we can make this code flow easier to understand
by introducing the vlan_set_encap_proto() function (I hope the
name is good) to setup the skb proto and merge the paths calling
netif_rx() together.

[Patrick: Modified to apply on top of my previous patches]

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:35 -08:00
Patrick McHardy
31ffdbcb59 [VLAN]: Clean up vlan_skb_recv()
- remove three instances of identical code
- remove unnecessary NULL initialization
- remove obvious and unnecessary comments

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:35 -08:00
Patrick McHardy
ad712087f7 [VLAN]: Update list address
VLAN related mail should go to netdev.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:34 -08:00
Patrick McHardy
2029cc2c84 [VLAN]: checkpatch cleanups
Checkpatch cleanups, consisting mainly of overly long lines and
missing spaces.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:33 -08:00
Patrick McHardy
9dfebcc647 [VLAN]: Turn VLAN_DEV_INFO into inline function
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:32 -08:00
Patrick McHardy
af30151709 [VLAN]: Simplify vlan unregistration
Keep track of the number of VLAN devices in a vlan group. This allows
to have the caller sense when the group is going to be destroyed and
stop using it, which in turn allows to remove the wrapper around
unregister_vlan_dev for the NETDEV_UNREGISTER notifier and avoid
iterating over all possible VLAN ids whenever a device in unregistered.

Also fix what looks like a use-after-free (but is actually safe since
we're holding the RTNL), the real_dev reference should not be dropped
while we still use it.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:31 -08:00
Patrick McHardy
acc5efbcd2 [VLAN]: Clean up unregister_vlan_dev
Save two levels of indentation by aborting on error conditions,
remove unnecessary initialization to NULL and remove two obvious
comments.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:30 -08:00
Patrick McHardy
69ab4b7d6d [VLAN]: Clean up initialization code
- move module init/exit functions to end of file, remove some now unnecessary
  forward declarations
- remove some obvious comments
- clean up proc init function and move a proc-related printk there

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:30 -08:00
Patrick McHardy
198a291ce3 [VLAN]: Remove non-implemented ioctls
The GET_VLAN_INGRESS_PRIORITY_CMD/GET_VLAN_EGRESS_PRIORITY_CMD ioctls are
not implemented and won't be, new functionality will be added to the netlink
interface. Remove the code and make the ioctl handler return -EOPNOTSUPP
for unknown commands instead of -EINVAL.

Also remove a comment about passing unknown commands to the underlying
device, that doesn't make any sense since its a VLAN specific ioctl and
if its not implemented here, its implemented nowhere.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:29 -08:00
Patrick McHardy
40f98e1af4 [VLAN]: Clean up debugging and printks
- use pr_* functions and common prefix for non-device related messages

- remove VLAN_ printk levels

- kill lots of useless debugging statements

- remove a few unnecessary printks like for double VID registration (already
  returns -EEXIST) and kill of a number of unnecessary checks in
  vlan_proc_{add,rem}_dev() that are already performed by the caller

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:28 -08:00
Patrick McHardy
62f99efce6 [VLAN]: Kill useless check
vlan->real_dev is always equal to the device since thats what we used
for the lookup. It doesn't even seem worth a WARN_ON or BUG_ON.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:27 -08:00
Patrick McHardy
ef3eb3e59b [VLAN]: Move device setup to vlan_dev.c
Move device setup to vlan_dev.c and make all the VLAN device methods
static.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:26 -08:00
Patrick McHardy
7bd38d778e [VLAN]: Use dev->stats
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:25 -08:00
Patrick McHardy
b7a4a83629 [VLAN]: Kill useless VLAN_NAME define
The only user already includes __FUNCTION__ (vlan_proto_init) in the
output, which is enough to identify what the message is about.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:25 -08:00
Patrick McHardy
891687649a [NET_SCHED]: sch_ingress: remove useless printk
The printk about ingress qdisc registration error can't be triggered
under normal circumstances. Since register_qdisc only fails for two
identical registrations, the only way to trigger it is by loading the
sch_ingress modules multiple times under different names, in which
case we already return -EEXIST to userspace.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:22 -08:00
Patrick McHardy
1389356735 [NET_SCHED]: sch_ingress: avoid a few #ifdefs
Move the repeating "ifndef CONFIG_NET_CLS_ACT/ifdef CONFIG_NETFILTER"
ifdefs into a single condition.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:22 -08:00
Patrick McHardy
645a1e39e4 [NET_SCHED]: sch_ingress: move dependencies to Kconfig
Instead of complaining at scheduler initialization time, check the
dependencies in Kconfig.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:21 -08:00
Patrick McHardy
c6ee877f2e [NET_SCHED]: sch_ingress: remove unnecessary ops
- ->reset is optional
- sch_api provides identical defaults for ->dequeue/->requeue
- ->drop can't happen since ingress never has a parent qdisc

Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:20 -08:00
Patrick McHardy
e037834758 [NET_SCHED]: sch_ingress: return proper error code in ingress_graft()
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:20 -08:00
Patrick McHardy
c21d4d5dd2 [NET_SCHED]: sch_ingress: remove unused inner qdisc
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:19 -08:00
Patrick McHardy
cb53c04891 [NET_SCHED]: sch_ingress: remove qdisc_priv() wrapper
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:18 -08:00
Patrick McHardy
a47812211b [NET_SCHED]: sch_ingress: remove excessive debugging
Remove excessive debugging statements and some "future use" stuff.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:18 -08:00
Patrick McHardy
58f4df423e [NET_SCHED]: sch_ingress: formatting fixes
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:17 -08:00
Stephen Hemminger
6f9e98f7a9 [PKT_SCHED] SFQ: whitespace cleanup
Add whitespace around operators, and add a few blank lines to improve
readability.

Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:16 -08:00
Stephen Hemminger
d46f8dd87d [PKT_SCHED] SFQ: use net_random
SFQ doesn't need true random numbers, it is only using them to salt a
hash. Therefore it is better to use net_random() and avoid any
possible problems with depleting the entropy pool.

Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:15 -08:00
Stephen Hemminger
d3e994830d [PKT_SCHED] SFQ: timer is deferrable
The perturbation timer used for re-keying can be deferred, it doesn't
need to be deterministic.

Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:15 -08:00
Denis V. Lunev
51314a17ba [NETNS]: Process FIB rule action in the context of the namespace.
Save namespace context on the fib rule at the rule creation time and
call routing lookup in the correct namespace.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:14 -08:00
Denis V. Lunev
9e3a548781 [NETNS]: FIB rules API cleanup.
Remove struct net from fib_rules_register(unregister)/notify_change
paths and diet code size a bit.

add/remove: 0/0 grow/shrink: 10/12 up/down: 35/-100 (-65)
function                                     old     new   delta
notify_rule_change                           273     280      +7
trie_show_stats                              471     475      +4
fn_trie_delete                               473     477      +4
fib_rules_unregister                         144     148      +4
fib4_rule_compare                            119     123      +4
resize                                      2842    2845      +3
fn_trie_select_default                       515     518      +3
inet_sk_rebuild_header                       836     838      +2
fib_trie_seq_show                            764     766      +2
__devinet_sysctl_register                    276     278      +2
fn_trie_lookup                              1124    1123      -1
ip_fib_check_default                         133     131      -2
devinet_conf_sysctl                          223     221      -2
snmp_fold_field                              126     123      -3
fn_trie_insert                              2091    2086      -5
inet_create                                  876     870      -6
fib4_rules_init                              197     191      -6
fib_sync_down                                452     444      -8
inet_gso_send_check                          334     325      -9
fib_create_info                             3003    2991     -12
fib_nl_delrule                               568     553     -15
fib_nl_newrule                               883     852     -31

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:13 -08:00
Denis V. Lunev
0359238333 [FIB]: Add netns to fib_rules_ops.
The backward link from FIB rules operations to the network namespace
will allow to simplify the API a bit.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:13 -08:00
Vlad Yasevich
853f4b5055 [SCTP]: Correctly initialize error when parameter validation failed.
When parameter validation fails, there should be error causes that
specify what type of failure we've encountered.  If the causes are not
there, we lacked memory to allocated them.  Thus make that the default
value for the error.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:11 -08:00
Adrian Bunk
e9888f5498 [IrDA]: Irport removal - part 1
This patch removes IrPORT and the old dongle drivers (all off them
have replacement drivers).

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:10 -08:00
Robie Basak
5d780cd658 [IrDA]: Frame length validation.
When using a stir4200-based USB adaptor to talk to a device that uses
an mcp2150, the stir4200 sometimes drops an incoming frame causing the
mcp2150 to try and retransmit the lost frame. In this combination, the
next frame received from the mcp2150 is often invalid - either an
empty i:rsp or an IrCOMM i:rsp with an invalid clen. These corner
cases are now checked.

Signed-off-by: Robie Basak <rb-oss-1@justgohome.co.uk>
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:09 -08:00
Robie Basak
6d97b53e92 [IrDA]: Resend frames on timeout.
When final timer expires, it might also mean that the i:cmd wasn't
received properly. If we have rejected frames, we can try to resend them.

Signed-off-by: Robie Basak <rb-oss-1@justgohome.co.uk>
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:08 -08:00
Denis V. Lunev
775516bfa2 [NETNS]: Namespace stop vs 'ip r l' race.
During network namespace stop process kernel side netlink sockets
belonging to a namespace should be closed. They should not prevent
namespace to stop, so they do not increment namespace usage
counter. Though this counter will be put during last sock_put.

The raplacement of the correct netns for init_ns solves the problem
only partial as socket to be stoped until proper stop is a valid
netlink kernel socket and can be looked up by the user processes. This
is not a problem until it resides in initial namespace (no processes
inside this net), but this is not true for init_net.

So, hold the referrence for a socket, remove it from lookup tables and
only after that change namespace and perform a last put.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Tested-by: Alexey Dobriyan <adobriyan@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:08 -08:00
Denis V. Lunev
b7c6ba6eb1 [NETNS]: Consolidate kernel netlink socket destruction.
Create a specific helper for netlink kernel socket disposal. This just
let the code look better and provides a ground for proper disposal
inside a namespace.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Tested-by: Alexey Dobriyan <adobriyan@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:07 -08:00