mirror_ubuntu-kernels

mirror of https://git.proxmox.com/git/mirror_ubuntu-kernels.git synced 2026-01-20 18:25:47 +00:00

Author	SHA1	Message	Date
Linus Torvalds	8a84e01e14	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Fix BUG when decrypting empty packets in mac80211, from Ronald Wahl. 2) nf_nat_range is not fully initialized and this is copied back to userspace, from Daniel Borkmann. 3) Fix read past end of b uffer in netfilter ipset, also from Dan Carpenter. 4) Signed integer overflow in ipv4 address mask creation helper inet_make_mask(), from Vincent BENAYOUN. 5) VXLAN, be2net, mlx4_en, and qlcnic need ->ndo_gso_check() methods to properly describe the device's capabilities, from Joe Stringer. 6) Fix memory leaks and checksum miscalculations in openvswitch, from Pravin B SHelar and Jesse Gross. 7) FIB rules passes back ambiguous error code for unreachable routes, making behavior confusing for userspace. Fix from Panu Matilainen. 8) ieee802154fake_probe() doesn't release resources properly on error, from Alexey Khoroshilov. 9) Fix skb_over_panic in add_grhead(), from Daniel Borkmann. 10) Fix access of stale slave pointers in bonding code, from Nikolay Aleksandrov. 11) Fix stack info leak in PPP pptp code, from Mathias Krause. 12) Cure locking bug in IPX stack, from Jiri Bohac. 13) Revert SKB fclone memory freeing optimization that is racey and can allow accesses to freed up memory, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (71 commits) tcp: Restore RFC5961-compliant behavior for SYN packets net: Revert "net: avoid one atomic operation in skb_clone()" virtio-net: validate features during probe cxgb4 : Fix DCB priority groups being returned in wrong order ipx: fix locking regression in ipx_sendmsg and ipx_recvmsg openvswitch: Don't validate IPv6 label masks. pptp: fix stack info leak in pptp_getname() brcmfmac: don't include linux/unaligned/access_ok.h cxgb4i : Don't block unload/cxgb4 unload when remote closes TCP connection ipv6: delete protocol and unregister rtnetlink when cleanup net/mlx4_en: Add VXLAN ndo calls to the PF net device ops too bonding: fix curr_active_slave/carrier with loadbalance arp monitoring mac80211: minstrel_ht: fix a crash in rate sorting vxlan: Inline vxlan_gso_check(). can: m_can: update to support CAN FD features can: m_can: fix incorrect error messages can: m_can: add missing delay after setting CCCR_INIT bit can: m_can: fix not set can_dlc for remote frame can: m_can: fix possible sleep in napi poll can: m_can: add missing message RAM initialization ...	2014-11-21 17:20:36 -08:00
David S. Miller	53b15ef3c2	Merge tag 'master-2014-11-20' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== pull request: wireless-next 2014-11-21 Please pull this batch of updates intended for the 3.19 stream... For the mac80211 bits, Johannes says: "It has been a while since my last pull request, so we accumulated another relatively large set of changes: * TDLS off-channel support set from Arik/Liad, with some support patches I did * custom regulatory fixes from Arik * minstrel VHT fix (and a small optimisation) from Felix * add back radiotap vendor namespace support (myself) * random MAC address scanning for cfg80211/mac80211/hwsim (myself) * CSA improvements (Luca) * WoWLAN Net Detect (wake on network found) support (Luca) * and lots of other smaller changes from many people" For the Bluetooth bits, Johan says: "Here's another set of patches for 3.19. Most of it is again fixes and cleanups to ieee802154 related code from Alexander Aring. We've also got better handling of hardware error events along with a proper API for HCI drivers to notify the HCI core of such situations. There's also a minor fix for mgmt events as well as a sparse warning fix. The code for sending HCI commands synchronously also gets a fix where we might loose the completion event in the case of very fast HW (particularly easily reproducible with an emulated HCI device)." And... "Here's another bluetooth-next pull request for 3.19. We've got: - Various fixes, cleanups and improvements to ieee802154/mac802154 - Support for a Broadcom BCM20702A1 variant - Lots of lockdep fixes - Fixed handling of LE CoC errors that should trigger SMP" For the Atheros bits, Kalle says: "One ath6kl patch and rest for ath10k, but nothing really major which stands out. Most notable: o fix resume (Bartosz) o firmware restart is now faster and more reliable (Michal) o it's now possible to test hardware restart functionality without crashing the firmware using hw-restart parameter with simulate_fw_crash debugfs file (Michal)" On top of that...both ath9k and mwifiex get their usual level of updates. Of note is the ath9k spectral scan work from Oleksij Rempel. I also pulled from the wireless tree in order to avoid some merge issues. Please let me know if there are problems! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 16:39:45 -05:00
Richard Cochran	37dd9255b2	vlan: Pass ethtool get_ts_info queries to real device. Commit `a6111d3c` "vlan: Pass SIOC[SG]HWTSTAMP ioctls to real device" intended to enable hardware time stamping on VLAN interfaces, but passing SIOCSHWTSTAMP is only half of the story. This patch adds the second half, by letting user space find out the time stamping capabilities of the device backing a VLAN interface. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 15:35:53 -05:00
Calvin Owens	0c228e833c	tcp: Restore RFC5961-compliant behavior for SYN packets Commit `c3ae62af8e` ("tcp: should drop incoming frames without ACK flag set") was created to mitigate a security vulnerability in which a local attacker is able to inject data into locally-opened sockets by using TCP protocol statistics in procfs to quickly find the correct sequence number. This broke the RFC5961 requirement to send a challenge ACK in response to spurious RST packets, which was subsequently fixed by commit `7b514a886b` ("tcp: accept RST without ACK flag"). Unfortunately, the RFC5961 requirement that spurious SYN packets be handled in a similar manner remains broken. RFC5961 section 4 states that: ... the handling of the SYN in the synchronized state SHOULD be performed as follows: 1) If the SYN bit is set, irrespective of the sequence number, TCP MUST send an ACK (also referred to as challenge ACK) to the remote peer: <SEQ=SND.NXT><ACK=RCV.NXT><CTL=ACK> After sending the acknowledgment, TCP MUST drop the unacceptable segment and stop processing further. By sending an ACK, the remote peer is challenged to confirm the loss of the previous connection and the request to start a new connection. A legitimate peer, after restart, would not have a TCB in the synchronized state. Thus, when the ACK arrives, the peer should send a RST segment back with the sequence number derived from the ACK field that caused the RST. This RST will confirm that the remote peer has indeed closed the previous connection. Upon receipt of a valid RST, the local TCP endpoint MUST terminate its connection. The local TCP endpoint should then rely on SYN retransmission from the remote end to re-establish the connection. This patch lets SYN packets through the discard added in `c3ae62af8e`, so that spurious SYN packets are properly dealt with as per the RFC. The challenge ACK is sent unconditionally and is rate-limited, so the original vulnerability is not reintroduced by this patch. Signed-off-by: Calvin Owens <calvinowens@fb.com> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 15:33:50 -05:00
Eric Dumazet	e7820e39b7	net: Revert "net: avoid one atomic operation in skb_clone()" Not sure what I was thinking, but doing anything after releasing a refcount is suicidal or/and embarrassing. By the time we set skb->fclone to SKB_FCLONE_FREE, another cpu could have released last reference and freed whole skb. We potentially corrupt memory or trap if CONFIG_DEBUG_PAGEALLOC is set. Reported-by: Chris Mason <clm@fb.com> Fixes: `ce1a4ea3f1` ("net: avoid one atomic operation in skb_clone()") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 15:26:32 -05:00
Richard Alpe	1593123a6a	tipc: add name table dump to new netlink api Add TIPC_NL_NAME_TABLE_GET command to the new tipc netlink API. This command supports dumping the name table of all nodes. Netlink logical layout of name table response message: -> name table -> publication -> type -> lower -> upper -> scope -> node -> ref -> key Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 15:01:32 -05:00
Richard Alpe	27c2141672	tipc: add net set to new netlink api Add TIPC_NL_NET_SET command to the new tipc netlink API. This command can set the network id and network (tipc) address. Netlink logical layout of network set message: -> net [ -> id ] [ -> address ] Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 15:01:31 -05:00
Richard Alpe	fd3cf2ad51	tipc: add net dump to new netlink api Add TIPC_NL_NET_GET command to the new tipc netlink API. This command dumps the network id of the node. Netlink logical layout of returned network data: -> net -> id Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 15:01:31 -05:00
Richard Alpe	3e4b6ab58d	tipc: add node get/dump to new netlink api Add TIPC_NL_NODE_GET to the new tipc netlink API. This command can dump the address and node status of all nodes in the tipc cluster. Netlink logical layout of returned node/address data: -> node -> address -> up flag Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 15:01:31 -05:00
Richard Alpe	1e55417d8f	tipc: add media set to new netlink api Add TIPC_NL_MEDIA_SET command to the new tipc netlink API. This command can set one or more link properties for a particular media. Netlink logical layout of bearer set message: -> media -> name -> link properties [ -> tolerance ] [ -> priority ] [ -> window ] Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 15:01:31 -05:00
Richard Alpe	46f15c6794	tipc: add media get/dump to new netlink api Add TIPC_NL_MEDIA_GET command to the new tipc netlink API. This command supports dumping all information about all defined media as well as getting all information about a specific media. The information about a media includes name and link properties. Netlink logical layout of media get response message: -> media -> name -> link properties -> tolerance -> priority -> window Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 15:01:31 -05:00
Richard Alpe	ae36342b50	tipc: add link stat reset to new netlink api Add TIPC_NL_LINK_RESET_STATS command to the new netlink API. This command resets the link statistics for a particular link. Netlink logical layout of link reset message: -> link -> name Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 15:01:31 -05:00
Richard Alpe	f96ce7a20d	tipc: add link set to new netlink api Add TIPC_NL_LINK_SET to the new tipc netlink API. This command can set one or more link properties for a particular link. Netlink logical layout of link set message: -> link -> name -> properties [ -> tolerance ] [ -> priority ] [ -> window ] Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 15:01:30 -05:00
Richard Alpe	7be57fc691	tipc: add link get/dump to new netlink api Add TIPC_NL_LINK_GET command to the new tipc netlink API. This command supports dumping all information about all links (including the broadcast link) or getting all information about a specific link (not the broadcast link). The information about a link includes name, transmission info, properties and link statistics. As the tipc broadcast link is special we unfortunately have to treat it specially. It is a deliberate decision not to abstract the broadcast link on this (API) level. Netlink logical layout of link response message: -> port -> name -> MTU -> RX -> TX -> up flag -> active flag -> properties -> priority -> tolerance -> window -> statistics -> rx_info -> rx_fragments -> rx_fragmented -> rx_bundles -> rx_bundled -> tx_info -> tx_fragments -> tx_fragmented -> tx_bundles -> tx_bundled -> msg_prof_tot -> msg_len_cnt -> msg_len_tot -> msg_len_p0 -> msg_len_p1 -> msg_len_p2 -> msg_len_p3 -> msg_len_p4 -> msg_len_p5 -> msg_len_p6 -> rx_states -> rx_probes -> rx_nacks -> rx_deferred -> tx_states -> tx_probes -> tx_nacks -> tx_acks -> retransmitted -> duplicates -> link_congs -> max_queue -> avg_queue Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 15:01:30 -05:00
Richard Alpe	1a1a143daf	tipc: add publication dump to new netlink api Add TIPC_NL_PUBL_GET command to the new tipc netlink API. This command supports dumping of all publications for a specific socket. Netlink logical layout of request message: -> socket -> reference Netlink logical layout of response message: -> publication -> type -> lower -> upper Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 15:01:30 -05:00
Richard Alpe	34b78a127c	tipc: add sock dump to new netlink api Add TIPC_NL_SOCK_GET command to the new tipc netlink API. This command supports dumping of all available sockets with their associated connection or publication(s). It could be extended to reply with a single socket if the NLM_F_DUMP isn't set. The information about a socket includes reference, address, connection information / publication information. Netlink logical layout of response message: -> socket -> reference -> address [ -> connection -> node -> socket [ -> connected flag -> type -> instance ] ] [ -> publication flag ] Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 15:01:30 -05:00
Richard Alpe	315c00bc9f	tipc: add bearer set to new netlink api Add TIPC_NL_BEARER_SET command to the new tipc netlink API. This command can set one or more link properties for a particular bearer. Netlink logical layout of bearer set message: -> bearer -> name -> link properties [ -> tolerance ] [ -> priority ] [ -> window ] Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 15:01:30 -05:00
Richard Alpe	35b9dd7607	tipc: add bearer get/dump to new netlink api Add TIPC_NL_BEARER_GET command to the new tipc netlink API. This command supports dumping all data about all bearers or getting all information about a specific bearer. The information about a bearer includes name, link priorities and domain. Netlink logical layout of bearer get message: -> bearer -> name Netlink logical layout of returned bearer information: -> bearer -> name -> link properties -> priority -> tolerance -> window -> domain Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 15:01:29 -05:00
Richard Alpe	0655f6a863	tipc: add bearer disable/enable to new netlink api A new netlink API for tipc that can disable or enable a tipc bearer. The new API is separated from the old API because of a bug in the user space client (tipc-config). The problem is that older versions of tipc-config has a very low receive limit and adding commands to the legacy genl_opts struct causes the ctrl_getfamily() response message to grow, subsequently breaking the tool. The new API utilizes netlink policies for input validation. Where the top-level netlink attributes are tipc-logical entities, like bearer. The top level entities then contain nested attributes. In this case a name, nested link properties and a domain. Netlink commands implemented in this patch: TIPC_NL_BEARER_ENABLE TIPC_NL_BEARER_DISABLE Netlink logical layout of bearer enable message: -> bearer -> name [ -> domain ] [ -> properties -> priority ] Netlink logical layout of bearer disable message: -> bearer -> name Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 15:01:29 -05:00
Daniel Borkmann	f869c91286	net: sctp: keep owned chunk in destructor_arg instead of skb->cb It's just silly to hold the skb destructor argument around inside skb->cb[] as we currently do in SCTP. Nowadays, we're sort of cheating on data accounting in the sense that due to commit `4c3a5bdae2` ("sctp: Don't charge for data in sndbuf again when transmitting packet"), we orphan the skb already in the SCTP output path, i.e. giving back charged data memory, and use a different destructor only to make sure the sk doesn't vanish on skb destruction time. Thus, cb[] is still valid here as we operate within the SCTP layer. (It's generally actually a big candidate for future rework, imho.) However, storing the destructor in the cb[] can easily cause issues should an non sctp_packet_set_owner_w()'ed skb ever escape the SCTP layer, since cb[] may get overwritten by lower layers and thus can corrupt the chunk pointer. There are no such issues at present, but lets keep the chunk in destructor_arg, as this is the actual purpose for it. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 14:46:12 -05:00
Willem de Bruijn	9c7077622d	packet: make packet_snd fail on len smaller than l2 header When sending packets out with PF_PACKET, SOCK_RAW, ensure that the packet is at least as long as the device's expected link layer header. This check already exists in tpacket_snd, but not in packet_snd. Also rate limit the warning in tpacket_snd. Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 14:43:07 -05:00
Jiri Pirko	c7e2b9689e	sched: introduce vlan action This tc action allows to work with vlan tagged skbs. Two supported sub-actions are header pop and header push. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 14:20:18 -05:00
Jiri Pirko	93515d53b1	net: move vlan pop/push functions into common code So it can be used from out of openvswitch code. Did couple of cosmetic changes on the way, namely variable naming and adding support for 8021AD proto. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 14:20:18 -05:00
Jiri Pirko	e21951212f	net: move make_writable helper into common code note that skb_make_writable already exists in net/netfilter/core.c but does something slightly different. Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 14:20:17 -05:00
Jiri Pirko	5968250c86	vlan: introduce *vlan_hwaccel_push_inside helpers Use them to push skb->vlan_tci into the payload and avoid code duplication. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 14:20:17 -05:00
Jiri Pirko	62749e2cb3	vlan: rename __vlan_put_tag to vlan_insert_tag_set_proto Name fits better. Plus there's going to be introduced __vlan_insert_tag later on. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 14:20:17 -05:00
Jiri Pirko	b960a0ac69	vlan: make __vlan_hwaccel_put_tag return void Always returns the same skb it gets, so change to void. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 14:20:16 -05:00
Jiri Pirko	1abcd82c20	openvswitch: actions: use skb_postpull_rcsum when possible Replace duplicated code by calling skb_postpull_rcsum Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 14:20:16 -05:00
Alexander Couzens	fe1591224a	l2tp_eth: allow to set a specific mac address Signed-off-by: Alexander Couzens <lynxis@fe80.eu> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 14:16:38 -05:00
David S. Miller	7e09dccd07	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains two bugfixes for your net tree, they are: 1) Validate netlink group from nfnetlink to avoid an out of bound array access. This should only happen with superuser priviledges though. Discovered by Andrey Ryabinin using trinity. 2) Don't push ethernet header before calling the netfilter output hook for multicast traffic, this breaks ebtables since it expects to see skb->data pointing to the network header, patch from Linus Luessing. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 00:12:39 -05:00
David S. Miller	c857781900	Merge tag 'master-2014-11-20' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== pull request: wireless 2014-11-20 Please full this little batch of fixes intended for the 3.18 stream! For the mac80211 patch, Johannes says: "Here's another last minute fix, for minstrel HT crashing depending on the value of some uninitialised stack." On top of that... Ben Greear fixes an ath9k regression in which a BSSID mask is miscalculated. Dmitry Torokhov corrects an error handling routing in brcmfmac which was checking an unsigned variable for a negative value. Johannes Berg avoids a build problem in brcmfmac for arches where linux/unaligned/access_ok.h and asm/unaligned.h conflict. Mathy Vanhoef addresses another brcmfmac issue so as to eliminate a use-after-free of the URB transfer buffer if a timeout occurs. Please let me know if there are problems! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 00:07:51 -05:00
Jiri Bohac	01462405f0	ipx: fix locking regression in ipx_sendmsg and ipx_recvmsg This fixes an old regression introduced by commit `b0d0d915` (ipx: remove the BKL). When a recvmsg syscall blocks waiting for new data, no data can be sent on the same socket with sendmsg because ipx_recvmsg() sleeps with the socket locked. This breaks mars-nwe (NetWare emulator): - the ncpserv process reads the request using recvmsg - ncpserv forks and spawns nwconn - ncpserv calls a (blocking) recvmsg and waits for new requests - nwconn deadlocks in sendmsg on the same socket Commit `b0d0d915` has simply replaced BKL locking with lock_sock/release_sock. Unlike now, BKL got unlocked while sleeping, so a blocking recvmsg did not block a concurrent sendmsg. Only keep the socket locked while actually working with the socket data and release it prior to calling skb_recv_datagram(). Signed-off-by: Jiri Bohac <jbohac@suse.cz> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-20 22:57:03 -05:00
Joe Stringer	d3052bb5d3	openvswitch: Don't validate IPv6 label masks. When userspace doesn't provide a mask, OVS datapath generates a fully unwildcarded mask for the flow by copying the flow and setting all bits in all fields. For IPv6 label, this creates a mask that matches on the upper 12 bits, causing the following error: openvswitch: netlink: Invalid IPv6 flow label value (value=ffffffff, max=fffff) This patch ignores the label validation check for masks, avoiding this error. Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-20 22:56:13 -05:00
John W. Linville	9a638ddfb0	It has been a while since my last pull request, so we accumulated another relatively large set of changes: * TDLS off-channel support set from Arik/Liad, with some support patches I did * custom regulatory fixes from Arik * minstrel VHT fix (and a small optimisation) from Felix * add back radiotap vendor namespace support (myself) * random MAC address scanning for cfg80211/mac80211/hwsim (myself) * CSA improvements (Luca) * WoWLAN Net Detect (wake on network found) support (Luca) * and lots of other smaller changes from many people -----BEGIN PGP SIGNATURE----- iQIcBAABCAAGBQJUbghkAAoJEDBSmw7B7bqroosP/RABYXUMua+k3Ccq7T4eU4jV AEO2p2gt5nHBzEl1NCJtdUTJkZ6ftT7ehvAkV2zboB0FBoAoBTbZ8YDtcBBiWaY6 wJ5TYuOl6LFo7csAxWxpCzUxW3M+iq26itpyW9Zt9WWxP4QLSNPFyXEXV3SEh45n HcDW9A0SE+mgdaTQ2LEMBJ5XWxG/Ic7i9Xn6Py3o4x7NsTB4EqFNOD0WXcPCq7M0 H8xlsIYIBYoGNMsV/2Nu7CEgcSXfDLqWcs9uPHQMCvWPjx/vIoEyOgTwJlE9bQHh 2tloc1LBP6XKQ6g2bJ/pBaQVnZGugcOJhD6KUq3ckNm9qIP1ZtRmJslH4V6pUSCS eGl3TcAKSSE4BWIa7AaETWXeoH4X68dO7PM7pnflQRCQLzCJRbDWCdqjBst/AxBT 6hvAFAvExEcWBkNVSTJ2egRk/C9cDFKRaCWQ1h4wX9yvh+8efe1D0DDWLW9a9qv1 LsoGJE72BZdXn2CaQEME+CjTd3fWmn6u729d/c863cq2kspCSOof0QD0X9uWhBUx BvqtgbQjGZzAvHFcjBd7yRd5hz0aDfLyBL59bq2IBzaU1QmyekNPqzSMSD+5ZlCp uxEeE5AY2+pcNZV1KRtkvgAByfUgAVd0FHZcVb8SIM6QZ3IhqiOuzxuXtxv6hrYP V/76B+ath4Sv1IPF56ex =q4e6 -----END PGP SIGNATURE----- Merge tag 'mac80211-next-for-john-2014-11-20' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg <johannes@sipsolutions.net> says: "It has been a while since my last pull request, so we accumulated another relatively large set of changes: * TDLS off-channel support set from Arik/Liad, with some support patches I did * custom regulatory fixes from Arik * minstrel VHT fix (and a small optimisation) from Felix * add back radiotap vendor namespace support (myself) * random MAC address scanning for cfg80211/mac80211/hwsim (myself) * CSA improvements (Luca) * WoWLAN Net Detect (wake on network found) support (Luca) * and lots of other smaller changes from many people" Signed-off-by: John W. Linville <linville@tuxdriver.com>	2014-11-20 16:09:30 -05:00
Marcelo Leitner	beacd3e8ef	netfilter: nfnetlink_log: Make use of pr_fmt where applicable Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-11-20 14:09:01 +01:00
Markus Elfring	982f405136	netfilter: Deletion of unnecessary checks before two function calls The functions free_percpu() and module_put() test whether their argument is NULL and then return immediately. Thus the test around the call is not needed. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Acked-by: Julian Anastasov <ja@ssi.bg> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-11-20 13:08:43 +01:00
Steffen Klassert	fbe68ee875	vti6: Add a lookup method for tunnels with wildcard endpoints. Currently we can't lookup tunnels with wildcard endpoints. This patch adds a method to lookup these tunnels in the receive path. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-11-20 10:03:07 +01:00
Duan Jiong	ffb1388a36	ipv6: delete protocol and unregister rtnetlink when cleanup pim6_protocol was added when initiation, but it not deleted. Similarly, unregister RTNL_FAMILY_IP6MR rtnetlink. Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Reviewed-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-19 16:56:17 -05:00
Al Viro	08adb7dabd	fold verify_iovec() into copy_msghdr_from_user() ... and do the same on the compat side of things. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-11-19 16:23:49 -05:00
Al Viro	0844932009	{compat_,}verify_iovec(): switch to generic copying of iovecs use {compat_,}rw_copy_check_uvector(). As the result, we are guaranteed that all iovecs seen in ->msg_iov by ->sendmsg() and ->recvmsg() will pass access_ok(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-11-19 16:23:16 -05:00
Al Viro	666547ff59	separate kernel- and userland-side msghdr Kernel-side struct msghdr is (currently) using the same layout as userland one, but it's not a one-to-one copy - even without considering 32bit compat issues, we have msg_iov, msg_name and msg_control copied to kernel[1]. It's fairly localized, so we get away with a few functions where that knowledge is needed (and we could shrink that set even more). Pretty much everything deals with the kernel-side variant and the few places that want userland one just use a bunch of force-casts to paper over the differences. The thing is, kernel-side definition of struct msghdr is not exposed in include/uapi - libc doesn't see it, etc. So we can add struct user_msghdr, with proper annotations and let the few places that ever deal with those beasts use it for userland pointers. Saner typechecking aside, that will allow to change the layout of kernel-side msghdr - e.g. replace msg_iov/msg_iovlen there with struct iov_iter, getting rid of the need to modify the iovec as we copy data to/from it, etc. We could introduce kernel_msghdr instead, but that would create much more noise - the absolute majority of the instances would need to have the type switched to kernel_msghdr and definition of struct msghdr in include/linux/socket.h is not going to be seen by userland anyway. This commit just introduces user_msghdr and switches the few places that are dealing with userland-side msghdr to it. [1] actually, it's even trickier than that - we copy msg_control for sendmsg, but keep the userland address on recvmsg. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-11-19 16:22:59 -05:00
John W. Linville	6158fb37d1	Here's another last minute fix, for minstrel HT crashing depending on the value of some uninitialised stack. -----BEGIN PGP SIGNATURE----- iQIcBAABCAAGBQJUa70EAAoJEDBSmw7B7bqr40gQAMgNUKILuYTyHZaKR7teV8aG uVFzrMdmNjCZIsxAuH1X8VNOJZEAB+Ro+lTybNjC/tTPyA+Yu3lDCQjSYiE8UOFk mDrNFG0tjZHGn3axceLIhujfPqXEks/wQ98tDEJU2eMLxS0nClwyTXjhc1IxzUBO /L2HlhGWADRZ6FIV7yhRO33KN1l3Q8oxW79ybJq/5JCcclfGBt3OpYvi7AW6M4iK qfgTfS6GbpcRRc4+HVA/cgYOFaRmMI71bvqMgvh8MgmqfQeRw+WSOaUiLx795CsR OKrIQgdpVp2JHwHkFXIgBy+EBBbXtCozvxKX3u+YKPabj4GC/6dkT370nuvv9hkz IzTxt+yNBTqJawum7vM6WypgT4wCZOR0JbY0vDW5jYLJnDfGJkYFbny7WtRtCVf7 xWzIixkyWLNFs6swbRdIKoU5GNw2fNe7az4fRoUqNtZGSPHiK7oD5sEkr81hWUV5 5XsifngqekfdVvIGTckpy2SAikxgkrtHOQFIGJz+PN1CjvDT++66Opr3AHnwWeSP IAjmoVOCiyIBzXlcsZJmAfPa4Hvk+NAPRKCx6N25q/+ESpOZRARWFRV3VuPeGzQW +SiC4V9VYWHvW9VPJ37WkAFYv5tPaLLyURdU7p6MSwuc0RCPgtcvXty2QFbjQLdf k0apC9DZydQPF2kre6Pv =L2Jb -----END PGP SIGNATURE----- Merge tag 'mac80211-for-john-2014-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg <johannes@sipsolutions.net> says: "Here's another last minute fix, for minstrel HT crashing depending on the value of some uninitialised stack." Signed-off-by: John W. Linville <linville@tuxdriver.com>	2014-11-19 15:44:40 -05:00
John W. Linville	ab1f5a532c	Merge commit '4e6ce4dc7ce71d0886908d55129d5d6482a27ff9' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless	2014-11-19 15:38:48 -05:00
Markus Elfring	fcd4d35ecc	netlink: Deletion of an unnecessary check before the function call "__module_get" The __module_get() function tests whether its argument is NULL and then returns immediately. Thus the test around the call is not needed. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-19 15:27:40 -05:00
Markus Elfring	ef87c5d6a1	net: pktgen: Deletion of an unnecessary check before the function call "proc_remove" The proc_remove() function tests whether its argument is NULL and then returns immediately. Thus the test around the call is not needed. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-19 15:20:15 -05:00
Eric Dumazet	355a901e6c	tcp: make connect() mem charging friendly While working on sk_forward_alloc problems reported by Denys Fedoryshchenko, we found that tcp connect() (and fastopen) do not call sk_wmem_schedule() for SYN packet (and/or SYN/DATA packet), so sk_forward_alloc is negative while connect is in progress. We can fix this by calling regular sk_stream_alloc_skb() both for the SYN packet (in tcp_connect()) and the syn_data packet in tcp_send_syn_data() Then, tcp_send_syn_data() can avoid copying syn_data as we simply can manipulate syn_data->cb[] to remove SYN flag (and increment seq) Instead of open coding memcpy_fromiovecend(), simply use this helper. This leaves in socket write queue clean fast clone skbs. This was tested against our fastopen packetdrill tests. Reported-by: Denys Fedoryshchenko <nuclearcat@nuclearcat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-19 14:57:01 -05:00
Felix Fietkau	75769c80e3	mac80211: minstrel_ht: add a small optimization to minstrel_aggr_check Check the queue mapping earlier, skb->queue_mapping is more likely than skb->data to still be in d-cache. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-19 19:31:07 +01:00
Johannes Berg	f815e2b3c0	mac80211: notify drivers on sta rate table changes This allows drivers with a firmware or chip-based rate lookup table to use the most recent default rate selection without having to get it from per-packet data or explicit ieee80211_get_tx_rate calls Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-19 19:15:50 +01:00
Tomasz Bursztyka	8f894be2df	nl80211: Broadcast CMD_NEW_INTERFACE and CMD_DEL_INTERFACE Let the other listeners being notified when a new or del interface command has been issued, thus reducing later necessary request to be in sync with current context. Signed-off-by: Tomasz Bursztyka <tomasz.bursztyka@linux.intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-19 19:02:42 +01:00
Jukka Rissanen	18e5ca65e5	nl80211: Replace interface socket owner attribute with more generic one Replace NL80211_ATTR_IFACE_SOCKET_OWNER attribute with more generic NL80211_ATTR_SOCKET_OWNER that can be used with other commands that interface creation. Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-19 18:54:40 +01:00
Rafał Miłecki	d687cbb703	cfg80211: protect fools returning NULL in add_virtual_intf Callback add_virtual_intf is supposed to return ERR_PTR and trying to return NULL results in some "Unable to handle kernel paging request", etc. As it may be complicated to debug & trace, let's catch it (WARN). Signed-off-by: Rafał Miłecki <zajec5@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-19 18:50:34 +01:00
Arik Nemtsov	c7ab508190	cfg80211: explicitly initialize some fields in custom reg path Explicitly initialize the DFS state and beacon found state when handling channels in the custom regulatory path. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Acked-by: Luis R. Rodriguez <mcgrof@suse.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-19 18:49:33 +01:00
Arik Nemtsov	2e18b38fc8	cfg80211: update missing fields in custom regulatory path Some channels fields were not being updated in the custom regulatory path. Update them according to the code in handle_channel(). Signed-off-by: Jonathan Doron <jonathanx.doron@intel.com> Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Acked-by: Luis R. Rodriguez <mcgrof@suse.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-19 18:49:33 +01:00
Felix Fietkau	628c010f1f	mac80211: skip legacy rate mask handling for VHT rates The rate mask code currently assumes that a rate is legacy if IEEE80211_TX_RC_MCS is not set. This might be the cause of bogus VHT rates being reported with minstrel_ht. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-19 18:46:35 +01:00
Eliad Peller	8b1956f041	mac80211: don't allow 40MHz tx rates in case of 20MHz chandef When 20MHz chandef is used, 40MHz rates shouldn't be used (by the rate-control algorithm), even if the sta ht capabilities indicate support for it. Signed-off-by: Eliad Peller <eliadx.peller@intel.com> Singed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-19 18:46:30 +01:00
Johannes Berg	a344d6778a	mac80211: allow drivers to support NL80211_SCAN_FLAG_RANDOM_ADDR Allow drivers to support NL80211_SCAN_FLAG_RANDOM_ADDR with software based scanning and generate a random MAC address for them for every scan request with the flag. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-19 18:46:09 +01:00
Johannes Berg	6ea0a69ca2	mac80211: rcu-ify scan and scheduled scan request pointers In order to use the scan and scheduled scan request pointers during RX to check for randomisation, make them accessible using RCU. Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-19 18:45:58 +01:00
Johannes Berg	ad2b26abc1	cfg80211: allow drivers to support random MAC addresses for scan Add the necessary feature flags and a scan flag to support using random MAC addresses for scan while unassociated. The configuration for this supports an arbitrary MAC address value and mask, so that any kind of configuration (e.g. fixed OUI or full 46-bit random) can be requested. Full 46-bit random is the default when no other configuration is passed. Also add a small helper function to use the addr/mask correctly. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-19 18:45:52 +01:00
Eliad Peller	ff5db4392c	mac80211: remove redundant check local->scan_req was tested in the previous line, so it can't be NULL. Signed-off-by: Eliad Peller <eliadx.peller@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-19 18:45:48 +01:00
Luciano Coelho	8cd4d4563e	cfg80211: add wowlan net-detect support Add a new WoWLAN API to enable net-detect as a wake up trigger. Net-detect allows the device to scan in the background while the host is asleep to wake up the host system when a matching network is found. Reuse the scheduled scan attributes to specify how the scan is performed while suspended and the matches that will trigger a wake event. Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-19 18:45:45 +01:00
Luciano Coelho	256da02d18	cfg80211: refactor nl80211_start_sched_scan so it can be reused For net detect, we will need to reuse most of the scheduled scan parsing function, but not all, so split out the attributes parsing part out of the main start sched_scan function. Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-19 18:45:42 +01:00
Liad Kaufman	b6da911b3c	mac80211: synchronously reserve TID per station In TDLS (e.g., TDLS off-channel) there is a requirement for some drivers to supply an unused TID between the AP and the device to the FW, to allow sending PTI requests and to allow the FW to aggregate on a specific TID for better throughput. To ensure that the allocated TID is indeed unused, this patch introduces an API for blocking the driver from TXing on that TID. Signed-off-by: Liad Kaufman <liad.kaufman@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-19 18:45:36 +01:00
Liad Kaufman	4f9610d528	mac80211: add specific-queue flushing support If the HW supports IEEE80211_HW_QUEUE_CONTROL, allow flushing only specific queues rather than all of them. Signed-off-by: Liad Kaufman <liad.kaufman@intel.com> Signed-off-by: Arik Nemtsov <arik@wizery.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-19 18:45:30 +01:00
Arik Nemtsov	8a4d32f30d	mac80211: add TDLS channel-switch Rx flow When receiving a TDLS channel switch request or response, parse the frame and call a new tdls_recv_channel_switch op in the low level driver with the parsed data. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Arik Nemtsov <arik@wizery.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-19 18:45:26 +01:00
Arik Nemtsov	a7a6bdd067	mac80211: introduce TDLS channel switch ops Implement the cfg80211 TDLS channel switch ops and introduce new mac80211 ones for low-level drivers. Verify low-level driver support for the new ops when using the relevant wiphy feature bit. Also verify the peer supports channel switching before passing the command down. Add a new STA flag to track the off-channel state with the TDLS peer and make sure to cancel the channel-switch if the peer STA is unexpectedly removed. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Arik Nemtsov <arik@wizery.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-19 18:45:21 +01:00
Arik Nemtsov	5383758443	mac80211: add parsing of TDLS specific IEs These are used in TDLS channel switching code. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Arik Nemtsov <arik@wizery.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-19 18:45:16 +01:00
Arik Nemtsov	1057d35ede	cfg80211: introduce TDLS channel switch commands Introduce commands to initiate and cancel TDLS channel-switching. Once TDLS channel-switching is started, the lower level driver is responsible for continually initiating channel-switch operations and returning to the base (AP) channel to listen for beacons from time to time. Upon cancellation of the channel-switch all communication between the relevant TDLS peers will continue on the base channel. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-19 18:45:12 +01:00
Arik Nemtsov	c273390569	mac80211: prepare TDLS mgmt code for channel-switch templates Split the data-generating from the Tx-sending functionality, as we do not want to send templates to the lower driver. Also add an optional chandef argument to the data-generating portion. It will be used for channel-switch templates. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Arik Nemtsov <arik@wizery.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-19 18:45:07 +01:00
Arik Nemtsov	9041c1fa57	mac80211: track AP and peer STA TDLS chan-switch support The AP or peer can prohibit TDLS channel switch via a bit in the extended capabilities IE. Parse the IE and track this bit. Set an appropriate STA flag if both the AP and peer STA support TDLS channel-switching. Add the new STA flag and the missing TDLS_INITIATOR to debugfs. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Arik Nemtsov <arik@wizery.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-19 18:45:04 +01:00
Arik Nemtsov	78632a17ea	cfg/mac80211: define TDLS channel switch feature bit Define some related TDLS protocol constants and advertise channel switch support in the extended-capabilities IE when the feature bit is defined. Actually supporting TDLS channel-switching also requires support for some new nl80211 commands, to be introduced by future patches. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Arik Nemtsov <arik@wizery.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-19 18:44:58 +01:00
Arik Nemtsov	2cedd87960	mac80211: add BSS coex IE to TDLS setup frames Add the BSS coex IE in case we support HT40 channels, as mandated by section 8.5.13 in IEEE802.11 2012. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Arik Nemtsov <arik@wizery.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-19 18:44:52 +01:00
Arik Nemtsov	f0d29cb979	mac80211: add supported channels IE during TDLS setup This information element is mandatory in case TDLS channel-switching is to be supported. The channels given are ones supported and allowed to be active in the current regulatory setting. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Arik Nemtsov <arik@wizery.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-19 18:44:47 +01:00
Johannes Berg	7528ec5776	mac80211: add function to create data frame template including key For some TDLS channel switch implementations data frames need to be sent by the firmware based on a template. This template should be created by mac80211, and thus needs to properly be built from an 802.3 frame into an 802.11 frame. In addition, the device will need the key information so the select_key handler needs to be run. However, the driver/device will be responsible for all of the crypto encapsulation, as the sequence numbers etc. cannot be built by the host anyway in this case since it's a template to be used multiple times. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Arik Nemtsov <arik@wizery.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-19 18:44:43 +01:00
Johannes Berg	4c9451ed94	mac80211: factor out 802.11 header building code Factor out the 802.11 header building code from the xmit function to be able to use it separately in a later commit. While at it, fix up some documentation. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Arik Nemtsov <arik@wizery.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-19 18:44:38 +01:00
Johannes Berg	73c4e195e6	mac80211: move skb info band assignment out Instead of passing the band as a parameter to ieee80211_xmit() and ieee80211_tx(), move it outside of the two functions while making sure info->band is set up before calling them. This removes the parameter and simplifies the follow commit. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Arik Nemtsov <arik@wizery.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-19 18:44:35 +01:00
Liad Kaufman	1277b4a9f5	mac80211: retransmit TDLS teardown packet through AP if not ACKed Since the TDLS peer station might not receive the teardown packet (e.g., when in PS), this makes sure the packet is retransmitted - this time through the AP - if the TDLS peer didn't ACK the packet. Signed-off-by: Liad Kaufman <liad.kaufman@intel.com> Signed-off-by: Arik Nemtsov <arik@wizery.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-19 18:44:29 +01:00
Liad Kaufman	24d342c514	mac80211: add option for setting skb flags before xmit Allows setting of an skb's flags - if needed - when calling ieee80211_subif_start_xmit(). Signed-off-by: Liad Kaufman <liad.kaufman@intel.com> Signed-off-by: Arik Nemtsov <arik@wizery.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-19 18:44:19 +01:00
J. Bruce Fields	56429e9b3b	merge nfs bugfixes into nfsd for-3.19 branch In addition to nfsd bugfixes, there are some fixes in -rc5 for client bugs that can interfere with my testing.	2014-11-19 12:06:30 -05:00
Trond Myklebust	093a1468b6	SUNRPC: Fix locking around callback channel reply receive Both xprt_lookup_rqst() and xprt_complete_rqst() require that you take the transport lock in order to avoid races with xprt_transmit(). Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org Reviewed-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-11-19 12:03:20 -05:00
Johan Hedberg	0378b59770	Bluetooth: Convert link keys list to use RCU This patch converts the hdev->link_keys list to be protected through RCU, thereby eliminating the need to hold the hdev lock while accessing the list. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-19 16:19:47 +01:00
Johan Hedberg	cb6f3f7ace	Bluetooth: Fix setting conn->pending_sec_level value from link key When a connection is requested the conn->pending_sec_level value gets set to whatever level the user requested the connection to be. During the pairing process there are various sanity checks to try to ensure that the right length PIN or right IO Capability is used to satisfy the target security level. However, when we finally get hold of the link key that is to be used we should still set the actual final security level from the key type. This way when we eventually get an Encrypt Change event the correct value gets copied to conn->sec_level. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-19 16:17:32 +01:00
Johan Hedberg	22a3ceabf1	Bluetooth: Fix setting state back to TASK_RUNNING In __hci_cmd_sync_ev() and __hci_req_sync() if the hci_req_run() call fails and we return from the functions we should ensure that the state doesn't remain in TASK_INTERRUPTIBLE that we just set it to. This patch fixes missing calls to set_current_state(TASK_RUNNING) in both places. Reported-by: Kirill A. Shutemov <kirill@shutemov.name> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Tested-by: Kirill A. Shutemov <kirill@shutemov.name> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-19 16:15:55 +01:00
Felix Fietkau	280ba51d60	mac80211: minstrel_ht: fix a crash in rate sorting The commit `5935839ad7` "mac80211: improve minstrel_ht rate sorting by throughput & probability" introduced a crash on rate sorting that occurs when the rate added to the sorting array is faster than all the previous rates. Due to an off-by-one error, it reads the rate index from tp_list[-1], which contains uninitialized stack garbage, and then uses the resulting index for accessing the group rate stats, leading to a crash if the garbage value is big enough. Cc: Thomas Huehn <thomas@net.t-labs.tu-berlin.de> Reported-by: Jouni Malinen <j@w1.fi> Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-18 22:39:16 +01:00
Rick Jones	e3e3217029	icmp: Remove some spurious dropped packet profile hits from the ICMP path If icmp_rcv() has successfully processed the incoming ICMP datagram, we should use consume_skb() rather than kfree_skb() because a hit on the likes of perf -e skb:kfree_skb is not called-for. Signed-off-by: Rick Jones <rick.jones2@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-18 15:28:28 -05:00
Fabian Frederick	54aeba7f06	dev_ioctl: use sizeof(x) instead of sizeof x Also remove spaces after cast. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-18 15:27:32 -05:00
Fabian Frederick	e56f735913	net/core: include linux/types.h instead of asm/types.h Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-18 15:26:32 -05:00
Fabian Frederick	1d2398dc7c	net: fix spelling for synchronized Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-18 15:26:32 -05:00
Fabian Frederick	a77b634367	dccp: spelling s/reseting/resetting Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-18 15:26:32 -05:00
Fabian Frederick	02c31d2e56	dccp: replace min/casting by min_t Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-18 15:26:32 -05:00
Fabian Frederick	54da7996b8	dccp: remove blank lines between function/EXPORT_SYMBOL See Documentation/CodingStyle chapter 6. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-18 15:26:32 -05:00
Fabian Frederick	6d80c4732b	dccp: kerneldoc warning fixes Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-18 15:26:31 -05:00
John W. Linville	f48ecb19bc	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg <johan.hedberg@gmail.com> says: "Here's another bluetooth-next pull request for 3.19. We've got: - Various fixes, cleanups and improvements to ieee802154/mac802154 - Support for a Broadcom BCM20702A1 variant - Lots of lockdep fixes - Fixed handling of LE CoC errors that should trigger SMP" Signed-off-by: John W. Linville <linville@tuxdriver.com>	2014-11-18 15:13:26 -05:00
Johannes Berg	760a52e80f	Merge remote-tracking branch 'wireless-next/master' into mac80211-next This brings in some mwifiex changes that further patches will need to work on top to not cause merge conflicts. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-18 09:32:44 +01:00
Johan Hedberg	76727c02c1	Bluetooth: Call drain_workqueue() before resetting state Doing things like hci_conn_hash_flush() while holding the hdev lock is risky since its synchronous pending work cancellation could cause the L2CAP layer to try to reacquire the hdev lock. Right now there doesn't seem to be any obvious places where this would for certain happen but it's already enough to cause lockdep to start warning against the hdev and the work struct locks being taken in the "wrong" order: [ +0.000373] mgmt-tester/1603 is trying to acquire lock: [ +0.000292] ((&conn->pending_rx_work)){+.+.+.}, at: [<c104266d>] flush_work+0x0/0x181 [ +0.000270] but task is already holding lock: [ +0.000000] (&hdev->lock){+.+.+.}, at: [<c13b9a80>] hci_dev_do_close+0x166/0x359 [ +0.000000] which lock already depends on the new lock. [ +0.000000] the existing dependency chain (in reverse order) is: [ +0.000000] -> #1 (&hdev->lock){+.+.+.}: [ +0.000000] [<c105ea8f>] lock_acquire+0xe3/0x156 [ +0.000000] [<c140c663>] mutex_lock_nested+0x54/0x375 [ +0.000000] [<c13d644b>] l2cap_recv_frame+0x293/0x1a9c [ +0.000000] [<c13d7ca4>] process_pending_rx+0x50/0x5e [ +0.000000] [<c1041a3f>] process_one_work+0x21c/0x436 [ +0.000000] [<c1041e3d>] worker_thread+0x1be/0x251 [ +0.000000] [<c1045a22>] kthread+0x94/0x99 [ +0.000000] [<c140f801>] ret_from_kernel_thread+0x21/0x30 [ +0.000000] -> #0 ((&conn->pending_rx_work)){+.+.+.}: [ +0.000000] [<c105e158>] __lock_acquire+0xa07/0xc89 [ +0.000000] [<c105ea8f>] lock_acquire+0xe3/0x156 [ +0.000000] [<c1042696>] flush_work+0x29/0x181 [ +0.000000] [<c1042864>] __cancel_work_timer+0x76/0x8f [ +0.000000] [<c104288c>] cancel_work_sync+0xf/0x11 [ +0.000000] [<c13d4c18>] l2cap_conn_del+0x72/0x183 [ +0.000000] [<c13d8953>] l2cap_disconn_cfm+0x49/0x55 [ +0.000000] [<c13be37a>] hci_conn_hash_flush+0x7a/0xc3 [ +0.000000] [<c13b9af6>] hci_dev_do_close+0x1dc/0x359 [ +0.012038] [<c13bbe38>] hci_unregister_dev+0x6e/0x1a3 [ +0.000000] [<c12d33c1>] vhci_release+0x28/0x47 [ +0.000000] [<c10dd6a9>] __fput+0xd6/0x154 [ +0.000000] [<c10dd757>] ____fput+0xd/0xf [ +0.000000] [<c1044bb2>] task_work_run+0x6b/0x8d [ +0.000000] [<c1001bd2>] do_notify_resume+0x3c/0x3f [ +0.000000] [<c140fa70>] work_notifysig+0x29/0x31 [ +0.000000] other info that might help us debug this: [ +0.000000] Possible unsafe locking scenario: [ +0.000000] CPU0 CPU1 [ +0.000000] ---- ---- [ +0.000000] lock(&hdev->lock); [ +0.000000] lock((&conn->pending_rx_work)); [ +0.000000] lock(&hdev->lock); [ +0.000000] lock((&conn->pending_rx_work)); [ +0.000000] * DEADLOCK * Fully fixing this would require some quite heavy refactoring to change how the hdev lock and hci_conn instances are handled together. A simpler solution for now which this patch takes is to try ensure that the hdev workqueue is empty before proceeding with the various cleanup calls, including hci_conn_hash_flush(). Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-18 08:32:08 +01:00
Johan Hedberg	38da170306	Bluetooth: Use shorter "rand" name for "randomizer" The common short form of "randomizer" is "rand" in many places (including the Bluetooth specification). The shorter version also makes for easier to read code with less forced line breaks. This patch renames all occurences of "randomizer" to "rand" in the Bluetooth subsystem code. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-18 01:53:15 +01:00
Johan Hedberg	c19a495c8b	Bluetooth: Fix BR/EDR-only address checks for remote OOB data For now the mgmt commands dealing with remote OOB data are strictly BR/EDR-only. This patch fixes missing checks for the passed address type so that any non-BR/EDR value triggers the appropriate error response. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-18 01:53:15 +01:00
Vasily Averin	2c7b5d5dac	netfilter: nf_conntrack_h323: lookup route from proper net namespace Signed-off-by: Vasily Averin <vvs@parallels.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-11-17 12:47:14 +01:00
Florian Westphal	e59ea3df3f	netfilter: xt_connlimit: honor conntrack zone if available Currently all the conntrack lookups are done using default zone. In case the skb has a ct attached (e.g. template) we should use this zone for lookups instead. This makes connlimit work with connections assigned to other zones. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-11-17 12:44:20 +01:00
Linus Lüssing	f0b4eeced5	bridge: fix netfilter/NF_BR_LOCAL_OUT for own, locally generated queries Ebtables on the OUTPUT chain (NF_BR_LOCAL_OUT) would not work as expected for both locally generated IGMP and MLD queries. The IP header specific filter options are off by 14 Bytes for netfilter (actual output on interfaces is fine). NF_HOOK() expects the skb->data to point to the IP header, not the ethernet one (while dev_queue_xmit() does not). Luckily there is an br_dev_queue_push_xmit() helper function already - let's just use that. Introduced by `eb1d164143` ("bridge: Add core IGMP snooping support") Ebtables example: $ ebtables -I OUTPUT -p IPv6 -o eth1 --logical-out br0 \ --log --log-level 6 --log-ip6 --log-prefix="~EBT: " -j DROP before (broken): ~EBT: IN= OUT=eth1 MAC source = 02:04:64:a4:39:c2 \ MAC dest = 33:33:00:00:00:01 proto = 0x86dd IPv6 \ SRC=64a4:39c2:86dd:6000:0000:0020:0001:fe80 IPv6 \ DST=0000:0000:0000:0004:64ff:fea4:39c2:ff02, \ IPv6 priority=0x3, Next Header=2 after (working): ~EBT: IN= OUT=eth1 MAC source = 02:04:64:a4:39:c2 \ MAC dest = 33:33:00:00:00:01 proto = 0x86dd IPv6 \ SRC=fe80:0000:0000:0000:0004:64ff:fea4:39c2 IPv6 \ DST=ff02:0000:0000:0000:0000:0000:0000:0001, \ IPv6 priority=0x0, Next Header=0 Signed-off-by: Linus Lüssing <linus.luessing@web.de> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-11-17 12:38:02 +01:00
Pablo Neira Ayuso	97840cb67f	netfilter: nfnetlink: fix insufficient validation in nfnetlink_bind Make sure the netlink group exists, otherwise you can trigger an out of bound array memory access from the netlink_bind() path. This splat can only be triggered only by superuser. [ 180.203600] UBSan: Undefined behaviour in ../net/netfilter/nfnetlink.c:467:28 [ 180.204249] index 9 is out of range for type 'int [9]' [ 180.204697] CPU: 0 PID: 1771 Comm: trinity-main Not tainted 3.18.0-rc4-mm1+ #122 [ 180.205365] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org +04/01/2014 [ 180.206498] 0000000000000018 0000000000000000 0000000000000009 ffff88007bdf7da8 [ 180.207220] ffffffff82b0ef5f 0000000000000092 ffffffff845ae2e0 ffff88007bdf7db8 [ 180.207887] ffffffff8199e489 ffff88007bdf7e18 ffffffff8199ea22 0000003900000000 [ 180.208639] Call Trace: [ 180.208857] dump_stack (lib/dump_stack.c:52) [ 180.209370] ubsan_epilogue (lib/ubsan.c:174) [ 180.209849] __ubsan_handle_out_of_bounds (lib/ubsan.c:400) [ 180.210512] nfnetlink_bind (net/netfilter/nfnetlink.c:467) [ 180.210986] netlink_bind (net/netlink/af_netlink.c:1483) [ 180.211495] SYSC_bind (net/socket.c:1541) Moreover, define the missing nf_tables and nf_acct multicast groups too. Reported-by: Andrey Ryabinin <a.ryabinin@samsung.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-11-17 12:01:13 +01:00
Alexander Aring	ee7b9053bd	ieee802154: fix byteorder for short address and panid This patch changes the byteorder handling for short and panid handling. We now except to get little endian in nl802154 for these attributes. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-17 09:49:17 +01:00
Alexander Aring	cb41c8dd01	ieee802154: rename and move WPAN_NUM_ defines This patch moves the 802.15.4 constraints WPAN_NUM_ defines into "net/ieee802154.h" which should contain all necessary 802.15.4 related information. Also rename these defines to a common name which is IEEE802154_MAX_CHANNEL and IEEE802154_MAX_PAGE. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-17 09:49:17 +01:00
Alexander Aring	b821ecd4c8	ieee802154: add del interface command This patch adds support for deleting a wpan interface via nl802154. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-17 09:49:17 +01:00
Alexander Aring	0e57547eb7	ieee802154: setting extended address while iface add This patch adds support for setting an extended address while registration a new interface. If ieee802154_is_valid_extended_addr getting as parameter and invalid extended address then the perm address is fallback. This is useful to make some default handling while for example default registration of a wpan interface while phy registration. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-17 09:49:16 +01:00
Alexander Aring	f3ea5e4423	ieee802154: add new interface command This patch adds a new nl802154 command for adding a new interface according to a wpan phy via nl802154. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-17 09:49:16 +01:00
Alexander Aring	133d3f3172	mac802154: remove wpan_dev parameter in if_add This parameter was grabbed from wireless implementation with the identically wireless dev struct. We don't need this right now and so we remove it. Maybe we will add it later again if we found any real reason to have such parameter. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-17 09:49:16 +01:00
Alexander Aring	944742a36d	mac802154: use new nl802154 iftype types This patch replace the depracted IEEE802154_DEV to the new introduced NL802154_IFTYPE_NODE types. There is a backwards compatibility to have the identical types for both enum definitions. Also remove some inlcude issue with "linux/nl802154.h", because the export nl_policy inside this header it was always necessary to have an include of "net/rtnetlink.h" before. The reason for this is more complicated. Nevertheless we removed this now, because "linux/nl802154.h" is the depracted netlink interface. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-17 09:49:15 +01:00
Alexander Aring	cd11d935f2	mac802154: remove deprecated linux-zigbee info We don't and we can't name it zigbee anymore. This patch removes deprecated information for project website. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-17 09:49:15 +01:00
Alexander Aring	628b1e1136	mac802154: remove const for non pointer in rdev-ops This patches removes the const keyword in variables which are non pointers. There is no sense to declare call by value parameters as const. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Reported-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-17 09:49:15 +01:00
Alexander Aring	6d5fb87745	mac802154: remove const for non pointer in cfg ops This patches removes the const keyword in variables which are non pointers. There is no sense to declare call by value parameters as const. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Reported-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-17 09:49:15 +01:00
Alexander Aring	29cd54b9bf	mac802154: remove const for non pointer in driver-ops This patches removes the const keyword in variables which are non pointers. There is no sense to declare call by value parameters as const. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Reported-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-17 09:49:14 +01:00
Alexander Aring	197304b7a5	mac802154: remove unused prototypes This patch removes some prototypes which are not used anymore. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-17 09:49:14 +01:00
Daniel Borkmann	feb91a02cc	ipv6: mld: fix add_grhead skb_over_panic for devs with large MTUs It has been reported that generating an MLD listener report on devices with large MTUs (e.g. 9000) and a high number of IPv6 addresses can trigger a skb_over_panic(): skbuff: skb_over_panic: text:ffffffff80612a5d len:3776 put:20 head:ffff88046d751000 data:ffff88046d751010 tail:0xed0 end:0xec0 dev:port1 ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:100! invalid opcode: 0000 [#1] SMP Modules linked in: ixgbe(O) CPU: 3 PID: 0 Comm: swapper/3 Tainted: G O 3.14.23+ #4 [...] Call Trace: <IRQ> [<ffffffff80578226>] ? skb_put+0x3a/0x3b [<ffffffff80612a5d>] ? add_grhead+0x45/0x8e [<ffffffff80612e3a>] ? add_grec+0x394/0x3d4 [<ffffffff80613222>] ? mld_ifc_timer_expire+0x195/0x20d [<ffffffff8061308d>] ? mld_dad_timer_expire+0x45/0x45 [<ffffffff80255b5d>] ? call_timer_fn.isra.29+0x12/0x68 [<ffffffff80255d16>] ? run_timer_softirq+0x163/0x182 [<ffffffff80250e6f>] ? __do_softirq+0xe0/0x21d [<ffffffff8025112b>] ? irq_exit+0x4e/0xd3 [<ffffffff802214bb>] ? smp_apic_timer_interrupt+0x3b/0x46 [<ffffffff8063f10a>] ? apic_timer_interrupt+0x6a/0x70 mld_newpack() skb allocations are usually requested with dev->mtu in size, since commit `72e09ad107` ("ipv6: avoid high order allocations") we have changed the limit in order to be less likely to fail. However, in MLD/IGMP code, we have some rather ugly AVAILABLE(skb) macros, which determine if we may end up doing an skb_put() for adding another record. To avoid possible fragmentation, we check the skb's tailroom as skb->dev->mtu - skb->len, which is a wrong assumption as the actual max allocation size can be much smaller. The IGMP case doesn't have this issue as commit `57e1ab6ead` ("igmp: refine skb allocations") stores the allocation size in the cb[]. Set a reserved_tailroom to make it fit into the MTU and use skb_availroom() helper instead. This also allows to get rid of igmp_skb_size(). Reported-by: Wei Liu <lw1a2.jing@gmail.com> Fixes: `72e09ad107` ("ipv6: avoid high order allocations") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: David L Stevens <david.stevens@oracle.com> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-16 16:55:06 -05:00
Eric Dumazet	960fb622f8	net: provide a per host RSS key generic infrastructure RSS (Receive Side Scaling) typically uses Toeplitz hash and a 40 or 52 bytes RSS key. Some drivers use a constant (and well known key), some drivers use a random key per port, making bonding setups hard to tune. Well known keys increase attack surface, considering that number of queues is usually a power of two. This patch provides infrastructure to help drivers doing the right thing. netdev_rss_key_fill() should be used by drivers to initialize their RSS key, even if they provide ethtool -X support to let user redefine the key later. A new /proc/sys/net/core/netdev_rss_key file can be used to get the host RSS key even for drivers not providing ethtool -x support, in case some applications want to precisely setup flows to match some RX queues. Tested: myhost:~# cat /proc/sys/net/core/netdev_rss_key 11:63:99:bb:79:fb:a5:a7:07:45:b2:20:bf:02:42:2d:08:1a:dd:19:2b:6b:23:ac:56:28:9d:70:c3:ac:e8:16:4b:b7:c1:10:53:a4:78:41:36:40:74:b6:15:ca:27:44:aa:b3:4d:72 myhost:~# ethtool -x eth0 RX flow hash indirection table for eth0 with 8 RX ring(s): 0: 0 1 2 3 4 5 6 7 RSS hash key: 11:63:99:bb:79:fb:a5:a7:07:45:b2:20:bf:02:42:2d:08:1a:dd:19:2b:6b:23:ac:56:28:9d:70:c3:ac:e8:16:4b:b7:c1:10:53:a4:78:41 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-16 15:59:11 -05:00
David S. Miller	c6ab766e81	Merge branch 'net_ovs' of git://git.kernel.org/pub/scm/linux/kernel/git/pshelar/openvswitch Pravin B Shelar says: ==================== Open vSwitch Following fixes are accumulated in ovs-repo. Three of them are related to protocol processing, one is related to memory leak in case of error and one is to fix race. Patch "Validate IPv6 flow key and mask values" has conflicts with net-next, Let me know if you want me to send the patch for net-next. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-16 14:59:01 -05:00
Anish Bhatt	52cff74eef	dcbnl : Disable software interrupts before taking dcb_lock Solves possible lockup issues that can be seen from firmware DCB agents calling into the DCB app api. DCB firmware event queues can be tied in with NAPI so that dcb events are generated in softIRQ context. This can results in calls to dcb_*app() functions which try to take the dcb_lock. If the the event triggers while we also have the dcb_lock because lldpad or some other agent happened to be issuing a get/set command we could see a cpu lockup. This code was not originally written with firmware agents in mind, hence grabbing dcb_lock from softIRQ context was not considered. Signed-off-by: Anish Bhatt <anish@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-16 14:50:52 -05:00
Fabian Frederick	6f2aed6ad7	net: dsa: replace countsize kzalloc by kcalloc kcalloc manages countsizeof overflow. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-16 14:41:39 -05:00
Fabian Frederick	5bc4b46a70	net: dsa: replace countsize kmalloc by kmalloc_array kmalloc_array manages countsizeof overflow. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-16 14:41:39 -05:00
Fabian Frederick	f35423c137	openvswitch: use PTR_ERR_OR_ZERO Signed-off-by: Fabian Frederick <fabf@skynet.be> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-16 14:41:39 -05:00
Holger Brunck	0372bf5c09	tipc: allow one link per bearer to neighboring nodes There is no reason to limit the amount of possible links to a neighboring node to 2. If we have more then two bearers we can also establish more links. Signed-off-by: Holger Brunck <holger.brunck@keymile.com> Reviewed-By: Jon Maloy <jon.maloy@ericsson.com> cc: Ying Xue <ying.xue@windriver.com> cc: Erik Hugne <erik.hugne@ericsson.com> cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-16 14:27:17 -05:00
David S. Miller	f1227c5c1b	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net The following patchset contains Netfilter updates for your net tree, they are: 1) Fix missing initialization of the range structure (allocated in the stack) in nft_masq_{ipv4, ipv6}_eval, from Daniel Borkmann. 2) Make sure the data we receive from userspace contains the req_version structure, otherwise return an error incomplete on truncated input. From Dan Carpenter. 3) Fix handling og skb->sk which may cause incorrect handling of connections from a local process. Via Simon Horman, patch from Calvin Owens. 4) Fix wrong netns in nft_compat when setting target and match params structure. 5) Relax chain type validation in nft_compat that was recently included, this broke the matches that need to be run from the route chain type. Now iptables-test.py automated regression tests report success again and we avoid the only possible problematic case, which is the use of nat targets out of nat chain type. 6) Use match->table to validate the tablename, instead of the match->name. Again patch for nft_compat. 7) Restore the synchronous release of objects from the commit and abort path in nf_tables. This is causing two major problems: splats when using nft_compat, given that matches and targets may sleep and call_rcu is invoked from softirq context. Moreover Patrick reported possible event notification reordering when rules refer to anonymous sets. 8) Fix race condition in between packets that are being confirmed by conntrack and the ctnetlink flush operation. This happens since the removal of the central spinlock. Thanks to Jesper D. Brouer to looking into this. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-16 14:23:56 -05:00
Panu Matilainen	49dd18ba46	ipv4: Fix incorrect error code when adding an unreachable route Trying to add an unreachable route incorrectly returns -ESRCH if if custom FIB rules are present: [root@localhost ~]# ip route add 74.125.31.199 dev eth0 via 1.2.3.4 RTNETLINK answers: Network is unreachable [root@localhost ~]# ip rule add to 55.66.77.88 table 200 [root@localhost ~]# ip route add 74.125.31.199 dev eth0 via 1.2.3.4 RTNETLINK answers: No such process [root@localhost ~]# Commit `83886b6b63` ("[NET]: Change "not found" return value for rule lookup") changed fib_rules_lookup() to use -ESRCH as a "not found" code internally, but for user space it should be translated into -ENETUNREACH. Handle the translation centrally in ipv4-specific fib_lookup(), leaving the DECnet case alone. On a related note, commit `b7a71b51ee` ("ipv4: removed redundant conditional") removed a similar translation from ip_route_input_slow() prematurely AIUI. Fixes: `b7a71b51ee` ("ipv4: removed redundant conditional") Signed-off-by: Panu Matilainen <pmatilai@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-16 14:11:45 -05:00
Ingo Molnar	e9ac5f0fa8	Merge branch 'sched/urgent' into sched/core, to pick up fixes before applying more changes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-11-16 10:50:25 +01:00
Linus Torvalds	1afcb6ed0d	NFS client bugfixes for Linux 3.18 Highlights include: - Stable patches to fix NFSv4.x delegation reclaim error paths - Fix a bug whereby we were advertising NFSv4.1 but using NFSv4.2 features - Fix a use-after-free problem with pNFS block layouts - Fix a memory leak in the pNFS files O_DIRECT code - Replace an intrusive and Oops-prone performance fix in the NFSv4 atomic open code with a safer one-line version and revert the two original patches. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJUZol9AAoJEGcL54qWCgDyNQQQALnngvpPR51BoO/iTz9ruXol fGZy0SRIlTUKm1ArsQsQ+HGbV5K0hgP3Tg+z2AtEEZ8u/2Fi2Bqdl6+eNY12tKHd uUctDdM5TXLrETAn1UULrnd2eX1cvPMBfOlXlAdNHHsGEgC7w7YQ+rzGwnls+HDy LYXzY7Y3jYGdTMaRgZc5YRdtd8JBpCxciRvPEQLDIobwP0JnZC1afTLe1XInqB2I TZ4NTHT+DEWA+Ou1P2deL7+RuJNEAeWWBvULJy76n4BqKvN/HNedOO5HyBYXrwSd 3UX3wbx9CWRxN1F0EqNKxjxZ/597JwqBeNoTDRcofLsqumUfAOtlbym1EahcD3Ls pykopNfgUhGuhxolStmuHdS6CnyQPERpR5lFZcDp7XtcwSq4FcwD8DRzLJMZW5dg N1lkfFlwQN3rqdk/NEHL+IxS41Hlk4HXjMoP6MNbRtqzIN6tW9tvC4MtAWd1aYxO YuUW281pbWxXQ731s0kTIrMUdQ9vGSRBMcbnO9rL3o+xkh8y5SPVkx9lhdhJN0UD VbQ5Ws/xZ54bD1PfyYb+Yx659lI8MSFOsDuMuLmDtfYnVicHwCA3H63StvQ3ihf/ q0gu8Iex9YbNNjf7IfYGuWPmPn3gwPBoURPC0bcZvMPdY6DXodU6Oj4BRTQ5VCie 9N0pt2wp2eRjaSzD7r5A =8YN6 -----END PGP SIGNATURE----- Merge tag 'nfs-for-3.18-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client bugfixes from Trond Myklebust: "Highlights include: - stable patches to fix NFSv4.x delegation reclaim error paths - fix a bug whereby we were advertising NFSv4.1 but using NFSv4.2 features - fix a use-after-free problem with pNFS block layouts - fix a memory leak in the pNFS files O_DIRECT code - replace an intrusive and Oops-prone performance fix in the NFSv4 atomic open code with a safer one-line version and revert the two original patches" * tag 'nfs-for-3.18-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: sunrpc: fix sleeping under rcu_read_lock in gss_stringify_acceptor NFS: Don't try to reclaim delegation open state if recovery failed NFSv4: Ensure that we call FREE_STATEID when NFSv4.x stateids are revoked NFSv4: Fix races between nfs_remove_bad_delegation() and delegation return NFSv4.1: nfs41_clear_delegation_stateid shouldn't trust NFS_DELEGATED_STATE NFSv4: Ensure that we remove NFSv4.0 delegations when state has expired NFS: SEEK is an NFS v4.2 feature nfs: Fix use of uninitialized variable in nfs_getattr() nfs: Remove bogus assignment nfs: remove spurious WARN_ON_ONCE in write path pnfs/blocklayout: serialize GETDEVICEINFO calls nfs: fix pnfs direct write memory leak Revert "NFS: nfs4_do_open should add negative results to the dcache." Revert "NFS: remove BUG possibility in nfs4_open_and_get_state" NFSv4: Ensure nfs_atomic_open set the dentry verifier on ENOENT	2014-11-15 14:15:16 -08:00
Johan Hedberg	eedbd5812c	Bluetooth: Fix clearing remote OOB data through mgmt When passed BDADDR_ANY the Remove Remote OOB Data comand is specified to clear all entries. This patch adds the necessary check and calls hci_remote_oob_data_clear() when necessary. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-15 09:00:29 +01:00
Johan Hedberg	49d1174130	Bluetooth: Add debug logs to help track locking issues This patch adds some extra debug logs to L2CAP related code. These are mainly to help track locking issues but will probably be useful for debugging other types of issues as well. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-15 01:53:27 +01:00
Johan Hedberg	d88b5bbf1a	Bluetooth: Remove unnecessary hdev locking in smp.c Now that the SMP related key lists are converted to RCU there is nothing in smp_cmd_sign_info() or smp_cmd_ident_addr_info() that would require taking the hdev lock (including the smp_distribute_keys call). This patch removes this unnecessary locking. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-15 01:53:27 +01:00
Johan Hedberg	adae20cb2d	Bluetooth: Convert IRK list to RCU This patch set converts the hdev->identity_resolving_keys list to use RCU to eliminate the need to use hci_dev_lock/unlock. An additional change that must be done is to remove use of CRYPTO_ALG_ASYNC for the hdev-specific AES crypto context. The reason is that this context is used for matching RPAs and the loop that does the matching is under the RCU read lock, i.e. is an atomic section which cannot sleep. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-15 01:53:27 +01:00
Johan Hedberg	970d0f1b28	Bluetooth: Convert LTK list to RCU This patch set converts the hdev->long_term_keys list to use RCU to eliminate the need to use hci_dev_lock/unlock. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-15 01:53:27 +01:00
Johan Hedberg	3e64b7bd82	Bluetooth: Trigger SMP for the appropriate LE CoC errors The insufficient authentication/encryption errors indicate to the L2CAP client that it should try to elevate the security level. Since there really isn't any exception to this rule it makes sense to fully handle it on the kernel side instead of pushing the responsibility to user space. This patch adds special handling of these two error codes and calls smp_conn_security() with the elevated security level if necessary. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-15 01:46:50 +01:00
Johan Hedberg	35dc6f834c	Bluetooth: Add key preference parameter to smp_sufficient_security So far smp_sufficient_security() has returned false if we're encrypted with an STK but do have an LTK available. However, for the sake of LE CoC servers we do want to let the incoming connection through even though we're only encrypted with the STK. This patch adds a key preference parameter to smp_sufficient_security() with two possible values (enum used instead of bool for readability). Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-15 01:46:49 +01:00
Johan Hedberg	fa37c1aa30	Bluetooth: Fix sending incorrect LE CoC PDU in BT_CONNECT2 state For LE CoC L2CAP servers we don't do security level elevation during the BT_CONNECT2 state (instead LE CoC simply sends an immediate error response if the security level isn't high enough). Therefore if we get a security level change while an LE CoC channel is in the BT_CONNECT2 state we should simply do nothing. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-15 01:46:49 +01:00
Fabian Frederick	a809eff11f	Bluetooth: hidp: replace kzalloc/copy_from_user by memdup_user use memdup_user for rd_data import. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-15 01:30:16 +01:00
Jarno Rajahalme	fecaef85f7	openvswitch: Validate IPv6 flow key and mask values. Reject flow label key and mask values with invalid bits set. Introduced by commit `3fdbd1ce11` ("openvswitch: add ipv6 'set' action"). Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2014-11-14 15:13:26 -08:00
Pravin B Shelar	8ec609d8b5	openvswitch: Convert dp rcu read operation to locked operations dp read operations depends on ovs_dp_cmd_fill_info(). This API needs to looup vport to find dp name, but vport lookup can fail. Therefore to keep vport reference alive we need to take ovs lock. Introduced by commit `6093ae9aba` ("openvswitch: Minimize dp and vport critical sections"). Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>	2014-11-14 15:13:26 -08:00
Daniele Di Proietto	19e7a3df72	openvswitch: Fix NDP flow mask validation match_validate() enforce that a mask matching on NDP attributes has also an exact match on ICMPv6 type. The ICMPv6 type, which is 8-bit wide, is stored in the 'tp.src' field of 'struct sw_flow_key', which is 16-bit wide. Therefore, an exact match on ICMPv6 type should only check the first 8 bits. This commit fixes a bug that prevented flows with an exact match on NDP field from being installed Introduced by commit `03f0d916aa` ("openvswitch: Mega flow implementation"). Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2014-11-14 15:13:26 -08:00
Jesse Gross	856447d020	openvswitch: Fix checksum calculation when modifying ICMPv6 packets. The checksum of ICMPv6 packets uses the IP pseudoheader as part of the calculation, unlike ICMP in IPv4. This was not implemented, which means that modifying the IP addresses of an ICMPv6 packet would cause the checksum to no longer be correct as the psuedoheader did not match. Introduced by commit `3fdbd1ce11` ("openvswitch: add ipv6 'set' action"). Reported-by: Neal Shrader <icosahedral@gmail.com> Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2014-11-14 15:13:26 -08:00
Pravin B Shelar	ab64f16ff2	openvswitch: Fix memory leak. Need to free memory in case of sample action error. Introduced by commit `651887b0c2` ("openvswitch: Sample action without side effects"). Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2014-11-14 15:13:26 -08:00
David S. Miller	8a5809e0dd	Merge tag 'master-2014-11-11' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== pull request: wireless 2014-11-13 Please pull this set of a few more wireless fixes intended for the 3.18 stream... For the mac80211 bits, Johannes says: "This has just one fix, for an issue with the CCMP decryption that can cause a kernel crash. I'm not sure it's remotely exploitable, but it's an important fix nonetheless." For the iwlwifi bits, Emmanuel says: "Two fixes here - we weren't updating mac80211 if a scan was cut short by RFKILL which confused cfg80211. As a result, the latter wouldn't allow to run another scan. Liad fixes a small bug in the firmware dump." On top of that... Arend van Spriel corrects a channel width conversion that caused a WARNING in brcmfmac. Hauke Mehrtens avoids a NULL pointer dereference in b43. Larry Finger hits a trio of rtlwifi bugs left over from recent backporting from the Realtek vendor driver. Miaoqing Pan fixes a clocking problem in ath9k that could affect packet timestamps and such. Stanislaw Gruszka addresses an payload alignment issue that has been plaguing rt2x00. Please let me know if there are problems! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-14 17:10:35 -05:00
bill bonaparte	5195c14c8b	netfilter: conntrack: fix race in __nf_conntrack_confirm against get_next_corpse After removal of the central spinlock nf_conntrack_lock, in commit `93bb0ceb75` ("netfilter: conntrack: remove central spinlock nf_conntrack_lock"), it is possible to race against get_next_corpse(). The race is against the get_next_corpse() cleanup on the "unconfirmed" list (a per-cpu list with seperate locking), which set the DYING bit. Fix this race, in __nf_conntrack_confirm(), by removing the CT from unconfirmed list before checking the DYING bit. In case race occured, re-add the CT to the dying list. While at this, fix coding style of the comment that has been updated. Fixes: `93bb0ceb75` ("netfilter: conntrack: remove central spinlock nf_conntrack_lock") Reported-by: bill bonaparte <programme110@gmail.com> Signed-off-by: bill bonaparte <programme110@gmail.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-11-14 17:43:05 +01:00
Pravin B Shelar	8cd4313aa7	openvswitch: Fix build failure. Add dependency on INET to fix following build error. I have also fixed MPLS dependency. ERROR: "ip_route_output_flow" [net/openvswitch/openvswitch.ko] undefined! make[1]: *** [__modpost] Error 1 Reported-by: Jim Davis <jim.epost@gmail.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-14 01:24:27 -05:00
David S. Miller	076ce44825	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/chelsio/cxgb4vf/sge.c drivers/net/ethernet/intel/ixgbe/ixgbe_phy.c sge.c was overlapping two changes, one to use the new __dev_alloc_page() in net-next, and one to use s->fl_pg_order in net. ixgbe_phy.c was a set of overlapping whitespace changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-14 01:01:12 -05:00
Linus Torvalds	5cf5203704	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) sunhme driver lacks DMA mapping error checks, based upon a report by Meelis Roos. 2) Fix memory leak in mvpp2 driver, from Sudip Mukherjee. 3) DMA memory allocation sizes are wrong in systemport ethernet driver, fix from Florian Fainelli. 4) Fix use after free in mac80211 defragmentation code, from Johannes Berg. 5) Some networking uapi headers missing from Kbuild file, from Stephen Hemminger. 6) TUN driver gets csum_start offset wrong when VLAN accel is enabled, and macvtap has a similar bug, from Herbert Xu. 7) Adjust several tunneling drivers to set dev->iflink after registry, because registry sets that to -1 overwriting whatever we did. From Steffen Klassert. 8) Geneve forgets to set inner tunneling type, causing GSO segmentation to fail on some NICs. From Jesse Gross. 9) Fix several locking bugs in stmmac driver, from Fabrice Gasnier and Giuseppe CAVALLARO. 10) Fix spurious timeouts with NewReno on low traffic connections, from Marcelo Leitner. 11) Fix descriptor updates in enic driver, from Govindarajulu Varadarajan. 12) PPP calls bpf_prog_create() with locks held, which isn't kosher. Fix from Takashi Iwai. 13) Fix NULL deref in SCTP with malformed INIT packets, from Daniel Borkmann. 14) psock_fanout selftest accesses past the end of the mmap ring, fix from Shuah Khan. 15) Fix PTP timestamping for VLAN packets, from Richard Cochran. 16) netlink_unbind() calls in netlink pass wrong initial argument, from Hiroaki SHIMODA. 17) vxlan socket reuse accidently reuses a socket when the address family is different, so we have to explicitly check this, from Marcelo Lietner. 18) Fix missing include in nft_reject_bridge.c breaking the build on ppc and other architectures, from Guenter Roeck. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (75 commits) vxlan: Do not reuse sockets for a different address family smsc911x: power-up phydev before doing a software reset. lib: rhashtable - Remove weird non-ASCII characters from comments net/smsc911x: Fix delays in the PHY enable/disable routines net/smsc911x: Fix rare soft reset timeout issue due to PHY power-down mode netlink: Properly unbind in error conditions. net: ptp: fix time stamp matching logic for VLAN packets. cxgb4 : dcb open-lldp interop fixes selftests/net: psock_fanout seg faults in sock_fanout_read_ring() net: bcmgenet: apply MII configuration in bcmgenet_open() net: bcmgenet: connect and disconnect from the PHY state machine net: qualcomm: Fix dependency ixgbe: phy: fix uninitialized status in ixgbe_setup_phy_link_tnx net: phy: Correctly handle MII ioctl which changes autonegotiation. ipv6: fix IPV6_PKTINFO with v4 mapped net: sctp: fix memory leak in auth key management net: sctp: fix NULL pointer dereference in af->from_addr_param on malformed packet net: ppp: Don't call bpf_prog_create() in ppp_lock net/mlx4_en: Advertize encapsulation offloads features only when VXLAN tunnel is set cxgb4 : Fix bug in DCB app deletion ...	2014-11-13 17:54:08 -08:00
Eric Dumazet	d649a7a81f	tcp: limit GSO packets to half cwnd In DC world, GSO packets initially cooked by tcp_sendmsg() are usually big, as sk_pacing_rate is high. When network is congested, cwnd can be smaller than the GSO packets found in socket write queue. tcp_write_xmit() splits GSO packets using the available cwnd, and we end up sending a single GSO packet, consuming all available cwnd. With GRO aggregation on the receiver, we might handle a single GRO packet, sending back a single ACK. 1) This single ACK might be lost TLP or RTO are forced to attempt a retransmit. 2) This ACK releases a full cwnd, sender sends another big GSO packet, in a ping pong mode. This behavior does not fill the pipes in the best way, because of scheduling artifacts. Make sure we always have at least two GSO packets in flight. This allows us to safely increase GRO efficiency without risking spurious retransmits. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-13 15:21:44 -05:00
Thomas Graf	6eba82248e	rhashtable: Drop gfp_flags arg in insert/remove functions Reallocation is only required for shrinking and expanding and both rely on a mutex for synchronization and callers of rhashtable_init() are in non atomic context. Therefore, no reason to continue passing allocation hints through the API. Instead, use GFP_KERNEL and add __GFP_NOWARN \| __GFP_NORETRY to allow for silent fall back to vzalloc() without the OOM killer jumping in as pointed out by Eric Dumazet and Eric W. Biederman. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-13 15:18:40 -05:00
Herbert Xu	7b4ce23534	rhashtable: Add parent argument to mutex_is_held Currently mutex_is_held can only test locks in the that are global since it takes no arguments. This prevents rhashtable from being used in places where locks are lock, e.g., per-namespace locks. This patch adds a parent field to mutex_is_held and rhashtable_params so that local locks can be used (and tested). Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-13 15:13:05 -05:00
Herbert Xu	1f501d6252	netfilter: Move mutex_is_held under PROVE_LOCKING The rhashtable function mutex_is_held is only used when PROVE_LOCKING is enabled. This patch modifies netfilter so that we can rhashtable.h itself can later make mutex_is_held optional depending on PROVE_LOCKING. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-13 15:13:05 -05:00
Herbert Xu	9712756620	netlink: Move mutex_is_held under PROVE_LOCKING The rhashtable function mutex_is_held is only used when PROVE_LOCKING is enabled. This patch modifies netlink so that we can rhashtable.h itself can later make mutex_is_held optional depending on PROVE_LOCKING. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-13 15:13:05 -05:00
Michal Kubeček	fbe168ba91	net: generic dev_disable_lro() stacked device handling Large receive offloading is known to cause problems if received packets are passed to other host. Therefore the kernel disables it by calling dev_disable_lro() whenever a network device is enslaved in a bridge or forwarding is enabled for it (or globally). For virtual devices we need to disable LRO on the underlying physical device (which is actually receiving the packets). Current dev_disable_lro() code handles this propagation for a vlan (including 802.1ad nested vlan), macvlan or a vlan on top of a macvlan. It doesn't handle other stacked devices and their combinations, in particular propagation from a bond to its slaves which often causes problems in virtualization setups. As we now have generic data structures describing the upper-lower device relationship, dev_disable_lro() can be generalized to disable LRO also for all lower devices (if any) once it is disabled for the device itself. For bonding and teaming devices, it is necessary to disable LRO not only on current slaves at the moment when dev_disable_lro() is called but also on any slave (port) added later. v2: use lower device links for all devices (including vlan and macvlan) Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Acked-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-13 14:48:56 -05:00
Thomas Graf	882288c05e	FOU: Fix no return statement warning for !CONFIG_NET_FOU_IP_TUNNELS net/ipv4/fou.c: In function ‘ip_tunnel_encap_del_fou_ops’: net/ipv4/fou.c:861:1: warning: no return statement in function returning non-void [-Wreturn-type] Fixes: `a8c5f90fb5` ("ip_tunnel: Ops registration for secondary encap (fou, gue)") Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-13 14:32:00 -05:00
Ilya Dryomov	cc9f1f518c	libceph: change from BUG to WARN for __remove_osd() asserts No reason to use BUG_ON for osd request list assertions. Signed-off-by: Ilya Dryomov <idryomov@redhat.com> Reviewed-by: Alex Elder <elder@linaro.org>	2014-11-13 22:26:34 +03:00
Ilya Dryomov	ba9d114ec5	libceph: clear r_req_lru_item in __unregister_linger_request() kick_requests() can put linger requests on the notarget list. This means we need to clear the much-overloaded req->r_req_lru_item in __unregister_linger_request() as well, or we get an assertion failure in ceph_osdc_release_request() - !list_empty(&req->r_req_lru_item). AFAICT the assumption was that registered linger requests cannot be on any of req->r_req_lru_item lists, but that's clearly not the case. Signed-off-by: Ilya Dryomov <idryomov@redhat.com> Reviewed-by: Alex Elder <elder@linaro.org>	2014-11-13 22:21:14 +03:00
Ilya Dryomov	a390de0208	libceph: unlink from o_linger_requests when clearing r_osd Requests have to be unlinked from both osd->o_requests (normal requests) and osd->o_linger_requests (linger requests) lists when clearing req->r_osd. Otherwise __unregister_linger_request() gets confused and we trip over a !list_empty(&osd->o_linger_requests) assert in __remove_osd(). MON=1 OSD=1: # cat remove-osd.sh #!/bin/bash rbd create --size 1 test DEV=$(rbd map test) ceph osd out 0 sleep 3 rbd map dne/dne # obtain a new osdmap as a side effect rbd unmap $DEV & # will block sleep 3 ceph osd in 0 Signed-off-by: Ilya Dryomov <idryomov@redhat.com> Reviewed-by: Alex Elder <elder@linaro.org>	2014-11-13 22:21:13 +03:00
Ilya Dryomov	aaef31703a	libceph: do not crash on large auth tickets Large (greater than 32k, the value of PAGE_ALLOC_COSTLY_ORDER) auth tickets will have their buffers vmalloc'ed, which leads to the following crash in crypto: [ 28.685082] BUG: unable to handle kernel paging request at ffffeb04000032c0 [ 28.686032] IP: [<ffffffff81392b42>] scatterwalk_pagedone+0x22/0x80 [ 28.686032] PGD 0 [ 28.688088] Oops: 0000 [#1] PREEMPT SMP [ 28.688088] Modules linked in: [ 28.688088] CPU: 0 PID: 878 Comm: kworker/0:2 Not tainted 3.17.0-vm+ #305 [ 28.688088] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 [ 28.688088] Workqueue: ceph-msgr con_work [ 28.688088] task: ffff88011a7f9030 ti: ffff8800d903c000 task.ti: ffff8800d903c000 [ 28.688088] RIP: 0010:[<ffffffff81392b42>] [<ffffffff81392b42>] scatterwalk_pagedone+0x22/0x80 [ 28.688088] RSP: 0018:ffff8800d903f688 EFLAGS: 00010286 [ 28.688088] RAX: ffffeb04000032c0 RBX: ffff8800d903f718 RCX: ffffeb04000032c0 [ 28.688088] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8800d903f750 [ 28.688088] RBP: ffff8800d903f688 R08: 00000000000007de R09: ffff8800d903f880 [ 28.688088] R10: 18df467c72d6257b R11: 0000000000000000 R12: 0000000000000010 [ 28.688088] R13: ffff8800d903f750 R14: ffff8800d903f8a0 R15: 0000000000000000 [ 28.688088] FS: 00007f50a41c7700(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000 [ 28.688088] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 28.688088] CR2: ffffeb04000032c0 CR3: 00000000da3f3000 CR4: 00000000000006b0 [ 28.688088] Stack: [ 28.688088] ffff8800d903f698 ffffffff81392ca8 ffff8800d903f6e8 ffffffff81395d32 [ 28.688088] ffff8800dac96000 ffff880000000000 ffff8800d903f980 ffff880119b7e020 [ 28.688088] ffff880119b7e010 0000000000000000 0000000000000010 0000000000000010 [ 28.688088] Call Trace: [ 28.688088] [<ffffffff81392ca8>] scatterwalk_done+0x38/0x40 [ 28.688088] [<ffffffff81392ca8>] scatterwalk_done+0x38/0x40 [ 28.688088] [<ffffffff81395d32>] blkcipher_walk_done+0x182/0x220 [ 28.688088] [<ffffffff813990bf>] crypto_cbc_encrypt+0x15f/0x180 [ 28.688088] [<ffffffff81399780>] ? crypto_aes_set_key+0x30/0x30 [ 28.688088] [<ffffffff8156c40c>] ceph_aes_encrypt2+0x29c/0x2e0 [ 28.688088] [<ffffffff8156d2a3>] ceph_encrypt2+0x93/0xb0 [ 28.688088] [<ffffffff8156d7da>] ceph_x_encrypt+0x4a/0x60 [ 28.688088] [<ffffffff8155b39d>] ? ceph_buffer_new+0x5d/0xf0 [ 28.688088] [<ffffffff8156e837>] ceph_x_build_authorizer.isra.6+0x297/0x360 [ 28.688088] [<ffffffff8112089b>] ? kmem_cache_alloc_trace+0x11b/0x1c0 [ 28.688088] [<ffffffff8156b496>] ? ceph_auth_create_authorizer+0x36/0x80 [ 28.688088] [<ffffffff8156ed83>] ceph_x_create_authorizer+0x63/0xd0 [ 28.688088] [<ffffffff8156b4b4>] ceph_auth_create_authorizer+0x54/0x80 [ 28.688088] [<ffffffff8155f7c0>] get_authorizer+0x80/0xd0 [ 28.688088] [<ffffffff81555a8b>] prepare_write_connect+0x18b/0x2b0 [ 28.688088] [<ffffffff81559289>] try_read+0x1e59/0x1f10 This is because we set up crypto scatterlists as if all buffers were kmalloc'ed. Fix it. Cc: stable@vger.kernel.org Signed-off-by: Ilya Dryomov <idryomov@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com>	2014-11-13 22:21:12 +03:00
Jeff Layton	b3ecba0967	sunrpc: fix sleeping under rcu_read_lock in gss_stringify_acceptor Bruce reported that he was seeing the following BUG pop: BUG: sleeping function called from invalid context at mm/slab.c:2846 in_atomic(): 0, irqs_disabled(): 0, pid: 4539, name: mount.nfs 2 locks held by mount.nfs/4539: #0: (nfs_clid_init_mutex){+.+.+.}, at: [<ffffffffa01c0a9a>] nfs4_discover_server_trunking+0x4a/0x2f0 [nfsv4] #1: (rcu_read_lock){......}, at: [<ffffffffa00e3185>] gss_stringify_acceptor+0x5/0xb0 [auth_rpcgss] Preemption disabled at:[<ffffffff81a4f082>] printk+0x4d/0x4f CPU: 3 PID: 4539 Comm: mount.nfs Not tainted 3.18.0-rc1-00013-g5b095e9 #3393 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 ffff880021499390 ffff8800381476a8 ffffffff81a534cf 0000000000000001 0000000000000000 ffff8800381476c8 ffffffff81097854 00000000000000d0 0000000000000018 ffff880038147718 ffffffff8118e4f3 0000000020479f00 Call Trace: [<ffffffff81a534cf>] dump_stack+0x4f/0x7c [<ffffffff81097854>] __might_sleep+0x114/0x180 [<ffffffff8118e4f3>] __kmalloc+0x1a3/0x280 [<ffffffffa00e31d8>] gss_stringify_acceptor+0x58/0xb0 [auth_rpcgss] [<ffffffffa00e3185>] ? gss_stringify_acceptor+0x5/0xb0 [auth_rpcgss] [<ffffffffa006b438>] rpcauth_stringify_acceptor+0x18/0x30 [sunrpc] [<ffffffffa01b0469>] nfs4_proc_setclientid+0x199/0x380 [nfsv4] [<ffffffffa01b04d0>] ? nfs4_proc_setclientid+0x200/0x380 [nfsv4] [<ffffffffa01bdf1a>] nfs40_discover_server_trunking+0xda/0x150 [nfsv4] [<ffffffffa01bde45>] ? nfs40_discover_server_trunking+0x5/0x150 [nfsv4] [<ffffffffa01c0acf>] nfs4_discover_server_trunking+0x7f/0x2f0 [nfsv4] [<ffffffffa01c8e24>] nfs4_init_client+0x104/0x2f0 [nfsv4] [<ffffffffa01539b4>] nfs_get_client+0x314/0x3f0 [nfs] [<ffffffffa0153780>] ? nfs_get_client+0xe0/0x3f0 [nfs] [<ffffffffa01c83aa>] nfs4_set_client+0x8a/0x110 [nfsv4] [<ffffffffa0069708>] ? __rpc_init_priority_wait_queue+0xa8/0xf0 [sunrpc] [<ffffffffa01c9b2f>] nfs4_create_server+0x12f/0x390 [nfsv4] [<ffffffffa01c1472>] nfs4_remote_mount+0x32/0x60 [nfsv4] [<ffffffff81196489>] mount_fs+0x39/0x1b0 [<ffffffff81166145>] ? __alloc_percpu+0x15/0x20 [<ffffffff811b276b>] vfs_kern_mount+0x6b/0x150 [<ffffffffa01c1396>] nfs_do_root_mount+0x86/0xc0 [nfsv4] [<ffffffffa01c1784>] nfs4_try_mount+0x44/0xc0 [nfsv4] [<ffffffffa01549b7>] ? get_nfs_version+0x27/0x90 [nfs] [<ffffffffa0161a2d>] nfs_fs_mount+0x47d/0xd60 [nfs] [<ffffffff81a59c5e>] ? mutex_unlock+0xe/0x10 [<ffffffffa01606a0>] ? nfs_remount+0x430/0x430 [nfs] [<ffffffffa01609c0>] ? nfs_clone_super+0x140/0x140 [nfs] [<ffffffff81196489>] mount_fs+0x39/0x1b0 [<ffffffff81166145>] ? __alloc_percpu+0x15/0x20 [<ffffffff811b276b>] vfs_kern_mount+0x6b/0x150 [<ffffffff811b5830>] do_mount+0x210/0xbe0 [<ffffffff811b54ca>] ? copy_mount_options+0x3a/0x160 [<ffffffff811b651f>] SyS_mount+0x6f/0xb0 [<ffffffff81a5c852>] system_call_fastpath+0x12/0x17 Sleeping under the rcu_read_lock is bad. This patch fixes it by dropping the rcu_read_lock before doing the allocation and then reacquiring it and redoing the dereference before doing the copy. If we find that the string has somehow grown in the meantime, we'll reallocate and try again. Cc: <stable@vger.kernel.org> # v3.17+ Reported-by: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-11-13 13:15:49 -05:00
Pablo Neira Ayuso	8225161545	netfilter: nfnetlink_log: remove unnecessary error messages In case of OOM, there's nothing userspace can do. If there's no room to put the payload in __build_packet_message(), jump to nla_put_failure which already performs the corresponding error reporting. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-11-13 13:13:00 +01:00
Florian Westphal	5676864431	netfilter: fix various sparse warnings net/bridge/br_netfilter.c:870:6: symbol 'br_netfilter_enable' was not declared. Should it be static? no; add include net/ipv4/netfilter/nft_reject_ipv4.c:22:6: symbol 'nft_reject_ipv4_eval' was not declared. Should it be static? yes net/ipv6/netfilter/nf_reject_ipv6.c:16:6: symbol 'nf_send_reset6' was not declared. Should it be static? no; add include net/ipv6/netfilter/nft_reject_ipv6.c:22:6: symbol 'nft_reject_ipv6_eval' was not declared. Should it be static? yes net/netfilter/core.c:33:32: symbol 'nf_ipv6_ops' was not declared. Should it be static? no; add include net/netfilter/xt_DSCP.c:40:57: cast truncates bits from constant value (ffffff03 becomes 3) net/netfilter/xt_DSCP.c:57:59: cast truncates bits from constant value (ffffff03 becomes 3) add __force, 3 is what we want. net/ipv4/netfilter/nf_log_arp.c:77:6: symbol 'nf_log_arp_packet' was not declared. Should it be static? yes net/ipv4/netfilter/nf_reject_ipv4.c:17:6: symbol 'nf_send_reset' was not declared. Should it be static? no; add include Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-11-13 12:14:42 +01:00
Herbert Xu	12bfa8bdba	xfrm: Use __xfrm_policy_link in xfrm_policy_insert For a long time we couldn't actually use __xfrm_policy_link in xfrm_policy_insert because the latter wanted to do hashing at a specific position. Now that __xfrm_policy_link no longer does hashing it can now be safely used in xfrm_policy_insert to kill some duplicate code, finally reuniting general policies with socket policies. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-11-13 11:25:04 +01:00
Herbert Xu	53c2e285f9	xfrm: Do not hash socket policies Back in 2003 when I added policy expiration, I half-heartedly did a clean-up and renamed xfrm_sk_policy_link/xfrm_sk_policy_unlink to __xfrm_policy_link/__xfrm_policy_unlink, because the latter could be reused for all policies. I never actually got around to using __xfrm_policy_link for non-socket policies. Later on hashing was added to all xfrm policies, including socket policies. In fact, we don't need hashing on socket policies at all since they're always looked up via a linked list. This patch restores xfrm_sk_policy_link/xfrm_sk_policy_unlink as wrappers around __xfrm_policy_link/__xfrm_policy_unlink so that it's obvious we're dealing with socket policies. This patch also removes hashing from __xfrm_policy_link as for now it's only used by socket policies which do not need to be hashed. Ironically this will in fact allow us to use this helper for non-socket policies which I shall do later. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-11-13 11:25:03 +01:00
Johan Hedberg	2773b02422	Bluetooth: Fix correct nesting for 6lowpan server channel Server channels in BT_LISTEN state should use L2CAP_NESTING_PARENT. This patch fixes the nesting value for the 6lowpan channel. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-13 09:11:37 +01:00
Johan Hedberg	ff714119a6	Bluetooth: Fix L2CAP nesting level initialization location There's no reason why all users of L2CAP would need to worry about initializing chan->nesting to L2CAP_NESTING_NORMAL (which is important since 0 is the same as NESTING_SMP). This patch moves the initialization to the common place that's used to create all new channels, i.e. the l2cap_chan_create() function. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-13 09:11:37 +01:00
Johan Hedberg	3b2ab39e26	Bluetooth: Fix L2CAP socket lock nesting level The teardown callback for L2CAP channels is problematic in that it is explicitly called for all types of channels from l2cap_chan_del(), meaning it's not possible to hard-code a nesting level when taking the socket lock. The simplest way to have a correct nesting level for the socket locking is to use the same value as for the chan. This also means that the other places trying to lock parent sockets need to be update to use the chan value (since L2CAP_NESTING_PARENT is defined as 2 whereas SINGLE_DEPTH_NESTING has the value 1). Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-13 07:49:09 +01:00
Johan Hedberg	abe84903a8	Bluetooth: Use proper nesting annotation for l2cap_chan lock By default lockdep considers all L2CAP channels equal. This would mean that we get warnings if a channel is locked when another one's lock is tried to be acquired in the same thread. This kind of inter-channel locking dependencies exist in the form of parent-child channels as well as any channel wishing to elevate the security by requesting procedures on the SMP channel. To eliminate the chance for these lockdep warnings we introduce a nesting level for each channel and use that when acquiring the channel lock. For now there exists the earlier mentioned three identified categories: SMP, "normal" channels and parent channels (i.e. those in BT_LISTEN state). The nesting level is defined as atomic_t since we need access to it before the lock is actually acquired. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-13 07:49:09 +01:00
Alexander Aring	61f2dcba9a	mac802154: add interframe spacing time handling This patch adds a new interframe spacing time handling into mac802154 layer. Interframe spacing time is a time period between each transmit. This patch adds a high resolution timer into mac802154 and starts on xmit complete with corresponding interframe spacing expire time if ifs_handling is true. We make it variable because it depends if interframe spacing time is handled by transceiver or mac802154. At the timer complete function we wake the netdev queue again. This avoids new frame transmit in range of interframe spacing time. For synced driver we add no handling of interframe spacing time. This is currently a lack of support in all synced xmit drivers. I suppose it's working because the latency of workqueue which is needed to call spi_sync. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-13 04:51:58 +01:00
Joe Perches	a768851f94	irda: Fix build failures after IRDA_DEBUG->pr_debug Fix the build failures that result from the use of pr_debug without the referenced char * arrays being defined. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-12 22:01:14 -05:00
Hiroaki SHIMODA	6251edd932	netlink: Properly unbind in error conditions. Even if netlink_kernel_cfg::unbind is implemented the unbind() method is not called, because cfg->unbind is omitted in __netlink_kernel_create(). And fix wrong argument of test_bit() and off by one problem. At this point, no unbind() method is implemented, so there is no real issue. Fixes: `4f52090052` ("netlink: have netlink per-protocol bind function return an error code.") Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com> Cc: Richard Guy Briggs <rgb@redhat.com> Acked-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-12 15:12:06 -05:00
Tom Herbert	a8c5f90fb5	ip_tunnel: Ops registration for secondary encap (fou, gue) Instead of calling fou and gue functions directly from ip_tunnel use ops for these that were previously registered. This patch adds the logic to add and remove encapsulation operations for ip_tunnel, and modified fou (and gue) to register with ip_tunnels. This patch also addresses a circular dependency between ip_tunnel and fou that was causing link errors when CONFIG_NET_IP_TUNNEL=y and CONFIG_NET_FOU=m. References to fou an gue have been removed from ip_tunnel.c Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-12 15:01:35 -05:00
Joe Perches	4243cdc2c1	udp: Neaten function pointer calls and add braces Standardize function pointer uses. Convert calling style from: (*foo)(args...); to: foo(args...); Other miscellanea: o Add braces around loops with single ifs on multiple lines o Realign arguments around these functions o Invert logic in if to return immediately. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-12 14:51:59 -05:00
Joe Perches	955a9d202f	irda: Convert IRDA_DEBUG to pr_debug Use the normal kernel debugging mechanism which also enables dynamic_debug at the same time. Other miscellanea: o Remove sysctl for irda_debug o Remove function tracing like uses (use ftrace instead) o Coalesce formats o Realign arguments o Remove unnecessary OOM messages Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-12 13:56:41 -05:00
Pablo Neira Ayuso	b326dd37b9	netfilter: nf_tables: restore synchronous object release from commit/abort The existing xtables matches and targets, when used from nft_compat, may sleep from the destroy path, ie. when removing rules. Since the objects are released via call_rcu from softirq context, this results in lockdep splats and possible lockups that may be hard to reproduce. Patrick also indicated that delayed object release via call_rcu can cause us problems in the ordering of event notifications when anonymous sets are in place. So, this patch restores the synchronous object release from the commit and abort paths. This includes a call to synchronize_rcu() to make sure that no packets are walking on the objects that are going to be released. This is slowier though, but it's simple and it resolves the aforementioned problems. This is a partial revert of `c7c32e7` ("netfilter: nf_tables: defer all object release via rcu") that was introduced in 3.16 to speed up interaction with userspace. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-11-12 12:06:24 +01:00
Pablo Neira Ayuso	afefb6f928	netfilter: nft_compat: use the match->table to validate dependencies Instead of the match->name, which is of course not relevant. Fixes: `f3f5dde` ("netfilter: nft_compat: validate chain type in match/target") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-11-12 12:06:24 +01:00
Pablo Neira Ayuso	c918687f5e	netfilter: nft_compat: relax chain type validation Check for nat chain dependency only, which is the one that can actually crash the kernel. Don't care if mangle, filter and security specific match and targets are used out of their scope, they are harmless. This restores iptables-compat with mangle specific match/target when used out of the OUTPUT chain, that are actually emulated through filter chains, which broke when performing strict validation. Fixes: `f3f5dde` ("netfilter: nft_compat: validate chain type in match/target") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-11-12 12:06:24 +01:00
Pablo Neira Ayuso	2daf1b4d18	netfilter: nft_compat: use current net namespace Instead of init_net when using xtables over nftables compat. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-11-12 12:06:24 +01:00
Pablo Neira Ayuso	baf4750d92	netfilter: nft_redir: fix sparse warnings >> net/netfilter/nft_redir.c:39:26: sparse: incorrect type in assignment (different base types) net/netfilter/nft_redir.c:39:26: expected unsigned int [unsigned] [usertype] nla_be32 net/netfilter/nft_redir.c:39:26: got restricted __be32 >> net/netfilter/nft_redir.c:40:40: sparse: cast to restricted __be32 >> net/netfilter/nft_redir.c:40:40: sparse: cast to restricted __be32 >> net/netfilter/nft_redir.c:40:40: sparse: cast to restricted __be32 >> net/netfilter/nft_redir.c:40:40: sparse: cast to restricted __be32 >> net/netfilter/nft_redir.c:40:40: sparse: cast to restricted __be32 >> net/netfilter/nft_redir.c:40:40: sparse: cast to restricted __be32 >> net/netfilter/nft_redir.c:46:34: sparse: incorrect type in assignment (different base types) net/netfilter/nft_redir.c:46:34: expected unsigned int [unsigned] [usertype] nla_be32 net/netfilter/nft_redir.c:46:34: got restricted __be32 >> net/netfilter/nft_redir.c:47:48: sparse: cast to restricted __be32 >> net/netfilter/nft_redir.c:47:48: sparse: cast to restricted __be32 >> net/netfilter/nft_redir.c:47:48: sparse: cast to restricted __be32 >> net/netfilter/nft_redir.c:47:48: sparse: cast to restricted __be32 >> net/netfilter/nft_redir.c:47:48: sparse: cast to restricted __be32 >> net/netfilter/nft_redir.c:47:48: sparse: cast to restricted __be32 Fixes: `e9105f1` ("netfilter: nf_tables: add new expression nft_redir") Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-11-12 12:00:04 +01:00
Pablo Neira Ayuso	f6c6339d5e	netfilter: fix unmet dependencies in NETFILTER_XT_TARGET_REDIRECT warning: (NETFILTER_XT_TARGET_REDIRECT) selects NF_NAT_REDIRECT_IPV4 which has unmet direct dependencies (NET && INET && NETFILTER && NF_NAT_IPV4) warning: (NETFILTER_XT_TARGET_REDIRECT) selects NF_NAT_REDIRECT_IPV6 which has unmet direct dependencies (NET && INET && IPV6 && NETFILTER && NF_NAT_IPV6) Fixes: `8b13edd` ("netfilter: refactor NAT redirect IPv4 to use it from nf_tables") Fixes: `9de920e` ("netfilter: refactor NAT redirect IPv6 code to use it from nf_tables") Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-11-12 11:54:12 +01:00
Johan Hedberg	a930430b04	Bluetooth: Remove unnecessary hci_dev_lock/unlock in smp.c The mgmt_user_passkey_request and related functions do not do anything else except read access to hdev->id. This member never changes after the hdev creation so there is no need to acquire a lock to read it. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-12 10:05:25 +01:00
Johan Hedberg	f03567040c	Bluetooth: Fix l2cap_sock_teardown_cb lockdep warning Any code calling bt_accept_dequeue() to get a new child socket from a server socket should use lock_sock_nested to avoid lockdep warnings due to the parent and child sockets being locked at the same time. The l2cap_sock_accept() function is already doing this correctly but a second place calling bt_accept_dequeue() is the code path from l2cap_sock_teardown_cb() that calls l2cap_sock_cleanup_listen(). This patch fixes the proper nested locking annotation and thereby avoids the following style of lockdep warning. [ +0.000224] [ INFO: possible recursive locking detected ] [ +0.000222] 3.17.0+ #1153 Not tainted [ +0.000130] --------------------------------------------- [ +0.000227] l2cap-tester/562 is trying to acquire lock: [ +0.000210] (sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP){+.+...}, at: [<c1393f47>] bt_accept_dequeue+0x68/0x11b [ +0.000467] but task is already holding lock: [ +0.000186] (sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP){+.+...}, at: [<c13b949a>] lock_sock+0xa/0xc [ +0.000421] other info that might help us debug this: [ +0.000199] Possible unsafe locking scenario: [ +0.000117] CPU0 [ +0.000000] ---- [ +0.000000] lock(sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP); [ +0.000000] lock(sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP); [ +0.000000] * DEADLOCK * Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-12 10:05:25 +01:00
Alexander Aring	c8937a1d11	ieee820154: add lbt setting support This patch adds support for setting listen before transmit mode via nl802154 framework. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-12 05:10:43 +01:00
Alexander Aring	17a3a46bfb	ieee820154: add max frame retries setting support This patch add support for setting mac frame retries setting via nl802154 framework. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-12 05:10:42 +01:00
Alexander Aring	a01ba7652c	ieee820154: add max csma backoffs setting support This patch add support for max csma backoffs setting via nl802154 framework. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-12 05:10:41 +01:00
Alexander Aring	656a999e87	ieee820154: add backoff exponent setting support This patch adds support for setting backoff exponents via nl802154 framework. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-12 05:10:40 +01:00
Alexander Aring	9830c62a0b	ieee820154: add short_addr setting support This patch adds support for setting short address via nl802154 framework. Also added a comment because a 0xffff seems to be valid address that we don't have a short address. This is a valid setting but we need more checks in upper layers to don't allow this address as source address. Also the current netlink interface doesn't allow to set the short_addr to 0xffff. Same for the 0xfffe short address which describes a not allocated short address. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-12 05:10:40 +01:00
Alexander Aring	702bf37128	ieee820154: add pan_id setting support This patch adds support for setting pan_id via nl802154 framework. Adding a comment because setting 0xffff as pan_id seems to be valid setting. The pan_id 0xffff as source pan is invalid. I am not sure now about this setting but for the current netlink interface this is an invalid setting, so we do the same now. Maybe we need to change that when we have coordinator support and association support. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-12 05:10:39 +01:00
Alexander Aring	ab0bd56172	ieee820154: add channel set support This patch adds page and channel setting support to nl802154 framework. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-12 05:10:38 +01:00
Alexander Aring	be4fd8e5d9	mac802154: add ifname change notifier This patch adds a netdev notifier for interface renaming. We have a name attribute inside of subif data struct. This is needed to have always the actual netdev name in sdata name attribute. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-12 05:10:37 +01:00
Alexander Aring	912f67aec7	mac802154: change module description This patch changes the module description like wireless which is IEEE 802.11 "subsystem" and not "implementation". Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-12 05:10:37 +01:00
Alexander Aring	6322d50d87	mac802154: add wpan_phy priv id This patch adds an unique id for an wpan_phy. This behaviour is mostly grabbed from wireless stack. This is needed for upcomming patches which identify the wpan netdev while NETDEV_CHANGENAME in netdev notify function. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-12 05:10:36 +01:00
Alexander Aring	2789e6297f	mac820154: move mutex locks out of loop Instead of always re-lock the iflist_mtx at multiple interfaces we lock the complete for each loop at start and at the end. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-12 05:10:36 +01:00
Alexander Aring	d14e1c71cf	mac820154: rename sdata next to tmp This patch is just a cleanup to name the temporary variable for protected list for each loop as tmp. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-12 05:10:36 +01:00
Alexander Aring	592dfbfc72	mac820154: move interface unregistration into iface This patch move the iface unregistration into iface.c file to have a behaviour which is similar like mac80211. Also iface handling should be inside iface.c file only. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-12 05:10:35 +01:00
Calvin Owens	50656d9df6	ipvs: Keep skb->sk when allocating headroom on tunnel xmit ip_vs_prepare_tunneled_skb() ignores ->sk when allocating a new skb, either unconditionally setting ->sk to NULL or allowing the uninitialized ->sk from a newly allocated skb to leak through to the caller. This patch properly copies ->sk and increments its reference count. Signed-off-by: Calvin Owens <calvinowens@fb.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2014-11-12 11:03:04 +09:00
Joe Perches	6c91023dc3	irda: Remove IRDA_<TYPE> logging macros And use the more common mechanisms directly. Other miscellanea: o Coalesce formats o Add missing newlines o Realign arguments o Remove unnecessary OOM message logging as there's a generic stack dump already on OOM. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-11 18:11:00 -05:00
John W. Linville	6164c20228	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next	2014-11-11 16:30:48 -05:00
Eric Dumazet	5337b5b75c	ipv6: fix IPV6_PKTINFO with v4 mapped Use IS_ENABLED(CONFIG_IPV6), to enable this code if IPv6 is a module. Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: `c8e6ad0829` ("ipv6: honor IPV6_PKTINFO with v4 mapped addresses on sendmsg") Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-11 15:32:45 -05:00
WANG Cong	d7480fd3b1	neigh: remove dynamic neigh table registration support Currently there are only three neigh tables in the whole kernel: arp table, ndisc table and decnet neigh table. What's more, we don't support registering multiple tables per family. Therefore we can just make these tables statically built-in. Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-11 15:23:54 -05:00
Daniel Borkmann	4184b2a79a	net: sctp: fix memory leak in auth key management A very minimal and simple user space application allocating an SCTP socket, setting SCTP_AUTH_KEY setsockopt(2) on it and then closing the socket again will leak the memory containing the authentication key from user space: unreferenced object 0xffff8800837047c0 (size 16): comm "a.out", pid 2789, jiffies 4296954322 (age 192.258s) hex dump (first 16 bytes): 01 00 00 00 04 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff816d7e8e>] kmemleak_alloc+0x4e/0xb0 [<ffffffff811c88d8>] __kmalloc+0xe8/0x270 [<ffffffffa0870c23>] sctp_auth_create_key+0x23/0x50 [sctp] [<ffffffffa08718b1>] sctp_auth_set_key+0xa1/0x140 [sctp] [<ffffffffa086b383>] sctp_setsockopt+0xd03/0x1180 [sctp] [<ffffffff815bfd94>] sock_common_setsockopt+0x14/0x20 [<ffffffff815beb61>] SyS_setsockopt+0x71/0xd0 [<ffffffff816e58a9>] system_call_fastpath+0x12/0x17 [<ffffffffffffffff>] 0xffffffffffffffff This is bad because of two things, we can bring down a machine from user space when auth_enable=1, but also we would leave security sensitive keying material in memory without clearing it after use. The issue is that sctp_auth_create_key() already sets the refcount to 1, but after allocation sctp_auth_set_key() does an additional refcount on it, and thus leaving it around when we free the socket. Fixes: `65b07e5d0d` ("[SCTP]: API updates to suport SCTP-AUTH extensions.") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-11 15:19:11 -05:00
Daniel Borkmann	e40607cbe2	net: sctp: fix NULL pointer dereference in af->from_addr_param on malformed packet An SCTP server doing ASCONF will panic on malformed INIT ping-of-death in the form of: ------------ INIT[PARAM: SET_PRIMARY_IP] ------------> While the INIT chunk parameter verification dissects through many things in order to detect malformed input, it misses to actually check parameters inside of parameters. E.g. RFC5061, section 4.2.4 proposes a 'set primary IP address' parameter in ASCONF, which has as a subparameter an address parameter. So an attacker may send a parameter type other than SCTP_PARAM_IPV4_ADDRESS or SCTP_PARAM_IPV6_ADDRESS, param_type2af() will subsequently return 0 and thus sctp_get_af_specific() returns NULL, too, which we then happily dereference unconditionally through af->from_addr_param(). The trace for the log: BUG: unable to handle kernel NULL pointer dereference at 0000000000000078 IP: [<ffffffffa01e9c62>] sctp_process_init+0x492/0x990 [sctp] PGD 0 Oops: 0000 [#1] SMP [...] Pid: 0, comm: swapper Not tainted 2.6.32-504.el6.x86_64 #1 Bochs Bochs RIP: 0010:[<ffffffffa01e9c62>] [<ffffffffa01e9c62>] sctp_process_init+0x492/0x990 [sctp] [...] Call Trace: <IRQ> [<ffffffffa01f2add>] ? sctp_bind_addr_copy+0x5d/0xe0 [sctp] [<ffffffffa01e1fcb>] sctp_sf_do_5_1B_init+0x21b/0x340 [sctp] [<ffffffffa01e3751>] sctp_do_sm+0x71/0x1210 [sctp] [<ffffffffa01e5c09>] ? sctp_endpoint_lookup_assoc+0xc9/0xf0 [sctp] [<ffffffffa01e61f6>] sctp_endpoint_bh_rcv+0x116/0x230 [sctp] [<ffffffffa01ee986>] sctp_inq_push+0x56/0x80 [sctp] [<ffffffffa01fcc42>] sctp_rcv+0x982/0xa10 [sctp] [<ffffffffa01d5123>] ? ipt_local_in_hook+0x23/0x28 [iptable_filter] [<ffffffff8148bdc9>] ? nf_iterate+0x69/0xb0 [<ffffffff81496d10>] ? ip_local_deliver_finish+0x0/0x2d0 [<ffffffff8148bf86>] ? nf_hook_slow+0x76/0x120 [<ffffffff81496d10>] ? ip_local_deliver_finish+0x0/0x2d0 [...] A minimal way to address this is to check for NULL as we do on all other such occasions where we know sctp_get_af_specific() could possibly return with NULL. Fixes: `d6de309759` ("[SCTP]: Add the handling of "Set Primary IP Address" parameter to INIT") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-11 15:19:10 -05:00
Joe Perches	ba7a46f16d	net: Convert LIMIT_NETDEBUG to net_dbg_ratelimited Use the more common dynamic_debug capable net_dbg_ratelimited and remove the LIMIT_NETDEBUG macro. All messages are still ratelimited. Some KERN_<LEVEL> uses are changed to KERN_DEBUG. This may have some negative impact on messages that were emitted at KERN_INFO that are not not enabled at all unless DEBUG is defined or dynamic_debug is enabled. Even so, these messages are now _not_ emitted by default. This also eliminates the use of the net_msg_warn sysctl "/proc/sys/net/core/warnings". For backward compatibility, the sysctl is not removed, but it has no function. The extern declaration of net_msg_warn is removed from sock.h and made static in net/core/sysctl_net_core.c Miscellanea: o Update the sysctl documentation o Remove the embedded uses of pr_fmt o Coalesce format fragments o Realign arguments Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-11 14:10:31 -05:00
David S. Miller	4083c8056d	Merge branch 'net_next_ovs' of git://git.kernel.org/pub/scm/linux/kernel/git/pshelar/openvswitch Pravin B Shelar says: ==================== Open vSwitch Following batch of patches brings feature parity between upstream ovs and out of tree ovs module. Two features are added, first adds support to export egress tunnel information for a packet. This is used to improve visibility in network traffic. Second feature allows userspace vswitchd process to probe ovs module features. Other patches are optimization and code cleanup. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-11 13:32:25 -05:00
Joe Perches	a2ae6007a4	dsa: Use netdev_<level> instead of printk Neaten and standardize the logging output. Other miscellanea: o Use pr_notice_once instead of a guard flag. o Convert existing pr_<level> uses too. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-11 13:30:57 -05:00
Eric Dumazet	2c8c56e15d	net: introduce SO_INCOMING_CPU Alternative to RPS/RFS is to use hardware support for multiple queues. Then split a set of million of sockets into worker threads, each one using epoll() to manage events on its own socket pool. Ideally, we want one thread per RX/TX queue/cpu, but we have no way to know after accept() or connect() on which queue/cpu a socket is managed. We normally use one cpu per RX queue (IRQ smp_affinity being properly set), so remembering on socket structure which cpu delivered last packet is enough to solve the problem. After accept(), connect(), or even file descriptor passing around processes, applications can use : int cpu; socklen_t len = sizeof(cpu); getsockopt(fd, SOL_SOCKET, SO_INCOMING_CPU, &cpu, &len); And use this information to put the socket into the right silo for optimal performance, as all networking stack should run on the appropriate cpu, without need to send IPI (RPS/RFS). Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-11 13:00:06 -05:00
Eric Dumazet	3d97379a67	tcp: move sk_mark_napi_id() at the right place sk_mark_napi_id() is used to record for a flow napi id of incoming packets for busypoll sake. We should do this only on established flows, not on listeners. This was 'working' by virtue of the socket cloning, but doing this on SYN packets in unecessary cache line dirtying. Even if we move sk_napi_id in the same cache line than sk_lock, we are working to make SYN processing lockless, so it is desirable to set sk_napi_id only for established flows. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-11 13:00:05 -05:00
Johannes Berg	0395442ad2	mac80211: refactor duplicate detection Put duplicate detection into its own RX handler, and separate out the conditions a bit to make the code more readable. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-11 17:05:43 +01:00
Johan Hedberg	4e79022677	Bluetooth: 6lowpan: Remove unnecessary RCU callback When kfree() is all that's needed to free an object protected by RCU there's a kfree_rcu() convenience function that can be used. This patch updates the 6lowpan code to use this, thereby eliminating the need for the separate peer_free() function. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-11 14:26:02 +01:00
Dan Carpenter	2196937e12	netfilter: ipset: small potential read beyond the end of buffer We could be reading 8 bytes into a 4 byte buffer here. It seems harmless but adding a check is the right thing to do and it silences a static checker warning. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-11-11 13:46:37 +01:00
Johan Hedberg	60cb49d2c9	Bluetooth: Fix mgmt connected notification This patch fixes a regression that was introduced by commit `cb77c3ec07`. In addition to BT_CONFIG, BT_CONNECTED is also a state in which we may get a remote name and need to indicate over mgmt the connection status. This scenario is particularly likely to happen for incoming connections that do not need authentication since there the hci_conn state will reach BT_CONNECTED before the remote name is received. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-11 10:34:52 +01:00
Johan Hedberg	252670c421	Bluetooth: Fix sparse warning in amp.c This fixes the following sparse warning: net/bluetooth/amp.c:152:53: warning: Variable length array is used. The warning itself is probably harmless since this kind of usage of shash_desc is present also in other places in the kernel (there's even a convenience macro SHASH_DESC_ON_STACK available for defining such stack variables). However, dynamically allocated versions are also used in several places of the kernel (e.g. kernel/kexec.c and lib/digsig.c) which have the benefit of not exhibiting the sparse warning. Since there are no more sparse warnings in the Bluetooth subsystem after fixing this one it is now easier to spot whenever new ones might get introduced by future patches. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-11 00:07:29 +01:00
Jesse Gross	cfdf1e1ba5	udptunnel: Add SKB_GSO_UDP_TUNNEL during gro_complete. When doing GRO processing for UDP tunnels, we never add SKB_GSO_UDP_TUNNEL to gso_type - only the type of the inner protocol is added (such as SKB_GSO_TCPV4). The result is that if the packet is later resegmented we will do GSO but not treat it as a tunnel. This results in UDP fragmentation of the outer header instead of (i.e.) TCP segmentation of the inner header as was originally on the wire. Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-10 15:09:45 -05:00
David S. Miller	b92172661e	Merge tag 'master-2014-11-04' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== pull request: wireless-next 2014-11-07 Please pull this batch of updates intended for the 3.19 stream! For the mac80211 bits, Johannes says: "This relatively large batch of changes is comprised of the following: * large mac80211-hwsim changes from Ben, Jukka and a bit myself * OCB/WAVE/11p support from Rostislav on behalf of the Czech Technical University in Prague and Volkswagen Group Research * minstrel VHT work from Karl * more CSA work from Luca * WMM admission control support in mac80211 (myself) * various smaller fixes, spelling corrections, and minor API additions" For the Bluetooth bits, Johan says: "Here's the first bluetooth-next pull request for 3.19. The vast majority of patches are for ieee802154 from Alexander Aring with various fixes and cleanups. There are also several LE/SMP fixes as well as improved support for handling LE devices that have lost their pairing information (the patches from Alfonso). Jukka provides a couple of stability fixes for 6lowpan and Szymon conformance fixes for RFCOMM. For the HCI drivers we have one new USB ID for an Acer controller as well as a reset handling fix for H5." For the Atheros bits, Kalle says: "Major changes are: o ethtool support (Ben) o print dev string prefix with debug hex buffers dump (Michal) o debugfs file to read calibration data from the firmware verification purposes (me) o fix fw_stats debugfs file, now results are more reliable (Michal) o firmware crash counters via debugfs (Ben&me) o various tracing points to debug firmware (Rajkumar) o make it possible to provide firmware calibration data via a file (me) And we have quite a lot of smaller fixes and clean up." For the iwlwifi bits, Emmanuel says: "The big new thing here is netdetect which allows the firmware to wake up the platform when a specific network is detected. Along with that I have fixes for d3 operation. The usual amount of rate scaling stuff - we now support STBC. The other commit that stands out is Johannes's work on devcoredump. He basically starts to use the standard infrastructure he built." Along with that are the usual sort of updates and such for ath9k, brcmfmac, wil6210, and a handful of other bits here and there... Please let me know if there are problems! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-10 14:34:59 -05:00
Herbert Xu	c008ba5bdc	ipv4: Avoid reading user iov twice after raw_probe_proto_opt Ever since raw_probe_proto_opt was added it had the problem of causing the user iov to be read twice, once during the probe for the protocol header and once again in ip_append_data. This is a potential security problem since it means that whatever we're probing may be invalid. This patch plugs the hole by firstly advancing the iov so we don't read the same spot again, and secondly saving what we read the first time around for use by ip_append_data. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-10 14:25:35 -05:00
Herbert Xu	32b5913a93	ipv4: Use standard iovec primitive in raw_probe_proto_opt The function raw_probe_proto_opt tries to extract the first two bytes from the user input in order to seed the IPsec lookup for ICMP packets. In doing so it's processing iovec by hand and overcomplicating things. This patch replaces the manual iovec processing with a call to memcpy_fromiovecend. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-10 14:25:35 -05:00
John W. Linville	6168823518	This has just one fix, for an issue with the CCMP decryption that can cause a kernel crash. I'm not sure it's remotely exploitable, but it's an important fix nonetheless. -----BEGIN PGP SIGNATURE----- iQIcBAABCAAGBQJUYJNkAAoJEDBSmw7B7bqrgooP/ivTw/pJJ5i5wLsfAUUNV1tI FDe3+w/kBsD0d0lNsnKoyq58iORyYIxHtmLhvFV+yuBMvpfADTNm9CkLDDjg939m mU6/kjO6wbHv/SoLSL4rss/AggGaEsu+3ggGpmhXl2JApSUIVjw0z9XixNcU4Vnz CEjIw9jsnKA5UOeH2cv+fwQLXImXKSsofAf2NgZ3zJ6Ci0zLOZtr/ZmO2o4IU9Do HWp3CHsaehoWlaqtPx6r8JaodE7XPEPtim66aWoUaeSYueL9iYKJU7BS2jkTWJnQ Kv8JuVQIQPlMsTuvTR8oHvpHZ29m0Z6mc8XsKeNu6UCNzoOBGS2Ak2YZwcAeqSLr oQRDDBc4aKsY5YWcaChPx6N8gOEXRr4cQgfyY79o5vaPPJwixxLMPUUOU1rKHC3D E61YfQ/tfoFWzEitksX14P3Gynas6abySku5mhM5y2L/1XFdDHugZjzRjcBv/1ry J9x44k2EWbduvz8GBGGethmKMghKYYTqfB5rWh1cKxafgBEFt61ZcE4LuqaNnP17 sg+IlXFxZjJLnhhg0YL2oZt+++/CyCLl9G+mx6wZE4HLqsl26ElZ6ql2kekJ5baV HKsPyBa2r48icJfrogA9WpjumI8b19ztG5OrsF0AQmK77IVSj80JFVnsTBal9/Y7 Ao2xMau9OghJq8AEJpOi =H7HX -----END PGP SIGNATURE----- Merge tag 'mac80211-for-john-2014-11-10' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg <johannes@sipsolutions.net> says: "This has just one fix, for an issue with the CCMP decryption that can cause a kernel crash. I'm not sure it's remotely exploitable, but it's an important fix nonetheless." Signed-off-by: John W. Linville <linville@tuxdriver.com>	2014-11-10 13:08:45 -05:00
Eric Dumazet	3b47d30396	net: gro: add a per device gro flush timer Tuning coalescing parameters on NIC can be really hard. Servers can handle both bulk and RPC like traffic, with conflicting goals : bulk flows want as big GRO packets as possible, RPC want minimal latencies. To reach big GRO packets on 10Gbe NIC, one can use : ethtool -C eth0 rx-usecs 4 rx-frames 44 But this penalizes rpc sessions, with an increase of latencies, up to 50% in some cases, as NICs generally do not force an interrupt when a packet with TCP Push flag is received. Some NICs do not have an absolute timer, only a timer rearmed for every incoming packet. This patch uses a different strategy : Let GRO stack decides what do do, based on traffic pattern. Packets with Push flag wont be delayed. Packets without Push flag might be held in GRO engine, if we keep receiving data. This new mechanism is off by default, and shall be enabled by setting /sys/class/net/ethX/gro_flush_timeout to a value in nanosecond. To fully enable this mechanism, drivers should use napi_complete_done() instead of napi_complete(). Tested: Ran 200 netperf TCP_STREAM from A to B (10Gbe mlx4 link, 8 RX queues) Without this feature, we send back about 305,000 ACK per second. GRO aggregation ratio is low (811/305 = 2.65 segments per GRO packet) Setting a timer of 2000 nsec is enough to increase GRO packet sizes and reduce number of ACK packets. (811/19.2 = 42) Receiver performs less calls to upper stacks, less wakes up. This also reduces cpu usage on the sender, as it receives less ACK packets. Note that reducing number of wakes up increases cpu efficiency, but can decrease QPS, as applications wont have the chance to warmup cpu caches doing a partial read of RPC requests/answers if they fit in one skb. B:~# sar -n DEV 1 10 \| grep eth0 \| tail -1 Average: eth0 811269.80 305732.30 1199462.57 19705.72 0.00 0.00 0.50 B:~# echo 2000 >/sys/class/net/eth0/gro_flush_timeout B:~# sar -n DEV 1 10 \| grep eth0 \| tail -1 Average: eth0 811577.30 19230.80 1199916.51 1239.80 0.00 0.00 0.50 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-10 12:05:59 -05:00
Daniel Borkmann	6b96686ecf	netfilter: nft_masq: fix uninitialized range in nft_masq_{ipv4, ipv6}_eval When transferring from the original range in nf_nat_masquerade_{ipv4,ipv6}() we copy over values from stack in from min_proto/max_proto due to uninitialized range variable in both, nft_masq_{ipv4,ipv6}_eval. As we only initialize flags at this time from nft_masq struct, just zero out the rest. Fixes: `9ba1f726be` ("netfilter: nf_tables: add new nft_masq expression") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-11-10 17:56:28 +01:00
Arik Nemtsov	a6d4a534e1	cfg80211: introduce regulatory flags controlling bw Allow setting bandwidth related regulatory flags. These flags are mapped to the corresponding channel flags in the specified range. Make sure the new flags are consulted when calculating the maximum bandwidth allowed by a regulatory-rule. Also allow propagating the GO_CONCURRENT modifier from a reg-rule to a channel. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Reviewed-by: Luis R. Rodriguez <mcgrof@suse.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-10 10:36:21 +01:00
Johannes Berg	1f7bba79af	mac80211: add back support for radiotap vendor namespace data Radiotap vendor namespace data might still be useful, but we reverted it because it used too much space in the RX status. Put it back, but address the space problem by using a single bit only and putting everything else into the skb->data. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-10 10:30:43 +01:00
Luciano Coelho	d04b5ac9e7	cfg80211/mac80211: allow any interface to send channel switch notifications For multi-vif channel switches, we want to send NL80211_CMD_CH_SWITCH_NOTIFY to the userspace to let it decide whether other interfaces need to be moved as well. This is needed when we want a P2P GO interface to follow the channel of a station, for example. Modify the code so that all interfaces can send CSA notifications. Additionally, send notifications for STA CSA as well. Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-10 10:20:18 +01:00
Luciano Coelho	2f4572930d	mac80211: send channel switch started notifications Send a channel switch notification to userspace when a channel switch is requested or when we react to a remote CSA. Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-10 10:20:17 +01:00
Luciano Coelho	f8d7552e94	cfg80211: add channel switch started notification Add a new NL80211_CH_SWITCH_STARTED_NOTIFY message that can be sent to the userspace when a channel switch process has started. This allows userspace to take action, for instance, by requesting other interfaces to switch channel as necessary. This patch introduces a function that allows the drivers to send this notification. It should be used when the driver starts processing a channel switch initiated by a remote device (eg. when a STA receives a CSA from the AP) and when it successfully starts a userspace-triggered channel switch (eg. when hostapd triggers a channel swith in the AP). Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-10 10:20:14 +01:00
Luciano Coelho	127f10ec60	mac80211: add device_timestamp to the drv_pre_channel_switch trace The device_timestamp value was left out of the event trace for drv_pre_channel_switch by mistake. Add it. Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-10 10:18:04 +01:00
Luciano Coelho	000baa5dfd	mac80211: fix order of setting ch_switch and drv_pre_channel_switch call There was a mistake when merging commit `6d027bcc` (mac80211: add pre_channel_switch driver operation) for upstream. The assignment of the values in the ch_switch structure came below the call to drv_pre_channel_switch. Fix the order. Fixes: `6d027bcc` (mac80211: add pre_channel_switch driver operation) Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-10 10:17:42 +01:00
Jarno Rajahalme	05da5898a9	openvswitch: Add support for OVS_FLOW_ATTR_PROBE. This new flag is useful for suppressing error logging while probing for datapath features using flow commands. For backwards compatibility reasons the commands are executed normally, but error logging is suppressed. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2014-11-09 18:58:44 -08:00
Thomas Graf	12eb18f711	openvswitch: Constify various function arguments Help produce better optimized code. Signed-off-by: Thomas Graf <tgraf@noironetworks.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2014-11-09 18:58:44 -08:00
Pravin B Shelar	e8eedb85bd	openvswitch: Remove redundant key ref from upcall_info. struct dp_upcall_info has pointer to pkt_key which is already available in OVS_CB. This also simplifies upcall handling for gso packet. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>	2014-11-09 18:58:44 -08:00
Pravin B Shelar	fff06c36a2	openvswitch: Optimize recirc action. OVS need to flow key for flow lookup in recic action. OVS does key extract in recic action. Most of cases we could use OVS_CB packet key directly and can avoid packet flow key extract. SET action we can update flow-key along with packet to keep it consistent. But there are some action like MPLS pop which forces OVS to do flow-extract. In such cases we can mark flow key as invalid so that subsequent recirc action can do full flow extract. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>	2014-11-09 18:58:44 -08:00
Wenyu Zhang	8f0aad6f35	openvswitch: Extend packet attribute for egress tunnel info OVS vswitch has extended IPFIX exporter to export tunnel headers to improve network visibility. To export this information userspace needs to know egress tunnel for given packet. By extending packet attributes datapath can export egress tunnel info for given packet. So that userspace can ask for egress tunnel info in userspace action. This information is used to build IPFIX data for given flow. Signed-off-by: Wenyu Zhang <wenyuz@vmware.com> Acked-by: Romain Lenglet <rlenglet@vmware.com> Acked-by: Ben Pfaff <blp@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2014-11-09 18:58:44 -08:00
Pravin B Shelar	9ba559d9ca	openvswitch: Export symbols as GPL symbols. vport can be compiled as modules, therefore openvswitch needs to export few symbols. Export them as GPL symbols. CC: Thomas Graf <tgraf@noironetworks.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2014-11-09 18:58:44 -08:00
Alexander Aring	f7cb96f105	mac802154: protect address changes via ioctl This patch adds a netif_running check while trying to change the address attributes via ioctl. While netif_running is true these attributes should be only readable. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-09 19:50:29 +01:00
Alexander Aring	87023e1058	ieee802154: fix iface dump with lowpan This patch adds a hacked solution for an interface dump with a running lowpan interface. This will crash because lowpan and wpan interface use the same arphdr. To change the arphdr will change the UAPI, this patch checks on mtu which should on lowpan interface always different than IEEE802154_MTU. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-09 19:50:29 +01:00
Alexander Aring	7bea1ea7b4	ieee802154: netlink add rtnl lock This patch adds rtnl lock hold mechanism while accessing wpan_dev attributes. Furthermore these attributes should be protected by rtnl lock and netif_running only. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-09 19:50:29 +01:00
Alexander Aring	b03c9cccfa	mac820154: don't set monitor dev_addr This patch removes the setting of dev_addr on a monitor device. This address should be zero. A monitor should only sniff and send raw frames out. The address should be never used by upper layers and receiving frame parsing. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-09 19:50:29 +01:00
Alexander Aring	4b96aea0fc	ieee802154: add wpan_dev dump support This patch adds support for wpan_dev dump via nl802154 framework. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-09 19:50:29 +01:00
Alexander Aring	ca20ce201c	ieee802154: add wpan_phy dump support This patch adds support for dumping wpan_phy attributes via nl802154. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-09 19:50:29 +01:00
Alexander Aring	79fe1a2aa7	ieee802154: add nl802154 framework This patch adds a basic nl802154 framework. Most of this code was grabbed from nl80211 framework. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-09 19:50:29 +01:00
Alexander Aring	a6fd693f6b	ieee802154: sysfs add wpan_phy index and name This patch adds new sysfs entries for wpan_phy index and name. This needed for the new 802.15.4 userspace tool. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-09 19:50:28 +01:00
Alexander Aring	fcf39e6e88	ieee802154: add wpan_dev_list This patch adds a wpan_dev_list list into cfg802154_registered_device struct. Also adding new wpan_dev into this list while cfg802154_netdev_notifier_call. This behaviour is mostly grab from wireless core.c implementation and is needed for preparing nl802154 framework. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-09 19:50:28 +01:00
Alexander Aring	190ac1ca33	ieee802154: add iftype to wpan_dev This patch adds an iftype argument to the wpan_dev. This is needed to get the interface type from netdev ieee802154_ptr. The subif data struct can only accessible in mac802154 branch. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-09 19:50:28 +01:00
Alexander Aring	f3ada640c2	ieee802154: add cfg802154_registered_device list This patch adds a new cfg802154_rdev_list to remember all registered cfg802154_registered_device structs. This is needed to prepare the upcomming nl802154 framework. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-09 19:50:28 +01:00
Alexander Aring	f601379fa1	ieee802154: rename wpan_phy_alloc This patch renames the wpan_phy_alloc function to wpan_phy_new. This naming convention is like wireless and "wiphy_new" function. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-09 19:50:28 +01:00
Alexander Aring	5fb3f026ae	mac802154: remove mac_params in sdata This patch removes the mac_params from subif data struct. Instead we manipulate the wpan attributes directly. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-09 19:50:28 +01:00
Alexander Aring	863e88f255	mac802154: move mac pib attributes into wpan_dev This patch moves all mac pib attributes into the wpan_dev struct. Furthermore we can easier access these attributes over the netdev 802154_ptr pointer. Currently this is only possible over a complicated callback structure in mac802154 because subif data structure is accessable inside mac802154 only. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-09 19:50:27 +01:00
Ana Rey	ce674173e9	netfilter: nft_meta: add cgroup support This allows you to filter traffic by process control group (cgroup). Signed-off-by: Ana Rey <anarey@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-11-09 16:21:22 +01:00
Joe Perches	c0560b9c52	dccp: Convert DCCP_WARN to net_warn_ratelimited Remove the dependency on the "warning" sysctl (net_msg_warn) which is only used by the LIMIT_NETDEBUG macro. Convert the LIMIT_NETDEBUG use in DCCP_WARN to the more common net_warn_ratelimited mechanism. This still ratelimits based on the net_ratelimit() function, but removes the check for the sysctl. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-08 21:22:54 -05:00
Alexander Aring	b0c42cd7b2	Bluetooth: 6lowpan: fix skb_unshare behaviour This patch reverts commit: `a7807d73` ("Bluetooth: 6lowpan: Avoid memory leak if memory allocation fails") which was wrong suggested by Alexander Aring. The function skb_unshare run also kfree_skb on failure. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: stable@vger.kernel.org # 3.18.x	2014-11-08 20:29:35 +01:00
Rick Jones	36cbb2452c	udp: Increment UDP_MIB_IGNOREDMULTI for arriving unmatched multicasts As NIC multicast filtering isn't perfect, and some platforms are quite content to spew broadcasts, we should not trigger an event for skb:kfree_skb when we do not have a match for such an incoming datagram. We do though want to avoid sweeping the matter under the rug entirely, so increment a suitable statistic. This incorporates feedback from David L. Stevens, Karl Neiss and Eric Dumazet. V3 - use bool per David Miller Signed-off-by: Rick Jones <rick.jones2@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-07 15:45:50 -05:00
Herbert Xu	bfe1be38fc	net: Kill skb_copy_datagram_const_iovec Now that both macvtap and tun are using skb_copy_datagram_iter, we can kill the abomination that is skb_copy_datagram_const_iovec. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-07 12:13:34 -05:00
Herbert Xu	a8f820aa40	inet: Add skb_copy_datagram_iter This patch adds skb_copy_datagram_iter, which is identical to skb_copy_datagram_iovec except that it operates on iov_iter instead of iovec. Eventually all users of skb_copy_datagram_iovec should switch over to iov_iter and then we can remove skb_copy_datagram_iovec. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-07 12:13:34 -05:00
Jaganath Kanakkassery	cb77c3ec07	Bluetooth: Send mgmt_connected only if state is BT_CONFIG If a remote name request is initiated while acl connection is going on, and if it fails then mgmt_connected will be sent. Evetually after acl connection, authentication will not be initiated and userspace will never get pairing reply. < HCI Command: Create Connection (0x01\|0x0005) plen 13 bdaddr AA:BB:CC:DD:EE:FF ptype 0xcc18 rswitch 0x01 clkoffset 0x2306 (valid) Packet type: DM1 DM3 DM5 DH1 DH3 DH5 > HCI Event: Command Status (0x0f) plen 4 Create Connection (0x01\|0x0005) status 0x00 ncmd 1 > HCI Event: Inquiry Complete (0x01) plen 1 status 0x00 < HCI Command: Remote Name Request (0x01\|0x0019) plen 10 bdaddr AA:BB:CC:DD:EE:FF mode 1 clkoffset 0x2306 > HCI Event: Command Status (0x0f) plen 4 Remote Name Request (0x01\|0x0019) status 0x0c ncmd 1 Error: Command Disallowed > HCI Event: Connect Complete (0x03) plen 11 status 0x00 handle 50 bdaddr 00:0D:FD:47:53:B2 type ACL encrypt 0x00 < HCI Command: Read Remote Supported Features (0x01\|0x001b) plen 2 handle 50 > HCI Event: Command Status (0x0f) plen 4 Read Remote Supported Features (0x01\|0x001b) status 0x00 ncmd 1 > HCI Event: Max Slots Change (0x1b) plen 3 handle 50 slots 5 > HCI Event: Read Remote Supported Features (0x0b) plen 11 status 0x00 handle 50 Features: 0xff 0xff 0x8f 0xfe 0x9b 0xff 0x59 0x83 < HCI Command: Read Remote Extended Features (0x01\|0x001c) plen 3 handle 50 page 1 > HCI Event: Command Status (0x0f) plen 4 Read Remote Extended Features (0x01\|0x001c) status 0x00 ncmd 1 > HCI Event: Read Remote Extended Features (0x23) plen 13 status 0x00 handle 50 page 1 max 1 Features: 0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00 This patch sends mgmt_connected in remote name command status only if conn->state is BT_CONFIG Signed-off-by: Jaganath Kanakkassery <jaganath.k@samsung.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-11-07 15:43:51 +02:00
David S. Miller	1f5623106f	Merge tag 'master-2014-11-04' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== pull request: wireless 2014-11-06 Please pull this batch of fixes intended for the 3.18 stream... For the mac80211 bits, Johannes says: "This contains another small set of fixes for 3.18, these are all over the place and most of the bugs are old, one even dates back to the original mac80211 we merged into the kernel." For the iwlwifi bits, Emmanuel says: "I fix here two issues that are related to the firmware loading flow. A user reported that he couldn't load the driver because the rfkill line was pulled up while we were running the calibrations. This was happening while booting the system: systemd was restoring the "disable wifi settings" and that raised an RFKILL interrupt during the calibration. Our driver didn't handle that properly and this is now fixed." Please let me know if there are problems! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-06 22:15:20 -05:00
David S. Miller	4e84b496fd	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2014-11-06 22:01:18 -05:00
David S. Miller	6b798d70d0	Merge branch 'net_next_ovs' of git://git.kernel.org/pub/scm/linux/kernel/git/pshelar/openvswitch Pravin B Shelar says: ==================== Open vSwitch First two patches are related to OVS MPLS support. Rest of patches are mostly refactoring and minor improvements to openvswitch. v1-v2: - Fix conflicts due to "gue: Remote checksum offload" ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-06 16:33:13 -05:00
Martin Townsend	56b2c3eea3	6lowpan: move skb_free from error paths in decompression Currently we ensure that the skb is freed on every error path in IPHC decompression which makes it easy to introduce skb leaks. By centralising the skb_free into the receive function it makes future decompression routines easier to maintain. It does come at the expense of ensuring that the skb passed into the decompression routine must not be copied. Signed-off-by: Martin Townsend <mtownsend1973@gmail.com> Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Acked-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-06 22:09:48 +01:00
Joe Perches	4508349777	net: esp: Convert NETDEBUG to pr_info Commit `64ce207306` ("[NET]: Make NETDEBUG pure printk wrappers") originally had these NETDEBUG printks as always emitting. Commit `a2a316fd06` ("[NET]: Replace CONFIG_NET_DEBUG with sysctl") added a net_msg_warn sysctl to these NETDEBUG uses. Convert these NETDEBUG uses to normal pr_info calls. This changes the output prefix from "ESP: " to include "IPSec: " for the ipv4 case and "IPv6: " for the ipv6 case. These output lines are now like the other messages in the files. Other miscellanea: Neaten the arithmetic spacing to be consistent with other arithmetic spacing in the files. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-06 15:11:10 -05:00
Joe Perches	cbffccc970	net; ipv[46] - Remove 2 unnecessary NETDEBUG OOM messages These messages aren't useful as there's a generic dump_stack() on OOM. Neaten the comment and if test above the OOM by separating the assign in if into an allocation then if test. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-06 15:11:10 -05:00
Andrew Lunn	b31f65fb43	net: dsa: slave: Fix autoneg for phys on switch MDIO bus When the ports phys are connected to the switches internal MDIO bus, we need to connect the phy to the slave netdev, otherwise auto-negotiation etc, does not work. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-06 15:06:28 -05:00
Jiri Pirko	0c6965dd31	sched: fix act file names in header comment Fixes: `4bba3925` ("[PKT_SCHED]: Prefix tc actions with act_") Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-06 15:04:41 -05:00
Steffen Klassert	ea3dc9601b	ip6_tunnel: Add support for wildcard tunnel endpoints. This patch adds support for tunnels with local or remote wildcard endpoints. With this we get a NBMA tunnel mode like we have it for ipv4 and sit tunnels. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-06 14:19:20 -05:00
Steffen Klassert	d50051407f	ipv6: Allow sending packets through tunnels with wildcard endpoints Currently we need the IP6_TNL_F_CAP_XMIT capabiltiy to transmit packets through an ipv6 tunnel. This capability is set when the tunnel gets configured, based on the tunnel endpoint addresses. On tunnels with wildcard tunnel endpoints, we need to do the capabiltiy checking on a per packet basis like it is done in the receive path. This patch extends ip6_tnl_xmit_ctl() to take local and remote addresses as parameters to allow for per packet capabiltiy checking. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-06 14:19:19 -05:00
Kuba Pawlak	9645c76c7c	Bluetooth: Sort switch cases by opcode's numeric value Opcodes in switch/case in hci_cmd_status_evt are not sorted by value. This patch restores proper ordering. Signed-off-by: Kuba Pawlak <kubax.t.pawlak@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-06 19:38:42 +01:00
Kuba Pawlak	50fc85f1b0	Bluetooth: Clear role switch pending flag If role switch was rejected by the controller and HCI Event: Command Status returned with status "Command Disallowed" (0x0C) the flag HCI_CONN_RSWITCH_PEND remains set. No further role switches are possible as this flag prevents us from sending any new HCI Switch Role requests and the only way to clear it is to receive a valid HCI Event Switch Role. This patch clears the flag if command was rejected. 2013-01-01 00:03:44.209913 < HCI Command: Switch Role (0x02\|0x000b) plen 7 bdaddr BC:C6:DB:C4:6F:79 role 0x00 Role: Master 2013-01-01 00:03:44.210867 > HCI Event: Command Status (0x0f) plen 4 Switch Role (0x02\|0x000b) status 0x0c ncmd 1 Error: Command Disallowed Signed-off-by: Kuba Pawlak <kubax.t.pawlak@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-06 19:38:42 +01:00
Ronald Wahl	4f031fa9f1	mac80211: Fix regression that triggers a kernel BUG with CCMP Commit `7ec7c4a9a6` (mac80211: port CCMP to cryptoapi's CCM driver) introduced a regression when decrypting empty packets (data_len == 0). This will lead to backtraces like: (scatterwalk_start) from [<c01312f4>] (scatterwalk_map_and_copy+0x2c/0xa8) (scatterwalk_map_and_copy) from [<c013a5a0>] (crypto_ccm_decrypt+0x7c/0x25c) (crypto_ccm_decrypt) from [<c032886c>] (ieee80211_aes_ccm_decrypt+0x160/0x170) (ieee80211_aes_ccm_decrypt) from [<c031c628>] (ieee80211_crypto_ccmp_decrypt+0x1ac/0x238) (ieee80211_crypto_ccmp_decrypt) from [<c032ef28>] (ieee80211_rx_handlers+0x870/0x1d24) (ieee80211_rx_handlers) from [<c0330c7c>] (ieee80211_prepare_and_rx_handle+0x8a0/0x91c) (ieee80211_prepare_and_rx_handle) from [<c0331260>] (ieee80211_rx+0x568/0x730) (ieee80211_rx) from [<c01d3054>] (__carl9170_rx+0x94c/0xa20) (__carl9170_rx) from [<c01d3324>] (carl9170_rx_stream+0x1fc/0x320) (carl9170_rx_stream) from [<c01cbccc>] (carl9170_usb_tasklet+0x80/0xc8) (carl9170_usb_tasklet) from [<c00199dc>] (tasklet_hi_action+0x88/0xcc) (tasklet_hi_action) from [<c00193c8>] (__do_softirq+0xcc/0x200) (__do_softirq) from [<c0019734>] (irq_exit+0x80/0xe0) (irq_exit) from [<c0009c10>] (handle_IRQ+0x64/0x80) (handle_IRQ) from [<c000c3a0>] (__irq_svc+0x40/0x4c) (__irq_svc) from [<c0009d44>] (arch_cpu_idle+0x2c/0x34) Such packets can appear for example when using the carl9170 wireless driver because hardware sometimes generates garbage when the internal FIFO overruns. This patch adds an additional length check. Cc: stable@vger.kernel.org Fixes: `7ec7c4a9a6` ("mac80211: port CCMP to cryptoapi's CCM driver") Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Ronald Wahl <ronald.wahl@raritan.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-06 12:42:22 +01:00
Pravin B Shelar	a85311bf1f	openvswitch: Avoid NULL mask check while building mask OVS does mask validation even if it does not need to convert netlink mask attributes to mask structure. ovs_nla_get_match() caller can pass NULL mask structure pointer if the caller does not need mask. Therefore NULL check is required in SW_FLOW_KEY* macros. Following patch does not convert mask netlink attributes if mask pointer is NULL, so we do not need these checks in SW_FLOW_KEY* macro. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Daniele Di Proietto <ddiproietto@vmware.com> Acked-by: Andy Zhou <azhou@nicira.com>	2014-11-05 23:52:35 -08:00
Pravin B Shelar	2fdb957d63	openvswitch: Refactor action alloc and copy api. There are two separate API to allocate and copy actions list. Anytime OVS needs to copy action list, it needs to call both functions. Following patch moves action allocation to copy function to avoid code duplication. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>	2014-11-05 23:52:35 -08:00
Joe Stringer	41af73e9c1	openvswitch: Move key_attr_size() to flow_netlink.h. flow-netlink has netlink related code. Signed-off-by: Joe Stringer <joestringer@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2014-11-05 23:52:35 -08:00
Lorand Jakab	d98612b8c1	openvswitch: Remove flow member from struct ovs_skb_cb The 'flow' memeber was chosen for removal because it's only used in ovs_execute_actions() we can pass it as argument to this function. Signed-off-by: Lorand Jakab <lojakab@cisco.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2014-11-05 23:52:35 -08:00
Chunhe Li	e1f9c356d2	openvswitch: Drop packets when interdev is not up If the internal device is not up, it should drop received packets. Sometimes it receive the broadcast or multicast packets, and the ip protocol stack will casue more cpu usage wasted. Signed-off-by: Chunhe Li <lichunhe@huawei.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2014-11-05 23:52:35 -08:00
Andy Zhou	cc3a5ae6f2	openvswitch: Refactor get_dp() function into multiple access APIs. Avoid recursive read_rcu_lock() by using the lighter weight get_dp_rcu() API. Add proper locking assertions to get_dp(). Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2014-11-05 23:52:34 -08:00
Joe Stringer	ca7105f278	openvswitch: Refactor ovs_flow_cmd_fill_info(). Split up ovs_flow_cmd_fill_info() to make it easier to cache parts of a dump reply. This will be used to streamline flow_dump in a future patch. Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Thomas Graf <tgraf@noironetworks.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2014-11-05 23:52:34 -08:00
Andy Zhou	738967b8bf	openvswitch: refactor do_output() to move NULL check out of fast path skb_clone() NULL check is implemented in do_output(), as past of the common (fast) path. Refactoring so that NULL check is done in the slow path, immediately after skb_clone() is called. Besides optimization, this change also improves code readability by making the skb_clone() NULL check consistent within OVS datapath module. Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2014-11-05 23:52:34 -08:00
Jesse Gross	426cda5cc1	openvswitch: Additional logging for -EINVAL on flow setups. There are many possible ways that a flow can be invalid so we've added logging for most of them. This adds logs for the remaining possible cases so there isn't any ambiguity while debugging. CC: Federico Iezzi <fiezzi@enter.it> Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Thomas Graf <tgraf@noironetworks.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2014-11-05 23:52:34 -08:00
Joe Stringer	1b760fb9a8	openvswitch: Remove redundant tcp_flags code. These two cases used to be treated differently for IPv4/IPv6, but they are now identical. Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2014-11-05 23:52:34 -08:00
Pravin B Shelar	9b996e544a	openvswitch: Move table destroy to dp-rcu callback. Ths simplifies flow-table-destroy API. No need to pass explicit parameter about context. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Thomas Graf <tgraf@redhat.com>	2014-11-05 23:52:34 -08:00
Simon Horman	25cd9ba0ab	openvswitch: Add basic MPLS support to kernel Allow datapath to recognize and extract MPLS labels into flow keys and execute actions which push, pop, and set labels on packets. Based heavily on work by Leo Alterman, Ravi K, Isaku Yamahata and Joe Stringer. Cc: Ravi K <rkerur@gmail.com> Cc: Leo Alterman <lalterman@nicira.com> Cc: Isaku Yamahata <yamahata@valinux.co.jp> Cc: Joe Stringer <joe@wand.net.nz> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2014-11-05 23:52:33 -08:00
Pravin B Shelar	59b93b41e7	net: Remove MPLS GSO feature. Device can export MPLS GSO support in dev->mpls_features same way it export vlan features in dev->vlan_features. So it is safe to remove NETIF_F_GSO_MPLS redundant flag. Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2014-11-05 23:52:33 -08:00
Tom Herbert	e1b2cb6550	fou: Fix typo in returning flags in netlink When filling netlink info, dport is being returned as flags. Fix instances to return correct value. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-05 22:18:20 -05:00
Daniel Borkmann	4c672e4b42	ipv6: mld: fix add_grhead skb_over_panic for devs with large MTUs It has been reported that generating an MLD listener report on devices with large MTUs (e.g. 9000) and a high number of IPv6 addresses can trigger a skb_over_panic(): skbuff: skb_over_panic: text:ffffffff80612a5d len:3776 put:20 head:ffff88046d751000 data:ffff88046d751010 tail:0xed0 end:0xec0 dev:port1 ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:100! invalid opcode: 0000 [#1] SMP Modules linked in: ixgbe(O) CPU: 3 PID: 0 Comm: swapper/3 Tainted: G O 3.14.23+ #4 [...] Call Trace: <IRQ> [<ffffffff80578226>] ? skb_put+0x3a/0x3b [<ffffffff80612a5d>] ? add_grhead+0x45/0x8e [<ffffffff80612e3a>] ? add_grec+0x394/0x3d4 [<ffffffff80613222>] ? mld_ifc_timer_expire+0x195/0x20d [<ffffffff8061308d>] ? mld_dad_timer_expire+0x45/0x45 [<ffffffff80255b5d>] ? call_timer_fn.isra.29+0x12/0x68 [<ffffffff80255d16>] ? run_timer_softirq+0x163/0x182 [<ffffffff80250e6f>] ? __do_softirq+0xe0/0x21d [<ffffffff8025112b>] ? irq_exit+0x4e/0xd3 [<ffffffff802214bb>] ? smp_apic_timer_interrupt+0x3b/0x46 [<ffffffff8063f10a>] ? apic_timer_interrupt+0x6a/0x70 mld_newpack() skb allocations are usually requested with dev->mtu in size, since commit `72e09ad107` ("ipv6: avoid high order allocations") we have changed the limit in order to be less likely to fail. However, in MLD/IGMP code, we have some rather ugly AVAILABLE(skb) macros, which determine if we may end up doing an skb_put() for adding another record. To avoid possible fragmentation, we check the skb's tailroom as skb->dev->mtu - skb->len, which is a wrong assumption as the actual max allocation size can be much smaller. The IGMP case doesn't have this issue as commit `57e1ab6ead` ("igmp: refine skb allocations") stores the allocation size in the cb[]. Set a reserved_tailroom to make it fit into the MTU and use skb_availroom() helper instead. This also allows to get rid of igmp_skb_size(). Reported-by: Wei Liu <lw1a2.jing@gmail.com> Fixes: `72e09ad107` ("ipv6: avoid high order allocations") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: David L Stevens <david.stevens@oracle.com> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-05 22:12:30 -05:00
Joe Perches	1744bea1fa	net: Convert SEQ_START_TOKEN/seq_printf to seq_puts Using a single fixed string is smaller code size than using a format and many string arguments. Reduces overall code size a little. $ size net/ipv4/igmp.o* net/ipv6/mcast.o* net/ipv6/ip6_flowlabel.o* text data bss dec hex filename 34269 7012 14824 56105 db29 net/ipv4/igmp.o.new 34315 7012 14824 56151 db57 net/ipv4/igmp.o.old 30078 7869 13200 51147 c7cb net/ipv6/mcast.o.new 30105 7869 13200 51174 c7e6 net/ipv6/mcast.o.old 11434 3748 8580 23762 5cd2 net/ipv6/ip6_flowlabel.o.new 11491 3748 8580 23819 5d0b net/ipv6/ip6_flowlabel.o.old Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-05 22:04:55 -05:00
Marcelo Leitner	1f37bf87aa	tcp: zero retrans_stamp if all retrans were acked Ueki Kohei reported that when we are using NewReno with connections that have a very low traffic, we may timeout the connection too early if a second loss occurs after the first one was successfully acked but no data was transfered later. Below is his description of it: When SACK is disabled, and a socket suffers multiple separate TCP retransmissions, that socket's ETIMEDOUT value is calculated from the time of the first retransmission instead of the latest retransmission. This happens because the tcp_sock's retrans_stamp is set once then never cleared. Take the following connection: Linux remote-machine \| \| send#1---->(1)\|--------> data#1 --------->\| \| \| \| RTO : : \| \| \| ---(2)\|----> data#1(retrans) ---->\| \| (3)\|<---------- ACK <----------\| \| \| \| \| : : \| : : \| : : 16 minutes (or more) : \| : : \| : : \| : : \| \| \| send#2---->(4)\|--------> data#2 --------->\| \| \| \| RTO : : \| \| \| ---(5)\|----> data#2(retrans) ---->\| \| \| \| \| \| \| RTO2 : : \| \| \| \| \| \| ETIMEDOUT<----(6)\| \| (1) One data packet sent. (2) Because no ACK packet is received, the packet is retransmitted. (3) The ACK packet is received. The transmitted packet is acknowledged. At this point the first "retransmission event" has passed and been recovered from. Any future retransmission is a completely new "event". (4) After 16 minutes (to correspond with retries2=15), a new data packet is sent. Note: No data is transmitted between (3) and (4). The socket's timeout SHOULD be calculated from this point in time, but instead it's calculated from the prior "event" 16 minutes ago. (5) Because no ACK packet is received, the packet is retransmitted. (*6) At the time of the 2nd retransmission, the socket returns ETIMEDOUT. Therefore, now we clear retrans_stamp as soon as all data during the loss window is fully acked. Reported-by: Ueki Kohei Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Acked-by: Neal Cardwell <ncardwell@google.com> Tested-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-05 16:59:49 -05:00
David S. Miller	51f3d02b98	net: Add and use skb_copy_datagram_msg() helper. This encapsulates all of the skb_copy_datagram_iovec() callers with call argument signature "skb, offset, msghdr->msg_iov, length". When we move to iov_iters in the networking, the iov_iter object will sit in the msghdr. Having a helper like this means there will be less places to touch during that transformation. Based upon descriptions and patch from Al Viro. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-05 16:46:40 -05:00
Tom Herbert	a8d31c128b	gue: Receive side of remote checksum offload Add processing of the remote checksum offload option in both the normal path as well as the GRO path. The implements patching the affected checksum to derive the offloaded checksum. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-05 16:30:04 -05:00
Tom Herbert	b17f709a24	gue: TX support for using remote checksum offload option Add if_tunnel flag TUNNEL_ENCAP_FLAG_REMCSUM to configure remote checksum offload on an IP tunnel. Add logic in gue_build_header to insert remote checksum offload option. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-05 16:30:03 -05:00
Tom Herbert	e585f23636	udp: Changes to udp_offload to support remote checksum offload Add a new GSO type, SKB_GSO_TUNNEL_REMCSUM, which indicates remote checksum offload being done (in this case inner checksum must not be offloaded to the NIC). Added logic in __skb_udp_tunnel_segment to handle remote checksum offload case. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-05 16:30:03 -05:00
Tom Herbert	5024c33ac3	gue: Add infrastructure for flags and options Add functions and basic definitions for processing standard flags, private flags, and control messages. This includes definitions to compute length of optional fields corresponding to a set of flags. Flag validation is in validate_gue_flags function. This checks for unknown flags, and that length of optional fields is <= length in guehdr hlen. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-05 16:30:03 -05:00
Tom Herbert	4bcb877d25	udp: Offload outer UDP tunnel csum if available In __skb_udp_tunnel_segment if outer UDP checksums are enabled and ip_summed is not already CHECKSUM_PARTIAL, set up checksum offload if device features allow it. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-05 16:30:03 -05:00
Tom Herbert	63487babf0	net: Move fou_build_header into fou.c and refactor Move fou_build_header out of ip_tunnel.c and into fou.c splitting it up into fou_build_header, gue_build_header, and fou_build_udp. This allows for other users for TX of FOU or GUE. Change ip_tunnel_encap to call fou_build_header or gue_build_header based on the tunnel encapsulation type. Similarly, added fou_encap_hlen and gue_encap_hlen functions which are called by ip_encap_hlen. New net/fou.h has prototypes and defines for this. Added NET_FOU_IP_TUNNELS configuration. When this is set, IP tunnels can use FOU/GUE and fou module is also selected. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-05 16:30:02 -05:00
Alexander Aring	0916c02205	mac802154: fix typo promisuous to promiscuous Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-05 21:53:06 +01:00
Alexander Aring	e57a894684	mac802154: use IEEE802154_EXTENDED_ADDR_LEN This patch removes the af_ieee802154 defines and use the IEEE802154_EXTENDED_ADDR_LEN. We should do this everywhere in the 802.15.4 subsystem because af_ieee802154 should be normally an uapi header. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-05 21:53:06 +01:00
Alexander Aring	dee56d1477	mac802154: add support for perm_extended_addr This patch adding support for a perm extended address. This is useful when a device supports an eeprom with a programmed static extended address. If a device doesn't support such eeprom or serial registers then the driver should generate a random extended address. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-05 21:53:05 +01:00
Alexander Aring	705cbbbe9c	mac802154: cleanup ieee802154_netdev_to_extended_addr This patch cleanups the ieee802154_be64_to_le64 to have a similar function like ieee802154_le64_to_be64 only with switched source and destionation types. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-05 21:53:05 +01:00
Alexander Aring	7c118c1a86	mac802154: add ieee802154_vif struct This patch adds an ieee802154_vif similar like the ieee80211_vif which holds the interface type and maybe further more attributes like the ieee80211_vif structure. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Cc: Varka Bhadram <varkabhadram@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-05 21:53:04 +01:00
Alexander Aring	e4962a1443	mac802154: add default interface registration This patch adds a default interface registration for a wpan interface type. Currently the 802.15.4 subsystem need to call userspace tools to add an interface. This patch is like mac80211 handling for registration a station interface type by default. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-05 21:53:04 +01:00
Alexander Aring	bd28a11f25	ieee802154: remove mlme get_phy callback This patch removes the get_phy callback from mlme ops structure. Instead we doing a dereference via ieee802154_ptr dev pointer. For backwards compatibility we need to run get_device after dereference wpan_phy via ieee802154_ptr. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-05 21:53:04 +01:00
Alexander Aring	d5ae67bacd	ieee802154: rework interface registration This patch meld mac802154_netdev_register into ieee802154_if_add function. Also we have now only one alloc_netdev call with one interface setup routine "ieee802154_if_setup" instead two different one for each interface type. This patch checks via runtime the interface type and do different handling now. Additional we add the wpan_dev struct in ieee802154_sub_if_data and set the new ieee802154_ptr while netdev registration. This behaviour is very similar the mac80211 netdev registration functionality. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-05 21:53:04 +01:00
Alexander Aring	12cb56c237	mac802154: move dev_hold out of ieee802154_if_add This patch moves the dev_hold call inside of nl-phy ieee802154_add_iface function. The ieee802154_add_iface is the only one function which use the ieee802154_if_add function and contains the corresponding dev_put call. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-05 21:53:03 +01:00
Alexander Aring	986a8abfc5	mac802154: move interface add handling in iface This patch moves and renames the mac802154_add_iface and mac802154_netdev_register functions into iface.c. The function mac802154_add_iface is renamed to ieee802154_if_add which is a similar naming convention like mac80211. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-05 21:53:03 +01:00
Alexander Aring	b210b18747	mac802154: move interface del handling in iface This patch moves and rename the mac802154_del_iface function into iface.c and rename the function to ieee802154_if_remove which is a similar naming convention like mac80211. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-05 21:53:03 +01:00
Alexander Aring	9f3295b9ea	ieee802154: remove nl802154 unused functions The include/net/nl802154.h file contains a lot of prototypes which are not used inside of ieee802154 subsystem. This patch removes this file and make the only one used prototype "ieee802154_nl_start_confirm" as static declaration in ieee802154/nl-mac.c Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-05 21:53:03 +01:00
Alexander Aring	53f9ee61b4	ieee802154: rework wpan_phy index assignment This patch reworks the wpan_phy index incrementation. It's now similar like wireless wiphy index incrementation. We move the wpan_phy index attribute inside of cfg802154_registered_device and use atomic operations instead locking mechanism via wpan_phy_mutex. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-05 21:53:03 +01:00
Jesse Gross	d3ca9eafc0	geneve: Unregister pernet subsys on module unload. The pernet ops aren't ever unregistered, which causes a memory leak and an OOPs if the module is ever reinserted. Fixes: `0b5e8b8eea` ("net: Add Geneve tunneling protocol driver") CC: Andy Zhou <azhou@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-05 15:00:51 -05:00
Jesse Gross	45cac46e51	geneve: Set GSO type on transmit. Geneve does not currently set the inner protocol type when transmitting packets. This causes GSO segmentation to fail on NICs that do not support Geneve offloading. CC: Andy Zhou <azhou@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-05 15:00:51 -05:00
Steven Rostedt (Red Hat)	e71456ae98	netfilter: Remove checks of seq_printf() return values The return value of seq_printf() is soon to be removed. Remove the checks from seq_printf() in favor of seq_has_overflowed(). Link: http://lkml.kernel.org/r/20141104142236.GA10239@salvia Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Cc: netfilter-devel@vger.kernel.org Cc: coreteam@netfilter.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2014-11-05 14:11:02 -05:00
Joe Perches	824f1fbee7	netfilter: Convert print_tuple functions to return void Since adding a new function to seq_file (seq_has_overflowed()) there isn't any value for functions called from seq_show to return anything. Remove the int returns of the various print_tuple/<foo>_print_tuple functions. Link: http://lkml.kernel.org/p/f2e8cf8df433a197daa62cbaf124c900c708edc7.1412031505.git.joe@perches.com Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Cc: netfilter-devel@vger.kernel.org Cc: coreteam@netfilter.org Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2014-11-05 14:10:33 -05:00
Steven Rostedt (Red Hat)	37246a5837	netfilter: Remove return values for print_conntrack callbacks The seq_printf() and friends are having their return values removed. The print_conntrack() returns the result of seq_printf(), which is meaningless when seq_printf() returns void. Might as well remove the return values of print_conntrack() as well. Link: http://lkml.kernel.org/r/20141029220107.465008329@goodmis.org Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Cc: netfilter-devel@vger.kernel.org Cc: coreteam@netfilter.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2014-11-05 14:09:47 -05:00
Fabian Frederick	6cf1093e58	udp: remove blank line between set and test Suggested-by: Joe Perches <joe@perches.com> Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-04 17:12:10 -05:00
Florent Fourcot	869ba988fe	ipv6: trivial, add bracket for the if block The "else" block is on several lines and use bracket. Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-04 17:10:19 -05:00
Fabian Frederick	05006e8c59	esp4: remove assignment in if condition Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-04 16:57:49 -05:00
John W. Linville	bf515fb11a	This relatively large batch of changes is comprised of the following: * large mac80211-hwsim changes from Ben, Jukka and a bit myself * OCB/WAVE/11p support from Rostislav on behalf of the Czech Technical University in Prague and Volkswagen Group Research * minstrel VHT work from Karl * more CSA work from Luca * WMM admission control support in mac80211 (myself) * various smaller fixes, spelling corrections, and minor API additions -----BEGIN PGP SIGNATURE----- iQIcBAABCAAGBQJUWMs/AAoJEDBSmw7B7bqrmBQQAIbfAe7wH1WifRtOnhw3zWQQ K36+Edf3HlQ+EIkSs63QousRj2e7pGDOyhzMWLaqsmeTLteUtlGbr7qwiJO1QZdf Ml2V5O2s+b8hUIClDBVQF2L6+GGUmRUdQqvDDhkN1guoxD/Nk8cNtsRkSdiXWJWy R48NzvYDflBhc8uqPtR8jDb10eM3c00YP9HB+w9hYAfizD+FRue7UNp4MQIqwp9V HdKRT6L2n/6QA+Mzse0rMDes5qI7nIUNgj+hjqgJSnhITPMgGR5j/pitnVHrr81M ngOipBFG3svsQrwZh8nM4Llp0cM4Gs+GlgCieu9+TJpr2sY00Z3kYcp0pxtDoSxz Wblqz9n/bnW9mrkEfl12XqwwT5vguchwHoZ9cXhejDxSawWXoTRx20uW4ahO8ArA kWwwjTBVsQ5WMCtOBiqggzNKghwCc2ILmcZnjGdg9aNXcWsmQ4vyeCfG2QxBz/UB Grv/f9NSy6mzKQ34yv+lyR7rFZ8XcT03EVAnZSYz8X0ZZGxwtFupRp1RrBh1KPtD TJoe6Q71FfHKYRJ2xgygYkQFo+r9d0BKBeerq+Vu2hBeaqyi4aUwSj7d1sUaaq6N tL8fmAUqFjVOOUFeH1g07Xke5QD+yrEC7sJKkeRMfcRGB+dEa+2m3I5p4WDz9bWM AEvFSsYr/I9KI4d1huXD =6GIj -----END PGP SIGNATURE----- Merge tag 'mac80211-next-for-john-2014-11-04' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg <johannes@sipsolutions.net> says: "This relatively large batch of changes is comprised of the following: * large mac80211-hwsim changes from Ben, Jukka and a bit myself * OCB/WAVE/11p support from Rostislav on behalf of the Czech Technical University in Prague and Volkswagen Group Research * minstrel VHT work from Karl * more CSA work from Luca * WMM admission control support in mac80211 (myself) * various smaller fixes, spelling corrections, and minor API additions" Conflicts: drivers/net/wireless/ath/wil6210/cfg80211.c Signed-off-by: John W. Linville <linville@tuxdriver.com>	2014-11-04 16:18:12 -05:00
Florian Westphal	f7b3bec6f5	net: allow setting ecn via routing table This patch allows to set ECN on a per-route basis in case the sysctl tcp_ecn is not set to 1. In other words, when ECN is set for specific routes, it provides a tcp_ecn=1 behaviour for that route while the rest of the stack acts according to the global settings. One can use 'ip route change dev $dev $net features ecn' to toggle this. Having a more fine-grained per-route setting can be beneficial for various reasons, for example, 1) within data centers, or 2) local ISPs may deploy ECN support for their own video/streaming services [1], etc. There was a recent measurement study/paper [2] which scanned the Alexa's publicly available top million websites list from a vantage point in US, Europe and Asia: Half of the Alexa list will now happily use ECN (tcp_ecn=2, most likely blamed to commit `255cac91c3` ("tcp: extend ECN sysctl to allow server-side only ECN") ;)); the break in connectivity on-path was found is about 1 in 10,000 cases. Timeouts rather than receiving back RSTs were much more common in the negotiation phase (and mostly seen in the Alexa middle band, ranks around 50k-150k): from 12-thousand hosts on which there _may_ be ECN-linked connection failures, only 79 failed with RST when _not_ failing with RST when ECN is not requested. It's unclear though, how much equipment in the wild actually marks CE when buffers start to fill up. We thought about a fallback to non-ECN for retransmitted SYNs as another global option (which could perhaps one day be made default), but as Eric points out, there's much more work needed to detect broken middleboxes. Two examples Eric mentioned are buggy firewalls that accept only a single SYN per flow, and middleboxes that successfully let an ECN flow establish, but later mark CE for all packets (so cwnd converges to 1). [1] http://www.ietf.org/proceedings/89/slides/slides-89-tsvarea-1.pdf, p.15 [2] http://ecn.ethz.ch/ Joint work with Daniel Borkmann. Reference: http://thread.gmane.org/gmane.linux.network/335797 Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-04 16:06:09 -05:00
Florian Westphal	f1673381b1	syncookies: split cookie_check_timestamp() into two functions The function cookie_check_timestamp(), both called from IPv4/6 context, is being used to decode the echoed timestamp from the SYN/ACK into TCP options used for follow-up communication with the peer. We can remove ECN handling from that function, split it into a separate one, and simply rename the original function into cookie_decode_options(). cookie_decode_options() just fills in tcp_option struct based on the echoed timestamp received from the peer. Anything that fails in this function will actually discard the request socket. While this is the natural place for decoding options such as ECN which commit `172d69e63c` ("syncookies: add support for ECN") added, we argue that in particular for ECN handling, it can be checked at a later point in time as the request sock would actually not need to be dropped from this, but just ECN support turned off. Therefore, we split this functionality into cookie_ecn_ok(), which tells us if the timestamp indicates ECN support AND the tcp_ecn sysctl is enabled. This prepares for per-route ECN support: just looking at the tcp_ecn sysctl won't be enough anymore at that point; if the timestamp indicates ECN and sysctl tcp_ecn == 0, we will also need to check the ECN dst metric. This would mean adding a route lookup to cookie_check_timestamp(), which we definitely want to avoid. As we already do a route lookup at a later point in cookie_{v4,v6}_check(), we can simply make use of that as well for the new cookie_ecn_ok() function w/o any additional cost. Joint work with Daniel Borkmann. Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-04 16:06:09 -05:00
Florian Westphal	274e2da0ec	syncookies: avoid magic values and document which-bit-is-what-option Was a bit more difficult to read than needed due to magic shifts; add defines and document the used encoding scheme. Joint work with Daniel Borkmann. Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-04 16:06:08 -05:00
Mika Westerberg	72daceb9a1	net: rfkill: gpio: Add default GPIO driver mappings for ACPI The driver uses devm_gpiod_get_index(..., index) so that the index refers directly to the GpioIo resource under the ACPI device. The problem with this is that if the ordering changes we get wrong GPIOs. With ACPI 5.1 _DSD we can now use names instead to reference GPIOs analogous to Device Tree. However, we still have systems out there that do not provide _DSD at all. These systems must be supported as well. Luckily we now have acpi_dev_add_driver_gpios() that can be used to provide mappings for systems where _DSD is not provided and still take advantage of _DSD if it exists. This patch changes the driver to create default GPIO mappings if we are running on ACPI system. While there we can drop the indices completely and use devm_gpiod_get() with name instead. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: John W. Linville <linville@tuxdriver.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-11-04 21:58:24 +01:00
John W. Linville	0c9a67c8f1	This contains another small set of fixes for 3.18, these are all over the place and most of the bugs are old, one even dates back to the original mac80211 we merged into the kernel. -----BEGIN PGP SIGNATURE----- iQIcBAABCAAGBQJUWJSYAAoJEDBSmw7B7bqr17IP/3RFbqI4S/ceZzrVNLtvGDUd MIlkGhngDdhpFhSxTdOH4opFM/j9bkwndk0F35d4r94mCeB5eJQKrtUfeon/7aft AKaRa3CNsEVQgCempCYOKGwlJZQQ86IL6IvU4CW5CTNHENUBLA83KHqX+6Aoumhm mdJxhSzmB53Qn1bteIJXyJmOjgxQvBZggBIF/25Xnosb3FBH3hvPsH0qbIKZaicy PlD5JWk9UseySjLNwk1/jriQ4koF5Dy/BVRyQ/0fRYswdmS3o2EiC4JOWjsOfIUi NE9Ax+DAKvHHGYNcsX/hXsPJTc6fYgq3INEZBvnK04GHVFVGLq1WoEIfOeLugK7o j7OIEJbkKAQjJSnEpB9Y6YHO/jPXEokJjUNT7VuZJqLElp4Hd8K9jnhKD9jkZBA6 TGjNO5NJqgGdlxnq3nu4+XFh9StAam6J1Ey1TWarc6Kxd8Gtg3Ymkj3cO46rHcQU JX3i3RGlYqibEQ0NVtZ4EfnGjtcGx0Vbf+yAc9ZpWzKFvX9YKS1wuOd5i/eZI8bb hxMjHFwmViV3Ifk9GjBNKioXkCpEfk9Q3pKzRllHQn56ueTu1mBvAfIe93PRm9kR y/giIZvHEhs8VH2PHVuHzT16YMVnNfQniAi+BK73QWC3zAhj1ss3xN33+Q8FfpMM xw/prlY9IAH2A9zis1Vz =8uwd -----END PGP SIGNATURE----- Merge tag 'mac80211-for-john-2014-11-04' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg <johannes@sipsolutions.net> says: "This contains another small set of fixes for 3.18, these are all over the place and most of the bugs are old, one even dates back to the original mac80211 we merged into the kernel." Signed-off-by: John W. Linville <linville@tuxdriver.com>	2014-11-04 15:56:33 -05:00
Fabian Frederick	436f7c2068	igmp: remove camel case definitions use standard uppercase for definitions Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-04 15:13:18 -05:00
Fabian Frederick	c18450a52a	udp: remove else after return else is unnecessary after return 0 in __udp4_lib_rcv() Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-04 15:13:18 -05:00
Fabian Frederick	aa1f731e52	inet: frags: remove inline on static in c file remove __inline__ / inline and let compiler decide what to do with static functions Inspired-by: "David S. Miller" <davem@davemloft.net> Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-04 15:13:18 -05:00
Fabian Frederick	0d3979b9c7	ipv4: remove 0/NULL assignment on static static values are automatically initialized to 0 Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-04 15:09:52 -05:00
Fabian Frederick	c9f503b006	ipv4: use seq_puts instead of seq_printf where possible Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-04 15:09:52 -05:00
Fabian Frederick	b92022f3e5	tcp: spelling s/plugable/pluggable Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-04 15:09:52 -05:00
Fabian Frederick	988b13438c	cipso: remove NULL assignment on static Also add blank line after structure declarations Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-04 15:09:52 -05:00
Fabian Frederick	4c787b1626	ipv4: include linux/bug.h instead of asm/bug.h Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-04 15:09:20 -05:00
Fabian Frederick	4973404f81	cipso: kerneldoc warning fix Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-04 15:09:20 -05:00
Pablo Neira Ayuso	c5a589cc30	netfilter: nf_log: fix sparse warning in nf_logger_find_get() net/netfilter/nf_log.c:157:16: warning: incorrect type in assignment (different address spaces) net/netfilter/nf_log.c:157:16: expected struct nf_logger logger net/netfilter/nf_log.c:157:16: got struct nf_logger [noderef] <asn:4><noident> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-11-04 17:56:31 +01:00
Simon Vincent	980edbd503	6lowpan: fix udp header compression when using raw sockets If you use RAW sockets the transport header offset is not set by the ipv6 stack so when we get to the udp header compression it does not compress the right part of the packet. This patch adds a check for this scenario and sets the transport header offset. Signed-off-by: Simon Vincent <simon.vincent@xsilon.com> Acked-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-04 17:31:01 +01:00
Henning Rogge	1ef4c85049	cfg80211: fix nl80211 cmd id in nl80211_send_mpath() Netlink command for nl80211_send_mpath() should be NL80211_CMD_NEW_MPATH. Signed-off-by: Henning Rogge <henning.rogge@fkie.fraunhofer.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-04 16:37:22 +01:00
Eliad Peller	cf2c92d840	mac80211: replace restart_complete() with reconfig_complete() Drivers might want to know also when mac80211 has completed reconfiguring after resume (e.g. in order to know when frames can be passed to mac80211). Rename restart_complete() to a more-generic reconfig_complete(), and add a new enum to indicate the reconfiguration type. Update the current users with the new prototype. Signed-off-by: Eliad Peller <eliadx.peller@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-04 13:49:00 +01:00
Andrei Otcheretianski	13a8098af9	mac80211: increase U-APSD max service period length Deliver up to 128 frames during service period instead of 8 if unlimited is specified by the client during association. 8 was just an arbitrary value; so is 128 since unlimited can be any number. However for large traffic bursts, increasing this value looks reasonable. Also, it seems that a few certification tests expect more frames to be delivered during SP. Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-04 13:18:22 +01:00
Johannes Berg	8ed2874715	mac80211: handle RIC data element in reassociation request When the RIC data element (RDE) is included in the IEs coming from userspace for an association request, its handling is currently broken as any IEs that are contained within it would be split off from it and inserted again after all the IEs that mac80211 generates (e.g. HT, VHT.) To fix this, treat the RIC element specially, and stop after it only when we find something that doesn't actually belong to it. This assumes userspace is actually correctly building it, directly after the fast BSS transition IE and before all the others like extended capabilities. This leaves as a potential problem the case where userspace is building the following IEs: [RDE] [vendor resource description] [vendor non-resource IE] In this case, we'd erroneously consider all three IEs to be part of the RIC data together, and not split them between the two vendor IEs. Unfortunately, it isn't easily possible to distinguish vendor IEs, so this isn't easy to fix. Luckily, this case is rare as normally wpa_supplicant will include an extended capabilities IE in the IEs, and that certainly will break the two vendor IEs apart correctly. Reviewed-by: Eliad Peller <eliad@wizery.com> Reviewed-by: Beni Lev <beni.lev@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-04 13:18:21 +01:00
Rostislav Lisovy	239281f803	mac80211: 802.11p OCB mode support This patch adds 802.11p OCB (Outside the Context of a BSS) mode support. When communicating in OCB mode a mandatory wildcard BSSID (48 '1' bits) is used. The EDCA parameters handling function was changed to support 802.11p specific values. The insertion of a newly discovered STAs is done in the similar way as in the IBSS mode -- through the deferred insertion. The OCB mode uses a periodic 'housekeeping task' for expiration of disconnected STAs (in the similar manner as in the MESH mode). New Kconfig option for verbose OCB debugging outputs is added. Signed-off-by: Rostislav Lisovy <rostislav.lisovy@fel.cvut.cz> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-04 13:18:21 +01:00
Rostislav Lisovy	6e0bd6c35b	cfg80211: 802.11p OCB mode handling This patch adds new iface type (NL80211_IFTYPE_OCB) representing the OCB (Outside the Context of a BSS) mode. When establishing a connection to the network a cfg80211_join_ocb function is called (particular nl80211_command is added as well). A mandatory parameters during the ocb_join operation are 'center frequency' and 'channel width (5/10 MHz)'. Changes done in mac80211 are minimal possible required to avoid many warnings (warning: enumeration value 'NL80211_IFTYPE_OCB' not handled in switch) during compilation. Full functionality (where needed) is added in the following patch. Signed-off-by: Rostislav Lisovy <rostislav.lisovy@fel.cvut.cz> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-04 13:18:17 +01:00
Felix Fietkau	5b3dc42b1b	mac80211: add support for driver tx power reporting The configured tx power is often limited by hardware capabilities, channel settings, antenna configuration, etc. Signed-off-by: Felix Fietkau <nbd@openwrt.org> [fix tracing compilation] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-04 10:15:09 +01:00
Johan Hedberg	2a68c89724	Bluetooth: Fix sparse warnings in RFCOMM This patch fixes the following sparse warnings in rfcomm/core.c: net/bluetooth/rfcomm/core.c:391:16: warning: dubious: x \| !y net/bluetooth/rfcomm/core.c:546:24: warning: dubious: x \| !y Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-04 08:01:46 +01:00
Peter Zijlstra	ff960a7317	netdev, sched/wait: Fix sleeping inside wait event rtnl_lock_unregistering*() take rtnl_lock() -- a mutex -- inside a wait loop. The wait loop relies on current->state to function, but so does mutex_lock(), nesting them makes for the inner to destroy the outer state. Fix this using the new wait_woken() bits. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: David S. Miller <davem@davemloft.net> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Cong Wang <cwang@twopensource.com> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jerry Chu <hkchu@google.com> Cc: Jiri Pirko <jiri@resnulli.us> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com> Cc: sfeldma@cumulusnetworks.com <sfeldma@cumulusnetworks.com> Cc: stephen hemminger <stephen@networkplumber.org> Cc: Tom Gundersen <teg@jklm.no> Cc: Tom Herbert <therbert@google.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Vlad Yasevich <vyasevic@redhat.com> Cc: netdev@vger.kernel.org Link: http://lkml.kernel.org/r/20141029173110.GE15602@worktop.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-11-04 07:17:48 +01:00
Peter Zijlstra	eedf7e47da	rfcomm, sched/wait: Fix broken wait construct rfcomm_run() is a tad broken in that is has a nested wait loop. One cannot rely on p->state for the outer wait because the inner wait will overwrite it. Fix this using the new wait_woken() facility. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: Alexander Holler <holler@ahsoftware.de> Cc: David S. Miller <davem@davemloft.net> Cc: Gustavo Padovan <gustavo@padovan.org> Cc: Joe Perches <joe@perches.com> Cc: Johan Hedberg <johan.hedberg@gmail.com> Cc: Libor Pechacek <lpechacek@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Marcel Holtmann <marcel@holtmann.org> Cc: Seung-Woo Kim <sw0312.kim@samsung.com> Cc: Vignesh Raman <Vignesh_Raman@mentor.com> Cc: linux-bluetooth@vger.kernel.org Cc: netdev@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-11-04 07:17:47 +01:00
Greg Kroah-Hartman	a8a93c6f99	Merge branch 'platform/remove_owner' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux into driver-core-next Remove all .owner fields from platform drivers	2014-11-03 19:53:56 -08:00
Linus Torvalds	ce1928da84	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull ceph fixes from Sage Weil: "There is a GFP flag fix from Mike Christie, an error code fix from Jan, and fixes for two unnecessary allocations (kmalloc and workqueue) from Ilya. All are well tested. Ilya has one other fix on the way but it didn't get tested in time" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: libceph: eliminate unnecessary allocation in process_one_ticket() rbd: Fix error recovery in rbd_obj_read_sync() libceph: use memalloc flags for net IO rbd: use a single workqueue for all devices	2014-11-03 15:04:26 -08:00
Eric Dumazet	56b174256b	net: add rbnode to struct sk_buff Yaogong replaces TCP out of order receive queue by an RB tree. As netem already does a private skb->{next/prev/tstamp} union with a 'struct rb_node', lets do this in a cleaner way. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Yaogong Wang <wygivan@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-03 16:13:03 -05:00
Steffen Klassert	f03eb128e3	gre6: Move the setting of dev->iflink into the ndo_init functions. Otherwise it gets overwritten by register_netdev(). Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-03 15:42:24 -05:00
Steffen Klassert	ebe084aafb	sit: Use ipip6_tunnel_init as the ndo_init function. ipip6_tunnel_init() sets the dev->iflink via a call to ipip6_tunnel_bind_dev(). After that, register_netdevice() sets dev->iflink = -1. So we loose the iflink configuration for ipv6 tunnels. Fix this by using ipip6_tunnel_init() as the ndo_init function. Then ipip6_tunnel_init() is called after dev->iflink is set to -1 from register_netdevice(). Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-03 15:42:24 -05:00
Steffen Klassert	16a0231bf7	vti6: Use vti6_dev_init as the ndo_init function. vti6_dev_init() sets the dev->iflink via a call to vti6_link_config(). After that, register_netdevice() sets dev->iflink = -1. So we loose the iflink configuration for vti6 tunnels. Fix this by using vti6_dev_init() as the ndo_init function. Then vti6_dev_init() is called after dev->iflink is set to -1 from register_netdevice(). Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-03 15:42:24 -05:00
Steffen Klassert	6c6151daaf	ip6_tunnel: Use ip6_tnl_dev_init as the ndo_init function. ip6_tnl_dev_init() sets the dev->iflink via a call to ip6_tnl_link_config(). After that, register_netdevice() sets dev->iflink = -1. So we loose the iflink configuration for ipv6 tunnels. Fix this by using ip6_tnl_dev_init() as the ndo_init function. Then ip6_tnl_dev_init() is called after dev->iflink is set to -1 from register_netdevice(). Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-03 15:42:24 -05:00
Eric Dumazet	d75b1ade56	net: less interrupt masking in NAPI net_rx_action() can mask irqs a single time to transfert sd->poll_list into a private list, for a very short duration. Then, napi_complete() can avoid masking irqs again, and net_rx_action() only needs to mask irq again in slow path. This patch removes 2 couples of irq mask/unmask per typical NAPI run, more if multiple napi were triggered. Note this also allows to give control back to caller (do_softirq()) more often, so that other softirq handlers can be called a bit earlier, or ksoftirqd can be wakeup earlier under pressure. This was developed while testing an alternative to RX interrupt mitigation to reduce latencies while keeping or improving GRO aggregation on fast NIC. Idea is to test napi->gro_list at the end of a napi->poll() and reschedule one NAPI poll, but after servicing a full round of softirqs (timers, TX, rcu, ...). This will be allowed only if softirq is currently serviced by idle task or ksoftirqd, and resched not needed. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-03 12:25:09 -05:00
Guenter Roeck	c1207c049b	netfilter: nft_reject_bridge: Fix powerpc build error Fix: net/bridge/netfilter/nft_reject_bridge.c: In function 'nft_reject_br_send_v6_unreach': net/bridge/netfilter/nft_reject_bridge.c:240:3: error: implicit declaration of function 'csum_ipv6_magic' csum_ipv6_magic(&nip6h->saddr, &nip6h->daddr, ^ make[3]: *** [net/bridge/netfilter/nft_reject_bridge.o] Error 1 Seen with powerpc:allmodconfig. Fixes: `523b929d54` ("netfilter: nft_reject_bridge: don't use IP stack to reject traffic") Cc: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-03 12:12:34 -05:00
Szymon Janc	a736abc1ac	Bluetooth: Fix invalid response for 'Start Discovery' command According to Management Interface API 'Start Discovery' command should generate a Command Complete event on failure. Currently kernel is sending Command Status on early errors. This results in userspace ignoring such event due to invalid size. bluetoothd[28499]: src/adapter.c:trigger_start_discovery() bluetoothd[28499]: src/adapter.c:cancel_passive_scanning() bluetoothd[28499]: src/adapter.c:start_discovery_timeout() bluetoothd[28499]: src/adapter.c:start_discovery_complete() status 0x0a bluetoothd[28499]: Wrong size of start discovery return parameters Reported-by: Jukka Taimisto <jtt@codenomicon.com> Signed-off-by: Szymon Janc <szymon.janc@tieto.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-11-03 15:43:05 +02:00
Johannes Berg	b8fff407a1	mac80211: fix use-after-free in defragmentation Upon receiving the last fragment, all but the first fragment are freed, but the multicast check for statistics at the end of the function refers to the current skb (the last fragment) causing a use-after-free bug. Since multicast frames cannot be fragmented and we check for this early in the function, just modify that check to also do the accounting to fix the issue. Cc: stable@vger.kernel.org Reported-by: Yosef Khyal <yosefx.khyal@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-03 14:28:50 +01:00
Marcel Holtmann	40f4938aa6	Bluetooth: Consolidate whitelist debugfs entry into device_list The debufs entry for the BR/EDR whitelist is confusing since there is a controller debugfs entry with the name white_list and both are two different things. With the BR/EDR whitelist, the actual interface in use is the device list and thus just include all values from the internal BR/EDR whitelist in the device_list debugfs entry. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-11-03 10:13:42 +02:00
dingzhi	f293a5e33e	xfrm: add XFRMA_REPLAY_VAL attribute to SA messages After this commit, the attribute XFRMA_REPLAY_VAL is added when no ESN replay value is defined. Thus sequence number values are always notified to userspace. Signed-off-by: dingzhi <zhi.ding@6wind.com> Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-11-03 08:54:52 +01:00
Alexander Aring	868ed8e06a	ieee802154: remove unnecessary functions This patch fixes commit `c7420c367d` ("mac802154: move mac_params functions into mac_cmd"). The mac_params functions wasn't deleted by this commit. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Reported-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-02 21:52:03 +01:00
Alexander Aring	fdd2068ab7	mac802154: cfg: add missing include Running make C=2 occurs warning: symbol 'mac802154_config_ops' was not declared. Should it be static? This patch adds a missing include in cfg.c to solve this warning. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Reported-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-02 21:52:02 +01:00
Alexander Aring	c5fbbc4683	ieee802154: sysfs: add missing include Running make C=2 occurs in warnings: symbol 'wpan_phy_class' was not declared. Should it be static? symbol 'wpan_phy_sysfs_init' was not declared. Should it be static? symbol 'wpan_phy_sysfs_exit' wasnot declared. Should it be static? This patch adds a missing include "sysfs.h" to solve these warnings. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Reported-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-02 21:52:02 +01:00
Linus Torvalds	4cb8c3593b	irda: stop calling sk_prot->disconnect() on connection failure The sk_prot is irda's own set of protocol handlers, so irda should statically know what that function is anyway, without using an indirect pointer. And as it happens, we know exactly what that pointer is statically: it's NULL, because irda doesn't define a disconnect operation. So calling that function is doubly wrong, and will just cause an oops. Reported-by: Martin Lang <mlg.hessigheim@gmail.com> Cc: Samuel Ortiz <samuel@sortiz.org> Cc: David Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-11-02 10:20:26 -08:00
Marcel Holtmann	75e0569f7f	Bluetooth: Add hci_reset_dev() for driver triggerd stack reset Some Bluetooth drivers require to reset the upper stack. To avoid having all drivers send HCI Hardware Error events, provide a generic function to wrap the reset functionality. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-11-02 10:03:45 +02:00
Marcel Holtmann	65efd2bf48	Bluetooth: Introduce BT_BREDR and BT_LE config options The current kernel options do not make it clear which modules are for Bluetooth Classic (BR/EDR) and which are for Bluetooth Low Energy (LE). To make it really clear, introduce BT_BREDR and BT_LE options with proper dependencies into the different modules. Both new options default to y to not create a regression with previous kernel config files. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-11-02 10:01:53 +02:00
Marcel Holtmann	24dfa34371	Bluetooth: Print error message for HCI_Hardware_Error event When the HCI_Hardware_Error event is send by the controller or injected by the driver, then at least print an error message. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-11-02 09:59:42 +02:00
Marcel Holtmann	8761f9d662	Bluetooth: Check status of command complete for HCI_Reset When the HCI_Reset command returns, the status needs to be checked. It is unlikely that HCI_Reset actually fails, but when it fails, it is a bad idea to reset all values since the controller will have not reset its values in that case. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-11-02 09:58:50 +02:00
Alexander Aring	6290671018	ieee802154: 6lowpan: remove set of mac address Currently the ieee802154 6lowpan interface operates on wpan interfaces only. Setting the wpan mac address over 6lowpan interface is complex and maybe we can't never do this. This patch removes the set of mac address handling in ieee802154 6lowpan interface for now. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-02 04:51:08 +01:00
Alexander Aring	ea7053c1df	mac802154: iface: add validation for extended address This patch use the validation function to check if an extended address is valid or not while set the extended address. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-02 04:51:08 +01:00
Alexander Aring	f59f419d31	mac802154: move phy settings into netlink receive All PHY attributes should be directly set to the transceiver after netlink. MAC attributes should be set by interface up. Currently the macparams netlink cmd contains mixed attributes of phy and mac settings. This patch moves all phy settings to the netlink receive function for setting macparams. This is the only way which doesn't change the userspace API and keep the deprecated netlink interface alive. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-02 04:51:07 +01:00
Alexander Aring	50c7907501	mac802154: set panid address filter on ifup This patch moves the setting of hardware panid address filtering inside of interface up instead doing it it directly inside of netlink interface. The netlink call which can only be called when netif isn't running sets only the necessary panid value in sdata. After an interface up the address filter will be set with this value. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-02 04:51:07 +01:00
Alexander Aring	78b4bad16e	mac802154: set short address filter on ifup This patch moves the setting of hardware short address filtering inside of interface up instead doing it it directly inside of netlink interface. The netlink call which can only be called when netif isn't running sets only the necessary short_addr value in sdata. After an interface up the address filter will be set with this value. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-02 04:51:07 +01:00
Alexander Aring	776e59de46	mac802154: set extended address filter on ifup This patch moves the setting of hardware extended address filtering inside of interface up instead doing it directly inside of netlink interface. Also we don't need to set the sdata extended attribute in netlink. This is already done by ndo_set_mac_address of net_device_ops. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-02 04:51:07 +01:00
Alexander Aring	8f499f991c	ieee802154: don't allow to change addr while netif_running This patch changes the actual behaviour for setting address attributes. We should not change addresses while netif_running is true. Furthermore when netif_running is running the address attributes becomes read only and we can remove locking mechanism in receive and transmit hothpaths of 802.15.4 subsystem. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-02 04:51:07 +01:00
Alexander Aring	4a9a816a4f	cfg802154: convert deprecated iface add and del This patch removes the wpan_phy callbacks for add and del an interface on a phy. Instead we introduce deprecated cfg802154 callbacks for this. Furthermore we introduce a new netlink interface nl802154 which use different callbacks. The deprecated function is to have a backwards compatibility with the current netlink interface. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-02 04:51:06 +01:00
Alexander Aring	ea4dcd32a4	ieee802154: add helper wpan_phy_to_rdev function This patch introduce a function to get the cfg802154_registered_device from a wpan_phy. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-02 04:51:06 +01:00
Alexander Aring	1201cd22fd	mac802154: introduce mac802154_config_ops This patch introduces mac802154_config_ops struct. Like wireless this struct should be the only one interface between ieee802154 to mac802154 or possible HardMAC drivers. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-02 04:51:06 +01:00
Alexander Aring	a5dd1d72d8	cfg802154: introduce cfg802154_registered_device This patch introduce the cfg802154_registered_device struct. Like cfg80211_registered_device in wireless this should contain similar functionality for cfg802154. This patch should not change any behaviour. We just adds cfg802154_registered_device as container for wpan_phy struct. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-02 04:51:06 +01:00
Alexander Aring	ff4e65581e	ieee802154: remove default channel settings This patch removes the default channel setting. A channel is always set and there is no default channel setting according 802.15.4. Drivers should set the default channel and page in probing routine. This behaviour is currently a lack of all 802.15.4 drivers. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-02 04:51:06 +01:00
Chan-yeol Park	039fada5cd	Bluetooth: Fix hci_sync missing wakeup interrupt __hci_cmd_sync_ev(), __hci_req_sync() could miss wake_up_interrupt from hci_req_sync_complete() because hci_cmd_work() workqueue and its response could be completed before they are ready to get the signal through add_wait_queue(), set_current_state(TASK_INTERRUPTIBLE). Signed-off-by: Chan-yeol Park <chanyeol.park@samsung.com> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-11-01 23:20:21 +02:00
David S. Miller	55b42b5ca2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/phy/marvell.c Simple overlapping changes in drivers/net/phy/marvell.c Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-01 14:53:27 -04:00
Ilya Dryomov	e9226d7c9f	libceph: eliminate unnecessary allocation in process_one_ticket() Commit `c27a3e4d66` ("libceph: do not hard code max auth ticket len") while fixing a buffer overlow tried to keep the same as much of the surrounding code as possible and introduced an unnecessary kmalloc() in the unencrypted ticket path. It is likely to fail on huge tickets, so get rid of it. Signed-off-by: Ilya Dryomov <idryomov@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com>	2014-10-31 23:43:08 +03:00
Guenter Roeck	e0fb6fb6d5	net: ethtool: Return -EOPNOTSUPP if user space tries to read EEPROM with lengh 0 If a driver supports reading EEPROM but no EEPROM is installed in the system, the driver's get_eeprom_len function returns 0. ethtool will subsequently try to read that zero-length EEPROM anyway. If the driver does not support EEPROM access at all, this operation will return -EOPNOTSUPP. If the driver does support EEPROM access but no EEPROM is installed, the operation will return -EINVAL. Return -EOPNOTSUPP in both cases for consistency. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-31 16:12:34 -04:00
Pravin B Shelar	de05c400f7	mpls: Allow mpls_gso to be built as module Kconfig already allows mpls to be built as module. Following patch fixes Makefile to do same. CC: Simon Horman <simon.horman@netronome.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-31 15:47:21 -04:00
Pravin B Shelar	f7065f4bd3	mpls: Fix mpls_gso handler. mpls gso handler needs to pull skb after segmenting skb. CC: Simon Horman <simon.horman@netronome.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-31 15:47:21 -04:00
David S. Miller	e3a88f9c4f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== netfilter/ipvs fixes for net The following patchset contains fixes for netfilter/ipvs. This round of fixes is larger than usual at this stage, specifically because of the nf_tables bridge reject fixes that I would like to see in 3.18. The patches are: 1) Fix a null-pointer dereference that may occur when logging errors. This problem was introduced by `4a4739d56b` ("ipvs: Pull out crosses_local_route_boundary logic") in v3.17-rc5. 2) Update hook mask in nft_reject_bridge so we can also filter out packets from there. This fixes `36d2af5` ("netfilter: nf_tables: allow to filter from prerouting and postrouting"), which needs this chunk to work. 3) Two patches to refactor common code to forge the IPv4 and IPv6 reject packets from the bridge. These are required by the nf_tables reject bridge fix. 4) Fix nft_reject_bridge by avoiding the use of the IP stack to reject packets from the bridge. The idea is to forge the reject packets and inject them to the original port via br_deliver() which is now exported for that purpose. 5) Restrict nft_reject_bridge to bridge prerouting and input hooks. the original skbuff may cloned after prerouting when the bridge stack needs to flood it to several bridge ports, it is too late to reject the traffic. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-31 12:29:42 -04:00
Johannes Berg	de4fcbadde	cfg80211: avoid using default in interface type switch Most code avoids having a default case in interface type switch statements already, to make it easier to find places that need to be extended. Change the code in the __cfg80211_leave() and nl80211_key_allowed() functions to not have a default case. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-31 14:19:19 +01:00
Pablo Neira Ayuso	127917c29a	netfilter: nft_reject_bridge: restrict reject to prerouting and input Restrict the reject expression to the prerouting and input bridge hooks. If we allow this to be used from forward or any other later bridge hook, if the frame is flooded to several ports, we'll end up sending several reject packets, one per cloned packet. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-10-31 12:50:09 +01:00
Pablo Neira Ayuso	523b929d54	netfilter: nft_reject_bridge: don't use IP stack to reject traffic If the packet is received via the bridge stack, this cannot reject packets from the IP stack. This adds functions to build the reject packet and send it from the bridge stack. Comments and assumptions on this patch: 1) Validate the IPv4 and IPv6 headers before further processing, given that the packet comes from the bridge stack, we cannot assume they are clean. Truncated packets are dropped, we follow similar approach in the existing iptables match/target extensions that need to inspect layer 4 headers that is not available. This also includes packets that are directed to multicast and broadcast ethernet addresses. 2) br_deliver() is exported to inject the reject packet via bridge localout -> postrouting. So the approach is similar to what we already do in the iptables reject target. The reject packet is sent to the bridge port from which we have received the original packet. 3) The reject packet is forged based on the original packet. The TTL is set based on sysctl_ip_default_ttl for IPv4 and per-net ipv6.devconf_all hoplimit for IPv6. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-10-31 12:50:08 +01:00
Pablo Neira Ayuso	8bfcdf6671	netfilter: nf_reject_ipv6: split nf_send_reset6() in smaller functions That can be reused by the reject bridge expression to build the reject packet. The new functions are: * nf_reject_ip6_tcphdr_get(): to sanitize and to obtain the TCP header. * nf_reject_ip6hdr_put(): to build the IPv6 header. * nf_reject_ip6_tcphdr_put(): to build the TCP header. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-10-31 12:49:57 +01:00
Pablo Neira Ayuso	052b9498ee	netfilter: nf_reject_ipv4: split nf_send_reset() in smaller functions That can be reused by the reject bridge expression to build the reject packet. The new functions are: * nf_reject_ip_tcphdr_get(): to sanitize and to obtain the TCP header. * nf_reject_iphdr_put(): to build the IPv4 header. * nf_reject_ip_tcphdr_put(): to build the TCP header. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-10-31 12:49:05 +01:00
Pablo Neira Ayuso	4d87716cd0	netfilter: nf_tables_bridge: update hook_mask to allow {pre,post}routing Fixes: `36d2af5` ("netfilter: nf_tables: allow to filter from prerouting and postrouting") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-10-31 12:44:56 +01:00
Stephen Hemminger	d070f9137a	mac80211: fix spelling errors Use codespell to find spelling errors. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-31 08:29:56 +01:00
Ben Hutchings	5188cd44c5	drivers/net, ipv6: Select IPv6 fragment idents for virtio UFO packets UFO is now disabled on all drivers that work with virtio net headers, but userland may try to send UFO/IPv6 packets anyway. Instead of sending with ID=0, we should select identifiers on their behalf (as we used to). Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Fixes: `916e4cf46d` ("ipv6: reuse ip6_frag_id from ip6_ufo_append_data") Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-30 20:01:18 -04:00
Eric Dumazet	39bb5e6286	net: skb_fclone_busy() needs to detect orphaned skb Some drivers are unable to perform TX completions in a bound time. They instead call skb_orphan() Problem is skb_fclone_busy() has to detect this case, otherwise we block TCP retransmits and can freeze unlucky tcp sessions on mostly idle hosts. Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: `1f3279ae0c` ("tcp: avoid retransmits of TCP packets hanging in host queues") Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-30 19:58:30 -04:00
Sowmini Varadhan	cd2145358e	tcp: Correction to RFC number in comment Challenge ACK is described in RFC 5961, fix typo. Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-30 19:53:34 -04:00
Tom Herbert	14051f0452	gre: Use inner mac length when computing tunnel length Currently, skb_inner_network_header is used but this does not account for Ethernet header for ETH_P_TEB. Use skb_inner_mac_header which handles TEB and also should work with IP encapsulation in which case inner mac and inner network headers are the same. Tested: Ran TCP_STREAM over GRE, worked as expected. Signed-off-by: Tom Herbert <therbert@google.com> Acked-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-30 19:51:56 -04:00
Michele Baldessari	afb6befce6	sctp: replace seq_printf with seq_puts Fixes checkpatch warning: "WARNING: Prefer seq_puts to seq_printf" Signed-off-by: Michele Baldessari <michele@acksyn.org> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-30 19:40:16 -04:00
Michele Baldessari	891310d53d	sctp: add transport state in /proc/net/sctp/remaddr It is often quite helpful to be able to know the state of a transport outside of the application itself (for troubleshooting purposes or for monitoring purposes). Add it under /proc/net/sctp/remaddr. Signed-off-by: Michele Baldessari <michele@acksyn.org> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-30 19:40:16 -04:00
Nicolas Cavallari	fa19c2b050	ipv4: Do not cache routing failures due to disabled forwarding. If we cache them, the kernel will reuse them, independently of whether forwarding is enabled or not. Which means that if forwarding is disabled on the input interface where the first routing request comes from, then that unreachable result will be cached and reused for other interfaces, even if forwarding is enabled on them. The opposite is also true. This can be verified with two interfaces A and B and an output interface C, where B has forwarding enabled, but not A and trying ip route get $dst iif A from $src && ip route get $dst iif B from $src Signed-off-by: Nicolas Cavallari <nicolas.cavallari@green-communications.fr> Reviewed-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-30 19:20:40 -04:00
stephen hemminger	b2ad5e5fcc	tipc: spelling errors Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-30 16:56:41 -04:00
Florian Westphal	646697b9e3	syncookies: only increment SYNCOOKIESFAILED on validation error Only count packets that failed cookie-authentication. We can get SYNCOOKIESFAILED > 0 while we never even sent a single cookie. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-30 16:53:39 -04:00
stephen hemminger	f4e715c325	ipv4: minor spelling fixes Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-30 16:14:43 -04:00
Alexey Andriyanov	acf722f734	ip6_tunnel: allow to change mode for the ip6tnl0 The fallback device is in ipv6 mode by default. The mode can not be changed in runtime, so there is no way to decapsulate ip4in6 packets coming from various sources without creating the specific tunnel ifaces for each peer. This allows to update the fallback tunnel device, but only the mode could be changed. Usual command should work for the fallback device: `ip -6 tun change ip6tnl0 mode any` The fallback device can not be hidden from the packet receiver as a regular tunnel, but there is no need for synchronization as long as we do single assignment. Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Alexey Andriyanov <alan@al-an.info> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-30 16:09:20 -04:00
Fabian Frederick	43728fa5c5	ipv6: remove assignment in if condition Do assignment before if condition and test !skb like in rawv6_recvmsg() Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-30 15:51:43 -04:00
Fabian Frederick	fc08c25819	ipv6: remove inline on static in c file remove __inline__ / inline and let compiler decide what to do with static functions Inspired-by: "David S. Miller" <davem@davemloft.net> Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-30 15:51:43 -04:00
Fabian Frederick	40dc2ca3cb	ipv6: spelling s/incomming/incoming Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-30 15:51:43 -04:00
Fabian Frederick	e7b6658ea8	ipx: remove all unnecessary castings on ntohl Apply commit `e0f36310f7` ("ipx: remove unnecessary casting on ntohl") to all seq_printf/08lX Inspired-by: "David S. Miller" <davem@davemloft.net> Inspired-by: Joe Perches <joe@perches.com> Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-30 15:51:43 -04:00
Guenter Roeck	3d762a0f0a	net: dsa: Add support for reading switch registers with ethtool Add support for reading switch registers with 'ethtool -d'. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-30 14:54:11 -04:00
Guenter Roeck	6793abb4e8	net: dsa: Add support for switch EEPROM access On some chips it is possible to access the switch eeprom. Add infrastructure support for it. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-30 14:54:11 -04:00
Guenter Roeck	51579c3f1a	net: dsa: Add support for reporting switch chip temperatures Some switches provide chip temperature data. Add support for reporting it through the hwmon subsystem. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-30 14:54:11 -04:00
Guenter Roeck	734cbb5b6b	net: dsa: Don't set skb->protocol on outgoing tagged packets Setting skb->protocol to a private protocol type may result in warning messages such as e1000e 0000:00:19.0 em1: checksum_partial proto=dada! This happens if the L3 protocol is IP or IPv6 and skb->ip_summed is set to CHECKSUM_PARTIAL. Looking through the code, it appears that changing skb->protocol for transmitted packets is not necessary and may actually be harmful. For example, it prevents purposely unmodified (from a DSA perspective) network drivers from properly setting up their transmit checksum offload pointers since they inspect skb->protocol to set up the IPv4 header or IPv6 header pointers. So don't unnecessarily change the protocol field. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-30 14:54:10 -04:00
Marcel Holtmann	a4d5504d5c	Bluetooth: Clear LE white list when resetting controller The internal representation of the LE white list needs to be cleared when receiving a successful HCI_Reset command. A reset of the controller is expected to start with an empty LE white list. When the LE white list is not cleared on controller reset, the passive background scanning might skip programming the remote devices. Only changes to the LE white list are programmed when passive background is started. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Cc: stable@vger.kernel.org # 3.17.x	2014-10-30 17:41:08 +01:00
stephen hemminger	01cfa0a4ed	netfilter: fix spelling errors Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-10-30 17:35:30 +01:00
Dan Carpenter	daac197ca9	Bluetooth: 6lowpan: use after free in disconnect_devices() This was accidentally changed from list_for_each_entry_safe() to list_for_each_entry() so now it has a use after free bug. I've changed it back. Fixes: `9030582963` ('Bluetooth: 6lowpan: Converting rwlocks to use RCU') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-30 17:23:25 +01:00
Pablo Neira Ayuso	c3ac759ea6	Merge branch 'ipvs-next' Simon Horman says: ==================== The single patch in this series fixes some minor fallout from adding support IPv6 real servers in IPv4 virtual-services and vice versa. It should not have any run-time affect other than perhaps saving a few cycles. ==================== Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-10-30 16:52:30 +01:00
Marcelo Leitner	8ac2bde2a4	netfilter: log: protect nf_log_register against double registering Currently, despite the comment right before the function, nf_log_register allows registering two loggers on with the same type and end up overwriting the previous register. Not a real issue today as current tree doesn't have two loggers for the same type but it's better to get this protected. Also make sure that all of its callers do error checking. Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-10-30 16:41:48 +01:00
Marcelo Leitner	0c26ed1c07	netfilter: nf_log: Introduce nft_log_dereference() macro Wrap up a common call pattern in an easier to handle call. Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-10-30 16:39:40 +01:00
Johannes Berg	46238845bd	mac80211: properly flush delayed scan work on interface removal When an interface is deleted, an ongoing hardware scan is canceled and the driver must abort the scan, at the very least reporting completion while the interface is removed. However, if it scheduled the work that might only run after everything is said and done, which leads to cfg80211 warning that the scan isn't reported as finished yet; this is no fault of the driver, it already did, but mac80211 hasn't processed it. To fix this situation, flush the delayed work when the interface being removed is the one that was executing the scan. Cc: stable@vger.kernel.org Reported-by: Sujith Manoharan <sujith@msujith.org> Tested-by: Sujith Manoharan <sujith@msujith.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-30 15:48:32 +01:00
Mike Christie	89baaa570a	libceph: use memalloc flags for net IO This patch has ceph's lib code use the memalloc flags. If the VM layer needs to write data out to free up memory to handle new allocation requests, the block layer must be able to make forward progress. To handle that requirement we use structs like mempools to reserve memory for objects like bios and requests. The problem is when we send/receive block layer requests over the network layer, net skb allocations can fail and the system can lock up. To solve this, the memalloc related flags were added. NBD, iSCSI and NFS uses these flags to tell the network/vm layer that it should use memory reserves to fullfill allcation requests for structs like skbs. I am running ceph in a bunch of VMs in my laptop, so this patch was not tested very harshly. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Reviewed-by: Ilya Dryomov <idryomov@redhat.com>	2014-10-30 13:11:50 +03:00
Alexander Aring	38130c31ef	mac802154: add basic support for monitor This patch adds basic support for monitor mode. Also change the open call that we set the transceiver mac setting on an interface up. Futher patches will add a better handling while interface up an interface. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-29 23:07:46 +01:00
Alexander Aring	05f7de6792	mac802154: rx: add error handling after skb_clone This patch adds error handling after skb_clone and deliver only if skb_clone was successful. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-29 23:07:46 +01:00
Alexander Aring	20b48120c1	mac802154: rx: monitor receive cleanup This patch replace the !netif_running(sdata->dev) instead we doing a !ieee802154_sdata_running(sdata). Also move this in two separate if branches to compare with mac80211 code. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-29 23:07:46 +01:00
Alexander Aring	18460672e0	mac802154: rx: add rx stats incrementation This patch adds rx stats incrementation when the monitor interface recevied a frame. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-29 23:07:46 +01:00
Alexander Aring	21fdf0a1c1	mac802154: rx: use netif_receive_skb This patch removes netif_rx_ni call. Instead we call netif_receive_skb, we can do that since commit `c5c47e67bc` ("mac802154: rx: use tasklet instead workqueue"). Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-29 23:07:46 +01:00
Alexander Aring	1054ed81c4	mac802154: rx: remove override pkt_type set to PACKET_HOST This patch removes pkt_type set to PACKET_HOST while monitor receiving. This should be PACKET_OTHERHOST on monitor mode which already set before. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-29 23:07:45 +01:00
Alexander Aring	ec718f3db9	mac802154: rx: add software checksum filtering check This patch adds a new hardware flag which indicate that the transceiver doesn't support check for bad checksum via hardware. Also add a handling of this while receive. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-29 23:07:45 +01:00
Alexander Aring	b7889497d3	mac802154: rx: simplify crc receive handling This patch change the actual crc handling while receive. Currently the IEEE802154_HW_RX_OMIT_CKSUM flag is used to filter a frame with a bad crc. This patch changes the behaviour of IEEE802154_HW_RX_OMIT_CKSUM to add a crc while receiving for the monitor interface. After monitor receiving we remove the crc for frame parsing. This affect the driver layer because all drivers sets IEEE802154_HW_RX_OMIT_CKSUM and deliver without checksum. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-29 23:07:45 +01:00
Alexander Aring	08c511a733	mac802154: rx: remove unnecessary parameter This patch removes a not used parameter in ieee802154_deliver_skb. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-29 23:07:45 +01:00
Alexander Aring	90386a7e3b	mac802154: separate omit tx/rx flags This patch splits the IEEE802154_HW_OMIT_CKSUM hardware flag into IEEE802154_HW_TX_OMIT_CKSUM and IEEE802154_HW_RX_OMIT_CKSUM. This is useful to deliver the received crc from the driver layer to the monitor interface. At the moment we can't do that without change the xmit handling. The received checksum should be visible in monitor mode only. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-29 23:07:45 +01:00
Alexander Aring	94b792220c	mac802154: add support for promiscuous mode This patch adds a new driver operation to bring the transceiver into promiscuous mode. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-29 23:07:45 +01:00
Alexander Aring	55a2d06517	mac802154: main: remove unnecessary include This patch removes an unnecessary include of driver-ops header file. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-29 23:07:44 +01:00
Nicolas Dichtel	75fbfd3323	neigh: optimize neigh_parms_release() In neigh_parms_release() we loop over all entries to find the entry given in argument and being able to remove it from the list. By using a double linked list, we can avoid this loop. Here are some numbers with 30 000 dummy interfaces configured: Before the patch: $ time rmmod dummy real 2m0.118s user 0m0.000s sys 1m50.048s After the patch: $ time rmmod dummy real 1m9.970s user 0m0.000s sys 0m47.976s Suggested-by: Thierry Herbelot <thierry.herbelot@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-29 16:11:50 -04:00
Eric Dumazet	bc9ad166e3	net: introduce napi_schedule_irqoff() napi_schedule() can be called from any context and has to mask hard irqs. Add a variant that can only be called from hard interrupts handlers or when irqs are already masked. Many NIC drivers can use it from their hard IRQ handler instead of generic variant. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-29 16:07:27 -04:00
Nikolay Aleksandrov	d70127e8a9	inet: frags: remove the WARN_ON from inet_evict_bucket The WARN_ON in inet_evict_bucket can be triggered by a valid case: inet_frag_kill and inet_evict_bucket can be running in parallel on the same queue which means that there has been at least one more ref added by a previous inet_frag_find call, but inet_frag_kill can delete the timer before inet_evict_bucket which will cause the WARN_ON() there to trigger since we'll have refcnt!=1. Now, this case is valid because the queue is being "killed" for some reason (removed from the chain list and its timer deleted) so it will get destroyed in the end by one of the inet_frag_put() calls which reaches 0 i.e. refcnt is still valid. CC: Florian Westphal <fw@strlen.de> CC: Eric Dumazet <eric.dumazet@gmail.com> CC: Patrick McLean <chutzpah@gentoo.org> Fixes: `b13d3cbfb8` ("inet: frag: move eviction of queues to work queue") Reported-by: Patrick McLean <chutzpah@gentoo.org> Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-29 15:21:30 -04:00
Nikolay Aleksandrov	65ba1f1ec0	inet: frags: fix a race between inet_evict_bucket and inet_frag_kill When the evictor is running it adds some chosen frags to a local list to be evicted once the chain lock has been released but at the same time the *frag_queue can be running for some of the same queues and it may call inet_frag_kill which will wait on the chain lock and will then delete the queue from the wrong list since it was added in the eviction one. The fix is simple - check if the queue has the evict flag set under the chain lock before deleting it, this is safe because the evict flag is set only under that lock and having the flag set also means that the queue has been detached from the chain list, so no need to delete it again. An important note to make is that we're safe w.r.t refcnt because inet_frag_kill and inet_evict_bucket will sync on the del_timer operation where only one of the two can succeed (or if the timer is executing - none of them), the cases are: 1. inet_frag_kill succeeds in del_timer - then the timer ref is removed, but inet_evict_bucket will not add this queue to its expire list but will restart eviction in that chain 2. inet_evict_bucket succeeds in del_timer - then the timer ref is kept until the evictor "expires" the queue, but inet_frag_kill will remove the initial ref and will set INET_FRAG_COMPLETE which will make the frag_expire fn just to remove its ref. In the end all of the queue users will do an inet_frag_put and the one that reaches 0 will free it. The refcount balance should be okay. CC: Florian Westphal <fw@strlen.de> CC: Eric Dumazet <eric.dumazet@gmail.com> CC: Patrick McLean <chutzpah@gentoo.org> Fixes: `b13d3cbfb8` ("inet: frag: move eviction of queues to work queue") Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Reported-by: Patrick McLean <chutzpah@gentoo.org> Tested-by: Patrick McLean <chutzpah@gentoo.org> Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-29 15:21:30 -04:00
Erik Kline	7fd2561e4e	net: ipv6: Add a sysctl to make optimistic addresses useful candidates Add a sysctl that causes an interface's optimistic addresses to be considered equivalent to other non-deprecated addresses for source address selection purposes. Preferred addresses will still take precedence over optimistic addresses, subject to other ranking in the source address selection algorithm. This is useful where different interfaces are connected to different networks from different ISPs (e.g., a cell network and a home wifi network). The current behaviour complies with RFC 3484/6724, and it makes sense if the host has only one interface, or has multiple interfaces on the same network (same or cooperating administrative domain(s), but not in the multiple distinct networks case. For example, if a mobile device has an IPv6 address on an LTE network and then connects to IPv6-enabled wifi, while the wifi IPv6 address is undergoing DAD, IPv6 connections will try use the wifi default route with the LTE IPv6 address, and will get stuck until they time out. Also, because optimistic nodes can receive frames, issue an RTM_NEWADDR as soon as DAD starts (with the IFA_F_OPTIMSTIC flag appropriately set). A second RTM_NEWADDR is sent if DAD completes (the address flags have changed), otherwise an RTM_DELADDR is sent. Also: add an entry in ip-sysctl.txt for optimistic_dad. Signed-off-by: Erik Kline <ek@google.com> Acked-by: Lorenzo Colitti <lorenzo@google.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-29 15:11:36 -04:00
Eric Dumazet	dca145ffaa	tcp: allow for bigger reordering level While testing upcoming Yaogong patch (converting out of order queue into an RB tree), I hit the max reordering level of linux TCP stack. Reordering level was limited to 127 for no good reason, and some network setups [1] can easily reach this limit and get limited throughput. Allow a new max limit of 300, and add a sysctl to allow admins to even allow bigger (or lower) values if needed. [1] Aggregation of links, per packet load balancing, fabrics not doing deep packet inspections, alternative TCP congestion modules... Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Yaogong Wang <wygivan@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-29 15:05:15 -04:00
Toshiaki Makita	432c856fcf	net: skb_segment() should preserve backpressure This patch generalizes commit `d6a4a10411` ("tcp: GSO should be TSQ friendly") to protocols using skb_set_owner_w() TCP uses its own destructor (tcp_wfree) and needs a more complex scheme as explained in commit `6ff50cd555` ("tcp: gso: do not generate out of order packets") This allows UDP sockets using UFO to get proper backpressure, thus avoiding qdisc drops and excessive cpu usage. Here are performance test results (macvlan on vlan): - Before # netperf -t UDP_STREAM ... Socket Message Elapsed Messages Size Size Time Okay Errors Throughput bytes bytes secs # # 10^6bits/sec 212992 65507 60.00 144096 1224195 1258.56 212992 60.00 51 0.45 Average: CPU %user %nice %system %iowait %steal %idle Average: all 0.23 0.00 25.26 0.08 0.00 74.43 - After # netperf -t UDP_STREAM ... Socket Message Elapsed Messages Size Size Time Okay Errors Throughput bytes bytes secs # # 10^6bits/sec 212992 65507 60.00 109593 0 957.20 212992 60.00 109593 957.20 Average: CPU %user %nice %system %iowait %steal %idle Average: all 0.18 0.00 8.38 0.02 0.00 91.43 [edumazet] Rewrote patch and changelog. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-29 14:47:19 -04:00
Lubomir Rintel	b2ed64a974	ipv6: notify userspace when we added or changed an ipv6 token NetworkManager might want to know that it changed when the router advertisement arrives. Signed-off-by: Lubomir Rintel <lkundrak@v3.sk> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Daniel Borkmann <dborkman@redhat.com> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-29 14:35:32 -04:00
WANG Cong	d56109020d	sch_pie: schedule the timer after all init succeed Cc: Vijay Subramanian <vijaynsu@cisco.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com>	2014-10-29 14:28:01 -04:00
Johannes Berg	fc1f48ffd5	cfg80211: fix integer signedness in chandef_primary_freqs() The helper function can't ever create negative values, so use u32 pointers as the function arguments as the caller does. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-29 18:42:51 +01:00
Fabian Frederick	dcc6c2f516	cfg80211: fix set but not used warning in nl80211_channel_switch() radar_detect_width is unused since commit `97dc94f1d9` ("cfg80211: remove channel_switch combination check") Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-29 18:42:51 +01:00
Luciano Coelho	ff1e417c7c	mac80211: schedule the actual switch of the station before CSA count 0 Due to the time it takes to process the beacon that started the CSA process, we may be late for the switch if we try to reach exactly beacon 0. To avoid that, use count - 1 when calculating the switch time. Cc: stable@vger.kernel.org Reported-by: Jouni Malinen <j@w1.fi> Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-29 16:37:54 +01:00
Luciano Coelho	84469a45a1	mac80211: use secondary channel offset IE also beacons during CSA If we are switching from an HT40+ to an HT40- channel (or vice-versa), we need the secondary channel offset IE to specify what is the post-CSA offset to be used. This applies both to beacons and to probe responses. In ieee80211_parse_ch_switch_ie() we were ignoring this IE from beacons and using the current HT information IE instead. This was causing us to use the same offset as before the switch. Fix that by using the secondary channel offset IE also for beacons and don't ever use the pre-switch offset. Additionally, remove the "beacon" argument from ieee80211_parse_ch_switch_ie(), since it's not needed anymore. Cc: stable@vger.kernel.org Reported-by: Jouni Malinen <j@w1.fi> Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-29 16:37:45 +01:00
Dan Carpenter	eb63192bb8	SUNRPC: off by one in BUG_ON() The m->pool_to[] array has "maxpools" number of elements. It's allocated in svc_pool_map_alloc_arrays() which we called earlier in the function. This test should be >= instead of >. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-10-29 11:37:42 -04:00
Felix Fietkau	10b6848786	mac80211: flush keys for AP mode on ieee80211_do_stop Userspace can add keys to an AP mode interface before start_ap has been called. If there have been no calls to start_ap/stop_ap in the mean time, the keys will still be around when the interface is brought down. Signed-off-by: Felix Fietkau <nbd@openwrt.org> [adjust comments, fix AP_VLAN case] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-29 16:33:37 +01:00
Jukka Rissanen	9cfd5a23a4	Bluetooth: Wrong style spin lock used Use spin_lock_bh() as the code is called from softirq in networking subsystem. This is needed to prevent deadlocks when 6lowpan link is in use. Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-29 16:20:40 +01:00
Alexander Aring	e23e9ec16b	ieee802154: introduce sysfs file This patch moves the sysfs handling in a own file. This is like wireless sysfs file handling. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-28 23:19:09 +01:00
Alexander Aring	7445764155	mac802154: cleanup open count handling This patch cleanups the open_count variable increment in open and close calls of netdev. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-28 23:19:09 +01:00
Alexander Aring	c7420c367d	mac802154: move mac_params functions into mac_cmd These functions can be static in mac_cmd file. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-28 23:19:08 +01:00
Alexander Aring	12439a5356	mac802154: remove channel attributes from sdata These channel attributes was part of "channel context switch while xmit" which was removed by commit `dc67c6b30f` ("mac802154: tx: remove xmit channel context switch"). This patch removes these unnecessary variables and use the current_page and current_channel by wpan_phy struct now. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-28 23:19:08 +01:00
Alexander Aring	33d4189f51	mac802154: iface: remove assign to zero These variables should already be zero, so we remove the extra assign to zero. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-28 23:19:08 +01:00
Alexander Aring	538181a879	mac802154: add synchronization handling This patch adds synchronization handling in start and stop driver ops calls. This patch is mostly grab from mac80211 which was introduced by commit `ea77f12f2c` ("mac80211: remove tasklet enable/disable"). This is to be sure that we don't run into same issues. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-28 23:19:08 +01:00
Alexander Aring	e363eca386	mac802154: move local started handling This patch removes the current handling of started boolean. This is actually dead code, because mac802154_netdev_register can't never be called before ieee802154_register_hw. This means that local->started is always be true when mac802154_netdev_register is called. Instead we using this now like mac80211 to indicate that an instance of sdata is running. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-28 23:19:08 +01:00
Alexander Aring	5d65cae4bf	mac802154: rename running to started This variable should be handled like ieee80211_local struct of mac80211. We rename this variable to started now to have the same name convention. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-28 23:19:08 +01:00
Alexander Aring	0ea3da64fa	mac802154: rework sdata state change to running This patch reworks the handling for setting the state like mac80211. We use bit's instead a bool variable. The mutex is not needed because it use test and set bits which are atomic operations. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-28 23:19:07 +01:00
Alexander Aring	a543c5989d	mac802154: remove driver ops in wpan-phy This patch removes the driver ops callbacks inside of wpan_phy struct. It was used to check if a phy supports this driver ops call. We do this now via hardware flags. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-28 23:19:07 +01:00
Alexander Aring	59cb300f2b	mac802154: use driver-ops function wrappers This patch replaces all directly called driver ops by previous introduced driver-ops function wrappers. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-28 23:19:07 +01:00
Alexander Aring	b6eea9ca35	mac802154: introduce driver-ops header This patch introduce a driver-ops header file with function wrappers to call the driver ops. These wrappers checking on right context information and warn if optional driver ops are called when these aren't implemented. This behaviour is like mac80211 driver-ops header file, just without function tracing calls. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-28 23:19:07 +01:00
Alexander Aring	1630186100	mac802154: declare struct ieee802154_ops as const The ieee802154_ops structure should be never changed during runtime. This patch declare this structure as const to avoid a runtime change. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Cc: Alan Ott <alan@signal11.us> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-28 23:19:07 +01:00
Alexander Aring	19ec690a43	mac802154: main: move open and close into iface These functions can be static inside the iface file, because it's not used anywhere else. This patch moves these functions into iface file. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-28 23:19:06 +01:00
Alexander Aring	b9ff77e50c	mac802154: monitor: merge into iface implementation This patch removes the monitor implementation file and put all monitor stuff into iface file. It's now small enough to put all necessary handling into iface. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-28 23:19:06 +01:00
Johan Hedberg	0b1db38ca2	Bluetooth: Fix check for direct advertising These days we allow simultaneous LE scanning and advertising. Checking for whether advertising is enabled or not is therefore not a reliable way to determine whether directed advertising was used to trigger the connection creation. The appropriate place to check (instead of the hdev context) is the connection role that's stored in the hci_conn. This patch fixes such a check in le_conn_timeout() which could otherwise lead to incorrect HCI commands being sent. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: stable@vger.kernel.org # 3.16.x	2014-10-28 22:48:56 +01:00
Johan Hedberg	980ffc0a2c	Bluetooth: Fix LE connection timeout deadlock The le_conn_timeout() may call hci_le_conn_failed() which in turn may call hci_conn_del(). Trying to use the _sync variant for cancelling the conn timeout from hci_conn_del() could therefore result in a deadlock. This patch converts hci_conn_del() to use the non-sync variant so the deadlock is not possible. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: stable@vger.kernel.org # 3.16.x	2014-10-28 22:48:56 +01:00
David S. Miller	2c6c49ded7	openvswitch: Export lockdep_ovsl_is_held to modules. ERROR: "lockdep_ovsl_is_held" [net/openvswitch/vport-gre.ko] undefined! Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-28 17:27:23 -04:00
Simon Horman	941d8ebcf7	datapath: Rename last_action() as nla_is_last() and move to netlink.h The original motivation for this change was to allow the helper to be used in files other than actions.c as part of work on an odp select group action. It was as pointed out by Thomas Graf that this helper would be best off living in netlink.h. Furthermore, I think that the generic nature of this helper means it is best off in netlink.h regardless of if it is used more than one .c file or not. Thus, I would like it considered independent of the work on an odp select group action. Cc: Thomas Graf <tgraf@suug.ch> Cc: Pravin Shelar <pshelar@nicira.com> Cc: Andy Zhou <azhou@nicira.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Acked-by: Thomas Graf <tgraf@noironetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-28 17:07:29 -04:00
David S. Miller	25946f20b7	Merge tag 'master-2014-10-27' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== pull request: wireless 2014-10-28 Please pull this batch of fixes intended for the 3.18 stream! For the mac80211 bits, Johannes says: "Here are a few fixes for the wireless stack: one fixes the RTS rate, one for a debugfs file, one to return the correct channel to userspace, a sanity check for a userspace value and the remaining two are just documentation fixes." For the iwlwifi bits, Emmanuel says: "I revert here a patch that caused interoperability issues. dvm gets a fix for a bug that was reported by many users. Two minor fixes for BT Coex and platform power fix that helps reducing latency when the PCIe link goes to low power states." In addition... Felix Fietkau adds a couple of ath code fixes related to regulatory rule enforcement. Hauke Mehrtens fixes a build break with bcma when CONFIG_OF_ADDRESS is not set. Karsten Wiese provides a trio of minor fixes for rtl8192cu. Kees Cook prevents a potential information leak in rtlwifi. Larry Finger also brings a trio of minor fixes for rtlwifi. Rafał Miłecki adds a device ID to the bcma bus driver. Rickard Strandqvist offers some strn* -> strl* changes in brcmfmac to eliminate non-terminated string issues. Sujith Manoharan avoids some ath9k stalls by enabling HW queue control only for MCC. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-28 15:30:15 -04:00
Andrew Lunn	ae439286a0	net: dsa: Error out on tagging protocol mismatches If there is a mismatch between enabled tagging protocols and the protocol the switch supports, error out, rather than continue with a situation which is unlikely to work. Signed-off-by: Andrew Lunn <andrew@lunn.ch> cc: alexander.h.duyck@intel.com Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-28 15:27:54 -04:00
Thomas Graf	62b9c8d037	ovs: Turn vports with dependencies into separate modules The internal and netdev vport remain part of openvswitch.ko. Encap vports including vxlan, gre, and geneve can be built as separate modules and are loaded on demand. Modules can be unloaded after use. Datapath ports keep a reference to the vport module during their lifetime. Allows to remove the error prone maintenance of the global list vport_ops_list. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-28 14:43:18 -04:00
Stephen Hemminger	49c922bb1e	Bluetooth: spelling fixes Fix spelling errors in comments. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-28 17:23:58 +01:00
Jukka Rissanen	df092306d6	Bluetooth: 6lowpan: Fix lockdep splats When a device ndo_start_xmit() calls again dev_queue_xmit(), lockdep can complain because dev_queue_xmit() is re-entered and the spinlocks protecting tx queues share a common lockdep class. Same issue was fixed for ieee802154 in commit "20e7c4e80dcd" Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-28 17:04:39 +01:00
Jukka Rissanen	9030582963	Bluetooth: 6lowpan: Converting rwlocks to use RCU The rwlocks are converted to use RCU. This helps performance as the irq locks are not needed any more. Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-28 17:04:38 +01:00
Johan Hedberg	da213f8e0c	Bluetooth: Revert SMP self-test patches This reverts commits `c6992e9ef2` and `4cd3362da8`. The reason for the revert is that we cannot have more than one module initialization function and the SMP one breaks the build with modular kernels. As the proper fix for this is right now looking non-trivial it's better to simply revert the problematic patches in order to keep the upstream tree compilable. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-28 15:32:49 +01:00
Alex Gartrell	d770108911	ipvs: remove unnecessary assignment in __ip_vs_get_out_rt It is a precondition of the function that daddr be equal to dest->addr.ip if dest is non-NULL, so this additional assignment is just confusing for stupid engineers like me. Signed-off-by: Alex Gartrell <agartrell@fb.com> Signed-off-by: Simon Horman <horms@verge.net.au>	2014-10-28 09:50:06 +09:00
Alex Gartrell	3d53666b40	ipvs: Avoid null-pointer deref in debug code Use daddr instead of reaching into dest. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alex Gartrell <agartrell@fb.com> Signed-off-by: Simon Horman <horms@verge.net.au>	2014-10-28 09:48:31 +09:00
Alexei Starovoitov	f89b7755f5	bpf: split eBPF out of NET introduce two configs: - hidden CONFIG_BPF to select eBPF interpreter that classic socket filters depend on - visible CONFIG_BPF_SYSCALL (default off) that tracing and sockets can use that solves several problems: - tracing and others that wish to use eBPF don't need to depend on NET. They can use BPF_SYSCALL to allow loading from userspace or select BPF to use it directly from kernel in NET-less configs. - in 3.18 programs cannot be attached to events yet, so don't force it on - when the rest of eBPF infra is there in 3.19+, it's still useful to switch it off to minimize kernel size bloat-o-meter on x64 shows: add/remove: 0/60 grow/shrink: 0/2 up/down: 0/-15601 (-15601) tested with many different config combinations. Hopefully didn't miss anything. Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-27 19:09:59 -04:00
Kyeyoon Park	958501163d	bridge: Add support for IEEE 802.11 Proxy ARP This feature is defined in IEEE Std 802.11-2012, 10.23.13. It allows the AP devices to keep track of the hardware-address-to-IP-address mapping of the mobile devices within the WLAN network. The AP will learn this mapping via observing DHCP, ARP, and NS/NA frames. When a request for such information is made (i.e. ARP request, Neighbor Solicitation), the AP will respond on behalf of the associated mobile device. In the process of doing so, the AP will drop the multicast request frame that was intended to go out to the wireless medium. It was recommended at the LKS workshop to do this implementation in the bridge layer. vxlan.c is already doing something very similar. The DHCP snooping code will be added to the userspace application (hostapd) per the recommendation. This RFC commit is only for IPv4. A similar approach in the bridge layer will be taken for IPv6 as well. Signed-off-by: Kyeyoon Park <kyeyoonp@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-27 19:02:04 -04:00
David S. Miller	5d26b1f50a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for your net tree, they are: 1) Allow to recycle a TCP port in conntrack when the change role from server to client, from Marcelo Leitner. 2) Fix possible off by one access in ip_set_nfnl_get_byindex(), patch from Dan Carpenter. 3) alloc_percpu returns NULL on error, no need for IS_ERR() in nf_tables chain statistic updates. From Sabrina Dubroca. 4) Don't compile ip options in bridge netfilter, this mangles the packet and bridge should not alter layer >= 3 headers when forwarding packets. Patch from Herbert Xu and tested by Florian Westphal. 5) Account the final NLMSG_DONE message when calculating the size of the nflog netlink batches. Patch from Florian Westphal. 6) Fix a possible netlink attribute length overflow with large packets. Again from Florian Westphal. 7) Release the skbuff if nfnetlink_log fails to put the final NLMSG_DONE message. This fixes a leak on error. This shouldn't ever happen though, otherwise this means we miscalculate the netlink batch size, so spot a warning if this ever happens so we can track down the problem. This patch from Houcheng Lin. 8) Look at the right list when recycling targets in the nft_compat, patch from Arturo Borrero. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-27 18:47:40 -04:00
Arturo Borrero	e9105f1bea	netfilter: nf_tables: add new expression nft_redir This new expression provides NAT in the redirect flavour, which is to redirect packets to local machine. Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-10-27 22:49:39 +01:00
Arturo Borrero	9de920eddb	netfilter: refactor NAT redirect IPv6 code to use it from nf_tables This patch refactors the IPv6 code so it can be usable both from xt and nf_tables. Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-10-27 22:48:10 +01:00
Arturo Borrero	8b13eddfdf	netfilter: refactor NAT redirect IPv4 to use it from nf_tables This patch refactors the IPv4 code so it can be usable both from xt and nf_tables. A similar patch follows-up to handle IPv6. Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-10-27 22:47:06 +01:00
Arturo Borrero	7965ee9371	netfilter: nft_compat: fix wrong target lookup in nft_target_select_ops() The code looks for an already loaded target, and the correct list to search is nft_target_list, not nft_match_list. Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-10-27 22:17:46 +01:00
Fabian Frederick	b8901ac319	ipx: remove __inline__ in c file on static Let compiler decide what to do with static void __ipxitf_put() Suggested-by: David S. Miller <davem@davemloft.net> Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-27 16:25:31 -04:00
Fabian Frederick	e0f36310f7	ipx: remove unnecessary casting on ntohl use %08X instead of %08lX and remove casting. Suggested-by: Joe Perches <joe@perches.com> Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-27 16:03:53 -04:00
Fabian Frederick	5e96d788d9	ipx: move extern sysctl_ipx_pprop_broadcasting to header file include ipx.h from sysctl_net_ipx.c Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-27 16:03:53 -04:00
Fabian Frederick	ce256981e5	ipv6: include linux/uaccess.h instead of asm/uaccess.h Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-27 16:03:52 -04:00
Fabian Frederick	9451a304ce	ipv6: replace min/casting by min_t Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-27 16:03:52 -04:00
Fabian Frederick	6b436d3381	ipv4: remove set but unused variable sha unsigned char *sha (source) was already in original git version but was never used. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-27 16:03:52 -04:00
John W. Linville	99c814066e	Here are a few fixes for the wireless stack: one fixes the RTS rate, one for a debugfs file, one to return the correct channel to userspace, a sanity check for a userspace value and the remaining two are just documentation fixes. -----BEGIN PGP SIGNATURE----- iQIcBAABCAAGBQJUSUotAAoJEDBSmw7B7bqr2TwP/2EJiYMOXhLTM5F/sEaGP5aX +hN+/Za3hAuLu3GkYXnIEw8uJL0ooDpUkQLoyV7AUoqVKhgqCuMQPWbklU0Ns2Og qubAl9BRY1DPgMtdGa3mMMJ3GYkIC8Hbh6kTOPqYASVZWover5NRKFlp2jp1uhbf ypv0IfIJVO323s90TWv1ZQsTtnGxMtL9DaLwqBNKN8nSGKxe62cUZsQN+H5KGm4N /n7eN62XPkidFUsTmdAXHfcgEpGv82rtSpxWmSrwxDbQEj12xkP66cRTuomZJ5v1 981OIzcxtV0ngLjfnoSGev6bvgO2TDEbvQScIsZiqfnaJuBfzPEAaghnWozox3op dfolKkD3LecLGcxVVGJlKddxm3K4+2q7tuwkfDcxKNx2KFqtOqM6gY8z1uXYX8MW Jv7669nwpKgWM0e3hsxz6WJauEsdWRVzarmimK/Ymitu0RgNmXTVbdvvFVSTenuZ 0HLqfr7Uk0gw5gQgWfj4F0qjNxzmjhnw/pz+c1DRtYs6w6SGToCqcm5yRU6f8pLt SHc3LJ67xK5RnOq8+KJ8o92MfE29HH2CTzLzgghNrLnwqcYBTApuCr0OtpQb04Zf AgQRMq61IGXwCDiVFE1ElpRgrW4/aekUZh/JB8pGHlhrAvl+HwNYVek66MBDeMyl akmLeHhrCkuWstDHHz+o =m/XT -----END PGP SIGNATURE----- Merge tag 'mac80211-for-john-2014-10-23' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg <johannes@sipsolutions.net> says: "Here are a few fixes for the wireless stack: one fixes the RTS rate, one for a debugfs file, one to return the correct channel to userspace, a sanity check for a userspace value and the remaining two are just documentation fixes." Signed-off-by: John W. Linville <linville@tuxdriver.com>	2014-10-27 13:38:15 -04:00
Alexander Aring	be9d215fa9	mac802154: rx: change naming convention This patch changes the naming convention of mac802154 rx file. It should be more named like mac80211 stack. Furthermore we introduce a new frame parsing implementation which is much similar the mac80211 implementation. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-27 18:07:50 +01:00
Alexander Aring	e176b681b0	mac802154: rx: move rcu locking Instead of twice lock and unlock mechanism this patch hold these locks only once at one position. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-27 18:07:45 +01:00
Alexander Aring	9cf215d073	mac802154: rx: move skb_reset_mac_header This patch moves the skb_reset_mac_header call before frame parsing while wpan rx and before monitor deliver functionality. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-27 18:07:45 +01:00
Alexander Aring	c9ca640140	mac802154: rx: add monitor pkt_type information This patch adds a PACKET_OTHERHOST setting when a monitor interface receives a skb. All receiving skb's to the monitor interface should be PACKET_OTHERHOST. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-27 18:07:44 +01:00
Alexander Aring	75a46f0ee7	mac802154: rx: add CHECKSUM_UNNECESSARY This patch adds CHECKSUM_UNNECESSARY to skb->ip_summed before delivery. There exist no transceiver with IP checksum functionality. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-27 18:07:44 +01:00
Alexander Aring	702dcf994a	mac802154: rx: move skb->protocol setting This patch moves the skb->protocol setting to the position when it's needed. It's only needed when frame parsing was successful. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-27 18:07:43 +01:00
Alexander Aring	469100d6c2	mac802154: rx: rename remove mac802154_subif_rx This patch removes the mac802154_subif_rx function and do the necessary calls inside of ieee802154_rx function. The ieee802154_rx is small enough to move the functionality inside this function. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-27 18:07:43 +01:00
Alexander Aring	4ca18be54f	mac802154: tx: remove monitor receive while xmit This removes the call of monitor receive funktion when any interface type call xmit. There exist no such use case that a monitor interface should receive the actual sending frame. One use case could be that a wpan interface and monitor interface could be running at the same time on one phy. Then the monitor interface receives the wpan frames also. Furthermore we adding support for promiscous mode setting. With promiscous mode setting we can't run a wpan and monitor interface at the same time. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-27 18:07:42 +01:00
Alexander Aring	2a9820c9e2	mac802154: rx: move receive handling into rx.c This patch removes all relevant receiving functions inclusive frame parsing into rx file. Like mac80211 we should implement the complete receive handling and parsing in this file. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-27 18:07:42 +01:00
Alexander Aring	c730c90316	mac802154: rx: document ieee802154_rx() context requirement This patch is similar like `d20ef63d32` ("mac80211: document ieee80211_rx() context requirement"). The netif_receive_skb call requires with softirqs disabled. This patch adds a warning if softirqs are pending while calling ieee802154_rx. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-27 18:07:42 +01:00
Alexander Aring	c5c47e67bc	mac802154: rx: use tasklet instead workqueue Tasklets have much less overhead than workqueues. This patch also removes the heap allocation for the worker on receiving path. Like mac80211 we should prefer use a tasklet here instead a workqueue to getting fast out of interrupt context when ieee802154_rx_irqsafe is called by driver. Like wireless inside the tasklet context we should call netif_receive_skb instead netif_rx_ni anymore. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-27 18:07:40 +01:00
Alexander Aring	061ef8f915	mac802154: tx: use put_unaligned_le16 for copy crc This patch replaces the memcpy with a put_unaligned_le16. The placement of crc inside of PSDU can also be unaligned. With memcpy this can fail on some architectures. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Reported-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-27 18:07:36 +01:00
Martin Townsend	01141234f2	ieee802154: 6lowpan: rename process_data and lowpan_process_data As we have decouple decompression from data delivery we can now rename all occurences of process_data in receive path. Signed-off-by: Martin Townsend <mtownsend1973@gmail.com> Acked-by: Alexander Aring <alex.aring@gmail.com> Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-27 15:51:16 +01:00
Martin Townsend	3c400b843d	bluetooth:6lowpan: use consume_skb when packet processed successfully Signed-off-by: Martin Townsend <mtownsend1973@gmail.com> Acked-by: Alexander Aring <alex.aring@gmail.com> Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-27 15:51:16 +01:00
Martin Townsend	04dfd7386a	6lowpan: fix process_data return values As process_data now returns just error codes fix up the calls to this function to only drop the skb if an error code is returned. Signed-off-by: Martin Townsend <mtownsend1973@gmail.com> Acked-by: Alexander Aring <alex.aring@gmail.com> Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-27 15:51:15 +01:00
Martin Townsend	f8b361768e	6lowpan: remove skb_deliver from IPHC Separating skb delivery from decompression ensures that we can support further decompression schemes and removes the mixed return value of error codes with NET_RX_FOO. Signed-off-by: Martin Townsend <mtownsend1973@gmail.com> Acked-by: Alexander Aring <alex.aring@gmail.com> Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-27 15:51:15 +01:00
Karl Beldan	9ffe904405	mac80211: minstrel_ht: do not always skip ht rates vht_only is true When CONFIG_MAC80211_RC_MINSTREL_VHT is set, the module param minstrel_vht_only tells minstrel_ht whether to allow the mix of ht rates with vht rates. ATM, minstrel_ht skips ht rates when minstrel_vht_only is true, but it does that even if vht is not supported, which makes the sta rates fallback to legacy as no ht rate gets enabled. Fixes: `9208247d74` ("mac80211: minstrel_ht: add basic support for VHT rates <= 3SS@80MHz") Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-27 08:48:35 +01:00
Emmanuel Grumbach	cfede0d80d	mac80211: don't flush when probing the AP All the callers of ieee80211_mgd_probe_ap_send return right after they call the flush() callback. This means that calling flush() is uneeded since its meaning is to wait until the queues of the device are empty. Devices that know how to report status on Tx will do so using the regular path (ieee80211_tx_status) and this status will trigger the continuation of the flow of the probe (ieee80211_sta_tx_notify). Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-27 08:48:34 +01:00
Ben Greear	b5dfae020b	mac80211: support creating vifs with specified mac address This is useful when creating virtual interfaces. Signed-off-by: Ben Greear <greearb@candelatech.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-27 08:48:34 +01:00
Ben Greear	e8f479b112	cfg80211: support configuring vif mac addr on create This is useful when creating virtual interfaces. Keeps udev from mucking with things it shouldn't, since the default MAC is never seen by udev when specified on the cmd-line during creation. Signed-off-by: Ben Greear <greearb@candelatech.com> [check for feature flag in nl80211 to force drivers to set it] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-27 08:48:33 +01:00
Ben Greear	e27513fbd0	mac80211: support creating wiphy w/out creating wlanX This will be helpful when using the mac80211_hwsim wiphys and automated testing. Let user create the vifs as needed, and named as expected. Signed-off-by: Ben Greear <greearb@candelatech.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-27 08:48:32 +01:00
Ben Greear	ad28757eef	mac80211: allow creating wiphy devices with suggested name Support creating wiphy devices with an optional name. This will be used by hwsim to have better automated control over virtual radio creation/deletion. Signed-off-by: Ben Greear <greearb@candelatech.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-27 08:48:31 +01:00
Ben Greear	1998d90ad4	cfg80211: support creating wiphy with suggested name Kernel will attempt to use the name if it is supplied, but if name cannot be used for some reason, the default phyX name will be used instead. Signed-off-by: Ben Greear <greearb@candelatech.com> [while at it, use wiphy_name() instead of dev_name(), fix format string issue reported by Kees Cook] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-27 08:48:18 +01:00
Fabian Frederick	5c1e9f2c1f	xfrm: fix set but not used warning in xfrm_policy_queue_process() err was set but unused. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-10-27 07:16:46 +01:00
Eric Dumazet	93a35f59f1	net: napi_reuse_skb() should check pfmemalloc Do not reuse skb if it was pfmemalloc tainted, otherwise future frame might be dropped anyway. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Roman Gushchin <klamm@yandex-team.ru> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-26 22:47:23 -04:00
Alexander Aring	f81f466ca5	mac802154: tx: make worker information static This patch moves the worker information struct out of skb control block. Instead control block we declare it static inside of tx.c file. We can do that, because the worker can't be used twice at the same time. It's protected by stop and wake netdev queue. This patch fix an issue that the "struct ieee802154_xmit_cb" doesn't fit into the skb control block on some kernel configuartion reported by kbuild test robot. It was introduced by commit `fe24371d66` ("mac802154: tx: remove kmalloc in xmit hotpath"). Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-26 19:18:35 +01:00
Alexander Aring	e5e584fcc2	mac802154: tx: change naming convention This patch changes the naming convention of the tx functions like mac80211. Just with an 802154 instead 80211 inside the name. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-26 17:24:05 +01:00
Alexander Aring	409c3b0c5f	mac802154: tx: move stats tx increment This patch moves the stats increment of successful transmitted packets in the right place when the skb was really successful transmitted. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-26 17:24:05 +01:00
Alexander Aring	b7eec52bcb	mac802154: tx: cleanup crc calculation Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-26 17:24:05 +01:00
Alexander Aring	cfa626cb37	mac802154: tx: use netdev print helpers This patch replace the pr_foo printout function to netdev_foo printout function. Inside the xmit handling, the interface is already known. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-26 17:24:04 +01:00
Alexander Aring	6001d5223d	mac802154: tx: don't allow if down while sync tx This patch holds rtnl lock while sync xmit inside of workqueue. Otherwise we could down the interface while worker xmit handling. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-26 17:24:04 +01:00
Alexander Aring	ed0a5dce0c	mac802154: tx: add support for xmit_async callback This patch renames the existsing xmit callback to xmit_sync and introduces an asynchronous xmit_async function. If ieee802154_ops doesn't provide the xmit_async callback, then we have a fallback to the xmit_sync callback. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Cc: Alan Ott <alan@signal11.us> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-26 17:24:04 +01:00
Alexander Aring	cdb66beaa0	mac802154: tx: fix error handling while xmit In case of an error we should call kfree_skb instead of consume_skb which is called by ieee802154_xmit_complete function. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-26 17:24:04 +01:00
Alexander Aring	18d60a0d49	mac802154: tx: use queue helpers in xmit worker This patch uses the queue utility helpers inside the xmit worker of mac802154 subsystem. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-26 17:24:03 +01:00
Alexander Aring	c208510351	mac802154: add netdev qeue helpers This patch adds a new file net/mac802154/util.c which contains utility functions for drivers, etc. This file contains functions to start and stop queues for all virtual interfaces, this is useful for asynchronous handling by driver level. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-26 17:24:03 +01:00
Alexander Aring	dc67c6b30f	mac802154: tx: remove xmit channel context switch This patch removes the channel hopping feature before xmit. There are several issues to provide a real channel hopping (timing requirements, etc...). We don't have any known kernelspace protocol which really use this feature. And I don't know an real user of this feature. We simply drop this feature now. This patch removes also the hold of pib lock which isn't needed by any real driver xmit callback implementation. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-26 17:24:03 +01:00
Alexander Aring	e89e45f22a	mac802154: tx: squash multiple dereferencing This patch introduce some new stack variables to avoid multiple dereferencing inside the xmit worker function. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-26 17:24:02 +01:00
Alexander Aring	fe24371d66	mac802154: tx: remove kmalloc in xmit hotpath This patch removes the kmalloc allocation for workqueue data. This patch replaces the kmalloc and uses the control block of skb. The control block has enough space and isn't use by any other layer in this case. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-26 17:23:58 +01:00
Alexander Aring	50c6fb9965	mac802154: tx: move xmit callback to tx file This patch moves the netdev xmit callback functions into the tx.c file. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-26 17:23:50 +01:00
Eric Dumazet	349ce993ac	tcp: md5: do not use alloc_percpu() percpu tcp_md5sig_pool contains memory blobs that ultimately go through sg_set_buf(). -> sg_set_page(sg, virt_to_page(buf), buflen, offset_in_page(buf)); This requires that whole area is in a physically contiguous portion of memory. And that @buf is not backed by vmalloc(). Given that alloc_percpu() can use vmalloc() areas, this does not fit the requirements. Replace alloc_percpu() by a static DEFINE_PER_CPU() as tcp_md5sig_pool is small anyway, there is no gain to dynamically allocate it. Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: `765cf9976e` ("tcp: md5: remove one indirection level in tcp_md5sig_pool") Reported-by: Crestez Dan Leonard <cdleonard@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-25 16:10:04 -04:00
Alexander Aring	c6f635faf3	mac802154: remove ieee802154_addr from driver_ops This driver_ops callback function is never used by any driver. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-25 21:55:39 +02:00
Alexander Aring	f773054254	mac802154: rename dev_workqueue to workqueue Small rename to use the name workqueue than dev_workqueue. To bring the same naming convention like wireless into 802.15.4. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-25 21:55:38 +02:00
Alexander Aring	59d19cd70c	mac802154: introduce IEEE802154_DEV_TO_SUB_IF This function adds a wrapper to call netdev_priv to getting the sdata attribute. This is similar like the IEEE80211_DEV_TO_SUB_IF function inside wireless stack implementation. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-25 21:55:38 +02:00
Alexander Aring	60741361c3	mac802154: introduce hw_to_local function This patch replace the mac802154_to_priv macro with a static inline function named hw_to_local. This brings a similar naming convention like mac80211 stack. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-25 21:55:38 +02:00
Alexander Aring	d98be45b36	mac802154: rename sdata slaves and slaves_mtx This patch renamens the slaves attribute in sdata to interfaces and slaves_mtx to iflist_mtx. This is similar like the mac80211 stack naming convention. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-25 21:55:38 +02:00
Alexander Aring	04e850fe06	mac802154: rename hw subif_data variable to local This patch renames the hw attribute in struct ieee802154_sub_if_data to local. This avoid confusing with the struct ieee802154_hw hw; inside of local struct. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-25 21:55:38 +02:00
Alexander Aring	036562f9c4	mac802154: rename mac802154_sub_if_data Like wireless this structure should named ieee802154_sub_if_data and not mac802154_sub_if_data. This patch renames the struct and variables to sdata instead priv sometimes. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-25 21:55:37 +02:00
Alexander Aring	a5e1ec538f	mac802154: rename mac802154_priv to ieee802154_local This patch rename the mac802154_priv to ieee802154_local. The mac802154_priv structure is like ieee80211_local and so we name it ieee802154_local. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-25 21:55:37 +02:00
Alexander Aring	5a50439775	ieee802154: rename ieee802154_dev to ieee802154_hw The identical struct of the wireless stack implementation is named ieee80211_hw. This is useful to name the variable hw instead of get confusing with netdev dev variable. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Cc: Alan Ott <alan@signal11.us> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-25 21:55:37 +02:00
Alexander Aring	4ca24aca55	ieee802154: move ieee802154 header This patch moves the ieee802154 header into include/linux instead include/net. Similar like wireless which have the ieee80211 header inside of include/linux. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Cc: Alan Ott <alan@signal11.us> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-25 21:39:57 +02:00
Alexander Aring	86d52cd964	ieee802154: move wpan-class.c to core.c Like the wireless core.c file this file contains function for phy allocation and freeing. Move this file to core.c to get similar behaviour. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-25 21:39:56 +02:00
Alexander Aring	5ad60d3699	ieee802154: move wpan-phy.h to cfg802154.h The wpan-phy header contains the wpan_phy struct information. Later this header will be have similar function like cfg80211 header. The cfg80211 header contains the wiphy struct which is identically the wpan_phy struct inside 802.15.4 subsystem. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Cc: Alan Ott <alan@signal11.us> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-25 21:39:56 +02:00
Alexander Aring	15859a5e14	mac802154: move wpan.c to iface.c The wpan.c file contains the interface handling functions now. It's similar like the mac80211 iface.c file. This patch renames this file to iface.c to have similar naming convention in mac802154. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-25 21:39:55 +02:00
Alexander Aring	0f1556bc2b	mac802154: move mac802154.h to ieee802154_i.h This patch moves the mac802154.h internal header to ieee802154_i.h like the wireless stack ieee80211_i.h file. This avoids confusing with the not internal header include/net/mac802154.h header. Additional we get the same naming conversion like mac80211 for this file. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-25 21:39:55 +02:00
Alexander Aring	62eb01f5c2	mac802154: move ieee802154_dev.c to main.c The ieee802154_dev functionality contains various function for allocation and registration of an ieee802154_dev. This is equal to the net/mac80211/main.c file. This patch rename the ieee802154_dev.c to main.c to have the same behaviour. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-25 21:39:54 +02:00
Johan Hedberg	c6992e9ef2	Bluetooth: Add self-tests for SMP crypto functions This patch adds self-tests for the c1 and s1 crypto functions used for SMP pairing. The data used is the sample data from the core specification. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-25 21:33:57 +02:00
Johan Hedberg	4cd3362da8	Bluetooth: Add skeleton for SMP self-tests This patch adds a basic skeleton for SMP self-tests. The tests are put behind a new configuration option since running them will slow down the boot process. For now there are no actual tests defined but those will come in a subsequent patch. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-25 21:33:56 +02:00
Johan Hedberg	e491eaf3c0	Bluetooth: Pass only crypto context to SMP crypto functions In order to make unit testing possible we need to make the SMP crypto functions only take the crypto context instead of the full SMP context (the latter would require having hci_dev, hci_conn, l2cap_chan, l2cap_conn, etc around). The drawback is that we no-longer get the involved hdev in the debug logs, but this is really the only way to make simple unit tests for the code. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-25 21:33:56 +02:00
Fabian Frederick	4f639edef7	Bluetooth: fix shadow warning in hci_disconnect() use clkoff_cp for hci_cp_read_clock_offset instead of cp (already defined above). Suggested-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-25 18:53:39 +02:00
Alexander Aring	ed1da14817	ieee802154: wpan-class: fix trailing semicolon This patch removes an unnecessary tailing semicolon after macro define. Otherwise we get a trailing semicolon while using this macro. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-25 08:07:30 +02:00
Alexander Aring	57205c14ca	mac802154: fix typo IEEE802515 to IEEE802154 This patch fixs a typo in address filter defines from IEEE802515 to IEEE802154. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Cc: Alan Ott <alan@signal11.us> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-25 08:07:30 +02:00
Alexander Aring	139f14adab	ieee802154: ieee802154_dev: fix align typo This patch fix a typo and fix align instead allign. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-25 08:07:30 +02:00
Alexander Aring	b3020f0a35	ieee802154: mac802154: remove FSF address This patch removes the FSF address in files which belongs to ieee802154 and mac802154. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Cc: Alan Ott <alan@signal11.us> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-25 08:07:30 +02:00
Martin Townsend	ee93053d56	Bluetooth: Fix missing channel unlock in l2cap_le_credits In the error case where credits is greater than max_credits there is a missing l2cap_chan_unlock before returning. Signed-off-by: Martin Townsend <mtownsend1973@gmail.com> Tested-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-25 07:56:25 +02:00
Martin Townsend	11e3ff7072	6lowpan: Use skb_cow in IPHC decompression. Currently there are potentially 2 skb_copy_expand calls in IPHC decompression. This patch replaces this with one call to skb_cow which will check to see if there is enough headroom first to ensure it's only done if necessary and will handle alignment issues for cache. As skb_cow uses pskb_expand_head we ensure the skb isn't shared from bluetooth and ieee802.15.4 code that use the IPHC decompression. Signed-off-by: Martin Townsend <martin.townsend@xsilon.com> Acked-by: Alexander Aring <alex.aring@gmail.com> Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-25 07:56:25 +02:00
Li RongQing	4456c50d23	Bluetooth: 6lowpan: remove unnecessary codes in give_skb_to_upper netif_rx() only returns NET_RX_DROP and NET_RX_SUCCESS, not returns negative value Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-25 07:56:25 +02:00
Szymon Janc	15346a9c28	Bluetooth: Improve RFCOMM __test_pf macro robustness Value returned by this macro might be used as bit value so it should return either 0 or 1 to avoid possible bugs (similar to NSC bug) when shifting it. Signed-off-by: Szymon Janc <szymon.janc@tieto.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-10-25 07:56:24 +02:00
Szymon Janc	ec511545ef	Bluetooth: Fix RFCOMM NSC response rfcomm_send_nsc expects CR to be either 0 or 1 since it is later passed to __mcc_type macro and shitfed. Unfortunatelly CR extracted from received frame type was not sanitized and shifted value was passed resulting in bogus response. Note: shifted value was also passed to other functions but was used only in if satements so this bug appears only for NSC case. The CR bit in the value octet shall be set to the same value as the CR bit in the type field octet of the not supported command frame but the CR bit for NCS response should be set to 0 since it is always a response. This was affecting TC_RFC_BV_25_C PTS qualification test. Signed-off-by: Szymon Janc <szymon.janc@tieto.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-10-25 07:56:24 +02:00
Alfonso Acosta	89cbb0638e	Bluetooth: Defer connection-parameter removal when unpairing Systematically removing the LE connection parameters and autoconnect action is inconvenient for rebonding without disconnecting from userland (i.e. unpairing followed by repairing without disconnecting). The parameters will be lost after unparing and userland needs to take care of book-keeping them and re-adding them. This patch allows userland to forget about parameter management when rebonding without disconnecting. It defers clearing the connection parameters when unparing without disconnecting, giving a chance of keeping the parameters if a repairing happens before the connection is closed. Signed-off-by: Alfonso Acosta <fons@spotify.com> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-10-25 07:56:24 +02:00
Alexander Aring	c37a8106de	ieee802154: 6lowpan: add RTNL assertion This patch ensure that the rtnl lock is hold while newlink callback. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-25 07:56:24 +02:00
Alexander Aring	1ae2605e55	ieee802154: 6lowpan: improve packet registration This patch improves the packet registration handling. Instead of registration with module init we have a open count variable and registration the lowpan packet handler when it's needed. The open count variable should be protected by RTNL. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-25 07:56:24 +02:00
Alfonso Acosta	ddbea5cff7	Bluetooth: Remove redundant check on hci_conn's device class NULL-checking conn->dev_class is pointless since the variable is defined as an array, i.e. it will always be non-NULL. Signed-off-by: Alfonso Acosta <fons@spotify.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-10-25 07:56:24 +02:00
Alfonso Acosta	fd45ada910	Bluetooth: Include ADV_IND report in Device Connected event There are scenarios when autoconnecting to a device after the reception of an ADV_IND report (action 0x02), in which userland might want to examine the report's contents. For instance, the Service Data might have changed and it would be useful to know ahead of time before starting any GATT procedures. Also, the ADV_IND may contain Manufacturer Specific data which would be lost if not propagated to userland. In fact, this patch results from the need to rebond with a device lacking persistent storage which notifies about losing its LTK in ADV_IND reports. This patch appends the ADV_IND report which triggered the autoconnection to the EIR Data in the Device Connected event. Signed-off-by: Alfonso Acosta <fons@spotify.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-10-25 07:56:24 +02:00
Alfonso Acosta	48ec92fa4f	Bluetooth: Refactor arguments of mgmt_device_connected The values of a lot of the mgmt_device_connected() parameters come straight from a hci_conn object. We can simplify the function by passing the full hci_conn pointer to it. Signed-off-by: Alfonso Acosta <fons@spotify.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-10-25 07:56:23 +02:00
Alexander Aring	c0bffc7ddc	ieee802154: 6lowpan: fix sign of errno return val This patch fix ERR_PTR(-rc) to ERR_PTR(rc). The variable rc is already a negative errno value. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-25 07:56:22 +02:00
Alexander Aring	f870b8c631	ieee802154: reassembly: fix tag byteorder This patch fix byte order handling in reassembly code of 802.15.4 6LoWPAN fragmentation handling. net/ieee802154/reassembly.c:58:43: warning: restricted __be16 degrades to integer Signed-off-by: Alexander Aring <alex.aring@gmail.com> Reported-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-25 07:56:22 +02:00
Alexander Aring	cd97a713ac	ieee802154: 6lowpan: fix byteorder for frag tag This patch fix byteorder issues with fragment tag of generation 802.15.4 6LoWPAN fragment header. net/ieee802154/6lowpan_rtnl.c:278:54: warning restricted __be16 degrades to integer net/ieee802154/6lowpan_rtnl.c:278:18: warning: incorrect type in assignment (different base types) net/ieee802154/6lowpan_rtnl.c:278:18: expected restricted __be16 [usertype] frag_tag net/ieee802154/6lowpan_rtnl.c:278:18: got unsigned short Signed-off-by: Alexander Aring <alex.aring@gmail.com> Reported-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-25 07:56:22 +02:00
Simon Vincent	39f6eb19cf	ieee802154: 6lowpan: Drop PACKET_OTHERHOST skbs in 6lowpan There is no point processing pkts which are PACKET_OTHERHOST in 6lowpan as they are discarded as soon as they reach the ipv6 layer. Therefore we should drop them in the 6lowpan layer. Signed-off-by: Simon Vincent <simon.vincent@xsilon.com> Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-25 07:56:21 +02:00
Julien Catalano	ee4c148e8a	trivial: net/mac802154: Fix Kconfig typo Signed-off-by: Julien Catalano <julien.catalano@gmail.com> Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-25 07:56:21 +02:00
Fabian Frederick	74bca138e1	net: llc: include linux/errno.h instead of asm/errno.h Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-24 15:51:42 -04:00
Fabian Frederick	75da1469f9	lapb: move EXPORT_SYMBOL after functions. See Documentation/CodingStyle Chapter 6 Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-24 15:51:42 -04:00
Houcheng Lin	b51d3fa364	netfilter: nf_log: release skbuff on nlmsg put failure The kernel should reserve enough room in the skb so that the DONE message can always be appended. However, in case of e.g. new attribute erronously not being size-accounted for, __nfulnl_send() will still try to put next nlmsg into this full skbuf, causing the skb to be stuck forever and blocking delivery of further messages. Fix issue by releasing skb immediately after nlmsg_put error and WARN() so we can track down the cause of such size mismatch. [ fw@strlen.de: add tailroom/len info to WARN ] Signed-off-by: Houcheng Lin <houcheng@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-10-24 14:34:11 +02:00
Florian Westphal	c1e7dc91ee	netfilter: nfnetlink_log: fix maximum packet length logged to userspace don't try to queue payloads > 0xffff - NLA_HDRLEN, it does not work. The nla length includes the size of the nla struct, so anything larger results in u16 integer overflow. This patch is similar to `9cefbbc9c8` (netfilter: nfnetlink_queue: cleanup copy_range usage). Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-10-24 14:32:27 +02:00
Florian Westphal	9dfa1dfe4d	netfilter: nf_log: account for size of NLMSG_DONE attribute We currently neither account for the nlattr size, nor do we consider the size of the trailing NLMSG_DONE when allocating nlmsg skb. This can result in nflog to stop working, as __nfulnl_send() re-tries sending forever if it failed to append NLMSG_DONE (which will never work if buffer is not large enough). Reported-by: Houcheng Lin <houcheng@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-10-24 14:30:15 +02:00
Herbert Xu	7677e86843	bridge: Do not compile options in br_parse_ip_options Commit `462fb2af97` bridge : Sanitize skb before it enters the IP stack broke when IP options are actually used because it mangles the skb as if it entered the IP stack which is wrong because the bridge is supposed to operate below the IP stack. Since nobody has actually requested for parsing of IP options this patch fixes it by simply reverting to the previous approach of ignoring all IP options, i.e., zeroing the IPCB. If and when somebody who uses IP options and actually needs them to be parsed by the bridge complains then we can revisit this. Reported-by: David Newall <davidn@davidnewall.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Tested-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-10-24 14:24:03 +02:00
Martin KaFai Lau	367efcb932	ipv6: Avoid redoing fib6_lookup() with reachable = 0 by saving fn This patch save the fn before doing rt6_backtrack. Hence, without redo-ing the fib6_lookup(), saved_fn can be used to redo rt6_select() with RT6_LOOKUP_F_REACHABLE off. Some minor changes I think make sense to review as a single patch: * Remove the 'out:' goto label. * Remove the 'reachable' variable. Only use the 'strict' variable instead. After this patch, "failing ip6_ins_rt()" should be the only case that requires a redo of fib6_lookup(). Cc: David Miller <davem@davemloft.net> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-24 00:14:39 -04:00
Martin KaFai Lau	94c77bb41d	ipv6: Avoid redoing fib6_lookup() for RTF_CACHE hit case When there is a RTF_CACHE hit, no need to redo fib6_lookup() with reachable=0. Cc: David Miller <davem@davemloft.net> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-24 00:14:39 -04:00
Martin KaFai Lau	a3c00e46ef	ipv6: Remove BACKTRACK macro It is the prep work to reduce the number of calls to fib6_lookup(). The BACKTRACK macro could be hard-to-read and error-prone due to its side effects (mainly goto). This patch is to: 1. Replace BACKTRACK macro with a function (fib6_backtrack) with the following return values: * If it is backtrack-able, returns next fn for retry. * If it reaches the root, returns NULL. 2. The caller needs to decide if a backtrack is needed (by testing rt == net->ipv6.ip6_null_entry). 3. Rename the goto labels in ip6_pol_route() to make the next few patches easier to read. Cc: David Miller <davem@davemloft.net> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-24 00:14:39 -04:00
Kenjiro Nakayama	105970f608	net: Remove trailing whitespace in tcp.h icmp.c syncookies.c Remove trailing whitespace in tcp.h icmp.c syncookies.c Signed-off-by: Kenjiro Nakayama <nakayamakenjiro@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-24 00:13:10 -04:00
Arik Nemtsov	0fc1e0495f	mac80211: expose API allowing station iteration Allow drivers to iterate all stations currently uploaded to them. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-23 20:40:03 +02:00
Arik Nemtsov	2bad7748b3	mac80211: add stations in order to the station list During reconfig the station list is traversed in order and station are added back to the driver. Make sure the stations are added to the driver in the same order they were added to mac80211. This has a real side effect - some drivers (iwlwifi) require TDLS stations to be added only after the AP station for the same network. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-23 20:40:02 +02:00
Arik Nemtsov	8b94148cfe	mac80211: expose TDLS-initiator value to low level driver Some drivers need to know which station is the TDLS link initiator. Expose this value via the mac80211 ieee80211_sta structure. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-23 20:40:02 +02:00
Arik Nemtsov	452218d9fd	mac80211: fix network header breakage during encryption When an IV is generated, only the MAC header is moved back. The network header location remains the same relative to the skb head, as the new IV is using headroom space that was reserved in advance. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-23 20:40:01 +02:00
Andrei Otcheretianski	a7f3a76828	mac80211: export IE splitting function Export ieee80211_ie_split function, so it can be reused by drivers which need to insert additional elements. Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-23 20:39:43 +02:00
Karl Beldan	8ec7886b1c	mac80211: minstrel_ht: use group flags instead of index to display rates When displaying a rate through debugfs minstrel_ht guesses its flags comparing group indexes. Since `3ec373c421` ("mac80211: minstrel_ht: include type (cck/ht) in rates flag"), the rate flags of interest are present in the mcs_group-s, so use it. While improving the code, this also fixes a smatch false positive "error: testing array offset 'i' after use" in minstrel_ht_stats_dump. This warning only triggers after `9208247d74` ("mac80211: minstrel_ht: add basic support for VHT rates <= 3SS@80MHz") with CONFIG_MAC80211_RC_MINSTREL_VHT unset because then MINSTREL_VHT_GROUP_0 is above MINSTREL_GROUPS_NB and smatch only barks when the "testing array offset" seems to prevent possible out of bonds accesses (which does not happen here since i < ARRAY_SIZE(mi->groups)). Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com> Cc: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-23 20:36:13 +02:00
J. Bruce Fields	280caac078	rpc: change comments to assertions Reported-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-10-23 14:05:11 -04:00
J. Bruce Fields	ed38c06998	RPC: remove unneeded checks from xdr_truncate_encode() Thanks to Andrea Arcangeli for pointing out these checks are obviously unnecessary given the preceding calculations. Reported-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-10-23 14:05:11 -04:00
Sathya Perla	9e7ceb0607	net: fix saving TX flow hash in sock for outgoing connections The commit "net: Save TX flow hash in sock and set in skbuf on xmit" introduced the inet_set_txhash() and ip6_set_txhash() routines to calculate and record flow hash(sk_txhash) in the socket structure. sk_txhash is used to set skb->hash which is used to spread flows across multiple TXQs. But, the above routines are invoked before the source port of the connection is created. Because of this all outgoing connections that just differ in the source port get hashed into the same TXQ. This patch fixes this problem for IPv4/6 by invoking the the above routines after the source port is available for the socket. Fixes: b73c3d0e4("net: Save TX flow hash in sock and set in skbuf on xmit") Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-22 16:14:29 -04:00
Li RongQing	789f202326	xfrm6: fix a potential use after free in xfrm6_policy.c pskb_may_pull() maybe change skb->data and make nh and exthdr pointer oboslete, so recompute the nd and exthdr Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-22 15:38:48 -04:00
Karl Beldan	a63ba13eec	net: tso: fix unaligned access to crafted TCP header in helper API The crafted header start address is from a driver supplied buffer, which one can reasonably expect to be aligned on a 4-bytes boundary. However ATM the TSO helper API is only used by ethernet drivers and the tcp header will then be aligned to a 2-bytes only boundary from the header start address. Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com> Cc: Ezequiel Garcia <ezequiel.garcia@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-22 12:52:55 -04:00
Sabrina Dubroca	c123bb7163	netfilter: nf_tables: check for NULL in nf_tables_newchain pcpu stats allocation alloc_percpu returns NULL on failure, not a negative error code. Fixes: `ff3cd7b3c9` ("netfilter: nf_tables: refactor chain statistic routines") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-10-22 14:12:51 +02:00
Dan Carpenter	0f9f5e1b83	netfilter: ipset: off by one in ip_set_nfnl_get_byindex() The ->ip_set_list[] array is initialized in ip_set_net_init() and it has ->ip_set_max elements so this check should be >= instead of > otherwise we are off by one. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-10-22 14:12:50 +02:00
Marcelo Leitner	e37ad9fd63	netfilter: nf_conntrack: allow server to become a client in TW handling When a port that was used to listen for inbound connections gets closed and reused for outgoing connections (like rsh ends up doing for stderr flow), current we may reject the SYN/ACK packet for the new connection because tcp_conntracks states forbirds a port to become a client while there is still a TIME_WAIT entry in there for it. As TCP may expire the TIME_WAIT socket in 60s and conntrack's timeout for it is 120s, there is a ~60s window that the application can end up opening a port that conntrack will end up blocking. This patch fixes this by simply allowing such state transition: if we see a SYN, in TIME_WAIT state, on REPLY direction, move it to sSS. Note that the rest of the code already handles this situation, more specificly in tcp_packet(), first switch clause. Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-10-22 14:12:50 +02:00
Johannes Berg	4619194a49	mac80211: don't remove tainted keys after not programming When a key is tainted during resume, it is no longer programmed into the device; however, it's uploaded flag may (will) be set. Clear the flag when not programming it because it's tainted to avoid attempting to remove it again later. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-22 11:30:30 +02:00
Johannes Berg	02219b3abc	mac80211: add WMM admission control support Use the currently existing APIs between mac80211 and the low level driver to implement WMM admission control. The low level driver needs to report the media time used by each transmitted packet in ieee80211_tx_status. Based on that information, mac80211 will modify the QoS parameters of the admission controlled Access Category when the limit is reached. Once the original QoS parameters can be restored, mac80211 will do so. One issue with this approach is that management frames will also erroneously be downgraded, but the upside is that the implementation is simple. In the future, it can be extended to driver- or device-based implementations that are better. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-22 10:42:09 +02:00
Johannes Berg	f409079bb6	mac80211: sanity check CW_min/CW_max towards driver There's no reason to ever set invalid CW_min/CW_max to the drivers, we should catch it in higher layers. However, the consequences of setting it wrong can be quite severe, so double-check at a low level and error out for invalid data. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-22 10:42:09 +02:00
Johannes Berg	723e73acd1	cfg80211: make WMM TSPEC support flag an nl80211 feature flag During the review of the corresponding wpa_supplicant patches we noticed that the only way for it to detect that this functionality is supported currently is to check for the command support. This can be misleading though, as the command was also designed to, in the future, support pure 802.11 TSPECs. Expose the WMM-TSPEC feature flag to nl80211 so later we can also expose an 802.11-TSPEC feature flag (if needed) to differentiate the two cases. Note: this change isn't needed in 3.18 as there's no driver there yet that supports the functionality at all. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-22 10:41:49 +02:00
Sabrina Dubroca	7c1c97d54f	net: sched: initialize bstats syncp Use netdev_alloc_pcpu_stats to allocate percpu stats and initialize syncp. Fixes: `22e0f8b932` "net: sched: make bstats per cpu and estimator RCU safe" Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-21 21:45:21 -04:00
Thomas Graf	78fd1d0ab0	netlink: Re-add locking to netlink_lookup() and seq walker The synchronize_rcu() in netlink_release() introduces unacceptable latency. Reintroduce minimal lookup so we can drop the synchronize_rcu() until socket destruction has been RCUfied. Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Reported-by: Steinar H. Gunderson <sgunderson@bigfoot.com> Reported-and-tested-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-21 21:34:49 -04:00
Ying Xue	1a194c2d59	tipc: fix lockdep warning when intra-node messages are delivered When running tipcTC&tipcTS test suite, below lockdep unsafe locking scenario is reported: [ 1109.997854] [ 1109.997988] ================================= [ 1109.998290] [ INFO: inconsistent lock state ] [ 1109.998575] 3.17.0-rc1+ #113 Not tainted [ 1109.998762] --------------------------------- [ 1109.998762] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. [ 1109.998762] swapper/7/0 [HC0[0]:SC1[1]:HE1:SE0] takes: [ 1109.998762] (slock-AF_TIPC){+.?...}, at: [<ffffffffa0011969>] tipc_sk_rcv+0x49/0x2b0 [tipc] [ 1109.998762] {SOFTIRQ-ON-W} state was registered at: [ 1109.998762] [<ffffffff810a4770>] __lock_acquire+0x6a0/0x1d80 [ 1109.998762] [<ffffffff810a6555>] lock_acquire+0x95/0x1e0 [ 1109.998762] [<ffffffff81a2d1ce>] _raw_spin_lock+0x3e/0x80 [ 1109.998762] [<ffffffffa0011969>] tipc_sk_rcv+0x49/0x2b0 [tipc] [ 1109.998762] [<ffffffffa0004fe8>] tipc_link_xmit+0xa8/0xc0 [tipc] [ 1109.998762] [<ffffffffa000ec6f>] tipc_sendmsg+0x15f/0x550 [tipc] [ 1109.998762] [<ffffffffa000f165>] tipc_connect+0x105/0x140 [tipc] [ 1109.998762] [<ffffffff817676ee>] SYSC_connect+0xae/0xc0 [ 1109.998762] [<ffffffff81767b7e>] SyS_connect+0xe/0x10 [ 1109.998762] [<ffffffff817a9788>] compat_SyS_socketcall+0xb8/0x200 [ 1109.998762] [<ffffffff81a306e5>] sysenter_dispatch+0x7/0x1f [ 1109.998762] irq event stamp: 241060 [ 1109.998762] hardirqs last enabled at (241060): [<ffffffff8105a4ad>] __local_bh_enable_ip+0x6d/0xd0 [ 1109.998762] hardirqs last disabled at (241059): [<ffffffff8105a46f>] __local_bh_enable_ip+0x2f/0xd0 [ 1109.998762] softirqs last enabled at (241020): [<ffffffff81059a52>] _local_bh_enable+0x22/0x50 [ 1109.998762] softirqs last disabled at (241021): [<ffffffff8105a626>] irq_exit+0x96/0xc0 [ 1109.998762] [ 1109.998762] other info that might help us debug this: [ 1109.998762] Possible unsafe locking scenario: [ 1109.998762] [ 1109.998762] CPU0 [ 1109.998762] ---- [ 1109.998762] lock(slock-AF_TIPC); [ 1109.998762] <Interrupt> [ 1109.998762] lock(slock-AF_TIPC); [ 1109.998762] [ 1109.998762] * DEADLOCK * [ 1109.998762] [ 1109.998762] 2 locks held by swapper/7/0: [ 1109.998762] #0: (rcu_read_lock){......}, at: [<ffffffff81782dc9>] __netif_receive_skb_core+0x69/0xb70 [ 1109.998762] #1: (rcu_read_lock){......}, at: [<ffffffffa0001c90>] tipc_l2_rcv_msg+0x40/0x260 [tipc] [ 1109.998762] [ 1109.998762] stack backtrace: [ 1109.998762] CPU: 7 PID: 0 Comm: swapper/7 Not tainted 3.17.0-rc1+ #113 [ 1109.998762] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 [ 1109.998762] ffffffff82745830 ffff880016c03828 ffffffff81a209eb 0000000000000007 [ 1109.998762] ffff880017b3cac0 ffff880016c03888 ffffffff81a1c5ef 0000000000000001 [ 1109.998762] ffff880000000001 ffff880000000000 ffffffff81012d4f 0000000000000000 [ 1109.998762] Call Trace: [ 1109.998762] <IRQ> [<ffffffff81a209eb>] dump_stack+0x4e/0x68 [ 1109.998762] [<ffffffff81a1c5ef>] print_usage_bug+0x1f1/0x202 [ 1109.998762] [<ffffffff81012d4f>] ? save_stack_trace+0x2f/0x50 [ 1109.998762] [<ffffffff810a406c>] mark_lock+0x28c/0x2f0 [ 1109.998762] [<ffffffff810a3440>] ? print_irq_inversion_bug.part.46+0x1f0/0x1f0 [ 1109.998762] [<ffffffff810a467d>] __lock_acquire+0x5ad/0x1d80 [ 1109.998762] [<ffffffff810a70dd>] ? trace_hardirqs_on+0xd/0x10 [ 1109.998762] [<ffffffff8108ace8>] ? sched_clock_cpu+0x98/0xc0 [ 1109.998762] [<ffffffff8108ad2b>] ? local_clock+0x1b/0x30 [ 1109.998762] [<ffffffff810a10dc>] ? lock_release_holdtime.part.29+0x1c/0x1a0 [ 1109.998762] [<ffffffff8108aa05>] ? sched_clock_local+0x25/0x90 [ 1109.998762] [<ffffffffa000dec0>] ? tipc_sk_get+0x60/0x80 [tipc] [ 1109.998762] [<ffffffff810a6555>] lock_acquire+0x95/0x1e0 [ 1109.998762] [<ffffffffa0011969>] ? tipc_sk_rcv+0x49/0x2b0 [tipc] [ 1109.998762] [<ffffffff810a6fb6>] ? trace_hardirqs_on_caller+0xa6/0x1c0 [ 1109.998762] [<ffffffff81a2d1ce>] _raw_spin_lock+0x3e/0x80 [ 1109.998762] [<ffffffffa0011969>] ? tipc_sk_rcv+0x49/0x2b0 [tipc] [ 1109.998762] [<ffffffffa000dec0>] ? tipc_sk_get+0x60/0x80 [tipc] [ 1109.998762] [<ffffffffa0011969>] tipc_sk_rcv+0x49/0x2b0 [tipc] [ 1109.998762] [<ffffffffa00076bd>] tipc_rcv+0x5ed/0x960 [tipc] [ 1109.998762] [<ffffffffa0001d1c>] tipc_l2_rcv_msg+0xcc/0x260 [tipc] [ 1109.998762] [<ffffffffa0001c90>] ? tipc_l2_rcv_msg+0x40/0x260 [tipc] [ 1109.998762] [<ffffffff81783345>] __netif_receive_skb_core+0x5e5/0xb70 [ 1109.998762] [<ffffffff81782dc9>] ? __netif_receive_skb_core+0x69/0xb70 [ 1109.998762] [<ffffffff81784eb9>] ? dev_gro_receive+0x259/0x4e0 [ 1109.998762] [<ffffffff817838f6>] __netif_receive_skb+0x26/0x70 [ 1109.998762] [<ffffffff81783acd>] netif_receive_skb_internal+0x2d/0x1f0 [ 1109.998762] [<ffffffff81785518>] napi_gro_receive+0xd8/0x240 [ 1109.998762] [<ffffffff815bf854>] e1000_clean_rx_irq+0x2c4/0x530 [ 1109.998762] [<ffffffff815c1a46>] e1000_clean+0x266/0x9c0 [ 1109.998762] [<ffffffff8108ad2b>] ? local_clock+0x1b/0x30 [ 1109.998762] [<ffffffff8108aa05>] ? sched_clock_local+0x25/0x90 [ 1109.998762] [<ffffffff817842b1>] net_rx_action+0x141/0x310 [ 1109.998762] [<ffffffff810bd710>] ? handle_fasteoi_irq+0xe0/0x150 [ 1109.998762] [<ffffffff81059fa6>] __do_softirq+0x116/0x4d0 [ 1109.998762] [<ffffffff8105a626>] irq_exit+0x96/0xc0 [ 1109.998762] [<ffffffff81a30d07>] do_IRQ+0x67/0x110 [ 1109.998762] [<ffffffff81a2ee2f>] common_interrupt+0x6f/0x6f [ 1109.998762] <EOI> [<ffffffff8100d2b7>] ? default_idle+0x37/0x250 [ 1109.998762] [<ffffffff8100d2b5>] ? default_idle+0x35/0x250 [ 1109.998762] [<ffffffff8100dd1f>] arch_cpu_idle+0xf/0x20 [ 1109.998762] [<ffffffff810999fd>] cpu_startup_entry+0x27d/0x4d0 [ 1109.998762] [<ffffffff81034c78>] start_secondary+0x188/0x1f0 When intra-node messages are delivered from one process to another process, tipc_link_xmit() doesn't disable BH before it directly calls tipc_sk_rcv() on process context to forward messages to destination socket. Meanwhile, if messages delivered by remote node arrive at the node and their destinations are also the same socket, tipc_sk_rcv() running on process context might be preempted by tipc_sk_rcv() running BH context. As a result, the latter cannot obtain the socket lock as the lock was obtained by the former, however, the former has no chance to be run as the latter is owning the CPU now, so headlock happens. To avoid it, BH should be always disabled in tipc_sk_rcv(). Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-21 15:28:15 -04:00
Ying Xue	7b8613e0a1	tipc: fix a potential deadlock Locking dependency detected below possible unsafe locking scenario: CPU0 CPU1 T0: tipc_named_rcv() tipc_rcv() T1: [grab nametble write lock]* [grab node lock]* T2: tipc_update_nametbl() tipc_node_link_up() T3: tipc_nodesub_subscribe() tipc_nametbl_publish() T4: [grab node lock]* [grab nametble write lock]* The opposite order of holding nametbl write lock and node lock on above two different paths may result in a deadlock. If we move the the updating of the name table after link state named out of node lock, the reverse order of holding locks will be eliminated, and as a result, the deadlock risk. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-21 15:28:15 -04:00
Fabian Frederick	5c6761adc7	mac80211: remove unnecessary null test before debugfs_remove() The debugfs_remove() function can safely take NULL parameters so the additionally null test isn't required, and there's no other reason to have it here, so remove it. Signed-off-by: Fabian Frederick <fabf@skynet.be> [rewrite commit message, re-introduce blank line after assert] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-21 21:08:10 +02:00
Karl Beldan	9208247d74	mac80211: minstrel_ht: add basic support for VHT rates <= 3SS@80MHz When the new CONFIG_MAC80211_RC_MINSTREL_VHT is not set (default 'N'), there is no behavioral change including in sampling and MCS_GROUP_RATES remains 8. Otherwise MCS_GROUP_RATES is 10, and a module parameter vht_only (default 'true'), restricts the rates selection to VHT when VHT is supported. Regarding the debugfs stats buffer: It is explicitly increased from 8k to 32k to fit every rates incl. when both HT and VHT rates are enabled, as for the format, before: type rate tpt eprob prob ret ok(cum) ok( cum) HT20/LGI ABCDP MCS0 0.0 0.0 0.0 1 0( 0) 0( 0) after: type rate tpt eprob prob ret ok(cum) ok( cum) HT20/LGI ABCDP MCS0 0.0 0.0 0.0 1 0( 0) 0( 0) VHT40/LGI MCS5/2 0.0 0.0 0.0 0 0( 0) 0( 0) Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com> Cc: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-21 13:25:26 +02:00
Karl Beldan	3ec373c421	mac80211: minstrel_ht: include type (cck/ht) in rates flag ATM, we grep cck rates idx with idx / MCS_GROUP_RATES == MINSTREL_CCK_GROUP. Matching neither-cck-non-ht rates could be done by replacing '==' with '>', however it would be less versatile or explicit. This will allow to match VHT rates with IEEE80211_TX_RC_VHT_MCS. Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com> Cc: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-20 21:39:35 +02:00
Karl Beldan	8a0ee4fe19	mac80211: minstrel_ht: macros adjustments for future VHT_GROUPs No functional change. Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com> Cc: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-20 21:39:35 +02:00
Karl Beldan	d4d141cae8	mac80211: minstrel_ht: Increase the range of handled rate indexes Since `5935839ad7` ("mac80211: improve minstrel_ht rate sorting by throughput & probability"), the rate indexes are manipulated via u8's and hence allow for a maximum of 256 mcs_group entries in minstrel_mcs_groups. ATM, minstrel_ht advertizes support up to 3HTSS@40MHz, consuming: 8(MCS_GROUP_RATES) * (3(SS)2(GI)2(BW)+1(CCK)), i.e. 104 entries. Support for 3VHTSS@80MHz will require: 10(MCS_GROUP_RATES) * (3(SS)2(GI)2(BW)+1(CCK)) + 10(MCS_GROUP_RATES) * (3(SS)2(GI)3(BW)), i.e. 130 + 180 entries. This change moves from u8s to u16s where necessary. Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com> Cc: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-20 21:39:35 +02:00
Johannes Berg	8fa74e3aa6	Merge branch 'mac80211' into mac80211-next This was needed to avoid conflicts in the minstrel changes. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-20 21:39:29 +02:00
Johannes Berg	b08cc24e0a	mac80211: fix change flags variable signedness This showed up as a sparse warning (with higher verbosity) and is certainly correct - the change flags should be unsigned. It's not that important since high flag numbers aren't used and bitwise operations would still work. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-20 21:36:55 +02:00
Florian Westphal	f993bc25e5	net: core: handle encapsulation offloads when computing segment lengths if ->encapsulation is set we have to use inner_tcp_hdrlen and add the size of the inner network headers too. This is 'mostly harmless'; tbf might send skb that is slightly over quota or drop skb even if it would have fit. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-20 12:38:13 -04:00
Florian Westphal	330966e501	net: make skb_gso_segment error handling more robust skb_gso_segment has three possible return values: 1. a pointer to the first segmented skb 2. an errno value (IS_ERR()) 3. NULL. This can happen when GSO is used for header verification. However, several callers currently test IS_ERR instead of IS_ERR_OR_NULL and would oops when NULL is returned. Note that these call sites should never actually see such a NULL return value; all callers mask out the GSO bits in the feature argument. However, there have been issues with some protocol handlers erronously not respecting the specified feature mask in some cases. It is preferable to get 'have to turn off hw offloading, else slow' reports rather than 'kernel crashes'. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-20 12:38:13 -04:00
Florian Westphal	1e16aa3ddf	net: gso: use feature flag argument in all protocol gso handlers skb_gso_segment() has a 'features' argument representing offload features available to the output path. A few handlers, e.g. GRE, instead re-fetch the features of skb->dev and use those instead of the provided ones when handing encapsulation/tunnels. Depending on dev->hw_enc_features of the output device skb_gso_segment() can then return NULL even when the caller has disabled all GSO feature bits, as segmentation of inner header thinks device will take care of segmentation. This e.g. affects the tbf scheduler, which will silently drop GRE-encap GSO skbs that did not fit the remaining token quota as the segmentation does not work when device supports corresponding hw offload capabilities. Cc: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-20 12:38:12 -04:00
David S. Miller	ce8ec48967	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== netfilter fixes for net The following patchset contains netfilter fixes for your net tree, they are: 1) Fix missing MODULE_LICENSE() in the new nf_reject_ipv{4,6} modules. 2) Restrict nat and masq expressions to the nat chain type. Otherwise, users may crash their kernel if they attach a nat/masq rule to a non nat chain. 3) Fix hook validation in nft_compat when non-base chains are used. Basically, initialize hook_mask to zero. 4) Make sure you use match/targets in nft_compat from the right chain type. The existing validation relies on the table name which can be avoided by 5) Better netlink attribute validation in nft_nat. This expression has to reject the configuration when no address and proto configurations are specified. 6) Interpret NFTA_NAT_REG__MAX if only if NFTA_NAT_REG__MIN is set. Yet another sanity check to reject incorrect configurations from userspace. 7) Conditional NAT attribute dumping depending on the existing configuration. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-20 11:57:47 -04:00
Jouni Malinen	988568669d	cfg80211: Specify frame and reason code for NL80211_CMD_DEL_STATION The optional NL80211_ATTR_MGMT_SUBTYPE and NL80211_ATTR_REASON_CODE attributes can now be included in NL80211_CMD_DEL_STATION to indicate to the driver which frame (Deauthentication/Disassociation) and reason code in that frame should be used to indicate removal to the specific station. This is used by drivers that implement AP SME and generate those frames internally. Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-20 16:39:23 +02:00
Karl Beldan	11b2357d5d	mac80211: minstrels: fix buffer overflow in HT debugfs rc_stats ATM an HT rc_stats line is 106 chars. Times 8(MCS_GROUP_RATES)3(SS)2(GI)2(BW) + CCK(4), i.e. x100, this is well above the current 8192 - sizeof(ms) currently allocated. Fix this by squeezing the output as follows (not that we're short on memory but this also improves readability and range, the new format adds one more digit to ok/cum and ok/cum): - Before (HT) (106 ch): type rate throughput ewma prob this prob retry this succ/attempt success attempts CCK/LP 5.5M 0.0 0.0 0.0 0 0( 0) 0 0 HT20/LGI ABCDP MCS0 0.0 0.0 0.0 1 0( 0) 0 0 - After (75 ch): type rate tpt eprob prob ret ok(cum) ok( cum) CCK/LP 5.5M 0.0 0.0 0.0 0 0( 0) 0( 0) HT20/LGI ABCDP MCS0 0.0 0.0 0.0 1 0( 0) 0( 0) - Align non-HT format Before (non-HT) (83 ch): rate throughput ewma prob this prob this succ/attempt success attempts ABCDP 6 0.0 0.0 0.0 0( 0) 0 0 54 0.0 0.0 0.0 0( 0) 0 0 - After (61 ch): rate tpt eprob prob ok(cum) ok( cum) ABCDP 1 0.0 0.0 0.0 0( 0) 0( 0) 54 0.0 0.0 0.0 0( 0) 0( 0) This also adds dynamic checks for overflow, lowers the size of the non-HT request (allowing > 30 entries) and replaces the buddy-rounded allocations (s/sizeof(ms) + 8192/8192). Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com> Acked-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-20 16:37:01 +02:00
Jouni Malinen	89c771e5a6	cfg80211: Convert del_station() callback to use a param struct This makes it easier to add new parameters for the del_station calls without having to modify all drivers that use this. Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-20 16:24:21 +02:00
Wolfram Sang	140bbc4af0	net: rfkill: drop owner assignment from platform_drivers A platform_driver does not need to set an owner, it will be populated by the driver core. Signed-off-by: Wolfram Sang <wsa@the-dreams.de>	2014-10-20 16:21:58 +02:00
Wolfram Sang	0dd1153813	net: dsa: drop owner assignment from platform_drivers A platform_driver does not need to set an owner, it will be populated by the driver core. Signed-off-by: Wolfram Sang <wsa@the-dreams.de>	2014-10-20 16:21:58 +02:00
Linus Torvalds	e25b492741	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: "A quick batch of bug fixes: 1) Fix build with IPV6 disabled, from Eric Dumazet. 2) Several more cases of caching SKB data pointers across calls to pskb_may_pull(), thus referencing potentially free'd memory. From Li RongQing. 3) DSA phy code tests operation presence improperly, instead of going: if (x->ops->foo) r = x->ops->foo(args); it was going: if (x->ops->foo(args)) r = x->ops->foo(args); Fix from Andew Lunn" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: Net: DSA: Fix checking for get_phy_flags function ipv6: fix a potential use after free in sit.c ipv6: fix a potential use after free in ip6_offload.c ipv4: fix a potential use after free in gre_offload.c tcp: fix build error if IPv6 is not enabled	2014-10-19 11:41:57 -07:00
Andrew Lunn	228b16cb13	Net: DSA: Fix checking for get_phy_flags function The check for the presence or not of the optional switch function get_phy_flags() called the function, rather than checked to see if it is a NULL pointer. This causes a derefernce of a NULL pointer on all switch chips except the sf2, the only switch to implement this call. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Fixes: `6819563e64` ("net: dsa: allow switch drivers to specify phy_device::dev_flags") Cc: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-19 12:46:31 -04:00
Linus Torvalds	0e6e58f941	One cc: stable commit, the rest are a series of minor cleanups which have been sitting in MST's tree during my vacation. I changed a function name and made one trivial change, then they spent two days in linux-next. Thanks, Rusty. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJUQFBQAAoJENkgDmzRrbjxJRIP/1yCQRElQewxURSmJelyqCdU 0mHYB0R9Mf3tfre1xnofqs2lWeSMc/4ptKHsVR6pupoztSwnz7HsLHfEFvFJh4mj KsaqYElxkNxTcfyHwLjyJS0/J6tG1tYypXGiimTBS0bvFHL3XZdimVgJ6WvX+gO7 YSaDEX8/EqCERafslS5+gKJlz3drDOnCZCe9y4BDSmsvl2k7bkpSxIn8vsR6jIC0 c5JpUy6QVF+3XA/J932M7yRs+xpqxNoUWiyY3ar9o3CtQAaQB0ZAetSxY6hTfvVc GlNFzCifdsaQwsl2SVsE2h6tWaRhtMtcGWQuhHThIPyIf8XxhYyBRY2FLo70LMz1 eqtwy6F/Bg/nzUsdee4PZBMeoKHlAEL12RpsEKgfUoLzj16Aqa8ll+Agbglbkw8G f3d2FwzKAlpY5NwHETC1wYy52PJ3efqksRWuhokmYpxNSbHJS/lsiJOE7272/4Qr MtXuvRmo22tf34XFd5y7zqWjgZ58eeFOqQWi/K+6ZgpqVOvikjrXXKEuiVdjO0ZD kTVR/sQKiR+79rzENk80XBhWaMveECNXF1TiZ/3MmURkmEOBRQMxRQ20BX3exvna AJ/WVA5DcfXZc1yyqknE1NLGrvSBMJENH13x2QPwrqNWAryOOKuF1VKKIwWlDw5j vtx5nXiJa8YYdxI2TJCN =JK6x -----END PGP SIGNATURE----- Merge tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull virtio updates from Rusty Russell: "One cc: stable commit, the rest are a series of minor cleanups which have been sitting in MST's tree during my vacation. I changed a function name and made one trivial change, then they spent two days in linux-next" * tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (25 commits) virtio-rng: refactor probe error handling virtio_scsi: drop scan callback virtio_balloon: enable VQs early on restore virtio_scsi: fix race on device removal virito_scsi: use freezable WQ for events virtio_net: enable VQs early on restore virtio_console: enable VQs early on restore virtio_scsi: enable VQs early on restore virtio_blk: enable VQs early on restore virtio_scsi: move kick event out from virtscsi_init virtio_net: fix use after free on allocation failure 9p/trans_virtio: enable VQs early virtio_console: enable VQs early virtio_blk: enable VQs early virtio_net: enable VQs early virtio: add API to enable VQs early virtio_net: minor cleanup virtio-net: drop config_mutex virtio_net: drop config_enable virtio-blk: drop config_mutex ...	2014-10-18 10:25:09 -07:00
Li RongQing	a6d4518da3	ipv6: fix a potential use after free in sit.c pskb_may_pull() maybe change skb->data and make iph pointer oboslete, fix it by geting ip header length directly. Fixes: `ca15a078` (sit: generate icmpv6 error when receiving icmpv4 error) Cc: Oussama Ghorbel <ghorbel@pivasoftware.com> Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-18 13:04:09 -04:00
Li RongQing	fc6fb41cd6	ipv6: fix a potential use after free in ip6_offload.c pskb_may_pull() maybe change skb->data and make opth pointer oboslete, so set the opth again Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-18 13:04:08 -04:00
Li RongQing	b4e3cef703	ipv4: fix a potential use after free in gre_offload.c pskb_may_pull() may change skb->data and make greh pointer oboslete; so need to reassign greh; but since first calling pskb_may_pull already ensured that skb->data has enough space for greh, so move the reference of greh before second calling pskb_may_pull(), to avoid reassign greh. Fixes: 7a7ffbabf9("ipv4: fix tunneled VM traffic over hw VXLAN/GRE GSO NIC") Cc: Wei-Chun Chao <weichunc@plumgrid.com> Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-18 13:04:08 -04:00
Linus Torvalds	2e923b0251	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Include fixes for netrom and dsa (Fabian Frederick and Florian Fainelli) 2) Fix FIXED_PHY support in stmmac, from Giuseppe CAVALLARO. 3) Several SKB use after free fixes (vxlan, openvswitch, vxlan, ip_tunnel, fou), from Li ROngQing. 4) fec driver PTP support fixes from Luwei Zhou and Nimrod Andy. 5) Use after free in virtio_net, from Michael S Tsirkin. 6) Fix flow mask handling for megaflows in openvswitch, from Pravin B Shelar. 7) ISDN gigaset and capi bug fixes from Tilman Schmidt. 8) Fix route leak in ip_send_unicast_reply(), from Vasily Averin. 9) Fix two eBPF JIT bugs on x86, from Alexei Starovoitov. 10) TCP_SKB_CB() reorganization caused a few regressions, fixed by Cong Wang and Eric Dumazet. 11) Don't overwrite end of SKB when parsing malformed sctp ASCONF chunks, from Daniel Borkmann. 12) Don't call sock_kfree_s() with NULL pointers, this function also has the side effect of adjusting the socket memory usage. From Cong Wang. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (90 commits) bna: fix skb->truesize underestimation net: dsa: add includes for ethtool and phy_fixed definitions openvswitch: Set flow-key members. netrom: use linux/uaccess.h dsa: Fix conversion from host device to mii bus tipc: fix bug in bundled buffer reception ipv6: introduce tcp_v6_iif() sfc: add support for skb->xmit_more r8152: return -EBUSY for runtime suspend ipv4: fix a potential use after free in fou.c ipv4: fix a potential use after free in ip_tunnel_core.c hyperv: Add handling of IP header with option field in netvsc_set_hash() openvswitch: Create right mask with disabled megaflows vxlan: fix a free after use openvswitch: fix a use after free ipv4: dst_entry leak in ip_send_unicast_reply() ipv4: clean up cookie_v4_check() ipv4: share tcp_v4_save_options() with cookie_v4_check() ipv4: call __ip_options_echo() in cookie_v4_check() atm: simplify lanai.c by using module_pci_driver ...	2014-10-18 09:31:37 -07:00
Pablo Neira Ayuso	1e2d56a5d3	netfilter: nft_nat: dump attributes if they are set Dump NFTA_NAT_REG_ADDR_MIN if this is non-zero. Same thing with NFTA_NAT_REG_PROTO_MIN. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-10-18 14:16:13 +02:00
Pablo Neira Ayuso	61cfac6b42	netfilter: nft_nat: NFTA_NAT_REG_ADDR_MAX depends on NFTA_NAT_REG_ADDR_MIN Interpret NFTA_NAT_REG_ADDR_MAX if NFTA_NAT_REG_ADDR_MIN is present, otherwise, skip it. Same thing with NFTA_NAT_REG_PROTO_MAX. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-10-18 14:16:12 +02:00
Pablo Neira Ayuso	5c819a3975	netfilter: nft_nat: insufficient attribute validation We have to validate that we at least get an NFTA_NAT_REG_ADDR_MIN or NFTA_NFT_REG_PROTO_MIN attribute. Reject the configuration if none of them are present. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-10-18 14:16:11 +02:00
Pablo Neira Ayuso	f3f5ddeddd	netfilter: nft_compat: validate chain type in match/target We have to validate the real chain type to ensure that matches/targets are not used out from their scope (eg. MASQUERADE in nat chain type). The existing validation relies on the table name, but this is not sufficient since userspace can fool us by using the appropriate table name with a different chain type. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-10-18 14:14:07 +02:00
Florian Fainelli	a28205437b	net: dsa: add includes for ethtool and phy_fixed definitions net/dsa/slave.c uses functions and structures declared in phy_fixed.h but does not explicitely include it, while dsa.h needs structure declarations for 'struct ethtool_wolinfo' and 'struct ethtool_eee', fix those by including the correct header files. Fixes: `ec9436baed` ("net: dsa: allow drivers to do link adjustment") Fixes: `ce31b31c68` ("net: dsa: allow updating fixed PHY link information") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-17 23:54:46 -04:00
Pravin B Shelar	25ef1328a0	openvswitch: Set flow-key members. This patch adds missing memset which are required to initialize flow key member. For example for IP flow we need to initialize ip.frag for all cases. Found by inspection. This bug is introduced by commit `0714812134` ("openvswitch: Eliminate memset() from flow_extract"). Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-17 23:54:02 -04:00
Fabian Frederick	dc8e54165f	netrom: use linux/uaccess.h replace asm/uaccess.h by linux/uaccess.h Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-17 23:52:54 -04:00
Jon Paul Maloy	643566d4b4	tipc: fix bug in bundled buffer reception In commit `ec8a2e5621` ("tipc: same receive code path for connection protocol and data messages") we omitted the the possiblilty that an arriving message extracted from a bundle buffer may be a multicast message. Such messages need to be to be delivered to the socket via a separate function, tipc_sk_mcast_rcv(). As a result, small multicast messages arriving as members of a bundle buffer will be silently dropped. This commit corrects the error by considering this case in the function tipc_link_bundle_rcv(). Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-17 23:50:53 -04:00
Eric Dumazet	870c315138	ipv6: introduce tcp_v6_iif() Commit `971f10eca1` ("tcp: better TCP_SKB_CB layout to reduce cache line misses") added a regression for SO_BINDTODEVICE on IPv6. This is because we still use inet6_iif() which expects that IP6 control block is still at the beginning of skb->cb[] This patch adds tcp_v6_iif() helper and uses it where necessary. Because __inet6_lookup_skb() is used by TCP and DCCP, we add an iif parameter to it. Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: `971f10eca1` ("tcp: better TCP_SKB_CB layout to reduce cache line misses") Acked-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-17 23:48:07 -04:00
Li RongQing	d8f00d2710	ipv4: fix a potential use after free in fou.c pskb_may_pull() maybe change skb->data and make uh pointer oboslete, so reload uh and guehdr Fixes: `37dd0247` ("gue: Receive side for Generic UDP Encapsulation") Cc: Tom Herbert <therbert@google.com> Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-17 23:45:26 -04:00
Li RongQing	1245dfc8ca	ipv4: fix a potential use after free in ip_tunnel_core.c pskb_may_pull() maybe change skb->data and make eth pointer oboslete, so set eth after pskb_may_pull() Fixes:3d7b46cd("ip_tunnel: push generic protocol handling to ip_tunnel module") Cc: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-17 23:45:26 -04:00
Pravin B Shelar	f47de068f6	openvswitch: Create right mask with disabled megaflows If megaflows are disabled, the userspace does not send the netlink attribute OVS_FLOW_ATTR_MASK, and the kernel must create an exact match mask. sw_flow_mask_set() sets every bytes (in 'range') of the mask to 0xff, even the bytes that represent padding for struct sw_flow, or the bytes that represent fields that may not be set during ovs_flow_extract(). This is a problem, because when we extract a flow from a packet, we do not memset() anymore the struct sw_flow to 0. This commit gets rid of sw_flow_mask_set() and introduces mask_set_nlattr(), which operates on the netlink attributes rather than on the mask key. Using this approach we are sure that only the bytes that the user provided in the flow are matched. Also, if the parse_flow_mask_nlattrs() for the mask ENCAP attribute fails, we now return with an error. This bug is introduced by commit `0714812134` ("openvswitch: Eliminate memset() from flow_extract"). Reported-by: Alex Wang <alexw@nicira.com> Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com> Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-17 16:49:34 -04:00
Li RongQing	389f48947a	openvswitch: fix a use after free pskb_may_pull() called by arphdr_ok can change skb->data, so put the arp setting after arphdr_ok to avoid the use the freed memory Fixes: `0714812134` ("openvswitch: Eliminate memset() from flow_extract.") Cc: Jesse Gross <jesse@nicira.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Acked-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-17 16:21:53 -04:00
Vasily Averin	4062090e3e	ipv4: dst_entry leak in ip_send_unicast_reply() ip_setup_cork() called inside ip_append_data() steals dst entry from rt to cork and in case errors in __ip_append_data() nobody frees stolen dst entry Fixes: `2e77d89b2f` ("net: avoid a pair of dst_hold()/dst_release() in ip_append_data()") Signed-off-by: Vasily Averin <vvs@parallels.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-17 15:30:12 -04:00
Cong Wang	461b74c391	ipv4: clean up cookie_v4_check() We can retrieve opt from skb, no need to pass it as a parameter. And opt should always be non-NULL, no need to check. Cc: Krzysztof Kolasa <kkolasa@winsoft.pl> Cc: Eric Dumazet <edumazet@google.com> Tested-by: Krzysztof Kolasa <kkolasa@winsoft.pl> Signed-off-by: Cong Wang <cwang@twopensource.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-17 12:02:57 -04:00
Cong Wang	e25f866fbc	ipv4: share tcp_v4_save_options() with cookie_v4_check() cookie_v4_check() allocates ip_options_rcu in the same way with tcp_v4_save_options(), we can just make it a helper function. Cc: Krzysztof Kolasa <kkolasa@winsoft.pl> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Cong Wang <cwang@twopensource.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-17 12:02:57 -04:00
Cong Wang	2077eebf7d	ipv4: call __ip_options_echo() in cookie_v4_check() commit `971f10eca1` ("tcp: better TCP_SKB_CB layout to reduce cache line misses") missed that cookie_v4_check() still calls ip_options_echo() which uses IPCB(). It should use TCPCB() at TCP layer, so call __ip_options_echo() instead. Fixes: commit `971f10eca1` ("tcp: better TCP_SKB_CB layout to reduce cache line misses") Cc: Krzysztof Kolasa <kkolasa@winsoft.pl> Cc: Eric Dumazet <edumazet@google.com> Reported-by: Krzysztof Kolasa <kkolasa@winsoft.pl> Tested-by: Krzysztof Kolasa <kkolasa@winsoft.pl> Signed-off-by: Cong Wang <cwang@twopensource.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-17 12:02:57 -04:00
Fabian Frederick	4e8febd0a7	openvswitch: use vport instead of p All functions used struct vport *vport except ovs_vport_find_upcall_portid. This fixes 1 kerneldoc warning Signed-off-by: Fabian Frederick <fabf@skynet.be> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-15 23:25:33 -04:00
Fabian Frederick	7e78cc46b7	openvswitch: kerneldoc warning fix s/sock/gs Signed-off-by: Fabian Frederick <fabf@skynet.be> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-15 23:25:33 -04:00
Tom Herbert	04ffcb255f	net: Add ndo_gso_check Add ndo_gso_check which a device can define to indicate whether is is capable of doing GSO on a packet. This funciton would be called from the stack to determine whether software GSO is needed to be done. A driver should populate this function if it advertises GSO types for which there are combinations that it wouldn't be able to handle. For instance a device that performs UDP tunneling might only implement support for transparent Ethernet bridging type of inner packets or might have limitations on lengths of inner headers. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-15 12:11:00 -04:00
Linus Torvalds	0429fbc0bd	Merge branch 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu Pull percpu consistent-ops changes from Tejun Heo: "Way back, before the current percpu allocator was implemented, static and dynamic percpu memory areas were allocated and handled separately and had their own accessors. The distinction has been gone for many years now; however, the now duplicate two sets of accessors remained with the pointer based ones - this_cpu_() - evolving various other operations over time. During the process, we also accumulated other inconsistent operations. This pull request contains Christoph's patches to clean up the duplicate accessor situation. __get_cpu_var() uses are replaced with with this_cpu_ptr() and __this_cpu_ptr() with raw_cpu_ptr(). Unfortunately, the former sometimes is tricky thanks to C being a bit messy with the distinction between lvalues and pointers, which led to a rather ugly solution for cpumask_var_t involving the introduction of this_cpu_cpumask_var_ptr(). This converts most of the uses but not all. Christoph will follow up with the remaining conversions in this merge window and hopefully remove the obsolete accessors" 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (38 commits) irqchip: Properly fetch the per cpu offset percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t -fix ia64: sn_nodepda cannot be assigned to after this_cpu conversion. Use __this_cpu_write. percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t Revert "powerpc: Replace __get_cpu_var uses" percpu: Remove __this_cpu_ptr clocksource: Replace __this_cpu_ptr with raw_cpu_ptr sparc: Replace __get_cpu_var uses avr32: Replace __get_cpu_var with __this_cpu_write blackfin: Replace __get_cpu_var uses tile: Use this_cpu_ptr() for hardware counters tile: Replace __get_cpu_var uses powerpc: Replace __get_cpu_var uses alpha: Replace __get_cpu_var ia64: Replace __get_cpu_var uses s390: cio driver &__get_cpu_var replacements s390: Replace __get_cpu_var uses mips: Replace __get_cpu_var uses MIPS: Replace __get_cpu_var uses in FPU emulator. arm: Replace __this_cpu_ptr with raw_cpu_ptr ...	2014-10-15 07:48:18 +02:00
Linus Torvalds	6b04908166	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph updates from Sage Weil: "There is the long-awaited discard support for RBD (Guangliang Zhao, Josh Durgin), a pile of RBD bug fixes that didn't belong in late -rc's (Ilya Dryomov, Li RongQing), a pile of fs/ceph bug fixes and performance and debugging improvements (Yan, Zheng, John Spray), and a smattering of cleanups (Chao Yu, Fabian Frederick, Joe Perches)" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (40 commits) ceph: fix divide-by-zero in __validate_layout() rbd: rbd workqueues need a resque worker libceph: ceph-msgr workqueue needs a resque worker ceph: fix bool assignments libceph: separate multiple ops with commas in debugfs output libceph: sync osd op definitions in rados.h libceph: remove redundant declaration ceph: additional debugfs output ceph: export ceph_session_state_name function ceph: include the initial ACL in create/mkdir/mknod MDS requests ceph: use pagelist to present MDS request data libceph: reference counting pagelist ceph: fix llistxattr on symlink ceph: send client metadata to MDS ceph: remove redundant code for max file size verification ceph: remove redundant io_iter_advance() ceph: move ceph_find_inode() outside the s_mutex ceph: request xattrs if xattr_version is zero rbd: set the remaining discard properties to enable support rbd: use helpers to handle discard for layered images correctly ...	2014-10-15 06:46:01 +02:00
Michael S. Tsirkin	64b4cc3911	9p/trans_virtio: enable VQs early virtio spec requires drivers to set DRIVER_OK before using VQs. This is set automatically after probe returns, but virtio 9p device adds self to channel list within probe, at which point VQ can be used in violation of the spec. To fix, call virtio_device_ready before using VQs. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2014-10-15 10:25:04 +10:30
Eric Dumazet	9b462d02d6	tcp: TCP Small Queues and strange attractors TCP Small queues tries to keep number of packets in qdisc as small as possible, and depends on a tasklet to feed following packets at TX completion time. Choice of tasklet was driven by latencies requirements. Then, TCP stack tries to avoid reorders, by locking flows with outstanding packets in qdisc in a given TX queue. What can happen is that many flows get attracted by a low performing TX queue, and cpu servicing TX completion has to feed packets for all of them, making this cpu 100% busy in softirq mode. This became particularly visible with latest skb->xmit_more support Strategy adopted in this patch is to detect when tcp_wfree() is called from ksoftirqd and let the outstanding queue for this flow being drained before feeding additional packets, so that skb->ooo_okay can be set to allow select_queue() to select the optimal queue : Incoming ACKS are normally handled by different cpus, so this patch gives more chance for these cpus to take over the burden of feeding qdisc with future packets. Tested: lpaa23:~# ./super_netperf 1400 --google-pacing-rate 3028000 -H lpaa24 -l 3600 & lpaa23:~# sar -n DEV 1 10 \| grep eth1 06:16:18 AM eth1 595448.00 1190564.00 38381.09 1760253.12 0.00 0.00 1.00 06:16:19 AM eth1 594858.00 1189686.00 38340.76 1758952.72 0.00 0.00 0.00 06:16:20 AM eth1 597017.00 1194019.00 38480.79 1765370.29 0.00 0.00 1.00 06:16:21 AM eth1 595450.00 1190936.00 38380.19 1760805.05 0.00 0.00 0.00 06:16:22 AM eth1 596385.00 1193096.00 38442.56 1763976.29 0.00 0.00 1.00 06:16:23 AM eth1 598155.00 1195978.00 38552.97 1768264.60 0.00 0.00 0.00 06:16:24 AM eth1 594405.00 1188643.00 38312.57 1757414.89 0.00 0.00 1.00 06:16:25 AM eth1 593366.00 1187154.00 38252.16 1755195.83 0.00 0.00 0.00 06:16:26 AM eth1 593188.00 1186118.00 38232.88 1753682.57 0.00 0.00 1.00 06:16:27 AM eth1 596301.00 1192241.00 38440.94 1762733.09 0.00 0.00 0.00 Average: eth1 595457.30 1190843.50 38381.69 1760664.84 0.00 0.00 0.50 lpaa23:~# ./tc -s -d qd sh dev eth1 \| grep backlog backlog 7606336b 2513p requeues 167982 backlog 224072b 74p requeues 566 backlog 581376b 192p requeues 5598 backlog 181680b 60p requeues 1070 backlog 5305056b 1753p requeues 110166 // Here, this TX queue is attracting flows backlog 157456b 52p requeues 1758 backlog 672216b 222p requeues 3025 backlog 60560b 20p requeues 24541 backlog 448144b 148p requeues 21258 lpaa23:~# echo 1 >/proc/sys/net/ipv4/tcp_tsq_enable_tcp_wfree_ksoftirqd_detect Immediate jump to full bandwidth, and traffic is properly shard on all tx queues. lpaa23:~# sar -n DEV 1 10 \| grep eth1 06:16:46 AM eth1 1397632.00 2795397.00 90081.87 4133031.26 0.00 0.00 1.00 06:16:47 AM eth1 1396874.00 2793614.00 90032.99 4130385.46 0.00 0.00 0.00 06:16:48 AM eth1 1395842.00 2791600.00 89966.46 4127409.67 0.00 0.00 1.00 06:16:49 AM eth1 1395528.00 2791017.00 89946.17 4126551.24 0.00 0.00 0.00 06:16:50 AM eth1 1397891.00 2795716.00 90098.74 4133497.39 0.00 0.00 1.00 06:16:51 AM eth1 1394951.00 2789984.00 89908.96 4125022.51 0.00 0.00 0.00 06:16:52 AM eth1 1394608.00 2789190.00 89886.90 4123851.36 0.00 0.00 1.00 06:16:53 AM eth1 1395314.00 2790653.00 89934.33 4125983.09 0.00 0.00 0.00 06:16:54 AM eth1 1396115.00 2792276.00 89984.25 4128411.21 0.00 0.00 1.00 06:16:55 AM eth1 1396829.00 2793523.00 90030.19 4130250.28 0.00 0.00 0.00 Average: eth1 1396158.40 2792297.00 89987.09 4128439.35 0.00 0.00 0.50 lpaa23:~# tc -s -d qd sh dev eth1 \| grep backlog backlog 7900052b 2609p requeues 173287 backlog 878120b 290p requeues 589 backlog 1068884b 354p requeues 5621 backlog 996212b 329p requeues 1088 backlog 984100b 325p requeues 115316 backlog 956848b 316p requeues 1781 backlog 1080996b 357p requeues 3047 backlog 975016b 322p requeues 24571 backlog 990156b 327p requeues 21274 (All 8 TX queues get a fair share of the traffic) Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 17:16:26 -04:00
David S. Miller	e53da5fbfc	net: Trap attempts to call sock_kfree_s() with a NULL pointer. Unlike normal kfree() it is never right to call sock_kfree_s() with a NULL pointer, because sock_kfree_s() also has the side effect of discharging the memory from the sockets quota. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 17:02:37 -04:00
Cong Wang	dee49f203a	rds: avoid calling sock_kfree_s() on allocation failure It is okay to free a NULL pointer but not okay to mischarge the socket optmem accounting. Compile test only. Reported-by: rucsoftsec@gmail.com Cc: Chien Yen <chien.yen@oracle.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Cong Wang <cwang@twopensource.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 17:00:19 -04:00
Fabian Frederick	91c4467e3c	caif_usb: use target structure member in memset parent cfusbl was used instead of first structure member 'layer' Suggested-by: Joe Perches <joe@perches.com> Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 16:05:45 -04:00
Fabian Frederick	7970f1918f	caif_usb: remove redundant memory message Let MM subsystem display out of memory messages. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 16:05:45 -04:00
Fabian Frederick	6ff1e1e3c8	caif: replace kmalloc/memset 0 by kzalloc Also add blank line after declaration Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 16:04:07 -04:00
Jiri Pirko	f76936d07c	ipv4: fix nexthop attlen check in fib_nh_match fib_nh_match does not match nexthops correctly. Example: ip route add 172.16.10/24 nexthop via 192.168.122.12 dev eth0 \ nexthop via 192.168.122.13 dev eth0 ip route del 172.16.10/24 nexthop via 192.168.122.14 dev eth0 \ nexthop via 192.168.122.15 dev eth0 Del command is successful and route is removed. After this patch applied, the route is correctly matched and result is: RTNETLINK answers: No such process Please consider this for stable trees as well. Fixes: `4e902c5741` ("[IPv4]: FIB configuration using struct fib_config") Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 15:59:37 -04:00
Eric Dumazet	ad971f616a	tcp: fix tcp_ack() performance problem We worked hard to improve tcp_ack() performance, by not accessing skb_shinfo() in fast path (`cd7d8498c9` tcp: change tcp_skb_pcount() location) We still have one spurious access because of ACK timestamping, added in commit `e1c8a607b2` ("net-timestamp: ACK timestamp for bytestreams") By checking if sk_tsflags has SOF_TIMESTAMPING_TX_ACK set, we can avoid two cache line misses for the common case. While we are at it, add two prefetchw() : One in tcp_ack() to bring skb at the head of write queue. One in tcp_clean_rtx_queue() loop to bring following skb, as we will delete skb from the write queue and dirty skb->next->prev. Add a couple of [un]likely() clauses. After this patch, tcp_ack() is no longer the most consuming function in tcp stack. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Van Jacobson <vanj@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 15:59:37 -04:00
Ilya Dryomov	f9865f06f7	libceph: ceph-msgr workqueue needs a resque worker Commit `f363e45fd1` ("net/ceph: make ceph_msgr_wq non-reentrant") effectively removed WQ_MEM_RECLAIM flag from ceph_msgr_wq. This is wrong - libceph is very much a memory reclaim path, so restore it. Cc: stable@vger.kernel.org # needs backporting for < 3.12 Signed-off-by: Ilya Dryomov <idryomov@redhat.com> Tested-by: Micha Krause <micha@krausam.de> Reviewed-by: Sage Weil <sage@redhat.com>	2014-10-14 12:57:04 -07:00
Ilya Dryomov	25f897773b	libceph: separate multiple ops with commas in debugfs output For requests with multiple ops, separate ops with commas instead of \t, which is a field separator here. Signed-off-by: Ilya Dryomov <idryomov@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com>	2014-10-14 12:57:03 -07:00
Ilya Dryomov	70b5bfa360	libceph: sync osd op definitions in rados.h Bring in missing osd ops and strings, use macros to eliminate multiple points of maintenance. Signed-off-by: Ilya Dryomov <idryomov@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com>	2014-10-14 12:57:02 -07:00
Yan, Zheng	e4339d28f6	libceph: reference counting pagelist this allow pagelist to present data that may be sent multiple times. Signed-off-by: Yan, Zheng <zyan@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com>	2014-10-14 12:56:48 -07:00
Li RongQing	02ea80741a	ipv6: remove aca_lock spinlock from struct ifacaddr6 no user uses this lock. Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 13:15:15 -04:00
Eric Dumazet	b2532eb9ab	tcp: fix ooo_okay setting vs Small Queues TCP Small Queues (tcp_tsq_handler()) can hold one reference on sk->sk_wmem_alloc, preventing skb->ooo_okay being set. We should relax test done to set skb->ooo_okay to take care of this extra reference. Minimal truesize of skb containing one byte of payload is SKB_TRUESIZE(1) Without this fix, we have more chance locking flows into the wrong transmit queue. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 13:12:00 -04:00
Ilya Dryomov	91883cd27c	libceph: don't try checking queue_work() return value queue_work() doesn't "fail to queue", it returns false if work was already on a queue, which can't happen here since we allocate event_work right before we queue it. So don't bother at all. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Alex Elder <elder@linaro.org>	2014-10-14 21:03:25 +04:00
Joe Perches	b9a678994b	libceph: Convert pr_warning to pr_warn Use the more common pr_warn. Other miscellanea: o Coalesce formats o Realign arguments Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>	2014-10-14 21:03:23 +04:00
Li RongQing	589506f1e7	libceph: fix a use after free issue in osdmap_set_max_osd If the state variable is krealloced successfully, map->osd_state will be freed, once following two reallocation failed, and exit the function without resetting map->osd_state, map->osd_state become a wild pointer. fix it by resetting them after krealloc successfully. Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>	2014-10-14 21:03:21 +04:00
Ilya Dryomov	dc220db03f	libceph: select CRYPTO_CBC in addition to CRYPTO_AES We want "cbc(aes)" algorithm, so select CRYPTO_CBC too, not just CRYPTO_AES. Otherwise on !CRYPTO_CBC kernels we fail rbd map/mount with libceph: error -2 building auth method x request Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>	2014-10-14 21:03:20 +04:00
Ilya Dryomov	2cc6128ab2	libceph: resend lingering requests with a new tid Both not yet registered (r_linger && list_empty(&r_linger_item)) and registered linger requests should use the new tid on resend to avoid the dup op detection logic on the OSDs, yet we were doing this only for "registered" case. Factor out and simplify the "registered" logic and use the new helper for "not registered" case as well. Fixes: http://tracker.ceph.com/issues/8806 Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Alex Elder <elder@linaro.org>	2014-10-14 21:03:19 +04:00
Ilya Dryomov	f671b581f1	libceph: abstract out ceph_osd_request enqueue logic Introduce __enqueue_request() and switch to it. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Alex Elder <elder@linaro.org>	2014-10-14 21:03:18 +04:00
Daniel Borkmann	26b87c7881	net: sctp: fix remote memory pressure from excessive queueing This scenario is not limited to ASCONF, just taken as one example triggering the issue. When receiving ASCONF probes in the form of ... -------------- INIT[ASCONF; ASCONF_ACK] -------------> <----------- INIT-ACK[ASCONF; ASCONF_ACK] ------------ -------------------- COOKIE-ECHO --------------------> <-------------------- COOKIE-ACK --------------------- ---- ASCONF_a; [ASCONF_b; ...; ASCONF_n;] JUNK ------> [...] ---- ASCONF_m; [ASCONF_o; ...; ASCONF_z;] JUNK ------> ... where ASCONF_a, ASCONF_b, ..., ASCONF_z are good-formed ASCONFs and have increasing serial numbers, we process such ASCONF chunk(s) marked with !end_of_packet and !singleton, since we have not yet reached the SCTP packet end. SCTP does only do verification on a chunk by chunk basis, as an SCTP packet is nothing more than just a container of a stream of chunks which it eats up one by one. We could run into the case that we receive a packet with a malformed tail, above marked as trailing JUNK. All previous chunks are here goodformed, so the stack will eat up all previous chunks up to this point. In case JUNK does not fit into a chunk header and there are no more other chunks in the input queue, or in case JUNK contains a garbage chunk header, but the encoded chunk length would exceed the skb tail, or we came here from an entirely different scenario and the chunk has pdiscard=1 mark (without having had a flush point), it will happen, that we will excessively queue up the association's output queue (a correct final chunk may then turn it into a response flood when flushing the queue ;)): I ran a simple script with incremental ASCONF serial numbers and could see the server side consuming excessive amount of RAM [before/after: up to 2GB and more]. The issue at heart is that the chunk train basically ends with !end_of_packet and !singleton markers and since commit `2e3216cd54` ("sctp: Follow security requirement of responding with 1 packet") therefore preventing an output queue flush point in sctp_do_sm() -> sctp_cmd_interpreter() on the input chunk (chunk = event_arg) even though local_cork is set, but its precedence has changed since then. In the normal case, the last chunk with end_of_packet=1 would trigger the queue flush to accommodate possible outgoing bundling. In the input queue, sctp_inq_pop() seems to do the right thing in terms of discarding invalid chunks. So, above JUNK will not enter the state machine and instead be released and exit the sctp_assoc_bh_rcv() chunk processing loop. It's simply the flush point being missing at loop exit. Adding a try-flush approach on the output queue might not work as the underlying infrastructure might be long gone at this point due to the side-effect interpreter run. One possibility, albeit a bit of a kludge, would be to defer invalid chunk freeing into the state machine in order to possibly trigger packet discards and thus indirectly a queue flush on error. It would surely be better to discard chunks as in the current, perhaps better controlled environment, but going back and forth, it's simply architecturally not possible. I tried various trailing JUNK attack cases and it seems to look good now. Joint work with Vlad Yasevich. Fixes: `2e3216cd54` ("sctp: Follow security requirement of responding with 1 packet") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 12:46:22 -04:00
Daniel Borkmann	b69040d8e3	net: sctp: fix panic on duplicate ASCONF chunks When receiving a e.g. semi-good formed connection scan in the form of ... -------------- INIT[ASCONF; ASCONF_ACK] -------------> <----------- INIT-ACK[ASCONF; ASCONF_ACK] ------------ -------------------- COOKIE-ECHO --------------------> <-------------------- COOKIE-ACK --------------------- ---------------- ASCONF_a; ASCONF_b -----------------> ... where ASCONF_a equals ASCONF_b chunk (at least both serials need to be equal), we panic an SCTP server! The problem is that good-formed ASCONF chunks that we reply with ASCONF_ACK chunks are cached per serial. Thus, when we receive a same ASCONF chunk twice (e.g. through a lost ASCONF_ACK), we do not need to process them again on the server side (that was the idea, also proposed in the RFC). Instead, we know it was cached and we just resend the cached chunk instead. So far, so good. Where things get nasty is in SCTP's side effect interpreter, that is, sctp_cmd_interpreter(): While incoming ASCONF_a (chunk = event_arg) is being marked !end_of_packet and !singleton, and we have an association context, we do not flush the outqueue the first time after processing the ASCONF_ACK singleton chunk via SCTP_CMD_REPLY. Instead, we keep it queued up, although we set local_cork to 1. Commit `2e3216cd54` changed the precedence, so that as long as we get bundled, incoming chunks we try possible bundling on outgoing queue as well. Before this commit, we would just flush the output queue. Now, while ASCONF_a's ASCONF_ACK sits in the corked outq, we continue to process the same ASCONF_b chunk from the packet. As we have cached the previous ASCONF_ACK, we find it, grab it and do another SCTP_CMD_REPLY command on it. So, effectively, we rip the chunk->list pointers and requeue the same ASCONF_ACK chunk another time. Since we process ASCONF_b, it's correctly marked with end_of_packet and we enforce an uncork, and thus flush, thus crashing the kernel. Fix it by testing if the ASCONF_ACK is currently pending and if that is the case, do not requeue it. When flushing the output queue we may relink the chunk for preparing an outgoing packet, but eventually unlink it when it's copied into the skb right before transmission. Joint work with Vlad Yasevich. Fixes: `2e3216cd54` ("sctp: Follow security requirement of responding with 1 packet") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 12:46:22 -04:00
Daniel Borkmann	9de7922bc7	net: sctp: fix skb_over_panic when receiving malformed ASCONF chunks Commit `6f4c618ddb` ("SCTP : Add paramters validity check for ASCONF chunk") added basic verification of ASCONF chunks, however, it is still possible to remotely crash a server by sending a special crafted ASCONF chunk, even up to pre 2.6.12 kernels: skb_over_panic: text:ffffffffa01ea1c3 len:31056 put:30768 head:ffff88011bd81800 data:ffff88011bd81800 tail:0x7950 end:0x440 dev:<NULL> ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:129! [...] Call Trace: <IRQ> [<ffffffff8144fb1c>] skb_put+0x5c/0x70 [<ffffffffa01ea1c3>] sctp_addto_chunk+0x63/0xd0 [sctp] [<ffffffffa01eadaf>] sctp_process_asconf+0x1af/0x540 [sctp] [<ffffffff8152d025>] ? _read_unlock_bh+0x15/0x20 [<ffffffffa01e0038>] sctp_sf_do_asconf+0x168/0x240 [sctp] [<ffffffffa01e3751>] sctp_do_sm+0x71/0x1210 [sctp] [<ffffffff8147645d>] ? fib_rules_lookup+0xad/0xf0 [<ffffffffa01e6b22>] ? sctp_cmp_addr_exact+0x32/0x40 [sctp] [<ffffffffa01e8393>] sctp_assoc_bh_rcv+0xd3/0x180 [sctp] [<ffffffffa01ee986>] sctp_inq_push+0x56/0x80 [sctp] [<ffffffffa01fcc42>] sctp_rcv+0x982/0xa10 [sctp] [<ffffffffa01d5123>] ? ipt_local_in_hook+0x23/0x28 [iptable_filter] [<ffffffff8148bdc9>] ? nf_iterate+0x69/0xb0 [<ffffffff81496d10>] ? ip_local_deliver_finish+0x0/0x2d0 [<ffffffff8148bf86>] ? nf_hook_slow+0x76/0x120 [<ffffffff81496d10>] ? ip_local_deliver_finish+0x0/0x2d0 [<ffffffff81496ded>] ip_local_deliver_finish+0xdd/0x2d0 [<ffffffff81497078>] ip_local_deliver+0x98/0xa0 [<ffffffff8149653d>] ip_rcv_finish+0x12d/0x440 [<ffffffff81496ac5>] ip_rcv+0x275/0x350 [<ffffffff8145c88b>] __netif_receive_skb+0x4ab/0x750 [<ffffffff81460588>] netif_receive_skb+0x58/0x60 This can be triggered e.g., through a simple scripted nmap connection scan injecting the chunk after the handshake, for example, ... -------------- INIT[ASCONF; ASCONF_ACK] -------------> <----------- INIT-ACK[ASCONF; ASCONF_ACK] ------------ -------------------- COOKIE-ECHO --------------------> <-------------------- COOKIE-ACK --------------------- ------------------ ASCONF; UNKNOWN ------------------> ... where ASCONF chunk of length 280 contains 2 parameters ... 1) Add IP address parameter (param length: 16) 2) Add/del IP address parameter (param length: 255) ... followed by an UNKNOWN chunk of e.g. 4 bytes. Here, the Address Parameter in the ASCONF chunk is even missing, too. This is just an example and similarly-crafted ASCONF chunks could be used just as well. The ASCONF chunk passes through sctp_verify_asconf() as all parameters passed sanity checks, and after walking, we ended up successfully at the chunk end boundary, and thus may invoke sctp_process_asconf(). Parameter walking is done with WORD_ROUND() to take padding into account. In sctp_process_asconf()'s TLV processing, we may fail in sctp_process_asconf_param() e.g., due to removal of the IP address that is also the source address of the packet containing the ASCONF chunk, and thus we need to add all TLVs after the failure to our ASCONF response to remote via helper function sctp_add_asconf_response(), which basically invokes a sctp_addto_chunk() adding the error parameters to the given skb. When walking to the next parameter this time, we proceed with ... length = ntohs(asconf_param->param_hdr.length); asconf_param = (void )asconf_param + length; ... instead of the WORD_ROUND()'ed length, thus resulting here in an off-by-one that leads to reading the follow-up garbage parameter length of 12336, and thus throwing an skb_over_panic for the reply when trying to sctp_addto_chunk() next time, which implicitly calls the skb_put() with that length. Fix it by using sctp_walk_params() [ which is also used in INIT parameter processing ] macro in the verification and* in ASCONF processing: it will make sure we don't spill over, that we walk parameters WORD_ROUND()'ed. Moreover, we're being more defensive and guard against unknown parameter types and missized addresses. Joint work with Vlad Yasevich. Fixes: b896b82be4ae ("[SCTP] ADDIP: Support for processing incoming ASCONF_ACK chunks.") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-14 12:46:22 -04:00
Pablo Neira Ayuso	493618a92c	netfilter: nft_compat: fix hook validation for non-base chains Set hook_mask to zero for non-base chains, otherwise people may hit bogus errors from the xt_check_target() and xt_check_match() when validating the uninitialized hook_mask. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-10-14 12:52:40 +02:00
Karl Beldan	c7abf25af0	mac80211: fix typo in starting baserate for rts_cts_rate_idx It affects non-(V)HT rates and can lead to selecting an rts_cts rate that is not a basic rate or way superior to the reference rate (ATM rates[0] used for the 1st attempt of the protected frame data). E.g, assuming drivers register growing (bitrate) sorted tables of ieee80211_rate-s, having : - rates[0].idx == d'2 and basic_rates == b'10100 will select rts_cts idx b'10011 & ~d'(BIT(2)-1), i.e. 1, likewise - rates[0].idx == d'2 and basic_rates == b'10001 will select rts_cts idx b'10000 The first is not a basic rate and the second is > rates[0]. Also, wrt severity of the addressed misbehavior, ATM we only have one rts_cts_rate_idx rather than one per rate table entry, so this idx might still point to bitrates > rates[1..MAX_RATES]. Fixes: `5253ffb8c9` ("mac80211: always pick a basic rate to tx RTS/CTS for pre-HT rates") Cc: stable@vger.kernel.org Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-14 11:16:16 +02:00
Andy Shevchenko	5df1415aee	lib80211: remove unused print_ssid() In kernel we have %*pE specifier to print an escaped buffer. All users now switched to that approach. This fixes a bug as well. The current implementation wrongly prints octal numbers: only two first digits are used in case when 3 are required and the rest of the string ends up cut off. Additionally by default the \f, \v, \a, and \e are escaped to their alphabetic representation. It's safe to do since it is currently used for messaging only. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: "John W . Linville" <linville@tuxdriver.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-14 02:18:27 +02:00
Rasmus Villemoes	b7a8d756fb	batman-adv: replace strnicmp with strncasecmp The kernel used to contain two functions for length-delimited, case-insensitive string comparison, strnicmp with correct semantics and a slightly buggy strncasecmp. The latter is the POSIX name, so strnicmp was renamed to strncasecmp, and strnicmp made into a wrapper for the new strncasecmp to avoid breaking existing users. To allow the compat wrapper strnicmp to be removed at some point in the future, and to avoid the extra indirection cost, do s/strnicmp/strncasecmp/g. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Marek Lindner <mareklindner@neomailbox.ch> Acked-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-14 02:18:24 +02:00
Rasmus Villemoes	18082746a2	netfilter: replace strnicmp with strncasecmp The kernel used to contain two functions for length-delimited, case-insensitive string comparison, strnicmp with correct semantics and a slightly buggy strncasecmp. The latter is the POSIX name, so strnicmp was renamed to strncasecmp, and strnicmp made into a wrapper for the new strncasecmp to avoid breaking existing users. To allow the compat wrapper strnicmp to be removed at some point in the future, and to avoid the extra indirection cost, do s/strnicmp/strncasecmp/g. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-14 02:18:24 +02:00
Pablo Neira Ayuso	7210e4e38f	netfilter: nf_tables: restrict nat/masq expressions to nat chain type This adds the missing validation code to avoid the use of nat/masq from non-nat chains. The validation assumes two possible configuration scenarios: 1) Use of nat from base chain that is not of nat type. Reject this configuration from the nft__init() path of the expression. 2) Use of nat from non-base chain. In this case, we have to wait until the non-base chain is referenced by at least one base chain via jump/goto. This is resolved from the nft__validate() path which is called from nf_tables_check_loops(). The user gets an -EOPNOTSUPP in both cases. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-10-13 20:42:00 +02:00
Linus Torvalds	77c688ac87	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs updates from Al Viro: "The big thing in this pile is Eric's unmount-on-rmdir series; we finally have everything we need for that. The final piece of prereqs is delayed mntput() - now filesystem shutdown always happens on shallow stack. Other than that, we have several new primitives for iov_iter (Matt Wilcox, culled from his XIP-related series) pushing the conversion to ->read_iter()/ ->write_iter() a bit more, a bunch of fs/dcache.c cleanups and fixes (including the external name refcounting, which gives consistent behaviour of d_move() wrt procfs symlinks for long and short names alike) and assorted cleanups and fixes all over the place. This is just the first pile; there's a lot of stuff from various people that ought to go in this window. Starting with unionmount/overlayfs mess... ;-/" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (60 commits) fs/file_table.c: Update alloc_file() comment vfs: Deduplicate code shared by xattr system calls operating on paths reiserfs: remove pointless forward declaration of struct nameidata don't need that forward declaration of struct nameidata in dcache.h anymore take dname_external() into fs/dcache.c let path_init() failures treated the same way as subsequent link_path_walk() fix misuses of f_count() in ppp and netlink ncpfs: use list_for_each_entry() for d_subdirs walk vfs: move getname() from callers to do_mount() gfs2_atomic_open(): skip lookups on hashed dentry [infiniband] remove pointless assignments gadgetfs: saner API for gadgetfs_create_file() f_fs: saner API for ffs_sb_create_file() jfs: don't hash direct inode [s390] remove pointless assignment of ->f_op in vmlogrdr ->open() ecryptfs: ->f_op is never NULL android: ->f_op is never NULL nouveau: __iomem misannotations missing annotation in fs/file.c fs: namespace: suppress 'may be used uninitialized' warnings ...	2014-10-13 11:28:42 +02:00
Linus Torvalds	5e40d331bd	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris. Mostly ima, selinux, smack and key handling updates. * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (65 commits) integrity: do zero padding of the key id KEYS: output last portion of fingerprint in /proc/keys KEYS: strip 'id:' from ca_keyid KEYS: use swapped SKID for performing partial matching KEYS: Restore partial ID matching functionality for asymmetric keys X.509: If available, use the raw subjKeyId to form the key description KEYS: handle error code encoded in pointer selinux: normalize audit log formatting selinux: cleanup error reporting in selinux_nlmsg_perm() KEYS: Check hex2bin()'s return when generating an asymmetric key ID ima: detect violations for mmaped files ima: fix race condition on ima_rdwr_violation_check and process_measurement ima: added ima_policy_flag variable ima: return an error code from ima_add_boot_aggregate() ima: provide 'ima_appraise=log' kernel option ima: move keyring initialization to ima_init() PKCS#7: Handle PKCS#7 messages that contain no X.509 certs PKCS#7: Better handling of unsupported crypto KEYS: Overhaul key identification when searching for asymmetric keys KEYS: Implement binary asymmetric key ID handling ...	2014-10-12 10:13:55 -04:00
Linus Torvalds	ca321885b0	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: "This set fixes a bunch of fallout from the changes that went in during this merge window, particularly: - Fix fsl_pq_mdio (Claudiu Manoil) and fm10k (Pranith Kumar) build failures. - Several networking drivers do atomic_set() on page counts where that's not exactly legal. From Eric Dumazet. - Make __skb_flow_get_ports() work cleanly with unaligned data, from Alexander Duyck. - Fix some kernel-doc buglets in rfkill and netlabel, from Fabian Frederick. - Unbalanced enable_irq_wake usage in bcmgenet and systemport drivers, from Florian Fainelli. - pxa168_eth needs to depend on HAS_DMA, from Geert Uytterhoeven. - Multi-dequeue in the qdisc layer severely bypasses the fairness limits the previous code used to enforce, reintroduce in a way that at the same time doesn't compromise bulk dequeue opportunities. From Jesper Dangaard Brouer. - macvlan receive path unnecessarily hops through a softirq by using netif_rx() instead of netif_receive_skb(). From Jason Baron" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (51 commits) net: systemport: avoid unbalanced enable_irq_wake calls net: bcmgenet: avoid unbalanced enable_irq_wake calls net: bcmgenet: fix off-by-one in incrementing read pointer net: fix races in page->_count manipulation mlx4: fix race accessing page->_count ixgbe: fix race accessing page->_count igb: fix race accessing page->_count fm10k: fix race accessing page->_count net/phy: micrel: Add clock support for KSZ8021/KSZ8031 flow-dissector: Fix alignment issue in __skb_flow_get_ports net: filter: fix the comments Documentation: replace __sk_run_filter with __bpf_prog_run macvlan: optimize the receive path macvlan: pass 'bool' type to macvlan_count_rx() drivers: net: xgene: Add 10GbE ethtool support drivers: net: xgene: Add 10GbE support drivers: net: xgene: Preparing for adding 10GbE support dtb: Add 10GbE node to APM X-Gene SoC device tree Documentation: dts: Update section header for APM X-Gene MAINTAINERS: Update APM X-Gene section ...	2014-10-11 21:19:00 -04:00
Linus Torvalds	ef4a48c513	File locking related changes for v3.18 (pile #1 ) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJUNZK4AAoJEAAOaEEZVoIVI08P/iM7eaIVRnqaqtWw/JBzxiba EMDlJYUBSlv6lYk9s8RJT4bMmcmGAKSYzVAHSoPahzNcqTDdFLeDTLGxJ8uKBbjf d1qRRdH1yZHGUzCvJq3mEendjfXn435Y3YburUxjLfmzrzW7EbMvndiQsS5dhAm9 PEZ+wrKF/zFL7LuXa1YznYrbqOD/GRsJAXGEWc3kNwfS9avephVG/RI3GtpI2PJj RY1mf8P7+WOlrShYoEuUo5aqs01MnU70LbqGHzY8/QKH+Cb0SOkCHZPZyClpiA+G MMJ+o2XWcif3BZYz+dobwz/FpNZ0Bar102xvm2E8fqByr/T20JFjzooTKsQ+PtCk DetQptrU2gtyZDKtInJUQSDPrs4cvA13TW+OEB1tT8rKBnmyEbY3/TxBpBTB9E6j eb/V3iuWnywR3iE+yyvx24Qe7Pov6deM31s46+Vj+GQDuWmAUJXemhfzPtZiYpMT exMXTyDS3j+W+kKqHblfU5f+Bh1eYGpG2m43wJVMLXKV7NwDf8nVV+Wea962ga+w BAM3ia4JRVgRWJBPsnre3lvGT5kKPyfTZsoG+kOfRxiorus2OABoK+SIZBZ+c65V Xh8VH5p3qyCUBOynXlHJWFqYWe2wH0LfbPrwe9dQwTwON51WF082EMG5zxTG0Ymf J2z9Shz68zu0ok8cuSlo =Hhee -----END PGP SIGNATURE----- Merge tag 'locks-v3.18-1' of git://git.samba.org/jlayton/linux Pull file locking related changes from Jeff Layton: "This release is a little more busy for file locking changes than the last: - a set of patches from Kinglong Mee to fix the lockowner handling in knfsd - a pile of cleanups to the internal file lease API. This should get us a bit closer to allowing for setlease methods that can block. There are some dependencies between mine and Bruce's trees this cycle, and I based my tree on top of the requisite patches in Bruce's tree" * tag 'locks-v3.18-1' of git://git.samba.org/jlayton/linux: (26 commits) locks: fix fcntl_setlease/getlease return when !CONFIG_FILE_LOCKING locks: flock_make_lock should return a struct file_lock (or PTR_ERR) locks: set fl_owner for leases to filp instead of current->files locks: give lm_break a return value locks: __break_lease cleanup in preparation of allowing direct removal of leases locks: remove i_have_this_lease check from __break_lease locks: move freeing of leases outside of i_lock locks: move i_lock acquisition into generic_*_lease handlers locks: define a lm_setup handler for leases locks: plumb a "priv" pointer into the setlease routines nfsd: don't keep a pointer to the lease in nfs4_file locks: clean up vfs_setlease kerneldoc comments locks: generic_delete_lease doesn't need a file_lock at all nfsd: fix potential lease memory leak in nfs4_setlease locks: close potential race in lease_get_mtime security: make security_file_set_fowner, f_setown and __f_setown void return locks: consolidate "nolease" routines locks: remove lock_may_read and lock_may_write lockd: rip out deferred lock handling from testlock codepath NFSD: Get reference of lockowner when coping file_lock ...	2014-10-11 13:21:34 -04:00
Pablo Neira Ayuso	ab2d7251d6	netfilter: missing module license in the nf_reject_ipvX modules [ 23.545204] nf_reject_ipv4: module license 'unspecified' taints kernel. Fixes: `c8d7b98` ("netfilter: move nf_send_resetX() code to nf_reject_ipvX modules") Reported-by: Dave Young <dyoung@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-10-11 14:59:41 +02:00
Eric Dumazet	4c450583d9	net: fix races in page->_count manipulation This is illegal to use atomic_set(&page->_count, ...) even if we 'own' the page. Other entities in the kernel need to use get_page_unless_zero() to get a reference to the page before testing page properties, so we could loose a refcount increment. The only case it is valid is when page->_count is 0 Fixes: `540eb7bf0b` ("net: Update alloc frag to reduce get/put page usage and recycle pages") Signed-off-by: Eric Dumaze <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-10 15:37:29 -04:00
Alexander Duyck	5af7fb6e3e	flow-dissector: Fix alignment issue in __skb_flow_get_ports This patch addresses a kernel unaligned access bug seen on a sparc64 system with an igb adapter. Specifically the __skb_flow_get_ports was returning a be32 pointer which was then having the value directly returned. In order to prevent this it is actually easier to simply not populate the ports or address values when an skb is not present. In this case the assumption is that the data isn't needed and rather than slow down the faster aligned accesses by making them have to assume the unaligned path on architectures that don't support efficent unaligned access it makes more sense to simply switch off the bits that were copying the source and destination address/port for the case where we only care about the protocol types and lengths which are normally 16 bit fields anyway. Reported-by: David S. Miller <davem@davemloft.net> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-10 15:33:47 -04:00
Li RongQing	8ea6e345a6	net: filter: fix the comments 1. sk_run_filter has been renamed, sk_filter() is using SK_RUN_FILTER. 2. Remove wrong comments about storing intermediate value. 3. replace sk_run_filter with __bpf_prog_run for check_load_and_stores's comments Cc: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-10 15:11:51 -04:00
Alexei Starovoitov	38b3629adb	net: bpf: fix bpf syscall dependence on anon_inodes minimal configurations where EPOLL, PERF_EVENTS, etc are disabled, but NET is enabled, are failing to build with link error: kernel/built-in.o: In function `bpf_prog_load': syscall.c:(.text+0x3b728): undefined reference to `anon_inode_getfd' fix it by selecting ANON_INODES when NET is enabled Reported-by: Michal Sojka <sojkam1@fel.cvut.cz> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-10 15:02:23 -04:00
David S. Miller	7b6fa1eef6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Pablo Neira Ayuso says: ==================== Netfilter fixes for net-next This batch contains two fixes for what you have in your net-next, they are: 1) Remove nf_send_reset6() from header file. This function now resides in the nf_reject_ipv6 module. Reported by Eric Dumazet. 2) Fix wrong NFT_REJECT_ICMPX_MAX definition and adjust code to fix errors reported by Dan Carpenter's static analysis tools. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-10 15:01:09 -04:00
David S. Miller	4511a4a50e	Merge tag 'master-2014-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== pull request: wireless-next 2014-10-09 Please pull this batch of fixes intended for the 3.18 stream! Andrea Merello makes rtl818x_pci use a more reasonable transmission rate for HW generated frames. Fabian Frederick tweaks some kernel-doc bits to avoid warnings. Larry Finger corrects a possible unaligned access in the rtlwifi code. Marek Puzyniak avoids a kernel panic in ath9k_hw_reset. Sujith Manoharan goes for the hat trick -- he fixes a smatch warning in the shared ath code, he fixes a crash in ath9k, and he corrects a sequence number assignment problem in ath9k too. For ease of merging, I pulled the last bits of the wireless tree as well... Please let me know if there are problems! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-10 14:49:55 -04:00
Karl Beldan	2a84ee8625	cfg80211: set the rates mask in connection probes over specified freq ATM, specifying the frequency when connecting sends a void 'supported rates' EID. Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com> [fix memory leak in error path] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-10 17:11:13 +02:00
Michal Kazior	486cf4c08f	mac80211: enable DFS with channel contexts It is okay to enable DFS for channel contexts based drivers as long as no combination advertises radar detection and multi-channel operation at the same time. Signed-off-by: Michal Kazior <michal.kazior@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-10 17:08:33 +02:00
Linus Torvalds	c798360cd1	Merge branch 'for-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu Pull percpu updates from Tejun Heo: "A lot of activities on percpu front. Notable changes are... - percpu allocator now can take @gfp. If @gfp doesn't contain GFP_KERNEL, it tries to allocate from what's already available to the allocator and a work item tries to keep the reserve around certain level so that these atomic allocations usually succeed. This will replace the ad-hoc percpu memory pool used by blk-throttle and also be used by the planned blkcg support for writeback IOs. Please note that I noticed a bug in how @gfp is interpreted while preparing this pull request and applied the fix `6ae833c7fe` ("percpu: fix how @gfp is interpreted by the percpu allocator") just now. - percpu_ref now uses longs for percpu and global counters instead of ints. It leads to more sparse packing of the percpu counters on 64bit machines but the overhead should be negligible and this allows using percpu_ref for refcnting pages and in-memory objects directly. - The switching between percpu and single counter modes of a percpu_ref is made independent of putting the base ref and a percpu_ref can now optionally be initialized in single or killed mode. This allows avoiding percpu shutdown latency for cases where the refcounted objects may be synchronously created and destroyed in rapid succession with only a fraction of them reaching fully operational status (SCSI probing does this when combined with blk-mq support). It's also planned to be used to implement forced single mode to detect underflow more timely for debugging. There's a separate branch percpu/for-3.18-consistent-ops which cleans up the duplicate percpu accessors. That branch causes a number of conflicts with s390 and other trees. I'll send a separate pull request w/ resolutions once other branches are merged" * 'for-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (33 commits) percpu: fix how @gfp is interpreted by the percpu allocator blk-mq, percpu_ref: start q->mq_usage_counter in atomic mode percpu_ref: make INIT_ATOMIC and switch_to_atomic() sticky percpu_ref: add PERCPU_REF_INIT_* flags percpu_ref: decouple switching to percpu mode and reinit percpu_ref: decouple switching to atomic mode and killing percpu_ref: add PCPU_REF_DEAD percpu_ref: rename things to prepare for decoupling percpu/atomic mode switch percpu_ref: replace pcpu_ prefix with percpu_ percpu_ref: minor code and comment updates percpu_ref: relocate percpu_ref_reinit() Revert "blk-mq, percpu_ref: implement a kludge for SCSI blk-mq stall during probe" Revert "percpu: free percpu allocation info for uniprocessor system" percpu-refcount: make percpu_ref based on longs instead of ints percpu-refcount: improve WARN messages percpu: fix locking regression in the failure path of pcpu_alloc() percpu-refcount: add @gfp to percpu_ref_init() proportions: add @gfp to init functions percpu_counter: add @gfp to percpu_counter_init() percpu_counter: make percpu_counters_lock irq-safe ...	2014-10-10 07:26:02 -04:00
Jesper Dangaard Brouer	b8358d70ce	net_sched: restore qdisc quota fairness limits after bulk dequeue Restore the quota fairness between qdisc's, that we broke with commit `5772e9a346` ("qdisc: bulk dequeue support for qdiscs with TCQ_F_ONETXQUEUE"). Before that commit, the quota in __qdisc_run() were in packets as dequeue_skb() would only dequeue a single packet, that assumption broke with bulk dequeue. We choose not to account for the number of packets inside the TSO/GSO packets (accessable via "skb_gso_segs"). As the previous fairness also had this "defect". Thus, GSO/TSO packets counts as a single packet. Further more, we choose to slack on accuracy, by allowing a bulk dequeue try_bulk_dequeue_skb() to exceed the "packets" limit, only limited by the BQL bytelimit. This is done because BQL prefers to get its full budget for appropriate feedback from TX completion. In future, we might consider reworking this further and, if it allows, switch to a time-based model, as suggested by Eric. Right now, we only restore old semantics. Joint work with Eric, Hannes, Daniel and Jesper. Hannes wrote the first patch in cooperation with Daniel and Jesper. Eric rewrote the patch. Fixes: `5772e9a346` ("qdisc: bulk dequeue support for qdiscs with TCQ_F_ONETXQUEUE") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-09 19:12:26 -04:00
Masanari Iida	de3f0d0eff	net: Missing @ before descriptions cause make xmldocs warning This patch fix following warning. Warning(.//net/core/skbuff.c:4142): No description found for parameter 'header_len' Warning(.//net/core/skbuff.c:4142): No description found for parameter 'data_len' Warning(.//net/core/skbuff.c:4142): No description found for parameter 'max_page_order' Warning(.//net/core/skbuff.c:4142): No description found for parameter 'errcode' Warning(.//net/core/skbuff.c:4142): No description found for parameter 'gfp_mask' Acutually the descriptions exist, but missing "@" in front. This problem start to happen when following commit was merged into Linus's tree during 3.18-rc1 merge period. commit `2e4e441071` net: add alloc_skb_with_frags() helper Signed-off-by: Masanari Iida <standby24x7@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-09 18:57:14 -04:00
Fabian Frederick	408b18abf6	mac80211: directly return ieee80211_vif_use_reserved_context() No need to store ieee80211_vif_use_reserved_context() result and test it before returning. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-09 20:54:04 +02:00
Luciano Coelho	0f791eb47f	mac80211: allow channel switch with multiple channel contexts Channel switch with multiple channel contexts should now work fine. Remove check that disallows switches when multiple contexts are in use. Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-09 11:30:09 +02:00
Luciano Coelho	0c21e6320f	mac80211: wait for the first beacon on the new channel after CSA Instead of immediately reopening the queues (in case of block_tx), calling the post_channel_switch operation and sending the notification, wait for the first beacon on the new channel. This makes sure that we don't lose packets if the AP/GO is not on the new channel yet. Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-09 11:30:09 +02:00
Luciano Coelho	f1d65583bc	mac80211: add post_channel_switch driver operation As a counterpart to the pre_channel_switch operation, add a post_channel_switch operation. This allows the drivers to go back to a normal configuration after the channel switch is completed. Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-09 11:30:09 +02:00
Luciano Coelho	6d027bcc8a	mac80211: add pre_channel_switch driver operation Some drivers may need to prepare for a channel switch also when it is initiated from the remote side (eg. station, P2P client). To make this possible, add a generic callback that can be called for all interface types. Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-09 11:30:08 +02:00
Luciano Coelho	e9a21949b7	mac80211: add extended channel switching capability if the driver supports CSA The Extended Channel Switching capability bit in the extended capabilities element must be set if the driver supports CSA on non-beaconing interfaces. Since this capability needs to be set during driver registration, the extended_capabiliities global variable needs to be moved to the local structure so that it can be modified. Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-09 11:30:08 +02:00
Luciano Coelho	2ba45384e5	mac80211: add device_timestamp to the ieee80211_channel_switch struct Some devices may need the device timestamp in order to synchronize the channel switch. To pass this value back to the driver, add it to the channel switch structure and copy the device_timestamp value received in the rx info structure into it. Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-09 11:30:08 +02:00
Luciano Coelho	252e07ca5f	nl80211: sanity check the channel switch counter value The nl80211 channel switch count attribute (NL80211_ATTR_CH_SWITCH_COUNT) is specified as u32, but the specification uses u8 for the counter. To make sure strange things don't happen without informing the user, sanity check the value and return -EINVAL if it doesn't fit in u8. Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-09 11:25:11 +02:00
Henning Rogge	a2db2ed3fb	mac80211: implement cfg80211_ops to query mesh proxy path table Implement get_mpp and dump_mpp cfg80211_ops to export the content of the 802.11s mesh proxy path table to userspace. Signed-off-by: Henning Rogge <henning.rogge@fkie.fraunhofer.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-09 11:19:07 +02:00
Henning Rogge	66be7d2bcd	cfg80211: add ops to query mesh proxy path table Add two new cfg80211 operations for querying a table with proxied mesh paths. Signed-off-by: Henning Rogge <henning.rogge@fkie.fraunhofer.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-09 11:19:07 +02:00
Fabian Frederick	bc37b16870	net: rfkill: kernel-doc warning fixes Correct the kernel-doc, the parameter is called "blocked" not "state". Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-09 11:16:15 +02:00
Luciano Coelho	c12bc4885f	mac80211: return the vif's chandef in ieee80211_cfg_get_channel() The chandef of the channel context a vif is using may be different than the chandef of the vif itself. For instance, the bandwidth used by the vif may be narrower than the one configured in the channel context. To avoid confusion, return the vif's chandef in ieee80211_cfg_get_channel() instead of the chandef of the channel context. Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-09 11:01:58 +02:00
Karl Beldan	cc61d8df0a	mac80211: minstrel_ht: fix MCS_GROUP_RATES usage Commit `5935839ad7` ("mac80211: improve minstrel_ht rate sorting by throughput & probability") replaced the constant 8 with MCS_GROUP_RATES when getting the number of streams of an HT MCS. See commit `7a5e3fa2c8` ("mac80211: minstrel_ht: replace some occurences of MCS_GROUP_RATES"). Fixes: `5935839ad7` ("mac80211: improve minstrel_ht rate sorting by throughput & probability") Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-09 10:51:19 +02:00
Liad Kaufman	8975ae88e1	mac80211: fix warning on htmldocs for last_tdls_pkt_time Forgot to add an entry to the struct description of sta_info. Signed-off-by: Liad Kaufman <liad.kaufman@intel.com> Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-10-09 10:33:29 +02:00
Al Viro	24dff96a37	fix misuses of f_count() in ppp and netlink we used to check for "nobody else could start doing anything with that opened file" by checking that refcount was 2 or less - one for descriptor table and one we'd acquired in fget() on the way to wherever we are. That was race-prone (somebody else might have had a reference to descriptor table and do fget() just as we'd been checking) and it had become flat-out incorrect back when we switched to fget_light() on those codepaths - unlike fget(), it doesn't grab an extra reference unless the descriptor table is shared. The same change allowed a race-free check, though - we are safe exactly when refcount is less than 2. It was a long time ago; pre-2.6.12 for ioctl() (the codepath leading to ppp one) and 2.6.17 for sendmsg() (netlink one). OTOH, netlink hadn't grown that check until 3.9 and ppp used to live in drivers/net, not drivers/net/ppp until 3.1. The bug existed well before that, though, and the same fix used to apply in old location of file. Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-10-09 02:39:17 -04:00
Fabian Frederick	59f35b810e	netlabel: kernel-doc warning fix no secid argument in netlbl_cfg_unlbl_static_del Signed-off-by: Fabian Frederick <fabf@skynet.be> Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-09 01:40:05 -04:00
Linus Torvalds	35a9ad8af0	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Miller: "Most notable changes in here: 1) By far the biggest accomplishment, thanks to a large range of contributors, is the addition of multi-send for transmit. This is the result of discussions back in Chicago, and the hard work of several individuals. Now, when the ->ndo_start_xmit() method of a driver sees skb->xmit_more as true, it can choose to defer the doorbell telling the driver to start processing the new TX queue entires. skb->xmit_more means that the generic networking is guaranteed to call the driver immediately with another SKB to send. There is logic added to the qdisc layer to dequeue multiple packets at a time, and the handling mis-predicted offloads in software is now done with no locks held. Finally, pktgen is extended to have a "burst" parameter that can be used to test a multi-send implementation. Several drivers have xmit_more support: i40e, igb, ixgbe, mlx4, virtio_net Adding support is almost trivial, so export more drivers to support this optimization soon. I want to thank, in no particular or implied order, Jesper Dangaard Brouer, Eric Dumazet, Alexander Duyck, Tom Herbert, Jamal Hadi Salim, John Fastabend, Florian Westphal, Daniel Borkmann, David Tat, Hannes Frederic Sowa, and Rusty Russell. 2) PTP and timestamping support in bnx2x, from Michal Kalderon. 3) Allow adjusting the rx_copybreak threshold for a driver via ethtool, and add rx_copybreak support to enic driver. From Govindarajulu Varadarajan. 4) Significant enhancements to the generic PHY layer and the bcm7xxx driver in particular (EEE support, auto power down, etc.) from Florian Fainelli. 5) Allow raw buffers to be used for flow dissection, allowing drivers to determine the optimal "linear pull" size for devices that DMA into pools of pages. The objective is to get exactly the necessary amount of headers into the linear SKB area pre-pulled, but no more. The new interface drivers use is eth_get_headlen(). From WANG Cong, with driver conversions (several had their own by-hand duplicated implementations) by Alexander Duyck and Eric Dumazet. 6) Support checksumming more smoothly and efficiently for encapsulations, and add "foo over UDP" facility. From Tom Herbert. 7) Add Broadcom SF2 switch driver to DSA layer, from Florian Fainelli. 8) eBPF now can load programs via a system call and has an extensive testsuite. Alexei Starovoitov and Daniel Borkmann. 9) Major overhaul of the packet scheduler to use RCU in several major areas such as the classifiers and rate estimators. From John Fastabend. 10) Add driver for Intel FM10000 Ethernet Switch, from Alexander Duyck. 11) Rearrange TCP_SKB_CB() to reduce cache line misses, from Eric Dumazet. 12) Add Datacenter TCP congestion control algorithm support, From Florian Westphal. 13) Reorganize sk_buff so that __copy_skb_header() is significantly faster. From Eric Dumazet" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1558 commits) netlabel: directly return netlbl_unlabel_genl_init() net: add netdev_txq_bql_{enqueue, complete}_prefetchw() helpers net: description of dma_cookie cause make xmldocs warning cxgb4: clean up a type issue cxgb4: potential shift wrapping bug i40e: skb->xmit_more support net: fs_enet: Add NAPI TX net: fs_enet: Remove non NAPI RX r8169:add support for RTL8168EP net_sched: copy exts->type in tcf_exts_change() wimax: convert printk to pr_foo() af_unix: remove 0 assignment on static ipv6: Do not warn for informational ICMP messages, regardless of type. Update Intel Ethernet Driver maintainers list bridge: Save frag_max_size between PRE_ROUTING and POST_ROUTING tipc: fix bug in multicast congestion handling net: better IFF_XMIT_DST_RELEASE support net/mlx4_en: remove NETDEV_TX_BUSY 3c59x: fix bad split of cpu_to_le32(pci_map_single()) net: bcmgenet: fix Tx ring priority programming ...	2014-10-08 21:40:54 -04:00
David S. Miller	64b1f00a08	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2014-10-08 16:22:22 -04:00
Fabian Frederick	16b99a4f66	netlabel: directly return netlbl_unlabel_genl_init() No need to store netlbl_unlabel_genl_init result and test it before returning. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-08 16:08:04 -04:00
WANG Cong	5301e3e117	net_sched: copy exts->type in tcf_exts_change() We need to copy exts->type when committing the change, otherwise it would be always 0. This is a quick fix for -net and -stable, for net-next tcf_exts will be removed. Fixes: commit `33be627159` ("net_sched: act: use standard struct list_head") Reported-by: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-08 15:41:27 -04:00
Fabian Frederick	2f29fed3f8	net: rfkill: kernel-doc warning fixes s/state/blocked Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2014-10-08 15:24:15 -04:00
Linus Torvalds	6dea0737bc	Merge branch 'for-3.18' of git://linux-nfs.org/~bfields/linux Pull nfsd updates from Bruce Fields: "Highlights: - support the NFSv4.2 SEEK operation (allowing clients to support SEEK_HOLE/SEEK_DATA), thanks to Anna. - end the grace period early in a number of cases, mitigating a long-standing annoyance, thanks to Jeff - improve SMP scalability, thanks to Trond" * 'for-3.18' of git://linux-nfs.org/~bfields/linux: (55 commits) nfsd: eliminate "to_delegation" define NFSD: Implement SEEK NFSD: Add generic v4.2 infrastructure svcrdma: advertise the correct max payload nfsd: introduce nfsd4_callback_ops nfsd: split nfsd4_callback initialization and use nfsd: introduce a generic nfsd4_cb nfsd: remove nfsd4_callback.cb_op nfsd: do not clear rpc_resp in nfsd4_cb_done_sequence nfsd: fix nfsd4_cb_recall_done error handling nfsd4: clarify how grace period ends nfsd4: stop grace_time update at end of grace period nfsd: skip subsequent UMH "create" operations after the first one for v4.0 clients nfsd: set and test NFSD4_CLIENT_STABLE bit to reduce nfsdcltrack upcalls nfsd: serialize nfsdcltrack upcalls for a particular client nfsd: pass extra info in env vars to upcalls to allow for early grace period end nfsd: add a v4_end_grace file to /proc/fs/nfsd lockd: add a /proc/fs/lockd/nlm_end_grace file nfsd: reject reclaim request when client has already sent RECLAIM_COMPLETE nfsd: remove redundant boot_time parm from grace_done client tracking op ...	2014-10-08 12:51:44 -04:00
Linus Torvalds	25641c0c8d	NFS client updates for Linux 3.18 Highlights include: Stable fixes: - fix an NFSv4.1 state renewal regression - fix open/lock state recovery error handling - fix lock recovery when CREATE_SESSION/SETCLIENTID_CONFIRM fails - fix statd when reconnection fails - Don't wake tasks during connection abort - Don't start reboot recovery if lease check fails - fix duplicate proc entries Features: - pNFS block driver fixes and clean ups from Christoph - More code cleanups from Anna - Improve mmap() writeback performance - Replace use of PF_TRANS with a more generic mechanism for avoiding deadlocks in nfs_release_page -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJUMpFYAAoJEGcL54qWCgDywHYP/A7XNykwOGhoHVP1Cgr3xqoz gVhAw97AEMZE8xSNVEGS++pJTe59JVzsIsYAwdHMwePV33l3zyzYorae6N9p7zWF 0xVaNQ4qNLVhbrNLAoB5KA/c3/jMnNjF5t15+8akZad5pt4kXLlhSKjyVpdEEtJE A0eneXShMYEeLZoOJhpQt5bsw0OZ8YbWWEMjGlDqyeelvV3K1+zfivQOoyX6hS4w XFkPEDmU7zunE/xFP9ZoUaVdLO0TvOWfEZ7STWoHm7NuWfPQiDb9w1mTnuZbZyka ssezoGcitzwsjCcQ5e1iKTOoFRIsm/zYXFQgFQL7VFMBU1Tss9Of8047EyDkqcPF GxctsGg0gQ2FkG7yx7JH7AKpyibOIuByQrQQ916coWSf7K0L4H4Rcky3vryroylP 1e1RI49xu215OTm+dLvlvYCv55bqCrTmaUGImZac18+ixD2eh6MNfW2ubSdxk89L U2rTFV09Bd52N7IQOGQx1FBEI2ZnIFUV4UaFz7v+rGFxOnk6+WYe+iWyb4wC70Yc 8Jh/gTIQDd5aghql3FTieMOyfEvO6Re4pLMXmqEWMAevicx2t8DwkJriRu6X8Iy2 rlDlBPwu5QmRWC20Dc897f0VajwDtwdeB8puod7nobOWzOfx4FrNqLJ+jR3pmHUk 0otvJytqemXt+zkqqHKK =/OQi -----END PGP SIGNATURE----- Merge tag 'nfs-for-3.18-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client updates from Trond Myklebust: "Highlights include: Stable fixes: - fix an NFSv4.1 state renewal regression - fix open/lock state recovery error handling - fix lock recovery when CREATE_SESSION/SETCLIENTID_CONFIRM fails - fix statd when reconnection fails - don't wake tasks during connection abort - don't start reboot recovery if lease check fails - fix duplicate proc entries Features: - pNFS block driver fixes and clean ups from Christoph - More code cleanups from Anna - Improve mmap() writeback performance - Replace use of PF_TRANS with a more generic mechanism for avoiding deadlocks in nfs_release_page" * tag 'nfs-for-3.18-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (66 commits) NFSv4.1: Fix an NFSv4.1 state renewal regression NFSv4: fix open/lock state recovery error handling NFSv4: Fix lock recovery when CREATE_SESSION/SETCLIENTID_CONFIRM fails NFS: Fabricate fscache server index key correctly SUNRPC: Add missing support for RPC_CLNT_CREATE_NO_RETRANS_TIMEOUT NFSv3: Fix missing includes of nfs3_fs.h NFS/SUNRPC: Remove other deadlock-avoidance mechanisms in nfs_release_page() NFS: avoid waiting at all in nfs_release_page when congested. NFS: avoid deadlocks with loop-back mounted NFS filesystems. MM: export page_wakeup functions SCHED: add some "wait..on_bit...timeout()" interfaces. NFS: don't use STABLE writes during writeback. NFSv4: use exponential retry on NFS4ERR_DELAY for async requests. rpc: Add -EPERM processing for xs_udp_send_request() rpc: return sent and err from xs_sendpages() lockd: Try to reconnect if statd has moved SUNRPC: Don't wake tasks during connection abort Fixing lease renewal nfs: fix duplicate proc entries pnfs/blocklayout: Fix a 64-bit division/remainder issue in bl_map_stripe ...	2014-10-08 12:49:23 -04:00
Linus Torvalds	28596c9722	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull "trivial tree" updates from Jiri Kosina: "Usual pile from trivial tree everyone is so eagerly waiting for" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits) Remove MN10300_PROC_MN2WS0038 mei: fix comments treewide: Fix typos in Kconfig kprobes: update jprobe_example.c for do_fork() change Documentation: change "&" to "and" in Documentation/applying-patches.txt Documentation: remove obsolete pcmcia-cs from Changes Documentation: update links in Changes Documentation: Docbook: Fix generated DocBook/kernel-api.xml score: Remove GENERIC_HAS_IOMAP gpio: fix 'CONFIG_GPIO_IRQCHIP' comments tty: doc: Fix grammar in serial/tty dma-debug: modify check_for_stack output treewide: fix errors in printk genirq: fix reference in devm_request_threaded_irq comment treewide: fix synchronize_rcu() in comments checkstack.pl: port to AArch64 doc: queue-sysfs: minor fixes init/do_mounts: better syntax description MIPS: fix comment spelling powerpc/simpleboot: fix comment ...	2014-10-07 21:16:26 -04:00
Linus Torvalds	d0cd84817c	dmaengine-3.17 1/ Step down as dmaengine maintainer see commit `08223d80df` "dmaengine maintainer update" 2/ Removal of net_dma, as it has been marked 'broken' since 3.13 (commit `7787380336` "net_dma: mark broken"), without reports of performance regression. 3/ Miscellaneous fixes -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJUKDLKAAoJEB7SkWpmfYgC7wwP/iNHqRjf1suMUTBIF3P6Hgbe VCUwh0IkuujMPDG46WRn6cYzarRxVPLoGaLHLPszgjI6pmGPVv19wqeDOlUxtcmr 0iQWEWv/zqseaAIW+4gj/WYCyMgKil49EUBJKCZCfNmIaad+e0pr8f0uE5yOkHPM tqWoZERu9A4dlXGr1TjeOZVzdnPrCt92MrLDN6ZZ6tMuJaEc5PauaLxKTeGy5fYj UB+k1xJQzECbsYfpB+uCVYl5/qPO1rNyuBYS8THCsW+JYmrbbfH2kkF2lo2FaUpO 8Yd50FtzXHKWwAt7BzfIwU2M7x0wRmryrC/xsQi6M+WmVeHYvvHUIpzaA66xRZ5x fCy3Fu8sEnmnmboAbh2v2c5uTycqRl2xPzbpLAuxglloXIxzi3ckp6ESF/Z4SldH oxIoEievN7lah3vKgvlHZYcWDzrYr8EKf/EzFe9RqDBQDKtzDzre1H9Uivr387Vm uFUcGHYG/GXuX47C7EUsMtaSW2UEoR2ytw/HR6CKFPTVXwAzEO6kA9vg0EqL0iIq 2wVLgavlZuwegmaUBgnr+bgVZMvVN7OU7fAIRVe5xNO6itrPKvheSlQthmRiiq9C uzOu4PS6PexqzHUNPCcJpCsj+lawmCSrE0bxtPzTA/CQInVgWs219V9+W5Gn/0YA EARN9k6ueX9PZPQrPQLm =BBBv -----END PGP SIGNATURE----- Merge tag 'dmaengine-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/dmaengine Pull dmaengine updates from Dan Williams: "Even though this has fixes marked for -stable, given the size and the needed conflict resolutions this is 3.18-rc1/merge-window material. These patches have been languishing in my tree for a long while. The fact that I do not have the time to do proper/prompt maintenance of this tree is a primary factor in the decision to step down as dmaengine maintainer. That and the fact that the bulk of drivers/dma/ activity is going through Vinod these days. The net_dma removal has not been in -next. It has developed simple conflicts against mainline and net-next (for-3.18). Continuing thanks to Vinod for staying on top of drivers/dma/. Summary: 1/ Step down as dmaengine maintainer see commit `08223d80df` "dmaengine maintainer update" 2/ Removal of net_dma, as it has been marked 'broken' since 3.13 (commit `7787380336` "net_dma: mark broken"), without reports of performance regression. 3/ Miscellaneous fixes" * tag 'dmaengine-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/dmaengine: net: make tcp_cleanup_rbuf private net_dma: revert 'copied_early' net_dma: simple removal dmaengine maintainer update dmatest: prevent memory leakage on error path in thread ioat: Use time_before_jiffies() dmaengine: fix xor sources continuation dma: mv_xor: Rename __mv_xor_slot_cleanup() to mv_xor_slot_cleanup() dma: mv_xor: Remove all callers of mv_xor_slot_cleanup() dma: mv_xor: Remove unneeded mv_xor_clean_completed_slots() call ioat: Use pci_enable_msix_exact() instead of pci_enable_msix() drivers: dma: Include appropriate header file in dca.c drivers: dma: Mark functions as static in dma_v3.c dma: mv_xor: Add DMA API error checks ioat/dca: Use dev_is_pci() to check whether it is pci device	2014-10-07 20:39:25 -04:00
Fabian Frederick	28b7deae75	wimax: convert printk to pr_foo() Use current logging functions and add module name prefix. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-07 20:28:44 -04:00
Fabian Frederick	505e907db3	af_unix: remove 0 assignment on static static values are automatically initialized to 0 Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-07 17:03:14 -04:00
David S. Miller	ea85a0a2dc	ipv6: Do not warn for informational ICMP messages, regardless of type. There is no reason to emit a log message for these. Based upon a suggestion from Hannes Frederic Sowa. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>	2014-10-07 16:33:53 -04:00
Herbert Xu	93fdd47e52	bridge: Save frag_max_size between PRE_ROUTING and POST_ROUTING As we may defragment the packet in IPv4 PRE_ROUTING and refragment it after POST_ROUTING we should save the value of frag_max_size. This is still very wrong as the bridge is supposed to leave the packets intact, meaning that the right thing to do is to use the original frag_list for fragmentation. Unfortunately we don't currently guarantee that the frag_list is left untouched throughout netfilter so until this changes this is the best we can do. There is also a spot in FORWARD where it appears that we can forward a packet without going through fragmentation, mark it so that we can fix it later. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-07 15:12:44 -04:00
Jon Maloy	908344cdda	tipc: fix bug in multicast congestion handling One aim of commit `50100a5e39` ("tipc: use pseudo message to wake up sockets after link congestion") was to handle link congestion abatement in a uniform way for both unicast and multicast transmit. However, the latter doesn't work correctly, and has been broken since the referenced commit was applied. If a user now sends a burst of multicast messages that is big enough to cause broadcast link congestion, it will be put to sleep, and not be waked up when the congestion abates as it should be. This has two reasons. First, the flag that is used, TIPC_WAKEUP_USERS, is set correctly, but in the wrong field. Instead of setting it in the 'action_flags' field of the arrival node struct, it is by mistake set in the dummy node struct that is owned by the broadcast link, where it will never tested for. Second, we cannot use the same flag for waking up unicast and multicast users, since the function tipc_node_unlock() needs to pick the wakeup pseudo messages to deliver from different queues. It must hence be able to distinguish between the two cases. This commit solves this problem by adding a new flag TIPC_WAKEUP_BCAST_USERS, and a new function tipc_bclink_wakeup_user(). The latter is to be called by tipc_node_unlock() when the named flag, now set in the correct field, is encountered. v2: using explicit 'unsigned int' declaration instead of 'uint', as per comment from David Miller. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-07 14:50:15 -04:00
John W. Linville	d7ffd588f0	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless	2014-10-07 14:48:29 -04:00
Pablo Neira Ayuso	f0d1f04f0a	netfilter: fix wrong arithmetics regarding NFT_REJECT_ICMPX_MAX NFT_REJECT_ICMPX_MAX should be __NFT_REJECT_ICMPX_MAX - 1. nft_reject_icmp_code() and nft_reject_icmpv6_code() are called from the packet path, so BUG_ON in case we try to access an unknown abstracted ICMP code. This should not happen since we already validate this from nft_reject_{inet,bridge}_init(). Fixes: `51b0a5d` ("netfilter: nft_reject: introduce icmp code abstraction for inet and bridge") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-10-07 20:16:31 +02:00
Eric Dumazet	0287587884	net: better IFF_XMIT_DST_RELEASE support Testing xmit_more support with netperf and connected UDP sockets, I found strange dst refcount false sharing. Current handling of IFF_XMIT_DST_RELEASE is not optimal. Dropping dst in validate_xmit_skb() is certainly too late in case packet was queued by cpu X but dequeued by cpu Y The logical point to take care of drop/force is in __dev_queue_xmit() before even taking qdisc lock. As Julian Anastasov pointed out, need for skb_dst() might come from some packet schedulers or classifiers. This patch adds new helper to cleanly express needs of various drivers or qdiscs/classifiers. Drivers that need skb_dst() in their ndo_start_xmit() should call following helper in their setup instead of the prior : dev->priv_flags &= ~IFF_XMIT_DST_RELEASE; -> netif_keep_dst(dev); Instead of using a single bit, we use two bits, one being eventually rebuilt in bonding/team drivers. The other one, is permanent and blocks IFF_XMIT_DST_RELEASE being rebuilt in bonding/team. Eventually, we could add something smarter later. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-07 13:22:11 -04:00
WANG Cong	02c0fc1b8f	net_sched: fix unused variables in __gnet_stats_copy_basic_cpu() Probably not a big deal, but we'd better just use the one we get in retry loop. Fixes: commit `22e0f8b932` ("net: sched: make bstats per cpu and estimator RCU safe") Reported-by: Joe Perches <joe@perches.com> Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-07 00:10:49 -04:00
Andy Zhou	7c5df8fa19	openvswitch: fix a compilation error when CONFIG_INET is not setW! Fix a openvswitch compilation error when CONFIG_INET is not set: ===================================================== In file included from include/net/geneve.h:4:0, from net/openvswitch/flow_netlink.c:45: include/net/udp_tunnel.h: In function 'udp_tunnel_handle_offloads': >> include/net/udp_tunnel.h💯2: error: implicit declaration of function 'iptunnel_handle_offloads' [-Werror=implicit-function-declaration] >> return iptunnel_handle_offloads(skb, udp_csum, type); >> ^ >> >> include/net/udp_tunnel.h💯2: warning: return makes pointer from integer without a cast >> >> cc1: some warnings being treated as errors ===================================================== Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-07 00:10:49 -04:00
Andy Zhou	0a5d1c55fa	openvswitch: fix a sparse warning Fix a sparse warning introduced by commit: `f579668406` (openvswitch: Add support for Geneve tunneling.) caught by kbuild test robot: reproduce: # apt-get install sparse # git checkout `f579668406` # make ARCH=x86_64 allmodconfig # make C=1 CF=-D__CHECK_ENDIAN__ # # # sparse warnings: (new ones prefixed by >>) # # >> net/openvswitch/vport-geneve.c:109:15: sparse: incorrect type in assignment (different base types) # net/openvswitch/vport-geneve.c:109:15: expected restricted __be16 [usertype] sport # net/openvswitch/vport-geneve.c:109:15: got int # >> net/openvswitch/vport-geneve.c:110:56: sparse: incorrect type in argument 3 (different base types) # net/openvswitch/vport-geneve.c:110:56: expected unsigned short [unsigned] [usertype] value # net/openvswitch/vport-geneve.c:110:56: got restricted __be16 [usertype] sport Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-07 00:10:48 -04:00
Andy Zhou	42350dcaaf	net: fix a sparse warning Fix a sparse warning introduced by Commit `0b5e8b8eea` (net: Add Geneve tunneling protocol driver) caught by kbuild test robot: # apt-get install sparse # git checkout `0b5e8b8eea` # make ARCH=x86_64 allmodconfig # make C=1 CF=-D__CHECK_ENDIAN__ # # # sparse warnings: (new ones prefixed by >>) # # >> net/ipv4/geneve.c:230:42: sparse: incorrect type in assignment (different base types) # net/ipv4/geneve.c:230:42: expected restricted __be32 [addressable] [assigned] [usertype] s_addr # net/ipv4/geneve.c:230:42: got unsigned long [unsigned] <noident> # Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-07 00:10:47 -04:00
Hannes Frederic Sowa	327571cb10	ipv6: don't walk node's leaf during serial number update Cc: YOSHIFUJI Hideaki <hideaki@yoshifuji.org> Cc: Martin Lau <kafai@fb.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-07 00:02:30 -04:00
Hannes Frederic Sowa	812918c464	ipv6: make fib6 serial number per namespace Try to reduce number of possible fn_sernum mutation by constraining them to their namespace. Also remove rt_genid which I forgot to remove in `705f1c869d` ("ipv6: remove rt6i_genid"). Cc: YOSHIFUJI Hideaki <hideaki@yoshifuji.org> Cc: Martin Lau <kafai@fb.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-07 00:02:30 -04:00
Hannes Frederic Sowa	c8c4d42a6b	ipv6: only generate one new serial number per fib mutation Cc: YOSHIFUJI Hideaki <hideaki@yoshifuji.org> Cc: Martin Lau <kafai@fb.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-07 00:02:30 -04:00
Hannes Frederic Sowa	42b1870646	ipv6: make rt_sernum atomic and serial number fields ordinary ints Cc: YOSHIFUJI Hideaki <hideaki@yoshifuji.org> Cc: Martin Lau <kafai@fb.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-07 00:02:30 -04:00
Hannes Frederic Sowa	94b2cfe02b	ipv6: minor fib6 cleanups like type safety, bool conversion, inline removal Also renamed struct fib6_walker_t to fib6_walker and enum fib_walk_state_t to fib6_walk_state as recommended by Cong Wang. Cc: Cong Wang <cwang@twopensource.com> Cc: YOSHIFUJI Hideaki <hideaki@yoshifuji.org> Cc: Martin Lau <kafai@fb.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-07 00:02:30 -04:00
Eric Dumazet	1ff0dc9499	net: validate_xmit_vlan() is static Marking this as static allows compiler to inline it. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-06 18:17:17 -04:00
Fabian Frederick	79952bca86	net: fix rcu access on phonet_routes -Add __rcu annotation on table to fix sparse warnings: net/phonet/pn_dev.c:279:25: warning: incorrect type in assignment (different address spaces) net/phonet/pn_dev.c:279:25: expected struct net_device <noident> net/phonet/pn_dev.c:279:25: got void [noderef] <asn:4><noident> net/phonet/pn_dev.c:376:17: warning: incorrect type in assignment (different address spaces) net/phonet/pn_dev.c:376:17: expected struct net_device volatile <noident> net/phonet/pn_dev.c:376:17: got struct net_device [noderef] <asn:4><noident> net/phonet/pn_dev.c:392:17: warning: incorrect type in assignment (different address spaces) net/phonet/pn_dev.c:392:17: expected struct net_device <noident> net/phonet/pn_dev.c:392:17: got void [noderef] <asn:4><noident> -Access table with rcu_access_pointer (fixes the following sparse errors): net/phonet/pn_dev.c:278:25: error: incompatible types in comparison expression (different address spaces) net/phonet/pn_dev.c:391:17: error: incompatible types in comparison expression (different address spaces) Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Fabian Frederick <fabf@skynet.be> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-06 18:16:30 -04:00
John Fastabend	18cdb37ebf	net: sched: do not use tcf_proto 'tp' argument from call_rcu Using the tcf_proto pointer 'tp' from inside the classifiers callback is not valid because it may have been cleaned up by another call_rcu occuring on another CPU. 'tp' is currently being used by tcf_unbind_filter() in this patch we move instances of tcf_unbind_filter outside of the call_rcu() context. This is safe to do because any running schedulers will either read the valid class field or it will be zeroed. And all schedulers today when the class is 0 do a lookup using the same call used by the tcf_exts_bind(). So even if we have a running classifier hit the null class pointer it will do a lookup and get to the same result. This is particularly fragile at the moment because the only way to verify this is to audit the schedulers call sites. Reported-by: Cong Wang <xiyou.wangconf@gmail.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-06 18:02:33 -04:00
John Fastabend	13990f8156	net: sched: cls_cgroup tear down exts and ematch from rcu callback It is not RCU safe to destroy the action chain while there is a possibility of readers accessing it. Move this code into the rcu callback using the same rcu callback used in the code patch to make a change to head. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-06 18:02:32 -04:00
John Fastabend	82a470f111	net: sched: remove tcf_proto from ematch calls This removes the tcf_proto argument from the ematch code paths that only need it to reference the net namespace. This allows simplifying qdisc code paths especially when we need to tear down the ematch from an RCU callback. In this case we can not guarentee that the tcf_proto structure is still valid. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-06 18:02:32 -04:00
Eric Dumazet	fcbeb976d7	net: introduce netdevice gso_min_segs attribute Some TSO engines might have a too heavy setup cost, that impacts performance on hosts sending small bursts (2 MSS per packet). This patch adds a device gso_min_segs, allowing drivers to set a minimum segment size for TSO packets, according to the NIC performance. Tested on a mlx4 NIC, this allows to get a ~110% increase of throughput when sending 2 MSS per packet. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-06 17:56:28 -04:00
Daniel Borkmann	b47bd8d279	ipv4: igmp: fix v3 general query drop monitor false positive In case we find a general query with non-zero number of sources, we are dropping the skb as it's malformed. RFC3376, section 4.1.8. Number of Sources (N): This number is zero in a General Query or a Group-Specific Query, and non-zero in a Group-and-Source-Specific Query. Therefore, reflect that by using kfree_skb() instead of consume_skb(). Fixes: `d679c5324d` ("igmp: avoid drop_monitor false positives") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-06 17:14:54 -04:00
Eric Dumazet	1255a50554	ethtool: Ethtool parameter to dynamically change tx_copybreak Use new ethtool [sg]et_tunable() to set tx_copybread (inline threshold) Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-06 01:04:16 -04:00
Eric Dumazet	f2600cf02b	net: sched: avoid costly atomic operation in fq_dequeue() Standard qdisc API to setup a timer implies an atomic operation on every packet dequeue : qdisc_unthrottled() It turns out this is not really needed for FQ, as FQ has no concept of global qdisc throttling, being a qdisc handling many different flows, some of them can be throttled, while others are not. Fix is straightforward : add a 'bool throttle' to qdisc_watchdog_schedule_ns(), and remove calls to qdisc_unthrottled() in sch_fq. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-06 00:55:10 -04:00
Eric Dumazet	bec3cfdca3	net: skb_segment() provides list head and tail Its unfortunate we have to walk again skb list to find the tail after segmentation, even if data is probably hot in cpu caches. skb_segment() can store the tail of the list into segs->prev, and validate_xmit_skb_list() can immediately get the tail. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-06 00:37:30 -04:00
Jesse Gross	f579668406	openvswitch: Add support for Geneve tunneling. The Openvswitch implementation is completely agnostic to the options that are in use and can handle newly defined options without further work. It does this by simply matching on a byte array of options and allowing userspace to setup flows on this array. Signed-off-by: Jesse Gross <jesse@nicira.com> Singed-off-by: Ansis Atteka <aatteka@nicira.com> Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Thomas Graf <tgraf@noironetworks.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-06 00:32:21 -04:00
Jesse Gross	6b205b2ca1	openvswitch: Factor out allocation and verification of actions. As the size of the flow key grows, it can put some pressure on the stack. This is particularly true in ovs_flow_cmd_set(), which needs several copies of the key on the stack. One of those uses is logically separate, so this factors it out to reduce stack pressure and improve readibility. Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-06 00:32:20 -04:00
Jesse Gross	f0b128c1e2	openvswitch: Wrap struct ovs_key_ipv4_tunnel in a new structure. Currently, the flow information that is matched for tunnels and the tunnel data passed around with packets is the same. However, as additional information is added this is not necessarily desirable, as in the case of pointers. This adds a new structure for tunnel metadata which currently contains only the existing struct. This change is purely internal to the kernel since the current OVS_KEY_ATTR_IPV4_TUNNEL is simply a compressed version of OVS_KEY_ATTR_TUNNEL that is translated at flow setup. Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-06 00:32:20 -04:00
Jesse Gross	67fa034194	openvswitch: Add support for matching on OAM packets. Some tunnel formats have mechanisms for indicating that packets are OAM frames that should be handled specially (either as high priority or not forwarded beyond an endpoint). This provides support for allowing those types of packets to be matched. Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-06 00:32:20 -04:00
Jesse Gross	0714812134	openvswitch: Eliminate memset() from flow_extract. As new protocols are added, the size of the flow key tends to increase although few protocols care about all of the fields. In order to optimize this for hashing and matching, OVS uses a variable length portion of the key. However, when fields are extracted from the packet we must still zero out the entire key. This is no longer necessary now that OVS implements masking. Any fields (or holes in the structure) which are not part of a given protocol will be by definition not part of the mask and zeroed out during lookup. Furthermore, since masking already uses variable length keys this zeroing operation automatically benefits as well. In principle, the only thing that needs to be done at this point is remove the memset() at the beginning of flow. However, some fields assume that they are initialized to zero, which now must be done explicitly. In addition, in the event of an error we must also zero out corresponding fields to signal that there is no valid data present. These increase the total amount of code but very little of it is executed in non-error situations. Removing the memset() reduces the profile of ovs_flow_extract() from 0.64% to 0.56% when tested with large packets on a 10G link. Suggested-by: Pravin Shelar <pshelar@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-06 00:32:20 -04:00
Andy Zhou	0b5e8b8eea	net: Add Geneve tunneling protocol driver This adds a device level support for Geneve -- Generic Network Virtualization Encapsulation. The protocol is documented at http://tools.ietf.org/html/draft-gross-geneve-01 Only protocol layer Geneve support is provided by this driver. Openvswitch can be used for configuring, set up and tear down functional Geneve tunnels. Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-06 00:32:20 -04:00
Vlad Yasevich	bdf6fa52f0	sctp: handle association restarts when the socket is closed. Currently association restarts do not take into consideration the state of the socket. When a restart happens, the current assocation simply transitions into established state. This creates a condition where a remote system, through a the restart procedure, may create a local association that is no way reachable by user. The conditions to trigger this are as follows: 1) Remote does not acknoledge some data causing data to remain outstanding. 2) Local application calls close() on the socket. Since data is still outstanding, the association is placed in SHUTDOWN_PENDING state. However, the socket is closed. 3) The remote tries to create a new association, triggering a restart on the local system. The association moves from SHUTDOWN_PENDING to ESTABLISHED. At this point, it is no longer reachable by any socket on the local system. This patch addresses the above situation by moving the newly ESTABLISHED association into SHUTDOWN-SENT state and bundling a SHUTDOWN after the COOKIE-ACK chunk. This way, the restarted associate immidiately enters the shutdown procedure and forces the termination of the unreachable association. Reported-by: David Laight <David.Laight@aculab.com> Signed-off-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-06 00:21:45 -04:00
David S. Miller	a4b4a2b7f9	Merge tag 'master-2014-10-02' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== pull request: wireless-next 2014-10-03 Please pull tihs batch of updates intended for the 3.18 stream! For the iwlwifi bits, Emmanuel says: "I have here a few things that depend on the latest mac80211's changes: RRM, TPC, Quiet Period etc... Eyal keeps improving our rate control and we have a new device ID. This last patch should probably have gone to wireless.git, but at that stage, I preferred to send it to -next and CC stable." For (most of) the Atheros bits, Kalle says: "The only new feature is testmode support from me. Ben added a new method to crash the firmware with an assert for debug purposes. As usual, we have lots of smaller fixes from Michal. Matteo fixed a Kconfig dependency with debugfs. I fixed some warnings recently added to checkpatch." For the NFC bits, Samuel says: "We've had major updates for TI and ST Microelectronics drivers, and a few NCI related changes. For TI's trf7970a driver: - Target mode support for trf7970a - Suspend/resume support for trf7970a - DT properties additions to handle different quirks - A bunch of fixes for smartphone IOP related issues For ST Microelectronics' ST21NFCA and ST21NFCB drivers: - ISO15693 support for st21nfcb - checkpatch and sparse related warning fixes - Code cleanups and a few minor fixes Finally, Marvell added ISO15693 support to the NCI stack, together with a couple of NCI fixes." For the Bluetooth bits, Johan says: "This 3.18 pull request replaces the one I did on Monday ("bluetooth-next 2014-09-22", which hasn't been pulled yet). The additions since the last request are: - SCO connection fix for devices not supporting eSCO - Cleanups regarding the SCO establishment logic - Remove unnecessary return value from logging functions - Header compression fix for 6lowpan - Cleanups to the ieee802154/mrf24j40 driver Here's a copy from previous request that this one replaces: ' Here are some more patches for 3.18. They include various fixes to the btusb HCI driver, a fix for LE SMP, as well as adding Jukka to the MAINTAINERS file for generic 6LoWPAN (as requested by Alexander Aring). I've held on to this pull request a bit since we were waiting for a SCO related fix to get sorted out first. However, since the merge window is getting closer I decided not to wait for it. If we do get the fix sorted out there'll probably be a second small pull request later this week. '" And, "Unless 3.17 gets delayed this will probably be our last -next pull request for 3.18. We've got: - New Marvell hardware supportr - Multicast support for 6lowpan - Several of 6lowpan fixes & cleanups - Fix for a (false-positive) lockdep warning in L2CAP - Minor btusb cleanup" On top of all that comes the usual sort of updates to ath5k, ath9k, ath10k, brcmfmac, mwifiex, and wil6210. This time around there are also a number of rtlwifi updates to enable some new hardware and to reconcile the in-kernel drivers with some newer releases of the Realtek vendor drivers. Also of note is some device tree work for the bcma bus. Please let me know if there are problems! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-05 21:34:39 -04:00
David S. Miller	61b37d2f54	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Pablo Neira Ayuso says: ==================== Netfilter/IPVS updates for net-next The following patchset contains another batch with Netfilter/IPVS updates for net-next, they are: 1) Add abstracted ICMP codes to the nf_tables reject expression. We introduce four reasons to reject using ICMP that overlap in IPv4 and IPv6 from the semantic point of view. This should simplify the maintainance of dual stack rule-sets through the inet table. 2) Move nf_send_reset() functions from header files to per-family nf_reject modules, suggested by Patrick McHardy. 3) We have to use IS_ENABLED(CONFIG_BRIDGE_NETFILTER) everywhere in the code now that br_netfilter can be modularized. Convert remaining spots in the network stack code. 4) Use rcu_barrier() in the nf_tables module removal path to ensure that we don't leave object that are still pending to be released via call_rcu (that may likely result in a crash). 5) Remove incomplete arch 32/64 compat from nft_compat. The original (bad) idea was to probe the word size based on the xtables match/target info size, but this assumption is wrong when you have to dump the information back to userspace. 6) Allow to filter from prerouting and postrouting in the nf_tables bridge. In order to emulate the ebtables NAT chains (which are actually simple filter chains with no special semantics), we have support filtering from this hooks too. 7) Add explicit module dependency between xt_physdev and br_netfilter. This provides a way to detect if the user needs br_netfilter from the configuration path. This should reduce the breakage of the br_netfilter modularization. 8) Cleanup coding style in ip_vs.h, from Simon Horman. 9) Fix crash in the recently added nf_tables masq expression. We have to register/unregister the notifiers to clean up the conntrack table entries from the module init/exit path, not from the rule addition / deletion path. From Arturo Borrero. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-05 21:32:37 -04:00
Vlad Yasevich	5be5a2df40	bridge: Add filtering support for default_pvid Currently when vlan filtering is turned on on the bridge, the bridge will drop all traffic untill the user configures the filter. This isn't very nice for ports that don't care about vlans and just want untagged traffic. A concept of a default_pvid was recently introduced. This patch adds filtering support for default_pvid. Now, ports that don't care about vlans and don't define there own filter will belong to the VLAN of the default_pvid and continue to receive untagged traffic. This filtering can be disabled by setting default_pvid to 0. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-05 21:21:37 -04:00
Vlad Yasevich	3df6bf45ec	bridge: Simplify pvid checks. Currently, if the pvid is not set, we return an illegal vlan value even though the pvid value is set to 0. Since pvid of 0 is currently invalid, just return 0 instead. This makes the current and future checks simpler. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-05 21:21:36 -04:00
Vlad Yasevich	96a20d9d7f	bridge: Add a default_pvid sysfs attribute This patch allows the user to set and retrieve default_pvid value. A new value can only be stored when vlan filtering is disabled. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-05 21:21:36 -04:00
Ignacy Gawędzki	34a419d4e2	ematch: Fix early ending of inverted containers. The result of a negated container has to be inverted before checking for early ending. This fixes my previous attempt (`17c9c82326`) to make inverted containers work correctly. Signed-off-by: Ignacy Gawędzki <ignacy.gawedzki@green-communications.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-04 20:49:46 -04:00
John Fastabend	1e203c1a2c	net: sched: suspicious RCU usage in qdisc_watchdog Suspicious RCU usage in qdisc_watchdog call needs to be done inside rcu_read_lock/rcu_read_unlock. And then Qdisc destroy operations need to ensure timer is cancelled before removing qdisc structure. [ 3992.191339] =============================== [ 3992.191340] [ INFO: suspicious RCU usage. ] [ 3992.191343] 3.17.0-rc6net-next+ #72 Not tainted [ 3992.191345] ------------------------------- [ 3992.191347] include/net/sch_generic.h:272 suspicious rcu_dereference_check() usage! [ 3992.191348] [ 3992.191348] other info that might help us debug this: [ 3992.191348] [ 3992.191351] [ 3992.191351] rcu_scheduler_active = 1, debug_locks = 1 [ 3992.191353] no locks held by swapper/1/0. [ 3992.191355] [ 3992.191355] stack backtrace: [ 3992.191358] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.17.0-rc6net-next+ #72 [ 3992.191360] Hardware name: /DZ77RE-75K, BIOS GAZ7711H.86A.0060.2012.1115.1750 11/15/2012 [ 3992.191362] 0000000000000001 ffff880235803e48 ffffffff8178f92c 0000000000000000 [ 3992.191366] ffff8802322224a0 ffff880235803e78 ffffffff810c9966 ffff8800a5fe3000 [ 3992.191370] ffff880235803f30 ffff8802359cd768 ffff8802359cd6e0 ffff880235803e98 [ 3992.191374] Call Trace: [ 3992.191376] <IRQ> [<ffffffff8178f92c>] dump_stack+0x4e/0x68 [ 3992.191387] [<ffffffff810c9966>] lockdep_rcu_suspicious+0xe6/0x130 [ 3992.191392] [<ffffffff8167213a>] qdisc_watchdog+0x8a/0xb0 [ 3992.191396] [<ffffffff810f93f2>] __run_hrtimer+0x72/0x420 [ 3992.191399] [<ffffffff810f9bcd>] ? hrtimer_interrupt+0x7d/0x240 [ 3992.191403] [<ffffffff816720b0>] ? tc_classify+0xc0/0xc0 [ 3992.191406] [<ffffffff810f9c4f>] hrtimer_interrupt+0xff/0x240 [ 3992.191410] [<ffffffff8109e4a5>] ? __atomic_notifier_call_chain+0x5/0x140 [ 3992.191415] [<ffffffff8103577b>] local_apic_timer_interrupt+0x3b/0x60 [ 3992.191419] [<ffffffff8179c2b5>] smp_apic_timer_interrupt+0x45/0x60 [ 3992.191422] [<ffffffff8179a6bf>] apic_timer_interrupt+0x6f/0x80 [ 3992.191424] <EOI> [<ffffffff815ed233>] ? cpuidle_enter_state+0x73/0x2e0 [ 3992.191432] [<ffffffff815ed22e>] ? cpuidle_enter_state+0x6e/0x2e0 [ 3992.191437] [<ffffffff815ed567>] cpuidle_enter+0x17/0x20 [ 3992.191441] [<ffffffff810c0741>] cpu_startup_entry+0x3d1/0x4a0 [ 3992.191445] [<ffffffff81106fc6>] ? clockevents_config_and_register+0x26/0x30 [ 3992.191448] [<ffffffff81033c16>] start_secondary+0x1b6/0x260 Fixes: `b26b0d1e8b` ("net: qdisc: use rcu prefix and silence sparse warnings") Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-04 20:45:54 -04:00
Florian Fainelli	f7d6b96f34	net: dsa: do not call phy_start_aneg Commit `f7f1de51ed` ("net: dsa: start and stop the PHY state machine") add calls to phy_start() in dsa_slave_open() respectively phy_stop() in dsa_slave_close(). We also call phy_start_aneg() in dsa_slave_create(), and this call is messing up with the PHY state machine, since we basically start the auto-negotiation, and later on restart it when calling phy_start(). phy_start() does not currently handle the PHY_FORCING or PHY_AN states properly, but such a fix would be too invasive for this window. Fixes: `f7f1de51ed` ("net: dsa: start and stop the PHY state machine") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-04 20:44:44 -04:00
Vijay Subramanian	c8753d55af	net: Cleanup skb cloning by adding SKB_FCLONE_FREE SKB_FCLONE_UNAVAILABLE has overloaded meaning depending on type of skb. 1: If skb is allocated from head_cache, it indicates fclone is not available. 2: If skb is a companion fclone skb (allocated from fclone_cache), it indicates it is available to be used. To avoid confusion for case 2 above, this patch replaces SKB_FCLONE_UNAVAILABLE with SKB_FCLONE_FREE where appropriate. For fclone companion skbs, this indicates it is free for use. SKB_FCLONE_UNAVAILABLE will now simply indicate skb is from head_cache and cannot / will not have a companion fclone. Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-04 20:34:25 -04:00
Nicolas Dichtel	3be07244b7	ip6_gre: fix flowi6_proto value in xmit path In xmit path, we build a flowi6 which will be used for the output route lookup. We are sending a GRE packet, neither IPv4 nor IPv6 encapsulated packet, thus the protocol should be IPPROTO_GRE. Fixes: `c12b395a46` ("gre: Support GRE over IPv6") Reported-by: Matthieu Ternisien d'Ouville <matthieu.tdo@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-04 20:08:24 -04:00
Tom Herbert	bc1fc390e1	ip_tunnel: Add GUE support This patch allows configuring IPIP, sit, and GRE tunnels to use GUE. This is very similar to fou excpet that we need to insert the GUE header in addition to the UDP header on transmit. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-03 16:53:33 -07:00
Tom Herbert	37dd024779	gue: Receive side for Generic UDP Encapsulation This patch adds support receiving for GUE packets in the fou module. The fou module now supports direct foo-over-udp (no encapsulation header) and GUE. To support this a type parameter is added to the fou netlink parameters. For a GUE socket we define gue_udp_recv, gue_gro_receive, and gue_gro_complete to handle the specifics of the GUE protocol. Most of the code to manage and configure sockets is common with the fou. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-03 16:53:33 -07:00
Tom Herbert	efc98d08e1	fou: eliminate IPv4,v6 specific GRO functions This patch removes fou[46]_gro_receive and fou[46]_gro_complete functions. The v4 or v6 variants were chosen for the UDP offloads based on the address family of the socket this is not necessary or correct. Alternatively, this patch adds is_ipv6 to napi_gro_skb. This is set in udp6_gro_receive and unset in udp4_gro_receive. In fou_gro_receive the value is used to select the correct inet_offloads for the protocol of the outer IP header. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-03 16:53:32 -07:00
Tom Herbert	7371e0221c	ip_tunnel: Account for secondary encapsulation header in max_headroom When adjusting max_header for the tunnel interface based on egress device we need to account for any extra bytes in secondary encapsulation (e.g. FOU). Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-03 16:53:32 -07:00
Eric Dumazet	01291202ed	net: do not export skb_gro_receive() skb_gro_receive() is only called from tcp_gro_receive() which is not in a module. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-03 15:54:30 -07:00
Eric Dumazet	55a93b3ea7	qdisc: validate skb without holding lock Validation of skb can be pretty expensive : GSO segmentation and/or checksum computations. We can do this without holding qdisc lock, so that other cpus can queue additional packets. Trick is that requeued packets were already validated, so we carry a boolean so that sch_direct_xmit() can validate a fresh skb list, or directly use an old one. Tested on 40Gb NIC (8 TX queues) and 200 concurrent flows, 48 threads host. Turning TSO on or off had no effect on throughput, only few more cpu cycles. Lock contention on qdisc lock disappeared. Same if disabling TX checksum offload. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-03 15:36:11 -07:00
Herton R. Krzesinski	593cbb3ec6	net/rds: fix possible double free on sock tear down I got a report of a double free happening at RDS slab cache. One suspicion was that may be somewhere we were doing a sock_hold/sock_put on an already freed sock. Thus after providing a kernel with the following change: static inline void sock_hold(struct sock *sk) { - atomic_inc(&sk->sk_refcnt); + if (!atomic_inc_not_zero(&sk->sk_refcnt)) + WARN(1, "Trying to hold sock already gone: %p (family: %hd)\n", + sk, sk->sk_family); } The warning successfuly triggered: Trying to hold sock already gone: ffff81f6dda61280 (family: 21) WARNING: at include/net/sock.h:350 sock_hold() Call Trace: <IRQ> [<ffffffff8adac135>] :rds:rds_send_remove_from_sock+0xf0/0x21b [<ffffffff8adad35c>] :rds:rds_send_drop_acked+0xbf/0xcf [<ffffffff8addf546>] :rds_rdma:rds_ib_recv_tasklet_fn+0x256/0x2dc [<ffffffff8009899a>] tasklet_action+0x8f/0x12b [<ffffffff800125a2>] __do_softirq+0x89/0x133 [<ffffffff8005f30c>] call_softirq+0x1c/0x28 [<ffffffff8006e644>] do_softirq+0x2c/0x7d [<ffffffff8006e4d4>] do_IRQ+0xee/0xf7 [<ffffffff8005e625>] ret_from_intr+0x0/0xa <EOI> Looking at the call chain above, the only way I think this would be possible is if somewhere we already released the same socket->sock which is assigned to the rds_message at rds_send_remove_from_sock. Which seems only possible to happen after the tear down done on rds_release. rds_release properly calls rds_send_drop_to to drop the socket from any rds_message, and some proper synchronization is in place to avoid race with rds_send_drop_acked/rds_send_remove_from_sock. However, I still see a very narrow window where it may be possible we touch a sock already released: when rds_release races with rds_send_drop_acked, we check RDS_MSG_ON_CONN to avoid cleanup on the same rds_message, but in this specific case we don't clear rm->m_rs. In this case, it seems we could then go on at rds_send_drop_to and after it returns, the sock is freed by last sock_put on rds_release, with concurrently we being at rds_send_remove_from_sock; then at some point in the loop at rds_send_remove_from_sock we process an rds_message which didn't have rm->m_rs unset for a freed sock, and a possible sock_hold on an sock already gone at rds_release happens. This hopefully address the described condition above and avoids a double free on "second last" sock_put. In addition, I removed the comment about socket destruction on top of rds_send_drop_acked: we call rds_send_drop_to in rds_release and we should have things properly serialized there, thus I can't see the comment being accurate there. Signed-off-by: Herton R. Krzesinski <herton@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-03 12:52:00 -07:00
Herton R. Krzesinski	eb74cc97b8	net/rds: do proper house keeping if connection fails in rds_tcp_conn_connect I see two problems if we consider the sock->ops->connect attempt to fail in rds_tcp_conn_connect. The first issue is that for example we don't remove the previously added rds_tcp_connection item to rds_tcp_tc_list at rds_tcp_set_callbacks, which means that on a next reconnect attempt for the same rds_connection, when rds_tcp_conn_connect is called we can again call rds_tcp_set_callbacks, resulting in duplicated items on rds_tcp_tc_list, leading to list corruption: to avoid this just make sure we call properly rds_tcp_restore_callbacks before we exit. The second issue is that we should also release the sock properly, by setting sock = NULL only if we are returning without error. Signed-off-by: Herton R. Krzesinski <herton@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-03 12:51:59 -07:00
Herton R. Krzesinski	310886dd5f	net/rds: call rds_conn_drop instead of open code it at rds_connect_complete Signed-off-by: Herton R. Krzesinski <herton@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-03 12:51:59 -07:00
Jesper Dangaard Brouer	808e7ac0bd	qdisc: dequeue bulking also pickup GSO/TSO packets The TSO and GSO segmented packets already benefit from bulking on their own. The TSO packets have always taken advantage of the only updating the tailptr once for a large packet. The GSO segmented packets have recently taken advantage of bulking xmit_more API, via merge commit `53fda7f7f9` ("Merge branch 'xmit_list'"), specifically via commit `7f2e870f2a` ("net: Move main gso loop out of dev_hard_start_xmit() into helper.") allowing qdisc requeue of remaining list. And via commit `ce93718fb7` ("net: Don't keep around original SKB when we software segment GSO frames."). This patch allow further bulking of TSO/GSO packets together, when dequeueing from the qdisc. Testing: Measuring HoL (Head-of-Line) blocking for TSO and GSO, with netperf-wrapper. Bulking several TSO show no performance regressions (requeues were in the area 32 requeues/sec). Bulking several GSOs does show small regression or very small improvement (requeues were in the area 8000 requeues/sec). Using ixgbe 10Gbit/s with GSO bulking, we can measure some additional latency. Base-case, which is "normal" GSO bulking, sees varying high-prio queue delay between 0.38ms to 0.47ms. Bulking several GSOs together, result in a stable high-prio queue delay of 0.50ms. Using igb at 100Mbit/s with GSO bulking, shows an improvement. Base-case sees varying high-prio queue delay between 2.23ms to 2.35ms Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-03 12:37:06 -07:00
Jesper Dangaard Brouer	5772e9a346	qdisc: bulk dequeue support for qdiscs with TCQ_F_ONETXQUEUE Based on DaveM's recent API work on dev_hard_start_xmit(), that allows sending/processing an entire skb list. This patch implements qdisc bulk dequeue, by allowing multiple packets to be dequeued in dequeue_skb(). The optimization principle for this is two fold, (1) to amortize locking cost and (2) avoid expensive tailptr update for notifying HW. (1) Several packets are dequeued while holding the qdisc root_lock, amortizing locking cost over several packet. The dequeued SKB list is processed under the TXQ lock in dev_hard_start_xmit(), thus also amortizing the cost of the TXQ lock. (2) Further more, dev_hard_start_xmit() will utilize the skb->xmit_more API to delay HW tailptr update, which also reduces the cost per packet. One restriction of the new API is that every SKB must belong to the same TXQ. This patch takes the easy way out, by restricting bulk dequeue to qdisc's with the TCQ_F_ONETXQUEUE flag, that specifies the qdisc only have attached a single TXQ. Some detail about the flow; dev_hard_start_xmit() will process the skb list, and transmit packets individually towards the driver (see xmit_one()). In case the driver stops midway in the list, the remaining skb list is returned by dev_hard_start_xmit(). In sch_direct_xmit() this returned list is requeued by dev_requeue_skb(). To avoid overshooting the HW limits, which results in requeuing, the patch limits the amount of bytes dequeued, based on the drivers BQL limits. In-effect bulking will only happen for BQL enabled drivers. Small amounts for extra HoL blocking (2x MTU/0.24ms) were measured at 100Mbit/s, with bulking 8 packets, but the oscillating nature of the measurement indicate something, like sched latency might be causing this effect. More comparisons show, that this oscillation goes away occationally. Thus, we disregard this artifact completely and remove any "magic" bulking limit. For now, as a conservative approach, stop bulking when seeing TSO and segmented GSO packets. They already benefit from bulking on their own. A followup patch add this, to allow easier bisect-ability for finding regressions. Jointed work with Hannes, Daniel and Florian. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-03 12:37:06 -07:00
Arturo Borrero	8da4cc1b10	netfilter: nft_masq: register/unregister notifiers on module init/exit We have to register the notifiers in the masquerade expression from the the module _init and _exit path. This fixes crashes when removing the masquerade rule with no ipt_MASQUERADE support in place (which was masking the problem). Fixes: 9ba1f72 ("netfilter: nf_tables: add new nft_masq expression") Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-10-03 14:24:35 +02:00
David S. Miller	739e4a758e	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/usb/r8152.c net/netfilter/nfnetlink.c Both r8152 and nfnetlink conflicts were simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-02 11:25:43 -07:00
John W. Linville	f6cd071891	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next	2014-10-02 13:56:19 -04:00
Pablo Neira Ayuso	4b7fd5d97e	netfilter: explicit module dependency between br_netfilter and physdev You can use physdev to match the physical interface enslaved to the bridge device. This information is stored in skb->nf_bridge and it is set up by br_netfilter. So, this is only available when iptables is used from the bridge netfilter path. Since `34666d4` ("netfilter: bridge: move br_netfilter out of the core"), the br_netfilter code is modular. To reduce the impact of this change, we can autoload the br_netfilter if the physdev match is used since we assume that the users need br_netfilter in place. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-10-02 18:30:57 +02:00
Pablo Neira Ayuso	36d2af5998	netfilter: nf_tables: allow to filter from prerouting and postrouting This allows us to emulate the NAT table in ebtables, which is actually a plain filter chain that hooks at prerouting, output and postrouting. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-10-02 18:30:56 +02:00
Pablo Neira Ayuso	756c1b1a7f	netfilter: nft_compat: remove incomplete 32/64 bits arch compat code This code was based on the wrong asumption that you can probe based on the match/target private size that we get from userspace. This doesn't work at all when you have to dump the info back to userspace since you don't know what word size the userspace utility is using. Currently, the extensions that require arch compat are limit match and the ebt_mark match/target. The standard targets are not used by the nft-xt compat layer, so they are not affected. We can work around this limitation with a new revision that uses arch agnostic types. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-10-02 18:30:55 +02:00
Pablo Neira Ayuso	1b1bc49c0f	netfilter: nf_tables: wait for call_rcu completion on module removal Make sure the objects have been released before the nf_tables modules is removed. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-10-02 18:30:54 +02:00
Pablo Neira Ayuso	1109a90c01	netfilter: use IS_ENABLED(CONFIG_BRIDGE_NETFILTER) In `34666d4` ("netfilter: bridge: move br_netfilter out of the core"), the bridge netfilter code has been modularized. Use IS_ENABLED instead of ifdef to cover the module case. Fixes: `34666d4` ("netfilter: bridge: move br_netfilter out of the core") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-10-02 18:30:54 +02:00
Pablo Neira Ayuso	c8d7b98bec	netfilter: move nf_send_resetX() code to nf_reject_ipvX modules Move nf_send_reset() and nf_send_reset6() to nf_reject_ipv4 and nf_reject_ipv6 respectively. This code is shared by x_tables and nf_tables. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-10-02 18:30:49 +02:00
Pablo Neira Ayuso	51b0a5d8c2	netfilter: nft_reject: introduce icmp code abstraction for inet and bridge This patch introduces the NFT_REJECT_ICMPX_UNREACH type which provides an abstraction to the ICMP and ICMPv6 codes that you can use from the inet and bridge tables, they are: * NFT_REJECT_ICMPX_NO_ROUTE: no route to host - network unreachable * NFT_REJECT_ICMPX_PORT_UNREACH: port unreachable * NFT_REJECT_ICMPX_HOST_UNREACH: host unreachable * NFT_REJECT_ICMPX_ADMIN_PROHIBITED: administratevely prohibited You can still use the specific codes when restricting the rule to match the corresponding layer 3 protocol. I decided to not overload the existing NFT_REJECT_ICMP_UNREACH to have different semantics depending on the table family and to allow the user to specify ICMP family specific codes if they restrict it to the corresponding family. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-10-02 18:29:57 +02:00
Jukka Rissanen	9c238ca8ec	Bluetooth: 6lowpan: Check transmit errors for multicast packets We did not return error if multicast packet transmit failed. This might not be desired so return error also in this case. If there are multiple 6lowpan devices where the multicast packet is sent, then return error even if sending to only one of them fails. Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-10-02 13:41:57 +03:00
Jukka Rissanen	d7b6b0a532	Bluetooth: 6lowpan: Return EAGAIN error also for multicast packets Make sure that we are able to return EAGAIN from l2cap_chan_send() even for multicast packets. The error code was ignored unncessarily. Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-10-02 13:41:39 +03:00
Jukka Rissanen	a7807d73a0	Bluetooth: 6lowpan: Avoid memory leak if memory allocation fails If skb_unshare() returns NULL, then we leak the original skb. Solution is to use temp variable to hold the new skb. Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-10-02 13:41:32 +03:00
Jukka Rissanen	fc12518a4b	Bluetooth: 6lowpan: Memory leak as the skb is not freed The earlier multicast commit `36b3dd250d` ("Bluetooth: 6lowpan: Ensure header compression does not corrupt IPv6 header") lost one skb free which then caused memory leak. Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-10-02 13:41:30 +03:00
Johan Hedberg	02e246aee8	Bluetooth: Fix lockdep warning with l2cap_chan_connect The L2CAP connection's channel list lock (conn->chan_lock) must never be taken while already holding a channel lock (chan->lock) in order to avoid lock-inversion and lockdep warnings. So far the l2cap_chan_connect function has acquired the chan->lock early in the function and then later called l2cap_chan_add(conn, chan) which will try to take the conn->chan_lock. This violates the correct order of taking the locks and may lead to the following type of lockdep warnings: -> #1 (&conn->chan_lock){+.+...}: [<c109324d>] lock_acquire+0x9d/0x140 [<c188459c>] mutex_lock_nested+0x6c/0x420 [<d0aab48e>] l2cap_chan_add+0x1e/0x40 [bluetooth] [<d0aac618>] l2cap_chan_connect+0x348/0x8f0 [bluetooth] [<d0cc9a91>] lowpan_control_write+0x221/0x2d0 [bluetooth_6lowpan] -> #0 (&chan->lock){+.+.+.}: [<c10928d8>] __lock_acquire+0x1a18/0x1d20 [<c109324d>] lock_acquire+0x9d/0x140 [<c188459c>] mutex_lock_nested+0x6c/0x420 [<d0ab05fd>] l2cap_connect_cfm+0x1dd/0x3f0 [bluetooth] [<d0a909c4>] hci_le_meta_evt+0x11a4/0x1260 [bluetooth] [<d0a910eb>] hci_event_packet+0x3ab/0x3120 [bluetooth] [<d0a7cb08>] hci_rx_work+0x208/0x4a0 [bluetooth] CPU0 CPU1 ---- ---- lock(&conn->chan_lock); lock(&chan->lock); lock(&conn->chan_lock); lock(&chan->lock); Before calling l2cap_chan_add() the channel is not part of the conn->chan_l list, and can therefore only be accessed by the L2CAP user (such as l2cap_sock.c). We can therefore assume that it is the responsibility of the user to handle mutual exclusion until this point (which we can see is already true in l2cap_sock.c by it in many places touching chan members without holding chan->lock). Since the hci_conn and by exctension l2cap_conn creation in the l2cap_chan_connect() function depend on chan details we cannot simply add a mutex_lock(&conn->chan_lock) in the beginning of the function (since the conn object doesn't yet exist there). What we can do however is move the chan->lock taking later into the function where we already have the conn object and can that way take conn->chan_lock first. This patch implements the above strategy and does some other necessary changes such as using __l2cap_chan_add() which assumes conn->chan_lock is held, as well as adding a second needed label so the unlocking happens as it should. Reported-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Tested-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-10-02 10:37:07 +02:00
Alexei Starovoitov	38b2cf2982	net: pktgen: packet bursting via skb->xmit_more This patch demonstrates the effect of delaying update of HW tailptr. (based on earlier patch by Jesper) burst=1 is the default. It sends one packet with xmit_more=false burst=2 sends one packet with xmit_more=true and 2nd copy of the same packet with xmit_more=false burst=3 sends two copies of the same packet with xmit_more=true and 3rd copy with xmit_more=false Performance with ixgbe (usec 30): burst=1 tx:9.2 Mpps burst=2 tx:13.5 Mpps burst=3 tx:14.5 Mpps full 10G line rate Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-01 22:08:12 -04:00
Florian Fainelli	775dd692bd	net: bridge: add a br_set_state helper function In preparation for being able to propagate port states to e.g: notifiers or other kernel parts, do not manipulate the port state directly, but instead use a helper function which will allow us to do a bit more than just setting the state. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-01 22:03:50 -04:00
WANG Cong	a0efb80ce3	net_sched: avoid calling tcf_unbind_filter() in call_rcu callback This fixes the following crash: [ 63.976822] general protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC [ 63.980094] CPU: 1 PID: 15 Comm: ksoftirqd/1 Not tainted 3.17.0-rc6+ #648 [ 63.980094] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 63.980094] task: ffff880117dea690 ti: ffff880117dfc000 task.ti: ffff880117dfc000 [ 63.980094] RIP: 0010:[<ffffffff817e6d07>] [<ffffffff817e6d07>] u32_destroy_key+0x27/0x6d [ 63.980094] RSP: 0018:ffff880117dffcc0 EFLAGS: 00010202 [ 63.980094] RAX: ffff880117dea690 RBX: ffff8800d02e0820 RCX: 0000000000000000 [ 63.980094] RDX: 0000000000000001 RSI: 0000000000000002 RDI: 6b6b6b6b6b6b6b6b [ 63.980094] RBP: ffff880117dffcd0 R08: 0000000000000000 R09: 0000000000000000 [ 63.980094] R10: 00006c0900006ba8 R11: 00006ba100006b9d R12: 0000000000000001 [ 63.980094] R13: ffff8800d02e0898 R14: ffffffff817e6d4d R15: ffff880117387a30 [ 63.980094] FS: 0000000000000000(0000) GS:ffff88011a800000(0000) knlGS:0000000000000000 [ 63.980094] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 63.980094] CR2: 00007f07e6732fed CR3: 000000011665b000 CR4: 00000000000006e0 [ 63.980094] Stack: [ 63.980094] ffff88011a9cd300 ffffffff82051ac0 ffff880117dffce0 ffffffff817e6d68 [ 63.980094] ffff880117dffd70 ffffffff810cb4c7 ffffffff810cb3cd ffff880117dfffd8 [ 63.980094] ffff880117dea690 ffff880117dea690 ffff880117dfffd8 000000000000000a [ 63.980094] Call Trace: [ 63.980094] [<ffffffff817e6d68>] u32_delete_key_freepf_rcu+0x1b/0x1d [ 63.980094] [<ffffffff810cb4c7>] rcu_process_callbacks+0x3bb/0x691 [ 63.980094] [<ffffffff810cb3cd>] ? rcu_process_callbacks+0x2c1/0x691 [ 63.980094] [<ffffffff817e6d4d>] ? u32_destroy_key+0x6d/0x6d [ 63.980094] [<ffffffff810780a4>] __do_softirq+0x142/0x323 [ 63.980094] [<ffffffff810782a8>] run_ksoftirqd+0x23/0x53 [ 63.980094] [<ffffffff81092126>] smpboot_thread_fn+0x203/0x221 [ 63.980094] [<ffffffff81091f23>] ? smpboot_unpark_thread+0x33/0x33 [ 63.980094] [<ffffffff8108e44d>] kthread+0xc9/0xd1 [ 63.980094] [<ffffffff819e00ea>] ? do_wait_for_common+0xf8/0x125 [ 63.980094] [<ffffffff8108e384>] ? __kthread_parkme+0x61/0x61 [ 63.980094] [<ffffffff819e43ec>] ret_from_fork+0x7c/0xb0 [ 63.980094] [<ffffffff8108e384>] ? __kthread_parkme+0x61/0x61 tp could be freed in call_rcu callback too, the order is not guaranteed. John Fastabend says: ==================== Its worth noting why this is safe. Any running schedulers will either read the valid class field or it will be zeroed. All schedulers today when the class is 0 do a lookup using the same call used by the tcf_exts_bind(). So even if we have a running classifier hit the null class pointer it will do a lookup and get to the same result. This is particularly fragile at the moment because the only way to verify this is to audit the schedulers call sites. ==================== Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-01 22:00:42 -04:00
WANG Cong	6e0565697a	net_sched: fix another crash in cls_tcindex This patch fixes the following crash: [ 166.670795] BUG: unable to handle kernel NULL pointer dereference at (null) [ 166.674230] IP: [<ffffffff814b739f>] __list_del_entry+0x5c/0x98 [ 166.674230] PGD d0ea5067 PUD ce7fc067 PMD 0 [ 166.674230] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC [ 166.674230] CPU: 1 PID: 775 Comm: tc Not tainted 3.17.0-rc6+ #642 [ 166.674230] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 166.674230] task: ffff8800d03c4d20 ti: ffff8800cae7c000 task.ti: ffff8800cae7c000 [ 166.674230] RIP: 0010:[<ffffffff814b739f>] [<ffffffff814b739f>] __list_del_entry+0x5c/0x98 [ 166.674230] RSP: 0018:ffff8800cae7f7d0 EFLAGS: 00010207 [ 166.674230] RAX: 0000000000000000 RBX: ffff8800cba8d700 RCX: ffff8800cba8d700 [ 166.674230] RDX: 0000000000000000 RSI: dead000000200200 RDI: ffff8800cba8d700 [ 166.674230] RBP: ffff8800cae7f7d0 R08: 0000000000000001 R09: 0000000000000001 [ 166.674230] R10: 0000000000000000 R11: 000000000000859a R12: ffffffffffffffe8 [ 166.674230] R13: ffff8800cba8c5b8 R14: 0000000000000001 R15: ffff8800cba8d700 [ 166.674230] FS: 00007fdb5f04a740(0000) GS:ffff88011a800000(0000) knlGS:0000000000000000 [ 166.674230] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 166.674230] CR2: 0000000000000000 CR3: 00000000cf929000 CR4: 00000000000006e0 [ 166.674230] Stack: [ 166.674230] ffff8800cae7f7e8 ffffffff814b73e8 ffff8800cba8d6e8 ffff8800cae7f828 [ 166.674230] ffffffff817caeec 0000000000000046 ffff8800cba8c5b0 ffff8800cba8c5b8 [ 166.674230] 0000000000000000 0000000000000001 ffff8800cf8e33e8 ffff8800cae7f848 [ 166.674230] Call Trace: [ 166.674230] [<ffffffff814b73e8>] list_del+0xd/0x2b [ 166.674230] [<ffffffff817caeec>] tcf_action_destroy+0x4c/0x71 [ 166.674230] [<ffffffff817ca0ce>] tcf_exts_destroy+0x20/0x2d [ 166.674230] [<ffffffff817ec2b5>] tcindex_delete+0x196/0x1b7 struct list_head can not be simply copied and we should always init it. Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-01 22:00:42 -04:00
Tom Herbert	54bc9bac30	gre: Set inner protocol in v4 and v6 GRE transmit Call skb_set_inner_protocol to set inner Ethernet protocol to protocol being encapsulation by GRE before tunnel_xmit. This is needed for GSO if UDP encapsulation (fou) is being done. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-01 21:35:51 -04:00
Tom Herbert	077c5a0948	ipip: Set inner IP protocol in ipip Call skb_set_inner_ipproto to set inner IP protocol to IPPROTO_IPV4 before tunnel_xmit. This is needed if UDP encapsulation (fou) is being done. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-01 21:35:51 -04:00
Tom Herbert	469471cdfc	sit: Set inner IP protocol in sit Call skb_set_inner_ipproto to set inner IP protocol to IPPROTO_IPV6 before tunnel_xmit. This is needed if UDP encapsulation (fou) is being done. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-01 21:35:51 -04:00
Tom Herbert	8bce6d7d0d	udp: Generalize skb_udp_segment skb_udp_segment is the function called from udp4_ufo_fragment to segment a UDP tunnel packet. This function currently assumes segmentation is transparent Ethernet bridging (i.e. VXLAN encapsulation). This patch generalizes the function to operate on either Ethertype or IP protocol. The inner_protocol field must be set to the protocol of the inner header. This can now be either an Ethertype or an IP protocol (in a union). A new flag in the skbuff indicates which type is effective. skb_set_inner_protocol and skb_set_inner_ipproto helper functions were added to set the inner_protocol. These functions are called from the point where the tunnel encapsulation is occuring. When skb_udp_tunnel_segment is called, the function to segment the inner packet is selected based on the inner IP or Ethertype. In the case of an IP protocol encapsulation, the function is derived from inet[6]_offloads. In the case of Ethertype, skb->protocol is set to the inner_protocol and skb_mac_gso_segment is called. (GRE currently does this, but it might be possible to lookup the protocol in offload_base and call the appropriate segmenation function directly). Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-01 21:35:51 -04:00
Eric Dumazet	ce1a4ea3f1	net: avoid one atomic operation in skb_clone() Fast clone cloning can actually avoid an atomic_inc(), if we guarantee prior clone_ref value is 1. This requires a change kfree_skbmem(), to perform the atomic_dec_and_test() on clone_ref before setting fclone to SKB_FCLONE_UNAVAILABLE. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-01 21:27:23 -04:00
Fabian Frederick	e500f488c2	net/dccp/ccid.c: add __init to ccid_activate ccid_activate is only called by __init ccid_initialize_builtins in same module. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-01 18:33:13 -04:00
Fabian Frederick	0c5b8a4629	net/dccp/proto.c: add __init to dccp_mib_init dccp_mib_init is only called by __init dccp_init in same module. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-01 18:33:13 -04:00
Eric Dumazet	d0bf4a9e92	net: cleanup and document skb fclone layout Lets use a proper structure to clearly document and implement skb fast clones. Then, we might experiment more easily alternative layouts. This patch adds a new skb_fclone_busy() helper, used by tcp and xfrm, to stop leaking of implementation details. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-01 16:34:25 -04:00
Yuchung Cheng	b248230c34	tcp: abort orphan sockets stalling on zero window probes Currently we have two different policies for orphan sockets that repeatedly stall on zero window ACKs. If a socket gets a zero window ACK when it is transmitting data, the RTO is used to probe the window. The socket is aborted after roughly tcp_orphan_retries() retries (as in tcp_write_timeout()). But if the socket was idle when it received the zero window ACK, and later wants to send more data, we use the probe timer to probe the window. If the receiver always returns zero window ACKs, icsk_probes keeps getting reset in tcp_ack() and the orphan socket can stall forever until the system reaches the orphan limit (as commented in tcp_probe_timer()). This opens up a simple attack to create lots of hanging orphan sockets to burn the memory and the CPU, as demonstrated in the recent netdev post "TCP connection will hang in FIN_WAIT1 after closing if zero window is advertised." http://www.spinics.net/lists/netdev/msg296539.html This patch follows the design in RTO-based probe: we abort an orphan socket stalling on zero window when the probe timer reaches both the maximum backoff and the maximum RTO. For example, an 100ms RTT connection will timeout after roughly 153 seconds (0.3 + 0.6 + .... + 76.8) if the receiver keeps the window shut. If the orphan socket passes this check, but the system already has too many orphans (as in tcp_out_of_resources()), we still abort it but we'll also send an RST packet as the connection may still be active. In addition, we change TCP_USER_TIMEOUT to cover (life or dead) sockets stalled on zero-window probes. This changes the semantics of TCP_USER_TIMEOUT slightly because it previously only applies when the socket has pending transmission. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Reported-by: Andrey Dmitrov <andrey.dmitrov@oktetlabs.ru> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-01 16:27:52 -04:00
Fabian Frederick	cb57659a15	cipso: add __init to cipso_v4_cache_init cipso_v4_cache_init is only called by __init cipso_v4_init Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-01 15:46:20 -04:00
Fabian Frederick	57a02c39c1	inet: frags: add __init to ip4_frags_ctl_register ip4_frags_ctl_register is only called by __init ipfrag_init Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-01 15:46:19 -04:00
Fabian Frederick	47d7a88c18	tcp: add __init to tcp_init_mem tcp_init_mem is only called by __init tcp_init. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-01 15:41:14 -04:00
Thierry Reding	e506d405ac	net: dsa: Fix build warning for !PM_SLEEP The dsa_switch_suspend() and dsa_switch_resume() functions are only used when PM_SLEEP is enabled, so they need #ifdef CONFIG_PM_SLEEP protection to avoid a compiler warning. Signed-off-by: Thierry Reding <treding@nvidia.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-01 15:24:00 -04:00
Eric Dumazet	2c804d0f8f	ipv4: mentions skb_gro_postpull_rcsum() in inet_gro_receive() Proper CHECKSUM_COMPLETE support needs to adjust skb->csum when we remove one header. Its done using skb_gro_postpull_rcsum() In the case of IPv4, we know that the adjustment is not really needed, because the checksum over IPv4 header is 0. Lets add a comment to ease code comprehension and avoid copy/paste errors. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-01 13:44:05 -04:00
Fabian Frederick	f0a0c1cedf	ieee802154: fix __init functions Commit `3243acd37f` ("ieee802154: add __init to lowpan_frags_sysctl_register") added __init to lowpan_frags_ns_sysctl_register instead of lowpan_frags_sysctl_register Suggested-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-01 02:03:13 -04:00
Trond Myklebust	72c23f0819	Merge branch 'bugfixes' into linux-next * bugfixes: NFSv4.1: Fix an NFSv4.1 state renewal regression NFSv4: fix open/lock state recovery error handling NFSv4: Fix lock recovery when CREATE_SESSION/SETCLIENTID_CONFIRM fails NFS: Fabricate fscache server index key correctly SUNRPC: Add missing support for RPC_CLNT_CREATE_NO_RETRANS_TIMEOUT nfs: fix duplicate proc entries	2014-09-30 17:21:41 -04:00
Li RongQing	a12a601ed1	tcp: Change tcp_slow_start function to return void No caller uses the return value, so make this function return void. Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-30 17:09:16 -04:00
Fabian Frederick	3243acd37f	ieee802154: add __init to lowpan_frags_sysctl_register lowpan_frags_sysctl_register is only called by __init lowpan_net_frag_init (part of the lowpan module). Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-30 17:08:06 -04:00
Fabian Frederick	0d4a2f9a33	irda: add __init to irlan_open irlan_open is only called by __init irlan_init in same module. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-30 17:08:06 -04:00
Florian Westphal	57f5877c11	netfilter: bridge: build br_nf_core only if required Eric reports build failure with CONFIG_BRIDGE_NETFILTER=n We insist to build br_nf_core.o unconditionally, but we must only do so if br_netfilter was enabled, else it fails to build due to functions being defined to empty stubs (and some structure members being defined out). Also, BRIDGE_NETFILTER=y\|m makes no sense when BRIDGE=n. Fixes: `34666d467` (netfilter: bridge: move br_netfilter out of the core) Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-30 14:07:51 -04:00
Hannes Frederic Sowa	705f1c869d	ipv6: remove rt6i_genid Eric Dumazet noticed that all no-nonexthop or no-gateway routes which are already marked DST_HOST (e.g. input routes routes) will always be invalidated during sk_dst_check. Thus per-socket dst caching absolutely had no effect and early demuxing had no effect. Thus this patch removes rt6i_genid: fn_sernum already gets modified during add operations, so we only must ensure we mutate fn_sernum during ipv6 address remove operations. This is a fairly cost extensive operations, but address removal should not happen that often. Also our mtu update functions do the same and we heard no complains so far. xfrm policy changes also cause a call into fib6_flush_trees. Also plug a hole in rt6_info (no cacheline changes). I verified via tracing that this change has effect. Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: YOSHIFUJI Hideaki <hideaki@yoshifuji.org> Cc: Vlad Yasevich <vyasevich@gmail.com> Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com> Cc: Martin Lau <kafai@fb.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-30 14:00:48 -04:00
James Morris	6c8ff877cd	Merge commit 'v3.16' into next	2014-10-01 00:44:04 +10:00
John Fastabend	b0ab6f9275	net: sched: enable per cpu qstats After previous patches to simplify qstats the qstats can be made per cpu with a packed union in Qdisc struct. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-30 01:02:26 -04:00
John Fastabend	6401585366	net: sched: restrict use of qstats qlen This removes the use of qstats->qlen variable from the classifiers and makes it an explicit argument to gnet_stats_copy_queue(). The qlen represents the qdisc queue length and is packed into the qstats at the last moment before passnig to user space. By handling it explicitely we avoid, in the percpu stats case, having to figure out which per_cpu variable to put it in. It would probably be best to remove it from qstats completely but qstats is a user space ABI and can't be broken. A future patch could make an internal only qstats structure that would avoid having to allocate an additional u32 variable on the Qdisc struct. This would make the qstats struct 128bits instead of 128+32. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-30 01:02:26 -04:00
John Fastabend	25331d6ce4	net: sched: implement qstat helper routines This adds helpers to manipulate qstats logic and replaces locations that touch the counters directly. This simplifies future patches to push qstats onto per cpu counters. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-30 01:02:26 -04:00
John Fastabend	22e0f8b932	net: sched: make bstats per cpu and estimator RCU safe In order to run qdisc's without locking statistics and estimators need to be handled correctly. To resolve bstats make the statistics per cpu. And because this is only needed for qdiscs that are running without locks which is not the case for most qdiscs in the near future only create percpu stats when qdiscs set the TCQ_F_CPUSTATS flag. Next because estimators use the bstats to calculate packets per second and bytes per second the estimator code paths are updated to use the per cpu statistics. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-30 01:02:26 -04:00
Ignacy Gawędzki	17c9c82326	ematch: Fix matching of inverted containers. Negated expressions and sub-expressions need to have their flags checked for TCF_EM_INVERT and their result negated accordingly. Signed-off-by: Ignacy Gawędzki <ignacy.gawedzki@green-communications.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-29 15:31:29 -04:00
Eric Dumazet	73d3fe6d1c	gro: fix aggregation for skb using frag_list In commit `8a29111c7c` ("net: gro: allow to build full sized skb") I added a regression for linear skb that traditionally force GRO to use the frag_list fallback. Erez Shitrit found that at most two segments were aggregated and the "if (skb_gro_len(p) != pinfo->gso_size)" test was failing. This is because pinfo at this spot still points to the last skb in the chain, instead of the first one, where we find the correct gso_size information. Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: `8a29111c7c` ("net: gro: allow to build full sized skb") Reported-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-29 15:17:59 -04:00
David S. Miller	852248449c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Pablo Neira Ayuso says: ==================== pull request: netfilter/ipvs updates for net-next The following patchset contains Netfilter/IPVS updates for net-next, most relevantly they are: 1) Four patches to make the new nf_tables masquerading support independent of the x_tables infrastructure. This also resolves a compilation breakage if the masquerade target is disabled but the nf_tables masq expression is enabled. 2) ipset updates via Jozsef Kadlecsik. This includes the addition of the skbinfo extension that allows you to store packet metainformation in the elements. This can be used to fetch and restore this to the packets through the iptables SET target, patches from Anton Danilov. 3) Add the hash:mac set type to ipset, from Jozsef Kadlecsick. 4) Add simple weighted fail-over scheduler via Simon Horman. This provides a fail-over IPVS scheduler (unlike existing load balancing schedulers). Connections are directed to the appropriate server based solely on highest weight value and server availability, patch from Kenny Mathis. 5) Support IPv6 real servers in IPv4 virtual-services and vice versa. Simon Horman informs that the motivation for this is to allow more flexibility in the choice of IP version offered by both virtual-servers and real-servers as they no longer need to match: An IPv4 connection from an end-user may be forwarded to a real-server using IPv6 and vice versa. No ip_vs_sync support yet though. Patches from Alex Gartrell and Julian Anastasov. 6) Add global generation ID to the nf_tables ruleset. When dumping from several different object lists, we need a way to identify that an update has ocurred so userspace knows that it needs to refresh its lists. This also includes a new command to obtain the 32-bits generation ID. The less significant 16-bits of this ID is also exposed through res_id field in the nfnetlink header to quickly detect the interference and retry when there is no risk of ID wraparound. 7) Move br_netfilter out of the bridge core. The br_netfilter code is built in the bridge core by default. This causes problems of different kind to people that don't want this: Jesper reported performance drop due to the inconditional hook registration and I remember to have read complains on netdev from people regarding the unexpected behaviour of our bridging stack when br_netfilter is enabled (fragmentation handling, layer 3 and upper inspection). People that still need this should easily undo the damage by modprobing the new br_netfilter module. 8) Dump the set policy nf_tables that allows set parameterization. So userspace can keep user-defined preferences when saving the ruleset. From Arturo Borrero. 9) Use __seq_open_private() helper function to reduce boiler plate code in x_tables, From Rob Jones. 10) Safer default behaviour in case that you forget to load the protocol tracker. Daniel Borkmann and Florian Westphal detected that if your ruleset is stateful, you allow traffic to at least one single SCTP port and the SCTP protocol tracker is not loaded, then any SCTP traffic may be pass through unfiltered. After this patch, the connection tracking classifies SCTP/DCCP/UDPlite/GRE packets as invalid if your kernel has been compiled with support for these modules. ==================== Trivially resolved conflict in include/linux/skbuff.h, Eric moved some netfilter skbuff members around, and the netfilter tree adjusted the ifdef guards for the bridging info pointer. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-29 14:46:53 -04:00
Florian Westphal	735d383117	tcp: change TCP_ECN prefixes to lower case Suggested by Stephen. Also drop inline keyword and let compiler decide. gcc 4.7.3 decides to no longer inline tcp_ecn_check_ce, so split it up. The actual evaluation is not inlined anymore while the ECN_OK test is. Suggested-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-29 14:41:22 -04:00
Florian Westphal	d82bd12298	tcp: move TCP_ECN_create_request out of header After Octavian Purdilas tcp ipv4/ipv6 unification work this helper only has a single callsite. While at it, convert name to lowercase, suggested by Stephen. Suggested-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-29 14:41:22 -04:00
Steve Wise	7e5be28827	svcrdma: advertise the correct max payload Svcrdma currently advertises 1MB, which is too large. The correct value is the minimum of RPCSVC_MAXPAYLOAD and the max scatter-gather allowed in an NFSRDMA IO chunk * the host page size. This bug is usually benign because the Linux X64 NFSRDMA client correctly limits the payload size to the correct value (64*4096 = 256KB). But if the Linux client is PPC64 with a 64KB page size, then the client will indeed use a payload size that will overflow the server. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-09-29 14:35:18 -04:00
Li RongQing	41c91996d9	tcp: remove unnecessary assignment. This variable i is overwritten to 0 by following code Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-29 12:31:12 -04:00
Eric Dumazet	b193722731	net: reorganize sk_buff for faster __copy_skb_header() With proliferation of bit fields in sk_buff, __copy_skb_header() became quite expensive, showing as the most expensive function in a GSO workload. __copy_skb_header() performance is also critical for non GSO TCP operations, as it is used from skb_clone() This patch carefully moves all the fields that were not copied in a separate zone : cloned, nohdr, fclone, peeked, head_frag, xmit_more Then I moved all other fields and all other copied fields in a section delimited by headers_start[0]/headers_end[0] section so that we can use a single memcpy() call, inlined by compiler using long word load/stores. I also tried to make all copies in the natural orders of sk_buff, to help hardware prefetching. I made sure sk_buff size did not change. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-29 12:27:20 -04:00
Jukka Rissanen	156395c998	Bluetooth: 6lowpan: Enable multicast support Set multicast support for 6lowpan network interface. This is needed in every network interface that supports IPv6. Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-29 17:06:38 +02:00
Jukka Rissanen	36b3dd250d	Bluetooth: 6lowpan: Ensure header compression does not corrupt IPv6 header If skb is going to multiple destinations, then make sure that we do not overwrite the common IPv6 headers. So before compressing the IPv6 headers, we copy the skb and that is then sent to 6LoWPAN Bluetooth devices. This is a similar patch as what was done for IEEE 802.154 6LoWPAN in commit `f19f4f9525` ("ieee802154: 6lowpan: ensure header compression does not corrupt ipv6 header") Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-29 17:06:38 +02:00
Florian Westphal	db29a9508a	netfilter: conntrack: disable generic tracking for known protocols Given following iptables ruleset: -P FORWARD DROP -A FORWARD -m sctp --dport 9 -j ACCEPT -A FORWARD -p tcp --dport 80 -j ACCEPT -A FORWARD -p tcp -m conntrack -m state ESTABLISHED,RELATED -j ACCEPT One would assume that this allows SCTP on port 9 and TCP on port 80. Unfortunately, if the SCTP conntrack module is not loaded, this allows all SCTP communication, to pass though, i.e. -p sctp -j ACCEPT, which we think is a security issue. This is because on the first SCTP packet on port 9, we create a dummy "generic l4" conntrack entry without any port information (since conntrack doesn't know how to extract this information). All subsequent packets that are unknown will then be in established state since they will fallback to proto_generic and will match the 'generic' entry. Our originally proposed version [1] completely disabled generic protocol tracking, but Jozsef suggests to not track protocols for which a more suitable helper is available, hence we now mitigate the issue for in tree known ct protocol helpers only, so that at least NAT and direction information will still be preserved for others. [1] http://www.spinics.net/lists/netfilter-devel/msg33430.html Joint work with Daniel Borkmann. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-09-29 12:17:49 +02:00
Arturo Borrero	9363dc4b59	netfilter: nf_tables: store and dump set policy We want to know in which cases the user explicitly sets the policy options. In that case, we also want to dump back the info. Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-09-29 11:28:03 +02:00
Jukka Rissanen	59790aa287	Bluetooth: 6lowpan: Make sure skb exists before accessing it We need to make sure that the saved skb exists when resuming or suspending a CoC channel. This can happen if initial credits is 0 when channel is connected. Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-29 10:10:02 +02:00
Daniel Borkmann	e3118e8359	net: tcp: add DCTCP congestion control algorithm This work adds the DataCenter TCP (DCTCP) congestion control algorithm [1], which has been first published at SIGCOMM 2010 [2], resp. follow-up analysis at SIGMETRICS 2011 [3] (and also, more recently as an informational IETF draft available at [4]). DCTCP is an enhancement to the TCP congestion control algorithm for data center networks. Typical data center workloads are i.e. i) partition/aggregate (queries; bursty, delay sensitive), ii) short messages e.g. 50KB-1MB (for coordination and control state; delay sensitive), and iii) large flows e.g. 1MB-100MB (data update; throughput sensitive). DCTCP has therefore been designed for such environments to provide/achieve the following three requirements: * High burst tolerance (incast due to partition/aggregate) * Low latency (short flows, queries) * High throughput (continuous data updates, large file transfers) with commodity, shallow buffered switches The basic idea of its design consists of two fundamentals: i) on the switch side, packets are being marked when its internal queue length > threshold K (K is chosen so that a large enough headroom for marked traffic is still available in the switch queue); ii) the sender/host side maintains a moving average of the fraction of marked packets, so each RTT, F is being updated as follows: F := X / Y, where X is # of marked ACKs, Y is total # of ACKs alpha := (1 - g) * alpha + g * F, where g is a smoothing constant The resulting alpha (iow: probability that switch queue is congested) is then being used in order to adaptively decrease the congestion window W: W := (1 - (alpha / 2)) * W The means for receiving marked packets resp. marking them on switch side in DCTCP is the use of ECN. RFC3168 describes a mechanism for using Explicit Congestion Notification from the switch for early detection of congestion, rather than waiting for segment loss to occur. However, this method only detects the presence of congestion, not the extent. In the presence of mild congestion, it reduces the TCP congestion window too aggressively and unnecessarily affects the throughput of long flows [4]. DCTCP, as mentioned, enhances Explicit Congestion Notification (ECN) processing to estimate the fraction of bytes that encounter congestion, rather than simply detecting that some congestion has occurred. DCTCP then scales the TCP congestion window based on this estimate [4], thus it can derive multibit feedback from the information present in the single-bit sequence of marks in its control law. And thus act in proportion to the extent of congestion, not its presence. Switches therefore set the Congestion Experienced (CE) codepoint in packets when internal queue lengths exceed threshold K. Resulting, DCTCP delivers the same or better throughput than normal TCP, while using 90% less buffer space. It was found in [2] that DCTCP enables the applications to handle 10x the current background traffic, without impacting foreground traffic. Moreover, a 10x increase in foreground traffic did not cause any timeouts, and thus largely eliminates TCP incast collapse problems. The algorithm itself has already seen deployments in large production data centers since then. We did a long-term stress-test and analysis in a data center, short summary of our TCP incast tests with iperf compared to cubic: This test measured DCTCP throughput and latency and compared it with CUBIC throughput and latency for an incast scenario. In this test, 19 senders sent at maximum rate to a single receiver. The receiver simply ran iperf -s. The senders ran iperf -c <receiver> -t 30. All senders started simultaneously (using local clocks synchronized by ntp). This test was repeated multiple times. Below shows the results from a single test. Other tests are similar. (DCTCP results were extremely consistent, CUBIC results show some variance induced by the TCP timeouts that CUBIC encountered.) For this test, we report statistics on the number of TCP timeouts, flow throughput, and traffic latency. 1) Timeouts (total over all flows, and per flow summaries): CUBIC DCTCP Total 3227 25 Mean 169.842 1.316 Median 183 1 Max 207 5 Min 123 0 Stddev 28.991 1.600 Timeout data is taken by measuring the net change in netstat -s "other TCP timeouts" reported. As a result, the timeout measurements above are not restricted to the test traffic, and we believe that it is likely that all of the "DCTCP timeouts" are actually timeouts for non-test traffic. We report them nevertheless. CUBIC will also include some non-test timeouts, but they are drawfed by bona fide test traffic timeouts for CUBIC. Clearly DCTCP does an excellent job of preventing TCP timeouts. DCTCP reduces timeouts by at least two orders of magnitude and may well have eliminated them in this scenario. 2) Throughput (per flow in Mbps): CUBIC DCTCP Mean 521.684 521.895 Median 464 523 Max 776 527 Min 403 519 Stddev 105.891 2.601 Fairness 0.962 0.999 Throughput data was simply the average throughput for each flow reported by iperf. By avoiding TCP timeouts, DCTCP is able to achieve much better per-flow results. In CUBIC, many flows experience TCP timeouts which makes flow throughput unpredictable and unfair. DCTCP, on the other hand, provides very clean predictable throughput without incurring TCP timeouts. Thus, the standard deviation of CUBIC throughput is dramatically higher than the standard deviation of DCTCP throughput. Mean throughput is nearly identical because even though cubic flows suffer TCP timeouts, other flows will step in and fill the unused bandwidth. Note that this test is something of a best case scenario for incast under CUBIC: it allows other flows to fill in for flows experiencing a timeout. Under situations where the receiver is issuing requests and then waiting for all flows to complete, flows cannot fill in for timed out flows and throughput will drop dramatically. 3) Latency (in ms): CUBIC DCTCP Mean 4.0088 0.04219 Median 4.055 0.0395 Max 4.2 0.085 Min 3.32 0.028 Stddev 0.1666 0.01064 Latency for each protocol was computed by running "ping -i 0.2 <receiver>" from a single sender to the receiver during the incast test. For DCTCP, "ping -Q 0x6 -i 0.2 <receiver>" was used to ensure that traffic traversed the DCTCP queue and was not dropped when the queue size was greater than the marking threshold. The summary statistics above are over all ping metrics measured between the single sender, receiver pair. The latency results for this test show a dramatic difference between CUBIC and DCTCP. CUBIC intentionally overflows the switch buffer which incurs the maximum queue latency (more buffer memory will lead to high latency.) DCTCP, on the other hand, deliberately attempts to keep queue occupancy low. The result is a two orders of magnitude reduction of latency with DCTCP - even with a switch with relatively little RAM. Switches with larger amounts of RAM will incur increasing amounts of latency for CUBIC, but not for DCTCP. 4) Convergence and stability test: This test measured the time that DCTCP took to fairly redistribute bandwidth when a new flow commences. It also measured DCTCP's ability to remain stable at a fair bandwidth distribution. DCTCP is compared with CUBIC for this test. At the commencement of this test, a single flow is sending at maximum rate (near 10 Gbps) to a single receiver. One second after that first flow commences, a new flow from a distinct server begins sending to the same receiver as the first flow. After the second flow has sent data for 10 seconds, the second flow is terminated. The first flow sends for an additional second. Ideally, the bandwidth would be evenly shared as soon as the second flow starts, and recover as soon as it stops. The results of this test are shown below. Note that the flow bandwidth for the two flows was measured near the same time, but not simultaneously. DCTCP performs nearly perfectly within the measurement limitations of this test: bandwidth is quickly distributed fairly between the two flows, remains stable throughout the duration of the test, and recovers quickly. CUBIC, in contrast, is slow to divide the bandwidth fairly, and has trouble remaining stable. CUBIC DCTCP Seconds Flow 1 Flow 2 Seconds Flow 1 Flow 2 0 9.93 0 0 9.92 0 0.5 9.87 0 0.5 9.86 0 1 8.73 2.25 1 6.46 4.88 1.5 7.29 2.8 1.5 4.9 4.99 2 6.96 3.1 2 4.92 4.94 2.5 6.67 3.34 2.5 4.93 5 3 6.39 3.57 3 4.92 4.99 3.5 6.24 3.75 3.5 4.94 4.74 4 6 3.94 4 5.34 4.71 4.5 5.88 4.09 4.5 4.99 4.97 5 5.27 4.98 5 4.83 5.01 5.5 4.93 5.04 5.5 4.89 4.99 6 4.9 4.99 6 4.92 5.04 6.5 4.93 5.1 6.5 4.91 4.97 7 4.28 5.8 7 4.97 4.97 7.5 4.62 4.91 7.5 4.99 4.82 8 5.05 4.45 8 5.16 4.76 8.5 5.93 4.09 8.5 4.94 4.98 9 5.73 4.2 9 4.92 5.02 9.5 5.62 4.32 9.5 4.87 5.03 10 6.12 3.2 10 4.91 5.01 10.5 6.91 3.11 10.5 4.87 5.04 11 8.48 0 11 8.49 4.94 11.5 9.87 0 11.5 9.9 0 SYN/ACK ECT test: This test demonstrates the importance of ECT on SYN and SYN-ACK packets by measuring the connection probability in the presence of competing flows for a DCTCP connection attempt without ECT in the SYN packet. The test was repeated five times for each number of competing flows. Competing Flows 1 \| 2 \| 4 \| 8 \| 16 ------------------------------ Mean Connection Probability 1 \| 0.67 \| 0.45 \| 0.28 \| 0 Median Connection Probability 1 \| 0.65 \| 0.45 \| 0.25 \| 0 As the number of competing flows moves beyond 1, the connection probability drops rapidly. Enabling DCTCP with this patch requires the following steps: DCTCP must be running both on the sender and receiver side in your data center, i.e.: sysctl -w net.ipv4.tcp_congestion_control=dctcp Also, ECN functionality must be enabled on all switches in your data center for DCTCP to work. The default ECN marking threshold (K) heuristic on the switch for DCTCP is e.g., 20 packets (30KB) at 1Gbps, and 65 packets (~100KB) at 10Gbps (K > 1/7 * C * RTT, [4]). In above tests, for each switch port, traffic was segregated into two queues. For any packet with a DSCP of 0x01 - or equivalently a TOS of 0x04 - the packet was placed into the DCTCP queue. All other packets were placed into the default drop-tail queue. For the DCTCP queue, RED/ECN marking was enabled, here, with a marking threshold of 75 KB. More details however, we refer you to the paper [2] under section 3). There are no code changes required to applications running in user space. DCTCP has been implemented in full isolation of the rest of the TCP code as its own congestion control module, so that it can run without a need to expose code to the core of the TCP stack, and thus nothing changes for non-DCTCP users. Changes in the CA framework code are minimal, and DCTCP algorithm operates on mechanisms that are already available in most Silicon. The gain (dctcp_shift_g) is currently a fixed constant (1/16) from the paper, but we leave the option that it can be chosen carefully to a different value by the user. In case DCTCP is being used and ECN support on peer site is off, DCTCP falls back after 3WHS to operate in normal TCP Reno mode. ss {-4,-6} -t -i diag interface: ... dctcp wscale:7,7 rto:203 rtt:2.349/0.026 mss:1448 cwnd:2054 ssthresh:1102 ce_state 0 alpha 15 ab_ecn 0 ab_tot 735584 send 10129.2Mbps pacing_rate 20254.1Mbps unacked:1822 retrans:0/15 reordering:101 rcv_space:29200 ... dctcp-reno wscale:7,7 rto:201 rtt:0.711/1.327 ato:40 mss:1448 cwnd:10 ssthresh:1102 fallback_mode send 162.9Mbps pacing_rate 325.5Mbps rcv_rtt:1.5 rcv_space:29200 More information about DCTCP can be found in [1-4]. [1] http://simula.stanford.edu/~alizade/Site/DCTCP.html [2] http://simula.stanford.edu/~alizade/Site/DCTCP_files/dctcp-final.pdf [3] http://simula.stanford.edu/~alizade/Site/DCTCP_files/dctcp_analysis-full.pdf [4] http://tools.ietf.org/html/draft-bensley-tcpm-dctcp-00 Joint work with Florian Westphal and Glenn Judd. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Glenn Judd <glenn.judd@morganstanley.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-29 00:13:10 -04:00
Florian Westphal	9890092e46	net: tcp: more detailed ACK events and events for CE marked packets DataCenter TCP (DCTCP) determines cwnd growth based on ECN information and ACK properties, e.g. ACK that updates window is treated differently than DUPACK. Also DCTCP needs information whether ACK was delayed ACK. Furthermore, DCTCP also implements a CE state machine that keeps track of CE markings of incoming packets. Therefore, extend the congestion control framework to provide these event types, so that DCTCP can be properly implemented as a normal congestion algorithm module outside of the core stack. Joint work with Daniel Borkmann and Glenn Judd. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Glenn Judd <glenn.judd@morganstanley.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-29 00:13:10 -04:00
Florian Westphal	7354c8c389	net: tcp: split ack slow/fast events from cwnd_event The congestion control ops "cwnd_event" currently supports CA_EVENT_FAST_ACK and CA_EVENT_SLOW_ACK events (among others). Both FAST and SLOW_ACK are only used by Westwood congestion control algorithm. This removes both flags from cwnd_event and adds a new in_ack_event callback for this. The goal is to be able to provide more detailed information about ACKs, such as whether ECE flag was set, or whether the ACK resulted in a window update. It is required for DataCenter TCP (DCTCP) congestion control algorithm as it makes a different choice depending on ECE being set or not. Joint work with Daniel Borkmann and Glenn Judd. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Glenn Judd <glenn.judd@morganstanley.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-29 00:13:10 -04:00
Daniel Borkmann	30e502a34b	net: tcp: add flag for ca to indicate that ECN is required This patch adds a flag to TCP congestion algorithms that allows for requesting to mark IPv4/IPv6 sockets with transport as ECN capable, that is, ECT(0), when required by a congestion algorithm. It is currently used and needed in DataCenter TCP (DCTCP), as it requires both peers to assert ECT on all IP packets sent - it uses ECN feedback (i.e. CE, Congestion Encountered information) from switches inside the data center to derive feedback to the end hosts. Therefore, simply add a new flag to icsk_ca_ops. Note that DCTCP's algorithm/behaviour slightly diverges from RFC3168, therefore this is only (!) enabled iff the assigned congestion control ops module has requested this. By that, we can tightly couple this logic really only to the provided congestion control ops. Joint work with Florian Westphal and Glenn Judd. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Glenn Judd <glenn.judd@morganstanley.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-29 00:13:10 -04:00
Florian Westphal	55d8694fa8	net: tcp: assign tcp cong_ops when tcp sk is created Split assignment and initialization from one into two functions. This is required by followup patches that add Datacenter TCP (DCTCP) congestion control algorithm - we need to be able to determine if the connection is moderated by DCTCP before the 3WHS has finished. As we walk the available congestion control list during the assignment, we are always guaranteed to have Reno present as it's fixed compiled-in. Therefore, since we're doing the early assignment, we don't have a real use for the Reno alias tcp_init_congestion_ops anymore and can thus remove it. Actual usage of the congestion control operations are being made after the 3WHS has finished, in some cases however we can access get_info() via diag if implemented, therefore we need to zero out the private area for those modules. Joint work with Daniel Borkmann and Glenn Judd. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Glenn Judd <glenn.judd@morganstanley.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-29 00:13:10 -04:00
John Fastabend	53dfd50181	net: sched: cls_rcvp, complete rcu conversion This completes the cls_rsvp conversion to RCU safe copy, update semantics. As a result all cases of tcf_exts_change occur on empty lists now. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-29 00:04:55 -04:00
WANG Cong	68f6a7c6c9	net_sched: fix another regression in cls_tcindex Clearly the following change is not expected: - if (!cp.perfect && !cp.h) - cp.alloc_hash = cp.hash; + if (!cp->perfect && cp->h) + cp->alloc_hash = cp->hash; Fixes: commit `331b72922c` ("net: sched: RCU cls_tcindex") Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-28 17:34:35 -04:00
WANG Cong	02c5e84413	net_sched: fix errno in tcindex_set_parms() When kmemdup() fails, we should return -ENOMEM. Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-28 17:34:22 -04:00
Rick Jones	825bae5d97	arp: Do not perturb drop profiles with ignored ARP packets We do not wish to disturb dropwatch or perf drop profiles with an ARP we will ignore. Signed-off-by: Rick Jones <rick.jones2@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-28 17:30:35 -04:00
WANG Cong	18d0264f63	net_sched: remove the first parameter from tcf_exts_destroy() Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jamal Hadi Salim <hadi@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-28 17:29:01 -04:00
David S. Miller	f5c7e1a47a	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2014-09-25 1) Remove useless hash_resize_mutex in xfrm_hash_resize(). This mutex is used only there, but xfrm_hash_resize() can't be called concurrently at all. From Ying Xue. 2) Extend policy hashing to prefixed policies based on prefix lenght thresholds. From Christophe Gouault. 3) Make the policy hash table thresholds configurable via netlink. From Christophe Gouault. 4) Remove the maximum authentication length for AH. This was needed to limit stack usage. We switched already to allocate space, so no need to keep the limit. From Herbert Xu. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-28 17:19:15 -04:00
WANG Cong	2c1a4311b6	neigh: check error pointer instead of NULL for ipv4_neigh_lookup() Fixes: commit `f187bc6efb` ("ipv4: No need to set generic neighbour pointer") Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-28 17:16:04 -04:00
Florian Fainelli	7905288f09	net: dsa: allow switches driver to implement get/set EEE Allow switches driver to query and enable/disable EEE on a per-port basis by implementing the ethtool_{get,set}_eee settings and delegating these operations to the switch driver. set_eee() will need to coordinate with the PHY driver to make sure that EEE is enabled, the link-partner supports it and the auto-negotiation result is satisfactory. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-28 17:14:09 -04:00
Florian Fainelli	b2f2af21e3	net: dsa: allow enabling and disable switch ports Whenever a per-port network device is used/unused, invoke the switch driver port_enable/port_disable callbacks to allow saving as much power as possible by disabling unused parts of the switch (RX/TX logic, memory arrays, PHYs...). We supply a PHY device argument to make sure the switch driver can act on the PHY device if needed (like putting/taking the PHY out of deep low power mode). Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-28 17:14:08 -04:00
Florian Fainelli	f7f1de51ed	net: dsa: start and stop the PHY state machine dsa_slave_open() should start the PHY library state machine for its PHY interface, and dsa_slave_close() should stop the PHY library state machine accordingly. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-28 17:14:08 -04:00
Peter Pan(潘卫平)	155c6e1ad4	tcp: use tcp_flags in tcp_data_queue() This patch is a cleanup which follows the idea in commit `e11ecddf51` (tcp: use TCP_SKB_CB(skb)->tcp_flags in input path), and it may reduce register pressure since skb->cb[] access is fast, bacause skb is probably in a register. v2: remove variable th v3: reword the changelog Signed-off-by: Weiping Pan <panweiping3@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-28 16:37:57 -04:00
Eric Dumazet	cd7d8498c9	tcp: change tcp_skb_pcount() location Our goal is to access no more than one cache line access per skb in a write or receive queue when doing the various walks. After recent TCP_SKB_CB() reorganizations, it is almost done. Last part is tcp_skb_pcount() which currently uses skb_shinfo(skb)->gso_segs, which is a terrible choice, because it needs 3 cache lines in current kernel (skb->head, skb->end, and shinfo->gso_segs are all in 3 different cache lines, far from skb->cb) This very simple patch reuses space currently taken by tcp_tw_isn only in input path, as tcp_skb_pcount is only needed for skb stored in write queue. This considerably speeds up tcp_ack(), granted we avoid shinfo->tx_flags to get SKBTX_ACK_TSTAMP, which seems possible. This also speeds up all sack processing in general. This speeds up tcp_sendmsg() because it no longer has to access/dirty shinfo. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-28 16:36:48 -04:00
Eric Dumazet	971f10eca1	tcp: better TCP_SKB_CB layout to reduce cache line misses TCP maintains lists of skb in write queue, and in receive queues (in order and out of order queues) Scanning these lists both in input and output path usually requires access to skb->next, TCP_SKB_CB(skb)->seq, and TCP_SKB_CB(skb)->end_seq These fields are currently in two different cache lines, meaning we waste lot of memory bandwidth when these queues are big and flows have either packet drops or packet reorders. We can move TCP_SKB_CB(skb)->header at the end of TCP_SKB_CB, because this header is not used in fast path. This allows TCP to search much faster in the skb lists. Even with regular flows, we save one cache line miss in fast path. Thanks to Christoph Paasch for noticing we need to cleanup skb->cb[] (IPCB/IP6CB) before entering IP stack in tx path, and that I forgot IPCB use in tcp_v4_hnd_req() and tcp_v4_save_options(). Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-28 16:35:43 -04:00
Eric Dumazet	a224772db8	ipv6: add a struct inet6_skb_parm param to ipv6_opt_accepted() ipv6_opt_accepted() assumes IP6CB(skb) holds the struct inet6_skb_parm that it needs. Lets not assume this, as TCP stack might use a different place. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-28 16:35:43 -04:00
Eric Dumazet	24a2d43d88	ipv4: rename ip_options_echo to __ip_options_echo() ip_options_echo() assumes struct ip_options is provided in &IPCB(skb)->opt Lets break this assumption, but provide a helper to not change all call points. ip_send_unicast_reply() gets a new struct ip_options pointer. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-28 16:35:42 -04:00
Steffen Klassert	cd0a0bd9b8	ip6_gre: Return an error when adding an existing tunnel. ip6gre_tunnel_locate() should not return an existing tunnel if create is true. Otherwise it is possible to add the same tunnel multiple times without getting an error. So return NULL if the tunnel that should be created already exists. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-28 16:19:46 -04:00
Steffen Klassert	d814b847be	ip6_vti: Return an error when adding an existing tunnel. vti6_locate() should not return an existing tunnel if create is true. Otherwise it is possible to add the same tunnel multiple times without getting an error. So return NULL if the tunnel that should be created already exists. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-28 16:19:46 -04:00
Steffen Klassert	2b0bb01b6e	ip6_tunnel: Return an error when adding an existing tunnel. ip6_tnl_locate() should not return an existing tunnel if create is true. Otherwise it is possible to add the same tunnel multiple times without getting an error. So return NULL if the tunnel that should be created already exists. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-28 16:19:46 -04:00
Dan Williams	3f33407856	net: make tcp_cleanup_rbuf private net_dma was the only external user so this can become local to tcp.c again. Cc: James Morris <jmorris@namei.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2014-09-28 07:22:21 -07:00
Dan Williams	d27f9bc104	net_dma: revert 'copied_early' Now that tcp_dma_try_early_copy() is gone nothing ever sets copied_early. Also reverts "53240c208776 tcp: Fix possible double-ack w/ user dma" since it is no longer necessary. Cc: Ali Saidi <saidi@engin.umich.edu> Cc: James Morris <jmorris@namei.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Neal Cardwell <ncardwell@google.com> Reported-by: Dave Jones <davej@redhat.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2014-09-28 07:22:21 -07:00
Dan Williams	7bced39751	net_dma: simple removal Per commit "77873803363c net_dma: mark broken" net_dma is no longer used and there is no plan to fix it. This is the mechanical removal of bits in CONFIG_NET_DMA ifdef guards. Reverting the remainder of the net_dma induced changes is deferred to subsequent patches. Marked for stable due to Roman's report of a memory leak in dma_pin_iovec_pages(): https://lkml.org/lkml/2014/9/3/177 Cc: Dave Jiang <dave.jiang@intel.com> Cc: Vinod Koul <vinod.koul@intel.com> Cc: David Whipple <whipple@securedatainnovations.ch> Cc: Alexander Duyck <alexander.h.duyck@intel.com> Cc: <stable@vger.kernel.org> Reported-by: Roman Gushchin <klamm@yandex-team.ru> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2014-09-28 07:05:16 -07:00
Nicolas Dichtel	5a4ee9a9a0	ip6gre: add a rtnl link alias for ip6gretap With this alias, we don't need to load manually the module before adding an ip6gretap interface with iproute2. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-26 17:15:57 -04:00
Eric Dumazet	ff04a771ad	net : optimize skb_release_data() Cache skb_shinfo(skb) in a variable to avoid computing it multiple times. Reorganize the tests to remove one indentation level. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-26 16:53:49 -04:00
Wang Sheng-Hui	8280bf00fd	net/openvswitch: remove dup comment in vport.h Remove the duplicated comment "/* The following definitions are for users of the vport subsytem: */" in vport.h Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-26 16:42:33 -04:00
David S. Miller	e7af85db54	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== nf pull request for net This series contains netfilter fixes for net, they are: 1) Fix lockdep splat in nft_hash when releasing sets from the rcu_callback context. We don't the mutex there anymore. 2) Remove unnecessary spinlock_bh in the destroy path of the nf_tables rbtree set type from rcu_callback context. 3) Fix another lockdep splat in rhashtable. None of the callers hold a mutex when calling rhashtable_destroy. 4) Fix duplicated error reporting from nfnetlink when aborting and replaying a batch. 5) Fix a Kconfig issue reported by kbuild robot. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-26 16:21:29 -04:00
LEROY Christophe	58e3cac561	net: optimise inet_proto_csum_replace4() csum_partial() is a generic function which is not optimised for small fixed length calculations, and its use requires to store "from" and "to" values in memory while we already have them available in registers. This also has impact, especially on RISC processors. In the same spirit as the change done by Eric Dumazet on csum_replace2(), this patch rewrites inet_proto_csum_replace4() taking into account RFC1624. I spotted during a NATted tcp transfert that csum_partial() is one of top 5 consuming functions (around 8%), and the second user of csum_partial() is inet_proto_csum_replace4(). Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-26 16:14:17 -04:00
Eric Dumazet	f4a775d144	net: introduce __skb_header_release() While profiling TCP stack, I noticed one useless atomic operation in tcp_sendmsg(), caused by skb_header_release(). It turns out all current skb_header_release() users have a fresh skb, that no other user can see, so we can avoid one atomic operation. Introduce __skb_header_release() to clearly document this. This gave me a 1.5 % improvement on TCP_RR workload. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-26 15:40:06 -04:00
David S. Miller	57219dc7bf	Merge tag 'master-2014-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== pull request: wireless-next 2014-09-22 Please pull this batch of updates intended for the 3.18 stream... For the mac80211 bits, Johannes says: "This time, I have some rate minstrel improvements, support for a very small feature from CCX that Steinar reverse-engineered, dynamic ACK timeout support, a number of changes for TDLS, early support for radio resource measurement and many fixes. Also, I'm changing a number of places to clear key memory when it's freed and Intel claims copyright for code they developed." For the bluetooth bits, Johan says: "Here are some more patches intended for 3.18. Most of them are cleanups or fixes for SMP. The only exception is a fix for BR/EDR L2CAP fixed channels which should now work better together with the L2CAP information request procedure." For the iwlwifi bits, Emmanuel says: "I fix here dvm which was broken by my last pull request. Arik continues to work on TDLS and Luca solved a few issues in CT-Kill. Eyal keeps digging into rate scaling code, more to come soon. Besides this, nothing really special here." Beyond that, there are the usual big batches of updates to ath9k, b43, mwifiex, and wil6210 as well as a handful of other bits here and there. Also, rtlwifi gets some btcoexist attention from Larry. Please let me know if there are problems! ==================== Had to adjust the wil6210 code to comply with Joe Perches's recent change in net-next to make the netdev_*() routines return void instead of 'int'. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-26 15:39:24 -04:00
Joe Perches	6ea754eb76	net: Change netdev_<level> logging functions to return void No caller or macro uses the return value so make all the functions return void. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-26 15:17:17 -04:00
John W. Linville	30d3c071a6	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next	2014-09-26 13:38:51 -04:00
John W. Linville	330bd4ec9d	NFC: 3.18 pull request This is the NFC pull request for 3.18. We've had major updates for TI and ST Microelectronics drivers: For TI's trf7970a driver: - Target mode support for trf7970a - Suspend/resume support for trf7970a - DT properties additions to handle different quirks - A bunch of fixes for smartphone IOP related issues For ST Microelectronics' ST21NFCA and ST21NFCB drivers: - ISO15693 support for st21nfcb - checkpatch and sparse related warning fixes - Code cleanups and a few minor fixes Finally, Marvell add ISO15693 support to the NCI stack, together with a couple of NCI fixes. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJUIg4oAAoJEIqAPN1PVmxKg+kP/RPrgH6LA1tFubIwKR2+sGQ7 g/W2J3AE8QASZkErkpXRNCt2D/SPWEKBY/qscwz+BtcWg76taIIaGTvVUNtxSaW3 gS4hG6V1UlANWv3KFfaOKmzjEOO/SPNtkFAyI0cTOaGyUqG4o9BgBZpn1rYO16MD ZkSC39MpjMoXB9BbsfQngoUEoWc3tZNMmRzk4IVTwE/wXuQvZxmFQXEAiZ+pnYle NQfugaGMz0526rLG3QnrpkUakFb81iQwtONpbx6i8KW/Klkc6TN/ek6J9ecU8t5z tdHOViZWRmA1VwMGBHwpq8F2o/ATH6GeivTgqrQjcjGNhCUUT1Ulzve2UxGEMWi6 ncjKY/GxUrYaMMtRvLv+/knrfbWtd+EnWOav07jgNrrA0tvgBNQvEKKHPoWykDVN QKpxu3YoNxrsR/LJMS+Zjj0IIM1Y+9DTOkLXzxJ5Hvht8rOl5heYGh2DICOpWsbQ ejrQicJOJvN5vqu+Sgcqq4msyTEdbs2LfRDrW1VC9A6ILI+KzYg2laTFGMnhZ5qn TgsYIDdONS2iGUulFHylGHI7ANtUg/mhklLUccY1HQYyiAM1NQUtzq1tAz6yLoIH l8iIiyzJSBWW57nWhyrULEbzHgPE+bHIjO4T+UUOxMgquYa4V11S1uP0OfWfZogR xS24GlobS2oXHMQqh0fA =d/C+ -----END PGP SIGNATURE----- Merge tag 'nfc-next-3.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next Samuel Ortiz <sameo@linux.intel.com> says: "NFC: 3.18 pull request This is the NFC pull request for 3.18. We've had major updates for TI and ST Microelectronics drivers: For TI's trf7970a driver: - Target mode support for trf7970a - Suspend/resume support for trf7970a - DT properties additions to handle different quirks - A bunch of fixes for smartphone IOP related issues For ST Microelectronics' ST21NFCA and ST21NFCB drivers: - ISO15693 support for st21nfcb - checkpatch and sparse related warning fixes - Code cleanups and a few minor fixes Finally, Marvell add ISO15693 support to the NCI stack, together with a couple of NCI fixes." Signed-off-by: John W. Linville <linville@tuxdriver.com>	2014-09-26 13:37:02 -04:00
Pablo Neira Ayuso	34666d467c	netfilter: bridge: move br_netfilter out of the core Jesper reported that br_netfilter always registers the hooks since this is part of the bridge core. This harms performance for people that don't need this. This patch modularizes br_netfilter so it can be rmmod'ed, thus, the hooks can be unregistered. I think the bridge netfilter should have been a separated module since the beginning, Patrick agreed on that. Note that this is breaking compatibility for users that expect that bridge netfilter is going to be available after explicitly 'modprobe bridge' or via automatic load through brctl. However, the damage can be easily undone by modprobing br_netfilter. The bridge core also spots a message to provide a clue to people that didn't notice that this has been deprecated. On top of that, the plan is that nftables will not rely on this software layer, but integrate the connection tracking into the bridge layer to enable stateful filtering and NAT, which is was bridge netfilter users seem to require. This patch still keeps the fake_dst_ops in the bridge core, since this is required by when the bridge port is initialized. So we can safely modprobe/rmmod br_netfilter anytime. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Florian Westphal <fw@strlen.de>	2014-09-26 18:42:31 +02:00
Pablo Neira Ayuso	7276ca3fa2	netfilter: bridge: nf_bridge_copy_header as static inline in header Move nf_bridge_copy_header() as static inline in netfilter_bridge.h header file. This patch prepares the modularization of the br_netfilter code. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-09-26 18:42:30 +02:00
Rob Jones	772476df70	net/netfilter/x_tables.c: use __seq_open_private() Reduce boilerplate code by using __seq_open_private() instead of seq_open() in xt_match_open() and xt_target_open(). Signed-off-by: Rob Jones <rob.jones@codethink.co.uk> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-09-26 18:42:29 +02:00
Steffen Klassert	d61746b2e7	ip_tunnel: Don't allow to add the same tunnel multiple times. When we try to add an already existing tunnel, we don't return an error. Instead we continue and call ip_tunnel_update(). This means that we can change existing tunnels by adding the same tunnel multiple times. It is even possible to change the tunnel endpoints of the fallback device. We fix this by returning an error if we try to add an existing tunnel. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-26 00:41:30 -04:00
Eric Dumazet	4a8e320c92	net: sched: use pinned timers While using a MQ + NETEM setup, I had confirmation that the default timer migration ( /proc/sys/kernel/timer_migration ) is killing us. Installing this on a receiver side of a TCP_STREAM test, (NIC has 8 TX queues) : EST="est 1sec 4sec" for ETH in eth1 do tc qd del dev $ETH root 2>/dev/null tc qd add dev $ETH root handle 1: mq tc qd add dev $ETH parent 1:1 $EST netem limit 70000 delay 6ms tc qd add dev $ETH parent 1:2 $EST netem limit 70000 delay 8ms tc qd add dev $ETH parent 1:3 $EST netem limit 70000 delay 10ms tc qd add dev $ETH parent 1:4 $EST netem limit 70000 delay 12ms tc qd add dev $ETH parent 1:5 $EST netem limit 70000 delay 14ms tc qd add dev $ETH parent 1:6 $EST netem limit 70000 delay 16ms tc qd add dev $ETH parent 1:7 $EST netem limit 80000 delay 18ms tc qd add dev $ETH parent 1:8 $EST netem limit 90000 delay 20ms done We can see that timers get migrated into a single cpu, presumably idle at the time timers are set up. Then all qdisc dequeues run from this cpu and huge lock contention happens. This single cpu is stuck in softirq mode and cannot dequeue fast enough. 39.24% [kernel] [k] _raw_spin_lock 2.65% [kernel] [k] netem_enqueue 1.80% [kernel] [k] netem_dequeue 1.63% [kernel] [k] copy_user_enhanced_fast_string 1.45% [kernel] [k] _raw_spin_lock_bh By pinning qdisc timers on the cpu running the qdisc, we respect proper XPS setting and remove this lock contention. 5.84% [kernel] [k] netem_enqueue 4.83% [kernel] [k] _raw_spin_lock 2.92% [kernel] [k] copy_user_enhanced_fast_string Current Qdiscs that benefit from this change are : netem, cbq, fq, hfsc, tbf, htb. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-26 00:26:48 -04:00
Tom Herbert	53e5039896	net: Remove gso_send_check as an offload callback The send_check logic was only interesting in cases of TCP offload and UDP UFO where the checksum needed to be initialized to the pseudo header checksum. Now we've moved that logic into the related gso_segment functions so gso_send_check is no longer needed. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-26 00:22:47 -04:00
Tom Herbert	f71470b37e	udp: move logic out of udp[46]_ufo_send_check In udp[46]_ufo_send_check the UDP checksum initialized to the pseudo header checksum. We can move this logic into udp[46]_ufo_fragment. After this change udp[64]_ufo_send_check is a no-op. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-26 00:22:46 -04:00
Tom Herbert	d020f8f733	tcp: move logic out of tcp_v[64]_gso_send_check In tcp_v[46]_gso_send_check the TCP checksum is initialized to the pseudo header checksum using __tcp_v[46]_send_check. We can move this logic into new tcp[46]_gso_segment functions to be done when ip_summed != CHECKSUM_PARTIAL (ip_summed == CHECKSUM_PARTIAL should be the common case, possibly always true when taking GSO path). After this change tcp_v[46]_gso_send_check is no-op. Signed-off-by: Tom Herbert <therbert@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-26 00:22:46 -04:00
Trond Myklebust	2aca5b869a	SUNRPC: Add missing support for RPC_CLNT_CREATE_NO_RETRANS_TIMEOUT The flag RPC_CLNT_CREATE_NO_RETRANS_TIMEOUT was intended introduced in order to allow NFSv4 clients to disable resend timeouts. Since those cause the RPC layer to break the connection, they mess up the duplicate reply caches that remain indexed on the port number in NFSv4.. This patch includes the code that was missing in the original to set the appropriate flag in struct rpc_clnt, when the caller of rpc_create() sets RPC_CLNT_CREATE_NO_RETRANS_TIMEOUT. Fixes: `8a19a0b6cb` (SUNRPC: Add RPC task and client level options to...) Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-09-25 21:25:17 -04:00
NeilBrown	1aff525629	NFS/SUNRPC: Remove other deadlock-avoidance mechanisms in nfs_release_page() Now that nfs_release_page() doesn't block indefinitely, other deadlock avoidance mechanisms aren't needed. - it doesn't hurt for kswapd to block occasionally. If it doesn't want to block it would clear __GFP_WAIT. The current_is_kswapd() was only added to avoid deadlocks and we have a new approach for that. - memory allocation in the SUNRPC layer can very rarely try to ->releasepage() a page it is trying to handle. The deadlock is removed as nfs_release_page() doesn't block indefinitely. So we don't need to set PF_FSTRANS for sunrpc network operations any more. Signed-off-by: NeilBrown <neilb@suse.de> Acked-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-09-25 08:25:47 -04:00
Johan Hedberg	565766b087	Bluetooth: Rename sco_param_wideband table to esco_param_msbc The sco_param_wideband table represents the eSCO parameters for specifically mSBC encoding. This patch renames the table to the more descriptive esco_param_msbc name. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-25 10:35:08 +02:00
Jason Baron	3dedbb5ca1	rpc: Add -EPERM processing for xs_udp_send_request() If an iptables drop rule is added for an nfs server, the client can end up in a softlockup. Because of the way that xs_sendpages() is structured, the -EPERM is ignored since the prior bits of the packet may have been successfully queued and thus xs_sendpages() returns a non-zero value. Then, xs_udp_send_request() thinks that because some bits were queued it should return -EAGAIN. We then try the request again and again, resulting in cpu spinning. Reproducer: 1) open a file on the nfs server '/nfs/foo' (mounted using udp) 2) iptables -A OUTPUT -d <nfs server ip> -j DROP 3) write to /nfs/foo 4) close /nfs/foo 5) iptables -D OUTPUT -d <nfs server ip> -j DROP The softlockup occurs in step 4 above. The previous patch, allows xs_sendpages() to return both a sent count and any error values that may have occurred. Thus, if we get an -EPERM, return that to the higher level code. With this patch in place we can successfully abort the above sequence and avoid the softlockup. I also tried the above test case on an nfs mount on tcp and although the system does not softlockup, I still ended up with the 'hung_task' firing after 120 seconds, due to the i/o being stuck. The tcp case appears a bit harder to fix, since -EPERM appears to get ignored much lower down in the stack and does not propogate up to xs_sendpages(). This case is not quite as insidious as the softlockup and it is not addressed here. Reported-by: Yigong Lou <ylou@akamai.com> Signed-off-by: Jason Baron <jbaron@akamai.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-09-24 23:13:46 -04:00
Jason Baron	f279cd008f	rpc: return sent and err from xs_sendpages() If an error is returned after the first bits of a packet have already been successfully queued, xs_sendpages() will return a positive 'int' value indicating success. Callers seem to treat this as -EAGAIN. However, there are cases where its not a question of waiting for the write queue to drain. For example, when there is an iptables rule dropping packets to the destination, the lower level code can return -EPERM only after parts of the packet have been successfully queued. In this case, we can end up continuously retrying resulting in a kernel softlockup. This patch is intended to make no changes in behavior but is in preparation for subsequent patches that can make decisions based on both on the number of bytes sent by xs_sendpages() and any errors that may have be returned. Signed-off-by: Jason Baron <jbaron@akamai.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-09-24 23:13:37 -04:00
Benjamin Coddington	a743419f42	SUNRPC: Don't wake tasks during connection abort When aborting a connection to preserve source ports, don't wake the task in xs_error_report. This allows tasks with RPC_TASK_SOFTCONN to succeed if the connection needs to be re-established since it preserves the task's status instead of setting it to the status of the aborting kernel_connect(). This may also avoid a potential conflict on the socket's lock. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Cc: stable@vger.kernel.org # 3.14+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-09-24 23:06:56 -04:00
David S. Miller	4daaab4f0c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2014-09-24 16:48:32 -04:00
Johan Hedberg	c7da579763	Bluetooth: Add retransmission effort into SCO parameter table It is expected that new parameter combinations will have the retransmission effort value different between some entries (mainly because of the new S4 configuration added by HFP 1.7), so it makes sense to move it into the table instead of having it hard coded based on the selected SCO_AIRMODE_*. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-24 22:15:29 +02:00
David S. Miller	543a2dff5e	Merge tag 'master-2014-09-23' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== pull request: wireless 2014-09-23 Please consider pulling this one last batch of fixes intended for the 3.17 stream! For the NFC bits, Samuel says: "Hopefully not too late for a handful of NFC fixes: - 2 potential build failures for ST21NFCA and ST21NFCB, triggered by a depmod dependenyc cycle. - One potential buffer overflow in the microread driver." On top of that... Emil Goode provides a fix for a brcmfmac off-by-one regression which was introduced in the 3.17 cycle. Loic Poulain fixes a polarity mismatch for a variable assignment inside of rfkill-gpio. Wojciech Dubowik prevents a NULL pointer dereference in ath9k. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-24 15:00:12 -04:00
Tejun Heo	d06efebf0c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block into for-3.18 This is to receive `0a30288da1` ("blk-mq, percpu_ref: implement a kludge for SCSI blk-mq stall during probe") which implements __percpu_ref_kill_expedited() to work around SCSI blk-mq stall. The commit reverted and patches to implement proper fix will be added. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Kent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Christoph Hellwig <hch@lst.de>	2014-09-24 13:00:21 -04:00
Simon Vincent	f19f4f9525	ieee802154: 6lowpan: ensure header compression does not corrupt ipv6 header The 6lowpan ipv6 header compression was causing problems for other interfaces that expected a ipv6 header to still be in place, as we were replacing the ipv6 header with a compressed version. This happened if you sent a packet to a multicast address as the packet would be output on 802.15.4, ethernet, and also be sent to the loopback interface. The skb data was shared between these interfaces so all interfaces ended up with a compressed ipv6 header. The solution is to ensure that before we do any header compression we are not sharing the skb or skb data with any other interface. If we are then we must take a copy of the skb and skb data before modifying the ipv6 header. The only place we can copy the skb is inside the xmit function so we don't leave dangling references to skb. This patch moves all the header compression to inside the xmit function. Very little code has been changed it has mostly been moved from lowpan_header_create to lowpan_xmit. At the top of the xmit function we now check if the skb is shared and if so copy it. In lowpan_header_create all we do now is store the source and destination addresses for use later when we compress the header. Signed-off-by: Simon Vincent <simon.vincent@xsilon.com> Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-24 14:15:08 +02:00
Johan Hedberg	d41c15cf95	Bluetooth: Fix reason code used for rejecting SCO connections The core specification defines valid values for the HCI_Reject_Synchronous_Connection_Request command to be 0x0D-0x0F. So far the code has been using HCI_ERROR_REMOTE_USER_TERM (0x13) which is not a valid value and is therefore being rejected by some controllers: > HCI Event: Connect Request (0x04) plen 10 bdaddr 40:6F:2A:6A:E5:E0 class 0x000000 type eSCO < HCI Command: Reject Synchronous Connection (0x01\|0x002a) plen 7 bdaddr 40:6F:2A:6A:E5:E0 reason 0x13 Reason: Remote User Terminated Connection > HCI Event: Command Status (0x0f) plen 4 Reject Synchronous Connection (0x01\|0x002a) status 0x12 ncmd 1 Error: Invalid HCI Command Parameters This patch introduces a new define for a value from the valid range (0x0d == Connection Rejected Due To Limited Resources) and uses it instead for rejecting incoming connections. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-24 14:03:32 +02:00
Joe Perches	2b0bf6c85a	Bluetooth: Convert bt_<level> logging functions to return void No caller or macro uses the return value so make all the functions return void. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-24 09:40:08 +02:00
Christophe Ricard	9e87f9a9c4	NFC: nci: Add support for proprietary RF Protocols In NFC Forum NCI specification, some RF Protocol values are reserved for proprietary use (from 0x80 to 0xfe). Some CLF vendor may need to use one value within this range for specific technology. Furthermore, some CLF may not becompliant with NFC Froum NCI specification 2.0 and therefore will not support RF Protocol value 0x06 for PROTOCOL_T5T as mention in a draft specification and in a recent push. Adding get_rf_protocol handle to the nci_ops structure will help to set the correct technology to target. Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-09-24 02:02:24 +02:00
Eric Dumazet	bd1e75abf4	tcp: add coalescing attempt in tcp_ofo_queue() In order to make TCP more resilient in presence of reorders, we need to allow coalescing to happen when skbs from out of order queue are transferred into receive queue. LRO/GRO can be completely canceled in some pathological cases, like per packet load balancing on aggregated links. I had to move tcp_try_coalesce() up in the file above tcp_ofo_queue() Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-23 12:47:38 -04:00
Eric Dumazet	4cdf507d54	icmp: add a global rate limitation Current ICMP rate limiting uses inetpeer cache, which is an RBL tree protected by a lock, meaning that hosts can be stuck hard if all cpus want to check ICMP limits. When say a DNS or NTP server process is restarted, inetpeer tree grows quick and machine comes to its knees. iptables can not help because the bottleneck happens before ICMP messages are even cooked and sent. This patch adds a new global limitation, using a token bucket filter, controlled by two new sysctl : icmp_msgs_per_sec - INTEGER Limit maximal number of ICMP packets sent per second from this host. Only messages whose type matches icmp_ratemask are controlled by this limit. Default: 1000 icmp_msgs_burst - INTEGER icmp_msgs_per_sec controls number of ICMP packets sent per second, while icmp_msgs_burst controls the burst size of these packets. Default: 50 Note that if we really want to send millions of ICMP messages per second, we might extend idea and infra added in commit `04ca6973f7` ("ip: make IP identifiers less predictable") : add a token bucket in the ip_idents hash and no longer rely on inetpeer. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-23 12:47:38 -04:00
David S. Miller	1f6d80358d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: arch/mips/net/bpf_jit.c drivers/net/can/flexcan.c Both the flexcan and MIPS bpf_jit conflicts were cases of simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-23 12:09:27 -04:00
Bernhard Thaler	48e68ff5e5	Bluetooth: Check for SCO type before setting retransmission effort SCO connection cannot be setup to devices that do not support retransmission. Patch based on http://permalink.gmane.org/gmane.linux.bluez.kernel/7779 and adapted for this kernel version. Code changed to check SCO/eSCO type before setting retransmission effort and max. latency. The purpose of the patch is to support older devices not capable of eSCO. Tested on Blackberry 655+ headset which does not support retransmission. Credits go to Alexander Sommerhuber. Signed-off-by: Bernhard Thaler <bernhard.thaler@r-it.at> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-23 11:30:04 +02:00
Linus Torvalds	98f75b8291	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) If the user gives us a msg_namelen of 0, don't try to interpret anything pointed to by msg_name. From Ani Sinha. 2) Fix some bnx2i/bnx2fc randconfig compilation errors. The gist of the issue is that we firstly have drivers that span both SCSI and networking. And at the top of that chain of dependencies we have things like SCSI_FC_ATTRS and SCSI_NETLINK which are selected. But since select is a sledgehammer and ignores dependencies, everything to select's SCSI_FC_ATTRS and/or SCSI_NETLINK has to also explicitly select their dependencies and so on and so forth. Generally speaking 'select' is supposed to only be used for child nodes, those which have no dependencies of their own. And this whole chain of dependencies in the scsi layer violates that rather strongly. So just make SCSI_NETLINK depend upon it's dependencies, and so on and so forth for the things selecting it (either directly or indirectly). From Anish Bhatt and Randy Dunlap. 3) Fix generation of blackhole routes in IPSEC, from Steffen Klassert. 4) Actually notice netdev feature changes in rtl_open() code, from Hayes Wang. 5) Fix divide by zero in bond enslaving, from Nikolay Aleksandrov. 6) Missing memory barrier in sunvnet driver, from David Stevens. 7) Don't leave anycast addresses around when ipv6 interface is destroyed, from Sabrina Dubroca. 8) Don't call efx_{arch}_filter_sync_rx_mode before addr_list_lock is initialized in SFC driver, from Edward Cree. 9) Fix missing DMA error checking in 3c59x, from Neal Horman. 10) Openvswitch doesn't emit OVS_FLOW_CMD_NEW notifications accidently, fix from Samuel Gauthier. 11) pch_gbe needs to select NET_PTP_CLASSIFY otherwise we can get a build error. 12) Fix macvlan regression wherein we stopped emitting broadcast/multicast frames over software devices. From Nicolas Dichtel. 13) Fix infiniband bug due to unintended overflow of skb->cb[], from Eric Dumazet. And add an assertion so this doesn't happen again. 14) dm9000_parse_dt() should return error pointers, not NULL. From Tobias Klauser. 15) IP tunneling code uses this_cpu_ptr() in preemptible contexts, fix from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (87 commits) net: bcmgenet: call bcmgenet_dma_teardown in bcmgenet_fini_dma net: bcmgenet: fix TX reclaim accounting for fragments ipv4: do not use this_cpu_ptr() in preemptible context dm9000: Return an ERR_PTR() in all error conditions of dm9000_parse_dt() r8169: fix an if condition r8152: disable ALDPS ipoib: validate struct ipoib_cb size net: sched: shrink struct qdisc_skb_cb to 28 bytes tg3: Work around HW/FW limitations with vlan encapsulated frames macvlan: allow to enqueue broadcast pkt on virtual device pch_gbe: 'select' NET_PTP_CLASSIFY. scsi: Use 'depends' with LIBFC instead of 'select'. openvswitch: restore OVS_FLOW_CMD_NEW notifications genetlink: add function genl_has_listeners() lib: rhashtable: remove second linux/log2.h inclusion net: allow macvlans to move to net namespace 3c59x: Fix bad offset spec in skb_frag_dma_map 3c59x: Add dma error checking and recovery sparc: bpf_jit: fix support for ldx/stx mem and SKF_AD_VLAN_TAG can: at91_can: add missing prepare and unprepare of the clock ...	2014-09-22 18:23:33 -07:00
Eric Dumazet	a35165ca10	ipv4: do not use this_cpu_ptr() in preemptible context this_cpu_ptr() in preemptible context is generally bad Sep 22 05:05:55 br kernel: [ 94.608310] BUG: using smp_processor_id() in preemptible [00000000] code: ip/2261 Sep 22 05:05:55 br kernel: [ 94.608316] caller is tunnel_dst_set.isra.28+0x20/0x60 [ip_tunnel] Sep 22 05:05:55 br kernel: [ 94.608319] CPU: 3 PID: 2261 Comm: ip Not tainted 3.17.0-rc5 #82 We can simply use raw_cpu_ptr(), as preemption is safe in these contexts. Should fix https://bugzilla.kernel.org/show_bug.cgi?id=84991 Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Joe <joe9mail@gmail.com> Fixes: `9a4aa9af44` ("ipv4: Use percpu Cache route in IP tunnels") Acked-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-22 18:31:18 -04:00
Eric Dumazet	a2aeb02a8e	net: sched: fix compile warning in cls_u32 $ grep CONFIG_CLS_U32_MARK .config # CONFIG_CLS_U32_MARK is not set net/sched/cls_u32.c: In function 'u32_change': net/sched/cls_u32.c:852:1: warning: label 'errout' defined but not used [-Wunused-label] Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-22 16:47:19 -04:00
David S. Miller	84de67b298	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2014-09-22 We generate a blackhole or queueing route if a packet matches an IPsec policy but a state can't be resolved. Here we assume that dst_output() is called to kill these packets. Unfortunately this assumption is not true in all cases, so it is possible that these packets leave the system without the necessary transformations. This pull request contains two patches to fix this issue: 1) Fix for blackhole routed packets. 2) Fix for queue routed packets. Both patches are serious stable candidates. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-22 16:41:41 -04:00
Eric Dumazet	fcdd1cf4dd	tcp: avoid possible arithmetic overflows icsk_rto is a 32bit field, and icsk_backoff can reach 15 by default, or more if some sysctl (eg tcp_retries2) are changed. Better use 64bit to perform icsk_rto << icsk_backoff operations As Joe Perches suggested, add a helper for this. Yuchung spotted the tcp_v4_err() case. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-22 16:27:10 -04:00
Daniel Borkmann	35f7aa5309	ipv6: mld: answer mldv2 queries with mldv1 reports in mldv1 fallback RFC2710 (MLDv1), section 3.7. says: The length of a received MLD message is computed by taking the IPv6 Payload Length value and subtracting the length of any IPv6 extension headers present between the IPv6 header and the MLD message. If that length is greater than 24 octets, that indicates that there are other fields present beyond the fields described above, perhaps belonging to a future backwards-compatible version of MLD. An implementation of the version of MLD specified in this document MUST NOT send an MLD message longer than 24 octets and MUST ignore anything past the first 24 octets of a received MLD message. RFC3810 (MLDv2), section 8.2.1. states for listeners regarding presence of MLDv1 routers: In order to be compatible with MLDv1 routers, MLDv2 hosts MUST operate in version 1 compatibility mode. [...] When Host Compatibility Mode is MLDv2, a host acts using the MLDv2 protocol on that interface. When Host Compatibility Mode is MLDv1, a host acts in MLDv1 compatibility mode, using only the MLDv1 protocol, on that interface. [...] While section 8.3.1. specifies router behaviour regarding presence of MLDv1 routers: MLDv2 routers may be placed on a network where there is at least one MLDv1 router. The following requirements apply: If an MLDv1 router is present on the link, the Querier MUST use the lowest version of MLD present on the network. This must be administratively assured. Routers that desire to be compatible with MLDv1 MUST have a configuration option to act in MLDv1 mode; if an MLDv1 router is present on the link, the system administrator must explicitly configure all MLDv2 routers to act in MLDv1 mode. When in MLDv1 mode, the Querier MUST send periodic General Queries truncated at the Multicast Address field (i.e., 24 bytes long), and SHOULD also warn about receiving an MLDv2 Query (such warnings must be rate-limited). The Querier MUST also fill in the Maximum Response Delay in the Maximum Response Code field, i.e., the exponential algorithm described in section 5.1.3. is not used. [...] That means that we should not get queries from different versions of MLD. When there's a MLDv1 router present, MLDv2 enforces truncation and MRC == MRD (both fields are overlapping within the 24 octet range). Section 8.3.2. specifies behaviour in the presence of MLDv1 multicast address listeners: MLDv2 routers may be placed on a network where there are hosts that have not yet been upgraded to MLDv2. In order to be compatible with MLDv1 hosts, MLDv2 routers MUST operate in version 1 compatibility mode. MLDv2 routers keep a compatibility mode per multicast address record. The compatibility mode of a multicast address is determined from the Multicast Address Compatibility Mode variable, which can be in one of the two following states: MLDv1 or MLDv2. The Multicast Address Compatibility Mode of a multicast address record is set to MLDv1 whenever an MLDv1 Multicast Listener Report is received for that multicast address. At the same time, the Older Version Host Present timer for the multicast address is set to Older Version Host Present Timeout seconds. The timer is re-set whenever a new MLDv1 Report is received for that multicast address. If the Older Version Host Present timer expires, the router switches back to Multicast Address Compatibility Mode of MLDv2 for that multicast address. [...] That means, what can happen is the following scenario, that hosts can act in MLDv1 compatibility mode when they previously have received an MLDv1 query (or, simply operate in MLDv1 mode-only); and at the same time, an MLDv2 router could start up and transmits MLDv2 startup query messages while being unaware of the current operational mode. Given RFC2710, section 3.7 we would need to answer to that with an MLDv1 listener report, so that the router according to RFC3810, section 8.3.2. would receive that and internally switch to MLDv1 compatibility as well. Right now, I believe since the initial implementation of MLDv2, Linux hosts would just silently drop such MLDv2 queries instead of replying with an MLDv1 listener report, which would prevent a MLDv2 router going into fallback mode (until it receives other MLDv1 queries). Since the mapping of MRC to MRD in exactly such cases can make use of the exponential algorithm from 5.1.3, we cannot [strictly speaking] be aware in MLDv1 of the encoding in MRC, it seems also not mentioned by the RFC. Since encodings are the same up to 32767, assume in such a situation this value as a hard upper limit we would clamp. We have asked one of the RFC authors on that regard, and he mentioned that there seem not to be any implementations that make use of that exponential algorithm on startup messages. In any case, this patch fixes this MLD interoperability issue. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-22 16:23:15 -04:00
Loic Poulain	fa5c107cc8	net: rfkill: gpio: Fix clock status Clock is disabled when the device is blocked. So, clock_enabled is the logical negation of "blocked". Signed-off-by: Loic Poulain <loic.poulain@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2014-09-22 16:02:15 -04:00
John Fastabend	de5df63228	net: sched: cls_u32 changes to knode must appear atomic to readers Changes to the cls_u32 classifier must appear atomic to the readers. Before this patch if a change is requested for both the exts and ifindex, first the ifindex is updated then the exts with tcf_exts_change(). This opens a small window where a reader can have a exts chain with an incorrect ifindex. This violates the the RCU semantics. Here we resolve this by always passing u32_set_parms() a copy of the tc_u_knode to work on and then inserting it into the hash table after the updates have been successfully applied. Tested with the following short script: #tc filter add dev p3p2 parent 8001:0 protocol ip prio 99 handle 1: \ u32 divisor 256 #tc filter add dev p3p2 parent 8001:0 protocol ip prio 99 \ u32 link 1: hashkey mask ffffff00 at 12 \ match ip src 192.168.8.0/2 #tc filter add dev p3p2 parent 8001:0 protocol ip prio 102 \ handle 1::10 u32 classid 1:2 ht 1: \ match ip src 192.168.8.0/8 match ip tos 0x0a 1e #tc filter change dev p3p2 parent 8001:0 protocol ip prio 102 \ handle 1::10 u32 classid 1:2 ht 1: \ match ip src 1.1.0.0/8 match ip tos 0x0b 1e CC: Eric Dumazet <edumazet@google.com> CC: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-22 15:59:21 -04:00
John Fastabend	a1ddcfee2d	net: cls_u32: fix missed pcpu_success free_percpu This fixes a missed free_percpu in the unwind code path and when keys are destroyed. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-22 15:59:21 -04:00
Tom Herbert	3fcb95a84f	udp: Need to make ip6_udp_tunnel.c have GPL license Unable to load various tunneling modules without this: [ 80.679049] fou: Unknown symbol udp_sock_create6 (err 0) [ 91.439939] ip6_udp_tunnel: Unknown symbol ip6_local_out (err 0) [ 91.439954] ip6_udp_tunnel: Unknown symbol __put_net (err 0) [ 91.457792] vxlan: Unknown symbol udp_sock_create6 (err 0) [ 91.457831] vxlan: Unknown symbol udp_tunnel6_xmit_skb (err 0) Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-22 15:08:25 -04:00
Jason Wang	cecda693a9	net: keep original skb which only needs header checking during software GSO Commit `ce93718fb7` ("net: Don't keep around original SKB when we software segment GSO frames") frees the original skb after software GSO even for dodgy gso skbs. This breaks the stream throughput from untrusted sources, since only header checking was done during software GSO instead of a true segmentation. This patch fixes this by freeing the original gso skb only when it was really segmented by software. Fixes `ce93718fb7` ("net: Don't keep around original SKB when we software segment GSO frames.") Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-22 14:57:08 -04:00
Florian Fainelli	19e57c4e6d	net: dsa: add {get, set}_wol callbacks to slave devices Allow switch drivers to implement per-port Wake-on-LAN getter and setters. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-22 14:41:23 -04:00
Florian Fainelli	2446254915	net: dsa: allow switch drivers to implement suspend/resume hooks Add an abstraction layer to suspend/resume switch devices, doing the following split: - suspend/resume the slave network devices and their corresponding PHY devices - suspend/resume the switch hardware using switch driver callbacks Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-22 14:41:23 -04:00
Eric Dumazet	2571178626	net: sched: shrink struct qdisc_skb_cb to 28 bytes We cannot make struct qdisc_skb_cb bigger without impacting IPoIB, or increasing skb->cb[] size. Commit `e0f31d8498` ("flow_keys: Record IP layer protocol in skb_flow_dissect()") broke IPoIB. Only current offender is sch_choke, and this one do not need an absolutely precise flow key. If we store 17 bytes of flow key, its more than enough. (Its the actual size of flow_keys if it was a packed structure, but we might add new fields at the end of it later) Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: `e0f31d8498` ("flow_keys: Record IP layer protocol in skb_flow_dissect()") Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-22 14:21:47 -04:00
Andy Zhou	6d967f8789	udp_tunnel: Only build ip6_udp_tunnel.c when IPV6 is selected Functions supplied in ip6_udp_tunnel.c are only needed when IPV6 is selected. When IPV6 is not selected, those functions are stubbed out in udp_tunnel.h. ================================================================== net/ipv6/ip6_udp_tunnel.c:15:5: error: redefinition of 'udp_sock_create6' int udp_sock_create6(struct net net, struct udp_port_cfg cfg, In file included from net/ipv6/ip6_udp_tunnel.c:9:0: include/net/udp_tunnel.h:36:19: note: previous definition of 'udp_sock_create6' was here static inline int udp_sock_create6(struct net net, struct udp_port_cfg cfg, ================================================================== Fixes: `fd384412e` udp_tunnel: Seperate ipv6 functions into its own file Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 22:05:28 -04:00
Samuel Gauthier	9b67aa4a82	openvswitch: restore OVS_FLOW_CMD_NEW notifications Since commit `fb5d1e9e12` ("openvswitch: Build flow cmd netlink reply only if needed."), the new flows are not notified to the listeners of OVS_FLOW_MCGROUP. This commit fixes the problem by using the genl function, ie genl_has_listerners() instead of netlink_has_listeners(). Signed-off-by: Samuel Gauthier <samuel.gauthier@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 17:28:26 -04:00
Tom Herbert	4565e9919c	gre: Setup and TX path for gre/UDP foo-over-udp encapsulation Added netlink attrs to configure FOU encapsulation for GRE, netlink handling of these flags, and properly adjust MTU for encapsulation. ip_tunnel_encap is called from ip_tunnel_xmit to actually perform FOU encapsulation. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 17:15:32 -04:00
Tom Herbert	473ab820dd	ipip: Setup and TX path for ipip/UDP foo-over-udp encapsulation Add netlink handling for IP tunnel encapsulation parameters and and adjustment of MTU for encapsulation. ip_tunnel_encap is called from ip_tunnel_xmit to actually perform FOU encapsulation. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 17:15:32 -04:00
Tom Herbert	14909664e4	sit: Setup and TX path for sit/UDP foo-over-udp encapsulation Added netlink handling of IP tunnel encapulation paramters, properly adjust MTU for encapsulation. Added ip_tunnel_encap call to ipip6_tunnel_xmit to actually perform FOU encapsulation. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 17:15:32 -04:00
Tom Herbert	5632848653	net: Changes to ip_tunnel to support foo-over-udp encapsulation This patch changes IP tunnel to support (secondary) encapsulation, Foo-over-UDP. Changes include: 1) Adding tun_hlen as the tunnel header length, encap_hlen as the encapsulation header length, and hlen becomes the grand total of these. 2) Added common netlink define to support FOU encapsulation. 3) Routines to perform FOU encapsulation. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 17:15:32 -04:00
Tom Herbert	afe93325bc	fou: Add GRO support Implement fou_gro_receive and fou_gro_complete, and populate these in the correponsing udp_offloads for the socket. Added ipproto to udp_offloads and pass this from UDP to the fou GRO routine in proto field of napi_gro_cb structure. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 17:15:31 -04:00
Tom Herbert	23461551c0	fou: Support for foo-over-udp RX path This patch provides a receive path for foo-over-udp. This allows direct encapsulation of IP protocols over UDP. The bound destination port is used to map to an IP protocol, and the XFRM framework (udp_encap_rcv) is used to receive encapsulated packets. Upon reception, the encapsulation header is logically removed (pointer to transport header is advanced) and the packet is reinjected into the receive path with the IP protocol indicated by the mapping. Netlink is used to configure FOU ports. The configuration information includes the port number to bind to and the IP protocol corresponding to that port. This should support GRE/UDP (http://tools.ietf.org/html/draft-yong-tsvwg-gre-in-udp-encap-02), as will as the other IP tunneling protocols (IPIP, SIT). Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 17:15:31 -04:00
Tom Herbert	ce3e02867e	net: Export inet_offloads and inet6_offloads Want to be able to use these in foo-over-udp offloads, etc. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 17:15:31 -04:00
John Fastabend	4e2840eee6	net: sched: cls_u32: rcu can not be last node tc_u32_sel 'sel' in tc_u_knode expects to be the last element in the structure and pads the structure with tc_u32_key fields for each key. kzalloc(sizeof(n) + s->nkeyssizeof(struct tc_u32_key), GFP_KERNEL) CC: Eric Dumazet <edumazet@google.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 17:05:45 -04:00
David S. Miller	8f665f6cb7	Merge tag 'master-2014-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== pull request: wireless 2014-09-17 Please pull this batch of fixes intended for the 3.17 stream... Arend van Spriel sends a trio of minor brcmfmac fixes, including a fix for a Kconfig/build issue, a fix for a crash (null reference), and a regression fix related to event handling on a P2P interface. Hante Meuleman follows-up with a brcmfmac fix for a memory leak. Johannes Stezenbach brings an ath9k_htc fix for a regression related to hardware decryption offload. Marcel Holtmann delivers a one-liner to properly mark a device ID table in rfkill-gpio. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 16:33:15 -04:00
Eric Dumazet	ab34f64808	net: sched: use __skb_queue_head_init() where applicable pfifo_fast and htb use skb lists, without needing their spinlocks. (They instead use the standard qdisc lock) We can use __skb_queue_head_init() instead of skb_queue_head_init() to be consistent. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 16:32:10 -04:00
Florian Fainelli	6819563e64	net: dsa: allow switch drivers to specify phy_device::dev_flags Some switch drivers (e.g: bcm_sf2) may have to communicate specific workarounds or flags towards the PHY device driver. Allow switches driver to be delegated that task by introducing a get_phy_flags() callback which will do just that. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 16:27:07 -04:00
Eric Dumazet	2e4e441071	net: add alloc_skb_with_frags() helper Extract from sock_alloc_send_pskb() code building skb with frags, so that we can reuse this in other contexts. Intent is to use it from tcp_send_rcvq(), tcp_collapse(), ... We also want to replace some skb_linearize() calls to a more reliable strategy in pathological cases where we need to reduce number of frags. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 16:25:23 -04:00
Eric Dumazet	cb93471acc	tcp: do not fake tcp headers in tcp_send_rcvq() Now we no longer rely on having tcp headers for skbs in receive queue, tcp repair do not need to build fake ones. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 16:04:13 -04:00
Andy Zhou	c8fffcea0a	l2tp: Refactor l2tp core driver to make use of the common UDP tunnel functions Simplify l2tp implementation using common UDP tunnel APIs. Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 15:57:15 -04:00
Andy Zhou	6a93cc9052	udp-tunnel: Add a few more UDP tunnel APIs Added a few more UDP tunnel APIs that can be shared by UDP based tunnel protocol implementation. The main ones are highlighted below. setup_udp_tunnel_sock() configures UDP listener socket for receiving UDP encapsulated packets. udp_tunnel_xmit_skb() and upd_tunnel6_xmit_skb() transmit skb using UDP encapsulation. udp_tunnel_sock_release() closes the UDP tunnel listener socket. Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 15:57:15 -04:00
Andy Zhou	fd384412e1	udp_tunnel: Seperate ipv6 functions into its own file. Add ip6_udp_tunnel.c for ipv6 UDP tunnel functions to avoid ifdefs in udp_tunnel.c Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 15:57:15 -04:00
Pablo Neira Ayuso	84d7fce693	netfilter: nf_tables: export rule-set generation ID This patch exposes the ruleset generation ID in three ways: 1) The new command NFT_MSG_GETGEN that exposes the 32-bits ruleset generation ID. This ID is incremented in every commit and it should be large enough to avoid wraparound problems. 2) The less significant 16-bits of the generation ID are exposed through the nfgenmsg->res_id header field. This allows us to quickly catch if the ruleset has change between two consecutive list dumps from different object lists (in this specific case I think the risk of wraparound is unlikely). 3) Userspace subscribers may receive notifications of new rule-set generation after every commit. This also provides an alternative way to monitor the generation ID. If the events are lost, the userspace process hits a overrun error, so it knows that it is working with a stale ruleset anyway. Patrick spotted that rule-set transformations in userspace may take quite some time. In that case, it annotates the 32-bits generation ID before fetching the rule-set, then: 1) it compares it to what we obtain after the transformation to make sure it is not working with a stale rule-set and no wraparound has ocurred. 2) it subscribes to ruleset notifications, so it can watch for new generation ID. This is complementary to the NLM_F_DUMP_INTR approach, which allows us to detect an interference in the middle one single list dumping. There is no way to explicitly check that an interference has occurred between two list dumps from the kernel, since it doesn't know how many lists the userspace client is actually going to dump. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-09-19 11:14:43 +02:00
Pablo Neira Ayuso	fc04733a1a	netfilter: nfnetlink: use original skbuff when committing/aborting This allows us to access the original content of the batch from the commit and the abort paths. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-09-19 11:14:42 +02:00
Johan Hedberg	5eb596f55c	Bluetooth: Fix setting correct security level when initiating SMP We can only determine the final security level when both pairing request and response have been exchanged. When initiating pairing the starting target security level is set to MEDIUM unless explicitly specified to be HIGH, so that we can still perform pairing even if the remote doesn't have MITM capabilities. However, once we've received the pairing response we should re-consult the remote and local IO capabilities and upgrade the target security level if necessary. Without this patch the resulting Long Term Key will occasionally be reported to be unauthenticated when it in reality is an authenticated one. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: stable@vger.kernel.org	2014-09-18 17:39:37 +02:00
Pablo Neira Ayuso	fcfa8f493f	Merge branch 'ipvs-next' Simon Horman says: ==================== This pull requests makes the following changes: * Add simple weighted fail-over scheduler. - Unlike other IPVS schedulers this offers fail-over rather than load balancing. Connections are directed to the appropriate server based solely on highest weight value and server availability. - Thanks to Kenny Mathis * Support IPv6 real servers in IPv4 virtual-services and vice versa - This feature is supported in conjunction with the tunnel (IPIP) forwarding mechanism. That is, IPv4 may be forwarded in IPv6 and vice versa. - The motivation for this is to allow more flexibility in the choice of IP version offered by both virtual-servers and real-servers as they no longer need to match: An IPv4 connection from an end-user may be forwarded to a real-server using IPv6 and vice versa. - Further work need to be done to support this feature in conjunction with connection synchronisation. For now such configurations are not allowed. - This change includes update to netlink protocol, adding a new destination address family attribute. And the necessary changes to plumb this information throughout IPVS. - Thanks to Alex Gartrell and Julian Anastasov ==================== Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-09-18 10:59:33 +02:00
Herbert Xu	689f1c9de2	ipsec: Remove obsolete MAX_AH_AUTH_LEN While tracking down the MAX_AH_AUTH_LEN crash in an old kernel I thought that this limit was rather arbitrary and we should just get rid of it. In fact it seems that we've already done all the work needed to remove it apart from actually removing it. This limit was there in order to limit stack usage. Since we've already switched over to allocating scratch space using kmalloc, there is no longer any need to limit the authentication length. This patch kills all references to it, including the BUG_ONs that led me here. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-09-18 10:54:36 +02:00
Alex Gartrell	bc18d37f67	ipvs: Allow heterogeneous pools now that we support them Remove the temporary consistency check and add a case statement to only allow ipip mixed dests. Signed-off-by: Alex Gartrell <agartrell@fb.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2014-09-18 08:59:29 +09:00
Julian Anastasov	f18ae7206e	ipvs: use the new dest addr family field Use the new address family field cp->daf when printing cp->daddr in logs or connection listing. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Alex Gartrell <agartrell@fb.com> Signed-off-by: Simon Horman <horms@verge.net.au>	2014-09-18 08:59:28 +09:00
Julian Anastasov	4d316f3f9a	ipvs: use correct address family in scheduler logs Needed to support svc->af != dest->af. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Alex Gartrell <agartrell@fb.com> Signed-off-by: Simon Horman <horms@verge.net.au>	2014-09-18 08:59:23 +09:00
Marcel Holtmann	0097db06f5	Bluetooth: Remove exported hci_recv_fragment function The hci_recv_fragment function is no longer used by any driver and thus do not export it. In fact it is not even needed by the core and it can be removed altogether. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-09-17 10:23:03 +03:00
John Fastabend	9f6c38e70b	net: sched: cls_cgroup need tcf_exts_init in all cases This ensures the tcf_exts_init() is called for all cases. Fixes: `952313bd62` ("net: sched: cls_cgroup use RCU") Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-16 16:26:39 -04:00
David S. Miller	2d9d65fa44	Merge branch 'net_next_ovs' of git://git.kernel.org/pub/scm/linux/kernel/git/pshelar/openvswitch Pravin B Shelar says: ==================== Open vSwitch Following patches adds recirculation and hash action to OVS. First patch removes pointer to stack object. Next three patches does code restructuring which is required for last patch. Recirculation implementation is changed, according to comments from David Miller, to avoid using recursive calls in OVS. It is using queue to record recirc action and deferred recirc is executed at the end of current actions execution. v1-v2: Changed subsystem name in subject to openvswitch v2-v3: Added patch to remove pkt_key pointer from skb->cb. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-16 16:21:48 -04:00
Marcel Holtmann	dda3b191eb	net: rfkill: gpio: Enable module auto-loading for ACPI based switches For the ACPI based switches the MODULE_DEVICE_TABLE is missing to export the entries for module auto-loading. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2014-09-16 16:09:01 -04:00
John Fastabend	e1f93eb06c	net: sched: cls_fw: add missing tcf_exts_init call in fw_change() When allocating a new structure we also need to call tcf_exts_init to initialize exts. A follow up patch might be in order to remove some of this code and do tcf_exts_assign(). With this we could remove the tcf_exts_init/tcf_exts_change pattern for some of the classifiers. As part of the future tcf_actions RCU series this will need to be done. For now fix the call here. Fixes `e35a8ee599` ("net: sched: fw use RCU") Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-16 15:59:36 -04:00
John Fastabend	d14cbfc88f	net: sched: cls_cgroup fix possible memory leak of 'new' tree: git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git master head: `54996b529a` commit: `c7953ef230` [625/646] net: sched: cls_cgroup use RCU net/sched/cls_cgroup.c:130 cls_cgroup_change() warn: possible memory leak of 'new' net/sched/cls_cgroup.c:135 cls_cgroup_change() warn: possible memory leak of 'new' net/sched/cls_cgroup.c:139 cls_cgroup_change() warn: possible memory leak of 'new' Fixes: `c7953ef230` ("net: sched: cls_cgroup use RCU") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-16 15:59:36 -04:00
John Fastabend	a96366bf26	net: sched: cls_u32 add missing rcu_assign_pointer and annotation Add missing rcu_assign_pointer and missing annotation for ht_up in cls_u32.c Caught by kbuild bot, >> net/sched/cls_u32.c:378:36: sparse: incorrect type in initializer (different address spaces) net/sched/cls_u32.c:378:36: expected struct tc_u_hnode ht net/sched/cls_u32.c:378:36: got struct tc_u_hnode [noderef] <asn:4>ht_up >> net/sched/cls_u32.c:610:54: sparse: incorrect type in argument 4 (different address spaces) net/sched/cls_u32.c:610:54: expected struct tc_u_hnode ht net/sched/cls_u32.c:610:54: got struct tc_u_hnode [noderef] <asn:4>ht_up >> net/sched/cls_u32.c:684:18: sparse: incorrect type in assignment (different address spaces) net/sched/cls_u32.c:684:18: expected struct tc_u_hnode [noderef] <asn:4>ht_up net/sched/cls_u32.c:684:18: got struct tc_u_hnode [assigned] ht >> net/sched/cls_u32.c:359:18: sparse: dereference of noderef expression Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-16 15:59:36 -04:00
John Fastabend	80aab73de4	net: sched: fix unsued cpu variable kbuild test robot reported an unused variable cpu in cls_u32.c after the patch below. This happens when PERF and MARK config variables are disabled Fix this is to use separate variables for perf and mark and define the cpu variable inside the ifdef logic. Fixes: `459d5f626d` ("net: sched: make cls_u32 per cpu")' Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-16 15:59:36 -04:00
WANG Cong	69301eaa7f	net_sched: fix a null pointer dereference in tcindex_set_parms() This patch fixes the following crash: [ 42.199159] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 [ 42.200027] IP: [<ffffffff817e3fc4>] tcindex_set_parms+0x45c/0x526 [ 42.200027] PGD d2319067 PUD d4ffe067 PMD 0 [ 42.200027] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC [ 42.200027] CPU: 0 PID: 541 Comm: tc Not tainted 3.17.0-rc4+ #603 [ 42.200027] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 42.200027] task: ffff8800d22d2670 ti: ffff8800ce790000 task.ti: ffff8800ce790000 [ 42.200027] RIP: 0010:[<ffffffff817e3fc4>] [<ffffffff817e3fc4>] tcindex_set_parms+0x45c/0x526 [ 42.200027] RSP: 0018:ffff8800ce793898 EFLAGS: 00010202 [ 42.200027] RAX: 0000000000000001 RBX: ffff8800d1786498 RCX: 0000000000000000 [ 42.200027] RDX: ffffffff82114ec8 RSI: ffffffff82114ec8 RDI: ffffffff82114ec8 [ 42.200027] RBP: ffff8800ce793958 R08: 00000000000080d0 R09: 0000000000000001 [ 42.200027] R10: ffff8800ce7939a0 R11: 0000000000000246 R12: ffff8800d017d238 [ 42.200027] R13: 0000000000000018 R14: ffff8800d017c6a0 R15: ffff8800d1786620 [ 42.200027] FS: 00007f4e24539740(0000) GS:ffff88011a600000(0000) knlGS:0000000000000000 [ 42.200027] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 42.200027] CR2: 0000000000000018 CR3: 00000000cff38000 CR4: 00000000000006f0 [ 42.200027] Stack: [ 42.200027] ffff8800ce0949f0 0000000000000000 0000000200000003 ffff880000000000 [ 42.200027] ffff8800ce7938b8 ffff8800ce7938b8 0000000600000007 0000000000000000 [ 42.200027] ffff8800ce7938d8 ffff8800ce7938d8 0000000600000007 ffff8800ce0949f0 [ 42.200027] Call Trace: [ 42.200027] [<ffffffff817e4169>] tcindex_change+0xdb/0xee [ 42.200027] [<ffffffff817c16ca>] tc_ctl_tfilter+0x44d/0x63f [ 42.200027] [<ffffffff8179d161>] rtnetlink_rcv_msg+0x181/0x194 [ 42.200027] [<ffffffff8179cf9d>] ? rtnl_lock+0x17/0x19 [ 42.200027] [<ffffffff8179cfe0>] ? __rtnl_unlock+0x17/0x17 [ 42.200027] [<ffffffff817ee296>] netlink_rcv_skb+0x49/0x8b [ 43.462494] [<ffffffff8179cfc2>] rtnetlink_rcv+0x23/0x2a [ 43.462494] [<ffffffff817ec8df>] netlink_unicast+0xc7/0x148 [ 43.462494] [<ffffffff817ed413>] netlink_sendmsg+0x5cb/0x63d [ 43.462494] [<ffffffff810ad781>] ? mark_lock+0x2e/0x224 [ 43.462494] [<ffffffff817757b8>] __sock_sendmsg_nosec+0x25/0x27 [ 43.462494] [<ffffffff81778165>] sock_sendmsg+0x57/0x71 [ 43.462494] [<ffffffff81152bbd>] ? might_fault+0x57/0xa4 [ 43.462494] [<ffffffff81152c06>] ? might_fault+0xa0/0xa4 [ 43.462494] [<ffffffff81152bbd>] ? might_fault+0x57/0xa4 [ 43.462494] [<ffffffff817838fd>] ? verify_iovec+0x69/0xb7 [ 43.462494] [<ffffffff817784f8>] ___sys_sendmsg+0x21d/0x2bb [ 43.462494] [<ffffffff81009db3>] ? native_sched_clock+0x35/0x37 [ 43.462494] [<ffffffff8109ab53>] ? sched_clock_local+0x12/0x72 [ 43.462494] [<ffffffff810ad781>] ? mark_lock+0x2e/0x224 [ 43.462494] [<ffffffff8109ada4>] ? sched_clock_cpu+0xa0/0xb9 [ 43.462494] [<ffffffff810aee37>] ? __lock_acquire+0x5fe/0xde4 [ 43.462494] [<ffffffff8119f570>] ? rcu_read_lock_held+0x36/0x38 [ 43.462494] [<ffffffff8119f75a>] ? __fcheck_files.isra.7+0x4b/0x57 [ 43.462494] [<ffffffff8119fbf2>] ? __fget_light+0x30/0x54 [ 43.462494] [<ffffffff81779012>] __sys_sendmsg+0x42/0x60 [ 43.462494] [<ffffffff81779042>] SyS_sendmsg+0x12/0x1c [ 43.462494] [<ffffffff819d24d2>] system_call_fastpath+0x16/0x1b 'p->h' could be NULL while 'cp->h' is always update to date. Fixes: commit `331b72922c` ("net: sched: RCU cls_tcindex") Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-By: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-16 15:20:09 -04:00
WANG Cong	44b75e4317	net_sched: fix memory leak in cls_tcindex Fixes: commit `331b72922c` ("net: sched: RCU cls_tcindex") Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-By: John Fastabend <john.r.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-16 15:19:23 -04:00
David Howells	0c903ab64f	KEYS: Make the key matching functions return bool Make the key matching functions pointed to by key_match_data::cmp return bool rather than int. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Vivek Goyal <vgoyal@redhat.com>	2014-09-16 17:36:08 +01:00
David Howells	c06cfb08b8	KEYS: Remove key_type::match in favour of overriding default by match_preparse A previous patch added a ->match_preparse() method to the key type. This is allowed to override the function called by the iteration algorithm. Therefore, we can just set a default that simply checks for an exact match of the key description with the original criterion data and allow match_preparse to override it as needed. The key_type::match op is then redundant and can be removed, as can the user_match() function. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Vivek Goyal <vgoyal@redhat.com>	2014-09-16 17:36:06 +01:00
David Howells	462919591a	KEYS: Preparse match data Preparse the match data. This provides several advantages: (1) The preparser can reject invalid criteria up front. (2) The preparser can convert the criteria to binary data if necessary (the asymmetric key type really wants to do binary comparison of the key IDs). (3) The preparser can set the type of search to be performed. This means that it's not then a one-off setting in the key type. (4) The preparser can set an appropriate comparator function. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Vivek Goyal <vgoyal@redhat.com>	2014-09-16 17:36:02 +01:00
Steffen Klassert	b8c203b2d2	xfrm: Generate queueing routes only from route lookup functions Currently we genarate a queueing route if we have matching policies but can not resolve the states and the sysctl xfrm_larval_drop is disabled. Here we assume that dst_output() is called to kill the queued packets. Unfortunately this assumption is not true in all cases, so it is possible that these packets leave the system unwanted. We fix this by generating queueing routes only from the route lookup functions, here we can guarantee a call to dst_output() afterwards. Fixes: `a0073fe18e` ("xfrm: Add a state resolution packet queue") Reported-by: Konstantinos Kolelis <k.kolelis@sirrix.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-09-16 10:08:49 +02:00
Steffen Klassert	f92ee61982	xfrm: Generate blackhole routes only from route lookup functions Currently we genarate a blackhole route route whenever we have matching policies but can not resolve the states. Here we assume that dst_output() is called to kill the balckholed packets. Unfortunately this assumption is not true in all cases, so it is possible that these packets leave the system unwanted. We fix this by generating blackhole routes only from the route lookup functions, here we can guarantee a call to dst_output() afterwards. Fixes: `2774c131b1` ("xfrm: Handle blackhole route creation via afinfo.") Reported-by: Konstantinos Kolelis <k.kolelis@sirrix.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-09-16 10:08:40 +02:00
Andy Zhou	971427f353	openvswitch: Add recirc and hash action. Recirc action allows a packet to reenter openvswitch processing. currently openvswitch lookup flow for packet received and execute set of actions on that packet, with help of recirc action we can process/modify the packet and recirculate it back in openvswitch for another pass. OVS hash action calculates 5-tupple hash and set hash in flow-key hash. This can be used along with recirculation for distributing packets among different ports for bond devices. For example: OVS bonding can use following actions: Match on: bond flow; Action: hash, recirc(id) Match on: recirc-id == id and hash lower bits == a; Action: output port_bond_a Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2014-09-15 23:28:14 -07:00
Andy Zhou	32ae87ff79	openvswitch: simplify sample action implementation The current sample() function implementation is more complicated than necessary in handling single user space action optimization and skb reference counting. There is no functional changes. Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2014-09-15 23:28:14 -07:00
Pravin B Shelar	8c8b1b83fc	openvswitch: Use tun_key only for egress tunnel path. Currently tun_key is used for passing tunnel information on ingress and egress path, this cause confusion. Following patch removes its use on ingress path make it egress only parameter. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>	2014-09-15 23:28:13 -07:00
Pravin B Shelar	83c8df26a3	openvswitch: refactor ovs flow extract API. OVS flow extract is called on packet receive or packet execute code path. Following patch defines separate API for extracting flow-key in packet execute code path. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>	2014-09-15 23:28:13 -07:00
Pravin B Shelar	2ff3e4e486	openvswitch: Remove pkt_key from OVS_CB OVS keeps pointer to packet key in skb->cb, but the packet key is store on stack. This could make code bit tricky. So it is better to get rid of the pointer. Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2014-09-15 23:28:13 -07:00
Julian Anastasov	cf34e646da	ipvs: address family of LBLCR entry depends on svc family The LBLCR entries should use svc->af, not dest->af. Needed to support svc->af != dest->af. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Alex Gartrell <agartrell@fb.com> Signed-off-by: Simon Horman <horms@verge.net.au>	2014-09-16 09:03:38 +09:00
Julian Anastasov	f7fa380069	ipvs: address family of LBLC entry depends on svc family The LBLC entries should use svc->af, not dest->af. Needed to support svc->af != dest->af. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Alex Gartrell <agartrell@fb.com> Signed-off-by: Simon Horman <horms@verge.net.au>	2014-09-16 09:03:38 +09:00
Alex Gartrell	8052ba2925	ipvs: support ipv4 in ipv6 and ipv6 in ipv4 tunnel forwarding Pull the common logic for preparing an skb to prepend the header into a single function and then set fields such that they can be used in either case (generalize tos and tclass to dscp, hop_limit and ttl to ttl, etc) Signed-off-by: Alex Gartrell <agartrell@fb.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2014-09-16 09:03:37 +09:00
Alex Gartrell	c63e4de2be	ipvs: Add generic ensure_mtu_is_adequate to handle mixed pools The out_rt functions check to see if the mtu is large enough for the packet and, if not, send icmp messages (TOOBIG or DEST_UNREACH) to the source and bail out. We needed the ability to send ICMP from the out_rt_v6 function and DEST_UNREACH from the out_rt function, so we just pulled it out into a common function. Signed-off-by: Alex Gartrell <agartrell@fb.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2014-09-16 09:03:37 +09:00
Alex Gartrell	919aa0b2bb	ipvs: Pull out update_pmtu code Another step toward heterogeneous pools, this removes another piece of functionality currently specific to each address family type. Signed-off-by: Alex Gartrell <agartrell@fb.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2014-09-16 09:03:36 +09:00
Alex Gartrell	4a4739d56b	ipvs: Pull out crosses_local_route_boundary logic This logic is repeated in both out_rt functions so it was redundant. Additionally, we'll need to be able to do checks to route v4 to v6 and vice versa in order to deal with heterogeneous pools. This patch also updates the callsites to add an additional parameter to the out route functions. Signed-off-by: Alex Gartrell <agartrell@fb.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2014-09-16 09:03:36 +09:00
Alex Gartrell	391f503d69	ipvs: prevent mixing heterogeneous pools and synchronization The synchronization protocol is not compatible with heterogeneous pools, so we need to verify that we're not turning both on at the same time. Signed-off-by: Alex Gartrell <agartrell@fb.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2014-09-16 09:03:35 +09:00
Alex Gartrell	ba38528aae	ipvs: Supply destination address family to ip_vs_conn_new The assumption that dest af is equal to service af is now unreliable, so we must specify it manually so as not to copy just the first 4 bytes of a v6 address or doing an illegal read of 16 butes on a v6 address. We "lie" in two places: for synchronization (which we will explicitly disallow from happening when we have heterogeneous pools) and for black hole addresses where there's no real dest. Signed-off-by: Alex Gartrell <agartrell@fb.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2014-09-16 09:03:34 +09:00
Alex Gartrell	ad147aa4dd	ipvs: Pass destination address family to ip_vs_trash_get_dest Part of a series of diffs to tease out destination family from virtual family. This diff just adds a parameter to ip_vs_trash_get and then uses it for comparison rather than svc->af. Signed-off-by: Alex Gartrell <agartrell@fb.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2014-09-16 09:03:34 +09:00
Alex Gartrell	655eef103d	ipvs: Supply destination addr family to ip_vs_{lookup_dest,find_dest} We need to remove the assumption that virtual address family is the same as real address family in order to support heterogeneous services (that is, services with v4 vips and v6 backends or the opposite). Signed-off-by: Alex Gartrell <agartrell@fb.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2014-09-16 09:03:33 +09:00
Alex Gartrell	6cff339bbd	ipvs: Add destination address family to netlink interface This is necessary to support heterogeneous pools. For example, if you have an ipv6 addressed network, you'll want to be able to forward ipv4 traffic into it. This patch enforces that destination address family is the same as service family, as none of the forwarding mechanisms support anything else. For the old setsockopt mechanism, we simply set the dest address family to AF_INET as we do with the service. Signed-off-by: Alex Gartrell <agartrell@fb.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2014-09-16 09:03:33 +09:00
Kenny Mathis	616a9be25c	ipvs: Add simple weighted failover scheduler Add simple weighted IPVS failover support to the Linux kernel. All other scheduling modules implement some form of load balancing, while this offers a simple failover solution. Connections are directed to the appropriate server based solely on highest weight value and server availability. Tested functionality with keepalived. Signed-off-by: Kenny Mathis <kmathis@chokepoint.net> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2014-09-16 09:03:32 +09:00
Florian Fainelli	c1f570a6ab	net: dsa: fix mii_bus to host_dev replacement dsa_of_probe() still used cd->mii_bus instead of cd->host_dev when building with CONFIG_OF=y. Fix this by making the replacement here as well. Fixes: `b4d2394d01` ("dsa: Replace mii_bus with a generic host device") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-15 17:52:48 -04:00
WANG Cong	10ee1c34be	net_sched: use tcindex_filter_result_init() Fixes: commit `331b72922c` ("net: sched: RCU cls_tcindex") Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-15 17:51:18 -04:00
WANG Cong	2f9a220eff	net_sched: fix suspicious RCU usage in tcindex_classify() This patch fixes the following kernel warning: [ 44.805900] [ INFO: suspicious RCU usage. ] [ 44.808946] 3.17.0-rc4+ #610 Not tainted [ 44.811831] ------------------------------- [ 44.814873] net/sched/cls_tcindex.c:84 suspicious rcu_dereference_check() usage! Fixes: commit `331b72922c` ("net: sched: RCU cls_tcindex") Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-15 17:49:42 -04:00
WANG Cong	a57a65ba47	net_sched: fix an allocation bug in tcindex_set_parms() Fixes: commit `331b72922c` ("net: sched: RCU cls_tcindex") Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-15 17:48:23 -04:00
WANG Cong	80dcbd12fb	net_sched: fix suspicious RCU usage in cls_bpf_classify() Fixes: commit `1f947bf151` ("net: sched: rcu'ify cls_bpf") Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-15 17:42:08 -04:00
Vlad Yasevich	c095f248e6	bridge: Fix br_should_learn to check vlan_enabled As Toshiaki Makita pointed out, the BRIDGE_INPUT_SKB_CB will not be initialized in br_should_learn() as that function is called only from br_handle_local_finish(). That is an input handler for link-local ethernet traffic so it perfectly correct to check br->vlan_enabled here. Reported-by: Toshiaki Makita<toshiaki.makita1@gmail.com> Fixes: `20adfa1` bridge: Check if vlan filtering is enabled only once. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-15 17:38:30 -04:00
Alexander Duyck	b4d2394d01	dsa: Replace mii_bus with a generic host device This change makes it so that instead of passing and storing a mii_bus we instead pass and store a host_dev. From there we can test to determine the exact type of device, and can verify it is the correct device for our switch. So for example it would be possible to pass a device pointer from a pci_dev and instead of checking for a PHY ID we could check for a vendor and/or device ID. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-15 17:24:20 -04:00
Alexander Duyck	5075314e4e	dsa: Split ops up, and avoid assigning tag_protocol and receive separately This change addresses several issues. First, it was possible to set tag_protocol without setting the ops pointer. To correct that I have reordered things so that rcv is now populated before we set tag_protocol. Second, it didn't make much sense to keep setting the device ops each time a new slave was registered. So by moving the receive portion out into root switch initialization that issue should be addressed. Third, I wanted to avoid sending tags if the rcv pointer was not registered so I changed the tag check to verify if the rcv function pointer is set on the root tree. If it is then we start sending DSA tagged frames. Finally I split the device ops pointer in the structures into two spots. I placed the rcv function pointer in the root switch since this makes it easiest to access from there, and I placed the xmit function pointer in the slave for the same reason. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-15 17:24:20 -04:00
Jozsef Kadlecsik	07034aeae1	netfilter: ipset: hash:mac type added to ipset Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2014-09-15 22:20:21 +02:00
Anton Danilov	76cea4109c	netfilter: ipset: Add skbinfo extension support to SET target. Signed-off-by: Anton Danilov <littlesmilingcloud@gmail.com> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2014-09-15 22:20:21 +02:00
Anton Danilov	cbee93d7b7	netfilter: ipset: Add skbinfo extension kernel support for the list set type. Add skbinfo extension kernel support for the list set type. Introduce the new revision of the list set type. Signed-off-by: Anton Danilov <littlesmilingcloud@gmail.com> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2014-09-15 22:20:20 +02:00
Anton Danilov	af331419d3	netfilter: ipset: Add skbinfo extension kernel support for the hash set types. Add skbinfo extension kernel support for the hash set types. Inroduce the new revisions of all hash set types. Signed-off-by: Anton Danilov <littlesmilingcloud@gmail.com> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2014-09-15 22:20:20 +02:00
Anton Danilov	39d1ecf1ad	netfilter: ipset: Add skbinfo extension kernel support for the bitmap set types. Add skbinfo extension kernel support for the bitmap set types. Inroduce the new revisions of bitmap_ip, bitmap_ipmac and bitmap_port set types. Signed-off-by: Anton Danilov <littlesmilingcloud@gmail.com> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2014-09-15 22:20:20 +02:00
Anton Danilov	0e9871e3f7	netfilter: ipset: Add skbinfo extension kernel support in the ipset core. Skbinfo extension provides mapping of metainformation with lookup in the ipset tables. This patch defines the flags, the constants, the functions and the structures for the data type independent support of the extension. Note the firewall mark stores in the kernel structures as two 32bit values, but transfered through netlink as one 64bit value. Signed-off-by: Anton Danilov <littlesmilingcloud@gmail.com> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2014-09-15 22:20:20 +02:00
Jozsef Kadlecsik	73e64e1813	netfilter: ipset: Fix static checker warning in ip_set_core.c Dan Carpenter reported the following static checker warning: net/netfilter/ipset/ip_set_core.c:1414 call_ad() error: 'nlh->nlmsg_len' from user is not capped properly The payload size is limited now by the max size of size_t. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2014-09-15 22:20:20 +02:00
John W. Linville	1186b623c2	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next	2014-09-15 14:55:45 -04:00
John W. Linville	6bd2bd27ba	This time, I have some rate minstrel improvements, support for a very small feature from CCX that Steinar reverse-engineered, dynamic ACK timeout support, a number of changes for TDLS, early support for radio resource measurement and many fixes. Also, I'm changing a number of places to clear key memory when it's freed and Intel claims copyright for code they developed. -----BEGIN PGP SIGNATURE----- iQIcBAABCAAGBQJUEpv0AAoJEDBSmw7B7bqr6CMP/2CXvWr/98AY2Flt74KDNyaE vmJBVCsu+eT0G9FL6YxbVU5+rvInGDHd9qTHkU4ljd+uXwnG8XAT+WHFlhBjzm+V juXPWblbSdMzwpWDfq7Kbk134b9ALTEUqekhqSFvhPA5h0Dq0/8lDK9CFyfwKWbN 07PwUv0VUUEHKVqQoVSNJu9Szi5NvZvDcN7Jwg1Cpnv0sUOeH7J2Kz1OUT4RaEhI c/UJjCQV4ssXaEkTDIxciQ62HrglZanMqyx4a9LGbrxLdw1KJ19CNmSkwB5mQuZg LhV05Y0Gv4tkRC8sCo7HF7cqgjBfjTNiEjZYfbExW0QFOMKIgKmmjYIEezVdbrk7 gFIyhTRE595UtztUJV0dcitoOlybbRf3OdEwAIJD6fc0vhoe/rSjUIyS7/CZisMT 9zg33JvtK3eYPSJS1jy4lk2yZ5alhLoPMQTNmsEuyOGcU3sH9vTGMjONPffOlcH9 nzj7aUS2Qvwn3H+4CIaZbZhySpa0B9zkGL3oxeaEBmLJbFMTo5ua2FNGhubC2O+O BwNULDBEMwsGHKMUCWCLmQwACWdVdNxYYWtXbWfxdmC/CJoXgdLCJIUfoa1aOf2A DyCqUFvG/n8ObHVy+P3RU6poQFj0M/yclJAMHRW6x2qzNvAkDb0G6TVeIlgN5dG8 jLoZPL5OH0wb0BPVNEH8 =OPIp -----END PGP SIGNATURE----- Merge tag 'mac80211-next-for-john-2014-09-12' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg <johannes@sipsolutions.net> says: "This time, I have some rate minstrel improvements, support for a very small feature from CCX that Steinar reverse-engineered, dynamic ACK timeout support, a number of changes for TDLS, early support for radio resource measurement and many fixes. Also, I'm changing a number of places to clear key memory when it's freed and Intel claims copyright for code they developed." Conflicts: net/mac80211/iface.c Signed-off-by: John W. Linville <linville@tuxdriver.com>	2014-09-15 14:51:23 -04:00
Eric Dumazet	b3d6cb92fd	tcp: do not copy headers in tcp_collapse() tcp_collapse() wants to shrink skb so that the overhead is minimal. Now we store tcp flags into TCP_SKB_CB(skb)->tcp_flags, we no longer need to keep around full headers. Whole available space is dedicated to the payload. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-15 14:41:08 -04:00
Eric Dumazet	e93a0435f8	tcp: allow segment with FIN in tcp_try_coalesce() We can allow a segment with FIN to be aggregated, if we take care to add tcp flags, and if skb_try_coalesce() takes care of zero sized skbs. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-15 14:41:07 -04:00
Eric Dumazet	e11ecddf51	tcp: use TCP_SKB_CB(skb)->tcp_flags in input path Input path of TCP do not currently uses TCP_SKB_CB(skb)->tcp_flags, which is only used in output path. tcp_recvmsg(), looks at tcp_hdr(skb)->syn for every skb found in receive queue, and its unfortunate because this bit is located in a cache line right before the payload. We can simplify TCP by copying tcp flags into TCP_SKB_CB(skb)->tcp_flags. This patch does so, and avoids the cache line miss in tcp_recvmsg() Following patches will - allow a segment with FIN being coalesced in tcp_try_coalesce() - simplify tcp_collapse() by not copying the headers. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-15 14:41:07 -04:00
Alexander Y. Fomichev	7ce64c79c4	net: fix creation adjacent device symlinks __netdev_adjacent_dev_insert may add adjust device of different net namespace, without proper check it leads to emergence of broken sysfs links from/to devices in another namespace. Fix: rewrite netdev_adjacent_is_neigh_list macro as a function, move net_eq check into netdev_adjacent_is_neigh_list. (thanks David) related to: `4c75431ac3` Signed-off-by: Alexander Fomichev <git.user@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-15 14:24:53 -04:00
Marcel Holtmann	43e73e4e2a	Bluetooth: Provide HCI command opcode information to driver The Bluetooth core already does processing of the HCI command header and puts it together before sending it to the driver. It is not really efficient for the driver to look at the HCI command header again in case it has to make certain decisions about certain commands. To make this easier, just provide the opcode as part of the SKB control buffer information. The extra information about the opcode is optional and only provided for HCI commands. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-09-15 07:15:45 +03:00
Marcel Holtmann	7cb9d20fd9	Bluetooth: Add BUILD_BUG_ON check for SKB control buffer size The struct bt_skb_cb size needs to stay within the limits of skb->cb at all times and to ensure that add a BUILD_BUG_ON to check for it at compile time. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-09-15 07:15:41 +03:00
Sasha Levin	c0d1379a19	net: bpf: correctly handle errors in sk_attach_filter() Commit "net: bpf: make eBPF interpreter images read-only" has changed bpf_prog to be vmalloc()ed but never handled some of the errors paths of the old code. On error within sk_attach_filter (which userspace can easily trigger), we'd kfree() the vmalloc()ed memory, and leak the internal bpf_work_struct. Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Acked-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 17:37:49 -04:00
Vlad Yasevich	635126b7ca	bridge: Allow clearing of pvid and untagged bitmap Currently, it is possible to modify the vlan filter configuration to add pvid or untagged support. For example: bridge vlan add vid 10 dev eth0 bridge vlan add vid 10 dev eth0 untagged pvid The second statement will modify vlan 10 to include untagged and pvid configuration. However, it is currently impossible to go backwards bridge vlan add vid 10 dev eth0 untagged pvid bridge vlan add vid 10 dev eth0 Here nothing happens. This patch correct this so that any modifiers not supplied are removed from the configuration. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 17:21:56 -04:00
Vlad Yasevich	20adfa1a81	bridge: Check if vlan filtering is enabled only once. The bridge code checks if vlan filtering is enabled on both ingress and egress. When the state flip happens, it is possible for the bridge to currently be forwarding packets and forwarding behavior becomes non-deterministic. Bridge may drop packets on some interfaces, but not others. This patch solves this by caching the filtered state of the packet into skb_cb on ingress. The skb_cb is guaranteed to not be over-written between the time packet entres bridge forwarding path and the time it leaves it. On egress, we can then check the cached state to see if we need to apply filtering information. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 17:21:56 -04:00
Hannes Frederic Sowa	233577a220	net: filter: constify detection of pkt_type_offset Currently we have 2 pkt_type_offset functions doing the same thing and spread across the architecture files. Remove those and replace them with a PKT_TYPE_OFFSET macro helper which gets the constant value from a zero sized sk_buff member right in front of the bitfield with offsetof. This new offset marker does not change size of struct sk_buff. Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Markos Chandras <markos.chandras@imgtec.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Daniel Borkmann <dborkman@redhat.com> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 17:07:21 -04:00
Florian Fainelli	ac7a04c33d	net: dsa: change tag_protocol to an enum Now that we introduced an additional multiplexing/demultiplexing layer with commit `3e8a72d1da` ("net: dsa: reduce number of protocol hooks") that lives within the DSA code, we no longer need to have a given switch driver tag_protocol be an actual ethertype value, instead, we can replace it with an enum: dsa_tag_protocol. Do this replacement in the drivers, which allows us to get rid of the cpu_to_be16()/htons() dance, and remove ETH_P_BRCMTAG since we do not need it anymore. Suggested-by: Alexander Duyck <alexander.duyck@gmail.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 17:04:35 -04:00
WANG Cong	3ce62a84d5	ipv6: exit early in addrconf_notify() if IPv6 is disabled If IPv6 is explicitly disabled before the interface comes up, it makes no sense to continue when it comes up, even just print a message. (I am not sure about other cases though, so I prefer not to touch) Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 16:39:40 -04:00
WANG Cong	1691c63ea4	ipv6: refactor ipv6_dev_mc_inc() Refactor out allocation and initialization and make the refcount code more readable. Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 16:38:42 -04:00
WANG Cong	f7ed925c1b	ipv6: update the comment in mcast.c Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 16:38:42 -04:00
WANG Cong	414b6c943f	ipv6: drop some rcu_read_lock in mcast Similarly the code is already protected by rtnl lock. Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 16:38:42 -04:00
WANG Cong	b5350916bf	ipv6: drop ipv6_sk_mc_lock in mcast Similarly the code is already protected by rtnl lock. Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 16:38:42 -04:00
WANG Cong	83aa29eefd	ipv6: refactor __ipv6_dev_ac_inc() Refactor out allocation and initialization and make the refcount code more readable. Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 16:38:42 -04:00
WANG Cong	013b4d9038	ipv6: clean up ipv6_dev_ac_inc() Make it accept inet6_dev, and rename it to __ipv6_dev_ac_inc() to reflect this change. Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 16:38:42 -04:00
WANG Cong	b03a9c04a3	ipv6: remove ipv6_sk_ac_lock Just move rtnl lock up, so that the anycast list can be protected by rtnl lock now. Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 16:38:42 -04:00
WANG Cong	6c555490e0	ipv6: drop useless rcu_read_lock() in anycast These code is now protected by rtnl lock, rcu read lock is useless now. Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 16:38:42 -04:00
John Fastabend	1f947bf151	net: sched: rcu'ify cls_bpf This patch makes the cls_bpf classifier RCU safe. The tcf_lock was being used to protect a list of cls_bpf_prog now this list is RCU safe and updates occur with rcu_replace. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 12:30:26 -04:00
John Fastabend	b929d86d25	net: sched: rcu'ify cls_rsvp Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 12:30:26 -04:00
John Fastabend	1ce87720d4	net: sched: make cls_u32 lockless Make cls_u32 classifier safe to run without holding lock. This patch converts statistics that are kept in read section u32_classify into per cpu counters. This patch was tested with a tight u32 filter add/delete loop while generating traffic with pktgen. By running pktgen on vlan devices created on top of a physical device we can hit the qdisc layer correctly. For ingress qdisc's a loopback cable was used. for i in {1..100}; do q=`echo $i%8\|bc`; echo -n "u32 tos: iteration $i on queue $q"; tc filter add dev p3p2 parent $p prio $i u32 match ip tos 0x10 0xff \ action skbedit queue_mapping $q; sleep 1; tc filter del dev p3p2 prio $i; echo -n "u32 tos hash table: iteration $i on queue $q"; tc filter add dev p3p2 parent $p protocol ip prio $i handle 628: u32 divisor 1 tc filter add dev p3p2 parent $p protocol ip prio $i u32 \ match ip protocol 17 0xff link 628: offset at 0 mask 0xf00 shift 6 plus 0 tc filter add dev p3p2 parent $p protocol ip prio $i u32 \ ht 628:0 match ip tos 0x10 0xff action skbedit queue_mapping $q sleep 2; tc filter del dev p3p2 prio $i sleep 1; done Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 12:30:26 -04:00
John Fastabend	459d5f626d	net: sched: make cls_u32 per cpu This uses per cpu counters in cls_u32 in preparation to convert over to rcu. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 12:30:26 -04:00
John Fastabend	331b72922c	net: sched: RCU cls_tcindex Make cls_tcindex RCU safe. This patch addds a new RCU routine rcu_dereference_bh_rtnl() to check caller either holds the rcu read lock or RTNL. This is needed to handle the case where tcindex_lookup() is being called in both cases. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 12:30:26 -04:00
John Fastabend	1109c00547	net: sched: RCU cls_route RCUify the route classifier. For now however spinlock's are used to protect fastmap cache. The issue here is the fastmap may be read by one CPU while the cache is being updated by another. An array of pointers could be one possible solution. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 12:30:26 -04:00
John Fastabend	e35a8ee599	net: sched: fw use RCU RCU'ify fw classifier. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 12:30:26 -04:00
John Fastabend	70da9f0bf9	net: sched: cls_flow use RCU Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 12:30:26 -04:00
John Fastabend	952313bd62	net: sched: cls_cgroup use RCU Make cgroup classifier safe for RCU. Also drops the calls in the classify routine that were doing a rcu_read_lock()/rcu_read_unlock(). If the rcu_read_lock() isn't held entering this routine we have issues with deleting the classifier chain so remove the unnecessary rcu_read_lock()/rcu_read_unlock() pair noting all paths AFAIK hold rcu_read_lock. If there is a case where classify is called without the rcu read lock then an rcu splat will occur and we can correct it. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 12:30:26 -04:00
John Fastabend	9888faefe1	net: sched: cls_basic use RCU Enable basic classifier for RCU. Dereferencing tp->root may look a bit strange here but it is needed by my accounting because it is allocated at init time and needs to be kfree'd at destroy time. However because it may be referenced in the classify() path we must wait an RCU grace period before free'ing it. We use kfree_rcu() and rcu_ APIs to enforce this. This pattern is used in all the classifiers. Also the hgenerator can be incremented without concern because it is always incremented under RTNL. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 12:30:25 -04:00
John Fastabend	25d8c0d55f	net: rcu-ify tcf_proto rcu'ify tcf_proto this allows calling tc_classify() without holding any locks. Updaters are protected by RTNL. This patch prepares the core net_sched infrastracture for running the classifier/action chains without holding the qdisc lock however it does nothing to ensure cls_xxx and act_xxx types also work without locking. Additional patches are required to address the fall out. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 12:30:25 -04:00
John Fastabend	46e5da40ae	net: qdisc: use rcu prefix and silence sparse warnings Add __rcu notation to qdisc handling by doing this we can make smatch output more legible. And anyways some of the cases should be using rcu_dereference() see qdisc_all_tx_empty(), qdisc_tx_chainging(), and so on. Also *wake_queue() API is commonly called from driver timer routines without rcu lock or rtnl lock. So I added rcu_read_lock() blocks around netif_wake_subqueue and netif_tx_wake_queue. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 12:30:25 -04:00
David S. Miller	cffc6c4c94	Merge tag 'master-2014-09-11' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== pull request: wireless 2014-09-11 Please pull this batch of fixes intended for the 3.17 stream: For the mac80211 bits, Johannes says: "Two more fixes for mac80211 - one of them addresses a long-standing issue that we only found when using vendor events more frequently; the other addresses some bad information being reported in userspace that people were starting to actually look at." For the iwlwifi bits, Emmanuel says: "I re-enable scheduled scan on firmware that contain the fix for the bug that Linus reported. A few trivial fixes: endianity issues, the same DTIM period fix that I did in mac80211. Eyal fixes a few issues we identified with EAPOL, we now send them just as if they were management frames, this solves interrop issues. Johannes has another set of trivial fixes, while Luca fixes the way we configure the filters in the firmware. Last but not least, a new device is added by Oren." Emmanuel was traveling, resulting in his pull to be a bit larger than I would have liked to see at this point. FWIW, I have asked Emmanuel to be much more strict for any more pull requests in this cycle. In addition to the above, Sujith Manoharan reverts an earlier ath9k patch. The earlier change was found to allow for the device to sleep too long and miss beacons. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-12 18:21:47 -04:00
Scott Wood	2d8f7e2c8a	udp: Fix inverted NAPI_GRO_CB(skb)->flush test Commit `2abb7cdc0d` ("udp: Add support for doing checksum unnecessary conversion") caused napi_gro_cb structs with the "flush" field zero to take the "udp_gro_receive" path rather than the "set flush to 1" path that they would previously take. As a result I saw booting from an NFS root hang shortly after starting userspace, with "server not responding" messages. This change to the handling of "flush == 0" packets appears to be incidental to the goal of adding new code in the case where skb_gro_checksum_validate_zero_check() returns zero. Based on that and the fact that it breaks things, I'm assuming that it is unintentional. Fixes: `2abb7cdc0d` ("udp: Add support for doing checksum unnecessary conversion") Cc: Tom Herbert <therbert@google.com> Signed-off-by: Scott Wood <scottwood@freescale.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-12 17:55:41 -04:00
Alexander Duyck	bf7fa551e0	mac80211: Resolve sk_refcnt/sk_wmem_alloc issue in wifi ack path There is a possible issue with the use, or lack thereof of sk_refcnt and sk_wmem_alloc in the wifi ack status functionality. Specifically if a socket were to request acknowledgements, and the socket were to have sk_refcnt drop to 0 resulting in it waiting on sk_wmem_alloc to reach 0 it would be possible to have sock_queue_err_skb orphan the last buffer, resulting in __sk_free being called on the socket. After this the buffer is enqueued on sk_error_queue, however the queue has already been flushed resulting in at least a memory leak, if not a data corruption. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-12 17:51:25 -04:00
Alexander Duyck	cab41c47d9	skb: Add documentation for skb_clone_sk This change adds some documentation to the call skb_clone_sk. This is meant to help clarify the purpose of the function for other developers. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-12 17:51:24 -04:00
Sabrina Dubroca	381f4dca48	ipv6: clean up anycast when an interface is destroyed If we try to rmmod the driver for an interface while sockets with setsockopt(JOIN_ANYCAST) are alive, some refcounts aren't cleaned up and we get stuck on: unregister_netdevice: waiting for ens3 to become free. Usage count = 1 If we LEAVE_ANYCAST/close everything before rmmod'ing, there is no problem. We need to perform a cleanup similar to the one for multicast in addrconf_ifdown(how == 1). Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-12 17:33:06 -04:00
Johan Hedberg	9a783a139c	Bluetooth: Fix re-setting RPA as expired when deferring update The hci_update_random_address will clear the RPA_EXPIRED flag and proceed with setting a new one if the flag was set. However, the set_random_addr() function that is called may choose to defer the update to a later moment. In such a case the flag would incorrectly remain unset unless set_random_addr() re-sets it. This patch fixes the issue. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-12 18:34:25 +02:00
Pablo Neira Ayuso	0bbe80e571	netfilter: masquerading needs to be independent of x_tables in Kconfig Users are starting to test nf_tables with no x_tables support. Therefore, masquerading needs to be indenpendent of it from Kconfig. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-09-12 09:40:18 +02:00
Pablo Neira Ayuso	3e8dc212a0	netfilter: NFT_CHAIN_NAT_IPV* is independent of NFT_NAT Now that we have masquerading support in nf_tables, the NAT chain can be use with it, not only for SNAT/DNAT. So make this chain type independent of it. While at it, move it inside the scope of 'if NF_NAT_IPV*' to simplify dependencies. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-09-12 09:40:17 +02:00
Linus Torvalds	c73f6fdf2f	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph fixes from Sage Weil: "The main thing here is a set of three patches that fix a buffer overrun for large authentication tickets (sigh). There is also a trivial warning fix and an error path fix that are both regressions" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: libceph: do not hard code max auth ticket len libceph: add process_one_ticket() helper libceph: gracefully handle large reply messages from the mon rbd: fix error return code in rbd_dev_device_setup() rbd: avoid format-security warning inside alloc_workqueue()	2014-09-11 18:03:21 -07:00
Eliad Peller	0d8614b4b9	mac80211: replace SMPS hw flags with wiphy feature bits Use the new static_smps / dynamic_smps feature bits instead of mac80211-internal hw flags. Signed-off-by: Eliad Peller <eliad@wizery.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-11 13:37:02 +02:00
Eliad Peller	f699317487	mac80211: set smps_mode according to ap params Take the requested smps mode from the ap params (instead of always starting with SMPS_OFF) Signed-off-by: Eliad Peller <eliad@wizery.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-11 13:37:02 +02:00
Eliad Peller	18998c381b	cfg80211: allow requesting SMPS mode on ap start Add feature bits to indicate device support for static-smps and dynamic-smps modes. Add a new NL80211_ATTR_SMPS_MODE attribue to allow configuring the smps mode to be used by the ap (e.g. configuring to ap to dynamic smps mode will reduce power consumption while having minor effect on throughput) Signed-off-by: Eliad Peller <eliad@wizery.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-11 13:37:02 +02:00
Arik Nemtsov	59cd85cbcf	mac80211: set network header in TDLS frames Correctly mark the network header location in mac80211-generated TDLS frames. These may be used by lower-level drivers. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-11 12:25:22 +02:00
Eliad Peller	b0b6aa2c8e	cfg80211/mac80211: add wmm info to assoc event Userspace might need to know what queues are configured for uapsd (e.g. for setting proper default values in tspecs). Add this bitmap to the association event (inside wmm nested attribute) Add additional parameter to cfg80211_rx_assoc_resp, and update its callers. Signed-off-by: Eliad Peller <eliadx.peller@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-11 12:24:39 +02:00
Johannes Berg	960d01acf6	cfg80211: add WMM traffic stream API Add nl80211 and driver API to validate, add and delete traffic streams with appropriate settings. The API calls for userspace doing the action frame handshake with the peer, and then allows only to set up the parameters in the driver. To avoid setting up a session only to tear it down again, the validate API is provided, but the real usage later can still fail so userspace must be prepared for that. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-11 12:21:18 +02:00
Liad Kaufman	9d58f25b12	mac80211: add TDLS connection timeout Adding a timeout for tearing down a TDLS connection that hasn't had ACKed traffic sent through it for a certain amount of time. Since we have no other monitoring facility to indicate the existance (or non-existance) of a peer, this patch will cause a peer to be considered as unavailable if for some X time at least some Y packets have all not been ACKed. Signed-off-by: Liad Kaufman <liad.kaufman@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-11 12:18:47 +02:00
Thomas Huehn	5935839ad7	mac80211: improve minstrel_ht rate sorting by throughput & probability This patch improves the way minstrel_ht sorts rates according to throughput and success probability. 3 FOR-loops across the entire rate and mcs group set in function minstrel_ht_update_stats() which where used to determine the fastest, second fastest and most robust rate are reduced to 2 FOR-loop. The sorted list of rates according throughput is extended to the best four rates as we need them in upcoming joint rate and power control. The sorting is done via the new function minstrel_ht_sort_best_tp_rates(). The annotation of those 4 best throughput rates in the debugfs file rc-stats is changes to: "A,B,C,D", where A is the fastest rate and C the 4th fastest. Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de> Tested-by: Stefan Venz <ikstream86@gmail.com> Acked-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-11 12:10:14 +02:00
Thomas Huehn	ca12c0c833	mac80211: Unify rate statistic variables between Minstrel & Minstrel_HT Minstrel and Mintrel_HT used there own structs to keep track of rate statistics. Unify those variables in struct minstrel_rate_states and move it to rc80211_minstrel.h for common usage. This is a clean-up patch to prepare Minstrel and Minstrel_HT codebase for upcoming TPC. Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de> Acked-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-11 12:08:31 +02:00
Johannes Berg	5393b917bc	cfg80211: clear nl80211 messages carrying keys after processing Clear any nl80211 messages that might contain keys after processing them to avoid leaving their data in memory "forever" after they've been freed. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-11 12:07:39 +02:00
Johannes Berg	78f686cae0	cfg80211: don't put kek/kck/replay counter on the stack There's no need to put the values on the stack, just pass a pointer to the data in the nl80211 message. This reduces stack usage and avoids potential issues with putting sensitive data on the stack. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-11 12:07:34 +02:00
Johannes Berg	538c9eb8b3	cfg80211: clear wext keys when freeing and removing them When freeing the keys stored for wireless extensions, clear the memory to avoid having the key material stick around in memory "forever". Similarly, when userspace overwrites a key, actually clear it instead of just setting the key length to zero. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-11 12:07:28 +02:00
Johannes Berg	29c3f9c399	mac80211: clear key material when freeing keys When freeing the key, clear the memory to avoid having the key material stick around in memory "forever". Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-11 12:07:23 +02:00
Johannes Berg	b47f610bd6	cfg80211: clear connect keys when freeing them When freeing the connect keys, clear the memory to avoid having the key material stick around in memory "forever". Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-11 12:07:18 +02:00
Johan Hedberg	7ed3fa2078	Bluetooth: Expire RPA if encryption fails If encryption fails and we're using an RPA it may be because of a conflict with another device. To avoid repeated failures the safest action is to simply mark the RPA as expired so that a new one gets generated as soon as the connection drops. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-11 07:32:14 +02:00
Johan Hedberg	5be5e275ad	Bluetooth: Avoid hard-coded IO capability values in SMP This is a trivial change to use a proper define for the NoInputNoOutput IO capability instead of hard-coded values. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-11 03:02:22 +02:00
Johan Hedberg	aeaeb4bbca	Bluetooth: Fix L2CAP information request handling for fixed channels Even if we have no connection-oriented channels we should perform the L2CAP Information Request procedures before notifying L2CAP channels of the connection. This is so that the L2CAP channel implementations can perform checks on what the remote side supports (e.g. does it support the fixed channel in question). So far the code has relied on the l2cap_do_start() function to initiate the Information Request, however l2cap_do_start() is used on a per-channel basis and only for connection-oriented channels. This means that if there are no connection-oriented channels on the system we would never start the Information Request procedure. This patch creates a new l2cap_request_info() helper function to initiate the Information Request procedure, and ensures that it is called whenever a BR/EDR connection has been established. The patch also updates fixed channels to be notified of connection readiness only once the Information Request procedure has completed. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-11 02:45:24 +02:00
Johan Hedberg	a6f7833ca3	Bluetooth: Add smp_ltk_sec_level() helper function There are several places that need to determine the security level that an LTK can provide. This patch adds a convenience function for this to help make the code more readable. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-11 02:45:24 +02:00
Johan Hedberg	1afc2a1ab6	Bluetooth: Fix SMP security level when we have no IO capabilities When the local IO capability is NoInputNoOutput any attempt to convert the remote authentication requirement to a target security level is futile. This patch makes sure that we set the target security level at most to MEDIUM if the local IO capability is NoInputNoOutput. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-11 02:45:24 +02:00
Johan Hedberg	24bd0bd94e	Bluetooth: Centralize disallowing SMP commands to a single place All the cases where we mark SMP commands as dissalowed are their respective command handlers. We can therefore simplify the code by always clearing the bit immediately after testing it. This patch converts the corresponding test_bit() call to a test_and_clear_bit() call and also removes the now unused SMP_DISALLOW_CMD macro. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-11 02:45:24 +02:00
Johan Hedberg	c05b9339c8	Bluetooth: Fix ignoring unknown SMP authentication requirement bits The SMP specification states that we should ignore any unknown bits from the authentication requirement. We already have a define for masking out unknown bits but we haven't used it in all places so far. This patch adds usage of the AUTH_REQ_MASK to all places that need it and ensures that we don't pass unknown bits onward to other functions. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-11 02:45:24 +02:00
Johan Hedberg	3a7dbfb8ff	Bluetooth: Remove unnecessary early initialization of variable We do nothing else with the auth variable in smp_cmd_pairing_rsp() besides passing it to tk_request() which in turn only cares about whether one of the sides had the MITM bit set. It is therefore unnecessary to assign a value to it until just before calling tk_request(), and this value can simply be the bit-wise or of the local and remote requirements. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-11 02:45:24 +02:00
Erik Hugne	0fc4dffad1	tipc: fix sparse warnings This fixes the following sparse warnings: sparse: symbol 'tipc_update_nametbl' was not declared. Should it be static? Also, the function is changed to return bool upon success, rather than a potentially freed pointer. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-10 14:00:58 -07:00
Chris Perl	0f7a622ca6	rpc: xs_bind - do not bind when requesting a random ephemeral port When attempting to establish a local ephemeral endpoint for a TCP or UDP socket, do not explicitly call bind, instead let it happen implicilty when the socket is first used. The main motivating factor for this change is when TCP runs out of unique ephemeral ports (i.e. cannot find any ephemeral ports which are not a part of any TCP connection). In this situation if you explicitly call bind, then the call will fail with EADDRINUSE. However, if you allow the allocation of an ephemeral port to happen implicitly as part of connect (or other functions), then ephemeral ports can be reused, so long as the combination of (local_ip, local_port, remote_ip, remote_port) is unique for TCP sockets on the system. This doesn't matter for UDP sockets, but it seemed easiest to treat TCP and UDP sockets the same. This can allow mount.nfs(8) to continue to function successfully, even in the face of misbehaving applications which are creating a large number of TCP connections. Signed-off-by: Chris Perl <chris.perl@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-09-10 12:47:00 -07:00
David S. Miller	0aac383353	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Pablo Neira Ayuso says: ==================== nf-next pull request The following patchset contains Netfilter/IPVS updates for your net-next tree. Regarding nf_tables, most updates focus on consolidating the NAT infrastructure and adding support for masquerading. More specifically, they are: 1) use __u8 instead of u_int8_t in arptables header, from Mike Frysinger. 2) Add support to match by skb->pkttype to the meta expression, from Ana Rey. 3) Add support to match by cpu to the meta expression, also from Ana Rey. 4) A smatch warning about IPSET_ATTR_MARKMASK validation, patch from Vytas Dauksa. 5) Fix netnet and netportnet hash types the range support for IPv4, from Sergey Popovich. 6) Fix missing-field-initializer warnings resolved, from Mark Rustad. 7) Dan Carperter reported possible integer overflows in ipset, from Jozsef Kadlecsick. 8) Filter out accounting objects in nfacct by type, so you can selectively reset quotas, from Alexey Perevalov. 9) Move specific NAT IPv4 functions to the core so x_tables and nf_tables can share the same NAT IPv4 engine. 10) Use the new NAT IPv4 functions from nft_chain_nat_ipv4. 11) Move specific NAT IPv6 functions to the core so x_tables and nf_tables can share the same NAT IPv4 engine. 12) Use the new NAT IPv6 functions from nft_chain_nat_ipv6. 13) Refactor code to add nft_delrule(), which can be reused in the enhancement of the NFT_MSG_DELTABLE to remove a table and its content, from Arturo Borrero. 14) Add a helper function to unregister chain hooks, from Arturo Borrero. 15) A cleanup to rename to nft_delrule_by_chain for consistency with the new nft_*() functions, also from Arturo. 16) Add support to match devgroup to the meta expression, from Ana Rey. 17) Reduce stack usage for IPVS socket option, from Julian Anastasov. 18) Remove unnecessary textsearch state initialization in xt_string, from Bojan Prtvar. 19) Add several helper functions to nf_tables, more work to prepare the enhancement of NFT_MSG_DELTABLE, again from Arturo Borrero. 20) Enhance NFT_MSG_DELTABLE to delete a table and its content, from Arturo Borrero. 21) Support NAT flags in the nat expression to indicate the flavour, eg. random fully, from Arturo. 22) Add missing audit code to ebtables when replacing tables, from Nicolas Dichtel. 23) Generalize the IPv4 masquerading code to allow its re-use from nf_tables, from Arturo. 24) Generalize the IPv6 masquerading code, also from Arturo. 25) Add the new masq expression to support IPv4/IPv6 masquerading from nf_tables, also from Arturo. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-10 12:46:32 -07:00
Joe Perches	b167a37c7b	netfilter: Convert pr_warning to pr_warn Use the more common pr_warn. Other miscellanea: o Coalesce formats o Realign arguments Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-10 12:40:10 -07:00
Joe Perches	47c4cfc37f	iucv: Convert pr_warning to pr_warn Use the more common pr_warn. Coalesce formats. Realign arguments. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-10 12:40:10 -07:00
Joe Perches	294a0b7f31	pktgen: Convert pr_warning to pr_warn Use the more common pr_warn. Realign arguments. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-10 12:40:10 -07:00
Joe Perches	ef423a4109	atm: Convert pr_warning to pr_warn Use the more common pr_warn. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-10 12:40:10 -07:00
Ilya Dryomov	c27a3e4d66	libceph: do not hard code max auth ticket len We hard code cephx auth ticket buffer size to 256 bytes. This isn't enough for any moderate setups and, in case tickets themselves are not encrypted, leads to buffer overflows (ceph_x_decrypt() errors out, but ceph_decode_copy() doesn't - it's just a memcpy() wrapper). Since the buffer is allocated dynamically anyway, allocated it a bit later, at the point where we know how much is going to be needed. Fixes: http://tracker.ceph.com/issues/8979 Cc: stable@vger.kernel.org Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@redhat.com>	2014-09-10 20:08:36 +04:00
Ilya Dryomov	597cda3577	libceph: add process_one_ticket() helper Add a helper for processing individual cephx auth tickets. Needed for the next commit, which deals with allocating ticket buffers. (Most of the diff here is whitespace - view with git diff -b). Cc: stable@vger.kernel.org Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@redhat.com>	2014-09-10 20:08:35 +04:00
Sage Weil	73c3d4812b	libceph: gracefully handle large reply messages from the mon We preallocate a few of the message types we get back from the mon. If we get a larger message than we are expecting, fall back to trying to allocate a new one instead of blindly using the one we have. CC: stable@vger.kernel.org Signed-off-by: Sage Weil <sage@redhat.com> Reviewed-by: Ilya Dryomov <ilya.dryomov@inktank.com>	2014-09-10 20:08:32 +04:00
Tom Herbert	19424e052f	sit: Add gro callbacks to sit_offload Add ipv6_gro_receive and ipv6_gro_complete to sit_offload to support GRO. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-09 21:29:33 -07:00
Tom Herbert	9667e9bb3f	ipip: Add gro callbacks to ipip offload Add inet_gro_receive and inet_gro_complete to ipip_offload to support GRO. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-09 21:29:33 -07:00
Tom Herbert	03d56daafe	ipv6: Clear flush_id to make GRO work In TCP gro we check flush_id which is derived from the IP identifier. In IPv4 gro path the flush_id is set with the expectation that every matched packet increments IP identifier. In IPv6, the flush_id is never set and thus is uinitialized. What's worse is that in IPv6 over IPv4 encapsulation, the IP identifier is taken from the outer header which is currently not incremented on every packet for Linux stack, so GRO in this case never matches packets (identifier is not increasing). This patch clears flush_id for every time for a matched packet in IPv6 gro_receive. We need to do this each time to overwrite the setting that would be done in IPv4 gro_receive per the outer header in IPv6 over Ipv4 encapsulation. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-09 21:29:33 -07:00
David Howells	ed3bfdfdce	RxRPC: Fix missing __user annotation Fix a missing __user annotation in a cast of a user space pointer (found by checker). Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-09 20:39:40 -07:00
Florian Westphal	46cfd725c3	net: use kfree_skb_list() helper in more places Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-09 20:10:45 -07:00
Eric Dumazet	72bb17b37b	ipv4: udp4_gro_complete() is static net/ipv4/udp_offload.c:339:5: warning: symbol 'udp4_gro_complete' was not declared. Should it be static? Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <therbert@google.com> Fixes: `57c67ff4bd` ("udp: additional GRO support") Acked-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-09 20:10:45 -07:00
Eric Dumazet	416c51e17b	netns: remove one sparse warning net/core/net_namespace.c:227:18: warning: incorrect type in argument 1 (different address spaces) net/core/net_namespace.c:227:18: expected void const <noident> net/core/net_namespace.c:227:18: got struct net_generic [noderef] <asn:4>gen We can use rcu_access_pointer() here as read-side access to the pointer was removed at least one grace period ago. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-09 20:10:45 -07:00
Eric Dumazet	cc9c668a08	ipv6: udp6_gro_complete() is static net/ipv6/udp_offload.c:159:5: warning: symbol 'udp6_gro_complete' was not declared. Should it be static? Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: `57c67ff4bd` ("udp: additional GRO support") Cc: Tom Herbert <therbert@google.com> Acked-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-09 20:10:44 -07:00
Eric Dumazet	8e380f004e	ipv4: rcu cleanup in ip_ra_control() Remove one sparse warning : net/ipv4/ip_sockglue.c:328:22: warning: incorrect type in assignment (different address spaces) net/ipv4/ip_sockglue.c:328:22: expected struct ip_ra_chain [noderef] <asn:4>next net/ipv4/ip_sockglue.c:328:22: got struct ip_ra_chain [assigned] ra And replace one rcu_assign_ptr() by RCU_INIT_POINTER() where applicable. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-09 20:10:44 -07:00
Daniel Borkmann	cbeddd5d16	ipv6: mcast: remove dead debugging defines It's not used anywhere, so just remove these. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-09 20:10:44 -07:00
Ani Sinha	6a2a2b3ae0	net:socket: set msg_namelen to 0 if msg_name is passed as NULL in msghdr struct from userland. Linux manpage for recvmsg and sendmsg calls does not explicitly mention setting msg_namelen to 0 when msg_name passed set as NULL. When developers don't set msg_namelen member in msghdr, it might contain garbage value which will fail the validation check and sendmsg and recvmsg calls from kernel will return EINVAL. This will break old binaries and any code for which there is no access to source code. To fix this, we set msg_namelen to 0 when msg_name is passed as NULL from userland. Signed-off-by: Ani Sinha <ani@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-09 17:35:46 -07:00
Willem de Bruijn	67cc0d4077	net-timestamp: optimize sock_tx_timestamp default path Few packets have timestamping enabled. Exit sock_tx_timestamp quickly in this common case. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-09 17:34:41 -07:00
Florian Westphal	17448e5f63	net_sched: sfq: remove unused macro not used anymore since `ddecf0f` (net_sched: sfq: add optional RED on top of SFQ). Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-09 17:34:41 -07:00
Daniel Borkmann	286aad3c40	net: bpf: be friendly to kmemcheck Reported by Mikulas Patocka, kmemcheck currently barks out a false positive since we don't have special kmemcheck annotation for bitfields used in bpf_prog structure. We currently have jited:1, len:31 and thus when accessing len while CONFIG_KMEMCHECK enabled, kmemcheck throws a warning that we're reading uninitialized memory. As we don't need the whole bit universe for pages member, we can just split it to u16 and use a bool flag for jited instead of a bitfield. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-09 16:58:56 -07:00
Eric Dumazet	ca777eff51	tcp: remove dst refcount false sharing for prequeue mode Alexander Duyck reported high false sharing on dst refcount in tcp stack when prequeue is used. prequeue is the mechanism used when a thread is blocked in recvmsg()/read() on a TCP socket, using a blocking model rather than select()/poll()/epoll() non blocking one. We already try to use RCU in input path as much as possible, but we were forced to take a refcount on the dst when skb escaped RCU protected region. When/if the user thread runs on different cpu, dst_release() will then touch dst refcount again. Commit `093162553c` (tcp: force a dst refcount when prequeue packet) was an example of a race fix. It turns out the only remaining usage of skb->dst for a packet stored in a TCP socket prequeue is IP early demux. We can add a logic to detect when IP early demux is probably going to use skb->dst. Because we do an optimistic check rather than duplicate existing logic, we need to guard inet_sk_rx_dst_set() and inet6_sk_rx_dst_set() from using a NULL dst. Many thanks to Alexander for providing a nice bug report, git bisection, and reproducer. Tested using Alexander script on a 40Gb NIC, 8 RX queues. Hosts have 24 cores, 48 hyper threads. echo 0 >/proc/sys/net/ipv4/tcp_autocorking for i in `seq 0 47` do for j in `seq 0 2` do netperf -H $DEST -t TCP_STREAM -l 1000 \ -c -C -T $i,$i -P 0 -- \ -m 64 -s 64K -D & done done Before patch : ~6Mpps and ~95% cpu usage on receiver After patch : ~9Mpps and ~35% cpu usage on receiver. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-09 16:54:41 -07:00
Johan Hedberg	196332f5a1	Bluetooth: Fix allowing SMP Signing info PDU If the remote side is not distributing its IRK but is distributing the CSRK the next PDU after master identification is the Signing Information. This patch fixes a missing SMP_ALLOW_CMD() for this in the smp_cmd_master_ident() function. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-10 01:45:01 +02:00
Jeff Layton	e0b93eddfe	security: make security_file_set_fowner, f_setown and __f_setown void return security_file_set_fowner always returns 0, so make it f_setown and __f_setown void return functions and fix up the error handling in the callers. Cc: linux-security-module@vger.kernel.org Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2014-09-09 16:01:36 -04:00
Li RongQing	e403aded79	openvswitch: change the data type of error status to atomic_long_t Change the date type of error status from u64 to atomic_long_t, and use atomic operation, then remove the lock which is used to protect the error status. The operation of atomic maybe faster than spin lock. Cc: Pravin Shelar <pshelar@nicira.com> Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-09 11:48:07 -07:00
Rami Rosen	5aaa62d608	bridge: Cleanup of unncessary check. This patch removes an unncessary check in the br_afspec() method of br_netlink.c. Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-09 11:32:11 -07:00
Jiri Pirko	1332351617	bridge: implement rtnl_link_ops->changelink Allow rtnetlink users to set bridge master info via IFLA_INFO_DATA attr This initial part implements forward_delay, hello_time, max_age options. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-09 11:29:55 -07:00
Jiri Pirko	e5c3ea5c66	bridge: implement rtnl_link_ops->get_size and rtnl_link_ops->fill_info Allow rtnetlink users to get bridge master info in IFLA_INFO_DATA attr This initial part implements forward_delay, hello_time, max_age options. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-09 11:29:55 -07:00
Jiri Pirko	3ac636b859	bridge: implement rtnl_link_ops->slave_changelink Allow rtnetlink users to set port info via IFLA_INFO_SLAVE_DATA attr Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-09 11:29:55 -07:00
Jiri Pirko	ced8283f90	bridge: implement rtnl_link_ops->get_slave_size and rtnl_link_ops->fill_slave_info Allow rtnetlink users to get port info in IFLA_INFO_SLAVE_DATA attr Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-09 11:29:55 -07:00
Jiri Pirko	0f49579a39	bridge: switch order of rx_handler reg and upper dev link The thing is that netdev_master_upper_dev_link calls call_netdevice_notifiers(NETDEV_CHANGEUPPER, dev). That generates rtnl link message and during that, rtnl_link_ops->fill_slave_info is called. But with current ordering, rx_handler and IFF_BRIDGE_PORT are not set yet so there would have to be check for that in fill_slave_info callback. Resolve this by reordering to similar what bonding and team does to avoid the check. Also add removal of IFF_BRIDGE_PORT flag into error path. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-09 11:29:54 -07:00
John W. Linville	ab09b95cbf	Two more fixes for mac80211 - one of them addresses a long-standing issue that we only found when using vendor events more frequently; the other addresses some bad information being reported in userspace that people were starting to actually look at. -----BEGIN PGP SIGNATURE----- iQIcBAABCAAGBQJUDcHCAAoJEDBSmw7B7bqr4wAQALogLihe4uESol038EXLA6yi NqoDQLUKtpcbWzhZKQZlOf7W53ggIXHUeIYSS2SXpn9dOX/9VniA5MkzbJK70uCf oVaXkSHujFE7JkLZ5Pusto4WtCLSfmlWJTRt3SN2BEpGcW0boYM8FxVuKBMtIj7s H56eVcTSCWCQrMbCfQcS2pDkkRLQL2MYmA0CAT/Hbxy7KjBYuxMJvXz9HsoTKotj Zj86UzisZ2QJSyxRtV5v7Z95LTcQtRKCKQp1kIV+64Q7c/ZOeTZ6l//52MqUhLpH vIwfvAcW02iCWrN8d/lulkAKPfw4RCPvoEWt9sIsp9WQjwWhrsBXLocw6XiuAFHZ j2EGoZvOBJV43FQOSw9Hli232QkwHh2QTiLEObNaVbG0wTMuEfRcjIjKsSpJ/WRq HfSGzg32X4XLPVh2EJ5n7qbChPbTzgBv+ydU4ApESHFUJHmLPWrbKNgTEs14llpr oIM+QVFA4o3vukaZIL/wrPk9dfbTSeEOHvpLeJZ2BCqOko8WyitZqoD54NeaPthf u/S4QzeRifdHB3gPWI1NNIC5jAgkgA9Zy0a+4xZP75bSBPfPB4PeM9z5yXdv+y6G mT9KWoTUltTBoHRwDceb4BHOYXX4xjwZSfcWwjTs1q4A83a9PQidkG2RelJJEiGQ 8WIVmGSlf6MpD2mZBnJc =gW3Y -----END PGP SIGNATURE----- Merge tag 'mac80211-for-john-2014-09-08' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg <johannes@sipsolutions.net> says: "Two more fixes for mac80211 - one of them addresses a long-standing issue that we only found when using vendor events more frequently; the other addresses some bad information being reported in userspace that people were starting to actually look at." Signed-off-by: John W. Linville <linville@tuxdriver.com>	2014-09-09 14:29:36 -04:00
Vincent Bernat	49a601589c	net/ipv4: bind ip_nonlocal_bind to current netns net.ipv4.ip_nonlocal_bind sysctl was global to all network namespaces. This patch allows to set a different value for each network namespace. Signed-off-by: Vincent Bernat <vincent@bernat.im> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-09 11:27:09 -07:00
Arturo Borrero	9ba1f726be	netfilter: nf_tables: add new nft_masq expression The nft_masq expression is intended to perform NAT in the masquerade flavour. We decided to have the masquerade functionality in a separated expression other than nft_nat. Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-09-09 16:31:30 +02:00
Arturo Borrero	be6b635cd6	netfilter: nf_nat: generalize IPv6 masquerading support for nf_tables Let's refactor the code so we can reach the masquerade functionality from outside the xt context (ie. nftables). The patch includes the addition of an atomic counter to the masquerade notifier: the stuff to be done by the notifier is the same for xt and nftables. Therefore, only one notification handler is needed. This factorization only involves IPv6; a similar patch exists to handle IPv4. Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-09-09 16:31:29 +02:00
Arturo Borrero	8dd33cc93e	netfilter: nf_nat: generalize IPv4 masquerading support for nf_tables Let's refactor the code so we can reach the masquerade functionality from outside the xt context (ie. nftables). The patch includes the addition of an atomic counter to the masquerade notifier: the stuff to be done by the notifier is the same for xt and nftables. Therefore, only one notification handler is needed. This factorization only involves IPv4; a similar patch follows to handle IPv6. Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-09-09 16:31:29 +02:00
Nicolas Dichtel	c55fbbb4a7	netfilter: ebtables: create audit records for replaces This is already done for x_tables (family AF_INET and AF_INET6), let's do it for AF_BRIDGE also. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-09-09 16:31:28 +02:00
Arturo Borrero	e42eff8a32	netfilter: nft_nat: include a flag attribute Both SNAT and DNAT (and the upcoming masquerade) can have additional configuration parameters, such as port randomization and NAT addressing persistence. We can cover these scenarios by simply adding a flag attribute for userspace to fill when needed. The flags to use are defined in include/uapi/linux/netfilter/nf_nat.h: NF_NAT_RANGE_MAP_IPS NF_NAT_RANGE_PROTO_SPECIFIED NF_NAT_RANGE_PROTO_RANDOM NF_NAT_RANGE_PERSISTENT NF_NAT_RANGE_PROTO_RANDOM_FULLY NF_NAT_RANGE_PROTO_RANDOM_ALL The caller must take care of not messing up with the flags, as they are added unconditionally to the final resulting nf_nat_range. Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-09-09 16:31:27 +02:00
Arturo Borrero	b9ac12ef09	netfilter: nf_tables: extend NFT_MSG_DELTABLE to support flushing the ruleset This patch extend the NFT_MSG_DELTABLE call to support flushing the entire ruleset. The options now are: * No family speficied, no table specified: flush all the ruleset. * Family specified, no table specified: flush all tables in the AF. * Family specified, table specified: flush the given table. Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-09-09 16:31:26 +02:00
Arturo Borrero	ee01d54256	netfilter: nf_tables: add helpers to schedule objects deletion This patch refactor the code to schedule objects deletion. They are useful in follow-up patches. In order to be able to use these new helper functions in all the code, they are placed in the top of the file, with all the dependant functions and symbols. nft_rule_disactivate_next has been renamed to nft_rule_deactivate. Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-09-09 16:31:25 +02:00
Bojan Prtvar	c435201bed	netfilter: xt_string: Remove unnecessary initialization of struct ts_state The skb_find_text() accepts uninitialized textsearch state variable. Signed-off-by: Bojan Prtvar <prtvar.b@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-09-09 16:31:25 +02:00
Julian Anastasov	5fcf0cf607	ipvs: reduce stack usage for sockopt data Use union to reserve the required stack space for sockopt data which is less than the currently hardcoded value of 128. Now the tables for commands should be more readable. The checks added for readability are optimized by compiler, others warn at compile time if command uses too much stack or exceeds the storage of set_arglen and get_arglen. As Dan Carpenter points out, we can run for unprivileged user, so we can silent some error messages. Signed-off-by: Julian Anastasov <ja@ssi.bg> CC: Dan Carpenter <dan.carpenter@oracle.com> CC: Andrey Utkin <andrey.krieger.utkin@gmail.com> CC: David Binderman <dcb314@hotmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-09-09 16:31:24 +02:00
Ana Rey	3045d76070	netfilter: nf_tables: add devgroup support in meta expresion Add devgroup support to let us match device group of a packets incoming or outgoing interface. Signed-off-by: Ana Rey <anarey@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-09-09 16:31:23 +02:00
Arturo Borrero	ce24b7217b	netfilter: nf_tables: rename nf_table_delrule_by_chain() For the sake of homogenize the function naming scheme, let's rename nf_table_delrule_by_chain() to nft_delrule_by_chain(). Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-09-09 16:31:22 +02:00
Arturo Borrero	c559879406	netfilter: nf_tables: add helper to unregister chain hooks This patch adds a helper function to unregister chain hooks in the chain deletion path. Basically, a code factorization. The new function is useful in follow-up patches. Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-09-09 16:31:21 +02:00
Arturo Borrero	5e266fe7c0	netfilter: nf_tables: refactor rule deletion helper This helper function always schedule the rule to be removed in the following transaction. In follow-up patches, it is interesting to handle separately the logic of rule activation/disactivation from the transaction mechanism. So, this patch simply splits the original nf_tables_delrule_one() in two functions, allowing further control. While at it, for the sake of homigeneize the function naming scheme, let's rename nf_tables_delrule_one() to nft_delrule(). Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-09-09 16:31:20 +02:00
Pablo Neira Ayuso	876665eafc	netfilter: nft_chain_nat_ipv6: use generic IPv6 NAT code from core Use the exported IPv6 NAT functions that are provided by the core. This removes duplicated code so iptables and nft use the same NAT codebase. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-09-09 16:31:09 +02:00
Pablo Neira Ayuso	2a5538e9aa	netfilter: nat: move specific NAT IPv6 to core Move the specific NAT IPv6 core functions that are called from the hooks from ip6table_nat.c to nf_nat_l3proto_ipv6.c. This prepares the ground to allow iptables and nft to use the same NAT engine code that comes in a follow up patch. This also renames nf_nat_ipv6_fn to nft_nat_ipv6_fn in net/ipv6/netfilter/nft_chain_nat_ipv6.c to avoid a compilation breakage. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-09-09 16:30:00 +02:00
Jukka Rissanen	39e90c7763	Bluetooth: 6lowpan: Route packets that are not meant to peer via correct device Packets that are supposed to be delivered via the peer device need to be checked and sent to correct device. This requires that user has set the routes properly so that the 6lowpan module can then figure out the destination gateway and the correct Bluetooth device. Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: stable@vger.kernel.org # 3.17.x	2014-09-09 15:51:47 +02:00
Jukka Rissanen	b2799cec22	Bluetooth: 6lowpan: Set the peer IPv6 address correctly The peer IPv6 address contained wrong U/L bit in the EUI-64 part. Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: stable@vger.kernel.org # 3.17.x	2014-09-09 15:51:47 +02:00
Jukka Rissanen	2ae50d8d3a	Bluetooth: 6lowpan: Increase the connection timeout value Use the default connection timeout value defined in l2cap.h because the current timeout was too short and most of the time the connection attempts timed out. Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: stable@vger.kernel.org # 3.17.x	2014-09-09 15:51:47 +02:00
Johan Hedberg	e1e930f591	Bluetooth: Fix mgmt pairing failure when authentication fails Whether through HCI with BR/EDR or SMP with LE when authentication fails we should also notify any pending Pair Device mgmt command. This patch updates the mgmt_auth_failed function to take the actual hci_conn object and makes sure that any pending pairing command is notified and cleaned up appropriately. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-09 03:12:15 +02:00
David S. Miller	5b4c314575	Merge tag 'master-2014-09-08' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== pull request: wireless-next 2014-09-08 Please pull this batch of updates intended for the 3.18 stream... For the mac80211 bits, Johannes says: "Not that much content this time. Some RCU cleanups, crypto performance improvements, and various patches all over, rather than listing them one might as well look into the git log instead." For the Bluetooth bits, Gustavo says: "The changes consists of: - Coding style fixes to HCI drivers - Corrupted ack value fix for the H5 HCI driver - A couple of Enhanced L2CAP fixes - Conversion of SMP code to use common L2CAP channel API - Page scan optimizations when using the kernel-side whitelist - Various mac802154 and and ieee802154 6lowpan cleanups - One new Atheros USB ID" For the iwlwifi bits, Emmanuel says: "We have a new big thing coming up which is called Dynamic Queue Allocation (or DQA). This is a completely new way to work with the Tx queues and it requires major refactoring. This is being done by Johannes and Avri. Besides this, Johannes disables U-APSD by default because of APs that would disable A-MPDU if the association supports U-ASPD. Luca contributed to the power area which he was cleaning up on the way while working on CSA. A few more random things here and there." For the Atheros bits, Kalle says: "For ath6kl we had two small fixes and a new SDIO device id. For ath10k the bigger changes are: * support for new firmware version 10.2 (Michal) * spectral scan support (Simon, Sven & Mathias) * export a firmware crash dump file (Ben & me) * cleaning up of pci.c (Michal) * print pci id in all messages, which causes most of the churn (Michal)" Beyond that, we have the usual collection of various updates to ath9k, b43, mwifiex, and wil6210, as well as a few other bits here and there. Please let me know if there are problems! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-08 16:43:58 -07:00
Willem de Bruijn	a7f26b7e1e	inet: remove dead inetpeer sequence code inetpeer sequence numbers are no longer incremented, so no need to check and flush the tree. The function that increments the sequence number was already dead code and removed in in "ipv4: remove unused function" (`068a6e18`). Remove the code that checks for a change, too. Verifying that v4_seq and v6_seq are never incremented and thus that flush_check compares bp->flush_seq to 0 is trivial. The second part of the change removes flush_check completely even though bp->flush_seq is exactly !0 once, at initialization. This change is correct because the time this branch is true is when bp->root == peer_avl_empty_rcu, in which the branch and inetpeer_invalidate_tree are a NOOP. Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-08 16:42:42 -07:00
Tom Herbert	1e701f1698	net: Fix GRE RX to use skb_transport_header for GRE header offset GRE assumes that the GRE header is at skb_network_header + ip_hrdlen(skb). It is more general to use skb_transport_header and this allows the possbility of inserting additional header between IP and GRE (which is what we will done in Generic UDP Encapsulation for GRE). Signed-off-by: Tom Herbert <therbert@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-08 15:23:05 -07:00
Eric Dumazet	82d5e2b8b4	net: fix skb_page_frag_refill() kerneldoc In commit `d9b2938aab` ("net: attempt a single high order allocation) I forgot to update kerneldoc, as @prio parameter was renamed to @gfp Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-08 14:12:14 -07:00
Johan Hedberg	c68b7f127d	Bluetooth: Fix dereferencing conn variable before NULL check This patch fixes the following type of static analyzer warning (and probably a real bug as well as the NULL check should be there for a reason): net/bluetooth/smp.c:1182 smp_conn_security() warn: variable dereferenced before check 'conn' (see line 1174) Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-08 19:07:57 +02:00
Behan Webster	9f06a8d623	Bluetooth: LLVMLinux: Remove VLAIS from bluetooth/amp.c Replaced the use of a Variable Length Array In Struct (VLAIS) with a C99 compliant equivalent. This patch allocates the appropriate amount of memory using an char array. The new code can be compiled with both gcc and clang. struct shash_desc contains a flexible array member member ctx declared with CRYPTO_MINALIGN_ATTR, so sizeof(struct shash_desc) aligns the beginning of the array declared after struct shash_desc with long long. No trailing padding is required because it is not a struct type that can be used in an array. The CRYPTO_MINALIGN_ATTR is required so that desc is aligned with long long as would be the case for a struct containing a member with CRYPTO_MINALIGN_ATTR. Signed-off-by: Behan Webster <behanw@converseincode.com> Signed-off-by: Mark Charlebois <charlebm@gmail.com> Signed-off-by: Jan-Simon Möller <dl9pf@gmx.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-08 19:07:56 +02:00
Johan Hedberg	b28b494366	Bluetooth: Add strict checks for allowed SMP PDUs SMP defines quite clearly when certain PDUs are to be expected/allowed and when not, but doesn't have any explicit request/response definition. So far the code has relied on each PDU handler to behave correctly if receiving PDUs at an unexpected moment, however this requires many different checks and is prone to errors. This patch introduces a generic way to keep track of allowed PDUs and thereby reduces the responsibility & load on individual command handlers. The tracking is implemented using a simple bit-mask where each opcode maps to its own bit. If the bit is set the corresponding PDU is allow and if the bit is not set the PDU is not allowed. As a simple example, when we send the Pairing Request we'd set the bit for Pairing Response, and when we receive the Pairing Response we'd clear the bit for Pairing Response. Since the disallowed PDU rejection is now done in a single central place we need to be a bit careful of which action makes most sense to all cases. Previously some, such as Security Request, have been simply ignored whereas others have caused an explicit disconnect. The only PDU rejection action that keeps good interoperability and can be used for all the applicable use cases is to drop the data. This may raise some concerns of us now being more lenient for misbehaving (and potentially malicious) devices, but the policy of simply dropping data has been a successful one for many years e.g. in L2CAP (where this is the only policy for such cases - we never request disconnection in l2cap_core.c because of bad data). Furthermore, we cannot prevent connected devices from creating the SMP context (through a Security or Pairing Request), and once the context exists looking up the corresponding bit for the received opcode and deciding to reject it is essentially an equally lightweight operation as the kind of rejection that l2cap_core.c already successfully does. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-08 19:07:56 +02:00
Johan Hedberg	c6e81e9ae6	Bluetooth: Fix calling smp_distribute_keys() when still waiting for keys When we're in the process of receiving keys in phase 3 of SMP we keep track of which keys are still expected in the smp->remote_key_dist variable. If we still have some key bits set we need to continue waiting for more PDUs and not needlessly call smp_distribute_keys(). This patch fixes two such cases in the smp_cmd_master_ident() and smp_cmd_ident_addr_info() handler functions. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-08 19:07:56 +02:00
Johan Hedberg	88d3a8acf3	Bluetooth: Add define for key distribution mask This patch adds a define for the allowed bits of the key distribution mask so we don't have to have magic 0x07 constants throughout the code. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-08 19:07:56 +02:00
Johan Hedberg	fc75cc8684	Bluetooth: Fix locking of the SMP context Before the move the l2cap_chan the SMP context (smp_chan) didn't have any kind of proper locking. The best there existed was the HCI_CONN_LE_SMP_PEND flag which was used to enable mutual exclusion for potential multiple creators of the SMP context. Now that SMP has been converted to use the l2cap_chan infrastructure and since the SMP context is directly mapped to a corresponding l2cap_chan we get the SMP context locking essentially for free through the l2cap_chan lock. For all callbacks that l2cap_core.c makes for each channel implementation (smp.c in the case of SMP) the l2cap_chan lock is held through l2cap_chan_lock(chan). Since the calls from l2cap_core.c to smp.c are covered the only missing piece to have the locking implemented properly is to ensure that the lock is held for any other call path that may access the SMP context. This means user responses through mgmt.c, requests to elevate the security of a connection through hci_conn.c, as well as any deferred work through workqueues. This patch adds the necessary locking to all these other code paths that try to access the SMP context. Since mutual exclusion for the l2cap_chan access is now covered from all directions the patch also removes unnecessary HCI_CONN_LE_SMP_PEND flag (once we've acquired the chan lock we can simply check whether chan->smp is set to know if there's an SMP context). Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-08 19:07:56 +02:00
Johan Hedberg	d6268e86a1	Bluetooth: Remove unnecessary deferred work for SMP key distribution Now that the identity address update happens through its own deferred work there's no need to have smp_distribute_keys anymore behind a second deferred work. This patch removes this extra construction and makes the code do direct calls to smp_distribute_keys() again. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-08 19:07:56 +02:00
Johan Hedberg	f3d82d0c8e	Bluetooth: Move identity address update behind a workqueue The identity address update of all channels for an l2cap_conn needs to take the lock for each channel, i.e. it's safest to do this by a separate workqueue callback. Previously this was partially solved by moving the entire SMP key distribution behind a workqueue. However, if we want SMP context locking to be correct and safe we should always use the l2cap_chan lock when accessing it, meaning even smp_distribute_keys needs to take that lock which would once again create a dead lock when updating the identity address. The simplest way to solve this is to have l2cap_conn manage the deferred work which is what this patch does. A subsequent patch will remove the now unnecessary SMP key distribution work struct. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-08 19:07:55 +02:00
Johan Hedberg	84bc0db53b	Bluetooth: Don't take any action in smp_resume_cb if not encrypted When smp_resume_cb is called if we're not encrypted (i.e. the callback wasn't called because the connection became encrypted) we shouldn't take any action at all. This patch moves also the security_timer cancellation behind this condition. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-08 19:07:55 +02:00
Johan Hedberg	1b0921d6be	Bluetooth: Remove unnecessary checks after canceling SMP security timer The SMP security timer used to be able to modify the SMP context state but now days it simply calls hci_disconnect(). It is therefore unnecessary to have extra sanity checks for the SMP context after canceling the timer. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-08 19:07:55 +02:00
Johan Hedberg	434714dc02	Bluetooth: Add clarifying comment for LE CoC result value The "pending" L2CAP response value is not defined for LE CoC. This patch adds a clarifying comment to the code so that the reader will not think there is a bug in trying to use this value for LE CoC. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-08 19:07:55 +02:00
Johan Hedberg	839035a7b3	Bluetooth: Move clock offset reading into hci_disconnect() To give all hci_disconnect() users the advantage of getting the clock offset read automatically this patch moves the necessary code from hci_conn_timeout() into hci_disconnect(). This way we pretty much always update the clock offset when disconnecting. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-08 19:07:55 +02:00
Johan Hedberg	e3f2f92a04	Bluetooth: Use hci_disconnect() for mgmt_disconnect_device() There's no reason to custom build the HCI_Disconnect command in the Disconnect Device mgmt command handler. This patch updates the code to use hci_disconnect() instead. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-08 19:07:55 +02:00
Johan Hedberg	e3b679d56c	Bluetooth: Update hci_disconnect() to return an error value We'll soon use hci_disconnect() from places that are interested to know whether the hci_send_cmd() really succeeded or not. This patch updates hci_disconnect() to pass on any error returned from hci_send_cmd(). Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-08 19:07:55 +02:00
Johan Hedberg	9b7b18ef1b	Bluetooth: Fix SMP error and response to be mutually exclusive Returning failure from the SMP data parsing function will cause an immediate disconnect, making any attempts to send a response PDU futile. This patch updates the function to always either send a response or return an error, but never both at the same time: * In the case that HCI_LE_ENABLED is not set we want to send a Pairing Not Supported response but it is not required to force a disconnection, so do not set the error return in this case. * If we get garbage SMP data we can just fail with the handler function instead of also trying to send an SMP Failure PDU. * There's no reason to force a disconnection if we receive an unknown SMP command. Instead simply send a proper Command Not Supported SMP response. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-08 19:07:54 +02:00
Johan Hedberg	b04afa0c28	Bluetooth: Remove unused l2cap_conn_shutdown API Now that there are no more users of the l2cap_conn_shutdown API (since smp.c switched to using hci_disconnect) we can simply remove it along with all of it's l2cap_conn variables. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-08 19:07:54 +02:00
Johan Hedberg	1e91c29eb6	Bluetooth: Use hci_disconnect for immediate disconnection from SMP Relying on the l2cap_conn_del procedure (triggered through the l2cap_conn_shutdown API) to get the connection disconnected is not reliable as it depends on all users releasing (through hci_conn_drop) and that there's at least one user (so hci_conn_drop is called at least one time). A much simpler and more reliable solution is to call hci_disconnect() directly from the SMP code when we want to disconnect. One side-effect this has is that it prevents any SMP Failure PDU from being sent before the disconnection, however neither one of the scenarios where l2cap_conn_shutdown was used really requires this. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-08 19:07:54 +02:00
Johan Hedberg	e31fb86005	Bluetooth: Set discon_timeout to 0 in l2cap_conn_del When the l2cap_conn_del() function is used we do not want to wait around "in case something happens" before disconnecting. This patch sets the disconnection timeout to 0 so that the disconnection routines get immediately scheduled. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-08 19:07:54 +02:00
Johan Hedberg	bcbb655a18	Bluetooth: Remove hci_conn_hold/drop from hci_chan We can't have hci_chan contribute to the "active" reference counting of the hci_conn since otherwise the connection would never get dropped when there are no more users (since hci_chan would be counted as a user). This patch removes hold() when creating the hci_chan and drop() when destroying it. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-08 19:07:54 +02:00
Johan Hedberg	f94b665dcf	Bluetooth: Ignore incoming data after initiating disconnection When hci_chan_del is called the disconnection routines get scheduled through a workqueue. If there's any incoming ACL data before the routines get executed there's a chance that a new hci_chan is created and the disconnection never happens. This patch adds a new hci_conn flag to indicate that we're in the process of driving the connection down. We set the flag in hci_chan_del and check for it in hci_chan_create so that no new channels are created for the same connection. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-08 19:07:53 +02:00
Johan Hedberg	b3ff670a44	Bluetooth: Set disc_timeout to 0 when calling hci_chan_del The hci_chan_del() function is used in scenarios where we've decided we want to get rid of the underlying baseband link. It makes therefore sense to force the disc_timeout to 0 so that the disconnection routines are immediately scheduled. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-08 19:07:53 +02:00
Johan Hedberg	6c388d32ec	Bluetooth: Fix hci_conn reference counting with hci_chan The hci_chan_del() function was doing a hci_conn_drop() but there was no matching hci_conn_hold() in the hci_chan_create() function. Furthermore, as the hci_chan struct holds a pointer to the hci_conn there should be proper use of hci_conn_get/put. This patch fixes both issues so that hci_chan does correct reference counting of the hci_conn object. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-08 19:07:53 +02:00
Johan Hedberg	f6c6324969	Bluetooth: Refactor connection parameter freeing into its own function The necessary steps for freeing connection paramaters have grown quite a bit so we can simplify the code by factoring it out into its own function. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-08 19:07:53 +02:00
Johan Hedberg	f8aaf9b65a	Bluetooth: Fix using hci_conn_get() for hci_conn pointers Wherever we keep hci_conn pointers around we should be using hci_conn_get/put to ensure that they stay valid. This patch fixes all places violating against the principle currently. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-08 19:07:53 +02:00
Johan Hedberg	51bb8457dd	Bluetooth: Improve _get() functions to return the object type It's natural to have _get() functions that increment the reference count of an object to return the object type itself. This way it's simple to make a copy of the object pointer and increase the reference count in a single step. This patch updates two such get() functions, namely hci_conn_get() and l2cap_conn_get(), and updates the users to take advantage of the new API. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-08 19:07:52 +02:00
Johan Hedberg	5477610fc1	Bluetooth: Optimize connection parameter lookup for LE connections When we get an LE connection complete event there's really no reason to look through the entire connection parameter list as the entry should be present in the hdev->pend_le_conns list too. This patch changes the lookup code to do a more restricted lookup only in the pend_le_conns list. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-08 19:07:52 +02:00
Johan Hedberg	08853f18ea	Bluetooth: Set addr_type only when it's needed In the hci_le_conn_complete_evt() function there's no need to set the addr_type value until it's actually needed, i.e. for the black list lookup. This patch moves the code a bit further down in the function. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-08 19:07:52 +02:00
Johan Hedberg	c16900cf28	Bluetooth: Fix hci_conn reference counting for fixed channels Now that SMP has been converted to use fixed channels we've got a bit of a problem with the hci_conn reference counting. So far the L2CAP code has kept a reference for each L2CAP channel that was notified of the connection. With SMP however this would mean that the connection is never dropped even though there are no other users of it. Furthermore, SMP already does its own hci_conn reference counting internally, starting from a security or pairing request and ending with the key distribution. This patch makes L2CAP fixed channels default to the L2CAP core not keeping a hci_conn reference for them. A new FLAG_HOLD_HCI_CONN flag is added so that L2CAP users can declare an exception to this rule and hold a reference even for their fixed channels. One such exception is the L2CAP socket layer which does want a reference for each socket (e.g. an ATT socket which uses a fixed channel). Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-08 19:07:52 +02:00
Johan Hedberg	b3ed6c63f7	Bluetooth: Remove unnecessary l2cap_chan_unlock before l2cap_chan_add The l2cap_chan_add() function doesn't require the channel to be unlocked. It only requires the l2cap_conn to be unlocked. Therefore, it's unnecessary to unlock a channel before calling l2cap_chan_add(). This patch removes such unnecessary unlocking from the l2cap_chan_connect() function. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-08 19:07:52 +02:00
Johan Hedberg	72c6fb915f	Bluetooth: Fix incorrect LE CoC PDU length restriction based on HCI MTU The l2cap_create_le_flowctl_pdu() function that l2cap_segment_le_sdu() calls is perfectly capable of doing packet fragmentation if given bigger PDUs than the HCI buffers allow. Forcing the PDU length based on the HCI MTU (conn->mtu) would therefore needlessly strict operation on hardware with limited LE buffers (e.g. both Intel and Broadcom seem to have this set to just 27 bytes). This patch removes the restriction and makes it possible to send PDUs of the full length that the remote MPS value allows. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: stable@vger.kernel.org	2014-09-08 19:07:52 +02:00
John W. Linville	61a3d4f9d5	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless	2014-09-08 11:14:56 -04:00
Johannes Berg	b1e9be8775	mac80211: annotate MMIC head/tailroom warning This message occasionally triggers for some people as in https://bugzilla.redhat.com/show_bug.cgi?id=1111740 but it's not clear which (headroom or tailroom) is at fault. Annotate the message a bit to get more information. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-08 11:22:42 +02:00
Steinar H. Gunderson	c8d6591752	mac80211: support DTPC IE (from Cisco Client eXtensions) Linux already supports 802.11h, where the access point can tell the client to reduce its transmission power. However, 802.11h is only defined for 5 GHz, where the need for this is much smaller than on 2.4 GHz. Cisco has their own solution, called DTPC (Dynamic Transmit Power Control). Cisco APs on a controller sometimes but not always send 802.11h; they always send DTPC, even on 2.4 GHz. This patch adds support for parsing and honoring the DTPC IE in addition to the 802.11h element (they do not always contain the same limits, so both must be honored); the format is not documented, but very simple. Tested (on top of wireless.git and on 3.16.1) against a Cisco Aironet 1142 joined to a Cisco 2504 WLC, by setting various transmit power levels for the given access points and observing the results. The Wireshark 802.11 dissector agrees with the interpretation of the element, except for negative numbers, which seem to never happen anyway. Signed-off-by: Steinar H. Gunderson <sgunderson@bigfoot.com> Signed-off-by: Johannes Berg <johannes@sipsolutions.net>	2014-09-08 10:52:00 +02:00
Steinar H. Gunderson	24a4e4008c	mac80211: split 802.11h parsing from transmit power policy Decouple the logic of parsing the 802.11d and 802.11h IEs from the part of deciding what to do about the data (messaging, clamping to 0 dBm, doing the actual setting). This paves the way for the next patch, which introduces more data sources for transmit power limitation. Signed-off-by: Steinar H. Gunderson <sgunderson@bigfoot.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-08 10:43:18 +02:00
David S. Miller	eb84d6b604	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2014-09-07 21:41:53 -07:00
Tejun Heo	908c7f1949	percpu_counter: add @gfp to percpu_counter_init() Percpu allocator now supports allocation mask. Add @gfp to percpu_counter_init() so that !GFP_KERNEL allocation masks can be used with percpu_counters too. We could have left percpu_counter_init() alone and added percpu_counter_init_gfp(); however, the number of users isn't that high and introducing _gfp variants to all percpu data structures would be quite ugly, so let's just do the conversion. This is the one with the most users. Other percpu data structures are a lot easier to convert. This patch doesn't make any functional difference. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Jan Kara <jack@suse.cz> Acked-by: "David S. Miller" <davem@davemloft.net> Cc: x86@kernel.org Cc: Jens Axboe <axboe@kernel.dk> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org>	2014-09-08 09:51:29 +09:00
David S. Miller	45ce829dd0	Merge tag 'master-2014-09-04' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== pull request: wireless 2014-09-05 Please pull this batch of fixes intended for the 3.17 stream... For the mac80211 bits, Johannes says: "Here are a few fixes for mac80211. One has been discussed for a while and adds a terminating NUL-byte to the alpha2 sent to userspace, which shouldn't be necessary but since many places treat it as a string we couldn't move to just sending two bytes. In addition to that, we have two VLAN fixes from Felix, a mesh fix, a fix for the recently introduced RX aggregation offload, a revert for a broken patch (that luckily didn't really cause any harm) and a small fix for alignment in debugfs." For the iwlwifi bits, Emmanuel says: "I revert a patch that disabled CTS to self in dvm because users reported issues. The revert is CCed to stable since the offending patch was sent to stable too. I also bump the firmware API versions since a new firmware is coming up. On top of that, Marcel fixes a bug I introduced while fixing a bug in our Kconfig file." Please let me know if there are problems! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-07 16:11:10 -07:00
WANG Cong	de185ab46c	ipv6: restore the behavior of ipv6_sock_ac_drop() It is possible that the interface is already gone after joining the list of anycast on this interface as we don't hold a refcount for the device, in this case we are safe to ignore the error. What's more important, for API compatibility we should not change this behavior for applications even if it were correct. Fixes: commit `a9ed4a2986` ("ipv6: fix rtnl locking in setsockopt for anycast and multicast") Cc: Sabrina Dubroca <sd@queasysnail.net> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-07 16:10:07 -07:00
Andy Shevchenko	13aa3463e5	rose: use %ph specifier Instead of dereference each byte let's use %ph specifier in the printk() calls. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-07 16:07:25 -07:00
Pablo Neira Ayuso	679ab4ddbd	netfilter: xt_TPROXY: undefined reference to `udp6_lib_lookup' CONFIG_IPV6=m CONFIG_NETFILTER_XT_TARGET_TPROXY=y net/built-in.o: In function `nf_tproxy_get_sock_v6.constprop.11': >> xt_TPROXY.c:(.text+0x583a1): undefined reference to `udp6_lib_lookup' net/built-in.o: In function `tproxy_tg_init': >> xt_TPROXY.c:(.init.text+0x1dc3): undefined reference to `nf_defrag_ipv6_enable' This fix is similar to `1a5bbfc` ("netfilter: Fix build errors with xt_socket.c"). Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-09-07 17:25:16 +02:00
Neal Cardwell	87d943085b	tcp: remove obsolete comment about TCP_SKB_CB(skb)->when in tcp_fragment() The TCP_SKB_CB(skb)->when field no longer exists as of recent change `7faee5c0d5` ("tcp: remove TCP_SKB_CB(skb)->when"). And in any case, tcp_fragment() is called on already-transmitted packets from the __tcp_retransmit_skb() call site, so copying timestamps of any kind in this spot is quite sensible. Signed-off-by: Neal Cardwell <ncardwell@google.com> Reported-by: Yuchung Cheng <ycheng@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-06 12:29:10 -07:00
Eric Dumazet	7faee5c0d5	tcp: remove TCP_SKB_CB(skb)->when After commit `740b0f1841` ("tcp: switch rtt estimations to usec resolution"), we no longer need to maintain timestamps in two different fields. TCP_SKB_CB(skb)->when can be removed, as same information sits in skb_mstamp.stamp_jiffies Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 17:49:33 -07:00
Eric Dumazet	04317dafd1	tcp: introduce TCP_SKB_CB(skb)->tcp_tw_isn TCP_SKB_CB(skb)->when has different meaning in output and input paths. In output path, it contains a timestamp. In input path, it contains an ISN, chosen by tcp_timewait_state_process() Lets add a different name to ease code comprehension. Note that 'when' field will disappear in following patch, as skb_mstamp already contains timestamp, the anonymous union will promptly disappear as well. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 17:49:33 -07:00
Alexander Duyck	56193d1bce	net: Add function for parsing the header length out of linear ethernet frames This patch updates some of the flow_dissector api so that it can be used to parse the length of ethernet buffers stored in fragments. Most of the changes needed were to __skb_get_poff as it needed to be updated to support sending a linear buffer instead of a skb. I have split __skb_get_poff into two functions, the first is skb_get_poff and it retains the functionality of the original __skb_get_poff. The other function is __skb_get_poff which now works much like __skb_flow_dissect in relation to skb_flow_dissect in that it provides the same functionality but works with just a data buffer and hlen instead of needing an skb. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 17:47:02 -07:00
Alexander Duyck	82eabd9eb2	net: merge cases where sock_efree and sock_edemux are the same function Since sock_efree and sock_demux are essentially the same code for non-TCP sockets and the case where CONFIG_INET is not defined we can combine the code or replace the call to sock_edemux in several spots. As a result we can avoid a bit of unnecessary code or code duplication. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 17:43:45 -07:00
Alexander Duyck	62bccb8cdb	net-timestamp: Make the clone operation stand-alone from phy timestamping The phy timestamping takes a different path than the regular timestamping does in that it will create a clone first so that the packets needing to be timestamped can be placed in a queue, or the context block could be used. In order to support these use cases I am pulling the core of the code out so it can be used in other drivers beyond just phy devices. In addition I have added a destructor named sock_efree which is meant to provide a simple way for dropping the reference to skb exceptions that aren't part of either the receive or send windows for the socket, and I have removed some duplication in spots where this destructor could be used in place of sock_edemux. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 17:43:45 -07:00
Alexander Duyck	37846ef018	net-timestamp: Merge shared code between phy and regular timestamping This change merges the shared bits that exist between skb_tx_tstamp and skb_complete_tx_timestamp. By doing this we can avoid the two diverging as there were already changes pushed into skb_tx_tstamp that hadn't made it into the other function. In addition this resolves issues with the fact that skb_complete_tx_timestamp was included in linux/skbuff.h even though it was only compiled in if phy timestamping was enabled. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 17:43:45 -07:00
Eric Dumazet	d546c62154	ipv4: harden fnhe_hashfun() Lets make this hash function a bit secure, as ICMP attacks are still in the wild. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 17:40:33 -07:00
Masanari Iida	e793c0f70e	net: treewide: Fix typo found in DocBook/networking.xml This patch fix spelling typo found in DocBook/networking.xml. It is because the neworking.xml is generated from comments in the source, I have to fix typo in comments within the source. Signed-off-by: Masanari Iida <standby24x7@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 17:35:28 -07:00
Pablo Neira Ayuso	84a59ca55f	netfilter: add explicit Kconfig for NETFILTER_XT_NAT Paul Bolle reports that 'select NETFILTER_XT_NAT' from the IPV4 and IPV6 NAT tables becomes noop since there is no Kconfig switch for it. Add the Kconfig switch to resolve this problem. Fixes: `8993cf8` netfilter: move NAT Kconfig switches out of the iptables scope Reported-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 17:23:31 -07:00
Eric Dumazet	caa415270c	ipv4: fix a race in update_or_create_fnhe() nh_exceptions is effectively used under rcu, but lacks proper barriers. Between kzalloc() and setting of nh->nh_exceptions(), we need a proper memory barrier. Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: `4895c771c7` ("ipv4: Add FIB nexthop exceptions.") Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 17:15:50 -07:00
Nicolas Dichtel	e7478dfc46	ipv6: use addrconf_get_prefix_route() to remove peer addr addrconf_get_prefix_route() ensures to get the right route in the right table. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 17:13:24 -07:00
Nicolas Dichtel	f24062b07d	ipv6: fix a refcnt leak with peer addr There is no reason to take a refcnt before deleting the peer address route. It's done some lines below for the local prefix route because inet6_ifa_finish_destroy() will release it at the end. For the peer address route, we want to free it right now. This bug has been introduced by commit `caeaba7900` ("ipv6: add support of peer address"). Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 17:13:24 -07:00
Andy Zhou	29abe2fda5	l2tp: fix missing line continuation This syntax error was covered by L2TP_REFCNT_DEBUG not being set by default. Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 15:19:53 -07:00
Willem de Bruijn	c199105d15	net-timestamp: only report sw timestamp if reporting bit is set The timestamping API has separate bits for generating and reporting timestamps. A software timestamp should only be reported for a packet when the packet has the relevant generation flag (SKBTX_..) set and the socket has reporting bit SOF_TIMESTAMPING_SOFTWARE set. The second check was accidentally removed. Reinstitute the original behavior. Tested: Without this patch, Documentation/networking/txtimestamp reports timestamps regardless of whether SOF_TIMESTAMPING_SOFTWARE is set. After the patch, it only reports them when the flag is set. Fixes: `f24b9be595` ("net-timestamp: extend SCM_TIMESTAMPING ancillary data struct") Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 15:02:43 -07:00
Guillaume Nault	eed4d839b0	l2tp: fix race while getting PMTU on PPP pseudo-wire Use dst_entry held by sk_dst_get() to retrieve tunnel's PMTU. The dst_mtu(__sk_dst_get(tunnel->sock)) call was racy. __sk_dst_get() could return NULL if tunnel->sock->sk_dst_cache was reset just before the call, thus making dst_mtu() dereference a NULL pointer: [ 1937.661598] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 [ 1937.664005] IP: [<ffffffffa049db88>] pppol2tp_connect+0x33d/0x41e [l2tp_ppp] [ 1937.664005] PGD daf0c067 PUD d9f93067 PMD 0 [ 1937.664005] Oops: 0000 [#1] SMP [ 1937.664005] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables udp_tunnel pppoe pppox ppp_generic slhc deflate ctr twofish_generic twofish_x86_64_3way xts lrw gf128mul glue_helper twofish_x86_64 twofish_common blowfish_generic blowfish_x86_64 blowfish_common des_generic cbc xcbc rmd160 sha512_generic hmac crypto_null af_key xfrm_algo 8021q garp bridge stp llc tun atmtcp clip atm ext3 mbcache jbd iTCO_wdt coretemp kvm_intel iTCO_vendor_support kvm pcspkr evdev ehci_pci lpc_ich mfd_core i5400_edac edac_core i5k_amb shpchp button processor thermal_sys xfs crc32c_generic libcrc32c dm_mod usbhid sg hid sr_mod sd_mod cdrom crc_t10dif crct10dif_common ata_generic ahci ata_piix tg3 libahci libata uhci_hcd ptp ehci_hcd pps_core usbcore scsi_mod libphy usb_common [last unloaded: l2tp_core] [ 1937.664005] CPU: 0 PID: 10022 Comm: l2tpstress Tainted: G O 3.17.0-rc1 #1 [ 1937.664005] Hardware name: HP ProLiant DL160 G5, BIOS O12 08/22/2008 [ 1937.664005] task: ffff8800d8fda790 ti: ffff8800c43c4000 task.ti: ffff8800c43c4000 [ 1937.664005] RIP: 0010:[<ffffffffa049db88>] [<ffffffffa049db88>] pppol2tp_connect+0x33d/0x41e [l2tp_ppp] [ 1937.664005] RSP: 0018:ffff8800c43c7de8 EFLAGS: 00010282 [ 1937.664005] RAX: ffff8800da8a7240 RBX: ffff8800d8c64600 RCX: 000001c325a137b5 [ 1937.664005] RDX: 8c6318c6318c6320 RSI: 000000000000010c RDI: 0000000000000000 [ 1937.664005] RBP: ffff8800c43c7ea8 R08: 0000000000000000 R09: 0000000000000000 [ 1937.664005] R10: ffffffffa048e2c0 R11: ffff8800d8c64600 R12: ffff8800ca7a5000 [ 1937.664005] R13: ffff8800c439bf40 R14: 000000000000000c R15: 0000000000000009 [ 1937.664005] FS: 00007fd7f610f700(0000) GS:ffff88011a600000(0000) knlGS:0000000000000000 [ 1937.664005] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 1937.664005] CR2: 0000000000000020 CR3: 00000000d9d75000 CR4: 00000000000027e0 [ 1937.664005] Stack: [ 1937.664005] ffffffffa049da80 ffff8800d8fda790 000000000000005b ffff880000000009 [ 1937.664005] ffff8800daf3f200 0000000000000003 ffff8800c43c7e48 ffffffff81109b57 [ 1937.664005] ffffffff81109b0e ffffffff8114c566 0000000000000000 0000000000000000 [ 1937.664005] Call Trace: [ 1937.664005] [<ffffffffa049da80>] ? pppol2tp_connect+0x235/0x41e [l2tp_ppp] [ 1937.664005] [<ffffffff81109b57>] ? might_fault+0x9e/0xa5 [ 1937.664005] [<ffffffff81109b0e>] ? might_fault+0x55/0xa5 [ 1937.664005] [<ffffffff8114c566>] ? rcu_read_unlock+0x1c/0x26 [ 1937.664005] [<ffffffff81309196>] SYSC_connect+0x87/0xb1 [ 1937.664005] [<ffffffff813e56f7>] ? sysret_check+0x1b/0x56 [ 1937.664005] [<ffffffff8107590d>] ? trace_hardirqs_on_caller+0x145/0x1a1 [ 1937.664005] [<ffffffff81213dee>] ? trace_hardirqs_on_thunk+0x3a/0x3f [ 1937.664005] [<ffffffff8114c262>] ? spin_lock+0x9/0xb [ 1937.664005] [<ffffffff813092b4>] SyS_connect+0x9/0xb [ 1937.664005] [<ffffffff813e56d2>] system_call_fastpath+0x16/0x1b [ 1937.664005] Code: 10 2a 84 81 e8 65 76 bd e0 65 ff 0c 25 10 bb 00 00 4d 85 ed 74 37 48 8b 85 60 ff ff ff 48 8b 80 88 01 00 00 48 8b b8 10 02 00 00 <48> 8b 47 20 ff 50 20 85 c0 74 0f 83 e8 28 89 83 10 01 00 00 89 [ 1937.664005] RIP [<ffffffffa049db88>] pppol2tp_connect+0x33d/0x41e [l2tp_ppp] [ 1937.664005] RSP <ffff8800c43c7de8> [ 1937.664005] CR2: 0000000000000020 [ 1939.559375] ---[ end trace 82d44500f28f8708 ]--- Fixes: `f34c4a35d8` ("l2tp: take PMTU from tunnel UDP socket") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 14:40:18 -07:00
Govindarajulu Varadarajan	f0db9b0734	ethtool: Add generic options for tunables This patch adds new ethtool cmd, ETHTOOL_GTUNABLE & ETHTOOL_STUNABLE for getting tunable values from driver. Add get_tunable and set_tunable to ethtool_ops. Driver implements these functions for getting/setting tunable value. Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 12:12:20 -07:00
Daniel Borkmann	e020836d95	dev_ioctl: remove dev_load() CAP_SYS_MODULE message Marcel reported to see the following message when autoloading is being triggered when adding nlmon device: Loading kernel module for a network device with CAP_SYS_MODULE (deprecated). Use CAP_NET_ADMIN and alias netdev-nlmon instead. This false-positive happens despite with having correct capabilities set, e.g. through issuing `ip link del dev nlmon` more than once on a valid device with name nlmon, but Marcel has also seen it on creation time when no nlmon module is previously compiled-in or loaded as module and the device name equals a link type name (e.g. nlmon, vxlan, team). Stephen says: The netdev module alias is a hold over from the past. For normal devices, people used to create a alias eth0 to and point it to the type of network device used, that was back in the bad old ISA days before real discovery. Also, the tunnels create module alias for the control device and ip used to use this to autoload the tunnel device. The message is bogus and should just be removed, I also see it in a couple of other cases where tap devices are renamed for other usese. As mentioned in `8909c9ad8f` ("net: don't allow CAP_NET_ADMIN to load non-netdev kernel modules"), we nevertheless still might want to leave the old autoloading behaviour in place as it could break old scripts, so for now, lets just remove the log message as Stephen suggests. Reference: http://thread.gmane.org/gmane.linux.kernel/1105168 Reported-by: Marcel Holtmann <marcel@holtmann.org> Suggested-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Vasiliy Kulikov <segoon@openwall.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 12:04:40 -07:00
Daniel Borkmann	60a3b2253c	net: bpf: make eBPF interpreter images read-only With eBPF getting more extended and exposure to user space is on it's way, hardening the memory range the interpreter uses to steer its command flow seems appropriate. This patch moves the to be interpreted bytecode to read-only pages. In case we execute a corrupted BPF interpreter image for some reason e.g. caused by an attacker which got past a verifier stage, it would not only provide arbitrary read/write memory access but arbitrary function calls as well. After setting up the BPF interpreter image, its contents do not change until destruction time, thus we can setup the image on immutable made pages in order to mitigate modifications to that code. The idea is derived from commit `314beb9bca` ("x86: bpf_jit_comp: secure bpf jit against spraying attacks"). This is possible because bpf_prog is not part of sk_filter anymore. After setup bpf_prog cannot be altered during its life-time. This prevents any modifications to the entire bpf_prog structure (incl. function/JIT image pointer). Every eBPF program (including classic BPF that are migrated) have to call bpf_prog_select_runtime() to select either interpreter or a JIT image as a last setup step, and they all are being freed via bpf_prog_free(), including non-JIT. Therefore, we can easily integrate this into the eBPF life-time, plus since we directly allocate a bpf_prog, we have no performance penalty. Tested with seccomp and test_bpf testsuite in JIT/non-JIT mode and manual inspection of kernel_page_tables. Brad Spengler proposed the same idea via Twitter during development of this patch. Joint work with Hannes Frederic Sowa. Suggested-by: Brad Spengler <spender@grsecurity.net> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Kees Cook <keescook@chromium.org> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 12:02:48 -07:00
Sabrina Dubroca	a9ed4a2986	ipv6: fix rtnl locking in setsockopt for anycast and multicast Calling setsockopt with IPV6_JOIN_ANYCAST or IPV6_LEAVE_ANYCAST triggers the assertion in addrconf_join_solict()/addrconf_leave_solict() ipv6_sock_ac_join(), ipv6_sock_ac_drop(), ipv6_sock_ac_close() need to take RTNL before calling ipv6_dev_ac_inc/dec. Same thing with ipv6_sock_mc_join(), ipv6_sock_mc_drop(), ipv6_sock_mc_close() before calling ipv6_dev_mc_inc/dec. This patch moves ASSERT_RTNL() up a level in the call stack. Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Reported-by: Tommi Rantala <tt.rantala@gmail.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 11:52:28 -07:00
Eyal Shapira	f3000e1b43	mac80211: fix broken use of VHT/20Mhz with some APs commit "mac80211: disable 40MHz support in case of 20MHz AP" broke working VHT in 20Mhz with APs like Netgear R6300v2 which do not publish support for 40Mhz but allow use of VHT in 20Mhz. The break is because VHT is disabled once no HT cap doesn't indicate support for 40Mhz. This causes the assoc request to be sent without any VHT IE and the association is only HT due to this. For more details check out commit `4a817aa7` "mac80211: allow VHT with peers not capable of 40MHz" Fixes: `53b954ee4a` ("mac80211: disable 40MHz support in case of 20MHz AP") Signed-off-by: Eyal Shapira <eyalx.shapira@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-05 13:54:07 +02:00
Lorenzo Bianconi	a4bcaf5556	mac80211: extend set_coverage_class signature Extend mac80211 set_coverage_class API in order to enable ACK timeout estimation algorithm (dynack) passing coverage class equals to -1 to lower drivers. Synchronize set_coverage_class routine signature with mac80211 function pointer for p54, ath9k, ath9k_htc and ath5k drivers. Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi83@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-05 13:54:07 +02:00
Lorenzo Bianconi	3057dbfdab	cfg80211: enable dynack through nl80211 Enable ACK timeout estimation algorithm (dynack) using mac80211 set_coverage_class API. Dynack is activated passing coverage class equals to -1 to lower drivers and it is automatically disabled setting valid value for coverage class. Define NL80211_ATTR_WIPHY_DYN_ACK flag attribute to enable dynack from userspace. In order to activate dynack NL80211_FEATURE_ACKTO_ESTIMATION feature flag must be set by lower drivers to indicate dynack capability. Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi83@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-05 13:54:03 +02:00
Eliad Peller	eaa336b0f5	mac80211: combine roc with the "next roc" if possible If the remaining time in the current roc is not long enough, mac80211 adds the new roc right after it (if they have similar params). However, in case of multiple rocs, the "next roc" is not considered, resulting in multiple rocs, each one with its own duration. Refactor the code a bit and consider the next roc, so a single max roc will be used instead of multiple rocs (which might last much longer). Signed-off-by: Eliad Peller <eliadx.peller@intel.com> Reviewed-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-05 13:52:09 +02:00
Eliad Peller	24ecd45e2e	mac80211: adjust roc duration when combining ROCs The new duration (remaining duration after the current ROC ends) was calculated but not used, making the optimization worthless. Signed-off-by: Eliad Peller <eliadx.peller@intel.com> Reviewed-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-05 13:52:08 +02:00
Eliad Peller	a62a1aed37	cfg80211: avoid duplicate entries on regdomain intersection The regdom intersection code simply tries intersecting each rule of the source with each rule of the target. Since the resulting intersections are not observed as a whole, this can result in multiple overlapping/duplicate entries. Make the rule addition a bit more smarter, by looking for rules that can be contained within other rules, and adding only extended ones. Signed-off-by: Eliad Peller <eliad@wizery.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-05 13:52:08 +02:00
Assaf Krauss	cd2f5dd709	mac80211: Add RRM support to assoc request In case of a RRM-supporting connection, in the association request frame: set the RRM capability flag, and add the required IEs. Signed-off-by: Assaf Krauss <assaf.krauss@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-05 13:52:08 +02:00
Assaf Krauss	bab5ab7d2a	nl80211: Add flag attribute for RRM connections Add a flag attribute to use in associations, for tagging the target connection as supporting RRM. It is the responsibility of upper layers to set this flag only if both the underlying device, and the target network indeed support RRM. To be used in ASSOCIATE and CONNECT commands. Signed-off-by: Assaf Krauss <assaf.krauss@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-05 13:52:08 +02:00
Liad Kaufman	6188c271f0	mac80211: fix description comment of ieee80211_subif_start_xmit The function description claimed that on error the skb isn't freed even though it is, and stated return values that are different than what really happens in the code. Signed-off-by: Liad Kaufman <liad.kaufman@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-05 13:52:07 +02:00
Johannes Berg	2740f0cf8e	cfg80211: add Intel Mobile Communications copyright Our legal structure changed at some point (see wikipedia), but we forgot to immediately switch over to the new copyright notice. For files that we have modified in the time since the change, add the proper copyright notice now. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-05 13:52:06 +02:00
Johannes Berg	d98ad83ee8	mac80211: add Intel Mobile Communications copyright Our legal structure changed at some point (see wikipedia), but we forgot to immediately switch over to the new copyright notice. For files that we have modified in the time since the change, add the proper copyright notice now. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-05 13:52:06 +02:00
Emmanuel Grumbach	785e21a89d	mac80211: use bss_conf->dtim_period instead of conf.ps_dtim_period sta_set_sinfo is obviously takes data for specific station. This specific station is attached to a specific virtual interface. Hence we should use the dtim_period from this virtual interface rather than the system wide dtim_period. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-05 13:41:08 +02:00
Johannes Berg	c10a19930f	mac80211: clean up ieee80211_i.h Not sure how the declaration of ieee80211_tdls_peer_del_work landed after the double inclusion protection end. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-05 10:56:50 +02:00
Hannes Frederic Sowa	a9fe8e2994	ipv4: implement igmp_qrv sysctl to tune igmp robustness variable As in IPv6 people might increase the igmp query robustness variable to make sure unsolicited state change reports aren't lost on the network. Add and document this new knob to igmp code. RFCs allow tuning this parameter back to first IGMP RFC, so we also use this setting for all counters, including source specific multicast. Also take over sysctl value when upping the interface and don't reuse the last one seen on the interface. Cc: Flavio Leitner <fbl@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-04 22:26:14 -07:00
Hannes Frederic Sowa	2f711939d2	ipv6: add sysctl_mld_qrv to configure query robustness variable This patch adds a new sysctl_mld_qrv knob to configure the mldv1/v2 query robustness variable. It specifies how many retransmit of unsolicited mld retransmit should happen. Admins might want to tune this on lossy links. Also reset mld state on interface down/up, so we pick up new sysctl settings during interface up event. IPv6 certification requests this knob to be available. I didn't make this knob netns specific, as it is mostly a setting in a physical environment and should be per host. Cc: Flavio Leitner <fbl@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-04 22:26:14 -07:00
John W. Linville	ef4ead3f29	Not that much content this time. Some RCU cleanups, crypto performance improvements, and various patches all over, rather than listing them one might as well look into the git log instead. -----BEGIN PGP SIGNATURE----- iQIcBAABCAAGBQJUAIx4AAoJEDBSmw7B7bqrUYcP/3t4qdFxm0bd4j2AEkl3mPwB Qu7obTicOTfBRoJNEgS+8AU2u3PfztU6+ErZs4ETLUuqaZwXisqmwBiMo86+Wtdf gx9KonwEW051g7YmB0+6EMwuy04MGzTEk8VavQwqM4g9LIPJ4Buo/kj7MNJ51m11 XyRmJqZJnKKeiiQ4eC0gPf8e44qiQqaDuYZ0r1UDnNRg2KrbAHlGTBKYI3VRl2u4 xRpPGVnHwT0qkWb1Zw9fk0VfPr9m1ETthzcZvnhk6uMnJ28D+1B1FjZR1GJU6BW7 Zx2FbevbZTjDoNT1GQpLGMXBuW0lsZFetXVFiJCr/StaPBtHmtdu28fuNVm8yJYz euDlEgrE8F4npdec2F5R2zh7Ue2U7eMEL2uxxjciNSJOipHgx5EXH12Y/5QtrChy 4OHPbNHgpmqFB7TmkvHDgP/0A7XdyqKVc+NtIV+eECIwE4tHcJ6A+bQ+ZCoRV2Vw zmsNuNeNeDW7NEAw9veRXissLZMy/EjUnsOrnW29BpO/yG+2YjqpyQ6JQpcXeCPD WQgl2FHpk6ap3jpVjxminxw2HkDnQ0oTKusGLcezalhUlWMo7VYNN59aLzcphxX5 Fotp/8v1sbDTF46uc/QJ38N5TqflwWeFpxvGkdNGuAT4llP03NaXV0ORBecFmMW2 esb+PLwlByCDeVFu53q+ =Qth6 -----END PGP SIGNATURE----- Merge tag 'mac80211-next-for-john-2014-08-29' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg <johannes@sipsolutions.net> says: "Not that much content this time. Some RCU cleanups, crypto performance improvements, and various patches all over, rather than listing them one might as well look into the git log instead." Signed-off-by: John W. Linville <linville@tuxdriver.com> Conflicts: drivers/net/wireless/ath/wil6210/wmi.c	2014-09-04 13:41:33 -04:00
John W. Linville	190355cc06	Here are a few fixes for mac80211. One has been discussed for a while and adds a terminating NUL-byte to the alpha2 sent to userspace, which shouldn't be necessary but since many places treat it as a string we couldn't move to just sending two bytes. In addition to that, we have two VLAN fixes from Felix, a mesh fix, a fix for the recently introduced RX aggregation offload, a revert for a broken patch (that luckily didn't really cause any harm) and a small fix for alignment in debugfs. -----BEGIN PGP SIGNATURE----- iQIcBAABCAAGBQJUAFrmAAoJEDBSmw7B7bqr79AP/1yGi9lkv/wWUs5y0AhUSen9 850MU26BBlyAAFSz11xqgaEeRmeBeqhR3K7w/M02TX0CHxBzMqMZfyE//tq0UJaI ZwZmtyQmdMiOSNKignTIIx7OHTioq0wrGKb6O2UvKoJfTlB9t01jCC4jmCTF5Vos 6ReF7NaZEbxW6XDOsClNTAtIa1c6n1RQ5VbDIEL5Vfvqv8LbcobduF8WcYl80eIQ +EvIHtUm/Luxg6DblibgEVtwYOtNpvRz4pofdw3xoSHAnF+zhXbUr0dUjpkBNA7o vWboCBl14Qn1M7pOJZ0+TBzFmquAr6CDbDvArVCH01Swh27EUDQUcHQAggGpT71w DFgWHOYP0UCB6Y4U0GjBehy8PeuytqJLBSceKVud7DDqd8fY+Lq3MMyicIk0aw3o IIDLWrujkCBXsdfuxQETmYxHU05WHSuYOCTgGSqbq3QPTWm8pBGWTdbk+1t/0FyH cGLJOWs/jCrtHdzDj6TH+kL8NmvwB7sC9MT45qG0ilevmPW25yrnTJPEMEvFBqvZ lnaqiX6D1kGNZd09CxgSIhxrQi+N0Yg+UlLa4IUtOIqnQussOC3xH2U5qTufdpa1 Gi9aCkBGVKQiObPWucf2QB4t1sZ18rxBhrAelZhQPLTKrnsuLhpcVBlU+L6ScCAk FVni4HZH2IGtDQ577k10 =G/pB -----END PGP SIGNATURE----- Merge tag 'mac80211-for-john-2014-08-29' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg <johannes@sipsolutions.net> says: "Here are a few fixes for mac80211. One has been discussed for a while and adds a terminating NUL-byte to the alpha2 sent to userspace, which shouldn't be necessary but since many places treat it as a string we couldn't move to just sending two bytes. In addition to that, we have two VLAN fixes from Felix, a mesh fix, a fix for the recently introduced RX aggregation offload, a revert for a broken patch (that luckily didn't really cause any harm) and a small fix for alignment in debugfs." Signed-off-by: John W. Linville <linville@redhat.com>	2014-09-04 13:08:24 -04:00
Li RongQing	c5eba0b6f8	openvswitch: distinguish between the dropped and consumed skb distinguish between the dropped and consumed skb, not assume the skb is consumed always Cc: Thomas Graf <tgraf@noironetworks.com> Cc: Pravin Shelar <pshelar@nicira.com> Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-03 20:50:51 -07:00
Jesper Dangaard Brouer	1f59533f9c	qdisc: validate frames going through the direct_xmit path In commit `50cbe9ab5f` ("net: Validate xmit SKBs right when we pull them out of the qdisc") the validation code was moved out of dev_hard_start_xmit and into dequeue_skb. However this overlooked the fact that we do not always enqueue the skb onto a qdisc. First situation is if qdisc have flag TCQ_F_CAN_BYPASS and qdisc is empty. Second situation is if there is no qdisc on the device, which is a common case for software devices. Originally spotted and inital patch by Alexander Duyck. As a result Alex was seeing issues trying to connect to a vhost_net interface after commit `50cbe9ab5f` was applied. Added a call to validate_xmit_skb() in __dev_xmit_skb(), in the code path for qdiscs with TCQ_F_CAN_BYPASS flag, and in __dev_queue_xmit() when no qdisc. Also handle the error situation where dev_hard_start_xmit() could return a skb list, and does not return dev_xmit_complete(rc) and falls through to the kfree_skb(), in that situation it should call kfree_skb_list(). Fixes: `50cbe9ab5f` ("net: Validate xmit SKBs right when we pull them out of the qdisc") Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-03 20:41:42 -07:00
Jesper Dangaard Brouer	3f3c7eec60	qdisc: exit case fixes for skb list handling in qdisc layer More minor fixes to merge commit `53fda7f7f9` (Merge branch 'xmit_list') that allows us to work with a list of SKBs. Fixing exit cases in qdisc_reset() and qdisc_destroy(), where a leftover requeued SKB (qdisc->gso_skb) can have the potential of being a skb list, thus use kfree_skb_list(). This is a followup to commit `10770bc2d1` ("qdisc: adjustments for API allowing skb list xmits"). Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-03 20:41:42 -07:00
Pablo Neira Ayuso	cbb8125eb4	netfilter: nfnetlink: deliver netlink errors on batch completion We have to wait until the full batch has been processed to deliver the netlink error messages to userspace. Otherwise, we may deliver duplicated errors to userspace in case that we need to abort and replay the transaction if any of the required modules needs to be autoloaded. A simple way to reproduce this (assumming nft_meta is not loaded) with the following test file: add table filter add chain filter test add chain bad test # intentional wrong unexistent table add rule filter test meta mark 0 Then, when trying to load the batch: # nft -f test test:4:1-19: Error: Could not process rule: No such file or directory add chain bad test ^^^^^^^^^^^^^^^^^^^ test:4:1-19: Error: Could not process rule: No such file or directory add chain bad test ^^^^^^^^^^^^^^^^^^^ The error is reported twice, once when the batch is aborted due to missing nft_meta and another when it is fully processed. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-09-03 16:56:23 +02:00
Michal Kazior	4549cf2b18	mac80211: fix offloaded BA session traffic after hw restart When starting an offloaded BA session it is unknown what starting sequence number should be used. Using last_seq worked in most cases except after hw restart. When hw restart is requested last_seq is (rightfully so) kept unmodified. This ended up with BA sessions being restarted with an aribtrary BA window values resulting in dropped frames until sequence numbers caught up. Instead of last_seq pick seqno of a first Rxed frame of a given BA session. This fixes stalled traffic after hw restart with offloaded BA sessions (currently only ath10k). Signed-off-by: Michal Kazior <michal.kazior@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-03 13:40:38 +02:00
Johannes Berg	bd8c78e78d	nl80211: clear skb cb before passing to netlink In testmode and vendor command reply/event SKBs we use the skb cb data to store nl80211 parameters between allocation and sending. This causes the code for CONFIG_NETLINK_MMAP to get confused, because it takes ownership of the skb cb data when the SKB is handed off to netlink, and it doesn't explicitly clear it. Clear the skb cb explicitly when we're done and before it gets passed to netlink to avoid this issue. Cc: stable@vger.kernel.org [this goes way back] Reported-by: Assaf Azulay <assaf.azulay@intel.com> Reported-by: David Spinadel <david.spinadel@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-03 11:13:14 +02:00
Pablo Neira Ayuso	d99407f42f	netfilter: nft_rbtree: no need for spinlock from set destroy path The sets are released from the rcu callback, after the rule is removed from the chain list, which implies that nfnetlink cannot update the rbtree and no packets are walking on the set anymore. Thus, we can get rid of the spinlock in the set destroy path there. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Reviewied-by: Thomas Graf <tgraf@suug.ch>	2014-09-03 10:57:08 +02:00
Pablo Neira Ayuso	39f390167e	netfilter: nft_hash: no need for rcu in the hash set destroy path The sets are released from the rcu callback, after the rule is removed from the chain list, which implies that nfnetlink cannot update the hashes (thus, no resizing may occur) and no packets are walking on the set anymore. This resolves a lockdep splat in the nft_hash_destroy() path since the nfnl mutex is not held there. =============================== [ INFO: suspicious RCU usage. ] 3.16.0-rc2+ #168 Not tainted ------------------------------- net/netfilter/nft_hash.c:362 suspicious rcu_dereference_protected() usage! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 1 1 lock held by ksoftirqd/0/3: #0: (rcu_callback){......}, at: [<ffffffff81096393>] rcu_process_callbacks+0x27e/0x4c7 stack backtrace: CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 3.16.0-rc2+ #168 Hardware name: LENOVO 23259H1/23259H1, BIOS G2ET32WW (1.12 ) 05/30/2012 0000000000000001 ffff88011769bb98 ffffffff8142c922 0000000000000006 ffff880117694090 ffff88011769bbc8 ffffffff8107c3ff ffff8800cba52400 ffff8800c476bea8 ffff8800c476bea8 ffff8800cba52400 ffff88011769bc08 Call Trace: [<ffffffff8142c922>] dump_stack+0x4e/0x68 [<ffffffff8107c3ff>] lockdep_rcu_suspicious+0xfa/0x103 [<ffffffffa079931e>] nft_hash_destroy+0x50/0x137 [nft_hash] [<ffffffffa078cd57>] nft_set_destroy+0x11/0x2a [nf_tables] Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Thomas Graf <tgraf@suug.ch>	2014-09-03 10:57:06 +02:00
Jesper Dangaard Brouer	10770bc2d1	qdisc: adjustments for API allowing skb list xmits Minor adjustments for merge commit `53fda7f7f9` (Merge branch 'xmit_list') that allows us to work with a list of SKBs. Update code doc to function sch_direct_xmit(). In handle_dev_cpu_collision() use kfree_skb_list() in error handling. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-02 14:06:17 -07:00
Li RongQing	4ee45ea05c	openvswitch: fix a memory leak The user_skb maybe be leaked if the operation on it failed and codes skipped into the label "out:" without calling genlmsg_unicast. Cc: Pravin Shelar <pshelar@nicira.com> Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-02 14:01:21 -07:00
Pablo Neira	41ad82f7f8	netfilter: fix missing dependencies in NETFILTER_XT_TARGET_LOG make defconfig reports: warning: (NETFILTER_XT_TARGET_LOG) selects NF_LOG_IPV6 which has unmet direct dependencies (NET && INET && IPV6 && NETFILTER && NETFILTER_ADVANCED) Fixes: `d79a61d` netfilter: NETFILTER_XT_TARGET_LOG selects NF_LOG_* Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-02 13:59:54 -07:00
David S. Miller	abccc5878a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== pull request: Netfilter/IPVS fixes for net The following patchset contains seven Netfilter fixes for your net tree, they are: 1) Make the NAT infrastructure independent of x_tables, some users are already starting to test nf_tables with NAT without enabling x_tables. Without this patch for Kconfig, there's a superfluous dependency between NAT and x_tables. 2) Allow to use 0 in the cgroup match, the kernel rejects with -EINVAL with no good reason. From Daniel Borkmann. 3) Select CONFIG_NF_NAT from the nf_tables NAT expression, this also resolves another NAT dependency with x_tables. 4) Use HAVE_JUMP_LABEL instead of CONFIG_JUMP_LABEL in the Netfilter hook code as elsewhere in the kernel to resolve toolchain problems, from Zhouyi Zhou. 5) Use iptunnel_handle_offloads() to set up tunnel encapsulation depending on the offload capabilities, reported by Alex Gartrell patch from Julian Anastasov. 6) Fix wrong family when registering the ip_vs_local_reply6() hook, also from Julian. 7) Select the NF_LOG_* symbols from NETFILTER_XT_TARGET_LOG. Rafał Miłecki reported that when jumping from 3.16 to 3.17-rc, his log target is not selected anymore due to changes in the previous development cycle to accomodate the full logging support for nf_tables. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-02 13:56:30 -07:00
Nicolas Dichtel	ba9989069f	rtnl/do_setlink(): notify when a netdev is modified Depending on which parameters were updated, the changes were not propagated via the notifier chain and netlink. The new flag has been set only when the change did not cause a call to the notifier chain and/or to the netlink notification functions. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-02 12:57:04 -07:00
Nicolas Dichtel	90c325e3bf	rtnl/do_setlink(): last arg is now a set of flags There is no functional changes with this commit, it only prepares the next one. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-02 12:57:04 -07:00
Nicolas Dichtel	1889b0e7ef	rtnl/do_setlink(): set modified when IFLA_LINKMODE is updated The only effect of this patch is to print a warning if IFLA_LINKMODE is updated and a following change fails. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-02 12:57:04 -07:00
Nicolas Dichtel	5d1180fcac	rtnl/do_setlink(): set modified when IFLA_TXQLEN is updated The only effect of this patch is to print a warning if IFLA_TXQLEN is updated and a following change fails. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-02 12:57:04 -07:00
Pablo Neira Ayuso	65cd90ac76	netfilter: nft_chain_nat_ipv4: use generic IPv4 NAT code from core Use the exported IPv4 NAT functions that are provided by the core. This removes duplicated code so iptables and nft use the same NAT codebase. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-09-02 17:14:11 +02:00
Pablo Neira Ayuso	30766f4c2d	netfilter: nat: move specific NAT IPv4 to core Move the specific NAT IPv4 core functions that are called from the hooks from iptable_nat.c to nf_nat_l3proto_ipv4.c. This prepares the ground to allow iptables and nft to use the same NAT engine code that comes in a follow up patch. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-09-02 17:14:10 +02:00
Christophe Gouault	880a6fab8f	xfrm: configure policy hash table thresholds by netlink Enable to specify local and remote prefix length thresholds for the policy hash table via a netlink XFRM_MSG_NEWSPDINFO message. prefix length thresholds are specified by XFRMA_SPD_IPV4_HTHRESH and XFRMA_SPD_IPV6_HTHRESH optional attributes (struct xfrmu_spdhthresh). example: struct xfrmu_spdhthresh thresh4 = { .lbits = 0; .rbits = 24; }; struct xfrmu_spdhthresh thresh6 = { .lbits = 0; .rbits = 56; }; struct nlmsghdr hdr; struct nl_msg msg; msg = nlmsg_alloc(); hdr = nlmsg_put(msg, NL_AUTO_PORT, NL_AUTO_SEQ, XFRMA_SPD_IPV4_HTHRESH, sizeof(__u32), NLM_F_REQUEST); nla_put(msg, XFRMA_SPD_IPV4_HTHRESH, sizeof(thresh4), &thresh4); nla_put(msg, XFRMA_SPD_IPV6_HTHRESH, sizeof(thresh6), &thresh6); nla_send_auto(sk, msg); The numbers are the policy selector minimum prefix lengths to put a policy in the hash table. - lbits is the local threshold (source address for out policies, destination address for in and fwd policies). - rbits is the remote threshold (destination address for out policies, source address for in and fwd policies). The default values are: XFRMA_SPD_IPV4_HTHRESH: 32 32 XFRMA_SPD_IPV6_HTHRESH: 128 128 Dynamic re-building of the SPD is performed when the thresholds values are changed. The current thresholds can be read via a XFRM_MSG_GETSPDINFO request: the kernel replies to XFRM_MSG_GETSPDINFO requests by an XFRM_MSG_NEWSPDINFO message, with both attributes XFRMA_SPD_IPV4_HTHRESH and XFRMA_SPD_IPV6_HTHRESH. Signed-off-by: Christophe Gouault <christophe.gouault@6wind.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-09-02 13:37:56 +02:00
Christophe Gouault	b58555f176	xfrm: hash prefixed policies based on preflen thresholds The idea is an extension of the current policy hashing. Today only non-prefixed policies are stored in a hash table. This patch relaxes the constraints, and hashes policies whose prefix lengths are greater or equal to a configurable threshold. Each hash table (one per direction) maintains its own set of IPv4 and IPv6 thresholds (dbits4, sbits4, dbits6, sbits6), by default (32, 32, 128, 128). Example, if the output hash table is configured with values (16, 24, 56, 64): ip xfrm policy add dir out src 10.22.0.0/20 dst 10.24.1.0/24 ... => hashed ip xfrm policy add dir out src 10.22.0.0/16 dst 10.24.1.1/32 ... => hashed ip xfrm policy add dir out src 10.22.0.0/16 dst 10.24.0.0/16 ... => unhashed ip xfrm policy add dir out \ src 3ffe:304:124:2200::/60 dst 3ffe:304:124:2401::/64 ... => hashed ip xfrm policy add dir out \ src 3ffe:304:124:2200::/56 dst 3ffe:304:124:2401::2/128 ... => hashed ip xfrm policy add dir out \ src 3ffe:304:124:2200::/56 dst 3ffe:304:124:2400::/56 ... => unhashed The high order bits of the addresses (up to the threshold) are used to compute the hash key. Signed-off-by: Christophe Gouault <christophe.gouault@6wind.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-09-02 13:29:44 +02:00
Willem de Bruijn	364a9e9324	sock: deduplicate errqueue dequeue sk->sk_error_queue is dequeued in four locations. All share the exact same logic. Deduplicate. Also collapse the two critical sections for dequeue (at the top of the recv handler) and signal (at the bottom). This moves signal generation for the next packet forward, which should be harmless. It also changes the behavior if the recv handler exits early with an error. Previously, a signal for follow-up packets on the errqueue would then not be scheduled. The new behavior, to always signal, is arguably a bug fix. For rxrpc, the change causes the same function to be called repeatedly for each queued packet (because the recv handler == sk_error_report). It is likely that all packets will fail for the same reason (e.g., memory exhaustion). This code runs without sk_lock held, so it is not safe to trust that sk->sk_err is immutable inbetween releasing q->lock and the subsequent test. Introduce int err just to avoid this potential race. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-01 21:49:08 -07:00
Tom Herbert	72297c59f7	l2tp: Enable checksum unnecessary conversions for l2tp/UDP sockets Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-01 21:36:28 -07:00
Tom Herbert	884d338c04	gre: Add support for checksum unnecessary conversions Call skb_checksum_try_convert and skb_gro_checksum_try_convert after checksum is found present and validated in the GRE header for normal and GRO paths respectively. In GRO path, call skb_gro_checksum_try_convert Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-01 21:36:28 -07:00
Tom Herbert	2abb7cdc0d	udp: Add support for doing checksum unnecessary conversion Add support for doing CHECKSUM_UNNECESSARY to CHECKSUM_COMPLETE conversion in UDP tunneling path. In the normal UDP path, we call skb_checksum_try_convert after locating the UDP socket. The check is that checksum conversion is enabled for the socket (new flag in UDP socket) and that checksum field is non-zero. In the UDP GRO path, we call skb_gro_checksum_try_convert after checksum is validated and checksum field is non-zero. Since this is already in GRO we assume that checksum conversion is always wanted. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-01 21:36:28 -07:00
Tom Herbert	5a21232983	net: Support for csum_bad in skbuff This flag indicates that an invalid checksum was detected in the packet. __skb_mark_checksum_bad helper function was added to set this. Checksums can be marked bad from a driver or the GRO path (the latter is implemented in this patch). csum_bad is checked in __skb_checksum_validate_complete (i.e. calling that when ip_summed == CHECKSUM_NONE). csum_bad works in conjunction with ip_summed value. In the case that ip_summed is CHECKSUM_NONE and csum_bad is set, this implies that the first (or next) checksum encountered in the packet is bad. When ip_summed is CHECKSUM_UNNECESSARY, the first checksum after the last one validated is bad. For example, if ip_summed == CHECKSUM_UNNECESSARY, csum_level == 1, and csum_bad is set-- then the third checksum in the packet is bad. In the normal path, the packet will be dropped when processing the protocol layer of the bad checksum: __skb_decr_checksum_unnecessary called twice for the good checksums changing ip_summed to CHECKSUM_NONE so that __skb_checksum_validate_complete is called to validate the third checksum and that will fail since csum_bad is set. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-01 21:36:27 -07:00
Florian Fainelli	61b7363ffa	net: dsa: make dsa_pack_type static net/dsa/dsa.c:624:20: sparse: symbol 'dsa_pack_type' was not declared. Should it be static? Fixes: `3e8a72d1da` ("net: dsa: reduce number of protocol hooks") Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-01 20:41:45 -07:00
stephen hemminger	688d1945bc	tcp: whitespace fixes Fix places where there is space before tab, long lines, and awkward if(){, double spacing etc. Add blank line after declaration/initialization. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-01 18:12:45 -07:00
David S. Miller	377655fe6b	Merge tag 'master-2014-08-25' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== pull request: wireless 2014-08-28 Please pull this batch of fixes intended for the 3.17 stream. For the Bluetooth/6LowPAN/802.15.4 bits, Johan says: 'It contains a connection reference counting fix for LE where a connection might stay up even though it should get disconnected. The other 802.15.4 6LoWPAN related patches were sent to the bluetooth tree by Alexander Aring and described as follows by him: " these patches contains patches for the bluetooth branch. This series includes memory leak fixes and an errno value fix. Also there are two patches for sending and receiving 1280 6LoWPAN packets, which makes the IEEE 802.15.4 6LoWPAN stack more RFC compliant. "' Along with that... Alexey Khoroshilov fixes a use-after-free bug on at76c50x-usb. Hauke Mehrtens adds a PCI ID to bcma. Himangi Saraogi fixes a silly "A \|\| A" test in rtlwifi. Larry Finger adds a device ID to rtl8192cu. Maks Naumov fixes a strncmp argument in ath9k. Álvaro Fernández Rojas adds a PCI ID to ssb. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-01 18:08:11 -07:00
Jesper Dangaard Brouer	afb84b6261	pktgen: add flag NO_TIMESTAMP to disable timestamping Then testing the TX limits of the stack, then it is useful to be-able to disable the do_gettimeofday() timetamping on every packet. This implements a pktgen flag NO_TIMESTAMP which will disable this call to do_gettimeofday(). The performance change on (my system E5-2695) with skb_clone=0, goes from TX 2,423,751 pps to 2,567,165 pps with flag NO_TIMESTAMP. Thus, the cost of do_gettimeofday() or saving is approx 23 nanosec. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-01 18:06:59 -07:00
Erik Hugne	a5325ae5b8	tipc: add name distributor resiliency queue TIPC name table updates are distributed asynchronously in a cluster, entailing a risk of certain race conditions. E.g., if two nodes simultaneously issue conflicting (overlapping) publications, this may not be detected until both publications have reached a third node, in which case one of the publications will be silently dropped on that node. Hence, we end up with an inconsistent name table. In most cases this conflict is just a temporary race, e.g., one node is issuing a publication under the assumption that a previous, conflicting, publication has already been withdrawn by the other node. However, because of the (rtt related) distributed update delay, this may not yet hold true on all nodes. The symptom of this failure is a syslog message: "tipc: Cannot publish {%u,%u,%u}, overlap error". In this commit we add a resiliency queue at the receiving end of the name table distributor. When insertion of an arriving publication fails, we retain it in this queue for a short amount of time, assuming that another update will arrive very soon and clear the conflict. If so happens, we insert the publication, otherwise we drop it. The (configurable) retention value defaults to 2000 ms. Knowing from experience that the situation described above is extremely rare, there is no risk that the queue will accumulate any large number of items. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-01 17:51:48 -07:00
Erik Hugne	f4ad8a4b8b	tipc: refactor name table updates out of named packet receive routine We need to perform the same actions when processing deferred name table updates, so this functionality is moved to a separate function. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-01 17:51:48 -07:00
David S. Miller	8dcda22a5d	net: xmit_list() becomes dev_hard_start_xmit(). Now fundamentally we can process lists of SKBs as cheaply as single packets. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-01 17:39:56 -07:00
David S. Miller	ce93718fb7	net: Don't keep around original SKB when we software segment GSO frames. Just maintain the list properly by returning the head of the remaining SKB list from dev_hard_start_xmit(). Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-01 17:39:56 -07:00
David S. Miller	50cbe9ab5f	net: Validate xmit SKBs right when we pull them out of the qdisc. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-01 17:39:56 -07:00
David S. Miller	eae3f88ee4	net: Separate out SKB validation logic from transmit path. dev_hard_start_xmit() does two things, it first validates and canonicalizes the SKB, then it actually sends it. Make a set of helper functions for doing the first part. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-01 17:39:55 -07:00
David S. Miller	95f6b3dda2	net: Have xmit_list() signal more==true when appropriate. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-01 17:39:55 -07:00
David S. Miller	fa2dbdc253	net: Pass a "more" indication down into netdev_start_xmit() code paths. For now it will always be false. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-01 17:39:55 -07:00
David S. Miller	7f2e870f2a	net: Move main gso loop out of dev_hard_start_xmit() into helper. There is a slight policy change happening here as well. The previous code would drop the entire rest of the GSO skb if any of them got, for example, a congestion notification. That makes no sense, anything NET_XMIT_MASK and below is something like congestion or policing. And in the congestion case it doesn't even mean the packet was actually dropped. Just continue until dev_xmit_complete() evaluates to false. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-01 17:39:55 -07:00
David S. Miller	2ea2551375	net: Create xmit_one() helper for dev_hard_start_xmit() Hopefully making the code a bit easier to read and digest. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-01 17:39:55 -07:00
David S. Miller	10b3ad8c21	net: Do txq_trans_update() in netdev_start_xmit() That way we don't have to audit every call site to make sure it is doing this properly. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-01 17:39:55 -07:00
Vincent Cuissard	83724c3329	NFC: NCI: Fix NCI RF FRAME interface usage NCI RF FRAME interface is used for all kind of tags except ISODEP ones. So for all other kind of tags the status byte has to be removed. Signed-off-by: Vincent Cuissard <cuissard@marvell.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-09-01 14:40:43 +02:00
Vincent Cuissard	3c1c0f5dc8	NFC: NCI: Fix nci_register_device init sequence All contexts have to be initiliazed before calling nfc_register_device otherwise it is possible to call nci_dev_up before ending the nci_register_device function. In such case kernel will crash on non initialized variables. Signed-off-by: Vincent Cuissard <cuissard@marvell.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-09-01 14:40:37 +02:00
Vincent Cuissard	cfdbeeafdb	NFC: NCI: Add support of ISO15693 Update nci.h to respect latest NCI specification proposal (stop using proprietary opcodes). Handle ISO15693 parameters in NCI_RF_ACTIVATED_NTF handler. Signed-off-by: Vincent Cuissard <cuissard@marvell.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-09-01 14:40:31 +02:00
Pablo Neira Ayuso	d79a61d646	netfilter: NETFILTER_XT_TARGET_LOG selects NF_LOG_* CONFIG_NETFILTER_XT_TARGET_LOG is not selected anymore when jumping from 3.16 to 3.17-rc1 if you don't set on the new NF_LOG_IPV4 and NF_LOG_IPV6 switches. Change this to select the three new symbols NF_LOG_COMMON, NF_LOG_IPV4 and NF_LOG_IPV6 instead, so NETFILTER_XT_TARGET_LOG remains enabled when moving from old to new kernels. Reported-by: Rafał Miłecki <zajec5@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-09-01 13:46:31 +02:00
Masanari Iida	1a84db567a	treewide: fix errors in printk This patch fix spelling typo in printk. Signed-off-by: Masanari Iida <standby24x7@gmail.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2014-09-01 11:18:25 +02:00
Mark A. Greer	dddb3da046	NFC: digital: Add Inititor-side PSL support In order to operate at the fasted bit rate possible, add initiator-side support for PSL REQ while in P2P mode. The PSL REQ will switch the RF technology to 424F whenever possible. Reviewed-by: Thierry Escande <thierry.escande@linux.intel.com> Tested-by: Thierry Escande <thierry.escande@linux.intel.com> Signed-off-by: Mark A. Greer <mgreer@animalcreek.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-08-31 22:15:37 +02:00
Tom Herbert	202863fe4c	sctp: Change sctp to implement csum_levels CHECKSUM_UNNECESSARY may be applied to the SCTP CRC so we need to appropriate account for this by decrementing csum_level. This is done by calling __skb_dec_checksum_unnecessary. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-08-29 20:41:11 -07:00
Tom Herbert	662880f442	net: Allow GRO to use and set levels of checksum unnecessary Allow GRO path to "consume" checksums provided in CHECKSUM_UNNECESSARY and to report new checksums verfied for use in fallback to normal path. Change GRO checksum path to track csum_level using a csum_cnt field in NAPI_GRO_CB. On GRO initialization, if ip_summed is CHECKSUM_UNNECESSARY set NAPI_GRO_CB(skb)->csum_cnt to skb->csum_level + 1. For each checksum verified, decrement NAPI_GRO_CB(skb)->csum_cnt while its greater than zero. If a checksum is verfied and NAPI_GRO_CB(skb)->csum_cnt == 0, we have verified a deeper checksum than originally indicated in skbuf so increment csum_level (or initialize to CHECKSUM_UNNECESSARY if ip_summed is CHECKSUM_NONE or CHECKSUM_COMPLETE). Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-08-29 20:41:11 -07:00
Tom Herbert	77cffe23c1	net: Clarification of CHECKSUM_UNNECESSARY This patch: - Clarifies the specific requirements of devices returning CHECKSUM_UNNECESSARY (comments in skbuff.h). - Adds csum_level field to skbuff. This is used to express how many checksums are covered by CHECKSUM_UNNECESSARY (stores n - 1). This replaces the overloading of skb->encapsulation, that field is is now only used to indicate inner headers are valid. - Change __skb_checksum_validate_needed to "consume" each checksum as indicated by csum_level as layers of the the packet are parsed. - Remove skb_pop_rcv_encapsulation, no longer needed in the new csum_level model. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-08-29 20:41:11 -07:00
Daniel Borkmann	38ab1fa981	net: sctp: fix ABI mismatch through sctp_assoc_to_state helper Since SCTP day 1, that is, 19b55a2af145 ("Initial commit") from lksctp tree, the official <netinet/sctp.h> header carries a copy of enum sctp_sstat_state that looks like (compared to the current in-kernel enumeration): User definition: Kernel definition: enum sctp_sstat_state { typedef enum { SCTP_EMPTY = 0, <removed> SCTP_CLOSED = 1, SCTP_STATE_CLOSED = 0, SCTP_COOKIE_WAIT = 2, SCTP_STATE_COOKIE_WAIT = 1, SCTP_COOKIE_ECHOED = 3, SCTP_STATE_COOKIE_ECHOED = 2, SCTP_ESTABLISHED = 4, SCTP_STATE_ESTABLISHED = 3, SCTP_SHUTDOWN_PENDING = 5, SCTP_STATE_SHUTDOWN_PENDING = 4, SCTP_SHUTDOWN_SENT = 6, SCTP_STATE_SHUTDOWN_SENT = 5, SCTP_SHUTDOWN_RECEIVED = 7, SCTP_STATE_SHUTDOWN_RECEIVED = 6, SCTP_SHUTDOWN_ACK_SENT = 8, SCTP_STATE_SHUTDOWN_ACK_SENT = 7, }; } sctp_state_t; This header was later on also placed into the uapi, so that user space programs can compile without having <netinet/sctp.h>, but the shipped with <linux/sctp.h> instead. While RFC6458 under 8.2.1.Association Status (SCTP_STATUS) says that sstat_state can range from SCTP_CLOSED to SCTP_SHUTDOWN_ACK_SENT, we nevertheless have a what it appears to be dummy SCTP_EMPTY state from the very early days. While it seems to do just nothing, commit `0b8f9e25b0` ("sctp: remove completely unsed EMPTY state") did the right thing and removed this dead code. That however, causes an off-by-one when the user asks the SCTP stack via SCTP_STATUS API and checks for the current socket state thus yielding possibly undefined behaviour in applications as they expect the kernel to tell the right thing. The enumeration had to be changed however as based on the current socket state, we access a function pointer lookup-table through this. Therefore, I think the best way to deal with this is just to add a helper function sctp_assoc_to_state() to encapsulate the off-by-one quirk. Reported-by: Tristan Su <sooqing@gmail.com> Fixes: `0b8f9e25b0` ("sctp: remove completely unsed EMPTY state") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-08-29 20:31:08 -07:00
Eric Dumazet	d9b2938aab	net: attempt a single high order allocation In commit `ed98df3361` ("net: use __GFP_NORETRY for high order allocations") we tried to address one issue caused by order-3 allocations. We still observe high latencies and system overhead in situations where compaction is not successful. Instead of trying order-3, order-2, and order-1, do a single order-3 best effort and immediately fallback to plain order-0. This mimics slub strategy to fallback to slab min order if the high order allocation used for performance failed. Order-3 allocations give a performance boost only if they can be done without recurring and expensive memory scan. Quoting David : The page allocator relies on synchronous (sync light) memory compaction after direct reclaim for allocations that don't retry and deferred compaction doesn't work with this strategy because the allocation order is always decreasing from the previous failed attempt. This means sync light compaction will always be encountered if memory cannot be defragmented or reclaimed several times during the skb_page_frag_refill() iteration. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-08-29 20:28:23 -07:00
Ying Xue	cc086fcf92	tipc: fix a potential oops Commit `6c9808ce09` ("tipc: remove port_lock") accidentally involves a potential bug: when tipc socket instance(tsk) is not got with given reference number in tipc_sk_get(), tsk is set to NULL. Subsequently we jump to exit label where to decrease socket reference counter pointed by tsk pointer in tipc_sk_put(). However, As now tsk is NULL, oops may happen because of touching a NULL pointer. Signed-off-by: Ying Xue <ying.xue@windriver.com> Acked-by: Erik Hugne <erik.hugne@ericsson.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-08-29 20:22:43 -07:00
Daniel Borkmann	10c51b5623	net: add skb_get_tx_queue() helper Replace occurences of skb_get_queue_mapping() and follow-up netdev_get_tx_queue() with an actual helper function. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-08-29 20:02:07 -07:00
Mika Westerberg	d0616613d9	net: rfkill: gpio: Add more Broadcom bluetooth ACPI IDs This adds one more ACPI ID of a Broadcom bluetooth chip. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-08-29 13:10:44 +02:00
Michal Kazior	a00f4f6e04	mac80211: fix chantype recalc warning When a device driver is unloaded local->interfaces list is cleared. If there was more than 1 interface running and connected (bound to a chanctx) then chantype recalc was called and it ended up with compat being NULL causing a call trace warning. Warn if compat becomes NULL as a result of incompatible bss_conf.chandef of interfaces bound to a given channel context only. The call trace looked like this: WARNING: CPU: 2 PID: 2594 at /devel/src/linux/net/mac80211/chan.c:557 ieee80211_recalc_chanctx_chantype+0x2cd/0x2e0() Modules linked in: ath10k_pci(-) ath10k_core ath CPU: 2 PID: 2594 Comm: rmmod Tainted: G W 3.16.0-rc1+ #150 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 0000000000000009 ffff88001ea279c0 ffffffff818dfa93 0000000000000000 ffff88001ea279f8 ffffffff810514a8 ffff88001ce09cd0 ffff88001e03cc58 0000000000000000 ffff88001ce08840 ffff88001ce09cd0 ffff88001ea27a08 Call Trace: [<ffffffff818dfa93>] dump_stack+0x4d/0x66 [<ffffffff810514a8>] warn_slowpath_common+0x78/0xa0 [<ffffffff81051585>] warn_slowpath_null+0x15/0x20 [<ffffffff818a407d>] ieee80211_recalc_chanctx_chantype+0x2cd/0x2e0 [<ffffffff818a3dda>] ? ieee80211_recalc_chanctx_chantype+0x2a/0x2e0 [<ffffffff818a4919>] ieee80211_assign_vif_chanctx+0x1a9/0x770 [<ffffffff818a6220>] __ieee80211_vif_release_channel+0x70/0x130 [<ffffffff818a6dd3>] ieee80211_vif_release_channel+0x43/0xb0 [<ffffffff81885f4e>] ieee80211_stop_ap+0x21e/0x5a0 [<ffffffff8184b9b5>] __cfg80211_stop_ap+0x85/0x520 [<ffffffff8181c188>] __cfg80211_leave+0x68/0x120 [<ffffffff8181c268>] cfg80211_leave+0x28/0x40 [<ffffffff8181c5f3>] cfg80211_netdev_notifier_call+0x373/0x6b0 [<ffffffff8107f965>] notifier_call_chain+0x55/0x110 [<ffffffff8107fa41>] raw_notifier_call_chain+0x11/0x20 [<ffffffff816a8dc0>] call_netdevice_notifiers_info+0x30/0x60 [<ffffffff816a8eb9>] __dev_close_many+0x59/0xf0 [<ffffffff816a9021>] dev_close_many+0x81/0x120 [<ffffffff816aa1c5>] rollback_registered_many+0x115/0x2a0 [<ffffffff816aa3a6>] unregister_netdevice_many+0x16/0xa0 [<ffffffff8187d841>] ieee80211_remove_interfaces+0x121/0x1b0 [<ffffffff8185e0e6>] ieee80211_unregister_hw+0x56/0x110 [<ffffffffa0011ac4>] ath10k_mac_unregister+0x14/0x60 [ath10k_core] [<ffffffffa0014fe7>] ath10k_core_unregister+0x27/0x40 [ath10k_core] [<ffffffffa003b1f4>] ath10k_pci_remove+0x44/0xa0 [ath10k_pci] [<ffffffff81373138>] pci_device_remove+0x28/0x60 [<ffffffff814cb534>] __device_release_driver+0x64/0xd0 [<ffffffff814cbcc8>] driver_detach+0xb8/0xc0 [<ffffffff814cb23a>] bus_remove_driver+0x4a/0xb0 [<ffffffff814cc697>] driver_unregister+0x27/0x50 Signed-off-by: Michal Kazior <michal.kazior@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-08-29 13:06:01 +02:00
Ying Xue	0244790c8a	xfrm: remove useless hash_resize_mutex locks In xfrm_state.c, hash_resize_mutex is defined as a local variable and only used in xfrm_hash_resize() which is declared as a work handler of xfrm.state_hash_work. But when the xfrm.state_hash_work work is put in the global workqueue(system_wq) with schedule_work(), the work will be really inserted in the global workqueue if it was not already queued, otherwise, it is still left in the same position on the the global workqueue. This means the xfrm_hash_resize() work handler is only executed once at any time no matter how many times its work is scheduled, that is, xfrm_hash_resize() is not called concurrently at all, so hash_resize_mutex is redundant for us. Cc: Christophe Gouault <christophe.gouault@6wind.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-08-29 11:40:03 +02:00
Chuck Lever	71efecb3f5	sunrpc: fix byte-swapping of displayed XID xprt_lookup_rqst() and bc_send_request() display a byte-swapped XID, but receive_cb_reply() does not. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-28 16:00:07 -04:00
J. Bruce Fields	ae89254da6	SUNRPC: Fix compile on non-x86 current_task appears to be x86-only, oops. Let's just delete this check entirely: Any developer that adds a new user without setting rq_task will get a crash the first time they test it. I also don't think there are normally any important locks held here, and I can't see any other reason why killing a server thread would bring the whole box down. So the effort to fail gracefully here looks like overkill. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Fixes: `983c684466` "SUNRPC: get rid of the request wait queue" Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-28 15:51:35 -04:00
Jesper Dangaard Brouer	d7cdb96808	treewide: fix synchronize_rcu() in comments Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2014-08-28 15:01:24 +02:00
Florian Fainelli	246d7f773c	net: dsa: add Broadcom SF2 switch driver Add support for the Broadcom Starfigther 2 switch chip using a DSA driver. This switch driver supports the following features: - configuration of the external switch port interface: MII, RevMII, RGMII and RGMII_NO_ID are supported - support for the per-port MIB counters - support for link interrupts for special ports (e.g: MoCA) - powering up/down of switch memories to conserve power when ports are unused Finally, update the compatible property for the DSA core code to match our switch top-level compatible node. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-08-27 22:59:40 -07:00
Florian Fainelli	5037d532b8	net: dsa: add Broadcom tag RX/TX handler Add support for the 4-bytes Broadcom tag that built-in switches such as the Starfighter 2 might insert when receiving packets, or that we need to insert while targetting specific switch ports. We use a fake local EtherType value for this 4-bytes switch tag: ETH_P_BRCMTAG to make sure we can assign DSA-specific network operations within the DSA drivers. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-08-27 22:59:40 -07:00
Florian Fainelli	ce31b31c68	net: dsa: allow updating fixed PHY link information Allow switch drivers to hook a PHY link update callback to perform port-specific link work. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-08-27 22:59:40 -07:00
Florian Fainelli	ec9436baed	net: dsa: allow drivers to do link adjustment Whenever libphy determines that the link status of a given PHY/port has changed, allow to call into the switch driver link adjustment callback so proper actions can be taken care of by the switch driver upon link notification. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-08-27 22:59:40 -07:00
Florian Fainelli	5aed85cec2	net: dsa: allow switches to work without tagging In case switch port tagging is disabled (voluntarily, or the switch just does not support it), allow us to continue using the defined set of dsa_device_ops in net/dsa/slave.c. We introduce dsa_protocol_is_tagged() to check whether we need to override skb->protocol and go through the DSA-specifif packet_type function, or if we just go on and receive the SKB through the normal path. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-08-27 22:59:40 -07:00
Florian Fainelli	0d8bcdd383	net: dsa: allow for more complex PHY setups Modify the DSA slave interface to be bound to an arbitray PHY, not just the ones that are available as child PHY devices of the switch MDIO bus. This allows us for instance to have external PHYs connected to a separate MDIO bus, but yet also connected to a given switch port. Under certain configurations, the physical port mask might not be a 1:1 mapping to the MII PHYs mask. This is the case, if e.g: Port 1 of the switch is used and connects to a PHY at a MDIO address different than 1. Introduce a phys_mii_mask variable which allows driver to implement and divert their own MDIO read/writes operations for a subset of the MDIO PHY addresses. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-08-27 22:59:40 -07:00
Florian Fainelli	bd47497a01	net: dsa: retain a per-port device_node pointer We will later use the per-port device_node pointer to fetch a bunch of port-specific properties. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-08-27 22:59:40 -07:00
Florian Fainelli	fa981d9af8	net: dsa: provide a switch device device tree node pointer We might need to fetch additional resources from the device tree node pointer, such as register ranges or other properties. Keep a device_node pointer around for this. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-08-27 22:59:39 -07:00
Florian Fainelli	3e8a72d1da	net: dsa: reduce number of protocol hooks DSA is currently registering one packet_type function per EtherType it needs to intercept in the receive path of a DSA-enabled Ethernet device. Right now we have three of them: trailer, DSA and eDSA, and there might be more in the future, this will not scale to the addition of new protocols. This patch proceeds with adding a new layer of abstraction and two new functions: dsa_switch_rcv() which will dispatch into the tag-protocol specific receive function implemented by net/dsa/tag_.c dsa_slave_xmit() which will dispatch into the tag-protocol specific transmit function implemented by net/dsa/tag_.c When we do create the per-port slave network devices, we iterate over the switch protocol to assign the DSA-specific receive and transmit operations. A new fake ethertype value is used: ETH_P_XDSA to illustrate the fact that this is no longer going to look like ETH_P_DSA or ETH_P_TRAILER like it used to be. This allows us to greatly simplify the check in eth_type_trans() and always override the skb->protocol with ETH_P_XDSA for Ethernet switches tagged protocol, while also reducing the number repetitive slave netdevice_ops assignments. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-08-27 22:59:39 -07:00
Julian Anastasov	eb90b0c734	ipvs: fix ipv6 hook registration for local replies commit `fc60476761` ("ipvs: changes for local real server") from 2.6.37 introduced DNAT support to local real server but the IPv6 LOCAL_OUT handler ip_vs_local_reply6() is registered incorrectly as IPv4 hook causing any outgoing IPv4 traffic to be dropped depending on the IP header values. Chris tracked down the problem to CONFIG_IP_VS_IPV6=y Bug report: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1349768 Reported-by: Chris J Arges <chris.j.arges@canonical.com> Tested-by: Chris J Arges <chris.j.arges@canonical.com> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2014-08-28 10:52:37 +09:00
Florian Westphal	253ff51635	tcp: syncookies: mark cookie_secret read_mostly only written once. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-08-27 16:30:49 -07:00
Andreea-Cristina Bernat	2688eba9d5	mac80211: Replace rcu_dereference() with rcu_access_pointer() The "rcu_dereference()" calls are used directly in conditions. Since their return values are never dereferenced it is recommended to use "rcu_access_pointer()" instead of "rcu_dereference()". Therefore, this patch makes the replacements. The following Coccinelle semantic patch was used: @@ @@ ( if( (<+... - rcu_dereference + rcu_access_pointer (...) ...+>)) {...} \| while( (<+... - rcu_dereference + rcu_access_pointer (...) ...+>)) {...} ) Signed-off-by: Andreea-Cristina Bernat <bernat.ada@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-08-27 12:14:10 +02:00
Julian Anastasov	ea1d5d7755	ipvs: properly declare tunnel encapsulation The tunneling method should properly use tunnel encapsulation. Fixes problem with CHECKSUM_PARTIAL packets when TCP/UDP csum offload is supported. Thanks to Alex Gartrell for reporting the problem, providing solution and for all suggestions. Reported-by: Alex Gartrell <agartrell@fb.com> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Alex Gartrell <agartrell@fb.com> Signed-off-by: Simon Horman <horms@verge.net.au>	2014-08-27 14:31:56 +09:00
Alexey Perevalov	f111f780ae	netfilter: nfnetlink_acct: add filter support to nfacct counter list/reset You can use this to skip accounting objects when listing/resetting via NFNL_MSG_ACCT_GET/NFNL_MSG_ACCT_GET_CTRZERO messages with the NLM_F_DUMP netlink flag. The filtering covers the following cases: 1. No filter specified. In this case, the client will get old behaviour, 2. List/reset counter object only: In this case, you have to use NFACCT_F_QUOTA as mask and value 0. 3. List/reset quota objects only: You have to use NFACCT_F_QUOTA_PKTS as mask and value - the same, for byte based quota mask should be NFACCT_F_QUOTA_BYTES and value - the same. If you want to obtain the object with any quota type (ie. NFACCT_F_QUOTA_PKTS\|NFACCT_F_QUOTA_BYTES), you need to perform two dump requests, one to obtain NFACCT_F_QUOTA_PKTS objects and another for NFACCT_F_QUOTA_BYTES. Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-08-26 21:36:19 +02:00
Christoph Lameter	903ceff7ca	net: Replace get_cpu_var through this_cpu_ptr Replace uses of get_cpu_var for address calculation through this_cpu_ptr. Cc: netdev@vger.kernel.org Cc: Eric Dumazet <edumazet@google.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2014-08-26 13:45:47 -04:00
Andreea-Cristina Bernat	ad053a962f	mac80211: scan: Replace rcu_assign_pointer() with RCU_INIT_POINTER() The use of "rcu_assign_pointer()" is NULLing out the pointer. According to RCU_INIT_POINTER()'s block comment: "1. This use of RCU_INIT_POINTER() is NULLing out the pointer" it is better to use it instead of rcu_assign_pointer() because it has a smaller overhead. The following Coccinelle semantic patch was used: @@ @@ - rcu_assign_pointer + RCU_INIT_POINTER (..., NULL) Signed-off-by: Andreea-Cristina Bernat <bernat.ada@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-08-26 11:16:31 +02:00
Johannes Berg	5bc8c1f2b0	cfg80211: allow passing frame type to cfg80211_inform_bss() When using the cfg80211_inform_bss[_width]() functions drivers cannot currently indicate whether the data was received in a beacon or probe response. Fix that by passing a new enum that indicates such (or unknown). For good measure, use it in ath6kl. Acked-by: Kalle Valo <kvalo@qca.qualcomm.com> [ath6kl] Acked-by: Arend van Spriel <arend@broadcom.com> [brcmfmac] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-08-26 11:16:02 +02:00
Johannes Berg	0e227084ae	cfg80211: clarify BSS probe response vs. beacon data There are a few possible cases of where BSS data came from: 1) only a beacon has been received 2) only a probe response has been received 3) the driver didn't report what it received (this happens when using cfg80211_inform_bss[_width]()) 4) both probe response and beacon data has been received Unfortunately, in the userspace API, a few things weren't there: a) there was no way to differentiate cases 1) and 4) above without comparing the data of the IEs b) the TSF was always from the last frame, instead of being exposed for beacon/probe response separately like IEs Fix this by i) exporting a new flag attribute that indicates whether or not probe response data has been received - this addresses (a) ii) exporting a BEACON_TSF attribute that holds the beacon's TSF if a beacon has been received iii) not exporting the beacon attributes in case (3) above as that would just lead userspace into thinking the data actually came from a beacon when that isn't clear To implement this, track inside the IEs struct whether or not it (definitely) came from a beacon. Reported-by: William Seto Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-08-26 11:16:01 +02:00
Michal Kazior	f41ef64853	cfg80211: re-enable CSA for drivers that support it This reverts commit `dda444d524`. Channel switching code has been reworked and improved significantly since the time original locking issues were found. Signed-off-by: Michal Kazior <michal.kazior@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-08-26 11:16:01 +02:00
Ido Yariv	c70f59a2a0	mac80211: don't resize skbs needlessly Header-less cloned skbs with sufficient headroom need not be cloned unless the tailroom is going to be modified. Fix ieee80211_skb_resize so it would only resize cloned skbs if either the header isn't released or the tailroom is going to be modified. Some drivers might have assumed that skbs are never cloned, so add a HW flag that explicitly permits cloned TX skbs. Drivers which do not modify TX skbs should set this flag to avoid copying skbs. Signed-off-by: Ido Yariv <idox.yariv@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-08-26 11:16:00 +02:00
Ido Yariv	ca34e3b5c8	mac80211: Fix accounting of the tailroom-needed counter When hw acceleration is enabled, the GENERATE_IV or PUT_IV_SPACE flags will only require headroom space. Consequently, the tailroom-needed counter can safely be decremented. Signed-off-by: Ido Yariv <idox.yariv@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-08-26 11:15:59 +02:00
Vladimir Kondratiev	970fdfa89b	cfg80211: remove @gfp parameter from cfg80211_rx_mgmt() In the cfg80211_rx_mgmt(), parameter @gfp was used for the memory allocation. But, memory get allocated under spin_lock_bh(), this implies atomic context. So, one can't use GFP_KERNEL, only variants with no __GFP_WAIT. Actually, in all occurrences GFP_ATOMIC is used (wil6210 use GFP_KERNEL by mistake), and it should be this way or warning triggered in the memory allocation code. Remove @gfp parameter as no actual choice exist, and use hard coded GFP_ATOMIC for memory allocation. Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-08-26 11:15:58 +02:00
Johannes Berg	649b2a4da5	mac80211: make ieee80211_vif_use_reserved_switch static Reorder some code to make ieee80211_vif_use_reserved_switch() static, no other changes. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-08-26 11:15:35 +02:00
Bob Copeland	f8134fed83	mac80211: mesh_plink: use get_unaligned_le16 instead of memcpy Use get_unaligned_le16 to access llid/plid. Signed-off-by: Bob Copeland <me@bobcopeland.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-08-26 11:15:34 +02:00

... 23 24 25 26 27 ...

36499 Commits