Commit Graph

3157 Commits

Author SHA1 Message Date
Daniel Borkmann
7a4b28c6cc bpf: add helper to invalidate hash
Add a small helper that complements 36bbef52c7 ("bpf: direct packet
write and access for helpers for clsact progs") for invalidating the
current skb->hash after mangling on headers via direct packet write.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-23 08:40:28 -04:00
Eric Dumazet
fefa569a9d net_sched: sch_fq: account for schedule/timers drifts
It looks like the following patch can make FQ very precise, even in VM
or stressed hosts. It matters at high pacing rates.

We take into account the difference between the time that was programmed
when last packet was sent, and current time (a drift of tens of usecs is
often observed)

Add an EWMA of the unthrottle latency to help diagnostics.

This latency is the difference between current time and oldest packet in
delayed RB-tree. This accounts for the high resolution timer latency,
but can be different under stress, as fq_check_throttled() can be
opportunistically be called from a dequeue() called after an enqueue()
for a different flow.

Tested:
// Start a 10Gbit flow
$ netperf --google-pacing-rate 1250000000 -H lpaa24 -l 10000 -- -K bbr &

Before patch :
$ sar -n DEV 10 5 | grep eth0 | grep Average
Average:         eth0  17106.04 756876.84   1102.75 1119049.02      0.00      0.00      0.52

After patch :
$ sar -n DEV 10 5 | grep eth0 | grep Average
Average:         eth0  17867.00 800245.90   1151.77 1183172.12      0.00      0.00      0.52

A new iproute2 tc can output the 'unthrottle latency' :

$ tc -s qd sh dev eth0 | grep latency
  0 gc, 0 highprio, 32490767 throttled, 2382 ns latency

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-23 07:19:06 -04:00
Liping Zhang
8061bb5443 netfilter: nft_queue: add _SREG_QNUM attr to select the queue number
Currently, the user can specify the queue numbers by _QUEUE_NUM and
_QUEUE_TOTAL attributes, this is enough in most situations.

But acctually, it is not very flexible, for example:
  tcp dport 80 mapped to queue0
  tcp dport 81 mapped to queue1
  tcp dport 82 mapped to queue2
In order to do this thing, we must add 3 nft rules, and more
mapping meant more rules ...

So take one register to select the queue number, then we can add one
simple rule to mapping queues, maybe like this:
  queue num tcp dport map { 80:0, 81:1, 82:2 ... }

Florian Westphal also proposed wider usage scenarios:
  queue num jhash ip saddr . ip daddr mod ...
  queue num meta cpu ...
  queue num meta mark ...

The last point is how to load a queue number from sreg, although we can
use *(u16*)&regs->data[reg] to load the queue number, just like nat expr
to load its l4port do.

But we will cooperate with hash expr, meta cpu, meta mark expr and so on.
They all store the result to u32 type, so cast it to u16 pointer and
dereference it will generate wrong result in the big endian system.

So just keep it simple, we treat queue number as u32 type, although u16
type is already enough.

Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-09-23 09:29:50 +02:00
Andrey Vagin
a7306ed8d9 nsfs: add ioctl to get a parent namespace
Pid and user namepaces are hierarchical. There is no way to discover
parent-child relationships.

In a future we will use this interface to dump and restore nested
namespaces.

Acked-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Andrei Vagin <avagin@openvz.org>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2016-09-22 19:59:41 -05:00
Andrey Vagin
6786741dbf nsfs: add ioctl to get an owning user namespace for ns file descriptor
Each namespace has an owning user namespace and now there is not way
to discover these relationships.

Understending namespaces relationships allows to answer the question:
what capability does process X have to perform operations on a resource
governed by namespace Y?

After a long discussion, Eric W. Biederman proposed to use ioctl-s for
this purpose.

The NS_GET_USERNS ioctl returns a file descriptor to an owning user
namespace.
It returns EPERM if a target namespace is outside of a current user
namespace.

v2: rename parent to relative

v3: Add a missing mntput when returning -EAGAIN --EWB

Acked-by: Serge Hallyn <serge@hallyn.com>
Link: https://lkml.org/lkml/2016/7/6/158
Signed-off-by: Andrei Vagin <avagin@openvz.org>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2016-09-22 19:59:40 -05:00
Laura Garcia Liebana
2b03bf7324 netfilter: nft_numgen: add number generation offset
Add support of an offset value for incremental counter and random. With
this option the sysadmin is able to start the counter to a certain value
and then apply the generated number.

Example:

	meta mark set numgen inc mod 2 offset 100

This will generate marks with the serie 100, 101, 100, 101, ...

Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Laura Garcia Liebana <nevola@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-09-22 16:33:05 +02:00
Shmulik Ladkani
45a497f2d1 net/sched: act_vlan: Introduce TCA_VLAN_ACT_MODIFY vlan action
TCA_VLAN_ACT_MODIFY allows one to change an existing tag.

It accepts same attributes as TCA_VLAN_ACT_PUSH (protocol, id,
priority).
If packet is vlan tagged, then the tag gets overwritten according to
user specified attributes.

For example, this allows user to replace a tag's vid while preserving
its priority bits (as opposed to "action vlan pop pipe action vlan push").

Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-22 01:34:20 -04:00
Jakub Kicinski
0d01d45f1b net: cls_bpf: limit hardware offload by software-only flag
Add cls_bpf support for the TCA_CLS_FLAGS_SKIP_HW flag.
Unlike U32 and flower cls_bpf already has some netlink
flags defined.  Create a new attribute to be able to use
the same flag values as the above.

Unlike U32 and flower reject unknown flags.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-21 19:50:02 -04:00
Neal Cardwell
0f8782ea14 tcp_bbr: add BBR congestion control
This commit implements a new TCP congestion control algorithm: BBR
(Bottleneck Bandwidth and RTT). A detailed description of BBR will be
published in ACM Queue, Vol. 14 No. 5, September-October 2016, as
"BBR: Congestion-Based Congestion Control".

BBR has significantly increased throughput and reduced latency for
connections on Google's internal backbone networks and google.com and
YouTube Web servers.

BBR requires only changes on the sender side, not in the network or
the receiver side. Thus it can be incrementally deployed on today's
Internet, or in datacenters.

The Internet has predominantly used loss-based congestion control
(largely Reno or CUBIC) since the 1980s, relying on packet loss as the
signal to slow down. While this worked well for many years, loss-based
congestion control is unfortunately out-dated in today's networks. On
today's Internet, loss-based congestion control causes the infamous
bufferbloat problem, often causing seconds of needless queuing delay,
since it fills the bloated buffers in many last-mile links. On today's
high-speed long-haul links using commodity switches with shallow
buffers, loss-based congestion control has abysmal throughput because
it over-reacts to losses caused by transient traffic bursts.

In 1981 Kleinrock and Gale showed that the optimal operating point for
a network maximizes delivered bandwidth while minimizing delay and
loss, not only for single connections but for the network as a
whole. Finding that optimal operating point has been elusive, since
any single network measurement is ambiguous: network measurements are
the result of both bandwidth and propagation delay, and those two
cannot be measured simultaneously.

While it is impossible to disambiguate any single bandwidth or RTT
measurement, a connection's behavior over time tells a clearer
story. BBR uses a measurement strategy designed to resolve this
ambiguity. It combines these measurements with a robust servo loop
using recent control systems advances to implement a distributed
congestion control algorithm that reacts to actual congestion, not
packet loss or transient queue delay, and is designed to converge with
high probability to a point near the optimal operating point.

In a nutshell, BBR creates an explicit model of the network pipe by
sequentially probing the bottleneck bandwidth and RTT. On the arrival
of each ACK, BBR derives the current delivery rate of the last round
trip, and feeds it through a windowed max-filter to estimate the
bottleneck bandwidth. Conversely it uses a windowed min-filter to
estimate the round trip propagation delay. The max-filtered bandwidth
and min-filtered RTT estimates form BBR's model of the network pipe.

Using its model, BBR sets control parameters to govern sending
behavior. The primary control is the pacing rate: BBR applies a gain
multiplier to transmit faster or slower than the observed bottleneck
bandwidth. The conventional congestion window (cwnd) is now the
secondary control; the cwnd is set to a small multiple of the
estimated BDP (bandwidth-delay product) in order to allow full
utilization and bandwidth probing while bounding the potential amount
of queue at the bottleneck.

When a BBR connection starts, it enters STARTUP mode and applies a
high gain to perform an exponential search to quickly probe the
bottleneck bandwidth (doubling its sending rate each round trip, like
slow start). However, instead of continuing until it fills up the
buffer (i.e. a loss), or until delay or ACK spacing reaches some
threshold (like Hystart), it uses its model of the pipe to estimate
when that pipe is full: it estimates the pipe is full when it notices
the estimated bandwidth has stopped growing. At that point it exits
STARTUP and enters DRAIN mode, where it reduces its pacing rate to
drain the queue it estimates it has created.

Then BBR enters steady state. In steady state, PROBE_BW mode cycles
between first pacing faster to probe for more bandwidth, then pacing
slower to drain any queue that created if no more bandwidth was
available, and then cruising at the estimated bandwidth to utilize the
pipe without creating excess queue. Occasionally, on an as-needed
basis, it sends significantly slower to probe for RTT (PROBE_RTT
mode).

BBR has been fully deployed on Google's wide-area backbone networks
and we're experimenting with BBR on Google.com and YouTube on a global
scale.  Replacing CUBIC with BBR has resulted in significant
improvements in network latency and application (RPC, browser, and
video) metrics. For more details please refer to our upcoming ACM
Queue publication.

Example performance results, to illustrate the difference between BBR
and CUBIC:

Resilience to random loss (e.g. from shallow buffers):
  Consider a netperf TCP_STREAM test lasting 30 secs on an emulated
  path with a 10Gbps bottleneck, 100ms RTT, and 1% packet loss
  rate. CUBIC gets 3.27 Mbps, and BBR gets 9150 Mbps (2798x higher).

Low latency with the bloated buffers common in today's last-mile links:
  Consider a netperf TCP_STREAM test lasting 120 secs on an emulated
  path with a 10Mbps bottleneck, 40ms RTT, and 1000-packet bottleneck
  buffer. Both fully utilize the bottleneck bandwidth, but BBR
  achieves this with a median RTT 25x lower (43 ms instead of 1.09
  secs).

Our long-term goal is to improve the congestion control algorithms
used on the Internet. We are hopeful that BBR can help advance the
efforts toward this goal, and motivate the community to do further
research.

Test results, performance evaluations, feedback, and BBR-related
discussions are very welcome in the public e-mail list for BBR:

  https://groups.google.com/forum/#!forum/bbr-dev

NOTE: BBR *must* be used with the fq qdisc ("man tc-fq") with pacing
enabled, since pacing is integral to the BBR design and
implementation. BBR without pacing would not function properly, and
may incur unnecessary high packet loss rates.

Signed-off-by: Van Jacobson <vanj@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-21 00:23:01 -04:00
Yuchung Cheng
eb8329e0a0 tcp: export data delivery rate
This commit export two new fields in struct tcp_info:

  tcpi_delivery_rate: The most recent goodput, as measured by
    tcp_rate_gen(). If the socket is limited by the sending
    application (e.g., no data to send), it reports the highest
    measurement instead of the most recent. The unit is bytes per
    second (like other rate fields in tcp_info).

  tcpi_delivery_rate_app_limited: A boolean indicating if the goodput
    was measured when the socket's throughput was limited by the
    sending application.

This delivery rate information can be useful for applications that
want to know the current throughput the TCP connection is seeing,
e.g. adaptive bitrate video streaming. It can also be very useful for
debugging or troubleshooting.

Signed-off-by: Van Jacobson <vanj@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-21 00:23:00 -04:00
Eric Dumazet
77879147a3 net_sched: sch_fq: add low_rate_threshold parameter
This commit adds to the fq module a low_rate_threshold parameter to
insert a delay after all packets if the socket requests a pacing rate
below the threshold.

This helps achieve more precise control of the sending rate with
low-rate paths, especially policers. The basic issue is that if a
congestion control module detects a policer at a certain rate, it may
want fq to be able to shape to that policed rate. That way the sender
can avoid policer drops by having the packets arrive at the policer at
or just under the policed rate.

The default threshold of 550Kbps was chosen analytically so that for
policers or links at 500Kbps or 512Kbps fq would very likely invoke
this mechanism, even if the pacing rate was briefly slightly above the
available bandwidth. This value was then empirically validated with
two years of production testing on YouTube video servers.

Signed-off-by: Van Jacobson <vanj@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-21 00:23:00 -04:00
Daniel Borkmann
36bbef52c7 bpf: direct packet write and access for helpers for clsact progs
This work implements direct packet access for helpers and direct packet
write in a similar fashion as already available for XDP types via commits
4acf6c0b84 ("bpf: enable direct packet data write for xdp progs") and
6841de8b0d ("bpf: allow helpers access the packet directly"), and as a
complementary feature to the already available direct packet read for tc
(cls/act) programs.

For enabling this, we need to introduce two helpers, bpf_skb_pull_data()
and bpf_csum_update(). The first is generally needed for both, read and
write, because they would otherwise only be limited to the current linear
skb head. Usually, when the data_end test fails, programs just bail out,
or, in the direct read case, use bpf_skb_load_bytes() as an alternative
to overcome this limitation. If such data sits in non-linear parts, we
can just pull them in once with the new helper, retest and eventually
access them.

At the same time, this also makes sure the skb is uncloned, which is, of
course, a necessary condition for direct write. As this needs to be an
invariant for the write part only, the verifier detects writes and adds
a prologue that is calling bpf_skb_pull_data() to effectively unclone the
skb from the very beginning in case it is indeed cloned. The heuristic
makes use of a similar trick that was done in 233577a220 ("net: filter:
constify detection of pkt_type_offset"). This comes at zero cost for other
programs that do not use the direct write feature. Should a program use
this feature only sparsely and has read access for the most parts with,
for example, drop return codes, then such write action can be delegated
to a tail called program for mitigating this cost of potential uncloning
to a late point in time where it would have been paid similarly with the
bpf_skb_store_bytes() as well. Advantage of direct write is that the
writes are inlined whereas the helper cannot make any length assumptions
and thus needs to generate a call to memcpy() also for small sizes, as well
as cost of helper call itself with sanity checks are avoided. Plus, when
direct read is already used, we don't need to cache or perform rechecks
on the data boundaries (due to verifier invalidating previous checks for
helpers that change skb->data), so more complex programs using rewrites
can benefit from switching to direct read plus write.

For direct packet access to helpers, we save the otherwise needed copy into
a temp struct sitting on stack memory when use-case allows. Both facilities
are enabled via may_access_direct_pkt_data() in verifier. For now, we limit
this to map helpers and csum_diff, and can successively enable other helpers
where we find it makes sense. Helpers that definitely cannot be allowed for
this are those part of bpf_helper_changes_skb_data() since they can change
underlying data, and those that write into memory as this could happen for
packet typed args when still cloned. bpf_csum_update() helper accommodates
for the fact that we need to fixup checksum_complete when using direct write
instead of bpf_skb_store_bytes(), meaning the programs can use available
helpers like bpf_csum_diff(), and implement csum_add(), csum_sub(),
csum_block_add(), csum_block_sub() equivalents in eBPF together with the
new helper. A usage example will be provided for iproute2's examples/bpf/
directory.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-20 23:32:11 -04:00
Emilio López
823d1bc108 dma-buf/sync_file: fix documentation error
The ioctl name and description on the documentation block don't
match the ioctl being defined. This was probably overlooked while
renaming the ioctls during the sync file destaging. This patch
provides a more accurate description of what the ioctl actually does.

Signed-off-by: Emilio López <emilio.lopez@collabora.co.uk>
Reviewed-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20160919042120.6280-1-emilio.lopez@collabora.co.uk
2016-09-20 18:12:38 +05:30
Jamal Hadi Salim
408fbc22ef net sched ife action: Introduce skb tcindex metadata encap decap
Sample use case of how this is encoded:
user space via tuntap (or a connected VM/Machine/container)
encodes the tcindex TLV.

Sample use case of decoding:
IFE action decodes it and the skb->tc_index is then used to classify.
So something like this for encoded ICMP packets:

.. first decode then reclassify... skb->tcindex will be set
sudo $TC filter add dev $ETH parent ffff: prio 2 protocol 0xbeef \
u32 match u32 0 0 flowid 1:1 \
action ife decode reclassify

...next match the decode icmp packet...
sudo $TC filter add dev $ETH parent ffff: prio 4 protocol ip \
u32 match ip protocol 1 0xff flowid 1:1 \
action continue

... last classify it using the tcindex classifier and do someaction..
sudo $TC filter add dev $ETH parent ffff: prio 5 protocol ip \
handle 0x11 tcindex classid 1:1 \
action blah..

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 21:55:28 -04:00
Charles-Antoine Couret
7389e6ef34 [media] SDI: add flag for SDI formats and SMPTE 125M definition
Adding others generic flags, which could be used by many
components like GS1662.

Signed-off-by: Charles-Antoine Couret <charles-antoine.couret@nexvision.fr>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-09-19 14:34:41 -03:00
Mahesh Bandewar
4fbae7d83c ipvlan: Introduce l3s mode
In a typical IPvlan L3 setup where master is in default-ns and
each slave is into different (slave) ns. In this setup egress
packet processing for traffic originating from slave-ns will
hit all NF_HOOKs in slave-ns as well as default-ns. However same
is not true for ingress processing. All these NF_HOOKs are
hit only in the slave-ns skipping them in the default-ns.
IPvlan in L3 mode is restrictive and if admins want to deploy
iptables rules in default-ns, this asymmetric data path makes it
impossible to do so.

This patch makes use of the l3_rcv() (added as part of l3mdev
enhancements) to perform input route lookup on RX packets without
changing the skb->dev and then uses nf_hook at NF_INET_LOCAL_IN
to change the skb->dev just before handing over skb to L4.

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
CC: David Ahern <dsa@cumulusnetworks.com>
Reviewed-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 01:25:22 -04:00
Nogah Frankel
69ae6ad2ff net: core: Add offload stats to if_stats_msg
Add a nested attribute of offload stats to if_stats_msg
named IFLA_STATS_LINK_OFFLOAD_XSTATS.
Under it, add SW stats, meaning stats only per packets that went via
slowpath to the cpu, named IFLA_OFFLOAD_XSTATS_CPU_HIT.

Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-18 22:33:42 -04:00
Alexei Starovoitov
cfc7381b30 ip_tunnel: add collect_md mode to IPIP tunnel
Similar to gre, vxlan, geneve tunnels allow IPIP tunnels to
operate in 'collect metadata' mode.
bpf_skb_[gs]et_tunnel_key() helpers can make use of it right away.
ovs can use it as well in the future (once appropriate ovs-vport
abstractions and user apis are added).
Note that just like in other tunnels we cannot cache the dst,
since tunnel_info metadata can be different for every packet.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-17 10:13:07 -04:00
Miklos Szeredi
c568d68341 locks: fix file locking on overlayfs
This patch allows flock, posix locks, ofd locks and leases to work
correctly on overlayfs.

Instead of using the underlying inode for storing lock context use the
overlay inode.  This allows locks to be persistent across copy-up.

This is done by introducing locks_inode() helper and using it instead of
file_inode() to get the inode in locking code.  For non-overlayfs the two
are equivalent, except for an extra pointer dereference in locks_inode().

Since lock operations are in "struct file_operations" we must also make
sure not to call underlying filesystem's lock operations.  Introcude a
super block flag MS_NOREMOTELOCK to this effect.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Acked-by: Jeff Layton <jlayton@poochiereds.net>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
2016-09-16 12:44:20 +02:00
Or Gerlitz
37a6c15123 net/sched: cls_flower: Specify vlan attributes format in the UAPI header
Specify the format (size and endianess) for the vlan attributes.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-15 20:27:23 -04:00
Or Gerlitz
aa72d70837 net/sched: cls_flower: Support masking for matching on tcp/udp ports
Add the definitions for src/dst udp/tcp port masks and use
them when setting && dumping the relevant keys.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-15 20:27:23 -04:00
Jamal Hadi Salim
86da71b573 net_sched: Introduce skbmod action
This action is intended to be an upgrade from a usability perspective
from pedit (as well as operational debugability).
Compare this:

sudo tc filter add dev $ETH parent 1: protocol ip prio 10 \
u32 match ip protocol 1 0xff flowid 1:2 \
action pedit munge offset -14 u8 set 0x02 \
munge offset -13 u8 set 0x15 \
munge offset -12 u8 set 0x15 \
munge offset -11 u8 set 0x15 \
munge offset -10 u16 set 0x1515 \
pipe

to:

sudo tc filter add dev $ETH parent 1: protocol ip prio 10 \
u32 match ip protocol 1 0xff flowid 1:2 \
action skbmod dmac 02:15:15:15:15:15

Also try to do a MAC address swap with pedit or worse
try to debug a policy with destination mac, source mac and
etherype. Then make few rules out of those and you'll get my point.

In the future common use cases on pedit can be migrated to this action
(as an example different fields in ip v4/6, transports like tcp/udp/sctp
etc). For this first cut, this allows modifying basic ethernet header.

The most important ethernet use case at the moment is when redirecting or
mirroring packets to a remote machine. The dst mac address needs a re-write
so that it doesnt get dropped or confuse an interconnecting (learning) switch
or dropped by a target machine (which looks at the dst mac). And at times
when flipping back the packet a swap of the MAC addresses is needed.

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-15 19:33:47 -04:00
Greg Kroah-Hartman
8152263748 usb: patches for v4.9 merge window
This time around we have 92 non-merge commits. Most
 of the changes are in drivers/usb/gadget (40.3%)
 with drivers/usb/gadget/function being the most
 active directory (27.2%).
 
 As for UDC drivers, only dwc3 (26.5%) and dwc2
 (12.7%) have really been active.
 
 The most important changes for dwc3 are better
 support for scatterlist and, again, throughput
 improvements. While on dwc2 got some minor stability
 fixes related to soft reset and FIFO usage.
 
 Felipe Tonello has done some good work fixing up our
 f_midi gadget and Tal Shorer has implemented a nice
 API change for our ULPI bus.
 
 Apart from these, we have our usual set of
 non-critical fixes, spelling fixes, build warning
 fixes, etc.
 -----BEGIN PGP SIGNATURE-----
 
 iQI6BAABCAAkBQJX2TpXHRxmZWxpcGUuYmFsYmlAbGludXguaW50ZWwuY29tAAoJ
 EMy+uJnhGpkGxX0QAIOavB96wkAP4msMzCMIKyKX8NBVWEYzLy7Ou6IrPKiGOR28
 CjDi1C5qW7838H4neA6Gfw896rfTiAODhoiOY/RTXI7p2hTUUXHQuJ81Bad75gHD
 744BUMPy37YJnvgHTasYn0GxAvP73YmV+omRxo76poetYZ9eH8dGECvC9q6m+jRU
 XaubWEq1JMvzHvlyO7BIrndGY4ByRbBoG0XPiZF07e5YDkKWQmv56tgAAN7fEkeh
 8HIg8lG2xvgf+w6cDbrQ2c8fp055OvrOq40R2pSXwQgYYKXPJ+vFiNzriQ6Rfxai
 gIYrB+mrKZcY6mi6OhoulGfNxT65VqMqnUfwVbbwlJQbDe5EkV6o/1WYdaBvdO2s
 qTT9A5alabFzbQ8ZtjzsIHtV62LwmZlMWk7gxZlcvLFNjf/P2CMqqnJi30/JlrsE
 iqhwIGRDhMq4QZZbiiEiJEaEn6vh2zseRdmCy3uMFearXKBP/I2177QOTDG7ZMKf
 fZR4ROlv6c5tIpBCOsTV0+7c/fnnnOTHU4+vJiUzU0krkPzaLcL8iMT1tn+uGchX
 4d2XLuT6AbVxQR4N8YF4FwRzB/PbEb+ZWWGu1mOVSd9/dsA43K50zNdc061dgz8K
 q8lau6bmtfUXdbeWa3WMEaAZIuSBmFarJY0tPZV6W7cXUAgKitThRD6fp4E0
 =vTFa
 -----END PGP SIGNATURE-----

Merge tag 'usb-for-v4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb into usb-next

Felipe writes:

usb: patches for v4.9 merge window

This time around we have 92 non-merge commits. Most
of the changes are in drivers/usb/gadget (40.3%)
with drivers/usb/gadget/function being the most
active directory (27.2%).

As for UDC drivers, only dwc3 (26.5%) and dwc2
(12.7%) have really been active.

The most important changes for dwc3 are better
support for scatterlist and, again, throughput
improvements. While on dwc2 got some minor stability
fixes related to soft reset and FIFO usage.

Felipe Tonello has done some good work fixing up our
f_midi gadget and Tal Shorer has implemented a nice
API change for our ULPI bus.

Apart from these, we have our usual set of
non-critical fixes, spelling fixes, build warning
fixes, etc.
2016-09-14 20:37:50 +02:00
Florian Westphal
8e8118f893 netfilter: conntrack: remove packet hotpath stats
These counters sit in hot path and do show up in perf, this is especially
true for 'found' and 'searched' which get incremented for every packet
processed.

Information like

searched=212030105
new=623431
found=333613
delete=623327

does not seem too helpful nowadays:

- on busy systems found and searched will overflow every few hours
(these are 32bit integers), other more busy ones every few days.

- for debugging there are better methods, such as iptables' trace target,
the conntrack log sysctls.  Nowadays we also have perf tool.

This removes packet path stat counters except those that
are expected to be 0 (or close to 0) on a normal system, e.g.
'insert_failed' (race happened) or 'invalid' (proto tracker rejects).

The insert stat is retained for the ctnetlink case.
The found stat is retained for the tuple-is-taken check when NAT has to
determine if it needs to pick a different source address.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-09-12 19:59:39 +02:00
Pablo Neira Ayuso
dbd2be0646 netfilter: nft_dynset: allow to invert match criteria
The dynset expression matches if we can fit a new entry into the set.
If there is no room for it, then it breaks the rule evaluation.

This patch introduces the inversion flag so you can add rules to
explicitly drop packets that don't fit into the set. For example:

 # nft filter input flow table xyz size 4 { ip saddr timeout 120s counter } overflow drop

This is useful to provide a replacement for connlimit.

For the rule above, every new entry uses the IPv4 address as key in the
set, this entry gets a timeout of 120 seconds that gets refresh on every
packet seen. If we get new flow and our set already contains 4 entries
already, then this packet is dropped.

You can already express this in positive logic, assuming default policy
to drop:

 # nft filter input flow table xyz size 4 { ip saddr timeout 10s counter } accept

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-09-12 18:49:50 +02:00
Laura Garcia Liebana
70ca767ea1 netfilter: nft_hash: Add hash offset value
Add support to pass through an offset to the hash value. With this
feature, the sysadmin is able to generate a hash with a given
offset value.

Example:

	meta mark set jhash ip saddr mod 2 seed 0xabcd offset 100

This option generates marks according to the source address from 100 to
101.

Signed-off-by: Laura Garcia Liebana <nevola@gmail.com>
2016-09-12 18:37:12 +02:00
Amir Vadai
d0f6dd8a91 net/sched: Introduce act_tunnel_key
This action could be used before redirecting packets to a shared tunnel
device, or when redirecting packets arriving from a such a device.

The action will release the metadata created by the tunnel device
(decap), or set the metadata with the specified values for encap
operation.

For example, the following flower filter will forward all ICMP packets
destined to 11.11.11.2 through the shared vxlan device 'vxlan0'. Before
redirecting, a metadata for the vxlan tunnel is created using the
tunnel_key action and it's arguments:

$ tc filter add dev net0 protocol ip parent ffff: \
    flower \
      ip_proto 1 \
      dst_ip 11.11.11.2 \
    action tunnel_key set \
      src_ip 11.11.0.1 \
      dst_ip 11.11.0.2 \
      id 11 \
    action mirred egress redirect dev vxlan0

Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Reviewed-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-10 20:53:56 -07:00
Amir Vadai
bc3103f1ed net/sched: cls_flower: Classify packet in ip tunnels
Introduce classifying by metadata extracted by the tunnel device.
Outer header fields - source/dest ip and tunnel id, are extracted from
the metadata when classifying.

For example, the following will add a filter on the ingress Qdisc of shared
vxlan device named 'vxlan0'. To forward packets with outer src ip
11.11.0.2, dst ip 11.11.0.1 and tunnel id 11. The packets will be
forwarded to tap device 'vnet0' (after metadata is released):

$ tc filter add dev vxlan0 protocol ip parent ffff: \
    flower \
      enc_src_ip 11.11.0.2 \
      enc_dst_ip 11.11.0.1 \
      enc_key_id 11 \
      dst_ip 11.11.11.1 \
    action tunnel_key release \
    action mirred egress redirect dev vnet0

The action tunnel_key, will be introduced in the next patch in this
series.

Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-10 20:53:55 -07:00
Sakari Ailus
82dec0a7db [media] media: Add 1X16 16-bit raw bayer media bus code definitions
The codes will be called:

	MEDIA_BUS_FMT_SBGGR16_1X16
	MEDIA_BUS_FMT_SGBRG16_1X16
	MEDIA_BUS_FMT_SGRBG16_1X16
	MEDIA_BUS_FMT_SRGGB16_1X16

Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-09-09 11:16:36 -03:00
Jouni Ukkonen
cfe4135994 [media] media: Add 1X14 14-bit raw bayer media bus code definitions
The codes will be called:

	MEDIA_BUS_FMT_SBGGR14_1X14
	MEDIA_BUS_FMT_SGBRG14_1X14
	MEDIA_BUS_FMT_SGRBG14_1X14
	MEDIA_BUS_FMT_SRGGB14_1X14

Signed-off-by: Jouni Ukkonen <jouni.ukkonen@intel.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-09-09 11:15:50 -03:00
Mauro Carvalho Chehab
848d10314b [media] fix broken references on dvb/video*rst
Trivially fix those broken references, by copying the structs
fron the header, just like other API documentation at the
DVB side.

This doesn't have the level of quality used at the V4L2 side
of the API, but, as this documents a deprecated API, used
only by av7110 driver, it doesn't make much sense to invest
time making it better.

Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-09-09 09:53:35 -03:00
Richard Guy Briggs
34a3d4b2d1 xfrm: fix header file comment reference to struct xfrm_replay_state_esn
Reported-by: Paul Wouters <paul@nohats.ca>
Signed-off-by: Richard Guy Briggs <rgb@tricolour.ca>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2016-09-09 09:10:07 +02:00
Thomas F Herbert
8c146bb9d5 openvswitch: 802.1ad uapi changes.
openvswitch: Add support for 8021.AD

Change the description of the VLAN tpid field.

Signed-off-by: Thomas F Herbert <thomasfherbert@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-08 17:10:27 -07:00
Lorenzo Colitti
d545caca82 net: inet: diag: expose the socket mark to privileged processes.
This adds the capability for a process that has CAP_NET_ADMIN on
a socket to see the socket mark in socket dumps.

Commit a52e95abf7 ("net: diag: allow socket bytecode filters to
match socket marks") recently gave privileged processes the
ability to filter socket dumps based on mark. This patch is
complementary: it ensures that the mark is also passed to
userspace in the socket's netlink attributes.  It is useful for
tools like ss which display information about sockets.

Tested: https://android-review.googlesource.com/270210
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-08 16:13:09 -07:00
Laura Garcia Liebana
0d9932b287 netfilter: nft_numgen: rename until attribute by modulus
The _until_ attribute is renamed to _modulus_ as the behaviour is similar to
other expresions with number limits (ex. nft_hash).

Renaming is possible because there isn't a kernel release yet with these
changes.

Signed-off-by: Laura Garcia Liebana <nevola@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-09-07 10:55:46 +02:00
Gao Feng
ecc6569f35 netfilter: gre: Use consistent GRE_* macros instead of ones defined by netfilter.
There are already some GRE_* macros in kernel, so it is unnecessary
to define these macros. And remove some useless macros

Signed-off-by: Gao Feng <fgao@ikuai8.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-09-07 10:36:48 +02:00
David S. Miller
60175ccdf4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for your net-next
tree.  Most relevant updates are the removal of per-conntrack timers to
use a workqueue/garbage collection approach instead from Florian
Westphal, the hash and numgen expression for nf_tables from Laura
Garcia, updates on nf_tables hash set to honor the NLM_F_EXCL flag,
removal of ip_conntrack sysctl and many other incremental updates on our
Netfilter codebase.

More specifically, they are:

1) Retrieve only 4 bytes to fetch ports in case of non-linear skb
   transport area in dccp, sctp, tcp, udp and udplite protocol
   conntrackers, from Gao Feng.

2) Missing whitespace on error message in physdev match, from Hangbin Liu.

3) Skip redundant IPv4 checksum calculation in nf_dup_ipv4, from Liping Zhang.

4) Add nf_ct_expires() helper function and use it, from Florian Westphal.

5) Replace opencoded nf_ct_kill() call in IPVS conntrack support, also
   from Florian.

6) Rename nf_tables set implementation to nft_set_{name}.c

7) Introduce the hash expression to allow arbitrary hashing of selector
   concatenations, from Laura Garcia Liebana.

8) Remove ip_conntrack sysctl backward compatibility code, this code has
   been around for long time already, and we have two interfaces to do
   this already: nf_conntrack sysctl and ctnetlink.

9) Use nf_conntrack_get_ht() helper function whenever possible, instead
   of opencoding fetch of hashtable pointer and size, patch from Liping Zhang.

10) Add quota expression for nf_tables.

11) Add number generator expression for nf_tables, this supports
    incremental and random generators that can be combined with maps,
    very useful for load balancing purpose, again from Laura Garcia Liebana.

12) Fix a typo in a debug message in FTP conntrack helper, from Colin Ian King.

13) Introduce a nft_chain_parse_hook() helper function to parse chain hook
    configuration, this is used by a follow up patch to perform better chain
    update validation.

14) Add rhashtable_lookup_get_insert_key() to rhashtable and use it from the
    nft_set_hash implementation to honor the NLM_F_EXCL flag.

15) Missing nulls check in nf_conntrack from nf_conntrack_tuple_taken(),
    patch from Florian Westphal.

16) Don't use the DYING bit to know if the conntrack event has been already
    delivered, instead a state variable to track event re-delivery
    states, also from Florian.

17) Remove the per-conntrack timer, use the workqueue approach that was
    discussed during the NFWS, from Florian Westphal.

18) Use the netlink conntrack table dump path to kill stale entries,
    again from Florian.

19) Add a garbage collector to get rid of stale conntracks, from
    Florian.

20) Reschedule garbage collector if eviction rate is high.

21) Get rid of the __nf_ct_kill_acct() helper.

22) Use ARPHRD_ETHER instead of hardcoded 1 from ARP logger.

23) Make nf_log_set() interface assertive on unsupported families.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-06 12:45:26 -07:00
Alexei Starovoitov
0515e5999a bpf: introduce BPF_PROG_TYPE_PERF_EVENT program type
Introduce BPF_PROG_TYPE_PERF_EVENT programs that can be attached to
HW and SW perf events (PERF_TYPE_HARDWARE and PERF_TYPE_SOFTWARE
correspondingly in uapi/linux/perf_event.h)

The program visible context meta structure is
struct bpf_perf_event_data {
    struct pt_regs regs;
     __u64 sample_period;
};
which is accessible directly from the program:
int bpf_prog(struct bpf_perf_event_data *ctx)
{
  ... ctx->sample_period ...
  ... ctx->regs.ip ...
}

The bpf verifier rewrites the accesses into kernel internal
struct bpf_perf_event_data_kern which allows changing
struct perf_sample_data without affecting bpf programs.
New fields can be added to the end of struct bpf_perf_event_data
in the future.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-02 10:46:44 -07:00
Nikolay Aleksandrov
b6cb5ac833 net: bridge: add per-port multicast flood flag
Add a per-port flag to control the unknown multicast flood, similar to the
unknown unicast flood flag and break a few long lines in the netlink flag
exports.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-01 22:48:33 -07:00
David S. Miller
6abdd5f593 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
All three conflicts were cases of simple overlapping
changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-30 00:54:02 -04:00
Vidya Sagar Ravipati
5711a98221 net: ethtool: add support for 1000BaseX and missing 10G link modes
This patch enhances ethtool link mode bitmap to include
missing interface modes for 1G/10G speeds

Changes:
1000baseX is the mode introduced to cover all 1G Fiber cases.
All modes under 1000BaseX i.e. 1000BASE-SX, 1000BASE-LX, 1000BASE-LX10
and 1000BASE-BX10 are not explicitly defined at this moment.
10G CR,SR,LR and ER link modes are included for 10G speed..

Issue:
ethtool on  1G/10G SFP port reports Base-T
as this port supports 1000baseX,10G CR, SR and LR modes.

root@tor-02$ ethtool swp1
Settings for swp1:
        Supported ports: [ FIBRE ]
        Supported link modes:   1000baseT/Full
                                10000baseT/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Advertised link modes:  1000baseT/Full
        Advertised pause frame use: No
        Advertised auto-negotiation: No
        Speed: 10000Mb/s
        Duplex: Full
        Port: FIBRE
        PHYAD: 0
        Transceiver: external
        Auto-negotiation: off
        Current message level: 0x00000000 (0)

        Link detected: yes

After fix:
root@tor-02$ ethtool swp1
Settings for swp1:
        Supported ports: [ FIBRE ]
        Supported link modes:   1000baseX/Full
                                10000baseCR/Full
                                10000baseSR/Full
                                10000baseLR/Full
                                10000baseER/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Advertised link modes:  1000baseT/Full
        Advertised pause frame use: No
        Advertised auto-negotiation: No
        Speed: 10000Mb/s
        Duplex: Full
        Port: FIBRE
        PHYAD: 0
        Transceiver: external
        Auto-negotiation: off
        Current message level: 0x00000000 (0)
        Link detected: yes

Signed-off-by: Vidya Sagar Ravipati <vidya@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-29 00:27:34 -04:00
Richard Alpe
fdb3accc2c tipc: add the ability to get UDP options via netlink
Add UDP bearer options to netlink bearer get message. This is used by
the tipc user space tool to display UDP options.

The UDP bearer information is passed using either a sockaddr_in or
sockaddr_in6 structs. This means the user space receiver should
intermediately store the retrieved data in a large enough struct
(sockaddr_strage) before casting to the proper IP version type.

Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26 21:38:41 -07:00
Richard Alpe
ef20cd4dd1 tipc: introduce UDP replicast
This patch introduces UDP replicast. A concept where we emulate
multicast by sending multiple unicast messages to configured peers.

The purpose of replicast is mainly to be able to use TIPC in cloud
environments where IP multicast is disabled. Using replicas to unicast
multicast messages is costly as we have to copy each skb and send the
copies individually.

Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26 21:38:41 -07:00
Eric Dumazet
72145a68e4 tcp: md5: add LINUX_MIB_TCPMD5FAILURE counter
Adds SNMP counter for drops caused by MD5 mismatches.

The current syslog might help, but a counter is more precise and helps
monitoring.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-25 16:43:11 -07:00
Bjorn Helgaas
8b2ec318ee PCI: Add PTM clock granularity information
The PTM Control register (PCIe r3.1, sec 7.32.3) contains an Effective
Granularity field:

  This provides information relating to the expected accuracy of the PTM
  clock, but does not otherwise affect the PTM mechanism.

Set the Effective Granularity based on the PTM Root and any intervening PTM
Time Sources.

This does not set Effective Granularity for Root Complex Integrated
Endpoints because I don't know how to figure out clock granularity for
them.  The spec says:

  ... system software must set [Effective Granularity] to the value
  reported in the Local Clock Granularity field by the associated PTM
  Time Source.

but I don't know how to identify the associated PTM Time Source.  Normally
it's the upstream bridge, but an integrated endpoint has no upstream
bridge.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-08-25 08:32:34 -05:00
Felix Hädicke
4368c28ae7 usb: gadget: f_fs: handle control requests in config 0
Introduces a new FunctionFS descriptor flag named
FUNCTIONFS_CONFIG0_SETUP.

When this flag is enabled, FunctionFS userspace drivers can process
non-standard control requests in configuration 0.

Signed-off-by: Felix Hädicke <felixhaedicke@web.de>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2016-08-25 12:13:17 +03:00
Felix Hädicke
54dfce6d07 usb: gadget: f_fs: handle control requests not directed to interface or endpoint
Introduces a new FunctionFS descriptor flag named
FUNCTIONFS_ALL_CTRL_RECIP. When this flag is enabled, control requests,
which are not explicitly directed to an interface or endpoint, can be
handled.

This allows FunctionFS userspace drivers to process non-standard
control requests.

Signed-off-by: Felix Hädicke <felixhaedicke@web.de>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2016-08-25 12:13:17 +03:00
Lorenzo Colitti
a52e95abf7 net: diag: allow socket bytecode filters to match socket marks
This allows a privileged process to filter by socket mark when
dumping sockets via INET_DIAG_BY_FAMILY. This is useful on
systems that use mark-based routing such as Android.

The ability to filter socket marks requires CAP_NET_ADMIN, which
is consistent with other privileged operations allowed by the
SOCK_DIAG interface such as the ability to destroy sockets and
the ability to inspect BPF filters attached to packet sockets.

Tested: https://android-review.googlesource.com/261350
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-24 21:57:20 -07:00
Hans Verkuil
adca8c8e25 [media] videodev2.h: put V4L2_YCBCR_ENC_SYCC under #ifndef __KERNEL__
This define is a duplicate of V4L2_YCBCR_ENC_601. So mark it as
'should not be used' and put it under #ifndef __KERNEL__ to
prevent drivers from referring to it.

Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-08-24 08:37:49 -03:00
Hans Verkuil
7e0739cd9c [media] videodev2.h: fix sYCC/AdobeYCC default quantization range
The default quantization range of the Y'CbCr encodings of sRGB
and AdobeRGB are full range instead of limited range according to
the corresponding standards.

Fix this in the V4L2_MAP_QUANTIZATION_DEFAULT macro.

Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-08-24 08:17:10 -03:00
Dan Williams
3bc52c45ba dax: define a unified inode/address_space for device-dax mappings
In support of enabling resize / truncate of device-dax instances, define
a pseudo-fs to provide a unified inode/address space for vm operations.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-08-23 22:58:51 -07:00
Nick Dyer
b2fe22d0cf [media] v4l2-core: Add support for touch devices
Some touch controllers send out touch data in a similar way to a
greyscale frame grabber.

Add new device type VFL_TYPE_TOUCH:
- This uses a new device prefix v4l-touch for these devices, to stop
  generic capture software from treating them as webcams. Otherwise,
  touch is treated similarly to video capture.
- Add V4L2_INPUT_TYPE_TOUCH
- Add MEDIA_INTF_T_V4L_TOUCH
- Add V4L2_CAP_TOUCH to indicate device is a touch device

Add formats:
- V4L2_TCH_FMT_DELTA_TD16 for signed 16-bit touch deltas
- V4L2_TCH_FMT_DELTA_TD08 for signed 16-bit touch deltas
- V4L2_TCH_FMT_TU16 for unsigned 16-bit touch data
- V4L2_TCH_FMT_TU08 for unsigned 8-bit touch data

This support will be used by:
- Atmel maXTouch (atmel_mxt_ts)
- Synaptics RMI4.
- sur40

Signed-off-by: Nick Dyer <nick@shmanahar.org>
Tested-by: Chris Healy <cphealy@gmail.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2016-08-23 16:28:04 -03:00
Pablo Neira
b43f956951 netfilter: nf_tables: typo in trace attribute definition
Should be attributes, instead of attibutes, for consistency with other
definitions.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-08-23 16:05:02 +02:00
Mikko Rapeli
53dc65d4d3 include/uapi/linux/ipx.h: fix conflicting defitions with glibc netipx/ipx.h
Fixes these compiler warnings via libc-compat.h when glibc netipx/ipx.h is
included before linux/ipx.h:

./linux/ipx.h:9:8: error: redefinition of ‘struct sockaddr_ipx’
./linux/ipx.h:26:8: error: redefinition of ‘struct ipx_route_definition’
./linux/ipx.h:32:8: error: redefinition of ‘struct ipx_interface_definition’
./linux/ipx.h:49:8: error: redefinition of ‘struct ipx_config_data’
./linux/ipx.h:58:8: error: redefinition of ‘struct ipx_route_def’

Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-22 16:25:15 -07:00
Mikko Rapeli
a1d1f65ff5 include/uapi/linux/openvswitch.h: use __u32 from linux/types.h
Kernel uapi header are supposed to use them. Fixes userspace compile error:

linux/openvswitch.h:583:2: error: unknown type name ‘uint32_t’

Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-22 16:25:15 -07:00
Mikko Rapeli
cf00713a65 include/uapi/linux/atm_zatm.h: include linux/time.h
Fixes userspace compile error:

error: field ‘real’ has incomplete type
 struct timeval real;  /* real (wall-clock) time */

Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-22 16:25:15 -07:00
Mikko Rapeli
e6571aa5cb include/uapi/linux/openvswitch.h: use __u32 from linux/types.h
Fixes userspace compiler error:

error: unknown type name ‘uint32_t’

Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-22 16:25:15 -07:00
Mikko Rapeli
eafe921143 include/uapi/linux/if_pppox.h: include linux/in.h and linux/in6.h
Fixes userspace compilation errors:

error: field ‘addr’ has incomplete type
 struct sockaddr_in addr; /* IP address and port to send to */

error: field ‘addr’ has incomplete type
 struct sockaddr_in6 addr; /* IP address and port to send to */

Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-22 16:25:15 -07:00
Mikko Rapeli
05ee5de745 include/uapi/linux/if_pppol2tp.h: include linux/in.h and linux/in6.h
Fixes userspace compilation errors like:

error: field ‘addr’ has incomplete type
 struct sockaddr_in addr; /* IP address and port to send to */
                    ^
error: field ‘addr’ has incomplete type
 struct sockaddr_in6 addr; /* IP address and port to send to */

Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-22 16:25:14 -07:00
Mikko Rapeli
1fe8e0f074 include/uapi/linux/if_tunnel.h: include linux/if.h, linux/ip.h and linux/in6.h
Fixes userspace compilation errors like:

error: field ‘iph’ has incomplete type
error: field ‘prefix’ has incomplete type

Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-22 16:25:14 -07:00
Mikko Rapeli
b47b0cc730 include/uapi/linux/if_pppox.h: include linux/if.h
Fixes userspace compilation error:

error: ‘IFNAMSIZ’ undeclared here (not in a function)

Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-22 16:25:14 -07:00
Mikko Rapeli
7ac5d7b1a1 HSI: hsi_char.h: use __u32 from linux/types.h
Fixes userspace compiler errors like:

linux/hsi/hsi_char.h:51:2: error: unknown type name ‘uint32_t’

Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi>
Signed-off-by: Sebastian Reichel <sre@kernel.org>
2016-08-22 22:43:35 +02:00
Laura Garcia Liebana
91dbc6be0a netfilter: nf_tables: add number generator expression
This patch adds the numgen expression that allows us to generated
incremental and random numbers, this generator is bound to a upper limit
that is specified by userspace.

This expression is useful to distribute packets in a round-robin fashion
as well as randomly.

Signed-off-by: Laura Garcia Liebana <nevola@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-08-22 11:42:22 +02:00
Pablo Neira Ayuso
3d2f30a1df netfilter: nf_tables: add quota expression
This patch adds the quota expression. This new stateful expression
integrate easily into the dynset expression to build 'hashquota' flow
tables.

Arguably, we could use instead "counter bytes > 1000" instead, but this
approach has several problems:

1) We only support for one single stateful expression in dynamic set
   definitions, and the expression above is a composite of two
   expressions: get counter + comparison.

2) We would need to restore the packed counter representation (that we
   used to have) based on seqlock to synchronize this, since per-cpu is
   not suitable for this.

So instead of bloating the counter expression back with the seqlock
representation and extending the existing set infrastructure to make it
more complex for the composite described above, let's follow the more
simple approach of adding a quota expression that we can plug into our
existing infrastructure.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-08-22 11:42:18 +02:00
Daniel Borkmann
5293efe62d bpf: add bpf_skb_change_tail helper
This work adds a bpf_skb_change_tail() helper for tc BPF programs. The
basic idea is to expand or shrink the skb in a controlled manner. The
eBPF program can then rewrite the rest via helpers like bpf_skb_store_bytes(),
bpf_lX_csum_replace() and others rather than passing a raw buffer for
writing here.

bpf_skb_change_tail() is really a slow path helper and intended for
replies with f.e. ICMP control messages. Concept is similar to other
helpers like bpf_skb_change_proto() helper to keep the helper without
protocol specifics and let the BPF program mangle the remaining parts.
A flags field has been added and is reserved for now should we extend
the helper in future.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-18 23:38:16 -07:00
Richard Alpe
b34040227b tipc: add peer removal functionality
Add TIPC_NL_PEER_REMOVE netlink command. This command can remove
an offline peer node from the internal data structures.

This will be supported by the tipc user space tool in iproute2.

Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-18 23:36:07 -07:00
Nikolay Aleksandrov
61ba1a2da9 net: bridge: export vlan flags with the stats
Use one of the vlan xstats padding fields to export the vlan flags. This is
needed in order to be able to distinguish between master (bridge) and port
vlan entries in user-space when dumping the bridge vlan stats.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-18 23:18:42 -07:00
Hadar Hen Zion
956af37102 net_sched: act_vlan: Add priority option
The current vlan push action supports only vid and protocol options.
Add priority option.

Example script that adds vlan push action with vid and
priority:

tc filter add dev veth0 protocol ip parent ffff: \
	   flower \
	   	indev veth0 \
	   action vlan push id 100 priority 5

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-18 23:13:14 -07:00
Hadar Hen Zion
9399ae9a6c net_sched: flower: Add vlan support
Enhance flower to support 802.1Q vlan protocol classification.
Currently, the supported fields are vlan_id and vlan_priority.

Example:

	# add a flower filter with vlan id and priority classification
	tc filter add dev ens4f0 protocol 802.1Q parent ffff: \
		flower \
		indev ens4f0 \
		vlan_ethtype ipv4 \
		vlan_id 100 \
		vlan_prio 3 \
	action vlan pop

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-18 23:13:14 -07:00
Bjorn Helgaas
eec097d431 PCI: Add pci_enable_ptm() for drivers to enable PTM on endpoints
Add an pci_enable_ptm() interface so drivers can enable PTM.

The PCI core enables PTM on PTM Roots and switches automatically, but we
don't enable PTM on endpoints unless a driver requests it.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-08-18 16:04:57 -05:00
David S. Miller
60747ef4d1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Minor overlapping changes for both merge conflicts.

Resolution work done by Stephen Rothwell was used
as a reference.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-18 01:17:32 -04:00
Linus Torvalds
184ca82348 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Buffers powersave frame test is reversed in cfg80211, fix from Felix
    Fietkau.

 2) Remove bogus WARN_ON in openvswitch, from Jarno Rajahalme.

 3) Fix some tg3 ethtool logic bugs, and one that would cause no
    interrupts to be generated when rx-coalescing is set to 0.  From
    Satish Baddipadige and Siva Reddy Kallam.

 4) QLCNIC mailbox corruption and napi budget handling fix from Manish
    Chopra.

 5) Fix fib_trie logic when walking the trie during /proc/net/route
    output than can access a stale node pointer.  From David Forster.

 6) Several sctp_diag fixes from Phil Sutter.

 7) PAUSE frame handling fixes in mlxsw driver from Ido Schimmel.

 8) Checksum fixup fixes in bpf from Daniel Borkmann.

 9) Memork leaks in nfnetlink, from Liping Zhang.

10) Use after free in rxrpc, from David Howells.

11) Use after free in new skb_array code of macvtap driver, from Jason
    Wang.

12) Calipso resource leak, from Colin Ian King.

13) mediatek bug fixes (missing stats sync init, etc.) from Sean Wang.

14) Fix bpf non-linear packet write helpers, from Daniel Borkmann.

15) Fix lockdep splats in macsec, from Sabrina Dubroca.

16) hv_netvsc bug fixes from Vitaly Kuznetsov, mostly to do with VF
    handling.

17) Various tc-action bug fixes, from CONG Wang.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits)
  net_sched: allow flushing tc police actions
  net_sched: unify the init logic for act_police
  net_sched: convert tcf_exts from list to pointer array
  net_sched: move tc offload macros to pkt_cls.h
  net_sched: fix a typo in tc_for_each_action()
  net_sched: remove an unnecessary list_del()
  net_sched: remove the leftover cleanup_a()
  mlxsw: spectrum: Allow packets to be trapped from any PG
  mlxsw: spectrum: Unmap 802.1Q FID before destroying it
  mlxsw: spectrum: Add missing rollbacks in error path
  mlxsw: reg: Fix missing op field fill-up
  mlxsw: spectrum: Trap loop-backed packets
  mlxsw: spectrum: Add missing packet traps
  mlxsw: spectrum: Mark port as active before registering it
  mlxsw: spectrum: Create PVID vPort before registering netdevice
  mlxsw: spectrum: Remove redundant errors from the code
  mlxsw: spectrum: Don't return upon error in removal path
  i40e: check for and deal with non-contiguous TCs
  ixgbe: Re-enable ability to toggle VLAN filtering
  ixgbe: Force VLNCTRL.VFE to be set in all VMDq paths
  ...
2016-08-17 17:26:58 -07:00
David S. Miller
00062a934b This feature patchset is all about adding netlink support, which should
supersede our debugfs configuration interface in the long run. It is
 especially necessary when batman-adv should be used in different
 namespaces, since debugfs can not differentiate between those.
 
 More specifically, the following changes are included:
 
  - Two fixes for namespace handling by Andrew Lunn, checking also the
    namespaces for parent interfaces, and supress debugfs entries
    for non-default netns
 
  - Implement various netlink commands for the new interface, by
    Matthias Schiffer, Andrew Lunn, Sven Eckelmann and Simon Wunderlich
    (13 patches):
     * routing algorithm list
     * hardif list
     * translation tables (local and global)
     * TTVN for the translation tables
     * originator and neighbor tables for B.A.T.M.A.N. IV
       and B.A.T.M.A.N. V
     * gateway dump functionality for B.A.T.M.A.N. IV
       and B.A.T.M.A.N. V
     * Bridge Loop Avoidance claims, and corresponding BLA group
     * Bridge Loop Avoidance backbone tables
 
  - Finally, mark batman-adv as netns compatible, by Andrew Lunn (1 patch)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdBQJXss7MFhxzd0BzaW1vbnd1bmRlcmxpY2guZGUACgkQoSvjmEKS
 nqGvTBAAw7A0lG5ghEEDTVWl++/q3fc41ZPn+XGihizQ3z9Hy5ZAuyREKqMz43RP
 MJb2sHnoS/guCY7Y0Mn/ubQDuvp7PmqJxNmHiqdW0UVKrwgRrlhk/uZfd3Blib8J
 TR1ktRAT/OKtPIxps2CSq2UX1GcnadtstaUvDDSWnak/0zsQl5GWVYxOkbdsbYUb
 qYAbcHBXkdvTfIpZxSwb3QDfKoRs+Hf8hr09V19DH/GZs4puYbIxjw1QhC2TBe0f
 SkcMVkmQ6GqJsjRU4BDVCrrfYvv3ncBWXtb5CKyq8il2AvdI1HbXha9hpg0SO69p
 fAC5yzyB0rCCr7AKMYBgeIf9u6z5mllKly9QJkZMjtWuIIxt4J5rFK2PN+M3xprb
 BWXrINWR4/1C4LA3dDvCL7sFHlObHVKRjSNwzmQ3b6UNY72d6UILG0D9JTI8M+y7
 YXtjwCQYNCvjmkprM6mgPMnlk90RdXNhNUngfOe2/2C1li2gaodX7lrx+lBS8/5N
 oK5W85vmO41FChLFof5PV6mn4cUV7sKlKPmv93xRvHd89RWBWIU/kGpQQjkCgh5U
 44CJiD+FDRkEkDJVo7IkqTxGF39zYR39mQrNFXc6G1H4wRFtqHGP+VOa72a/7arV
 FeGtulzeGBK3z1Qi9UyjS2N9mDYSKkfj4f2H+AC1GCRC2mTMCQU=
 =KDfF
 -----END PGP SIGNATURE-----

Merge tag 'batadv-next-for-davem-20160816' of git://git.open-mesh.org/linux-merge

Simon Wunderlich says:

====================
pull request for net-next: batman-adv 2016-08-16

This feature patchset is all about adding netlink support, which should
supersede our debugfs configuration interface in the long run. It is
especially necessary when batman-adv should be used in different
namespaces, since debugfs can not differentiate between those.

More specifically, the following changes are included:

 - Two fixes for namespace handling by Andrew Lunn, checking also the
   namespaces for parent interfaces, and supress debugfs entries
   for non-default netns

 - Implement various netlink commands for the new interface, by
   Matthias Schiffer, Andrew Lunn, Sven Eckelmann and Simon Wunderlich
   (13 patches):
    * routing algorithm list
    * hardif list
    * translation tables (local and global)
    * TTVN for the translation tables
    * originator and neighbor tables for B.A.T.M.A.N. IV
      and B.A.T.M.A.N. V
    * gateway dump functionality for B.A.T.M.A.N. IV
      and B.A.T.M.A.N. V
    * Bridge Loop Avoidance claims, and corresponding BLA group
    * Bridge Loop Avoidance backbone tables

 - Finally, mark batman-adv as netns compatible, by Andrew Lunn (1 patch)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-17 19:22:13 -04:00
Srinivas Pandruvada
0b28cb4bcb HID: intel-ish-hid: ISH HID client driver
This driver is responsible for implementing ISH HID client, which
gets HID description and report. Once it has completely gets
report descriptors, it registers as a HID LL drivers. This implements
necessary callbacks so that it can be used by HID sensor hub driver.

Original-author: Daniel Drubin <daniel.drubin@intel.com>
Reviewed-and-tested-by: Ooi, Joyce <joyce.ooi@intel.com>
Tested-by: Grant Likely <grant.likely@secretlab.ca>
Tested-by: Rann Bar-On <rb6@duke.edu>
Tested-by: Atri Bhattacharya <badshah400@aim.com>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2016-08-17 11:13:08 +02:00
Jonathan Yong
9bb04a0c4e PCI: Add Precision Time Measurement (PTM) support
Add Precision Time Measurement (PTM) support (see PCIe r3.1, sec 6.22).

Enable PTM on PTM Root devices and switch ports.  This does not enable PTM
on endpoints.

There currently are no PTM-capable devices on the market, but it is
expected to be supported by the Intel Apollo Lake platform.

[bhelgaas: complete rework]
Signed-off-by: Jonathan Yong <jonathan.yong@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-08-15 13:44:08 -05:00
Dan Williams
02486c2905 libnvdimm: fix SMART Health DSM payload definition
"NVDIMM DSM Interface Example" v1.2 made an incompatible change to the
layout of function1 "SMART and Health Info".  While the kernel does not
directly consume this payload, it does define it in ndctl.h that
userpace utilities consume.

Reported-by: Brian Boylston <brian.boylston@hpe.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-08-15 11:07:21 -07:00
Gao Feng
03459345bc pptp: Refactor the struct and macros of PPTP codes
1. Use struct gre_base_hdr directly in pptp_gre_header instead of
duplicated members;
2. Use existing macros like GRE_KEY, GRE_SEQ, and so on instead of
duplicated macros defined by PPTP;
3. Add new macros like GRE_IS_ACK/SEQ and so on instead of
PPTP_GRE_IS_A/S and so on;

Signed-off-by: Gao Feng <fgao@ikuai8.com>
Reviewed-by: Philip Prindeville <philipp@redfish-solutions.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-15 10:55:53 -07:00
Daniel Borkmann
747ea55e4f bpf: fix bpf_skb_in_cgroup helper naming
While hashing out BPF's current_task_under_cgroup helper bits, it came
to discussion that the skb_in_cgroup helper name was suboptimally chosen.

Tejun says:

  So, I think in_cgroup should mean that the object is in that
  particular cgroup while under_cgroup in the subhierarchy of that
  cgroup. Let's rename the other subhierarchy test to under too. I
  think that'd be a lot less confusing going forward.

  [...]

  It's more intuitive and gives us the room to implement the real
  "in" test if ever necessary in the future.

Since this touches uapi bits, we need to change this as long as v4.8
is not yet officially released. Thus, change the helper enum and rename
related bits.

Fixes: 4a482f34af ("cgroup: bpf: Add bpf_skb_in_cgroup_proto")
Reference: http://patchwork.ozlabs.org/patch/658500/
Suggested-by: Sargun Dhillon <sargun@sargun.me>
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
2016-08-12 21:53:33 -07:00
Sargun Dhillon
60d20f9195 bpf: Add bpf_current_task_under_cgroup helper
This adds a bpf helper that's similar to the skb_in_cgroup helper to check
whether the probe is currently executing in the context of a specific
subset of the cgroupsv2 hierarchy. It does this based on membership test
for a cgroup arraymap. It is invalid to call this in an interrupt, and
it'll return an error. The helper is primarily to be used in debugging
activities for containers, where you may have multiple programs running in
a given top-level "container".

Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Tejun Heo <tj@kernel.org>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-12 21:49:41 -07:00
Appana Durga Kedareswara Rao
300d8b93d7 net: Add mask for Control register 10Mbps speed
This patch adds mask for the Control register
10Mbps speed.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Kedareswara rao Appana <appanad@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-12 16:57:20 -07:00
Laura Garcia Liebana
cb1b69b0b1 netfilter: nf_tables: add hash expression
This patch adds a new hash expression, this provides jhash support but
this can be extended to support for other hash functions. The modulus
and seed already comes embedded into this new expression.

Use case example:

	... meta mark set hash ip saddr mod 10

Signed-off-by: Laura Garcia Liebana <nevola@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-08-12 14:16:04 +02:00
Gao Feng
ab10dccb11 rps: Inspect PPTP encapsulated by GRE to get flow hash
The PPTP is encapsulated by GRE header with that GRE_VERSION bits
must contain one. But current GRE RPS needs the GRE_VERSION must be
zero. So RPS does not work for PPTP traffic.

In my test environment, there are four MIPS cores, and all traffic
are passed through by PPTP. As a result, only one core is 100% busy
while other three cores are very idle. After this patch, the usage
of four cores are balanced well.

Signed-off-by: Gao Feng <fgao@ikuai8.com>
Reviewed-by: Philip Prindeville <philipp@redfish-solutions.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-10 17:22:14 -07:00
David S. Miller
293fddff20 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for your net tree,
they are:

1) Use mod_timer_pending() to avoid reactivating a dead expectation in
   the h323 conntrack helper, from Liping Zhang.

2) Oneliner to fix a type in the register name defined in the nf_tables
   header.

3) Don't try to look further when we find an inactive elements with no
   descendants in the rbtree set implementation, otherwise we crash.

4) Handle valid zero CSeq in the SIP conntrack helper, from
   Christophe Leroy.

5) Don't display a trailing slash in conntrack helper with no classes
   via /proc/net/nf_conntrack_expect, from Liping Zhang.

6) Fix an expectation leak during creation from the nfqueue path, again
   from Liping Zhang.

7) Validate netlink port ID in verdict message from nfqueue, otherwise
   an injection can be possible. Again from Zhang.

8) Reject conntrack tuples with different transport protocol on
   original and reply tuples, also from Zhang.

9) Validate offset and length in nft_exthdr, make sure they are under
   sizeof(u8), from Laura Garcia Liebana.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-10 14:54:27 -07:00
Stefan Hajnoczi
28ad55578b virtio-vsock: fix include guard typo
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-09 13:42:38 +03:00
Simon Wunderlich
ea4152e117 batman-adv: add backbone table netlink support
Dump the list of bridge loop avoidance backbones via the netlink socket.

Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2016-08-09 07:54:43 +02:00
Andrew Lunn
04f3f5bf18 batman-adv: add B.A.T.M.A.N. Dump BLA claims via netlink
Dump the list of bridge loop avoidance claims via the netlink socket.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
[sven.eckelmann@open-mesh.com: add policy for attributes, fix includes, fix
soft_iface reference leak]
Signed-off-by: Sven Eckelmann <sven.eckelmann@open-mesh.com>
[sw@simonwunderlich.de: fix kerneldoc, fix error reporting]
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2016-08-09 07:54:42 +02:00
Sven Eckelmann
d7129dafcb batman-adv: netlink: add gateway table queries
Add BATADV_CMD_GET_GATEWAYS commands, using handlers bat_gw_dump in
batadv_algo_ops. Will always return -EOPNOTSUPP for now, as no
implementations exist yet.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-08-09 07:54:40 +02:00
Matthias Schiffer
f02a478f51 batman-adv: add B.A.T.M.A.N. V bat_{orig, neigh}_dump implementations
Dump the algo V originators and neighbours.

Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
[sven@narfation.org: Fix includes, fix algo_ops integration]
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2016-08-09 07:54:39 +02:00
Matthias Schiffer
024f99cb4a batman-adv: add B.A.T.M.A.N. IV bat_{orig, neigh}_dump implementations
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
[sven.eckelmann@open-mesh.com: Fix function parameter alignments,
add policy for attributes, fix includes, fix algo_ops integration]
Signed-off-by: Sven Eckelmann <sven.eckelmann@open-mesh.com>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2016-08-09 07:54:39 +02:00
Matthias Schiffer
85cf8c859d batman-adv: netlink: add originator and neighbor table queries
Add BATADV_CMD_GET_ORIGINATORS and BATADV_CMD_GET_NEIGHBORS commands,
using handlers bat_orig_dump and bat_neigh_dump in batadv_algo_ops. Will
always return -EOPNOTSUPP for now, as no implementations exist yet.

Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
[sven@narfation.org: Rewrite based on new algo_ops structures]
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2016-08-09 07:54:38 +02:00
Matthias Schiffer
d34f05507d batman-adv: netlink: add translation table query
This adds the commands BATADV_CMD_GET_TRANSTABLE_LOCAL and
BATADV_CMD_GET_TRANSTABLE_GLOBAL, which correspond to the transtable_local
and transtable_global debugfs files.

The batadv_tt_client_flags enum is moved to the UAPI to expose it as part
of the netlink API.

Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
[sven.eckelmann@open-mesh.com: add policy for attributes, fix includes]
Signed-off-by: Sven Eckelmann <sven.eckelmann@open-mesh.com>
[sw@simonwunderlich.de: fix VID attributes content]
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2016-08-09 07:54:37 +02:00
Matthias Schiffer
b60620cf56 batman-adv: netlink: hardif query
BATADV_CMD_GET_HARDIFS will return the list of hardifs (including index,
name and MAC address) of all hardifs for a given softif.

Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
[sven.eckelmann@open-mesh.com: Reduce the number of changes to
BATADV_CMD_GET_HARDIFS, add policy for attributes]
Signed-off-by: Sven Eckelmann <sven.eckelmann@open-mesh.com>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2016-08-09 07:54:36 +02:00
Matthias Schiffer
07a3061e08 batman-adv: netlink: add routing_algo query
BATADV_CMD_GET_ROUTING_ALGOS is used to get the list of supported routing
algorithms.

Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
[sven.eckelmann@open-mesh.com: Reduce the number of changes to
BATADV_CMD_GET_ROUTING_ALGOS, fix includes]
Signed-off-by: Sven Eckelmann <sven.eckelmann@open-mesh.com>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2016-08-09 07:54:36 +02:00
Phil Sutter
dca3f53c02 sctp: Export struct sctp_info to userspace
This is required to correctly interpret INET_DIAG_INFO messages exported
by sctp_diag module.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-08 12:51:58 -07:00
Pablo Neira Ayuso
2c86943c20 netfilter: nf_tables: s/MFT_REG32_01/NFT_REG32_01
MFT_REG32_01 is a typo, rename this to NFT_REG32_01.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-08-08 09:35:05 +02:00
Linus Torvalds
0803e04011 virtio/vhost: new features for 4.8
- New vsock device support in host and guest
 - Platform IOMMU support in host and guest,
   including compatibility quirks for legacy systems.
 - Misc fixes and cleanups.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJXofvbAAoJECgfDbjSjVRpUTIH/iEoK9h636tBayXy0PXkPby0
 6fMaRFy6H1HgEttgDhJE8Pqg/ba3qaW9Em0fHyFq7Mp2waFHAZ8hAT8phC6TAK3c
 CIBnfzyyuI8u3N9SnNOfelPVcwCBfuALuuTsXB/rwKbYQEVv+U5Rdt3Vyx9+lXkj
 P005klz7PfqxFhQrrnj4Eh7VawtHwmMuLH8YoWpCZpM71dHPo6eL+3ftKwhH2boo
 qK86uVprwba03Pewpm13vQnotemfVfUUkjXd4EJpG3dx7E0KZosuj0ZG9OV8mPGQ
 Cl2gBdUhocdJgeUnAHmf6tumYi9KFlYfy6xLy44YMmN7FL3E9nQjaKZp25UKfiM=
 =ztIm
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio/vhost updates from Michael Tsirkin:

 - new vsock device support in host and guest

 - platform IOMMU support in host and guest, including compatibility
   quirks for legacy systems.

 - misc fixes and cleanups.

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  VSOCK: Use kvfree()
  vhost: split out vringh Kconfig
  vhost: detect 32 bit integer wrap around
  vhost: new device IOTLB API
  vhost: drop vringh dependency
  vhost: convert pre sorted vhost memory array to interval tree
  vhost: introduce vhost memory accessors
  VSOCK: Add Makefile and Kconfig
  VSOCK: Introduce vhost_vsock.ko
  VSOCK: Introduce virtio_transport.ko
  VSOCK: Introduce virtio_vsock_common.ko
  VSOCK: defer sock removal to transports
  VSOCK: transport-specific vsock_transport functions
  vhost: drop vringh dependency
  vop: pull in vhost Kconfig
  virtio: new feature to detect IOMMU device quirk
  balloon: check the number of available pages in leak balloon
  vhost: lockless enqueuing
  vhost: simplify work flushing
2016-08-06 09:20:13 -04:00
Linus Torvalds
80fac0f577 * ARM bugfix and MSI injection support
* x86 nested virt tweak and OOPS fix
 * Simplify pvclock code (vdso bits acked by Andy Lutomirski).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJXpKOnAAoJEL/70l94x66D5z4H/R2660Vy3brQrI8lGxCtkXJt
 AVe8PwI8nDfYJ/UkMZ2KcHPSvy+sHW2ydaZXYNqXHVBeTaUxiPW9rTgK61ebypGL
 1tPOgJ3kGZF6XEdAz6gS8LniNFc+D3W6Y6sRylkEsqPj39/hxe7QMoOMSCQ9imbW
 WMIx7/81i1EMw6oi+9FVtq+yHCpvyfFnD8t1TDsYWOReVn1J15SxbEs4Ih+hBMLz
 HZ5DEjp9cAmzeR7GLje5eH1t6TEEoNb1MNgFWuscoAsDf8D9DKqRB9s0hC+TLFYn
 oZbGSqjQwu3/VMblgedinH6X9MTm8V0zW29ToGnDcoO00AUmdlNmXSaZUhvT/Rs=
 =H5cD
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull more KVM updates from Paolo Bonzini:
 - ARM bugfix and MSI injection support
 - x86 nested virt tweak and OOPS fix
 - Simplify pvclock code (vdso bits acked by Andy Lutomirski).

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  nvmx: mark ept single context invalidation as supported
  nvmx: remove comment about missing nested vpid support
  KVM: lapic: fix access preemption timer stuff even if kernel_irqchip=off
  KVM: documentation: fix KVM_CAP_X2APIC_API information
  x86: vdso: use __pvclock_read_cycles
  pvclock: introduce seqcount-like API
  arm64: KVM: Set cpsr before spsr on fault injection
  KVM: arm: vgic-irqfd: Workaround changing kvm_set_routing_entry prototype
  KVM: arm/arm64: Enable MSI routing
  KVM: arm/arm64: Enable irqchip routing
  KVM: Move kvm_setup_default/empty_irq_routing declaration in arch specific header
  KVM: irqchip: Convey devid to kvm_set_msi
  KVM: Add devid in kvm_kernel_irq_routing_entry
  KVM: api: Pass the devid in the msi routing entry
2016-08-06 09:18:21 -04:00
Linus Torvalds
2cfd716d27 powerpc updates for 4.8 #2
Fixes:
  - Fix early access to cpu_spec relocation from Benjamin Herrenschmidt
  - Fix incorrect event codes in power9-event-list from Madhavan Srinivasan
  - Move register_process_table() out of ppc_md from Michael Ellerman
 
 Use jump_label for [cpu|mmu]_has_feature() from Aneesh Kumar K.V, Kevin Hao and Michael Ellerman:
  - Add mmu_early_init_devtree() from Michael Ellerman
  - Move disable_radix handling into mmu_early_init_devtree() from Michael Ellerman
  - Do hash device tree scanning earlier from Michael Ellerman
  - Do radix device tree scanning earlier from Michael Ellerman
  - Do feature patching before MMU init from Michael Ellerman
  - Check features don't change after patching from Michael Ellerman
  - Make MMU_FTR_RADIX a MMU family feature from Aneesh Kumar K.V
  - Convert mmu_has_feature() to returning bool from Michael Ellerman
  - Convert cpu_has_feature() to returning bool from Michael Ellerman
  - Define radix_enabled() in one place & use static inline from Michael Ellerman
  - Add early_[cpu|mmu]_has_feature() from Michael Ellerman
  - Convert early cpu/mmu feature check to use the new helpers from Aneesh Kumar K.V
  - jump_label: Make it possible for arches to invoke jump_label_init() earlier from Kevin Hao
  - Call jump_label_init() in apply_feature_fixups() from Aneesh Kumar K.V
  - Remove mfvtb() from Kevin Hao
  - Move cpu_has_feature() to a separate file from Kevin Hao
  - Add kconfig option to use jump labels for cpu/mmu_has_feature() from Michael Ellerman
  - Add option to use jump label for cpu_has_feature() from Kevin Hao
  - Add option to use jump label for mmu_has_feature() from Kevin Hao
  - Catch usage of cpu/mmu_has_feature() before jump label init from Aneesh Kumar K.V
  - Annotate jump label assembly from Michael Ellerman
 
 TLB flush enhancements from Aneesh Kumar K.V:
  - radix: Implement tlb mmu gather flush efficiently
  - Add helper for finding SLBE LLP encoding
  - Use hugetlb flush functions
  - Drop multiple definition of mm_is_core_local
  - radix: Add tlb flush of THP ptes
  - radix: Rename function and drop unused arg
  - radix/hugetlb: Add helper for finding page size
  - hugetlb: Add flush_hugetlb_tlb_range
  - remove flush_tlb_page_nohash
 
 Add new ptrace regsets from Anshuman Khandual and Simon Guo:
  - elf: Add powerpc specific core note sections
  - Add the function flush_tmregs_to_thread
  - Enable in transaction NT_PRFPREG ptrace requests
  - Enable in transaction NT_PPC_VMX ptrace requests
  - Enable in transaction NT_PPC_VSX ptrace requests
  - Adapt gpr32_get, gpr32_set functions for transaction
  - Enable support for NT_PPC_CGPR
  - Enable support for NT_PPC_CFPR
  - Enable support for NT_PPC_CVMX
  - Enable support for NT_PPC_CVSX
  - Enable support for TM SPR state
  - Enable NT_PPC_TM_CTAR, NT_PPC_TM_CPPR, NT_PPC_TM_CDSCR
  - Enable support for NT_PPPC_TAR, NT_PPC_PPR, NT_PPC_DSCR
  - Enable support for EBB registers
  - Enable support for Performance Monitor registers
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXpGaLAAoJEFHr6jzI4aWA9aYP/1AqmRPJ9D0XVUJWT+FVABUK
 LESESoVFF4Hug1j1F8Synhg5o4SzD2t45iGKbclYaFthOIyovMg7Wr1KSu4hQ0go
 rPuQfpXDNQ8jKdDX8hbPXKUxrNRBNfqJGFo5E7mO6wN9AJ9d1LVwQ+jKAva29Tqs
 LaAlMbQNbeObPNzOl73B73iew3aozr+mXjBqv82lqvgYknBD2CLf24xGG3eNIbq5
 ZZk4LPC8pdkaxnajnzRFzqwiyPWzao0yfpVRKh52TKHBQF/prR/KACb6zUuja/61
 krOfegUKob14OYrehjs6X8XNRLnILRI0u1H5bmj7eVEiY/usyNzE93SMHZM3Wdau
 sQF/Au4OLNXj0ZQdNBtzRsZRyp1d560Gsj+lQGBoPd4hfIWkFYHvxzxsUSdqv4uA
 MWDMwN0Vvfk0cpprsabsWNevkaotYYBU00px5hF/e5ZUc9/x/xYUVMgPEDr0QZLr
 cHJo9/Pjk4u/0g4lj+2y1LLl/0tNEZZg69O6bvffPAPVSS4/P4y/bKKYd4I0zL99
 Ykp91mSmkl70F3edgOSFqyda2gN2l2Ekb/i081YGXheFy1rbD29Vxv82BOVog4KY
 ibvOqp38WDzCVk5OXuCRvBl0VudLKGJYdppU1nXg4KgrTZXHeCAC0E+NzUsgOF4k
 OMvQ+5drVxrno+Hw8FVJ
 =0Q8E
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull more powerpc updates from Michael Ellerman:
 "These were delayed for various reasons, so I let them sit in next a
  bit longer, rather than including them in my first pull request.

  Fixes:
   - Fix early access to cpu_spec relocation from Benjamin Herrenschmidt
   - Fix incorrect event codes in power9-event-list from Madhavan Srinivasan
   - Move register_process_table() out of ppc_md from Michael Ellerman

  Use jump_label use for [cpu|mmu]_has_feature():
   - Add mmu_early_init_devtree() from Michael Ellerman
   - Move disable_radix handling into mmu_early_init_devtree() from Michael Ellerman
   - Do hash device tree scanning earlier from Michael Ellerman
   - Do radix device tree scanning earlier from Michael Ellerman
   - Do feature patching before MMU init from Michael Ellerman
   - Check features don't change after patching from Michael Ellerman
   - Make MMU_FTR_RADIX a MMU family feature from Aneesh Kumar K.V
   - Convert mmu_has_feature() to returning bool from Michael Ellerman
   - Convert cpu_has_feature() to returning bool from Michael Ellerman
   - Define radix_enabled() in one place & use static inline from Michael Ellerman
   - Add early_[cpu|mmu]_has_feature() from Michael Ellerman
   - Convert early cpu/mmu feature check to use the new helpers from Aneesh Kumar K.V
   - jump_label: Make it possible for arches to invoke jump_label_init() earlier from Kevin Hao
   - Call jump_label_init() in apply_feature_fixups() from Aneesh Kumar K.V
   - Remove mfvtb() from Kevin Hao
   - Move cpu_has_feature() to a separate file from Kevin Hao
   - Add kconfig option to use jump labels for cpu/mmu_has_feature() from Michael Ellerman
   - Add option to use jump label for cpu_has_feature() from Kevin Hao
   - Add option to use jump label for mmu_has_feature() from Kevin Hao
   - Catch usage of cpu/mmu_has_feature() before jump label init from Aneesh Kumar K.V
   - Annotate jump label assembly from Michael Ellerman

  TLB flush enhancements from Aneesh Kumar K.V:
   - radix: Implement tlb mmu gather flush efficiently
   - Add helper for finding SLBE LLP encoding
   - Use hugetlb flush functions
   - Drop multiple definition of mm_is_core_local
   - radix: Add tlb flush of THP ptes
   - radix: Rename function and drop unused arg
   - radix/hugetlb: Add helper for finding page size
   - hugetlb: Add flush_hugetlb_tlb_range
   - remove flush_tlb_page_nohash

  Add new ptrace regsets from Anshuman Khandual and Simon Guo:
   - elf: Add powerpc specific core note sections
   - Add the function flush_tmregs_to_thread
   - Enable in transaction NT_PRFPREG ptrace requests
   - Enable in transaction NT_PPC_VMX ptrace requests
   - Enable in transaction NT_PPC_VSX ptrace requests
   - Adapt gpr32_get, gpr32_set functions for transaction
   - Enable support for NT_PPC_CGPR
   - Enable support for NT_PPC_CFPR
   - Enable support for NT_PPC_CVMX
   - Enable support for NT_PPC_CVSX
   - Enable support for TM SPR state
   - Enable NT_PPC_TM_CTAR, NT_PPC_TM_CPPR, NT_PPC_TM_CDSCR
   - Enable support for NT_PPPC_TAR, NT_PPC_PPR, NT_PPC_DSCR
   - Enable support for EBB registers
   - Enable support for Performance Monitor registers"

* tag 'powerpc-4.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (48 commits)
  powerpc/mm: Move register_process_table() out of ppc_md
  powerpc/perf: Fix incorrect event codes in power9-event-list
  powerpc/32: Fix early access to cpu_spec relocation
  powerpc/ptrace: Enable support for Performance Monitor registers
  powerpc/ptrace: Enable support for EBB registers
  powerpc/ptrace: Enable support for NT_PPPC_TAR, NT_PPC_PPR, NT_PPC_DSCR
  powerpc/ptrace: Enable NT_PPC_TM_CTAR, NT_PPC_TM_CPPR, NT_PPC_TM_CDSCR
  powerpc/ptrace: Enable support for TM SPR state
  powerpc/ptrace: Enable support for NT_PPC_CVSX
  powerpc/ptrace: Enable support for NT_PPC_CVMX
  powerpc/ptrace: Enable support for NT_PPC_CFPR
  powerpc/ptrace: Enable support for NT_PPC_CGPR
  powerpc/ptrace: Adapt gpr32_get, gpr32_set functions for transaction
  powerpc/ptrace: Enable in transaction NT_PPC_VSX ptrace requests
  powerpc/ptrace: Enable in transaction NT_PPC_VMX ptrace requests
  powerpc/ptrace: Enable in transaction NT_PRFPREG ptrace requests
  powerpc/process: Add the function flush_tmregs_to_thread
  elf: Add powerpc specific core note sections
  powerpc/mm: remove flush_tlb_page_nohash
  powerpc/mm/hugetlb: Add flush_hugetlb_tlb_range
  ...
2016-08-05 09:00:54 -04:00
Linus Torvalds
d58b0d980f Merge branch 'for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull more btrfs updates from Chris Mason:
 "This is part two of my btrfs pull, which is some cleanups and a batch
  of fixes.

  Most of the code here is from Jeff Mahoney, making the pointers we
  pass around internally more consistent and less confusing overall.  I
  noticed a small problem right before I sent this out yesterday, so I
  fixed it up and re-tested overnight"

* 'for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (40 commits)
  Btrfs: fix __MAX_CSUM_ITEMS
  btrfs: btrfs_abort_transaction, drop root parameter
  btrfs: add btrfs_trans_handle->fs_info pointer
  btrfs: btrfs_relocate_chunk pass extent_root to btrfs_end_transaction
  btrfs: convert nodesize macros to static inlines
  btrfs: introduce BTRFS_MAX_ITEM_SIZE
  btrfs: cleanup, remove prototype for btrfs_find_root_ref
  btrfs: copy_to_sk drop unused root parameter
  btrfs: simpilify btrfs_subvol_inherit_props
  btrfs: tests, use BTRFS_FS_STATE_DUMMY_FS_INFO instead of dummy root
  btrfs: tests, require fs_info for root
  btrfs: tests, move initialization into tests/
  btrfs: btrfs_test_opt and friends should take a btrfs_fs_info
  btrfs: prefix fsid to all trace events
  btrfs: plumb fs_info into btrfs_work
  btrfs: remove obsolete part of comment in statfs
  btrfs: hide test-only member under ifdef
  btrfs: Ratelimit "no csum found" info message
  btrfs: Add ratelimit to btrfs printing
  Btrfs: fix unexpected balance crash due to BUG_ON
  ...
2016-08-04 19:56:16 -04:00
Linus Torvalds
fb1b83d3ff Removed the MODULE_SIG_FORCE-means-no-MODULE_FORCE_LOAD patch.
Only interesting thing here is Jessica's patch to add ro_after_init support
 to modules.  The rest are all trivia.
 
 Cheers,
 Rusty.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXopD9AAoJENkgDmzRrbjxDVEP+waK+E3Y+vJHibLwwCYcVqLG
 OAkQoFXGqxYAo0faGtGPZczxDH/GVK754y+qugOeQvCgHJqit7qWmIUs5uRgqUMb
 uKjoUOfCBiVGUsaHfw7RisOP5FXvAk1jkFxBVtywPj6eIonLr9BB4VE813iXnYGG
 RkVFvAmFxMgq2BY+yjp4IDCGNVEFBq9UrXZ8XY+WGhI1pbxVp9SCUVrLckARDSS4
 t5NeVeLCFlNKmw+ElU7zCKaa4Cyloq9lGFBA1ZgchGADRsOrha9VHNRVxR0pHSIG
 100SW+nFhncNWqVQ2YgspVe1so993wGnORPpsb+o3dg7mIn2wkj6WhTfAKv/UQ1W
 7JUFaRi/rMC8h/njLKvbX+gmEU1d4nnTyZ76UFh+VxU6mbVWYqI44DCLpt+mPT13
 JwwqGGCDPnB/28KFmQITYAkdmvAV3u2aZLXJAQPxKVF7/IzklxHHz2ifMEwtPzOh
 UvuWhjmmPAqncKWXzflxMj8i4C3sPyAs0RDSrMXG7jZJlhguVea+b8bXNhEafR+n
 GM0btAfGw+VWluyNMlOpigSpJt/n6/hQtzlgBQGn7CeknNwamBe5MLGSN3N9MgL9
 WXma9sKn34IqjxtSSP5rJlwTRWHELUZIsKmOnWP4/3gwf1+Fe65ML2cCwp6saeMX
 ZjEosYxdKo32LiZhRDPR
 =URwe
 -----END PGP SIGNATURE-----

Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull module updates from Rusty Russell:
 "The only interesting thing here is Jessica's patch to add
  ro_after_init support to modules.  The rest are all trivia"

* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  extable.h: add stddef.h so "NULL" definition is not implicit
  modules: add ro_after_init support
  jump_label: disable preemption around __module_text_address().
  exceptions: fork exception table content from module.h into extable.h
  modules: Add kernel parameter to blacklist modules
  module: Do a WARN_ON_ONCE() for assert module mutex not held
  Documentation/module-signing.txt: Note need for version info if reusing a key
  module: Invalidate signatures on force-loaded modules
  module: Issue warnings when tainting kernel
  module: fix redundant test.
  module: fix noreturn attribute for __module_put_and_exit()
2016-08-04 09:14:38 -04:00