The sample tc action allows sampling packets matching a classifier. It
peeks randomly packets, and samples them using the psample netlink
channel. The user can specify the psample group, which the packet will be
sampled to, the sampling rate and the packet truncation (to save
kernel-user traffic).
The sampled packets contain informative metadata, for example, the input
interface and the original packet length.
The action syntax:
tc filter add [...] \
action sample rate <RATE> group <GROUP> [trunc <SIZE>]
[...]
Where:
RATE := The sampling rate which is the ratio of packets observed at the
data source to the samples generated
GROUP := the psample module sampling group
SIZE := optional truncation size
An example for a common usecase of the sample tc action: to sample ingress
traffic from interface eth1, one may use the commands:
tc qdisc add dev eth1 handle ffff: ingress
tc filter add dev eth1 parent ffff: \
matchall action sample rate 12 group 4
Where the first command adds an ingress qdisc and the second starts
sampling randomly with an average of one sampled packet per 12 packets
on dev eth1 to psample group 4.
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
In order to ensure no backward/forward compatiablity problems,
make sure that all kernel headers used come from the local copy.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
iplink_vrf has 2 functions used to validate a user given device name is
a VRF device and to return the table id. If the user string is not a
device name ip commands with a vrf keyword show a confusing error
message: "RTNETLINK answers: No such device".
Add a variant of rtnl_talk that does not display the "RTNETLINK answers"
message and update iplink_vrf to use it.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Add make_path to recursively call mkdir as needed to create a given
path with the given mode.
Add find_cgroup2_mount to lookup path where cgroup2 is mounted. If it
is not already mounted, cgroup2 is mounted under /var/run/cgroup2 for
use by iproute2.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Based on version in kernel repo, samples/bpf/libbpf.h
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Adds support to configure BPF programs as nexthop actions via the LWT
framework.
Example:
ip route add 192.168.253.2/32 \
encap bpf out obj lwt_len_hist_kern.o section len_hist \
dev veth0
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Now that we made the BPF loader generic as a library, reuse it
for loading XDP programs as well. This basically adds a minimal
start of a facility for iproute2 to load XDP programs. There
currently only exists the xdp1_user.c sample code in the kernel
tree that sets up netlink directly and an iovisor/bcc front-end.
Since we have all the necessary infrastructure in place already
from tc side, we can just reuse its loader back-end and thus
facilitate migration and usability among the two for people
familiar with tc/bpf already. Sharing maps, performing tail calls,
etc works the same way as with tc. Naturally, once kernel
configuration API evolves, we will extend new features for XDP
here as well, resp. extend dumping of related netlink attributes.
Minimal example:
clang -target bpf -O2 -Wall -c prog.c -o prog.o
ip [-force] link set dev em1 xdp obj prog.o # attaching
ip [-d] link # dumping
ip link set dev em1 xdp off # detaching
For the dump, intention is that in the first line for each ip
link entry, we'll see "xdp" to indicate that this device has an
XDP program attached. Once we dump some more useful information
via netlink (digest, etc), idea is that 'ip -d link' will then
display additional relevant program information below the "link/
ether [...]" output line for such devices, for example.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
This action could be used before redirecting packets to a shared tunnel
device, or when redirecting packets arriving from a such a device.
The 'unset' action is optional. It is used to explicitly unset the
metadata created by the tunnel device during decap. If not used, the
metadata will be released automatically by the kernel.
The 'set' operation, will set the metadata with the specified values for
the encap.
For example, the following flower filter will forward all ICMP packets
destined to 11.11.11.2 through the shared vxlan device 'vxlan0'. Before
redirecting, a metadata for the vxlan tunnel is created using the
tunnel_key action and it's arguments:
$ tc filter add dev net0 protocol ip parent ffff: \
flower \
ip_proto 1 \
dst_ip 11.11.11.2 \
action tunnel_key set \
src_ip 11.11.0.1 \
dst_ip 11.11.0.2 \
id 11 \
action mirred egress redirect dev vxlan0
Signed-off-by: Amir Vadai <amir@vadai.me>
This is needed for some HWs to do proper macthing and steering.
Possible values are none, link, network, transport.
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
This work moves the bpf loader into the iproute2 library and reworks
the tc specific parts into generic code. It's useful as we can then
more easily support new program types by just having the same ELF
loader backend. Joint work with Thomas Graf. I hacked a rough start
of a test suite to make sure nothing breaks [1] and looks all good.
[1] https://github.com/borkmann/clsact/blob/master/test_bpf.sh
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Adjusting iproute2 utility to support new macvlan link type mode called
"source".
Example of commands that can be applied:
ip link add link eth0 name macvlan0 type macvlan mode source
ip link set link dev macvlan0 type macvlan macaddr add 00:11:11:11:11:11
ip link set link dev macvlan0 type macvlan macaddr del 00:11:11:11:11:11
ip link set link dev macvlan0 type macvlan macaddr flush
ip -details link show dev macvlan0
Based on previous work of Stefan Gula <steweg@gmail.com>
Signed-off-by: Michael Braun <michael-dev@fami-braun.de>
Cc: steweg@gmail.com
If we have multicast routes and do ip route show table all we'll get the
following output:
...
multicast ???/32 from ???/32 table default proto static iif eth0
The "???" are because the rtm_family is set to RTNL_FAMILY_IPMR instead
(or RTNL_FAMILY_IP6MR for ipv6). Add a simple workaround that returns the
real family based on the rtm_type (always RTN_MULTICAST for ipmr routes)
and the rtm_family. Similar workaround is already used in ipmroute, and
we can use this helper there as well.
After the patch the output is:
multicast 239.10.10.10/32 from 0.0.0.0/32 table default proto static iif eth0
Also fix a minor whitespace error and switch to tabs.
Reported-by: Satish Ashok <sashok@cumulusnetworks.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
This patch adds support for the stats argument to the bridge
vlan command which will display the per-vlan statistics and the device
each vlan belongs to with its flags. The supported command filtering
options are dev and vid. Also the man page is updated to explain the new
option.
The patch uses the new RTM_GETSTATS interface with a filter_mask to dump
all bridges and ports vlans. Later we can add support for using the
per-device dump and filter it in the kernel instead.
Example:
$ bridge -s vlan show
port vlan id
br0 1 Egress Untagged
RX: 2536 bytes 20 packets
TX: 2536 bytes 20 packets
101
RX: 43158 bytes 50 packets
TX: 43158 bytes 50 packets
eth1 1 Egress Untagged
RX: 2536 bytes 20 packets
TX: 2536 bytes 20 packets
100
RX: 0 bytes 0 packets
TX: 0 bytes 0 packets
101
RX: 43158 bytes 50 packets
TX: 43158 bytes 50 packets
102
RX: 16897 bytes 93 packets
TX: 0 bytes 0 packets
The format is the same as bridge vlan show but with stats, even though
under the hood the calls done to the kernel are different.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
All users of genl have the same code to open a genl socket and resolve
the family for their specific protocol. Introduce a helper to initialize
the handle, and use it in all the genl code.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Add two NLA's that allow configuration of Infiniband node or port GUIDs
by referencing the IPoIB net device set over the physical function. The
format to be used is as follows:
ip link set dev ib0 vf 0 node_guid 00:02:c9:03:00:21:6e:70
ip link set dev ib0 vf 0 port_guid 00:02:c9:03:00:21:6e:78
Signed-off-by: Eli Cohen <eli@mellanox.com>
Kernel gained support for filtering link dumps with commit dc599f76c22b
("net: Add support for filtering link dump by master device and kind").
Add support to ip link command. If a user passes master device or
kind to ip link command they are added to the link dump request message.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Use kernel shared buffer occupancy control commands to make snapshot and
clear occupancy watermarks. Also, allow to show occupancy values in a
nice way.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Implement kernel devlink shared buffer interface. Introduce new object
"sb" and allow to browse the shared buffer parameters and also change
configuration.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Follow-up to kernel commit 6c9059817432 ("bpf: pre-allocate hash map
elements"). Add flags support, so that we can pass in BPF_F_NO_PREALLOC
flag for disallowing preallocation. Update examples accordingly and also
remove the BPF_* map helper macros from them as they were not very useful.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Add new signatures for BPF_FUNC_csum_diff, BPF_FUNC_skb_get_tunnel_opt
and BPF_FUNC_skb_set_tunnel_opt.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
There is only a single user who needs it to be reentrant (not really,
but it's safer like this), add rt_addr_n2a_r() for it to use.
Signed-off-by: Phil Sutter <phil@nwl.cc>
There are only three users which require it to be reentrant, the rest is
fine without. Instead, provide a reentrant format_host_r() for users
which need it.
Signed-off-by: Phil Sutter <phil@nwl.cc>
This adds two helper functions which map a given data field to a color,
so color_fprintf() statements don't have to be duplicated with only a
different color value depending on that data field's value. In order for
this to work in a generic way, COLOR_CLEAR has been added to serve as a
fallback default of uncolored output.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Netlink already provides hello_timer, tcn_timer, topology_change_timer
and gc_timer, so let's make them visible.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
This enables a user to remove an offline peer from the kernel data
structures. This could for example be useful when deliberately scaling
in peer nodes in a cloud environment.
Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
This enables a user to remove an offline peer from the kernel data
structures. This could for example be useful when deliberately scaling
in peer nodes in a cloud environment.
Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Fix a whitespace in bpf_dump_error() usage, and also a missing closing
bracket in ntohl() macro for eBPF programs.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
This patch:
- Adds a utility function for parsing a 64 bit address
- Adds a utility function for converting a 64 bit address to ASCII
- Adds and ILA encap type in lwt tunnels
Signed-off-by: Tom Herbert <tom@herbertland.com>
Improve example files further and add a more generic set of possible
helpers for them that can be used.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
The recently introduced object pinning can be further extended in order
to allow sharing maps beyond tc namespace. F.e. maps that are being pinned
from tracing side, can be accessed through this facility as well.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
This larger work addresses one of the bigger remaining issues on
tc's eBPF frontend, that is, to allow for persistent file descriptors.
Whenever tc parses the ELF object, extracts and loads maps into the
kernel, these file descriptors will be out of reach after the tc
instance exits.
Meaning, for simple (unnested) programs which contain one or
multiple maps, the kernel holds a reference, and they will live
on inside the kernel until the program holding them is unloaded,
but they will be out of reach for user space, even worse with
(also multiple nested) tail calls.
For this issue, we introduced the concept of an agent that can
receive the set of file descriptors from the tc instance creating
them, in order to be able to further inspect/update map data for
a specific use case. However, while that is more tied towards
specific applications, it still doesn't easily allow for sharing
maps accross multiple tc instances and would require a daemon to
be running in the background. F.e. when a map should be shared by
two eBPF programs, one attached to ingress, one to egress, this
currently doesn't work with the tc frontend.
This work solves exactly that, i.e. if requested, maps can now be
_arbitrarily_ shared between object files (PIN_GLOBAL_NS) or within
a single object (but various program sections, PIN_OBJECT_NS) without
"loosing" the file descriptor set. To make that happen, we use eBPF
object pinning introduced in kernel commit b2197755b263 ("bpf: add
support for persistent maps/progs") for exactly this purpose.
The shipped examples/bpf/bpf_shared.c code from this patch can be
easily applied, for instance, as:
- classifier-classifier shared:
tc filter add dev foo parent 1: bpf obj shared.o sec egress
tc filter add dev foo parent ffff: bpf obj shared.o sec ingress
- classifier-action shared (here: late binding to a dummy classifier):
tc actions add action bpf obj shared.o sec egress pass index 42
tc filter add dev foo parent ffff: bpf obj shared.o sec ingress
tc filter add dev foo parent 1: bpf bytecode '1,6 0 0 4294967295,' \
action bpf index 42
The toy example increments a shared counter on egress and dumps its
value on ingress (if no sharing (PIN_NONE) would have been chosen,
map value is 0, of course, due to the two map instances being created):
[...]
<idle>-0 [002] ..s. 38264.788234: : map val: 4
<idle>-0 [002] ..s. 38264.788919: : map val: 4
<idle>-0 [002] ..s. 38264.789599: : map val: 5
[...]
... thus if both sections reference the pinned map(s) in question,
tc will take care of fetching the appropriate file descriptor.
The patch has been tested extensively on both, classifier and
action sides.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
If get_rt_realms() fails, try to get a possible raw u32 realms
value for the u32 RTA_FLOW/FRA_FLOW attribute, as it might be
useful to directly configure the hex value itself. And only if
that fails, then bail out.
The source realm is provided in the upper u16 (mask: 0xffff0000)
and the destination realm through the lower u16 part (mask:
0x0000ffff). This can be useful for tc's bpf realm matcher, but
also a full hex/mask param can be provided already for matching
through iptables' --realm cmdline option, for example.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
This patch adds support to parse and print lwtunnel
encapsulation attributes attached to routes for MPLS
and IP tunnels.
example:
Add ipv4 route with mpls encap attributes:
Examples:
MPLS:
$ ip route add 40.1.2.0/30 encap mpls 200 via inet 40.1.1.1 dev eth3
$ ip route show
40.1.2.0/30 encap mpls 200 via 40.1.1.1 dev eth3
Add ipv4 multipath route with mpls encap attributes:
$ ip route add 10.1.1.0/30 nexthop encap mpls 200 via 10.1.1.1 dev eth0 \
nexthop encap mpls 700 via 40.1.1.2 dev eth3
$ ip route show
10.1.1.0/30
nexthop encap mpls 200 via 10.1.1.1 dev eth0 weight 1
nexthop encap mpls 700 via 40.1.1.2 dev eth3 weight 1
IP:
$ ip route add 10.1.1.1/24 encap ip id 200 dst 20.1.1.1 dev vxlan0
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Jiri Benc <jbenc@redhat.com>
This patch introduces two new api's rta_nest and rta_nest_end to
nest attributes inside a rta attribute represented by 'struct rtattr'
as required to construct a nexthop. Also adds rta_addattr* variants
for u8, u16 and u64 as needed to support encapsulation.
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Jiri Benc <jbenc@redhat.com>
When having optional classid, most minimal command can be sth
like:
tc filter add dev foo parent X: bpf obj prog.o
Therefore, adapt the code so that a next argument will not be
enforced as the case currently.
Also, minor cleanup on the classid, where we should rather
have used addattr32(), and add flags for exec configuration,
for example (using short notation):
tc filter add dev foo parent X: bpf da obj prog.o
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Add support for filtering neighbor dumps by master device. Kernel side
support provided by commit 21fdd092acc7. Since the feature is not
available in older kernels the user is given a warning message if the
kernel does not support the request.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
This patch follows the changes of commit 4d98ab0 ("Fix FSF address in
file headers"), fixing file headers added after it.
Signed-off-by: Phil Sutter <phil@nwl.cc>
This adds support for slightly less output than is normally provided by
'ip link show' and 'ip addr show'. This is a bit better when you have a
host with lots of interfaces. Sample output:
$ ip -br link show
lo UNKNOWN 00:00:00:00:00:00 <LOOPBACK,UP,LOWER_UP>
p2p1 UP 08:00:27:ee:0b:3b <BROADCAST,MULTICAST,UP,LOWER_UP>
p7p1 UP 08:00:27:9d:62:9f <BROADCAST,MULTICAST,UP,LOWER_UP>
p8p1 DOWN 08:00:27:dc:d8:ca <NO-CARRIER,BROADCAST,MULTICAST,UP>
p9p1 UP 08:00:27:76:d9:75 <BROADCAST,MULTICAST,UP,LOWER_UP>
p7p1.100@p7p1 UP 08:00:27:9d:62:9f <BROADCAST,MULTICAST,UP,LOWER_UP>
$ ip -br -4 addr show
lo UNKNOWN 127.0.0.1/8
p2p1 UP 192.168.56.2/24
p7p1 UP 70.0.0.1/24
p8p1 DOWN 80.0.0.1/24
p9p1 UP 10.0.5.15/24
p7p1.100@p7p1 UP 200.0.0.1/24
$ ip -br -6 addr show
lo UNKNOWN ::1/128
p2p1 UP fe80::a00:27ff:feee:b3b/64
p7p1 UP 7000::1/8 fe80::a00:27ff:fe9d:629f/64
p8p1 DOWN 8000::1/8
p9p1 UP fe80::a00:27ff:fe76:d975/64
p7p1.100@p7p1 UP fe80::a00:27ff:fe9d:629f/64
$ ip -br addr show p7p1
p7p1 UP 70.0.0.1/24 7000::1/8 fe80::a00:27ff:fe9d:629f/64
v2: Now with color support!
v3: Better field width estimation (except netdev names to keep output at a
decent width) and whitespace fixup.
Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com>
There have been several instances where response from kernel
has overrun the stack buffer from the caller. Avoid future problems
by passing a size argument.
Also drop the unused peer and group arguments to rtnl_talk.
With this patch, it's now possible to listen in all netns that have an nsid
assigned into the netns where the socket is opened.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
It is hard to quickly find what you are looking for in the output of the
ip command. Color helps.
This patch adds a '-c' flag to highlight these with individual colors:
- interface name
- ip address
- mac address
- up/down state
Signed-off-by: Mathias Nyman <m.nyman@iki.fi>
Tested-by: Yegor Yefremov <yegorslists@googlemail.com>
The warning was:
In file included from namespace.c:14:0:
../include/namespace.h: In function ‘setns’:
../include/namespace.h:37:2: warning: implicit declaration of function ‘syscall’ [-Wimplicit-function-declaration]
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Two commands are added:
- ip netns list-id
- ip monitor nsid
A cache is also added to remember the association between the iproute2 netns
name (from /var/run/netns/) and the nsid.
To avoid interfering with the rth socket, a new rtnl socket (rtnsh) is used to
get nsid (we may send rtnl request during listing on rth).
Example:
$ ip netns list-id
nsid 0 (iproute2 netns name: foo)
$ ip monitor nsid
Deleted nsid 0 (iproute2 netns name: foo)
nsid 16 (iproute2 netns name: bar)
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
This work finalizes both eBPF front-ends for the classifier and action
part in tc, it allows for custom ELF section selection, a simplified tc
command frontend (while keeping compat), reusing of common maps between
classifier and actions residing in the same object file, and exporting
of all map fds to an eBPF agent for handing off further control in user
space.
It also adds an extensive example of how eBPF can be used, and a minimal
self-contained example agent that dumps map data. The example is well
documented and hopefully provides a good starting point into programming
cls_bpf and act_bpf.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
If '-nm' specified that do not fail if there is no
default class names file in /etc/iproute2.
Changed default class name file cls_names -> tc_cls.
Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
- Pull in the uapi mpls.h
- Update rtnetlink.h to include the mpls rtnetlink notification multicast group.
- Define AF_MPLS in utils.h if it is not defined from elsewhere
as is done with AF_DECnet
The address syntax for multiple mpls labels is a complete invention.
When I looked there seemed to be no wide spread convention for talking
about an mpls label stack in text for. Sometimes people did:
"{ Label1, Label2, Label3 }", sometimes people would do:
"[ label3, label2, label1 ]", and most of the time label
stacks were not explicitly shown at all.
The syntax I wound up using, so it would not have spaces and so it
would visually distinct from other kinds of addresses is.
label1/label2/label3 Where label1 is the label at the top of the label
stack and label3 is the label at the bottom on the label stack.
When there is a single label this matches what seems to be convention
with other tools. Just print out the numeric value of the mpls label.
The netlink protocol for labels uses the on the wire format for a
label stack. The ttl and traffic class are expected to be 0. Using
the on the wire format is common and what happens with other address
types. BGP when passing label stacks also uses this technique with the
exception that the ttl byte is not included making each label in a BGP
label stack 3 bytes instead of 4.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Add support for the RTA_VIA attribute that specifies an address family
as well as an address for the next hop gateway.
To make it easy to pass this reorder inet_prefix so that it's tail
is a proper RTA_VIA attribute.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Add the functions family_name and read_family to convert an address
family to a string and to convernt a string to an address family.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
For some address families (like AF_PACKET) it is helpful to have the
length when prenting the address.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
This work adds the tc frontend for kernel commit e2e9b6541dd4 ("cls_bpf:
add initial eBPF support for programmable classifiers").
A C-like classifier program (f.e. see e2e9b6541dd4) is being compiled via
LLVM's eBPF backend into an ELF file, that is then being passed to tc. tc
then loads, if any, eBPF maps and eBPF opcodes (with fixed-up eBPF map file
descriptors) out of its dedicated sections, and via bpf(2) into the kernel
and then the resulting fd via netlink down to cls_bpf. cls_bpf allows for
annotations, currently, I've used the file name for that, so that the user
can easily identify his filter when dumping configurations back.
Example usage:
clang -O2 -emit-llvm -c cls.c -o - | llc -march=bpf -filetype=obj -o cls.o
tc filter add dev em1 parent 1: bpf run object-file cls.o classid x:y
tc filter show dev em1 [...]
filter parent 1: protocol all pref 49152 bpf handle 0x1 flowid x:y cls.o
I placed the parser bits derived from Alexei's kernel sample, into tc_bpf.c
as my next step is to also add the same support for BPF action, so we can
have a fully fledged eBPF classifier and action in tc.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
The kernel now provides ids for peer netns. This patch implements a new command
'set' to assign an id.
When netns are listed, if an id is assigned, it is now displayed.
Example:
$ ip netns add foo
$ ip netns set foo 1
$ ip netns
foo (id: 1)
init_net
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
This change allows to exec some cmd on each
named netns (except default) by specifying '-all' option:
# ip -all netns exec ip link
Each command executes synchronously.
Exit status is not considered, so there might be a case
that some CMD can fail on some netns but success on the other.
EXAMPLES:
1) Show link info on all netns:
$ ip -all netns exec ip link
netns: test_net
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
4: tap0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 500
link/ether 1a:19:6f:25:eb:85 brd ff:ff:ff:ff:ff:ff
netns: home0
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
4: tap0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 500
link/ether ea:1a:59:40:d3:29 brd ff:ff:ff:ff:ff:ff
netns: lan0
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
4: tap0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 500
link/ether ce:49:d5:46:81:ea brd ff:ff:ff:ff:ff:ff
2) Set UP tap0 device for the all netns:
$ ip -all netns exec ip link set dev tap0 up
netns: test_net
netns: home0
netns: lan0
Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
When HAVE_SETNS is not set, iproute2 provides a local implementation of this
function based on __NR_setns.
This macro is defined in sys/syscall.h, which was not included, thus the local
implementation always returned -1.
CC: Vadim Kochan <vadim4j@gmail.com>
Fixes: eb67e4498a ("lib: Add netns_switch func for change network namespace")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Warning was:
In file included from bridge.c:16:0:
../include/namespace.h:33:12: warning: ‘setns’ defined but not used [-Wunused-function]
CC: Vadim Kochan <vadim4j@gmail.com>
Fixes: eb67e4498a ("lib: Add netns_switch func for change network namespace")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Sometimes, it is more convenient to get only one specific nested attribute by
type. For example for IFLA_AF_SPEC where type is address family (AF_INET6).
So add this helper for this purpose.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
New netns_switch func moved to the lib/namespace.c from ip/ipnetns.c
so it can be used from the other tools for fast switching
network namespace.
Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Added another timestamp format to look like more logging info:
[2014-12-22T22:36:50.489 ] 2: enp0s25: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default
link/ether 3c:97:0e:a3:86:2e brd ff:ff:ff:ff:ff:ff
Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
Replaced handling netlink messages by rtnl_dump_filter
from lib/libnetlink.c, also:
- removed unused dump_fp arg;
- added MAGIC_SEQ #define for 123456 seq id;
- silently exit if ENOENT errno is caused for NETLINK_SOCK_DIAG proto
in lib/libnetlink.c: rtnl_duml_filter_l(...) function. This fix
was added in a3fd8e58c1 by Eric
for misc/ss.c
Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
There were only few Netlink protocol names
which were printed on the screen:
rtnl, fw, tcpdiag
So added the ability to identify Netlink proto name
from /etc/iproute/nl_protos or from static table.
Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
Added 'ip fou...' commands to enable/disable UDP ports for doing
foo-over-udp and Generic UDP Encapsulation variant. Arguments are port
number to bind to and IP protocol to map to port (for direct FOU).
Examples:
ip fou add port 7777 gue
ip fou add port 8888 ipproto 4
The first command creates a GUE port, the second creates a direct FOU
port for IPIP (receive payload is a assumed to be an IPv4 packet).
Signed-off-by: Tom Herbert <therbert@google.com>
The RTM_NEWLINK message accepts ifi_index non-zero value and lets
creation of links with given index (if it's free, or course). This
functionality is available since linux-v3.5.
This patch makes this API available via ip tool.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Use warn_unused_result to enforce checking return value of rtnl_send,
and fix where the errors are.
Suggested by initial patch from Petr Písař <ppisar@redhat.com>
The kernel already supports it, so add the support
to iproute2 as well.
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
This change adds the interface group to the output of "ip link show".
It also makes "ip link" print _all_ devices if no group filter is specified;
previously, only interfaces of the default group (0) were shown.
Signed-off-by: Stefan Tomanek <stefan.tomanek@wertarbyte.de>
ss -i can output "fastopen" attribute if socket used Fast Open
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Recent kernel patches added support for VLAN filtering on the bridge.
This functionality allows one to turn a basic bridge into a VLAN bridge,
where VLANs dicatate packet forwarding and header transformation.
To configure the VLANs on the bridge and its ports a new command is
added to the 'bridge' utility.
# bridge vlan add dev eth0 vid 10 pvid untagged brdev
# bridge vlan add
# bridge vlan delete dev eth0 vid 10
# bridge vlan show
This command supports the following flags:
master - peform the operation on the software bridge device. This is
the default behavior.
self - perform the operation on the hardware associated with the port.
This flag is required when the device is the bridge device and
the configuration is desired on the bridge device itself (not
one of the ports).
pvid - Set the PVID (port vlan id) for a given port. Any untagged
frames arriving on the port will be assigned to this vlan.
untagged - Sets the egress policy of for a given vlan. Default port
egress policy is tagged. Set this flag if you wish traffic
associated with this VLAN to exit the port untagged.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Commit (5a650703d4 Makefile: make warnings into
errors ) causes the following build error.
gcc -Wall -Wstrict-prototypes -Werror -Wmissing-prototypes
-Wmissing-declarations -Wold-style-definition -O2 -I../include
-DRESOLVE_HOSTNAMES -DLIBDIR=\"/usr/lib\" -DCONFDIR=\"/etc/iproute2\"
-D_GNU_SOURCE -DCONFIG_GACT -DCONFIG_GACT_PROB -DIPT_LIB_DIR=\"/lib/xtables\"
-DYY_NO_INPUT -c -o m_ipt.o m_ipt.c
cc1: warnings being treated as errors
m_ipt.c:72: error: no previous prototype for 'xtables_register_target'
m_ipt.c:361: error: no previous prototype for 'build_st'
make[1]: *** [m_ipt.o] Error 1
This is fixed by adding the prototype in the header include/iptables.h
I am not sure if this is due to something wrong on my build system but I am
using current glibc 2.17.
Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com>
This patch implements `bridge monitor mdb`.
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Cong Wang <amwang@redhat.com>
This patch implements:
bridge mdb { add | del } dev DEV port PORT grp GROUP
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Cong Wang <amwang@redhat.com>
This patch allows to manage ip6 tunnels via the interface ip link.
The syntax for parameters is the same that 'ip -6 tunnel'.
It also allows to display tunnels parameters with 'ip -details link' or
'ip -details monitor link'.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Example of the output:
$ ip monitor netconf&
[1] 24901
$ echo 0 > /proc/sys/net/ipv6/conf/all/forwarding
ipv6 dev lo forwarding off
ipv6 dev eth0 forwarding off
ipv6 all forwarding off
$ echo 1 > /proc/sys/net/ipv4/conf/eth0/forwarding
ipv4 dev eth0 forwarding on
$ ip -6 netconf
ipv6 all forwarding on mc_forwarding 0
$ ip netconf show dev eth0
ipv4 dev eth0 forwarding on rp_filter off mc_forwarding 1
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Minor cleanup of original patch, made sure netconf.h matched
result of santized kernel headers
ip tcp_metrics/tcpmetrics
We support get/del for single entry and dump for
show/flush.
v3:
- fix rtt/rttvar shifts as suggested by Eric Dumazet
- show rtt/rttvar usecs as suggested by David Laight
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Both macros are used together, so better to have
single define. Update all requests in ipl2tp.c to use the
new macro.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Create libgenl.h and libgenl.c. They will contain
common code for GENL users such as ipl2tp, tcp_metrics, etc.
Somewhat simplified by Stephen Hemminger
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Get the same info as from /proc file plus the peer inode.
Applies on top of new sock diag patch and udp diag patch.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
gcc -DLIBDIR=\"/usr/lib64\" -D_GNU_SOURCE -fmessage-length=0 -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector -funwind-tables -fasynchronous-unwind-tables -g -Wstrict-prototypes -fPIC -DXT_LIB_DIR=\"/usr/lib64/xtables\" -I../include -DRESOLVE_HOSTNAMES -DLIBDIR=\"/usr/lib64\" -fPIC -c -o ipx_pton.o ipx_pton.c
In file included from ../include/utils.h:8:0,
from ipx_ntop.c:5:
../include/libnetlink.h: In function 'rta_getattr_u64':
../include/libnetlink.h:84:2: warning: implicit declaration of function 'memcpy'
../include/libnetlink.h:84:2: warning: incompatible implicit declaration of built-in function 'memcpy'
Both rtnl_talk and rtnl_dump had a callback for handling portions
of netlink message that do not match the correct pid or seq.
But this callback was never used by any part of iproute2 so remove
it.
Support ECNSEEN reporting in ss command.
ESTAB 0 0 10.170.73.123:4900
10.170.73.125:51001 uid:501 ino:385994 sk:f31e5f00
mem:(r0,w0,f0,t0) ts sack ecn ecnseen bic wscale:8,8 rto:210
rtt:18.75/15 ato:40 cwnd:10 send 69.9Mbps rcv_space:32768
"ecn" means TCP session negociated ECN capability (TCP layer) at setup
time
"ecnseen" at least one frame with ECT(0) or ECT(1) or ECN (IP layer) was
received from peer.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
User can specify device group to list by using the group keyword:
ip link show group test
If no group is specified, 0 (default) is implied.
Signed-off-by: Vlad Dogaru <ddvlad@rosedu.org>
The get_jiffies() function retrieves rtt-type values in units of
milliseconds. This patch updates the function name accordingly,
following the pattern given by dst_metric() <=> dst_metric_rtt().
The get_jiffies() function retrieves rtt-type values in units of
milliseconds. This patch updates the function name accordingly,
following the pattern given by dst_metric() <=> dst_metric_rtt().
Add the group keyword to ip link set, which has the following meaning:
If both a group and a device name are pressent, we change the device's
group to the specified one. If only a group is present, then the
operation specified by the rest of the command should apply on an entire
group, not a single device.
So, to set eth0 to the default group, one would use
ip link set dev eth0 group default
Conversely, to set all the devices in the default group down, use
ip link set group default down
Signed-off-by: Vlad Dogaru <ddvlad@rosedu.org>
User can specify device group to list by using the group keyword:
ip link show group test
If no group is specified, 0 (default) is implied.
Signed-off-by: Vlad Dogaru <ddvlad@rosedu.org>
Add the iproute2 support for the ACT_CSUM action. Can be used as
following, certainly in conjunction with the ACT_PEDIT action (pedit):
# In order to DNAT (stateless) IPv4 packet from 192.168.1.100 to
# 0x12345678 (18.52.86.120), and update the IPv4 header checksum and
# the UDP checksum (the last one, only if the packet is UDP).
tc filter add eth0 prio 1 protocol ip parent ffff: \
u32 match ip src 192.168.1.100/32 flowid :1 \
action pedit munge offset 16 u32 set 0x12345678 \
pipe csum ip and udp
# In order to alter destination address of IPv6 TCP packets from fc00::1
# and correct the TCP checksum (nothing happened? except maybe for
# checksums in the TCP payload ...).
tc filter add eth0 prio 1 protocol ipv6 parent ffff: \
u32 match ip6 src fc00::1/128 match ip6 protocol 0x06 0xff flowid :1 \
action pedit munge offset 24 u32 set 0x12345678 \
pipe csum tcp
The default remains at 10 for backwards compatibility.
For instance:
# ip addr flush dev eth2
*** Flush remains incomplete after 10 rounds. ***
# ip -l 20 addr flush dev eth2
*** Flush remains incomplete after 20 rounds. ***
# ip -loops 0 addr flush dev eth2
#
This is useful for getting rid of large numbers of IP
addresses in scripts.
Signed-off-by: Ben Greear <greearb@candelatech.com>
We can use rxhash to classify the traffic into flows. As rxhash maybe
supplied by NIC or RPS, it is cheaper.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Unless promote_secondaries has been active deleting the primary address of
an interface will automatically delete all the secondary addresses.
In the case where ip flush requests the primary then secondary addresses to
be removed - which is the order the addresses are returned by the kernel -
this will cause an error as by the time the request to remove a secondary
address is made it will be missing as it will have been deleted in the
course of deleting the primary address.
This approach to solving this problem orders requests for the
deletion of secondary addresses before primary ones providing
rtnl_dump_filter_l(), a version of rtnl_dump_filter() that
iterates over a list of filters. And by providing two specialised
filters print_addrinfo_secondary() and print_addrinfo_primary().
rtnl_dump_filter_l() first iterates over all addresses using
print_addrinfo_secondary(), which appends secondary addresses to the
request buffer. Then again using print_addrinfo_primary() which appends
primary addresses.
This approach should work regardless of it promote_secondaries is
active or not. And regardless of if any primary of secondary addresses
are present or not.
Signed-off-by: Simon Horman <horms@verge.net.au>
After calling ll_init_map, all of the information stored in the link-layer map
can be retrieved by function calls (ll_index_to_*), except for the link-layer
address. This patch fills the gap by adding a ll_index_to_addr function.
Changes welcome.
Signed-off-by: David Ward <david.ward@ll.mit.edu>
The iptables code supports a "no shared libs" mode where it can be used
without requiring dlfcn related functionality. This adds similar support
to iproute2 so that it can easily be used on systems like nommu Linux (but
obviously with a few limitations -- no dynamic plugins).
Rather than modify every location that uses dlfcn.h, I hooked the dlfcn.h
header with stub functions when shared library support is disabled. Then
symbol lookup is done via a local static lookup table (which is generated
automatically at build time) so that internal symbols can be found.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
It uses 1MB as receive buf limit by default (without
increasing /proc/sys/net/core/rmem_max it will be limited by less
however) and allows to specify the size manually using "-rcvbuf X"
(-r is already used, so you need to specify at least -rc).
Additionally rtnl_listen() continues on ENOBUFS after printing the
error message.
Many thanks to Yevgeny Kosarzhevsky <yevg@pisem.net> for reporting
and a lot of testing
Thanks to Jan Engelhardt <jengelh@medozas.de> for a lot of advice
Thanks to Denys Fedoryschenko <denys@visp.net.lb> for some sample
code that he tried and thanks to Andreas Henriksson <andreas@fatal.se>
(who maintains iproute2 on debian) for the persistent followup.
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Provides ability to edit queue_mapping field
Provides ability to edit priority field
usage: action skbedit [queue_mapping QUEUE_MAPPING] [priority PRIORITY]
at least one option must be select, or both at the same time
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
“ip maddr show” on an infiniband address causes a stack corruption
because the length of the address for Infiniband (20 bytes, as
described in kernel doc Documentation/infiniband/ipoib.txt) does not
fit on the 16 bytes of the field in which it gets stored.
The proposed patch increases the size of the hardware address from 4
__u32 to 8 and also adds a check to avoid overriding the available
size while parsing the hardware address.
This bug affects current upstream code AFAICT.
Hope this helps,
Cheers,
Olivier.
“ip maddr show ib0” causes a stack corruption because the length of the address
for Infiniband (20 see kernel doc Documentation/infiniband/ipoib.txt) does not
fit on the 16 bytes of the field in which it gets stored.
The proposed patch increases the size of the hardware address from 4 u32 to 8
and adds a check to avoid overriding the available size while parsing the
hardware address.
Introducing the function that does the ATM cell alignment, and
modifying tc_calc_rtable() to use this based upon a linklayer
parameter.
Modified from original to use constants from atm.h and
fix all the usages of rtable in same patch.
Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
commit 94e9cba778cb97d77d9146dc3bd38ff195bc2c8a
Author: Patrick McHardy <kaber@trash.net>
Date: Sat Feb 2 18:22:16 2008 +0100
[IPROUTE]: cls_flow: add vlan-tag support
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
[IPROUTE]: Add flow classifier support
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Some usages of rtnl_send could cause errors (ie flush requests)
others do a listen afterwards.
Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Use santized kernel header for veth.h and put in correct place
to prevent possible future problems with API.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
This routine parses CLI attributes, describing generic link
parameters such as name, address, etc.
This is mostly copy-pasted from iplink_modify().
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
This patch includes support for the Intra-Site Automatic Tunnel
Addressing Protocol (ISATAP) per RFC4214.
The following diffs are specific to the iproute2-2.6.23
software distribution. This message includes the full and
patchable diff text; please use this version to apply patches.
Signed-off-by: Fred L. Templin <fred.l.templin@boeing.com>
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
The problem was that length of allocation changed but caller not told.
Anyway, the patch fixes a problem resulting in a double free
that occurs when using batch files that contains a special combination
of broken up lines and comments as reported in:
http://bugs.debian.org/398912
Thanks to Michal Pokrywka <mpokrywka@hoga.pl> for testcase and information
on which conditions problem could be reproduced under.
Signed-off-by: Andreas Henriksson <andreas@fatal.se>
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Enable users of ip to specify the times for rtt, rttvar and rto_min
in human-friendly terms a la "tc" while maintaining backwards
compatability with the previous "raw" mechanism. Builds upon
David Miller's uncommited patch to set rto_min.
Signed-off-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
This is a resend of the iproute VLAN patch with the if_link.h changes
edited out since the headers are already synced.
[IPROUTE]: VLAN support
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
This patch applies on top of Patrick McHardy's RTNETLINK
patches to add nested compat attributes. This is needed to maintain
ABI for sch_{rr|prio} in the kernel with respect to tc. A new option,
namely multiqueue, was added to sch_prio and sch_rr. This will allow
a user to turn multiqueue support on for sch_prio or sch_rr at loadtime.
Also, tc qdisc ls will display whether or not multiqueue is enabled on
that qdisc. When in multiqueue mode, a user can specify a value of 0 for
bands, and the number of bands will be created to match the number of
queues on the device.
This patch is to support the new sch_rr (round-robin) qdisc being proposed
in NET for multiqueue network device support in the Linux network stack.
It uses q_prio.c as the template, since the qdiscs are nearly identical,
outside of the ->dequeue() routine.
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
This adds capability for iproute2 to send nested attributes to the
kernel, while maintaining backwards compatibility.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Update the included version of the genetlink.h header to the multicast
group API and make the generic netlink controller part show multicast
groups where applicable. Also fix two typos.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
In order to support these new flags add current
linux/if.h into the directory with the local copies.
This caused troubles with outdated redefinitions from net/if.h
so I've removed the dependency on it.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
cheers,
jamal
[GENERAL] nl_mgrp to crap if base multicast groups exceeded
The old scheme of bitmasks works only for the first 32 groups.
Above that the setsockopt scheme must be used.
Signed-off-by: J Hadi Salim <hadi@cyberus.ca>
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Update kernel headers to be versions from 2.6.20.y
Solve cross compile build problems with x_tables and netfilter.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
[utils] make muticast group to bitmask conversion generic
Signed-off-by: J Hadi Salim <hadi@cyberus.ca>
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Stepehen,
Didnt hear back from you, please apply this one; needed for the next
patches.
cheers,
jamal
[GENL] Update generic netlink header
The header file needs to be uptodate with recent changes to allow
for forward compatibility
Signed-off-by: J Hadi Salim <hadi@cyberus.ca>
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Fix ip6tunnel.c to be fit with current ip command style.
Unlike other modules currently iptunnel (and ip6tunnel) is not
designed as protocol-independent because of unarranged structure
between IPv4 and IPv6.
Usage: ip -f inet6 tunnel { add | change | del | show } [ NAME ]
[ remote ADDR local ADDR ] [ dev PHYS_DEV ]
[ encaplimit ELIM ]
[ hoplimit HLIM ] [ tc TC ] [ fl FL ]
[ dscp inherit ]
Where: NAME := STRING
ADDR := IPV6_ADDRESS
ELIM := { none | 0..255 }(default=4)
HLIM := 0..255 (default=64)
TC := { 0x0..0xff | inherit }
FL := { 0x0..0xfffff | inherit }
Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
[IPROUTE]: Add support for larger number of routing tables
Support support for 2^32 routing tables by using the new RTA_TABLE
attribute for specifying tables > 255 and intepreting it if it is
sent by the kernel.
When tables > 255 are used on a kernel not supporting it an error will
occur because of the unknown netlink attribute.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
The controller is the only module using this at the moment.
Thomas has a sample user of genetlink that would fit here; bug him
for it.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>