Make common function for decoding cacheinfo.
This code may print more info than the old version in some cases.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Refactor to reduce size of print_route and improve
readability.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Both next hop and route need to decode flags.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Having iplink_parse() and @struct iplink_req in include/utils.h does not
reflect its IP nature: move to ip/ip_common.h.
Move contents of ip/iplink_xdp.h and ip/iproute_lwtunnel.h to
ip/ip_common.h since they are small (i.e. only two function prototypes):
ip/iplink_bridge.c and ip/iplink_vrf.c prototypes already there.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
When running "ip route list default" without specifying an address family,
one gets all of the routes instead of just the default one. The same
holds for "exact default" and "match default".
This happens because a default route with an unspecified family
has the same all-zeroes value as no prefix specified at all. Thus the
following code blindly ignores the fact that a prefix was actually
specified.
This patch adds the flag PREFIXLEN_SPECIFIED to the default route too.
And then checks its value when filtering routes.
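The ambiguity described above can be illustrated with a minimal sketch. This is not the iproute2 code: `struct prefix_sketch`, `SKETCH_PREFIXLEN_SPECIFIED` and both helper functions are hypothetical stand-ins, and a family of 0 is assumed to mean "unspecified".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the PREFIXLEN_SPECIFIED flag named above. */
#define SKETCH_PREFIXLEN_SPECIFIED 0x1u

/* Hypothetical, simplified prefix: family 0 means "unspecified". */
struct prefix_sketch {
	uint32_t flags;
	int family;
	int bitlen;
};

/* Value-based check: "default" with an unspecified family is all zeroes,
 * so it looks exactly like "no prefix given" and no filtering happens. */
static bool filter_active_by_value(const struct prefix_sketch *p)
{
	return p->family != 0 || p->bitlen != 0;
}

/* Flag-based check: the parser records that a prefix was given at all. */
static bool filter_active_by_flag(const struct prefix_sketch *p)
{
	return (p->flags & SKETCH_PREFIXLEN_SPECIFIED) != 0 ||
	       filter_active_by_value(p);
}
```

With the flag set for "default", the flag-based check reports an active filter even though every other field is zero.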
Signed-off-by: Alexander Zubkov <green@msu.ru>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Metric is one of the "unique key" fields of a route in Linux, but
its value still cannot be used as a filter when listing routes with ip.
This makes writing checks in scripts, for example, inconvenient.
Signed-off-by: Alexander Zubkov <green@msu.ru>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
This patch adds fastopen_no_cookie option to enable/disable TCP fastopen
without a cookie on a per-route basis.
Support in Linux was added with 71c02379c762 (tcp: Configure TFO without
cookie per socket and/or per route).
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Christoph Paasch <cpaasch@apple.com>
This is an update for 460c03f3f3 ("iplink: double the buffer size also in
iplink_get()"). After this update, we no longer need to double the buffer
size every time the number of VFs increases.
With calls like rtnl_talk(&rth, &req.n, NULL, 0), we can simply remove the
length parameter.
With calls like rtnl_talk(&rth, nlh, nlh, sizeof(req)), I add a new variable
answer to avoid overwriting data in nlh, because it may have more info after
nlh. This also avoids the nlh buffer being too small.
We need to free answer after using it.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Phil Sutter <phil@nwl.cc>
This fixes a corner-case for routes with a certain metric locked to
zero:
| ip route add 192.168.7.0/24 dev eth0 window 0
| ip route add 192.168.7.0/24 dev eth0 window lock 0
Since the kernel doesn't dump the attribute if it is zero, both routes
added above would appear as if they were equal although they are not.
Fix this by taking mxlock value for the given metric into account before
skipping it if it is not present.
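The check described above can be sketched as follows. This is an illustration under assumptions, not the patch itself: `metric_is_shown` is a hypothetical name, and the lock mask is assumed to carry one bit per metric index.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decide whether a metric should be shown.
 * present: the kernel dumped a value for this metric
 * mxlock:  bitmask of locked metrics (bit idx set = metric idx is locked)
 * idx:     metric index, assumed to be in the range 0..31 */
static bool metric_is_shown(bool present, uint32_t mxlock, unsigned int idx)
{
	bool locked = (mxlock & (UINT32_C(1) << idx)) != 0;

	/* A locked metric is shown even when its zero value was not dumped. */
	return present || locked;
}
```

A metric that is absent and unlocked is still skipped, so "window 0" and "window lock 0" no longer look identical.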
Reported-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Phil Sutter <phil@nwl.cc>
Covscan complained about dead code but after reading it, I assume the
author's intention was to prefix the interface list with 'Oifs: '.
Initializing first to 1 and setting it to 0 after above prefix was
printed should fix it.
Signed-off-by: Phil Sutter <phil@nwl.cc>
This patch adds support for the seg6local lightweight tunnel
("ip route add ... encap seg6local ...").
Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
Since kernel commit 475abbf1ef67 ("ipv4: fib: Set offload indication
according to nexthop flags") offload indication is reported on a
per-nexthop basis.
Adjust iproute2 to display it.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: David Ahern <dsahern@gmail.com>
This patch replaces exits with returns in ip route
commands.
This allows processing to continue when invoked with ip -batch.
Signed-off-by: Élie Bouttier <elie@bouttier.eu>
This patch extends route get to support mpls specific
route attributes like RTA_NEWDST.
Input:
RTA_DST - input label
RTA_NEWDST - labels in packet for multipath selection
By default the getroute handler returns matched
nexthop label, via and oif
With fibmatch keyword (RTM_F_FIB_MATCH flag), full matched
route is returned.
example:
$ip -f mpls route show
101
nexthop as to 102/103 via inet 172.16.2.2 dev virt1-2
nexthop as to 302/303 via inet 172.16.12.2 dev virt1-12
201
nexthop as to 202/203 via inet6 2001:db8:2::2 dev virt1-2
nexthop as to 402/403 via inet6 2001:db8:12::2 dev virt1-12
$ip -f mpls route get 103
RTNETLINK answers: Network is unreachable
$ip -f mpls route get 101
101 as to 102/103 via inet 172.16.2.2 dev virt1-2
$ip -f mpls route get as to 302/303 101
101 as to 302/303 via inet 172.16.12.2 dev virt1-12
$ip -f mpls route get fibmatch 103
RTNETLINK answers: Network is unreachable
$ip -f mpls route get fibmatch 101
101
nexthop as to 102/103 via inet 172.16.2.2 dev virt1-2
nexthop as to 302/303 via inet 172.16.12.2 dev virt1-12
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
When modifying a route we set the RTA_OIF attribute only if a device was
specified with the "dev" or "oif" keyword. But for some unknown reason we
also checked for the presence of the "nexthop" keyword, even though it
has no effect. So remove the pointless check.
Signed-off-by: Jakub Sitnicki <jkbs@redhat.com>
Uses newly introduced RTM_GETROUTE flag RTM_F_FIB_MATCH
to return a matching fib route. Introduces 'fibmatch'
keyword to ip route get.
ipv4:
----
$ip route show
default via 192.168.0.2 dev eth0
10.0.14.0/24
nexthop via 172.16.0.3 dev dummy0 weight 1
nexthop via 172.16.1.3 dev dummy1 weight 1
$ip route get 10.0.14.2
10.0.14.2 via 172.16.1.3 dev dummy1 src 172.16.1.1
cache
$ip route get fibmatch 10.0.14.2
10.0.14.0/24
nexthop via 172.16.0.3 dev dummy0 weight 1
nexthop via 172.16.1.3 dev dummy1 weight 1
ipv6:
----
$ip -6 route show
2001:db9:100::/120 metric 1024
nexthop via 2001:db8:2::2 dev dummy0 weight 1
nexthop via 2001:db8:12::2 dev dummy1 weight 1
$ip -6 route get 2001:db9:100::1
2001:db9:100::1 from :: via 2001:db8:12::2 dev dummy1 \
src 2001:db8:12::1 metric 1024 pref medium
$ip -6 route get fibmatch 2001:db9:100::1
2001:db9:100::/120 metric 1024
nexthop via 2001:db8:12::2 dev dummy1 weight 1
nexthop via 2001:db8:2::2 dev dummy0 weight 1
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Acked-by: David Ahern <dsahern@gmail.com>
Add support for setting and displaying the ttl-propagation attribute
initially used by MPLS to control propagation of MPLS TTL to IPv4/IPv6
TTL/hop-limit on popping final label on a per-route basis.
Signed-off-by: Robert Shearman <rshearma@brocade.com>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
MPLS multipath routes are missing a space between 'nexthop' and 'via':
$ ip -net ns1 -f mpls ro ls
100
nexthopvia inet 172.16.2.2 dev virt12
nexthopvia inet 172.16.3.2 dev br0
Add it.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Use the new helper functions rta_getattr_u* instead of direct
cast of RTA_DATA(). Where RTA_DATA() is a structure, then remove
the unnecessary cast since RTA_DATA() is void *
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
This patch adds a new field that is printed in the end of the line which
denotes the real entry state. Before this patch an entry's IIF could
disappear and it would look like an unresolved one (iif = unresolved):
(3.0.16.1, 225.11.16.1) Iif: unresolved
with no way to really distinguish it from an unresolved entry.
After the patch if the dumped entry has RTNH_F_UNRESOLVED set we get:
(3.0.16.1, 225.11.16.1) Iif: unresolved State: unresolved
for unresolved entries and:
(0.0.0.0, 225.11.11.11) Iif: eth4 Oifs: eth3 State: resolved
for resolved entries after the OIF list. Note that "State:" has ':' in
it so it cannot be mistaken for an interface name.
And for the example above, we'd get:
(0.0.0.0, 225.11.11.11) Iif: unresolved State: resolved
Also when dumping all routes via ip route show table all,
it will show up as:
multicast 225.11.16.1/32 from 3.0.16.1/32 table default proto 17 unresolved
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
To specify multiple nexthops in a route the user is expected to use the
"nexthop" keyword which ip route uses to create the RTA_MULTIPATH.
However, ip route always accepts multiple 'via' keywords where only the
last one is used in the route leading to confusion. For example, ip
accepts this syntax:
$ ip ro add vrf red 1.1.1.0/24 via 10.100.1.18 via 10.100.2.18
but the route inserted by the kernel is just the last gateway:
1.1.1.0/24 via 10.100.2.18 dev eth2
which is not the full request from the user. Detect the presence of
multiple 'via' and give the user a hint to add nexthop:
$ ip ro add vrf red 1.1.1.0/24 via 10.100.1.18 via 10.100.2.18
Error: argument "via" is wrong: use nexthop syntax to specify multiple via
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
- Support adding, deleting and showing IP rules with UID ranges.
- Support querying per-UID routes via "ip route get uid <UID>".
UID range routing was added to net-next in 4fb7450683 ("Merge
branch 'uid-routing'")
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
ftell() may return -1 in the error case, which is not handled, so a
negative offset is passed to fseek(). The return code of
fseek() is also not checked.
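A minimal sketch of the checked pattern, assuming a helper that reads ahead and then returns to where it started. `peek_and_rewind` is a hypothetical name and not a function from iproute2.

```c
#include <stdio.h>

/* Read up to `len` bytes into buf, then seek back to the starting offset.
 * Returns 0 on success, -1 if ftell() or fseek() fails. */
static int peek_and_rewind(FILE *fp, char *buf, size_t len, size_t *nread)
{
	long pos = ftell(fp);

	if (pos == -1)
		return -1;	/* never hand a negative offset to fseek() */

	*nread = fread(buf, 1, len, fp);

	if (fseek(fp, pos, SEEK_SET) != 0)
		return -1;	/* the stream position is not where we expect */

	return 0;
}
```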
Reported-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
If we have multicast routes and do ip route show table all we'll get the
following output:
...
multicast ???/32 from ???/32 table default proto static iif eth0
The "???" are because the rtm_family is set to RTNL_FAMILY_IPMR instead
(or RTNL_FAMILY_IP6MR for ipv6). Add a simple workaround that returns the
real family based on the rtm_type (always RTN_MULTICAST for ipmr routes)
and the rtm_family. Similar workaround is already used in ipmroute, and
we can use this helper there as well.
After the patch the output is:
multicast 239.10.10.10/32 from 0.0.0.0/32 table default proto static iif eth0
Also fix a minor whitespace error and switch to tabs.
Reported-by: Satish Ashok <sashok@cumulusnetworks.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
The code is a bit messy, as it starts with space after text and at some
point switches to space before text. But either way, printing space
before *and* after text almost certainly leads to printing more
whitespace than necessary.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Prior to this patch, if a route entry's RTA_PREFSRC and RTA_GATEWAY
were both NULL, it was supposed to be restored ONLY as a local address.
But as tb[RTA_PREFSRC] was not checked when restoring local networks,
rtattr_cmp would return success if it was NULL, and this route entry
would be restored again as a local network.
This patch is to add tb[RTA_PREFSRC] check when restoring local networks.
Fixes: 74af8dd962 ("ip route: restore route entries in correct order")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Tested-by: Phil Sutter <phil@nwl.cc>
This big patch was compiled by vimgrepping for memset calls and changing
to C99 initializer if applicable. One notable exception is the
initialization of union bpf_attr in tc/tc_bpf.c: changing it would break
for older gcc versions (at least <=3.4.6).
Calls to memset for struct rtattr pointer fields for parse_rtattr*()
were just dropped since they are not needed.
The changes here allowed the compiler to discover some unused variables,
so get rid of them, too.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Sometimes we cannot restore route entries, because in kernel
[1] fib_check_nh()
[2] fib_valid_prefsrc()
cause some routes to depend on existence of others while adding.
For example, we saved all the routes, and flushed all tables
[a] default via 192.168.122.1 dev eth0
[b] 192.168.122.0/24 dev eth0 src 192.168.122.21
[c] broadcast 127.0.0.0 dev lo table local src 127.0.0.1
[d] local 127.0.0.0/8 dev lo table local src 127.0.0.1
[e] local 127.0.0.1 dev lo table local src 127.0.0.1
[f] broadcast 127.255.255.255 dev lo table local src 127.0.0.1
[g] broadcast 192.168.122.0 dev eth0 table local src 192.168.122.21
[h] local 192.168.122.21 dev eth0 table local src 192.168.122.21
[i] broadcast 192.168.122.255 dev eth0 table local src 192.168.122.21
Now start to restore them:
If we want to add [a], we have to add [b] first, because of [1] and
'via 192.168.122.1' in [a].
If we want to add [b], we have to add [h] first, because of [2] and
'src 192.168.122.21' in [b].
So the correct order to restore should be like:
[e][h] -> [b][c][d][f][g][i] -> [a]
This patch fixes it by traversing the file 3 times. Each pass restores
only part of the entries, according to the following conditions, to
make sure every entry can be restored successfully.
1. !gw && (!fib_prefsrc || fib_prefsrc == cfg->fc_dst)
2. !gw && (fib_prefsrc != cfg->fc_dst)
3. gw
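The three conditions can be read as a classifier that assigns each saved route to one pass. The sketch below is illustrative only: `restore_pass` and its boolean parameters are hypothetical, standing in for "route has a gateway", "route has a preferred source" and "preferred source equals the destination".

```c
#include <stdbool.h>

/* Return the pass (1, 2 or 3) in which a saved route is restored. */
static int restore_pass(bool has_gw, bool has_prefsrc, bool prefsrc_is_dst)
{
	if (has_gw)
		return 3;	/* needs the gateway's connected route first */
	if (!has_prefsrc || prefsrc_is_dst)
		return 1;	/* depends on no other entry */
	return 2;		/* needs the local address route of its src */
}
```

For the routes listed above, [e] and [h] fall into pass 1, [b] into pass 2 and [a] into pass 3, which matches the order given.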
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Phil Sutter <phil@nwl.cc>
Add vrf keyword to 'ip route' commands. Allows:
1. Users can list routes by VRF name:
$ ip route show vrf NAME
VRF tables have all routes including local and broadcast routes.
The VRF keyword filters LOCAL and BROADCAST routes; to see all
routes the table option can be used. Or to see local routes only
for a VRF:
$ ip route show vrf NAME type local
2. Add or delete a route for a VRF:
$ ip route {add|delete} vrf NAME <route spec>
3. Do a route lookup for a VRF:
$ ip route get vrf NAME ADDRESS
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Currently a timeout is multiplied by HZ in user-space and
then it is multiplied by HZ again in kernel-space.
$ ./ip/ip r add 2002::0/64 dev veth1 expires 10
$ ./ip/ip -6 r
2002::/64 dev veth1 metric 1024 linkdown expires 996sec pref medium
Cc: Xin Long <lucien.xin@gmail.com>
Cc: Hangbin Liu <liuhangbin@gmail.com>
Cc: Stephen Hemminger <shemming@brocade.com>
Fixes: 68eede2505 ("route: allow routes to be configured with expire values")
Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
There is only a single user who needs it to be reentrant (not really,
but it's safer like this), add rt_addr_n2a_r() for it to use.
Signed-off-by: Phil Sutter <phil@nwl.cc>
There are only three users which require it to be reentrant, the rest is
fine without. Instead, provide a reentrant format_host_r() for users
which need it.
Signed-off-by: Phil Sutter <phil@nwl.cc>
This is a bit pedantic, but brackets ([]) show optional values and since
TYPE must not become empty, they're not suited to surround the type
keyword choices. Use curly braces instead.
Also add some missing whitespace to the parameter list above.
Signed-off-by: Phil Sutter <phil@nwl.cc>
This patch adds support to add mpls multipath
routes.
example:
ip -f mpls route add 100 \
nexthop as 200 via inet 10.1.1.2 dev swp1 \
nexthop as 700 via inet 10.1.1.6 dev swp2
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
the warning was:
iproute.c:301:12: warning: 'val' may be used uninitialized in this
function [-Wmaybe-uninitialized]
features &= ~RTAX_FEATURE_ECN;
^
iproute.c:575:10: note: 'val' was declared here
__u32 val;
^
Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
Commit 0f7543322c ("route: ignore RTAX_HOPLIMIT of value -1")
accidentally reordered fprintf statements. This patch restores the
original ordering.
Fixes: 0f7543322c ("route: ignore RTAX_HOPLIMIT of value -1")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Older kernels use -1 internally as indicator to use the sysctl default,
but they still export the setting. Newer kernels use 0 to indicate that
(which is why the conversion from -1 to 0 was done here), but they also
stopped exporting the value. Since the meaning of -1 is clear, treat it
equally like default on newer kernels (which is to not print anything).
Signed-off-by: Phil Sutter <phil@nwl.cc>
Technically, the range of possible hoplimit values is defined by the IPv4
and IPv6 header formats. Both define the field to be eight bits in size,
which leads to a value range of [0;255]. Setting a packet's hoplimit
field to 0 does not make much sense though, as the next hop would
immediately drop the packet. Therefore Linux uses 0 as a special value
indicating to use the system's default hoplimit (configurable via
sysctl). In iproute, setting the hoplimit of a route to 0 is equivalent
to omitting the hoplimit parameter altogether, so it is actually not
necessary to allow that value to be specified, but keep it anyway for
backwards compatibility.
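The accepted range can be sketched as a small parser. This is an illustrative helper, not the iproute2 implementation; `parse_hoplimit` is a hypothetical name.

```c
#include <errno.h>
#include <stdlib.h>

/* Parse a hoplimit argument. Accepts 0..255, where 0 selects the system
 * default. Returns 0 on success, -1 on a malformed or out-of-range value. */
static int parse_hoplimit(const char *arg, unsigned int *out)
{
	char *end;
	unsigned long v;

	/* Reject signs, whitespace and empty input before strtoul() sees them. */
	if (*arg < '0' || *arg > '9')
		return -1;

	errno = 0;
	v = strtoul(arg, &end, 10);
	if (errno != 0 || *end != '\0' || v > 255)
		return -1;

	*out = (unsigned int)v;
	return 0;
}
```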
Signed-off-by: Phil Sutter <phil@nwl.cc>
If get_rt_realms() fails, try to get a possible raw u32 realms
value for the u32 RTA_FLOW/FRA_FLOW attribute, as it might be
useful to directly configure the hex value itself. And only if
that fails, then bail out.
The source realm is provided in the upper u16 (mask: 0xffff0000)
and the destination realm through the lower u16 part (mask:
0x0000ffff). This can be useful for tc's bpf realm matcher, but
also a full hex/mask param can be provided already for matching
through iptables' --realm cmdline option, for example.
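The layout of the raw u32 value can be sketched with three helpers. The function names are hypothetical; only the masks come from the description above.

```c
#include <stdint.h>

/* Combine a source and a destination realm into one raw u32 value:
 * source in the upper 16 bits, destination in the lower 16 bits. */
static uint32_t realms_pack(uint16_t src, uint16_t dst)
{
	return ((uint32_t)src << 16) | dst;
}

/* Extract the source realm (mask 0xffff0000). */
static uint16_t realms_src(uint32_t realms)
{
	return (uint16_t)((realms & 0xffff0000u) >> 16);
}

/* Extract the destination realm (mask 0x0000ffff). */
static uint16_t realms_dst(uint32_t realms)
{
	return (uint16_t)(realms & 0x0000ffffu);
}
```

For example, source realm 0x12 and destination realm 0x34 give the raw value 0x00120034.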
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
This patch adds support to parse and print lwtunnel
encapsulation attributes attached to routes for MPLS
and IP tunnels.
Examples:
MPLS:
Add ipv4 route with mpls encap attributes:
$ ip route add 40.1.2.0/30 encap mpls 200 via inet 40.1.1.1 dev eth3
$ ip route show
40.1.2.0/30 encap mpls 200 via 40.1.1.1 dev eth3
Add ipv4 multipath route with mpls encap attributes:
$ ip route add 10.1.1.0/30 nexthop encap mpls 200 via 10.1.1.1 dev eth0 \
nexthop encap mpls 700 via 40.1.1.2 dev eth3
$ ip route show
10.1.1.0/30
nexthop encap mpls 200 via 10.1.1.1 dev eth0 weight 1
nexthop encap mpls 700 via 40.1.1.2 dev eth3 weight 1
IP:
$ ip route add 10.1.1.1/24 encap ip id 200 dst 20.1.1.1 dev vxlan0
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Jiri Benc <jbenc@redhat.com>
Currently 'ip route get' does not show the table the lookup result comes
from and prior to kernel commit c36ba6603a11 the response from the kernel
was hardcoded to the main table. From the discussion this appears to be
a leftover from the route cache where the cached entry lost the table id
and so the result was hardcoded to main table.
c36ba6603a11 added the RTM_F_LOOKUP_TABLE flag to maintain that behavior
but to allow new tools to ask for the actual table id for the lookup.
This patch adds that flag to ip route get request and if the result is
not the main table shows the table id.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Currently when we specify AF_INET6 when it is disabled, we will get
all routes.
For example, we can boot kernel with ipv6.disable=1 and try to get ipv6
routes:
$ ip -6 route show
default via 192.168.122.1 dev eth0 proto static metric 100
192.168.122.0/24 dev eth0 proto kernel scope link src 192.168.122.141 metric 100
Here are ipv4 routes and this is unexpected behaviour.
Signed-off-by: Andrew Vagin <avagin@openvz.org>
This patch replaces exits with returns in
ip route get command handling. This allows batching
of ip route get commands.
$cat route_get_batch.txt
route get 10.0.14.2
route get 12.0.14.2
route get 10.0.14.4
$ip -batch route_get_batch.txt
local 10.0.14.2 dev lo src 10.0.14.2
cache <local>
12.0.14.2 via 192.168.0.2 dev eth0 src 192.168.0.15
cache
10.0.14.4 dev dummy0 src 10.0.14.2
cache
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
This patch fixes incorrect -EINVAL errors due to invalid
scope and type during mpls route deletes.
$ip -f mpls route add 100 as 200 via inet 10.1.1.2 dev swp1
$ip -f mpls route show
100 as to 200 via inet 10.1.1.2 dev swp1
$ip -f mpls route del 100 as 200 via inet 10.1.1.2 dev swp1
RTNETLINK answers: Invalid argument
$ip -f mpls route del 100
RTNETLINK answers: Invalid argument
After patch:
$ip -f mpls route show
100 as to 200 via inet 10.1.1.2 dev swp1
$ip -f mpls route del 100 as 200 via inet 10.1.1.2 dev swp1
$ip -f mpls route show
Always set type to RTN_UNICAST for mpls route add/deletes.
Also to keep things consistent with kernel set scope to
RT_SCOPE_UNIVERSE for both mpls and ipv6 routes. Both mpls and ipv6 route
deletes ignore scope.
Suggested-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: Vivek Venkataraman <vivek@cumulusnetworks.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
There have been several instances where response from kernel
has overrun the stack buffer from the caller. Avoid future problems
by passing a size argument.
Also drop the unused peer and group arguments to rtnl_talk.
If kernel complains about ip route request, exit status should be
2 not 1.
This fixes regression introduced by:
commit 42ecedd4ba
Author: Roopa Prabhu <roopa@cumulusnetworks.com>
Date: Tue Mar 17 19:26:32 2015 -0700
fix ip -force -batch to continue on errors
The kernel now has the capability to offload FDB and FIB entries to hardware.
It is important to let users know if table entries are also offloaded to
hardware. Currently offloaded FDB entries are indicated by the existence of
the flag 'external' on the entry as of the following commit:
commit 28467b7f3f
Author: Scott Feldman <sfeldma@gmail.com>
Date: Thu Dec 4 09:57:15 2014 +0100
bridge/fdb: add flag/indication for FDB entry synced from offload device
When the patch to add support for indicating that FIB entries were also
offloaded was posted to netdev by Scott Feldman it became clear that 'external'
would not be an ideal name for routes. There could definitely be confusion
about what this might mean since many routes are to external networks -- a
collision/confusion that did not happen with FDB.
Scott Feldman asked me to check with others and build consensus around a name.
After speaking with several people about this I am proposing we refer to both
FDB and FIB entries that are currently backed by hardware (based on the work
done in rocker) with the flag 'offload' appended to the end of the entry.
Some people liked the string 'external,' others liked 'hardware,' but the point
is to communicate that these routes are available to something that will
offload the forwarding normally done by the kernel. Since the term 'offload'
is used so frequently it seems appropriate to use the same language in
ip/bridge output.
The term 'offload' also seems to resonate with many of the people who have
responded on Scott's original thread or to those who I reached out to directly
and did respond to my query, so it seems we have reached consensus that it
should be the term used going forward.
v2: rebased against net-next branch
Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com>
CC: Jamal Hadi Salim <jhs@mojatatu.com>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: John W. Linville <linville@tuxdriver.com>
CC: Roopa Prabhu <roopa@cumulusnetworks.com>
CC: Scott Feldman <sfeldma@gmail.com>
CC: Stephen Hemminger <stephen@networkplumber.org>
This allows querying and setting the route preference. It's usually set from
the IPv6 Neighbor Discovery Router Advertisement messages.
Introduced in "ipv6: expose RFC4191 route preference via rtnetlink", enqueued
for Linux 4.1.
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
- Pull in the uapi mpls.h
- Update rtnetlink.h to include the mpls rtnetlink notification multicast group.
- Define AF_MPLS in utils.h if it is not defined from elsewhere
as is done with AF_DECnet
The address syntax for multiple mpls labels is a complete invention.
When I looked there seemed to be no widespread convention for talking
about an mpls label stack in text form. Sometimes people did:
"{ Label1, Label2, Label3 }", sometimes people would do:
"[ label3, label2, label1 ]", and most of the time label
stacks were not explicitly shown at all.
The syntax I wound up using, so it would not have spaces and so it
would be visually distinct from other kinds of addresses, is:
label1/label2/label3 where label1 is the label at the top of the label
stack and label3 is the label at the bottom of the label stack.
When there is a single label this matches what seems to be convention
with other tools. Just print out the numeric value of the mpls label.
The netlink protocol for labels uses the on the wire format for a
label stack. The ttl and traffic class are expected to be 0. Using
the on the wire format is common and what happens with other address
types. BGP when passing label stacks also uses this technique with the
exception that the ttl byte is not included making each label in a BGP
label stack 3 bytes instead of 4.
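A sketch of how the label1/label2/label3 syntax could map onto on-the-wire label stack entries. It assumes the standard 32-bit entry layout (20-bit label, 3-bit traffic class, bottom-of-stack bit, 8-bit ttl) with the bottom-of-stack bit set on the last label. `mpls_stack_encode` is a hypothetical helper, not the parser used by iproute2.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed layout of one 32-bit label stack entry:
 * label in bits 31..12, traffic class in bits 11..9,
 * bottom-of-stack in bit 8, ttl in bits 7..0. */
#define LSE_LABEL_SHIFT	12
#define LSE_BOS_BIT	(UINT32_C(1) << 8)
#define LSE_LABEL_MAX	0xFFFFFu

/* Parse "label1/label2/..." into wire-format entries: 4 bytes each,
 * big-endian, traffic class and ttl left at 0. Returns the number of
 * labels written, or -1 on a syntax error, an out-of-range label, or
 * more than max_labels labels. */
static int mpls_stack_encode(const char *text, uint8_t *out, size_t max_labels)
{
	size_t count = 0;

	for (;;) {
		char *end;
		unsigned long label;
		uint32_t entry;

		if (*text < '0' || *text > '9')
			return -1;
		label = strtoul(text, &end, 10);
		if (label > LSE_LABEL_MAX || count == max_labels)
			return -1;
		if (*end != '\0' && *end != '/')
			return -1;

		entry = (uint32_t)label << LSE_LABEL_SHIFT;
		if (*end == '\0')
			entry |= LSE_BOS_BIT;	/* last label is the bottom of the stack */

		/* Store the entry most significant byte first. */
		out[4 * count + 0] = (uint8_t)(entry >> 24);
		out[4 * count + 1] = (uint8_t)(entry >> 16);
		out[4 * count + 2] = (uint8_t)(entry >> 8);
		out[4 * count + 3] = (uint8_t)entry;
		count++;

		if (*end == '\0')
			return (int)count;
		text = end + 1;
	}
}
```

Writing the bytes by hand keeps the sketch independent of host byte order.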
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
This attribute is like RTA_DST except it specifies the destination
address to place on a packet when it leaves the host. For ip based
protocols this is destination NAT and not a common part of forwarding.
For protocols like MPLS label swapping is something that typically
happens on every hop.
There is likely to be a RTA_NEWSRC at some point so RTA_NEWDST
is printed as "as to" and can be specified either as "as to"
or just "as"
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Add support for the RTA_VIA attribute that specifies an address family
as well as an address for the next hop gateway.
To make it easy to pass this, reorder inet_prefix so that its tail
is a proper RTA_VIA attribute.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
For some address families (like AF_PACKET) it is helpful to have the
length when printing the address.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
On ip route print dump, label externally offloaded routes with "external".
Offloaded routes are flagged with RTNH_F_EXTERNAL, a recent addition to
net-next. For example:
$ ip route
default via 192.168.0.2 dev eth0
11.0.0.0/30 dev swp1 proto kernel scope link src 11.0.0.2 external
11.0.0.4/30 via 11.0.0.1 dev swp1 proto zebra metric 20 external
11.0.0.8/30 dev swp2 proto kernel scope link src 11.0.0.10 external
11.0.0.12/30 via 11.0.0.9 dev swp2 proto zebra metric 20 external
12.0.0.2 proto zebra metric 30 external
nexthop via 11.0.0.1 dev swp1 weight 1
nexthop via 11.0.0.9 dev swp2 weight 1
12.0.0.3 via 11.0.0.1 dev swp1 proto zebra metric 20 external
12.0.0.4 via 11.0.0.9 dev swp2 proto zebra metric 20 external
192.168.0.0/24 dev eth0 proto kernel scope link src 192.168.0.15
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Reviewed-by: Jiri Pirko <jiri@resnulli.us>
This patch replaces exits with returns in several
iproute2 commands. This fixes `ip -batch -force`
to not exit but continue on errors.
$cat c.txt
route del 1.2.3.0/24 dev eth0
route del 1.2.4.0/24 dev eth0
route del 1.2.5.0/24 dev eth0
route add 1.2.3.0/24 dev eth0
$ip -force -batch c.txt
RTNETLINK answers: No such process
Command failed c.txt:2
RTNETLINK answers: No such process
Command failed c.txt:3
Reported-by: Sven-Haegar Koch <haegar@sdinet.de>
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
This patch adds configuration and dumping of congestion control metric
for ip route, for example:
ip route add <dst> dev foo congctl [lock] dctcp
Reference: http://thread.gmane.org/gmane.linux.network/344733
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
This allows selectively enabling explicit congestion notification via
the routing table.
If this ecn feature is not set, the kernel will use the tcp_ecn sysctl
to decide whether to use ECN when establishing a TCP connection.
At the time of this writing, the kernel supports ecn and allfrags, but
allfrags is of dubious value and not implemented here.
Example:
ip route change 192.168.2.0/24 dev eth0 features ecn
Signed-off-by: Florian Westphal <fw@strlen.de>
In "ip route show" output unicast type, main table, boot protocol and
universe scope are hidden as default labels.
Sometimes it is helpful to show the hidden labels so that people who are
not familiar enough with the routing subsystem can map the output of
"ip route show" to the kernel source code.
With this patch "ip route show" with -d option shows the default labels.
Example of difference of output with -d option:
$ ./ip/ip -4 route show table all dev virbr1
...
192.168.121.0/28 proto kernel scope link src 192.168.121.1
...
$ ./ip/ip -4 -d route show table all dev virbr1
...
unicast 192.168.121.0/28 table main proto kernel scope link src 192.168.121.1
...
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
This patch adds quickack option to enable/disable TCP quick ack
mode on a per-route basis.
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Cong Wang <amwang@redhat.com>
Fixes Debian bug #700434
The table id in the filter needs to be unsigned to avoid conversion to -1.
The documentation for "ip" suggests that, when using multiple routing tables, the table ID can be an arbitrary 32-bit number. I've been writing a script that calculates a table ID based on an IP address and sets up tables accordingly. This seems to work for everything I've tried except "ip route flush". If you specify a table to flush with an ID over 2^31, it flushes all IPv4 routing tables. For example:
Will delete all routing tables, including the default one. Needless to say, this is quite annoying. I think this is an upstream bug, but your opinions will be greatly appreciated.