The Solaris code has gone through a deprecation cycle. No-one
has said anything to us and worse of all we don't have any test
systems running Solaris to know if we are making changes that
are breaking on Solaris. Remove it from the system so
we can clean up a bit.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Code was added in the past to support a value of VRF_DEFAULT different
from 0. This option was abandoned, the default vrf id is always 0.
Remove this code, this will simplify the code and improve performance
(use a constant value instead of a function that performs tests).
Signed-off-by: Christophe Gouault <christophe.gouault@6wind.com>
In all outputs (text and json): simplify and optimize the vrf name
display, use the vrf_id_to_name() handler.
Note: vrf_id_to_name() has a safeguard system that prevents from
crashing when the vrf cannot be found because it changed in some
(unexpected) manner, it returns "n/a".
Note: "vrf n/a" will now be displayed instead of "vrf UNKNOWN" in this
case, like in most other frr components.
This safeguard was missing for show ip route json, so this
optimization also fixes a potential crash.
Signed-off-by: Christophe Gouault <christophe.gouault@6wind.com>
Variable "show ip route" commands invoke the same helper
(do_show_ip_route), potentially several times.
When asking to dump a non-default vrf, all vrfs or all tables, the
output is messy, the header summarizing abbreviations is repeated
several times, excess line feeds appear, the default table of default
VRF is concatenated to the previous table output...
Normalize the output:
- whatever the case, display the common header at most once, if there
is at least an entry to dump.
- when using a "vrf all" or "table all" command, prepend a line with
the VRF and table (even for the default vrf or table).
- when dumping a specific vrf or table, prepend a line with the VRF
and table.
Example (vrf all)
=================
router# show ip route vrf all
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued route, r - rejected route
VRF main:
C>* 10.0.2.0/24 is directly connected, mgmt0, 00:24:09
K>* 10.0.2.2/32 [0/100] is directly connected, mgmt0, 00:24:09
C>* 10.125.0.0/24 is directly connected, ntfp2, 00:00:26
VRF private:
S>* 1.1.1.0/24 [1/0] via 10.125.0.2, loop0, 00:00:29
C>* 10.125.0.0/24 is directly connected, loop0, 00:00:42
Example (main vrf)
==================
router# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued route, r - rejected route
C>* 10.0.2.0/24 is directly connected, mgmt0, 00:24:41
K>* 10.0.2.2/32 [0/100] is directly connected, mgmt0, 00:24:41
C>* 10.125.0.0/24 is directly connected, ntfp2, 00:00:58
Example (specific vrf)
======================
router# show ip route vrf private
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued route, r - rejected route
VRF private:
S>* 1.1.1.0/24 [1/0] via 10.125.0.2, loop0, 00:01:23
C>* 10.125.0.0/24 is directly connected, loop0, 00:01:36
Example (all tables)
====================
router# show ip route table all
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued route, r - rejected route
VRF main table 200:
S>* 4.4.4.4/32 [1/0] via 10.125.0.3, ntfp2, 00:01:51
VRF main table 254:
C>* 10.0.2.0/24 is directly connected, mgmt0, 00:25:34
K>* 10.0.2.2/32 [0/100] is directly connected, mgmt0, 00:25:34
C>* 10.125.0.0/24 is directly connected, ntfp2, 00:01:51
Example (all vrf, all table)
============================
router# show ip route table all vrf all
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued route, r - rejected route
VRF main table 200:
S>* 4.4.4.4/32 [1/0] via 10.125.0.3, ntfp2, 00:02:15
VRF main table 254:
C>* 10.0.2.0/24 is directly connected, mgmt0, 00:25:58
K>* 10.0.2.2/32 [0/100] is directly connected, mgmt0, 00:25:58
C>* 10.125.0.0/24 is directly connected, ntfp2, 00:02:15
VRF private table 200:
S>* 2.2.2.0/24 [1/0] via 10.125.0.2, loop0, 00:02:18
VRF private table 254:
S>* 1.1.1.0/24 [1/0] via 10.125.0.2, loop0, 00:02:18
C>* 10.125.0.0/24 is directly connected, loop0, 00:02:31
Example (specific table)
========================
router# show ip route table 200
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued route, r - rejected route
VRF main table 200:
S>* 4.4.4.4/32 [1/0] via 10.125.0.3, ntfp2, 00:05:26
Signed-off-by: Christophe Gouault <christophe.gouault@6wind.com>
This series of events:
$ sudo ifconfig lo0 add 4.4.4.4/32
$ sudo ifconfig lo0 inet 4.4.4.4/32 delete
would end up leaving the 4.4.4.4/32 address on the interface under
freebsd.
This all boils down to the fact that the interface is not
considered connected yet we have a destination. If the
destination is the same and we are not connected ignore
it on freebsd.
I am sure there are other fun scenarios that someone
will have to squirrel out.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Problem commit -
[
b169fd6fd5 zebra: support for MAC-IP sync routes
]
That commit had accidentally replaced a mac-ip del to bgp with a mac
del (consequence of a bad cut-paste).
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Changes to setup peer-synced as static in the dataplane. This prevents
them from being flushed out when the local switch cannot establish
their reachability.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
As a part of the re-factoring some of the evpn_vni_es apis got re-named
as evpn_evpn_es. Changed them to evpn_es_evi to make it common to
vxlan and mpls.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
When a MAC is detected duplicate on a local
learn event (with freeze action),
do not send update to bgp to advertise into
evpn control plane.
With evpn mh, inform_client flag is set and
sends notification to bgp albeit dup detect
is set.
Check mac are detected as duplicate before
setting inform_client to true.
Ticket:CM-29817
Reviewed By:CCR-10329
Testing Done:
Enable DAD with freeze action
Upon local learn MAC detected as duplica
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
When installing rules pass by the interface name across
zapi.
This is being changed because we have a situation where
if you quickly create/destroy ephermeal interfaces under
linux the upper level protocol may be trying to add
a rule for a interface that does not quite exist
at the moment. Since ip rules actually want the
interface name ( to handle just this sort of situation )
convert over to passing the interface name and storing
it and using it in zebra.
Ticket: CM-31042
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
this is used when parsing the newly network namespaces. actually, to
track the link of some interfaces like vxlan interfaces, both link index
and link nsid are necessary. if a vxlan interface is moved to a new
netns, the link information is in the default network namespace, then
LINK_NSID is the value of the netns by default in the new netns. That
value of the default netns in the new netns is not known, because the
system does not automatically assign an NSID of default network
namespace in the new netns. Now a new NSID of default netns, seen from
that new netns, is created. This permits to store at netns creation the
default netns relative value for further usage.
Because the default netns value is set from the new netns perspective,
it is not needed anymore to use the NETNSA_TARGET_NSID attribute only
available in recent kernels.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
the walk routine is used by vxlan service to identify some contexts in
each specific network namespace, when vrf netns backend is used. that
walk mechanism is extended with some additional paramters to the walk
routine.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
when duplicate address detection is observed, some incrementation,
some timing mechanisms need to be done. For that the main evpn
configuration is retrieved. Until now, the VRF that was storing the dad
config parameters was the same VRF that hosted the VXLAN interface. With
netns backend, this is not true, as the VXLAN interface is in the
same VRF as the bridge interface. The modification takes same definition
as in BGP, that is to say that there is a single bgp evpn instance, and
this is that instance that will give the correct config settings.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
this change is needed when a MAC/IP entry is learned by zebra, and the
entry happens to be in a different namespace. So that the entry be
active, the correct vni match has to be found.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
1. MAC ref of a zero ESI was accidentally creating a new ES with zero
ES id.
2. When an ES was deleted and re-added the ES was not being sent to BGP
because of a stale flag that suppressed the update as a dup.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
When we get a rib deletion event and we already have
that particular route node in the queue to be reprocessed,
just note that someone from kernel land has done us dirty
and allow it to be cleaned up by normal processing
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Imagine a situation where a interface is bouncing up/down.
The interface comes up and daemons like pbr will get a nht
tracking callback for a connected interface up and will install
the routes down to zebra. At this same time the interface can
go down. But since zebra is busy handling route changes ( from pbr )
it has not read the netlink message and can get into a situation
where the route resolves properly and then we attempt to install
it into the kernel( which is rejected ). If the interface
bounces back up fast at this point, the down then up netlink
message will be read and create two route entries off the connected
route node. Zebra will then enqueue both route entries for future processing.
After this processing happens the down/up is collapsed into an up
and nexthop tracking sees no changes and does not inform any upper
level protocol( in this case pbr ) that nexthop tracking has changed.
So pbr still believes the nexthops are good but the routes are not
installed since pbr has taken no action.
Fix this by immediately running rnh when we signal a connected
route entry is scheduled for removal. This should cause
upper level protocols to get a rnh notification for the small
amount of time that the connected route was bouncing around like
a madman.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
It was wrongly assumed that the kernel is replying in batches when multiple
requests fail. The kernel sends one error message at a time, so we can
simply keep reading data from the socket as long as possible.
Signed-off-by: Jakub Urbańczyk <xthaid@gmail.com>
During testing it was noticed that routes were considered
installed by zebra, but the kernel did not have the route.
Upon close debugging of the rib it was noticed that FRR
was turning a dplane_ctx_route_init into a success and
FRR was now in a bad state.
2020/08/26 17:55:53.897436 PBR: route_notify_owner: [0.0.0.0/0] Route Removed succeeded for table: 10012
2020/08/26 17:55:53.897572 ZEBRA: 0.0.0.0/0: uptime == 432033, type == 24, instance == 0, table == 10012
2020/08/26 17:55:53.897622 ZEBRA: rib_meta_queue_add: (0:10012):0.0.0.0/0: queued rn 0x5566b0ea7680 into sub-queue 5
2020/08/26 17:55:53.907637 ZEBRA: default(0:10012):0.0.0.0/0: Processing rn 0x5566b0ea7680
2020/08/26 17:55:53.907665 ZEBRA: default(0:10012):0.0.0.0/0: Examine re 0x5566b0d01200 (pbr) status 2 flags 1 dist 200 metric 0
2020/08/26 17:55:53.907702 ZEBRA: default(0:10012):0.0.0.0/0: After processing: old_selected 0x0 new_selected 0x5566b0d01200 old_fib 0x0 new_fib 0x5566b0d01200
2020/08/26 17:55:53.907713 ZEBRA: default(0:10012):0.0.0.0/0: Adding route rn 0x5566b0ea7680, re 0x5566b0d01200 (pbr)
2020/08/26 17:55:53.907879 ZEBRA: default(0:10012):0.0.0.0/0: rn 0x5566b0ea7680 dequeued from sub-queue 5
2020/08/26 17:55:53.907943 ZEBRA: netlink_route_multipath: RTM_NEWROUTE 0.0.0.0/0 vrf 0(10012)
2020/08/26 17:55:53.910756 ZEBRA: default(0:10012):0.0.0.0/0 Processing dplane result ctx 0x5566b0ea82f0, op ROUTE_INSTALL result SUCCESS
2020/08/26 17:55:53.910769 ZEBRA: update_from_ctx: default(0:10012):0.0.0.0/0: SELECTED, re 0x5566b0d01200
2020/08/26 17:55:53.910785 ZEBRA: default(0:10012):0.0.0.0/0 update_from_ctx(): no fib nhg
2020/08/26 17:55:53.910793 ZEBRA: default(0:10012):0.0.0.0/0 update_from_ctx(): rib nhg matched, changed 'true'
2020/08/26 17:55:53.910802 ZEBRA: (0:10012):0.0.0.0/0: Redist update re 0x5566b0d01200 (pbr), old 0x0 (None)
2020/08/26 17:55:53.910812 ZEBRA: Notifying Owner: 24 about prefix 0.0.0.0/0(10012) 2 vrf: 0
2020/08/26 17:55:53.910912 PBR: route_notify_owner: [0.0.0.0/0] Route installed succeeded for table: 10012
2020/08/26 17:55:55.400516 ZEBRA: RTM_DELROUTE 0.0.0.0/0 vrf default(0) table_id: 10012 metric: 20 Admin Distance: 0
2020/08/26 17:55:55.400527 ZEBRA: rib_delete: (0:10012):0.0.0.0/0: rn 0x5566b0ea7680, re 0x5566b0d01200 (pbr) was deleted from kernel, adding
We were receiving a notification from the kernel that the route was deleted and deciding
that we needed to reinstall it. At that point in time when it got into the dplane
handlers to convert it to the dplane pthread, the dplane decided to drop the request
convert it too a success and not do anything.
This code change removes the conversion from this failure to success and
notifies the upper level about it. After this change the default route
to table 10012 is now properly marked as rejected:
root@mlx-2700-07:mgmt:/var/log/frr# vtysh -c "show ip route table 10012"
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued route, r - rejected route
VRF default table 10012:
F>r 0.0.0.0/0 [200/0] via 172.168.1.164, isp2-uplink (vrf PUBLIC), weight 1, 00:24:48
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When we are not using nexthop groups, there is no need to
test for whether or not they are installed correctly or not
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The fuzzing code that is in the master branch is outdated and unused, so it
is worth to remove it to improve readablity of the code.
All the code related to the fuzzing is in the `fuzz` branch.
Signed-off-by: Jakub Urbańczyk <xthaid@gmail.com>
in order to create appropriate policy route, family attribute is stored
in ipset and iptable zapi contexts. This commit also adds the flow label
attribute in iptables, for further usage.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When turning on `debug zebra packet detail` or `debug zebra packet recv detail`
only display the detailed packet dump when `detail` is added.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
There are a bunch of places where the table id is not being outputed
in debug messages for routing changes. Add in the table id we
are operating on. This is especially useful for the case where
pbr is working.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
all network namespaces are read so as to collect interesting fdb and
neighbor tables for EVPN.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
this information is necessary for local information, because the
interface associated to the mac address is stored with its ifindex, and
the ifindex may not be enough to get to the right interface when it
comes with multiple network namespaces.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
when working with vrf netns backend, two bridges interfaces may have the
same bridge interface index, but not the same namespace. because in vrf
netns backend mode, a bridge slave always belong to the same network
namespace, then a check with the namespace id and the ns id of the
bridge interface permits to resolve correctly the interface pointer.
The problem could occur if a same index of two bridge interfaces can be
found on two different namespaces.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
when receiving a netlink API for an interface in a namespace, this
interface may come with LINK_NSID value, which means that the interface
has its link in an other namespace. Unfortunately, the link_nsid value
is self to that namespace, and there is a need to know what is its
associated nsid value from the default namespace point of view.
The information collected previously on each namespace, can then be
compared with that value to check if the link belongs to the default
namespace or not.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
to be able to retrieve the network namespace identifier for each
namespace, the ns id is stored in each ns context. For default
namespace, the netns id is the same as that value.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
as remind, the netns identifiers are local to a namespace. that is to
say that for instance, a vrf <vrfx> will have a netns id value in one
netns, and have an other netns id value in one other netns.
There is a need for zebra daemon to collect some cross information, like
the LINK_NETNSID information from interfaces having link layer in an
other network namespace. For that, it is needed to have a global
overview instead of a relative overview per namespace.
The first brick of this change is an API that sticks to netlink API,
that uses NETNSA_TARGET_NSID. from a given vrf vrfX, and a new vrf
created vrfY, the API returns the value of nsID from vrfX, inside the
new vrf vrfY.
The brick also gets the ns id value of default namespace in each other
namespace. An additional value in ns.h is offered, that permits to
retrieve the default namespace context.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
an incoming bridge index has been found, that is linked with vxlan
interface, and the search for that bridge interface is done. In
vrf-lite, the search is done across the same default namespace, because
bridge and vxlan may not be in the same vrf. But this behaviour is wrong
when using vrf netns backend, as the bridge and the vxlan have to be in
the same vrf ( hence in the same network namespace). To comply with
that, use the netnamespace of the vxlan interface. Like that, the
appropriate nsid is passed as parameter, and consequently, the search is
correct, and the mac address passed to BGP will be ok too.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
other network namespaces are parsed because bridge interface can be
bridged with vxlan interfaces with a link in the default vrf that hosts
l2vpn.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
With vrf-lite mechanisms, it is possible to create layer 3 vnis by
creating a bridge interface in default vr, by creating a vxlan interface
that is attached to that bridge interface, then by moving the vxlan
interface to the wished vrf.
With vrf-netns mechanism, it is slightly different since bridged
interfaces can not be separated in different network namespaces. To make
it work, the setup consists in :
- creating a vxlan interface on default vrf.
- move the vxlan interface to the wished vrf ( with an other netns)
- create a bridge interface in the wished vrf
- attach the vxlan interface to that bridged interface
from that point, if BGP is enabled to advertise vnis in default vrf,
then vxlan interfaces are discovered appropriately in other vrfs,
provided that the link interface still resides in the vrf where l2vpn is
advertised.
to import ipv4 entries from a separate vrf, into the l2vpn, the
configuration of vni in the dedicated vrf + the advertisement of ipv4
entries in bgp vrf will import the entries in the bgp l2vpn.
the modification consists in parsing the vxlan interfaces in all network
namespaces, where the link resides in the same network namespace as the
bgp core instance where bgp l2vpn is enabled.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
We can make the Linux kernel send an ARP/NDP request by adding
a neighbour with the 'NUD_INCOMPLETE' state and the 'NTF_USE' flag.
This commit adds new dataplane operation as well as new zapi message
to allow other daemons send ARP/NDP requests.
Signed-off-by: Jakub Urbańczyk <xthaid@gmail.com>
Reverting probing of neigh entry. There is a timing where
probe and remote macip add request comes at the same time resulting
in neigh to remain in local state event though it should be remote.
In mobility case, the host moves to remote VTEP, first MAC only type-2
route is received which triggers a PROBE of neighs (associated to MAC).
PROBE request can go via network port to remote VTEP.
PROBE request picks up local neigh with MAC entry's outgoing port is
remote VTEP tunnel port.
The PROBE reply and MAC-IP (containing IP) almost comes same time at
DUT.
DUT first processes remote macip and installs neigh as remote.
Followed by receives neigh as REACHABLE which marks neigh as LOCAL.
FRR does have BPF filter which does not allow its own netlink request
to receive. Otherwise frr's request to program neigh as remote can move
neigh from local to remote.
Though ordering can not be guranteed that REACHABLE (PROBE's repsonse)
can come at anytime and move it to LOCAL.
This fix would not suffice the needs of converging LOCAL inactive neighs
to remove from DB. As mobility draft sugges to PROBE local neigh when
MAC moves to remote but it is not working with current framework.
Ticket:CM-22864
This reverts commit 44bc8ae550
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
clone zebra_vxlan.c to create a file zebra_evpn.c for core
EVPN functions whilst retaining the history of zebra_vxlan.c
Signed-off-by: Pat Ruddy <pat@voltanet.io>
extract the neighbor uninstall part of
zebra_vxlan_handle_kernel_neigh_del into a new function
zebra_evpn_neigh_del_ip in zebra_evpn_neigh.c.
Signed-off-by: Pat Ruddy <pat@voltanet.io>
extract the neighbor uninstall part of process_remote_macip_add
into a new function zebra_evpn_neigh_remote_uninstall in
zebra_evpn_neigh.c.
Signed-off-by: Pat Ruddy <pat@voltanet.io>
extract the neighbor part of process_remote_macip_add into a new
function zebra_evpn_neigh_gw_macip_add in zebra_evpn_neigh.c.
Signed-off-by: Pat Ruddy <pat@voltanet.io>
extract the neighbor part of process_remote_macip_add into a new
function process_neigh_remote_macip_add in zebra_evpn_neigh.c.
Signed-off-by: Pat Ruddy <pat@voltanet.io>
clone zebra_vxlan.c to create a file zebra_evpn_neigh.c for neighbor
dB functions whilst retaining the history of zebra_vxlan.c
Signed-off-by: Pat Ruddy <pat@voltanet.io>
extract mac_gateway add code from zevi_gw_macip_add and move it to
a new generic function zebra_evpn_mac_gw_macip_add in zebra_evpn_mac.c
Signed-off-by: Pat Ruddy <pat@voltanet.io>
extract generic local mac add code from zebra_vxlan_local_mac_del
into a new function zebra_evpn_del_local_mac in zebra_evpn_mac.c
Signed-off-by: Pat Ruddy <pat@voltanet.io>
extract the local mac add code from zebra_vxlan_local_mac_add_update
and create a new generic local mac add function
zebra_evpn_add_update_local_mac
Signed-off-by: Pat Ruddy <pat@voltanet.io>
Move MAC add code from process_remote_macip_add in zebra_vxlan.c
to a generic function process_mac_remote_macip_add in
zebra_evpn_mac.c
Signed-off-by: Pat Ruddy <pat@voltanet.io>
clone zebra_vxlan.c to create a file zebra_evpn_mac.c for MAC dB
functions whilst retaining the history of zebra_vxlan.c
Signed-off-by: Pat Ruddy <pat@voltanet.io>
The main zebra_vni_t hash structure has been renamed to zebra_evpn_t
to allow for other transport underlays. Rename functions and variables
to reflect this change.
Signed-off-by: Pat Ruddy <pat@voltanet.io>
Configuration example:
ip route 9.9.9.9/32 6.6.6.6 color 123
The SR Policy to be chosen is uniquely identified by the policy
endpoint (6.6.6.6) and the SR-TE color (123). Traffic will be
augmented with an MPLS label stack according to the active
candidate path of that particular policy.
Co-authored-by: GalaxyGorilla <sascha@netdef.org>
Signed-off-by: Sebastien Merle <sebastien@netdef.org>
We were noticing registration time of the last nht time.
Let's just store the original time, although I am a bit
dubious about the usefulness of this.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
As part of PR 6758 vrf vni converted to transactional cli.
Handle a scenario where vrf is not created yet (inactive) and vni
is mapped to the inactive vrf.
Testing Done:
bharat(config-vrf)# do show vrf
vrf vrf1 id 11 table 1001
vrf vrf5 inactive (configured)
bharat(config)# vrf vrf5
bharat(config-vrf)# vni 5005
bharat(config-vrf)# do show vrf vni
VRF VNI VxLAN IF L3-SVI State Rmac
vrf5 5005 None None Down None
bharat(config-vrf)# no vni 5005
bharat(config-vrf)# do show vrf vni
VRF VNI VxLAN IF L3-SVI State Rmac
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
For the sake of Segment Routing (SR) and Traffic Engineering (TE)
Policies there's a need for additional infrastructure within zebra.
The infrastructure in this PR is supposed to manage such policies
in terms of installing binding SIDs and LSPs. Also it is capable of
managing MPLS labels using the label manager, keeping track of
nexthops (for resolving labels) and notifying interested parties about
changes of a policy/LSP state. Further it enables a route map mechanism
for BGP and SR-TE colors such that learned BGP routes can be mapped
onto SR-TE Policies.
This PR does not introduce any usable features by now, it is just
infrastructure for other upcoming PRs which will introduce 'pathd',
a new SR-TE daemon.
Co-authored-by: Renato Westphal <renato@opensourcerouting.org>
Co-authored-by: GalaxyGorilla <sascha@netdef.org>
Signed-off-by: Sebastien Merle <sebastien@netdef.org>
For allocating a new label range the label manager will loop
the existing label chunks and compare the start and end labels
with the label range in question. In case a label range should
be re-allocated to the existing label chunk, the end label
of the chunk is not honored correctly, e.g. the new label
range has to be a true subset of the existing label chunk.
This is very easy reproducable by re-allocating a single label.
e.g. a label range of size 1.
This problem is fixed by allowing the mentioned 'end' labels to
be equal.
Signed-off-by: GalaxyGorilla <sascha@netdef.org>
It is causing build failures because of conflicts with netinet.
Instead I have re-defined the MAC-SYNC UAPIs in the re_netlink.c
This is clearly a hack that needs to be re-visited.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
MAC-IP routes are used for syncing local entries across redundant
switches in an EVPN-MH setup. A path from a peer that has a local
ES as destination is tagged as a SYNC path. The SYNC path results in the
addition of local MAC and/or local neigh entry in zebra and in the
dataplane.
Implementation overview
=======================
1. Three new flags "local-inactive", "peer-active" and "peer-proxy"
are maintained per-local-MAC and per-local-Neigh entry.
2. The "peer-XXX" flags are set and cleared via SYNC path updates
from BGP. Proxy sync paths result in the setting of "peer-proxy" flag
(and non-proxies result in the "peer-active").
3. A neigh entry that has a "peer-XXX" flag set is programmed as
"static" in the dataplane.
4. A MAC entry that has a "peer-XXX" flag set or is referenced by
a sync-neigh entry (that has a "peer-XXX" flags set) is programmed
as "static" in the dataplane.
5. The sync-seq number is used to normalize the MM seq number across
all the redundant switches i.e. the max MM seq number across all
switches is used by each of the switches. This commit also includes
the changes needed for extended MM seq syncing.
6. A MAC/neigh entry has to be local-active or peer-active to sent to
BGP. An entry that is NOT local-active is sent with the proxy flag (so
BGP can "proxy" advertise it).
7. The "peer-active" flag is aged out by zebra by using a hold_timer
(this is instead of being abruptly dropped on SYNC path delete). This
age-out is needed to handle peer-switch restart (procedures are specified
in draft-rbickhart-evpn-ip-mac-proxy-adv). The holdtime needs to be
sufficiently long to allow an external neighmgr daemon or the dataplane
component to independently probe and establish local reachability of a
host. The MAC and neigh hold time values are configurable.
PS: In the future this probing may happen in FRR itself.
CLI changes to display sync info
================================
MAC
===
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
root@torm-11:mgmt:~# net show evpn mac vni 1000
Number of MACs (local and remote) known for this VNI: 6
Flags: N=sync-neighs, I=local-inactive, P=peer-active, X=peer-proxy
MAC Type Flags Intf/Remote ES/VTEP VLAN Seq #'s
00:02:00:00:00:25 local vlan1000 1000 0/0
02:02:00:00:00:02 local PI hostbond1 1000 0/0
02:02:00:00:00:06 remote 03:00:00:00:00:02:11:00:00:01 0/0
02:02:00:00:00:01 local X hostbond1 1000 0/0
00:00:00:00:00:11 local PI hostbond1 1000 0/0
02:02:00:00:00:05 remote 03:00:00:00:00:02:11:00:00:01 0/0
root@torm-11:mgmt:~#
root@torm-11:mgmt:~# net show evpn mac vni 1000 mac 00:00:00:00:00:11
MAC: 00:00:00:00:00:11
ESI: 03:00:00:00:00:01:11:00:00:01
Intf: hostbond1(58) VLAN: 1000
Sync-info: neigh#: 0 local-inactive peer-active >>>>>>>>>>>>
Local Seq: 0 Remote Seq: 0
Neighbors:
No Neighbors
root@torm-11:mgmt:~#
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
neigh
=====
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
root@torm-11:mgmt:~# net show evpn arp vni 1003
Number of ARPs (local and remote) known for this VNI: 4
Flags: I=local-inactive, P=peer-active, X=peer-proxy
Neighbor Type Flags State MAC Remote ES/VTEP Seq #'s
2001:fee1:0:3::6 local active 00:02:00:00:00:25 0/0
45.0.3.66 local P active 00:02:00:00:00:66 0/0
45.0.3.6 local active 00:02:00:00:00:25 0/0
fe80::202:ff:fe00:25 local active 00:02:00:00:00:25 0/0
root@torm-11:mgmt:~#
root@torm-11:mgmt:~# net show evpn arp vni 1003 ip 45.0.3.66
IP: 45.0.3.66
Type: local
State: active
MAC: 00:02:00:00:00:66
Sync-info: peer-active >>>>>>>>>>>>>>>>
Local Seq: 0 Remote Seq: 0
root@torm-11:mgmt:~#
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
1. ES sample display
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
torm-11# show evpn es
Type: L local, R remote
ESI Type ES-IF VTEPs
00:00:00:00:00:00:00:00:00:00 -
03:00:00:00:00:01:11:00:00:01 LR hostbond1 27.0.0.16
03:00:00:00:00:01:22:00:00:02 LR hostbond2 27.0.0.16
03:00:00:00:00:01:22:00:00:03 LR hostbond3 27.0.0.16
03:00:00:00:00:02:11:00:00:01 R - 27.0.0.17,27.0.0.18
03:00:00:00:00:02:22:00:00:02 R - 27.0.0.17,27.0.0.18
03:00:00:00:00:02:22:00:00:03 R - 27.0.0.17,27.0.0.18
torm-11#
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2. ES-EVI sample display
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
torm-11# show evpn es-evi
Type: L local, R remote
VNI ESI Type
1005 03:00:00:00:00:01:11:00:00:01 L
1005 03:00:00:00:00:01:22:00:00:02 L
1005 03:00:00:00:00:01:22:00:00:03 L
1002 03:00:00:00:00:01:11:00:00:01 L
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
The linux kernel sends the VLAN list per-access port as bitmap. This
needs to be translated into a per-ES VNI list for generation of
EAD-EVI routes.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
1. Local ethernet segments are configured in zebra by attaching a
local-es-id and sys-mac to a access interface -
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
!
interface hostbond1
evpn mh es-id 1
evpn mh es-sys-mac 00:00:00:00:01:11
!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
This info is then sent to BGP and used for the generation of EAD-per-ES
routes.
2. Access VLANs associated with an (ES) access port are translated into
ES-EVI objects and sent to BGP. This is used by BGP for the
generation of EAD-EVI routes.
3. Remote ESs are imported by BGP and sent to zebra. A list of VTEPs
is maintained per-remote ES in zebra. This list is used for the creation
of the L2-NHG that is used for forwarding traffic.
4. MAC entries with a non-zero ESI destination use the L2-NHG associated
with the ESI for forwarding traffic over the VxLAN overlay.
Please see zebra_evpn_mh.h for the datastruct organization details.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Multihoming support requires a new dataplane feature, MAC-ECMP, to
bridge traffic to remote ESs that are attached to more than one
active VTEP.
As a part of this support indirection has also been added via
L2-NHGs. Using a nexthop group allows for fast failover
of MAC entries when an access port attached to a remote-ES goes
down i.e. instead of updating many MAC entries this becomes a
single NHG update to the dataplane.
Note: Some of the code here needs to be reworked to the new
dataplane model.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Revert "zebra: support for macvlan interfaces"
This reverts commit bf69e212fd.
Revert "doc: add some documentation about bgp evpn netns support"
This reverts commit 89b97c33d7.
Revert "zebra: dynamically detect vxlan link interfaces in other netns"
This reverts commit de0ebb2540.
Revert "bgpd: sanity check when updating nexthop from bgp to zebra"
This reverts commit ee9633ed87.
Revert "lib, zebra: reuse and adapt ns_list walk functionality"
This reverts commit c4d466c830.
Revert "zebra: local mac entries populated in correct netnamespace"
This reverts commit 4042454891.
Revert "zebra: when parsing local entry against dad, retrieve config"
This reverts commit 3acc394bc5.
Revert "bgpd: evpn nexthop can be changed by default"
This reverts commit a2342a2412.
Revert "zebra: zvni_map_to_vlan() adaptation for all namespaces"
This reverts commit db81d18647.
Revert "zebra: add ns_id attribute to mac structure"
This reverts commit 388d5b438e.
Revert "zebra: bridge layer2 information records ns_id where bridge is"
This reverts commit b5b453a2d6.
Revert "zebra, lib: new API to get absolute netns val from relative netns val"
This reverts commit b6ebab34f6.
Revert "zebra, lib: store relative default ns id in each namespace"
This reverts commit 9d3555e06c.
Revert "zebra, lib: add an internal API to get relative default nsid in other ns"
This reverts commit 97c9e7533b.
Revert "zebra: map vxlan interface to bridge interface with correct ns id"
This reverts commit 7c990878f2.
Revert "zebra: fdb and neighbor table are read for all zns"
This reverts commit f8ed2c5420.
Revert "zebra: zvni_map_to_svi() adaptation for other network namespaces"
This reverts commit 2a9dccb647.
Revert "zebra: display interface slave type"
This reverts commit fc3141393a.
Revert "zebra: zvni_from_svi() adaptation for other network namespaces"
This reverts commit 6fe516bd4b.
Revert "zebra: importation of bgp evpn rt5 from vni with other netns"
This reverts commit 28254125d0.
Revert "lib, zebra: update interface name at netlink creation"
This reverts commit 1f7a68a2ff.
Signed-off-by: Pat Ruddy <pat@voltanet.io>
Current behavior:
eva# show mem
2020/08/04 18:07:38 ZEBRA: Not Notifying Owner: 2 about prefix 3.3.3.3/32(254) 2 vrf: 0
Fix it to show:
2020/08/04 18:07:38 ZEBRA: Not Notifying Owner: connected about prefix 3.3.3.3/32(254) 2 vrf: 0
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Added a macro to validate the v4 mapped v6 address.
Modified bgp receive & send updates for v4 mapped v6 address as
nexthop and installing it as recursive nexthop in RIB.
Minor change in fpm while sending the routes for nexthop as
v4 mapped v6 address.
Signed-off-by: Kaushik <kaushik@niralnetworks.com>
DEFPY_YANG will allow the CLI to identify which commands are
YANG-modeled or not before executing them. This is going to be
useful for the upcoming configuration back-off timer work that
needs to commit pending configuration changes before executing a
command that isn't YANG-modeled.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
We were not getting any benefits from attempting to walk all tables at the
same time and it made debugging harder, so lets execute one table walk
per time.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Zebra runs on a different thread than FPM, so we need to synchronize
them by using events. While here, implement completion detection for all
kinds of walk.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Two important fixes:
* `stream_read_try` does a dirty trick and converts the `-1` return to
`-2` when errno is `EAGAIN`, `EWOULDBLOCK` or `EINTR`.
* Don't enable reads until the connection is complete.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Add a simple validation function for zapi_labels messages; it
checks for and validates backup nexthop indexes currently.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
In networking restart event, l3vni (vxlan) interface followed by
associated vrf interfaces go down/deleted.
L3vni (oper) down event (from zebra to bgp) triggers to
clean up/un-import evpn routes (one-by-one) from the vrf table,
zebra internally removes the route entry from nexthop and RMAC hash.
When all the routes references in nexthop and RMAC db removed,
both (nexthop/rmac) are suppose to be uninstalled from the
bridge fdb and neigh table.
While evpn routes removal in progress, a vrf disable event removes
l3vni to its vrf association.
Subsequent bgp to evpn routes removal does not clean up thus evpn routes
reference to nexthop and RMAC remains in zebra hash.
bridge fdb and neigh tables are flushed out since networking restart brings down
all interfaces which results in flush of fdb and neigh tables.
By product is the zebra does not install nexthop and rmac when routes are re-imported
into vrf in VNI/VRF up event.
The fix is in vrf disable event to flush all l3vni's nexthop and rmac db.
Ticket:CM-30338
Reviewed By:CCR-10489
Testing Done:
Performed multiple networking restart and checked neigh and
bridge fdb tables for respective nexthop and router mac entry
programmed.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Include any installed backup nexthops when installing
pseudowires; include installed backups in vty and json
pw show output.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Improve vty output for routes and lsps with backups, including
json. Simplify or correct some code that uses both primary and
backup nexthops in dplane, nht.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Serialize the `fpm_reconnect` function by only allowing one part of our
code to call it, then make sure all zebra threads executions are done
before attempting to close and reset the output stream.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Prevent string manipulation where we might have data
passed into that is larger than the buffer we are pushing into.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add vrf as key in Rib operational nexthop list
PR 6296 has added vrf as key in nexthop list.
Rib operational model uses nexthop list, adding
vrf key into northbound callback.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Add a route_entry flag to indicate the presence of a fib
(installed) list of nexthops - more explicit and clearer.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Initial changes to support a nexthop with multiple backups. Lib
changes to hold a small array in each primary, zapi message
changes to support sending multiple backups, and daemon
changes to show commands to support multiple backups. The config
input for multiple backup indices is not present here.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
* add a vrf sub-command `[no] ipv6 router-id X:X::X:X`.
* add command `[no] ipv6 router-id X:X::X:X [vrf NAME]` for backward
compatibility.
* add a vrf sub-command `[no] ip router-id A.B.C.D` and make the old
one without `ip` an alias for it.
* add a command `[no] ip router-id A.B.C.D [vrf NAME]` for backward
comptibility and make the old one without `ip` an alias for it.
* add command `show ip router-id [vrf NAME]` and make
the old one without `ip` an alias for it.
* add command `show ipv6 router-id [vrf NAME]`.
* add ZAPI commands `ZEBRA_ROUTER_ID_V6_ADD`,
`ZEBRA_ROUTER_ID_V6_DELETE` and `ZEBRA_ROUTER_ID_V6_UPDATE`
for deamons to get notified of the IPv6 router-id.
* update zebra documentation.
Signed-off-by: Sebastien Merle <sebastien@netdef.org>
This commit avoids freeing the iptable context, once created. the case
where there is an error when reading zapi stream simply needs to free
the zpi context.
Fixes: ("8b5c4dce07e6 zebra: fix iptable memleak, fix free funcs")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
We do not need to know anything about rules in afi 128/129
at this point in time. Just note it with a zebra kernel
debug and move on. This is not something that a operator
can do anything with and at this point in time FRR
does not care.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Include any installed backups when updating the local kernel
after processing an async notification. This includes routes'
nexthops and LSPs' nhlfes.
Add the 'b' character to the route show display and header to
indicate backup nexthops.
Signed-off-by: Mark Stapp <mjs@voltanet.io>