Add enums for set/unset of prodown state to handle the mainthread
knowing an update is already queued without actually marking it
as complete.
This is to make the logic confirm a bit more with other parts of the code
where we queue dplane updates and not update our internal structs until
success callback is received.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Add support for setting the protodown reason code.
829eb208e8
These patches handle all our netlink code for setting the reason.
For protodown reason we only set `frr` as the reason externally
but internally we have more descriptive reasoning available via
`show interface IFNAME`. The kernel only provides a bitwidth of 32
that all userspace programs have to share so this makes the most sense.
Since this is new functionality, it needs to be added to the dplane
pthread instead. So these patches, also move the protodown setting we
were doing before into the dplane pthread. For this, we abstract it a
bit more to make it a general interface LINK update dplane API. This
API can be expanded to support gernal link creation/updating when/if
someone ever adds that code.
We also move a more common entrypoint for evpn-mh and from zapi clients
like vrrpd. They both call common code now to set our internal flags
for protodown and protodown reason.
Also add debugging code for dumping netlink packets with
protodown/protodown_reason.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
When an interface goes down, it signals any related NHGs to
re-validate themselves. During zebra shutdown, ensure we remove
any NHGs we've installed.
Signed-off-by: Mark Stapp <mstapp@nvidia.com>
With recent changes to interface up mechanics in if_netlink.c
FRR was receiving as many as 4 up events for an interface
on ifdown/ifup events. This was causing timing issues
in FRR based upon some fun timings. Remove this from
happening.
Ticket: CM-31623
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
There exists some interface types that are slow on startup
to fully register their link speed. Especially those that
are working with an asic backend. The speed_update timer
associated with each interface would keep trying if the
system returned a MAX_UINT32 as the speed. This speed
means both unknown or there is none under linux.
Since some interface types are slow on startup let's modify
FRR to try for at most 4 minutes and give up trying on those
interfaces where we never get any useful data.
Why 4 minutes? I wanted to balance the time associated with
slow interfaces coming up with those that will never give us
a value. So I choose 4 minutes as a good ballpark of time
to keep trying
Why not track all those interfaces and just not attempt to
do the speed lookup? I would prefer to not keep track of these
as that I do not know all the interface types, nor do I wish
to keep programming as new ones come in.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Use the dataplane to query and read interface NETCONF data;
add netconf-oriented data to the dplane context object, and
add accessors for it. Add handler for incoming update
processing.
Signed-off-by: Mark Stapp <mstapp@nvidia.com>
RA packets are pretty chatty and when there is a warning from
a missconfiguration on the network, the log file gets filed
up with warnings. Modify the code in rtadv.c to only spit
out the warning in these cases at most every 6 hours.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
VRF name should not be printed in the config since 574445ec. The update
was done for NB config output but I missed it for regular vty output.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Add a thread_ignore_late_timer(struct thread *thread) function
that allows thread.c to ignore when timers are late to the party.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Since f60a1188 we store a pointer to the VRF in the interface structure.
There's no need anymore to store a separate vrf_id field.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
It allows FRR to read the interface config even when the necessary VRFs
are not yet created and interfaces are in "wrong" VRFs. Currently, such
config is rejected.
For VRF-lite backend, we don't care at all about the VRF of the inactive
interface. When the interface is created in the OS and becomes active,
we always use its actual VRF instead of the configured one. So there's
no need to reject the config.
For netns backend, we may have multiple interfaces with the same name in
different VRFs. So we care about the VRF of inactive interfaces. And we
must allow to preconfigure the interface in a VRF even before it is
moved to the corresponding netns. From now on, we allow to create
multiple configs for the same interface name in different VRFs and
the necessary config is applied once the OS interface is moved to the
corresponding netns.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
With netns VRF backend, we may have multiple interfaces with the same
name. Currently, the function output is not deterministic in this case,
it returns the first interface that it finds in the list. Be more
explicit and tell the user that we need the VRF name.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Move the handler for incoming interface address events
to a neutral source file - it's not netlink-specific and
shouldn't have been in a netlink file.
Signed-off-by: Mark Stapp <mjs.ietf@gmail.com>
Read incoming interface address change notifications in the
dplane pthread; enqueue the events to the main pthread
for processing. This is netlink-only for now - the bsd
kernel socket path remains unchanged.
Signed-off-by: Mark Stapp <mjs.ietf@gmail.com>
There is a possibility that the same line can be matched as a command in
some node and its parent node. In this case, when reading the config,
this line is always executed as a command of the child node.
For example, with the following config:
```
router ospf
network 193.168.0.0/16 area 0
!
mpls ldp
discovery hello interval 111
!
```
Line `mpls ldp` is processed as command `mpls ldp-sync` inside the
`router ospf` node. This leads to a complete loss of `mpls ldp` node
configuration.
To eliminate this issue and all possible similar issues, let's print an
explicit "exit" at the end of every node config.
This commit also changes indentation for a couple of existing exit
commands so that all existing commands are on the same level as their
corresponding node-entering commands.
Fixes#9206.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
The only difference in daemons' interface node definition is the config
write function. No need to define the node in every daemon, just pass
the callback as an argument to a library function and define the node
there.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Initially the reading of the speed of an interface happened
upon interface creation and happened until the speed of a link
settled down to a single value. The speed of an interface
can also change as that a new optic can be inserted that
changes the speed, in which case FRR would see a interface
down (optic removal) and then a interface up (optic insertion).
In this case FRR would not treat this as an event that changed
the speed. Let's expand the checking a bit more.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
zebra is able to get information about gre tunnels.
zebra_gre file is created to handle hooks, but is not yet used.
also, debug zebra gre command is done to add gre traces.
A zebra_gre file is used for complementary actions that may be needed.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
when zebra has vrf backend mapped to namespaces, the polling
of interfaces leads to fix all linkages of interfaces. This
was not done on non default namespace. do it for other namespaces.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
a zebra api is extended to offer ability to add or remove neighbor
entry from daemon. Also this extension makes possible to add neigh
entry, not only between IPs and macs, but also between IPs and NBMA IPs.
This API supports configuring ipv6/ipv4 entries with ipv4/ipv6 lladdr.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This one also needed a bit of shuffling around, but MTYPE_RE is the only
one left used across file boundaries now.
Signed-off-by: David Lamparter <equinox@diac24.net>
Back when I put this together in 2015, ISO C11 was still reasonably new
and we couldn't require it just yet. Without ISO C11, there is no
"good" way (only bad hacks) to require a semicolon after a macro that
ends with a function definition. And if you added one anyway, you'd get
"spurious semicolon" warnings on some compilers...
With C11, `_Static_assert()` at the end of a macro will make it so that
the semicolon is properly required, consumed, and not warned about.
Consistently requiring semicolons after "file-level" macros matches
Linux kernel coding style and helps some editors against mis-syntax'ing
these macros.
Signed-off-by: David Lamparter <equinox@diac24.net>
When an ES-bond comes out of bypass FRR needs to flush the local MACs learnt
while the bond was in bypass. To do that efficiently local MACs are linked
to the dest-access port. This only happens if the access-port is in
LACP-bypass or if it is non-ES.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Feature overview:
=================
A 802.3ad bond can be setup to allow lacp-bypass. This is done to enable
servers to pxe boot without a LACP license i.e. allows the bond to go oper
up (with a single link) without LACP converging.
If an ES-bond is oper-up in an "LACP-bypass" state MH treats it as a non-ES
bond. This involves the following special handling -
1. If the bond is in a bypass-state the associated ES is placed in a
bypass state.
2. If an ES is in a bypass state -
a. DF election is disabled (i.e. assumed DF)
b. SPH filter is not installed.
3. MACs learnt via the host bond are advertised with a zero ESI.
When the ES moves out of "bypass" the MACs are moved from a zero-ESI to
the correct non-zero id. This is treated as a local station move.
Implementation:
===============
When (a) an ES is detached from a hostbond or (b) an ES-bond goes into
LACP bypass zebra deletes all the local macs (with that ES as destination)
in the kernel and its local db. BGP re-sends any imported MAC-IP routes
that may exist with this ES destination as remote routes i.e. zebra can
end up programming a MAC that was perviously local as remote pointing
to a VTEP-ECMP group.
When an ES is attached to a hostbond or an ES-bond goes
LACP-up (out of bypss) zebra again deletes all the local macs in the
kernel and its local db. At this point BGP resends any imported MAC-IP
routes that may exist with this ES destination as sync routes i.e.
zebra can end up programming a MAC that was perviously remote
as local pointing to an access port.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Added support for advertising SVI MAC if EVPN-MH is enabled.
In the case of EVPN MH arp replies from an attached server can be sent to
the ES-peer. To prevent flooding of the reply the SVI MAC needs to be
advertised by default.
Note:
advertise-svi-ip could have been used as an alternate way to advertise
SVI MAC. However that config cannot be turned on if SVI IPs are
re-used (which is done to avoid wasting IP addresses in a subnet).
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Neither tabs nor newlines are acceptable in syslog messages. They also
break line-based parsing of file logs.
Signed-off-by: David Lamparter <equinox@diac24.net>
protodown state is a combination of the dplane and zebra states.
protodown reason is maintained exclusively by zebra. Display this
information on two separate lines to make that ownership clearer.
Also display n/a for bonds as the dplane doesn't support protodowning
the bond device.
Sample output -
==============
root@torm-11:mgmt:~# vtysh -c "show interface hostbond1"|grep -i protodown
protodown: off (n/a)
protodown reasons: (uplinks-down)
root@torm-11:mgmt:~# vtysh -c "show interface swp5"|grep -i protodown
protodown: on
protodown reasons: (uplinks-down)
root@torm-11:mgmt:~#
PS: Cosmetic changes only, no functional change.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Local ethernet segments are held in a protodown or error-disabled state
if access to the VxLAN overlay is not ready -
1. When FRR comes up the local-ESs/access-port are kept protodown
for the startup-delay duration. During this time the underlay and
EVPN routes via it are expected to converge.
2. When all the uplinks/core-links attached to the underlay go down
the access-ports are similarly protodowned.
The ES-bond protodown state is propagated to each ES-bond member
and programmed in the dataplane/kernel (per-bond-member).
Configuring uplinks -
vtysh -c "conf t" vtysh -c "interface swp4" vtysh -c "evpn mh uplink"
Configuring startup delay -
vtysh -c "conf t" vtysh -c "evpn mh startup-delay 100"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
EVPN protodown display -
========================
root@torm-11:mgmt:~# vtysh -c "show evpn"
L2 VNIs: 10
L3 VNIs: 3
Advertise gateway mac-ip: No
Advertise svi mac-ip: No
Duplicate address detection: Disable
Detection max-moves 5, time 180
EVPN MH:
mac-holdtime: 60s, neigh-holdtime: 60s
startup-delay: 180s, start-delay-timer: 00:01:14 <<<<<<<<<<<<
uplink-cfg-cnt: 4, uplink-active-cnt: 4
protodown: startup-delay <<<<<<<<<<<<<<<<<<<<<<<
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
ES-bond protodown display -
===========================
root@torm-11:mgmt:~# vtysh -c "show interface hostbond1"
Interface hostbond1 is up, line protocol is down
Link ups: 0 last: (never)
Link downs: 1 last: 2020/04/26 20:38:03.53
PTM status: disabled
vrf: default
OS Description: Local Node/s torm-11 and Ports swp5 <==> Remote Node/s hostd-11 and Ports swp1
index 58 metric 0 mtu 9152 speed 4294967295
flags: <UP,BROADCAST,MULTICAST>
Type: Ethernet
HWaddr: 00:02:00:00:00:35
Interface Type bond
Master interface: bridge
EVPN-MH: ES id 1 ES sysmac 00:00:00:00:01:11
protodown: off rc: startup-delay <<<<<<<<<<<<<<<<<
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
ES-bond member protodown display -
==================================
root@torm-11:mgmt:~# vtysh -c "show interface swp5"
Interface swp5 is up, line protocol is down
Link ups: 0 last: (never)
Link downs: 3 last: 2020/04/26 20:38:03.52
PTM status: disabled
vrf: default
index 7 metric 0 mtu 9152 speed 10000
flags: <UP,BROADCAST,MULTICAST>
Type: Ethernet
HWaddr: 00:02:00:00:00:35
Interface Type Other
Master interface: hostbond1
protodown: on rc: startup-delay <<<<<<<<<<<<<<<<
root@torm-11:mgmt:~#
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>