EVPN MH ES redundant VTEPs need to install the
sync MAC with notify and inactive set, and generate the
ND:Proxy stamped extended community on the Type-2
route.
Ticket:#3436621
Issue:3436621
Testing Done:
tor-11 originates type-2 MAC route:
tor-11# bridge -d fdb show | grep 00:65:00:00:00:01
00:65:00:00:00:01 dev hostbond1 vlan 1000 notify master bridge static
tor-12 receives sync MAC route:
Before fix:
----------
tor-12:/# bridge -d fdb show | grep 00:65:00:00:00:01
00:65:00:00:00:01 dev hostbond1 vlan 1000 notify master bridge static
After fix: inactive is set on the MAC entry
----------
tor-12:/# bridge -d fdb show | grep 00:65:00:00:00:01
00:65:00:00:00:01 dev hostbond1 vlan 1000 notify inactive master bridge static
Note the `inactive` flag after `notify` on tor-12
with the fix.
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
Signed-off-by: Chirag Shah <chirag@nvidia.com>
Issue:
After vlan flap, zebra was not marking the selected/best route as installed.
As a result, when a static route was configured with a directly connected
(vlan) interface's IP as its nexthop, the static route was not being installed
in the kernel since its nexthop was unresolved. The nexthop was marked
unresolved because zebra failed to mark the best route as installed after
interface flap.
This was happening because, in dplane_route_update_internal(), if the old and
new context type and nexthop group id are the same, zebra doesn't send
a route replace request down to the kernel. But the installed
(ROUTE_ENTRY_INSTALLED) flag is set only when zebra receives a response from
the kernel. Since the request to the kernel was being skipped for the route
entry, the installed flag was not being set.
Fix:
In dplane_route_update_internal(), if the old and new context type and
nexthop group id are the same, then before returning, the installed flag is
set on the route-entry if it's not set already.
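
The decision described above can be modeled roughly as follows. This is a
simplified Python sketch, not the zebra C code: `RouteEntry`, `route_update`
and `send_to_kernel` are hypothetical names, and setting the flag on the new
entry is an assumption about which entry the message refers to.

```python
from dataclasses import dataclass


@dataclass
class RouteEntry:
    """Hypothetical stand-in for a zebra route entry."""
    rtype: str
    nhg_id: int
    installed: bool = False


def route_update(old: RouteEntry, new: RouteEntry, send_to_kernel) -> bool:
    """Return True if a replace request was sent to the kernel."""
    if old.rtype == new.rtype and old.nhg_id == new.nhg_id:
        # Nothing changes in the kernel, so the request is skipped.
        # No kernel response will arrive to set the flag, so set it here.
        if not new.installed:
            new.installed = True
        return False
    # In the normal path the flag is set later, when the kernel responds.
    send_to_kernel(new)
    return True
```

Before the fix, the first branch would have returned without touching
`installed`, which is what left the nexthop of the static route unresolved.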
Signed-off-by: Pooja Jagadeesh Doijode <pdoijode@nvidia.com>
"show evpn json" returns nothing when evpn is disabled.
The code now returns {} when evpn is disabled or no entry is
available.
Before Fix:-
```
cumulus@r2:mgmt:~$ sudo vtysh -c "show evpn json"
cumulus@r2:mgmt:~$
```
After Fix:-
```
cumulus@r1:mgmt:~$ sudo vtysh -c "show evpn json"
{
}
cumulus@r1:mgmt:~$
```
Ticket:#3417955
Issue:3417955
Testing: UT done
Signed-off-by: Chirag Shah <chirag@nvidia.com>
Signed-off-by: Sindhu Parvathi Gopinathan <sgopinathan@nvidia.com>
During shutdown, the main pthread stops the dplane pthread
before exiting. Don't try to clean up any events scheduled
to the dplane pthread at that point - just let the thread
exit and clean up.
Signed-off-by: Mark Stapp <mjs@labn.net>
Two things:
On shutdown, clean up any events associated with the update walker.
Also, do not allow new events to be created.
Fixes this mem-leak:
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790:Direct leak of 8 byte(s) in 1 object(s) allocated from:
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #0 0x7f0dd0b08037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #1 0x7f0dd06c19f9 in qcalloc lib/memory.c:105
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #2 0x55b42fb605bc in rib_update_ctx_init zebra/zebra_rib.c:4383
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #3 0x55b42fb6088f in rib_update zebra/zebra_rib.c:4421
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #4 0x55b42fa00344 in netlink_link_change zebra/if_netlink.c:2221
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #5 0x55b42fa24622 in netlink_information_fetch zebra/kernel_netlink.c:399
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #6 0x55b42fa28c02 in netlink_parse_info zebra/kernel_netlink.c:1183
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #7 0x55b42fa24951 in kernel_read zebra/kernel_netlink.c:493
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #8 0x7f0dd0797f0c in event_call lib/event.c:1995
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #9 0x7f0dd0684fd9 in frr_run lib/libfrr.c:1185
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #10 0x55b42fa30caa in main zebra/main.c:465
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790- #11 0x7f0dd01b5d09 in __libc_start_main ../csu/libc-start.c:308
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790-
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790-SUMMARY: AddressSanitizer: 8 byte(s) leaked in 1 allocation(s).
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
BGP signals to zebra that an afi has converged immediately
after it has finished processing all routes for a given
afi/safi. This generates events in zebra in this order:
a) Routes received from BGP, placed on early-rib Meta-Q
b) Signal GR for the afi.
Now imagine that zebra reads the GR code and immediately
processes routes that are in the actual rib and
removes some routes. This generates
c) route deletion to the kernel for some number of
routes that may be in the early-rib Meta-Q
d) Process the Meta-Q, and re-install the routes
This is undesirable behavior in zebra: while we
may end up in a correct state, there
will be a blip for some number of routes that
happen to be in the early-rib Meta-Q.
Modify the GR code to have its own processing
entry at the end of the Meta-Q. This will
allow all routes to be processed and ready
for handling by the Graceful Restart code.
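
The ordering argument can be illustrated with a toy model. This is a Python
sketch under assumptions: `MetaQueue` and the sub-queue names are
hypothetical, and the real Meta-Q has more sub-queues than shown. The only
property modeled is that the GR entry is serviced after everything else.

```python
from collections import deque

# Hypothetical sub-queue order; the only point is that "gr" comes last.
SUBQUEUES = ("early-rib", "routes", "gr")


class MetaQueue:
    def __init__(self):
        self.queues = {name: deque() for name in SUBQUEUES}

    def enqueue(self, name, item):
        self.queues[name].append(item)

    def drain(self):
        """Yield (subqueue, item) pairs, servicing earlier sub-queues first."""
        while True:
            for name in SUBQUEUES:
                if self.queues[name]:
                    yield name, self.queues[name].popleft()
                    break
            else:
                # Every sub-queue is empty.
                return
```

With the GR signal queued behind the routes, stale-route cleanup only runs
once the routes from the early-rib sub-queue have been processed, so there is
no delete followed by re-install.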
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
After the restructure of the gr code to allow zebra_gr
to have individual cleanups of afi, this is no longer necessary.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The GR code in FRR used to wait till all AFIs were complete
before cleaning up the routes from the upper level protocol.
This of course can lead to some weird situations where, say,
ipv4 finishes and then v6 is stuck waiting for a peer to come
up and never finishes. When v4 finishes it signals zebra that
it is done, but no action is taken at that moment.
Modify the code to allow the zebra_gr.c code to handle a per
afi removal, instead of doing it all at the end.
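
A rough model of the per-afi behavior, as a Python sketch only: `GrClient`
and `afi_complete` are hypothetical names and do not correspond to functions
in zebra_gr.c.

```python
class GrClient:
    """Hypothetical model of a restarting client's stale routes, keyed by afi."""

    def __init__(self, stale_routes):
        # e.g. {"ipv4": [...], "ipv6": [...]}
        self.stale = {afi: set(routes) for afi, routes in stale_routes.items()}

    def afi_complete(self, afi, refreshed):
        """Handle the signal that one afi has converged.

        Routes of that afi that were not refreshed are removed right away;
        other afis are left untouched until their own signal arrives.
        """
        removed = self.stale[afi] - set(refreshed)
        self.stale[afi] = set()
        return removed
```

In this model an ipv6 afi that never completes no longer holds up the cleanup
of ipv4 stale routes.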
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The zebra_gr code had 3 functions when effectively only
1 was needed. Clean up some code weirdness around
multiple switch statements for the same api->cap
as well as consolidating down to SAFI_UNICAST only,
since that is all we care about at the
moment.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
We have code that tracks both afis and safis,
but we only ever operate on the afis. So let's
limit the work being done to something more sensible.
I'm leaving the safi being broadcast through the zapi
message, as I am not sure what else should be ripped
out at this point in time.
Finally re-arrange the zread_client_capabilities function
to stop the multiple levels of function calling that really
serve no purpose.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
By the time this function is called we have already
ensured that the pointers are good several times.
I like consistency but this is a bit much.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When GR is running and cleaning up nodes, zebra suspends the
walk once it hits the node limit and saves the node to come
back to. If that saved node happens to be deleted while the
GR code is suspended, the zebra GR code will just completely
stop processing and potentially leave stale nodes around forever.
Let's just remove this hole and process what we can.
Can you imagine trying to debug this after the fact?
If we remove a node then that counts toward the maximum
to process of ZEBRA_MAX_STALE_ROUTE_COUNT. This should
prevent any non-processing with a slightly larger cost
of having to look at a few nodes repeatedly.
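
One way to get this behavior is sketched below in Python. It is an
illustration under assumptions, not the zebra implementation: the table is a
plain dict, `delete_stale_pass` is a hypothetical name, the limit value is
made up, and restarting the walk from the beginning on each pass is an
assumed strategy.

```python
MAX_STALE_PER_PASS = 3  # stand-in for ZEBRA_MAX_STALE_ROUTE_COUNT; value is illustrative


def delete_stale_pass(table, limit=MAX_STALE_PER_PASS):
    """Remove up to `limit` stale entries from `table` (prefix -> is_stale).

    No resume position is kept between passes, so a node that vanishes while
    the walk is suspended cannot strand the walk. Returns True if stale
    entries may remain and another pass should be scheduled.
    """
    removed = 0
    for prefix in list(table):
        if not table[prefix]:
            continue  # live node: revisited on each pass, which is the extra cost
        del table[prefix]
        removed += 1
        if removed >= limit:
            return True
    return False
```

Only removals count toward the limit here, so each pass makes progress even
if nodes were deleted by someone else in between.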
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The info->do_delete variable was being set to true only when
u.val was 1. The problem with this is that u.val is part of a union
and the various ways that we can call this event cause
different values to be written to the union on the thread.
This makes no sense. Just set the variable to what we want it to
be when we need it to be true, since it was only ever set during
a thread_execute section.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Effectively a massive search and replace of
`struct thread` to `struct event`. Using the
term `thread` gives people the impression that
this event system is a pthread when it is not.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This is the first in a series of commits whose goal is to rename
the thread system in FRR to an event system. There is a continual
problem where people are confusing `struct thread` with a true
pthread. In reality, our entire thread.c is an event system.
In this commit rename the thread.[ch] files to event.[ch].
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add a hash_clean_and_free() function as well as convert
the code to use it. This function also takes a double
pointer to the hash so that it can set it to NULL. Also it
cleanly does nothing if the pointer is NULL (as a bunch of
code tested for).
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Issue:
When a netns is deleted, since zebra doesn’t receive interface down/delete
notifications from kernel, it manually deletes the interface without removing
the association between zebra_l3vni and the interface that is being deleted
(i.e. it deletes the interface without setting “zl3vni->vxlan_if” to NULL).
Later, during the deletion of netns, when zl3vni_rmac_uninstall() is called to
uninstall the remote RMAC from the kernel, zebra ends up accessing the stale
“zl3vni->vxlan_if” pointer, which now points to freed memory.
This was causing a heap use-after-free.
Fix:
Before zebra starts deleting the interfaces on receiving a netns delete
notification, it now calls the appropriate functions to remove the association
between the evpn structs and the interface, and to set “zl3vni->vxlan_if” to
NULL. This ensures that when zl3vni_rmac_uninstall() is called during netns
deletion, it will bail because “zl3vni->vxlan_if” is NULL.
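
The ordering of the fix can be modeled as below. This is a Python sketch with
hypothetical names (`L3Vni`, `netns_delete`); Python has no dangling
pointers, so the model only shows the detach-before-delete order and the
early bail-out.

```python
class L3Vni:
    """Hypothetical stand-in for zebra_l3vni; only the back-reference matters."""

    def __init__(self, vni):
        self.vni = vni
        self.vxlan_if = None
        self.uninstalled = []

    def rmac_uninstall(self, rmac):
        if self.vxlan_if is None:
            return False  # no interface: bail instead of touching stale state
        self.uninstalled.append((self.vxlan_if, rmac))
        return True


def netns_delete(interfaces, l3vnis):
    """Detach EVPN state from each interface before the interface is removed."""
    for ifname in list(interfaces):
        for vni in l3vnis:
            if vni.vxlan_if == ifname:
                vni.vxlan_if = None
        interfaces.remove(ifname)
```

Clearing the back-reference first means any later uninstall attempt sees no
interface and returns early.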
Signed-off-by: Pooja Jagadeesh Doijode <pdoijode@nvidia.com>
The "show zebra mpls .. json" vty command may return empty information
in case the MPLS database is empty or a given label entry is not
available. In those cases, add the braces so that valid
json is returned.
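
The intended output contract can be sketched in a few lines of Python. The
helper name `show_json` is hypothetical and this is not the vty code; it only
shows why an empty object is preferable to empty output for a JSON consumer.

```python
import json


def show_json(entries):
    """Render entries as JSON, falling back to an empty object when there are none."""
    return json.dumps(entries if entries else {}, indent=2)
```

A caller can then always pass the output to a JSON parser, whereas an empty
string would make the parser raise an error.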
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The GR debug logs are doing all sorts of wonderful stuff
but they were not actually displaying anything useful to the operator
about what vrf we are operating in.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Create VRF and interfaces:
ip netns add vrf1
ip link add veth1 index 100 type veth
ip link add link veth1 veth1.200 type vlan id 200
ip link set veth1.200 netns vrf1
ip -n vrf1 link add veth2 index 100 type veth
After reloading zebra, "show interface veth1.200" shows wrong parent
interface:
test# show interface veth1.200
Interface veth1.200 is down
...
Parent interface: veth2
This is because veth1.200 and veth1 are in different netns, and veth2
happens to have the same ifindex as veth1, in the same netns as
veth1.200.
When looking for the parent, link-ifindex 100 should be looked up within
the link-netns, rather than that of the child interface.
Add link_nsid to zebra interface, so that the <link_nsid, link_ifindex>
pair can uniquely identify the link interface.
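
The lookup can be illustrated with the example above. This is a Python sketch
with made-up netns ids and a made-up ifindex for veth1.200; `find_parent` is
a hypothetical name, not a zebra function.

```python
# Interfaces keyed by (netns id, ifindex). The ids are made up for illustration.
INTERFACES = {
    (0, 100): "veth1",      # default netns
    (1, 100): "veth2",      # netns vrf1, same ifindex as veth1
    (1, 7): "veth1.200",    # child interface, moved into vrf1
}


def find_parent(link_nsid, link_ifindex, table=INTERFACES):
    """Resolve the parent by the <link_nsid, link_ifindex> pair."""
    return table.get((link_nsid, link_ifindex))
```

Looking up ifindex 100 in the child's own netns (id 1 here) yields veth2,
which is the wrong parent shown above; using the link netns (id 0) yields
veth1.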
Signed-off-by: Xiao Liang <shaw.leon@gmail.com>