In symmetric routing, when local ESI is down,
the MH peer learnt local mac-ip
prefix is installed into teannt vrf (given l3vni).
When ESI is back up and associated to evi/vni then
remove the local synced mac-ip imported routes from the
tenant vrf as local neigh/arp is present.
Ticket: #3878699
Testing:
peer advertised mac-ip route:
*> [2]:[0]:[48]:[aa:aa:aa:00:00:01]:[32]:[45.0.0.51] RD 27.0.0.4:9
27.0.0.4 (spine-1)
0 64435 65016 i
ESI:03:44:38:39:ff:ff:01:00:00:01
RT:65016:1000 RT:65016:4000 ET:8 Rmac:44:38:39:ff:ff:16
When local ESI is flapped
torm-11:# ip neigh show 45.0.0.51
45.0.0.51 dev vlan1000 lladdr aa:aa:aa:00:00:01 REACHABLE proto zebra
Before fix:
(The imported route remained in tenant-vrf)
torm-11:# ip route show vrf vrf1 45.0.0.51
45.0.0.51 nhid 257 proto bgp metric 20
After fix:
torm-11# ip route show vrf vrf1 45.0.0.51
torm-11#
trace:
2024/10/11 18:19:29 BGP: [JMP3T-178G8] route [2]:[0]:[48]:[00:02:00:00:00:08]:[32]:[21.1.0.5]
is matched on local esi 03:00:00:00:77:01:04:00:00:0e, uninstall from VRF tenant1 route table
Signed-off-by: Chirag Shah <chirag@nvidia.com>
This is handy when you need to do source matching e.g. `match src-peer ...`
on outgoing direction with a route-map.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
The type of the val field in ecommunity_val is used inconsistently
in a number of places. It should be defined as uint8_t.
Signed-off-by: Enke Chen <enchen@paloaltonetworks.com>
When a prefix in a vrf is imported into evpn as a type5,
copy the asn of the source to make sure it is reflected
in the target vrf.
Ticket: cumuluslinux-2554562
Signed-off-by: Don Slice <dslice@nvidia.com>
Fix static-analyser warnings with BGP labels:
> $ scan-build make -j12
> bgpd/bgp_updgrp_packet.c:819:10: warning: Access to field 'extra' results in a dereference of a null pointer (loaded from variable 'path') [core.NullDereference]
> ? &path->extra->labels->label[0]
> ^~~~~~~~~
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
When parsing EVPN NLRIs, and an error occurred, do no forget to free the memory.
Fixes: 4ace11d010 ("bgpd: Move evpn_overlay to a pointer")
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Coverity complains there is a use after free (1598495 and 1598496)
At this point, most likely dest->refcount cannot go 1 and free up
the dest, but there might be some code path where this can happen.
Fixing this with a simple order change (no harm fix).
Ticket :#4001204
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
The return value of evpn_route_select_install is ignored in all cases
except during vni route table install/uninstall and based on the
returned value, an error is logged. Fixing this.
Ticket :#3992392
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
In case of imported routes (L3vni/vrf leaks), when a bgp instance is
being deleted, the peer->bgp comparision with the incoming bgp to remove
the dest from the pending fifo is wrong. This can lead to the fifo
having stale entries resulting in crash.
Two changes are done here.
- Instead of pop/push items in list if the struct bgp doesnt match,
simply iterate the list and remove the expected ones.
- Corrected the way bgp is fetched from dest rather than relying on
path_info->peer so that it works for all kinds of routes.
Ticket :#3980988
Signed-off-by: Chirag Shah <chirag @nvidia.com>
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
Instead of using 3 uint8_t variables under struct attr, let's use a single
uint8_t as the flags. Saving 2-bytes. Not a big deal, but it's even easier to
track EVPN-related flags/variables.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Fix for a bug, where FRR fails to install route received for an unknown but later-created VRF - detailed description can be found here https://github.com/FRRouting/frr/issues/13708
Signed-off-by: Piotr Suchy <psuchy@akamai.com>
Problem statement:
==================
When a vrf is deleted from the kernel, before its removed from the FRR
config, zebra gets to delete the the vrf and assiciated state.
It does so by sending a request to delete the l3 vni associated with the
vrf followed by a request to delete the vrf itself.
2023/10/06 06:22:18 ZEBRA: [JAESH-BABB8] Send L3_VNI_DEL 1001 VRF
testVRF1001 to bgp
2023/10/06 06:22:18 ZEBRA: [XC3P3-1DG4D] MESSAGE: ZEBRA_VRF_DELETE
testVRF1001
The zebra client communication is asynchronous and about 1/5 cases the
bgp client process them in a different order.
2023/10/06 06:22:18 BGP: [VP18N-HB5R6] VRF testVRF1001(766) is to be
deleted.
2023/10/06 06:22:18 BGP: [RH4KQ-X3CYT] VRF testVRF1001(766) is to be
disabled.
2023/10/06 06:22:18 BGP: [X8ZE0-9TS5H] VRF disable testVRF1001 id 766
2023/10/06 06:22:18 BGP: [X67AQ-923PR] Deregistering VRF 766
2023/10/06 06:22:18 BGP: [K52W0-YZ4T8] VRF Deletion:
testVRF1001(4294967295)
.. and a bit later :
2023/10/06 06:22:18 BGP: [MRXGD-9MHNX] DJERNAES: process L3VNI 1001 DEL
2023/10/06 06:22:18 BGP: [NCEPE-BKB1G][EC 33554467] Cannot process L3VNI
1001 Del - Could not find BGP instance
When the bgp vrf config is removed later it fails on the sanity check if
l3vni is removed.
if (bgp->l3vni) {
vty_out(vty, "%% Please unconfigure l3vni %u\n",
bgp->l3vni);
return CMD_WARNING_CONFIG_FAILED;
}
Solution:
=========
The solution is to make bgp cleanup the l3vni a bgp instance is going
down.
The fix:
========
The fix is to add a function in bgp_evpn.c to be responsible for for
deleting the local vni, if it should be needed, and call the function
from bgp_instance_down().
Testing:
========
Created a test, which can run in container lab that remove the vrf on
the host before removing the vrf and the bgp config form frr. Running
this test in a loop trigger the problem 18 times of 100 runs. After the
fix it did not fail.
To verify the fix a log message (which is not in the code any longer)
were used when we had a stale l3vni and needed to call
bgp_evpn_local_l3vni_del() to do the cleanup. This were hit 20 times in
100 test runs.
Signed-off-by: Kacper Kwasny <kkwasny@akamai.com>
bgpd: braces {} are not necessary for single line block
Signed-off-by: Kacper Kwasny <kkwasny@akamai.com>
In case when bgp_evpn_free or bgp_delete is called and the announce_list
has few items where vpn/bgp does not match, we add the item back to the
list. Because of this the list count is always > 0 thereby hogging CPU or
infinite loop.
Ticket: #3905624
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
BGP receives notification from zebra about an vpn that
needs to be installed into the evpn tables. Unfortunately
this function was walking the entirety of evpn tables
3 times. Modify the code to walk the tree 1 time and
to just look for the needed route types as you go.
This reduces, in a scaled environment, processing
time of the zclient_read function from 130 seconds
to 95 seconds. For a up / down / up interface
scenario.
Signed-off-by: Rajasekar Raja <rajasekarr@vndia.com>
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Current changes deals with EVPN routes installation to zebra.
In evpn_route_select_install() we invoke evpn_zebra_install/uninstall
which sends zclient_send_message().
This is a continuation of code changes (similar to
ccfe452763) but to handle evpn part
of the code.
Ticket: #3390099
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
Currently bgp_path_info's are stored in reverse order
received. Sort them by the best path ordering.
This will allow for optimizations in the future on
how multipath is done.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Do not reap instead let's schedule for deletion
and let best_path_selection take care of the deletion
as it should.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Currently evpn code calls bgp_best_selection for local
decisions for local tables to figure out what to do.
This is also pi based so let's note that the pi has
been changed before calling bgp_best_selection.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This will allow a consistency of approach to adding/removing
pi's to from the workqueue for processing as well as properly
handling the dest->info pi list more appropriately.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Modify update_evpn_type5_route_entry to return a pointer to the
struct bgp_path_info modified in this function. This code
merely follows the standards used in other bgp_evpn.c code
where the update function returns the pointer to the path
info.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The MTYPE_BGP memory type was being over used as
both the handler for the bgp instance itself as
well as memory associated with name strings.
Let's separate out the two.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The dest pointer may be freed( but should not be
due to locking ). Let's ensure that this assumption
is true and make coverity happy.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
There exist two spots in this function where the dest could be
freed, but is not due to locking, but coverity thinks it might
so let's make the function happy.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
But never really does due to locking, but since it can
we need to treat it like it does and ensure that FRR
is not making a mistake, by using memory after it
has been freed.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This is based on @donaldsharp's work
The current code base is the struct bgp_node data structure.
The problem with this is that it creates a bunch of
extra data per route_node.
The table structure generates ‘holder’ nodes
that are never going to receive bgp routes,
and now the memory of those nodes is allocated
as if they are a full bgp_node.
After splitting up the bgp_node into bgp_dest and route_node,
the memory of ‘holder’ node which does not have any bgp data
will be allocated as the route_node, not the bgp_node,
and the memory usage is reduced.
The memory usage of BGP node will be reduced from 200B to 96B.
The total memory usage optimization of this part is ~16.00%.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Yuqing Zhao <xiaopanghu99@163.com>
As part of the conversion to a `struct peer_connection` it will
be desirable to have 2 pointers one for when we open a connection
and one for when we receive a connection. Start this actual
conversion over to this in `struct peer`. If this sounds confusing
take a look at the bgp state machine for connections and how
it resolves the processing of this router opening -vs- this
router receiving an open. At some point in time the state
machine decides that we are keeping one of the two connections.
Future commits will allow us to untangle the peer/doppelganger
duality with this abstraction.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>