Coverity complains there is a use after free (1598495 and 1598496)
At this point, most likely dest->refcount cannot go 1 and free up
the dest, but there might be some code path where this can happen.
Fixing this with a simple order change (no harm fix).
Ticket :#4001204
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
The return value of evpn_route_select_install is ignored in all cases
except during vni route table install/uninstall and based on the
returned value, an error is logged. Fixing this.
Ticket :#3992392
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
In case of imported routes (L3vni/vrf leaks), when a bgp instance is
being deleted, the peer->bgp comparision with the incoming bgp to remove
the dest from the pending fifo is wrong. This can lead to the fifo
having stale entries resulting in crash.
Two changes are done here.
- Instead of pop/push items in list if the struct bgp doesnt match,
simply iterate the list and remove the expected ones.
- Corrected the way bgp is fetched from dest rather than relying on
path_info->peer so that it works for all kinds of routes.
Ticket :#3980988
Signed-off-by: Chirag Shah <chirag @nvidia.com>
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
Instead of using 3 uint8_t variables under struct attr, let's use a single
uint8_t as the flags. Saving 2-bytes. Not a big deal, but it's even easier to
track EVPN-related flags/variables.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Fix for a bug, where FRR fails to install route received for an unknown but later-created VRF - detailed description can be found here https://github.com/FRRouting/frr/issues/13708
Signed-off-by: Piotr Suchy <psuchy@akamai.com>
Problem statement:
==================
When a vrf is deleted from the kernel, before its removed from the FRR
config, zebra gets to delete the the vrf and assiciated state.
It does so by sending a request to delete the l3 vni associated with the
vrf followed by a request to delete the vrf itself.
2023/10/06 06:22:18 ZEBRA: [JAESH-BABB8] Send L3_VNI_DEL 1001 VRF
testVRF1001 to bgp
2023/10/06 06:22:18 ZEBRA: [XC3P3-1DG4D] MESSAGE: ZEBRA_VRF_DELETE
testVRF1001
The zebra client communication is asynchronous and about 1/5 cases the
bgp client process them in a different order.
2023/10/06 06:22:18 BGP: [VP18N-HB5R6] VRF testVRF1001(766) is to be
deleted.
2023/10/06 06:22:18 BGP: [RH4KQ-X3CYT] VRF testVRF1001(766) is to be
disabled.
2023/10/06 06:22:18 BGP: [X8ZE0-9TS5H] VRF disable testVRF1001 id 766
2023/10/06 06:22:18 BGP: [X67AQ-923PR] Deregistering VRF 766
2023/10/06 06:22:18 BGP: [K52W0-YZ4T8] VRF Deletion:
testVRF1001(4294967295)
.. and a bit later :
2023/10/06 06:22:18 BGP: [MRXGD-9MHNX] DJERNAES: process L3VNI 1001 DEL
2023/10/06 06:22:18 BGP: [NCEPE-BKB1G][EC 33554467] Cannot process L3VNI
1001 Del - Could not find BGP instance
When the bgp vrf config is removed later it fails on the sanity check if
l3vni is removed.
if (bgp->l3vni) {
vty_out(vty, "%% Please unconfigure l3vni %u\n",
bgp->l3vni);
return CMD_WARNING_CONFIG_FAILED;
}
Solution:
=========
The solution is to make bgp cleanup the l3vni a bgp instance is going
down.
The fix:
========
The fix is to add a function in bgp_evpn.c to be responsible for for
deleting the local vni, if it should be needed, and call the function
from bgp_instance_down().
Testing:
========
Created a test, which can run in container lab that remove the vrf on
the host before removing the vrf and the bgp config form frr. Running
this test in a loop trigger the problem 18 times of 100 runs. After the
fix it did not fail.
To verify the fix a log message (which is not in the code any longer)
were used when we had a stale l3vni and needed to call
bgp_evpn_local_l3vni_del() to do the cleanup. This were hit 20 times in
100 test runs.
Signed-off-by: Kacper Kwasny <kkwasny@akamai.com>
bgpd: braces {} are not necessary for single line block
Signed-off-by: Kacper Kwasny <kkwasny@akamai.com>
In case when bgp_evpn_free or bgp_delete is called and the announce_list
has few items where vpn/bgp does not match, we add the item back to the
list. Because of this the list count is always > 0 thereby hogging CPU or
infinite loop.
Ticket: #3905624
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
BGP receives notification from zebra about an vpn that
needs to be installed into the evpn tables. Unfortunately
this function was walking the entirety of evpn tables
3 times. Modify the code to walk the tree 1 time and
to just look for the needed route types as you go.
This reduces, in a scaled environment, processing
time of the zclient_read function from 130 seconds
to 95 seconds. For a up / down / up interface
scenario.
Signed-off-by: Rajasekar Raja <rajasekarr@vndia.com>
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Current changes deals with EVPN routes installation to zebra.
In evpn_route_select_install() we invoke evpn_zebra_install/uninstall
which sends zclient_send_message().
This is a continuation of code changes (similar to
ccfe452763) but to handle evpn part
of the code.
Ticket: #3390099
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
Currently bgp_path_info's are stored in reverse order
received. Sort them by the best path ordering.
This will allow for optimizations in the future on
how multipath is done.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Do not reap instead let's schedule for deletion
and let best_path_selection take care of the deletion
as it should.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Currently evpn code calls bgp_best_selection for local
decisions for local tables to figure out what to do.
This is also pi based so let's note that the pi has
been changed before calling bgp_best_selection.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This will allow a consistency of approach to adding/removing
pi's to from the workqueue for processing as well as properly
handling the dest->info pi list more appropriately.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Modify update_evpn_type5_route_entry to return a pointer to the
struct bgp_path_info modified in this function. This code
merely follows the standards used in other bgp_evpn.c code
where the update function returns the pointer to the path
info.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The MTYPE_BGP memory type was being over used as
both the handler for the bgp instance itself as
well as memory associated with name strings.
Let's separate out the two.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The dest pointer may be freed( but should not be
due to locking ). Let's ensure that this assumption
is true and make coverity happy.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
There exist two spots in this function where the dest could be
freed, but is not due to locking, but coverity thinks it might
so let's make the function happy.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
But never really does due to locking, but since it can
we need to treat it like it does and ensure that FRR
is not making a mistake, by using memory after it
has been freed.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This is based on @donaldsharp's work
The current code base is the struct bgp_node data structure.
The problem with this is that it creates a bunch of
extra data per route_node.
The table structure generates ‘holder’ nodes
that are never going to receive bgp routes,
and now the memory of those nodes is allocated
as if they are a full bgp_node.
After splitting up the bgp_node into bgp_dest and route_node,
the memory of ‘holder’ node which does not have any bgp data
will be allocated as the route_node, not the bgp_node,
and the memory usage is reduced.
The memory usage of BGP node will be reduced from 200B to 96B.
The total memory usage optimization of this part is ~16.00%.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Yuqing Zhao <xiaopanghu99@163.com>
As part of the conversion to a `struct peer_connection` it will
be desirable to have 2 pointers one for when we open a connection
and one for when we receive a connection. Start this actual
conversion over to this in `struct peer`. If this sounds confusing
take a look at the bgp state machine for connections and how
it resolves the processing of this router opening -vs- this
router receiving an open. At some point in time the state
machine decides that we are keeping one of the two connections.
Future commits will allow us to untangle the peer/doppelganger
duality with this abstraction.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The status and ostatus are a function of the `struct peer_connection`
move it into that data structure.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Found some code where bgp was not unlocking the dest
and rd_dest when walking the tree attempting to
find something to install.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Even if some of the attributes in bgp_path_info_extra are
not used, their memory is still allocated every time. It
cause a waste of memory.
This commit code deletes all unnecessary attributes and
changes the optional attributes to pointer storage. Memory
will only be allocated when they are actually used. After
optimization, extra info related memory is reduced by about
half(~400B -> ~200B).
Signed-off-by: Valerian_He <1826906282@qq.com>
Consider the scenario of evpn, the box has some type-5 ECMP routes.
After one of its remote peers is removed ( or down ), `show evpn rmac vni all`
kept no change **sometimes**, it means the rmac of the removed peer maybe is
still in this box, and the traffic will be wrongly forwarded to the removed
peer.
The root cause is that the best path selection for type-5 routes maybe
keep no change and the best path is not routed to the removed peer, so `bgpd`
wrongly doesn't tell `zebra` to remove ( withdraw ) the type-5 routes owned
by the removed peer.
So, add a new flag to force the deletion.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
Adds a generalized martian reimport function used for triggering a
relearn/reimport of EVPN routes that were previously filtered/deleted
as a result of a "self" check (either during import or by a martian
change handler). The MAC-VRF SoO is the first consumer of this function,
but can be expanded for use with Martian Tunnel-IPs, Interface-IPs,
Interface-MACs, and RMACs.
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
Currently we have a handler function that will walk the global EVPN
rib and unimport/remove routes matching a local IP/TIP. This generalizes
this function so that it can be re-used for other BGP Martian entry
types. Now this can be used to unimport routes when the MAC-VRF SoO is
reconfigured.
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
For whatever reason, we were only updating tip_hash when we processed an
L2VNI add/del. This adds tip_hash updates to the L3VNI add/del codepaths
so that their VTEP-IPs are also used when when considering martian
addresses, e.g. bgp_nexthop_self().
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
Initial support for configuring an SoO for all MAC-VRFs (EVIs/L2VNIs).
This provides a topology-independent method of preventing EVPN routes
from one MAC-VRF "site" (an L2 domain) from being imported by other PEs
in the same MAC-VRF "site", similar to how SoO is traditionally used in
L3VPN to identify and break loops for an L3/IP-VRF "site".
One example of where a MAC-VRF SoO can be used to avoid an L2 control
plane loop is with Active/Active MLAG VTEPs. For a given L2 site only
one control plane should be active. SoO can be used to ID/ignore entries
originated from the local MAC-VRF site so that EVPN will not attempt to
manage entries that are already handled by MLAG.
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
bgp_create() and bgp_free() already call EVPN-specific handlers,
so there's no need to XCALLOC/XFREE BGP_EVPN_INFO directly. Let's move
all the references to MTYPE_BGP_EVPN_INFO into the EVPN specific files.
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>