When creating one interface "vxlan66" ( ip link add vxlan66 type vxlan ... ),
which initially maintains down status, saw one unexpected EC log:
```
zebra[37906]: [ZAG0W-VSNSD] interface vxlan66 vrf default(0) index 35 becomes active.
zebra[37906]: [HPWGA-Y527W] IFLA_VXLAN_LINK missing from VXLAN IF message
...
zebra[37906]: [W6BZR-YZPAB] RTM_NEWLINK update for vxlan66(35) sl_type 0 master 0 flags 0x1002
zebra[37906]: [MR3ZF-ATDBY] Intf vxlan66(35) has gone DOWN
zebra[37906]: [T44X9-FFNVB] Intf vxlan66(35) L2-VNI 66 is DOWN
zebra[37906]: [KEGYY-K8XVV] Send EVPN_DEL 66 to bgp
zebra[37906]: [HPWGA-Y527W] IFLA_VXLAN_LINK missing from VXLAN IF message
bgpd[37911]: [TV0XP-3WR0A] Rx VNI del VRF default VNI 66 tenant-vrf default SVI ifindex 0
bgpd[37911]: [MDW89-YAXJG][EC 33554497] 0: VNI hash entry for VNI 66 not found at DEL
```
Since commit `6f908ded80eeba40a850e8d1f6246fb3ed31e648` support interfaces
from down to down, and bgpd doesn't know "VNI 66" at all. So, remove this
EC log.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
ASAN reported the following memleak:
```
Direct leak of 40 byte(s) in 1 object(s) allocated from:
#0 0x4d4342 in calloc (/usr/lib/frr/bgpd+0x4d4342)
#1 0xbc3d68 in qcalloc /home/sharpd/frr8/lib/memory.c:116:27
#2 0xb869f7 in list_new /home/sharpd/frr8/lib/linklist.c:64:9
#3 0x5a38bc in bgp_evpn_remote_ip_hash_alloc /home/sharpd/frr8/bgpd/bgp_evpn.c:6789:24
#4 0xb358d3 in hash_get /home/sharpd/frr8/lib/hash.c:162:13
#5 0x593d39 in bgp_evpn_remote_ip_hash_add /home/sharpd/frr8/bgpd/bgp_evpn.c:6881:7
#6 0x59dbbd in install_evpn_route_entry_in_vni_common /home/sharpd/frr8/bgpd/bgp_evpn.c:3049:2
#7 0x59cfe0 in install_evpn_route_entry_in_vni_ip /home/sharpd/frr8/bgpd/bgp_evpn.c:3126:8
#8 0x59c6f0 in install_evpn_route_entry /home/sharpd/frr8/bgpd/bgp_evpn.c:3318:8
#9 0x59bb52 in install_uninstall_route_in_vnis /home/sharpd/frr8/bgpd/bgp_evpn.c:3888:10
#10 0x59b6d2 in bgp_evpn_install_uninstall_table /home/sharpd/frr8/bgpd/bgp_evpn.c:4019:5
#11 0x578857 in install_uninstall_evpn_route /home/sharpd/frr8/bgpd/bgp_evpn.c:4051:9
#12 0x58ada6 in bgp_evpn_import_route /home/sharpd/frr8/bgpd/bgp_evpn.c:6049:9
#13 0x713794 in bgp_update /home/sharpd/frr8/bgpd/bgp_route.c:4842:3
#14 0x583fa0 in process_type2_route /home/sharpd/frr8/bgpd/bgp_evpn.c:4518:9
#15 0x5824ba in bgp_nlri_parse_evpn /home/sharpd/frr8/bgpd/bgp_evpn.c:5732:8
#16 0x6ae6a2 in bgp_nlri_parse /home/sharpd/frr8/bgpd/bgp_packet.c:363:10
#17 0x6be6fa in bgp_update_receive /home/sharpd/frr8/bgpd/bgp_packet.c:2020:15
#18 0x6b7433 in bgp_process_packet /home/sharpd/frr8/bgpd/bgp_packet.c:2929:11
#19 0xd00146 in thread_call /home/sharpd/frr8/lib/thread.c:2006:2
```
The list itself was not being cleaned up when the final list entry was
removed, so make sure we do that instead of leaking memory.
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
Both install_evpn_route_entry_in_vni_mac() and
install_evpn_route_entry_in_vni_ip() will unlock the bgp_dest when
install_evpn_route_entry_in_vni_common() returns, so there's no need to
unlock the bgp_dest inside the _common function. Let's let the new
wrappers handle the cleanup of the dest.
Ticket: #3119673
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
Re-work the bgp vni table to use separately keyed tables for type2
routes.
So, with type2 routes, we have the main table keyed off of the IP and a
new MAC table keyed off of MACs.
By separating out the two, we are able to run path selection separately
for the neigh and mac. Keeping the two separate is also more in-line
with what happens in zebra (they are managed comptletely seperate).
With this change type2 routes go into each table like so:
```
Remote MAC-IP -> IP Table & MAC Table
Remote MAC -> MAC Table
Local MAC-IP -> IP Table
Local MAC -> MAC Table
```
The difference for local is necessary because we should not ever allow
multiple paths for a local MAC.
Also cleaned up the commands for querying the vni tables:
```
show bgp vni all type ...
show bgp vni VNI type ...
```
Old commands will be deprecated in a separate commit.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
We locate the sync path with the highest seq number for normalizing
local path to the same seq. With the change to maintain the VNI RT table
by IP address (instead of MAC&IP) we need additional changes to skip
paths with a different mac address than the local one we are trying to
normalize.
Signed-off-by: Anuradha Karuppiah <anuradhak@nvidia.com>
Use the IP addr of type2/macip routes only for the hash/key
of the VNI table and carry the MAC in a path_info_extra attribute.
There is exists situations that can be hit during extended MAC mobility events
where two MACs could be pointing to the same IP in our global table. It
is requires very specific timings.
When that happens, BPG would (because we key'd on both MAC and IP)
install both into it's VNI table as separate entries, but zebra only
knows/needs to know about a single IP -> MAC relationship for it's VNI
table's type2 routes. So it was compleletly undeterministic which one
zebra would end up with in these timing situations.
With these changes, we move BGP's VNI table to key'd the same as Zebra's
and now a single IP will have multiple path_info's with a path_info_extra
that is carrying the MAC info for each path.
BGP will then run best path to deterministically decide which one to send to
zebra during the occasions where there exist's two possible MACs.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Let's convert to our actual library call instead
of using yet another abstraction that makes it fun
for people to switch daemons.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Implement forcing L3 auto derivation via configs even when
manually RTs are set. This will allow both to coexist in
BGP RTs. Without using auto config command, it will remove
auto derived RTs when you manually configure your own. To allow
both, use the auto command ond import/export/both.
Implement '*' wildcard import L3 RTs so we can import a route into any AS.
This is necessary to avoid a user from having to configure an L3 RT for
every AS they care to import evpn route from.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Abstract the ecommunity into a container struct for L3
route targets so that we can add some additional info
via flags to go along with RT configs without modifying
the used elsewhere ecommunity struct. This functions as a
wrapper everywhere its used including the import/export lists.
The flags will be used in later commits to change behavior
when importing/exporting routes.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
RFC 4760 states we SHOULD ignore the NEXT_HOP attribute for BGP Update
messages carrying only MP_REACH_NLRI attributes. Thus we should use the
Network Address of Next Hop field of the MP_REACH_NLRI as the nexthop.
Instead of always looking for BGP_ATTR_NEXT_HOP, this commit ensures:
1) we set mp_nexthop_len to BGP_ATTR_NHLEN_IPV4 for v4 bgp_static routes
2) we check mp_nexthop_len when choosing the nexthop to use for nht
3) we check mp_nexthop_len when choosing the nexthop to send to zebra
4) we check mp_nexthop_len when picking the nexthop to shown by vtysh
Reported-by: Binon Gorbutt <binon@aervivo.com>
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
FRR should create a bnc per peer. Not have
one's that write over others. Currently when
FRR has multiple Interface based peering, BGP wa
creating a single BNC. This is insufficient in that
we were accidently overwriting the one LL with other
data. This causes issues when there are multiple and
there is weird starting issues with those interfaces
that you are peering over.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
`attr.rmac` is not set in debug as expected for its wrong place in code.
Just move the debug process (`bgp_debug_zebra(NULL)`) after possible `rmac`
value is set.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
The "add" parameter of `bgp_evpn_mh_route_update()` makes no sense.
Just remove it to clarify this function, and remove the relevant check
with "add" as well.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
When `bgp_evpn_new()` is called, the `bgp` parameter MUST be non-NULL,
remove this unnecessary check and remove the NULL check for returned
`struct bgpevpn *`, which should be non-NULL.
And modify `import_rt_new()` in the same way.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
Two changes for `delete_global_type2_routes()`:
1) Remove check of `bgp_dest_has_bgp_path_info_data(rddest)`.
It is unnecessary(`dest->info` should not be NULL) and misleading.
`if (rddest && bgp_dest_has_bgp_path_info_data(rddest))`
Use (locked) node with this check, but unlock with `if (rddest)`,
The mismatched condition is misleading, there seems to be a
mistake to extra unlock.
Just make it clear, immediately exit with `(!rddest)`.
2) Remove checking returned value for it, and use `void` as return type.
It is unnecessary and wrong. Even the check failed, it should continue
to delete other types of routes.
Just remove the check and go through.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
Firstly, *keep no change* for `hash_get()` with NULL
`alloc_func`.
Only focus on cases with non-NULL `alloc_func` of
`hash_get()`.
Since `hash_get()` with non-NULL `alloc_func` parameter
shall not fail, just ignore the returned value of it.
The returned value must not be NULL.
So in this case, remove the unnecessary checking NULL
or not for the returned value and add `void` in front
of it.
Importantly, also *keep no change* for the two cases with
non-NULL `alloc_func` -
1) Use `assert(<returned_data> == <searching_data>)` to
ensure it is a created node, not a found node.
Refer to `isis_vertex_queue_insert()` of isisd, there
are many examples of this case in isid.
2) Use `<returned_data> != <searching_data>` to judge it
is a found node, then free <searching_data>.
Refer to `aspath_intern()` of bgpd, there are many
examples of this case in bgpd.
Here, <returned_data> is the returned value from `hash_get()`,
and <searching_data> is the data, which is to be put into
hash table.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
If `hash_get()` returns NULL, the list created with
`list_new()` is not be freed.
Since `hash_get()` should not fail, we don't need
`list_delete()` and other boring `XFREE()`s for its
failure case.
Just ignore returning value of `hash_get()` in these
cases.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
RT value will be unique across different VNIs but the
same across routers (in the same AS) for a particula
VNI.
It is unique, so add `break` for search procedure.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
This is an alternate to EAD route fragmenation and allows the user to limit
the route to a single UPDATE (<4K) independent of the number of EVIs.
Sample config (add one l2-vni RT from each VRF) -
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
!
router bgp 5556
!
address-family l2vpn evpn
ead-es-route-target export 5556:1001
ead-es-route-target export 5556:1004
ead-es-route-target export 5556:1008
exit-address-family
!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Sample route
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Network Next Hop Metric LocPrf Weight Path
*> [1]:[4294967295]:[03:44:38:39:ff:ff:01:00:00:01]:[32]:[27.0.0.21]
27.0.0.21 32768 i
ET:8 ESI-label-Rt:AA RT:5556:1001 RT:5556:1004 RT:5556:1008
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
When configured, the ead-es-route-target is used instead of
the auto-generated version that includes all associated EVI's RTs.
Ticket: #2632967
Signed-off-by: Anuradha Karuppiah <anuradhak@nvidia.com>
Importing local es routes should be skipped. But the check of it is a bit wrong.
It is ok that local es routes can't be imported, but importing local es will
wrongly enter `uninstall` procedure.
Just adjust this check to make it clear. Immediately return in the case
of importing local es routes.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
`bgp_evpn_import_route_in_vrfs()` is special ( l2vpn ) form of
`install_uninstall_evpn_route() with `AFI_L2VPN` and `SAFI_EVPN` family.
No caller, just remove it.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
Description:
Replacing memcmp at certain places,
to avoid the coverity issues caused by it.
Co-authored-by: Kantesh Mundargi <kmundaragi@vmware.com>
Signed-off-by: Iqra Siddiqui <imujeebsiddi@vmware.com>
BGP EVPN custom `union gw_addr` is basically the same thing as a common
`struct ipaddr` but it lacks the address family which is needed in some
cases.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
NH tracking is already in use for type-1, type-3 and type-5 routes.
This change extends that tracking to EAD and ESR to eliminate the 9s
delay (BGP holdtimer) with ES/L2-NHG update seen when all the uplinks
are shutdown on a remote EVPN PE.
Ticket: #2682896
Signed-off-by: Anuradha Karuppiah <anuradhak@nvidia.com>
The BGP configuration for BGP EVPN RT5 setup consists in mainly
2 bgp instances (eventually one is enough) and L3VNI config.
When L3VNI is configured before BGP instances, and BGP route
targets are auto derived as per rfc8365, then, the obtained
route targets are wrong. For instance, the following can be
obtained:
=> show bgp vrf cust1 vni
BGP VRF: cust1
Local-Ip: 10.209.36.1
L3-VNI: 1000
Rmac: da:85:42:ba:2a:e9
VNI Filter: none
L2-VNI List:
Export-RTs:
RT:12757:1000
Import-RTs:
RT:12757:1000
RD: 65000:1000
whereas the derived route targets should be the below
ones:
=> show bgp vrf cust1 vni
BGP VRF: cust1
Local-Ip: 10.209.36.1
L3-VNI: 1000
Rmac: 72:f3:af:a0:98:80
VNI Filter: none
L2-VNI List:
Export-RTs:
RT:12757:268436456
Import-RTs:
RT:12757:268436456
RD: 65000:1000
There is an update handler that updates appropriately L2VNIs.
But this is not the case for L3VNIs. Add the missing code.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
when doing BGP over an IGP platform, the expectation is that
the path calculation for a given prefix takes into account the
igpmetric given by IGP.
This is true with prefixes obtained in a given BGP instance where
peering occurs. For instance, ipv4 unicast entries or l2vpn evpn
entries work this way. The igpmetric is obtained through nexthop
tracking, like below:
south-vm# show bgp nexthop
Current BGP nexthop cache:
1.1.1.1 valid [IGP metric 10], #paths 1, peer 1.1.1.1
2.2.2.2 valid [IGP metric 20], #paths 1, peer 2.2.2.2
The igp metric is taken into account when doing best path
selection, and only the entry with lowest igp wins.
[..]
*>i[5]:[0]:[32]:[5.5.5.5]
1.1.1.1 0 100 0 ?
RT:65400:268435556 ET:8 Rmac:2e:22:6c:67:bb:73
* i 2.2.2.2 0 100 0 ?
RT:65400:268435556 ET:8 Rmac:f2:d3:68:4e:f4:ed
however, for imported EVPN RT5 entries, the igpmetric was not
copied from the parent path info. Fix it. In this way, the
imported route entries use the igpmetric of the parent pi.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>