Fix NULL deference issue in log. And change one word - "vtep",
it should be with lowercase letters like other places.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
Remove unnecessary `install_l3nhg` knob because it has already
been controlled by the command: "[no$no] use-es-l3nhg".
Signed-off-by: anlan_cs <vic.lan@pica8.com>
Correct flag name of `attr.es_flags` - ATTR_ES_L3_NHG_USE.
"bgp_path_info"s (Per "es-vrf") with this flag can use l3nhg.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
Two cosmetic change -
1) Remove unnecessary local variable - `es_vtep` used in one condition.
2) Remove unused variable - `es_cnt`. `proc_cnt` has already taken `es`
into account.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
"no ead-es-route-target export RT":
Since existance is already checked in `bgp_evpn_ead_es_rt_cmd`
with `bgp_evpn_rt_matches_existing()`, there MUST be a deleting
node in evpn's `bgp_mh_info->ead_es_export_rtl` list.
Just modify the check for deleting node to an `assert`.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
Currently, `bgp_evpn_es_new()` always has an invald `struct bgp` pointer as
its input parameter, and it will always return valid `es`.
So two cleanup changes:
- Remove unnecessary checking for `bgp` in `bgp_evpn_es_new()`
- Remove unnecessary checkings of `bgp_evpn_es_new()`'s callers.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
Since the `RB_INSERT()` is called after not found in RB tree, it MUST be ok and
and return zero. The check of returning value of `RB_INSERT()` is redundant,
just remove them.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
The EAD-per-ES route can be fragmented to fit the EVIs on the switch. This
command allows the EVI limit to be configured -
!
router bgp 5556
!
address-family l2vpn evpn
ead-es-frag evi-limit 200
exit-address-family
!
!
Signed-off-by: Anuradha Karuppiah <anuradhak@nvidia.com>
The EAD-per-ES route carries ECs for all the ES-EVI RTs. As the number of VNIs
increase all RTs do not fit into a standard BGP UPDATE (4K) so the route needs
to be fragmented.
Each fragment is associated with a separate RD and frag-id -
1. Local ES-per-EAD -
ES route table - {ES-frag-ID, ESI, ET=0xffffffff, VTEP-IP}
global route table - {RD-=ES-frag-RD, ESI, ET=0xffffffff}
2. Remote ES-per-EAD -
VNI route table - {ESI, ET=0xffffffff, VTEP-IP}
global route table - {RD-=ES-frag-RD, ESI, ET=0xffffffff}
Note: The fragment ID is abandoned in the per-VNI routing table. At this
point that is acceptable as we dont expect more than one-ES-per-EAD fragment
to be imported into the per-VNI routing table. But that may need to be
re-worked at a later point.
CLI changes (sample with 4 VNIs per-fragment for experimental pruposes) -
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
root@torm-11:mgmt:~# vtysh -c "show bgp l2vpn evpn es 03:44:38:39:ff:ff:01:00:00:01"
ESI: 03:44:38:39:ff:ff:01:00:00:01
Type: LR
RD: 27.0.0.21:3
Originator-IP: 27.0.0.21
Local ES DF preference: 50000
VNI Count: 10
Remote VNI Count: 10
VRF Count: 3
MACIP EVI Path Count: 33
MACIP Global Path Count: 198
Inconsistent VNI VTEP Count: 0
Inconsistencies: -
Fragments: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
27.0.0.21:3 EVIs: 4
27.0.0.21:13 EVIs: 4
27.0.0.21:22 EVIs: 2
VTEPs:
27.0.0.22 flags: EA df_alg: preference df_pref: 32767
27.0.0.23 flags: EA df_alg: preference df_pref: 32767
root@torm-11:mgmt:~# vtysh -c "show bgp l2vpn evpn es-evi vni 1002 detail"
VNI: 1002 ESI: 03:44:38:39:ff:ff:01:00:00:01
Type: LR
ES fragment RD: 27.0.0.21:13 >>>>>>>>>>>>>>>>>>>>>>>>>
Inconsistencies: -
VTEPs: 27.0.0.22(EV),27.0.0.23(EV)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
PS: The number of EVIs per-fragment has been set to 128 and may need further
tuning.
Ticket: #2632967
Signed-off-by: Anuradha Karuppiah <anuradhak@nvidia.com>
This is an alternate to EAD route fragmenation and allows the user to limit
the route to a single UPDATE (<4K) independent of the number of EVIs.
Sample config (add one l2-vni RT from each VRF) -
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
!
router bgp 5556
!
address-family l2vpn evpn
ead-es-route-target export 5556:1001
ead-es-route-target export 5556:1004
ead-es-route-target export 5556:1008
exit-address-family
!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Sample route
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Network Next Hop Metric LocPrf Weight Path
*> [1]:[4294967295]:[03:44:38:39:ff:ff:01:00:00:01]:[32]:[27.0.0.21]
27.0.0.21 32768 i
ET:8 ESI-label-Rt:AA RT:5556:1001 RT:5556:1004 RT:5556:1008
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
When configured, the ead-es-route-target is used instead of
the auto-generated version that includes all associated EVI's RTs.
Ticket: #2632967
Signed-off-by: Anuradha Karuppiah <anuradhak@nvidia.com>
Using memcmp is wrong because struct ipaddr may contain unitialized
padding bytes that should not be compared.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
When VNI RD changes, EAD/EVI routes with old RD should be withdrawn from
the global routing table and EAD/EVI routes in the VNI should be
advertised with the new RD.
Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
The MH datastructures were being released before the paths that were
referencing them. Fix is to do the MH cleanup last.
The MH finish function has also been stripped down to only do a
datastructure cleanup i.e. avoid sending route updates etc.
Ticket: 31376
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
1. When a local ES is deleted or the ES-bond goes into bypass we treat
imported MAC-IP routes with that ES destination as remote routes instead
of sync routes. This requires a re-evaluation of the routes as
"non-local-dest" and an update to zebra.
2. When a ES is attached to an access port or the ES-bond transitions from
bypass to LACP-up we treat imported MAC-IP routes with that ES destination as
sync routes. This requires a re-evaluation of the routes as
"local-dest" and an update to zebra.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
In the case of EVPN type-2 routes that use ES as destination, BGP
consolidates the nh (and nh->rmac mapping) and sends it to zebra as
a nexthop add.
This nexthop is the EVPN remote PE and is created by reference of
VRF IPvx unicast paths imported from EVPN Type-2 routes.
zebra uses this nexthop for setting up a remote neigh enty for the PE
and a remote fdb entry for the PE's RMAC.
Ticket: CM-31398
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Setup a mh_info indirection in the path extra. This has been done to
avoid increasing evpn route's path size to add new (type based) pointers
in path_info_extra.
Ticket: CM-31398
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Theoretically we should just be able to use the L3 NHG in the other-VRF/nh-VRF.
But there is some change list handling (when an ES is added to or
removed from a VRF) that needs to be updated to account for routes in other
VRFs using that ES-VRF as nexthop. Till that is done we will disable L3-NHG
use for routes leaked from a different VRF.
Route in tenant2 with ES/NHG as destination -
===========================================
root@leaf11:mgmt:~# ip route show vrf tenant2 22.1.0.7
22.1.0.7 nhid 75000012 proto bgp metric 20
root@leaf11:mgmt:~# ip nexthop list id 75000012
id 75000012 group 103/107/111 proto bgp
root@leaf11:mgmt:~# ip nexthop |grep "103\|107\|111"
id 103 via 6.0.0.11 dev vlan12 scope link proto bgp onlink
id 107 via 6.0.0.12 dev vlan12 scope link proto bgp onlink
id 111 via 6.0.0.13 dev vlan12 scope link proto bgp onlink
id 75000012 group 103/107/111 proto bgp
root@leaf11:mgmt:~#
Leaked into VRF1 with a flat/exploded mpaths
============================================
root@leaf11:mgmt:~# ip route show vrf tenant1 |grep -A3 22.1.0.7
22.1.0.7 proto bgp metric 20
nexthop via 6.0.0.11 dev vlan12 weight 1 onlink
nexthop via 6.0.0.12 dev vlan12 weight 1 onlink
nexthop via 6.0.0.13 dev vlan12 weight 1 onlink
root@leaf11:mgmt:~#
Ticket: CM-31115
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Force flush all ES-EVI PE entries when a L2-VNI is deleted. This will
implicitly free up the remote ES-EVI and deref the ES entry.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
There are two changes in this commit -
1. Maintain a list of global MAC-IP routes per-ES. This list is maintained
for quick processing on the following events -
a. When the first VTEP/PE becomes active in the ES-VRF, the L3 NHG is
activated and the route can be sent to zebra.
b. When there are no active PEs in the ES-VRF the L3 NHG is
de-activated and -
- If the ES is present in the VRF -
The route is not installed in zebra as there are no active PEs for
the ES-VRF
- If the ES is not present in the VRF -
The route is installed with a flat multi-path list i.e. without L3NHG.
This is to handle the case where there are no locally attached L2VNIs
on the ES (for that tenant VRF).
2. Reinstall VRF route when an ES is installed or uninstalled in a
tenant VRF (the global MAC-IP list in #1 is used for this purpose also).
If an ES is present in the VRF we use L3NHG to enable fast-failover of
routed traffic.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
This is done to clearly indicate what routes are being linked to
the list i.e. MAC-IP routes in the VNI table.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
In a sym-IRB setup the remote ES may not be installed if the tenant
VRF is not present locally. To allow that case while retaining the
fast-failover benefits for the case where the tenant VRF is locally
present we use the following approach -
1. If ES is present in the tenant VRF we use the L3NHG for installing
the MAC-IP based tenant route. This allows for efficient failover via
L3NHG updates.
2. If the ES is not present locally in the corresponding tenant VRF we
fall back to a non-NHG multi-path based routing approach. In this
case individual routes are updated when the ES links flap.
PS: #1 can be turned off entirely by disabling use-l3-nhg in BGP.
Ticket: CM-30935
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
When an ES goes down the MAC-IP route must be updated to remove it from
the tenant VRF routing table. This is because the fast-failover
(via EAD-per-ES withdraw) procedures described in RFC 7432 are only
applicable to L2 forwarding/MAC-ECMP. For L3/routed traffic (in a
sym-IRB setup) failover, individual paths need to be withdrawn.
To handle this difference in L2/L3 requirements BGP updates the MAC-IP
route to include the L3 ECOM if local destination ES is oper-up and
to exclude the L3 ECOM if local ES is oper-down.
Ticket: CM-30935
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
1. When VNI export RT changes, for each local es_evi, update local
EAD/ES and EAD/EVI routes and advertise.
2. When VNI import RT changes, uninstall all type-1 routes imported in
the VNI and import routes carrying the updated RT.
Signed-off-by: Ameya Dharkar <adharkar@vmware.com>