Commit Graph

81 Commits

Author SHA1 Message Date
Anuradha Karuppiah
74be8313d4 bgpd: support for lacp bypass with EVPN MH
When a local ES is in LACP bypass state BGP doesn't advertise
reachability to it i.e. the Type-1/EAD-per-ES routes and Type-4
route for the ES is not advertised. This is the equivalent of
oper-down handling.

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2021-02-24 08:11:26 -08:00
Anuradha Karuppiah
8fc2ffb3bb bgpd: enable ES consistency checks on first ES add
Consistency checks are processed in the background using a periodic timer.
Start this timer only if Ethernet Segments are present and consistency
checking is needed.

Signed-off-by: Anuradha Karuppiah <anuradhak@nvidia.com>
2021-02-19 12:35:07 -08:00
Patrick Ruddy
b5a9054d76
Merge pull request #7635 from AnuradhaKaruppiah/ead-evi-knobs
bgpd: add config knobs to disable rx and tx of ead-per-evi routes
2021-01-26 17:14:57 +00:00
Anuradha Karuppiah
b37ff319f3 bgpd: fix typo "show bgp l2vpn evpn es-evi [vni] <> json" display
The ead-per-evi flag was being displayed as ed-per-evi.

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-12-21 08:41:17 -08:00
Anuradha Karuppiah
2eef4f20d0 bgpd: rename some MH functions and take care of deffered logs etc.
Rename VTEP change functions for better readability, improve comments
and add missing logs.

No functional change.

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-12-21 08:40:07 -08:00
Donatas Abraitis
3a6290bdd1 *: Replace s_addr check agains 0 with INADDR_ANY
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2020-12-14 21:03:38 +02:00
Anuradha Karuppiah
fe8293c326 bgpd: add config knobs to disable rx and tx of ead-per-evi routes
Some vendors only advertise EAD-per-ES routes i.e. they do not
advertise EAD-per-EVI routes. To interop with these vendors we need
to relax the dependancy on EAD-per-EVI routes to activate a remote ES-PE.

Sample config -
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
router bgp 5553
 address-family l2vpn evpn
  disable-ead-evi-rx
>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Ticket: CM-31177

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-12-11 09:39:04 -08:00
Pat Ruddy
f61fbf216b bgpd: correctly store allocated ES struct
in the rare situation where we allocate the ES during the path link
we fail to check/store the allocated ES pointer thus leading to a
NULL dereference later in the function.

Signed-off-by: Pat Ruddy <pat@voltanet.io>
2020-11-25 18:23:13 +00:00
Anuradha Karuppiah
2867823e49 bgpd: add a config knob to enable use of L3 NHG for EVPN host routes
Sample config -
vtysh -c "conf t"  -c "router bgp <N>" -c "address-family l2vpn evpn" -c "use-es-l3nhg"

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-11-24 11:06:08 -08:00
Anuradha Karuppiah
8bcb09a18c bgpd: Use L3NHGs for symmetric IRB host routes
Two L3 next groups are installed per-VRF per-ES for v4 and v6. These
NHGs are used as an indirect destination for symmetric IRB host routes.

Using L3NHGs allows for efficient failover of an ES (similar to the
use of L2NHGs) i.e. when an ES goes down the number of dataplane
updates are limited to 2xN (where N is the number of tenant VRFs
associated with the ES) instead of updating all host-routes behind the
ES.

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-11-24 11:06:08 -08:00
Anuradha Karuppiah
229587fb46 bgpd: commands to display L3 NHGs and MAC-IP paths linked to an ES
Sample output -
===============
torm-11# sh bgp l2vpn evpn es-vrf
ES-VRF Flags: A Active
ESI                            VRF             Flags IPv4-NHG IPv6-NHG Ref
03:44:38:39:ff:ff:01:00:00:01  vrf3            A     1        0        2
03:44:38:39:ff:ff:01:00:00:01  vrf2            A     6        0        4
03:44:38:39:ff:ff:01:00:00:01  vrf1            A     7        0        4
03:44:38:39:ff:ff:01:00:00:02  vrf3            A     2        0        2
03:44:38:39:ff:ff:01:00:00:02  vrf2            A     4        0        4
03:44:38:39:ff:ff:01:00:00:02  vrf1            A     8        0        4

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-11-24 11:06:08 -08:00
Anuradha Karuppiah
6348981a60 bgpd: use L3NHG while installing EVPN host routes in zebra
Host routes imported into the VRF can have a destination ES (per-VRF)
which is set up as a L3NHG for efficient failover.

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-11-24 11:06:08 -08:00
Anuradha Karuppiah
26c03e43fb bgpd: Handle ES VTEP add/del to a host route
1. MAC-IP routes in the VPN routing table are linked to the
destination ES for efficient handling for remote ES link flaps.
2. Only MAC-IP paths whose nexthops are active (added via EAD-ES)
are imported into the VRF routing table.

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-11-24 11:06:08 -08:00
Anuradha Karuppiah
c589d84746 bgpd: L3NHG infrastructure for host routes in EVPN
ES-VRF entries are maintained for the purpose of L3-NHG creation -
1. Each ES-EVI entry is associated with a tenant VRF. This associaton
triggers the creation of an ES-VRF entry.
2. Type-2/MAC-IP routes are imported into a tenant VRF and programmed as
a /32 or host route entry in the dataplane. If the destination of
the host route is a remote-ES the route is programmed with the
corresponding (keyed in by {vrf,ES-id}) L3-NHG.
3. The reason for this indirection (route->L3-NHG, L3-NHG->list-of-VTEPs)
is to avoid route updates to the dplane when a remote-ES link flaps i.e.
instead of updating all the dependent routes the NHG's contents are
updated. This reduces the amount of dataplane updates (fewer nhg updates vs.
route updates) allowing for a faster failover.

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-11-24 11:06:08 -08:00
Donald Sharp
dc52beced1 bgpd: Fix missed unlocks
When iterating over the bgp_dest table, using this pattern:

	for (dest = bgp_table_top(table); dest;
	     dest = bgp_route_next(dest)) {

If the code breaks or returns in the middle we will not have
properly unlocked the node as that bgp_table_top locks the top
dest and bgp_route_next locks the next dest and unlocks the old
dest.

From code inspection I have found a bunch of places that
we either return in the middle of or a break is issued.

Add appropriate unlocks.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2020-11-14 15:32:49 -05:00
Anuradha Karuppiah
ec779825f8 bgpd: cleanup inet_ntoa in the bgp_evpn_mh debug logs
Replaced inet_ntoa with %pI4 in the debug logs.

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-10-26 10:43:05 -07:00
Anuradha Karuppiah
a2339ed9e3 lib, bgpd: move json_array_string_add to lib
json_array_string_add is used to add a string entry into a JSON
list. This API is needed by zebra so moving it from bgpd to lib.

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-10-26 10:33:21 -07:00
Anuradha Karuppiah
74e2bd891d bgpd: support for DF election in EVPN-MH
DF (Designated forwarder) election is used for picking a single
BUM-traffic forwarded per-ES. RFC7432 specifies a mechanism called
service carving for DF election. However that mechanism has many
disadvantages -
1. LBs poorly.
2. Doesn't allow for a controlled failover needed in upgrade
scenarios.
3. Not easy to hw accelerate.

To fix the poor performance of service carving alternate DF mechanisms
have been proposed via the following drafts -
draft-ietf-bess-evpn-df-election-framework
draft-ietf-bess-evpn-pref-df

This commit adds support for the pref-df election mechanism which
is used as the default. Other mechanisms including service-carving
may be added later.

In this mechanism one switch on an ES is elected as DF based on the
preference value; higher preference wins with IP address acting
as the tie-breaker (lower-IP wins if pref value is the same).

Sample output
=============
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
torm-11# sh bgp l2vpn evpn es 03:00:00:00:00:01:11:00:00:01
ESI: 03:00:00:00:00:01:11:00:00:01
 Type: LR
 RD: 27.0.0.15:6
 Originator-IP: 27.0.0.15
 Local ES DF preference: 100
 VNI Count: 10
 Remote VNI Count: 10
 Inconsistent VNI VTEP Count: 0
 Inconsistencies: -
 VTEPs:
  27.0.0.16 flags: EA df_alg: preference df_pref: 32767
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
torm-11# sh bgp l2vpn evpn route esi 03:00:00:00:00:01:11:00:00:01
*> [4]:[03:00:00:00:00:01:11:00:00:01]:[32]:[27.0.0.15]
                    27.0.0.15                          32768 i
                    ET:8 ES-Import-Rt:00:00:00:00:01:11 DF: (alg: 2, pref: 100)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-10-26 10:26:21 -07:00
Mark Stapp
b3d6bc6ef0 * : update signature of thread_cancel api
Change thread_cancel to take a ** to an event, NULL-check
before dereferencing, and NULL the caller's pointer. Update
many callers to use the new signature.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-10-23 08:59:34 -04:00
Donatas Abraitis
2dbe669bdf :* Convert prefix2str to %pFX
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2020-10-22 09:07:41 +03:00
Anuradha Karuppiah
45a859f1c3 bgpd: fix crash in the MH cleanup handling
The MH datastructures were being released before the paths that were
referencing them. Fix is to do the MH cleanup last.

The MH finish function has also been stripped down to only do a
datastructure cleanup i.e. avoid sending route updates etc.

Ticket: 31376

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-10-21 09:09:21 -07:00
Donatas Abraitis
23d0a75356 bgpd: Convert inet_ntoa to %pI4/inet_ntop
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2020-10-18 11:22:30 +03:00
Donald Sharp
752eed47ef bgpd: Use bgp_dest_get_prefix accessor function
Use the appropriate bgp_dest_get_prefix accessor function

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2020-10-17 08:52:35 -04:00
Donald Sharp
09319b4e0f bgpd: More bgp_node -> bgp_dest cleanup
Some more of the bgp_node usage snuck in from big commits in
the past month or so from feature work.  Do some work
to put it back to bgp_dest for incoming future work.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2020-10-17 08:52:35 -04:00
Philippe Guibert
7659ad686a bgpd: do not forget to set the size of community val length
because ecommunity structure can host both ext community and ipv6 ext
community, do not forget to set the unit_size field.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2020-08-21 13:37:08 +02:00
Philippe Guibert
34540b0d7f bgpd: fill in local ecommunity context with ecom unit length
because the same extended community can be used for storing ipv6 and
ipv4 et communities, the unit length must be stored. do not forget to
set the standard value in bgp evpn.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2020-08-21 13:37:08 +02:00
Anuradha Karuppiah
9e0c2fd182 bgpd, zebra: remove strcpy, strlen and sprintf calls
Replace with safe copy functions - strlcpy, strlcat, strnlen and
snprintf.

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-08-05 06:46:13 -07:00
Anuradha Karuppiah
9c7edc03b8 bgpd: Type-2/MAC-IP SYNC route handling
SYNC routes are paths rxed from a local-ES peer. These routes result in
the installation of local dataplane entries i.e. with access port as
destination (vs. the remote-VTEP destination that results in the packet
being sent via the VxLAN overlay).

If a SYNC path is selected as the best path it is always turned around
into a local path which immediately lowers the status of the SYNC path
to non-best. However we need to keep track of the highest MM seq-number
and peer activity to continue advertising the local path. In order to
do that we need information from the "second-best" SYNC path to be
bubbled up to the local best path. This "SYNC" info is then consolidated
and sent to zebra which is responsible for the MM handling and local
path management.

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-08-05 06:46:13 -07:00
Anuradha Karuppiah
7904e9fdfa bgpd: extended-community and attrs for MAC-IP SYNC route handling
A new proxy flag has been added to the already existing NA extended
community to allow proxy advertisment of a local host by a VTEP that is
yet to indpendently establish local reachability.
Reference: draft-rbickhart-evpn-ip-mac-proxy-adv

The extendend mac-mobility sequence number needs to be synced across
the ES peers. However we cannot let a ES-peer path win over a local
path on the same ES. To accomplish that some parameters such as the
MM seq number are bubbled up from the non-best path to the local path.
This mechanism is explained further in the path-selection patch.

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-08-05 06:46:12 -07:00
Anuradha Karuppiah
c44ab6f1f3 bgpd: support for Ethernet Segments and Type-1/EAD routes
This is the base patch that brings in support for Type-1 routes.
It includes support for -
- Ethernet Segment (ES) management
- EAD route handling
- MAC-IP (Type-2) routes with a non-zero ESI i.e. Aliasing for
  active-active multihoming
- Initial infra for consistency checking. Consistency checking
  is a fundamental feature for active-active solutions like MLAG.
  We will try to levarage the info in the EAD-ES/EAD-EVI routes to
  detect inconsitencies in access config across VTEPs attached to
  the same Ethernet Segment.

Functionality Overview -
========================
1. Ethernet segments are created in zebra and associated with
access VLANs. zebra sends that info as ES and ES-EVI objects to BGP.
2. BGP advertises EAD-ES and EAD-EVI routes for the locally attached
ethernet segments.
3. Similarly BGP processes EAD-ES and EAD-EVI routes from peers
and translates them into ES-VTEP objects which are then sent to zebra
as remote ESs.
4. Each ES in zebra is associated with a list of active VTEPs which
is then translated into a L2-NHG (nexthop group). This is the ES
"Alias" entry
5. MAC-IP routes with a non-zero ESI use the alias entry created in
(4.) to forward traffic i.e. a MAC-ECMP is done to these remote-ES
destinations.

EAD route management (route table and key) -
============================================
1. Local EAD-ES routes
a. route-table: per-ES route-table
key: {RD=ES-RD, ESI, ET=0xffffffff, VTEP-IP)
b. route-table: per-VNI route-table
Not added
c. route-table: global route-table
key: {RD=ES-RD, ESI, ET=0xffffffff)

2. Remote EAD-ES routes
a. route-table: per-ES route-table
Not added
b. route-table: per-VNI route-table
key: {RD=ES-RD, ESI, ET=0xffffffff, VTEP-IP)
c. route-table: global route-table
key: {RD=ES-RD, ESI, ET=0xffffffff)

3. Local EAD-EVI routes
a. route-table: per-ES route-table
Not added
b. route-table: per-VNI route-table
key: {RD=0, ESI, ET=0, VTEP-IP)
c. route-table: global route-table
key: {RD=L2-VNI-RD, ESI, ET=0)

4. Remote EAD-EVI routes
a. route-table: per-ES route-table
Not added
b. route-table: per-VNI route-table
key: {RD=0, ESI, ET=0, VTEP-IP)
c. route-table: global route-table
key: {RD=L2-VNI-RD, ESI, ET=0)

Please refer to bgp_evpn_mh.h for info on how the data-structures are
organized.

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-08-05 06:46:12 -07:00
Anuradha Karuppiah
185fb14a41 bgpd: pull the multihoming code out to a separate file
Re-org only; no other code changes. This is being done to make maintanence
of MH functionality (which will have more code added to it) easy.

The code moved here was originally committed via -
'commit 50f74cf131 ("*: support for evpn type-4 route")'

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-08-05 06:46:12 -07:00