Commit Graph

323 Commits

Author SHA1 Message Date
Anuradha Karuppiah
74efb82223 bgpd: handle local ES del or transition to LACP bypass
1. When a local ES is deleted or the ES-bond goes into bypass we treat
imported MAC-IP routes with that ES destination as remote routes instead
of sync routes. This requires a re-evaluation of the routes as
"non-local-dest" and an update to zebra.
2. When an ES is attached to an access port or the ES-bond transitions from
bypass to LACP-up we treat imported MAC-IP routes with that ES destination as
sync routes. This requires a re-evaluation of the routes as
"local-dest" and an update to zebra.
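
A minimal sketch of the re-evaluation described in points 1 and 2. The names (`demo_es`, `demo_route`, `demo_es_reeval_routes`) are hypothetical and are not bgpd's data structures. The sketch only assumes that a route counts as "local-dest" while its ES is attached locally and is not in LACP bypass, and that every flip would need an update to zebra.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not bgpd's real structures. */
struct demo_es {
	bool local;  /* ES is attached to a local access port */
	bool bypass; /* ES-bond is in LACP bypass */
};

struct demo_route {
	const struct demo_es *es; /* destination ES of the MAC-IP route */
	bool local_dest;          /* currently treated as a sync route */
};

/* A route is "local-dest" only while its ES is local and not in bypass. */
static bool demo_es_is_local_dest(const struct demo_es *es)
{
	assert(es != NULL);
	return es->local && !es->bypass;
}

/*
 * Re-evaluate every route and return how many changed state, i.e. how
 * many would need an update sent to zebra.
 */
static size_t demo_es_reeval_routes(struct demo_route *routes, size_t n)
{
	size_t changed = 0;

	for (size_t i = 0; i < n; i++) {
		bool now = demo_es_is_local_dest(routes[i].es);

		if (now != routes[i].local_dest) {
			routes[i].local_dest = now;
			changed++;
		}
	}
	return changed;
}
```

Calling `demo_es_reeval_routes()` after toggling `bypass` or `local` on the ES flips every route that points at it; routes whose state did not change are left alone.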

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2021-03-25 19:24:39 -07:00
Anuradha Karuppiah
090efa2fb7 bgpd: changes for maintaining evpn nexthops and their rmac mapping
In the case of EVPN type-2 routes that use ES as destination, BGP
consolidates the nh (and nh->rmac mapping) and sends it to zebra as
a nexthop add.

This nexthop is the EVPN remote PE and is created by reference of
VRF IPvx unicast paths imported from EVPN Type-2 routes.

zebra uses this nexthop for setting up a remote neigh entry for the PE
and a remote fdb entry for the PE's RMAC.

Ticket: CM-31398

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2021-03-25 17:12:50 -07:00
Anuradha Karuppiah
58bff4d12e bgpd: re-eval use-l3nhg when a remote ES is [de]activated in a VRF
There are two changes in this commit -

1. Maintain a list of global MAC-IP routes per-ES. This list is maintained
for quick processing on the following events -
a. When the first VTEP/PE becomes active in the ES-VRF, the L3 NHG is
activated and the route can be sent to zebra.
b. When there are no active PEs in the ES-VRF the L3 NHG is
de-activated and -
- If the ES is present in the VRF -
The route is not installed in zebra as there are no active PEs for
the ES-VRF
- If the ES is not present in the VRF -
The route is installed with a flat multi-path list i.e. without L3NHG.
This is to handle the case where there are no locally attached L2VNIs
on the ES (for that tenant VRF).

2. Reinstall VRF route when an ES is installed or uninstalled in a
tenant VRF (the global MAC-IP list in #1 is used for this purpose also).
If an ES is present in the VRF we use L3NHG to enable fast-failover of
routed traffic.
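
The install decision in points 1 and 2 can be sketched as a small function. This is a simplification under stated assumptions: `demo_es_vrf_state`, `demo_install` and `demo_install_mode` are hypothetical names, and the real code tracks considerably more state per route.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of one MAC-IP route's ES state in a tenant VRF. */
struct demo_es_vrf_state {
	bool es_in_vrf;          /* ES is installed in the tenant VRF */
	unsigned int active_pes; /* active VTEPs/PEs in the ES-VRF */
};

enum demo_install {
	DEMO_INSTALL_NONE,  /* ES in VRF, no active PEs: not installed */
	DEMO_INSTALL_L3NHG, /* ES in VRF, active PEs: route uses the L3 NHG */
	DEMO_INSTALL_FLAT,  /* ES not in VRF: flat multi-path list */
};

static enum demo_install demo_install_mode(const struct demo_es_vrf_state *st)
{
	assert(st != NULL);

	if (!st->es_in_vrf)
		return DEMO_INSTALL_FLAT;

	return st->active_pes > 0 ? DEMO_INSTALL_L3NHG : DEMO_INSTALL_NONE;
}
```

Re-running this decision for every route on the per-ES list is what the events in 1a, 1b and 2 amount to in this simplified model.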

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2021-03-25 17:09:53 -07:00
Anuradha Karuppiah
36dd457465 bgpd: allow routes to be imported if the ES/ES-VRF is not present
In a sym-IRB setup the remote ES may not be installed if the tenant
VRF is not present locally. To allow that case while retaining the
fast-failover benefits for the case where the tenant VRF is locally
present we use the following approach -
1. If ES is present in the tenant VRF we use the L3NHG for installing
the MAC-IP based tenant route. This allows for efficient failover via
L3NHG updates.
2. If the ES is not present locally in the corresponding tenant VRF we
fall back to a non-NHG multi-path based routing approach. In this
case individual routes are updated when the ES links flap.

PS: #1 can be turned off entirely by disabling use-l3-nhg in BGP.

Ticket: CM-30935

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2021-03-25 17:09:53 -07:00
Anuradha Karuppiah
70524092b2 bgpd: on ES down re-advertise the MAC-IP entry without the L3 ECOM
When an ES goes down the MAC-IP route must be updated to remove it from
the tenant VRF routing table. This is because the fast-failover
(via EAD-per-ES withdraw) procedures described in RFC 7432 are only
applicable to L2 forwarding/MAC-ECMP. For L3/routed traffic (in a
sym-IRB setup) failover, individual paths need to be withdrawn.

To handle this difference in L2/L3 requirements BGP updates the MAC-IP
route to include the L3 ECOM if local destination ES is oper-up and
to exclude the L3 ECOM if local ES is oper-down.
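
As a rough illustration of that rule, the sketch below picks the set of extended communities for a MAC-IP route from the oper state of its local ES. The bit names and `demo_macip_ecoms` are hypothetical; what exactly makes up the "L3 ECOM" is not spelled out here, so it is modelled as a single flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_ECOM_L2 (1u << 0) /* communities needed for L2 forwarding */
#define DEMO_ECOM_L3 (1u << 1) /* the "L3 ECOM" of this commit message */

/* Hypothetical local ES; only the oper state matters for the sketch. */
struct demo_local_es {
	bool oper_up;
};

/* The L2 part is always advertised; the L3 part only while the ES is up. */
static unsigned int demo_macip_ecoms(const struct demo_local_es *es)
{
	unsigned int ecoms = DEMO_ECOM_L2;

	assert(es != NULL);
	if (es->oper_up)
		ecoms |= DEMO_ECOM_L3;

	return ecoms;
}
```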

Ticket: CM-30935

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2021-03-25 17:09:53 -07:00
Ameya Dharkar
9c49ac7424 bgpd: Update EVPN type-1 routes when VNI RT changes
1. When VNI export RT changes, for each local es_evi, update local
EAD/ES and EAD/EVI routes and advertise.

2. When VNI import RT changes, uninstall all type-1 routes imported in
the VNI and import routes carrying the updated RT.

Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
2021-03-23 08:40:29 -07:00
David Lamparter
96244aca23 *: require semicolon after DEFINE_QOBJ & co.
Again, see previous commits.

Signed-off-by: David Lamparter <equinox@diac24.net>
2021-03-17 06:18:37 +01:00
Donald Sharp
c0d72166ee bgpd: Convert remaining string output to our internal types
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2021-03-09 19:50:42 -05:00
David Lamparter
1d5453d607 *: remove tabs & newlines from log messages
Neither tabs nor newlines are acceptable in syslog messages.  They also
break line-based parsing of file logs.

Signed-off-by: David Lamparter <equinox@diac24.net>
2021-02-14 15:36:51 +01:00
Donald Sharp
f6e07e1bdf bgpd: Use uint32_t for size value instead of int in ecommunity struct
The `struct ecommunity` structure is using an int for a size value.
Let's switch it over to a uint32_t for size values since a size
value for data can never be negative.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2021-01-18 09:06:49 -05:00
Donatas Abraitis
3a6290bdd1 *: Replace s_addr check against 0 with INADDR_ANY
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2020-12-14 21:03:38 +02:00
Chirag Shah
5bbd2cc1e6 bgpd: fix evpn route-map vni filter at origin
evpn route-map match (filter) on vni is not working
at the origin of the routes.

evpn match vni route checks for encap type as vxlan.
the source route attribute is not set with vxlan encap
thus the match filter wouldn't work.

Ticket:CM-32554
Reviewed By:CCR-11056
Testing Done:

At source have match vni plus set statement in route-map.
Validate the origin of the route's outbound correctly sets
the 'set' statement based on match vni filter.

At origin:
route-map RM-EVPN-TE-Matches permit 10
 match evpn vni 4001
  set large-community 10:10:119

Receiving end:

Route [5]:[0]:[24]:[78.41.1.0] VNI 4001
5550
  27.0.0.15 from TORS1(downlink-5) (27.0.0.15)
    Origin incomplete, metric 0, valid, external, bestpath-from-AS 5550, best (First path received)
    Extended Community: RT:5550:4001 ET:8 Rmac:00:02:00:00:00:4d
    Large Community: 10:10:119    <--- Large community stamped
    Last update: Thu Dec 10 22:19:26 2020

Signed-off-by: Chirag Shah <chirag@nvidia.com>
2020-12-12 14:08:16 -08:00
Anuradha Karuppiah
8bcb09a18c bgpd: Use L3NHGs for symmetric IRB host routes
Two L3 nexthop groups are installed per-VRF per-ES for v4 and v6. These
NHGs are used as an indirect destination for symmetric IRB host routes.

Using L3NHGs allows for efficient failover of an ES (similar to the
use of L2NHGs) i.e. when an ES goes down the number of dataplane
updates are limited to 2xN (where N is the number of tenant VRFs
associated with the ES) instead of updating all host-routes behind the
ES.

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-11-24 11:06:08 -08:00
Anuradha Karuppiah
26c03e43fb bgpd: Handle ES VTEP add/del to a host route
1. MAC-IP routes in the VPN routing table are linked to the
destination ES for efficient handling for remote ES link flaps.
2. Only MAC-IP paths whose nexthops are active (added via EAD-ES)
are imported into the VRF routing table.

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-11-24 11:06:08 -08:00
Anuradha Karuppiah
bbc57c6cfa bgpd: skip VRF import of MAC-IP routes that belong to locally attached hosts
Locally attached hosts are routed via the access ports using the neigh and
fdb/MAC dplane entries.

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-11-24 10:22:48 -08:00
Russ White
2bd9d50ca1
Merge pull request #7523 from donaldsharp/route_map_object_t
*: Remove route_map_object_t from the system
2020-11-17 07:16:12 -05:00
Donatas Abraitis
3dbaf077d4
Merge pull request #7461 from donaldsharp/attribute_setget
Attribute setget
2020-11-16 12:20:40 +02:00
Donald Sharp
6c924775b5 bgpd: Convert attr->evpn_overlay to accessor functions
Convert usage of the attr->evpn_overlay to get/set functionality.
Future commits will allow us to abstract this data to when
we actually need it for the `struct attr`.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2020-11-15 09:49:14 -05:00
Donald Sharp
2a3f51cf6b bgpd: Add accessor for bgp_attr.pmsi_tnl_type
Add an accessor for the bgp_attr.pmsi_tnl_type to allow
us to abstract where it is.  Every attribute is paying
the price of this bit of data as part of `struct bgp_attr`
In the future we'll move it elsewhere.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2020-11-15 09:44:47 -05:00
Donald Sharp
dc52beced1 bgpd: Fix missed unlocks
When iterating over the bgp_dest table, using this pattern:

	for (dest = bgp_table_top(table); dest;
	     dest = bgp_route_next(dest)) {

If the code breaks or returns in the middle we will not have
properly unlocked the node as that bgp_table_top locks the top
dest and bgp_route_next locks the next dest and unlocks the old
dest.

From code inspection I have found a bunch of places that
we either return in the middle of or a break is issued.

Add appropriate unlocks.
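
A toy model of the problem, assuming nothing about bgpd beyond the locking behaviour described above. All names (`demo_dest`, `demo_table_top`, `demo_route_next`, `demo_find`) are hypothetical stand-ins for the real table API, and the table is reduced to a linked list.

```c
#include <assert.h>
#include <stddef.h>

/* Toy reference-counted node; not bgpd's struct bgp_dest. */
struct demo_dest {
	int lock; /* reference count */
	int key;
	struct demo_dest *next;
};

static struct demo_dest *demo_lock(struct demo_dest *d)
{
	if (d)
		d->lock++;
	return d;
}

static void demo_unlock(struct demo_dest *d)
{
	assert(d != NULL && d->lock > 0);
	d->lock--;
}

/* Returns the first node, locked. */
static struct demo_dest *demo_table_top(struct demo_dest *head)
{
	return demo_lock(head);
}

/* Locks the next node, then unlocks the current one. */
static struct demo_dest *demo_route_next(struct demo_dest *d)
{
	struct demo_dest *next = demo_lock(d->next);

	demo_unlock(d);
	return next;
}

/* Returns 1 if key is present, 0 otherwise. */
static int demo_find(struct demo_dest *head, int key)
{
	for (struct demo_dest *d = demo_table_top(head); d;
	     d = demo_route_next(d)) {
		if (d->key == key) {
			/* Leaving early: the iterator still holds a lock. */
			demo_unlock(d);
			return 1;
		}
	}
	return 0;
}
```

Without the `demo_unlock(d)` before the early `return`, the matching node would keep one extra reference after every successful lookup.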

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2020-11-14 15:32:49 -05:00
Donald Sharp
1782514fb9 *: Remove route_map_object_t from the system
The route_map_object_t was being used to track what protocol we were
being called against.  But each protocol was only ever calling itself.
So we had a variable that was only ever being passed in from route_map_apply
that had to be carried around and everyone was testing if that variable
was for their own stack.

Clean up this route_map_object_t from the entire system.  We should
speed some stuff up.  Yes I know not a bunch but this will add up.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2020-11-13 19:35:20 -05:00
Donatas Abraitis
2dbe669bdf *: Convert prefix2str to %pFX
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2020-10-22 09:07:41 +03:00
Donald Sharp
cd7f9b1711
Merge pull request #7323 from ton31337/fix/inet_ntoa_to_pFX_master
bgpd: Convert inet_ntoa to %pI4
2020-10-20 09:10:24 -04:00
Donatas Abraitis
23d0a75356 bgpd: Convert inet_ntoa to %pI4/inet_ntop
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2020-10-18 11:22:30 +03:00
Donald Sharp
c10e14e96d *: Create/Use accessor functions for lock count
Create appropriate accessor functions for the rn->lock
data.  We should be accessing this data through accessor
functions since it is private data to the data structure.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2020-10-17 13:39:10 -04:00
Donatas Abraitis
0dc8647094
Merge pull request #7306 from donaldsharp/bgp_dest_print
Bgp dest print
2020-10-17 20:21:52 +03:00
Donald Sharp
752eed47ef bgpd: Use bgp_dest_get_prefix accessor function
Use the appropriate bgp_dest_get_prefix accessor function

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2020-10-17 08:52:35 -04:00
Donald Sharp
09319b4e0f bgpd: More bgp_node -> bgp_dest cleanup
Some more of the bgp_node usage snuck in from big commits in
the past month or so from feature work.  Do some work
to put it back to bgp_dest for incoming future work.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2020-10-17 08:52:35 -04:00
David Lamparter
56ca3b5b3a bgpd: add %pBD for printing struct bgp_dest *
`%pRN` is not appropriate anymore.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2020-10-17 08:52:35 -04:00
Pat Ruddy
f137734bb4 bgpd: replace bgp_evpn_route2str with prefix2str
Remove bgp_evpn_route2str and replace calls with prefix2str

Signed-off-by: Pat Ruddy <pat@voltanet.io>
2020-10-16 11:54:30 +01:00
Quentin Young
84f22ecc05 bgpd: fix ecom leak handling l3vni update
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2020-09-15 16:06:58 -07:00
Philippe Guibert
4371bf9110 bgpd: remove warnings related to line too long in bgp code
remove warnings related to line too long in bgp code.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2020-08-21 13:37:08 +02:00
Philippe Guibert
34540b0d7f bgpd: fill in local ecommunity context with ecom unit length
Because the same extended community can be used for storing IPv6 and
IPv4 extended communities, the unit length must be stored. Do not forget
to set the standard value in BGP EVPN.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2020-08-21 13:37:08 +02:00
Rafael Zalamena
be8d09f125
Merge pull request #6924 from AnuradhaKaruppiah/mem-fixes
bgpd: fixes for problems found during EVPN fuzzing
2020-08-20 14:12:51 +00:00
Renato Westphal
4fe5bc8c62
Merge pull request #6943 from ton31337/fix/replace_sizeof_instead_of_constant_for_bgp_dump_attr
bgpd: Use sizeof() in bgp_dump_attr()
2020-08-19 07:36:13 -03:00
Donatas Abraitis
5022c8331d bgpd: Use sizeof() in bgp_dump_attr()
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2020-08-18 21:43:07 +03:00
Quentin Young
e121d83163 bgpd: fix bad heap reads in type-2 nlri parsing
Various forms of corrupt packets could trigger reads of garbage heap.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2020-08-15 08:24:59 -07:00
Anuradha Karuppiah
9c7edc03b8 bgpd: Type-2/MAC-IP SYNC route handling
SYNC routes are paths rxed from a local-ES peer. These routes result in
the installation of local dataplane entries i.e. with access port as
destination (vs. the remote-VTEP destination that results in the packet
being sent via the VxLAN overlay).

If a SYNC path is selected as the best path it is always turned around
into a local path which immediately lowers the status of the SYNC path
to non-best. However we need to keep track of the highest MM seq-number
and peer activity to continue advertising the local path. In order to
do that we need information from the "second-best" SYNC path to be
bubbled up to the local best path. This "SYNC" info is then consolidated
and sent to zebra which is responsible for the MM handling and local
path management.
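
A minimal sketch of the consolidation step, assuming a flat array of paths per route. The structures and `demo_consolidate_sync` are hypothetical; in particular, "peer activity" is reduced to a single boolean here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-path state; not bgpd's path info. */
struct demo_path {
	bool sync;        /* path was received from a local-ES peer */
	bool peer_active; /* the peer reports the host as active */
	uint32_t mm_seq;  /* MAC mobility sequence number */
};

/* What gets bubbled up to the local best path and sent to zebra. */
struct demo_sync_info {
	uint32_t max_seq;
	bool peer_active;
};

/* Fold all SYNC paths into one summary; non-SYNC paths are ignored. */
static struct demo_sync_info
demo_consolidate_sync(const struct demo_path *paths, size_t n)
{
	struct demo_sync_info info = { 0, false };

	assert(paths != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (!paths[i].sync)
			continue;
		if (paths[i].mm_seq > info.max_seq)
			info.max_seq = paths[i].mm_seq;
		if (paths[i].peer_active)
			info.peer_active = true;
	}
	return info;
}
```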

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-08-05 06:46:13 -07:00
Anuradha Karuppiah
c44ab6f1f3 bgpd: support for Ethernet Segments and Type-1/EAD routes
This is the base patch that brings in support for Type-1 routes.
It includes support for -
- Ethernet Segment (ES) management
- EAD route handling
- MAC-IP (Type-2) routes with a non-zero ESI i.e. Aliasing for
  active-active multihoming
- Initial infra for consistency checking. Consistency checking
  is a fundamental feature for active-active solutions like MLAG.
  We will try to leverage the info in the EAD-ES/EAD-EVI routes to
  detect inconsistencies in access config across VTEPs attached to
  the same Ethernet Segment.

Functionality Overview -
========================
1. Ethernet segments are created in zebra and associated with
access VLANs. zebra sends that info as ES and ES-EVI objects to BGP.
2. BGP advertises EAD-ES and EAD-EVI routes for the locally attached
ethernet segments.
3. Similarly BGP processes EAD-ES and EAD-EVI routes from peers
and translates them into ES-VTEP objects which are then sent to zebra
as remote ESs.
4. Each ES in zebra is associated with a list of active VTEPs which
is then translated into a L2-NHG (nexthop group). This is the ES
"Alias" entry
5. MAC-IP routes with a non-zero ESI use the alias entry created in
(4.) to forward traffic i.e. a MAC-ECMP is done to these remote-ES
destinations.

EAD route management (route table and key) -
============================================
1. Local EAD-ES routes
   a. route-table: per-ES route-table
      key: {RD=ES-RD, ESI, ET=0xffffffff, VTEP-IP}
   b. route-table: per-VNI route-table
      Not added
   c. route-table: global route-table
      key: {RD=ES-RD, ESI, ET=0xffffffff}

2. Remote EAD-ES routes
   a. route-table: per-ES route-table
      Not added
   b. route-table: per-VNI route-table
      key: {RD=ES-RD, ESI, ET=0xffffffff, VTEP-IP}
   c. route-table: global route-table
      key: {RD=ES-RD, ESI, ET=0xffffffff}

3. Local EAD-EVI routes
   a. route-table: per-ES route-table
      Not added
   b. route-table: per-VNI route-table
      key: {RD=0, ESI, ET=0, VTEP-IP}
   c. route-table: global route-table
      key: {RD=L2-VNI-RD, ESI, ET=0}

4. Remote EAD-EVI routes
   a. route-table: per-ES route-table
      Not added
   b. route-table: per-VNI route-table
      key: {RD=0, ESI, ET=0, VTEP-IP}
   c. route-table: global route-table
      key: {RD=L2-VNI-RD, ESI, ET=0}
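
The table placement and key layout listed above can be summarised in a few helper functions. This is only a restatement of the list in C; the enum, the bit names and the functions are hypothetical and do not appear in bgpd.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum demo_ead_type { DEMO_EAD_ES, DEMO_EAD_EVI };

#define DEMO_TBL_ES     (1u << 0) /* per-ES route-table */
#define DEMO_TBL_VNI    (1u << 1) /* per-VNI route-table */
#define DEMO_TBL_GLOBAL (1u << 2) /* global route-table */

/* EAD-ES keys use ET=0xffffffff, EAD-EVI keys use ET=0. */
static uint32_t demo_ead_eth_tag(enum demo_ead_type type)
{
	assert(type == DEMO_EAD_ES || type == DEMO_EAD_EVI);
	return type == DEMO_EAD_ES ? 0xffffffffu : 0;
}

/* Which route-tables hold a given EAD route, per the list above. */
static unsigned int demo_ead_tables(enum demo_ead_type type, bool local)
{
	assert(type == DEMO_EAD_ES || type == DEMO_EAD_EVI);
	if (type == DEMO_EAD_ES && local)
		return DEMO_TBL_ES | DEMO_TBL_GLOBAL;
	return DEMO_TBL_VNI | DEMO_TBL_GLOBAL;
}

/* Only the global table's key omits the VTEP-IP. */
static bool demo_ead_key_has_vtep_ip(unsigned int table)
{
	return table != DEMO_TBL_GLOBAL;
}
```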

Please refer to bgp_evpn_mh.h for info on how the data-structures are
organized.

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-08-05 06:46:12 -07:00
Anuradha Karuppiah
0a50c24813 bgpd: attr changes for EAD routes
Add ESI as an inline attribute field along with the other EVPN
attributes. This may be re-worked when the rest of the EVPN
attributes find a new home.

Some cleanup has been done to get rid of stale/unused references
to ESI. And also to consolidate duplicate definitions of ES ID
types.

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-08-05 06:46:12 -07:00
Anuradha Karuppiah
185fb14a41 bgpd: pull the multihoming code out to a separate file
Re-org only; no other code changes. This is being done to make maintenance
of MH functionality (which will have more code added to it) easy.

The code moved here was originally committed via -
'commit 50f74cf131 ("*: support for evpn type-4 route")'

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-08-05 06:46:12 -07:00
Russ White
4267c07425
Merge pull request #6628 from adharkar/frr-master-evpn_rt
bgpd: Incorrect auto-RT formed when L3VNI is not configured
2020-07-05 16:07:10 -04:00
Donald Sharp
9bcb3eef54 bgp: rename bgp_node to bgp_dest
This is the bulk part extracted from "bgpd: Convert from `struct
bgp_node` to `struct bgp_dest`".  It should not result in any functional
change.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2020-06-23 17:32:52 +02:00
Ameya Dharkar
ebdc9e64c3 bgpd: Incorrect auto-RT formed when L3VNI is not configured
We use ASN:VNI format to calculate auto RT for L3VNI.
When L3VNI is not configured, if we delete the configured RT, incorrect auto-RT
value is generated as VRF VNI is 0.

Fix:
Do not configure auto-RT if L3VNI is not configured.
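
The guard can be sketched as follows. `demo_auto_rt` is a hypothetical helper, and the string form simply mirrors the `RT:101:0` lines in the output below; how bgpd actually encodes the RT is not shown here.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format an ASN:VNI auto RT into buf. Returns false and leaves buf empty
 * when no L3VNI is configured (VNI 0), so that no "RT:<asn>:0" is formed.
 */
static bool demo_auto_rt(uint32_t asn, uint32_t l3vni, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);

	if (l3vni == 0) {
		buf[0] = '\0';
		return false;
	}

	snprintf(buf, len, "RT:%" PRIu32 ":%" PRIu32, asn, l3vni);
	return true;
}
```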

Trigger:
1. Delete L3VNI
2. Delete configured RT.

Before fix:

dev# sh bgp vrf vrf-blue vni
BGP VRF: vrf-blue
  Local-Ip: 10.100.0.1
  L3-VNI: 0
  Rmac: 00:00:00:00:00:00
  VNI Filter: none
  L2-VNI List:

  Export-RTs:
  RT:101:0
  Import-RTs:
  RT:101:0
  RD: 10.100.0.1:2

After fix:

dev# sh bgp vrf vrf-blue vni
BGP VRF: vrf-blue
  Local-Ip: 10.100.0.1
  L3-VNI: 0
  Rmac: 00:00:00:00:00:00
  VNI Filter: none
  L2-VNI List:

  Export-RTs:

  Import-RTs:

  RD: 10.100.0.1:2

Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
2020-06-22 16:38:48 -07:00
Russ White
eeec40ba69
Merge pull request #6375 from adharkar/frr-master-l3vni_label
bgpd: EVPN RT-2 advertised with 2 labels for prefix-routes-only config
2020-05-26 12:14:16 -04:00
Sri Mohana Singamsetty
06fba5cb4c
Merge pull request #6463 from vivek-cumulus/evpn_extend_nht
bgpd: Extend EVPN next hop tracking for additional EVPN routes
2020-05-26 08:18:29 -07:00
vivek
e11329ca4c bgpd: Extend EVPN next hop tracking for additional EVPN routes
Extend the next hop tracking for type-2 and type-3 EVPN routes also.

Updates: "bgpd: Add nexthop of received EVPN RT-5 for nexthop tracking"
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
2020-05-25 23:00:49 -07:00
vivek
5f0c5ec85d bgpd: Minor tweaks to EVPN route-import debugs
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
2020-05-25 14:06:10 -07:00
Ameya Dharkar
10f70510b9 bgpd: EVPN RT-2 advertised with 2 labels for prefix-routes-only config
L3VNI is configured with "prefix-routes-only" flag. Even in this case,
intermittently, we observed that local EVPN MACIP routes are installed and
advertised with 2 labels and 2 export RTs.

This is a sequencing issue. Consider following case where L2VNI 200 and L3VNI
1000 are configured for tenant vrf vrf-blue.

Bug is observed for following sequence of events:
1. vrf-blue BGP instance is created.
2. L2VNI is created in bgp for vni 200. It is linked to the tenant vrf vrf-blue
in function bgpevpn_link_to_l3vni.
Following code sets "VNI_FLAG_USE_TWO_LABELS" flag for vni 200 as L3VNI is not
yet attached to vrf-blue BGP instance.

/* check if we are advertising two labels for this vpn */
if (!CHECK_FLAG(bgp_vrf->vrf_flags, BGP_VRF_L3VNI_PREFIX_ROUTES_ONLY))
	SET_FLAG(vpn->flags, VNI_FLAG_USE_TWO_LABELS);

3. Now L3VNI is attached to vrf-blue BGP instance. In this case, we set
BGP_VRF_L3VNI_PREFIX_ROUTES_ONLY flag for vrf-blue but we do not clear
VNI_FLAG_USE_TWO_LABELS flag set on the corresponding L2VNIs.

This fix resolves following 2 issues observed above.
1. When L2VNI is created in BGP, flag VNI_FLAG_USE_TWO_LABELS should not be set
for this VNI if BGP vrf is not attached to any L3VNI.
2. When L3VNI is attached to the BGP vrf, set "VNI_FLAG_USE_TWO_LABELS" flag
if "prefix-routes-only" is not set for the vrf.
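
A minimal sketch of the intended flag logic, assuming one helper is run on both events (L2VNI creation and L3VNI attach). The `DEMO_*` flags, structures and `demo_update_two_labels` are hypothetical simplifications of the real flags named above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_VRF_PREFIX_ROUTES_ONLY (1u << 0)
#define DEMO_VNI_USE_TWO_LABELS     (1u << 0)

/* Hypothetical, reduced views of the tenant VRF and the L2VNI. */
struct demo_vrf {
	uint32_t l3vni; /* 0 means no L3VNI is attached */
	unsigned int flags;
};

struct demo_vni {
	unsigned int flags;
};

/*
 * Two labels are advertised only when an L3VNI is attached and the VRF
 * is not in prefix-routes-only mode; otherwise the flag is cleared.
 */
static void demo_update_two_labels(const struct demo_vrf *vrf,
				   struct demo_vni *vpn)
{
	assert(vrf != NULL && vpn != NULL);

	if (vrf->l3vni != 0 && !(vrf->flags & DEMO_VRF_PREFIX_ROUTES_ONLY))
		vpn->flags |= DEMO_VNI_USE_TWO_LABELS;
	else
		vpn->flags &= ~DEMO_VNI_USE_TWO_LABELS;
}
```

Because the helper recomputes the flag from the current VRF state instead of only ever setting it, the order in which the L2VNI and L3VNI show up no longer matters in this model.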

UT cases:
1. Flap "prefix-routes-only" config for a vrf.
2. Test following triggers for vrfs with and without "prefix-routes-only"
   - Flap L2VNI from kernel.
   - Flap L3VNI from kernel.

Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
2020-05-08 21:10:10 -07:00
Quentin Young
772270f3b6 *: sprintf -> snprintf
Replace sprintf with snprintf where straightforward to do so.

- sprintf's into local scope buffers of known size are replaced with the
  equivalent snprintf call
- snprintf's into local scope buffers of known size that use the buffer
  size expression now use sizeof(buffer)
- sprintf(buf + strlen(buf), ...) replaced with snprintf() into temp
  buffer followed by strlcat
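
The replacement patterns can be illustrated on a made-up function. `demo_describe` and `demo_append` are hypothetical; `demo_append` is a local bounded append used here in place of `strlcat`, which is not provided by every libc.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Bounded append; stands in for strlcat(dst, src, dstsize). */
static void demo_append(char *dst, size_t dstsize, const char *src)
{
	size_t used = strlen(dst);

	assert(used < dstsize);
	snprintf(dst + used, dstsize - used, "%s", src);
}

/* Writes "<vrf>:<vni>" into out and returns the resulting length. */
static size_t demo_describe(const char *vrf, int vni, char *out,
			    size_t outsize)
{
	char buf[16];
	char tmp[16];

	assert(out != NULL && outsize > 0);

	/* was: sprintf(buf, "%s", vrf); */
	snprintf(buf, sizeof(buf), "%s", vrf);

	/* was: sprintf(buf + strlen(buf), ":%d", vni); */
	snprintf(tmp, sizeof(tmp), ":%d", vni);
	demo_append(buf, sizeof(buf), tmp);

	snprintf(out, outsize, "%s", buf);
	return strlen(out);
}
```

With the bounded calls, an over-long VRF name is truncated to fit `buf` instead of overflowing it.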

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2020-04-20 19:14:33 -04:00