Commit Graph

380 Commits

Author SHA1 Message Date
Donatas Abraitis
fd43ffd974 bgpd: Do not forget to unlock bgp_dest from update_advertise_vni_routes
If "unexpected" happens.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-09-06 11:49:08 +03:00
Donatas Abraitis
3727e359e3 bgpd: Cleanup memory for missing hashes
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-08-30 13:46:26 +03:00
Donatas Abraitis
511211bf56 bgpd: Convert prefix2str to %pFX
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-08-25 14:35:27 +03:00
Donald Sharp
083ec940ab bgpd: Convert from bgp_clock() to monotime()
Let's convert to our actual library call instead
of using yet another abstraction that makes it fun
for people to switch daemons.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-08-24 08:23:40 -04:00
Trey Aspelund
7226bc40d6 bgpd: ignore NEXT_HOP for MP_REACH_NLRI
RFC 4760 states we SHOULD ignore the NEXT_HOP attribute for BGP Update
messages carrying only MP_REACH_NLRI attributes. Thus we should use the
Network Address of Next Hop field of the MP_REACH_NLRI as the nexthop.

Instead of always looking for BGP_ATTR_NEXT_HOP, this commit ensures:
1) we set mp_nexthop_len to BGP_ATTR_NHLEN_IPV4 for v4 bgp_static routes
2) we check mp_nexthop_len when choosing the nexthop to use for nht
3) we check mp_nexthop_len when choosing the nexthop to send to zebra
4) we check mp_nexthop_len when picking the nexthop to shown by vtysh

Reported-by: Binon Gorbutt <binon@aervivo.com>
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
2022-08-04 20:36:49 +00:00
Donald Sharp
35aae5c9bc bgpd: LL peers need bnc's per peer
FRR should create a bnc per peer.  Not have
one's that write over others.  Currently when
FRR has multiple Interface based peering, BGP wa
creating a single BNC.  This is insufficient in that
we were accidently overwriting the one LL with other
data.  This causes issues when there are multiple and
there is weird starting issues with those interfaces
that you are peering over.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-07-22 09:09:39 -04:00
anlan_cs
2304139a62 bgpd: fix missing rmac value in debug
`attr.rmac` is not set in debug as expected for its wrong place in code.

Just move the debug process (`bgp_debug_zebra(NULL)`) after possible `rmac`
value is set.

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2022-07-08 00:27:00 -04:00
Donatas Abraitis
0f05ea43b0 bgpd: Initialize attr->local_pref to the configured default value
When we use network/redistribute local_preference is configured inproperly
when using route-maps something like:

```
network 100.100.100.100/32 route-map rm1
network 100.100.100.200/32 route-map rm2

route-map rm1 permit 10
 set local-preference +10
route-map rm2 permit 10
 set local-preference -10
```

Before:
```
root@spine1-debian-11:~# vtysh -c 'show bgp ipv4 unicast 100.100.100.100/32 json' | jq '.paths[].locPrf'
10
root@spine1-debian-11:~# vtysh -c 'show bgp ipv4 unicast 100.100.100.200/32 json' | jq '.paths[].locPrf'
0
```

After:
```
root@spine1-debian-11:~# vtysh -c 'show bgp ipv4 unicast 100.100.100.100/32 json' | jq '.paths[].locPrf'
110
root@spine1-debian-11:~# vtysh -c 'show bgp ipv4 unicast 100.100.100.200/32 json' | jq '.paths[].locPrf'
90
```

Set local-preference as the default value configured per BGP instance, but
do not set LOCAL_PREF flag by default.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-06-06 10:28:50 +03:00
Donatas Abraitis
6006b807b1 *: Properly use memset() when zeroing
Wrong: memset(&a, 0, sizeof(struct ...));
    Good:  memset(&a, 0, sizeof(a));

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-05-11 14:08:47 +03:00
Donatas Abraitis
b5605493a4 bgpd: Use sizeof() for memset instead of numeric
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-05-11 13:10:41 +03:00
Russ White
fb10a9479b
Merge pull request #11162 from anlancs/fix/bgpd-cleanup-5
bgpd: remove unnecessary check for evpn
2022-05-09 14:43:03 -04:00
mobash-rasool
d4caf64ef7
Merge pull request #11170 from anlancs/fix/bgpd-cleanup-8
bgpd: remove one unnecessary parameter for evpn-mh
2022-05-09 22:42:22 +05:30
anlan_cs
e0a798819b bgpd: remove one unnecessary parameter for evpn-mh
The "add" parameter of `bgp_evpn_mh_route_update()` makes no sense.
Just remove it to clarify this function, and remove the relevant check
with "add" as well.

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2022-05-09 08:27:20 -04:00
anlan_cs
879e43a550 bgpd: remove unnecessary check for evpn
When `bgp_evpn_new()` is called, the `bgp` parameter MUST be non-NULL,
remove this unnecessary check and remove the NULL check for returned
`struct bgpevpn *`, which should be non-NULL.

And modify `import_rt_new()` in the same way.

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2022-05-08 09:25:12 -04:00
anlan_cs
17151ae94b bgpd: clear misleading mismatched check
Two changes for `delete_global_type2_routes()`:

1) Remove check of `bgp_dest_has_bgp_path_info_data(rddest)`.
It is unnecessary(`dest->info` should not be NULL) and misleading.
`if (rddest && bgp_dest_has_bgp_path_info_data(rddest))`
Use (locked) node with this check, but unlock with `if (rddest)`,
The mismatched condition is misleading, there seems to be a
mistake to extra unlock.
Just make it clear, immediately exit with `(!rddest)`.

2) Remove checking returned value for it, and use `void` as return type.
It is unnecessary and wrong. Even the check failed, it should continue
to delete other types of routes.
Just remove the check and go through.

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2022-05-07 22:20:15 -04:00
anlan_cs
8e3aae66ce *: remove the checking returned value for hash_get()
Firstly, *keep no change* for `hash_get()` with NULL
`alloc_func`.

Only focus on cases with non-NULL `alloc_func` of
`hash_get()`.

Since `hash_get()` with non-NULL `alloc_func` parameter
shall not fail, just ignore the returned value of it.
The returned value must not be NULL.
So in this case, remove the unnecessary checking NULL
or not for the returned value and add `void` in front
of it.

Importantly, also *keep no change* for the two cases with
non-NULL `alloc_func` -
1) Use `assert(<returned_data> == <searching_data>)` to
   ensure it is a created node, not a found node.
   Refer to `isis_vertex_queue_insert()` of isisd, there
   are many examples of this case in isid.
2) Use `<returned_data> != <searching_data>` to judge it
   is a found node, then free <searching_data>.
   Refer to `aspath_intern()` of bgpd, there are many
   examples of this case in bgpd.

Here, <returned_data> is the returned value from `hash_get()`,
and <searching_data> is the data, which is to be put into
hash table.

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2022-05-03 00:41:48 +08:00
anlan_cs
ac390ef8ac bgpd: fix memory leak for evpn
If `hash_get()` returns NULL, the list created with
`list_new()` is not be freed.

Since `hash_get()` should not fail, we don't need
`list_delete()` and other boring `XFREE()`s for its
failure case.

Just ignore returning value of `hash_get()` in these
cases.

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2022-05-03 00:41:48 +08:00
anlan_cs
d74a6cc126 bgpd: optimize "auto_rt" searching procedure for evpn
RT value will be unique across different VNIs but the
same across routers (in the same AS) for a particula
VNI.

It is unique, so add `break` for search procedure.

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2022-04-25 04:36:18 -04:00
anlan_cs
671ec57621 bgpd: minor style change
Correct two style places and one comment.

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2022-04-25 04:33:44 -04:00
Donatas Abraitis
f5327fc339
Merge pull request #11012 from anlancs/bgpd-mh-simplify-condition
zebra: simplify one check for evpn-mh
2022-04-19 13:04:43 +03:00
Donatas Abraitis
58cf5c088a bgpd: Reuse bgp_attr_set_ecommunity() for setting attribute flags
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-04-12 22:09:28 +03:00
anlan_cs
fff7545a03 bgpd: correct a few comments for evpn-mh
Correct a few evpn-mh omissions mainly on type-1 and type-4.

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2022-04-12 09:33:06 -04:00
Russ White
e3e6510b87
Merge pull request #10820 from donaldsharp/evpn_route_frag
Evpn route frag
2022-03-20 15:49:16 -04:00
Russ White
d5a58008dc
Merge pull request #10816 from anlancs/fix-bgdp-local-es-rt
bgpd: fix wrong check on local es routes
2022-03-19 15:04:58 -04:00
Anuradha Karuppiah
f4a5218dc6 bgpd: evpn mh changes to advertise EAD routes with user configured export-rt
This is an alternate to EAD route fragmenation and allows the user to limit
the route to a single UPDATE (<4K) independent of the number of EVIs.

Sample config (add one l2-vni RT from each VRF) -
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
!
router bgp 5556
 !
 address-family l2vpn evpn
  ead-es-route-target export 5556:1001
  ead-es-route-target export 5556:1004
  ead-es-route-target export 5556:1008
 exit-address-family
!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Sample route
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
   Network          Next Hop            Metric LocPrf Weight Path
*> [1]:[4294967295]:[03:44:38:39:ff:ff:01:00:00:01]:[32]:[27.0.0.21]
                    27.0.0.21                          32768 i
                    ET:8 ESI-label-Rt:AA RT:5556:1001 RT:5556:1004 RT:5556:1008
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

When configured, the ead-es-route-target is used instead of
the auto-generated version that includes all associated EVI's RTs.

Ticket: #2632967

Signed-off-by: Anuradha Karuppiah <anuradhak@nvidia.com>
2022-03-18 07:33:12 -04:00
anlan_cs
e57e63eb40 bgpd: fix wrong check on local es routes
Importing local es routes should be skipped. But the check of it is a bit wrong.
It is ok that local es routes can't be imported, but importing local es will
wrongly enter `uninstall` procedure.

Just adjust this check to make it clear. Immediately return in the case
of importing local es routes.

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2022-03-17 21:46:44 +08:00
anlan_cs
d58f056ab7 bgpd: remove dead code
`bgp_evpn_import_route_in_vrfs()` is special ( l2vpn ) form of
`install_uninstall_evpn_route() with `AFI_L2VPN` and `SAFI_EVPN` family.

No caller, just remove it.

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2022-03-16 21:06:46 +08:00
Donatas Abraitis
b53e67a389 bgpd: Use bgp_attr_[sg]et_ecommunity for struct ecommunity
This is an extra work before moving attr->ecommunity to attra_extra struct.

Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2022-02-04 15:56:20 +02:00
Igor Ryzhov
3b216639d7
Merge pull request #10430 from ton31337/fix/addpath_maximum-prefix-out
bgpd: Add bgp_check_selected() helper and just consistency changes
2022-02-01 18:38:57 +03:00
Donatas Abraitis
be92fc9f1a bgpd: Convert bgp_addpath_encode_[tr]x() to bool from int
Rename addpath_encode[d] to addpath_capable to be consistent.

Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2022-02-01 13:31:16 +02:00
Iqra Siddiqui
761cc919fa bgpd: Fixing memcmp to avoid coverity issue
Description:
Replacing memcmp at certain places,
to avoid the coverity issues caused by it.

Co-authored-by: Kantesh Mundargi <kmundaragi@vmware.com>
Signed-off-by: Iqra Siddiqui <imujeebsiddi@vmware.com>
2022-01-31 21:50:50 -08:00
Igor Ryzhov
860e740b36 bgpd: replace custom union gw_addr with struct ipaddr
BGP EVPN custom `union gw_addr` is basically the same thing as a common
`struct ipaddr` but it lacks the address family which is needed in some
cases.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2022-01-19 23:13:04 +03:00
Anuradha Karuppiah
23aa35ade5 bgpd: initial batch of evpn lttng tracepoints
Low overhead bgp-evpn TPs have been added which push data out in a binary
format -
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
root@switch:~# lttng list --userspace |grep "frr_bgp:evpn"
      frr_bgp:evpn_mh_nh_rmac_zsend (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint)
      frr_bgp:evpn_mh_nh_zsend (loglevel: TRACE_INFO (6)) (type: tracepoint)
      frr_bgp:evpn_mh_nhg_zsend (loglevel: TRACE_INFO (6)) (type: tracepoint)
      frr_bgp:evpn_mh_vtep_zsend (loglevel: TRACE_INFO (6)) (type: tracepoint)
      frr_bgp:evpn_bum_vtep_zsend (loglevel: TRACE_INFO (6)) (type: tracepoint)
      frr_bgp:evpn_mac_ip_zsend (loglevel: TRACE_INFO (6)) (type: tracepoint)
root@switch:~#
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

In addition to the tracepoints a babeltrace python plugin for pretty
printing (binary data is converted into grepable strings). Sample usage -
frr_babeltrace.py trace_path

Sample tracepoint output -
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
1. frr_bgp: evpn_mac_ip_zsend
frr_bgp:evpn_mac_ip_zsend {'action': 'add', 'vni': 1007, 'mac': '00:02:00:00:00:04', 'ip': 'fe80::202:ff:fe00:4', 'vtep': '27.0.0.15', 'esi': '03:44:38:39:ff:ff:01:00:00:02'}

2. frr_bgp: evpn_mh_vtep_zsend
frr_bgp:evpn_mh_vtep_zsend {'action': 'add', 'esi': '03:44:38:39:ff:ff:01:00:00:02', 'vtep': '27.0.0.16'}

3. frr_bgp: evpn_mh_nhg_zsend
frr_bgp:evpn_mh_nhg_zsend {'action': 'add', 'type': 'v4', 'nhg': 74999998, 'esi': '03:44:38:39:ff:ff:01:00:00:02', 'vrf': 85}

4. frr_bgp: evpn_mh_nh_zsend
frr_bgp:evpn_mh_nh_zsend {'nhg': 74999998, 'vtep': '27.0.0.16', 'svi': 93}

5. frr_bgp: evpn_mh_nh_rmac_zsend
frr_bgp:evpn_mh_nh_rmac_zsend {'action': 'add', 'vrf': 85, 'nh': '::ffff:1b00:12', 'rmac': '00:02:00:00:00:50'}
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Signed-off-by: Anuradha Karuppiah <anuradhak@nvidia.com>
2021-10-01 09:02:25 -07:00
Sri Mohana Singamsetty
2e2d2be87f
Merge pull request #9422 from pguibert6WIND/update_autort_l3vni
bgpd: update auto route target for l3vni appropriately
2021-09-28 09:15:34 -07:00
Anuradha Karuppiah
fa46a5cd45 bgpd: Extend EVPN next hop tracking to type-1 and type-4 routes
NH tracking is already in use for type-1, type-3 and type-5 routes.
This change extends that tracking to EAD and ESR to eliminate the 9s
delay (BGP holdtimer) with ES/L2-NHG update seen when all the uplinks
are shutdown on a remote EVPN PE.

Ticket: #2682896

Signed-off-by: Anuradha Karuppiah <anuradhak@nvidia.com>
2021-09-14 09:07:59 -07:00
Philippe Guibert
4204021e46 bgpd: update auto route target for l3vni appropriately
The BGP configuration for BGP EVPN RT5 setup consists in mainly
2 bgp instances (eventually one is enough) and L3VNI config.

When L3VNI is configured before BGP instances, and BGP route
targets are auto derived as per rfc8365, then, the obtained
route targets are wrong. For instance, the following can be
obtained:

=> show bgp vrf cust1 vni
BGP VRF: cust1
  Local-Ip: 10.209.36.1
  L3-VNI: 1000
  Rmac: da:85:42:ba:2a:e9
  VNI Filter: none
  L2-VNI List:

  Export-RTs:
    RT:12757:1000
  Import-RTs:
    RT:12757:1000
  RD: 65000:1000

whereas the derived route targets should be the below
ones:

=> show bgp vrf cust1 vni
BGP VRF: cust1
  Local-Ip: 10.209.36.1
  L3-VNI: 1000
  Rmac: 72:f3:af:a0:98:80
  VNI Filter: none
  L2-VNI List:

  Export-RTs:
    RT:12757:268436456
  Import-RTs:
    RT:12757:268436456
  RD: 65000:1000

There is an update handler that updates appropriately L2VNIs.
But this is not the case for L3VNIs. Add the missing code.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2021-09-02 19:34:56 +00:00
Philippe Guibert
56c70d8714 bgpd: imported evpn rt5 routes copy igpmetric
when doing BGP over an IGP platform, the expectation is that
the path calculation for a given prefix takes into account the
igpmetric given by IGP.
This is true with prefixes obtained in a given BGP instance where
peering occurs. For instance, ipv4 unicast entries or l2vpn evpn
entries work this way. The igpmetric is obtained through nexthop
tracking, like below:

south-vm# show bgp nexthop
Current BGP nexthop cache:
 1.1.1.1 valid [IGP metric 10], #paths 1, peer 1.1.1.1
 2.2.2.2 valid [IGP metric 20], #paths 1, peer 2.2.2.2

The igp metric is taken into account when doing best path
selection, and only the entry with lowest igp wins.

[..]
*>i[5]:[0]:[32]:[5.5.5.5]
                    1.1.1.1                  0    100      0 ?
                    RT:65400:268435556 ET:8 Rmac:2e:22:6c:67:bb:73
* i                 2.2.2.2                  0    100      0 ?
                    RT:65400:268435556 ET:8 Rmac:f2:d3:68:4e:f4:ed

however, for imported EVPN RT5 entries, the igpmetric was not
copied from the parent path info. Fix it. In this way, the
imported route entries use the igpmetric of the parent pi.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2021-08-17 16:50:05 +02:00
Donatas Abraitis
90737805d9
Merge pull request #8956 from pguibert6WIND/bgp_loop_through_itself
bgpd: prevent routes loop through itself
2021-07-21 09:28:21 +03:00
Donald Sharp
6e95e6419c
Merge pull request #9039 from ton31337/fix/no_need_to_check_for_null_in_cmp_functions
bgpd: Do not check for NULL values for vni_hash_cmp()
2021-07-13 16:42:23 -04:00
Donatas Abraitis
ce40c6279a bgpd: Do not check for NULL values for vni_hash_cmp()
There is no need to test for null values in the hash compare
function as that we are guaranteed to send in data in
the hash compare functions.

Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-07-13 08:46:52 +03:00
Donald Sharp
63245a641a bgpd: hash compare functions never receive null values
There is no need to test for null values in the hash compare
function as that we are guaranteed to send in data in
the hash compare functions.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2021-07-12 14:23:51 -04:00
Philippe Guibert
654a5978f6 bgpd: prevent routes loop through itself
Some BGP updates received by BGP invite local router to
install a route through itself. The system will not do it, and
the route should be considered as not valid at the earliest.

This case is detected on the zebra, and this detection prevents
from trying to install this route to the local system. However,
the nexthop tracking mechanism is called, and acts as if the route
was valid, which is not the case.

By detecting in BGP that use case, we avoid installing the invalid
routes.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2021-07-12 13:57:36 +02:00
Donatas Abraitis
8643c2e5f7 *: Replace 4/16 integers to IPV4_MAX_BYTELEN/IPV6_MAX_BYTELEN
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-07-01 23:54:39 +03:00
Donald Sharp
b2c42dda96
Merge pull request #8853 from ton31337/fix/bgp_dest_lock_unlock
bgpd: Make sure we don't miss to unlock for bgp_dest before returning
2021-06-23 07:59:32 -04:00
Donatas Abraitis
e71ad4b64e bgpd: Make sure we don't miss to unlock for bgp_dest before returning
bgp_node_lookup() increases `lock` which is not decreased on return.

Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-06-22 23:14:47 +03:00
Ameya Dharkar
dc6cef732e bgpd: Add CLI for overlay index recursive resolution
Gateway IP overlay index of the remote type-5 route is resolved
recursively using remote type-2 route. For the purpose of this
recursive resolution, for each L2VNI, we build a hash table of the
remote IP addresses received by remote type-2 routes.
For the topologies where overlay index resolution is not needed, we
do not need to build this remote-ip-hash.

Thus, make the recursive resolution of the overlay index conditional on
"enable-resolve-overlay-index" configuration.

router bgp 65001
 bgp router-id 192.168.100.1
 neighbor 10.0.1.2 remote-as 65002
!
 address-family l2vpn evpn
  neighbor 10.0.1.2 activate
  advertise-all-vni
  enable-resolve-overlay-index----------> New configuration
 exit-address-family

Gateway IP overlay index will be resolved only if this configuration is present.

Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
2021-06-07 17:59:45 -07:00
Ameya Dharkar
021b659665 bgpd: EVPN route type-5 to type-2 recursive resolution using gateway IP
When EVPN prefix route with a gateway IP overlay index is imported into the IP
vrf at the ingress PE, BGP nexthop of this route is set to the gateway IP.
For this vrf route to be valid, following conditions must be met.
- Gateway IP nexthop of this route should be L3 reachable, i.e., this route
  should be resolved in RIB.
- A remote MAC/IP route should be present for the gateway IP address in the
  EVI(L2VPN table).

To check for the first condition, gateway IP is registered with nht (nexthop
tracking) to receive the reachability notifications for this IP from zebra RIB.
If the gateway IP is reachable, zebra sends the reachability information (i.e.,
nexthop interface) for the gateway IP.
This nexthop interface should be the SVI interface.

Now, to find out type-2 route corresponding to the gateway IP, we need to fetch
the VNI for the above SVI.

To do this VNI lookup effitiently, define a hashtable of struct bgpevpn with
svi_ifindex as key.

struct hash *vni_svi_hash;

An EVI instance is added to vni_svi_hash if its svi_ifindex is nonzero.

Using this hash, we obtain struct bgpevpn corresponding to the gateway IP.

For gateway IP overlay index recursive lookup, once we find the correct EVI, we
have to lookup its route table for a MAC/IP prefix. As we have to iterate the
entire route table for every lookup, this lookup is expensive. We can optimize
this lookup by adding all the remote IP addresses in a hash table.

Following hash table is defined for this purpose in struct bgpevpn
Struct hash *remote_ip_hash;

When a MAC/IP route is installed in the EVI table, it is also added to
remote_ip_hash.

It is possible to have multiple MAC/IP routes with the same IP address because
of host move scenarios. Thus, for every address addr in remote_ip_hash, we
maintain list of all the MAC/IP routes having addr as their IP address.

Following structure defines an address in remote_ip_hash.
struct evpn_remote_ip {
        struct ipaddr addr;
        struct list *macip_path_list;
};

A Boolean field is added to struct bgp_nexthop_cache to indicate that the
nexthop is EVPN gateway IP overlay index.

bool is_evpn_gwip_nexthop;

A flag BGP_NEXTHOP_EVPN_INCOMPLETE is added to struct bgp_nexthop_cache.

This flag is set when the gateway IP is L3 reachable but not yet resolved by a
MAC/IP route.

Following table explains the combination of L3 and L2 reachability w.r.t.
BGP_NEXTHOP_VALID and BGP_NEXTHOP_EVPN_INCOMPLETE flags

*                | MACIP resolved | MACIP unresolved
*----------------|----------------|------------------
* L3 reachable   | VALID      = 1 | VALID      = 0
*                | INCOMPLETE = 0 | INCOMPLETE = 1
* ---------------|----------------|--------------------
* L3 unreachable | VALID      = 0 | VALID      = 0
*                | INCOMPLETE = 0 | INCOMPLETE = 0

Procedure that we use to check if the gateway IP is resolvable by a MAC/IP
route:
- Find the EVI/L2VRF that belongs to the nexthop SVI using vni_svi_hash.
- Check if the gateway IP is present in remote_ip_hash in this EVI.

When the gateway IP is L3 reachable and it is also resolved by a MAC/IP route,
unset BGP_NEXTHOP_EVPN_INCOMPLETE flag and set BGP_NEXTHOP_VALID flag.

Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
2021-06-07 17:59:45 -07:00
Ameya Dharkar
9daa5d471a bgpd, zebra: Add svi_interface to zebra VNI and bgp EVPN structures
SVI ifindex for L2VNI is required in BGP to perform EVPN type-5 to type-2
recusrsive resolution using gateway IP overlay index.

Program this svi_ifindex in struct zebra_vni_t as well as in struct bgpevpn

Changes include:
1. Add svi_if field to struct zebra_evpn_t
2. Add svi_ifindex field to struct bgpevpn
3. When SVI (bridge or VLAN) is bound to a VxLAN interface, store it in the
zebra_evpn_t structure.
4. Add this SVI ifindex to ZEBRA_VNI_ADD
5. Store svi_ifindex in struct bgpevpn

Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
2021-06-07 17:58:23 -07:00
Ameya Dharkar
a2299abae8 bgpd: Import received EVPN RT-5 prefix with gateway IP in BGP VRF
The IP/IPv6 prefix carried with EVPN RT-5 is imported in the BGP vrf according
to the attached route targets.
If the prefix carries a gateway IP overlay index, this gateway IP should be
installed as the nexthop of the route imported in the BGP vrf.
This route in vrf will be marked as VALID only if the nexthop is resolved in the
SVI network.
To receive runtime reachability information for the nexthop, register it with
the nexthop tracking module.
Send this route to zebra after processing.

Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
2021-06-07 17:58:22 -07:00
Ameya Dharkar
66ff60895a bgpd: Parse EVPN RT-5 NLRI and store gateway IP for EVPN route
While installing this route in the EVPN table, make sure all the conditions
mentioned in the draft
https://tools.ietf.org/html/draft-ietf-bess-evpn-prefix-advertisement-11 are
met.
Draft mentions following conditions:
  - ESI and gateway IP cannot be both nonzero at the same time.
  - ESI, gateway IP, RMAC and VNI label all cannot be 0 at the same time.

If the received EVPN RT-5 route does not meet these conditions, the route is
treated as withdraw.

Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
2021-06-07 17:58:22 -07:00