Commit Graph

175 Commits

Author SHA1 Message Date
Donald Sharp
24a58196dd *: Convert event.h to frrevent.h
We should probably prevent any type of namespace collision
with something else.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24 08:32:17 -04:00
Donald Sharp
e16d030c65 *: Convert THREAD_XXX macros to EVENT_XXX macros
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24 08:32:17 -04:00
Donald Sharp
907a2395f4 *: Convert thread_add_XXX functions to event_add_XXX
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24 08:32:17 -04:00
Donald Sharp
e6685141aa *: Rename struct thread to struct event
Effectively a massive search and replace of
`struct thread` to `struct event`.  Using the
term `thread` gives people the thought that
this event system is a pthread when it is not

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24 08:32:17 -04:00
Donald Sharp
cb37cb336a *: Rename thread.[ch] to event.[ch]
This is a first in a series of commits, whose goal is to rename
the thread system in FRR to an event system.  There is a continual
problem where people are confusing `struct thread` with a true
pthread.  In reality, our entire thread.c is an event system.

In this commit rename the thread.[ch] files to event.[ch].

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24 08:32:16 -04:00
Donatas Abraitis
e9ad26e53f bgpd: Check if the peer is configured as interface when checking NHT
This causes early return. peer->conf is NULL for IPv6 link-local peering,
and the session never establish.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-03-07 22:36:15 +02:00
Russ White
ba755d35e5
Merge pull request #12248 from pguibert6WIND/bgpasdot
lib, bgp: add initial support for asdot format
2023-02-21 08:01:03 -05:00
Philippe Guibert
4a8cd6ad7f bgpd: support for as notation format for route distinguisher
RD may be built based on an AS number. Like for the AS, the RD
may use the AS notation. The two below examples can illustrate:

RD 1.1:20 stands for an AS4B:NN RD with AS4B=65536 in dot format.
RD 0.1:20 stands for an AS2B:NNNN RD with AS2B=0.1 in dot+ format.

This commit adds the asnotation mode to prefix_rd2str() API so as
to pick up the relevant display.

Two new printfrr extensions are available to display the RD with
the two above display methods.
- The pRDD extension stands for dot asnotation format
- The pRDE extension stands for dot+ asnotation format.
- The pRD extension has been renamed to pRDP extension

The code is changed each time '%pRD' printf extension is called.
Possibly, the asnotation may change the output, then a macro defines
the asnotation mode to use. A side effect of forging the mode to
use is that the string could not be concatenated with other strings
in vty_out and snprintfrr. Those functions have been called multiple
times. When zlog_debug needs to display the RD with some other string,
the prefix_rd2str() old API is used instead of the printf extension.

Some code has been kept untouched:
- code related to running-config. Actually, wherever an RD is displayed,
its configured name should be dumped.
- bgp rfapi code
- bgp evpn multihoming code (partially done), since the logic is
missing to get the asnotation of 'struct bgp_evpn_es'.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-02-10 10:27:23 +01:00
David Lamparter
acddc0ed3c *: auto-convert to SPDX License IDs
Done with a combination of regex'ing and banging my head against a wall.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2023-02-09 14:09:11 +01:00
Donald Sharp
2bb8b49ce1 Revert "Merge pull request #11127 from louis-6wind/bgp-leak"
This reverts commit 16aa1809e7, reversing
changes made to f616e71608.
2023-01-13 08:13:52 -05:00
Louis Scalbert
667a4e92da bgpd: move mp_nexthop_prefer_global boolean attribute to nh_flag
Previous commits have introduced a new 8 bits nh_flag in the attr
struct that has increased the memory footprint.

Move the mp_nexthop_prefer_global boolean in the attr structure that
takes 8 bits to the new nh_flag in order to go back to the previous
memory utilization.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2022-12-16 15:07:00 +01:00
Louis Scalbert
acf31ef73b bgpd: fix prefix VRF leaking with 'network import-check' (5/5)
The following configuration creates an infinite routing leaking loop
because 'rt vpn both' parameters are the same in both VRFs.

> router bgp 5227 vrf r1-cust4
>    no bgp network import-check
>    bgp router-id 192.168.1.1
>    address-family ipv4 unicast
>      network 28.0.0.0/24
>      rd vpn export 10:12
>      rt vpn both 52:100
>      import vpn
>      export vpn
>    exit-address-family
> !
> router bgp 5227 vrf r1-cust5
>    no bgp network import-check
>    bgp router id 192.168.1.1
>    address-family ipv4 unicast
>      network 29.0.0.0/24
>      rd vpn export 10:13
>      rt vpn both 52:100
>      import vpn
>      export vpn
>    exit-address-family

The previous commit has added a routing leak update when a nexthop
update is received from zebra. It indirectly calls
bgp_find_or_add_nexthop() in which a static route triggers a nexthop
cache entry registration that triggers a nexthop update from zebra.

Do not register again the nexthop cache entry if the BGP_STATIC_ROUTE is
already set.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2022-12-16 14:52:47 +01:00
Louis Scalbert
1e24860bf7 bgpd: fix prefix VRF leaking with 'network import-check' (4/5)
If 'network import-check' is defined on the source BGP session, prefixes
that are stated in the network command cannot be leaked to the other
VRFs BGP table even if they are present in the origin VRF RIB if the
'rt import' statement is defined after the 'network <prefix>' ones.

When a prefix nexthop is updated, update the prefix route leaking. The
current state of nexthop validation is now stored in the attributes of
the bgp path info. Attributes are compared with the previous ones at
route leaking update so that a nexthop validation change now triggers
the update of destination VRF BGP table.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2022-12-16 14:52:47 +01:00
Louis Scalbert
d0a55f87e9 bgpd: fix prefix VRF leaking with 'network import-check' (3/5)
"if not XX else" statements are confusing.

Replace two "if not XX else" statements by "if XX else" to prepare next
commits. The patch is only cosmetic.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2022-12-16 14:52:47 +01:00
Louis Scalbert
ac2f64d3ec bgpd: fix the IGP metric for best path selection on VPN import
Since the commit da0c0ef70c ("bgpd: VRF-Lite fix best path selection"),
the best path selection is made from the comparison of the attributes
of the original route i.e. the ultimate path.

The IGP metric is currently set on the child path instead of the
ultimate path (i.e. the parent path). On eBGP, the ultimate path is the
child path. However, for imported routes, the ultimate path is always
set to 0, which results in skipping the IGP metric comparison when
selecting the best path.

Set the IGP metric on the ultimate path when a BGP nexthop is added or
updated.

Fixes: da0c0ef70c ("bgpd: VRF-Lite fix best path selection")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2022-12-15 17:09:35 +01:00
Pooja Jagadeesh Doijode
51f3216bee bgpd: BGP fails to free the nexthop node
In case of BGP unnumbered, BGP fails to free the nexthop
node for peer if the interface is shutdown before
unconfiguring/deleting the BGP neighbor.

This is because, when the interface is shutdown,
peer's LL neighbor address will be cleared. Therefore,
during neighbor deletion, since the peer's neighbor
address is not available, BGP will skip freeing the
nexthop node of this peer. This results in a stale
nexthop node that points to a peer that's already
been freed.

Ticket: 3191547
Signed-off-by: Pooja Jagadeesh Doijode <pdoijode@nvidia.com>
2022-12-10 07:40:32 -05:00
Donald Sharp
f3c6dd49f4 *: Add ability for daemons to notice resilience changes
This patch just introduces the callback mechanism for the
resilient nexthop changes so that upper level daemons
can take advantage of the change.  This does nothing
at this point but just call some code.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-11-04 13:34:27 -04:00
Donatas Abraitis
46dbf9d0c0 bgpd: Implement ACCEPT_OWN extended community
TL;DR: rfc7611.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-10-12 17:48:43 +03:00
Donatas Abraitis
c4f64ea94d bgpd: Use %pRD for prefix_rd2str()
Convert a bunch of prefix_rd2str() for json/vty stuff.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-09-22 13:12:11 +03:00
Philippe Guibert
4cd690ae4d bgpd: add 'mpls bgp forwarding' to ease mpls vpn ebgp peering
RFC4364 describes peerings between multiple AS domains, to ease
the continuity of VPN services across multiple SPs. This commit
implements a sub-set of IETF option b) described in chapter 10 b.

The ASBR to ASBR approach is taken, with an EBGP peering between
the two routers. The EBGP peering must be directly connected to
the outgoing interface used. In those conditions, the next hop
is directly connected, and there is no need to have a transport
label to convey the VPN label. A new vty command is added on a
per interface basis:

This command if enabled, will permit to convey BGP VPN labels
without any transport labels (i.e. with implicit-null label).

restriction:
this command is used only for EBGP directly connected peerings.
Other use cases are not covered.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2022-09-05 22:26:33 +02:00
Philippe Guibert
1bb550b63c bgpd: add resolution for l3vpn traffic over gre interfaces
When a route imported from l3vpn is analysed, the nexthop from default
VRF is looked up against a valid MPLS path. Generally, this is done on
backbones with a MPLS signalisation transport layer like LDP. Generally,
the BGP connection is multiple hops away. That scenario is already
working.

There is case where it is possible to run L3VPN over GRE interfaces, and
where there is no LSP path over that GRE interface: GRE is just here to
tunnel MPLS traffic. On that case, the nexthop given in the path does not
have MPLS path, but should be authorized to convey MPLS traffic provided
that the user permits it via a configuration command.

That commit introduces a new command that can be activated in route-map:
 > set l3vpn next-hop encapsulation gre

That command authorizes the nexthop tracking engine to accept paths that
o have a GRE interface as output, independently of the presence of an LSP
path or not.

A configuration example is given below. When bgp incoming vpnv4 updates
are received, the nexthop of NLRI is 192.168.0.2. Based on nexthop
tracking service from zebra, BGP knows that the output interface to reach
192.168.0.2 is r1-gre0. Because that interface is not MPLS based, but is
a GRE tunnel, then the update will be using that nexthop to be installed.

    interface r1-gre0
     ip address 192.168.0.1/24
    exit
    router bgp 65500
     bgp router-id 1.1.1.1
     neighbor 192.168.0.2 remote-as 65500
     !
     address-family ipv4 unicast
      no neighbor 192.168.0.2 activate
     exit-address-family
     !
     address-family ipv4 vpn
      neighbor 192.168.0.2 activate
      neighbor 192.168.0.2 route-map rmap in
     exit-address-family
    exit
    !
    router bgp 65500 vrf vrf1
     bgp router-id 1.1.1.1
     no bgp network import-check
     !
     address-family ipv4 unicast
      network 10.201.0.0/24
      redistribute connected
      label vpn export 101
      rd vpn export 444:1
      rt vpn both 52:100
      export vpn
      import vpn
     exit-address-family
    exit
    !
    route-map rmap permit 1
     set l3vpn next-hop encapsulation gre
    exit

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2022-09-05 22:26:25 +02:00
Donatas Abraitis
036f482fce bgpd: Drop bnc_str() function
Reuse %pFX -> prefix2str()

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-08-25 14:35:28 +03:00
Donatas Abraitis
511211bf56 bgpd: Convert prefix2str to %pFX
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-08-25 14:35:27 +03:00
Donald Sharp
083ec940ab bgpd: Convert from bgp_clock() to monotime()
Let's convert to our actual library call instead
of using yet another abstraction that makes it fun
for people to switch daemons.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-08-24 08:23:40 -04:00
Trey Aspelund
7226bc40d6 bgpd: ignore NEXT_HOP for MP_REACH_NLRI
RFC 4760 states we SHOULD ignore the NEXT_HOP attribute for BGP Update
messages carrying only MP_REACH_NLRI attributes. Thus we should use the
Network Address of Next Hop field of the MP_REACH_NLRI as the nexthop.

Instead of always looking for BGP_ATTR_NEXT_HOP, this commit ensures:
1) we set mp_nexthop_len to BGP_ATTR_NHLEN_IPV4 for v4 bgp_static routes
2) we check mp_nexthop_len when choosing the nexthop to use for nht
3) we check mp_nexthop_len when choosing the nexthop to send to zebra
4) we check mp_nexthop_len when picking the nexthop to shown by vtysh

Reported-by: Binon Gorbutt <binon@aervivo.com>
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
2022-08-04 20:36:49 +00:00
Donald Sharp
35aae5c9bc bgpd: LL peers need bnc's per peer
FRR should create a bnc per peer.  Not have
one's that write over others.  Currently when
FRR has multiple Interface based peering, BGP wa
creating a single BNC.  This is insufficient in that
we were accidently overwriting the one LL with other
data.  This causes issues when there are multiple and
there is weird starting issues with those interfaces
that you are peering over.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-07-22 09:09:39 -04:00
Donald Sharp
d00a5f6b8b bgpd: Fix SR color nexthop processing in BGP
Commit:
9f002fa5dd

Accidently broke the handling of SR color for nexthops
in BGP.  Put it back

Fixes: #11237
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-05-27 11:21:35 -04:00
Donald Sharp
9f002fa5dd bgpd: Fix import check removal
Fix: 06e4e90132

Modified BGP to pay more attention the prefix returned from
zebra to ensure that a LPM wasn't accidently causing BGP
import checks to think it had a match when it did not.
This unfortunately removed the check to handle the route
removal.

This sequence of config and events would leave BGP in a bad state:
ip route 100.100.100.0/24 Null0
router bgp 32932
  bgp network import-check
  address-family ipv4 uni
    network 100.100.100.0/24

Then if you removed the static route the import check would
still think the route existed:

donatas-pc(config)# ip route 100.100.100.0/24 Null0

donatas-pc(config)# do sh ip bgp import-check-table
Current BGP import check cache:
 100.100.100.0 valid [IGP metric 0], #paths 1
  blackhole
  Last update: Sat Apr 23 22:51:34 2022

donatas-pc(config)# do sh ip nht
100.100.100.0
 resolved via static
 is directly connected, Null0
 Client list: bgp(fd 17)

donatas-pc(config)# do sh ip bgp neighbors 192.168.10.123 advertised-routes | include 100.100.100.0
*> 100.100.100.0/24 0.0.0.0                  0         32768 i

donatas-pc(config)# no ip route 100.100.100.0/24 Null0

donatas-pc(config)# do sh ip nht
100.100.100.0
 resolved via kernel
 via 192.168.10.1, enp3s0
 Client list: bgp(fd 17)

donatas-pc(config)# do sh ip bgp import-check-table
Current BGP import check cache:
 100.100.100.0 valid [IGP metric 0], #paths 1
  blackhole
  Last update: Sat Apr 23 22:51:34 2022

donatas-pc(config)# do sh ip bgp neighbors 192.168.10.123 advertised-routes | include 100.100.100.0
*> 100.100.100.0/24 0.0.0.0                  0         32768 i
donatas-pc(config)#

Fix this by moving the code to handle the prefix check to the
evaluation function and mark the bnc as not matching and actually
evaluate the bnc.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-04-24 17:08:12 -04:00
Donatas Abraitis
3d3c38b1d4
Merge pull request #11051 from donaldsharp/speell_more
Speell more
2022-04-20 11:04:14 +03:00
Donald Sharp
4667220e3a *: Fix spelling of accidently
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-04-19 08:31:30 -04:00
Donald Sharp
7f2e9cce7f bgpd: Allow type 5 routes to be handled better when link is flapping
In some stress testing, we are seeing type-5 evpn routes being
left in a rejected state in zebra.

Sequence of events as I am seeing it:

a) Interface comes up that type5 routes nexthop depends on
b) zebra processes creates the connected and lets bgp know via nht
c) bgp installs the route to zebra
d) zebra processes and sends install to kernel
e) before route is installed, the interface the nexthop points at flaps
f) the route install is rejected, notify zebra
g) the interface comes up
h) zebra gets the notification about the route install rejection
i) zebra processes the down/up and turns it into a single up event
j) BGP never reinstalls the type 5 route

This up event does not translate into a nexthop tracking event
when the events happen quickly enough and/or zebra is extremelyh
busy and bgp would never see that the nexthops changed even very quickly.

This is the same thing that was going on with
https://github.com/FRRouting/frr/pull/7724
in PBR.

To fix this let's notice the interface up/down events for v4
in bgp now as well.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-04-18 14:15:23 -04:00
David Lamparter
eb3c9d9774 *: add SAFI argument to zclient_send_rnh
Just pushing that SAFI_UNICAST up 1 level to the caller.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2022-03-27 14:57:22 +02:00
Donatas Abraitis
23f60ffd52 bgpd: Remove dead code for [un]register_zebra_rnh
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-03-12 21:48:18 +02:00
Donald Sharp
06e4e90132 *: When matching against a nexthop send and process what it matched against
Currently the nexthop tracking code is only sending to the requestor
what it was requested to match against.  When the nexthop tracking
code was simplified to not need an import check and a nexthop check
in b8210849b8 for bgpd.  It was not
noticed that a longer prefix could match but it would be seen
as a match because FRR was not sending up both the resolved
route prefix and the route FRR was asked to match against.

This code change causes the nexthop tracking code to pass
back up the matched requested route (so that the calling
protocol can figure out which one it is being told about )
as well as the actual prefix that was matched to.

Fixes: #10766
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-03-12 11:18:45 -05:00
Donald Sharp
cc9f21da22 *: Change thread->func to return void instead of int
The int return value is never used.  Modify the code
base to just return a void instead.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-02-23 19:56:04 -05:00
Donatas Abraitis
5fee827d32
Merge pull request #10042 from wangshengjun/dev_bgp
bgpd: do not set the 'BGP_NEXTHOP_REGISTERED/BGP_NEXTHOP_UNREGISTERD'…
2021-11-29 09:39:29 +01:00
wangshengjun
a652203835 bgpd: do not set the 'BGP_NEXTHOP_REGISTERED/BGP_NEXTHOP_UNREGISTERD' zclient send failed
Signed-off-by: wangshengjun <wangshengjun@asterfusion.com>
2021-11-29 09:52:09 +08:00
Igor Ryzhov
096f7609f9 *: cleanup ifp->vrf_id
Since f60a1188 we store a pointer to the VRF in the interface structure.
There's no need anymore to store a separate vrf_id field.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2021-11-22 20:47:23 +03:00
Donald Sharp
7a8ce9d56d *: use compiler.h MIN/MAX macros instead of everyone having one
We had various forms of min/max macros across multiple daemons
all of which duplicated what we have in compiler.h.  Convert
everyone to use the `correct` ones

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2021-11-11 09:39:52 -05:00
Igor Ryzhov
0b52b75a14 bgpd: don't use if_lookup_by_index_all_vrf
if_lookup_by_index_all_vrf doesn't work correctly with netns VRF backend
as the same index may be used in multiple netns simultaneously.

We always know the BGP instance we work with, so use its VRF id for the
interface lookup.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2021-10-28 18:54:46 +03:00
Donald Sharp
3d174ce08d *: Remove the ZEBRA_IMPORT_ROUTE_XXX zapi messages
These are no longer really needed.  The client just needs
to call nexthop resolution instead.

So let's remove the zapi types.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2021-09-27 12:38:08 -04:00
Donald Sharp
b8210849b8 bgpd: Make bgp ready to remove distinction between 2 nh tracking types
Allow bgp to figure out if it cares about address resolution instead
of having zebra care about it.  This will allow the removal of the
zapi type for import checking and just use nexthop resolution.

Effectively we just look up the route being returned and
if it is in either table we just handle it instead of
looking for clues from the zapi message type.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2021-09-27 12:38:08 -04:00
Donald Sharp
ed6cec97d7 *: Add resolve via default flag 2021-09-27 12:38:08 -04:00
Philippe Guibert
654a5978f6 bgpd: prevent routes loop through itself
Some BGP updates received by BGP invite local router to
install a route through itself. The system will not do it, and
the route should be considered as not valid at the earliest.

This case is detected on the zebra, and this detection prevents
from trying to install this route to the local system. However,
the nexthop tracking mechanism is called, and acts as if the route
was valid, which is not the case.

By detecting in BGP that use case, we avoid installing the invalid
routes.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2021-07-12 13:57:36 +02:00
Donald Sharp
acb4c44ef8
Merge pull request #8942 from ton31337/fix/cleanups_2
Another round of cleanup
2021-07-06 09:47:41 -04:00
Philippe Guibert
17ef5a9343 bgpd: nht unresolved with global address next-hop
When bgp peers with ipv6 link local addresses, it may receive a
BGP update with next-hop containing both LL and GA information.
By default, nexthop tracking applies to GA, and ignores presence
of LL, when both addresses are present. This is a problem for
resolving GA as next-hop as the next-hop information can be solved
by using the LL address only.

The solution consists in defaulting the nexthop ipv6 choice to LL
when available, and moving back to GA if a route-map is locally
configured at inbound.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2021-07-01 16:21:54 +02:00
Donatas Abraitis
d4980edf47 bgpd: No need casting to boolean for boolean
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-06-29 22:27:50 +03:00
Ameya Dharkar
021b659665 bgpd: EVPN route type-5 to type-2 recursive resolution using gateway IP
When EVPN prefix route with a gateway IP overlay index is imported into the IP
vrf at the ingress PE, BGP nexthop of this route is set to the gateway IP.
For this vrf route to be valid, following conditions must be met.
- Gateway IP nexthop of this route should be L3 reachable, i.e., this route
  should be resolved in RIB.
- A remote MAC/IP route should be present for the gateway IP address in the
  EVI(L2VPN table).

To check for the first condition, gateway IP is registered with nht (nexthop
tracking) to receive the reachability notifications for this IP from zebra RIB.
If the gateway IP is reachable, zebra sends the reachability information (i.e.,
nexthop interface) for the gateway IP.
This nexthop interface should be the SVI interface.

Now, to find out type-2 route corresponding to the gateway IP, we need to fetch
the VNI for the above SVI.

To do this VNI lookup effitiently, define a hashtable of struct bgpevpn with
svi_ifindex as key.

struct hash *vni_svi_hash;

An EVI instance is added to vni_svi_hash if its svi_ifindex is nonzero.

Using this hash, we obtain struct bgpevpn corresponding to the gateway IP.

For gateway IP overlay index recursive lookup, once we find the correct EVI, we
have to lookup its route table for a MAC/IP prefix. As we have to iterate the
entire route table for every lookup, this lookup is expensive. We can optimize
this lookup by adding all the remote IP addresses in a hash table.

Following hash table is defined for this purpose in struct bgpevpn
Struct hash *remote_ip_hash;

When a MAC/IP route is installed in the EVI table, it is also added to
remote_ip_hash.

It is possible to have multiple MAC/IP routes with the same IP address because
of host move scenarios. Thus, for every address addr in remote_ip_hash, we
maintain list of all the MAC/IP routes having addr as their IP address.

Following structure defines an address in remote_ip_hash.
struct evpn_remote_ip {
        struct ipaddr addr;
        struct list *macip_path_list;
};

A Boolean field is added to struct bgp_nexthop_cache to indicate that the
nexthop is EVPN gateway IP overlay index.

bool is_evpn_gwip_nexthop;

A flag BGP_NEXTHOP_EVPN_INCOMPLETE is added to struct bgp_nexthop_cache.

This flag is set when the gateway IP is L3 reachable but not yet resolved by a
MAC/IP route.

Following table explains the combination of L3 and L2 reachability w.r.t.
BGP_NEXTHOP_VALID and BGP_NEXTHOP_EVPN_INCOMPLETE flags

*                | MACIP resolved | MACIP unresolved
*----------------|----------------|------------------
* L3 reachable   | VALID      = 1 | VALID      = 0
*                | INCOMPLETE = 0 | INCOMPLETE = 1
* ---------------|----------------|--------------------
* L3 unreachable | VALID      = 0 | VALID      = 0
*                | INCOMPLETE = 0 | INCOMPLETE = 0

Procedure that we use to check if the gateway IP is resolvable by a MAC/IP
route:
- Find the EVI/L2VRF that belongs to the nexthop SVI using vni_svi_hash.
- Check if the gateway IP is present in remote_ip_hash in this EVI.

When the gateway IP is L3 reachable and it is also resolved by a MAC/IP route,
unset BGP_NEXTHOP_EVPN_INCOMPLETE flag and set BGP_NEXTHOP_VALID flag.

Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
2021-06-07 17:59:45 -07:00
Ameya Dharkar
a2299abae8 bgpd: Import received EVPN RT-5 prefix with gateway IP in BGP VRF
The IP/IPv6 prefix carried with EVPN RT-5 is imported in the BGP vrf according
to the attached route targets.
If the prefix carries a gateway IP overlay index, this gateway IP should be
installed as the nexthop of the route imported in the BGP vrf.
This route in vrf will be marked as VALID only if the nexthop is resolved in the
SVI network.
To receive runtime reachability information for the nexthop, register it with
the nexthop tracking module.
Send this route to zebra after processing.

Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
2021-06-07 17:58:22 -07:00
Hiroki Shirokura
2ba6be5b24 bgpd,sharpd,zebra: fix code style
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
2021-06-02 10:24:48 -04:00