Commit Graph

223 Commits

Author SHA1 Message Date
Donald Sharp
cb37cb336a *: Rename thread.[ch] to event.[ch]
This is a first in a series of commits, whose goal is to rename
the thread system in FRR to an event system.  There is a continual
problem where people are confusing `struct thread` with a true
pthread.  In reality, our entire thread.c is an event system.

In this commit rename the thread.[ch] files to event.[ch].

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24 08:32:16 -04:00
Philippe Guibert
c9b416cbd1 bgpd: export redistributed routes with label allocation per nexthop
The label allocation per nexthop mode requires to use a nexthop
tracking context. For redistributed routes, a nexthop tracking
context is created, and the resolution helps to know the real
nexthop ip address used. The below configuration example has
been used:

 > vrf vrf1
 >  ip route 172.31.0.14/32 192.0.2.14
 >  ip route 172.31.0.15/32 192.0.2.12
 >  ip route 172.31.0.30/32 192.0.2.30
 > exit
 > router bgp 65500 vrf vrf1
 >  address-family ipv4 unicast
 >   redistribute static
 >   label vpn export per-nexthop
 > [..]

The static routes are correctly imported in the BGP IPv4 RIB.
Contrary to label allocation per vrf mode, some nexthop tracking
are created/or reused:

 > # show bgp vrf vrf1 nexthop
 > 192.0.2.12 valid [IGP metric 0], #paths 3, peer 192.0.2.12
 >  if r1-eth1
 >  Last update: Fri Jan 13 15:49:42 2023
 > 192.0.2.14 valid [IGP metric 0], #paths 1
 >  if r1-eth1
 >  Last update: Fri Jan 13 15:49:42 2023
 > 192.0.2.30 valid [IGP metric 0], #paths 1
 >  if r1-eth1
 >  Last update: Fri Jan 13 15:49:51 2023
 > [..]

This results in having a BGP VPN route for each of the static
routes:

 > # show bgp ipv4 vpn
 > [..]
 > Route Distinguisher: 444:1
 >  *> 172.31.0.14/32   192.0.2.14@9<            0         32768 ?
 >  *> 172.31.0.15/32   192.0.2.12@9<            0         32768 ?
 >  *> 172.31.0.30/32   192.0.2.30@9<            0         32768 ?
 > [..]

Without that patch, only the redistributed routes that rely on a
pre-existing nexthop tracking context could be exported.

Also, a command in the code about redistributed routes is modified
accordingly, to explain that redistribute routes may be submitted
to nexthop tracking in the case label allocation per next-hop is
used.

note:
VNC routes have been removed from the redistribution,
because of a test failure in the bgp_l3vpn_to_bgp_direct test.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-03-22 12:06:29 +01:00
Philippe Guibert
92d5e31ace bgpd: add support for l3vpn per-nexthop label
This commit introduces a new method to associate a label to
prefixes to export to a VPNv4 backbone. All the methods to
associate a label to a BGP update is documented in rfc4364,
chapter 4.3.2. Initially, the "single label for an entire
VRF" method was available. This commit adds "single label
for each attachment circuit" method.

The change impacts the control-plane, because each BGP update
is checked to know if the nexthop has reachability in the VRF
or not. If this is the case, then a unique label for a given
destination IP in the VRF will be picked up. This label will
be reused for an other BGP update that will have the same
nexthop IP address.

The change impacts the data-plane, because the MPLs pop
mechanism applied to incoming labelled packets changes: the
MPLS label is popped, and the packet is directly sent to the
connected nexthop described in the previous outgoing BGP VPN
update.

By default per-vrf mode is done, but the user may choose
the per-nexthop mode, by using the vty command from the
previous commit. In the latter case, a per-vrf label
will however be allocated to handle networks that are not directly
connected. This is the case for local traffic for instance.

The change also include the following:

-  ECMP case
In case a route is learnt in a given VRF, and is resolved via an
ECMP nexthop. This implies that when exporting the route as a BGP
update, if label allocation per nexthop is used, then two possible
MPLS values could be picked up, which is not possible with the
current implementation. Actually, the NLRI for VPNv4 stores one
prefix, and one single label value, not two. Today, RFC8277 with
multiple label capability is not yet available.
To avoid this corner case, when a route is resolved via more than one
nexthop, the label allocation per nexthop will not apply, and the
default per-vrf label will be chosen.
Let us imagine BGP redistributes a static route using the `172.31.0.20`
nexthop. The nexthop resolution will find two different nexthops fo a
unique BGP update.

 > r1# show running-config
 > [..]
 > vrf vrf1
 >  ip route 172.31.0.30/32 172.31.0.20
 > r1# show bgp vrf vrf1 nexthop
 > [..]
 > 172.31.0.20 valid [IGP metric 0], #paths 1
 >  gate 192.0.2.11
 >  gate 192.0.2.12
 >  Last update: Mon Jan 16 09:27:09 2023
 >  Paths:
 >    1/1 172.31.0.30/32 VRF vrf1 flags 0x20018

To avoid this situation, BGP updates that resolve over multiple
nexthops are using the unique per-vrf label.

- recursive route case

Prefixes that need a recursive route to be resolved can
also be eligible for mpls allocation per nexthop. In that
case, the nexthop will be the recursive nexthop calculated.

To achieve this, all nexthop types in bnc contexts are valid,
except for the blackhole nexthops.

- network declared prefixes

Nexthop tracking is used to look for the reachability of the
prefixes. When the the 'no bgp network import-check' command
is used, network declared prefixes are maintained active,
even if there is no active nexthop.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-03-22 12:06:29 +01:00
Donatas Abraitis
e9ad26e53f bgpd: Check if the peer is configured as interface when checking NHT
This causes early return. peer->conf is NULL for IPv6 link-local peering,
and the session never establish.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-03-07 22:36:15 +02:00
Russ White
ba755d35e5
Merge pull request #12248 from pguibert6WIND/bgpasdot
lib, bgp: add initial support for asdot format
2023-02-21 08:01:03 -05:00
Philippe Guibert
4a8cd6ad7f bgpd: support for as notation format for route distinguisher
RD may be built based on an AS number. Like for the AS, the RD
may use the AS notation. The two below examples can illustrate:

RD 1.1:20 stands for an AS4B:NN RD with AS4B=65536 in dot format.
RD 0.1:20 stands for an AS2B:NNNN RD with AS2B=0.1 in dot+ format.

This commit adds the asnotation mode to prefix_rd2str() API so as
to pick up the relevant display.

Two new printfrr extensions are available to display the RD with
the two above display methods.
- The pRDD extension stands for dot asnotation format
- The pRDE extension stands for dot+ asnotation format.
- The pRD extension has been renamed to pRDP extension

The code is changed each time '%pRD' printf extension is called.
Possibly, the asnotation may change the output, then a macro defines
the asnotation mode to use. A side effect of forging the mode to
use is that the string could not be concatenated with other strings
in vty_out and snprintfrr. Those functions have been called multiple
times. When zlog_debug needs to display the RD with some other string,
the prefix_rd2str() old API is used instead of the printf extension.

Some code has been kept untouched:
- code related to running-config. Actually, wherever an RD is displayed,
its configured name should be dumped.
- bgp rfapi code
- bgp evpn multihoming code (partially done), since the logic is
missing to get the asnotation of 'struct bgp_evpn_es'.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-02-10 10:27:23 +01:00
David Lamparter
acddc0ed3c *: auto-convert to SPDX License IDs
Done with a combination of regex'ing and banging my head against a wall.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2023-02-09 14:09:11 +01:00
Donald Sharp
2bb8b49ce1 Revert "Merge pull request #11127 from louis-6wind/bgp-leak"
This reverts commit 16aa1809e7, reversing
changes made to f616e71608.
2023-01-13 08:13:52 -05:00
Louis Scalbert
667a4e92da bgpd: move mp_nexthop_prefer_global boolean attribute to nh_flag
Previous commits have introduced a new 8 bits nh_flag in the attr
struct that has increased the memory footprint.

Move the mp_nexthop_prefer_global boolean in the attr structure that
takes 8 bits to the new nh_flag in order to go back to the previous
memory utilization.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2022-12-16 15:07:00 +01:00
Louis Scalbert
acf31ef73b bgpd: fix prefix VRF leaking with 'network import-check' (5/5)
The following configuration creates an infinite routing leaking loop
because 'rt vpn both' parameters are the same in both VRFs.

> router bgp 5227 vrf r1-cust4
>    no bgp network import-check
>    bgp router-id 192.168.1.1
>    address-family ipv4 unicast
>      network 28.0.0.0/24
>      rd vpn export 10:12
>      rt vpn both 52:100
>      import vpn
>      export vpn
>    exit-address-family
> !
> router bgp 5227 vrf r1-cust5
>    no bgp network import-check
>    bgp router id 192.168.1.1
>    address-family ipv4 unicast
>      network 29.0.0.0/24
>      rd vpn export 10:13
>      rt vpn both 52:100
>      import vpn
>      export vpn
>    exit-address-family

The previous commit has added a routing leak update when a nexthop
update is received from zebra. It indirectly calls
bgp_find_or_add_nexthop() in which a static route triggers a nexthop
cache entry registration that triggers a nexthop update from zebra.

Do not register again the nexthop cache entry if the BGP_STATIC_ROUTE is
already set.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2022-12-16 14:52:47 +01:00
Louis Scalbert
1e24860bf7 bgpd: fix prefix VRF leaking with 'network import-check' (4/5)
If 'network import-check' is defined on the source BGP session, prefixes
that are stated in the network command cannot be leaked to the other
VRFs BGP table even if they are present in the origin VRF RIB if the
'rt import' statement is defined after the 'network <prefix>' ones.

When a prefix nexthop is updated, update the prefix route leaking. The
current state of nexthop validation is now stored in the attributes of
the bgp path info. Attributes are compared with the previous ones at
route leaking update so that a nexthop validation change now triggers
the update of destination VRF BGP table.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2022-12-16 14:52:47 +01:00
Louis Scalbert
d0a55f87e9 bgpd: fix prefix VRF leaking with 'network import-check' (3/5)
"if not XX else" statements are confusing.

Replace two "if not XX else" statements by "if XX else" to prepare next
commits. The patch is only cosmetic.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2022-12-16 14:52:47 +01:00
Louis Scalbert
ac2f64d3ec bgpd: fix the IGP metric for best path selection on VPN import
Since the commit da0c0ef70c ("bgpd: VRF-Lite fix best path selection"),
the best path selection is made from the comparison of the attributes
of the original route i.e. the ultimate path.

The IGP metric is currently set on the child path instead of the
ultimate path (i.e. the parent path). On eBGP, the ultimate path is the
child path. However, for imported routes, the ultimate path is always
set to 0, which results in skipping the IGP metric comparison when
selecting the best path.

Set the IGP metric on the ultimate path when a BGP nexthop is added or
updated.

Fixes: da0c0ef70c ("bgpd: VRF-Lite fix best path selection")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2022-12-15 17:09:35 +01:00
Pooja Jagadeesh Doijode
51f3216bee bgpd: BGP fails to free the nexthop node
In case of BGP unnumbered, BGP fails to free the nexthop
node for peer if the interface is shutdown before
unconfiguring/deleting the BGP neighbor.

This is because, when the interface is shutdown,
peer's LL neighbor address will be cleared. Therefore,
during neighbor deletion, since the peer's neighbor
address is not available, BGP will skip freeing the
nexthop node of this peer. This results in a stale
nexthop node that points to a peer that's already
been freed.

Ticket: 3191547
Signed-off-by: Pooja Jagadeesh Doijode <pdoijode@nvidia.com>
2022-12-10 07:40:32 -05:00
Donald Sharp
f3c6dd49f4 *: Add ability for daemons to notice resilience changes
This patch just introduces the callback mechanism for the
resilient nexthop changes so that upper level daemons
can take advantage of the change.  This does nothing
at this point but just call some code.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-11-04 13:34:27 -04:00
Donatas Abraitis
46dbf9d0c0 bgpd: Implement ACCEPT_OWN extended community
TL;DR: rfc7611.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-10-12 17:48:43 +03:00
Donatas Abraitis
c4f64ea94d bgpd: Use %pRD for prefix_rd2str()
Convert a bunch of prefix_rd2str() for json/vty stuff.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-09-22 13:12:11 +03:00
Philippe Guibert
4cd690ae4d bgpd: add 'mpls bgp forwarding' to ease mpls vpn ebgp peering
RFC4364 describes peerings between multiple AS domains, to ease
the continuity of VPN services across multiple SPs. This commit
implements a sub-set of IETF option b) described in chapter 10 b.

The ASBR to ASBR approach is taken, with an EBGP peering between
the two routers. The EBGP peering must be directly connected to
the outgoing interface used. In those conditions, the next hop
is directly connected, and there is no need to have a transport
label to convey the VPN label. A new vty command is added on a
per interface basis:

This command if enabled, will permit to convey BGP VPN labels
without any transport labels (i.e. with implicit-null label).

restriction:
this command is used only for EBGP directly connected peerings.
Other use cases are not covered.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2022-09-05 22:26:33 +02:00
Philippe Guibert
1bb550b63c bgpd: add resolution for l3vpn traffic over gre interfaces
When a route imported from l3vpn is analysed, the nexthop from default
VRF is looked up against a valid MPLS path. Generally, this is done on
backbones with a MPLS signalisation transport layer like LDP. Generally,
the BGP connection is multiple hops away. That scenario is already
working.

There is case where it is possible to run L3VPN over GRE interfaces, and
where there is no LSP path over that GRE interface: GRE is just here to
tunnel MPLS traffic. On that case, the nexthop given in the path does not
have MPLS path, but should be authorized to convey MPLS traffic provided
that the user permits it via a configuration command.

That commit introduces a new command that can be activated in route-map:
 > set l3vpn next-hop encapsulation gre

That command authorizes the nexthop tracking engine to accept paths that
o have a GRE interface as output, independently of the presence of an LSP
path or not.

A configuration example is given below. When bgp incoming vpnv4 updates
are received, the nexthop of NLRI is 192.168.0.2. Based on nexthop
tracking service from zebra, BGP knows that the output interface to reach
192.168.0.2 is r1-gre0. Because that interface is not MPLS based, but is
a GRE tunnel, then the update will be using that nexthop to be installed.

    interface r1-gre0
     ip address 192.168.0.1/24
    exit
    router bgp 65500
     bgp router-id 1.1.1.1
     neighbor 192.168.0.2 remote-as 65500
     !
     address-family ipv4 unicast
      no neighbor 192.168.0.2 activate
     exit-address-family
     !
     address-family ipv4 vpn
      neighbor 192.168.0.2 activate
      neighbor 192.168.0.2 route-map rmap in
     exit-address-family
    exit
    !
    router bgp 65500 vrf vrf1
     bgp router-id 1.1.1.1
     no bgp network import-check
     !
     address-family ipv4 unicast
      network 10.201.0.0/24
      redistribute connected
      label vpn export 101
      rd vpn export 444:1
      rt vpn both 52:100
      export vpn
      import vpn
     exit-address-family
    exit
    !
    route-map rmap permit 1
     set l3vpn next-hop encapsulation gre
    exit

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2022-09-05 22:26:25 +02:00
Donatas Abraitis
036f482fce bgpd: Drop bnc_str() function
Reuse %pFX -> prefix2str()

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-08-25 14:35:28 +03:00
Donatas Abraitis
511211bf56 bgpd: Convert prefix2str to %pFX
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-08-25 14:35:27 +03:00
Donald Sharp
083ec940ab bgpd: Convert from bgp_clock() to monotime()
Let's convert to our actual library call instead
of using yet another abstraction that makes it fun
for people to switch daemons.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-08-24 08:23:40 -04:00
Trey Aspelund
7226bc40d6 bgpd: ignore NEXT_HOP for MP_REACH_NLRI
RFC 4760 states we SHOULD ignore the NEXT_HOP attribute for BGP Update
messages carrying only MP_REACH_NLRI attributes. Thus we should use the
Network Address of Next Hop field of the MP_REACH_NLRI as the nexthop.

Instead of always looking for BGP_ATTR_NEXT_HOP, this commit ensures:
1) we set mp_nexthop_len to BGP_ATTR_NHLEN_IPV4 for v4 bgp_static routes
2) we check mp_nexthop_len when choosing the nexthop to use for nht
3) we check mp_nexthop_len when choosing the nexthop to send to zebra
4) we check mp_nexthop_len when picking the nexthop to shown by vtysh

Reported-by: Binon Gorbutt <binon@aervivo.com>
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
2022-08-04 20:36:49 +00:00
Donald Sharp
35aae5c9bc bgpd: LL peers need bnc's per peer
FRR should create a bnc per peer.  Not have
one's that write over others.  Currently when
FRR has multiple Interface based peering, BGP wa
creating a single BNC.  This is insufficient in that
we were accidently overwriting the one LL with other
data.  This causes issues when there are multiple and
there is weird starting issues with those interfaces
that you are peering over.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-07-22 09:09:39 -04:00
Donald Sharp
d00a5f6b8b bgpd: Fix SR color nexthop processing in BGP
Commit:
9f002fa5dd

Accidently broke the handling of SR color for nexthops
in BGP.  Put it back

Fixes: #11237
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-05-27 11:21:35 -04:00
Donald Sharp
9f002fa5dd bgpd: Fix import check removal
Fix: 06e4e90132

Modified BGP to pay more attention the prefix returned from
zebra to ensure that a LPM wasn't accidently causing BGP
import checks to think it had a match when it did not.
This unfortunately removed the check to handle the route
removal.

This sequence of config and events would leave BGP in a bad state:
ip route 100.100.100.0/24 Null0
router bgp 32932
  bgp network import-check
  address-family ipv4 uni
    network 100.100.100.0/24

Then if you removed the static route the import check would
still think the route existed:

donatas-pc(config)# ip route 100.100.100.0/24 Null0

donatas-pc(config)# do sh ip bgp import-check-table
Current BGP import check cache:
 100.100.100.0 valid [IGP metric 0], #paths 1
  blackhole
  Last update: Sat Apr 23 22:51:34 2022

donatas-pc(config)# do sh ip nht
100.100.100.0
 resolved via static
 is directly connected, Null0
 Client list: bgp(fd 17)

donatas-pc(config)# do sh ip bgp neighbors 192.168.10.123 advertised-routes | include 100.100.100.0
*> 100.100.100.0/24 0.0.0.0                  0         32768 i

donatas-pc(config)# no ip route 100.100.100.0/24 Null0

donatas-pc(config)# do sh ip nht
100.100.100.0
 resolved via kernel
 via 192.168.10.1, enp3s0
 Client list: bgp(fd 17)

donatas-pc(config)# do sh ip bgp import-check-table
Current BGP import check cache:
 100.100.100.0 valid [IGP metric 0], #paths 1
  blackhole
  Last update: Sat Apr 23 22:51:34 2022

donatas-pc(config)# do sh ip bgp neighbors 192.168.10.123 advertised-routes | include 100.100.100.0
*> 100.100.100.0/24 0.0.0.0                  0         32768 i
donatas-pc(config)#

Fix this by moving the code to handle the prefix check to the
evaluation function and mark the bnc as not matching and actually
evaluate the bnc.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-04-24 17:08:12 -04:00
Donatas Abraitis
3d3c38b1d4
Merge pull request #11051 from donaldsharp/speell_more
Speell more
2022-04-20 11:04:14 +03:00
Donald Sharp
4667220e3a *: Fix spelling of accidently
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-04-19 08:31:30 -04:00
Donald Sharp
7f2e9cce7f bgpd: Allow type 5 routes to be handled better when link is flapping
In some stress testing, we are seeing type-5 evpn routes being
left in a rejected state in zebra.

Sequence of events as I am seeing it:

a) Interface comes up that type5 routes nexthop depends on
b) zebra processes creates the connected and lets bgp know via nht
c) bgp installs the route to zebra
d) zebra processes and sends install to kernel
e) before route is installed, the interface the nexthop points at flaps
f) the route install is rejected, notify zebra
g) the interface comes up
h) zebra gets the notification about the route install rejection
i) zebra processes the down/up and turns it into a single up event
j) BGP never reinstalls the type 5 route

This up event does not translate into a nexthop tracking event
when the events happen quickly enough and/or zebra is extremelyh
busy and bgp would never see that the nexthops changed even very quickly.

This is the same thing that was going on with
https://github.com/FRRouting/frr/pull/7724
in PBR.

To fix this let's notice the interface up/down events for v4
in bgp now as well.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-04-18 14:15:23 -04:00
David Lamparter
eb3c9d9774 *: add SAFI argument to zclient_send_rnh
Just pushing that SAFI_UNICAST up 1 level to the caller.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2022-03-27 14:57:22 +02:00
Donatas Abraitis
23f60ffd52 bgpd: Remove dead code for [un]register_zebra_rnh
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-03-12 21:48:18 +02:00
Donald Sharp
06e4e90132 *: When matching against a nexthop send and process what it matched against
Currently the nexthop tracking code is only sending to the requestor
what it was requested to match against.  When the nexthop tracking
code was simplified to not need an import check and a nexthop check
in b8210849b8 for bgpd.  It was not
noticed that a longer prefix could match but it would be seen
as a match because FRR was not sending up both the resolved
route prefix and the route FRR was asked to match against.

This code change causes the nexthop tracking code to pass
back up the matched requested route (so that the calling
protocol can figure out which one it is being told about )
as well as the actual prefix that was matched to.

Fixes: #10766
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-03-12 11:18:45 -05:00
Donald Sharp
cc9f21da22 *: Change thread->func to return void instead of int
The int return value is never used.  Modify the code
base to just return a void instead.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-02-23 19:56:04 -05:00
Donatas Abraitis
5fee827d32
Merge pull request #10042 from wangshengjun/dev_bgp
bgpd: do not set the 'BGP_NEXTHOP_REGISTERED/BGP_NEXTHOP_UNREGISTERD'…
2021-11-29 09:39:29 +01:00
wangshengjun
a652203835 bgpd: do not set the 'BGP_NEXTHOP_REGISTERED/BGP_NEXTHOP_UNREGISTERD' zclient send failed
Signed-off-by: wangshengjun <wangshengjun@asterfusion.com>
2021-11-29 09:52:09 +08:00
Igor Ryzhov
096f7609f9 *: cleanup ifp->vrf_id
Since f60a1188 we store a pointer to the VRF in the interface structure.
There's no need anymore to store a separate vrf_id field.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2021-11-22 20:47:23 +03:00
Donald Sharp
7a8ce9d56d *: use compiler.h MIN/MAX macros instead of everyone having one
We had various forms of min/max macros across multiple daemons
all of which duplicated what we have in compiler.h.  Convert
everyone to use the `correct` ones

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2021-11-11 09:39:52 -05:00
Igor Ryzhov
0b52b75a14 bgpd: don't use if_lookup_by_index_all_vrf
if_lookup_by_index_all_vrf doesn't work correctly with netns VRF backend
as the same index may be used in multiple netns simultaneously.

We always know the BGP instance we work with, so use its VRF id for the
interface lookup.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2021-10-28 18:54:46 +03:00
Donald Sharp
3d174ce08d *: Remove the ZEBRA_IMPORT_ROUTE_XXX zapi messages
These are no longer really needed.  The client just needs
to call nexthop resolution instead.

So let's remove the zapi types.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2021-09-27 12:38:08 -04:00
Donald Sharp
b8210849b8 bgpd: Make bgp ready to remove distinction between 2 nh tracking types
Allow bgp to figure out if it cares about address resolution instead
of having zebra care about it.  This will allow the removal of the
zapi type for import checking and just use nexthop resolution.

Effectively we just look up the route being returned and
if it is in either table we just handle it instead of
looking for clues from the zapi message type.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2021-09-27 12:38:08 -04:00
Donald Sharp
ed6cec97d7 *: Add resolve via default flag 2021-09-27 12:38:08 -04:00
Philippe Guibert
654a5978f6 bgpd: prevent routes loop through itself
Some BGP updates received by BGP invite local router to
install a route through itself. The system will not do it, and
the route should be considered as not valid at the earliest.

This case is detected on the zebra, and this detection prevents
from trying to install this route to the local system. However,
the nexthop tracking mechanism is called, and acts as if the route
was valid, which is not the case.

By detecting in BGP that use case, we avoid installing the invalid
routes.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2021-07-12 13:57:36 +02:00
Donald Sharp
acb4c44ef8
Merge pull request #8942 from ton31337/fix/cleanups_2
Another round of cleanup
2021-07-06 09:47:41 -04:00
Philippe Guibert
17ef5a9343 bgpd: nht unresolved with global address next-hop
When bgp peers with ipv6 link local addresses, it may receive a
BGP update with next-hop containing both LL and GA information.
By default, nexthop tracking applies to GA, and ignores presence
of LL, when both addresses are present. This is a problem for
resolving GA as next-hop as the next-hop information can be solved
by using the LL address only.

The solution consists in defaulting the nexthop ipv6 choice to LL
when available, and moving back to GA if a route-map is locally
configured at inbound.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2021-07-01 16:21:54 +02:00
Donatas Abraitis
d4980edf47 bgpd: No need casting to boolean for boolean
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-06-29 22:27:50 +03:00
Ameya Dharkar
021b659665 bgpd: EVPN route type-5 to type-2 recursive resolution using gateway IP
When EVPN prefix route with a gateway IP overlay index is imported into the IP
vrf at the ingress PE, BGP nexthop of this route is set to the gateway IP.
For this vrf route to be valid, following conditions must be met.
- Gateway IP nexthop of this route should be L3 reachable, i.e., this route
  should be resolved in RIB.
- A remote MAC/IP route should be present for the gateway IP address in the
  EVI(L2VPN table).

To check for the first condition, gateway IP is registered with nht (nexthop
tracking) to receive the reachability notifications for this IP from zebra RIB.
If the gateway IP is reachable, zebra sends the reachability information (i.e.,
nexthop interface) for the gateway IP.
This nexthop interface should be the SVI interface.

Now, to find out type-2 route corresponding to the gateway IP, we need to fetch
the VNI for the above SVI.

To do this VNI lookup effitiently, define a hashtable of struct bgpevpn with
svi_ifindex as key.

struct hash *vni_svi_hash;

An EVI instance is added to vni_svi_hash if its svi_ifindex is nonzero.

Using this hash, we obtain struct bgpevpn corresponding to the gateway IP.

For gateway IP overlay index recursive lookup, once we find the correct EVI, we
have to lookup its route table for a MAC/IP prefix. As we have to iterate the
entire route table for every lookup, this lookup is expensive. We can optimize
this lookup by adding all the remote IP addresses in a hash table.

Following hash table is defined for this purpose in struct bgpevpn
Struct hash *remote_ip_hash;

When a MAC/IP route is installed in the EVI table, it is also added to
remote_ip_hash.

It is possible to have multiple MAC/IP routes with the same IP address because
of host move scenarios. Thus, for every address addr in remote_ip_hash, we
maintain list of all the MAC/IP routes having addr as their IP address.

Following structure defines an address in remote_ip_hash.
struct evpn_remote_ip {
        struct ipaddr addr;
        struct list *macip_path_list;
};

A Boolean field is added to struct bgp_nexthop_cache to indicate that the
nexthop is EVPN gateway IP overlay index.

bool is_evpn_gwip_nexthop;

A flag BGP_NEXTHOP_EVPN_INCOMPLETE is added to struct bgp_nexthop_cache.

This flag is set when the gateway IP is L3 reachable but not yet resolved by a
MAC/IP route.

Following table explains the combination of L3 and L2 reachability w.r.t.
BGP_NEXTHOP_VALID and BGP_NEXTHOP_EVPN_INCOMPLETE flags

*                | MACIP resolved | MACIP unresolved
*----------------|----------------|------------------
* L3 reachable   | VALID      = 1 | VALID      = 0
*                | INCOMPLETE = 0 | INCOMPLETE = 1
* ---------------|----------------|--------------------
* L3 unreachable | VALID      = 0 | VALID      = 0
*                | INCOMPLETE = 0 | INCOMPLETE = 0

Procedure that we use to check if the gateway IP is resolvable by a MAC/IP
route:
- Find the EVI/L2VRF that belongs to the nexthop SVI using vni_svi_hash.
- Check if the gateway IP is present in remote_ip_hash in this EVI.

When the gateway IP is L3 reachable and it is also resolved by a MAC/IP route,
unset BGP_NEXTHOP_EVPN_INCOMPLETE flag and set BGP_NEXTHOP_VALID flag.

Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
2021-06-07 17:59:45 -07:00
Ameya Dharkar
a2299abae8 bgpd: Import received EVPN RT-5 prefix with gateway IP in BGP VRF
The IP/IPv6 prefix carried with EVPN RT-5 is imported in the BGP vrf according
to the attached route targets.
If the prefix carries a gateway IP overlay index, this gateway IP should be
installed as the nexthop of the route imported in the BGP vrf.
This route in vrf will be marked as VALID only if the nexthop is resolved in the
SVI network.
To receive runtime reachability information for the nexthop, register it with
the nexthop tracking module.
Send this route to zebra after processing.

Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
2021-06-07 17:58:22 -07:00
Hiroki Shirokura
2ba6be5b24 bgpd,sharpd,zebra: fix code style
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
2021-06-02 10:24:48 -04:00
Hiroki Shirokura
7f8c7d9166 bgpd: ignore nexthop validation for srv6-vpn
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
2021-06-02 10:24:48 -04:00
Donald Sharp
cc42c4f00c bgpd: use __func__ instead of __PRETTY_FUNCTION__
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2021-05-12 12:00:23 -04:00
Emanuele Di Pascale
c3b95419ba bgpd: fix invalid labeled nexthop check
the code processing an NHT update was only resetting the BGP_NEXTHOP_VALID
flag, so labeled nexthops were considered valid even if there was no
nexthop. Reset the flag in response to the update, and also make the
isvalid_nexthop functions a little more robust by checking the number
of nexthops.

Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
2021-04-23 11:20:52 +02:00
Donald Sharp
996319e63d bgpd: Address LL peer not NHT when receiving connection attempt
The new LL code in:
8761cd6ddb

Introduced the idea of the bgp unnumbered peers using interface up/down
events to track the bgp peers nexthop.  This code was not properly
working when a connection was received from a peer in some circumstances.

Effectively the connection from a peer was immediately skipping state transitions
and FRR was never properly tracking the peers nexthop.  When we receive the
connection attempt, let's track the nexthop now.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2021-04-15 13:16:28 -04:00
vivek
4115b2966b bgpd: Reset LLA NHT's interface if there is a change
For link-local IPv6 next hops, the next hop tracking is implemented based
on interface status changes. For this purpose, the ifindex is stored in
the NHT. Reset this value if a change in ifindex is noticed, such as for
example after a restart of the networking service.

Also add some additional debug logs.

Signed-off-by: Vivek Venkatraman <vivek@nvidia.com>
Updates: "bgpd: Switch LL nexthop tracking to be interface based"

Ticket: RM 2575386
Testing Done:
1. Manual verification
2. Precommit (#156), evpn-smoke (#155), bgp-smoke (#157), vrl (#158)
-- Precommit is clean, reported failures in evpn-smoke & vrl are resolved
-- some other tests fail in evpn-smoke, bgp-smoke & vrl, appear to be existing
-- or unrelated failures
2021-03-22 08:45:41 -04:00
Donald Sharp
474cfe4a6c bgpd: Set metric appropriately for the bnc for a v6 LL address
The v6 LL commit 8761cd6ddb

incorrectly was setting the metric value to 1 for the underlying
connected interface.  Modify the code to use a metric value of 0
instead of 1 that now represents the actual metric value that
was originally passed up.

This was noticed when the `show bgp ipv4 uni` command was
inserting a `(metric 1)` into output where before it was not.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2021-03-16 10:35:40 -04:00
Donald Sharp
e817f2ccbf bgpd: Fix crash when we don't have a nexthop
Recent changes to allow bgpd to handle v6 LL slightly
differently in the nexthop tracking code has not
interacted well with the blackhole nexthop change
for peers.  Modify the code to do the right thing

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2021-02-18 06:55:29 -05:00
Donatas Abraitis
830fd32903
Merge pull request #8041 from donaldsharp/v6_ll_interface
bgpd: Switch LL nexthop tracking to be interface based
2021-02-18 09:22:51 +02:00
Donald Sharp
8761cd6ddb bgpd: Switch LL nexthop tracking to be interface based
bgp is currently registering v6 LL as nexthops to be tracked
from zebra.  This presents several problems.

a) zebra does not properly track multiple prefixes that match
the same route properly at this point in time.
b) BGP was receiving nexthops that were just incorrect because
of (a).
c) When a nexthop changed that really didn't affect the v6 LL
we were responding incorrectly because of this

Modify the code such that bgp nexthop tracking notices that
we are trying to register a v6 LL.  When we do so, shortcut
and watch interface up/down events for this v6 LL and do
the work when an interface goes up / down for this type
of tracking.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2021-02-17 08:14:45 -05:00
Donald Sharp
824065c401 bgpd: Blackhole nexthops are not reachable
When bgp registers for a nexthop that is not reachable due
to the nexthop pointing to a blackhole, bgp is never going
to be able to reach it when attempting to open a connection.

Broken behavior:

<show bgp nexthop>
 192.168.161.204 valid [IGP metric 0], #paths 0, peer 192.168.161.204
  blackhole
  Last update: Thu Feb 11 09:46:10 2021

eva# show bgp ipv4 uni summ fail
BGP router identifier 10.10.3.11, local AS number 3235 vrf-id 0
BGP table version 40
RIB entries 78, using 14 KiB of memory
Peers 2, using 54 KiB of memory

Neighbor             EstdCnt DropCnt ResetTime Reason
192.168.161.204            0       0     never Waiting for peer OPEN

The log file fills up with this type of message:
2021-02-09T18:53:11.653433+00:00 nq-sjc6c-cor-01 bgpd[6548]: can't connect to 24.51.27.241 fd 26 : Invalid argument
2021-02-09T18:53:21.654005+00:00 nq-sjc6c-cor-01 bgpd[6548]: can't connect to 24.51.27.241 fd 26 : Invalid argument
2021-02-09T18:53:31.654381+00:00 nq-sjc6c-cor-01 bgpd[6548]: can't connect to 24.51.27.241 fd 26 : Invalid argument
2021-02-09T18:53:41.654729+00:00 nq-sjc6c-cor-01 bgpd[6548]: can't connect to 24.51.27.241 fd 26 : Invalid argument
2021-02-09T18:53:51.655147+00:00 nq-sjc6c-cor-01 bgpd[6548]: can't connect to 24.51.27.241 fd 26 : Invalid argument

As that the connect to a blackhole is correctly rejected by the kernel

Fixed behavior:

eva# show bgp ipv4 uni summ
BGP router identifier 10.10.3.11, local AS number 3235 vrf-id 0
BGP table version 40
RIB entries 78, using 14 KiB of memory
Peers 2, using 54 KiB of memory
Neighbor             V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd   PfxSnt Desc
annie(192.168.161.2) 4      64539    126264        39        0    0    0 00:01:36           38       40 N/A
192.168.161.178      4          0         0         0        0    0    0    never       Active        0 N/A
Total number of neighbors 2
eva# show bgp ipv4 uni summ fail
BGP router identifier 10.10.3.11, local AS number 3235 vrf-id 0
BGP table version 40
RIB entries 78, using 14 KiB of memory
Peers 2, using 54 KiB of memory
Neighbor             EstdCnt DropCnt ResetTime Reason
192.168.161.178            0       0     never Waiting for NHT
Total number of neighbors 2
eva# show bgp nexthop
Current BGP nexthop cache:
 192.168.161.2 valid [IGP metric 0], #paths 38, peer 192.168.161.2
  if enp39s0
  Last update: Thu Feb 11 09:52:05 2021
 192.168.161.131 valid [IGP metric 0], #paths 0, peer 192.168.161.131
  if enp39s0
  Last update: Thu Feb 11 09:52:05 2021
 192.168.161.178 invalid, #paths 0, peer 192.168.161.178
  Must be Connected
  Last update: Thu Feb 11 09:53:37 2021
eva#

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2021-02-13 10:10:19 -05:00
Donald Sharp
df2a41a9bf bgpd: Add bgp_nexthop_dump_bnc_change_flags function
Allow us to read what the change flags are instead of having
to look them up.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2021-01-29 07:54:58 -05:00
Donald Sharp
987a720a11 bgpd: Add bgp_nexthop_dump_bnc_flags
Add a function that allows us to see a string version of the
bnc->flags bit fields.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2021-01-29 07:54:58 -05:00
Pat Ruddy
4053e9520a bgpd: make sure nh is valid for MPLS vpn routes
If we are using a nexthop for a MPLS VPN route make sure the
nexthop is over a labeled path. This new check mirrors the one
in validate_paths (where routes are enabled when a nexthop
becomes reachable). The check is introduced to the code path
where routes are added and the nexthop is looked up.

Signed-off-by: Pat Ruddy <pat@voltanet.io>
2021-01-27 13:56:45 +00:00
Anuradha Karuppiah
8bcb09a18c bgpd: Use L3NHGs for symmetric IRB host routes
Two L3 next groups are installed per-VRF per-ES for v4 and v6. These
NHGs are used as an indirect destination for symmetric IRB host routes.

Using L3NHGs allows for efficient failover of an ES (similar to the
use of L2NHGs) i.e. when an ES goes down the number of dataplane
updates are limited to 2xN (where N is the number of tenant VRFs
associated with the ES) instead of updating all host-routes behind the
ES.

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-11-24 11:06:08 -08:00
Anuradha Karuppiah
c589d84746 bgpd: L3NHG infrastructure for host routes in EVPN
ES-VRF entries are maintained for the purpose of L3-NHG creation -
1. Each ES-EVI entry is associated with a tenant VRF. This associaton
triggers the creation of an ES-VRF entry.
2. Type-2/MAC-IP routes are imported into a tenant VRF and programmed as
a /32 or host route entry in the dataplane. If the destination of
the host route is a remote-ES the route is programmed with the
corresponding (keyed in by {vrf,ES-id}) L3-NHG.
3. The reason for this indirection (route->L3-NHG, L3-NHG->list-of-VTEPs)
is to avoid route updates to the dplane when a remote-ES link flaps i.e.
instead of updating all the dependent routes the NHG's contents are
updated. This reduces the amount of dataplane updates (fewer nhg updates vs.
route updates) allowing for a faster failover.

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
2020-11-24 11:06:08 -08:00
Mark Stapp
926bc58f78
Merge pull request #7478 from donaldsharp/buffer
Buffer
2020-11-18 08:30:47 -05:00
Donatas Abraitis
84c320dc01 bgpd: Use __func__ instead of hardcoded strings for some functions
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2020-11-17 13:32:15 +02:00
Donald Sharp
7cfdb48554 *: Convert all usage of zclient_send_message to new enum
The `enum zclient_send_status` enum needs to be extended
throughout the code base to use the new states and
to fix up places where we tested against the return
value being non zero.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2020-11-15 15:04:52 -05:00
Donatas Abraitis
2dbe669bdf :* Convert prefix2str to %pFX
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2020-10-22 09:07:41 +03:00
David Lamparter
56ca3b5b3a bgpd: add %pBD for printing struct bgp_dest *
`%pRN` is not appropriate anymore.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2020-10-17 08:52:35 -04:00
Quentin Young
6c83ddedcf *: make failure to decode nht update an error
This should never happen; no need to debug guard it and it's not a
warning, if this isn't working then NHT is not working at all.

Signed-off-by: Quentin Young <qlyoung@nvidia.com>
2020-09-30 18:37:15 -04:00
Quentin Young
f8dcd38ddb bgpd: rename bgp_fsm_event_update
This function is poorly named; it's really used to allow the FSM to
decide the next valid state based on whether a peer has valid /
reachable nexthops as determined by NHT or BFD.

Signed-off-by: Quentin Young <qlyoung@nvidia.com>
2020-09-17 12:45:37 -04:00
Pat Ruddy
e37e1e27e4 bgpd: do not unregister for prefix nexthop updates if nh exists
since the addition of srte_color to the comparison for bgp nexthops
it is possible to have several nexthops per prefix but since zebra
only sores a per prefix registration we should not unregister for
nh notifications for a prefix unti all the nexthops for that prefix
have been deleted. Otherwise we can get into a deadlock situation
where BGP thinks we have registered but we have unregistered from zebra.

Signed-off-by: Pat Ruddy <pat@voltanet.io>
2020-08-31 09:11:47 +00:00
Renato Westphal
545aeef1d1 bgpd: extend the NHT code to understand SR-TE colors
Extend the NHT code so that only the affected BGP routes are affected
whenever an SR-policy is updated on zebra.

Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
2020-08-31 09:11:03 +00:00
Renato Westphal
f663c5819c bgpd: convert NHT code to use rb-trees instead of routing tables
Fist, routing tables aren't the most appropriate data structure
to store nexthops and imported routes since we don't need to do
longest prefix matches with that information.

Second, by converting the NHT code to use rb-trees, we can index
the nexthops using additional information, not only the destination
address.  This will be useful later to index bgpd's nexthops by
both destination and SR-TE color.

Co-authored-by: Sebastien Merle <sebastien@netdef.org>
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
2020-08-31 09:09:05 +00:00
Philippe Guibert
1840384bae bgpd: flowspec code support for ipv6
until now, the assumption was done in bgp flowspec code that the
information contained was an ipv4 flowspec prefix. now that it is
possible to handle ipv4 or ipv6 flowspec prefixes, that information is
stored in prefix_flowspec attribute. Also, some unlocking is done in
order to process ipv4 and ipv6 flowspec entries.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2020-08-21 13:37:08 +02:00
Kaushik
92d6f76988 lib,zebra,bgpd: Fix for nexthop as IPv4 mapped IPv6 address
Added a macro to validate the v4 mapped v6 address.
Modified bgp receive & send updates for v4 mapped v6 address as
nexthop and installing it as recursive nexthop in RIB.
Minor change in fpm while sending the routes for nexthop as
v4 mapped v6 address.

Signed-off-by: Kaushik <kaushik@niralnetworks.com>
2020-08-03 23:24:04 -07:00
Donald Sharp
9bcb3eef54 bgp: rename bgp_node to bgp_dest
This is the bulk part extracted from "bgpd: Convert from `struct
bgp_node` to `struct bgp_dest`".  It should not result in any functional
change.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2020-06-23 17:32:52 +02:00
Sri Mohana Singamsetty
089116f8e6
Merge pull request #6456 from ton31337/fix/set_ipv6_ll_if_zero
bgpd: Use IPv6 LL address as nexthop if global was set to ::/LL
2020-06-02 09:08:05 -07:00
vivek
0139efe084 bgpd: During NHT change evaluation, skip inappropriate paths
When there is a NHT change and the paths dependent on that NHT are being
evaluated, skip those that are marked for removal or as history.

When a route gets withdrawn, its valid flag is cleared and it is flagged
for removal; in the case of an EVPN route, it is also unimported from
VRFs (L2 and/or L3). bgp_process is then scheduled. Under rare timing
conditions, an NHT update for the route's next hop may arrive right after,
and if routes flagged for removal are not skipped, they may not only be
incorrectly marked as valid but also re-imported in the case of EVPN,
which will be a serious error.

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
2020-05-25 14:17:12 -07:00
vivek
34ea39b65a bgpd: Check NHT change for triggering EVPN import or unimport
Ensure that only if there is a change to the path's validity based
on the NHT update, EVPN import or unimport is invoked.

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
2020-05-25 14:15:37 -07:00
vivek
9e15d76adf bgpd: Enhance NHT path evaluation debugs
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
2020-05-25 14:10:12 -07:00
Donatas Abraitis
606fdbb1fa bgpd: Use IPv6 LL address as nexthop if global was set to ::/LL
This happens between Bird and FRR. Maybe others as well, dunno.

Bird sends ::(fe80::1588) and we have a nexthop as :: which is inaccessible:

```
BGP routing table entry for fdff:b87d:f5b0::/48
Paths: (1 available, no best path)
  Not advertised to any peer
  4242421588 4242422547 4242422601 4242423605
    :: (inaccessible) from fe80::1588 (172.20.16.140)
    (fe80::1588) (used)
      Origin IGP, invalid, external
      Last update: Mon May 25 14:27:02 2020
```

bgpd[9554]: fe80::1588 went from OpenConfirm to Established
bgpd[9554]: fe80::1588 [FSM] Timer (routeadv timer expire)
bgpd[9554]: fe80::1588 rcvd UPDATE w/ attr: , origin i, mp_nexthop ::(fe80::1588)
bgpd[9554]: fe80::1588 rcvd UPDATE wlen 0 attrlen 120 alen 0
bgpd[9554]: fe80::1588 rcvd fda9:26a9:1c47:2d42::/64 IPv6 unicast
bgpd[9554]: Allocated bnc ::/128(VRF default) peer 0x0
bgpd[9554]: bgp_update(0.0.0.0): NH unresolved
bgpd[9554]: fe80::1588 rcvd fda9:26a9:1c47:d42::/64 IPv6 unicast

Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2020-05-25 17:37:10 +03:00
Donald Sharp
68cecc3b69 bgpd: Ensure that we have a ifp pointer
It is possible that the if_lookup_by_index() call will return
a NULL value and calling zclient_send_interface_radv_req.  Just
test that we have a valid interface pointer.

Found by Coverity

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-04-30 11:16:28 -04:00
Don Slice
b3a3290e23 bgpd: turn off RAs when numbered peers are deleted
Problem reported that in many circumstances, RAs created in the
process of bringing up numbered IPv6 peers with extended-nexthop
capability enabled (for ipv4 over ipv6) were not stopped on the
interface when those peers were deleted.  Found several circumstances
where this occurred and fix them in this patch.

Ticket: CM-26875
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
2020-04-27 17:49:41 +00:00
Naveen Thanikachalam
e7cbe5e599 bgpd: Force self-next-hop check in next-hop update.
Problem Description:
=====================
+--+                                            +--+
|R1|-(192.201.202.1)----iBGP----(192.201.202.2)-|R2|
+--+                                            +--+

Routes on R2:
=============
S>* 202.202.202.202/32 [1/0] via 192.201.78.1, ens256, 00:40:48
Where, the next-hop network, 192.201.78.0/24, is a directly connected network address.
C>* 192.201.78.0/24 is directly connected, ens256, 00:40:48

Configurations on R1:
=====================
!
router bgp 201
 bgp router-id 192.168.0.1
 neighbor 192.201.202.2 remote-as 201
!

Configurations on R2:
=====================
!
ip route 202.202.202.202/32 192.201.78.1
!
router bgp 201
 bgp router-id 192.168.0.2
 neighbor 192.201.202.1 remote-as 201
 !
 address-family ipv4 unicast
  redistribute static
 exit-address-family
!

Step-1:
=======
R1 receives the route 202.202.202.202/32 from R2.
R1 installs the route in its BGP RIB.

Step-2:
=======
On R1, a connected interface address is added.
The address is the same as the next-hop of the BGP route received from R2 (192.201.78.1).

Point of Failure:
=================
R1 resolves the BGP route even though the route's next-hop is its own connected address.
Even though this appears to be a misconfiguration it would still be better to safeguard the code against it.

Fix:
====
When BGP receives a connected route from Zebra, it processes the
routes for the next-hop update.
While doing so, BGP must ignore routes whose next-hop address matches
the address of the connected route for which Zebra sent the next-hop update
message.

Signed-off-by: NaveenThanikachalam <nthanikachal@vmware.com>
2020-04-11 07:26:33 -07:00
Donald Sharp
b54892e0ea bgpd: Convert users of rn->p to use accessor function
Add new function `bgp_node_get_prefix()` and modify
the bgp code base to use it.

This is prep work for the struct bgp_dest rework.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-03-26 16:25:16 -04:00
Donatas Abraitis
15569c58f8 *: Replace __PRETTY_FUNCTION__/__FUNCTION__ to __func__
Just keep the code cool.

Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2020-03-05 20:23:23 +02:00
Donald Sharp
8c9769e03b bgpd: Ensure we don't crash when registering RA's
There exists a code path that the ifp can be NULL.
Prevent an accident.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-02-14 15:35:37 -05:00
Donatas Abraitis
724935d5a2
Merge pull request #5789 from donaldsharp/bgp_ebgp_reason
bgpd: Update failed reason to distinguish some NHT scenarios
2020-02-11 10:42:23 +02:00
Donald Sharp
1e91f1d119 bgpd: Update failed reason to distinguish some NHT scenarios
Current failed reasons for bgp when you have a peer that
is not online yet is `Waiting for NHT`, even if NHT has
succeeded.  Add some code to differentiate this.

eva# show bgp ipv4 uni summ failed
BGP router identifier 192.168.201.135, local AS number 3923 vrf-id 0
BGP table version 0
RIB entries 0, using 0 bytes of memory
Peers 2, using 43 KiB of memory
Neighbor        EstdCnt DropCnt ResetTime Reason
192.168.44.1          0       0    never  Waiting for NHT
192.168.201.139       0       0    never  Waiting for Open to Succeed
Total number of neighbors 2
eva#

eva# show bgp nexthop
Current BGP nexthop cache:
 192.168.44.1 invalid, peer 192.168.44.1
  Must be Connected
  Last update: Mon Feb 10 19:05:19 2020

 192.168.201.139 valid [IGP metric 0], #paths 0, peer 192.168.201.139

So 192.168.201.139 is a peer for a connected route that has not been
created on .139, while 44.1 nexthop tracking has not succeeded yet.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-02-10 19:46:48 -05:00
Donatas Abraitis
892fedb611 bgpd: Replace bgp_flag_* to [UN]SET/CHECK_FLAG macros
Most of the code uses macros, thus let's keep the code unified.

Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2020-02-06 17:11:38 +02:00
Chirag Shah
65f803e80a bgpd: skip ra for blackhole nexthop type
bgp nexthop cache update triggers RA for global ipv6
nexthop update.
In case of blackhole route type the outgoing interface
information is NULL which leads to bgpd crash.

Skip sending RA for blackhole nexthop type.

Ticket:CM-27299
Reviewed By:
Testing Done:

Configure bgp neighbor over global ipv6 address.
Configure static blackhole route with prefix includes
connected ipv6 global address.
Upon link flap, zebra sends nexthop update to bgp.
Bgp nexthop cache skips sending RA for blackhole nexthop type.

router bgp 65002
 bgp router-id 91.189.93.190
 ...
 neighbor 2001:67c:1360::b peer-group internal

static route:
ipv6 route 2001:67c:1360::/48 Null0 254

iface rowlink.4010
        address 91.189.93.190/32
        address 2001:67c:1360::a/128

Trigger ifdown rowlink.4010; ifup rowlink.4010

Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
2019-12-29 22:16:51 -08:00
Ameya Dharkar
7c312383ba bgpd: Add nexthop of received EVPN RT-5 for nexthop tracking
Problem statement:
When IPv4/IPv6 prefixes are received in BGP, bgp_update function registers the
nexthop of the route with nexthop tracking module. The BGP route is marked as
valid only if the nexthop is resolved.

Even for EVPN RT-5, route should be marked as valid only if the the nexthop is
resolvable.

Code changes:
1. Add nexthop of EVPN RT-5 for nexthop tracking. Route will be marked as valid
only if the nexthop is resolved.
2. Only the valid EVPN routes are imported to the vrf.
3. When nht update is received in BGP, make sure that the EVPN routes are
imported/unimported based on the route becomes valid/invalid.

Testcases:
1. At rtr-1, advertise EVPN RT-5 with a nexthop 10.100.0.2.
10.100.0.2 is resolved at rtr-2 in default vrf.
At rtr-2, remote EVPN RT-5 should be marked as valid and should be imported into
vrfs.

2. Make the nexthop 10.100.0.2 unreachable at rtr-2
Remote EVPN RT-5 should be marked as invalid and should be unimported from the
vrfs. As this code change deals with EVPN type-5 routes only, other EVPN routes
should be valid.

3. At rtr-2, add a static route to make nexthop 10.100.0.2 reachable.
EVPN RT-5 should again become valid and should be imported into the vrfs.

Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
2019-11-15 10:15:14 -08:00
Donald Sharp
a5f271c635
Merge pull request #5299 from ton31337/fix/remove_dead_code
bgpd: Remove not used bgp_find_nexthop() function
2019-11-11 07:57:09 -05:00
Donatas Abraitis
a78d1c77fe bgpd: Remove not used bgp_find_nexthop() function
Seems like a dead code.

Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2019-11-08 15:04:29 +02:00
Donald Sharp
8c1a4c1041 bgpd: use bgp->name_pretty in debugs and add vrf to some output
Recently had a case where I was attempting to debug a nexthop tracking
issue across multiple bgp vrf's and since the setup vrf's in it with
overlapping address ranges, it became real fun real fast to track
vrf data associated.  Add a bit of code to allow us to figure out
what vrf we are in when we print out debug messages.

Look through the rest of the code and find debugs where we are
not using bgp->name_pretty and switch it over.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-11-07 07:20:41 -05:00
Quentin Young
2951a7a4c2 *: s/TRUE/true/, s/FALSE/false/
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2019-07-01 17:26:05 +00:00
Stephen Worley
78fba41bd8 lib,zebra,bgpd: Remove nexthop_same_no_recurse()
The functions nexthop_same() does not check the resolved
nexthops so I don't think this function is even needed
anymore.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-05-23 12:21:15 -04:00
Philippe Guibert
fc04a6778e bgpd: improve reconnection mechanism by cancelling connect timers
if bfd comes back up, and a bgp reconnection is in progress, theorically
it should be necessary to wait for the end of the reconnection process.
however, since that reconnection process may take some time, update the
fsm by cancelling the connect timer. This done, one just have to call
the start timer.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2019-04-18 16:11:51 +02:00
Chirag Shah
1eb6c3eae6 *: do not register nexthop 0.0.0.0 to nht
Avoid tracking 0.0.0.0/32 nexthop with RIB.

When routes are aggregated,
the originate of the route becomes self.
Do not track nexthop self (0.0.0.0) with rib.

Ticket: CM-24248
Testing Done:

Before fix-

tor-11# show ip nht vrf all

VRF blue:
0.0.0.0
 unresolved
 Client list: bgp(fd 16)

VRF default:

VRF green:

VRF magenta:
0.0.0.0
 unresolved
 Client list: bgp(fd 16)

After fix-

tor-11# show ip nht vrf all

VRF blue:

VRF default:

VRF green:

VRF magenta:

Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
2019-04-03 11:17:57 -07:00
Donald Sharp
e6cc3dc98b
Merge pull request #3415 from pguibert6WIND/flowspec_support_nh_tracking
Flowspec support nh tracking
2019-01-09 15:41:16 -05:00