Commit Graph

198 Commits

Author SHA1 Message Date
Donald Sharp
c2cedabed1
Revert "bgpd: combine import_check_table and nexthop_check_table" 2023-11-06 10:07:58 -05:00
Donald Sharp
4a43b81d7c bgpd: combine import_check_table and nexthop_check_table
In zebra, the import check table and the nexthop check tables
were combined.  This leaves an issue where when bgp happens
to have a tracked address in both the import check table
and the nexthop track table that are the same address.
When the the item is removed from one table the call
to remove it from zebra removes tracking for the other
table.

Combine the two tables together and keep track where
they came from for processing in bgpd.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-10-25 11:55:01 -04:00
Mark Stapp
8527084488 bgpd: replace ctime with ctime_r
No ctime, use ctime_r.

Signed-off-by: Mark Stapp <mjs@labn.net>
2023-09-19 16:25:01 -04:00
Donatas Abraitis
75dbd45c55
Merge pull request #14383 from donaldsharp/bgp_coverity_cleanup_early_sept
Bgp coverity cleanup early sept
2023-09-13 21:52:37 +03:00
Donald Sharp
493075d25b bgpd: bgp_connected_delete needs to ensure dest is still there
Again coverity believes that dest could be freed by a call
into bgp_dest_unlock_node, and it can if the lock count
is wrong.  Let's fix that assumption for coverity

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-09-11 12:45:59 -04:00
Donald Sharp
0c3a70c644 bgpd: Move the peer->su to connection->su
The sockunion is per connection.  So let's move it over.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-09-10 08:31:25 -04:00
Donald Sharp
b2f25e1a17 bgpd: First pass of BGP_EVENT_ADD
Pass through a bunch of BGP_EVENT_ADD's and make
the code use a proper connection instead of a
peer->connection.  There still are a bunch
of places where peer->connection is used and
later commits will probably go through and
clean these up more.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-09-10 08:31:25 -04:00
Donald Sharp
d2ba78929f bgpd: bgp_fsm_change_status/BGP_TIMER_ON and BGP_EVENT_ADD
Modify bgp_fsm_change_status to be connection oriented and
also make the BGP_TIMER_ON and BGP_EVENT_ADD macros connection
oriented as well.  Attempt to make peer_xfer_conn a bit more
understandable because, frankly it was/is confusing.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-09-10 08:31:25 -04:00
Donald Sharp
7b1158b169 bgpd: peer_established should be connection oriented
The peer_established function should be connection oriented.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-09-10 08:31:25 -04:00
Philippe Guibert
1b34877af6 bgpd: rename bnc->ifindex to bnc->ifindex_ipv6_ll
This commit changes the 'ifindex' name of the bnc structure.
As it is used only to handle ipv6 link local addresses, let
us use the 'ifindex_ipv6_ll' naming to avoid any confusions
with the ifindex value of the resolved next-hops of the bnc
structure.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-07-13 12:05:15 +02:00
Philippe Guibert
1069425868 bgpd: allocate label bound to received mpls vpn routes
Current implementation does not offer a new label to bind
to a received VPN route entry to redistribute with that new
label.

This commit allocates a label for VPN entries that have
a valid label, and a reachable next-hop interface that is
configured as follows:

> interface eth0
>  mpls bgp l3vpn-multi-domain-switching
> exit

An mplsvpn next-hop label binding entry is created in an mpls
vpn nexthop label bind hash table of the current BGP instance.
That mpls vpn next-hop label entry is indexed by the (next-hop,
orig_label) values provided by the incoming updates, and shared
with other updates having the same (next-hop, orig_label) values.

A new 'LP_TYPE_BGP_L3VPN_BIND' label value is picked up from the
zebra mpls label pool, and assigned to the new_label attribute.

The 'bgp_path_info' appends a 'bgp_mplsvpn_nh_label_bind' structure
to the 'mplsvpn' union structure. Both structures in the union are not
used at the same, as the paths are either VRF updates to export, or MPLS
VPN updates. Using an union gives a 24 bytes memory gain compared to if
the structures had not been in an union (24 bytes compared to 48 bytes).

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-06-16 10:54:58 +02:00
Donald Sharp
b295810d00
Revert "bgpd: upon if up event, evaluate bnc with matching nexthop" 2023-06-08 23:17:53 -04:00
Philippe Guibert
713831fa7f bgpd: rename bnc->ifindex to bnc->ifindex_ipv6_ll
This commit changes the 'ifindex' name of the bnc structure.
As it is used only to handle ipv6 link local addresses, let
us use the 'ifindex_ipv6_ll' naming to avoid any confusions
with the ifindex value of the resolved next-hops of the bnc
structure.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-05-26 08:33:50 +02:00
Philippe Guibert
577be36a41 bgpd: add support for l3vpn per-nexthop label
This commit introduces a new method to associate a label to
prefixes to export to a VPNv4 backbone. All the methods to
associate a label to a BGP update is documented in rfc4364,
chapter 4.3.2. Initially, the "single label for an entire
VRF" method was available. This commit adds "single label
for each attachment circuit" method.

The change impacts the control-plane, because each BGP update
is checked to know if the nexthop has reachability in the VRF
or not. If this is the case, then a unique label for a given
destination IP in the VRF will be picked up. This label will
be reused for an other BGP update that will have the same
nexthop IP address.

The change impacts the data-plane, because the MPLs pop
mechanism applied to incoming labelled packets changes: the
MPLS label is popped, and the packet is directly sent to the
connected nexthop described in the previous outgoing BGP VPN
update.

By default per-vrf mode is done, but the user may choose
the per-nexthop mode, by using the vty command from the
previous commit. In the latter case, a per-vrf label
will however be allocated to handle networks that are not directly
connected. This is the case for local traffic for instance.

The change also include the following:

-  ECMP case
In case a route is learnt in a given VRF, and is resolved via an
ECMP nexthop. This implies that when exporting the route as a BGP
update, if label allocation per nexthop is used, then two possible
MPLS values could be picked up, which is not possible with the
current implementation. Actually, the NLRI for VPNv4 stores one
prefix, and one single label value, not two. Today, RFC8277 with
multiple label capability is not yet available.
To avoid this corner case, when a route is resolved via more than one
nexthop, the label allocation per nexthop will not apply, and the
default per-vrf label will be chosen.
Let us imagine BGP redistributes a static route using the `172.31.0.20`
nexthop. The nexthop resolution will find two different nexthops fo a
unique BGP update.

 > r1# show running-config
 > [..]
 > vrf vrf1
 >  ip route 172.31.0.30/32 172.31.0.20
 > r1# show bgp vrf vrf1 nexthop
 > [..]
 > 172.31.0.20 valid [IGP metric 0], #paths 1
 >  gate 192.0.2.11
 >  gate 192.0.2.12
 >  Last update: Mon Jan 16 09:27:09 2023
 >  Paths:
 >    1/1 172.31.0.30/32 VRF vrf1 flags 0x20018

To avoid this situation, BGP updates that resolve over multiple
nexthops are using the unique per-vrf label.

- recursive route case

Prefixes that need a recursive route to be resolved can
also be eligible for mpls allocation per nexthop. In that
case, the nexthop will be the recursive nexthop calculated.

To achieve this, all nexthop types in bnc contexts are valid,
except for the blackhole nexthops.

- network declared prefixes

Nexthop tracking is used to look for the reachability of the
prefixes. When the the 'no bgp network import-check' command
is used, network declared prefixes are maintained active,
even if there is no active nexthop.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-05-09 21:00:57 +02:00
Donatas Abraitis
786e2b8bdb Revert "MPLS allocation mode per next hop"
Broken tests, let's revert now.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-05-03 13:52:46 +03:00
Donatas Abraitis
99a1ab0b21
Merge pull request #12646 from pguibert6WIND/mpls_alloc_per_nh
MPLS allocation mode per next hop
2023-05-02 18:36:45 +03:00
Philippe Guibert
20d072d3ec bgpd: keep interface index on bgp nexthop tracking
The following BGP configuration does not show that the
resolved next-hop to 192.0.2.1 has a defined interface.

> router bgp 65500
>  bgp router-id 192.0.2.2
>  neighbor 192.0.2.1 remote-as 65500
>  neighbor 192.0.2.1 update-source loop1
>  neighbor 192.168.0.1 remote-as 65500
>  !
>  address-family ipv4 unicast
>   network 192.0.2.2/32
>   no neighbor 192.168.0.1 activate
>  exit-address-family
>  !
>  address-family ipv4 labeled-unicast
>   neighbor 192.168.0.1 activate
>  exit-address-family
>  !
>  address-family ipv4 vpn
>   neighbor 192.0.2.1 activate
>  exit-address-family

The 'show bgp nexthop' dump does not output the interface
whereas the zebra rnh has the information.

> dut-vm# show bgp nexthop
> [..]
> Current BGP nexthop cache:
>  192.0.2.1 valid [IGP metric 0], #paths 1, peer 192.0.2.1
>   gate 192.168.0.1
>   Last update: Mon Apr 24 22:10:07 2023
>
> dut-vm# show ip nht
> 192.0.2.1
>  resolved via bgp
>  via 192.168.0.1, r2-eth0
>  Client list: bgp(fd 33)

Modify the display of BGP nexthop tracking to also dump
the interface used:

> dut-vm# show bgp nexthop
> [..]
> Current BGP nexthop cache:
>  192.0.2.1 valid [IGP metric 0], #paths 1, peer 192.0.2.1
>   gate 192.168.0.1, r2-eth0
>   Last update: Mon Apr 24 22:10:07 2023

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-04-27 17:04:20 +02:00
Donald Sharp
24a58196dd *: Convert event.h to frrevent.h
We should probably prevent any type of namespace collision
with something else.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24 08:32:17 -04:00
Donald Sharp
cb37cb336a *: Rename thread.[ch] to event.[ch]
This is a first in a series of commits, whose goal is to rename
the thread system in FRR to an event system.  There is a continual
problem where people are confusing `struct thread` with a true
pthread.  In reality, our entire thread.c is an event system.

In this commit rename the thread.[ch] files to event.[ch].

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24 08:32:16 -04:00
Philippe Guibert
92d5e31ace bgpd: add support for l3vpn per-nexthop label
This commit introduces a new method to associate a label to
prefixes to export to a VPNv4 backbone. All the methods to
associate a label to a BGP update is documented in rfc4364,
chapter 4.3.2. Initially, the "single label for an entire
VRF" method was available. This commit adds "single label
for each attachment circuit" method.

The change impacts the control-plane, because each BGP update
is checked to know if the nexthop has reachability in the VRF
or not. If this is the case, then a unique label for a given
destination IP in the VRF will be picked up. This label will
be reused for an other BGP update that will have the same
nexthop IP address.

The change impacts the data-plane, because the MPLs pop
mechanism applied to incoming labelled packets changes: the
MPLS label is popped, and the packet is directly sent to the
connected nexthop described in the previous outgoing BGP VPN
update.

By default per-vrf mode is done, but the user may choose
the per-nexthop mode, by using the vty command from the
previous commit. In the latter case, a per-vrf label
will however be allocated to handle networks that are not directly
connected. This is the case for local traffic for instance.

The change also include the following:

-  ECMP case
In case a route is learnt in a given VRF, and is resolved via an
ECMP nexthop. This implies that when exporting the route as a BGP
update, if label allocation per nexthop is used, then two possible
MPLS values could be picked up, which is not possible with the
current implementation. Actually, the NLRI for VPNv4 stores one
prefix, and one single label value, not two. Today, RFC8277 with
multiple label capability is not yet available.
To avoid this corner case, when a route is resolved via more than one
nexthop, the label allocation per nexthop will not apply, and the
default per-vrf label will be chosen.
Let us imagine BGP redistributes a static route using the `172.31.0.20`
nexthop. The nexthop resolution will find two different nexthops fo a
unique BGP update.

 > r1# show running-config
 > [..]
 > vrf vrf1
 >  ip route 172.31.0.30/32 172.31.0.20
 > r1# show bgp vrf vrf1 nexthop
 > [..]
 > 172.31.0.20 valid [IGP metric 0], #paths 1
 >  gate 192.0.2.11
 >  gate 192.0.2.12
 >  Last update: Mon Jan 16 09:27:09 2023
 >  Paths:
 >    1/1 172.31.0.30/32 VRF vrf1 flags 0x20018

To avoid this situation, BGP updates that resolve over multiple
nexthops are using the unique per-vrf label.

- recursive route case

Prefixes that need a recursive route to be resolved can
also be eligible for mpls allocation per nexthop. In that
case, the nexthop will be the recursive nexthop calculated.

To achieve this, all nexthop types in bnc contexts are valid,
except for the blackhole nexthops.

- network declared prefixes

Nexthop tracking is used to look for the reachability of the
prefixes. When the the 'no bgp network import-check' command
is used, network declared prefixes are maintained active,
even if there is no active nexthop.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-03-22 12:06:29 +01:00
Donald Sharp
d8bc11a592 *: Add a hash_clean_and_free() function
Add a hash_clean_and_free() function as well as convert
the code to use it.  This function also takes a double
pointer to the hash to set it NULL.  Also it cleanly
does nothing if the pointer is NULL( as a bunch of
code tested for ).

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-21 08:54:21 -04:00
Russ White
ba755d35e5
Merge pull request #12248 from pguibert6WIND/bgpasdot
lib, bgp: add initial support for asdot format
2023-02-21 08:01:03 -05:00
Philippe Guibert
4a8cd6ad7f bgpd: support for as notation format for route distinguisher
RD may be built based on an AS number. Like for the AS, the RD
may use the AS notation. The two below examples can illustrate:

RD 1.1:20 stands for an AS4B:NN RD with AS4B=65536 in dot format.
RD 0.1:20 stands for an AS2B:NNNN RD with AS2B=0.1 in dot+ format.

This commit adds the asnotation mode to prefix_rd2str() API so as
to pick up the relevant display.

Two new printfrr extensions are available to display the RD with
the two above display methods.
- The pRDD extension stands for dot asnotation format
- The pRDE extension stands for dot+ asnotation format.
- The pRD extension has been renamed to pRDP extension

The code is changed each time '%pRD' printf extension is called.
Possibly, the asnotation may change the output, then a macro defines
the asnotation mode to use. A side effect of forging the mode to
use is that the string could not be concatenated with other strings
in vty_out and snprintfrr. Those functions have been called multiple
times. When zlog_debug needs to display the RD with some other string,
the prefix_rd2str() old API is used instead of the printf extension.

Some code has been kept untouched:
- code related to running-config. Actually, wherever an RD is displayed,
its configured name should be dumped.
- bgp rfapi code
- bgp evpn multihoming code (partially done), since the logic is
missing to get the asnotation of 'struct bgp_evpn_es'.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-02-10 10:27:23 +01:00
David Lamparter
acddc0ed3c *: auto-convert to SPDX License IDs
Done with a combination of regex'ing and banging my head against a wall.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2023-02-09 14:09:11 +01:00
Trey Aspelund
826c3f6db3 bgpd: only unimport routes if tunnel-ip changes
When processing a new local VNI, we were always walking the global EVPN
table to look for routes that needed to be removed due to a martian
nexthop change (specifically a tunnel-ip change).
Since the martian TIP table is global (all VNIs) + the walk is also in
the global table (all VNIs), we can trust that any new TIP from any VNI
would result in routes getting removed from the global table and
unimported from all live (L2)VNIs.
i.e.
The only time this update is actionable is if we are adding/removing an
IP from the martian TIP table, and we do not need to walk the table for
normal refcount adjustments.

Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
2023-01-27 11:11:44 -05:00
Donatas Abraitis
a8adf1b3cb
Merge pull request #12573 from Pdoijode/bgp-nexthop-json-changes
Bgp nexthop json changes
2023-01-07 21:01:06 +02:00
Pooja Jagadeesh Doijode
071ec807cb bgpd: AFI option to query nexthops based on AFI
Added ipv4 and ipv6 option to existing "show bgp nexthop"
command to be able to query nexthops that belong to a
particular address-family.

Also fixed the warnings of MR 12171

Signed-off-by: Pooja Jagadeesh Doijode <pdoijode@nvidia.com>
2023-01-04 18:53:12 -08:00
Pooja Jagadeesh Doijode
8da79d08ad bgpd: Detail option for nexthop and import-check to display paths
1. Updated "show bgp vrf <vrf-name> nexthop detail"
   and "show bgp vrf <vrf-name> import-check-table
   detail" show commands to display paths associated with
   nexthop. "detail" option was previously unused.
2. Added 'ipv4' and 'ipv6' JSON object under top level JSON.
3. Removed the "nexthops" JSON object which was under the top
   level JSON object
4. Renamed "ifname" to "interfaceName"
5. Renamed "gates" JSON obejct to "nexthops"
6. Changed "flags" JSON array to JSON object and changed the
   flags from string to boolean
7. "lastUpdate" will display only epoch time for "detail" option

JSON output:
r4# show bgp vrf default nexthop detail json
{
  "ipv4":{
    "10.0.7.1":{
      "valid":true,
      "complete":true,
      "igpMetric":0,
      "pathCount":3,
      "peer":"10.0.7.1",
      "nexthops":[
        {
          "interfaceName":"r4-r2-eth0"
        }
      ],
      "lastUpdate":1672265350,
      "paths":[
        {
          "afi":"IPv4",
          "safi":"unicast",
          "prefix":"11.0.20.2/32",
          "vrf":"default",
          "flags":{
            "igpChanged":false,
            "damped":false,
            "history":false,
            "bestpath":true,
            "valid":true,
            "attrChanged":false,
            "deterministicMedCheck":false,
            "deterministicMedSelected":false,
            "stale":false,
            "removed":false,
            "counted":true,
            "multipath":false,
            "multipathChanged":false,
            "ribAttributeChanged":false,
            "nexthopSelf":false,
            "linkBandwidthChanged":false,
            "acceptOwn":false
          }
        }
      ]
    }
  }
}
}

Signed-off-by: Pooja Jagadeesh Doijode <pdoijode@nvidia.com>
2022-12-28 14:13:32 -08:00
Pooja Jagadeesh Doijode
bf85e4c5f1 bgpd: add json option to show commands in bgp_nexthop
Commands with json option:
 - show bgp nexthop
 - show bgp import-check-table

Example output below, "nexthop" and "import-check-table" are only
different in the nexthop entries, the format is the same
```
leaf-A# show bgp nexthop 10.11.10.1 detail json
{
  "nexthops":{
    "10.11.10.1":{
      "valid":true,
      "complete":true,
      "igpMetric":0,
      "pathCount":1,
      "peer":"10.11.10.1",
      "gates":[
        {
          "ifname":"eth1"
        }
      ],
      "lastUpdate":{
        "epoch":1669161758,
        "string":"Wed Nov 23 00:02:38 2022\n"
      },
      "paths":[
        {
          "afi":"IPv4",
          "safi":"unicast",
          "prefix":"192.168.11.0/24",
          "vrf":"default",
          "flags":[
            "valid",
            "dmedSelected",
            "counted"
          ]
        }
      ]
    }
  }
}
leaf-A# show bgp nexthop json
{
  "nexthops":{
    "10.10.10.1":{
      "valid":true,
      "complete":true,
      "igpMetric":0,
      "pathCount":1,
      "peer":"10.10.10.1",
      "gates":[
        {
          "ifname":"eth0"
        }
      ],
      "lastUpdate":{
        "epoch":1669161758,
        "string":"Wed Nov 23 00:02:38 2022\n"
      }
    },
    "10.11.10.1":{
      "valid":true,
      "complete":true,
      "igpMetric":0,
      "pathCount":1,
      "peer":"10.11.10.1",
      "gates":[
        {
          "ifname":"eth1"
        }
      ],
      "lastUpdate":{
        "epoch":1669161758,
        "string":"Wed Nov 23 00:02:38 2022\n"
      }
    }
  }
}

```

Signed-off-by: Yaroslav Fedoriachenko <yar.fed99@gmail.com>
2022-12-28 13:43:50 -08:00
Donald Sharp
1c225152c0 bgpd: bgp_connected_add memory was being leaked in some cases
On shutdown, bgp calls an unlock for the bnc connected table,
via the bgp_connected_cleanup function.  This function is
only ever called on shutdown, so we know that bgp is going
away.  The refcount for the connected data can be more than
1.  Let's not worry about the refcount on shutdown and
just delete the nodes instead of leaving them around.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-12-22 08:15:52 -05:00
Donatas Abraitis
073801481b bgpd: inet_ntop() adjustments
Use %pI4/%pI6 where possible, otherwise at least atjust stack buffer sizes
for inet_ntop() calls.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-11-29 17:36:13 +02:00
Philippe Guibert
8c6a164f40 bgpd: improve 'show bgp nexthop' command
- for a given IP nexthop, dump all NH entries, including
colored entries, or entries with an ifindex.
- when a given IP nexthop is requested, the path is displayed.
For better readibility, remove the carriage return between
'Last update' and 'Paths', because ctime() function already
performs carriage return.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2022-10-05 11:12:59 +02:00
Donald Sharp
a8cc325f59 bgpd: Fix show bgp nexthop A.B.C.D
The issuing of `show bgp nexthop A.B.C.D` fails even if that
nexthop exists:

eva# show bgp nexthop 192.168.119.120
specified nexthop does not have entry

Fixed:

eva# show bgp nexthop 192.168.119.120
 192.168.119.120 valid [IGP metric 0], #paths 0, peer 192.168.119.120
  if enp39s0
  Last update: Fri Sep 30 14:55:13 2022

  Paths:

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-09-30 14:58:21 -04:00
Donatas Abraitis
c4f64ea94d bgpd: Use %pRD for prefix_rd2str()
Convert a bunch of prefix_rd2str() for json/vty stuff.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-09-22 13:12:11 +03:00
Donatas Abraitis
036f482fce bgpd: Drop bnc_str() function
Reuse %pFX -> prefix2str()

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-08-25 14:35:28 +03:00
Donald Sharp
083ec940ab bgpd: Convert from bgp_clock() to monotime()
Let's convert to our actual library call instead
of using yet another abstraction that makes it fun
for people to switch daemons.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-08-24 08:23:40 -04:00
Donald Sharp
35aae5c9bc bgpd: LL peers need bnc's per peer
FRR should create a bnc per peer.  Not have
one's that write over others.  Currently when
FRR has multiple Interface based peering, BGP wa
creating a single BNC.  This is insufficient in that
we were accidently overwriting the one LL with other
data.  This causes issues when there are multiple and
there is weird starting issues with those interfaces
that you are peering over.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-07-22 09:09:39 -04:00
Donatas Abraitis
6006b807b1 *: Properly use memset() when zeroing
Wrong: memset(&a, 0, sizeof(struct ...));
    Good:  memset(&a, 0, sizeof(a));

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-05-11 14:08:47 +03:00
Donatas Abraitis
8998807f69 *: Avoid casting to the same type as on the left
Just not necessary.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-05-08 16:07:42 +03:00
anlan_cs
8e3aae66ce *: remove the checking returned value for hash_get()
Firstly, *keep no change* for `hash_get()` with NULL
`alloc_func`.

Only focus on cases with non-NULL `alloc_func` of
`hash_get()`.

Since `hash_get()` with non-NULL `alloc_func` parameter
shall not fail, just ignore the returned value of it.
The returned value must not be NULL.
So in this case, remove the unnecessary checking NULL
or not for the returned value and add `void` in front
of it.

Importantly, also *keep no change* for the two cases with
non-NULL `alloc_func` -
1) Use `assert(<returned_data> == <searching_data>)` to
   ensure it is a created node, not a found node.
   Refer to `isis_vertex_queue_insert()` of isisd, there
   are many examples of this case in isid.
2) Use `<returned_data> != <searching_data>` to judge it
   is a found node, then free <searching_data>.
   Refer to `aspath_intern()` of bgpd, there are many
   examples of this case in bgpd.

Here, <returned_data> is the returned value from `hash_get()`,
and <searching_data> is the data, which is to be put into
hash table.

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2022-05-03 00:41:48 +08:00
Ameya Dharkar
021b659665 bgpd: EVPN route type-5 to type-2 recursive resolution using gateway IP
When EVPN prefix route with a gateway IP overlay index is imported into the IP
vrf at the ingress PE, BGP nexthop of this route is set to the gateway IP.
For this vrf route to be valid, following conditions must be met.
- Gateway IP nexthop of this route should be L3 reachable, i.e., this route
  should be resolved in RIB.
- A remote MAC/IP route should be present for the gateway IP address in the
  EVI(L2VPN table).

To check for the first condition, gateway IP is registered with nht (nexthop
tracking) to receive the reachability notifications for this IP from zebra RIB.
If the gateway IP is reachable, zebra sends the reachability information (i.e.,
nexthop interface) for the gateway IP.
This nexthop interface should be the SVI interface.

Now, to find out type-2 route corresponding to the gateway IP, we need to fetch
the VNI for the above SVI.

To do this VNI lookup effitiently, define a hashtable of struct bgpevpn with
svi_ifindex as key.

struct hash *vni_svi_hash;

An EVI instance is added to vni_svi_hash if its svi_ifindex is nonzero.

Using this hash, we obtain struct bgpevpn corresponding to the gateway IP.

For gateway IP overlay index recursive lookup, once we find the correct EVI, we
have to lookup its route table for a MAC/IP prefix. As we have to iterate the
entire route table for every lookup, this lookup is expensive. We can optimize
this lookup by adding all the remote IP addresses in a hash table.

Following hash table is defined for this purpose in struct bgpevpn
Struct hash *remote_ip_hash;

When a MAC/IP route is installed in the EVI table, it is also added to
remote_ip_hash.

It is possible to have multiple MAC/IP routes with the same IP address because
of host move scenarios. Thus, for every address addr in remote_ip_hash, we
maintain list of all the MAC/IP routes having addr as their IP address.

Following structure defines an address in remote_ip_hash.
struct evpn_remote_ip {
        struct ipaddr addr;
        struct list *macip_path_list;
};

A Boolean field is added to struct bgp_nexthop_cache to indicate that the
nexthop is EVPN gateway IP overlay index.

bool is_evpn_gwip_nexthop;

A flag BGP_NEXTHOP_EVPN_INCOMPLETE is added to struct bgp_nexthop_cache.

This flag is set when the gateway IP is L3 reachable but not yet resolved by a
MAC/IP route.

Following table explains the combination of L3 and L2 reachability w.r.t.
BGP_NEXTHOP_VALID and BGP_NEXTHOP_EVPN_INCOMPLETE flags

*                | MACIP resolved | MACIP unresolved
*----------------|----------------|------------------
* L3 reachable   | VALID      = 1 | VALID      = 0
*                | INCOMPLETE = 0 | INCOMPLETE = 1
* ---------------|----------------|--------------------
* L3 unreachable | VALID      = 0 | VALID      = 0
*                | INCOMPLETE = 0 | INCOMPLETE = 0

Procedure that we use to check if the gateway IP is resolvable by a MAC/IP
route:
- Find the EVI/L2VRF that belongs to the nexthop SVI using vni_svi_hash.
- Check if the gateway IP is present in remote_ip_hash in this EVI.

When the gateway IP is L3 reachable and it is also resolved by a MAC/IP route,
unset BGP_NEXTHOP_EVPN_INCOMPLETE flag and set BGP_NEXTHOP_VALID flag.

Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
2021-06-07 17:59:45 -07:00
Donald Sharp
feb1723846 bgpd: Convert to using peer_established(peer) function
We are inconsistently using peer_establiahed(peer) with
sometimes using `peer->status == Established`.  Just Convert
over to using the function for consistency.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2021-06-07 10:48:36 -04:00
Donald Sharp
8761cd6ddb bgpd: Switch LL nexthop tracking to be interface based
bgp is currently registering v6 LL as nexthops to be tracked
from zebra.  This presents several problems.

a) zebra does not properly track multiple prefixes that match
the same route properly at this point in time.
b) BGP was receiving nexthops that were just incorrect because
of (a).
c) When a nexthop changed that really didn't affect the v6 LL
we were responding incorrectly because of this

Modify the code such that bgp nexthop tracking notices that
we are trying to register a v6 LL.  When we do so, shortcut
and watch interface up/down events for this v6 LL and do
the work when an interface goes up / down for this type
of tracking.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2021-02-17 08:14:45 -05:00
Donald Sharp
df2a41a9bf bgpd: Add bgp_nexthop_dump_bnc_change_flags function
Allow us to read what the change flags are instead of having
to look them up.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2021-01-29 07:54:58 -05:00
Donald Sharp
987a720a11 bgpd: Add bgp_nexthop_dump_bnc_flags
Add a function that allows us to see a string version of the
bnc->flags bit fields.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2021-01-29 07:54:58 -05:00
Donald Sharp
2a99175f8d bgpd: Shorten some show memory strings
Some of the `show memory` strings in bgp are longer than the
columns we have allocated for it.  Shorten some strings to
make them fit and have the output pleasing to the eye.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2020-11-12 07:23:37 -05:00
David Lamparter
56ca3b5b3a bgpd: add %pBD for printing struct bgp_dest *
`%pRN` is not appropriate anymore.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2020-10-17 08:52:35 -04:00
Donald Sharp
3584c85e92 bgpd: Avoid memset when tip hash is empty
The tip hash is only used when we are dealing with
evpn.  In bgp_nexthop_self we are doing a memset
irrelevant of whether we will ever find data.  Yes
hash_lookup will return pretty quickly.

Modify the code to avoid doing a memset in the case
where the tip hash is empty as that we know we'll
never find anything.  With full BGP feeds this
small memset does take some time.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2020-09-16 17:48:15 -04:00
Pat Ruddy
e37e1e27e4 bgpd: do not unregister for prefix nexthop updates if nh exists
since the addition of srte_color to the comparison for bgp nexthops
it is possible to have several nexthops per prefix but since zebra
only sores a per prefix registration we should not unregister for
nh notifications for a prefix unti all the nexthops for that prefix
have been deleted. Otherwise we can get into a deadlock situation
where BGP thinks we have registered but we have unregistered from zebra.

Signed-off-by: Pat Ruddy <pat@voltanet.io>
2020-08-31 09:11:47 +00:00
Renato Westphal
545aeef1d1 bgpd: extend the NHT code to understand SR-TE colors
Extend the NHT code so that only the affected BGP routes are affected
whenever an SR-policy is updated on zebra.

Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
2020-08-31 09:11:03 +00:00