Commit Graph

476 Commits

Author SHA1 Message Date
Donald Sharp
e20c23fa5b bgpd: Move status and ostatus to struct peer_connection
The status and ostatus are a function of the `struct peer_connection`
move it into that data structure.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-08-18 09:29:04 -04:00
Donald Sharp
1e8ac95bfb bgpd: evpn code was not properly unlocking rd_dest
Found some code where bgp was not unlocking the dest
and rd_dest when walking the tree attempting to
find something to install.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-08-11 10:11:10 -04:00
Valerian_He
98efa5bc6b bgpd: bgp_path_info_extra memory optimization
Even if some of the attributes in bgp_path_info_extra are
not used, their memory is still allocated every time. It
cause a waste of memory.
This commit code deletes all unnecessary attributes and
changes the optional attributes to pointer storage. Memory
will only be allocated when they are actually used. After
optimization, extra info related memory is reduced by about
half(~400B -> ~200B).

Signed-off-by: Valerian_He <1826906282@qq.com>
2023-08-08 10:48:07 +00:00
Chirag Shah
bad029e92d bgpd: fix evpn zclient_send_message return code
In scaled EVPN route sync from bgp to zebra, return
code can be ZCLIENT_SEND_BUFFERED which was treated
as error and leads to route install/uninstall failure.

Following error logs were seen:
2023-07-07T17:05:59.640899+03:00 vtep12 bgpd[15305]: [WYBZ0-MM8F1][EC
33554471] 0: Failed to uninstall EVPN IMET route in VNI 478
2023-07-07T17:05:59.640913+03:00 vtep12 bgpd[15305]: [Y5VKN-9BV7H][EC
33554471] default (0): Failed to uninstall EVPN [3]:[0]:[32]:[27.0.0.5]
route from VNI 465 IP table
2023-07-07T17:05:59.640927+03:00 vtep12 bgpd[15305]: [WYBZ0-MM8F1][EC
33554471] 0: Failed to uninstall EVPN IMET route in VNI 465
2023-07-07T17:05:59.640940+03:00 vtep12 bgpd[15305]: [Y5VKN-9BV7H][EC
33554471] default (0): Failed to uninstall EVPN [3]:[0]:[32]:[27.0.0.5]
route from VNI 173 IP table

Ticket:#3499957
Testing Done:

Before fix:

root@vtep12:mgmt:/home/cumulus# bridge -d -s fdb show | grep  27.0.0.5 |
wc -l
16010

Once source VTEP withdraws, DUT VTEP still has stale entries
root@vtep12:mgmt:~# bridge -d -s fdb show | grep  27.0.0.5 | wc -l
12990

After fix:

Once source VTEP withdraws, DUT VTEP still is able to delete entries
root@vtep12:mgmt:/home/cumulus# bridge -d -s fdb show | grep  27.0.0.5 |
wc -l
0

Zapi stats:

Client: bgp
[32/133]
------------------------
FD: 76
Connect Time: 00:26:17
Nexthop Registry Time: 00:26:11
Nexthop Last Update Time: 00:23:31
Client will Not be notified about it's routes status
Last Msg Rx Time: 00:21:33
Last Msg Tx Time: 00:23:31
Last Rcvd Cmd: ZEBRA_REMOTE_MACIP_ADD
Last Sent Cmd: ZEBRA_NEXTHOP_UPDATE

Type        Add         Update      Del
==================================================
IPv4        7           0           1
IPv6        0           0           0
Redist:v4   22          0           0
Redist:v6   0           0           0
VRF         2           0           0
Connected   4170        0           0
Interface   9           0           4
Intf Addr   2166        0           0
BFD peer    0           0           0
NHT v4      2           0           1
NHT v6      4           0           0
VxLAN SG    0           0           0
VNI         1010        0           0
L3-VNI      0           0           0
MAC-IP      46010       0           0
ES          2024        0           0
ES-EVI      0           0           0
Errors: 0

Signed-off-by: Chirag Shah <chirag@nvidai.com>
2023-07-07 18:55:04 -07:00
Russ White
95070f2eef
Merge pull request #13557 from anlancs/fix/bgpd-evpn-rmac-best-path
bgpd: Fix missing deletion of evpn routes
2023-06-20 09:12:51 -04:00
anlan_cs
0c061db4d0 bgpd: Fix missing deletion of evpn routes
Consider the scenario of evpn, the box has some type-5 ECMP routes.

After one of its remote peers is removed ( or down ), `show evpn rmac vni all`
kept no change **sometimes**, it means the rmac of the removed peer maybe is
still in this box, and the traffic will be wrongly forwarded to the removed
peer.

The root cause is that the best path selection for type-5 routes maybe
keep no change and the best path is not routed to the removed peer, so `bgpd`
wrongly doesn't tell `zebra` to remove ( withdraw ) the type-5 routes owned
by the removed peer.

So, add a new flag to force the deletion.

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2023-06-03 09:27:53 +08:00
Trey Aspelund
badc4857aa bgpd: add EVPN reimport handler for martian change
Adds a generalized martian reimport function used for triggering a
relearn/reimport of EVPN routes that were previously filtered/deleted
as a result of a "self" check (either during import or by a martian
change handler). The MAC-VRF SoO is the first consumer of this function,
but can be expanded for use with Martian Tunnel-IPs, Interface-IPs,
Interface-MACs, and RMACs.

Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
2023-05-30 15:20:35 +00:00
Trey Aspelund
67b493a5b3 bgpd: generalize EVPN martian nexthop changes
Currently we have a handler function that will walk the global EVPN
rib and unimport/remove routes matching a local IP/TIP. This generalizes
this function so that it can be re-used for other BGP Martian entry
types. Now this can be used to unimport routes when the MAC-VRF SoO is
reconfigured.

Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
2023-05-30 15:20:35 +00:00
Trey Aspelund
465d3e356d bgpd: track L3VNI VTEP-IPs in tip_hash
For whatever reason, we were only updating tip_hash when we processed an
L2VNI add/del. This adds tip_hash updates to the L3VNI add/del codepaths
so that their VTEP-IPs are also used when when considering martian
addresses, e.g. bgp_nexthop_self().

Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
2023-05-30 15:20:35 +00:00
Trey Aspelund
65cdb9ce9b bgpd: Add MAC-VRF Site-of-Origin support
Initial support for configuring an SoO for all MAC-VRFs (EVIs/L2VNIs).
This provides a topology-independent method of preventing EVPN routes
from one MAC-VRF "site" (an L2 domain) from being imported by other PEs
in the same MAC-VRF "site", similar to how SoO is traditionally used in
L3VPN to identify and break loops for an L3/IP-VRF "site".
One example of where a MAC-VRF SoO can be used to avoid an L2 control
plane loop is with Active/Active MLAG VTEPs. For a given L2 site only
one control plane should be active. SoO can be used to ID/ignore entries
originated from the local MAC-VRF site so that EVPN will not attempt to
manage entries that are already handled by MLAG.

Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
2023-05-30 15:20:35 +00:00
Trey Aspelund
5d5d126777 bgpd: migrate MTYPE_BGP_EVPN_INFO
bgp_create() and bgp_free() already call EVPN-specific handlers,
so there's no need to XCALLOC/XFREE BGP_EVPN_INFO directly. Let's move
all the references to MTYPE_BGP_EVPN_INFO into the EVPN specific files.

Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
2023-05-30 15:20:35 +00:00
Russ White
4855ca5e56
Merge pull request #13310 from opensourcerouting/feature/bgpd_node_target_extended_community
bgpd: Add Node Target Extended Communities support
2023-04-25 11:06:23 -04:00
Donatas Abraitis
8f2a51b7b7 bgpd: Reuse encode_route_target_ip() function
Before this patch, this function wasn't used in the code. Let's reuse this
since it's uses the same pattern for encoding route-target extcommunity.

Also reuse encode_route_target_as[4]() as well.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-04-14 22:33:34 +03:00
anlan_cs
e2c7a84bdd bgpd: Simplify the checking local path
Replace the checking local path code in `update_evpn_type5_route_entry()`
with generic function.

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2023-04-13 08:47:06 +08:00
Donald Sharp
5a7c43c77e bgpd: Do not allow l3vni changes when shutting down
When a `no router bgp XXX` is issued and the bgp instance
is in the process of shutting down, do not allow a l3vni
change coming up from zebra to do anything.  We can just
safely ignore it at this point in time.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-04-10 14:14:01 -04:00
Donald Sharp
aa056a2a64 bgpd: Treat withdraw variable as a bool
Used as a bool, treated as a bool.  Make it a bool

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-04-06 17:41:32 -04:00
Donald Sharp
d8bc11a592 *: Add a hash_clean_and_free() function
Add a hash_clean_and_free() function as well as convert
the code to use it.  This function also takes a double
pointer to the hash to set it NULL.  Also it cleanly
does nothing if the pointer is NULL( as a bunch of
code tested for ).

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-21 08:54:21 -04:00
Donatas Abraitis
0da34e499a bgpd: Drop afi_t from bgp_evpn_global_node_lookup()
Not used.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-03-14 12:05:58 +02:00
Donatas Abraitis
59d6b4d6ab bgpd: Rename bgp_afi_node_lookup() to bgp_safi_node_lookup()
afi not used in this function, reduce a bit.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-03-14 12:01:56 +02:00
Philippe Guibert
7171949de5 bgpd: attr evpn attributes should be modified before interning attr
As remind, the attr attribute is a structure that contains
the attributes for a given BGP update. In order to avoid too much
memory consumption, the attr structure is stored in a hash table.
As consequence, other BGP updates may reuse the same attr. The
storage in the hash table is done when calling bgp_attr_intern(),
and a key is calculated based on all the attributes values of the
structure.

In BGP EVPN, when modifying the attributes of the attr structure
after having interned it, this means that some BGP updates will
want to use the old reference, whereas a new attr value is used.
Because in BGP EVPN, the modifications are done on a per BGP update
basis, a new attr entry specific to that BGP update should be created.
This is why a local_attr structure is done, modified, then later
interned.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-02-24 08:16:36 +01:00
Russ White
f48c8a92fb
Merge pull request #12854 from opensourcerouting/fix/bgp_withdraw_attr_not_used
bgpd: Drop struct attr from bgp_withdraw()
2023-02-21 08:18:37 -05:00
Russ White
ba755d35e5
Merge pull request #12248 from pguibert6WIND/bgpasdot
lib, bgp: add initial support for asdot format
2023-02-21 08:01:03 -05:00
Donatas Abraitis
bf0c616383 bgpd: Drop struct attr from bgp_withdraw()
It's not used at all.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-02-21 11:35:59 +02:00
Donald Sharp
8383d53e43
Merge pull request #12780 from opensourcerouting/spdx-license-id
*: convert to SPDX License identifiers
2023-02-17 09:43:05 -05:00
Stephen Worley
742341e144 bgpd: add mpath label stack helper functions for dvni
Add some bgp_path_info helper functions for getting the correct l3vni
label, getting the vni from the label stack, and determinging if
the mpath is D-VNI based.

Signed-off-by: Stephen Worley <sworley@nvidia.com>
2023-02-13 18:12:05 -05:00
Philippe Guibert
fa566a94af bgpd: store the route-distinguisher from config as a string
The route-distinguisher string can be expressed in different
ways when the AS number is part of the RD. And the configured
string value has to be kept intact.
The following vty commands store the string value internally:
- router bgp / address-family ipv4 unicast / rd vpn export <>
- router bgp / address-family l2vpn evpn / rd <>
- router bgp / address-family l2vpn evpn / vni <> / rd <>

The vty commands where RD is configured in the below places is
not considered:
- router bgp / rfapi related commands
- router bgp / address-family xxx xxx / network .. rd <>

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-02-10 10:27:23 +01:00
Philippe Guibert
e55b088399 bgpd: add as-notation keyword to 'router bgp' vty command
A new keyword permits changing the BGP as-notation output:
- [no] router bgp <> [vrf BLABLA] [as-notation [<dot|plain|dot+>]]

At the BGP instance creation, the output will inherit the way the
BGP instance is declared. For instance, the 'router bgp 1.1'
command will configure the output in the dot format. However, if
the client wants to choose an alternate output, he will have to
add the extra command: 'router bgp 1.1 as-notation dot+'.

Also, if the user wants to have plain format, even if the BGP
instance is declared in dot format, the keyword can also be used
for that.

The as-notation output is only taken into account at the BGP
instance creation. In the case where VPN instances are used,
a separate instance may be dynamically created. In that case,
the real as-notation format will be taken into acccount at the
first configuration.

Linking the as-notation format with the BGP instance makes sense,
as the operators want to keep consistency of what they configure.

One technical reason why to link the as-notation output with the
BGP instance creation is that the as-path segment lists stored
in the BGP updates use a string representation to handle aspath
operations (by using regexp for instance). Changing on the fly
the output needs to regenerate this string representation to the
correct format. Linking the configuration to the BGP instance
creation avoids refreshing the BGP updates. A similar mechanism
is put in place in junos too.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-02-10 10:27:23 +01:00
Philippe Guibert
9eb1199710 bgpd: store the bgp as identifier in the configured as-notation
This is a preliminary work to handle various ways to configure
a BGP Autonomous System. When creating a BGP instance, the
user may want to define the AS number as a dotted value,
instead of using an integer value.

To handle both cases, an as_pretty char attribute will store
the as number as it has been given to the vtysh command:

router bgp <as number>

Whenever the as integer of the BGP instance was dumped,
the as_pretty original format is used.

The json output reuses the integer value to keep backward
compatibility with old displays.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-02-10 10:19:06 +01:00
David Lamparter
acddc0ed3c *: auto-convert to SPDX License IDs
Done with a combination of regex'ing and banging my head against a wall.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2023-02-09 14:09:11 +01:00
Donald Sharp
58cf0823bf bgpd: Add missing enum's to case statement
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-01-31 12:29:08 -05:00
Donald Sharp
367b458cb4 bgpd: bgp_update and bgp_withdraw never return failures
These two functions always return 0.  As such any and all
tests against this make no sense.  Remove the return 0
to a void and follow the chain, logically, to remove all
the dead code.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-01-30 16:02:23 -05:00
Trey Aspelund
4dabdde32a bgpd: move tunnel-ip comparison into handler
Moves the old/new IP comparison into handle_tunnel_ip_change instead of
expecting the caller to do the check on their own.
Also changes handle_tunnel_ip_change to return void since it only ever
returned 0 in all cases.

Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
2023-01-27 11:12:14 -05:00
Trey Aspelund
826c3f6db3 bgpd: only unimport routes if tunnel-ip changes
When processing a new local VNI, we were always walking the global EVPN
table to look for routes that needed to be removed due to a martian
nexthop change (specifically a tunnel-ip change).
Since the martian TIP table is global (all VNIs) + the walk is also in
the global table (all VNIs), we can trust that any new TIP from any VNI
would result in routes getting removed from the global table and
unimported from all live (L2)VNIs.
i.e.
The only time this update is actionable is if we are adding/removing an
IP from the martian TIP table, and we do not need to walk the table for
normal refcount adjustments.

Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
2023-01-27 11:11:44 -05:00
Russ White
95e5cc2319
Merge pull request #12647 from anlancs/fix/bgpd-type-2
bgpd: cosmetic changes for debug
2023-01-24 10:13:22 -05:00
anlan_cs
47f5eb7487 bgpd: cosmetic changes for debug
Two changes for debug log -
1. Display empty VRF as "None".
2. Correct wrong "type-2" word for type-3 route.

Before:
```
2023/01/17 04:00:30 BGP: [Z5AV7-75RTE] VRF   vni 100 type-2 route evp [3]:[0]:[32]:[88.88.88.88] RMAC 00:00:00:00:00:00 nexthop 88.88.88.88 esi (null)
```

After:
```
2023/01/17 04:05:24 BGP: [M3X4Y-24DVB] VRF None vni 100 type-3 route evp [3]:[0]:[32]:[88.88.88.88] RMAC 00:00:00:00:00:00 nexthop 88.88.88.88 esi (null)
```

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2023-01-17 17:16:39 +08:00
anlan_cs
07b9f44dfa bgpd: remove one EC log
When creating one interface "vxlan66" ( ip link add vxlan66 type vxlan ... ),
which initially maintains down status, saw one unexpected EC log:

```
zebra[37906]: [ZAG0W-VSNSD] interface vxlan66 vrf default(0) index 35 becomes active.
zebra[37906]: [HPWGA-Y527W] IFLA_VXLAN_LINK missing from VXLAN IF message
...
zebra[37906]: [W6BZR-YZPAB] RTM_NEWLINK update for vxlan66(35) sl_type 0 master 0 flags 0x1002
zebra[37906]: [MR3ZF-ATDBY] Intf vxlan66(35) has gone DOWN
zebra[37906]: [T44X9-FFNVB] Intf vxlan66(35) L2-VNI 66 is DOWN
zebra[37906]: [KEGYY-K8XVV] Send EVPN_DEL 66 to bgp
zebra[37906]: [HPWGA-Y527W] IFLA_VXLAN_LINK missing from VXLAN IF message
bgpd[37911]: [TV0XP-3WR0A] Rx VNI del VRF default VNI 66 tenant-vrf default SVI ifindex 0
bgpd[37911]: [MDW89-YAXJG][EC 33554497] 0: VNI hash entry for VNI 66 not found at DEL
```

Since commit `6f908ded80eeba40a850e8d1f6246fb3ed31e648` support interfaces
from down to down, and bgpd doesn't know "VNI 66" at all. So, remove this
EC log.

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2023-01-04 03:49:39 -05:00
Trey Aspelund
889360dcfd bgpd: cleanup macip_path_list for remote macip
ASAN reported the following memleak:
```
Direct leak of 40 byte(s) in 1 object(s) allocated from:
    #0 0x4d4342 in calloc (/usr/lib/frr/bgpd+0x4d4342)
    #1 0xbc3d68 in qcalloc /home/sharpd/frr8/lib/memory.c:116:27
    #2 0xb869f7 in list_new /home/sharpd/frr8/lib/linklist.c:64:9
    #3 0x5a38bc in bgp_evpn_remote_ip_hash_alloc /home/sharpd/frr8/bgpd/bgp_evpn.c:6789:24
    #4 0xb358d3 in hash_get /home/sharpd/frr8/lib/hash.c:162:13
    #5 0x593d39 in bgp_evpn_remote_ip_hash_add /home/sharpd/frr8/bgpd/bgp_evpn.c:6881:7
    #6 0x59dbbd in install_evpn_route_entry_in_vni_common /home/sharpd/frr8/bgpd/bgp_evpn.c:3049:2
    #7 0x59cfe0 in install_evpn_route_entry_in_vni_ip /home/sharpd/frr8/bgpd/bgp_evpn.c:3126:8
    #8 0x59c6f0 in install_evpn_route_entry /home/sharpd/frr8/bgpd/bgp_evpn.c:3318:8
    #9 0x59bb52 in install_uninstall_route_in_vnis /home/sharpd/frr8/bgpd/bgp_evpn.c:3888:10
    #10 0x59b6d2 in bgp_evpn_install_uninstall_table /home/sharpd/frr8/bgpd/bgp_evpn.c:4019:5
    #11 0x578857 in install_uninstall_evpn_route /home/sharpd/frr8/bgpd/bgp_evpn.c:4051:9
    #12 0x58ada6 in bgp_evpn_import_route /home/sharpd/frr8/bgpd/bgp_evpn.c:6049:9
    #13 0x713794 in bgp_update /home/sharpd/frr8/bgpd/bgp_route.c:4842:3
    #14 0x583fa0 in process_type2_route /home/sharpd/frr8/bgpd/bgp_evpn.c:4518:9
    #15 0x5824ba in bgp_nlri_parse_evpn /home/sharpd/frr8/bgpd/bgp_evpn.c:5732:8
    #16 0x6ae6a2 in bgp_nlri_parse /home/sharpd/frr8/bgpd/bgp_packet.c:363:10
    #17 0x6be6fa in bgp_update_receive /home/sharpd/frr8/bgpd/bgp_packet.c:2020:15
    #18 0x6b7433 in bgp_process_packet /home/sharpd/frr8/bgpd/bgp_packet.c:2929:11
    #19 0xd00146 in thread_call /home/sharpd/frr8/lib/thread.c:2006:2
```

The list itself was not being cleaned up when the final list entry was
removed, so make sure we do that instead of leaking memory.

Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
2022-12-15 18:52:16 +00:00
Donatas Abraitis
f8d69be43f
Merge pull request #12081 from sworleys/EMM-upstream
Rework of Various Handling in EVPN for Extended Mac Mobility
2022-11-17 16:46:58 +02:00
Donatas Abraitis
272c6d5db1
Merge pull request #8647 from sworleys/DVNI-Config-Changes
bgpd: EVPN D-VNI L3 RT Config Enhancements
2022-10-18 14:17:04 +03:00
Trey Aspelund
b5118501ac bgpd: don't unlock bgp_dest twice
Both install_evpn_route_entry_in_vni_mac() and
install_evpn_route_entry_in_vni_ip() will unlock the bgp_dest when
install_evpn_route_entry_in_vni_common() returns, so there's no need to
unlock the bgp_dest inside the _common function.  Let's let the new
wrappers handle the cleanup of the dest.

Ticket: #3119673

Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
2022-10-11 16:18:21 -04:00
Stephen Worley
852d9f9757 bgpd,zebra,lib: bgp evpn vni macip into two tables
Re-work the bgp vni table to use separately keyed tables for type2
routes.

So, with type2 routes, we have the main table keyed off of the IP and a
new MAC table keyed off of MACs.

By separating out the two, we are able to run path selection separately
for the neigh and mac. Keeping the two separate is also more in-line
with what happens in zebra (they are managed comptletely seperate).

With this change type2 routes go into each table like so:

```
Remote MAC-IP -> IP Table & MAC Table
Remote MAC -> MAC Table

Local MAC-IP -> IP Table
Local MAC -> MAC Table
```

The difference for local is necessary because we should not ever allow
multiple paths for a local MAC.

Also cleaned up the commands for querying the vni tables:

```
show bgp vni all type ...
show bgp vni VNI type ...

```

Old commands will be deprecated in a separate commit.

Signed-off-by: Stephen Worley <sworley@nvidia.com>
2022-10-11 16:18:21 -04:00
Anuradha Karuppiah
36bac85c7f bgpd: sync seq numbers only if the MAC address is the same
We locate the sync path with the highest seq number for normalizing
local path to the same seq. With the change to maintain the VNI RT table
by IP address (instead of MAC&IP) we need additional changes to skip
paths with a different mac address than the local one we are trying to
normalize.

Signed-off-by: Anuradha Karuppiah <anuradhak@nvidia.com>
2022-10-11 15:18:39 -04:00
Anuradha Karuppiah
90f30caa24 bgpd: include ESI in the zebra update log
Log changes only; no functional change.

Signed-off-by: Anuradha Karuppiah <anuradhak@nvidia.com>
2022-10-11 15:18:39 -04:00
Stephen Worley
34c7f35f02 bgpd: rework VNI table for type2/macip routes
Use the IP addr of type2/macip routes only for the hash/key
of the VNI table and carry the MAC in a path_info_extra attribute.

There is exists situations that can be hit during extended MAC mobility events
where two MACs could be pointing to the same IP in our global table. It
is requires very specific timings.

When that happens, BPG would (because we key'd on both MAC and IP)
install both into it's VNI table as separate entries, but zebra only
knows/needs to know about a single IP -> MAC relationship for it's VNI
table's type2 routes. So it was compleletly undeterministic which one
zebra would end up with in these timing situations.

With these changes, we move BGP's VNI table to key'd the same as Zebra's
and now a single IP will have multiple path_info's with a path_info_extra
that is carrying the MAC info for each path.

BGP will then run best path to deterministically decide which one to send to
zebra during the occasions where there exist's two possible MACs.

Signed-off-by: Stephen Worley <sworley@nvidia.com>
2022-10-11 15:18:39 -04:00
Donatas Abraitis
fd43ffd974 bgpd: Do not forget to unlock bgp_dest from update_advertise_vni_routes
If "unexpected" happens.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-09-06 11:49:08 +03:00
Donatas Abraitis
3727e359e3 bgpd: Cleanup memory for missing hashes
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-08-30 13:46:26 +03:00
Donatas Abraitis
511211bf56 bgpd: Convert prefix2str to %pFX
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-08-25 14:35:27 +03:00
Donald Sharp
083ec940ab bgpd: Convert from bgp_clock() to monotime()
Let's convert to our actual library call instead
of using yet another abstraction that makes it fun
for people to switch daemons.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-08-24 08:23:40 -04:00
Stephen Worley
58d8948cf4 bgpd: evpn L3 RT auto config and wildcard implementation
Implement forcing L3 auto derivation via configs even when
manually RTs are set. This will allow both to coexist in
BGP RTs. Without using auto config command, it will remove
auto derived RTs when you manually configure your own. To allow
both, use the auto command ond import/export/both.

Implement '*' wildcard import L3 RTs so we can import a route into any AS.
This is necessary to avoid a user from having to configure an L3 RT for
every AS they care to import evpn route from.

Signed-off-by: Stephen Worley <sworley@nvidia.com>
2022-08-23 12:41:25 -04:00
Stephen Worley
ca337b4641 bgpd: abstract ecom into struct for l3 route targets
Abstract the ecommunity into a container struct for L3
route targets so that we can add some additional info
via flags to go along with RT configs without modifying
the used elsewhere ecommunity struct. This functions as a
wrapper everywhere its used including the import/export lists.
The flags will be used in later commits to change behavior
when importing/exporting routes.

Signed-off-by: Stephen Worley <sworley@nvidia.com>
2022-08-23 12:41:25 -04:00
Trey Aspelund
7226bc40d6 bgpd: ignore NEXT_HOP for MP_REACH_NLRI
RFC 4760 states we SHOULD ignore the NEXT_HOP attribute for BGP Update
messages carrying only MP_REACH_NLRI attributes. Thus we should use the
Network Address of Next Hop field of the MP_REACH_NLRI as the nexthop.

Instead of always looking for BGP_ATTR_NEXT_HOP, this commit ensures:
1) we set mp_nexthop_len to BGP_ATTR_NHLEN_IPV4 for v4 bgp_static routes
2) we check mp_nexthop_len when choosing the nexthop to use for nht
3) we check mp_nexthop_len when choosing the nexthop to send to zebra
4) we check mp_nexthop_len when picking the nexthop to shown by vtysh

Reported-by: Binon Gorbutt <binon@aervivo.com>
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
2022-08-04 20:36:49 +00:00
Donald Sharp
35aae5c9bc bgpd: LL peers need bnc's per peer
FRR should create a bnc per peer.  Not have
one's that write over others.  Currently when
FRR has multiple Interface based peering, BGP wa
creating a single BNC.  This is insufficient in that
we were accidently overwriting the one LL with other
data.  This causes issues when there are multiple and
there is weird starting issues with those interfaces
that you are peering over.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-07-22 09:09:39 -04:00
anlan_cs
2304139a62 bgpd: fix missing rmac value in debug
`attr.rmac` is not set in debug as expected for its wrong place in code.

Just move the debug process (`bgp_debug_zebra(NULL)`) after possible `rmac`
value is set.

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2022-07-08 00:27:00 -04:00
Donatas Abraitis
0f05ea43b0 bgpd: Initialize attr->local_pref to the configured default value
When we use network/redistribute local_preference is configured inproperly
when using route-maps something like:

```
network 100.100.100.100/32 route-map rm1
network 100.100.100.200/32 route-map rm2

route-map rm1 permit 10
 set local-preference +10
route-map rm2 permit 10
 set local-preference -10
```

Before:
```
root@spine1-debian-11:~# vtysh -c 'show bgp ipv4 unicast 100.100.100.100/32 json' | jq '.paths[].locPrf'
10
root@spine1-debian-11:~# vtysh -c 'show bgp ipv4 unicast 100.100.100.200/32 json' | jq '.paths[].locPrf'
0
```

After:
```
root@spine1-debian-11:~# vtysh -c 'show bgp ipv4 unicast 100.100.100.100/32 json' | jq '.paths[].locPrf'
110
root@spine1-debian-11:~# vtysh -c 'show bgp ipv4 unicast 100.100.100.200/32 json' | jq '.paths[].locPrf'
90
```

Set local-preference as the default value configured per BGP instance, but
do not set LOCAL_PREF flag by default.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-06-06 10:28:50 +03:00
Donatas Abraitis
6006b807b1 *: Properly use memset() when zeroing
Wrong: memset(&a, 0, sizeof(struct ...));
    Good:  memset(&a, 0, sizeof(a));

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-05-11 14:08:47 +03:00
Donatas Abraitis
b5605493a4 bgpd: Use sizeof() for memset instead of numeric
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-05-11 13:10:41 +03:00
Russ White
fb10a9479b
Merge pull request #11162 from anlancs/fix/bgpd-cleanup-5
bgpd: remove unnecessary check for evpn
2022-05-09 14:43:03 -04:00
mobash-rasool
d4caf64ef7
Merge pull request #11170 from anlancs/fix/bgpd-cleanup-8
bgpd: remove one unnecessary parameter for evpn-mh
2022-05-09 22:42:22 +05:30
anlan_cs
e0a798819b bgpd: remove one unnecessary parameter for evpn-mh
The "add" parameter of `bgp_evpn_mh_route_update()` makes no sense.
Just remove it to clarify this function, and remove the relevant check
with "add" as well.

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2022-05-09 08:27:20 -04:00
anlan_cs
879e43a550 bgpd: remove unnecessary check for evpn
When `bgp_evpn_new()` is called, the `bgp` parameter MUST be non-NULL,
remove this unnecessary check and remove the NULL check for returned
`struct bgpevpn *`, which should be non-NULL.

And modify `import_rt_new()` in the same way.

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2022-05-08 09:25:12 -04:00
anlan_cs
17151ae94b bgpd: clear misleading mismatched check
Two changes for `delete_global_type2_routes()`:

1) Remove check of `bgp_dest_has_bgp_path_info_data(rddest)`.
It is unnecessary(`dest->info` should not be NULL) and misleading.
`if (rddest && bgp_dest_has_bgp_path_info_data(rddest))`
Use (locked) node with this check, but unlock with `if (rddest)`,
The mismatched condition is misleading, there seems to be a
mistake to extra unlock.
Just make it clear, immediately exit with `(!rddest)`.

2) Remove checking returned value for it, and use `void` as return type.
It is unnecessary and wrong. Even the check failed, it should continue
to delete other types of routes.
Just remove the check and go through.

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2022-05-07 22:20:15 -04:00
anlan_cs
8e3aae66ce *: remove the checking returned value for hash_get()
Firstly, *keep no change* for `hash_get()` with NULL
`alloc_func`.

Only focus on cases with non-NULL `alloc_func` of
`hash_get()`.

Since `hash_get()` with non-NULL `alloc_func` parameter
shall not fail, just ignore the returned value of it.
The returned value must not be NULL.
So in this case, remove the unnecessary checking NULL
or not for the returned value and add `void` in front
of it.

Importantly, also *keep no change* for the two cases with
non-NULL `alloc_func` -
1) Use `assert(<returned_data> == <searching_data>)` to
   ensure it is a created node, not a found node.
   Refer to `isis_vertex_queue_insert()` of isisd, there
   are many examples of this case in isid.
2) Use `<returned_data> != <searching_data>` to judge it
   is a found node, then free <searching_data>.
   Refer to `aspath_intern()` of bgpd, there are many
   examples of this case in bgpd.

Here, <returned_data> is the returned value from `hash_get()`,
and <searching_data> is the data, which is to be put into
hash table.

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2022-05-03 00:41:48 +08:00
anlan_cs
ac390ef8ac bgpd: fix memory leak for evpn
If `hash_get()` returns NULL, the list created with
`list_new()` is not be freed.

Since `hash_get()` should not fail, we don't need
`list_delete()` and other boring `XFREE()`s for its
failure case.

Just ignore returning value of `hash_get()` in these
cases.

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2022-05-03 00:41:48 +08:00
anlan_cs
d74a6cc126 bgpd: optimize "auto_rt" searching procedure for evpn
RT value will be unique across different VNIs but the
same across routers (in the same AS) for a particula
VNI.

It is unique, so add `break` for search procedure.

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2022-04-25 04:36:18 -04:00
anlan_cs
671ec57621 bgpd: minor style change
Correct two style places and one comment.

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2022-04-25 04:33:44 -04:00
Donatas Abraitis
f5327fc339
Merge pull request #11012 from anlancs/bgpd-mh-simplify-condition
zebra: simplify one check for evpn-mh
2022-04-19 13:04:43 +03:00
Donatas Abraitis
58cf5c088a bgpd: Reuse bgp_attr_set_ecommunity() for setting attribute flags
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-04-12 22:09:28 +03:00
anlan_cs
fff7545a03 bgpd: correct a few comments for evpn-mh
Correct a few evpn-mh omissions mainly on type-1 and type-4.

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2022-04-12 09:33:06 -04:00
Russ White
e3e6510b87
Merge pull request #10820 from donaldsharp/evpn_route_frag
Evpn route frag
2022-03-20 15:49:16 -04:00
Russ White
d5a58008dc
Merge pull request #10816 from anlancs/fix-bgdp-local-es-rt
bgpd: fix wrong check on local es routes
2022-03-19 15:04:58 -04:00
Anuradha Karuppiah
f4a5218dc6 bgpd: evpn mh changes to advertise EAD routes with user configured export-rt
This is an alternate to EAD route fragmenation and allows the user to limit
the route to a single UPDATE (<4K) independent of the number of EVIs.

Sample config (add one l2-vni RT from each VRF) -
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
!
router bgp 5556
 !
 address-family l2vpn evpn
  ead-es-route-target export 5556:1001
  ead-es-route-target export 5556:1004
  ead-es-route-target export 5556:1008
 exit-address-family
!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Sample route
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
   Network          Next Hop            Metric LocPrf Weight Path
*> [1]:[4294967295]:[03:44:38:39:ff:ff:01:00:00:01]:[32]:[27.0.0.21]
                    27.0.0.21                          32768 i
                    ET:8 ESI-label-Rt:AA RT:5556:1001 RT:5556:1004 RT:5556:1008
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

When configured, the ead-es-route-target is used instead of
the auto-generated version that includes all associated EVI's RTs.

Ticket: #2632967

Signed-off-by: Anuradha Karuppiah <anuradhak@nvidia.com>
2022-03-18 07:33:12 -04:00
anlan_cs
e57e63eb40 bgpd: fix wrong check on local es routes
Importing local es routes should be skipped. But the check of it is a bit wrong.
It is ok that local es routes can't be imported, but importing local es will
wrongly enter `uninstall` procedure.

Just adjust this check to make it clear. Immediately return in the case
of importing local es routes.

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2022-03-17 21:46:44 +08:00
anlan_cs
d58f056ab7 bgpd: remove dead code
`bgp_evpn_import_route_in_vrfs()` is special ( l2vpn ) form of
`install_uninstall_evpn_route() with `AFI_L2VPN` and `SAFI_EVPN` family.

No caller, just remove it.

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2022-03-16 21:06:46 +08:00
Donatas Abraitis
b53e67a389 bgpd: Use bgp_attr_[sg]et_ecommunity for struct ecommunity
This is an extra work before moving attr->ecommunity to attra_extra struct.

Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2022-02-04 15:56:20 +02:00
Igor Ryzhov
3b216639d7
Merge pull request #10430 from ton31337/fix/addpath_maximum-prefix-out
bgpd: Add bgp_check_selected() helper and just consistency changes
2022-02-01 18:38:57 +03:00
Donatas Abraitis
be92fc9f1a bgpd: Convert bgp_addpath_encode_[tr]x() to bool from int
Rename addpath_encode[d] to addpath_capable to be consistent.

Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2022-02-01 13:31:16 +02:00
Iqra Siddiqui
761cc919fa bgpd: Fixing memcmp to avoid coverity issue
Description:
Replacing memcmp at certain places,
to avoid the coverity issues caused by it.

Co-authored-by: Kantesh Mundargi <kmundaragi@vmware.com>
Signed-off-by: Iqra Siddiqui <imujeebsiddi@vmware.com>
2022-01-31 21:50:50 -08:00
Igor Ryzhov
860e740b36 bgpd: replace custom union gw_addr with struct ipaddr
BGP EVPN custom `union gw_addr` is basically the same thing as a common
`struct ipaddr` but it lacks the address family which is needed in some
cases.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2022-01-19 23:13:04 +03:00
Anuradha Karuppiah
23aa35ade5 bgpd: initial batch of evpn lttng tracepoints
Low overhead bgp-evpn TPs have been added which push data out in a binary
format -
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
root@switch:~# lttng list --userspace |grep "frr_bgp:evpn"
      frr_bgp:evpn_mh_nh_rmac_zsend (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint)
      frr_bgp:evpn_mh_nh_zsend (loglevel: TRACE_INFO (6)) (type: tracepoint)
      frr_bgp:evpn_mh_nhg_zsend (loglevel: TRACE_INFO (6)) (type: tracepoint)
      frr_bgp:evpn_mh_vtep_zsend (loglevel: TRACE_INFO (6)) (type: tracepoint)
      frr_bgp:evpn_bum_vtep_zsend (loglevel: TRACE_INFO (6)) (type: tracepoint)
      frr_bgp:evpn_mac_ip_zsend (loglevel: TRACE_INFO (6)) (type: tracepoint)
root@switch:~#
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

In addition to the tracepoints a babeltrace python plugin for pretty
printing (binary data is converted into grepable strings). Sample usage -
frr_babeltrace.py trace_path

Sample tracepoint output -
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
1. frr_bgp: evpn_mac_ip_zsend
frr_bgp:evpn_mac_ip_zsend {'action': 'add', 'vni': 1007, 'mac': '00:02:00:00:00:04', 'ip': 'fe80::202:ff:fe00:4', 'vtep': '27.0.0.15', 'esi': '03:44:38:39:ff:ff:01:00:00:02'}

2. frr_bgp: evpn_mh_vtep_zsend
frr_bgp:evpn_mh_vtep_zsend {'action': 'add', 'esi': '03:44:38:39:ff:ff:01:00:00:02', 'vtep': '27.0.0.16'}

3. frr_bgp: evpn_mh_nhg_zsend
frr_bgp:evpn_mh_nhg_zsend {'action': 'add', 'type': 'v4', 'nhg': 74999998, 'esi': '03:44:38:39:ff:ff:01:00:00:02', 'vrf': 85}

4. frr_bgp: evpn_mh_nh_zsend
frr_bgp:evpn_mh_nh_zsend {'nhg': 74999998, 'vtep': '27.0.0.16', 'svi': 93}

5. frr_bgp: evpn_mh_nh_rmac_zsend
frr_bgp:evpn_mh_nh_rmac_zsend {'action': 'add', 'vrf': 85, 'nh': '::ffff:1b00:12', 'rmac': '00:02:00:00:00:50'}
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Signed-off-by: Anuradha Karuppiah <anuradhak@nvidia.com>
2021-10-01 09:02:25 -07:00
Sri Mohana Singamsetty
2e2d2be87f
Merge pull request #9422 from pguibert6WIND/update_autort_l3vni
bgpd: update auto route target for l3vni appropriately
2021-09-28 09:15:34 -07:00
Anuradha Karuppiah
fa46a5cd45 bgpd: Extend EVPN next hop tracking to type-1 and type-4 routes
NH tracking is already in use for type-1, type-3 and type-5 routes.
This change extends that tracking to EAD and ESR to eliminate the 9s
delay (BGP holdtimer) with ES/L2-NHG update seen when all the uplinks
are shutdown on a remote EVPN PE.

Ticket: #2682896

Signed-off-by: Anuradha Karuppiah <anuradhak@nvidia.com>
2021-09-14 09:07:59 -07:00
Philippe Guibert
4204021e46 bgpd: update auto route target for l3vni appropriately
The BGP configuration for BGP EVPN RT5 setup consists in mainly
2 bgp instances (eventually one is enough) and L3VNI config.

When L3VNI is configured before BGP instances, and BGP route
targets are auto derived as per rfc8365, then, the obtained
route targets are wrong. For instance, the following can be
obtained:

=> show bgp vrf cust1 vni
BGP VRF: cust1
  Local-Ip: 10.209.36.1
  L3-VNI: 1000
  Rmac: da:85:42:ba:2a:e9
  VNI Filter: none
  L2-VNI List:

  Export-RTs:
    RT:12757:1000
  Import-RTs:
    RT:12757:1000
  RD: 65000:1000

whereas the derived route targets should be the below
ones:

=> show bgp vrf cust1 vni
BGP VRF: cust1
  Local-Ip: 10.209.36.1
  L3-VNI: 1000
  Rmac: 72:f3:af:a0:98:80
  VNI Filter: none
  L2-VNI List:

  Export-RTs:
    RT:12757:268436456
  Import-RTs:
    RT:12757:268436456
  RD: 65000:1000

There is an update handler that updates appropriately L2VNIs.
But this is not the case for L3VNIs. Add the missing code.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2021-09-02 19:34:56 +00:00
Philippe Guibert
56c70d8714 bgpd: imported evpn rt5 routes copy igpmetric
when doing BGP over an IGP platform, the expectation is that
the path calculation for a given prefix takes into account the
igpmetric given by IGP.
This is true with prefixes obtained in a given BGP instance where
peering occurs. For instance, ipv4 unicast entries or l2vpn evpn
entries work this way. The igpmetric is obtained through nexthop
tracking, like below:

south-vm# show bgp nexthop
Current BGP nexthop cache:
 1.1.1.1 valid [IGP metric 10], #paths 1, peer 1.1.1.1
 2.2.2.2 valid [IGP metric 20], #paths 1, peer 2.2.2.2

The igp metric is taken into account when doing best path
selection, and only the entry with lowest igp wins.

[..]
*>i[5]:[0]:[32]:[5.5.5.5]
                    1.1.1.1                  0    100      0 ?
                    RT:65400:268435556 ET:8 Rmac:2e:22:6c:67:bb:73
* i                 2.2.2.2                  0    100      0 ?
                    RT:65400:268435556 ET:8 Rmac:f2:d3:68:4e:f4:ed

however, for imported EVPN RT5 entries, the igpmetric was not
copied from the parent path info. Fix it. In this way, the
imported route entries use the igpmetric of the parent pi.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2021-08-17 16:50:05 +02:00
Donatas Abraitis
90737805d9
Merge pull request #8956 from pguibert6WIND/bgp_loop_through_itself
bgpd: prevent routes loop through itself
2021-07-21 09:28:21 +03:00
Donald Sharp
6e95e6419c
Merge pull request #9039 from ton31337/fix/no_need_to_check_for_null_in_cmp_functions
bgpd: Do not check for NULL values for vni_hash_cmp()
2021-07-13 16:42:23 -04:00
Donatas Abraitis
ce40c6279a bgpd: Do not check for NULL values for vni_hash_cmp()
There is no need to test for null values in the hash compare
function as that we are guaranteed to send in data in
the hash compare functions.

Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-07-13 08:46:52 +03:00
Donald Sharp
63245a641a bgpd: hash compare functions never receive null values
There is no need to test for null values in the hash compare
function as that we are guaranteed to send in data in
the hash compare functions.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2021-07-12 14:23:51 -04:00
Philippe Guibert
654a5978f6 bgpd: prevent routes loop through itself
Some BGP updates received by BGP invite local router to
install a route through itself. The system will not do it, and
the route should be considered as not valid at the earliest.

This case is detected on the zebra, and this detection prevents
from trying to install this route to the local system. However,
the nexthop tracking mechanism is called, and acts as if the route
was valid, which is not the case.

By detecting in BGP that use case, we avoid installing the invalid
routes.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2021-07-12 13:57:36 +02:00
Donatas Abraitis
8643c2e5f7 *: Replace 4/16 integers to IPV4_MAX_BYTELEN/IPV6_MAX_BYTELEN
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-07-01 23:54:39 +03:00
Donald Sharp
b2c42dda96
Merge pull request #8853 from ton31337/fix/bgp_dest_lock_unlock
bgpd: Make sure we don't miss to unlock for bgp_dest before returning
2021-06-23 07:59:32 -04:00
Donatas Abraitis
e71ad4b64e bgpd: Make sure we don't miss to unlock for bgp_dest before returning
bgp_node_lookup() increases `lock` which is not decreased on return.

Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2021-06-22 23:14:47 +03:00
Ameya Dharkar
dc6cef732e bgpd: Add CLI for overlay index recursive resolution
Gateway IP overlay index of the remote type-5 route is resolved
recursively using remote type-2 route. For the purpose of this
recursive resolution, for each L2VNI, we build a hash table of the
remote IP addresses received by remote type-2 routes.
For the topologies where overlay index resolution is not needed, we
do not need to build this remote-ip-hash.

Thus, make the recursive resolution of the overlay index conditional on
"enable-resolve-overlay-index" configuration.

router bgp 65001
 bgp router-id 192.168.100.1
 neighbor 10.0.1.2 remote-as 65002
!
 address-family l2vpn evpn
  neighbor 10.0.1.2 activate
  advertise-all-vni
  enable-resolve-overlay-index----------> New configuration
 exit-address-family

Gateway IP overlay index will be resolved only if this configuration is present.

Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
2021-06-07 17:59:45 -07:00
Ameya Dharkar
021b659665 bgpd: EVPN route type-5 to type-2 recursive resolution using gateway IP
When EVPN prefix route with a gateway IP overlay index is imported into the IP
vrf at the ingress PE, BGP nexthop of this route is set to the gateway IP.
For this vrf route to be valid, following conditions must be met.
- Gateway IP nexthop of this route should be L3 reachable, i.e., this route
  should be resolved in RIB.
- A remote MAC/IP route should be present for the gateway IP address in the
  EVI(L2VPN table).

To check for the first condition, gateway IP is registered with nht (nexthop
tracking) to receive the reachability notifications for this IP from zebra RIB.
If the gateway IP is reachable, zebra sends the reachability information (i.e.,
nexthop interface) for the gateway IP.
This nexthop interface should be the SVI interface.

Now, to find out type-2 route corresponding to the gateway IP, we need to fetch
the VNI for the above SVI.

To do this VNI lookup effitiently, define a hashtable of struct bgpevpn with
svi_ifindex as key.

struct hash *vni_svi_hash;

An EVI instance is added to vni_svi_hash if its svi_ifindex is nonzero.

Using this hash, we obtain struct bgpevpn corresponding to the gateway IP.

For gateway IP overlay index recursive lookup, once we find the correct EVI, we
have to lookup its route table for a MAC/IP prefix. As we have to iterate the
entire route table for every lookup, this lookup is expensive. We can optimize
this lookup by adding all the remote IP addresses in a hash table.

Following hash table is defined for this purpose in struct bgpevpn
Struct hash *remote_ip_hash;

When a MAC/IP route is installed in the EVI table, it is also added to
remote_ip_hash.

It is possible to have multiple MAC/IP routes with the same IP address because
of host move scenarios. Thus, for every address addr in remote_ip_hash, we
maintain list of all the MAC/IP routes having addr as their IP address.

Following structure defines an address in remote_ip_hash.
struct evpn_remote_ip {
        struct ipaddr addr;
        struct list *macip_path_list;
};

A Boolean field is added to struct bgp_nexthop_cache to indicate that the
nexthop is EVPN gateway IP overlay index.

bool is_evpn_gwip_nexthop;

A flag BGP_NEXTHOP_EVPN_INCOMPLETE is added to struct bgp_nexthop_cache.

This flag is set when the gateway IP is L3 reachable but not yet resolved by a
MAC/IP route.

Following table explains the combination of L3 and L2 reachability w.r.t.
BGP_NEXTHOP_VALID and BGP_NEXTHOP_EVPN_INCOMPLETE flags

*                | MACIP resolved | MACIP unresolved
*----------------|----------------|------------------
* L3 reachable   | VALID      = 1 | VALID      = 0
*                | INCOMPLETE = 0 | INCOMPLETE = 1
* ---------------|----------------|--------------------
* L3 unreachable | VALID      = 0 | VALID      = 0
*                | INCOMPLETE = 0 | INCOMPLETE = 0

Procedure that we use to check if the gateway IP is resolvable by a MAC/IP
route:
- Find the EVI/L2VRF that belongs to the nexthop SVI using vni_svi_hash.
- Check if the gateway IP is present in remote_ip_hash in this EVI.

When the gateway IP is L3 reachable and it is also resolved by a MAC/IP route,
unset BGP_NEXTHOP_EVPN_INCOMPLETE flag and set BGP_NEXTHOP_VALID flag.

Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
2021-06-07 17:59:45 -07:00
Ameya Dharkar
9daa5d471a bgpd, zebra: Add svi_interface to zebra VNI and bgp EVPN structures
SVI ifindex for L2VNI is required in BGP to perform EVPN type-5 to type-2
recusrsive resolution using gateway IP overlay index.

Program this svi_ifindex in struct zebra_vni_t as well as in struct bgpevpn

Changes include:
1. Add svi_if field to struct zebra_evpn_t
2. Add svi_ifindex field to struct bgpevpn
3. When SVI (bridge or VLAN) is bound to a VxLAN interface, store it in the
zebra_evpn_t structure.
4. Add this SVI ifindex to ZEBRA_VNI_ADD
5. Store svi_ifindex in struct bgpevpn

Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
2021-06-07 17:58:23 -07:00
Ameya Dharkar
a2299abae8 bgpd: Import received EVPN RT-5 prefix with gateway IP in BGP VRF
The IP/IPv6 prefix carried with EVPN RT-5 is imported in the BGP vrf according
to the attached route targets.
If the prefix carries a gateway IP overlay index, this gateway IP should be
installed as the nexthop of the route imported in the BGP vrf.
This route in vrf will be marked as VALID only if the nexthop is resolved in the
SVI network.
To receive runtime reachability information for the nexthop, register it with
the nexthop tracking module.
Send this route to zebra after processing.

Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
2021-06-07 17:58:22 -07:00
Ameya Dharkar
66ff60895a bgpd: Parse EVPN RT-5 NLRI and store gateway IP for EVPN route
While installing this route in the EVPN table, make sure all the conditions
mentioned in the draft
https://tools.ietf.org/html/draft-ietf-bess-evpn-prefix-advertisement-11 are
met.
Draft mentions following conditions:
  - ESI and gateway IP cannot be both nonzero at the same time.
  - ESI, gateway IP, RMAC and VNI label all cannot be 0 at the same time.

If the received EVPN RT-5 route does not meet these conditions, the route is
treated as withdraw.

Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
2021-06-07 17:58:22 -07:00
Ameya Dharkar
6c995628c1 bgpd: Generate and advertise gateway IP overlay index with EVPN RT-5
Gateway IP overlay index is generated for EVPN RT-5 when following CLI is
configured.

router bgp 100 vrf vrf-blue
 address-family l2vpn evpn
  advertise ipv4 unicast gateway-ip
  advertise ipv6 unicast gateway-ip

BGP nexthop of the VRF IP/IPv6 route is set as the gateway IP of the
corresponding EVPN RT-5

Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
2021-06-07 17:58:22 -07:00
Russ White
41e4b0c0ee
Merge pull request #8406 from adharkar/frr-es_rd
bgpd: Handle EAD/EVI local route updates on VNI RD change
2021-05-11 06:51:41 -04:00
Patrick Ruddy
4d504309d3
Merge pull request #8459 from taspelund/no_rmac_on_mac_only
bgpd: Fix IP-VRF ext-comm check for MACIP routes
2021-05-05 09:48:11 +01:00
Quentin Young
0ffd0fb536 bgpd, zebra: encode ip addr len as uint16
This is always a 16 bit unsigned value.

- signed int is the wrong type to use
- encoding a signed int as a uint32 is bad practice
- decoding a signed int encoded as a uint32 into a uint16 is bad
  practice

Signed-off-by: Quentin Young <qlyoung@nvidia.com>
2021-04-28 11:43:45 -04:00