In zebra, the import check table and the nexthop check table
were combined. This leaves an issue when bgp happens to track
the same address in both the import check table and the nexthop
tracking table: when the item is removed from one table, the
call into zebra to remove it also drops tracking for the other
table.
Combine the two tables together and keep track of where each
entry came from, for processing in bgpd.
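A sketch of the idea, with illustrative flag and field names (not the actual FRR definitions):

/* Illustrative only: one cache entry per tracked prefix; flags record
 * which users asked for it, so removing one user does not drop zebra
 * tracking for the other.  tracking_users and the helper are made up. */
#define TRACKED_NHT    (1 << 0)
#define TRACKED_IMPORT (1 << 1)

static void bnc_tracking_stop(struct bgp_nexthop_cache *bnc, uint8_t user)
{
	UNSET_FLAG(bnc->tracking_users, user);       /* hypothetical field */
	if (!bnc->tracking_users)
		unregister_nexthop_from_zebra(bnc);  /* hypothetical helper */
}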
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Again, coverity believes that dest could be freed by a call
into bgp_dest_unlock_node, and it can be if the lock count
is wrong. Let's fix that assumption for coverity.
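A sketch of the shape of the fix, assuming bgp_dest_unlock_node() returns the dest (or NULL once it has actually been freed):

/* Sketch only: take the return value so the freed case is visibly
 * handled instead of assuming dest stays valid after the unlock. */
dest = bgp_dest_unlock_node(dest);
assert(dest);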
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Pass through a bunch of BGP_EVENT_ADD calls and make
the code use a proper connection instead of
peer->connection. There are still a bunch
of places where peer->connection is used;
later commits will probably go through and
clean these up further.
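For illustration, the call-site shape this moves toward (the event name is just an example):

/* Before: dereference the peer at every call site. */
BGP_EVENT_ADD(peer->connection, Receive_KEEPALIVE_message);

/* After: the caller already holds the connection and passes it through. */
BGP_EVENT_ADD(connection, Receive_KEEPALIVE_message);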
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Modify bgp_fsm_change_status to be connection-oriented, and
make the BGP_TIMER_ON and BGP_EVENT_ADD macros connection-oriented
as well. Attempt to make peer_xfer_conn a bit more
understandable because, frankly, it was/is confusing.
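For illustration, a call site before and after the change (the status value is only an example):

bgp_fsm_change_status(peer, Established);             /* peer-oriented */
bgp_fsm_change_status(peer->connection, Established); /* connection-oriented */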
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This commit renames the 'ifindex' field of the bnc structure.
As it is used only to handle IPv6 link-local addresses, let
us use the 'ifindex_ipv6_ll' name to avoid any confusion
with the ifindex values of the resolved next-hops of the bnc
structure.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The current implementation does not offer a way to allocate a
new label to bind to a received VPN route entry and redistribute
it with that new label.
This commit allocates a label for VPN entries that have
a valid label, and a reachable next-hop interface that is
configured as follows:
> interface eth0
> mpls bgp l3vpn-multi-domain-switching
> exit
An mplsvpn next-hop label binding entry is created in an mpls
vpn nexthop label bind hash table of the current BGP instance.
That entry is indexed by the (next-hop, orig_label) values
provided by the incoming updates, and shared with other updates
having the same (next-hop, orig_label) values.
A new 'LP_TYPE_BGP_L3VPN_BIND' label value is picked up from the
zebra mpls label pool and assigned to the new_label attribute.
The 'bgp_path_info' appends a 'bgp_mplsvpn_nh_label_bind' structure
to the 'mplsvpn' union structure. The two structures in the union are
never used at the same time, as a path is either a VRF update to export
or an MPLS VPN update. Using a union gives a 24-byte memory gain compared
to keeping both structures side by side (24 bytes instead of 48 bytes).
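An illustrative sketch of what a bind hash entry holds, keyed by (next-hop, orig_label); the type and field names are approximations, not the exact FRR definitions:

/* Illustrative: shared by all updates carrying the same
 * (next-hop, orig_label) pair, holding the label taken from zebra. */
struct bgp_mplsvpn_nh_label_bind_cache {
	struct prefix nexthop;    /* next-hop of the incoming VPN update */
	mpls_label_t orig_label;  /* label received with the update */
	mpls_label_t new_label;   /* allocated with LP_TYPE_BGP_L3VPN_BIND */
	unsigned int refcount;    /* number of updates sharing this binding */
};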
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This commit introduces a new method to associate a label with
prefixes exported to a VPNv4 backbone. All the methods to
associate a label with a BGP update are documented in RFC 4364,
chapter 4.3.2. Initially, only the "single label for an entire
VRF" method was available. This commit adds the "single label
for each attachment circuit" method.
The change impacts the control plane, because each BGP update
is checked to know whether its nexthop is reachable in the VRF
or not. If it is, then a unique label for the given
destination IP in the VRF is picked up. This label is
reused for another BGP update that has the same
nexthop IP address.
The change impacts the data plane, because the MPLS pop
mechanism applied to incoming labelled packets changes: the
MPLS label is popped, and the packet is sent directly to the
connected nexthop described in the previous outgoing BGP VPN
update.
By default the per-vrf mode is used, but the user may choose
the per-nexthop mode by using the vty command from the
previous commit. In the latter case, a per-vrf label
will nonetheless be allocated to handle networks that are not
directly connected; this is the case for local traffic, for instance.
The change also includes the following:
- ECMP case
If a route is learnt in a given VRF and resolves via an ECMP
nexthop, then when exporting the route as a BGP update with
label allocation per nexthop, two possible MPLS values could be
picked, which is not possible with the current implementation:
the NLRI for VPNv4 stores one prefix and one single label value,
not two, and RFC 8277 multiple label capability is not yet
available.
To avoid this corner case, when a route is resolved via more than one
nexthop, the label allocation per nexthop does not apply, and the
default per-vrf label is chosen (see the sketch at the end of this list).
Let us imagine BGP redistributes a static route using the `172.31.0.20`
nexthop. The nexthop resolution will find two different nexthops for a
unique BGP update.
> r1# show running-config
> [..]
> vrf vrf1
> ip route 172.31.0.30/32 172.31.0.20
> r1# show bgp vrf vrf1 nexthop
> [..]
> 172.31.0.20 valid [IGP metric 0], #paths 1
> gate 192.0.2.11
> gate 192.0.2.12
> Last update: Mon Jan 16 09:27:09 2023
> Paths:
> 1/1 172.31.0.30/32 VRF vrf1 flags 0x20018
To avoid this situation, BGP updates that resolve over multiple
nexthops use the unique per-vrf label.
- recursive route case
Prefixes that need a recursive route to be resolved can
also be eligible for mpls allocation per nexthop. In that
case, the nexthop will be the calculated recursive nexthop.
To achieve this, all nexthop types in bnc contexts are valid,
except for blackhole nexthops.
- network declared prefixes
Nexthop tracking is used to look for the reachability of the
prefixes. When the 'no bgp network import-check' command
is used, network-declared prefixes are kept active,
even if there is no active nexthop.
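A simplified sketch of the eligibility rule referenced above (illustrative, not the exact FRR code):

/* Per-nexthop label binding only applies when the path resolves over a
 * single, non-blackhole nexthop; otherwise the per-vrf label is kept. */
static bool eligible_for_per_nexthop_label(const struct bgp_nexthop_cache *bnc)
{
	if (bnc->nexthop_num != 1)
		return false;	/* ECMP: fall back to the per-vrf label */
	if (bnc->nexthop->type == NEXTHOP_TYPE_BLACKHOLE)
		return false;	/* blackhole nexthops are not eligible */
	return true;
}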
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This is the first in a series of commits whose goal is to rename
the thread system in FRR to an event system. There is a continual
problem of people confusing `struct thread` with a true
pthread. In reality, our entire thread.c is an event system.
In this commit, rename the thread.[ch] files to event.[ch].
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add a hash_clean_and_free() function and convert
the code to use it. This function takes a double
pointer to the hash so it can set it to NULL, and it
cleanly does nothing if the pointer is already NULL
(as a bunch of code tested for).
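A minimal sketch of the described behavior (the actual implementation may differ in detail):

void hash_clean_and_free(struct hash **hash, void (*free_func)(void *))
{
	if (!*hash)
		return;		/* callers no longer need to test for NULL */

	hash_clean(*hash, free_func);
	hash_free(*hash);
	*hash = NULL;		/* double pointer lets us clear the caller's handle */
}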
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
An RD may be built based on an AS number. Like the AS itself, the RD
may use the AS notation. The two examples below illustrate this:
RD 1.1:20 stands for an AS4B:NN RD with AS4B=65536 in dot format.
RD 0.1:20 stands for an AS2B:NNNN RD with AS2B=0.1 in dot+ format.
This commit adds the asnotation mode to the prefix_rd2str() API so as
to pick the relevant display.
Two new printfrr extensions are available to display the RD with
the two display methods above:
- The pRDD extension stands for the dot asnotation format.
- The pRDE extension stands for the dot+ asnotation format.
- The pRD extension has been renamed to pRDP.
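A usage sketch of the extensions (output shown for the AS4B example above):

char buf[RD_ADDRSTRLEN];
struct prefix_rd prd;	/* assume an AS4B:NN RD with AS4B=65536, NN=20 */

snprintfrr(buf, sizeof(buf), "%pRDP", &prd);	/* "65536:20" - plain */
snprintfrr(buf, sizeof(buf), "%pRDD", &prd);	/* "1.1:20"   - dot   */
snprintfrr(buf, sizeof(buf), "%pRDE", &prd);	/* "1.1:20"   - dot+  */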
The code is changed wherever the '%pRD' printf extension is called.
Because the asnotation may change the output, a macro defines the
asnotation mode to use. A side effect of forcing the mode is that
the string cannot be concatenated with other strings in a single
vty_out or snprintfrr call; those functions are now called multiple
times. When zlog_debug needs to display the RD together with some
other string, the old prefix_rd2str() API is used instead of the
printf extension.
Some code has been kept untouched:
- code related to running-config: wherever an RD is displayed there,
  its configured name should be dumped.
- bgp rfapi code
- bgp evpn multihoming code (partially done), since the logic to get
  the asnotation of 'struct bgp_evpn_es' is missing.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When processing a new local VNI, we were always walking the global EVPN
table to look for routes that needed to be removed due to a martian
nexthop change (specifically a tunnel-ip change).
Since the martian TIP table is global (all VNIs) and the walk is also in
the global table (all VNIs), we can trust that any new TIP from any VNI
will result in routes getting removed from the global table and
unimported from all live (L2)VNIs.
In other words, the only time this update is actionable is when we are
adding/removing an IP from the martian TIP table, and we do not need to
walk the table for normal refcount adjustments.
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
Added ipv4 and ipv6 options to the existing "show bgp nexthop"
command, to be able to query nexthops that belong to a
particular address family.
Also fixed the warnings of MR 12171.
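For illustration, the new address-family forms look like:
r1# show bgp nexthop ipv4
r1# show bgp nexthop ipv6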
Signed-off-by: Pooja Jagadeesh Doijode <pdoijode@nvidia.com>
On shutdown, bgp calls an unlock for the bnc connected table,
via the bgp_connected_cleanup function. This function is
only ever called on shutdown, so we know that bgp is going
away. The refcount for the connected data can be more than
1. Let's not worry about the refcount on shutdown and
just delete the nodes instead of leaving them around.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Use %pI4/%pI6 where possible, otherwise at least adjust the stack
buffer sizes for inet_ntop() calls.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
- for a given IP nexthop, dump all NH entries, including
colored entries or entries with an ifindex.
- when a given IP nexthop is requested, the path is displayed.
For better readability, remove the carriage return between
'Last update' and 'Paths', because the ctime() function already
appends a newline.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Issuing `show bgp nexthop A.B.C.D` fails even if that
nexthop exists:
eva# show bgp nexthop 192.168.119.120
specified nexthop does not have entry
Fixed:
eva# show bgp nexthop 192.168.119.120
192.168.119.120 valid [IGP metric 0], #paths 0, peer 192.168.119.120
if enp39s0
Last update: Fri Sep 30 14:55:13 2022
Paths:
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Let's convert to our actual library call instead
of using yet another abstraction that makes it fun
for people to switch daemons.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
FRR should create a bnc per peer, not ones that write
over each other. Currently, when FRR has multiple
interface-based peerings, BGP was creating a single bnc.
This is insufficient, in that we were accidentally
overwriting the one LL with other data. This causes issues
when there are multiple such peerings, and weird startup
issues with the interfaces being peered over.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Firstly, *keep no change* for `hash_get()` with a NULL
`alloc_func`. Only focus on cases where `hash_get()` is
called with a non-NULL `alloc_func`.
Since `hash_get()` with a non-NULL `alloc_func` parameter
cannot fail, just ignore its returned value; it must not
be NULL. So in this case, remove the unnecessary NULL check
of the returned value and add a `(void)` cast in front of
the call.
Importantly, also *keep no change* for the two cases with a
non-NULL `alloc_func`:
1) Use `assert(<returned_data> == <searching_data>)` to
ensure it is a created node, not a found node.
Refer to `isis_vertex_queue_insert()` of isisd; there
are many examples of this case in isisd.
2) Use `<returned_data> != <searching_data>` to decide that
it is a found node, then free <searching_data>.
Refer to `aspath_intern()` of bgpd; there are many
examples of this case in bgpd.
Here, <returned_data> is the value returned from `hash_get()`,
and <searching_data> is the data that is to be put into the
hash table.
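The call shapes, sketched with generic names (hash_alloc_intern is the library's identity allocator; data, ret, and data_free are placeholders):

/* Result not needed: with a non-NULL alloc_func the call cannot fail,
 * so the returned value is explicitly discarded. */
(void)hash_get(hash, data, hash_alloc_intern);

/* Case 1: the node must be newly created, never a pre-existing one. */
assert(hash_get(hash, data, hash_alloc_intern) == data);

/* Case 2: an existing node may be returned; then the search object is
 * freed, as in the aspath_intern() pattern. */
ret = hash_get(hash, data, hash_alloc_intern);
if (ret != data)
	data_free(data);	/* placeholder free function */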
Signed-off-by: anlan_cs <vic.lan@pica8.com>
When an EVPN prefix route with a gateway IP overlay index is imported into
the IP vrf at the ingress PE, the BGP nexthop of this route is set to the
gateway IP. For this vrf route to be valid, the following conditions must be met:
- Gateway IP nexthop of this route should be L3 reachable, i.e., this route
should be resolved in RIB.
- A remote MAC/IP route should be present for the gateway IP address in the
EVI(L2VPN table).
To check the first condition, the gateway IP is registered with nht (nexthop
tracking) to receive reachability notifications for this IP from the zebra RIB.
If the gateway IP is reachable, zebra sends the reachability information (i.e.,
the nexthop interface) for the gateway IP.
This nexthop interface should be the SVI interface.
Now, to find the type-2 route corresponding to the gateway IP, we need to fetch
the VNI for the above SVI.
To do this VNI lookup efficiently, define a hashtable of struct bgpevpn with
svi_ifindex as key:
struct hash *vni_svi_hash;
An EVI instance is added to vni_svi_hash if its svi_ifindex is nonzero.
Using this hash, we obtain struct bgpevpn corresponding to the gateway IP.
For the gateway IP overlay index recursive lookup, once we find the correct EVI,
we have to look up its route table for a MAC/IP prefix. As we would have to
iterate over the entire route table for every lookup, this lookup is expensive.
We can optimize it by adding all the remote IP addresses to a hash table.
The following hash table is defined for this purpose in struct bgpevpn:
struct hash *remote_ip_hash;
When a MAC/IP route is installed in the EVI table, it is also added to
remote_ip_hash.
It is possible to have multiple MAC/IP routes with the same IP address because
of host move scenarios. Thus, for every address addr in remote_ip_hash, we
maintain a list of all the MAC/IP routes having addr as their IP address.
The following structure defines an address in remote_ip_hash:
struct evpn_remote_ip {
struct ipaddr addr;
struct list *macip_path_list;
};
A Boolean field is added to struct bgp_nexthop_cache to indicate that the
nexthop is an EVPN gateway IP overlay index.
bool is_evpn_gwip_nexthop;
A flag BGP_NEXTHOP_EVPN_INCOMPLETE is added to struct bgp_nexthop_cache.
This flag is set when the gateway IP is L3 reachable but not yet resolved by a
MAC/IP route.
The following table explains the combination of L3 and L2 reachability w.r.t.
the BGP_NEXTHOP_VALID and BGP_NEXTHOP_EVPN_INCOMPLETE flags:
                | MACIP resolved  | MACIP unresolved
----------------|-----------------|------------------
 L3 reachable   | VALID = 1       | VALID = 0
                | INCOMPLETE = 0  | INCOMPLETE = 1
----------------|-----------------|------------------
 L3 unreachable | VALID = 0       | VALID = 0
                | INCOMPLETE = 0  | INCOMPLETE = 0
The procedure used to check if the gateway IP is resolvable by a MAC/IP
route is:
- Find the EVI/L2VRF that corresponds to the nexthop SVI, using vni_svi_hash.
- Check if the gateway IP is present in remote_ip_hash in this EVI.
When the gateway IP is L3 reachable and it is also resolved by a MAC/IP route,
unset the BGP_NEXTHOP_EVPN_INCOMPLETE flag and set the BGP_NEXTHOP_VALID flag.
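A sketch of that check; the hashes and flags are those described above, while the lookup helper and local variables are illustrative:

/* Illustrative: find the EVI via vni_svi_hash (keyed by the nexthop's
 * SVI ifindex), then look for the gateway IP in its remote_ip_hash. */
struct bgpevpn *vpn = vni_svi_hash_lookup(bgp, svi_ifindex); /* hypothetical helper */

if (vpn && hash_lookup(vpn->remote_ip_hash, &tmp_remote_ip)) {
	UNSET_FLAG(bnc->flags, BGP_NEXTHOP_EVPN_INCOMPLETE);
	SET_FLAG(bnc->flags, BGP_NEXTHOP_VALID);
}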
Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
We are inconsistently using peer_established(peer), while
sometimes using `peer->status == Established`. Just convert
over to using the function for consistency.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
bgp is currently registering v6 LL addresses as nexthops to be
tracked by zebra. This presents several problems:
a) zebra does not properly track multiple prefixes that match
the same route at this point in time.
b) BGP was receiving nexthops that were simply incorrect because
of (a).
c) When a nexthop changed in a way that really didn't affect the
v6 LL, we were responding incorrectly because of this.
Modify the code so that bgp nexthop tracking notices when we
are trying to register a v6 LL. When we do so, shortcut
and watch interface up/down events for this v6 LL, and do
the work when an interface goes up or down for this type
of tracking.
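A sketch of the registration shortcut (illustrative, not the exact FRR code):

/* For an IPv6 link-local nexthop, skip zebra NHT registration and let
 * interface up/down events drive validity instead. */
if (afi == AFI_IP6 && IN6_IS_ADDR_LINKLOCAL(&p->u.prefix6)) {
	track_v6_ll_via_interface_events(bnc);	/* hypothetical helper */
	return;
}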
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Some of the `show memory` strings in bgp are longer than the
columns we have allocated for it. Shorten some strings to
make them fit and have the output pleasing to the eye.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The tip hash is only used when we are dealing with
evpn. In bgp_nexthop_self we are doing a memset
irrespective of whether we will ever find data. Yes,
hash_lookup will return pretty quickly.
Modify the code to avoid doing the memset when
the tip hash is empty, since we know we'll
never find anything. With full BGP feeds this
small memset does take some time.
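A sketch of the short-circuit (illustrative):

/* The tip hash is only populated for EVPN, so when it is empty the
 * lookup can never match and the memset can be skipped. */
if (hashcount(bgp->tip_hash) == 0)
	return false;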
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Since the addition of srte_color to the comparison for bgp nexthops,
it is possible to have several nexthops per prefix. But since zebra
only stores a per-prefix registration, we should not unregister from
nh notifications for a prefix until all the nexthops for that prefix
have been deleted. Otherwise we can get into a deadlock situation
where BGP thinks we have registered but we have unregistered from zebra.
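A sketch of the intended check (both helper names are hypothetical):

/* Only ask zebra to stop tracking the prefix once no bnc for any
 * SR-TE color still references it. */
if (!another_bnc_exists_for_prefix(&bnc->prefix))	/* hypothetical helper */
	unregister_prefix_from_zebra(bnc);		/* hypothetical helper */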
Signed-off-by: Pat Ruddy <pat@voltanet.io>
Extend the NHT code so that only the affected BGP routes are processed
whenever an SR policy is updated in zebra.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>