mirror_frr

mirror of https://git.proxmox.com/git/mirror_frr synced 2025-06-03 12:24:44 +00:00

Author	SHA1	Message	Date
Louis Scalbert	e446308d76	bgpd: fix dynamic peer graceful restart race condition bgp_llgr topotest sometimes fails at step 8: > topo: STEP 8: 'Check if we can see 172.16.1.2/32 after R4 (dynamic peer) was killed' R4 neighbor is deleted on R2 because it fails to re-connect: > 14:33:40.128048 BGP: [HKWM3-ZC5QP] 192.168.3.1 fd -1 went from Established to Clearing > 14:33:40.128154 BGP: [MJ1TJ-HEE3V] 192.168.3.1(r4) graceful restart timer expired > 14:33:40.128158 BGP: [ZTA2J-YRKGY] 192.168.3.1(r4) graceful restart stalepath timer stopped > 14:33:40.128162 BGP: [H917J-25EWN] 192.168.3.1(r4) Long-lived stale timer (IPv4 Unicast) started for 20 sec > 14:33:40.128168 BGP: [H5X66-NXP9S] 192.168.3.1(r4) Long-lived set stale community (LLGR_STALE) for: 172.16.1.2/32 > 14:33:40.128220 BGP: [H5X66-NXP9S] 192.168.3.1(r4) Long-lived set stale community (LLGR_STALE) for: 192.168.3.0/24 > [...] > 14:33:41.138869 BGP: [RGGAC-RJ6WG] 192.168.3.1 [Event] Connect failed 111(Connection refused) > 14:33:41.138906 BGP: [ZWCSR-M7FG9] 192.168.3.1 [FSM] TCP_connection_open_failed (Connect->Active), fd 23 > 14:33:41.138912 BGP: [JA9RP-HSD1K] 192.168.3.1 (dynamic neighbor) deleted (bgp_connect_fail) > 14:33:41.139126 BGP: [P98A2-2RDFE] 192.168.3.1(r4) graceful restart stalepath timer stopped `af8496af08` ("bgpd: Do not delete BGP dynamic peers if graceful restart kicks in") forgot to modify bgp_connect_fail() Do not delete the peer in bgp_connect_fail() if Non-Stop-Forwarding is in progress. Fixes: `af8496af08` ("bgpd: Do not delete BGP dynamic peers if graceful restart kicks in") Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>	2024-05-16 15:19:11 +02:00
Russ White	827badc53c	Merge pull request #15883 from opensourcerouting/fix/bgpd_gr_fsm bgpd: Apply NOOP when doing negative commands for GR operations	2024-05-07 09:56:51 -04:00
Donatas Abraitis	7b5595b61d	bgpd: Print old/new states of graceful restart FSM To better debug what's going on before/after. Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2024-04-30 13:44:17 +03:00
Philippe Guibert	f101108e3e	bgpd: fix Coverity ID 1585206 The return value of bgp_getsockname() should always be checked. Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>	2024-04-29 15:44:24 +02:00
Philippe Guibert	78ce63952a	bgpd: fix addressing information of non established outgoing sessions When trying to connect to a BGP peer that does not respond, the 'show bgp neighbors' command does not give any indication of the local and remote addresses used: > # show bgp neighbors > BGP neighbor is 192.0.2.150, remote AS 65500, local AS 65500, internal link > Local Role: undefined > Remote Role: undefined > BGP version 4, remote router ID 0.0.0.0, local router ID 192.0.2.1 > BGP state = Connect > [..] > Connections established 0; dropped 0 > Last reset 00:00:04, Waiting for peer OPEN (n/a) > Internal BGP neighbor may be up to 255 hops away. > BGP Connect Retry Timer in Seconds: 120 > Next connect timer due in 117 seconds > Read thread: off Write thread: off FD used: 27 The addressing information (address and port) is only available when the TCP session is established, whereas this information is present at the system level: > root@ubuntu2204:~# netstat -pan \| grep 192.0.2.1 > tcp 0 0 192.0.2.1:179 192.0.2.150:38060 SYN_RECV - > tcp 0 1 192.0.2.1:46526 192.0.2.150:179 SYN_SENT 488310/bgpd Add the display for outgoing BGP sessions, as the getsockname() API provides information for connected streams. When the getpeername() API does not give any information, use the peer configuration (destination port is encoded in peer->port). > # show bgp neighbors > BGP neighbor is 192.0.2.150, remote AS 65500, local AS 65500, internal link > Local Role: undefined > Remote Role: undefined > BGP version 4, remote router ID 0.0.0.0, local router ID 192.0.2.1 > BGP state = Connect > [..] > Connections established 0; dropped 0 > Last reset 00:00:16, Waiting for peer OPEN (n/a) > Local host: 192.0.2.1, Local port: 46084 > Foreign host: 192.0.2.150, Foreign port: 179 Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>	2024-04-15 09:16:54 +02:00
Donatas Abraitis	4967bf6d72	bgpd: Send "Send Hold Timer Expired" on such events notification This is required by the current (latest/-02 draft). IANA has registered code 8 for "Send Hold Timer Expired" in the "BGP Error (Notification) Codes" sub-registry under the "Border Gateway Protocol (BGP) Parameters" registry. https://datatracker.ietf.org/doc/html/draft-ietf-idr-bgp-sendholdtimer Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2024-02-29 15:37:53 +02:00
Donatas Abraitis	72f0e06824	bgpd: Implement Paths-Limit capability https://datatracker.ietf.org/doc/html/draft-abraitis-idr-addpath-paths-limit Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2024-02-13 17:07:15 +02:00
Donatas Abraitis	2c69b4b516	bgpd: Fix format overflow for graceful-restart debug logs Use enum instead of int, and make the compiler happy when using -format-overflow. Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2024-01-24 09:06:43 +02:00
Martin Winter	0222f553fb	Revert "bgpd: On shutdown do not create a workqueue for the self peer" This reverts commit `7bf3c2fb19`. Commit reverted as it introduces a memory leak during the tests Signed-off-by: Martin Winter <mwinter@opensourcerouting.org>	2024-01-06 15:57:12 +01:00
Donald Sharp	bdb5ae8bce	bgpd: Make `suppress-fib-pending` clear peering When a peer has come up and already started installing routes into the rib and `suppress-fib-pending` is either turned on or off, BGP is left with some routes that may need to be withdrawn from peers and routes that it does not know the status of. Clear the BGP peers for the interesting parties and let us come up to speed as needed. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-12-12 13:48:10 -05:00
Donald Sharp	7bf3c2fb19	bgpd: On shutdown do not create a workqueue for the self peer When bgp is shutting down, it calls bgp_fsm_change_status on everything including a self peer, which goes through and cleans the tables of the self peer data structures as if it's a real peer. Add a bit of code to just not do the work at all. This allows unlocks to flow a bit further and for the self peer to be deleted on shutdown. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-11-21 12:41:18 -05:00
Igor Ryzhov	7d67b9ff28	build: add -Wimplicit-fallthrough Also: - replace all /* fallthrough */ comments with portable fallthrough; pseudo keyword to accommodate both gcc and clang - add missing break; statements as required by older versions of gcc - cleanup some code to remove unnecessary fallthrough Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>	2023-10-12 21:23:18 +03:00
Donatas Abraitis	232470f3b7	bgpd: Set TCP MSS for the socket even if the session is set to passive Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2023-09-18 15:42:06 +03:00
Donatas Abraitis	142be67f8c	bgpd: Keep remote/local socket unions on BGP start event Not sure why this is needed, because it's reset on bgp_connect_success(), when the session is UP. When the session is reset, it clears those variables, and we are not able to see what remote address was before, etc. hostLocal, hostRemote reports Unknown for `show bgp neighbor json`. Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2023-09-13 13:23:45 +03:00
Donald Sharp	0c3a70c644	bgpd: Move the peer->su to connection->su The sockunion is per connection. So let's move it over. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-10 08:31:25 -04:00
Donald Sharp	70c3c27ebc	bgpd: bgp_connect is struct peer_connection oriented Make it so. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-10 08:31:25 -04:00
Donald Sharp	b57e023cc2	bgpd: Convert bgp_fsm_nht_update to take a connection Convert this function over to using a connection. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-10 08:31:25 -04:00
Donald Sharp	6dc9dc1edd	bgpd: modify bgp_connect_check to use a connection Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-10 08:31:25 -04:00
Donald Sharp	d2ba78929f	bgpd: bgp_fsm_change_status/BGP_TIMER_ON and BGP_EVENT_ADD Modify bgp_fsm_change_status to be connection oriented and also make the BGP_TIMER_ON and BGP_EVENT_ADD macros connection oriented as well. Attempt to make peer_xfer_conn a bit more understandable because, frankly it was/is confusing. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-10 08:31:25 -04:00
Donald Sharp	7b1158b169	bgpd: peer_established should be connection oriented The peer_established function should be connection oriented. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-10 08:31:25 -04:00
Donald Sharp	d1e7215da0	bgpd: make bgp_keepalives_on\|off connection oriented The bgp_keepalives_on\|off functions should use a peer_connection as a basis for its operation. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-10 08:31:25 -04:00
Donald Sharp	1f8274e050	bgpd: bgp_open_send is connection oriented not peer oriented The bgp_open_send function should use a connection oriented pointer for its basis. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-10 08:31:25 -04:00
Donald Sharp	33a14ce1f2	bgpd: convert bgp_stop_with_notify to connection based The bgp_stop_with_notify function should use a peer_connection pointer as the basis instead of a peer. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-10 08:31:25 -04:00
Donald Sharp	3c7ef0a9c7	bgpd: make bgp_timer_set use a peer_connection instead The bgp_timer_set function should use a peer_connection pointer instead. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-10 08:31:25 -04:00
Donald Sharp	3842286ed4	bgpd: bgp_notify_send use peer_connection instead of peer The bgp_notify_send function should use a peer_connection Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-09 16:28:05 -04:00
Donald Sharp	513c8c4f74	bgpd: move t_pmax_restart to peer_connection The t_pmax_restart event pointer belongs in the peer_connection pointer. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-09 16:28:05 -04:00
Donald Sharp	981dd86920	bgpd: move t_generate_updgrp_packets into peer_connection The t_generate_updgrp_packets event pointer belongs in the peer_connection pointer. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-09 16:28:05 -04:00
Donald Sharp	13ae845b94	bgpd: move t_gr_restart and _stale into peer_connection The t_gr_restart and t_gr_stale event pointers belong in the peer_connection pointer. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-09 16:28:05 -04:00
Donald Sharp	e79443fcd8	bgpd: move t_routeadv to peer_connection The t_routeadv belongs to the peer_connection data structure Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-09 16:28:05 -04:00
Donald Sharp	6b7e50aacc	bgpd: t_connect_check_r and w move to peer connection These two event pointers belong in the peer_connection Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-09 16:28:05 -04:00
Donald Sharp	bdb832b489	bgpd: t_holdtime move to peer_connection The t_holdtime event pointer belongs in the peer connection Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-09 16:28:05 -04:00
Donald Sharp	904c98c4d9	bgpd: move t_start into peer_connection The t_start event pointer belongs on the peer_connection Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-09 16:28:05 -04:00
Donald Sharp	b8f3b2cd4a	bgpd: move t_delayopen from peer to peer_connection This belongs in peer_connection let's move it. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-09 16:28:05 -04:00
Donald Sharp	a8888edd42	bgpd: t_connect conversion from peer to peer_connect Move t_connect into struct peer_connect Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-09 16:28:05 -04:00
Donald Sharp	4aec430ce3	bgpd: Remove BGP_EVENT_FLUSH and just use event_cancel_event_ready The usage of BGP_EVENT_FLUSH is unnecessarily abstracting the call into event_cancel_event_ready and in addition the macro was not always being used! Just convert to using the actual event_cancel_event_ready function directly. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-09 16:28:05 -04:00
Donald Sharp	ce1f5d3774	bgpd: Add peers back to peer hash when peer_xfer_conn fails It was noticed that occasionally peering failed in a testbed; upon investigation it was found that the peer was not in the peer hash and we saw these failure messages: Aug 25 21:31:15 doca-hbn-service-bf3-s06-1-ipmi bgpd[3048]: %NOTIFICATION: sent to neighbor 2001:cafe:1ead:4::4 4/0 (Hold Timer Expired) 0 bytes Aug 25 21:31:22 doca-hbn-service-bf3-s06-1-ipmi bgpd[3048]: [EC 100663299] Can't get remote address and port: Transport endpoint is not connected Aug 25 21:31:22 doca-hbn-service-bf3-s06-1-ipmi bgpd[3048]: [EC 100663299] %bgp_getsockname() failed for peer 2001:cafe:1ead:4::4 fd 27 (from_peer fd -1) Aug 25 21:31:22 doca-hbn-service-bf3-s06-1-ipmi bgpd[3048]: [EC 33554464] %Neighbor failed in xfer_conn root@doca-hbn-service-bf3-s06-1-ipmi:/var/log/hbn/frr# vtysh -c 'show bgp peerhash' \| grep 2001:cafe:1ead:4::4 root@doca-hbn-service-bf3-s06-1-ipmi:/var/log/hbn/frr# Upon looking at the code the peer_xfer_conn function can fail and the bgp_establish code will then return before adding the peer back to the peerhash. This is only part of the failure. The peer also appears to be in a state where it is no longer initiating connection attempts but that will be another committed fix when we figure that one out. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-08-31 11:04:44 -04:00
Donatas Abraitis	bc81691247	Revert "bgpd: Add peers back to peer hash when peer_xfer_conn fails" peer is NULL, but we pass it to hash_get(). This reverts commit `6f8c927b03`.	2023-08-31 17:33:57 +03:00
Jafar Al-Gharaibeh	885146ea9c	Merge pull request #14301 from donaldsharp/bgp_lost_hash bgpd: Add peers back to peer hash when peer_xfer_conn fails	2023-08-30 20:11:46 -05:00
Donatas Abraitis	e89fd723ee	Merge pull request #14118 from GaladrielZhao/master bgpd: Convert from struct bgp_node to struct bgp_dest	2023-08-30 17:43:29 +03:00
Donald Sharp	6f8c927b03	bgpd: Add peers back to peer hash when peer_xfer_conn fails It was noticed that occasionally peering failed in a testbed; upon investigation it was found that the peer was not in the peer hash and we saw these failure messages: Aug 25 21:31:15 doca-hbn-service-bf3-s06-1-ipmi bgpd[3048]: %NOTIFICATION: sent to neighbor 2001:cafe:1ead:4::4 4/0 (Hold Timer Expired) 0 bytes Aug 25 21:31:22 doca-hbn-service-bf3-s06-1-ipmi bgpd[3048]: [EC 100663299] Can't get remote address and port: Transport endpoint is not connected Aug 25 21:31:22 doca-hbn-service-bf3-s06-1-ipmi bgpd[3048]: [EC 100663299] %bgp_getsockname() failed for peer 2001:cafe:1ead:4::4 fd 27 (from_peer fd -1) Aug 25 21:31:22 doca-hbn-service-bf3-s06-1-ipmi bgpd[3048]: [EC 33554464] %Neighbor failed in xfer_conn root@doca-hbn-service-bf3-s06-1-ipmi:/var/log/hbn/frr# vtysh -c 'show bgp peerhash' \| grep 2001:cafe:1ead:4::4 root@doca-hbn-service-bf3-s06-1-ipmi:/var/log/hbn/frr# Upon looking at the code the peer_xfer_conn function can fail and the bgp_establish code will then return before adding the peer back to the peerhash. This is only part of the failure. The peer also appears to be in a state where it is no longer initiating connection attempts but that will be another committed fix when we figure that one out. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-08-30 07:31:17 -04:00
Donald Sharp	5160672d99	bgpd: Prevent use after free When bgp_stop finishes and it deletes the peer it is sending back a return code stating that the peer was deleted, but the code was operating like it was not deleted and continued to access the data structure. Fix. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-08-25 10:43:56 -04:00
Donald Sharp	d4a9b103b7	bgpd: bgp_event_update switch to a switch The return code from an event handling perspective is an enum. Let's intentionally make it a switch so that all cases are ensured to be covered now and in the future. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-08-25 10:28:02 -04:00
Donald Sharp	8dd97a7404	bgpd: bgp_event_update mixes enums with a non-enum Straighten out the code to not mix the two. Especially since bgp was assigning non enum values to the enum. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-08-25 10:03:14 -04:00
Yuqing Zhao	6e7f305e54	bgpd: Convert from struct bgp_node to struct bgp_dest This is based on @donaldsharp's work. The current code base uses the struct bgp_node data structure. The problem with this is that it creates a bunch of extra data per route_node. The table structure generates ‘holder’ nodes that are never going to receive bgp routes, and now the memory of those nodes is allocated as if they are a full bgp_node. After splitting up the bgp_node into bgp_dest and route_node, the memory of a ‘holder’ node which does not have any bgp data will be allocated as the route_node, not the bgp_node, and the memory usage is reduced. The memory usage of a BGP node will be reduced from 200B to 96B. The total memory usage optimization of this part is ~16.00%. Signed-off-by: Donald Sharp <sharpd@nvidia.com> Signed-off-by: Yuqing Zhao <xiaopanghu99@163.com>	2023-08-22 09:35:46 +08:00
Donald Sharp	419c5b4ef0	bgpd: Cleanup bgp_start declarations Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-08-18 09:29:04 -04:00
Donald Sharp	26ad36e097	bgpd: Convert FSM to use `struct peer_connection` The BGP FSM was using the peer as the unit of work but the FSM is connection focused. So let's switch it over to using that. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-08-18 09:29:04 -04:00
Donald Sharp	3e5a31b24e	bgpd: Convert `struct peer_connection` to dynamically allocated As part of the conversion to a `struct peer_connection` it will be desirable to have 2 pointers one for when we open a connection and one for when we receive a connection. Start this actual conversion over to this in `struct peer`. If this sounds confusing take a look at the bgp state machine for connections and how it resolves the processing of this router opening -vs- this router receiving an open. At some point in time the state machine decides that we are keeping one of the two connections. Future commits will allow us to untangle the peer/doppelganger duality with this abstraction. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-08-18 09:29:04 -04:00
Donald Sharp	5d52756735	bgpd: Move t_process_packet and t_process_packet_error to connection The t_process_packet thread events should be managed by the connection. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-08-18 09:29:04 -04:00
Donald Sharp	e20c23fa5b	bgpd: Move status and ostatus to `struct peer_connection` The status and ostatus are a function of the `struct peer_connection` move it into that data structure. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-08-18 09:29:04 -04:00
Donald Sharp	71d72c4998	bgpd: READ and WRITE flags are a part of the connection Move PEER_THREAD_WRITES_ON and PEER_THREAD_READS_ON to be a part of the `struct peer_connection` since this is a connection oriented bit of data. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-08-18 09:29:04 -04:00


356 Commits