mirror_frr

mirror of https://git.proxmox.com/git/mirror_frr synced 2025-06-02 13:03:50 +00:00

Author	SHA1	Message	Date
Louis Scalbert	0b091e71ee	bgpd: fix dynamic peer graceful restart race condition bgp_llgr topotest sometimes fails at step 8: > topo: STEP 8: 'Check if we can see 172.16.1.2/32 after R4 (dynamic peer) was killed' R4 neighbor is deleted on R2 because it fails to re-connect: > 14:33:40.128048 BGP: [HKWM3-ZC5QP] 192.168.3.1 fd -1 went from Established to Clearing > 14:33:40.128154 BGP: [MJ1TJ-HEE3V] 192.168.3.1(r4) graceful restart timer expired > 14:33:40.128158 BGP: [ZTA2J-YRKGY] 192.168.3.1(r4) graceful restart stalepath timer stopped > 14:33:40.128162 BGP: [H917J-25EWN] 192.168.3.1(r4) Long-lived stale timer (IPv4 Unicast) started for 20 sec > 14:33:40.128168 BGP: [H5X66-NXP9S] 192.168.3.1(r4) Long-lived set stale community (LLGR_STALE) for: 172.16.1.2/32 > 14:33:40.128220 BGP: [H5X66-NXP9S] 192.168.3.1(r4) Long-lived set stale community (LLGR_STALE) for: 192.168.3.0/24 > [...] > 14:33:41.138869 BGP: [RGGAC-RJ6WG] 192.168.3.1 [Event] Connect failed 111(Connection refused) > 14:33:41.138906 BGP: [ZWCSR-M7FG9] 192.168.3.1 [FSM] TCP_connection_open_failed (Connect->Active), fd 23 > 14:33:41.138912 BGP: [JA9RP-HSD1K] 192.168.3.1 (dynamic neighbor) deleted (bgp_connect_fail) > 14:33:41.139126 BGP: [P98A2-2RDFE] 192.168.3.1(r4) graceful restart stalepath timer stopped `af8496af08` ("bgpd: Do not delete BGP dynamic peers if graceful restart kicks in") forgot to modify bgp_connect_fail() Do not delete the peer in bgp_connect_fail() if Non-Stop-Forwarding is in progress. Fixes: `af8496af08` ("bgpd: Do not delete BGP dynamic peers if graceful restart kicks in") Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com> (cherry picked from commit `e446308d76`)	2024-05-17 06:41:39 +00:00
Donatas Abraitis	c12c5c1114	bgpd: Fix format overflow for graceful-restart debug logs Use enum instead of int, and make the compiler happy when using -format-overflow. Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org> (cherry picked from commit `2c69b4b516`)	2024-01-24 17:27:34 +00:00
Donald Sharp	6fb4068f35	bgpd: Make `suppress-fib-pending` clear peering When a peer has come up and already started installing routes into the rib and `suppress-fib-pending` is then turned on or off, BGP is left with some routes that may need to be withdrawn from peers and routes that it does not know the status of. Clear the BGP peers for the interesting parties and let BGP come back up to speed as needed. Signed-off-by: Donald Sharp <sharpd@nvidia.com> (cherry picked from commit `bdb5ae8bce`)	2023-12-12 19:32:47 +00:00
Donatas Abraitis	232470f3b7	bgpd: Set TCP MSS for the socket even if the session is set to passive Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2023-09-18 15:42:06 +03:00
Donatas Abraitis	142be67f8c	bgpd: Keep remote/local socket unions on BGP start event Not sure why this is needed, because it's reset on bgp_connect_success(), when the session is UP. When the session is reset, it clears those variables, and we are not able to see what remote address was before, etc. hostLocal, hostRemote reports Unknown for `show bgp neighbor json`. Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2023-09-13 13:23:45 +03:00
Donald Sharp	0c3a70c644	bgpd: Move the peer->su to connection->su The sockunion is per connection. So let's move it over. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-10 08:31:25 -04:00
Donald Sharp	70c3c27ebc	bgpd: bgp_connect is struct peer_connection oriented Make it so. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-10 08:31:25 -04:00
Donald Sharp	b57e023cc2	bgpd: Convert bgp_fsm_nht_update to take a connection Convert this function over to using a connection. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-10 08:31:25 -04:00
Donald Sharp	6dc9dc1edd	bgpd: modify bgp_connect_check to use a connection Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-10 08:31:25 -04:00
Donald Sharp	d2ba78929f	bgpd: bgp_fsm_change_status/BGP_TIMER_ON and BGP_EVENT_ADD Modify bgp_fsm_change_status to be connection oriented and also make the BGP_TIMER_ON and BGP_EVENT_ADD macros connection oriented as well. Attempt to make peer_xfer_conn a bit more understandable because, frankly it was/is confusing. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-10 08:31:25 -04:00
Donald Sharp	7b1158b169	bgpd: peer_established should be connection oriented The peer_established function should be connection oriented. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-10 08:31:25 -04:00
Donald Sharp	d1e7215da0	bgpd: make bgp_keepalives_on|off connection oriented The bgp_keepalives_on|off functions should use a peer_connection as a basis for their operation. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-10 08:31:25 -04:00
Donald Sharp	1f8274e050	bgpd: bgp_open_send is connection oriented not peer oriented The bgp_open_send function should use a connection oriented pointer for its basis. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-10 08:31:25 -04:00
Donald Sharp	33a14ce1f2	bgpd: convert bgp_stop_with_notify to connection based The bgp_stop_with_notify function should use a peer_connection pointer as the basis instead of a peer. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-10 08:31:25 -04:00
Donald Sharp	3c7ef0a9c7	bgpd: make bgp_timer_set use a peer_connection instead The bgp_timer_set function should use a peer_connection pointer instead. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-10 08:31:25 -04:00
Donald Sharp	3842286ed4	bgpd: bgp_notify_send use peer_connection instead of peer The bgp_notify_send function should use a peer_connection Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-09 16:28:05 -04:00
Donald Sharp	513c8c4f74	bgpd: move t_pmax_restart to peer_connection The t_pmax_restart event pointer belongs in the peer_connection pointer. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-09 16:28:05 -04:00
Donald Sharp	981dd86920	bgpd: move t_generate_updgrp_packets into peer_connection The t_generate_updgrp_packets event pointer belongs in the peer_connection pointer. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-09 16:28:05 -04:00
Donald Sharp	13ae845b94	bgpd: move t_gr_restart and _stale into peer_connection The t_gr_restart and t_gr_stale event pointers belong in the peer_connection pointer. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-09 16:28:05 -04:00
Donald Sharp	e79443fcd8	bgpd: move t_routeadv to peer_connection The t_routeadv belongs to the peer_connection data structure Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-09 16:28:05 -04:00
Donald Sharp	6b7e50aacc	bgpd: t_connect_check_r and w move to peer connection These two event pointers belong in the peer_connection Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-09 16:28:05 -04:00
Donald Sharp	bdb832b489	bgpd: t_holdtime move to peer_connection The t_holdtime event pointer belongs in the peer connection Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-09 16:28:05 -04:00
Donald Sharp	904c98c4d9	bgpd: move t_start into peer_connection The t_start event pointer belongs on the peer_connection Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-09 16:28:05 -04:00
Donald Sharp	b8f3b2cd4a	bgpd: move t_delayopen from peer to peer_connection This belongs in peer_connection let's move it. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-09 16:28:05 -04:00
Donald Sharp	a8888edd42	bgpd: t_connect conversion from peer to peer_connect Move t_connect into struct peer_connect Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-09 16:28:05 -04:00
Donald Sharp	4aec430ce3	bgpd: Remove BGP_EVENT_FLUSH and just use event_cancel_event_ready The usage of BGP_EVENT_FLUSH is unnecessarily abstracting the call into event_cancel_event_ready and in addition the macro was not always being used! Just convert to using the actual event_cancel_event_ready function directly. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-09-09 16:28:05 -04:00
Donald Sharp	ce1f5d3774	bgpd: Add peers back to peer hash when peer_xfer_conn fails It was noticed that occasionally peering failed in a testbed. Upon investigation it was found that the peer was not in the peer hash and we saw these failure messages: Aug 25 21:31:15 doca-hbn-service-bf3-s06-1-ipmi bgpd[3048]: %NOTIFICATION: sent to neighbor 2001:cafe:1ead:4::4 4/0 (Hold Timer Expired) 0 bytes Aug 25 21:31:22 doca-hbn-service-bf3-s06-1-ipmi bgpd[3048]: [EC 100663299] Can't get remote address and port: Transport endpoint is not connected Aug 25 21:31:22 doca-hbn-service-bf3-s06-1-ipmi bgpd[3048]: [EC 100663299] %bgp_getsockname() failed for peer 2001:cafe:1ead:4::4 fd 27 (from_peer fd -1) Aug 25 21:31:22 doca-hbn-service-bf3-s06-1-ipmi bgpd[3048]: [EC 33554464] %Neighbor failed in xfer_conn root@doca-hbn-service-bf3-s06-1-ipmi:/var/log/hbn/frr# vtysh -c 'show bgp peerhash' | grep 2001:cafe:1ead:4::4 root@doca-hbn-service-bf3-s06-1-ipmi:/var/log/hbn/frr# Upon looking at the code the peer_xfer_conn function can fail and the bgp_establish code will then return before adding the peer back to the peerhash. This is only part of the failure. The peer also appears to be in a state where it is no longer initiating connection attempts but that will be another committed fix when we figure that one out. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-08-31 11:04:44 -04:00
Donatas Abraitis	bc81691247	Revert "bgpd: Add peers back to peer hash when peer_xfer_conn fails" peer is NULL, but we pass it to hash_get(). This reverts commit `6f8c927b03`.	2023-08-31 17:33:57 +03:00
Jafar Al-Gharaibeh	885146ea9c	Merge pull request #14301 from donaldsharp/bgp_lost_hash bgpd: Add peers back to peer hash when peer_xfer_conn fails	2023-08-30 20:11:46 -05:00
Donatas Abraitis	e89fd723ee	Merge pull request #14118 from GaladrielZhao/master bgpd: Convert from struct bgp_node to struct bgp_dest	2023-08-30 17:43:29 +03:00
Donald Sharp	6f8c927b03	bgpd: Add peers back to peer hash when peer_xfer_conn fails It was noticed that occasionally peering failed in a testbed. Upon investigation it was found that the peer was not in the peer hash and we saw these failure messages: Aug 25 21:31:15 doca-hbn-service-bf3-s06-1-ipmi bgpd[3048]: %NOTIFICATION: sent to neighbor 2001:cafe:1ead:4::4 4/0 (Hold Timer Expired) 0 bytes Aug 25 21:31:22 doca-hbn-service-bf3-s06-1-ipmi bgpd[3048]: [EC 100663299] Can't get remote address and port: Transport endpoint is not connected Aug 25 21:31:22 doca-hbn-service-bf3-s06-1-ipmi bgpd[3048]: [EC 100663299] %bgp_getsockname() failed for peer 2001:cafe:1ead:4::4 fd 27 (from_peer fd -1) Aug 25 21:31:22 doca-hbn-service-bf3-s06-1-ipmi bgpd[3048]: [EC 33554464] %Neighbor failed in xfer_conn root@doca-hbn-service-bf3-s06-1-ipmi:/var/log/hbn/frr# vtysh -c 'show bgp peerhash' | grep 2001:cafe:1ead:4::4 root@doca-hbn-service-bf3-s06-1-ipmi:/var/log/hbn/frr# Upon looking at the code the peer_xfer_conn function can fail and the bgp_establish code will then return before adding the peer back to the peerhash. This is only part of the failure. The peer also appears to be in a state where it is no longer initiating connection attempts but that will be another committed fix when we figure that one out. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-08-30 07:31:17 -04:00
Donald Sharp	5160672d99	bgpd: Prevent use after free When bgp_stop finishes and it deletes the peer it is sending back a return code stating that the peer was deleted, but the code was operating like it was not deleted and continued to access the data structure. Fix. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-08-25 10:43:56 -04:00
Donald Sharp	d4a9b103b7	bgpd: bgp_event_update switch to a switch The return code from a event handling perspective is an enum. Let's intentionally make it a switch so that all cases are ensured to be covered now and in the future. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-08-25 10:28:02 -04:00
Donald Sharp	8dd97a7404	bgpd: bgp_event_update mixes enum's with a non-enum Straighten out the code to not mix the two. Especially since bgp was assigning non enum values to the enum. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-08-25 10:03:14 -04:00
Yuqing Zhao	6e7f305e54	bgpd: Convert from struct bgp_node to struct bgp_dest This is based on @donaldsharp's work The current code base is the struct bgp_node data structure. The problem with this is that it creates a bunch of extra data per route_node. The table structure generates ‘holder’ nodes that are never going to receive bgp routes, and now the memory of those nodes is allocated as if they are a full bgp_node. After splitting up the bgp_node into bgp_dest and route_node, the memory of ‘holder’ node which does not have any bgp data will be allocated as the route_node, not the bgp_node, and the memory usage is reduced. The memory usage of BGP node will be reduced from 200B to 96B. The total memory usage optimization of this part is ~16.00%. Signed-off-by: Donald Sharp <sharpd@nvidia.com> Signed-off-by: Yuqing Zhao <xiaopanghu99@163.com>	2023-08-22 09:35:46 +08:00
Donald Sharp	419c5b4ef0	bgpd: Cleanup bgp_start declarations Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-08-18 09:29:04 -04:00
Donald Sharp	26ad36e097	bgpd: Convert FSM to use `struct peer_connection` The BGP FSM was using the peer as the unit of work but the FSM is connection focused. So let's switch it over to using that. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-08-18 09:29:04 -04:00
Donald Sharp	3e5a31b24e	bgpd: Convert `struct peer_connection` to dynamically allocated As part of the conversion to a `struct peer_connection` it will be desirable to have 2 pointers one for when we open a connection and one for when we receive a connection. Start this actual conversion over to this in `struct peer`. If this sounds confusing take a look at the bgp state machine for connections and how it resolves the processing of this router opening -vs- this router receiving an open. At some point in time the state machine decides that we are keeping one of the two connections. Future commits will allow us to untangle the peer/doppelganger duality with this abstraction. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-08-18 09:29:04 -04:00
Donald Sharp	5d52756735	bgpd: Move t_process_packet and t_process_packet_error to connection The t_process_packet thread events should be managed by the connection. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-08-18 09:29:04 -04:00
Donald Sharp	e20c23fa5b	bgpd: Move status and ostatus to `struct peer_connection` The status and ostatus are a function of the `struct peer_connection` move it into that data structure. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-08-18 09:29:04 -04:00
Donald Sharp	71d72c4998	bgpd: READ and WRITE flags are a part of the connection Move PEER_THREAD_WRITES_ON and PEER_THREAD_READS_ON to be a part of the `struct peer_connection` since this is a connection oriented bit of data. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-08-18 09:29:04 -04:00
Donald Sharp	c528b3b153	bgpd: Move t_write and t_read into `struct peer_connection` Move the peer->t_write and peer->t_read into `struct peer_connection` as these are properties of the connection. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-08-18 09:29:04 -04:00
Donald Sharp	ccb51e8266	bgpd: Convert bgp_io.c to take `struct peer_connection` bgp_io.c is clearly connection oriented so let's convert it over to using `struct peer_connection` Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-08-18 09:29:04 -04:00
Donald Sharp	1f32eb30d9	bgpd: Start abstraction of `struct peer_connection` BGP tracks connections based upon the peer. But the problem with this is that the doppelganger structure for it is being created. This has introduced a bunch of fragility in that the peer exists independently of the connections to it. The whole point of the doppelganger structure was to allow BGP to both accept and initiate tcp connections and then when we get one to a `good` state we collapse into the appropriate one. The problem with this is that having 2 peer structures for this creates a situation where we have to make sure we are configuring the `right` one and also make sure that we collapse the two independent peer structures into 1 acting peer. This makes no sense. Let's abstract out the peer into having 2 connections, one for incoming connections and one for outgoing connections; then we can easily collapse down without having to do crazy stuff. In addition, people adding new features don't have to go touch a million places in the code. This is the start of this abstraction. In this commit we'll just pull out the fd and input/output buffers into a connection data structure. Future commits will abstract further. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-08-18 09:29:04 -04:00
Donald Sharp	acf4defcd8	bgpd: Remove peer->obuf_work This is never used. Free up another 65k of stream data never used per peer. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-07-21 12:30:20 -04:00
Donatas Abraitis	c76f6146ab	bgpd: Deprecate Prestandard Outbound Route Filtering capability https://www.rfc-editor.org/rfc/rfc8810.html Not relevant anymore. Use RFC'd version of ORF. Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2023-07-07 23:41:43 +03:00
Donatas Abraitis	58a92cb810	bgpd: Use enum bgp_fsm_state_progress for bgp_stop() ``` bgpd/bgp_fsm.c:1360:29: warning: conflicting types for ‘bgp_stop’ due to enum/integer mismatch; have ‘enum bgp_fsm_state_progress(struct peer *)’ [-Wenum-int-mismatch] 1360 | enum bgp_fsm_state_progress bgp_stop(struct peer *peer) | ^~~~~~~~ In file included from bgpd/bgp_fsm.c:29: ./bgpd/bgp_fsm.h:111:12: note: previous declaration of ‘bgp_stop’ with type ‘int(struct peer *)’ 111 | extern int bgp_stop(struct peer *peer); | ^~~~~~~~ ``` Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>	2023-06-13 16:01:40 +03:00
Donald Sharp	907234817c	bgpd: Give more data when state machine fails to change state When a state machine transition fails, bgpd would output data about what happened, but not necessarily give the reason why. Add that data to the output. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-06-02 11:02:54 -04:00
Donald Sharp	24a58196dd	*: Convert event.h to frrevent.h We should probably prevent any type of namespace collision with something else. Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-03-24 08:32:17 -04:00
Donald Sharp	e16d030c65	*: Convert THREAD_XXX macros to EVENT_XXX macros Signed-off-by: Donald Sharp <sharpd@nvidia.com>	2023-03-24 08:32:17 -04:00


347 Commits