bgp_llgr topotest sometimes fails at step 8:
> topo: STEP 8: 'Check if we can see 172.16.1.2/32 after R4 (dynamic peer) was killed'
R4 neighbor is deleted on R2 because it fails to re-connect:
> 14:33:40.128048 BGP: [HKWM3-ZC5QP] 192.168.3.1 fd -1 went from Established to Clearing
> 14:33:40.128154 BGP: [MJ1TJ-HEE3V] 192.168.3.1(r4) graceful restart timer expired
> 14:33:40.128158 BGP: [ZTA2J-YRKGY] 192.168.3.1(r4) graceful restart stalepath timer stopped
> 14:33:40.128162 BGP: [H917J-25EWN] 192.168.3.1(r4) Long-lived stale timer (IPv4 Unicast) started for 20 sec
> 14:33:40.128168 BGP: [H5X66-NXP9S] 192.168.3.1(r4) Long-lived set stale community (LLGR_STALE) for: 172.16.1.2/32
> 14:33:40.128220 BGP: [H5X66-NXP9S] 192.168.3.1(r4) Long-lived set stale community (LLGR_STALE) for: 192.168.3.0/24
> [...]
> 14:33:41.138869 BGP: [RGGAC-RJ6WG] 192.168.3.1 [Event] Connect failed 111(Connection refused)
> 14:33:41.138906 BGP: [ZWCSR-M7FG9] 192.168.3.1 [FSM] TCP_connection_open_failed (Connect->Active), fd 23
> 14:33:41.138912 BGP: [JA9RP-HSD1K] 192.168.3.1 (dynamic neighbor) deleted (bgp_connect_fail)
> 14:33:41.139126 BGP: [P98A2-2RDFE] 192.168.3.1(r4) graceful restart stalepath timer stopped
af8496af08 ("bgpd: Do not delete BGP dynamic peers if graceful restart
kicks in") forgot to modify bgp_connect_fail()
Do not delete the peer in bgp_connect_fail() if Non-Stop-Forwarding is
in progress.
Fixes: af8496af08 ("bgpd: Do not delete BGP dynamic peers if graceful restart kicks in")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
When trying to connect to a BGP peer that does not respons, the 'show
bgp neighbors' command does not give any indication on the local and
remote addresses used:
> # show bgp neighbors
> BGP neighbor is 192.0.2.150, remote AS 65500, local AS 65500, internal link
> Local Role: undefined
> Remote Role: undefined
> BGP version 4, remote router ID 0.0.0.0, local router ID 192.0.2.1
> BGP state = Connect
> [..]
> Connections established 0; dropped 0
> Last reset 00:00:04, Waiting for peer OPEN (n/a)
> Internal BGP neighbor may be up to 255 hops away.
> BGP Connect Retry Timer in Seconds: 120
> Next connect timer due in 117 seconds
> Read thread: off Write thread: off FD used: 27
The addressing information (address and port) are only available
when TCP session is established, whereas this information is present
at the system level:
> root@ubuntu2204:~# netstat -pan | grep 192.0.2.1
> tcp 0 0 192.0.2.1:179 192.0.2.150:38060 SYN_RECV -
> tcp 0 1 192.0.2.1:46526 192.0.2.150:179 SYN_SENT 488310/bgpd
Add the display for outgoing BGP session, as the information in
the getsockname() API provides information for connected streams.
When getpeername() API does not give any information, use the peer
configuration (destination port is encoded in peer->port).
> # show bgp neighbors
> BGP neighbor is 192.0.2.150, remote AS 65500, local AS 65500, internal link
> Local Role: undefined
> Remote Role: undefined
> BGP version 4, remote router ID 0.0.0.0, local router ID 192.0.2.1
> BGP state = Connect
> [..]
> Connections established 0; dropped 0
> Last reset 00:00:16, Waiting for peer OPEN (n/a)
> Local host: 192.0.2.1, Local port: 46084
> Foreign host: 192.0.2.150, Foreign port: 179
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This is required by the current (latest/-02 draft).
IANA has registered code 8 for "Send Hold Timer Expired" in the "BGP
Error (Notification) Codes" sub-registry under the "Border Gateway
Protocol (BGP) Parameters" registry.
https://datatracker.ietf.org/doc/html/draft-ietf-idr-bgp-sendholdtimer
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
This reverts commit 7bf3c2fb19.
Commit reverted as it introduces a memoery leak during the tests
Signed-off-by: Martin Winter <mwinter@opensourcerouting.org>
When a peer has come up and already started installing
routes into the rib and `suppress-fib-pending` is either
turned on or off. BGP is left with some routes that
may need to be withdrawn from peers and routes that
it does not know the status of. Clear the BGP peers
for the interesting parties and let's let us come
up to speed as needed.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When bgp is shutting down, it calls bgp_fsm_change_status
on everything including a self peer, which goes through
and cleans the tables of the self peer data structures
as if it's a real peer. Add a bit of code to just
not do the work at all. This allows unlocks to flow
a bit further and for the self peer to be deleted
on shutdown.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Also:
- replace all /* fallthrough */ comments with portable fallthrough;
pseudo keyword to accomodate both gcc and clang
- add missing break; statements as required by older versions of gcc
- cleanup some code to remove unnecessary fallthrough
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Not sure why this is needed, because it's reset on bgp_connect_success(),
when the session is UP.
When the session is reset, it clears those variables, and we are not able to
see what remote address was before, etc.
hostLocal, hostRemote reports Unknown for `show bgp neighbor json`.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Modify bgp_fsm_change_status to be connection oriented and
also make the BGP_TIMER_ON and BGP_EVENT_ADD macros connection
oriented as well. Attempt to make peer_xfer_conn a bit more
understandable because, frankly it was/is confusing.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The usage of BGP_EVENT_FLUSH is unnecessarily abstracting the
call into event_cancel_event_ready and in addtion the macro
was not always being used! Just convert to using the actual
event_cancel_event_ready function directly.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
It was noticed that occassionally peering failed in a testbed
upon investigation it was found that the peer was not in the
peer hash and we saw these failure messages:
Aug 25 21:31:15 doca-hbn-service-bf3-s06-1-ipmi bgpd[3048]: %NOTIFICATION: sent to neighbor 2001:cafe:1ead:4::4 4/0 (Hold Timer Expired) 0 bytes
Aug 25 21:31:22 doca-hbn-service-bf3-s06-1-ipmi bgpd[3048]: [EC 100663299] Can't get remote address and port: Transport endpoint is not connected
Aug 25 21:31:22 doca-hbn-service-bf3-s06-1-ipmi bgpd[3048]: [EC 100663299] %bgp_getsockname() failed for peer 2001:cafe:1ead:4::4 fd 27 (from_peer fd -1)
Aug 25 21:31:22 doca-hbn-service-bf3-s06-1-ipmi bgpd[3048]: [EC 33554464] %Neighbor failed in xfer_conn
root@doca-hbn-service-bf3-s06-1-ipmi:/var/log/hbn/frr# vtysh -c 'show bgp peerhash' | grep 2001:cafe:1ead:4::4
root@doca-hbn-service-bf3-s06-1-ipmi:/var/log/hbn/frr#
Upon looking at the code the peer_xfer_conn function can fail
and the bgp_establish code will then return before adding the
peer back to the peerhash.
This is only part of the failure. The peer also appears to
be in a state where it is no longer initiating connection attempts
but that will be another commited fix when we figure that one out.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
It was noticed that occassionally peering failed in a testbed
upon investigation it was found that the peer was not in the
peer hash and we saw these failure messages:
Aug 25 21:31:15 doca-hbn-service-bf3-s06-1-ipmi bgpd[3048]: %NOTIFICATION: sent to neighbor 2001:cafe:1ead:4::4 4/0 (Hold Timer Expired) 0 bytes
Aug 25 21:31:22 doca-hbn-service-bf3-s06-1-ipmi bgpd[3048]: [EC 100663299] Can't get remote address and port: Transport endpoint is not connected
Aug 25 21:31:22 doca-hbn-service-bf3-s06-1-ipmi bgpd[3048]: [EC 100663299] %bgp_getsockname() failed for peer 2001:cafe:1ead:4::4 fd 27 (from_peer fd -1)
Aug 25 21:31:22 doca-hbn-service-bf3-s06-1-ipmi bgpd[3048]: [EC 33554464] %Neighbor failed in xfer_conn
root@doca-hbn-service-bf3-s06-1-ipmi:/var/log/hbn/frr# vtysh -c 'show bgp peerhash' | grep 2001:cafe:1ead:4::4
root@doca-hbn-service-bf3-s06-1-ipmi:/var/log/hbn/frr#
Upon looking at the code the peer_xfer_conn function can fail
and the bgp_establish code will then return before adding the
peer back to the peerhash.
This is only part of the failure. The peer also appears to
be in a state where it is no longer initiating connection attempts
but that will be another commited fix when we figure that one out.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When bgp_stop finishes and it deletes the peer it is sending
back a return code stating that the peer was deleted, but
the code was operating like it was not deleted and continued
to access the data structure. Fix.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The return code from a event handling perspective
is an enum. Let's intentionally make it a switch
so that all cases are ensured to be covered now
and in the future.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Straighten out the code to not mix the two. Especially
since bgp was assigning non enum values to the enum.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This is based on @donaldsharp's work
The current code base is the struct bgp_node data structure.
The problem with this is that it creates a bunch of
extra data per route_node.
The table structure generates ‘holder’ nodes
that are never going to receive bgp routes,
and now the memory of those nodes is allocated
as if they are a full bgp_node.
After splitting up the bgp_node into bgp_dest and route_node,
the memory of ‘holder’ node which does not have any bgp data
will be allocated as the route_node, not the bgp_node,
and the memory usage is reduced.
The memory usage of BGP node will be reduced from 200B to 96B.
The total memory usage optimization of this part is ~16.00%.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Yuqing Zhao <xiaopanghu99@163.com>
The BGP FSM was using the peer as the unit of work
but the FSM is connection focused. So let's switch
it over to using that.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
As part of the conversion to a `struct peer_connection` it will
be desirable to have 2 pointers one for when we open a connection
and one for when we receive a connection. Start this actual
conversion over to this in `struct peer`. If this sounds confusing
take a look at the bgp state machine for connections and how
it resolves the processing of this router opening -vs- this
router receiving an open. At some point in time the state
machine decides that we are keeping one of the two connections.
Future commits will allow us to untangle the peer/doppelganger
duality with this abstraction.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The status and ostatus are a function of the `struct peer_connection`
move it into that data structure.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Move PEER_THREAD_WRITES_ON and PEER_THREAD_READS_ON to
be a part of the `struct peer_connection` since this is
a connection oriented bit of data.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>