When the connection goes up, the timer is not stopped and if we have a
subsequent GR event we have an old timer which is not as we expect.
Before:
```
spine1-debian-11# sh ip bgp 192.168.100.1/32
BGP routing table entry for 192.168.100.1/32, version 95
Paths: (1 available, best #1, table default, mark routes to be retained for a longer time. Requires support for Long-lived BGP Graceful Restart)
Not advertised to any peer
65001 47583, (stale)
192.168.0.1 from 192.168.0.1 (100.100.200.100)
Origin incomplete, valid, external, best (First path received)
Community: llgr-stale
Last update: Mon Mar 28 08:27:53 2022
Time until Long-lived stale route deleted: 23 <<<<<<<<<<<<
spine1-debian-11# sh ip bgp 192.168.100.1/32
BGP routing table entry for 192.168.100.1/32, version 103
Paths: (1 available, best #1, table default)
Advertised to non peer-group peers:
192.168.0.1
65001 47583
192.168.0.1 from 192.168.0.1 (100.100.200.100)
Origin incomplete, valid, external, best (First path received)
Last update: Mon Mar 28 08:43:29 2022
spine1-debian-11# sh ip bgp 192.168.100.1/32
BGP routing table entry for 192.168.100.1/32, version 103
Paths: (1 available, best #1, table default, mark routes to be retained for a longer time. Requires support for Long-lived BGP Graceful Restart)
Not advertised to any peer
65001 47583, (stale)
192.168.0.1 from 192.168.0.1 (100.100.200.100)
Origin incomplete, valid, external, best (First path received)
Community: llgr-stale
Last update: Mon Mar 28 08:43:30 2022
Time until Long-lived stale route deleted: 17 <<<<<<<<<<<<<<<
```
After:
```
spine1-debian-11# sh ip bgp 192.168.100.1/32
BGP routing table entry for 192.168.100.1/32, version 79
Paths: (1 available, best #1, table default, mark routes to be retained for a longer time. Requires support for Long-lived BGP Graceful Restart)
Not advertised to any peer
65001 47583, (stale)
192.168.0.1 from 192.168.0.1 (0.0.0.0)
Origin incomplete, valid, external, best (First path received)
Community: llgr-stale
Last update: Mon Mar 28 09:05:18 2022
Time until Long-lived stale route deleted: 24 <<<<<<<<<<<<<<<
spine1-debian-11# sh ip bgp 192.168.100.1/32
BGP routing table entry for 192.168.100.1/32, version 87
Paths: (1 available, best #1, table default)
Advertised to non peer-group peers:
192.168.0.1
65001 47583
192.168.0.1 from 192.168.0.1 (100.100.200.100)
Origin incomplete, valid, external, best (First path received)
Last update: Mon Mar 28 09:05:25 2022
spine1-debian-11# sh ip bgp 192.168.100.1/32
BGP routing table entry for 192.168.100.1/32, version 87
Paths: (1 available, best #1, table default, mark routes to be retained for a longer time. Requires support for Long-lived BGP Graceful Restart)
Not advertised to any peer
65001 47583, (stale)
192.168.0.1 from 192.168.0.1 (100.100.200.100)
Origin incomplete, valid, external, best (First path received)
Community: llgr-stale
Last update: Mon Mar 28 09:05:29 2022
Time until Long-lived stale route deleted: 29 <<<<<<<<<<<<<<
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Used for graceful-restart mostly.
Especially for bgp_show_neighbor_graceful_restart_capability_per_afi_safi()
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
BGP can experience a bunch of errors associated with sockets
being manipulated which would prevent the peer from coming up.
Let's add some additional debug information here so that
our operators can do a bit more for themselves.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When BGP is notified by RIB that peer address is unreachable then BGP session must be brought
down immediately and not wait for the hold-timer expiry. Today single-hop EBGP already behaves
this way but need to change for iBGP and multi-hop EBGP sessions.
Signed-off-by: Prerana G.B <prerana@vmware.com>, Pushpasis Sarkar <spushpasis@vmware.com>
Currently, the following sequence of events between peers could
result in erroneous capability reports on the peer
with enabled dont-capability-negotiate option:
- having some of the capabilities advertised to a bgp neighbor,
- then disabling capability negotiation to that neighbor,
- then resetting connection to it,
- and no capabilities are actually sent to the neighbor,
- but "show bgp neighbors" on the host still displays them
as advertised to the neighbor.
There are two possibilities for establishing a new connection
- the established connection was initiated by us with bgp_start(),
- the connection was initiated on the neighbor side and processed by
us via bgp_accept() in bgp_network.c.
The former case results in "show bgp neighbors" displaying only
"received" in capabilities, as the peer's cap is initiated to zero
in bgp_start().
In the latter case, if bgp_accept() happens before bgp_start()
is called, then new peer capabilities are being transferred
from its previous record before being zeroed in bgp_start().
This results in "show bgp neighbors" still displaying
"advertised and received" in capabilities.
Following the logic of a similar af_cap field clearing,
treated correctly in both cases, we
- reset peer's capability during bgp_stop()
- don't pass it over to a new peer structure in bgp_accept().
This fix prevents transferring of the previous capabilities record
to a new peer instance in arbitrary reconnect scenario.
Signed-off-by: Alexander Skorichenko <askorichenko@netgate.com>
Some BGP updates received by BGP invite local router to
install a route through itself. The system will not do it, and
the route should be considered as not valid at the earliest.
This case is detected on the zebra, and this detection prevents
from trying to install this route to the local system. However,
the nexthop tracking mechanism is called, and acts as if the route
was valid, which is not the case.
By detecting in BGP that use case, we avoid installing the invalid
routes.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Problem: Sometimes the configured Local GR state is not reflected in
show command and peer node. This is causing failures in few of the
BGP-GR topotests.
RCA: This problem is seen when the configuration of local GR state
happens when the BGP session is in OpenSent state and moves to
Established after the configuration is complete.
When the session gets established, we move the GR state value from stub peer
to the config peer. This will result in overriding the GR state to
previous value.
Fix: The local GR state is modified only through CLI configuration and
does not change during BGP FSM transition. In this case it is not necessary
to transfer the GR state value from stub peer to config peer. This way we
can ensure that always the most recent config value is present in peer
datastructure.
Signed-off-by: Prerana-GB <prerana@vmware.com>
New peers should be initialized with a usual max packet size and later
determined on OPEN messages.
Testing with different peers supporting/not supporting extended support.
2021/07/02 13:48:00 BGP: [WEV7K-2GAQ5] u2:s2 send UPDATE len 8991 (max message len: 65535) numpfx 1788
2021/07/02 13:48:03 BGP: [WEV7K-2GAQ5] u3:s3 send UPDATE len 4096 (max message len: 4096) numpfx 809
2021/07/02 13:48:03 BGP: [WEV7K-2GAQ5] u3:s3 send UPDATE len 4096 (max message len: 4096) numpfx 809
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
We are inconsistently using peer_establiahed(peer) with
sometimes using `peer->status == Established`. Just Convert
over to using the function for consistency.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
If we have a situation where BGP is partially reading in a config
file for a neighbor, *and* the neighbor is coming up *and* we
have a doppelganger. There exists a race condition when we transfer
the config from the doppelganger to the config peer that we will
overwrite later config because we are copying the config data
from the doppelganger peer( which was captured at the start of initiation
of the peering ).
From what I can tell the peer->af_flags variable is to hold configuration
flags for the local peer. The doppelganger should never overwrite this.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The new LL code in:
8761cd6ddb
Introduced the idea of the bgp unnumbered peers using interface up/down
events to track the bgp peers nexthop. This code was not properly
working when a connection was received from a peer in some circumstances.
Effectively the connection from a peer was immediately skipping state transitions
and FRR was never properly tracking the peers nexthop. When we receive the
connection attempt, let's track the nexthop now.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Remove old BFD API usage and replace it with the new one.
Highlights:
- More shared code: the daemon gets notified with callbacks instead of
having to roll its own code to find the notified sessions.
- Less code to integrate with BFD.
- Remove hidden commands to configure single / multi hop. Use
protocol data instead.
BGP can determine if a peer is single/multi hop according to the
following criteria:
a. If the IP address is a link-local address (single hop)
b. The network is shared with peer (single hop)
c. BGP is configured for eBGP multi hop / TTL security (multi hop)
- Respect the configuration hierarchy:
a. Peer configuration take precendence over peer-group
configuration.
b. When peer group configuration is removed, reset peer
BFD configurations to defaults (unless peer had specific
configs).
Example:
neighbor foo peer-group
neighbor foo bfd profile X
neighbor 192.168.0.2 peer-group foo
neighbor 192.168.0.2 bfd
! If peer-group is removed the profile configuration gets
! removed from peer 192.168.0.2, but BFD will still enabled
! because of the neighbor specific bfd configuration.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
bgp is currently registering v6 LL as nexthops to be tracked
from zebra. This presents several problems.
a) zebra does not properly track multiple prefixes that match
the same route properly at this point in time.
b) BGP was receiving nexthops that were just incorrect because
of (a).
c) When a nexthop changed that really didn't affect the v6 LL
we were responding incorrectly because of this
Modify the code such that bgp nexthop tracking notices that
we are trying to register a v6 LL. When we do so, shortcut
and watch interface up/down events for this v6 LL and do
the work when an interface goes up / down for this type
of tracking.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
If we are using a nexthop for a MPLS VPN route make sure the
nexthop is over a labeled path. This new check mirrors the one
in validate_paths (where routes are enabled when a nexthop
becomes reachable). The check is introduced to the code path
where routes are added and the nexthop is looked up.
Signed-off-by: Pat Ruddy <pat@voltanet.io>
* Process FIB update in bgp_zebra_route_notify_owner() and call
group_announce_route() if route is installed
* When bgp update is received for a route which is not installed earlier
(flag BGP_NODE_FIB_INSTALLED is not set) and suppress fib is enabled
set the flag BGP_NODE_FIB_INSTALL_PENDING to indicate fib install is
pending for the route. The route will be advertised when zebra send
ZAPI_ROUTE_INSTALLED status.
* The advertisement delay (BGP_DEFAULT_UPDATE_ADVERTISEMENT_TIME)
is added to allow more routes to be sent in single update message.
This is required since zebra sends route notify message for each route.
The delay will be applied to update group timer which advertises
routes to peers.
Signed-off-by: kssoman <somanks@gmail.com>
Replace all lib/thread cancel macros, use thread_cancel()
everywhere. Only the THREAD_OFF macro and thread_cancel() api are
supported. Also adjust thread_cancel_async() to NULL caller's pointer (if
present).
Signed-off-by: Mark Stapp <mjs@voltanet.io>
We currently have a global process queue for handling route
updates in bgp. This is fine, in general, except there are
places and times where we plug the queue for no new work
during certain peer states of bgp update delay. If we
happen to be processing multiple bgp instances on startup
why do we want to stop processing in vrf A when vrf B
is in a bit of a pickle?
Also this separation will allow us to start forward thinking
about how to fully integrate pthreads into route processing
in bgp.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This function is poorly named; it's really used to allow the FSM to
decide the next valid state based on whether a peer has valid /
reachable nexthops as determined by NHT or BFD.
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
The bgpTrapBackwardTransition callback was being called only during
bgp_stop and only under the condition that peer status was Established.
The MIB defines that the event should be generated for every transition
of the BGP FSM from a higher to a lower state.
Signed-off-by: Babis Chalios <mail@bchalios.io>
Issue:
1. Initially BGP start listening to socket.
2. Start timer expires and BGP tries to connect to peer and moved
to Idle->connect (lets say peer datastructre X)
3. Connect for X succeeds and hence moved from idle ->connect with
FD-x.
4. A incoming connection is accepted and a new peer datastructure Y
is created with FD-y moves from idle->Active state.
5. Peer datastercture Y FD-y sends out OPEN and moves to
Active->Opensent state.
6. Peer datastrcture Y FD-y receives OPEN and moved from Opensent->
Openconfirm state.
7. Meanwhile on peer datastrcture X FD-x sends out a OPEN message
and moved from connect->Opensent.
8. For peer datastrcture Y FD-y keep alive is received and it is
moved from OpenConfirm->Established.
9. In this case peer datastructure Y FD-y is a accepted connection
so we try to copy all its parameter to peer datastructure X and
delete Y.
10. During this process TCP connection for the accepted connection
(FD-y) goes down and hence get remote address and port fails.
11. With this failure bgp_stop function for both peer datastrure X
and peer datastructure Y is called.
12. By this time all the parameters include state for datastrcture
for X and Y are exchanged. Peer Y FD-y when it entered this
function had state OpenConfirm still which has been moved to peer
datastrcture X.
13. In bgp_stop it will stop all the timers and take action only if
peer is in established state. Now that peer datastrcture X and Y
are not in established state (in this function) it will simply
close all timers and close the socket and assigns socket for both
the peer datastrcture to -1.
14. Peer datastrcture Y will be deleted as it is a datastrcture created
due to accept of connection where as peer datastrcture X will be held
as it is created with configuration.
15. Now peer datastrcture X now holds a state of OpenConfirm without any
timers running.
16. With this any new incoming connection will never be able to establish
as there is config connection X which is stuck in OpenConfirm.
Fix:
While transferring the peer datastructure Y FD-y (accepted connection)
to the peer datastructure X, if TCP connection for FD-y goes down, then
1. Call fsm event bgp_stop for X (do cleanup with bgp_stop and move the
state to Idle) and
2. Call fsm event bgp_stop for Y (do cleanup with bgp_stop and gets deleted
since it is an accept connection).
Signed-off-by: Sarita Patra <saritap@vmware.com>
Issue:
1. Initially BGP start listening to socket.
2. Start timer expires and BGP tries to connect to peer and moved
to Idle->connect (lets say peer datastructre X)
3. Peer datastrcture Y FD-X receives OPEN and moved from Opensent->
Openconfirm state and start the hold timer.
4. In the OpenConfirm state, the hold timer is stopped. So peer X
waits for Keepalive message from peer. If the Keepalive message
is not received, then it will be in OpenConfirm state for
indefinite time.
5. Due to this it neither close the existing connection nor it will
accept any connection from peer.
Fix:
In the OpenConfirm state, don't stop the hold timer.
1. Upon receipt of a neighbor’s Keepalive, the state is moved to
Established.
2. But If the hold timer expires, a stop event occurs, the state
is moved to Idle.
This is as per RFC.
Signed-off-by: Sarita Patra <saritap@vmware.com>
* Fixed integration in FSM and packet handling.
* Added CLI "show" output, incl. JSON.
* For review and testing only.
Signed-off-by: David Schweizer <dschweizer@opensourcerouting.org>
* Removed old timer thread resets, since this has been taken care of
after execution of the threads by the thread_fetch function in
lib/thread.c for quite some time now.
Signed-off-by: David Schweizer <dschweizer@opensourcerouting.org>
1) When a session comes up for a peer and if the peer has not adverised
the GR capabilities, BGP sends a request to Zebra to clear any
stale routes that might exist from that peer.
2) When OPEN message is received from the peer, clear the previously
advertised GR capability by the peer, if the lastest received
OPEN message does not contain the GR capability.
Signed-off-by: NaveenThanikachalam <nthanikachal@vmware.com>
Remove mid-string line breaks, cf. workflow doc:
.. [#tool_style_conflicts] For example, lines over 80 characters are allowed
for text strings to make it possible to search the code for them: please
see `Linux kernel style (breaking long lines and strings)
<https://www.kernel.org/doc/html/v4.10/process/coding-style.html#breaking-long-lines-and-strings>`_
and `Issue #1794 <https://github.com/FRRouting/frr/issues/1794>`_.
Scripted commit, idempotent to running:
```
python3 tools/stringmangle.py --unwrap `git ls-files | egrep '\.[ch]$'`
```
Signed-off-by: David Lamparter <equinox@diac24.net>
Currently the I/O pthread handles incoming/outgoing data
communication with all peers. There is no attempt at modifying
the hold timers. It's sole goal is to read/write data to appropriate
channels. All this data is handled as *events* on the master pthread
in BGP. The problem is that if the master pthread is extremely busy
then any packet read that would be treated as a keepalive event may
happen after the hold timer pops, due to the way thread events are handled
in lib/thread.c.
In a last gap attempt, if we notice that we have incoming data
to proceses on the input Queue, slightly delay the hold timer.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Some were converted to bool, where true/false status is needed.
Converted to void only those, where the return status was only false or true.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Convert some status defines for the fsm to an enum
so that we cannot mix and match them in the future.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
In PR #6052 which fixes issue #5963 the bgp fsm events
were confused with the bgp fsm status leading
to a bug. Let's start separating those out
so these types of failures cannot just
easily occur.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This scenario has been seen against microtik virtual machine with
bfd enabled. When remote microtik bgp reestablishes the bgp session
after a bgp reset, the bgp establishment comes first, then bfd is
initialising.
The second point is true for microtik, but not for frrouting, as the
frrouting, when receiving bfd down messages, is not at init state.
Actually, bfd state is up, and sees the first bfd down packet from
bfd as an issue. Consequently, the BGP session is cleared.
The fix consists in resetting the BFD session, only if bfd status is
considered as up, once BGP comes up.
That permits to align state machines of both local and remote bfd.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When bgp is updated with local source, the bgp session is reset; bfd
also must be reset. The bgp_stop() handler handles all kind of
unexpected failures, so the placeholder to deregister from bfd should be
ok, providing that when bgp establishes, a similar function in bgp will
recreate bfd context.
Note that the bfd session is not reset on one specific case, where BFD
down event is the last reset. In that case, we must let BFD to monitor
the link.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This scenario has been seen against microtik virtual machine with bfd
enabled. When remote microtik bgp reestablishes the bgp session after a
bgp reset, the bgp establishment comes first, then bfd is initialising.
The second point is true for microtik, but not for frrouting, as the
frrouting, when receiving bfd down messages, is not at init state.
Actually, bfd state is up, and sees the first bfd down packet from bfd
as an issue. Consequently, the BGP session is cleared.
The fix consists in resetting the BFD session, once BGP comes up. That
permits to align state machines of both local and remote bfd.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When bgp is updated with local source, the bgp session is reset; bfd
also must be reset. The bgp_stop() handler handles all kind of
unexpected failures, so the placeholder to deregister from bfd should be
ok, providing that when bgp establishes, a similar function in bgp will
recreate bfd context.
Note that the bfd session is not reset on one specific case, where BFD
down event is the last reset. In that case, we must let BFD to monitor
the link.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
If the peer was shutdown locally, it doesn't show up as admin. shutdown.
Instead it's treated as "Waiting for peer OPEN".
The same applies to when the peer reaches maximum-prefix count.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Current failed reasons for bgp when you have a peer that
is not online yet is `Waiting for NHT`, even if NHT has
succeeded. Add some code to differentiate this.
eva# show bgp ipv4 uni summ failed
BGP router identifier 192.168.201.135, local AS number 3923 vrf-id 0
BGP table version 0
RIB entries 0, using 0 bytes of memory
Peers 2, using 43 KiB of memory
Neighbor EstdCnt DropCnt ResetTime Reason
192.168.44.1 0 0 never Waiting for NHT
192.168.201.139 0 0 never Waiting for Open to Succeed
Total number of neighbors 2
eva#
eva# show bgp nexthop
Current BGP nexthop cache:
192.168.44.1 invalid, peer 192.168.44.1
Must be Connected
Last update: Mon Feb 10 19:05:19 2020
192.168.201.139 valid [IGP metric 0], #paths 0, peer 192.168.201.139
So 192.168.201.139 is a peer for a connected route that has not been
created on .139, while 44.1 nexthop tracking has not succeeded yet.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
If you have enums handled in a switch adding a default case
makes it fun to fix when new stuff is added later. Remove.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
* While the Deferral timer is running, signal route update pending
(ZEBRA_CLIENT_ROUTE_UPDATE_PENDING) from BGPD to Zebra.
* After expiry of the Deferral timer, the deferred routes are processed.
When the deferred route_list becomes empty, End-of-Rib is send to the
peer and route processing complete message (ZEBRA_CLIENT_ROUTE_UPDATE_COMPLETE)
is sent to Zebra. So that Zebra would delete any stale routes still
present in the rib.
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
*After a restarting router comes up and the bgp session is
successfully established with the peer. If the restarting
router doesn’t have any route to send, it send EOR to
the peer immediately before receiving updates from its peers.
*Instead the restarting router should send EOR, if the
selection deferral timer is not running OR count of eor received
and eor required are matches then send EOR.
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
bgp tcp connection.
When the BGP peer is configured between two bgp routes both routers would create
peer structure , when they receive each other’s open message. In this event both
speakers, open duplicate TCP sessions and send OPEN messages on each socket
simultaneously, the BGP Identifier is used to resolve which socket should be closed.
If BGP GR is enabled the old tcp session is dumped and the new session is retained.
So while this transfer of connection is happening, if all the bgp gr config
is not migrated to the new connection, the new bgp gr mode will never get applied.
Fix Summary:
1. Replicate GR configuration from the old session to the new session in bgp_accept().
2. Replicate GR configuration from stub to full-fledged peer in bgp_establish().
3. Disable all NSF flags, clear stale routes (if present), stop restart & stale timers
(if they are running) when the bgp GR mode is changed to “Disabled”.
4. Disable R-bit in cap, if it is not set the received open message.
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
* Changing GR mode on a router needs a session reset from the
SAME router to negotiate new GR capability.
* The present GR implementation needs a session reset after every
new BGP GR mode change.
* When BGP session reset happens due to sending or receiving BGP
notification after changing BGP GR mode, there is no need of
explicit session reset.
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
* Selection Deferral Timer for Graceful Restart.
* Added selection deferral timer handling function.
* Route marking as selection defer when update message is received.
* Staggered processing of routes which are pending best selection.
* Fix for multi-path test case.
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
* Added new show command to show the graceful restart
information for each neighbor.
Cmd: show bgp [<ipv4|ipv6>] neighbors [<A.B.C.D|X:X::X:X|WORD>] graceful-restart
* Changes to show neighbors commands for displaying
graceful restart information.
Cmd :show [ip] bgp [<view|vrf> VIEWVRFNAME] [<ipv4|ipv6>] neighbors [<A.B.C.D|X:X::X:X|
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
* Added FSM for peer and global configuration for graceful restart
* Added debug option BGP_GRACEFUL_RESTART for logs specific to
graceful restart processing
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
We should send a NOTIFICATION message with the Error Code Finite State
Machine Error if we receive NOTIFICATION in OpenSent state
as defined in https://tools.ietf.org/html/rfc4271#section-8.2.2
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Problem: BGP peer pointer is present in keepalive hash table
even when socket has been closed in some race condition.
When keepalive tries to access this peer it asserts.
RCA: Below sequence of events causing assert.
1. Config node peer has went down due to TCP reset
it's FD has been set to -1.
2. Doppelganger peer goes to established state and it has
been added to peer hash table for keepalive when it was
in openconfirm state.
3. Config node parameters including FD are exchanged with
doppelganger. Doppelganger will not have FD -1.
4. Doppelganger will be deleted as part of this it will
remove it from the keepalive peer hash table.
5. While removing from hash table it tries to acquire lock.
6. During this time keepalive thread has the lock and in
a loop trying to send keepalive for peers in hash table.
7. It tries to send keepalive for doppelganger peer with fd
set to -1 and asserts.
Signed-off-by: Santosh P K <sapk@vmware.com>
This change addresses the following:
1) Ensures logs under DEBUG macro checks are categorized
as zlog_debug instead of zlog_info.
2) Error logs are categorized as zlog_err instead of zlog_info.
3) Rephrasing certain logs to make them appear more intuitive.
Signed-off-by: NaveenThanikachalam <nthanikachal@vmware.com>
There was a silly bug introduced when the command to show failed sessions
was added. A missing "," caused the wrong error message to be printed.
Debugging this led down a path that:
- Led to discovering one more error message that needed to be added
- Providing the error code along with the string in the JSON output
to allow programs to key off numbers rather than strings.
- Fixing the missing ","
- Changing the error message to "Waiting for Peer IPv6 LLA" to
make it clear that we're waiting for the link local addr.
Signed-off-by: Dinesh G Dutt <5016467+ddutt@users.noreply.github.com>
frr_with_mutex(...) { ... } locks and automatically unlocks the listed
mutex(es) when the block is exited. This adds a bit of safety against
forgetting the unlock in error paths & co. and makes the code a slight
bit more readable.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
In a data center, having 32-128 peers is not uncommon. In such a situation, to find a
peer that has failed and why is several commands. This hinders both the automatability of
failure detection and the ease/speed with which the reason can be found. To simplify this
process of catching a failure and its cause quicker, this patch does the following:
1. Created a new function, bgp_show_failed_summary to display the
failed summary output for JSON and vty
2. Created a new function to display the reset code/subcode. This is now used in the
failed summary code and in the show neighbors code
3. Added a new variable failedPeers in all the JSON outputs, including the vanilla
"show bgp summary" family. This lists the failed session count.
4. Display peer, dropped count, estd count, uptime and the reason for failure as the
output of "show bgp summary failed" family of commands
5. Added three resset codes for the case where we're waiting for NHT, waiting for peer
IPv6 addr, waiting for VRF to init.
This also counts the case where only one peer has advertised an AFI/SAFI.
The new command has the optional keyword "failed" added to the classical summary command.
The changes affect only one existing output, that of "show [ip] bgp neighbors <nbr>". As
we track the lack of NHT resolution for a peer or the lack of knowing a peer IPv6 addr,
the output of that command will show a "waiting for NHT" etc. as the last reset reason.
This patch includes update to the documentation too.
Signed-off-by: Dinesh G Dutt <5016467+ddutt@users.noreply.github.com>
In the case of IPv4 route exchange using GUA IPv6 peering, the route install
into the FIB involves mapping the immediate next hop to an IPv4 link-local
address and installing neighbor entries for this next hop address. To
accomplish the latter, IPv6 Router Advertisements are exchanged (the next hop
or peer must also have this enabled) and the RAs are dynamically initiated
based on next hop resolution.
However, in the case of a passive connection where the local system has not
initiated anything, no NHT entry is created for the peer, hence RAs were not
getting triggered. Address this by ensuring that a NHT entry is created even
in this situation. This is done at the time the connection becomes established
because the code has other assumptions that a NHT entry will be present only
for the "configured" peer. The API to create the entry ensures there are
no duplicates.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Generally available hook for plugging application-specific
code in for bgp peer change events.
This hook (peer_status_changed) replaces the previous, more
specific 'peer_established' hook with a more general-purpose one.
Also, 'bgp_dump_state' is now registered under this hook.
Signed-off-by: Marton Kun-Szabo <martonk@amazon.com>