The backup_nexthop entry list has been populated by mistake,
and should not. Fix this by reverting the introduced behavior.
Fixes: 237ebf8d45 ("bgpd: rework bgp_zebra_announce() function, separate nexthop handling")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
In case of EVPN MH bond, a member port going in
protodown state due to external reason (one case being linkflap),
frr updates the state correctly but upon manually
clearing external reason trigger FRR to reinstate
protodown without any reason code.
Fix is to ensure if the protodown reason was external
and new state is to have protodown 'off' then do no reinstate
protodown.
Ticket: #3947432
Testing:
switch:#ip link show swp1
4: swp1: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 9216 qdisc
pfifo_fast master bond1 state DOWN mode DEFAULT group default qlen
1000
link/ether 1c:34:da:2c:aa:68 brd ff:ff:ff:ff:ff:ff protodown on
protodown_reason <linkflap>
switch:#ip link set swp1 protodown off protodown_reason linkflap off
switch:#ip link show swp1
4: swp1: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 9216 qdisc
pfifo_fast master bond1 state DOWN mode DEFAULT group default qlen
1000
link/ether 1c:34:da:2c:aa:68 brd ff:ff:ff:ff:ff:ff
Signed-off-by: Chirag Shah <chirag@nvidia.com>
The current OSPF neighbor retransmission operates on a single per-neighbor
periodic timer that sends all LSAs on the list when it expires.
Additionally, since it skips the first retransmission of received LSAs so
that at least the retransmission interval (resulting in a delay of between
the retransmission interval and twice the interval. In environments where
the links are lossy on P2MP networks with "delay-reflood" configured (which
relies on neighbor retransmission in partial meshs), the implementation
is sub-optimal (to say the least).
This commit reimplements OSPF neighbor retransmission as follows:
1. A new data structure making use the application managed
typesafe.h doubly linked list implements an OSPF LSA
list where each node includes a timestamp.
2. The existing neighbor LS retransmission LSDB data structure
is augmented with a pointer to the list node on the LSA
list to faciliate O(1) removal when the LSA is acknowledged.
3. The neighbor LS retransmission timer is set to the expiration
timer of the LSA at the top of the list.
4. When the timer expires, LSAs are retransmitted that within
the window of the current time and a small delta (50 milli-secs
default). The LSAs that are retransmited are given an updated
retransmission time and moved to the end of the LSA list.
5. Configuration is added to set the "retransmission-window" to a
value other than 50 milliseconds.
6. Neighbor and interface LSA retransmission counters are added
to provide insight into the lossiness of the links. However,
these will increment quickly on non-fully meshed P2MP networks
with "delay-reflood" configured.
7. Added a topotest to exercise the implementation on a non-fully
meshed P2MP network with "delay-reflood" configured. The
alternative was to use existing mechanisms to instroduce loss
but these seem less determistic in a topotest.
Signed-off-by: Acee Lindem <acee@lindem.com>
The atomlist test consists of a sequence of (MT) sub-tests, from which
counters are collected and verified. TSAN doesn't know that these
counters are synchronized by way of the sub-test starting and finishing,
so it complains. Just use atomics to get rid of the warning.
(This is solely an issue with the test, not the atomlist code. There
are no warnings from that.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
For BSMs, we should track which of the RP candidates in the BSM message
are actually available, before trying to use them (which also puts them
in NHT for that). This applies for both BSRs as well as BSM receivers.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The upcoming Candidate-RP code needs to send PIM packets that go through
normal unicast routing, without forcing a specific output interface.
Allow passing in NULL ifp to do that.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
TSAN doesn't understand the OS specific "fast" seqlock code. Use the
pthread mutex/condvar based path when TSAN is enabled.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
TSAN warns about leaving the second thread dangling. Doesn't really
matter, but just add a pthread_join to get rid of the warning.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
I lost an underscore somewhere along the way. Which never caused issues
because we don't use that function macro. It is, however, useful for
testing, so fix it.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
seqlock_bump() used to return the value before bumping, but that's
unhelpful if you were to actually need it. I had changed it to return
the value after, but the update to the test got lost at some point.
The return value is not in fact used anywhere in FRR, so while it is
a bug, it has zero impact.
NB: yes, test_seqlock is not run, which sounds wrong. The problem here
is that (a) the test itself uses sleeps and is timing sensitive, which
would raise false positives. And (b), the test is meaningless if
executed once. It needs to be run millions of times under various
conditions (e.g. load) to catch rare races, and it needs to be run on
machines with "odd" memory models (in this case I used BE ppc32 and
ppc64 systems as test platforms.)
Fixes: 6046b690b5 ("lib/seqlock: avoid syscalls in no-waiter cases")
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Put some verbiage in place to warn people that we
are actively discouraging new development that uses
an older data structure.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Before this patch, we always printed the last reason "Waiting for OPEN", but
if it's a manual shutdown, then we technically are not waiting for OPEN.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
In scaled EVPN + ipv4/ipv6 uni route sync to zebra,
some of the ipv4/ipv6 routes skipped reinstallation
due to incorrect local variable's stale value.
Once the local variable value reset in each loop
iteration all skipped routes synced to zebra properly.
Ticket: #3948828
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
Signed-off-by: Chirag Shah <chirag@nvidia.com>
Use the name for when putting out debugs in bgp_zebra.c.
Additionally add an evpn flag for announce_route_actual.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
In the near future, some daemons may only register SIDs. This may be
the case for the pathd daemon when creating SRv6 binding SIDs.
When a locator is getting deleted at ZEBRA level, the daemon may have
an easy way to find out the SIds to unregister to.
This commit proposes to add the locator name to the SID_SRV6_NOTIFY
message whenever possible. Only case when an allocation failure happens,
the locator will not be present. In all other places, the notify API
at procol levels has the locator name extra-parameter.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
Zebra sends a SRV6_SID_NOTIFY notification to inform clients about the
result of a SID alloc/release operation. This commit adds a handler to
process a SRV6_SID_NOTIFY notification received from zebra.
If the notification indicates that a SID allocation operation was
successful, then it stores the allocated SID in the SRv6 database,
installs the SID into the RIB, and advertises the SID to the other IS-IS
routers.
If the notification indicates that an operation has failed, it logs the
error.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
Currently, IS-IS allocates SIDs without interacting with Zebra.
Recently, the SRv6 implementation has been improved. Now, the daemons
need to interact with Zebra through ZAPI to obtain and release SIDs.
This commit extends IS-IS to release SIDs to Zebra when they are no
longer needed.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
Currently, IS-IS allocates SIDs without interacting with Zebra.
Recently, the SRv6 implementation has been improved. Now, the daemons
need to interact with Zebra through ZAPI to obtain and release SIDs.
This commit extends IS-IS to request SIDs from Zebra instead of
allocating the SIDs on its own.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
This commit extends IS-IS to process locator information received from
SRv6 Manager (zebra) and save the locator info in the SRv6 database.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
Currently, when SRv6 is enabled in IS-IS, IS-IS requests a locator chunk
from Zebra. Zebra assigns a locator chunk to IS-IS, and then IS-IS can
allocate SIDs from the locator chunk.
Recently, the implementation of SRv6 in Zebra has been improved, and a
new API has been introduced for obtaining/releasing the SIDs.
Now, the daemons no longer need to request a chunk.
Instead, the daemons interact with Zebra to obtain information about the
locator and subsequently to allocate/release the SIDs.
This commit extends IS-IS to use the new SRv6 API. In particular, it
removes the chunk throughout the IS-IS code and modifies IS-IS to
request/save/advertise the locator instead of the chunk.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
Add an API to request information from the SRv6 SID Manager (zebra)
regarding a specific SRv6 locator.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
The following table is not compliant with caml format when displayed in
json:
> ttable_add_row(
> tt,
> "Vertex|Type|Metric|Next-Hop|Interface|Parent");
>
> ttable_json(tt, "ssdsss");
output observed:
> [..]
> {
> "Vertex":"r1",
> "Type":"",
> "Metric":0,
> "Next-Hop":"",
> "Interface":"",
> "Parent":""
> }
output expected:
> [..]
> {
> "vertex":"r1",
> "type":"",
> "metric":0,
> "nextHop":"",
> "interface":"",
> "parent":""
> }
Override the ttable_json() function with a new function which has an
extra paramter: this parameter will redefine the initial row value for
json:
> ttable_json_with_json_text(tt,
> "vertex|type|metric|nextHop|interface|parent");
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
If using weighted ECMP, the weight for non-recursive next-hop should be
inherited from recursive next-hop.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>