The `show zebra dplane providers` command was omitting
the input and output queues of the dplane itself.
Include them so that this information is visible as well.
New output:
```
r1# show zebra dplane providers
dataplane Incoming Queue from Zebra: 100
Zebra dataplane providers:
Kernel (1): in: 6, q: 0, q_max: 3, out: 6, q: 14, q_max: 3
dplane_fpm_nl (2): in: 6, q: 10, q_max: 3, out: 6, q: 0, q_max: 3
dataplane Outgoing Queue to Zebra: 43
r1#
```
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The dplane providers have a concept of input queues
and output queues. These queues are chained together
during normal operation. The code in zebra also has
a feedback mechanism where the MetaQ will not run when
the first input queue is backed up. Having the dplane_fpm_nl
code grab all contexts when it is backed up prevents
this system from behaving appropriately.
Modify the code to not add to the dplane_fpm_nl's internal
queue when it is already full. This will allow the backpressure
to work appropriately in zebra proper.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Currently when the dplane_thread_loop is run, it moves contexts
from the dg_update_list and puts the contexts on the input queue
of the first provider. This provider is given a chance to run
and then the items on the output queue are pulled off and placed
on the input queue of the next provider. Rinse/Repeat down through
the entire list of providers. Now imagine that we have a list
of multiple providers and the last provider is getting backed up.
Contexts will end up sticking in the input Queue of the `slow`
provider. This can grow without bounds. This is a real problem
when an interface is flapping and an
upper level protocol is sending a continuous stream of route
updates to reflect the change in ecmp. You can end up with
a very large backlog of contexts. This is bad because
zebra can easily grow to a very large memory size and on
restricted systems you can run out of memory. Fortunately
for us, the MetaQ already participates with this process
by not doing more route processing until the dg_update_list
goes below the working limit of dg_updates_per_cycle. Thus,
if FRR modifies this loop to stop moving contexts onto the
input queue of the next provider once either that provider's
input queue or its output queue has reached this limit,
FRR will naturally handle backpressure for the dplane
context system and memory will not grow out of control.
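
The rule described above can be sketched in C as follows. This is only
an illustration, not the zebra code: `struct provider`,
`contexts_to_move()` and `feed_provider()` are hypothetical names, the
queues are modelled as plain counters, and capping the batch at the
remaining room on the input queue is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a dplane provider; only counters are modelled. */
struct provider {
	size_t in_q;  /* contexts waiting on the input queue */
	size_t out_q; /* contexts waiting on the output queue */
};

/*
 * Decide how many contexts may be moved onto the next provider's input
 * queue. Nothing is moved once either of its queues has reached the limit.
 * Capping at the remaining room is an assumption of this sketch.
 */
static size_t contexts_to_move(const struct provider *next, size_t available,
			       size_t limit)
{
	size_t room;

	if (next->in_q >= limit || next->out_q >= limit)
		return 0;

	room = limit - next->in_q;
	return available < room ? available : room;
}

/* Move contexts from an upstream backlog counter into the next provider. */
static size_t feed_provider(size_t *upstream, struct provider *next,
			    size_t limit)
{
	size_t n = contexts_to_move(next, *upstream, limit);

	assert(n <= *upstream);
	*upstream -= n;
	next->in_q += n;
	return n;
}
```

Contexts that are not moved stay upstream, so the backlog ends up on the
first queue, which is the one the MetaQ already watches.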
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The ctx queue data structures already have a counter
associated with them. Let's just use those counters instead.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
```
ton# sh ip bgp peer-group
BGP peer-group pg-a
Peer-group type is auto
Configured address-families: IPv4 Unicast;
BGP peer-group pg-e, remote AS 0
Peer-group type is external
Configured address-families: IPv4 Unicast;
BGP peer-group pg-i, remote AS 65001
Peer-group type is internal
Configured address-families: IPv4 Unicast;
ton#
```
`auto` should be handled accordingly.
Fixes: 0dfe25697f ("bgpd: Implement neighbor X remote-as auto")
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
In the near future, some daemons may only register SIDs. This may be
the case for the pathd daemon when creating SRv6 binding SIDs.
When a locator is deleted at the zebra level, the daemon should have
an easy way to find the SIDs to unregister.
This commit proposes to add the locator name to the SRV6_SID_NOTIFY
message whenever possible. The only case where the locator is not
present is an allocation failure. In all other places, the notify API
at the protocol level takes the locator name as an extra parameter.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
Zebra sends a `SRV6_SID_NOTIFY` notification to inform clients about the
result of a SID alloc/release operation. This commit adds a handler to
process a `SRV6_SID_NOTIFY` notification received from zebra.
If the notification indicates that a SID allocation operation was
successful, then it stores the allocated SID in the SRv6 database,
installs the SID into the RIB, and advertises the SID to the other BGP
routers.
If the notification indicates that an operation has failed, it logs the
error.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
Make the `sid_register()` function non-static to allow other BGP modules
(e.g. bgp_zebra.c) to register SIDs.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
Currently, BGP allocates SIDs without interacting with Zebra.
Recently, the SRv6 implementation has been improved. Now, the daemons
need to interact with Zebra through ZAPI to obtain and release SIDs.
This commit extends BGP to request SIDs from Zebra instead of allocating
the SIDs on its own.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
When SRv6 VPN is unconfigured in BGP, BGP needs to interact with the SID Manager to
release the SID and make it available to other daemons.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
This commit extends BGP to process locator information received from
SRv6 Manager (zebra) and save the locator info in the SRv6 database.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
Currently, when SRv6 is enabled in BGP, BGP requests a locator chunk
from Zebra. Zebra assigns a locator chunk to BGP, and then BGP can
allocate SIDs from the locator chunk.
Recently, the implementation of SRv6 in Zebra has been improved, and a
new API has been introduced for obtaining/releasing the SIDs.
Now, the daemons no longer need to request a chunk.
Instead, the daemons interact with Zebra to obtain information about the
locator and subsequently to allocate/release the SIDs.
This commit extends BGP to use the new SRv6 API. In particular, it
removes the chunk throughout the BGP code and modifies BGP to
request/save/advertise the locator instead of the chunk.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
Add an API to request information from the SRv6 SID Manager (zebra)
regarding a specific SRv6 locator.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
Without this patch:
```
r1# sh ip bgp vrf CUSTOMER-A
BGP table version is 1, local router ID is 20.20.20.0, vrf id 4
Default local pref 100, local AS 65000
Status codes: s suppressed, d damped, h history, u unsorted, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
192.168.2.0/24 192.168.179.5@0<
0 0 479 ?
Displayed 1 routes and 1 total paths
r1#
```
This is because the route is imported and its next-hop is in the default VRF, so we should
evaluate the ultimate path info, not the current path info.
After:
```
r1# sh ip bgp vrf CUSTOMER-A
BGP table version is 1, local router ID is 20.20.20.0, vrf id 4
Default local pref 100, local AS 65000
Status codes: s suppressed, d damped, h history, u unsorted, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*> 192.168.2.0/24 192.168.179.5@0<
0 0 479 ?
Displayed 1 routes and 1 total paths
r1#
```
In both cases the next-hop in the nexthop cache table is valid:
```
r1# sh ip bgp nexthop
Current BGP nexthop cache:
192.168.179.5 valid [IGP metric 0], #paths 2, peer 192.168.179.5
Resolved prefix 192.168.179.0/24
if r1-eth0
Last update: Thu Sep 5 11:24:37 2024
r1#
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
The "if_is_vrf" check is unnecessary because it’s already handled by
"if_get_vrf_loopback". Additionally, it ignores the default loopback and
could introduce potential bugs.
Fixes: 8b81f32e97 ("bgpd: fix label lost when vrf loopback comes back")
Signed-off-by: Loïc Sang <loic.sang@6wind.com>
isisd crashes when reading an ASLA sub-TLV with an Application
Identifier Bit Mask length greater than 1 octet.
Set a limit of 8 bytes in accordance with RFC9479 and check that the
received value does not exceed the limit.
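
A minimal sketch of such a check, assuming the ASLA header layout from
RFC 9479 (one octet with the L-flag and a 7-bit SABM length, one octet
with the R-flag and a 7-bit UDABM length, followed by the two masks).
`ASLA_MASK_MAX_LEN`, `struct asla_masks` and `parse_asla_masks()` are
hypothetical names for illustration, not the isisd ones:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ASLA_MASK_MAX_LEN 8 /* hypothetical name; 8-octet limit from RFC 9479 */

struct asla_masks {
	uint8_t sabm_len;
	uint8_t udabm_len;
	uint8_t sabm[ASLA_MASK_MAX_LEN];
	uint8_t udabm[ASLA_MASK_MAX_LEN];
};

/* Parse the ASLA header and bit masks; reject oversized or truncated input. */
static bool parse_asla_masks(const uint8_t *buf, size_t len,
			     struct asla_masks *out)
{
	uint8_t sabm_len, udabm_len;

	assert(buf != NULL && out != NULL);

	if (len < 2)
		return false;

	/* The top bit of each octet is a flag, the low 7 bits are the length. */
	sabm_len = buf[0] & 0x7f;
	udabm_len = buf[1] & 0x7f;

	/* Check the received lengths against the limit before copying. */
	if (sabm_len > ASLA_MASK_MAX_LEN || udabm_len > ASLA_MASK_MAX_LEN)
		return false;

	/* The masks must also fit in what was actually received. */
	if ((size_t)sabm_len + udabm_len > len - 2)
		return false;

	memset(out, 0, sizeof(*out));
	out->sabm_len = sabm_len;
	out->udabm_len = udabm_len;
	memcpy(out->sabm, buf + 2, sabm_len);
	memcpy(out->udabm, buf + 2 + sabm_len, udabm_len);
	return true;
}
```

Validating the length before the copy is what keeps a malformed sub-TLV
from writing past the fixed-size mask buffers.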
Reported-by: Iggy Frankovic <iggyfran@amazon.com>
Link: https://www.rfc-editor.org/rfc/rfc9479.html#name-application-identifier-bit-
Fixes: 5749ac83a8 ("isisd: add ASLA support")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
If the neighbor is not configured with `neighbor X default-originate route-map ...`,
then this timer is useless.
Change the logic so that the timer is disabled by default, but enabled automatically once
a route-map is configured for the default-originate command.
The automatically assigned timer value is 5 seconds, as before.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Enable/fix using a munet.yaml config file for topology configuration.
This makes test writing easier.
This also uses the standard `frrinit.sh` to launch and teardown
FRR, so we actually test what most users use.
Signed-off-by: Christian Hopps <chopps@labn.net>
Ticket: #4060069
The `show bgp vrf <VRFNAME> <afi> unicast statistics json` output is not
returned in JSON format for a non-existent VRF.
Fix:
The output is now formatted as JSON when the VRF does not exist.
Commands supported:
```
show bgp vrf <VRFNAME> ipv4/ipv6 unicast statistics json
show bgp vrf <VRFNAME> l2vpn evpn statistics json
```
Before Fix:
```
leaf11#
leaf11# show bgp vrf test ipv4 unicast statistics json
View/Vrf test is unknown
leaf11#
leaf11#
leaf11# show bgp vrf test ipv6 unicast statistics json
View/Vrf test is unknown
leaf11#
leaf11#
leaf11# show bgp vrf default1 l2vpn evpn statistics json
View/Vrf default1 is unknown
leaf11#
```
After Fix:
```
leaf11#
leaf11# show bgp vrf test ipv4 unicast statistics json
{
"warning":"View/Vrf is unknown"
}
leaf11#
leaf11#
leaf11# show bgp vrf test ipv6 unicast statistics json
{
"warning":"View/Vrf is unknown"
}
leaf11#
leaf11# show bgp vrf default1 l2vpn evpn statistics json
{
"warning":"View/Vrf is unknown"
}
leaf11#
```
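
The difference between the two outputs above can be sketched as a small
formatting helper. This is an illustration only: `format_unknown_vrf()`
is a hypothetical function writing into a caller buffer, whereas bgpd
uses its own vty and JSON helpers. The message strings are copied from
the outputs shown; the two-space JSON indent is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Write the "unknown VRF" message into buf, as JSON when requested.
 * Returns the snprintf() result. Hypothetical helper, not the bgpd code.
 */
static int format_unknown_vrf(char *buf, size_t size, const char *vrf_name,
			      bool use_json)
{
	assert(buf != NULL && size > 0);

	if (use_json)
		return snprintf(buf, size,
				"{\n  \"warning\":\"View/Vrf is unknown\"\n}\n");

	return snprintf(buf, size, "View/Vrf %s is unknown\n", vrf_name);
}
```

Note that the JSON form in the output above does not carry the VRF name,
so the sketch drops it as well.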
Ticket: #4060069
Signed-off-by: Sindhu Parvathi Gopinathan's <sgopinathan@nvidia.com>
When applying the route-map, we always set rmap_type to know who triggered
this action. PEER_RMAP_TYPE_IMPORT/EXPORT were dead code, and
PEER_RMAP_TYPE_NOSET was not used at all.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>