If we have this construct:
for (pi = bgp_dest_get_bgp_path_info(dest); pi; pi = pi->next) {
...
bgp_process();
}
This can induce an infinite loop. This happens because bgp_process
will move the unsorted items to the top of the list for handling,
as such it is necessary to hold the next pointer to the side
to actually look at each possible bgp_path_info.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Remove use of `ip multicast rpf-lookup-mode` from unrelated tests.
Looks like this test was just unlucky enough to pick that command as an
example for use here. Just changed it to something less likely to be
removed in the future.
Update route table output to include AFI SAFI output.
Signed-off-by: Nathan Bahr <nbahr@atcorp.com>
Add `mrib` flag to existing "show ip route" commands which then use
the multicast safi rather than the unicast safi. Updated the vty output
to include the AFI and SAFI string when printing the table.
Deprecate `show ip rpf` command, aliased to `show ip route mrib`.
Removed `show ip rpf A.B.C.D`.
Signed-off-by: Nathan Bahr <nbahr@atcorp.com>
Multicast mode belongs in PIM, so removing it completely from zebra.
Modified `show (ip|ipv6) rpf ADDRESS` to always lookup from SAFI_MULTICAST.
This means this command is now specific to the multicast table and does
not necessarily reflect the PIM RPF lookup, but that should be implemented
in PIM instead.
Signed-off-by: Nathan Bahr <nbahr@atcorp.com>
Modified ZEBRA_NEXTHOP_LOOKUP_MRIB to include the SAFI from which to do the lookup.
This generalizes the API away from MRIB specifically and allows the user to decide how it should do lookups.
Rename ZEBRA_NEXTHOP_LOOKUP_MRIB to ZEBRA_NEXTHOP_LOOKUP now that it is more generalized.
This change is in preperation to remove multicast lookup mode completely from zebra.
Signed-off-by: Nathan Bahr <nbahr@atcorp.com>
If you have a bestpath list that looks something like this:
<local evpn mac route>
<learned from peer out swp60>
<learned from peer out swp57>
And a network event happens that causes the peer out swp60
to not be in an established state, yet we still have the
path_info for the destination for swp60, bestpath
will currently end up with this order:
<learned from peer out swp60>
<local evpn mac route>
<learned from peer out swp57>
This causes the local evpn mac route to be deleted in zebra( Wrong! ).
This is happening because swp60 is skipped in bestpath calculation and
not considered to be a path yet it stays at the front of the list.
Modify bestpath calculation such that when pulling the unsorted_list
together to pull path info's into that list when they are also
not in a established state.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Instead of starting with a fairly high value of retry, let's try with a lower
and increase with a backoff to reach what was a default value (120s).
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Reorganize the MSDP initialization code and configuration writing code
to its appropriated place.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
RFC 4271 recommends this 120 seconds, but most of the implementations use 60.
For a datacenter profile we set it to 10 seconds.
Let's roll with 30 and increase up to the maximum 120.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
When running the bgp_bmp_2 vrf test, peer vrf up/down events from the pre
and post policy are received with a wrong peer distinguisher value.
> {"peer_type": "route distinguisher instance", "policy": "pre-policy",
> "ipv6": true, "peer_ip": "192:168::2", "peer_distinguisher": "0:0",
> "peer_asn": 65502, "peer_bgp_id": "192.168.0.2", "timestamp":
> "2024-10-16 21:59:53.111962", "bmp_log_type": "peer up", "local_ip":
> "192:168::1", "local_port": 179, "remote_port": 50836, "seq": 5}
RFC7854 mentions in 4.2 that if the peer is a "RD Instance Peer", it is
set to the route distinguisher of the particular instance the peer
belongs to.
Fix this by modifying the BMP client, update the peer distinguisher
value by filling the peer distinguisher in the bmp_peerstate function.
> {"peer_type": "route distinguisher instance", "policy": "pre-policy",
> "ipv6": true, "peer_ip": "192:168::2", "peer_distinguisher": "444:1",
> "peer_asn": 65502, "peer_bgp_id": "192.168.0.2", "timestamp":
> "2024-10-16 21:59:53.111962", "bmp_log_type": "peer up", "local_ip":
> "192:168::1", "local_port": 179, "remote_port": 50836, "seq": 5}
Add a test to check that peer_distinguisher value is not 0:0 when an
RD instance is set.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When running the bgp_bmp_2 vrf test, peer up/down events from the pre
and post policy are received with a wrong peer type value
> {"peer_type": "global instance", "policy": "pre-policy", "ipv6": false,
> "peer_ip": "192.168.0.2", "peer_distinguisher": "0:0", "peer_asn": 65502,
> "peer_bgp_id": "192.168.0.2", "timestamp": "2024-10-16 21:59:53.111962",
> "bmp_log_type": "peer up", "local_ip": "192.168.0.1", "local_port": 179,
> "remote_port": 50710, "seq": 4}
RFC7854 defines RD instance peer type, and later in 4.2 requests that
the peer distinguisher value be set to non zero value when the peer type
is not global. This is the case for peer vrf instances.
Fix this by modifying the BMP client, update the peer type
value by updating the peer type value when sending peer up/down messages.
Add a check in the bgp_bmp_2 test to ensure that peer type is correctly
set.
> {"peer_type": "route distinguisher instance", "policy": "pre-policy",
> "ipv6": true, "peer_ip": "192:168::2", "peer_distinguisher": "0:0",
> "peer_asn": 65502, "peer_bgp_id": "192.168.0.2", "timestamp":
> "2024-10-16 21:59:53.111962", "bmp_log_type": "peer up", "local_ip":
> "192:168::1", "local_port": 179, "remote_port": 50836, "seq": 5}
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When running the bgp_bmp_2 vrf test, peer route messages from the pre
and post policy are received with a wrong peer distinguisher value.
> {"peer_type": "route distinguisher instance", "policy": "pre-policy", "ipv6": false,
> "peer_ip": "192.168.0.2", "peer_distinguisher": "0:0", "peer_asn": 65502,
> "peer_bgp_id": "192.168.0.2", "timestamp": "2024-10-31 08:19:58.111963",
> "bmp_log_type": "update", "origin": "IGP", "as_path": "65501 65502",
> "bgp_nexthop": "192.168.0.2", "ip_prefix": "172.31.0.15/32", "seq": 15}
RFC7854 mentions in 4.2 that if the peer is a "RD Instance Peer", it is
set to the route distinguisher of the particular instance the peer
belongs to.
Fix this by modifying the BMP client:
- update the peer distinguisher value by unlocking the filling of the peer distinguisher in the function.
This change impacts monitoring messages.
- add the peer distinguisher computation for mirror messages
- modify the bgp_bmp_2 vrf test, update the peer_distinguisher value
> {"peer_type": "route distinguisher instance", "policy": "pre-policy", "ipv6": false,
> "peer_ip": "192.168.0.2", "peer_distinguisher": "444:1", "peer_asn": 65502,
> "peer_bgp_id": "192.168.0.2", "timestamp": "2024-10-31 08:19:58.111963",
> "bmp_log_type": "update", "origin": "IGP", "as_path": "65501 65502",
> "bgp_nexthop": "192.168.0.2", "ip_prefix": "172.31.0.15/32", "seq": 15}
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When running the bgp_bmp_2 vrf test, peer route messages from the pre
and post policy are received with a wrong peer type value
> {"peer_type": "global instance", "policy": "pre-policy", "ipv6": false,
> "peer_ip": "192.168.0.2", "peer_distinguisher": "0:0", "peer_asn": 65502,
> "peer_bgp_id": "192.168.0.2", "timestamp": "2024-10-31 08:19:58.111963",
> "bmp_log_type": "update", "origin": "IGP", "as_path": "65501 65502",
> "bgp_nexthop": "192.168.0.2", "ip_prefix": "172.31.0.15/32", "seq": 15}
In addition to global instance peers, RFC7854 defines RD instance peers.
This value can be used for peers which are on a BGP VRF instance, for
example with an L3VPN setup.
When configuring a BGP VRF instance, the peer type should be seen as an
RD instance peer.
Fix this by modifying the BMP client:
- update the peer type for vrf mirror and monitoring messages
- modify bgp_bmp_2 vrf test to control the peer_type value
> {"peer_type": "route distinguisher instance", "policy": "pre-policy", "ipv6": false,
> "peer_ip": "192.168.0.2", "peer_distinguisher": "0:0", "peer_asn": 65502,
> "peer_bgp_id": "192.168.0.2", "timestamp": "2024-10-31 08:19:58.111963",
> "bmp_log_type": "update", "origin": "IGP", "as_path": "65501 65502",
> "bgp_nexthop": "192.168.0.2", "ip_prefix": "172.31.0.15/32", "seq": 15}
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
All BMP peer up/down messages send a 0:0 peer distinguisher.
This will not be ok when adding RD instance type.
Add code to get the peer distinguisher value.
- modify the API to pass the BGP instance instead of BMP.
- implement error cases with an unknown vrf identifier or a
peer type with local type value.
- handle the error return of the API; consequently, handle
the bmp_peerstate() error return in the calling functions.
There is no functional change, as the peer type value is
either loc-rib or global, both cases are already handled.
The next commit will handle the RD instance case.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
If a given L3VRF instance requests a peer distinguisher
for a peer up/down message, the AFI_UNSPEC afi parameter
will be used; no RD is chosen for this AFI.
Fix this by priorizing the AFI_IP value before the AFI_IP6
value. For instance, a router with both RD set for each
address-family, peer up/down messages will be sent with the
RD set to the one for AFI_IP.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The BMP implementation currently only supports global and
loc-rib instance types. When loc-rib is selected, the
peer_distinguisher is set to the route distinguisher of
the L3VRF where the BGP instance is. This functionality has
not been tested until now, because the peer distinguisher
value had been explicitly omitted in the bmp messages.
Expose the peer distinguisher value in all BMP messages
received. This change requires to modify the expected output
for loc-rib when the BGP instance is in a L3VRF.
The handling of peer distinguisher value for RD instances
will follow in the next commits.
Link: https://www.rfc-editor.org/rfc/rfc7854.html#section-4.2
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When running the bgp_bmp test, peer_up message from the loc-rib
are received with a wrong peer type.
> {"peer_type": "global instance", "policy": "pre-policy", "ipv6": false, "peer_ip": "0.0.0.0",
> "peer_distinguisher": "0:0", "peer_asn": 0, "peer_bgp_id": "0.0.0.0",
> "timestamp": "2024-10-16 21:59:53.111963", "bmp_log_type": "peer up", "local_ip": "0.0.0.0",
> "local_port": 0, "remote_port": 0, "seq": 1}
RFC9069 mentions in 5.1 that peer address must be set to 0.0.0.0,
and the peer_type value must be set to 3. Today, the value set
is 0 (global instance). This is wrong.
Fix this by modifying the BMP client, update the peer type value to
loc-rib on peer up messages.
Modify the current BMP test, by checking the peer up messages for the
0.0.0.0 IP address (which is the value used for loc-rib).
> {"peer_type": "loc-rib instance", "is_filtered": false, "policy": "loc-rib",
> "peer_distinguisher": "0:0", "peer_asn": 65501, "peer_bgp_id": "192.168.0.1",
> "timestamp": "2024-10-16 21:59:53.111963", "bmp_log_type": "peer up", "local_ip": "0.0.0.0",
> "local_port": 0, "remote_port": 0, "seq": 1}
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Although trigger unknown, based on the backtrace in one of the internal
testing, we do see some delete in the Intf where we can have the peer
ifp pointer null and we try to dereference it while trying to install
the route leading to a crash
Skip updating the ifindex in such cases and since the nexthop is not
properly updated, BGP skips sending it to zebra.
BackTrace:
0 0x00007faef05e7ebc in ?? () from /lib/x86_64-linux-gnu/libc.so.6
1 0x00007faef0598fb2 in raise () from /lib/x86_64-linux-gnu/libc.so.6
2 0x00007faef09900dc in core_handler (signo=11, siginfo=0x7ffdde8cb4b0, context=<optimized out>) at lib/sigevent.c:274
3 <signal handler called>
4 0x00005560aad4b7d8 in update_ipv6nh_for_route_install (api_nh=0x7ffdde8cbe94, is_evpn=false, best_pi=0x5560b21187d0, pi=0x5560b21187d0, ifindex=0, nexthop=0x5560b03cb0dc,
nh_bgp=0x5560ace04df0, nh_othervrf=0) at bgpd/bgp_zebra.c:1273
5 bgp_zebra_announce_actual (dest=dest@entry=0x5560afcfa950, info=0x5560b21187d0, bgp=0x5560ace04df0) at bgpd/bgp_zebra.c:1521
6 0x00005560aad4bc85 in bgp_handle_route_announcements_to_zebra (e=<optimized out>) at bgpd/bgp_zebra.c:1896
7 0x00007faef09a1c0d in thread_call (thread=thread@entry=0x7ffdde8d7580) at lib/thread.c:2008
8 0x00007faef095a598 in frr_run (master=0x5560ac7e5190) at lib/libfrr.c:1223
9 0x00005560aac65db6 in main (argc=<optimized out>, argv=<optimized out>) at bgpd/bgp_main.c:557
(gdb) f 4
4 0x00005560aad4b7d8 in update_ipv6nh_for_route_install (api_nh=0x7ffdde8cbe94, is_evpn=false, best_pi=0x5560b21187d0, pi=0x5560b21187d0, ifindex=0, nexthop=0x5560b03cb0dc,
nh_bgp=0x5560ace04df0, nh_othervrf=0) at bgpd/bgp_zebra.c:1273
1273 in bgpd/bgp_zebra.c
(gdb) p pi->peer->ifp
$26 = (struct interface *) 0x0
Ticket :#4203904
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
Without the fix:
```
show ip prefix-list test_1 10.20.30.96/27 first-match
<no result>
show ip prefix-list test_2 192.168.1.2/32 first-match
<no result>
```
With the fix:
```
ip prefix-list test_1 seq 10 permit 10.20.30.64/26 le 27
!
end
donatas# show ip prefix-list test_1 10.20.30.96/27
seq 10 permit 10.20.30.64/26 le 27 (hit count: 1, refcount: 0)
donatas# show ip prefix-list test_1 10.20.30.64/27
seq 10 permit 10.20.30.64/26 le 27 (hit count: 2, refcount: 0)
donatas# show ip prefix-list test_1 10.20.30.64/28
donatas# show ip prefix-list test_1 10.20.30.126/26
seq 10 permit 10.20.30.64/26 le 27 (hit count: 3, refcount: 0)
donatas# show ip prefix-list test_1 10.20.30.126/30
donatas#
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
When a unicast route from source vrf is imported into
evpn as type5 route, prepend the asn of the source vrf into
type5 asn path list.
The condition of evpn type5 prefix path info is present but
source vrf route has been cleared via clear command. In this
case existing path info needs to rewrite the source vrf asn.
prepends asn of the source vrf, but the further condition
for existing path attribute for the same route needs to prepend
source vrf asn.
Ticket: #2943080
Testing:
Before fix:
r4# clear ip bgp vrf overlay prefix 0.0.0.0/0
Route Distinguisher: 128.117.243.209:4
*> [5]:[0]:[0]:[0.0.0.0]
203.0.113.1 0 0 194 ? <--- 64512 is missing
ET:8 RT:64532:104001 Rmac:06:ec:bf:59:e8:93
After fix:
r4# clear ip bgp vrf overlay prefix 0.0.0.0/0
Route's source vrf bgp output containing ASN 64512:
r4# show bgp vrf overlay
BGP table version is 2, local router ID is 128.117.243.209, vrf id 10
Default local pref 100, local AS 64512
...
Notice after clear command source vrf asn 64512 is retained.
Route Distinguisher: 128.117.243.209:4
*> [5]:[0]:[0]:[0.0.0.0]
203.0.113.1 0 0 64512 194 ?
ET:8 RT:64532:104001 Rmac:06:ec:bf:59:e8:93
Signed-off-by: Chirag Shah <chirag@nvidia.com>
Prior to this we were only filtering EVPN routes from the import logic
if they were not route-type 1/2/3/5, which allowed things like RT-5s to
be imported into an L2VNI/MAC-VRF. This adds additional logic to ensure
routes are only imported into EVIs where they make sense.
No more nonsensical route importing!
Ticket: 2848204
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
Effectively When bgp would send a route update down
to zebra and immediately after that a asic update
from the kernel was read. Zebra would choose the
asic update and drop the bgp update leaving us in
a state where bgp was not used as the true source.
Modify the code so that in rib_multipath_nhe
we notice that we have an unprocessed route update
from bgp. And if so just drop this kernel update
about an older version of the route since it is
no longer needed.
Ticket: 2722533
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This commit:
"tools: run `vtysh -b` once for all-startup"
changed things so that `vtysh -b` is run after all daemons have started
up instead of doing it for each daemon as they are started up. This
results in one long `vtysh -b`, which for large configs and many daemons
(in the case I saw, 4 daemons and 30,000 line config) can exceed the 20
second timer watchfrr uses to kill "hung" background tasks.
Shouldn't be any harm to increasing this to 90 seconds to give us some
leeway while still making sure we kill anything truly misbehaving.
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
Looks like the cap setting was added for testing mlag via zebra test cli
to config the mlag role. However it is interfering with the valid state
updates rxed from the MLAG daemon based on timing (in some cases the
MLAG state changes are rxed before the capabilities).
Reference logs -
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
root@TORC11:mgmt:/home/cumulus# grep -ri "my_role\|MlagRole" /var/log/frr/bgpd.log
2021/06/18 13:26:40.380130 PIM: pim_mlag_process_mlagd_state_change: msg dump: my_role: SECONDARY, peer_state: DOWN
2021/06/18 13:26:40.380766 PIM: pim_mlag_process_mlagd_state_change: msg dump: my_role: SECONDARY, peer_state: DOWN
2021/06/18 13:26:41.382258 PIM: pim_mlag_process_mlagd_state_change: msg dump: my_role: SECONDARY, peer_state: RUNNING
2021/06/18 13:26:41.382379 PIM: pim_mlag_process_mlagd_state_change: msg dump: my_role: PRIMARY, peer_state: RUNNING
2021/06/18 13:26:52.386071 ZEBRA: Sending capabilities to client pim: MPLS enabled numMultipath 128 GR disabled MaintMode off MlagRole 0
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Ticket: #2691629
Signed-off-by: Anuradha Karuppiah <anuradhak@nvidia.com>
For those packets that we are not sending 16k of data, but something
far less than 256 bytes. Reduce those stream sizes we allocate
to something much more reasonable.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>