if (BGP_IS_VALID_STATE_FOR_NOTIF(peer->connection->status))
peer_notify_config_change(peer->connection);
else
bgp_session_reset_safe(peer, &nnode);
Let's add a bool return to peer_notify_config_change of whether or
not it should call the peer session reset. This simplifies
the code a bunch.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
We have about a bajillion tests of if we can
notify the peer and then we send a config change
notification. Let's just make a function that
does this.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Slipped somehow that peer-groups with GR is just completely broken, but it was
working before.
Strikes again, that we MUST have more and more topotests.
Fixes: 15403f521a ("bgpd: Streamline GR config, act on change immediately")
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
If the desired state is the same - do nothing instead of resetting once again.
Fixes: bdb5ae8bce ("bgpd: Make suppress-fib-pending clear peering")
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
This call was originally put into place to help reduce
memory problems associated with bgp having a bajillion
events under load and then we would have a bunch of events
ready to be used on the unused list. In the meantime
code was put into place that limited the depth of the
unused list to 10 elements. This call has now become
unnecessary. Let's just remove it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
added bmp bgp peer for vrfs
added peer up vrf in bmp peer up state
added vrf state in bmpbgp
added safe bmp_peer_sendall : bmp_peer_sendall_safe
changed bgp_open_send to call new bgp_open_make
bgp_open_make creates a bgp open packet, now used in bmp for peer up vrf
added hook and call to bgp instance state
vrf peer state is recomputed when interfaces (including vrf itf) go up / down
and when it gets created or removed
Link: e48ba38070
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Signed-off-by: Maxence Younsi <mx.yns@outlook.fr>
Fix printfrr_bp for non initialized peers. For example:
> Sep 26 17:56:44 r1 bgpd[26295]: [GJPH1-W8PZV] Resetting peer (null)(Unknown) due to change in addpath config
Is now:
> Oct 02 14:00:59 r1 bgpd[12795]: [MNE5N-K0G4Z] Resetting peer 2.2.2.2 due to change in addpath config
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Introduce a command to stop bgpd from enabling IPv6 router advertisement
messages sending on interfaces.
Signed-off-by: Mikhail Sokolovskiy <sokolmish@gmail.com>
```
==5445==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000008 (pc 0x7ff4c6bedb19 bp 0x7ffc95f2e400 sp 0x7ffc95f2e3c0 T0)
==5445==The signal is caused by a READ memory access.
==5445==Hint: address points to the zero page.
#0 0x7ff4c6bedb19 in hash_iterate lib/hash.c:246
#1 0x5618f41f5f59 in bgp_evpn_nh_finish bgpd/bgp_evpn_mh.c:4663
#2 0x5618f41dcbe8 in bgp_evpn_vrf_delete bgpd/bgp_evpn.c:7336
#3 0x5618f43bdd35 in bgp_delete bgpd/bgpd.c:4098
#4 0x5618f417ef6e in bgp_exit bgpd/bgp_main.c:206
#5 0x5618f417ef6e in sigint bgpd/bgp_main.c:164
#6 0x7ff4c6cac6c4 in frr_sigevent_process lib/sigevent.c:117
#7 0x7ff4c6cd8258 in event_fetch lib/event.c:1767
#8 0x7ff4c6c0dcbc in frr_run lib/libfrr.c:1230
#9 0x5618f418080d in main bgpd/bgp_main.c:555
#10 0x7ff4c670c249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
#11 0x7ff4c670c304 in __libc_start_main_impl ../csu/libc-start.c:360
#12 0x5618f417ea20 in _start (/usr/lib/frr/bgpd+0x2e4a20)
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV lib/hash.c:246 in hash_iterate
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
While it's okay to allow overwriting the ASN of a bgp vrf/instance
that is either hidden or automatically created, it's dangerous to
allow it on explicitly defined instances. If that were allowed,
a typo entering the bgp config could take down existing peering,
which would be a bad thing.
Signed-off-by: Don Slice <dslice@nvidia.com>
1. bgp coredump is observed when we delete default bgp instance
when we have multi-vrf; and route-leaking is enabled between
default, non-default vrfs.
Removing default router bgp when routes leaked between non-default vrfs.
- Routes are leaked from VRF-A to VRF-B
- VPN table is created with auto RD/RT in default instance.
- Default instance is deleted, we try to unimport the routes from all VRFs
- non-default VRF schedules a work-queue to process deleted routes.
- Meanwhile default bgp instance clears VPN tables and free the route
entries as well, which are still referenced by non-default VRFs which
have imported routes.
- When work queue process starts to delete imported route in VRF-A it cores
as it accesses freed memory.
- Whenever we delete bgp in default vrf, we skip deleting routes in the vpn
table, import and export lists.
- The default hidden bgp instance will not be listed in any of the show
commands.
- Whenever we create new default instance, handle it with AS number change
i.e. old hidden default bgp's AS number is updated and also changing
local_as for all peers.
2. A default instance is created with ASN of the vrf with the import
statement.
This may not be the ASN desired for the default table
- First problem with current behavior.
Define two vrfs with different ASNs and then add import between.
starting without any bgp config (no default instance)
A default instance is created with ASN of the vrf with the import
statement.
This may not be the ASN desired for the default table
- Second related problem. Start with a default instance and a vrf in a
different ASN. Do an import statement in the vrf for a bgp vrf instance
not yet defined and it auto-creates that bgp/vrf instance and it inherits
the ASN of the importing vrf
- Handle bgp instances with different ASNs and handle ASN for auto created
BGP instance
Signed-off-by: Kantesh Mundaragi <kmundaragi@vmware.com>
This is helpful for migrations, etc.
The neighbor is configured with:
```
router bgp 65000
neighbor X local-as 65001 no-prepend replace-as dual-as
```
Neighbor X can use either 65000, or 65001 to peer with.
Closes: https://github.com/FRRouting/frr/issues/13928
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Currently, when SRv6 is enabled in BGP, BGP requests a locator chunk
from Zebra. Zebra assigns a locator chunk to BGP, and then BGP can
allocate SIDs from the locator chunk.
Recently, the implementation of SRv6 in Zebra has been improved, and a
new API has been introduced for obtaining/releasing the SIDs.
Now, the daemons no longer need to request a chunk.
Instead, the daemons interact with Zebra to obtain information about the
locator and subsequently to allocate/release the SIDs.
This commit extends BGP to use the new SRv6 API. In particular, it
removes the chunk throughout the BGP code and modifies BGP to
request/save/advertise the locator instead of the chunk.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
If the neighbor is not configured with `neighbor X default-originate route-map ...`,
then this timer is useless.
Change the logic to be it disabled by default, but enabled automatically once the
route-map is configured for default-originate command.
Automatically assigned timer value is as before, 5 seconds.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
It might cause this use-after-free:
```
==6523==ERROR: AddressSanitizer: heap-use-after-free on address 0x60300058d720 at pc 0x55f3ab62ab1f bp 0x7ffe5b95a0d0 sp 0x7ffe5b95a0c8
READ of size 8 at 0x60300058d720 thread T0
#0 0x55f3ab62ab1e in bgp_gr_update_mode_of_all_peers bgpd/bgp_fsm.c:2729
#1 0x55f3ab62ab1e in bgp_gr_update_all bgpd/bgp_fsm.c:2779
#2 0x55f3ab73557e in bgp_inst_gr_config_vty bgpd/bgp_vty.c:3037
#3 0x55f3ab74db69 in bgp_graceful_restart bgpd/bgp_vty.c:3130
#4 0x7fc5539a9584 in cmd_execute_command_real lib/command.c:1002
#5 0x7fc5539a98a3 in cmd_execute_command lib/command.c:1061
#6 0x7fc5539a9dcf in cmd_execute lib/command.c:1227
#7 0x7fc553ae493f in vty_command lib/vty.c:616
#8 0x7fc553ae4e92 in vty_execute lib/vty.c:1379
#9 0x7fc553aedd34 in vtysh_read lib/vty.c:2374
#10 0x7fc553ad8a64 in event_call lib/event.c:1995
#11 0x7fc553a0c429 in frr_run lib/libfrr.c:1232
#12 0x55f3ab57b78d in main bgpd/bgp_main.c:555
#13 0x7fc55342d249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
#14 0x7fc55342d304 in __libc_start_main_impl ../csu/libc-start.c:360
#15 0x55f3ab5799a0 in _start (/usr/lib/frr/bgpd+0x2e19a0)
0x60300058d720 is located 16 bytes inside of 24-byte region [0x60300058d710,0x60300058d728)
freed by thread T0 here:
#0 0x7fc553eb76a8 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:52
#1 0x7fc553a2b713 in qfree lib/memory.c:130
#2 0x7fc553a0e50d in listnode_free lib/linklist.c:81
#3 0x7fc553a0e50d in list_delete_node lib/linklist.c:379
#4 0x55f3ab7ae353 in peer_delete bgpd/bgpd.c:2796
#5 0x55f3ab7ae91f in bgp_session_reset bgpd/bgpd.c:141
#6 0x55f3ab62ab17 in bgp_gr_update_mode_of_all_peers bgpd/bgp_fsm.c:2752
#7 0x55f3ab62ab17 in bgp_gr_update_all bgpd/bgp_fsm.c:2779
#8 0x55f3ab73557e in bgp_inst_gr_config_vty bgpd/bgp_vty.c:3037
#9 0x55f3ab74db69 in bgp_graceful_restart bgpd/bgp_vty.c:3130
#10 0x7fc5539a9584 in cmd_execute_command_real lib/command.c:1002
#11 0x7fc5539a98a3 in cmd_execute_command lib/command.c:1061
#12 0x7fc5539a9dcf in cmd_execute lib/command.c:1227
#13 0x7fc553ae493f in vty_command lib/vty.c:616
#14 0x7fc553ae4e92 in vty_execute lib/vty.c:1379
#15 0x7fc553aedd34 in vtysh_read lib/vty.c:2374
#16 0x7fc553ad8a64 in event_call lib/event.c:1995
#17 0x7fc553a0c429 in frr_run lib/libfrr.c:1232
#18 0x55f3ab57b78d in main bgpd/bgp_main.c:555
#19 0x7fc55342d249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
previously allocated by thread T0 here:
#0 0x7fc553eb83b7 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77
#1 0x7fc553a2ae20 in qcalloc lib/memory.c:105
#2 0x7fc553a0d056 in listnode_new lib/linklist.c:71
#3 0x7fc553a0d85b in listnode_add_sort lib/linklist.c:197
#4 0x55f3ab7baec4 in peer_create bgpd/bgpd.c:1996
#5 0x55f3ab65be8b in bgp_accept bgpd/bgp_network.c:604
#6 0x7fc553ad8a64 in event_call lib/event.c:1995
#7 0x7fc553a0c429 in frr_run lib/libfrr.c:1232
#8 0x55f3ab57b78d in main bgpd/bgp_main.c:555
#9 0x7fc55342d249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
If we send a notification, there is no point setting the last_reset, because
bgp_notify_send() sets last_reset to PEER_DOWN_NOTIFY_SEND (almost everywhere).
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
```
donatas.net(config-router)# do show ip bgp summary failed
IPv4 Unicast Summary:
BGP router identifier 1.1.1.1, local AS number 65001 VRF default vrf-id 0
BGP table version 0
RIB entries 0, using 0 bytes of memory
Peers 1, using 24 KiB of memory
Neighbor EstdCnt DropCnt ResetTime Reason
127.0.0.1 2 2 00:02:02 Password config change (GoBGP/3.26.0)
Displayed neighbors 1
Total number of neighbors 1
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Coverity complains there is a use after free (1598495 and 1598496)
At this point, most likely dest->refcount cannot go 1 and free up
the dest, but there might be some code path where this can happen.
Fixing this with a simple order change (no harm fix).
Ticket :#4001204
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
In case of imported routes (L3vni/vrf leaks), when a bgp instance is
being deleted, the peer->bgp comparision with the incoming bgp to remove
the dest from the pending fifo is wrong. This can lead to the fifo
having stale entries resulting in crash.
Two changes are done here.
- Instead of pop/push items in list if the struct bgp doesnt match,
simply iterate the list and remove the expected ones.
- Corrected the way bgp is fetched from dest rather than relying on
path_info->peer so that it works for all kinds of routes.
Ticket :#3980988
Signed-off-by: Chirag Shah <chirag @nvidia.com>
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
In some cases (large scale) it's desired to avoid changing configurations, but
let the BGP to automatically handle ASN changes.
`auto` means the peering can be iBGP or eBGP. It will be automatically detected
and adjusted from the OPEN message.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Streamline the BGP graceful-restart configuration at the global and
peer level some more. Similar to many other neighbor capability
parameters like MP and ENHE, reset the session immediately upon a
change to the configuration. This will be more aligned with the
transactional UI model also and will not require a separate 'clear'
command to be executed.
Note: Peer-group graceful-restart configuration is not yet supported.
Signed-off-by: Vivek Venkatraman <vivek@nvidia.com>
Add support for a BGP-wide setting for graceful restart modes and
parameters. This setting will apply to all BGP peers across all BGP
instances, but per-neighbor configuration can override it.
Per-instance configuration is disallowed if the BGP-wide setting
is in effect.
Signed-off-by: Vivek Venkatraman <vivek@nvidia.com>
Before this patch, we always printed the last reason "Waiting for OPEN", but
if it's a manual shutdown, then we technically are not waiting for OPEN.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
If we do:
```
bfd
profile foo
shutdown
```
The session is dropped, but immediately established again because we don't
have a proper check on BFD.
If BFD is administratively shutdown, ignore starting the session.
Fixes: https://github.com/FRRouting/frr/issues/16186
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Configuration:
```
vtysh <<EOF
configure
vrf vrf100
vni 10100
exit-vrf
router bgp 50
address-family l2vpn evpn
advertise-all-vni
exit-address-family
exit
router bgp 100 vrf vrf100
exit
EOF
```
TL;DR; When we configure `advertise-all-vni` (in this case), a new BGP instance
is created with the name vrf100, and ASN 50. Next, when we create
`router bgp 100 vrf vrf100`, we look for the BGP instance with the same name
and we found it, but ASNs are different 50 vs. 100.
Every such a new auto created instance is flagged with BGP_VRF_AUTO.
After the fix:
```
router bgp 50
!
address-family l2vpn evpn
advertise-all-vni
exit-address-family
exit
!
router bgp 100 vrf vrf100
exit
!
end
donatas.net(config)# router bgp 51
BGP is already running; AS is 50
donatas.net(config)# router bgp 50
donatas.net(config-router)# router bgp 101 vrf vrf100
BGP is already running; AS is 100
donatas.net(config)# router bgp 100 vrf vrf100
donatas.net(config-router)#
```
Fixes: https://github.com/FRRouting/frr/issues/16152
Fixes: https://github.com/FRRouting/frr/issues/9537
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
The default DSCP used for BGP connections is CS6. The DSCP value is
not part of the TCP header.
When setting the IP_TOS or IPV6_TCLASS socket options, the argument
is not the 6-bit DSCP value, but an 8-bit value for the former IPv4
Type of Service field or IPv6 Traffic Class field, respectively.
Fixes: 425bd64be8 ("bgpd: Allow bgp to control the DSCP session TOS value")
Signed-off-by: David Ward <david.ward@ll.mit.edu>
Problem statement:
==================
When a vrf is deleted from the kernel, before its removed from the FRR
config, zebra gets to delete the the vrf and assiciated state.
It does so by sending a request to delete the l3 vni associated with the
vrf followed by a request to delete the vrf itself.
2023/10/06 06:22:18 ZEBRA: [JAESH-BABB8] Send L3_VNI_DEL 1001 VRF
testVRF1001 to bgp
2023/10/06 06:22:18 ZEBRA: [XC3P3-1DG4D] MESSAGE: ZEBRA_VRF_DELETE
testVRF1001
The zebra client communication is asynchronous and about 1/5 cases the
bgp client process them in a different order.
2023/10/06 06:22:18 BGP: [VP18N-HB5R6] VRF testVRF1001(766) is to be
deleted.
2023/10/06 06:22:18 BGP: [RH4KQ-X3CYT] VRF testVRF1001(766) is to be
disabled.
2023/10/06 06:22:18 BGP: [X8ZE0-9TS5H] VRF disable testVRF1001 id 766
2023/10/06 06:22:18 BGP: [X67AQ-923PR] Deregistering VRF 766
2023/10/06 06:22:18 BGP: [K52W0-YZ4T8] VRF Deletion:
testVRF1001(4294967295)
.. and a bit later :
2023/10/06 06:22:18 BGP: [MRXGD-9MHNX] DJERNAES: process L3VNI 1001 DEL
2023/10/06 06:22:18 BGP: [NCEPE-BKB1G][EC 33554467] Cannot process L3VNI
1001 Del - Could not find BGP instance
When the bgp vrf config is removed later it fails on the sanity check if
l3vni is removed.
if (bgp->l3vni) {
vty_out(vty, "%% Please unconfigure l3vni %u\n",
bgp->l3vni);
return CMD_WARNING_CONFIG_FAILED;
}
Solution:
=========
The solution is to make bgp cleanup the l3vni a bgp instance is going
down.
The fix:
========
The fix is to add a function in bgp_evpn.c to be responsible for for
deleting the local vni, if it should be needed, and call the function
from bgp_instance_down().
Testing:
========
Created a test, which can run in container lab that remove the vrf on
the host before removing the vrf and the bgp config form frr. Running
this test in a loop trigger the problem 18 times of 100 runs. After the
fix it did not fail.
To verify the fix a log message (which is not in the code any longer)
were used when we had a stale l3vni and needed to call
bgp_evpn_local_l3vni_del() to do the cleanup. This were hit 20 times in
100 test runs.
Signed-off-by: Kacper Kwasny <kkwasny@akamai.com>
bgpd: braces {} are not necessary for single line block
Signed-off-by: Kacper Kwasny <kkwasny@akamai.com>
In case when bgp_evpn_free or bgp_delete is called and the announce_list
has few items where vpn/bgp does not match, we add the item back to the
list. Because of this the list count is always > 0 thereby hogging CPU or
infinite loop.
Ticket: #3905624
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>