add some match in route map rules
add some set unset bgp access path list
add another prefix for better tests discrimination
update expected results
Signed-off-by: Francois Dumontet <francois.dumontet@6wind.com>
(cherry picked from commit 0df2e14997)
whith the following config
router bgp 65001
no bgp ebgp-requires-policy
neighbor 192.168.1.2 remote-as external
neighbor 192.168.1.2 timers 3 10
!
address-family ipv4 unicast
neighbor 192.168.1.2 route-map r2 in
exit-address-family
exit
!
bgp as-path access-list FIRST seq 5 permit ^65
bgp as-path access-list SECOND seq 5 permit 2$
!
route-map r2 permit 6
match ip address prefix-list p2
set as-path exclude as-path-access-list SECOND
exit
!
route-map r2 permit 10
match ip address prefix-list p1
set as-path exclude 65003
exit
!
route-map r2 permit 20
match ip address prefix-list p3
set as-path exclude all
exit
making some
no bgp as-path access-list SECOND permit 2$
bgp as-path access-list SECOND permit 3$
clear bgp *
no bgp as-path access-list SECOND permit 3$
bgp as-path access-list SECOND permit 2$
clear bgp *
will induce some crashes
thus we rework the links between aslists and aspath_exclude
Signed-off-by: Francois Dumontet <francois.dumontet@6wind.com>
(cherry picked from commit 094dcc3cda)
We advance data pointer (data++), but we do memcpy() with the length that is 1-byte
over, which is technically heap overflow.
```
==411461==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x50600011da1a at pc 0xc4f45a9786f0 bp 0xffffed1e2740 sp 0xffffed1e1f30
READ of size 4 at 0x50600011da1a thread T0
0 0xc4f45a9786ec in __asan_memcpy (/home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/.libs/bgpd+0x3586ec) (BuildId: e794c5f796eee20c8973d7efb9bf5735e54d44cd)
1 0xc4f45abf15f8 in bgp_dynamic_capability_fqdn /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_packet.c:3457:4
2 0xc4f45abdd408 in bgp_capability_msg_parse /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_packet.c:3911:4
3 0xc4f45abdbeb4 in bgp_capability_receive /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_packet.c:3980:9
4 0xc4f45abde2cc in bgp_process_packet /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_packet.c:4109:11
5 0xc4f45a9b6110 in LLVMFuzzerTestOneInput /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_main.c:582:3
```
Found by fuzzing.
Reported-by: Iggy Frankovic <iggyfran@amazon.com>
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit b685ab5e1b)
If we receive CAPABILITY message (software-version), we SHOULD check if we really
have enough data before doing memcpy(), that could also lead to buffer overflow.
(data + len > end) is not enough, because after this check we do data++ and later
memcpy(..., data, len). That means we have one more byte.
Hit this through fuzzing by
```
0 0xaaaaaadf872c in __asan_memcpy (/home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/.libs/bgpd+0x35872c) (BuildId: 9c6e455d0d9a20f5a4d2f035b443f50add9564d7)
1 0xaaaaab06bfbc in bgp_dynamic_capability_software_version /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_packet.c:3713:3
2 0xaaaaab05ccb4 in bgp_capability_msg_parse /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_packet.c:3839:4
3 0xaaaaab05c074 in bgp_capability_receive /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_packet.c:3980:9
4 0xaaaaab05e48c in bgp_process_packet /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_packet.c:4109:11
5 0xaaaaaae36150 in LLVMFuzzerTestOneInput /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_main.c:582:3
```
Hit this again by Iggy \m/
Reported-by: Iggy Frankovic <iggyfran@amazon.com>
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit 5d7af51c4f)
Before this patch, we always printed the last reason "Waiting for OPEN", but
if it's a manual shutdown, then we technically are not waiting for OPEN.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit c25c7e929d)
In case of EVPN MH bond, a member port going in
protodown state due to external reason (one case being linkflap),
frr updates the state correctly but upon manually
clearing external reason trigger FRR to reinstate
protodown without any reason code.
Fix is to ensure if the protodown reason was external
and new state is to have protodown 'off' then do no reinstate
protodown.
Ticket: #3947432
Testing:
switch:#ip link show swp1
4: swp1: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 9216 qdisc
pfifo_fast master bond1 state DOWN mode DEFAULT group default qlen
1000
link/ether 1c:34:da:2c:aa:68 brd ff:ff:ff:ff:ff:ff protodown on
protodown_reason <linkflap>
switch:#ip link set swp1 protodown off protodown_reason linkflap off
switch:#ip link show swp1
4: swp1: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 9216 qdisc
pfifo_fast master bond1 state DOWN mode DEFAULT group default qlen
1000
link/ether 1c:34:da:2c:aa:68 brd ff:ff:ff:ff:ff:ff
Signed-off-by: Chirag Shah <chirag@nvidia.com>
(cherry picked from commit e4d843b438)
The backup_nexthop entry list has been populated by mistake,
and should not. Fix this by reverting the introduced behavior.
Fixes: 237ebf8d45 ("bgpd: rework bgp_zebra_announce() function, separate nexthop handling")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
(cherry picked from commit d4390fc217)
Problem statement:
==================
When a vrf is deleted from the kernel, before its removed from the FRR
config, zebra gets to delete the the vrf and assiciated state.
It does so by sending a request to delete the l3 vni associated with the
vrf followed by a request to delete the vrf itself.
2023/10/06 06:22:18 ZEBRA: [JAESH-BABB8] Send L3_VNI_DEL 1001 VRF
testVRF1001 to bgp
2023/10/06 06:22:18 ZEBRA: [XC3P3-1DG4D] MESSAGE: ZEBRA_VRF_DELETE
testVRF1001
The zebra client communication is asynchronous and about 1/5 cases the
bgp client process them in a different order.
2023/10/06 06:22:18 BGP: [VP18N-HB5R6] VRF testVRF1001(766) is to be
deleted.
2023/10/06 06:22:18 BGP: [RH4KQ-X3CYT] VRF testVRF1001(766) is to be
disabled.
2023/10/06 06:22:18 BGP: [X8ZE0-9TS5H] VRF disable testVRF1001 id 766
2023/10/06 06:22:18 BGP: [X67AQ-923PR] Deregistering VRF 766
2023/10/06 06:22:18 BGP: [K52W0-YZ4T8] VRF Deletion:
testVRF1001(4294967295)
.. and a bit later :
2023/10/06 06:22:18 BGP: [MRXGD-9MHNX] DJERNAES: process L3VNI 1001 DEL
2023/10/06 06:22:18 BGP: [NCEPE-BKB1G][EC 33554467] Cannot process L3VNI
1001 Del - Could not find BGP instance
When the bgp vrf config is removed later it fails on the sanity check if
l3vni is removed.
if (bgp->l3vni) {
vty_out(vty, "%% Please unconfigure l3vni %u\n",
bgp->l3vni);
return CMD_WARNING_CONFIG_FAILED;
}
Solution:
=========
The solution is to make bgp cleanup the l3vni a bgp instance is going
down.
The fix:
========
The fix is to add a function in bgp_evpn.c to be responsible for for
deleting the local vni, if it should be needed, and call the function
from bgp_instance_down().
Testing:
========
Created a test, which can run in container lab that remove the vrf on
the host before removing the vrf and the bgp config form frr. Running
this test in a loop trigger the problem 18 times of 100 runs. After the
fix it did not fail.
To verify the fix a log message (which is not in the code any longer)
were used when we had a stale l3vni and needed to call
bgp_evpn_local_l3vni_del() to do the cleanup. This were hit 20 times in
100 test runs.
Signed-off-by: Kacper Kwasny <kkwasny@akamai.com>
bgpd: braces {} are not necessary for single line block
Signed-off-by: Kacper Kwasny <kkwasny@akamai.com>
(cherry picked from commit 171d2583d0)
When removing a large number of routes, the linux kernel can take the
cpu for an extended amount of time, leaving a situation where FRR
detects a starvation event.
r1# sharp install routes 10.0.0.0 nexthop 192.168.44.33 1000000 repeat 10
2024-06-14 12:55:49.365 [NTFY] sharpd: [M7Q4P-46WDR] vty[5]@# sharp install routes 10.0.0.0 nexthop 192.168.44.33 1000000 repeat 10
2024-06-14 12:55:49.365 [DEBG] sharpd: [YP4TQ-01TYK] Inserting 1000000 routes
2024-06-14 12:55:57.256 [DEBG] sharpd: [TPHKD-3NYSB] Installed All Items 7.890085
2024-06-14 12:55:57.256 [DEBG] sharpd: [YJ486-NX5R1] Removing 1000000 routes
2024-06-14 12:56:07.802 [WARN] zebra: [QH9AB-Y4XMZ][EC 100663314] STARVATION: task dplane_thread_loop (634377bc8f9e) ran for 7078ms (cpu time 220ms)
2024-06-14 12:56:25.039 [DEBG] sharpd: [WTN53-GK9Y5] Removed all Items 27.783668
2024-06-14 12:56:25.039 [DEBG] sharpd: [YP4TQ-01TYK] Inserting 1000000 routes
2024-06-14 12:56:32.783 [DEBG] sharpd: [TPHKD-3NYSB] Installed All Items 7.743524
2024-06-14 12:56:32.783 [DEBG] sharpd: [YJ486-NX5R1] Removing 1000000 routes
2024-06-14 12:56:41.447 [WARN] zebra: [QH9AB-Y4XMZ][EC 100663314] STARVATION: task dplane_thread_loop (634377bc8f9e) ran for 5175ms (cpu time 179ms)
Let's modify the loop in dplane_thread_loop such that after a provider
has been run, check to see if the event should yield, if so, stop
and reschedule this for the future.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 6faad863f3)
If we receive a malformed packets, this could lead ptr_get_be64() reading
the packets more than needed (heap overflow).
```
Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".
0 0xaaaaaadf86ec in __asan_memcpy (/home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/.libs/bgpd+0x3586ec) (BuildId: 78123cd26ada92b8b59fc0d74d292ba70c9d2e01)
1 0xaaaaaaeb60fc in ptr_get_be64 /home/ubuntu/frr-public/frr_public_private-libfuzzer/./lib/stream.h:377:2
2 0xaaaaaaeb5b90 in ecommunity_linkbw_present /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_ecommunity.c:1895:10
3 0xaaaaaae50f30 in bgp_attr_ext_communities /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_attr.c:2639:8
4 0xaaaaaae49d58 in bgp_attr_parse /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_attr.c:3776:10
5 0xaaaaab063260 in bgp_update_receive /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_packet.c:2371:20
6 0xaaaaab05df00 in bgp_process_packet /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_packet.c:4063:11
7 0xaaaaaae36110 in LLVMFuzzerTestOneInput /home/ubuntu/frr-public/frr_public_private-libfuzzer/bgpd/bgp_main.c:582:3
```
This is triggered when receiving such a packet (malformed):
```
(gdb) bt
0 ecommunity_linkbw_present (ecom=0x555556287990, bw=bw@entry=0x7fffffffda68)
at bgpd/bgp_ecommunity.c:1802
1 0x000055555564fcac in bgp_attr_ext_communities (args=0x7fffffffd840) at bgpd/bgp_attr.c:2619
2 bgp_attr_parse (peer=peer@entry=0x55555628cdf0, attr=attr@entry=0x7fffffffd960, size=size@entry=20,
mp_update=mp_update@entry=0x7fffffffd940, mp_withdraw=mp_withdraw@entry=0x7fffffffd950)
at bgpd/bgp_attr.c:3755
3 0x00005555556aa655 in bgp_update_receive (connection=connection@entry=0x5555562aa030,
peer=peer@entry=0x55555628cdf0, size=size@entry=41) at bgpd/bgp_packet.c:2324
4 0x00005555556afab7 in bgp_process_packet (thread=<optimized out>) at bgpd/bgp_packet.c:3897
5 0x00007ffff7ac2f73 in event_call (thread=thread@entry=0x7fffffffdc70) at lib/event.c:2011
6 0x00007ffff7a6fb90 in frr_run (master=0x555555bc7c90) at lib/libfrr.c:1212
7 0x00005555556457e1 in main (argc=<optimized out>, argv=<optimized out>) at bgpd/bgp_main.c:543
(gdb) p *ecom
$1 = {refcnt = 1, unit_size = 8 '\b', disable_ieee_floating = false, size = 2, val = 0x555556282150 "",
str = 0x5555562a9c30 "UNK:0, 255 UNK:2, 6"}
```
Reported-by: Iggy Frankovic <iggyfran@amazon.com>
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
The isis_tilfa_topo1 topotest is comprehensive and contains a large
amount of reference data. One problem is that, when changes occur,
updating this reference data can be difficult.
To address this problem, this commit introduces a method to
automatically regenerate the reference data by setting the `REGEN_DATA`
environment variable.
Usage:
$ REGEN_DATA=true python3 ./test_isis_tilfa_topo1.py
When `REGEN_DATA` is set, the topotest regenerates reference data
from the current run instead of comparing against existing reference
data. Note that regenerated data must be manually verified for
correctness.
This commit also simplifies the reference data by replacing all diff
files with complete JSON snapshots.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
In this topotest, steps 10-15 were added to test the IS-IS switchover
functionality. In short, two cases were tested: switchover after a
link down event and switchover after a BFD down event. Both cases
were tested in sequence on the same router, rt6. This involved the
following steps:
- Setting the SPF delay timer to 15 seconds
- Shutting down the eth-rt5 interface from the switch side
- Testing the post-switchover RIB and LIB (triggered by the link down
event)
- Testing the post-SPF RIB and LIB
- Bringing the eth-rt5 interface back up
- Configuring a BFD session between rt6 and rt5
- Shutting down the eth-rt5 interface from the switch side once again
- Testing the post-switchover RIB and LIB (triggered by the BFD down
event)
- Testing the post-SPF RIB and LIB
Since the time window to test the post-switchover RIB and LIB was too
narrow (10 seconds), these tests were having sporadic failures.
To resolve this problem, we can simplify the switchover test as follows:
- Setting the SPF delay timer to 60 seconds (not 15)
- Disabling "link-detect" on rt6's eth-rt5 interface
- Shutting down the eth-rt5 interface from the switch side
- On rt6, testing the post-switchover RIB and LIB (triggered by the
BFD down event)
- On rt5, testing the post-switchover RIB and LIB (triggered by the
link down event)
Notice how we can test both post-link-down and post-BFD-down switchover
cases simultaneously by having different "link-detect" configurations
on rt5 and rt6. Additionally, by using a larger SPF delay timer, the
time window to test the post-switchover RIB and LIB is much larger
and less prone to sporadic failures.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
When switching from nexthop to zapi_nexthop, the srte color
is copied. Do the same in reverse.
Fixes: 31f937fb43 ("lib, zebra: Add SR-TE policy infrastructure to zebra")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
- Use `uname -r` to also install specific module versions since
with github runners the running kernel can become out-dated with
the deployed packages.
Signed-off-by: Christian Hopps <chopps@labn.net>
New back-end clients may need to add notification static allocations so
we should have it available for those users, rather than requiring the
new user delve into the mgmtd infra and modify it themselves.
Signed-off-by: Christian Hopps <chopps@labn.net>
Check that "show ip route vrf XXX json" and the JSON at key "XXX" of
"show ip route vrf all json" gives the same output.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
When displaying a route table in JSON, a table JSON object is storing
all the prefix JSON objects containing the prefix information. This
results in excessive memory allocation for JSON objects, potentially
leading to an out-of-memory error on the machine with large routing
tables.
To Fix the memory consumption issue for the "show ip[v6] route [vrf XX]
json" command, display the prefixes one by one and free the memory of
each JSON object after it has been displayed.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
0e2fc3d67f ("vtysh, zebra: Fix malformed json output for multiple vrfs
in command 'show ip route vrf all json'") has been reverted in the
previous commit. Although the fix was correct, it was consuming too muca
memory when displaying large route tables.
A root JSON object was storing all the JSON objects containing the route
tables, each containing their respective prefixes in JSON objects. This
resulted in excessive memory allocation for JSON objects, potentially
leading to an out-of-memory error on the machine.
To Fix the memory consumption issue for the "show ip[v6] route vrf all
json" command, display the tables one by one and free the memory of each
JSON object after it has been displayed.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
This reverts commit 0e2fc3d67f.
This fix was correct but not optimal for memory consumption at scale.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Description:
OSPF6 GR HELPER router should consider as TOPOCHANGE when
it receives lsas with max age and should exit from Helper.
But, it is not exiting from helper because this max age lsa is
considered as duplicated lsa since the sender uses same seq
number for max age lsa from the previous lsa update.
Currently, topo change is not considered for duplicated lsas.
So removed the duplicated check when validating TOPOCHNAGE.
Signed-off-by: Rajesh Girada <rgirada@vmware.com>
Topotest relevant changes:
- add support for `timeout` arg to `cmd_*()`
- handle invalid regexp in CLI commands
- fix long interface name support
Full munet changelog:
munet: 0.14.9: add support for `timeout` arg to `cmd_*()`
munet: 0.14.8: cleanup the cleanup (kill) on launch options
munet: 0.14.7: allow multiple extra commands for shell console init
munet: 0.14.6:
- qemu: gather gcda files where munet can find them
- handle invalid regexp in CLI commands
munet: 0.14.5:
- (podman) pull missing images for containers
- fix long interface name support
- add another router example
munet: 0.14.4: mutest: add color to PASS/FAIL indicators on tty consoles
munet: 0.14.3: Add hostnet node that runs it's commands in the host network namespace.
munet: 0.14.2:
- always fail mutest tests on bad json inputs
- improve ssh-remote for common use-case of connecting to host connected devices
- fix ready-cmd for python v3.11+
munet: 0.14.1: Improved host interface support.
Signed-off-by: Christian Hopps <chopps@labn.net>