Commit Graph

36293 Commits

Author SHA1 Message Date
Mark Stapp
ac2d9bae5c
Merge pull request #16680 from donaldsharp/route_scale_minor_changes
tests: Fix route-scale at higher ecmp
2024-08-29 08:17:34 -04:00
Jafar Al-Gharaibeh
12a3d5a748
Merge pull request #16683 from donaldsharp/test_ospf_netns_vrf_failure
tests: ospf_netns_vrf should give more time for coming up
2024-08-29 01:12:40 -04:00
Jafar Al-Gharaibeh
648566c6fb
Merge pull request #16682 from donaldsharp/bgp_suppress_test
tests: Ensure bgp suppress fib has a chance to transmit data to peer
2024-08-29 01:12:17 -04:00
Jafar Al-Gharaibeh
ffaa365cc4
Merge pull request #16681 from donaldsharp/zebra_re_after_rn
zebra: Move prefix lookup to outside re loop
2024-08-28 23:43:40 -04:00
Jafar Al-Gharaibeh
216ed8c796
Merge pull request #16673 from donaldsharp/default_original_sin
tests: Fix bgp_default_originate_topo1_3
2024-08-28 15:30:12 -04:00
Mark Stapp
79e0c6a2e0
Merge pull request #16672 from raja-rajasekar/vty_out_mem_spike_srujana
lib: Memory spike reduction for sh cmds at scale
2024-08-28 15:29:23 -04:00
Donald Sharp
ce74a6b0a8 tests: Fix route-scale at higher ecmp
Recent commits moved the default retries to 60, but
the higher ecmp counts were overriding that to 40.  Let's
make it 80.

Noticed this when I went looking at failures on 386 platforms
in our ci.  Route scale is timing out when deleting routes.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-08-28 15:18:24 -04:00
Donald Sharp
d58c44cebe tests: ospf_netns_vrf should give more time for coming up
Test fails:

            test_func = partial(
                topotest.router_json_cmp,
                router,
                "show ip ospf vrf {0}-ospf-cust1 json".format(rname),
                expected,
            )
            _, diff = topotest.run_and_expect(test_func, None, count=10, wait=0.5)
            assertmsg = '"{}" JSON output mismatches'.format(rname)
>           assert diff is None, assertmsg
E           AssertionError: "r1" JSON output mismatches
E           assert Generated JSON diff error report:
E
E             > $->r1-ospf-cust1->areas->0.0.0.0->nbrFullAdjacentCounter: output has element with value '1' but in expected it has value '2'

/home/sharpd/frr2/tests/topotests/ospf_netns_vrf/test_ospf_netns_vrf.py:239: AssertionError

Support bundle has this data:
r1# show ip ospf vrf all neighbor
% 2024/08/28 14:55:54.763

VRF Name: r1-ospf-cust1

Neighbor ID     Pri State           Up Time         Dead Time Address         Interface                        RXmtL RqstL DBsmL
10.0.255.3        1 Full/DR         10.547s           39.456s 10.0.3.1        r1-eth1:10.0.3.2                     0     0     0
10.0.255.2        1 Full/Backup     0.543s            38.378s 10.0.3.3        r1-eth1:10.0.3.2                     1     0     0

So the neighbor comes up immediately after the test gives up.
Let's give the test a bit more time so that this failure does not happen.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-08-28 15:10:04 -04:00
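The failing call above uses `count=10, wait=0.5`, a budget of roughly 5 seconds, and the second adjacency reached Full just after that budget ran out. The following is a minimal sketch of the same poll-and-retry idea, not the topotest implementation: `poll_until` and `FakeRouter` are hypothetical names invented for illustration, and the simulated router simply reports two full adjacencies from its 12th poll onward.

```python
import time


def poll_until(func, expected, count=10, wait=0.5):
    """Call func() up to `count` times, sleeping `wait` seconds between tries.

    Returns (success, last_result). Hypothetical helper, for illustration only.
    """
    result = None
    for attempt in range(count):
        result = func()
        if result == expected:
            return True, result
        if attempt < count - 1:
            time.sleep(wait)
    return False, result


class FakeRouter:
    """Stand-in for a router whose second neighbor needs time to reach Full."""

    def __init__(self, ready_after):
        self.polls = 0
        self.ready_after = ready_after

    def full_adjacencies(self):
        self.polls += 1
        return 2 if self.polls >= self.ready_after else 1


# Short waits keep the demo fast; only the retry count differs.
short = poll_until(FakeRouter(12).full_adjacencies, 2, count=10, wait=0.01)
longer = poll_until(FakeRouter(12).full_adjacencies, 2, count=20, wait=0.01)
print(short)   # (False, 1)
print(longer)  # (True, 2)
```

With the same simulated router, the smaller retry budget reports a mismatch while the larger one succeeds, which is the kind of timing-only failure the commit message describes.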
Donald Sharp
3797454a2a tests: Ensure bgp suppress fib has a chance to transmit data to peer
Giving only 5 seconds to pass bgp data to peers on a heavily
loaded system is a recipe for not having fun.  Add more time.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-08-28 15:05:40 -04:00
Mark Stapp
be161ba4a2
Merge pull request #16679 from donaldsharp/nhrp_test_documentation
doc: Update topotest doc to include iptables is needed
2024-08-28 14:11:18 -04:00
Mark Stapp
8b23abf36e
Merge pull request #16300 from donaldsharp/local_connected
Local connected
2024-08-28 14:10:14 -04:00
Donald Sharp
184dccca60 zebra: Move prefix lookup to outside re loop
Move the prefix lookup/comparison to outside the re loop
and into the rn loop, since that is where the code should
actually be.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-08-28 13:18:00 -04:00
Donald Sharp
8e4389da56
Merge pull request #16676 from opensourcerouting/fix/lua_nexthop_handling_in_lua
Lua stack dumping
2024-08-28 12:45:39 -04:00
Jafar Al-Gharaibeh
04b763bcec
Merge pull request #16677 from donaldsharp/mgmt_map
mgmtd: Ensure map is NULL
2024-08-28 12:38:11 -04:00
Donald Sharp
4ec3f1ef0f doc: Update topotest doc to include iptables is needed
The nhrp tests are skipped on systems that do not have iptables installed.
As such we have ended up with a situation where the nhrp test
is now failing locally for me because I have iptables installed,
and if the CI system had iptables installed it would have detected
the problem as well.

Let's document that iptables is needed to do testing.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-08-28 12:05:41 -04:00
Donald Sharp
598d9a1f17 tests: Fix bgp_default_originate_topo1_3
This test was killing bgp on r1 and r2
and then immediately testing that the
default route transitioned.  Unfortunately
the test was written such that under load the
system might be in a bad state.  Let's
modify the code to check for a bgp version
change and then that the bgp state has
come back up.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-08-28 11:09:32 -04:00
Donald Sharp
43508d3c67 mgmtd: Ensure map is NULL
Build is complaining:
build	27-Aug-2024 05:46:38	mgmtd/mgmt_be_adapter.c: In function ‘mgmt_register_client_xpath’:
build	27-Aug-2024 05:46:38	mgmtd/mgmt_be_adapter.c:274:27: warning: ‘maps’ may be used uninitialized [-Wmaybe-uninitialized]
build	27-Aug-2024 05:46:38	  274 |         map = darr_append(*maps);
build	27-Aug-2024 05:46:38	      |                           ^
build	27-Aug-2024 05:46:38	mgmtd/mgmt_be_adapter.c:250:36: note: ‘maps’ was declared here
build	27-Aug-2024 05:46:38	  250 |         struct mgmt_be_xpath_map **maps, *map;
build	27-Aug-2024 05:46:38	      |                                    ^~~~

Let's make the compiler happy, even though there is no problem.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-08-28 10:09:52 -04:00
Donatas Abraitis
a0a2a35ed3 lib: Add a helper function to dump Lua stack
Very handy for debugging.

In a Lua script, just use "log.trace(table)":

```
function on_rib_process_dplane_results(ctx)
	log.trace(ctx.rinfo.zd_ng)
end
```

You will get something like:

```
Aug 28 17:04:36 donatas-laptop zebra[3782199]: [GCZ7N-MM9D9] {
                                                 1: {
                                                   type: 2
                                                   weight: 1
                                                   flags: 5
                                                   backup_idx: 0
                                                   vrf_id: 0
                                                   nh_encap_type: 0
                                                   gate: {
                                                     value: 5.87967e+08
                                                     string: "192.168.11.35"
                                                   }
                                                   nh_label_type: 0
                                                   srte_color: 0
                                                   ifindex: 0
                                                   backup_num: 0
                                                 }
                                                 2: {
                                                   type: 3
                                                   weight: 1
                                                   flags: 3
                                                   backup_idx: 0
                                                   vrf_id: 0
                                                   nh_encap_type: 0
                                                   nh_label_type: 0
                                                   srte_color: 0
                                                   ifindex: 4
                                                   backup_num: 0
                                                 }
                                               }
```

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2024-08-28 17:08:45 +03:00
Donatas Abraitis
b1012b693f lib: Start from 1, not 0 when creating Lua tables for nexthops
Lua technically enumerates arrays from 1, not 0.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2024-08-28 15:31:47 +03:00
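As a small illustration of the indexing convention mentioned above, here is a Python sketch (not the FRR C code that builds the Lua tables) that keys a list of nexthops from 1, the way Lua array-like tables are keyed. The helper name `to_lua_style_table` is hypothetical; the two sample nexthops reuse the `type` and `ifindex` values from the log.trace output shown earlier.

```python
def to_lua_style_table(items):
    """Key a sequence the way Lua array-like tables are keyed: 1, 2, 3, ..."""
    return {index: item for index, item in enumerate(items, start=1)}


nexthops = [{"type": 2, "ifindex": 0}, {"type": 3, "ifindex": 4}]
table = to_lua_style_table(nexthops)
print(sorted(table))  # [1, 2]
```

Starting at 1 matters on the Lua side because `ipairs` and the length operator `#` both assume array-like tables begin at index 1.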
Srujana
9112fb367b lib: Memory spike reduction for sh cmds at scale
The output buffer vty->obuf is a linked list where
each element is 4KB in size.
Currently, when a huge sh command like <show ip route json>
is executed at large scale, all the vty_outs are
processed and the entire data is accumulated.
After the entire vty execution, vtysh_flush processes
and puts this data in the socket (131KB at a time).

Problem here is the memory spike for such heavy duty
show commands.

The fix here is to chunk the output on the VTY shell by
flushing it intermediately for every 128 KB of output
accumulated and freeing the memory allocated for the buffer data.

This way, we achieve ~25-30% reduction in the memory spike.

Fixes: #16498
Note: This is a continuation of MR #16498

Signed-off-by: Srujana <skanchisamud@nvidia.com>

Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
2024-08-27 12:47:00 -07:00
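The difference between accumulating all output and flushing it every 128 KB can be sketched with a toy model. This is not the vty code: `OutputBuffer`, its fields, and the list used as a "socket" are hypothetical, and the model tracks only buffered bytes, so its numbers are not comparable to the ~25-30% figure measured in the commit.

```python
CHUNK_SIZE = 4 * 1024         # each buffer element holds at most 4 KB
FLUSH_THRESHOLD = 128 * 1024  # flush once this many bytes are buffered


class OutputBuffer:
    """Toy output buffer that records its own peak buffered size."""

    def __init__(self, sink, flush_threshold=None):
        self.sink = sink                      # list standing in for the socket
        self.flush_threshold = flush_threshold
        self.chunks = []
        self.buffered = 0
        self.peak = 0

    def write(self, data):
        # Simplification: split each write into pieces of at most 4 KB.
        for start in range(0, len(data), CHUNK_SIZE):
            chunk = data[start:start + CHUNK_SIZE]
            self.chunks.append(chunk)
            self.buffered += len(chunk)
            self.peak = max(self.peak, self.buffered)
            if (self.flush_threshold is not None
                    and self.buffered >= self.flush_threshold):
                self.flush()

    def flush(self):
        # Hand everything to the sink and release the buffered chunks.
        self.sink.extend(self.chunks)
        self.chunks.clear()
        self.buffered = 0


def run(flush_threshold):
    sink = []
    buf = OutputBuffer(sink, flush_threshold)
    for _ in range(1024):          # 1024 writes of 1 KB = 1 MiB of output
        buf.write(b"x" * 1024)
    buf.flush()                    # final flush at the end of the command
    return buf.peak, sum(len(c) for c in sink)


accumulate_peak, accumulate_total = run(None)
chunked_peak, chunked_total = run(FLUSH_THRESHOLD)
print(accumulate_peak, accumulate_total)  # 1048576 1048576
print(chunked_peak, chunked_total)        # 131072 1048576
```

Both runs deliver the same bytes to the sink; only the peak amount held in the buffer differs.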
Donatas Abraitis
3d2c589766
Merge pull request #16655 from louis-6wind/fix-bmp-bpi-extra
bgpd: fix labels static-analyser
2024-08-27 22:14:13 +03:00
Donald Sharp
ae49b992ae
Merge pull request #16651 from opensourcerouting/fix/blackhole_community_bgpd
bgpd: Respect BLACKHOLE community for internal BGP peering also
2024-08-27 15:11:00 -04:00
Jafar Al-Gharaibeh
fa7c77f293
Merge pull request #16665 from louis-6wind/fix-flexalgo-crash-no-te
isisd: fix crash at flex-algo without mpls-te
2024-08-27 13:52:50 -04:00
Russ White
3150d963e6
Merge pull request #16652 from opensourcerouting/fix/prefix_sid_handling
bgpd: Filter Prefix-SID, Encap, PMSI Tunnel
2024-08-27 10:57:44 -04:00
Russ White
bd0fdc443e
Merge pull request #16610 from Jafaral/no-py
tools, ospfclient: add a config option to skip installing python scripts
2024-08-27 10:38:09 -04:00
Louis Scalbert
cd81d28ae2 isisd: fix crash at flex-algo without mpls-te
Fix crash when flex-algo is configured and mpls-te is disabled.

> interface eth0
>  ip router isis 1
> !
> router isis 1
>  flex-algo 129
>   dataplane sr-mpls
>   advertise-definition

> #0  __pthread_kill_implementation (no_tid=0, signo=11, threadid=140486233631168) at ./nptl/pthread_kill.c:44
> #1  __pthread_kill_internal (signo=11, threadid=140486233631168) at ./nptl/pthread_kill.c:78
> #2  __GI___pthread_kill (threadid=140486233631168, signo=signo@entry=11) at ./nptl/pthread_kill.c:89
> #3  0x00007fc5802e9476 in __GI_raise (sig=11) at ../sysdeps/posix/raise.c:26
> #4  0x00007fc58076021f in core_handler (signo=11, siginfo=0x7ffd38d42470, context=0x7ffd38d42340) at lib/sigevent.c:248
> #5  <signal handler called>
> #6  0x000055c527f798c9 in isis_link_params_update_asla (circuit=0x55c52aaed3c0, ifp=0x55c52a1044e0) at isisd/isis_te.c:176
> #7  0x000055c527fb29da in isis_instance_flex_algo_create (args=0x7ffd38d43120) at isisd/isis_nb_config.c:2875
> #8  0x00007fc58072655b in nb_callback_create (context=0x55c52ab1d2f0, nb_node=0x55c529f72950, event=NB_EV_APPLY, dnode=0x55c52ab06230, resource=0x55c52ab189f8, errmsg=0x7ffd38d43750 "",
>     errmsg_len=8192) at lib/northbound.c:1262
> #9  0x00007fc580727625 in nb_callback_configuration (context=0x55c52ab1d2f0, event=NB_EV_APPLY, change=0x55c52ab189c0, errmsg=0x7ffd38d43750 "", errmsg_len=8192) at lib/northbound.c:1662
> #10 0x00007fc580727c39 in nb_transaction_process (event=NB_EV_APPLY, transaction=0x55c52ab1d2f0, errmsg=0x7ffd38d43750 "", errmsg_len=8192) at lib/northbound.c:1794
> #11 0x00007fc580725f77 in nb_candidate_commit_apply (transaction=0x55c52ab1d2f0, save_transaction=true, transaction_id=0x0, errmsg=0x7ffd38d43750 "", errmsg_len=8192)
>     at lib/northbound.c:1131
> #12 0x00007fc5807260d1 in nb_candidate_commit (context=..., candidate=0x55c529f0a730, save_transaction=true, comment=0x0, transaction_id=0x0, errmsg=0x7ffd38d43750 "", errmsg_len=8192)
>     at lib/northbound.c:1164
> #13 0x00007fc58072d220 in nb_cli_classic_commit (vty=0x55c52a0fc6b0) at lib/northbound_cli.c:51
> #14 0x00007fc58072d839 in nb_cli_apply_changes_internal (vty=0x55c52a0fc6b0,
>     xpath_base=0x7ffd38d477f0 "/frr-isisd:isis/instance[area-tag='1'][vrf='default']/flex-algos/flex-algo[flex-algo='129']", clear_pending=false) at lib/northbound_cli.c:178
> #15 0x00007fc58072dbcf in nb_cli_apply_changes (vty=0x55c52a0fc6b0, xpath_base_fmt=0x55c528014de0 "./flex-algos/flex-algo[flex-algo='%ld']") at lib/northbound_cli.c:234
> #16 0x000055c527fd3403 in flex_algo_magic (self=0x55c52804f1a0 <flex_algo_cmd>, vty=0x55c52a0fc6b0, argc=2, argv=0x55c52ab00ec0, algorithm=129, algorithm_str=0x55c52ab120d0 "129")
>     at isisd/isis_cli.c:3752
> #17 0x000055c527fc97cb in flex_algo (self=0x55c52804f1a0 <flex_algo_cmd>, vty=0x55c52a0fc6b0, argc=2, argv=0x55c52ab00ec0) at ./isisd/isis_cli_clippy.c:6445
> #18 0x00007fc5806b9abc in cmd_execute_command_real (vline=0x55c52aaf78f0, vty=0x55c52a0fc6b0, cmd=0x0, up_level=0) at lib/command.c:984
> #19 0x00007fc5806b9c35 in cmd_execute_command (vline=0x55c52aaf78f0, vty=0x55c52a0fc6b0, cmd=0x0, vtysh=0) at lib/command.c:1043
> #20 0x00007fc5806ba1e5 in cmd_execute (vty=0x55c52a0fc6b0, cmd=0x55c52aae6bd0 "flex-algo 129\n", matched=0x0, vtysh=0) at lib/command.c:1209
> #21 0x00007fc580782ae1 in vty_command (vty=0x55c52a0fc6b0, buf=0x55c52aae6bd0 "flex-algo 129\n") at lib/vty.c:615
> #22 0x00007fc580784a05 in vty_execute (vty=0x55c52a0fc6b0) at lib/vty.c:1378
> #23 0x00007fc580787131 in vtysh_read (thread=0x7ffd38d4ab10) at lib/vty.c:2373
> #24 0x00007fc58077b605 in event_call (thread=0x7ffd38d4ab10) at lib/event.c:2011
> #25 0x00007fc5806f8976 in frr_run (master=0x55c529df9b30) at lib/libfrr.c:1212
> #26 0x000055c527f301bc in main (argc=5, argv=0x7ffd38d4ad58, envp=0x7ffd38d4ad88) at isisd/isis_main.c:350
> (gdb) f 6
> #6  0x000055c527f798c9 in isis_link_params_update_asla (circuit=0x55c52aaed3c0, ifp=0x55c52a1044e0) at isisd/isis_te.c:176
> 176                     list_delete_all_node(ext->aslas);
> (gdb) p ext
> $1 = (struct isis_ext_subtlvs *) 0x0

Fixes: ae27101e6f ("isisd: fix building asla at first flex-algo config")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2024-08-27 16:06:08 +02:00
Russ White
5a6cb0bf75
Merge pull request #16103 from mjstapp/fix_5549_nhg_type
zebra: be consistent about v6 nexthops for v4 routes
2024-08-27 09:46:53 -04:00
Mark Stapp
17fffbad1b
Merge pull request #16656 from donaldsharp/minor_fix_for_pim_dr_nondr
tests: Allow convergence before adding multicast routes
2024-08-27 08:17:46 -04:00
Donald Sharp
37dd51867f tests: Add some tests to show new behavior works as expected
a) A noprefix address by itself should not create a connected route.
   This was pre-existing.
b) A noprefix address with a corresponding route should result in a
   connected route.  This is how NetworkManager appears to work.
   This is new behavior, so a new test.
c) A route is added to the system from someone else.
   This is new behavior, so a new test.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-08-27 06:25:34 -04:00
Donald Sharp
9bc0cd8241 zebra: Prevent accidental re memory leak in odd case
There exists a path in rib_add_multipath where if a decision
is made to not use the passed in re, we just drop the memory
instead of freeing it.  Let's free it.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-08-27 06:25:34 -04:00
Donald Sharp
d528c02a20 zebra: Handle kernel routes appropriately
Current code intentionally ignores kernel routes.  Modify
zebra to allow these routes to be read in on Linux.  Also
modify zebra to check whether a route should be treated
as a connected route and mark it as such.

Additionally this should properly handle some of the issues
being seen with NOPREFIXROUTE.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-08-27 06:25:34 -04:00
Donald Sharp
bdfccf69fa zebra: Expose rib_update_handle_vrf_all
This function will be used on interface down
events to allow for kernel routes to be cleaned
up.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-08-27 06:25:34 -04:00
Donald Sharp
f450e1cda4 zebra: Make p and src_p const for rib_delete
The prefixes p and src_p are not const.  Let's make
them so.  Useful to signal that we will not change this
data.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-08-27 06:25:34 -04:00
Donatas Abraitis
7a461479a0 bgpd: Respect BLACKHOLE community for internal BGP peering also
RFC 7999 does not restrict the use of this technique to EBGP sessions only.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2024-08-27 10:08:54 +03:00
Donald Sharp
52f292d188
Merge pull request #16657 from Jafaral/ospfv3_test_fix
tests: Fix frequent ospfv3 basic functionality test failure
2024-08-26 20:29:38 -04:00
Jafar Al-Gharaibeh
0d745741c9 tests: Fix frequent ospfv3 basic functionality test failure
The dead timer is set to 4 seconds, while the hello interval is set to 65535.
This test will only pass if the platform is fast enough for ospfv3 to
converge in 4 seconds. These timers were already tested multiple times earlier.
This test should just make sure that the boundary value 65535 is configurable.

Other changes in this commit:
  - add sequence numbers to the dead intervals tests to make it easier to
    track test failures.
  - swap the config order in one test to match order with all other tests.

Signed-off-by: Jafar Al-Gharaibeh <jafar@atcorp.com>
2024-08-26 16:35:37 -05:00
Donald Sharp
3c4ffcacfe tests: Allow convergence before adding multicast routes
Current code adds a new vlan interface, sets up ospf and
pim on it and immediately starts shoving data down the pipes.
This of course has the fun thing where the IGP and pim do not
always come up in a nice neat manner, and the test is looking
for state from a nice neat come up.  Even though pim is `working`
correctly, its state is not what the test wants.

Modify the code to ensure that ospf is up and has propagated
the route where it is needed as well as that pim neighbors have
properly come up, then initiate the multicast streams and igmp
reports.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2024-08-26 16:02:46 -04:00
Louis Scalbert
a152692f5a bgpd: fix labels static-analyser
Fix static-analyser warnings with BGP labels:

> $ scan-build make -j12
> bgpd/bgp_updgrp_packet.c:819:10: warning: Access to field 'extra' results in a dereference of a null pointer (loaded from variable 'path') [core.NullDereference]
>                                                 ? &path->extra->labels->label[0]
>                                                    ^~~~~~~~~

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
2024-08-26 10:29:12 +02:00
Donatas Abraitis
d84cae2db7 bgpd: Set encap attribute if received and parsed
It's not used much in the code, but we should have it set when everything is fine.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2024-08-25 19:15:10 +03:00
Donatas Abraitis
b713df85bd bgpd: Allow filtering Encap attribute
Filtering this attribute via `path-attribute discard/treat-as-withdraw`.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2024-08-25 19:11:01 +03:00
Donatas Abraitis
f0bb2626ef bgpd: Allow filtering PMSI Tunnel attribute
Filtering this attribute via `path-attribute discard/treat-as-withdraw`.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2024-08-25 19:05:53 +03:00
Donatas Abraitis
f390253ca7 bgpd: Allow filtering Prefix-SID attribute
Filtering this attribute via `path-attribute discard/treat-as-withdraw`.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2024-08-25 19:03:05 +03:00
Donald Sharp
f85dfb3f1a
Merge pull request #16649 from opensourcerouting/fix/free_memory_on_return
bgpd: Free evpn_overlay memory on error
2024-08-24 15:30:45 -04:00
Donatas Abraitis
5eb80edf97 bgpd: Free evpn_overlay memory on error
When parsing EVPN NLRIs, if an error occurs, do not forget to free the memory.

Fixes: 4ace11d010 ("bgpd: Move evpn_overlay to a pointer")

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2024-08-24 11:58:48 +03:00
Donatas Abraitis
ab2fd988c9
Merge pull request #16646 from csilt/pim-ifp-crash
pimd: Fix crash in pimd
2024-08-24 11:12:23 +03:00
Donatas Abraitis
8d4c70697b
Merge pull request #16631 from pguibert6WIND/imported_from_l3nhg_json
bgpd: add json support for BGP L3NHG values
2024-08-24 09:32:49 +03:00
Corey Siltala
12a783d313 pimd: Fix crash in pimd
ifp->info is not always set in PIM. So add a guard here to stop
it from crashing when addresses are added to a non-PIM enabled interface
and PIM zebra debugging is enabled.

Signed-off-by: Corey Siltala <csiltala@atcorp.com>
2024-08-23 15:42:03 -05:00
Mark Stapp
b4dae97381
Merge pull request #16609 from donaldsharp/singleton_no_weight
Reduce the number of Singleton objects when using weight for NHG's
2024-08-23 16:19:29 -04:00
Donatas Abraitis
7af65715fc
Merge pull request #16640 from louis-6wind/fix-nhrp-local
nhrpd: fix sending /32 shortcut
2024-08-23 22:57:09 +03:00
Donald Sharp
a04cca6f74
Merge pull request #16633 from Jafaral/fix-version-build
config: fix missing case when reporting version 'configured with'
2024-08-23 14:45:33 -04:00