Rename router variables in nhrp_redundancy to match the actual name.
Cosmetic change to help debugging.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Currently the FRR code will receive both kernel and
connected routes that do not actually have an underlying
nexthop group at all. Zebra turns around and creates
a `matching` nexthop hash entry and installs it.
For connected routes, this will create 2 singleton
nexthops in the dplane per interface (v4 and v6).
For kernel routes it would just create 1 singleton
nexthop that might be used or not.
This is bad because the dplane has a limited amount
of space available for nexthop entries and if you
happen to have a large number of interfaces then
all of a sudden you have 2x(# of interfaces) singleton
nexthops.
Let's modify the code to delay creation of these singleton
nexthops until they have been used by something else in the
system.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The expected prefix should be 5.5.5.0/24 otherwise the hosts behind NHRP
client 1 nhc1 (aka. r5) are not reachable via NHRP.
The issue was not seen in the FRR official CI because the tests were
skipped because iptables were missing in CI machines.
It solves the 16690 issue.
Fixes: https://github.com/FRRouting/frr/issues/16690
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
There is a code path that could theoretically get you
to a point where the ng->nexthop is a NULL value.
Let's just make sure the SA system believes that
cannot happen anymore.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
A blackhole nexthop, according to the linux kernel,
can be v4 or v6. A v4 blackhole nexthop cannot be
used on a v6 route, but a v6 blackhole nexthop can
be used with a v4 route. Convert all blackhole
singleton nexthops to v6 and just use that.
Possibly reducing the number of active nexthops by 1.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Let's display the afi of the nexthop hash entry. Right
now it is impossible to tell the difference between v4 or
v6 nexthops, especially since it is important for the kernel.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Recent commits moved the default retries to 60, but
the higher ecmp counts were over-riding to 40. Let's
make it 80.
Noticed this when I went looking at failures on 386 platforms
in our ci. Route scale is timing out when deleting routes.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Test fails:
test_func = partial(
topotest.router_json_cmp,
router,
"show ip ospf vrf {0}-ospf-cust1 json".format(rname),
expected,
)
_, diff = topotest.run_and_expect(test_func, None, count=10, wait=0.5)
assertmsg = '"{}" JSON output mismatches'.format(rname)
> assert diff is None, assertmsg
E AssertionError: "r1" JSON output mismatches
E assert Generated JSON diff error report:
E
E > $->r1-ospf-cust1->areas->0.0.0.0->nbrFullAdjacentCounter: output has element with value '1' but in expected it has value '2'
/home/sharpd/frr2/tests/topotests/ospf_netns_vrf/test_ospf_netns_vrf.py:239: AssertionError
Support bundle has this data:
r1# show ip ospf vrf all neighbor
% 2024/08/28 14:55:54.763
VRF Name: r1-ospf-cust1
Neighbor ID Pri State Up Time Dead Time Address Interface RXmtL RqstL DBsmL
10.0.255.3 1 Full/DR 10.547s 39.456s 10.0.3.1 r1-eth1:10.0.3.2 0 0 0
10.0.255.2 1 Full/Backup 0.543s 38.378s 10.0.3.3 r1-eth1:10.0.3.2 1 0 0
So immediately after the test fails this test, the neighbor comes up.
Let's give the test a bit more time for failure to not happen
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Giving only 5 seconds to pass bgp data to peers on a heavily
loaded system is a recipe for not having fun. Add more time.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Move the prefix lookup/comparison to outside the re loop
and into the rn loop, since that is where the code should
actually be.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The nhrp tests skip tests that do not have iptables installed.
As such we have ended up with a situation where the nrhp test
is now failing locally for me because I have iptables installed
and if the CI system had iptables installed it would have detected
the problem as well.
Let's document that iptables is needed to do testing.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This test was killing bgp on r1 and r2
and then immediately testing that the
default route transitioned. Unfortunately
the test was written that under load the
system might be in a bad state. Let's
modify the code to check for a bgp version
change and then that the bgp state has
come back up
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Build is complaining:
build 27-Aug-2024 05:46:38 mgmtd/mgmt_be_adapter.c: In function ‘mgmt_register_client_xpath’:
build 27-Aug-2024 05:46:38 mgmtd/mgmt_be_adapter.c:274:27: warning: ‘maps’ may be used uninitialized [-Wmaybe-uninitialized]
build 27-Aug-2024 05:46:38 274 | map = darr_append(*maps);
build 27-Aug-2024 05:46:38 | ^
build 27-Aug-2024 05:46:38 mgmtd/mgmt_be_adapter.c:250:36: note: ‘maps’ was declared here
build 27-Aug-2024 05:46:38 250 | struct mgmt_be_xpath_map **maps, *map;
build 27-Aug-2024 05:46:38 | ^~~~
Let's make the compiler happy, even though there is no problem.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The output buffer vty->obuf is a linked list where
each element is of 4KB.
Currently, when a huge sh command like <show ip route json>
is executed on a large scale, all the vty_outs are
processed and the entire data is accumulated.
After the entire vty execution, vtysh_flush proceeses
and puts this data in the socket (131KB at a time).
Problem here is the memory spike for such heavy duty
show commands.
The fix here is to chunkify the output on VTY shell by
flushing it intermediately for every 128 KB of output
accumulated and free the memory allocated for the buffer data.
This way, we achieve ~25-30% reduction in the memory spike.
Fixes: #16498
Note: This is a continuation of MR #16498
Signed-off-by: Srujana <skanchisamud@nvidia.com>
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
If the link-params are set when the circuit not yet up, the link-params
are never updated.
isis_link_params_update() is called from isis_circuit_up() but returns
immediately because circuit->state != C_STATE_UP. circuit->state is
updated in isis_csm_state_change after isis_circuit_up().
> struct isis_circuit *isis_csm_state_change(enum isis_circuit_event event,
> struct isis_circuit *circuit,
> void *arg)
> {
> [...]
> if (isis_circuit_up(circuit) != ISIS_OK) {
> isis_circuit_deconfigure(circuit, area);
> break;
> }
> circuit->state = C_STATE_UP;
> isis_event_circuit_state_change(circuit, circuit->area,
> 1);
Do not return isis_link_params_update() if circuit->state != C_STATE_UP.
Fixes: 0fdd8b2b11 ("isisd: update link params after circuit is up")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Implement common code for debug status output and remove daemon-specific
code that is duplicated everywhere.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Implement common code for debug config output and remove daemon-specific
code that is duplicated everywhere.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
The debug library allows to register a `debug_set_all` callback which
should enable all debugs in a daemon. This callback is implemented
exactly the same in each daemon. Instead of duplicating the code, rework
the lib to allow registration of each debug type, and implement the
common code only once in the lib.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
a) A noprefix address by itself should not create a connected route.
This was pre-existing.
b) A noprefix address with a corresponding route should result in a
connected route. This is how NetworkManager appears to work.
This is new behavior, so a new test.
c) A route is added to the system from someone else.
This is new behavior, so a new test.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>