	if (BGP_IS_VALID_STATE_FOR_NOTIF(peer->connection->status))
		peer_notify_config_change(peer->connection);
	else
		bgp_session_reset_safe(peer, &nnode);
Let's make peer_notify_config_change() return a bool that tells the
caller whether or not it should reset the peer session. This
simplifies the code considerably.
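A minimal sketch of the resulting call-site shape, not the actual bgpd code.
The struct layouts, state_allows_notify(), notify_config_change(),
session_reset() and apply_config_change() are hypothetical stand-ins, and the
polarity of the return value (true means "notification sent") is an
assumption made for this sketch:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the bgpd types; not the real definitions. */
enum conn_status { CONN_IDLE, CONN_OPENSENT, CONN_ESTABLISHED };

struct connection {
	enum conn_status status;
	int notifications_sent;
};

struct peer {
	struct connection *connection;
	int resets;
};

/* Assumed stand-in for BGP_IS_VALID_STATE_FOR_NOTIF(). */
static bool state_allows_notify(enum conn_status status)
{
	return status == CONN_OPENSENT || status == CONN_ESTABLISHED;
}

/*
 * Assumed polarity: returns true if a notification was sent, false if the
 * caller has to reset the session itself.
 */
static bool notify_config_change(struct connection *conn)
{
	if (!state_allows_notify(conn->status))
		return false;

	conn->notifications_sent++;
	return true;
}

/* Stand-in for the session reset; it only counts invocations. */
static void session_reset(struct peer *peer)
{
	peer->resets++;
}

/* The if/else at every call site collapses into a single test. */
static void apply_config_change(struct peer *peer)
{
	if (!notify_config_change(peer->connection))
		session_reset(peer);
}
```

With this shape the state check lives in one place instead of being repeated
at every caller.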
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
We have about a bajillion places that check whether we can
notify the peer and then send a config change
notification. Let's just make a function that
does this.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
There is an extra space in the 'Displayed' line of the show bgp command
that should not be present.
Fix this by being consistent with the output of the other address
families.
Fixes: a1baf9e84f71 ("bgpd: Use single whitespace when displaying show bgp summary")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The JSON output of the version attribute was originally an
integer. It was changed to a string, most probably by mistake.
> {
> "vrfId": 7,
> "vrfName": "vrf1",
> "tableVersion": 3,
> "routerId": "192.0.2.1",
> "defaultLocPrf": 100,
> "localAS": 65500,
> "routes": {
> "172.31.0.1/32": {
> "prefix": "172.31.0.1/32",
> "version": "1", <--- int or string ??
Fix it by displaying the value as an integer again.
Fixes: f9f2d188e3 ("bgpd: fix 'json detail' output structure")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
We no longer maintain docker.com/frrouting and no longer build custom
images for topotests.
Use local images for topotests instead.
Just use:
```
make topotests-build
make topotests
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
During zebra shutdown, the main pthread and the FPM pthread can
deadlock if the FPM pthread is in fpm_reconnect(). Each pthread
tries to use event_cancel_async() to cancel tasks that may be
scheduled for the other pthread - this leads to a deadlock as
neither thread can progress.
This adds an atomic boolean that's managed as each pthread
enters and leaves the cleanup code in question, preventing the
two threads from running into the deadlock.
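A rough sketch of the guard idea using C11 atomics; it is not the actual
zebra/FPM code. The flag name, cleanup_enter(), cleanup_leave(),
cancel_other_thread_events() and the counting callback are all hypothetical
names invented for illustration:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag: true while one pthread is inside the cleanup code. */
static atomic_bool cleanup_in_progress;

/*
 * Try to enter the cleanup section. atomic_exchange() returns the previous
 * value, so exactly one caller sees 'false' and wins.
 */
static bool cleanup_enter(void)
{
	return !atomic_exchange(&cleanup_in_progress, true);
}

static void cleanup_leave(void)
{
	atomic_store(&cleanup_in_progress, false);
}

/*
 * Only cancel events owned by the other pthread if that pthread is not
 * already doing the same thing to us. Returns true if cancel_fn ran.
 */
static bool cancel_other_thread_events(void (*cancel_fn)(void))
{
	if (!cleanup_enter())
		return false;

	cancel_fn();
	cleanup_leave();
	return true;
}

/* Demo stand-in for the cross-thread cancellation; it only counts calls. */
static int cancel_calls;

static void count_cancel(void)
{
	cancel_calls++;
}
```

The point of the exchange is that checking and setting the flag happen in one
atomic step, so both pthreads cannot enter the section at the same time.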
Signed-off-by: Mark Stapp <mjs@cisco.com>
The test is failing because on r2 we are looking for a metric of 777
on startup, but r2 only starts looking for it after the 5 second
delay that is set up in the config has already expired.
On r1:
2023/09/06 17:05:14.999407 BGP: [G822R-SBMNH] config-from-file# router bgp 65001
2023/09/06 17:05:15.003060 BGP: [G822R-SBMNH] config-from-file# bgp max-med on-startup 5 777
2023/09/06 17:05:15.003342 BGP: [G822R-SBMNH] config-from-file# no bgp ebgp-requires-policy
2023/09/06 17:05:15.003453 BGP: [G822R-SBMNH] config-from-file# neighbor 192.168.255.2 remote-as 65001
2023/09/06 17:05:15.004029 BGP: [G822R-SBMNH] config-from-file# neighbor 192.168.255.2 timers 3 10
2023/09/06 17:05:15.004242 BGP: [G822R-SBMNH] config-from-file# address-family ipv4 unicast
2023/09/06 17:05:15.004329 BGP: [G822R-SBMNH] config-from-file# redistribute connected
2023/09/06 17:05:15.005023 BGP: [G822R-SBMNH] config-from-file# exit-address-family
2023/09/06 17:05:15.005140 BGP: [G822R-SBMNH] config-from-file# !
2023/09/06 17:05:15.005162 BGP: [G822R-SBMNH] config-from-file# !
2023/09/06 17:05:17.538112 BGP: [M7Q4P-46WDR] vty[25]@> enable
2023/09/06 17:05:17.546700 BGP: [M7Q4P-46WDR] vty[25]@# clear log cmdline-targets
2023/09/06 17:05:17.570635 BGP: [M7Q4P-46WDR] vty[25]@(config)# log commands
2023/09/06 17:05:17.572518 BGP: [M7Q4P-46WDR] vty[25]@(config)# log timestamp precision 6
2023/09/06 17:05:24.982647 BGP: [YNGC8-65JDM] Begin maxmed onstartup mode - timer 5 seconds
2023/09/06 17:05:26.033134 BGP: [M59KS-A3ZXZ] bgp_update_receive: rcvd End-of-RIB for IPv4 Unicast from 192.168.255.2 in vrf default
2023/09/06 17:05:29.982960 BGP: [N1747-51Y51] Max med on startup ended - timer expired.
On r2:
2023/09/06 17:05:23.976029 BGP: [G822R-SBMNH] config-from-file# !
2023/09/06 17:05:26.084086 BGP: [M59KS-A3ZXZ] bgp_update_receive: rcvd End-of-RIB for IPv4 Unicast from 192.168.255.1 in vrf default
2023/09/06 17:05:27.280103 BGP: [M7Q4P-46WDR] vty[25]@> enable
2023/09/06 17:05:27.290204 BGP: [M7Q4P-46WDR] vty[25]@# clear log cmdline-targets
2023/09/06 17:05:27.328798 BGP: [M7Q4P-46WDR] vty[25]@(config)# log commands
2023/09/06 17:05:27.335032 BGP: [M7Q4P-46WDR] vty[25]@(config)# log timestamp precision 6
2023/09/06 17:05:31.558216 BGP: [M7Q4P-46WDR] vty[5]@> enable
2023/09/06 17:05:31.562482 BGP: [M7Q4P-46WDR] vty[5]@# do show logging
2023/09/06 17:05:32.942204 BGP: [M7Q4P-46WDR] vty[5]@> enable
2023/09/06 17:05:32.946745 BGP: [M7Q4P-46WDR] vty[5]@# show ip bgp neighbor 192.168.255.1 json
2023/09/06 17:05:34.173879 BGP: [M7Q4P-46WDR] vty[5]@> enable
2023/09/06 17:05:34.178448 BGP: [M7Q4P-46WDR] vty[5]@# show ip bgp neighbor 192.168.255.1 routes json
2023/09/06 17:05:36.459365 BGP: [M7Q4P-46WDR] vty[5]@> enable
2023/09/06 17:05:36.472019 BGP: [M7Q4P-46WDR] vty[5]@# show ip bgp neighbor 192.168.255.1 routes json
2023/09/06 17:05:38.557840 BGP: [M7Q4P-46WDR] vty[5]@> enable
2023/09/06 17:05:38.558948 BGP: [M7Q4P-46WDR] vty[5]@# show ip bgp neighbor 192.168.255.1 routes json
2023/09/06 17:05:40.198563 BGP: [M7Q4P-46WDR] vty[5]@> enable
Notice that the 5 second delay for the max-med expires at 29 seconds, but the show routes
on r2 does not even begin until 34 seconds, long after the max-med has expired and the
test has moved on.
Let's relax the max-med timer to 30 seconds and modify the test to wait a bit longer,
both for finding the metric and for the timer to expire.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Test is failing locally:
2023-09-06 18:39:56,865 DEBUG: r1: vtysh result:
Hello, this is FRRouting (version 9.1-dev).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
r1# conf t
r1(config)# router ospf
r1(config-router)# ospf router-id 1.1.1.1
For this router-id change to take effect, use "clear ip ospf process" command
r1(config-router)#
2023-09-06 18:39:56,865 DEBUG: root: GOT LINE: 'SUCCESS: 1.0.0.0'
2023-09-06 18:39:56,866 DEBUG: root: GOT LINE: '2023-09-06 18:39:55,982 INFO: TESTER: root: Waiting for 1.1.1.1'
2023-09-06 18:39:56,867 DEBUG: root: GOT LINE: '2023-09-06 18:39:55,982 DEBUG: TESTER: root: expected '1.1.1.1' != '1.0.0.0''
2023-09-06 18:39:56,867 DEBUG: root: GOT LINE: 'waiting on notify'
Sure looks like the router-id is not allowed to be changed because
neighbors have already been formed. If we are changing the router-id
then let's clear the process to allow it to correctly change.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Noticed that we were not really attempting to even test
large swaths of our snmp infrastructure. Let's load
up some very simple configs for those daemons that
FRR supports and ensure that SNMP is working to
some extent.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The functions:
	if_get_flags
	if_flags_update
	if_flags_mangle
are never invoked from a Linux netlink build. Put a #ifdef
around those functions so that they are not included in the
Linux build, as they are not needed there.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
I noticed that there was some missed code coverage in zebra:
	multicast [enable|disable]
and
	show interface description vrf all
Add a small test to get them covered.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
For interface config:
	shutdown
	mpls
	multicast
These states were never being shown in output; let's show them.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When the VLAN-VNI mapping is updated, do not set the L2VNI up event
if the associated VXLAN device is not up.
This may result in BGP-synced remote routes not being installed
in zebra and onwards (kernel).
Ticket: #4139506
Signed-off-by: Chirag Shah <chirag@nvidia.com>
Deprecate gracefulRestartCapability, which is inconsistent with the existing
format used when advertised and received capabilities are printed.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Somehow it slipped through that GR with peer-groups is completely broken,
although it was working before.
This shows once again that we MUST have more and more topotests.
Fixes: 15403f521a ("bgpd: Streamline GR config, act on change immediately")
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Consider the following scenario.
You start from the configuration below:
```
!
segment-routing
 srv6
  encapsulation
   source-address fc00:0:1::1
  !
 !
!
```
Then you change the source address:
```
r1# configure
r1(config)# segment-routing
r1(config-sr)# srv6
r1(config-srv6)# encapsulation
r1(config-srv6-encap)# source-address 1::1
```
And finally, reload the configuration:
`python3 frr-reload.py --reload /etc/frr/frr.conf`
frr-reload returns the error below:
```
Failed to execute segment-routing srv6 no source-address 1::1 exit exit
"segment-routing -- srv6 -- no source-address 1::1 -- exit -- exit" we failed to remove this command
% Unknown command: no source-address 1::1
[79975|mgmtd] sending configuration
line 3: % Unknown command[76]: source-address fc00:0:1::1
[79975|mgmtd] Configuration file[/etc/frr/frr.conf] processing failure: 2
```
The reason is that the keyword `encapsulation` is missing in frr-reload.
This patch adds the missing keyword `encapsulation`.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
When OSPFd starts, there are two possible scenarios for Segment Routing:
1/ Routes associated with prefixes are not yet available, i.e. Segment Routing
LSAs are received before Type 1 LSAs. In this case, the function
ospf_sr_nhlfe_update() is triggered when a new SPF is launched. Thus, neighbors
and output labels are always synchronised with the routing table.
2/ Routes are already available, i.e. Type 1 LSAs are received before Segment
Routing LSAs, in particular the Router Information LSA which contains the SRGB.
During NHLFE computation, prefixes are left with an incomplete configuration;
in particular, the SR nexthop is set to NULL. If this scenario is handled through
the function update_out_nhlfe() (triggered when an SRGB is received or modified
from a neighbor node), the output label is not correctly configured because the
nexthop SR node associated with the prefix has been left NULL.
This patch corrects this problem by calling the function compute_nhlfe() when
the nexthop SR node associated with the prefix is NULL within the
update_out_nhlfe() function. Thus, we guarantee that the SR prefix is always
correctly configured independently of the scenario, i.e. the arrival order of
the different LSAs.
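A simplified sketch of the fallback described above; it is not the ospfd
implementation. The structs and the *_sketch functions are hypothetical, and
passing the neighbor in as the resolved nexthop is an assumption made to keep
the example short (the real code resolves it from the routing table). The
only SR fact relied on is that an outgoing label is the nexthop's SRGB lower
bound plus the SID index:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the ospfd SR structures. */
struct sr_node {
	uint32_t srgb_lower; /* first label of the node's SRGB */
};

struct sr_prefix {
	uint32_t sid_index;      /* index into the nexthop's SRGB */
	struct sr_node *nexthop; /* NULL until resolved */
	uint32_t label_out;      /* 0 means "not configured" in this sketch */
};

/* Stand-in for compute_nhlfe(): resolve the nexthop and derive the label. */
static void compute_nhlfe_sketch(struct sr_prefix *p, struct sr_node *resolved)
{
	p->nexthop = resolved;
	p->label_out = resolved->srgb_lower + p->sid_index;
}

/* Stand-in for update_out_nhlfe(), run when a neighbor's SRGB changes. */
static void update_out_nhlfe_sketch(struct sr_prefix *p,
				    struct sr_node *neighbor)
{
	if (p->nexthop == NULL) {
		/* Scenario 2: nexthop never resolved, do the full computation. */
		compute_nhlfe_sketch(p, neighbor);
		return;
	}

	/* Scenario 1: nexthop already known, only refresh the label. */
	if (p->nexthop == neighbor)
		p->label_out = neighbor->srgb_lower + p->sid_index;
}
```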
Signed-off-by: Olivier Dugeon <olivier.dugeon@orange.com>
zebra_mpls.c uses MTYPE_NH_LABEL, which is defined in both
lib/nexthop.c and zebra/zebra_mpls.c. The usage in zebra_mpls.c
is a realloc of memory allocated by lib/nexthop.c. This leads to a crash:
(gdb) bt
0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=126487246404032) at ./nptl/pthread_kill.c:44
1 __pthread_kill_internal (signo=6, threadid=126487246404032) at ./nptl/pthread_kill.c:78
2 __GI___pthread_kill (threadid=126487246404032, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
3 0x0000730a1b442476 in __GI_raise (sig=6) at ../sysdeps/posix/raise.c:26
4 0x0000730a1b94fb18 in core_handler (signo=6, siginfo=0x7ffeed1e07b0, context=0x7ffeed1e0680) at lib/sigevent.c:268
5 <signal handler called>
6 __pthread_kill_implementation (no_tid=0, signo=6, threadid=126487246404032) at ./nptl/pthread_kill.c:44
7 __pthread_kill_internal (signo=6, threadid=126487246404032) at ./nptl/pthread_kill.c:78
8 __GI___pthread_kill (threadid=126487246404032, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
9 0x0000730a1b442476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
10 0x0000730a1b4287f3 in __GI_abort () at ./stdlib/abort.c:79
11 0x0000730a1b9984f5 in _zlog_assert_failed (xref=0x730a1ba59480 <_xref.16>, extra=0x0) at lib/zlog.c:789
12 0x0000730a1b8f8908 in mt_count_free (mt=0x576e0edda520 <MTYPE_NH_LABEL>, ptr=0x576e36617b80) at lib/memory.c:74
13 0x0000730a1b8f8a59 in qrealloc (mt=0x576e0edda520 <MTYPE_NH_LABEL>, ptr=0x576e36617b80, size=16) at lib/memory.c:112
14 0x0000576e0ec85e2e in nhlfe_out_label_update (nhlfe=0x576e368895f0, nh_label=0x576e3660e9b0) at zebra/zebra_mpls.c:1462
15 0x0000576e0ec833ff in lsp_install (zvrf=0x576e3655fb50, label=17, rn=0x576e366197c0, re=0x576e3660a590) at zebra/zebra_mpls.c:224
16 0x0000576e0ec87c34 in zebra_mpls_lsp_install (zvrf=0x576e3655fb50, rn=0x576e366197c0, re=0x576e3660a590) at zebra/zebra_mpls.c:2215
17 0x0000576e0ecbb427 in rib_process_update_fib (zvrf=0x576e3655fb50, rn=0x576e366197c0, old=0x576e36619660, new=0x576e3660a590) at zebra/zebra_rib.c:1084
18 0x0000576e0ecbc230 in rib_process (rn=0x576e366197c0) at zebra/zebra_rib.c:1480
19 0x0000576e0ecbee04 in process_subq_route (lnode=0x576e368e0270, qindex=8 '\b') at zebra/zebra_rib.c:2661
20 0x0000576e0ecc0711 in process_subq (subq=0x576e3653fc80, qindex=META_QUEUE_BGP) at zebra/zebra_rib.c:3226
21 0x0000576e0ecc07f9 in meta_queue_process (dummy=0x576e3653fae0, data=0x576e3653fb80) at zebra/zebra_rib.c:3265
22 0x0000730a1b97d2a9 in work_queue_run (thread=0x7ffeed1e3f30) at lib/workqueue.c:282
23 0x0000730a1b96b039 in event_call (thread=0x7ffeed1e3f30) at lib/event.c:1996
24 0x0000730a1b8e4d2d in frr_run (master=0x576e36277e10) at lib/libfrr.c:1232
25 0x0000576e0ec35ca9 in main (argc=7, argv=0x7ffeed1e4208) at zebra/main.c:536
Clearly replacing a label stack is an operation that should be owned by
lib/nexthop.c. So let's move this function there and have
zebra_mpls.c just call it to replace the label stack.
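A toy model of why two memory types with the same name cannot be mixed,
assuming (as the backtrace suggests) that the per-type allocation count is
checked on free and that realloc releases the old block before counting the
new one. The struct, the names and the bool return in place of the real
assert are all hypothetical simplifications:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced memory type: just a name and an allocation count. */
struct memtype_sketch {
	const char *name;
	size_t n_alloc;
};

/* Same name, but two distinct objects with two distinct counters. */
static struct memtype_sketch lib_nh_label = { "NH label", 0 };
static struct memtype_sketch zebra_nh_label = { "NH label", 0 };

static void count_alloc(struct memtype_sketch *mt)
{
	mt->n_alloc++;
}

/* Returns false where the real accounting would hit its assert. */
static bool count_free(struct memtype_sketch *mt)
{
	if (mt->n_alloc == 0)
		return false;

	mt->n_alloc--;
	return true;
}

/* realloc accounting: release the old block, then count the new one. */
static bool count_realloc(struct memtype_sketch *mt)
{
	if (!count_free(mt))
		return false;

	count_alloc(mt);
	return true;
}
```

Allocating under lib_nh_label and then calling count_realloc() on
zebra_nh_label fails because the second counter is still zero; doing both
operations under the same type succeeds.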
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
If the desired state is the same - do nothing instead of resetting once again.
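As an illustration only, with a hypothetical struct and function name rather
than the real bgpd ones, the early return looks roughly like this:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the BGP instance state touched here. */
struct bgp_sketch {
	bool suppress_fib_pending;
	int resets; /* counts how often peering was reset */
};

/* Apply the desired state; reset peering only if it actually changes. */
static void set_suppress_fib_sketch(struct bgp_sketch *bgp, bool enable)
{
	if (bgp->suppress_fib_pending == enable)
		return; /* already in the desired state, nothing to do */

	bgp->suppress_fib_pending = enable;
	bgp->resets++;
}
```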
Fixes: bdb5ae8bce ("bgpd: Make suppress-fib-pending clear peering")
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>