Commit Graph

33543 Commits

Author SHA1 Message Date
Donatas Abraitis
f48f2de17c
Merge pull request #14499 from qlyoung/fix-doc-whitespace-toctree
fix various developer doc issues
2023-09-28 12:36:19 +03:00
Donatas Abraitis
0af4541576
Merge pull request #14498 from idryzhov/fix-conf-t-file-lock
Fixes for `file-lock` mode of configuration node
2023-09-28 10:03:06 +03:00
Quentin Young
f71f078023 doc: add .readthedocs.yaml configs
As of Sep 25 2023, RTD projects require config files to build. This
patch is necessary for docs to continue to build.

Signed-off-by: Quentin Young <qlyoung@qlyoung.net>
2023-09-27 20:16:16 -04:00
Quentin Young
e45651fbd0 doc: include checkpatch & cspf docs in toctree
The documentation pages on checkpatch and CSPF were not reachable
because they were not included in any toctree. Include them in the tree!

Signed-off-by: Quentin Young <qlyoung@qlyoung.net>
2023-09-27 19:55:35 -04:00
Quentin Young
d2292c6bfe doc: fix whitespace, formatting errors
* Fix various whitespace and syntax errors
* Fix a couple tiny grammar mistakes

Signed-off-by: Quentin Young <qlyoung@qlyoung.net>
2023-09-27 19:55:35 -04:00
Igor Ryzhov
1a09cf3894 vtysh: fix entering configuration node in file-lock mode
When the config node is entered in file-lock mode, we should actually
remember it to correctly apply the workaround in `vtysh_exit`.
Otherwise, the file-lock mode is dropped once we exit any node one level
below the config node.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2023-09-28 02:45:05 +03:00
Igor Ryzhov
d3aa9adb8d vty: fix working in file-lock mode
When the configuration node is entered in file-lock mode, the candidate
and running datastores are locked. Any configuration change is followed
by an implicit commit, which crashes mgmtd because a double lock is
prohibited by an assert. When working in file-lock mode, we shouldn't do
implicit commits; they are disabled by allowing pending configuration
changes.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2023-09-28 02:41:16 +03:00
Jafar Al-Gharaibeh
52cc7f1006
Merge pull request #14222 from opensourcerouting/doc/debian12
[DOC] Debian 12
2023-09-27 17:46:40 -05:00
Jafar Al-Gharaibeh
f5820215f2
Merge pull request #14495 from opensourcerouting/fix/update_releases_table
doc: Fix release dates in workflow
2023-09-27 17:45:05 -05:00
Igor Ryzhov
b8ebb7fc62 vty: fix configure terminal argument descriptions
The "terminal" and "file-lock" descriptions are mixed up.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2023-09-27 23:34:53 +03:00
Donald Sharp
60c38a99ac
Merge pull request #14342 from fdumontet6WIND/fix_crash_snmp
bgpd: fix crash in "bgpv2PeerErrorsTable"
2023-09-27 15:25:38 -04:00
Donatas Abraitis
fb5f11ae67 doc: Use backward order for release dates hint
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-09-27 16:49:56 +03:00
Donatas Abraitis
21d718aa6c doc: Fix release dates in workflow
Align to the release rules:

Releases are scheduled in a 4-month cycle on the first Tuesday each March/July/November.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-09-27 16:45:01 +03:00
Donald Sharp
bb308b1efc
Merge pull request #14482 from opensourcerouting/fix/walltime_threshold_disable
lib: Drop deprecated enable-time-check, enable-cpu-time compile options
2023-09-27 06:32:11 -04:00
Igor Ryzhov
7a8b1875c5 mgmtd: fix crash on "show mgmtd datastore-contents"
When the command is called without specifying the datastore, it crashes.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2023-09-27 13:22:01 +03:00
Martin Winter
e1996b3f4a
doc: Add Debian 12 Build documentation
Signed-off-by: Rodrigo Nardi <rnardi@netdef.org>
Signed-off-by: Martin Winter <mwinter@opensourcerouting.org>
2023-09-26 17:44:11 +02:00
Russ White
f289533d5d
Merge pull request #14447 from marcos-ng/master
doc: reference the correct MGMTd show command
2023-09-26 11:43:17 -04:00
Russ White
dbd08a31cc
Merge pull request #14356 from Keelan10/ospf_external_aggregator-leak
ospfd: Fix External Aggregator Leak
2023-09-26 10:18:08 -04:00
Russ White
8e755a03a3
Merge pull request #12649 from louis-6wind/bgp-link-state
bgpd: add basic support of BGP Link-State RFC7752
2023-09-26 10:07:02 -04:00
Donatas Abraitis
2853f14d05 bgpd: Set the TTL for the correct socket
When we accept a connection, we try to set TTL for the socket, but the socket
is not yet created/assigned and we are trying to set it on the wrong socket fd.

```
[Event] connection from 127.0.0.1 fd 25, active peer status 3 fd -1
can't set sockopt IP_TTL 255 to socket -1
bgp_set_socket_ttl: Can't set TxTTL on peer (rtrid 0.0.0.0) socket, err = 9
Unable to set min/max TTL on peer 127.0.0.1, Continuing
```

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-09-25 22:25:32 +03:00
Donald Sharp
c0a681eed5
Merge pull request #14487 from opensourcerouting/fix/doc_bullet_new_line_missing
Some recent documentation adjustments
2023-09-25 09:57:00 -04:00
Donald Sharp
646895a565
Merge pull request #14484 from opensourcerouting/coverity-20230924
lib: assert for VTY_PASSFD expectations
2023-09-25 09:52:23 -04:00
Donatas Abraitis
cd1dc02f89 doc: Use different label to distinguish PBR nexthop groups
/root/frr/doc/user/pbr.rst:32: WARNING: duplicate label nexthop-groups, other instance in /root/frr/doc/user/nexthop_groups.rst

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-09-25 09:33:02 +03:00
Donatas Abraitis
99ccb3a590 doc: Replace frr code highlighting marker to sh
No such thing exists.

 /root/frr/doc/user/ospfd.rst:624: WARNING: Cannot analyze code. No Pygments lexer found for "frr".

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-09-25 09:28:42 +03:00
Donatas Abraitis
d677be63f8 doc: Drop bullet point from ospfd documentation
/root/frr/doc/user/ospfd.rst:609: WARNING: Bullet list ends without a blank line; unexpected unindent.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-09-25 09:27:21 +03:00
David Lamparter
ee5dd0a081 lib: assert for VTY_PASSFD expectations
Coverity is complaining that vty->state could be VTY_PASSFD here.  It
can't, it really shouldn't, and if it actually is then something went
seriously wrong somewhere earlier so assert()ing out is the best thing
to do.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2023-09-24 20:14:37 +02:00
Donatas Abraitis
1c829fac8e
Merge pull request #14467 from cscarpitta/bugfix/fix-srv6-isis-memleaks
isisd: Fix memory leaks when IS-IS fails to process an SRv6 locator chunk
2023-09-24 20:47:15 +03:00
Donatas Abraitis
56d8305481
Merge pull request #14473 from cscarpitta/bugfix/fix-srv6-topotest-warning
tests: Fix DeprecationWarning in SRv6 L3VPN topotest
2023-09-24 20:47:07 +03:00
Donatas Abraitis
fd0fe0bb6a lib: Drop deprecated enable-time-check, enable-cpu-time compile options
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-09-24 20:41:24 +03:00
Donatas Abraitis
a2a9733fec
Merge pull request #14468 from donaldsharp/bgp_send_ordering
bgpd: Ensure send order is 100% consistent
2023-09-24 16:48:44 +03:00
Donald Sharp
9d9c6dc01e
Merge pull request #14476 from anlancs/fix/pimd-remove-fd-close
pimd: remove redundant closing socket
2023-09-23 18:43:43 -04:00
Donald Sharp
e0b37a21be
Merge pull request #14475 from opensourcerouting/fix/unset_per_afi_stuff_when_dynamic_UNSET_received
Clear per afi/safi stuff for GR/LLGR when dynamic capability with UNSET action received
2023-09-23 09:51:47 -04:00
Donald Sharp
7d12e26121
Merge pull request #14464 from opensourcerouting/fix/dampening_crash
bgpd: Fix dampening info crash
2023-09-23 09:51:01 -04:00
Donald Sharp
4f0db0daaf
Merge pull request #14470 from opensourcerouting/fix/rewrite_dynamic_capabilities_tests
tests: Improve BGP dynamic capability tests
2023-09-23 09:50:43 -04:00
anlan_cs
411e16a1c7 pimd: remove redundant closing socket
The socket is already closed inside `ssmpingd_setsockopt()` in the failure
cases, so remove the redundant close from the outer layer.

Signed-off-by: anlan_cs <anlan_cs@tom.com>
2023-09-23 21:06:32 +08:00
Donatas Abraitis
61bd60b984 bgpd: Flush per AFI/SAFI capabilities flags, stale_time for LLGR cap
Clear to defaults if receiving dynamic capability with UNSET action.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-09-22 20:50:07 +03:00
Donatas Abraitis
f793136d18 bgpd: Clear graceful-restart per AFI/SAFI capability flags when receiving unset
We flushed the main capability received flag, but missed flushing per AFI/SAFI.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-09-22 20:50:06 +03:00
Carmine Scarpitta
71ed1868d6 tests: Fix DeprecationWarning in SRv6 L3VPN topotest
Fix the following warning:

tests/topotests/bgp_srv6l3vpn_sid/test_bgp_srv6l3vpn_sid.py:42
  /media/SharedUTM/workspace/frr/tests/topotests/bgp_srv6l3vpn_sid/test_bgp_srv6l3vpn_sid.py:42: DeprecationWarning: invalid escape sequence '\ '

In test_bgp_srv6l3vpn_sid.py we have a comment containing some '\'
characters. Python mistakenly tries to interpret such "\" characters
as escape sequences, which leads to the above warning.

Let's tell Python to treat the comment as a raw string,
so that it simply treats backslashes as literal characters rather than
escape sequences.

Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
2023-09-22 18:43:42 +02:00
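The raw-string fix described in 71ed1868d6 can be reproduced outside the test suite. The following is a minimal sketch, not the topotest itself: `plain_src`, `raw_src` and `escape_warnings` are hypothetical names, and the docstring text is made up. It compiles two one-line modules and records whether Python complains about a backslash followed by a space. Depending on the Python version the warning category is DeprecationWarning or SyntaxWarning, so the sketch matches on the message text only.

```python
import warnings

# Source of two tiny modules: the same docstring, without and with the r prefix.
# The doubled backslash here yields a single backslash in the compiled source.
plain_src = '"""r1 \\ r2"""\n'
raw_src = 'r"""r1 \\ r2"""\n'


def escape_warnings(source):
    """Compile source and return any invalid-escape warnings it triggers."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<example>", "exec")
    return [w for w in caught if "invalid escape sequence" in str(w.message)]


plain_hits = escape_warnings(plain_src)  # the plain docstring is flagged
raw_hits = escape_warnings(raw_src)      # the raw docstring is accepted silently
```

Marking the string as raw leaves its text unchanged; it only tells the parser not to look for escape sequences in it.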
Rafael Zalamena
aed94c8096 lib: don't announce prefix delete for duplicates
When deleting a duplicated prefix list entry, don't announce the change
to route map listeners. Otherwise they will remove rules that shouldn't
be removed, and a prefix that still exists in the prefix-list will no
longer be evaluated.

Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
2023-09-22 13:03:28 -03:00
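The idea behind aed94c8096 can be pictured with a small model. This is a sketch under stated assumptions, not FRR's implementation: the `PrefixList` class, its `copies` counter and the `on_delete` callback are all hypothetical, and counting copies is just one way to express "only announce when the last entry for a prefix goes away".

```python
from collections import Counter


class PrefixList:
    """Toy prefix list that may hold the same prefix more than once."""

    def __init__(self, on_delete):
        self.copies = Counter()     # prefix -> number of entries holding it
        self.on_delete = on_delete  # hypothetical route-map listener callback

    def add(self, prefix):
        self.copies[prefix] += 1

    def delete(self, prefix):
        if self.copies[prefix] == 0:
            return  # nothing to delete
        self.copies[prefix] -= 1
        if self.copies[prefix] == 0:
            # Last copy removed: only now do listeners hear about it.
            del self.copies[prefix]
            self.on_delete(prefix)


announced = []
plist = PrefixList(on_delete=announced.append)
plist.add("10.0.0.0/24")
plist.add("10.0.0.0/24")     # duplicate entry
plist.delete("10.0.0.0/24")  # a copy remains, so nothing is announced
after_first_delete = list(announced)
plist.delete("10.0.0.0/24")  # last copy gone, so the delete is announced
```

In this model the listeners keep their rule for the prefix for as long as any entry still carries it.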
Rafael Zalamena
71fb99d22e Revert "lib : fix duplicate prefix list delete"
This reverts commit 394ed767e7.
2023-09-22 12:24:16 -03:00
Donald Sharp
eceb1cab6d
Merge pull request #14450 from kuldeepkash/general_fixes
tests: Adding BGP convergence verification before starting PIM tests
2023-09-22 09:53:03 -04:00
Donald Sharp
1adbce9b1d
Merge pull request #14458 from opensourcerouting/fix/update_doc_for_vtysh
doc: domainname MUST be manually written to vtysh.conf also
2023-09-22 09:51:01 -04:00
Donald Sharp
f327f2e8ae
Merge pull request #14463 from mjstapp/fix_bgp_ctime_r
bgpd: fix return of local from ctime_r
2023-09-22 09:47:33 -04:00
Donald Sharp
45c2d514db
Merge pull request #14466 from mjstapp/fix_ospfd_snmp_ptrs
bgpd, ospfd: fix some dicey pointer arith in snmp modules
2023-09-22 09:46:52 -04:00
Donatas Abraitis
fa5783bbab tests: Check notification/capability received message stats instead of reset/established
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-09-22 15:50:27 +03:00
Donatas Abraitis
64b4a93d81 tests: Use frr.conf for bgp_dynamic_capability tests
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-09-22 15:20:22 +03:00
Donatas Abraitis
e0a8795484 bgpd: Use proper AFI when dumping information for dampening stuff
Previously, the IPv4 AFI was used when dumping IPv6 dampening info.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-09-22 12:04:17 +03:00
Donatas Abraitis
c39506d80f bgpd: Initialise timebuf arrays to zeros for dampening reuse timer
Avoid having something like this in outputs:

Before:
```
munet> r1 shi vtysh -c 'show bgp dampening damp'
BGP table version is 10, local router ID is 10.10.10.1, vrf id 0
Default local pref 100, local AS 65001
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

   Network          From             Reuse    Path
 *d 2001:db8:1::1/128
                    2001:db8::2      (null) 65002 ?
 *d 2001:db8:2::1/128
                    2001:db8::2      (null) 65002 ?
 *d 2001:db8:3::1/128
                    2001:db8::2      (null) 65002 ?
 *d 2001:db8:4::1/128
                    2001:db8::2      (null) 65002 ?
 *d 2001:db8:5::1/128
                    2001:db8::2      (null) 65002 ?

Displayed  5 routes and 5 total paths

munet> r1 shi vtysh -c 'show bgp dampening flap'
BGP table version is 10, local router ID is 10.10.10.1, vrf id 0
Default local pref 100, local AS 65001
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

   Network          From            Flaps Duration Reuse    Path
 *d 2001:db8:1::1/128
                    2001:db8::2     2    00:03:10 (null) 65002 ?
 *d 2001:db8:2::1/128
                    2001:db8::2     2    00:03:10 (null) 65002 ?
 *d 2001:db8:3::1/128
                    2001:db8::2     2    00:03:10 (null) 65002 ?
 *d 2001:db8:4::1/128
                    2001:db8::2     2    00:03:10 (null) 65002 ?
 *d 2001:db8:5::1/128
                    2001:db8::2     2    00:03:10 (null) 65002 ?

Displayed  5 routes and 5 total paths
```

After:

```
munet> r1 shi vtysh -c 'show bgp dampening damp '
BGP table version is 10, local router ID is 10.10.10.1, vrf id 0
Default local pref 100, local AS 65001
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

   Network          From             Reuse    Path
 *d 2001:db8:1::1/128
                    2001:db8::2      00:00:00 65002 ?
 *d 2001:db8:2::1/128
                    2001:db8::2      00:00:00 65002 ?
 *d 2001:db8:3::1/128
                    2001:db8::2      00:00:00 65002 ?
 *d 2001:db8:4::1/128
                    2001:db8::2      00:00:00 65002 ?
 *d 2001:db8:5::1/128
                    2001:db8::2      00:00:00 65002 ?

Displayed  5 routes and 5 total paths

munet> r1 shi vtysh -c 'show bgp dampening flap'
BGP table version is 10, local router ID is 10.10.10.1, vrf id 0
Default local pref 100, local AS 65001
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

   Network          From            Flaps Duration Reuse    Path
 *d 2001:db8:1::1/128
                    2001:db8::2     2    00:00:15 00:00:00 65002 ?
 *d 2001:db8:2::1/128
                    2001:db8::2     2    00:00:15 00:00:00 65002 ?
 *d 2001:db8:3::1/128
                    2001:db8::2     2    00:00:15 00:00:00 65002 ?
 *d 2001:db8:4::1/128
                    2001:db8::2     2    00:00:15 00:00:00 65002 ?
 *d 2001:db8:5::1/128
                    2001:db8::2     2    00:00:15 00:00:00 65002 ?

Displayed  5 routes and 5 total paths
```

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-09-22 12:04:17 +03:00
Donatas Abraitis
14d8590688 bgpd: Make sure dampening is enabled for the specified AFI/SAFI
```
(gdb) bt
0  raise (sig=sig@entry=11) at ../sysdeps/unix/sysv/linux/raise.c:50
1  0x00007f55897c6ab0 in core_handler (signo=11, siginfo=0x7ffd19764bb0, context=<optimized out>) at lib/sigevent.c:246
2  <signal handler called>
3  0x00005624ccabdee9 in bgp_get_reuse_time (penalty=<optimized out>, buf=buf@entry=0x7ffd19765590 "", len=len@entry=25, afi=afi@entry=AFI_IP, safi=safi@entry=SAFI_UNICAST, use_json=<optimized out>, json=0x0)
    at bgpd/bgp_damp.c:498
4  0x00005624ccabf5e7 in bgp_damp_reuse_time_vty (vty=vty@entry=0x5624ce484e30, path=path@entry=0x5624cdd797a0, timebuf=timebuf@entry=0x7ffd19765590 "", len=len@entry=25, afi=afi@entry=AFI_IP,
    safi=safi@entry=SAFI_UNICAST, use_json=false, json=0x0) at bgpd/bgp_damp.c:635
5  0x00005624cca146a9 in damp_route_vty_out (afi=AFI_IP, json_paths=0x0, use_json=false, safi=SAFI_UNICAST, display=<optimized out>, path=0x5624cdd797a0, p=0x5624ce3f3160, vty=0x5624ce484e30)
    at bgpd/bgp_route.c:9852
6  bgp_show_table (vty=0x5624ce484e30, bgp=0x5624ce400950, safi=safi@entry=SAFI_UNICAST, table=0x5624ce409300, type=type@entry=bgp_show_type_dampend_paths, output_arg=0x0, rd=0x0, is_last=1, output_cum=0x0,
    total_cum=0x0, json_header_depth=0x7ffd19765830, show_flags=0, rpki_target_state=RPKI_NOT_BEING_USED) at bgpd/bgp_route.c:11448
7  0x00005624cca15f74 in bgp_show (vty=vty@entry=0x5624ce484e30, bgp=<optimized out>, afi=<optimized out>, safi=<optimized out>, type=type@entry=bgp_show_type_dampend_paths, output_arg=output_arg@entry=0x0,
    show_flags=0, rpki_target_state=RPKI_NOT_BEING_USED) at bgpd/bgp_route.c:11702
8  0x00005624cca17679 in show_ip_bgp_magic (self=<optimized out>, viewvrfname=<optimized out>, aa_nn=<optimized out>, community_list=<optimized out>, community_list_str=<optimized out>,
    community_list_name=<optimized out>, as_path_filter_name=<optimized out>, prefix_list=<optimized out>, accesslist_name=<optimized out>, rmap_name=<optimized out>, version=<optimized out>,
    version_str=<optimized out>, alias_name=<optimized out>, wide=<optimized out>, detail_json=<optimized out>, uj=<optimized out>, detail_routes=<optimized out>, all=<optimized out>, argv=0x5624ce3f32f0,
    argc=<optimized out>, vty=0x5624ce484e30) at bgpd/bgp_route.c:12863
9  show_ip_bgp (self=<optimized out>, vty=<optimized out>, argc=<optimized out>, argv=0x5624ce3f32f0) at ./bgpd/bgp_route_clippy.c:514
10 0x00007f55897618ee in cmd_execute_command_real (vline=vline@entry=0x5624ce427020, vty=vty@entry=0x5624ce484e30, cmd=cmd@entry=0x0, up_level=up_level@entry=0) at lib/command.c:993
11 0x00007f5589761a91 in cmd_execute_command (vline=vline@entry=0x5624ce427020, vty=vty@entry=0x5624ce484e30, cmd=0x0, vtysh=vtysh@entry=0) at lib/command.c:1051
12 0x00007f5589761c30 in cmd_execute (vty=vty@entry=0x5624ce484e30, cmd=cmd@entry=0x5624ce47b1b0 "show bgp dampening damp", matched=matched@entry=0x0, vtysh=vtysh@entry=0) at lib/command.c:1218
13 0x00007f55897de95e in vty_command (vty=vty@entry=0x5624ce484e30, buf=<optimized out>) at lib/vty.c:591
14 0x00007f55897deb9d in vty_execute (vty=0x5624ce484e30) at lib/vty.c:1354
15 0x00007f55897e23eb in vtysh_read (thread=<optimized out>) at lib/vty.c:2362
16 0x00007f55897d9426 in event_call (thread=thread@entry=0x7ffd19767e70) at lib/event.c:1971
17 0x00007f5589789df8 in frr_run (master=0x5624cdc42100) at lib/libfrr.c:1213
18 0x00005624cc985f65 in main (argc=<optimized out>, argv=0x7ffd197680d8) at bgpd/bgp_main.c:510
(gdb) frame 4
(gdb) p damp[1][1]
$4 = {suppress_value = 0, reuse_limit = 0, max_suppress_time = 0, half_life = 0, tmax = 0, reuse_list_size = 0, reuse_index_size = 0, ceiling = 0, decay_rate_per_tick = 0, decay_array_size = 0,
  scale_factor = 0, reuse_scale_factor = 0, decay_array = 0x0, reuse_index = 0x0, reuse_list = 0x0, reuse_offset = 0, no_reuse_list = 0x0, t_reuse = 0x0, afi = AFI_UNSPEC, safi = SAFI_UNSPEC}
(gdb) p damp[2][1]
$5 = {suppress_value = 1, reuse_limit = 1, max_suppress_time = 1800, half_life = 60, tmax = 0, reuse_list_size = 181, reuse_index_size = 1024, ceiling = 1073741824, decay_rate_per_tick = 0,
  decay_array_size = 360, scale_factor = 9.5367431729442842e-07, reuse_scale_factor = 0, decay_array = 0x5624ce483780, reuse_index = 0x5624ce481320, reuse_list = 0x5624ce482c20, reuse_offset = 7,
  no_reuse_list = 0x0, t_reuse = 0x5624ce3ec840, afi = AFI_UNSPEC, safi = SAFI_UNSPEC}
(gdb)
```

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-09-22 12:04:17 +03:00
Donald Sharp
a7a7fa57fe bgpd: Ensure send order is 100% consistent
When BGP sends updates to peers on a neighbor up event, it was noticed
that the updates sent to the first peer were in reverse order.

Imagine r1 -- r2 -- r3.  r1 and r2 are ebgp peers and
r2 and r3 are ebgp peers.  r1's interface to r2 is currently
shut down.  Prior to this fix the send order would look like this:

r1 sends its routes to r2, and r2 installs them in the order received:

10.0.0.12 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.11 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.10 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.9 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.8 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.7 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.6 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.5 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.4 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.3 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.2 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.1 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20

r2 would then send these routes to r3 and then they would be installed
in order received:

10.0.0.1 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.2 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.3 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.4 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.5 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.6 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.7 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.8 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.9 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.10 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.11 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.12 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20

Not that big of a deal, right?  Well, imagine a situation where r1 is
originating several tens of thousands of routes.  It sends routes to r2;
r2 is processing those routes in reverse order and, at the same time,
sending routes to r3 in the correct order of the bgp table.

r3 will have the early 10.0.0.1/32 routes installed and start forwarding
while r2 will not have those routes installed yet (since they were at the
end and zebra is slightly slower at processing routes than bgp is).

Ensure that the order sent is a true FIFO.  There is an update fifo
which stores all routes.  Off that FIFO hangs a bgp advertise attribute
list, which stores the prefixes that share the same attribute so that
they can be packed more efficiently.  This list was being stored in
reverse order, causing the problem for the initial send.  When adding
items to this list, put them at the end so we keep the fifo order that
is traversed when we walk through the bgp table.

After the fix:

r2 installation order:

10.0.0.0 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.1 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.2 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.3 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.4 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.5 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.6 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.7 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.8 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.9 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.10 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.11 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20
10.0.0.12 nhid 39 via 192.168.8.2 dev leaf2-eth5 proto bgp metric 20

r3 installation order:

10.0.0.0 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.1 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.2 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.3 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.4 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.5 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.6 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.7 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.8 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.9 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.10 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.11 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20
10.0.0.12 nhid 12 via 192.168.61.2 dev spine2-eth1 proto bgp metric 20

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-09-21 15:30:08 -04:00
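The ordering problem described in a7a7fa57fe comes down to adding at the head of a list versus adding at the tail. The following is a minimal sketch of that difference only: it uses Python's `collections.deque` in place of FRR's own list code, and `build_adv_list`, `table_walk` and the `append_at_tail` flag are hypothetical names introduced for illustration.

```python
from collections import deque


def build_adv_list(prefixes, append_at_tail):
    """Queue prefixes that share one attribute, in table-walk order."""
    adv = deque()
    for prefix in prefixes:
        if append_at_tail:
            adv.append(prefix)      # tail insert keeps FIFO order
        else:
            adv.appendleft(prefix)  # head insert reverses the walk order
    return list(adv)


# Prefixes in the order a walk of the bgp table would visit them.
table_walk = [f"10.0.0.{i}" for i in range(1, 13)]

before_fix = build_adv_list(table_walk, append_at_tail=False)
after_fix = build_adv_list(table_walk, append_at_tail=True)
```

With head insertion the first prefix sent is 10.0.0.12, matching the reversed listing in the commit message; with tail insertion the send order equals the table-walk order.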