Commit Graph

33639 Commits

Author SHA1 Message Date
Russ White
b6b0001a4c
Merge pull request #14540 from opensourcerouting/feature/bgpd_handle_fqdn_capability_via_dynamic_capability
bgpd: Handle FQDN capability using dynamic capabilities
2023-10-24 06:23:32 -04:00
Russ White
27a78f80d8
Merge pull request #13979 from gpnaveen/bgp_unique_rid
tests: Adding a bgp router id chaos test case.
2023-10-24 06:09:58 -04:00
Donatas Abraitis
e8cdfa2761
Merge pull request #14629 from mjstapp/zebra_debug_netlink_ifname
zebra: debug ifname in netlink link debugs
2023-10-24 10:09:45 +03:00
Donatas Abraitis
614d7873d5
Merge pull request #14634 from LabNConsulting/chopps/gdb-use-emacs
tests: add --gdb-use-emacs option
2023-10-24 08:58:40 +03:00
Donald Sharp
a272a2b364 zebra: Allow longer prefix matches for nexthops
Zebra currently resolves a nexthop using only the most specific
matching route.  This is typically an ok thing to do but fails in
several specific scenarios.  If a nexthop matches a route that is
not usable, nexthop resolution just gives up and refuses to use that
particular route.  For example, suppose zebra currently has a covering
prefix, say 10.0.0.0/8, and at about the same time it receives a
10.1.0.0/16 (more specific than the /8) and another route A, whose
nexthop is 10.1.1.1.  Imagine the 10.1.0.0/16 is processed far enough
to know we want to install it, and the prefix is sent to the dataplane
for installation (it is queued).  If route A is then processed, nexthop
resolution will fail and route A will be left in limbo as uninstallable.

Let's modify the nexthop resolution code in zebra such that
if a nexthop's most specific match is unusable, we continue looking
up the table until we get to the 0.0.0.0/0 route (if it's even
installed).  If we find a usable route for the nexthop, accept
it and use it.
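
The fallback described above can be sketched roughly as follows. This is
a minimal illustration in Python, not zebra's C implementation: the
`resolve_nexthop` function and the `routes` mapping of prefix to a usable
flag are hypothetical names, and real route state is richer than a single
boolean.

```python
import ipaddress


def resolve_nexthop(nexthop, routes):
    """Return the most specific *usable* prefix covering nexthop, or None.

    routes maps a prefix string to True/False (usable or not). This is a
    hypothetical stand-in for zebra's route table, for illustration only.
    """
    addr = ipaddress.ip_address(nexthop)
    covering = []
    for prefix, usable in routes.items():
        net = ipaddress.ip_network(prefix)
        if addr in net:
            covering.append((net, usable))
    # Most specific match first, then walk up towards 0.0.0.0/0.
    covering.sort(key=lambda item: item[0].prefixlen, reverse=True)
    for net, usable in covering:
        if usable:
            return str(net)
    return None


routes = {
    "0.0.0.0/0": True,
    "10.0.0.0/8": True,
    "10.1.0.0/16": False,  # queued for installation, not usable yet
}
print(resolve_nexthop("10.1.1.1", routes))  # 10.0.0.0/8
```

With only the most specific match considered, the lookup would stop at
10.1.0.0/16 and route A would stay unresolved.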

The bgp_default_originate topology test is frequently failing
with this exact problem:

B>* 0.0.0.0/0 [200/0] via 192.168.1.1, r2-r1-eth0, weight 1, 00:00:21
B   1.0.1.17/32 [200/0] via 192.168.0.1 inactive, weight 1, 00:00:21
B>* 1.0.2.17/32 [200/0] via 192.168.1.1, r2-r1-eth0, weight 1, 00:00:21
C>* 1.0.3.17/32 is directly connected, lo, 00:02:00
B>* 1.0.5.17/32 [20/0] via 192.168.2.2, r2-r3-eth1, weight 1, 00:00:32
B>* 192.168.0.0/24 [200/0] via 192.168.1.1, r2-r1-eth0, weight 1, 00:00:21
B   192.168.1.0/24 [200/0] via 192.168.1.1 inactive, weight 1, 00:00:21
C>* 192.168.1.0/24 is directly connected, r2-r1-eth0, 00:02:00
C>* 192.168.2.0/24 is directly connected, r2-r3-eth1, 00:02:00
B>* 192.168.3.0/24 [20/0] via 192.168.2.2, r2-r3-eth1, weight 1, 00:00:32
B   198.51.1.1/32 [200/0] via 192.168.0.1 inactive, weight 1, 00:00:21
B>* 198.51.1.2/32 [20/0] via 192.168.2.2, r2-r3-eth1, weight 1, 00:00:32

Notice that the 1.0.1.17/32 route is inactive but the nexthop
192.168.0.1 is covered by both the 192.168.0.0/24 prefix (the most
specific match) *and* the 0.0.0.0/0 route (the least specific match).
The logs show that the 1.0.1.17/32 route was not being installed
because the matching route was not in a usable state: the
192.168.0.0/24 route was still in the process of being installed.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-10-23 08:15:11 -04:00
Donald Sharp
01d84db046
Merge pull request #14628 from opensourcerouting/fix/bgpd_conditional_advertisement_static_routes_withdrawn
bgpd: Do not suppress conditional advertisement updates if triggered
2023-10-23 07:41:07 -04:00
Christian Hopps
a921202a85 tests: add --gdb-use-emacs option
When specified, `--gdb-use-emacs` launches the daemon with gdb inside a
running emacs server using `emacsclient --eval` commands.

Signed-off-by: Christian Hopps <chopps@labn.net>
2023-10-23 05:11:32 -04:00
Donatas Abraitis
571b403519
Merge pull request #14631 from idryzhov/nb-remove-comment
lib: remove incorrect comment from northbound
2023-10-22 11:21:40 +03:00
Igor Ryzhov
a041d3169b lib: remove incorrect comment from northbound
This was true when we had only a CLI for configuration. Now mgmtd has a
public frontend interface that can be used by external applications, and
they can send invalid requests that lead to errors.

This is still true for CLI though, so the same comment still stays in
`nb_cli_apply_changes_internal`.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2023-10-21 13:09:00 +03:00
Mark Stapp
85dc2e85e0 zebra: debug ifname in netlink link debugs
Print the ifname with netlink LINK debug output.

Signed-off-by: Mark Stapp <mjs@labn.net>
2023-10-20 11:20:25 -04:00
Donatas Abraitis
3c94151258 tests: Check if BGP conditional advertisement works fine with static routes
If we modify the prefix-list that is used to define the routes to be
advertised, all of them MUST be advertised.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-10-20 12:58:33 +03:00
Donatas Abraitis
2d8e859585 bgpd: Do not suppress conditional advertisement updates if triggered
If we have a prefix-list with one entry, and after some time we append some
more entries to that prefix-list, conditional advertisement is triggered, and the
old entries are suppressed (because they look identical to what was sent before).

Hence, the old entries are sent as withdrawals and only the new entries are sent as updates.

Force re-sending all BGP updates for conditional advertisement. The same is done
for route-refresh and/or soft clear operations.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-10-20 12:05:45 +03:00
Donatas Abraitis
49d1539a70 doc: Add a new command to resend dynamic capabilities
For now it includes only the FQDN capability, because other capabilities can be
resent using specific knobs.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-10-20 09:36:33 +03:00
Donatas Abraitis
f90ea076da bgpd: Add clear bgp capabilities command to resend some dynamic capabilities
For instance, it's not possible to resend the FQDN capability without resetting
the session, so let's create a more elegant way to do that.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-10-20 09:36:33 +03:00
Donatas Abraitis
03ee1cadd5 bgpd: Handle FQDN capability using dynamic capabilities
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-10-20 09:36:32 +03:00
Donald Sharp
627888864d
Merge pull request #14614 from opensourcerouting/feature/bgpd_handle_orf_capability_via_dynamic_capability
bgpd: Handle ORF capability using dynamic capabilities
2023-10-19 16:01:24 -04:00
Donatas Abraitis
2775d2263a
Merge pull request #14618 from donaldsharp/watchfrr_extend
watchfrr: Extend ignore option to daemon being killed
2023-10-19 18:48:37 +03:00
Donald Sharp
8f839353dc
Merge pull request #14615 from opensourcerouting/fix/rename_test_function_for_bgp_dynamic_capability
tests: Rename test_bgp_dynamic_capability_role
2023-10-19 08:15:15 -04:00
Donald Sharp
3f4bac66d8
Merge pull request #14616 from subsecond/patch-5
doc: add "enforce-first-as" to BGP documentation
2023-10-19 08:14:53 -04:00
Donald Sharp
c168244b99 watchfrr: Extend ignore option to daemon being killed
When testing GR features, it is desirable to kill bgp
(or really any daemon) and not immediately have bgp start up again.
Modify the code to not attempt to restart the daemon, so that
we developers can restart it by hand, when the `watchfrr ignore XXX`
command is issued.

Testing:
watchfrr ignore bgpd
kill -9 bgpd
start bgp by `/usr/lib/frr/watchfrr.sh start bgpd` at some point in time
in the future

leaf-1# show watchfrr
watchfrr global phase: Idle
 Restart Command: "/usr/lib/frr/watchfrr.sh restart %s"
 Start Command: "/usr/lib/frr/watchfrr.sh start %s"
 Stop Command: "/usr/lib/frr/watchfrr.sh stop %s"
 Min Restart Interval: 60
 Max Restart Interval: 600
 Restart Timeout: 90
  zebra                Up
  bgpd                 Up/Ignoring Timeout
  staticd              Up
leaf-1#

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-10-18 14:30:03 -04:00
Philippe Guibert
d3f686d163 zebra: do not accept static label requests conflicting with dynamic-block
A static label allocation should not be accepted if the desired range
conflicts with the configured dynamic-block configuration.

Reject such label requests only when a dynamic block is
configured.
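
As a rough sketch of the check described above (in Python for brevity,
not zebra's C code), a static request can be compared against the
dynamic block with an inclusive range-overlap test. The function name
and parameters are hypothetical.

```python
def static_request_allowed(base, size, dynamic_block=None):
    """Decide whether a static label request may be accepted.

    base/size describe the requested static range; dynamic_block is an
    inclusive (start, end) tuple, or None when no dynamic block is
    configured. All names are illustrative.
    """
    if dynamic_block is None:
        return True  # nothing configured, nothing to conflict with
    block_start, block_end = dynamic_block
    request_end = base + size - 1
    overlaps = base <= block_end and request_end >= block_start
    return not overlaps


print(static_request_allowed(1000, 100, (2000, 3000)))  # True
print(static_request_allowed(1950, 100, (2000, 3000)))  # False
print(static_request_allowed(1950, 100))                # True
```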

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-10-18 17:46:28 +02:00
Louis Scalbert
3cae026428 topotests: add bgp_l3vpn_label_export test
There is no test that checks for the label allocation mechanisms
involved when using BGP and/or LDP.
- Some configuration changes are applied in the BGP configuration,
and the impact is checked on the BGP contexts, and on the label
manager.
- The label manager dynamic range is reconfigured, BGP auto mode
is checked against the new range, along with LDP when restarting.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-10-18 17:46:28 +02:00
Philippe Guibert
c6498ace44 zebra: dump the dynamic-block bounds on vty command
The 'show debugging label-table' needs to dump
dynamic block information.
Display the lower and upper values for the dynamic
block.

> # show debugging label-table json
> {
>   "dynamicBlock":{
>     "lowerBound":16,
>     "upperBound":1048575
>   },
> [..]
> # show debugging label-table
> Dynamic block: lower-bound 16, upper-bound 1048575
> [..]

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-10-18 17:46:25 +02:00
Philippe Guibert
0bd8a16082 zebra: add json support to 'show debugging label-table'
Add the json keyword to dump the label chunks of
the zebra label manager in json format.

>dut# show debugging label-table json
> {
>   "chunks":[
>     {
>       "protocol":"bgp",
>       "instance":0,
>       "sessionId":1,
>       "start":16,
>       "end":16,
>       "dynamic":true
>     },
>     {
>       "protocol":"ldp",
>       "instance":0,
>       "sessionId":1,
>       "start":17,
>       "end":80,
>       "dynamic":true
>     }
>   ]
> }

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-10-18 17:45:29 +02:00
Philippe Guibert
8a400bb70a topotests: bgp_srv6l3vpn_to_bgp_vrf[2,3], ignore tableVersion
The expected tableVersion is wrong, when checking r1 table.

The tableVersion value increments on each route update. The
previous commit brought an additional route update with the
'vpn_leak_postchange_all()' call.

Keep the function call, and do not check the table version
in bgp_srv6l3vpn_to_bgp_vrf[2,3] tests.

Fixes: 205b62ffae2c ("bgpd: fix hardset l3vpn label available in mpls pool")

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-10-18 17:45:29 +02:00
Philippe Guibert
66c85fde7e doc: add 'mpls label dynamic-block' information
Add information on the 'mpls label dynamic-block'
command.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-10-18 17:45:26 +02:00
Manuel Schweizer
3acc6ae932 doc: add "enforce-first-as" to BGP doc
When the global "bgp enforce-first-as" command was deprecated back
in https://github.com/FRRouting/frr/pull/2259, the newly introduced
option to enable that setting on a specific peer was not documented.

This commit adds the necessary documentation and states the command's
default.

Signed-off-by: Manuel Schweizer <manuel.schweizer@cloudscale.ch>
2023-10-18 17:30:39 +02:00
Donatas Abraitis
2c0c11f3e8 bgpd: Handle ORF capability using dynamic capabilities
Add an ability to enable/disable ORF capability dynamically without tearing
down the session.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-10-18 16:56:02 +03:00
Donatas Abraitis
4b843e759b tests: Rename test_bgp_dynamic_capability_role
Was copied, but forgot to rename accordingly.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-10-18 12:31:22 +03:00
Philippe Guibert
dfb56806af topotests: fix bgp_vpnv[4,6]_per_nexthop prefix not updated
The bgp_vpnv[4,6]_table_check() functions analyze the
expected label value of VPN prefixes present in the BGP table.
However, they do not verify that the prefixes exist before doing
this. Consequently, the tests fail if the prefixes do not
show up immediately.
Ensure that all expected VPN prefixes are present before
executing the functions.
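
The "wait until present, then check" idea can be sketched as a small
polling helper. This is only an illustration: the topotest framework has
its own retry helpers, and `wait_for`, `all_prefixes_present` and the
example prefixes below are hypothetical.

```python
import time


def wait_for(predicate, timeout=10.0, interval=0.5):
    """Poll predicate() until it returns True or timeout seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Hypothetical usage: 'seen' stands in for prefixes parsed from the BGP table.
expected = {"172.31.0.11/32", "172.31.0.12/32"}  # example values only
seen = set()
polls = {"count": 0}


def all_prefixes_present():
    polls["count"] += 1
    if polls["count"] >= 3:  # simulate the prefixes arriving on the third poll
        seen.update(expected)
    return expected <= seen


if wait_for(all_prefixes_present, timeout=1.0, interval=0.01):
    print("prefixes present, safe to check labels")
```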

Fixes: ae5a6bc1f6 ("topotests: add bgp mpls allocation per next-hop test")
Fixes: 37a02a8dcb ("topotests: add bgp_vpnv6 test allocation")

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-10-18 09:41:02 +02:00
Philippe Guibert
fccda55eac zebra: add label chunk allocation in the dynamic block range
This commit adds support for the label chunk allocation in
the configured dynamic block range.

An additional check ensures that the chunk's upper bound does not
exceed the upper bound of the dynamic-block.
Otherwise, a chunk is created with the lower bound set
to the first label element available in the defined
range.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-10-18 09:41:02 +02:00
Philippe Guibert
7a7c4bc80a zebra: rework dynamic label request algorithm
The label chunk algorithm needs to be revisited to support a
configured dynamic-block or the default one.

Reuse the 'lbl_mgr.dynamic_block_[start/end]' variables
wherever needed, and simplify the algorithm.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-10-18 09:41:02 +02:00
Philippe Guibert
0832a2be53 zebra: add 'mpls label dynamic-block' command
Hardset label values (e.g. ISIS Segment-routing label blocks,
hardset BGP L3VPN service label) may conflict with label chunks
dynamically allocated by zebra.

Add an optional 'mpls label dynamic-block' command to let the user
define a range that is not in conflict with the hardset values.
Restarting the control planes is recommended when dynamic label
chunks are already allocated. The command is aborted when any hardset
label chunk conflicts with the dynamic block.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-10-18 09:41:02 +02:00
Philippe Guibert
b71370e83f zebra: fix label allocation when room space before first chunk
After ISIS first allocates a label chunk at [1000;2000],
the '16' label value is not used when BGP tries to
allocate a label chunk in auto mode. This does not happen
when BGP is the only one to do the label allocation.

When a label chunk has been accepted, the next label
request checks whether there is room before the existing
label chunk, but sets the lower label value to 17 rather
than 16.

Fix this by changing the previous range end 'prev_end' label
value to 15 which is the end of the reserved MPLS label
range.
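
The off-by-one can be reproduced with a toy first-fit search. This is a
sketch in Python under simplifying assumptions (sorted inclusive ranges,
no upper-bound check), not the label manager's code; `first_fit` and
`RESERVED_MAX` are illustrative names.

```python
RESERVED_MAX = 15  # labels 0..15 are reserved, per the commit message


def first_fit(chunks, size, prev_end=RESERVED_MAX):
    """Return the first free inclusive (start, end) range of `size` labels.

    chunks is a list of allocated inclusive (start, end) ranges sorted by
    start. prev_end is the last label considered taken before the first
    chunk. Upper-bound checks are omitted for brevity.
    """
    for start, end in chunks:
        if start - prev_end - 1 >= size:  # gap before this chunk is big enough
            break
        prev_end = end
    return (prev_end + 1, prev_end + size)


isis = [(1000, 2000)]
print(first_fit(isis, 1, prev_end=16))  # (17, 17): label 16 is skipped
print(first_fit(isis, 1))               # (16, 16): with prev_end = 15
```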

Fixes: 3c84497943 ("zebra: label manager should never return a reserved block")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-10-18 09:41:02 +02:00
Philippe Guibert
9d32589b58 zebra, test: mark mpls label chunks as dynamic or static
The zebra label manager stores the mpls label chunks,
but does not record if the label request was for a
dynamic or a static chunk.

For all label requests accepted, mark the label chunk
if the 'base' parameter is set to MPLS_LABEL_BASE_ANY,
unmark it otherwise.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-10-18 09:41:02 +02:00
Philippe Guibert
b5808ecc89 bgpd: fix wrong 'pending' labelpool counter value at startup
If BGP starts with a l3vpn configuration, the 'pending' value
of the 'show bgp labelpool summary' command is set to 128,
whereas the 'pending' value is 0 if the l3vpn configuration is
applied after.

with no config at startup:
> show bgp labelpool summary
> Labelpool Summary
> -----------------
> Ledger:       1
> InUse:        1
> Requests:     0
> LabelChunks:  1
> Pending:      0
> Reconnects:   1

with config at startup:
> show bgp labelpool summary
> Labelpool Summary
> -----------------
> Ledger:       1
> InUse:        1
> Requests:     0
> LabelChunks:  1
> Pending:      128
> Reconnects:   1

When BGP configuration is applied at startup, the label request fails,
because the zapi connection with zebra is not yet up. At the zebra
up event, the label request is done again, succeeds, decrements the
'pending_count' value in the 'bgp_lp_event_chunk()' function, then sets
the 'pending_count' value to the 'labels_needed' value.

This method was correct when label requests were asynchronous: the
'pending_count' value was first set, then decremented. With synchronous
label requests, the operations are swapped.

Fix this by incrementing the expected 'labels_needed' value instead.
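
The ordering problem can be illustrated with a toy counter. This is a
sketch in Python, not bgpd's code: the `LabelPool` class, the clamping
at zero, and the exact shape of the fixed ordering are assumptions made
for illustration, and the actual patch may differ.

```python
class LabelPool:
    """Toy model of the 'pending' counter; not bgpd's data structure."""

    def __init__(self):
        self.pending_count = 0

    def on_chunk(self, labels):
        # A chunk arrived: that many labels are no longer pending.
        # Clamping at zero is an assumption of this sketch.
        self.pending_count = max(0, self.pending_count - labels)


def request_set_after(pool, labels_needed):
    # Synchronous reply is handled first, then the counter is overwritten.
    pool.on_chunk(labels_needed)
    pool.pending_count = labels_needed


def request_increment_before(pool, labels_needed):
    # Account for the request first, then handle the synchronous reply.
    pool.pending_count += labels_needed
    pool.on_chunk(labels_needed)


old, new = LabelPool(), LabelPool()
request_set_after(old, 128)
request_increment_before(new, 128)
print(old.pending_count, new.pending_count)  # 128 0
```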

Fixes: 0043ebab99 ("bgpd: Use synchronous way to get labels from Zebra")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-10-18 09:41:02 +02:00
Philippe Guibert
0177a0ded1 bgpd: fix release label chunk when label pool unused
A label chunk is used by BGP for L3VPN or LU purposes,
by picking up labels from that chunk; but when those
labels are released, the label chunks are never released.

The below configuration sequence shows that the label
chunks are not released.

> router bgp 65500
>  bgp router-id 1.1.1.1
>  !
>  address-family ipv4 unicast
>   label vpn export auto
>   rd vpn export 55:1
>   rt vpn both 55:1
>   export vpn
>   import vpn
> [..]
>   no label vpn export auto
> [..]
> # show bgp labelpool summary
> [..]
> LabelChunks:  1
> Pending:      128
> [..]

The '128' value stands for the default label chunk size,
which is not released after unconfiguration.

Fix this by checking after each label release, that
the label chunk is still used. If not, release it.
Reset the 'next_chunksize' value to the default value.
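
A minimal sketch of the "release the chunk once it is unused" idea,
written in Python with hypothetical names (`ChunkPool`, `allocate`,
`release`); it does not mirror bgpd's labelpool structures.

```python
class ChunkPool:
    """Toy label pool that releases a chunk once none of its labels are used."""

    def __init__(self):
        self.in_use = {}  # (start, end) -> set of labels handed out

    def add_chunk(self, start, end):
        self.in_use[(start, end)] = set()

    def allocate(self):
        for (start, end), used in self.in_use.items():
            for label in range(start, end + 1):
                if label not in used:
                    used.add(label)
                    return label
        return None  # no chunk has room; a real pool would request one

    def release(self, label):
        """Free label; return the chunk if it became unused, else None."""
        for chunk, used in self.in_use.items():
            if label in used:
                used.remove(label)
                if not used:
                    del self.in_use[chunk]  # hand the chunk back
                    return chunk
                return None
        return None


pool = ChunkPool()
pool.add_chunk(16, 143)  # 128 labels, the default chunk size mentioned above
first = pool.allocate()
second = pool.allocate()
print(pool.release(first))   # None: the chunk still has a label in use
print(pool.release(second))  # (16, 143): last label freed, chunk released
```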

Fixes: 955bfd984f ("bgpd: dynamic mpls label pool")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-10-18 09:41:02 +02:00
Philippe Guibert
4a81210169 topotests: fix accept_own test, bgp label value conflict with ldp
When configuring a manual label value in BGP L3VPN, the label
allocation conflicts with the LDP label pool which is in use.
Choose BGP label values different from the ones used by LDP.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-10-18 09:41:02 +02:00
Philippe Guibert
cb86d8e3a4 bgpd: fix label allocation should not be allocated at startup
BGP always asks zebra for a chunk of MPLS labels even if it doesn't need one.
Fix this by correcting the formula that rounds up "labels_needed".
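
The commit message does not spell out the old or the new formula. Purely
as an illustration, one common way a round-up goes wrong is "floor plus
one", which asks for a chunk even when nothing is needed; integer
ceiling division avoids that. Both function names below are hypothetical.

```python
def chunks_floor_plus_one(labels_needed, chunk_size):
    # Over-allocates: yields 1 chunk even when nothing is needed.
    return labels_needed // chunk_size + 1


def chunks_ceiling(labels_needed, chunk_size):
    # Integer ceiling division: 0 labels -> 0 chunks.
    return (labels_needed + chunk_size - 1) // chunk_size


for needed in (0, 1, 128, 129):
    print(needed, chunks_floor_plus_one(needed, 128), chunks_ceiling(needed, 128))
```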

Fixes: 80853c2ec7 ("bgpd: improve labelpool performance at scale")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-10-18 09:41:02 +02:00
Philippe Guibert
d162d5f6f5 bgpd: fix hardset l3vpn label available in mpls pool
Today, when configuring BGP L3VPN mpls, the operator may
use the following command to hardset a label value:

> router bgp 65500 vrf vrf1
> address-family ipv4 unicast
> label vpn export <hardset_label_value>

Today, BGP uses this value without checks, leading to potential
conflicts with other control planes like LDP. For instance, if
LDP starts with a label chunk of [16;72] and BGP also uses the
50 label value, a conflict arises.

The 'label manager' service in zebra oversees label allocations.
While all the control plane daemons use it, BGP doesn't when a
hardset label is in place.

This update fixes this problem. Now, when a hardset label is set for
l3vpn export, a request is made to the label manager for approval,
ensuring no conflicts with other daemons. However, this means some existing
BGP configurations might become non-operational if they conflict with
labels already allocated to another daemon, even if those labels are not used.

note: Labels below 16 are reserved and won't be checked for consistency
by the label manager.
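
The approval step can be pictured with a toy arbiter, using the LDP
[16;72] and BGP label 50 example from above. This is a Python sketch
with hypothetical names (`LabelManager`, `request`), not the zebra label
manager API; skipping reserved labels follows the note above.

```python
RESERVED_MAX = 15  # labels below 16 are reserved and not checked


class LabelManager:
    """Toy arbiter of label ranges; not zebra's label manager API."""

    def __init__(self):
        self.chunks = []  # list of (owner, start, end), inclusive ranges

    def request(self, owner, start, end):
        """Grant [start, end] to owner unless it overlaps another owner's chunk."""
        if end <= RESERVED_MAX:
            return True  # reserved labels are not tracked in this sketch
        for other, s, e in self.chunks:
            if other != owner and start <= e and end >= s:
                return False
        self.chunks.append((owner, start, end))
        return True


lm = LabelManager()
print(lm.request("ldp", 16, 72))    # True
print(lm.request("bgp", 50, 50))    # False: 50 lies inside LDP's chunk
print(lm.request("bgp", 100, 100))  # True
```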

Fixes: ddb5b4880b ("bgpd: vpn-vrf route leaking")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-10-18 09:41:02 +02:00
Philippe Guibert
1c199f219d bgpd: rewrite 'bgp label vpn export' command
The original 'bgp label vpn export' code is confusing:
the 'no' form actions are mixed with the positive form.

Fix this by rewriting the code.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-10-18 09:41:02 +02:00
Donatas Abraitis
a681f525b9
Merge pull request #14607 from mobash-rasool/fixes2
pim6d: valgrind issue fixes
2023-10-17 17:34:11 +03:00
Donatas Abraitis
6ece98ecc1 bgpd: Reuse orf_type_str/orf_mode_str for dynamic capabilities code
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-10-17 16:01:00 +03:00
Donatas Abraitis
1fb08e91d7 tests: Check if ORF capability works with BGP dynamic capabilities
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-10-17 16:00:00 +03:00
Igor Ryzhov
d2977d57c8 mgmtd, lib: remove batch ids from cfg apply reply
The config is always applied fully, all batches are included. There's no
need to pass a list of applied batches as it always contains all of
them.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2023-10-17 15:06:13 +03:00
Mobashshera Rasool
1064818645 pim6d: valgrind issue fixes
Problem Statement:
===================
Syscall param sendmsg(msg.msg_iov[0]) points to uninitialised byte(s)
at 0x4975157: sendmsg (sendmsg.c:28)
==2263111==    by 0x1413BE: pim_msg_send_frame (pim_pim.c:629)
==2263111==    by 0x1413BE: pim_msg_send (pim_pim.c:743)
==2263111==    by 0x1425DC: pim_register_send (pim_register.c:332)
==2263111==    by 0x1427EE: pim_null_register_send (pim_register.c:443)
==2263111==    by 0x14D228: pim_upstream_register_stop_timer (pim_upstream.c:1608)
==2263111==    by 0x48CE6DF: thread_call (thread.c:1693)
==2263111==    by 0x4899EFF: frr_run (libfrr.c:1068)
==2263111==    by 0x11D035: main (pim6_main.c:190)
==2263111==  Address 0x1ffeffdcb1 is on thread 1's stack
==2263111==  in frame #2, created by pim_register_send (pim_register.c:273)
==2263111==  Uninitialised value was created by a stack allocation
==2263111==    at 0x142690: pim_null_register_send (pim_register.c:389)

RCA:
====================
1. All members of struct pim_msg_header were not initialised while sending
the null register packet. Therefore, when the pointers are assigned while
sending the msg via sendmsg, valgrind complains that the pointer points to
uninitialised byte(s).
2. struct ipv6_ph ph was also not initialised.

Fix:
====================
Initialised all the members using memset.

Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
2023-10-16 21:44:32 -07:00
Donald Sharp
c8d568487c
Merge pull request #14599 from opensourcerouting/fix/issue_14419
tests: Check if evpn route-map match by route type works
2023-10-16 10:20:23 -04:00
Donatas Abraitis
c7a9af861a tests: Check if evpn route-map match by route type works
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-10-15 19:46:34 +03:00
Donatas Abraitis
c97c449e1f
Merge pull request #14585 from donaldsharp/send_capability
ldpd: Clarify error situation for different problems
2023-10-14 20:22:37 +03:00
Donald Sharp
50e6ba26a4
Merge pull request #14582 from cloudscale-ch/denis/topotest-for-14488
tests: Add OSPF test for issue 14488
2023-10-14 09:42:49 -04:00