Zebra currently does a shortest prefix match for
resolving nexthops for a prefix. This is typically
an ok thing to do but fails in several specific scenarios.
If a nexthop matches to a route that is not usable, nexthop
resolution just gives up and refuses to use that particular
route. For example if zebra currently has a covering prefix
say a 10.0.0.0/8. And about the same time it receives a
10.1.0.0/16 ( a more specific than the /8 ) and another
route A, who's nexthop is 10.1.1.1. Imagine the 10.1.0.0/16
is processed enough to know we want to install it and the
prefix is sent to the dataplane for installation( it is queued )
and then route A is processed, nexthop resolution will fail
and the route A will be left in limbo as uninstallable.
Let's modify the nexthop resolution code in zebra such that
if a nexthop's most specific match is unusable, continue looking
up the table till we get to the 0.0.0.0/0 route( if it's even
installed ). If we find a usable route for the nexthop accept
it and use it.
The bgp_default_originate topology test is frequently failing
with this exact problem:
B>* 0.0.0.0/0 [200/0] via 192.168.1.1, r2-r1-eth0, weight 1, 00:00:21
B 1.0.1.17/32 [200/0] via 192.168.0.1 inactive, weight 1, 00:00:21
B>* 1.0.2.17/32 [200/0] via 192.168.1.1, r2-r1-eth0, weight 1, 00:00:21
C>* 1.0.3.17/32 is directly connected, lo, 00:02:00
B>* 1.0.5.17/32 [20/0] via 192.168.2.2, r2-r3-eth1, weight 1, 00:00:32
B>* 192.168.0.0/24 [200/0] via 192.168.1.1, r2-r1-eth0, weight 1, 00:00:21
B 192.168.1.0/24 [200/0] via 192.168.1.1 inactive, weight 1, 00:00:21
C>* 192.168.1.0/24 is directly connected, r2-r1-eth0, 00:02:00
C>* 192.168.2.0/24 is directly connected, r2-r3-eth1, 00:02:00
B>* 192.168.3.0/24 [20/0] via 192.168.2.2, r2-r3-eth1, weight 1, 00:00:32
B 198.51.1.1/32 [200/0] via 192.168.0.1 inactive, weight 1, 00:00:21
B>* 198.51.1.2/32 [20/0] via 192.168.2.2, r2-r3-eth1, weight 1, 00:00:32
Notice that the 1.0.1.17/32 route is inactive but the nexthop
192.168.0.1 is covered by both the 192.168.0.0/24 prefix( shortest match )
*and* the 0.0.0.0/0 route ( longest match ). When looking at the logs
the 1.0.1.17/32 route was not being installed because the matching
route was not in a usable state, which is because the 192.168.0.0/24
route was in the process of being installed.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When specified `--gdb-use-emacs` will launch the daemon with gdb inside a
running emacs server using `emacsclient --eval` commands.
Signed-off-by: Christian Hopps <chopps@labn.net>
This was true when we had only a CLI for configuration. Now mgmtd has a
public frontend interface that can be used by external applications, and
they can send invalid requests that lead to errors.
This is still true for CLI though, so the same comment still stays in
`nb_cli_apply_changes_internal`.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
If we modify the prefix-list that is used to define the routes to be
advertised, all of them MUST be advertised.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
If we have a prefix-list with one entry, and after some time we append a prefix-list
with some more additional entries, conditional advertisement is triggered, and the
old entries are suppressed (because they look identical as sent before).
Hence, the old entries are sent as withdrawals and only new entries sent as updates.
Force re-sending all BGP updates for conditional advertisement. The same is done
for route-refresh, and/or soft clear operations.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
For now it includes only FQDN capability, because other capabilities can be
resend using specific knobs.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
For instance, it's not possible to resend FQDN capability without resetting
the session, so let's create some more elegant way to do that.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
When testing GR features, it is desired to kill bgp
(or really any daemon )and not immediately have bgp start up again.
Modify the code to not attempt to restart the daemon
by hand to let us developers work when the `watchfrr ignore XXX`
command is issued.
Testing:
watchfrr ignore bgpd
kill -9 bgpd
start bgp by `/usr/lib/frr/watchfrr.sh start bgpd` at some point in time
in the future
leaf-1# show watchfrr
watchfrr global phase: Idle
Restart Command: "/usr/lib/frr/watchfrr.sh restart %s"
Start Command: "/usr/lib/frr/watchfrr.sh start %s"
Stop Command: "/usr/lib/frr/watchfrr.sh stop %s"
Min Restart Interval: 60
Max Restart Interval: 600
Restart Timeout: 90
zebra Up
bgpd Up/Ignoring Timeout
staticd Up
leaf-1#
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
A static label allocation should not be accepted if the desired range
conflicts with the configured dynamic-block configuration.
Do not accept such label requests, only when dynamic blocks are
configured.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
There is no test that checks for the label allocation mechanisms
involved when using BGP and/or LDP.
- Some configuration changes are applied in the BGP configuration,
and the impact is checked on the BGP contexts, and on the label
manager.
- The label manager dynamic range is reconfigured, BGP auto mode
is checked against the new range, along with LDP when restarting.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The expected tableVersion is wrong, when checking r1 table.
The tableVersion value increments at each route updates. The
previous commit brought an additional route update with the
'vpn_leak_postchange_all()' call.
Keep the function call, and do not check the table version
in bgp_srv6l3vpn_to_bgp_vrf[2,3] tests.
Fixes: 205b62ffae2c ("bgpd: fix hardset l3vpn label available in mpls pool")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
With the deprecation of the global "bgp enforce-first-as" command back
in https://github.com/FRRouting/frr/pull/2259 the newly introduced
option to enable that setting on a specific peer was not documented.
This commit adds the necessary documentation and states the command's
default.
Signed-off-by: Manuel Schweizer <manuel.schweizer@cloudscale.ch>
Add an ability to enable/disable ORF capability dynamically without tearing
down the session.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
The bgp_vpnv[4,6]_table_check() functions analyze the
expected label value of VPN prefixes present in the BGP table.
However, it doesn't verify if the prefixes exist before doing
this. Consequently, the tests will fail if the prefixes do not
show up immediately.
Ensure that all expected VPN prefixes are present before
executing the function.
Fixes: ae5a6bc1f6 ("topotests: add bgp mpls allocation per next-hop test")
Fixes: 37a02a8dcb ("topotests: add bgp_vpnv6 test allocation")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This commit adds support for the label chunk allocation in
the configured dynamic block range.
An additional check ensures the upper bound does not go
over the upper bound of the dynamic-block.
Otherwise, a chunk is created with the lower bound set
to the first label element available in the defined
range.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The label chunk algorithm needs to be revisited to support a
configured dynamic-block or the default one.
Reuse the 'lbl_mgr.dynamic_block_[start/end]' variables,
whereever needed, and simplify the algorithm.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Hardset label values (eg. ISIS Segment-routing label blocks,
hardset BGP L3VPN service label) may conflict with label chunks
dynamically allocated by zebra.
Add an optional 'mpls label dynamic-block' command to let the user
define a range that is not in conflict with the hardset values.
Restarting control planes is recommended when dynamic label
chunks are already allocated. Command is aborted when any hardset
label chunks conflict with the dynamic block.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
After ISIS first allocates a label chunk at [1000;2000],
the '16' label value is not used when BGP tries to
allocate a label chunk in auto mode. This does not happen
when BGP is the only one to do the label allocation.
When a label chunk has been accepted, the next label
request checks if there is room space before the existing
label chunk, and uses the lower label value to 17, and not
16.
Fix this by changing the previous range end 'prev_end' label
value to 15 which is the end of the reserved MPLS label
range.
Fixes: 3c84497943 ("zebra: label manager should never return a reserved block")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The zebra label manager stores the mpls label chunks,
but does not record if the label request was for a
dynamic or a static chunk.
For all label requests accepted, mark the label chunk
if the 'base' parameter is set to MPLS_LABEL_BASE_ANY,
unmark it otherwise.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
If BGP starts with a l3vpn configuration, the 'pending' value
of the 'show bgp labelpool summary' command is set to 128,
whereas the 'pending' value is 0 if the l3vpn configuration is
applied after.
with no config at startup:
> show bgp labelpool summary
> Labelpool Summary
> -----------------
> Ledger: 1
> InUse: 1
> Requests: 0
> LabelChunks: 1
> Pending: 0
> Reconnects: 1
with config at startup:
> show bgp labelpool summary
> Labelpool Summary
> -----------------
> Ledger: 1
> InUse: 1
> Requests: 0
> LabelChunks: 1
> Pending: 128
> Reconnects: 1
When BGP configuration is applied at startup, the label request fails,
because the zapi connection with zebra is not yet up. At zebra
up event, the label request is done again, succeeds, decrements the
'pending_count' value in 'bgp_lp_event_chunk() function, then sets
the 'pending_count' value to the 'labels_needed' value.
This method was correct when label requests were asyncronous: the
'pending_count' value was first set, then decremented. In syncronous
label requests, the operations are swapped.
Fix this by incrementing the expected 'labels_needed' value instead.
Fixes: 0043ebab99 ("bgpd: Use synchronous way to get labels from Zebra")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
A label chunk is used by BGP for L3VPN or LU purposes,
by picking up labels from that chunk; but when those
labels are release, the label chunks are never released.
The below configuration sequence shows that the label
chunks are not released.
> router bgp 65500
> bgp router-id 1.1.1.1
> !
> address-family ipv4 unicast
> label vpn export auto
> rd vpn export 55:1
> rt vpn both 55:1
> export vpn
> import vpn
> [..]
> no label vpn export auto
> [..]
> # show bgp labelpool summary
> [..]
> LabelChunks: 1
> Pending: 128
> [..]
The '128' value stands for the default label chunk size,
which is not released after unconfiguration.
Fix this by checking after each label release, that
the label chunk is still used. If not, release it.
Reset the 'next_chunksize' value to the default value.
Fixes: 955bfd984f ("bgpd: dynamic mpls label pool")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When configuring manual label value in BGP L3VPN, the label
allocation conflicts with the LDP label pool which is in use.
Choose BGP label values different that the ones from LDP.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
BGP always asks zebra for a chunk of MPLS label even if it doesn't need it.
Fix this by correcting the rounding up "labels_needed" formula.
Fixes: 80853c2ec7 ("bgpd: improve labelpool performance at scale")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Today, when configuring BGP L3VPN mpls, the operator may
use that command to hardset a label value:
> router bgp 65500 vrf vrf1
> address-family ipv4 unicast
> label vpn export <hardset_label_value>
Today, BGP uses this value without checks, leading to potential
conflicts with other control planes like LDP. For instance, if
LDP initiates with a label chunk of [16;72] and BGP also uses the
50 label value, a conflict arises.
The 'label manager' service in zebra oversees label allocations.
While all the control plane daemons use it, BGP doesn't when a
hardset label is in place.
This update fixes this problem. Now, when a hardset label is set for
l3vpn export, a request is made to the label manager for approval,
ensuring no conflicts with other daemons. But, this means some existing
BGP configurations might become non-operational if they conflict with
labels already allocated to another daemon but not used.
note: Labels below 16 are reserved and won't be checked for consistency
by the label manager.
Fixes: ddb5b4880b ("bgpd: vpn-vrf route leaking")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The original 'bgp label vpn export' code is confusing,
the 'no form' actions are mixed with the positive form.
Fix this by rewriting the code.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The config is always applied fully, all batches are included. There's no
need to pass a list of applied batches as it always contains all of
them.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Problem Statement:
===================
Syscall param sendmsg(msg.msg_iov[0]) points to uninitialised byte(s)
at 0x4975157: sendmsg (sendmsg.c:28)
==2263111== by 0x1413BE: pim_msg_send_frame (pim_pim.c:629)
==2263111== by 0x1413BE: pim_msg_send (pim_pim.c:743)
==2263111== by 0x1425DC: pim_register_send (pim_register.c:332)
==2263111== by 0x1427EE: pim_null_register_send (pim_register.c:443)
==2263111== by 0x14D228: pim_upstream_register_stop_timer (pim_upstream.c:1608)
==2263111== by 0x48CE6DF: thread_call (thread.c:1693)
==2263111== by 0x4899EFF: frr_run (libfrr.c:1068)
==2263111== by 0x11D035: main (pim6_main.c:190)
==2263111== Address 0x1ffeffdcb1 is on thread 1's stack
==2263111== in frame #2, created by pim_register_send (pim_register.c:273)
==2263111== Uninitialised value was created by a stack allocation
==2263111== at 0x142690: pim_null_register_send (pim_register.c:389)
RCA:
====================
1. All members of struct pim_msg_header were not initiliased while sending
null register packet. Therefore when the pointers are assigned while
sending the msg via sendmsg, it complains the pointer points to
uninitialised byte.
2. struct ipv6_ph ph was also not initialised.
Fix:
====================
Initialised all the members using memset.
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>