Because of the issue described in the above link, pinging from vrf with
the command "ip vrf exec <vrf> ping -I <src> <addr>" may fail.
> root@topo:~# ip vrf exec vrf1 ping -c1 -I 192.168.2.1 192.168.1.1
> bind: Cannot assign requested address
Raise an error if pinging its own IP from a VRF fails. This test should
always work unless in the condition of this issue.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=203483
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Add an "exist" key to check the existence of a prefix in the BGP RIB.
Useful to check that a prefix has not leaked by error.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
When the last IPv4 address of an interface is deleted, Linux removes all
routes includes BGP ones using this interface without any Netlink
advertisement. bgpd keeps them in RIB as valid (e.g. installed in FIB).
The previous patch invalidates the associated nexthop groups in zebra
but bgpd is not notified of the event.
> 2022/05/09 17:37:52.925 ZEBRA: [TQKA8-0276P] Not Notifying Owner: connected about prefix 29.0.0.0/24(40) 3 vrf: 7
Look for the bgp_path_info that are unsynchronized with the kernel and
flag them for refresh in their attributes. A VPN route leaking update is
calles and the refresh flag triggers a route refresh to zebra and then a
kernel FIB installation.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
When the last IPv4 address of an interface is deleted, Linux removes
all routes using this interface without any Netlink advertisement.
Routes that have a IPv4 nexthop are correctly removed from the FRR RIB.
However, routes that only have an interface with no more IPv4 addresses
as a nexthop remains in the FRR RIB.
In this situation, among the routes that this particular interface
nexthop:
- remove from the zebra kernel routes
- reinstall the routes that have been added from FRR. It is useful when
the nexthop is for example a VRF interface.
Add related test cases in the zebra_netlink topotest.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Previous commits have introduced a new 8 bits nh_flag in the attr
struct that has increased the memory footprint.
Move the mp_nexthop_prefer_global boolean in the attr structure that
takes 8 bits to the new nh_flag in order to go back to the previous
memory utilization.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Leaked recursive routes are not resolved.
> VRF r1-cust1:
> B> 5.1.0.0/24 [200/98] via 99.0.0.1 (recursive), weight 1, 00:00:08
> * via 192.168.1.2, r1-eth4, weight 1, 00:00:08
> B>* 99.0.0.1/32 [200/0] via 192.168.1.2, r1-eth4, weight 1, 00:00:08
> VRF r1-cust4:
> B 5.1.0.0/24 [20/98] via 99.0.0.1 (vrf r1-cust1) inactive, weight 1, 00:00:08
> B>* 99.0.0.1/32 [20/0] via 192.168.1.2, r1-eth4 (vrf r1-cust1), weight 1, 00:00:08
When announcing the routes to zebra, use the peer of the ultimate bgp
path info instead of the one of the first parent path info to determine
whether the route is recursive.
The result is:
> VRF r1-cust4:
> B> 5.1.0.0/24 [20/98] via 99.0.0.1 (vrf r1-cust1) (recursive), weight 1, 00:00:02
> * via 192.168.1.2, r1-eth4 (vrf r1-cust1), weight 1, 00:00:02
> B>* 99.0.0.1/32 [20/0] via 192.168.1.2, r1-eth4 (vrf r1-cust1), weight 1, 00:00:02
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Update bgp_vrf_route_leak_basic to set up the VRF interfaces. Otherwise
the routes to the VRF interface are inactives.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
At bgpd startup, VRF instances are sent from zebra before the
interfaces. When importing a l3vpn prefix from another local VRF
instance, the interfaces are not known yet. The prefix nexthop interface
cannot be set to the loopback or the VRF interface, which causes setting
invalid routes in zebra.
Update route leaking when the loopback or a VRF interface is received
from zebra.
At a VRF interface deletion, zebra voluntarily sends a
ZEBRA_INTERFACE_ADD message to move it to VRF_DEFAULT. Do not update if
such a message is received. VRF destruction will destroy all the related
routes without adding codes.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Leaked connected routes have now the following nexthop interfaces:
- lo for routes imported from the default VRF
- or the VRF interface for routes imported from the other VRFs.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Add a function to find the VRF or the loopback interface: the loopback
interface for the default VRF and the VRF master interface otherwise.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
The following configuration creates an infinite routing leaking loop
because 'rt vpn both' parameters are the same in both VRFs.
> router bgp 5227 vrf r1-cust4
> no bgp network import-check
> bgp router-id 192.168.1.1
> address-family ipv4 unicast
> network 28.0.0.0/24
> rd vpn export 10:12
> rt vpn both 52:100
> import vpn
> export vpn
> exit-address-family
> !
> router bgp 5227 vrf r1-cust5
> no bgp network import-check
> bgp router id 192.168.1.1
> address-family ipv4 unicast
> network 29.0.0.0/24
> rd vpn export 10:13
> rt vpn both 52:100
> import vpn
> export vpn
> exit-address-family
The previous commit has added a routing leak update when a nexthop
update is received from zebra. It indirectly calls
bgp_find_or_add_nexthop() in which a static route triggers a nexthop
cache entry registration that triggers a nexthop update from zebra.
Do not register again the nexthop cache entry if the BGP_STATIC_ROUTE is
already set.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
If 'network import-check' is defined on the source BGP session, prefixes
that are stated in the network command cannot be leaked to the other
VRFs BGP table even if they are present in the origin VRF RIB if the
'rt import' statement is defined after the 'network <prefix>' ones.
When a prefix nexthop is updated, update the prefix route leaking. The
current state of nexthop validation is now stored in the attributes of
the bgp path info. Attributes are compared with the previous ones at
route leaking update so that a nexthop validation change now triggers
the update of destination VRF BGP table.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
"if not XX else" statements are confusing.
Replace two "if not XX else" statements by "if XX else" to prepare next
commits. The patch is only cosmetic.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
If 'network import-check' is defined on the source BGP session, prefixes
that are stated in the network command cannot be leaked to the other
VRFs BGP table even if they are present in the origin VRF RIB.
Always validate the nexthop of BGP static routes (i.e. defined with the
network statement) if 'network import-check' is defined on the source
BGP session and the prefix is present in source RIB.
It fixes the issue when the 'rt import' statement is defined after the
'network' ones.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Prefixes that are stated in the network command cannot be leaked to
the other VRFs BGP table whether or not they are present in the origin
VRF RIB.
Always validate the nexthop of BGP static routes (i.e. defined with the
network statement) if 'no network import-check' is defined on the source
BGP session.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The function bgp_packet_mpattr_prefix was using an if statement
to encode packets to the peer. Change it to a switch and make
it handle all the cases and fail appropriately when something
has gone wrong. Hopefully in the future when a new afi/safi
is added we can catch it by compilation breaking instead of
weird runtime errors
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This function was just using default: case statements for
the encoding of nlri's to a peer. Lay out all the different
cases and make things fail hard when a dev escape is found.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The function bgp_packet_mpattr_prefix_size had an if/else
body that allowed people to add encoding types to bgpd
such that we could build the wrong size packets. This
was exposed recently in commit:
0a9705a1e0
Where it was discovered flowspec was causing bgp update
messages to exceed the maximum size and the peer to
drop the connection. Let's be proscriptive about this
and hopefully make it so that things don't work when
someone adds a new safi to the system ( and they'll have
to update this function ).
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When compiling with -fsanitize=thread. I started getting this error:
staticd/static_zebra.c: In function ‘static_zebra_nht_get_prefix’:
staticd/static_zebra.c:316:1: error: control reaches end of non-void function [-Werror=return-type]
316 | }
| ^
Just to make future efforts still work, let's just make the compiler happy.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
ASAN reported the following memleak:
```
Direct leak of 40 byte(s) in 1 object(s) allocated from:
#0 0x4d4342 in calloc (/usr/lib/frr/bgpd+0x4d4342)
#1 0xbc3d68 in qcalloc /home/sharpd/frr8/lib/memory.c:116:27
#2 0xb869f7 in list_new /home/sharpd/frr8/lib/linklist.c:64:9
#3 0x5a38bc in bgp_evpn_remote_ip_hash_alloc /home/sharpd/frr8/bgpd/bgp_evpn.c:6789:24
#4 0xb358d3 in hash_get /home/sharpd/frr8/lib/hash.c:162:13
#5 0x593d39 in bgp_evpn_remote_ip_hash_add /home/sharpd/frr8/bgpd/bgp_evpn.c:6881:7
#6 0x59dbbd in install_evpn_route_entry_in_vni_common /home/sharpd/frr8/bgpd/bgp_evpn.c:3049:2
#7 0x59cfe0 in install_evpn_route_entry_in_vni_ip /home/sharpd/frr8/bgpd/bgp_evpn.c:3126:8
#8 0x59c6f0 in install_evpn_route_entry /home/sharpd/frr8/bgpd/bgp_evpn.c:3318:8
#9 0x59bb52 in install_uninstall_route_in_vnis /home/sharpd/frr8/bgpd/bgp_evpn.c:3888:10
#10 0x59b6d2 in bgp_evpn_install_uninstall_table /home/sharpd/frr8/bgpd/bgp_evpn.c:4019:5
#11 0x578857 in install_uninstall_evpn_route /home/sharpd/frr8/bgpd/bgp_evpn.c:4051:9
#12 0x58ada6 in bgp_evpn_import_route /home/sharpd/frr8/bgpd/bgp_evpn.c:6049:9
#13 0x713794 in bgp_update /home/sharpd/frr8/bgpd/bgp_route.c:4842:3
#14 0x583fa0 in process_type2_route /home/sharpd/frr8/bgpd/bgp_evpn.c:4518:9
#15 0x5824ba in bgp_nlri_parse_evpn /home/sharpd/frr8/bgpd/bgp_evpn.c:5732:8
#16 0x6ae6a2 in bgp_nlri_parse /home/sharpd/frr8/bgpd/bgp_packet.c:363:10
#17 0x6be6fa in bgp_update_receive /home/sharpd/frr8/bgpd/bgp_packet.c:2020:15
#18 0x6b7433 in bgp_process_packet /home/sharpd/frr8/bgpd/bgp_packet.c:2929:11
#19 0xd00146 in thread_call /home/sharpd/frr8/lib/thread.c:2006:2
```
The list itself was not being cleaned up when the final list entry was
removed, so make sure we do that instead of leaking memory.
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
The list delete function on creation was set to srv6_locator_chunk_free
Which takes a double pointer and dereferences it to free the data.
When list_delete is called it calls the delete function like this:
if (*list->del)
(*list->del)(node->data);
The data is not passed in by reference and as such we do not have
a double pointer. Fortunately this list_delete is only really
called on shutdown when the locator was deleted and we do not
have a fun situation where we were suddenly freeing 'something'.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When using auth keys in ospfv3, there are some memory
leaks when you change the key or remove the interface
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The early route queue has a series of `struct zebra_early_route *`
entries. Zebra is treating this memory as just a `struct route entry`.
This is wrong. Correct this to free the memory correctly.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The wq->spec.errorfunc is never used in the code.
It's been in the code base since 2005 and I also
do not remember ever seeing it being called. No
workqueue process function ever returns error.
Since it's not used let's just remove it from the
code base.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Building FRR with --enable-address-sanitizer and then running the
config_timing test makes the test run for over an hour on my machine.
The goal of this test is to ensure that the test runs 10000 routes
in/out in a reasonable amount of time. We cannot test this with
address-sanitizer enabled. So just make the test meaningless
from a timing perspective but keep it `alive` from a it might
catch some address sanitizer issue with 50 -vs- 10000 routes
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Address Sanitizer found this:
=================================================================
==418623==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 128 byte(s) in 4 object(s) allocated from:
#0 0x4bd732 in calloc (/usr/lib/frr/zebra+0x4bd732)
#1 0x7feaeab8f798 in qcalloc /home/sharpd/frr8/lib/memory.c:116:27
#2 0x7feaeaba40f4 in nexthop_group_new /home/sharpd/frr8/lib/nexthop_group.c:270:9
#3 0x56859b in netlink_route_change_read_unicast /home/sharpd/frr8/zebra/rt_netlink.c:950:9
#4 0x5651c2 in netlink_route_change /home/sharpd/frr8/zebra/rt_netlink.c:1204:2
#5 0x54af15 in netlink_information_fetch /home/sharpd/frr8/zebra/kernel_netlink.c:407:10
#6 0x53e7a3 in netlink_parse_info /home/sharpd/frr8/zebra/kernel_netlink.c:1184:12
#7 0x548d46 in kernel_read /home/sharpd/frr8/zebra/kernel_netlink.c:501:2
#8 0x7feaeacc87f6 in thread_call /home/sharpd/frr8/lib/thread.c:2006:2
#9 0x7feaeab36503 in frr_run /home/sharpd/frr8/lib/libfrr.c:1198:3
#10 0x550d38 in main /home/sharpd/frr8/zebra/main.c:476:2
#11 0x7feaea492d09 in __libc_start_main csu/../csu/libc-start.c:308:16
Indirect leak of 576 byte(s) in 4 object(s) allocated from:
#0 0x4bd732 in calloc (/usr/lib/frr/zebra+0x4bd732)
#1 0x7feaeab8f798 in qcalloc /home/sharpd/frr8/lib/memory.c:116:27
#2 0x7feaeab9b3f8 in nexthop_new /home/sharpd/frr8/lib/nexthop.c:373:7
#3 0x56875e in netlink_route_change_read_unicast /home/sharpd/frr8/zebra/rt_netlink.c:960:15
#4 0x5651c2 in netlink_route_change /home/sharpd/frr8/zebra/rt_netlink.c:1204:2
#5 0x54af15 in netlink_information_fetch /home/sharpd/frr8/zebra/kernel_netlink.c:407:10
#6 0x53e7a3 in netlink_parse_info /home/sharpd/frr8/zebra/kernel_netlink.c:1184:12
#7 0x548d46 in kernel_read /home/sharpd/frr8/zebra/kernel_netlink.c:501:2
#8 0x7feaeacc87f6 in thread_call /home/sharpd/frr8/lib/thread.c:2006:2
#9 0x7feaeab36503 in frr_run /home/sharpd/frr8/lib/libfrr.c:1198:3
#10 0x550d38 in main /home/sharpd/frr8/zebra/main.c:476:2
#11 0x7feaea492d09 in __libc_start_main csu/../csu/libc-start.c:308:16
SUMMARY: AddressSanitizer: 704 byte(s) leaked in 8 allocation(s).
Fix this!
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When shutting down ensure that any daemon operating with
snmp tells it to stop operating so no more data is sent.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The L3VPN best path computation now takes into accound the IGP metric.
Adapt the bgp_l3vpn_to_bgp_vrf tests so that routes with the best IGP
metric are selected when needed.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Display the IGP metric of the ultimate path in the SNMP OID
mplsL3VpnVrfRteInetCidrMetric1.
Fixes: da0c0ef70c ("bgpd: VRF-Lite fix best path selection")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Display IGP metric of the ultimate path in the command
"show bgp vrf X ipv(4|6)".
Fixes: da0c0ef70c ("bgpd: VRF-Lite fix best path selection")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Since the commit da0c0ef70c ("bgpd: VRF-Lite fix best path selection"),
the best path selection is made from the comparison of the attributes
of the original route i.e. the ultimate path.
The IGP metric is currently set on the child path instead of the
ultimate path (i.e. the parent path). On eBGP, the ultimate path is the
child path. However, for imported routes, the ultimate path is always
set to 0, which results in skipping the IGP metric comparison when
selecting the best path.
Set the IGP metric on the ultimate path when a BGP nexthop is added or
updated.
Fixes: da0c0ef70c ("bgpd: VRF-Lite fix best path selection")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Currently, bgp_packet_mpattr_prefix_size (bgpd/bgp_attr.c:3978) always returns zero for Flowspec prefixes.
This is because, for flowspec prefixes, the prefixlen attribute of the prefix struct is always set to 0, and the actual length is bytes is set inside the flowspec_prefix struct instead (see lib/prefix.h:293 and lib/prefix.h:178).
Because of this, with a large number of flowspec NLRIs, bgpd ends up building update messages that exceed the maximum size and cause the peer to drop the connection (bgpd/bgp_updgrp_packet.c:L719).
The proposed change allows the bgp_packet_mpattr_prefix_size to return correct result for flowspec prefixes.
Signed-off-by: Stephane Poignant <stephane.poignant@proton.ch>