Nexthop groups as a whole do not make sense to have a vrf'ness
As that you can have a arbitrary number of nexthops that point
to separate vrf's.
Modify the code to make this distinction, by clearly delineating
the line between the nhg and the nexthop a bit better.
Nexthop groups having a vrf_id only make sense if you are using
network namespaces to represent them.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The zebra implementation of nexthop groups has
two types of nexthops groups currently. Singleton
objects which have afi's and combined nexthop groups
that do not. Specifically call this out in the code
to make this distinction.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Handling capability received from client. It may contain
GR enable/disable, Stale time changes, RIB update complete
for given AFi, ASAFI and instance. It also has changes for
stale route handling.
Signed-off-by: Santosh P K <sapk@vmware.com>
Add a null check in `handle_recursive_depend()` so it
doesn't try to add a NULL pointer to the RB tree.
This was found with clang SA.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
We were not resetting the nexthop pointer to NULL for each
new read of a nexthop from the zapi route. On the chance we
get a nexthop that does not have a proper type, we will not
create a new nexthop and update that pointer, thus it still
has the last valid one and will create a group with two
pointers to the same nexthop.
Then when it enters any code that iterates the group, it loops
endlessly.
This was found with zapi fuzzing.
```
0x00007f728891f1c3 in jhash2 (k=<optimized out>, length=<optimized out>, initval=12183506) at lib/jhash.c:138
0x00007f728896d92c in nexthop_hash (nexthop=<optimized out>) at lib/nexthop.c:563
0x00007f7288979ece in nexthop_group_hash (nhg=<optimized out>) at lib/nexthop_group.c:394
0x0000000000621036 in zebra_nhg_hash_key (arg=<optimized out>) at zebra/zebra_nhg.c:356
0x00007f72888ec0e1 in hash_get (hash=<optimized out>, data=0x7ffffb94aef0, alloc_func=0x0) at lib/hash.c:138
0x00007f72888ee118 in hash_lookup (hash=0x7f7288de2f10, data=0x7f728908e7fc) at lib/hash.c:183
0x0000000000626613 in zebra_nhg_find (nhe=0x7ffffb94b080, id=0, nhg=0x6020000032d0, nhg_depends=0x0, vrf_id=<optimized out>,
afi=<optimized out>, type=<optimized out>) at zebra/zebra_nhg.c:541
0x0000000000625f39 in zebra_nhg_rib_find (id=0, nhg=<optimized out>, rt_afi=AFI_IP) at zebra/zebra_nhg.c:1126
0x000000000065f953 in rib_add_multipath (afi=AFI_IP, safi=<optimized out>, p=0x7ffffb94b370, src_p=0x0, re=0x6070000013d0,
ng=0x7f728908e7fc) at zebra/zebra_rib.c:2616
0x0000000000768f90 in zread_route_add (client=0x61f000000080, hdr=<optimized out>, msg=<optimized out>, zvrf=<optimized out>)
at zebra/zapi_msg.c:1596
0x000000000077c135 in zserv_handle_commands (client=<optimized out>, msg=0x61b000000780) at zebra/zapi_msg.c:2636
0x0000000000575e1f in main (argc=<optimized out>, argv=<optimized out>) at zebra/main.c:309
```
```
(gdb) p *nhg->nexthop
$4 = {next = 0x5488e0, prev = 0x5488e0, vrf_id = 16843009, ifindex = 16843009, type = NEXTHOP_TYPE_IFINDEX, flags = 8 '\b', {gate = {ipv4 = {s_addr = 0},
ipv6 = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}},
bh_type = BLACKHOLE_UNSPEC}, src = {ipv4 = {s_addr = 0}, ipv6 = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0,
0}, __u6_addr32 = {0, 0, 0, 0}}}}, rmap_src = {ipv4 = {s_addr = 0}, ipv6 = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0,
0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}}, resolved = 0x0, rparent = 0x0, nh_label_type = ZEBRA_LSP_NONE, nh_label = 0x0, weight = 1 '\001'}
(gdb) quit
```
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Since we are using a UNIQUE RB tree, we need to handle the
case of adding in a duplicate entry into it.
The list API code returns NULL when a successfull add
occurs, so lets pull that handling further up into
the connected handlers. Then, free the allocated
connected struct if it is a duplicate.
This is a pretty unlikely situation to happen.
Also, pull up the RB handling of _del RB API as well.
This was found with the zapi fuzzing code.
```
==1052840==
==1052840== 200 bytes in 5 blocks are definitely lost in loss record 545 of 663
==1052840== at 0x483BB1A: calloc (vg_replace_malloc.c:762)
==1052840== by 0x48E1008: qcalloc (memory.c:110)
==1052840== by 0x44D357: nhg_connected_new (zebra_nhg.c:73)
==1052840== by 0x44D300: nhg_connected_tree_add_nhe (zebra_nhg.c:123)
==1052840== by 0x44FBDC: depends_add (zebra_nhg.c:1077)
==1052840== by 0x44FD62: depends_find_add (zebra_nhg.c:1090)
==1052840== by 0x44E46D: zebra_nhg_find (zebra_nhg.c:567)
==1052840== by 0x44E1FE: zebra_nhg_rib_find (zebra_nhg.c:1126)
==1052840== by 0x45AD3D: rib_add_multipath (zebra_rib.c:2616)
==1052840== by 0x4977DC: zread_route_add (zapi_msg.c:1596)
==1052840== by 0x49ABB9: zserv_handle_commands (zapi_msg.c:2636)
==1052840== by 0x428B11: main (main.c:309)
```
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Zebra will have special handling for clients with GR enabled.
When client disconnects with GR enabled, then a stale client
will be created and its RIB will be retained till stale timer
or client comes up and updated its RIB.
Co-authored-by: Santosh P K <sapk@vmware.com>
Co-authored-by: Soman K S <somanks@vmware.com>
Signed-off-by: Santosh P K <sapk@vmware.com>
Adding header files changes where structure to hold
received graceful restart info from client is defined.
Also there are changes for show commands where exisiting
commands are extended.
Co-authored-by: Santosh P K <sapk@vmware.com>
Co-authored-by: Soman K S <somanks@vmware.com>
Signed-off-by: Santosh P K <sapk@vmware.com>
Add a config that disables use of kernel-level nexthop ids.
Currently, zebra always uses nexthop ids if the kernel supports
them.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
When we are receiving a kernel route, with an admin distance
of 255 we are not marking it as installed. This route
should be marked as installed.
New behavior:
K>* 4.5.7.0/24 [255/8192] via 192.168.209.1, enp0s8, 00:10:14
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
commit: 0eb97b860d
Removed this chunk of code in zebra:
- if (ifp)
- if (connected_is_unnumbered(ifp))
- SET_FLAG(nexthop->flags, NEXTHOP_FLAG_ONLINK);
Effectively if we had a NEXTHOP_TYPE_IPV4_IFINDEX we would
auto set the onlink flag. This commit dropped it for some reason.
Add it back in an intelligent manner.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
My previous patch to fix a memory leak, caused by not properly freeing
the iptable iface list on stream parse failure, created/exposed a heap
use after free because we were not doing a deep copy
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
With recent changes to the lib nexthop_group
APIs (e1f3a8eb19), we are making
new assumptions that this should be adding a single nexthop
to a group, not a list of nexthops.
This broke the case of a recursive nexthop resolving to a group:
```
D> 2.2.2.1/32 [150/0] via 1.1.1.1 (recursive), 00:00:09
* via 1.1.1.1, dummy1 onlink, 00:00:09
via 1.1.1.2 (recursive), 00:00:09
* via 1.1.1.2, dummy2 onlink, 00:00:09
D> 3.3.3.1/32 [150/0] via 2.2.2.1 (recursive), 00:00:04
* via 1.1.1.1, dummy1 onlink, 00:00:04
K * 10.0.0.0/8 [0/1] via 172.27.227.148, tun0, 00:00:21
```
This group can instead just directly point to the nh that was passed.
Its only being used for a lookup (the memory gets copied and used
elsewhere if the nexthop is not found).
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Make the nexthop_copy/nexthop_dup APIs more consistent by
adding a secondary, non-recursive, version of them. Before,
it was inconsistent whether the APIs were expected to copy
recursive info or not. Make it clear now that the default is
recursive info is copied unless the _no_recurse() version is
called. These APIs are not heavily used so it is fine to
change them for now.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
cb86eba3ab was causing zebra to crash
when handling a nexthop group that had a nexthop which was recursively resolved.
Steps to recreate:
!
nexthop-group red
nexthop 1.1.1.1
nexthop 1.1.1.2
!
sharp install routes 8.8.8.1 nexthop-group red 1
=========================================
==11898== Invalid write of size 8
==11898== at 0x48E53B4: _nexthop_add_sorted (nexthop_group.c:254)
==11898== by 0x48E5336: nexthop_group_add_sorted (nexthop_group.c:296)
==11898== by 0x453593: handle_recursive_depend (zebra_nhg.c:481)
==11898== by 0x451CA8: zebra_nhg_find (zebra_nhg.c:572)
==11898== by 0x4530FB: zebra_nhg_find_nexthop (zebra_nhg.c:597)
==11898== by 0x4536B4: depends_find (zebra_nhg.c:1065)
==11898== by 0x453526: depends_find_add (zebra_nhg.c:1087)
==11898== by 0x451C4D: zebra_nhg_find (zebra_nhg.c:567)
==11898== by 0x4519DE: zebra_nhg_rib_find (zebra_nhg.c:1126)
==11898== by 0x452268: nexthop_active_update (zebra_nhg.c:1729)
==11898== by 0x461517: rib_process (zebra_rib.c:1049)
==11898== by 0x4610C8: process_subq_route (zebra_rib.c:1967)
==11898== Address 0x0 is not stack'd, malloc'd or (recently) free'd
Zebra crashes because we weren't handling the case of the depend nexthop
being recursive.
For this case, we cannot make the function more efficient. A nexthop
could resolve to a group of any size, thus we need allocs/frees.
To solve this and retain the goal of the original patch, we separate out the
two cases so it will still be more efficient if the nexthop is not recursive.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
The only two safi's that are usable for zebra for installation
of routes into the rib are SAFI_UNICAST and SAFI_MULTICAST.
The acceptance of other safi's is causing a memory leak:
Direct leak of 56 byte(s) in 1 object(s) allocated from:
#0 0x5332f2 in calloc (/usr/lib/frr/zebra+0x5332f2)
#1 0x7f594adc29db in qcalloc /opt/build/frr/lib/memory.c:110:27
#2 0x686849 in zebra_vrf_get_table_with_table_id /opt/build/frr/zebra/zebra_vrf.c:390:11
#3 0x65a245 in rib_add_multipath /opt/build/frr/zebra/zebra_rib.c:2591:10
#4 0x7211bc in zread_route_add /opt/build/frr/zebra/zapi_msg.c:1616:8
#5 0x73063c in zserv_handle_commands /opt/build/frr/zebra/zapi_msg.c:2682:2
Collapse
Sequence of events:
Upon vrf creation there is a zvrf->table[afi][safi] data structure
that tables are auto created for. These tables only create SAFI_UNICAST
and SAFI_MULTICAST tables. Since these are the only safi types that
are zebra can actually work on. zvrf data structures also have a
zvrf->otable data structure that tracks in a RB tree other tables
that are created ( say you have routes stuck in any random table
in the 32bit route table space in linux ). This data structure is
only used if the lookup in zvrf->table[afi][safi] fails.
After creation if we pass a route down from an upper level protocol
that has non unicast or multicast safi *but* has the actual
tableid of the vrf we are in, the initial lookup will always
return NULL leaving us to look in the otable. This will create
a data structure to track this data.
If after this event you pass in a second route with the same
afi/safi/table_id, the otable will be created and attempted
to be stored, but the RB_TREE_UNIQ data structure when it sees
this will return the original otable returned and the lookup function
zebra_vrf_get_table_with_table_id will just drop the second otable.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
==25402==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 16 byte(s) in 1 object(s) allocated from:
#0 0x533302 in calloc (/usr/lib/frr/zebra+0x533302)
#1 0x7fee84cdc80b in qcalloc /home/qlyoung/frr/lib/memory.c:110:27
#2 0x5a3032 in create_label_chunk /home/qlyoung/frr/zebra/label_manager.c:188:3
#3 0x5a3c2b in assign_label_chunk /home/qlyoung/frr/zebra/label_manager.c:354:8
#4 0x5a2a38 in label_manager_get_chunk /home/qlyoung/frr/zebra/label_manager.c:424:9
#5 0x5a1412 in hook_call_lm_get_chunk /home/qlyoung/frr/zebra/label_manager.c:60:1
#6 0x5a1412 in lm_get_chunk_call /home/qlyoung/frr/zebra/label_manager.c:81:2
#7 0x72a234 in zread_get_label_chunk /home/qlyoung/frr/zebra/zapi_msg.c:2026:2
#8 0x72a234 in zread_label_manager_request /home/qlyoung/frr/zebra/zapi_msg.c:2073:4
#9 0x73150c in zserv_handle_commands /home/qlyoung/frr/zebra/zapi_msg.c:2688:2
When creating label chunk that has a specified base, we eventually are
calling assign_specific_label_chunk. This function finds the appropriate
list node and deletes it from the lbl_mgr.lc_list but since
the function uses list_delete_node() the deletion function that is
specified for lbl_mgr.lc_list is not called thus dropping the memory.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The vrrpd one conflicts with the standalone vrrpd package; also we're
installing daemons to /usr/lib/frr on some systems so they're not on
PATH.
Signed-off-by: David Lamparter <equinox@diac24.net>
Previous patches introduced various issues:
- Removal of stream_free() to fix double free caused memleak
- Patch for memleak was incomplete
This should fix it hopefully.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
The existing usage of the rta_nest and addattr_nest
functions were not adding the NLA_F_NESTED flag
to the type. As such the new nexthop functionality was
actually looking for this flag, while apparently older
code did not.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add error handling for top level failures (not able to
execute command, unable to find vrf for command, etc.)
With this error handling we add a new zapi message type
of ZEBRA_ERROR used when we are unable to properly handle
a zapi command and pass it down into the lower level code.
In the event of this, we reply with a message of type
enum zebra_error_types containing the error type.
The sent packet will look like so:
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Length | Marker | Version |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| VRF ID |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Command |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| ERROR TYPE |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Also add appropriate hooks for clients to subscribe to for
handling these types of errors.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
There's confusion between the nexthop-group configuration and a
zebra-specific show command. For now, make the zebra show
command string RIB-specific until we're able to unify these
paths.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
=================================================================
==3058==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000010 (pc 0x7f5bf3ef7477 bp 0x7ffdfaa20d40 sp 0x7ffdfaa204c8 T0)
==3058==The signal is caused by a READ memory access.
==3058==Hint: address points to the zero page.
#0 0x7f5bf3ef7476 in memcpy /build/glibc-OTsEL5/glibc-2.27/string/../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:134
#1 0x4d158a in __asan_memcpy (/usr/lib/frr/zebra+0x4d158a)
#2 0x7f5bf58da8ad in stream_put /home/qlyoung/frr/lib/stream.c:605:3
#3 0x67d428 in zsend_ipset_entry_notify_owner /home/qlyoung/frr/zebra/zapi_msg.c:851:2
#4 0x5c70b3 in zebra_pbr_add_ipset_entry /home/qlyoung/frr/zebra/zebra_pbr.c
#5 0x68e1bb in zread_ipset_entry /home/qlyoung/frr/zebra/zapi_msg.c:2465:4
#6 0x68f958 in zserv_handle_commands /home/qlyoung/frr/zebra/zapi_msg.c:2611:3
#7 0x55666d in main /home/qlyoung/frr/zebra/main.c:309:2
#8 0x7f5bf3e5db96 in __libc_start_main /build/glibc-OTsEL5/glibc-2.27/csu/../csu/libc-start.c:310
#9 0x4311d9 in _start (/usr/lib/frr/zebra+0x4311d9)
the ipset->backpointer was NULL as that the hash lookup failed to find
anything. Prevent this crash from happening.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The decoding of _add and _del functions is practically identical
do a bit of work and make them so.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
=================================================================
==13611==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffe9e5c8694 at pc 0x0000004d18ac bp 0x7ffe9e5c8330 sp 0x7ffe9e5c7ae0
WRITE of size 17 at 0x7ffe9e5c8694 thread T0
#0 0x4d18ab in __asan_memcpy (/usr/lib/frr/zebra+0x4d18ab)
#1 0x7f16f04bd97f in stream_get2 /home/qlyoung/frr/lib/stream.c:277:2
#2 0x6410ec in zebra_vxlan_remote_macip_del /home/qlyoung/frr/zebra/zebra_vxlan.c:7718:4
#3 0x68fa98 in zserv_handle_commands /home/qlyoung/frr/zebra/zapi_msg.c:2611:3
#4 0x556add in main /home/qlyoung/frr/zebra/main.c:309:2
#5 0x7f16eea3bb96 in __libc_start_main /build/glibc-OTsEL5/glibc-2.27/csu/../csu/libc-start.c:310
#6 0x431249 in _start (/usr/lib/frr/zebra+0x431249)
This decode is the result of a buffer overflow because we are
not checking ipa_len.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The linux kernel will occassionally send RTM_GETNEIGH when
it expects user space to help in resolution of an ARP entry.
See linux kernel commit:
commit 3e25c65ed085b361cc91a8f02e028f1158c9f255
Author: Tim Gardner <tim.gardner@canonical.com>
Date: Thu Aug 29 06:38:47 2013 -0600
net: neighbour: Remove CONFIG_ARPD
Since we don't care about this, let's just safely ignore this
message for the moment. I imagine in the future we might
care when we implement neighbor managment in the system.
Reported By: Stefan Priebe <s.priebe@profihost.ag>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
There may be logic to prevent this ever happening earlier in the network
read path, but it doesn't hurt to double check it here, because clearly
deeper paths rely on this being the case.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Whatever this BFD re-transmission function is had a few problems.
1. Used memcpy instead of the (more concise) stream APIs, which include
bounds checking.
2. Did not sufficiently check packet sizes.
Actually, 2) is mitigated but is still a problem, because the BFD header
is 2 bytes larger than the "normal" ZAPI header, while the overall
message size remains the same. So if the source message being duplicated
is actually right up against the ZAPI_MAX_PACKET_SIZ, you still can't
fit the whole message into your duplicated message. I have no idea what
the intent was here but at least there's a warning if it happens now.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
- Fix iptable freeing code to free malloc'd list
- malloc iptable in zapi handler and use those functions to free it when
done to fix a linked list memleak
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
We copy a fixed length buffer from the wire but don't ensure it is null
terminated. Then print it as a c-string. Lul.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
further down we hash the src & dst ip, which asserts that the afi is one
of the well known ones, given the field names i assume the correct afis
here are af_inet[6]
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
zebra can catch the kernel's route deletion by netlink.
but current FRR can't delete kernel-route on vrf(l3mdev)
when kernel operator delete the route on out-side of FRR.
It looks problem about kernel-route deletion.
This problem is caused around _nexthop_cmp_no_labels(nh1,nh2)
that checks the each nexthop's member 'vrf_id'.
And _nexthop_cmp_no_labels's caller doesn't set the vrf_id
of nexthop structure. This commit fix that case.
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
router-id is buried deep in "show running-config", this new
command makes it easy to retrieve the user configured router-id.
Example:
# configure terminal
(config)# router-id 1.2.3.4
(config)# end
# show router-id
router-id 1.2.3.4
# configure terminal
(config)# no router-id 1.2.3.4
(config)# end
# show router-id
#
Signed-off-by: Jafar Al-Gharaibeh <jafar@atcorp.com>
We were not setting the RTNH_F_ONLINK flag where appropriate
when creating nexthop objects in the kernel.
Set it on the nhmsg.nh_flags netlink message.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
When we are doing a lookup on an individual nexthop,
we should still be passing along the type that gets passed
via the arguments. Otherwise, we will always think we own that
NHE when in reality anyone could have put that into the
kernel.
Before this patch, nexthops in the kernel will get swepped
out even if we didn't create them.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
We should be NULL checking the entire re->nhe struct, not
the group inside of it. When we get routes from the kernel
using a nexthop group (and future protocols) they will only
pass us an ID to use. Hence, this struct can (and will be)
NULL on first attach when only passed an ID.
There shouldn't be a situation where we have an re->nhe
and don't have an re->nhe->nhg anyway.
Before this patch you can easily make zebra crash by creating a
route in the kernel using a nexthop group and starting zebra.
`ip next add dev lo id 111`
`ip route add 1.1.1.1/32 nhid 111`
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Older versions of protobuf-c do not support version 3 of the
protocol. Add a check into the system to see if we have
version 3 available and if so, compile it in.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
If you compile FRR with no j factor zebra_mlag.c fails to
build because the vtysh extraction methodology runs first
before the protobuf compiler runs and that compilation does
not have the proper dependancy chain built for the inclusions
that zebra_mlag.c had. Moving the DEF* code into a zebra_mlag_vty.c
which can be included in the vtysh extraction code and has
no mlag.proto dependancies makes the compilation work better.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Handle the special case where a route update contains
no installed nexthops - that means the route is not
installed.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
This is pretty much just to get rid of the HAVE_CUMULUS. The
hook/module API is as "wtf" as it was before...
Signed-off-by: David Lamparter <equinox@diac24.net>
Add an api that creates a copy of a list of nexthops and
enforces the canonical sort ordering; consolidate some nhg
code to avoid copy-and-paste. The zebra dplane uses
that api when a plugin sets up a list of nexthops, ensuring
that the plugin's list is ordered when it's processed in
zebra.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
The processing of dataplane route notifications was a little
off-target after the nexthop-group re-work. This should allow
notifications to work better.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Linux has the idea of allowing a weight to be sent
down as part of a nexthop group to allow the kernel
to weight particular nexthop paths a bit more or less
than others.
See:
http://tldp.org/HOWTO/Adv-Routing-HOWTO/lartc.rpdb.multiple-links.html
Allow for installation into the kernel using the weight attribute
associated with the nexthop.
This code is foundational in that it just sets up the ability
to do this, we do not use it yet. Further commits will
allow for the pass through of this data from upper level protocols.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Use a per-nexthop flag to indicate the presence of labels; add
some utility zapi encode/decode apis for nexthops; use the zapi
apis more consistently.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Use correct state/flags when installing EVPN macs; when we
converted from raw netlink to the zebra dataplane, a state value
got lost.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
The flags can be important - like "threaded" - so we need to
actually capture them when plugins are registered.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Replace the existing list of nexthops (via a nexthop_group
struct) in the route_entry with a direct pointer to zebra's
new shared group (from zebra_nhg.h). This allows more
direct access to that shared group and the info it carries.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Problem reported by testing agency that RFC4861 section 6.2.5
states that a router should send an RA with a lifetime of 0
before ceasing to send RAs on the interface, or when the interace
is shutdown, or the router is shutdown. This fix adds that capability.
Ticket: CM-27061
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
For SR-TE we'll need to create Binding-SIDs which are essentially
LSPs that can push multiple outgoing labels. This commit sets the
groundwork for that. Luckily the netlink code didn't need to be
changed since it already supports pushing label stacks.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
The openfabric daemon has a longer name than anticipated for
`show zebra client summary` adjust to allow it to fit without
making columns all blomped.
Before:
robot# show zebra client summ
Name Connect Time Last Read Last Write IPv4 Routes IPv6 Routes
--------------------------------------------------------------------------------
static 00:00:06 00:00:06 00:00:06 4/0 0/0
openfabric 00:00:06 00:00:06 00:00:06 0/0 0/0
After:
[sharpd@robot frr4]$ vtysh -c "show zebra client summ"
Name Connect Time Last Read Last Write IPv4 Routes IPv6 Routes
--------------------------------------------------------------------------------
static 00:02:16 00:02:16 00:02:16 4/0 0/0
openfabric 00:02:16 00:02:16 00:02:16 0/0 0/0
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Problem reported by testing facility that our sending of Router
Advertisements more frequently than once very three seconds is not
compliant with rfc4861. Added a knob to turn off fast retransmits
in order to meet the requirement of the RFC.
Ticket: CM-27063
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Macvlan down event have sentinel check of its parent
link presence.
Ticket:CM-26622
Reviewed By:CCR-9326
Testing Done:
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
"show vrf vni" and "show evpn vni <l3vni>" commands
need to display correct router mac value.
"show evpn vni <l3vni>" detail l3vni needs to display
system mac as in PIP scenario value can be different.
Syste MAC would be derived from SVI interface MAC wherelse
Router MAC would be derived from macvlan interface MAC value.
Ticket:CM-26710
Reviewed By:CCR-9334
Testing Done:
TORC11# show evpn vni 4001
VNI: 4001
Type: L3
Tenant VRF: vrf1
Local Vtep Ip: 36.0.0.11
Vxlan-Intf: vx-4001
SVI-If: vlan4001
State: Up
VNI Filter: none
System MAC: 00:02:00:00:00:2e
Router MAC: 44:38:39:ff:ff:01
L2 VNIs: 1000
TORC11# show vrf vni
VRF VNI VxLAN IF L3-SVI State Rmac
vrf1 4001 vx-4001 vlan4001 Up 44:38:39:ff:ff:01
TORC11# show evpn vni 4001 json
{
"vni":4001,
"type":"L3",
"localVtepIp":"36.0.0.11",
"vxlanIntf":"vx-4001",
"sviIntf":"vlan4001",
"state":"Up",
"vrf":"vrf1",
"sysMac":"00:02:00:00:00:2e",
"routerMac":"44:38:39:ff:ff:01",
"vniFilter":"none",
"l2Vnis":[
1000,
]
}
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
macvlan interface up/down event triggers
bgp to send updates for evpn routes
with changed RMAC and nexthop IP values.
Ticket:CM-26190
Reviewed By:
Testing Done:
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
By default announct Self Type-2 routes with
system IP as nexthop and system MAC as
nexthop.
An API to check type-2 is self route via
checking ipv4/ipv6 address from connected interfaces list.
An API to extract RMAC and nexthop for type-2
routes based on advertise-svi-ip knob is enabled.
When advertise-pip is enabled/disabled, trigger type-2
route update. For self type-2 routes to use
anycast or individual (rmac, nexthop) addresses.
Ticket:CM-26190
Reviewed By:
Testing Done:
Enable 'advertise-svi-ip' knob in bgp default instance.
the vrf instance svi ip is advertised with nexthop
as default instance router-id and RMAC as system MAC.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Extract mac-vlan interface mac when a l3vni add is sent to bgp
Per L3VNI maintain vrr interface.
An api to extract vrr mac address from a vlan id, associated
master svi device.
When a l3vni operational up event is sent to bgpd,
extract vrr rmac along with svi rmac.
Ticket:CM-26190
Reviewed By:
Testing Done:
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Zebra MLAG is using "t_read" for multiple tasks, such as
1. For opening Communication channel with MLAG
2. In case conncetion fails, same event is used for retries
3. after the connection establishment, same event is used to
read the data from MLAG
since all these taks will never schedule together, this will not
cause any issues.
Signed-off-by: Satheesh Kumar K <sathk@cumulusnetworks.com>
edge-2> show evpn vni detail json
{
"vni":79031,
"type":"L3",
...,
...
} <<<<<< no comma
{
"vni":79021,
"type":"L3",
...,
...
} <<<<<< no comma
{
} <<<<<< blank
edge-2>
The fix is to pack json info into json_array before printing it.
Signed-off-by: Lakshman Krishnamoorthy <lkrishnamoor@vmware.com>
Apparently the multipath_num functionatlity has been broken
for a while because we were ignoring the recusive nexthops
when marking them inactive based on it.
This sets them as inactive as well if the parent breaks it.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
We were re-counting the entire group's active number on
every iteration of this nexthop_active_update() loop.
This is not great from a performance perspective but also
it was failing to properly mark things according to the
specified multipath_num.
Since a nexthop is set as active before this check, if its == to
the set ecmp, it gets marked inactive even though if its
under the max ecmp wanted!
ex)
set ecmp to 1.
`/usr/lib/frr/zebra -e 1`
All kernel routes will be marked inactive even with just one nexthop!
K 1.1.1.1/32 [0/0] is directly connected, dummy1 inactive, 00:00:10
K 1.1.1.2/32 [0/0] is directly connected, dummy2 inactive, 00:00:10
K 1.1.1.3/32 [0/0] is directly connected, dummy3 inactive, 00:00:10
K 1.1.1.4/32 [0/0] is directly connected, dummy4 inactive, 00:00:10
K 1.1.1.5/32 [0/0] is directly connected, dummy5 inactive, 00:00:10
K 1.1.1.6/32 [0/0] is directly connected, dummy6 inactive, 00:00:10
K 1.1.1.7/32 [0/0] is directly connected, dummy7 inactive, 00:00:10
K 1.1.1.8/32 [0/0] is directly connected, dummy8 inactive, 00:00:10
K 1.1.1.9/32 [0/0] is directly connected, dummy9 inactive, 00:00:10
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Clean up the relationships between zebra's rib and nexthop-group
headers as prep for adding a nexthop-group pointer to the
route_entry.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
On BSD systems null routes were not being installed into the
kernel. This is because commit 08ea27d112
introduced a bug where we were attempting to use the wrong
prefix afi types and as such we were going down the v6 code path.
test27.lab.netdef.org# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued route, r - rejected route
K>* 0.0.0.0/0 [0/0] via 192.168.122.1, 00:00:23
S>* 4.5.6.8/32 [1/0] unreachable (blackhole), 00:00:11
C>* 192.168.122.0/24 [0/1] is directly connected, vtnet0, 00:00:23
test27.lab.netdef.org# exit
[ci@test27 ~/frr]$ netstat -rn
Routing tables
Internet:
Destination Gateway Flags Netif Expire
default 192.168.122.1 UGS vtnet0
4.5.6.8/32 127.0.0.1 UG1B lo0
127.0.0.1 link#2 UH lo0
192.168.122.0/24 link#1 U vtnet0
192.168.122.108 link#1 UHS lo0
Fixes: #4843
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The code for when a new vrf is created to properly handle
router advertisement for it is messed up in several ways:
1) Generation of the zrouter data structure should set the rtadv
socket to -1 so that we don't accidently close someone elses
open file descriptor
2) When you created a new zvrf instance *after* bootup we are XCALLOC'ing
the data structure so the zvrf->fd was 0. The shutdown code was looking
for the >= 0 to know if the fd existed (since fd 0 is valid!)
This sequence of events would cause zebra to consume 100% of the
cpu:
Run zebra by itself ( no other programs )
ip link add vrf1 type vrf table 1003
ip link del vrf vrf1
vtysh -c "configure" -c "no interface vrf1"
This commit fixes this issue.
Fixes: #5376
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When we shut down zebra, we were not doing anything to shut
down the FPM. Perform the necessary occult rituals and
stop the threads from running during early shutdown.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
We should be setting the ns->info pointer to NULL when we free
what it points to. Just use XFREE directly on the void * pointer
to do this.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
We were not connecting the default zebra_ns to the default
ns->info at namespace initialization in zebra. Thus, when
we tried to use the `ns_walk_func()` it would ignore the
default zebra_ns since there is no pointer to it from the
ns struct.
Fix this by connecting them in `zebra_ns_init()` and,
if the default ns is not found, exit with failure
since this is not recoverable.
This was found during a crash where we fail to cancel the kernel_read
thread at termination (via the `ns_walk_func()`) and then we
get a netlink notification trying to use the zns struct that has
already been freed.
```
(gdb) bt
\#0 0x00007fc1134dc7bb in raise () from /lib/x86_64-linux-gnu/libc.so.6
\#1 0x00007fc1134c7535 in abort () from /lib/x86_64-linux-gnu/libc.so.6
\#2 0x00007fc113996f8f in core_handler (signo=11, siginfo=0x7ffe5429d070, context=<optimized out>) at lib/sigevent.c:254
\#3 <signal handler called>
\#4 0x0000561880e15449 in if_lookup_by_index_per_ns (ns=0x0, ifindex=174) at zebra/interface.c:269
\#5 0x0000561880e1642c in if_up (ifp=ifp@entry=0x561883076c50) at zebra/interface.c:1043
\#6 0x0000561880e10723 in netlink_link_change (h=0x7ffe5429d8f0, ns_id=<optimized out>, startup=<optimized out>) at zebra/if_netlink.c:1384
\#7 0x0000561880e17e68 in netlink_parse_info (filter=filter@entry=0x561880e17680 <netlink_information_fetch>, nl=nl@entry=0x561882497238, zns=zns@entry=0x7ffe542a5940,
count=count@entry=5, startup=startup@entry=0) at zebra/kernel_netlink.c:932
\#8 0x0000561880e186a5 in kernel_read (thread=<optimized out>) at zebra/kernel_netlink.c:406
\#9 0x00007fc1139a4416 in thread_call (thread=thread@entry=0x7ffe542a5b70) at lib/thread.c:1599
\#10 0x00007fc113974ef8 in frr_run (master=0x5618823c9510) at lib/libfrr.c:1024
\#11 0x0000561880e0b916 in main (argc=8, argv=0x7ffe542a5f78) at zebra/main.c:483
```
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
1. add the Mlag ProtoBuf Lib to Zebra Compilation
2. Encode the messages with protobuf before writing to MLAG
3. Decode the MLAG Messages using protobuf and write to clients
based on their subscrption.
Signed-off-by: Satheesh Kumar K <sathk@cumulusnetworks.com>
This includes:
1. Processing client Registrations for MLAG
2. storing client Interests for MLAG updates
3. Opening communication channel to MLAG with First client reg
4. Closing Communication channel with last client De-reg
5. Spawning a new thread for handling MLAG updates peocessing
6. adding Test code
7. advertising MLAG Updates to clients based on their interests
Signed-off-by: Satheesh Kumar K <sathk@cumulusnetworks.com>