Commit Graph

160 Commits

Author SHA1 Message Date
Mark Stapp
f924db4961 zebra: fix some coverity SA warnings
Fix several coverity scan warnings.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-04-14 07:44:54 -04:00
Mark Stapp
0328a5bd0d zebra: don't include backup nhs in main nhe dependency tree
We don't want to install backup nexthops - yet - as part of the
nexthop-id-based kernel interactions on netlink platforms. Avoid
mixing backup and primary nexthops in the tree of dependencies
in the ecmp cases.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-03-27 11:50:03 -04:00
Mark Stapp
377e29f7e7 zebra: handle backup nexthops in nhe/nhgs
Include backup nexthops in nhe processing; connect incoming
zapi route data with updated rib/nhg apis; add more debugs in
nhg processing.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-03-27 11:50:03 -04:00
Mark Stapp
1d48702ede zebra: add per-nexthop backup index
Use a backup index in a nexthop directly (if it has a backup
nexthop); revise the zebra nhe/nhg code; revise zapi route
decoding to match; revise the dataplane route datastructs.

Refactor some of the rib_add_multipath code to be prepared to
be called with an nhe, carrying nexthop and (possibly) backup
info together.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-03-27 11:50:03 -04:00
Stephen Worley
d43122b58f zebra: break if duplicate nexthop found in nhe2grp
If we find that a nexthop is a duplicate, break immediately
rather than continuing to look through the rest of the list.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-03-26 10:57:45 -04:00
Stephen Worley
086e4e02f5 zebra: properly set the NEXTHOP_GROUP_VALID flag
Properly set the NEXTHOP_GROUP_VALID flag and use it
as a conditional for installation decisions for individual
nexthop and groups containing it.

We set the NEXTHOP_GROUP_VALID flag it is:

1) A fully resolved active nexthop
or
2) Its a group that contains at least one VALID NHE

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-03-26 10:48:16 -04:00
Stephen Worley
715e5c70d5 zebra: set valid on re->nhe directly in nexthop_active_update()
We were still doing a lookup on the nhe_id from before we
started referencing re->nhe directly.

Change set flag to just use re->nhe directly here since they
should always be the same at this point in the code anyway.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-03-26 10:48:15 -04:00
Stephen Worley
b1c3f7ef80 zebra: add debug for duplicate NH in dataplane array conversion
When we find a nexthop ID thats a duplicate in the code that converts
NHG rb trees into a flat list of nexthop IDs for the dataplane,
output a debug message.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-03-26 10:48:15 -04:00
Stephen Worley
1866b3afc2 zebra: don't add ID to kernel nh_grp if not installed/queued
When we transform the nexthop group rb trees into a flat
array of IDs to send into the dataplane code (zebra_nhg_nhe2grp),
don't put an ID in there that has not been in installed or is
not currently queued to be installed into the dataplane.

Otherwise, if some of the nexthops fail to install, we will
still try to create a group with them and then the entire group
will fail.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-03-26 10:48:15 -04:00
Stephen Worley
497ff5792f zebra: handle NHG in NHG dataplane group conversion
We were not properly handling the case of a NHG inside of
another NHG when converting the rb tree of a multilevel NHG
into a flat list of IDs. When constructing, we call the function
zebra_nhg_nhe2grp_internal() recursively so that the rare
case of a group within a group is handled such that its
singleton nexthops are appended to the grp array of IDs
we send to the dataplane code.

Ex)

1:
	-> 2:
		-> 3
		-> 4
	->5:
		->6

becomes this:

1:
	->3
	->4
	->6

when its sent to the dataplane code for final kernel installation.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-03-26 10:48:15 -04:00
David Lamparter
d6951e5ef9 *: remove tabs from log messages
Some logging systems are, er, "allergic" to tabs in log messages.
(RFC5424: "The syslog application SHOULD avoid octet values below 32")

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2020-03-24 18:47:12 +01:00
Ruben Kerkhof
99e7ab12cf zebra: use modern C function definition
And also remove an assignment without effect while we're here.

Signed-off-by: Ruben Kerkhof <ruben@rubenkerkhof.com>
2020-03-11 14:06:34 +01:00
Donald Sharp
0752c8d8a4 zebra: nhg->nexthop is not NULL
We have already asserted on nhg->nexthop an if statement
to flog_err makes no sense.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-03-06 16:37:19 -05:00
Donald Sharp
5e81f5dd1a *: Finish off the __PRETTY_FUNCTION__ to __func__
FINISH IT

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-03-06 09:23:22 -05:00
Donatas Abraitis
15569c58f8 *: Replace __PRETTY_FUNCTION__/__FUNCTION__ to __func__
Just keep the code cool.

Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2020-03-05 20:23:23 +02:00
Stephen Worley
fc8a02c45f zebra: trust directly connected kernel/system routes
We made the decision to explicitly trust kernel and system routes
of every other type with 058c16b7e2.

So, we should trust directly connected routes the same way, assuming
the interface exists.

Old Behavior:

K   2.2.2.1/32 [0/0] is directly connected, unknown inactive, 00:00:39

New Behavior:

K>* 2.2.2.1/32 [0/0] is directly connected, test1, 00:00:03

As a bonus, this fixes the issues we were seeing with not removing
directly connected routes of certain interface types when
those interfaces go down/are deleted.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-03-02 13:45:40 -05:00
Mark Stapp
c415d89528 zebra: Embed lib nexthop-group in zebra hash entry
Embed nexthop-group, which is just a pointer, in the zebra
nexthop-hash-entry object, rather than mallocing one.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-02-27 15:49:31 -05:00
Renato Westphal
ecaeb3b697
Merge pull request #5750 from qlyoung/fix-null-after-xfree
*: don't null after XFREE; XFREE does this itself
2020-02-05 01:49:08 -03:00
Russ White
c7a754408e
Merge pull request #5746 from donaldsharp/bgp_sa
Coverioty sa stuff
2020-02-04 11:24:08 -05:00
Russ White
05d0c66d8f
Merge pull request #5737 from mjstapp/zebra_disable_kern_nhs
zebra: add config to disable use of kernel nexthops
2020-02-04 08:12:34 -05:00
Donald Sharp
9275682559 zebra: top has already been derefed
The top variable has already been derefed by the time we get
to the test to see if it is non-NULL.  No need to check it.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-02-04 08:10:52 -05:00
Quentin Young
b3ba5dc7fe *: don't null after XFREE; XFREE does this itself
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2020-02-03 11:22:13 -05:00
Donald Sharp
88cafda739 zebra: nexthop groups vrf's are only a function of namespaces
Nexthop groups as a whole do not make sense to have a vrf'ness
As that you can have a arbitrary number of nexthops that point
to separate vrf's.

Modify the code to make this distinction, by clearly delineating
the line between the nhg and the nexthop a bit better.
Nexthop groups having a vrf_id only make sense if you are using
network namespaces to represent them.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-01-31 08:45:51 -05:00
Stephen Worley
a7e1b02d4a zebra: add null check before connecting recursive depend
Add a null check in `handle_recursive_depend()` so it
doesn't try to add a NULL pointer to the RB tree.

This was found with clang SA.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-01-30 17:15:06 -05:00
Stephen Worley
5bf15faa19 zebra: don't created connected if duplicate depend
Since we are using a UNIQUE RB tree, we need to handle the
case of adding in a duplicate entry into it.

The list API code returns NULL when a successfull add
occurs, so lets pull that handling further up into
the connected handlers. Then, free the allocated
connected struct if it is a duplicate.

This is a pretty unlikely situation to happen.

Also, pull up the RB handling of _del RB API as well.

This was found with the zapi fuzzing code.

```
==1052840==
==1052840== 200 bytes in 5 blocks are definitely lost in loss record 545 of 663
==1052840==    at 0x483BB1A: calloc (vg_replace_malloc.c:762)
==1052840==    by 0x48E1008: qcalloc (memory.c:110)
==1052840==    by 0x44D357: nhg_connected_new (zebra_nhg.c:73)
==1052840==    by 0x44D300: nhg_connected_tree_add_nhe (zebra_nhg.c:123)
==1052840==    by 0x44FBDC: depends_add (zebra_nhg.c:1077)
==1052840==    by 0x44FD62: depends_find_add (zebra_nhg.c:1090)
==1052840==    by 0x44E46D: zebra_nhg_find (zebra_nhg.c:567)
==1052840==    by 0x44E1FE: zebra_nhg_rib_find (zebra_nhg.c:1126)
==1052840==    by 0x45AD3D: rib_add_multipath (zebra_rib.c:2616)
==1052840==    by 0x4977DC: zread_route_add (zapi_msg.c:1596)
==1052840==    by 0x49ABB9: zserv_handle_commands (zapi_msg.c:2636)
==1052840==    by 0x428B11: main (main.c:309)
```

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-01-30 17:15:05 -05:00
Mark Stapp
7c99d51beb zebra: add config to disable use of kernel nexthops
Add a config that disables use of kernel-level nexthop ids.
Currently, zebra always uses nexthop ids if the kernel supports
them.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-01-28 11:00:42 -05:00
Mark Stapp
d26e2d9be4
Merge pull request #5600 from sworleys/NHG-Depend-Crash
zebra: can't improve efficiency for recursive depends
2020-01-15 16:31:55 -05:00
Mark Stapp
a67b69c024
Merge pull request #5616 from sworleys/NHG-Fix-Recurse-to-Group
zebra: just set nexthop member in handle_recursive_depend()
2020-01-15 16:26:06 -05:00
Stephen Worley
1d049aba72 zebra: just set nexthop member in handle_recursive_depend()
With recent changes to the lib nexthop_group
APIs (e1f3a8eb19), we are making
new assumptions that this should be adding a single nexthop
to a group, not a list of nexthops.

This broke the case of a recursive nexthop resolving to a group:

```
D>  2.2.2.1/32 [150/0] via 1.1.1.1 (recursive), 00:00:09
  *                      via 1.1.1.1, dummy1 onlink, 00:00:09
                       via 1.1.1.2 (recursive), 00:00:09
  *                      via 1.1.1.2, dummy2 onlink, 00:00:09
D>  3.3.3.1/32 [150/0] via 2.2.2.1 (recursive), 00:00:04
  *                      via 1.1.1.1, dummy1 onlink, 00:00:04
K * 10.0.0.0/8 [0/1] via 172.27.227.148, tun0, 00:00:21
```

This group can instead just directly point to the nh that was passed.
Its only being used for a lookup (the memory gets copied and used
elsewhere if the nexthop is not found).

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-01-15 13:35:29 -05:00
Stephen Worley
77bf9504bf lib,zebra: tighten up the nexthop_copy/nexthop_dup APIs
Make the nexthop_copy/nexthop_dup APIs more consistent by
adding a secondary, non-recursive, version of them. Before,
it was inconsistent whether the APIs were expected to copy
recursive info or not. Make it clear now that the default is
recursive info is copied unless the _no_recurse() version is
called. These APIs are not heavily used so it is fine to
change them for now.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-01-15 13:35:04 -05:00
Stephen Worley
0fff714efa zebra: can't improve efficiency for recursive depends
cb86eba3ab was causing zebra to crash
when handling a nexthop group that had a nexthop which was recursively resolved.

Steps to recreate:

!
nexthop-group red
 nexthop 1.1.1.1
 nexthop 1.1.1.2
!

sharp install routes 8.8.8.1 nexthop-group red 1

=========================================
==11898== Invalid write of size 8
==11898==    at 0x48E53B4: _nexthop_add_sorted (nexthop_group.c:254)
==11898==    by 0x48E5336: nexthop_group_add_sorted (nexthop_group.c:296)
==11898==    by 0x453593: handle_recursive_depend (zebra_nhg.c:481)
==11898==    by 0x451CA8: zebra_nhg_find (zebra_nhg.c:572)
==11898==    by 0x4530FB: zebra_nhg_find_nexthop (zebra_nhg.c:597)
==11898==    by 0x4536B4: depends_find (zebra_nhg.c:1065)
==11898==    by 0x453526: depends_find_add (zebra_nhg.c:1087)
==11898==    by 0x451C4D: zebra_nhg_find (zebra_nhg.c:567)
==11898==    by 0x4519DE: zebra_nhg_rib_find (zebra_nhg.c:1126)
==11898==    by 0x452268: nexthop_active_update (zebra_nhg.c:1729)
==11898==    by 0x461517: rib_process (zebra_rib.c:1049)
==11898==    by 0x4610C8: process_subq_route (zebra_rib.c:1967)
==11898==  Address 0x0 is not stack'd, malloc'd or (recently) free'd

Zebra crashes because we weren't handling the case of the depend nexthop
being recursive.

For this case, we cannot make the function more efficient. A nexthop
could resolve to a group of any size, thus we need allocs/frees.

To solve this and retain the goal of the original patch, we separate out the
two cases so it will still be more efficient if the nexthop is not recursive.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-01-15 13:35:04 -05:00
Donald Sharp
946de1b95a bgpd, ospfd, zebra: Do not use 0 as VRF_DEFAULT
Explicitly spell out what we are trying to do.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-01-15 08:29:36 -05:00
Mark Stapp
cb86eba3ab zebra: improve efficiency of depends_find()
Do less malloc and free in depends_find(), when looking for
a singleton nexthop in the nhg hash.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-12-18 15:34:37 -05:00
Stephen Worley
b10d6b0744 zebra: pass type when finding individual nexthop
When we are doing a lookup on an individual nexthop,
we should still be passing along the type that gets passed
via the arguments. Otherwise, we will always think we own that
NHE when in reality anyone could have put that into the
kernel.

Before this patch, nexthops in the kernel will get swepped
out even if we didn't create them.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-12-16 16:46:30 -05:00
Donald Sharp
df7fb5800b lib, zebra: Allow for installation of a weighted nexthop
Linux has the idea of allowing a weight to be sent
down as part of a nexthop group to allow the kernel
to weight particular nexthop paths a bit more or less
than others.

See:
http://tldp.org/HOWTO/Adv-Routing-HOWTO/lartc.rpdb.multiple-links.html

Allow for installation into the kernel using the weight attribute
associated with the nexthop.

This code is foundational in that it just sets up the ability
to do this, we do not use it yet.  Further commits will
allow for the pass through of this data from upper level protocols.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-12-09 13:37:37 -05:00
Donald Sharp
e302caaa81
Merge pull request #5416 from mjstapp/re_nhe_pointer
lib,zebra: use shared nexthop-group in route_entry
2019-12-04 14:11:04 -05:00
Mark Stapp
0eb97b860d lib,zebra: use nhg_hash_entry pointer in route_entry
Replace the existing list of nexthops (via a nexthop_group
struct) in the route_entry with a direct pointer to zebra's
new shared group (from zebra_nhg.h). This allows more
direct access to that shared group and the info it carries.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-12-04 08:13:52 -05:00
Donatas Abraitis
d79368d3a5
Merge pull request #5192 from donaldsharp/zebra_rejection
zebra: Dissallow a /32 or /128 through itself
2019-12-03 09:29:50 +02:00
Stephen Worley
4c55b5ff6b zebra: Set resolved inactive when > multipath_num
Apparently the multipath_num functionatlity has been broken
for a while because we were ignoring the recusive nexthops
when marking them inactive based on it.

This sets them as inactive as well if the parent breaks it.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-11-21 16:28:31 -05:00
Stephen Worley
08de78b876 zebra: Use curr_active to check multipath_num
We were re-counting the entire group's active number on
every iteration of this nexthop_active_update() loop.

This is not great from a performance perspective but also
it was failing to properly mark things according to the
specified multipath_num.

Since a nexthop is set as active before this check, if its == to
the set ecmp, it gets marked inactive even though if its
under the max ecmp wanted!

ex)

set ecmp to 1.
`/usr/lib/frr/zebra -e 1`

All kernel routes will be marked inactive even with just one nexthop!

K   1.1.1.1/32 [0/0] is directly connected, dummy1 inactive, 00:00:10
K   1.1.1.2/32 [0/0] is directly connected, dummy2 inactive, 00:00:10
K   1.1.1.3/32 [0/0] is directly connected, dummy3 inactive, 00:00:10
K   1.1.1.4/32 [0/0] is directly connected, dummy4 inactive, 00:00:10
K   1.1.1.5/32 [0/0] is directly connected, dummy5 inactive, 00:00:10
K   1.1.1.6/32 [0/0] is directly connected, dummy6 inactive, 00:00:10
K   1.1.1.7/32 [0/0] is directly connected, dummy7 inactive, 00:00:10
K   1.1.1.8/32 [0/0] is directly connected, dummy8 inactive, 00:00:10
K   1.1.1.9/32 [0/0] is directly connected, dummy9 inactive, 00:00:10

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-11-21 15:27:12 -05:00
Mark Stapp
5463ce26c3 zebra: clean up rib and nhg headers
Clean up the relationships between zebra's rib and nexthop-group
headers as prep for adding a nexthop-group pointer to the
route_entry.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-11-21 15:05:52 -05:00
Russ White
943de56af6
Merge pull request #5241 from sworleys/SA-NHG
One More Zebra NHG SA Fix and nhg_ctx API Adjustment
2019-11-19 11:44:15 -05:00
Stephen Worley
7c6d5f255e zebra: Put freeing code in nhg_ctx_free()
Put the code to free the data held by a nhg_ctx
in nhg_ctx_free() as well. We do it similiarly for
the dplane_ctx.

Let nhg_ctx_fini() be any other routines that need to
be handled before freeing.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-11-12 10:29:16 -05:00
Stephen Worley
606fa9e58d zebra: handle depends_find() NULL nexthop
SA warned us lookup could be NULL dereferenced in some
paths. Handle the case where we are passed a NULL
nexthop before we try to copy it.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-11-12 10:28:46 -05:00
Stephen Worley
148813c22a zebra: zebra_nhg check each nexthop for active, not just number
We were only checking that two nhg_hash_entry's were equal
based on the active nexthop NUMBER. This is not sufficient in
special cases where whats active with one route using it,
might not be active with the other. We can see this with
routes trying to resolve to themselves.

Ex)

1.1.1.0/24
	-> 1.1.1.1 dummy1 (inactive)
	-> 1.1.1.2 dummy2

1.1.2.0/24
	-> 1.1.1.1 dummy1
	-> 1.1.1.2 dummy1 (inactive)

Without checking each nexthop individually, they will
hash to the same group since they have the same number of
active nexthops.

Fix this by looping over every nexthop for each nhe (they should
be sorted) and checking if the NEXTHOP_FLAG_ACTIVE flag's match.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-11-12 01:24:39 -05:00
Donald Sharp
7134ba7060 zebra: Fix some nhg SA issues found in latest Coverity
Fix 2 Coverity issues:
1) zebra_nhg.c -> all paths in nhg_ctx_process_finish have
already deref'ed the ctx pointer no need for a test of it

2) the **ifp pointer passed in may be NULL.  Prevent an accidental
deref if calling function does not pass in a ifp pointer.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-10-28 20:30:06 -04:00
Stephen Worley
5948f013ba zebra: Cleanup zebra_nhg APIs
Add a private header file for functions that are internal/special
case like how we do it for `lib/nexthop_group_private.h`.

Remove a bunch of functions from the header file only being used
statically and add some comments for those remaining to indicate
better what their use is.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:44 -04:00
Stephen Worley
80286aa564 zebra: Re-work zebra_nhg_*_valid APIs
Re-work the validity setting and checking APIs
for nhg_hash_entry's to make them clearer.

Further, they were originally only beings set
on ifdown and install. Extended their use into
releasing entries and to account for setting
the validity of a recursive dependent.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:44 -04:00
Stephen Worley
e1292378e2 zebra: Improve commenting for group requeue case
The commenting for why we would need to requeue a
group from the kernel to be later processed was not
sufficient. Add a better explanation for the flow
and state of the system.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:44 -04:00
Stephen Worley
c1da832a94 zebra: Change wording of duplicate kernel nhg flag
Change the wording of the flag indicating we have received
a nexthop group from the kernel with a different ID but
is fundamentally identical to one we already have.

It was colliding with a flag of similar name in the nexthop struct.

Change it from NEXTHOP_GROUP_DUPLICATE -> NEXTHOP_GROUP_UNHASHABLE
since it is in fact unhashable.

Also change the wording of functions and comments referencing the same
problem.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:44 -04:00