Commit Graph

213 Commits

Author SHA1 Message Date
Mark Stapp
6bc5d97795 zebra: prefer outer label_type for recursive nexthops
When resolving a recursive nexthop, prefer the "outer"
label type, if present.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-05-12 14:27:02 -04:00
Donald Sharp
630d596249 zebra: Remove typedef rib_table_info_t from system
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-05-08 08:10:49 -04:00
Donald Sharp
5cfaa2d92b zebra: Loosen ONLINK restrictions a tiny bit
Loosen the ONLINK restrictions such that when an upper
level protocol sends us a nexthop with an ONLINK attribute
just ensure that interface is up and usable.  ONLINK effectively
means we know what we are doing to the kernel.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-05-06 10:15:41 -04:00
Mark Stapp
f924db4961 zebra: fix some coverity SA warnings
Fix several coverity scan warnings.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-04-14 07:44:54 -04:00
Mark Stapp
0328a5bd0d zebra: don't include backup nhs in main nhe dependency tree
We don't want to install backup nexthops - yet - as part of the
nexthop-id-based kernel interactions on netlink platforms. Avoid
mixing backup and primary nexthops in the tree of dependencies
in the ecmp cases.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-03-27 11:50:03 -04:00
Mark Stapp
377e29f7e7 zebra: handle backup nexthops in nhe/nhgs
Include backup nexthops in nhe processing; connect incoming
zapi route data with updated rib/nhg apis; add more debugs in
nhg processing.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-03-27 11:50:03 -04:00
Mark Stapp
1d48702ede zebra: add per-nexthop backup index
Use a backup index in a nexthop directly (if it has a backup
nexthop); revise the zebra nhe/nhg code; revise zapi route
decoding to match; revise the dataplane route datastructs.

Refactor some of the rib_add_multipath code to be prepared to
be called with an nhe, carrying nexthop and (possibly) backup
info together.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-03-27 11:50:03 -04:00
Stephen Worley
d43122b58f zebra: break if duplicate nexthop found in nhe2grp
If we find that a nexthop is a duplicate, break immediately
rather than continuing to look through the rest of the list.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-03-26 10:57:45 -04:00
Stephen Worley
086e4e02f5 zebra: properly set the NEXTHOP_GROUP_VALID flag
Properly set the NEXTHOP_GROUP_VALID flag and use it
as a conditional for installation decisions for individual
nexthop and groups containing it.

We set the NEXTHOP_GROUP_VALID flag it is:

1) A fully resolved active nexthop
or
2) Its a group that contains at least one VALID NHE

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-03-26 10:48:16 -04:00
Stephen Worley
715e5c70d5 zebra: set valid on re->nhe directly in nexthop_active_update()
We were still doing a lookup on the nhe_id from before we
started referencing re->nhe directly.

Change set flag to just use re->nhe directly here since they
should always be the same at this point in the code anyway.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-03-26 10:48:15 -04:00
Stephen Worley
b1c3f7ef80 zebra: add debug for duplicate NH in dataplane array conversion
When we find a nexthop ID thats a duplicate in the code that converts
NHG rb trees into a flat list of nexthop IDs for the dataplane,
output a debug message.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-03-26 10:48:15 -04:00
Stephen Worley
1866b3afc2 zebra: don't add ID to kernel nh_grp if not installed/queued
When we transform the nexthop group rb trees into a flat
array of IDs to send into the dataplane code (zebra_nhg_nhe2grp),
don't put an ID in there that has not been in installed or is
not currently queued to be installed into the dataplane.

Otherwise, if some of the nexthops fail to install, we will
still try to create a group with them and then the entire group
will fail.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-03-26 10:48:15 -04:00
Stephen Worley
497ff5792f zebra: handle NHG in NHG dataplane group conversion
We were not properly handling the case of a NHG inside of
another NHG when converting the rb tree of a multilevel NHG
into a flat list of IDs. When constructing, we call the function
zebra_nhg_nhe2grp_internal() recursively so that the rare
case of a group within a group is handled such that its
singleton nexthops are appended to the grp array of IDs
we send to the dataplane code.

Ex)

1:
	-> 2:
		-> 3
		-> 4
	->5:
		->6

becomes this:

1:
	->3
	->4
	->6

when its sent to the dataplane code for final kernel installation.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-03-26 10:48:15 -04:00
David Lamparter
d6951e5ef9 *: remove tabs from log messages
Some logging systems are, er, "allergic" to tabs in log messages.
(RFC5424: "The syslog application SHOULD avoid octet values below 32")

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2020-03-24 18:47:12 +01:00
Ruben Kerkhof
99e7ab12cf zebra: use modern C function definition
And also remove an assignment without effect while we're here.

Signed-off-by: Ruben Kerkhof <ruben@rubenkerkhof.com>
2020-03-11 14:06:34 +01:00
Donald Sharp
0752c8d8a4 zebra: nhg->nexthop is not NULL
We have already asserted on nhg->nexthop an if statement
to flog_err makes no sense.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-03-06 16:37:19 -05:00
Donald Sharp
5e81f5dd1a *: Finish off the __PRETTY_FUNCTION__ to __func__
FINISH IT

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-03-06 09:23:22 -05:00
Donatas Abraitis
15569c58f8 *: Replace __PRETTY_FUNCTION__/__FUNCTION__ to __func__
Just keep the code cool.

Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2020-03-05 20:23:23 +02:00
Stephen Worley
fc8a02c45f zebra: trust directly connected kernel/system routes
We made the decision to explicitly trust kernel and system routes
of every other type with 058c16b7e2.

So, we should trust directly connected routes the same way, assuming
the interface exists.

Old Behavior:

K   2.2.2.1/32 [0/0] is directly connected, unknown inactive, 00:00:39

New Behavior:

K>* 2.2.2.1/32 [0/0] is directly connected, test1, 00:00:03

As a bonus, this fixes the issues we were seeing with not removing
directly connected routes of certain interface types when
those interfaces go down/are deleted.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-03-02 13:45:40 -05:00
Mark Stapp
c415d89528 zebra: Embed lib nexthop-group in zebra hash entry
Embed nexthop-group, which is just a pointer, in the zebra
nexthop-hash-entry object, rather than mallocing one.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-02-27 15:49:31 -05:00
Renato Westphal
ecaeb3b697
Merge pull request #5750 from qlyoung/fix-null-after-xfree
*: don't null after XFREE; XFREE does this itself
2020-02-05 01:49:08 -03:00
Russ White
c7a754408e
Merge pull request #5746 from donaldsharp/bgp_sa
Coverioty sa stuff
2020-02-04 11:24:08 -05:00
Russ White
05d0c66d8f
Merge pull request #5737 from mjstapp/zebra_disable_kern_nhs
zebra: add config to disable use of kernel nexthops
2020-02-04 08:12:34 -05:00
Donald Sharp
9275682559 zebra: top has already been derefed
The top variable has already been derefed by the time we get
to the test to see if it is non-NULL.  No need to check it.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-02-04 08:10:52 -05:00
Quentin Young
b3ba5dc7fe *: don't null after XFREE; XFREE does this itself
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2020-02-03 11:22:13 -05:00
Donald Sharp
88cafda739 zebra: nexthop groups vrf's are only a function of namespaces
Nexthop groups as a whole do not make sense to have a vrf'ness
As that you can have a arbitrary number of nexthops that point
to separate vrf's.

Modify the code to make this distinction, by clearly delineating
the line between the nhg and the nexthop a bit better.
Nexthop groups having a vrf_id only make sense if you are using
network namespaces to represent them.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-01-31 08:45:51 -05:00
Stephen Worley
a7e1b02d4a zebra: add null check before connecting recursive depend
Add a null check in `handle_recursive_depend()` so it
doesn't try to add a NULL pointer to the RB tree.

This was found with clang SA.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-01-30 17:15:06 -05:00
Stephen Worley
5bf15faa19 zebra: don't created connected if duplicate depend
Since we are using a UNIQUE RB tree, we need to handle the
case of adding in a duplicate entry into it.

The list API code returns NULL when a successfull add
occurs, so lets pull that handling further up into
the connected handlers. Then, free the allocated
connected struct if it is a duplicate.

This is a pretty unlikely situation to happen.

Also, pull up the RB handling of _del RB API as well.

This was found with the zapi fuzzing code.

```
==1052840==
==1052840== 200 bytes in 5 blocks are definitely lost in loss record 545 of 663
==1052840==    at 0x483BB1A: calloc (vg_replace_malloc.c:762)
==1052840==    by 0x48E1008: qcalloc (memory.c:110)
==1052840==    by 0x44D357: nhg_connected_new (zebra_nhg.c:73)
==1052840==    by 0x44D300: nhg_connected_tree_add_nhe (zebra_nhg.c:123)
==1052840==    by 0x44FBDC: depends_add (zebra_nhg.c:1077)
==1052840==    by 0x44FD62: depends_find_add (zebra_nhg.c:1090)
==1052840==    by 0x44E46D: zebra_nhg_find (zebra_nhg.c:567)
==1052840==    by 0x44E1FE: zebra_nhg_rib_find (zebra_nhg.c:1126)
==1052840==    by 0x45AD3D: rib_add_multipath (zebra_rib.c:2616)
==1052840==    by 0x4977DC: zread_route_add (zapi_msg.c:1596)
==1052840==    by 0x49ABB9: zserv_handle_commands (zapi_msg.c:2636)
==1052840==    by 0x428B11: main (main.c:309)
```

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-01-30 17:15:05 -05:00
Mark Stapp
7c99d51beb zebra: add config to disable use of kernel nexthops
Add a config that disables use of kernel-level nexthop ids.
Currently, zebra always uses nexthop ids if the kernel supports
them.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-01-28 11:00:42 -05:00
Mark Stapp
d26e2d9be4
Merge pull request #5600 from sworleys/NHG-Depend-Crash
zebra: can't improve efficiency for recursive depends
2020-01-15 16:31:55 -05:00
Mark Stapp
a67b69c024
Merge pull request #5616 from sworleys/NHG-Fix-Recurse-to-Group
zebra: just set nexthop member in handle_recursive_depend()
2020-01-15 16:26:06 -05:00
Stephen Worley
1d049aba72 zebra: just set nexthop member in handle_recursive_depend()
With recent changes to the lib nexthop_group
APIs (e1f3a8eb19), we are making
new assumptions that this should be adding a single nexthop
to a group, not a list of nexthops.

This broke the case of a recursive nexthop resolving to a group:

```
D>  2.2.2.1/32 [150/0] via 1.1.1.1 (recursive), 00:00:09
  *                      via 1.1.1.1, dummy1 onlink, 00:00:09
                       via 1.1.1.2 (recursive), 00:00:09
  *                      via 1.1.1.2, dummy2 onlink, 00:00:09
D>  3.3.3.1/32 [150/0] via 2.2.2.1 (recursive), 00:00:04
  *                      via 1.1.1.1, dummy1 onlink, 00:00:04
K * 10.0.0.0/8 [0/1] via 172.27.227.148, tun0, 00:00:21
```

This group can instead just directly point to the nh that was passed.
Its only being used for a lookup (the memory gets copied and used
elsewhere if the nexthop is not found).

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-01-15 13:35:29 -05:00
Stephen Worley
77bf9504bf lib,zebra: tighten up the nexthop_copy/nexthop_dup APIs
Make the nexthop_copy/nexthop_dup APIs more consistent by
adding a secondary, non-recursive, version of them. Before,
it was inconsistent whether the APIs were expected to copy
recursive info or not. Make it clear now that the default is
recursive info is copied unless the _no_recurse() version is
called. These APIs are not heavily used so it is fine to
change them for now.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-01-15 13:35:04 -05:00
Stephen Worley
0fff714efa zebra: can't improve efficiency for recursive depends
cb86eba3ab was causing zebra to crash
when handling a nexthop group that had a nexthop which was recursively resolved.

Steps to recreate:

!
nexthop-group red
 nexthop 1.1.1.1
 nexthop 1.1.1.2
!

sharp install routes 8.8.8.1 nexthop-group red 1

=========================================
==11898== Invalid write of size 8
==11898==    at 0x48E53B4: _nexthop_add_sorted (nexthop_group.c:254)
==11898==    by 0x48E5336: nexthop_group_add_sorted (nexthop_group.c:296)
==11898==    by 0x453593: handle_recursive_depend (zebra_nhg.c:481)
==11898==    by 0x451CA8: zebra_nhg_find (zebra_nhg.c:572)
==11898==    by 0x4530FB: zebra_nhg_find_nexthop (zebra_nhg.c:597)
==11898==    by 0x4536B4: depends_find (zebra_nhg.c:1065)
==11898==    by 0x453526: depends_find_add (zebra_nhg.c:1087)
==11898==    by 0x451C4D: zebra_nhg_find (zebra_nhg.c:567)
==11898==    by 0x4519DE: zebra_nhg_rib_find (zebra_nhg.c:1126)
==11898==    by 0x452268: nexthop_active_update (zebra_nhg.c:1729)
==11898==    by 0x461517: rib_process (zebra_rib.c:1049)
==11898==    by 0x4610C8: process_subq_route (zebra_rib.c:1967)
==11898==  Address 0x0 is not stack'd, malloc'd or (recently) free'd

Zebra crashes because we weren't handling the case of the depend nexthop
being recursive.

For this case, we cannot make the function more efficient. A nexthop
could resolve to a group of any size, thus we need allocs/frees.

To solve this and retain the goal of the original patch, we separate out the
two cases so it will still be more efficient if the nexthop is not recursive.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-01-15 13:35:04 -05:00
Donald Sharp
946de1b95a bgpd, ospfd, zebra: Do not use 0 as VRF_DEFAULT
Explicitly spell out what we are trying to do.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-01-15 08:29:36 -05:00
Mark Stapp
cb86eba3ab zebra: improve efficiency of depends_find()
Do less malloc and free in depends_find(), when looking for
a singleton nexthop in the nhg hash.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-12-18 15:34:37 -05:00
Stephen Worley
b10d6b0744 zebra: pass type when finding individual nexthop
When we are doing a lookup on an individual nexthop,
we should still be passing along the type that gets passed
via the arguments. Otherwise, we will always think we own that
NHE when in reality anyone could have put that into the
kernel.

Before this patch, nexthops in the kernel will get swepped
out even if we didn't create them.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-12-16 16:46:30 -05:00
Donald Sharp
df7fb5800b lib, zebra: Allow for installation of a weighted nexthop
Linux has the idea of allowing a weight to be sent
down as part of a nexthop group to allow the kernel
to weight particular nexthop paths a bit more or less
than others.

See:
http://tldp.org/HOWTO/Adv-Routing-HOWTO/lartc.rpdb.multiple-links.html

Allow for installation into the kernel using the weight attribute
associated with the nexthop.

This code is foundational in that it just sets up the ability
to do this, we do not use it yet.  Further commits will
allow for the pass through of this data from upper level protocols.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-12-09 13:37:37 -05:00
Donald Sharp
e302caaa81
Merge pull request #5416 from mjstapp/re_nhe_pointer
lib,zebra: use shared nexthop-group in route_entry
2019-12-04 14:11:04 -05:00
Mark Stapp
0eb97b860d lib,zebra: use nhg_hash_entry pointer in route_entry
Replace the existing list of nexthops (via a nexthop_group
struct) in the route_entry with a direct pointer to zebra's
new shared group (from zebra_nhg.h). This allows more
direct access to that shared group and the info it carries.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-12-04 08:13:52 -05:00
Donatas Abraitis
d79368d3a5
Merge pull request #5192 from donaldsharp/zebra_rejection
zebra: Dissallow a /32 or /128 through itself
2019-12-03 09:29:50 +02:00
Stephen Worley
4c55b5ff6b zebra: Set resolved inactive when > multipath_num
Apparently the multipath_num functionatlity has been broken
for a while because we were ignoring the recusive nexthops
when marking them inactive based on it.

This sets them as inactive as well if the parent breaks it.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-11-21 16:28:31 -05:00
Stephen Worley
08de78b876 zebra: Use curr_active to check multipath_num
We were re-counting the entire group's active number on
every iteration of this nexthop_active_update() loop.

This is not great from a performance perspective but also
it was failing to properly mark things according to the
specified multipath_num.

Since a nexthop is set as active before this check, if its == to
the set ecmp, it gets marked inactive even though if its
under the max ecmp wanted!

ex)

set ecmp to 1.
`/usr/lib/frr/zebra -e 1`

All kernel routes will be marked inactive even with just one nexthop!

K   1.1.1.1/32 [0/0] is directly connected, dummy1 inactive, 00:00:10
K   1.1.1.2/32 [0/0] is directly connected, dummy2 inactive, 00:00:10
K   1.1.1.3/32 [0/0] is directly connected, dummy3 inactive, 00:00:10
K   1.1.1.4/32 [0/0] is directly connected, dummy4 inactive, 00:00:10
K   1.1.1.5/32 [0/0] is directly connected, dummy5 inactive, 00:00:10
K   1.1.1.6/32 [0/0] is directly connected, dummy6 inactive, 00:00:10
K   1.1.1.7/32 [0/0] is directly connected, dummy7 inactive, 00:00:10
K   1.1.1.8/32 [0/0] is directly connected, dummy8 inactive, 00:00:10
K   1.1.1.9/32 [0/0] is directly connected, dummy9 inactive, 00:00:10

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-11-21 15:27:12 -05:00
Mark Stapp
5463ce26c3 zebra: clean up rib and nhg headers
Clean up the relationships between zebra's rib and nexthop-group
headers as prep for adding a nexthop-group pointer to the
route_entry.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-11-21 15:05:52 -05:00
Russ White
943de56af6
Merge pull request #5241 from sworleys/SA-NHG
One More Zebra NHG SA Fix and nhg_ctx API Adjustment
2019-11-19 11:44:15 -05:00
Stephen Worley
7c6d5f255e zebra: Put freeing code in nhg_ctx_free()
Put the code to free the data held by a nhg_ctx
in nhg_ctx_free() as well. We do it similiarly for
the dplane_ctx.

Let nhg_ctx_fini() be any other routines that need to
be handled before freeing.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-11-12 10:29:16 -05:00
Stephen Worley
606fa9e58d zebra: handle depends_find() NULL nexthop
SA warned us lookup could be NULL dereferenced in some
paths. Handle the case where we are passed a NULL
nexthop before we try to copy it.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-11-12 10:28:46 -05:00
Stephen Worley
148813c22a zebra: zebra_nhg check each nexthop for active, not just number
We were only checking that two nhg_hash_entry's were equal
based on the active nexthop NUMBER. This is not sufficient in
special cases where whats active with one route using it,
might not be active with the other. We can see this with
routes trying to resolve to themselves.

Ex)

1.1.1.0/24
	-> 1.1.1.1 dummy1 (inactive)
	-> 1.1.1.2 dummy2

1.1.2.0/24
	-> 1.1.1.1 dummy1
	-> 1.1.1.2 dummy1 (inactive)

Without checking each nexthop individually, they will
hash to the same group since they have the same number of
active nexthops.

Fix this by looping over every nexthop for each nhe (they should
be sorted) and checking if the NEXTHOP_FLAG_ACTIVE flag's match.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-11-12 01:24:39 -05:00
Donald Sharp
7134ba7060 zebra: Fix some nhg SA issues found in latest Coverity
Fix 2 Coverity issues:
1) zebra_nhg.c -> all paths in nhg_ctx_process_finish have
already deref'ed the ctx pointer no need for a test of it

2) the **ifp pointer passed in may be NULL.  Prevent an accidental
deref if calling function does not pass in a ifp pointer.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-10-28 20:30:06 -04:00
Stephen Worley
5948f013ba zebra: Cleanup zebra_nhg APIs
Add a private header file for functions that are internal/special
case like how we do it for `lib/nexthop_group_private.h`.

Remove a bunch of functions from the header file only being used
statically and add some comments for those remaining to indicate
better what their use is.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:44 -04:00
Stephen Worley
80286aa564 zebra: Re-work zebra_nhg_*_valid APIs
Re-work the validity setting and checking APIs
for nhg_hash_entry's to make them clearer.

Further, they were originally only beings set
on ifdown and install. Extended their use into
releasing entries and to account for setting
the validity of a recursive dependent.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:44 -04:00
Stephen Worley
e1292378e2 zebra: Improve commenting for group requeue case
The commenting for why we would need to requeue a
group from the kernel to be later processed was not
sufficient. Add a better explanation for the flow
and state of the system.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:44 -04:00
Stephen Worley
c1da832a94 zebra: Change wording of duplicate kernel nhg flag
Change the wording of the flag indicating we have received
a nexthop group from the kernel with a different ID but
is fundamentally identical to one we already have.

It was colliding with a flag of similar name in the nexthop struct.

Change it from NEXTHOP_GROUP_DUPLICATE -> NEXTHOP_GROUP_UNHASHABLE
since it is in fact unhashable.

Also change the wording of functions and comments referencing the same
problem.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:44 -04:00
Stephen Worley
0b4dadb385 zebra: Check depends for validity, not dependents
When determining whether to set the nhg_hash_entry as
invalid, we should have been checking the depends, not
the dependents. If its a group and at least one of its
depends is valid, the group is still valid.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:44 -04:00
Stephen Worley
177e711dfc zebra: Adjust nhg handling for dataplane result off on shutdown
Now with this patch we can't use shutdown for cleanup:

```
commit 2fc69f03d2 (pr_5079)
Author: Mark Stapp <mjs@voltanet.io>
Date:   Fri Sep 27 12:15:34 2019 -0400

    zebra: during shutdown processing, drop dplane results

    Don't process dataplane results in zebra during shutdown (after
    sigint has been seen). The dplane continues to run in order to
    clean up, but zebra main just drops results.

    Signed-off-by: Mark Stapp <mjs@voltanet.io>
```

Adjusted nhg uninstall handling to clear data and other
cleanup before sending to the dataplane.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:44 -04:00
Stephen Worley
724583edad zebra: Set the nhe type in the appropriate place
We were setting the nhe type on uninstall when it should be on
the install.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:44 -04:00
Stephen Worley
fefa080e3c zebra: Remove cleanup and nhg workqueue boilerplate
This code was from a strategies we elected not to use and
can safely be removed.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:43 -04:00
Stephen Worley
d3a3513811 lib,pbrd,zebra: Use one api to delete nexthops/group
Reduce the api for deleting nexthops and the containing
group to just one call rather than having a special case
and handling it separately.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:43 -04:00
Stephen Worley
40a2a6cdd3 zebra: Add DPLANE_NEIGH and DPLANE_VTEP to nhg cases
Add DPLANE_OP_NEIGH and DPLANE_OP_VTEP to nhg dplane
handler's switch statements.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:43 -04:00
Stephen Worley
fec211ad95 zebra: Zebra nexthop group re-work checkpatch fixes
Checkpatch fixes for the zebra nexthop group re-work.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:43 -04:00
Stephen Worley
e9f6516243 zebra: Fix NULL check in zebra_nhg_rib_find()
Check both the nhg and nexthop are not NULL before passing
them to be hashed. Clang SA caught this.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:43 -04:00
Stephen Worley
4d21c7c086 zebra: Only use passed afi for blackhole/ifindex nexthops
Only used the afi passed into `zebra_nhg_find()` for nexthops
that are blackhole/ifindex. Others should use the type actually declared
in the nexthop struct itself.

Basically, nexthop objects of type blackhole/ifindex in the kernel must
have an address family, they cannot be ambigious and be shared.

This is some requirement in the linux ip core code.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:43 -04:00
Stephen Worley
1b366e63be zebra: Handle out of order kernel nexthop groups
Add a mechanism to requeue groups we receive from the
kernel if the IDs are in a weird order (Group ID is lower
than individual nexthop IDs for example).

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:42 -04:00
Stephen Worley
3e347f4181 zebra: Free labels on nhg_ctx from kernel
If we get a nexthop group from the kernel with labels
and queue it as a context to process later, we have to
free the label stack we allocated.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:42 -04:00
Stephen Worley
10200d4054 zebra: Add some getters for nhg_ctx
Add some getters for the nhg_ctx struct. Probably unnecessary
at this point since they are all static but if they ever become
public it will be nice to have them.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:42 -04:00
Stephen Worley
62991a1167 zebra: NHE hash reduce calls to jhash
Reduce the two calls to jhash to one jhash_3words() call
to save some more hashing time.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:42 -04:00
Stephen Worley
2001be6cc0 zebra: NHE use nexthop_group_equal_no_recurse()
Update nhg_hash_entry to use the non-recursive version of
nexthop_group_equal() since it doesn't really need to compare all
of those.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:42 -04:00
Stephen Worley
9ef49038d5 lib,zebra: Move nexthop dup marking into creation
We were waiting until install time to mark nexthops as duplicate.
Since they are immutable now and re-used, move this marking into
when they are actually created to save a bunch of cycles.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:42 -04:00
Stephen Worley
e4ac313b12 zebra: Check active count first in nhg_hash_equal
Before checking the equivalence of the whole group itself,
check to see if they contain the same number of non-recursive
active nexthops. This should shorten lookup time for the case of
non-resolved nexthop group creation.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:42 -04:00
Stephen Worley
6384cbcb0e zebra: Create depends after initial lookup
Create any depends only after the initial hash lookup
fails. Should reduce hashing cpu cycles significantly.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:42 -04:00
Stephen Worley
815059466c zebra: Move the supports_nh bool to a better place
Move the supports_nh bool indicating whether the kernel we are
using supports nexthop objects into the netlink kernel interface
itself. Since only linux and netlink support nexthop object APIs
for now this is fine.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:41 -04:00
Stephen Worley
9a1588c4ce zebra: Add handling for kernel del/update nexthop
Add handling for delete/update nexthop object messages from the
kernel.

If someone deletes a nexthop object we are still using, send it back
down. If the someone updates a nexthop we are using, replace that nexthop
with ours. Routes are referencing this nexthop object ID and we resolved
it ourselves, so we should force the other `someone` to submit to our
will.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:41 -04:00
Stephen Worley
38e40db1c9 zebra: Sweep our nexthop objects out on restart
On restart, if we failed to remove any nexthop objects due
to a kill -9 or such event, sweep them if we aren't using them.
Add a proto field to handle this and remove the is_kernel bool.

Add a dupicate flag that indicates this nexthop group is only
present in our ID hashtable. It is a dupicate nexthop we received
from the kernel, therefore we cannot hash on it.

Make the idcounter globally accessible so that kernel updates
increment it as soon as we receive them, not when we handle them.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:41 -04:00
Stephen Worley
428b4c0a5d zebra: Give installed nhe's the zebra proto
Give all nhg_hash_entrys we install into the kernel
as nexthop objects a defined proto matching the zebra
rib table one. This makes sense since nhe's are proto-independent
and determined exclusively in zebra.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:41 -04:00
Stephen Worley
8dbc800f42 zebra: Prevent duplication and overflow in nhe2grp
The kernel does not allow duplicate IDs in the same group, but
we are perfectly find with it internally if two different
nexthops resolve the the same nexthop (default route for instance).
So, we have to handle this when we get ready to install.

Further, pass the max group size in the arguments to ensure we
don't overflow. Don't actually think this is possible due to
multipath checking in nexthop_active_update() but better to be
safe.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:41 -04:00
Stephen Worley
32e757f4ae zebra: Mark nhe valid if installed
If the nhe was successfully installed, make sure its marked
as valid. Not fully sure how/where the valid flag is going to
be used yet.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:41 -04:00
Stephen Worley
904ba1c8ee zebra: A group isn't recursive if one depend is
We were setting a group to be recursive if its first depend
was. This is not the case; individual depends of the group
might be recursive but the group itself is not.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:41 -04:00
Stephen Worley
f429bd1b24 zebra: Move resolve/add depend install into api
Move the resolving and installing of a single nhg_hash_entry
into the install function itself, rather than letting zebra_rib
handle it.

Further, ensure depends are installed/queued before installing
a group. The ordering should be find here since only one thread
will call this API.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:41 -04:00
Stephen Worley
8dfbc65724 zebra: Install the nhe along with the route
Move the installation of an nhe out of nexthop_active_update()
and into the rib install path. So, only install the nhe when
a route using it is being installed.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:41 -04:00
Stephen Worley
37c6708b93 zebra: Switch nhg_connected to use new RB tree
Switch the nhg_connected tree structures to use the new
RB tree API in `lib/typerb.h`. We were using the openbsd-tree
implementation before.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:41 -04:00
Stephen Worley
8654a42cd5 zebra: Remove some extraneous zebra_nhg logging
Remove some extraneos zebra_nhg logging that was being
used during development.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:41 -04:00
Stephen Worley
6e3af3f17a zebra: Set recursive flag in rib_find() path
We were not setting the NEXTHOP_GROUP_RECURSIVE flag via
the rib find path. Adding a check and set after successful
creation of a new nhg_hash_entry.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:41 -04:00
Stephen Worley
360aefc018 zebra: zebra_nhg_rib_find() handle recursive case
When going through the zebra_nhg_rib_find(), we now handle the
case of if that nexthop has been recursively resolved. A depend
is created and passed along to zebra_nhg_find().

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:41 -04:00
Stephen Worley
583965448f zebra: Add refcnt for depends when connected
Add a refcnt as soon as depend is connected to mark
that this is being referenced as part of a group or
resolving another one. If the one referencing it
is never used, decrement it.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:41 -04:00
Stephen Worley
5657e7e943 zebra: Add some depends helper functions
Add some helper functions for finding/creating nexthop
group hash entries and assigning them as a depends for
another one using them in a group or resolving to them.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:41 -04:00
Stephen Worley
32e29e7910 zebra: Add nhg refcnt connected helper functions
Add some helper functions for ref incrementing and
decrementing the depends of a nexthop group hash entry.

This just abstracts the RB tree manipulation a bit more.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:41 -04:00
Stephen Worley
8a507796fc zebra: Set resolved nhg in find path
Set the resolved nhg during the find path, rather
than after it has been created. This make more sense
now that we are hashing on the resolved nexthop as well.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:41 -04:00
Stephen Worley
df31a989ca zebra: Refactor nexthop resolution in active funcs
Refactor/move around the code for nexthop resolution so
that it occurs only when the nexthop actually changes. Further,
provide a helper function to make the code more readable.

Also, remove the check for NEXTHOPS_CHANGED as this flag is used
specifcially for nexthop tracking and not an appropriate check
here.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:41 -04:00
Stephen Worley
7f99772169 zebra: Use nexthop/interface vrf, not the routes
When hashing/creating the NHE, use the nexthops vrf as its
source of data. This is gotten directly from an interface
and should not come from a route.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:40 -04:00
Stephen Worley
139ddad8f1 zebra: Accept NULL value for updating route NHE
When updating a route's referenced NHE, accept a NULL value
as valid and clear out the pointer in the struct.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:40 -04:00
Stephen Worley
9834bb52ae zebra: Resolved nh change, inc refcnt by rt refcnt
When the resolved nexthop changes, we should increment the new
resolved NHE by the refcnt for the unresolved NHE being used
by the routes and decrement the old one by the same amount.

Before, we were simple incrementing by one, causing incorrect refcnts
to occur.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:40 -04:00
Stephen Worley
2d3c57e671 zebra: NHG checkpatch fixes
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:40 -04:00
Stephen Worley
5155d86c6f zebra: Ignore cleanup for now
Ignore the cleanup for now until we get the timing
figured out without using the kernel nexthop object API.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:40 -04:00
Stephen Worley
9ddc33505a zebra: Remove uneeded is_valid NHE functons
Remove some unused is_valid checks for the nhg_hash_entry's.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:40 -04:00
Stephen Worley
055a3fa698 zebra: Check group before setting NHE invalid
If the nhg_hash_entry is a group, check if its members
are valid before setting it invalid. If even one is valid,
then this group should still be considered valid.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:40 -04:00
Stephen Worley
4505578be0 zebra: Return true if the NHE created, not found
In zebra_nhg_find(), if we created a nhg_hash_entry, return
true so we know rib-side.

Kernel-side, we don't care since it will always just enqueue
a context to process later.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:40 -04:00
Stephen Worley
144a1b34df zebra: Put NHE ref updating into a function
When the referenced NHE changes for a route_entry, use this function
to handle it.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:40 -04:00
Stephen Worley
7f1abf7926 zebra: Error if the ifp lookup fails for an NHE
If the lookup for an interface pointer fails when creating
the NHE, log an error message.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:40 -04:00
Stephen Worley
98cda54a95 zebra: Add recursive functionality to NHE's
Add the ability to recursively resolve nexthop group hash entries
and resolve them when sending to the kernel.

When copying over nexthops into an NHE, copy resolved info as well.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:40 -04:00
Stephen Worley
a6e6a6d825 zebra: Fix nhg ifindex setting and checking
We were only setting and checking the ifindex if
the nexthop had an *_IFINDEX type. However, when nexthop
active checking is done, the non-*_IFINDEX types can also
obtain a nexthop with an ifindex and are thus valid too.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:40 -04:00