Commit Graph

174 Commits

Author SHA1 Message Date
Stephen Worley
c479909b69 zebra: mark connected nh inactive if not matching ifindex
If we are asked to check if a nexthop is active and it matches a
connected route but the ifindex on it does not match the interface
with the connected route, mark as inactive. This is a bad nexthop.

Before, we would skip this check and just assume any nexthop that matches
on a connected route is valid and return here then fail during
installation. This adds a check for the IPV*_ifindex nexthop case where the
ifindex we have been sent doesn't match.

Old:
F>r 0.0.0.0/0 [200/0] via 20.0.0.2, test, weight 1, 00:00:27
  r                   via 40.4.4.4, lo, weight 1, 00:00:27

New:
F>* 0.0.0.0/0 [200/0] via 20.0.0.2, test, weight 1, 00:00:06
  *                   via 40.4.4.4, lo inactive, weight 1, 00:00:06

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-07-10 13:42:37 -04:00
Mark Stapp
9959f1daba zebra: improve logic handling backup nexthop installation
When handling a fib notification event that involves a route
with backup nexthops, be clearer about representing the
installed state of the backups: any installed backup will be
on a dedicated route_entry list.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-07-07 13:14:01 -04:00
Mark Stapp
92ad0c558c zebra: skip un-installed recursive match
Do less work when resolving a recursive route: just skip
nexthops if the resolving route is not installed.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-07-07 13:14:01 -04:00
Mark Stapp
9d43854d94 zebra: only use ACTIVE nexthops in recursive resolution
Only use ACTIVE nexthops to resolve recursive routes, not all
nexthops from a resolving route.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-07-07 13:14:01 -04:00
Mark Stapp
f264672058 zebra: allow recursive resolution to use backup nexthops
Allow both primary and backup nexthops to be used in
recursive resolution processing.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-07-07 13:14:01 -04:00
Mark Stapp
6b193087ca staticd,zebra: use ALLOW_RECURSION for static routes
Remove a special-case clause for static routes - it was the same
as the clause for other recursive routes. Have staticd just tell
zebra that recursion is allowed. Update topotest that was aware
of this 'internal' flag.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-07-07 13:14:01 -04:00
Mark Stapp
cb3e512d97 tests,zebra: fix more startup topotest issues
Use the right list of daemons to avoid trying to start zebra twice.
Change a zebra log message to INFO level to avoid stderr check
failure.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-07-04 12:25:10 -04:00
Jakub Urbańczyk
60d8d43be4 zebra: prepare dplane to deal with pbr rules
This commit is the first step to convert IP rule installation to
use dplane thread.
 * Add dataplane's internal representation of a pbr rule
 * Add dplane stats related to rules
 * Introduce a new type of dplane operation

Signed-off-by: Jakub Urbańczyk <xthaid@gmail.com>
2020-06-10 16:18:45 +02:00
Mark Stapp
f727646ada zebra: rename 'nhg_copy' to 'nhe_copy'
It copies nhes...

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-06-01 14:46:32 -04:00
Russ White
0a6fd9ce9d
Merge pull request #6389 from mjstapp/fix_recursive_label_type
zebra: prefer outer label_type for recursive nexthops
2020-05-19 11:42:36 -04:00
vivek
12b4d77bab zebra: Trust onlink flag for nexthop active resolution
When checking if a nexthop is active, if it has been marked as onlink,
just check on the presence and status of the nexthop's interface. When
handling client request to create a route, if the client says that the
nexthop is onlink, trust it; when internally (in zebra) determining
that the nexthop is onlink, ensure it is only done in the case of an
interface with a /32 IP address which is the case for OSPF unnumbered.

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by:   Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by:   Stephen Worley <sworley@cumulusnetworks.com>
2020-05-15 16:22:01 -07:00
Mark Stapp
6bc5d97795 zebra: prefer outer label_type for recursive nexthops
When resolving a recursive nexthop, prefer the "outer"
label type, if present.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-05-12 14:27:02 -04:00
Donald Sharp
630d596249 zebra: Remove typedef rib_table_info_t from system
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-05-08 08:10:49 -04:00
Donald Sharp
5cfaa2d92b zebra: Loosen ONLINK restrictions a tiny bit
Loosen the ONLINK restrictions such that when an upper
level protocol sends us a nexthop with an ONLINK attribute
just ensure that interface is up and usable.  ONLINK effectively
means we know what we are doing to the kernel.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-05-06 10:15:41 -04:00
Mark Stapp
f924db4961 zebra: fix some coverity SA warnings
Fix several coverity scan warnings.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-04-14 07:44:54 -04:00
Mark Stapp
0328a5bd0d zebra: don't include backup nhs in main nhe dependency tree
We don't want to install backup nexthops - yet - as part of the
nexthop-id-based kernel interactions on netlink platforms. Avoid
mixing backup and primary nexthops in the tree of dependencies
in the ecmp cases.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-03-27 11:50:03 -04:00
Mark Stapp
377e29f7e7 zebra: handle backup nexthops in nhe/nhgs
Include backup nexthops in nhe processing; connect incoming
zapi route data with updated rib/nhg apis; add more debugs in
nhg processing.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-03-27 11:50:03 -04:00
Mark Stapp
1d48702ede zebra: add per-nexthop backup index
Use a backup index in a nexthop directly (if it has a backup
nexthop); revise the zebra nhe/nhg code; revise zapi route
decoding to match; revise the dataplane route datastructs.

Refactor some of the rib_add_multipath code to be prepared to
be called with an nhe, carrying nexthop and (possibly) backup
info together.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-03-27 11:50:03 -04:00
Stephen Worley
d43122b58f zebra: break if duplicate nexthop found in nhe2grp
If we find that a nexthop is a duplicate, break immediately
rather than continuing to look through the rest of the list.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-03-26 10:57:45 -04:00
Stephen Worley
086e4e02f5 zebra: properly set the NEXTHOP_GROUP_VALID flag
Properly set the NEXTHOP_GROUP_VALID flag and use it
as a conditional for installation decisions for individual
nexthop and groups containing it.

We set the NEXTHOP_GROUP_VALID flag it is:

1) A fully resolved active nexthop
or
2) Its a group that contains at least one VALID NHE

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-03-26 10:48:16 -04:00
Stephen Worley
715e5c70d5 zebra: set valid on re->nhe directly in nexthop_active_update()
We were still doing a lookup on the nhe_id from before we
started referencing re->nhe directly.

Change set flag to just use re->nhe directly here since they
should always be the same at this point in the code anyway.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-03-26 10:48:15 -04:00
Stephen Worley
b1c3f7ef80 zebra: add debug for duplicate NH in dataplane array conversion
When we find a nexthop ID thats a duplicate in the code that converts
NHG rb trees into a flat list of nexthop IDs for the dataplane,
output a debug message.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-03-26 10:48:15 -04:00
Stephen Worley
1866b3afc2 zebra: don't add ID to kernel nh_grp if not installed/queued
When we transform the nexthop group rb trees into a flat
array of IDs to send into the dataplane code (zebra_nhg_nhe2grp),
don't put an ID in there that has not been in installed or is
not currently queued to be installed into the dataplane.

Otherwise, if some of the nexthops fail to install, we will
still try to create a group with them and then the entire group
will fail.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-03-26 10:48:15 -04:00
Stephen Worley
497ff5792f zebra: handle NHG in NHG dataplane group conversion
We were not properly handling the case of a NHG inside of
another NHG when converting the rb tree of a multilevel NHG
into a flat list of IDs. When constructing, we call the function
zebra_nhg_nhe2grp_internal() recursively so that the rare
case of a group within a group is handled such that its
singleton nexthops are appended to the grp array of IDs
we send to the dataplane code.

Ex)

1:
	-> 2:
		-> 3
		-> 4
	->5:
		->6

becomes this:

1:
	->3
	->4
	->6

when its sent to the dataplane code for final kernel installation.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-03-26 10:48:15 -04:00
David Lamparter
d6951e5ef9 *: remove tabs from log messages
Some logging systems are, er, "allergic" to tabs in log messages.
(RFC5424: "The syslog application SHOULD avoid octet values below 32")

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2020-03-24 18:47:12 +01:00
Ruben Kerkhof
99e7ab12cf zebra: use modern C function definition
And also remove an assignment without effect while we're here.

Signed-off-by: Ruben Kerkhof <ruben@rubenkerkhof.com>
2020-03-11 14:06:34 +01:00
Donald Sharp
0752c8d8a4 zebra: nhg->nexthop is not NULL
We have already asserted on nhg->nexthop an if statement
to flog_err makes no sense.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-03-06 16:37:19 -05:00
Donald Sharp
5e81f5dd1a *: Finish off the __PRETTY_FUNCTION__ to __func__
FINISH IT

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-03-06 09:23:22 -05:00
Donatas Abraitis
15569c58f8 *: Replace __PRETTY_FUNCTION__/__FUNCTION__ to __func__
Just keep the code cool.

Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2020-03-05 20:23:23 +02:00
Stephen Worley
fc8a02c45f zebra: trust directly connected kernel/system routes
We made the decision to explicitly trust kernel and system routes
of every other type with 058c16b7e2.

So, we should trust directly connected routes the same way, assuming
the interface exists.

Old Behavior:

K   2.2.2.1/32 [0/0] is directly connected, unknown inactive, 00:00:39

New Behavior:

K>* 2.2.2.1/32 [0/0] is directly connected, test1, 00:00:03

As a bonus, this fixes the issues we were seeing with not removing
directly connected routes of certain interface types when
those interfaces go down/are deleted.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-03-02 13:45:40 -05:00
Mark Stapp
c415d89528 zebra: Embed lib nexthop-group in zebra hash entry
Embed nexthop-group, which is just a pointer, in the zebra
nexthop-hash-entry object, rather than mallocing one.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-02-27 15:49:31 -05:00
Renato Westphal
ecaeb3b697
Merge pull request #5750 from qlyoung/fix-null-after-xfree
*: don't null after XFREE; XFREE does this itself
2020-02-05 01:49:08 -03:00
Russ White
c7a754408e
Merge pull request #5746 from donaldsharp/bgp_sa
Coverioty sa stuff
2020-02-04 11:24:08 -05:00
Russ White
05d0c66d8f
Merge pull request #5737 from mjstapp/zebra_disable_kern_nhs
zebra: add config to disable use of kernel nexthops
2020-02-04 08:12:34 -05:00
Donald Sharp
9275682559 zebra: top has already been derefed
The top variable has already been derefed by the time we get
to the test to see if it is non-NULL.  No need to check it.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-02-04 08:10:52 -05:00
Quentin Young
b3ba5dc7fe *: don't null after XFREE; XFREE does this itself
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2020-02-03 11:22:13 -05:00
Donald Sharp
88cafda739 zebra: nexthop groups vrf's are only a function of namespaces
Nexthop groups as a whole do not make sense to have a vrf'ness
As that you can have a arbitrary number of nexthops that point
to separate vrf's.

Modify the code to make this distinction, by clearly delineating
the line between the nhg and the nexthop a bit better.
Nexthop groups having a vrf_id only make sense if you are using
network namespaces to represent them.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-01-31 08:45:51 -05:00
Stephen Worley
a7e1b02d4a zebra: add null check before connecting recursive depend
Add a null check in `handle_recursive_depend()` so it
doesn't try to add a NULL pointer to the RB tree.

This was found with clang SA.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-01-30 17:15:06 -05:00
Stephen Worley
5bf15faa19 zebra: don't created connected if duplicate depend
Since we are using a UNIQUE RB tree, we need to handle the
case of adding in a duplicate entry into it.

The list API code returns NULL when a successfull add
occurs, so lets pull that handling further up into
the connected handlers. Then, free the allocated
connected struct if it is a duplicate.

This is a pretty unlikely situation to happen.

Also, pull up the RB handling of _del RB API as well.

This was found with the zapi fuzzing code.

```
==1052840==
==1052840== 200 bytes in 5 blocks are definitely lost in loss record 545 of 663
==1052840==    at 0x483BB1A: calloc (vg_replace_malloc.c:762)
==1052840==    by 0x48E1008: qcalloc (memory.c:110)
==1052840==    by 0x44D357: nhg_connected_new (zebra_nhg.c:73)
==1052840==    by 0x44D300: nhg_connected_tree_add_nhe (zebra_nhg.c:123)
==1052840==    by 0x44FBDC: depends_add (zebra_nhg.c:1077)
==1052840==    by 0x44FD62: depends_find_add (zebra_nhg.c:1090)
==1052840==    by 0x44E46D: zebra_nhg_find (zebra_nhg.c:567)
==1052840==    by 0x44E1FE: zebra_nhg_rib_find (zebra_nhg.c:1126)
==1052840==    by 0x45AD3D: rib_add_multipath (zebra_rib.c:2616)
==1052840==    by 0x4977DC: zread_route_add (zapi_msg.c:1596)
==1052840==    by 0x49ABB9: zserv_handle_commands (zapi_msg.c:2636)
==1052840==    by 0x428B11: main (main.c:309)
```

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-01-30 17:15:05 -05:00
Mark Stapp
7c99d51beb zebra: add config to disable use of kernel nexthops
Add a config that disables use of kernel-level nexthop ids.
Currently, zebra always uses nexthop ids if the kernel supports
them.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-01-28 11:00:42 -05:00
Mark Stapp
d26e2d9be4
Merge pull request #5600 from sworleys/NHG-Depend-Crash
zebra: can't improve efficiency for recursive depends
2020-01-15 16:31:55 -05:00
Mark Stapp
a67b69c024
Merge pull request #5616 from sworleys/NHG-Fix-Recurse-to-Group
zebra: just set nexthop member in handle_recursive_depend()
2020-01-15 16:26:06 -05:00
Stephen Worley
1d049aba72 zebra: just set nexthop member in handle_recursive_depend()
With recent changes to the lib nexthop_group
APIs (e1f3a8eb19), we are making
new assumptions that this should be adding a single nexthop
to a group, not a list of nexthops.

This broke the case of a recursive nexthop resolving to a group:

```
D>  2.2.2.1/32 [150/0] via 1.1.1.1 (recursive), 00:00:09
  *                      via 1.1.1.1, dummy1 onlink, 00:00:09
                       via 1.1.1.2 (recursive), 00:00:09
  *                      via 1.1.1.2, dummy2 onlink, 00:00:09
D>  3.3.3.1/32 [150/0] via 2.2.2.1 (recursive), 00:00:04
  *                      via 1.1.1.1, dummy1 onlink, 00:00:04
K * 10.0.0.0/8 [0/1] via 172.27.227.148, tun0, 00:00:21
```

This group can instead just directly point to the nh that was passed.
Its only being used for a lookup (the memory gets copied and used
elsewhere if the nexthop is not found).

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-01-15 13:35:29 -05:00
Stephen Worley
77bf9504bf lib,zebra: tighten up the nexthop_copy/nexthop_dup APIs
Make the nexthop_copy/nexthop_dup APIs more consistent by
adding a secondary, non-recursive, version of them. Before,
it was inconsistent whether the APIs were expected to copy
recursive info or not. Make it clear now that the default is
recursive info is copied unless the _no_recurse() version is
called. These APIs are not heavily used so it is fine to
change them for now.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-01-15 13:35:04 -05:00
Stephen Worley
0fff714efa zebra: can't improve efficiency for recursive depends
cb86eba3ab was causing zebra to crash
when handling a nexthop group that had a nexthop which was recursively resolved.

Steps to recreate:

!
nexthop-group red
 nexthop 1.1.1.1
 nexthop 1.1.1.2
!

sharp install routes 8.8.8.1 nexthop-group red 1

=========================================
==11898== Invalid write of size 8
==11898==    at 0x48E53B4: _nexthop_add_sorted (nexthop_group.c:254)
==11898==    by 0x48E5336: nexthop_group_add_sorted (nexthop_group.c:296)
==11898==    by 0x453593: handle_recursive_depend (zebra_nhg.c:481)
==11898==    by 0x451CA8: zebra_nhg_find (zebra_nhg.c:572)
==11898==    by 0x4530FB: zebra_nhg_find_nexthop (zebra_nhg.c:597)
==11898==    by 0x4536B4: depends_find (zebra_nhg.c:1065)
==11898==    by 0x453526: depends_find_add (zebra_nhg.c:1087)
==11898==    by 0x451C4D: zebra_nhg_find (zebra_nhg.c:567)
==11898==    by 0x4519DE: zebra_nhg_rib_find (zebra_nhg.c:1126)
==11898==    by 0x452268: nexthop_active_update (zebra_nhg.c:1729)
==11898==    by 0x461517: rib_process (zebra_rib.c:1049)
==11898==    by 0x4610C8: process_subq_route (zebra_rib.c:1967)
==11898==  Address 0x0 is not stack'd, malloc'd or (recently) free'd

Zebra crashes because we weren't handling the case of the depend nexthop
being recursive.

For this case, we cannot make the function more efficient. A nexthop
could resolve to a group of any size, thus we need allocs/frees.

To solve this and retain the goal of the original patch, we separate out the
two cases so it will still be more efficient if the nexthop is not recursive.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-01-15 13:35:04 -05:00
Donald Sharp
946de1b95a bgpd, ospfd, zebra: Do not use 0 as VRF_DEFAULT
Explicitly spell out what we are trying to do.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-01-15 08:29:36 -05:00
Mark Stapp
cb86eba3ab zebra: improve efficiency of depends_find()
Do less malloc and free in depends_find(), when looking for
a singleton nexthop in the nhg hash.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-12-18 15:34:37 -05:00
Stephen Worley
b10d6b0744 zebra: pass type when finding individual nexthop
When we are doing a lookup on an individual nexthop,
we should still be passing along the type that gets passed
via the arguments. Otherwise, we will always think we own that
NHE when in reality anyone could have put that into the
kernel.

Before this patch, nexthops in the kernel will get swepped
out even if we didn't create them.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-12-16 16:46:30 -05:00
Donald Sharp
df7fb5800b lib, zebra: Allow for installation of a weighted nexthop
Linux has the idea of allowing a weight to be sent
down as part of a nexthop group to allow the kernel
to weight particular nexthop paths a bit more or less
than others.

See:
http://tldp.org/HOWTO/Adv-Routing-HOWTO/lartc.rpdb.multiple-links.html

Allow for installation into the kernel using the weight attribute
associated with the nexthop.

This code is foundational in that it just sets up the ability
to do this, we do not use it yet.  Further commits will
allow for the pass through of this data from upper level protocols.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-12-09 13:37:37 -05:00
Donald Sharp
e302caaa81
Merge pull request #5416 from mjstapp/re_nhe_pointer
lib,zebra: use shared nexthop-group in route_entry
2019-12-04 14:11:04 -05:00