3257 Commits

Author SHA1 Message Date
Stephen Worley
ace3bbba4b zebra: Don't clear nexthop fib flag on rib_install
We cannot clear the NEXTHOP_FLAG_FIB nexthop flag
when sending routes to the dataplane anymore since
nexthops are now shared.

We were seeing a situation where if we delete a route
using a nexthop group that is still active with another
route, the fib flag was being unset by this code
path despite them still being valid fib nexthops with the
other route.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-11-12 01:24:39 -05:00
Stephen Worley
19474c9c8c zebra: mpls_ftn_uninstall handle nhg hash label change
We were crashing due to a missed label change code path
in mpls_ftn_uninstall() with the zebra_nhg hashing code.

Add a static handler function for label changing everywhere
in that code and use it in mpls_ftn_uninstall().

The crash was found in the ISIS-SR tests:

==23== Thread 1:
==23== Invalid read of size 4
==23==    at 0x15B20E: zebra_nhg_hash_equal (zebra_nhg.c:365)
==23==    by 0x489A2FD: hash_get (hash.c:143)
==23==    by 0x489A4BC: hash_lookup (hash.c:183)
==23==    by 0x15B5A3: zebra_nhg_find (zebra_nhg.c:494)
==23==    by 0x15C536: zebra_nhg_rib_find (zebra_nhg.c:1070)
==23==    by 0x1573E8: mpls_ftn_update (zebra_mpls.c:2661)
==23==    by 0x1A2554: zread_mpls_labels_replace (zapi_msg.c:1890)
==23==    by 0x1A41CD: zserv_handle_commands (zapi_msg.c:2613)
==23==    by 0x199B17: zserv_process_messages (zserv.c:517)
==23==    by 0x48EE6B7: thread_call (thread.c:1549)
==23==    by 0x48A8AD5: frr_run (libfrr.c:1064)
==23==    by 0x1391B7: main (main.c:468)
==23==  Address 0x5839330 is 0 bytes inside a block of size 80 free'd
==23==    at 0x48369AB: free (vg_replace_malloc.c:530)
==23==    by 0x48AEE6C: qfree (memory.c:129)
==23==    by 0x15C5F8: zebra_nhg_free (zebra_nhg.c:1095)
==23==    by 0x15BC8C: zebra_nhg_handle_uninstall (zebra_nhg.c:734)
==23==    by 0x15DCFA: zebra_nhg_uninstall_kernel (zebra_nhg.c:1826)
==23==    by 0x15C666: zebra_nhg_decrement_ref (zebra_nhg.c:1106)
==23==    by 0x15D9D7: zebra_nhg_re_update_ref (zebra_nhg.c:1711)
==23==    by 0x15D8B1: nexthop_active_update (zebra_nhg.c:1660)
==23==    by 0x167072: rib_process (zebra_rib.c:1154)
==23==    by 0x168D72: process_subq_route (zebra_rib.c:2039)
==23==    by 0x168E92: process_subq (zebra_rib.c:2078)
==23==    by 0x168F5B: meta_queue_process (zebra_rib.c:2112)
==23==  Block was alloc'd at
==23==    at 0x4837B65: calloc (vg_replace_malloc.c:752)
==23==    by 0x48AED56: qcalloc (memory.c:110)
==23==    by 0x15B07B: zebra_nhg_copy (zebra_nhg.c:307)
==23==    by 0x15B13E: zebra_nhg_hash_alloc (zebra_nhg.c:329)
==23==    by 0x489A339: hash_get (hash.c:148)
==23==    by 0x15B6CA: zebra_nhg_find (zebra_nhg.c:532)
==23==    by 0x15C536: zebra_nhg_rib_find (zebra_nhg.c:1070)
==23==    by 0x15D89A: nexthop_active_update (zebra_nhg.c:1658)
==23==    by 0x167072: rib_process (zebra_rib.c:1154)
==23==    by 0x168D72: process_subq_route (zebra_rib.c:2039)
==23==    by 0x168E92: process_subq (zebra_rib.c:2078)
==23==    by 0x168F5B: meta_queue_process (zebra_rib.c:2112)

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-11-12 01:24:39 -05:00
Quentin Young
61395893b4
Merge pull request #5259 from mjstapp/dplane_sample_plugin
zebra: Add a sample dataplane plugin module
2019-11-11 11:56:42 -05:00
Russ White
46ddfc4096
Merge pull request #5269 from sworleys/Zebra-VRF-Lookup-Not-Get
zebra: separate zebra_vrf_lookup_table_with_id()
2019-11-06 14:10:59 -05:00
Donald Sharp
f609709a58 lib, ospfd, zebra: Convert interface_delete to take double pointer
When free'ing the interface pointer, set it to NULL.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-11-02 16:13:44 -04:00
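
The double-pointer convention described in f609709a58 can be sketched as below. This is a minimal illustration rather than the FRR code: `struct demo_if`, `demo_if_new()` and `demo_if_delete()` are hypothetical stand-ins for the real interface type and its delete function.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an interface object; not the FRR struct. */
struct demo_if {
	int ifindex;
};

/* Allocate a zeroed object; returns NULL on allocation failure. */
struct demo_if *demo_if_new(int ifindex)
{
	struct demo_if *ifp = calloc(1, sizeof(*ifp));

	if (ifp)
		ifp->ifindex = ifindex;
	return ifp;
}

/*
 * Taking a pointer to the caller's pointer lets the delete function
 * clear it, so a later accidental use sees NULL instead of freed memory.
 */
void demo_if_delete(struct demo_if **ifp)
{
	if (ifp == NULL || *ifp == NULL)
		return;

	free(*ifp);
	*ifp = NULL;
}
```

A caller would write `demo_if_delete(&ifp);` and can then safely test `ifp` against NULL; calling the function a second time is a no-op.
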
Donald Sharp
721c08573a *: Convert connected_free to a double pointer
When freeing the connected pointer, set it to NULL.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-11-02 16:13:44 -04:00
Stephen Worley
c7c0b007a4 zebra: separate zebra_vrf_lookup_table_with_id()
We were creating `other` tables in rib_del(), vty commands, and
dataplane return callback via the zebra_vrf_table_with_table_id()
API.

Separate the API into a lookup-only function that never creates,
and add another with `get` in the name (following the standard
we use in other table APIs).

Then changed the rib_del(), rib_find_rn_from_ctx(), and show route
summary vty command to use the lookup API instead.

This was found via a crash where two different vrfs thought they owned
the table. On delete, one free'd all the nodes, and then the other tried
to use them. It required specific timing of a VRF existing, going away,
and coming back again to cause the crash.

==23464== Invalid read of size 8
==23464==    at 0x179EA4: rib_dest_from_rnode (rib.h:433)
==23464==    by 0x17ACB1: zebra_vrf_delete (zebra_vrf.c:253)
==23464==    by 0x48F3D45: vrf_delete (vrf.c:243)
==23464==    by 0x48F4468: vrf_terminate (vrf.c:532)
==23464==    by 0x13D8C5: sigint (main.c:172)
==23464==    by 0x48DD25C: quagga_sigevent_process (sigevent.c:105)
==23464==    by 0x48F0502: thread_fetch (thread.c:1417)
==23464==    by 0x48AC82B: frr_run (libfrr.c:1023)
==23464==    by 0x13DD02: main (main.c:483)
==23464==  Address 0x5152788 is 104 bytes inside a block of size 112 free'd
==23464==    at 0x48369AB: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==23464==    by 0x48B25B8: qfree (memory.c:129)
==23464==    by 0x48EA335: route_node_destroy (table.c:500)
==23464==    by 0x48E967F: route_node_free (table.c:90)
==23464==    by 0x48E9742: route_table_free (table.c:124)
==23464==    by 0x48E9599: route_table_finish (table.c:60)
==23464==    by 0x170CEA: zebra_router_free_table (zebra_router.c:165)
==23464==    by 0x170DB4: zebra_router_release_table (zebra_router.c:188)
==23464==    by 0x17AAD2: zebra_vrf_disable (zebra_vrf.c:222)
==23464==    by 0x48F3F0C: vrf_disable (vrf.c:313)
==23464==    by 0x48F3CCF: vrf_delete (vrf.c:223)
==23464==    by 0x48F4468: vrf_terminate (vrf.c:532)
==23464==  Block was alloc'd at
==23464==    at 0x4837B65: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==23464==    by 0x48B24A2: qcalloc (memory.c:110)
==23464==    by 0x48EA2FE: route_node_create (table.c:488)
==23464==    by 0x48E95C7: route_node_new (table.c:66)
==23464==    by 0x48E95E5: route_node_set (table.c:75)
==23464==    by 0x48E9EA9: route_node_get (table.c:326)
==23464==    by 0x48E1EDB: srcdest_rnode_get (srcdest_table.c:244)
==23464==    by 0x16EA4B: rib_add_multipath (zebra_rib.c:2730)
==23464==    by 0x1A5310: zread_route_add (zapi_msg.c:1592)
==23464==    by 0x1A7B8E: zserv_handle_commands (zapi_msg.c:2579)
==23464==    by 0x19D689: zserv_process_messages (zserv.c:523)
==23464==    by 0x48F09F8: thread_call (thread.c:1599)

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-11-01 16:06:19 -04:00
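
The lookup/get split described in c7c0b007a4 can be sketched as follows. This is an illustration under simplifying assumptions, not the zebra implementation: `struct demo_router`, `struct demo_table`, `demo_table_lookup()` and `demo_table_get()` are hypothetical names, and tables live in a small fixed array rather than zebra's real containers.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_MAX_TABLES 8

/* Hypothetical table and owner types for illustration only. */
struct demo_table {
	uint32_t table_id;
};

struct demo_router {
	struct demo_table *tables[DEMO_MAX_TABLES];
	size_t count;
};

/* Lookup only: never allocates; returns NULL when the table is absent. */
struct demo_table *demo_table_lookup(const struct demo_router *r,
				     uint32_t table_id)
{
	for (size_t i = 0; i < r->count; i++) {
		if (r->tables[i]->table_id == table_id)
			return r->tables[i];
	}
	return NULL;
}

/* Get: returns the existing table or creates one.
 * Returns NULL if the array is full or allocation fails. */
struct demo_table *demo_table_get(struct demo_router *r, uint32_t table_id)
{
	struct demo_table *t = demo_table_lookup(r, table_id);

	if (t)
		return t;
	if (r->count >= DEMO_MAX_TABLES)
		return NULL;

	t = calloc(1, sizeof(*t));
	if (!t)
		return NULL;
	t->table_id = table_id;
	r->tables[r->count++] = t;
	return t;
}
```

Delete and show paths would call only the lookup variant, so they can no longer create a table as a side effect.
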
Mark Stapp
743dd5f618 zebra: Add a sample dataplane plugin module
Add a dataplane plugin module as a sample or reference for
folks who might like to integrate with the zebra dataplane
subsystem. This isn't part of the FRR build or product; there
are some simple build and load-at-runtime instructions in
comments in the file.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-10-31 16:24:16 -04:00
Donald Sharp
d1accb2e19 zebra: zvni_map_to_svi may return NULL act accordingly
The zvni_map_to_svi function may return NULL; check for that to
prevent a deref and crash. Found via Coverity.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-10-28 20:52:40 -04:00
Donald Sharp
7134ba7060 zebra: Fix some nhg SA issues found in latest Coverity
Fix 2 Coverity issues:
1) zebra_nhg.c -> all paths in nhg_ctx_process_finish have
already deref'ed the ctx pointer, so there is no need to test it.

2) The **ifp pointer passed in may be NULL. Prevent an accidental
deref if the calling function does not pass in an ifp pointer.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-10-28 20:30:06 -04:00
Mark Stapp
882364f11a
Merge pull request #4897 from sworleys/zebra_nhg_add
Zebra Nexthop Group Rework and Kernel Nexthop Object API Init
2019-10-28 13:07:23 +01:00
Stephen Worley
f3354e1612 zebra: rt_netlink nexthop handling checkpatch
Checkpatch was complaining because this code was extending
beyond 80 characters on a couple lines. Adjusted a conditional
tree to fix that.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:44 -04:00
Stephen Worley
5948f013ba zebra: Cleanup zebra_nhg APIs
Add a private header file for functions that are internal/special
case like how we do it for `lib/nexthop_group_private.h`.

Remove a bunch of functions from the header file only being used
statically and add some comments for those remaining to indicate
better what their use is.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:44 -04:00
Stephen Worley
80286aa564 zebra: Re-work zebra_nhg_*_valid APIs
Re-work the validity setting and checking APIs
for nhg_hash_entry's to make them clearer.

Further, they were originally only being set
on ifdown and install. Extended their use into
releasing entries and to account for setting
the validity of a recursive dependent.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:44 -04:00
Stephen Worley
e1292378e2 zebra: Improve commenting for group requeue case
The commenting for why we would need to requeue a
group from the kernel to be later processed was not
sufficient. Add a better explanation for the flow
and state of the system.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:44 -04:00
Stephen Worley
c1da832a94 zebra: Change wording of duplicate kernel nhg flag
Change the wording of the flag indicating we have received
a nexthop group from the kernel that has a different ID but
is fundamentally identical to one we already have.

It was colliding with a flag of similar name in the nexthop struct.

Change it from NEXTHOP_GROUP_DUPLICATE -> NEXTHOP_GROUP_UNHASHABLE
since it is in fact unhashable.

Also change the wording of functions and comments referencing the same
problem.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:44 -04:00
Stephen Worley
0b4dadb385 zebra: Check depends for validity, not dependents
When determining whether to set the nhg_hash_entry as
invalid, we should have been checking the depends, not
the dependents. If it's a group and at least one of its
depends is valid, the group is still valid.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:44 -04:00
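
The rule stated in 0b4dadb385 (a group stays valid while at least one of its depends is valid) can be sketched as below. The struct is a hypothetical simplification; `struct demo_nhe` and `demo_nhe_is_valid()` are not the zebra definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified entry: a group lists the entries it depends on. */
struct demo_nhe {
	bool valid;
	struct demo_nhe **depends;
	size_t depends_count;
};

/*
 * A group is valid while at least one of its depends is valid.
 * An entry with no depends simply reports its own flag.
 */
bool demo_nhe_is_valid(const struct demo_nhe *nhe)
{
	if (nhe->depends_count == 0)
		return nhe->valid;

	for (size_t i = 0; i < nhe->depends_count; i++) {
		if (nhe->depends[i]->valid)
			return true;
	}
	return false;
}
```
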
Stephen Worley
5a935f79d5 zebra: Guard nexthop group overflow read
Guard against an overflow read when processing
nexthop groups from netlink. Add a check to ensure
we don't try to write past the array size.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:44 -04:00
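
The kind of guard described in 5a935f79d5 can be sketched as follows. This is not the netlink parsing code; `DEMO_MAX_GROUP` and `demo_copy_group_ids()` are hypothetical, and the input is a plain array instead of a netlink attribute.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_GROUP 4 /* hypothetical capacity of the destination */

/*
 * Copy group member ids from a message that claims `count` entries.
 * The loop stops at DEMO_MAX_GROUP even if `count` is larger, so the
 * destination array is never written past its end.
 * Returns the number of ids actually copied.
 */
size_t demo_copy_group_ids(const uint32_t *src, size_t count,
			   uint32_t dst[DEMO_MAX_GROUP])
{
	size_t n = 0;

	for (size_t i = 0; i < count && n < DEMO_MAX_GROUP; i++)
		dst[n++] = src[i];

	return n;
}
```

A caller can compare the return value with `count` to detect that the message was truncated.
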
Stephen Worley
177e711dfc zebra: Adjust nhg handling for dataplane result off on shutdown
Now with this patch we can't use shutdown for cleanup:

```
commit 2fc69f03d2 (pr_5079)
Author: Mark Stapp <mjs@voltanet.io>
Date:   Fri Sep 27 12:15:34 2019 -0400

    zebra: during shutdown processing, drop dplane results

    Don't process dataplane results in zebra during shutdown (after
    sigint has been seen). The dplane continues to run in order to
    clean up, but zebra main just drops results.

    Signed-off-by: Mark Stapp <mjs@voltanet.io>
```

Adjusted nhg uninstall handling to clear data and other
cleanup before sending to the dataplane.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:44 -04:00
Stephen Worley
724583edad zebra: Set the nhe type in the appropriate place
We were setting the nhe type on uninstall when it should be on
the install.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:44 -04:00
Stephen Worley
fefa080e3c zebra: Remove cleanup and nhg workqueue boilerplate
This code was from a strategy we elected not to use and
can safely be removed.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:43 -04:00
Stephen Worley
53ac1fbbe0 zebra: Comment to indicate where nhg hashtables live
Add a comment to the header of `zebra_nhg.c` to point the reader
to where the hashtables containing the nhg entries are held.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:43 -04:00
Stephen Worley
d3a3513811 lib,pbrd,zebra: Use one api to delete nexthops/group
Reduce the api for deleting nexthops and the containing
group to just one call rather than having a special case
and handling it separately.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:43 -04:00
Stephen Worley
b2665a211e zebra: Use ng pointer in mpls_ftn_uninstall
With the new nexthop group shared memory framework, pointers
are being used in route_entry for the nexthop_group. Update
the use of this in `mpls_ftn_uninstall()` to reflect the change.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:43 -04:00
Stephen Worley
40a2a6cdd3 zebra: Add DPLANE_NEIGH and DPLANE_VTEP to nhg cases
Add DPLANE_OP_NEIGH and DPLANE_OP_VTEP to nhg dplane
handler's switch statements.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:43 -04:00
Stephen Worley
a7df21c4d2 zebra: Fallback to default ns if nhg vrf lookup fails
If the vrf lookup fails, use the default namespace
to find/delete the nexthop group from the kernel because it
should be there anyway.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:43 -04:00
Stephen Worley
fec211ad95 zebra: Zebra nexthop group re-work checkpatch fixes
Checkpatch fixes for the zebra nexthop group re-work.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:43 -04:00
Stephen Worley
e9f6516243 zebra: Fix NULL check in zebra_nhg_rib_find()
Check both the nhg and nexthop are not NULL before passing
them to be hashed. Clang SA caught this.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:43 -04:00
Stephen Worley
d7b5921c58 zebra: Update ip route show with nexthop_num API
Switch the nexthop_num dereferences to use the nexthop_group
API in `vty_show_ip_route()`.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:43 -04:00
Stephen Worley
07cc1745ff zebra: Add bsd nexthop install boilerplate
Add some boilerplate for nexthop installation for bsd kernels.
They do not support nexthop objects for now so it's just boilerplate.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:43 -04:00
Stephen Worley
62b045d6c5 zebra: Fix missed bsd nexthop group pointer
When moving the nexthop group in a route entry to be a pointer,
we missed one wrapped in a `ifndef` for when the kernel doesn't
have netlink.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:43 -04:00
Stephen Worley
7d5bb02b1a zebra: Force off kernel nexthop group API for now
Force off kernel nexthop group API for now. Will re-enable
after sufficient testing.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:43 -04:00
Stephen Worley
08c51a385d zebra: Only check nexthop status on route install/update
We do not need to check that the nexthop is installed or queued
when sending a route deletion since we only need the prefix for it.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:43 -04:00
Stephen Worley
61d9ffe168 zebra: Only show route nexthop group ID when asked
In light of the fact that we probably shouldn't change show
command output too much, changing this to only give nhe_id
output when the user explicitly asks for it. Probably only
going to be used for debugging for now anyway.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:43 -04:00
Stephen Worley
4d21c7c086 zebra: Only use passed afi for blackhole/ifindex nexthops
Only use the afi passed into `zebra_nhg_find()` for nexthops
that are blackhole/ifindex. Others should use the type actually declared
in the nexthop struct itself.

Basically, nexthop objects of type blackhole/ifindex in the kernel must
have an address family; they cannot be ambiguous and be shared.

This is a requirement of the Linux IP core code.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:43 -04:00
Stephen Worley
1b366e63be zebra: Handle out of order kernel nexthop groups
Add a mechanism to requeue groups we receive from the
kernel if the IDs are in a weird order (Group ID is lower
than individual nexthop IDs for example).

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:42 -04:00
Stephen Worley
3e347f4181 zebra: Free labels on nhg_ctx from kernel
If we get a nexthop group from the kernel with labels
and queue it as a context to process later, we have to
free the label stack we allocated.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:42 -04:00
Stephen Worley
10200d4054 zebra: Add some getters for nhg_ctx
Add some getters for the nhg_ctx struct. Probably unnecessary
at this point since they are all static but if they ever become
public it will be nice to have them.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:42 -04:00
Stephen Worley
8d03bc501b zebra: Handle nhg_hash_entry encaps/more debugging
Add code for handling nexthop group hash entry encaps
and sending them to the kernel. Add some more debugging
information for the encaps and groups in general.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:42 -04:00
Stephen Worley
b7537db639 zebra: Add common netlink mpls stack building path
There was some code copypasta for mpls stack building in the
netlink install path. Reduced that to a common function.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:42 -04:00
Stephen Worley
62991a1167 zebra: NHE hash reduce calls to jhash
Reduce the two calls to jhash to one jhash_3words() call
to save some more hashing time.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:42 -04:00
Stephen Worley
0ad40d1615 zebra: Add nhe_id to show ip route detailed
When querying for detailed route information, show the nexthop
group id for its nh_hash_entry in the output before listing the
nexthops.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:42 -04:00
Stephen Worley
df9069cd18 zebra: Add some more output to show nexthop-group
Add some more detailed output to `show nexthop-group`.
It closely resembles the output of `show ip routes`.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:42 -04:00
Stephen Worley
986a6617cc zebra: Optimize the fib/notified nexthop matching
Optimize the fib and notified nexthop group comparison algorithm
to assume ordering. There were some pretty serious performance hits with
this on high ecmp routes.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:42 -04:00
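
One way a comparison can exploit ordering, as 986a6617cc describes, is a single merge-style pass over both lists. The sketch below is an assumption about the general technique, not the zebra code: `demo_mark_matches()` is hypothetical, nexthops are reduced to `uint32_t` keys, and both arrays are assumed sorted ascending with no duplicates.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Mark which entries of `ours` also appear in `notified`.
 * Because both arrays are sorted, one pass with two cursors is enough:
 * O(n + m) work instead of searching `notified` once per entry.
 * Returns the number of matches found.
 */
size_t demo_mark_matches(const uint32_t *ours, size_t n,
			 const uint32_t *notified, size_t m, bool *matched)
{
	size_t i = 0, j = 0, hits = 0;

	for (size_t k = 0; k < n; k++)
		matched[k] = false;

	while (i < n && j < m) {
		if (ours[i] == notified[j]) {
			matched[i] = true;
			hits++;
			i++;
			j++;
		} else if (ours[i] < notified[j]) {
			i++; /* ours[i] is absent from notified */
		} else {
			j++; /* notified[j] is not one of ours */
		}
	}
	return hits;
}
```

On routes with many ECMP paths the difference between one pass and a nested search grows with the square of the path count, which is consistent with the performance concern the commit mentions.
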
Stephen Worley
2001be6cc0 zebra: NHE use nexthop_group_equal_no_recurse()
Update nhg_hash_entry to use the non-recursive version of
nexthop_group_equal() since it doesn't really need to compare all
of those.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:42 -04:00
Stephen Worley
9ef49038d5 lib,zebra: Move nexthop dup marking into creation
We were waiting until install time to mark nexthops as duplicate.
Since they are immutable now and re-used, move this marking into
when they are actually created to save a bunch of cycles.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:42 -04:00
Stephen Worley
e4ac313b12 zebra: Check active count first in nhg_hash_equal
Before checking the equivalence of the whole group itself,
check to see if they contain the same number of non-recursive
active nexthops. This should shorten lookup time for the case of
non-resolved nexthop group creation.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:42 -04:00
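
The ordering of checks described in e4ac313b12 can be sketched as below: compare a cheap count before walking the members. The types are hypothetical simplifications; `struct demo_group` and `demo_group_equal()` are not the zebra definitions, and nexthops are reduced to `uint32_t` gateway keys.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified group: one key per active nexthop. */
struct demo_group {
	const uint32_t *gateways;
	size_t active_count;
};

bool demo_group_equal(const struct demo_group *a, const struct demo_group *b)
{
	/* Cheap check first: groups of different size are never equal. */
	if (a->active_count != b->active_count)
		return false;

	/* Only now pay for the element-by-element comparison. */
	for (size_t i = 0; i < a->active_count; i++) {
		if (a->gateways[i] != b->gateways[i])
			return false;
	}
	return true;
}
```
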
Stephen Worley
6384cbcb0e zebra: Create depends after initial lookup
Create any depends only after the initial hash lookup
fails. Should reduce hashing cpu cycles significantly.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:42 -04:00
Stephen Worley
6e72876478 zebra: TODO for handling blackhole attr exclusive
Add a TODO statement for handling the exclusiveness
of blackhole attributes.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:42 -04:00
Stephen Worley
bc541126e4 zebra: Use nexthop object id on route delete
When we receive a route delete from the kernel and it
contains a nexthop object id, use that id for matching
instead of comparing explicit nexthop gateways.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:42 -04:00