We cannot clear the NEXTHOP_FLAG_FIB nexthop flag
when sending routes to the dataplane anymore since
nexthops are now shared.
We were seeing a situation where if we delete a route
using a nexthop group that is still active with another
route, the fib flag was being unset by this code
path despite them still being valid fib nexthops with the
other route.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
We were crashing due to a missed label change code path
in mpls_ftn_uninstall() with the zebra_nhg hashing code.
Add a static handler function for label changing everywhere
in that code and use it in mpls_ftn_uninstall().
The crash was found in the ISIS-SR tests:
==23== Thread 1:
==23== Invalid read of size 4
==23== at 0x15B20E: zebra_nhg_hash_equal (zebra_nhg.c:365)
==23== by 0x489A2FD: hash_get (hash.c:143)
==23== by 0x489A4BC: hash_lookup (hash.c:183)
==23== by 0x15B5A3: zebra_nhg_find (zebra_nhg.c:494)
==23== by 0x15C536: zebra_nhg_rib_find (zebra_nhg.c:1070)
==23== by 0x1573E8: mpls_ftn_update (zebra_mpls.c:2661)
==23== by 0x1A2554: zread_mpls_labels_replace (zapi_msg.c:1890)
==23== by 0x1A41CD: zserv_handle_commands (zapi_msg.c:2613)
==23== by 0x199B17: zserv_process_messages (zserv.c:517)
==23== by 0x48EE6B7: thread_call (thread.c:1549)
==23== by 0x48A8AD5: frr_run (libfrr.c:1064)
==23== by 0x1391B7: main (main.c:468)
==23== Address 0x5839330 is 0 bytes inside a block of size 80 free'd
==23== at 0x48369AB: free (vg_replace_malloc.c:530)
==23== by 0x48AEE6C: qfree (memory.c:129)
==23== by 0x15C5F8: zebra_nhg_free (zebra_nhg.c:1095)
==23== by 0x15BC8C: zebra_nhg_handle_uninstall (zebra_nhg.c:734)
==23== by 0x15DCFA: zebra_nhg_uninstall_kernel (zebra_nhg.c:1826)
==23== by 0x15C666: zebra_nhg_decrement_ref (zebra_nhg.c:1106)
==23== by 0x15D9D7: zebra_nhg_re_update_ref (zebra_nhg.c:1711)
==23== by 0x15D8B1: nexthop_active_update (zebra_nhg.c:1660)
==23== by 0x167072: rib_process (zebra_rib.c:1154)
==23== by 0x168D72: process_subq_route (zebra_rib.c:2039)
==23== by 0x168E92: process_subq (zebra_rib.c:2078)
==23== by 0x168F5B: meta_queue_process (zebra_rib.c:2112)
==23== Block was alloc'd at
==23== at 0x4837B65: calloc (vg_replace_malloc.c:752)
==23== by 0x48AED56: qcalloc (memory.c:110)
==23== by 0x15B07B: zebra_nhg_copy (zebra_nhg.c:307)
==23== by 0x15B13E: zebra_nhg_hash_alloc (zebra_nhg.c:329)
==23== by 0x489A339: hash_get (hash.c:148)
==23== by 0x15B6CA: zebra_nhg_find (zebra_nhg.c:532)
==23== by 0x15C536: zebra_nhg_rib_find (zebra_nhg.c:1070)
==23== by 0x15D89A: nexthop_active_update (zebra_nhg.c:1658)
==23== by 0x167072: rib_process (zebra_rib.c:1154)
==23== by 0x168D72: process_subq_route (zebra_rib.c:2039)
==23== by 0x168E92: process_subq (zebra_rib.c:2078)
==23== by 0x168F5B: meta_queue_process (zebra_rib.c:2112)
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
We were creating `other` tables in rib_del(), vty commands, and
dataplane return callback via the zebra_vrf_table_with_table_id()
API.
Seperate the API into only a lookup, never create
and added another with `get` in the name (following the standard
we use in other table APIs).
Then changed the rib_del(), rib_find_rn_from_ctx(), and show route
summary vty command to use the lookup API instead.
This was found via a crash where two different vrfs though they owned
the table. On delete, one free'd all the nodes, and then the other tried
to use them. It required specific timing of a VRF existing, going away,
and coming back again to cause the crash.
=23464== Invalid read of size 8
==23464== at 0x179EA4: rib_dest_from_rnode (rib.h:433)
==23464== by 0x17ACB1: zebra_vrf_delete (zebra_vrf.c:253)
==23464== by 0x48F3D45: vrf_delete (vrf.c:243)
==23464== by 0x48F4468: vrf_terminate (vrf.c:532)
==23464== by 0x13D8C5: sigint (main.c:172)
==23464== by 0x48DD25C: quagga_sigevent_process (sigevent.c:105)
==23464== by 0x48F0502: thread_fetch (thread.c:1417)
==23464== by 0x48AC82B: frr_run (libfrr.c:1023)
==23464== by 0x13DD02: main (main.c:483)
==23464== Address 0x5152788 is 104 bytes inside a block of size 112 free'd
==23464== at 0x48369AB: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==23464== by 0x48B25B8: qfree (memory.c:129)
==23464== by 0x48EA335: route_node_destroy (table.c:500)
==23464== by 0x48E967F: route_node_free (table.c:90)
==23464== by 0x48E9742: route_table_free (table.c:124)
==23464== by 0x48E9599: route_table_finish (table.c:60)
==23464== by 0x170CEA: zebra_router_free_table (zebra_router.c:165)
==23464== by 0x170DB4: zebra_router_release_table (zebra_router.c:188)
==23464== by 0x17AAD2: zebra_vrf_disable (zebra_vrf.c:222)
==23464== by 0x48F3F0C: vrf_disable (vrf.c:313)
==23464== by 0x48F3CCF: vrf_delete (vrf.c:223)
==23464== by 0x48F4468: vrf_terminate (vrf.c:532)
==23464== Block was alloc'd at
==23464== at 0x4837B65: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==23464== by 0x48B24A2: qcalloc (memory.c:110)
==23464== by 0x48EA2FE: route_node_create (table.c:488)
==23464== by 0x48E95C7: route_node_new (table.c:66)
==23464== by 0x48E95E5: route_node_set (table.c:75)
==23464== by 0x48E9EA9: route_node_get (table.c:326)
==23464== by 0x48E1EDB: srcdest_rnode_get (srcdest_table.c:244)
==23464== by 0x16EA4B: rib_add_multipath (zebra_rib.c:2730)
==23464== by 0x1A5310: zread_route_add (zapi_msg.c:1592)
==23464== by 0x1A7B8E: zserv_handle_commands (zapi_msg.c:2579)
==23464== by 0x19D689: zserv_process_messages (zserv.c:523)
==23464== by 0x48F09F8: thread_call (thread.c:1599)
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add a dataplane plugin module as a sample or reference for
folks who might like to integrate with the zebra dataplane
subsystem. This isn't part of the FRR build or product; there
are some simple build and load-at-runtime instructions in
comments in the file.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
The zvni_map_to_svi function may return NULL as such prevent
a deref and crash. Found via coverity
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Fix 2 Coverity issues:
1) zebra_nhg.c -> all paths in nhg_ctx_process_finish have
already deref'ed the ctx pointer no need for a test of it
2) the **ifp pointer passed in may be NULL. Prevent an accidental
deref if calling function does not pass in a ifp pointer.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Checkpatch was complaining because this code was extending
beyond 80 characters on a couple lines. Adjusted a conditional
tree to fix that.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add a private header file for functions that are internal/special
case like how we do it for `lib/nexthop_group_private.h`.
Remove a bunch of functions from the header file only being used
statically and add some comments for those remaining to indicate
better what their use is.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Re-work the validity setting and checking APIs
for nhg_hash_entry's to make them clearer.
Further, they were originally only beings set
on ifdown and install. Extended their use into
releasing entries and to account for setting
the validity of a recursive dependent.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
The commenting for why we would need to requeue a
group from the kernel to be later processed was not
sufficient. Add a better explanation for the flow
and state of the system.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Change the wording of the flag indicating we have received
a nexthop group from the kernel with a different ID but
is fundamentally identical to one we already have.
It was colliding with a flag of similar name in the nexthop struct.
Change it from NEXTHOP_GROUP_DUPLICATE -> NEXTHOP_GROUP_UNHASHABLE
since it is in fact unhashable.
Also change the wording of functions and comments referencing the same
problem.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
When determining whether to set the nhg_hash_entry as
invalid, we should have been checking the depends, not
the dependents. If its a group and at least one of its
depends is valid, the group is still valid.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Guard against an overflow read when processing
nexthop groups from netlink. Add a check to ensure
we don't try to write passed the array size.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Now with this patch we can't use shutdown for cleanup:
```
commit 2fc69f03d2 (pr_5079)
Author: Mark Stapp <mjs@voltanet.io>
Date: Fri Sep 27 12:15:34 2019 -0400
zebra: during shutdown processing, drop dplane results
Don't process dataplane results in zebra during shutdown (after
sigint has been seen). The dplane continues to run in order to
clean up, but zebra main just drops results.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
```
Adjusted nhg uninstall handling to clear data and other
cleanup before sending to the dataplane.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add a comment to the header of `zebra_nhg.c` to point the reader
to where the hashtables containing the nhg entries are held.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Reduce the api for deleting nexthops and the containing
group to just one call rather than having a special case
and handling it separately.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
With the new nexthop group shared memory framework, pointers
are being used in route_entry for the nexthop_group. Update
the use of this in `mpls_ftn_uninstall()` to reflect the change.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
If the vrf lookup fails, use the default namespace
to find/delete the nexthop group from the kernel because it
should be there anyway.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Check both the nhg and nexthop are not NULL before passing
them to be hashed. Clang SA caught this.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add some boilerplate for nexthop installation for bsd kernels.
They do not support nexthop objects for now so its just boilerplate.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
When moving the nexthop group in a route entry to be a pointer,
we missed one wrapped in a `ifndef` for when the kernel doesn't
have netlink.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
We do not need to check that the nexthop is installed or queued
when sending a route deletion since we only need to the prefix for it.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
In lieu of the fact that we probably shouldn't change show
command output too much, changing this to only give nhe_id
output when the user explicitly asks for it. Probably only
going to be used for debugging for now anyway.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Only used the afi passed into `zebra_nhg_find()` for nexthops
that are blackhole/ifindex. Others should use the type actually declared
in the nexthop struct itself.
Basically, nexthop objects of type blackhole/ifindex in the kernel must
have an address family, they cannot be ambigious and be shared.
This is some requirement in the linux ip core code.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add a mechanism to requeue groups we receive from the
kernel if the IDs are in a weird order (Group ID is lower
than individual nexthop IDs for example).
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
If we get a nexthop group from the kernel with labels
and queue it as a context to process later, we have to
free the label stack we allocated.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add some getters for the nhg_ctx struct. Probably unnecessary
at this point since they are all static but if they ever become
public it will be nice to have them.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add code for handling nexthop group hash entry encaps
and sending them to the kernel. Add some more debugging
information for the encaps and groups in general.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
There was some code copypasta for mpls stack building in the
netlink install path. Reduced that to a common function.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
When querying for detailed route information, show the nexthop
group id for its nh_hash_entry in the output before listing the
nexthops.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add some more detailed output to `show nexthop-group`.
It closely resembles the output of `show ip routes`.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Optimize the fib and notified nexthop group comparison algorithm
to assume ordering. There were some pretty serious performance hits with
this on high ecmp routes.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Update nhg_hash_entry to use the non-recursive version of
nexthop_group_equal() since it doesn't really need to compare all
of those.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
We were waiting until install time to mark nexthops as duplicate.
Since they are immutable now and re-used, move this marking into
when they are actually created to save a bunch of cycles.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Before checking the equivalence of the whole group itself,
check to see if they contain the same number of non-recursive
active nexthops. This should shorten lookup time for the case of
non-resolved nexthop group creation.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Create any depends only after the initial hash lookup
fails. Should reduce hashing cpu cycles significantly.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
When we receive a route delete from the kernel and it
contains a nexthop object id, use that to match against
route gateways with instead of explicit nexthops.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>