Commit Graph

565 Commits

Author SHA1 Message Date
Donald Sharp
cf05bc9424 zebra: Modify zebra to order nexthops received
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-07-09 10:44:41 -04:00
David Lamparter
a5ddc34dc7
lib: Add a couple functions to nexthop_group.c (#4323)
lib: Add a couple functions to nexthop_group.c

Co-authored-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-07-03 14:42:35 +02:00
Sri Mohana Singamsetty
1c05eb4419
Merge pull request #4622 from donaldsharp/import_table_fix
Import table fix
2019-07-02 11:20:02 -07:00
Quentin Young
2951a7a4c2 *: s/TRUE/true/, s/FALSE/false/
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2019-07-01 17:26:05 +00:00
Stephen Worley
50d8965075 lib: Private api for nexthop_group manipulation
Add a file that exposes functions which modify nexthop groups.
Nexthop groups are techincally immutable but there are a
few special cases where we need direct access to add/remove
nexthops after the group has been made. This file provides a
way to expose those functions in a way that makes it clear
this is a private/hidden api.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-06-25 22:58:48 -04:00
Donald Sharp
fe257ae733 zebra: Push VRF_DEFAULT outside of import table code
The import table code assumes that they will only work
in the default vrf.  This is ok, but we should push the
vrf_id and zvrf to be passed in instead of just using
VRF_DEFAULT.

This will allow us to fix a couple of things:

1) A bug in import where we are not creating the
route entry with the appropriate table so the imported
entry is showing up in the wrong spot.

2) In the future allow `ip import-table X` to become
vrf aware very easily.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-06-25 17:47:41 -04:00
Donald Sharp
82d7d0e28a zebra: Improve debugging when we can't find a route to delete
Improve debugging when we cannot find a route to delete
that we have been told to delete.

New output:

2019/06/25 17:43:49 ZEBRA: default[0]:4.5.6.7/32 doesn't exist in rib
2019/06/25 17:43:49 ZEBRA: default[0]:4.5.6.8/32 doesn't exist in rib

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-06-25 17:47:41 -04:00
Donald Sharp
53c16fbec0 zebra: Put route in debug dump of rib data
When dumping rib data about a route for `debug rib detail`
modify the dump command to display the prefix as part
of every line so that we can use a grep on the log
file.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-06-20 04:55:47 -04:00
Don Slice
739c9c90e7 zebra: resolve issue with rnh not evaluating nexhops correctly
Problem discovered in testing that occasionally when an interface
address was flushed, the corresponding route would be removed from
the kernel and zebra but remain in the bgp table and be advertised
to peers.  Discovered that when zebra_rib_evaluate_nexthops spun
thru the tree list of rns, if the timing and circumstances were
right, it would move elements and miss evaluating some.  Changed
from frr_each to frr_each_safe and the problem is now gone.

Ticket: CM-25301
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
2019-06-19 07:06:32 -07:00
Donald Sharp
0a7be32866 zebra: Display a bit better debugging for rnh tracking
Add a expected count for the route node we will be processing
as part of nexthop resolution and modify the type to display
a useful string of what the type is instead of a number.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-06-18 15:47:10 -04:00
David Lamparter
731bea2844
Merge pull request #4417 from sworleys/Move-Multicast-Mode
zebra: Move multicast mode to being a property of the router
2019-06-03 15:59:48 +02:00
Sri Mohana Singamsetty
979dd989c4
Merge pull request #4413 from donaldsharp/bgp_distance_comes_closer
Bgp distance comes closer
2019-05-30 09:45:43 -07:00
Donald Sharp
526052fb6d zebra: Move multicast mode to being a property of the router
The multicast mode enum was a global static in zebra_rib.c
it does not belong there, it belongs in zebra_router, moving.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-29 15:25:33 -04:00
Quentin Young
eab4a5c2d0 zebra: strcat -> strlcat
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2019-05-29 18:03:26 +00:00
Mark Stapp
827debeac2
Merge pull request #4326 from sworleys/Move-NH-Active-Functions
zebra: Move nexthop_active_XXX functions to zebra_nhg.c
2019-05-29 11:35:27 -04:00
Donald Sharp
94c08afe02
Merge pull request #4228 from mjstapp/dplane_notif
zebra: async notifications from the dataplane
2019-05-29 10:10:05 -04:00
Donald Sharp
eea2ef0754 zebra: BGP always sends distance no need to double check
BGP always sends down the correct distance to use.  We do
not need rib_add_multipath to double check the code.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-29 08:57:11 -04:00
Stephen Worley
ad28e79ac9 zebra: Move nexthop_active_XXX functions to zebra_nhg.c
Since these functions are not really rib processing problems
let's move them to zebra_nhg.c which is meant for processing of
nexthop groups.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-28 17:41:38 -04:00
Mark Stapp
188a00e014 zebra: generate updates from notifications
If an async notification changes a route that's current,
generate an update to keep the kernel in sync.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-05-28 13:41:37 -04:00
Renato Westphal
f6fd430e44
Merge pull request #4322 from sworleys/Nexthop-Cmp
lib: Add nexthop_cmp
2019-05-28 11:32:44 -03:00
Mark Stapp
104e3ad95e zebra: mpls lsp async notifications
Add LSP notification event type; add a handler for LSP notifs;
dispatch to that handler.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-05-28 08:35:01 -04:00
Mark Stapp
efe6c026a9 zebra: support route changes via dplane notifications
Allow route notifications to trigger route state changes,
such as installed -> not installed.

Clean up the fib-specific nexthop-group in a couple of
un-install paths.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-05-28 08:34:31 -04:00
Mark Stapp
941e261c97 zebra: share rib processing of updates and notifications
Use some common handling for both route update results
processing and dataplane notification processing. Use the
fib-specific nexthop-group if the update to a route results
in different nexthop status than the default rib-provided
nexthop-group.

Use the fib-specific nexthop-group, if present, to provide
the output of 'show ip fib'.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-05-28 08:34:21 -04:00
Mark Stapp
ee5e8a4820 zebra: add a fib-specific nexthop-group
Add a fib-specific nhg, distinct from the nhg developed from
the route-owner / RIB information.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-05-28 08:27:42 -04:00
Mark Stapp
54818e3b01 zebra: begin dataplane notifications
Add dataplane route notification type; add handler for zebra
routes.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-05-28 08:22:27 -04:00
Mark Stapp
5695d9ac5d zebra: set nexthop install state more accurately
When setting route nexthops' installation state based on a
dataplane context struct, unset the installed state if a
nexthop was not present in the dataplane context.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-05-28 08:22:21 -04:00
Mark Stapp
fad4d69cd4 zebra: add api to locate route-node from dplane ctx
Create a helper api that locates a zebra route-node from info
in a dplane context struct. Moved code from the results handler
to make a more-general api that could be used in other paths.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-05-28 08:21:20 -04:00
Mark Stapp
78bf56b0b6 zebra: add api to update route from dplane ctx
Add an api to update the status of a route based on info
from a dplane context object. Use the api when processing
route update results from the dataplane.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-05-28 08:21:09 -04:00
Donald Sharp
d4644d4196 zebra: Add kernel level graceful restart
<Initial Code from Praveen Chaudhary>

Add the a `--graceful_restart X` flag to zebra start that
now creates a timer that pops in X seconds and will go
through and remove all routes that are older than startup.

If graceful_restart is not specified then we will just pop
a timer that cleans everything up immediately.

Signed-off-by: Praveen Chaudhary <pchaudhary@linkedin.com>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-23 19:35:42 -04:00
Stephen Worley
a5a2d802d7 lib,zebra,bgpd,pbrd: Compare nexthops without labels
Allow label ignoring when comparing nexthops. Specifically,
add another functon nexthop_same_no_labels() that shares
a path with nexthop_same() but doesn't check labels.

rib_delete() needs to ignore labels in this case.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-05-23 12:21:15 -04:00
Stephen Worley
78fba41bd8 lib,zebra,bgpd: Remove nexthop_same_no_recurse()
The functions nexthop_same() does not check the resolved
nexthops so I don't think this function is even needed
anymore.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-05-23 12:21:15 -04:00
Renato Westphal
81fddbe7ae *: rename new ForEach macros from the typesafe API
This is necessary to avoid a name collision with std::for_each
from C++.

Fixes the compilation of the gRPC northbound module.

Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
2019-05-21 15:59:08 -03:00
Quentin Young
303b93cdee zebra: update zebra_rib for vrrp
VRRP doesn't install any routes, but should still have an array entry.
Also add a help string for VRRP to route_types.txt

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2019-05-17 00:27:08 +00:00
Russ White
1b072ce466
Merge pull request #4269 from donaldsharp/other_tables
zebra Other tables
2019-05-16 10:11:56 -04:00
Russ White
cc25952f2a
Merge pull request #4327 from sworleys/Move-Multipath-Num
zebra: Move multipath_num into zrouter
2019-05-16 10:04:14 -04:00
Donald Sharp
b3f2b59020 zebra: Move multipath_num into zrouter
The multipath_num variable is a property of zebra_router,
so move it there.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-14 14:15:18 -07:00
Mark Stapp
98124e2d6a
Merge pull request #4321 from sworleys/Ribsystem-Ribkernel
zebra: Make RIB_SYSTEM|KERNEL_ROUTE a property of rib.h
2019-05-14 09:29:08 -04:00
Donald Sharp
84340a15b4 zebra: Make RIB_SYSTEM|KERNEL_ROUTE a property of rib.h
These defines should be available from rib.h

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-13 13:11:49 -04:00
Donald Sharp
98572489ea zebra: Switch to using monotime(NULL) for re->uptime
The re->uptime usage of time(NULL) leaves it open to
timing changes from outside influence.  Switching
to monotime allows us to ensure that we have a timestamp
that is always increasing.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-11 01:44:42 -04:00
Renato Westphal
773fc72b1f
Merge pull request #4242 from donaldsharp/zebra_diet
Zebra diet
2019-05-10 08:29:59 -03:00
Donald Sharp
d8612e6545 zebra: Track tables allocated by vrf and cleanup
For each table created by a vrf, keep track of it and
allow for proper cleanup on shutdown of that particular
table.  Cleanup client shutdown to only cleanup data
that the particular vrf owns.  Before we were cleaning
the same table 2 times.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-09 07:11:22 -04:00
Renato Westphal
95f092540e
Merge pull request #4256 from donaldsharp/zebra_table
doc, zebra: Remove "table X" command
2019-05-06 19:08:17 -03:00
Donald Sharp
c447ad08b2 doc, zebra: Remove "table X" command
This command is broken and has been broken since the introduction
of vrf's.  Since no-one has complained it is safe to assume that
there is no call for this specialized linux command.  Remove
from the system with extreme prejudice.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-06 13:42:23 -04:00
Donald Sharp
8dc7a75918 zebra: Add some extra safety for route_info
The route_info[X].meta_q_map *must* be less than MQ_SIZE
or we will do some strange stuff, so assert on it at startup.

The distance in route_info is a uint8_t so let's keep the data
structure the same.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-03 05:05:19 -04:00
Donald Sharp
4bb55bbecc zebra: ifp must be a real pointer sometimes
The ifp pointer must be pointing at a real location
in memory since right above us in this loop we
return if it is.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-03 05:05:19 -04:00
Donald Sharp
045207e27c zebra: Use built in list handler for route entries now
The route entry code was using a custom linked list to handle
route entries.  Remove and replace with the new lib link list
code.  This reduces the size of the route entry by a further
8 bytes.

Observant people will notice that the current linked list
implementation is singly linked, while the Route Entry
is doubly linked.  I am not terribly concerned about this
change as that 1) we do not see a large number of route
entries per prefix( say 2 maybe 3 items ) and route entries
do not come and go that often.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-02 17:41:35 -04:00
Donald Sharp
aa57abfbb5 zebra: Remove linked list and replace with new LIST
The `struct rib_dest_t` was being used to store the linked
list of rnh's associated with the node.  This was taking up
a bunch of memory.  Replace with new data structure supplied
by David and see the memory reductions associated with 1 million
routes in the zebra rib:

Old:
Memory statistics for zebra:
System allocator statistics:
  Total heap allocated:  675 MiB
  Holding block headers: 0 bytes
  Used small blocks:     0 bytes
  Used ordinary blocks:  567 MiB
  Free small blocks:     39 MiB
  Free ordinary blocks:  69 MiB
  Ordinary blocks:       0
  Small blocks:          0
  Holding blocks:        0

New:
Memory statistics for zebra:
System allocator statistics:
  Total heap allocated:  574 MiB
  Holding block headers: 0 bytes
  Used small blocks:     0 bytes
  Used ordinary blocks:  536 MiB
  Free small blocks:     33 MiB
  Free ordinary blocks:  4600 KiB
  Ordinary blocks:       0
  Small blocks:          0
  Holding blocks:        0

`struct rnh` was moved to rib.h because of the tangled web
of structure dependancies.  This data structure is used
in numerous places so it should be ok for the moment.
Future work might be needed to do a better job of splitting
up data structures and function definitions.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-02 16:21:38 -04:00
Lou Berger
e8b9ad5cdd
Revert "Zebra diet" 2019-05-02 06:54:59 -04:00
Donald Sharp
0a45d97472 zebra: Remove linked list and replace with new LIST
The `struct rib_dest_t` was being used to store the linked
list of rnh's associated with the node.  This was taking up
a bunch of memory.  Replace with new data structure supplied
by David and see the memory reductions associated with 1 million
routes in the zebra rib:

Old:
Memory statistics for zebra:
System allocator statistics:
  Total heap allocated:  675 MiB
  Holding block headers: 0 bytes
  Used small blocks:     0 bytes
  Used ordinary blocks:  567 MiB
  Free small blocks:     39 MiB
  Free ordinary blocks:  69 MiB
  Ordinary blocks:       0
  Small blocks:          0
  Holding blocks:        0

New:
Memory statistics for zebra:
System allocator statistics:
  Total heap allocated:  574 MiB
  Holding block headers: 0 bytes
  Used small blocks:     0 bytes
  Used ordinary blocks:  536 MiB
  Free small blocks:     33 MiB
  Free ordinary blocks:  4600 KiB
  Ordinary blocks:       0
  Small blocks:          0
  Holding blocks:        0

`struct rnh` was moved to rib.h because of the tangled web
of structure dependancies.  This data structure is used
in numerous places so it should be ok for the moment.
Future work might be needed to do a better job of splitting
up data structures and function definitions.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-01 20:28:57 -04:00
Stephen Worley
eaa2716dfb zebra: Check on startup route_info has all types
Add a function to check if the route_info array
has all types specified with data in it. Specifically,
test the 'key' attribute for non-zero data. Ignore
ZEBRA_ROUTE_SYSTEM as it should be zero key anyway.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-05-01 15:32:18 -04:00
Mark Stapp
88351c8f6d
Merge pull request #4226 from sworleys/PBR-BFD-OF-route_info
zebra: Add PBR, BFD, OpenFabric to route_info
2019-05-01 11:22:54 -04:00
Lou Berger
31e944a8a7
Merge pull request #3045 from opensourcerouting/atoms
READY: lists/skiplists/rb-trees new API & sequence lock & atomic lists
2019-04-30 10:26:35 -04:00
Stephen Worley
d6abd8b070 zebra: Comment to ensure types added to route_info
Add a comment to indicate that route types added to
Zebra, should also be present in the route_info array.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-04-30 10:07:45 -04:00
Stephen Worley
eab7b6e371 zebra: Add OpenFabric to route_info array
Add OpenFabric to the route_info array for handling processing
of the OpenFabric route type.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-04-29 19:28:15 -04:00
Stephen Worley
42d96b73cb zebra: Add BFD to route_info array
Add BFD to the route_info array for handling processing
of the BFD route type.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-04-29 19:26:11 -04:00
Stephen Worley
9815665214 zebra: Add PBR to route_info array
Add PBR to the route_info array for handling processing
of the PBR route type.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-04-29 19:24:26 -04:00
Don Slice
ade4a8868e zebra: resolve issue with protocol route-map not applied properly
Problem reported that route-maps applied to "ip protocol table bgp"
would not be invoked if the ip protocol table command was issued
after the bgp prefixes were installed.  Found that a recent change
improving how often nexthop_active_update runs missed causing this
filtering to be applied. This fix resolves that issue as well as
a couple of other places that were problematic with the recent
change.

Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
2019-04-26 17:15:44 +00:00
Donald Sharp
df38b099ee zebra: Update flag output for route entry dump
Update the nexthop flag output for the route entry dump to
include all possible flag states be output.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-04-18 14:57:54 -04:00
Donald Sharp
6883bf8d35 zebra: Run nexthop_active_check once
We currently run nexthop_active_check multiple times.  Make the
code run once and figure out state from that.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-04-18 14:57:54 -04:00
Donald Sharp
80ad04184f zebra: Double check is not necessary in nexthop_active_update
The nexthop_active_update command looks at each individual
nexthop and decides if it has changed.  If any nexthop
has changed we will set the re->status to ROUTE_ENTRY_CHANGED
and ROUTE_ENTRY_NEXTHOPS_CHANGED.

Additionally the test for old_nh_num != curr_active
makes no sense because suppose we have several events
we are processing at the same time and a total ecmp
of 16 but 14 are active at the start and 14 are active
at the end but different interfaces are up or down.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-04-18 14:57:54 -04:00
Donald Sharp
dd50eeb115 lib, zebra: Remove unused flag
The NEXTHOP_FLAG_FILTERED went away when we started treating
static routes like every other route in the system.  This was
a special case for handling static route code that just didn't
get finished cleaning up.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-04-18 14:57:54 -04:00
Donald Sharp
99eabcec1a zebra: nexthop_active_update does not need set
We are effectively calling nexthop_active_update() on every
route entry being processed for installation at least 2 times.
This is a bit ridiculous.  We need to resolve the nexthops
when we know a route has changed in some manner, so do so.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-04-18 14:57:54 -04:00
Renato Westphal
e412d3b8d9 lib: move zlog() prototype back to the public logging API
zlog() should be part of the public logging API as it's useful in
the cases where the logging priority isn't known at compile time
(i.e. it depends on a variable).

Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
2019-04-18 13:15:13 -03:00
David Lamparter
7e3a1ec742 lib: ZEBRA_NUM_OF -> array_size
The latter is widely used, e.g. in the Linux kernel.

Signed-off-by: David Lamparter <equinox@diac24.net>
2019-04-18 12:44:29 +02:00
Mark Stapp
cf363e1bd8 zebra: dataplane notifications for system route changes
Add notifications from zebra to the dataplane subsystem when
kernel or connected routes change.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-04-10 16:07:01 -04:00
Mark Stapp
f4c6e2a815 zebra: remove unused VRF_RIB_SCHEDULED flag
We don't use th vrf-level VRF_RIB_SCHEDULED flag any longer;
remove it and collapse the zebra_vrf flags' values.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-04-05 08:46:28 -04:00
Donald Sharp
a1494c250c zebra: Modify lsp processing to be invoked as needed
LSP processing was a zvrf flag based upon a connected route
coming or going.  But this did not allow us to know
that we should do lsp processing other than after the meta-queue
processing was finished.

Eventually we moved meta-queue processing of do_nht_processing
to after the dataplane sent the main pthread some results.
This of course left us with a timing hole where if a connected
route came in and we received a data plane response *before*
the meta queue was processed we would not do the work as necessary.

Move the lsp processing to a flag off of the rib_dest_t. If it
is marked then we need to process lsps.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-03-27 16:22:22 -04:00
Donald Sharp
50872b0804 zebra: Add detailed debugging command for NHT tracking
Add a detailed debugging command for NHT tracking and add
the detailed output to the log about why we make some decisions
that we are.  I tried to model this like the rib processing
detailed debugs that we added a few months back.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-03-27 16:22:22 -04:00
Donald Sharp
699dae230d zebra: Modify NHT to occur when needed.
Currently nexthop tracking is performed for all nexthops that
are being tracked after a group of contexts are passed back
from the data plane for post install processing.

This is inefficient and leaves us sending nexthop tracking
changes at an accelerated pace, when we think we've changed
a route.  Additionally every route change will cause us
to relook at all nexthops we are tracking irrelevant if
they are possibly related to the route change or not.

Let's modify the code base to track the rnh's off of the rib
table's rn, `rib_dest_t`.  So after we process a node, install
it into the data plane, in rib_process_result we can
look at the `rib_dest_t` associated with the rn and see that
a nexthop depended on this route node.  If so, refigure it.

Additionally we will store rnh's that are not resolved on the
0.0.0.0/0 nexthop tracking list.  As such when a route node
changes we can quickly walk up the rib tree and notice that
it needs to be reprocessed as well.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-03-27 16:22:22 -04:00
Donald Sharp
c86ba6c283 zebra: Add a base node for the zebra vrf tables
Add a default route_node for our routing tables.  This will allow us
to know that we can hang data off the default route for processing.

We will be hanging the nexthop tracking data structures off the rib_dest_t
so that we can know which nexthops we need to handle.  Effectively
nexthops that we are tracking that are unresolved will be stored on the
default route.  When something changes in the rib tree we can
work up the rn->parent pointer checking for nexthops we need to re-evaluate.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-03-27 16:19:28 -04:00
Donald Sharp
434434f704 zebra: Abstract the rib_dest_t creation
Abstract the creation of the rib_dest_t so that we can call it
from multiple places.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-03-27 16:19:28 -04:00
Donald Sharp
3cdba47a82 zebra: Modify code so that dplane is responsible for indicating success/fail of install
We have several route types KERNEL and CONNECT that are handled via special
case in the code.  This was causing a lot of work keeping the two different
classes of route types as special(SYSTEM OR NOT).  Put the dplane
in charge of the code that sets the bits for signalling route install/failure.

This greatly simplifies the code calling path and makes all route types
be handled exactly the same.  Additionaly code that we want to run
post data plane install can just work as per normal then, instead
of having to know we need to run it when we have a special type
of route.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com.
2019-03-27 16:19:28 -04:00
Donald Sharp
7a230a9d0c zebra: On route install/update failure correctly indicate in rib
When we get a route install failure from the kernel, actually
indicate in the rib the status of the routes.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-03-27 16:19:28 -04:00
Donald Sharp
9ef0c6ba87 zebra: Unset old_re as queued.
When switching routes from one route type to another actually
unset the old route as enqueued.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-03-27 16:19:28 -04:00
Quentin Young
73fb891892 Revert "Merge pull request #3982 from pacovn/Coverity_1479148_copy_paste"
This reverts commit 3a3704fe36, reversing
changes made to 5a3c6e736d.
2019-03-20 21:25:04 +00:00
F. Aragon
23fbacb455
zebra: copy-paste error (Coverity 1479148)
Signed-off-by: F. Aragon <paco@voltanet.io>
2019-03-20 16:45:32 +01:00
Donald Sharp
b900245adc zebra: System routes sometimes can not be properly selected
System Routes if received over the netlink bus in a
specific pattern that causes an update operation for that
route in zebra can leave the dest->selected_fib pointer NULL,
while having the ZEBRA_FLAG_SELECTED flag set. Specifically
one way to achieve this is to do this:

`ip addr del 4.5.6.7/32 dev swp1 ; ip addr add 4.5.6.7/32 dev swp1 metric 9`

Why is this a big deal?
Because nexthop tracking is looking at ZEBRA_FLAG_SELECTED to
know if we can use a route, while nexthop active checking uses
dest->selected_fib.

So imagine we have bgp registering a nexthop. nexthop tracking in
the above case will be able to choose the 4.5.6.7/32 route
if that is what the nexthop is, due to the ZEBRA_FLAG_SELECTED being
properly set. BGP then allows the peers connection to come up and we
install routes with a 4.5.6.7 nexthop. The rib processing for route
installation will then look at the 4.5.6.7 route see no
dest->selected_fib and then start walking up the tree to resolve
the route. In our case we could easily hit the default route and be
unable to resolve the route. Which then becomes inactive in the
rib so we never attempt to install it.

This commit fixes this problem because when the rib_process decides
that we need to update the fib( ie replace old w/ new ), the
replacement with new was not setting the `dest->selected_fib` pointer
to the new route_entry, when the route was a system route.

Ticket: CM-24203
Signed-off-by: Donald Sharp <sharpd@cumulusnetworkscom>
2019-03-15 10:02:11 -04:00
David Lamparter
d3b05897ed
Merge pull request #3869 from qlyoung/cocci-fixes
Assorted Coccinelle fixes
2019-03-06 15:54:44 +01:00
vivek
2b83602b24 *: Explicitly mark nexthop of EVPN-sourced routes as onlink
In the case of EVPN symmetric routing, the tenant VRF is associated with
a VNI that is used for routing and commonly referred to as the L3 VNI or
VRF VNI. Corresponding to this VNI is a VLAN and its associated L3 (IP)
interface (SVI). Overlay next hops (i.e., next hops for routes in the
tenant VRF) are reachable over this interface. Howver, in the model that
is supported in the implementation and commonly deployed, there is no
explicit Overlay IP address associated with the next hop in the tenant
VRF; the underlay IP is used if (since) the forwarding plane requires
a next hop IP. Therefore, the next hop has to be explicit flagged as
onlink to cause any next hop reachability checks in the forwarding plane
to be skipped.

https://tools.ietf.org/html/draft-ietf-bess-evpn-prefix-advertisement
section 4.4 provides additional description of the above constructs.

Use existing mechanism to specify the nexthops as onlink when installing
these routes from bgpd to zebra and get rid of a special flag that was
introduced for EVPN-sourced routes. Also, use the onlink flag during next
hop validation in zebra and eliminate other special checks.

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by:   Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Reviewed-by:   Donald Sharp <sharpd@cumulusnetworks.com>
2019-02-27 12:54:24 +00:00
Quentin Young
9f2d035447 *: remove useless return variables
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2019-02-25 23:00:16 +00:00
Donald Sharp
5f27bcba2a zebra: Fix use after free in rib_process_result
Running zebra after commit 888756b208
in valgrind produces this item:

==17102== Invalid read of size 8
==17102==    at 0x44D84C: rib_dest_from_rnode (rib.h:375)
==17102==    by 0x4546ED: rib_process_result (zebra_rib.c:1904)
==17102==    by 0x45436D: rib_process_dplane_results (zebra_rib.c:3295)
==17102==    by 0x4D0902B: thread_call (thread.c:1607)
==17102==    by 0x4CC3983: frr_run (libfrr.c:1011)
==17102==    by 0x4266F6: main (main.c:473)
==17102==  Address 0x83bd468 is 88 bytes inside a block of size 96 free'd
==17102==    at 0x4A35F54: free (vg_replace_malloc.c:530)
==17102==    by 0x4CCAC00: qfree (memory.c:129)
==17102==    by 0x4D03DC6: route_node_destroy (table.c:501)
==17102==    by 0x4D039EE: route_node_free (table.c:90)
==17102==    by 0x4D03971: route_node_delete (table.c:382)
==17102==    by 0x44D82A: route_unlock_node (table.h:256)
==17102==    by 0x454617: rib_process_result (zebra_rib.c:1882)
==17102==    by 0x45436D: rib_process_dplane_results (zebra_rib.c:3295)
==17102==    by 0x4D0902B: thread_call (thread.c:1607)
==17102==    by 0x4CC3983: frr_run (libfrr.c:1011)
==17102==    by 0x4266F6: main (main.c:473)
==17102==  Block was alloc'd at
==17102==    at 0x4A36FF6: calloc (vg_replace_malloc.c:752)
==17102==    by 0x4CCAA2D: qcalloc (memory.c:110)
==17102==    by 0x4D03D88: route_node_create (table.c:489)
==17102==    by 0x4D0360F: route_node_new (table.c:65)
==17102==    by 0x4D034F8: route_node_set (table.c:74)
==17102==    by 0x4D03486: route_node_get (table.c:327)
==17102==    by 0x4CFB700: srcdest_rnode_get (srcdest_table.c:243)
==17102==    by 0x4545C1: rib_process_result (zebra_rib.c:1872)
==17102==    by 0x45436D: rib_process_dplane_results (zebra_rib.c:3295)
==17102==    by 0x4D0902B: thread_call (thread.c:1607)
==17102==    by 0x4CC3983: frr_run (libfrr.c:1011)
==17102==    by 0x4266F6: main (main.c:473)
==17102==

This is happening because of this order of events:

1) Route is deleted in the main thread and scheduled for rib processing.
2) Rib garbage collection is run and we remove the route node since it
is no longer needed.
3) Data plane returns from the deletion in the kernel and we call
the srcdest_rnode_get function to get the prefix that was deleted.
This recreates a new route node.  This creates a route_node with
a lock count of 1, which we freed via the route_unlock_node call.
Then we continued to use the rn pointer.  Which leaves us with use
after frees.

The solution is, of course, to just move the unlock the node at the
end of the function if we have a route_node.

Fixes: #3854
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-02-23 20:03:48 -05:00
Mark Stapp
5c111895d6 zebra: unlock route-node in dplane results handler
Unlock the route-node struct we look up while processing
async dataplane results.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-02-21 16:15:14 -05:00
Mark Stapp
8263d1d0d9 zebra: use update semantics for routes consistently
Use 'update' semantics for route updates, to ensure that
netlink replace behavior works correctly.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-02-11 16:11:02 -05:00
David Lamparter
b7777b57c4
Merge pull request #3722 from donaldsharp/static_recursive
Zebra fixes
2019-02-07 19:22:29 +01:00
Donald Sharp
4634d02cfd
Merge pull request #3684 from mjstapp/dplane_pw
zebra: async dataplane for pseudowires
2019-02-05 18:41:12 -05:00
Donald Sharp
6c47d39902 zebra: Fix multiple levels of static recursion
Allow the nexthop-check code to figure out recursive static routes
in a logical manner.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-02-05 15:21:26 -05:00
Donald Sharp
46a4e3455b zebra: NHT was being run at least 2 times and missreporting data
With the data plane changes that were made, we are now running
nexthop tracking 2 times.  Once at the end of meta-queue insertion
and once at the end of receiving a bunch of data from the dataplane.

The Addition of the data plane code caused flags to not be set
fully for the resolved routes( since we do not know the answer yet ),
This in turn caused the nexthop tracking run after the meta-queue
to think that the route was not `good`.  This would cause it to
tell all interested parties that there was no nexthop.

After the dataplane insertion we are also no running nht code.
This was re-figuring out the nexthop correctly and also
correctly reporting to interested parties that there was a path again.

Example:
donna.cumulusnetworks.com(config)# do show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued route, f - failed route

K>* 0.0.0.0/0 [0/103] via 10.50.11.1, enp0s3, 00:06:47
S>* 4.5.6.7/32 [1/0] via 192.168.209.1, enp0s8, 00:04:47
C>* 10.50.11.0/24 is directly connected, enp0s3, 00:06:47
C>* 192.168.209.0/24 is directly connected, enp0s8, 00:06:47
C>* 192.168.210.0/24 is directly connected, enp0s9, 00:06:47
donna.cumulusnetworks.com(config)# ip route 4.5.6.7/32 192.168.210.1
donna.cumulusnetworks.com(config)# do show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued route, f - failed route

K>* 0.0.0.0/0 [0/103] via 10.50.11.1, enp0s3, 00:07:06
S>* 4.5.6.7/32 [1/0] via 192.168.209.1, enp0s8, 00:00:04
  *                  via 192.168.210.1, enp0s9, 00:00:04
C>* 10.50.11.0/24 is directly connected, enp0s3, 00:07:06
C>* 192.168.209.0/24 is directly connected, enp0s8, 00:07:06
C>* 192.168.210.0/24 is directly connected, enp0s9, 00:07:06
donna.cumulusnetworks.com(config)#

Log files for sharp, which is watching 4.5.6.7:
2019/02/04 15:20:54.844288 SHARP: Received update for 4.5.6.7/32
2019/02/04 15:20:54.844820 SHARP: Received update for 4.5.6.7/32
2019/02/04 15:20:54.844836 SHARP: 	Nexthop 192.168.209.1, type: 2, ifindex: 3, vrf: 0, label_num: 0
2019/02/04 15:20:54.844853 SHARP: 	Nexthop 192.168.210.1, type: 2, ifindex: 4, vrf: 0, label_num: 0

As you can see we have received an update with no nexthops( invalid route )
and a second update immediately after it with 2 nexthops.

What's the big deal you say?  Well we have code in other daemons that reacts
to not having a path for a nexthop.  In BGP this will cause us to tear
down the peer.  In staticd we'll remove the recursively resolved route.
In pim we'll remove all paths to the mroute.  This is not desirable.

The fix is to remove the meta-queue run of nexthop tracking.

While running after data plane notice of routes to handle is not ideal
we will be fixing this in the future with the nexthop group code, which
should know what nexthops are affected by a nexthop group change.

Fixed code debug code:
donna.cumulusnetworks.com(config)# do show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued route, f - failed route

K>* 0.0.0.0/0 [0/103] via 10.50.11.1, enp0s3, 00:00:46
S>* 4.5.6.7/32 [1/0] via 192.168.209.1, enp0s8, 00:00:02
C>* 10.50.11.0/24 is directly connected, enp0s3, 00:00:46
C>* 192.168.209.0/24 is directly connected, enp0s8, 00:00:46
C>* 192.168.210.0/24 is directly connected, enp0s9, 00:00:46
donna.cumulusnetworks.com(config)# ip route 4.5.6.7/32 192.168.210.1
donna.cumulusnetworks.com(config)# do show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued route, f - failed route

K>* 0.0.0.0/0 [0/103] via 10.50.11.1, enp0s3, 00:00:59
S>* 4.5.6.7/32 [1/0] via 192.168.209.1, enp0s8, 00:00:02
  *                  via 192.168.210.1, enp0s9, 00:00:02
C>* 10.50.11.0/24 is directly connected, enp0s3, 00:00:59
C>* 192.168.209.0/24 is directly connected, enp0s8, 00:00:59
C>* 192.168.210.0/24 is directly connected, enp0s9, 00:00:59

2019/02/04 15:26:20.656395 SHARP: Received update for 4.5.6.7/32
2019/02/04 15:26:20.656440 SHARP: 	Nexthop 192.168.209.1, type: 2, ifindex: 3, vrf: 0, label_num: 0
2019/02/04 15:26:33.688251 SHARP: Received update for 4.5.6.7/32
2019/02/04 15:26:33.688322 SHARP: 	Nexthop 192.168.209.1, type: 2, ifindex: 3, vrf: 0, label_num: 0
2019/02/04 15:26:33.688329 SHARP: 	Nexthop 192.168.210.1, type: 2, ifindex: 4, vrf: 0, label_num: 0

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-02-05 09:17:02 -05:00
Donald Sharp
2561d12e5d zebra: Remove struct zebra_t
This structure is unused anymore and does not belong in zserv.h

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-31 09:20:46 -05:00
Donald Sharp
ea45a4e7db zebra: Move the mq data structure to zrouter
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-31 09:20:46 -05:00
Donald Sharp
489a961429 zebra: Move ribq from zebrad to zrouter
The zrouter should own this data structure and it should not
be defined in zserv.h

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-31 09:20:46 -05:00
Donald Sharp
b3d43ff471 zebra: Move rtm_table_default to zrouter
The zrouter should own this particular piece of data.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-31 09:20:46 -05:00
Donald Sharp
3801e7646c zebra: Move the master thread handler to the zrouter structure
The master thread handler is really part of the zrouter structure.
So let's move it over to that.  Eventually zserv.h will only be
used for zapi messages.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-31 09:20:46 -05:00
David Lamparter
19b336d343
Merge pull request #3699 from donaldsharp/zebra_rib_debugs
Zebra Respect my authority
2019-01-31 01:51:52 +01:00
Donald Sharp
b9f0e5ee24 zebra: On route update context is sometimes indeterminate in post-processing
When we get into rib_process_result and the operation we are handling
is DPLANE_OP_ROUTE_UPDATE *and* the route entry being looked at
is a route replace, we currently have no way to decode to the old_re
and the re due to how we have stored context.  As such they are the
same pointer.

As such the route replace for the same route type is causing the re
to set the installed flag and then immediately unset the installed
flag, leaving us in a state where the kernel has the route but
the rib thinks we are not installed.

Since the true old_re( the one being replaced by the update operation )
is going away( as that it zebra deletes the old one for us already )
this fix is not optimal but will get us moving forward.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-30 09:52:13 -05:00
Donald Sharp
058c16b7e2 zebra: Trust kernel and System routes
If we receive a valid message from the kernel that
is either a kernel or system route, we should trust
that the route is legit and just use it.

Old behavior:

K * 172.22.0.0/15 [0/0] via 172.22.2.254, eva_dummy1 inactive, 00:00:16

New Behavior:

K>* 172.22.0.0/15 [0/0] via 172.22.2.254, eva_dummy1, 00:02:35

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-29 21:45:02 -05:00
Donald Sharp
2da33d6b3a zebra: Convert route entry id number to string in debugs
The route entry being displayed in debugs was displaying
the originating route type as a number.  While numbers
are cool, I for one am not terribly interested in
memorizing them.  Modify the (type %d) to a (%s) to
just list the string type of the route.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-29 21:35:07 -05:00
Russ White
2538f1dad7
Merge pull request #3681 from donaldsharp/onlink
*: The onlink attribute should be owned by the nexthop not the route.
2019-01-29 10:09:44 -05:00
Donald Sharp
fe85601c96 *: The onlink attribute should be owned by the nexthop not the route.
The onlink attribute was being passed from upper level protocols
as an attribute of the route *not* the individual nexthop.  When
we pass this data to the kernel, we treat the onlink as a attribute
of the nexthop.  This commit modifies the code base to allow
us to pass the ONLINK attribute as an attribute of the nexthop.

This commit also fixes static routes that have multiple nexthops
some onlink and some not.

ip route 4.5.6.7/32 192.168.41.1 eveth1 onlink
ip route 4.5.6.7/32 192.168.42.2

S>* 4.5.6.7/32 [1/0] via 192.168.41.1, eveth1 onlink, 00:03:04
  *                  via 192.168.42.2, eveth2, 00:03:04

sharpd@robot ~/frr2> sudo ip netns exec EVA ip route show
4.5.6.7 proto 196 metric 20
	nexthop via 192.168.41.1 dev eveth1 weight 1 onlink
	nexthop via 192.168.42.2 dev eveth2 weight 1

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-26 21:02:26 -05:00
Donald Sharp
60f98b236e zebra: Keep track of when routes are queued/dequeued from the dataplane
When we process the dataplane data, keep track of whether or not a route
is in transit or not.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-25 20:16:15 -05:00
Donald Sharp
677c1dd5cb zebra: Use ROUTE_ENTRY_INSTALLED as decision for route is installed
zebra is using NEXTHOP_FLAG_FIB as the basis of whether or not
a route_entry is installed.  This is problematic in that we plan
to separate out nexthop handling from route installation.  So modify
the code to keep track of whether or not a route_entry is installed/failed.

This basically means that every place we set/unset NEXTHOP_FLAG_FIB, we
actually also set/unset ROUTE_ENTRY_INSTALLED on the route_entry.
Additionally where we check for route installed via NEXTHOP_FLAG_FIB
switch over to checking if the route think's it is installed.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-25 20:16:15 -05:00