When we get a route for installation via any method we should
consolidate on 32 bits as the flag size, since we have
actually more than 8 bits of data to bass around.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
During quick ifdown / ifup events from the linux kernel there
exists a situation where a prefix that has both a kernel route
and a static route can queued up on the meta-q. If the static
route happens to point at a connected route for nexthop resolution
and we receive a series of quick up/down events *after* the
static route and kernel route are queued up for rib reprocessing.
Since the static route and kernel route are queued on meta-q 1
and the connected route is also on meta-q 1 there exists a situation
where the connected route will be resolved after the static route
fails to resolve, leaving the static route in a unresolved state.
Add a new queue level and put connected routes on their own level,
since they are the fundamental building blocks of pretty much
all the other routes.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Fix some reference counting issues seen when replacing
a NHG and deleting one.
For replacement, we should end with the same refcnt on the new
one.
For delete, its the caller's job to decrement its ref after
its done with it.
Further, update routes in the rib with the new pointer after replace.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Imagine a situation where a interface is bouncing up/down.
The interface comes up and daemons like pbr will get a nht
tracking callback for a connected interface up and will install
the routes down to zebra. At this same time the interface can
go down. But since zebra is busy handling route changes ( from pbr )
it has not read the netlink message and can get into a situation
where the route resolves properly and then we attempt to install
it into the kernel( which is rejected ). If the interface
bounces back up fast at this point, the down then up netlink
message will be read and create two route entries off the connected
route node. Zebra will then enqueue both route entries for future processing.
After this processing happens the down/up is collapsed into an up
and nexthop tracking sees no changes and does not inform any upper
level protocol( in this case pbr ) that nexthop tracking has changed.
So pbr still believes the nexthops are good but the routes are not
installed since pbr has taken no action.
Fix this by immediately running rnh when we signal a connected
route entry is scheduled for removal. This should cause
upper level protocols to get a rnh notification for the small
amount of time that the connected route was bouncing around like
a madman.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add a route_entry flag to indicate the presence of a fib
(installed) list of nexthops - more explicit and clearer.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
When given a route_table this allows the corresponding kernel table
ID to be determined. The table_id value is set upon table creation
to the table_id of the VRF, unless the table was created with a
specific ID.
Signed-off-by: Duncan Eastoe <duncan.eastoe@att.com>
to make sure that c++ code can include them, avoid using reserved
keywords like 'delete' or 'new'.
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
Use a backup index in a nexthop directly (if it has a backup
nexthop); revise the zebra nhe/nhg code; revise zapi route
decoding to match; revise the dataplane route datastructs.
Refactor some of the rib_add_multipath code to be prepared to
be called with an nhe, carrying nexthop and (possibly) backup
info together.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Embed nexthop-group, which is just a pointer, in the zebra
nexthop-hash-entry object, rather than mallocing one.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Replace the existing list of nexthops (via a nexthop_group
struct) in the route_entry with a direct pointer to zebra's
new shared group (from zebra_nhg.h). This allows more
direct access to that shared group and the info it carries.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Clean up the relationships between zebra's rib and nexthop-group
headers as prep for adding a nexthop-group pointer to the
route_entry.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
When we receive a route delete from the kernel and it
contains a nexthop object id, use that to match against
route gateways with instead of explicit nexthops.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
We will use a nhe context for dataplane interaction with
nextho group hash entries.
New nhe's from the kernel will be put into a group array
if they are a group and queued on the rib metaq to be processed
later.
New nhe's sent to the kernel will be set on the dataplane context
with approprate ID's in the group array if needed.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add a parameter to the rib_add function so that it takes
a nexthop ID from the kernel if one is passed along
with the route.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Switched the route entries to use ID's instead of pointers.
Perform lookups with the ID and then check if its null.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add a nexthop hash entry to the route_entry so that we can
track the nhe with the route entry.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The nexthop_active_num data structure is a property of the
nexthop group. Move the keeping of this data to that.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
In the route_entry we are keeping a non pointer based
nexthop group, switch the code to use a pointer for all
operations here and ensure we create and delete the memory.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
If we need to batch process the rib (all tables or specific
vrf), do so as a scheduled thread event rather than immediately
handling it. Further, add context to the events so that you
narrow down to certain route types you want to reprocess.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
The flag ROUTE_ENTRY_NEXTHOPS_CHANGED is only ever set or unset.
Since this flag is not used for anything useful, remove from system.
By changing this flag we have re-ordered `internalStatus' of json
output of zebra rib routes. Go through and fix up tetsts to
use the new values.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The code as written before this code change point would enqueue
every system route type to be refigured when we have an
interface event. I believe this was to originally handle bugs
in the way nexthop tracking was handled, mainly that if you keep
asking the question you'll eventually get the right answer.
Modify the code to not do this, we have fixed nexthop tracking
to not be so brain dead and to know when it needs to refigure
a route that it is tracking.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The multicast mode enum was a global static in zebra_rib.c
it does not belong there, it belongs in zebra_router, moving.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
<Initial Code from Praveen Chaudhary>
Add the a `--graceful_restart X` flag to zebra start that
now creates a timer that pops in X seconds and will go
through and remove all routes that are older than startup.
If graceful_restart is not specified then we will just pop
a timer that cleans everything up immediately.
Signed-off-by: Praveen Chaudhary <pchaudhary@linkedin.com>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Make the RIB_*_ROUTE() macro which is passed a route in rib.h just use
the R*_ROUTE() macros that directly check the type in rt.h.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
The route entry code was using a custom linked list to handle
route entries. Remove and replace with the new lib link list
code. This reduces the size of the route entry by a further
8 bytes.
Observant people will notice that the current linked list
implementation is singly linked, while the Route Entry
is doubly linked. I am not terribly concerned about this
change as that 1) we do not see a large number of route
entries per prefix( say 2 maybe 3 items ) and route entries
do not come and go that often.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The `struct rib_dest_t` was being used to store the linked
list of rnh's associated with the node. This was taking up
a bunch of memory. Replace with new data structure supplied
by David and see the memory reductions associated with 1 million
routes in the zebra rib:
Old:
Memory statistics for zebra:
System allocator statistics:
Total heap allocated: 675 MiB
Holding block headers: 0 bytes
Used small blocks: 0 bytes
Used ordinary blocks: 567 MiB
Free small blocks: 39 MiB
Free ordinary blocks: 69 MiB
Ordinary blocks: 0
Small blocks: 0
Holding blocks: 0
New:
Memory statistics for zebra:
System allocator statistics:
Total heap allocated: 574 MiB
Holding block headers: 0 bytes
Used small blocks: 0 bytes
Used ordinary blocks: 536 MiB
Free small blocks: 33 MiB
Free ordinary blocks: 4600 KiB
Ordinary blocks: 0
Small blocks: 0
Holding blocks: 0
`struct rnh` was moved to rib.h because of the tangled web
of structure dependancies. This data structure is used
in numerous places so it should be ok for the moment.
Future work might be needed to do a better job of splitting
up data structures and function definitions.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The `struct rib_dest_t` was being used to store the linked
list of rnh's associated with the node. This was taking up
a bunch of memory. Replace with new data structure supplied
by David and see the memory reductions associated with 1 million
routes in the zebra rib:
Old:
Memory statistics for zebra:
System allocator statistics:
Total heap allocated: 675 MiB
Holding block headers: 0 bytes
Used small blocks: 0 bytes
Used ordinary blocks: 567 MiB
Free small blocks: 39 MiB
Free ordinary blocks: 69 MiB
Ordinary blocks: 0
Small blocks: 0
Holding blocks: 0
New:
Memory statistics for zebra:
System allocator statistics:
Total heap allocated: 574 MiB
Holding block headers: 0 bytes
Used small blocks: 0 bytes
Used ordinary blocks: 536 MiB
Free small blocks: 33 MiB
Free ordinary blocks: 4600 KiB
Ordinary blocks: 0
Small blocks: 0
Holding blocks: 0
`struct rnh` was moved to rib.h because of the tangled web
of structure dependancies. This data structure is used
in numerous places so it should be ok for the moment.
Future work might be needed to do a better job of splitting
up data structures and function definitions.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
LSP processing was a zvrf flag based upon a connected route
coming or going. But this did not allow us to know
that we should do lsp processing other than after the meta-queue
processing was finished.
Eventually we moved meta-queue processing of do_nht_processing
to after the dataplane sent the main pthread some results.
This of course left us with a timing hole where if a connected
route came in and we received a data plane response *before*
the meta queue was processed we would not do the work as necessary.
Move the lsp processing to a flag off of the rib_dest_t. If it
is marked then we need to process lsps.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Currently nexthop tracking is performed for all nexthops that
are being tracked after a group of contexts are passed back
from the data plane for post install processing.
This is inefficient and leaves us sending nexthop tracking
changes at an accelerated pace, when we think we've changed
a route. Additionally every route change will cause us
to relook at all nexthops we are tracking irrelevant if
they are possibly related to the route change or not.
Let's modify the code base to track the rnh's off of the rib
table's rn, `rib_dest_t`. So after we process a node, install
it into the data plane, in rib_process_result we can
look at the `rib_dest_t` associated with the rn and see that
a nexthop depended on this route node. If so, refigure it.
Additionally we will store rnh's that are not resolved on the
0.0.0.0/0 nexthop tracking list. As such when a route node
changes we can quickly walk up the rib tree and notice that
it needs to be reprocessed as well.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Modify the status flag from 8 bits to 32 bits and to add
a few new flags that will be used in future commits.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The rib_lookup_ipv4_route function is only used in a debug path.
Is only used for v4 and only checks to make sure that the rib
and fib are in sync( which is not needed/used/supported on other
platforms ). So let's just remove it.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reduce or eliminate use of global zebra_ns structs in
a couple of netlink/kernel code paths, so that those paths
can potentially be made asynch eventually.
Slide netlink_talk_info into place to remove dependency on core
zebra structs; add accessors for dplane context block
Start init of route context from zebra core re and rn structs;
start queueing and event handling for incoming route updates.
Expose netlink apis that don't rely on zebra core structs;
add parallel route-update code path using the dplane ctx;
simplest possible event loop to process queued route'
updates.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Wrapper the get/set of the table->info pointer so that
people are not directly accessing this data.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
* Check for the modified routemap in zebra_route_map_process_update_cb()
* Added zebra_rib_table_rm_update() for RIB routemap processing
* Added zebra_nht_rm_update() for NHT routemap processing
Signed-off-by: kssoman <somanks@vmware.com>