Commit Graph

776 Commits

Author SHA1 Message Date
Mark Stapp
f515871207 zebra: fix SA warning in rib_process()
Fix an SA warning about a possible NULL pointer deref in
rib_process().

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-08-21 09:39:02 -04:00
Donald Sharp
c2c02b76bc zebra: Add table id to debug output
There are a bunch of places where the table id is not being outputed
in debug messages for routing changes.  Add in the table id we
are operating on.  This is especially useful for the case where
pbr is working.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-08-19 13:59:29 -04:00
Jakub Urbańczyk
d68e74b41c lib, zebra: add support for sending ARP requests
We can make the Linux kernel send an ARP/NDP request by adding
a neighbour with the 'NUD_INCOMPLETE' state and the 'NTF_USE' flag.

This commit adds new dataplane operation as well as new zapi message
to allow other daemons send ARP/NDP requests.

Signed-off-by: Jakub Urbańczyk <xthaid@gmail.com>
2020-08-12 23:19:58 +02:00
Jakub Urbańczyk
86d5622362 zebra: remove "PENDING" dplane request state
This request state is redundant with new message batching interface.

Signed-off-by: Jakub Urbańczyk <xthaid@gmail.com>
2020-08-10 21:33:00 +02:00
Sebastien Merle
31f937fb43 lib, zebra: Add SR-TE policy infrastructure to zebra
For the sake of Segment Routing (SR) and Traffic Engineering (TE)
Policies there's a need for additional infrastructure within zebra.
The infrastructure in this PR is supposed to manage such policies
in terms of installing binding SIDs and LSPs. Also it is capable of
managing MPLS labels using the label manager, keeping track of
nexthops (for resolving labels) and notifying interested parties about
changes of a policy/LSP state. Further it enables a route map mechanism
for BGP and SR-TE colors such that learned BGP routes can be mapped
onto SR-TE Policies.

This PR does not introduce any usable features by now, it is just
infrastructure for other upcoming PRs which will introduce 'pathd',
a new SR-TE daemon.

Co-authored-by: Renato Westphal <renato@opensourcerouting.org>
Co-authored-by: GalaxyGorilla <sascha@netdef.org>
Signed-off-by: Sebastien Merle <sebastien@netdef.org>
2020-08-07 11:08:49 +02:00
Renato Westphal
790953a387
Merge pull request #6765 from mjstapp/backup_nhg_netlink
lib,zebra: support multiple backup nexthops
2020-07-27 12:49:36 -03:00
Stephen Worley
55528234ea
Merge pull request #6753 from mjstapp/fix_zebra_backup_sa
zebra: fix SA warnings in backup nexthop code
2020-07-17 17:29:49 -04:00
Mark Stapp
7483dcbe29 zebra: add a route_entry flag for FIB-specific nexthops
Add a route_entry flag to indicate the presence of a fib
(installed) list of nexthops - more explicit and clearer.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-07-17 13:12:33 -04:00
Mark Stapp
474aebd939 lib,sharpd,zebra: initial support for multiple backup nexthops
Initial changes to support a nexthop with multiple backups. Lib
changes to hold a small array in each primary, zapi message
changes to support sending multiple backups, and daemon
changes to show commands to support multiple backups. The config
input for multiple backup indices is not present here.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-07-17 13:12:33 -04:00
Mark Stapp
43a9f66cd1 zebra: fix SA warnings in backup nexthop code
Fix a couple of recent SA warnings that came from backup
nexthop/nhlfe changes.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-07-16 11:00:17 -04:00
David Lamparter
3efd0893d0 *: un-split strings across lines
Remove mid-string line breaks, cf. workflow doc:

  .. [#tool_style_conflicts] For example, lines over 80 characters are allowed
     for text strings to make it possible to search the code for them: please
     see `Linux kernel style (breaking long lines and strings)
     <https://www.kernel.org/doc/html/v4.10/process/coding-style.html#breaking-long-lines-and-strings>`_
     and `Issue #1794 <https://github.com/FRRouting/frr/issues/1794>`_.

Scripted commit, idempotent to running:
```
python3 tools/stringmangle.py --unwrap `git ls-files | egrep '\.[ch]$'`
```

Signed-off-by: David Lamparter <equinox@diac24.net>
2020-07-14 10:37:25 +02:00
Mark Stapp
9959f1daba zebra: improve logic handling backup nexthop installation
When handling a fib notification event that involves a route
with backup nexthops, be clearer about representing the
installed state of the backups: any installed backup will be
on a dedicated route_entry list.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-07-07 13:14:01 -04:00
Mark Stapp
4db01e7914 zebra: add fib nhg for backups, revise api
Add an nhg for the fib-installed backup nexthops; rename an
api to access the fib-installed nexthop nhg.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-07-07 13:14:01 -04:00
Jakub Urbańczyk
2f74a82a11 zebra: prepare data plane for batching
* Add new zebra_dplane_result to allow kernel updates not to return
   a result immediately.

Signed-off-by: Jakub Urbańczyk <xthaid@gmail.com>
2020-06-26 22:03:44 +02:00
Mark Stapp
9db35a5e6f zebra: improve route_entry comparison logic
Improve and centralize some logic used to a) compare two
route_entries, and b) to locate a route_entry that matches
a dplane context object that contains the results of a
fib update. We were not rigorous enough in checking routes'
properties, especially when examining connected routes where
we allow multiple route_entries.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-06-25 08:21:27 -04:00
Jakub Urbańczyk
f62e5480ec zebra: convert ip rule installation to use dplane thread
* Implement new dataplane operations
 * Convert existing code to use dataplane context object
 * Modify function preparing netlink message to use dataplane
   context object

Signed-off-by: Jakub Urbańczyk <xthaid@gmail.com>
2020-06-10 16:18:45 +02:00
Renato Westphal
30712725a0
Merge pull request #6480 from volta-networks/feat_pwstatus
ldpd: Relay data plane pseudowire status in LDP notification
2020-06-01 21:00:51 -03:00
Mark Stapp
daaeaa2150 zebra: init dest's list of routes
Use the dlist init api on the zebra dest object's list
of routes.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-06-01 14:46:32 -04:00
Karen Schoener
fd563cc7f3 ldpd: Relay data plane pseudowire status in LDP notification
Provide a way for the data plane to indicate pseudowire
status (such as: not forwarding, AC failure).

On a data plane pseudowire install failure, data plane
sets the pseudowire status.
Zebra relays the pseudowire status to LDP.
LDP includes the pseudowire status in the LDP notification
to the LDP peer.

Signed-off-by: Karen Schoener <karen@voltanet.io>
2020-06-01 13:21:37 -04:00
Emanuele Di Pascale
cd7108ba92 zebra: avoid using c++ keywords in headers
to make sure that c++ code can include them, avoid using reserved
keywords like 'delete' or 'new'.

Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
2020-05-14 16:42:47 +02:00
Donald Sharp
91e6f25bc0 zebra: remove typedef rib_update_event_t from system
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-05-08 08:10:49 -04:00
Donald Sharp
630d596249 zebra: Remove typedef rib_table_info_t from system
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-05-08 08:10:49 -04:00
Stephen Worley
090152ec9c
Merge pull request #5786 from mjstapp/fix_notif_empty_nhg
zebra: fix handling of failed route install via notification
2020-04-29 12:28:56 -04:00
Mark Stapp
a126f12003 zebra: fix handling of failed route install via notification
An async route notification can indicate that installation
has failed, but the handling code wasn't dealing with that
possibility correctly.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-04-27 10:24:55 -04:00
Quentin Young
772270f3b6 *: sprintf -> snprintf
Replace sprintf with snprintf where straightforward to do so.

- sprintf's into local scope buffers of known size are replaced with the
  equivalent snprintf call
- snprintf's into local scope buffers of known size that use the buffer
  size expression now use sizeof(buffer)
- sprintf(buf + strlen(buf), ...) replaced with snprintf() into temp
  buffer followed by strlcat

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2020-04-20 19:14:33 -04:00
Jakub Urbańczyk
bd47f3a3b4 zebra: Add vrf name and id to debugs
In some places we log the interface but not the vfr the
interface is in. In others we only output the vrf id, which
can be difficult for human to read. This commit makes zebra
debugs more vrf aware.

Signed-off-by: Jakub Urbańczyk <xthaid@gmail.com>
2020-04-12 21:03:29 +02:00
Donatas Abraitis
c4efd0f423 *: Do not cast to the same type
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2020-04-08 17:15:06 +03:00
Stephen Worley
ff82bbbb91
Merge pull request #5901 from mjstapp/backup_nh_prep
zebra, lib: Backup nexthop (path) prep work
2020-03-30 10:26:17 -04:00
David Lamparter
07ef3e34ae lib: prepare for plugin-based frr_format check
Signed-off-by: David Lamparter <equinox@diac24.net>
2020-03-29 10:45:46 +02:00
Mark Stapp
377e29f7e7 zebra: handle backup nexthops in nhe/nhgs
Include backup nexthops in nhe processing; connect incoming
zapi route data with updated rib/nhg apis; add more debugs in
nhg processing.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-03-27 11:50:03 -04:00
Mark Stapp
6d81b590a9 zebra: improve route debugging and add support for backups
Refactor the detailed route debugging so that the dump of nexthops
can be used for both normal/active nexthops and backups (if they
are present).

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-03-27 11:50:03 -04:00
Mark Stapp
1d48702ede zebra: add per-nexthop backup index
Use a backup index in a nexthop directly (if it has a backup
nexthop); revise the zebra nhe/nhg code; revise zapi route
decoding to match; revise the dataplane route datastructs.

Refactor some of the rib_add_multipath code to be prepared to
be called with an nhe, carrying nexthop and (possibly) backup
info together.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-03-27 11:50:03 -04:00
Donatas Abraitis
15569c58f8 *: Replace __PRETTY_FUNCTION__/__FUNCTION__ to __func__
Just keep the code cool.

Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2020-03-05 20:23:23 +02:00
Ruben Kerkhof
05267678eb zebra: fix typo in debug log message
Signed-off-by: Ruben Kerkhof <ruben@rubenkerkhof.com>
2020-03-04 16:08:18 +01:00
Mark Stapp
c415d89528 zebra: Embed lib nexthop-group in zebra hash entry
Embed nexthop-group, which is just a pointer, in the zebra
nexthop-hash-entry object, rather than mallocing one.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-02-27 15:49:31 -05:00
Donald Sharp
c479e75665 zebra: Add vrf name to debug output
The vrf id is insufficient of a discriminator in people's head
Give them what they need.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-02-14 08:41:42 -05:00
Quentin Young
efa618369a
Merge pull request #5794 from mjstapp/remove_nexthop_matched_flag
lib,zebra: remove unused MATCHED nexthop flag
2020-02-12 11:29:22 -05:00
Mark Stapp
0641a955d7 lib,zebra: remove unused MATCHED nexthop flag
Remove an unused flag value from the nexthop struct.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-02-11 15:56:35 -05:00
Russ White
5bf7fe566d
Merge pull request #5722 from donaldsharp/kernel_routes
Kernel routes
2020-02-06 08:04:42 -05:00
Quentin Young
b3ba5dc7fe *: don't null after XFREE; XFREE does this itself
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2020-02-03 11:22:13 -05:00
Donald Sharp
3332f4f0fb zebra: Kernel routes w/ AD were not being marked as installed
When we are receiving a kernel route, with an admin distance
of 255 we are not marking it as installed.  This route
should be marked as installed.

New behavior:
K>* 4.5.7.0/24 [255/8192] via 192.168.209.1, enp0s8, 00:10:14

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-01-23 17:17:01 -05:00
Mark Stapp
9287b4c50f zebra: route changes via notify path trigger nht and mpls
Changes to a route via the dataplane notify path should
trigger nht and mpls lsp processing.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-01-06 10:09:47 -05:00
Stephen Worley
84a89a8d2e zebra: null check re->nhe not re->nhe->nhg on attach
We should be NULL checking the entire re->nhe struct, not
the group inside of it. When we get routes from the kernel
using a nexthop group (and future protocols) they will only
pass us an ID to use. Hence, this struct can (and will be)
NULL on first attach when only passed an ID.

There shouldn't be a situation where we have an re->nhe
and don't have an re->nhe->nhg anyway.

Before this patch you can easily make zebra crash by creating a
route in the kernel using a nexthop group and starting zebra.

`ip next add dev lo id 111`
`ip route add 1.1.1.1/32 nhid 111`

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-12-16 16:37:14 -05:00
Mark Stapp
1f6a5aca26 zebra: handle route notification with no nexthops
Handle the special case where a route update contains
no installed nexthops - that means the route is not
installed.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-12-12 12:55:51 -05:00
Mark Stapp
4c0b5436f9 zebra: accept async notification for un-install
Handle an async notification when a route-update operation
uninstalls one route in favor of another.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-12-11 11:22:53 -05:00
Mark Stapp
1c30d64bb6 zebra: align dplane notify processing with nhg work
The processing of dataplane route notifications was a little
off-target after the nexthop-group re-work. This should allow
notifications to work better.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-12-09 16:19:14 -05:00
Donald Sharp
e302caaa81
Merge pull request #5416 from mjstapp/re_nhe_pointer
lib,zebra: use shared nexthop-group in route_entry
2019-12-04 14:11:04 -05:00
Mark Stapp
0eb97b860d lib,zebra: use nhg_hash_entry pointer in route_entry
Replace the existing list of nexthops (via a nexthop_group
struct) in the route_entry with a direct pointer to zebra's
new shared group (from zebra_nhg.h). This allows more
direct access to that shared group and the info it carries.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-12-04 08:13:52 -05:00
David Lamparter
2b64873d24 *: generously apply const
const const const your boat, merrily down the stream...

Signed-off-by: David Lamparter <equinox@diac24.net>
2019-12-02 15:01:29 +01:00
Mark Stapp
5463ce26c3 zebra: clean up rib and nhg headers
Clean up the relationships between zebra's rib and nexthop-group
headers as prep for adding a nexthop-group pointer to the
route_entry.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-11-21 15:05:52 -05:00
Stephen Worley
ace3bbba4b zebra: Don't clear nexthop fib flag on rib_install
We cannot clear the NEXTHOP_FLAG_FIB nexthop flag
when sending routes to the dataplane anymore since
nexthops are now shared.

We were seeing a situation where if we delete a route
using a nexthop group that is still active with another
route, the fib flag was being unset by this code
path despite them still being valid fib nexthops with the
other route.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-11-12 01:24:39 -05:00
Stephen Worley
c7c0b007a4 zebra: separate zebra_vrf_lookup_table_with_id()
We were creating `other` tables in rib_del(), vty commands, and
dataplane return callback via the zebra_vrf_table_with_table_id()
API.

Seperate the API into only a lookup, never create
and added another with `get` in the name (following the standard
we use in other table APIs).

Then changed the rib_del(), rib_find_rn_from_ctx(), and show route
summary vty command to use the lookup API instead.

This was found via a crash where two different vrfs though they owned
the table. On delete, one free'd all the nodes, and then the other tried
to use them. It required specific timing of a VRF existing, going away,
and coming back again to cause the crash.

=23464== Invalid read of size 8
==23464==    at 0x179EA4: rib_dest_from_rnode (rib.h:433)
==23464==    by 0x17ACB1: zebra_vrf_delete (zebra_vrf.c:253)
==23464==    by 0x48F3D45: vrf_delete (vrf.c:243)
==23464==    by 0x48F4468: vrf_terminate (vrf.c:532)
==23464==    by 0x13D8C5: sigint (main.c:172)
==23464==    by 0x48DD25C: quagga_sigevent_process (sigevent.c:105)
==23464==    by 0x48F0502: thread_fetch (thread.c:1417)
==23464==    by 0x48AC82B: frr_run (libfrr.c:1023)
==23464==    by 0x13DD02: main (main.c:483)
==23464==  Address 0x5152788 is 104 bytes inside a block of size 112 free'd
==23464==    at 0x48369AB: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==23464==    by 0x48B25B8: qfree (memory.c:129)
==23464==    by 0x48EA335: route_node_destroy (table.c:500)
==23464==    by 0x48E967F: route_node_free (table.c:90)
==23464==    by 0x48E9742: route_table_free (table.c:124)
==23464==    by 0x48E9599: route_table_finish (table.c:60)
==23464==    by 0x170CEA: zebra_router_free_table (zebra_router.c:165)
==23464==    by 0x170DB4: zebra_router_release_table (zebra_router.c:188)
==23464==    by 0x17AAD2: zebra_vrf_disable (zebra_vrf.c:222)
==23464==    by 0x48F3F0C: vrf_disable (vrf.c:313)
==23464==    by 0x48F3CCF: vrf_delete (vrf.c:223)
==23464==    by 0x48F4468: vrf_terminate (vrf.c:532)
==23464==  Block was alloc'd at
==23464==    at 0x4837B65: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==23464==    by 0x48B24A2: qcalloc (memory.c:110)
==23464==    by 0x48EA2FE: route_node_create (table.c:488)
==23464==    by 0x48E95C7: route_node_new (table.c:66)
==23464==    by 0x48E95E5: route_node_set (table.c:75)
==23464==    by 0x48E9EA9: route_node_get (table.c:326)
==23464==    by 0x48E1EDB: srcdest_rnode_get (srcdest_table.c:244)
==23464==    by 0x16EA4B: rib_add_multipath (zebra_rib.c:2730)
==23464==    by 0x1A5310: zread_route_add (zapi_msg.c:1592)
==23464==    by 0x1A7B8E: zserv_handle_commands (zapi_msg.c:2579)
==23464==    by 0x19D689: zserv_process_messages (zserv.c:523)
==23464==    by 0x48F09F8: thread_call (thread.c:1599)

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-11-01 16:06:19 -04:00
Stephen Worley
d3a3513811 lib,pbrd,zebra: Use one api to delete nexthops/group
Reduce the api for deleting nexthops and the containing
group to just one call rather than having a special case
and handling it separately.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:43 -04:00
Stephen Worley
fec211ad95 zebra: Zebra nexthop group re-work checkpatch fixes
Checkpatch fixes for the zebra nexthop group re-work.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:43 -04:00
Stephen Worley
986a6617cc zebra: Optimize the fib/notified nexthop matching
Optimize the fib and notified nexthop group comparison algorithm
to assume ordering. There were some pretty serious performance hits with
this on high ecmp routes.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:42 -04:00
Stephen Worley
9ef49038d5 lib,zebra: Move nexthop dup marking into creation
We were waiting until install time to mark nexthops as duplicate.
Since they are immutable now and re-used, move this marking into
when they are actually created to save a bunch of cycles.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:42 -04:00
Stephen Worley
bc541126e4 zebra: Use nexthop object id on route delete
When we receive a route delete from the kernel and it
contains a nexthop object id, use that to match against
route gateways with instead of explicit nexthops.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:42 -04:00
Stephen Worley
38e40db1c9 zebra: Sweep our nexthop objects out on restart
On restart, if we failed to remove any nexthop objects due
to a kill -9 or such event, sweep them if we aren't using them.
Add a proto field to handle this and remove the is_kernel bool.

Add a dupicate flag that indicates this nexthop group is only
present in our ID hashtable. It is a dupicate nexthop we received
from the kernel, therefore we cannot hash on it.

Make the idcounter globally accessible so that kernel updates
increment it as soon as we receive them, not when we handle them.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:41 -04:00
Stephen Worley
4b87c90d58 zebra: TODO for handling upper level nhe_id passing
We need to handle refcnt differently if we ever start making
upper level protocols aware of nhg_hash_entry IDs.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:41 -04:00
Stephen Worley
f429bd1b24 zebra: Move resolve/add depend install into api
Move the resolving and installing of a single nhg_hash_entry
into the install function itself, rather than letting zebra_rib
handle it.

Further, ensure depends are installed/queued before installing
a group. The ordering should be find here since only one thread
will call this API.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:41 -04:00
Stephen Worley
8dfbc65724 zebra: Install the nhe along with the route
Move the installation of an nhe out of nexthop_active_update()
and into the rib install path. So, only install the nhe when
a route using it is being installed.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:41 -04:00
Stephen Worley
7f99772169 zebra: Use nexthop/interface vrf, not the routes
When hashing/creating the NHE, use the nexthops vrf as its
source of data. This is gotten directly from an interface
and should not come from a route.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:40 -04:00
Stephen Worley
6df591527f zebra: Remove route only if NHE is installed check
Only remove a route if the nexthop it is using is still installed.
If a nexthop object is removed from the kernel, all routes referencing
it will be removed from the kernel.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:40 -04:00
Stephen Worley
144a1b34df zebra: Put NHE ref updating into a function
When the referenced NHE changes for a route_entry, use this function
to handle it.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:40 -04:00
Stephen Worley
98cda54a95 zebra: Add recursive functionality to NHE's
Add the ability to recursively resolve nexthop group hash entries
and resolve them when sending to the kernel.

When copying over nexthops into an NHE, copy resolved info as well.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:40 -04:00
Stephen Worley
e22e80010e zebra: Use a nhe context dataplane and rib metaq
We will use a nhe context for dataplane interaction with
nextho group hash entries.

New nhe's from the kernel will be put into a group array
if they are a group and queued on the rib metaq to be processed
later.

New nhe's sent to the kernel will be set on the dataplane context
with approprate ID's in the group array if needed.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:40 -04:00
Stephen Worley
13dfea50e4 zebra: Check for nexthop group before free on fail
Check to make sure the route entry has a nexthop
group before we try to free after a table lookup
failure.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:39 -04:00
Stephen Worley
bbb3940ed1 zebra: VRF ID should be null if a nexthop group
A nexthop group should not have a VRF ID. Only individual
nexthops need to be using a VRF. Fixed this both kernel and
proto side.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:39 -04:00
Stephen Worley
fe593b781d zebra: Re-organize/expose nhg_connected
Re-organize and expose the nhg_connected functions so that
it can be used outside zebra_nhg.c. And then abstract those
into zebra_nhg_depends_* and zebra_nhg_depenents_* functons.

Switch the ifp struct to use an RB tree for its dependents,
making use of the nhg_connected functions.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:39 -04:00
Stephen Worley
a15d4c0089 zebra: Abstract the RB nodes/add dependents tree
Create a nhg_depenents tree that will function as a way
to get back pointers for NHE's depending on it.

Abstract the RB nodes into nhg_connected for both depends and
dependents. This same struct is used for both.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:39 -04:00
Stephen Worley
0c8215cbab zebra,lib: Refactor depends to RB tree
Refactor the depends to use an RB tree instead of a list.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:39 -04:00
Stephen Worley
ab942eb285 zebra: Protocol side nhg_hash_entry afi fix
Default the afi of the nexthop to the route entry using it.
If it turns out to be a group, update the afi to AFI_UNSPEC.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:39 -04:00
Stephen Worley
86946a2207 zebra: Update rib_add_multipath with increment nhe
Update rib_add_multipath to use the reference count
increment function for nexthop group hash entries.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:39 -04:00
Stephen Worley
20822f9d2e zebra: Add equivalence function for nhg_depends
Add a helper function to allow us to check if two
nhg_hash_entry's dependency lists are equal.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:39 -04:00
Stephen Worley
2d6cd1f007 zebra: Always copy nhg and depends on nhe alloc
Changed our alloc function to just copy the nhg and
nhg_depends. This makes the zebra_nhg_find code a
little bit cleaner, hopefully preventing bugs.

The only issue with this is that it makes us have to loop
over the nexthops in a group an extra time for the copies.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:38 -04:00
Stephen Worley
8e401b251f zebra: Refactor nexthop group creation code to use allocated memory
Simplify the code for nexthop hash entry creation. I made nexthop
hash entry creation expect the nexthop group and depends to always
be allocated before lookup. Before, it was only allocated if it had
dependencies. I think it makes the code a bit more readable to go
ahead an allocate even for single nexthops as well.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:38 -04:00
Stephen Worley
85f5e76175 zebra: Read in nexthop dependencies from the kernel
Add functionality to read in a group from the kernel,
create a hash entry for it, and add its nexthops to
its dependency list.

Further, we create its nhg struct separtely from this,
copying over any nexthops it should reference directly
into it.

Thus, we have two types for representation of the nexthop group:
	nhe->nhg_depends->[nhe, nhe, nhe]

	nhe->nhg->nexthop->nexthop->nexthop

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:38 -04:00
Stephen Worley
77b76fc900 Revert "zebra: Remove afi field in nexthop hash entry"
This reverts commit be73fe9393aac58c7f4bdb5c8a98c24c6cda6d5d.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:38 -04:00
Stephen Worley
2614bf8764 zebra: Add ifp to zebra-side rib_add
Add an interface pointer for an nexthop group hash entry
when we are getting a rib_add for a new route.

Also, add the interface index to the `show nexthop-group` command.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:37 -04:00
Stephen Worley
bbb322f292 zebra: Make route entry nexthop groups point to our hash entry
Make our route entry struct's re->ng nexthop group pointer
just point to the nhe->nhg nexthop hash entry nexthop group.
This will allow updates to the nexthop itself to propogate
to our routes immediately.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:37 -04:00
Stephen Worley
8032b71737 zebra: Update rib_add to take a nexthop ID
Add a parameter to the rib_add function so that it takes
a nexthop ID from the kernel if one is passed along
with the route.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:37 -04:00
Stephen Worley
9865e1c393 zebra: Add calls to the nexthop context process result function
Added in case statements to handle finished dataplane contexts
and then handle them with the nexthop process result function.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:37 -04:00
Stephen Worley
8e76637969 zebra: Route entries use nexthop entry ID's instead of pointers
Switched the route entries to use ID's instead of pointers.
Perform lookups with the ID and then check if its null.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:37 -04:00
Stephen Worley
d9f5b2f50f zebra: Add functionality to parse RTM_NEWNEXTHOP and RTM_DELNEXTHOP messages
Add the functionality to parse new nexthop group messages
from the kernel and insert them into the appropriate hash
tables. Parsing is done at startup between interface and
interface address lookup. Add functionality to parse
changes to nexthops we already have. Add functionality
to parse delete nexthop messages from the kernel and
remove them from our table.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:36 -04:00
Stephen Worley
8b5bdc8bdf zebra: Remove afi field in nexthop hash entry
I do not believe we should be hashing based on AFI
in for our upper level nexthop group entries. These
should be ambiguous with regards to  address families since
an ipv4 or ipv6 address can have the same interface
nexthop. This can be seen in NEXTHOP_TYPE_IFINDEX.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:36 -04:00
Donald Sharp
9a0d4dd39b zebra: Remove nexthop_active_num from route entry
The nexthop_active_num data structure is a property of the
nexthop group.  Move the keeping of this data to that.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-10-25 11:13:36 -04:00
Donald Sharp
eecacedc3b zebra: Remove re->nexthop_num from re
The nexthop_num is not a function of the re.  It is owned
by the nexthop group.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-10-25 11:13:36 -04:00
Donald Sharp
6b46851168 zebra: Replace nexthop_group with pointer in route entry
In the route_entry we are keeping a non pointer based
nexthop group, switch the code to use a pointer for all
operations here and ensure we create and delete the memory.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-10-25 11:13:36 -04:00
Donald Sharp
22bcedb231 zebra: Add code to create/remove nexthop groups
Add some code to create/remove nexthop groups.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:35 -04:00
Stephen Worley
2a99ab95e6 zebra: Handle rib updates as a thread event
If we need to batch process the rib (all tables or specific
vrf), do so as a scheduled thread event rather than immediately
handling it. Further, add context to the events so that you
narrow down to certain route types you want to reprocess.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-18 16:59:25 -04:00
Donald Sharp
a8100f512a
Merge pull request #5079 from mjstapp/fix_dplane_drop_at_shut
zebra: during shutdown processing, drop dplane results
2019-10-03 11:49:07 -04:00
Mark Stapp
2fc69f03d2 zebra: during shutdown processing, drop dplane results
Don't process dataplane results in zebra during shutdown (after
sigint has been seen). The dplane continues to run in order to
clean up, but zebra main just drops results.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-09-27 12:15:34 -04:00
dturlupov
873fde8cd9 zebra: if_is_loopback_or_vrf crash if if_lookup_by_index return NULL
Function if_lookup_by_index() can return NULL, but in if_is_loopback_or_vrf() we don't chech NULL and get next:

Sep 2 07:44:34 XXX zebra[4616]: /usr/lib64/libfrr.so.0(zlog_backtrace_sigsafe+0x48) [0x7fb5f704cf18]
Sep 2 07:44:34 XXX zebra[4616]: /usr/lib64/libfrr.so.0(zlog_signal+0x378) [0x7fb5f704d728]
Sep 2 07:44:34 XXX zebra[4616]: /usr/lib64/libfrr.so.0(+0x6b495) [0x7fb5f706b495]
Sep 2 07:44:34 XXX zebra[4616]: /lib64/libpthread.so.0(+0x123b0) [0x7fb5f6d573b0]
Sep 2 07:44:34 XXX zebra[4616]: /usr/lib64/libfrr.so.0(if_is_loopback+0) [0x7fb5f7045160]
Sep 2 07:44:34 XXX zebra[4616]: /usr/lib64/libfrr.so.0(if_is_loopback_or_vrf+0x11) [0x7fb5f7045191]
Sep 2 07:44:34 XXX zebra[4616]: /usr/sbin/zebra() [0x43b26d]
Sep 2 07:44:34 XXX zebra[4616]: /usr/sbin/zebra() [0x43db6f]
Sep 2 07:44:34 XXX zebra[4616]: /usr/lib64/libfrr.so.0(work_queue_run+0xc8) [0x7fb5f7080de8]
Sep 2 07:44:34 XXX zebra[4616]: /usr/lib64/libfrr.so.0(thread_call+0x47) [0x7fb5f7077d27]
Sep 2 07:44:34 XXX zebra[4616]: /usr/lib64/libfrr.so.0(frr_run+0xd8) [0x7fb5f704b448]

Signed-off-by: Dmitrii Turlupov dturlupov@factor-ts.ru
2019-09-26 16:11:50 +03:00
Donald Sharp
17a8a5beec
Merge pull request #4972 from mjstapp/fix_notif_installed
zebra: route updates from dataplane need to check all nexthops
2019-09-24 14:31:58 -04:00
Donald Sharp
16296beaa5
Merge pull request #4731 from mjstapp/fix_redist_update
zebra: redistribute deletes when updating selected route
2019-09-18 19:43:43 -04:00
Mark Stapp
11260e7011 zebra: check all dplane nexthops when processing
When processing route updates from the dataplane, we were
terminating the checking of nexthops prematurely, and we could
miss meaningful changes.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-09-17 10:50:36 -04:00
Mark Stapp
cbb9ddad16 zebra: remove empty, unused internal api
Remove a leftover, empty nht api call from zebra.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-09-16 12:00:14 -04:00
Mark Stapp
40f321c000 zebra: revise redistribution delete to improve update case
When selecting a new best route, zebra sends a redist update
when the route is installed. There are cases where redist
clients may not see that redist add - clients who are not
subscribed to the new route type, e.g. In that case, attempt
to send a redist delete for the old/previous route type.

Revised the redist delete api to accomodate both cases;
also tightened up the const-ness of a few internal redist apis.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-09-12 08:51:05 -04:00
Mark Stapp
0bbd4ff442 zebra: move EVPN VTEP programming to dataplane
Move VTEP install/uninstall to the zebra dataplane. Remove
synch kernel-facing apis and helper functions.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-09-04 10:30:17 -04:00
Donald Sharp
c9042b2890
Merge pull request #4877 from mjstapp/dplane_neighs
zebra: move evpn neighbors to dataplane
2019-09-04 10:23:31 -04:00
David Lamparter
00dffa8cde lib: add frr_with_mutex() block-wrapper
frr_with_mutex(...) { ... } locks and automatically unlocks the listed
mutex(es) when the block is exited.  This adds a bit of safety against
forgetting the unlock in error paths & co. and makes the code a slight
bit more readable.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2019-09-03 17:15:17 +02:00
Mark Stapp
931fa60c09 zebra: Use dataplane for evpn neighbor changes
Move neighbor programming to the dataplane; remove
old apis; remove some ifdef'd use of direct netlink
code points, using neutral values outside of the netlink-
specific files.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-08-23 14:10:41 -04:00
Donald Sharp
ad4c7e8d4e
Merge pull request #4778 from mjstapp/dplane_macs
zebra: use dataplane for evpn macs
2019-08-21 20:26:29 -04:00
Mark Stapp
272e89030e zebra: clear route QUEUED flag in async notification handler
Ensure that the route-entry QUEUED flag is cleared in the async
notification path, as it is in the normal results processing
code path.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-08-05 10:52:31 -04:00
Mark Stapp
036d93c047 zebra: use dataplane for vxlan remote mac programming
Move vxlan remote MAC install and uninstall to the async
dataplane.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-08-02 14:54:16 -04:00
Donald Sharp
228a811a2e zebra: Redistribution should be told about the old route
When we are sending a redistribute_update, pass the old_re in
so that if we still have it around we can update the calling protocol.

Test:

router ospf
  redistribute sharp
!

sharp install route 4.5.6.7 nexthop 192.168.201.1 1

Now add a `ip route 4.5.6.7/32 192.168.201.1`.
This causes zebra to replace the sharp route with the static route.
No update is sent to ospf and debug:
2019/08/01 19:02:38.271998 ZEBRA: 0:4.5.6.7/32: Redist update re 0x12fdbda0 (static), old 0x0 (None)

With fix:

2019/08/01 19:15:09.644499 ZEBRA: 0:4.5.6.7/32: Redist update re 0x1ba5bce0 (static), old 0x1beea4e0 (sharp)
2019/08/01 19:15:09.645462 OSPF: ospf_zebra_read_route: from client sharp: vrf_id 0, p 4.5.6.7/32

Ticket: CM-25847
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-08-01 19:09:59 -04:00
Russ White
3d07ec896e
Merge pull request #4746 from donaldsharp/zebra_rib_improvements
Zebra rib improvements
2019-07-30 11:11:41 -04:00
Donald Sharp
42fc558ee3 zebra, tests: Remove ROUTE_ENTRY_NEXTHOPS_CHANGED
The flag ROUTE_ENTRY_NEXTHOPS_CHANGED is only ever set or unset.
Since this flag is not used for anything useful, remove from system.

By changing this flag we have re-ordered `internalStatus' of json
output of zebra rib routes.  Go through and fix up tetsts to
use the new values.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-07-29 14:53:58 -04:00
Donald Sharp
b5046a3c50 zebra: Remove repeated enqueueing of system routes for rethinking
The code as written before this code change point would enqueue
every system route type to be refigured when we have an
interface event.  I believe this was to originally handle bugs
in the way nexthop tracking was handled, mainly that if you keep
asking the question you'll eventually get the right answer.

Modify the code to not do this, we have fixed nexthop tracking
to not be so brain dead and to know when it needs to refigure
a route that it is tracking.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-07-29 11:39:06 -04:00
Don Slice
6d0ee6a0d4 zebra: skip queued entries when resolving nexthop
Problem reported where certain routes were not being passed on to
clients if they were operated on while still queued for kernel
installation.   Changed it to defer working on entries that were
queued to dplane so we could operate on them after getting an
answer back from kernel installatino.

Ticket: CM-25480
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
2019-07-26 17:26:10 +00:00
Mark Stapp
7c7ef4a8c8 zebra: consolidate dataplane interface name and ifindex attrs
Move interface name and index to shared data struct, and remove
operation-specific values.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-07-25 14:01:22 -04:00
Donald Sharp
cf05bc9424 zebra: Modify zebra to order nexthops received
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-07-09 10:44:41 -04:00
David Lamparter
a5ddc34dc7
lib: Add a couple functions to nexthop_group.c (#4323)
lib: Add a couple functions to nexthop_group.c

Co-authored-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-07-03 14:42:35 +02:00
Sri Mohana Singamsetty
1c05eb4419
Merge pull request #4622 from donaldsharp/import_table_fix
Import table fix
2019-07-02 11:20:02 -07:00
Quentin Young
2951a7a4c2 *: s/TRUE/true/, s/FALSE/false/
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2019-07-01 17:26:05 +00:00
Stephen Worley
50d8965075 lib: Private api for nexthop_group manipulation
Add a file that exposes functions which modify nexthop groups.
Nexthop groups are techincally immutable but there are a
few special cases where we need direct access to add/remove
nexthops after the group has been made. This file provides a
way to expose those functions in a way that makes it clear
this is a private/hidden api.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-06-25 22:58:48 -04:00
Donald Sharp
fe257ae733 zebra: Push VRF_DEFAULT outside of import table code
The import table code assumes that they will only work
in the default vrf.  This is ok, but we should push the
vrf_id and zvrf to be passed in instead of just using
VRF_DEFAULT.

This will allow us to fix a couple of things:

1) A bug in import where we are not creating the
route entry with the appropriate table so the imported
entry is showing up in the wrong spot.

2) In the future allow `ip import-table X` to become
vrf aware very easily.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-06-25 17:47:41 -04:00
Donald Sharp
82d7d0e28a zebra: Improve debugging when we can't find a route to delete
Improve debugging when we cannot find a route to delete
that we have been told to delete.

New output:

2019/06/25 17:43:49 ZEBRA: default[0]:4.5.6.7/32 doesn't exist in rib
2019/06/25 17:43:49 ZEBRA: default[0]:4.5.6.8/32 doesn't exist in rib

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-06-25 17:47:41 -04:00
Donald Sharp
53c16fbec0 zebra: Put route in debug dump of rib data
When dumping rib data about a route for `debug rib detail`
modify the dump command to display the prefix as part
of every line so that we can use a grep on the log
file.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-06-20 04:55:47 -04:00
Don Slice
739c9c90e7 zebra: resolve issue with rnh not evaluating nexhops correctly
Problem discovered in testing that occasionally when an interface
address was flushed, the corresponding route would be removed from
the kernel and zebra but remain in the bgp table and be advertised
to peers.  Discovered that when zebra_rib_evaluate_nexthops spun
thru the tree list of rns, if the timing and circumstances were
right, it would move elements and miss evaluating some.  Changed
from frr_each to frr_each_safe and the problem is now gone.

Ticket: CM-25301
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
2019-06-19 07:06:32 -07:00
Donald Sharp
0a7be32866 zebra: Display a bit better debugging for rnh tracking
Add a expected count for the route node we will be processing
as part of nexthop resolution and modify the type to display
a useful string of what the type is instead of a number.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-06-18 15:47:10 -04:00
David Lamparter
731bea2844
Merge pull request #4417 from sworleys/Move-Multicast-Mode
zebra: Move multicast mode to being a property of the router
2019-06-03 15:59:48 +02:00
Sri Mohana Singamsetty
979dd989c4
Merge pull request #4413 from donaldsharp/bgp_distance_comes_closer
Bgp distance comes closer
2019-05-30 09:45:43 -07:00
Donald Sharp
526052fb6d zebra: Move multicast mode to being a property of the router
The multicast mode enum was a global static in zebra_rib.c
it does not belong there, it belongs in zebra_router, moving.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-29 15:25:33 -04:00
Quentin Young
eab4a5c2d0 zebra: strcat -> strlcat
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2019-05-29 18:03:26 +00:00
Mark Stapp
827debeac2
Merge pull request #4326 from sworleys/Move-NH-Active-Functions
zebra: Move nexthop_active_XXX functions to zebra_nhg.c
2019-05-29 11:35:27 -04:00
Donald Sharp
94c08afe02
Merge pull request #4228 from mjstapp/dplane_notif
zebra: async notifications from the dataplane
2019-05-29 10:10:05 -04:00
Donald Sharp
eea2ef0754 zebra: BGP always sends distance no need to double check
BGP always sends down the correct distance to use.  We do
not need rib_add_multipath to double check the code.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-29 08:57:11 -04:00
Stephen Worley
ad28e79ac9 zebra: Move nexthop_active_XXX functions to zebra_nhg.c
Since these functions are not really rib processing problems
let's move them to zebra_nhg.c which is meant for processing of
nexthop groups.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-28 17:41:38 -04:00
Mark Stapp
188a00e014 zebra: generate updates from notifications
If an async notification changes a route that's current,
generate an update to keep the kernel in sync.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-05-28 13:41:37 -04:00
Renato Westphal
f6fd430e44
Merge pull request #4322 from sworleys/Nexthop-Cmp
lib: Add nexthop_cmp
2019-05-28 11:32:44 -03:00
Mark Stapp
104e3ad95e zebra: mpls lsp async notifications
Add LSP notification event type; add a handler for LSP notifs;
dispatch to that handler.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-05-28 08:35:01 -04:00
Mark Stapp
efe6c026a9 zebra: support route changes via dplane notifications
Allow route notifications to trigger route state changes,
such as installed -> not installed.

Clean up the fib-specific nexthop-group in a couple of
un-install paths.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-05-28 08:34:31 -04:00
Mark Stapp
941e261c97 zebra: share rib processing of updates and notifications
Use some common handling for both route update results
processing and dataplane notification processing. Use the
fib-specific nexthop-group if the update to a route results
in different nexthop status than the default rib-provided
nexthop-group.

Use the fib-specific nexthop-group, if present, to provide
the output of 'show ip fib'.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-05-28 08:34:21 -04:00
Mark Stapp
ee5e8a4820 zebra: add a fib-specific nexthop-group
Add a fib-specific nhg, distinct from the nhg developed from
the route-owner / RIB information.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-05-28 08:27:42 -04:00
Mark Stapp
54818e3b01 zebra: begin dataplane notifications
Add dataplane route notification type; add handler for zebra
routes.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-05-28 08:22:27 -04:00
Mark Stapp
5695d9ac5d zebra: set nexthop install state more accurately
When setting route nexthops' installation state based on a
dataplane context struct, unset the installed state if a
nexthop was not present in the dataplane context.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-05-28 08:22:21 -04:00
Mark Stapp
fad4d69cd4 zebra: add api to locate route-node from dplane ctx
Create a helper api that locates a zebra route-node from info
in a dplane context struct. Moved code from the results handler
to make a more-general api that could be used in other paths.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-05-28 08:21:20 -04:00
Mark Stapp
78bf56b0b6 zebra: add api to update route from dplane ctx
Add an api to update the status of a route based on info
from a dplane context object. Use the api when processing
route update results from the dataplane.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-05-28 08:21:09 -04:00
Donald Sharp
d4644d4196 zebra: Add kernel level graceful restart
<Initial Code from Praveen Chaudhary>

Add the a `--graceful_restart X` flag to zebra start that
now creates a timer that pops in X seconds and will go
through and remove all routes that are older than startup.

If graceful_restart is not specified then we will just pop
a timer that cleans everything up immediately.

Signed-off-by: Praveen Chaudhary <pchaudhary@linkedin.com>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-23 19:35:42 -04:00
Stephen Worley
a5a2d802d7 lib,zebra,bgpd,pbrd: Compare nexthops without labels
Allow label ignoring when comparing nexthops. Specifically,
add another functon nexthop_same_no_labels() that shares
a path with nexthop_same() but doesn't check labels.

rib_delete() needs to ignore labels in this case.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-05-23 12:21:15 -04:00
Stephen Worley
78fba41bd8 lib,zebra,bgpd: Remove nexthop_same_no_recurse()
The functions nexthop_same() does not check the resolved
nexthops so I don't think this function is even needed
anymore.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-05-23 12:21:15 -04:00
Renato Westphal
81fddbe7ae *: rename new ForEach macros from the typesafe API
This is necessary to avoid a name collision with std::for_each
from C++.

Fixes the compilation of the gRPC northbound module.

Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
2019-05-21 15:59:08 -03:00
Quentin Young
303b93cdee zebra: update zebra_rib for vrrp
VRRP doesn't install any routes, but should still have an array entry.
Also add a help string for VRRP to route_types.txt

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2019-05-17 00:27:08 +00:00
Russ White
1b072ce466
Merge pull request #4269 from donaldsharp/other_tables
zebra Other tables
2019-05-16 10:11:56 -04:00
Russ White
cc25952f2a
Merge pull request #4327 from sworleys/Move-Multipath-Num
zebra: Move multipath_num into zrouter
2019-05-16 10:04:14 -04:00
Donald Sharp
b3f2b59020 zebra: Move multipath_num into zrouter
The multipath_num variable is a property of zebra_router,
so move it there.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-14 14:15:18 -07:00
Mark Stapp
98124e2d6a
Merge pull request #4321 from sworleys/Ribsystem-Ribkernel
zebra: Make RIB_SYSTEM|KERNEL_ROUTE a property of rib.h
2019-05-14 09:29:08 -04:00
Donald Sharp
84340a15b4 zebra: Make RIB_SYSTEM|KERNEL_ROUTE a property of rib.h
These defines should be available from rib.h

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-13 13:11:49 -04:00
Donald Sharp
98572489ea zebra: Switch to using monotime(NULL) for re->uptime
The re->uptime usage of time(NULL) leaves it open to
timing changes from outside influence.  Switching
to monotime allows us to ensure that we have a timestamp
that is always increasing.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-11 01:44:42 -04:00
Renato Westphal
773fc72b1f
Merge pull request #4242 from donaldsharp/zebra_diet
Zebra diet
2019-05-10 08:29:59 -03:00
Donald Sharp
d8612e6545 zebra: Track tables allocated by vrf and cleanup
For each table created by a vrf, keep track of it and
allow for proper cleanup on shutdown of that particular
table.  Cleanup client shutdown to only cleanup data
that the particular vrf owns.  Before we were cleaning
the same table 2 times.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-09 07:11:22 -04:00
Renato Westphal
95f092540e
Merge pull request #4256 from donaldsharp/zebra_table
doc, zebra: Remove "table X" command
2019-05-06 19:08:17 -03:00
Donald Sharp
c447ad08b2 doc, zebra: Remove "table X" command
This command is broken and has been broken since the introduction
of vrf's.  Since no-one has complained it is safe to assume that
there is no call for this specialized linux command.  Remove
from the system with extreme prejudice.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-06 13:42:23 -04:00
Donald Sharp
8dc7a75918 zebra: Add some extra safety for route_info
The route_info[X].meta_q_map *must* be less than MQ_SIZE
or we will do some strange stuff, so assert on it at startup.

The distance in route_info is a uint8_t so let's keep the data
structure the same.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-03 05:05:19 -04:00
Donald Sharp
4bb55bbecc zebra: ifp must be a real pointer sometimes
The ifp pointer must be pointing at a real location
in memory since right above us in this loop we
return if it is.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-03 05:05:19 -04:00
Donald Sharp
045207e27c zebra: Use built in list handler for route entries now
The route entry code was using a custom linked list to handle
route entries.  Remove and replace with the new lib link list
code.  This reduces the size of the route entry by a further
8 bytes.

Observant people will notice that the current linked list
implementation is singly linked, while the Route Entry
is doubly linked.  I am not terribly concerned about this
change as that 1) we do not see a large number of route
entries per prefix( say 2 maybe 3 items ) and route entries
do not come and go that often.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-02 17:41:35 -04:00
Donald Sharp
aa57abfbb5 zebra: Remove linked list and replace with new LIST
The `struct rib_dest_t` was being used to store the linked
list of rnh's associated with the node.  This was taking up
a bunch of memory.  Replace with new data structure supplied
by David and see the memory reductions associated with 1 million
routes in the zebra rib:

Old:
Memory statistics for zebra:
System allocator statistics:
  Total heap allocated:  675 MiB
  Holding block headers: 0 bytes
  Used small blocks:     0 bytes
  Used ordinary blocks:  567 MiB
  Free small blocks:     39 MiB
  Free ordinary blocks:  69 MiB
  Ordinary blocks:       0
  Small blocks:          0
  Holding blocks:        0

New:
Memory statistics for zebra:
System allocator statistics:
  Total heap allocated:  574 MiB
  Holding block headers: 0 bytes
  Used small blocks:     0 bytes
  Used ordinary blocks:  536 MiB
  Free small blocks:     33 MiB
  Free ordinary blocks:  4600 KiB
  Ordinary blocks:       0
  Small blocks:          0
  Holding blocks:        0

`struct rnh` was moved to rib.h because of the tangled web
of structure dependancies.  This data structure is used
in numerous places so it should be ok for the moment.
Future work might be needed to do a better job of splitting
up data structures and function definitions.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-02 16:21:38 -04:00
Lou Berger
e8b9ad5cdd
Revert "Zebra diet" 2019-05-02 06:54:59 -04:00
Donald Sharp
0a45d97472 zebra: Remove linked list and replace with new LIST
The `struct rib_dest_t` was being used to store the linked
list of rnh's associated with the node.  This was taking up
a bunch of memory.  Replace with new data structure supplied
by David and see the memory reductions associated with 1 million
routes in the zebra rib:

Old:
Memory statistics for zebra:
System allocator statistics:
  Total heap allocated:  675 MiB
  Holding block headers: 0 bytes
  Used small blocks:     0 bytes
  Used ordinary blocks:  567 MiB
  Free small blocks:     39 MiB
  Free ordinary blocks:  69 MiB
  Ordinary blocks:       0
  Small blocks:          0
  Holding blocks:        0

New:
Memory statistics for zebra:
System allocator statistics:
  Total heap allocated:  574 MiB
  Holding block headers: 0 bytes
  Used small blocks:     0 bytes
  Used ordinary blocks:  536 MiB
  Free small blocks:     33 MiB
  Free ordinary blocks:  4600 KiB
  Ordinary blocks:       0
  Small blocks:          0
  Holding blocks:        0

`struct rnh` was moved to rib.h because of the tangled web
of structure dependancies.  This data structure is used
in numerous places so it should be ok for the moment.
Future work might be needed to do a better job of splitting
up data structures and function definitions.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-01 20:28:57 -04:00
Stephen Worley
eaa2716dfb zebra: Check on startup route_info has all types
Add a function to check if the route_info array
has all types specified with data in it. Specifically,
test the 'key' attribute for non-zero data. Ignore
ZEBRA_ROUTE_SYSTEM as it should be zero key anyway.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-05-01 15:32:18 -04:00
Mark Stapp
88351c8f6d
Merge pull request #4226 from sworleys/PBR-BFD-OF-route_info
zebra: Add PBR, BFD, OpenFabric to route_info
2019-05-01 11:22:54 -04:00
Lou Berger
31e944a8a7
Merge pull request #3045 from opensourcerouting/atoms
READY: lists/skiplists/rb-trees new API & sequence lock & atomic lists
2019-04-30 10:26:35 -04:00
Stephen Worley
d6abd8b070 zebra: Comment to ensure types added to route_info
Add a comment to indicate that route types added to
Zebra, should also be present in the route_info array.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-04-30 10:07:45 -04:00
Stephen Worley
eab7b6e371 zebra: Add OpenFabric to route_info array
Add OpenFabric to the route_info array for handling processing
of the OpenFabric route type.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-04-29 19:28:15 -04:00
Stephen Worley
42d96b73cb zebra: Add BFD to route_info array
Add BFD to the route_info array for handling processing
of the BFD route type.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-04-29 19:26:11 -04:00
Stephen Worley
9815665214 zebra: Add PBR to route_info array
Add PBR to the route_info array for handling processing
of the PBR route type.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-04-29 19:24:26 -04:00
Don Slice
ade4a8868e zebra: resolve issue with protocol route-map not applied properly
Problem reported that route-maps applied to "ip protocol table bgp"
would not be invoked if the ip protocol table command was issued
after the bgp prefixes were installed.  Found that a recent change
improving how often nexthop_active_update runs missed causing this
filtering to be applied. This fix resolves that issue as well as
a couple of other places that were problematic with the recent
change.

Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
2019-04-26 17:15:44 +00:00
Donald Sharp
df38b099ee zebra: Update flag output for route entry dump
Update the nexthop flag output for the route entry dump to
include all possible flag states be output.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-04-18 14:57:54 -04:00
Donald Sharp
6883bf8d35 zebra: Run nexthop_active_check once
We currently run nexthop_active_check multiple times.  Make the
code run once and figure out state from that.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-04-18 14:57:54 -04:00
Donald Sharp
80ad04184f zebra: Double check is not necessary in nexthop_active_update
The nexthop_active_update command looks at each individual
nexthop and decides if it has changed.  If any nexthop
has changed we will set the re->status to ROUTE_ENTRY_CHANGED
and ROUTE_ENTRY_NEXTHOPS_CHANGED.

Additionally the test for old_nh_num != curr_active
makes no sense because suppose we have several events
we are processing at the same time and a total ecmp
of 16 but 14 are active at the start and 14 are active
at the end but different interfaces are up or down.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-04-18 14:57:54 -04:00
Donald Sharp
dd50eeb115 lib, zebra: Remove unused flag
The NEXTHOP_FLAG_FILTERED went away when we started treating
static routes like every other route in the system.  This was
a special case for handling static route code that just didn't
get finished cleaning up.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-04-18 14:57:54 -04:00
Donald Sharp
99eabcec1a zebra: nexthop_active_update does not need set
We are effectively calling nexthop_active_update() on every
route entry being processed for installation at least 2 times.
This is a bit ridiculous.  We need to resolve the nexthops
when we know a route has changed in some manner, so do so.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-04-18 14:57:54 -04:00
Renato Westphal
e412d3b8d9 lib: move zlog() prototype back to the public logging API
zlog() should be part of the public logging API as it's useful in
the cases where the logging priority isn't known at compile time
(i.e. it depends on a variable).

Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
2019-04-18 13:15:13 -03:00
David Lamparter
7e3a1ec742 lib: ZEBRA_NUM_OF -> array_size
The latter is widely used, e.g. in the Linux kernel.

Signed-off-by: David Lamparter <equinox@diac24.net>
2019-04-18 12:44:29 +02:00
Mark Stapp
cf363e1bd8 zebra: dataplane notifications for system route changes
Add notifications from zebra to the dataplane subsystem when
kernel or connected routes change.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-04-10 16:07:01 -04:00
Mark Stapp
f4c6e2a815 zebra: remove unused VRF_RIB_SCHEDULED flag
We don't use th vrf-level VRF_RIB_SCHEDULED flag any longer;
remove it and collapse the zebra_vrf flags' values.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-04-05 08:46:28 -04:00
Donald Sharp
a1494c250c zebra: Modify lsp processing to be invoked as needed
LSP processing was a zvrf flag based upon a connected route
coming or going.  But this did not allow us to know
that we should do lsp processing other than after the meta-queue
processing was finished.

Eventually we moved meta-queue processing of do_nht_processing
to after the dataplane sent the main pthread some results.
This of course left us with a timing hole where if a connected
route came in and we received a data plane response *before*
the meta queue was processed we would not do the work as necessary.

Move the lsp processing to a flag off of the rib_dest_t. If it
is marked then we need to process lsps.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-03-27 16:22:22 -04:00
Donald Sharp
50872b0804 zebra: Add detailed debugging command for NHT tracking
Add a detailed debugging command for NHT tracking and add
the detailed output to the log about why we make some decisions
that we are.  I tried to model this like the rib processing
detailed debugs that we added a few months back.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-03-27 16:22:22 -04:00
Donald Sharp
699dae230d zebra: Modify NHT to occur when needed.
Currently nexthop tracking is performed for all nexthops that
are being tracked after a group of contexts are passed back
from the data plane for post install processing.

This is inefficient and leaves us sending nexthop tracking
changes at an accelerated pace, when we think we've changed
a route.  Additionally every route change will cause us
to relook at all nexthops we are tracking irrelevant if
they are possibly related to the route change or not.

Let's modify the code base to track the rnh's off of the rib
table's rn, `rib_dest_t`.  So after we process a node, install
it into the data plane, in rib_process_result we can
look at the `rib_dest_t` associated with the rn and see that
a nexthop depended on this route node.  If so, refigure it.

Additionally we will store rnh's that are not resolved on the
0.0.0.0/0 nexthop tracking list.  As such when a route node
changes we can quickly walk up the rib tree and notice that
it needs to be reprocessed as well.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-03-27 16:22:22 -04:00
Donald Sharp
c86ba6c283 zebra: Add a base node for the zebra vrf tables
Add a default route_node for our routing tables.  This will allow us
to know that we can hang data off the default route for processing.

We will be hanging the nexthop tracking data structures off the rib_dest_t
so that we can know which nexthops we need to handle.  Effectively
nexthops that we are tracking that are unresolved will be stored on the
default route.  When something changes in the rib tree we can
work up the rn->parent pointer checking for nexthops we need to re-evaluate.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-03-27 16:19:28 -04:00
Donald Sharp
434434f704 zebra: Abstract the rib_dest_t creation
Abstract the creation of the rib_dest_t so that we can call it
from multiple places.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-03-27 16:19:28 -04:00
Donald Sharp
3cdba47a82 zebra: Modify code so that dplane is responsible for indicating success/fail of install
We have several route types KERNEL and CONNECT that are handled via special
case in the code.  This was causing a lot of work keeping the two different
classes of route types as special(SYSTEM OR NOT).  Put the dplane
in charge of the code that sets the bits for signalling route install/failure.

This greatly simplifies the code calling path and makes all route types
be handled exactly the same.  Additionaly code that we want to run
post data plane install can just work as per normal then, instead
of having to know we need to run it when we have a special type
of route.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com.
2019-03-27 16:19:28 -04:00
Donald Sharp
7a230a9d0c zebra: On route install/update failure correctly indicate in rib
When we get a route install failure from the kernel, actually
indicate in the rib the status of the routes.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-03-27 16:19:28 -04:00
Donald Sharp
9ef0c6ba87 zebra: Unset old_re as queued.
When switching routes from one route type to another actually
unset the old route as enqueued.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-03-27 16:19:28 -04:00
Quentin Young
73fb891892 Revert "Merge pull request #3982 from pacovn/Coverity_1479148_copy_paste"
This reverts commit 3a3704fe36, reversing
changes made to 5a3c6e736d.
2019-03-20 21:25:04 +00:00
F. Aragon
23fbacb455
zebra: copy-paste error (Coverity 1479148)
Signed-off-by: F. Aragon <paco@voltanet.io>
2019-03-20 16:45:32 +01:00
Donald Sharp
b900245adc zebra: System routes sometimes can not be properly selected
System Routes if received over the netlink bus in a
specific pattern that causes an update operation for that
route in zebra can leave the dest->selected_fib pointer NULL,
while having the ZEBRA_FLAG_SELECTED flag set. Specifically
one way to achieve this is to do this:

`ip addr del 4.5.6.7/32 dev swp1 ; ip addr add 4.5.6.7/32 dev swp1 metric 9`

Why is this a big deal?
Because nexthop tracking is looking at ZEBRA_FLAG_SELECTED to
know if we can use a route, while nexthop active checking uses
dest->selected_fib.

So imagine we have bgp registering a nexthop. nexthop tracking in
the above case will be able to choose the 4.5.6.7/32 route
if that is what the nexthop is, due to the ZEBRA_FLAG_SELECTED being
properly set. BGP then allows the peers connection to come up and we
install routes with a 4.5.6.7 nexthop. The rib processing for route
installation will then look at the 4.5.6.7 route see no
dest->selected_fib and then start walking up the tree to resolve
the route. In our case we could easily hit the default route and be
unable to resolve the route. Which then becomes inactive in the
rib so we never attempt to install it.

This commit fixes this problem because when the rib_process decides
that we need to update the fib( ie replace old w/ new ), the
replacement with new was not setting the `dest->selected_fib` pointer
to the new route_entry, when the route was a system route.

Ticket: CM-24203
Signed-off-by: Donald Sharp <sharpd@cumulusnetworkscom>
2019-03-15 10:02:11 -04:00
David Lamparter
d3b05897ed
Merge pull request #3869 from qlyoung/cocci-fixes
Assorted Coccinelle fixes
2019-03-06 15:54:44 +01:00
vivek
2b83602b24 *: Explicitly mark nexthop of EVPN-sourced routes as onlink
In the case of EVPN symmetric routing, the tenant VRF is associated with
a VNI that is used for routing and commonly referred to as the L3 VNI or
VRF VNI. Corresponding to this VNI is a VLAN and its associated L3 (IP)
interface (SVI). Overlay next hops (i.e., next hops for routes in the
tenant VRF) are reachable over this interface. Howver, in the model that
is supported in the implementation and commonly deployed, there is no
explicit Overlay IP address associated with the next hop in the tenant
VRF; the underlay IP is used if (since) the forwarding plane requires
a next hop IP. Therefore, the next hop has to be explicit flagged as
onlink to cause any next hop reachability checks in the forwarding plane
to be skipped.

https://tools.ietf.org/html/draft-ietf-bess-evpn-prefix-advertisement
section 4.4 provides additional description of the above constructs.

Use existing mechanism to specify the nexthops as onlink when installing
these routes from bgpd to zebra and get rid of a special flag that was
introduced for EVPN-sourced routes. Also, use the onlink flag during next
hop validation in zebra and eliminate other special checks.

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by:   Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Reviewed-by:   Donald Sharp <sharpd@cumulusnetworks.com>
2019-02-27 12:54:24 +00:00
Quentin Young
9f2d035447 *: remove useless return variables
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2019-02-25 23:00:16 +00:00
Donald Sharp
5f27bcba2a zebra: Fix use after free in rib_process_result
Running zebra after commit 888756b208
in valgrind produces this item:

==17102== Invalid read of size 8
==17102==    at 0x44D84C: rib_dest_from_rnode (rib.h:375)
==17102==    by 0x4546ED: rib_process_result (zebra_rib.c:1904)
==17102==    by 0x45436D: rib_process_dplane_results (zebra_rib.c:3295)
==17102==    by 0x4D0902B: thread_call (thread.c:1607)
==17102==    by 0x4CC3983: frr_run (libfrr.c:1011)
==17102==    by 0x4266F6: main (main.c:473)
==17102==  Address 0x83bd468 is 88 bytes inside a block of size 96 free'd
==17102==    at 0x4A35F54: free (vg_replace_malloc.c:530)
==17102==    by 0x4CCAC00: qfree (memory.c:129)
==17102==    by 0x4D03DC6: route_node_destroy (table.c:501)
==17102==    by 0x4D039EE: route_node_free (table.c:90)
==17102==    by 0x4D03971: route_node_delete (table.c:382)
==17102==    by 0x44D82A: route_unlock_node (table.h:256)
==17102==    by 0x454617: rib_process_result (zebra_rib.c:1882)
==17102==    by 0x45436D: rib_process_dplane_results (zebra_rib.c:3295)
==17102==    by 0x4D0902B: thread_call (thread.c:1607)
==17102==    by 0x4CC3983: frr_run (libfrr.c:1011)
==17102==    by 0x4266F6: main (main.c:473)
==17102==  Block was alloc'd at
==17102==    at 0x4A36FF6: calloc (vg_replace_malloc.c:752)
==17102==    by 0x4CCAA2D: qcalloc (memory.c:110)
==17102==    by 0x4D03D88: route_node_create (table.c:489)
==17102==    by 0x4D0360F: route_node_new (table.c:65)
==17102==    by 0x4D034F8: route_node_set (table.c:74)
==17102==    by 0x4D03486: route_node_get (table.c:327)
==17102==    by 0x4CFB700: srcdest_rnode_get (srcdest_table.c:243)
==17102==    by 0x4545C1: rib_process_result (zebra_rib.c:1872)
==17102==    by 0x45436D: rib_process_dplane_results (zebra_rib.c:3295)
==17102==    by 0x4D0902B: thread_call (thread.c:1607)
==17102==    by 0x4CC3983: frr_run (libfrr.c:1011)
==17102==    by 0x4266F6: main (main.c:473)
==17102==

This is happening because of this order of events:

1) Route is deleted in the main thread and scheduled for rib processing.
2) Rib garbage collection is run and we remove the route node since it
is no longer needed.
3) Data plane returns from the deletion in the kernel and we call
the srcdest_rnode_get function to get the prefix that was deleted.
This recreates a new route node.  This creates a route_node with
a lock count of 1, which we freed via the route_unlock_node call.
Then we continued to use the rn pointer.  Which leaves us with use
after frees.

The solution is, of course, to just move the unlock the node at the
end of the function if we have a route_node.

Fixes: #3854
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-02-23 20:03:48 -05:00
Mark Stapp
5c111895d6 zebra: unlock route-node in dplane results handler
Unlock the route-node struct we look up while processing
async dataplane results.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-02-21 16:15:14 -05:00
Mark Stapp
8263d1d0d9 zebra: use update semantics for routes consistently
Use 'update' semantics for route updates, to ensure that
netlink replace behavior works correctly.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-02-11 16:11:02 -05:00
David Lamparter
b7777b57c4
Merge pull request #3722 from donaldsharp/static_recursive
Zebra fixes
2019-02-07 19:22:29 +01:00
Donald Sharp
4634d02cfd
Merge pull request #3684 from mjstapp/dplane_pw
zebra: async dataplane for pseudowires
2019-02-05 18:41:12 -05:00
Donald Sharp
6c47d39902 zebra: Fix multiple levels of static recursion
Allow the nexthop-check code to figure out recursive static routes
in a logical manner.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-02-05 15:21:26 -05:00
Donald Sharp
46a4e3455b zebra: NHT was being run at least 2 times and missreporting data
With the data plane changes that were made, we are now running
nexthop tracking 2 times.  Once at the end of meta-queue insertion
and once at the end of receiving a bunch of data from the dataplane.

The Addition of the data plane code caused flags to not be set
fully for the resolved routes( since we do not know the answer yet ),
This in turn caused the nexthop tracking run after the meta-queue
to think that the route was not `good`.  This would cause it to
tell all interested parties that there was no nexthop.

After the dataplane insertion we are also no running nht code.
This was re-figuring out the nexthop correctly and also
correctly reporting to interested parties that there was a path again.

Example:
donna.cumulusnetworks.com(config)# do show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued route, f - failed route

K>* 0.0.0.0/0 [0/103] via 10.50.11.1, enp0s3, 00:06:47
S>* 4.5.6.7/32 [1/0] via 192.168.209.1, enp0s8, 00:04:47
C>* 10.50.11.0/24 is directly connected, enp0s3, 00:06:47
C>* 192.168.209.0/24 is directly connected, enp0s8, 00:06:47
C>* 192.168.210.0/24 is directly connected, enp0s9, 00:06:47
donna.cumulusnetworks.com(config)# ip route 4.5.6.7/32 192.168.210.1
donna.cumulusnetworks.com(config)# do show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued route, f - failed route

K>* 0.0.0.0/0 [0/103] via 10.50.11.1, enp0s3, 00:07:06
S>* 4.5.6.7/32 [1/0] via 192.168.209.1, enp0s8, 00:00:04
  *                  via 192.168.210.1, enp0s9, 00:00:04
C>* 10.50.11.0/24 is directly connected, enp0s3, 00:07:06
C>* 192.168.209.0/24 is directly connected, enp0s8, 00:07:06
C>* 192.168.210.0/24 is directly connected, enp0s9, 00:07:06
donna.cumulusnetworks.com(config)#

Log files for sharp, which is watching 4.5.6.7:
2019/02/04 15:20:54.844288 SHARP: Received update for 4.5.6.7/32
2019/02/04 15:20:54.844820 SHARP: Received update for 4.5.6.7/32
2019/02/04 15:20:54.844836 SHARP: 	Nexthop 192.168.209.1, type: 2, ifindex: 3, vrf: 0, label_num: 0
2019/02/04 15:20:54.844853 SHARP: 	Nexthop 192.168.210.1, type: 2, ifindex: 4, vrf: 0, label_num: 0

As you can see we have received an update with no nexthops( invalid route )
and a second update immediately after it with 2 nexthops.

What's the big deal you say?  Well we have code in other daemons that reacts
to not having a path for a nexthop.  In BGP this will cause us to tear
down the peer.  In staticd we'll remove the recursively resolved route.
In pim we'll remove all paths to the mroute.  This is not desirable.

The fix is to remove the meta-queue run of nexthop tracking.

While running after data plane notice of routes to handle is not ideal
we will be fixing this in the future with the nexthop group code, which
should know what nexthops are affected by a nexthop group change.

Fixed code debug code:
donna.cumulusnetworks.com(config)# do show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued route, f - failed route

K>* 0.0.0.0/0 [0/103] via 10.50.11.1, enp0s3, 00:00:46
S>* 4.5.6.7/32 [1/0] via 192.168.209.1, enp0s8, 00:00:02
C>* 10.50.11.0/24 is directly connected, enp0s3, 00:00:46
C>* 192.168.209.0/24 is directly connected, enp0s8, 00:00:46
C>* 192.168.210.0/24 is directly connected, enp0s9, 00:00:46
donna.cumulusnetworks.com(config)# ip route 4.5.6.7/32 192.168.210.1
donna.cumulusnetworks.com(config)# do show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued route, f - failed route

K>* 0.0.0.0/0 [0/103] via 10.50.11.1, enp0s3, 00:00:59
S>* 4.5.6.7/32 [1/0] via 192.168.209.1, enp0s8, 00:00:02
  *                  via 192.168.210.1, enp0s9, 00:00:02
C>* 10.50.11.0/24 is directly connected, enp0s3, 00:00:59
C>* 192.168.209.0/24 is directly connected, enp0s8, 00:00:59
C>* 192.168.210.0/24 is directly connected, enp0s9, 00:00:59

2019/02/04 15:26:20.656395 SHARP: Received update for 4.5.6.7/32
2019/02/04 15:26:20.656440 SHARP: 	Nexthop 192.168.209.1, type: 2, ifindex: 3, vrf: 0, label_num: 0
2019/02/04 15:26:33.688251 SHARP: Received update for 4.5.6.7/32
2019/02/04 15:26:33.688322 SHARP: 	Nexthop 192.168.209.1, type: 2, ifindex: 3, vrf: 0, label_num: 0
2019/02/04 15:26:33.688329 SHARP: 	Nexthop 192.168.210.1, type: 2, ifindex: 4, vrf: 0, label_num: 0

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-02-05 09:17:02 -05:00
Donald Sharp
2561d12e5d zebra: Remove struct zebra_t
This structure is unused anymore and does not belong in zserv.h

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-31 09:20:46 -05:00
Donald Sharp
ea45a4e7db zebra: Move the mq data structure to zrouter
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-31 09:20:46 -05:00