Commit Graph

485 Commits

Author SHA1 Message Date
Donald Sharp
5f27bcba2a zebra: Fix use after free in rib_process_result
Running zebra after commit 888756b208
in valgrind produces this item:

==17102== Invalid read of size 8
==17102==    at 0x44D84C: rib_dest_from_rnode (rib.h:375)
==17102==    by 0x4546ED: rib_process_result (zebra_rib.c:1904)
==17102==    by 0x45436D: rib_process_dplane_results (zebra_rib.c:3295)
==17102==    by 0x4D0902B: thread_call (thread.c:1607)
==17102==    by 0x4CC3983: frr_run (libfrr.c:1011)
==17102==    by 0x4266F6: main (main.c:473)
==17102==  Address 0x83bd468 is 88 bytes inside a block of size 96 free'd
==17102==    at 0x4A35F54: free (vg_replace_malloc.c:530)
==17102==    by 0x4CCAC00: qfree (memory.c:129)
==17102==    by 0x4D03DC6: route_node_destroy (table.c:501)
==17102==    by 0x4D039EE: route_node_free (table.c:90)
==17102==    by 0x4D03971: route_node_delete (table.c:382)
==17102==    by 0x44D82A: route_unlock_node (table.h:256)
==17102==    by 0x454617: rib_process_result (zebra_rib.c:1882)
==17102==    by 0x45436D: rib_process_dplane_results (zebra_rib.c:3295)
==17102==    by 0x4D0902B: thread_call (thread.c:1607)
==17102==    by 0x4CC3983: frr_run (libfrr.c:1011)
==17102==    by 0x4266F6: main (main.c:473)
==17102==  Block was alloc'd at
==17102==    at 0x4A36FF6: calloc (vg_replace_malloc.c:752)
==17102==    by 0x4CCAA2D: qcalloc (memory.c:110)
==17102==    by 0x4D03D88: route_node_create (table.c:489)
==17102==    by 0x4D0360F: route_node_new (table.c:65)
==17102==    by 0x4D034F8: route_node_set (table.c:74)
==17102==    by 0x4D03486: route_node_get (table.c:327)
==17102==    by 0x4CFB700: srcdest_rnode_get (srcdest_table.c:243)
==17102==    by 0x4545C1: rib_process_result (zebra_rib.c:1872)
==17102==    by 0x45436D: rib_process_dplane_results (zebra_rib.c:3295)
==17102==    by 0x4D0902B: thread_call (thread.c:1607)
==17102==    by 0x4CC3983: frr_run (libfrr.c:1011)
==17102==    by 0x4266F6: main (main.c:473)
==17102==

This is happening because of this order of events:

1) Route is deleted in the main thread and scheduled for rib processing.
2) Rib garbage collection is run and we remove the route node since it
is no longer needed.
3) Data plane returns from the deletion in the kernel and we call
the srcdest_rnode_get function to get the prefix that was deleted.
This recreates a new route node.  This creates a route_node with
a lock count of 1, which we freed via the route_unlock_node call.
Then we continued to use the rn pointer.  Which leaves us with use
after frees.

The solution is, of course, to just move the unlock the node at the
end of the function if we have a route_node.

Fixes: #3854
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-02-23 20:03:48 -05:00
Mark Stapp
5c111895d6 zebra: unlock route-node in dplane results handler
Unlock the route-node struct we look up while processing
async dataplane results.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-02-21 16:15:14 -05:00
Mark Stapp
8263d1d0d9 zebra: use update semantics for routes consistently
Use 'update' semantics for route updates, to ensure that
netlink replace behavior works correctly.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-02-11 16:11:02 -05:00
David Lamparter
b7777b57c4
Merge pull request #3722 from donaldsharp/static_recursive
Zebra fixes
2019-02-07 19:22:29 +01:00
Donald Sharp
4634d02cfd
Merge pull request #3684 from mjstapp/dplane_pw
zebra: async dataplane for pseudowires
2019-02-05 18:41:12 -05:00
Donald Sharp
6c47d39902 zebra: Fix multiple levels of static recursion
Allow the nexthop-check code to figure out recursive static routes
in a logical manner.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-02-05 15:21:26 -05:00
Donald Sharp
46a4e3455b zebra: NHT was being run at least 2 times and missreporting data
With the data plane changes that were made, we are now running
nexthop tracking 2 times.  Once at the end of meta-queue insertion
and once at the end of receiving a bunch of data from the dataplane.

The Addition of the data plane code caused flags to not be set
fully for the resolved routes( since we do not know the answer yet ),
This in turn caused the nexthop tracking run after the meta-queue
to think that the route was not `good`.  This would cause it to
tell all interested parties that there was no nexthop.

After the dataplane insertion we are also no running nht code.
This was re-figuring out the nexthop correctly and also
correctly reporting to interested parties that there was a path again.

Example:
donna.cumulusnetworks.com(config)# do show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued route, f - failed route

K>* 0.0.0.0/0 [0/103] via 10.50.11.1, enp0s3, 00:06:47
S>* 4.5.6.7/32 [1/0] via 192.168.209.1, enp0s8, 00:04:47
C>* 10.50.11.0/24 is directly connected, enp0s3, 00:06:47
C>* 192.168.209.0/24 is directly connected, enp0s8, 00:06:47
C>* 192.168.210.0/24 is directly connected, enp0s9, 00:06:47
donna.cumulusnetworks.com(config)# ip route 4.5.6.7/32 192.168.210.1
donna.cumulusnetworks.com(config)# do show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued route, f - failed route

K>* 0.0.0.0/0 [0/103] via 10.50.11.1, enp0s3, 00:07:06
S>* 4.5.6.7/32 [1/0] via 192.168.209.1, enp0s8, 00:00:04
  *                  via 192.168.210.1, enp0s9, 00:00:04
C>* 10.50.11.0/24 is directly connected, enp0s3, 00:07:06
C>* 192.168.209.0/24 is directly connected, enp0s8, 00:07:06
C>* 192.168.210.0/24 is directly connected, enp0s9, 00:07:06
donna.cumulusnetworks.com(config)#

Log files for sharp, which is watching 4.5.6.7:
2019/02/04 15:20:54.844288 SHARP: Received update for 4.5.6.7/32
2019/02/04 15:20:54.844820 SHARP: Received update for 4.5.6.7/32
2019/02/04 15:20:54.844836 SHARP: 	Nexthop 192.168.209.1, type: 2, ifindex: 3, vrf: 0, label_num: 0
2019/02/04 15:20:54.844853 SHARP: 	Nexthop 192.168.210.1, type: 2, ifindex: 4, vrf: 0, label_num: 0

As you can see we have received an update with no nexthops( invalid route )
and a second update immediately after it with 2 nexthops.

What's the big deal you say?  Well we have code in other daemons that reacts
to not having a path for a nexthop.  In BGP this will cause us to tear
down the peer.  In staticd we'll remove the recursively resolved route.
In pim we'll remove all paths to the mroute.  This is not desirable.

The fix is to remove the meta-queue run of nexthop tracking.

While running after data plane notice of routes to handle is not ideal
we will be fixing this in the future with the nexthop group code, which
should know what nexthops are affected by a nexthop group change.

Fixed code debug code:
donna.cumulusnetworks.com(config)# do show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued route, f - failed route

K>* 0.0.0.0/0 [0/103] via 10.50.11.1, enp0s3, 00:00:46
S>* 4.5.6.7/32 [1/0] via 192.168.209.1, enp0s8, 00:00:02
C>* 10.50.11.0/24 is directly connected, enp0s3, 00:00:46
C>* 192.168.209.0/24 is directly connected, enp0s8, 00:00:46
C>* 192.168.210.0/24 is directly connected, enp0s9, 00:00:46
donna.cumulusnetworks.com(config)# ip route 4.5.6.7/32 192.168.210.1
donna.cumulusnetworks.com(config)# do show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued route, f - failed route

K>* 0.0.0.0/0 [0/103] via 10.50.11.1, enp0s3, 00:00:59
S>* 4.5.6.7/32 [1/0] via 192.168.209.1, enp0s8, 00:00:02
  *                  via 192.168.210.1, enp0s9, 00:00:02
C>* 10.50.11.0/24 is directly connected, enp0s3, 00:00:59
C>* 192.168.209.0/24 is directly connected, enp0s8, 00:00:59
C>* 192.168.210.0/24 is directly connected, enp0s9, 00:00:59

2019/02/04 15:26:20.656395 SHARP: Received update for 4.5.6.7/32
2019/02/04 15:26:20.656440 SHARP: 	Nexthop 192.168.209.1, type: 2, ifindex: 3, vrf: 0, label_num: 0
2019/02/04 15:26:33.688251 SHARP: Received update for 4.5.6.7/32
2019/02/04 15:26:33.688322 SHARP: 	Nexthop 192.168.209.1, type: 2, ifindex: 3, vrf: 0, label_num: 0
2019/02/04 15:26:33.688329 SHARP: 	Nexthop 192.168.210.1, type: 2, ifindex: 4, vrf: 0, label_num: 0

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-02-05 09:17:02 -05:00
Donald Sharp
2561d12e5d zebra: Remove struct zebra_t
This structure is unused anymore and does not belong in zserv.h

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-31 09:20:46 -05:00
Donald Sharp
ea45a4e7db zebra: Move the mq data structure to zrouter
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-31 09:20:46 -05:00
Donald Sharp
489a961429 zebra: Move ribq from zebrad to zrouter
The zrouter should own this data structure and it should not
be defined in zserv.h

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-31 09:20:46 -05:00
Donald Sharp
b3d43ff471 zebra: Move rtm_table_default to zrouter
The zrouter should own this particular piece of data.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-31 09:20:46 -05:00
Donald Sharp
3801e7646c zebra: Move the master thread handler to the zrouter structure
The master thread handler is really part of the zrouter structure.
So let's move it over to that.  Eventually zserv.h will only be
used for zapi messages.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-31 09:20:46 -05:00
David Lamparter
19b336d343
Merge pull request #3699 from donaldsharp/zebra_rib_debugs
Zebra Respect my authority
2019-01-31 01:51:52 +01:00
Donald Sharp
b9f0e5ee24 zebra: On route update context is sometimes indeterminate in post-processing
When we get into rib_process_result and the operation we are handling
is DPLANE_OP_ROUTE_UPDATE *and* the route entry being looked at
is a route replace, we currently have no way to decode to the old_re
and the re due to how we have stored context.  As such they are the
same pointer.

As such the route replace for the same route type is causing the re
to set the installed flag and then immediately unset the installed
flag, leaving us in a state where the kernel has the route but
the rib thinks we are not installed.

Since the true old_re( the one being replaced by the update operation )
is going away( as that it zebra deletes the old one for us already )
this fix is not optimal but will get us moving forward.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-30 09:52:13 -05:00
Donald Sharp
058c16b7e2 zebra: Trust kernel and System routes
If we receive a valid message from the kernel that
is either a kernel or system route, we should trust
that the route is legit and just use it.

Old behavior:

K * 172.22.0.0/15 [0/0] via 172.22.2.254, eva_dummy1 inactive, 00:00:16

New Behavior:

K>* 172.22.0.0/15 [0/0] via 172.22.2.254, eva_dummy1, 00:02:35

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-29 21:45:02 -05:00
Donald Sharp
2da33d6b3a zebra: Convert route entry id number to string in debugs
The route entry being displayed in debugs was displaying
the originating route type as a number.  While numbers
are cool, I for one am not terribly interested in
memorizing them.  Modify the (type %d) to a (%s) to
just list the string type of the route.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-29 21:35:07 -05:00
Russ White
2538f1dad7
Merge pull request #3681 from donaldsharp/onlink
*: The onlink attribute should be owned by the nexthop not the route.
2019-01-29 10:09:44 -05:00
Donald Sharp
fe85601c96 *: The onlink attribute should be owned by the nexthop not the route.
The onlink attribute was being passed from upper level protocols
as an attribute of the route *not* the individual nexthop.  When
we pass this data to the kernel, we treat the onlink as a attribute
of the nexthop.  This commit modifies the code base to allow
us to pass the ONLINK attribute as an attribute of the nexthop.

This commit also fixes static routes that have multiple nexthops
some onlink and some not.

ip route 4.5.6.7/32 192.168.41.1 eveth1 onlink
ip route 4.5.6.7/32 192.168.42.2

S>* 4.5.6.7/32 [1/0] via 192.168.41.1, eveth1 onlink, 00:03:04
  *                  via 192.168.42.2, eveth2, 00:03:04

sharpd@robot ~/frr2> sudo ip netns exec EVA ip route show
4.5.6.7 proto 196 metric 20
	nexthop via 192.168.41.1 dev eveth1 weight 1 onlink
	nexthop via 192.168.42.2 dev eveth2 weight 1

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-26 21:02:26 -05:00
Donald Sharp
60f98b236e zebra: Keep track of when routes are queued/dequeued from the dataplane
When we process the dataplane data, keep track of whether or not a route
is in transit or not.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-25 20:16:15 -05:00
Donald Sharp
677c1dd5cb zebra: Use ROUTE_ENTRY_INSTALLED as decision for route is installed
zebra is using NEXTHOP_FLAG_FIB as the basis of whether or not
a route_entry is installed.  This is problematic in that we plan
to separate out nexthop handling from route installation.  So modify
the code to keep track of whether or not a route_entry is installed/failed.

This basically means that every place we set/unset NEXTHOP_FLAG_FIB, we
actually also set/unset ROUTE_ENTRY_INSTALLED on the route_entry.
Additionally where we check for route installed via NEXTHOP_FLAG_FIB
switch over to checking if the route think's it is installed.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-25 20:16:15 -05:00
Mark Stapp
9bd9717bb2 zebra: add handler for pw install errors
Add handler for async error results from the dataplane for
pseudowire installation attempts.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-01-25 10:45:57 -05:00
Donald Sharp
313731ac92
Merge pull request #3630 from opensourcerouting/fix-show-import-check
zebra: fix the "show ip import-check" command
2019-01-22 20:10:56 -05:00
Mark Stapp
d37f4d6c61 zebra: move LSP updates into dataplane subsystem
Start performing LSP updates through the async dataplane
subsystem. This is plumbed through for linux/netlink.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-01-22 13:56:48 -05:00
Renato Westphal
73bf60a06b zebra: consolidate how we indentify address-families in the NHT code
Favor usage of the afi_t enumeration to identify address-families
over using the classic AF_INET[6] constants for that. The choice to
use either of the two seems to be mostly arbitrary throughout our
code base, which leads to confusion and bugs like the one fixed by
commit 6f95d11a1. To address this problem, favor usage of the afi_t
enumeration whenever possible, since 1) it's an enumeration (helps
the compilers to catch some bugs), 2) has a safi_t sibling and 3)
can be used to index static arrays. AF_INET[6] should then be used
only when interfacing with the kernel or external libraries like
libc. The family2afi() and afi2family() functions can be used to
convert between the two different representations back and forth.

Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
2019-01-21 13:26:36 -02:00
Donald Sharp
12e7fe3aa0 zebra: Add a switch statement for rib_process_after
Future commits are going to introduce more rigor in
state setting in the case of received results from
the data plane.  So let us move the DPLANE_OP_ROUTE_DELETE
state check to the same spot as the rest of the code that
is handling a particular operation.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-11 11:48:14 -05:00
Donald Sharp
f52ed67796 zebra: Limit meta_queue insertion to one time.
Modify the meta_queue insertion such that we only enqueue
the route_node into one meta_queue instead of several.

Suppose we have multiple route_entries associated with
a particular node from rip, bgp, staticd.  If we receive a
route update from rip, we would enqueue the route_node into
the 1, 2, 3 meta-nodes.  Which means that we would run
the entire process of figuring out a route 3 times, while
nothing would change the second two times.

Modify the code to choose the lowest meta-queue and
install it into that one for processing.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-11 11:48:14 -05:00
Donald Sharp
9d5a82a5c2
Merge pull request #3526 from mjstapp/dplane_lists
zebra: pass lists of results from dataplane to zebra
2019-01-10 19:20:35 -05:00
Mark Stapp
4c206c8f74 zebra: pass lists of results from dataplane to zebra
Pass lists of results back to zebra from the dataplane subsystem
(and pthread). This helps reduce the lock/unlock cycles when
zebra is busy. Also remove a couple of typedefs that made their
way into the dataplane header file - those violate the FRR style
guidelines.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-01-10 13:24:13 -05:00
Donald Sharp
73547a754e zebra: Consolidate meta_queue_map into route_info
The route_info data structure already had a mapping of route type
to admin distance.  Consolidate the meta_queue_map information
into this route_info data structure.  This is to reduce the number
of places we need to remember to touch when adding a new routing
protocol.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-02 09:15:30 -05:00
Donald Sharp
85c3d6005e
Merge pull request #3464 from mjstapp/wq_event
libs,zebra: support timeout for workqueue retries, use for rib
2018-12-14 10:00:49 -05:00
Donald Sharp
dba52387b7 zebra: On route removal failure return proper message
When a route removal failure happens return to the installing
protocol that the route deletion failed.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-12-13 20:00:33 -05:00
Mark Stapp
6dd7b84894 zebra: use a small retry timeout for the rib workqueue
In the zebra rib processing workqueue, set a small timeout
so that we will wait a short time if the queue into the
async dataplane is full. This helps avoid a situation where
the zebra main pthread constantly retries rib work without
giving the dataplane pthread a chance to make progress.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2018-12-13 14:15:27 -05:00
Russ White
f4aaa03907
Merge pull request #3477 from donaldsharp/multipath_respect
zebra: Allow zebra to only mark up to multipath_num nexthops as ACTIVE
2018-12-13 10:41:26 -05:00
Russ White
eefe8ab766
Merge pull request #3467 from donaldsharp/kernel_socket_cleanup
Kernel socket cleanup
2018-12-13 10:32:09 -05:00
Donald Sharp
220f0f4245 zebra: Allow zebra to only mark up to multipath_num nexthops as ACTIVE
NEXTHOP_FLAG_ACTIVE currently means that the nexthop is considered
good enough to be installed. With current ecmp restrictions this
translation from multipath_num is enforced in the data plane.
The problem with this is of course that every data plane now
becomes concerned about the multipath num and must enforce it
independently.  Currently *bsd does not honor multipath_num at
all and linux marks all nexthops as being installed even when
it honors a multipath_num that is less than the total.

This code change moves the multipath_num enforcement from a dataplane
decision to a zebra nexthop decision.  Thus dataplanes now can
just install those nexthops marked as NEXTHOP_FLAG_ACTIVE
without having to worry about multipath_num.

*BSD will now respect multipath_num and Linux now properly notes
which routes are actually installed or not:

sharpd@donna ~/f/t/topotests> ps -ef | grep frr
frr       6261  1556  0 09:12 ?        00:00:00 /usr/lib/frr/zebra -e 2 --daemon -A 127.0.0.1
frr       6279  1556  0 09:12 ?        00:00:00 /usr/lib/frr/staticd --daemon -A 127.0.0.1

donna.cumulusnetworks.com(config)# do show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route

K>* 0.0.0.0/0 [0/106] via 10.0.2.2, enp0s3, 00:00:45
S>* 4.4.4.4/32 [1/0] via 10.0.2.1, enp0s3, 00:00:02
  *                  via 192.168.209.1, enp0s8, 00:00:02
                     via 192.168.210.1, enp0s9 inactive, 00:00:02
C>* 10.0.2.0/24 is directly connected, enp0s3, 00:00:45
C>* 192.168.209.0/24 is directly connected, enp0s8, 00:00:45
C>* 192.168.210.0/24 is directly connected, enp0s9, 00:00:45
donna.cumulusnetworks.com(config)#

sharpd@donna ~/f/t/topotests> ip route show
default via 10.0.2.2 dev enp0s3 proto dhcp metric 106
4.4.4.4 proto 196 metric 20
	nexthop via 10.0.2.1 dev enp0s3 weight 1
	nexthop via 192.168.209.1 dev enp0s8 weight 1
10.0.2.0/24 dev enp0s3 proto kernel scope link src 10.0.2.15 metric 106
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
192.168.122.0/24 dev virbr0 proto kernel scope link src 192.168.122.1 linkdown
192.168.209.0/24 dev enp0s8 proto kernel scope link src 192.168.209.2 metric 105
192.168.210.0/24 dev enp0s9 proto kernel scope link src 192.168.210.2 metric 103
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-12-13 09:21:26 -05:00
Donald Sharp
c626d369fd zebra: Remove rib_lookup_ipv4_route
The rib_lookup_ipv4_route function is only used in a debug path.
Is only used for v4 and only checks to make sure that the rib
and fib are in sync( which is not needed/used/supported on other
platforms ).  So let's just remove it.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-12-12 11:54:12 -05:00
Donald Sharp
0dddbf72ec zebra: Convert nexthop_active functions to use bool
The set value was only being used as a bool, formalize this
in the call chain.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-12-10 10:21:41 -05:00
Mark Stapp
68b375e059 zebra: revise dplane dequeue api
Change the dataplane context dequeue api used by zebra to make the
purpose a bit clearer.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2018-11-21 10:38:08 -05:00
Mark Stapp
c831033fff zebra: dataplane provider enhancements
Limit the number of updates processed from the incoming queue;
add more stats. Fill out apis for dataplane providers; convert
route update processing to provider model; move dataplane
status enum

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2018-11-21 10:37:54 -05:00
Donald Sharp
effcfaeb3d zebra: Carry onlink if set from resolving nexthop
When resolving a nexthop, carry the onlink flag if it
is set to the new nexthop.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-11-07 11:28:01 -05:00
Mark Stapp
258c07e4e2 zebra: only uninstall once, when closing rib table
When the rib code is informed that a table is closing/
going away, only try once to uninstall associated routes from
the fib/dataplane. The close path can be called multiple times
in some cases - zebra shutdown, e.g.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2018-10-30 09:41:55 -04:00
Donald Sharp
69c19e1def
Merge pull request #2946 from mjstapp/dplane_2
Zebra: async dataplane, phase 1
2018-10-28 16:10:45 -04:00
Mark Stapp
8b962e7759 zebra: rebase dataplane, align with master
Rebase and pick up dataplane changes on master, including
renamed structs and enums.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2018-10-25 08:57:04 -04:00
Mark Stapp
91f1681258 zebra: limit queued route updates
Impose a configurable limit on the number of route updates
that can be queued towards the dataplane subsystem.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2018-10-25 08:57:04 -04:00
Mark Stapp
2577906481 zebra: revise struct names to resolve review comments
Use standard type naming and remove use of typedef to resolve
some review comments.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2018-10-25 08:57:04 -04:00
Mark Stapp
14c8b173d2 zebra: remove old apis after new dplane work
Replaced or out-grew a few zebra internal apis during async
dataplane work; removing them.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2018-10-25 08:57:04 -04:00
Mark Stapp
f183e380fa zebra: add handy res2str utility
Add a 2str utility for dplane result codes; use it in
a debug or two.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2018-10-25 08:57:04 -04:00
Mark Stapp
1bcea841b1 zebra: netlink fuzzing path correction
Correct use of netlink_parse_info() in the netlink fuzzing path.
Also clarify a couple of comments about pthreads.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2018-10-25 08:34:30 -04:00
Mark Stapp
fe2c53d4ea zebra: Fix style issues
Clean up a couple of checkstyle reports in the dataplane
commit.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2018-10-25 08:34:30 -04:00
Mark Stapp
5af4b34689 zebra: ensure redist of system routes
We need a bit of special handling for system routes, which need
to be offered for redistribution even though they won't be
passing through the dplane system.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2018-10-25 08:34:30 -04:00
Mark Stapp
5709131cec zebra: resolve style issues in dplane commit
Resolve (most) style issues in the initial zebra dataplane
commit branch.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2018-10-25 08:34:30 -04:00
Mark Stapp
8cb41cd624 zebra: set SELECTED flag in rib_process
Set SELECTED re immediately in rib_process, without expecting
that fib install has completed. Remove premature redistribute
call also.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2018-10-25 08:34:30 -04:00
Mark Stapp
97f5b44182 zebra: use async dplane route updates
Enqueue updates to the dplane system; add a couple of stats.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2018-10-25 08:34:30 -04:00
Mark Stapp
e5ac2adf17 zebra: wip: early version of dplane result handler
Early try at a result handler for async dplane route updates

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2018-10-25 08:34:30 -04:00
Mark Stapp
7cdb1a8445 zebra: start dataplane layer work
Reduce or eliminate use of global zebra_ns structs in
a couple of netlink/kernel code paths, so that those paths
can potentially be made asynch eventually.

Slide netlink_talk_info into place to remove dependency on core
zebra structs; add accessors for dplane context block

Start init of route context from zebra core re and rn structs;
start queueing and event handling for incoming route updates.

Expose netlink apis that don't rely on zebra core structs;
add parallel route-update code path using the dplane ctx;
simplest possible event loop to process queued route'
updates.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2018-10-25 08:34:30 -04:00
Donald Sharp
89272910f7 zebra: Start breakup of zns into zrouter and zns
The `struct zebra_ns` data structure is being used
for both router information as well as support for
the vrf backend( as appropriate ).  This is a confusing
state.  Start the movement of `struct zebra_ns` into
2 things `struct zebra_router` and `struct zebra_ns`.

In this new regime `struct zebra_router` is purely
for handling data about the router.  It has no knowledge
of the underlying representation of the Data Plane.

`struct zebra_ns` becomes a linux specific bit of code
that allows us to handle the vrf backend and is allowed
to have knowledge about underlying data plane constructs.

When someone implements a *bsd backend the zebra_vrf data
structure will need to be abstracted to take advantage of this
instead of relying on zebra_ns.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-10-24 06:52:07 -04:00
David Lamparter
cd5f56bb4e Merge branch 'pull/3165'
...with an additional comment.

Signed-off-by: David Lamparter <equinox@diac24.net>
2018-10-23 12:42:42 +02:00
David Lamparter
ef57f35f41 zebra: add comment about Linux ifdown handling
Signed-off-by: David Lamparter <equinox@diac24.net>
2018-10-23 12:42:06 +02:00
Donald Sharp
7939ff769f zebra: Add some missing breadcrumbs
During a debugging session last night I discovered that I was
still having some `fun` figuring out why zebra was not making
a route's nexthop active.  After some debugging I figured out
that I was missing some states that we could end up in that
didn't have debug information about what happened in nexthop_active.

Add the missing breadcrumbs for nexthop resolution.  In addition
add a bit of code to notice the ebgp state without recursion turned
on and to let the user know about it.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-10-18 09:13:18 -04:00
Philippe Guibert
212df1de28 zebra: remove kernel routes that are suppressed
on some cases, kernel routes are not selected, because the kernel
suppressed it without informing the netlink layer that the route has
been suppressed ( for instance, when an interface goes down, the route
never goes back when interface goes up again). This commit intends to
suppress that entry from zebra.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2018-10-17 23:01:10 +02:00
vishaldhingra
6d53d7b1af zebra: vrf aware routmap is missing in Zebra #2802(Part 2 of 4)
Function parameter replacement of using zvrf instead of vrf_id

Signed-off-by: vishaldhingra vdhingra@vmware.com
2018-10-11 10:46:55 -07:00
vishaldhingra
ac6eebce50 zebra: vrf aware routmap is missing in Zebra #2802(Part 1 of 4)
Work to handle the route-maps, namely the header changes in zebra_vrf.h
 and the mapping of using that everywhere

 Signed-off-by: vishaldhingra vdhingra@vmware.com
2018-10-11 10:44:55 -07:00
David Lamparter
6a154c8812 *: list_delete_and_null() -> list_delete()
Signed-off-by: David Lamparter <equinox@diac24.net>
2018-10-02 11:40:52 +02:00
Renato Westphal
38ca1c9256
Merge pull request #3081 from donaldsharp/table_table_table
bgpd, lib, zebra: Wrapper get/set of table->info pointer
2018-09-24 23:32:50 -03:00
Donald Sharp
6ca30e9ec6 bgpd, lib, zebra: Wrapper get/set of table->info pointer
Wrapper the get/set of the table->info pointer so that
people are not directly accessing this data.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-09-23 20:04:39 -04:00
Mark Stapp
ea1c14f680 zebra: Create zebra_dplane.c and .h
Add first sketchy 'dplane' files.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-09-19 18:29:55 -04:00
Quentin Young
1c50c1c0d6 *: style for EC replacements
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2018-09-13 19:38:57 +00:00
Quentin Young
e914ccbe9c zebra: ZEBRA_[ERR|WARN] -> EC_ZEBRA
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2018-09-13 19:23:29 +00:00
David Lamparter
e991eff5b5 Merge remote-tracking branch 'frr/master' into warnings
Conflicts:
	zebra/if_ioctl_solaris.c
	zebra/rtread_getmsg.c

Signed-off-by: David Lamparter <equinox@diac24.net>
2018-09-12 21:58:39 +02:00
Donald Sharp
714e135429
Merge pull request #2875 from opensourcerouting/fabricd
OpenFabric support
2018-09-08 13:48:48 -04:00
Donald Sharp
34815ea334 zebra: Modify nexthop checks to report inactive a bit more
Debugging inactive nexthops in zebra can be quite difficult
and non-obvious what has gone wrong.  Add detailed rib
debugs for the cases where we decide that a nexthop is
inactive so that we can more easily debug a reason
for the failure.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-09-06 20:24:00 -04:00
Quentin Young
9df414feeb zebra: flog_warn conversion
Convert Zebra to user error subsystem.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2018-09-06 20:56:38 +00:00
Donald Sharp
2d68a0f2da zebra: Fix _route_entry_dump to handle nexthop family as appropriate
The _route_entry_dump function was not handling the nexthop as passed
in from an upper level protocol appropriate and as such not displaying
the v4/v6 nexthop right in the case where we have both going.

Additionally dump the nexthop vrf as well.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-09-05 21:02:18 -04:00
Christian Franke
103e4a718f zebra: add a ZEBRA_FLAG_ONLINK so that routes bypass the is-unnumbered check
For OpenFabric operation, we need to be able to install routes via
interfaces without any IPv4 addresses configured. Introduce a flag
ZEBRA_FLAG_ONLINK which upper protocols can set on a route they send
towards zebra, to force the nexthops to be considered onlink.

Signed-off-by: Christian Franke <chris@opensourcerouting.org>
2018-09-05 11:38:13 +02:00
Don Slice
fec4ca191e zebra: if multiple connecteds, select loopback or vrf if present
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
2018-08-23 18:49:48 +00:00
Donald Sharp
dca5ef3053
Merge pull request #2818 from kssoman/rmap_fix
Zebra does not properly track which route-maps are changed (#2493)
2018-08-22 07:50:14 -04:00
kssoman
d5b8c21628 zebra : Zebra does not properly track which route-maps are changed (#2493)
* Check for the modified routemap in zebra_route_map_process_update_cb()
* Added zebra_rib_table_rm_update() for RIB routemap processing
* Added zebra_nht_rm_update() for NHT routemap processing

Signed-off-by: kssoman <somanks@vmware.com>
2018-08-17 08:47:48 -07:00
Quentin Young
af4c27286d *: rename zlog_fer -> flog_err
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2018-08-14 20:02:05 +00:00
Quentin Young
43e52561b4 zebra, lib: error references for zebra
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2018-08-14 20:02:05 +00:00
Renato Westphal
4373435488 zebra: remove unguarded debugging leftovers
These debug messages were committed by accident.

Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
2018-08-13 18:53:45 -03:00
Donald Sharp
0ce1ca805d *: ALLOC calls cannot fail
There is no need to check for failure of a ALLOC call
as that any failure to do so will result in a assert
happening.  So we can safely remove all of this code.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-08-11 17:14:58 +02:00
Donald Sharp
40ecd8e46d lib, zebra: Allow protocols to use Distance as part of RR semantics
Allow protocols to specify to zebra that they would like zebra
to use the distance passed down as part of determine sameness for
Route Replace semantics.

This will be used by the static daemon to allow it to have
backup static routes with greater distances.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-07-29 12:43:23 -04:00
Donald Sharp
7e24fdf333 staticd: Start the addition of a staticd
This is the start of separating out the static
handling code from zebra -> staticd.  This will
help simplify the zebra code and isolate static
route handling to it's own code base.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-07-29 12:37:24 -04:00
Christian Franke
1f610a1fb3 zebra: do not ignore ipv6 srcdest routes
Commit a2ca67d1d2 consolidated IPv4 and IPv6 handling. It also applied
our ignorance for IPv4 srcdest routes onto IPv6.

Signed-off-by: Christian Franke <chris@opensourcerouting.org>
2018-07-24 14:09:17 +02:00
Mark Stapp
86391e5659 zebra, libs: use const prefix ptrs in apis
Add 'const' to prefix args to several zebra route update,
redistribution, and route owner notification apis.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2018-07-11 09:22:49 -04:00
paco
0cfbff749e
zebra: flow control (Coverity 1462467 1465497)
Signed-off-by: F. Aragon <paco@voltanet.io>
2018-06-21 17:09:04 +02:00
paco
36228974c2
isisd, zebra: FIXME fixes
Signed-off-by: F. Aragon <paco@voltanet.io>
2018-06-19 19:22:13 +02:00
Donald Sharp
1e88567226 zebra: Add a result from dataplane request
Add a bit of code to allow return of data plane
request messages.

Add the ability to pass the result back to callers
of kernel_route_rib.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-05-30 08:03:13 -04:00
Donald Sharp
215181cbf1 zebra: Rename SOUTHBOUND_XXX to DP_XXX
The SOUTHBOUND_XXX enum was named a bit poorly.
Let's use a bit better name for what we are trying to do.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-05-30 08:00:55 -04:00
Donald Sharp
633a66a586 zebra: Add 'match source-instance' to allow finer grained control
Add to zebra route-maps the ability to match on a source-instance

route-map FOO deny 55
 match source-instance 5
route-map FOO permit 60

ip protocol any route-map FOO

This will match any protocol route installation with a source-instance of 5.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-05-17 10:57:59 -04:00
vivek
a317a9b9a4 bgpd, zebra: Handle EVPN router MAC per next hop
Ensure that when EVPN routes are installed into zebra, the router MAC
is passed per next hop and appropriately handled. This is required for
proper multipath operation.

Ticket: CM-18999
Reviewed By:
Testing Done: Verified failed scenario, other manual tests
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
2018-04-26 07:50:34 -04:00
Quentin Young
bf094f6975 zebra: clean up zapi organization
zserv.c has become something of a dumping ground for everything vaguely
related to ZAPI and really needs some love. This change splits out the
code fo building and consuming ZAPI messages into a separate source
file, leaving the actual session and client lifecycle code in zserv.c.

Unfortunately since the #include situation in Zebra has not been paid
much attention I was forced to fix the headers in a lot of other source
files. This is a net improvement overall though.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2018-04-22 22:50:24 -04:00
Renato Westphal
15da01e92d
Merge pull request #1973 from donaldsharp/static_nh_vrf
Static nh vrf
2018-04-10 17:27:57 -03:00
Donald Sharp
b8faa875f7 zebra: Notice when our route is deleted and re-install.
The code to reinstall self originated routes was not behaving
correctly.  For some reason we were looking for self originated
routes from the kernel to be of type KERNEL.  This was probably
missed when we started installing the route types.  We should
depend on the self originated flag that we determine from
the callback from the kernel.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com.
2018-04-09 07:54:57 -04:00
Donald Sharp
9713497ff4 zebra: Properly deregister static nexthops
There were a few cases where we were not properly de-registering
the static nexthops passed to us.  This was important when
the static route was being removed for whatever reason that
we did not leave slag for the nexthop tracking.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-03-27 15:51:53 -04:00
Quentin Young
d7c0a89a3a
*: use C99 standard fixed-width integer types
The following types are nonstandard:
- u_char
- u_short
- u_int
- u_long
- u_int8_t
- u_int16_t
- u_int32_t

Replace them with the C99 standard types:
- uint8_t
- unsigned short
- unsigned int
- unsigned long
- uint8_t
- uint16_t
- uint32_t

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2018-03-27 15:13:34 -04:00
Donald Sharp
18febdb05a
Merge pull request #1913 from LabNConsulting/working/master/bgp-vpn-leak-cli
bgpd: new vpn-policy CLI
2018-03-20 13:26:48 -04:00
G. Paul Ziemba
b9c7bc5ab0 bgpd: new vpn-policy CLI
PR #1739 added code to leak routes between (default VRF) VPN safi and unicast RIBs in any VRF. That set of changes included temporary CLI including vpn-policy blocks to specify RD/RT/label/&c. After considerable discussion, we arrived at a consensus CLI shown below.

The code of this PR implements the vpn-specific parts of this syntax:

router bgp <as> [vrf <FOO>]
    address-family <afi> unicast
        rd (vpn|evpn) export (AS:NN | IP:nn)
        label (vpn|evpn) export (0..1048575)
        rt (vpn|evpn) (import|export|both) RTLIST...
        nexthop vpn (import|export) (A.B.C.D | X:X::X:X)
        route-map (vpn|evpn|vrf NAME) (import|export) MAP

        [no] import|export [vpn|evpn|evpn8]
        [no] import|export vrf NAME

User documentation of the vpn-specific parts of the above syntax is in PR #1937

Signed-off-by: G. Paul Ziemba <paulz@labn.net>
2018-03-19 22:13:43 -07:00
Donald Sharp
ea7637ccd4 zebra: Cleanup dead function rib_weed_table
the rib_wib_table function was uncalled by anyone remove
and additionally remove it's static function it called.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-03-16 10:20:32 -04:00
Donald Sharp
95a29032bc zebra: Read in on startup arbitrary tables
When we receive an arbitrary table over the netlink bus
save it for later perusal and sweep any routes that
we may have created from an earlier run.

The current redistribute code is limited to
ZEBRA_KERNEL_TABLE_MAX.  I left this alone for the
moment because I believe it needs to be converted
to a RB tree instead of a flat array.  Which is more
work for the future.  Additionally this proposed
change might necessitate some cli changes or rethinks.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-03-16 10:18:58 -04:00