57 Commits

Author SHA1 Message Date
Donald Sharp
e16d030c65 *: Convert THREAD_XXX macros to EVENT_XXX macros
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24 08:32:17 -04:00
Donald Sharp
d8bc11a592 *: Add a hash_clean_and_free() function
Add a hash_clean_and_free() function as well as convert
the code to use it.  This function also takes a double
pointer to the hash to set it NULL.  Also it cleanly
does nothing if the pointer is NULL (as a bunch of
code tested for).

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-21 08:54:21 -04:00
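
The semantics described in d8bc11a592 (double pointer, set to NULL afterwards, no-op on NULL) can be illustrated with a minimal sketch. The `toy_hash` type and both functions below are hypothetical stand-ins invented for this example; they are not FRR's `struct hash` or its actual `hash_clean_and_free()` signature.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a hash table; not FRR's struct hash. */
struct toy_hash {
	size_t count;
	void **items;
};

/* Allocate a toy table holding `count` heap-allocated ints. */
static struct toy_hash *toy_hash_create(size_t count)
{
	struct toy_hash *h = calloc(1, sizeof(*h));

	assert(h);
	h->count = count;
	h->items = calloc(count ? count : 1, sizeof(*h->items));
	assert(h->items);
	for (size_t i = 0; i < count; i++) {
		h->items[i] = malloc(sizeof(int));
		assert(h->items[i]);
	}
	return h;
}

/*
 * Release every item through free_func (when one is given), free the
 * table itself, and set the caller's pointer to NULL.  Does nothing
 * when the pointer is already NULL.
 */
static void toy_hash_clean_and_free(struct toy_hash **hash,
				    void (*free_func)(void *))
{
	if (!hash || !*hash)
		return;

	for (size_t i = 0; i < (*hash)->count; i++)
		if (free_func)
			free_func((*hash)->items[i]);

	free((*hash)->items);
	free(*hash);
	*hash = NULL;
}
```

Because the pointer is cleared in the callee, calling `toy_hash_clean_and_free(&h, free)` twice in a row is harmless, which is what lets callers drop their own NULL checks.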
David Lamparter
acddc0ed3c *: auto-convert to SPDX License IDs
Done with a combination of regex'ing and banging my head against a wall.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2023-02-09 14:09:11 +01:00
Donald Sharp
06525c4f99 zebra: Add zrouter.asic_notification_nexthop_control
Volta submitted notification changes for the dplane that had a
special use case for their system.  Volta is no more, the code
is not being actively developed and from talking with ex-Volta
employees there are no current plans to even maintain this code.
Wrap the special handling of nexthops that their asic-dataplane
did in a bit of code to isolate it and allow for future removal,
as I do not actually believe anyone else is using this code.
Add a CPP_NOTICE several years into the future that will tell us
to remove the code.  If someone starts using it then they will
have to notice this variable to set it and hopefully they will
see my CPP_NOTICE to come talk to us.  If this is being used then
we can just remove this wrapper.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-12-12 10:44:57 -05:00
Siger Yang
c317d3f246 zebra: traffic control state management
This allows Zebra to manage QDISC, TCLASS, TFILTER in kernel and do cleaning
jobs when it starts up.

Signed-off-by: Siger Yang <siger.yang@outlook.com>
2022-11-22 22:35:35 +08:00
Donald Sharp
a310ebc114 zebra: Combine meta_queue_free and meta_queue_vrf_free functions
These functions essentially do the same thing.  Combine them
for the goodness of mankind.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-08-10 07:14:43 -04:00
Donald Sharp
d5795103bc zebra: Fix memory leaks and use after frees in nhg's on shutdown
Fixup both memory leaks as well as use after free's in nhg's
on shutdown.

This approach is effectively just iterating through all the
hash items and directly just freeing the memory instead
of handling ref counts or cross references.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-08-05 07:51:27 -04:00
Donald Sharp
88b0baa648 zebra: move allow_delete to zrouter.allow_delete
Instead of having global allow_delete move it to
where it belongs in the zrouter data structure.

Additionally show this data in `show zebra`

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-07-01 07:59:53 -04:00
Anuradha Karuppiah
4cf4fad153 zebra: add support for maintaining local neigh entries
Currently specific local neighbors (attached to SVIs) are maintained
in an EVPN specific database. There is a need to maintain L3 neighbors
for other purposes including MAC resolution for PBR nexthops.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
   Cleanup compile and fix crash
Signed-off-by: Anuradha Karuppiah <anuradhak@nvidia.com>
2022-06-27 07:56:55 -04:00
Donald Sharp
c9af62e314 zebra: Add a configurable knob zebra nexthop-group keep (1-3600)
Allow end operator to set how long a nexthop-group is kept around
in the system after it is no longer being used.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-06-16 14:47:19 -04:00
Mark Stapp
348698095d zebra: make netlink object hash threadsafe
The recently-added hashtable of nlsock objects needs to be
thread-safe: it's accessed from the main and dplane pthreads.
Add a mutex for it, use wrapper apis when accessing it. Add
a per-OS init/terminate api so we can do init that's not
per-vrf or per-namespace.

Signed-off-by: Mark Stapp <mstapp@nvidia.com>
2022-02-11 17:03:26 -05:00
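
The approach in 348698095d (a mutex plus wrapper APIs around a shared table) might look roughly like the sketch below. Everything here is an assumption made for illustration: the `toy_sock` record, the fixed-size array standing in for the hash table, and the wrapper names are not zebra's actual code.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define TOY_MAX_SOCKS 16

/* Hypothetical socket record; not zebra's struct nlsock. */
struct toy_sock {
	int fd;
	int in_use;
};

/* A fixed array stands in for the hash table in this sketch. */
static struct toy_sock toy_table[TOY_MAX_SOCKS];
static pthread_mutex_t toy_table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Insert fd; returns 0 on success, -1 if present already or table full. */
static int toy_sock_insert(int fd)
{
	struct toy_sock *slot = NULL;
	int ret = -1;

	pthread_mutex_lock(&toy_table_lock);
	for (size_t i = 0; i < TOY_MAX_SOCKS; i++) {
		if (toy_table[i].in_use && toy_table[i].fd == fd) {
			slot = NULL; /* duplicate: refuse the insert */
			goto done;
		}
		if (!toy_table[i].in_use && !slot)
			slot = &toy_table[i];
	}
	if (slot) {
		slot->fd = fd;
		slot->in_use = 1;
		ret = 0;
	}
done:
	pthread_mutex_unlock(&toy_table_lock);
	return ret;
}

/* Returns 1 if fd is in the table, 0 otherwise. */
static int toy_sock_lookup(int fd)
{
	int found = 0;

	pthread_mutex_lock(&toy_table_lock);
	for (size_t i = 0; i < TOY_MAX_SOCKS; i++)
		if (toy_table[i].in_use && toy_table[i].fd == fd)
			found = 1;
	pthread_mutex_unlock(&toy_table_lock);
	return found;
}

/* Remove fd; returns 0 if it was present, -1 otherwise. */
static int toy_sock_delete(int fd)
{
	int ret = -1;

	pthread_mutex_lock(&toy_table_lock);
	for (size_t i = 0; i < TOY_MAX_SOCKS; i++) {
		if (toy_table[i].in_use && toy_table[i].fd == fd) {
			toy_table[i].in_use = 0;
			ret = 0;
		}
	}
	pthread_mutex_unlock(&toy_table_lock);
	return ret;
}
```

Keeping the lock private to the wrappers means no caller on either pthread can touch the table without holding it.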
Donald Sharp
c3343a755f zebra: Prevent thread usage of data after it being freed
On startup we create a thread timer event to do a rib sweep
of the system.  On shutdown we never stopped this timer and
as such we have a situation where a thread event could be run
on shutdown after the data for it has been freed.  Here is the
crash I am seeing:

(gdb) bt
(gdb)

Save the thread data in zebra_router and stop the thread so we don't
accidentally do work on shutdown we don't mean to.  In this case
it happened in our topotests with some severe system load.
Essentially we happened to kill the zebra daemon just as the
graceful_restart timer popped here.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2021-11-29 15:51:45 -05:00
Mark Stapp
695b279ae3 zebra: free LSP workqueue early, revert PR 10050
This reverts commit dd9538c5f3, which tried to clear
the LSP workqueue late during shutdown.

Signed-off-by: Mark Stapp <mstapp@nvidia.com>
2021-11-18 07:35:35 -05:00
Mark Stapp
dd9538c5f3 zebra: free LSP workqueue later during shutdown
Free the LSP workqueue later during shutdown, so that zebra
has enough time to clean up and uninstall any LSPs.

Signed-off-by: Mark Stapp <mstapp@nvidia.com>
2021-11-12 15:10:00 -05:00
Donald Sharp
cbefb650bc zebra: Recent Merge broke --enable-werror
Recent code broke upon compiling with --enable-dev-build
and --enable-werror.  Fix.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2021-10-27 08:53:43 -04:00
Donald Lee
461c173cbd zebra: Add script initialization and destroy
Signed-off-by: Donald Lee <dlqs@gmx.com>
2021-10-20 00:56:00 +08:00
Stephen Worley
7c2ddfb976 zebra: rework RA handling for vrf-lite
Rework RA handling for vrf-lite scenarios.

Before we were using a single file descriptor for polling
across multiple zvrf's. This would cause us to hit this
assert() in some bgp unnumbered and vrrp configs:

```
/*
 * What happens if we have a thread already
 * created for this event?
 */
if (thread_array[fd])
	assert(!"Thread already scheduled for file descriptor");
```

We were scheduling a thread_read on the same FD for every zvrf.

With vrf-lite, RAs and ARPs are not vrf-bound, so we can just use one
rtadv instance to manage them for all VRFs. We will choose the default
VRF for this.

This patch removes the rtadv_sock altogether for zrouter and moves the
functionality this represented to the default VRF. All RAs will be
handled in the default VRF under vrf-lite configs with only one poll
thread started for it.

This patch also extends how we track subscribed interfaces (s or msec)
to use an actual sorted list by interface names rather than just a
counter. With multiple daemons turning interfaces on/off these counters
can get very wrong during ifup/down events. Making them a sorted list
prevents this from happening by preventing duplicates.

With netns-vrf's nothing should change other than the interface list.

Signed-off-by: Stephen Worley <sworley@nvidia.com>
2021-06-08 15:05:43 -04:00
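
To illustrate why 7c2ddfb976 prefers a sorted, duplicate-free list of interface names over a bare counter, here is a minimal sketch. The `toy_iflist` type, its size limits, and the function names are hypothetical; zebra's real list implementation is not shown in this log.

```c
#include <assert.h>
#include <string.h>

#define TOY_MAX_IFS 8
#define TOY_IFNAMSIZ 16

/* Hypothetical subscriber list, kept sorted by interface name. */
struct toy_iflist {
	char names[TOY_MAX_IFS][TOY_IFNAMSIZ];
	int count;
};

/* Add name in sorted position; returns 1 if added, 0 if rejected. */
static int toy_iflist_add(struct toy_iflist *l, const char *name)
{
	int pos = 0;

	if (strlen(name) >= TOY_IFNAMSIZ)
		return 0;
	while (pos < l->count && strcmp(l->names[pos], name) < 0)
		pos++;
	if (pos < l->count && strcmp(l->names[pos], name) == 0)
		return 0; /* duplicate subscription is ignored */
	if (l->count == TOY_MAX_IFS)
		return 0;
	memmove(&l->names[pos + 1], &l->names[pos],
		(size_t)(l->count - pos) * sizeof(l->names[0]));
	strcpy(l->names[pos], name);
	l->count++;
	return 1;
}

/* Remove name; returns 1 if it was present, 0 otherwise. */
static int toy_iflist_del(struct toy_iflist *l, const char *name)
{
	for (int i = 0; i < l->count; i++) {
		if (strcmp(l->names[i], name) == 0) {
			memmove(&l->names[i], &l->names[i + 1],
				(size_t)(l->count - i - 1) *
					sizeof(l->names[0]));
			l->count--;
			return 1;
		}
	}
	return 0;
}
```

A counter would be incremented twice if two daemons subscribed the same interface; here the second add is rejected, so `count` always matches the number of distinct subscribed interfaces.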
David Lamparter
224ccf29d9 zebra: kill zebra_memory.h, use MTYPE_STATIC
This one also needed a bit of shuffling around, but MTYPE_RE is the only
one left used across file boundaries now.

Signed-off-by: David Lamparter <equinox@diac24.net>
2021-03-22 20:02:17 +01:00
David Lamparter
bf8d3d6aca *: require semicolon after DEFINE_MTYPE & co
Back when I put this together in 2015, ISO C11 was still reasonably new
and we couldn't require it just yet.  Without ISO C11, there is no
"good" way (only bad hacks) to require a semicolon after a macro that
ends with a function definition.  And if you added one anyway, you'd get
"spurious semicolon" warnings on some compilers...

With C11, `_Static_assert()` at the end of a macro will make it so that
the semicolon is properly required, consumed, and not warned about.

Consistently requiring semicolons after "file-level" macros matches
Linux kernel coding style and helps some editors against mis-syntax'ing
these macros.

Signed-off-by: David Lamparter <equinox@diac24.net>
2021-03-17 06:18:17 +01:00
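
The `_Static_assert()` trick described in bf8d3d6aca can be shown with a small, self-contained sketch. `TOY_DEFINE_GETTER` is a hypothetical macro made up for this example; it is not FRR's `DEFINE_MTYPE`, it only shares the property of ending in a function definition.

```c
#include <assert.h>

/*
 * Hypothetical macro that ends with a function definition.  The
 * trailing _Static_assert is a declaration that must be terminated
 * by a semicolon, so the user of the macro has to supply one, and
 * the compiler has no stray empty declaration to warn about.
 */
#define TOY_DEFINE_GETTER(name, value) \
	static int toy_get_##name(void) \
	{ \
		return (value); \
	} \
	_Static_assert(1, "semicolon required after TOY_DEFINE_GETTER")

TOY_DEFINE_GETTER(answer, 42);
TOY_DEFINE_GETTER(zero, 0);
```

Omitting the semicolon after either invocation is a compile error under C11, which is the behavior the commit is after.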
Donald Sharp
e4876266e4 zebra: Add --asic-offload command
Add a command that allows FRR to know it's being used with
an underlying asic offload, from the linux kernel perspective.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2020-11-15 10:19:25 -05:00
Donald Sharp
4c56ce1cea zebra: Add basic knowledge of asic offload available
Some linux kernels are starting to support the idea of knowledge
about the underlying asic.  Add a boolean that we can set/unset
to track whether or not we think the router has this functionality
available.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2020-09-22 15:57:43 -04:00
Duncan Eastoe
b62983cf98 zebra: Add table_id to rib_table_info_t
When given a route_table this allows the corresponding kernel table
ID to be determined. The table_id value is set upon table creation
to the table_id of the VRF, unless the table was created with a
specific ID.

Signed-off-by: Duncan Eastoe <duncan.eastoe@att.com>
2020-07-08 12:52:13 +01:00
Chirag Shah
9d86e091bb zebra: northbound changes for the rib model
This commit implements:
RIB operational list create/destroy.
Walk over RIB tables using keys.
The first RIB table fetched will be IPv4/unicast (table-id 254).
Create a new api to fetch RIB table based on
afi-safi and table id as the keys.

Remove mandatory true statement from the leaf which
is part of the list key.

Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
2020-05-12 13:25:10 -07:00
Donald Sharp
630d596249 zebra: Remove typedef rib_table_info_t from system
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-05-08 08:10:49 -04:00
Stephen Worley
c25c3ea57a zebra: free unhashable (dup) NHEs via ID table cleanup
Free unhashable (duplicate NHEs from the kernel) via ID table
cleanup. Since the NHE ID hash table contains extra entries,
that's the one we need to be calling zebra_nhg_hash_free()
on, otherwise we will never free the unhashable NHEs.

This was found via a memleak:

==1478713== HEAP SUMMARY:
==1478713==     in use at exit: 10,267 bytes in 46 blocks
==1478713==   total heap usage: 76,810 allocs, 76,764 frees, 3,901,237 bytes allocated
==1478713==
==1478713== 208 (88 direct, 120 indirect) bytes in 1 blocks are definitely lost in loss record 35 of 41
==1478713==    at 0x483BB1A: calloc (vg_replace_malloc.c:762)
==1478713==    by 0x48E35E8: qcalloc (memory.c:110)
==1478713==    by 0x451CCB: zebra_nhg_alloc (zebra_nhg.c:369)
==1478713==    by 0x453DE3: zebra_nhg_copy (zebra_nhg.c:379)
==1478713==    by 0x452670: nhg_ctx_process_new (zebra_nhg.c:1143)
==1478713==    by 0x4523A8: nhg_ctx_process (zebra_nhg.c:1234)
==1478713==    by 0x452A2D: zebra_nhg_kernel_find (zebra_nhg.c:1294)
==1478713==    by 0x4326E0: netlink_nexthop_change (rt_netlink.c:2433)
==1478713==    by 0x427320: netlink_parse_info (kernel_netlink.c:945)
==1478713==    by 0x432DAD: netlink_nexthop_read (rt_netlink.c:2488)
==1478713==    by 0x41B600: interface_list (if_netlink.c:1486)
==1478713==    by 0x457275: zebra_ns_enable (zebra_ns.c:127)

Repro with:
ip next add id 1 blackhole
ip next add id 2 blackhole

valgrind /usr/lib/frr/zebra

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2020-04-02 11:41:25 -04:00
Mark Stapp
0eb97b860d lib,zebra: use nhg_hash_entry pointer in route_entry
Replace the existing list of nexthops (via a nexthop_group
struct) in the route_entry with a direct pointer to zebra's
new shared group (from zebra_nhg.h). This allows more
direct access to that shared group and the info it carries.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2019-12-04 08:13:52 -05:00
Donald Sharp
311c15ee60 zebra: Router Advertisement socket mess up
The code for when a new vrf is created to properly handle
router advertisement for it is messed up in several ways:

1) Generation of the zrouter data structure should set the rtadv
socket to -1 so that we don't accidentally close someone else's
open file descriptor
2) When you created a new zvrf instance *after* bootup we are XCALLOC'ing
the data structure so the zvrf->fd was 0.  The shutdown code was looking
for the >= 0 to know if the fd existed (since fd 0 is valid!)

This sequence of events would cause zebra to consume 100% of the
cpu:

Run zebra by itself ( no other programs )
ip link add vrf1 type vrf table 1003
ip link del vrf vrf1
vtysh -c "configure" -c "no interface vrf1"

This commit fixes this issue.

Fixes: #5376
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-11-19 15:51:10 -05:00
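
The bug class described in 311c15ee60 (a zero-filled allocation leaving a descriptor field at 0, which a `>= 0` check treats as an open fd) can be sketched as follows. The `toy_vrf` struct and helper names are hypothetical and do not reflect zebra's actual data structures.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-VRF state; not zebra's struct zebra_vrf. */
struct toy_vrf {
	int sock;
};

/* Allocate zero-filled state, then mark the socket as "not open". */
static struct toy_vrf *toy_vrf_new(void)
{
	struct toy_vrf *v = calloc(1, sizeof(*v));

	assert(v);
	/* calloc left sock at 0, which is a valid descriptor number. */
	v->sock = -1;
	return v;
}

/* Shutdown-style check: returns 1 if there is a descriptor to close. */
static int toy_vrf_has_sock(const struct toy_vrf *v)
{
	return v->sock >= 0;
}
```

Without the explicit `-1`, `toy_vrf_has_sock()` would report a socket on a freshly allocated VRF and shutdown code would act on descriptor 0.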
Stephen Worley
5948f013ba zebra: Cleanup zebra_nhg APIs
Add a private header file for functions that are internal/special
case like how we do it for `lib/nexthop_group_private.h`.

Remove a bunch of functions from the header file only being used
statically and add some comments for those remaining to indicate
better what their use is.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:44 -04:00
Stephen Worley
fefa080e3c zebra: Remove cleanup and nhg workqueue boilerplate
This code was from a strategy we elected not to use and
can safely be removed.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:43 -04:00
Stephen Worley
38e40db1c9 zebra: Sweep our nexthop objects out on restart
On restart, if we failed to remove any nexthop objects due
to a kill -9 or such event, sweep them if we aren't using them.
Add a proto field to handle this and remove the is_kernel bool.

Add a duplicate flag that indicates this nexthop group is only
present in our ID hashtable. It is a duplicate nexthop we received
from the kernel, therefore we cannot hash on it.

Make the idcounter globally accessible so that kernel updates
increment it as soon as we receive them, not when we handle them.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:41 -04:00
Stephen Worley
3e0372d20e zebra: Uninstall nexthops on shutdown
Add functionality to uninstall nexthops we created on shutdown.
To account for this, I added in a function for zebra_router
cleanup in a shutdown event.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:37 -04:00
Stephen Worley
d9f5b2f50f zebra: Add functionality to parse RTM_NEWNEXTHOP and RTM_DELNEXTHOP messages
Add the functionality to parse new nexthop group messages
from the kernel and insert them into the appropriate hash
tables. Parsing is done at startup between interface and
interface address lookup. Add functionality to parse
changes to nexthops we already have. Add functionality
to parse delete nexthop messages from the kernel and
remove them from our table.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:36 -04:00
Stephen Worley
a95b8020ca zebra: Add a second table for indexing by ID
The messages we get from the kernel come with ids only
for groups, so let's index with those as well. Also adding
a helper function for lookup and get with the two different
tables.

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
2019-10-25 11:13:36 -04:00
Donald Sharp
69171da262 zebra: Add hash of nexthop groups
This commit does nothing more than just create a hash structure
that we will use to track nexthop groups.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-10-25 11:13:35 -04:00
David Lamparter
c1344b54a8 zebra: use MTYPE_STATIC
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2019-06-21 08:54:25 +02:00
Donald Sharp
526052fb6d zebra: Move multicast mode to being a property of the router
The multicast mode enum was a global static in zebra_rib.c.
It does not belong there; it belongs in zebra_router, so move it.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-29 15:25:33 -04:00
Russ White
1b072ce466 Merge pull request #4269 from donaldsharp/other_tables
zebra Other tables
2019-05-16 10:11:56 -04:00
Donald Sharp
b3f2b59020 zebra: Move multipath_num into zrouter
The multipath_num variable is a property of zebra_router,
so move it there.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-14 14:15:18 -07:00
Donald Sharp
4bc1617c0c zebra: Remove unused zebra_router_score_proto
With the previous commit, the zebra_router_score_proto function
became unnecessary, so let us remove it.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-09 07:13:01 -04:00
Donald Sharp
c447ad08b2 doc, zebra: Remove "table X" command
This command is broken and has been broken since the introduction
of vrf's.  Since no-one has complained it is safe to assume that
there is no call for this specialized linux command.  Remove
from the system with extreme prejudice.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-05-06 13:42:23 -04:00
Chirag Shah
8a88f81550 zebra: avoid removing node twice from rb_tree
In the zebra terminate path, there was an attempt to remove the node
twice from the RB_TREE table. This led to a crash during
zebra shutdown; zebra_router_free_table already calls RB_REMOVE
to remove a node from the rb tree table.

    siginfo=0x7fffd9134a30, context=<optimized out>) at lib/sigevent.c:249
     rbt=<optimized out>, t=<optimized out>) at lib/openbsd-tree.c:226
     t=0x56296965ff50 <zebra_router_table_head_RB_INFO>) at lib/openbsd-tree.c:383
    rbt=rbt@entry=0x562969669bd0 <zrouter+16>, elm=elm@entry=0x56296afcf810)
    at lib/openbsd-tree.c:393
    (elm=0x56296afcf810, head=0x562969669bd0 <zrouter+16>) at zebra/zebra_router.h:46

Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
2019-04-09 12:30:15 -07:00
Donald Sharp
3f2b1b56cc zebra: zebra_router.c does not own the data plane shutdown of tables
When shutting down, the individual vrf's own the shutdown of the table
and subsequent removal of the routes from the kernel.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-03-27 16:19:28 -04:00
Donald Sharp
bd4fb6158d zebra: Upon vrf deletion, actually release this data.
When a vrf is deleted we need to tell the zebra_router that we have
finished using the tables we are keeping track of.  This will allow
us to properly cleanup the data structures associated with them.

This fixes this valgrind error found:

==8579== Invalid read of size 8
==8579==    at 0x430034: zvrf_id (zebra_vrf.h:167)
==8579==    by 0x432366: rib_process (zebra_rib.c:1580)
==8579==    by 0x432366: process_subq (zebra_rib.c:2092)
==8579==    by 0x432366: meta_queue_process (zebra_rib.c:2188)
==8579==    by 0x48C99FE: work_queue_run (workqueue.c:291)
==8579==    by 0x48C3788: thread_call (thread.c:1607)
==8579==    by 0x48A2E9E: frr_run (libfrr.c:1011)
==8579==    by 0x41316A: main (main.c:473)
==8579==  Address 0x5aeb750 is 0 bytes inside a block of size 4,424 free'd
==8579==    at 0x4839A0C: free (vg_replace_malloc.c:540)
==8579==    by 0x438914: zebra_vrf_delete (zebra_vrf.c:279)
==8579==    by 0x48C4225: vrf_delete (vrf.c:243)
==8579==    by 0x48C4225: vrf_delete (vrf.c:217)
==8579==    by 0x4151CE: netlink_vrf_change (if_netlink.c:364)
==8579==    by 0x416810: netlink_link_change (if_netlink.c:1189)
==8579==    by 0x41C1FC: netlink_parse_info (kernel_netlink.c:904)
==8579==    by 0x41C2D3: kernel_read (kernel_netlink.c:389)
==8579==    by 0x48C3788: thread_call (thread.c:1607)
==8579==    by 0x48A2E9E: frr_run (libfrr.c:1011)
==8579==    by 0x41316A: main (main.c:473)
==8579==  Block was alloc'd at
==8579==    at 0x483AB1A: calloc (vg_replace_malloc.c:762)
==8579==    by 0x48A6030: qcalloc (memory.c:110)
==8579==    by 0x4389EF: zebra_vrf_alloc (zebra_vrf.c:382)
==8579==    by 0x438A42: zebra_vrf_new (zebra_vrf.c:93)
==8579==    by 0x48C40AD: vrf_get (vrf.c:209)
==8579==    by 0x415144: netlink_vrf_change (if_netlink.c:319)
==8579==    by 0x415E90: netlink_interface (if_netlink.c:653)
==8579==    by 0x41C1FC: netlink_parse_info (kernel_netlink.c:904)
==8579==    by 0x4163E8: interface_lookup_netlink (if_netlink.c:760)
==8579==    by 0x42BB37: zebra_ns_enable (zebra_ns.c:130)
==8579==    by 0x42BC5E: zebra_ns_init (zebra_ns.c:208)
==8579==    by 0x4130F4: main (main.c:401)

This can be found by: `ip link del <VRF DEVICE NAME>` then `ip link add <NAME> type vrf table X` again and
then attempting to use the vrf.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-03-01 16:30:31 -05:00
Donald Sharp
5ec5a7160a zebra: Move packets_to_process to zrouter
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-31 09:20:46 -05:00
Donald Sharp
ea45a4e7db zebra: Move the mq data structure to zrouter
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-31 09:20:46 -05:00
Donald Sharp
489a961429 zebra: Move ribq from zebrad to zrouter
The zrouter should own this data structure and it should not
be defined in zserv.h

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-31 09:20:46 -05:00
Donald Sharp
b3d43ff471 zebra: Move rtm_table_default to zrouter
The zrouter should own this particular piece of data.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-31 09:20:46 -05:00
Donald Sharp
1485bbe755 zebra: Add code to track sequence number from zebra_router
The sequence number used should be unique and increase by 1
for netlink commands.  This will allow the code to match
up batched commands to actual requests, so that we can signal
the failure correctly back.

So start the movement and tracking of sequence numbers as
an atomic uint32_t in zebra_router.  Modify the dataplane
code to start tracking contexts from this value.

In future commits we will move more of the sequencing
data into using this value.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-24 08:02:39 -05:00
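
The idea in 1485bbe755 of a unique sequence number that increases by 1, held in an atomic `uint32_t`, can be sketched with C11 atomics. The variable and function names below are hypothetical, and the choice of relaxed memory ordering is an assumption of this example rather than something the commit message states.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical global counter; zero-initialized at program start. */
static _Atomic uint32_t toy_sequence;

/*
 * Hand out the next sequence number.  atomic_fetch_add returns the
 * previous value, so adding 1 yields the value just claimed; two
 * threads calling this concurrently never receive the same number.
 */
static uint32_t toy_next_sequence(void)
{
	return atomic_fetch_add_explicit(&toy_sequence, 1u,
					 memory_order_relaxed) + 1u;
}
```

Each caller gets a distinct value, which is the property needed to match a batched request to its reply.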
Renato Westphal
ca88bbed50 Merge pull request #3567 from donaldsharp/cleanup_route_table_creation
Route Table Handling and shows
2019-01-14 10:56:07 -02:00
Donald Sharp
df39560091 zebra: Add some small infrastructure to get the mlag code in zebra started
Add a zebra_mlag.h and a zebra_mlag.c startup/shutdown code to zebra.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-01-04 12:21:00 -05:00