Commit Graph

311 Commits

Author SHA1 Message Date
Donald Sharp
ab1c9a314e lib, zebra: Check for not being a blackhole route
In zebra_interface_nhg_reinstall zebra is checking that the
nhg is a singleton and not a blackhole nhg.  This was originally
done with checking that the nexthop is a NEXTHOP_TYPE_IFINDEX,
NEXTHOP_TYPE_IPV4_IFINDEX and NEXTHOP_TYPE_IPV6_IFINDEX.  This
was excluding NEXTHOP_TYPE_IPV4 and NEXTHOP_TYPE_IPV6.  These
were both possible to be received and maintained from the upper
level protocol for when a route is being recursively resolved.
If we have gotten to this point in zebra_interface_nhg_reinstall
the nexthop group has already been installed at least once
and we *know* that it is actually a valid nexthop.  What the
test is really trying to do is ensure that we are not reinstalling
a blackhole nexthop group( Which is not possible to even be
here by the way, but safety first! ).  So let's change
to test for that instead.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 29c1ff446e)
2024-04-23 12:10:36 +00:00
Donald Sharp
0c91f45d96 Revert "lib: register bgp link-state afi/safi"
This reverts commit 1642a68d60.

(cherry picked from commit 0dc12c9003)
2023-10-11 05:02:54 +00:00
Chirag Shah
0ddda5cd96
Merge pull request #14515 from mjstapp/fix_nhg_intf_uninstall
zebra: be more careful removing 'installed' flag from nhgs
2023-10-10 08:30:55 -07:00
Mark Stapp
0da89ac985 zebra: be more careful removing 'installed' flag from nhgs
When interface addresses change, we examine nhgs associated
with the interface in case they need to be reinstalled. As
part of that, we may need to reinstall ecmp nhgs that use the
interface being examined - but not always.

Signed-off-by: Mark Stapp <mjs@labn.net>
2023-09-29 12:08:17 -04:00
Russ White
8e755a03a3
Merge pull request #12649 from louis-6wind/bgp-link-state
bgpd: add basic support of BGP Link-State RFC7752
2023-09-26 10:07:02 -04:00
Dmytro Shytyi
f20cf1457d bgpd,lib,sharpd,zebra: srv6 introduce multiple segs/SIDs in nexthop
Append zebra and lib to use muliple SRv6 segs SIDs, and keep one
seg SID for bgpd and sharpd.

Note: bgpd and sharpd compilation relies on the lib and zebra files,
i.e if we separate this: lib or zebra or bgpd or sharpd in different
commits - this will not compile.

Signed-off-by: Dmytro Shytyi <dmytro.shytyi@6wind.com>
2023-09-20 15:07:15 +02:00
Louis Scalbert
1642a68d60 lib: register bgp link-state afi/safi
Register BGP Link-State AFI/SAFI values from RFC7752.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Signed-off-by: Olivier Dugeon <olivier.dugeon@orange.com>
2023-09-18 14:22:51 +02:00
Rajasekar Raja
27ccfd9aa6 zebra: Fix zebra crash when replacing NHE during shutdown
During replace of a NHE from upper proto in zebra_nhg_proto_add(),
 - rib_handle_nhg_replace() is invoked with old NHE where we walk all
   RNs/REs & replace the re->nhe whose address points to old NHE.
 - In this walk, if prev re->nhe refcnt is decremented to 0, we free up
   the memory which the old NHE is pointing to.
Later in zebra_nhg_proto_add(), we end up accessing this freed memory
and crash.

Logs:
1380766 2023/08/16 22:34:11.994671 ZEBRA: [WDEB1-93HCZ] zebra_nhg_decrement_ref: nhe 0x56091d890840 (70312519[2756/2762/2810]) 2 => 1
1380773 2023/08/16 22:34:11.994678 ZEBRA: [WDEB1-93HCZ] zebra_nhg_decrement_ref: nhe 0x56091d890840 (70312519[2756/2762/2810]) 1 => 0
1380777 2023/08/16 22:34:11.994844 ZEBRA: [JE46R-G2NEE] zebra_nhg_release: nhe 0x56091d890840 (70312519[2756/2762/2810])
1380778 2023/08/16 22:34:11.994849 ZEBRA: [SCDBM-4H062] zebra_nhg_free: nhe 0x56091d890840 (70312519[2756/2762/2810]), refcnt 0
1380782 2023/08/16 22:34:11.995000 ZEBRA: [SCDBM-4H062] zebra_nhg_free: nhe 0x56091d890840 (0[]), refcnt 0
1380783 2023/08/16 22:34:11.995011 ZEBRA: lib/memory.c:84: mt_count_free(): assertion (mt->n_alloc) failed

Backtrace:
0  0x00007f833f5f48eb in raise () from /lib/x86_64-linux-gnu/libc.so.6
1  0x00007f833f5df535 in abort () from /lib/x86_64-linux-gnu/libc.so.6
2  0x00007f833f636648 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
3  0x00007f833f63cd6a in ?? () from /lib/x86_64-linux-gnu/libc.so.6
4  0x00007f833f63cfb4 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
5  0x00007f833f63fbc8 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
6  0x00007f833f64172a in malloc () from /lib/x86_64-linux-gnu/libc.so.6
7  0x00007f833f6c3fd2 in backtrace_symbols () from /lib/x86_64-linux-gnu/libc.so.6
8  0x00007f833f9013fc in zlog_backtrace_sigsafe (priority=priority@entry=2, program_counter=program_counter@entry=0x7f833f5f48eb <raise+267>) at lib/log.c:222
9  0x00007f833f901593 in zlog_signal (signo=signo@entry=6, action=action@entry=0x7f833f988ee8 "aborting...", siginfo_v=siginfo_v@entry=0x7ffee1ce4a30,
    program_counter=program_counter@entry=0x7f833f5f48eb <raise+267>) at lib/log.c:154
10 0x00007f833f92dbd1 in core_handler (signo=6, siginfo=0x7ffee1ce4a30, context=<optimized out>) at lib/sigevent.c:254
11 <signal handler called>
12 0x00007f833f5f48eb in raise () from /lib/x86_64-linux-gnu/libc.so.6
13 0x00007f833f5df535 in abort () from /lib/x86_64-linux-gnu/libc.so.6
14 0x00007f833f958f96 in _zlog_assert_failed (xref=xref@entry=0x7f833f9e4080 <_xref.10705>, extra=extra@entry=0x0) at lib/zlog.c:680
15 0x00007f833f905400 in mt_count_free (mt=0x7f833fa02800 <MTYPE_NH_LABEL>, ptr=0x51) at lib/memory.c:84
16 mt_count_free (ptr=0x51, mt=0x7f833fa02800 <MTYPE_NH_LABEL>) at lib/memory.c:80
17 qfree (mt=0x7f833fa02800 <MTYPE_NH_LABEL>, ptr=0x51) at lib/memory.c:140
18 0x00007f833f90799c in nexthop_del_labels (nexthop=nexthop@entry=0x56091d776640) at lib/nexthop.c:563
19 0x00007f833f907b91 in nexthop_free (nexthop=0x56091d776640) at lib/nexthop.c:393
20 0x00007f833f907be8 in nexthops_free (nexthop=<optimized out>) at lib/nexthop.c:408
21 0x000056091c21aa76 in zebra_nhg_free_members (nhe=0x56091d890840) at zebra/zebra_nhg.c:1628
22 zebra_nhg_free (nhe=0x56091d890840) at zebra/zebra_nhg.c:1628
23 0x000056091c21bab2 in zebra_nhg_proto_add (id=<optimized out>, type=9, instance=<optimized out>, session=0, nhg=nhg@entry=0x56091d7da028, afi=afi@entry=AFI_UNSPEC)
    at zebra/zebra_nhg.c:3532
24 0x000056091c22bc4e in process_subq_nhg (lnode=0x56091d88c540) at zebra/zebra_rib.c:2689
25 process_subq (qindex=META_QUEUE_NHG, subq=0x56091d24cea0) at zebra/zebra_rib.c:3290
26 meta_queue_process (dummy=<optimized out>, data=0x56091d24d4c0) at zebra/zebra_rib.c:3343
27 0x00007f833f9492c8 in work_queue_run (thread=0x7ffee1ce55a0) at lib/workqueue.c:285
28 0x00007f833f93f60d in thread_call (thread=thread@entry=0x7ffee1ce55a0) at lib/thread.c:2008
29 0x00007f833f8f9888 in frr_run (master=0x56091d068660) at lib/libfrr.c:1223
30 0x000056091c1b8366 in main (argc=12, argv=0x7ffee1ce5988) at zebra/main.c:551

Issue: 3492162

Ticket# 3492162

Signed-off-by: Chirag Shah <chirag@nvidia.com>

Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
2023-08-31 05:40:34 +00:00
Donald Sharp
d381190a55 zebra: Remove tag from zebra_rmap_obj
The tag value in all cases was being set to the re->tag.
re is already stored, so let's just use that.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-08-11 11:21:03 -04:00
Donald Sharp
b7542d5af8 zebra: Remove instance from zebra_rmap_obj data structure
In all cases the instance is derived from the re pointer
and since the re pointer is already stored, let's just
remove it from the game and cut to the chase.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-08-11 11:15:06 -04:00
Donald Sharp
cad4d0c332 zebra: Replace source_protocol with just using re in route map object
Replace the source_protocol with just saving a pointer to the re
in the `struct zebra_rmap_obj` data structure.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-08-11 11:11:40 -04:00
Donatas Abraitis
12e4a71b3a
Merge pull request #14078 from raja-rajasekar/frr_dev1
zebra: fix nhe refcnt when frr service goes down
2023-08-01 12:19:18 +03:00
Rajasekar Raja
cd0558e629 zebra: fix nhe refcnt when frr service goes down
When frr.service is going down(restart or stop),
zebra core can be seen.

Sequence of events leading to crash:
Increments of nhe refcnt:
  - Upper level creates a new nhe(say NHE1) —> nhe->refcnt=1
  - Two RE’s (Say RE1 & RE2) associate with NHE1 —> nhe->refcnt = 3

Decrements of nhe refcnt:
  - BGP sends a zapi msg to zebra to delete NHG. —> nhe->refcnt = 2
  - RE1 is queued for delete in META-Q
  - As zebra is dissociating with its clients, zebra_nhg_score_proto() is
     invoked -> nhe->refcnt=1
  - RE2 is no more associated with the NHE1 —>nhe->refcnt=0 &
     hence NHE IS FREED
  - Now RE1 is dequeued from META-Q for processing the re delete. —> At
     this point re->nhe is pointing to freed pointer. CRASH CRASH!!!!

Fix:
  - When we iterate zebra_nhg_score_proto_entry() to delete the upper
     proto specific nhe’s,  we need to skip the additional nhe->refcnt
     decrement in case nhe->flags has NEXTHOP_GROUP_PROTO_RELEASED set.

Backtrace-1
0x00007fa8449ce8eb in raise () from /lib/x86_64-linux-gnu/libc.so.6
0x00007fa8449b9535 in abort () from /lib/x86_64-linux-gnu/libc.so.6
0x00007fa844d32f86 in _zlog_assert_failed (xref=xref@entry=0x55fa37871040 <_xref.28142>, extra=extra@entry=0x0) at lib/zlog.c:680
0x000055fa3778f770 in rib_re_nhg_free (re=0x55fa39e33770) at zebra/zebra_rib.c:2578
rib_unlink (rn=0x55fa39e27a60, re=0x55fa39e33770) at zebra/zebra_rib.c:3930
0x000055fa3778ff18 in rib_process (rn=0x55fa39e27a60) at zebra/zebra_rib.c:1439
0x000055fa37790b1c in process_subq_route (qindex=8 '\b', lnode=0x55fa39e1c1b0) at zebra/zebra_rib.c:2549
process_subq (qindex=META_QUEUE_BGP, subq=0x55fa3999c580) at zebra/zebra_rib.c:3107
meta_queue_process (dummy=<optimized out>, data=0x55fa3999c480) at zebra/zebra_rib.c:3146
0x00007fa844d232b8 in work_queue_run (thread=0x7ffffbdf6cb0) at lib/workqueue.c:285
0x00007fa844d195fd in thread_call (thread=thread@entry=0x7ffffbdf6cb0) at lib/thread.c:2008
0x00007fa844cd3888 in frr_run (master=0x55fa397b7630) at lib/libfrr.c:1223
0x000055fa3771e294 in main (argc=12, argv=0x7ffffbdf7098) at zebra/main.c:526

Backtrace-2
0x00007f125af3f535 in abort () from /lib/x86_64-linux-gnu/libc.so.6
0x00007f125b2b8f96 in _zlog_assert_failed (xref=xref@entry=0x7f125b344260 <_xref.18768>, extra=extra@entry=0x0) at lib/zlog.c:680
0x00007f125b268190 in nexthop_copy_no_recurse (copy=copy@entry=0x5606dd726f10, nexthop=nexthop@entry=0x7f125b0d7f90, rparent=<optimized out>) at lib/nexthop.c:806
0x00007f125b2681b2 in nexthop_copy (copy=0x5606dd726f10, nexthop=0x7f125b0d7f90, rparent=<optimized out>) at lib/nexthop.c:836
0x00007f125b268249 in nexthop_dup (nexthop=nexthop@entry=0x7f125b0d7f90, rparent=rparent@entry=0x0) at lib/nexthop.c:860
0x00007f125b26b67b in copy_nexthops (tnh=tnh@entry=0x5606dd9ec748, nh=<optimized out>, rparent=rparent@entry=0x0) at lib/nexthop_group.c:457
0x00007f125b26b6ba in nexthop_group_copy (to=to@entry=0x5606dd9ec748, from=from@entry=0x5606dd9ee9f8) at lib/nexthop_group.c:291
0x00005606db6ec678 in zebra_nhe_copy (orig=0x5606dd9ee9d0, id=id@entry=0) at zebra/zebra_nhg.c:431
0x00005606db6ddc63 in mpls_ftn_uninstall_all (zvrf=zvrf@entry=0x5606dd6e7cd0, afi=afi@entry=2, lsp_type=ZEBRA_LSP_NONE) at zebra/zebra_mpls.c:3410
0x00005606db6de108 in zebra_mpls_cleanup_zclient_labels (client=0x5606dd8e03b0) at ./zebra/zebra_mpls.h:471
0x00005606db73e575 in hook_call_zserv_client_close (client=0x5606dd8e03b0) at zebra/zserv.c:566
zserv_client_free (client=0x5606dd8e03b0) at zebra/zserv.c:585
zserv_close_client (client=0x5606dd8e03b0) at zebra/zserv.c:706
0x00007f125b29f60d in thread_call (thread=thread@entry=0x7ffc2a740290) at lib/thread.c:2008
0x00007f125b259888 in frr_run (master=0x5606dd3b7630) at lib/libfrr.c:1223
0x00005606db68d298 in main (argc=12, argv=0x7ffc2a740678) at zebra/main.c:534

Issue: 3492031

Ticket# 3492031

Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
2023-07-26 21:17:16 +00:00
anlan_cs
045df14427 zebra: fix nhg out of sync between zebra and kernel
PR#13413 introduces reinstall mechanism, but there is problem with the route
leak scenario.

With route leak configuration: ( `x1` and `x2` are binded to `vrf1` )
```
vrf vrf2
 ip route 75.75.75.75/32 77.75.1.75 nexthop-vrf vrf1
 ip route 75.75.75.75/32 77.75.2.75 nexthop-vrf vrf1
exit-vrf
```

Firstly, all are ok.  But after `x1` is set down and up ( The interval
between the down and up operations should be less than 180 seconds. ) ,
`x1` is lost from the nexthop group:
```
anlan# ip nexthop
id 121 group 122/123 proto zebra
id 122 via 77.75.1.75 dev x1 scope link proto zebra
id 123 via 77.75.2.75 dev x2 scope link proto zebra
anlan# ip route show table 2
75.75.75.75 nhid 121 proto 196 metric 20
        nexthop via 77.75.1.75 dev x1 weight 1
        nexthop via 77.75.2.75 dev x2 weight 1
anlan# ip link set dev x1 down
anlan# ip link set dev x1 up
anlan# ip route show table 2 <- Wrong, one nexthop lost from group
75.75.75.75 nhid 121 via 77.75.2.75 dev x2 proto 196 metric 20
anlan# ip nexthop
id 121 group 123 proto zebra
id 122 via 77.75.1.75 dev x1 scope link proto zebra
id 123 via 77.75.2.75 dev x2 scope link proto zebra
anlan# show ip route vrf vrf2 <- Still ok
VRF vrf2:
S>* 75.75.75.75/32 [1/0] via 77.75.1.75, x1 (vrf vrf1), weight 1, 00:00:05
  *                      via 77.75.2.75, x2 (vrf vrf1), weight 1, 00:00:05
```

From the impact on kernel:
The `nh->type` of `id 122` is *always* `NEXTHOP_TYPE_IPV4` in the route leak
case.  Then, `nexthop_is_ifindex_type()` introduced by commit `5bb877` always
returns `false`, so its dependents can't be reinstalled.  After `x1` is down,
there is only `id 123` in the group of `id 121`.  So, Finally `id 121` remains
unchanged after `x1` is up, i.e., `id 122` is not added to the group even it is
reinstalled itself.

From the impact on zebra:
The `show ip route vrf vrf2` is still ok because the `id`s are reused/reinstalled
successfully within 180 seconds after `x1` is down and up.  The group of `id 121`
is with old `NEXTHOP_GROUP_INSTALLED` flag, and it is still the group of `id 122`
and `id 123` as before.

In this way, kernel and zebra have become out of sync.

The `nh->type` of `id 122` should be adjusted to `NEXTHOP_TYPE_IPV4_IFINDEX`
after nexthop resolved.  This commit is for doing this to make that reinstall
mechanism work.

Signed-off-by: anlan_cs <anlan_cs@tom.com>
2023-07-24 18:00:16 +08:00
Donald Sharp
af80201876 zebra: Further handle route replace semantics
When an upper level protocol is installing a route X that needs to be
route replaced and at the same time the same or another protocol installs a
different route that depends on route X for nexthop resolution can leave
us with a state where the route is not accepted because zebra is still
really early in the route replace semantics ( route X is still on the work
Queue to be processed ) then the dependent route would not be installed.
This came up in the bgp_default_originate test cases frequently.

Further extendd the ROUTE_ENTR_ROUTE_REPLACING flag to cover this case
as well.  This has come up because the early route processing queueing
that was implemented late last year.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-07-17 10:00:32 -04:00
Mark Stapp
bb58cad150 zebra: use NHRP routes as valid in nexthop check
Treat NHRP-installed routes as valid, as if they were
CONNECTED routes, when checking candidate routes'
nexthops for validity. This allows use of NHRP by an
IGP, for example, that doesn't normally want recursive
nexthop resolution.

Signed-off-by: Mark Stapp <mjs@labn.net>
2023-07-10 16:43:53 -04:00
Russ White
1e9e82e803
Merge pull request #13396 from donaldsharp/interface_is_interface
move interface ( LINK and ADDR ) events to the dplane
2023-07-06 08:31:16 -04:00
Donald Sharp
a014450441 zebra: Add code to get/set interface to pass up from dplane
1) Add a bunch of get/set functions and associated data
structure in zebra_dplane to allow the setting and retrieval
of interface netlink data up into the master pthread.

2) Add a bit of code to breakup startup into stages.  This is
because FRR currently has a mix of dplane and non dplane interactions
and the code needs to be paused before continuing on.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-07-05 13:03:14 -04:00
anlan_cs
098519caf8 zebra: fix wrong nexthop check for kernel routes
When changing one interface's vrf, the kernel routes are wrongly kept
in old vrf.  Finally, the forwarding table in that old vrf can't forward
traffic correctly for those residual entries.

Follow these steps to make this problem happen:
( Firstly, "x1" interface of default vrf is with address of "6.6.6.6/24". )

```
anlan# ip route add 4.4.4.0/24 via 6.6.6.8 dev x1
anlan# ip link add vrf1 type vrf table 1
anlan# ip link set vrf1 up
anlan# ip link set x1 master vrf1
```

Then check `show ip route`, the route of "4.4.4.0/24" is still selected
in default vrf.

If the interface goes down, the kernel routes will be reevaluated.  Those
kernel routes with active interface of nexthop can be kept no change, it
is a fast path.  Otherwise, it enters into slow path to do careful examination
on this nexthop.

After the interface's vrf had been changed into new vrf, the down message of
this interface came.  It means the interface is not in old vrf although it
still exists during that checking, so the kernel routes should be dropped
after this nexthop matching against a default route in slow path. But, in
current code they are wrongly kept in fast path for not checking vrf.

So, modified the checking active nexthop with vrf comparision for the interface
during reevaluation.

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2023-07-02 10:30:09 +08:00
anlan_cs
caf896d6ef zebra: Remove unnecessary condition check for kernel routes
There are relaxed nexthop requirements for kernel routes because we
trust kernel routes.

Two minor changes for kernel routes:

1. `if_is_up()` is one of the necessary conditions for `if_is_operative()`.
Here, we can remove this unnecessary check for clarity.

2. Since `nexthop_active()` doesn't distinguish whether it is kernel route,
modified the corresponding comment in it.

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2023-07-02 10:30:09 +08:00
Chirag Shah
69cf016ee2 zebra:re-install dependent nhgs on interface up
Upon interface up associated singleton NHG's
dependent NHGs needs to be reinstalled as
kernel would have deleted if there is no route
referencing it.

Ticket:#3416477
Issue:3416477
Testing Done:
flap interfaces which are part of route NHG,
upon interfaces up event, NHGs are resynced
into dplane.

Signed-off-by: Chirag Shah <chirag@nvidia.com>
2023-05-05 14:37:52 -07:00
Ashwini Reddy
5bb87732f6 zebra: re-install nhg on interface up
Intermittently zebra and kernel are out of sync
when interface flaps and the add's/dels are in
same processing queue and zebra assumes no change in nexthop.
Hence we need to bring in a reinstall to kernel
of the nexthops and routes to sync their states.

Upon interface flap kernel would have deleted NHGs
associated to a interface (the one flapped),
zebra retains NHGs for 3 mins even though upper
layer protocol removes the nexthops (associated NHG).
As part of interface address add ,
re-add singleton NHGs associated to interface.

Ticket: #3173663
Issue: 3173663

Signed-off-by: Ashwini Reddy <ashred@nvidia.com>
Signed-off-by: Chirag Shah <chirag@nvidia.com>
2023-05-05 14:37:52 -07:00
Donald Sharp
e16d030c65 *: Convert THREAD_XXX macros to EVENT_XXX macros
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24 08:32:17 -04:00
Donald Sharp
5f6eaa9b96 *: Convert a bunch of thread_XX to event_XX
Convert these functions:

thread_getrusage
thread_cmd_init
thread_consumed_time
thread_timer_to_hhmmss
thread_is_scheduled
thread_ignore_late_timer

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24 08:32:17 -04:00
Donald Sharp
907a2395f4 *: Convert thread_add_XXX functions to event_add_XXX
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24 08:32:17 -04:00
Donald Sharp
e6685141aa *: Rename struct thread to struct event
Effectively a massive search and replace of
`struct thread` to `struct event`.  Using the
term `thread` gives people the thought that
this event system is a pthread when it is not

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24 08:32:17 -04:00
Donald Sharp
8383d53e43
Merge pull request #12780 from opensourcerouting/spdx-license-id
*: convert to SPDX License identifiers
2023-02-17 09:43:05 -05:00
Stephen Worley
0bbad9d19a zebra: clang-format style fixes
clang-format style fixes

Signed-off-by: Stephen Worley <sworley@nvidia.com>
2023-02-13 18:12:05 -05:00
Stephen Worley
371298399e zebra: account for non-evpn ecmp
Account for non-evpn nexthops in ecmp groups when
doing the DVNI check.

Signed-off-by: Stephen Worley <sworley@nvidia.com>
2023-02-13 18:12:05 -05:00
Stephen Worley
b991a37262 zebra: nhg resolution handler for d-vni
Add code in the nhg resolution path for determining if Downstream
VNI is in play. This is the only place in all of zebra where
we should be arbitrarily setting the ifindex/labels since
this is where new nhgs are created/destroyed. If something
changes, it must happen here.

We determine if D-VNI is being used by matching the carried
label (VNI) on the nexthop with the vrf VNI from the route.
If they do not match, we can assume this is a D-VNI labeled
nexthop.

We loop through all of the group to see if any are D-VNI. If even
one is, we must treat them all as such. Otherwise, fallback to
traditional EVPN route handling and remove all the labels.

If they are going to be treated as D-VNI we retain the labels and
verify the underlying VRF vxlan interface is a Single VXlan Device.
If it is not, we cannot use D-VNI. If it is, continue on. The VNI label
will encapped via LWTUNNEL and sent to the kernel.

Signed-off-by: Stephen Worley <sworley@nvidia.com>
2023-02-13 18:12:05 -05:00
David Lamparter
acddc0ed3c *: auto-convert to SPDX License IDs
Done with a combination of regex'ing and banging my head against a wall.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2023-02-09 14:09:11 +01:00
Donald Sharp
a98701f053 zebra: Add missing enums to switch statements
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-01-31 15:15:42 -05:00
Donald Sharp
75c87b7279 zebra: i declaration shadows other i declared
Clear up some confustion

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-01-26 11:40:33 -05:00
Siger Yang
c317d3f246
zebra: traffic control state management
This allows Zebra to manage QDISC, TCLASS, TFILTER in kernel and do cleaning
jobs when it starts up.

Signed-off-by: Siger Yang <siger.yang@outlook.com>
2022-11-22 22:35:35 +08:00
Donald Sharp
ca2b346783 *: Add ability to encode / decode resilence down zapi
At this point add abilty for the encode/decode of the
resilience down ZAPI to zebra.  Just hookup sharpd
at this point in time.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-11-04 13:34:27 -04:00
Donald Sharp
569e141113 lib, zebra: Add ability to encode/decode resilient nhg's
Add ability to read the nexthop group resilient linux
kernel data as well as write it.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-11-04 13:29:36 -04:00
Donald Sharp
8d4665aabf zebra: Fix handling of recursive routes when processing closely in time
When zebra receives routes from upper level protocols it decodes the
zapi message and places the routes on the metaQ for processing.  Suppose
we have a route A that is already installed by some routing protocol.
And there is a route B that has a nexthop that will be recursively
resolved through A.  Imagine if a route replace operation for A is
going to happen from an upper level protocol at about the same time
the route B is going to be installed into zebra.  If these routes
are received, and decoded, at about the same time there exists a
chance that the metaQ will contain both of them at the same time.
If the order of installation is [ B, A ].  B will be resolved
correctly through A and installed, A will be processed and
re-installed into the FIB.  If the nexthops have changed for
A then the owner of B should be notified about the change( and B
can do the correct action here and decide to withdraw or re-install ).
Now imagine if the order of routes received for processing on the
metaQ is [ A, B ].  A will be received, processed and sent to the
dataplane for reinstall.  B will then be pulled off the metaQ and
fail the install since A is in a `not Installed` state.

Let's loosen the restriction in nexthop resolution for B such
that if the route we are dependent on is a route replace operation
allow the resolution to suceed.  This requires zebra to track a new
route state( ROUTE_ENTRY_ROUTE_REPLACING ) that can be looked at
during nexthop resolution.  I believe this is ok because A is
a route replace operation, which could result in this:
-route install failed, in which case B should be nht'ing and
will receive the nht failure and the upper level protocol should
remove B.
-route install succeeded, no nexthop changes.  In this case
allowing the resolution for B is ok, NHT will not notify the upper
level protocol so no action is needed.
-route install succeeded, nexthops changes.  In this case
allowing the resolution for B is ok, NHT will notify the upper
level protocol and it can decide to reinstall B or not based
upon it's own algorithm.

This set of events was found by the bgp_distance_change topotest(s).
Effectively the tests were looking for the bug ( A, B order in the metaQ )
as the `correct` state.  When under very heavy load, the A, B ordering
caused A to just be installed and fully resolved in the dataplane before
B is gotten to( which is entirely possible ).

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-10-26 15:06:23 -04:00
Donald Sharp
040a0e6d26 zebra: Fix debug of filtering out prefix due to routemap
The debug for notification about a filtered prefix was
just printing the nexthop ifindex and vrf id.  Not all
nexthops have this data.  Just print out the actual nexthop

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-10-20 07:43:45 -04:00
Siger Yang
449a30edf6
zebra: add tc netlink and dplane ops
This commit implements necessary netlink encoders for traffic control
including QDISC, TCLASS and TFILTER, and adds basic dplane operations.

Co-authored-by: Stephen Worley <sworley@nvidia.com>
Signed-off-by: Siger Yang <siger.yang@outlook.com>
2022-08-11 02:32:43 +08:00
Donald Sharp
a69b10c1e6 zebra: Cleanup unguarded debug
Left over debug from earlier commits

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-08-08 09:15:22 -04:00
Donald Sharp
0a5f9773a8 zebra: zrouter.in_shutdown is an atomic variable
So let's treat the variable like it is atomic and
properly load it when we need to look at it.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-08-05 07:51:27 -04:00
Donald Sharp
d5795103bc zebra: Fix memory leaks and use after frees in nhg's on shutdown
Fixup both memory leaks as well as use after free's in nhg's
on shutdown.

This approach is effectively just iterating through all the
hash items and directly just freeing the memory instead
of handling ref counts or cross references.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-08-05 07:51:27 -04:00
Donald Sharp
34a67a7d1e zebra: When saving nhg for later stop processing
Commit 35729f38fa introduced the idea of
holding a nexthop group for a small amount of time
before removing it from the system.  When this code
was introduced the nexthop group entry was saved
and a timer started, except instead of stopping
processing at that point in time, zebra was
continuing on and deleting nexthop group entries
that that entry depended on as well.  This
should not be done until the timer pops.

Fixes: #11596
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-08-05 07:51:27 -04:00
Donald Sharp
9d1fec4c7e zebra: When deleting nexthop group entries ensure the thread is off
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-07-16 19:00:43 -04:00
Donald Sharp
f00b37e710 zebra: make rib_process_dplane_results own ctx freeing
The rib_process_dplane_results function was having each
sub function handler process the results and then
free the ctx.  Lot's of functionality that needs to remember
to free the context.  Let's just free it in the main loop.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-06-29 15:24:20 -04:00
Donald Sharp
fc3de981be zebra: Allow kernel routes to stick around better on interface state changes
Currently kernel routes on system bring up would be `auto-accepted`,
then if an interface went down all kernel and system routes would
be re-evaluated.  There exists situations where a kernel route can
exist but the interface itself is not exactly in a state that is
ready to create a connected route yet.  As such when any interface
goes down in the system all kernel/system routes would be re-evaluated
and then since that interfaces connected route is not in the table yet
the route is matching against a default route( or not at all ) and
is being dropped.

Modify the code such that kernel or system routes just look for interface
being in a good state (up or operative) and accept it.

Broken code:
eva# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure

K>* 0.0.0.0/0 [0/100] via 192.168.119.1, enp39s0, 00:05:08
K>* 1.2.3.5/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
K>* 1.2.3.6/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
K>* 1.2.3.7/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
K>* 1.2.3.8/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
K>* 1.2.3.9/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
K>* 1.2.3.10/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
K>* 1.2.3.11/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
K>* 1.2.3.12/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
K>* 1.2.3.13/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
K>* 1.2.3.14/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
K>* 1.2.3.16/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
K>* 1.2.3.17/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
C>* 4.5.6.99/32 is directly connected, dummy9, 00:05:08
K>* 4.9.10.11/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:05:08
K>* 10.11.12.13/32 [0/0] via 192.168.119.1, enp39s0, 00:05:08
C>* 192.168.10.0/24 is directly connected, dummy99, 00:05:08
C>* 192.168.119.0/24 is directly connected, enp39s0, 00:05:08
<shutdown a non-related interface>
eva# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure

K>* 0.0.0.0/0 [0/100] via 192.168.119.1, enp39s0, 00:05:28
C>* 4.5.6.99/32 is directly connected, dummy9, 00:05:28
K>* 10.11.12.13/32 [0/0] via 192.168.119.1, enp39s0, 00:05:28
C>* 192.168.10.0/24 is directly connected, dummy99, 00:05:28
C>* 192.168.119.0/24 is directly connected, enp39s0, 00:05:28

Working code:
eva# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure

K>* 0.0.0.0/0 [0/100] via 192.168.119.1, enp39s0, 00:00:04
K>* 1.2.3.5/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
K>* 1.2.3.6/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
K>* 1.2.3.7/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
K>* 1.2.3.8/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
K>* 1.2.3.9/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
K>* 1.2.3.10/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
K>* 1.2.3.11/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
K>* 1.2.3.12/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
K>* 1.2.3.13/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
K>* 1.2.3.14/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
K>* 1.2.3.16/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
K>* 1.2.3.17/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
C>* 4.5.6.99/32 is directly connected, dummy9, 00:00:04
K>* 4.9.10.11/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:04
K>* 10.11.12.13/32 [0/0] via 192.168.119.1, enp39s0, 00:00:04
C>* 192.168.10.0/24 is directly connected, dummy99, 00:00:04
C>* 192.168.119.0/24 is directly connected, enp39s0, 00:00:04
<shutdown a non-related interface>
eva# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure

K>* 0.0.0.0/0 [0/100] via 192.168.119.1, enp39s0, 00:00:15
K>* 1.2.3.5/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
K>* 1.2.3.6/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
K>* 1.2.3.7/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
K>* 1.2.3.8/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
K>* 1.2.3.9/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
K>* 1.2.3.10/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
K>* 1.2.3.11/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
K>* 1.2.3.12/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
K>* 1.2.3.13/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
K>* 1.2.3.14/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
K>* 1.2.3.16/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
K>* 1.2.3.17/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
C>* 4.5.6.99/32 is directly connected, dummy9, 00:00:15
K>* 4.9.10.11/32 [0/0] via 172.22.0.44, br-23e378ed7fd2 linkdown, 00:00:15
K>* 10.11.12.13/32 [0/0] via 192.168.119.1, enp39s0, 00:00:15
C>* 192.168.10.0/24 is directly connected, dummy99, 00:00:15
C>* 192.168.119.0/24 is directly connected, enp39s0, 00:00:15
eva#

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-06-23 12:22:30 -04:00
Donald Sharp
c9af62e314 zebra: Add a configurable knob zebra nexthop-group keep (1-3600)
Allow end operator to set how long a nexthop-group is kept around
in the system after it is no-longer being used.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-06-16 14:47:19 -04:00
Donald Sharp
35729f38fa zebra: Add a timer to nexthop group deletion
Before deleting nexthop groups, that are installed,
from the system, start a timer and hold the nexthop
group for that time.

Suppose you have this scenario

a) create a static route with 1 x ecmp
      creates a nhg with 1 x ecmp
b) create a static route with 2 x ecmp
      creates a nhg with 2 x ecmp
      deletes a's nhg
c) create a static route with 3 x ecmp
      creates a nhg with 3 x ecmp
      deletes b's nhg
d) create a different route with 1 x ecmp
      creates another 1 x ecmp ( since a's ecmp was deleted )
e) create a different route with 2 x ecmp
      creates another 2 x ecmp ( since b's ecmp was deleted )

If you don't delete the nhg, start a timer, the nhg's used
in steps a and b can be reused for steps d and e.  This reduces
overhead work with zebra <-> kernel interactions and improves
the speed of the system.

So modify the code to note that an installed nexthop group should
be kept around a bit and hopefully reused.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-06-16 14:47:19 -04:00
Donald Sharp
382858d015 zebra: Move where zebra marks a nhg as uninstalled in fib
Currently the code is marking the nhg as uninstalled but not
causing that to flood up to the dependent nhgs:

nhg 3 is a group of 1/2
   1 -> interface A
   2 -> interface B

Suppose A goes down, old code would mark nhg 1 as !VALID and !INSTALLED.
Suppose B then goes down, old code would mark nhg 2 as !VALID and !INSTALLED
But would not mark nhg 3 as !VALID and !INSTALLED (sort of assuming that
it would just be cleaned up by NHG refcounts ).  I would prefer that
the code is pedantic about nhg 3 actually being removed from the system.

This code moves the setting of !INSTALLED into zebra_nhg.c where it
really belongs.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-06-16 14:47:19 -04:00
Donald Sharp
68d188be7a zebra: Convert debugs to use %pNG
The nexthop group debugs were using %u to just display the id.
I found this very hard to figure out what was going on.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-06-14 20:25:56 -04:00