Commit Graph

5476 Commits

Author SHA1 Message Date
Mark Stapp
59b8965aa6
Merge pull request #13861 from opensourcerouting/fix/memory_leak_zserv
zebra: Free Zebra client resources
2023-06-28 08:18:11 -04:00
Donatas Abraitis
97072d144e zebra: Free Zebra client resources
Memory leaks started flowing:

```
AddressSanitizer Topotests Part 0:  15 KB -> 283 KB
AddressSanitizer Topotests Part 1:  1 KB -> 495 KB
AddressSanitizer Topotests Part 2:  13 KB -> 478 KB
AddressSanitizer Topotests Part 3:  39 KB -> 213 KB
AddressSanitizer Topotests Part 4:  30 KB -> 836 KB
AddressSanitizer Topotests Part 5:  0 bytes -> 356 KB
AddressSanitizer Topotests Part 6:  86 KB -> 783 KB
AddressSanitizer Topotests Part 7:  0 bytes -> 354 KB
AddressSanitizer Topotests Part 8:  0 bytes -> 62 KB
AddressSanitizer Topotests Part 9:  408 KB -> 518 KB
```

```
Direct leak of 3584 byte(s) in 1 object(s) allocated from:
    #0 0x7f1957b02d28 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
    #1 0x559895c55df0 in qcalloc lib/memory.c:105
    #2 0x559895bc1cdf in zserv_client_create zebra/zserv.c:743
    #3 0x559895bc1cdf in zserv_accept zebra/zserv.c:880
    #4 0x559895cf3438 in event_call lib/event.c:1995
    #5 0x559895c3901c in frr_run lib/libfrr.c:1213
    #6 0x559895a698f1 in main zebra/main.c:472
    #7 0x7f195635ec86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)
```

Fixes b20acd0 ("bgpd: Use synchronous way to get labels from Zebra")

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-06-27 22:48:39 +03:00
Russ White
1f08a055a8
Merge pull request #13852 from mjstapp/fix_opq_cov_msg
zebra: clean up coverity warning in opaque api
2023-06-27 11:28:31 -04:00
Chirag Shah
a7d77ee58b zebra: fix evpn rmac nh list cmp function
EVPN RMAC (Router MAC) nexthop list compare
function needs to return all values so
the list element can be compared and added/deleted
properly.
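
As a minimal sketch of why the compare function has to return all three outcomes (negative, zero, positive) rather than only equal or not equal, the following three-way comparator can be used. The struct and function names are hypothetical and are not taken from the zebra source; the address is assumed to be a host-order IPv4 value for simplicity.

```c
#include <stdint.h>

/* Hypothetical nexthop list element: an IPv4 address in host byte order. */
struct demo_rmac_nh {
	uint32_t addr;
};

/*
 * Three-way compare. Returning <0, 0 and >0 lets a sorted list both
 * place new elements and find existing ones for deletion.
 */
int demo_rmac_nh_cmp(const struct demo_rmac_nh *a, const struct demo_rmac_nh *b)
{
	if (a->addr < b->addr)
		return -1;
	if (a->addr > b->addr)
		return 1;
	return 0;
}
```

A comparator that only ever reports "equal" or "greater" cannot order two distinct elements consistently, which is the kind of problem the commit message describes.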

Ticket:#3486989
Testing Done:
Originate EVPN Type-5 route with PIP IP and MAC as remote
nexthops.
Change the PIP IP address which triggers nexthop change.

Before fix:
When PIP IP changes RMAC is deleted from remote VTEPs.

TORS1# show evpn next-hops vni 4001 | include 00:02:00:00:00:2d
27.0.0.11       00:02:00:00:00:2d
TORS1# show evpn rmac vni 4001 | include 00:02:00:00:00:2d
00:02:00:00:00:2d 27.0.0.11

----- Remote VTEP change nexthop IP to 172.16.16.16 -----

TORS1# show evpn next-hops vni 4001 | include 00:02:00:00:00:2d
172.16.16.16    00:02:00:00:00:2d
TORS1# show evpn rmac vni 4001 | include 00:02:00:00:00:2d
TORS1#

After fix:
RMAC is retained as its nexthop list is not empty,
thus it is not deleted from remote VTEPs.

TORS1# show evpn rmac vni 4001 | include 00:02:00:00:00:2d
00:02:00:00:00:2d 172.16.16.16

Log:
2023/06/27 00:50:36.833474 ZEBRA: [XREH0-ZYMH6] L3VNI 4001 Remote VTEP
change(27.0.0.11 -> 172.16.16.16) for RMAC 00:02:00:00:00:2d

Signed-off-by: Chirag Shah <chirag@nvidia.com>
2023-06-26 17:59:16 -07:00
Donald Sharp
161972c9fe *: Rearrange vrf_bitmap_X api to reduce memory footprint
When running all daemons with config for most of them, FRR has
sharpd@janelle:~/frr$ vtysh -c "show debug hashtable"  | grep "VRF BIT HASH" | wc -l
3570

3570 hashes for bitmaps associated with the vrf.  This is a very
large number of hashes.  Let's do two things:

a) Reduce the initial size of each created hash to 2
instead of 32.

b) Delay generation of the hash *until* a set operation happens,
since the absence of a hash directly implies an unset value if/when checked.

This reduces the number of hashes to 61 in my setup for normal
operation.
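
The two changes above (small initial size, creation delayed until the first set) can be sketched roughly as follows. This is an illustration only: the names are hypothetical, and a growable array stands in for the hash that the real vrf_bitmap code uses.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-VRF bitmap; the real code uses a hash. */
struct demo_lazy_bitmap {
	uint32_t *ids; /* NULL until the first set operation */
	size_t count;
	size_t cap;
};

/* No backing store means "nothing set", so a check never allocates. */
bool demo_lazy_bitmap_check(const struct demo_lazy_bitmap *bm, uint32_t vrf_id)
{
	if (bm->ids == NULL)
		return false;
	for (size_t i = 0; i < bm->count; i++)
		if (bm->ids[i] == vrf_id)
			return true;
	return false;
}

/* Allocate on first set, starting with 2 slots and doubling when full. */
bool demo_lazy_bitmap_set(struct demo_lazy_bitmap *bm, uint32_t vrf_id)
{
	if (demo_lazy_bitmap_check(bm, vrf_id))
		return true;
	if (bm->count == bm->cap) {
		size_t ncap = bm->cap ? bm->cap * 2 : 2;
		uint32_t *n = realloc(bm->ids, ncap * sizeof(*n));

		if (n == NULL)
			return false;
		bm->ids = n;
		bm->cap = ncap;
	}
	bm->ids[bm->count++] = vrf_id;
	return true;
}
```

A zero-initialized `struct demo_lazy_bitmap` is a valid empty bitmap, so daemons that never set a bit never pay for an allocation.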

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-06-26 14:59:21 -04:00
Mark Stapp
0ee56dd332 zebra: clean up coverity warning in opaque api
Seems a bit fussy of coverity, but ... don't NULL a variable
unnecessarily.

Signed-off-by: Mark Stapp <mjs@labn.net>
2023-06-26 13:19:23 -04:00
Mark Stapp
de1a9ce0a7 zebra: support notifications for opaque ZAPI messages
Allow zapi clients to register to be notified when a server
for an opaque message type is present. Zebra maintains these
notification registrations in the same data structures that it
uses for opaque message handling.

Signed-off-by: Mark Stapp <mjs@labn.net>
2023-06-23 08:57:37 -04:00
Mark Stapp
ef8e3ac02c lib, zebra: include source client zapi info in opaque messages
Include the sending zapi client info (proto, instance, and
session id) in each opaque zapi message. Add opaque 'init'
apis for clients who want to encode their opaque data inline,
into the zclient's internal stream buffer. Use these init apis
in the TE/link-state lib code, instead of hand-coding the
zapi opaque header info.

Signed-off-by: Mark Stapp <mjs@labn.net>
2023-06-23 08:27:42 -04:00
Donatas Abraitis
3cbc7150bb
Merge pull request #13545 from idryzhov/remove-bond-slave
zebra: remove ZEBRA_IF_BOND_SLAVE interface type
2023-06-23 11:01:19 +03:00
Donatas Abraitis
52dde8747b zebra: Ignore non GR-aware zclient handling for BGP
This is for synchronous client (label/table manager) - aka session_id == 1.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-06-20 20:50:40 +03:00
Donatas Abraitis
20c2c8787a zebra: Show session id when printing an error when the client disconnects
Before:

```
2023/06/18 22:00:42 ZEBRA: [VXKFG-8SJRV][EC 4043309121] Client 'bgp' encountered an error and is shutting down.
2023/06/18 22:00:42 ZEBRA: [VXKFG-8SJRV][EC 4043309121] Client 'bgp' encountered an error and is shutting down.
```

After:

```
2023/06/18 22:06:44 ZEBRA: [N5M5Y-J5BPG][EC 4043309121] Client 'bgp' (session id 0) encountered an error and is shutting down.
2023/06/18 22:06:44 ZEBRA: [N5M5Y-J5BPG][EC 4043309121] Client 'bgp' (session id 1) encountered an error and is shutting down.
```

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-06-20 20:50:40 +03:00
Russ White
40502902f4
Merge pull request #13394 from mjstapp/fix_zebra_mpls_config
zebra: clarify interface-level mpls config
2023-06-20 09:10:53 -04:00
Donald Sharp
f89d090230
Merge pull request #13755 from LabNConsulting/ziemba/zebra-dplane-priority
zebra: bugfix dplane priority sorting
2023-06-13 10:36:57 -04:00
Mark Stapp
a32d40a676 zebra: clarify interface-level mpls config
We have both interface-level configuration to enable mpls,
and runtime mpls status. They need to be distinct.

Signed-off-by: Mark Stapp <mjs@labn.net>
2023-06-12 16:41:27 -04:00
Mark Stapp
4112baec9f pbrd, zebra: fix zapi and netlink rule encoding
In pbrd, don't encode a rule without a table. There are cases
where the zapi encoding was incorrect because the 4-octet
table id was missing. In zebra, mask off the ECN bits in the
TOS byte when encoding an iprule to match netlink's
expectation.
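
A minimal sketch of masking off the ECN bits, assuming they are the two least-significant bits of the TOS byte (as in RFC 3168). The macro and function names are hypothetical; the actual zebra change may use a different constant or helper.

```c
#include <stdint.h>

/* The two least-significant bits of the IPv4 TOS byte carry ECN. */
#define DEMO_ECN_MASK 0x03u

/* Return the TOS byte with the ECN bits cleared; DSCP bits are untouched. */
uint8_t demo_tos_without_ecn(uint8_t tos)
{
	return (uint8_t)(tos & ~DEMO_ECN_MASK);
}
```

With this helper, a TOS value of `0xFF` becomes `0xFC`, while a value such as `0xB8` that has no ECN bits set is returned unchanged.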

Signed-off-by: Mark Stapp <mjs@labn.net>
2023-06-12 16:39:26 -04:00
G. Paul Ziemba
9e5c9e6d65 zebra: bugfix dplane priority sorting
Signed-off-by: G. Paul Ziemba <paulz@labn.net>
2023-06-09 06:58:20 -07:00
Donald Sharp
977d7e24ff zebra: Prevent crash because nl is NULL on shutdown
When shutting down, the main pthread was first closing
the sockets associated with the dplane pthread and
then telling the dplane pthread to shut down at a later point
in time.  This caused the dplane to crash because the nl
data had already been freed.  Change the shutdown order
to stop the dplane pthread *and* then close the sockets.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-06-08 12:03:49 -04:00
Donatas Abraitis
29f6fb04d8
Merge pull request #13649 from donaldsharp/unlock_the_node_or_else
zebra: Unlock the route node when sending route notifications
2023-06-06 08:52:40 +03:00
Donald Sharp
3ddf7680fd zebra: Consolidate the stream_failure section with normal return
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-06-01 08:58:16 -04:00
Donald Sharp
c2cf522347 zebra: No need to set msg to NULL
The msg value is always reset to something new before it is used inside
the mutex.  No need to set it to NULL.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-06-01 08:54:25 -04:00
Donald Sharp
82c6e4fea5 zebra: Unlock the route node when sending route notifications
When using a context to send route notifications to upper
level protocols, the code was using a locking function to
get the route node.  There is no need for this to be locked;
as such, FRR should free it up.
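
The lock/unlock pairing described above can be sketched with a bare reference count. The types and function names here are hypothetical and only illustrate the idea that a reference taken by a lookup is released before the notification path returns.

```c
#include <assert.h>

/* Hypothetical route node carrying only a reference count. */
struct demo_rnode {
	unsigned int lock;
};

/* Lookup that takes a reference on the node it returns. */
struct demo_rnode *demo_rnode_get(struct demo_rnode *rn)
{
	rn->lock++;
	return rn;
}

/* Drop one reference; the count must never underflow. */
void demo_rnode_unlock(struct demo_rnode *rn)
{
	assert(rn->lock > 0);
	rn->lock--;
}

/* Notify path: the reference taken by the lookup is released on exit. */
void demo_send_notification(struct demo_rnode *table_entry)
{
	struct demo_rnode *rn = demo_rnode_get(table_entry);

	/* ... build and send the notification using rn ... */

	demo_rnode_unlock(rn);
}
```

After `demo_send_notification()` returns, the node's count is the same as before the call, so the node can still be freed once its remaining owners let go.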

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-06-01 07:35:12 -04:00
Donatas Abraitis
147c7a2de3
Merge pull request #13631 from donaldsharp/fix_some_ping_issues
various issues
2023-05-30 21:26:24 +03:00
Christian Hopps
ff6b14a658 zebra: use ifindex vs ifp to avoid use-after-free on shutdown
Signed-off-by: Christian Hopps <chopps@labn.net>
2023-05-30 04:09:29 -04:00
Christian Hopps
8cfe36bc7e zebra: avoid unneeded vxlan work on shutdown
Signed-off-by: Christian Hopps <chopps@labn.net>
2023-05-30 04:09:29 -04:00
Donald Sharp
46d725f76b lib, zebra: Ensure that the ifp->node exists
On removal, ensure that the ifp->node is set to a null
pointer so that FRR does not use data after freed.
In addition ensure that the ifp->node exists before
attempting to free it.
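
A rough sketch of the pattern described above: check the pointer before releasing it, then clear it so later code cannot use freed data. The struct layout and function name are hypothetical, and plain `free()` stands in for whatever release function the real code calls.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the interface and its table node. */
struct demo_node {
	int unused;
};

struct demo_ifp {
	struct demo_node *node;
};

/* Release the node only if it exists, then clear the stale pointer. */
void demo_if_unlink(struct demo_ifp *ifp)
{
	if (ifp->node == NULL)
		return;

	free(ifp->node);
	ifp->node = NULL;
}
```

Calling `demo_if_unlink()` twice on the same interface is harmless, because the second call sees a null pointer and returns early.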

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-05-28 10:13:16 -04:00
Russ White
7b7da41def
Merge pull request #13556 from donaldsharp/token_to_desc
memory description shortening
2023-05-23 08:21:51 -04:00
Donald Sharp
d7c9666e06 zebra: Fix paths that have already de-refed ctx
There is no path in some functions where the ctx
has not already been de-refed.  As such no need
to test for its existence.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-05-22 10:52:54 -04:00
Igor Ryzhov
9ce24c31bf zebra: remove ZEBRA_IF_BOND_SLAVE interface type
It is never actually used in the code.

Closes #13532.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2023-05-21 23:37:39 +03:00
Donald Sharp
a01f310709 zebra: Make memory description string smaller to fit in vty space
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-05-19 21:31:35 -04:00
Donald Sharp
5ec001aa53 zebra: On shutdown stop hook calls for fpm rmac updates
When shutting down zebra, the hook for the rmac update was
not being unregistered.  As such it would be possible
to get into a condition where more rmacs are being
added to the queue for handling in the future after we
are told to shutdown.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-05-19 10:02:19 -04:00
Donald Sharp
540334324c zebra: Properly handle zfpm_g->t_conn_down in zebra_fpm.c
The t_conn_down pointer was being set to NULL when it already
was.  The t_conn_down pointer was being dropped (leaving
a thread possibly running in the background), which could
cause problems on shutdown.  And finally, when shutting down,
the t_conn_down event was not being stopped at all.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-05-19 10:02:19 -04:00
Donald Sharp
0eaa6523f6 zebra: Do not allow old FPM to access freed memory after shutdown
On shutdown, the old FPM queues up dests to be sent to
the FPM listener.  This is done through the rib_shutdown
hook, which is called when the table that the routes are
stored in is being deleted.  This dest has pointers
to the rnode.  The rnode has pointers to the table it
is associated with as well as the table->info pointer for
the zebra data associated with this table.

The FPM after this attempts to tell this to its listener
via events.  Unfortunately the zvrf, table_id and nl_pid
were being grabbed from memory that had been freed!  Since
all this can be grabbed from memory that has not been freed
on shutdown, let's switch over to using that instead of freed
memory for gathering data.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-05-19 10:02:19 -04:00
Carmine Scarpitta
eb68d4a04c zebra: Fix build error when --disable-bfdd
When FRR is built with the option `--disable-bfdd`, the build process
fails with the following error:

```
zebra/zebra_ptm.c: In function ‘zebra_ptm_init’:
zebra/zebra_ptm.c:119:35: error: ‘FRR_PTM_NAME’ undeclared (first use in this function)
  119 |  snprintf(buf, sizeof(buf), "%s", FRR_PTM_NAME);
      |                                   ^~~~~~~~~~~~
zebra/zebra_ptm.c:119:35: note: each undeclared identifier is reported only once for each function it appears in
make[1]: *** [Makefile:10520: zebra/zebra_ptm.o] Error 1
```

The reason is that `FRR_PTM_NAME` is defined in `version.h` which is not
imported.

This commit adds the missing import.

Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
2023-05-17 18:47:23 +02:00
Mark Stapp
e8224402cd
Merge pull request #13444 from donaldsharp/fix_dplane_provider_counter
zebra: Fix dp_out_queued counter to actually reflect real life
2023-05-12 14:54:13 -04:00
Donald Sharp
995d810d08 zebra: Fix dp_out_queued counter to actually reflect real life
The prov->dp_out_queued counter was never being decremented
when a ctx was pulled off of the list.  Let's change it to
accurately reflect real life.
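
The counter fix can be illustrated with a simple queue whose depth counter is updated on both enqueue and dequeue. This is a sketch with hypothetical names and plain integers; the real dplane provider code uses its own list type and may use atomic counters.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical context and provider with an outbound queue. */
struct demo_ctx {
	struct demo_ctx *next;
};

struct demo_provider {
	struct demo_ctx *out_head;
	struct demo_ctx *out_tail;
	uint32_t out_queued;     /* current queue depth */
	uint32_t out_queued_max; /* high-water mark */
};

/* Append a context and bump the depth and high-water mark. */
void demo_enqueue_out(struct demo_provider *p, struct demo_ctx *ctx)
{
	ctx->next = NULL;
	if (p->out_tail)
		p->out_tail->next = ctx;
	else
		p->out_head = ctx;
	p->out_tail = ctx;

	p->out_queued++;
	if (p->out_queued > p->out_queued_max)
		p->out_queued_max = p->out_queued;
}

/* Pop a context; decrementing here keeps the depth accurate. */
struct demo_ctx *demo_dequeue_out(struct demo_provider *p)
{
	struct demo_ctx *ctx = p->out_head;

	if (ctx == NULL)
		return NULL;
	p->out_head = ctx->next;
	if (p->out_head == NULL)
		p->out_tail = NULL;

	p->out_queued--;
	return ctx;
}
```

Without the decrement in `demo_dequeue_out()`, the depth would only ever grow and the high-water mark would track the total number of contexts processed, which matches the "Broken" output shown below.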

Broken:
janelle.pinkbelly.org# show zebra dplane providers detailed
Zebra dataplane providers:
Kernel (1): in: 330872, q: 0, q_max: 100, out: 330872, q: 330872, q_max: 330872
janelle.pinkbelly.org#

Fixed:
sharpd@janelle:/tmp/topotests$ vtysh -c "show zebra dplane providers detailed"
Zebra dataplane providers:
Kernel (1): in: 221495, q: 0, q_max: 100, out: 221495, q: 0, q_max: 100
sharpd@janelle:/tmp/topotests$

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-05-12 11:34:56 -04:00
Philippe Guibert
fab64b600a zebra: mpls nexthop entry displays also interface when available
The 'show mpls table json' command displays the outgoing interface
name only when the nexthop type is either NEXTHOP_TYPE_IFINDEX or
NEXTHOP_TYPE_IPV6_IFINDEX. Add the interface name for the nexthop
type NEXTHOP_TYPE_IPV4_IFINDEX.

Fixes: b78b820d46d6 ("MPLS: Display enhancements and JSON support")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-05-09 21:00:57 +02:00
Philippe Guibert
7bae48960e zebra: handle nexthop vrf_id in ZEBRA_MPLS_LABELS messages
This commit addresses the case where a service wants to install
an LSP entry to a next-hop located in a VRF instance. The incoming
MPLS packet is on the namespace and has to be directed to a nexthop
located behind an interface that sits in a specific VRF instance.
The below iproute command can illustrate:

  > ip link add vrf1 type vrf table 10
  > ip link set dev vrf1 up
  > ip link set dev eth0 master vrf1
  > ip a a 192.0.2.1/24 dev eth0
  > ip -f mpls route add 105 via inet 192.0.2.45 dev eth0

If a service uses the ZEBRA_MPLS_LABELS messages, then the LSP
message is ignored: from zebra perspective, the MPLS entries are
visible via the 'show mpls table' command, but no LSP entry is
installed in the kernel.

The issue is in the nhlfe_nexthop_active_ipv[4/6] function: the
outgoing interface mentioned in the nexthop is searched in the
main VRF, whereas the interface is in a separate VRF. The interface
is not found, and the nhlfe to install is considered not active.

To address this issue, reuse the incoming vrf_id parameter transmitted
in the nexthop structure from the ZEBRA_MPLS_LABELS message. When
creating an NHLFE entry, the vrf_id is used instead of the DEFAULT_VRF.
And the nhlfe entry can be considered as active.

One alternate solution to reuse the vrf_id parameter in the mpls network
context would be to modify the search function in nhlfe_nexthop_active..()
function: looking for an existing ifindex in the zns. However, this
solution may not fit later when netns backend would be used.

Note that some changes have not been done yet and are considered
sufficient for now:
- The 'nhlfe_find' API: the assumption is done that only the linux vrf
backend is used for now.

- The 'mpls_lsp_install()' API: It is currently used by the CLI command
which does not handle the interface parameter, and the SRTE service, which
always sends LSPs towards a nexthop located in the VRF_DEFAULT.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-05-09 21:00:57 +02:00
Philippe Guibert
bd21ba79aa zebra: accept LSP entries with an mpls-less outgoing interface
The ZEBRA_MPLS_LABELS_[ADD/DELETE/REPLACE] messages may change an
LSP entry based on an incoming MPLS entry, followed by a given
next-hop.
Having a next hop with no label information inside is rejected
by the zebra layer. As illustration, the following ZAPI message
would be rejected, because the next hop does not contain any
label information.

  > ip -f mpls route add 105 via inet 192.0.2.45

At the same time, it is desirable to support such a
configuration.

An attempt was made to configure the next-hop with an
implicit-null label, but the message is rejected by the kernel:

  > ip -f mpls route add 104 as 3 via inet 192.0.2.45
  > Error: Implicit NULL Label (3) can not be used in encapsulation.

The commit proposes to accept ZEBRA_MPLS_LABELS_[XX] messages with
a nexthop that does not contain any label information.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-05-09 21:00:57 +02:00
Donatas Abraitis
bae305fc9b
Merge pull request #13445 from donaldsharp/lua_scripting_mem_leak
zebra: Reduce creation and fix memory leak of frrscripting pointers
2023-05-09 15:38:06 +03:00
Mark Stapp
eb4c026d13
Merge pull request #13413 from chiragshah6/fdev2
zebra: re-install NHG on interface up
2023-05-08 14:36:07 -04:00
Donald Sharp
3e7b3ed1dc zebra: dplane_gre_set could return while leaking ctx
Prevent this function from leaking the ctx memory.
Also properly record that something has gone wrong.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-05-05 19:11:02 -04:00
Donald Sharp
6636fc44c8 zebra: Dplane ctx allocation cannot fail
Having tests for memory allocation success makes no sense
given what happens when frr fails to allocate memory.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-05-05 19:10:59 -04:00
Chirag Shah
69cf016ee2 zebra:re-install dependent nhgs on interface up
Upon interface up, the dependent NHGs of the associated
singleton NHGs need to be reinstalled, as the kernel
would have deleted them if there is no route
referencing them.

Ticket:#3416477
Issue:3416477
Testing Done:
flap interfaces which are part of route NHG,
upon interfaces up event, NHGs are resynced
into dplane.

Signed-off-by: Chirag Shah <chirag@nvidia.com>
2023-05-05 14:37:52 -07:00
Ashwini Reddy
5bb87732f6 zebra: re-install nhg on interface up
Intermittently, zebra and the kernel are out of sync
when an interface flaps and the adds/dels are in the
same processing queue and zebra assumes no change in nexthop.
Hence we need to reinstall the nexthops and routes
in the kernel to sync their states.

Upon interface flap, the kernel would have deleted the NHGs
associated with the interface (the one that flapped), while
zebra retains NHGs for 3 mins even though the upper
layer protocol removes the nexthops (associated NHG).
As part of interface address add,
re-add the singleton NHGs associated with the interface.

Ticket: #3173663
Issue: 3173663

Signed-off-by: Ashwini Reddy <ashred@nvidia.com>
Signed-off-by: Chirag Shah <chirag@nvidia.com>
2023-05-05 14:37:52 -07:00
Donald Sharp
d8be139972 zebra: Reduce creation and fix memory leak of frrscripting pointers
There are two issues being addressed:

a) The ZEBRA_ON_RIB_PROCESS_HOOK_CALL script point
was creating a fs pointer per dplane ctx in
rib_process_dplane_results().

b) The fs pointer was not being deleted and directly
leaked.

For (a) Move the creation of the fs to outside
the do while loop.

For (b) At function end ensure that the pointer
is actually deleted.
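
The shape of fixes (a) and (b) can be sketched as: create the scripting object once before the loop, and delete it at function end. All names below are hypothetical placeholders and do not correspond to the real frrscript API; a live-object counter is included only so the absence of a leak can be observed.

```c
#include <stdlib.h>

/* Hypothetical script handle; counts how many hook calls it served. */
struct demo_script {
	int calls;
};

/* Number of script handles currently allocated (for leak checking). */
int demo_live_scripts;

struct demo_script *demo_script_new(void)
{
	struct demo_script *fs = calloc(1, sizeof(*fs));

	if (fs)
		demo_live_scripts++;
	return fs;
}

void demo_script_delete(struct demo_script *fs)
{
	if (fs == NULL)
		return;
	free(fs);
	demo_live_scripts--;
}

/* Process n results with one shared handle instead of one per result. */
int demo_process_results(int n_results)
{
	struct demo_script *fs = demo_script_new(); /* (a) created once */
	int handled = 0;

	if (fs == NULL)
		return 0;

	for (int i = 0; i < n_results; i++) {
		fs->calls++; /* stand-in for invoking the script hook */
		handled++;
	}

	demo_script_delete(fs); /* (b) always released at function end */
	return handled;
}
```

However many results are processed, exactly one handle is allocated and `demo_live_scripts` is back to zero when the function returns.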

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-05-05 12:24:02 -04:00
Donatas Abraitis
786e2b8bdb Revert "MPLS allocation mode per next hop"
Broken tests, let's revert now.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-05-03 13:52:46 +03:00
Donatas Abraitis
99a1ab0b21
Merge pull request #12646 from pguibert6WIND/mpls_alloc_per_nh
MPLS allocation mode per next hop
2023-05-02 18:36:45 +03:00
Russ White
1998805bd5
Merge pull request #13403 from anlancs/fix/zebra-missing-vrf-flag
zebra: Fix missing VRF flag
2023-05-02 10:47:41 -04:00
Russ White
856e85e910
Merge pull request #13270 from pguibert6WIND/better_srv6_output_seg6local
zebra: display seg6local only when specified
2023-05-02 10:30:15 -04:00
anlan_cs
41414503e4 zebra: Fix missing VRF flag
1. No configuration in FRR, and `ip link add vrf1 type vrf ...`.
Currently, everything is ok.

2.  `ip link del vrf1`.
`zebra` will wrongly/redundantly notify clients to add "vrf1" as a normal
interface after correct deletion of "vrf1".

```
ZEBRA: [KMXEB-K771Y] netlink_parse_info: netlink-listen (NS 0) type RTM_DELLINK(17), len=588, seq=0, pid=0
ZEBRA: [TDJW2-B9KJW] RTM_DELLINK for vrf1(93) <- Wrongly as normal interface, not vrf
ZEBRA: [WEEJX-M4HA0] interface vrf1 vrf vrf1(93) index 93 is now inactive.
ZEBRA: [NXAHW-290AC] MESSAGE: ZEBRA_INTERFACE_DELETE vrf1 vrf vrf1(93)
ZEBRA: [H97XA-ABB3A] MESSAGE: ZEBRA_INTERFACE_VRF_UPDATE/DEL vrf1 VRF Id 93 -> 0
ZEBRA: [HP8PZ-7D6D2] MESSAGE: ZEBRA_INTERFACE_VRF_UPDATE/ADD vrf1 VRF Id 93 -> 0 <-
ZEBRA: [Y6R2N-EF2N4] interface vrf1 is being deleted from the system
ZEBRA: [KNFMR-AFZ53] RTM_DELLINK for VRF vrf1(93)
ZEBRA: [P0CZ5-RF5FH] VRF vrf1 id 93 is now inactive
ZEBRA: [XC3P3-1DG4D] MESSAGE: ZEBRA_VRF_DELETE vrf1
ZEBRA: [ZMS2F-6K837] VRF vrf1 id 4294967295 deleted
OSPF: [JKWE3-97M3J] Zebra: interface add vrf1 vrf default[0] index 0 flags 480 metric 0 mtu 65575 speed 0 <- Wrongly add interface
```

`if_handle_vrf_change()` moved the interface from specific vrf to default
vrf. But it doesn't skip interface of vrf type. So, the wrong/redundant
add operation is done.

Note, the wrong add operation is regarded as a normal interface because
the `ifp->status` is cleared too early, so it is without VRF flag
( `ZEBRA_INTERFACE_VRF_LOOPBACK` ). Now, ospfd will initialize `ifp->type`
to `OSPF_IFTYPE_BROADCAST`.

3. `ip link add vrf1 type vrf ...`, add "vrf1" again. FRR will be with
wrong display:

```
interface vrf1
 ip ospf network broadcast
exit
```

Here, zebra will send `ZEBRA_INTERFACE_ADD` again for "vrf1" with
correct `ifp->status`, so it will be updated into vrf type. But
it can't update `ifp->type` from `OSPF_IFTYPE_BROADCAST` to
`OSPF_IFTYPE_LOOPBACK` because it had been already configured in above
step 2.

Two changes to fix it:

1. Skip the procedure of switching VRF for interfaces of vrf type.
It means, don't send `ZEBRA_INTERFACE_ADD` to clients when deleting vrf.

2. Put the deletion of this flag last.
It means, clients should get correct `ifp->status`.

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2023-05-01 20:21:37 +08:00
Sindhu Parvathi Gopinathan
2223b4d543 zebra:add df flag into evpn esi json output
FRR "show evpn es 'esi-id' json" output doesn't have the 'df' flag.

Modified the code to add the 'df' flag into json output.

Before Fix:

```
torm-11# show evpn es 03:44:38:39:ff:ff:01:00:00:01 json
{
  "esi":"03:44:38:39:ff:ff:01:00:00:01",
  "accessPort":"hostbond1",
  "flags":[
    "local",
    "remote",
    "readyForBgp",
    "bridgePort",
    "operUp",
    "nexthopGroupActive"
	 ====================> df is missing
  ],
  "vniCount":10,
  "macCount":13,
  "dfPreference":50000,
  "nexthopGroup":536870913,
  "vteps":[
    {
      "vtep":"27.0.0.16",
      "dfAlgorithm":"preference",
      "dfPreference":32767,
      "nexthopId":268435460
    },
    {
      "vtep":"27.0.0.17",
      "dfAlgorithm":"preference",
      "dfPreference":32767,
      "nexthopId":268435461
    }
  ]
}
torm-11#
```

After Fix:-

```
torm-11# show evpn es 03:44:38:39:ff:ff:01:00:00:01 json
{
  "esi":"03:44:38:39:ff:ff:01:00:00:01",
  "accessPort":"hostbond1",
  "flags":[
    "local",
    "remote",
    "readyForBgp",
    "bridgePort",
    "operUp",
    "nexthopGroupActive",
    "df" ========================> designated-forward flag added
  ],
  "vniCount":10,
  "macCount":13,
  "dfPreference":50000,
  "nexthopGroup":536870913,
  "vteps":[
    {
      "vtep":"27.0.0.16",
      "dfAlgorithm":"preference",
      "dfPreference":32767,
      "nexthopId":268435460
    },
    {
      "vtep":"27.0.0.17",
      "dfAlgorithm":"preference",
      "dfPreference":32767,
      "nexthopId":268435461
    }
  ]
}
torm-11#

```

Ticket:# 3447935

Issue: 3447935

Testing: UT done

Signed-off-by: Sindhu Parvathi Gopinathan <sgopinathan@nvidia.com>
Signed-off-by: Chirag Shah <chirag@nvidia.com>
2023-04-28 16:04:25 -07:00
Russ White
257fddaeb6
Merge pull request #13246 from opensourcerouting/rip-bfd
ripd: support BFD integration
2023-04-25 11:54:32 -04:00
Donatas Abraitis
76cd90fb4e
Merge pull request #13330 from chiragshah6/fdev1
zebra: EVPN handle duplicate detected local mac delete event
2023-04-24 16:51:10 +03:00
Donald Sharp
6f99cfcd89 zebra: ctx has to be non NULL at this point
Remove the pointer check for ctx.  At this point in the
function it has to be non null since we deref'ed it.
Additionally the alloc function that creates it cannot
fail.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-04-21 08:54:51 -04:00
Chirag Shah
89844a9678 zebra:fix evpn dup detected local mac del event
The local MAC delete event is currently always sent with the
force flag, which breaks duplicate-detected MACs that need
to be resynced from bgpd to their earlier state.

Ticket:#3233019
Issue:3233019

Signed-off-by: Chirag Shah <chirag@nvidia.com>
2023-04-20 15:45:39 -07:00
Chirag Shah
ad7685de28 zebra: evpn handle del event for dup detected mac
Upon receiving local mobility event for MAC + NEIGH,
both are detected as duplicate upon hitting DAD threshold.

Duplicate detected (frozen) MAC + NEIGH are not known
to bgpd.

If locally learnt MAC + NEIGH are deleted in kernel,
the MAC is marked as AUTO after sending delete event
to bgpd.

Bgpd only reinstalls best route for MAC_IP route (NEIGH)
but not for MAC event.
This leads to a situation where the MAC is in AUTO state and
the associated neigh is remote.

Fix:
DUPLICATE + LOCAL MAC deletion, set MAC delete request
as reinstall from bgpd.

Ticket:#2873307
Reviewed By:
Testing Done:

Freeze MAC + two NEIGHs in local mobility event.
Delete MAC and NEIGH from kernel.
bgpd resyncs the remote MAC route, which puts the MAC in remote state.

Signed-off-by: Chirag Shah <chirag@nvidia.com>
2023-04-20 15:45:26 -07:00
Renato Westphal
c262df828b ripd: support BFD integration
Implement RIP peer monitoring with BFD.

RFC 5882 Generic Application of Bidirectional Forwarding Detection
(BFD), Section 10.3 Interactions with RIP.

Co-authored-by: Renato Westphal <renato@opensourcerouting.org>
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
2023-04-19 09:15:01 -03:00
Chirag Shah
4a1f91a366 zebra: evpn mh sync mac install as inactive
EVPN MH ES redundant VTEPs need to install
sync MAC as notify inactive and generate
ND:Proxy stamped extended community on Type-2
route.

Ticket:#3436621
Issue:3436621

Testing Done:

tor-11 originates type-2 MAC route:

tor-11# bridge -d fdb show | grep 00:65:00:00:00:01
00:65:00:00:00:01 dev hostbond1 vlan 1000 notify master bridge static

tor-12 receives sync MAC route:

Before fix:
----------
tor-12:/# bridge -d fdb show | grep 00:65:00:00:00:01
00:65:00:00:00:01 dev hostbond1 vlan 1000 notify master bridge static

After fix: inactive is set to MAC entry
----------
tor-12:/# bridge -d fdb show | grep 00:65:00:00:00:01
00:65:00:00:00:01 dev hostbond1 vlan 1000 notify inactive master bridge
static

Notice the difference in `inactive` post notify on tor-12
with the fix.

Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
Signed-off-by: Chirag Shah <chirag@nvidia.com>
2023-04-14 14:50:24 -07:00
Philippe Guibert
f38f5c9a78 zebra: keep seg6local information from 'show ipv6 route' consistent with iproute2
Srv6 nexthop segments may not be set when configuring seg6local
attributes. This is the case for the following seg6local route:

Dump in vtysh, extract from 'show ipv6 route'
> B>* 2001:db8:1:1:1::/128 [20/0] is directly connected, vrf1, seg6local End.DT46 table 10, seg6 ::, weight 1, 00:02:10

Dump in iproute2, extract from 'ip -6 route show'
> 2001:db8:1:1:1:: nhid 22  encap seg6local action End.DT46 vrftable 10 dev vrf1 proto bgp metric 20 pref medium

As can be seen, the 'seg6 ::' nexthop segment is not visible on iproute2,
because it is not set. Do not display seg6 ipv6 nexthop when not set.
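
A minimal sketch of "do not display seg6 when not set", assuming that an unset segment is represented by the unspecified address `::`. The function name and output format are hypothetical; only the standard `IN6_IS_ADDR_UNSPECIFIED` macro and `inet_ntop()` are relied on.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/socket.h>

/*
 * Write ", seg6 <addr>" into buf only when a segment is actually set.
 * Returns the number of characters written, 0 when the segment is
 * unset, or -1 on conversion failure.
 */
int demo_format_seg6(char *buf, size_t len, const struct in6_addr *seg)
{
	char addr[INET6_ADDRSTRLEN];

	if (IN6_IS_ADDR_UNSPECIFIED(seg)) {
		if (len > 0)
			buf[0] = '\0';
		return 0;
	}

	if (inet_ntop(AF_INET6, seg, addr, sizeof(addr)) == NULL)
		return -1;

	return snprintf(buf, len, ", seg6 %s", addr);
}
```

For a seg6local-only route the segment is `::`, so nothing is appended and the line matches what iproute2 shows.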

After:
> B>* 2001:db8:1:1:1::/128 [20/0] is directly connected, vrf1, seg6local End.DT46 table 10, weight 1, 00:02:10

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-04-14 18:04:01 +02:00
Philippe Guibert
b4bb3b1735 zebra: display seg6local only when specified
Srv6 routes which configure encap method, may not have
seg6local instructions. Generally speaking, seg6local
attributes that are not specified should not be dumped.

Before:
> B>* 10.200.0.0/24 [20/0] via fd00:125::2, ntfp2 (vrf default), label 16, seg6local unspec unknown(seg6local_context2str), seg6 2001:db8:1:1:1::, weight 1, 00:00:17

After:
> B>* 10.200.0.0/24 [20/0] via fd00:125::2, ntfp2 (vrf default), label 16, seg6 2001:db8:1:1:1::, weight 1, 00:00:17

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-04-14 18:04:01 +02:00
Mark Stapp
4b6b10cb81
Merge pull request #13273 from donaldsharp/metaq_not_making_me_meta_happy
zebra: Actually free up memory associated with the mq list
2023-04-12 14:02:14 -04:00
Mark Stapp
52ccf12c30
Merge pull request #13249 from Pdoijode/connected-route-install-fix
zebra: Mark connected route as installed after interface flap event
2023-04-12 11:03:47 -04:00
Donald Sharp
1b192d88e4 zebra: Actually free up memory associated with the mq list
Free up the link list data structures as well as properly
account for data sizes.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-04-12 10:41:42 -04:00
Jafar Al-Gharaibeh
bd2711d251
Merge pull request #12959 from leonshaw/fix/zif-link-nsid
zebra: Add link_nsid to zebra interface
2023-04-11 16:38:33 -05:00
Donatas Abraitis
b69fa56517
Merge pull request #13213 from mjstapp/fix_dplane_shutdown_event
zebra: fix race during shutdown
2023-04-11 22:24:35 +03:00
Pooja Jagadeesh Doijode
e25a0b138a zebra: Install directly connected route after interface flap
Issue:
After vlan flap, zebra was not marking the selected/best route as installed.

As a result, when a static route was configured with nexthop as directly
connected interface's(vlan) IP, the static route was not being installed
in the kernel since its nexthop was unresolved. The nexthop was marked
unresolved because zebra failed to mark the best route as installed after
interface flap.

This was happening because, in dplane_route_update_internal() if the old and
new context type, and nexthop group id are the same, then zebra doesn't send
down a route replace request to kernel. But, the installed (ROUTE_ENTRY_INSTALLED)
flag is set when zebra receives a response from kernel. Since the
request to kernel was being skipped for the route entry, installed flag
was not being set.

Fix:
In dplane_route_update_internal() if the old and new context type, and
nexthop group id are the same, then before returning, installed flag will
be set on the route-entry if it's not set already.
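
The fix can be sketched as follows: when the update is skipped because type and nexthop group id are unchanged, mark the new entry as installed before returning. The struct, flag value, and function name are hypothetical simplifications of what `dplane_route_update_internal()` deals with.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag standing in for ROUTE_ENTRY_INSTALLED. */
#define DEMO_ROUTE_INSTALLED 0x01u

struct demo_route {
	int type;
	uint32_t nhg_id;
	uint32_t flags;
};

/*
 * Returns true when a replace request must go to the kernel.
 * When nothing relevant changed, no request is sent, so no kernel
 * response will arrive to set the installed flag; set it here.
 */
bool demo_route_update(const struct demo_route *old_re,
		       struct demo_route *new_re)
{
	if (old_re->type == new_re->type &&
	    old_re->nhg_id == new_re->nhg_id) {
		new_re->flags |= DEMO_ROUTE_INSTALLED;
		return false;
	}

	return true;
}
```

In the skipped case the caller sees `false` and the entry already carries the installed flag, so dependent nexthops can resolve against it.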

Signed-off-by: Pooja Jagadeesh Doijode <pdoijode@nvidia.com>
2023-04-10 16:03:23 -07:00
Donatas Abraitis
cf35e49354
Merge pull request #13214 from chiragshah6/fdev2
zebra:return empty dict in json when evpn is disabled
2023-04-06 12:48:52 +03:00
Mark Stapp
27552b48ab zebra: null-check client pointer during GR processing
Add a null check.

Signed-off-by: Mark Stapp <mjs@labn.net>
2023-04-05 12:30:52 -04:00
Sindhu Parvathi Gopinathan
61f3a6c353 zebra:return empty dict when evpn is disabled
"show evpn json" returns nothing when evpn is disabled.

Code has been fixed to return {} when evpn is disabled or no entry
available.

Before Fix:-
```
cumulus@r2:mgmt:~$ sudo vtysh -c "show evpn json"
cumulus@r2:mgmt:~$
```

After Fix:-
```
cumulus@r1:mgmt:~$ sudo vtysh -c "show evpn json"
{
}
cumulus@r1:mgmt:~$
```

Ticket:#3417955

Issue:3417955

Testing: UT done

Signed-off-by: Chirag Shah <chirag@nvidia.com>
Signed-off-by: Sindhu Parvathi Gopinathan <sgopinathan@nvidia.com>
2023-04-04 19:41:25 -07:00
Jafar Al-Gharaibeh
92c4494ce5
Merge pull request #13145 from donaldsharp/do_delete
Improve and fix zebra GR
2023-04-04 21:10:54 -05:00
Mark Stapp
38a2e2cb26 zebra: fix race during shutdown
During shutdown, the main pthread stops the dplane pthread
before exiting. Don't try to clean up any events scheduled
to the dplane pthread at that point - just let the thread
exit and clean up.

Signed-off-by: Mark Stapp <mjs@labn.net>
2023-04-04 16:37:38 -04:00
Russ White
c0656e9040
Merge pull request #12837 from donaldsharp/unlikely_routemap
Unlikely routemap
2023-04-04 08:20:25 -04:00
Christian Hopps
9ecc5f3603
Merge pull request #13179 from donaldsharp/array_size
isisd, zebra: Use array_size instead of ARRAY_SIZE
2023-04-02 08:21:41 +09:00
Donald Sharp
6cd594ecfd isisd, zebra: Use array_size instead of ARRAY_SIZE
Use the FRR provided array_size.
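
For illustration only, an element-count macro of this kind usually looks
like the sketch below. The `sketch_` names are hypothetical and the
definition is an assumption; consult the FRR headers for the real
array_size.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed shape of an element-count macro; valid for true arrays only. */
#define sketch_array_size(ar) (sizeof(ar) / sizeof((ar)[0]))

/* Hypothetical table whose length we want without a hard-coded constant. */
static const int sketch_levels[] = {1, 2, 3, 4};

static size_t sketch_level_count(void)
{
	/* Evaluated at compile time; breaks if given a pointer, not an array. */
	return sketch_array_size(sketch_levels);
}
```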

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-31 13:58:47 -04:00
Donald Sharp
3cd0accb50 zebra: Cleanup ctx leak on shutdown and turn off event
Two things:

On shutdown cleanup any events associated with the update walker.
Also do not allow new events to be created.

Fixes this mem-leak:

./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790:Direct leak of 8 byte(s) in 1 object(s) allocated from:
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790-    #0 0x7f0dd0b08037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790-    #1 0x7f0dd06c19f9 in qcalloc lib/memory.c:105
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790-    #2 0x55b42fb605bc in rib_update_ctx_init zebra/zebra_rib.c:4383
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790-    #3 0x55b42fb6088f in rib_update zebra/zebra_rib.c:4421
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790-    #4 0x55b42fa00344 in netlink_link_change zebra/if_netlink.c:2221
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790-    #5 0x55b42fa24622 in netlink_information_fetch zebra/kernel_netlink.c:399
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790-    #6 0x55b42fa28c02 in netlink_parse_info zebra/kernel_netlink.c:1183
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790-    #7 0x55b42fa24951 in kernel_read zebra/kernel_netlink.c:493
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790-    #8 0x7f0dd0797f0c in event_call lib/event.c:1995
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790-    #9 0x7f0dd0684fd9 in frr_run lib/libfrr.c:1185
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790-    #10 0x55b42fa30caa in main zebra/main.c:465
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790-    #11 0x7f0dd01b5d09 in __libc_start_main ../csu/libc-start.c:308
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790-
./msdp_topo1.test_msdp_topo1/r2.zebra.asan.1117790-SUMMARY: AddressSanitizer: 8 byte(s) leaked in 1 allocation(s).

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-31 09:09:21 -04:00
Jafar Al-Gharaibeh
3b0e17067e
Merge pull request #13082 from inspurSDN/bugfix_zebra_crash_rebooting
zebra: move vrf deleting handle to zebra final state handle
2023-03-31 00:17:19 -05:00
Donald Sharp
81322b96b0 zebra: Ensure gr events run after Meta Queue has run
BGP signals to zebra that an afi has converged immediately
after it has finished processing all routes for a given
afi/safi.  This generates events in zebra in this order:

a) Routes received from BGP, placed on early-rib Meta-Q
b) Signal GR for the afi.

Now imagine that zebra reads the GR code and immediately
processes routes that are in the actual rib and
removes some routes.  This generates:

c) Route deletions to the kernel for some number of
routes that may be in the early-rib Meta-Q
d) Processing of the Meta-Q, which re-installs the routes

This is undesirable behavior in zebra: while we may
end up in a correct state, there will be a blip for
some number of routes that happen to be in the
early-rib Meta-Q.

Modify the GR code to have its own processing
entry at the end of the Meta-Q.  This will
allow all routes to be processed and ready
for handling by the Graceful Restart code.
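
The ordering can be sketched roughly as follows. This is an illustrative
model, not the zebra Meta-Q: the enum values, struct and function are
hypothetical, and the only assumption is that sub-queues are drained in
index order with the GR entry placed last.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sub-queue indices; a lower index is processed first. */
enum sketch_subq {
	SKETCH_SUBQ_EARLY_RIB = 0,
	SKETCH_SUBQ_GR_RUN, /* last entry: runs after everything else */
	SKETCH_SUBQ_MAX,
};

struct sketch_metaq {
	int pending[SKETCH_SUBQ_MAX]; /* queued items per sub-queue */
};

/*
 * Drain the meta queue in index order and record which sub-queue each
 * serviced item came from. Returns the number of entries written.
 */
static size_t sketch_metaq_run(struct sketch_metaq *mq,
			       enum sketch_subq *order, size_t max)
{
	size_t n = 0;

	assert(mq != NULL && order != NULL);

	for (int q = 0; q < SKETCH_SUBQ_MAX; q++) {
		while (mq->pending[q] > 0 && n < max) {
			mq->pending[q]--;
			order[n++] = (enum sketch_subq)q;
		}
	}
	return n;
}
```

Even if the GR signal is queued before the routes, the routes on the
early-rib sub-queue are serviced first, so the GR pass sees a rib that
already contains them.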

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-29 20:25:51 -04:00
Donald Sharp
644a8d3560 zebra: remove current_afi as that it is no longer used
After the restructure of the gr code to allow zebra_gr
to have individual cleanups of afi, this is no longer necessary.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-29 15:40:56 -04:00
Donald Sharp
347ded1ec8 zebra: Allow GR to run per AFI as they are reported
The GR code in FRR used to wait until all AFIs were complete
before cleaning up the routes from the upper level protocol.
This can lead to odd situations where, say, ipv4 finishes and
then v6 is stuck waiting for a peer to come up and never
finishes.  When v4 finishes it signals zebra that it is done,
but no action is taken at that moment.

Modify the code to allow the zebra_gr.c code to handle a per
afi removal, instead of doing it all at the end.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-29 15:40:56 -04:00
Donald Sharp
9c1c21da8a zebra: Rearrange zebra_gr zapi functions
The zebra_gr code had 3 functions when effectively only
1 was needed.  This cleans up some code weirdness around
multiple switch statements for the same api->cap,
and consolidates down to only caring about
SAFI_UNICAST, since that is all we care about at the
moment.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-29 15:40:56 -04:00
Donald Sharp
0f5ef7f9b1 zebra: zebra GR only works with AFI's limit it
We have code that tracks both afis and safis,
but we only ever operate on the afis.  So let's
limit the work being done to something more sensible.

I'm leaving the safi being broadcast through the zapi
message, as I am not sure what else should be ripped
out at this point in time.

Finally, re-arrange the zread_client_capabilites function
to stop the multiple levels of function calling that really
serve no purpose.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-29 15:40:13 -04:00
Donald Sharp
096abfb815 zebra: Remove redundant check for pointers being good
By the time this function is called we have already
ensured that the pointers are good several times.
I like consistency, but this is a bit much.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-29 07:48:42 -04:00
Donald Sharp
0c1fd82df6 zebra: GR code could potentially stop running
When GR is running and cleaning up nodes, zebra suspends
the GR code once it hits the node limit and saves the
current node to come back to.  If that saved node happens
to be deleted while the GR code is suspended, the zebra GR
code completely stops processing and potentially leaves
stale nodes around forever.

Let's just remove this hole and process what we can.
Can you imagine trying to debug this after the fact?

If we remove a node, that counts toward the maximum
to process, ZEBRA_MAX_STALE_ROUTE_COUNT.  This should
prevent any non-processing, at the slightly larger cost
of having to look at a few nodes repeatedly.
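
One way to get that property, shown here only as a sketch, is to restart
each run from the first node instead of from a remembered one. The table
layout, the budget constant and the function name are hypothetical and
much simpler than the real route table walk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_STALE_PER_RUN 2 /* hypothetical per-run budget */

struct sketch_node {
	bool present; /* node still exists in the table */
	bool stale;   /* node holds a stale route */
};

/*
 * One run of the sweep. It always restarts from the first node rather
 * than from a remembered position, so a node deleted between runs cannot
 * strand the sweep. Every removal counts against the budget.
 * Returns the number of nodes removed in this run.
 */
static int sketch_gr_sweep(struct sketch_node *tbl, size_t len)
{
	int removed = 0;

	assert(tbl != NULL);

	for (size_t i = 0; i < len; i++) {
		if (!tbl[i].present || !tbl[i].stale)
			continue; /* already gone or not stale: revisit is cheap */
		tbl[i].present = false;
		if (++removed >= SKETCH_MAX_STALE_PER_RUN)
			break; /* budget hit: yield and get rescheduled */
	}
	return removed;
}
```

A caller would reschedule the sweep until a run removes nothing; the
repeated look at already-handled nodes is the extra cost mentioned above.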

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-29 07:48:42 -04:00
Donald Sharp
559dbc2ea1 zebra: Cleanup indentation in function
Indentation was deep and hard to understand in
zebra_gr_delete_stale_route

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-29 07:48:42 -04:00
Donald Sharp
310ee91718 zebra: Just set the variable for what is wanted in GR code
The info->do_delete variable was being set to true only when
u.val was 1.  The problem with this is that u.val is a union
member, and the various ways that we can call this event cause
different values to be written to the union on the thread.

This makes no sense.  Just set the variable to what we want it to
be when we need it to be true, since it was only ever set during
a thread_execute section.
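
The hazard can be illustrated with a small sketch. The struct layout and
function names are hypothetical, loosely modeled on the description
above, and are not the FRR definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event with a union payload. */
struct sketch_event {
	void *arg;
	union {
		int val; /* meaningful only if the scheduler wrote val */
		int fd;  /* meaningful only if the scheduler wrote fd */
	} u;
};

struct sketch_gr_info {
	bool do_delete;
};

/* Fragile: infers intent from a union member another path may overwrite. */
static bool sketch_delete_from_union(const struct sketch_event *ev)
{
	assert(ev != NULL);
	return ev->u.val == 1;
}

/* Direct: the caller that wants deletion states so explicitly. */
static void sketch_request_delete(struct sketch_gr_info *info)
{
	assert(info != NULL);
	info->do_delete = true;
}
```

If the event was scheduled on a file descriptor that happens to be 1, the
fragile check reports "delete" by accident; setting the flag at the one
call site that wants it avoids depending on which union member was
written.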

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-29 07:48:42 -04:00
Donald Sharp
9a7d1e7427 zebra: Use zebra_vrf_lookup_by_id when we can
Let's make this as consistent as is possible.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-28 15:49:50 -04:00
Donald Sharp
24a58196dd *: Convert event.h to frrevent.h
We should probably prevent any type of namespace collision
with something else.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24 08:32:17 -04:00
Donald Sharp
cd9d053741 *: Convert struct event_master to struct event_loop
Let's find a better name for it.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24 08:32:17 -04:00
Donald Sharp
e16d030c65 *: Convert THREAD_XXX macros to EVENT_XXX macros
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24 08:32:17 -04:00
Donald Sharp
70d4d90c82 lib, zebra: Convert THREAD_TIMER_STRLEN to EVENT_TIMER_STRLEN
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24 08:32:17 -04:00
Donald Sharp
2453d15dbf *: Convert struct thread_master to struct event_master and it's ilk
Convert the `struct thread_master` to `struct event_master`
across the code base.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24 08:32:17 -04:00
Donald Sharp
5f6eaa9b96 *: Convert a bunch of thread_XX to event_XX
Convert these functions:

thread_getrusage
thread_cmd_init
thread_consumed_time
thread_timer_to_hhmmss
thread_is_scheduled
thread_ignore_late_timer

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24 08:32:17 -04:00
Donald Sharp
70c35c11f2 *: Convert thread_should_yield and thread_set_yield_time
Convert thread_should_yield and thread_set_yield_time
to event_should_yield and event_set_yield_time

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24 08:32:17 -04:00
Donald Sharp
4f830a0799 *: Convert thread_timer_remain_XXX to event_timer_remain_XXX
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24 08:32:17 -04:00
Donald Sharp
8c1186d38e *: Convert thread_execute to event_execute
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24 08:32:17 -04:00
Donald Sharp
332beb64b8 *: Convert thread_cancelXXX to event_cancelXXX
Modify the code base so that thread_cancel becomes event_cancel

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24 08:32:17 -04:00
Donald Sharp
907a2395f4 *: Convert thread_add_XXX functions to event_add_XXX
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24 08:32:17 -04:00
Donald Sharp
e6685141aa *: Rename struct thread to struct event
Effectively a massive search and replace of
`struct thread` to `struct event`.  Using the
term `thread` gives people the impression that
this event system is a pthread when it is not.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24 08:32:17 -04:00
Donald Sharp
cb37cb336a *: Rename thread.[ch] to event.[ch]
This is the first in a series of commits whose goal is to rename
the thread system in FRR to an event system.  There is a continual
problem where people are confusing `struct thread` with a true
pthread.  In reality, our entire thread.c is an event system.

In this commit rename the thread.[ch] files to event.[ch].

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-03-24 08:32:16 -04:00
Donatas Abraitis
6927446645
Merge pull request #13074 from donaldsharp/hash_clean_and_free
*: Add a hash_clean_and_free() function
2023-03-23 14:08:29 +02:00