Commit Graph

4930 Commits

Author SHA1 Message Date
Donatas Abraitis
ecf0ea4b00
Merge pull request #9953 from donaldsharp/system_route_replace
zebra: Better handle replacing our route by a system route
2022-03-20 23:25:52 +02:00
Donald Sharp
0399a608e0
Merge pull request #10830 from anlancs/zebra-rb-remove
zebra, bgpd: remove check returning value of RB_INSERT()
2022-03-20 14:32:49 -04:00
Donald Sharp
f2f2a16af4 zebra: Do not complain if deletion fails
When issuing a RTM_DELETE operation and the kernel tells
us that the route is already deleted, let's not complain
about the situation:

2022/03/19 02:40:34 ZEBRA: [EC 100663303] kernel_rtm: 2a10:cc42:1d51::/48: rtm_write() unexpectedly returned -4 for command RTM_DELETE

I can recreate this issue on freebsd by doing this:
a) create a route using sharpd
b) shutdown the nexthop's interface
c) remove the route using sharpd

This would also be true of pretty much any routing protocol's behavior.
Let's just not complain about the situation if a RTM_DELETE
operation is issued and FRR is told that the route does not
exist to delete.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-03-19 07:44:54 -04:00
anlan_cs
2a778afe9d zebra: remove check returning value of RB_INSERT()
Since the `RB_INSERT()` is called after not found in RB tree, it MUST be ok and
and return zero. The check of returning value of `RB_INSERT()` is redundant,
just remove them.

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2022-03-19 13:45:14 +08:00
Jafar Al-Gharaibeh
d6d0d718b0
Merge pull request #10806 from donaldsharp/dplane_fixup_for_lua
zebra: Fixup lua with new dplane ops
2022-03-16 16:38:01 -05:00
Donald Sharp
60cd8d3b14
Merge pull request #10790 from anlancs/zebra-adjust-flag
zebra: minor changes on "zebra_evpn_mac_gw_macip_add" function
2022-03-16 16:25:24 -04:00
Donald Sharp
105271792d zebra: Fixup lua with new dplane ops
Commit: 5d41413833 added 3 new dplane ops:

DPLANE_OP_INTF_INSTALL
DPLANE_OP_INTF_UPDATE
DPLANE_OP_INTF_DELETE

The build system does not build lua so zebra_script.c
was not updated.  Update of course!

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-03-16 15:10:54 -04:00
Russ White
5d97021ba3
Merge pull request #10427 from sworleys/Protodown-Reason-Upstream
Add Support for Setting Protodown Reason Code
2022-03-15 19:58:16 -04:00
Sri Mohana Singamsetty
3d58538a75
Merge pull request #10770 from chiragshah6/evpn_dev3
zebra: evpn disable remove l2vni from l3vni list
2022-03-15 12:32:22 -07:00
Donald Sharp
2ea93eb023
Merge pull request #10580 from leonshaw/fix/link-ns
zebra: Lookup linked interface in link netns
2022-03-15 15:23:18 -04:00
Donald Sharp
052b0eee2a
Merge pull request #10693 from anlancs/bgpd-add-check-ns
zebra: use "assert" instead of unnecessary check
2022-03-15 08:27:44 -04:00
Donald Sharp
6c72dd869e
Merge pull request #10725 from opensourcerouting/zebra-fpm-crash-fix
zebra: don't enqueue data with FPM socket closed
2022-03-14 08:27:10 -04:00
Donatas Abraitis
a9321141fc
Merge pull request #10731 from donaldsharp/multipath_output_in_zebra
zebra: Multipath output
2022-03-14 13:38:08 +02:00
Rafael Zalamena
3b1caddd34 zebra: don't enqueue data with FPM socket closed
It will trigger an assert while trying to schedule the next write.

Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
2022-03-14 07:14:36 -03:00
Donald Sharp
7547d5288e
Merge pull request #10704 from anlancs/zebra-remove-check
zebra: Remove unnecessary check
2022-03-13 10:17:13 -04:00
Donald Sharp
b74f72c1fb zebra: prefixlen is not afi/safi dependant in encoding nexthops
When encoding a response to the upper level protocol the
prefixlen is not something that needs to be part of the
switch statement for handling of a prefix.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-03-12 11:18:45 -05:00
Donald Sharp
06e4e90132 *: When matching against a nexthop send and process what it matched against
Currently the nexthop tracking code is only sending to the requestor
what it was requested to match against.  When the nexthop tracking
code was simplified to not need an import check and a nexthop check
in b8210849b8 for bgpd.  It was not
noticed that a longer prefix could match but it would be seen
as a match because FRR was not sending up both the resolved
route prefix and the route FRR was asked to match against.

This code change causes the nexthop tracking code to pass
back up the matched requested route (so that the calling
protocol can figure out which one it is being told about )
as well as the actual prefix that was matched to.

Fixes: #10766
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-03-12 11:18:45 -05:00
Donald Sharp
5c7861fe35 zebra: Remove unused ZEBRA_NHT_EXACT_MATCH
This usage was removed in an earlier bit of code
do some final cleanup

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-03-12 08:27:22 -05:00
anlan_cs
df44783078 zebra: use "assert" instead of unnecessary check
Since `zvni_map_to_svi_ns()` is used to find and return one specific interface
based on passed attributes of SVI, so the two parameters `in_param` and `p_ifp`
must not be NULL.

Passing NULL `p_ifp` makes no sense, so the check `if (p_ifp)` is
unnecessary.

So use `assert` to ensure the two parameters, and remove that unnecessary check.

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2022-03-12 12:44:47 +08:00
Donald Sharp
04442fdbea zebra: Add ECMP supported to show zebra
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-03-11 14:25:23 -05:00
Sri Mohana Singamsetty
1b387a2894
Merge pull request #10711 from anlancs/zebra-remove-flag
zebra: remove unnecessary assignment
2022-03-11 11:08:51 -08:00
Chirag Shah
b27be4dbd7 zebra: evpn disable remove l2vni from l3vni list
Upon 'no advertise-all-vni', cleanup l2vni from
its tenant-vrf's l3vni list, instead of passed
zvrf->l3vni which will not be present in case
of default instance.

Reviewed By:
Testing Done:
Before Fix:
----------
TORC12(config-router-af)# advertise-all-vni
TORC12(config-router-af)# end
TORC12# show evpn vni 4001
VNI: 4001
  Type: L3
  Tenant VRF: vrf1
  Vxlan-Intf: vni4001
  State: Up
  Router MAC: 44:38:39:ff:ff:01
  L2 VNIs: 134217728 0 1000 1002 <-----

After Fix:
----------
TORC12# show evpn vni 4001
VNI: 4001
  Type: L3
  Tenant VRF: vrf1
  Vxlan-Intf: vni4001
  State: Up
  Router MAC: 44:38:39:ff:ff:01
  L2 VNIs: 1000 1002

Signed-off-by: Chirag Shah <chirag@nvidia.com>
2022-03-10 19:59:33 -08:00
Chirag Shah
3d43b95ce1 zebra: cleanup host prefix from rmac
Ticket:#2798406
Testing Done:

Signed-off-by: Chirag Shah <chirag@nvidia.com>
2022-03-10 17:27:15 -08:00
Chirag Shah
4a8e182a66 zebra: print rmac nexthop list
Ticket:#2798406
Reviewed By:
Testing Done:

Before change:
--------------

TORS1# show evpn rmac vni 4001 mac  44:38:39:ff:ff:01
MAC: 44:38:39:ff:ff:01
 Remote VTEP: 36.0.0.11
 Refcount: 1
  Prefixes:
    [1]:[00:00:00:00:00:00:00:00:00:00]:[::]/352
TORS1#
TORS1# show evpn rmac vni 4001 mac 44:38:39:ff:ff:01 json
{
  "routerMac":"44:38:39:ff:ff:01",
  "vtepIp":"36.0.0.11",
  "refCount":1,
  "localSequence":0,
  "remoteSequence":0,
  "prefixList":[
    "[1]:[00:00:00:00:00:00:00:00:00:00]:[::]\/352"
  ]
}

After change:
-------------

TORS1# show evpn rmac vni 4001 mac 44:38:39:ff:ff:01
MAC: 44:38:39:ff:ff:01
 Remote VTEP: 36.0.0.11
 Refcount: 0
  Prefixes:
TORS1#
TORS1# show evpn rmac vni 4001 mac 44:38:39:ff:ff:01 json
{
  "routerMac":"44:38:39:ff:ff:01",
  "vtepIp":"36.0.0.11",
  "nexthops":[
    "36.0.0.11"
  ]
}

Signed-off-by: Chirag Shah <chirag@nvidia.com>
2022-03-10 17:27:15 -08:00
Chirag Shah
ae9e6beaea zebra: remove host prefix mapping in rmac
RMAC keeping list of nexthops to keep track
of its existiance, remove the (old way) host prefix
mapping.

Ticket: #2798406
Reviewed By:
Testing Done:

TORS1# show evpn rmac vni 4001 mac  44:38:39:ff:ff:01
MAC: 44:38:39:ff:ff:01
 Remote VTEP: 36.0.0.11
  Refcount: 0
    Prefixes:

Signed-off-by: Chirag Shah <chirag@nvidia.com>
2022-03-10 17:27:15 -08:00
Chirag Shah
db88997872 zebra: maintain list of nhs in rmac db
Keep the list of remote-vteps/nexthops in
rmac db.

Problem:
In CLAG deployment there might be a situation
where CLAG secondary sends individual ip as nexthop
along with anycast mac as RMAC. This combination
is updated in zebra's rmac cache.
Upon recovery at clag secondary sends withdrawal
of the incorrect rmac and nexthop mapping.
The RMAC entry mapping to nh is not cleaned up properly
in the zebra rmac cache.

Fix:
Zebra rmac db needs to maintain a list of nexthops.
When a bgp withdrawal for rmac to nexthop mapping
is received, remove the old nexthop from the rmac's nh
list and if the host reference still remains for
the RMAC,fall back to the nexthop one remaining in
the list.
At most you expect two nexthops mapped to RMAC
(in clag deployment).

Ticket: 2798406
Reviewed By:
Testing Done:

CLAG primary and secondary have advertise-pip enabled
advertise type-5 route (default route) with
individual IP as nh and individual svi mac as rmac.

- disable advertise pip on both clag devices, this
results in advertisement of routes with anycast ip as nh
and anycast mac as rmac.

- disable peerlink on clag primary, this triggers
clag secondary to (transitory) send bgp update with
individual ip as nh and anycast mac as rmac.

- At the remote vtep:
Check the zebra's rmac cache/nh mapping correctly
and points to anycast rmac and anycast ip as nh of the
clag system.

Signed-off-by: Chirag Shah <chirag@nvidia.com>
2022-03-10 17:27:15 -08:00
Stephen Worley
47c1d76a6c zebra: cleanup protodown netlink logs
Cleanup the logs in the netlink code for setting
protodown on/off to be more useful to a user parsing them
after an issue.

Signed-off-by: Stephen Worley <sworley@nvidia.com>
2022-03-09 18:02:45 -05:00
Stephen Worley
7140b00cb0 zebra: use SET/UNSET/CHECK/COND in protodown code
Use the SET/UNSET/CHECK/COND macros for flag bifields
where appropriate throught the protodown code base.

Signed-off-by: Stephen Worley <sworley@nvidia.com>
2022-03-09 18:02:44 -05:00
Stephen Worley
26a64ff9ca zebra: include old reason in evpn-mh bond update
Ensure we include the old reason when we are updating the reason
code for a evpn-mh bond member. Now that this is a common API
it could include things external to EVPN in this reason code
bitfield (ex: vrrp).

Signed-off-by: Stephen Worley <sworley@nvidia.com>
2022-03-09 18:02:44 -05:00
Stephen Worley
ed7a1622c3 zebra: make netlink protodown func more readable
Make the netlink protodown static function for checking
if the only bit set for protodown reason is FRR's more
easily readable to someone not familiar with the code.

Signed-off-by: Stephen Worley <sworley@nvidia.com>
2022-03-09 18:02:44 -05:00
Stephen Worley
00d57e6dd5 zebra: clear dplane flags on failure for protodown
Make sure we clear our dplane flags for SET/UNSET
on failure so that we try again.

Signed-off-by: Stephen Worley <sworley@nvidia.com>
2022-03-09 18:02:44 -05:00
Stephen Worley
a3ea8493a8 zebra: simplify reason code printing in show
Simplify the code for printing the reason codes via
show command. Just remove the trailing comma last
before printing.

Signed-off-by: Stephen Worley <sworley@nvidia.com>
2022-03-09 18:02:44 -05:00
Stephen Worley
3db9f2db2b zebra: cleanup protodown api logs
Cleanup the logs in the api for setting protodown on/off
that zapi and others use. Make them more useful to a user parsing
them after an issue.

Signed-off-by: Stephen Worley <sworley@nvidia.com>
2022-03-09 18:02:44 -05:00
Stephen Worley
ddfea8a233 zebra: wrap macro zif argument in parans
Missing parenthesis for macro ZIF argument.

Signed-off-by: Stephen Worley <sworley@nvidia.com>
2022-03-09 18:02:44 -05:00
Stephen Worley
b1da028e45 zebra: avoid initialization in ctx_intf_init
Avoid initialization in dplane_ctx_intf_init() so
the compiler can warn us about using unintialized data.

Signed-off-by: Stephen Worley <sworley@nvidia.com>
2022-03-09 18:02:44 -05:00
Stephen Worley
4b82b95488 zebra: evpn-mh bonds protodown check for set
When we are processing a bond member's protodown we get from
the dataplane, check to make sure we haven't already queued
up a set. If we have, it's likely this is just a notification
we get from the kernel after we set protodown and before we have
processed the result in our dplane pthread.

This change is needed now that we set protodown via the dplane
pthread.

Signed-off-by: Stephen Worley <sworley@nvidia.com>
2022-03-09 18:02:44 -05:00
Stephen Worley
cb5b31f5d0 zebra: evpn-mh use protodown update reason api
When setting the protodown reason use the update api
where we can directly update the entire reason bitfield
since we have to set more than one.

Signed-off-by: Stephen Worley <sworley@nvidia.com>
2022-03-09 18:02:44 -05:00
Stephen Worley
321c80d6b2 zebra: extern setting protodown reason directly
Extern the api for setting the protodown reason code
bitfield directly. Some places may want to completely update the
bitfield with more than one reason at a time.

Signed-off-by: Stephen Worley <sworley@nvidia.com>
2022-03-09 18:02:44 -05:00
Stephen Worley
ab465d24bd zebra: only clear pd_reason on shutdown/sweep
Only clear protodown reason on shutdown/sweep, retain protodown
state.

This is to retain traditional and expected behavior with daemons
like vrrpd setting protodown. They expet it to be set on shutdown
and retained on bring up to prevent traffic from being dropped.

We must cleanup our reason code though to prevent us from blocking
others.

Signed-off-by: Stephen Worley <sworley@nvidia.com>
2022-03-09 18:02:44 -05:00
Stephen Worley
97c7263373 zebra: add boilerplate protodown updates for *bsd
Add boilerplate for someone to come and add protdown updates
for bsd platforms if it ever exists.

Signed-off-by: Stephen Worley <sworley@nvidia.com>
2022-03-09 18:02:44 -05:00
Stephen Worley
e4d87b5894 zebra: remove old protodown dplane path
Remove the old protodown dplane path install on the main thread.
This is now dead code.

Signed-off-by: Stephen Worley <sworley@nvidia.com>
2022-03-09 18:02:44 -05:00
Stephen Worley
d89b300829 zebra: use a macro for check protodown
Add a helper macro for checking if interface is protodown,
typing this out is annoying.

Signed-off-by: Stephen Worley <sworley@nvidia.com>
2022-03-09 18:02:44 -05:00
Stephen Worley
0dcd8506f2 zebra: clear protodown_rc on shutdown and sweep
Add functionality to clear any reason code set on shutdown
of zebra when we are freeing the interface, in case a bad
client didn't tell us to clear it when the shutdown.

Also, in case of a crash or failure to do the above, clear reason
on startup if it is set.

Signed-off-by: Stephen Worley <sworley@nvidia.com>
2022-03-09 18:02:42 -05:00
Stephen Worley
71ef5cbb95 zebra: add enum set/unset states for queing
Add enums for set/unset of prodown state to handle the mainthread
knowing an update is already queued without actually marking it
as complete.

This is to make the logic confirm a bit more with other parts of the code
where we queue dplane updates and not update our internal structs until
success callback is received.

Signed-off-by: Stephen Worley <sworley@nvidia.com>
2022-03-09 17:52:44 -05:00
Stephen Worley
c40e1b1cfb zebra: add command for setting protodown bit
Add command for use to set protodown via frr.conf in
the case our default conflicts with another application
they are using.

Signed-off-by: Stephen Worley <sworley@nvidia.com>
2022-03-09 17:52:44 -05:00
Stephen Worley
5d41413833 zebra: add support for protodown reason code
Add support for setting the protodown reason code.

829eb208e8

These patches handle all our netlink code for setting the reason.

For protodown reason we only set `frr` as the reason externally
but internally we have more descriptive reasoning available via
`show interface IFNAME`. The kernel only provides a bitwidth of 32
that all userspace programs have to share so this makes the most sense.

Since this is new functionality, it needs to be added to the dplane
pthread instead. So these patches, also move the protodown setting we
were doing before into the dplane pthread. For this, we abstract it a
bit more to make it a general interface LINK update dplane API. This
API can be expanded to support gernal link creation/updating when/if
someone ever adds that code.

We also move a more common entrypoint for evpn-mh and from zapi clients
like vrrpd. They both call common code now to set our internal flags
for protodown and protodown reason.

Also add debugging code for dumping netlink packets with
protodown/protodown_reason.

Signed-off-by: Stephen Worley <sworley@nvidia.com>
2022-03-09 17:52:44 -05:00
Russ White
82aca4ae4f
Merge pull request #10662 from chiragshah6/evpn_dev1
zebra: netlink protodown event handling for vxlan device
2022-03-09 16:37:30 -05:00
Mark Stapp
bc5302b1b1
Merge pull request #10635 from anlancs/staticd-cross
zebra: let same host route cross VRF
2022-03-09 11:05:59 -05:00
David Lamparter
e3c54a9383
Merge pull request #10079 from mjstapp/fix_intf_del_nhgs 2022-03-09 10:15:00 +01:00
anlan_cs
3f04f9cf24 zebra: let /32 host route with same IP cross VRF
Contraints of host routes are too strict in current code:
Host routes with same destination address and nexthop address are forbidden
even when cross VRFs.

Currently host routes with different destination and nexthop address can cross
VRFs, it is ok. But host routes with same addresses are forbidden to cross VRFs,
it is wrong.

Since different VRFs can have the same addresses, leak specific host route with
the same nexthop address ( it means destination address is same to nexthop
address ) to other VRFs is a normal case.

This commit relaxes that contraints. Host routes with same destination address
and nexthop address are forbidden only when not cross VRFs.

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2022-03-09 07:22:11 +08:00
Mark Stapp
2472d3e876 zebra: shutdown doesn't uninstall zebra's NHGs
When an interface goes down, it signals any related NHGs to
re-validate themselves. During zebra shutdown, ensure we remove
any NHGs we've installed.

Signed-off-by: Mark Stapp <mstapp@nvidia.com>
2022-03-08 16:16:55 -05:00
Russ White
5e412c5e73
Merge pull request #10722 from chiragshah6/evpn_dev3
zebra: fix crash in evpn neigh cleanup all
2022-03-08 11:00:29 -05:00
anlan_cs
c2fd85a854 zebra: remove unnecessary assignment
In `zebra_evpn_neigh_gw_macip_add()`, it sets `mac->flags` to "ZEBRA_MAC_DEF_GW"
for "advertise-default-gw" mode. But this set is redundant because this "mac"
is already set by `zebra_evpn_mac_gw_macip_add()`.

So remove this redundant assignment.

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2022-03-08 22:58:22 +08:00
David Lamparter
378260fb65 zebra: remove unused variable
clang complains "variable 'curr_length' set but not used".

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2022-03-07 17:37:27 +01:00
anlan_cs
38eda16a24 zebra: Delay the usage of one variable until need
In the loop, local variable `ip` is always set even if the check condition
is not satisfied.

Avoid the redundant set, move this set exactly after the check condition is
satisfied. Set `ip` only if the check condition is met, otherwise needn't.

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2022-03-05 06:57:35 +08:00
Chirag Shah
d5fdae8f45 zebra: fix crash in evpn neigh cleanup all
zebra crash is seen during shutdown (frr restart).

During shutdown, remote neigh and remote mac clean up
is triggered first, followed by per vni all neigh
(including local) and macs cleanup is triggered.

The crash occurs when a remote mac is cleaned up first
and its reference is remained in local neigh.
When local neigh attempt removes itself from its associated
mac's neigh_list it triggers inaccessible memory crash.

The fix is during mac deletion if its neigh_list is non-empty
then retain the MAC in AUTO state.
This can arise when MAC and neigh duo are in different state
(remote/local). Otherwise, the order of cleanup operation
is neighs followed by macs.

The auto mac will be cleaned up when per vni all neighs and macs
are cleaned up.

Ticket:CM-29826
Reviewed By:CCR-10369
Testing Done:

Configure evpn symmetric config where
MAC is in remote state and neigh is in local state.
Perform frr restart then crash is not seen.

Signed-off-by: Chirag Shah <chirag@nvidia.com>
2022-03-03 14:59:56 -08:00
Donald Sharp
8b48cdb913 zebra: Prevent installation of connected multiple times
With recent changes to interface up mechanics in if_netlink.c
FRR was receiving as many as 4 up events for an interface
on ifdown/ifup events.  This was causing timing issues
in FRR based upon some fun timings.  Remove this from
happening.

Ticket: CM-31623
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-03-02 18:34:32 -08:00
Chirag Shah
d78fa57195 zebra: protodown-up event trigger interface up
Vxlan interfaces flap (protodown/up) event,
non ptm operative interfaces do not come up
as protodown up event do not trigger "if_up()"
event.

Ticket:CM-30477
Reviewed By:CCR-10681
Testing Done:

validated interfaces flaps, ip link down, ifdown
and protodown followed by UP event. all Vxlan interfaces
come up in bgpd post flap.

Signed-off-by: Chirag Shah <chirag@nvidia.com>
2022-03-02 18:16:56 -08:00
Chirag Shah
2d04bd98ac zebra: handle protodown netlink for vxlan device
Frr need to handle protocol down event for vxlan
interface.
In MLAG scenario, one of the pair switch can put
vxlan port to protodown state, followed by
tunnel-ip change from anycast IP to individual IP.

In absence of protodown handling, evpn end up
advertising locally learn EVPN (MAC-IP) routes
with individual IP as nexthop.
This leads an issue of overwriting locally learn
entries as remote on MLAG pair.

Ticket:CM-24545
Reviewed By:CCR-10310
Testing Done:

In EVPN deployment, restart one of the MLAG
daemon, which puts vxlan interfaces in protodown state.
FRR treats protodown as oper down for vxlan interfaces.
VNI down cleans up/withdraws locally learn routes.
Followed by vxlan device UP event, re-advertise
locally learn routes.

Signed-off-by: Chirag Shah <chirag@nvidia.com>
2022-03-02 17:40:44 -08:00
David Lamparter
2821405a69
Merge pull request #10640 from donaldsharp/thread_timers 2022-03-01 11:45:36 +01:00
Jafar Al-Gharaibeh
868efb9e9f
Merge pull request #10672 from donaldsharp/bsd_zebra_graceful_restart_cleanup
Bsd zebra graceful restart cleanup
2022-02-28 14:57:35 -06:00
Donald Sharp
45dafca86c zebra: Use the routes vrf not the vrf of the nexthop for route-map application
When a end operator is doing cross vrf imports in bgp:

router bgp 3239 vrf FOO
  address-family ipv4 uni
    import vrf BAR
!

and zebra has this configuration:

vrf FOO
  ip protocol bgp route-map EVA
!

The current code in zebra_nhg.c was looking up the vrf of the
nexthop and attempting to apply the ip protocol route-map.

For most people the nexthop vrf and the re vrf are one and the
same so they never see a problem.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-02-28 13:08:01 -05:00
David Lamparter
b9db469fe9
Merge pull request #10667 from donaldsharp/bufsize 2022-02-28 15:56:51 +01:00
Donald Sharp
73d3197c73 zebra: Get zebra graceful restart working when restarting on *BSD
Upon restart zebra reads in the kernel state.  Under linux
there is a mechanism to read the route and convert the protocol
to the correct internal FRR protocol to allow the zebra graceful
restart efforts to work properly.

Under *BSD I do not see a mechanism to convey the original FRR
protocol into the kernel and thus back out of it.  Thus when
zebra crashes ( or restarts ) the routes read back in are kernel
routes and are effectively lost to the system and FRR cannot
remove them properly.  Why?  Because FRR see's kernel routes
as routes that it should not own and in general the admin
distance for those routes will be a better one than the
admin distance from a routing protocol.  This is even
worse because when the graceful restart timer pops and rib_sweep
is run, FRR becomes out of sync with the state of the kernel forwarding
on *BSD.

On restart, notice that the route is a self route that there
is no way to know it's originating protocol.  In this case
let's set the protocol to ZEBRA_ROUTE_STATIC and set the admin
distance to 255.

This way when an upper level protocol reinstalls it's route
the general zebra graceful restart code still works.  The
high admin distance allows the code to just work in a way
that is graceful( HA! )

The drawback here is that the route shows up as a static
route for the time the system is doing it's work.  FRR
could introduce *another* route type but this seems like
a bad idea and the STATIC route type is loosely analagous
to the type of route it has become.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-02-28 09:50:35 -05:00
Donald Sharp
16d91fce15 zebra: Prevent crash if ZEBRA_ROUTE_ALL is used for a route type
FRR will crash when the re->type is a ZEBRA_ROUTE_ALL and it
is inserted into the meta-queue.  Let's just put some basic
code in place to prevent a crash from happening.  No routing
protocol should be using ZEBRA_ROUTE_ALL as a value but
bugs do happen.  Let's just accept the weird route type
gracefully and move on.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-02-28 09:50:35 -05:00
Donald Sharp
fbc83b9a10 zebra: Limit speed lookup to at most 4 minutes
There exists some interface types that are slow on startup
to fully register their link speed.  Especially those that
are working with an asic backend.  The speed_update timer
associated with each interface would keep trying if the
system returned a MAX_UINT32 as the speed.  This speed
means both unknown or there is none under linux.

Since some interface types are slow on startup let's modify
FRR to try for at most 4 minutes and give up trying on those
interfaces where we never get any useful data.

Why 4 minutes?  I wanted to balance the time associated with
slow interfaces coming up with those that will never give us
a value.  So I choose 4 minutes as a good ballpark of time
to keep trying

Why not track all those interfaces and just not attempt to
do the speed lookup?  I would prefer to not keep track of these
as that I do not know all the interface types, nor do I wish
to keep programming as new ones come in.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-02-28 06:39:07 -05:00
Xiao Liang
86bad1cb6e zebra: Lookup linked interface in link netns
Look up linked interface in the correct netns, otherwise, either a wrong
interface or NULL would be used.

For example, enable VRF netns backend, and:

    ip netns add ns1
    ip link add link eth0 link1 type macvlan
    ip link set link1 netns ns1 up

Zebra will crash in zebra_vxlan_macvlan_up because zif->link is NULL.

Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
2022-02-28 12:12:15 +08:00
Donald Sharp
9fb83b5506 zebra: Allow *BSD to specify a receive buffer size
End operator is reporting that they are receiving buffer overruns
when attempting to read from the kernel receive socket.  It is
possible to adjust this size to more modern levels especially
for when the system is under load.  Modify the code base
so that *BSD operators can use the zebra `-s XXX` option
to specify a read buffer.

Additionally setup the default receive buffer size on *BSD
to be 128k instead of the 8k so that FRR does not run into
this issue again.

Fixes: #10666
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-02-27 07:47:58 -05:00
Donald Sharp
ae45a63022
Merge pull request #10669 from anlancs/bgpd-line
*: Add necessary new line for output of vty_out()
2022-02-27 07:43:28 -05:00
anlan_cs
4d4c404bf6 *: Add necessary new line for output of vty_out()
Signed-off-by: anlan_cs <vic.lan@pica8.com>
2022-02-27 10:59:19 +08:00
Mark Stapp
cd787a8a45 zebra: use dataplane to read interface NETCONF info
Use the dataplane to query and read interface NETCONF data;
add netconf-oriented data to the dplane context object, and
add accessors for it. Add handler for incoming update
processing.

Signed-off-by: Mark Stapp <mstapp@nvidia.com>
2022-02-25 10:18:32 -05:00
Mark Stapp
728f2017ae zebra: add dplane type for NETCONF data
Add a new dplane op for interface NETCONF data; add the new
enum value to several switch statements.

Signed-off-by: Mark Stapp <mstapp@nvidia.com>
2022-02-25 09:53:02 -05:00
Mark Stapp
d4bcd88d8a zebra: avoid default clause in FPM switch
Avoid default clause in a switch in the FPM module that handles
dplane op codes - include all the codes.

Signed-off-by: Mark Stapp <mstapp@nvidia.com>
2022-02-25 09:53:02 -05:00
Mark Stapp
9f3f1486c8 zebra: add xxxNETCONF messages to the netlink BPF filter
Allow self-produced xxxNETCONF netlink messages through the BPF
filter we use. Just like address-configuration actions, we'll
process NETCONF changes in one path, whether the changes were
generated by zebra or by something else in the host OS.

Signed-off-by: Mark Stapp <mstapp@nvidia.com>
2022-02-25 09:53:02 -05:00
Mark Stapp
777f96503e zebra: add netlink debug dump for netconf messages
Add the RTM_NETCONF messages to the detailed netlink message
dump module.

Signed-off-by: Mark Stapp <mstapp@nvidia.com>
2022-02-25 09:53:02 -05:00
Mark Stapp
b6beb70047 zebra: include mpls enabled status in interface output
Add mpls status to the zebra interface struct; include mpls
status in show interface output.

Signed-off-by: Mark Stapp <mstapp@nvidia.com>
2022-02-25 09:53:02 -05:00
Donald Sharp
ebb61fcaf5 zebra: Start of work to get data about mpls from kernel
a) We'll need to pass the info up via some dataplane control method
(This way bsd and linux can both be zebra agnostic of each other)

b) We'll need to modify `struct interface *` to track this data
and when it changes to notify upper level protocols about it.

c) Work is needed to dump the entire mpls state at the start
so we can gather interface state.  This should be done
after interface data gathering from the kernel.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Mark Stapp <mstapp@nvidia.com>
2022-02-25 09:53:02 -05:00
Christian Hopps
7bf63db79b
Merge pull request #10632 from donaldsharp/thread_return_null
*: Change thread->func to return void instead of int
2022-02-24 01:43:48 -05:00
Donald Sharp
cc9f21da22 *: Change thread->func to return void instead of int
The int return value is never used.  Modify the code
base to just return a void instead.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-02-23 19:56:04 -05:00
vdhingra
1b1e934fac zebra: Nexthop tracking, route resolution recursive lookup
Description:
===========
Change is intended for fixing the NHT resolution logic.
While recursively resolving nexthop, keep looking for a valid/useable route in the rib,
by not stopping at the first/most-specific route in the rib.

Consider the following set of events taking place on R1:
R1(config)# ip route 2.2.2.0/24 ens192
R1# sharp watch nexthop 2.2.2.32 connected
R1# show ip nht
2.2.2.32(Connected)
 resolved via static
 is directly connected, ens192
 Client list: sharp(fd 33)

-2.2.2.32 NHT is resolved over the above valid static route.

R1# sharp install routes 2.2.2.32 nexthop 2.2.2.32 1
R1# 2.2.2.32(Connected)
 resolved via static
 is directly connected, ens192
 Client list: sharp(fd 33)

-.32/32 comes which is going to resolve through itself, but since this is an invalid route,
it will be marked as inactive and will not affect the NHT.

R1# sharp install routes 2.2.2.31 nexthop 2.2.2.32 1
R1# 2.2.2.32(Connected)
 unresolved(Connected)
 Client list: sharp(fd 50)

-Now a .31/32 comes which will resolve over .32 route, but as per the current logic,
this will trigger the NHT check, in turn making the NHT unresolved.

-With fix, NHT should stay in resolved state as long as the valid static or connected route stays installed

Fix:
====
-While resolving nexthops, walk up the tree from the most-specific match,
walk up the tree without any ZEBRA_NHT_CONNECTED check.

Co-authored-by: Vishal Dhingra <vdhingra@vmware.com>
Co-authored-by: Kantesh Mundaragi <kmundaragi@vmware.com>
Signed-off-by: Iqra Siddiqui <imujeebsiddi@vmware.com>
2022-02-22 09:28:00 -08:00
anlan_cs
5ff58d0a47 zebra: minor changes on "zebra_evpn_mac_gw_macip_add" function
Two minor changes:
    1) Change `zebra_evpn_mac_gw_macip_add()` 's return type to `void`.
    2) Since `zebra_evpn_mac_gw_macip_add()` has already `assert` the returned
    `mac`, the check of its return value makes no sense. And keep setting
    `mac->flags` inside `zebra_evpn_mac_gw_macip_add()` is more reasonable. So
    just move the setting `mac->flags` inside `zebra_evpn_mac_gw_macip_add()`.

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2022-02-18 22:49:47 -05:00
Donald Sharp
7f6ff7a3d3
Merge pull request #10557 from alexk99/zebra-fpm-multihop-weight
Zebra FPM: don't lose next hop weights while exporting via FPM
2022-02-17 09:41:52 -05:00
Russ White
c131015905
Merge pull request #10547 from donaldsharp/10458
zebra: Keep the interface flags safe on multiple ioctl calls
2022-02-16 19:20:47 -05:00
Jafar Al-Gharaibeh
76d8e1a4a7
Merge pull request #10561 from mjstapp/nlsock_hash_lock
zebra: make netlink object hash threadsafe
2022-02-16 13:11:21 -06:00
Donald Sharp
b9d95135a8 zebra: Fix spelling mistake
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-02-14 12:56:44 -05:00
Mark Stapp
348698095d zebra: make netlink object hash threadsafe
The recently-added hashtable of nlsock objects needs to be
thread-safe: it's accessed from the main and dplane pthreads.
Add a mutex for it, use wrapper apis when accessing it. Add
a per-OS init/terminate api so we can do init that's not
per-vrf or per-namespace.

Signed-off-by: Mark Stapp <mstapp@nvidia.com>
2022-02-11 17:03:26 -05:00
Trey Aspelund
e54cd97838 zebra: cleanup multiline strings in debug_nl.c
NetDEF CI has been whining about multiline string style.
Make the strings single-line and call it a day.

Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
2022-02-10 21:37:45 +00:00
Trey Aspelund
95fe32880f zebra: add netlink debugs for ip rules
Adds functions to parse + decode netlink rules.
Adds RTM_NEWRULE + RTM_DELRULE to "debug zebra kernel".

Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
2022-02-10 21:36:34 +00:00
kiselev99@gmail.com
eca3256db8 zebra: FPM next hop weights
Don't lose next hop weights while exporting via FPM

Signed-off-by: Alex Kiselev <alex@bisonrouter.com>
2022-02-10 19:16:33 +03:00
Rafael Zalamena
70d79c359b
Merge pull request #10537 from mjstapp/fix_dplane_strdup
zebra: use frr mem apis in dplane
2022-02-10 10:24:22 -03:00
Bijan
16dca7cec5 zebra: Keep the interface flags safe on multiple ioctl calls
Trying to call multiple ioctl calls on ifreq will result in
overwriting ifreq with garbage data. On if_get_flags call,
try to keep the flags field safe from another possible ioctl
call before applying the flags field.

Modified code as per Code Review, done by Donald Sharp.

Signed-off-by: Bijan <bijanebrahimi@riseup.net>
2022-02-09 10:07:47 -05:00
Donald Sharp
2cf7651f0b zebra: Make netlink buffer reads resizeable when needed
Currently when the kernel sends netlink messages to FRR
the buffers to receive this data is of fixed length.
The kernel, with certain configurations, will send
netlink messages that are larger than this fixed length.
This leads to situations where, on startup, zebra gets
really confused about the state of the kernel.  Effectively
the current algorithm is this:

read up to buffer in size
while (data to parse)
     get netlink message header, look at size
        parse if you can

The problem is that there is a 32k buffer we read.
We get the first message that is say 1k in size,
subtract that 1k to 31k left to parse.  We then
get the next header and notice that the length
of the message is 33k.  Which is obviously larger
than what we read in.  FRR has no recover mechanism
nor is there a way to know, a priori, what the maximum
size the kernel will send us.

Modify FRR to look at the kernel message and see if the
buffer is large enough, if not, make it large enough to
read in the message.

This code has to be per netlink socket because of the usage
of pthreads.  So add to `struct nlsock` the buffer and current
buffer length.  Growing it as necessary.

Fixes: #10404
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-02-08 17:28:19 -05:00
Donald Sharp
d4000d7ba3 zebra: Remove struct nlsock from dataplane information and use int fd
Store the fd that corresponds to the appropriate `struct nlsock` and pass
that around in the dplane context instead of the pointer to the nlsock.
Modify the kernel_netlink.c code to store in a hash the `struct nlsock`
with the socket fd as the key.

Why do this?  The dataplane context is used to pass around the `struct nlsock`
but the zebra code has a bug where the received buffer for kernel netlink
messages from the kernel is not big enough.  So we need to dynamically
grow the receive buffer per socket, instead of having a non-dynamic buffer
that we read into.  By passing around the fd we can look up the `struct nlsock`
that will soon have the associated buffer and not have to worry about `const`
issues that will arise.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-02-08 17:28:19 -05:00
Donald Sharp
3670f5047c zebra: Store the sequence number to use as part of the dp_info
Store and use the sequence number instead of using what is in
the `struct nlsock`.  Future commits are going away from storing
the `struct nlsock` and the copy of the nlsock was guaranteeing
unique sequence numbers per message.  So let's store the
sequence number to use instead.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-02-08 17:28:19 -05:00
Mark Stapp
b6b6e59c6e zebra: use frr mem apis
Replace a couple of strdup/free with XSTRDUP/XFREE.

Signed-off-by: Mark Stapp <mstapp@nvidia.com>
2022-02-08 15:57:57 -05:00
Russ White
1a8a7016a6
Merge pull request #9066 from donaldsharp/ships_in_the_night
zebra: Fix ships in the night issue
2022-02-08 14:41:01 -05:00
Igor Ryzhov
60cda04dda *: use ipaddr_cmp instead of memcmp
Using memcmp is wrong because struct ipaddr may contain unitialized
padding bytes that should not be compared.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
2022-02-08 20:31:34 +03:00
Russ White
e735c8073c
Merge pull request #9649 from proelbtn/add-support-for-end-dt4
add support for SRv6 IPv4 L3VPN
2022-02-08 08:30:02 -05:00
Donald Sharp
ce649b9d11 zebra: Abstract nhg deletion to reduce code duplication
Reduce code duplication when we are cleaning up nexthop
groups.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-02-07 16:10:36 -05:00
Donald Sharp
c6eee91f66 zebra: Fix ships in the night issue
When using wait for install there exists situations where
zebra will issue several route change operations to the kernel
but end up in a state where we shouldn't be at the end
due to extra data being received.  Example:

a) zebra receives from bgp a route change, installs sends the
route to the kernel.
b) zebra receives a route deletion from bgp, removes the
struct route entry and then sends to the kernel a deletion.
c) zebra receives an asynchronous notification that (a) succeeded
but we treat this as a new route.

This is the ships in the night problem.  In this case if we receive
notification from the kernel about a route that we know nothing
about and we are not in startup and we are doing asic offload
then we can ignore this update.

Ticket: #2563300
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-02-07 16:10:03 -05:00