EVPN nexthops are installed as remote neighs by zebra. This was earlier
done only via VRF IPvX uni routes imported from EVPN routes.
With EVPN-MH these VRF routes now reference a L3NHG which is setup based
on the EAD and doesn't include the RMAC. To workaround that BGP now
consolidates and maintains EVPN nexthops which are then sent to zebra.
zebra sets up these nexthops as L3-VNI nh entries using a dummy type-1
route as reference.
Ticket: CM-31398
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Add new status for Vertex, Edge and Subnet to manage their
respective states in the data base.
Add new functions:
- to register/unregister server and client
- to show content of the Database (VTY and Json output)
- to update and compare subnets
- to clean vertex and ted from ORPHAN elements
- to convert message or stream into a Link State Element and update
Link State Database accordingly to message event
Change Edge and Vertex key computation by using the host order systematically.
This impact mostly key based on IPv4 addresses where `ntohl()` function must
be used when searching a Vertex or Edge by key.
Update the documentation accordingly
Signed-off-by: Olivier Dugeon <olivier.dugeon@orange.com>
like it has been done for iptable contexts, a zebra dplane context is
created for each ipset/ipset entry event. The zebra_dplane_ctx job is
then enqueued and processed by separate thread. Like it has been done
for zebra_pbr_iptable context, the ipset and ipset entry contexts are
encapsulated into an union of structures in zebra_dplane_ctx.
There is a specificity in that when storing ipset_entry structure, there
was a backpointer pointer to the ipset structure that is necessary
to get some complementary information before calling the hook. The
proposal is to use an ipset_entry_info structure next to the ipset_entry,
in the zebra_dplane context. That information is used for ipset_entry
processing. The ipset name and the ipset type are the only fields
necessary.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The raw zapi apis to encode and decode NHGs don't need to be
public; also add a little more validity-checking.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
The re->flags and re->status in debugs were being dumped as hex values.
I can never quickly decode this. Here is an idea. Let's let FRR do
it for me.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add an API that allows IGP client daemons to register/unregister
RLFAs with ldpd.
IGP daemons need to be able to query the LDP labels needed by RLFAs
and monitor label updates that might affect those RLFAs. This is
similar to the NHT mechanism used by bgpd to resolve and monitor
recursive nexthops.
This API is based on the following ZAPI opaque messages:
* LDP_RLFA_REGISTER: used by IGP daemons to register an RLFA with ldpd.
* LDP_RLFA_UNREGISTER_ALL: used by IGP daemons to unregister all of
their RLFAs with ldpd.
* LDP_RLFA_LABELS: used by ldpd to send RLFA labels to the registered
clients.
For each RLFA, ldpd needs to return the following labels:
* Outer label(s): the labels advertised by the adjacent routers to
reach the PQ node;
* Inner label: the label advertised by the PQ node to reach the RLFA
destination.
For the inner label, ldpd automatically establishes a targeted
neighborship with the PQ node if one doesn't already exist. For that
to work, the PQ node needs to be configured to accept targeted hello
messages. If that doesn't happen, ldpd doesn't send a response to
the IGP client daemon which in turn won't be able to activate the
previously computed RLFA.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Removing the obsolete ldp-sync periodic 'hello' message.
When ldp-sync is configured, IGPs take action if the LDP process goes down.
The IGPs have been updated to use the zapi client close callback to detect
the LDP process going down.
Signed-off-by: Karen Schoener <karen@voltanet.io>
Add a bit of code that allows for opaque data to be
sent from an upper level protocol to zebra. This is just
pass through data that will be used as part of displaying
useful data about a route in a `show ip route` command
in future commits.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The `enum zclient_send_status` enum needs to be extended
throughout the code base to use the new states and
to fix up places where we tested against the return
value being non zero.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add a `enum zclient_send_status` for appropriate handling
of return codes from zclient_send_message. Touch all the places
where we handle this.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When FRR sends data over the ZAPI protocol from the upper levels to zebra, indicate
to the calling functions that we have started buffering data to be sent if the
socket is full underneath it.
Also add a call back function `zebra_buffer_write_ready` that we can call
when an upper level protocol's socket buffer has been drained.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The linux kernel is getting RTM_F_OFFLOAD_FAILED for kernel routes
that have failed to offload. Write the code
to receive these notifications from the linux kernel
and store that data for display about the routes.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Issue:
The bgp routes learnt from peers which are not installed in kernel are
advertised to peers. This can cause routers to send traffic to these
destinations only to get dropped. The fix is to provide a configurable
option "bgp suppress-fib-pending". When the option is enabled, bgp will
advertise routes only if it these are successfully installed in kernel.
Fix (Part1) :
* Added message ZEBRA_ROUTE_NOTIFY_REQUEST used by client to request
FIB install status for routes
* Added AFI/SAFI to ZAPI messages
* Modified the functions zapi_route_notify_decode(), zsend_route_notify_owner()
and route_notify_internal() to include AFI, SAFI as parameters
Signed-off-by: kssoman <somanks@gmail.com>
DF (Designated forwarder) election is used for picking a single
BUM-traffic forwarded per-ES. RFC7432 specifies a mechanism called
service carving for DF election. However that mechanism has many
disadvantages -
1. LBs poorly.
2. Doesn't allow for a controlled failover needed in upgrade
scenarios.
3. Not easy to hw accelerate.
To fix the poor performance of service carving alternate DF mechanisms
have been proposed via the following drafts -
draft-ietf-bess-evpn-df-election-framework
draft-ietf-bess-evpn-pref-df
This commit adds support for the pref-df election mechanism which
is used as the default. Other mechanisms including service-carving
may be added later.
In this mechanism one switch on an ES is elected as DF based on the
preference value; higher preference wins with IP address acting
as the tie-breaker (lower-IP wins if pref value is the same).
Sample output
=============
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
torm-11# sh bgp l2vpn evpn es 03:00:00:00:00:01:11:00:00:01
ESI: 03:00:00:00:00:01:11:00:00:01
Type: LR
RD: 27.0.0.15:6
Originator-IP: 27.0.0.15
Local ES DF preference: 100
VNI Count: 10
Remote VNI Count: 10
Inconsistent VNI VTEP Count: 0
Inconsistencies: -
VTEPs:
27.0.0.16 flags: EA df_alg: preference df_pref: 32767
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
torm-11# sh bgp l2vpn evpn route esi 03:00:00:00:00:01:11:00:00:01
*> [4]:[03:00:00:00:00:01:11:00:00:01]:[32]:[27.0.0.15]
27.0.0.15 32768 i
ET:8 ES-Import-Rt:00:00:00:00:01:11 DF: (alg: 2, pref: 100)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Add the zapi code for encoding/decoding of backup nexthops for when
we are ready for it, but disable it for now so that we revert
to the old way with them.
When zebra gets a proto-NHG with a backup in it, we early fail and
tell the upper level proto. In this case sharpd. Sharpd then reverts
to the old way of installation with the route.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Align the zapi NHG apis to be more consistent with the zapi_route
apis. Add a struct zapi_nhg to use for encodings as well.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Determine the NHG ID spacing and lower bound with ZEBRA_ROUTE_MAX
in macros.
Directly set the upperbound to be the lower 28bits of the uint32_t ID
space (the top 4 are reserved for l2-NHGs). Round that number down
a bit to make it more even.
Convert all former lower_bound calls to just use the macro.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add a command/functionality to only install proto-based nexthops.
That is nexthops owned/created by upper level protocols, not ones
implicitly created by zebra.
There are some scenarios where you would not want zebra to be
arbitrarily installing nexthop groups and but you still want
to use ones you have control over via lib/nexthop_group config
and an upper level protocol.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Modify the send down of a route to use the nexthop group id
if we have one associated with the route.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add the ability to send a NHG from an upper level protocol down to
zebra. ZAPI_NHG_ADD encompasses both the addition and replace
semantics ( If the id passed down does not exist yet, it's Add,
else it's a replace ).
Effectively zebra will take this nhg passed down save the nhg
in the id hash for nhg's and then create the appropriate nhg's
and finally install them into the linux kernel. Notification
will be the ZAPI_NHG_NOTIFY_OWNER zapi message for normal
success/failure messaging to the installing protocol.
This work is being done to allow us to work with EVPN MH
which needs the ability to modify NHG's that BGP will own
and operate on.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add new function zclient_get_nhg_start that will allow an
upper level protocol to get a starting point for it's own
nhg space. Give each protocol a space of 50 million.
zebra will own the space from 0 - 199999999 because
of SYSTEM, KERNEL and CONNECT route types.
This is the start of some work that will allow upper
level protocols to install and maintain their own NHG's.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The linux kernel is getting RTM_F_TRAP and RTM_F_OFFLOAD for
kernel routes that have an underlying asic offload. Write the
code to receive these notifications from the linux kernel and
to store that data for display about the routes.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When installing rules pass by the interface name across
zapi.
This is being changed because we have a situation where
if you quickly create/destroy ephermeal interfaces under
linux the upper level protocol may be trying to add
a rule for a interface that does not quite exist
at the moment. Since ip rules actually want the
interface name ( to handle just this sort of situation )
convert over to passing the interface name and storing
it and using it in zebra.
Ticket: CM-31042
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The sorting for zapi nexthops in zapi routes needs to match
the sorting of nexthops done in zebra. Ensure all zapi_nexthop
attributes are included in the sort.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
We can make the Linux kernel send an ARP/NDP request by adding
a neighbour with the 'NUD_INCOMPLETE' state and the 'NTF_USE' flag.
This commit adds new dataplane operation as well as new zapi message
to allow other daemons send ARP/NDP requests.
Signed-off-by: Jakub Urbańczyk <xthaid@gmail.com>
For the sake of Segment Routing (SR) and Traffic Engineering (TE)
Policies there's a need for additional infrastructure within zebra.
The infrastructure in this PR is supposed to manage such policies
in terms of installing binding SIDs and LSPs. Also it is capable of
managing MPLS labels using the label manager, keeping track of
nexthops (for resolving labels) and notifying interested parties about
changes of a policy/LSP state. Further it enables a route map mechanism
for BGP and SR-TE colors such that learned BGP routes can be mapped
onto SR-TE Policies.
This PR does not introduce any usable features by now, it is just
infrastructure for other upcoming PRs which will introduce 'pathd',
a new SR-TE daemon.
Co-authored-by: Renato Westphal <renato@opensourcerouting.org>
Co-authored-by: GalaxyGorilla <sascha@netdef.org>
Signed-off-by: Sebastien Merle <sebastien@netdef.org>
1. BGP informs zebra if a MAC-IP is a SYNC path and if it active on the
ES peer.
2. Zebra sends paths that are "local-inactive" with the proxy flag to
BGP.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
1. Local ethernet segments are configured in zebra by attaching a
local-es-id and sys-mac to a access interface -
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
!
interface hostbond1
evpn mh es-id 1
evpn mh es-sys-mac 00:00:00:00:01:11
!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
This info is then sent to BGP and used for the generation of EAD-per-ES
routes.
2. Access VLANs associated with an (ES) access port are translated into
ES-EVI objects and sent to BGP. This is used by BGP for the
generation of EAD-EVI routes.
3. Remote ESs are imported by BGP and sent to zebra. A list of VTEPs
is maintained per-remote ES in zebra. This list is used for the creation
of the L2-NHG that is used for forwarding traffic.
4. MAC entries with a non-zero ESI destination use the L2-NHG associated
with the ESI for forwarding traffic over the VxLAN overlay.
Please see zebra_evpn_mh.h for the datastruct organization details.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Initial changes to support a nexthop with multiple backups. Lib
changes to hold a small array in each primary, zapi message
changes to support sending multiple backups, and daemon
changes to show commands to support multiple backups. The config
input for multiple backup indices is not present here.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
* add a vrf sub-command `[no] ipv6 router-id X:X::X:X`.
* add command `[no] ipv6 router-id X:X::X:X [vrf NAME]` for backward
compatibility.
* add a vrf sub-command `[no] ip router-id A.B.C.D` and make the old
one without `ip` an alias for it.
* add a command `[no] ip router-id A.B.C.D [vrf NAME]` for backward
comptibility and make the old one without `ip` an alias for it.
* add command `show ip router-id [vrf NAME]` and make
the old one without `ip` an alias for it.
* add command `show ipv6 router-id [vrf NAME]`.
* add ZAPI commands `ZEBRA_ROUTER_ID_V6_ADD`,
`ZEBRA_ROUTER_ID_V6_DELETE` and `ZEBRA_ROUTER_ID_V6_UPDATE`
for deamons to get notified of the IPv6 router-id.
* update zebra documentation.
Signed-off-by: Sebastien Merle <sebastien@netdef.org>
Start modifying the OPAQUE zapi message to include optional
unicast destination zapi client info. Add a 'decode' api and
opaque msg struct to encapsulate that optional info.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Change name of an opaque zapi api to 'decode' to align with the
other zapi message parsing apis. Missed that in the original
opaque commits.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Add a zapi message type designed to carry opaque data. Add
'send' api, and prototype for client handler function. Also
add registration/unreg messages, so that clients can 'subscribe'
to receive these messages as they're passing through zebra.
Signed-off-by: Mark Stapp <mjs@voltanet.io>