* Process FIB update in bgp_zebra_route_notify_owner() and call
group_announce_route() if route is installed
* When bgp update is received for a route which is not installed earlier
(flag BGP_NODE_FIB_INSTALLED is not set) and suppress fib is enabled
set the flag BGP_NODE_FIB_INSTALL_PENDING to indicate fib install is
pending for the route. The route will be advertised when zebra send
ZAPI_ROUTE_INSTALLED status.
* The advertisement delay (BGP_DEFAULT_UPDATE_ADVERTISEMENT_TIME)
is added to allow more routes to be sent in single update message.
This is required since zebra sends route notify message for each route.
The delay will be applied to update group timer which advertises
routes to peers.
Signed-off-by: kssoman <somanks@gmail.com>
DF (Designated forwarder) election is used for picking a single
BUM-traffic forwarded per-ES. RFC7432 specifies a mechanism called
service carving for DF election. However that mechanism has many
disadvantages -
1. LBs poorly.
2. Doesn't allow for a controlled failover needed in upgrade
scenarios.
3. Not easy to hw accelerate.
To fix the poor performance of service carving alternate DF mechanisms
have been proposed via the following drafts -
draft-ietf-bess-evpn-df-election-framework
draft-ietf-bess-evpn-pref-df
This commit adds support for the pref-df election mechanism which
is used as the default. Other mechanisms including service-carving
may be added later.
In this mechanism one switch on an ES is elected as DF based on the
preference value; higher preference wins with IP address acting
as the tie-breaker (lower-IP wins if pref value is the same).
Sample output
=============
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
torm-11# sh bgp l2vpn evpn es 03:00:00:00:00:01:11:00:00:01
ESI: 03:00:00:00:00:01:11:00:00:01
Type: LR
RD: 27.0.0.15:6
Originator-IP: 27.0.0.15
Local ES DF preference: 100
VNI Count: 10
Remote VNI Count: 10
Inconsistent VNI VTEP Count: 0
Inconsistencies: -
VTEPs:
27.0.0.16 flags: EA df_alg: preference df_pref: 32767
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
torm-11# sh bgp l2vpn evpn route esi 03:00:00:00:00:01:11:00:00:01
*> [4]:[03:00:00:00:00:01:11:00:00:01]:[32]:[27.0.0.15]
27.0.0.15 32768 i
ET:8 ES-Import-Rt:00:00:00:00:01:11 DF: (alg: 2, pref: 100)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
* Added vtysh cli commands and functions to set/unset bgp daemons no-rib
option during runtime and withdraw/announce routes in bgp instances
RIB from/to Zebra.
Signed-off-by: David Schweizer <dschweizer@opensourcerouting.org>
The pbra variable is already derefed in all paths to this spot
and as such we cannot be NULL at this point.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When installing rules pass by the interface name across
zapi.
This is being changed because we have a situation where
if you quickly create/destroy ephermeal interfaces under
linux the upper level protocol may be trying to add
a rule for a interface that does not quite exist
at the moment. Since ip rules actually want the
interface name ( to handle just this sort of situation )
convert over to passing the interface name and storing
it and using it in zebra.
Ticket: CM-31042
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Example configuration:
route-map SET_SR_POLICY permit 10
set sr-te color 1
!
router bgp 1
bgp router-id 1.1.1.1
neighbor 2.2.2.2 remote-as 1
neighbor 2.2.2.2 update-source lo
address-family ipv4 unicast
neighbor 2.2.2.2 next-hop-self
neighbor 2.2.2.2 route-map SET_SR_POLICY in
exit-address-family
!
!
Learned BGP routes from 2.2.2.2 are mapped to the SR-TE Policy
which is uniquely determined by the BGP nexthop (2.2.2.2 in this
case) and the SR-TE color in the route-map.
Co-authored-by: Renato Westphal <renato@opensourcerouting.org>
Co-authored-by: GalaxyGorilla <sascha@netdef.org>
Co-authored-by: Sebastien Merle <sebastien@netdef.org>
Signed-off-by: Sebastien Merle <sebastien@netdef.org>
in addition to ipv4 flowspec, ipv6 flowspec address family can configure
its own list of interfaces to monitor. this permits filtering the policy
routing only on some interfaces.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
in order to create appropriate policy route, family attribute is stored
in ipset and iptable zapi contexts. This commit also adds the flow label
attribute in iptables, for further usage.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
this commit supports [0] where ipv6 address is encoded in nexthop
attribute of nlri, and not in bgp redirect ip extended community. the
community contains only duplicate information or not.
Adding to this, because an action or a rule needs to apply to either
ipv4 or ipv6 flow, modify some internal structures so as to be aware of
which flow needs to be filtered. This work is needed when an ipv6
flowspec rule without ip addresses is mentioned, we need to know which
afi is served. Also, this work will be useful when doing redirect VRF.
[0] draft-simpson-idr-flowspec-redirect-02.txt
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This is the base patch that brings in support for Type-1 routes.
It includes support for -
- Ethernet Segment (ES) management
- EAD route handling
- MAC-IP (Type-2) routes with a non-zero ESI i.e. Aliasing for
active-active multihoming
- Initial infra for consistency checking. Consistency checking
is a fundamental feature for active-active solutions like MLAG.
We will try to levarage the info in the EAD-ES/EAD-EVI routes to
detect inconsitencies in access config across VTEPs attached to
the same Ethernet Segment.
Functionality Overview -
========================
1. Ethernet segments are created in zebra and associated with
access VLANs. zebra sends that info as ES and ES-EVI objects to BGP.
2. BGP advertises EAD-ES and EAD-EVI routes for the locally attached
ethernet segments.
3. Similarly BGP processes EAD-ES and EAD-EVI routes from peers
and translates them into ES-VTEP objects which are then sent to zebra
as remote ESs.
4. Each ES in zebra is associated with a list of active VTEPs which
is then translated into a L2-NHG (nexthop group). This is the ES
"Alias" entry
5. MAC-IP routes with a non-zero ESI use the alias entry created in
(4.) to forward traffic i.e. a MAC-ECMP is done to these remote-ES
destinations.
EAD route management (route table and key) -
============================================
1. Local EAD-ES routes
a. route-table: per-ES route-table
key: {RD=ES-RD, ESI, ET=0xffffffff, VTEP-IP)
b. route-table: per-VNI route-table
Not added
c. route-table: global route-table
key: {RD=ES-RD, ESI, ET=0xffffffff)
2. Remote EAD-ES routes
a. route-table: per-ES route-table
Not added
b. route-table: per-VNI route-table
key: {RD=ES-RD, ESI, ET=0xffffffff, VTEP-IP)
c. route-table: global route-table
key: {RD=ES-RD, ESI, ET=0xffffffff)
3. Local EAD-EVI routes
a. route-table: per-ES route-table
Not added
b. route-table: per-VNI route-table
key: {RD=0, ESI, ET=0, VTEP-IP)
c. route-table: global route-table
key: {RD=L2-VNI-RD, ESI, ET=0)
4. Remote EAD-EVI routes
a. route-table: per-ES route-table
Not added
b. route-table: per-VNI route-table
key: {RD=0, ESI, ET=0, VTEP-IP)
c. route-table: global route-table
key: {RD=L2-VNI-RD, ESI, ET=0)
Please refer to bgp_evpn_mh.h for info on how the data-structures are
organized.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Re-org only; no other code changes. This is being done to make maintanence
of MH functionality (which will have more code added to it) easy.
The code moved here was originally committed via -
'commit 50f74cf131 ("*: support for evpn type-4 route")'
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Revert "zebra: support for macvlan interfaces"
This reverts commit bf69e212fd.
Revert "doc: add some documentation about bgp evpn netns support"
This reverts commit 89b97c33d7.
Revert "zebra: dynamically detect vxlan link interfaces in other netns"
This reverts commit de0ebb2540.
Revert "bgpd: sanity check when updating nexthop from bgp to zebra"
This reverts commit ee9633ed87.
Revert "lib, zebra: reuse and adapt ns_list walk functionality"
This reverts commit c4d466c830.
Revert "zebra: local mac entries populated in correct netnamespace"
This reverts commit 4042454891.
Revert "zebra: when parsing local entry against dad, retrieve config"
This reverts commit 3acc394bc5.
Revert "bgpd: evpn nexthop can be changed by default"
This reverts commit a2342a2412.
Revert "zebra: zvni_map_to_vlan() adaptation for all namespaces"
This reverts commit db81d18647.
Revert "zebra: add ns_id attribute to mac structure"
This reverts commit 388d5b438e.
Revert "zebra: bridge layer2 information records ns_id where bridge is"
This reverts commit b5b453a2d6.
Revert "zebra, lib: new API to get absolute netns val from relative netns val"
This reverts commit b6ebab34f6.
Revert "zebra, lib: store relative default ns id in each namespace"
This reverts commit 9d3555e06c.
Revert "zebra, lib: add an internal API to get relative default nsid in other ns"
This reverts commit 97c9e7533b.
Revert "zebra: map vxlan interface to bridge interface with correct ns id"
This reverts commit 7c990878f2.
Revert "zebra: fdb and neighbor table are read for all zns"
This reverts commit f8ed2c5420.
Revert "zebra: zvni_map_to_svi() adaptation for other network namespaces"
This reverts commit 2a9dccb647.
Revert "zebra: display interface slave type"
This reverts commit fc3141393a.
Revert "zebra: zvni_from_svi() adaptation for other network namespaces"
This reverts commit 6fe516bd4b.
Revert "zebra: importation of bgp evpn rt5 from vni with other netns"
This reverts commit 28254125d0.
Revert "lib, zebra: update interface name at netlink creation"
This reverts commit 1f7a68a2ff.
Signed-off-by: Pat Ruddy <pat@voltanet.io>
During perf testing of receiving and installing 7.5 million
routes into zebra it was noticed that memset in bgp_zebra_announce
was taking ~11% of the runtime. With this change bgp_zebra_announce
now no longer has any appreciable time spent in memset as reported
by perf. In addition bgp_zebra_announce run time in perf was
reduced by a composite amount.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This is the bulk part extracted from "bgpd: Convert from `struct
bgp_node` to `struct bgp_dest`". It should not result in any functional
change.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Problem reported that in many circumstances, RAs created in the
process of bringing up numbered IPv6 peers with extended-nexthop
capability enabled (for ipv4 over ipv6) were not stopped on the
interface when those peers were deleted. Found several circumstances
where this occurred and fix them in this patch.
Ticket: CM-26875
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
This macro is undefined if vnc is disabled, and while it defaults to 0,
this is still wrong and causes issues with -Werror
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Support configurable options to control how link bandwidth is handled
by the receiver. The default behavior is to automatically honor the
link bandwidths received and use it to perform a weighted ECMP BUT only
if all paths in the multipath have associated link bandwidth; if one or
more paths do not have link bandwidth, normal ECMP is performed among
the multipaths. This behavior is as recommended by
https://tools.ietf.org/html/draft-ietf-idr-link-bandwidth.
The additional options available are to (a) completely ignore any link
bandwidth (i.e., weighted ECMP is effectively disabled), (b) skip paths
in the multipath which do not have link bandwidth and perform weighted
ECMP among the other paths (if at least some paths have the bandwidth)
or (c) use a default weight (value chosen is 1) for the paths which
do not have link bandwidth.
The command syntax is
bgp bestpath bandwidth <ignore|skip-missing|default-weight-for-missing>
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Perform weighted ECMP if the multipaths have link bandwidth. This involves
assigning weights to each of the next hops associated with the prefix based
on the link bandwidth of the corresponding path as a factor of the total
(cumulative) link bandwidth for the prefix. The weight values used are
between 1 and 100. Weights are assigned only if all paths in the multipath
have link bandwidth, otherwise any bandwidths are ignored and regular
ECMP is performed. This is as recommended in
https://tools.ietf.org/html/draft-ietf-idr-link-bandwidth
A subsequent commit will implement additional (user-configurable) behaviors.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
Add new function `bgp_node_get_prefix()` and modify
the bgp code base to use it.
This is prep work for the struct bgp_dest rework.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Future work needs the ability to specify a
const struct prefix value. Iterate into
bgp a bit to get this started.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Some were converted to bool, where true/false status is needed.
Converted to void only those, where the return status was only false or true.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
For Graceful restart clients have to send GR capabilities
library functions are added to encode capabilities and
also for zebra to decode client capabilities.
Co-authored-by: Santosh P K <sapk@vmware.com>
Co-authored-by: Soman K S <somanks@vmware.com>
Signed-off-by: Santosh P K <sapk@vmware.com>
*Adding helper function for signalling from BGPD to ZEBRA to
enable or disable GR feature in ZEBRA depending on bgp per
peer gr configuration.
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
For evpn routes, nexthop and RMAC fileds are synced
in route add to zebra.
In case of EVPN routes display RMAC field in route add
debug log.
Reviewed By:CCR-9381
Testing Done:
BGP: nhop [1]: 27.0.0.11 if 30 VRF 26 RMAC 00:02:00:00:00:2e
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Use a per-nexthop flag to indicate the presence of labels; add
some utility zapi encode/decode apis for nexthops; use the zapi
apis more consistently.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
In L3VNI add callback parse, vrr rmac value.
For non-zero vrr mac value, use it as anycast RMAC
and svi mac as individual rmac value.
If advertise-pip is disable or vrr rmac is not present
use svi mac as anycast rmac value for all routes.
Ticket:CM-26190
Reviewed By:
Testing Done:
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>