In a variety of places we are using DAEMON_VTY_DIR, convert
to use frr_vtydir. This will allow us in a future commit
to have the -N namespace option be automatically used.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When you have compiled FRR with a large multipath number
then encoding large ecmp routes between zebra and the
routing daemons. There exists a theoritical size
of multipath that will cause the encoding to be larger
than the ZEBRA_MAX_PACKET_SIZ. In the cases where
we have allocated streams that will encode routes
then let's ensure that whatever size we have will
auto-fit what we say we can send.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
We were running into some problems where VRRP is trying to protodown
interfaces that no longer exist. While this is a minor bug in its own
right, this was crashing Zebra because Zebra was not doing a null check
after its ifindex lookup.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
The re->uptime usage of time(NULL) leaves it open to
timing changes from outside influence. Switching
to monotime allows us to ensure that we have a timestamp
that is always increasing.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When we switched to a pthread per client, we lost the ability to
correlate zapi message debugs with their handlers in zlog, because the
message was logged when it was read off the zapi socket and not right
before it was processed. Move the zapi msg hexdump to happen right
before we call the message handler.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
The current code path of registration does this:
a) Lookup or create the rnh
b) register the client with the rnh for callback
If this is a new rnh send a response to the client that
only includes the rnh data that it has (nothing so no path)
If this is a existing rnh send the actual path to the client,
if it exists.
c) If a new client or a flag has changed refigure and send result
to all clients.
This is problematic in that suppose the rnh is new. Clients
will receive two answers:
1) A call back with no nexthops
2) A call back with the resolved # of nexthops
Imagine pim who depends on nht to handle this, pim will create
a mroute( because it does a hard lookup of the rpf as it is registering
the nexthop ), then it will receive the first callback causing
it to tear down the mroute and then receive the second callback
causing it to put it right back.. This is obviously not very
good for mroutes.
This code moves the send to the new client till after the new
client has connected, thus only allowing one callback to the new
client with the actual answer.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Routing protocols are allowed ( and even encouraged ) to modify
the flags that influence the nexthop tracking. As such when
we modify the tracking of a nexthop to go from, say, connected force
or not we must re-evaluate the nexthop and send the results
up to the interested parties.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ensure that the next hop's VRF is used for IPv4 and IPv6 unicast routes
sourced from EVPN routes, for next hop and Router MAC tracking and
install. This way, leaked routes from other instances are handled properly.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
In the case of EVPN symmetric routing, the tenant VRF is associated with
a VNI that is used for routing and commonly referred to as the L3 VNI or
VRF VNI. Corresponding to this VNI is a VLAN and its associated L3 (IP)
interface (SVI). Overlay next hops (i.e., next hops for routes in the
tenant VRF) are reachable over this interface. Howver, in the model that
is supported in the implementation and commonly deployed, there is no
explicit Overlay IP address associated with the next hop in the tenant
VRF; the underlay IP is used if (since) the forwarding plane requires
a next hop IP. Therefore, the next hop has to be explicit flagged as
onlink to cause any next hop reachability checks in the forwarding plane
to be skipped.
https://tools.ietf.org/html/draft-ietf-bess-evpn-prefix-advertisement
section 4.4 provides additional description of the above constructs.
Use existing mechanism to specify the nexthops as onlink when installing
these routes from bgpd to zebra and get rid of a special flag that was
introduced for EVPN-sourced routes. Also, use the onlink flag during next
hop validation in zebra and eliminate other special checks.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
In the case of EVPN symmetric routing, the tenant VRF is associated with
a VNI that is used for routing and commonly referred to as the L3 VNI or
VRF VNI. Corresponding to this VNI is a VLAN and its associated L3 (IP)
interface (SVI). Overlay next hops (i.e., next hops for routes in the
tenant VRF) are reachable over this interface.
https://tools.ietf.org/html/draft-ietf-bess-evpn-prefix-advertisement
section 4.4 provides additional description of the above constructs.
Use the L3 interface exchanged between zebra and bgp in route install.
This patch in conjunction with the earlier one helps to eliminate some
special code in zebra to derive the next hop's interface.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
In Asymmetric and symetric routing scenario in EVPN
where each VTEP pair having different set of addresses
for the SVIs.
This knob allows reachability (ping connectivity) of
SVI IPs and resolve ARP resoultion VTEPs across racks.
This knob should not be used when same SVI IPs configured
on VTEPs across racks or when advertise default gateway
is configured.
Ticket:CM-23782
Testing Done:
Bring up EVPN symmetric routing topology with different
SVI IPs on different VTEPs. Enable advertise svi ip
at each VTEP, remote VTEPs installs arp entry for
SVI IPs via EVPN type-2 route exchange.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
The restricting of data about interfaces was both inconsistent
in application and allowed protocol developers to get into states where
they did not have the expected data about an interface that they
thought that they would. These restrictions and inconsistencies
keep causing bugs that have to be sorted through.
The latest iteration of this bug was that commit:
f20b478ef3
Has caused pim to not receive interface up notifications( but
it knows the interface is back in the vrf and it knows the
relevant ip addresses on the interface as they were changed
as part of an ifdown/ifup cycle ).
Remove this restriction and allow the interface events to
be propagated to all clients.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The client_list should be owned by the zebra_router data structure
as that it is part of global state information.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The onlink attribute was being passed from upper level protocols
as an attribute of the route *not* the individual nexthop. When
we pass this data to the kernel, we treat the onlink as a attribute
of the nexthop. This commit modifies the code base to allow
us to pass the ONLINK attribute as an attribute of the nexthop.
This commit also fixes static routes that have multiple nexthops
some onlink and some not.
ip route 4.5.6.7/32 192.168.41.1 eveth1 onlink
ip route 4.5.6.7/32 192.168.42.2
S>* 4.5.6.7/32 [1/0] via 192.168.41.1, eveth1 onlink, 00:03:04
* via 192.168.42.2, eveth2, 00:03:04
sharpd@robot ~/frr2> sudo ip netns exec EVA ip route show
4.5.6.7 proto 196 metric 20
nexthop via 192.168.41.1 dev eveth1 weight 1 onlink
nexthop via 192.168.42.2 dev eveth2 weight 1
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Favor usage of the afi_t enumeration to identify address-families
over using the classic AF_INET[6] constants for that. The choice to
use either of the two seems to be mostly arbitrary throughout our
code base, which leads to confusion and bugs like the one fixed by
commit 6f95d11a1. To address this problem, favor usage of the afi_t
enumeration whenever possible, since 1) it's an enumeration (helps
the compilers to catch some bugs), 2) has a safi_t sibling and 3)
can be used to index static arrays. AF_INET[6] should then be used
only when interfacing with the kernel or external libraries like
libc. The family2afi() and afi2family() functions can be used to
convert between the two different representations back and forth.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Add a new field in the ZEBRA_CAPABILITIES zapi message specifying
the VRF backend in use.
For simplicity, make the zclient code call vrf_configure_backend()
to apply the received value automatically instead of requiring
the daemons to do that themselves in their zebra_capabilities()
callbacks.
Additionally, call zebra_vrf_update_all() only after sending the
capabilities message to the client, so that it will know which VRF
backend is in use when processing the VRF messages.
This commit fixes a couple of bugs in the "interface" CLI command and
associated northbound callbacks, which behave differently depending
on the VRF backend in use. Before this commit, the vrf_backend
variable would always be set to VRF_BACKEND_NETNS in the client
daemons, even when zebra was started without the --vrfwnetns option.
This could lead to inconsistent behavior and subtle bugs under
specific circumstances.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
We were sending ZEBRA_INTERFACE_LINK_PARAMS messages under the
following circumstances:
* New interface was created (via kernel or config);
* Interface went from down to up;
* Update in the link-params configuration.
Now also send ZEBRA_INTERFACE_LINK_PARAMS messages whenever a zclient
connects and sends a ZEBRA_INTERFACE_ADD request. Without this fix,
the client daemons don't receive interface link parameters if they
are configured in the zebra startup configuration.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
client->ifinfo is a VRF bitmap, hence we need to use
vrf_bitmap_check() to check if a client is subscribed to receive
interface information for a particular VRF. Just checking if
the client->ifinfo value is set will always succeed since it's
a pointer initialized by zserv_client_create(). With this fix,
we'll stop sending interface messages from all VRFs to all clients,
even those that didn't subscribe to it.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Routes without nexthops don't make any sense, so we need to reject
them otherwise weird things can happen.
NOTE: blackhole routes aren't nexthop-less, they do have a single
nexthop of type NEXTHOP_TYPE_BLACKHOLE.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Some daemons like ospfd and isisd have the ability to advertise a
default route to their peers only if one exists in the RIB. This
is what the "default-information originate" commands do when used
without the "always" parameter.
For that to work, these daemons use the ZEBRA_REDISTRIBUTE_DEFAULT_ADD
message to request default route information to zebra. The problem
is that this message didn't have an AFI parameter, so a default route
from any address-family would satisfy the requests from both daemons
(e.g. ::/0 would trigger ospfd to advertise a default route to its
peers, and 0.0.0.0/0 would trigger isisd to advertise a default route
to its IPv6 peers).
Fix this by adding an AFI parameter to the
ZEBRA_REDISTRIBUTE_DEFAULT_{ADD,DELETE} messages and making the
corresponding code changes.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Unlike the other interface zapi messages, ZEBRA_INTERFACE_VRF_UPDATE
identifies interfaces using ifindexes and not interface names. This
is a problem because zebra always sends ZEBRA_INTERFACE_DOWN
and ZEBRA_INTERFACE_DELETE messages before sending
ZEBRA_INTERFACE_VRF_UPDATE, and the ZEBRA_INTERFACE_DELETE callback
from all daemons set the interface index to IFINDEX_INTERNAL. Hence,
when decoding a ZEBRA_INTERFACE_VRF_UPDATE message, the interface
lookup would always fail since the corresponding interface lost
its ifindex. Example (ospfd):
OSPF: Zebra: Interface[rt1-eth2] state change to down.
OSPF: Zebra: interface delete rt1-eth2 vrf default[0] index 8 flags 11143 metric 0 mtu 1500
OSPF: [EC 100663301] INTERFACE_VRF_UPDATE: Cannot find IF 8 in VRF 0
To fix this problem, use interface names instead of ifindexes to
indentify interfaces like the other interface zapi messages do.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
This commit is the last missing piece to complete BGP LU support in bgpd. To this moment, bgpd (and zebra) supported auto label assignment only for prefixes leaked from VRFs to vpn and for MPLS SR prefixes. This adds auto label assignment to other routes types in bgpd. The following enhancements have been made:
* bgp_route.c:bgp_process_main_one() now sets implicit-null local_label to all local, aggregate and redistributed routes.
* bgp_route.c:bgp_process_main_one() now will request a label from the label pool for any prefix that loses the label for some reason (for example, when the static label assignment config is removed)
* bgp_label.c:bgp_reg_dereg_for_label() now requests labels from label pool for routes which have no associated label index
* zebra_mpls.c:zebra_mpls_fec_register() now expects both label and label_index from the calling function, one of which must be set to MPLS_INVALID_LABEL or MPLS_INVALID_LABEL_INDEX, based on this it will decide how to register the provided FEC.
Signed-off-by: Anton Degtyarev <anton@cumulusnetworks.com>
Reduce or eliminate use of global zebra_ns structs in
a couple of netlink/kernel code paths, so that those paths
can potentially be made asynch eventually.
Slide netlink_talk_info into place to remove dependency on core
zebra structs; add accessors for dplane context block
Start init of route context from zebra core re and rn structs;
start queueing and event handling for incoming route updates.
Expose netlink apis that don't rely on zebra core structs;
add parallel route-update code path using the dplane ctx;
simplest possible event loop to process queued route'
updates.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
These three data structures belong in the `zebra_router` structure
as that they do not belong in `struct zebra_ns`.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Move the rules_hash to the zrouter data structure and provide
the additional bit of work needed to lookup the rule based upon
the namespace id as well. Make the callers of functions not
care about what namespace id we are in.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Allow the modification of whether or not we will allow
BUM flooding on the vxlan bridge. To do this allow
the upper level protocol to specify via the ZEBRA_VXLAN_FLOOD_CONTROL
zapi message.
If flooding is disabled then BUM traffic will not be forwarded
to other VTEP's.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Corrections so that the BGP daemon can work with the label manager properly
through a label-manager proxy. Details:
- Correction so the BGP daemon behind a proxy label manager gets the range
correctly (-I added to the BGP daemon, to set the daemon instance id)
- For the BGP case, added an asynchronous label manager connect command so
the labels get recycled in case of a BGP daemon reconnection. With this,
BGPd and LDPd would behave similarly.
Signed-off-by: F. Aragon <paco@voltanet.io>