`gate_buf` should be big enough to hold IPv6 addresses and `inet_ntop`
should be run in the correct `sockaddr` struct member.
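A minimal sketch of the idea, not the actual zebra code: the helper name `toy_gate_to_str` is hypothetical, and it assumes the gateway arrives as a generic `struct sockaddr`. The buffer is sized with `INET6_ADDRSTRLEN` so IPv6 text always fits, and `inet_ntop` reads from the struct member that matches the address family.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: format the address stored in a generic sockaddr.
 * The caller should pass a buffer of at least INET6_ADDRSTRLEN bytes. */
const char *toy_gate_to_str(const struct sockaddr *sa, char *buf,
                            socklen_t len)
{
    assert(sa && buf && len > 0);
    memset(buf, 0, len); /* leave an empty string on failure */

    switch (sa->sa_family) {
    case AF_INET:
        /* IPv4: the address lives in sin_addr */
        return inet_ntop(AF_INET,
                         &((const struct sockaddr_in *)sa)->sin_addr,
                         buf, len);
    case AF_INET6:
        /* IPv6: the address lives in sin6_addr */
        return inet_ntop(AF_INET6,
                         &((const struct sockaddr_in6 *)sa)->sin6_addr,
                         buf, len);
    default:
        return NULL; /* unknown family: nothing to format */
    }
}
```

A caller would declare `char gate_buf[INET6_ADDRSTRLEN];` and pass `sizeof(gate_buf)` as the length.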
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Debug messages should use `prefix_buf` and `prefix2str` should only be
called once in `kernel_rtm`.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Add a new field in the ZEBRA_CAPABILITIES zapi message specifying
the VRF backend in use.
For simplicity, make the zclient code call vrf_configure_backend()
to apply the received value automatically instead of requiring
the daemons to do that themselves in their zebra_capabilities()
callbacks.
Additionally, call zebra_vrf_update_all() only after sending the
capabilities message to the client, so that it will know which VRF
backend is in use when processing the VRF messages.
This commit fixes a couple of bugs in the "interface" CLI command and
associated northbound callbacks, which behave differently depending
on the VRF backend in use. Before this commit, the vrf_backend
variable would always be set to VRF_BACKEND_NETNS in the client
daemons, even when zebra was started without the --vrfwnetns option.
This could lead to inconsistent behavior and subtle bugs under
specific circumstances.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
We were sending ZEBRA_INTERFACE_LINK_PARAMS messages under the
following circumstances:
* New interface was created (via kernel or config);
* Interface went from down to up;
* Update in the link-params configuration.
Now also send ZEBRA_INTERFACE_LINK_PARAMS messages whenever a zclient
connects and sends a ZEBRA_INTERFACE_ADD request. Without this fix,
the client daemons don't receive interface link parameters if they
are configured in the zebra startup configuration.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
client->ifinfo is a VRF bitmap, hence we need to use
vrf_bitmap_check() to check if a client is subscribed to receive
interface information for a particular VRF. Just checking if
the client->ifinfo value is set will always succeed since it's
a pointer initialized by zserv_client_create(). With this fix,
we'll stop sending interface messages from all VRFs to all clients,
even those that didn't subscribe to it.
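The difference between the two checks can be sketched with toy types. Everything prefixed `toy_` below is hypothetical and only stands in for the real zserv client and VRF bitmap code, whose layout is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_MAX_VRF 64 /* illustrative limit only */

/* Toy stand-in for a VRF bitmap: one bit per VRF id. */
struct toy_bitmap {
    uint64_t bits;
};

struct toy_client {
    struct toy_bitmap *ifinfo; /* always allocated at client creation */
};

struct toy_client *toy_client_create(void)
{
    struct toy_client *c = calloc(1, sizeof(*c));
    assert(c);
    c->ifinfo = calloc(1, sizeof(*c->ifinfo));
    assert(c->ifinfo);
    return c;
}

void toy_bitmap_set(struct toy_bitmap *bm, unsigned vrf_id)
{
    assert(bm && vrf_id < TOY_MAX_VRF);
    bm->bits |= UINT64_C(1) << vrf_id;
}

bool toy_bitmap_check(const struct toy_bitmap *bm, unsigned vrf_id)
{
    return vrf_id < TOY_MAX_VRF && ((bm->bits >> vrf_id) & 1);
}

/* Buggy check: the pointer is never NULL, so this is always true. */
bool toy_subscribed_buggy(const struct toy_client *c, unsigned vrf_id)
{
    (void)vrf_id;
    return c->ifinfo != NULL;
}

/* Fixed check: consult the bit for the VRF in question. */
bool toy_subscribed_fixed(const struct toy_client *c, unsigned vrf_id)
{
    return toy_bitmap_check(c->ifinfo, vrf_id);
}
```

With a client subscribed only to VRF 3, the buggy check still reports a subscription for VRF 5, while the fixed check does not.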
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Routes without nexthops don't make any sense, so we need to reject
them otherwise weird things can happen.
NOTE: blackhole routes aren't nexthop-less, they do have a single
nexthop of type NEXTHOP_TYPE_BLACKHOLE.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Some daemons like ospfd and isisd have the ability to advertise a
default route to their peers only if one exists in the RIB. This
is what the "default-information originate" commands do when used
without the "always" parameter.
For that to work, these daemons use the ZEBRA_REDISTRIBUTE_DEFAULT_ADD
message to request default route information to zebra. The problem
is that this message didn't have an AFI parameter, so a default route
from any address-family would satisfy the requests from both daemons
(e.g. ::/0 would trigger ospfd to advertise a default route to its
peers, and 0.0.0.0/0 would trigger isisd to advertise a default route
to its IPv6 peers).
Fix this by adding an AFI parameter to the
ZEBRA_REDISTRIBUTE_DEFAULT_{ADD,DELETE} messages and making the
corresponding code changes.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Future commits are going to introduce more rigor in
state setting in the case of received results from
the data plane. So let us move the DPLANE_OP_ROUTE_DELETE
state check to the same spot as the rest of the code that
is handling a particular operation.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Modify the status flag from 8 bits to 32 bits and add
a few new flags that will be used in future commits.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Modify the meta_queue insertion such that we only enqueue
the route_node into one meta_queue instead of several.
Suppose we have multiple route_entries associated with
a particular node from rip, bgp, staticd. If we receive a
route update from rip, we would enqueue the route_node into
the 1, 2, 3 meta-nodes. Which means that we would run
the entire process of figuring out a route 3 times, while
nothing would change the second two times.
Modify the code to choose the lowest meta-queue and
install it into that one for processing.
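The selection can be sketched as picking the minimum queue index over all route entries on the node. The route types and the queue numbers in `toy_meta_queue_map` are assumptions for illustration and do not reproduce zebra's real table.

```c
#include <assert.h>
#include <stddef.h>

enum toy_route_type {
    TOY_ROUTE_STATIC,
    TOY_ROUTE_RIP,
    TOY_ROUTE_BGP,
    TOY_ROUTE_MAX
};

/* Illustrative mapping from route type to meta-queue index. */
static const unsigned toy_meta_queue_map[TOY_ROUTE_MAX] = {
    [TOY_ROUTE_STATIC] = 1,
    [TOY_ROUTE_RIP] = 2,
    [TOY_ROUTE_BGP] = 3,
};

/* Return the single meta-queue the node should be enqueued into:
 * the lowest index among all route entries attached to the node. */
unsigned toy_pick_meta_queue(const enum toy_route_type *entries, size_t n)
{
    assert(entries && n > 0);

    unsigned best = toy_meta_queue_map[entries[0]];
    for (size_t i = 1; i < n; i++) {
        unsigned q = toy_meta_queue_map[entries[i]];
        if (q < best)
            best = q;
    }
    return best;
}
```

The node is then queued once, in the returned queue, rather than once per route entry.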
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When a dataplane provider/plugin registers, return the new
handle/object - that's needed to use some provider apis
later on.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Pass lists of results back to zebra from the dataplane subsystem
(and pthread). This helps reduce the lock/unlock cycles when
zebra is busy. Also remove a couple of typedefs that made their
way into the dataplane header file - those violate the FRR style
guidelines.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
If the default vrf name is manually set by passing the -o parameter to
zebra, this should be detected when walking the list of netns
available in the system. If a netns called vrf0 is present, then it
should be ignored.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When zebra runs with the vrf netns backend, the netns discovery
parser runs before the default vrf is forced to a given
value. In that case, there is a possibility that the forced '-o' option
will create a second vrf with the same name, whereas this option should be
there to define a single default vrf with that value.
To make things consistent, the forced value is prioritized. Then, the
notifier will attempt to create vrf contexts. The expectation is that
the creation will fail, due to an already present vrf with the same name.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
With an empty netmask a wrong end size is calculated. Let's handle this
corner case to avoid spurious warning messages.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Handle corner case where a warning log message is issued on interface
address netmask handling with sockaddr type AF_LINK: it may come empty
or with match all (all 0xFF).
In the first case all lengths are zero and we only need to copy the
first bytes; in the second case it comes with a zero index and all 0xFF bytes.
In any case we only need to figure out a few of the first bytes instead
of all data.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
When porting routing socket macro data handling to functions, the
attribute function was forgotten. The only difference between the
attribute and address handler is the family type check.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
"brief" output for "show interface" helps when we have to quickly check
important information like ip address, vrf etc. This prints
information in the easy to read tabular format. Currently it prints oper
status, ifname, vrf, ipv4 and ipv6 addresses.
Ticket: CM-9109
Signed-off-by: Nitin Soni <nsoni@cumulusnetworks.com>
For neigh check duplicate flag as it can be inherited from
duplicate detected MAC (count could be 0).
Ticket:CM-23316
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Below are cases where EVPN duplicate detection
Freeze and Unfreeze required fixes:
Auto recovery needs to check the neighbor's duplicate flag
to take action, as a neigh could be marked duplicate
by inheriting the flag from its MAC, where the IP detection count could be 0.
MAC duplicate detection needs to set flag to true
if freeze action is configured.
Local MAC add update should not send update to bgp
if MAC is in frozen state.
Remote MAC-IP update should not process neigh update if MAC
is detected as duplicate during remote update.
Ticket:CM-23344
Testing Done:
Trigger duplicate detection via both local and remote update trigger,
Validate clear command and other changes expected behavior.
Auto-recovery takes appropriate action on inherited IPs.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Add the ability to retrieve the current role of mlag for this machine.
If mlag is not setup we will always return MLAG_ROLE_NONE.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The zebra_delete_rnh function does not need to be exposed
to the entire world. Limit its scope.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The deletion of a rnh is always preceded by the same checks
to see if it is done. Just let zebra_delete_rnh do this test.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When we call zebra_vrf_table_create, we've already created the info
pointer in zebra_router_get_table, so properly set the info->safi
and just store the zvrf->table[afi][safi] value.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When handling events from /var/run/netns folder, if several netns are
removed at the same time, only the first one is deleted in the frr. Fix
this behaviour by applying continue in the loop.
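The loop change can be sketched with toy event types. The names below are hypothetical and this is not the real inotify handling code; the sketch only contrasts leaving the loop after the first deletion with continuing to the next event.

```c
#include <assert.h>
#include <stddef.h>

enum toy_ev_type { TOY_NETNS_CREATE, TOY_NETNS_DELETE };

/* Old shape: stop after the first deletion, dropping the rest of the batch. */
size_t toy_handle_batch_buggy(const enum toy_ev_type *ev, size_t n)
{
    size_t deleted = 0;

    assert(ev || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (ev[i] == TOY_NETNS_DELETE) {
            deleted++;
            return deleted; /* later events are never looked at */
        }
    }
    return deleted;
}

/* Fixed shape: handle the deletion, then move on to the next event. */
size_t toy_handle_batch_fixed(const enum toy_ev_type *ev, size_t n)
{
    size_t deleted = 0;

    assert(ev || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (ev[i] == TOY_NETNS_DELETE) {
            deleted++;
            continue; /* more deletions may follow in the same batch */
        }
        /* creation events would be handled here */
    }
    return deleted;
}
```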
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Duplicate address detection should operate
at default vrf instance.
For the mac and neigh show commands, auto recovery, and a few other places
where the tenant vrf_id was used to fetch the zvrf, use the default
vrf instance instead. Use vxlan_if's or VRF_DEFAULT vrf_id to
fetch zebra's default vrf instance.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
zebra uses the SIOCETHTOOL ioctl with the ETHTOOL_GSET command to
fetch the speed of interfaces from the kernel. The only problem is
that ETHTOOL_GSET returns EOPNOTSUPP when the given interface is a
virtual interface. This leads to zebra emitting warnings like this
at startup:
ZEBRA: IOCTL failure to read interface lo speed: 95 Operation not supported
ZEBRA: IOCTL failure to read interface dummy0 speed: 95 Operation not supported
ZEBRA: IOCTL failure to read interface ovs-system speed: 95 Operation not supported
Silence these warnings by ignoring EOPNOTSUPP errors, since we know
they are harmless. This is similar to how we handle EINVAL errors
from the BSD SIOCGIFMEDIA ioctl (commit c69f2c1ff).
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Unlike the other interface zapi messages, ZEBRA_INTERFACE_VRF_UPDATE
identifies interfaces using ifindexes and not interface names. This
is a problem because zebra always sends ZEBRA_INTERFACE_DOWN
and ZEBRA_INTERFACE_DELETE messages before sending
ZEBRA_INTERFACE_VRF_UPDATE, and the ZEBRA_INTERFACE_DELETE callback
from all daemons set the interface index to IFINDEX_INTERNAL. Hence,
when decoding a ZEBRA_INTERFACE_VRF_UPDATE message, the interface
lookup would always fail since the corresponding interface lost
its ifindex. Example (ospfd):
OSPF: Zebra: Interface[rt1-eth2] state change to down.
OSPF: Zebra: interface delete rt1-eth2 vrf default[0] index 8 flags 11143 metric 0 mtu 1500
OSPF: [EC 100663301] INTERFACE_VRF_UPDATE: Cannot find IF 8 in VRF 0
To fix this problem, use interface names instead of ifindexes to
identify interfaces like the other interface zapi messages do.
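The failure mode can be reproduced with a toy interface table. The `toy_` names are hypothetical and `TOY_IFINDEX_INTERNAL` merely stands in for IFINDEX_INTERNAL; the sketch shows why a lookup by index fails after the delete callback has run, while a lookup by name still succeeds.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_IFINDEX_INTERNAL 0 /* stand-in for IFINDEX_INTERNAL */

struct toy_interface {
    const char *name;
    int ifindex;
};

/* Mimics a daemon's interface-delete callback: the index is forgotten. */
void toy_if_delete(struct toy_interface *ifp)
{
    assert(ifp);
    ifp->ifindex = TOY_IFINDEX_INTERNAL;
}

struct toy_interface *toy_lookup_by_index(struct toy_interface *tbl,
                                          size_t n, int ifindex)
{
    for (size_t i = 0; i < n; i++)
        if (tbl[i].ifindex == ifindex)
            return &tbl[i];
    return NULL;
}

struct toy_interface *toy_lookup_by_name(struct toy_interface *tbl,
                                         size_t n, const char *name)
{
    assert(name);
    for (size_t i = 0; i < n; i++)
        if (strcmp(tbl[i].name, name) == 0)
            return &tbl[i];
    return NULL;
}
```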
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
The route_info data structure already had a mapping of route type
to admin distance. Consolidate the meta_queue_map information
into this route_info data structure. This is to reduce the number
of places we need to remember to touch when adding a new routing
protocol.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
While an EVPN type-2 entry is in freeze state during a remote update,
the remote VTEP can send a type-2 withdraw update.
Upon receiving an entry delete (withdraw), first check whether the
kernel has the entry in local reachable state. Upon
unfreeze use the local entry to advertise to peers.
The fetch is for both MAC and IP; the delete can come for
a MAC-only or a combined MAC-IP route.
A specific entry fetch only requires the request flag to be set;
the dump flag is not required.
Testing Done:
Simulate two VTEPs performing an M1, IP1 mobility sequence,
freeze the MAC during a remote MAC update, then send a
type-2 withdraw route from the originating VTEP.
This causes the read apis to be invoked for the local reachable entry.
Zebra updates its cache and upon unfreeze originates type-2.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Make the netlink_request api generic so that it can be used
for either a dump or a query for specific information.
The netlink request nlm flags (NLM_F_ROOT | NLM_F_MATCH) are
used for dumps; a client that wants to query a specific
MAC or IP using netlink_request does not need to set
them.
The nlm struct is passed by the caller of netlink_request,
which can also set the nlm request flags.
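A rough sketch of a caller choosing the flags, assuming Linux kernel headers are available. The helper `toy_fill_request` is hypothetical and is not the netlink_request signature; the flag constants and `struct nlmsghdr` come from `<linux/netlink.h>`, where NLM_F_DUMP is defined as NLM_F_ROOT | NLM_F_MATCH.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical helper: the caller fills the header and decides whether
 * this is a full dump or a query for one specific entry. */
void toy_fill_request(struct nlmsghdr *nlh, unsigned short type, bool dump)
{
    assert(nlh);
    memset(nlh, 0, sizeof(*nlh));

    nlh->nlmsg_len = NLMSG_LENGTH(0); /* header only, no payload yet */
    nlh->nlmsg_type = type;
    nlh->nlmsg_flags = NLM_F_REQUEST; /* needed in both cases */

    if (dump)
        nlh->nlmsg_flags |= NLM_F_ROOT | NLM_F_MATCH;
}
```

For example, `toy_fill_request(&nlh, RTM_GETNEIGH, false)` would prepare a header for a single-entry query, and passing `true` would prepare a dump.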
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
This commit is the last missing piece to complete BGP LU support in bgpd. Until now, bgpd (and zebra) supported auto label assignment only for prefixes leaked from VRFs to vpn and for MPLS SR prefixes. This adds auto label assignment to other route types in bgpd. The following enhancements have been made:
* bgp_route.c:bgp_process_main_one() now sets implicit-null local_label to all local, aggregate and redistributed routes.
* bgp_route.c:bgp_process_main_one() now will request a label from the label pool for any prefix that loses the label for some reason (for example, when the static label assignment config is removed)
* bgp_label.c:bgp_reg_dereg_for_label() now requests labels from label pool for routes which have no associated label index
* zebra_mpls.c:zebra_mpls_fec_register() now expects both label and label_index from the calling function, one of which must be set to MPLS_INVALID_LABEL or MPLS_INVALID_LABEL_INDEX, based on this it will decide how to register the provided FEC.
Signed-off-by: Anton Degtyarev <anton@cumulusnetworks.com>
Reduce the zebra rib workqueue retry timeout, used when the queue
towards the zebra dataplane has reached its limit. Lowering the
value was reported to improve update throughput on some platforms.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
The label processing for socket installs was not ensuring
that each nexthop would not accidentally use the last
nexthop's value.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The test we were using to ensure that a mask was sent in
is a bit redundant, let's just always send it in.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The ADD/DELETE messages are the only ones we support, so leave
early from the function, in other words don't check it every
nexthop loop.
Additionally nexthops only care about non recursive active flags.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
I'm going to rearrange the kernel_rtm_ipv4 and v6 functions
so the sin6_masklen needs to be moved a bit earlier.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The write function converted the v4 and v6 structures to a union sockunion
via casting. Just use `union sockunion` instead.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Allow the ns deletion event to happen *after* the data validity
checks.
Please note this probably still leaves a weird hole if we receive
multiple namespace events ( as the for loop implies ). We will
stop handling anything after a namespace deletion notification.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The default vrf name was hardset to "Default", whereas the default vrf
name could have been configured in another manner. Fix this
inconsistency.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The l3vni structure is allocated only once, since that structure is only
used for the default netns. For that, the initialisation part is moved
to a proper place, where there is no risk of attempting to initialise it
more than once, even when the vrf backend is netns.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When a route removal failure happens return to the installing
protocol that the route deletion failed.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
In the zebra rib processing workqueue, set a small timeout
so that we will wait a short time if the queue into the
async dataplane is full. This helps avoid a situation where
the zebra main pthread constantly retries rib work without
giving the dataplane pthread a chance to make progress.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
NEXTHOP_FLAG_ACTIVE currently means that the nexthop is considered
good enough to be installed. With current ecmp restrictions this
translation from multipath_num is enforced in the data plane.
The problem with this is of course that every data plane now
becomes concerned about the multipath num and must enforce it
independently. Currently *bsd does not honor multipath_num at
all and linux marks all nexthops as being installed even when
it honors a multipath_num that is less than the total.
This code change moves the multipath_num enforcement from a dataplane
decision to a zebra nexthop decision. Thus dataplanes now can
just install those nexthops marked as NEXTHOP_FLAG_ACTIVE
without having to worry about multipath_num.
*BSD will now respect multipath_num and Linux now properly notes
which routes are actually installed or not:
sharpd@donna ~/f/t/topotests> ps -ef | grep frr
frr 6261 1556 0 09:12 ? 00:00:00 /usr/lib/frr/zebra -e 2 --daemon -A 127.0.0.1
frr 6279 1556 0 09:12 ? 00:00:00 /usr/lib/frr/staticd --daemon -A 127.0.0.1
donna.cumulusnetworks.com(config)# do show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route
K>* 0.0.0.0/0 [0/106] via 10.0.2.2, enp0s3, 00:00:45
S>* 4.4.4.4/32 [1/0] via 10.0.2.1, enp0s3, 00:00:02
* via 192.168.209.1, enp0s8, 00:00:02
via 192.168.210.1, enp0s9 inactive, 00:00:02
C>* 10.0.2.0/24 is directly connected, enp0s3, 00:00:45
C>* 192.168.209.0/24 is directly connected, enp0s8, 00:00:45
C>* 192.168.210.0/24 is directly connected, enp0s9, 00:00:45
donna.cumulusnetworks.com(config)#
sharpd@donna ~/f/t/topotests> ip route show
default via 10.0.2.2 dev enp0s3 proto dhcp metric 106
4.4.4.4 proto 196 metric 20
nexthop via 10.0.2.1 dev enp0s3 weight 1
nexthop via 192.168.209.1 dev enp0s8 weight 1
10.0.2.0/24 dev enp0s3 proto kernel scope link src 10.0.2.15 metric 106
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
192.168.122.0/24 dev virbr0 proto kernel scope link src 192.168.122.1 linkdown
192.168.209.0/24 dev enp0s8 proto kernel scope link src 192.168.209.2 metric 105
192.168.210.0/24 dev enp0s9 proto kernel scope link src 192.168.210.2 metric 103
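The enforcement described above can be sketched as follows. This is an illustration under assumptions, not the zebra implementation: `struct toy_nexthop`, its `usable` field, and `TOY_NEXTHOP_FLAG_ACTIVE` are hypothetical stand-ins. Only the first `multipath_num` usable nexthops are marked active, so a dataplane can install exactly the marked ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NEXTHOP_FLAG_ACTIVE (1u << 0) /* stand-in for NEXTHOP_FLAG_ACTIVE */

struct toy_nexthop {
    bool usable;    /* would pass the normal reachability checks */
    unsigned flags;
};

/* Mark at most multipath_num usable nexthops as active and return
 * how many were marked. */
size_t toy_mark_active(struct toy_nexthop *nh, size_t n, size_t multipath_num)
{
    size_t active = 0;

    assert(nh || n == 0);
    for (size_t i = 0; i < n; i++) {
        nh[i].flags &= ~TOY_NEXTHOP_FLAG_ACTIVE;
        if (nh[i].usable && active < multipath_num) {
            nh[i].flags |= TOY_NEXTHOP_FLAG_ACTIVE;
            active++;
        }
    }
    return active;
}
```

With three usable nexthops and a limit of 2, the third one stays inactive, which corresponds to the `inactive` nexthop in the output above.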
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Stop creating individual, one-time events as each batch of
incoming zserv/zapi messages is processed - use a singleton
event so that the incoming message activity is more fair if
the zebra main pthread has other events to run.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
We never used this information and it was merely stored.
Additionally this is not something that is a flag, it's
a status.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Make the v4 and v6 code paths for rib_XXX calls in kernel_socket
as similar as we can possibly make them. There is no need
for code duplication at this point in time.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The rib_lookup_ipv4_route function is only used in a debug path.
It is only used for v4 and only checks that the rib
and fib are in sync (which is not needed/used/supported on other
platforms). So let's just remove it.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
For nexthop handling use the actual resolved nexthop.
Nexthops are stored as a `special` list:
Suppose we have 3 way ecmp A, B, C:
nhop A -> resolves to nhop D
|
nhop B
|
nhop C -> resolves to nhop E
A and C are typically NEXTHOP_TYPE_IPV4 (or 6) if they recursively resolve.
We do not necessarily store the ifindex that this resolves to.
Current nexthop code only loops over A,B and C and uses those for
the zebra_rnh.c handling. So interested parties might receive non-fully
resolved nexthops( and they assume they are! ).
Let's convert the looping to go over all nexthops and only deal with
the resolved ones, so we will look at and use D,B and E.
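The walk can be sketched with a toy list. The struct below is hypothetical: it only assumes that each nexthop has a sibling pointer and an optional pointer to the nexthop it resolves to, and it treats a nexthop as recursive exactly when that pointer is set.

```c
#include <assert.h>
#include <stddef.h>

struct toy_nexthop {
    char name;                    /* label used only for this example */
    struct toy_nexthop *next;     /* next sibling in the ecmp list */
    struct toy_nexthop *resolved; /* set when this nexthop is recursive */
};

/* Visit every nexthop and collect only the fully resolved ones.
 * Returns the number of names written to out (at most cap). */
size_t toy_collect_resolved(const struct toy_nexthop *head, char *out,
                            size_t cap)
{
    size_t n = 0;

    assert(out || cap == 0);
    for (const struct toy_nexthop *nh = head; nh; nh = nh->next) {
        if (nh->resolved)
            /* recursive: descend and use what it resolves to */
            n += toy_collect_resolved(nh->resolved, out + n, cap - n);
        else if (n < cap)
            out[n++] = nh->name;
    }
    return n;
}
```

For the A, B, C list drawn above, this collects D, B and E in that order.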
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The `show ip route A.B.C.D json` command was only displaying
the last route entry looked at and we would drop the data
associated with other route entries. This fixes the issue:
robot# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route
K>* 0.0.0.0/0 [0/100] via 192.168.201.1, enp3s0, 00:13:31
C>* 4.50.50.50/32 is directly connected, lo, 00:13:31
D 10.0.0.1/32 [150/0] via 192.168.201.1, enp3s0, 00:09:46
S>* 10.0.0.1/32 [1/0] via 192.168.201.1, enp3s0, 00:10:04
C>* 192.168.201.0/24 is directly connected, enp3s0, 00:13:31
robot# show ip route 10.0.0.1 json
{
"10.0.0.1\/32":[
{
"prefix":"10.0.0.1\/32",
"protocol":"sharp",
"distance":150,
"metric":0,
"internalStatus":0,
"internalFlags":1,
"uptime":"00:09:50",
"nexthops":[
{
"flags":1,
"ip":"192.168.201.1",
"afi":"ipv4",
"interfaceIndex":2,
"interfaceName":"enp3s0",
"active":true
}
]
},
{
"prefix":"10.0.0.1\/32",
"protocol":"static",
"selected":true,
"distance":1,
"metric":0,
"internalStatus":0,
"internalFlags":2064,
"uptime":"00:10:08",
"nexthops":[
{
"flags":3,
"fib":true,
"ip":"192.168.201.1",
"afi":"ipv4",
"interfaceIndex":2,
"interfaceName":"enp3s0",
"active":true
}
]
}
]
}
robot#
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When using `SIOCGIFMEDIA` check for `EINVAL`, otherwise we might print
an error message on an unsupported interface.
FreeBSD source code reference:
https://github.com/freebsd/freebsd/blob/master/sys/net/if_media.c#L300
And:
8cb4b0c018/usr.sbin/rtsold/if.c (L211)
/*
* EINVAL simply means that the interface does not support
* the SIOCGIFMEDIA ioctl. We regard it alive.
*/
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Some address types were not being skipped, triggering a warning log
message, so let's refactor this code to properly handle known and unknown
types.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Move the declaration of ROUNDUP and ROUND_TYPE to outside of
`ifdef SA_SIZE`. We'll use these definitions in the next commit.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
The clear dup address vni command needs to return a non-zero value
when the command is not successful.
Ticket:CM-23122
Testing Done:
run clear command and check upon failure return code is non-zero.
root@TORS1:~# vtysh -c "clear evpn dup-addr vni 1000 ip 45.0.1.26"
% Requested IP's associated MAC 00:01:02:03:04:05 is still in duplicate
% state
root@TORS1:~# echo $?
1
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Problem reported that kernel neighbor entries could end up in "FAILED"
state when the neighbor entry was deleted. This fix handles the
notification of the event from netlink messages and re-inserts the
deleted entry.
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Always resend the nexthop information when we get a registration
event. Multiple daemons expect this information.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Change helps display detailed output for all possible VNI neighbors
without specifying VNI and ip. It helps in troubleshooting as a single
command can be fired to capture detailed info on all VNIs.
Ticket: CM-22832
Signed-off-by: Nitin Soni <nsoni@cumulusnetworks.com>
Reviewed-by: CCR-8034
Change helps display detailed output for all possible VNI MACs without
specifying VNI or mac. It helps in troubleshooting - a single
command can be fired to capture detailed info on all VNIs.
Also fixed an existing json related bug where a json object is created by
a parent function and freed in a child function.
Ticket: CM-22832
Signed-off-by: Nitin Soni <nsoni@cumulusnetworks.com>
Reviewed-by: CCR-8028
A while ago all FRR configuration commands were converted to use the
QOBJ infrastructure to keep track of configuration objects. This
means the configuration lock isn't necessary anymore because the
QOBJ code detects when someone tries to edit a configuration object
that was deleted and reacts accordingly (logs an error and aborts the
command). The possibility of accessing dangling pointers doesn't
exist anymore since vty->index was removed.
Summary of the changes:
* remove the configuration lock and the vty_config_lockless() function.
* rename vty_config_unlock() to vty_config_exit() since we need to
clean up a few things when exiting from the configuration mode.
* rename vty_config_lock() to vty_config_enter() to remove code
duplication that existed between the three different "configuration"
commands (terminal, private and exclusive).
Configuration commands converted to the new northbound model don't
need the configuration lock either since the northbound API also
detects when someone tries to edit a configuration object that
doesn't exist anymore.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
the vrf context was not created at previous location of the call.
The call is done after vrf initialisation.
PR=61513
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
the netns discovery process executed when vrf backend is netns, allows
the zebra daemon to dynamically change the default vrf name value. This
option is disabled, when the zebra is forced to a default vrf value with
option -o.
PR=61513
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Change helps display detailed output for all possible VNIs without
specifying VNI. It helps in troubleshooting - a single command can
be fired to capture detailed info on all VNIs.
Ticket: CM-22831
Signed-off-by: Nitin Soni <nsoni@cumulusnetworks.com>
Reviewed-by: CCR-8013
To avoid conflicts between the zebra main pthread and the
dataplane pthread, use a separate routing socket (on non-netlink
platforms) for dataplane route updates to the OS.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Use a separate netlink socket for the dataplane's updates, to
avoid races between the dataplane pthread and the zebra main
pthread. Revise zebra shutdown so that the dataplane netlink
socket is cleaned-up later, after all shutdown-time dataplane
work has been done.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Improve, simplify dataplane provider locking apis. Add accessor
for dataplane pthread's thread_master, for use by providers who
need to use the thread/event apis.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Update the dataplane shutdown checks to include the providers.
Also revise the typedef for provider structs to make const
work.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Limit the number of updates processed from the incoming queue;
add more stats. Fill out apis for dataplane providers; convert
route update processing to provider model; move dataplane
status enum
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Display the following in the per-MAC and per-neigh output:
If duplicate address detection is in progress,
display the detection start time and detection count.
If duplicate address detection has marked an address
as duplicate, display the detection time and duplicate
status.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
The if_is_loopback() function is the right abstraction for identifying
loopback interfaces. There should be no reason for not using it in the
router-id code.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
It's been a year since we added the new optional parameters
to instantiation. Let's switch over to the new name.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When the remote mac is deleted by bgpd we can end up with an auto mac
entry in zebra if there are neighs referring to the mac. The remote sequence
number in the auto mac entry needs to be reset to 0 as the mac entry may
have been removed on all VTEPs (including the originating one).
Now if the MAC comes back on a remote VTEP it may be added with MM=0 which
will NOT be accepted if the remote seq was not reset in the previous step.
Ticket: CM-22707
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
This is a fixup to commit -
f32ea5c07 - zebra: act on kernel notifications for remote neighbors
The original commit handled a race condition between kernel and zebra
that would result in an inconsistent state i.e.
kernel has an offload/remote neigh
zebra has a local neigh
The original commit missed setting the neigh to active when zebra
tried to resolve the inconsistency by modifying the local neigh to
remote neigh on hearing back its own kernel update. Fixed here.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Ticket: CM-22700
When events cross paths between bgp and zebra bgpd could end up with a
dangling local MAC entry. Consider the following sequence of events on
rack-1 -
1. MAC1 has MM sequence number 1 and points to rack-3
2. Now a packet is rxed locally on rack-1 and rack-2 (simultaneously) with
source-mac=MAC1.
3. This would cause rack-1 and rack-2 to set the MM seq to 2 and
simultaneously report the MAC as local.
4. Now let's say on rack-1 zebra's MACIP_ADD is in bgpd's queue. bgpd
accepts rack-3's update and sends a remote MACIP add to zebra with MM=2.
5. zebra updates the MAC entry from local=>remote.
6. bgpd now processes zebra's "stale local" making it the best path.
However zebra no longer has a local MAC entry.
At this point bgpd and zebra are effectively out of sync i.e. bgpd has a
local-MAC which is not present in the kernel or in zebra.
To handle this window zebra should send a local MAC delete to bgpd on
modifying its cache to remote.
Ticket: CM-22687
Reviewed By: CCR-7935
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Current clang has an issue with the pointer/target argument
to at least one atomic/intrinsic. A variable with '_Atomic'
generates a compile-time error. Use a cast as a workaround
here to allow use of clang for now.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
When the rib code is informed that a table is closing/
going away, only try once to uninstall associated routes from
the fib/dataplane. The close path can be called multiple times
in some cases - zebra shutdown, e.g.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Even if the neighbor entry we want already exists, force its
reinstallation to ensure that it's valid. This will now take place when
we request an update of the neighbor entry.
Ticket: CM-22604
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Recursive multipath nexthops were broken by the initial async
dataplane - we were trying to install an extra, invalid
nexthop.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
The interface type can be a bond or a bond slave, add some
code to note this and to display it as part of a show interface
command.
Signed-off-by: Dinesh Dutt <didutt@gmail.com>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
* Correctly set safi to prevent duplicate allocations
* Free previously allocated table->info before overwriting it
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
The frr-interface YANG module models interfaces using a YANG list keyed
by the interface name and the interface VRF. Interfaces can't be keyed
only by their name since interface names might not be globally unique
when the netns VRF backend is in use. When using the VRF-Lite backend,
however, interface names *must* be globally unique. In this case, we need
to validate the uniqueness of interface names inside the appropriate
northbound callback since this constraint can't be expressed in the
YANG language. We must also ensure that only inactive interfaces can be
removed, among other things we need to validate in the northbound layer.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Introduce frr-interface.yang, which defines a model for managing FRR
interfaces.
Update the 'frr_yang_module_info' array of all daemons that will
implement this module.
Add automatically generated stub callbacks in if.c. These callbacks will
be implemented in the following commit.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
FRR_DAEMON_INFO should now contain an array of 'frr_yang_module_info'
structures describing the YANG modules implemented by the daemon.
This array will be used by frr_init() function to load all YANG modules
and initialize the northbound callbacks during the daemon initialization.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Avoid running the shutdown/sigint handler code more than once. With
the async dataplane, once shutdown has been initiated, the completion
of all async updates triggers final shutdown of the zebra main
pthread. During that time, avoid taking and processing a second
signal, such as SIGINT or SIGTERM.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
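A minimal sketch of the "run shutdown only once" guard described above. The names `shutdown_started`, `begin_shutdown_once` and `finish_shutdown` are hypothetical and are not taken from zebra; the real code may use a different mechanism.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

/* Hypothetical state: non-zero once shutdown has been initiated. */
static volatile sig_atomic_t shutdown_started;

/* Hypothetical counter so the effect of the guard is observable. */
static int shutdown_runs;

/*
 * Called from the signal path. Returns true only for the first
 * signal; later signals are ignored while async work drains.
 */
static bool begin_shutdown_once(void)
{
	if (shutdown_started)
		return false;

	shutdown_started = 1;
	shutdown_runs++;
	return true;
}

/* Final cleanup only makes sense after shutdown was initiated. */
static void finish_shutdown(void)
{
	assert(shutdown_started);
}
```

A second SIGINT or SIGTERM arriving while the dataplane drains would then simply return false from `begin_shutdown_once()`.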
Impose a configurable limit on the number of route updates
that can be queued towards the dataplane subsystem.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
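One way to picture a configurable queue limit is a counter checked before each enqueue. This is only a sketch: `struct update_queue` and both helpers are hypothetical names, and the real dataplane code may account for queued updates differently (for example with atomics).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical queue accounting; names are not from the source. */
struct update_queue {
	uint32_t queued; /* updates currently waiting */
	uint32_t limit;  /* configurable maximum */
};

/* Account for one more update; refuse when the limit is reached. */
static bool update_queue_try_enqueue(struct update_queue *q)
{
	assert(q);
	if (q->queued >= q->limit)
		return false;

	q->queued++;
	return true;
}

/* Account for one update having been processed. */
static void update_queue_dequeue(struct update_queue *q)
{
	assert(q && q->queued > 0);
	q->queued--;
}
```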
Dplane support for zebra's route cleanup during shutdown (clean
shutdown via SIGINT, anyway.) The dplane has the opportunity to
process incoming updates, and then triggers final cleanup
in zebra's main thread.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Add first pass at show commands for the zebra dplane. Add some stats
counters to show. Start prep for correct shutdown processing, and for
multiple providers.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Correct use of netlink_parse_info() in the netlink fuzzing path.
Also clarify a couple of comments about pthreads.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
We need a bit of special handling for system routes, which need
to be offered for redistribution even though they won't be
passing through the dplane system.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Initial WIP api to add providers into the zebra dataplane system,
with some simple ordering/prioritization.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Set SELECTED re immediately in rib_process, without expecting
that fib install has completed. Remove premature redistribute
call also.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Reduce or eliminate use of global zebra_ns structs in
a couple of netlink/kernel code paths, so that those paths
can potentially be made asynch eventually.
Slide netlink_talk_info into place to remove dependency on core
zebra structs; add accessors for dplane context block
Start init of route context from zebra core re and rn structs;
start queueing and event handling for incoming route updates.
Expose netlink apis that don't rely on zebra core structs;
add parallel route-update code path using the dplane ctx;
simplest possible event loop to process queued route
updates.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
When we fail to install a route into bsd, note the case
where we have no viable nexthops installed for it, so
that we can know in zebra if the route is good or not.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The _wrap_script inclusion implies a certain end functionality
that we don't care about. We just care that the hooks are called.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
These three data structures belong in the `zebra_router` structure,
as they do not belong in `struct zebra_ns`.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Move the rules_hash to the zrouter data structure and provide
the additional bit of work needed to lookup the rule based upon
the namespace id as well. Make the callers of functions not
care about what namespace id we are in.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The `struct zebra_ns` data structure is being used
for both router information as well as support for
the vrf backend (as appropriate). This is a confusing
state. Start the movement of `struct zebra_ns` into
two things: `struct zebra_router` and `struct zebra_ns`.
In this new regime `struct zebra_router` is purely
for handling data about the router. It has no knowledge
of the underlying representation of the Data Plane.
`struct zebra_ns` becomes a linux specific bit of code
that allows us to handle the vrf backend and is allowed
to have knowledge about underlying data plane constructs.
When someone implements a *bsd backend the zebra_vrf data
structure will need to be abstracted to take advantage of this
instead of relying on zebra_ns.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The ->hash_cmp and linked list ->cmp functions were sometimes
being used interchangeably and this really is not a good
thing. So let's modify the hash_cmp function pointer to return
a boolean and convert everything to use the new syntax.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
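The difference between the two comparison styles can be sketched as follows. `struct item` and both functions are hypothetical illustrations, not the actual FRR prototypes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical element type used only for this illustration. */
struct item {
	int key;
};

/* Sorted-list style comparison: negative, zero, or positive. */
static int item_list_cmp(const struct item *a, const struct item *b)
{
	assert(a && b);
	return (a->key > b->key) - (a->key < b->key);
}

/* Hash style comparison: true when the two entries are equal. */
static bool item_hash_equal(const struct item *a, const struct item *b)
{
	assert(a && b);
	return a->key == b->key;
}
```

Note that for equal items the list-style function returns 0, which reads as "false" if it is mistakenly used where an equality test is expected. Giving the hash comparison a boolean return type makes that kind of mix-up easier to spot.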
We had a variety of issues with sorted list compare functions.
This commit identifies and fixes these issues.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
During a debugging session last night I discovered that I was
still having some `fun` figuring out why zebra was not making
a route's nexthop active. After some debugging I figured out
that I was missing some states that we could end up in that
didn't have debug information about what happened in nexthop_active.
Add the missing breadcrumbs for nexthop resolution. In addition
add a bit of code to notice the ebgp state without recursion turned
on and to let the user know about it.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
In some cases, kernel routes are not selected because the kernel
suppressed them without informing the netlink layer that the route has
been suppressed (for instance, when an interface goes down, the route
never comes back when the interface goes up again). This commit intends to
suppress that entry from zebra.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Allow the modification of whether or not we will allow
BUM flooding on the vxlan bridge. To do this allow
the upper level protocol to specify via the ZEBRA_VXLAN_FLOOD_CONTROL
zapi message.
If flooding is disabled then BUM traffic will not be forwarded
to other VTEPs.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Work to handle the route-maps, namely the header changes in zebra_vrf.h
and the mapping of using that everywhere
Signed-off-by: vishaldhingra <vdhingra@vmware.com>
The condition in the do/while is always false because 'return_nsid' cannot
reach the end of the loop with 'return_nsid' having a different value than
NS_UNKNOWN. Because of that, the condition can be replaced with 0 (false).
Also, the loop can be removed because the two assignments made at the end
of the loop before the condition check are not used (detected via Clang,
afterwards).
Signed-off-by: F. Aragon <paco@voltanet.io>
Conditional code in netlink_macfdb_update() introduced in 2232a77c used
the 'dst_present' variable because not all cases were covered. Now it is
not necessary.
Signed-off-by: F. Aragon <paco@voltanet.io>
Wrap the get/set of the table->info pointer so that
people are not directly accessing this data.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Unnecessary redeclaration of already-defined enum 'dp_results' removed.
Can be detected via static analysis with e.g.
./configure CFLAGS=-Wgnu-redeclared-enum CC=clang
Signed-off-by: F. Aragon <paco@voltanet.io>
Reduce or eliminate use of global zebra_ns structs in
a couple of netlink/kernel code paths, so that those paths
can potentially be made asynch eventually.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
All I can see is an unnecessary complication. If there's some purpose
here it needs to be documented...
Signed-off-by: David Lamparter <equinox@diac24.net>
When we receive a v6 RA packet with an optional
ND_OPT_SOURCE_LINKADDR take that data and construct the
v4 to v6 neighbor entry for that interface to allow
v4 w/ v6 nexthops to work with only global v6 addresses
on an interface.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Abstract the mac neigh installation for 169.254.0.1 into
its own function that we can pass the mac address into.
This will allow a future commit to use this functionality
when we have the appropriate mac address from reading
optional attributes of a RA packet.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This change stops a zebra acting as a label manager proxy from relaying
non-LM messages to clients; such messages may be sent to it by a zebra
acting in non-proxy mode. Also, the existing code does not schedule a read
when relay_response_back() returns -1. This patch re-schedules reads on the
socket even in that case by calling thread_add_read().
Signed-off-by: F. Aragon <paco@voltanet.io>
Corrections so that the BGP daemon can work with the label manager properly
through a label-manager proxy. Details:
- Correction so the BGP daemon behind a proxy label manager gets the range
correctly (-I added to the BGP daemon, to set the daemon instance id)
- For the BGP case, added an asynchronous label manager connect command so
the labels get recycled in case of a BGP daemon reconnection. With this,
BGPd and LDPd would behave similarly.
Signed-off-by: F. Aragon <paco@voltanet.io>
The block comments from a couple commits were not following
proper style. Fix.
Fix SA warning that had snuck in.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Netdevices are not sorted in any fashion by the kernel during the initial
interface nldump. So you can get an upper device (such as an SVI) before
its corresponding lower device (bridge).
To fix this problem we skip resolving link dependencies during handling of
nldump notifications and resolve them at the end instead (when all the
devices are present).
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Ticket: CM-22388, CM-21796
Reviewed By: CCR-7845
Testing Done:
1. verified on a setup with missing linkages
2. automation - evpn-min
Ensure that when the is_router condition changes for a locally learnt
neighbor, it is informed to BGP only if it is active i.e., the MAC is
also locally learnt.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Reviewed-by: Chirag Shah <chirag@cumulusnetworks.com>
Ticket: CM-22288
Reviewed By: CCR-7832
Testing Done:
1. Failed test
2. vxlan_routing_test.py
Use boolean variables instead of unsigned int for certain VxLAN-EVPN
flags which are really used as boolean.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Reviewed-by: Chirag Shah <chirag@cumulusnetworks.com>
Ticket: CM-22288
Reviewed By: CCR-7832
Testing Done:
Along with a subsequent, related commit
When a remote MAC goes away, but there are neighbors referring to it,
ensure that when the last remote neighbor goes away, the MAC is
uninstalled from the kernel and no longer considered as remote.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Reviewed-by: Chirag Shah <chirag@cumulusnetworks.com>
Ticket: CM-22130
Reviewed By: CCR-7777
Testing Done:
1. Replicated failed scenario and verified with fix.
2. evpn-min
When a MAC moves from local to remote, a replace is allowed, EVPN
no longer has to delete the local MAC before installing the remote
MAC.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Reviewed-by: Chirag Shah <chirag@cumulusnetworks.com>
So the linux kernel uses the RT_TABLE_MAIN for the table
id used for ip routing. The multicast routing tables use
RT_TABLE_DEFAULT. We changed the internal code of zebra_vrf
a few months back to use RT_TABLE_MAIN as the tableid to
use. This caused the pim sg stats to stop working because
of the kernel bug where it uses a different table
for ip routing and ip multicast.
Put a bit of a special case in to do the right thing.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
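A sketch of what such a special case could look like, assuming a Linux build where `<linux/rtnetlink.h>` provides `RT_TABLE_MAIN` and `RT_TABLE_DEFAULT`. The helper name `mroute_lookup_table` is hypothetical, and the actual change in zebra may be structured differently.

```c
#include <assert.h>
#include <linux/rtnetlink.h>
#include <stdint.h>

/* The two kernel table ids discussed above must be distinct. */
static_assert(RT_TABLE_MAIN != RT_TABLE_DEFAULT,
	      "unicast and multicast default tables differ");

/*
 * Hypothetical helper: pick the table id to use for a multicast
 * (S,G) stats lookup, given the table id stored for the vrf.
 */
static uint32_t mroute_lookup_table(uint32_t vrf_table_id)
{
	if (vrf_table_id == RT_TABLE_MAIN)
		return RT_TABLE_DEFAULT;

	return vrf_table_id;
}
```

Any other table id is passed through unchanged, so only the main unicast table is translated.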
When debugging the mroute code path in zebra, add a bit of additional
data to allow us to know what is going on a bit more.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Newer linux kernels apparently send data down the netlink
bus for the creation of mroutes. Add a bit of code
to notice this and to handle it appropriately (i.e. do
nothing at this point in time), since the correct
place to do this is in the pim socket in pimd.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When we are displaying data about a netlink message
in debugs or errors, print out the message type
as a string instead of a number.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
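Mapping a netlink message type to a name can be sketched with a small lookup table. The `RTM_*` constants come from `<linux/rtnetlink.h>`; the table, the `NL_TYPE` macro and `nl_msg_type_to_str` are hypothetical names, and only a handful of types are listed here.

```c
#include <linux/rtnetlink.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: message type and its printable name. */
struct nl_type_name {
	uint16_t type;
	const char *name;
};

#define NL_TYPE(t) { t, #t }

static const struct nl_type_name nl_type_names[] = {
	NL_TYPE(RTM_NEWLINK),  NL_TYPE(RTM_DELLINK),
	NL_TYPE(RTM_NEWADDR),  NL_TYPE(RTM_DELADDR),
	NL_TYPE(RTM_NEWROUTE), NL_TYPE(RTM_DELROUTE),
	NL_TYPE(RTM_NEWNEIGH), NL_TYPE(RTM_DELNEIGH),
};

/* Return a printable name for a netlink message type. */
static const char *nl_msg_type_to_str(uint16_t type)
{
	size_t i;

	for (i = 0; i < sizeof(nl_type_names) / sizeof(nl_type_names[0]); i++) {
		if (nl_type_names[i].type == type)
			return nl_type_names[i].name;
	}
	return "UNKNOWN";
}
```

A debug line could then print the result with `%s` next to, or instead of, the numeric type.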
We were linking all libs and binaries against libprotobuf-c if the
option was enabled... that makes no sense at all.
Signed-off-by: David Lamparter <equinox@diac24.net>
Since we're now building through one large Makefile, we can easily put
things with their daemons and crossreference nicely.
Signed-off-by: David Lamparter <equinox@diac24.net>
Debugging inactive nexthops in zebra can be quite difficult,
and it is often non-obvious what has gone wrong. Add detailed rib
debugs for the cases where we decide that a nexthop is
inactive so that we can more easily debug a reason
for the failure.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The _route_entry_dump function was not handling the nexthop as passed
in from an upper level protocol appropriately and as such was not displaying
the v4/v6 nexthop correctly in the case where we have both going.
Additionally dump the nexthop vrf as well.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The RB-Tree used to store rmac information was not properly
handling the v6 address family. Modify the code to allow
this handling.
Cleans up this error message:
zebra[2231]: host_rb_entry_compare: Unexpected family type: 10
This also fixes some connectivity issues that were being seen.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
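A comparison function that handles both address families might look like the sketch below. `struct host_addr` and `host_addr_compare` are hypothetical stand-ins, not the actual zebra RB-tree types. The family value 10 in the log message above corresponds to `AF_INET6` on Linux.

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical host entry keyed by an IPv4 or IPv6 address. */
struct host_addr {
	sa_family_t family;
	union {
		struct in_addr v4;
		struct in6_addr v6;
	} u;
};

/* Order entries by family first, then by raw address bytes. */
static int host_addr_compare(const struct host_addr *a,
			     const struct host_addr *b)
{
	if (a->family != b->family)
		return a->family < b->family ? -1 : 1;

	switch (a->family) {
	case AF_INET:
		return memcmp(&a->u.v4, &b->u.v4, sizeof(struct in_addr));
	case AF_INET6:
		return memcmp(&a->u.v6, &b->u.v6, sizeof(struct in6_addr));
	default:
		/* Neither v4 nor v6: treat as a programming error. */
		assert(!"unexpected address family");
		return 0;
	}
}
```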
For OpenFabric operation, we need to be able to install routes via
interfaces without any IPv4 addresses configured. Introduce a flag
ZEBRA_FLAG_ONLINK which upper protocols can set on a route they send
towards zebra, to force the nexthops to be considered onlink.
Signed-off-by: Christian Franke <chris@opensourcerouting.org>
Problem reported that some bgp and ospf json commands did not return
any json output at all if the bgp/ospf instance did not exist.
Additionally, some bgp and ospf json commands did not return any json
output if the instance existed but no neighbors were defined. This
fix makes these commands more consistent in returning empty braces for
json output and issue a message if not using json output. Additionally,
made the flag "use_json" a bool to make it consistent since previously,
it had been defined as an int, char, u_char, and bool at various places.
Ticket: CM-21040
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
This crash occurs only with the netns implementation.
The meaning of a vrf differs depending on its implementation (netns or
vrf-lite):
- With vrf-lite implementation vrf is a property of the interface that
can be changed as the speed or the state (iproute2 command: "ip link
set dev IF_NAME master VRF_NAME"). All interfaces of the system are in
the same netns and so interface name is unique.
- With netns implementation vrf is a characteristic of the interface
that CANNOT be changed: it is the id of the netns where the interface
is located. To change the vrf of an interface (iproute2 command to
move an interface "ip netns exec VRF_NAME1 ip link set dev IF_NAME
netns VRF_NAME2") the interface is deleted from the old vrf and
created in the new vrf.
Interface name is not unique, the same name can be present in the
different netns (typically the lo interface) and search of interface
must be done by the tuple (interface name, netns id).
Current tests on the vrf implementation (vrf-lite or netns) are not
sufficient. In some cases (for example when an interface is moved from
a vrf X to the default vrf and then moved back to VRF X) we can have a
corruption message and then a crash of zebra.
To avoid this corruption test on the vrf implementation, needed when an
interface changes, has been rewritten:
- For all interface changes except deletion the if_get_by_name function,
that checks if an interface exists and creates or updates it if
needed, is changed:
* The vrf-lite implementation is unchanged: search of the interface
is based only on the name and update the vrf-id if needed.
* The netns implementation search of the interface is based on the
(name, vrf-id) tuple and interface is created if not found, the
vrf-id is never updated.
- deletion of an interface (reception of a RTM_DELLINK netlink message):
* The vrf-lite implementation is unchanged: the interface
information is cleared and the interface is moved to the default
vrf if it does not already belong to it (to allow vrf deletion)
* The netns implementation is changed: only the interface
information is cleared and the interface stays in its vrf to
avoid conflict with an interface with the same name in the default
vrf.
This implementation reverts (partially or totally):
commit 393ec5424e ("zebra: fix missing node attribute set in ifp")
commit e9e9b1150f ("lib: create interface even if name is the same")
commit 9373219c67 ("zebra: improve logs when replacing interface to an
other netns")
Fixes: b53686c52a ("zebra: delete interface that disappeared")
Signed-off-by: Thibaut Collet <thibaut.collet@6wind.com>
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
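The two lookup behaviours described in the commit above can be sketched as follows. Everything here is hypothetical (the enum, `struct iface`, `iface_get`, and the flat array standing in for the interface tables). Unlike the real `if_get_by_name`, this sketch does not create a missing interface and just returns NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical backend selector. */
enum vrf_backend { BACKEND_VRF_LITE, BACKEND_NETNS };

/* Hypothetical, minimal interface record. */
struct iface {
	const char *name;
	uint32_t vrf_id;
};

/*
 * vrf-lite: names are globally unique, so match on name only and
 *           update the vrf-id of the interface that was found.
 * netns:    match on the (name, vrf-id) tuple and never change the
 *           vrf-id of an existing interface.
 */
static struct iface *iface_get(struct iface *tbl, size_t n,
			       enum vrf_backend backend,
			       const char *name, uint32_t vrf_id)
{
	size_t i;

	assert(tbl && name);
	for (i = 0; i < n; i++) {
		if (strcmp(tbl[i].name, name) != 0)
			continue;
		if (backend == BACKEND_NETNS && tbl[i].vrf_id != vrf_id)
			continue; /* same name in another netns */
		if (backend == BACKEND_VRF_LITE)
			tbl[i].vrf_id = vrf_id;
		return &tbl[i];
	}
	return NULL; /* the caller would create the interface here */
}
```

With the netns backend, two `lo` entries in different namespaces stay distinct. With vrf-lite, a single `eth0` entry simply has its vrf-id rewritten.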
When the interface is a virtual ethernet interface, there is no need to
update the link pointer of the interface.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
There exists a possibility that the ifindex we are passed
does not exist and as such we should check for it not
resolving as part of the debug.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
In the case the default netns has a netns path, then a new NETNS
creation will be bypassed.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The Vrf aliases can be known with a specific hook. That hook will then,
from zebra propagate the information to the relevant zapi clients.
The registration hook function is the same for all daemons.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This function is changed so that the interface index is searched across
the correct namespace.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Add a header to clean up a missing declaration and properly
wrap some variables in the appropriate #ifdef.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
We were ignoring mpls labels encapped with static routes.
Added support for single and multipath labels.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The code prior to this change, was allowing clients to register
for nexthop tracking. Then zebra would look up the rnh and
send to that particular client any known data. Additionally
zebra was blindly re-evaluating the rnh for every registration.
This leads to interesting behavior in that all people registered
for that nexthop will get callbacks even if nothing changes.
Modify the code to know if we have evaluated the rnh or not
and if so limit the re-evaluation to when absolutely necessary.
This is of particular importance because nht callbacks
cause the receiving protocols to do a not insignificant amount of
work, and as more protocols register for nht callbacks
we will cause more work than is necessary.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
These MIB OIDs were only used to identify clients on the SMUX protocol.
And even for that, they were essentially pointless.
Signed-off-by: David Lamparter <equinox@diac24.net>
The ZEBRA_IPV4_ROUTE_[ADD|DELETE] and ZEBRA_IPV6_ROUTE_[ADD|DELETE] functionality
has been deprecated for a year now, let's remove this code from the system.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The zebra/client_main.c code is not being maintained or used.
Remove from system. Especially since the encode/decode
zapi functionality it `purports` to be testing is deprecated
and now being removed.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Handle Remote Neigh entry state change from Router to Host.
Remote MAC-IP update may not contain the EVPN NA Extended community.
Zebra needs to accommodate a router_flag change for an existing neigh
and install it with or without the Router Flag (R-bit).
Testing:
Have a local MAC/IP (neigh entry) with the R-bit set.
Check on the remote VTEP that 'show bgp evpn route ...mac ip' and
'show evpn arp-cache ...' contain the router flag.
Change the host to remove the R-bit, so that the locally learnt entry loses
the Router flag. This results in the remote vtep removing the R-bit from
its neigh entry.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Neigh update can have router_flag change, from unset to set and
vice versa. This is the case where MAC, IP and VLAN are the same but
entry's flag moved from R to not R bit and reverse case.
Router flag change needs to trigger bgpd to inform all evpn peers
to remove from the evpn route.
Testing Done:
Send GARP with and without R bit from host and validate neigh entry
and evpn neigh and mac-ip route entry in zebra and bgpd.
Check Peer VTEP evpn route entry where router flag is (un)set.
With R-bit
Route [2]:[0]:[0]:[48]:[00:1f:2f:db:45:a6]:[128]:[2006:33:33:2::10]
VNI 1001
Imported from
27.0.0.16:5:[2]:[0]:[0]:[48]:[00:1f:2f:db:45:a6]:[128]:[2006:33:33:2::10]
4435 5551
27.0.0.16 from MSP1(uplink-1) (27.0.0.9)
Origin IGP, valid, external, bestpath-from-AS 4435, best
Extended Community: RT:5551:1001 ET:8 ND:Router Flag
AddPath ID: RX 0, TX 1261
Last update: Wed Aug 15 20:52:14 2018
Without R-bit
Route [2]:[0]:[0]:[48]:[00:1f:2f:db:45:a6]:[128]:[2006:33:33:2::10]
VNI 1001
Imported from
27.0.0.16:5:[2]:[0]:[0]:[48]:[00:1f:2f:db:45:a6]:[128]:[2006:33:33:2::10]
4435 5551
27.0.0.16 from MSP2(uplink-2) (27.0.0.10)
Origin IGP, valid, external, bestpath-from-AS 4435, best
Extended Community: RT:5551:1001 ET:8
AddPath ID: RX 0, TX 1263
Last update: Wed Aug 15 20:53:10 2018
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
The neigh update can come prior to mac add update.
In this case, the mac will be auto created for the vni.
Set the router flag on the local neigh update for a mac with the auto flag.
The neigh update will be informed to bgpd once local mac is learnt.
Unset router flag if the neigh update comes without the router flag
for an existing neigh entry.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Enhance the EVPN MAC and Neighbor cache display to show additional
information such as the mobility sequence numbers and the state.
Ensure that the neighbor state is set in a couple of places so
that the display is correct.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Implement procedures similar to what is specified in
https://tools.ietf.org/html/draft-malhotra-bess-evpn-irb-extended-mobility
in order to support extended mobility scenarios in EVPN. These are scenarios
where a host/VM move results in a different (MAC,IP) binding from earlier.
For example, a host with an address assignment (IP1, MAC1) moves behind a
different PE (VTEP) and has an address assignment of (IP1, MAC2) or a host
with an address assignment (IP5, MAC5) has a different assignment of (IP6,
MAC5) after the move. Note that while these are described as "move" scenarios,
they also cover the situation when a VM is shut down and a new VM is spun up
at a different location that reuses the IP address or MAC address of the
earlier instance, but not both. Yet another scenario is a MAC change for an
attached host/VM i.e., when the MAC of an attached host changes from MAC1 to
MAC2. This is necessary because there may already be a non-zero sequence
number associated with MAC2. Also, even though (IP, MAC1) is withdrawn before
(IP, MAC2) is advertised, they may propagate through the network differently.
The procedures continue to rely on the MAC mobility extended community
specified in RFC 7432 and already supported by the implementation, but
augment it with an inheritance mechanism that understands the relationship
of the host MACIP (ARP/neighbor table entry) to the underlying MAC (MAC
forwarding database entry). In FRR, this relationship is understood by the
zebra component which doubles as the "host mobility manager", so the MAC
mobility sequence numbers are determined through interaction between bgpd
and zebra.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
When a host moves and is locally reachable, if the local neighbor event
is received before the local MAC event, flag the neighbor as inactive
just as would happen in the case of a new host. This ensures that the
MACIP route will get originated as soon as the local MAC event is received.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
* Added code for "match ipv6 address prefix list" command
* Added common function route_match_address_prefix_list() to process
routemap for AFI_IP and AFI_IP6 address family
Signed-off-by: kssoman <somanks@vmware.com>
* Check for the modified routemap in zebra_route_map_process_update_cb()
* Added zebra_rib_table_rm_update() for RIB routemap processing
* Added zebra_nht_rm_update() for NHT routemap processing
Signed-off-by: kssoman <somanks@vmware.com>
In order for connected routes to be installed the if_is_operative
function is called. This function checks the status of ptm
and decides to use ptm enabled/disabled on the interface.
The call to zebra_ptm_get_enable was returning true and causing
the interface subsystem to do the wrong thing. Modify the
internal bfd case so that the check for ptm enabled reports that it
is not enabled.
Tested-by: Mark Stapp <mjs@voltanet.io>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The CMSG_FIRSTHDR was broken on solaris pre version 9. Version 9
was released in May of 2002 and EOL'ed in 2014. Version 8 EOL'ed
in 2012. Remove special case code for a little used platform
that has not seen the light of day in a very long time.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
* Use the correct license header
* Stop headers from including themselves
* Use uniform relative include conventions
* Ensure that sources include what they use
* Turn off clang-format around struct array blocks
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
We must hide only "pseudowire IFNAME" from vtysh, the "no" form of the
command should be made available to the extract.pl script. Split the
command into two to fix this problem.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
There is no need to check for failure of an ALLOC call,
since any such failure will result in an assert
happening. So we can safely remove all of this code.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
On `zebra` / `bfdd` shutdown we now clean up all client data to avoid
memory leaks (ghost clients). This also prevents 'slow' shutdown on
`zebra` sparing us from seeing some rare topotests shutdown failures
(signal handler getting stopped by signal).
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
This will make `bfdd` synchronize with its client when zebra dies or
bfdd is restarted.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
When `bfdd` is enabled - which it is by default - re-route the PTM-BFD
messages to the FRR's internal BFD daemon instead of the external
PTM daemon.
This will help the migration of BFD implementations and avoid
duplicating code.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
The client socket value can only be modified by the main thread.
Modifying the client socket from within the client I/O pthread
introduces race conditions.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Socket should be closed in zserv_client_free() and nowhere else.
Credit to Mark Stapp for catching this one.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Rename some things to be less confusing
* Convert client close function to take a client struct rather than a
task
* Extern client close function and use it when handling SIGTERM
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Allow protocols to specify to zebra that they would like zebra
to use the distance passed down as part of determining sameness for
Route Replace semantics.
This will be used by the static daemon to allow it to have
backup static routes with greater distances.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
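As a rough illustration of "distance as part of sameness", consider the sketch below. `struct route_id`, `route_is_same` and the `match_distance` parameter are hypothetical; zebra's real route comparison involves more fields and a different interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified route identity; not zebra's real struct. */
struct route_id {
	int type;          /* originating protocol */
	uint16_t instance; /* protocol instance */
	uint8_t distance;  /* administrative distance */
};

/*
 * Decide whether an incoming route replaces an existing one.
 * When match_distance is set, routes that differ only in distance
 * are treated as different routes, so both can be kept.
 */
static bool route_is_same(const struct route_id *a,
			  const struct route_id *b, bool match_distance)
{
	assert(a && b);
	if (a->type != b->type || a->instance != b->instance)
		return false;
	if (match_distance && a->distance != b->distance)
		return false;
	return true;
}
```

With `match_distance` set, a static route at distance 200 would not replace one at distance 1 for the same prefix, which is what a backup static route needs.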
This is the start of separating out the static
handling code from zebra -> staticd. This will
help simplify the zebra code and isolate static
route handling to its own code base.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
As part of moving the static route handling to its own daemon,
allow zebra to accept static route types from upper level
protocols.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
show evpn mac vni all
show evpn mac vni x
do not include the local svi and anycast mac in the count.
Ticket:CM-20456
Testing Done:
Before:
TOR1# show evpn mac vni 1008
Number of MACs (local and remote) known for this VNI: 4
MAC               Type   Intf/Remote VTEP  VLAN
44:38:39:00:6b:4c local  vlan1008          1008
00:02:00:00:00:04 local  hostbond5         1008
00:02:00:00:00:02 local  hostbond4         1008
00:00:5e:00:01:01 local  vlan1008-v0       1008
00:02:00:00:00:0c remote 27.0.0.15
00:02:00:00:00:0a remote 27.0.0.15
dell-s6000-07#
After:
TOR1# show evpn mac vni 1008
Number of MACs (local and remote) known for this VNI: 6
MAC               Type   Intf/Remote VTEP  VLAN
44:38:39:00:6b:4c local  vlan1008          1008
00:02:00:00:00:04 local  hostbond5         1008
00:02:00:00:00:02 local  hostbond4         1008
00:00:5e:00:01:01 local  vlan1008-v0       1008
00:02:00:00:00:0c remote 27.0.0.15
00:02:00:00:00:0a remote 27.0.0.15
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
When I did a show ip route with `json` on a vrf when it didn't exist,
frr would output invalid json.
Signed-off-by: Nathan Van Gheem <nathan@cumulusnetworks.com>
The parameter was missing from that vty command, so add it.
Also some documentation is refreshed.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
NLMSG_NEXT decrements the buffer length (status) by
the header msg length (nlmsg_len) every time it's called.
If nlmsg_len isn't accurate and is set to be larger than
what it should represent, it will cause status to
decrement past 0. This makes NLMSG_NEXT return a
pointer that references an inaccessible address.
When that is passed to NLMSG_OK, it segfaults.
Add a check to verify that there is still something to read
before we try to.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
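The kind of guard described above can be sketched with the standard `NLMSG_OK` and `NLMSG_NEXT` macros from `<linux/netlink.h>`. The helper `count_netlink_msgs` is hypothetical and only counts messages. The explicit `remaining > 0` test is the "is there still something to read" check, and it matters most when the remaining length is stored in, or cast to, an unsigned type.

```c
#include <assert.h>
#include <linux/netlink.h>

/*
 * Hypothetical helper: count the well-formed netlink messages in
 * buf[0..len). The walk stops as soon as nothing is left to read or
 * the next header claims more bytes than remain in the buffer.
 */
static int count_netlink_msgs(void *buf, int len)
{
	struct nlmsghdr *h = buf;
	int remaining = len;
	int count = 0;

	assert(buf);
	while (remaining > 0 && NLMSG_OK(h, remaining)) {
		count++;
		/* NLMSG_NEXT subtracts the aligned length from remaining. */
		h = NLMSG_NEXT(h, remaining);
	}
	return count;
}
```

Because `NLMSG_NEXT` subtracts the aligned message length, `remaining` can drop below zero even when `nlmsg_len` itself fits in the buffer, so it is re-checked before the next header is touched.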
Prefix length validation checks should be returning an error
rather than 0. Switch to that and make them error messages.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Change the fuzzing code so that it fakes data from
the listening socket rather than using its own pseudo one.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Commit a2ca67d1d2 consolidated IPv4 and IPv6 handling. It also applied
our ignorance for IPv4 srcdest routes onto IPv6.
Signed-off-by: Christian Franke <chris@opensourcerouting.org>
Each ipset with port value monitors either src port or dst port.
The information is added to the show pbr iptable command.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Bad nexthop messages from netlink were causing zebra
to hang here. Added a check to verify the length
of the nexthop so it doesn't keep trying to read.
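A minimal sketch of this kind of length check, assuming Linux headers
and assuming the failure mode is a record whose rtnh_len never lets the
walk advance (for example zero). count_nexthops is a hypothetical
helper, not the zebra function:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/rtnetlink.h>

/*
 * Hypothetical helper (not FRR code): count the nexthop records in a
 * multipath payload. Returns -1 on a malformed record instead of
 * looping on it.
 */
static int count_nexthops(const void *payload, size_t len)
{
	const char *p = payload;
	int count = 0;

	assert(payload != NULL);

	while (len >= sizeof(struct rtnexthop)) {
		struct rtnexthop nh;
		size_t step;

		memcpy(&nh, p, sizeof(nh));

		/* A too-short rtnh_len (e.g. 0) would never advance. */
		if (nh.rtnh_len < sizeof(struct rtnexthop)
		    || nh.rtnh_len > len)
			return -1;

		count++;
		step = RTNH_ALIGN(nh.rtnh_len);
		if (step >= len)
			break;
		p += step;
		len -= step;
	}
	return count;
}
```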
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Some more address family filters we can safely ignore
as well as typos in logger. Added AF_MPLS as filterable.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Zebra needed a check that verifies the prefix length
of an address is a valid length when receiving route
changes and interface address changes.
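A minimal sketch of such a check, assuming only AF_INET and AF_INET6
need to be accepted. check_prefixlen is a hypothetical name for
illustration, not the zebra function:

```c
#include <sys/socket.h>

/*
 * Hypothetical helper (not FRR code): return 0 if prefixlen is valid
 * for the address family, -1 otherwise.
 */
static int check_prefixlen(int family, unsigned int prefixlen)
{
	switch (family) {
	case AF_INET:
		/* IPv4 addresses are 32 bits wide. */
		return prefixlen <= 32 ? 0 : -1;
	case AF_INET6:
		/* IPv6 addresses are 128 bits wide. */
		return prefixlen <= 128 ? 0 : -1;
	default:
		/* Unknown family: treat as an error. */
		return -1;
	}
}
```

Returning a negative value rather than 0 lets the caller tell a
rejected message apart from a successfully handled one.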
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Zebra needed a check for mtu from the message it
received from the kernel before adding the new link.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
The zebra netlink socket was attempting to read netlink
messages with invalid address families in a couple areas.
Added filters and warn messages.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
This code allows you to fuzz the netlink listening socket
in zebra by --enable-fuzzing and passing the -w [FILE]
option when running zebra.
File collection is stored in /var/run/frr/netlink_*
where each number is just a counter to keep the
files distinct.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
To keep configuration consistent, vrfs that could not be associated
with a netns are removed.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This case happens in scenarios with mininet, where the local instance
may be unable to modify an external netns. The error is ignored and
the parsed netns is ignored too.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
EVPN ND ext community: support the NA flag R-bit, to have proxy ND.
Set the R-bit in the EVPN NA if a given router is the default gateway or
there is a local router attached, which can be determined based on the
local neighbor entry.
Implement the BGP ext community attribute to generate and parse the R-bit
and pass it along to zebra to program the neigh entry in the kernel.
Upon receiving a MAC/IP update with community type 0x06 and sub_type 0x08,
pass the R-bit to zebra to program the neigh entry.
Set NTF_ROUTER in the neigh entry and inform the kernel to do proxy NA for EVPN.
Ref:
https://tools.ietf.org/html/draft-ietf-bess-evpn-na-flags-01
Ticket:CM-21712, CM-21711
Reviewed By:
Testing Done:
Configure Local vni enabled L3 Gateway, which would act as router,
checked show evpn arp-cache vni x ip <ip of svi> on originated and remote
VTEPs. "Router" flag is set.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
It was reported that "show ipv6 route vrf <vrfname>", "show ipv6 route
vrf <vrfname> ::/0 " or "show ipv6 route vrf <vrfname> json" all
displayed that the nexthop was in the default vrf. This was because
the kernel netlink messages would supply the RTA_OIF of the loopback
interface for the kernel-created default route for the vrf, where ipv4
did not supply any RTA_OIF. This fix suppresses the display if the
nexthop and route entry are in different vrfs and the nexthop is
NEXTHOP_TYPE_BLACKHOLE.
Ticket: CM-21722
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Kernel requests via netlink are synchronous.
Therefore we do not need to specify a need for an ACK and
we can make the netlink_cmd NONBLOCKING.
1) If the netlink message is going to cause an error
we will still get one. Since results from the kernel
are synchronous we will get the error message on the
netlink_cmd socket and handle it
2) If the netlink message is going to send more than
one packet we will still get them all. Since the results
from the kernel are synchronous we will receive all data.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When creating a netlink_socket, listen to error
codes and abandon ship if it crashes and burns.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The fuzzing code was calling zebra_client_create which was refactored to zserv_client_create.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add 'const' to prefix args to several zebra route update,
redistribution, and route owner notification apis.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
The search algorithm for an interface based on ifindex alone is adapted
to netns-based vrfs too. Only the default netns will be used to search
the interface index.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When the target vrf is unknown, the interface lookup based on ifindex
uses the generic vrf api. This way, in the case of netns-based vrfs,
netns other than the default one are not searched.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The interface lookup algorithm differs depending on whether we are on a
netns vrf or not. In the former case, we only have to parse the
interfaces of the netns, while in the latter case we have to parse all
the interfaces of all the vrfs (since indexes do not overlap in the
latter case).
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
SVI interface ip/hw address is advertised by the GW VTEP (say TORC11) with
the default-GW community. And the rxing VTEP (say TORC21) installs the GW
MAC as a dynamic FDB entry. The problem with this is a rogue packet from a
server with the GW MAC as source can cause a station move resulting in
TORC21 hijacking the GW MAC address and blackholing all inter rack traffic.
Fix is to make the GW MAC "sticky" pinning it to the GW VTEP (TORC11). This
commit does it by installing the FDB entry as static if the MACIP route is
received with the default-GW community (mimics handling of
mac-mobility-with-sticky community)
Sample output from TORC21 with TORC11 set up as gateway -
root@TORC21:~# net show evpn mac vni 1004 mac 00:00:5e:00:01:01
MAC: 00:00:5e:00:01:01
Remote VTEP: 36.0.0.11 Remote-gateway Mac
Neighbors:
45.0.4.1
fe80::200:5eff:fe00:101
2001:fee1:0:4::1
root@TORC21:~# bridge fdb show |grep 00:00:5e:00:01:01|grep 1004
00:00:5e:00:01:01 dev vx-1004 vlan 1004 master bridge static
00:00:5e:00:01:01 dev vx-1004 dst 36.0.0.11 self static
root@TORC21:~#
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Ticket: CM-21508
While ZAPI I/O threads make a best effort to kill any scheduled tasks on
their threadmasters, after death another pthread can continue to
schedule onto the threadmaster. This isn't a problem per se since the
tasks will never run, but it also means that asserting that it hasn't
happened is pointless.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
The warning given by PVS-Studio is related to per-element overflow (there is
no real overflow, because of how elements are mapped in the union). This
same warning is typically reported by Coverity, too.
Signed-off-by: F. Aragon <paco@voltanet.io>
Problem created by the fix for cm-21306 (inactive cross-vrf static routes
when vrfs were bounced.) Determined that in another case, that fix would
cause duplicate nexthops to appear in the table. Resolved the problem by
removing the vrf static route process from the zebra "add" process leaving
it in the zebra "if up" process as added in cm-21306 since that's the point
that the vrf device is now functional.
Ticket: CM-21429
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
This correction fixes two bugs detected by Clang scan:
Bug Group: Dead store
Bug Type: Dead assignment
File: zebra/kernel_netlink.c
Function: netlink_parse_extended_ack
Line: 548
Bug Type: Dead increment
File: isisd/isis_lsp.c
Function: lsp_bits2string
Line: 625
Signed-off-by: F. Aragon <paco@voltanet.io>
Incoming iptable entries with the fragment parameter are handled.
An iptable context is created for each fragment value received from BGP.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The packet length is added to iptable zapi message.
The iptable structure now takes the pkt_len field into account.
The show pbr iptable command displays the packet length used if any.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The icmp type/code is displayed.
Also, the flags are correctly set when the ICMP protocol is selected.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When in a dev build add a bit of code to track max
depth of a fifo and to allow zebra to report on it.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This is an additional correction after 45981fda06 / PR #2462. I hope
this fixes the Coverity warning (I've added an additional check for ensuring
the string provided by the inotify read is zero-terminated).
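A minimal sketch of a zero-termination check of this kind.
name_is_terminated is a hypothetical helper, and the assumption is that
the caller knows how many bytes of the name field were actually read:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not FRR code): return 1 if a NUL byte occurs
 * within the first len bytes of name, 0 otherwise.
 */
static int name_is_terminated(const char *name, size_t len)
{
	assert(name != NULL);

	/* memchr never reads past len, unlike strlen. */
	return len > 0 && memchr(name, '\0', len) != NULL;
}
```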
Signed-off-by: F. Aragon <paco@voltanet.io>
When a filter function fails to work correctly, we get an
error message that something has gone wrong. Unfortunately
we may not have any clues as to where the decode failure
happened. Add a backtrace to give us a clue.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When we receive a netlink message from the kernel we have
handler functions for when we send a netlink command, if these
return a failure ( < 0 ) then we output that we had a parse
issue. But if all we get is:
2018-06-21T23:47:45.298156+00:00 qct-ix1-08 zebra[1484]: netlink-cmd (NS 0) filter function error
Then it is not very useful to figure out *where* the error happened.
Add more error code when in a decode path to hopefully allow us
to figure out where this message is coming from.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This is a correction over 7f61ea7bd4 in order
to avoid the TAINTED_SCALAR Coverity warning (ending in "Untrusted array
index read"). This is equivalent to the previous commit, but avoiding
pointer arithmetic with tainted variables.
Signed-off-by: F. Aragon <paco@voltanet.io>
Add code to request and read in extended ack information
to provide a bit more context of what went wrong when
a failure is detected in the kernel.
Example of a failed delete:
Jun 20 21:19:25 robot zebra[11878]: Extended Error: Invalid prefix for given prefix length
Jun 20 21:19:25 robot zebra[11878]: netlink-cmd (NS 0) error: Invalid argument, type=RTM_DELROUTE(25), seq=8, pid=4078403400
Jun 20 21:19:25 robot zebra[11878]: 0:4.3.2.0/24: Route Deletion failure
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This is a correction over 32ac96b2ba, so
removing the forced string null termination doesn't involve a worse situation
than before (the underflow check should protect for the case of receiving
an incomplete buffer, which would be the cause of non-zero terminated string)
Signed-off-by: F. Aragon <paco@voltanet.io>
The route_map_walk_update_list callback function
never uses the return code, so just remove it.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add some basic code for zebra to start to keep track
of route-maps that have changed. At this point we
are not doing anything. As we fix code to handle
route-maps better, code will be shifted around.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Problem reported that if the vrf device is taken down and then brought
back up, any static route referencing that vrf device was not
re-installed. This fix runs back thru the static routes that
reference the vrf device coming up and re-install them.
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Hide following l3vni config from DEFAULT_VRF instance
until it is fully supported.
TORS1(config)# vni 2222456 prefix-routes-only
Ticket:CM-20572
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Cleanup the zebra code to test for failure for reading
from stream once instead of once to see if we should
debug and once for the actual failure.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Newer versions of clang are detecting function parameters that we should
not be casting as such. Fix these issues.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The IFLA_INFO_SLAVE_KIND constant is always defined now that we imported
our own copies of the Linux kernel headers. Remove the preprocessor
checks since they aren't necessary anymore.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
When we have a host prefix, actually free the alloced memory
associated with it when we free it.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When debugging code in redistribute.c, it is useful to output
the vrf we think the interface is in. So display it
when we are debugging.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Programs that link to libnetsnmp must be compiled using a special set
of flags as specified by the "net-snmp-config --base-cflags" command
(whose output is stored in the SNMP_CFLAGS variable). The problem is
that "net-snmp-config --base-cflags" can output -std=c99 in addition to
other compiler flags in some platforms, and this breaks the build since
FRR source code makes use of some GNU compiler extensions (e.g. allow
trailing commas in function parameter lists). In order to solve this
problem, append -std=gnu99 after SNMP_CFLAGS in all makefiles where this
variable is used. This way the -std=c99 flag will be overwritten when it's
present. Source files that don't link to libnetsnmp will be compiled using
either -std=gnu99 or -std=gnu11 depending on the compiler availability.
Fixes #1617.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
This fix is a workaround for a vtysh limitation.
Because table identifier should be accessible in configuration only for
vrf netns backends, there was a need to differentiate the vty commands.
Unfortunately, vtysh parses the two commands without knowing which
command has really been installed.
Using one single vty command will avoid having this issue in vtysh.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
By default, nothing is displayed. If vrf backend is linux network
namespaces, then "netns-based vrfs" is displayed, before dumping the
list of VRFs.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
In the case where vrf backend is netns, then the list of ns tables may
be extended. A single list is kept, but an attribute is added: the ns_id.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
As the table_id for a VRF with netns backend is the main table
(RT_TABLE_MAIN or zebrad.rtm_table_default), this makes it possible to
return the table id that is to be configured for those cases (in
addition to the default VRF). In other cases (VRF Lite presumably), the
vrf table_id is returned.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Add the table keyword for all ip route/ip mroute/ipv6 route commands
that are available. Also, a table identifier is added to the main
structure.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Add a bit of code to allow return of data plane
request messages.
Add the ability to pass the result back to callers
of kernel_route_rib.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The SOUTHBOUND_XXX enum was named a bit poorly.
Let's use a bit better name for what we are trying to do.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
I mistakenly used an external mechanism to cause a pthread to shut
itself down instead of using the one built into frr_pthread.[ch]. This
created a race condition whereby a pthread could schedule work onto a
dead pthread and cause it to reanimate.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Coalesce multiple write() syscalls into one
* Write larger chunks
* Decrease default read limit to 1000
* Remove unnecessary operations from hot loop (zserv_write)
* Move cross-schedule out of obuf lock
* Use atomic ops to update atomic variable
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Cancelling threads is nice but they can potentially be scheduled again
after cancellation without an explicit check.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Only one I/O task can be scheduled per file descriptor. Having two
separate tasks for buffer filling and buffer flushing was breaking that
invariant and causing messages to never be written.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Separate flush task from write task, so we can continue adding to the
write buffer while it's waiting to flush
* Handle write errors sooner rather than later
* Only schedule a process job if we have packets to process
* Tweak zserv_process_messages to not reschedule itself and rely on
zserv_read() to do so in all proper cases
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Label manager reaches its hands into session / IO code for zserv for
whatever reason, gotta handle that.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Simplify zapi_msg <-> zserv interaction
* Remove header validity checks, as they're already performed before the
packet ever makes it here
* Perform the same kind of batch processing done in zserv_write by
copying multiple inbound packets under lock instead of doing serial
locking
* Perform self-scheduling under the same lock
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Dequeue all pending messages when writing and push them all into the
write buffer. This removes the necessity to self-schedule, avoiding a
mutex lock, and should also maximize throughput by not writing 1 packet
per job.
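A minimal sketch of the drain-everything pattern, assuming a singly
linked fifo guarded by a mutex. The types and function names are
hypothetical and only stand in for zebra's own structures:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical types for illustration; zebra's own structures differ. */
struct msg {
	struct msg *next;
};

struct fifo {
	struct msg *head;
	struct msg *tail;
	size_t count;
};

struct client {
	pthread_mutex_t obuf_mtx;
	struct fifo obuf_fifo;
};

/* Producer side: enqueue one message under the lock. */
static void client_enqueue(struct client *c, struct msg *m)
{
	assert(c != NULL && m != NULL);

	pthread_mutex_lock(&c->obuf_mtx);
	m->next = NULL;
	if (c->obuf_fifo.tail)
		c->obuf_fifo.tail->next = m;
	else
		c->obuf_fifo.head = m;
	c->obuf_fifo.tail = m;
	c->obuf_fifo.count++;
	pthread_mutex_unlock(&c->obuf_mtx);
}

/*
 * Writer side: take every pending message in a single lock
 * acquisition and leave the shared fifo empty. Returns the number of
 * messages moved into out.
 */
static size_t client_drain(struct client *c, struct fifo *out)
{
	assert(c != NULL && out != NULL);

	pthread_mutex_lock(&c->obuf_mtx);
	*out = c->obuf_fifo;
	c->obuf_fifo = (struct fifo){ NULL, NULL, 0 };
	pthread_mutex_unlock(&c->obuf_mtx);

	return out->count;
}
```

The writer can then copy the whole batch into its write buffer without
holding the lock.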
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Increase the maximum number of packets to read per read job
* Store read packets in a local cached buffer to avoid mutex overhead
* Only update last-read time / last-command if we actually read a packet
* Add missing log line for corrupt header case
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Add centralized thread scheduling dispatchers for client threads and
the main thread
* Rename everything in zserv.c to stop using a combination of:
- zebra_server_*
- zebra_*
- zserv_*
Everything in zserv.c now begins with zserv_*.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Since it is already quite difficult to understand the various pieces
going on here, I reorganized the file to make it much cleaner and easier
to understand. The organization is now:
zserv.c:
,---------------------------------.
/ include statements |
| ... |
| ... |
| -------------------------------- |
| Client pthread server functions |
| ... |
| ... |
| -------------------------------- |
| Main pthread server functions |
| ... |
| ... |
| -------------------------------- |
| CLI commands, other |
| ... |
| ... |
\_________________________________/
No code has been changed; the functions have merely been moved around.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Time counters need to use atomic access between threads
* After a client disconnects, we properly kill the thread but need to
free its frr_pthread as well
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Add doc comments explaining hairy bits of thread lifecycle
* Remove t_suicide as it no longer makes sense
* Remove client double-free
* Remove unnecessary THREAD_OFF being used in incorrect pthread context
* Eliminate unnecessary racy access to client's obuf_fifo
* Ensure zserv_process_messages() reschedules itself if it has not
finished its work
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
When we receive a route that we think we own and we
are not in startup conditions, then add a small debug
to help debug the issue when this happens, instead
of silently just ignoring the route.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The re-use of RTPROT_STATIC has caused too many collisions
where other legitimate route sources are causing us to
believe we are the originator of the route. Modify
the code so that if another protocol inserts RTPROT_STATIC
we will assume it's a Kernel Route.
Fixes: #2293
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
With:
commit ba7773964c
Author: Renato Westphal <renato@opensourcerouting.org>
Date: Wed Sep 20 22:12:56 2017 -0300
We added our own copy of if_link.h (among others). This
file unconditionally defines IFLA_WIRELESS, so we don't need
the conditional defines in the if_netlink.c code...
Issue: https://github.com/FRRouting/frr/issues/2299
Signed-off-by: Arthur Jones <arthur.jones@riverbed.com>
After PBR or BGP sends back a request for sending a rule/ipset/ipset
entry/iptable delete, there may be issue in deleting it. A notification
is sent back with a new value indicating that the removal failed.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This hook can be used if the plugin module wrap_script is used.
This hook is called to dump the debugging status of this module, on the
vty.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The following PBR handlers: ipset and iptables, will first
call the hook from a possible plugin.
If a plugin is attached, then it will return a positive value.
That is why the return status is tested against 0, since that
means that no plugin module is plugged in.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Upon reception of an iptable_add or iptable_del, a list of interface
indexes may be passed in the zapi interface. The list is converted to
interface names so that it is ready to be programmed into the
underlying system.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Those 3 fields are read and written between zebra and bgpd.
This permits extending the ipset_entry structure.
Combinations will be possible:
- filtering with one of the src/dst port.
- filtering with one of the range src/ range dst port
usage of src or dst is exclusive in a FS entry.
- filtering a port or a port range based on either src or dst port.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Two new vty show functions available:
show pbr ipset <NAME>
show pbr iptables <NAME>
These functions dump the underlying "kernel" contexts. They rely on the
zebra pbr contexts. This helps to know which zebra pbr
context has been configured, since those contexts are mainly configured
by BGP Flowspec.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When a mark is set, incoming traffic having that mark set can be
redirected to a specific table identifier. This work is done through
netlink.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When an iptable or an ipset pbr context is removed,
a notification is sent back to the relevant daemon that sent the
message.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Upon the remote daemon leaving, some contexts may have to be flushed.
This commit does the change. IPset and IPSet Entries and iptables are
flushed.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This commit is a fix that removes the structure from the hash list,
instead of just removing that structure.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Add ns_id into zebra_pbr ipset.
This is important so that each ipset entry knows on which NETNS the
ipset entry must be injected.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Fix the code so that we would actually start receiving
RULE netlink notifications.
The Kernel expects the long long to be a bit field
value, while the newer netlink message types are
an enum. So we need to convert the message type
number to a bit position and set that value.
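A minimal sketch of that conversion, assuming a 32-bit group mask as in
the nl_groups field of struct sockaddr_nl. nl_group_to_mask is a
hypothetical helper, not the zebra code:

```c
#include <stdint.h>

/*
 * Hypothetical helper (not FRR code): map a 1-based netlink multicast
 * group number (an enum value) to its bit in a 32-bit group mask.
 * Returns 0 for group numbers that do not fit in the mask.
 */
static uint32_t nl_group_to_mask(unsigned int group)
{
	if (group == 0 || group > 32)
		return 0;

	/* Group N occupies bit N - 1. */
	return (uint32_t)1 << (group - 1);
}
```

For example, group number 8 becomes the mask 0x80, which can then be
OR-ed into the groups value passed to bind().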
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Move where we check for non-kernel netlink messages to
a slightly earlier spot. This will allow in subsequent
commits the removal of an extra parameter that needs to
be passed around.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The BPF filter was an exclusion list of netlink messages
we did not want to receive from ourselves. The problem
with this is that the exclusion list was and will be
ever growing. So switch the test around to an inclusion
list since it is shorter and not growing. Right
now this is RTM_NEWADDR and RTM_DELADDR.
Change some of the debug messages to error messages
so that when something slips through and it is unexpected
during development we will see the problem.
Also try to improve the documentation about what
the filter is doing and leave some breadcrumbs for
future developers to know where to change code
when new functionality is added.
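The decision the filter makes can be restated in plain C as a sketch.
This is not the BPF program itself; keep_netlink_msg is a hypothetical
helper, and it assumes that messages not originated by us always pass:

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/socket.h>
#include <linux/rtnetlink.h>

/*
 * Hypothetical helper (not FRR code): return true if a received
 * netlink message should be kept.
 */
static bool keep_netlink_msg(uint32_t msg_pid, uint32_t self_pid,
			     uint16_t msg_type)
{
	/* Assumption: messages from other senders are never filtered. */
	if (msg_pid != self_pid)
		return true;

	/* From ourselves: keep only what is on the inclusion list. */
	switch (msg_type) {
	case RTM_NEWADDR:
	case RTM_DELADDR:
		return true;
	default:
		return false;
	}
}
```

With an inclusion list, a new message type is dropped by default when
it comes from ourselves, so the list only changes when we deliberately
want to see one.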
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
In case the BGP or PBR daemon leaves, the PBR contexts created by this
daemon are flushed.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The linux kernel is getting the same Route Replace semantics
for v6 that v4 uses. Allow the end-user to know if their
kernel has this ability and if so to specify it so zebra
can take advantage of this.
Why not do auto-detection? Because you would have to write
code in zebra to add a route then add the same route again
with different nexthops to see which semantics it is using.
It sure is easier to just add a cli that allows the user to
do it.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Setup the buf used for extra data passed into kernel such
that we are cleaning it out before writing data to it,
so we can avoid writing uninited data.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add to zebra route-maps the ability to match on a source-instance
route-map FOO deny 55
match source-instance 5
route-map FOO permit 60
ip protocol any route-map FOO
This will match any protocol route installation with a source-instance of 5.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The neighbor host_list is expensive as well. Modify
the code to take advantage of a rb_tree as well.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
We are going to modify more host_list's to host_rb's
so let's rename some functions to take advantage of
what is there.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The host_list when we attempt to use it at scale, ends
up spending a non-trivial amount of time finding and
sorting entries for the host list. Convert to a rb tree.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ticket: CM-15658
Reviewed By: CCR-6534
Testing Done: Unit
Issue: frr ptm-enable command not working for interfaces that have been created by frr as a place holder.
Root Cause: The ptm-enable on interface configuration was not getting stored when the interface was internally created by frr.
Fix: Store the ptm-enable configuration even if the interface is internally created.
Signed-off-by: Radhika Mahankali <radhika@cumulusnetworks.com>
Ensure that the next hop of the leaked VRF is not overwritten when the
route is being imported into the target VRF from the VPN table. Also, in
the case of multipath routes, ensure that the nexthop's ifindex is not
inadvertently reset.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Netlink messages from the kernel need to be received in a buffer larger
than 8K in order to handle some types of info - for example, the VLAN
information. Define a separate size for receive and set it to 32K, which
is the value used by other netlink receivers like iproute2.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
When zebra starts up it receives from the kernel a full dump of
interface information. Unfortunately it is in no particular order.
As such we sometimes receive data from the kernel about interfaces
we do not know about yet.
In this bug, we are attempting to use the interface pointer(->link)
for a vlan interface that we have not properly resolved.
This fix ensures that we will not attempt to call zvni_map_svi
if we have a NULL pointer. There are other places in the code
we are already checking for the fact that the ->link pointer
is valid before calling this function, so I believe that this
is correct.
We do need to come back and resolve all ->link pointers
after we have received the full table. This can be
done in another commit.
Ticket: CM-17041
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
We have a command to enable symmetric routing only for type-5 routes.
This command is provided under vrf <> option in zebra as follows:
vrf <VRF>
vni <VNI> [prefix-routes-only]
We need the corresponding no version of the command as well as follows:
vrf <VRF>
no vni <VNI> [prefix-routes-only]
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
For ipv6 hosts, the next hop is converted to an ipv6 mapped address.
However, the remote rmac should still be programmed with the ipv4 address.
This is how the entries will look in the kernel for ipv6 hosts routing.
vrf routing table:
ipv6 -> ipv6_mapped remote vtep on l3vni SVI
neigh table:
ipv6_mapped remote vtep -> remote RMAC
bridge fdb:
remote rmac -> ipv4 vtep tunnel
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Ensure that when EVPN routes are installed into zebra, the router MAC
is passed per next hop and appropriately handled. This is required for
proper multipath operation.
Ticket: CM-18999
Reviewed By:
Testing Done: Verified failed scenario, other manual tests
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
There are cases when switching from one netns to another one, where the
if_table registration by index has not been flushed. This fix mitigates
the potential crashes, in case the ifp->node pointer is null, the value
is overwritten by the route_node obtained.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When checking for a duplicate interface in another NETNS, one may find
an interface in default VRF. That interface may have been moved to that
default VRF, for further action. Prevent from doing any action at this
point.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The log information is better displayed.
Also the variable name fits better with other_ifp, than with old_ifp.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
re->status and re->flags both influence our decision states
for rib processing. Yet it's impossible to see them. Add
a tiny bit of code to allow us to look at them when things
are not behaving like we would expect.
Additionally dump the nexthop->flags at the same time for
the same reasons.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The zns->ns pointer is not created until we get a callback
from the kernel that a ns exists. This should potentially
fix a crash in the *BSD code path.
Fixes: #2152
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Since BGPd is not currently setting ID and PROTOCOL in label
requests, temporarily disable mismatch error propagation.
This commit will be reverted once fixes for BGPd and label
manager are integrated.
Signed-off-by: Fredi Raspall <fredi@voltanet.io>
The current implementation did not consider multiple clients to
a label-manager acting as proxy, i.e. relaying messages to another
label manager. Specifically, upon a client's request, it checked
the socket & buffer from the actual label manager for pending
responses and directly copied them to the client --currently--
being served. As a result, if two clients (e.g. ldpd and bgpd)
sent requests, it could happen that responses being 'on the wire'
from the real label manager towards the proxy were relayed to
the wrong client. This patch, which requires all msgs to include
a proto & instance pair, looks up the zserv client that a
message (response) is to be relayed to.
Signed-off-by: Fredi Raspall <fredi@voltanet.io>
Add client proto and instance number in all msg (request and
responses) to/from a label manager. This is required for a
label manager acting as 'proxy' (i.e. relaying messages towards
another label manager) to correctly deliver responses to the
requesting clients.
Signed-off-by: Fredi Raspall <fredi@voltanet.io>
We are missing some handling of PBR and SHARP protocols
for netlink operations w/ the linux kernel.
Additionally add a bread crumb for new developers (or existing)
to know to fix up rt_netlink.c when we start handling new
route types to hand to the kernel.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
In a prior refactor, label manager proxy functionality
was broken in two places:
1) in function relay_response_back(), "dst" stream was
accidentally replaced by "src".
2) in zread_relay_label_manager_request(), src was set to point
to a global struct stream *ibuf that was not used/initialized
anywhere.
Signed-off-by: Fredi Raspall <fredi@voltanet.io>
When we are debugging add a bit of extra information
so we can know what we are redistributing to our peers
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
* Rename client_connect and client_close hooks to zapi_client_connect
and zapi_client_close
* Remove some more unnecessary headers
* Fix a copy-paste error in zapi_msg.[ch] header comments
* Fix an inclusion comment in zserv.c
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
zserv.c was using hardcoded callbacks to clean up various components
when a client disconnected. Ergo zserv.c had to know about all these
unrelated components that it should not care about. We have hooks now,
let's use the proper thing instead.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
zserv.c has become something of a dumping ground for everything vaguely
related to ZAPI and really needs some love. This change splits out the
code for building and consuming ZAPI messages into a separate source
file, leaving the actual session and client lifecycle code in zserv.c.
Unfortunately since the #include situation in Zebra has not been paid
much attention I was forced to fix the headers in a lot of other source
files. This is a net improvement overall though.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
When changing from "ip import-table 10 route-map rdn" to "ip
import-table 10" without a route-map, routes would be deleted
and not reinstalled. This fix resolves that problem.
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Zebra is starting to have some run-time capabilities that would be
useful to pass up to the higher level protocols so that they
can act in an appropriate manner when needed.
Send the ecmp value zebra is being run with and whether or not
we believe mpls is enabled in the kernel or not.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The mpls_label2str and mpls_str2label functions should not
be zebra exclusive functions. Move them to lib/mpls.c
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Somewhere along the way the ability to install multiple
pbr-policys for the same pbr-map was lost.
Add this back. There is a limitation in that we are limited
to 64 interfaces per pbr-policy.
Ticket: CM-20429
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When I implemented this code change I was only testing against
static routes and with one nexthop. I missed the fact that
we needed to tell rib_process to actually rethink the nexthops.
Ticket: CM-20274
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Need to explicitly exit this context otherwise we risk ambiguities
between global and vrf context commands
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
When a user specifies static routes, there are a couple of states
where we will store the route and display it as part of the 'show run'
but it will not be installed until such time that the dependent state
is created. Add some breadcrumbs to the user so that they can figure
out WTF just happened.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Realized (with coverity's help) the fix had a mistake by pasting in
the wrong route entry to unset the selected flag. This fix takes
care of that mistake.
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
With the recent change to just pass the prefix in
for the RTM_DELROUTE, for blackhole routes we
had stopped modifying the req.rtm_type to
be the appropriate type for blackhole routes.
Since we are just deleting on the route, and
zebra is never going to really install the same
route multiple times then we do not need
to specify the req.r.rtm_type for the deletion
command.
Ticket: CM-20616
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When I implemented the same functionality in add_ipv6 that
add_ipv4 has I just assumed that broad would not be NULL with
the ZEBRA_IFA_PEER flag set.
Modify the code to act similarly to the flow of control
in add_ipv4.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The problem was that, in certain route replace circumstances,
we would mark the old route_entry as removed to delete it but
would leave the selected flag set. When the rn was pulled off the
work queue for process, we would find both the new re and old re
(being deleted) with the selected flag set and would assert.
In this change, when we decide to delete the old re, we also mark
it as no longer selected.
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
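A minimal sketch of the invariant this fix restores, namely that at most
one entry on a route node carries the selected flag. The flag values,
struct demo_route_entry and both helpers are hypothetical and only mirror
the description above, not zebra's actual definitions.

```c
#include <stdint.h>

/* Hypothetical flag bits; zebra defines its own values elsewhere. */
#define DEMO_ENTRY_REMOVED (1u << 0)
#define DEMO_FLAG_SELECTED (1u << 1)

struct demo_route_entry {
	uint32_t status;
	uint32_t flags;
};

/* Mark an entry for deletion and drop its selected flag in one step. */
void demo_mark_removed(struct demo_route_entry *re)
{
	re->status |= DEMO_ENTRY_REMOVED;
	re->flags &= ~DEMO_FLAG_SELECTED;
}

/* Count selected entries; processing expects this to be 0 or 1. */
int demo_count_selected(const struct demo_route_entry *res, int n)
{
	int count = 0;

	for (int i = 0; i < n; i++)
		if (res[i].flags & DEMO_FLAG_SELECTED)
			count++;
	return count;
}
```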
Renaming the structure makes it easier to identify which structure is
being looked up, since policy routing will rely not only on iprule, but
also on some other structures.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
In order to avoid duplicate functions, the zebra_pbr_rule structure
used by zebra to decode the zapi message, and send netlink messages, is
slightly modified. The structure is derived from pbr_rule, but it also
includes sock identifier that is used to send back information to the
daemon that did the request. Also, the ifp pointer is stored in that
structure.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Add an intermediate helper structure that is used to walk the list of
ipset entries, and look for associated name.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Those messages permit a remote daemon to configure an iptable entry. A
structure is defined that maps to an iptable entry. More specifically,
this structure proposes to associate fwmark, and a table ID.
In addition to the configuration, the iptables hash list is initialised
in the zebra netnamespace. A hook is also added to notify the sender
that the iptables entry has been correctly set.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
A 32 bit value is added to the PBR rule. It can be used to record a rule
in the kernel by means of fwmark information.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Once ipset entries are injected in the kernel, the relevant daemon is
informed with a zebra message sent back.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
ZEBRA IPSET defines are added for creating/deleting ipset contexts.
Also create ipset hash sets.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
IPset and IPset entries structures are introduced. Those entries reflect
the ipset structures and ipset hash sets that will be created on the
kernel.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Zebra did not have a handler for tunnels in v6 for
some reason. Add code to handle the broadcast address
for both addition and deletion.
This appears to fix the crash. There might still need
to be some work to make the code `work` properly for
this type of tunnel.
Fixes: #2063
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This list "table" is created when the netns backend for VRF is
used. It contains the mapping between the NSID value read from
'ip netns list' and the external ns id used to create the VRF
value from the vrf context. This mapping is
necessary in order to reserve the default value 0 for vrf_default.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
EVPN owns the remote neigh entries which are programmed in the kernel.
These entries should not age out and the only way to delete them should
be from EVPN. We should program these entries with NUD_NOARP instead of
NUD_REACHABLE to avoid aging of these MACs.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
There can be a race condition between kernel and frr as follows.
Frr sends remote neigh notification.
At the (almost) same time kernel might send a notification saying
neigh is local.
After processing these notifications, the state in frr is local while
state in kernel is remote. This causes kernel and frr to be out of sync.
This problem will be avoided if FRR acts on the kernel notifications for
remote neighbors. When FRR sees a remote neighbor notification for a
neighbor which it thinks is local, FRR will change the neigh state to remote.
Ticket: CM-19923/CM-18830
Review: CCR-7222
Testing: Manual
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
[zebra/zebra_vxlan.c:5779] -> [zebra/zebra_vxlan.c:5778]:
(warning) Either the condition 'if(svi_if_zif&&svi_if_link)'
is redundant or there is possible null pointer dereference: svi_if_zif.
Signed-off-by: Ilya Shipitsin <chipitsine@gmail.com>
Background:
v6 does not have route replace semantics. If you want to add a nexthop
to an existing route, you just send RTM_NEWROUTE and the new nexthop.
If you want to delete a nexthop you should just send RTM_DELROUTE
with the removed nexthop.
This leads to situations where if zebra is processing a route
and has lost track of intermediate nexthops (yes this sucks)
then v6 routes will get out of sync when we try to implement
route replace semantics.
So notice when we are doing a route delete and the route is
not being updated, just send the prefix and tell it to delete.
Ticket: CM-20391
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This commit does 2 things:
1) When receiving a route from the kernel, display the incoming
table as part of the debug, to facilitate knowing what we are
talking about as part of the debug.
2) When displaying nexthop information for routes we were sending
to the kernel, no need to display the route information every time
Display the route then the individual nexthops for what we are doing.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Notice when a neighbor entry we've put in for rfc-5549 gets
deleted by some evil evil person. When this happens,
push it back in immediately.
Ticket: CM-18612
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The code to reinstall self originated routes was not behaving
correctly. For some reason we were looking for self originated
routes from the kernel to be of type KERNEL. This was probably
missed when we started installing the route types. We should
depend on the self originated flag that we determine from
the callback from the kernel.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When the last match criteria was removed (dst-ip or src-ip), we were
not deleting the rule correctly for ipv6. This fix retains the
needed src-ip/dst-ip during the pbr_send_pbr_map process so the
appropriate information is available for the rule delete.
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
When we have a PBR installed as a table, we need to notice
when a nexthop changes and rethink the routes for the pbr
tables.
Add code to nexthop tracking to notice the pbr watched
nexthop has changed in some manner. If it is a pbr route
that depends on the nexthop then just enqueue it for
rethinking.
This is a bit of a hammer, we know that only pbr routes
are going to be installing routes in weird non-standard
tables as such we need to only handle nexthop changes
for nexthops that are actually changing that we care
about and to only requeue for route nodes we have
route entries for from PBR
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Holdem statics display the dest (and mask, if present) string that the
user entered instead of converting to CIDR notation and applying the
mask. They need to do the latter.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Adds support for V4 GoAway flag as described in
https://www.ietf.org/id/draft-bz-v4goawayflag-00.txt
This option allows advertising neighbors to indicate to recipients that
they should disable IPv4 on the link.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Add some additional debug information to the netlink debug
messages so we can see the table we are installing to as
well as the nexthop's vrf.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The header length needs to be subtracted from the handling
side of the zapi in zebra. This is because we refigure the
header data structure. The receive side doesn't care
about the total header length so no need to subtract there.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
In BGP, doing policy-routing requires the use of table identifiers.
The Flowspec protocol will need them. One API between bgp and zebra has
been added to get the table chunk.
Internally, once flowspec is enabled, the BGP engine will try to
connect to the table manager. If zebra is not connected, it
will try to connect again 10 seconds later. If zebra is connected and
the request succeeds, then a polling mechanism running every 60 seconds
is put in place. None of this internal mechanism has an impact on the
BGP process.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This commit is connecting the table manager with remote daemons by
handling the queries.
As the function is similar in many points with label allocator, a
function has been renamed.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The range is given from table manager from zebra daemon.
There are 2 ranges available for table identifier:
- [1;252] and [256;0xffffffff]
If the requested size fits in the first range, then the start and end
of the table identifier range are taken from the first range.
Otherwise, an appropriate range is taken from the second range.
Note that for now, the case of the VRF table identifier used is not
taken into account. Meaning that there may be overlapping. There are two
cases to handle:
- case a vrf lite is allocated after the zebra and various other daemons
started.
- case a vrf lite is initialised and the daemons then start
The second case is easy to handle. For the former case, I am not so
sure.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
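The two-range selection described above might look roughly like the
sketch below. The allocator state, names and the simple "next free id"
cursors are assumptions made for illustration; they are not taken from
the table manager code, and VRF table ids are not considered, just as
the commit message notes. The gap 253-255 is skipped; on Linux those ids
are the default, main and local tables.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_RANGE1_START 1u
#define DEMO_RANGE1_END 252u
#define DEMO_RANGE2_START 256u
#define DEMO_RANGE2_END 0xffffffffu

struct demo_table_chunk {
	uint32_t start;
	uint32_t end;
};

/* Hypothetical allocator state: next unused id in each range.
 * 64-bit cursors keep the bounds arithmetic from wrapping. */
struct demo_table_alloc {
	uint64_t next1;
	uint64_t next2;
};

void demo_table_alloc_init(struct demo_table_alloc *ta)
{
	ta->next1 = DEMO_RANGE1_START;
	ta->next2 = DEMO_RANGE2_START;
}

/* Try the first range; fall back to the second if the chunk does not fit. */
bool demo_table_alloc_chunk(struct demo_table_alloc *ta, uint32_t size,
			    struct demo_table_chunk *out)
{
	if (size == 0)
		return false;
	if (ta->next1 + size - 1 <= DEMO_RANGE1_END) {
		out->start = (uint32_t)ta->next1;
		out->end = (uint32_t)(ta->next1 + size - 1);
		ta->next1 += size;
		return true;
	}
	if (ta->next2 + size - 1 <= DEMO_RANGE2_END) {
		out->start = (uint32_t)ta->next2;
		out->end = (uint32_t)(ta->next2 + size - 1);
		ta->next2 += size;
		return true;
	}
	return false;
}
```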
Prevent zebra from crashing when the nexthop vrf has
changed in some manner and the lookup fails.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
There are many callpaths to get to static_install_route. The nexthops
each have their own vrf that may or may not be up yet. If it is,
allow the installation.
Doing this check here to avoid having to add this all over the place.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When an interface is moved from one vrf to another, we get a callback
to move the static routes. Extend the work to look at all static
routes across all vrf's since we allow static route leaking now.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When a user enables and disables a vrf, we were not
properly cleaning up the static routes leaving us
in a state where we would crash by looking at anything
in zebra.
On disable of a vrf -> Search through all static routes
and if the nexthop vrf is the disabled vrf uninstall it.
Additionally uninstall all static routes in that zvrf
On enable of a vrf -> Search through all static routes
and if the nexthop vrf is the enabled vrf install it.
Additionally install all the static routes in that zvrf.
Ticket: CM-19768
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
There were a few cases where we were not properly de-registering
the static nexthops passed to us. This matters when
the static route is being removed for whatever reason, so that
we do not leave slag behind in the nexthop tracking.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The following types are nonstandard:
- u_char
- u_short
- u_int
- u_long
- u_int8_t
- u_int16_t
- u_int32_t
Replace them with the C99 standard types:
- uint8_t
- unsigned short
- unsigned int
- unsigned long
- uint8_t
- uint16_t
- uint32_t
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
When moving interfaces to another place, like another netns, the
remaining interface is still present, with inactive status.
Now, that interface is deleted from the list if the interface appears
in another netns. If not, the interface is kept.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The table id of the vrf is being given to us as part
of the vrf creation netlink callback. Unfortunately it
was being set in the zvrf *after* the vrf_enable callback.
This didn't used to matter until we started having config data
stored on the side that we needed to act on when the vrf
came up enough to start working.
So when we were storing static routes and installing them
they were being pushed into the default table for non-default
vrf's.
Ticket: CM-19141
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Upon an 'ip netns del' event, the associated vrf with netns backend is
looked up, then the internal contexts are first disabled, then
deleted.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The vrf netns usage causes a crash when deleting a vrf, because the hash
list of rules is not initialised for non default VRFs.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Because vrf with netns backend may be used, the correct zns must be
found prior to any modifications.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When we are removing a rule from the zns->rules_hash, free up
the rule from the hash and free the memory.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When we get a rule that is supposed to replace
an existing rule, make it look like a rule replace
semantics.
Install new rule, then delete the old original rule.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This patch fixes two bugs with respect to static route configuration
inside vrf contexts:
* Entering a negative form of a static route created the static route.
* Once created, static routes could not be deleted.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
When a route_delete is received allow the deletion
to occur in the passed in tableid if the vrf is VRF_DEFAULT.
This now matches route_add behavior in rib_add_multipath
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ensure that we have properly decoded the zapi_route sent to us
and if we cannot decode, log and move on.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When we have a case where the user re-enters the same
ip route line, we need to delete the memory we just
malloc'ed.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
PR #1739 added code to leak routes between (default VRF) VPN safi and unicast RIBs in any VRF. That set of changes included temporary CLI including vpn-policy blocks to specify RD/RT/label/&c. After considerable discussion, we arrived at a consensus CLI shown below.
The code of this PR implements the vpn-specific parts of this syntax:
router bgp <as> [vrf <FOO>]
address-family <afi> unicast
rd (vpn|evpn) export (AS:NN | IP:nn)
label (vpn|evpn) export (0..1048575)
rt (vpn|evpn) (import|export|both) RTLIST...
nexthop vpn (import|export) (A.B.C.D | X:X::X:X)
route-map (vpn|evpn|vrf NAME) (import|export) MAP
[no] import|export [vpn|evpn|evpn8]
[no] import|export vrf NAME
User documentation of the vpn-specific parts of the above syntax is in PR #1937
Signed-off-by: G. Paul Ziemba <paulz@labn.net>
When figuring out whom to call and if we actually can legally
call into the handler array actually use the number of elements
in the array instead of the size of the array.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
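The difference between the size of an array and its number of elements
is the usual sizeof idiom, sketched here. The handler type, the table
and dispatch() are hypothetical and only illustrate the bounds check;
they are not the zserv handler array.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical handler signature and table. */
typedef int (*demo_handler)(uint16_t cmd);

static int demo_handle_hello(uint16_t cmd)
{
	return cmd + 100;
}

static int demo_handle_route(uint16_t cmd)
{
	return cmd + 200;
}

static demo_handler demo_handlers[] = {
	demo_handle_hello, /* command 0 */
	demo_handle_route, /* command 1 */
};

/* Number of elements, not number of bytes. */
#define DEMO_ARRAY_COUNT(a) (sizeof(a) / sizeof((a)[0]))

/*
 * Returns -1 when cmd is outside the table. Comparing against
 * sizeof(demo_handlers) instead would accept indices past the end,
 * because that value is the table size in bytes.
 */
int dispatch(uint16_t cmd)
{
	if (cmd >= DEMO_ARRAY_COUNT(demo_handlers)
	    || demo_handlers[cmd] == NULL)
		return -1;
	return demo_handlers[cmd](cmd);
}
```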
When specifying a ip route:
ip route 4.3.2.0/24 192.168.201.1 vrf DONNA
Accept DONNA even if it has not been created yet.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
If a user enters a route inside a non kernel existent vrf:
vrf BLOOP
ip route 4.3.2.0/24 192.168.201.1
!
They should be able to enter it over and over and over and
over and over no matter how futile it is.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Currently if I try to use a nexthop-vrf that has
not been specified yet we get a failure from the cli.
Add code to zebra so that if we fail to find the nexthop-vrf
we auto create it, instead of failing the install.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add to the function prototypes the names of variables
to hopefully make it easier for people to program against
this header.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When we are signaling to a client from zebra that a nexthop
has changed, include the labels on the nexthop as well.
Upper level protocols need to know if the labels exist
in order to make intelligent decisions about what to do.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The application of a label to a route entry needs to
look at all non-recursive nexthops to be attached to
instead of just the first one.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The rib_wib_table function was not called by anyone. Remove it,
and additionally remove the static function it called.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When we receive an arbitrary table over the netlink bus
save it for later perusal and sweep any routes that
we may have created from an earlier run.
The current redistribute code is limited to
ZEBRA_KERNEL_TABLE_MAX. I left this alone for the
moment because I believe it needs to be converted
to a RB tree instead of a flat array. Which is more
work for the future. Additionally this proposed
change might necessitate some cli changes or rethinks.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
It is possible for clients to install routes into tables
that they desire. Modify the code to delete these routes
from these tables as well.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When zebra detects that the originator has disappeared
delete all rules associated with that client.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
There were several places where when I am attempting
to debug zebra functionality that I would really
like to have the ability to know what vrf I think
I am operating on.
Add the vrf_id to a bunch of zlog_debug messages
to help figure out issues when they happen.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Vty commands that link a netns context to a vrf require some
privileges. The change consists in retrieving the privileges in
vrf_cmd_init(), called by the relevant daemon, and then using them.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
In order to create the netns context, the zebra parser at startup needs
to have its privileges raised.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Nobody uses it, but it's got the same definition. Move the parser
function into zclient.c and use it.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Group send and receive functions together, change handlers to take a
message instead of looking at ->ibuf and ->obuf, allow zebra to read
multiple packets off the wire at a time.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
A lot of the handler functions that are called directly from the ZAPI
input processing code take different argument sets where they don't need
to. These functions are called from only one place and all have the same
fundamental information available to them to do their work. There is no
need to specialize what information is passed to them; it is cleaner and
easier to understand when they all accept the same base set of
information and extract what they need inline.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Formalize the ZAPI header by documenting it in code and providing it to
message handlers free of charge to reduce complexity.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
All of the ZAPI message handlers return an integer that means different
things to each of them, but nobody ever reads these integers, so this is
technical debt that we can just eliminate outright.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
musl-libc is a lightweight libc used by alpine linux:
https://www.musl-libc.org/
AFAICT, this is the only change to the source needed to get
basic frr support compiling on musl.
Two changes in one patch, get ethhdr from netinet/if_ether.h
and replace the only __caddr_t I could find in the source base
with caddr_t.
Testing done:
Compiled apk packages using a docker environment (patches
coming soon) also compiled redhat and debian using a similar
docker environment (RFC patches for those changes are queued
up too)...
Issue: https://github.com/FRRouting/frr/issues/1859
Signed-off-by: Arthur Jones <arthur.jones@riverbed.com>
Every place we need to pass around the rule structure
we need to pass around the ifp as well. Move it into
the structure. This will also make it easier to notify
higher level protocols whether this worked properly
or not.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Keep track of rules written into the kernel. This will
allow us to delete them on shutdown if we are not cleaned
up properly.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Allow the add/delete to go through an intermediary function in
zebra_pbr.c instead of directly to the underlying os call. This
will allow future refinements to track the data a bit better
so that on shutdown we can delete the rules.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
1) use uint32_t instead of u_int32_t as we are supposed to
2) Consolidate priority into the rule.
3) Cleanup the api from this.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Implement netlink interactions for Policy Based Routing. This includes
APIs to install and uninstall rules and handle notifications from the
kernel related to rule addition or deletion. Various definitions are
added to facilitate this.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Also modify `struct route_entry` to use nexthop_groups.
Move ALL_NEXTHOPS loop to nexthop_group.h
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Allow the calling daemon to pass down what table-id we
want to use to install the route. Useful for PBR.
The vrf id passed must be the VRF_DEFAULT else this
value is ignored.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The work_queue_free function free'd up the wq pointer but
did not set it to NULL. This of course causes situations
where we may use the work_queue after it is freed. Let's
modify the work_queue to set the pointer for you.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
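One common way to have a free function clear the caller's pointer is to
pass the address of that pointer, as in the sketch below. struct
demo_queue and both functions are hypothetical stand-ins; the actual
work_queue_free change may be shaped differently.

```c
#include <stdlib.h>

/* Hypothetical queue type standing in for struct work_queue. */
struct demo_queue {
	int pending;
};

struct demo_queue *demo_queue_new(void)
{
	return calloc(1, sizeof(struct demo_queue));
}

/*
 * Takes the address of the caller's pointer so that it can be reset
 * to NULL after free(). A later accidental use then hits a NULL check
 * instead of freed memory, and a second call is a harmless no-op.
 */
void demo_queue_free_and_null(struct demo_queue **wqp)
{
	if (wqp == NULL || *wqp == NULL)
		return;
	free(*wqp);
	*wqp = NULL;
}
```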
If an interested party removes one of its routes let
it know that it has happened as asked for.
Add a ZAPI_ROUTE_REMOVED to the send of the route_notify_owner
Add a ZAPI_ROUTE_REMOVE_FAIL to the send of the route_notify_owner
Add code in sharpd to notice this and to allow it to keep
track of routes removed for that invocation and give timing
results.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Move setting vrf loopback flag on ifp after
zebra vrf type is set (ziftype).
Zebra connected should not announce unnumbered for
VRF interfaces (similar to loopback).
Ticket: CM-19914
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
When zebra is being configed we allow for static routes
to be entered. This presents a problem for when a vrf
is cli configed but not kernel configed yet.
Modify zebra to notice that when a static route is
entered and either the nexthop vrf or the vrf
is not fully configed, to save that config to the
side.
When vrf's become active( kernel configed ) parse
through the list of saved to the side static routes
and determine if any of them can be installed.
Additionally modify the cli to output the saved
to the side cli, so that we can properly handle
a wr mem.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When you have individual 'ip route..' commands
under a VRF allow them to be displayed properly
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add the originating routes type and instance to the nexthop
update message. This is necessary because there exist
scenarios where BGP needs to make a decision about the
originating route type and instance to know if it is
going to be doing a route replace to a route that would
resolve to itself.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When decoding and creating the appropriate data structures
for a nexthop, use the passed in vrf.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Implement support for EVPN symmetric routing for IPv6 routes. The next hop
for EVPN routes is the IP address of the remote VTEP which is only an IPv4
address. This means that for IPv6 symmetric routing, there will be IPv6
destinations with IPv4 next hops. To make this work, the IPv4 next hops are
converted into IPv4-mapped IPv6 addresses.
As part of support, ensure that "L3" route-targets are not announced with
IPv6 link-local addresses so that they won't be installed in the routing
table.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
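The conversion of an IPv4 next hop into an IPv4-mapped IPv6 address
(the ::ffff:a.b.c.d form) can be sketched as below. The function name
is made up for this example and is not claimed to be the helper used by
the commit.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Build ::ffff:a.b.c.d from an IPv4 address: 80 zero bits, 16 one
 * bits, then the 32-bit IPv4 address. s_addr is already in network
 * byte order, so its bytes can be copied as they are.
 */
void demo_ipv4_to_mapped_ipv6(const struct in_addr *v4, struct in6_addr *v6)
{
	memset(v6, 0, sizeof(*v6));
	v6->s6_addr[10] = 0xff;
	v6->s6_addr[11] = 0xff;
	memcpy(&v6->s6_addr[12], &v4->s_addr, sizeof(v4->s_addr));
}
```

For example, the VTEP address 10.0.0.1 becomes ::ffff:10.0.0.1, which
can then serve as the next hop of an IPv6 route.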
This limitation ignores the creation of a new NS context, when an
already present NS is available with the same NSID. This limitation
removes confusion, so that only the first NS will be used for
configuration.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
So as to get the correct NETNS where some discovery must be done and
populated, the zns pointer is directly retrieved from zvrf, instead of
checking that the VRF is a backend NETNS or not.
In the case where the interfaces are discovered before the VRF is enabled
(VRF-lite populate), then the default NS is retrieved.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Because socket creation is tightly linked with socket binding for vrf
lite, the proposal is made to extend socket creation APIs and to create
a new API called vrf_bind that applies to vrf lite. The passed interface
name is the interface that will be bound to the socket passed.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
That API can be used to wrap the ioctl call with various vrf instances.
This permits transparently doing the ioctl() call without taking into
consideration the vrf backend kind.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The zebra daemon introduces the logical router initialisation.
Because right now, the usage of logical router and vrf NETNS is
exclusive, then the logical router and VRF are initialised accordingly.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
A new API is available for interface ioctl operations on Linux:
vrf_if_ioctl. This is the unified API that permits doing ioctl
operations on a per interface basis.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When interfaces are located on different NETNS ( different VRF), then a
switch from netns context is necessary when calling setns(). The VRF
apis to switch and switch back are called, so that the ioctl will work
accordingly.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The assert appears in zebra_mpls.c when checking the default zebra_vrf.
It appears that when the mpls entries are flushed, it gets the default
vrf which is already flushed by the vrf_terminate() function. To keep
that assert from triggering a crash, the mpls flush is called before
vrf termination.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This issue was found by static analysis with the clang scan-build tool.
This commit handles the fix.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When the netns backend is selected for VRF, the default VRF is being
assigned a NSID. This avoids the need to handle the case where if the
incoming NSID was 0 for a non default VRF, then a specific handling had
to be done to keep 0 value for default VRF.
In most cases, as the first NETNS to get a NSID will be the default VRF,
most probably the default VRF will be assigned to 0, while the other
ones will have their value incremented. On some cases, where the NSID is
already assigned for NETNS, including default VRF, then the default VRF
value will be the one derived from the NSID of default VRF, thus keeping
consistency between VRF IDs and NETNS IDs.
Default NS is attempted to be created. Actually, some VMs may have the
netns feature, but the NS initialisation fails because that folder is
not present.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Upon zebra initialisation, and upon further netnamespace creation, the
netnamespaces are created and a vrf associated to the netnamespace
is created. By convention, the name of the netns will be the same as the
VRF.
Add a stub routine that returns a fake ns identifier, in case netlink
(Linux machines) is not available.
Also, for each newly discovered NETNS, an NSID is generated,
either by relying on the kernel NSID feature, or by generating the
NSID locally (see previous commit for more information).
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
An NS identifier is collected by netlink. This is a 32 bit
identifier that is either generated by the kernel (if not set) or
manually set by a netlink set command. This commit gets the
NSID from the newly created NS. If the Linux option to create or get a
new NSID from the kernel does not exist, then the NSID is generated
locally.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Adding the name of the netns to the vrf message also introduces
a limitation when the netns name is longer than 15 bytes: such
netns are ignored by the library.
In addition, some sanity checks have been introduced. Functions
that create the netns from a call not coming from the vty are
added, with traces.
Also, the ns vty function is reentrant if the context is already
created.
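A minimal sketch of the 15-byte name limit mentioned above. The macro and function names are hypothetical; only the limit itself comes from the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Limit taken from the text: longer netns names are ignored.
 * The macro name is a local stand-in. */
#define DEMO_NETNS_NAME_MAX 15

/* Return true when the netns name fits within the limit. */
static bool demo_netns_name_usable(const char *name)
{
	if (name == NULL)
		return false;

	return strlen(name) <= DEMO_NETNS_NAME_MAX;
}
```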
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
The show vrf command displays information on the vrf: whether it is
backed by a kernel vrf or by a netns.
When a vrf from the kernel is detected, before creating a new vrf, a check
is done against an already present vrf, and if that vrf is not a vrf
mapped with a netns. If that is the case, then the creation is
rejected.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The zebra netnamespace contexts are initialised, based on the callback
coming from the NS. Conversely, the list of ns is walked to disable the
ns contexts.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
If the vrf backend is netns, zebra will create its own
zebra_ns context for each new netns discovered. As a consequence,
a routing table and other contexts will be created for each
new namespace discovered. When the namespace is enabled, a populate
process is run, which learns the interfaces, routes, and
addresses from the other NETNS.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This commit is also a fix that prevents a VRF from being attached to the
wrong namespace context at creation time, because at creation time the
VRF does not yet know the namespace from which it will get its information.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
For each route to be added or deleted, instead of applying directly to
default namespaces, when a vrf is mapped to a namespace, then the
correct zns must be found out.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Upon the following calls: interface poll, address poll, route poll, and
ICMPv6 handling, each new namespace is parsed. For that, the
socket operations need to switch from one NS to another to get the
necessary information.
As of now, there is a crash when dumping interfaces through show
running-config.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
A vty command is added.
In addition to this command (kept for future usage):
- [no] logical-router-id <ID> netns <NETNSNAME>
a new command is placed under the vrf subnode:
- vrf <NAME>
   [no] netns <NETNSNAME>
  exit
This command maps a VRF to a network namespace.
The commit only handles the relationship between the vrf and ns structures.
It adds 2 attributes to the vrf structure:
- one defines the kind of vrf (mapped under netns or vrf from kernel)
- the other is the opaque pointer to the ns
The show running-config is handled by the zebra daemon.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The netns backend is chosen for VRF if a runtime flag named vrfwnetns is
selected when running zebra.
When the NETNS backend is chosen, in some cases the VRFID is
assigned the value of the NSID. Within the scope of that work,
this is why the vrf_lookup_by_table function is extended with a new
parameter.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The ZEBRA_FLAG_INTERNAL flag is used to signal to zebra that
the nexthops of the route being added can be recursively
resolved. This name keeps throwing me off when I read it
so let's rename it to something that allows the developer to
understand what is going on.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The 'struct route_entry *old' and 'struct route_entry *new' can sometimes
be the same route type (for a route replace), so when we are checking
to see if a new owner has taken over, don't tell the owner it is
replacing itself.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add a bit more detail to tell us what we are sending
up to a protocol so we can debug it better in the
future.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Some of the tables are no longer stored in the zvrf
and are in the zns now. On shutdown the zns is cleaned up
after the vrf (and rightly so!). As such we should not
attempt to count the information if we don't have
a zvrf.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
With the ability of zebra to handle random tables,
add code to display those tables via the
show <ip|ipv6> route table (1-...) [json] command.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The linux kernel allows a vast expanse of tables to be used.
It would be useful for zebra to track these tables if they
are being used.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The other_tables data structure does not belong to a vrf.
It belongs to the zns. This is because each vrf does not
need to have copies of each of other_tables.
Additionally move the array into a RB_TREE. This will allow
us to sort quickly and easily expand the number of tables
we can support to beyond the ZEBRA_KERNEL_TABLE_MAX define.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Problem seen when a prefix was learned with nexthops from multiple
route sources (static and ospf in this case) and the link to that
nexthop flaps. The nht entry was incorrectly deleted so when the
link came back up the static was not re-installed correctly.
Ticket: CM-19675
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
When a BGP-labeled route is resolved into an LDP-labeled IGP route,
zebra would install it with no labels in the kernel. This patch implements
recursive MPLS labels, i.e. make zebra install all labels from the route's
nexthop chain (the labels from the top-level nexthop being installed in
the top of the MPLS label stack). Multiple recursion levels are supported.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
If you were to configure a v4 and v6 vrf pop and forward label
that both happened to be the same, unconfiguring one would
remove them both.
This fixes that issue by noticing if we should remove it or
not based upon v4 or v6 having the same label or not.
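The check described above can be sketched like this. All names are hypothetical stand-ins rather than zebra's data structures, and `DEMO_LABEL_NONE` is a placeholder sentinel, not the value defined in mpls.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum demo_afi { DEMO_AFI_IP, DEMO_AFI_IP6, DEMO_AFI_MAX };

/* Placeholder sentinel for "no label configured". */
#define DEMO_LABEL_NONE UINT32_MAX

struct demo_vrf_labels {
	uint32_t label[DEMO_AFI_MAX];
};

/* Clear the label for one address family. Returns true when the label
 * should also be removed from the forwarding plane, i.e. when no other
 * address family still uses the same value. */
static bool demo_unset_label(struct demo_vrf_labels *v, enum demo_afi afi)
{
	uint32_t old;
	bool still_used = false;

	assert(v != NULL);

	old = v->label[afi];
	if (old == DEMO_LABEL_NONE)
		return false;

	v->label[afi] = DEMO_LABEL_NONE;
	for (int i = 0; i < DEMO_AFI_MAX; i++)
		if (v->label[i] == old)
			still_used = true;

	return !still_used;
}
```

When v4 and v6 share a label, unconfiguring the first family leaves the forwarding entry alone; only unconfiguring the second one reports that it can be removed.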
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add the ability to pass in an afi to zebra. zebra_vrf keeps
track of the afi/label tuple and then does the right thing
before we call down. AF_MPLS does not care about v4 or v6
it just knows label and what device to use for lookup.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
1) Add asserts in a couple of spots to show we
never expect prefix to be bad.
2) Fix some bfd code where out_ctxt will
always be NULL.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Asymmetric routing is an ideal choice when all VLANs are configured on all leafs.
It simplifies the routing configuration and
eliminates potential need for advertising subnet routes.
However, we need to reach the Internet or global destinations
or to do subnet-based routing between PODs or DCs.
This requires EVPN type-5 routes but those routes require L3 VNI configuration.
This task is to support EVPN type-5 routes for prefix-based routing in
conjunction with asymmetric routing within the POD/DC.
It is done by providing an option to use the L3 VNI only for prefix routes,
so that type-2 routes (host routes) will only use the L2 VNI.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
is_vni_l3 was removed as a part of PR1700. However, it was still used in master,
which caused the breakage. Changed the code to not use the API anymore.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Removed an additional field 'local-tunnel-ip' from the l2vni output.
Ticket: CM-19670
Review: CCR-7167
Testing: Verified that the output is proper
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Modify mpls.h to rename MPLS_LABEL_ILLEGAL to be MPLS_LABEL_NONE.
Fix all pre-existing code that used MPLS_LABEL_ILLEGAL.
Modify the zapi vrf label message to use MPLS_LABEL_NONE as the
signal to remove label associated with a vrf.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add the ability to pass the lsp owner type through the zapi
and in addition add a new label type for the sharp protocol
for testing.
Finally modify zebra_mpls.h to not have defaults specified
for the enum. That way when we add a new LSP type the
compile fails and the person doing the addition knows
where he has to touch shit.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Turns out we had 3 different ways to define labels
all of them overlapping with the same meanings.
Consolidate to 1. The one chosen is consistent
naming wise with what the *bsd and linux kernels
use.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add the ability for the nexthops to be a NEXTHOP_TYPE_IFINDEX.
Since we are using this code for L3vpn pop and forward operations
and we know that the lo or vrf device name must exist we
trust that it is correct.
Update display to show the correct data with a 'show mpls table'
Update the mpls install into the kernel to treat
NEXTHOP_TYPE_IFINDEX as special and we do not need
to pass in the nexthop label.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
For L3VPN's we need to create a label associated with the specified
vrf to be installed into the kernel to allow a pop and lookup
operation.
The new api is:
zclient_send_vrf_label(struct zclient *zclient, vrf_id_t vrf_id,
mpls_label_t label);
For the specified vrf_id associate the specified label for
a pop and lookup operation for forwarding.
To setup a POP and Forward use MPLS_LABEL_IMPLICIT_NULL
If the same label is passed in we ignore the call.
If the label is different we update entry.
If the label is MPLS_LABEL_NONE we remove
the entry.
This sets up the api. Future commits will have the functionality
to actually install into the kernel.
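The three cases listed above (same label, different label, MPLS_LABEL_NONE) can be sketched as a small decision function. The type, enum, and sentinel below are local stand-ins chosen for illustration; they are not the definitions from mpls.h or the zapi code.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for the label type. */
typedef uint32_t demo_label_t;

/* Placeholder sentinel for "no label"; the real value lives in mpls.h. */
#define DEMO_LABEL_NONE UINT32_MAX

enum demo_label_action {
	DEMO_LABEL_IGNORE,
	DEMO_LABEL_UPDATE,
	DEMO_LABEL_REMOVE,
};

/* Decide what to do with a label request for a VRF that currently
 * holds 'current'. */
static enum demo_label_action demo_vrf_label_action(demo_label_t current,
						    demo_label_t requested)
{
	if (requested == current)
		return DEMO_LABEL_IGNORE; /* same label: nothing to do */
	if (requested == DEMO_LABEL_NONE)
		return DEMO_LABEL_REMOVE; /* explicit removal */
	return DEMO_LABEL_UPDATE;         /* new or changed label */
}
```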
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The nh_resolve_via_default function is an accessor function
for NHT in zebra. Let's move this function to its proper
place.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Fix the read in of vrf routes on a start or restart that caused
the nexthop_vrf to be assumed to be the default vrf.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The encoding of the nexthop update made some distinctions
between nexthop types that it does not need to.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
* zebra/kernel_socket.c: include "rt.h" to provide the prototypes of
kernel_init() and kernel_terminate();
* lib/prefix.h: remove the deprecation warning whenever ETHER_ADDR_LEN
is used. isisd uses the ETHER_HDR_LEN constant which is defined in
terms of ETHER_ADDR_LEN in the *BSD system headers. So, when building
FRR on *BSD, we were getting several warnings because we were using
ETHER_ADDR_LEN indirectly;
* lib/command_lex.l, lib/defun_lex.l: ignore other harmless warnings;
* lib/spf_backoff.c: cast 'tv->tv_usec' to 'long int' before printing.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
The v6 code had the same issue with how it handled
nexthop-vrf and nexthop when it was entered on the
same line. This fixes that issue.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When a rib_unlink() event is directly called for a
route_entry we need to see if the dest->selected_fib
is the same and just unset the dest->selected_fib.
This was happening for redistributed table 10 routes
into BGP.
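A minimal sketch of the check described above, using simplified stand-in structs rather than zebra's real route_entry and rib_dest_t definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the zebra structures. */
struct demo_route_entry {
	int type;
};

struct demo_dest {
	struct demo_route_entry *selected_fib;
};

/* When 're' is unlinked directly, make sure the destination does not
 * keep pointing at it as the selected FIB entry. */
static void demo_unlink(struct demo_dest *dest, struct demo_route_entry *re)
{
	assert(dest != NULL);

	if (dest->selected_fib == re)
		dest->selected_fib = NULL;

	/* Removal of 're' from the destination's list would follow here. */
}
```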
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Zebra stores routes coming from the kernel for non-default
tables. This information on shutdown was being leaked
because we never cleaned it up. Allow for this to happen
now.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The error handling of the nexthop vrf and the vrf
for what was specified on the cli was not as clean
as it should have been.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
If src happens to point at all 0's due to not initializing
it and if the address passed in is not a v6 address then
we would not set src in the AF_INET6 call and would
fail the (src.ipv4.s_addr && inet_pton(AF_INET...)
call. Thus causing us to return a NULL and make
the routemap code think there was an issue.
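One way to avoid depending on the previous contents of src is to zero it first and to decide only on the return value of inet_pton(). The sketch below illustrates that idea; the union and function names are hypothetical, and this is not necessarily how the actual fix was written.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Local stand-in for an address union; not the FRR type. */
union demo_addr {
	struct in_addr ipv4;
	struct in6_addr ipv6;
};

/* Parse 'str' as IPv4 first, then as IPv6. Returns the address family
 * on success, or 0 when 'str' is neither. 'out' is zeroed up front so
 * that no branch ever depends on stale bytes. */
static int demo_parse_src(const char *str, union demo_addr *out)
{
	assert(str != NULL && out != NULL);

	memset(out, 0, sizeof(*out));

	if (inet_pton(AF_INET, str, &out->ipv4) == 1)
		return AF_INET;
	if (inet_pton(AF_INET6, str, &out->ipv6) == 1)
		return AF_INET6;

	return 0;
}
```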
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The nexthop_vrf should be looked up as appropriate,
If the nexthop_vrf was specified use that, else
use the vrf context of what was passed in.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The code change to switch from stream_getX to STREAM_GETX added
a goto statement to be handled for a failure case. The failure
case was properly handled but the normal case was not tested
properly and there exists a situation where we would free
the out_ctxt 2 times. Prevent that from happening.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Currently, while processing kernel messages related to VNIs,
we first check if the VNI is L3 (this is a hash lookup);
later, we do the lookup again to find the L3-VNI.
This is non-optimal.
Made changes to make sure we only do the lookup once.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
The dest->selected_fib assignment needs to happen
after the install and should be controlled by
the southbound api return of success or failure.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The route_node that we are working on is going to be interesting
to the kernel_route_rib_pass_fail. So I am setting up the
code to allow me to pass it. This will be done in a subsequent
commit.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Refine the notion of what FRR considers as "configured" VRF. It is no longer
based on user just typing "vrf FOO" but when something is actually configured
against that VRF. Right now, in zebra, the only configuration against a VRF
are static IP routes and EVPN L3 VNI. Whenever a configuration is removed,
check and clear the "configured" flag if there is no other configuration for
this VRF. When user attempts to configure a static route and the VRF doesn't
exist, a VRF is created; the VRF is only active when also defined in the
kernel.
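The refined "configured" notion can be sketched as a flag derived from the remaining configuration. The struct and its fields are hypothetical; the text only states that static IP routes and the EVPN L3 VNI currently count as configuration against a VRF.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-VRF bookkeeping. */
struct demo_vrf_cfg {
	unsigned int static_routes; /* number of configured static routes */
	bool l3vni_set;             /* an EVPN L3 VNI is configured */
	bool configured;            /* derived "configured" flag */
};

/* Recompute the flag after any configuration is added or removed. */
static void demo_vrf_cfg_refresh(struct demo_vrf_cfg *v)
{
	assert(v != NULL);

	v->configured = v->static_routes > 0 || v->l3vni_set;
}
```

Typing only "vrf FOO" leaves both inputs empty, so the flag stays clear until a static route or an L3 VNI is configured.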
Updates: 8b73ea7bd479030418ca06eef59d0648d913b620
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
Ticket: CM-10139, CM-18553
Reviewed By: CCR-7019
Testing Done:
1. Manual testing for L3 VNI and static routes - FRR restart, networking
restart etc.
2. 'vrf' smoke
When a VRF gets deleted - e.g., networking restart or ifdown of the VRF - but
has associated FRR configuration, additional cleanup of all dynamic data pertaining
to this VRF is necessary. This includes the routing tables, next hop tables,
temporary queues for this VRF etc. Only the FRR configuration for this VRF must
be retained.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
Reviewed-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ticket: CM-19148
Reviewed By: CCR-7030
Testing Done:
1. Manual testing - This scenario and EVPN configuration
2. Various smoke tests - vrf, bgp, pim, l3-smoke
Only check on L3-VNI SVI status when uninstalling remote next hops.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Ticket: CM-19036
Reviewed By: None
Testing Done:
1. Networking restart
2. VxLAN interface disable/enable
3. VRF delete and readd
A VRF is active only when the corresponding VRF device is present in the
kernel. However, when the kernel VRF device is removed, the VRF container in
FRR should go away only if there is no user configuration for it. Otherwise,
when the VRF device is created again so that the VRF becomes active, FRR
cannot take the correct actions. Example configuration for the VRF includes
static routes and EVPN L3 VNI.
Note that a VRF is currently considered to be "configured" as soon as the
operator has issued the "vrf <name>" command in FRR. Such a configured VRF
is not deleted upon VRF device removal, it is only made inactive. A VRF that
is "configured" can be deleted only upon operator action and only if the VRF
has been deactivated i.e., the VRF device removed from the kernel. This is
an existing restriction.
To implement this change, the VRF disable and delete actions have been modified.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Mitesh Kanjariya <mkanjariya@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
Ticket: CM-18553, CM-18918, CM-10139
Reviewed By: CCR-7022
Testing Done:
1. vrf and pim-vrf automation tests
2. Multiple VRF delete and readd (ifdown, ifup-with-depends)
3. FRR stop, start, restart
4. Networking restart
5. Configuration delete and readd
Some of the above tests run in different sequences (manually).
The kernel can delete an FRR-installed remote RMAC on an L3-VNI.
We should re-add it if such a situation occurs,
as we are the owner of the RMAC.
This behavior is the same for remote MACs as well and was missing for RMACs.
Ticket: CM-18762
Review: CCR-6992
Testing: Manual
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
0. move all global EVPN details to 'show evpn [json]' command
1. change "VRF" to "Tenant VRF" in 'show evpn vni'
2. change 'show vrf vni' command to tabular form
and add l3-vni related params to the output
3. show evpn rmac should show refcount only in detailed output
4. show evpn next-hop should show refcount only in detailed output
5. move VRF in 'show evpn l3vni' to the end
6. add num rmacs and num nexthops to show evpn l3vni
7. remove "info" from 'show bgp vrf <> l3vni info'
8. show evpn vni <vni> should show l2vni details or l3 vni details
9. show evpn vni should show both L2 and L3 VNIs
10. show bgp l2vpn evpn - shows all global bgp l2vpn evpn details
11. show bgp l2vpn evpn vni - will show both l2 and l3 vnis
12. show bgp l2vpn evpn vni - should show both l2 and l3 vnis
13. follow camel notation for all json keys
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
In EVPN symmetric routing, not all subnets are presents everywhere.
We have multiple scenarios where a host might not get learned locally.
1. GARP miss
2. SVI down/up
3. Silent host
We need a mechanism to resolve such hosts. In order to achieve this,
we will be advertising a subnet route from a box and that box will help
in resolving the ARP to such hosts.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
1. Added default gw extended community
2. code modification to handle sticky-mac/default-gw-mac as they go together
3. show command support for newly added extended community
4. State in zebra to reflect if a mac/neigh is default gateway
5. show command enhancement to reflect the same in zebra commands
Ticket: CM-17428
Review: CCR-6580
Testing: Manual
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
- Remove OSPF_SR route type
- Check that Segment Routing is enabled only in default VRF
- Add comment for SRGB in lib/mpls.h
- Update documentation
Signed-off-by: Olivier Dugeon <olivier.dugeon@orange.com>
When a nexthop is resolved via a label based nexthop, copy
the labels into the newly created recursive nexthop.
Please note that this does not fix the case where we
have a label based nexthop that is recursively resolved
through *another* nexthop that is also label based.
In this case we need to create a new label stack
for those routes.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
The function zserv_create_header was exactly the same
as zclient_create_header. Let's just have one in the
system.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
In some places, the NS_DEFAULT macro was not used. This commit replaces
0 with the NS_DEFAULT macro in the places that were identified.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
In some places in the code, the VRF_DEFAULT define was not used. This
commit ensures that the macro is used consistently.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Because the VRF_ID is mapped into 32 bits, and because NETNS will be
a backend of VRF, the NS identifier must also be encoded as 32
bits.
Also, the NS_UNKNOWN value is changed accordingly to UINT32_MAX.
Also, the NS_UNKNOWN and NS_DEFAULT values are removed from zebra_ns.h
and kept on ns.h header file.
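A small sketch of the resulting identifier layout. The `demo_` names are local stand-ins for the definitions in ns.h and vrf.h; the value 0 for the default NS follows the neighbouring commit that replaces 0 with NS_DEFAULT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the 32-bit identifiers. */
typedef uint32_t demo_vrf_id_t;
typedef uint32_t demo_ns_id_t;

#define DEMO_NS_DEFAULT 0
#define DEMO_NS_UNKNOWN UINT32_MAX

/* Both identifiers must have the same width so that an NSID can be
 * used directly as a VRF id when the netns backend is in use. */
_Static_assert(sizeof(demo_ns_id_t) == sizeof(demo_vrf_id_t),
	       "ns id and vrf id must have the same width");

/* True for any identifier other than the "unknown" sentinel. */
static bool demo_ns_id_valid(demo_ns_id_t id)
{
	return id != DEMO_NS_UNKNOWN;
}
```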
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This is preparatory work for configuring vrf/frr over netns.
The vrf structure is being changed to 32 bit, and the VRF will have the
possibility to have a backend made up of NETNS.
Let's put some history.
Initially the 32 bit was because one wanted to map both the
VRFLITE and the NSID onto vrf_id.
Initially, one would have liked to make zebra configure both vrf lite
and vrf from netns at the same time, in a flat way. From the show
running perspective, one would have had both kinds of vrfs, which one
would configure in the same way.
however, it leads to inconsistencies in concepts, because it mixes vrf
vrf with vrf, and vrf is not always mapped with netns.
For instance, logical-router could also be used with netns. In that
case, it would not be possible to map vrf with netns.
There was another reason why 32 bit is proposed: some systems
handle the NSID as 32 bits. As vrf lite exists only on
Linux, other systems would need to use a vrf
backend other than vrf lite. The netns backend for vrf will be used for
that too. For instance, for Windows or FreeBSD, some similar
netns concept exists, so it will be easier to reuse the netns
backend for vrf than to reuse the vrflite backend for vrf.
This commit extends vrf_id to 32 bits. Following commits, in a
second step, will help enable a VRF backend.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
During VRF change handling, the connected route for the interface should be
installed only if the interface is up. Otherwise, we end up with duplicate
connected routes which can lead to other problems.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
Ticket: CM-19364
Reviewed By: CCR-7099
Testing Done: Manual verification
This is an implementation of draft-ietf-ospf-segment-routing-extensions-24
and RFC7684 for Extended Link & Prefix Opaque LSA.
Look to doc/OSPF_SR.rst for implementation details & known limitations.
New files:
- ospfd/ospf_sr.h: Segment Routing structure definition (SubTLVs + SRDB)
- ospfd/ospf_sr.c: Main functions for Segment Routing support
- ospfd/ospf_ext.h: TLVs and SubTLVs definition for RFC7684
- ospfd/ospf_ext.c: RFC7684 Extended Link / Prefix implementation
- doc/OSPF-SRr.rst: Documentation
Modified Files:
- doc/ospfd.texi: Add new Segment Routing CLI command definition
- lib/command.h: Add new string command for Segment Routing CLI
- lib/mpls.h: Add default value for SRGB
- lib/route_types.txt: Add new OSPF Segment Routing route type
- ospfd/ospf_dump.[c,h]: Add OSPF SR debug
- ospfd/ospf_memory.[c,h]: Add new Segment Routing memory type
- ospfd/ospf_opaque.[c,h]: Add ospf_sr_init() starting function
- ospfd/ospf_ri.c: Add new functions to Set/Get Segment Routing TLVs
Add new ospf_router_info_lsa_update() to send Opaque LSA to ospf_sr.c()
- ospfd/ospf_ri.h: Add new Router Information SR SubTLVs
- ospfd/ospf_spf.c: Add new scheduler when running SPF to trigger
update of NHLFE
- ospfd/ospfd.h: Add new thread for Segment Routing scheduler
- ospfd/subdir.am: Add new files
- vtysh/Makefile.am: Add new ospf_sr.c file for vtysh
- zebra/kernel_netlink.c: Add new OSPF_SR route type
- zebra/rt_netlink.[c,h]: Add new OSPF_SR route type
- zebra/zebra_mpls.h: Add new OSPF_SR route type
Signed-off-by: Olivier Dugeon <olivier.dugeon@orange.com>
When we receive a read failure in handling a FPM read
let's add a bit more information to what we think has
gone wrong, in a hope that debugging will be a bit easier.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Allow this to work:
vrf DONNA
ip route 4.3.2.1/32 192.168.1.5 nexthop-vrf EVA
The static route code was not properly telling the
nexthop resolution code what vrf to use.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
In order for routes to be leaked the ifindex must be sent
down into the kernel over the netlink protocol. So
send it (we always figure it out) when we add the
route.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Move the code that generates the 'show run' output for
'ip route' to be controlled by the vrf config generation
code. Since it really belongs there.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Move the NS/VRF initialization code for zebra to an earlier
point in startup. In the future we will have code that
will want to install_element into a VRF_NODE from zebra_vty.c
Signed-off-by: Donald Sharp <sahrpd@cumulusnetworks.com>
If the vrf for the nexthop is different than the vrf the
route is in, display the nexthops vrf.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When we are handling nexthops in zebra, use the appropriate
vrf to figure out if the nexthops are active or not.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add to the rib_add function the ability to pass in the nexthops
vrf.
Additionally when we decode the netlink message from the linux
kernel, properly figure out the nexthops vrf_id.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
With VRF route-leaking we need to know what vrf
the nexthops are in compared to this vrf. This
code adds the nh_vrf_id to the route entry and
sets it up correctly for the non-route-leaking
case.
The assumption here is that future commits
will make the nh_vrf_id *different* than
the vrf_id.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
There are certain interfaces that when brought up and we receive
the netlink notification about it, the speed of the interface is
not set correctly. This creates a one-shot thread that will
wait 15 seconds and then requery the speed and if it is different
it will renotify the running daemons.
The kernel should notify us on speed changes, unfortunately this
is not done currently via a netlink message as you would think.
As I understand it there is some in-fighting about the proper
way to approach this issue and due to the way the kernel release
cycle works we are a ways off from getting this fixed. This
is a `hack` to make us work correctly while we wait for the
true answer.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The rn may have no rn->info pointer, and as
such the dest may be NULL. Don't assign
the old_fib pointer if so. This is ok
because we know RNODE_FOREACH... will not
iterate if dest is NULL.
Fixes: #1575
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
EVPN is only enabled when user configures advertise-all-vni.
All VNIs (L2 and L3) should be cleared upon removal of this config.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
For EVPN type-5 route the NH in the NLRI is set to the local tunnel ip.
This information has to be obtained from kernel notification.
We need to pass this info from zebra to bgp in l3vni call flow.
This patch doesn't handle the tunnel-ip change.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
When a remote VTEP next hop entry (for symmetric routing) becomes
stale, reinstall it. This makes the behavior the same as what is
done for remote host next hops (for asymmetric routing and ARP
suppression).
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
When an l3vni delete (no vni under a vrf) is executed,
we should uninstall all the RMACs and NHs associated with the l3vni.
This is because by the time we get a route delete in zebra, the
l3vni is already deleted and we don't have a reference to the RMACs and NHs.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
It is technically possible to attempt to use a NULL pointer.
Prevent this from happening.
Additionally, clean up the code indentation a small bit.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When displaying a specific route and if it has a tag
and if we have turned on realm support notify the user
that a tag value of (1-255) is installed into the kernel
with the realm set.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Linux has the ability to support a concept of 'realms'.
This concept allows you to mark routes with a realm id
value of 1-255. If you have marked the realm
of a route then you can use the tc program to
apply policy to the routes.
This commit adds the ability of FRR to interpret
a tag from (1-255) as a realm when installing into
the kernel. Please note that at this point in time
there is no way to set policy from within FRR. This
must be done outside of it.
The normal methodology for setting tags is valid here
via a route-map.
Finally this is only applied if the --enable-realms configure
option is applied.
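The tag-to-realm interpretation described above can be sketched as a simple range check. The function name is hypothetical, and returning 0 for "no realm" is an assumption made for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Map a route tag to a realm id. Tags in the range 1-255 are used as
 * the realm; anything else yields 0, meaning "no realm" here. */
static uint8_t demo_tag_to_realm(uint32_t tag)
{
	if (tag >= 1 && tag <= 255)
		return (uint8_t)tag;

	return 0;
}
```

A caller would attach the realm to the kernel route only when the result is non-zero, and only in a build configured with --enable-realms.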
Signed-off-by: Kaloyan Kovachev <kkovachev@varna.net>
As netlink is available on all Linux systems (old Linux distributions
are not considered), this commit removes the ipv6 ioctl support for
Linux.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>