Allow BUM flooding on the vxlan bridge to be enabled or disabled.
To do this, allow the upper level protocol to specify the setting
via the ZEBRA_VXLAN_FLOOD_CONTROL zapi message.
If flooding is disabled then BUM traffic will not be forwarded
to other VTEPs.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Work to handle the route-maps, namely the header changes in zebra_vrf.h
and the mapping of using that everywhere
Signed-off-by: vishaldhingra <vdhingra@vmware.com>
The condition in the do/while is always false because 'return_nsid' cannot
reach the end of the loop with 'return_nsid' having a different value than
NS_UNKNOWN. Because of that, the condition can be replaced with 0 (false).
Also, the loop can be removed because the two assignments made at the end
of the loop before the condition check are not used (detected via Clang,
afterwards).
Signed-off-by: F. Aragon <paco@voltanet.io>
Conditional code in netlink_macfdb_update() introduced in 2232a77c used
the 'dst_present' variable because not all cases were covered. Now it is
not necessary.
Signed-off-by: F. Aragon <paco@voltanet.io>
Wrap the get/set of the table->info pointer so that
callers are not directly accessing this data.
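A minimal sketch of the accessor idea, assuming a stand-in table type.
The names example_table, example_table_set_info and
example_table_get_info are hypothetical and are not taken from the tree:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a routing table; only the field of interest. */
struct example_table {
	void *info;
};

/* Setter: the one place that writes table->info. */
static inline void example_table_set_info(struct example_table *table,
					  void *info)
{
	assert(table != NULL);
	table->info = info;
}

/* Getter: the one place that reads table->info. */
static inline void *example_table_get_info(const struct example_table *table)
{
	assert(table != NULL);
	return table->info;
}
```

With accessors like these, a later change to how the pointer is stored
only has to touch the two functions.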
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Unnecessary redeclaration of already-defined enum 'dp_results' removed.
Can be detected via static analysis with e.g.
./configure CFLAGS=-Wgnu-redeclared-enum CC=clang
Signed-off-by: F. Aragon <paco@voltanet.io>
Reduce or eliminate use of global zebra_ns structs in
a couple of netlink/kernel code paths, so that those paths
can potentially be made asynchronous eventually.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
All I can see is an unnecessary complication. If there's some purpose
here it needs to be documented...
Signed-off-by: David Lamparter <equinox@diac24.net>
When we receive a v6 RA packet with an optional
ND_OPT_SOURCE_LINKADDR take that data and construct the
v4 to v6 neighbor entry for that interface to allow
v4 w/ v6 nexthops to work with only global v6 addresses
on an interface.
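A rough sketch of pulling the source link-layer address out of the RA
options, assuming a 6-byte Ethernet address. The function name and
EXAMPLE_ETH_ALEN are hypothetical; the option layout (type, length in
units of 8 bytes, data) is the one from RFC 4861:

```c
#include <assert.h>
#include <netinet/icmp6.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_ETH_ALEN 6

/*
 * Scan the options that follow a Router Advertisement header and copy out
 * the source link-layer address if present.
 * Returns 1 if found, 0 if absent or malformed.
 */
static int example_ra_source_lladdr(const uint8_t *opts, size_t len,
				    uint8_t mac[EXAMPLE_ETH_ALEN])
{
	assert(opts != NULL && mac != NULL);

	while (len >= 2) {
		size_t optlen = (size_t)opts[1] * 8;

		if (optlen == 0 || optlen > len)
			return 0; /* malformed: zero-length or truncated */
		if (opts[0] == ND_OPT_SOURCE_LINKADDR) {
			memcpy(mac, opts + 2, EXAMPLE_ETH_ALEN);
			return 1;
		}
		opts += optlen;
		len -= optlen;
	}
	return 0;
}
```

The returned address is what would then be used for the v4 to v6
neighbor entry on that interface.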
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Abstract the mac neigh installation for 169.254.0.1 into
its own function that we can pass the mac address into.
This will allow a future commit to use this functionality
when we have the appropriate mac address from reading
optional attributes of a RA packet.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This change stops a zebra acting as label manager proxy from relaying to
clients the non-LM messages that a zebra in non-proxy mode may send to it.
Also, the existing code does not schedule a read when relay_response_back
returns -1. This patch re-schedules reads on the socket even in that case
by calling thread_add_read().
Signed-off-by: F. Aragon <paco@voltanet.io>
Corrections so that the BGP daemon can work with the label manager properly
through a label-manager proxy. Details:
- Correction so the BGP daemon behind a proxy label manager gets the range
correctly (-I added to the BGP daemon, to set the daemon instance id)
- For the BGP case, added an asynchronous label manager connect command so
the labels get recycled in case of a BGP daemon reconnection. With this,
BGPd and LDPd would behave similarly.
Signed-off-by: F. Aragon <paco@voltanet.io>
The block comments from a couple commits were not following
proper style. Fix.
Fix SA warning that had snuck in.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Netdevices are not sorted in any fashion by the kernel during the initial
interface nldump. So you can get an upper device (such as an SVI) before
its corresponding lower device (bridge).
To fix this problem we skip resolving link dependencies during handling of
nldump notifications and resolve them at the end instead (when all the
devices are present).
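The deferred resolution can be sketched roughly as below. The struct and
function names are hypothetical; the point is only that the pass runs
after the whole dump has been read:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical interface record: link_ifindex names the lower device. */
struct example_if {
	int ifindex;
	int link_ifindex;	 /* 0 when there is no lower device */
	struct example_if *link; /* NULL until resolved */
};

static struct example_if *example_find(struct example_if *ifs, size_t n,
				       int ifindex)
{
	for (size_t i = 0; i < n; i++)
		if (ifs[i].ifindex == ifindex)
			return &ifs[i];
	return NULL;
}

/*
 * Run once after the dump: every device is present by then, so the order
 * in which the kernel listed them no longer matters.
 * Returns the number of links that could not be resolved.
 */
static size_t example_resolve_links(struct example_if *ifs, size_t n)
{
	size_t unresolved = 0;

	assert(ifs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (ifs[i].link_ifindex == 0)
			continue;
		ifs[i].link = example_find(ifs, n, ifs[i].link_ifindex);
		if (ifs[i].link == NULL)
			unresolved++;
	}
	return unresolved;
}
```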
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Ticket: CM-22388, CM-21796
Reviewed By: CCR-7845
Testing Done:
1. verified on a setup with missing linkages
2. automation - evpn-min
Ensure that when the is_router condition changes for a locally learnt
neighbor, BGP is informed only if the neighbor is active, i.e., the MAC
is also locally learnt.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Reviewed-by: Chirag Shah <chirag@cumulusnetworks.com>
Ticket: CM-22288
Reviewed By: CCR-7832
Testing Done:
1. Failed test
2. vxlan_routing_test.py
Use boolean variables instead of unsigned int for certain VxLAN-EVPN
flags which are really used as boolean.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Reviewed-by: Chirag Shah <chirag@cumulusnetworks.com>
Ticket: CM-22288
Reviewed By: CCR-7832
Testing Done:
Along with a subsequent, related commit
When a remote MAC goes away, but there are neighbors referring to it,
ensure that when the last remote neighbor goes away, the MAC is
uninstalled from the kernel and no longer considered as remote.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Reviewed-by: Chirag Shah <chirag@cumulusnetworks.com>
Ticket: CM-22130
Reviewed By: CCR-7777
Testing Done:
1. Replicated failed scenario and verified with fix.
2. evpn-min
When a MAC moves from local to remote, a replace is allowed; EVPN
no longer has to delete the local MAC before installing the remote
MAC.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Reviewed-by: Chirag Shah <chirag@cumulusnetworks.com>
The linux kernel uses RT_TABLE_MAIN as the table id for ip
routing. The multicast routing tables use RT_TABLE_DEFAULT.
We changed the internal code of zebra_vrf a few months back
to use RT_TABLE_MAIN as the tableid. This caused the pim sg
stats to stop working because of the kernel bug where it
uses a different table for ip routing and ip multicast.
Put a bit of a special case in to do the right thing.
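A sketch of what such a special case could look like, assuming the
translation only applies to the main table. The function name is
hypothetical; RT_TABLE_MAIN and RT_TABLE_DEFAULT come from
<linux/rtnetlink.h>:

```c
#include <assert.h>
#include <linux/rtnetlink.h>
#include <stdint.h>

/*
 * Pick the kernel table id to use for a multicast (S,G) query. The unicast
 * side tracks RT_TABLE_MAIN, while the kernel keeps multicast routes in
 * RT_TABLE_DEFAULT, so translate only that one case.
 */
static uint32_t example_mroute_table_id(uint32_t vrf_table_id)
{
	if (vrf_table_id == RT_TABLE_MAIN)
		return RT_TABLE_DEFAULT;
	return vrf_table_id;
}
```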
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When debugging the mroute code path in zebra, add a bit of additional
data to allow us to know what is going on a bit more.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Newer linux kernels apparently send data down the netlink
bus for the creation of mroutes. Add a bit of code
to notice this and to handle it appropriately (i.e. do
nothing at this point in time), as the correct
place to do this is in the pim socket in pimd.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When we are displaying data about a netlink message
in debugs or errors, print out the message type
as a string instead of a number.
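One way to do the mapping, shown as a sketch. The function name is
hypothetical and the list of types is deliberately short; the RTM_*
constants are the ones from <linux/rtnetlink.h>:

```c
#include <assert.h>
#include <linux/rtnetlink.h>
#include <stdint.h>

/* Map a netlink message type to a printable name for debugs and errors. */
static const char *example_nl_msg_type_str(uint16_t type)
{
	switch (type) {
	case RTM_NEWLINK:
		return "RTM_NEWLINK";
	case RTM_DELLINK:
		return "RTM_DELLINK";
	case RTM_NEWADDR:
		return "RTM_NEWADDR";
	case RTM_DELADDR:
		return "RTM_DELADDR";
	case RTM_NEWROUTE:
		return "RTM_NEWROUTE";
	case RTM_DELROUTE:
		return "RTM_DELROUTE";
	case RTM_NEWNEIGH:
		return "RTM_NEWNEIGH";
	case RTM_DELNEIGH:
		return "RTM_DELNEIGH";
	default:
		return "UNKNOWN";
	}
}
```

A log line can then print both the name and the number, which keeps
unknown types debuggable.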
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
We were linking all libs and binaries against libprotobuf-c if the
option was enabled... that makes no sense at all.
Signed-off-by: David Lamparter <equinox@diac24.net>
Since we're now building through one large Makefile, we can easily put
things with their daemons and crossreference nicely.
Signed-off-by: David Lamparter <equinox@diac24.net>
Debugging inactive nexthops in zebra can be quite difficult
and non-obvious what has gone wrong. Add detailed rib
debugs for the cases where we decide that a nexthop is
inactive so that we can more easily debug a reason
for the failure.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The _route_entry_dump function was not handling the nexthop as passed
in from an upper level protocol appropriately and as such was not
displaying the v4/v6 nexthop correctly when we have both in use.
Additionally dump the nexthop vrf as well.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The RB-Tree used to store rmac information was not properly
handling the v6 address family. Modify the code to allow
this handling.
This cleans up the following error message that was being seen:
zebra[2231]: host_rb_entry_compare: Unexpected family type: 10
It also fixes some connectivity issues that were being seen.
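A sketch of a compare function that handles both families. The struct
and function names are hypothetical and only illustrate the shape of the
fix; family 10 in the message above is AF_INET6 on linux:

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical host key: an address family plus the matching address. */
struct example_host {
	int family; /* AF_INET or AF_INET6 */
	union {
		struct in_addr v4;
		struct in6_addr v6;
	} addr;
};

/* Three-way compare usable as a tree ordering; handles both families. */
static int example_host_cmp(const struct example_host *a,
			    const struct example_host *b)
{
	assert(a != NULL && b != NULL);

	if (a->family != b->family)
		return a->family < b->family ? -1 : 1;

	switch (a->family) {
	case AF_INET:
		return memcmp(&a->addr.v4, &b->addr.v4,
			      sizeof(struct in_addr));
	case AF_INET6:
		return memcmp(&a->addr.v6, &b->addr.v6,
			      sizeof(struct in6_addr));
	default:
		return 0; /* unknown family: treated as equal in this sketch */
	}
}
```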
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
For OpenFabric operation, we need to be able to install routes via
interfaces without any IPv4 addresses configured. Introduce a flag
ZEBRA_FLAG_ONLINK which upper protocols can set on a route they send
towards zebra, to force the nexthops to be considered onlink.
Signed-off-by: Christian Franke <chris@opensourcerouting.org>
Problem reported that some bgp and ospf json commands did not return
any json output at all if the bgp/ospf instance did not exist.
Additionally, some bgp and ospf json commands did not return any json
output if the instance existed but no neighbors were defined. This
fix makes these commands more consistent in returning empty braces for
json output and issuing a message if not using json output. Additionally,
made the flag "use_json" a bool to make it consistent since previously,
it had been defined as an int, char, u_char, and bool at various places.
Ticket: CM-21040
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
This crash occurs only with netns implementation.
The meaning of vrf differs depending on its implementation (netns or
vrf-lite):
- With vrf-lite implementation vrf is a property of the interface that
can be changed as the speed or the state (iproute2 command: "ip link
set dev IF_NAME master VRF_NAME"). All interfaces of the system are in
the same netns and so interface name is unique.
- With netns implementation vrf is a characteristic of the interface
that CANNOT be changed: it is the id of the netns where the interface
is located. To change the vrf of an interface (iproute2 command to
move an interface "ip netns exec VRF_NAME1 ip link set dev IF_NAME
netns VRF_NAME2") the interface is deleted from the old vrf and
created in the new vrf.
Interface name is not unique, the same name can be present in the
different netns (typically the lo interface) and search of interface
must be done by the tuple (interface name, netns id).
Current tests on the vrf implementation (vrf-lite or netns) are not
sufficient. In some cases (for example when an interface is moved from
a vrf X to the default vrf and then moved back to VRF X) we can have a
corruption message and then a crash of zebra.
To avoid this corruption, the test on the vrf implementation, needed when
an interface changes, has been rewritten:
- For all interface changes except deletion the if_get_by_name function,
that checks if an interface exists and creates or updates it if
needed, is changed:
* The vrf-lite implementation is unchanged: search of the interface
is based only on the name and update the vrf-id if needed.
* The netns implementation search of the interface is based on the
(name, vrf-id) tuple and interface is created if not found, the
vrf-id is never updated.
- deletion of an interface (reception of a RTM_DELLINK netlink message):
* The vrf-lite implementation is unchanged: the interface
information is cleared and the interface is moved to the default
vrf if it does not already belong to it (to allow vrf deletion)
* The netns implementation is changed: only the interface
information is cleared and the interface stays in its vrf to
avoid conflict with an interface with the same name in the default
vrf.
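The lookup rules for the non-deletion case can be sketched as below. This
is not the real if_get_by_name; the fixed-size table, the example_* names
and the netns_backend parameter are all assumptions made to keep the
sketch self-contained:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define EXAMPLE_MAX_IFS 8

struct example_if {
	char name[16];
	unsigned int vrf_id;
	bool used;
};

static struct example_if example_ifs[EXAMPLE_MAX_IFS];

static struct example_if *example_if_create(const char *name,
					    unsigned int vrf_id)
{
	for (size_t i = 0; i < EXAMPLE_MAX_IFS; i++) {
		struct example_if *ifp = &example_ifs[i];

		if (ifp->used)
			continue;
		strncpy(ifp->name, name, sizeof(ifp->name) - 1);
		ifp->name[sizeof(ifp->name) - 1] = '\0';
		ifp->vrf_id = vrf_id;
		ifp->used = true;
		return ifp;
	}
	return NULL; /* table full */
}

/*
 * vrf-lite: names are unique system-wide, so match on name alone and
 * update the vrf id. netns: match on (name, vrf_id), never rewrite the
 * vrf id, and create a new interface on a miss.
 */
static struct example_if *example_if_get_by_name(const char *name,
						 unsigned int vrf_id,
						 bool netns_backend)
{
	assert(name != NULL);
	for (size_t i = 0; i < EXAMPLE_MAX_IFS; i++) {
		struct example_if *ifp = &example_ifs[i];

		if (!ifp->used || strcmp(ifp->name, name) != 0)
			continue;
		if (netns_backend) {
			if (ifp->vrf_id == vrf_id)
				return ifp;
			continue; /* same name, different netns */
		}
		ifp->vrf_id = vrf_id;
		return ifp;
	}
	return example_if_create(name, vrf_id);
}
```

With the netns backend, two "lo" entries in different vrfs stay distinct,
which is the case the (name, netns id) tuple is meant to cover.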
This implementation reverts (partially or totally):
commit 393ec5424e ("zebra: fix missing node attribute set in ifp")
commit e9e9b1150f ("lib: create interface even if name is the same")
commit 9373219c67 ("zebra: improve logs when replacing interface to an
other netns")
Fixes: b53686c52a ("zebra: delete interface that disappeared")
Signed-off-by: Thibaut Collet <thibaut.collet@6wind.com>
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When the interface is a virtual ethernet interface, there is no need to
update the link pointer of the interface.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
There exists a possibility that the ifindex we are passed
does not exist and as such we should check for it not
resolving as part of the debug.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
In the case the default netns has a netns path, then a new NETNS
creation will be bypassed.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The Vrf aliases can be known with a specific hook. That hook will then,
from zebra propagate the information to the relevant zapi clients.
The registration hook function is the same for all daemons.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This function is changed so that the interface index is searched across
the correct namespace.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Add a header to clean up a missing declaration and properly
wrap some variables in the appropriate #ifdef.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
We were ignoring mpls labels encapped with static routes.
Added support for single and multipath labels.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The code prior to this change, was allowing clients to register
for nexthop tracking. Then zebra would look up the rnh and
send to that particular client any known data. Additionally
zebra was blindly re-evaluating the rnh for every registration.
This leads to interesting behavior in that all people registered
for that nexthop will get callbacks even if nothing changes.
Modify the code to know if we have evaluated the rnh or not
and if so limit the re-evaluation to when absolutely necessary.
This is of particular importance because nht callbacks
cause protocols to do a not insignificant amount of
work, and as more protocols register for nht callbacks
we will cause more work than is necessary.
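A sketch of the idea, assuming a per-rnh "evaluated" marker. The struct,
its counters and the function names are hypothetical and stand in for
the real evaluation and notification paths:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tracked-nexthop record. */
struct example_rnh {
	bool evaluated;	       /* true once the nexthop has been resolved */
	unsigned int evals;    /* number of full evaluations run */
	unsigned int notifies; /* number of per-client notifications sent */
};

static void example_rnh_evaluate(struct example_rnh *rnh)
{
	rnh->evals++; /* stands in for the expensive resolution work */
	rnh->evaluated = true;
}

/*
 * Handle one client registration: evaluate only if nobody has done so
 * yet, then tell just the registering client what is currently known.
 */
static void example_rnh_register(struct example_rnh *rnh)
{
	assert(rnh != NULL);
	if (!rnh->evaluated)
		example_rnh_evaluate(rnh);
	rnh->notifies++;
}
```

Three registrations for the same nexthop then cost one evaluation and
three single-client notifications, instead of three evaluations that
each call back every registered client.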
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
These MIB OIDs were only used to identify clients on the SMUX protocol.
And even for that, they were essentially pointless.
Signed-off-by: David Lamparter <equinox@diac24.net>
The ZEBRA_IPV4_ROUTE_[ADD|DELETE] and ZEBRA_IPV6_ROUTE_[ADD|DELETE] functionality
has been deprecated for a year now, let's remove this code from the system.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The zebra/client_main.c code is not being maintained or used.
Remove from system. Especially since the encode/decode
zapi functionality it `purports` to be testing is deprecated
and now being removed.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Handle Remote Neigh entry state change from Router to Host.
A remote MAC-IP update may not contain the EVPN NA Extended community;
Zebra needs to accommodate a router_flag change for an existing neigh
and install it with or without the Router Flag (R-bit).
Testing:
Have locally run MAC/IP (neigh entry) with R-bit set.
Check on the remote VTEP that 'show bgp evpn route ...mac ip' and
'show evpn arp-cache ...' contain the router flag.
Change the host to remove the R-bit, so that the locally learnt entry
removes the Router flag. This results in the remote vtep removing the
R-bit from the neigh entry.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Neigh update can have a router_flag change, from unset to set and
vice versa. This is the case where MAC, IP and VLAN are the same but
the entry's flag moved from R to not R bit, and the reverse case.
Router flag change needs to trigger bgpd to inform all evpn peers
to remove from the evpn route.
Testing Done:
Send GARP with and without R bit from host and validate neigh entry
and evpn neigh and mac-ip route entry in zebra and bgpd.
Check Peer VTEP evpn route entry where router flag is (un)set.
With R-bit
Route [2]:[0]:[0]:[48]:[00:1f:2f:db:45:a6]:[128]:[2006:33:33:2::10]
VNI 1001
Imported from
27.0.0.16:5:[2]:[0]:[0]:[48]:[00:1f:2f:db:45:a6]:[128]:[2006:33:33:2::10]
4435 5551
27.0.0.16 from MSP1(uplink-1) (27.0.0.9)
Origin IGP, valid, external, bestpath-from-AS 4435, best
Extended Community: RT:5551:1001 ET:8 ND:Router Flag
AddPath ID: RX 0, TX 1261
Last update: Wed Aug 15 20:52:14 2018
Without R-bit
Route [2]:[0]:[0]:[48]:[00:1f:2f:db:45:a6]:[128]:[2006:33:33:2::10]
VNI 1001
Imported from
27.0.0.16:5:[2]:[0]:[0]:[48]:[00:1f:2f:db:45:a6]:[128]:[2006:33:33:2::10]
4435 5551
27.0.0.16 from MSP2(uplink-2) (27.0.0.10)
Origin IGP, valid, external, bestpath-from-AS 4435, best
Extended Community: RT:5551:1001 ET:8
AddPath ID: RX 0, TX 1263
Last update: Wed Aug 15 20:53:10 2018
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
The neigh update can come prior to the mac add update.
In this case, the mac will be auto created for the vni.
Set the router flag on the local neigh update for a mac with the auto flag.
The neigh update will be sent to bgpd once the local mac is learnt.
Unset the router flag if the neigh update comes without the router flag
for an existing neigh entry.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Enhance the EVPN MAC and Neighbor cache display to show additional
information such as the mobility sequence numbers and the state.
Ensure that the neighbor state is set in a couple of places so
that the display is correct.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Implement procedures similar to what is specified in
https://tools.ietf.org/html/draft-malhotra-bess-evpn-irb-extended-mobility
in order to support extended mobility scenarios in EVPN. These are scenarios
where a host/VM move results in a different (MAC,IP) binding from earlier.
For example, a host with an address assignment (IP1, MAC1) moves behind a
different PE (VTEP) and has an address assignment of (IP1, MAC2) or a host
with an address assignment (IP5, MAC5) has a different assignment of (IP6,
MAC5) after the move. Note that while these are described as "move" scenarios,
they also cover the situation when a VM is shut down and a new VM is spun up
at a different location that reuses the IP address or MAC address of the
earlier instance, but not both. Yet another scenario is a MAC change for an
attached host/VM i.e., when the MAC of an attached host changes from MAC1 to
MAC2. This is necessary because there may already be a non-zero sequence
number associated with MAC2. Also, even though (IP, MAC1) is withdrawn before
(IP, MAC2) is advertised, they may propagate through the network differently.
The procedures continue to rely on the MAC mobility extended community
specified in RFC 7432 and already supported by the implementation, but
augment it with an inheritance mechanism that understands the relationship
of the host MACIP (ARP/neighbor table entry) to the underlying MAC (MAC
forwarding database entry). In FRR, this relationship is understood by the
zebra component which doubles as the "host mobility manager", so the MAC
mobility sequence numbers are determined through interaction between bgpd
and zebra.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
When a host moves and is locally reachable, if the local neighbor event
is received before the local MAC event, flag the neighbor as inactive
just as would happen in the case of a new host. This ensures that the
MACIP route will get originated as soon as the local MAC event is received.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
* Added code for "match ipv6 address prefix list" command
* Added common function route_match_address_prefix_list() to process
routemap for AFI_IP and AFI_IP6 address family
Signed-off-by: kssoman <somanks@vmware.com>
* Check for the modified routemap in zebra_route_map_process_update_cb()
* Added zebra_rib_table_rm_update() for RIB routemap processing
* Added zebra_nht_rm_update() for NHT routemap processing
Signed-off-by: kssoman <somanks@vmware.com>
In order for connected routes to be installed the if_is_operative
function is called. This function checks the status of ptm
and decides to use ptm enabled/disabled on the interface.
The call to zebra_ptm_get_enable was returning true and causing
the interface subsystem to do the wrong thing. Modify the
internal bfd case so that the check for ptm enabled reports
that it is not enabled.
Tested-by: Mark Stapp <mjs@voltanet.io>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The CMSG_FIRSTHDR was broken on solaris pre version 9. Version 9
was released in May of 2002 and EOL'ed in 2014. Version 8 EOL'ed
in 2012. Remove special case code for a little used platform
that has not seen the light of day in a very long time.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
* Use the correct license header
* Stop headers from including themselves
* Use uniform relative include conventions
* Ensure that sources include what they use
* Turn off clang-format around struct array blocks
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
We must hide only "pseudowire IFNAME" from vtysh, the "no" form of the
command should be made available to the extract.pl script. Split the
command into two to fix this problem.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
There is no need to check for failure of an ALLOC call,
as any failure to allocate will result in an assert
happening. So we can safely remove all of this code.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
On `zebra` / `bfdd` shutdown we now clean up all client data to avoid
memory leaks (ghost clients). This also prevents 'slow' shutdown on
`zebra` sparing us from seeing some rare topotests shutdown failures
(signal handler getting stopped by signal).
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
This will make `bfdd` synchronize with its client when zebra dies or
bfdd is restarted.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
When `bfdd` is enabled - which it is by default - re-route the PTM-BFD
messages to the FRR's internal BFD daemon instead of the external
PTM daemon.
This will help the migration of BFD implementations and avoid
duplicating code.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
The client socket value can only be modified by the main thread.
Modifying the client socket from within the client I/O pthread
introduces race conditions.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Socket should be closed in zserv_client_free() and nowhere else.
Credit to Mark Stapp for catching this one.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Rename some things to be less confusing
* Convert client close function to take a client struct rather than a
task
* Extern client close function and use it when handling SIGTERM
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Allow protocols to specify to zebra that they would like zebra
to use the distance passed down as part of determine sameness for
Route Replace semantics.
This will be used by the static daemon to allow it to have
backup static routes with greater distances.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This is the start of separating out the static
handling code from zebra -> staticd. This will
help simplify the zebra code and isolate static
route handling to its own code base.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
As part of moving the static route handling to its own daemon
allow zebra to accept static route types from upper level
protocols.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
show evpn mac vni all
show evpn mac vni x
do not include the local svi and anycast macs in the count.
Ticket:CM-20456
Testing Done:
Before:
TOR1# show evpn mac vni 1008
Number of MACs (local and remote) known for this VNI: 4
MAC Type Intf/Remote VTEP VLAN
44:38:39:00:6b:4c local vlan1008 1008
00:02:00:00:00:04 local hostbond5 1008
00:02:00:00:00:02 local hostbond4 1008
00:00:5e:00:01:01 local vlan1008-v0 1008
00:02:00:00:00:0c remote 27.0.0.15
00:02:00:00:00:0a remote 27.0.0.15
dell-s6000-07#
After:
TOR1# show evpn mac vni 1008
Number of MACs (local and remote) known for this VNI: 6
MAC Type Intf/Remote VTEP VLAN
44:38:39:00:6b:4c local vlan1008 1008
00:02:00:00:00:04 local hostbond5 1008
00:02:00:00:00:02 local hostbond4 1008
00:00:5e:00:01:01 local vlan1008-v0 1008
00:02:00:00:00:0c remote 27.0.0.15
00:02:00:00:00:0a remote 27.0.0.15
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
When I did a show ip route with `json` on a vrf when it didn't exist,
frr would output invalid json.
Signed-off-by: Nathan Van Gheem <nathan@cumulusnetworks.com>
The parameter was missing in that vty command, so add it.
Also some documentation is refreshed.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
NLMSG_NEXT decrements the buffer length (status) by
the header msg length (nlmsg_len) every time it's called.
If nlmsg_len isn't accurate and set to be larger than
what it should represent, it will cause status to
decrement past 0. This makes NLMSG_NEXT return a
pointer that references an inaccessible address.
When that is passed to NLMSG_OK, it segfaults.
Add a check to verify that there is still something to read
before we try to.
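A sketch of a guarded walk over a netlink buffer, assuming a signed
remaining-length counter. The function name is hypothetical; NLMSG_OK
and NLMSG_NEXT are the macros from <linux/netlink.h>:

```c
#include <assert.h>
#include <linux/netlink.h>
#include <stddef.h>

/*
 * Count the well-formed messages in buf. The explicit `len > 0` test stops
 * the walk as soon as NLMSG_NEXT has pushed the remaining length to zero
 * or below, before the next header is looked at.
 */
static int example_count_netlink_msgs(void *buf, int len)
{
	struct nlmsghdr *h = buf;
	int count = 0;

	assert(buf != NULL);
	while (len > 0 && NLMSG_OK(h, len)) {
		count++;
		h = NLMSG_NEXT(h, len);
	}
	return count;
}
```

NLMSG_NEXT subtracts the aligned message length, so a length that is
valid but unaligned at the end of the buffer can leave `len` negative;
the guard covers that case.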
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Prefix length validation checks should be returning an error
rather than 0. Switch to that and make them error messages.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Change the fuzzing code so that it fakes data from
the listening socket rather than using its own pseudo one.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Commit a2ca67d1d2 consolidated IPv4 and IPv6 handling. It also applied
our ignorance for IPv4 srcdest routes onto IPv6.
Signed-off-by: Christian Franke <chris@opensourcerouting.org>
Each ipset with port value monitors either src port or dst port.
The information is added to the show pbr iptable command.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Bad nexthop messages from netlink were causing zebra
to hang here. Added a check to verify the length
of the nexthop so it doesn't keep trying to read.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Some more address family filters we can safely ignore
as well as typos in logger. Added AF_MPLS as filterable.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Zebra needed a check that verifies the prefix length
of an address is a valid length when receiving route
changes and interface address changes.
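A sketch of such a check. The function name and the choice of -1 as the
error value are assumptions; the limits are the address widths of IPv4
and IPv6:

```c
#include <assert.h>
#include <sys/socket.h>

/* Return 0 if prefixlen is legal for the family, -1 otherwise. */
static int example_check_prefixlen(int family, unsigned int prefixlen)
{
	switch (family) {
	case AF_INET:
		return prefixlen <= 32 ? 0 : -1;
	case AF_INET6:
		return prefixlen <= 128 ? 0 : -1;
	default:
		return -1; /* unsupported family */
	}
}
```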
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Zebra needed a check for mtu from the message it
received from the kernel before adding the new link.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
The zebra netlink socket was attempting to read netlink
messages with invalid address families in a couple areas.
Added filters and warn messages.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
This code allows you to fuzz the netlink listening socket
in zebra by --enable-fuzzing and passing the -w [FILE]
option when running zebra.
File collection is stored in /var/run/frr/netlink_*
where each number is just a counter to keep the
files distinct.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
To keep configuration consistent, vrfs that could not be
associated with a netns are removed.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This test case happens in scenarios with mininet, where external netns
may be impossible for the local instance to be modified. The error is
ignored and the netns parsed is ignored too.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The EVPN ND ext community supports the NA flag R-bit, to have proxy ND.
Set the R-bit in EVPN NA if a given router is the default gateway or there
is a local router attached, which can be determined based on the local
neighbor entry.
Implement BGP ext community attribute to generate and parse the R-bit and
pass it along to zebra to program the neigh entry in the kernel.
Upon receiving MAC/IP update with community type 0x06 and sub_type 0x08,
pass the R-bit to zebra to program neigh entry.
Set NTF_ROUTER in neigh entry and inform kernel to do proxy NA for EVPN.
Ref:
https://tools.ietf.org/html/draft-ietf-bess-evpn-na-flags-01
Ticket:CM-21712, CM-21711
Reviewed By:
Testing Done:
Configure Local vni enabled L3 Gateway, which would act as router, checked
show evpn arp-cache vni x ip <ip of svi> on originated and remote VTEPs.
"Router" flag is set.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
It was reported that "show ipv6 route vrf <vrfname>", "show ipv6 route
vrf <vrfname> ::/0 " or "show ipv6 route vrf <vrfname> json" all
displayed that the nexthop was in the default vrf. This was because
the kernel netlink messages would supply the RTA_OIF of the loopback
interface for the kernel-created default route for the vrf, where ipv4
did not supply any RTA_OIF. This fix suppresses the display if the
nexthop and route entry are in different vrfs and the nexthop is
NEXTHOP_TYPE_BLACKHOLE.
Ticket: CM-21722
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Kernel requests via netlink are synchronous.
Therefore we do not need to specify a need for a ACK and
we can make the netlink_cmd NONBLOCKING
1) If the netlink message is going to cause an error
we will still get one. Since results from the kernel
are synchronous we will get the error message on the
netlink_cmd socket and handle it
2) If the netlink message is going to send more than
one packet we will still get them all. Since the results
from the kernel are synchronous we will receive all data.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When creating a netlink_socket, listen to error
codes and abandon ship if it crashes and burns.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The fuzzing code was calling zebra_client_create which was refactored to zserv_client_create.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add 'const' to prefix args to several zebra route update,
redistribution, and route owner notification apis.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
The search algorithm for interface based on ifindex only is adapted to
vrf netns based too. Only the default netns will be used to search the
interface index.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The interface lookup based on ifindex, in the case where the target vrf is
unknown, uses the generic vrf api. That way, in the case of vrf
based on netns, netns other than the default one are not
searched.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The interface lookup algorithm differs depending on whether we are using
netns-based vrfs or not. In the former case, we only have to
parse the interfaces of the netns, while in the latter case we
have to parse all the interfaces of all the vrfs (since indexes do not
overlap in the latter case).
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
SVI interface ip/hw address is advertised by the GW VTEP (say TORC11) with
the default-GW community. And the receiving VTEP (say TORC21) installs the GW
MAC as a dynamic FDB entry. The problem with this is a rogue packet from a
server with the GW MAC as source can cause a station move resulting in
TORC21 hijacking the GW MAC address and blackholing all inter rack traffic.
Fix is to make the GW MAC "sticky" pinning it to the GW VTEP (TORC11). This
commit does it by installing the FDB entry as static if the MACIP route is
received with the default-GW community (mimics handling of
mac-mobility-with-sticky community)
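A sketch of the state selection, under the assumption that a static FDB
entry is expressed as NUD_NOARP and a dynamic one as NUD_REACHABLE. The
function name and its two parameters are hypothetical; the NUD_*
constants come from <linux/neighbour.h>:

```c
#include <assert.h>
#include <linux/neighbour.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Choose the neighbour state for a remote MAC's FDB entry. Sticky MACs and
 * MACs received with the default-GW community are pinned as static so a
 * station move cannot displace them.
 */
static uint16_t example_fdb_state(bool sticky, bool remote_default_gw)
{
	if (sticky || remote_default_gw)
		return NUD_NOARP;
	return NUD_REACHABLE;
}
```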
Sample output from TORC12 with TORC11 set up as gateway -
root@TORC21:~# net show evpn mac vni 1004 mac 00:00:5e:00:01:01
MAC: 00:00:5e:00:01:01
Remote VTEP: 36.0.0.11 Remote-gateway Mac
Neighbors:
45.0.4.1
fe80::200:5eff:fe00:101
2001:fee1:0:4::1
root@TORC21:~# bridge fdb show |grep 00:00:5e:00:01:01|grep 1004
00:00:5e:00:01:01 dev vx-1004 vlan 1004 master bridge static
00:00:5e:00:01:01 dev vx-1004 dst 36.0.0.11 self static
root@TORC21:~#
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Ticket: CM-21508
While ZAPI I/O threads make a best effort to kill any scheduled tasks on
their threadmasters, after death another pthread can continue to
schedule onto the threadmaster. This isn't a problem per se since the
tasks will never run, but it also means that asserting that it hasn't
happened is pointless.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
The warning given by PVS-Studio is related to per-element overflow (there is
no real overflow, because of how elements are mapped in the union). This
same warning is typically reported by Coverity, too.
Signed-off-by: F. Aragon <paco@voltanet.io>
Problem created by the fix for cm-21306 (inactive cross-vrf static routes
when vrfs were bounced.) Determined that in another case, that fix would
cause duplicate nexthops to appear in the table. Resolved the problem by
removing the vrf static route process from the zebra "add" process leaving
it in the zebra "if up" process as added in cm-21306 since that's the point
that the vrf device is now functional.
Ticket: CM-21429
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
This correction fixes two bugs detected by Clang scan:
Bug Group: Dead store
Bug Type: Dead assignment
File: zebra/kernel_netlink.c
Function: netlink_parse_extended_ack
Line: 548
Bug Type: Dead increment
File: isisd/isis_lsp.c
Function: lsp_bits2string
Line: 625
Signed-off-by: F. Aragon <paco@voltanet.io>
Incoming iptable entries with a fragment parameter are handled.
An iptable context is created for each fragment value received from BGP.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The packet length is added to iptable zapi message.
The iptable structure then takes the pkt_len field into account.
The show pbr iptable command displays the packet length used if any.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The icmp type/code is displayed.
Also, the flags are correctly set in case ICMP protocol is elected.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When in a dev build add a bit of code to track max
depth of a fifo and to allow zebra to report on it.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This is an additional correction after 45981fda06 / PR #2462. I hope
this fixes the Coverity warning (I've added an additional check for ensuring
the string provided by the inotify read is zero-terminated).
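
A zero-termination check of this kind can be sketched as below. This is
an assumption about the general technique, not the code from the commit;
name_is_terminated() is a hypothetical helper name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true only if a NUL byte exists within the first len bytes of
 * name, i.e. the buffer can safely be treated as a C string.
 */
bool name_is_terminated(const char *name, size_t len)
{
	assert(name != NULL);
	return len > 0 && memchr(name, '\0', len) != NULL;
}
```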
Signed-off-by: F. Aragon <paco@voltanet.io>
When a filter function fails to work correctly, we get an
error message that something has gone wrong. Unfortunately
we may not have any clues as to where the decode failure
happened. Add a backtrace to give us a clue.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When we receive a netlink message from the kernel we have
handler functions for when we send a netlink command, if these
return a failure ( < 0 ) then we output that we had a parse
issue. But if all we get is:
2018-06-21T23:47:45.298156+00:00 qct-ix1-08 zebra[1484]: netlink-cmd (NS 0) filter function error
Then it is not very useful to figure out *where* the error happened.
Add more error code when in a decode path to hopefully allow us
to figure out where this message is coming from.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This is a correction over 7f61ea7bd4 in order
to avoid the TAINTED_SCALAR Coverity warning (ending in "Untrusted array
index read"). This is equivalent to the previous commit, but avoiding
pointer arithmetic with tainted variables.
Signed-off-by: F. Aragon <paco@voltanet.io>
Add code to request and read in extended ack information
to provide a bit more context of what went wrong when
a failure is detected in the kernel.
Example of a failed delete:
Jun 20 21:19:25 robot zebra[11878]: Extended Error: Invalid prefix for given prefix length
Jun 20 21:19:25 robot zebra[11878]: netlink-cmd (NS 0) error: Invalid argument, type=RTM_DELROUTE(25), seq=8, pid=4078403400
Jun 20 21:19:25 robot zebra[11878]: 0:4.3.2.0/24: Route Deletion failure
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This is a correction over 32ac96b2ba, so
removing the forced string null termination doesn't involve a worse situation
than before (the underflow check should protect for the case of receiving
an incomplete buffer, which would be the cause of non-zero terminated string)
Signed-off-by: F. Aragon <paco@voltanet.io>
The route_map_walk_update_list callback function
never uses the return code, so just remove it.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add some basic code for zebra to start to keep track
of route-maps that have changed. At this point we
are not doing anything. As we fix code to handle
route-maps better, code will be shifted around.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Problem reported that if the vrf device is taken down and then brought
back up, any static route referencing that vrf device was not
re-installed. This fix runs back thru the static routes that
reference the vrf device coming up and re-install them.
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Hide following l3vni config from DEFAULT_VRF instance
until it is fully supported.
TORS1(config)# vni 2222456 prefix-routes-only
Ticket:CM-20572
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Cleanup the zebra code to test for failure for reading
from stream once instead of once to see if we should
debug and once for the actual failure.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Newer versions of clang are detecting function parameters that we should
not be casting as such. Fix these issues.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The IFLA_INFO_SLAVE_KIND constant is always defined now that we imported
our own copies of the Linux kernel headers. Remove the preprocessor
checks since they aren't necessary anymore.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
When we have a host prefix, actually free the alloced memory
associated with it when we free it.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When debugging code in redistribute.c, it is useful to output
the vrf we think the interface is in. So display it
when we are debugging.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Programs that link to libnetsnmp must be compiled using a special set
of flags as specified by the "net-snmp-config --base-cflags" command
(whose output is stored in the SNMP_CFLAGS variable). The problem is
that "net-snmp-config --base-cflags" can output -std=c99 in addition to
other compiler flags in some platforms, and this breaks the build since
FRR source code makes use of some GNU compiler extensions (e.g. allow
trailing commas in function parameter lists). In order to solve this
problem, append -std=gnu99 after SNMP_CFLAGS in all makefiles where this
variable is used. This way the -std=c99 flag will be overwritten when it's
present. Source files that don't link to libnetsnmp will be compiled using
either -std=gnu99 or -std=gnu11 depending on the compiler availability.
Fixes #1617.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
This fix is a workaround for a vtysh limitation.
Because table identifier should be accessible in configuration only for
vrf netns backends, there was a need to differentiate the vty commands.
Unfortunately, vtysh parses the two commands without knowing which
command has really been installed.
Using one single vty command will avoid having this issue in vtysh.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
By default, nothing is displayed. If vrf backend is linux network
namespaces, then "netns-based vrfs" is displayed, before dumping the
list of VRFs.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
In the case where vrf backend is netns, then the list of ns tables may
be extended. A single list is kept, but an attribute is added: the ns_id.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
As table_id for VRF with netns backend is main table (RT_TABLE_MAIN or
zebrad.rtm_table_default), this makes it possible to return the table id
that is to be configured for those cases (in addition to the default
VRF). In other cases (VRF Lite presumably), the vrf table_id is
returned.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Add the table keyword for all ip route/ip mroute/ipv6 route commands
that are available. Also, a table identifier is added to the main
structure.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Add a bit of code to allow return of data plane
request messages.
Add the ability to pass the result back to callers
of kernel_route_rib.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The SOUTHBOUND_XXX enum was named a bit poorly.
Let's use a bit better name for what we are trying to do.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
I mistakenly used an external mechanism to cause a pthread to shut
itself down instead of using the one built into frr_pthread.[ch]. This
created a race condition whereby a pthread could schedule work onto a
dead pthread and cause it to reanimate.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Coalesce multiple write() syscalls into one
* Write larger chunks
* Decrease default read limit to 1000
* Remove unnecessary operations from hot loop (zserv_write)
* Move cross-schedule out of obuf lock
* Use atomic ops to update atomic variable
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Cancelling threads is nice but they can potentially be scheduled again
after cancellation without an explicit check.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Only one I/O task can be scheduled per file descriptor. Having two
separate tasks for buffer filling and buffer flushing was breaking that
invariant and causing messages to never be written.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Separate flush task from write task, so we can continue adding to the
write buffer while it's waiting to flush
* Handle write errors sooner rather than later
* Only schedule a process job if we have packets to process
* Tweak zserv_process_messages to not reschedule itself and rely on
zserv_read() to do so in all proper cases
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Label manager reaches its hands into session / IO code for zserv for
whatever reason, gotta handle that.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Simplify zapi_msg <-> zserv interaction
* Remove header validity checks, as they're already performed before the
packet ever makes it here
* Perform the same kind of batch processing done in zserv_write by
copying multiple inbound packets under lock instead of doing serial
locking
* Perform self-scheduling under the same lock
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Dequeue all pending messages when writing and push them all into the
write buffer. This removes the necessity to self-schedule, avoiding a
mutex lock, and should also maximize throughput by not writing 1 packet
per job.
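
As a rough sketch of the batching idea (not the zserv implementation):
the lock is taken once, every pending message is copied into one
contiguous buffer, and the queue is emptied. The out_queue type, the
fixed message size and drain_all() are assumptions made for
illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define MAX_MSGS 8
#define MSG_LEN 4

/* Hypothetical outbound queue of fixed-size messages. */
struct out_queue {
	pthread_mutex_t lock;
	char msgs[MAX_MSGS][MSG_LEN];
	size_t count;
};

/*
 * Copy all pending messages into wbuf under a single lock acquisition
 * and empty the queue. Returns the number of bytes staged, which a
 * caller could then hand to one write() call.
 */
size_t drain_all(struct out_queue *q, char *wbuf, size_t wbuf_len)
{
	size_t used = 0;

	/* The write buffer must be able to hold a full queue. */
	assert(wbuf_len >= (size_t)MAX_MSGS * MSG_LEN);

	pthread_mutex_lock(&q->lock);
	for (size_t i = 0; i < q->count; i++) {
		memcpy(wbuf + used, q->msgs[i], MSG_LEN);
		used += MSG_LEN;
	}
	q->count = 0;
	pthread_mutex_unlock(&q->lock);

	return used;
}
```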
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Increase the maximum number of packets to read per read job
* Store read packets in a local cached buffer to avoid mutex overhead
* Only update last-read time / last-command if we actually read a packet
* Add missing log line for corrupt header case
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Add centralized thread scheduling dispatchers for client threads and
the main thread
* Rename everything in zserv.c to stop using a combination of:
- zebra_server_*
- zebra_*
- zserv_*
Everything in zserv.c now begins with zserv_*.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Since it is already quite difficult to understand the various pieces
going on here, I reorganized the file to make it much cleaner and easier
to understand. The organization is now:
zserv.c:
,---------------------------------.
/ include statements |
| ... |
| ... |
| -------------------------------- |
| Client pthread server functions |
| ... |
| ... |
| -------------------------------- |
| Main pthread server functions |
| ... |
| ... |
| -------------------------------- |
| CLI commands, other |
| ... |
| ... |
\_________________________________/
No code has been changed; the functions have merely been moved around.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Time counters need to use atomic access between threads
* After a client disconnects, we properly kill the thread but need to
free its frr_pthread as well
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Add doc comments explaining hairy bits of thread lifecycle
* Remove t_suicide as it no longer makes sense
* Remove client double-free
* Remove unnecessary THREAD_OFF being used in incorrect pthread context
* Eliminate unnecessary racey access to client's obuf_fifo
* Ensure zserv_process_messages() reschedules itself if it has not
finished its work
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
When we receive a route that we think we own and we
are not in startup conditions, then add a small debug
to help debug the issue when this happens, instead
of silently just ignoring the route.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The re-use of RTPROT_STATIC has caused too many collisions
where other legitimate route sources are causing us to
believe we are the originator of the route. Modify
the code so that if another protocol inserts RTPROT_STATIC
we will assume it's a Kernel Route.
Fixes: #2293
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
With:
commit ba7773964c
Author: Renato Westphal <renato@opensourcerouting.org>
Date: Wed Sep 20 22:12:56 2017 -0300
We added our own copy of if_link.h (among others). This
file unconditionally defines IFLA_WIRELESS, so we don't need
the conditional defines in the if_netlink.c code...
Issue: https://github.com/FRRouting/frr/issues/2299
Signed-off-by: Arthur Jones <arthur.jones@riverbed.com>
After PBR or BGP sends a request to delete a rule/ipset/ipset
entry/iptable, there may be an issue deleting it. A notification
is sent back with a new value indicating that the removal failed.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This hook can be used if the plugin module wrap_script is used.
This hook is called to dump the debugging status of this module, on the
vty.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The following PBR handlers, ipset and iptables, will first
call the hook from a possible plugin.
If a plugin is attached, then it will return a positive value.
That is why the return status is tested against 0, since that
value means that no plugin module is attached.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Upon reception of an iptable_add or iptable_del, a list of interface
indexes may be passed in the zapi interface. The list is converted into
interface names so that it is ready to be programmed into the
underlying system.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Those 3 fields are read and written between zebra and bgpd.
This permits extending the ipset_entry structure.
The following combinations will be possible:
- filtering with one of the src/dst port.
- filtering with one of the range src/ range dst port
usage of src or dst is exclusive in a FS entry.
- filtering a port or a port range based on either src or dst port.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Two new vty show functions available:
show pbr ipset <NAME>
show pbr iptables <NAME>
These functions dump the underlying "kernel" contexts. They rely on the
zebra pbr contexts. This helps to know which zebra pbr
context has been configured, since those contexts are mainly configured
by BGP Flowspec.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When a mark is set, incoming traffic having that mark set can be
redirected to a specific table identifier. This work is done through
netlink.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
In case the removal of an iptable or an ipset pbr context is done,
then a notification is sent back to the relevant daemon that sent the
message.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Upon the remote daemon leaving, some contexts may have to be flushed.
This commit does the change. IPset and IPSet Entries and iptables are
flushed.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This commit is a fix that removes the structure from the hash list,
instead of just removing that structure.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Add ns_id into zebra_pbr ipset
This is important so that each ipset entry knows into which NETNS it
must be injected.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Fix the code so that we would actually start receiving
RULE netlink notifications.
The Kernel expects the long long to be a bit field
value, while the newer netlink message types are
an enum. So we need to convert the message type
number to a bit position and set that value.
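
The conversion can be sketched like this. The group numbers below are
meant to mirror the RTNLGRP_IPV4_RULE and RTNLGRP_IPV6_RULE enum values
from <linux/rtnetlink.h>, but they are redefined locally (with
hypothetical EX_ names) so the sketch does not depend on kernel headers;
group_to_mask() is likewise an illustrative helper, not a zebra
function.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative copies of the rule notification group numbers. */
enum {
	EX_GRP_IPV4_RULE = 8,
	EX_GRP_IPV6_RULE = 19,
};

/*
 * Group numbers start at 1, so group N corresponds to bit N - 1 of
 * the 32-bit group mask used when binding a netlink socket.
 */
uint32_t group_to_mask(unsigned int group)
{
	assert(group >= 1 && group <= 32);
	return (uint32_t)1 << (group - 1);
}
```

A caller would OR group_to_mask(EX_GRP_IPV4_RULE) and
group_to_mask(EX_GRP_IPV6_RULE) into the mask alongside the other
groups it already subscribes to.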
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Move where we check for non-kernel netlink messages to
a slightly earlier spot. This will allow in subsequent
commits the removal of an extra parameter that needs to
be passed around.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The BPF filter was an exclusion list of netlink messages
we did not want to receive from ourselves. The problem
with this is that the exclusion list was and will be
ever growing. So switch the test around to an inclusion
list since it is shorter and not growing. Right
now this is RTM_NEWADDR and RTM_DELADDR.
Change some of the debug messages to error messages
so that when something slips through and it is unexpected
during development we will see the problem.
Also try to improve the documentation about what
the filter is doing and leave some breadcrumbs for
future developers to know where to change code
when new functionality is added.
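
The inclusion-list decision can be expressed as a plain C predicate.
This is only a user-space sketch of the logic, not the BPF program
itself; keep_message() and the EX_ constants are hypothetical names,
with the constants set to the rtnetlink values of RTM_NEWADDR and
RTM_DELADDR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the RTM_NEWADDR / RTM_DELADDR message type numbers. */
enum {
	EX_RTM_NEWADDR = 20,
	EX_RTM_DELADDR = 21,
};

/*
 * Messages from other senders are always kept. Messages that we sent
 * ourselves are kept only if their type is on the inclusion list.
 */
bool keep_message(uint32_t sender_pid, uint32_t our_pid, uint16_t type)
{
	if (sender_pid != our_pid)
		return true;

	return type == EX_RTM_NEWADDR || type == EX_RTM_DELADDR;
}
```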
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
In case, the BGP or PBR daemon leaves, the PBR contexts created by this
daemon are flushed.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The linux kernel is getting the same Route Replace semantics
for v6 that v4 uses. Allow the end-user to know if their
kernel has this ability and if so to specify it so zebra
can take advantage of this.
Why not do auto-detection? Because you would have to write
code in zebra to add a route then add the same route again
with different nexthops to see which semantics it is using.
It sure is easier to just add a cli that allows the user to
do it.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Setup the buf used for extra data passed into kernel such
that we are cleaning it out before writing data to it,
so we can avoid writing uninited data.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add to zebra route-maps the ability to match on a source-instance
route-map FOO deny 55
match source-instance 5
route-map FOO permit 60
ip protocol any route-map FOO
This will match any protocol route installation with a source-instance of 5.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The neighbor host_list is expensive as well. Modify
the code to take advantage of a rb_tree as well.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
We are going to modify more host_list's to host_rb's
so let's rename some functions to take advantage of
what is there.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The host_list when we attempt to use it at scale, ends
up spending a non-trivial amount of time finding and
sorting entries for the host list. Convert to a rb tree.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ticket: CM-15658
Reviewed By: CCR-6534
Testing Done: Unit
Issue: frr ptm-enable command not working for interfaces that have been created by frr as a place holder.
Root Cause: The ptm-enable on interface configuration was not getting stored when the interface was internally created by frr.
Fix: Store the ptm-enable configuration even if the interface is internally created.
Signed-off-by: Radhika Mahankali <radhika@cumulusnetworks.com>
Ensure that the next hop of the leaked VRF is not overwritten when the
route is being imported into the target VRF from the VPN table. Also, in
the case of multipath routes, ensure that the nexthop's ifindex is not
inadvertently reset.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Netlink messages from the kernel need to be received in a buffer larger
than 8K in order to handle some types of info - for example, the VLAN
information. Define a separate size for receive and set it to 32K, which
is the value used by other netlink receivers like iproute2.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
When zebra starts up it receives from the kernel a full dump of
interface information. Unfortunately it is in no particular order.
As such we sometimes receive data from the kernel about interfaces
we do not know about yet.
In this bug, we are attempting to use the interface pointer(->link)
for a vlan interface that we have not properly resolved.
This fix ensures that we will not attempt to call zvni_map_svi
if we have a NULL pointer. There are other places in the code
we are already checking for the fact that the ->link pointer
is valid before calling this function, so I believe that this
is correct.
We do need to come back and resolve all ->link pointers
after we have received the full table. This can be
done in another commit.
Ticket: CM-17041
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
We have a command to enable symmetric routing only for type-5 routes.
This command is provided under vrf <> option in zebra as follows:
vrf <VRF>
vni <VNI> [prefix-routes-only]
We need the corresponding no version of the command as well as follows:
vrf <VRF>
no vni <VNI> [prefix-routes-only]
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
For ipv6 host, the next hop is converted to ipv6 mapped address.
However, the remote rmac should still be programmed with the ipv4 address.
This is how the entries will look in the kernel for ipv6 hosts routing.
vrf routing table:
ipv6 -> ipv6_mapped remote vtep on l3vni SVI
neigh table:
ipv6_mapped remote vtep -> remote RMAC
bridge fdb:
remote rmac -> ipv4 vtep tunnel
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Ensure that when EVPN routes are installed into zebra, the router MAC
is passed per next hop and appropriately handled. This is required for
proper multipath operation.
Ticket: CM-18999
Reviewed By:
Testing Done: Verified failed scenario, other manual tests
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
There are cases when switching from one netns to another one, where the
if_table registration by index has not been flushed. This fix mitigates
the potential crashes, in case the ifp->node pointer is null, the value
is overwritten by the route_node obtained.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When checking for a duplicate interface in another NETNS, one may find
an interface in default VRF. That interface may have been moved to that
default VRF, for further action. Prevent from doing any action at this
point.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The log information is better displayed.
Also the variable name fits better with other_ifp, than with old_ifp.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
re->status and re->flags both influence our decision states
for rib processing. Yet it's impossible to see them. Add
a tiny bit of code to allow us to look at them when things
are not behaving like we would expect.
Additionally dump the nexthop->flags at the same time for
the same reasons.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The zns->ns pointer is not created until we get a callback
from the kernel that a ns exists. This should potentially
fix a crash in the *BSD code path.
Fixes: #2152
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Since BGPd is not currently setting ID and PROTOCOL in label
requests, temporarily disable mismatch error propagation.
This commit will be reverted once fixes for BGPd and label
manager are integrated.
Signed-off-by: Fredi Raspall <fredi@voltanet.io>
The current implementation did not consider multiple clients to
a label-manager acting as proxy, i.e. relaying messages to another
label manager. Specifically, upon a client's request, it checked
the socket & buffer from the actual label manager for pending
responses and directly copied them to the client --currently--
being served. As a result, if two clients (e.g. ldpd and bgpd)
sent requests, it could happen that responses being 'on the wire'
from the real label manager towards the proxy were relayed to
the wrong client. This patch, which requires all msgs to include
a proto & instance pair, looks up the zserv client that a
message (response) is to be relayed to.
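
The lookup step can be sketched as below. The lm_client struct and
find_client() are hypothetical stand-ins for the zserv client list and
its lookup; only the idea of keying on the (proto, instance) pair comes
from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for a client connected to the proxy. */
struct lm_client {
	uint8_t proto;
	uint16_t instance;
	int sock;
};

/*
 * Find the client a response belongs to, using the proto and instance
 * carried in the message rather than "whoever is being served now".
 */
const struct lm_client *find_client(const struct lm_client *clients,
				    size_t n, uint8_t proto,
				    uint16_t instance)
{
	assert(clients != NULL || n == 0);

	for (size_t i = 0; i < n; i++)
		if (clients[i].proto == proto
		    && clients[i].instance == instance)
			return &clients[i];
	return NULL;
}
```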
Signed-off-by: Fredi Raspall <fredi@voltanet.io>
Add client proto and instance number in all msgs (requests and
responses) to/from a label manager. This is required for a
label manager acting as 'proxy' (i.e. relaying messages towards
another label manager) to correctly deliver responses to the
requesting clients.
Signed-off-by: Fredi Raspall <fredi@voltanet.io>
We are missing some handling of PBR and SHARP protocols
for netlink operations w/ the linux kernel.
Additionally add a bread crumb for new developers( or existing )
to know to fixup the rt_netlink.c when we start handling new
route types to hand to the kernel.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
In a prior refactor, label manager proxy functionality
was broken in two places:
1) in function relay_response_back(), "dst" stream was
accidentally replaced by "src".
2) in zread_relay_label_manager_request(), src was set to point
to a global struct stream *ibuf that was not used/initialized
anywhere.
Signed-off-by: Fredi Raspall <fredi@voltanet.io>
When we are debugging add a bit of extra information
so we can know what we are redistributing to our peers
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
* Rename client_connect and client_close hooks to zapi_client_connect
and zapi_client_close
* Remove some more unnecessary headers
* Fix a copy-paste error in zapi_msg.[ch] header comments
* Fix an inclusion comment in zserv.c
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
zserv.c was using hardcoded callbacks to clean up various components
when a client disconnected. Ergo zserv.c had to know about all these
unrelated components that it should not care about. We have hooks now,
let's use the proper thing instead.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
zserv.c has become something of a dumping ground for everything vaguely
related to ZAPI and really needs some love. This change splits out the
code for building and consuming ZAPI messages into a separate source
file, leaving the actual session and client lifecycle code in zserv.c.
Unfortunately since the #include situation in Zebra has not been paid
much attention I was forced to fix the headers in a lot of other source
files. This is a net improvement overall though.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
When changing from "ip import-table 10 route-map rdn" to "ip
import-table 10" without a route-map, routes would be deleted
and not reinstalled. This fix resolves that problem.
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Zebra is starting to have some run-time capabilities that would be
useful to pass up to the higher level protocols so that they
can act in an appropriate manner when needed.
Send the ecmp value zebra is being run with and whether or not
we believe mpls is enabled in the kernel.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The mpls_label2str and mpls_str2label functions should not
be zebra exclusive functions. Move them to lib/mpls.c
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Somewhere along the way the ability to install multiple
pbr-policys for the same pbr-map was lost.
Add this back. There is a limitation in that we are limited
to 64 interfaces per pbr-policy.
Ticket: CM-20429
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When I implemented this code change I was only testing against
static routes and with one nexthop. I missed the fact that
we needed to tell rib_process to actually rethink the nexthops.
Ticket: CM-20274
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Need to explicitly exit this context otherwise we risk ambiguities
between global and vrf context commands
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
When a user specifies static routes, there are a couple of states
where we will store the route and display it as part of the 'show run'
but it will not be installed until such time that the dependent state
is created. Add some breadcrumbs to the user so that they can figure
out WTF just happened.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Realized (with coverity's help) the fix had a mistake by pasting in
the wrong route entry to unset the selected flag. This fix takes
care of that mistake.
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
With the recent change to just pass the prefix in
for the RTM_DELROUTE, for blackhole routes we
had stopped modifying the req.rtm_type to
be the appropriate type for blackhole routes.
Since we are just deleting on the route, and
zebra is never going to really install the same
route multiple times then we do not need
to specify the req.r.rtm_type for the deletion
command.
Ticket: CM-20616
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When I implemented the same functionality in add_ipv6 that
add_ipv4 has I just assumed that broad would not be NULL with
the ZEBRA_IFA_PEER flag set.
Modify the code to act similar to the flow of control
in add_ipv4.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The problem was that, in certain route replace circumstances,
we would mark the old route_entry as removed to delete it but
would leave the selected flag set. When the rn was pulled off the
work queue for process, we would find both the new re and old re
(being deleted) with the selected flag set and would assert.
In this change, when we decide to delete the old re, we also mark
it as no longer selected.
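
In flag terms the change amounts to something like the sketch below.
The struct, the EX_ flag values and the helper names are hypothetical;
they only illustrate clearing the selected flag at the same moment the
entry is marked removed, so that at most one entry stays selected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values and route entry layout. */
#define EX_STATUS_REMOVED 0x1u
#define EX_FLAG_SELECTED 0x1u

struct route_entry_ex {
	uint32_t status;
	uint32_t flags;
};

/* Mark the old entry for deletion and drop its selected flag. */
void mark_for_delete(struct route_entry_ex *re)
{
	assert(re != NULL);
	re->status |= EX_STATUS_REMOVED;
	re->flags &= ~EX_FLAG_SELECTED;
}

/* Count entries that still carry the selected flag. */
size_t count_selected(const struct route_entry_ex *res, size_t n)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++)
		if (res[i].flags & EX_FLAG_SELECTED)
			count++;
	return count;
}
```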
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
This renaming of the structure makes it easier to identify which
structure is looked up, since policy routing will not only rely on
iprule, but also on some other structures.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
In order to avoid duplicate functions, the zebra_pbr_rule structure
used by zebra to decode the zapi message, and send netlink messages, is
slightly modified. The structure is derived from pbr_rule, but it also
includes sock identifier that is used to send back information to the
daemon that did the request. Also, the ifp pointer is stored in that
structure.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Add an intermediate helper structure that is used to walk the list of
ipset entries, and look for associated name.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Those messages permit a remote daemon to configure an iptable entry. A
structure is defined that maps to an iptable entry. More specifically,
this structure proposes to associate fwmark, and a table ID.
In addition to the configuration, the initialisation of the iptables
hash list is done in the zebra netnamespace. Also, a hook for notifying
the sender that the iptables has been correctly set is added.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
A 32-bit value is added to the PBR rule. It can be used to record a
rule in the kernel by means of fwmark information.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Once ipset entries are injected in the kernel, the relevant daemon is
informed with a zebra message sent back.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
ZEBRA IPSET defines are added for creating/deleting ipset contexts
and also for creating ipset hash sets.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
IPset and IPset entry structures are introduced. Those entries reflect
the ipset structures and ipset hash sets that will be created in the
kernel.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Zebra did not have a handler for tunnels in v6 for
some reason. Add code to handle the broadcast address
for both addition and deletion.
This appears to fix the crash. There might still need
to be some work to make the code `work` properly for
this type of tunnel.
Fixes: #2063
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This list "table" is created when the netns backend for VRF is
used. It contains the mapping between the NSID value read from
'ip netns list' and the external ns id used to create the VRF
value from the vrf context. This mapping is
necessary in order to reserve the default value 0 for vrf_default.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
EVPN owns the remote neigh entries which are programmed in the kernel.
These entries should not age out, and the only way to delete them
should be from EVPN. We should program these entries with NUD_NOARP
instead of NUD_REACHABLE to avoid aging of these macs.
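A small sketch of the state selection, assuming a hypothetical helper
name. The two constants carry the values from <linux/neighbour.h> and
are redefined here only so that the sketch stands alone:

```c
#include <stdbool.h>
#include <stdint.h>

/* Same values as NUD_REACHABLE and NUD_NOARP in <linux/neighbour.h>. */
#define SKETCH_NUD_REACHABLE 0x02
#define SKETCH_NUD_NOARP 0x40

/*
 * Hypothetical helper: pick the neighbor state to program.
 * Entries owned by EVPN use NOARP so the kernel does not age them out.
 */
static uint16_t neigh_install_state(bool evpn_remote)
{
	return evpn_remote ? SKETCH_NUD_NOARP : SKETCH_NUD_REACHABLE;
}
```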
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
There can be a race condition between the kernel and FRR as follows.
FRR sends a remote neigh notification.
At (almost) the same time the kernel might send a notification saying
the neigh is local.
After processing these notifications, the state in FRR is local while
the state in the kernel is remote. This causes the kernel and FRR to be
out of sync.
This problem is avoided if FRR acts on the kernel notifications for
remote neighbors. When FRR sees a remote neighbor notification for a
neighbor which it thinks is local, FRR will change the neigh state to
remote.
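The handling can be sketched as follows. The type and function names
are hypothetical and the real neighbor structure in zebra carries much
more state:

```c
#include <stdbool.h>

/* Hypothetical, simplified neighbor state. */
enum neigh_state_sketch { NEIGH_LOCAL, NEIGH_REMOTE };

struct neigh_sketch {
	enum neigh_state_sketch state;
};

/*
 * Handle a kernel notification that reports the neighbor as remote.
 * If we believed the neighbor was local, follow the kernel and switch
 * to remote so both sides agree. Returns true if the state changed.
 */
static bool neigh_handle_kernel_remote(struct neigh_sketch *n)
{
	if (n->state == NEIGH_LOCAL) {
		n->state = NEIGH_REMOTE;
		return true;
	}
	return false;
}
```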
Ticket: CM-19923/CM-18830
Review: CCR-7222
Testing: Manual
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
[zebra/zebra_vxlan.c:5779] -> [zebra/zebra_vxlan.c:5778]:
(warning) Either the condition 'if(svi_if_zif&&svi_if_link)'
is redundant or there is possible null pointer dereference: svi_if_zif.
Signed-off-by: Ilya Shipitsin <chipitsine@gmail.com>
Background:
v6 does not have route replace semantics. If you want to add a nexthop
to an existing route, you just send RTM_NEWROUTE and the new nexthop.
If you want to delete a nexthop you should just send RTM_DELROUTE
with the removed nexthop.
This leads to situations where, if zebra is processing a route
and has lost track of intermediate nexthops (yes this sucks),
then v6 routes will get out of sync when we try to implement
route replace semantics.
So notice when we are doing a route delete and the route is
not being updated, and just send the prefix and tell it to delete.
Ticket: CM-20391
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This commit does 2 things:
1) When receiving a route from the kernel, display the incoming
table as part of the debug, to facilitate knowing what we are
talking about as part of the debug.
2) When displaying nexthop information for routes we were sending
to the kernel, there is no need to display the route information
every time. Display the route, then the individual nexthops for what
we are doing.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Notice when a neighbor entry we've put in for rfc-5549 gets
deleted by some evil evil person. When this happens,
push it back in immediately.
Ticket: CM-18612
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The code to reinstall self originated routes was not behaving
correctly. For some reason we were looking for self originated
routes from the kernel to be of type KERNEL. This was probably
missed when we started installing the route types. We should
depend on the self originated flag that we determine from
the callback from the kernel.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When the last match criteria was removed (dst-ip or src-ip), we were
not deleting the rule correctly for ipv6. This fix retains the
needed src-ip/dst-ip during the pbr_send_pbr_map process so the
appropriate information is available for the rule delete.
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
When we have a PBR installed as a table, we need to notice
when a nexthop changes and rethink the routes for the pbr
tables.
Add code to nexthop tracking to notice the pbr watched
nexthop has changed in some manner. If it is a pbr route
that depends on the nexthop then just enqueue it for
rethinking.
This is a bit of a hammer. We know that only pbr routes
are going to be installed in weird non-standard tables,
so we only need to handle nexthop changes for the nexthops
we care about that are actually changing, and only requeue
the route nodes for which we have route entries from PBR.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Holdem statics display the dest (and mask, if present) string that the
user entered instead of converting to CIDR notation and applying the
mask. They need to do the latter.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Adds support for V4 GoAway flag as described in
https://www.ietf.org/id/draft-bz-v4goawayflag-00.txt
This option allows advertising neighbors to indicate to recipients that
they should disable IPv4 on the link.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Add some additional debug information to the netlink debug
messages so we can see the table we are installing to as
well as the nexthop's vrf.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The header length needs to be subtracted from the handling
side of the zapi in zebra. This is because we refigure the
header data structure. The receive side doesn't care
about the total header length so no need to subtract there.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
In BGP, policy routing requires the use of table identifiers.
The Flowspec protocol will need that. One API from bgp to zebra has
been added to get the table chunk.
Internally, once flowspec is enabled, the BGP engine will try to
connect smoothly to the table manager. If zebra is not connected, it
will try to connect again 10 seconds later. If zebra is connected and
the request succeeds, then a polling mechanism running every 60 seconds
is put in place. None of this internal mechanism has any impact on the
BGP process.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This commit is connecting the table manager with remote daemons by
handling the queries.
As the function is similar in many respects to the label allocator, a
function has been renamed.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The range is given by the table manager in the zebra daemon.
There are 2 ranges available for table identifiers:
- [1;252] and [256;0xffffffff]
If the requested size fits in the first range, then the start and end
of the table identifier range are given within the first range.
Otherwise, an appropriate range is given within the second range.
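A minimal sketch of that selection, with hypothetical names. It
deliberately ignores chunks that have already been handed out, which
the real table manager has to track:

```c
#include <stdbool.h>
#include <stdint.h>

#define RANGE1_START 1u
#define RANGE1_END 252u
#define RANGE2_START 256u
#define RANGE2_END 0xffffffffu

/* Hypothetical result type: inclusive start and end table ids. */
struct table_chunk_sketch {
	uint32_t start;
	uint32_t end;
};

/*
 * Pick a chunk of 'size' table ids. Use the first range when the size
 * fits in it, otherwise fall back to the second range.
 * Returns false when no chunk of that size can be given.
 */
static bool table_chunk_pick(uint32_t size, struct table_chunk_sketch *out)
{
	if (size == 0)
		return false;
	if (size <= RANGE1_END - RANGE1_START + 1) {
		out->start = RANGE1_START;
		out->end = RANGE1_START + size - 1;
		return true;
	}
	if (size - 1 > RANGE2_END - RANGE2_START)
		return false;
	out->start = RANGE2_START;
	out->end = RANGE2_START + (size - 1);
	return true;
}
```

A request for 10 ids is served from the first range, while a request
for 253 ids no longer fits there and is served from 256 upwards.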
Note that for now, table identifiers already used by VRFs are not
taken into account, meaning that there may be overlap. There are two
cases to handle:
- a vrf lite is allocated after zebra and the various other daemons
have started.
- a vrf lite is initialised and the daemons then start.
The second case is easy to handle. For the former case, I am not so
sure.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Prevent zebra from crashing when the nexthop vrf has
changed in some manner and the lookup fails.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
There are many callpaths to get to static_install_route. The nexthops
each have their own vrf that may or may not be up yet. If it is up,
allow the installation.
Do this check here to avoid having to add it all over the place.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When an interface is moved from one vrf to another, we get a callback
to move the static routes. Extend the work to look at all static
routes across all vrf's since we allow static route leaking now.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When a user enables and disables a vrf, we were not
properly cleaning up the static routes, leaving us
in a state where we would crash by looking at anything
in zebra.
On disable of a vrf -> Search through all static routes
and if the nexthop vrf is the disabled vrf uninstall it.
Additionally uninstall all static routes in that zvrf.
On enable of a vrf -> Search through all static routes
and if the nexthop vrf is the enabled vrf install it.
Additionally install all the static routes in that zvrf.
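The walk can be sketched like this. All names are hypothetical, and the
sketch additionally assumes a route is only installed while both its
own vrf and its nexthop vrf are up:

```c
#include <stdbool.h>

#define MAX_VRF_SKETCH 8

/* Hypothetical, simplified static route record. */
struct static_sketch {
	int route_vrf; /* vrf the route lives in */
	int nh_vrf;    /* vrf of the nexthop */
	bool installed;
};

/*
 * React to a vrf being enabled or disabled: revisit every static route
 * that lives in that vrf or whose nexthop points into it.
 */
static void statics_on_vrf_change(struct static_sketch *routes, int n,
				  bool vrf_up[MAX_VRF_SKETCH], int vrf,
				  bool enabled)
{
	vrf_up[vrf] = enabled;
	for (int i = 0; i < n; i++) {
		struct static_sketch *s = &routes[i];

		if (s->route_vrf != vrf && s->nh_vrf != vrf)
			continue;
		s->installed = vrf_up[s->route_vrf] && vrf_up[s->nh_vrf];
	}
}
```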
Ticket: CM-19768
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
There were a few cases where we were not properly de-registering
the static nexthops passed to us. This matters when
the static route is being removed, for whatever reason, so
that we do not leave slag behind for the nexthop tracking.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The following types are nonstandard:
- u_char
- u_short
- u_int
- u_long
- u_int8_t
- u_int16_t
- u_int32_t
Replace them with the C99 standard types:
- uint8_t
- unsigned short
- unsigned int
- unsigned long
- uint8_t
- uint16_t
- uint32_t
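As an illustration of the substitution, here is a hypothetical
structure (not taken from the tree) after conversion, with the old type
of each member noted in a comment:

```c
#include <stdint.h>

/*
 * Hypothetical structure, shown only to illustrate the substitution.
 * The comment after each member gives the nonstandard type it replaces.
 */
struct msg_sketch {
	uint8_t flags;         /* was: u_char */
	unsigned short family; /* was: u_short */
	unsigned int ifindex;  /* was: u_int */
	unsigned long count;   /* was: u_long */
	uint8_t prefixlen;     /* was: u_int8_t */
	uint16_t port;         /* was: u_int16_t */
	uint32_t metric;       /* was: u_int32_t */
};
```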
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
When moving interfaces to another place, like another netns, the
remaining interface is still present, with inactive status.
Now, that interface is deleted from the list if the interface appears
in another netns. If not, the interface is kept.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The table id of the vrf is being given to us as part
of the vrf creation netlink callback. Unfortunately it
was being set in the zvrf *after* the vrf_enable callback.
This did not matter until we started having config data
stored on the side that we needed to act on when the vrf
came up enough to start working.
So when we were storing static routes and installing them,
they were being pushed into the default table for non-default
vrf's.
Ticket: CM-19141
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Upon an 'ip netns del' event, the vrf associated with the netns backend
is looked up, then the internal contexts are first disabled, then
deleted.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The vrf netns usage causes a crash when deleting a vrf, because the
hash list of rules is not initialised for non-default VRFs.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Because a vrf with netns backend may be used, the correct zns must be
found prior to any modifications.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When we are removing a rule from the zns->rules_hash, free up
the rule from the hash and free the memory.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>