Ensure that when EVPN routes are installed into zebra, the router MAC
is passed per next hop and appropriately handled. This is required for
proper multipath operation.
Ticket: CM-18999
Reviewed By:
Testing Done: Verified failed scenario, other manual tests
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
When popping a stream from a stream_fifo, the stream->next pointer is
not NULL'd out. If this same stream is subsequently pushed onto a
stream_fifo (either the same one or a different one), because
stream_fifo's use tail insertion the ->next pointer is not updated and
thus will point to whatever the next stream in the first stream_fifo
was. stream_fifo_free does not check the count of the stream_fifo when
freeing its constituent elements, and instead walks the linked list.
Consequently it will continue walking into the first stream_fifo from
which the last stream was popped, freeing each stream contained there.
This leads to use-after-free errors.
This patch makes sure to set the ->next pointer to NULL when doing tail
insertion in stream_fifo_push and when popping a stream from a
stream_fifo.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
The addr value will never be null because of the way we do the
cli, but the SA system doesn't understand this. Add an assert
to make it happy.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The grammar sandbox has had the ability to dump individual commands as
DOT graphs, but now that generalized DOT support is present it's trivial
to extend this to entire submodes. This is quite useful for visualizing
the CLI space when debugging CLI errors.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Add general-purpose DFS traversal code
* Add ability to dump any graph to DOT language
* Add tests for graph datastructure
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Thread statistics are collected and stored in a hashtable shared across
threads, but while the hashtable itself is protected by a mutex, the
records themselves were not being updated safely. Change all thread
history collection to use atomic operations.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
For the last six years this source file has been using a type defined in
a header it did not include.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Add general-purpose DFS traversal code
* Add ability to dump any graph to DOT language
* Add tests for graph datastructure
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Zebra is starting to have some run-time capabilites that would be
useful to pass up to the higher level protocols so that they
can act in an appropriate manner when needed.
Send the ecmp value zebra is being run with and whether or not
we believe mpls is enabled in the kernel or not.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The mpls_label2str and mpls_str2label functions should not
be zebra exclusive functions. Move them to lib/mpls.c
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Properly notice when we get if up/down and vrf enable/disable
events and attempt to properly install nexthops as they
come in.
Ticket: CM20489
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Store Nexthop's as the incoming raw data. This will allow
us to separate the act of inputting the cli from the
act of instantiating the cli.
Ticket: CM-20489
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The delete was not properly deleting the nexthop from
the nexthop group and it was not properly setting the
nexthop's pointers to NULL.
Ticket: CM-20261
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Prevent the creation of a v6 LL nexthop that does not include an interface
for proper resolution.
Ticket: CM-20276
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The pbr_rule structure is derived from zebra_pbr_rule, and is
defined, so that a zclient will be able to encode the zebra_pbr_rule to
send ADD_RULE or DEL_RULE command. Also, the same structure can be used
by other daemons to derive a structure ( this will be the case for
zebra_pbr_rule).
Adding to this, an encoding function is defined, and will be used by
remote daemon to encode that message.
Those definitions are moved in new file pbr.h file.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Those messages permit a remote daemon to configure an iptable entry. A
structure is defined that maps to an iptable entry. More specifically,
this structure proposes to associate fwmark, and a table ID.
Adding to the configuration, the initialisation of iptables hash list is
done into zebra netnamespace. Also a hook for notifying the sender that
the iptables has been correctly set is done.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Once ipset entries are injected in the kernel, the relevant daemon is
informed with a zebra message sent back.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
ZEBRA IPSET defines are added for creating/deleting ipset contexts.
Ans also create ipset hash sets.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
IPset and IPset entries structures are introduced. Those entries reflect
the ipset structures and ipset hash sets that will be created on the
kernel.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
These asserts verify that the status correlates with the expected result
and fixes a clang-analyze warning.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
The PBR and PIM daemons, needed the ability to connect
to zebra. Unfortunately this connection also implied
an ability to redistribute to other valid protocols.
Add a additional hook to the route_types.pl script
to allow us to specify if the client type should
be redistributed at all.
Additionally cleanup the PIM code to not show up
as a protocol under the header for a 'show ip route'
command
Ticket: CM-20568
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This list "table" is created in the case the netns backend for VRF is
used. This contains the mapping between the NSID value read from the
'ip netns list' and the ns id external used to create the VRF
value from vrf context. This mapping is
necessary in order to reserve default 0 value for vrf_default.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Because at startup, remote daemons attempt to create default VRF,
the VRF_ID may be set to unknown. In that case, an event will be
triggered later by zebra to inform remote daemon that the vrf id of that
VRF has changed to valid value. In that case, two instances of default
VRF must not be created. By looking first at vrf name, this avoids
having two instances.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
MPLS label pool backed by allocations from the zebra label manager.
A caller requests a label (e.g., in support of an "auto" label
specification in the CLI) via lp_get(), supplying a unique ID and
a callback function. The callback function is invoked at a later
time with the unique ID and a label value to inform the requestor
of the assigned label.
Requestors may release their labels back to the pool via lp_release().
The label pool is stocked with labels allocated by the zebra label
manager. The interaction with zebra is asynchronous so that bgpd
is not blocked while awaiting a label allocation from zebra.
The label pool implementation allows for bgpd operation before (or
without) zebra, and gracefully handles loss and reconnection of
zebra. Of course, before initial connection with zebra, no labels
are assigned to requestors. If the zebra connection is lost and
regained, callbacks to requestors will invalidate old assignments
and then assign new labels.
Signed-off-by: G. Paul Ziemba <paulz@labn.net>
This commit adds code to notify the compiler that we
will not be changing the arguments to nexthop2str
and we expect thre return to be treated the same.
Additionally we add some code to allow nexthops to
be hashed to be used in a hash.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
This is an implementation of PBR for FRR.
This implemenation uses a combination of rules and
tables to determine how packets will flow.
PBR introduces a new concept of 'nexthop-groups' to
specify a group of nexthops that will be used for
ecmp. Nexthop-groups are specified on the cli via:
nexthop-group DONNA
nexthop 192.168.208.1
nexthop 192.168.209.1
nexthop 192.168.210.1
!
PBR sees the nexthop-group and installs these as a default
route with these nexthops starting at table 10000
robot# show pbr nexthop-groups
Nexthop-Group: DONNA Table: 10001 Valid: 1 Installed: 1
Valid: 1 nexthop 192.168.209.1
Valid: 1 nexthop 192.168.210.1
Valid: 1 nexthop 192.168.208.1
I have also introduced the ability to specify a table
in a 'show ip route table XXX' to see the specified tables.
robot# show ip route table 10001
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, P - PIM, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR,
> - selected route, * - FIB route
F>* 0.0.0.0/0 [0/0] via 192.168.208.1, enp0s8, 00:14:25
* via 192.168.209.1, enp0s9, 00:14:25
* via 192.168.210.1, enp0s10, 00:14:25
PBR tracks PBR-MAPS via the pbr-map command:
!
pbr-map EVA seq 10
match src-ip 4.3.4.0/24
set nexthop-group DONNA
!
pbr-map EVA seq 20
match dst-ip 4.3.5.0/24
set nexthop-group DONNA
!
pbr-maps can have 'match src-ip <prefix>' and 'match dst-ip <prefix>'
to affect decisions about incoming packets. Additionally if you
only have one nexthop to use for a pbr-map you do not need
to setup a nexthop-group and can specify 'set nexthop XXXX'.
To apply the pbr-map to an incoming interface you do this:
interface enp0s10
pbr-policy EVA
!
When a pbr-map is applied to interfaces it can be installed
into the kernel as a rule:
[sharpd@robot frr1]$ ip rule show
0: from all lookup local
309: from 4.3.4.0/24 iif enp0s10 lookup 10001
319: from all to 4.3.5.0/24 iif enp0s10 lookup 10001
1000: from all lookup [l3mdev-table]
32766: from all lookup main
32767: from all lookup default
[sharpd@robot frr1]$ ip route show table 10001
default proto pbr metric 20
nexthop via 192.168.208.1 dev enp0s8 weight 1
nexthop via 192.168.209.1 dev enp0s9 weight 1
nexthop via 192.168.210.1 dev enp0s10 weight 1
The linux kernel now will use the rules and tables to properly
apply these policies.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Routes that have labels must be sent via a nexthop that also has labels.
This change notes whether any path in a nexthop update from zebra contains
labels. If so, then the nexthop is valid for routes that have labels.
If a nexthop update has no labeled paths, then any labeled routes
referencing the nexthop are marked not valid.
Add a route flag BGP_INFO_ANNC_NH_SELF that means "advertise myself
as nexthop when announcing" so that we can track our notion of the
nexthop without revealing it to peers.
Signed-off-by: G. Paul Ziemba <paulz@labn.net>
Do not complain about failure to create a namespace if we
do not have any such thing going on.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Static route commands are now installed inside the VRF nodes. This has
quietly broken top-level static routes in certain scenarios due to
walkup logic resolving a static route configuration command inside
VRF_NODE first if the command is issued while in a CLI node lower than
VRF_NODE. To fix this VRF_NODE needs a special exit command, as has been
done for many other nodes with the same issue, to explicitly change the
vrf context to the default VRF so that when walkup resolves against the
VRF node it will configure against the default VRF as desired.
Of course this is a hack on top of a hack and the CLI walkup
implementation needs to be rewritten.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
This work is derived from a work done by China-Telecom.
That initial work can be found in [0].
As the gap between frr and quagga is important, a reworks has been
done in the meantime.
The initial work consists of bringing the following:
- Bringing the client side of flowspec.
- the enhancement of address-family ipv4/ipv6 flowspec
- partial data path handling at reception has been prepared
- the support for ipv4 flowspec or ipv6 flowspec in BGP open messages,
and the internals of BGP has been done.
- the memory contexts necessary for flowspec has been provisioned
In addition to this work, the following has been done:
- the complement of adaptation for FS safi in bgp code
- the code checkstyle has been reworked so as to match frr checkstyle
- the processing of IPv6 FS NLRI is prevented
- the processing of FS NLRI is stopped ( temporary)
[0] https://github.com/chinatelecom-sdn-group/quagga_flowspec/
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Signed-off-by: jaydom <chinatelecom-sdn-group@github.com>
prefix structure is used to handle flowspec prefixes. A new AFI is
introduced: AF_FLOWSPEC. A sub structure named flowspec_prefix is
used in prefix to host the flowspec entry.
Reason to introduce that new kind is that prefixlen from prefix
structure is too short to all the flowspec needs, since NLRI can go over
0xff bytes.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
In BGP, doing policy-routing requires to use table identifiers.
Flowspec protocol will need to have that. 1 API from bgp zebra has been
done to get the table chunk.
Internally, onec flowspec is enabled, the BGP engine will try to
connect smoothly to the table manager. If zebra is not connected, it
will try to connect 10 seconds later. If zebra is connected, and it is
success, then a polling mechanism each 60 seconds is put in place. All
the internal mechanism has no impact on the BGP process.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The library changes add 3 new messages to exchange between daemons and
ZEBRA.
- ZEBRA_TABLE_MANAGER_CONNECT,
- ZEBRA_GET_TABLE_CHUNK,
- ZEBRA_RELEASE_TABLE_CHUNK,
the need is that routing tables identifier are shared by various
services. For the current case, policy routing enhancements are planned
to be used in FRR. Poliy routing relies on routing tables identifiers
from kernels. It will be mainly used by the future policy based routing
daemon, but not only. In the flowspec case, the BGP will need also to
inject policy routing information into specific routing tables.
For that, the proposal is made to let zebra give the appropriate range
that is needed for all daemons.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The following types are nonstandard:
- u_char
- u_short
- u_int
- u_long
- u_int8_t
- u_int16_t
- u_int32_t
Replace them with the C99 standard types:
- uint8_t
- unsigned short
- unsigned int
- unsigned long
- uint8_t
- uint16_t
- uint32_t
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Upon a 'ip netns del' event, the associated vrf with netns backend is
looked for, then the internal contexts are first disabled, then
suppressed.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
PR #1739 added code to leak routes between (default VRF) VPN safi and unicast RIBs in any VRF. That set of changes included temporary CLI including vpn-policy blocks to specify RD/RT/label/&c. After considerable discussion, we arrived at a consensus CLI shown below.
The code of this PR implements the vpn-specific parts of this syntax:
router bgp <as> [vrf <FOO>]
address-family <afi> unicast
rd (vpn|evpn) export (AS:NN | IP:nn)
label (vpn|evpn) export (0..1048575)
rt (vpn|evpn) (import|export|both) RTLIST...
nexthop vpn (import|export) (A.B.C.D | X:X::X:X)
route-map (vpn|evpn|vrf NAME) (import|export) MAP
[no] import|export [vpn|evpn|evpn8]
[no] import|export vrf NAME
User documentation of the vpn-specific parts of the above syntax is in PR #1937
Signed-off-by: G. Paul Ziemba <paulz@labn.net>
When we are signaling to a client from zebra that a nexthop
has changed, include the labels on the nexthop as well.
Upper level protocols need to know if the labels exist
in order to make intelligent decisions about what to do.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When I use these functions and am programming on linux I
always have to pull up a man page for these two functions
since they exist in *BSD land only.
Modify the name of the size variable to destsize on
pass in to give me the small hint I need to know
what to do.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add code to allow nexthops to be written by people who are
interested in writing their own nexthop line.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Expose to the world the nhgc_find command so that
interested parties can find a stored nexthop group.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add a nexthop-group cli:
nexthop-group NAME
nexthop A
nexthop B
nexthop C
!
This will allow interested parties to hook into the cli for
nexthops. Users can add callback functions for add/delete
of a nexthop group as well as add/delete of each individual
nexthop.
Future work( PBR and static routes ) will take advantage
of this.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Vty commands that link netns context to a vrf is requiring some
privileges. The change consists in retrieving the privileges at the
vrf_cmd_init() called by the relevant daemon. Then use it.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Nobody uses it, but it's got the same definition. Move the parser
function into zclient.c and use it.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Add DEBUG*() macros
This set of macros allows you to write printf-like debugging lines that
automatically check whether a debug is on before printing. This should
eliminate the need for explicit checks in simple cases. For example:
if (SUCH_AND_SUCH_DEBUG_IS_ON) {
zlog_warn(...);
}
Becomes:
DEBUG(warn, such_and_such, ...);
Or, equivalently,
DEBUGE(such_and_such, ...);
The levels passed to DEBUG are expanded into the names of zlog_*
functions, so the same zlog levels are available. There's also a set of
macros that have the level built into them; DEBUGE for errors, DEBUGW
for warnings, etc. Good for brevity.
* Add singular setting macros
Change the 'SET' macros to accept a boolean indicating whether the
provided bits should be set or unset, and map on/off macros to them.
Helps condense code where you already have a boolean condition that
tells you what you want to do as you can avoid writing the branch.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Also modify `struct route_entry` to use nexthop_groups.
Move ALL_NEXTHOPS loop to nexthop_group.h
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Allow the calling daemon to pass down what table-id we
want to use to install the route. Useful for PBR.
The vrf id passed must be the VRF_DEFAULT else this
value is ignored.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The work_queue_free function free'd up the wq pointer but
did not set it too NULL. This of course causes situations
where we may use the work_queue after it is freed. Let's
modify the work_queue to set the pointer for you.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
If a interested party removes one of it's routes let
it know that it has happened as asked for.
Add a ZAPI_ROUTE_REMOVED to the send of the route_notify_owner
Add a ZAPI_ROUTE_REMOVE_FAIL to the send of the route_notify_owner
Add code in sharpd to notice this and to allow it to keep
track of routes removed for that invocation and give timing
results.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The buffer size is currently 4k. Increase x4 times to allow for bigger
messages to be sent over the zapi.
The current size sufficient for most cases, but there are a couple
of cases with installing data to the kernel ip rules where we will
quickly hit this 4k size limit. I forsee flowspec getting close
to this limit as well.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The NS_DEFAULT value returns UNKNOWN in the case the vrf lite backend is
used, whereas this is wrong. This commit fixes the default value.
Also, it fixes the default value in the case NETNS support from system
is not ok, or some error can occur when reading default NS at startup.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The current strategy for fine-grained debugging across FRR is to use
static long int bitfields, in combination with helper macros that are
copy-pasted between daemons, to hold state on what debugging information
should be collected at any given time. This has a couple of problems:
* These bitfields are generally extern'd and accessed everywhere, so
they are not MT-safe or easy to make MT-safe
* Lots of code duplication from copy-pasting the DEBUG_* macros...
* Code duplication because of the "term" vs "conf" debugging concept
This patch aims to remedy that by providing some infrastructure to work
with debugs. The core concept of using bitfields has been retained, but
the number of these for each debug has been reduced to 1. This allows
easy use of lock-free methods for synchronizing access to debugging
info.
The helper macros have also been retained but they are now collected in
one place and perform exclusively atomic operations.
Finally there is a bit of code that allows daemons to register
callbacks, which I used to implement a command that will toggle all
debugging for any daemons that use these facilities.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Add the originating routes type and instance to the nexthop
update message. This is necessary because there exist
scenarios where BGP needs to make a decision about the
originating route type and instance to know if it is
going to be doing a route replace to a route that would
resolve to itself.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The addition of some rmac code snuck in the usage of a
stream_get instead of a STREAM_GET()
We need to be using STREAM_GET()
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Implement support for EVPN symmetric routing for IPv6 routes. The next hop
for EVPN routes is the IP address of the remote VTEP which is only an IPv4
address. This means that for IPv6 symmetric routing, there will be IPv6
destinations with IPv4 next hops. To make this work, the IPv4 next hops are
converted into IPv4-mapped IPv6 addresses.
As part of support, ensure that "L3" route-targets are not announced with
IPv6 link-local addresses so that they won't be installed in the routing
table.
Signed-off-by: Vivek Venkatraman vivek@cumulusnetworks.com
Reviewed-by: Mitesh Kanjariya mitesh@cumulusnetworks.com
Reviewed-by: Donald Sharp sharpd@cumulusnetworks.com
Because socket creation is tightly linked with socket binding for vrf
lite, the proposal is made to extend socket creation APIs and to create
a new API called vrf_bind that applies to vrf lite. The passed interface
name is the interface that will be bound to the socket passed.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
That API can be used to wrap the ioctl call with various vrf instances.
This permits transparently doing the ioctl() call without taking into
consideration the vrf backend kind.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This split is introducing logicalrouter.[ch] as the file that contains
the vty commands to configure logical router feature. The split has as
consequence that the backend of logical router is linux_netns.c formerly
called ns.c. The same relationship exists between VRF and its backend
which may be linux_netns.c file.
The split is adapting ns and vrf fiels so as to :
- clarify header
- ensure that the daemon persepctive, the feature VRF or logical router
is called instead of calling directly ns.
- this implies that VRF will call NS apis, as logical router does.
Also, like it is done for default NS and default VRF, the associated VRF
is enabled first, before NETNS is enabled, so that zvrf->zns pointer is
valid when NETNS discovery applies.
Also, other_netns.c file is a stub handler that will be used for non
linux systems. As NETNS feature is only used by Linux, some BSD systems
may want to use the same backend API to benefit from NETNS. This is what
that file has been done.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The vrf_sockunion_socket() wraps sockunion_socket() with vrf_id as
additional parameter. The creation of socket forces the user to
transparently move to new NETNS for doing the operation.
The vrf_getaddr_info() wraps getaddr_info() with vrf_id as additional
parameter. That API relies on the underlying system. Then there may be
need to switch to an other netns in that case too.
Also, the vrf_socket() implementation is simplified.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
For supporting vrf based on namespaces, it is possible that an interface
with the same index is present. This is the case for loopback
interfaces. For that, for each query, if the interface is not found
, matching the vrf identifier, then a new interface is created, when the
backens for VRF is NETNS.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
when the netns backend is selected for VRF, the default VRF is being
assigned a NSID. This avoids the need to handle the case where if the
incoming NSID was 0 for a non default VRF, then a specific handling had
to be done to keep 0 value for default VRF.
In most cases, as the first NETNS to get a NSID will be the default VRF,
most probably the default VRF will be assigned to 0, while the other
ones will have their value incremented. On some cases, where the NSID is
already assigned for NETNS, including default VRF, then the default VRF
value will be the one derived from the NSID of default VRF, thus keeping
consistency between VRF IDs and NETNS IDs.
Default NS is attempted to be created. Actually, some VMs may have the
netns feature, but the NS initialisation fails because that folder is
not present.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Two apis are provided so that the switch from one netns to an other one
is taken care.
Also an other API to know if the VRF has a NETNS backend or a VRF Lite
backend.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The addition of the name of the netns in the vrf message introduces also
a limitation when the size of the netns is bigger than 15 bytes. Then
the netns are ignored by the library.
In addition to this, some sanity checks have been introduced. some
functions to create the netns from a call not coming from the vty is
being added with traces.
Also, the ns vty function is reentrant, if the context is already
created.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Show vrf command displays information on the vrf, if it is related to
vrf kernel or if it is related to netns.
When a vrf from kernel is detected, before creating a new vrf, a check
is done against an already present vrf, and if that vrf is not a vrf
mapped with a netns. If that is that case, then the creation is
rejected.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The zebra netnamespace contexts are initialised, based on the callback
coming from the NS. Reversely, the list of ns is parsed to disable the
ns contexts.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
If vrf backend is netns, then the zebra will create its own
zebra_ns context for each new netns discovered. As consequence,
a routing table, and other contexts will be created for each
new namespace discovered. When it is enabled, a populate process
will be done, consisting in learning new interfaces and routes, and
addresses from other NETNS.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
In addition to have the possibility to create from vty vrf based on a
netns backend, the API will be made accessible from external, especially
for zebra that will handle the netns discovery part. This commit is
externalising following functions:
- netns_pathname
- ns_handler_create
- vrf_handler_create
Also, the VRF initialisation case when under NETNS backend is changed,
since the NS identifier may not be known at the configuration time,but
may be known later, under discovery process.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Upon following calls: interface poll, address poll, route poll, and
ICMPv6 handling, each new Namespace is being parsed. For that, the
socket operations need to switch from one NS to one other, to get the
necessary information.
As of now, there is a crash when dumping interfaces, through show
running-config.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Using the vrf backend kind, the vty command that configured netns
under vty will not be installed if the vrf backend is vrf lite
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
a vty command is added:
in addition to this command ( kept for future usage):
- [no] logical-router-id <ID> netns <NETNSNAME>
a new command is being placed under vrf subnode
- vrf <NAME>
[no] netns <NETNSNAME>
exit
This command permits to map a VRF with a Netnamespace.
The commit only handles the relationship between vrf and ns structures.
It adds 2 attributes to vrf structure:
- one defines the kind of vrf ( mapped under netns or vrf from kernel)
- the other is the opaque pointer to ns
The show running-config is handled by zebra daemon.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The netns backend is chosen by VRF if a runtime flag named vrfwnetns is
selected when running zebra.
In the case the NETNS backend is chosen, in some case the VRFID value is
being assigned the value of the NSID. Within the perimeter of that work,
this is why the vrf_lookup_by_table function is extended with a new
parameter.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The ZEBRA_FLAG_INTERNAL flag is used to signal to zebra that
the route being added, the nexthops for it can be recursively
resolved. This name keeps throwing me off when I read it
so let's rename to something that allows the developer to
understand what is going on.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
In the future we are going to have a rule_notify_owner
so make the distinction between the two types of notification
clearer.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The notification of the owner was not properly decoding
the prefix and as such we were not properly reading the
table it was installed into.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This commit is the implementation of weak multicast traceroute.
It consists of IGMP module dealing with mtrace type IGMP messages
and client program mtrace/mtracebis for initiating mtrace queries.
Signed-off-by: Mladen Sablic <mladen.sablic@gmail.com>
Add the ability to pass in an afi to zebra. zebra_vrf keeps
track of the afi/label tuple and then does the right thing
before we call down. AF_MPLS does not care about v4 or v6
it just knows label and what device to use for lookup.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Modify mpls.h to rename MPLS_LABEL_ILLEGAL to be MPLS_LABEL_NONE.
Fix all pre-existing code that used MPLS_LABEL_ILLEGAL.
Modify the zapi vrf label message to use MPLS_LABEL_NONE as the
signal to remove label associated with a vrf.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add the ability to pass the lsp owner type through the zapi
and in addition add a new label type for the sharp protocol
for testing.
Finally modify zebra_mpls.h to not have defaults specified
for the enum. That way when we add a new LSP type the
compile fails and the person doing the addition knows
where he has to touch shit.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Turns out we had 3 different ways to define labels
all of them overlapping with the same meanings.
Consolidate to 1. This one choosen is consistent
naming wise with what the *bsd and linux kernels
use.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
For L3VPN's we need to create a label associated with the specified
vrf to be installed into the kernel to allow a pop and lookup
operation.
The new api is:
zclient_send_vrf_label(struct zclient *zclient, vrf_id_t vrf_id,
mpls_label_t label);
For the specified vrf_id associate the specified label for
a pop and lookup operation for forwarding.
To setup a POP and Forward use MPLS_LABEL_IMPLICIT_NULL
If the same label is passed in we ignore the call.
If the label is different we update entry.
If the label is MPLS_LABEL_NONE we remove
the entry.
This sets up the api. Future commits will have the functionality
to actually install into the kernel.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The nh_resolve_via_default function is an accessor function
for NHT in zebra. Let's move this function to it's proper
place.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Create a zapi_nexthop_update_decode function that both
pim and bgp use to decode the message from zebra.
There probably could be further optimizations but I opted
to keep the code as similiar as is possible between the
originals because they both make some assumptions about
code flow that I do not fully understand yet.
The real goal here is that I want to create a new
user of the nexthop tracking code from a higher level
daemon and I see no need to re-implement this damn
code again for a 3rd time.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
* zebra/kernel_socket.c: include "rt.h" to provide the prototypes of
kernel_init() and kernel_terminate();
* lib/prefix.h: remove the deprecation warning whenever ETHER_ADDR_LEN
is used. isisd uses the ETHER_HDR_LEN constant which is defined in
terms of ETHER_ADDR_LEN in the *BSD system headers. So, when building
FRR on *BSD, we were getting several warnings because we were using
ETHER_ADDR_LEN indirectly;
* lib/command_lex.l, lib/defun_lex.l: ignore other harmless warnings;
* lib/spf_backoff.c: cast 'tv->tv_usec' to 'long int' before printing.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
None of these variables can actually be used before being initialized,
but unfortunately some old compilers are not smart enough to detect that.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
The PIM_NODE command is only being used to display
default vrf configuration. Move this into the
vrf display and remove PIM_NODE.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Some work on FRR's pthread wrapper.
* Provide a built-in way to synchronize thread startup
* Make utility functions take frr_pthread * instead of its integer ID
* Pass frr_pthread * as pthread start function argument
* Correct some comment styling
* Rename some variables to match naming conventions in the file
* Change parameter ordering in stop function prototype to follow the
convention in the other functions
* Default new frr_pthreads to using a vanilla event loop
For the last point, the original goal when designing the implementation
of pthreads into FRR was to be able to use the thread.c event based
system inside pthreads. This code essentially encapuslates all the
thread.c functionality into an easy to use pthread out of the box.
Creating a new frr_pthread with a null attributes field will cause the
created frr_pthread to run a thread.c event loop. The upshot of this is
that it is now possible to safely run existing functions in a pthread in
roughly 3 lines of code. It also serves as an example / starting point
for others.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Refine the notion of what FRR considers as "configured" VRF. It is no longer
based on user just typing "vrf FOO" but when something is actually configured
against that VRF. Right now, in zebra, the only configuration against a VRF
are static IP routes and EVPN L3 VNI. Whenever a configuration is removed,
check and clear the "configured" flag if there is no other configuration for
this VRF. When user attempts to configure a static route and the VRF doesn't
exist, a VRF is created; the VRF is only active when also defined in the
kernel.
Updates: 8b73ea7bd479030418ca06eef59d0648d913b620
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
Ticket: CM-10139, CM-18553
Reviewed By: CCR-7019
Testing Done:
1. Manual testing for L3 VNI and static routes - FRR restart, networking
restart etc.
2. 'vrf' smoke
<DETAILED DESCRIPTION (REPLACE)>
When shutting down, ensure that all VRFs including "configured" ones are
cleaned up properly.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ticket: CM-19069
Reviewed By: CCR-7011
Testing Done: Manual verification of failed scenario
A VRF is active only when the corresponding VRF device is present in the
kernel. However, when the kernel VRF device is removed, the VRF container in
FRR should go away only if there is no user configuration for it. Otherwise,
when the VRF device is created again so that the VRF becomes active, FRR
cannot take the correct actions. Example configuration for the VRF includes
static routes and EVPN L3 VNI.
Note that a VRF is currently considered to be "configured" as soon as the
operator has issued the "vrf <name>" command in FRR. Such a configured VRF
is not deleted upon VRF device removal, it is only made inactive. A VRF that
is "configured" can be deleted only upon operator action and only if the VRF
has been deactivated i.e., the VRF device removed from the kernel. This is
an existing restriction.
To implement this change, the VRF disable and delete actions have been modified.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Mitesh Kanjariya <mkanjariya@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
Ticket: CM-18553, CM-18918, CM-10139
Reviewed By: CCR-7022
Testing Done:
1. vrf and pim-vrf automation tests
2. Multiple VRF delete and readd (ifdown, ifup-with-depends)
3. FRR stop, start, restart
4. Networking restart
5. Configuration delete and readd
Some of the above tests run in different sequences (manually).
In EVPN symmetric routing, not all subnets are presents everywhere.
We have multiple scenarios where a host might not get learned locally.
1. GARP miss
2. SVI down/up
3. Silent host
We need a mechanism to resolve such hosts. In order to achieve this,
we will be advertising a subnet route from a box and that box will help
in resolving the ARP to such hosts.
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
1. Added default gw extended community
2. code modification to handle sticky-mac/default-gw-mac as they go together
3. show command support for newly added extended community
4. State in zebra to reflect if a mac/neigh is default gateway
5. show command enhancement to refelect the same in zebra commands
Ticket: CM-17428
Review: CCR-6580
Testing: Manual
Signed-off-by: Mitesh Kanjariya <mitesh@cumulusnetworks.com>
Abstract the code that sends the zapi message into zebra
for the turn on/off of nexthop tracking for a prefix.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The zclient->redist bitmap for vrf's was being set again
for the zclient_send_dereg_requests function. This should
be a unset on tear down.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
- Remove OSPD_SR route type
- Check that Segment Routing is enable only in default VRF
- Add comment for SRGB in lib/mpls.h
- Update documentation
Signed-off-by: Olivier Dugeon <olivier.dugeon@orange.com>
Because the VRF_ID is mapped into 32 bit, and because when NETNS will be
the backend of VRF, then the NS identifier must also be encoded as 32
bit.
Also, the NS_UNKNOWN value is changed accordingly to UINT32_MAX.
Also, the NS_UNKNOWN and NS_DEFAULT values are removed from zebra_ns.h
and kept on ns.h header file.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The number of vrf bitmap groups is increased so as to avoid consuming
too much memory. This fix is related to a fork memory that occured when
running pimd as daemon.
A check on memory consumed shows that the memory consumed goes from
33480ko to 46888ko with that change. This is less compared to if the
value of the bitmap groups is increased to 16 ( 852776ko).
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This is a preparatory work for configuring vrf/frr over netns
vrf structure is being changed to 32 bit, and the VRF will have the
possibility to have a backend made up of NETNS.
Let's put some history.
Initially the 32 bit was because one wanted to map on vrf_id both the
VRFLITE and the NSID.
Initially, one would have liked to make zebra configure at the same time
both vrf lite and vrf from netns in a flat way. From the show
running perspective, one would have had both kind of vrfs, thatone
would configure on the same way.
however, it leads to inconsistencies in concepts, because it mixes vrf
vrf with vrf, and vrf is not always mapped with netns.
For instance, logical-router could also be used with netns. In that
case, it would not be possible to map vrf with netns.
There was an other reason why 32 bit is proposed. this is because
some systems handle NSID to 32 bits. As vrf lite exists only on
Linux, there are other systems that would like to use an other vrf
backend than vrf lite. The netns backend for vrf will be used for that
too. for instance, for windows or freebsd, some similar
netns concept exists; so it will be easier to reuse netns
backend for vrf, than reusing vrflite backend for vrf.
This commit is here to extend vrf_id to 32 bits. Following commits in a
second step will help in enable a VRF backend.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This is an implementation of draft-ietf-ospf-segment-routing-extensions-24
and RFC7684 for Extended Link & Prefix Opaque LSA.
Look to doc/OSPF_SR.rst for implementation details & known limitations.
New files:
- ospfd/ospf_sr.h: Segment Routing structure definition (SubTLVs + SRDB)
- ospfd/ospf_sr.c: Main functions for Segment Routing support
- ospfd/ospf_ext.h: TLVs and SubTLVs definition for RFC7684
- ospfd/ospf_ext.c: RFC7684 Extended Link / Prefix implementation
- doc/OSPF-SRr.rst: Documentation
Modified Files:
- doc/ospfd.texi: Add new Segment Routing CLI command definition
- lib/command.h: Add new string command for Segment Routing CLI
- lib/mpls.h: Add default value for SRGB
- lib/route_types.txt: Add new OSPF Segment Routing route type
- ospfd/ospf_dump.[c,h]: Add OSPF SR debug
- ospfd/ospf_memory.[c,h]: Add new Segment Routing memory type
- ospfd/ospf_opaque.[c,h]: Add ospf_sr_init() starting function
- ospfd/ospf_ri.c: Add new functions to Set/Get Segment Routing TLVs
Add new ospf_router_info_lsa_upadte() to send Opaque LSA to ospf_sr.c()
- ospfd/ospf_ri.h: Add new Router Information SR SubTLVs
- ospfd/ospf_spf.c: Add new scheduler when running SPF to trigger
update of NHLFE
- ospfd/ospfd.h: Add new thread for Segment Routing scheduler
- ospfd/subdir.am: Add new files
- vtysh/Makefile.am: Add new ospf_sr.c file for vtysh
- zebra/kernel_netlink.c: Add new OSPF_SR route type
- zebra/rt_netlink.[c,h]: Add new OSPF_SR route type
- zebra/zebra_mpls.h: Add new OSPF_SR route type
Signed-off-by: Olivier Dugeon <olivier.dugeon@orange.com>
There are some observed instances where we end up trying to cancel a rw
job based on a file descriptor that we don't have a reference on. The
specific cancel function for rw jobs assumes it's called with a file
descriptor that is valid within pollfds and will cause a segmentation
fault by buffer overrun if this is not the case.
Instead log it and move on. Since the fd does not exist this should
patch over the buggy behavior and provide additional information to help
in finding the root cause.
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>