When specifying only an "le" for an existing ip prefix-list qualified with
both an "le" and "ge" make sure to remove the "ge" property so it does
not stay in the tree.
E.g. Saying these two things in order:
ip prefix-list test seq 1 permit 1.1.0.0/16 ge 18 le 24
ip prefix-list test seq 1 permit 1.1.0.0/16 ge 18
... should result in the second statement "overwriting" the first like
this:
vxdev-arch# do show ip prefix-list
ZEBRA: ip prefix-list foobar: 3 entries
seq 1 permit 15.0.0.0/16 ge 18
Previously this did not happen and "le" would stick around since it was
never given NB_OP_DESTROY and purged from the data tree.
Signed-off-by: Wesley Coakley <wcoakley@nvidia.com>
Show alias name instead of numerical value in `show bgp <prefix>. E.g.:
```
root@exit1-debian-9:~/frr# vtysh -c 'sh run' | grep 'bgp community alias'
bgp community alias 65001:123 community-1
bgp community alias 65001:123:1 lcommunity-1
root@exit1-debian-9:~/frr#
```
```
exit1-debian-9# sh ip bgp 172.16.16.1/32
BGP routing table entry for 172.16.16.1/32, version 21
Paths: (2 available, best #2, table default)
Advertised to non peer-group peers:
65030
192.168.0.2 from home-spine1.donatas.net(192.168.0.2) (172.16.16.1)
Origin incomplete, metric 0, valid, external, best (Neighbor IP)
Community: 65001:12 65001:13 community-1 65001:65534
Large Community: lcommunity-1 65001:123:2
Last update: Fri Apr 16 12:51:27 2021
exit1-debian-9#
```
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
The distribute_list_init command is not used and is setup
code that will never be used because it makes assumptions about
how distribute-lists work that are fundamentally incorrect.
Remove the code.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Abstract the parsing of distribute lists so that we
don't have as much cut-n-paste code.
This is a setup commit for future work. In effect
current distribute-list handling is all kinds of messed up
a) eigrp and babel both attempt to use distribute-lists, they just plain
don't work.
b) `distribute-list` is only sent to rip. `ipv6 distribute-list`
is sent to ripngd. If you use `distribute-list` under `router ripng`
it sends the command to rip but ripd is in the wrong mode and it
never works.
c) Should ripngd care about v4 and v6 specific distribute-lists?
This dichotomy was added for babel but babel has been broke
about this since day 1( see a ).
All in all we need to unwind this whole mess. Make distribute-list
commands specific to the daemons( so that we can be in the right
sub-mode ). But the parsing is going to be the same across all
daemons. So let's provide that functionality in `lib/distribute.c`
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add a simpler, more limited nexthop comparison function. This
compares a few key attributes, such as vrf, gateway, labels.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Problem Statement:
=================
In scale setup BGP sessions start flapping.
RCA:
====
In virtualized environment there are multiple places where
MTU need to be set. If there are some places were MTU is not set
properly then there is chances that BGP packets get fragmented,
in scale setup this will lead to BGP session flap.
Fix:
====
A new tcp option is provided as part of this implementation,
which can be configured per neighbor and helps to set the TCP
max segment size. User need to derive the path MTU between the BGP
neighbors and set that value as part of tcp-mss setting.
1. CLI Configuration:
[no] neighbor <A.B.C.D|X:X::X:X|WORD> tcp-mss (1-65535)
2. Running config
frr# show running-config
router bgp 100
neighbor 198.51.100.2 tcp-mss 150 => new entry
neighbor 2001:DB8::2 tcp-mss 400 => new entry
3. Show command
frr# show bgp neighbors 198.51.100.2
BGP neighbor is 198.51.100.2, remote AS 100, local AS 100, internal link
Hostname: frr
Configured tcp-mss is 150, synced tcp-mss is 138 => new display
4. Show command json output
frr# show bgp neighbors 2001:DB8::2 json
{
"2001:DB8::2":{
"remoteAs":100,
"bgpTimerKeepAliveIntervalMsecs":60000,
"bgpTcpMssConfigured":400, => new entry
"bgpTcpMssSynced":388, => new entry
Risk:
=====
Low - This is a config driven feature and it sets the max segment
size for the TCP session between BGP peers.
Tests Executed:
===============
Have done manual testing with three router topology.
1. Executed basic config and un config scenarios
2. Verified if the config is updated in running config
during config and no config operation
3. Verified the show command output in both CLI format and
JSON format.
4. Verified if TCP SYN messages carry the max segment size
in their initial packets.
5. Verified the behaviour during clear bgp session.
6. done packet capture to see if the new segment size
takes effect.
Signed-off-by: Abhinay Ramesh <rabhinay@vmware.com>
pimd was the only user of this function, and that has gone away now.
So just kill the function.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
These hoops to get warnings for mis-printing `uint64_t` are apparently
breaking some C++ bits...
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The previous method, using zassert.h and hoping nothing includes
assert.h (which, on glibc at least, just does "#undef assert" and puts
its own definition in...) was fragile - and actually broke undetected.
Just provide our own assert.h and control overriding by putting it in a
separate directory to add to the include path (or not.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
When an operator encounters a situation where the number
of FD's open is greater than what we have been configured
to legitimately handle via uname or the `--limit-fds` command
line, abort with a message that they should be able to
debug and figure out what is going on.
Fixes: #8596
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
flush netlink related dependencies with gre information.
Add some linux headers required to compile with it.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
3 new gre commands are available:
- GRE_GET to permit a daemon to retrieve gre information.
- GRE_UPDATe is the reply message from zebra to the daemon. as it is a
syncronous request, the GRE_GET expected will have to match the vrf id
where the gre information is wished. this has an impact on label
manager with change in APIs.
- SET_GRE_SOURCE. this command will be stubbed for now, assuming that
the gre interface is set accordingly by external script.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Description:
This looks broken after NB changes in routemap. When routemap
action modified from permit to deny, it is expected to apply
the new action on the filtered routes before the action in the
routemap data structure has been changed. But currently this is
not handled by the corresponding northbound API.
Signed-off-by: Rajesh Girada <rgirada@vmware.com>
Use unsigned value for all RA requests to Zebra
- encoding signed int as unsigned is bad practice
- RA interval is never, and should never be, negative
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
`config.h` has all the defines from autoconf, which may include things
that switch behavior of other included headers (e.g. _GNU_SOURCE
enabling prototypes for additional functions.)
So, the first include in any `.c` file must be either `config.h` (with
the appropriate guard) or `zebra.h` (which includes `config.h` first
thing.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Don't uninstall sessions if the address, interface, VRF or TTL didn't change.
Update the library documentation to make it clear to other developers.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Currently this flag is only helpful in an extremely rare situation when
the BFD session registration was unsuccessful and after that zebra is
restarted. Let's remove this flag to simplify the API. If we ever want
to solve the problem of unsuccessful registration/deregistration, this
can be done using internal flags, without API modification.
Also add the error log to help user understand why the BFD session is
not working.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Creating any threads before we fork() into the background (if `-d` is
given) is an extremely dangerous footgun; the threads are created in
the parent and terminated when that exits.
This is extra dangerous because while testing, you'd often run the
daemon in foreground without `-d`, and everything works as expected.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
... for any initialization that needs to run after forking, but that
would be racy if it were just scheduled on the thread_master (since the
config load is also just a thread callback, ordering would be undefined
for another scheduled thread callback.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
very_late_init doesn't really say what this does, config_post is much
more descriptive. (A config_pre is coming in a jiffy.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The (legacy) code for reading split configs tries to execute config
commands in parent nodes, but doesn't call the node_exit function when
it goes up to a parent node. This breaks BGP RPKI setup (and extended
syslog, which is in the next commit.)
Doing this correctly is a slight bit involved since the node_exit
callbacks should only be called if the command is actually executed on a
parent node.
Signed-off-by: David Lamparter <equinox@diac24.net>
If the last message in a batched logging operation isn't printed due to
priority, this skips the code that flushes prepared messages through
writev() and can trigger the assert() at the end of zlog_fd().
Since any logmsg above info priority triggers a buffer flush, running
into this situation requires a log file target configured for info
priority, at least 1 message of info priority buffered, a debug message
buffered after that, and then a buffer flush (explicit or due to buffer
full).
I haven't seen this chain of events happen in the wild, but it needs
fixing anyway.
Signed-off-by: David Lamparter <equinox@diac24.net>
`CFLAGS` is a "user variable", not intended to be controlled by
configure itself. Let's put all the "important" stuff in AC_CFLAGS and
only leave debug/optimization controls in CFLAGS.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
... by referencing all autogenerated headers relative to the root
directory. (90% of the changes here is `version.h`.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The log module buffers outgoing messages by default; add an
api to turn that off, and emit messages immediately.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
abort() raises SIGABRT, which would confusingly cause a 2nd backtrace to
be printed after the first one...
Signed-off-by: David Lamparter <equinox@diac24.net>
The "xref_p" variables are placed in the "xref_array" section
specifically so they're next to each other and we get an array at the
end. The ASAN redzone that is inserted around global variables is
breaks that since it'd be inserted before and after each of the array
items. So disable the ASAN redzone for these variables (and only these
variables, nothing else should be affected.)
Signed-off-by: David Lamparter <equinox@diac24.net>
- Handle badly formatted messages (don't set `ifp` in that case)
- Handle unread messages (e.g. when family is not don't trigger `assert`)
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
A long time ago there was a difference between number-named and string-named
access/prefix-lists. Currently we always treat the name as a string and
there is no need for a separate list for number-named lists.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Problems with the current implementation:
* Delete hook is called before the deletion of the access-list from the
master list, which means that daemons processing this hook successfully
find this access-list, store a pointer to it in their structures, and
right after that the access-list is freed. Daemons end up having stale
pointer to the freed structure.
* Route-maps are not notified of the deletion.
This commit fixes both issues.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
do not add a new route type, and consider 0 as a value meaning
that zebra should be the owner.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
zapi_nbr structure is renamed to zapi_neigh_ip.
Initially used to set a neighbor ip entry for gre interfaces, this
structure is used to get events from the zebra layer to nhrp layer.
The ndm state has been added, as it is needed on both sides.
The zebra dplane layer is slightly modified.
Also, to clarify what ZEBRA_NEIGH_ADD/DEL means, a rename is done:
it is called now ZEBRA_NEIGH_IP_ADD/DEL, and it signified that this
zapi interface permits to set link operations by associating ip
addresses to link addresses.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The first change in this commit is the processing of the VRF termination.
When we terminate the VRF, we should not delete the underlying interfaces,
because there may be pointers to them in the northbound configuration. We
should move them to the default VRF instead.
Because of the first change, the VRF interface itself is also not deleted
when deleting the VRF. It should be handled in netlink_link_change. This
is done by the second change.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Currently we have a "route-map optimization" command which is entered
from inside the route-map entry but actually applies to the whole
route-map. In addition, this command is not shown in the running-config
and not stored to the startup-config during "write".
Let's add a new command on the config node level to control this setting
and show it in the running-config to make possible to save it during
"write".
The old command is saved for the backward compatibility but hidden and
marked as deprecated.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
EVPN neighbor operations were already done in the zebra dataplane
framework. Now that NHRP is able to use zebra to perform neighbor IP
operations (by programming link IP operations), handle this operation
under dataplane framework:
- assign two new operations NEIGH_IP_INSTALL and NEIGH_IP_DELETE; this
is reserved for GRE like interfaces:
example: ip neigh add A.B.C.D lladdr E.F.G.H
- use 'struct ipaddr' to store and encode the link ip address
- reuse dplane_neigh_info, and create an union with mac address
- reuse the protocol type and use it for neighbor operations; this
permits to store the daemon originating this neighbor operation.
a new route type is created: ZEBRA_ROUTE_NEIGH.
- the netlink level functions will handle a pointer, and a type; the
type indicates the family of the pointer: AF_INET or AF_INET6 if the
link type is an ip address, mac address otherwise.
- to keep backward compatibility with old queries, as no extension was
done, an option NEIGH_NO_EXTENSION has been put in place
- also, 2 new state flags are used: NUD_PERMANENT and NUD_FAILED.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
this api is needed for nhrp. the goal is to implement it in zebra, while
other daemon will used it.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
a zebra api is extended to offer ability to add or remove neighbor
entry from daemon. Also this extension makes possible to add neigh
entry, not only between IPs and macs, but also between IPs and NBMA IPs.
This API supports configuring ipv6/ipv4 entries with ipv4/ipv6 lladdr.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
zebra implements zebra api for configuring link layer information. that
can be an arp entry (for ipv4) or ipv6 neighbor discovery entry. This
can also be an ipv4/ipv6 entry associated to an underlay ipv4 address,
as it is used in gre point to multipoint interfaces.
this api will also be used as monitoring. an hash list is instantiated
into zebra (this is the vrf bitmap). each client interested in those entries
in a specific vrf, will listen for following messages: entries added, removed,
or who-has messages.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This patch implements new zapi api to get neighbor information that zebra knows
and that other daemons may need to know. Actually, nhrp daemons is
interested in getting the neighbor information on gre interfaces, and
the API will be used for that.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
We should delete the access-list when the last entry and remark is
deleted. This is already done for prefix-lists.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
We delete the prefix-list when its last entry is deleted, but the check
is missed when we delete the description.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
CLI must never use operational data, because this won't work in
transactional mode. Rework search for prefix-list/access-list entries
using only candidate config.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
The frr_zmq shim was trying to use some internal scheduling
macros, and that was causing trouble. Just use the public
apis.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
This wasn't quite formatting IPv6+port in a useful way (no brackets),
and printing the scope ID (interface index) and unix addrs is useful
too.
Signed-off-by: David Lamparter <equinox@diac24.net>
These are for string quoting (`%pSQ`) and string escaping (`%pSE`); the
sets / escape methods are currently rather "basic" and might be extended
in the future.
Signed-off-by: David Lamparter <equinox@diac24.net>
Analogous to Linux kernel `%pV` (but our mechanism expects 2 specifier
chars and `%pVA` is clearer anyway.)
Signed-off-by: David Lamparter <equinox@diac24.net>
... to suppress the warnings when using something that isn't quite ISO C
compatible and would otherwise cause compiler warnings from `-Wformat`.
Signed-off-by: David Lamparter <equinox@diac24.net>
With 0 currently the default value for the width specifier, it's not
possible to discern that from a %*p where 0 was passed as the length
parameter. Use -1 to allow for that.
Signed-off-by: David Lamparter <equinox@diac24.net>
This commit introduces the changes to the library route-map
north-bound callback implementation in order to align it to
the modified yang definitions.
Signed-off-by: NaveenThanikachalam <nthanikachal@vmware.com>
Signed-off-by: Sarita Patra <saritap@vmware.com>
The checks were incorrectly removed in commit 4d2f546f under the
assumption that it is needed only in CLI. Actually the checks are needed
for the case when the sequence number is explicitly set by a user.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
There was an attempt to consolidate the code in commit fae60215, but the
work was not actually finished and some necessary checks were missed.
Let's finish it.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
This is to fix the crash reproduced by the following steps:
* ip link add red type vrf table 1
Creates VRF.
* vtysh -c "conf" -c "vrf red"
Creates VRF NB node and marks VRF as configured.
* ip route 1.1.1.0/24 2.2.2.2 vrf red
* no ip route 1.1.1.0/24 2.2.2.2 vrf red
(or similar l3vni set/unset in zebra)
Marks VRF as NOT configured.
* ip link del red
VRF is deleted, because it is marked as not configured, but NB node
stays.
Subsequent attempt to configure something in the VRF leads to a crash
because of the stale pointer in NB layer.
Fixes#8357.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
clang does not accept __attribute__((transparent_union)) in C++ source
code. We don't need it there anyway so let's just limit it to C.
Signed-off-by: David Lamparter <equinox@diac24.net>
This replaces `%n` with a safe, out-of-band option that simply records
the start and end offset of the output produced for each `%...`
specifier.
The old `%n` code is removed.
Signed-off-by: David Lamparter <equinox@diac24.net>
Allowing printfrr extensions to directly write to the output buffer has
a few advantages:
- there is no arbitrary length limit imposed (previously 64)
- the output doesn't need to be copied another time
- the extension can directly use bprintfrr() to put together pieces
The downside is that the theoretical length (regardless of available
buffer space) must be computed correctly.
Extended unit tests to test these paths a bit more thoroughly.
Signed-off-by: David Lamparter <equinox@diac24.net>
Incorporate into the `show thread cpu` the number of times
we have issued warnings about a particular thread being
too slow.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When generating SLOW_THREAD warnings let's differentiate between
a cpu bound process and a wall bound process. Effectively
a slow thread can now be a process in FRR doing lots of work( cpu bound )
or wall bound ( the cpu is heavy load and a FRR process may be pre-empted
and never scheduled ).
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
EVPN nexthops are installed as remote neighs by zebra. This was earlier
done only via VRF IPvX uni routes imported from EVPN routes.
With EVPN-MH these VRF routes now reference a L3NHG which is setup based
on the EAD and doesn't include the RMAC. To workaround that BGP now
consolidates and maintains EVPN nexthops which are then sent to zebra.
zebra sets up these nexthops as L3-VNI nh entries using a dummy type-1
route as reference.
Ticket: CM-31398
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Before the transition of prefix-lists to northbound, this setting
controlled whether sequence numbers were displayed in the config.
After the transition, sequence numbers are always displayed in the
configuration, and this command only controls the output of the show
commands, which is not very useful. This command is not even shown in
the config anymore.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Make the local buffer offered to printfrr extension tokens
bigger; existing size wasn't quite enough for some of the
more elaborate struct prefix types.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
New BFD protocol integration API with abstractions to fix most common
protocol integration errors:
- Set address family together with the source/destination addresses
- Set the TTL together with the single/multi hop option
- Set/unset profile/interface easily
- Keep the arguments so we don't have to rebuild them every time
- Install/uninstall functions that keep track of current state so the
daemon doesn't have to
- Reinstall when critical configuration changes (address, multi hop
etc...)
- Reconfigure BFD when the daemon restarts automatically
- Automatically calls the user defined callback for session update
- Shutdown handler for all protocols
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Add new status for Vertex, Edge and Subnet to manage their
respective states in the data base.
Add new functions:
- to register/unregister server and client
- to show content of the Database (VTY and Json output)
- to update and compare subnets
- to clean vertex and ted from ORPHAN elements
- to convert message or stream into a Link State Element and update
Link State Database accordingly to message event
Change Edge and Vertex key computation by using the host order systematically.
This impact mostly key based on IPv4 addresses where `ntohl()` function must
be used when searching a Vertex or Edge by key.
Update the documentation accordingly
Signed-off-by: Olivier Dugeon <olivier.dugeon@orange.com>
This logs the unique ID prefix from the xref that each log message call
has, and adds on/off knobs for both EC and unique ID printing.
Signed-off-by: David Lamparter <equinox@diac24.net>
This parametrized use of flog with variable EC and priority doesn't mesh
particularly well with the xref code & there isn't really much reason to
not use fixed/constant calls like this.
Signed-off-by: David Lamparter <equinox@diac24.net>
Just small utilities for when you already have a struct fbuf (i.e. when
bprintfrr() is used to construct longer text from multiple pieces.)
Signed-off-by: David Lamparter <equinox@diac24.net>
Accept macros without ; for LabN CI *only*. This is a bit hairy since
we can't generate warnings for this, so it's very limited in both scope
and duration.
Signed-off-by: David Lamparter <equinox@diac24.net>
Back when I put this together in 2015, ISO C11 was still reasonably new
and we couldn't require it just yet. Without ISO C11, there is no
"good" way (only bad hacks) to require a semicolon after a macro that
ends with a function definition. And if you added one anyway, you'd get
"spurious semicolon" warnings on some compilers...
With C11, `_Static_assert()` at the end of a macro will make it so that
the semicolon is properly required, consumed, and not warned about.
Consistently requiring semicolons after "file-level" macros matches
Linux kernel coding style and helps some editors against mis-syntax'ing
these macros.
Signed-off-by: David Lamparter <equinox@diac24.net>
The point of the `-std=gnu99` was to override a `-std=c99` that may be
coming in from net-snmp. However, we want C11, not C99.
Signed-off-by: David Lamparter <equinox@diac24.net>
like it has been done for iptable contexts, a zebra dplane context is
created for each ipset/ipset entry event. The zebra_dplane_ctx job is
then enqueued and processed by separate thread. Like it has been done
for zebra_pbr_iptable context, the ipset and ipset entry contexts are
encapsulated into an union of structures in zebra_dplane_ctx.
There is a specificity in that when storing ipset_entry structure, there
was a backpointer pointer to the ipset structure that is necessary
to get some complementary information before calling the hook. The
proposal is to use an ipset_entry_info structure next to the ipset_entry,
in the zebra_dplane context. That information is used for ipset_entry
processing. The ipset name and the ipset type are the only fields
necessary.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Implement new ringbuf function to do the proper socket reads without
the need of intermediary buffers.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Add support for read only mib objects from RFC4444.
Signed-off-by: Lynne Morrison <lynne@voltanet.io>
Signed-off-by: Karen Schoener <karen@voltanet.io>
Problem:
Prefix-list with mulitiple rules, an update to
a rule/sequence with different prefix/prefixlen
reset prefix-list next-base pointer to avoid
having stale value.
In some case the old next-bast's reference leads
to an assert in tri (trie_install_fn ) add.
bt:
(object=0x55576a4c8a00, updptr=0x55576a4b97e0) at lib/plist.c:560
(plist=0x55576a4a1770, pentry=0x55576a4c8a00) at lib/plist.c:585
(ple=0x55576a4c8a00) at lib/plist.c:745
(args=0x7fffe04beb50) at lib/filter_nb.c:1181
Solution:
Reset prefix-list next-base pointer whenver a
sequence/rule is updated.
Ticket:CM-33109
Testing Done:
Signed-off-by: Chirag Shah <chirag@nvidia.com>
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
In prefix-list nortbound callback's validation
phase, use type as enum rather string for better
performance.
Signed-off-by: Chirag Shah <chirag@nvidia.com>
Following patch
" lib: disallow access list duplicated values"
introduce a libyang dnode iterator for every prefix-list
config which adds an overhead of traversal all prefix dnodes
and degrades the performance in scaled prefix-list config.
This check is not necessary in prefix-list northbound callbacks
as there won't be a case where prefix-list config comes to nb
callback without sequence number.
The dup check is only necessary for the vtysh case for backward
compatiblity reason where cli can be accepted without sequence number.
Ticket: CM-32035
Reviewed By: CCR-11096
Testing Done:
Signed-off-by: Chirag Shah <chirag@nvidia.com>
Remove when statements from prefix-list yang OM,
and do the same check in frr validation phase.
This helps a bit in perfomance of prefix-lists
scale config.
Ticket:CM-32035
Reviewed By:CCR-11096
Testing Done:
Signed-off-by: Chirag Shah <chirag@nvidia.com>
Instead of crashing, print "NULL" when printfrr callback for
nexthops is called for a NULL nexthop argument.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Example:
```
show yang operational-data /frr-routing:routing/control-plane-protocols/control-plane-protocol[type='frr-staticd:staticd'][name='staticd'][vrf='default'] staticd
```
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Apparently you can do `#__VA_ARGS__` and it actually does something
sensible, so here we go recording the format parameters for log messages
into the xref.
This allows some more checking in xrelfo.py, e.g. hints to use `%pFX`
and co.
Signed-off-by: David Lamparter <equinox@diac24.net>
This adds _clippy.ELFFile, which provides a fast wrapper around libelf.
The API is similar to / a subset of pyelfutils, which unfortunately is
painfully slow (to the tune of minutes instead of seconds.)
The idea is that xrefs can be read out of ELF files by reading out the
"xref_array" section or "FRRouting/XREF" note.
Signed-off-by: David Lamparter <equinox@diac24.net>
When the control plane protocol is created, the vrf structure is
allocated, and its address is stored in the northbound node.
The vrf structure may later be deleted by the user, which will lead to
a stale pointer stored in this node.
Instead of this, allow daemons that use the vrf pointer to register the
dependency between the control plane protocol and vrf nodes. This will
guarantee that the nodes will always be created and deleted together, and
there won't be any stale pointers.
Add such registration to staticd and pimd.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
When handling a large number of events at one time
FRR will call monotime and getrusage 2 times for each
event. With this change modify the code to change
this to (X events / 2) + 1 calls of getrusage and monotime
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Since some recent commit, building c++ code attempting to use zlog_debug
(or any other level) would fail with the following complaint:
lib/zlog.h:91:3: sorry, unimplemented: non-trivial designated
initializers not supported
};
^
lib/zlog.h:105:26: note: in expansion of macro ‘_zlog_ref’
#define zlog_debug(...) _zlog_ref(LOG_DEBUG, __VA_ARGS__)
This is due to out-of-order initialization of the xrefdata struct
fields. Setting them all in the order in which they are defined
fixes the issue.
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
When running clippy, the main function in it's
error path could leak the memory pointed to by name.
Fix this. This was/is reported by clang SA.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Only ask the event-loop poll() to awaken if a newly-added timer
actually might have changed the required timeout. Also compute
timer deadline outside of mutex locks.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
The existing code was using the oid2in_addr API to copy IPv6
addresses passing an IPv6 length. Create a utility to do this
properly and avoid annoying coverity with type checking.
Signed-off-by: Pat Ruddy <pat@voltanet.io>
Adding agentx_enabled hook.
This permits the main LDP process to signal the lde and ldpe
processes when agentx is enabled.
Signed-off-by: Karen Schoener <karen@voltanet.io>
CR, NL and CRNL are all OK, but CRNL shouldn't get treated as 2
newlines (which causes an empty command to be executed => empty prompt
line.)
Signed-off-by: David Lamparter <equinox@diac24.net>
The FDs are in struct vty, and there's ->fd and ->wfd, which shouldn't
be confused. Passing vty_sock along separately just creates mixups.
Signed-off-by: David Lamparter <equinox@diac24.net>
The return code from smux_trap is never used. If we have
never used it after all this time. Remove the return from
the function.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Neither tabs nor newlines are acceptable in syslog messages. They also
break line-based parsing of file logs.
Signed-off-by: David Lamparter <equinox@diac24.net>
The logging code writes log messages with a `\n` line ending, meanwhile
the VTY code switches it so you need `\r\n`...
And we don't flush the newline after executing a command either.
After this patch, starting daemons like `zebra/zebra -t` should provide
a nice development/debugging experience with a VTY open right there on
stdio and `log stdout` interspersed.
(This is already documented in the man pages, it just looked like sh*t
previously since the log messages didn't newline correctly.)
Signed-off-by: David Lamparter <equinox@diac24.net>
... in case the user does something like `zebra 3>logfile`. Also useful
for some module purposes, maybe even feeding config at some point in the
future.
Signed-off-by: David Lamparter <equinox@diac24.net>
This command doesn't rely on transactional CLI and works perfectly for
daemons converted to northbound configuration.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
mallinfo() is deprecated as of glibc 2.33 and emits a warning if used.
Support mallinfo2() if available.
Signed-off-by: Quentin Young <qlyoung@qlyoung.net>
The VRF must be marked as configured when user enters "vrf NAME" command.
Otherwise, the following problem occurs:
`ip link add red type vrf table 1`
VRF structure is allocated.
`vtysh -c "conf t" -c "vrf red"`
`lib_vrf_create` is called, and pointer to the VRF structure is stored
to the nb_config_entry.
`ip link del red`
VRF structure is freed (because it is not marked as configured), but
the pointer is still stored in the nb_config_entry.
`vtysh -c "conf t" -c "no vrf red"`
Nothing happens, because VRF structure doesn't exist. It means that
`lib_vrf_destroy` is not called, and nb_config_entry still exists in
the running config with incorrect pointer.
`ip link add red type vrf table 1`
New VRF structure is allocated.
`vtysh -c "conf t" -c "vrf red"`
`lib_vrf_create` is NOT called, because the nb_config_entry for that
VRF name still exists in the running config.
After that all NB commands for this VRF will use incorrect pointer to
the freed VRF structure.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Extend the thread_cancel_event api so that it's more complete:
look in all the lists of events, including io and timers, for
matching tasks. Add a limited version of the api that only
examines tasks in the event and ready queues.
BGP appears to require the old behavior, so change its macro
to use the more limited cancel api.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
No reason for the thread/task cancellation struct to be public:
move it out of the header file. Also add a flags field.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Valgrind reports:
469901-==469901==
469901-==469901== Conditional jump or move depends on uninitialised value(s)
469901:==469901== at 0x3A090D: bgp_bfd_dest_update (bgp_bfd.c:416)
469901-==469901== by 0x497469E: zclient_read (zclient.c:3701)
469901-==469901== by 0x4955AEC: thread_call (thread.c:1684)
469901-==469901== by 0x48FF64E: frr_run (libfrr.c:1126)
469901-==469901== by 0x213AB3: main (bgp_main.c:540)
469901-==469901== Uninitialised value was created by a stack allocation
469901:==469901== at 0x3A0725: bgp_bfd_dest_update (bgp_bfd.c:376)
469901-==469901==
469901-==469901== Conditional jump or move depends on uninitialised value(s)
469901:==469901== at 0x3A093C: bgp_bfd_dest_update (bgp_bfd.c:421)
469901-==469901== by 0x497469E: zclient_read (zclient.c:3701)
469901-==469901== by 0x4955AEC: thread_call (thread.c:1684)
469901-==469901== by 0x48FF64E: frr_run (libfrr.c:1126)
469901-==469901== by 0x213AB3: main (bgp_main.c:540)
469901-==469901== Uninitialised value was created by a stack allocation
469901:==469901== at 0x3A0725: bgp_bfd_dest_update (bgp_bfd.c:376)
On looking at bgp_bfd_dest_update the function call into bfd_get_peer_info
when it fails to lookup the ifindex ifp pointer just returns leaving
the dest and src prefix pointers pointing to whatever was passed in.
Let's do two things:
a) The src pointer was sometimes assumed to be passed in and sometimes not.
Forget that. Make it always be passed in
b) memset the src and dst pointers to be all zeros. Then when we look
at either of the pointers we are not making decisions based upon random
data in the pointers.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This didn't exist yet when the xref code came around, and since
frrtrace() gets collapsed to nothing by the preprocessor when
tracepoints are disabled, it didn't cause any compiler errors...
Signed-off-by: David Lamparter <equinox@diac24.net>
gcc fucks up global variables with section attributes when they're used
in templated C++ code. The template instantiation "magic" kinda breaks
down (it's implemented through COMDAT in the linker, which clashes with
the section attribute.)
The workaround provides full runtime functionality, but the xref
extraction tool (xrelfo.py) won't work on C++ code compiled by GCC.
FWIW, clang gets this right.
Signed-off-by: David Lamparter <equinox@diac24.net>
The function smux_trap only allows the paaasin of one index which is
applied to all indexed objects. However there is a requirement for
differently indexed objects within a singe trap. This commit
introduces a new function smux_trap_multi_index which can be called
with an array of indices. If this array is onf length 1 the original
smux_trap behaviour is maintained. smux_trap now calls the new
function with and index array length of 1 to avoid changes to
existing callers.
Signed-off-by: Pat Ruddy <pat@voltanet.io>
Add defines for IANA SNMP routing protocol values
Add macro for returning an IPv6 address to the SNMP agent.
Signed-off-by: Pat Ruddy <pat@voltanet.io>
Add if_vrf_lookup_by_index_next to get the next ifindex in a vrf
given the previous ifindex or 0 for the first.
Signed-off-by: Pat Ruddy <pat@voltanet.io>
Add SNMP support for L3vpn Vrf table as defined in [RFC4382]
Keep track of vrf status for the table and for future traps.
Signed-off-by: Pat Ruddy <pat@voltanet.io>
Run through the vrf's interface list and return a count, skipping
the l3mdev which has a name which matches the vrf name.
Signed-off-by: Pat Ruddy <pat@voltanet.io>
We don't use `%n` anywhere, so the only purpose it serves is enabling
exploits.
(I thought about this initially when adding printfrr, but I wasn't sure
we don't use `%n` anywhere, and thought I'll check later, and then just
forgot it...)
Signed-off-by: David Lamparter <equinox@diac24.net>
Description: When we get a new vrf add and vrf with same name, but different vrf-id already
exists in the database, we should treat vrf add as update.
This happens mostly when there are lots of vrf and other configuration being replayed.
There may be a stale vrf delete followed by new vrf add. This
can cause timing race condition where vrf delete could be missed and
further same vrf add would get rejected instead of treating last arrived
vrf add as update.
Treat vrf add for existing vrf as update.
Implicitly disable this VRF to cleanup routes and other functions as part of vrf disable.
Update vrf_id for the vrf and update vrf_id tree.
Re-enable VRF so that all routes are freshly installed.
Above 3 steps are mandatory since it can happen that with config reload
stale routes which are installed in vrf-1 table might contain routes from
older vrf-0 table which might have got deleted due to missing vrf-0 in new configuration.
Signed-off-by: sudhanshukumar22 <sudhanshu.kumar@broadcom.com>
This allows grabbing a list of all DEFUNs and their help texts through
the xref extraction mechanics.
Signed-off-by: David Lamparter <equinox@diac24.net>
This allows extracting a list of all log messages including their ECs
and autogenerated unique IDs for them.
Signed-off-by: David Lamparter <equinox@diac24.net>
Our "true" libraries (i.e. not modules) don't invoke neither
FRR_DAEMON_INFO nor FRR_MODULE_SETUP, hence XREF_SETUP isn't invoked
either. Invoke it directly to get things working.
Signed-off-by: David Lamparter <equinox@diac24.net>
This adds the machinery for cross reference points (hence "xref") for
things to be annotated with source code location or other metadata
and/or to be uniquely identified and found at runtime or by dissecting
executable files.
The extraction tool to walk down an ELF file is done and working but
needs some more cleanup and will be added in a separate commit.
Signed-off-by: David Lamparter <equinox@diac24.net>
Makes more sense to have this as a static inline. Also I don't want to
be forced to link network.o into clippy ;)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The output from `show thread cpu` was not lined up appropriately
for the header line. As well as the function name we were
calling in the output. Fix it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
In 5a3cf85391 the trailing empty line
following the "show ip(v6) route" header was removed. Restore it for
consistency.
Signed-off-by: Duncan Eastoe <duncan.eastoe@att.com>
gcc-10 is complaining:
lib/frrscript.c:42:14: error: cast between incompatible function types from ‘const char * (*)(lua_State *, const char *)’ to ‘void (*)(lua_State *, const void *)’ [-Werror=cast-function-type]
42 | .encoder = (encoder_func)lua_pushstring,
| ^
Wrapper it to make it happy. Not sure what else to do.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
fname is MAXPATHLEN and scriptdir and fs->name are less then
MAXPATHLEN but the combination of those two + the `.lua` are
greater than the MAXPATHLEN. Just give us more room to prevent
a coding boo boo.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
```
exit1-debian-9(config-route-map)# set community
AA:NN Community number in AA:NN format (where AA and NN are (0-65535)) or local-AS|no-advertise|no-export|internet|graceful-shutdown|accept-own-nexthop|accept-own|route-filter-translated-v4|route-filter-v4|route-filter-translated-v6|route-filter-v6|llgr-stale|no-llgr|blackhole|no-peer or additive
none No community attribute
```
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
The raw zapi apis to encode and decode NHGs don't need to be
public; also add a little more validity-checking.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
When the routemap code was rewritten for performance the
code to track the number of times a particular section of
a route-map was applied was not correctly updated. In
this case I found another sequence of events where the
number of times a section was invoked was not being correctly
kept.
Effectively in this case when route_map_get_index is called
and returns an index the route map has been applied( see that
skip_match_clause is set to true and then in the for loop
below the skip_match_clause is tested and index->applied is
incremented.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The re->flags and re->status in debugs were being dumped as hex values.
I can never quickly decode this. Here is an idea. Let's let FRR do
it for me.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add an API that allows IGP client daemons to register/unregister
RLFAs with ldpd.
IGP daemons need to be able to query the LDP labels needed by RLFAs
and monitor label updates that might affect those RLFAs. This is
similar to the NHT mechanism used by bgpd to resolve and monitor
recursive nexthops.
This API is based on the following ZAPI opaque messages:
* LDP_RLFA_REGISTER: used by IGP daemons to register an RLFA with ldpd.
* LDP_RLFA_UNREGISTER_ALL: used by IGP daemons to unregister all of
their RLFAs with ldpd.
* LDP_RLFA_LABELS: used by ldpd to send RLFA labels to the registered
clients.
For each RLFA, ldpd needs to return the following labels:
* Outer label(s): the labels advertised by the adjacent routers to
reach the PQ node;
* Inner label: the label advertised by the PQ node to reach the RLFA
destination.
For the inner label, ldpd automatically establishes a targeted
neighborship with the PQ node if one doesn't already exist. For that
to work, the PQ node needs to be configured to accept targeted hello
messages. If that doesn't happen, ldpd doesn't send a response to
the IGP client daemon which in turn won't be able to activate the
previously computed RLFA.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Define new models for Link State Database a.k.a TED
and functions to manipulate the new database as well as exchange Link State
information through ZAPI Opaque message.
Signed-off-by: Olivier Dugeon <olivier.dugeon@orange.com>
There exists a possibilty that route map dependencies
have gotten wrong. Prevent the crash and warn the user
that we may be in trouble.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Route-maps contain a hash of hash's that contain the
container type name ( say community or access list or whatever )
and then it has a hash of route-maps that this maps too
Suppose you have this:
!
frr version 7.3.1
frr defaults traditional
hostname eva
log stdout
!
debug route-map
!
router bgp 239
neighbor 192.168.161.2 remote-as external
!
address-family ipv4 unicast
neighbor 192.168.161.2 route-map foo in
exit-address-family
!
bgp community-list standard 7000:40002 permit 7000:40002
bgp community-list standard 7000:40002 permit 7000:40003
!
route-map foo deny 20
match community 7000:40002
!
route-map foo permit 10
!
line vty
!
end
You have a community hash which has an
7000:40002 entry
This entry has a hash of routemaps that are referencing it. In this above
example it would have `foo` as the single entry.
Given the above config if you do this:
eva# conf
eva(config)# route-map foo deny 20
eva(config-route-map)# match community 7000:4003
eva(config-route-map)#
We would expect the `7000:40002` community hash to no longer have
a reference to the `foo` routemap. Instead we see the code doing this:
2020/12/18 13:47:12 BGP: bgpd 7.3.1 starting: vty@2605, bgp@<all>:179
2020/12/18 13:47:47 BGP: Add route-map foo
2020/12/18 13:47:47 BGP: Route-map foo add sequence 10, type: permit
2020/12/18 13:47:57 BGP: Route-map foo add sequence 20, type: deny
2020/12/18 13:48:05 BGP: Adding dependency for filter 7000:40002 in route-map foo
2020/12/18 13:48:05 BGP: route_map_print_dependency: Dependency for 7000:40002: foo
2020/12/18 13:48:41 BGP: bgp_update_receive: rcvd End-of-RIB for IPv4 Unicast from 192.168.161.2 in vrf default
2020/12/18 13:49:19 BGP: Deleting dependency for filter 7000:4003 in route-map foo
2020/12/18 13:49:19 BGP: Adding dependency for filter 7000:4003 in route-map foo
2020/12/18 13:49:19 BGP: route_map_print_dependency: Dependency for 7000:4003: foo
Note how the code attempts to remove the dependency for `7000:4003` instead of the
dependency for `7000:40002`. Then we create a new hash for `7000:4003` and then
install the routemap name in it.
This is wrong. We should remove the `7000:40002` dependency and then install
a dependency for `7000:4003`.
Fix the code to do the right thing.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This new dynamic module makes pathd behave as a PCC for dynamic candidate path
using the external library pcpelib https://github.com/volta-networks/pceplib .
The candidate paths defined as dynamic will trigger computation requests to the
configured PCE, and the PCE response will be used to update the policy.
It supports multiple PCE. The one with smaller precedence will be elected
as the master PCE, and only if the connection repeatedly fails, the PCC will
switch to another PCE.
Example of configuration:
segment-routing
traffic-eng
pcep
pce-config CONF
source-address ip 10.10.10.10
sr-draft07
!
pce PCE1
config CONF
address ip 1.1.1.1
!
pce PCE2
config CONF
address ip 2.2.2.2
!
pcc
peer PCE1 precedence 10
peer PCE2 precedence 20
!
!
!
!
Co-authored-by: Brady Johnson <brady@voltanet.io>
Co-authored-by: Emanuele Di Pascale <emanuele@voltanet.io>
Co-authored-by: GalaxyGorilla <sascha@netdef.org>
Co-authored-by: Javier Garcia <javier.garcia@voltanet.io>
Co-authored-by: Renato Westphal <renato@opensourcerouting.org>
Co-authored-by: Sebastien Merle <sebastien@netdef.org>
Signed-off-by: Sebastien Merle <sebastien@netdef.org>
This new daemon manages Segment-Routing Traffic-Engineering
(SR-TE) Policies and installs them into zebra. It provides
the usual yang support and vtysh commands to define or change
SR-TE Policies.
In a nutshell SR-TE Policies provide the possibility to steer
traffic through a (possibly dynamic) list of Segment Routing
segments to the endpoint of the policy. This list of segments
is part of a Candidate Path which again belongs to the SR-TE
Policy. SR-TE Policies are uniquely identified by their color
and endpoint. The color can be used to e.g. match BGP
communities on incoming traffic.
There can be multiple Candidate Paths for a single
policy, the active Candidate Path is chosen according to
certain conditions of which the most important is its
preference. Candidate Paths can be explicit (fixed list of
segments) or dynamic (list of segment comes from e.g. PCEP, see
below).
Configuration example:
segment-routing
traffic-eng
segment-list SL
index 10 mpls label 1111
index 20 mpls label 2222
!
policy color 4 endpoint 10.10.10.4
name POL4
binding-sid 104
candidate-path preference 100 name exp explicit segment-list SL
candidate-path preference 200 name dyn dynamic
!
!
!
There is an important connection between dynamic Candidate
Paths and the overall topic of Path Computation. Later on for
pathd a dynamic module will be introduced that is capable
of communicating via the PCEP protocol with a PCE (Path
Computation Element) which again is capable of calculating
paths according to its local TED (Traffic Engineering Database).
This dynamic module will be able to inject the mentioned
dynamic Candidate Paths into pathd based on calculated paths
from a PCE.
https://tools.ietf.org/html/draft-ietf-spring-segment-routing-policy-06
Co-authored-by: Sebastien Merle <sebastien@netdef.org>
Co-authored-by: Renato Westphal <renato@opensourcerouting.org>
Co-authored-by: GalaxyGorilla <sascha@netdef.org>
Co-authored-by: Emanuele Di Pascale <emanuele@voltanet.io>
Signed-off-by: Sebastien Merle <sebastien@netdef.org>
Currently when nhrp shortcuts are purged they will not be recreated. This
patch fixes that by ensuring the shortcut routes get purged correctly.
This situation can be reproduced by first allowing a shortcut to be created
then clearing the shortcut:
clear ip nhrp cache
clear ip nhrp shortcuts
Signed-off-by: Reuben Dowle <reuben.dowle@4rf.com>
There exists a world where some people have put `end` in their
configuration. Then vtysh will command search for it and find
it and then bad things happen.
Ticket: CM-32665
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Removing the obsolete ldp-sync periodic 'hello' message.
When ldp-sync is configured, IGPs take action if the LDP process goes down.
The IGPs have been updated to use the zapi client close callback to detect
the LDP process going down.
Signed-off-by: Karen Schoener <karen@voltanet.io>
Add a bit of code that allows for opaque data to be
sent from an upper level protocol to zebra. This is just
pass through data that will be used as part of displaying
useful data about a route in a `show ip route` command
in future commits.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Keep the previous CLI behavior of silently ignoring access lists which
contain the same value.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Don't allow users to create multiple entries in the same list with the
same value to keep the behavior previously to northbound migration.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Keep the previous CLI behavior of silently ignoring access lists which
contain the same value.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Don't allow users to create multiple rules in the same list with the
same value to keep the behavior previously to northbound migration.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Currently, IGPs are coded to receive a 'hello' message from LDP every second.
Intermittently, LDP Sync topotests are failing because the IGPs fail to
receive this 'hello' message every second.
When the LDP Sync topotests fail, LDP logs show that LDP is processing
zapi messages for 1-2 seconds.
This is a shortterm fix, in order to prevent CI pipeline failures.
The longterm fix is in progress.
Signed-off-by: Karen Schoener <karen@voltanet.io>
Specify default via --with-scriptdir at compile time, override default
with --scriptdir at runtime. If unspecified, it's {sysconfdir}/scripts
(usually /etc/frr/scripts)
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
This implements the ability to get results out from lua scripts after
they've run.
For each C type we support passing to Lua, there is a corresponding
`struct frrscript_codec`. This struct contains a typename field - just a
string identifying the type - and two function pointers. The first
function pointer, encode, takes a lua_State and a pointer to the C value
and pushes some corresponding Lua representation onto the stack. The
second, decode, assumes there is some Lua value on the stack and decodes
it into the corresponding C value.
Each supported type's `struct frrscript_codec` is registered with the
scripting stuff in the library, which creates a mapping between the type
name (string) and the `struct frrscript_codec`. When calling a script,
you specify arguments by passing an array of `struct frrscript_env`.
Each of these structs has a void *, a type name, and a desired binding
name. The type names are used to look up the appropriate function to
encode the pointed-at value onto the Lua stack, then bind the pushed
value to the provided binding name, so that the converted value is
accessible by that name within the script.
Results work in a similar way. After a script runs, call
frrscript_get_result() with the script and a `struct frrscript_env`.
The typename and name fields are used to fetch the Lua value from the
script's environment and use the registered decoder for the typename to
convert the Lua value back into a C value, which is returned from the
function. The caller is responsible for freeing these.
frrscript_call()'s macro foo has been stripped, as the underlying
function now takes fixed arrays. varargs have awful performance
characteristics, they're hard to read, and structs are more defined than
an order sensitive list.
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
None of the core lua_push* functions return anything, and it helps to
not have to wrap those when using them as function pointers for our
encoder system, so change the type of our custom encoders to return void
as well.
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
Add:
- log.warn()
- log.error()
- log.notice()
- log.info()
- log.debug()
to the global namespace for each script
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
Update the two test functions that encode a prefix and an interface to
match the encoder_func signature expected by the scripting infra.
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
Rather than let Luaisms propagate from the start, this is some generic
wrapper stuff that defines some semantics for interacting with scripts
that aren't specific to the underlying language.
The concept I have in mind for FRR's idea of a script is:
- has a name
- has some inputs, which have types
- has some outputs, which have types
I don't want to even say they have to be files; maybe we can embed
scripts in frr.conf, for example. Similarly the types of inputs and
outputs are probably going to end up being some language-specific setup.
For now, we will stick to this simple model, but the plan is to add full
object support (ie calling back into C).
This shouldn't be misconstrued as prepping for multilingual scripting
support, which is a bad idea for the following reasons:
- Each language would require different FFI methods, and specifically
different object encoders; a lot of code
- Languages have different capabilities that would have to be brought to
parity with each other; a lot of work
- Languages have *vastly* different performance characteristics; bad
impressions, lots of issues we can't do anything about
- Each language would need a dedicated maintainer for the above reasons;
pragmatically difficult
- Supporting multiple languages fractures the community and limits the
audience with which a given script can be shared
The only pro for multilingual support would be ease of use for users not
familiar with Lua but familiar with one of the other supported
languages. This is not enough to outweigh the cons.
In order to get rich scripting capabilities, we need to be able to pass
representations of internal objects to the scripts. For example, a
script that performs some computation based on information about a peer
needs access to some equivalent of `struct peer` for the peer in
question. To transfer these objects from C-space into Lua-space we need
to encode them onto the Lua stack. This patch adds a mapping from
arbitrary type names to the functions that encode objects of that type.
For example, the function that encodes `struct peer` into a Lua table
could be registered with:
bgp_peer_encoder_func(struct frrscript *fs, struct peer *peer)
{
// encode peer to Lua table, push to stack in fs->scriptinfo->L
}
frrscript_register_type_encoder("peer", bgp_peer_encoder_func);
Later on when calling a script that wants a peer, the plan is to be able
to specify the type name like so:
frrscript_call(script, "peer", peer);
Using C-style types for the type names would have been nice, it might be
possible to do this with preprocessor magic or possibly python
preprocessing later on.
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
mergeme no stdlib
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
This was toy code used for testing purposes. Code calling Lua should be
very explicit about what is loaded into the Lua state. Also, the
allocator used is exactly the same allocator used by default w/
luaL_newstate().
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Add a function that will export FRR's logging functions into a Lua
table, and add that table to the table of your choice (usually _ENV).
For instance, to add logging to the global environment:
lua_gettable(L, LUA_REGISTRYINDEX);
lua_gettable(L, LUA_RIDX_GLOBALS);
frrlua_export_logging(L);
Then the following functions are globally accessible to any Lua scripts
running with state L:
- log.debug()
- log.info()
- log.notice()
- log.warn()
- log.error()
These are bound to zlog_debug, zlog_info, etc. They only take one string
argument for now but this shouldn't be an issue given Lua's builtin
facilities for formatting strings.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Use frrlua_* prefix to differentiate from Lua builtins
* Allow frrlua_initialize to pass an empty script
* Fixup naming of table accessors
* Fixup naming of prefix -> table encoder
* Fixup BGP routemap code to new function names
* Fix includes for frrlua.h
* Clean up doc comments
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
As code comment states, 1 count of MTYPE_COMPLETION is leaked for each
autocompleted token. Let's manually decrement the counter before passing
the pointer to readline.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
The function was originally implemented for zebra data plane FPM plugin,
but another code places could use it.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
start_config and end_config are already used as function names in DEFUN,
so the current naming is a little bit confusing. Let's use different
names for arguments.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Add a startup-time option to limit the number of fds used
by the thread/event infrastructure. If nothing is configured,
the system ulimit is used.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
The return from sockunion2hostprefix tells us if the conversion
succeeded or not. There are places in the code where we
always assume that it just `works`, since it can fail
notice and try to do the right thing.
Please note that failure of this function for most cases
of sockunion2hostprefix is highly highly unlikely as that
the sockunion was already created and tested elsewhere
it's just that this function can fail.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Modify the code to change from zlog_debug to zlog_err.
vtysh was not outputting the vtysh doc string issues
after a change a couple of months back. By changing
to error level we start seeing them on vtysh start up
again. This will allow us to catch these issues
in the CI runs again.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When a FRR process dies due to SIGILL/SIGABORT/etc attempt
to drain the log buffer. This code change is capturing
some missing logs that were not part of the log file on
a crash.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The `enum zclient_send_status` enum needs to be extended
throughout the code base to use the new states and
to fix up places where we tested against the return
value being non zero.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add a `enum zclient_send_status` for appropriate handling
of return codes from zclient_send_message. Touch all the places
where we handle this.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When FRR sends data over the ZAPI protocol from the upper levels to zebra, indicate
to the calling functions that we have started buffering data to be sent if the
socket is full underneath it.
Also add a call back function `zebra_buffer_write_ready` that we can call
when an upper level protocol's socket buffer has been drained.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The linux kernel is getting RTM_F_OFFLOAD_FAILED for kernel routes
that have failed to offload. Write the code
to receive these notifications from the linux kernel
and store that data for display about the routes.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Move the FOREACH_AFI_SAFI macro from bgpd.h to zebra.h( GLOBAL's YOUALL )
Then convert all the places that have the two level for loop to
iterate over all afi/safis
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The route_map_object_t was being used to track what protocol we were
being called against. But each protocol was only ever calling itself.
So we had a variable that was only ever being passed in from route_map_apply
that had to be carried against and everyone was testing if that variable
was for their own stack.
Clean up this route_map_object_t from the entire system. We should
speed some stuff up. Yes I know not a bunch but this will add up.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
As part of normal processing we allow bgp commands to walk
up the command node chain. We are experiencing this crash:
Thread 1 "bgpd" received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
assertion=0x7ffff7f3ba4f "set", file=0x7ffff7f3ba44 "lib/yang.c", line=413, function=<optimized out>)
at assert.c:92
line=413, function=0x7ffff7f3bc50 <__PRETTY_FUNCTION__.9> "yang_dnode_get") at assert.c:101
vty=0x5555561715a0, argc=3, argv=0x555558601620) at bgpd/bgp_vty.c:9568
cmd=0x0) at lib/command.c:937
at lib/command.c:997
matched=0x0, vtysh=0) at lib/command.c:1161
at lib/vty.c:517
(gdb)
9582 bgp_glb_dnode = yang_dnode_get(vty->candidate_config->dnode,
(gdb) p vty->xpath
$8 = {
"/frr-routing:routing/control-plane-protocols/control-plane-protocol[type='frr-bgp:bgp'][name='bgp'][vrf='default']/frr-bgp:bgp", '\000' <repeats 897 times>, '\000' <repeats 1023 times>, '\000' <repeats 1023 times>,
'\000' <repeats 1023 times>, '\000' <repeats 1023 times>, '\000' <repeats 1023 times>, '\000' <repeats 1023 times>,
'\000' <repeats 1023 times>}
(gdb) p vty->xpath_index
$9 = 0
(gdb)
We are effectively sending in an array index based upon vty->xpath_index( which is zero) but
the VTY_CURR_XPATH macro subtracts 1 from that value to find the appropriate xpath to use.
This of course subtracts 1 from 0 and we underflow the array.
The relevant section in a config file is this:
address-family ipv6 flowspec
bgp maxim...
Effectively we were trying to walk up the command chain for flowspec to see
if the command is entered correctly. There is a function vty_check_node_for_xpath_decrement
that was looking at bgp sub-modes to make the decision to allow us to decrement
the vty->xpath_index which did not have the v4 or v6 flowspec bgp sub modes in the
check.
Adding them in fixes the problem.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
calling "skiplist test" and then "skiplist debug",
there was a crash due to a freed pointer.
Agreed to remove static pointer (see PR #7474).
Signed-off-by: Emanuele Bovisio <emanuele.bovisio@eolo.it>
When a BFD integrated session already exists setting the profile
doesn't cause a session update (or vice versa): fix this issue by
handling the other cases.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Let the integration protocol always send the full configuration
instead of saving a few bytes. It will also allow protocols to specify
source address for IPv4 single hop connections and interface for multi
hop configuration.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Issue:
The bgp routes learnt from peers which are not installed in kernel are
advertised to peers. This can cause routers to send traffic to these
destinations only to get dropped. The fix is to provide a configurable
option "bgp suppress-fib-pending". When the option is enabled, bgp will
advertise routes only if it these are successfully installed in kernel.
Fix (Part1) :
* Added message ZEBRA_ROUTE_NOTIFY_REQUEST used by client to request
FIB install status for routes
* Added AFI/SAFI to ZAPI messages
* Modified the functions zapi_route_notify_decode(), zsend_route_notify_owner()
and route_notify_internal() to include AFI, SAFI as parameters
Signed-off-by: kssoman <somanks@gmail.com>
gcc 10 complains about some of our format specs, fix them. Use
atomic size_t in thread stats, to work around platform
differences.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Manage the main pthread's signal mask to avoid a signal-handling
race. Before entering poll, check for pending signals that the
application needs to handle. Use ppoll() to re-enable those
signals during the poll call.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Add an api that blocks application-handled signals (SIGINT,
SIGTERM, e.g.) then tests whether any signals have been received.
This helps to manage a race between signal reception and the poll
call in the main event loop.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
json_array_string_add is used to add a string entry into a JSON
list. This API is needed by zebra so moving it from bgpd to lib.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
DF (Designated forwarder) election is used for picking a single
BUM-traffic forwarded per-ES. RFC7432 specifies a mechanism called
service carving for DF election. However that mechanism has many
disadvantages -
1. LBs poorly.
2. Doesn't allow for a controlled failover needed in upgrade
scenarios.
3. Not easy to hw accelerate.
To fix the poor performance of service carving alternate DF mechanisms
have been proposed via the following drafts -
draft-ietf-bess-evpn-df-election-framework
draft-ietf-bess-evpn-pref-df
This commit adds support for the pref-df election mechanism which
is used as the default. Other mechanisms including service-carving
may be added later.
In this mechanism one switch on an ES is elected as DF based on the
preference value; higher preference wins with IP address acting
as the tie-breaker (lower-IP wins if pref value is the same).
Sample output
=============
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
torm-11# sh bgp l2vpn evpn es 03:00:00:00:00:01:11:00:00:01
ESI: 03:00:00:00:00:01:11:00:00:01
Type: LR
RD: 27.0.0.15:6
Originator-IP: 27.0.0.15
Local ES DF preference: 100
VNI Count: 10
Remote VNI Count: 10
Inconsistent VNI VTEP Count: 0
Inconsistencies: -
VTEPs:
27.0.0.16 flags: EA df_alg: preference df_pref: 32767
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
torm-11# sh bgp l2vpn evpn route esi 03:00:00:00:00:01:11:00:00:01
*> [4]:[03:00:00:00:00:01:11:00:00:01]:[32]:[27.0.0.15]
27.0.0.15 32768 i
ET:8 ES-Import-Rt:00:00:00:00:01:11 DF: (alg: 2, pref: 100)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
In transactional cli mode, bgp address-family <afi> <afi>
node builds xpath on top of `router bgp` node's xpath.
When `exit` is applied under afi-safi commands, retain
xpath_index to 1 to keep using bgp global xpath.
Signed-off-by: Chirag Shah <chirag@nvidia.com>
Make it possible to load YANG modules outside the main northbound
initialization. The primary use case is to support YANG modules
that are specific to an FRR plugin. Example: only load the PCEP
YANG module when the corresponding FRR plugin is loaded. Other use
cases might arise in the future.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Combine yang_snodes_iterate_module() and yang_snodes_iterate_all()
into an unified yang_snodes_iterate() function, where the first
"module" parameter is optional. There's no point in having two
separate YANG schema iteration functions anymore now that they are
too similar.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
The only safe way to iterate over all schema nodes of a given YANG
module is by iterating over all schema nodes of all YANG modules
and filter out the nodes that belong to other modules.
The original yang_snodes_iterate_module() code did the following:
1 - Iterate over all top-level schema nodes of the given module;
2 - Iterate over all augmentations of the given module.
While that iteration strategy is more efficient, it does't handle
well more complex YANG hierarchies containing nested augmentations
or self-augmenting modules. Any iteration that isn't done on the
resolved YANG data hierarchy is fragile and prone to errors.
Fixes regression introduced by commit 8a923b4851 where the
gen_northbound_callbacks tool was generating duplicate callbacks
for certain modules.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
- tracepoint() -> frrtrace()
- tracelog() -> frrtracelog()
- tracepoint_enabled() -> frrtrace_enabled()
Also removes copypasta'd #ifdefs for those LTTng macros, those are
handled in lib/trace.h
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
Previous commits added LTTng tracepoints. This was primarily for testing
/ trial purposes; in practice we'd like to support arbitrary tracing
methods, and especially USDT probes, which SystemTap and dtrace expect,
and which are supported on at least one flavor of BSD (FreeBSD).
To that end this patch adds an frr-specific tracing macro, frrtrace(),
which proxies into either DTRACE_PROBEn() or tracepoint() macros
depending on whether --enable-usdt or --enable-lttng is passed at
compile time.
At some point this could be tweaked to allow compiling in both types of
probes. Ideally there should be some logic there to use LTTng's optional
support for generating USDT probes when both are requested.
No additional libraries are required to use USDT, since these probes are
a kernel feature and only need the <sys/sdt.h> header.
- add --enable-usdt to toggle use of LTTng tracepoints or USDT probes
- add new trace.h library header for use with tracepoint definition
headers
- add frrtrace() wrapper macro; this should be used to define
tracepoints instead of using tracepoint() or DTRACE_PROBEn()
Compilation with USDT does nothing as of this commit; the existing LTTng
tracepoints need to be converted to use the frrtrace*() macros in a
subsequent commit.
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
hash_get is used for both lookup and insert; add a tracepoint for when
we insert something into the hash
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
LTTng supports tracef() and tracelog() macros, which work like printf,
and are used to ease transition between logging and tracing. Messages
printed using these macros end up as trace events. For our uses we are
not interested in dropping logging, but it is nice to get log messages
in trace output, so I've added a call to tracelog() in zlog that dumps
our zlog messages as trace events.
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
This commit adds initial support for LTTng.
When --enable-lttng=no or is not specified, no tracing code is included.
When --enable-lttng=yes, LTTng tracing events are (will be) generated.
configure.ac:
- add --enable-lttng
- define HAVE_LTTNG when enabled
- minimum LTTng version: 2.12.0
lib:
- add trace.[ch]
- update subdir.am
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
Replace all lib/thread cancel macros, use thread_cancel()
everywhere. Only the THREAD_OFF macro and thread_cancel() api are
supported. Also adjust thread_cancel_async() to NULL caller's pointer (if
present).
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Change thread_cancel to take a ** to an event, NULL-check
before dereferencing, and NULL the caller's pointer. Update
many callers to use the new signature.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Convert over to using the %pFX and %pRN modifiers
to output strings to allow us to consolidate on
one standard for printing prefixes.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Create appropriate accessor functions for the rn->lock
data. We should be accessing this data through accessor
functions since it is private data to the data structure.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Currently the prefix length M must be less than Y.
Relax this restriction to allow M to be less than or equal
to Y.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
We have 2 different routines to turn an evpn route into a string.
This commit aligns the two to the latest maintained version as a
first step in removing one of them.
Signed-off-by: Pat Ruddy <pat@voltanet.io>
Consolidate evpn type help strings into one single
macro for use on commands that need to support all
the types.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
We have this pattern in the code base:
if (thread)
THREAD_OFF(thread);
If we look at THREAD_OFF we check to see if thread
is non-null too. So we have a double check.
This is unnecessary. Convert to just using THREAD_OFF
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Display human readable error message in northbound rpc
transaction failure. In case of vtysh nb client, the error
message will be displayed to user.
Testing:
bharat# clear evpn dup-addr vni 1002 ip 11.11.11.11
Error type: generic error
Error description: Requested IP's associated MAC aa:aa:aa:aa:aa:aa is still
in duplicate state
Signed-off-by: Chirag Shah <chirag@nvidia.com>
Don't attempt to compress the wildcard information to fit a `/M`, but
use its own full 4 byte field.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Remove the nexthop_same_firsthop() api and just call nexthop_same().
Not entirely sure why we were using this function in the first place,
but now we are just marking dupes with it so lets just call a
common function and avoid issues.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
* remove pre-generation of route_types.h from configure
This change is a partial revert of commit 306ed6816. This is a little
drawback, but at least "make lib/libfrr.la", mentioned in the commit,
still works because route_types.h is forced to be built in f1b32b2e5.
* add "enabled" field to route_types.txt to track which daemon should
be enabled to add the routing protocol to "show ip route" header and
to redistribution list
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
This function returns true on success and false otherwise. Returning -1
on error is equivalent to returning true.
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
Add the zapi code for encoding/decoding of backup nexthops for when
we are ready for it, but disable it for now so that we revert
to the old way with them.
When zebra gets a proto-NHG with a backup in it, we early fail and
tell the upper level proto. In this case sharpd. Sharpd then reverts
to the old way of installation with the route.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Align the zapi NHG apis to be more consistent with the zapi_route
apis. Add a struct zapi_nhg to use for encodings as well.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add some header documentation to make it clear that you
cannot delete more than one item during each iteration.
Doing so could cause memory corruption for next pointer
if its also deleted from the table.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add the proto Nexthop Group Notify Owner header to
the log command types for string conversion.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Make the message parameters align better with other zapi
notifications and change the ID to correctly be a uint32.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add logging info for the new zapi ZEBRA_NHG_ADD[DEL]
message types. With this patch, they are logged properly
when debugs are turned on.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add a command `set installable` that allows configured nexthop
groups to be treated as separate/installable objects in the RIB.
A callback needs to be implemented per daemon to handle installing
the NHG into the rib via zapi when this command is set. This
patch includes the implementation for sharpd.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add setting the onlink flag to the zapi_nh conversion
helper function so that we can set the onlink flag with
it when passing down NHGs from upper level protos.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Determine the NHG ID spacing and lower bound with ZEBRA_ROUTE_MAX
in macros.
Directly set the upperbound to be the lower 28bits of the uint32_t ID
space (the top 4 are reserved for l2-NHGs). Round that number down
a bit to make it more even.
Convert all former lower_bound calls to just use the macro.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add a command/functionality to only install proto-based nexthops.
That is nexthops owned/created by upper level protocols, not ones
implicitly created by zebra.
There are some scenarios where you would not want zebra to be
arbitrarily installing nexthop groups and but you still want
to use ones you have control over via lib/nexthop_group config
and an upper level protocol.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Implement the underlying zebra functionality to Add/Del an
internal zebra and kernel NHG.
These NHGs are managed by the upperlevel protocols that send them
down via zapi messaging.
They are not put into the overall zebra NHG hash table and only
put into to the ID table. Therefore, different protos cannot
and will not share NHGs.
The proto is also set appropriately when sent to the kernel.
Expand the separation of Zebra hashed/shared/created NHGs and
proto created and mangaged NHGs.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Modify the send down of a route to use the nexthop group id
if we have one associated with the route.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add the ability to send a NHG from an upper level protocol down to
zebra. ZAPI_NHG_ADD encompasses both the addition and replace
semantics ( If the id passed down does not exist yet, it's Add,
else it's a replace ).
Effectively zebra will take this nhg passed down save the nhg
in the id hash for nhg's and then create the appropriate nhg's
and finally install them into the linux kernel. Notification
will be the ZAPI_NHG_NOTIFY_OWNER zapi message for normal
success/failure messaging to the installing protocol.
This work is being done to allow us to work with EVPN MH
which needs the ability to modify NHG's that BGP will own
and operate on.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add new function zclient_get_nhg_start that will allow an
upper level protocol to get a starting point for it's own
nhg space. Give each protocol a space of 50 million.
zebra will own the space from 0 - 199999999 because
of SYSTEM, KERNEL and CONNECT route types.
This is the start of some work that will allow upper
level protocols to install and maintain their own NHG's.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When calling yang_snodes_iterate_subtree we don't care about
the return code. So explicitly say we don't care so that
SA tools can be on the same page as us.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The linux kernel is getting RTM_F_TRAP and RTM_F_OFFLOAD for
kernel routes that have an underlying asic offload. Write the
code to receive these notifications from the linux kernel and
to store that data for display about the routes.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The Solaris code has gone through a deprecation cycle. No-one
has said anything to us and worse of all we don't have any test
systems running Solaris to know if we are making changes that
are breaking on Solaris. Remove it from the system so
we can clean up a bit.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
* use actual error code instead of "false"
* add missing new line
Before:
```
nfware# show interface | include (a]
% Regex compilation error: Success% Bad regexp '(a]'
% Unknown command: show interface | include (a]
```
After:
```
nfware# show interface | include (a]
% Regex compilation error: Unmatched ( or \(
% Bad regexp '(a]'
% Unknown command: show interface | include (a]
```
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Code was added in the past to support a value of VRF_DEFAULT different
from 0. This option was abandoned, the default vrf id is always 0.
Remove this code, this will simplify the code and improve performance
(use a constant value instead of a function that performs tests).
Signed-off-by: Christophe Gouault <christophe.gouault@6wind.com>
vrf_id_to_name() looks up in a RB_TREE to find the VRF entry, then
reads the name.
Avoid it for VRF_DEFAULT, which always exists and for which the
translation is straightforward.
Signed-off-by: Christophe Gouault <christophe.gouault@6wind.com>
The vrf_get function is called throughout the code base
so much so that when you turn on vrf debugging it eclipses
everything else to a degree that is completely unreasonable.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The vrf name was not being printed out in some vrf debugs. Add
this data in so people don't have to remember the vrf id.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When the nexthop-groups were added to FRR for some
reason the call to nexthop_group_disable_vrf was
not added although it was written.
Add it in.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
In the new Sysrepo, all SR_EV_ENABLED notifications are followed by
SR_EV_DONE notifications (assuming no errors occur), so there's no
need to special case the SR_EV_ENABLED event anymore (e.g. do full
transactions in one step).
While here, add a few more guarded debug messages to facilitate
troubleshooting.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Make the sysrepo plugin ignore the deletion of configuration
nodes that don't exist anymore instead of logging an error and
rejecting the changes. This is necessary because Sysrepo delivers
delete notifications for all nodes of a deleted data tree instead
of delivering a single delete notification of the top-level subtree
node (which would suffice for the northbound layer).
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
From Sysrepo's documentation:
"Note: do not use fork() after creating a connection. Sysrepo
internally stores PID of every created connection and this way a
mismatch of PID and connection is created".
Introduce a new "frr_very_late_init" hook in libfrr that is only
called after the daemon is forked (when the '-d' option is used)
and after the configuration is read. This way we can initialize
the sysrepo plugin correctly even when the daemon is daemonized,
and after the Sysrepo CLI commands are processed (only "debug
northbound client sysrepo" for now).
Fixes#7062
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
When installing rules pass by the interface name across
zapi.
This is being changed because we have a situation where
if you quickly create/destroy ephermeal interfaces under
linux the upper level protocol may be trying to add
a rule for a interface that does not quite exist
at the moment. Since ip rules actually want the
interface name ( to handle just this sort of situation )
convert over to passing the interface name and storing
it and using it in zebra.
Ticket: CM-31042
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Change the way the YANG schema node iteration functions work so that
the northbound layer won't have issues with more complex YANG modules
that contain multiple levels of YANG augmentations or modules that
augment themselves indirectly (by augmenting groupings).
Summary of the changes:
* Change the yang_snodes_iterate_subtree() function to always follow
augmentations and add an optional "module" parameter to narrow down
the iteration to nodes of a single module (which is necessary in
some cases). Also, remove the YANG_ITER_ALLOW_AUGMENTATIONS flag
as it's no longer necessary.
* Change yang_snodes_iterate_all() to do a DFS iteration on the resolved
YANG data hierarchy instead of iterating over each module and their
augmentations sequentially.
Reported-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Whenever libyang loads a module that contains a leafref, it will
also implicitly load the module of the referring node if it's
not loaded already. That makes sense as otherwise it wouldn't be
possible to validate the leafref value correctly.
The problem is that loading a module implicitly violates the
assumption of the northbound layer that all loaded modules
are implemented (i.e. they have a northbound node associated
to each schema node). This means that loading a module that
isn't implemented can lead to crashes as the "priv" pointer
of schema nodes is no longer guaranteed to be valid. To fix this
problem, add a few null checks to ignore data nodes associated
to non-implemented modules.
The side effect of this change is harmless. If a daemon receives
configuration it doesn't support (e.g. BFD peers on staticd),
that configuration will be stored but otherwise ignored. This can
only happen when using a northbound client like gRPC, as the CLI
will never send to a daemon a command it doesn't support. This
minor problem should go away in the long run as FRR migrates to
a centralized management model, at which point the YANG-modeled
configuration of all daemons will be maintained in a single place.
Finally, update some daemons to stop implementing YANG modules
they don't need to (i.e. revert 1b741a01c and a74b47f5).
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
the walk routine is used by vxlan service to identify some contexts in
each specific network namespace, when vrf netns backend is used. that
walk mechanism is extended with some additional paramters to the walk
routine.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Changed negating set metric route-map command to be usable in
conjunction with the affirming command.
Signed-off-by: David Schweizer <dschweizer@opensourcerouting.org>
The "set metric" command wasn't processing metric additions and
subtractions (using + and -) correctly. Fix those problems.
Also, remove the "+metric" and "-metric" options since they don't
work and don't make any sense (they could be interpreted as unitary
increments/decrements but that was never supported).
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
In some cases one or both of the zlog targets in use here can be null,
we need to check for that.
Interestingly it appears we don't crash even when this is the case.
Undefined behavior ftw
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
LDP would mark all routes as learned on a non-ldp interface. Then
when LDP was configured the labels were not updated correctly. This
commit fixes issues 6841 and 6842.
Signed-off-by: Lynne Morrison <lynne@voltanet.io>
stream_forward_getp() cannot be used with negative numbers due to the
size_t argument, we'll end up doing overflow arithmetic.
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
Signed values get converted to unsigned for addition, so when the value
to adjust a stats variable for hash tables was negative this resulted in
overflow arithmetic, which we generally don't want.
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
When not using the transactional CLI mode, do not display a
warning when a YANG-modeled commmand doesn't perform any effective
configuration change.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
1. Added new API for add/delete acl with route map notify.
Co-authored-by: harios <hari@niralnetworks.com>
Signed-off-by: Kaushik <kaushik@niralnetworks.com>
If we have an interface configured in a daemon on shutdown
store the old ifindex value for retrieval on when it is
possibly recreated.
This is especially important for nexthop groups as that we
had at one point in time the ability to restore the
configuration but it was lost when we started deleting
all deleted interfaces. We need the nexthop group subsystem
to also mark that it has configured an interface.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The nexthop_group_write_nexthop_simple function outputs the
interface name, because we've stored the ifindex. The problem
is that there are ephermeal interfaces in linux that can be
destroyed/recreated. Allow us to keep that data and do something
a bit smarter to allow show run's and other show commands to continue
to work when the interface is deleted.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Not everything cares about the vrf and backup info. Break
up the API to add a simple version to just write gateway/interface
info.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Wildcards bits have the opposite representation of a network mask,
example:
192.168.0.0/24 has the following netmask 255.255.0.0 and the wildcard
representation is 0.0.255.255.
To avoid future confusion lets put those definitions into a macro so we
know for sure which form to use.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
When removing an IPv4 prefix configuration the wrong amount of bytes
will be read from `struct prefix_ipv4` from `DEFPY`, so lets use the
proper function for this.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
When configuring a access list rule with type `any` it is now ambiguous
between cisco and zebra because both have the same syntax, so lets
remove the cisco command to avoid that.
YANG users will not notice this change.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
`nb_cli_enqueue_change` just points to the string values passed in
parameter, so we must use different strings for different function
calls (at least until `nb_cli_apply_changes`).
While here fix a variable name typo/copy paste error on destination host
case.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
The restriction was already lift at the YANG model level, now lets
unlock the CLI as well.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
When you add a key chain in the RIP configuration file and reload the
configurations via the frr-reload.py script, the script will fail and
the key chain will not appear in the running configuration. The reason
is that frr-reload.py doesn't recognize key as a sub-context.
Before this change, keys were generated this way:
key chain test
key 2
key-string 123
key 3
key-string 456
With this change, keys will be generated this way:
key chain test
key 2
key-string 123
exit
key 3
key-string 456
exit
This will allow frr-reload.py to see the key sub-context and correctly
reload them.
Signed-off-by: Alexander Chernavin <achernavin@netgate.com>
in ipv6 flowspec, a new type is defined to be able to do filtering rules
based on 20 bits flow label field as depicted in [0]. The change include
the decoding by flowspec, and the addition of a new attribute in policy
routing rule, so that the data is ready to be sent to zebra.
The commit also includes a check on fragment option, since dont fragment
bit does not exist in ipv6, the value should always be set to 0,
otherwise the flowspec rule becomes invalid.
[0] https://tools.ietf.org/html/draft-ietf-idr-flow-spec-v6-09
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
to recognize whether a flowspec prefix has been carried out by
ipv4 flowspec or ipv6 flowspec ( actually, the hypothesis is that only
ipv4 flowspec is supported), then a new attribute should contain the
family value: AF_INET or AF_INET6. That value will be further used in
the BGP flowspec code.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
In case of config rollback is enabled,
record northbound transaction based on a control flag.
The actual frr daemons would set the flag to true via
nb_init from frr_init.
This will allow test daemon to bypass recording
transacation to db.
Signed-off-by: Chirag Shah <chirag@nvidia.com>
The sorting for zapi nexthops in zapi routes needs to match
the sorting of nexthops done in zebra. Ensure all zapi_nexthop
attributes are included in the sort.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Sysrepo recently underwent a complete rewrite, where some substantial
architectural changes were made (the most important one being the
extinction of the sysrepod daemon). While most of the existing API
was preserved, quite a few backward-incompatible changes [1] were
introduced (mostly simplifications). This commit adapts our sysrepo
northbound plugin to those API changes in order for it to be compatible
with the latest Sysrepo version.
Additional notes:
* The old Sysrepo version is EOL and not supported anymore.
* The new Sysrepo version requires libyang 1.x.
Closes#6936
[1] https://github.com/sysrepo/sysrepo/blob/devel/CHANGES
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
During the prep phase to apply a northbound commit, if no changes were
detected make sure we fill the error message buffer to explain this.
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
Fix a crash where if we issue a show run after a vrf has been
deleted we would crash here due to not null checking.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
when receiving a netlink API for an interface in a namespace, this
interface may come with LINK_NSID value, which means that the interface
has its link in an other namespace. Unfortunately, the link_nsid value
is self to that namespace, and there is a need to know what is its
associated nsid value from the default namespace point of view.
The information collected previously on each namespace, can then be
compared with that value to check if the link belongs to the default
namespace or not.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
to be able to retrieve the network namespace identifier for each
namespace, the ns id is stored in each ns context. For default
namespace, the netns id is the same as that value.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
as remind, the netns identifiers are local to a namespace. that is to
say that for instance, a vrf <vrfx> will have a netns id value in one
netns, and have an other netns id value in one other netns.
There is a need for zebra daemon to collect some cross information, like
the LINK_NETNSID information from interfaces having link layer in an
other network namespace. For that, it is needed to have a global
overview instead of a relative overview per namespace.
The first brick of this change is an API that sticks to netlink API,
that uses NETNSA_TARGET_NSID. from a given vrf vrfX, and a new vrf
created vrfY, the API returns the value of nsID from vrfX, inside the
new vrf vrfY.
The brick also gets the ns id value of default namespace in each other
namespace. An additional value in ns.h is offered, that permits to
retrieve the default namespace context.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
With vrf-lite mechanisms, it is possible to create layer 3 vnis by
creating a bridge interface in default vr, by creating a vxlan interface
that is attached to that bridge interface, then by moving the vxlan
interface to the wished vrf.
With vrf-netns mechanism, it is slightly different since bridged
interfaces can not be separated in different network namespaces. To make
it work, the setup consists in :
- creating a vxlan interface on default vrf.
- move the vxlan interface to the wished vrf ( with an other netns)
- create a bridge interface in the wished vrf
- attach the vxlan interface to that bridged interface
from that point, if BGP is enabled to advertise vnis in default vrf,
then vxlan interfaces are discovered appropriately in other vrfs,
provided that the link interface still resides in the vrf where l2vpn is
advertised.
to import ipv4 entries from a separate vrf, into the l2vpn, the
configuration of vni in the dedicated vrf + the advertisement of ipv4
entries in bgp vrf will import the entries in the bgp l2vpn.
the modification consists in parsing the vxlan interfaces in all network
namespaces, where the link resides in the same network namespace as the
bgp core instance where bgp l2vpn is enabled.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
While a configuration transaction can't be rejected once it reaches
the APPLY phase, we should allow NB callbacks to generate error
or warning messages when a configuration change is being applied.
That should be useful, for example, to return warnings back to
the user informing that the applied configuration has some kind of
inconsistency or is missing something in order to be effectively
activated. The infrastructure for this was already present, but the
northbound layer was ignoring all errors/warnings generated during
the apply/abort phases instead of returning them to the user. This
commit changes that.
In the gRPC plugin, extend the Commit() RPC adding a new
"error_message" field to the response type. This is necessary to
allow errors/warnings to be returned even when the commit operation
succeeds (since grpc::Status::OK doesn't support error messages
like the other status codes).
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Use `args->errmsg` instead of just `zlog_info` for registering the error
so the users don't need to check their log files.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
We can make the Linux kernel send an ARP/NDP request by adding
a neighbour with the 'NUD_INCOMPLETE' state and the 'NTF_USE' flag.
This commit adds new dataplane operation as well as new zapi message
to allow other daemons send ARP/NDP requests.
Signed-off-by: Jakub Urbańczyk <xthaid@gmail.com>
There are situations where POLLERR will be returned. But
since we were not handling it. Thread processing effectively
is turned into an infinite loop, which is bad.
Modify the code so that if we receive a POLLERR we turn it
into a read event to be handled as an error from the handler
function.
This was discovered in pim:
Thread statistics for pimd:
Showing poll FD's for main
--------------------------
Count: 14/1024
0 fd: 9 events: 1 revents: 0 mroute_read
1 fd: 12 events: 1 revents: 0 vty_accept
2 fd: 13 events: 1 revents: 0 vtysh_accept
3 fd: 11 events: 1 revents: 0 zclient_read
4 fd: 15 events: 1 revents: 0 mroute_read
5 fd: 16 events: 1 revents: 0 mroute_read
6 fd: 17 events: 1 revents: 0 pim_sock_read
7 fd: 19 events: 1 revents: 0 pim_sock_read
8 fd: 21 events: 1 revents: 0 pim_igmp_read
9 fd: 22 events: 1 revents: 0 pim_sock_read
10 fd: 23 events: 1 revents: 0 pim_sock_read
11 fd: 20 events: 1 revents: 0 vtysh_read
12 fd: 18 events: 1 revents: 0 pim_sock_read
13 fd: 24 events: 0 revents: 0
strace was showing this line over and over and over:
poll([{fd=9, events=POLLIN}, {fd=12, events=POLLIN}, {fd=13, events=POLLIN}, {fd=11, events=POLLIN}, {fd=15, events=POLLIN}, {fd=16, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19, events=POLLIN}, {fd=21, events=POLLIN}, {fd=22, events=POLLIN}, {fd=23, events=POLLIN}, {fd=20, events=POLLIN}, {fd=18, events=POLLIN}, {fd=6, events=POLLIN}], 14, 20) = 1 ([{fd=21, revents=POLLERR}])
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Somewhere along the way the indentation for comments got
all messed up. Let's make it follow our standards and
also look right too.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
For the sake of Segment Routing (SR) and Traffic Engineering (TE)
Policies there's a need for additional infrastructure within zebra.
The infrastructure in this PR is supposed to manage such policies
in terms of installing binding SIDs and LSPs. Also it is capable of
managing MPLS labels using the label manager, keeping track of
nexthops (for resolving labels) and notifying interested parties about
changes of a policy/LSP state. Further it enables a route map mechanism
for BGP and SR-TE colors such that learned BGP routes can be mapped
onto SR-TE Policies.
This PR does not introduce any usable features by now, it is just
infrastructure for other upcoming PRs which will introduce 'pathd',
a new SR-TE daemon.
Co-authored-by: Renato Westphal <renato@opensourcerouting.org>
Co-authored-by: GalaxyGorilla <sascha@netdef.org>
Signed-off-by: Sebastien Merle <sebastien@netdef.org>
1. BGP informs zebra if a MAC-IP is a SYNC path and if it active on the
ES peer.
2. Zebra sends paths that are "local-inactive" with the proxy flag to
BGP.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
The `struct evpn_ead_addr` structure had a prefix length
associated with it. This value was only ever set never
used. Remove this from our system. The other
nice thing about this change is that it puts back
the sizeof struct route_node to 192 bytes.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
1. Local ethernet segments are configured in zebra by attaching a
local-es-id and sys-mac to a access interface -
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
!
interface hostbond1
evpn mh es-id 1
evpn mh es-sys-mac 00:00:00:00:01:11
!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
This info is then sent to BGP and used for the generation of EAD-per-ES
routes.
2. Access VLANs associated with an (ES) access port are translated into
ES-EVI objects and sent to BGP. This is used by BGP for the
generation of EAD-EVI routes.
3. Remote ESs are imported by BGP and sent to zebra. A list of VTEPs
is maintained per-remote ES in zebra. This list is used for the creation
of the L2-NHG that is used for forwarding traffic.
4. MAC entries with a non-zero ESI destination use the L2-NHG associated
with the ESI for forwarding traffic over the VxLAN overlay.
Please see zebra_evpn_mh.h for the datastruct organization details.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
This is the base patch that brings in support for Type-1 routes.
It includes support for -
- Ethernet Segment (ES) management
- EAD route handling
- MAC-IP (Type-2) routes with a non-zero ESI i.e. Aliasing for
active-active multihoming
- Initial infra for consistency checking. Consistency checking
is a fundamental feature for active-active solutions like MLAG.
We will try to levarage the info in the EAD-ES/EAD-EVI routes to
detect inconsitencies in access config across VTEPs attached to
the same Ethernet Segment.
Functionality Overview -
========================
1. Ethernet segments are created in zebra and associated with
access VLANs. zebra sends that info as ES and ES-EVI objects to BGP.
2. BGP advertises EAD-ES and EAD-EVI routes for the locally attached
ethernet segments.
3. Similarly BGP processes EAD-ES and EAD-EVI routes from peers
and translates them into ES-VTEP objects which are then sent to zebra
as remote ESs.
4. Each ES in zebra is associated with a list of active VTEPs which
is then translated into a L2-NHG (nexthop group). This is the ES
"Alias" entry
5. MAC-IP routes with a non-zero ESI use the alias entry created in
(4.) to forward traffic i.e. a MAC-ECMP is done to these remote-ES
destinations.
EAD route management (route table and key) -
============================================
1. Local EAD-ES routes
a. route-table: per-ES route-table
key: {RD=ES-RD, ESI, ET=0xffffffff, VTEP-IP)
b. route-table: per-VNI route-table
Not added
c. route-table: global route-table
key: {RD=ES-RD, ESI, ET=0xffffffff)
2. Remote EAD-ES routes
a. route-table: per-ES route-table
Not added
b. route-table: per-VNI route-table
key: {RD=ES-RD, ESI, ET=0xffffffff, VTEP-IP)
c. route-table: global route-table
key: {RD=ES-RD, ESI, ET=0xffffffff)
3. Local EAD-EVI routes
a. route-table: per-ES route-table
Not added
b. route-table: per-VNI route-table
key: {RD=0, ESI, ET=0, VTEP-IP)
c. route-table: global route-table
key: {RD=L2-VNI-RD, ESI, ET=0)
4. Remote EAD-EVI routes
a. route-table: per-ES route-table
Not added
b. route-table: per-VNI route-table
key: {RD=0, ESI, ET=0, VTEP-IP)
c. route-table: global route-table
key: {RD=L2-VNI-RD, ESI, ET=0)
Please refer to bgp_evpn_mh.h for info on how the data-structures are
organized.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
This api was earlier present in the daemon code but as multiple daemons
need it moving it to lib will avoid unnecessary copy-paste.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
In the global evpn routing table RD is part of the key. However in the
per-VNI routing table the key doesn't include the RD and we need more
than the ESI to distinguish between EAD routes from different VTEPs
attached to the same Ethernet Segment.
This commit also includes other definitions needed for managing an
ESI.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
In most cases this memory is pre-allocated along with the base element.
Similarly it is stored in the base element to allow efficient del
without lookup (main reason for using DLL vs. SLL).
So (in most cases) there should be no need to manage the element/data
and listnode memories separately.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
New macros have been added for the following -
1. to efficiently iterate and execute functions on already set bits
2. to check if a bit is in use
3. to check if a bitfield has been initialized (this is to safetly
handle cases where the bitfield is freed and re-allocated).
4. to check if two bitfields have the same bits set
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Revert "zebra: support for macvlan interfaces"
This reverts commit bf69e212fd.
Revert "doc: add some documentation about bgp evpn netns support"
This reverts commit 89b97c33d7.
Revert "zebra: dynamically detect vxlan link interfaces in other netns"
This reverts commit de0ebb2540.
Revert "bgpd: sanity check when updating nexthop from bgp to zebra"
This reverts commit ee9633ed87.
Revert "lib, zebra: reuse and adapt ns_list walk functionality"
This reverts commit c4d466c830.
Revert "zebra: local mac entries populated in correct netnamespace"
This reverts commit 4042454891.
Revert "zebra: when parsing local entry against dad, retrieve config"
This reverts commit 3acc394bc5.
Revert "bgpd: evpn nexthop can be changed by default"
This reverts commit a2342a2412.
Revert "zebra: zvni_map_to_vlan() adaptation for all namespaces"
This reverts commit db81d18647.
Revert "zebra: add ns_id attribute to mac structure"
This reverts commit 388d5b438e.
Revert "zebra: bridge layer2 information records ns_id where bridge is"
This reverts commit b5b453a2d6.
Revert "zebra, lib: new API to get absolute netns val from relative netns val"
This reverts commit b6ebab34f6.
Revert "zebra, lib: store relative default ns id in each namespace"
This reverts commit 9d3555e06c.
Revert "zebra, lib: add an internal API to get relative default nsid in other ns"
This reverts commit 97c9e7533b.
Revert "zebra: map vxlan interface to bridge interface with correct ns id"
This reverts commit 7c990878f2.
Revert "zebra: fdb and neighbor table are read for all zns"
This reverts commit f8ed2c5420.
Revert "zebra: zvni_map_to_svi() adaptation for other network namespaces"
This reverts commit 2a9dccb647.
Revert "zebra: display interface slave type"
This reverts commit fc3141393a.
Revert "zebra: zvni_from_svi() adaptation for other network namespaces"
This reverts commit 6fe516bd4b.
Revert "zebra: importation of bgp evpn rt5 from vni with other netns"
This reverts commit 28254125d0.
Revert "lib, zebra: update interface name at netlink creation"
This reverts commit 1f7a68a2ff.
Signed-off-by: Pat Ruddy <pat@voltanet.io>
When you make a change to a route-map or a prefix-list it depends on, note
that the route-map needs to be reprocessed for the change.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Added a macro to validate the v4 mapped v6 address.
Modified bgp receive & send updates for v4 mapped v6 address as
nexthop and installing it as recursive nexthop in RIB.
Minor change in fpm while sending the routes for nexthop as
v4 mapped v6 address.
Signed-off-by: Kaushik <kaushik@niralnetworks.com>
When using the default CLI mode, the northbound layer needs to create
a separate transaction to process each YANG-modeled command since
they are supposed to be applied immediately (there's no candidate
configuration nor the "commit" command like in the transactional
CLI). The problem is that configuration transactions have an overhead
associated to them, in big part because of the use of some heavy
libyang functions like `lyd_validate()` and `lyd_diff()`. As of
now this overhead is substantial and doesn't scale well when large
numbers of transactions need to be performed in sequence.
As an example, loading 50k prefix-lists using a single transaction
takes about 2 seconds on a modern CPU. Loading the same 50k
prefix-lists using 50k transactions can take more than an hour
to complete (which is unacceptable by any standard). To fix this
problem, some heavy optimization work needs to be done on libyang and
on the FRR northbound itself too (e.g. perform partial configuration
diffs whenever possible). This, however, should be a long term
effort since these optimizations shouldn't be trivial to implement
and we're far from having the performance numbers we need.
In the meanwhile, this commit introduces a simple but efficient
workaround to alleviate the issue. In short, a new back-off timer
was introduced in the CLI to monitor and detect when too many
YANG-modeled commands are being received at the same time. When
a certain threshold is reached (100 YANG-modeled commands within
one second), the northbound starts to group all subsequent commands
into a single large transaction, which allows them to be processed
much faster (e.g. seconds and not hours). It's essentially a
protection mechanism that creates dynamically-sized transactions
when necessary to prevent performance issues from happening. This
mechanism is enabled both when parsing configuration files and when
reading commands from a terminal.
The downside of this optimization is that, if several YANG-modeled
commands are grouped into the same transaction and at least one of
them fails, the whole transaction is rejected. This is undesirable
since users don't expect transactional behavior when that's not
enabled explicitly. To minimize this issue, the CLI will log all
commands that were rejected whenever that happens, to make the
user aware of what happened and have enough information to fix
the problem. Commands that fail due to parsing errors or CLI-level
validations in general are rejected separately.
Again, this proposed workaround is intended to be temporary. The
goal is to provided a quick fix to issues like #6658 while we work
on better long-term solutions.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
DEFPY_YANG will allow the CLI to identify which commands are
YANG-modeled or not before executing them. This is going to be
useful for the upcoming configuration back-off timer work that
needs to commit pending configuration changes before executing a
command that isn't YANG-modeled.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
BGP Yang is using sub modules and at present FRR is not processing
submodules in embedded framework yang
Signed-off-by: VishalDhingra <vdhingra@vmware.com>
Move pim and igmp yang files registery to appropriate makefiles.
In yang directory makefile move under `PIMD`
Remove pimd yang files from library makefile instead move them
to pimd makefile.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
This adds -N and --netns options to watchfrr, allowing it to start
daemons with -N and switching network namespaces respectively.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Currently, all DEFPY commands are translated into one-liners in
vtysh_cmd.c. After the patch, DEFPY commands are correctly indented just
like DEFUN/ALIAS commands.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
... this didn't work on NetBSD. Like, at all. It returns a positive
error code from posix_fallocate() and then we bang our head against a
brick wall trying to write to the mmap'd buffer.
Signed-off-by: David Lamparter <equinox@diac24.net>
Merge the cisco style access list with zebra's logic so we can mix both
types of rules while keeping the commands.
With this the cisco style limitation of having 'destination-*' only for
specific number ranges no longer exist for users of YANG/northbound (the
CLI still has this limitation).
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Initial changes to support a nexthop with multiple backups. Lib
changes to hold a small array in each primary, zapi message
changes to support sending multiple backups, and daemon
changes to show commands to support multiple backups. The config
input for multiple backup indices is not present here.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
* add a vrf sub-command `[no] ipv6 router-id X:X::X:X`.
* add command `[no] ipv6 router-id X:X::X:X [vrf NAME]` for backward
compatibility.
* add a vrf sub-command `[no] ip router-id A.B.C.D` and make the old
one without `ip` an alias for it.
* add a command `[no] ip router-id A.B.C.D [vrf NAME]` for backward
comptibility and make the old one without `ip` an alias for it.
* add command `show ip router-id [vrf NAME]` and make
the old one without `ip` an alias for it.
* add command `show ipv6 router-id [vrf NAME]`.
* add ZAPI commands `ZEBRA_ROUTER_ID_V6_ADD`,
`ZEBRA_ROUTER_ID_V6_DELETE` and `ZEBRA_ROUTER_ID_V6_UPDATE`
for deamons to get notified of the IPv6 router-id.
* update zebra documentation.
Signed-off-by: Sebastien Merle <sebastien@netdef.org>
Include any installed backups when updating the local kernel
after processing an async notification. This includes routes'
nexthops and LSPs' nhlfes.
Add the 'b' character to the route show display and header to
indicate backup nexthops.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
1. Modifies the data structs to make the distance, tag and table-id
property of a route, i.e created a hireachical data struct to save
route and nexthop information.
2. Backend northbound implementation
Signed-off-by: VishalDhingra <vdhingra@vmware.com>
Extend PBR maps to discriminate by Differentiated Services Code Point and / or
Explicit Congestion Notification fields. These fields are used in the IP header
for classifying network traffic.
0 1 2 3 4 5 6 7
+-----+-----+-----+-----+-----+-----+-----+-----+
| DS FIELD, DSCP | ECN FIELD |
+-----+-----+-----+-----+-----+-----+-----+-----+
DSCP: differentiated services codepoint
ECN: Explicit Congestion Notification
Signed-off-by: Wesley Coakley <wcoakley@nvidia.com>
Signed-off-by: Saurav Kumar Paul <saurav@cumulusnetworks.com>
While iteratively looking for a best match route-map index amongst
a list of potential best match route-map indices, if a candidate
best match index is already found, disregard the value returned by
the function route_map_apply_match() if it returns either RMAP_NOOP
or RMAP_NOMATCH in the following iterations.
This is because if a best match route-map index is found then, the
return value must always be set to RMAP_MATCH.
Signed-off-by: NaveenThanikachalam <nthanikachal@vmware.com>
Remove mid-string line breaks, cf. workflow doc:
.. [#tool_style_conflicts] For example, lines over 80 characters are allowed
for text strings to make it possible to search the code for them: please
see `Linux kernel style (breaking long lines and strings)
<https://www.kernel.org/doc/html/v4.10/process/coding-style.html#breaking-long-lines-and-strings>`_
and `Issue #1794 <https://github.com/FRRouting/frr/issues/1794>`_.
Scripted commit, idempotent to running:
```
python3 tools/stringmangle.py --unwrap `git ls-files | egrep '\.[ch]$'`
```
Signed-off-by: David Lamparter <equinox@diac24.net>
BFD profiles can now be used on the interface level like this:
interface eth1
ip router isis 1
isis bfd
isis bfd profile default
Here the 'default' profile needs to be specified as usual in the
bfdd configuration.
Signed-off-by: GalaxyGorilla <sascha@netdef.org>
It is possible that the same VRF exists in one daemon and doesn't exist
in another. In this case, "no vrf NAME" command execution will stop on
the first daemon without the VRF and it won't be possible to delete the
VRF from other daemons.
Such behavior can be reproduced with the following steps:
```
# ip link add test type vrf table 1
# vtysh -c "conf t" -c "vrf test" -c "ip route 1.1.1.1/32 blackhole"
# vtysh -c "show run"
...
vrf test
ip route 1.1.1.1/32 blackhole
exit-vrf
!
...
# ip link del test
# vtysh -c "conf t" -c "no vrf test"
% VRF test does not exist
# vtysh -c "show run"
...
vrf test
ip route 1.1.1.1/32 blackhole
exit-vrf
!
...
```
This commit fixes the issue by returning success from "no vrf" command
when VRF doesn't exist.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Block signals in child/additional pthreads; frr daemons generally
expect that only the main thread will handle signals.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Route map entries are not getting a chance to call `description` string
deallocation on shutdown or when the parent entry is destroyed, so lets
add a code to handle this in the `route_map_index_delete` function.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
rpki vrf subnode is instantiated under the vrf subnode.
It it to be noted that this commit contains a change in vtysh.
Actually, the output of bgp daemon from show running-config is extracted
in vtysh, and reengineered ( hence the vtysh_config.c change done). This
permits having a subnode under vrf sub node.
Also, add vrf node support to bgpd, as rpki command can not be found
under vrf node.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
a missing '!' operator was making any STREAM_GETF fail
when in fact it should have succeeded. As a consequence
of this, for example, many link-params of an interface
were not being read and populated.
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
A new config option `--disable-version-build-config`
allows you to show short version string by dropping
"configured with:" and all of its build configs
Signed-off-by: Jafar Al-Gharaibeh <jafar@atcorp.com>
Unfortunately, the way the frr-format plugin is set up, snprintf() with
PRId64 can generate false warnings :|. Easy workaround is to use
snprintfrr().
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Add the proper handling for cases where user forgets or doesn't have the
pointer needed to call the library function.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
A couple of daemons take/use no capabilities/privs; allow cleanup
of the privs/capabilities library module even if a daemon has no
caps.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Fix a number of library and daemon issues so that daemons can
call frr_fini() during normal termination. Without this,
temporary logging files are left behind in /var/tmp/frr/.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
... it contains our pid, so doing it before fork leads to littering
buffers since we try to clean up with the forked pid...
Fixes: #6541
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Start modifying the OPAQUE zapi message to include optional
unicast destination zapi client info. Add a 'decode' api and
opaque msg struct to encapsulate that optional info.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Change name of an opaque zapi api to 'decode' to align with the
other zapi message parsing apis. Missed that in the original
opaque commits.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
We can avoid a big amount of `snprintf` by using relative XPath in
`nb_cli_apply_changes`.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
`acl_get_seq` should be able to get the sequence number from candidate
configuration without needing to commit anything midway.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Lets just use them directly to avoid extra code and to be extra clear
that we are using those callbacks.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Some tests expect that a prefix list structure is gone after all its
entries are removed, so lets keep that behaviour.
NOTE: users using YANG/northbound directly without CLI won't be
affected.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>