Fix the handling of multiple BFD profiles by adding the appropriated
code to push/pop contexts inside BFD configuration node.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
If frr.conf contains a prefix-list or access-list without a seq number,
frr-reload needs to be aware that it should not delete/add if the running
config contains a seq number.
Ticket: CM-32623
Signed-off-by: Don Slice <dslice@nvidia.com>
Since new workflow instructions state to run black against
python change and it found formatting changes required that
were not part of my change set, committing those changes
separately.
Signed-off-by: Don Slice <dslice@nvidia.com>
make sure that the order in which the pcep-related commands are
removed by frr-reload.py is the correct one, i.e., pce followed
by pce-config followed by pcc.
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
on one hand, the default value for a peer preference was always being
displayed, and on the other there was some code in frr-reload.py which
was attempting to add a default value to match this behavior, and which
was incorrectly overriding a specified preference. Fix this by removing
this code and making pathd behave like other daemons in this respect,
i.e. not displaying the default value.
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
a) Add some useful commands
b) Remove `show error all` this just dumps the error codes. If
we know the version we don't need this. Additionally this is
rather large.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add some missing commands ( I am sure that there are more useful ones to )
Cleanup to use the modern non-deprecated syntax in case anyone runs across
this.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This new dynamic module makes pathd behave as a PCC for dynamic candidate path
using the external library pcpelib https://github.com/volta-networks/pceplib .
The candidate paths defined as dynamic will trigger computation requests to the
configured PCE, and the PCE response will be used to update the policy.
It supports multiple PCE. The one with smaller precedence will be elected
as the master PCE, and only if the connection repeatedly fails, the PCC will
switch to another PCE.
Example of configuration:
segment-routing
traffic-eng
pcep
pce-config CONF
source-address ip 10.10.10.10
sr-draft07
!
pce PCE1
config CONF
address ip 1.1.1.1
!
pce PCE2
config CONF
address ip 2.2.2.2
!
pcc
peer PCE1 precedence 10
peer PCE2 precedence 20
!
!
!
!
Co-authored-by: Brady Johnson <brady@voltanet.io>
Co-authored-by: Emanuele Di Pascale <emanuele@voltanet.io>
Co-authored-by: GalaxyGorilla <sascha@netdef.org>
Co-authored-by: Javier Garcia <javier.garcia@voltanet.io>
Co-authored-by: Renato Westphal <renato@opensourcerouting.org>
Co-authored-by: Sebastien Merle <sebastien@netdef.org>
Signed-off-by: Sebastien Merle <sebastien@netdef.org>
This new daemon manages Segment-Routing Traffic-Engineering
(SR-TE) Policies and installs them into zebra. It provides
the usual yang support and vtysh commands to define or change
SR-TE Policies.
In a nutshell SR-TE Policies provide the possibility to steer
traffic through a (possibly dynamic) list of Segment Routing
segments to the endpoint of the policy. This list of segments
is part of a Candidate Path which again belongs to the SR-TE
Policy. SR-TE Policies are uniquely identified by their color
and endpoint. The color can be used to e.g. match BGP
communities on incoming traffic.
There can be multiple Candidate Paths for a single
policy, the active Candidate Path is chosen according to
certain conditions of which the most important is its
preference. Candidate Paths can be explicit (fixed list of
segments) or dynamic (list of segment comes from e.g. PCEP, see
below).
Configuration example:
segment-routing
traffic-eng
segment-list SL
index 10 mpls label 1111
index 20 mpls label 2222
!
policy color 4 endpoint 10.10.10.4
name POL4
binding-sid 104
candidate-path preference 100 name exp explicit segment-list SL
candidate-path preference 200 name dyn dynamic
!
!
!
There is an important connection between dynamic Candidate
Paths and the overall topic of Path Computation. Later on for
pathd a dynamic module will be introduced that is capable
of communicating via the PCEP protocol with a PCE (Path
Computation Element) which again is capable of calculating
paths according to its local TED (Traffic Engineering Database).
This dynamic module will be able to inject the mentioned
dynamic Candidate Paths into pathd based on calculated paths
from a PCE.
https://tools.ietf.org/html/draft-ietf-spring-segment-routing-policy-06
Co-authored-by: Sebastien Merle <sebastien@netdef.org>
Co-authored-by: Renato Westphal <renato@opensourcerouting.org>
Co-authored-by: GalaxyGorilla <sascha@netdef.org>
Co-authored-by: Emanuele Di Pascale <emanuele@voltanet.io>
Signed-off-by: Sebastien Merle <sebastien@netdef.org>
In the case of some linux distros the /var/run dir is mounted
with tmpfs so in every reboot it's removed.
Then the frrcommon.sh will recreate it without 'x' perm
So no pid file cannot be created in /var/run/frr
Signed-off-by: Javier Garcia <rampxxxx@gmail.com>
MAC address can be configured as lower/upper hex characters but is
always rendered as lower case in "show run". Avoid incorrect "change
detection" by ignoring case.
Ticket: CM-32235
Signed-off-by: Anuradha Karuppiah <anuradhak@nvidia.com>
The condition to normalize ipv6 addresses was accidentally broken via -
[
e238920df0 tools: Fix reload with 'ipv6 address...' in interface
]
The condition was supposed to be skipped only if "ipv6 add" was present
in the line.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
when deleting a whole l2vpn context in ldpd which also had pseudowires
in it, we were first deleting the l2vpn with a 'no l2vpn XXX' command,
and then adding it again by running 'l2vpn XXX\n no member pseudowire YYY'
which obviously was not needed. As a result the l2vpn would be reinstated.
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
gcc-10 has a more strict internal assert for type checks so the plugin
currently causes an Internal Compiler Error. Fix.
Signed-off-by: David Lamparter <equinox@diac24.net>
when type is forking, it is recommended to also use the PIDFile= option,
so that systemd can reliably identify the main process of the service.
Signed-off-by: Emanuele Bovisio <emanuele.bovisio@eolo.it>
Combine yang_snodes_iterate_module() and yang_snodes_iterate_all()
into an unified yang_snodes_iterate() function, where the first
"module" parameter is optional. There's no point in having two
separate YANG schema iteration functions anymore now that they are
too similar.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Add Quentin's cocci patch to align code with the changes
to the event cancel api. Also added a README to explain what
this collection of cocci patches is for.
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Based on the current code, I think the intent was to gracefully handle
vtysh failures and print a useful error message. Barriers in the way of
that:
- Despite reading the results of subprocess.communicate(), there won't
be anything there, because we aren't passing subprocess.PIPE as stdin
and stderr when calling subprocess.Popen()
- Despite catching subprocess.TimeoutExpired, if we were to actually hit
this case frr-reload.py would just crash because it's calling
.communicate() on an unbound process variable, probably a copy-paste
error
- Aside from that, building a kwargs dict to pass to a function that
contains something if something else is not None and nothing if it is,
is pointless when we could just pass the thing itself
Net result is that if vtysh fails to read an frr.conf due to syntax
errors, instead of crashing with a traceback, we actually handle the
error condition, log the problem and vtysh's output, and exit. Actually
we were printing the failed line just by chance because stderr wasn't
captured from the subprocess and I guess showed up as part of systemd's
error capturing or something, but the traceback did a good job of
obscuring that with useless noise.
Old:
frrinit.sh[32183]: * Started watchfrr
frrinit.sh[32183]: line 20: % Unknown command: eee
frrinit.sh[32183]: Traceback (most recent call last):
frrinit.sh[32183]: File "/usr/lib/frr/frr-reload.py", line 1316, in <module>
frrinit.sh[32183]: newconf.load_from_file(args.filename)
frrinit.sh[32183]: File "/usr/lib/frr/frr-reload.py", line 231, in load_from_file
frrinit.sh[32183]: file_output = self.vtysh.mark_file(filename)
frrinit.sh[32183]: File "/usr/lib/frr/frr-reload.py", line 146, in mark_file
frrinit.sh[32183]: % (child.returncode, stderr))
frrinit.sh[32183]: __main__.VtyshException: vtysh (mark file) exited with status 2:
frrinit.sh[32183]: None
New:
frrinit.sh[30090]: * Started watchfrr
frrinit.sh[30090]: vtysh failed to process new configuration: vtysh (mark file) exited with status 2:
frrinit.sh[30090]: line 20: % Unknown command: eee
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
When you add a key chain in the RIP configuration file and reload the
configurations via the frr-reload.py script, the script will fail and
the key chain will not appear in the running configuration. The reason
is that frr-reload.py doesn't recognize key as a sub-context.
Before this change, keys were generated this way:
key chain test
key 2
key-string 123
key 3
key-string 456
With this change, keys will be generated this way:
key chain test
key 2
key-string 123
exit
key 3
key-string 456
exit
This will allow frr-reload.py to see the key sub-context and correctly
reload them.
Signed-off-by: Alexander Chernavin <achernavin@netgate.com>
The support bundle feature(tm) asks for some data
from zebra in the form of a command that has
never existed in FRR. Looks like some
cruft snuck in remove.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Drop the `-n` (`--noerror`) flag from the `vtysh -b` invocation called by the
init script responsible for starting FRR. This ensures that errors in the
configuration file is propagated to the administrator, and prevents a node from
entering a production network while running an essentially undefined
configuration (a behaviour that I can personally attest to has the potential to
cause disastrous network outages - documented in more detail in Cumulus
Networks CS#12791).
Silently ignoring errors also leads to the rather odd behaviour that starting
FRR will ostensibly succeed, while reloading it immediately after - without
changing the configuration - will fail. This is due to the fact that the `-n`
flag is not used while reloading.
The use of the `-n` flag appears to have been introduced without any
explanation in commit 858aa29c68 by @donaldsharp.
Looking at the commit message, I suspect that it was not an intentional change.
It seems more likely to me that it was just meant to be used during testing and
development, but ended up being committed to master by accident.
Ticket:CM-28003
Signed-off-by: Tore Anderson <tore@fud.no>
This adds -N and --netns options to watchfrr, allowing it to start
daemons with -N and switching network namespaces respectively.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Reported that in certain config changes, a static intended for the
default table would be duplicated into a vrf context. Determined
that we still weren't keeping or adding the exit-vrf command when
necessary to keep the contexts straight. Added logic to look for
the failing circumstances and add or remove the delete of the
exit-vrf command as needed.
Signed-off-by: Don Slice <dslice@nvidia.com>
In several instances a call to log.error() is preceded by a print()
for the same message. To prevent duplicate messages these print()
calls are removed.
To maintain (very) similar behaviour we add a StreamHandler to the
logger, when doing logging to a file (ie. --reload without --stdout),
which additionally sends error and above logs to STDOUT without any
metadata (exactly as they did before, with print()).
There is one subtle change - the log from Vtysh.is_config_available()
is now preceded with the "vtysh 'configure' returned" text, whereas
previously only the output from vtysh was sent to STDOUT.
Furthermore any error logs which weren't previously explicitly logged
to STDOUT will now be.
Signed-off-by: Duncan Eastoe <duncan.eastoe@att.com>
Add a "--log-level" option to frr-reload to set the maximum message
level to be logged. When the option is not used, the level is set to
info as before.
The existing --debug option is synonymous with --log-level=debug and
these options are therefore mutually exclusive.
Signed-off-by: Duncan Eastoe <duncan.eastoe@att.com>
Remove mid-string line breaks, cf. workflow doc:
.. [#tool_style_conflicts] For example, lines over 80 characters are allowed
for text strings to make it possible to search the code for them: please
see `Linux kernel style (breaking long lines and strings)
<https://www.kernel.org/doc/html/v4.10/process/coding-style.html#breaking-long-lines-and-strings>`_
and `Issue #1794 <https://github.com/FRRouting/frr/issues/1794>`_.
Scripted commit, idempotent to running:
```
python3 tools/stringmangle.py --unwrap `git ls-files | egrep '\.[ch]$'`
```
Signed-off-by: David Lamparter <equinox@diac24.net>
When you have this configuration:
int foo
ipv6 address fd01:0:0:1::1/64
And issue a reload statement, FRR-reload
is reducing the code to a
`no ipv6 address fd01:0:0:1::/64`
and then issuing a:
`ipv6 address fd01:0:0:1::/64`
The end result is of course that the foo
interface now has two v6 addresses on it.
The brilliance of this is of course if you
happen to have two systems that are connected
over an interface, and you issue a reload command.
They both get fd01:0:0:1::/64 as an ipv6 address
and DAD detection kicks in and stomps on your stuff.
Put a special hey don't munch the v6 address line
in a reload situation.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
frr-reload.py needs to know about config-level commands, otherwise it
assumes they are contexts
Ticket: CM-30128
Ticket: CM-30077
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
the refactored frr-reload.py is adding 'no-header' to the
'show running' command of vtysh, but if a daemon is specified
the no-header option should only be added after the daemon name.
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
After the cleanup, adding this doesn't require updating a zillion
locations in the code anymore, just one :)
Partially derived from 6a00e91d99f7f98d857c2056d0dcfeba48966581
Originally-by: Emanuele Di Pascale <emanuele@voltanet.io>
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
- throw vtysh into a wrapper class
- ignore "username" commands
- use mark output on stdout
- some other random cleanups
Signed-off-by: David Lamparter <equinox@diac24.net>
Original start/stop of FRR prior to David's rewrite in
PR 3507, when configuring multi-instance would
only start multi-instance (-1 -2 -3 -4...) or
just the daemon, not both. If you happened
to start a ospfd instance of 1 then both
the default and instance 1 would react to cli.
Do not allow this, put it back to original behavior
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This reverts commit 3fa139a65b.
This is being reverted because this commit completely
breaks the invocation of frr-reload.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This only applies for split-config; the init script would create an
empty config file with default permissions.
Reported-by: Robert Scheck <robert@fedoraproject.org>
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Problem reported that with certain configs, when the user
deleted a "neighbor x.x.x.x bfd 4 100 100" statement from
frr.conf and then reloaded, a traceback was seen and the
deletion did not succeed. Found that in some scenarios
it was possible to have something in lines_to_add that
was in a different context and when the re.search was
attempted, it found an empy line and was unhappy. This
fix avoids trying to search in the wrong context.
Ticket: CM-29145
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
This dumps call graph data from LLVM bitcode files into a JSON file.
Specifically for FRR, it understands thread_add_*(), hook_*() and
install_element() so it can provide extra information in these cases.
As a general feature, it tries to track down function pointers as far as
easily feasible.
Signed-off-by: David Lamparter <equinox@diac24.net>
when removing a whole address-family block from ldpd config
we were erroneously trying to also remove each of the interface
sub-sub-contexts that belonged to it; this would effectively
re-enable the AF we just removed. Work around this by ignoring
these sub-sub-contexts if we detect that we are already
removing the parent block.
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
Having a fixed set of parameters for each northbound callback isn't a
good idea since it makes it difficult to add new parameters whenever
that becomes necessary, as several hundreds or thousands of existing
callbacks need to be updated accordingly.
To remediate this issue, this commit changes the signature of all
northbound callbacks to have a single parameter: a pointer to a
'nb_cb_x_args' structure (where x is different for each type
of callback). These structures encapsulate all real parameters
(both input and output) the callbacks need to have access to. And
adding a new parameter to a given callback is as simple as adding
a new field to the corresponding 'nb_cb_x_args' structure, without
needing to update any instance of that callback in any daemon.
This commit includes a .cocci semantic patch that can be used to
update old code to the new format automatically.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
- Fix 1 byte overflow when showing GR info in bgpd
- Use PATH_MAX for path buffers
- Use unsigned specifiers for uint16_t's in zebra pbr
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Replace sprintf with snprintf where straightforward to do so.
- sprintf's into local scope buffers of known size are replaced with the
equivalent snprintf call
- snprintf's into local scope buffers of known size that use the buffer
size expression now use sizeof(buffer)
- sprintf(buf + strlen(buf), ...) replaced with snprintf() into temp
buffer followed by strlcat
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Coccinelle needs to know about complicated macros to understand certain
code paths, add some more macros there.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Our two northbound tools don't have embedded YANG modules like the
other FRR binaries. As such, ly_ctx_set_module_imp_clb() shouldn't be
called when the YANG subsystem it being initialized by a northbound
tool. To make that possible, add a new "embedded_modules" parameter
to the yang_init() function to control whether libyang should look
for embedded modules or not.
With this fix, "gen_northbound_callbacks" and "gen_yang_deviations"
won't emit "YANG model X not embedded, trying external file"
warnings anymore.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
This provides the first reasonably-working version of the frr-format GCC
plugin. I've only tested it with gcc 9.3.0.
Signed-off-by: David Lamparter <equinox@diac24.net>
... remove everything we don't need (or can't use because GCC doesn't
export all of its internal classes & stuff.)
Signed-off-by: David Lamparter <equinox@diac24.net>
Problem seen when deleting many static routes or access-lists due
to frr-reload.py issuing individual vtysh -c commands for every
line. On slow switches, this can take long enough for systemd to
time out the reload process and restart frr. This fix uses add
logic for static routes, prefix-lists, and access-lists to gang
the changes together.
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Ticket: CM-27856
These make no sense. stderr=subprocess.STDOUT means that vtysh's stdout
and stderr are combined and returned by check_output. We don't expect
errors in that, and we certainly don't log them.
Leaving vtysh's stderr as stderr is perfectly fine, it'll be captured
for logging just like stderr output from frr-reload.py.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Hopefully at some point we can get rid of the --enable-datacenter switch
and just have the init script do magic. Should already work for Cumulus
as it is.
NB: the profile name can't be baked into the package. The whole point
is to make the package profile-agnostic; in theory at some point the
exact same package files should work on both, say, a Cumulus switch and
a Linux software BGP DFZ router.
Signed-off-by: David Lamparter <equinox@diac24.net>
Found that while the previous fix solved the traceback and created
the correct configuration, it was doing a delete/add process rather
than just an add. This was due to an incorrectly created search
string. This commit fixes that search string and testing verifies
that the correct thing is now being done.
Ticket: CM-27233
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Problem reported with tracebacks seen when making multiple bfd timer
changes in frr.conf and applying via frr-reload.py. Found that when
multiple bfd timer changes are made, the same line can be added for
deletion more than once, causing the traceback when the deletion is
performed. This fix verifies the correct line is being appended for
deletion.
Ticket: CM-27233
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
instead of suppressing the 'exit' markers at the end of each
'interface XXX' clause in the mpls ldp configuration, mark
those with a special marker 'exit-ldp-if' and teach the
reload script to correctly recognize the new sub-subcontext
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
Add a new '-s' option which controls whether the generated northbound
callbacks are declared with the 'static' specifier or not. If not
(the default), a prototype is generated for each callback before
their declarations.
It's suggested that daemons shouldn't use the '-s' option so that
their northbound callbacks can be implemented in different files
according to their class (config, state, rpc or notification).
libfrr commands, on the other hand, can use the '-s' option when
their associated YANG module is too small and putting all callbacks
in the same file is desirable.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
frr-reload.py has many special case rules that did not consider ldpd
at all. Specifically:
1. The bulk of ldp configuration comes in a big 'mpls ldp' context, which was
previously considered a single-line context as it started with 'mpls'. This
rule should only apply to labels and lsps.
2. ldp has a 'router-id' config line that fell into the same rule as the above
one. It should not be considered a single-line context as more ldp
configuration can follow.
3. enabled interfaces should not end their context. A better fix
would actually require popping a new context for each interface
in case there is any interface-specific config, but at least this
fix will address the most common use case.
4. when declaring pseudowires, any line with 'member pseudowire XXX' should
be considered a sub-context of the 'l2vpn YYY type ZZZ' context. Without
this fix, changes in the first psuedowire declared would not correctly
be processed (e.g. removing a 'control-word exclude' line would not
be picked up).
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
allow frr-reload.py to be invoked with a --daemon option to specify
an individual daemon for which the configuration diff should be
computed. This is useful when integrated config is not used and we
want to apply a patch to a single daemon config file.
No attempt to integrate this with 'service frr reload' has been done.
Making watchfrr work with per-daemon config is outside the scope of
this simple patch.
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
allow command line parameters to specify different folder for
the vtysh binary, config file location and temporary file.
Keep the old hardcoded paths as default values for those options
to preserve current functionality.
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
This commit is to copy the support bundle scripts to appropriate directories during installation
Signed-off-by: Sri Mohana Singamsetty <msingamsetty@vmware.com>
For frr_each, just fix some existing warnings; for frr_with_* add a
warning indicating that braces should always be used.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
frr_with_mutex(...) { ... } locks and automatically unlocks the listed
mutex(es) when the block is exited. This adds a bit of safety against
forgetting the unlock in error paths & co. and makes the code a slight
bit more readable.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The correct cast for these is (unsigned char), because "char" could be
signed and thus have some negative value. isalpha & co. expect an int
arg that is positive, i.e. 0-255. So we need to cast to (unsigned char)
when calling any of these.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Use the alternate struct instantiation that does not generates warning
on old compilers.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
This allows developer to set a temporary YANG model directory path for
generating northbound for models not yet installed.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Add `allow-external-route-update` and `domainname` to the one line
context list, otherwise reload will fail when those commands show up in
the running configuration.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Without this, we end up restarting watchfrr with the systemd watchdog
non-functional & tripped a bit later. Also, if watchfrr is in the
"control" cgroup, systemd 232 will kill it. (241 apparently doesn't.
Can't find anything about this in systemd's ChangeLog though.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Zebra already flushes routes on proper shutdown if you are not
using the -K option. If you are using the -K option then you
do not want the tools/frr script to flush routes.
If zebra crashes and we restart then load up will either delete
the routes or leave them depending on the -K option.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Place in the code the ability for end operators to know how
to modify MAX_FDS so that they can run large scale operations.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
It doesn't make much sense for a hash function to modify its argument,
so const the hash input.
BGP does it in a couple places, those cast away the const. Not great but
not any worse than it was.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Problem reported that if a bgp neighbor had a bfd timer change
made in frr.conf and systemctl reload frr performed, the neighbor
with the timer changed bounced. If the change is made in vtysh
by just adding the new timer values, no peer bounce occurs. This
fix skips the delete part of the delete/add process in frr-reload
so the peers stay up.
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Apparently, the default changed to use `/etc/frr/daemons` instead of
`/etc/frr/daemons.conf`. Therefore, we should ignore absence of the
latter file, because its absence is not an actuall error but will
cause a confusing error message like this:
/etc/init.d/frr: line 507: /etc/frr/daemons.conf: No such file or directory
The "declare -p watchfrr_options" call is just to support backwards
compatibility. If it fails, silently ignore that.
Signed-off-by: David Lamparter <equinox@diac24.net>
Add some Coccinelle semantic patches we can use to automatically
refactor code in the future.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Discovered in testing that if a static route in the default table
was entered immediately after a vrf static block, the static route
intended for the default table was put in the vrf instead. This
fix retains the "exit-vrf" statement which causes the following
static routes to appear in the default table correctly.
Ticket: CM-23985
Signed-off-by: Don Slice <dslice@cumulusnetwork.com>
Problem caused when nclu is used to create "ip route 1.1.1.0/24
blackhole" because frr-reload.py changed the line to Null0 instead
of blackhole. If nclu tries to delete it using the same line as
entered, the commit fails since it doesn't match.
Ticket: CM-23986
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
This has a python script that helps in collecting various CLI show command outputs in an automated way.
This commit has two files.
1.Text Configuration file: support_bundle_commands.conf - This file has list of CLI show commands to be executed. This file will be in tools/etc/frr/ directory. On executing command "sudo install -m 644 tools/etc/frr/ support_bundle_commands.conf /etc/frr/support_bundle_commands.conf", as part of FRR installation, this file will be copied into /etc/frr directory.
2.Python script file: generate_support_bundle.py - This file has the python code that has the below functionality.
* It reads the support_bundle_commands.conf file. For each process present in the conf file, it creates a support_bundle file. For example, it creates bgp_support_bundle.log file for BGP and zebra_support_bundle.log file for Zebra. These files will be created in /var/log/frr/ directory. This is where regular FRR log files are also stored currently.
* The script reads the CLI command specified between CLI_START and CLI_END key words for each process. It will execute the commands one by one.
* For each such command, the script also appends the current time stamp at which the CLI command is executed.
* In case of successful execution of the CLI command, it will copy the CLI output into the above support bundle file.
* In case of CLI command failure, it will capture the error thrown and the error is also written into the same file.
* A small snippet of the output file is as below.
>>[2019-01-02 13:55:23.318987]show bgp summary
IPv4 Unicast Summary:
BGP router identifier 203.0.113.1, local AS number 65000 vrf-id 0
BGP table version 4
RIB entries 7, using 1176 bytes of memory
Peers 1, using 21 KiB of memory
Peer groups 1, using 64 bytes of memory
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd
203.0.113.2 4 65001 34 34 0 0 0 00:29:47 2
Total number of neighbors 1
>>[2019-01-02 13:55:23.619953]show ip bgp
BGP table version is 4, local router ID is 203.0.113.1, vrf id 0
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Signed-off-by: Sri Mohana Singamsetty <msingamsetty@vmware.com>
TBH when I looked at watchfrr I didn't see any MI support and hence
assumed this just didn't work to begin with. However, it actually does
(transparently to watchfrr, by just using "ospfd-1" as daemon name.)
So, fix this up and make it work again.
(Also remove 2 extraneous \n in messages.)
Signed-off-by: David Lamparter <equinox@diac24.net>
There's no good reason to not have these options default to the
installation path of tools/watchfrr.sh. Doing so allows us to ditch
watchfrr_options from daemons/daemons.conf completely.
Fixes: #3652
Signed-off-by: David Lamparter <equinox@diac24.net>
If we try to monitor a nonexisting daemon in watchfrr, it will
(currently) forever wait at startup since the vty connection will never
come up. Just drop the daemon from the daemon list in such a case.
Signed-off-by: David Lamparter <equinox@diac24.net>
The debian/ directory is distributed separately for tarballs in 3.0
(quilt) format. Including it in the dist tarball causes problems with
automake when the separately distributed debian directory is unpacked on
top of the dist tarball; the clean and correct thing to do here is to
not include the debian/ directory in dist tarballs.
Users have two choices for building FRR Debian packages:
- build straight off git
- build from a "frr.tar" + "frr-debian.tar"
The tarsource.sh tool does the right thing when invoked with the -D
("Debian") option.
Signed-off-by: David Lamparter <equinox@diac24.net>
It cleans your house and cooks dinner. Or maybe it creates a clean dist
tarball for you, plus a Debian .dsc if you have dpkg installed - and
GPG-signs the result appropriately if requested.
In any case the resulting tarball should be distributed for our
releases.
Signed-off-by: David Lamparter <equinox@diac24.net>
Change the northbound lib operation from DELETE to DESTROY;
make the required changes in the users of the northbound, in
the cli, rip, ripng, and isis.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Currently our systemd dependencies look something like this (example
from vanilla Debian 9):
$ systemctl list-dependencies frr
frr.service
● ├─system.slice
● └─sysinit.target
...
$ systemctl list-dependencies --reverse frr
frr.service
● └─network-online.target
● └─apt-daily.service
Note that sysinit.target does not depend on any network* service or
target.
In other words, unless there is a service that requires
network-online.service, even if FRR is enabled it will not be started.
Therefore network-online.target is the wrong unit to have in WantedBy=,
as it is not always started.
This patch updates our service file so that it is properly started by
the system when enabled, delayed until networking is up, and if possible
delayed until after NetworkManager, systemd-networkd or any other
networking configuration manager has finished performing its tasks -
i.e. after network-online.target.
After these changes our new dependency graph looks like this:
$ systemctl list-dependencies frr
frr.service
● ├─system.slice
● │ └─networking.service
● ├─network.target
● └─sysinit.target
...
$ systemctl list-dependencies --reverse frr
frr.service
● └─multi-user.target
● └─graphical.target
This way, FRR will be started by multi-user.target (just like most
applications), but delayed until after networking has been configured.
In the same stroke, this should also fix issues on systems that do not
provide "networking.service" (such as CentOS 7).
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Signed-off-by: David Lamparter <equinox@diac24.net>
- some target_CFLAGS that needed to include AM_CFLAGS didn't do so
- libyang/sysrepo/sqlite3/confd CFLAGS + LIBS weren't used at all
- consistently use $(FOO_CFLAGS) instead of @FOO_CFLAGS@
- 2 dependencies were missing for clippy
Signed-off-by: David Lamparter <equinox@diac24.net>
Problem seen when removing last config item under the vrf context,
where frr-reload.py tries instead to delete the vrf context itself.
Since that is not permitted on an active vrf, the command errors
out and nothing is deleted.
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
This separates the init script used for the system (and called in the
systemd unit file) from the script that watchfrr uses to control
daemons. Mixing these two caused the entire thing to become a rather
huge spaghetti mess.
Note that there is a behaviour change in that the new script always
starts zebra regardless of zebra_enable.
Side changes:
- Ubuntu 12.04 removed from backports since it doesn't work anyway
- zebra is always started regardless of zebra_enable. To disable FRR,
the entire init script should be disabled through policy.
- no-watchfrr operation is no longer supported by the scripts in the
Debian packages. (This is intentional.)
Signed-off-by: David Lamparter <equinox@diac24.net>
The northbound infrastructure for operational data was subpar compared
to the infrastructure for configuration data. This commit addresses most
of the existing problems, making it possible to write operational-data
callbacks for more complex YANG models.
Summary of the changes:
* Add support for nested YANG lists.
* Add support for leaf-lists.
* Add support for leafs of type "empty".
* Introduce the "show yang operational-data XPATH" command, and write an
unit test for it. The main purpose of this command is to make it
easier to test the operational-data northbound callbacks.
* Introduce the nb_oper_data_iterate() function, that can be used
to iterate over operational data. Make the CLI and sysrepo use this
function.
* Since ConfD has a very peculiar API, it can't reuse the
nb_oper_data_iterate() like the other northbound clients. In this
case, adapt the existing ConfD callbacks to support the new features
(and make some performance improvements in the process).
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
A YANG list that contains both configuration and state data must have
the following callbacks: create(), delete(), get_next(), get_keys()
and lookup_entry().
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
* Rename yang_snodes_iterate() to yang_snodes_iterate_subtree() and
expose it in the public API.
* Rename yang_module_snodes_iterate() to yang_snodes_iterate_module().
* Rename yang_all_snodes_iterate() to yang_snodes_iterate_all().
* Make it possible to stop the iteration at any time by returning
YANG_ITER_STOP in the iteration callbacks.
* Make the iteration callbacks accept only one user argument and not
two.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
1) Certain echo statements present in the script before/after SSD process
restart are causing the FRR script to hang. This is breaking the frr script
functionality for start/stop/restart. Removed such echo statements.
Tests:
1. Multiple start, stop, restart
2. Multiple restarts/kill of same process.
Signed-off-by: Sri Mohana Singamsetty <msingamsetty@vmware.com>
Need to use /usr/lib/frr/frr script for start/stop/restart of FRR. /usr/sbin/service frr command is not working as expected.
Signed-off-by: Sri Mohana Singamsetty <msingamsetty@vmware.com>
clang-format always indent labels by default and that can't be changed
with any configuration option. Also, indented labels tend to improve
code readability, especially in long functions.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
We weren't cleaning up some files (a whole lot of python foobar) and had
some files in the dist tarball that don't quite belong there.
Signed-off-by: David Lamparter <equinox@diac24.net>
This is no longer neccessary since start-stop-daemon will block until
watchfrr's launch parent has exited.
Signed-off-by: David Lamparter <equinox@diac24.net>
The script simplifies the relatively lengthy procedure.
It should be invoked from the top level source directory, for example:
./tools/build-debian-package.sh
Signed-off-by: Daniil Baturin <daniil@baturin.org>
Please note this is a Proof of Concept and not actually something
that is ready to commit at this point. The file tools/lua.scr
contains some documentation on how we expect it to work currently.
Additionally not all bgp values have been hooked up into the
ability to lua script yet.
There is still significant work to be done here:
1) Add the ability to pass in more data and to adjust the return values
as appropriate.
To set it up:
1) copy tools/lua.scr into /etc/frr (or whereever the config
directory is )
2) Create a route-map match command:
!
router bgp 55
neighbor 10.50.11.116 remote-as external
!
address-family ipv4 unicast
neighbor 10.50.11.116 route-map TEST in
exit-address-family
!
route-map TEST permit 10
match command mooey
!
3) In the lua.scr file make sure that you have a function
named 'mooey' ( as the above example does ):
function mooey ()
zlog_debug(string.format("Family: %d: %s %d ifindex: %d aspath: %s localpref: %d",
prefix.family, prefix.route,
nexthop.metric, nexthop.ifindex, nexthop.aspath, nexthop.localpref))
nexthop.metric = 33
nexthop.localpref = 13
return 3
end
This example script modifies the metric and localpref currently. I've also provided
a zlog_debug function in lua to allow some simple debugging.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Fixed using XCALLOC(MTYPE_TMP, ...) instead of calloc(...) because of the
error handling (XCALLOC checks + log + abort through memory_oom())
Signed-off-by: F. Aragon <paco@voltanet.io>
config.h (or, transitively, zebra.h) must be the first include file
listed for autoconf things like _GNU_SOURCE and _POSIX_C_SOURCE to work
correctly.
Signed-off-by: David Lamparter <equinox@diac24.net>
Problem reported that when a peer-group was added in certain
configurations, it would be rejected because of the order of the
commands put in by nclu. Issued turned out to be how frr-reload.py
was handling the sub-sub-context of the vni under the address-family
and subsequently how it handled the following exit-vni.
Ticket: CM-21996
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Add BFD daemon to the build process and packaging instructions.
Currently the bfdd daemon does nothing, this is just to document how the
daemon insertion step occured.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
checkpatch cuts from the diff between the outputs of pre-patch and
post-patch runs of `checkpatch.pl`, but fixed-length greps sometimes
don't cut correctly.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Only frr-reload.py pulls in a python depenedency for frr, we can
reduce the size of the base frr package by a lot if we separate
out frr-pythontools. When we do this, we get a somewhat cryptic
error message when frr-reload.py is missing on frr reload.
Here, we pull the error message from frr-reload script, which is
much clearer.
Testing done:
frr reload both with and without the frr-reload.py script, see
the frr-reload message when missing and it runs frr-reload.py when
not missing.
Signed-off-by: Arthur Jones <arthur.jones@riverbed.com>
Add code to allow FRR to properly build and handle the staticd
for some of the more common packaging.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
* Move configure flag propagations out of user flags
* Use AC_SUBST to transfer flag values to Automake
* Set default AM_CFLAGS and AM_CPPFLAGS in common.am and change child
Makefiles to modify these base variables
* Add flag override to turn off all sanitizers when building clippy
* Remove LSAN suppressions blacklist as it's no longer needed
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
The re-use of RTPROT_STATIC has caused too many collisions
where other legitimate route sources are causing us to
believe we are the originator of the route. Modify
the code so that if another protocol inserts RTPROT_STATIC
we will assume it's a Kernel Route.
Fixes: #2293
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
We have run across a few cases where the startup timeout is
ocurring on heavily loaded systems. This is especially true
in simulation environments where the hypervisor load is
extremely high.
Modify the code base to give ourselves more time to startup.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The sharp and pbr protocols needed a bit more handling
to be 'right' from a start/stop perspective.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Currently, we just package the frr daemons, but we don't run
them. This is fine for basic tests, but it is inconvenient to
orchestrate the daemons from downstream test environments.
Here, we follow the redhat and debianpkg formats more closely,
putting the daemons in /usr/lib/frr and including the frr user
and groups in the package. We also include a docker specific
startup script and a sysvinit link in /etc/init.d/frr for
openrc based alpine installs.
Testing done:
Built packages, built base images, everything seems to work fine.
Uninstalled the package, all the daemons stopped.
Issue: https://github.com/FRRouting/frr/issues/2030
Signed-off-by: Arthur Jones <arthur.jones@riverbed.com>
This is an implementation of PBR for FRR.
This implemenation uses a combination of rules and
tables to determine how packets will flow.
PBR introduces a new concept of 'nexthop-groups' to
specify a group of nexthops that will be used for
ecmp. Nexthop-groups are specified on the cli via:
nexthop-group DONNA
nexthop 192.168.208.1
nexthop 192.168.209.1
nexthop 192.168.210.1
!
PBR sees the nexthop-group and installs these as a default
route with these nexthops starting at table 10000
robot# show pbr nexthop-groups
Nexthop-Group: DONNA Table: 10001 Valid: 1 Installed: 1
Valid: 1 nexthop 192.168.209.1
Valid: 1 nexthop 192.168.210.1
Valid: 1 nexthop 192.168.208.1
I have also introduced the ability to specify a table
in a 'show ip route table XXX' to see the specified tables.
robot# show ip route table 10001
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, P - PIM, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR,
> - selected route, * - FIB route
F>* 0.0.0.0/0 [0/0] via 192.168.208.1, enp0s8, 00:14:25
* via 192.168.209.1, enp0s9, 00:14:25
* via 192.168.210.1, enp0s10, 00:14:25
PBR tracks PBR-MAPS via the pbr-map command:
!
pbr-map EVA seq 10
match src-ip 4.3.4.0/24
set nexthop-group DONNA
!
pbr-map EVA seq 20
match dst-ip 4.3.5.0/24
set nexthop-group DONNA
!
pbr-maps can have 'match src-ip <prefix>' and 'match dst-ip <prefix>'
to affect decisions about incoming packets. Additionally if you
only have one nexthop to use for a pbr-map you do not need
to setup a nexthop-group and can specify 'set nexthop XXXX'.
To apply the pbr-map to an incoming interface you do this:
interface enp0s10
pbr-policy EVA
!
When a pbr-map is applied to interfaces it can be installed
into the kernel as a rule:
[sharpd@robot frr1]$ ip rule show
0: from all lookup local
309: from 4.3.4.0/24 iif enp0s10 lookup 10001
319: from all to 4.3.5.0/24 iif enp0s10 lookup 10001
1000: from all lookup [l3mdev-table]
32766: from all lookup main
32767: from all lookup default
[sharpd@robot frr1]$ ip route show table 10001
default proto pbr metric 20
nexthop via 192.168.208.1 dev enp0s8 weight 1
nexthop via 192.168.209.1 dev enp0s9 weight 1
nexthop via 192.168.210.1 dev enp0s10 weight 1
The linux kernel now will use the rules and tables to properly
apply these policies.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
The shutdown code was sometimes taking 1 minute to run because
the ssd program was misbehaving after a make install.
This commit just removes the usage of ssd for shutdown
since we already have a pid file and we know that the
frr script cleans up the pid file as well.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
* Checkpatch.pl now checks for nonstandard integral types
* Add shell script to replace all nonstandard types with their standard
counterparts in C source files
* Document usage of types, mention conversion script
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Building FRR with AddressSanitizer is kind of annoying since
libpython3.5 leaks memory, clippy links libpython3.5 and clippy runs as
part of the build process. LeakSanitizer has a way to suppress leaks at
runtime by setting the LSAN_OPTIONS environment variable to contain a
file path to a suppression list:
LSAN_OPTIONS=suppressions=path/to/suppr.txt
This commit provides the file. Setting this environment variable to
LSAN_OPTIONS=suppressions=../tools/lsan-suppressions.txt
before building should allow a clean build with ASAN enabled. The
relative path is there because LeakSanitizer looks at paths relative to
the binary it is sanitizing; clippy is in lib/ so the path is set
relative to lib/.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>