Commit Graph

20865 Commits

Author SHA1 Message Date
Renato Westphal
b855e95fd3 lib: introduce configuration back-off timer for YANG-modeled commands
When using the default CLI mode, the northbound layer needs to create
a separate transaction to process each YANG-modeled command since
they are supposed to be applied immediately (there's no candidate
configuration nor the "commit" command like in the transactional
CLI). The problem is that configuration transactions have an overhead
associated with them, in large part because of the use of some heavy
libyang functions like `lyd_validate()` and `lyd_diff()`. As of
now this overhead is substantial and doesn't scale well when large
numbers of transactions need to be performed in sequence.

As an example, loading 50k prefix-lists using a single transaction
takes about 2 seconds on a modern CPU. Loading the same 50k
prefix-lists using 50k transactions can take more than an hour
to complete (which is unacceptable by any standard). To fix this
problem, some heavy optimization work needs to be done on libyang and
on the FRR northbound itself too (e.g. perform partial configuration
diffs whenever possible). This, however, should be a long-term
effort since these optimizations are unlikely to be trivial to implement
and we're far from having the performance numbers we need.

In the meantime, this commit introduces a simple but efficient
workaround to alleviate the issue. In short, a new back-off timer
was introduced in the CLI to monitor and detect when too many
YANG-modeled commands are being received at the same time. When
a certain threshold is reached (100 YANG-modeled commands within
one second), the northbound starts to group all subsequent commands
into a single large transaction, which allows them to be processed
much faster (e.g. seconds and not hours). It's essentially a
protection mechanism that creates dynamically-sized transactions
when necessary to prevent performance issues from happening. This
mechanism is enabled both when parsing configuration files and when
reading commands from a terminal.
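
The grouping behaviour described above can be modelled roughly as follows.
This is a minimal sketch, not FRR's implementation: the class and method
names are hypothetical, timestamps are passed in by the caller, and the
point at which the grouped transaction is finally committed (`flush`) is
an assumption. Only the threshold of 100 commands within one second comes
from this commit message.

```python
from collections import deque

# Threshold from the commit message: 100 YANG-modeled commands within one second.
BACKOFF_CMDS = 100
BACKOFF_WINDOW = 1.0  # seconds


class BackoffBatcher:
    """Hypothetical model of the back-off behaviour; not FRR's code."""

    def __init__(self):
        self.recent = deque()    # timestamps of recently received commands
        self.pending = []        # commands grouped into the open transaction
        self.transactions = []   # committed transactions (lists of commands)

    def feed(self, cmd, now):
        # Forget timestamps that fell out of the one-second window.
        while self.recent and now - self.recent[0] >= BACKOFF_WINDOW:
            self.recent.popleft()
        self.recent.append(now)

        if len(self.recent) > BACKOFF_CMDS or self.pending:
            # Burst detected, or a group is already open: defer the commit.
            self.pending.append(cmd)
        else:
            # Normal case: one transaction per command.
            self.transactions.append([cmd])

    def flush(self):
        # Commit everything that was grouped as one large transaction.
        if self.pending:
            self.transactions.append(self.pending)
            self.pending = []
```

Feeding 150 commands one millisecond apart and then calling `flush()`
leaves 100 single-command transactions followed by one transaction holding
the remaining 50 commands.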

The downside of this optimization is that, if several YANG-modeled
commands are grouped into the same transaction and at least one of
them fails, the whole transaction is rejected. This is undesirable
since users don't expect transactional behavior when that's not
enabled explicitly. To minimize this issue, the CLI will log all
commands that were rejected whenever that happens, to make the
user aware of what happened and have enough information to fix
the problem. Commands that fail due to parsing errors or CLI-level
validations in general are rejected separately.

Again, this proposed workaround is intended to be temporary. The
goal is to provide a quick fix to issues like #6658 while we work
on better long-term solutions.

Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
2020-08-03 15:17:03 -03:00
Renato Westphal
ca77b518bd *: introduce DEFPY_YANG & friends
DEFPY_YANG will allow the CLI to identify whether a command is
YANG-modeled before executing it. This is going to be
useful for the upcoming configuration back-off timer work that
needs to commit pending configuration changes before executing a
command that isn't YANG-modeled.
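
The ordering requirement stated above (commit pending changes before
running a command that isn't YANG-modeled) can be illustrated with a small
sketch. The `Cli` class, its methods, and the `yang_modeled` flag are
hypothetical stand-ins for what DEFPY_YANG lets the CLI know about a
command; none of this is FRR's API.

```python
class Cli:
    """Hypothetical sketch; names are illustrative, not FRR's API."""

    def __init__(self):
        self.pending = []  # uncommitted YANG-modeled changes
        self.log = []      # ordered record of what happened

    def commit_pending(self):
        # Apply all pending configuration changes as one step.
        if self.pending:
            self.log.append(("commit", list(self.pending)))
            self.pending.clear()

    def execute(self, cmd, yang_modeled):
        if yang_modeled:
            # YANG-modeled commands may be held back and grouped.
            self.pending.append(cmd)
        else:
            # Other commands must see the configuration fully applied first.
            self.commit_pending()
            self.log.append(("exec", cmd))
```

With this ordering, a non-YANG command such as a `show` command never runs
against configuration that is still waiting to be committed.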

Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
2020-08-03 15:17:03 -03:00
Quentin Young
774daaed3f
Merge pull request #6845 from opensourcerouting/foreach-safi-formatting
clang-format: add FOREACH_SAFI to the ForEachMacros list
2020-08-03 12:27:28 -04:00
Renato Westphal
afdb3e867f
Merge pull request #6781 from chiragshah6/mdev
yang: create route-map leafref reference type
2020-08-03 12:57:45 -03:00
Renato Westphal
6e4e5353e4 clang-format: add FOREACH_SAFI to the ForEachMacros list
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
2020-08-03 12:18:24 -03:00
Donald Sharp
4e85367800 doc: Add documentation for the new cli
Document the `show bgp ipv4 uni neighbors 192.168.161.2 bestpath-routes`
command.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-08-03 10:34:14 -04:00
Donald Sharp
f20ce998fb bgpd: Add bestpath-routes to neighbor command
Add the ability to list the bestpath-routes to the
`show bgp afi safi neighbor X` command.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-08-03 10:34:14 -04:00
Donald Sharp
2f9bc755fd bgpd: Abstract the header inclusion for show_adj_route
Cut-n-paste code can go away.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-08-03 10:34:14 -04:00
Donatas Abraitis
5a45d61308
Merge pull request #6833 from donaldsharp/pcount_selected
bgpd: Add to neighbor prefix-counts the count of best path selected
2020-08-01 13:09:28 +03:00
Chirag Shah
15435a3ce7 yang: route-map model description format
Added "." at the end of each field description.

Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
2020-07-31 16:21:45 -07:00
Chirag Shah
1cbba4b0bd yang: route-map style format
Align to yanglint format

Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
2020-07-31 16:21:45 -07:00
Chirag Shah
37746447c1 yang: create route-map leafref reference type
Create leafref reference type for route-map name.

Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
2020-07-31 16:21:45 -07:00
Donatas Abraitis
986b0fc389 doc: Add wide option for show bgp commands
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2020-07-31 22:14:00 +03:00
Donald Sharp
7e3d96323b bgpd: Add to neighbor prefix-counts the count of best path selected
When we have a prefix that has been selected, note that that
particular flag has been set and give that information to the
end user.

eva# show bgp ipv4 uni neighbors 192.168.161.131 prefix-counts
Prefix counts for 192.168.161.131, IPv4 Unicast
PfxCt: 814246

Counts from RIB table walk:

              Adj-in: 0
              Damped: 0
             Removed: 0
             History: 0
               Stale: 0
               Valid: 814246
             All RIB: 814246
       PfxCt counted: 814246
 PfxCt Best Selected: 0
             Useable: 814246
eva# show bgp ipv4 uni neighbors 192.168.161.2 prefix-counts
Prefix counts for 192.168.161.2, IPv4 Unicast
PfxCt: 814070

Counts from RIB table walk:

              Adj-in: 0
              Damped: 0
             Removed: 0
             History: 0
               Stale: 0
               Valid: 814070
             All RIB: 814070
       PfxCt counted: 814070
 PfxCt Best Selected: 814070
             Useable: 814070

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-07-31 10:06:39 -04:00
Philippe Guibert
b598a1458c nhrpd: ignore zebra updates about our routes being deleted/added
nhrp listens for route entries to be deleted, in case some new routes
impact the current routes installed by nhrp. To avoid
unconfiguring the nhrp shortcut route, just skip processing of
nhrp's own routes.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2020-07-31 13:50:57 +02:00
Rafael Zalamena
b6c86dc197
Merge pull request #6778 from mjstapp/fix_topo_route_scale
tests: rework route_scale topotest
2020-07-30 17:59:42 -03:00
Donald Sharp
a8c6e6e895
Merge pull request #6824 from liweitianux/patch-1
ospfd: Fix Zebra route add message truncation issue
2020-07-30 14:34:34 -04:00
Mark Stapp
54d321aaa3 zebra: fix SA warning, handle return code
Handle a return code, resolving an SA warning

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-07-30 14:29:16 -04:00
Mark Stapp
76e036e4b0 tests: Avoid top ecmp route_scale test case when memory limited
Address-sanitizer runs in the CI appear to require more
memory than is available (at present), so skip the top
x32 route_scale testcase when running with <4G of ram.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-07-30 14:23:52 -04:00
Mark Stapp
0ae6467f0e tests: rework route_scale topotest
Make some changes to the route-scale topotest, in view of
issue #6734. Table-drive the test to eliminate some
repeated code. Assert and fail if a step in the progression
of scale fails. Wait a little longer between checking the show
output - it's costly to generate that output at scale. Add a
memleak testcase.

Signed-off-by: Mark Stapp <mjs@voltanet.io>
2020-07-30 14:23:52 -04:00
Mark Stapp
dca30d85fb
Merge pull request #6825 from kuldeepkash/evpn_type2_tests
tests: Skipping evpn_type5_test_topo1 tests from CI runs
2020-07-30 14:09:24 -04:00
Aaron LI
1a08238236 ospfd: Fix Zebra route add message truncation issue
The `INET_ADDRSTRLEN` is 16 and is only enough to format an IPv4 address.
So when there is a prefix (`/xx`), the debug output may get truncated.
Use `PREFIX2STR_BUFFER` macro instead to fix the issue.
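
The size problem can be checked with simple arithmetic. The sketch below
is only an illustration in Python: `c_snprintf` is a hypothetical helper
that mimics how a C formatting call bounded by a buffer size keeps at most
`size - 1` characters, and the sample prefix is made up. The value 16 for
`INET_ADDRSTRLEN` is the one stated above.

```python
INET_ADDRSTRLEN = 16  # room for "255.255.255.255" plus the terminating NUL


def c_snprintf(text, size):
    """Mimic a size-bounded C formatter: keep at most size - 1 characters."""
    return text[: max(size - 1, 0)]


prefix = "192.168.100.200/24"           # 18 characters, needs 19 bytes in C
truncated = c_snprintf(prefix, INET_ADDRSTRLEN)
print(len(prefix), repr(truncated))     # the "/24" part does not fit
```

Short prefixes such as `10.0.0.0/8` still fit in 16 bytes, which is why
the truncation only shows up for longer addresses.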

Signed-off-by: Aaron LI <aly@aaronly.me>
2020-07-30 08:16:18 -07:00
Kuldeep Kashyap
e66778d007 tests: Skipping evpn_type5_test_topo1 tests from CI runs
1. evpn_type5_test_topo1 tests started failing in CI on all Ubuntu 18.04 machines
running kernel version 5.4.0-42-generic.
2. We will re-enable these tests once the issue is found and fixed.

Signed-off-by: Kuldeep Kashyap <kashyapk@vmware.com>
2020-07-30 14:29:25 +00:00
Quentin Young
25ee44b522
Merge pull request #6732 from opensourcerouting/printfrr-prep
*: preparations for printfrr coccinelle run
2020-07-29 14:29:34 -04:00
Donald Sharp
93d08879ad
Merge pull request #6769 from opensourcerouting/acl-regress
lib,yang: merge cisco/zebra access list styles
2020-07-29 09:57:39 -04:00
Rafael Zalamena
e41e0f8135 zebra,fpm: serialize zebra table walks
We were not getting any benefits from attempting to walk all tables at the
same time and it made debugging harder, so let's execute one table walk
at a time.

Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
2020-07-28 12:34:12 -03:00
Rafael Zalamena
55eb9d4d7d zebra,fpm: fix race on completion detection
Zebra runs on a different thread than FPM, so we need to synchronize
them by using events. While here, implement completion detection for all
kinds of walk.

Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
2020-07-28 12:34:12 -03:00
Rafael Zalamena
e1afb97fdd zebra,fpm: fix input handling
Two important fixes:

* `stream_read_try` does a dirty trick and converts the `-1` return to
  `-2` when errno is `EAGAIN`, `EWOULDBLOCK` or `EINTR`.
* Don't enable reads until the connection is complete.
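
The return-code convention from the first bullet can be modelled as below.
This is a sketch under assumptions: `read_try_result` only restates the
`-1` to `-2` conversion described above, while `next_action` and its
action names are hypothetical and show one plausible way a caller could
tell the cases apart.

```python
import errno

# errno values named in the commit message as transient.
TRANSIENT = {errno.EAGAIN, errno.EWOULDBLOCK, errno.EINTR}


def read_try_result(raw_ret, err):
    """Model the convention above: -1 becomes -2 on transient errors."""
    if raw_ret == -1 and err in TRANSIENT:
        return -2
    return raw_ret


def next_action(ret):
    """Hypothetical caller logic; the action names are illustrative."""
    if ret == -2:
        return "retry"       # transient: wait until the socket is readable again
    if ret == -1:
        return "reconnect"   # hard error
    if ret == 0:
        return "reconnect"   # peer closed the connection
    return "process"         # ret bytes were read
```

A caller that only checks for `-1` would treat a transient `-2` as a
successful read, so both negative values need to be handled explicitly.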

Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
2020-07-28 12:34:12 -03:00
David Lamparter
fd2edd5ac1
Merge pull request #6792 from chiragshah6/pim_dev
*: pim igmp yang registery to appropriate makefile
2020-07-28 15:13:00 +02:00
David Lamparter
f41f38d88d
Merge pull request #6787 from toreanderson/master
tools: do not silently ignore errors when loading config during startup
2020-07-28 15:09:15 +02:00
Russ White
996b789193
Merge pull request #6758 from chiragshah6/yang_nb6
EVPN northbound conversion for vrf l3vni mapping command
2020-07-28 07:22:24 -04:00
Russ White
4f08132ae9
Merge pull request #6808 from ton31337/fix/dampening_reuse_limit_assert
bgpd: Bypass SA tests regarding division by zero for reuse_limit in dampening
2020-07-28 06:20:29 -04:00
vdhingra
65de8bc8d0 lib: Add support to load submodules in embedded modules framework
BGP YANG uses submodules, and at present FRR does not process
submodules in the embedded YANG modules framework.

Signed-off-by: VishalDhingra <vdhingra@vmware.com>
2020-07-28 00:39:32 -07:00
David Lamparter
9034306e4d
Merge pull request #6806 from donaldsharp/tests_log_monitor
tests: Remove 'log monitor' from tests
2020-07-28 00:15:22 +02:00
Rafael Zalamena
5a1ac9688f
Merge pull request #6805 from ton31337/fix/dead_code
bgpd: Remove peer_afc_set()
2020-07-27 18:35:20 -03:00
Rafael Zalamena
4abb2e9951
Merge pull request #6804 from donaldsharp/remove_cisco_compatability
doc: Remove `Cisco Compatability` section
2020-07-27 17:15:02 -03:00
Donatas Abraitis
3ec5c50019 bgpd: Bypass SA tests regarding division by zero for reuse_limit in dampening
reuse_limit can never be zero; Coverity just does not know how the
value is set.

Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2020-07-27 20:38:42 +03:00
Renato Westphal
790953a387
Merge pull request #6765 from mjstapp/backup_nhg_netlink
lib,zebra: support multiple backup nexthops
2020-07-27 12:49:36 -03:00
Donald Sharp
c19e12b74f tests: Remove 'log monitor' from tests
The `log monitor` command is a no-op and actually
outputs a `this doesn't do anything` warning. Let's remove
this CLI line from our tests, since it doesn't do anything and
people will look at these configs for guidance.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-07-27 11:09:16 -04:00
Donatas Abraitis
db8a7160da
Merge pull request #6803 from donaldsharp/coverity_moo_moo
Coverity code cleanup
2020-07-27 17:20:51 +03:00
Donatas Abraitis
dfbd3ae378 bgpd: Remove peer_afc_set()
Dead code.

Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2020-07-27 17:16:32 +03:00
Mark Stapp
6c4b304f33
Merge pull request #6662 from kuldeepkash/evpn_type2_tests
tests: Adding test suites evpn_type5_test_topo1
2020-07-27 08:16:31 -04:00
Donald Sharp
b53e5c9ce9 doc: Remove Cisco Compatability section
This code was deprecated in 5.0 and removed after a year.
It is no longer in the code base, but we forgot to update the
doc.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-07-27 07:57:13 -04:00
Donald Sharp
5f140efeef bgpd: Deref after null check in bgp_evpn_vty.c
Coverity has noticed that we are using bgp_evpn after
we have already NULL checked it one time.  Add an assert
to make Coverity happy here, if we get to this point
something terrible has happened.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-07-27 07:10:41 -04:00
Donald Sharp
7b3a380531 bgpd: Prevent Null pointer usage
Coverity rightly points out that bgp_table_top might return
NULL and immediately deref'ing it might be a problem.
Add a bit of safety.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-07-27 06:59:45 -04:00
Donald Sharp
3130e28686 bgpd: Comment out dead code for future
I wanted to preserve the old code flow to see what might
be needed in the future in commit:
23ca3269da

Coverity doesn't like dead code.  So let's comment it out.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-07-27 06:54:23 -04:00
Donatas Abraitis
1460d1b473
Merge pull request #6802 from donaldsharp/various
Various
2020-07-27 10:36:24 +03:00
Donald Sharp
0b1321e218 vrrpd: Make clang 11 happy
Recent changes to remove PRIu... in commit:
6cde4b4552

cause clang 11 to be unhappy, with length-of-field warnings.
Modify the offending code to compile properly using that compiler.
I've tested against clang 11 and gcc 9.3

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-07-26 19:05:09 -04:00
Donald Sharp
ced26d3d49 doc: Add doc for coalesce-time in bgp
The documentation for this command was missing.  Add a little
bit of data for people in the future.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2020-07-26 18:58:25 -04:00
Donatas Abraitis
9cbd06e0f8 bgpd: Add a knob to force maximum-prefix even for filtered routes
If _force_ is set, then ALL prefixes are counted for maximum instead of
accepted only. This is useful for cases where an inbound filter is applied,
but you want maximum-prefix to act on ALL (including filtered) prefixes.

For instance, we have a configuration like:

neighbor r1 maximum-prefix 10
neighbor r1 prefix-list custom in
!
ip prefix-list custom seq 1 permit 10.0.0.0/24
ip prefix-list custom seq 2 permit 10.0.1.0/24

This will accept only 2 prefixes and discard all others instead of
shutting down the session when 10 is reached.

With this new knob (force), we will count all received prefixes and shut down
the session when 10 is reached.
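
The difference between the two counting modes can be sketched as follows.
This is an illustration only: `session_survives` is a hypothetical
function, the prefix-list is modelled as exact-match permit entries, and
whether the limit trips on reaching or on exceeding the configured value
is an assumption (the sketch trips when the count exceeds it).

```python
import ipaddress


def session_survives(received, permitted, limit, force):
    """Return True if the session stays up after processing `received`."""
    allowed = {ipaddress.ip_network(p) for p in permitted}
    counted = 0
    for prefix in received:
        accepted = ipaddress.ip_network(prefix) in allowed
        # Without force only accepted prefixes count; with force all of them do.
        if accepted or force:
            counted += 1
        # Assumption: the session is torn down once the count exceeds the limit.
        if counted > limit:
            return False
    return True


received = [f"10.0.{i}.0/24" for i in range(12)]
permitted = ["10.0.0.0/24", "10.0.1.0/24"]
print(session_survives(received, permitted, limit=10, force=False))
print(session_survives(received, permitted, limit=10, force=True))
```

With twelve received prefixes and only two permitted by the filter, the
session stays up without force and is shut down with it.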

The bigger problem is when you have lots of peers with a full feed and a
configuration like the one in the example.

This is essentially a re-ordering of how filters vs. maximum-prefix are treated.

Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2020-07-26 23:16:37 +03:00