Adding an `s` after these printfrr specifiers replaces 0.0.0.0 / :: in
the output with a star (`*`). This is primarily intended for use with
multicast, e.g. to print `(*,G)`.
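As an illustration of the resulting output only (this is a Python sketch, not the printfrr implementation, and `format_sg` is a hypothetical name), the substitution behaves roughly like this:

```python
import ipaddress


def format_sg(source: str, group: str) -> str:
    """Render an (S,G) pair, printing '*' for an unspecified source."""

    def star(addr: str) -> str:
        # 0.0.0.0 and :: are the "unspecified" addresses for IPv4 / IPv6.
        return "*" if ipaddress.ip_address(addr).is_unspecified else addr

    return "({},{})".format(star(source), group)


print(format_sg("0.0.0.0", "239.1.1.1"))   # (*,239.1.1.1)
print(format_sg("10.0.0.1", "239.1.1.1"))  # (10.0.0.1,239.1.1.1)
print(format_sg("::", "ff05::1"))          # (*,ff05::1)
```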
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Since this is only used in very few places, moving it out of the way is
reasonable. (`%pSG` will be pim_sgaddr)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The test is doing this:
a) gather interface data about packets sent
b) shut interface
c) no shut interface
d) gather interface data about packets sent
e) compare a to d and fail if packets sent/received has not incremented
The problem is, of course, that under heavy system load sufficient time
might not have passed for packets to be sent between c and d. Add up to
35 seconds of looking for the packet data to be incremented; otherwise
heavily loaded systems may never show that data is being sent.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
verify_pim_interface_traffic *fetches* the pim
traffic data. Rename the function to reflect what it
actually does.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The nhrp_topo test sets up some infrastructure and
was displaying the commands it was outputting
incorrectly. Fix this.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When used with LLGR, setting the GR restart-time timer to 0 should be
allowed, so that the LLGR timers start immediately.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
The following subcodes are defined for the Cease NOTIFICATION
message:

   Subcode   Symbolic Name
      1      Maximum Number of Prefixes Reached
      2      Administrative Shutdown
      3      Peer De-configured
      4      Administrative Reset
      5      Connection Rejected
      6      Other Configuration Change
      7      Connection Collision Resolution
      8      Out of Resources
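For reference, the table above can be expressed as a lookup. This is a minimal Python sketch; `CEASE_SUBCODES` and `cease_subcode_name` are hypothetical names used only for illustration, not identifiers from bgpd:

```python
# Cease NOTIFICATION subcodes, copied from the table in this message.
CEASE_SUBCODES = {
    1: "Maximum Number of Prefixes Reached",
    2: "Administrative Shutdown",
    3: "Peer De-configured",
    4: "Administrative Reset",
    5: "Connection Rejected",
    6: "Other Configuration Change",
    7: "Connection Collision Resolution",
    8: "Out of Resources",
}


def cease_subcode_name(subcode: int) -> str:
    """Return the symbolic name, or a placeholder for an unlisted subcode."""
    return CEASE_SUBCODES.get(subcode, "Unknown ({})".format(subcode))


print(cease_subcode_name(2))  # Administrative Shutdown
```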
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Currently, it is possible to rename the default VRF either by passing
`-o` option to zebra or by creating a file in `/var/run/netns` and
binding it to `/proc/self/ns/net`.
In both cases, only zebra knows about the rename and other daemons learn
about it only after they connect to zebra. This is a problem, because
daemons may read their config before they connect to zebra. To handle
this rename after the config is read, we have some special code in every
single daemon, which is not very bad but not desirable in my opinion.
But things are getting worse when we need to handle this in northbound
layer as we have to manually rewrite the config nodes. This approach is
already hacky, but still works as every daemon handles its own NB
structures. But it is completely incompatible with the central
management daemon architecture we are aiming for, as mgmtd doesn't even
have a connection with zebra to learn from it. And it shouldn't have it,
because operational state changes should never affect configuration.
To solve the problem and simplify the code, I propose to expand the `-o`
option to all daemons. By using the startup option, we let daemons know
about the rename before they read their configs so we don't need any
special code to deal with it. There's an easy way to pass the option to
all daemons by using `frr_global_options` variable.
Unfortunately, the second way of renaming by creating a file in
`/var/run/netns` is incompatible with the new mgmtd architecture.
Theoretically, we could force daemons to read their configs only after
they connect to zebra, but it means adding even more code to handle a
very specific use-case. And anyway this won't work for mgmtd as it
doesn't have a connection with zebra. So I had to remove this option.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Currently the Wait for Install code ( bgp_suppress_fib ) does
not properly handle two states from zebra: ROUTE_INSTALL_FAILED
and BETTER_ADMIN_DISTANCE_WON. Before this change the WFI code
would simply never notify our peers about a route install failure,
but more is needed. In both the ROUTE_INSTALL_FAILED and the
BETTER_ADMIN_DISTANCE_WON cases we need to notify our peers with
a withdrawal of the route, else we will continue to
draw traffic to us when we cannot legally do so.
Why is this needed? In either case, imagine that we have already
received a bgp route, installed it and sent it to our peers.
In the BETTER_ADMIN_DISTANCE_WON case, say a static route is installed
at this point in time. We must stop advertising the route through
us since our route is no longer installed. As such a withdrawal must be sent.
In the ROUTE_INSTALL_FAILED case, the code was not properly handling
the situation where we have Route A, it was successfully installed,
and then we received an update to Route A that was attempted to be
installed but failed. In this case we also need to send a withdrawal.
Finally update the bgp_suppress_fib topotest to test both of these
situations.
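The decision described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the two failure states are taken from this message, while `INSTALLED`, `action_for` and the returned strings are hypothetical and do not mirror the bgpd code.

```python
from enum import Enum


class ZebraNotify(Enum):
    INSTALLED = 1                   # hypothetical name for the success case
    ROUTE_INSTALL_FAILED = 2        # named in this commit message
    BETTER_ADMIN_DISTANCE_WON = 3   # named in this commit message


# Both states mean our route is not the one in the FIB.
WITHDRAW_ON = {
    ZebraNotify.ROUTE_INSTALL_FAILED,
    ZebraNotify.BETTER_ADMIN_DISTANCE_WON,
}


def action_for(notify: ZebraNotify, already_advertised: bool) -> str:
    """Decide what to tell peers after zebra reports on a route."""
    if notify is ZebraNotify.INSTALLED:
        return "advertise"
    if notify in WITHDRAW_ON and already_advertised:
        # Peers still believe the route goes through us: withdraw it.
        return "withdraw"
    # Never advertised, so there is nothing to take back.
    return "suppress"


print(action_for(ZebraNotify.BETTER_ADMIN_DISTANCE_WON, True))  # withdraw
```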
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
There still existed chances that best path consideration
has not taken place for both bgp_l3vpn_to_bgp_vrf and
bgp_instance_del_test ( since they both used the same
check_routes.py scripting ). Add some more checks
to ensure that we have all the data. Prior to this
change I could see one of these two tests failing
every 2-3 runs on my test system. I am not seeing
this anymore after ~5 complete test runs.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
OSPF when converging will choose a DR / Backup DR based upon
who has already come up, irrespective of priority. As such, if
under system load OSPF comes up first and elects a DR that under
normal circumstances would not be the elected one due to priority,
OSPF does not go back through and re-elect, in order to keep the
system stable. Tests are experiencing this:
unet> r0 show ip ospf neigh
Neighbor ID Pri State Up Time Dead Time Address Interface RXmtL RqstL DBsmL
100.1.1.1 99 Full/Backup 4m14s 3.780s 10.0.1.2 r0-s1-eth0:10.0.1.1 0 0 0
100.1.1.2 0 Full/DROther 4m14s 3.848s 10.0.1.3 r0-s1-eth0:10.0.1.1 0 0 0
100.1.1.3 0 Full/DROther 4m14s 3.912s 10.0.1.4 r0-s1-eth0:10.0.1.1 0 0 0
unet> r1 show ip ospf neigh
Neighbor ID Pri State Up Time Dead Time Address Interface RXmtL RqstL DBsmL
100.1.1.0 98 Full/DR 4m15s 3.011s 10.0.1.1 r1-s1-eth1:10.0.1.2 0 0 0
100.1.1.2 0 Full/DROther 4m19s 3.124s 10.0.1.3 r1-s1-eth1:10.0.1.2 0 0 0
100.1.1.3 0 Full/DROther 4m19s 3.188s 10.0.1.4 r1-s1-eth1:10.0.1.2 0 0 0
unet> r2 show ip ospf neigh
Neighbor ID Pri State Up Time Dead Time Address Interface RXmtL RqstL DBsmL
100.1.1.0 98 Full/DR 4m27s 3.483s 10.0.1.1 r2-s1-eth0:10.0.1.3 0 0 0
100.1.1.1 99 Full/Backup 4m32s 3.527s 10.0.1.2 r2-s1-eth0:10.0.1.3 0 0 0
100.1.1.3 0 2-Way/DROther 4m32s 3.660s 10.0.1.4 r2-s1-eth0:10.0.1.3 0 0 0
unet> r3 show ip ospf neigh
Neighbor ID Pri State Up Time Dead Time Address Interface RXmtL RqstL DBsmL
100.1.1.0 98 Full/DR 4m55s 3.786s 10.0.1.1 r3-s1-eth1:10.0.1.4 0 0 0
100.1.1.1 99 Full/Backup 4m55s 3.829s 10.0.1.2 r3-s1-eth1:10.0.1.4 0 0 0
100.1.1.2 0 2-Way/DROther 4m54s 3.897s 10.0.1.3 r3-s1-eth1:10.0.1.4 0 0 0
Modify the test to do a clear to enforce the order we are specifically looking for.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
- Add advertisement of Global IPv6 address in IIH pdu
- Add new CLI to set IPv6 Router ID
- Add advertisement of IPv6 Router ID
- Correctly advertise IPv6 local and neighbor addresses in Extended IS and MT
Reachability TLVs
- Correct output of Neighbor IPv6 address in 'show isis database detail'
- Manage IPv6 addresses advertisement and corresponding Adjacency SID when
IS-IS is not using Multi-Topology by introducing a new ISIS_MT_DISABLE
value for mtid (== 4096 i.e. first reserved flag set to 1)
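A short arithmetic sketch of why 4096 works as a sentinel, assuming the MT ID occupies the low 12 bits of the 16-bit field (as in RFC 5120), so that 4096 is the first bit above it. The helper names are hypothetical:

```python
MT_ID_BITS = 12
MT_ID_MASK = (1 << MT_ID_BITS) - 1   # 0x0fff, the range of real MT IDs
ISIS_MT_DISABLE = 1 << MT_ID_BITS    # 0x1000, first bit above the MT ID


def is_valid_mt_id(mtid: int) -> bool:
    """True for values that fit in the 12-bit MT ID field."""
    return 0 <= mtid <= MT_ID_MASK


def is_mt_disabled(mtid: int) -> bool:
    """True for the sentinel meaning 'Multi-Topology not in use'."""
    return mtid == ISIS_MT_DISABLE


print(ISIS_MT_DISABLE, hex(ISIS_MT_DISABLE))  # 4096 0x1000
```

Because the sentinel has no bits in common with `MT_ID_MASK`, it can never collide with a real MT ID.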
Signed-off-by: Olivier Dugeon <olivier.dugeon@orange.com>
Lots of the GR topotests kill daemons in order to test code
that deals with crashing daemons. Under heavy system load
it was noticed that a kill command was sent and, if told to
wait, we would sleep 2 seconds, send another kill command and
call it good. This was causing issues when subsequent
json commands would get errors like `lost connection to daemon`
as the daemon finally shut down after some time due to load.
Modify the kill the daemon function to notice that the daemon
was not actually killed and, if we need to wait, wait some
more time for it to happen.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Currently under system load tests that use verify_pim_interface_traffic
immediately after an interface down/up event are not giving any time
for pim to receive and process the data from that event. Give
the test some time to gather this data.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Under heavy system load, we are sometimes seeing this
output for addKernelRoute:
2021-11-28 16:17:27,604 INFO: topolog: [DUT: b1]: Running command: [ip route add 224.0.0.13 dev b1-f1-eth0]
2021-11-28 16:17:27,604 DEBUG: topolog.b1: LinuxNamespace(b1): cmd_status("['/bin/bash', '-c', 'ip route add 224.0.0.13 dev b1-f1-eth0']", kwargs: {'encoding': 'utf-8', 'stdout': -1, 'stderr': -2, 'shell': False, 'stdin': None})
2021-11-28 16:17:27,967 DEBUG: topolog.b1: LinuxNamespace(b1): cmd_status("['/bin/bash', '-c', 'ip route']", kwargs: {'encoding': 'utf-8', 'stdout': -1, 'stderr': -2, 'shell': False, 'stdin': None})
2021-11-28 16:17:28,243 DEBUG: topolog: ip route
70.0.0.0/24 dev b1-f1-eth0 proto kernel scope link src 70.0.0.1
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This tells us that the ip route add succeeded but when looking for it
the system failed to immediately find it. Why is this happening?
Probably we are under heavy system load and the two different
commands, 'ip route add..' and 'ip route show' are being executed
on different cpu's and the data has not been copied to the different
cpu yet in the kernel. This is not necessarily something normally
seen but entirely possible. Giving the system a few extra seconds
for the kernel to execute/work the memory barrier system seems
prudent for long term success of our programming.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Modify the timers used to send updates/hellos every
1 second instead of every 5, allowing this test to converge
faster under heavy system load.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
During repeated runs I am seeing this test fail to run successfully.
Upon inspecting the output:
{
"prefix":"10.0.10.0/24",
"prefixLen":24,
"protocol":"isis",
"vrfId":6,
"vrfName":"r1-cust1",
"selected":true,
"destSelected":true,
"distance":115,
"metric":10,
"queued":true,
We can see that the route is still queued. Under heavy system
load and not ensuring that isis has time to send the route to
zebra and for zebra to install the route, this test can fail.
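A minimal sketch of the kind of check this implies, assuming the `show ip route json` output is a JSON object keyed by prefix whose values are lists of route entries shaped like the snippet above. `route_is_settled` is a hypothetical helper, not a topotest library function:

```python
import json


def route_is_settled(show_route_json: str, prefix: str) -> bool:
    """True once a selected entry for prefix is no longer queued in zebra."""
    routes = json.loads(show_route_json).get(prefix, [])
    return any(
        r.get("selected") and not r.get("queued", False) for r in routes
    )


# Entry reduced to the fields that matter here.
still_queued = json.dumps(
    {"10.0.10.0/24": [{"protocol": "isis", "selected": True, "queued": True}]}
)
print(route_is_settled(still_queued, "10.0.10.0/24"))  # False
```

A test would call such a check repeatedly for a bounded time instead of once.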
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Update verify_ospf6_neighbor() so we can verify there are no
neighbors in a given router
input_dict = {
"r0": {
"ospf6": {
"neighbors": []
}
}
}
result = verify_ospf6_neighbor(tgen, topo, dut, input_dict)
Signed-off-by: ckishimo <carles.kishimoto@gmail.com>
The interface area command is deprecated under
router ospf6 and should be on the individual interface.
Let's modify the tests to not actually put the
interface foo area 0.0.0.0 command under the
router node.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When using build_config_from_json there exists a timing
window where neighbors can come up before the router-id
is applied. As a precaution, quickly clear the neighbors
to ensure that we get neighbors with the expected router-id.
This can especially happen under high system load.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The test_ospf_dual_stack test had area configuration
under the `router ospf6` nodes. This is getting
lots of warning messages from the cli. Let's remove
this.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When testers use the build_config_from_json function
the create_router_ospf function is double creating
the ospfv3 cli to be passed in. This is because
the create_router_ospf loops over both v2 and v3
and then create_router_ospf6 re-adds v3.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The ospf_basic_functionality/test_ospf_lan.py creates
an ethernet segment and attaches 4 routers to it and
assigns ip addresses in a /24. As one of the tests
it picks a new address for r0 which coincides with
an ip address on r3. Then the test immediately
checks for other data. The problem is of course
that if a test is `slow` enough hellos will
start to be ignored from r3 to r0 and the
neighbor relationships will come down. Choose
an ip address that doesn't cause this issue.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The new bgp_route_server_client test is not setting the
timers for peers to be fast enough to have the ability
to converge in under 60 seconds if a packet is dropped/missed
at startup. Make the test have the ability to converge
under load
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add a parameter to the resolver API that is the vrf identifier. This
makes resolution specific to each vrf. When the vrf netns backend is
used, this is very practical, since resolution can happen in one netns
while it cannot in another.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
There exist systems that do not explicitly have a python soft-link
on their system. Let's explicitly call out which python we want
to be using with exabgp.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The bgp gr topotests had run times that were greater than 10 minutes each.
Just brute force break up the tests to 4 different sub parts.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Description:
- Changing the expected output for selected route in the script.
- With our changes for VRF-Lite fix best path selection,
during best path selection, while comparing the paths for imported routes,
we should correctly refer to the original route i.e. the ultimate path.
In this case, when we have ibgp route and imported ibgp route
for the same prefix, we do compare IGP metric which is same for both,
So we proceed to comparing router-ids and selecting the best path.
- Before our changes, ibgp route was preferred because of IGP metric.
With our fix, expected output for selected route is changed to
imported ibgp route because of the lower router-id.
- Corresponding changes for expected advertised route and
the large community are made.
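The tie-break described above can be sketched as follows. This covers only the two steps named in this message (IGP metric, then router-id), not the full BGP decision process, and the path data is hypothetical:

```python
import ipaddress


def best_path(paths):
    """Pick the path with the lowest IGP metric, then the lowest router-id."""
    return min(
        paths,
        key=lambda p: (
            p["igp_metric"],
            int(ipaddress.IPv4Address(p["router_id"])),
        ),
    )


# Hypothetical values: both paths have the same IGP metric, so the
# router-id decides.
paths = [
    {"name": "ibgp", "igp_metric": 0, "router_id": "10.0.0.2"},
    {"name": "imported ibgp", "igp_metric": 0, "router_id": "10.0.0.1"},
]
print(best_path(paths)["name"])  # imported ibgp
```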
Co-authored-by: Kantesh Mundaragi <kmundaragi@vmware.com>
Signed-off-by: Iqra Siddiqui <imujeebsiddi@vmware.com>
Somewhere along the line core-files stopped being generated
with the running of the topotests. With this change we now
see this:
sharpd@eva /t/topotests> find . -name '*.dmp' -print
./ospfv3_basic_functionality.test_ospfv3_asbr_summary_topo1/r0/ospf6d_core-sig_6-pid_430478.dmp
sharpd@eva /t/topotests> sudo gdb /usr/lib/frr/ospf6d ./ospfv3_basic_functionality.test_ospfv3_asbr_summary_topo1/r0/ospf6d_core-sig_6-pid_430478.dmp
GNU gdb (Debian 10.1-1.7) 10.1.90.20210103-git
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/lib/frr/ospf6d...
[New LWP 430478]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/frr/ospf6d --log file:ospf6d.log --log-level debug -d'.
Program terminated with signal SIGABRT, Aborted.
50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
(gdb)
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This is implicitly checked by the "verify mroute" below, but it's much
more helpful to explicitly check in advance.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Currently I get bgp_instance_del-test as well as bgp_l3vpn_to_bgp_vrf
failures every ~3-4 runs when under a 40 parallel run with micronet.
Examination of the failure and passing cases always leads to the
failures showing convergence of bgp bestpath immediately after
the show commands to ensure that the routes are there.
Modify the code to look for the fact that the vrf has
converged from routes being passed around across vrf's
and ensure that bestpath has run on them.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When debugging issues for routes in multiple vrfs, it would
be extremely useful if the debug output showed which vrf we
are acting on.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Because this test can be run in either netns vrf mode or vrflite
vrf mode, the default vrf has a different name in each. When netns
mode is chosen, the name "vrf0" is used for the default vrf, while
when vrflite mode is chosen, the name "default" is used. Remove the
vrf keyword from the expected dump.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Issue #9983 explains what is wrong with the GR helper mode.
To unblock the CI that fails almost all the time on the ospf_gr_topo1
test, remove the commands and disable the test. Also add a reminder to
completely remove the helper mode if no one fixes the code in a month.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
This can't really be run as part of CI, it's intended as a helper
instead, to use manually after poking around in the c-ares binding code.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
common_cli.c disables logging by default so stdio is usable as vty
without log messages getting strewn in between. This is the right thing
for most tests, but not all; sometimes we do want log messages.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
949aaea5 removed debugs from all topotests, but this test relies on the
debug logs so it constantly fails now.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
When our CI test system is under high load, expecting bfd to
converge in under 2 seconds is not going to happen. Modify the test
suites to just ensure that things reconverge.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Debugs take up a significant amount of cpu time as well as
increased disk space for storage of results. Reduce test
overhead by removing the debugs. Hopefully this helps
alleviate some of the overloading that we are seeing in
our CI systems.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The test system under load looks for upstream state only
1 time immediately after sending 2 streams of S,G data
flowing. Give the system some time to process this
and ensure that it actually shows up in a small
amount of time.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The test_ldp_pseudowires_after_link_down test
shuts a link down and was blindly waiting 5 seconds
before just assuming the test system was in a sane
state. Remove the sleep(5) and actually look for
the changed state for the route 2.2.2.2 that the
pseudowire actually depends on.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The test does this:
a) shut link down
b) test for ospf convergence
c) ensure the route is installed
When under a heavily loaded system c) is not guaranteed
to happen quickly. Give the system 10 extra seconds
to ensure it happens.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The route replace test was doing this seq of events:
a) Create nhg
b) Install route w/ sharpd
c) Ensure it worked
d) Modify nhg
e) Ensure the update group replace worked
The problem is that the sharp code is doing this:
/* Only send via ID if nhgroup has been successfully installed */
if (nhgid && sharp_nhgroup_id_is_installed(nhgid)) {
SET_FLAG(api.message, ZAPI_MESSAGE_NHG);
api.nhgid = nhgid;
} else {
for (ALL_NEXTHOPS_PTR(nhg, nh)) {
api_nh = &api.nexthops[i];
zapi_nexthop_from_nexthop(api_nh, nh);
i++;
}
api.nexthop_num = i;
}
The created nhg has not been successfully installed (or at least
sharpd has not read the results yet) when it gets the command
to install the routes. As such it passes down the individual
nexthops instead. The route replace is never going to work.
Modify the code to add a bit of sleep to allow sharpd to
get notified when the system is under load. At this point
there is no way to query sharpd for whether or not it
thinks its nhg is installed properly. This
test is failing all over the place for a bunch of people;
let's get this fixed so people can get running.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The test_nexthop_groups function is failing occasionally
because the test executes 4 sharp install routes commands
in succession. When I dumped the rib on a failed test
run there were only 2 of the 4 routes in the rib and
the two that were in were the last 2 installed.
The sharp daemon sets up an event process where it
installs routes `automatically`. If the previous
run is not finished, entering a new command to install
routes will keep the previous one from ever finishing.
It is assumed that the user doesn't do stupid stuff here.
In this case I am just adding a small sleep between each
installation to just let the test proceed.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The isis_topo1 test has two functions where, immediately
after the test ensures that the routes are in isis, it
tests to see if they are in the rib. Under system
load I am seeing this test failing because the
routes are still queued. Modify the zebra check
for the isis routes to look for the proper results
for 10 seconds.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Currently, we have a lot of checks in CLI and NB layer to prevent
incompatible IS-types of circuits and areas. All these checks become
completely meaningless when the interface is moved between VRFs. If the
area IS-type is different in the new VRF, previously done checks mean
nothing and we still end up with incorrect circuit IS type. To actually
prevent incorrect IS type, all checks must be done in the processing
code.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
This code has two issues:
a) The loop to test for successful installation re-installs
the route every time it loops. A system under load will
have issues ensuring the route is installed and repeated
attempts do not help.
b) The nexthop group installation was always failing
but never noticed (because of the previous commit)
and the test was always passing, when it should
have never passed.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The test is checking installing of seg6 routes by this
loop:
for up to 5 times:
sharp install seg6 route
show ip route and is it installed
The problem is that if the system is under heavy
load the installation may not have happened yet
and by immediately reinstalling the same route
the same thing could happen again.
Modify the code to pull the route installation
outside of the loop and to increase to 10 attempts
in case there is very heavy system load.
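The restructured loop can be sketched like this. `install_then_poll` and its callables are hypothetical stand-ins for the sharp install command and the `show ip route` check, not the test's actual code:

```python
import time


def install_then_poll(install, is_installed, attempts=10, interval=1.0,
                      sleep=time.sleep):
    """Issue the install once, then only poll for the result."""
    install()  # outside the loop: never re-sent while still pending
    for _ in range(attempts):
        if is_installed():
            return True
        sleep(interval)
    return False


# Simulate a loaded system where the route shows up on the third check.
calls = {"install": 0, "checks": 0}


def fake_install():
    calls["install"] += 1


def fake_is_installed():
    calls["checks"] += 1
    return calls["checks"] >= 3


ok = install_then_poll(fake_install, fake_is_installed, interval=0)
print(ok, calls["install"], calls["checks"])  # True 1 3
```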
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The `_check` function of check_ping was asserting while being
passed to topotests.run_and_expect(), causing it
to not run the full range of pings if one of them failed.
So effectively it was properly detecting pass / failure but
only allowing for 1 iteration if it was going to fail.
Modify the code to not assert and to act like all the other
run_and_expect functionality.
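To illustrate the difference, here is a sketch using a simplified stand-in for a run_and_expect-style helper. `retry_until` and `make_check` are hypothetical and written only for this example; they are not the topotest library code:

```python
import time


def retry_until(func, expected, count=5, wait=0.0, sleep=time.sleep):
    """Call func up to count times until it returns the expected value."""
    result = None
    for _ in range(count):
        result = func()
        if result == expected:
            return True, result
        sleep(wait)
    return False, result


def make_check(outcomes, use_assert):
    """Build a _check-style callable over scripted ping outcomes."""
    remaining = iter(outcomes)

    def _check():
        ok = next(remaining)
        if use_assert:
            assert ok, "ping failed"  # old style: aborts the retry loop
        return ok                     # new style: lets the helper retry

    return _check


# New style: two failed pings are retried, the third one succeeds.
print(retry_until(make_check([False, False, True], use_assert=False), True))
```

With `use_assert=True` the first failed ping raises out of `retry_until`, so the remaining iterations never run.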
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
isis_tlvs.c would fail at multiple places if incorrect
TLVs were received in unpack_item_ext_subtlvs(),
causing stream assertion violations.
Signed-off-by: Juraj Vijtiuk <juraj.vijtiuk@sartura.hr>
The bgp_l3vpn_to_direct test is failing sometimes because
the 2.2.2.2 route is disappearing. What is happening?
The log file for the failed test run shows us this:
build 15-Oct-2021 07:26:12 scripts/adjacencies.py:8 WAIT:r4:ping 2.2.2.2 -c 1: 0. packet loss:wait:PE->P2 (loopback) ping:60:0.5:
build 15-Oct-2021 07:26:12 Fri Oct 15 14:26:12 2021 (#9) scripts/adjacencies.py:8 COMMAND:r4:ping 2.2.2.2 -c 1: 0. packet loss:wait:PE->P2 (loopback) ping:
build 15-Oct-2021 07:26:12 COMMAND OUTPUT:PING 2.2.2.2 (2.2.2.2) 56(84) bytes of data.
build 15-Oct-2021 07:26:12 64 bytes from 2.2.2.2: icmp_seq=1 ttl=64 time=0.143 ms
build 15-Oct-2021 07:26:12
build 15-Oct-2021 07:26:12 --- 2.2.2.2 ping statistics ---
build 15-Oct-2021 07:26:12 1 packets transmitted, 1 received, 0% packet loss, time 0ms
build 15-Oct-2021 07:26:12 rtt min/avg/max/mdev = 0.143/0.143/0.143/0.000 ms:
build 15-Oct-2021 07:26:12 Done after 1 loops, time=0.024507761001586914, Found= 0% packet loss
build 15-Oct-2021 07:26:12 Fri Oct 15 14:26:12 2021 (#9) scripts/adjacencies.py:9 COMMAND:r4:ping 2.2.2.2 -c 1: 0. packet loss:pass:PE->P2 (loopback) ping +0.02 secs:
build 15-Oct-2021 07:26:12 2021-10-15 14:26:12,446 WARNING: topolog.r4: LinuxNamespace(r4): proc failed: rc 2 pid 28826
build 15-Oct-2021 07:26:12 args: /usr/bin/nsenter -a -t 27444 -F --wd=/tmp/topotests/bgp_l3vpn_to_bgp_direct.test_bgp_l3vpn_to_bgp_direct/r4 /bin/bash -c ping 2.2.2.2 -c 1
build 15-Oct-2021 07:26:12 stdout: connect: Network is unreachable:
build 15-Oct-2021 07:26:17 COMMAND OUTPUT:connect: Network is unreachable:
build 15-Oct-2021 07:26:17 R:9 r4 PE->P2 (loopback) ping +0.02 secs 0 1
So the 2.2.2.2 route is coming/going and is failing on these test lines:
luCommand(
"r1", "ping 2.2.2.2 -c 1", " 0. packet loss", "wait", "PE->P2 (loopback) ping", 60
)
luCommand(
"r3", "ping 2.2.2.2 -c 1", " 0. packet loss", "wait", "PE->P2 (loopback) ping", 60
)
luCommand(
"r4", "ping 2.2.2.2 -c 1", " 0. packet loss", "wait", "PE->P2 (loopback) ping", 60
)
So the 2.2.2.2 routes on r1,3 and 4 are received via ospf, but are
modified by some other process to add labels ( probably ldp, since
it is running too ). The 2nd ping to 2.2.2.2 is failing because
the 2.2.2.2 route on r4 is being replaced. As an example here
is `ip monitor all` on r4 during boot up. Please note timestamps
are not necessarily representative of what we will see on the
loaded ci system.
[2021-10-15T15:46:52.261456] [NEXTHOP]id 27 via 10.0.2.2 dev r4-eth0 scope link proto zebra
[2021-10-15T15:46:52.261490] [ROUTE]2.2.2.2 nhid 27 via 10.0.2.2 dev r4-eth0 proto ospf metric 20
<snip>
[2021-10-15T15:46:53.556405] [NEXTHOP]Deleted id 27 via 10.0.2.2 dev r4-eth0 scope link proto zebra
<snip>
[2021-10-15T15:46:53.566575] [NEXTHOP]id 32 via 10.0.2.2 dev r4-eth0 scope link proto zebra
[2021-10-15T15:46:53.566585] [ROUTE]2.2.2.2 nhid 32 via 10.0.2.2 dev r4-eth0 proto ospf metric 20
For a small amount of time the route was *gone*. I believe the upstream
CI system hits that window sometimes, causing the test to fail.
This patch attempts to ensure that the 2.2.2.2 route has been learned
appropriately ( thus slowing the test down ) before the test moves on to
the ping. I suspect the long term answer might be to add a test to
the scripts/adjacencies.py script to ensure that the test does not
continue until the appropriate label is in place, but I want to
make the test run a bit more prescriptive in what it is looking
for here.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Recent commit 83f325901a accidentally turned
a 1 second wait into a 10 second wait
between retries. 10 seconds is too long.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The test doesn't wait long enough when it checks the routers after
restart. On slower systems, it frequently failed as it ran out
of time.
Signed-off-by: Martin Winter <mwinter@opensourcerouting.org>
When our ci test system is under high load, expecting bfd to converge
in under 2 seconds is not going to happen. Modify the test suites
to just ensure that things converge. If we need actual functional
testing of bfd response times the topotests are not an appropriate place
to do this or we need to modify the test system to gather the data for
how long it takes after the tests are run.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
During a local CI run, bgp_ecmp_topo3 was failing
to properly notice the fast-convergence command
issued before the interface is shut down. As
such there exists a race condition where under
high load the zebra process can actually shut
an interface down before we have properly ensured
that fast convergence is on for ibgp.
Modify the test in two ways:
1) Ensure that the previous section makes sure
that we have properly converged when we
bring the interfaces back up instead of
assuming that we have done so.
2) After issuing the fast-convergence command,
ensure that bgp has fully processed it and is
ready to receive the interface down events
as triggers for shutting down the ibgp session.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
On a local CI run, test_ldp_topo1.py failed to converge
on r3. r3 has 2 neighbors but only 1 was up when we got to
further steps in the test suites.
Modify the neighbor checking to `know` how many neighbors
should be operational and continue looking for them until
they are up and running.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Previously, when a valgrind memleak was discovered, it would cause a
catastrophic pytest failure. Now it correctly fails the current pytest
as intended.
As a result of this fix --valgrind-memleaks now works in distributed
pytest mode as well.
Signed-off-by: Christian Hopps <chopps@labn.net>
Revert the accidental enabling of the optional memleak tests that came
with the large micronet changeset.
Signed-off-by: Christian Hopps <chopps@labn.net>
The nexthop group code is installing routes and nexthop groups
and immediately expecting zebra to have processed the results
as a result there is a situation when the CI system is under
intense load that the nexthop group might not have been processed.
Add a bit of code to allow the test to give FRR some time
to finish work before declaring it not working.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When the CI system is heavily loaded, we might see the following failures:
```
test failed at "test_config_timing/test_static_timing": assert 20.083204 <= 19.487716
```
Currently we allow each step to run 2 times slower than the initial
measurement. Let's allow them to run 3 times slower.
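A small worked example of the new budget. It assumes the limit in the log above (19.487716) was twice the initial measurement, which is an inference from this message; `within_budget` is a hypothetical helper:

```python
def within_budget(base_seconds, measured_seconds, factor=3.0):
    """True if a step ran no more than factor times slower than the base."""
    return measured_seconds <= base_seconds * factor


# Assumption: the failing limit in the log was 2x the initial measurement.
base = 19.487716 / 2
measured = 20.083204

print(within_budget(base, measured, factor=2.0))  # False
print(within_budget(base, measured, factor=3.0))  # True
```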
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
On the first step, the test creates 10000 static routes. It passes 10000
to `get_ip_networks` and it generates 10000 /22 routes.
On the fourth step, the test tries to remove 5000 previously created
routes. It passes 5000 to `get_ip_networks` and here starts the problem.
Instead of generating 5000 /22 routes, it generates 5000 /21 routes. And
the whole step is a no-op, we constantly see the following logs:
```
% Refusing to remove a non-existent route
```
To consistently generate same routes, `get_ip_networks` must always use
the same prefix length.
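A minimal sketch of the idea using Python's `ipaddress` module. This is not the `get_ip_networks` implementation; `fixed_len_networks` and the `10.0.0.0/8` super-prefix are hypothetical choices for illustration:

```python
import ipaddress
from itertools import islice


def fixed_len_networks(super_prefix, count, prefixlen=22):
    """Return the first count subnets of a fixed prefix length."""
    net = ipaddress.ip_network(super_prefix)
    # subnets() yields in address order, so a smaller count is always
    # a leading slice of a larger one.
    return [str(n) for n in islice(net.subnets(new_prefix=prefixlen), count)]


created = fixed_len_networks("10.0.0.0/8", 10000)
removed = fixed_len_networks("10.0.0.0/8", 5000)

print(created[0], created[1])      # 10.0.0.0/22 10.0.4.0/22
print(removed == created[:5000])   # True
```

Since the prefix length no longer depends on the requested count, the removal step targets routes that actually exist.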
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Our topotests send SIGBUS 2 seconds after a SIGTERM is
initiated. This is bad because under a heavily loaded
topotest system we may have a case where the system has
not had a chance to properly shut down the daemon.
Extend the time greatly before topotests send SIGBUS.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This removes a giant `switch { }` block from lib/zclient.c and
harmonizes all zclient callback function types to be the same (some had
a subset of the args, some had a void return, now they all have
ZAPI_CALLBACK_ARGS and int return.)
Apart from getting rid of the giant switch, this is a minor security
benefit since the function pointers are now in a `const` array, so they
can't be overwritten by e.g. heap overflows for code execution anymore.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
*_anywhere(item) returns whether an item is on _any_ container. Only
available for unsorted containers for now.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
This provides a "is this item on this list" check, which may or may not
be faster than using *_find() for the same purpose. (If the container
has no faster way of doing it, it falls back to using *_find().)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Even if it doesn't matter for a unit test in general, it hides actual
leaks in the code being tested. Fix so any leaks will be actual bugs.
(Currently there aren't any, yay.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
This script is failing occasionally in our upstream topotests.
It was changing route-maps and attempting to see if
summarization was working correctly. The code appeared to be
attempting to add route-maps to redistribution in ospf, then
modifying the route-maps' behavior to affect summarization
as well as the metric type of that summarization.
The problem is of course that ospf does not appear to modify
the summary route's metric-type when the components
of that summary change their metric-type. So the test
is testing nothing. In addition the test had messed
up the usage of the route-map generation code and all
the generated config was in different sequence numbers,
but route-map processing would never get to those
new sequence numbers because of how route-maps are processed.
Let's just remove this part of the test instead of trying
to unwind it into anything meaningful.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Several tests used the route_map_create functionality
with `metric-type` but never bothered to add the
backend code to ensure it works correctly.
Add it in so it can be used.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
We have this pattern in this test:
    # Let's kill the interface on rt2 and see what happens with the RIB and BFD on rt1
    tgen.gears["rt2"].link_enable("eth-rt1", enabled=False)
    # By default BFD provides a recovery time of 900ms plus jitter, so let's wait
    # initial 2 seconds to let the CI not suffer.
    topotest.sleep(2, 'Wait for BFD down notification')
    router_compare_json_output(
        "rt1", "show ip route ospf json", "step3/show_ip_route_rt2_down.ref", 1, 0
    )
Under a heavy CI load, interface down events and then reacting to them may not actually
happen within 2 seconds. Allow some more grace time in the test to ensure that we
react to it in an appropriate manner.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When OSPF decides whom it should elect as DR and backup, it
uses a process that prioritizes network stability over always
producing the same DR / backup result.
Essentially if we have r1 ----- r2
Let's say r1 has a higher priority, but r2 comes up first, starts
sending hello packets and then decides that it is the DR. At some
point in time in the future, r1 comes up and then connects to r2
at that point it sees that r2 has elected itself DR and it keeps
it that way.
This is by design of the system. With our tight ospf timers as
well as the high load experienced on our test systems, there
are a bunch of ospf tests for which we cannot guarantee that a
consistent DR will be elected. As such let's not
even pretend that we care a bunch and just look for `Full`.
If we care about `ordering` we need to spend more time getting
the tests to actually start routers and ensure that they are up and
running in the right order so that priority can take effect.
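A rough sketch of the relaxed check, assuming neighbor states are reported as strings such as `Full/DR` or `Full/Backup`. The helper name and the input shape are hypothetical, not the topotest library's API.

```python
def all_neighbors_full(neighbor_states):
    """neighbor_states maps neighbor id -> state string such as "Full/DR"."""
    if not neighbor_states:
        return False
    # Only the adjacency state matters; the DR/Backup role is ignored.
    return all(
        state.split("/", 1)[0] == "Full" for state in neighbor_states.values()
    )
```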
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Fix a loop in the setup phase of isis_topo1_vrf: only configure
interfaces that each router actually has.
Signed-off-by: Mark Stapp <mstapp@nvidia.com>
Ensure GR helpers have received a Grace-LSA before killing the
ospfd/ospf6d process that is undergoing a graceful restart.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
There's no more difference between number-named and word-named access-lists.
This commit removes separate arguments for number-named ACLs from CLI.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
BGP LU will use implicit-null in more situations now; adjust
the original LU topotest to align with that. Node R2 uses
imp-null now, while R1 continues to allocate labels.
Signed-off-by: Mark Stapp <mstapp@nvidia.com>
Add a second BGP labelled-unicast (BGP-LU) test suite, with
an additional router and some additional tests.
Signed-off-by: Mark Stapp <mstapp@nvidia.com>
2 things:
a) Each test was setting up for graceful restart with calls to
`graceful-restart prepare ip[v6] ospf`, then sleeping for
3 or 5 seconds. Then killing the ospf process. Under heavy
load there is no guarantee that zebra has received/processed
this signal. Write some code to ensure that this happens
b) Tests are issuing commands in this order:
1) issue gr prepare command
2) kill router
3) <ensure routes were still installed in zebra>
4) start router
5) <ensure routes were still installed in zebra>
Imagine that the system is under some load and there is
a small amount of time before step 5 happens. In this
case ospf could have come up and started neighbor relations
and also started installing routes. If zebra receives
a new route before step 5 is issued then the route could
be in a state where it is not installed, because it is
being sent to the kernel for installation. This would
fail the test because it would only look 1 time. This
is fixed by giving time on restart for the routes to
be in the installed state.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Any command that uses `peer_lookup_in_view` crashes when "vrf all" is
used, because bgp is NULL in this case.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
FRR should only ever use the appropriate THREAD_ON/THREAD_OFF
semantics. This is especially true for the functions we
end up calling the thread for.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Remove references to the deprecated "CLI()" function; clean up
a couple of string escapes; make one test-case sensitive to
previous failures.
Signed-off-by: Mark Stapp <mstapp@nvidia.com>
Some tests had commented-out references to the old "CLI()"
function. Remove those so they're not confusing in the future,
and replace at least one with a comment that uses the
'mininet_cli()' function.
Signed-off-by: Mark Stapp <mstapp@nvidia.com>
* Add new debug directives for NSSA LSAs;
* Remove the "debug ospf6 gr helper" command since it doesn't make
sense for this test (not to mention it was renamed to "debug ospf6
graceful-restart");
* Migrate to the new interface-level command to enable OSPFv3 on
interfaces ("interface WORD area A.B.C.D" was deprecated).
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Document the `sleep` statement so people know that we are sleeping
because we are waiting for the BFD down notification. If we don't
sleep here it is possible that we get outdated `show` command results.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Call the `show` commands less often to reduce the CPU pressure.
Also increase the wait time from 60 to 80 seconds to have spare room
for failures (4 times more). This is the latest measured wait time:
> INFO: topolog: 'router_json_cmp' succeeded after 20.08 seconds
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Reduce timers so we send hello packets more often and reduce dead
interval to converge faster.
Previous test wait amount:
> INFO: topolog: 'router_json_cmp' succeeded after 47.20 seconds
New test wait amount:
> INFO: topolog: 'router_json_cmp' succeeded after 20.08 seconds
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
FRR should only ever use the appropriate THREAD_ON/THREAD_OFF
semantics. This is especially true for the functions we
end up calling the thread for.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Add the "default-information-originate" option to the "area X nssa"
command. That option allows the origination of Type-7 default routes
on NSSA ABRs and ASBRs.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
The route created by the "default-information-originate" command
isn't a regular external route. As such, an NSSA ABR shouldn't
originate a corresponding Type-7 LSA for it (there's a separate
configuration knob to generate Type-7 default routes).
While here, fix a small issue in ospf6_asbr_redistribute_add()
where routes created by "default-information-originate" were being
displayed with an incorrect "unknown" type.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Fix wrong comparison since route->path.metric_type is always set
to either 1 or 2. The OSPF6_PATH_TYPE_EXTERNAL2 constant, whose
value is 4, refers to a route type so its usage was incorrect here.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Considering that both the GR helper mode and restarting mode can be
enabled at the same time, the "graceful-restart helper-only" command
can be a bit misleading since it implies that only the helper mode
is enabled. Rename the command to "graceful-restart helper enable"
to clarify what the command does.
Start a deprecation cycle of one year before removing the original
command
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Because the vrf backend may be based on namespaces, each vrf can
use table identifiers in the [16-(2^32-1)] range for daemons that
request them. Extend the table manager to be hosted per vrf.
That possibility is disabled when the vrf backend is vrflite.
In that case, all vrf contexts use the same table manager instance.
Add a configuration command to configure the desired
range of tables to use. This makes it possible to give
chunks to the bgp daemon when it works with bgp flowspec entries and
wants to use specific iptables that do not override vrf tables.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Considering that both the GR helper mode and restarting mode can be
enabled at the same time, the "graceful-restart helper-only" command
can be a bit misleading since it implies that only the helper mode
is enabled. Rename the command to "graceful-restart helper enable"
to clarify what the command does.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Issue #9535 describes how the export-list/import-list commands work
differently on ospfd and ospf6d.
In short:
* On ospfd, "area A.B.C.D export-list" filters which internal
routes an ABR exports to other areas. On ospf6d, instead, that
command filters which inter-area routes an ABR exports to the
configured area (which is quite counter-intuitive). In other words,
both commands do the same but in opposite directions.
* On ospfd, "area A.B.C.D import-list" filters which inter-area
routes an ABR imports into the configured area. On ospf6d, that
command filters which inter-area routes an interior router accepts.
* On both daemons, "area A.B.C.D filter-list prefix NAME <in|out>"
works exactly the same as import/export lists, but using prefix-lists
instead of ACLs.
The inconsistency on how those commands work is undesirable. This
PR proposes to adapt the ospf6d commands to behave like they do
in ospfd.
These changes are obviously backward incompatible and this PR doesn't
propose any mitigation strategy other than warning users about the
changes in the next release notes. Since these ospf6d commands are
undocumented and work in such a peculiar way, it's unlikely many
users will be affected (if any at all).
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
A bunch of tests have this pattern:
a) Install a new prefix into bgp
b) Run this loop:
foreach (router in topology) {
verify_bgp_rib(router)
}
This is to ensure that the prefix is actually disseminated.
The problem with this, of course, is that a wait of 2 seconds
for every item in that loop makes no sense: the initial
router verification of its bgp rib will wait 2 seconds, and by then
all the remaining bgp routers in the topology will have gotten
the data. So we end up waiting a bunch of extra time.
Remove the initial_wait time for verify_bgp_rib. Also
increase the failure wait time to 30 seconds. This is
to give a bigger window for bgp to send its data for
our test systems that could be under heavy load. In the
normal case tests will never hit this.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add a new topotest that features a topology with seven routers spread
across four OSPF areas:
* 1 backbone area;
* 1 regular non-backbone area (0.0.0.1);
* 1 stub area (0.0.0.2);
* 1 NSSA area (0.0.0.3).
All routers have both GR and GR helper functionality enabled in
the configuration. The test consists of restarting each router,
one at time, and checking that all forwarding planes (and LSDBs)
are kept intact during those restarts.
A successful run takes about three minutes to finish.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Compilation is warning that a memcpy is only copying
the first (sizeof pointer) bytes into memory. This is not
what we really want. Although it does beg the question about
why this memcpy is needed (or what it is doing). I'm going
to just fix the memcpy and call it a day.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
frrmod_load() attempts to dlopen() several possible paths
(constructed from its basename argument) until one succeeds.
Each dlopen() attempt may fail for a different reason, and
the important one might not be the last one. Example:
dlopen(a/foo): file not found
dlopen(b/foo): symbol "bar" missing
dlopen(c/foo): file not found
Previous code reported only the most recent error. Now frrmod_load()
describes each dlopen() failure.
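The reporting idea, sketched in Python rather than C. `load_first` and its `loader` argument are hypothetical names used for illustration, not part of frrmod.

```python
def load_first(paths, loader):
    """Try each path in order; on total failure, report every error."""
    errors = []
    for path in paths:
        try:
            return loader(path)
        except OSError as exc:
            # Remember this failure instead of overwriting the previous one.
            errors.append("dlopen(%s): %s" % (path, exc))
    raise OSError("\n".join(errors))
```

Passing `ctypes.CDLL` as `loader` would attempt real library loads, since it raises `OSError` when loading fails.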
Signed-off-by: G. Paul Ziemba <paulz@labn.net>
1. Optimized test: test_clear_pim_neighbors_and_mroute_p0 run time by clearing
mroute and verifying mroutes separately. Execution time is reduced from almost 10 mins
to ~220 sec.
Signed-off-by: Kuldeep Kashyap <kashyapk@vmware.com>
route_scale run is 500+ seconds. Break it up into
2 separate tests. This should reduce run time a slight
bit.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Fix issue of topotest failures with BGP status Connect or Idle
instead of the expected Active
Signed-off-by: Martin Winter <mwinter@opensourcerouting.org>
Modernize the test a bit, generate expected results rather than load from
file, and add a general json_cmp with retry function and use it.
Signed-off-by: Christian Hopps <chopps@labn.net>
- Update the template and documentation to use newer pytest fixtures for
setup and teardown, as well as skipping tests when the suite fails.
Signed-off-by: Christian Hopps <chopps@labn.net>
When looking for an implied host route it is not necessary
to add the `/32` to an ip route add. As such masks
will not be set in this case. Set the value of masks
to a known good value so that when the route installation
fails the test for it actually being there will tell you
that the route is not there -vs- complaining about mask
being uninited.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
There is no need to add calls to addKernelRoutes for
groups. They do not need to be routed via the
normal kernel methodology.
Tests run successfully with this change.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
- Fix xterm support to work, previously it mostly didn't, now it should
in all cases (i.e., single or dist mode).
- Catch when the user tries to use various topotests features that
require a window (e.g., --cli-on-error) but isn't running under a supported
system (e.g., byobu/tmux/xterm), and fail the run with an explanation.
Signed-off-by: Christian Hopps <chopps@labn.net>
If the SRv6 locator is deleted in zebra, zclient(bgpd)
which allocates SIDs from the locator will update the
RIBs which use those SIDs and make them invalid.
This will cause the VPNv6 route to be withdrawn and
the VPN to stop.
If the SRv6 locator is added again, zclient(bgpd) will
allocate the SIDs from the locator again, and VPNv6
will be re-established.
This commit adds a test case to confirm this.
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
Before this PR, in case of get locator chunk zapi from
zclient, zebra precreated a down state locator and set
the chunk ownership. After this PR, this is no longer
done, and chunks are no longer automatically generated.
In this commit, we will make a test update to check the
corresponding detailed behavior.
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
Description:
Change is intended for fixing the following issues related to vrf route leaking:
Routes with special nexthops i.e. blackhole/sink routes when imported,
are not programmed into the FIB and corresponding nexthop is set as 'inactive',
nexthop interface as 'unknown'.
While importing/leaking routes between VRFs, in case of special nexthop(ipv4/ipv6)
once bgp announces route(s) to zebra, nexthop type is incorrectly set as
NEXTHOP_TYPE_IPV6_IFINDEX/NEXTHOP_TYPE_IFINDEX
i.e. directly connected even though we are not able to resolve through an interface.
This leads to nexthop_active_check marking nexthop !NEXTHOP_FLAG_ACTIVE.
Unable to find the active nexthop(s), route is not programmed into the FIB.
Whenever BGP leaks routes, set the correct nexthop type, so that route gets resolved
and correctly programmed into the FIB, in the imported vrf.
Co-authored-by: Kantesh Mundaragi <kmundaragi@vmware.com>
Signed-off-by: Iqra Siddiqui <imujeebsiddi@vmware.com>
Create a pid file for the router created by topotest.
By executing nsenter directly against this pid, developers
can execute commands directly from outside the unet shell.
This allows the developer to use script, tab completion, etc.,
and improves efficiency.
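For illustration, a helper along these lines could build the nsenter command from such a pid file. The pid file location and the choice of namespaces (network and mount) are assumptions. The function name is hypothetical.

```python
def nsenter_argv(pid_file, command):
    """Build an nsenter command line targeting the pid stored in `pid_file`."""
    with open(pid_file) as f:
        pid = int(f.read().strip())
    # Assumption: entering the network and mount namespaces is sufficient.
    return ["nsenter", "--target", str(pid), "--net", "--mount"] + list(command)
```

The resulting list can be passed to `subprocess.run()` (typically as root) to run e.g. `vtysh` inside the router.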
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
Refactor the bgp_auth test to create common_config code to allow
non-json based tests to reset routers and load configs in parallel.
Signed-off-by: Christian Hopps <chopps@labn.net>
- Reduce OSPF timers to 1 and 4
- Reduce BGP connect timer to 5
- Apply configs in parallel as single file
- Remove the switches as all links are p2p, perhaps this will help with
reliability?
Signed-off-by: Christian Hopps <chopps@labn.net>
Utilizes new pytest fixtures to completely factor out setup and teardown
functionality. Supply the JSON config and write your tests.
"The best topotest template yet!"
Signed-off-by: Christian Hopps <chopps@labn.net>
New generic script uses a new default node specific log dir to avoid
collisions when running in parallel.
Signed-off-by: Christian Hopps <chopps@labn.net>
- The PIM tests do not need kernel routes to help them bind joins and
sources to specific interfaces. They should do that themselves directly.
Also do not change system wide "rp_filter" sysctl away from the value
required by everyone else.
Signed-off-by: Christian Hopps <chopps@labn.net>
There were some tests where we were turning on mpls on
interface names that don't exist for certain `machines`
in the topology. Fix.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This is to avoid breaking changes between existing deployments of
extended community for bandwidth encoding. By default FRR uses uint32
to encode bandwidth, which is not as the draft requires (IEEE floating-point).
This switch enables the required encoding per-peer.
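To show why the two encodings are not interchangeable, here is a small sketch covering only the 4-byte bandwidth value (not the whole extended community). Network byte order and the example value are assumptions for illustration.

```python
import struct


def encode_uint32(bandwidth):
    """Encode the value as an unsigned 32-bit integer."""
    return struct.pack("!I", int(bandwidth))


def encode_ieee_float(bandwidth):
    """Encode the value as a 32-bit IEEE floating-point number."""
    return struct.pack("!f", float(bandwidth))


value = 1250000
as_int = encode_uint32(value)
as_float = encode_ieee_float(value)
print(as_int.hex(), as_float.hex())
# A receiver expecting a float misreads the integer encoding:
print(struct.unpack("!f", as_int)[0])
```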
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
This allows defining a CLI command like this:
`[no] some setting ![VALUE]`
with VALUE being optional for the "no" form, but required for the
positive form. It's just a `[...]` where the empty branch can only be
taken for commands starting with `no`.
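The accepted inputs can be modelled with a toy matcher. This is a sketch of the semantics only, for this one example command, and is unrelated to the real CLI graph code.

```python
def accepts(line):
    """Toy matcher for the single command `[no] some setting ![VALUE]`."""
    tokens = line.split()
    negated = tokens[:1] == ["no"]
    rest = tokens[1:] if negated else tokens
    if rest[:2] != ["some", "setting"]:
        return False
    args = rest[2:]
    if len(args) == 1:
        return True  # VALUE given: fine for both forms
    # The empty branch is only allowed for the `no` form.
    return negated and not args
```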
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Add the "metric" and "metric-type" options to the "redistribute"
command.
This is a small commit since the logic of setting the metric
value and type of external routes was already present due to the
implementation of the "default-information originate" command months
ago. This commit merely extends the "redistribute" command to
leverage that functionality.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Also, update the ospf6_topo2 topotest since the expected output
was wrong. With this fix, NSSA routes will be created on r2
("redistribute connected"), and NSSA routes appear in the routing
table as regular external routes.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
OSPF mixes uses of "delete" and "del_action" depending on which library
function is called. It's a bug-prone mess that needs fixing; however, for
now we fix the one obvious incorrect use in this test.
Signed-off-by: Christian Hopps <chopps@labn.net>
There is a possibility that the same line can be matched as a command in
some node and its parent node. In this case, when reading the config,
this line is always executed as a command of the child node.
For example, with the following config:
```
router ospf
network 193.168.0.0/16 area 0
!
mpls ldp
discovery hello interval 111
!
```
Line `mpls ldp` is processed as command `mpls ldp-sync` inside the
`router ospf` node. This leads to a complete loss of `mpls ldp` node
configuration.
To eliminate this issue and all possible similar issues, let's print an
explicit "exit" at the end of every node config.
This commit also changes indentation for a couple of existing exit
commands so that all existing commands are on the same level as their
corresponding node-entering commands.
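The ambiguity can be reproduced with a toy config reader. The command table below is hypothetical and far smaller than the real CLI. It only models "try the current node first, then the parent" plus abbreviation matching.

```python
# Toy model: hypothetical command sets, not FRR's real CLI tree.
NODE_COMMANDS = {
    "config": ["router ospf", "mpls ldp"],
    "router ospf": ["network", "mpls ldp-sync"],
    "mpls ldp": ["discovery hello interval"],
}


def matches(line, command):
    """True if each command word is abbreviated by the matching line word."""
    lw, cw = line.split(), command.split()
    return len(lw) >= len(cw) and all(c.startswith(l) for l, c in zip(lw, cw))


def parse(config_text):
    """Return (node, line) pairs; node is None for lines nobody accepts."""
    node, result = "config", []
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line == "!":
            continue
        if line == "exit":
            node = "config"
            continue
        for candidate in (node, "config"):
            cmd = next((c for c in NODE_COMMANDS[candidate] if matches(line, c)), None)
            if cmd is not None:
                result.append((candidate, line))
                if candidate == "config":
                    node = cmd  # entering a new node
                break
        else:
            result.append((None, line))
    return result


WITHOUT_EXIT = """router ospf
 network 193.168.0.0/16 area 0
!
mpls ldp
 discovery hello interval 111
!
"""
WITH_EXIT = WITHOUT_EXIT.replace("area 0\n!", "area 0\nexit\n!")
print(parse(WITHOUT_EXIT))
print(parse(WITH_EXIT))
```

Without the explicit `exit`, `mpls ldp` is consumed inside `router ospf` and the following `discovery` line has no node to land in.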
Fixes #9206.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
In particular, the fixed 2 second sleep here was not long enough.
Switch to standard run_and_expect polling to make test more robust.
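The polling idea, as a standalone sketch in the spirit of run_and_expect. This is not the topotest library code. The helper name and defaults are hypothetical.

```python
import time


def poll_until(func, expected, count=20, wait=0.5):
    """Call `func` up to `count` times until it returns `expected`.

    Returns (success, last_result).
    """
    result = None
    for attempt in range(count):
        result = func()
        if result == expected:
            return True, result
        if attempt < count - 1:
            time.sleep(wait)
    return False, result
```

Unlike a fixed sleep, this returns as soon as the expected state shows up and only uses the full budget on a slow system.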
Signed-off-by: Christian Hopps <chopps@labn.net>
- In order to run tests in parallel the netns-based vrfs need to
have unique names primarily bc they are all tracked/looked-up in
`/run/netns` which is not network namespace nesting friendly
- use ip(8) exclusively rather than a mix of `ip` and `ifconfig`
and `vconfig`, reducing required pkg count by a couple.
Signed-off-by: Christian Hopps <chopps@labn.net>
- bugs in the support library function `verify_gr_address_family`
  allowed this test to pass depending on ordering of python dictionary
keys. Fix the bugs, fix the test.
Signed-off-by: Christian Hopps <chopps@labn.net>
Related: http://docs.frrouting.org/projects/dev-guide/en/latest/topotests.html
Directory name for a new topotest must not contain hyphen (-) characters.
To separate words, use underscores (_). For example, tests/topotests/bgp_new_example.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Until we get all MSDP features (MSDP route validation via BGP
AS Path as described in RFC 4611 Section 2), kill one of the links of
the topology to avoid intermittent test failures due to different
traffic route.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Remove a 200 second sleep from bgp-evpn-overlay-index-gateway.
There does not seem to be any evidence that this is needed
and I cannot make the test fail without it.
Fixes: #9035
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
- A more general fix for the bgp listener test which requires interfaces be
configured in the kernel when the bgpd daemons are launched.
Signed-off-by: Christian Hopps <chopps@labn.net>
The test_simple_snmp.py test starts bgp, zebra and snmpd at the
same time. Then zebra configuration is read in and interface
addresses are applied. If snmp start slower than zebra
the snmp process can properly get it's ip address to bind to
if it is faster than zebra, it will fail. Ensure that the
test has addresses before we start daemons.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When running this test on a locally loaded system I am seeing the
static route as `queued` still after 1 second. Let's just blanket
increase the timeout to something longer to give a very loaded system
more time to install the route.
Output on my test system when it was loaded:
INFO topolog.r1:topogen.py:880 vtysh result:
{
"4.5.1.0/24":[
{
"prefix":"4.5.1.0/24",
"prefixLen":24,
"protocol":"static",
"vrfId":0,
"vrfName":"default",
"selected":true,
"destSelected":true,
"distance":1,
"metric":0,
"queued":true,
"table":254,
"internalStatus":8,
"internalFlags":73,
"internalNextHopNum":1,
"internalNextHopActiveNum":1,
"uptime":"00:00:00",
"nexthops":[
{
"flags":1,
"ip":"192.168.216.3",
"afi":"ipv4",
"interfaceIndex":11,
"interfaceName":"r1-eth6",
"active":true,
"weight":1
}
]
},
I suspect 10 seconds should be enough (I would hope).
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Tests should have low enough overhead that sending
the join/prune every 5 seconds should be sufficient.
It should also allow us to converge faster in case of
dropped packets.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
frrscript_load now loads a function instead of a file, so frrscript_unload
should be renamed since it does not unload a function.
Signed-off-by: Donald Lee <dlqs@gmx.com>
- Remove incorrect requirement for `service integrated-vtysh-config`
when producing a delta.
- Add `--test-reset` option which suppresses non-parseable lines from the
produced delta
- Use new features in common_config.py
Signed-off-by: Christian Hopps <chopps@labn.net>
TMUX and Screen support when running topotests inside docker. This
allows the gdb, shell and vtysh features to correctly work even when
running the tests inside docker.
Add options:
--asan-abort :: aborts the process on ASAN errors
--strace-daemons :: strace some or all daemons
Signed-off-by: Christian Hopps <chopps@labn.net>
Some BGP updates received by BGP invite the local router to
install a route through itself. The system will not do it, and
the route should be considered invalid as early as possible.
This case is detected in zebra, and this detection prevents
zebra from trying to install this route to the local system. However,
the nexthop tracking mechanism is called, and acts as if the route
was valid, which is not the case.
By detecting that case in BGP, we avoid installing the invalid
routes.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Since the common CLI code calls nb_init, allow specifying some modules
to load by overriding test_yang_models.
Signed-off-by: David Lamparter <equinox@diac24.net>
Add a new topotest that features a topology with seven routers spread
across four OSPF areas:
* 1 backbone area;
* 1 regular non-backbone area (0.0.0.1);
* 1 stub area (0.0.0.2);
* 1 NSSA area (0.0.0.3).
All routers have both GR and GR helper functionality enabled in
the configuration. The test consists of restarting each router,
one at time, and checking that all forwarding planes (and LSDBs)
are kept intact during those restarts.
A successful run takes about three minutes to finish.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Using "write memory" to save the daemons' configurations before
restarting them can cause log files to stop working correctly. Add
a new "save_config" to the kill_router_daemons() function to prevent
that from happening when saving the configurations isn't necessary.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Speedup (large topo): OLD: ~6 minutes NEW: ~1 second
(when paired with generate_support_bundle.py changes)
- Collect from each node in parallel
Bug fixes:
- sub-directory test name was the same internal pytest function name
for any test, and not the actual test name.
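Collecting from each node in parallel can be sketched with a thread pool. `collect_all` and `collect_one` are hypothetical names, and the real support-bundle code may be structured differently.

```python
from concurrent.futures import ThreadPoolExecutor


def collect_all(nodes, collect_one, max_workers=8):
    """Run `collect_one(node)` for every node concurrently.

    Returns a dict mapping node -> result, in the order of `nodes`.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, whatever the finish order.
        return dict(zip(nodes, pool.map(collect_one, nodes)))
```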
Signed-off-by: Christian Hopps <chopps@labn.net>
On a loaded machine running FRR with ASAN I've got the following result:
INFO: waiting MSDP connection from peer 10.254.254.3 on router r1
INFO: 'router_json_cmp' polling started (interval 1 secs, maximum 30 tries)
INFO: 'router_json_cmp' succeeded after 22.53 seconds
Which is very close to the limit, so let's bump the value 4x to avoid a
test false positive.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
This python fixture was way too complex for what is needed.
Eliminate gratuitous options/over-engineering:
- Change from non-deterministic `wait` and `attempts` to a single
`retry_timeout` value. This is both more deterministic, as well as
what the user should actually be thinking about.
- Use a fixed 2 second pause between executing the wrapped function
rather than a bunch of arbitrary choices of 2, 3 and 4 seconds
spread all over the test code.
- Get rid of the multiple variables for determining what "Positive" and
"Negative" results are. Instead just implement what all the user code
already wants, i.e., boolean False or a str (errormsg) means
"Negative" result otherwise it's a "Positive" result.
- As part of the above the inversion logic is much more comprehensible
in the fixture code (and more correct to boot).
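A simplified sketch of the semantics described above: a single `retry_timeout`, a fixed pause, and "False or str means Negative". It is an illustration with assumed parameter names, not the actual fixture from the topotest library.

```python
import functools
import time


def retry(retry_timeout, pause=2.0, expected=True):
    """Re-run the wrapped function until it yields the expected kind of result.

    A return value of False or a str (error message) is "Negative";
    anything else is "Positive". expected=False inverts what we wait for.
    """

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            deadline = time.monotonic() + retry_timeout
            while True:
                ret = func(*args, **kwargs)
                positive = not (ret is False or isinstance(ret, str))
                if positive == expected:
                    return ret
                if time.monotonic() >= deadline:
                    return ret  # give up, hand back the last result
                time.sleep(pause)

        return wrapper

    return decorator
```

Callers only have to think about how long they are willing to wait overall, which is the point of replacing `wait` and `attempts`.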
Signed-off-by: Christian Hopps <chopps@labn.net>
Pylint cleanup in commit 914faab594 removed a crucial function
parameter that inverted the logic of verify function calls.
Signed-off-by: Christian Hopps <chopps@labn.net>
Modify both the default and vrf ospf6 topologies to include a test
where write-multiplier is configured to a non-default value and
the ospf6 neighbors are reset then checked.
Run black on both test files.
Signed-off-by: Pat Ruddy <pat@voltanet.io>
Currently, the dynamic hostname cache is global. It is incorrect because
neighbors in different VRFs may have the same system ID and different
hostnames.
This also fixes a memory leak - when the instance is deleted, the cache
must be cleaned up and the cleanup thread must be cancelled.
Fixes #8832.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
ospf6d (and all other daemons except zebra) doesn't correctly process
`interface X vrf Y`, because it doesn't know existing VRFs at the time
of configuration file reading. Therefore it doesn't apply configuration
provided in the interface node.
Fix the problem by removing `vrf Y` part, having just an interface name
is enough.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Add a terse option to show bgp summary to shorten output.
Do not show the following information about the BGP
instances: the number of RIB entries, the table version and the used memory.
The "terse" option can be used in combination with the "remote-as", "neighbor",
"failed" and "established" filters, and with the "wide" option as well.
Before patch:
ubuntu# show bgp summary remote-as 123456
IPv4 Unicast Summary (VRF default):
BGP router identifier X.X.X.X, local AS number XXX vrf-id 0
BGP table version 0
RIB entries 3, using 552 bytes of memory
Peers 5, using 3635 KiB of memory
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd PfxSnt Desc
10.200.200.2 4 123456 81432 4 0 56092 0 00:00:13 572106 0 N/A
Displayed neighbors 1
Total number of neighbors 4
IPv6 Unicast Summary (VRF default):
BGP router identifier X.X.X.X, local AS number XXX vrf-id 0
BGP table version 0
RIB entries 3, using 552 bytes of memory
Peers 5, using 3635 KiB of memory
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd PfxSnt Desc
% No matching neighbor
Total number of neighbors 5
After patch:
ubuntu# show bgp summary remote-as 123456 terse
IPv4 Unicast Summary (VRF default):
BGP router identifier X.X.X.X, local AS number XXX vrf-id 0
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd PfxSnt Desc
10.200.200.2 4 123456 81432 4 0 56092 0 00:00:13 572106 0 N/A
Displayed neighbors 1
Total number of neighbors 4
IPv6 Unicast Summary (VRF default):
BGP router identifier X.X.X.X, local AS number XXX vrf-id 1
% No matching neighbor
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
When filtering sessions on show bgp summary with failed, established,
neighbor and remote-as options, add a counter of displayed neighbors
in addition to the total number of neighbors:
"Displayed neighbors X"
ubuntu# show bgp summary failed remote-as external
IPv4 Unicast Summary (VRF default):
Neighbor EstdCnt DropCnt ResetTime Reason
10.200.200.2 0 0 never Waiting for NHT
172.16.29.2 0 0 never Waiting for NHT
10.22.1.2 0 0 never Waiting for NHT
Displayed neighbors 3
Total number of neighbors 5
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Display on which VRF/view the neighbor was not found. Useful when
selecting "vrf all".
Before patch:
No such neighbor in this view/vrf
After patch:
No such neighbor in VRF default
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Issue: There was an error reported by Pylint regarding "expected" keyword:
Unexpected keyword argument 'expected' in function call (unexpected-keyword-arg)
Fix: We have defined expected keyword in all topojson APIs.
Signed-off-by: Kuldeep Kashyap <kashyapk@vmware.com>
Following functionality is covered:
      +--------+   BGP   +--------+ BGP  +--------+      +--------+
 SN1  |        | IPv4/v6 |        | EVPN |        |      |        |
======+ Host1  +---------+  PE1   +------+  PE2   +------+ Host2  +
      |        |         |        |      |        |      |        |
      +--------+         +--------+      +--------+      +--------+
Host1 is connected to PE1 and Host2 is connected to PE2.
Host1 and PE1 have IPv4/v6 BGP sessions.
PE1 and PE2 have an EVPN session.
Host1 advertises IPv4/v6 prefixes to PE1.
PE1 advertises these prefixes to PE2 as EVPN type-5 routes.
Gateway IP for these EVPN type-5 routes is host1 IP.
Host1 MAC/IP is advertised by PE1 as EVPN type-2 route
Following testcases are covered:
TC_1:
Check BGP and zebra states for above topology at PE1 and PE2.
TC_2:
Stop advertising prefixes from host1. It should withdraw type-5 routes. Check
states at PE1 and PE2
Advertise the prefixes again. Check states.
TC_3:
Shut down VxLAN interface at PE1. This should withdraw type-2 routes. Check
states at PE1 and PE2.
Enable VxLAN interface again. Check states.
Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
To start we use 10k static route config. This test goes along with
recent batching changes; it will fail w/o them (b/c some operations w/o
batching take 100 times as long).
This test should be added to over time for other large config
items (e.g., acl, policy, etc)
Signed-off-by: Christian Hopps <chopps@labn.net>
Test uses staticd which required some C++ header protections.
Additionally, the test also runs in the ubuntu20 docker container as
grpc is supported there by the packaging system.
Signed-off-by: Christian Hopps <chopps@labn.net>
Similarly to what was done for IS-IS in commit 01d43141, combine
the SRGB and SRLB commands for OSPF-SR, so that we can replace
overlapping ranges in a single change.
Also allow the range configuration to be stored before SR is enabled.
There is no reason why we should not - in fact that constraint meant
that we were always requesting the default label ranges regardless
of what we actually wanted to use.
Finally, update the topotests now that we do not need to refresh
the SRGB/SRLB/MSD after disabling SR. Note that the prefix-sid still
needs to be re-added.
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
Modify VRF/view display in show bgp summary:
- to be more concise
- to display on which VRF/view no neighbor was found
Before patch:
ubuntu# show bgp vrf all summary
Instance default:
IPv4 Unicast Summary:
BGP router identifier XX.XX.XX.XX, local AS number XXXX vrf-id 0
(...)
IPv6 Unicast Summary:
Instance private:
IPv4 Unicast Summary:
ubuntu# show bgp vrf all ipv4 multicast summary
% No BGP neighbors found
% No BGP neighbors found
After patch:
ubuntu# show bgp vrf all summary
IPv4 Unicast Summary (VRF default):
BGP router identifier XX.XX.XX.XX, local AS number XXXX vrf-id 0
(...)
IPv6 Unicast Summary (VRF default):
(...)
IPv4 Unicast Summary (VRF private):
(...)
ubuntu# show bgp vrf all ipv4 multicast summary
% No BGP neighbors found in VRF default
% No BGP neighbors found in VRF private
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
New OSPFv3 NSSA test:
* When a static route is redistributed to an NSSA router it should be
type 7 and should show up in OSPFv3 route database.
* Test LSA Type 7 and route removal.
Co-authored-by: Soman K.S <somanks@gmail.com>
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
https://github.com/FRRouting/frr/pull/5865#discussion_r597670225
As this comment says, ZEBRA_FLAG_XXX should not have been used to
communicate SRv6 route information. A simple nexthop flag would have
been sufficient for SRv6 information, so this commit reworks the whole
thing that way.
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
The "show sharp segment-routing srv6" command was a
json output command, but it did not follow the common
practice of the other commands.
Following the review, it now outputs JSON only when the json
keyword is given. Otherwise, it produces human
readable output.
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
This commit fixes bgpd's prefix-sid type 4 and 5 feature, which has been
incorrectly implemented since https://github.com/FRRouting/frr/pull/5653
was merged, because some necessary lines were missing.
When bgpd receives multiple update messages with the same service-sid in
the prefix-sid type-5 attribute, bgpd crashes around the reference
count of the path attribute's values object.
This commit also adds a topotest to check that the feature works.
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
This commit is part of the #5853 work.
This commit adds a new topotest to verify the SRv6 manager's functionality.
The following checks are performed in this topotest.
- check that SRv6-locator is set correctly
- check that default SRv6-function locator is set correctly
- check that SRv6-function is installed as ipv6 route correctly
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
This commit checks that seg6local route configuration via ZAPI is
working.
SRv6 is still a young kernel feature, so the netlink interface may be
changed or updated in the future. This ZAPI extension supports a new
routing paradigm, so it should be checked by topotests until the SRv6
feature of the Linux kernel is stable.
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
The backoff code assumed that yang operations always completed quickly.
It checked for > 100 YANG modeled commands happening in under 1 second
to enable batching. If 100 yang modeled commands always take longer than
1 second batching is never enabled. This is the exact opposite of what
we want to happen since batching speeds the operations up.
Here are the results for libyang2 code without and with batching.
| action | 1K rts | 2K rts | 1K rts | 2K rts | 20k rts |
| | nobatch | nobatch | batch | batch | batch |
| Add IPv4 | .881 | 1.28 | .703 | 1.04 | 8.16 |
| Add Same IPv4 | 28.7 | 113 | .590 | .860 | 6.09 |
| Rem 1/2 IPv4 | .376 | .442 | .379 | .435 | 1.44 |
| Add Same IPv4 | 28.7 | 113 | .576 | .841 | 6.02 |
| Rem All IPv4 | 17.4 | 71.8 | .559 | .813 | 5.57 |
(IPv6 numbers are basically the same as IPv4, a couple percent slower)
Clearly we need this. Please note the growth (1K to 2K) w/o batching is
non-linear and 100 times slower than batched.
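The scaling claim can be checked with plain arithmetic on the table values. The sketch below only divides numbers copied from the "Add Same IPv4" and "Rem All IPv4" rows; the helper names are illustrative and not part of the code base.

```python
# Timings copied from the table above (1K routes, 2K routes).
rows = {
    "Add Same IPv4": {"nobatch": (28.7, 113.0), "batch": (0.590, 0.860)},
    "Rem All IPv4": {"nobatch": (17.4, 71.8), "batch": (0.559, 0.813)},
}


def growth(pair):
    """Factor by which the time grows when the route count doubles."""
    one_k, two_k = pair
    return two_k / one_k


def speedup_at_2k(row):
    """How many times faster the batched run is at 2K routes."""
    return row["nobatch"][1] / row["batch"][1]


for name, row in rows.items():
    print(
        f"{name}: nobatch x{growth(row['nobatch']):.2f}, "
        f"batch x{growth(row['batch']):.2f}, "
        f"speedup at 2K x{speedup_at_2k(row):.0f}"
    )
```

Doubling the route count roughly quadruples the unbatched time, which suggests quadratic growth, while the batched time grows by about 1.5x.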
Notes on code: The use of the new `nb_cli_apply_changes_clear_pending`
is to commit any pending changes (including the current one). This is
done when the code would not correctly handle a single diff that
included the current changes with possible following changes. For
example, a "no" command followed by a new value to replace it would be
merged into a change, and the code would not deal well with that. A good
example of this is BGP neighbor peer-group changing. The other use is
after entering a router level (e.g., "router bgp") where the follow-on
command handlers expect that router object to now exist. The code
eventually needs to be cleaned up to not fail in these cases, but that
is for future NB cleanup.
Signed-off-by: Christian Hopps <chopps@labn.net>
There are two possible use-cases for the `vrf_bind` function:
- bind socket to an interface in a vrf
- bind socket to a vrf device
For the former case, there's one problem - success is returned when the
interface is not found. In that case, the socket is left unbound without
throwing an error.
For the latter case, there are multiple possible problems:
- If the name is not set, then the socket is left unbound (zebra, vrrp).
- If the name is "default" and there's an interface with that name in the
default VRF, then the socket is bound to that interface.
- In most daemons, if the router is configured before the VRF is actually
created, we're trying to open and bind the socket right after the
daemon receives a VRF registration from zebra. We may not receive the
VRF-interface registration from zebra yet at that point. Therefore,
`if_lookup_by_name` fails, and the socket is left unbound.
This commit fixes all the issues and updates the function description.
Suggested-by: Pat Ruddy <pat@voltanet.io>
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Add ability to filter session on show bgp summary by neighbor or
remote AS:
ubuntu# show bgp summary ?
neighbor Show only the specified neighbor session
remote-as Show only the specified remote AS session
ubuntu# show bgp summary neighbor ?
A.B.C.D Neighbor to display information about
WORD Neighbor on BGP configured interface
X:X::X:X Neighbor to display information about
ubuntu# show bgp summary remote-as ?
(1-4294967295) AS number
external External (eBGP) AS sessions
internal Internal (iBGP) AS sessions
This patch includes the documentation and the topotest.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Test case test_verify_mroute_when_5_different_receiver_joining_same_sources_p0
is failing intermittently in master. Fixed the issue.
Signed-off-by: Kuldeep Kashyap <kashyapk@vmware.com>
1. Automated test cases to verify BGP Graceful Shutdown community functionality,
with 2 different topologies.
Signed-off-by: Kuldeep Kashyap <kashyapk@vmware.com>
We only need an instance when we have at least one area configured in a
VRF. Currently we have the following issues:
- instance for the default VRF is always created
- instance is not removed after the last area config is removed
This commit fixes both issues.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
When the redistribution is configured in non-default VRF, isisd should
redistribute routes from this VRF instead of default.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Compile with v2.0.0 tag of `libyang2` branch of:
https://github.com/CESNET/libyang
staticd init load time of 10k routes now 6s vs ly1 time of 150s
Signed-off-by: Christian Hopps <chopps@labn.net>
Change every `-` to `_` in directory names. This is to avoid mixing _ and -.
Just for consistency and proper directory sorting.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
BGP_MAX_PACKET_SIZE no longer represented the absolute maximum BGP
packet size as it did before, instead it was defined as 4096 bytes,
which is the maximum unless extended message capability is negotiated,
in which case the maximum goes to 65k.
That introduced at least one bug - last_reset_cause was undersized for
extended messages, and sending an extended message > 4096 bytes
back to a peer as part of NOTIFY data would trigger a bounds check
assert.
This patch redefines the macro to restore its previous meaning,
introduces a new macro - BGP_STANDARD_MESSAGE_MAX_PACKET_SIZE - to
represent the 4096 byte size, and renames the extended size to
BGP_EXTENDED_MESSAGE_MAX_PACKET_SIZE for consistency. Code locations
that definitely should use the small size have been updated, locations
that semantically always need whatever the max is, no matter what that
is, use BGP_MAX_PACKET_SIZE.
BGP_EXTENDED_MESSAGE_MAX_PACKET_SIZE should only be used as a constant
when storing what the negotiated max size is for use at runtime and to
define BGP_MAX_PACKET_SIZE. Unless there is a future standard that
introduces a third valid size it should not be used for any other
purpose.
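The relationship between the three macros can be modelled in a few lines. This is a Python sketch of the intent described above, not the C definitions themselves: the constant names mirror the macros, 65535 is the exact value behind "65k", and the two helper functions are hypothetical.

```python
# Names mirror the macros in the text; 65535 is the extended-message
# limit that the text abbreviates as "65k".
BGP_STANDARD_MESSAGE_MAX_PACKET_SIZE = 4096
BGP_EXTENDED_MESSAGE_MAX_PACKET_SIZE = 65535
# The absolute maximum is whatever the largest valid size is.
BGP_MAX_PACKET_SIZE = BGP_EXTENDED_MESSAGE_MAX_PACKET_SIZE


def negotiated_max_packet_size(extended_negotiated: bool) -> int:
    """Runtime limit for one session (hypothetical helper)."""
    if extended_negotiated:
        return BGP_EXTENDED_MESSAGE_MAX_PACKET_SIZE
    return BGP_STANDARD_MESSAGE_MAX_PACKET_SIZE


def new_reset_cause_buffer() -> bytearray:
    """Buffer that must hold any message, so it uses the absolute max."""
    return bytearray(BGP_MAX_PACKET_SIZE)
```

Storage that must fit any message is sized with the absolute maximum, while per-session checks use the negotiated value.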
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
There is a rare case where with prefix peers the peer is
completely absent from the json output when checking the
peer state resulting in a python key error. Check key exists
before checking the state.
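A minimal sketch of that guard, assuming a JSON layout with a "peers" object keyed by peer address and a "state" field per peer (both key names are assumptions for illustration):

```python
def peer_is_established(summary: dict, peer: str) -> bool:
    """Return True only if the peer is present and Established."""
    entry = summary.get("peers", {}).get(peer)
    if entry is None:
        # Peer not in the output (yet): treat as "not established"
        # instead of raising KeyError.
        return False
    return entry.get("state") == "Established"
```

A missing peer then counts as "not yet up", so a polling loop can simply retry.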
Signed-off-by: Pat Ruddy <pat@voltanet.io>
parse_topology function doesn't correctly process vertex types with
spaces. Therefore the reference topology files are completely messed up,
we have values in incorrect fields - types in metrics, metrics in
parents, etc.
This commit fixes the parsing function and the reference files.
The same fix was done for isis-topo1-vrf in #8365.
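To illustrate the failure mode, here is a sketch that assumes table columns are separated by two or more spaces while a vertex type such as "IP internal" contains a single space. The column names and the sample row are made up for the example; this is not the actual parse_topology code.

```python
import re

COLUMNS = ["vertex", "type", "metric", "next_hop", "interface", "parent"]
ROW = "10.0.20.0/24   IP internal   0   r1   r1-eth0   r3(4)"


def parse_topology_line(line: str, columns: list) -> dict:
    """Split one row on runs of two or more spaces, keeping 'IP internal' whole."""
    fields = re.split(r"\s{2,}", line.strip())
    return dict(zip(columns, fields))


print(parse_topology_line(ROW, COLUMNS))
```

A plain `ROW.split()` would yield "IP" as the type and shift every later field by one column, which matches the symptom described above. Rows with empty cells would need fixed column offsets instead.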
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Problem Statement:
=================
In scale setup BGP sessions start flapping.
RCA:
====
In a virtualized environment there are multiple places where the
MTU needs to be set. If the MTU is not set properly in some of
these places, BGP packets may get fragmented; in a scale
setup this leads to BGP session flaps.
Fix:
====
A new TCP option is provided as part of this implementation,
which can be configured per neighbor and sets the TCP
max segment size. The user needs to derive the path MTU between the BGP
neighbors and set that value as part of the tcp-mss setting.
1. CLI Configuration:
[no] neighbor <A.B.C.D|X:X::X:X|WORD> tcp-mss (1-65535)
2. Running config
frr# show running-config
router bgp 100
neighbor 198.51.100.2 tcp-mss 150 => new entry
neighbor 2001:DB8::2 tcp-mss 400 => new entry
3. Show command
frr# show bgp neighbors 198.51.100.2
BGP neighbor is 198.51.100.2, remote AS 100, local AS 100, internal link
Hostname: frr
Configured tcp-mss is 150, synced tcp-mss is 138 => new display
4. Show command json output
frr# show bgp neighbors 2001:DB8::2 json
{
"2001:DB8::2":{
"remoteAs":100,
"bgpTimerKeepAliveIntervalMsecs":60000,
"bgpTcpMssConfigured":400, => new entry
"bgpTcpMssSynced":388, => new entry
Risk:
=====
Low - This is a config driven feature and it sets the max segment
size for the TCP session between BGP peers.
Tests Executed:
===============
Manual testing was done with a three-router topology.
1. Executed basic config and unconfig scenarios
2. Verified if the config is updated in running config
during config and no config operation
3. Verified the show command output in both CLI format and
JSON format.
4. Verified if TCP SYN messages carry the max segment size
in their initial packets.
5. Verified the behaviour during clear bgp session.
6. Captured packets to confirm that the new segment size
takes effect.
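For readers unfamiliar with the underlying socket option, the sketch below shows how a maximum segment size can be requested on a TCP socket before it connects. It uses the standard TCP_MAXSEG option; the function name is hypothetical and this is not the bgpd code.

```python
import socket


def apply_tcp_mss(sock, mss: int) -> None:
    """Request a maximum segment size before the connection is opened."""
    if not 1 <= mss <= 65535:  # same range as the CLI above
        raise ValueError(f"tcp-mss out of range: {mss}")
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_MAXSEG, mss)


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    apply_tcp_mss(s, 150)  # value taken from the running-config example
    s.close()
```

The operating system may reject or adjust the requested value, which is in line with the show output above reporting a synced value separately from the configured one.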
Signed-off-by: Abhinay Ramesh <rabhinay@vmware.com>
Not having scapy in the docker image leads to very obtuse failures in
the pim bsm tests (obtuse, as in, it just fails without any hint as to
why...)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The previous method, using zassert.h and hoping nothing includes
assert.h (which, on glibc at least, just does "#undef assert" and puts
its own definition in...) was fragile - and actually broke undetected.
Just provide our own assert.h and control overriding by putting it in a
separate directory to add to the include path (or not.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Individual tests must not depend on each other. In particular, a test
can't be sure that the previous test config is applied or cleared.
It is definitely not true when a single test is executed, for example:
`test_bgp_auth.py::test_prefix_peer_remove_passwords`.
This commit makes all tests independent of each other. It also adds a
call to check_all_peers_established at the start of "remove_passwords"
tests to make sure that we not only block new peers with an incorrect
password, but also clean the existing peers.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
YANG model and CLI commands allow user to configure LDP-sync per area.
But the actual implementation is incorrect - all commands are changing
the config for the whole VRF instead of a single area. This commit fixes
this issue by actually implementing per area configuration.
Fixes #8578.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Currently we don't allow configuring the interface before the area is
configured. This approach has the following issues:
1. The area config can be deleted even when we have an interface config
relying on it. The code is not ready for that - we'll have a whole
bunch of stale pointers if user does that.
2. The code doesn't correctly process the event of changing the VRF for
an interface. There is no mechanism to ensure that the area exists
in the new VRF so currently the circuit still stays in the old VRF.
This commit allows an arbitrary order of area/interface configuration.
There is no more need to configure the area before configuring the
interface.
This change fixes both the issues.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Description:
DR information is missing under "show ip ospf interface [json]".
Added DR information to the "show ip ospf interface" output.
Signed-off-by: Rajesh Girada <rgirada@vmware.com>
The current log prints maximum wait time which is not actually correct,
because it doesn't include the command execution time. We usually have
"failed after X seconds" log with X being far longer than this maximum.
Let's print the maximum number of tries instead.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
This test is completely incorrect in the test_bfd_loss_intermediate step.
It shuts down the interface and then "waits" for the BGP session to
fail. But instead of actually waiting, it compares the output of "show bfd
peers" with the "up" state. As it does this comparison right after the
interface shutdown, the BFD session has not yet failed and the comparison
always succeeds, except in very rare cases when the command takes a long
time to execute (due to heavy load on the CI system, I suppose).
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
This function kills all processes that happen to have the same
name as FRR processes, and it was only ever used in the setup.
Setup should not be used to kill old runs. That should be a
separate process.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
`CFLAGS` is a "user variable", not intended to be controlled by
configure itself. Let's put all the "important" stuff in AC_CFLAGS and
only leave debug/optimization controls in CFLAGS.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
This test establishes a binding between nbma ip of a spoke and its
protocol address. This information is pushed to hub.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Just another round of trying to add pytest.mark.bgpd. Not finished yet just
what I could stand doing for a few minutes.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Fixes:
/usr/lib/python3.9/site-packages/_pytest/config/__init__.py:1463: in getoption
val = getattr(self.option, name)
E AttributeError: 'Namespace' object has no attribute 'topology_only'
The above exception was the direct cause of the following exception:
/usr/lib/python3.9/site-packages/pluggy/manager.py:127: in register
hook._maybe_apply_history(hookimpl)
/usr/lib/python3.9/site-packages/pluggy/hooks.py:333: in _maybe_apply_history
res = self._hookexec(self, [method], kwargs)
/usr/lib/python3.9/site-packages/pluggy/manager.py:93: in _hookexec
return self._inner_hookexec(hook, methods, kwargs)
/usr/lib/python3.9/site-packages/pluggy/manager.py:84: in <lambda>
self._inner_hookexec = lambda hook, methods, kwargs: hook.multicall(
tests/topotests/conftest.py:62: in pytest_configure
if config.getoption("--topology-only"):
/usr/lib/python3.9/site-packages/_pytest/config/__init__.py:1474: in getoption
raise ValueError(f"no option named {name!r}") from e
E ValueError: no option named 'topology_only'
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
Description:
BGP session not established for ipv6 link local address with vrf config
Problem Description/Summary :
BGP session not established for ipv6 link local address with vrf config
1. Configure ipv6 link-local address fe80::1234/64 on dut1 and fe80::4567/64 on dut2
2. Configure BGP neighbors for ipv6 link-local on both dut1 and dut2
3. Verify BGP session is UP over link-local ipv6 address
4. Observed that bgp session not established for ipv6 link local address
Expected Behavior :
BGP session should be established for ipv6 link local address with vrf config
Signed-off-by: sudhanshukumar22 <sudhanshu.kumar@broadcom.com>
These are for string quoting (`%pSQ`) and string escaping (`%pSE`); the
sets / escape methods are currently rather "basic" and might be extended
in the future.
Signed-off-by: David Lamparter <equinox@diac24.net>
Analogous to Linux kernel `%pV` (but our mechanism expects 2 specifier
chars and `%pVA` is clearer anyway.)
Signed-off-by: David Lamparter <equinox@diac24.net>
... to suppress the warnings when using something that isn't quite ISO C
compatible and would otherwise cause compiler warnings from `-Wformat`.
Signed-off-by: David Lamparter <equinox@diac24.net>
Three new tests:
- OSPFv3 convergence using 'ipv6 ospf6 neighbor json'
- Default route functionality:
* Check that the LSA is present
* Check that the route was installed
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
This replaces `%n` with a safe, out-of-band option that simply records
the start and end offset of the output produced for each `%...`
specifier.
The old `%n` code is removed.
Signed-off-by: David Lamparter <equinox@diac24.net>
Allowing printfrr extensions to directly write to the output buffer has
a few advantages:
- there is no arbitrary length limit imposed (previously 64)
- the output doesn't need to be copied another time
- the extension can directly use bprintfrr() to put together pieces
The downside is that the theoretical length (regardless of available
buffer space) must be computed correctly.
Extended unit tests to test these paths a bit more thoroughly.
Signed-off-by: David Lamparter <equinox@diac24.net>
When "bgp bestpath peer-type multipath-relax" is enabled, multipaths
with both eBGP and iBGP learned routes may exist. It is not desirable
for the iBGP next hops to be discarded from the FIB because they are not
directly connected. When publishing a nexthop group to zebra, the
ZEBRA_FLAG_ALLOW_RECURSION flag is normally not set when the best path
is eBGP; when "bgp bestpath peer-type multipath-relax" is configured, the
flag will now be set if any paths are from iBGP peers. This leaves
all-eBGP multipaths still requiring nexthops over connected routes.
Signed-off-by: Joanne Mikkelson <jmmikkel@arista.com>
This new BGP configuration is akin to "bgp bestpath aspath
multipath-relax". When applied, paths learned from different peer types
will be eligible to be considered for multipath (ECMP). Paths from all
of eBGP, iBGP, and confederation peers may be included in multipaths
if they are otherwise equal cost.
This change preserves the existing bestpath behavior of step 10's result
being returned, not the result from steps 8 and 9, in the case where
both 8+9 and 10 determine a winner.
Signed-off-by: Joanne Mikkelson <jmmikkel@arista.com>
This new test launches a small network composed of 4 OSPF routers with
Traffic Engineering and Segment Routing configuration. To assess the Link
State Traffic Engineering feature, the TED of each router is compared
against the reference TED which corresponds to the network topology.
Then a series of 6 steps, where Link, TE & SR configurations are modified
up to r4 shutdown, are used to verify that the TED is correctly updated
on the 4 routers.
Signed-off-by: Olivier Dugeon <olivier.dugeon@orange.com>
Changes:
- Decrease hello interval to avoid packet loss slow downs
- Decrease dead interval to converge faster
- Remove previous 'Full' state check that wasn't checking for all
peers (only one per router)
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
The previous, more complex mechanism failed to take into account that
git worktrees only have a stub .git file & copying the worktree itself
is not enough. Just extract a file list beforehand & don't bother with
git inside the container.
Signed-off-by: David Lamparter <equinox@diac24.net>
Back when I put this together in 2015, ISO C11 was still reasonably new
and we couldn't require it just yet. Without ISO C11, there is no
"good" way (only bad hacks) to require a semicolon after a macro that
ends with a function definition. And if you added one anyway, you'd get
"spurious semicolon" warnings on some compilers...
With C11, `_Static_assert()` at the end of a macro will make it so that
the semicolon is properly required, consumed, and not warned about.
Consistently requiring semicolons after "file-level" macros matches
Linux kernel coding style and helps some editors against mis-syntax'ing
these macros.
Signed-off-by: David Lamparter <equinox@diac24.net>
The following error is shown when running the OSPFv3 tests
2021-03-16 23:37:44,792 INFO: Function returned global name 'data_rid' is not defined
2021-03-16 23:37:44,792 INFO: Retry [#1] after sleeping for 2s
2021-03-16 23:37:46,794 INFO: Verifying OSPF6 neighborship on router r1:
2021-03-16 23:37:46,993 INFO: Output for command [ show ipv6 ospf6 neighbor ] on router r1:
Neighbor ID Pri DeadTime State/IfState Duration I/F[State]
2.2.2.2 1 00:00:03 Full/PointToPoint 00:00:01 r1-r2-eth0[PointToPoint]
Fix the "data_rid" warning by using the correct variable
Signed-off-by: ckishimo <carles.kishimoto@gmail.com>
Currently there is a single interval for both RX and TX echo functions.
This commit introduces separate RX and TX timers for echo packets.
The main advantage is to be able to set the receive interval to zero
when we don't want to receive echo packets from the remote system.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Additional example usage of iproute2_is_vrf_capable check in
isis-topo1-vrf topotest.
Signed-off-by: David Schweizer <dschweizer@opensourcerouting.org>
Example usage of iproute2_is_vrf_capable check in bgp_multi_vrf_topo1
and bgp_multi_vrf_topo2 topotests.
Signed-off-by: David Schweizer <dschweizer@opensourcerouting.org>
The test has been failing often recently and it is causing some false
positives for unrelated PRs.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
We version the tests with the source code so we should no longer attempt
to support old versions in development branch.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
The new ospf-sr-topo2 tests are much broader and detailed,
hence it makes no sense to keep the old ospf-sr-topo1
tests.
Signed-off-by: GalaxyGorilla <sascha@netdef.org>
1. Improved error message logging.
2. No functionality changes; only added some meaningful error messages.
Signed-off-by: Kuldeep Kashyap <kashyapk@vmware.com>
Avoid undocumented topotest dependency on installing en_US locale.
With this change dependency is removed.
Signed-off-by: Christian Hopps <chopps@labn.net>
Didn't test this but it's already randomly broken so can't be worse
Hopefully fixes:
raise InvalidCLIError("%s" % output)
InvalidCLIError: line 2: % Command incomplete[4]: bgp
large-community-list standard Test1 permit
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
Pass the topogen 'tgen' object into the startRouterDaemons()
method. It can be used to start a debug cli immediately after
starting a daemon, and that can be handy.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Sleeping when convergence is not guaranteed in 60 seconds
and then testing the rib to see if it has the data is
not a great way to have a test complete all the time.
Modify the code so that we check for convergence
and if we have converged then look in the rib.
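The "check for convergence, then inspect" pattern can be sketched as a small polling helper. This is a generic illustration, not the helper the topotests use; the name `wait_for` and the default of 30 tries at 2 second intervals are arbitrary assumptions.

```python
import time


def wait_for(check, tries=30, interval=2.0, sleep=time.sleep):
    """Call check() until it returns True or the tries run out.

    Returns the attempt number on success, or None on failure.
    """
    for attempt in range(1, tries + 1):
        if check():
            return attempt
        if attempt < tries:
            sleep(interval)
    return None
```

A test would first call `wait_for(converged)` with its own convergence check and only look at the RIB when the result is not None, rather than sleeping for a fixed time and hoping the data is there.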
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
1. There were a few tests where routes were configured with blackhole and
non-blackhole nexthops simultaneously; enhanced the tests accordingly and
verified them on the master branch and with PR #8158 changes.
Signed-off-by: Kuldeep Kashyap <kashyapk@vmware.com>
Add some pytest.mark.bgpd. This is about all I could stomach doing
in one patch. I'll do another pass at another time.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When the last SID in the TI-LFA repair list is an Adj-SID from the
penultimate hop router towards the final hop, the No-PHP flag of the
original Prefix-SID must be honored in the repair list itself since
the penultimate hop router won't have a chance to process that SID
and pop it if necessary.
Reported-by: Fredi Raspall <fredi@voltanet.io>
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
There are two fixes to handle slow convergence on ARM -
1. Ping on every re-try attempt to account for initial packet loss
2. Handle incomplete show outputs gracefully
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
The changes add the "jsoncmp_pass" and the "jsoncmp_fail" commands to
compare VTY shell's JSON output to an expected JSON object during
topotests using the LabN testing framework. This helps to eliminate
false negative test results (i.e. due to routes being out of order
after convergence or cosmetic changes in VTY shell's text output).
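As an illustration of why comparing JSON is more robust than comparing text, the sketch below checks that an expected object is contained in the actual output regardless of list order or extra keys. It is a simplified stand-in written for this note, not the implementation behind "jsoncmp_pass"; the function name is hypothetical.

```python
def json_subset(actual, expected):
    """True if everything in expected also appears in actual."""
    if isinstance(expected, dict):
        return isinstance(actual, dict) and all(
            key in actual and json_subset(actual[key], value)
            for key, value in expected.items()
        )
    if isinstance(expected, list):
        # Order-insensitive: each expected item must match some actual item.
        return isinstance(actual, list) and all(
            any(json_subset(item, want) for item in actual)
            for want in expected
        )
    return actual == expected
```

Note that this simple version does not enforce a one-to-one match between list items, so two identical expected entries can be satisfied by a single actual entry.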
Signed-off-by: David Schweizer <dschweizer@opensourcerouting.org>
When parsing the output of "ip -6 address", allow arbitrary base interface
names (the part after "@" in the interface name), not just "if0-9". Without
this, link-local addresses sometimes are attributed to the wrong interface
because we're not matching the interface name but still handle the
interface's addresses.
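The mis-attribution is easy to reproduce in a small parser. The sketch below is an independent illustration, not the topotest code: the regular expressions and the sample output are assumptions that follow the usual "ip -6 address" layout.

```python
import re

# Matches "1: lo: <...>", "2: r1-eth0@if5: <...>" and "3: r1-eth1@r2-eth0: <...>".
IFACE_RE = re.compile(r"^\d+:\s+([^@:\s]+)(?:@[^:\s]+)?:")
ADDR_RE = re.compile(r"^\s+inet6\s+(fe80:[0-9a-fA-F:]*)/\d+\s+scope link")

SAMPLE = """\
2: r1-eth0@if5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000
    inet6 fe80::1/64 scope link
3: r1-eth1@r2-eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000
    inet6 fe80::2/64 scope link
"""


def link_local_by_interface(output: str) -> dict:
    """Map each interface name to its link-local addresses."""
    result = {}
    current = None
    for line in output.splitlines():
        match = IFACE_RE.match(line)
        if match:
            current = match.group(1)
            continue
        match = ADDR_RE.match(line)
        if match and current is not None:
            result.setdefault(current, []).append(match.group(1))
    return result
```

If the interface pattern only accepted "@if" followed by digits, the header line for r1-eth1 would not match, `current` would still be r1-eth0, and fe80::2 would be filed under the wrong interface.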
Signed-off-by: Martin Buck <mb-tmp-tvguho.pbz@gromit.dyndns.org>
Make the generate-support-bundle script and interactions more
python3-friendly, and use python3 explicitly.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Modify the timers on the bgp_blackhole_community test to
be more aggressive so our test system will recover faster
when we drop packets.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add pytest marking for ospfd. This commit also has some other test markings
because I do not want to have to go through the same test multiple times.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
We have a ospfd.conf file in the r2 directory but the
ldp-sync-isis-topo1 test does not use ospf. Let's remove it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Only one of the four reference files was present; add the missing
three. The test just silently passed if a ref file was missing:
change that to a failure.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Add a test for the infinite recursion case fixed
with 0c4dbb5f8fe8fb188fa0e0aa8ce04764e893b79b
See that commit for details of the problem. This test uses a simpler
version of the repro found there as the test.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
This test relied on the default addition of SVI MAC in zebra.
Now that this has been fixed, the test needs to be updated to work
with the new behaviour.
Signed-off-by: Pat Ruddy <pat@voltanet.io>
In test_converge_protocols() use sed to match the "show ip(v6) route"
header and strip it, rather than using tail which requires hardcoding
the expected length of the header (which is subject to change).
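The same idea, matching the header instead of counting its lines, looks like this in Python. The commit itself uses sed; this sketch swaps in plain string handling and assumes the header starts with "Codes:" and ends at the first blank line.

```python
def strip_route_header(output: str) -> str:
    """Drop the legend that runs from 'Codes:' to the first blank line."""
    lines = output.splitlines()
    if not lines or not lines[0].startswith("Codes:"):
        return output  # no header to strip
    for index, line in enumerate(lines):
        if not line.strip():
            return "\n".join(lines[index + 1:])
    return output  # header never terminated; leave the text alone
```

Because nothing depends on the number of legend lines, adding a new route code to the header does not break the comparison.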
Signed-off-by: Duncan Eastoe <duncan.eastoe@att.com>
Since SNMP is a pain to install, add a check which will be used
in all SNMP tests in future to silently skip SNMP tests if SNMP
has not been installed on the base system.
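One way such a check could look is sketched below. The function name and the choice of the net-snmp binaries "snmpd" and "snmpwalk" as indicators are assumptions; the actual check in the test library may differ.

```python
import shutil


def snmp_tools_available(which=shutil.which) -> bool:
    """True when the SNMP daemon and a client tool are found on PATH."""
    return all(which(tool) is not None for tool in ("snmpd", "snmpwalk"))
```

A test module would call this during setup and skip itself (for example with `pytest.skip("SNMP not installed")`) when it returns False. The `which` parameter exists only so the check can be exercised without SNMP installed.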
Signed-off-by: Pat Ruddy <pat@voltanet.io>
Adding test to verify default route is added when attached-bit
receive and send are enabled and not added when feature is disabled.
Signed-off-by: Lynne Morrison <lynne@voltanet.io>
This script covers restarting ospfd and
restarting frr with ospf enabled, with
staticd redistribution enabled inside ospf.
Signed-off-by: nguggarigoud <nguggarigoud@vmware.com>
When P and Q spaces are adjacent then it makes sense to use adjacency SIDs
from the P node to the Q node. There are some other corner cases where this
also makes sense, like when a P/Q node is adjacent to the root node.
Signed-off-by: GalaxyGorilla <sascha@netdef.org>
1. Added 7 testcases to verify PIM BSM functionality. Here we have used Scapy
to send raw packet, generated using Cisco and FRR. Raw packets are kept in
Signed-off-by: Kuldeep Kashyap <kashyapk@vmware.com>
1. Added 8 testcases to verify PIM BSM functionality. Here we have used Scapy
to send raw packet, generated using Cisco and FRR. Raw packets are kept in
JSON file and sent on demand by the test script.
Signed-off-by: Kuldeep Kashyap <kashyapk@vmware.com>
Tests were timing out in our test system due to lost packets and
flakiness of the lower end systems. Just set the timers to 3/10
and give them plenty of time to converge.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
New test does this:
a) Ensures that we run the correct number of times given two
`ip protocol X` commands (i.e. we do not run the route-map application
against all routes, only those affected)
b) Ensures that when we modify the route-map the state ends up sane.
This includes making a static route depend on a sharp route that
gets removed as a result of the change to the sharp route-map
c) Ensures that the kernel routes are correct.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add the ability for our topotests to take advantage of pytest `mark`ing.
This effectively allows you to tell pytest to run against certain sets
of tests. For demonstration purposes I've added marks for:
babel
eigrp
ldp
ospf
pim
rip
and marked the tests that only test those protocols accordingly.
You can run against eigrp tests by running `pytest -k eigrp`
Other combinations are also available based upon simple boolean logic.
Just read the pytest.mark documentation.
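As a rough sketch of what marking a test module looks like, the snippet below tags every test in a file with the `eigrp` mark. The test function is a hypothetical placeholder; only `pytestmark` and `pytest.mark` are real pytest features.

```python
import pytest

# Module-level mark: every test collected from this file carries "eigrp".
pytestmark = [pytest.mark.eigrp]


def test_eigrp_placeholder():
    # Hypothetical body; a real topotest would inspect router state here.
    assert True
```

Besides the keyword selection mentioned above, pytest's `-m` option accepts a marker expression, for example `pytest -m "eigrp or rip"`.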
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Simple test which creates a router running snmp and bgpd and
checks we can read the correct bgpVersion using snmp.
Signed-off-by: Pat Ruddy <pat@voltanet.io>
1. Added an API to verify the ip nht command.
2. 5 cases of static routes with admin distance and tag
3. Run time = 89 seconds
Signed-off-by: nguggarigoud <nguggarigoud@vmware.com>
Prepare the infrastructure to allow configuring and launching an SNMP
daemon as part of a testing scenario.
Signed-off-by: Babis Chalios <babis@voltanet.io>
Signed-off-by: Pat Ruddy <pat@voltanet.io>
* If pathd binary is not found, skip the SR-TE topotests.
* Fix some compilation warnings when pathd is not built.
Signed-off-by: Sebastien Merle <sebastien@netdef.org>
* Added a new topotest to test bgpd listening on multiple addresses.
* Updated the existing bgpd tests according to the parameter added to
bgp_master_init.
Signed-off-by: "Adriano Marto Reis" <adrianomarto@gmail.com>
In test_bgp_multi_vrf_topo2.py it's clear that we remove then
re-add the vrf interfaces. The test was then immediately
checking that the routes were available.
BGP needs time to reconverge. Let's ensure that first.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add new RLFA topotest that tests all RLFA configuration knobs and
how isisd and ldpd react to various configuration changes that can
occur in the network.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Extend the existing SPF unit testing infrastructure so that it can
test RLFA as well.
These new unit tests are useful to test the RLFA PQ node
computation on several different network topologies in a timely
manner. Artificial LDP labels (starting from 50000) are used to
activate the computed RLFAs.
It's worth mentioning that the computed backup routing tables
contain both local LFAs and remote LFAs, as running RLFA separately
isn't possible.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
The bgp_gr_functionality_topo1 test was shutting down an
interface on r2 and then trying to bring it up on r1.
Hijinx ensued.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
`lcommunity_gettoken` expects a space-delimited list of 0 or more large
communities. `lcommunity_list_valid` can perform this check.
`lcommunity_list_valid` now validates large community lists more
accurately based on the following condition: Each quantity in a standard bgp
large community must:
1. Contain at least one digit
2. Fit within 4 octets
3. Contain only digits unless the lcommunity is "expanded"
4. Contain a valid regex if the lcommunity is "expanded"
Moreover we validate that each large community list contains exactly 3
such values separated by a single colon each.
One quirk of our validation which is worth documenting is:
```
bgp large-community-list standard test2 permit 1:c:3
bgp large-community-list expanded test1 permit 1:c:3
```
The first line will throw an error complaining about a "malformed community-list
value". The second line will be accepted because each value is treated as
a regex when matching large communities. It simply will never match anything, so
it's rather useless.
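The rules above can be sketched in Python as follows. This is an illustration only, not the C implementation: the function names are hypothetical, Python's `re` module stands in for the regex engine bgpd uses, and splitting an expanded entry on `:` is a simplification.

```python
import re

MAX_U32 = 0xFFFFFFFF  # largest value that fits within 4 octets


def valid_standard_lcommunity(value):
    """Exactly three colon-separated decimal numbers, each within 4 octets."""
    parts = value.split(":")
    if len(parts) != 3:
        return False
    for part in parts:
        # At least one digit, and digits only.
        if not re.fullmatch(r"[0-9]+", part):
            return False
        if int(part) > MAX_U32:
            return False
    return True


def valid_expanded_lcommunity(value):
    """Exactly three colon-separated, non-empty parts that compile as regexes."""
    parts = value.split(":")
    if len(parts) != 3:
        return False
    for part in parts:
        if not part:
            return False
        try:
            re.compile(part)
        except re.error:
            return False
    return True
```

With these helpers, `1:c:3` is rejected as a standard entry but accepted as an expanded one, which mirrors the quirk described above.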
Signed-off-by: Wesley Coakley <wcoakley@nvidia.com>
1. Enhanced lib/topojson.py for creating topologies with switches and routers
2. Ran it through (black) for expected formatting
Signed-off-by: kuldeepkash <kashyapk@vmware.com>
1. Enhanced lib/common_config.py for common configuration/verification needed
for PIM automation
2. Ran it through (black) for expected formatting
Signed-off-by: kuldeepkash <kashyapk@vmware.com>
1. Added lib/pim.py for PIM configuration/verification
2. Ran it through (black) for expected formatting
Signed-off-by: kuldeepkash <kashyapk@vmware.com>
An external label manager plugin may want to use the following
functions:
- create_label_chunk
- assign_label_chunk
- delete_label_chunk
- release_label_chunk
This test ensures that they are externally visible.
Signed-off-by: Pat Ruddy <pat@voltanet.io>
This test checks the interactions between the BGP label requesting
code and the labelpool code to ensure the correct number of labels
and label chunks are requested and those labels are freed back into
the pool when the corresponding prefix is removed.
Signed-off-by: Pat Ruddy <pat@voltanet.io>
Timestamps in test logs are needed for correlation with messages in
routing protocol log files. Vox populi indicates preference for
timestamp at beginning of line.
OLD:
(#55) scripts/rip-show.py:61 COMMAND:r1:vtysh -c "show ip rip status": 00:0.* 00:0:wait:RIP Peers:
NEW:
Sat Dec 19 08:26:45 2020 (#55) scripts/rip-show.py:61 COMMAND:r1:vtysh -c "show ip rip status": 00:0.* 00:0:wait:RIP Peers:
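The NEW format matches what Python's `time.asctime()` produces, so prefixing a log line could be sketched as below. The helper name `stamp` is hypothetical and the actual logging code may be structured differently.

```python
import time


def stamp(line, when=None):
    """Prefix a log line with a timestamp such as 'Sat Dec 19 08:26:45 2020'."""
    # time.asctime() requires a struct_time, so resolve "now" explicitly.
    when = time.localtime() if when is None else when
    return "{} {}".format(time.asctime(when), line)
```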
Signed-off-by: G. Paul Ziemba <paulz@labn.net>
While accidentally running the topotests with Python 3
I kept getting:
TypeError: `dict_values` object does not support indexing
In Python 2, dict.values() returns a list.
In Python 3 it does not.
Write some code to allow both to be handled.
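One common way to handle both versions, shown here as a sketch rather than the exact change made, is to wrap the result in `list()` before indexing. The helper name `first_value` is hypothetical.

```python
def first_value(mapping):
    """Return the first value of a dict on both Python 2 and Python 3."""
    # list() copies a Python 2 list and materialises a Python 3 view,
    # so indexing works in either case.
    return list(mapping.values())[0]
```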
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This new daemon manages Segment-Routing Traffic-Engineering
(SR-TE) Policies and installs them into zebra. It provides
the usual yang support and vtysh commands to define or change
SR-TE Policies.
In a nutshell SR-TE Policies provide the possibility to steer
traffic through a (possibly dynamic) list of Segment Routing
segments to the endpoint of the policy. This list of segments
is part of a Candidate Path which again belongs to the SR-TE
Policy. SR-TE Policies are uniquely identified by their color
and endpoint. The color can be used to e.g. match BGP
communities on incoming traffic.
There can be multiple Candidate Paths for a single
policy; the active Candidate Path is chosen according to
certain conditions, of which the most important is its
preference. Candidate Paths can be explicit (fixed list of
segments) or dynamic (list of segments comes from e.g. PCEP, see
below).
Configuration example:
segment-routing
 traffic-eng
  segment-list SL
   index 10 mpls label 1111
   index 20 mpls label 2222
  !
  policy color 4 endpoint 10.10.10.4
   name POL4
   binding-sid 104
   candidate-path preference 100 name exp explicit segment-list SL
   candidate-path preference 200 name dyn dynamic
  !
 !
!
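The relationship between a policy and its Candidate Paths can be modelled with the small Python sketch below, using the values from the configuration example. It is not pathd's data model: the class names are hypothetical, and two rules are assumptions made for illustration, namely that the highest preference wins and that a path without segments (such as a dynamic path not yet computed) is not usable.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class CandidatePath:
    name: str
    preference: int
    kind: str  # "explicit" or "dynamic"
    segments: List[int] = field(default_factory=list)


@dataclass
class Policy:
    color: int
    endpoint: str
    candidates: List[CandidatePath] = field(default_factory=list)

    @property
    def key(self) -> Tuple[int, str]:
        # A policy is uniquely identified by its color and endpoint.
        return (self.color, self.endpoint)

    def active_path(self) -> Optional[CandidatePath]:
        # Assumption: only paths that currently have segments are usable,
        # and among those the highest preference is selected.
        usable = [c for c in self.candidates if c.segments]
        return max(usable, key=lambda c: c.preference, default=None)
```

Under these assumptions, the explicit path `exp` stays active until the dynamic path `dyn` receives a segment list, at which point its higher preference takes over.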
There is an important connection between dynamic Candidate
Paths and the overall topic of Path Computation. Later on for
pathd a dynamic module will be introduced that is capable
of communicating via the PCEP protocol with a PCE (Path
Computation Element) which again is capable of calculating
paths according to its local TED (Traffic Engineering Database).
This dynamic module will be able to inject the mentioned
dynamic Candidate Paths into pathd based on calculated paths
from a PCE.
https://tools.ietf.org/html/draft-ietf-spring-segment-routing-policy-06
Co-authored-by: Sebastien Merle <sebastien@netdef.org>
Co-authored-by: Renato Westphal <renato@opensourcerouting.org>
Co-authored-by: GalaxyGorilla <sascha@netdef.org>
Co-authored-by: Emanuele Di Pascale <emanuele@voltanet.io>
Signed-off-by: Sebastien Merle <sebastien@netdef.org>
1. Enhanced framework to
a. Verify fib active routes(lib/common_config.py).
b. Verify bgp multi path routes(lib/bgp.py).
c. Create mininet nodes with different names(lib/topojson.py).
2. 12 Test cases of static routing with ibgp.
Test suite execution time is ~30 minutes.
3. 12 Test cases of static routing with ebgp.
Test suite execution time is ~30 minutes.
Signed-off-by: naveen <nguggarigoud@vmware.com>
The `show ip nht` and `show ipv6 nht` commands were broken.
This is because a recent commit, 0154d8ce45,
assumed that p could never be NULL, which is not the case.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
1. As per recent changes done in PR #7652, we have modified the auto-rd verification logic
2. Dev PR link: https://github.com/FRRouting/frr/pull/7652
Signed-off-by: kuldeepkash <kashyapk@vmware.com>
I accidentally installed something that is telling me about
unclosed handles in the tests. Let's clean them up.
<and yes I have no idea wtf I did>
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The test_bgp_multi_vrf_topo2.py script had a bunch
of places where it would change an interface status
or add or delete routes in ways that affect bgp convergence,
but it never ensured that convergence had happened
before the test verified the bgp rib. I believe this
was leading to many intermittent CI failures in
testing for other PRs to be accepted. Modify
the code to wait for bgp convergence if we just
made a change to the topology.
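The general pattern of waiting for convergence before verifying can be sketched as a polling loop. The helper below is hypothetical and not the function the test suite uses; `check` stands for any callable that returns True once BGP has converged.

```python
import time


def wait_for(check, timeout=60.0, interval=1.0,
             sleep=time.sleep, clock=time.monotonic):
    """Poll check() until it returns True or the timeout expires."""
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            # Give up: the caller should fail the test with a clear message.
            return False
        sleep(interval)
```

Calling something like `wait_for(bgp_has_converged)` after each topology change, with `bgp_has_converged` supplied by the test, keeps the rib verification from racing against reconvergence.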
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
For each afi/safi of 'show bgp summary', display the peer description
when needed. This information is useful, for instance, when
a device is connected to multiple peers.
The topotest all_protocol_startup is changed accordingly.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This is the opposite of TOPOTEST_AUTOLOAD: Instead of automatically loading
missing modules, TOPOTEST_NOLOAD prevents module loading and suppresses
questions about it.
Signed-off-by: Martin Buck <mb-tmp-tvguho.pbz@gromit.dyndns.org>
The import path for topolog must be given explicitly, otherwise the following
error message appears:
root@dut-vm:~/topotests/bgp_flowspec# python3 test_bgp_flowspec_topo.py
Traceback (most recent call last):
File "test_bgp_flowspec_topo.py", line 96, in <module>
from lib.lutil import lUtil
File "/root/topotests/bgp_flowspec/../lib/lutil.py", line 25, in <module>
from topolog import logger
ImportError: No module named 'topolog'
root@dut-vm:~/topotests/bgp_flowspec#
The same error occurs with lutil and bgprib, which are two libraries
located under the lib/ folder. Their import paths are made explicit too.
PR=71290
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Python 3 does not provide execfile.
Replace it with open() and exec(), which are available in both
Python 2 and Python 3.
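A minimal sketch of such a replacement is shown below. The helper name `exec_file` is hypothetical, and the actual change may inline the calls instead of wrapping them.

```python
def exec_file(path, namespace=None):
    """Run a Python source file, like Python 2's execfile(path, namespace)."""
    namespace = {} if namespace is None else namespace
    with open(path) as handle:
        source = handle.read()
    # compile() with the real path keeps file names in tracebacks accurate.
    exec(compile(source, path, "exec"), namespace)
    return namespace
```

Passing an explicit namespace keeps the executed file from silently modifying the caller's globals.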
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>