1. Added interface name, group address and detail option to existing
"show ip igmp groups" so that user can retrieve all the groups
or a particular group for an interface. Detail option shows the source
information for the group. With that, the show command
looks like:
"show ip igmp [vrf NAME$vrf_name] groups [INTERFACE$ifname [GROUP$grp_str]] [detail$detail] [json$json]"
2. Changed pim_cmd_lookup_vrf() to return empty JSON if VRF is not present
3. Changed "detail" option to print non-pretty JSON
4. Added interface name and group address to existing
"show ip igmp sources" so that the user can retrieve all the sources for
all the groups, or all the sources for a particular group, for an
interface. With that, the show command looks like:
"show ip igmp [vrf NAME$vrf_name] sources [INTERFACE$ifname [GROUP$grp_str]] [json$json]"
Signed-off-by: Pooja Jagadeesh Doijode <pdoijode@nvidia.com>
The scale_up.py script used by several tests installs 50k routes into the rib from
sharpd. It is first looking for the results in the bgp database. Let's ensure
that the routes are actually installed into the rib first before looking in
the bgp tables. This should help situations where the system is under extreme
load.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The test ensures that the incoming prefixes are received with
the appropriate label value, and that connectivity is ensured
between prefixes.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Locally, the bgp_evpn_vxlan_svd_topo1 and bgp_evpn_vxlan_topo1
tests are failing for me. Upon inspection the test is looking
for the mac addresses of the interfaces participating in the
evpn bridging on the hosts. For some reason on my machine
these mac addresses are not in the l2 tables at all on
PE1 or PE2. Adding quick pings solves the problems.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Following replacement of Edge Key type (uint64_t by new structure), this patch
updates the various TE topotests to the new Edge Key references.
Signed-off-by: Olivier Dugeon <olivier.dugeon@orange.com>
socat is used for multicast pimv6 joins and traffic, but it
was not cleaned up after test execution. Enhanced the
kill_socat() API to kill the socat join- and
traffic-specific PIDs during module teardown.
Signed-off-by: Kuldeep Kashyap <kashyapk@vmware.com>
Enhanced or added new libraries to support
multicast mld local join automation
Signed-off-by: Kuldeep Kashyap <kashyapk@vmware.com>
Co-Authored-by: Vijay Kumar Gupta <vijayg@vmware.com>
Added new test suite to verify functionality
of multicast MLD local join. Added 4 different
test cases in test suite.
Signed-off-by: Kuldeep Kashyap <kashyapk@vmware.com>
Co-Authored-by: Vijay Kumar Gupta <vijayg@vmware.com>
At this point OSPF NSSA deserves a dedicated topotest given the
latest nerd knobs that were added :)
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Combine all variations of the "area nssa" command into a single
DEFPY to improve code maintainability.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Prior to this, the full retry cycle was run with a "passing" negative
result each time through.
Previous runtime ~5 minutes
New runtime ~20 seconds.
Signed-off-by: Christian Hopps <chopps@labn.net>
The test failed from time to time, so let's try it this way:
```
$ for x in $(seq 1 20); do cp test_bgp_labeled_unicast_addpath.py test_$x.py; done
$ sudo pytest -s -n 20
```
Ran 10 times using this pattern, no failure 🤷
Before this change, we checked advertised routes, and at some point `=` was
missing from the output even though the routes were advertised correctly. The
receiving router gets as many routes as it expects to receive.
I reversed the check to look at received routes, not advertised ones.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
If we set `bgp route-map delay-timer X`, we should not start announcing
routes immediately, but wait for the delay timer to expire (or skip the delay
entirely if it is set to zero).
f1aa49293a broke this because we always sent
route refresh and on receiving BoRR before sending back EoRR.
Let's fix this.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Effectively a massive search and replace of
`struct thread` to `struct event`. Using the
term `thread` gives people the thought that
this event system is a pthread when it is not
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This is the first in a series of commits, whose goal is to rename
the thread system in FRR to an event system. There is a continual
problem where people are confusing `struct thread` with a true
pthread. In reality, our entire thread.c is an event system.
In this commit rename the thread.[ch] files to event.[ch].
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The flag for telling BGP that a route is expected to be installed
first before notifying a peer was always being set upon receipt
of a path that could be accepted as bestpath. This is not correct:
imagine that you have a peer sending you a route and you have a
network statement that covers the same route. Regardless of whether the
network statement would win, the flag on the dest was being set
in bgp_update. Thus you could get into a situation where
the network statement path wins but since the flag is set on
the node, it will never be announced to a peer.
Let's just move the setting of the flag into bgp_zebra_announce
and _withdraw. In _announce set the flag to TRUE when suppress-fib
is enabled. In _withdraw just always unset the flag, as a withdrawal
does not need to wait for rib removal before announcing. This will
cover the case when a network statement is added after the route has
been learned from a peer.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This test demonstrates that a label is allocated for each
ipv6 next-hop. IPv6 test introduces link local ipv6 addresses
as next hops, and compared to IPv4, one can have two different
next-hops depending if the next-hop is defined by a global
address (static route redistributed) or a bgp peer.
This test checks that:
- The labels are correctly allocated per connected next-hop.
- The default label is used for non connected prefixes.
- The withdraw operation frees the MPLS entry.
- If a recursive route is redistributed by BGP, then the nexthop
tracking will find the appropriate nexthop entry, and the
associated label will be found out.
- When a prefix moves from one peer to another behind the
vrf, then the MPLS switching operation for return
traffic is changing the outgoing interface to use.
- When the 'label vpn export <value>' MPLS label value is changed,
then the modification is propagated to prefixes which use that value.
- Also, when unconfiguring the per-nexthop allocation mode, check
that the MPLS entries and the VPNv4 entries of r1 are changed
accordingly.
- Conversely, when re-configuring the per-nexthop allocation mode,
check that the allocation mode reuses the other label values.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
A new test suite checks for the mpls label allocation
per nexthop mode. This test checks that:
- The labels are correctly allocated per connected
next-hop.
- The default label is used for non connected prefixes
- The withdraw operation frees the mpls entry.
- If a recursive route is redistributed by BGP, then the nexthop
tracking will find the appropriate nexthop entry, and the associated
label will be found out.
- When a prefix moves from one peer to another behind the vrf,
then the MPLS switching operation for return traffic is changing
the outgoing interface to use.
- When the 'label vpn export <value>' MPLS label value is changed,
then the modification is propagated to prefixes which use that value.
- When unconfiguring the per-nexthop allocation mode, check
that the MPLS entries and the VPNv4 entries of r1 are changed
accordingly.
- Conversely, when re-configuring the per-nexthop allocation mode,
check that the allocation mode reuses the other label values.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Add a hash_clean_and_free() function and convert
the code to use it. This function takes a double
pointer to the hash so that it can set it to NULL. It
also cleanly does nothing if the pointer is NULL (as a
bunch of code tested for).
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Automated new scenarios to multicast pim6
SM test suite. Added 10 test cases to verify
multicast PIM6-SM functionality.
Signed-off-by: Kuldeep Kashyap <kashyapk@vmware.com>
Co-Authored-by: Vijay Kumar Gupta <vijayg@vmware.com>
Enhanced or added new libraries to support
multicast pimv6 automation
Signed-off-by: Kuldeep Kashyap <kashyapk@vmware.com>
Co-Authored-by: Vijay Kumar Gupta <vijayg@vmware.com>
Add tests that configure and disable advertise-high-metrics with wide, narrow, and transition metric styles. Also test ip route behavior.
Signed-off-by: Isabella de Leon <ideleon@microsoft.com>
This test ensures that BGP VRF instance is able to import ECMP
paths, and is able to install 2 labelled routes accordingly.
The test also ensures that the imported 172.31.0.10/32 prefix
is selected and that the reason why the 172.31.0.10/32 prefix is
selected is not 'Locally configured route'. Actually, imported
routes do not correctly figure out the peer, and the reason is
falsely reported as local.
This test also uses IP ranges used for documentation and for
testing.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Depending on ip_route and kernel, the output might include a nhid
which causes the test to fail with a strict text output check.
Change to JSON output to avoid the issue.
Signed-off-by: Martin Winter <mwinter@opensourcerouting.org>
Quite a few well-known communities from IANA's list do
not receive special treatment in Cisco IOS XR, and at least one
community on Cisco IOS XR's special treatment list, internet == 0:0,
is not formally a well-known community as it is not in [IANA-WKC] (it
is taken from the Reserved range [0x00000000-0x0000FFFF]).
https://datatracker.ietf.org/doc/html/rfc8642
This is a Cisco-specific command which is causing lots of questions when it
comes to debugging and/or configuring it properly, but overall, this behavior
is very odd and it's not clear how it should be treated between different
vendor implementations.
Let's deprecate it and let the operators use 0:0/0 communities as they want.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
There were a few tests using "show bgp ... json detail" that did json
comparisons against a predefined json structure. This updates those
predefined json structures to match the new format of the output.
(new output moves path array under "paths" key and adds header keys)
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
Implement: https://datatracker.ietf.org/doc/html/draft-abraitis-bgp-version-capability
Tested with GoBGP:
```
% ./gobgp neighbor 192.168.10.124
BGP neighbor is 192.168.10.124, remote AS 65001
BGP version 4, remote router ID 200.200.200.202
BGP state = ESTABLISHED, up for 00:01:49
BGP OutQ = 0, Flops = 0
Hold time is 3, keepalive interval is 1 seconds
Configured hold time is 90, keepalive interval is 30 seconds
Neighbor capabilities:
multiprotocol:
ipv4-unicast: advertised and received
ipv6-unicast: advertised
route-refresh: advertised and received
extended-nexthop: advertised
Local: nlri: ipv4-unicast, nexthop: ipv6
UnknownCapability(6): received
UnknownCapability(9): received
graceful-restart: advertised and received
Local: restart time 10 sec
ipv6-unicast
ipv4-unicast
Remote: restart time 120 sec, notification flag set
ipv4-unicast, forward flag set
4-octet-as: advertised and received
add-path: received
Remote:
ipv4-unicast: receive
enhanced-route-refresh: received
long-lived-graceful-restart: advertised and received
Local:
ipv6-unicast, restart time 10 sec
ipv4-unicast, restart time 20 sec
Remote:
ipv4-unicast, restart time 0 sec, forward flag set
fqdn: advertised and received
Local:
name: donatas-pc, domain:
Remote:
name: spine1-debian-11, domain:
software-version: advertised and received
Local:
GoBGP/3.10.0
Remote:
FRRouting/8.5-dev-MyOwnFRRVersion-gdc92f44a45-dirt
cisco-route-refresh: received
Message statistics:
```
FRR side:
```
root@spine1-debian-11:~# vtysh -c 'show bgp neighbor 192.168.10.17 json' | \
> jq '."192.168.10.17".neighborCapabilities.softwareVersion.receivedSoftwareVersion'
"GoBGP/3.10.0"
root@spine1-debian-11:~#
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Add an iproute2 API guard to the SVD test using `bridge fdb get`.
While it SHOULD be present on most systems based on their kernel
version it may not be present due to kernel/iproute2 version mismatch
weirdness.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Add the first of the dvni topotests. Covers just basic usage of importing
wildcard VNI and installing it via lwt encap.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Add new topo tests for validating mac learning, bridging and routing
with single vxlan device configuration
Signed-off-by: Sharath Ramamurthy <sramamurthy@nvidia.com>
Added a topotest to verify the scenarios below.
1. Verify OSPF Flood reduction functionality with ospf enabled on process level.
2. Verify OSPF Flood reduction functionality with ospf enabled on area level.
3. Verify OSPF Flood reduction functionality between different areas.
Have successfully tested these in my local setup.
Signed-off-by: nguggarigoud <nguggarigoud@vmware.com>
This test ensures that the regex used to filter as paths has to
be expressed in the asnotation of the BGP instance where prefixes
are received. 2 aspaths have been forged, both for AS 65540, but
only the former is expressed in asdot. If the local BGP instance
is expressed in asdot format, then only the former ASPATH will
properly match the incoming update. Conversely, when the local BGP
instance is expressed in plain format, then only the latter ASPATH
will properly match the incoming update.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This test performs AS handling operations on BGP instances,
and does some checks by using the asdot notation. AS4B values
are used for configuration.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Each BGP prefix may have an as-path list attached. A forged
string is stored in the BGP attribute and shows the as-path
list output.
Before this commit, the as-path list output was expressed as
a list of AS values in plain format. Now, if a given BGP instance
uses a specific asnotation, then the output is changed:
new output:
router bgp 1.1 asnotation dot
!
address-family ipv4 unicast
network 10.200.0.0/24 route-map rmap
network 10.201.0.0/24 route-map rmap
redistribute connected route-map rmap
exit-address-family
exit
!
route-map rmap permit 1
set as-path prepend 1.1 5433.55 264564564
exit
ubuntu2004# do show bgp ipv4
BGP table version is 2, local router ID is 10.0.2.15, vrf id 0
Default local pref 100, local AS 1.1
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*> 4.4.4.4/32 0.0.0.0 0 32768 1.1 5433.55 4036.61268 ?
*> 10.0.2.0/24 0.0.0.0 0 32768 1.1 5433.55 4036.61268 ?
10.200.0.0/24 0.0.0.0 0 32768 1.1 5433.55 4036.61268 i
10.201.0.0/24 0.0.0.0 0 32768 1.1 5433.55 4036.61268 i
The changes include:
- the aspath structure has a new field: asnotation type
The ashash list will differentiate 2 aspaths using a different
asnotation.
- 3 new printf extensions display the as number in the wished
format: pASP, pASD, pASE for plain, dot, or dot+ format (extended).
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
A new keyword permits changing the BGP as-notation output:
- [no] router bgp <> [vrf BLABLA] [as-notation [<dot|plain|dot+>]]
At the BGP instance creation, the output will inherit the way the
BGP instance is declared. For instance, the 'router bgp 1.1'
command will configure the output in the dot format. However, if
the user wants to choose an alternate output, they will have to
add the extra keyword: 'router bgp 1.1 as-notation dot+'.
Also, if the user wants to have plain format, even if the BGP
instance is declared in dot format, the keyword can also be used
for that.
The as-notation output is only taken into account at the BGP
instance creation. In the case where VPN instances are used,
a separate instance may be dynamically created. In that case,
the real as-notation format will be taken into account at the
first configuration.
Linking the as-notation format with the BGP instance makes sense,
as the operators want to keep consistency of what they configure.
One technical reason to link the as-notation output with the
BGP instance creation is that the as-path segment lists stored
in the BGP updates use a string representation to handle aspath
operations (by using regexp for instance). Changing on the fly
the output needs to regenerate this string representation to the
correct format. Linking the configuration to the BGP instance
creation avoids refreshing the BGP updates. A similar mechanism
is put in place in junos too.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
An AS number can be defined as an unsigned long number, or as
two uint16 values separated by a period (.). The possible
values are:
- usual 32 bit values : [1;2^32 -1]
- <1.65535>.<0.65535> for dot notation
- <0.65535>.<0.65535> for dot+ notation.
The 0.0 value is forbidden when configuring BGP instances
or peer configurations.
A new ASN type is added for parsing in the vty.
The following commands use that new identifier:
- router bgp ..
- bgp confederation ..
- neighbor <> remote-as <>
- neighbor <> local-as <>
- clear ip bgp <>
- route-map / set as-path <>
An asn library is available in lib/ and provides some
services:
- convert an as string into an as number.
- parse an as path list string and extract a number.
- convert an as number into a string.
Also, the bgp tests forge an as_zero_path, and to do that,
an API to relax the possibility to have a 0 as value is
specifically called from the tests.
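The conversions listed above can be illustrated with a small Python sketch.
This is not the C library in lib/; the function names and the `allow_zero`
parameter are made up for the illustration, and only the three notations
described in this message are assumed.

```python
AS4_MAX = 0xFFFFFFFF  # largest 32-bit AS number


def asn_from_str(text, allow_zero=False):
    """Convert 'N' (plain) or 'HIGH.LOW' (dot / dot+) to an integer.

    Hypothetical helper for illustration only.
    """
    if "." in text:
        high_s, _, low_s = text.partition(".")
        if not (high_s.isdigit() and low_s.isdigit()):
            raise ValueError("invalid AS number: %r" % text)
        high, low = int(high_s), int(low_s)
        if high > 0xFFFF or low > 0xFFFF:
            raise ValueError("AS number out of range: %r" % text)
        value = (high << 16) | low
    else:
        if not text.isdigit():
            raise ValueError("invalid AS number: %r" % text)
        value = int(text)
        if value > AS4_MAX:
            raise ValueError("AS number out of range: %r" % text)
    # 0 (or 0.0) is rejected unless the caller explicitly relaxes the check.
    if value == 0 and not allow_zero:
        raise ValueError("AS 0 is not allowed here")
    return value


def asn_to_str(value, notation="plain"):
    """Render an AS number in 'plain', 'dot' or 'dot+' notation."""
    if notation == "plain":
        return str(value)
    high, low = value >> 16, value & 0xFFFF
    if notation == "dot":
        # dot: 16-bit values stay plain, larger ones become HIGH.LOW
        return str(low) if high == 0 else "%d.%d" % (high, low)
    if notation == "dot+":
        # dot+: always HIGH.LOW, even for 16-bit values
        return "%d.%d" % (high, low)
    raise ValueError("unknown notation: %r" % notation)
```

For example, asn_from_str("1.1") gives 65537, and asn_to_str(264564564, "dot")
gives "4036.61268".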
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This is a preliminary work to handle various ways to configure
a BGP Autonomous System. When creating a BGP instance, the
user may want to define the AS number as a dotted value,
instead of using an integer value.
To handle both cases, an as_pretty char attribute will store
the as number as it has been given to the vtysh command:
router bgp <as number>
Wherever the AS integer of the BGP instance was dumped,
the original as_pretty format is now used.
The json output reuses the integer value to keep backward
compatibility with old displays.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The files converted in this commit either had some random misspelling or
formatting weirdness that made them escape automated replacement, or
have a particularly "weird" licensing setup (e.g. dual-licensed.)
This also marks a bunch of "public domain" files as SPDX License "NONE".
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Filtering out keys in JSON output with "grep -v" does not work when the
JSON does not use the pretty format.
Use native python code to filter out keys.
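As a rough sketch of the approach (not the code from this commit), keys can be
dropped recursively after parsing, which works regardless of how the JSON is
laid out. The helper name and the sample keys below are made up for the
illustration.

```python
import json


def strip_keys(data, unwanted):
    """Return a copy of parsed JSON with the unwanted keys removed at any depth."""
    if isinstance(data, dict):
        return {
            key: strip_keys(value, unwanted)
            for key, value in data.items()
            if key not in unwanted
        }
    if isinstance(data, list):
        return [strip_keys(item, unwanted) for item in data]
    return data


# Compact (non-pretty) JSON: everything is on one line, so a line-based
# "grep -v" would drop the whole document instead of a single key.
compact = '{"vrfName":"default","tableVersion":7,"routes":{"10.0.0.0/24":[{"valid":true,"version":3}]}}'
filtered = strip_keys(json.loads(compact), {"tableVersion", "version"})
```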
Fixes: 6c13bd5744 ("topotests: fix bgp_vpnv4_noretain")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
The test was sometimes failing around the sleep(4) for
waiting for the routes to be installed. Instead of blindly
sleeping let's check to see that the routes are actually
there in zebra and then continue on.
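The idea of polling instead of sleeping can be sketched as below. This is a
standalone illustration, not the topotest helper actually used; the function
names are made up, and `rib` is assumed to be the parsed JSON of a route dump
that maps each prefix to a list of entries carrying an "installed" flag.

```python
import time


def wait_until(check, timeout=30.0, interval=1.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call check() until it returns True or `timeout` seconds have elapsed."""
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def routes_installed(rib, prefixes):
    """True when every prefix has at least one entry flagged as installed."""
    return all(
        any(entry.get("installed") for entry in rib.get(prefix, []))
        for prefix in prefixes
    )
```

A test would then call something like
wait_until(lambda: routes_installed(fetch_rib(), prefixes)), where fetch_rib()
is a hypothetical function returning the parsed route table, and continue as
soon as the condition holds.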
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
1. Renamed "gates" to "nexthops"
2. Displaying afi of the nexthops being displayed in place of
"nexthops" JSON object in the old JSON output
3. Calling show_route_nexthop_helper() and show_nexthop_json_helper()
instead of print_nh() in order to keep the fields in "nexthops"
JSON object in sync with "nexthops" JSON object of
"show nexthop-group rib json".
Updated vtysh:
r1# show ip nht
192.168.0.2
resolved via connected
is directly connected, r1-eth0 (vrf default)
Client list: static(fd 28)
192.168.0.4
resolved via connected
is directly connected, r1-eth0 (vrf default)
Client list: static(fd 28)
Updated JSON:
r1# show ip nht json
{
"default":{
"ipv4":{
"192.168.0.2":{
"nhtConnected":false,
"clientList":[
{
"protocol":"static",
"socket":28,
"protocolFiltered":"none"
}
],
"nexthops":[
{
"flags":3,
"fib":true,
"directlyConnected":true,
"interfaceIndex":2,
"interfaceName":"r1-eth0",
"vrf":"default",
"active":true
}
],
"resolvedProtocol":"connected"
}
}
}
}
Signed-off-by: Pooja Jagadeesh Doijode <pdoijode@nvidia.com>
When running the build in a separate build directory, redirecting output
into a file can error out if the directory does not exist yet. Some
places already had `mkdir -p` calls, but not all.
Make all occurrences of this consistently use `@$(MKDIR_P)`.
(Extension of PR #12575 to catch more places.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Fix the following issues:
- two tests are done in one function. Split the tests into two
functions to make debugging easier.
- the first test passes even if a third prefix is not filtered. Match
the exact output to avoid false positives.
- the expected values contain variables like version. Do not check
variable values.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
After implementing ACCEPT_OWN extended community, bgpd can't import VPN
routes to the VRFs whose RD is matched with that of VPN routes. This
commit adds new test to check the effect of the next commit.
Signed-off-by: Ryoga Saito <ryoga.saito@linecorp.com>
Testcase: test_pim6_multiple_groups_different_RP_address_p2
was failing because of a bug in the framework. Fixed the
bug in this commit.
Signed-off-by: Kuldeep Kashyap <kashyapk@vmware.com>
Multicast pim6 static RP tests are failing
when run in parallel using micronet. There
are APIs to clean mcast traffic before
starting new test but these cleanups
are not needed when socat is used.
Signed-off-by: Kuldeep Kashyap <kashyapk@vmware.com>
Under really heavily loaded systems this is insufficient. Looking
at the run output we have this:
"2.1.3.22\/32":[
{
"installed":true,
}
],
"2.1.3.23\/32":[
{
"queued":true,
}
],
So after 10 seconds on the micronet system only 30 of the 100 routes are installed.
Give it more time.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Looks like under heavy load, the test is not giving enough
time to come to steady state. Do this:
a) send more udp packets and for longer
b) Increase time spent waiting
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
MPLS VPN networks can either peer with iBGP or eBGP. When
calculating the distance to send to zebra, the imported prefix
is never sent with distance information, even if the vty
command is used under the ipv4 unicast address family:
router bgp 65505 vrf vrf1
address-family ipv4 unicast
distance bgp 26 27 28
[vpn config]
The observation is that the distance sent to zebra for an
imported prefix is still 20:
[..]
VRF vrf1:
B> 192.168.0.0/24 [20/0] via 2.2.2.2 (vrf default) (recursive), label 20, weight 1, 00:00:12
* via 10.125.0.6, ntfp3 (vrf default), label implicit-null/20, weight 1, 00:00:12
The expectation is that the incoming prefix has to follow the
distance that is configured, or the distance derived from the peer
relationship established by the parent prefix.
In the case where an iBGP relationship is established and no distance
configuration is done, the below show is expected:
[..]
VRF vrf1:
B*> 192.168.0.0/24 [200/0] via 192.168.0.2, r1-gre0 (vrf default), label 20, weight 1, 00:00:12
In the case where an iBGP relationship is established and distance
configuration is performed as below:
[..]
distance bgp 21 201 41
[..]
Then the below show is expected:
[..]
VRF vrf1:
B*> 192.168.0.0/24 [201/0] via 192.168.0.2, r1-gre0 (vrf default), label 20, weight 1, 00:00:12
To get this behaviour, get the peer origin where the prefix is coming
from.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
I'm seeing test failures in micronet runs in CI
after 7 seconds * 30 attempts at seeing if it succeeds.
Let's see if another 60 seconds of attempts allows
this to work properly.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Single run of this test suite on my machine was 8 minutes.
Breaking this up into 3 test suites halves the run time.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This change alters the behavior of existing test code. The
default mode (before any call to luSetWaitType()) is now
"strict".
The historical behavior of luCommand(op="wait") is to ignore
failures to match the specified regexp in the specified time.
In those cases, no result was logged and no error was signaled.
This change introduces a new "strict" mode for luCommand(op="wait"):
in "strict" wait mode, each invocation of luCommand(op="wait")
generates an explicit, logged failure result when it fails to match
the specified regexp in the specified time. These failures signal
an error for the test.
Calling luSetWaitType("nostrict") restores the historical behavior.
Calling luSetWaitType("strict") (re)enables the new strict behavior.
Individual calls to luCommand() may also specify op="wait-nostrict"
to override any default and use the historical behavior.
Individual calls to luCommand() may also specify op="wait-strict"
to override any default and use the new behavior.
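The rules above can be summarized with a toy model. This is not the lutil
code: the class, its methods and the `outputs` argument are invented for the
illustration, and only the mode selection and failure logging described in
this message are modelled.

```python
import re


class WaitModel:
    """Toy model of the strict / nostrict wait semantics."""

    def __init__(self):
        self.wait_type = "strict"  # the new default
        self.failures = []         # regexps that never matched (strict only)

    def set_wait_type(self, wait_type):
        if wait_type not in ("strict", "nostrict"):
            raise ValueError("unknown wait type: %r" % wait_type)
        self.wait_type = wait_type

    def wait(self, outputs, regexp, op="wait"):
        """`outputs` stands in for the command output seen on each retry."""
        if op == "wait":
            strict = self.wait_type == "strict"
        elif op == "wait-strict":
            strict = True
        elif op == "wait-nostrict":
            strict = False
        else:
            raise ValueError("unknown op: %r" % op)
        for output in outputs:
            if re.search(regexp, output):
                return True
        if strict:
            # strict: a timeout is an explicit, logged failure
            self.failures.append(regexp)
        # nostrict: the timeout is silently ignored
        return False
```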
Signed-off-by: G. Paul Ziemba <paulz@labn.net>
Test that BFD static monitoring works:
When the BFD session is up, the routes are installed in the RIB and
distributed with a routing protocol (in this case BGP). When the session
is down, the routes are removed from the RIB and the removal is propagated.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Tests are failing in micronet because the Linux kernel needed is 4.19,
not 4.15.
2023-01-11 17:15:06,657.657 INFO: topolog.r1: vtysh command => "show zebra"
2023-01-11 17:15:06,657.657 DEBUG: topolog.r1: LinuxNamespace(r1): cmd_status("['/bin/bash', '-c', 'vtysh -c "show zebra" 2>/dev/null']", kwargs: {'encoding': 'utf-8', 'stdout': -1, 'stderr': -2, 'shell': False, 'stdin': None})
2023-01-11 17:15:06,729.729 INFO: topolog.r1: vtysh result:
OS Linux(4.15.0-193-generic)
Notice the missing pimreg11 device needed in vrf blue:
2023-01-11 17:15:06,731.731 DEBUG: topolog.r1: LinuxNamespace(r1): cmd_status("['/bin/bash', '-c', 'vtysh -c "show int brief" 2>/dev/null']", kwargs: {'encoding': 'utf-8', 'stdout': -1, 'stderr': -2, 'shell': False, 'stdin': None})
2023-01-11 17:15:06,781.781 INFO: topolog.r1: vtysh result:
Interface Status VRF Addresses
--------- ------ --- ---------
blue up blue 192.168.0.1/32
r1-eth0 up blue 192.168.100.1/24
r1-eth1 up blue 192.168.101.1/24
Interface Status VRF Addresses
--------- ------ --- ---------
erspan0 down default
gre0 down default
gretap0 down default
lo up default
pimreg up default
Interface Status VRF Addresses
--------- ------ --- ---------
r1-eth2 up red 192.168.100.1/24
r1-eth3 up red 192.168.101.1/24
red up red 192.168.0.1/32
While on a 5.4 machine we have this:
mininet310# show int brief
Interface Status VRF Addresses
--------- ------ --- ---------
blue up blue
dummy1 up blue
dummy2 up blue
pimreg11 up blue
As such let's limit the test to a 4.19 kernel or above, which our
documentation states we need for proper pim operation.
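A kernel version guard of this kind could look roughly like the sketch below.
It is a standalone illustration and not the helper the test framework
provides; the function name is made up.

```python
import platform
import re


def kernel_at_least(required, release=None):
    """Compare the numeric part of a kernel release against `required`.

    `release` defaults to the running kernel, e.g. '4.15.0-193-generic'.
    """
    if release is None:
        release = platform.release()

    def numeric(version):
        # Keep only the dotted numbers before the first '-' suffix.
        return tuple(int(n) for n in re.findall(r"\d+", version.split("-")[0]))

    have, want = numeric(release), numeric(required)
    width = max(len(have), len(want))
    # Pad with zeros so that '4.19' and '4.19.0' compare as equal.
    have += (0,) * (width - len(have))
    want += (0,) * (width - len(want))
    return have >= want
```

With the releases seen in the logs above, a 4.15.0-193-generic kernel fails
the check for "4.19", while a 5.4 kernel passes it, so the test would be
skipped on the former.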
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Previously, routes leaked from one VRF to another VRF were associated
with the original nexthop interface.
Commit 14aabc0156 replaced the nexthop
interface with the index of incoming VRF interface.
Due to this change, the `bgp_srv6l3vpn_route_leak` topotest always fails
because it still expects the nexthop interface.
This commit fixes the expected interface name in the
`bgp_srv6l3vpn_route_leak` topotest.
Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
To verify previous changes, this PR adds a topotest to verify whether
imported routes that are redistributed will be active in the other VRF RIB.
Signed-off-by: Ryoga Saito <ryoga.saito@linecorp.com>
Because of the issue described in the above link, pinging from vrf with
the command "ip vrf exec <vrf> ping -I <src> <addr>" may fail.
> root@topo:~# ip vrf exec vrf1 ping -c1 -I 192.168.2.1 192.168.1.1
> bind: Cannot assign requested address
Raise an error if pinging its own IP from a VRF fails. This test should
always pass unless the condition described in this issue is hit.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=203483
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Add an "exist" key to check the existence of a prefix in the BGP RIB.
Useful to check that a prefix has not leaked by error.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Update bgp_vrf_route_leak_basic to set up the VRF interfaces. Otherwise
the routes to the VRF interface are inactive.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Leaked connected routes now have the following nexthop interfaces:
- lo for routes imported from the default VRF
- or the VRF interface for routes imported from the other VRFs.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
The wq->spec.errorfunc is never used in the code.
It's been in the code base since 2005 and I also
do not remember ever seeing it being called. No
workqueue process function ever returns error.
Since it's not used let's just remove it from the
code base.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Building FRR with --enable-address-sanitizer and then running the
config_timing test makes the test run for over an hour on my machine.
The goal of this test is to ensure that the test runs 10000 routes
in/out in a reasonable amount of time. We cannot test this with
address-sanitizer enabled. So just make the test meaningless
from a timing perspective, but keep it `alive` in the sense that it
might still catch some address sanitizer issue, with 50 -vs- 10000 routes.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The L3VPN best path computation now takes into account the IGP metric.
Adapt the bgp_l3vpn_to_bgp_vrf tests so that routes with the best IGP
metric are selected when needed.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Earlier, a daemon parameter was passed to
start_topology(). It is no longer needed,
as new code is implemented to start
feature-specific daemons.
Signed-off-by: Kuldeep Kashyap <kashyapk@vmware.com>
Currently topotests start all daemons by default.
Changed the framework so that only the daemons
needed by a particular test suite are
started.
Signed-off-by: Kuldeep Kashyap <kashyapk@vmware.com>
This series of events would crash BGP before the previous commit:
a) Configure an interface-based peering
b) Shut the interface the peering is over
c) Remove the peering from bgp
Show that this no longer happens.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Check if we advertise more routes when an additional path comes up, and
if we withdraw them when it disappears.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
To verify previous changes, this PR introduces a topotest to verify
whether imported routes learnt from BGP unnumbered peers will be active
on VPN RIB and other VRF RIB.
Signed-off-by: Ryoga Saito <ryoga.saito@linecorp.com>
With a dead interval of 40 seconds, each test is waiting 40+
seconds for ospf convergence to occur because the DR is re-elected.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
In bgp_srv6l3vpn tests, check_ping checks reachability. However, this
function has a bug: if we set expect_connected to True, the check will
pass even if all ping packets are lost. This commit fixes this issue.
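One way to make such a check robust is to compare the number of received
replies with the expectation, as in the sketch below. This is an illustration
only, not the fixed check_ping from this commit; the helper names are made up,
and the summary line format is assumed to be the one printed by common Linux
ping implementations.

```python
import re


def ping_succeeded(output):
    """True if the ping summary line reports at least one received reply."""
    match = re.search(
        r"(\d+) packets transmitted, (\d+)(?: packets)? received", output
    )
    if match is None:
        # No summary line at all (e.g. bind error): treat as not connected.
        return False
    return int(match.group(2)) > 0


def ping_matches_expectation(output, expect_connected):
    """The check passes only when the observed state equals the expected one."""
    return ping_succeeded(output) == expect_connected
```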
Signed-off-by: Ryoga Saito <ryoga.saito@linecorp.com>
The `bgp_srv6l3vpn_to_bgp_vrf2` topotest tests the SRv6 IPv4 L3VPN
functionality. It applies the appropriate configuration in `bgpd` and
`zebra`, and then checks that the RIB is updated correctly.
The topotest expects to find the AS-Path in the RIB, which is only
present if the `bgp send-extra-data zebra` option is enabled in the
`bgpd` configuration.
Currently, the `bgp send-extra-data zebra` option is not set in the
`bgpd` configuration, which always causes the topotest to fail.
This commit fixes the `bgp_srv6l3vpn_to_bgp_vrf2` topotest by enabling
the `bgp send-extra-data zebra` option for both routers `r1` and `r2`.
Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
Add a test dedicated to confederation. It also takes into
account the support of a member AS with the same ID as the
confederation ID.
Signed-off-by: Francois Dumontet <francois.dumontet@6wind.com>
The previous commit changes the order of the srv6 locator parameters, so this
PR reflects that change.
Signed-off-by: Ryoga Saito <ryoga.saito@linecorp.com>
This commit adds a basic test for sharpd traffic control PoC, which will check
interface TC info from iproute2 `tc` cli.
Signed-off-by: Siger Yang <siger.yang@outlook.com>
This is for the run_and_expect_type and run_and_expect topotest methods.
Some contributions unintentionally get merged with very low values, which leads
to CI failures. Let's guard against this a bit.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Ensure that the minimum time spent in run and expect is
5 seconds. Heavy load is not a reason to fail a test.
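One way such a floor could be applied, as a sketch only. How the library
actually enforces the minimum is not stated here; the helper
enforce_minimum_wait is hypothetical and simply raises the retry count
while keeping the per-try wait:

```python
import math


def enforce_minimum_wait(count, wait, minimum=5.0):
    """Return a (count, wait) pair whose total polling time is >= minimum.

    Hypothetical helper: the retry count is raised, the wait is kept as-is.
    """
    if wait <= 0:
        raise ValueError("wait must be positive")
    if count * wait >= minimum:
        return count, wait
    # Round up so that count * wait never falls short of the minimum.
    return math.ceil(minimum / wait), wait
```

A caller asking for count=2, wait=1 would end up polling for 5 seconds,
while a request that already exceeds the floor is left untouched.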
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The isis_lfa_topo1 topotest regularly fails at step 24. The test expects
that the BFD session between rt1 and rt2 comes down after shutting the
link between rt1 and rt2.
Since the BFD session is multihop, it can come back up through rt3.
Set the BFD type to single-hop.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
We already have a global knob for graceful-shutdown, but it's handy to have a
per-neighbor knob as well.
Especially when a single neighbor needs to be restarted/shut down gracefully.
We can do this with route-maps, but this is a faster/cleaner way of doing the same
for an operator.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
This test ensures that the command `behavior usid` works properly.
When the `behavior usid` command is set, a flag is added to the locator
to indicate that the locator is a uSID locator. This test verifies that
the locator works correctly when you set / unset the `behavior usid`
command.
Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
Automated new scenarios for the multicast PIMv6
static RP test suite. Added a new folder,
multicast_pim6_static_rp_topo1, for PIMv6
static RP automation.
Signed-off-by: Kuldeep Kashyap <kashyapk@vmware.com>
When zebra receives routes from upper level protocols it decodes the
zapi message and places the routes on the metaQ for processing. Suppose
we have a route A that is already installed by some routing protocol.
And there is a route B that has a nexthop that will be recursively
resolved through A. Imagine if a route replace operation for A is
going to happen from an upper level protocol at about the same time
the route B is going to be installed into zebra. If these routes
are received, and decoded, at about the same time there exists a
chance that the metaQ will contain both of them at the same time.
If the order of installation is [ B, A ]. B will be resolved
correctly through A and installed, A will be processed and
re-installed into the FIB. If the nexthops have changed for
A then the owner of B should be notified about the change( and B
can do the correct action here and decide to withdraw or re-install ).
Now imagine if the order of routes received for processing on the
metaQ is [ A, B ]. A will be received, processed and sent to the
dataplane for reinstall. B will then be pulled off the metaQ and
fail the install since A is in a `not Installed` state.
Let's loosen the restriction in nexthop resolution for B such
that if the route we are dependent on is undergoing a route replace
operation, the resolution is allowed to succeed. This requires zebra to track a new
route state ( ROUTE_ENTRY_ROUTE_REPLACING ) that can be looked at
during nexthop resolution. I believe this is ok because A is
a route replace operation, which could result in this:
-route install failed, in which case B should be nht'ing and
will receive the nht failure and the upper level protocol should
remove B.
-route install succeeded, no nexthop changes. In this case
allowing the resolution for B is ok, NHT will not notify the upper
level protocol so no action is needed.
-route install succeeded, nexthops changed. In this case
allowing the resolution for B is ok, NHT will notify the upper
level protocol and it can decide to reinstall B or not based
upon its own algorithm.
This set of events was found by the bgp_distance_change topotest(s).
Effectively the tests were looking for the bug ( A, B order in the metaQ )
as the `correct` state. When under very heavy load, the A, B ordering
caused A to just be installed and fully resolved in the dataplane before
B was reached ( which is entirely possible ).
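The relaxed resolution rule can be modeled in a few lines. This is a
simplified Python model, not the zebra C code: only the state name
ROUTE_ENTRY_ROUTE_REPLACING comes from the text above, while the RouteFlag
enum and the function can_resolve_through are illustrative assumptions:

```python
from enum import Flag, auto


class RouteFlag(Flag):
    """Simplified stand-ins for zebra route entry state bits."""

    NONE = 0
    INSTALLED = auto()
    ROUTE_REPLACING = auto()  # models ROUTE_ENTRY_ROUTE_REPLACING


def can_resolve_through(flags):
    """Model of the relaxed check for the route B depends on.

    Before the change only an installed route qualified. Now a route that
    is in the middle of a route replace qualifies as well.
    """
    return bool(flags & (RouteFlag.INSTALLED | RouteFlag.ROUTE_REPLACING))
```

In the [ A, B ] ordering, A carries the replacing state while it is being
sent to the dataplane, so B resolves instead of failing.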
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Column headers in BGP routes table are not aligned with data when
RPKI status is available. This was fixed by inserting a space at the
beginning of the header and at the beginning of lines that do not
have RPKI status.
This fix requires that several testing templates be adjusted to
match the new output.
Signed-off-by: Wayne Morrison <wmorrison@netgate.com>
Add a switchover test that consists in:
- Setting up ISIS BFD between rt1 and rt2
- The no link-detect setting on rt1 eth-rt2 is still present so that
zebra does not take into account linkdown events on this interface.
- Shutting down rt1 eth-rt2 from the switch side
- Waiting for BFD to come down
Check that the switchover between primary and backup happens before the
SPF re-computation.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Add a switchover test that consists in clearing the rt2 neighbor on rt1.
Check that the switchover between primary and backup happens before the
SPF re-computation.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Add a switchover test that consists in:
- Setting no link-detect on rt1 eth-rt2 so that zebra does not take
into account linkdown events on this interface.
- Shutting down rt1 eth-rt2 from the switch side
- Waiting for the hello timer expiration
Check that the switchover between primary and backup happens before the
SPF re-computation.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Add a switchover test that consists in shutting down an interface.
Check that the switchover between primary and backup happens before the
SPF re-computation.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Add a switchover test that consists in:
- Setting up ISIS BFD between rt5 and rt6
- Setting no link-detect on rt6 eth-rt5 so that zebra does not take
into account linkdown events on this interface.
- Shutting down rt6 eth-rt5 from the switch side
Check that the switchover between primary and backup happens before the
SPF re-computation.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Add a switchover test that consists in shutting down an interface.
Check that the switchover between primary and backup happens before the
SPF re-computation.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Also, make sure we check if the advertisement table changed using FROM peer,
not TO peer.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
This commit extends the `bgp_srv6l3vpn_to_bgp_vrf3` topotest by adding
two tests:
* prevent bgpd from exporting routes from a VRF to the VPN RIB
(`no sid vpn per-vrf export`);
* enable bgpd to export routes from a VRF to the VPN RIB
(`sid vpn per-vrf export auto`).
Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
This commit adds a new topotest to verify the functionality of SRv6
locators with custom bits length parameters.
Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
This commit adds a new topotest which tests SRv6 L3VPN for IPv4 and
IPv6 address families using a single SID.
Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
This commit adds a new test case to the
test_zebra_seg6local_route topotest. The new test case performs two
operations:
* try to install a seg6local route with an End.DT46 action
* verify that the route is created correctly
Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
When bgp is using `bgp suppress-fib-pending` and the end
operator is using network statements, bgp was not sending
the network'ed prefixes to its peers. Fix this.
Also update the test cases for bgp_suppress_fib to test
this new corner case ( I am sure that there are going to
be others that will need to be added ).
Fixes: #12112
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Currently, if `bgp max-med on-startup` is configured, a timer for the
specified time is started after the BGP session is established for the
first time. When the timer expires, an UPDATE message should be sent to
reflect changes in the routes' MED value. The problem is that the routes
are being suppressed because based on the attributes they look like they
have not changed. However, in the case of max-med, the value is copied
to the packet directly from `bgp->maxmed_value`, not from the
attributes. Thus, changes in this case cannot be detected by comparing
attributes.
With this fix, avoid route suppressing when the `max-med on-startup`
timer expires and initiates an UPDATE.
Signed-off-by: Alexander Chernavin <achernavin@netgate.com>
Update the topojson scripts' assert messages
to make debugging easier when
a test fails.
Signed-off-by: Kuldeep Kashyap <kashyapk@vmware.com>
The bgp_gr_restart_retain_routes test is looking for specific output
that does not include the route's nexthop ID:
    def _bgp_check_kernel_retained_routes():
        output = (
            r2.cmd("ip route show 172.16.255.1/32 proto bgp dev r2-eth0")
            .replace("\n", "")
            .rstrip()
        )
        expected = "172.16.255.1 via 192.168.255.1 metric 20"
        diff = topotest.get_textdiff(
            output, expected, "Actual IP Routing Table", "Expected IP RoutingTable"
        )
        if diff:
            return False
        return True
While the output now includes nexthop group IDs:
    root@r2:# ip route show 172.16.255.1 proto bgp dev r2-eth0
    172.16.255.1 nhid 8 via 192.168.255.1 metric 20
Let's just mark r2 as not to use nexthop groups for installation
and this test issue will go away.
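To make the mismatch concrete, here is a small illustration using the two
strings quoted above. The strip_nhid helper is hypothetical and shows an
alternative that this commit does not take (the commit disables nexthop
groups on r2 instead):

```python
import re

expected = "172.16.255.1 via 192.168.255.1 metric 20"
actual = "172.16.255.1 nhid 8 via 192.168.255.1 metric 20"


def strip_nhid(line):
    """Drop the "nhid N" token so only prefix, gateway and metric remain."""
    return re.sub(r"\s+nhid \d+", "", line)
```

An exact text comparison of actual against expected reports a difference,
which is why the test fails once the kernel route carries an nhid.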
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
In order to minimize the changes to test files, this PR adds the `func-bits`
parameter to the SRv6 locator definition.
Signed-off-by: Ryoga Saito <ryoga.saito@linecorp.com>
The issue fixed in the previous commit now correctly triggers a failure:
("assertion (list_add(&head, &itm[j]) == &itm[j]) failed")
Turns out the "shitty" hash function was not shitty enough.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
We might disable sending unconfig/shutdown notifications when
Graceful-Restart is enabled and negotiated.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
When a primary global v6 unicast address is configured on an
unnumbered interface, BGP does not re-advertise updates out
with the new global v6 address as the nexthop.
Signed-off-by: Pdoijode <pdoijode@nvidia.com>
Before, it worked only when configured initially via the CLI. Later, when we
received a new route that should match a decent MED, we just skipped it, because
the MED mismatch was not recalculated.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
When redistributing connected addresses, the address family has
to be figured out. The calculation was not done, the next-hop
address length was not set, and as a consequence, the nexthop
is displayed as if it were an IPv6 address, which is wrong for
IPv4 addresses.
Calculate the family for connected addresses.
Change the topotests accordingly.
Fixes: ("7226bc40d606") bgpd: ignore NEXT_HOP for MP_REACH_NLRI
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This test ensures that MPLS VPN routes can be installed into a
gre interface with route-map l3vpn next-hop encapsulation command
set. On the other hand, if this command is not set, incoming bgp
routes are not considered as valid.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
- double the size of each new chunk request from zebra
- use bitfields to track label allocations in a chunk
- When allocating:
- skip chunks with no free labels
- search biggest chunks first
- start search in chunk where last search ended
- Improve API documentation in comments (bgp_lp_get() and callback)
- Tweak formatting of "show bgp labelpool chunks"
- Add test features (compiled conditionally on BGP_LABELPOOL_ENABLE_TESTS)
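The allocation strategy listed above can be sketched as follows. This is an
illustrative Python model, not the bgpd C implementation: the class names,
the starting label 16 and the initial chunk size are assumptions, and the
"start search in chunk where last search ended" optimization is left out
for brevity:

```python
class Chunk:
    """A contiguous label range whose allocations are tracked in a bitfield."""

    def __init__(self, first, size):
        self.first = first
        self.size = size
        self.bits = 0  # bit i set => label (first + i) is allocated
        self.free = size

    def allocate(self):
        """Return the lowest free label in this chunk, or None if full."""
        for i in range(self.size):
            if not self.bits & (1 << i):
                self.bits |= 1 << i
                self.free -= 1
                return self.first + i
        return None

    def release(self, label):
        """Mark a previously allocated label as free again."""
        i = label - self.first
        if 0 <= i < self.size and self.bits & (1 << i):
            self.bits &= ~(1 << i)
            self.free += 1


class LabelPool:
    def __init__(self, first_chunk_size=2, first_label=16):
        self.chunks = []
        self.next_request = first_chunk_size
        self.next_first = first_label  # hypothetical start of the range

    def _request_chunk(self):
        # Stand-in for asking zebra for a chunk; each request doubles in size.
        chunk = Chunk(self.next_first, self.next_request)
        self.next_first += self.next_request
        self.next_request *= 2
        self.chunks.append(chunk)
        return chunk

    def allocate(self):
        # Search the biggest chunks first and skip chunks with no free labels.
        for chunk in sorted(self.chunks, key=lambda c: c.size, reverse=True):
            if chunk.free:
                return chunk.allocate()
        return self._request_chunk().allocate()
```

Allocating seven labels from an empty pool in this model requests chunks of
size 2, 4 and 8 in turn.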
Signed-off-by: G. Paul Ziemba <paulz@labn.net>
When using as-override, we should be able to deny outgoing updates from
being propagated when `neighbor soo` is configured.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
To prove that this works, modify a test that uses mpls to
turn on mpls for the interfaces that need mpls via the
new mpls command.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This test is sometimes failing when it looks at the
v6 routes in the fib. Since the step before only
ensures that v3 ospf has just converged, let's
give it a bit of time so that the routes
have a chance to be installed too.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This test directory takes almost 7 minutes to complete. Splitting
it up into 3 test files drops that down to just over 3 minutes.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Enhanced a few existing PIM APIs to support both
IPv4 and IPv6 configuration. Added a few new APIs
for PIMv6. Tested all existing tests with the new
API changes.
Signed-off-by: Kuldeep Kashyap <kashyapk@vmware.com>
This test checks that there are no errors when receiving BFD
packets over the various linux vrf interfaces. For example, if
an incoming packet is received by the wrong socket, a VRF
mismatch error would occur, and BFD flapping would be observed.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
I rarely get this failure:
@classname: bgp_snmp_mplsl3vpn.test_bgp_snmp_mplsvpn
@name: test_pe1_converge_evpn
@time: 44.875
@message: AssertionError: BGP SNMP does not seem to be running
assert False
+ where False = <bound method SnmpTester.test_oid of <lib.snmptest.SnmpTester object at 0x7fa8562eb4f0>>('bgpVersion', '10')
+ where <bound method SnmpTester.test_oid of <lib.snmptest.SnmpTester object at 0x7fa8562eb4f0>> = <lib.snmptest.SnmpTester object at 0x7fa8562eb4f0>.test_oid
"Wait for protocol convergence"
tgen = get_topogen()
r1 = tgen.gears["r1"]
r1_snmp = SnmpTester(r1, "10.1.1.1", "public", "2c")
assertmsg = "BGP SNMP does not seem to be running"
> assert r1_snmp.test_oid("bgpVersion", "10"), assertmsg
E AssertionError: BGP SNMP does not seem to be running
E assert False
E + where False = <bound method SnmpTester.test_oid of <lib.snmptest.SnmpTester object at 0x7fa8562eb4f0>>('bgpVersion', '10')
E + where <bound method SnmpTester.test_oid of <lib.snmptest.SnmpTester object at 0x7fa8562eb4f0>> = <lib.snmptest.SnmpTester object at 0x7fa8562eb4f0>.test_oid
Under heavy system load a quick test before BGP can fully come up can result in a failed
test. Add some extra time for snmp to come up properly.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This test checks that when retain functionality is disabled,
some prefixes are removed from the BGP ipv4 vpn RIB.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
- Ignore the valgrind files of the daemonizing parent; these allocations
will be checked in the child.
- Check for memleaks at the end of the module/file, not just after tests.
Signed-off-by: Christian Hopps <chopps@labn.net>
The API to verify static routes was checking whether
the router has the route installed with the expected nexthop. In
this particular scenario we had the same route as
both Connected and Static. On a heavily loaded
system the static routes were taking time to get
installed and the API was doing the verification
on the Connected route instead of the Static route.
Enhanced the scripts to check only static routes.
Issue: https://github.com/FRRouting/frr/issues/11563
Signed-off-by: Kuldeep Kashyap <kashyapk@vmware.com>
When there is a change in the route-map policy associated with default-originate, the change is not reflected.
When the route-map associated with default-originate is deleted, the default route doesn't get withdrawn.
An Update message is not sent when only the route-map is removed from the default-originate config.
The SNT counter gets incremented on change of every policy associated with default-originate.
A route-map with multiple match clauses causes inconsistencies with default-originate.
Default-originate behaviour on BGP attributes.
Signed-off-by: ARShreenidhi <rshreenidhi@vmware.com>
This commit contains 2 test cases that cover:
1. Default-originate behaviour on restarting the BGP daemon and the FRR router
2. Default-originate behaviour on shut / no-shut of the interface
Signed-off-by: ARShreenidhi <rshreenidhi@vmware.com>
The issue was reported by Donald: we were hitting
a key-not-found error and execution was
stopped. This PR fixes it.
Signed-off-by: Kuldeep Kashyap <kashyapk@vmware.com>
The bgp_conditional_advertisement topotest runs all the test cases in
the same function. It is not easy to debug it because the pytest
"--pause" argument does not make breaks between test cases.
Dispatch the test-cases into functions to benefit from the "--pause"
feature.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Before this patch, 'ip ospf bfd' could be enabled via the '[no] ip ospf bfd profile ...' commands.
After this patch, '[no] ip ospf bfd profile ...' takes effect only if 'ip ospf bfd' is set.
Signed-off-by: Dmitrii Turlupov <dturlupov@factor-ts.ru>
Add support for peer-groups, because currently it is not possible to
configure a BGP role for peer-groups.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
A SR policy matches a BGP nexthop based on the IP address of
the nexthop and the color of the route (color may be assigned
to routes using a route-map).
The order of events (BGP route arrival, route-map definition,
policy and candidate-path definition) should not affect the
matching/mapping.
These changes add tests for:
- removing/adding BGP route after policy and routemap are
defined and held constant
- changing route map color to be different from policy color,
and then changing back to match
After each change, the policy should be observed to be in effect
unchanged from before, i.e., the route's nexthops should reflect
the matching SR policy.
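The order-independence described above follows if the match is a pure
lookup on the route's current nexthop and color. A minimal sketch, in which
the function policy_for_route, the policy name and the address are all
illustrative assumptions:

```python
def policy_for_route(policies, nexthop, color):
    """Look up the SR policy for a BGP route.

    policies maps (color, endpoint) to a policy name. The match depends only
    on the route's current nexthop address and color, never on the order in
    which the route, route-map or policy were configured.
    """
    return policies.get((color, nexthop))


# Hypothetical policy table: color 1 towards endpoint 192.0.2.6.
policies = {(1, "192.0.2.6"): "policy-color-1"}
```

Changing the route-map color to 2 makes the lookup return None, and
changing it back to 1 returns the same policy as before.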
Signed-off-by: G. Paul Ziemba <paulz@labn.net>
In topotests, we also want to check for role mismatch cases. However, if
we are testing the sender of a role mismatch notification, sometimes it
can have non-deterministic behavior (probably due to a configuration
change). Thus, there is an assumption that the recipient of
notifications will more consistently display the reason why the session
was terminated in the first place.
Signed-off-by: Eugene Bogomazov <eb@qrator.net>
I have a test failure:
r1.vtysh_cmd(
"sharp install seg6local-routes {} nexthop-seg6local dum0 {} 1".format(
dest, context
)
)
test_func = partial(
check,
r1,
dest,
manifest["out"],
)
success, result = topotest.run_and_expect(test_func, None, count=5, wait=1)
> assert result is None, "Failed"
E AssertionError: Failed
E assert Generated JSON diff error report:
E
E > $: d2 has the following element at index 0 which is not present in d1:
E
E {
E "prefix": "1::1/128",
E "protocol": "sharp",
E "selected": true,...
E
The test output for 1::1/128:
{
"1::1/128":[
{
"prefix":"1::1/128",
"prefixLen":128,
"protocol":"sharp",
"vrfId":0,
"vrfName":"default",
"selected":true,
"destSelected":true,
"distance":150,
"metric":0,
"queued":true,
"table":254,
"internalStatus":8,
Notice that it is still queued after 5 seconds. Under extremely heavy system load
this is not long enough for convergence. Also the zebra.log shows thread starvation
as well as long running tasks
2022/06/17 15:30:02 ZEBRA: [PHJDC-499N2][EC 100663314] STARVATION: task dplane_incoming_request (55b3ce0fea8b) ran for 6369ms (cpu time 0ms)
2022/06/17 15:30:02 ZEBRA: [T83RR-8SM5G] zebra 8.4-dev starting: vty@2601
2022/06/17 15:30:02 ZEBRA: [YZRX4-ZXG0C][EC 100663315] Thread Starvation: {(thread *)0x55b3ce6c15b0 arg=0x0 timer r=-6.375 rib_sweep_route() &zrouter.sweeper from zebra/main.c:447} was scheduled to pop greater than 4s ago
Increasing the time to 25 seconds to give it a chance.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
In the previous version, the time.sleep function was included to wait
for the moment when the routes were sent to all routers. Changed this
function to topotest.run_and_expect for more deterministic behavior.
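The polling pattern that replaces a fixed sleep can be sketched as below.
poll_until is a stand-alone, hypothetical helper and not the library
function; its count/wait parameters and (success, result) return value
mirror the run_and_expect call quoted elsewhere in this log:

```python
import time


def poll_until(func, expected, count=20, wait=1.0):
    """Call func() up to count times, sleeping wait seconds between tries.

    Returns (True, result) as soon as func() == expected,
    otherwise (False, last_result) once the tries are used up.
    """
    result = None
    for attempt in range(count):
        result = func()
        if result == expected:
            return True, result
        if attempt < count - 1:
            time.sleep(wait)
    return False, result
```

Unlike time.sleep, the loop returns as soon as the expected state is seen
and only spends the full count * wait budget when the state never arrives.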
Signed-off-by: Eugene Bogomazov <eb@qrator.net>
1. Removed the step from the hello test case with a hello
timer of 65535. This test works on some platforms
and does not work on others, affecting stability.
Signed-off-by: nguggarigoud <nguggarigoud@vmware.com>
This PR contains the basic BGP default-originate tests.
Details of the test cases are available in the respective script files.
Signed-off-by: ARShreenidhi <rshreenidhi@vmware.com>
RFC9234 is a way to establish correct connection roles (Customer/
Provider, Peer or with RS) between bgp speakers. This patch:
- Add a new configuration/terminal option to set the appropriate local
role;
- Add a mechanism for checking used roles, implemented by exchanging
the corresponding capabilities in OPEN messages;
- Add strict mode to force other party to use this feature;
- Add basic support for a new transitive optional bgp attribute - OTC
(Only to Customer);
- Add logic for default setting OTC attribute and filtering routes with
this attribute by the edge speakers, if the appropriate conditions are
met;
- Add two test stands to check role negotiation and route filtering
during role usage.
Signed-off-by: Eugene Bogomazov <eb@qrator.net>
In the last step of this test, r1's link to r2 is shut down but
both routers stay connected through a multi-hop LDP session. That
happens because r1 and r2 have a targeted adjacency created by
the pseudowire. The test then checks whether the pseudowire is
still up, using an alternate path for nexthop resolution.
Everything's fine except for the fact that LDP GTSM (aka
ttl-security) is enabled by default. This means that messages sent
over a multi-hop session are not delivered. In the case of this
test, it can prevent PW-Status notifications from being delivered,
which in turn can prevent the pseudowire from coming back up.
Fix the test by disabling GTSM so that LDP multi-hop sessions can
work normally. This is in accordance with RFC6720 which mentions
that GTSM should be disabled (statically or dynamically) for
multi-hop sessions.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Verifying and making sure PIM neighbors are
up before sending BSM packet using Scapy.
Verifying static routes are installed before
proceeding further.
Signed-off-by: Kuldeep Kashyap <kashyapk@vmware.com>
When you have a static route with multiple different admin
distances there exists a chance that route will have been
installed multiple times due to system load when inserted
at about the same time. If this is the case then the
verify_rib function can and will select the wrong route
that happens to have a nexthop group that is still installed.
Modify verify_rib to ensure that the route that is going to
be looked at for nexthop correctness is the actual installed
route, not a previous version of it.
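A sketch of the selection step, assuming the per-prefix JSON is a list of
route entries in which the active one carries an "installed" flag. The
helper name installed_entry and the sample entries are illustrative and are
not the verify_rib code:

```python
def installed_entry(entries):
    """Return the entry zebra reports as installed, or None if there is none."""
    for entry in entries:
        if entry.get("installed"):
            return entry
    return None


# Two versions of the same static route, as could happen under load.
entries = [
    {"protocol": "static", "distance": 10, "nexthops": [{"ip": "10.0.0.1"}]},
    {
        "protocol": "static",
        "distance": 5,
        "installed": True,
        "nexthops": [{"ip": "10.0.0.2"}],
    },
]
```

Nexthop checks then run against the returned entry only, rather than
against whichever entry happens to come first in the list.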
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The sporadic failures were happening because, under heavy load,
the r4 router could form an OSPF adjacency with r3 a few seconds
before doing the same with r2. In that interim, LDP could establish
a neighborship with r2 going through r3 (instead of connecting
directly). That would cause all label mappings received from r3
to be ignored since they can't be mapped to the routes' nexthops
received from zebra, causing all sorts of test failures. None of
this is erroneous behavior as LDP simply follows the IGP.
The fix consists of updating the test to ensure all expected OSPF
adjacencies fully converged before proceeding to the LDP checks.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
There are a couple of steps listing what is being done that are both imprecise
and misleading. Fix them to actually say what is going on.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The reachable router table is used by OSPF opaque clients in order to
determine if the router advertising the opaque LSA data is
reachable (i.e., 2-way connectivity check).
Signed-off-by: Christian Hopps <chopps@labn.net>
Related: https://datatracker.ietf.org/doc/html/draft-ietf-idr-bfd-subcode
When BFD Down notification comes and BGP is configured to track on BFD events,
send BGP Cease/BFD Down notification to the peer.
If RFC 8538 is enabled (Notification support for Graceful-Restart), notification
should be encapsulated into Hard Reset message.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
If at first you succeed try try again.
No I mean if it works the first time no need to do
the same command again.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
"ip vrf exec" command is not possible in the topotest shell.
> root@r1:~# ip vrf exec r1-cust5 bash
> mkdir failed for /sys/fs/cgroup/unified: No such file or directory
> Failed to setup vrf cgroup2 directory
Remount cgroup after remounting sysfs.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
This breakup converts run times for test_bgp_auth.py from
~9 minutes to just over 2 and a half minutes of run
time when running in parallel.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When Segment Routing is disabled, if isisd receives an LSP with Segment Routing
information, in particular prefix SIDs, it installs the corresponding MPLS entries
although it should not, as SR is disabled.
This patch adds an extra check to verify whether SR is enabled before
configuring the MPLS LFIB & IP FIB with prefix SIDs, and adjusts the SR & TI-LFA
tests accordingly.
Signed-off-by: Olivier Dugeon <olivier.dugeon@orange.com>
The cspf_topo1 test is comparing the adj-sid value that is
assigned dynamically based upon bring up order. Under very
large scale this order changes causing the test to fail.
Since the adj-sid is dynamically allocated and appears to
be tested elsewhere, let's remove it from the grab all check.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add a switchover test that consists in:
- Setting up ISIS BFD between rt1 and rt2
- The no link-detect setting on rt1 eth-rt2 is still present so that
zebra does not take into account linkdown events on this interface.
- Shutting down rt1 eth-rt2 from the switch side
- Waiting for BFD to come down
Check that the switchover between primary and backup happens before the
SPF re-computation.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Add a switchover test that consists in clearing the rt2 neighbor on rt1.
Check that the switchover between primary and backup happens before the
SPF re-computation.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Add a switchover test that consists in:
- Setting no link-detect on rt1 eth-rt2 so that zebra does not take
into account linkdown events on this interface.
- Shutting down rt1 eth-rt2 from the switch side
- Waiting for the hello timer expiration
Check that the switchover between primary and backup happens before the
SPF re-computation.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Add a switchover test that consists in shutting down an interface.
Check that the switchover between primary and backup happens before the
SPF re-computation.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Add a switchover test that consists in:
- Setting up ISIS BFD between rt5 and rt6
- Setting no link-detect on rt6 eth-rt5 so that zebra does not take
into account linkdown events on this interface.
- Shutting down rt6 eth-rt5 from the switch side
Check that the switchover between primary and backup happens before the
SPF re-computation.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Add a switchover test that consists in shutting down an interface.
Check that the switchover between primary and backup happens before the
SPF re-computation.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
As of now we log only the JSON output of CLIs
in topotests (topojson) executions, and the same output
gets printed twice, which is of no use.
Enhanced the code to show both the plain and the JSON output
of CLIs and to remove the duplicate logging.
This helps reduce the execution logs and helps
verification when there is a mismatch
between the plain and JSON CLI outputs.
Signed-off-by: Kuldeep Kashyap <kashyapk@vmware.com>
This test is sometimes failing under severe load. Give some time
for the linux rule installation to actually be registered by the
system before declaring failure.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Allowing only 4 seconds for a bfd test to synchronize is going
to run into problems on extremely loaded systems. The test
system should value that it actually converged over whether it
converged in a reasonable time, especially on test systems
that are loaded because of many multiples of tests running
at the same time. If it is important to actually test
that something got done by the RFC, the CI system as it
is currently written is not the correct place for this.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Under heavy load I am seeing verify_rib failing after 12 seconds
but succeeding after 17:
2022-05-19 18:52:54,374 DEBUG: topolog: Exiting lib API: verify_rib
2022-05-19 18:52:54,374 DEBUG: topolog: Function returned True
2022-05-19 18:52:54,374 WARNING: topolog: RETRY DIAGNOSTIC: SUCCEED after FAILED with requested timeout of 12.0s; however, succeeded in 14.7s, investigate timeout timing
There is no reason to not have the test wait a bit longer for very very
heavily loaded systems. Change the time to 40 seconds.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Lots of tests call verify_rib that takes a list of routes that
need to be verified in some fashion. This verify_rib functionality
will try up to 12 seconds before failing the check that zebra
has the route and has installed it.
Unfortunately the verify_rib code was not looking to see if
the route was queued for installation and was then allowing
tests to immediately do subsequent steps that depended on
that route actually being installed sometimes causing tests
to fail.
Write a bit of additional code that looks at the queued
status and allows the test to wait a bit longer for zebra
to finish processing before allowing the test to move on
to the next bit.
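The extra condition can be sketched as a predicate over the JSON entries
for a prefix. The "queued" key appears in the route JSON quoted elsewhere
in this log; the "installed" key and the helper name route_settled are
assumptions made for illustration:

```python
def route_settled(entries):
    """True when some entry is installed and none is still queued.

    A queued entry means zebra has not finished processing the route, so
    the caller should keep waiting instead of moving on.
    """
    if any(entry.get("queued") for entry in entries):
        return False
    return any(entry.get("installed") for entry in entries)
```

A retry loop would keep polling while this returns False, up to its
timeout.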
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This test is failing intermittently because sometimes the igmp
local join is not getting deleted. I split the joins, which means
deleting the igmp local joins one by one. I ran the
tests multiple times and they seem to be working fine with the
current changes.
There was an issue found during debugging this test failure,
which was raised already:
Issue# https://github.com/FRRouting/frr/issues/11105
Signed-off-by: Kuldeep Kashyap <kashyapk@vmware.com>
Firstly, *keep no change* for `hash_get()` with NULL
`alloc_func`.
Only focus on cases with non-NULL `alloc_func` of
`hash_get()`.
Since `hash_get()` with non-NULL `alloc_func` parameter
shall not fail, just ignore the returned value of it.
The returned value must not be NULL.
So in this case, remove the unnecessary checking NULL
or not for the returned value and add `void` in front
of it.
Importantly, also *keep no change* for the two cases with
non-NULL `alloc_func` -
1) Use `assert(<returned_data> == <searching_data>)` to
ensure it is a created node, not a found node.
Refer to `isis_vertex_queue_insert()` of isisd, there
are many examples of this case in isisd.
2) Use `<returned_data> != <searching_data>` to judge it
is a found node, then free <searching_data>.
Refer to `aspath_intern()` of bgpd, there are many
examples of this case in bgpd.
Here, <returned_data> is the returned value from `hash_get()`,
and <searching_data> is the data, which is to be put into
hash table.
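To illustrate the two cases above, here is a small Python model of the
lookup-or-insert behaviour. It is only a sketch of the semantics described
in this message, not the libfrr C implementation; the Path class and the
dict-based table are assumptions made for the example.

```python
class Path:
    """Hypothetical hashable item, compared by value."""

    def __init__(self, hops):
        self.hops = tuple(hops)

    def __eq__(self, other):
        return isinstance(other, Path) and self.hops == other.hops

    def __hash__(self):
        return hash(self.hops)


def hash_get(table, data, alloc_func=None):
    """Return the stored item equal to data, allocating it if requested."""
    found = table.get(data)
    if found is not None:
        return found
    if alloc_func is None:
        return None  # pure lookup: a miss is the only way to get None
    new_item = alloc_func(data)
    table[new_item] = new_item
    return new_item


table = {}

# Case 1: the item must be newly created, so the result is the same object.
first = Path([1, 2])
assert hash_get(table, first, lambda d: d) is first

# Case 2: interning. An equal item already exists, so the caller's copy
# is redundant and would be freed in C.
duplicate = Path([1, 2])
interned = hash_get(table, duplicate, lambda d: d)
assert interned is first and interned is not duplicate
```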
Signed-off-by: anlan_cs <vic.lan@pica8.com>
1. Renamed the pim APIs to generic names; the same APIs will be used for PIMv4 and PIMv6
verifications
2. Modified all affected scripts and ran them multiple times locally to avoid CI failures
Signed-off-by: Kuldeep Kashyap <kashyapk@vmware.com>
New compilers are noticing that the tests are compiled with
a pointer for the bgpd_privs variable while, in the bgp library
that is being linked against, it is not a pointer. Since
these tests had the declaration just to make the compiler
happy, let's actually align the variable type to make the
compiler even happier.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The test is testing whether interface flaps are causing
the appropriate pim reactions. Unfortunately the test
is turning off the multicast stream and the test also
has a keep alive timer of 15 seconds set on all routers,
which of course means the test has 15 seconds (at most) to finish
testing. This is not always possible given system loads.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The test_multicast_pim_sm_topo3.py test is both spending extra time
looking for state that will never occur and generating a support
bundle when it doesn't find it. Fix the test to come to the correct
solution faster.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add a test case where a kernel route depends on a kernel route
and when you perturb an interface, ensure that FRR does not
lose the route.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add a test case to ensure that Kernel routes are not lost
when there are multiple overlapping connected routes.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
a) Remove the retry mechanism to continue looking for 75%
of the time for pim code.
This alone saves a bunch of time in tests that use lib/pim.py
Effectively all the times given for retry are already long
enough. Additionally some tests are gathering data with
the expectation that they will not find data so the entire
time is being taken up in retries. Extending the retry
mechanism makes this even worse. This is especially bad
for pim in that keep alive timers are counting down and
state can be removed due to excessive time waiting.
b) Reduce verify_multicast_traffic from 40 seconds
to 20 seconds to gather traffic data.
A bunch of tests are doing this:
a) gather pre test start traffic data( taking about 70
seconds to run, because a bunch of time it was looking
for data that does not exist yet)
b) run a change to introduce a different traffic flow
c) gather post test traffic data ( taking about 70
seconds to run )
Why does this matter? Tests were iterating through
all the different routers looking for traffic flow
as well as different mroute state. This is against
the keepalive timer of 210 seconds. It does not take
long before the stream can be removed and the test is
still looking for data that is no longer there due
to state timeout.
The multicast_pim_sm_topo3/test_multicast_pim_sm_topo3.py
test reduced run time from 398 seconds to 297 seconds,
greatly reducing keepalive timeout problems.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
it wants yang models installed which will only be there if frr has been
installed before, causing `make check` to fail when run on a system on
which frr has not been installed when GRPC is enabled (--enable-grpc)
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
1. Handle KeyError
2. The logger object is defined in the main function and is not accessible
in other functions, so define it in the local functions.
Signed-off-by: Kuldeep Kashyap <kashyapk@vmware.com>
Improve the test case to show database info as well,
to help narrow down whether it is an LSA origination problem or
a route calculation problem in case of failures.
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
RB-tree and double-linked-list easily support backwards iteration, and
a use case seems to have popped up. Let's make it accessible.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Do not allow the test system to turn off the logging of commands.
Some tests use the reload command, which is accidentally turning off
the logging. Just force the tests to ignore it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Test ospf running with 3 vrfs: default, neno, ray
Route leaking is set up via bgp between default and neno vrfs
Leaked routes include connected and ospf
Included test:
1- OSPF convergence
2- zebra/kernel routes
Signed-off-by: Jafar Al-Gharaibeh <jafar@atcorp.com>
This PR adds support for configuring topotest routers using a single file.
Instead of:
```
router.load_config(
TopoRouter.RD_ZEBRA, os.path.join(CWD, "{}/zebra.conf".format(rname))
)
router.load_config(
TopoRouter.RD_OSPF, os.path.join(CWD, "{}/ospfd.conf".format(rname))
)
router.load_config(
TopoRouter.RD_BGP, os.path.join(CWD, "{}/bgpd.conf".format(rname))
)
```
you can now do:
```
router.load_frr_config(
os.path.join(CWD, "{}/frr.conf".format(rname)),
[TopoRouter.RD_ZEBRA, TopoRouter.RD_OSPF, TopoRouter.RD_BGP]
)
```
or just:
```
router.load_frr_config(os.path.join(CWD, "{}/frr.conf".format(rname)))
```
In this latter case, the daemons list will be inferred from the frr.conf file.
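As a rough illustration of what inferring the daemons from a config file
could look like, the sketch below scans for "router <proto>" lines. This is
not the topogen implementation; the keyword table, the infer_daemons name
and the assumption that zebra is always needed are all hypothetical.

```python
import re

# Hypothetical mapping from a "router <proto>" keyword to a daemon name.
ROUTER_KEYWORD_TO_DAEMON = {
    "bgp": "bgpd",
    "ospf": "ospfd",
    "ospf6": "ospf6d",
    "isis": "isisd",
    "rip": "ripd",
}


def infer_daemons(config_text):
    """Return a sorted list of daemons hinted at by 'router <proto>' lines."""
    daemons = {"zebra"}  # assumption: zebra is always required
    for line in config_text.splitlines():
        match = re.match(r"router\s+(\S+)", line)
        if match and match.group(1) in ROUTER_KEYWORD_TO_DAEMON:
            daemons.add(ROUTER_KEYWORD_TO_DAEMON[match.group(1)])
    return sorted(daemons)
```

Daemons that have no "router" stanza (pim, for example) would need their
own detection rule, which is why passing the explicit list remains useful.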
Signed-off-by: Jafar Al-Gharaibeh <jafar@atcorp.com>
When running a topotest with the --shell or --vtysh argument, the
window titles of the routers are generic.
Set the router name as title to identify correctly the window.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Opening new tab in screen is not possible when using option --vtysh or
--shell. Error 'No such file or directory'.
Fix the issue.
Fixes: 6a5433ef0b ("tests: NEW micronet replacement for mininet")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
noticed that pylint was complaining about some easily
fixable stuff in test_route_map_topo1.py so let's clean
it up some.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Added a topotest to verify the combinations below.
Auth support for md5
Auth support for hmac-sha-256
Auth support with keychain for md5
Auth support with keychain for hmac-sha-256
All 4 test cases ran successfully in my local setup.
Signed-off-by: Abhinay Ramesh <rabhinay@vmware.com>
isis_tlvs.c would fail at multiple places if incorrect TLVs were
received causing stream assertion violations.
This patch fixes the issues by adding missing length checks, missing
consumed length updates and handling malformed Segment Routing subTLVs.
Signed-off-by: Juraj Vijtiuk <juraj.vijtiuk@sartura.hr>
Small adjustments by Igor Ryzhov:
- fix incorrect replacement of srgb by srlb on lines 3052 and 3054
- add length check for ISIS_SUBTLV_ALGORITHM
- fix conflict in fuzzing data during rebase
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Add a new topotest for the Constrained Shortest Path First (CSPF) algorithm.
This topotest uses IS-IS-TE as the base network to populate a Traffic Engineering
Database (TED) and sharpd to call the CSPF algorithms on this IS-IS-TE topology.
Signed-off-by: Olivier Dugeon <olivier.dugeon@orange.com>
When link-param is enabled for a given interface, the TE metric is automatically
assigned the metric of the interface. However, the metric of the interface
could be unassigned and keep the default value equal to 0. Thus, if the TE
metric is not explicitly modified within the `link-param metric` statement,
the TE metric remains set to 0, which is not a valid value, especially when
computing a constrained path.
This patch changes the assignment of the default value of the TE metric.
It is set to the metric of the interface only if the latter is not equal to 0.
TE topotests for OSPF and IS-IS have been adjusted accordingly.
Signed-off-by: Olivier Dugeon <olivier.dugeon@orange.com>
Test the ability to use the following configure command with a Y value:
no neighbor X.X.X.X maximum-prefix-out Y
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Introduces a topotest to validate proper AS-Path manipulation when using
"neighbor ... remove-private-AS".
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
Opaque data takes up a lot of memory when there are a lot of routes on
the box. Given that this is just cosmetic info, I propose to disable
it by default to not shock people who start using FRR for the first time
or upgrade from an old version.
Fixes #10101.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
VRF name should not be printed in the config since 574445ec. The update
was done for NB config output but I missed it for regular vty output.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
The current maximum-prefix-out topo-test starts a configuration with a
maximum-prefix-out.
Test the application of new maximum-prefix-out value without clearing
the neighbor.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Redistribution for ospf with instance id's using instance id's
was incorrect. Add some small tests to make sure it catches the
issues and we don't regress.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This just tries logging messages in random ways to allow the fuzzer to
do its thing and try to find weird edge cases.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The test case test_PIM_hello_tx_rx_p1 is failing randomly because
sometimes the hello packet is received and sometimes not received while getting
the stats data.
When the hello packet is received HelloRx gets incremented to 1 and then
shutdown of the interface is executed which resets the stats to 0
and again when "no shutdown" of the interface is done, the stats get incremented to 1.
The test case checks after "no shutdown" of the interface whether the stats have incremented,
but in this case, although the stats got incremented, the before and after values are the same.
Hence the test case failed.
Add the correct expectations to the test case.
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
Adding an `s` after these printfrr specifiers replaces 0.0.0.0 / :: in
the output with a star (`*`). This is primarily intended for use with
multicast, e.g. to print `(*,G)`.
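The effect can be illustrated with a short Python sketch. It only mimics
the output described above and is not the printfrr C code; the format_sg
name is an assumption made for the example.

```python
import ipaddress


def format_sg(source, group):
    """Format an (S,G) pair, printing '*' for an unspecified address."""

    def fmt(addr):
        ip = ipaddress.ip_address(addr)
        # 0.0.0.0 and :: are the unspecified addresses
        return "*" if ip.is_unspecified else str(ip)

    return "({},{})".format(fmt(source), fmt(group))


print(format_sg("0.0.0.0", "239.1.1.1"))   # (*,239.1.1.1)
print(format_sg("10.0.0.1", "239.1.1.1"))  # (10.0.0.1,239.1.1.1)
```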
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Since this is only used in very few places, moving it out of the way is
reasonable. (`%pSG` will be pim_sgaddr)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The test is doing this:
a) gather interface data about packets sent
b) shut interface
c) no shut interface
d) gather interface data about packets sent
e) compare a to d and fail if packets sent/received has not incremented
The problem is, of course, that under heavy system load sufficient time
might not have passed for packets to be sent between c and d. Add up to
35 seconds of looking for packet data being incremented, else heavily
loaded systems may never show that data is being sent.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
verify_pim_interface_traffic *fetches* the pim
traffic data. Rename the function to what it
actually does
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The nhrp_topo test sets up some infrastructure and
was displaying the commands it was outputting
incorrectly. Fix this.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When used with LLGR, setting the GR restart-time timer to 0 should be allowed,
so that LLGR timers start immediately.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
The following subcodes are defined for the Cease NOTIFICATION
message:
Subcode    Symbolic Name
1          Maximum Number of Prefixes Reached
2          Administrative Shutdown
3          Peer De-configured
4          Administrative Reset
5          Connection Rejected
6          Other Configuration Change
7          Connection Collision Resolution
8          Out of Resources
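For scripts that need to decode these values, the table above maps
directly onto a lookup table. The sketch below is illustrative only; the
CEASE_SUBCODES and cease_subcode_name names are assumptions, not bgpd
identifiers.

```python
# Cease NOTIFICATION subcodes, copied from the table above.
CEASE_SUBCODES = {
    1: "Maximum Number of Prefixes Reached",
    2: "Administrative Shutdown",
    3: "Peer De-configured",
    4: "Administrative Reset",
    5: "Connection Rejected",
    6: "Other Configuration Change",
    7: "Connection Collision Resolution",
    8: "Out of Resources",
}


def cease_subcode_name(subcode):
    """Return the symbolic name for a Cease subcode, or a placeholder."""
    return CEASE_SUBCODES.get(subcode, "Unknown ({})".format(subcode))
```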
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Currently, it is possible to rename the default VRF either by passing
`-o` option to zebra or by creating a file in `/var/run/netns` and
binding it to `/proc/self/ns/net`.
In both cases, only zebra knows about the rename and other daemons learn
about it only after they connect to zebra. This is a problem, because
daemons may read their config before they connect to zebra. To handle
this rename after the config is read, we have some special code in every
single daemon, which is not very bad but not desirable in my opinion.
But things are getting worse when we need to handle this in northbound
layer as we have to manually rewrite the config nodes. This approach is
already hacky, but still works as every daemon handles its own NB
structures. But it is completely incompatible with the central
management daemon architecture we are aiming for, as mgmtd doesn't even
have a connection with zebra to learn from it. And it shouldn't have it,
because operational state changes should never affect configuration.
To solve the problem and simplify the code, I propose to expand the `-o`
option to all daemons. By using the startup option, we let daemons know
about the rename before they read their configs so we don't need any
special code to deal with it. There's an easy way to pass the option to
all daemons by using `frr_global_options` variable.
Unfortunately, the second way of renaming by creating a file in
`/var/run/netns` is incompatible with the new mgmtd architecture.
Theoretically, we could force daemons to read their configs only after
they connect to zebra, but it means adding even more code to handle a
very specific use-case. And anyway this won't work for mgmtd as it
doesn't have a connection with zebra. So I had to remove this option.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Currently the Wait for Install code ( bgp_suppress_fib ) does
not properly handle two states from zebra: ROUTE_INSTALL_FAILED
and BETTER_ADMIN_DISTANCE_WON. Before this change the WFI code
would just never notify our peers about a route install failure
but more is needed. In the ROUTE_INSTALL_FAILED and the
BETTER_ADMIN_DISTANCE_WON we need to notify our peers with
a withdrawal about the route, else we will continue to
draw traffic to us when we cannot legally do so.
Why is this needed? In either case imagine that we've already
received a bgp route, installed it and sent it to our peers.
In the BETTER_ADMIN_DISTANCE_WON case, say a static route is installed.
At this point in time we must stop advertising the route through
us since our route is not installed. As such a withdrawal must be sent.
In the ROUTE_INSTALL_FAILED case, the code was not properly handling
the situation where we have Route A, it was successfully installed,
and then we received an update to Route A that was attempted to be
installed but failed. In this case we also need to send a withdrawal.
Finally update the bgp_suppress_fib topotest to test both of these
situations.
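The decision described above can be modelled in a few lines. This is a
simplified sketch, not the bgpd code: the ZebraNotify enum, the
ROUTE_INSTALLED member, the action_for_notification name and the string
results are assumptions; only the two failure state names come from this
message.

```python
from enum import Enum


class ZebraNotify(Enum):
    ROUTE_INSTALLED = 1  # assumed name for the success case
    ROUTE_INSTALL_FAILED = 2
    BETTER_ADMIN_DISTANCE_WON = 3


def action_for_notification(note, previously_advertised):
    """Return what to tell peers after zebra reports on a route."""
    if note is ZebraNotify.ROUTE_INSTALLED:
        return "advertise"
    # Both failure states mean the bgp route is not in the FIB.
    if previously_advertised:
        return "withdraw"  # peers must stop sending traffic to us
    return "suppress"  # never advertised, so nothing to withdraw
```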
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
There still existed chances that best path consideration
has not taken place for both bgp_l3vpn_to_bgp_vrf and
bgp_instance_del_test ( since they both used the same
check_routes.py scripting ). Add some more checks
to ensure that we have all the data. Prior to this
change I could see one of these two tests failing
every 2-3 runs on my test system. I am not seeing
this anymore after ~5 complete test runs.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
OSPF when converging will choose a DR / Backup DR based upon
who has already come up, irrespective of priority. As such, if
under system load OSPF comes up first and elects a DR that under
normal circumstances would not be the elected one due to priority,
OSPF does not go back through and re-elect, in order to keep the
system stable. Tests are experiencing this:
unet> r0 show ip ospf neigh
Neighbor ID Pri State Up Time Dead Time Address Interface RXmtL RqstL DBsmL
100.1.1.1 99 Full/Backup 4m14s 3.780s 10.0.1.2 r0-s1-eth0:10.0.1.1 0 0 0
100.1.1.2 0 Full/DROther 4m14s 3.848s 10.0.1.3 r0-s1-eth0:10.0.1.1 0 0 0
100.1.1.3 0 Full/DROther 4m14s 3.912s 10.0.1.4 r0-s1-eth0:10.0.1.1 0 0 0
unet> r1 show ip ospf neigh
Neighbor ID Pri State Up Time Dead Time Address Interface RXmtL RqstL DBsmL
100.1.1.0 98 Full/DR 4m15s 3.011s 10.0.1.1 r1-s1-eth1:10.0.1.2 0 0 0
100.1.1.2 0 Full/DROther 4m19s 3.124s 10.0.1.3 r1-s1-eth1:10.0.1.2 0 0 0
100.1.1.3 0 Full/DROther 4m19s 3.188s 10.0.1.4 r1-s1-eth1:10.0.1.2 0 0 0
unet> r2 show ip ospf neigh
Neighbor ID Pri State Up Time Dead Time Address Interface RXmtL RqstL DBsmL
100.1.1.0 98 Full/DR 4m27s 3.483s 10.0.1.1 r2-s1-eth0:10.0.1.3 0 0 0
100.1.1.1 99 Full/Backup 4m32s 3.527s 10.0.1.2 r2-s1-eth0:10.0.1.3 0 0 0
100.1.1.3 0 2-Way/DROther 4m32s 3.660s 10.0.1.4 r2-s1-eth0:10.0.1.3 0 0 0
unet> r3 show ip ospf neigh
Neighbor ID Pri State Up Time Dead Time Address Interface RXmtL RqstL DBsmL
100.1.1.0 98 Full/DR 4m55s 3.786s 10.0.1.1 r3-s1-eth1:10.0.1.4 0 0 0
100.1.1.1 99 Full/Backup 4m55s 3.829s 10.0.1.2 r3-s1-eth1:10.0.1.4 0 0 0
100.1.1.2 0 2-Way/DROther 4m54s 3.897s 10.0.1.3 r3-s1-eth1:10.0.1.4 0 0 0
Modify the test to do a clear to enforce the order we are specifically looking for.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
- Add advertisement of Global IPv6 address in IIH pdu
- Add new CLI to set IPv6 Router ID
- Add advertisement of IPv6 Router ID
- Correctly advertise IPv6 local and neighbor addresses in Extended IS and MT
Reachability TLVs
- Correct output of Neighbor IPv6 address in 'show isis database detail'
- Manage IPv6 addresses advertisement and corresponding Adjacency SID when
IS-IS is not using Multi-Topology by introducing a new ISIS_MT_DISABLE
value for mtid (== 4096 i.e. first reserved flag set to 1)
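The arithmetic behind the ISIS_MT_DISABLE value can be checked quickly.
The sketch assumes the MT ID occupies the low 12 bits of its 16-bit field;
the MT_ID_MASK and is_real_mtid names are made up for the example.

```python
MT_ID_MASK = 0x0FFF  # assumption: the MT ID occupies the low 12 bits
ISIS_MT_DISABLE = 4096  # value quoted in the list above


def is_real_mtid(mtid):
    """Return True if mtid fits in the 12-bit MT ID field."""
    return 0 <= mtid <= MT_ID_MASK


# 4096 is the bit just above the 12-bit MT ID, so it can never
# collide with a real topology identifier.
assert ISIS_MT_DISABLE == 1 << 12
assert not is_real_mtid(ISIS_MT_DISABLE)
```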
Signed-off-by: Olivier Dugeon <olivier.dugeon@orange.com>
Lots of the GR topotests kill daemons in order to test code
that deals with crashing daemons. Under heavy system load
it was noticed that a kill command was sent and, if told to
wait, we would sleep 2 seconds, send another kill command and
call it good. This was causing issues when subsequent
json commands would get errors like `lost connection to daemon`
as the daemon finally shut down after some time due to load.
Modify the kill-the-daemon function to notice that the daemon
was not actually killed and, if we need to wait, wait some
more time for it to happen.
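A sketch of the waiting step, assuming a POSIX system: check whether the
pid still exists and keep waiting until it is gone. This is not the
topotest helper itself; pid_alive, wait_for_exit and the timing defaults
are assumptions made for illustration.

```python
import os
import time


def pid_alive(pid):
    """Return True if a process with this pid still exists."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def wait_for_exit(pid, timeout=10.0, interval=0.5, alive=pid_alive, sleep=time.sleep):
    """Wait until pid is gone; return False if still alive at timeout."""
    waited = 0.0
    while alive(pid):
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval
    return True
```

A caller would send the kill signal first and only move on to the next
test step once wait_for_exit() returns True.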
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Currently under system load tests that use verify_pim_interface_traffic
immediately after an interface down/up event are not giving any time
for pim to receive and process the data from that event. Give
the test some time to gather this data.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Under heavy system load, we are sometimes seeing this
output for addKernelRoute:
2021-11-28 16:17:27,604 INFO: topolog: [DUT: b1]: Running command: [ip route add 224.0.0.13 dev b1-f1-eth0]
2021-11-28 16:17:27,604 DEBUG: topolog.b1: LinuxNamespace(b1): cmd_status("['/bin/bash', '-c', 'ip route add 224.0.0.13 dev b1-f1-eth0']", kwargs: {'encoding': 'utf-8', 'stdout': -1, 'stderr': -2, 'shell': False, 'stdin': None})
2021-11-28 16:17:27,967 DEBUG: topolog.b1: LinuxNamespace(b1): cmd_status("['/bin/bash', '-c', 'ip route']", kwargs: {'encoding': 'utf-8', 'stdout': -1, 'stderr': -2, 'shell': False, 'stdin': None})
2021-11-28 16:17:28,243 DEBUG: topolog: ip route
70.0.0.0/24 dev b1-f1-eth0 proto kernel scope link src 70.0.0.1
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This tells us that the ip route add succeeded but when looking for it
the system failed to immediately find it. Why is this happening?
Probably we are under heavy system load and the two different
commands, 'ip route add..' and 'ip route show' are being executed
on different cpu's and the data has not been copied to the different
cpu yet in the kernel. This is not necessarily something normally
seen but entirely possible. Giving the system a few extra seconds
for the kernel to execute/work the memory barrier system seems
prudent for long term success of our programming.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Modify the timers used to send updates/hellos every
1 second instead of 5, allowing this test to converge
faster under heavy system load.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
During repeated runs I am seeing this test fail to run successfully.
Upon inspecting the output:
{
"prefix":"10.0.10.0/24",
"prefixLen":24,
"protocol":"isis",
"vrfId":6,
"vrfName":"r1-cust1",
"selected":true,
"destSelected":true,
"distance":115,
"metric":10,
"queued":true,
We can see that the route is still queued. Under heavy system
load, without ensuring that isis has time to send the route to
zebra and that zebra has time to install it, this test can fail.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Update verify_ospf6_neighbor() so we can verify there are no
neighbors in a given router
input_dict = {
"r0": {
"ospf6": {
"neighbors": []
}
}
}
result = verify_ospf6_neighbor(tgen, topo, dut, input_dict)
Signed-off-by: ckishimo <carles.kishimoto@gmail.com>
The interface area command is deprecated under
router ospf6 and should be on the individual interface.
Let's modify the tests to not actually put the
interface foo area 0.0.0.0 command under the
router node.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When using build_config_from_json there exists a timing
window where neighbors can come up before the router-id
is applied. As a precaution, quickly clear the neighbors
to ensure that we get neighbors with the expected router-id.
This can especially happen under high system load.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The test_ospf_dual_stack test had area configuration
under the `router ospf6` nodes. This is getting
lots of warning messages from the cli. Let's remove
this.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When testers use the build_config_from_json function
the create_router_ospf function is double creating
the ospfv3 cli to be passed in. This is because
the create_router_ospf loops over both v2 and v3
and then create_router_ospf6 re-adds v3.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The ospf_basic_functionality/test_ospf_lan.py creates
an ethernet segment and attaches 4 routers to it and
assigns ip addresses in a /24. As one of the tests
it picks a new address for r0 which coincides with
an ip address on r3. Then the test immediately
checks for other data. The problem is of course
that if a test is `slow` enough hellos will
start to be ignored from r3 to r0 and the
neighbor relationships will come down. Choose
an ip address that doesn't cause this issue.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The new bgp_route_server_client test is not setting the
timers for peers to be fast enough to have the ability
to converge in under 60 seconds if a packet is dropped/missed
at startup. Make the test have the ability to converge
under load
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add a parameter to the resolver API that is the vrf identifier. This makes
resolution specific to each vrf. When the vrf netns backend is used,
this is very practical, since resolution can work in one netns while
it does not in another one.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
There exist systems that do not explicitly have a python soft-link
on their system. Let's explicitly call out which python we want
to be using with exabgp.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The bgp gr topotests had run times that were greater than 10 minutes each.
Just brute force break up the tests to 4 different sub parts.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Description:
- Changing the expected output for selected route in the script.
- With our changes for VRF-Lite fix best path selection,
during best path selection, while comparing the paths for imported routes,
we should correctly refer to the original route i.e. the ultimate path.
In this case, when we have an ibgp route and an imported ibgp route
for the same prefix, we do compare the IGP metric, which is the same for both,
so we proceed to comparing router-ids and selecting the best path.
- Before our changes, ibgp route was preferred because of IGP metric.
With our fix, expected output for selected route is changed to
imported ibgp route because of the lower router-id.
- Corresponding changes for expected advertised route and
the large community are made.
Co-authored-by: Kantesh Mundaragi <kmundaragi@vmware.com>
Signed-off-by: Iqra Siddiqui <imujeebsiddi@vmware.com>
Somewhere along the line core-files stopped being generated
with the running of the topotests. With this change we now
see this:
sharpd@eva /t/topotests> find . -name '*.dmp' -print
./ospfv3_basic_functionality.test_ospfv3_asbr_summary_topo1/r0/ospf6d_core-sig_6-pid_430478.dmp
sharpd@eva /t/topotests> sudo gdb /usr/lib/frr/ospf6d ./ospfv3_basic_functionality.test_ospfv3_asbr_summary_topo1/r0/ospf6d_core-sig_6-pid_430478.dmp
GNU gdb (Debian 10.1-1.7) 10.1.90.20210103-git
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/lib/frr/ospf6d...
[New LWP 430478]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/frr/ospf6d --log file:ospf6d.log --log-level debug -d'.
Program terminated with signal SIGABRT, Aborted.
50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
(gdb)
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This is implicitly checked by the "verify mroute" below, but it's much
more helpful to explicitly check in advance.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Currently I get bgp_instance_del-test as well as bgp_l3vpn_to_bgp_vrf
failures every ~3-4 runs when under a 40 parallel run with micronet.
Examination of the failure and passing cases always leads to the
failures showing convergence of bgp bestpath immediately after
the show commands to ensure that the routes are there.
Modify the code to look for the fact that the vrf has
converged from routes being passed around across vrf's
and ensure that bestpath has run on them.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When debugging issues for routes in multiple vrfs, it would
be extremely useful if the debug output showed which vrf we
are acting on.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Because this test can be run in either netns vrf mode or vrflite
vrf mode, the default vrf has a different name in each. When netns mode
is chosen, vrf0 is used as the default name, while when vrflite
mode is chosen, default is used as the name. Remove the vrf keyword from
the expected dump.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>