When converging, OSPF chooses a DR / Backup DR based upon
who has already come up, irrespective of priority. As such, if
under system load OSPF comes up first and elects a DR that under
normal circumstances would not be the elected one due to priority,
OSPF does not go back through and re-elect, in order to keep the
system stable. Tests are experiencing this:
unet> r0 show ip ospf neigh
Neighbor ID  Pri  State          Up Time  Dead Time  Address   Interface            RXmtL  RqstL  DBsmL
100.1.1.1    99   Full/Backup    4m14s    3.780s     10.0.1.2  r0-s1-eth0:10.0.1.1  0      0      0
100.1.1.2    0    Full/DROther   4m14s    3.848s     10.0.1.3  r0-s1-eth0:10.0.1.1  0      0      0
100.1.1.3    0    Full/DROther   4m14s    3.912s     10.0.1.4  r0-s1-eth0:10.0.1.1  0      0      0
unet> r1 show ip ospf neigh
Neighbor ID  Pri  State          Up Time  Dead Time  Address   Interface            RXmtL  RqstL  DBsmL
100.1.1.0    98   Full/DR        4m15s    3.011s     10.0.1.1  r1-s1-eth1:10.0.1.2  0      0      0
100.1.1.2    0    Full/DROther   4m19s    3.124s     10.0.1.3  r1-s1-eth1:10.0.1.2  0      0      0
100.1.1.3    0    Full/DROther   4m19s    3.188s     10.0.1.4  r1-s1-eth1:10.0.1.2  0      0      0
unet> r2 show ip ospf neigh
Neighbor ID  Pri  State          Up Time  Dead Time  Address   Interface            RXmtL  RqstL  DBsmL
100.1.1.0    98   Full/DR        4m27s    3.483s     10.0.1.1  r2-s1-eth0:10.0.1.3  0      0      0
100.1.1.1    99   Full/Backup    4m32s    3.527s     10.0.1.2  r2-s1-eth0:10.0.1.3  0      0      0
100.1.1.3    0    2-Way/DROther  4m32s    3.660s     10.0.1.4  r2-s1-eth0:10.0.1.3  0      0      0
unet> r3 show ip ospf neigh
Neighbor ID  Pri  State          Up Time  Dead Time  Address   Interface            RXmtL  RqstL  DBsmL
100.1.1.0    98   Full/DR        4m55s    3.786s     10.0.1.1  r3-s1-eth1:10.0.1.4  0      0      0
100.1.1.1    99   Full/Backup    4m55s    3.829s     10.0.1.2  r3-s1-eth1:10.0.1.4  0      0      0
100.1.1.2    0    2-Way/DROther  4m54s    3.897s     10.0.1.3  r3-s1-eth1:10.0.1.4  0      0      0
Modify the test to do a clear to enforce the order we are specifically looking for.
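For reference, the sticky election can be modelled as below. This is a
simplified sketch of the DR / Backup DR steps in RFC 2328 section 9.4, not
ospfd code. The `Router` class, the `elect()` helper and the `claims_dr` /
`claims_bdr` fields are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Router:
    router_id: str
    priority: int
    claims_dr: bool = False   # router already lists itself as DR in its Hellos
    claims_bdr: bool = False  # router already lists itself as Backup DR


def _best(candidates):
    """Highest priority wins; ties are broken by the highest Router ID."""
    return max(
        candidates,
        key=lambda r: (r.priority, tuple(int(p) for p in r.router_id.split("."))),
        default=None,
    )


def elect(routers):
    """Return (dr, bdr) for one simplified election pass."""
    eligible = [r for r in routers if r.priority > 0]
    # Backup DR: routers claiming DR are excluded; BDR claimants are preferred.
    not_dr = [r for r in eligible if not r.claims_dr]
    bdr = _best([r for r in not_dr if r.claims_bdr]) or _best(not_dr)
    # DR: only routers already claiming DR compete; otherwise promote the BDR.
    dr = _best([r for r in eligible if r.claims_dr]) or bdr
    return dr, bdr


# r0 (priority 98) came up first and already claims DR, as in the output above.
routers = [
    Router("100.1.1.0", 98, claims_dr=True),
    Router("100.1.1.1", 99),
    Router("100.1.1.2", 0),
    Router("100.1.1.3", 0),
]
dr, bdr = elect(routers)
print(dr.router_id, bdr.router_id)
```

In this model the priority 99 router only becomes DR when nobody claims the
role yet, which is the state a clear is expected to bring the test back to.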
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
frr-reload triggers a restart of the service when it fails
to parse the new config file and the running config
contains 'router bgp' (default bgp instance).
When frr-reload fails to parse the new config file, it fails
to build the newconfig context (empty object).
Instead of bailing out, it compares against the running config
context. If the running config contains the default bgp instance,
it thinks the new config is removing the default bgp instance, so it
triggers an frr restart.
The fix is to bail out of the reload script when it fails to parse
the config file.
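A minimal sketch of the bail-out idea, not the actual frr-reload.py code.
`ConfigParseError`, `load_new_contexts()` and the exit status 1 are
assumptions made for illustration only.

```python
import logging
import sys

log = logging.getLogger("reload-sketch")


class ConfigParseError(Exception):
    """Raised by the (hypothetical) loader when the new config is invalid."""


def load_new_contexts(load_from_file, filename):
    """Parse the new config; abort instead of continuing with an empty object."""
    try:
        return load_from_file(filename)
    except ConfigParseError as err:
        log.error("vtysh failed to process new configuration: %s", err)
        # Stop here so the empty result is never diffed against the running
        # config, which would look like a removal of the default bgp instance.
        sys.exit(1)
```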
Ticket:#2861989
Reviewed By: MR-83
Testing Done:
router bgp 102 vrf RED
bgp router-id 2.2.2.2
neighbor underlay peer-group
neighbor underlay remote-as <---- Partial config
Before fix:
2021-12-02 02:43:16,987 ERROR: vtysh failed to process new
configuration: vtysh (mark file) exited with status 4:
b'line 79: % Command incomplete: neighbor underlay remote-as\n\n'
2021-12-02 02:43:17,145 INFO: Loading Config object from vtysh show
running
2021-12-02 02:43:17,362 INFO: "frr version 7.5+cl5.0.0u0" cannot be
removed
2021-12-02 02:43:17,362 INFO: "frr defaults datacenter" cannot be
removed
2021-12-02 02:43:17,362 INFO: "service integrated-vtysh-config" cannot
be removed
2021-12-02 02:43:17,363 INFO: "line vty" cannot be removed
2021-12-02 02:43:17,522 INFO: EVPN is enabled and default instance del
needed
2021-12-02 02:43:17,522 INFO: Restarting FRR <---- Restart frr
After fix:
Just throw Error and abort the script.
root@TORS1:mgmt:/home/cumulus# /usr/lib/frr/frr-reload.py --debug
--reload --stdout /etc/frr/frr.conf
2021-12-02 04:00:56,519 INFO: Called via "Namespace(bindir='/usr/bin',
confdir='/etc/frr', daemon='', debug=True, filename='/etc/frr/$
rr.conf', input=None, overwrite=False, pathspace=None, reload=True,
rundir='/var/run/frr', stdout=True, test=False, vty_socket=None)"
2021-12-02 04:00:56,520 INFO: Loading Config object from file
/etc/frr/frr.conf
2021-12-02 04:00:56,679 ERROR: vtysh failed to process new
configuration: vtysh (mark file) exited with status 4:
b'line 79: % Command incomplete: neighbor underlay remote-as\n\n'
root@TORS1:mgmt:/home/cumulus#
Signed-off-by: Chirag Shah <chirag@nvidia.com>
This utility script helps in generating a formatted and consistent
change log, including:
1- group logs per daemon
2- standardize daemon names (lowercase, end with d)
3- capitalize all log lines
4- no merge commits
caveat: comments are assumed to be in the form
daemon-name : message
Sample Output:
```
sharpd
Follow the practice on cli design for json output
Install route supports nexthop-seg6 (step3)
Install_routes_helper support zapi_route flags (step1)
snapcraft
Add missing dependency
Add pathd to frr snap daemons
Change base to ubuntu 18.04 and libyang 2.0.7
staticd
Convert typedef to enum
Fix distance processing
Fix late initialization of blackhole type
Output config using nb callbacks instead of operational data
```
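The rules above can be sketched roughly as follows. This is an illustrative
Python model, not the script itself. `build_changelog()`, `normalize()` and
the `NON_DAEMONS` set are hypothetical; the set only exists so that a name
like `snapcraft` from the sample output is left without a trailing `d`.

```python
from collections import defaultdict

# Assumption: prefixes that are not daemons keep their name unchanged.
NON_DAEMONS = {"snapcraft"}


def normalize(name):
    """Lowercase the prefix and make daemon names end with 'd'."""
    name = name.strip().lower()
    if name in NON_DAEMONS or name.endswith("d"):
        return name
    return name + "d"


def build_changelog(subjects):
    """Group commit subjects of the form 'daemon-name : message' per daemon."""
    groups = defaultdict(list)
    for subject in subjects:
        if subject.startswith("Merge "):
            continue  # no merge commits
        daemon, sep, message = subject.partition(":")
        message = message.strip()
        if not sep or not message:
            continue  # does not follow the assumed 'daemon-name : message' form
        groups[normalize(daemon)].append(message[0].upper() + message[1:])
    return {daemon: sorted(lines) for daemon, lines in sorted(groups.items())}


log = build_changelog([
    "BGP: fix x",
    "bgpd: add y",
    "Merge pull request #1 from a/b",
    "snapcraft: add missing dependency",
])
for daemon, lines in log.items():
    print(daemon)
    for line in lines:
        print("   " + line)
```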
Signed-off-by: Jafar Al-Gharaibeh <jafar@atcorp.com>
When the detection time expires, we put the session down and restart the
timer. As the comment in the code says, this is needed to zero the remote
discriminator after the second expiration.
But the RFC clearly says that this must be done on the first expiration:
bfd.RemoteDiscr
The remote discriminator for this BFD session. This is the
discriminator chosen by the remote system, and is totally opaque
to the local system. This MUST be initialized to zero. If a
period of a Detection Time passes without the receipt of a valid,
authenticated BFD packet from the remote system, this variable
MUST be set to zero.
And we actually already do it in `ptm_bfd_sess_dn`, so there's no need
to reset the timer and wait for it twice.
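A toy model of the intended behaviour, assuming a hypothetical `BfdSession`
class; the real logic lives in bfdd's C code around `ptm_bfd_sess_dn`:

```python
from dataclasses import dataclass


@dataclass
class BfdSession:
    state: str = "up"
    remote_discr: int = 0
    detect_timer_armed: bool = True

    def on_detect_timeout(self):
        """First (and only) expiration: go down and forget the peer."""
        self.state = "down"
        self.remote_discr = 0            # RFC 5880: MUST be set to zero
        self.detect_timer_armed = False  # no second wait is scheduled


session = BfdSession(remote_discr=0x1234)
session.on_detect_timeout()
```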
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Remove the neighbor <> remote-as <> config line
if the neighbor is part of a peer-group and the
peer-group contains remote-as config.
Neighbors which are part of the peer-group
cannot override remote-as.
Fix:
frr-reload needs to remove 'neighbor <> remote-as <>'
from lines_to_add if the neighbor is already part of a peer-group
and the peer-group has remote-as config.
Testing Done:
Before:
Config snippet:
neighbor PEERS peer-group
neighbor PEERS remote-as external
neighbor PEERS timers 3 9
neighbor 10.2.1.1 remote-as external
neighbor 10.2.1.1 peer-group PEERS
neighbor 10.2.1.1 timers 3 9
neighbor 10.2.1.2 remote-as external
neighbor 10.2.1.2 peer-group PEERS
Frr-reload failure:
line 179: Failure to communicate[13] to bgpd, line: neighbor 10.2.1.1
remote-as external
% Peer-group member cannot override remote-as of peer-group
line 179: Failure to communicate[13] to bgpd, line: neighbor 10.2.1.2
remote-as external
% Peer-group member cannot override remote-as of peer-group
After:
frr-reload applies the config successfully.
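The filtering can be sketched like this. It is a simplified illustration
working on plain strings, not the frr-reload.py implementation, and
`drop_member_remote_as()` is a hypothetical helper name.

```python
def drop_member_remote_as(lines_to_add):
    """Filter out member 'remote-as' lines that the peer-group already sets."""
    parsed = [line.split() for line in lines_to_add]
    declared_groups = set()
    member_of = {}

    for tok in parsed:
        if len(tok) < 3 or tok[0] != "neighbor" or tok[2] != "peer-group":
            continue
        if len(tok) == 3:
            declared_groups.add(tok[1])   # neighbor PEERS peer-group
        elif len(tok) == 4:
            member_of[tok[1]] = tok[3]    # neighbor 10.2.1.1 peer-group PEERS

    def is_remote_as(tok):
        return len(tok) >= 4 and tok[0] == "neighbor" and tok[2] == "remote-as"

    groups_with_remote_as = {
        tok[1] for tok in parsed if is_remote_as(tok) and tok[1] in declared_groups
    }

    kept = []
    for line, tok in zip(lines_to_add, parsed):
        if is_remote_as(tok) and member_of.get(tok[1]) in groups_with_remote_as:
            continue  # the peer-group already carries remote-as
        kept.append(line)
    return kept


config = [
    "neighbor PEERS peer-group",
    "neighbor PEERS remote-as external",
    "neighbor PEERS timers 3 9",
    "neighbor 10.2.1.1 remote-as external",
    "neighbor 10.2.1.1 peer-group PEERS",
    "neighbor 10.2.1.1 timers 3 9",
    "neighbor 10.2.1.2 remote-as external",
    "neighbor 10.2.1.2 peer-group PEERS",
]
filtered = drop_member_remote_as(config)
```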
Signed-off-by: Chirag Shah <chirag@nvidia.com>
When both ripd and eigrpd run at the same time, all key configuration in
key chain node is duplicated. This change adds a concept of nested nodes
into vtysh to fix the issue.
Before:
```
key chain test
 key 1
  key-string 1
 exit
 key 1
  key-string 1
 exit
exit
!
```
After:
```
key chain test
 key 1
  key-string 1
 exit
exit
!
```
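The idea of nested nodes can be illustrated with a small Python model (vtysh
itself is C, and `merge()`, `render()` and `is_node_entry()` are hypothetical
names). Each daemon's lines are merged into one tree, so a `key` node reported
by both ripd and eigrpd is stored once:

```python
def is_node_entry(cmd):
    """Commands that open a (possibly nested) node in this toy model."""
    return cmd.startswith("key chain ") or cmd.startswith("key ")


def merge(tree, lines):
    """Merge one daemon's config lines into a shared tree of nested dicts."""
    stack = [tree]
    for line in lines:
        cmd = line.strip()
        if cmd == "exit":
            if len(stack) > 1:
                stack.pop()
        elif is_node_entry(cmd):
            # setdefault() reuses the node if another daemon already added it.
            stack.append(stack[-1].setdefault(cmd, {}))
        elif cmd and cmd != "!":
            stack[-1].setdefault(cmd, {})
    return tree


def render(tree, depth=0):
    """Turn the merged tree back into indented config lines."""
    out = []
    for cmd, children in tree.items():
        out.append(" " * depth + cmd)
        if is_node_entry(cmd):
            out.extend(render(children, depth + 1))
            out.append(" " * depth + "exit")
    return out


daemon_cfg = ["key chain test", "key 1", "key-string 1", "exit", "exit", "!"]
tree = {}
merge(tree, daemon_cfg)  # as reported by ripd
merge(tree, daemon_cfg)  # as reported by eigrpd
merged = render(tree)
```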
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Per RFC 5880 section 6.8.12, the use of a Poll Sequence is not necessary
when the Detect Multiplier is changed. Currently, we update the Detection
Timeout only when a Poll Sequence is terminated, therefore we ignore the
Detect Multiplier change if it's not accompanied by an RX/TX timer change.
To fix the problem, we should update the Detection Timeout on every
received packet.
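As a sketch of the calculation, assuming Asynchronous mode as described in
RFC 5880 section 6.8.4 (the `Session` class and its method names are
hypothetical, not bfdd code):

```python
def detection_time_us(remote_detect_mult, local_required_min_rx_us,
                      remote_desired_min_tx_us):
    """Detect Mult from the peer times the agreed remote transmit interval."""
    return remote_detect_mult * max(local_required_min_rx_us,
                                    remote_desired_min_tx_us)


class Session:
    def __init__(self, required_min_rx_us):
        self.required_min_rx_us = required_min_rx_us
        self.detect_timeout_us = 0

    def on_packet(self, detect_mult, desired_min_tx_us):
        # Recalculated on every received packet, so a Detect Mult change
        # takes effect even when no Poll Sequence is in progress.
        self.detect_timeout_us = detection_time_us(
            detect_mult, self.required_min_rx_us, desired_min_tx_us)


session = Session(required_min_rx_us=300_000)
session.on_packet(detect_mult=3, desired_min_tx_us=300_000)
first = session.detect_timeout_us
session.on_packet(detect_mult=5, desired_min_tx_us=300_000)  # multiplier only
second = session.detect_timeout_us
```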
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
The YANG leaf means "enable" while the CLI command is "disable".
So we should use "no" when the leaf is "true", not "false".
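A tiny illustration of the inverted mapping. The command string
`feature disable` is a placeholder and not a real FRR command, and
`show_disable_command()` is a hypothetical helper:

```python
def show_disable_command(enable, command="feature disable"):
    """Render a 'disable'-style CLI line backed by an 'enable' YANG leaf."""
    # enable == True means "not disabled", hence the "no" form.
    return f"no {command}" if enable else command
```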
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Rename the members of pim_interface that are used for both
IPv4 and IPv6 to common names (for both MLD and IGMP).
Issue: #10023
Co-authored-by: Mobashshera Rasool <mrasool@vmware.com>
Signed-off-by: Sai Gomathi <nsaigomathi@vmware.com>
Add optional NHG ID output to `show ip route` dumps. We have
this in json output already as nexthopGroupID, but it is nice
to have the option in a normal dump as well. Not including it in the
main output for now to avoid breaking screen scrapers.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Don't hide the LABELED_UNICAST safi when processing route
updates; map it where necessary (to use the UNICAST table
for instance).
Signed-off-by: Mark Stapp <mstapp@nvidia.com>
Lots of the GR topotests kill daemons in order to test code
that deals with crashing daemons. Under heavy system load
it was noticed that a kill command was sent and, if told to
wait, we would sleep 2 seconds, send another kill command, and
call it good. This was causing issues when subsequent
json commands would get errors like `lost connection to daemon`
as the daemon finally shut down after some time due to load.
Modify the kill-the-daemon function to notice that the daemon
was not actually killed and, if we need to wait, wait some
more time for it to happen.
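The waiting logic can be sketched as below. This is not the topotest helper
itself; `kill_and_wait()` and its parameters (including the 30 second default
timeout) are assumptions for illustration, with the kill and liveness checks
passed in as callables.

```python
import time


def kill_and_wait(send_kill, is_running, timeout=30.0, interval=0.5,
                  sleep=time.sleep):
    """Send the kill, then poll until the daemon is gone or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    send_kill()
    while is_running():
        if time.monotonic() >= deadline:
            return False  # still alive: the caller should fail loudly
        sleep(interval)
    return True
```

Returning a boolean lets the caller decide whether a daemon that is still
alive should fail the test or trigger another kill.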
Signed-off-by: Donald Sharp <sharpd@nvidia.com>