Commit Graph

30207 Commits

Author SHA1 Message Date
Donald Sharp
fc15f734aa bgpd: rpki should use a stack pointer instead of a pointer
The prefix was being allocated and freed.  No point in this
let's just use a stack pointer.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-11-08 08:11:52 -05:00
Donald Sharp
7651f27751 bgpd: Make rpki soft_reconfig calling events
An end operator is showing cases with multiple bgp feeds
and a rpki table that calling the revalidation functions
is extremely expensive and they are seeing lots of thread
WARNS about timers being late and eventually the whole
thing gets unresponsive.  Let's break up soft reconfiguration
in to a series of events per peer so that all the work
for this is not done at the same exact time.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-11-08 08:11:52 -05:00
Donald Sharp
802ca11f10 bgpd: Use bgp pointer instead of peer pointer
When looking up a table, use the bgp pointer that we
have.  Code cleanliness and all that.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-11-08 08:11:52 -05:00
Donald Sharp
89c73443e8 bgpd: Make calling bgp_soft_reconfig_in consistent
Not all places were checking to see if soft reconfiguration
was turned on before calling into it to do all that work.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-11-08 08:11:52 -05:00
Donald Sharp
8fb15d02fe bgpd: In rpki use FOREACH_AFI_SAFI to loop over afi/safi
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-11-08 08:11:52 -05:00
Sai Gomathi N
358a7549dc tools: Add pim6d support bundle commands
PIMv6 Support Bundle commands are added in support_bundle_commands.conf file.
This will help in debugging PIMv6 test Failures.

Signed-off-by: Sai Gomathi <nsaigomathi@vmware.com>
2022-11-08 01:41:48 -08:00
Donatas Abraitis
c4831d2864 packaging: Reuse frr.logrotate for Debian and Redhat builds
It will be easier to maintain a single file instead of two separate.

Also, fixes the issue when the file (/var/log/frr/frr.log) is not created
after logrotate.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-11-08 11:17:56 +02:00
Donatas Abraitis
93bae5f81f ospf6d: Show if the interface is passive for show ipv6 ospf6 interface
donatas-pc# sh ipv6 ospf6 interface enp3s0
enp3s0 is up, type BROADCAST
  Interface ID: 2
  Internet Address:
    inet : 192.168.10.17/24
    inet6: fe80::ca5d:fd0d:cd8:1bb7/64
  Instance ID 0, Interface MTU 1500 (autodetect: 1500)
  MTU mismatch detection: enabled
  Area ID 0.0.0.0, Cost 1000
  State Waiting, Transmit Delay 1 sec, Priority 1
  Timer intervals configured:
   Hello 10(8.149), Dead 40, Retransmit 5
  DR: 0.0.0.0 BDR: 0.0.0.0
  Number of I/F scoped LSAs is 1
    0 Pending LSAs for LSUpdate in Time 00:00:00 [thread off]
    0 Pending LSAs for LSAck in Time 00:00:00 [thread off]
  Authentication Trailer is disabled
donatas-pc# con
donatas-pc(config)# int enp3s0
donatas-pc(config-if)# ipv6 ospf6 passive
donatas-pc(config-if)# do sh ipv6 ospf6 interface enp3s0
enp3s0 is up, type BROADCAST
  Interface ID: 2
  Internet Address:
    inet : 192.168.10.17/24
    inet6: fe80::ca5d:fd0d:cd8:1bb7/64
  Instance ID 0, Interface MTU 1500 (autodetect: 1500)
  MTU mismatch detection: enabled
  Area ID 0.0.0.0, Cost 1000
  State Waiting, Transmit Delay 1 sec, Priority 1
  Timer intervals configured:
   No Hellos (Passive interface)
  DR: 0.0.0.0 BDR: 0.0.0.0
  Number of I/F scoped LSAs is 1
    0 Pending LSAs for LSUpdate in Time 00:00:00 [thread off]
    0 Pending LSAs for LSAck in Time 00:00:00 [thread off]
  Authentication Trailer is disabled
donatas-pc(config-if)#

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-11-08 09:37:19 +02:00
Jafar Al-Gharaibeh
473f9912cf
Merge pull request #12276 from opensourcerouting/fix/ospf_wrong_arg
ospfd: Get route-map name for default-information originate
2022-11-07 22:08:18 -06:00
Donatas Abraitis
bd162aae09 ospfd: Get route-map name for default-information originate
LR1.wue3(config)# route-map foo-bar-baz10 permit 10
LR1.wue3(config-route-map)# exit
LR1.wue3(config)# router ospf
LR1.wue3(config-router)#  ospf router-id 172.18.254.201
LR1.wue3(config-router)#  log-adjacency-changes
LR1.wue3(config-router)# default-information originate metric 50 metric-type 1 route-map foo-bar-baz10
LR1.wue3(config-router)# end

Results in:

LR1.wue3# show run
...
!
router ospf
 ospf router-id 172.18.254.201
 log-adjacency-changes
 default-information originate metric 50 metric-type 1 route-map oute-map
exit
!
route-map foo-bar-baz10 permit 10
exit
!
end

Let's fix this.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-11-07 22:23:07 +02:00
Donatas Abraitis
286197f728 docker: Compile Alpine image using PCRE2
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-11-07 21:23:53 +02:00
Donatas Abraitis
d567ea001b docker: Use pcre2 for Alpine builds
libyang already uses pcre2 too.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-11-07 21:23:53 +02:00
Donatas Abraitis
7078f9a587 docker: Use Alpine 3.16 image
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-11-07 21:23:53 +02:00
Donatas Abraitis
061f5d1cb4 lib: Add PCRE2 support
Some results:

```
====
PCRE
====
% ./a.out "^65001" "65001"
comparing: ^65001 / 65001

ret status: 0
[14:31] donatas-pc donatas /home/donatas
% ./a.out "^65001_" "65001"
comparing: ^65001_ / 65001

ret status: 0

=====
PCRE2
=====
% ./a.out "^65001" "65001"
comparing: ^65001 / 65001

ret status: 0
[14:30] donatas-pc donatas /home/donatas
% ./a.out "^65001_" "65001"
comparing: ^65001_ / 65001

ret status: 1
```

Seems that if using PCRE2, we need to escape outer `()` chars and `|`. Sounds
like a bug.
But this is only with some older PCRE2 versions. With >= 10.36, I wasn't able
to reproduce this, everything is fine and working as expected.

Adding _FRR_PCRE2_POSIX definition because pcre2posix.h does not have
include's guard.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-11-07 21:23:53 +02:00
Donatas Abraitis
54757dc179 docker: Reuse all possible cores when building FRR for Alpine
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-11-07 21:23:53 +02:00
Donald Sharp
0096b066f9
Merge pull request #12268 from opensourcerouting/fix/zebra_tc_include_netinet_for_ethhdr
zebra: Reuse netinet/if_ether.h to avoid redefinition of struct ethhdr
2022-11-07 13:33:37 -05:00
Kuldeep Kashyap
3748e8d030 tests: Add pim6d marker to pytest.ini
Added pim6d marker to pytest.ini file,
to run tests pim6d marker based, if added
to scripts.

Signed-off-by: Kuldeep Kashyap <kashyapk@vmware.com>
2022-11-07 02:19:23 -08:00
Kuldeep Kashyap
787e3da1d7 tests: [PIMv6] Add new scenarios to static_rp suite
Automated new scenarios to multicast pimv6
static rp test suite. Added new folder
multicast_pim6_static_rp_topo1 for pimv6
static_rp automation.

Signed-off-by: Kuldeep Kashyap <kashyapk@vmware.com>
2022-11-07 02:19:23 -08:00
Kuldeep Kashyap
d7032129b0 tests: [PIMv6] F/W support for multicast pimv6 automation
Enhanced or added new libraries to support
multicast pimv6 automation

Signed-off-by: Kuldeep Kashyap <kashyapk@vmware.com>
2022-11-07 02:19:15 -08:00
Donatas Abraitis
47f3d0905b
Merge pull request #12238 from donaldsharp/append
lib, zebra: Allow for zebra to recognize that a route has gotten desy…
2022-11-07 10:37:05 +02:00
mobash-rasool
ac8aa2f7ca
Merge pull request #12263 from anlancs/fix/pimd-log-bug
pimd: avoid one EC log
2022-11-07 12:40:19 +05:30
Donatas Abraitis
29bde9e1f5
Merge pull request #12188 from donaldsharp/resilience
Resilience
2022-11-06 22:57:04 +02:00
Donatas Abraitis
83f496bdf0 zebra: Reuse netinet/if_ether.h to avoid redefinition of struct ethhdr
In file included from /usr/include/net/ethernet.h:10,
                 from ./lib/prefix.h:26,
                 from zebra/tc_netlink.c:32:
/usr/include/netinet/if_ether.h:115:8: error: redefinition of 'struct ethhdr'
  115 | struct ethhdr {
      |        ^~~~~~
In file included from zebra/tc_netlink.c:28:
/usr/include/linux/if_ether.h:169:8: note: originally defined here
  169 | struct ethhdr {
      |        ^~~~~~

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-11-06 22:50:47 +02:00
anlan_cs
5aed4d1376 pimd: avoid one EC log
Saw this EC log:

```
PIM: [WX4HZ-FA72S][EC 100663307] pim_rp_find_match_group: BUG We should have found default group information
```

The root cause is group address of "0.0.0.0" is wrongly introduced into
`pim_rp_find_match_group()`. So add a check to avoid it.

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2022-11-05 21:54:30 +08:00
Donald Sharp
a5e5c9a301 tests: Test Resilient NHG's are properly created in zebra
When a Resilient NHG is created, ensure that Zebra notes
that it is created and has it as well.

Signed-off-by: Donald Sharp <sharp@nvidia.com>
2022-11-04 13:34:27 -04:00
Donald Sharp
8966cca209 tests: Speedup test_all_protocol_startup.py by 55 seconds
Just make ospf and ospfv3 converge faster with faster
hello timers.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-11-04 13:34:27 -04:00
Donald Sharp
1e8a2920cb doc: Add nexthop_groups documentation
Remove the nexthop groups documentation from pbr.rst and
make it `generic`.  Add the resilient buckets nexthop
group type.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-11-04 13:34:27 -04:00
Donald Sharp
ca2b346783 *: Add ability to encode / decode resilence down zapi
At this point add abilty for the encode/decode of the
resilience down ZAPI to zebra.  Just hookup sharpd
at this point in time.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-11-04 13:34:27 -04:00
Donald Sharp
f3c6dd49f4 *: Add ability for daemons to notice resilience changes
This patch just introduces the callback mechanism for the
resilient nexthop changes so that upper level daemons
can take advantage of the change.  This does nothing
at this point but just call some code.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-11-04 13:34:27 -04:00
Donald Sharp
f0f618dcdb lib, vtysh: Add ability to specify resilient nhgs
Add the ability to specify a resilient nexthop group

nexthop-group A
 resilient buckets 32 idle_timer 100 unbalanced_timer 500
 nexthop 192.168.100.1 enp7s0
 nexthop 192.168.100.33 enp7s0
 nexthop 192.168.122.1 enp1s0

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-11-04 13:34:27 -04:00
Donald Sharp
569e141113 lib, zebra: Add ability to encode/decode resilient nhg's
Add ability to read the nexthop group resilient linux
kernel data as well as write it.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-11-04 13:29:36 -04:00
Donald Sharp
e483855d24 lib: When adding to front of list ensure we handle tail to
When inserting to the front of a list with listnode_add_head
if the list is empty, the tail will not be properly set and
subsuquent calls to insert/remove will cause the function
to crash.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-11-04 13:29:36 -04:00
David Lamparter
ab2f9e89b4 pimd: consistently ignore prefix list mask len
... the prefix length wasn't ignored as expected.  Both S and G are
always /32.  But expecting "le 32" in prefix-list config is unexpected &
counterintuitive.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2022-11-04 17:17:39 +01:00
Donald Sharp
a048d52399 lib, zebra: Allow for zebra to recognize that a route has gotten desynced
FRR does not use the NLM_F_APPEND semantics ( in fact I would argue that
the NLM_F_APPEND semantics just introduce pain for all parties involved )
I would also argue that most people who use the kernel netlink api
have recognized that NLM_F_APPEND for a route is a recipe for disaster
that is well documented and as such it is not used as anything other
than a curiousity by operators.

See:
https://bugzilla.redhat.com/show_bug.cgi?id=1337855
https://github.com/thom311/libnl/issues/226

Are 2 great examples of how confusing it is for anyone in user
space to know what the correct thing to do is.  Given that
new fields can be added with no semantics to allow us to know
what has resulted in a change or not.

In an attempt to recognize this, let's note that FRR
believes it has gotten out of sync with the kernel.
Future commits will react to the desynchronized route
and request from the kernel a reload of that specific
route if possible.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-11-04 12:02:00 -04:00
Donald Sharp
3e85fb3373
Merge pull request #12244 from anlancs/fix/bgpd-evpn-leak-l3rt
bgpd: avoid possible memleak
2022-11-04 11:59:32 -04:00
Donald Sharp
295a6489c8
Merge pull request #12252 from opensourcerouting/fix/frr-reload.py_reuse_non_default_dirs
tools: Honor sysdir, confdir, bindir for frr-reload.py from "frr" wrapper
2022-11-04 11:58:37 -04:00
Donatas Abraitis
f41255a0ef bgpd: Show the counters for RTT when shutdown on RTT feature is enabled
"shutdownRttInMsecs":5,
    "shutdownRttAfterCount":5,

Estimated round trip time: 116 ms
Shutdown when RTT > 5ms, count > 17

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-11-04 16:07:07 +02:00
Donatas Abraitis
5597214ccb bgpd: Show the reason when the session is killed due to RTT
Simulated latency with:

```
tc qdisc add dev eth3 root netem delay 100ms
```

```
donatas-laptop# sh ip bgp summary failed

IPv4 Unicast Summary (VRF default):
BGP router identifier 192.0.2.252, local AS number 65000 vrf-id 0
BGP table version 28
RIB entries 0, using 0 bytes of memory
Peers 1, using 724 KiB of memory

Neighbor        EstdCnt DropCnt ResetTime Reason
192.168.10.65         2       2  00:00:17 Admin. shutdown (RTT)

Displayed neighbors 1
Total number of neighbors 1
donatas-laptop#
```

Another end received:

```
%NOTIFICATION: received from neighbor 192.168.10.17 6/2 (Cease/Administrative Shutdown) "shutdown due to high round-trip-time (104ms > 5ms, hit 21 times)"
```

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-11-04 15:56:23 +02:00
Donatas Abraitis
9f4fa17629 bgpd: Always show estimated RTT to the peer
It's very annoying when flapping between 0 (missing the output) and non-zero.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-11-04 14:46:14 +02:00
Donatas Abraitis
94fdbad234
Merge pull request #12251 from donaldsharp/various_and_sundry
Various and sundry
2022-11-04 14:03:50 +02:00
Donald Sharp
efda3db030
Merge pull request #12256 from opensourcerouting/fix/llgr_max_values
bgpd: Cap LLGR stale-time to 16777215
2022-11-04 08:00:06 -04:00
Donald Sharp
63e357a82c
Merge pull request #12257 from opensourcerouting/fix/bgp_orf_reserved
bgpd: Check and print if we receive ORF reserved type
2022-11-04 07:59:14 -04:00
David Lamparter
c34a7afc74 bgpd: fix "storing the address of local variable"
New GCC 12 warning.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2022-11-04 12:34:00 +01:00
Donatas Abraitis
fe54d0f72d
Merge pull request #12255 from donaldsharp/established_lost
bgpd: Limit snmp trap for backwards state movement from established
2022-11-04 13:27:47 +02:00
Donatas Abraitis
5970204c69 bgpd: Cap LLGR stale-time to 16777215
This value is 3 bytes (24-bits), let's do not overuse this.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-11-04 08:21:18 +02:00
Donald Sharp
adf552ab6b bgpd: Limit snmp trap for backwards state movement from established
Currently the bgp mib specifies two traps:

a) Into established state
b) transition backwards from a state

b) really is an interesting case.  It means transitioning
from say established to starting over.  It can also
mean when bgp is trying to connect and that fails and
the state transitions backwards.

Now let's imagine 500 peers with tight timers (say a data center)
and there is network trauma you have just created an inordinately
large number of traps for each peer.

Let's limit FRR to changing from the old status as Established
to something else.  This will greatly limit the trap but it
will also be something end operators are actually interested in.

I actually had several operators say they had to write special code
to ignore all the backward state transitions that they didn't care
about.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-11-03 20:39:39 -04:00
Donatas Abraitis
efe9529821 tools: Honor sysdir, confdir, bindir for frr-reload.py from "frr" wrapper
Without this, those variables are not passed to frr-reload.py and uses
default values.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-11-03 20:15:10 +02:00
Donald Sharp
f531fae829 vtysh: Allow service ... lines to not repeat
When any `service ...` line is entered and there are multiple
daemons running prevent this from being displayed multiple times.

Fixes: #5475
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-11-03 13:01:35 -04:00
Donald Sharp
c4f16627d3 bgpd: rfapi doc strings are messed up for one command
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-11-03 13:01:16 -04:00
Donald Sharp
ac3ee3270b lib: Fix double set of event pointer to NULL
The event system when executing a thread already
sets the pointer of it to NULL.  No need to
do it again.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-11-03 12:50:24 -04:00