Commit Graph

32667 Commits

Author SHA1 Message Date
Donald Sharp
963b7ee448 bgpd: Limit peer output queue length like input queue length
Consider this scenario:

Lots of peers with a bunch of route information that is changing
fast.  One of the peers happens to be really slow for whatever
reason.  The way the output queue is filled is that bgpd puts
64 packets at a time and then reschedules itself to send more
in the future.  Now suppose that peer has hit it's input Queue
limit and is slow.  As such bgp will continue to add data to
the output Queue, irrelevant if the other side is receiving
this data.

Let's limit the Output Queue to the same limit as the Input
Queue.  This should prevent bgp eating up large amounts of
memory as stream data when under severe network trauma.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-01-19 11:48:01 -05:00
Christian Hopps
c2c17459ea
Merge pull request #12655 from donaldsharp/pim_basic_cleanups
tests: pim_basic fails in micronet
2023-01-19 11:41:44 -05:00
Mark Stapp
11f4fa6fe0
Merge pull request #12665 from opensourcerouting/cs-misc
lib,zebra: fix null dereference and remove dead code
2023-01-19 11:28:58 -05:00
Rafael Zalamena
d8145114e0 pimd: fix mtracebis tool warning
Use `getpid()` to initialize the sequence number. This change silences
Coverity Scan warning about truncated use of `time()` which in this case
is not a problem.

Found by Coverity Scan (CID 1519828)

Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
2023-01-19 12:09:29 -03:00
Rafael Zalamena
ff9232c83b lib: remove dead logic code
If we got inside the condition of `vrfp->status == VRF_ACTIVE` then
don't make the same check again.

Found by Coverity Scan (CID 1519760)

Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
2023-01-19 10:42:01 -03:00
Donatas Abraitis
cd16c74b77
Merge pull request #12492 from kuldeepkash/update_assert_msg
tests: [topojson] Update assert/error messages
2023-01-19 15:33:19 +02:00
Rafael Zalamena
ab80e474f2 zebra: fix possible null dereference
Don't attempt to dereference `ifp` directly if it might be null: there
is a check right before this usage: `ifp ? ifp->info : NULL`.

In this context it should be safe to assume `ifp` is not NULL because
the only caller of this function checks that for this `ifindex`. For
consistency we'll check for null anyway in case this ever changes (and
with this the coverity scan warning gets silenced).

Found by Coverity Scan (CID 1519776)

Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
2023-01-19 10:32:18 -03:00
Kuldeep Kashyap
2583746108 tests: fix for test_pim6_multiple_groups_different_RP_address_p2 failure
Testcase: test_pim6_multiple_groups_different_RP_address_p2
was failing because of a bug in framework, Fixed the
bug in this commit.

Signed-off-by: Kuldeep Kashyap <kashyapk@vmware.com>
2023-01-19 05:26:37 -08:00
Kuldeep Kashyap
2a8ad2ea97 tests: Fix for multicast_pim6_static_rp tests failure
Multicast pim6 static RP tests are failing
when run in parallel using micronet. There
are APIs to clean mcast traffic before
starting new test but these cleanups
are not needed when socat is used.

Signed-off-by: Kuldeep Kashyap <kashyapk@vmware.com>
2023-01-19 05:26:37 -08:00
Donatas Abraitis
69306c44e8
Merge pull request #12657 from anlancs/fix/lib-debug-empty-ip
lib: show "(null)" for empty IP address
2023-01-19 09:12:04 +02:00
anlan_cs
927c633dd9 lib: show "(null)" for empty IP address
Use "(null)" for empty IP address.

One example in `bgp_zebra_send_remote_macip()` to install mac:

Before:
```
2023/01/18 02:09:09 BGP: [SCHS5-AK960] Tx ADD MACIP, VNI 200 MAC 06:6b:7c:db:83:72 IP  flags 0x0 seq 0 remote VTEP 88.88.88.88 esi -
```

After:
```
2023/01/18 20:19:57 BGP: [SCHS5-AK960] Tx ADD MACIP, VNI 200 MAC 06:6b:7c:db:83:72 IP (null) flags 0x0 seq 0 remote VTEP 88.88.88.88 esi -
```

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2023-01-19 09:30:49 +08:00
Donald Sharp
133b0b5a7e
Merge pull request #12659 from opensourcerouting/fpm-cs
zebra: fix fpm netlink encode out of bounds read
2023-01-18 20:20:45 -05:00
Russ White
bb1d52b3c0
Merge pull request #12604 from donaldsharp/distance_metric_offload_fixes
Distance/metric offload fixes
2023-01-18 15:57:48 -05:00
Donald Sharp
a6782fbaf8 tests: zebra_netlink only gives 10 seconds to install all routes
Under really heavily loaded systems this is insufficient.  Looking
at the run output we have this:

	  "2.1.3.22\/32":[
	    {
	      "installed":true,
	    }
	  ],
	  "2.1.3.23\/32":[
	    {
	      "queued":true,
            }
           ],

So after 10 seconds on the micronet system only 30 of the 100 routes are installed.
Give it more time.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-01-18 15:29:32 -05:00
Donald Sharp
6483e73336 tests: pim_basic fails in micronet
Looks like under heavy load, the test is not giving enough
time to come to steady state.  Do this:

a) send more udp packets and for longer
b) Increase time spent waiting

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-01-18 15:29:32 -05:00
Rafael Zalamena
18b7958e47 zebra: fix fpm netlink encode out of bounds read
Don't attempt to encode the pointer address instead pass the pointer
directly so the real contents can be accessed.

(`ri->pref_src` type is `union g_addr *`)

Found by Coverity Scan (CID 1482162)

Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
2023-01-18 15:53:10 -03:00
Mark Stapp
405227b519
Merge pull request #12653 from opensourcerouting/netns-cs
zebra: make sure string is null terminated
2023-01-18 07:32:15 -05:00
Mark Stapp
e7523b9a94
Merge pull request #12648 from opensourcerouting/gmtime-fix
lib: fix gmtime_assafe potential issues
2023-01-17 16:25:35 -05:00
Donald Sharp
f9341ad6a1
Merge pull request #12588 from LabNConsulting/chopps/unet-aflags
tests: replace -a (all) with individual flags for nsenter
2023-01-17 15:55:31 -05:00
Rafael Zalamena
da5bd13c08 zebra: make sure string is null terminated
Do extra inotify data structure checks and copy the file name to a stack
buffer making sure it is null byte terminated.

Found by Coverity Scan (CID 1465494)

Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
2023-01-17 17:08:23 -03:00
Christian Hopps
7cec6e4359 tests: replace -a (all) with individual flags for nsenter
- required for old OSs like centos 7
- fixes #11924

Signed-off-by: Christian Hopps <chopps@labn.net>
2023-01-17 11:33:27 -05:00
Donatas Abraitis
7475ed3330
Merge pull request #12449 from chiragshah6/mdev1
zebra: vrf-id support for show vrf vni json cmd
2023-01-17 18:25:01 +02:00
Donatas Abraitis
af5d731255 Revert "lib: BGP registration with IGP for BGP ORR rSPF calc"
This reverts commit a5dd4bf47d.
2023-01-17 18:15:56 +02:00
Donatas Abraitis
cb6e090a90 Revert "doc: Add documentation for BGP ORR support"
This reverts commit 2b55ff400f.
2023-01-17 18:15:50 +02:00
Donatas Abraitis
cfd01fc0ac Revert "bgpd: optimal router reflection cli and fsm changes"
This reverts commit 70cd87ca02.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-01-17 18:15:28 +02:00
Donatas Abraitis
405e1c848f Revert "ospfd: rSPF calc and messaging for optimal route reflection"
This reverts commit a3d3a14c09.
2023-01-17 18:11:56 +02:00
Donatas Abraitis
1ea57af264 Revert "bgpd, ospfd: BGP ORR CI warning fixes"
This reverts commit d6b2761134.
2023-01-17 18:10:04 +02:00
Donatas Abraitis
3228977f58 Revert "ospfd: few fixes in rSPF calc when LSA received from non root node"
This reverts commit 9f2984d97c.
2023-01-17 18:09:55 +02:00
Donatas Abraitis
af7e7dbec5 Revert "bgpd: fix for crash when no neighbor A.B.C.D remote-as AS_NUM with orr config"
This reverts commit 5fcf01c9ae.
2023-01-17 18:07:46 +02:00
Donatas Abraitis
731d0769e2 Revert "bgpd, ospfd: update BGP when routes are removed from OSPF routing table"
This reverts commit bba9435157.
2023-01-17 18:07:41 +02:00
Donatas Abraitis
e993b11c23 Revert "bgpd: code review comments addressed"
This reverts commit 80f6ea8b99.
2023-01-17 18:07:36 +02:00
Russ White
e2fd75fce2
Merge pull request #12584 from pguibert6WIND/bgp_imported_distance
bgpd: imported vpn entries get appropriate distance
2023-01-17 10:16:46 -05:00
Russ White
f31c35993d
Merge pull request #12644 from opensourcerouting/rib-uaf
zebra: fix use after free on RIB processing
2023-01-17 09:40:58 -05:00
Russ White
775ce087f1
Merge pull request #12643 from opensourcerouting/fix/cosmetic_log_changes
bgpd: Drop redundant `vrf` keyword in BGP debug log changes
2023-01-17 09:40:28 -05:00
Russ White
6664d74505
Merge pull request #12641 from samanvithab/bgpd_crash
bgpd: Fix crash during shutdown due to race condition
2023-01-17 09:40:05 -05:00
Russ White
00d7261e20
Merge pull request #12636 from opensourcerouting/fix/bgp_accept-own_connected_routes
bgpd: Allow importing local routes with accept-own mechanism
2023-01-17 09:31:37 -05:00
Russ White
c542606e56
Merge pull request #12603 from opensourcerouting/fix/deprecate_bgp_stuff_some
bgpd: Deprecate some stuff
2023-01-17 09:12:39 -05:00
Russ White
2a71812153
Merge pull request #12601 from opensourcerouting/feature/bgp_neighbor_path-attribute_discard
bgpd: Add `neighbor path-attribute discard` command
2023-01-17 09:12:17 -05:00
Russ White
3b506eccc1
Merge pull request #12597 from opensourcerouting/fix/bgp_sender_as_path_prevention
bgpd: Do not send routes back received from a peer
2023-01-17 09:11:53 -05:00
Russ White
13e9afe9e4
Merge pull request #12424 from opensourcerouting/static-route-bfd
staticd: BFD static route monitoring
2023-01-17 09:09:24 -05:00
Rafael Zalamena
0839d0c742 lib: fix gmtime_assafe potential issues
Changes:
- Convert `unsigned int` to `time_t` to satisfy time truncation warnings
  even though at this point we had already used the modulus operator.

- Avoid trying to access outside the bounds of the array

  `months` array has a size of 13 elements, but the code inside the loop
  uses `i + 1` to peek on the next month.

Found by Coverity Scan (CID 1519752 and 1519769)

Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
2023-01-17 10:21:39 -03:00
Philippe Guibert
a04f1c42eb bgpd: imported vpn entries get appropriate distance
MPLS VPN networks can either peer with iBGP or eBGP. When
calculating the distance to send to zebra, the imported prefix
is never sent with distance information, even if the vty
command is used under the ipv4 unicast address family:

router bgp 65505 vrf vrf1
 address-family ipv4 unicast
  distance bgp 26 27 28
  [vpn config]

The observation is that the distance sent to zebra for an
imported prefix is still 20:

[..]
VRF vrf1:
B>  192.168.0.0/24 [20/0] via 2.2.2.2 (vrf default) (recursive), label 20, weight 1, 00:00:12
  *                          via 10.125.0.6, ntfp3 (vrf default), label implicit-null/20, weight 1, 00:00:12

The expectation is that the incoming prefix has to follow the
distance that is configured, or the distance derived from the peer
relationship established by the parent prefix.

In the case, an iBGP relationship is done, and no distance
configuration is done, the below show is expected:

   [..]
   VRF vrf1:
   B*>  192.168.0.0/24 [200/0] via 192.168.0.2, r1-gre0 (vrf default), label 20, weight 1, 00:00:12

In the case an iBGP relationship is done, and distance configuration
is performed as below:
   [..]
   distance bgp 21 201 41
   [..]

Then the below show is expected:

   [..]
   VRF vrf1:
   B*>  192.168.0.0/24 [201/0] via 192.168.0.2, r1-gre0 (vrf default), label 20, weight 1, 00:00:12

To get this behaviour, get the peer origin where the prefix is coming
from.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2023-01-17 13:24:33 +01:00
anlan_cs
47f5eb7487 bgpd: cosmetic changes for debug
Two changes for debug log -
1. Display empty VRF as "None".
2. Correct wrong "type-2" word for type-3 route.

Before:
```
2023/01/17 04:00:30 BGP: [Z5AV7-75RTE] VRF   vni 100 type-2 route evp [3]:[0]:[32]:[88.88.88.88] RMAC 00:00:00:00:00:00 nexthop 88.88.88.88 esi (null)
```

After:
```
2023/01/17 04:05:24 BGP: [M3X4Y-24DVB] VRF None vni 100 type-3 route evp [3]:[0]:[32]:[88.88.88.88] RMAC 00:00:00:00:00:00 nexthop 88.88.88.88 esi (null)
```

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2023-01-17 17:16:39 +08:00
Kuldeep Kashyap
244f2df015 tests: Fix for frr-bot style issues
Fixed style issues repoted by frr-bot

Signed-off-by: Kuldeep Kashyap <kashyapk@vmware.com>
2023-01-17 14:07:23 +05:30
Kuldeep Kashyap
db726bb8b7 tests: [topojson]Update assert/error messages
Few assert/error messages are updated for test
scripts for better debugging.

Signed-off-by: Kuldeep Kashyap <kashyapk@vmware.com>
2023-01-17 14:00:49 +05:30
anlan_cs
efa354a978 lib: fix wrong returned value for filter
When setting rule for access-list ( and prefix-list ) without sequence, it
will automatically get a sequence by `acl_get_seq()`, and return
`CMD_SUCCESS` for command even this sequence value is wrong.

In this scene, `CMD_WARNING_CONFIG_FAILED` should be returned with a
warning.

So, add the check in `acl_get_seq()` and move `nb_cli_enqueue_change()`
after the check of wrong sequence.

Both `plist_remove_if_empty()` and `acl_remove_if_empty()` should ignore
this check, there is no change on them.

Before:
```
anlan(config)# access-list aa seq 4294967295 deny 6.6.6.6/32
anlan(config)# access-list aa deny 6.6.6.7/32  <- Return CMD_SUCCESS
YANG error(s):
 Value "4294967300" is out of uint32's min/max bounds.
 Value "4294967300" is out of uint32's min/max bounds.
 Value "4294967300" is out of uint32's min/max bounds.
 Value "4294967300" is out of uint32's min/max bounds.
 Value "4294967300" is out of uint32's min/max bounds.
 YANG path: Schema location /frr-filter:lib/prefix-list/entry/sequence.
% Failed to edit configuration.
```

After:
```
anlan(config)# access-list aa seq 4294967295 deny 6.6.6.6/32
anlan(config)# access-list aa deny 6.6.6.7/32  <- Return CMD_WARNING_CONFIG_FAILED
% Malformed sequence value
```

Additionally, fixed the overflow issue on `acl_get_seq()` on **32bit** platforms.
Just change the returned type of `acl_get_seq()` from `long` to `int64_t`.

Before:
```
anlan(config)# access-list bb seq 4294967295 deny 6.6.6.6/32
anlan(config)# access-list bb deny 6.6.6.7/32
anlan(config)# do show run
...
access-list bb seq 4294967295 deny 6.6.6.6/32
access-list bb seq 4 deny 6.6.6.7/32 <- Overflow
```

After:
```
anlan(config)# access-list bb seq 4294967295 deny 6.6.6.6/32
anlan(config)# access-list bb deny 6.6.6.7/32
% Malformed sequence value
```

Signed-off-by: anlan_cs <vic.lan@pica8.com>
2023-01-17 09:36:50 +08:00
Rafael Zalamena
e96817f877 zebra: fix use after free on RIB processing
After calling `rib_unlink` the variable `re` will point to `free()`d
memory, so don't attempt to use it after this point.

Found by Coverity Scan (Coverity ID 1519784)

Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
2023-01-16 17:40:54 -03:00
Donatas Abraitis
b2a100e439 bgpd: Drop redundant vrf keyword in BGP debug log changes
Before:

```
nexthop is not valid (in vrf VRF vrf100)
updating RD 65001:100, 10.100.1.1/32 to vrf VRF vrf200
```

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-01-16 21:36:08 +02:00
Donatas Abraitis
2273550f04
Merge pull request #12621 from donaldsharp/route_scale_failure_micronet
tests: route_scale tests are failing on micronet
2023-01-16 14:32:26 +02:00
Samanvitha B Bhargav
8c9d306c8d bgpd: Fix crash during shutdown due to race condition
[New LWP 2524]
[New LWP 2539]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/opt/avi/bin/bgpd -f /run/frr/avi_ns3_bgpd.config -i /opt/avi/etc/avi_ns3_bgpd.'.
Program terminated with signal SIGABRT, Aborted.
[Current thread is 1 (Thread 0x7f92ac8f1740 (LWP 2524))]
0  0x00007f92acb3800b in raise () from /lib/x86_64-linux-gnu/libc.so.6
[Current thread is 1 (Thread 0x7f92ac8f1740 (LWP 2524))]
0  0x00007f92acb3800b in raise () from /lib/x86_64-linux-gnu/libc.so.6
1  0x00007f92acb17859 in abort () from /lib/x86_64-linux-gnu/libc.so.6
2  0x00007f92acb17729 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
3  0x00007f92acb28fd6 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
4  0x00007f92accf2164 in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
5  0x000055b46be1ef63 in bgp_keepalives_wake () at bgpd/bgp_keepalives.c:311
6  0x000055b46be1f111 in bgp_keepalives_stop (fpt=0x55b46cfacf20, result=<optimized out>) at bgpd/bgp_keepalives.c:323
7  0x00007f92acea9521 in frr_pthread_stop (fpt=0x55b46cfacf20, result=result@entry=0x0) at lib/frr_pthread.c:176
8  0x00007f92acea9586 in frr_pthread_stop_all () at lib/frr_pthread.c:188
9  0x000055b46bdde54a in bgp_pthreads_finish () at bgpd/bgpd.c:8150
10 0x000055b46bd696ca in bgp_exit (status=0) at bgpd/bgp_main.c:210
11 sigint () at bgpd/bgp_main.c:154
12 0x00007f92acecc1e9 in quagga_sigevent_process () at lib/sigevent.c:105
13 0x00007f92aced689a in thread_fetch (m=m@entry=0x55b46cf23540, fetch=fetch@entry=0x7fff95379238) at lib/thread.c:1487
14 0x00007f92aceb2681 in frr_run (master=0x55b46cf23540) at lib/libfrr.c:1010
15 0x000055b46bd676f4 in main (argc=11, argv=0x7fff953795a8) at bgpd/bgp_main.c:482

Root cause:
This is due to race condition between main thread & keepalive thread during clean-up.

This happens when the keepalive thread is processing a wake signal owning the mutex, when meanwhile the main thread tries to stop the keepalives thread.

In main thread, the keepalive thread’s running bit (fpt->running) is set to false, without taking the mutex & then it blocks on mutex.
Meanwhile, keepalive thread which owns the mutex sees that the running bit is false & executes bgp_keepalives_finish() which also frees up mutex.
Main thread that is waiting on mutex with pthread_mutex_lock() will cause core while trying to access mutex.

Fix:
Take the lock in main thread while setting the fpt->running to false.

Signed-off-by: Samanvitha B Bhargav <bsamanvitha@vmware.com>
2023-01-16 04:22:11 -08:00