Commit Graph

574 Commits

Author SHA1 Message Date
Donatas Abraitis
b0d906ea15 tests: Check if peer->af_flags can be higher than uint32_t
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit 5acfd822be)
2023-02-24 14:29:25 +00:00
Donatas Abraitis
c95bfed068 bgpd: Renumber peer->af_flags to be without any gaps
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit 47017b846f)
2023-02-24 14:29:25 +00:00
Donatas Abraitis
8587e5434e bgpd: Convert peer_af_flag_check() to bool
Since we increased peer->af_flags from uint32_t to uint64_t,
peer_af_flag_check() was historically returning integer, and not bool
as should be.

The bug was that if we have af_flags higher than uint32_t it will never
returned a right value.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
(cherry picked from commit 2c722516c3)
2023-02-24 14:29:24 +00:00
Donatas Abraitis
e9dbc60ee2
Merge pull request #12666 from donaldsharp/bgp_outq_limit
Bgp outq limit
2023-01-20 11:59:34 +02:00
Donald Sharp
963b7ee448 bgpd: Limit peer output queue length like input queue length
Consider this scenario:

Lots of peers with a bunch of route information that is changing
fast.  One of the peers happens to be really slow for whatever
reason.  The way the output queue is filled is that bgpd puts
64 packets at a time and then reschedules itself to send more
in the future.  Now suppose that peer has hit it's input Queue
limit and is slow.  As such bgp will continue to add data to
the output Queue, irrelevant if the other side is receiving
this data.

Let's limit the Output Queue to the same limit as the Input
Queue.  This should prevent bgp eating up large amounts of
memory as stream data when under severe network trauma.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2023-01-19 11:48:01 -05:00
Donatas Abraitis
cfd01fc0ac Revert "bgpd: optimal router reflection cli and fsm changes"
This reverts commit 70cd87ca02.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-01-17 18:15:28 +02:00
Russ White
c542606e56
Merge pull request #12603 from opensourcerouting/fix/deprecate_bgp_stuff_some
bgpd: Deprecate some stuff
2023-01-17 09:12:39 -05:00
Donatas Abraitis
db3f8f3199 bgpd: Deprecate some unused BGP stuff
* BGP optional parameter type (Authentication)
* BGP UPDATE message error subcode for AS loop

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-01-14 21:30:35 +02:00
Donatas Abraitis
a5c6a9b18e bgpd: Add neighbor path-attribute discard command
The idea is to drop unwanted attributes from the BGP UPDATE messages and
continue by just ignoring them. This improves the security, flexiblity, etc.

This is the command that Cisco has also.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-01-14 21:29:41 +02:00
Donatas Abraitis
f5540d6d41 bgpd: Drop deprecated BGP_ATTR_AS_PATHLIMIT path attribute
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2023-01-06 14:40:49 +02:00
Donald Sharp
534db980a2 bgpd: When creating peer convey if it is a CONFIG_NODE or not
When actually creating a peer in BGP, tell the creation if
it is a config node or not.  There were cases where the
CONFIG_NODE was being set *after* being placed into
the bgp->peerhash, thus causing collisions between the
doppelganger and the peer and eventually use after free's.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-12-05 09:11:22 -05:00
Donatas Abraitis
4f770cf1d2 bgpd: Implement graceful-shutdown command per neighbor
We already have a global knob for graceful-shutdown, but it's handy having
per neighbor knob as well.

Especially when a single neighbor needs to be restarted/shutdown gracefuly.

We can do this route-maps, but this is a faster/cleaner way doing the same
for an operator.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-11-16 21:42:21 +02:00
Donald Sharp
b36156760b
Merge pull request #12259 from opensourcerouting/fix/show_rtt_always
bgpd: Shutdown RTT improvements
2022-11-16 10:28:23 -05:00
Donald Sharp
7f1f931447 bgpd: Break up rpki prefix revalidation by bgp structure
RPKI revalidation is an possibly expensive operation.  Break up
revalidation on a prefix basis by the `struct bgp` pointer.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-11-08 08:11:52 -05:00
Donald Sharp
7651f27751 bgpd: Make rpki soft_reconfig calling events
An end operator is showing cases with multiple bgp feeds
and a rpki table that calling the revalidation functions
is extremely expensive and they are seeing lots of thread
WARNS about timers being late and eventually the whole
thing gets unresponsive.  Let's break up soft reconfiguration
in to a series of events per peer so that all the work
for this is not done at the same exact time.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-11-08 08:11:52 -05:00
Donatas Abraitis
5597214ccb bgpd: Show the reason when the session is killed due to RTT
Simulated latency with:

```
tc qdisc add dev eth3 root netem delay 100ms
```

```
donatas-laptop# sh ip bgp summary failed

IPv4 Unicast Summary (VRF default):
BGP router identifier 192.0.2.252, local AS number 65000 vrf-id 0
BGP table version 28
RIB entries 0, using 0 bytes of memory
Peers 1, using 724 KiB of memory

Neighbor        EstdCnt DropCnt ResetTime Reason
192.168.10.65         2       2  00:00:17 Admin. shutdown (RTT)

Displayed neighbors 1
Total number of neighbors 1
donatas-laptop#
```

Another end received:

```
%NOTIFICATION: received from neighbor 192.168.10.17 6/2 (Cease/Administrative Shutdown) "shutdown due to high round-trip-time (104ms > 5ms, hit 21 times)"
```

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-11-04 15:56:23 +02:00
Russ White
a5dac02901
Merge pull request #12114 from opensourcerouting/feature/bgp_aigp_attribute
bgpd: Implement AIGP
2022-10-31 11:24:43 -04:00
Donatas Abraitis
97a52c82a5 bgpd: Implement Accumulated IGP Metric Attribute for BGP
https://www.rfc-editor.org/rfc/rfc7311.html

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-10-26 11:26:57 +03:00
Stephen Worley
a0b937de42 bgpd,doc: limit InQ buf to allow for back pressure
Add a default limit to the InQ for messages off the bgp peer
socket. Make the limit configurable via cli.

Adding in this limit causes the messages to be retained in the tcp
socket and allow for tcp back pressure and congestion control to kick
in.

Before this change, we allow the InQ to grow indefinitely just taking
messages off the socket and adding them to the fifo queue, never letting
the kernel know we need to slow down. We were seeing under high loads of
messages and large perf-heavy routemaps (regex matching) this queue
would cause a memory spike and BGP would get OOM killed. Modifying this
leaves the messages in the socket and distributes that load where it
should be in the socket buffers on both send/recv while we handle the
mesages.

Also, changes were made to allow the ringbuffer to hold messages and
continue to be filled by the IO pthread while we wait for the Main
pthread to handle the work on the InQ.

Memory spike seen with large numbers of routes flapping and route-maps
with dozens of regex matching:

```
Memory statistics for bgpd:
System allocator statistics:
  Total heap allocated:  > 2GB
  Holding block headers: 516 KiB
  Used small blocks:     0 bytes
  Used ordinary blocks:  160 MiB
  Free small blocks:     3680 bytes
  Free ordinary blocks:  > 2GB
  Ordinary blocks:       121244
  Small blocks:          83
  Holding blocks:        1
```

With most of it being held by the inQ (seen from the stream datastructure info here):

```
Type                          : Current#   Size       Total     Max#  MaxBytes
...
...
Stream                        :   115543 variable  26963208 15970740 3571708768
```

With this change that memory is capped and load is left in the sockets:

RECV Side:
```
State    Recv-Q    Send-Q                           Local Address:Port                         Peer Address:Port    Process
ESTAB    265350    0            [fe80::4080:30ff:feb0:cee3]%veth1:36950         [fe80::4c14:9cff:fe1d:5bfd]:179      users:(("bgpd",pid=1393334,fd=26))
         skmem:(r403688,rb425984,t0,tb425984,f1816,w0,o0,bl0,d61)

```

SEND Side:
```
State  Recv-Q  Send-Q                        Local Address:Port                  Peer Address:Port   Process
ESTAB  0       1275012   [fe80::4c14:9cff:fe1d:5bfd]%veth1:179    [fe80::4080:30ff:feb0:cee3]:36950   users:(("bgpd",pid=1393443,fd=27))
         skmem:(r0,rb131072,t0,tb1453568,f1916,w1300612,o0,bl0,d0)

```

Signed-off-by: Stephen Worley <sworley@nvidia.com>
2022-10-24 18:23:29 -04:00
Carmine Scarpitta
527588aa78 bgpd: add support for per-VRF SRv6 SID
In the current implementation of bgpd, SRv6 SIDs can be configured only
under the address-family. This enables bgpd to leak IPv6 routes using
an SRv6 End.DT6 behavior and IPv4 routes using an SRv6 End.DT4
behavior. It is not possible to leak both IPv6 and IPv4 routes using a
single SRv6 SID.

This commit adds a new CLI command
"sid vpn per-vrf export <sid_idx|auto>" that enables bgpd to leak both
IPv6 and IPv4 routes using a single SRv6 SID (End.DT46 behavior).

Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
2022-10-18 16:08:23 +02:00
Donatas Abraitis
272c6d5db1
Merge pull request #8647 from sworleys/DVNI-Config-Changes
bgpd: EVPN D-VNI L3 RT Config Enhancements
2022-10-18 14:17:04 +03:00
Donatas Abraitis
46dbf9d0c0 bgpd: Implement ACCEPT_OWN extended community
TL;DR: rfc7611.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-10-12 17:48:43 +03:00
Russ White
984eb32b58
Merge pull request #11159 from maduri111/bgpd-orr
bgpd: optimal route reflection
2022-10-12 09:30:36 -04:00
Russ White
b6aa61ba3c
Merge pull request #11981 from proelbtn/add-support-to-change-function-length
bgpd: Add support to change Segment Routing function length
2022-10-12 08:44:29 -04:00
Madhuri Kuruganti
70cd87ca02 bgpd: optimal router reflection cli and fsm changes
Signed-off-by: Madhuri Kuruganti <maduri111@gmail.com>
2022-10-12 13:43:55 +05:30
Donatas Abraitis
eb53128367
Merge pull request #9998 from pguibert6WIND/bgp_tcp_keepalive
Bgp tcp keepalive
2022-10-10 15:46:30 +03:00
Ryoga Saito
bee2e7d08f bgpd: save srv6_locator_chunk in vpn_policy
In order to send correct SRv6 L3VPN advertisement, we need to save
srv6_locator_chunk in vpn_policy. With this information, we can
construct correct SRv6 L3VPN advertisement packets.

Signed-off-by: Ryoga Saito <ryoga.saito@linecorp.com>
2022-10-07 18:26:48 +09:00
Russ White
a8ef436639
Merge pull request #12040 from opensourcerouting/fix/bgp_local_as_remote_as
bgpd: Allow using remote-as the same as local-as
2022-10-06 10:03:26 -04:00
Madhuri Kuruganti
e85e4a8d16 bgpd: conditional advertisement code cleanup
Signed-off-by: Madhuri Kuruganti <maduri111@gmail.com>
2022-10-06 12:43:05 +05:30
Donatas Abraitis
d6b0327c35 bgpd: Allow using remote-as the same as local-as
As an example, Arista EOS allows this behavior.

Configuration something like:

```
 neighbor PG peer-group
 neighbor PG remote-as 65001
 neighbor PG local-as 65001
 neighbor 192.168.10.124 peer-group PG
```

Or without peer-group.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-09-29 21:13:40 +03:00
Xiao Liang
a783cc05f0 bgpd: Handle route-refresh request received before EoR
See the BGP message sequence:

    R1                  R2
    |      updates      |
    |------------------>|
    |                   |
    |  refresh request  |
    x<------------------|
    |                   |
    |   updates cont.   |
    |------------------>|
    |                   |
    |    end-of-rib     |
    |------------------>|
    |                   |

When R1 and R2 establish BGP session, R1 begins to send initial updates.
If R2 sends a route-refresh request before EoR, it's silently ignored
by R1, and routes received earlier have no chance to be processed again.

RFC7313 says, "for a BGP speaker that supports the BGP Graceful Restart,
it MUST NOT send a BoRR for an <AFI, SAFI> to a neighbor before it sends
the EoR for the <AFI, SAFI> to the neighbor." But it doesn't forbid
route-refresh request to be sent before receiving EoR.

To handle this scenario, postpone response to refresh request until EoR
is sent.

Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
2022-09-16 18:26:21 +08:00
Rafael Zalamena
340ed5f9e2
Merge pull request #11823 from pguibert6WIND/bgp_vpnv4_gre_ebgp
Bgp vpnv4 convey without transport label
2022-09-06 13:37:19 -03:00
Philippe Guibert
4cd690ae4d bgpd: add 'mpls bgp forwarding' to ease mpls vpn ebgp peering
RFC4364 describes peerings between multiple AS domains, to ease
the continuity of VPN services across multiple SPs. This commit
implements a sub-set of IETF option b) described in chapter 10 b.

The ASBR to ASBR approach is taken, with an EBGP peering between
the two routers. The EBGP peering must be directly connected to
the outgoing interface used. In those conditions, the next hop
is directly connected, and there is no need to have a transport
label to convey the VPN label. A new vty command is added on a
per interface basis:

This command if enabled, will permit to convey BGP VPN labels
without any transport labels (i.e. with implicit-null label).

restriction:
this command is used only for EBGP directly connected peerings.
Other use cases are not covered.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2022-09-05 22:26:33 +02:00
Donatas Abraitis
b6a3df6b48 bgpd: Drop useless comments for peer af flags
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-08-31 14:39:38 +03:00
Donatas Abraitis
da5e1a58e9 bgpd: Increase peer af_flags to uint64_t
Increasing in advance, as we already hitting the current limit.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-08-31 14:35:55 +03:00
Russ White
d72c279d08
Merge pull request #11833 from opensourcerouting/feature/bgp_neighbor_soo
bgpd: Add `neighbor soo` command
2022-08-30 11:17:53 -04:00
Philippe Guibert
d1adb44843 bgpd: support TCP keepalive for BGP connection
TCP keepalive is enabled once BGP connection is established.

New vty commands:

bgp tcp-keepalive <1-65535> <1-65535> <1-30>
no bgp tcp-keepalive

Signed-off-by: Xiaofeng Liu <xiaofeng.liu@6wind.com>
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2022-08-30 15:09:28 +02:00
Donald Sharp
083ec940ab bgpd: Convert from bgp_clock() to monotime()
Let's convert to our actual library call instead
of using yet another abstraction that makes it fun
for people to switch daemons.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-08-24 08:23:40 -04:00
Stephen Worley
58d8948cf4 bgpd: evpn L3 RT auto config and wildcard implementation
Implement forcing L3 auto derivation via configs even when
manually RTs are set. This will allow both to coexist in
BGP RTs. Without using auto config command, it will remove
auto derived RTs when you manually configure your own. To allow
both, use the auto command ond import/export/both.

Implement '*' wildcard import L3 RTs so we can import a route into any AS.
This is necessary to avoid a user from having to configure an L3 RT for
every AS they care to import evpn route from.

Signed-off-by: Stephen Worley <sworley@nvidia.com>
2022-08-23 12:41:25 -04:00
Donatas Abraitis
01da2d2691 bgpd: Add neighbor soo command
BGP SoO is a tag that is appended on BGP updates to allow a peer to mark
a particular peer as belonging to a particular site. In certain MPLS L3 VPN
configurations, the BGP AS-Path may not provide the granularity needed
prevent a loop in the control-plane. With this in mind, BGP SoO is designed
to fill this gap and prevent a routing loop that may occur.

If we configure for example, `neighbor soo 65000:1` at PEs, routes won't be
announced between CPEs if soo matches. This is especially needed when using
as-override or allowas-in.

Also, this is the automated way of the same behavior as configuring route-maps
for each peer like:

```
bgp extcommunity-list cpe permit soo 65000:1
!
route-map cpe permit 10
 set extcommunity soo 65000:1
...
route-map cpe deny 10
 match extcommunity cpe
route-map cpe permit 20
...
```

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-08-20 21:22:45 +03:00
Donatas Abraitis
0b1fb52c2a bgpd: Convert some int functions to void
The output is not checked, we can have void instead.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-08-12 13:26:38 +03:00
Donatas Abraitis
c41e93720a bgpd: Reset BGP sessions when changing the port
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-07-27 11:44:07 +03:00
Philippe Guibert
a486300b26 bgpd: implement retain route-target all behaviour
A new command is available under SAFI_MPLS_VPN:

With this command, the BGP vpnvx prefixes received are
not kept, if there are no VRF interested in importing
those vpn entries.

A soft refresh is performed if there is a change of
configuration: retain cmd, vrf import settings, or
route-map change.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2022-07-18 08:57:19 +02:00
Quentin Young
ecf2b628d9 bgpd: rename update_type enum values
These values were named WITHDRAW and UPDATE. Yeah, you guessed it, those
are already #define's elsewhere (bgp_debug.h). Hilarity ensues.

Signed-off-by: Quentin Young <qlyoung@nvidia.com>
2022-07-01 15:22:04 +03:00
Donatas Abraitis
7dddd1f733 bgpd: Make sure peer-groups/unnumbered work too with BGP role
Just adding a support for peer-groups, because now it's not possible to
configure BGP role for peer-groups.

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-06-28 18:32:11 +03:00
Lou Berger
613025ef10
Merge pull request #11093 from donaldsharp/allow_martians
Allow martians
2022-06-28 10:38:57 -04:00
Donatas Abraitis
83194f394b bgpd: Use uin64_t for peer->flags
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
2022-06-27 17:22:54 +03:00
Donatas Abraitis
f646c17a43
Merge pull request #11426 from error2407/open_policy
bgpd: Add RFC9234 implementation
2022-06-27 09:57:29 +03:00
Donald Sharp
8666265e2e bgpd: Add bgp allow-martian-nexthop command
The command `debug bgp allow-martian` is not actually
a debug command it's a command that when entered allows
bgp to not reset a peering when a martian nexthop is
passed in the nlri.

Add the `bgp allow-martian-nexthop` command and allow it to be
used.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
2022-06-24 07:45:46 -04:00
Eugene Bogomazov
8f2d6021f8 bgpd: Add patches for RFC9234 implementation
This commit fixes some issues that were noted by the reviewer

Signed-off-by: Eugene Bogomazov <eb@qrator.net>
2022-06-21 17:41:53 +03:00