We copy the password only if an existing peer structure didn't have it.
But it might be the case when it exists, and we skip here.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Effectively a massive search and replace of
`struct thread` to `struct event`. Using the
term `thread` gives people the thought that
this event system is a pthread when it is not
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This is a first in a series of commits, whose goal is to rename
the thread system in FRR to an event system. There is a continual
problem where people are confusing `struct thread` with a true
pthread. In reality, our entire thread.c is an event system.
In this commit rename the thread.[ch] files to event.[ch].
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
It was hard to catch those unless using higher values than uint32_t, but
already hit, it's time to fix completely.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Since we increased peer->af_flags from uint32_t to uint64_t,
peer_af_flag_check() was historically returning integer, and not bool
as should be.
The bug was that if we have af_flags higher than uint32_t it will never
returned a right value.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
When we specify remote-as as external/internal, we need to set local_as to
bgp->as, instead of bgp->confed_id. Before this patch, (bgp->as != *as) is
always valid for such a case because *as is always 0.
Also, append peer->local_as as CONFED_SEQ to avoid other side withdrawing
the routes due to confederation own AS received and/or malformed as-path.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
When leaving the BGP shutdown state we must restart the peer timers
otherwise nothing will happen.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Implement: https://datatracker.ietf.org/doc/html/draft-abraitis-bgp-version-capability
Tested with GoBGP:
```
% ./gobgp neighbor 192.168.10.124
BGP neighbor is 192.168.10.124, remote AS 65001
BGP version 4, remote router ID 200.200.200.202
BGP state = ESTABLISHED, up for 00:01:49
BGP OutQ = 0, Flops = 0
Hold time is 3, keepalive interval is 1 seconds
Configured hold time is 90, keepalive interval is 30 seconds
Neighbor capabilities:
multiprotocol:
ipv4-unicast: advertised and received
ipv6-unicast: advertised
route-refresh: advertised and received
extended-nexthop: advertised
Local: nlri: ipv4-unicast, nexthop: ipv6
UnknownCapability(6): received
UnknownCapability(9): received
graceful-restart: advertised and received
Local: restart time 10 sec
ipv6-unicast
ipv4-unicast
Remote: restart time 120 sec, notification flag set
ipv4-unicast, forward flag set
4-octet-as: advertised and received
add-path: received
Remote:
ipv4-unicast: receive
enhanced-route-refresh: received
long-lived-graceful-restart: advertised and received
Local:
ipv6-unicast, restart time 10 sec
ipv4-unicast, restart time 20 sec
Remote:
ipv4-unicast, restart time 0 sec, forward flag set
fqdn: advertised and received
Local:
name: donatas-pc, domain:
Remote:
name: spine1-debian-11, domain:
software-version: advertised and received
Local:
GoBGP/3.10.0
Remote:
FRRouting/8.5-dev-MyOwnFRRVersion-gdc92f44a45-dirt
cisco-route-refresh: received
Message statistics:
```
FRR side:
```
root@spine1-debian-11:~# vtysh -c 'show bgp neighbor 192.168.10.17 json' | \
> jq '."192.168.10.17".neighborCapabilities.softwareVersion.receivedSoftwareVersion'
"GoBGP/3.10.0"
root@spine1-debian-11:~#
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
The route-distinguisher string can be expressed in different
ways when the AS number is part of the RD. And the configured
string value has to be kept intact.
The following vty commands store the string value internally:
- router bgp / address-family ipv4 unicast / rd vpn export <>
- router bgp / address-family l2vpn evpn / rd <>
- router bgp / address-family l2vpn evpn / vni <> / rd <>
The vty commands where RD is configured in the below places is
not considered:
- router bgp / rfapi related commands
- router bgp / address-family xxx xxx / network .. rd <>
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The confederation peers as and the confederation identifier as
are stored as a string to preserve the output in the running
configuration.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This identifier is used to display the peer configuration in
the running-config, like it has been configured.
The following commands are using a specific string attribute:
- neighbor .. remote-as ASN
- neighbor .. local-as ASN
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
A json AS number API is created in order to output a
given AS number. In order to keep backward compatibility,
if the as-notation uses a number, then the json is encoded
as an integer, otherwise the encoding will be a string.
For what is not relevant to running-configuration, the
as-notation mode is the one used for the BGP instance.
Also, the vty completion gets the configured 'as_pretty'
string value, when an user wants to get the available
BGP instances.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Each BGP prefix may have an as-path list attached. A forged
string is stored in the BGP attribute and shows the as-path
list output.
Before this commit, the as-path list output was expressed as
a list of AS values in plain format. Now, if a given BGP instance
uses a specific asnotation, then the output is changed:
new output:
router bgp 1.1 asnotation dot
!
address-family ipv4 unicast
network 10.200.0.0/24 route-map rmap
network 10.201.0.0/24 route-map rmap
redistribute connected route-map rmap
exit-address-family
exit
!
route-map rmap permit 1
set as-path prepend 1.1 5433.55 264564564
exit
ubuntu2004# do show bgp ipv4
BGP table version is 2, local router ID is 10.0.2.15, vrf id 0
Default local pref 100, local AS 1.1
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*> 4.4.4.4/32 0.0.0.0 0 32768 1.1 5433.55 4036.61268 ?
*> 10.0.2.0/24 0.0.0.0 0 32768 1.1 5433.55 4036.61268 ?
10.200.0.0/24 0.0.0.0 0 32768 1.1 5433.55 4036.61268 i
10.201.0.0/24 0.0.0.0 0 32768 1.1 5433.55 4036.61268 i
The changes include:
- the aspath structure has a new field: asnotation type
The ashash list will differentiate 2 aspaths using a different
asnotation.
- 3 new printf extensions display the as number in the wished
format: pASP, pASD, pASE for plain, dot, or dot+ format (extended).
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
A new keyword permits changing the BGP as-notation output:
- [no] router bgp <> [vrf BLABLA] [as-notation [<dot|plain|dot+>]]
At the BGP instance creation, the output will inherit the way the
BGP instance is declared. For instance, the 'router bgp 1.1'
command will configure the output in the dot format. However, if
the client wants to choose an alternate output, he will have to
add the extra command: 'router bgp 1.1 as-notation dot+'.
Also, if the user wants to have plain format, even if the BGP
instance is declared in dot format, the keyword can also be used
for that.
The as-notation output is only taken into account at the BGP
instance creation. In the case where VPN instances are used,
a separate instance may be dynamically created. In that case,
the real as-notation format will be taken into acccount at the
first configuration.
Linking the as-notation format with the BGP instance makes sense,
as the operators want to keep consistency of what they configure.
One technical reason why to link the as-notation output with the
BGP instance creation is that the as-path segment lists stored
in the BGP updates use a string representation to handle aspath
operations (by using regexp for instance). Changing on the fly
the output needs to regenerate this string representation to the
correct format. Linking the configuration to the BGP instance
creation avoids refreshing the BGP updates. A similar mechanism
is put in place in junos too.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
AS number can be defined as an unsigned long number, or
two uint16 values separated by a period (.). The possible
valus are:
- usual 32 bit values : [1;2^32 -1]
- <1.65535>.<0.65535> for dot notation
- <0.65535>.<0.65535> for dot+ notation.
The 0.0 value is forbidden when configuring BGP instances
or peer configurations.
A new ASN type is added for parsing in the vty.
The following commands use that new identifier:
- router bgp ..
- bgp confederation ..
- neighbor <> remote-as <>
- neighbor <> local-as <>
- clear ip bgp <>
- route-map / set as-path <>
An asn library is available in lib/ and provides some
services:
- convert an as string into an as number.
- parse an as path list string and extract a number.
- convert an as number into a string.
Also, the bgp tests forge an as_zero_path, and to do that,
an API to relax the possibility to have a 0 as value is
specifically called from the tests.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This is a preliminary work to handle various ways to configure
a BGP Autonomous System. When creating a BGP instance, the
user may want to define the AS number as a dotted value,
instead of using an integer value.
To handle both cases, an as_pretty char attribute will store
the as number as it has been given to the vtysh command:
router bgp <as number>
Whenever the as integer of the BGP instance was dumped,
the as_pretty original format is used.
The json output reuses the integer value to keep backward
compatibility with old displays.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Consider this scenario:
Lots of peers with a bunch of route information that is changing
fast. One of the peers happens to be really slow for whatever
reason. The way the output queue is filled is that bgpd puts
64 packets at a time and then reschedules itself to send more
in the future. Now suppose that peer has hit it's input Queue
limit and is slow. As such bgp will continue to add data to
the output Queue, irrelevant if the other side is receiving
this data.
Let's limit the Output Queue to the same limit as the Input
Queue. This should prevent bgp eating up large amounts of
memory as stream data when under severe network trauma.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The idea is to drop unwanted attributes from the BGP UPDATE messages and
continue by just ignoring them. This improves the security, flexiblity, etc.
This is the command that Cisco has also.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
When actually creating a peer in BGP, tell the creation if
it is a config node or not. There were cases where the
CONFIG_NODE was being set *after* being placed into
the bgp->peerhash, thus causing collisions between the
doppelganger and the peer and eventually use after free's.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Currently bgp does not stop any events that are on the thread
system for execution on peer deletion. This is not good.
Stop those events and prevent use after free's.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When the decision has been made to copy a peer configuration from
a peer to another peer because one is taking over. Do not automatically
set the CONFIG_NODE flag. Instead we need to handle that appropriately
when the final decision is made to transfer.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When changing the peers sockunion structure the bgp->peer
list was not being updated properly. Since the peer's su
is being used for a sorted insert then the change of it requires
that the value be pulled out of the bgp->peer list and then
put back into as well.
Additionally ensure that the hash is always released on peer
deletion.
Lead to this from this decode in a address sanitizer run.
=================================================================
==30778==ERROR: AddressSanitizer: heap-use-after-free on address 0x62a0000d8440 at pc 0x7f48c9c5c547 bp 0x7ffcba272cb0 sp 0x7ffcba272ca8
READ of size 2 at 0x62a0000d8440 thread T0
#0 0x7f48c9c5c546 in sockunion_same lib/sockunion.c:425
#1 0x55cfefe3000f in peer_hash_same bgpd/bgpd.c:890
#2 0x7f48c9bde039 in hash_release lib/hash.c:209
#3 0x55cfefe3373f in bgp_peer_conf_if_to_su_update bgpd/bgpd.c:1541
#4 0x55cfefd0be7a in bgp_stop bgpd/bgp_fsm.c:1631
#5 0x55cfefe4028f in peer_delete bgpd/bgpd.c:2362
#6 0x55cfefdd5e97 in no_neighbor_interface_config bgpd/bgp_vty.c:4267
#7 0x7f48c9b9d160 in cmd_execute_command_real lib/command.c:949
#8 0x7f48c9ba1112 in cmd_execute_command lib/command.c:1009
#9 0x7f48c9ba1573 in cmd_execute lib/command.c:1162
#10 0x7f48c9c87402 in vty_command lib/vty.c:526
#11 0x7f48c9c87832 in vty_execute lib/vty.c:1291
#12 0x7f48c9c8e741 in vtysh_read lib/vty.c:2130
#13 0x7f48c9c7a66d in thread_call lib/thread.c:1585
#14 0x7f48c9bf64e7 in frr_run lib/libfrr.c:1123
#15 0x55cfefc75a15 in main bgpd/bgp_main.c:540
#16 0x7f48c96b009a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
#17 0x55cfefc787f9 in _start (/usr/lib/frr/bgpd+0xe27f9)
0x62a0000d8440 is located 576 bytes inside of 23376-byte region [0x62a0000d8200,0x62a0000ddd50)
freed by thread T0 here:
#0 0x7f48c9eb9fb0 in __interceptor_free (/lib/x86_64-linux-gnu/libasan.so.5+0xe8fb0)
#1 0x55cfefe3fe42 in peer_free bgpd/bgpd.c:1113
#2 0x55cfefe3fe42 in peer_unlock_with_caller bgpd/bgpd.c:1144
#3 0x55cfefe4092e in peer_delete bgpd/bgpd.c:2457
#4 0x55cfefdd5e97 in no_neighbor_interface_config bgpd/bgp_vty.c:4267
#5 0x7f48c9b9d160 in cmd_execute_command_real lib/command.c:949
#6 0x7f48c9ba1112 in cmd_execute_command lib/command.c:1009
#7 0x7f48c9ba1573 in cmd_execute lib/command.c:1162
#8 0x7f48c9c87402 in vty_command lib/vty.c:526
#9 0x7f48c9c87832 in vty_execute lib/vty.c:1291
#10 0x7f48c9c8e741 in vtysh_read lib/vty.c:2130
#11 0x7f48c9c7a66d in thread_call lib/thread.c:1585
#12 0x7f48c9bf64e7 in frr_run lib/libfrr.c:1123
#13 0x55cfefc75a15 in main bgpd/bgp_main.c:540
#14 0x7f48c96b009a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When a peer is a peer-group based peer, and the config is inherited
from the peer group, let's ensure that the CONFIG_NODE flag stays
no matter what.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
currently the following configuration
dut:
!
interface ntfp2
ip router isis 1
!
router bgp 200
no bgp ebgp-requires-policy
bgp confederation identifier 300
bgp confederation peers 300
neighbor 192.168.1.1 remote-as 100
neighbor 192.168.2.2 remote-as 300
!
address-family ipv4 unicast
neighbor 192.168.2.2 default-originate
exit-address-family
!
router isis 1
is-type level-2-only
net 49.0001.0002.0002.0002.00
redistribute ipv4 connected level-2
!
end
router:
!
interface ntfp2
ip router isis 1
isis circuit-type level-2-only
!
router bgp 300
no bgp ebgp-requires-policy
bgp confederation identifier 300
bgp confederation peers 200
neighbor 192.168.2.1 remote-as 200
neighbor 192.168.3.2 remote-as 400
!
address-family ipv4 unicast
network 3.3.3.0/24
exit-address-family
!
router isis 1
is-type level-2-only
net 49.0001.0003.0003.0003.00
redistribute ipv4 connected level-2
!
end
on dut result of show bgp ipv4 unicast command is:
show bgp ipv4 unicast
BGP table version is 1, local router ID is 192.168.2.1, vrf id 0
Default local pref 100, local AS 200
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*> 1.1.1.0/24 192.168.1.1 0 0 100 i
instead of
sho bgp ipv4 unicast
BGP table version is 3, local router ID is 192.168.2.1, vrf id 0
Default local pref 100, local AS 200
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*> 1.1.1.0/24 192.168.1.1 0 0 100 i
*> 3.3.3.0/24 192.168.2.2 0 100 0 (300) i
*> 4.4.4.0/24 192.168.3.2 0 100 0 (300) 400 i
Displayed 3 routes and 3 total paths
According to RFC 5065:the usage of one of the member AS number as the
confederation identifier is not forbidden.
fixes are the following
in bgp_route.c:
in bgp_update remove the test for presence of confederation id in
as_path since, this case is allowed;
in bgp_vty.c
bgp_confederation_peers, remove the test on peer as value
in bgpd.c
bgp_confederation_peers_add
remove the test on peer as value
invert the order of setting peer->sort value and peer->local_as,
since peer->sort is depending from current peer->local_as value
bgp_confederation_peers_remove
invert the order of setting peer->sort value and peer->local_as,
since peer->sort is depending from current peer->local_as value
Signed-off-by: Francois Dumontet <francois.dumontet@6wind.com>
We already have a global knob for graceful-shutdown, but it's handy having
per neighbor knob as well.
Especially when a single neighbor needs to be restarted/shutdown gracefuly.
We can do this route-maps, but this is a faster/cleaner way doing the same
for an operator.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
RPKI revalidation is an possibly expensive operation. Break up
revalidation on a prefix basis by the `struct bgp` pointer.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
An end operator is showing cases with multiple bgp feeds
and a rpki table that calling the revalidation functions
is extremely expensive and they are seeing lots of thread
WARNS about timers being late and eventually the whole
thing gets unresponsive. Let's break up soft reconfiguration
in to a series of events per peer so that all the work
for this is not done at the same exact time.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Not all places were checking to see if soft reconfiguration
was turned on before calling into it to do all that work.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add a default limit to the InQ for messages off the bgp peer
socket. Make the limit configurable via cli.
Adding in this limit causes the messages to be retained in the tcp
socket and allow for tcp back pressure and congestion control to kick
in.
Before this change, we allow the InQ to grow indefinitely just taking
messages off the socket and adding them to the fifo queue, never letting
the kernel know we need to slow down. We were seeing under high loads of
messages and large perf-heavy routemaps (regex matching) this queue
would cause a memory spike and BGP would get OOM killed. Modifying this
leaves the messages in the socket and distributes that load where it
should be in the socket buffers on both send/recv while we handle the
mesages.
Also, changes were made to allow the ringbuffer to hold messages and
continue to be filled by the IO pthread while we wait for the Main
pthread to handle the work on the InQ.
Memory spike seen with large numbers of routes flapping and route-maps
with dozens of regex matching:
```
Memory statistics for bgpd:
System allocator statistics:
Total heap allocated: > 2GB
Holding block headers: 516 KiB
Used small blocks: 0 bytes
Used ordinary blocks: 160 MiB
Free small blocks: 3680 bytes
Free ordinary blocks: > 2GB
Ordinary blocks: 121244
Small blocks: 83
Holding blocks: 1
```
With most of it being held by the inQ (seen from the stream datastructure info here):
```
Type : Current# Size Total Max# MaxBytes
...
...
Stream : 115543 variable 26963208 15970740 3571708768
```
With this change that memory is capped and load is left in the sockets:
RECV Side:
```
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
ESTAB 265350 0 [fe80::4080:30ff:feb0:cee3]%veth1:36950 [fe80::4c14:9cff:fe1d:5bfd]:179 users:(("bgpd",pid=1393334,fd=26))
skmem:(r403688,rb425984,t0,tb425984,f1816,w0,o0,bl0,d61)
```
SEND Side:
```
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
ESTAB 0 1275012 [fe80::4c14:9cff:fe1d:5bfd]%veth1:179 [fe80::4080:30ff:feb0:cee3]:36950 users:(("bgpd",pid=1393443,fd=27))
skmem:(r0,rb131072,t0,tb1453568,f1916,w1300612,o0,bl0,d0)
```
Signed-off-by: Stephen Worley <sworley@nvidia.com>
We might disable sending unconfig/shutdown notifications when
Graceful-Restart is enabled and negotiated.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
As an example, Arista EOS allows this behavior.
Configuration something like:
```
neighbor PG peer-group
neighbor PG remote-as 65001
neighbor PG local-as 65001
neighbor 192.168.10.124 peer-group PG
```
Or without peer-group.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
We are allocating temporary memory for information about
what to process in this thread, which is not being cleaned
up on thread cancelling.
Signed-off-by: Samanvitha B Bhargav <bsmanvitha@vmware.com>
Memory allocated when 'import vrf route maps <>' is configured,
wasn't being freed when the entire bgp config
was deleted through 'no router bgp'.
Signed-off-by: Samanvitha B Bhargav <bsmanvitha@vmware.com>
==1179738== 48 (40 direct, 8 indirect) bytes in 1 blocks are definitely lost in loss record 13 of 29
==1179738== at 0x483AB65: calloc (vg_replace_malloc.c:760)
==1179738== by 0x493C8D5: qcalloc (memory.c:116)
==1179738== by 0x208F0C: ecommunity_dup (bgp_ecommunity.c:267)
==1179738== by 0x2B300C: conf_copy (bgp_updgrp.c:170)
==1179738== by 0x2B35BF: peer2_updgrp_copy (bgp_updgrp.c:277)
==1179738== by 0x2B5189: update_group_find (bgp_updgrp.c:826)
==1179738== by 0x2B70D0: update_group_adjust_peer (bgp_updgrp.c:1769)
==1179738== by 0x23DB7D: update_group_adjust_peer_afs (bgp_updgrp.h:519)
==1179738== by 0x243B21: bgp_establish (bgp_fsm.c:2129)
==1179738== by 0x244B94: bgp_event_update (bgp_fsm.c:2597)
==1179738== by 0x26B0E6: bgp_process_packet (bgp_packet.c:2895)
==1179738== by 0x498F5FD: thread_call (thread.c:2008)
==1179738== by 0x49253DA: frr_run (libfrr.c:1198)
==1179738== by 0x1EEC38: main (bgp_main.c:520)
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
TCP keepalive is enabled once BGP connection is established.
New vty commands:
bgp tcp-keepalive <1-65535> <1-65535> <1-30>
no bgp tcp-keepalive
Signed-off-by: Xiaofeng Liu <xiaofeng.liu@6wind.com>
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Let's convert to our actual library call instead
of using yet another abstraction that makes it fun
for people to switch daemons.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
BGP SoO is a tag that is appended on BGP updates to allow a peer to mark
a particular peer as belonging to a particular site. In certain MPLS L3 VPN
configurations, the BGP AS-Path may not provide the granularity needed
prevent a loop in the control-plane. With this in mind, BGP SoO is designed
to fill this gap and prevent a routing loop that may occur.
If we configure for example, `neighbor soo 65000:1` at PEs, routes won't be
announced between CPEs if soo matches. This is especially needed when using
as-override or allowas-in.
Also, this is the automated way of the same behavior as configuring route-maps
for each peer like:
```
bgp extcommunity-list cpe permit soo 65000:1
!
route-map cpe permit 10
set extcommunity soo 65000:1
...
route-map cpe deny 10
match extcommunity cpe
route-map cpe permit 20
...
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Handle ORF REMOVE_ALL events as well, because now we just silently return, and
a stale dynamic prefix-list is used instead of the new one.
Before this, soft clear/route refresh was needed. Don't know the reason, but
we didn't send updates when modifying the filters.
Probably due to a massive change of filters and to avoid automatic updates :/
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Let's just use THREAD_OFF consistently in the code base
instead of each daemon having a special macro that needs to
be looked at and remembered what it does.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
A new command is available under SAFI_MPLS_VPN:
With this command, the BGP vpnvx prefixes received are
not kept, if there are no VRF interested in importing
those vpn entries.
A soft refresh is performed if there is a change of
configuration: retain cmd, vrf import settings, or
route-map change.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Description:
- On removing just the route-map from the default-originate config,
update message is not sent to the peer,
and the properties set by route-map persists on peer's end,
until we do a clear bgp.
Fix:
- The flag which is set when default route is originated,
should be unset once "neighbor X.X.X.X default-orginate",
to remove route-map from "neighbor X.X.X.X default-orginate route-map Y",
so as to trigger the flow for sending an update.
Co-authored-by: Abhinay Ramesh <rabhinay@vmware.com>
Signed-off-by: Iqra Siddiqui <imujeebsiddi@vmware.com>
These values were named WITHDRAW and UPDATE. Yeah, you guessed it, those
are already #define's elsewhere (bgp_debug.h). Hilarity ensues.
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
Just adding a support for peer-groups, because now it's not possible to
configure BGP role for peer-groups.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
The command `debug bgp allow-martian` is not actually
a debug command it's a command that when entered allows
bgp to not reset a peering when a martian nexthop is
passed in the nlri.
Add the `bgp allow-martian-nexthop` command and allow it to be
used.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
bgp_rpki_validation2str implements a switch statement to determine the
correct string response from the validation state. So, switch to a
switch statement when getting a name by role for code consistency.
Signed-off-by: Eugene Bogomazov <eb@qrator.net>
Roles cannot be applied to iBGP sessions, so we can move this check to
the top of the role configuration method. Thus, we simplify the internal
logic of branching.
Signed-off-by: Eugene Bogomazov <eb@qrator.net>
BGP Role is currently defined only for eBGP session. So, we don't
need to consider which roles can be applied on iBGP session and
thus simplify code fragment.
Signed-off-by: Eugene Bogomazov <eb@qrator.net>
RFC9234 is a way to establish correct connection roles (Customer/
Provider, Peer or with RS) between bgp speakers. This patch:
- Add a new configuration/terminal option to set the appropriate local
role;
- Add a mechanism for checking used roles, implemented by exchanging
the corresponding capabilities in OPEN messages;
- Add strict mode to force other party to use this feature;
- Add basic support for a new transitive optional bgp attribute - OTC
(Only to Customer);
- Add logic for default setting OTC attribute and filtering routes with
this attribute by the edge speakers, if the appropriate conditions are
met;
- Add two test stands to check role negotiation and route filtering
during role usage.
Signed-off-by: Eugene Bogomazov <eb@qrator.net>
Allow BGP to control the TOS DSCP value in the tcp header
via a new command at the bgp global level `bgp session-dscp <0-63>`
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Pavel Shirhov <pavelsh@microsoft.com>
The `bgp no-rib` command cycles through all the bgp rib tables
and removes them from zebra. Modify the code so that FRR notices
that it is attempting to cycle through the safi's that are two level
tables. In addition these safi's cannot just blindly remove the routes
from the rib as that there are none explicitly.
This code just prevents the crash in bgpd. It does not properly cycle
through and remove the zebra changes made that are explicit to these afi's.
This should be handled as appropriate by the developers on these safi's when
it becomes important to them.
Fixes: #11178
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Peer groups when various forms of `bgp capability extended-nexthop`
is entered on them are toggling the nexthop tracking status of peers
in their peer group. This is ok when the peer is not interface based.
But it is not ok when the peer is interface based as that it will turn
off the ability of FRR to properly work with that peer type.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Firstly, *keep no change* for `hash_get()` with NULL
`alloc_func`.
Only focus on cases with non-NULL `alloc_func` of
`hash_get()`.
Since `hash_get()` with non-NULL `alloc_func` parameter
shall not fail, just ignore the returned value of it.
The returned value must not be NULL.
So in this case, remove the unnecessary checking NULL
or not for the returned value and add `void` in front
of it.
Importantly, also *keep no change* for the two cases with
non-NULL `alloc_func` -
1) Use `assert(<returned_data> == <searching_data>)` to
ensure it is a created node, not a found node.
Refer to `isis_vertex_queue_insert()` of isisd, there
are many examples of this case in isid.
2) Use `<returned_data> != <searching_data>` to judge it
is a found node, then free <searching_data>.
Refer to `aspath_intern()` of bgpd, there are many
examples of this case in bgpd.
Here, <returned_data> is the returned value from `hash_get()`,
and <searching_data> is the data, which is to be put into
hash table.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
Also, add N-Bit (Notification) flag for Graceful Restart.
This is a preparation for RFC8538.
More information: https://datatracker.ietf.org/doc/html/rfc8538
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
When the peer is configured for the first time:
```
neighbor P1 peer-group
neighbor P1 remote-as external
neighbor P1 advertise-map ADV exist-map EXIST
neighbor 10.10.10.1 peer-group P1
```
Conditional advertisements route-maps are not updated and cond. advertisements
do not work until FRR restarted. BGP sessions clear does not help.
Or even changing peer-group for a peer, causes this bug to kick in.
```
no neighbor 10.10.10.1
neighbor 10.10.10.1 peer-group P2
```
With this fix, cond. advertisements start working immediatelly.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Delay BGP configuration until we receive end-configuration hook to make sure
we don't send partial updates to peer which leads to broken Graceful-Restart.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
If you enter:
router bgp 325
bgp graceful-restart
bgp graceful-restart
!
The second command entered will fail. This is not
something that should be failing as that it's a no-op.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
confederations are checking to see that the bgp pointer
is non-null. But it's impossible to have a null pointer
in the cli and in all paths we have already deref'ed the bgp
pointer. Let's remove that error code as that it is impossible
to happen.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add a 15 minute warning to the logging system when
bgp policy is not setup properly. Operators keep asking
about the missing policy( on upgrade typically ). Let's
try to give them a bit more of a hint when something is
going wrong as that they are clearly missing the other
various places FRR tells them about it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When setting maximum-prefix-out on peer-group, the applied value on
member is 0.
Fix usage of maximum-prefix-out on peer-group.
The peer_maximum_prefix_out_(un)set functions are derived from
peer_maximum_prefix_(un)set.
Fixes: fde246e835 ("bgpd: Add an option to limit outgoing prefixes")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Opaque data takes up a lot of memory when there are a lot of routes on
the box. Given that this is just a cosmetic info, I propose to disable
it by default to not shock people who start using FRR for the first time
or upgrades from an old version.
Fixes#10101.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Abstract:
- The command "neighbor PEER maximum-prefix-out NUMBER" cannot be applied
without clearing the BGP neighbor.
- Apply the maximum-prefix-out value as soon as it is modified without
clearing the neighbor.
subgroup_update_packet() and subgroup_withdraw_packet() respectively
manages the announcement and withdrawal BGP message to the peer.
subgrp->scount counter counts the number of sent prefixes.
Before the patch, the maximum out prefix limitation was applied in
subgroup_update_packet() in order that subgrp->scount never exceeds the
limit. Setting a limit inferior to the effective number of sent prefix
did not result in sending any withdrawal message to reduce the number of
sent prefixes. Without clearing the BGP neighbor, the limitation only
applied to the announcement of new prefixes when the limitation was
over.
With the patch, the limitation is checked in subgroup_announce_check().
The function is intended to say whether a prefix has to be announced in
regards to the prefix-list, route-map... Now when a maximum-prefix-out
value is changed/removed, the neighbor AFI/SAFI table is re-parsed in
the same way as for the application of route-map, prefix-lists...
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Used for graceful-restart mostly.
Especially for bgp_show_neighbor_graceful_restart_capability_per_afi_safi()
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Extended BGP Administrative Shutdown Communication (rfc9003):
Basically, shutdown message size is increased to 255 from 128.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Update BFD sessions when the update-source configuration is set so the
session follows the new configured source address.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
When altering the TTL of a eBGP peer also update the BFD
configuration. This was only working when the configuration happened
after the peer connection had been established.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
When removing the command `no neighbor <X> ebgp-multihop <Y>`
the bgp code was always resetting the connection even if
the command would do nothing.
Fixes: #6464
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Any command that uses `peer_lookup_in_view` crashes when "vrf all" is
used, because bgp is NULL in this case.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
When a regular access-list is updated, we should update references to
regular access-lists, not as-path access-lists.
Fixes#9707.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
The idea is to disable addpath-rx capability to avoid unnecessary additional
routes installed.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
When FRR added the -Z parameter the bgp daemon was setting
a vrf identifier based upon a number starting at 1. This
caused issues when we upgraded the code to the outgoing
sockets to use vrf_bind always.
FRR should never just randomly select a vrf identifier.
Let's just use VRF_DEFAULT when we are in a -Z environment.
It's a safe bet.
Fixes: #9519
Signed-off-by: Donald Sharp <sharpd@nvidia.com>