Effectively a massive search and replace of
`struct thread` to `struct event`. Using the
term `thread` gives people the thought that
this event system is a pthread when it is not
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This is a first in a series of commits, whose goal is to rename
the thread system in FRR to an event system. There is a continual
problem where people are confusing `struct thread` with a true
pthread. In reality, our entire thread.c is an event system.
In this commit rename the thread.[ch] files to event.[ch].
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
It was hard to catch those unless using higher values than uint32_t, but
already hit, it's time to fix completely.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Since we increased peer->af_flags from uint32_t to uint64_t,
peer_af_flag_check() was historically returning integer, and not bool
as should be.
The bug was that if we have af_flags higher than uint32_t it will never
returned a right value.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
When we specify remote-as as external/internal, we need to set local_as to
bgp->as, instead of bgp->confed_id. Before this patch, (bgp->as != *as) is
always valid for such a case because *as is always 0.
Also, append peer->local_as as CONFED_SEQ to avoid other side withdrawing
the routes due to confederation own AS received and/or malformed as-path.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
When leaving the BGP shutdown state we must restart the peer timers
otherwise nothing will happen.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Implement: https://datatracker.ietf.org/doc/html/draft-abraitis-bgp-version-capability
Tested with GoBGP:
```
% ./gobgp neighbor 192.168.10.124
BGP neighbor is 192.168.10.124, remote AS 65001
BGP version 4, remote router ID 200.200.200.202
BGP state = ESTABLISHED, up for 00:01:49
BGP OutQ = 0, Flops = 0
Hold time is 3, keepalive interval is 1 seconds
Configured hold time is 90, keepalive interval is 30 seconds
Neighbor capabilities:
multiprotocol:
ipv4-unicast: advertised and received
ipv6-unicast: advertised
route-refresh: advertised and received
extended-nexthop: advertised
Local: nlri: ipv4-unicast, nexthop: ipv6
UnknownCapability(6): received
UnknownCapability(9): received
graceful-restart: advertised and received
Local: restart time 10 sec
ipv6-unicast
ipv4-unicast
Remote: restart time 120 sec, notification flag set
ipv4-unicast, forward flag set
4-octet-as: advertised and received
add-path: received
Remote:
ipv4-unicast: receive
enhanced-route-refresh: received
long-lived-graceful-restart: advertised and received
Local:
ipv6-unicast, restart time 10 sec
ipv4-unicast, restart time 20 sec
Remote:
ipv4-unicast, restart time 0 sec, forward flag set
fqdn: advertised and received
Local:
name: donatas-pc, domain:
Remote:
name: spine1-debian-11, domain:
software-version: advertised and received
Local:
GoBGP/3.10.0
Remote:
FRRouting/8.5-dev-MyOwnFRRVersion-gdc92f44a45-dirt
cisco-route-refresh: received
Message statistics:
```
FRR side:
```
root@spine1-debian-11:~# vtysh -c 'show bgp neighbor 192.168.10.17 json' | \
> jq '."192.168.10.17".neighborCapabilities.softwareVersion.receivedSoftwareVersion'
"GoBGP/3.10.0"
root@spine1-debian-11:~#
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
The route-distinguisher string can be expressed in different
ways when the AS number is part of the RD. And the configured
string value has to be kept intact.
The following vty commands store the string value internally:
- router bgp / address-family ipv4 unicast / rd vpn export <>
- router bgp / address-family l2vpn evpn / rd <>
- router bgp / address-family l2vpn evpn / vni <> / rd <>
The vty commands where RD is configured in the below places is
not considered:
- router bgp / rfapi related commands
- router bgp / address-family xxx xxx / network .. rd <>
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The confederation peers as and the confederation identifier as
are stored as a string to preserve the output in the running
configuration.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This identifier is used to display the peer configuration in
the running-config, like it has been configured.
The following commands are using a specific string attribute:
- neighbor .. remote-as ASN
- neighbor .. local-as ASN
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
A json AS number API is created in order to output a
given AS number. In order to keep backward compatibility,
if the as-notation uses a number, then the json is encoded
as an integer, otherwise the encoding will be a string.
For what is not relevant to running-configuration, the
as-notation mode is the one used for the BGP instance.
Also, the vty completion gets the configured 'as_pretty'
string value, when an user wants to get the available
BGP instances.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Each BGP prefix may have an as-path list attached. A forged
string is stored in the BGP attribute and shows the as-path
list output.
Before this commit, the as-path list output was expressed as
a list of AS values in plain format. Now, if a given BGP instance
uses a specific asnotation, then the output is changed:
new output:
router bgp 1.1 asnotation dot
!
address-family ipv4 unicast
network 10.200.0.0/24 route-map rmap
network 10.201.0.0/24 route-map rmap
redistribute connected route-map rmap
exit-address-family
exit
!
route-map rmap permit 1
set as-path prepend 1.1 5433.55 264564564
exit
ubuntu2004# do show bgp ipv4
BGP table version is 2, local router ID is 10.0.2.15, vrf id 0
Default local pref 100, local AS 1.1
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*> 4.4.4.4/32 0.0.0.0 0 32768 1.1 5433.55 4036.61268 ?
*> 10.0.2.0/24 0.0.0.0 0 32768 1.1 5433.55 4036.61268 ?
10.200.0.0/24 0.0.0.0 0 32768 1.1 5433.55 4036.61268 i
10.201.0.0/24 0.0.0.0 0 32768 1.1 5433.55 4036.61268 i
The changes include:
- the aspath structure has a new field: asnotation type
The ashash list will differentiate 2 aspaths using a different
asnotation.
- 3 new printf extensions display the as number in the wished
format: pASP, pASD, pASE for plain, dot, or dot+ format (extended).
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
A new keyword permits changing the BGP as-notation output:
- [no] router bgp <> [vrf BLABLA] [as-notation [<dot|plain|dot+>]]
At the BGP instance creation, the output will inherit the way the
BGP instance is declared. For instance, the 'router bgp 1.1'
command will configure the output in the dot format. However, if
the client wants to choose an alternate output, he will have to
add the extra command: 'router bgp 1.1 as-notation dot+'.
Also, if the user wants to have plain format, even if the BGP
instance is declared in dot format, the keyword can also be used
for that.
The as-notation output is only taken into account at the BGP
instance creation. In the case where VPN instances are used,
a separate instance may be dynamically created. In that case,
the real as-notation format will be taken into acccount at the
first configuration.
Linking the as-notation format with the BGP instance makes sense,
as the operators want to keep consistency of what they configure.
One technical reason why to link the as-notation output with the
BGP instance creation is that the as-path segment lists stored
in the BGP updates use a string representation to handle aspath
operations (by using regexp for instance). Changing on the fly
the output needs to regenerate this string representation to the
correct format. Linking the configuration to the BGP instance
creation avoids refreshing the BGP updates. A similar mechanism
is put in place in junos too.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
AS number can be defined as an unsigned long number, or
two uint16 values separated by a period (.). The possible
valus are:
- usual 32 bit values : [1;2^32 -1]
- <1.65535>.<0.65535> for dot notation
- <0.65535>.<0.65535> for dot+ notation.
The 0.0 value is forbidden when configuring BGP instances
or peer configurations.
A new ASN type is added for parsing in the vty.
The following commands use that new identifier:
- router bgp ..
- bgp confederation ..
- neighbor <> remote-as <>
- neighbor <> local-as <>
- clear ip bgp <>
- route-map / set as-path <>
An asn library is available in lib/ and provides some
services:
- convert an as string into an as number.
- parse an as path list string and extract a number.
- convert an as number into a string.
Also, the bgp tests forge an as_zero_path, and to do that,
an API to relax the possibility to have a 0 as value is
specifically called from the tests.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This is a preliminary work to handle various ways to configure
a BGP Autonomous System. When creating a BGP instance, the
user may want to define the AS number as a dotted value,
instead of using an integer value.
To handle both cases, an as_pretty char attribute will store
the as number as it has been given to the vtysh command:
router bgp <as number>
Whenever the as integer of the BGP instance was dumped,
the as_pretty original format is used.
The json output reuses the integer value to keep backward
compatibility with old displays.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Consider this scenario:
Lots of peers with a bunch of route information that is changing
fast. One of the peers happens to be really slow for whatever
reason. The way the output queue is filled is that bgpd puts
64 packets at a time and then reschedules itself to send more
in the future. Now suppose that peer has hit it's input Queue
limit and is slow. As such bgp will continue to add data to
the output Queue, irrelevant if the other side is receiving
this data.
Let's limit the Output Queue to the same limit as the Input
Queue. This should prevent bgp eating up large amounts of
memory as stream data when under severe network trauma.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The idea is to drop unwanted attributes from the BGP UPDATE messages and
continue by just ignoring them. This improves the security, flexiblity, etc.
This is the command that Cisco has also.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
When actually creating a peer in BGP, tell the creation if
it is a config node or not. There were cases where the
CONFIG_NODE was being set *after* being placed into
the bgp->peerhash, thus causing collisions between the
doppelganger and the peer and eventually use after free's.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Currently bgp does not stop any events that are on the thread
system for execution on peer deletion. This is not good.
Stop those events and prevent use after free's.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When the decision has been made to copy a peer configuration from
a peer to another peer because one is taking over. Do not automatically
set the CONFIG_NODE flag. Instead we need to handle that appropriately
when the final decision is made to transfer.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When changing the peers sockunion structure the bgp->peer
list was not being updated properly. Since the peer's su
is being used for a sorted insert then the change of it requires
that the value be pulled out of the bgp->peer list and then
put back into as well.
Additionally ensure that the hash is always released on peer
deletion.
Lead to this from this decode in a address sanitizer run.
=================================================================
==30778==ERROR: AddressSanitizer: heap-use-after-free on address 0x62a0000d8440 at pc 0x7f48c9c5c547 bp 0x7ffcba272cb0 sp 0x7ffcba272ca8
READ of size 2 at 0x62a0000d8440 thread T0
#0 0x7f48c9c5c546 in sockunion_same lib/sockunion.c:425
#1 0x55cfefe3000f in peer_hash_same bgpd/bgpd.c:890
#2 0x7f48c9bde039 in hash_release lib/hash.c:209
#3 0x55cfefe3373f in bgp_peer_conf_if_to_su_update bgpd/bgpd.c:1541
#4 0x55cfefd0be7a in bgp_stop bgpd/bgp_fsm.c:1631
#5 0x55cfefe4028f in peer_delete bgpd/bgpd.c:2362
#6 0x55cfefdd5e97 in no_neighbor_interface_config bgpd/bgp_vty.c:4267
#7 0x7f48c9b9d160 in cmd_execute_command_real lib/command.c:949
#8 0x7f48c9ba1112 in cmd_execute_command lib/command.c:1009
#9 0x7f48c9ba1573 in cmd_execute lib/command.c:1162
#10 0x7f48c9c87402 in vty_command lib/vty.c:526
#11 0x7f48c9c87832 in vty_execute lib/vty.c:1291
#12 0x7f48c9c8e741 in vtysh_read lib/vty.c:2130
#13 0x7f48c9c7a66d in thread_call lib/thread.c:1585
#14 0x7f48c9bf64e7 in frr_run lib/libfrr.c:1123
#15 0x55cfefc75a15 in main bgpd/bgp_main.c:540
#16 0x7f48c96b009a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
#17 0x55cfefc787f9 in _start (/usr/lib/frr/bgpd+0xe27f9)
0x62a0000d8440 is located 576 bytes inside of 23376-byte region [0x62a0000d8200,0x62a0000ddd50)
freed by thread T0 here:
#0 0x7f48c9eb9fb0 in __interceptor_free (/lib/x86_64-linux-gnu/libasan.so.5+0xe8fb0)
#1 0x55cfefe3fe42 in peer_free bgpd/bgpd.c:1113
#2 0x55cfefe3fe42 in peer_unlock_with_caller bgpd/bgpd.c:1144
#3 0x55cfefe4092e in peer_delete bgpd/bgpd.c:2457
#4 0x55cfefdd5e97 in no_neighbor_interface_config bgpd/bgp_vty.c:4267
#5 0x7f48c9b9d160 in cmd_execute_command_real lib/command.c:949
#6 0x7f48c9ba1112 in cmd_execute_command lib/command.c:1009
#7 0x7f48c9ba1573 in cmd_execute lib/command.c:1162
#8 0x7f48c9c87402 in vty_command lib/vty.c:526
#9 0x7f48c9c87832 in vty_execute lib/vty.c:1291
#10 0x7f48c9c8e741 in vtysh_read lib/vty.c:2130
#11 0x7f48c9c7a66d in thread_call lib/thread.c:1585
#12 0x7f48c9bf64e7 in frr_run lib/libfrr.c:1123
#13 0x55cfefc75a15 in main bgpd/bgp_main.c:540
#14 0x7f48c96b009a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When a peer is a peer-group based peer, and the config is inherited
from the peer group, let's ensure that the CONFIG_NODE flag stays
no matter what.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
currently the following configuration
dut:
!
interface ntfp2
ip router isis 1
!
router bgp 200
no bgp ebgp-requires-policy
bgp confederation identifier 300
bgp confederation peers 300
neighbor 192.168.1.1 remote-as 100
neighbor 192.168.2.2 remote-as 300
!
address-family ipv4 unicast
neighbor 192.168.2.2 default-originate
exit-address-family
!
router isis 1
is-type level-2-only
net 49.0001.0002.0002.0002.00
redistribute ipv4 connected level-2
!
end
router:
!
interface ntfp2
ip router isis 1
isis circuit-type level-2-only
!
router bgp 300
no bgp ebgp-requires-policy
bgp confederation identifier 300
bgp confederation peers 200
neighbor 192.168.2.1 remote-as 200
neighbor 192.168.3.2 remote-as 400
!
address-family ipv4 unicast
network 3.3.3.0/24
exit-address-family
!
router isis 1
is-type level-2-only
net 49.0001.0003.0003.0003.00
redistribute ipv4 connected level-2
!
end
on dut result of show bgp ipv4 unicast command is:
show bgp ipv4 unicast
BGP table version is 1, local router ID is 192.168.2.1, vrf id 0
Default local pref 100, local AS 200
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*> 1.1.1.0/24 192.168.1.1 0 0 100 i
instead of
sho bgp ipv4 unicast
BGP table version is 3, local router ID is 192.168.2.1, vrf id 0
Default local pref 100, local AS 200
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*> 1.1.1.0/24 192.168.1.1 0 0 100 i
*> 3.3.3.0/24 192.168.2.2 0 100 0 (300) i
*> 4.4.4.0/24 192.168.3.2 0 100 0 (300) 400 i
Displayed 3 routes and 3 total paths
According to RFC 5065:the usage of one of the member AS number as the
confederation identifier is not forbidden.
fixes are the following
in bgp_route.c:
in bgp_update remove the test for presence of confederation id in
as_path since, this case is allowed;
in bgp_vty.c
bgp_confederation_peers, remove the test on peer as value
in bgpd.c
bgp_confederation_peers_add
remove the test on peer as value
invert the order of setting peer->sort value and peer->local_as,
since peer->sort is depending from current peer->local_as value
bgp_confederation_peers_remove
invert the order of setting peer->sort value and peer->local_as,
since peer->sort is depending from current peer->local_as value
Signed-off-by: Francois Dumontet <francois.dumontet@6wind.com>
We already have a global knob for graceful-shutdown, but it's handy having
per neighbor knob as well.
Especially when a single neighbor needs to be restarted/shutdown gracefuly.
We can do this route-maps, but this is a faster/cleaner way doing the same
for an operator.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
RPKI revalidation is an possibly expensive operation. Break up
revalidation on a prefix basis by the `struct bgp` pointer.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
An end operator is showing cases with multiple bgp feeds
and a rpki table that calling the revalidation functions
is extremely expensive and they are seeing lots of thread
WARNS about timers being late and eventually the whole
thing gets unresponsive. Let's break up soft reconfiguration
in to a series of events per peer so that all the work
for this is not done at the same exact time.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Not all places were checking to see if soft reconfiguration
was turned on before calling into it to do all that work.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>