```
donatas-pc(config-router)# timers bgp 8 12
% keeplive value 8 is larger than 1/3 of the holdtime, setting to 4
donatas-pc(config-router)# do sh run | include timers bgp
timers bgp 4 12
donatas-pc(config-router)#
```
Closes https://github.com/FRRouting/frr/issues/14287
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
This is based on @donaldsharp's work
The current code base is the struct bgp_node data structure.
The problem with this is that it creates a bunch of
extra data per route_node.
The table structure generates ‘holder’ nodes
that are never going to receive bgp routes,
and now the memory of those nodes is allocated
as if they are a full bgp_node.
After splitting up the bgp_node into bgp_dest and route_node,
the memory of ‘holder’ node which does not have any bgp data
will be allocated as the route_node, not the bgp_node,
and the memory usage is reduced.
The memory usage of BGP node will be reduced from 200B to 96B.
The total memory usage optimization of this part is ~16.00%.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Yuqing Zhao <xiaopanghu99@163.com>
As part of the conversion to a `struct peer_connection` it will
be desirable to have 2 pointers one for when we open a connection
and one for when we receive a connection. Start this actual
conversion over to this in `struct peer`. If this sounds confusing
take a look at the bgp state machine for connections and how
it resolves the processing of this router opening -vs- this
router receiving an open. At some point in time the state
machine decides that we are keeping one of the two connections.
Future commits will allow us to untangle the peer/doppelganger
duality with this abstraction.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The status and ostatus are a function of the `struct peer_connection`
move it into that data structure.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Move PEER_THREAD_WRITES_ON and PEER_THREAD_READS_ON to
be a part of the `struct peer_connection` since this is
a connection oriented bit of data.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Move the peer->t_write and peer->t_read into `struct peer_connection`
as that these are properties of the connection.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
P# Please enter the commit message for your changes. Lines starting
BGP tracks connections based upon the peer. But the problem
with this is that the doppelganger structure for it is being
created. This has introduced a bunch of fragileness in that
the peer exists independently of the connections to it.
The whole point of the doppelganger structure was to allow
BGP to both accept and initiate tcp connections and then
when we get one to a `good` state we collapse into the
appropriate one. The problem with this is that having
2 peer structures for this creates a situation where
we have to make sure we are configing the `right` one
and also make sure that we collapse the two independent
peer structures into 1 acting peer. This makes no sense
let's abstract out the peer into having 2 connection
one for incoming connections and one for outgoing connections
then we can easily collapse down without having to do crazy
stuff. In addition people adding new features don't need
to have to go touch a million places in the code.
This is the start of this abstraction. In this commit
we'll just pull out the fd and input/output buffers
into a connection data structure. Future commits
will abstract further.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The last_reset_cause is a plain old BGP_MAX_PACKET_SIZE buffer
that is really enlarging the peer data structure. Let's just
copy the stream that failed and only allocate how ever much
the packet size actually was. While it's likely that we have
a reset reason, the packet typically is not going to be 65k
in size. Let's save space.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The peer->ibuf_scratch was allocating 65535 * 10 bytes
for scratch space to hold data incoming from a read
from a peer. When you have 4k peers this is 262,1400,000
or 262 mb of data. Which is crazy large. Especially
since the i/o pthread is reading per peer without
any chance of having the data interfere with other reads.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Useful to have it for datacenter profile only, disabled for traditional.
If the peer is not established or established, but has no description set,
we will show the FRR version instead, which is kinda handy to have instead of
nothing.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
More details: https://www.rfc-editor.org/rfc/rfc8810.html
Not sure if we want to maintain the old code more.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Both the label manager and table manager zapi code send data requests via zapi
to zebra and then immediately listen for a response from zebra. The problem here
is of course that the listen part is throwing away any zapi command that is not
the one it is looking for.
ISIS/OSPF and PIM all have synchronous abilities via zapi, which they all
do through a special zapi connection to zebra. BGP needs to follow this model
as well. Additionally the new zclient_sync connection that should be created,
a once a second timer should wake up and read any data on the socket to
prevent problems too much data accumulating in the socket.
```
r3# sh bgp labelpool summary
Labelpool Summary
-----------------
Ledger: 3
InUse: 3
Requests: 0
LabelChunks: 1
Pending: 128
Reconnects: 1
r3# sh bgp labelpool inuse
Prefix Label
---------------------------
10.0.0.1/32 16
192.168.31.0/24 17
192.168.32.0/24 18
r3#
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
In the context of the ASBR facing an EBGP neighbor, or
facing an IBGP neighbor where the BGP updates received
are re-advertised with a modified next-hop, a new local
label will be re-advertised too, to replace the original
one.
Create a binding table, in the form of a hash list, from the
original labels to the new labels. Since labels can be the
same on several routers, set the next-hop and the label as
the keys. Add the needed API functions to manage the hash
list.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When acting as intermediate device for BGP signaling, and
as transit device for data traffic, the device is not able
to modify the label value from incoming MPLS VPN updates:
- as BGP device, modifying the label value is necessary
when redistributing VPN prefixes with its own next-hop.
- as transit device that connects two ethernet segments
on separate interfaces, the return MPLS traffic must be
handled: the modified label value must be swapped with
the original label value and sent back to the original
next-hop.
The border router use case can be taken as example, when
it acts both as transit and as BGP device:
- When receiving updates from a border router peer, and where
interior traffic is expected to transit through the local
border router.
- When receiving updates from interior devices, and where
exterior traffic will transit through the local border router.
In those two situations, a new label is bound to the received
entry, and the entry is advertised to a new peer with the new
label. In the same time, an MPLS entry is created to handle
return traffic with the new mpls label: the traffic would be
swapped to the original MPLS label and the original next-hop.
This is the first commit of a series of patches, that address
the above mentioned issue.
The first commit introduces a new per-interface command:
> interface eth0
> [no] mpls bgp l3vpn-multi-domain-switching
> exit
This command will authorise mpls vpn updates to have a new
label value bound to the mpls vpn routes received over that
interface.
Link: https://www.rfc-editor.org/rfc/rfc3107.html#section-3
> When a BGP speaker redistributes a route, the label(s) assigned to
> that route must not be changed (except by omission), unless the
> speaker changes the value of the Next Hop attribute of the route.
Link: https://www.rfc-editor.org/rfc/rfc3031.html#section-4.6
Link: https://www.rfc-editor.org/rfc/rfc4364.html#section-10
sub-chapter b.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When using `addpath-tx-all` BGP announces all known paths instead of announcing
only an arbitrary number of best paths.
With this new command we can send N best paths to the neighbor. That means, we
send the best path, then send the second best path excluding the previous one,
and so on. In other words, we run best path selection algorithm N times before
we finish.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Currently we have a handler function that will walk the global EVPN
rib and unimport/remove routes matching a local IP/TIP. This generalizes
this function so that it can be re-used for other BGP Martian entry
types. Now this can be used to unimport routes when the MAC-VRF SoO is
reconfigured.
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
This commit introduces the necessary structs and apis to
create the cache entries that store the label information
associated to a given nexthop.
A hash table is created in each BGP instance for all the
AFIs: IPv4 and IPv6. That hash table is initialised.
An API to look and/or create an entry based on a given
nexthop.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
A new VTY command is introduced in ipv4 unicast and ipv6 unicast
address family, under a BGP instance.
> r1# label vpn export allocation-mode per-nexthop|per-vrf
This command will update the label values associated for each
BGP update to export to the global instance. Two modes are
available: per-nexthop and per-vrf. The latter is the default
one.
With this commit only, configuring label allocation per nexthop
will only reset the BGP updates, and the per-vrf mode label
allocation will be chosen.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Until now, the bgp local paths were using the default null label
defined. It was not possible to select the null label for the ipv4
or the ipv6 address families.
This commit addresses this issues by adding two extra-parameters
to the BGP labeled-unicast command.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
In BGP labeled unicast address-family, it is not possible to
send explicit-null label values with redistributed or network
declared prefixes.
A new CLI command is introduced:
> [no] bgp labeled-unicast explicit-null
When used, the explicit-null value for IPv4 ('0' value) or
IPv6 ('2' value) will be used.
It is necessary to reconfigure the networks or the
redistribution in order to inherit this new behaviour.
Add the documentation.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Effectively a massive search and replace of
`struct thread` to `struct event`. Using the
term `thread` gives people the thought that
this event system is a pthread when it is not
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This commit introduces the necessary structs and apis to
create the cache entries that store the label information
associated to a given nexthop.
A hash table is created in each BGP instance for all the
AFIs: IPv4 and IPv6. That hash table is initialised.
An API to look and/or create an entry based on a given
nexthop.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
A new VTY command is introduced in ipv4 unicast and ipv6 unicast
address family, under a BGP instance.
> r1# label vpn export allocation-mode per-nexthop|per-vrf
This command will update the label values associated for each
BGP update to export to the global instance. Two modes are
available: per-nexthop and per-vrf. The latter is the default
one.
With this commit only, configuring label allocation per nexthop
will only reset the BGP updates, and the per-vrf mode label
allocation will be chosen.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Since we increased peer->af_flags from uint32_t to uint64_t,
peer_af_flag_check() was historically returning integer, and not bool
as should be.
The bug was that if we have af_flags higher than uint32_t it will never
returned a right value.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
peer->af_flags got this correctly.
peer->flags were already converted a time ago, but these were missed...
Let's fix this.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Implement: https://datatracker.ietf.org/doc/html/draft-abraitis-bgp-version-capability
Tested with GoBGP:
```
% ./gobgp neighbor 192.168.10.124
BGP neighbor is 192.168.10.124, remote AS 65001
BGP version 4, remote router ID 200.200.200.202
BGP state = ESTABLISHED, up for 00:01:49
BGP OutQ = 0, Flops = 0
Hold time is 3, keepalive interval is 1 seconds
Configured hold time is 90, keepalive interval is 30 seconds
Neighbor capabilities:
multiprotocol:
ipv4-unicast: advertised and received
ipv6-unicast: advertised
route-refresh: advertised and received
extended-nexthop: advertised
Local: nlri: ipv4-unicast, nexthop: ipv6
UnknownCapability(6): received
UnknownCapability(9): received
graceful-restart: advertised and received
Local: restart time 10 sec
ipv6-unicast
ipv4-unicast
Remote: restart time 120 sec, notification flag set
ipv4-unicast, forward flag set
4-octet-as: advertised and received
add-path: received
Remote:
ipv4-unicast: receive
enhanced-route-refresh: received
long-lived-graceful-restart: advertised and received
Local:
ipv6-unicast, restart time 10 sec
ipv4-unicast, restart time 20 sec
Remote:
ipv4-unicast, restart time 0 sec, forward flag set
fqdn: advertised and received
Local:
name: donatas-pc, domain:
Remote:
name: spine1-debian-11, domain:
software-version: advertised and received
Local:
GoBGP/3.10.0
Remote:
FRRouting/8.5-dev-MyOwnFRRVersion-gdc92f44a45-dirt
cisco-route-refresh: received
Message statistics:
```
FRR side:
```
root@spine1-debian-11:~# vtysh -c 'show bgp neighbor 192.168.10.17 json' | \
> jq '."192.168.10.17".neighborCapabilities.softwareVersion.receivedSoftwareVersion'
"GoBGP/3.10.0"
root@spine1-debian-11:~#
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
The route-distinguisher string can be expressed in different
ways when the AS number is part of the RD. And the configured
string value has to be kept intact.
The following vty commands store the string value internally:
- router bgp / address-family ipv4 unicast / rd vpn export <>
- router bgp / address-family l2vpn evpn / rd <>
- router bgp / address-family l2vpn evpn / vni <> / rd <>
The vty commands where RD is configured in the below places is
not considered:
- router bgp / rfapi related commands
- router bgp / address-family xxx xxx / network .. rd <>
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The confederation peers as and the confederation identifier as
are stored as a string to preserve the output in the running
configuration.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This identifier is used to display the peer configuration in
the running-config, like it has been configured.
The following commands are using a specific string attribute:
- neighbor .. remote-as ASN
- neighbor .. local-as ASN
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Each BGP prefix may have an as-path list attached. A forged
string is stored in the BGP attribute and shows the as-path
list output.
Before this commit, the as-path list output was expressed as
a list of AS values in plain format. Now, if a given BGP instance
uses a specific asnotation, then the output is changed:
new output:
router bgp 1.1 asnotation dot
!
address-family ipv4 unicast
network 10.200.0.0/24 route-map rmap
network 10.201.0.0/24 route-map rmap
redistribute connected route-map rmap
exit-address-family
exit
!
route-map rmap permit 1
set as-path prepend 1.1 5433.55 264564564
exit
ubuntu2004# do show bgp ipv4
BGP table version is 2, local router ID is 10.0.2.15, vrf id 0
Default local pref 100, local AS 1.1
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*> 4.4.4.4/32 0.0.0.0 0 32768 1.1 5433.55 4036.61268 ?
*> 10.0.2.0/24 0.0.0.0 0 32768 1.1 5433.55 4036.61268 ?
10.200.0.0/24 0.0.0.0 0 32768 1.1 5433.55 4036.61268 i
10.201.0.0/24 0.0.0.0 0 32768 1.1 5433.55 4036.61268 i
The changes include:
- the aspath structure has a new field: asnotation type
The ashash list will differentiate 2 aspaths using a different
asnotation.
- 3 new printf extensions display the as number in the wished
format: pASP, pASD, pASE for plain, dot, or dot+ format (extended).
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
A new keyword permits changing the BGP as-notation output:
- [no] router bgp <> [vrf BLABLA] [as-notation [<dot|plain|dot+>]]
At the BGP instance creation, the output will inherit the way the
BGP instance is declared. For instance, the 'router bgp 1.1'
command will configure the output in the dot format. However, if
the client wants to choose an alternate output, he will have to
add the extra command: 'router bgp 1.1 as-notation dot+'.
Also, if the user wants to have plain format, even if the BGP
instance is declared in dot format, the keyword can also be used
for that.
The as-notation output is only taken into account at the BGP
instance creation. In the case where VPN instances are used,
a separate instance may be dynamically created. In that case,
the real as-notation format will be taken into acccount at the
first configuration.
Linking the as-notation format with the BGP instance makes sense,
as the operators want to keep consistency of what they configure.
One technical reason why to link the as-notation output with the
BGP instance creation is that the as-path segment lists stored
in the BGP updates use a string representation to handle aspath
operations (by using regexp for instance). Changing on the fly
the output needs to regenerate this string representation to the
correct format. Linking the configuration to the BGP instance
creation avoids refreshing the BGP updates. A similar mechanism
is put in place in junos too.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
AS number can be defined as an unsigned long number, or
two uint16 values separated by a period (.). The possible
valus are:
- usual 32 bit values : [1;2^32 -1]
- <1.65535>.<0.65535> for dot notation
- <0.65535>.<0.65535> for dot+ notation.
The 0.0 value is forbidden when configuring BGP instances
or peer configurations.
A new ASN type is added for parsing in the vty.
The following commands use that new identifier:
- router bgp ..
- bgp confederation ..
- neighbor <> remote-as <>
- neighbor <> local-as <>
- clear ip bgp <>
- route-map / set as-path <>
An asn library is available in lib/ and provides some
services:
- convert an as string into an as number.
- parse an as path list string and extract a number.
- convert an as number into a string.
Also, the bgp tests forge an as_zero_path, and to do that,
an API to relax the possibility to have a 0 as value is
specifically called from the tests.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This is a preliminary work to handle various ways to configure
a BGP Autonomous System. When creating a BGP instance, the
user may want to define the AS number as a dotted value,
instead of using an integer value.
To handle both cases, an as_pretty char attribute will store
the as number as it has been given to the vtysh command:
router bgp <as number>
Whenever the as integer of the BGP instance was dumped,
the as_pretty original format is used.
The json output reuses the integer value to keep backward
compatibility with old displays.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Consider this scenario:
Lots of peers with a bunch of route information that is changing
fast. One of the peers happens to be really slow for whatever
reason. The way the output queue is filled is that bgpd puts
64 packets at a time and then reschedules itself to send more
in the future. Now suppose that peer has hit it's input Queue
limit and is slow. As such bgp will continue to add data to
the output Queue, irrelevant if the other side is receiving
this data.
Let's limit the Output Queue to the same limit as the Input
Queue. This should prevent bgp eating up large amounts of
memory as stream data when under severe network trauma.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The idea is to drop unwanted attributes from the BGP UPDATE messages and
continue by just ignoring them. This improves the security, flexiblity, etc.
This is the command that Cisco has also.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
When actually creating a peer in BGP, tell the creation if
it is a config node or not. There were cases where the
CONFIG_NODE was being set *after* being placed into
the bgp->peerhash, thus causing collisions between the
doppelganger and the peer and eventually use after free's.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
We already have a global knob for graceful-shutdown, but it's handy having
per neighbor knob as well.
Especially when a single neighbor needs to be restarted/shutdown gracefuly.
We can do this route-maps, but this is a faster/cleaner way doing the same
for an operator.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
RPKI revalidation is an possibly expensive operation. Break up
revalidation on a prefix basis by the `struct bgp` pointer.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
An end operator is showing cases with multiple bgp feeds
and a rpki table that calling the revalidation functions
is extremely expensive and they are seeing lots of thread
WARNS about timers being late and eventually the whole
thing gets unresponsive. Let's break up soft reconfiguration
in to a series of events per peer so that all the work
for this is not done at the same exact time.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Simulated latency with:
```
tc qdisc add dev eth3 root netem delay 100ms
```
```
donatas-laptop# sh ip bgp summary failed
IPv4 Unicast Summary (VRF default):
BGP router identifier 192.0.2.252, local AS number 65000 vrf-id 0
BGP table version 28
RIB entries 0, using 0 bytes of memory
Peers 1, using 724 KiB of memory
Neighbor EstdCnt DropCnt ResetTime Reason
192.168.10.65 2 2 00:00:17 Admin. shutdown (RTT)
Displayed neighbors 1
Total number of neighbors 1
donatas-laptop#
```
Another end received:
```
%NOTIFICATION: received from neighbor 192.168.10.17 6/2 (Cease/Administrative Shutdown) "shutdown due to high round-trip-time (104ms > 5ms, hit 21 times)"
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Add a default limit to the InQ for messages off the bgp peer
socket. Make the limit configurable via cli.
Adding in this limit causes the messages to be retained in the tcp
socket and allow for tcp back pressure and congestion control to kick
in.
Before this change, we allow the InQ to grow indefinitely just taking
messages off the socket and adding them to the fifo queue, never letting
the kernel know we need to slow down. We were seeing under high loads of
messages and large perf-heavy routemaps (regex matching) this queue
would cause a memory spike and BGP would get OOM killed. Modifying this
leaves the messages in the socket and distributes that load where it
should be in the socket buffers on both send/recv while we handle the
mesages.
Also, changes were made to allow the ringbuffer to hold messages and
continue to be filled by the IO pthread while we wait for the Main
pthread to handle the work on the InQ.
Memory spike seen with large numbers of routes flapping and route-maps
with dozens of regex matching:
```
Memory statistics for bgpd:
System allocator statistics:
Total heap allocated: > 2GB
Holding block headers: 516 KiB
Used small blocks: 0 bytes
Used ordinary blocks: 160 MiB
Free small blocks: 3680 bytes
Free ordinary blocks: > 2GB
Ordinary blocks: 121244
Small blocks: 83
Holding blocks: 1
```
With most of it being held by the inQ (seen from the stream datastructure info here):
```
Type : Current# Size Total Max# MaxBytes
...
...
Stream : 115543 variable 26963208 15970740 3571708768
```
With this change that memory is capped and load is left in the sockets:
RECV Side:
```
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
ESTAB 265350 0 [fe80::4080:30ff:feb0:cee3]%veth1:36950 [fe80::4c14:9cff:fe1d:5bfd]:179 users:(("bgpd",pid=1393334,fd=26))
skmem:(r403688,rb425984,t0,tb425984,f1816,w0,o0,bl0,d61)
```
SEND Side:
```
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
ESTAB 0 1275012 [fe80::4c14:9cff:fe1d:5bfd]%veth1:179 [fe80::4080:30ff:feb0:cee3]:36950 users:(("bgpd",pid=1393443,fd=27))
skmem:(r0,rb131072,t0,tb1453568,f1916,w1300612,o0,bl0,d0)
```
Signed-off-by: Stephen Worley <sworley@nvidia.com>
In the current implementation of bgpd, SRv6 SIDs can be configured only
under the address-family. This enables bgpd to leak IPv6 routes using
an SRv6 End.DT6 behavior and IPv4 routes using an SRv6 End.DT4
behavior. It is not possible to leak both IPv6 and IPv4 routes using a
single SRv6 SID.
This commit adds a new CLI command
"sid vpn per-vrf export <sid_idx|auto>" that enables bgpd to leak both
IPv6 and IPv4 routes using a single SRv6 SID (End.DT46 behavior).
Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
In order to send correct SRv6 L3VPN advertisement, we need to save
srv6_locator_chunk in vpn_policy. With this information, we can
construct correct SRv6 L3VPN advertisement packets.
Signed-off-by: Ryoga Saito <ryoga.saito@linecorp.com>
As an example, Arista EOS allows this behavior.
Configuration something like:
```
neighbor PG peer-group
neighbor PG remote-as 65001
neighbor PG local-as 65001
neighbor 192.168.10.124 peer-group PG
```
Or without peer-group.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
See the BGP message sequence:
R1 R2
| updates |
|------------------>|
| |
| refresh request |
x<------------------|
| |
| updates cont. |
|------------------>|
| |
| end-of-rib |
|------------------>|
| |
When R1 and R2 establish BGP session, R1 begins to send initial updates.
If R2 sends a route-refresh request before EoR, it's silently ignored
by R1, and routes received earlier have no chance to be processed again.
RFC7313 says, "for a BGP speaker that supports the BGP Graceful Restart,
it MUST NOT send a BoRR for an <AFI, SAFI> to a neighbor before it sends
the EoR for the <AFI, SAFI> to the neighbor." But it doesn't forbid
route-refresh request to be sent before receiving EoR.
To handle this scenario, postpone response to refresh request until EoR
is sent.
Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
RFC4364 describes peerings between multiple AS domains, to ease
the continuity of VPN services across multiple SPs. This commit
implements a sub-set of IETF option b) described in chapter 10 b.
The ASBR to ASBR approach is taken, with an EBGP peering between
the two routers. The EBGP peering must be directly connected to
the outgoing interface used. In those conditions, the next hop
is directly connected, and there is no need to have a transport
label to convey the VPN label. A new vty command is added on a
per interface basis:
This command if enabled, will permit to convey BGP VPN labels
without any transport labels (i.e. with implicit-null label).
restriction:
this command is used only for EBGP directly connected peerings.
Other use cases are not covered.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
TCP keepalive is enabled once BGP connection is established.
New vty commands:
bgp tcp-keepalive <1-65535> <1-65535> <1-30>
no bgp tcp-keepalive
Signed-off-by: Xiaofeng Liu <xiaofeng.liu@6wind.com>
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Let's convert to our actual library call instead
of using yet another abstraction that makes it fun
for people to switch daemons.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Implement forcing L3 auto derivation via configs even when
manually RTs are set. This will allow both to coexist in
BGP RTs. Without using auto config command, it will remove
auto derived RTs when you manually configure your own. To allow
both, use the auto command ond import/export/both.
Implement '*' wildcard import L3 RTs so we can import a route into any AS.
This is necessary to avoid a user from having to configure an L3 RT for
every AS they care to import evpn route from.
Signed-off-by: Stephen Worley <sworley@nvidia.com>