A new keyword permits changing the BGP as-notation output:
- [no] router bgp <> [vrf BLABLA] [as-notation [<dot|plain|dot+>]]
When a BGP instance is created, the output format follows the way the
BGP instance is declared. For instance, the 'router bgp 1.1'
command configures the output in the dot format. To choose a
different output format, the user has to add the keyword
explicitly: 'router bgp 1.1 as-notation dot+'.
The keyword can likewise be used to get the plain format even if
the BGP instance is declared in dot format.
The as-notation output is only taken into account at BGP
instance creation. When VPN instances are used, a separate
instance may be created dynamically. In that case, the real
as-notation format is taken into account at the first
configuration.
Linking the as-notation format to the BGP instance makes sense,
as operators want consistency with what they configure.
One technical reason for tying the as-notation output to
BGP instance creation is that the as-path segment lists stored
in the BGP updates use a string representation to handle aspath
operations (by using regexp for instance). Changing the output
on the fly would require regenerating this string representation
in the correct format. Linking the configuration to BGP instance
creation avoids refreshing the BGP updates. A similar mechanism
exists in Junos too.
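As an illustration of the three notations, here is a minimal Python
sketch. It is not the bgpd code; the helper names `format_asn` and
`parse_asn` are hypothetical, and the formatting rules are assumed to
follow the usual asplain/asdot/asdot+ conventions:

```python
def format_asn(asn: int, notation: str = "plain") -> str:
    """Render a 4-byte AS number as plain, dot, or dot+ text."""
    if not 0 <= asn <= 0xFFFFFFFF:
        raise ValueError("AS number out of 32-bit range")
    high, low = divmod(asn, 65536)
    if notation == "plain":
        return str(asn)
    if notation == "dot":
        # dot: 2-byte AS numbers stay plain, larger ones use high.low
        return str(asn) if high == 0 else f"{high}.{low}"
    if notation == "dot+":
        # dot+: always high.low, even for 2-byte AS numbers
        return f"{high}.{low}"
    raise ValueError(f"unknown notation: {notation}")


def parse_asn(text: str) -> int:
    """Parse an AS number written in plain or dotted form."""
    if "." in text:
        high, low = (int(part) for part in text.split("."))
        return high * 65536 + low
    return int(text)
```

With these helpers, 'router bgp 1.1' corresponds to AS 65537, which
would be shown as 65537 in plain and 1.1 in dot and dot+.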
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
1. When an OSPF unnumbered neighbor doesn't exist in any VRF,
ospfd prints a bunch of empty JSON objects. Fixed this
by adding an outer JSON object with the VRF information in it.
2. Added a "vrf" option to this command so that per-VRF
unnumbered OSPF neighbor information can be retrieved.
JSON output:
nl1# show ip ospf neighbor swp1 detail json
{
"default":{
},
"vrf1012":{
},
"vrf1013":{
},
"vrf1014":{
}
}
nl1# show ip ospf vrf vrf1012 neighbor swp4.2 detail json
{
"9.9.12.10":[
{
"ifaceAddress":"200.254.2.46",
"areaId":"0.0.0.0",
"ifaceName":"swp4.2",
"localIfaceAddress":"200.254.2.45",
"nbrPriority":1,
"nbrState":"Full",
"role":"DR",
"stateChangeCounter":6,
"lastPrgrsvChangeMsec":1462758,
"routerDesignatedId":"200.254.2.46",
"routerDesignatedBackupId":"200.254.2.45",
"optionsCounter":2,
"optionsList":"*|-|-|-|-|-|E|-",
"routerDeadIntervalTimerDueMsec":37140,
"databaseSummaryListCounter":0,
"linkStateRequestListCounter":0,
"linkStateRetransmissionListCounter":0,
"threadInactivityTimer":"on",
"threadLinkStateRequestRetransmission":"on",
"threadLinkStateUpdateRetransmission":"on"
}
]
}
nl1#
Signed-off-by: Pooja Jagadeesh Doijode <pdoijode@nvidia.com>
Consider this scenario:
Lots of peers with a bunch of route information that is changing
fast. One of the peers happens to be really slow for whatever
reason. The way the output queue is filled is that bgpd puts
64 packets at a time and then reschedules itself to send more
in the future. Now suppose that peer has hit its input Queue
limit and is slow. bgpd will still continue to add data to
the output Queue, regardless of whether the other side is
receiving this data.
Let's limit the Output Queue to the same limit as the Input
Queue. This should prevent bgp eating up large amounts of
memory as stream data when under severe network trauma.
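A rough Python model of the idea, for illustration only. The class
`PeerOutQueue` and its methods are hypothetical and do not mirror the
bgpd data structures; only the batch size of 64 comes from the
description above:

```python
from collections import deque

BATCH = 64  # packets generated per scheduling run, as described above


class PeerOutQueue:
    def __init__(self, limit: int):
        self.limit = limit  # assumed cap, set to the same value as the input queue limit
        self.queue = deque()

    def fill(self, pending: deque) -> int:
        """Move up to BATCH packets from pending, never growing past the limit."""
        moved = 0
        while pending and moved < BATCH and len(self.queue) < self.limit:
            self.queue.append(pending.popleft())
            moved += 1
        return moved

    def drain(self, count: int) -> int:
        """Simulate the peer accepting up to count packets."""
        sent = min(count, len(self.queue))
        for _ in range(sent):
            self.queue.popleft()
        return sent
```

Once the queue is full, `fill()` moves nothing until the slow peer
drains some packets, so memory held as stream data stays bounded.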
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Let the user know how to use the static route monitoring commands.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
The override.css/js files for sphinx docs were not being included into
the tarball created by `make dist`.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Added ipv4 and ipv6 option to existing "show bgp nexthop"
command to be able to query nexthops that belong to a
particular address-family.
Also fixed the warnings of MR 12171
Signed-off-by: Pooja Jagadeesh Doijode <pdoijode@nvidia.com>
New show command "show evpn mac vni xx detail [json]"
to display details of all the mac entries for the
requested VNI.
Output of show evpn mac vni xx detail json:
{
"numMacs":2,
"macs":{
"ca:be:63:7c:81:05":{
"type":"local",
"intf":"veth100",
"ifindex":8,
"uptime":"00:06:55",
"localSequence":0,
"remoteSequence":0,
"detectionCount":0,
"isDuplicate":false,
"syncNeighCount":0,
"neighbors":{
"active":[
"fe80::c8be:63ff:fe7c:8105"
],
"inactive":[
]
}
}
}
}
Also added remoteEs field in the JSON output of
"show evpn mac vni xx json".
Output of show evpn mac vni xx json:
"00:02:00:00:00:0d":{
"type":"remote",
"remoteEs":"03:44:38:39:ff:ff:02:00:00:02",
"localSequence":0,
"remoteSequence":0,
"detectionCount":0,
"isDuplicate":false
}
Signed-off-by: Pooja Jagadeesh Doijode <pdoijode@nvidia.com>
```
donatas-pc# show bgp all detail-routes
For address family: IPv4 Unicast
BGP table version is 11, local router ID is 192.168.10.17, vrf id 0
Default local pref 100, local AS 65002
BGP routing table entry for 10.0.2.0/24, version 1
Paths: (1 available, best #1, table default)
Advertised to non peer-group peers:
192.168.10.124
65001
192.168.10.124 from 192.168.10.124 (200.200.200.202)
Origin incomplete, metric 0, valid, external, otc 65001, best (First path received)
Last update: Tue Dec 20 12:11:52 2022
BGP routing table entry for 10.10.100.0/24, version 2
Paths: (1 available, best #1, table default)
Advertised to non peer-group peers:
192.168.10.124
65001
192.168.10.124 from 192.168.10.124 (200.200.200.202)
Origin IGP, metric 0, valid, external, otc 65001, best (First path received)
Last update: Tue Dec 20 12:11:52 2022
BGP routing table entry for 172.16.31.1/32, version 3
Paths: (1 available, best #1, table default)
Advertised to non peer-group peers:
192.168.10.124
65001
192.168.10.124 from 192.168.10.124 (200.200.200.202)
Origin incomplete, metric 0, valid, external, otc 65001, best (First path received)
Last update: Tue Dec 20 12:11:52 2022
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
The "show ip ospf" command is documented incompletely, while
"show ipv6 ospf" is fine. Complete it with the actual parameters.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
The existing EVPN documentation in bgp.rst does not provide a holistic
configuration, just examples of individual features, and doesn't give
an operator any idea of what a compatible Linux netdev configuration
might look like. This introduces evpn.rst which includes a sample
frr.conf and corresponding Linux interface config (via iproute2) that
an operator can use to set up a basic EVPN topology and model their
interface manager's config from.
This initial version of evpn.rst shows Linux netdev config for
traditional bridges (vlan_filtering=0) and traditional vxlan devices
(single VNI). Later changes to this file will cover the use of
VLAN-aware bridges (vlan_filtering=1), single VXLAN devices
(multi VNI), and eventually bonds (for EVPN-MH).
Eventually the plan is to move the existing EVPN content from bgp.rst
into evpn.rst, but for now let's get some user-facing documentation in
place for interface configs.
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
The documentation says that community-list works as OR when multiple
community values are specified per entry, but that is wrong. It must be
AND; let's fix this.
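To make the AND semantics concrete, a small Python sketch follows. It
is illustrative only; the function names are hypothetical, and
first-match-wins ordering of entries is an assumption, not taken from
the bgpd sources:

```python
def entry_matches(entry_values, route_communities) -> bool:
    """An entry matches only if ALL of its values are present on the route."""
    return set(entry_values) <= set(route_communities)


def community_list_permits(entries, route_communities) -> bool:
    """Walk (action, values) entries in order; the first matching entry decides."""
    for action, values in entries:
        if entry_matches(values, route_communities):
            return action == "permit"
    return False  # implicit deny when nothing matches
```

For an entry holding 65001:1 and 65001:2, a route carrying only
65001:1 does not match, while a route carrying both does.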
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Add "show motd" command.
The vtysh user can call the "show motd" command to show the welcome message again.
This is useful if the user saves frequently used commands in motd.
Signed-off-by: Sergei Rozhkov <gh@zserg.ru>
The `sid vpn per-vrf export` VTY command in bgpd has been extended to
support up to 1048575 SIDs.
This commit updates the documentation of the `sid vpn per-vrf export`
command.
Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
We already have a global knob for graceful-shutdown, but it's handy to have
a per-neighbor knob as well, especially when a single neighbor needs to be
restarted or shut down gracefully.
We can do this with route-maps, but this is a faster and cleaner way for
an operator to do the same.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Add a new CLI command to troubleshoot the pathd daemon.
Some traces that were previously always enabled are now
hidden behind this CLI command.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Add the documentation for the `behavior usid` command to zebra.
When the `behavior usid` command is set, a flag is added to the locator
to indicate that the locator is a uSID locator. When a locator is
specified as a uSID locator, the bgpd will install SRv6 behaviors with
the uSID in the dataplane and use the SRv6 uSID codepoints in the BGP
update message.
Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
Some results:
```
====
PCRE
====
% ./a.out "^65001" "65001"
comparing: ^65001 / 65001
ret status: 0
[14:31] donatas-pc donatas /home/donatas
% ./a.out "^65001_" "65001"
comparing: ^65001_ / 65001
ret status: 0
=====
PCRE2
=====
% ./a.out "^65001" "65001"
comparing: ^65001 / 65001
ret status: 0
[14:30] donatas-pc donatas /home/donatas
% ./a.out "^65001_" "65001"
comparing: ^65001_ / 65001
ret status: 1
```
Seems that if using PCRE2, we need to escape outer `()` chars and `|`. Sounds
like a bug.
But this is only with some older PCRE2 versions. With >= 10.36, I wasn't able
to reproduce this, everything is fine and working as expected.
Adding the _FRR_PCRE2_POSIX definition because pcre2posix.h does not have
an include guard.
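For reference, the expected behaviour of the two patterns can be
reproduced with a short Python sketch. It assumes that `_` in an
as-path regex is expanded to a delimiter group before compilation; the
exact group used here and the helper names are assumptions made for
illustration, and Python's `re` stands in for the POSIX wrappers:

```python
import re

# Assumed expansion of "_": start of string, a delimiter character, or end of string.
AS_PATH_DELIMITER = "(^|[,{}() ]|$)"


def compile_aspath_regex(pattern: str) -> "re.Pattern[str]":
    """Expand "_" and compile the resulting expression."""
    return re.compile(pattern.replace("_", AS_PATH_DELIMITER))


def aspath_matches(pattern: str, aspath: str) -> bool:
    """Return True when the expanded pattern is found in the as-path string."""
    return compile_aspath_regex(pattern).search(aspath) is not None
```

Under that assumption both `^65001` and `^65001_` should match the
as-path string "65001", which is the behaviour seen with PCRE and with
newer PCRE2 versions.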
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Remove the nexthop groups documentation from pbr.rst and
make it `generic`. Add the resilient buckets nexthop
group type.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
"on-shutdown" and "on-startup" have different timeout ranges.
Correct the timeout range for "on-shutdown" based on the current code:
```
(ospf) max-metric router-lsa on-shutdown (5-100)
```
Signed-off-by: anlan_cs <vic.lan@pica8.com>
Add a default limit to the InQ for messages off the bgp peer
socket. Make the limit configurable via cli.
Adding this limit causes the messages to be retained in the tcp
socket and allows tcp back pressure and congestion control to kick
in.
Before this change, we allowed the InQ to grow indefinitely, just taking
messages off the socket and adding them to the fifo queue, never letting
the kernel know we need to slow down. Under high message loads and
large perf-heavy routemaps (regex matching), this queue
would cause a memory spike and BGP would get OOM killed. This change
leaves the messages in the socket and distributes that load where it
should be, in the socket buffers on both send/recv, while we handle the
messages.
Also, changes were made to allow the ringbuffer to hold messages and
continue to be filled by the IO pthread while we wait for the Main
pthread to handle the work on the InQ.
Memory spike seen with large numbers of routes flapping and route-maps
with dozens of regex matching:
```
Memory statistics for bgpd:
System allocator statistics:
Total heap allocated: > 2GB
Holding block headers: 516 KiB
Used small blocks: 0 bytes
Used ordinary blocks: 160 MiB
Free small blocks: 3680 bytes
Free ordinary blocks: > 2GB
Ordinary blocks: 121244
Small blocks: 83
Holding blocks: 1
```
With most of it being held by the inQ (seen from the stream datastructure info here):
```
Type : Current# Size Total Max# MaxBytes
...
...
Stream : 115543 variable 26963208 15970740 3571708768
```
With this change that memory is capped and load is left in the sockets:
RECV Side:
```
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
ESTAB 265350 0 [fe80::4080:30ff:feb0:cee3]%veth1:36950 [fe80::4c14:9cff:fe1d:5bfd]:179 users:(("bgpd",pid=1393334,fd=26))
skmem:(r403688,rb425984,t0,tb425984,f1816,w0,o0,bl0,d61)
```
SEND Side:
```
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
ESTAB 0 1275012 [fe80::4c14:9cff:fe1d:5bfd]%veth1:179 [fe80::4080:30ff:feb0:cee3]:36950 users:(("bgpd",pid=1393443,fd=27))
skmem:(r0,rb131072,t0,tb1453568,f1916,w1300612,o0,bl0,d0)
```
Signed-off-by: Stephen Worley <sworley@nvidia.com>
This commit adds the documentation for the "sid vpn per-vrf export (1..255)|auto" command to bgpd.
Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
This commit adds the documentation of the two optional parameters "block-len" and "node-len" of the SRv6 locator.
Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
Add new `show bgp vni ...` command to docs. This command
is used to show the per-VNI EVPN tables in BGP.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Docs were recommending both integrated and non-integrated
config in different sections. Remove the recommendation
for non-integrated config from vtysh.rst.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
RFC4364 describes peerings between multiple AS domains, to ease
the continuity of VPN services across multiple SPs. This commit
implements a sub-set of IETF option b) described in chapter 10 b.
The ASBR to ASBR approach is taken, with an EBGP peering between
the two routers. The EBGP peering must be directly connected to
the outgoing interface used. In those conditions, the next hop
is directly connected, and there is no need to have a transport
label to convey the VPN label. A new vty command is added on a
per interface basis:
When enabled, this command permits conveying BGP VPN labels
without any transport label (i.e. with an implicit-null label).
Restriction:
this command is used only for directly connected EBGP peerings.
Other use cases are not covered.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When a route imported from l3vpn is analysed, the nexthop from default
VRF is looked up against a valid MPLS path. Generally, this is done on
backbones with a MPLS signalisation transport layer like LDP. Generally,
the BGP connection is multiple hops away. That scenario is already
working.
There are cases where L3VPN runs over GRE interfaces and
there is no LSP path over that GRE interface: GRE is just there to
tunnel MPLS traffic. In that case, the nexthop given in the path does not
have an MPLS path, but should be authorized to convey MPLS traffic provided
that the user permits it via a configuration command.
This commit introduces a new command that can be activated in a route-map:
> set l3vpn next-hop encapsulation gre
That command authorizes the nexthop tracking engine to accept paths that
have a GRE interface as output, regardless of whether an LSP path is
present.
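The acceptance rule can be summarised with a small Python sketch. This
is a simplified model, not the nexthop tracking code; the `Nexthop`
fields and the function name are hypothetical:

```python
from dataclasses import dataclass


@dataclass
class Nexthop:
    interface: str
    has_lsp: bool  # an MPLS LSP towards the nexthop exists
    is_gre: bool   # the output interface is a GRE tunnel


def nexthop_usable_for_l3vpn(nh: Nexthop, allow_gre_encap: bool) -> bool:
    """Accept the nexthop if it has an LSP, or if GRE is explicitly permitted."""
    if nh.has_lsp:
        return True
    # Without an LSP, only a GRE output interface combined with the
    # route-map setting makes the nexthop acceptable.
    return allow_gre_encap and nh.is_gre
```

Here `allow_gre_encap` stands for the route-map having applied
`set l3vpn next-hop encapsulation gre` to the path.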
A configuration example is given below. When incoming bgp vpnv4 updates
are received, the nexthop of the NLRI is 192.168.0.2. Based on the nexthop
tracking service from zebra, BGP knows that the output interface to reach
192.168.0.2 is r1-gre0. Because that interface is a GRE tunnel rather than
an MPLS-enabled interface, the update is installed using that nexthop.
interface r1-gre0
ip address 192.168.0.1/24
exit
router bgp 65500
bgp router-id 1.1.1.1
neighbor 192.168.0.2 remote-as 65500
!
address-family ipv4 unicast
no neighbor 192.168.0.2 activate
exit-address-family
!
address-family ipv4 vpn
neighbor 192.168.0.2 activate
neighbor 192.168.0.2 route-map rmap in
exit-address-family
exit
!
router bgp 65500 vrf vrf1
bgp router-id 1.1.1.1
no bgp network import-check
!
address-family ipv4 unicast
network 10.201.0.0/24
redistribute connected
label vpn export 101
rd vpn export 444:1
rt vpn both 52:100
export vpn
import vpn
exit-address-family
exit
!
route-map rmap permit 1
set l3vpn next-hop encapsulation gre
exit
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Add the ability to match on the RPKI extended community via route-maps.
An additional route-map command
`match rpki-extcommunity <invalid|notfound|valid>` is added.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
```
spine1-debian-11# sh ip bgp 100.100.100.101/32
BGP routing table entry for 100.100.100.101/32, version 21
Paths: (1 available, best #1, table default)
Not advertised to any peer
Local
fe80::ca5d:fd0d:cd8:1bb7 from eth3 (172.17.0.3)
(fe80::ca5d:fd0d:cd8:1bb7) (used)
Origin incomplete, metric 0, localpref 100, valid, internal, best (First path received)
Extended Community: OVS:invalid
Last update: Wed Aug 31 19:31:46 2022
spine1-debian-11# sh ip bgp 100.100.100.100/32
BGP routing table entry for 100.100.100.100/32, version 17
Paths: (1 available, best #1, table default)
Not advertised to any peer
Local
fe80::ca5d:fd0d:cd8:1bb7 from eth3 (172.17.0.3)
(fe80::ca5d:fd0d:cd8:1bb7) (used)
Origin incomplete, metric 0, localpref 100, valid, internal, best (First path received)
Extended Community: OVS:not-found
Last update: Wed Aug 31 19:31:46 2022
spine1-debian-11#
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
TCP keepalive is enabled once BGP connection is established.
New vty commands:
bgp tcp-keepalive <1-65535> <1-65535> <1-30>
no bgp tcp-keepalive
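For readers unfamiliar with the underlying socket options, here is a
Python sketch of what enabling TCP keepalive on a Linux socket looks
like. It assumes the three arguments mean idle time, probe interval
and probe count, in that order; the function name is hypothetical and
this is not the bgpd implementation:

```python
import socket


def enable_tcp_keepalive(sock: socket.socket, idle: int, interval: int, probes: int) -> None:
    """Turn on keepalive and set the Linux per-socket timers (seconds, seconds, count)."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
```

The `TCP_KEEPIDLE`, `TCP_KEEPINTVL` and `TCP_KEEPCNT` constants are
Linux-specific; other platforms may expose different names.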
Signed-off-by: Xiaofeng Liu <xiaofeng.liu@6wind.com>
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Overall, rfc1997 states:
The community attribute values ranging from 0x0000000 through
0x0000FFFF and 0xFFFF0000 through 0xFFFFFFFF are hereby reserved.
But we have special handling here, like Cisco IOS XR.
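The reserved ranges quoted above can be checked with a few lines of
Python. This is only a sketch of the RFC rule, not of the special
handling in bgpd; both function names are hypothetical:

```python
def parse_community(text: str) -> int:
    """Convert "AS:NN" into the 32-bit community value."""
    asn, local = (int(part) for part in text.split(":"))
    if not (0 <= asn <= 0xFFFF and 0 <= local <= 0xFFFF):
        raise ValueError(f"community out of range: {text}")
    return (asn << 16) | local


def is_reserved_community(value: int) -> bool:
    """True for values in the two ranges reserved by RFC 1997."""
    return value <= 0x0000FFFF or value >= 0xFFFF0000
```

In other words, any community whose AS part is 0 or 65535 falls into
a reserved range.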
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
BGP SoO is a tag that is appended on BGP updates to allow a peer to mark
a particular peer as belonging to a particular site. In certain MPLS L3 VPN
configurations, the BGP AS-Path may not provide the granularity needed
to prevent a loop in the control-plane. With this in mind, BGP SoO is
designed to fill this gap and prevent a routing loop that may occur.
If we configure, for example, `neighbor soo 65000:1` at PEs, routes won't be
announced between CPEs if the SoO matches. This is especially needed when
using as-override or allowas-in.
Also, this automates what would otherwise require configuring route-maps
for each peer, like:
```
bgp extcommunity-list cpe permit soo 65000:1
!
route-map cpe permit 10
set extcommunity soo 65000:1
...
route-map cpe deny 10
match extcommunity cpe
route-map cpe permit 20
...
```
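The intended effect can be modelled in Python as follows. This is a
simplified sketch with hypothetical function names, representing the
SoO values on a route as a set of strings; it is not the bgpd code:

```python
from typing import Optional, Set


def tag_inbound(route_soo: Set[str], neighbor_soo: Optional[str]) -> Set[str]:
    """Routes received from a neighbor with a configured SoO get that SoO attached."""
    if neighbor_soo is None:
        return set(route_soo)
    return set(route_soo) | {neighbor_soo}


def should_advertise(route_soo: Set[str], neighbor_soo: Optional[str]) -> bool:
    """Suppress the route when it already carries the neighbor's SoO."""
    if neighbor_soo is None:
        return True
    return neighbor_soo not in route_soo
```

A route learned from one CPE tagged with 65000:1 is therefore not
announced to another CPE configured with the same SoO.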
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Current wording _implies_ `neighbor` updates are sent unicast; this makes it explicit.
Signed-off-by: Ben L <47653825+ad8-bdl@users.noreply.github.com>
A new command is available under SAFI_MPLS_VPN:
With this command, received BGP vpnvx prefixes are
not kept if no VRF is interested in importing
those vpn entries.
A soft refresh is performed when the configuration
changes: retain cmd, vrf import settings, or
route-map change.
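The filtering decision can be pictured with a short Python sketch. It
assumes that "interested in importing" means at least one route-target
on the prefix is imported by some VRF; the function name and the data
layout are hypothetical:

```python
def keep_vpn_prefix(prefix_rts, vrf_import_rts) -> bool:
    """Keep a VPN prefix only if some VRF imports one of its route-targets."""
    # vrf_import_rts maps a VRF name to the set of route-targets it imports.
    wanted = set().union(*vrf_import_rts.values())
    return bool(set(prefix_rts) & wanted)
```

When the import settings of a VRF change, the set of wanted
route-targets changes too, which is why a soft refresh is needed.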
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The command `debug bgp allow-martian` is not actually
a debug command; it is a command that, when entered, allows
bgp to not reset a peering when a martian nexthop is
passed in the NLRI.
Add the `bgp allow-martian-nexthop` command and allow it to be
used.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Update the documentation with realms and how they
interact with nexthop groups that are installed into
the kernel.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Description:
"show ip ospf neighbour [nbrid] [json]" is expected to give brief output
of the specific neighbour, but it gives the detailed output even without
the detail keyword.
The "show ip ospf neighbour [nbrid] [detail] [json]" command fails to
fetch the expected o/p. Corrected it.
Ex o/p:
frr(config-if)# do show ip ospf neighbor
Neighbor ID Pri State Up Time Dead Time Address Interface RXmtL RqstL DBsmL
8.8.8.8 1 Full/DR 17m03s 31.192s 20.1.1.194 ens192:20.1.1.220 0 0 0
30.1.1.100 1 Full/DR 56.229s 32.000s 30.1.1.100 ens224:30.1.1.220 0 0 0
frr(config-if)#
frr(config-if)#
frr(config-if)# do show ip ospf neighbor 8.8.8.8
Neighbor 8.8.8.8, interface address 20.1.1.194
In the area 0.0.0.0 via interface ens192
Neighbor priority is 1, State is Full/DR, 6 state changes
Most recent state change statistics:
Progressive change 17m18s ago
DR is 20.1.1.194, BDR is 20.1.1.220
Options 2 *|-|-|-|-|-|E|-
Dead timer due in 35.833s
Database Summary List 0
Link State Request List 0
Link State Retransmission List 0
Thread Inactivity Timer on
Thread Database Description Retransmision off
Thread Link State Request Retransmission on
Thread Link State Update Retransmission on
Graceful restart Helper info:
Graceful Restart HELPER Status : None
frr(config-if)# do show ip ospf neighbor 8.8.8.8 detail
No such interface.
frr(config-if)# do show ip ospf neighbor 8.8.8.8 detail json
{}
frr(config-if)#
Signed-off-by: Rajesh Girada <rgirada@vmware.com>
RFC9234 is a way to establish correct connection roles (Customer/
Provider, Peer or with RS) between bgp speakers. This patch:
- Add a new configuration/terminal option to set the appropriate local
role;
- Add a mechanism for checking used roles, implemented by exchanging
the corresponding capabilities in OPEN messages;
- Add strict mode to force other party to use this feature;
- Add basic support for a new transitive optional bgp attribute - OTC
(Only to Customer);
- Add logic for default setting OTC attribute and filtering routes with
this attribute by the edge speakers, if the appropriate conditions are
met;
- Add two test stands to check role negotiation and route filtering
during role usage.
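As a rough illustration of the role check and of the OTC egress rule
from RFC 9234, consider the Python sketch below. The role strings and
function names are chosen for illustration and are not taken from the
patch:

```python
from typing import Optional

# Valid (local role, remote role) pairs according to RFC 9234.
ALLOWED_ROLE_PAIRS = {
    ("provider", "customer"),
    ("customer", "provider"),
    ("rs-server", "rs-client"),
    ("rs-client", "rs-server"),
    ("peer", "peer"),
}


def roles_compatible(local_role: Optional[str], remote_role: Optional[str],
                     strict: bool = False) -> bool:
    """Decide whether a session may come up given the roles seen in OPEN."""
    if local_role is None:
        return True  # roles not configured locally
    if remote_role is None:
        return not strict  # strict mode requires the other side to send a role
    return (local_role, remote_role) in ALLOWED_ROLE_PAIRS


def may_propagate(has_otc: bool, local_role: Optional[str]) -> bool:
    """A route carrying OTC must not be sent to a provider, peer, or route server."""
    # local_role is our own role on the session: being the customer means
    # the neighbor is our provider, being the rs-client means it is an RS.
    return not (has_otc and local_role in {"customer", "peer", "rs-client"})
```

For example, a session configured as provider on one side and peer on
the other is rejected, and a route that already carries OTC is not
sent back up to a provider.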
Signed-off-by: Eugene Bogomazov <eb@qrator.net>