This is to avoid breaking changes between existing deployments of
extended community for bandwidth encoding. By default FRR uses uint32
to encode bandwidth, which is not as the draft requires (IEEE floating-point).
This switch enables the required encoding per-peer.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
When BGP is notified by RIB that peer address is unreachable then BGP session must be brought
down immediately and not wait for the hold-timer expiry. Today single-hop EBGP already behaves
this way but need to change for iBGP and multi-hop EBGP sessions.
Signed-off-by: Prerana G.B <prerana@vmware.com>, Pushpasis Sarkar <spushpasis@vmware.com>
Currently, the following sequence of events between peers could
result in erroneous capability reports on the peer
with enabled dont-capability-negotiate option:
- having some of the capabilities advertised to a bgp neighbor,
- then disabling capability negotiation to that neighbor,
- then resetting connection to it,
- and no capabilities are actually sent to the neighbor,
- but "show bgp neighbors" on the host still displays them
as advertised to the neighbor.
There are two possibilities for establishing a new connection
- the established connection was initiated by us with bgp_start(),
- the connection was initiated on the neighbor side and processed by
us via bgp_accept() in bgp_network.c.
The former case results in "show bgp neighbors" displaying only
"received" in capabilities, as the peer's cap is initiated to zero
in bgp_start().
In the latter case, if bgp_accept() happens before bgp_start()
is called, then new peer capabilities are being transferred
from its previous record before being zeroed in bgp_start().
This results in "show bgp neighbors" still displaying
"advertised and received" in capabilities.
Following the logic of a similar af_cap field clearing,
treated correctly in both cases, we
- reset peer's capability during bgp_stop()
- don't pass it over to a new peer structure in bgp_accept().
This fix prevents transferring of the previous capabilities record
to a new peer instance in arbitrary reconnect scenario.
Signed-off-by: Alexander Skorichenko <askorichenko@netgate.com>
Adds a knob that sets the time between loc-rib scans for conditional
advertisement.
I chose the range (5-240) because 1 second seems dumb and too easy to
hurt yourself at even moderate scale, 5 seconds you can still hurt
yourself but I could see a use case for it, and 4 minutes should be
enough for anyone (tm)
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
the realloc man page:
If ptr is NULL, then the call is equivalent to malloc(size)
This should be sufficient for our needs to not have to have
XMALLOC and XREALLOC
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
There is no peer_af allocated in `peer_activate`. Trying to delete
the structure just results in an no-op and a error return value.
The error message "couldn't delete af structure for peer" is
unexpected.
Signed-off-by: zyxwvu Shi <shiyuchen.syc@bytedance.com>
New peers should be initialized with a usual max packet size and later
determined on OPEN messages.
Testing with different peers supporting/not supporting extended support.
2021/07/02 13:48:00 BGP: [WEV7K-2GAQ5] u2:s2 send UPDATE len 8991 (max message len: 65535) numpfx 1788
2021/07/02 13:48:03 BGP: [WEV7K-2GAQ5] u3:s3 send UPDATE len 4096 (max message len: 4096) numpfx 809
2021/07/02 13:48:03 BGP: [WEV7K-2GAQ5] u3:s3 send UPDATE len 4096 (max message len: 4096) numpfx 809
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
There are a few places in the code where we use PREFIX_COPY(_IPV4/IPV6)
macro to copy a prefix. Let's always use prefix_copy function for this.
This should fix CID 1482142 and 1504610.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Adds new commands to allow a user to default 'default' address-families
to be inherited by all new peers. Previously this was limited to just
ipv4/ipv6 unicast, now the full list is:
---
ipv4-unicast
ipv4-multicast
ipv4-vpn
ipv4-labeled-unicast
ipv4-flowspec
ipv6-unicast
ipv6-multicast
ipv6-vpn
ipv6-labeled-unicast
ipv6-flowspec
l2vpn-evpn
---
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
Introduces bgp->default_af to selectively enable various default
afi/safis to be inherited by new peers.
Makes default_af flag logic consistent for all address-families, i.e.
instead of a "no default" flag for ipv4 and a "default" flag for ipv6,
just use "default" for both and make it true for ipv4 by default.
Removes old BGP_FLAG_NO_DEFAULT_IPV4 and BGP_FLAG_DEFAULT_IPV6, and
cleans up bgp->flags bit definitions to avoid gaps for unused bits.
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
Display on which VRF/view the neighbor was not found. Useful when
selecting "vrf all".
Before patch:
No such neighbor in this view/vrf
After patch:
No such neighbor in VRF default
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
We are inconsistently using peer_establiahed(peer) with
sometimes using `peer->status == Established`. Just Convert
over to using the function for consistency.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
BGP configuration changes that imply recomputing the BGP route table
(e.g. modifying route-maps, setting bgp graceful-shutdown) might be a
long time process depending on the size of the BGP table and the
route-map numbers and complexity. For example, setups with full
Internet routes take something like one minute to reprocess all the
prefixes when graceful-shutdown is configured. During this time, a
"show bgp commands" request on vtysh results in blocking the shell until
the soft reconfigure table task is over.
This patch splits bgp_soft_reconfig_table task into thread jobs of 25K
prefixes.
Some tests on a full Internet route setup show that after reconfiguring
route-maps or graceful-shutdown, vtysh is not stucked anymore. We are
now able to request commands like "show bgp summary" after 1 or 2
seconds instead of 30 to 60s.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
This commit add base-lines for BGP SRv6 VPN support.
srv6_locator_chunks property of struct bgp is used
to store BGPd's own SRv6 locator chunk getting with
ZEBRA_SRV6_MANAGER_GET_LOCATOR_CHUNK api.
And srv6_functions is used to store BGP's srv6
localsids. It's mainly used when new SID reservation
from locator chunks.
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
base
Currently, when AS number of an existing BGP neighbor is added in a BGP
confederation, AS_CONFED_SEQUENCE segment attribute will be missing in
prefixes advertised to the neighbor. Also, receiving prefixes from the
neighbor will be withdrawn because of "Malformed AS path from A.B.C.D".
neighbor 10.100.200.3 remote-as 123
bgp confederation identifier 65001
bgp confederation peers 123
With this change, update peer's sort after adding or removing its AS
number in a BGP confederation.
Signed-off-by: Alexander Chernavin <achernavin@netgate.com>
Deconfiguring conditional advertisement resulted in all other policy
settings on the peer getting removed due to an excessively large memset.
This also created a desync with the northbound config tree, which caused
its own set of problems...
Fix the memset to just remove the conditional advertisement config.
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
BGP_MAX_PACKET_SIZE no longer represented the absolute maximum BGP
packet size as it did before, instead it was defined as 4096 bytes,
which is the maximum unless extended message capability is negotiated,
in which case the maximum goes to 65k.
That introduced at least one bug - last_reset_cause was undersized for
extended messages, and when sending an extended message > 4096 bytes
back to a peer as part of NOTIFY data would trigger a bounds check
assert.
This patch redefines the macro to restore its previous meaning,
introduces a new macro - BGP_STANDARD_MESSAGE_MAX_PACKET_SIZE - to
represent the 4096 byte size, and renames the extended size to
BGP_EXTENDED_MESSAGE_MAX_PACKET_SIZE for consistency. Code locations
that definitely should use the small size have been updated, locations
that semantically always need whatever the max is, no matter what that
is, use BGP_MAX_PACKET_SIZE.
BGP_EXTENDED_MESSAGE_MAX_PACKET_SIZE should only be used as a constant
when storing what the negotiated max size is for use at runtime and to
define BGP_MAX_PACKET_SIZE. Unless there is a future standard that
introduces a third valid size it should not be used for any other
purpose.
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
Show alias name instead of numerical value in `show bgp <prefix>. E.g.:
```
root@exit1-debian-9:~/frr# vtysh -c 'sh run' | grep 'bgp community alias'
bgp community alias 65001:123 community-1
bgp community alias 65001:123:1 lcommunity-1
root@exit1-debian-9:~/frr#
```
```
exit1-debian-9# sh ip bgp 172.16.16.1/32
BGP routing table entry for 172.16.16.1/32, version 21
Paths: (2 available, best #2, table default)
Advertised to non peer-group peers:
65030
192.168.0.2 from home-spine1.donatas.net(192.168.0.2) (172.16.16.1)
Origin incomplete, metric 0, valid, external, best (Neighbor IP)
Community: 65001:12 65001:13 community-1 65001:65534
Large Community: lcommunity-1 65001:123:2
Last update: Fri Apr 16 12:51:27 2021
exit1-debian-9#
```
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Problem Statement:
=================
In scale setup BGP sessions start flapping.
RCA:
====
In virtualized environment there are multiple places where
MTU need to be set. If there are some places were MTU is not set
properly then there is chances that BGP packets get fragmented,
in scale setup this will lead to BGP session flap.
Fix:
====
A new tcp option is provided as part of this implementation,
which can be configured per neighbor and helps to set the TCP
max segment size. User need to derive the path MTU between the BGP
neighbors and set that value as part of tcp-mss setting.
1. CLI Configuration:
[no] neighbor <A.B.C.D|X:X::X:X|WORD> tcp-mss (1-65535)
2. Running config
frr# show running-config
router bgp 100
neighbor 198.51.100.2 tcp-mss 150 => new entry
neighbor 2001:DB8::2 tcp-mss 400 => new entry
3. Show command
frr# show bgp neighbors 198.51.100.2
BGP neighbor is 198.51.100.2, remote AS 100, local AS 100, internal link
Hostname: frr
Configured tcp-mss is 150, synced tcp-mss is 138 => new display
4. Show command json output
frr# show bgp neighbors 2001:DB8::2 json
{
"2001:DB8::2":{
"remoteAs":100,
"bgpTimerKeepAliveIntervalMsecs":60000,
"bgpTcpMssConfigured":400, => new entry
"bgpTcpMssSynced":388, => new entry
Risk:
=====
Low - This is a config driven feature and it sets the max segment
size for the TCP session between BGP peers.
Tests Executed:
===============
Have done manual testing with three router topology.
1. Executed basic config and un config scenarios
2. Verified if the config is updated in running config
during config and no config operation
3. Verified the show command output in both CLI format and
JSON format.
4. Verified if TCP SYN messages carry the max segment size
in their initial packets.
5. Verified the behaviour during clear bgp session.
6. done packet capture to see if the new segment size
takes effect.
Signed-off-by: Abhinay Ramesh <rabhinay@vmware.com>
When we're trying to create the config for already existing router and
the AS number or instance type mismatches, we're returning the
inconsistency error and don't set the NB entry.
Any subsequent command that configures this router will crash because
every command relies on the existence of the NB entry.
Let's store the entry even when there's a mismatch to prevent the crash.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
There are multiple problems:
- commit ef7c53e2 introduced a new return value 2 which broke things,
because a lot of code treats non-zero return as an error,
- there is an incorrect error returned when AS number mismatches.
This commit fixes both.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Description:
FRR doesn't re-install the routes, imported from a tenant VRF,
when bgp instance for source vrf is deleted and re-added again.
When bgp instance is removed and re-added, when import statement is already there,
then route leaking stops between two VRFs.
Every 'router bgp' command should trigger re-export of all the routes
to the importing bgp vrf instances.
When a router bgp is configured, there could be bgp vrf instance(s) importing routes from
this newly configured bgp vrf instance.
We need to export routes from configured bgp vrf to VPN.
This can impact performance, whenever we are testing scale from vrf route-leaking perspective.
We should not trigger re-export for already existing bgp vrf instances.
Co-authored-by: Santosh P K <sapk@vmware.com>
Co-authored-by: Kantesh Mundaragi <kmundaragi@vmware.com>
Signed-off-by: Abhinay Ramesh <rabhinay@vmware.com>
Remove old BFD API usage and replace it with the new one.
Highlights:
- More shared code: the daemon gets notified with callbacks instead of
having to roll its own code to find the notified sessions.
- Less code to integrate with BFD.
- Remove hidden commands to configure single / multi hop. Use
protocol data instead.
BGP can determine if a peer is single/multi hop according to the
following criteria:
a. If the IP address is a link-local address (single hop)
b. The network is shared with peer (single hop)
c. BGP is configured for eBGP multi hop / TTL security (multi hop)
- Respect the configuration hierarchy:
a. Peer configuration take precendence over peer-group
configuration.
b. When peer group configuration is removed, reset peer
BFD configurations to defaults (unless peer had specific
configs).
Example:
neighbor foo peer-group
neighbor foo bfd profile X
neighbor 192.168.0.2 peer-group foo
neighbor 192.168.0.2 bfd
! If peer-group is removed the profile configuration gets
! removed from peer 192.168.0.2, but BFD will still enabled
! because of the neighbor specific bfd configuration.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
If we have a SAFI conflict, ie we are trying to activate safi's
UNICAST and LABELED_UNICAST at the same time, we should not
cause bestpath to be rerun and we should not try to put
labels on everything.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When you use a single BGP session for both IPv4 and IPv6 it's a bit
annoying going into ipv6 address-family and explicitly activating it.
Let's get this automatically if enabled with `bgp default ipv6-unicast`.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Description:
When user is config connect timer, it doesn't reflect
immediately. It reflect when next time neighbor is tried to reconnect.
Problem Description/Summary :
When user is config connect timer, it doesn't reflect
The network connection was aborted by the local system.d to reconnect.
Fix is to update the connect timer immediately if BGP
session is not in establish state.
Expected Behavior :
If neighbor is not yet established, we should immediately apply the config connect timer to the peer.
Signed-off-by: sudhanshukumar22 <sudhanshu.kumar@broadcom.com>
Add SNMP support for L3vpn Vrf table as defined in [RFC4382]
Keep track of vrf status for the table and for future traps.
Signed-off-by: Pat Ruddy <pat@voltanet.io>
Unconfig not resetting the peer-group member allowas_in[afi][safi]
This is causing remote route to be accept.
Signed-off-by: Kishore Kunal <kishorekunal01@broadcom.com>
Description:
Holdtime and keepalive parameters weren't copied from
peer-group to peer-group members. Fixed the issue by copying holdtime
and keepalive parameters from peer-group to its members.
Problem Description/Summary :
Holdtime and keepalive parameters weren't copied from
peer-group to peer-group members. Fixed the issue by copying holdtime
and keepalive parameters from peer-group to its members.
Signed-off-by: sudhanshukumar22 <sudhanshu.kumar@broadcom.com>
The command `neighbor PGROUP ttl-security hops X` was being
accepted but ignored. Allow it to be stored. I am still
not sure that this is applied correctly, but that is another
problem.
Fixes: #7848
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
A bgp neighbor remains in Idle state in the event that the number
of received prefixes exceeds the configured maximum prefix for the
neighbor. The neighbor remains in idle state even after de-configuring
the maximum prefix limit for the neighbor.
The fix is to clear the neighbor overflow state if set, after
de-configuring the neighbor maximum-prefix commnd.
This allows the neighbor to establish without having to perform a
clear operation. It also avoids the misleading neigbor summary
indicating that the neighbor is in prefix overflow state (PfxCt)
when no limit is configured for the neighbor.
Signed-off-by: Dewi Morgan <dewi.morgan@intl.att.com>
When adding/removing some peer's flag we need to make sure we FORCE updates
to avoid suppressing critical updates.
Like entering `no neighbor x.x.x.x send-community large` would suppress
updates by default and another side will have stale large communities.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Fix `peer_default_originate_unset` so default route can be withdrawn
when `default-originate` option is being unset from a peer-group.
The loop calling `bgp_default_originate` is clearing default-originate
from the peer-group peer `peer` instead of the peer-group member peer
`member`.
Signed-off-by: zyxwvu Shi <shiyuchen.syc@bytedance.com>
On top of the recent `bgp suppress-fib-pending which
was at a BGP_NODE level, add this command at the CONFIG_NODE
level as well and allow the command to apply to all instances
of bgp running.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add a bit of code to allow bgp to send the AS-Path associated with
the route being installed to zebra so it can be displayed and
used as part of the `show ip route A` command in zebra.
eva# show ip route 20.0.0.0/11
Routing entry for 20.0.0.0/11
Known via "bgp", distance 20, metric 0, best
Last update 00:00:00 ago
* 192.168.161.1, via enp39s0, weight 1
AS-Path: 60000 64539 15096 6939 8075
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Changes implement dampening profiles for peers and peer groups. This is
achieved by introducing the possibility to have multible existing
dampening configurations with their own sets of parameters and lists of
associated paths.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Signed-off-by: David Schweizer <dschweizer@opensourcerouting.org>
ES-VRF entries are maintained for the purpose of L3-NHG creation -
1. Each ES-EVI entry is associated with a tenant VRF. This associaton
triggers the creation of an ES-VRF entry.
2. Type-2/MAC-IP routes are imported into a tenant VRF and programmed as
a /32 or host route entry in the dataplane. If the destination of
the host route is a remote-ES the route is programmed with the
corresponding (keyed in by {vrf,ES-id}) L3-NHG.
3. The reason for this indirection (route->L3-NHG, L3-NHG->list-of-VTEPs)
is to avoid route updates to the dplane when a remote-ES link flaps i.e.
instead of updating all the dependent routes the NHG's contents are
updated. This reduces the amount of dataplane updates (fewer nhg updates vs.
route updates) allowing for a faster failover.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
The return from sockunion2hostprefix tells us if the conversion
succeeded or not. There are places in the code where we
always assume that it just `works`, since it can fail
notice and try to do the right thing.
Please note that failure of this function for most cases
of sockunion2hostprefix is highly highly unlikely as that
the sockunion was already created and tested elsewhere
it's just that this function can fail.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When a BGP vrf instance is deleted, the routes it exported into the
main VPN table are not deleted and they remain as stale routes
attached to an unknown bgp instance. When the new vrf instance comes
along, it imports these routes from the main table and thus we see
duplicatesalongside its own identical routes.
The solution is to call the unexport logic when a BGP vrf instance is
being deleted.
problem example
---------------
volta1# sh bgp vrf VRF-a ipv4 unicast
BGP table version is 4, local router ID is 18.0.0.1, vrf id 5
Default local pref 100, local AS 567
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> 7.0.0.6/32 7.0.0.5@0< 10 100 0 ?
*> 7.0.0.8/32 18.0.0.8 0 0 111 ?
*> 18.0.0.0/24 18.0.0.8 0 0 111 ?
*> 56.0.0.0/24 7.0.0.5@0< 0 100 0 ?
Displayed 4 routes and 4 total paths
volta1# conf t
volta1(config)# no router bgp 567 vrf VRF-a
volta1(config)#
volta1(config)# router bgp 567 vrf VRF-a
volta1(config-router)# bgp router-id 18.0.0.1
volta1(config-router)# no bgp ebgp-requires-policy
volta1(config-router)# no bgp network import-check
volta1(config-router)# neighbor 18.0.0.8 remote-as 111
volta1(config-router)# !
volta1(config-router)# address-family ipv4 unicast
volta1(config-router-af)# label vpn export 12345
volta1(config-router-af)# rd vpn export 567:111
volta1(config-router-af)# rt vpn both 567:100
volta1(config-router-af)# export vpn
volta1(config-router-af)# import vpn
volta1(config-router-af)# exit-address-family
volta1(config-router)# !
volta1(config-router)# end
volta1# sh bgp vrf VRF-a ipv4 unicast
BGP table version is 4, local router ID is 18.0.0.1, vrf id 5
Default local pref 100, local AS 567
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> 7.0.0.6/32 7.0.0.5@0< 10 100 0 ?
* 7.0.0.8/32 18.0.0.8 0 0 111 ?
*> 18.0.0.8@-< 0 0 111 ?
* 18.0.0.0/24 18.0.0.8 0 0 111 ?
*> 18.0.0.8@-< 0 0 111 ?
*> 56.0.0.0/24 7.0.0.5@0< 0 100 0 ?
Displayed 4 routes and 6 total paths
@- routes indicating unknown bgp instance are imported
Signed-off-by: Pat Ruddy <pat@voltanet.io>
if the user sets the ebgp-multihop for a neighbor to the same value
we currently have, avoid resetting the session and just return a
silent success.
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
* Added CLI command "[no] bgp suppress-fib-pending" to enable and
disable suppress-fib-pending
* Send ZEBRA_ROUTE_NOTIFY_REQUEST to zebra when "bgp suppress-fib-pending"
is enabled or disabled
* Define BGP_DEFAULT_UPDATE_ADVERTISEMENT_TIME which is the delay added
to update group timer.
* Added error codes
Signed-off-by: kssoman <somanks@gmail.com>
We are allocating temporary memory for information about
what to process in this thread, which is not being cleaned
up on thread cancelling.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The `struct listnode *rt_node` data structure is adding
8 bytes of size to the `struct bgp_dest`. This is a large
amount of data for a flag we are already setting on each
node for this. Just set the flag and use that to figure
out who we are doing graceful restart on.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Sample Configuration with prefix-list and community match rules
---------------------------------------------------------------
R1 ------- R2(DUT) ------- R3
Router2# show running-config
Building configuration...
Current configuration:
!
frr version 7.6-dev-MyOwnFRRVersion
frr defaults traditional
hostname router
log file /var/log/frr/bgpd.log
log syslog informational
hostname Router2
service integrated-vtysh-config
!
debug bgp updates in
debug bgp updates out
!
debug route-map
!
ip route 20.20.0.0/16 blackhole
ipv6 route 2001:db8::200/128 blackhole
!
interface enp0s9
ip address 10.10.10.2/24
!
interface enp0s10
ip address 10.10.20.2/24
!
interface lo
ip address 2.2.2.2/32
!
router bgp 2
bgp log-neighbor-changes
no bgp ebgp-requires-policy
neighbor 10.10.10.1 remote-as 1
neighbor 10.10.20.3 remote-as 3
!
address-family ipv4 unicast
neighbor 10.10.10.1 soft-reconfiguration inbound
neighbor 10.10.20.3 soft-reconfiguration inbound
neighbor 10.10.20.3 advertise-map ADV-MAP non-exist-map EXIST-MAP
exit-address-family
!
ip prefix-list DEFAULT seq 5 permit 1.1.1.5/32
ip prefix-list DEFAULT seq 10 permit 1.1.1.1/32
ip prefix-list EXIST seq 5 permit 10.10.10.10/32
ip prefix-list DEFAULT-ROUTE seq 5 permit 0.0.0.0/0
ip prefix-list IP1 seq 5 permit 10.139.224.0/20
ip prefix-list T2 seq 5 permit 1.1.1.5/32
!
bgp community-list standard DC-ROUTES seq 5 permit 64952:3008
bgp community-list standard DC-ROUTES seq 10 permit 64671:501
bgp community-list standard DC-ROUTES seq 15 permit 64950:3009
bgp community-list standard DEFAULT-ROUTE seq 5 permit 65013:200
!
route-map ADV-MAP permit 10
match ip address prefix-list IP1
!
route-map ADV-MAP permit 20
match community DC-ROUTES
!
route-map EXIST-MAP permit 10
match community DEFAULT-ROUTE
match ip address prefix-list DEFAULT-ROUTE
!
line vty
!
end
Router2#
Router2# show ip bgp 0.0.0.0
BGP routing table entry for 0.0.0.0/0
Paths: (1 available, best #1, table default)
Advertised to non peer-group peers:
10.10.10.1 10.10.20.3
1
10.10.10.1 from 10.10.10.1 (10.139.224.1)
Origin IGP, metric 0, valid, external, best (First path received)
Community: 64848:3011 65011:200 65013:200
Last update: Tue Oct 6 02:39:42 2020
Router2#
Sample output with non-exist-map when default route present in table
--------------------------------------------------------------------
Router2# show ip bgp
BGP table version is 4, local router ID is 2.2.2.2, vrf id 0
Default local pref 100, local AS 2
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> 0.0.0.0/0 10.10.10.1 0 0 1 i
*> 1.1.1.1/32 10.10.10.1 0 0 1 i
*> 1.1.1.5/32 10.10.10.1 0 0 1 i
*> 10.139.224.0/20 10.10.10.1 0 0 1 ?
Displayed 4 routes and 4 total paths
Router2# show ip bgp neighbors 10.10.20.3 advertised-routes
BGP table version is 4, local router ID is 2.2.2.2, vrf id 0
Default local pref 100, local AS 2
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> 0.0.0.0/0 0.0.0.0 0 1 i
*> 1.1.1.5/32 0.0.0.0 0 1 i <<<<<<<<< non-exist-map : 0.0.0.0/0 is present so, 10.139.224.0/20 not advertised
Total number of prefixes 2
Sample output with non-exist-map when default route not present in table
------------------------------------------------------------------------
Router2# show ip bgp
BGP table version is 5, local router ID is 2.2.2.2, vrf id 0
Default local pref 100, local AS 2
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> 1.1.1.1/32 10.10.10.1 0 0 1 i
*> 1.1.1.5/32 10.10.10.1 0 0 1 i
*> 10.139.224.0/20 10.10.10.1 0 0 1 ?
Displayed 3 routes and 3 total paths
Router2#
Router2#
Router2# show ip bgp neighbors 10.10.20.3 advertised-routes
BGP table version is 5, local router ID is 2.2.2.2, vrf id 0
Default local pref 100, local AS 2
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> 1.1.1.1/32 0.0.0.0 0 1 i
*> 1.1.1.5/32 0.0.0.0 0 1 i
*> 10.139.224.0/20 0.0.0.0 0 1 ? <<<<<<<<< non-exist-map : 0.0.0.0/0 is not present so, 10.139.224.0/20 advertised
Total number of prefixes 3
Router2#
Sample output with exist-map when default route present in table
--------------------------------------------------------------------
Router2# show ip bgp
BGP table version is 8, local router ID is 2.2.2.2, vrf id 0
Default local pref 100, local AS 2
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> 0.0.0.0/0 10.10.10.1 0 0 1 i
*> 1.1.1.1/32 10.10.10.1 0 0 1 i
*> 1.1.1.5/32 10.10.10.1 0 0 1 i
*> 10.139.224.0/20 10.10.10.1 0 0 1 ?
Displayed 4 routes and 4 total paths
Router2#
Router2#
Router2#
Router2#
Router2# show ip bgp neighbors 10.10.20.3 advertised-routes
BGP table version is 8, local router ID is 2.2.2.2, vrf id 0
Default local pref 100, local AS 2
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> 0.0.0.0/0 0.0.0.0 0 1 i
*> 1.1.1.1/32 0.0.0.0 0 1 i
*> 1.1.1.5/32 0.0.0.0 0 1 i
*> 10.139.224.0/20 0.0.0.0 0 1 ? <<<<<<<<< exist-map : 0.0.0.0/0 is present so, 10.139.224.0/20 advertised
Total number of prefixes 4
Router2#
Sample output with exist-map when default route not present in table
--------------------------------------------------------------------
Router2# show ip bgp
BGP table version is 9, local router ID is 2.2.2.2, vrf id 0
Default local pref 100, local AS 2
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> 1.1.1.1/32 10.10.10.1 0 0 1 i
*> 1.1.1.5/32 10.10.10.1 0 0 1 i
*> 10.139.224.0/20 10.10.10.1 0 0 1 ?
Displayed 3 routes and 3 total paths
Router2#
Router2#
Router2#
Router2# show ip bgp neighbors 10.10.20.3 advertised-routes
BGP table version is 9, local router ID is 2.2.2.2, vrf id 0
Default local pref 100, local AS 2
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> 1.1.1.5/32 0.0.0.0 0 1 i <<<<<<<<< exist-map : 0.0.0.0/0 is not present so, 10.139.224.0/20 not advertised
Total number of prefixes 1
Router2#
Signed-off-by: Madhuri Kuruganti <k.madhuri@samsung.com>
Implemented as per the feature description given in the source link.
Descriprion:
The BGP conditional advertisement feature uses the non-exist-map or exist-map
and the advertise-map keywords of the neighbor advertise-map command in order
to track routes by the route prefix.
non-exist-map :
If a route prefix is not present in output of the non-exist-map command, then
the route specified by the advertise-map command is announced.
exist-map :
If a route prefix is present in output of the exist-map command, then the route
specified by the advertise-map command is announced.
The conditional BGP announcements are sent in addition to the normal
announcements that a BGP router sends to its peers.
The conditional advertisement process is triggered by the BGP scanner process,
which runs every 60 seconds. This means that the maximum time for the conditional
advertisement to take effect is 60 seconds. The conditional advertisement can take
effect sooner, depending on when the tracked route is removed from the BGP table
and when the next instance of the BGP scanner occurs.
Sample Configuration on DUT
---------------------------
Router2# show running-config
Building configuration...
Current configuration:
!
frr version 7.6-dev-MyOwnFRRVersion
frr defaults traditional
hostname router
log file /var/log/frr/bgpd.log
log syslog informational
hostname Router2
service integrated-vtysh-config
!
debug bgp updates in
debug bgp updates out
!
debug route-map
!
ip route 200.200.0.0/16 blackhole
ipv6 route 2001:db8::200/128 blackhole
!
interface enp0s9
ip address 10.10.10.2/24
!
interface enp0s10
ip address 10.10.20.2/24
!
interface lo
ip address 2.2.2.2/24
ipv6 address 2001:db8::2/128
!
router bgp 2
bgp log-neighbor-changes
no bgp ebgp-requires-policy
neighbor 10.10.10.1 remote-as 1
neighbor 10.10.20.3 remote-as 3
!
address-family ipv4 unicast
network 2.2.2.0/24
network 200.200.0.0/16
neighbor 10.10.10.1 soft-reconfiguration inbound
neighbor 10.10.10.1 advertise-map ADVERTISE non-exist-map CONDITION
neighbor 10.10.20.3 soft-reconfiguration inbound
exit-address-family
!
address-family ipv6 unicast
network 2001:db8::2/128
network 2001:db8::200/128
neighbor 10.10.10.1 activate
neighbor 10.10.10.1 soft-reconfiguration inbound
neighbor 10.10.10.1 advertise-map ADVERTISE_6 non-exist-map CONDITION_6
neighbor 10.10.20.3 activate
neighbor 10.10.20.3 soft-reconfiguration inbound
exit-address-family
!
access-list CONDITION seq 5 permit 3.3.3.0/24
access-list ADVERTISE seq 5 permit 2.2.2.0/24
access-list ADVERTISE seq 6 permit 200.200.0.0/16
access-list ADVERTISE seq 7 permit 20.20.0.0/16
!
ipv6 access-list ADVERTISE_6 seq 5 permit 2001:db8::2/128
ipv6 access-list CONDITION_6 seq 5 permit 2001:db8::3/128
!
route-map ADVERTISE permit 10
match ip address ADVERTISE
!
route-map CONDITION permit 10
match ip address CONDITION
!
route-map ADVERTISE_6 permit 10
match ipv6 address ADVERTISE_6
!
route-map CONDITION_6 permit 10
match ipv6 address CONDITION_6
!
line vty
!
end
Router2#
Withdraw when non-exist-map prefixes present in BGP table:
----------------------------------------------------------
Router2# show ip bgp all wide
For address family: IPv4 Unicast
BGP table version is 8, local router ID is 2.2.2.2, vrf id 0
Default local pref 100, local AS 2
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> 1.1.1.0/24 10.10.10.1 0 0 1 i
*> 2.2.2.0/24 0.0.0.0 0 32768 i
*> 3.3.3.0/24 10.10.20.3 0 0 3 i
*> 200.200.0.0/16 0.0.0.0 0 32768 i
Displayed 4 routes and 4 total paths
For address family: IPv6 Unicast
BGP table version is 8, local router ID is 2.2.2.2, vrf id 0
Default local pref 100, local AS 2
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> 2001:db8::1/128 fe80::a00:27ff:fecb:ad57 0 0 1 i
*> 2001:db8::2/128 :: 0 32768 i
*> 2001:db8::3/128 fe80::a00:27ff:fe76:6738 0 0 3 i
*> 2001:db8::200/128 :: 0 32768 i
Displayed 4 routes and 4 total paths
Router2#
Router2# show ip bgp neighbors 10.10.10.1
BGP neighbor is 10.10.10.1, remote AS 1, local AS 2, external link
!--- Output suppressed.
For address family: IPv4 Unicast
Update group 9, subgroup 5
Packet Queue length 0
Inbound soft reconfiguration allowed
Community attribute sent to this neighbor(all)
Condition NON_EXIST, Condition-map *CONDITION, Advertise-map *ADVERTISE, status: Withdraw
1 accepted prefixes
For address family: IPv6 Unicast
Update group 10, subgroup 6
Packet Queue length 0
Inbound soft reconfiguration allowed
Community attribute sent to this neighbor(all)
Condition NON_EXIST, Condition-map *CONDITION_6, Advertise-map *ADVERTISE_6, status: Withdraw
1 accepted prefixes
!--- Output suppressed.
Router2#
Here 2.2.2.0/24 & 200.200.0.0/16 (prefixes in advertise-map) are withdrawn
by conditional advertisement scanner as the prefix(3.3.3.0/24) specified
by non-exist-map is present in BGP table.
Router2# show ip bgp all neighbors 10.10.10.1 advertised-routes wide
For address family: IPv4 Unicast
BGP table version is 8, local router ID is 2.2.2.2, vrf id 0
Default local pref 100, local AS 2
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> 1.1.1.0/24 0.0.0.0 0 1 i
*> 3.3.3.0/24 0.0.0.0 0 3 i
Total number of prefixes 2
For address family: IPv6 Unicast
BGP table version is 8, local router ID is 2.2.2.2, vrf id 0
Default local pref 100, local AS 2
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> 2001:db8::1/128 :: 0 1 i
*> 2001:db8::3/128 :: 0 3 i
*> 2001:db8::200/128 :: 0 32768 i
Total number of prefixes 3
Router2#
Advertise when non-exist-map prefixes not present in BGP table:
---------------------------------------------------------------
After Removing 3.3.3.0/24 (prefix present in non-exist-map),
2.2.2.0/24 & 200.200.0.0/16 (prefixes present in advertise-map) are advertised
Router2# show ip bgp all wide
For address family: IPv4 Unicast
BGP table version is 9, local router ID is 2.2.2.2, vrf id 0
Default local pref 100, local AS 2
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> 1.1.1.0/24 10.10.10.1 0 0 1 i
*> 2.2.2.0/24 0.0.0.0 0 32768 i
*> 200.200.0.0/16 0.0.0.0 0 32768 i
Displayed 3 routes and 3 total paths
For address family: IPv6 Unicast
BGP table version is 9, local router ID is 2.2.2.2, vrf id 0
Default local pref 100, local AS 2
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> 2001:db8::1/128 fe80::a00:27ff:fecb:ad57 0 0 1 i
*> 2001:db8::2/128 :: 0 32768 i
*> 2001:db8::200/128 :: 0 32768 i
Displayed 3 routes and 3 total paths
Router2#
Router2# show ip bgp neighbors 10.10.10.1
!--- Output suppressed.
For address family: IPv4 Unicast
Update group 9, subgroup 5
Packet Queue length 0
Inbound soft reconfiguration allowed
Community attribute sent to this neighbor(all)
Condition NON_EXIST, Condition-map *CONDITION, Advertise-map *ADVERTISE, status: Advertise
1 accepted prefixes
For address family: IPv6 Unicast
Update group 10, subgroup 6
Packet Queue length 0
Inbound soft reconfiguration allowed
Community attribute sent to this neighbor(all)
Condition NON_EXIST, Condition-map *CONDITION_6, Advertise-map *ADVERTISE_6, status: Advertise
1 accepted prefixes
!--- Output suppressed.
Router2#
Router2# show ip bgp all neighbors 10.10.10.1 advertised-routes wide
For address family: IPv4 Unicast
BGP table version is 9, local router ID is 2.2.2.2, vrf id 0
Default local pref 100, local AS 2
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> 1.1.1.0/24 0.0.0.0 0 1 i
*> 2.2.2.0/24 0.0.0.0 0 32768 i
*> 200.200.0.0/16 0.0.0.0 0 32768 i
Total number of prefixes 3
For address family: IPv6 Unicast
BGP table version is 9, local router ID is 2.2.2.2, vrf id 0
Default local pref 100, local AS 2
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> 2001:db8::1/128 :: 0 1 i
*> 2001:db8::2/128 :: 0 32768 i
*> 2001:db8::200/128 :: 0 32768 i
Total number of prefixes 3
Router2#
Signed-off-by: Madhuri Kuruganti <k.madhuri@samsung.com>
We currently have a global process queue for handling route
updates in bgp. This is fine, in general, except there are
places and times where we plug the queue for no new work
during certain peer states of bgp update delay. If we
happen to be processing multiple bgp instances on startup
why do we want to stop processing in vrf A when vrf B
is in a bit of a pickle?
Also this separation will allow us to start forward thinking
about how to fully integrate pthreads into route processing
in bgp.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The MH datastructures were being released before the paths that were
referencing them. Fix is to do the MH cleanup last.
The MH finish function has also been stripped down to only do a
datastructure cleanup i.e. avoid sending route updates etc.
Ticket: 31376
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Problem found that if a router-id was not defined or derived
initially, the bgp->router_id would be set to 0x0 and used
for determining auto-rd values. When bgp received a subsequent
router-id update from zebra, bgp would not completely process
the update since it was treated as updating an already derived
router-id with a new value, which is not desired. This also
could leave the auto rd/rt inforamation missing or invalid in
some cases. This fix allows updating the derived router-id if
the previous value was 0/0.
Ticket: CM-31441
Signed-off-by: Don Slice <dslice@nvidia.com>
Enhancement to update-delay configuration to allow setting globally
rather than per-instance. Setting the update-delay is allowed either
per-vrf or globally, but not both at the same time.
Ticket: CM-31096
Signed-off-by: Don Slice <dslice@nvidia.com>
* Added vtysh cli commands and functions to set/unset bgp daemons no-rib
option during runtime and withdraw/announce routes in bgp instances
RIB from/to Zebra.
Signed-off-by: David Schweizer <dschweizer@opensourcerouting.org>
When deleting a dynamic peer, unsetting md5 password would cause
it to be unset on the listener allowing unauthenticated connections
from any peer in the range.
Check for dynamic peers in peer delete and avoid this.
Signed-off-by: Pat Ruddy <pat@voltanet.io>
When setting authentication on a BGP peer in a VRF the listener is
looked up from a global list. However there is no check that the
listener is the one associated with the VRF being configured. This
can result in the wrong listener beiong configured with a password,
leaving the intended listener in an open authentication state.
To simplify this lookup stash a pointer to the bgp instance in
the listener on creating (in the same way as is done for NS-based
VRFS).
Signed-off-by: Pat Ruddy <pat@voltanet.io>
If you configure eBGP on loopbacks, you might miss setting the
ebgp-multihop option. Given that, the session will not be established
because of this. Now, the session is in Active state. When you update
your config afterwards and set the ebgp-multihop option to the
appropriate value, the session will still be in Active state. In fact,
it will be stuck in Active state and only services restart will help.
With this change, when set the ebgp-multihop option and no session was
established, reset the session.
Signed-off-by: Alexander Chernavin <achernavin@netgate.com>
* Applied style suggestions by automated compliance check.
* Fixed function bgp_shutdown_enable to use immutable message string.
Signed-off-by: David Schweizer <dschweizer@opensourcerouting.org>
* Peers are now automatically restarted by the reconnect timer instead
of a ManualStart event after lifting the administrative shutdown.
* Question of when to log what remains.
* Compiles and works as intended now.
Signed-off-by: David Schweizer <dschweizer@opensourcerouting.org>
* Fixed integration in FSM and packet handling.
* Added CLI "show" output, incl. JSON.
* For review and testing only.
Signed-off-by: David Schweizer <dschweizer@opensourcerouting.org>
* Changes allow administratively shutting down all peers of a BGP
instance.
* New CLI commands "[no] bgp shutdown" in vty shell.
* For review and testing only.
Signed-off-by: David Schweizer <dschweizer@opensourcerouting.org>
This is the base patch that brings in support for Type-1 routes.
It includes support for -
- Ethernet Segment (ES) management
- EAD route handling
- MAC-IP (Type-2) routes with a non-zero ESI i.e. Aliasing for
active-active multihoming
- Initial infra for consistency checking. Consistency checking
is a fundamental feature for active-active solutions like MLAG.
We will try to levarage the info in the EAD-ES/EAD-EVI routes to
detect inconsitencies in access config across VTEPs attached to
the same Ethernet Segment.
Functionality Overview -
========================
1. Ethernet segments are created in zebra and associated with
access VLANs. zebra sends that info as ES and ES-EVI objects to BGP.
2. BGP advertises EAD-ES and EAD-EVI routes for the locally attached
ethernet segments.
3. Similarly BGP processes EAD-ES and EAD-EVI routes from peers
and translates them into ES-VTEP objects which are then sent to zebra
as remote ESs.
4. Each ES in zebra is associated with a list of active VTEPs which
is then translated into a L2-NHG (nexthop group). This is the ES
"Alias" entry
5. MAC-IP routes with a non-zero ESI use the alias entry created in
(4.) to forward traffic i.e. a MAC-ECMP is done to these remote-ES
destinations.
EAD route management (route table and key) -
============================================
1. Local EAD-ES routes
a. route-table: per-ES route-table
key: {RD=ES-RD, ESI, ET=0xffffffff, VTEP-IP)
b. route-table: per-VNI route-table
Not added
c. route-table: global route-table
key: {RD=ES-RD, ESI, ET=0xffffffff)
2. Remote EAD-ES routes
a. route-table: per-ES route-table
Not added
b. route-table: per-VNI route-table
key: {RD=ES-RD, ESI, ET=0xffffffff, VTEP-IP)
c. route-table: global route-table
key: {RD=ES-RD, ESI, ET=0xffffffff)
3. Local EAD-EVI routes
a. route-table: per-ES route-table
Not added
b. route-table: per-VNI route-table
key: {RD=0, ESI, ET=0, VTEP-IP)
c. route-table: global route-table
key: {RD=L2-VNI-RD, ESI, ET=0)
4. Remote EAD-EVI routes
a. route-table: per-ES route-table
Not added
b. route-table: per-VNI route-table
key: {RD=0, ESI, ET=0, VTEP-IP)
c. route-table: global route-table
key: {RD=L2-VNI-RD, ESI, ET=0)
Please refer to bgp_evpn_mh.h for info on how the data-structures are
organized.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Revert "zebra: support for macvlan interfaces"
This reverts commit bf69e212fd.
Revert "doc: add some documentation about bgp evpn netns support"
This reverts commit 89b97c33d7.
Revert "zebra: dynamically detect vxlan link interfaces in other netns"
This reverts commit de0ebb2540.
Revert "bgpd: sanity check when updating nexthop from bgp to zebra"
This reverts commit ee9633ed87.
Revert "lib, zebra: reuse and adapt ns_list walk functionality"
This reverts commit c4d466c830.
Revert "zebra: local mac entries populated in correct netnamespace"
This reverts commit 4042454891.
Revert "zebra: when parsing local entry against dad, retrieve config"
This reverts commit 3acc394bc5.
Revert "bgpd: evpn nexthop can be changed by default"
This reverts commit a2342a2412.
Revert "zebra: zvni_map_to_vlan() adaptation for all namespaces"
This reverts commit db81d18647.
Revert "zebra: add ns_id attribute to mac structure"
This reverts commit 388d5b438e.
Revert "zebra: bridge layer2 information records ns_id where bridge is"
This reverts commit b5b453a2d6.
Revert "zebra, lib: new API to get absolute netns val from relative netns val"
This reverts commit b6ebab34f6.
Revert "zebra, lib: store relative default ns id in each namespace"
This reverts commit 9d3555e06c.
Revert "zebra, lib: add an internal API to get relative default nsid in other ns"
This reverts commit 97c9e7533b.
Revert "zebra: map vxlan interface to bridge interface with correct ns id"
This reverts commit 7c990878f2.
Revert "zebra: fdb and neighbor table are read for all zns"
This reverts commit f8ed2c5420.
Revert "zebra: zvni_map_to_svi() adaptation for other network namespaces"
This reverts commit 2a9dccb647.
Revert "zebra: display interface slave type"
This reverts commit fc3141393a.
Revert "zebra: zvni_from_svi() adaptation for other network namespaces"
This reverts commit 6fe516bd4b.
Revert "zebra: importation of bgp evpn rt5 from vni with other netns"
This reverts commit 28254125d0.
Revert "lib, zebra: update interface name at netlink creation"
This reverts commit 1f7a68a2ff.
Signed-off-by: Pat Ruddy <pat@voltanet.io>
If _force_ is set, then ALL prefixes are counted for maximum instead of
accepted only. This is useful for cases where an inbound filter is applied,
but you want maximum-prefix to act on ALL (including filtered) prefixes.
For instance, we have a configuration like:
neighbor r1 maximum-prefix 10
neighbor r1 prefix-list custom in
!
ip prefix-list custom seq 1 permit 10.0.0.0/24
ip prefix-list custom seq 2 permit 10.0.1.0/24
This will accept only 2 prefixes and discard all others instead of
shutting down the session when 10 is reached.
With this new knob (force), we will count all received prefixes and shutdown
the session when 10 is reached.
The bigger problem is when you have lots of peers with full feed and such a
configuration like in an example.
This is kinda re-ordering of how to treat filter vs. maximum-prefix.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Remove mid-string line breaks, cf. workflow doc:
.. [#tool_style_conflicts] For example, lines over 80 characters are allowed
for text strings to make it possible to search the code for them: please
see `Linux kernel style (breaking long lines and strings)
<https://www.kernel.org/doc/html/v4.10/process/coding-style.html#breaking-long-lines-and-strings>`_
and `Issue #1794 <https://github.com/FRRouting/frr/issues/1794>`_.
Scripted commit, idempotent to running:
```
python3 tools/stringmangle.py --unwrap `git ls-files | egrep '\.[ch]$'`
```
Signed-off-by: David Lamparter <equinox@diac24.net>
This is the bulk part extracted from "bgpd: Convert from `struct
bgp_node` to `struct bgp_dest`". It should not result in any functional
change.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
When a peer is bound to a peer-group, the GR flags set on the
peer are over-written.
Update the GR flags for the peer after it has been bound to a
peer-group.
Signed-off-by: NaveenThanikachalam <nthanikachal@vmware.com>
There can be cases where evpn traffic is not meshed across various
endpoints, but sent to a central pe. For this situation, remove the
nexthop unchanged default behaviour for bgp evpn. Also add route
reflector commands to bgp evpn node.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Problem reported that in many circumstances, RAs created in the
process of bringing up numbered IPv6 peers with extended-nexthop
capability enabled (for ipv4 over ipv6) were not stopped on the
interface when those peers were deleted. Found several circumstances
where this occurred and fix them in this patch.
Ticket: CM-26875
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Some competitive vendors like Cisco, Bird, OpenBGPD,
Nokia already have this by default enabled.
The list is here: https://github.com/bgp/RFC8212
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
This macro is undefined if vnc is disabled, and while it defaults to 0,
this is still wrong and causes issues with -Werror
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Support configurable options to control how link bandwidth is handled
by the receiver. The default behavior is to automatically honor the
link bandwidths received and use it to perform a weighted ECMP BUT only
if all paths in the multipath have associated link bandwidth; if one or
more paths do not have link bandwidth, normal ECMP is performed among
the multipaths. This behavior is as recommended by
https://tools.ietf.org/html/draft-ietf-idr-link-bandwidth.
The additional options available are to (a) completely ignore any link
bandwidth (i.e., weighted ECMP is effectively disabled), (b) skip paths
in the multipath which do not have link bandwidth and perform weighted
ECMP among the other paths (if at least some paths have the bandwidth)
or (c) use a default weight (value chosen is 1) for the paths which
do not have link bandwidth.
The command syntax is
bgp bestpath bandwidth <ignore|skip-missing|default-weight-for-missing>
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Implement the code to handle the other route-map options to generate
the link bandwidth, namely, to use the cumulative bandwidth or to
base this on the number of multipaths. In the latter case, a reference
bandwidth is internally chosen - the implementation uses a value of
1 Gbps.
These additional options mean that the prefix may need to be advertised
if there is a link bandwidth change, which is a new criteria. Define a
new path (change) flag to support this and implement the advertisement.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
https://lists.frrouting.org/pipermail/frog/2020-March/000776.html
It was pointed out that we are not properly passing the nexthop
through and instead we were replacing the nexthop as a Route Server
with our own.
https://tools.ietf.org/html/rfc4456#section-4
10. Implementation Considerations
Care should be taken to make sure that none of the BGP path
attributes defined above can be modified through configuration when
exchanging internal routing information between RRs and Clients and
Non-Clients. Their modification could potentially result in routing
loops.
In addition, when a RR reflects a route, it SHOULD NOT modify the
following path attributes: NEXT_HOP, AS_PATH, LOCAL_PREF, and MED.
Their modification could potentially result in routing loops.
Modify the code such that when FRR is instructed to act as a
Route-Server to pass through the nexthop.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Some were converted to bool, where true/false status is needed.
Converted to void only those, where the return status was only false or true.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Ensure that the late registration for NHT done for IPv4 route exchange
over IPv6 GUA peering is not attempted for peer-groups, only for peers.
Fixes: "bgpd: Late registration of Extended Nexthop"
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Track the returned peer_sorted value and use it where
we can and recalculate where necessary.
This is an effort to reduce the amount of work done here.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The act of peer_sort() being called always set this value
even when we are just looking it up. We need to seperate
out the idea of lookup from set.
For those places that this is immediately obvious that
this is a lookup switch over to using this function.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
* Added CLI commands to update rib-stale-time, running in
Cmd : "bgp gaceful-restart rib-stale-time (1-3000)".
Cmd : "no bgp gaceful-restart rib-stale-time".
* Integrating the hooks function for signalling from BGPD
to ZEBRA to ZEBRA to enable or disable GR feature in ZEBRA
depending on bgp per peer gr configuration.
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
bgp tcp connection.
When the BGP peer is configured between two bgp routes both routers would create
peer structure , when they receive each other’s open message. In this event both
speakers, open duplicate TCP sessions and send OPEN messages on each socket
simultaneously, the BGP Identifier is used to resolve which socket should be closed.
If BGP GR is enabled the old tcp session is dumped and the new session is retained.
So while this transfer of connection is happening, if all the bgp gr config
is not migrated to the new connection, the new bgp gr mode will never get applied.
Fix Summary:
1. Replicate GR configuration from the old session to the new session in bgp_accept().
2. Replicate GR configuration from stub to full-fledged peer in bgp_establish().
3. Disable all NSF flags, clear stale routes (if present), stop restart & stale timers
(if they are running) when the bgp GR mode is changed to “Disabled”.
4. Disable R-bit in cap, if it is not set the received open message.
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
and DS.
* Added config commands and data structures for deferral timer
configuration and processing.
Cmd : bgp graceful-restart select-defer-time (0-3600)
Cmd : no bgp graceful-restart select-defertime (0-3600)
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
Signed-off-by: Soman K S <somanks@vmware.com>
* Added new show command to show the graceful restart
information for each neighbor.
Cmd: show bgp [<ipv4|ipv6>] neighbors [<A.B.C.D|X:X::X:X|WORD>] graceful-restart
* Changes to show neighbors commands for displaying
graceful restart information.
Cmd :show [ip] bgp [<view|vrf> VIEWVRFNAME] [<ipv4|ipv6>] neighbors [<A.B.C.D|X:X::X:X|
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
* Added FSM for peer and global configuration for graceful restart
* Added debug option BGP_GRACEFUL_RESTART for logs specific to
graceful restart processing
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
This moves all the DFLT_BGP_* stuff over to the new defaults mechanism.
bgp_timers_nondefault() added to get better file-scoping.
v2: moved everything into bgp_vty.c so that the core BGP code is
independent of the CLI-specific defaults. This should make the future
northbound conversion easier.
Signed-off-by: David Lamparter <equinox@diac24.net>
There's no good reason to have this in bgpd.c; it's just there
historically. Move it to bgp_vty.c where it makes more sense.
Signed-off-by: David Lamparter <equinox@diac24.net>
The sender side AS path loop detection code was implemented since the
import of Quagga code, however it was always disabled by a `ifdef`
guard.
Lets allow the user to decide whether or not to enable this feature on
run-time.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Add -s X or --socket_size X to the bgp cli to allow
the end user to specify the outgoing bgp tcp kernel
socket buffer size.
It is recommended that this option is only used on
large scale operations.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This change addresses the following:
1) Ensures logs under DEBUG macro checks are categorized
as zlog_debug instead of zlog_info.
2) Error logs are categorized as zlog_err instead of zlog_info.
3) Rephrasing certain logs to make them appear more intuitive.
Signed-off-by: NaveenThanikachalam <nthanikachal@vmware.com>
There was a silly bug introduced when the command to show failed sessions
was added. A missing "," caused the wrong error message to be printed.
Debugging this led down a path that:
- Led to discovering one more error message that needed to be added
- Providing the error code along with the string in the JSON output
to allow programs to key off numbers rather than strings.
- Fixing the missing ","
- Changing the error message to "Waiting for Peer IPv6 LLA" to
make it clear that we're waiting for the link local addr.
Signed-off-by: Dinesh G Dutt <5016467+ddutt@users.noreply.github.com>
We have this crash:
2019-08-18T07:58:44.831656-04:00 rch2-140-fwK2b bgpd[1791]: %NOTIFICATION: sent to neighbor 10.73.248.8 4/0 (Hold Timer Expired) 0 bytes
2019-08-18T07:58:44.832164-04:00 rch2-140-fwK2b bgpd[1791]: Assertion `!((peer->thread_flags) & ((1 << 0)))' failed in file bgpd.c, line 2173, function peer_delete
2019-08-18T07:58:44.832548-04:00 rch2-140-fwK2b bgpd[1791]: Backtrace for 11 stack frames:
2019-08-18T07:58:44.832942-04:00 rch2-140-fwK2b bgpd[1791]: [bt 0] /usr/lib/libfrr.so.0(zlog_backtrace+0x3a) [0x7f5503c7c31a]
2019-08-18T07:58:44.833311-04:00 rch2-140-fwK2b bgpd[1791]: [bt 1] /usr/lib/libfrr.so.0(_zlog_assert_failed+0x61) [0x7f5503c7c891]
2019-08-18T07:58:44.833684-04:00 rch2-140-fwK2b bgpd[1791]: [bt 2] /usr/lib/frr/bgpd(peer_delete+0x4d5) [0x1432ceea15]
2019-08-18T07:58:44.834095-04:00 rch2-140-fwK2b bgpd[1791]: [bt 3] /usr/lib/frr/bgpd(+0x430e9) [0x1432cfc0e9]
2019-08-18T07:58:44.834479-04:00 rch2-140-fwK2b bgpd[1791]: [bt 4] /usr/lib/frr/bgpd(bgp_event_update+0x121) [0x1432cfe1c1]
2019-08-18T07:58:44.834852-04:00 rch2-140-fwK2b bgpd[1791]: [bt 5] /usr/lib/frr/bgpd(+0x453f1) [0x1432cfe3f1]
2019-08-18T07:58:44.835388-04:00 rch2-140-fwK2b bgpd[1791]: [bt 6] /usr/lib/libfrr.so.0(thread_call+0x60) [0x7f5503c9e3c0]
2019-08-18T07:58:44.835829-04:00 rch2-140-fwK2b bgpd[1791]: [bt 7] /usr/lib/libfrr.so.0(frr_run+0xb8) [0x7f5503c79de8]
2019-08-18T07:58:44.836292-04:00 rch2-140-fwK2b bgpd[1791]: [bt 8] /usr/lib/frr/bgpd(main+0x229) [0x1432ce4a69]
2019-08-18T07:58:44.836729-04:00 rch2-140-fwK2b bgpd[1791]: [bt 9] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f550271bb45]
2019-08-18T07:58:44.837198-04:00 rch2-140-fwK2b bgpd[1791]: [bt 10] /usr/lib/frr/bgpd(+0x2cefc) [0x1432ce5efc]
2019-08-18T07:58:44.837670-04:00 rch2-140-fwK2b bgpd[1791]: Current thread function (bgp_holdtime_timer), scheduled from file bgp_fsm.c, line 380
This is the code:
bgp_reads_off(peer);
bgp_writes_off(peer);
assert(!CHECK_FLAG(peer->thread_flags, PEER_THREAD_WRITES_ON));
assert(!CHECK_FLAG(peer->thread_flags, PEER_THREAD_READS_ON));
The line crashing is the first assert. We know in bgp_writes_off we unset this flag:
void bgp_writes_off(struct peer *peer)
{
struct frr_pthread *fpt = bgp_pth_io;
assert(fpt->running);
thread_cancel_async(fpt->master, &peer->t_write, NULL);
THREAD_OFF(peer->t_generate_updgrp_packets);
UNSET_FLAG(peer->thread_flags, PEER_THREAD_WRITES_ON);
}
We also know that the keepalives are not being turned off until we call
bgp_fsm_change_status(peer, Deleted);
later in the function. We know that the keepalive pthread will
write to individual peers and issue a bgp_write_on(), which sets
this flag.
Modify the code base so that we explicitly turn off the keepalives
immediately before the turning of writes off.
Ticket: CM-26119
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
"show bgp l2vpn evpn neighbors <neighbor> [advertised-routes|routes]' did
not work due to various bugs. First, the command only accepted IPv4
addresses as valid neighbor ID, thereby rejecting unnumbered BGP and IPv6
neighbor address. Second, the SAFI was hardcoded to MPLS_VPN even though
we were passing the safi. Third, "all" made no sense in the command context
and to make the command uniform across all address families, I removed the
"all" keyword from the command.
Signed-off-by: Dinesh G Dutt <ddps4u@gmail.com>
Both of these hooks are necessary for proper operation of extensions
that need to latch on to a particular instance.
- without the delete hook, it's impossible to get rid of stale
references, leading to crashes with invalid instance pointers.
- the config-write hook is necessary because per-instance config needs
to be written inside the "router bgp" block to have the appropriate
context; adding a separate config node can't do that.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
This code is not returned anywhere in the system as that bgp
is by default multiple-instance 'only' now. So remove
the last remaining bits of it from the code base.
Remove BGP_ERR_MULTIPLE_INSTANCE_USED too.
Make bgp_get explicitly return BGP_SUCCESS
instead of 0.
Remove the multi-instance error code too.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
A router-id change that isn't explicitly configured (a change
from zebra, for example) should not replace a configured vpn
RD/RT.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
the vrf_id parameter is replaced by struct vrf * parameter.
this impacts most of the daemons that look for an interface based on the
name and the vrf identifier.
Also, it fixes 2 lookup calls in zebra and sharpd, where the vrf_id was
ignored until now.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
* When the bgp is being deleted and routes are in clear workqueue
and new aggregate address being allocated
* Added flag BGP_FLAG_DELETE_IN_PROGRESS in bgp structure to
bgp instance is being deleted
* When adding aggregate route check this flag and peer_self is valid
Signed-off-by: Soman K S <somanks@vmware.com>
The BGP_OPT_CONFIG_CISCO command could no longer be set
as such remove it from the system as a viable option to
be used.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Since we no-longer allow you to select multiple-instance
or not from the cli, let's completely remove the flag
as well.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
It doesn't make much sense for a hash function to modify its argument,
so const the hash input.
BGP does it in a couple places, those cast away the const. Not great but
not any worse than it was.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
This fix aims to reduce the load on BGPD when certain
exisiting configurations are replayed.
Specifically, the fix prevents BGPD from processing
routes when the following already existing configurations
are replayed:
1) A match criteria is configured within a route-map.
2) When "call" is invoked within a route-map.
3) When a route-map is tied to a BGP neighbor.
Signed-off-by: NaveenThanikachalam <nthanikachal@vmware.com>
Prevent IPv6 routes received via a ibgp session with one of its own interface
ip as nexthop from getting installed in the BGP table.
Implemented IPV6 HASH table, where we need to add any ipv6 address as they
gets configured and delete them from the HASH table as the ipv6 addresses
get unconfigured. The above hash table is used to verify if any route learned
via BGP has nexthop which is equal to one of its its connected ipv6 interface.
Signed-off-by: Biswajit Sadhu sadhub@vmware.com
Co-authored-by: Donald Sharp <sharpd@cumulusnetworks.com>
Co-authored-by: Quentin Young <qlyoung@cumulusnetworks.com>
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
This makes the instance bearing the advertise-all-vni config option
register to zebra as the EVPN one, forwarding it the option.
Signed-off-by: Tuetuopay <tuetuopay@me.com>
Sponsored-by: Scaleway
VRF route leak auto RD and RT uses router-id,
when a router-id changes for a bgp instance, change
associated vpn RD and RT values. Withdraw
old RD/RT routes from vpn and with new
RD/RT values advertise new routes to vpn.
One of the sceanrio is restarting frr:
A router-id change may not have reflected
for bgp vrf instance X, while import vrf X
under bgp vrf instance Y.
Once router-id changes for bgp VRF X,
change RD and RTs from export VRF and
imported VRFs. Readvertise routes with new
values to VPN.
Ticket:CM-24149
Reviewed By:CCR-8394
Testing Done:
Validated via configured multiple bgp VRF instances
and enable route leaks among them, restart frr
and all instance received correct RD and RT values.
Checked 'show bgp vrf all ipv4 unicast route-leak'
and vpn table 'show bgp ipv4 vpn all' output.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
The "show bgp ipv6 summary" output displays incorrect number of peers count.
sonic# show bgp ipv6 summary
IPv6 Unicast Summary:
BGP router identifier 10.1.0.1, local AS number 65100 vrf-id 0
BGP table version 0
RIB entries 0, using 0 bytes of memory
Peers 5, using 103 KiB of memory
Peer groups 1, using 64 bytes of memory
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd
2003::1 4 65099 0 0 0 0 0 never Active
2088::1 4 65100 0 0 0 0 0 never Active
3021::2 4 65100 0 0 0 0 0 never Active
Total number of neighbors 3
sonic#
In the above output, the peers count displays as 5 but the actual peer count is 3, i.e.. 3 neighbors are activated in ipv6 unicast address family.
Displayed peer count (5) is the number of the neighbors activated in a BGP instance.
Fix : Now the peers count displays the number of neighbors activated per afi/safi.
After Fix:
sonic# show bgp ipv6 summary
IPv6 Unicast Summary:
BGP router identifier 10.1.0.1, local AS number 65100 vrf-id 0
BGP table version 0
RIB entries 0, using 0 bytes of memory
Peers 3, using 62 KiB of memory
Peer groups 1, using 64 bytes of memory
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd
2003::1 4 65099 0 0 0 0 0 never Active
2088::1 4 65100 0 0 0 0 0 never Active
3021::2 4 65100 0 0 0 0 0 never Active
Total number of neighbors 3
sonic#
Signed-off-by: Akhilesh Samineni <akhilesh.samineni@broadcom.com>
Found in testing that in a certain sequence, a neighbor's peer-group
membership would be lost. This fix resolves that issue. Additionally
found that "no neighbor swp1 remote-as 2" would sometimes leave the
config with "neighbor swp1 remote-as 0" rather than removing from the
config. That one is also resolved.
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
peer_flag_modify() will always return BGP_ERR_INVALID_FLAG because
the action was not defined for PEER_FLAG_IFPEER_V6ONLY flag.
```
global PEER_FLAG_IFPEER_V6ONLY = 16384;
global BGP_ERR_INVALID_FLAG = -2;
probe process("/usr/lib/frr/bgpd").statement("peer_flag_modify@/root/frr/bgpd/bgpd.c:3975")
{
if ($flag == PEER_FLAG_IFPEER_V6ONLY && $action->type == 0)
printf("action not found for the flag PEER_FLAG_IFPEER_V6ONLY\n");
}
probe process("/usr/lib/frr/bgpd").function("peer_flag_modify").return
{
if ($return == BGP_ERR_INVALID_FLAG)
printf("return BGP_ERR_INVALID_FLAG\n");
}
```
produces:
action not found for the flag PEER_FLAG_IFPEER_V6ONLY
return BGP_ERR_INVALID_FLAG
$ vtysh -c 'conf t' -c 'router bgp 20' -c 'neighbor eth1 interface v6only remote-as external'
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
there are some cases where the bgp deletion will not be complete, while
the vrf identifier of the bgp instance is not completely identified. The
vrf search based on the bgp name is the better protection, since the bgp
vrf instance is created, even if the vrf identifier is not yet known.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Made changes and updated the routemap applied counter in the following flows.
1.Increment when route map attached to a list.
2.Decrement when route map removed / modified from a list.
3.Increment/decrement when route map create/delete callback triggered.
4.Besides ,This counter need not be updated when a route map is got updated.
i.e changing/adding a match value to the existing routemap.
In BGP , same update api called for all three add/delete/update operation .
But this counter have to be updated only for routemap addition.
Addressed this specific change by identifying the routemap operation based
on routemap pointer.
Signed-off-by: RajeshGirada <rgirada@vmware.com>
bgp instance is disabling the label allocated to reach vrf entity.
previously, only vrf disabling was removing the label. now, when bgp
leaves, bgp instance also frees the label used.
PR=62306
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Acked-by: Julien Floret <julien.floret@6wind.com>
When a interface based peer is setup and if it is part of a peer
group we should ignore this and just use the PEER_FLAG_CAPABILITY_ENHE
no matter what.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Problem reported that with certain sequences of defining the
remote-as on the peer-group and the members, the configuration would
become wrong, with configured remote-as settings not reflected in
the config but peers unable to come up. This fix resolves these
inconsistencies.
Ticket: CM-19560
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
The confederation identifier is a `as_t` type which is a
uint32_t underneath the covers. Display it using a %u
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add a bit of code that allows us to dump the mac hash. Future
commits will actually add entries to the mac hash and then operate
on it.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
label pool finalisation must be delayed after route deletion on bgp.
otherwise a crash will happen, while labels will be released.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
* The function bgp_router_id_zebra_bump() will check for active bgp
peers before chenging the router ID.
If there are established peers, router ID is not modified
which prevents the flapping of established peer connection
* Added field in bgp structure to store the count of established peers
Signed-off-by: kssoman <somanks@vmware.com>
Enable/disable duplicate address detection
there are 3 actions
warning-only: Default action which generates
only frr warning (syslog) to user for any
duplicate detecton
freeze: Permanently freezes address, manual
intervene required.
freeze with time: An address will recover once
the time has expired (auto-recovery).
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
if zebra is not started, then vrf identifiers are not available. This
prevents import/exportation to be available. This commit permits having
import/export available, even when zebra is not started.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The motivation for this patch is to address a concerning behavior of
tx-addpath-bestpath-per-AS. Prior to this patch, all paths' TX ID was
pre-determined as the path was received from a peer. However, this meant
that any time the path selected as best from an AS changed, bgpd had no
choice but to withdraw the previous best path, and advertise the new
best-path under a new TX ID. This could cause significant network
disruption, especially for the subset of prefixes coming from only one
AS that were also communicated over a bestpath-per-AS session.
The patch's general approach is best illustrated by
txaddpath_update_ids. After a bestpath run (required for best-per-AS to
know what will and will not be sent as addpaths) ID numbers will be
stripped from paths that no longer need to be sent, and held in a pool.
Then, paths that will be sent as addpaths and do not already have ID
numbers will allocate new ID numbers, pulling first from that pool.
Finally, anything left in the pool will be returned to the allocator.
In order for this to work, ID numbers had to be split by strategy. The
tx-addpath-All strategy would keep every ID number "in use" constantly,
preventing IDs from being transferred to different paths. Rather than
create two variables for ID, this patch create a more generic array that
will easily enable more addpath strategies to be implemented. The
previously described ID manipulations will happen per addpath strategy,
and will only be run for strategies that are enabled on at least one
peer.
Finally, the ID numbers are allocated from an allocator that tracks per
AFI/SAFI/Addpath Strategy which IDs are in use. Though it would be very
improbable, there was the possibility with the free-running counter
approach for rollover to cause two paths on the same prefix to get
assigned the same TX ID. As remote as the possibility is, we prefer to
not leave it to chance.
This ID re-use method is not perfect. In some cases you could still get
withdraw-then-add behaviors where not strictly necessary. In the case of
bestpath-per-AS this requires one AS to advertise a prefix for the first
time, then a second AS withdraws that prefix, all within the space of an
already pending MRAI timer. In those situations a withdraw-then-add is
more forgivable, and fixing it would probably require a much more
significant effort, as IDs would need to be moved to ADVs instead of
paths.
Signed-off-by Mitchell Skiba <mskiba@amazon.com>
When we have a late registration of the Extended Nexthop capability
for BGP and the peer already has nexthop information stored, go
through and enable RA on the important interfaces.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Allow some debug notification when we are unable to talk
to zebra due to the connection not being there yet.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The peer->group pointer is set only if the PEER_STATUS_GROUP flag is
set in the peer. Add a protection to prevent a NULL pointer dereference.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
The ->hash_cmp and linked list ->cmp functions were sometimes
being used interchangeably and this really is not a good
thing. So let's modify the hash_cmp function pointer to return
a boolean and convert everything to use the new syntax.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When we add/remove peers we need to do a bit better job
of tracking them in the bgp->peerhash.
1) When we have the doppelganger take over, make sure the
winner is the one represented in the peerhash.
2) When creating the doppelganger, leave the current one
in place instead of blindly replacing it.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Cleanup calls where we were passing in the su for
peer creation a tiny bit.
Creating a peer from the cli will always have a conf_if *or*
a su but not both. While a doppelganger will have both.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
During peer startup there exists the possibility that both
locally and remote peers try to start communication at the
same time. In addition it is possible for local configuration
to change at the same time this is going on. When this happens
try to notice that the remote peer may be in opensent or openconfirm
and if so we need to restart the connection from both sides.
Additionally try to write a bit of extra code in peer_xfer_conn
to notice when this happens and to emit a error message to
the end user about this happening so that it can be cleaned up.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
All I can see is an unneccessary complication. If there's some purpose
here it needs to be documented...
Signed-off-by: David Lamparter <equinox@diac24.net>
Corrections so that the BGP daemon can work with the label manager properly
through a label-manager proxy. Details:
- Correction so the BGP daemon behind a proxy label manager gets the range
correctly (-I added to the BGP daemon, to set the daemon instance id)
- For the BGP case, added an asynchronous label manager connect command so
the labels get recycled in case of a BGP daemon reconnection. With this,
BGPd and LDPd would behave similarly.
Signed-off-by: F. Aragon <paco@voltanet.io>
Problem reported that some bgp and ospf json commands did not return
any json output at all if the bgp/ospf instance did not exist.
Additionally, some bgp and ospf json commands did not return any json
output if the instance existed but no neighbors were defined. This
fix makes these commands more consistent in returning empty braces for
json output and issue a message if not using json output. Additionally,
made the flag "use_json" a bool to make it consistent since previously,
it had been defined as an int, char, u_char, and bool at various places.
Ticket: CM-21040
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
While perusing CONFDATE I noticed that we had a couple
CONFDATE 201805, which we were not picking up( for other
reasons and fixed in a different PR ). But upon investigation
of these I noticed that the commits where in 201805, so these
CONFDATES should be in 2019
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Several zlog_warns were being used to tell the end
user that bgp had detected a bug. These all look like information
added during development that can be noted as debugs or logged
as an error situation.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The code for this was always there but was not kicking in because of an
incorrect dependency on is_evpn_enabled. This API attempts to locate the
default instance from bgp_master's instance list. Only the instance
currently being deleted has already been removed from the instance list
by the time bgp_delete->bgp_zebra_instance_deregister is executed.
Symptom of this bug used to show up when a default instance is deleted
and created again. In that case bgp_zebra_instance_register would not be
effective as zebra ignores the register as dup (dereg didn't happen in the
first place) so bgpd wouldn't reload already configured L2-VNIs.
root@cel-sea-03:~# net show bgp l2vpn evpn vni |grep 1000
* 1000 L2 169.253.0.11:9 6646:1000 6646:1000 vrf1
root@cel-sea-03:~# grep "router bgp" /etc/frr/frr.conf
router bgp 6646
root@cel-sea-03:~# sed -i 's/6646/6656/' /etc/frr/frr.conf
root@cel-sea-03:~# grep "router bgp" /etc/frr/frr.conf
router bgp 6656
root@cel-sea-03:~# systemctl reload frr
root@cel-sea-03:~# net show bgp l2vpn evpn vni |grep 1000
root@cel-sea-03:~#
Fix simply changes the order of dereg to make
bgp_zebra_instance_deregister actually happen (by doing it before the
default instance is removed from the master list).
Ticket: CM-21566
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
When a bgp instance is stopped, with a `no router bgp..`
make sure any timers associated with the instance are stopped
as well.
This issue was discovered when a customer issued a `no router bgp`
while a maxmed timer was operative. The max-med timer used the
`struct bgp *` as the passed in value for the thread. The
thread eventually popped after the cleanup and attempted to use
data off in lala land and crashed
Ticket: CM-21895
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This commit removes various parts of the bgpd implementation code which
are unused/useless, e.g. unused functions, unused variable
initializations, unused structs, ...
Signed-off-by: Pascal Mathis <mail@pascalmathis.com>
The current behavior of the `bgp default shutdown` command is to set the
state of all newly configured peers to shutdown. This leads to a problem
when restarting bgpd, because all peers will then be seen as newly
configured, which leads to all peers being set to shutdown after each
restart.
This behavior is undesired and not common when comparing the
implementation against other vendors. This commit moves the `bgp default
shutdown` configuration underneath the peer-group and peer
configuration, to ensure that existing peers will not be set to shutdown
after a daemon restart.
Signed-off-by: Pascal Mathis <mail@pascalmathis.com>
This commit finalizes the previous commits which introduced a generic
approach for making all BGP peer and address-family attributes
overrideable by keeping track of the configuration origin in separate
internal structures.
First of all, the test suite was greatly extended to also check the
internal data structures of peer/AF attributes, so that inheritance for
internal values like 'peer->weight' is also being checked in all cases.
This revealed some smaller issues in the implementation, which were also
fixed in this commit. The test suite now fully passes and covers all the
usual situations that should normally occur.
Signed-off-by: Pascal Mathis <mail@pascalmathis.com>
This commit introduces BGP peer-group overrides for the last set of
peer-level attrs which did not offer that feature yet. The following
attributes have been implemented: description, local-as, password and
update-source.
Each attribute, with the exception of description because it does not
offer any inheritance between peer-groups and peers, is now also setting
a peer-flag instead of just modifying the internal data structures. This
made it possible to also re-use the same implementation for attribute
overrides as already done for peer flags, AF flags and AF attrs.
The `no neighbor <neigh> description` command has been slightly changed
to support negation for no parameters, one parameter or * parameters
(LINE...). This was needed for the test suite to pass and is a small
change without any bigger impact on the CLI.
Signed-off-by: Pascal Mathis <mail@pascalmathis.com>
This commit implements BGP peer-group overrides for the timer flags,
which control the value of the hold, keepalive, advertisement-interval
and connect connect timers. It was kept separated on purpose as the
whole timer implementation is quite complex and merging this commit
together with with the other flag implementations did not seem right.
Basically three new peer flags were introduced, namely
*PEER_FLAG_ROUTEADV*, *PEER_FLAG_TIMER* and *PEER_FLAG_TIMER_CONNECT*.
The overrides work exactly the same way as they did before, but
introducing these flags made a few conditionals simpler as they no
longer had to compare internal data structures against eachother.
Last but not least, the test suite has been adjusted accordingly to test
the newly implemented flag overrides.
Signed-off-by: Pascal Mathis <mail@pascalmathis.com>
This commit cleans up some ugly leftovers from previous flag-override
implementation and refactors the AF-flag override implementation to
match the same behavior the newly added peer-flag override
implementation has.
Signed-off-by: Pascal Mathis <mail@pascalmathis.com>
The current implementation of the overrides for peer address-family
attributes suffered a bug, which caused all peer-specific attributes to
be lost when the peer was added to a peer-group which already had that
specific address-family active.
This commit extends the *peer_group2peer_config_copy_af* function to
respect overridden flags properly. Additionally, the arguments of the
macros *PEER_ATTR_INHERIT* and *PEER_STR_ATTR_INHERIT* have been
reordered to be more consistent and easy to read.
This commit also adds further test cases to the BGP peer attributes test
suite, so that this kind of error is being caught in future commits. The
missing AF-attribute *distribute-list* has also been added to the test
suite.
Signed-off-by: Pascal Mathis <mail@pascalmathis.com>
The current implementation of peer flags (e.g. shutdown, passive, ...)
only has partial support for overriding flags of a peer-group when the
peer is a member. Often settings might get lost if the user toys around
with the peer-group configuration, which can lead to disaster.
This commit introduces the same override implementation which was
previously integrated to support proper peer flag/attribute override on
the address-family level. The code is very similar and the global
attributes now use their separate state-arrays *flags_invert* and
*flags_override*.
The test suite for BGP peer attributes was extended to also check peer
global attributes, so that the newly introduced changes are covered. An
additional feature was added which allows to test an attribute with an
*interface-peer*, which can be configured by running `neighbor IF-TEST
interface`. This was introduced so that the dynamic runtime inversion of
the `extended-nexthop` flag, which is only enabled by default for
interface peers, can also be tested.
Last but not least, two small changes have been made to the current bgpd
implementation:
- The command `strict-capability-match` can now also be set on a
peer-group, it seems like this command slipped through while
implementing peer-groups in the very past.
- The macro `COND_FLAG` was introduced inside lib/zebra.h, which now
allows to either set or unset a flag based on a condition. The syntax
for using this macro is: `COND_FLAG(flag_variable, flag, condition)`
Signed-off-by: Pascal Mathis <mail@pascalmathis.com>
Crash w/ an assert if someone calls bgp_delete with a
NULL parameter as opposed to crashing when we dereference
the pointer a bit later.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When we are determining the state of a peer, we sometimes
detect that we should update the peer->su. The bgp->peer_hash
keeps a hash of peers based upon the peer->su. This requires
us to release the stored value before we re-insert it.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Cleanup the leaked ecommunity data that we may have on shutdown.
Cleanup leaked vrf name strings on shutdown.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This commit fixes all outstanding style/formatting issues as detected by
'git clang-format' or 'checkpath' for the new peer-group override
implementation, which spanned across several commits.
Signed-off-by: Pascal Mathis <mail@pascalmathis.com>
The previous commit introduced very strict unit tests which check all
three involved components (config input, config output, internal data
structures) which revealed two more bugs in the peer-group override
implementation.
This commit fixes overrides for 'allowas-in <number>' and
'unsuppress-map', which both had a small mistake/typo causing those
issues.
Signed-off-by: Pascal Mathis <mail@pascalmathis.com>