This bit of code was cut-n-pasted all over the place:
if (!bpa->installed && !bpa->install_in_progress) {
bgp_send_pbr_rule_action(bpa, NULL, true);
bgp_zebra_announce_default(bgp, nh,
bpa->afi,
bpa->table_id, true);
}
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Create a function bgp_bpr_bpa_remove that is this cut-n-paste code:
if (bpa->refcnt == 0) {
if (bpa->installed && bpa->table_id != 0) {
bgp_send_pbr_rule_action(bpa, NULL, false);
bgp_zebra_announce_default(bpa->bgp, &(bpa->nh),
AFI_IP,
bpa->table_id,
false);
bpa->installed = false;
}
}
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Deconfiguring conditional advertisement resulted in all other policy
settings on the peer getting removed due to an excessively large memset.
This also created a desync with the northbound config tree, which caused
its own set of problems...
Fix the memset to just remove the conditional advertisement config.
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
BGP_MAX_PACKET_SIZE no longer represented the absolute maximum BGP
packet size as it did before, instead it was defined as 4096 bytes,
which is the maximum unless extended message capability is negotiated,
in which case the maximum goes to 65k.
That introduced at least one bug - last_reset_cause was undersized for
extended messages, and when sending an extended message > 4096 bytes
back to a peer as part of NOTIFY data would trigger a bounds check
assert.
This patch redefines the macro to restore its previous meaning,
introduces a new macro - BGP_STANDARD_MESSAGE_MAX_PACKET_SIZE - to
represent the 4096 byte size, and renames the extended size to
BGP_EXTENDED_MESSAGE_MAX_PACKET_SIZE for consistency. Code locations
that definitely should use the small size have been updated, locations
that semantically always need whatever the max is, no matter what that
is, use BGP_MAX_PACKET_SIZE.
BGP_EXTENDED_MESSAGE_MAX_PACKET_SIZE should only be used as a constant
when storing what the negotiated max size is for use at runtime and to
define BGP_MAX_PACKET_SIZE. Unless there is a future standard that
introduces a third valid size it should not be used for any other
purpose.
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
When running valgrind there are some possible memory leaks.
These memory leaks we have absolutely no control over, mark
them as not worthy of being reported.
Finally move the valgrind suppressions file from bgpd/ to tools/
this is because this suppressions file can be used beyond bgpd
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Show alias name instead of numerical value in `show bgp <prefix>. E.g.:
```
root@exit1-debian-9:~/frr# vtysh -c 'sh run' | grep 'bgp community alias'
bgp community alias 65001:123 community-1
bgp community alias 65001:123:1 lcommunity-1
root@exit1-debian-9:~/frr#
```
```
exit1-debian-9# sh ip bgp 172.16.16.1/32
BGP routing table entry for 172.16.16.1/32, version 21
Paths: (2 available, best #2, table default)
Advertised to non peer-group peers:
65030
192.168.0.2 from home-spine1.donatas.net(192.168.0.2) (172.16.16.1)
Origin incomplete, metric 0, valid, external, best (Neighbor IP)
Community: 65001:12 65001:13 community-1 65001:65534
Large Community: lcommunity-1 65001:123:2
Last update: Fri Apr 16 12:51:27 2021
exit1-debian-9#
```
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Prevent another call path that uses uninited data in
bgp_pbr.c
This was found through more clang sa runs.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
vtysh will return an informational message to the user that changing any
graceful-shutdown related parameter will require a peer reset. This is should
not be treated as an error message (resulting in a return code of 1) but
rather as a simple information to the user.
This fixes GitHub issue https://github.com/FRRouting/frr/issues/8403
$ vtysh -c configure -c 'router bgp 100' -c 'bgp graceful-restart'
Graceful restart configuration changed, reset all peers to take effect
$ echo $?
0
Signed-off-by: Christian Poessinger <christian@poessinger.com>
For whatever reason the dampening show run code was outside the normal
loop of code that handles the afi/safi portion. consolidate it into
the rest of the normal code.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
bgp_config_write_peer_af already checks to see if we are
a dynamic peer. No need to do so right before we call it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When we are cycling through all peers and looking for
dampening data to dump, do not consider non-configed
peers( dopplegangers ).
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
frr(config)# bgp large-community-list expanded com1 permit .*
% Malformed community-list value
The check to validate large-community against UINT_MAX is added for
both standard and expanded community. But however it needs to be
validated only for standard community.
Signed-off-by: Prerana-GB <prerana@vmware.com>
Problem Statement:
=================
In scale setup BGP sessions start flapping.
RCA:
====
In virtualized environment there are multiple places where
MTU need to be set. If there are some places were MTU is not set
properly then there is chances that BGP packets get fragmented,
in scale setup this will lead to BGP session flap.
Fix:
====
A new tcp option is provided as part of this implementation,
which can be configured per neighbor and helps to set the TCP
max segment size. User need to derive the path MTU between the BGP
neighbors and set that value as part of tcp-mss setting.
1. CLI Configuration:
[no] neighbor <A.B.C.D|X:X::X:X|WORD> tcp-mss (1-65535)
2. Running config
frr# show running-config
router bgp 100
neighbor 198.51.100.2 tcp-mss 150 => new entry
neighbor 2001:DB8::2 tcp-mss 400 => new entry
3. Show command
frr# show bgp neighbors 198.51.100.2
BGP neighbor is 198.51.100.2, remote AS 100, local AS 100, internal link
Hostname: frr
Configured tcp-mss is 150, synced tcp-mss is 138 => new display
4. Show command json output
frr# show bgp neighbors 2001:DB8::2 json
{
"2001:DB8::2":{
"remoteAs":100,
"bgpTimerKeepAliveIntervalMsecs":60000,
"bgpTcpMssConfigured":400, => new entry
"bgpTcpMssSynced":388, => new entry
Risk:
=====
Low - This is a config driven feature and it sets the max segment
size for the TCP session between BGP peers.
Tests Executed:
===============
Have done manual testing with three router topology.
1. Executed basic config and un config scenarios
2. Verified if the config is updated in running config
during config and no config operation
3. Verified the show command output in both CLI format and
JSON format.
4. Verified if TCP SYN messages carry the max segment size
in their initial packets.
5. Verified the behaviour during clear bgp session.
6. done packet capture to see if the new segment size
takes effect.
Signed-off-by: Abhinay Ramesh <rabhinay@vmware.com>
Delay setting local data about a remote peer until after BGP
has decided to allow an open connection to proceed.
Modifying local peer data structures based upon what is
received from a peer should not be done until after BGP
has decided that the open is allowed to proceed.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
FRR in thread.c clears the passed in double pointer when
we pull it off the ready queue and pass it back to
the calling function via thread_fetch().
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
As pointed out on code review of BGP extended messages, increasing the
maximum BGP message size has the consequence of growing the dynamically
sized stack buffer up to 650K. While unlikely to exceed modern stack
sizes it is still unreasonably large. Remedy this with a heap buffer.
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
If a user issues the following commands:
```
router bgp 65000 vrf red
router bgp 65000 view red
```
bgpd ends up having NB config inconsistent with actual data.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
When we're trying to create the config for already existing router and
the AS number or instance type mismatches, we're returning the
inconsistency error and don't set the NB entry.
Any subsequent command that configures this router will crash because
every command relies on the existence of the NB entry.
Let's store the entry even when there's a mismatch to prevent the crash.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Use unsigned value for all RA requests to Zebra
- encoding signed int as unsigned is bad practice
- RA interval is never, and should never be, negative
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
This is always a 16 bit unsigned value.
- signed int is the wrong type to use
- encoding a signed int as a uint32 is bad practice
- decoding a signed int encoded as a uint32 into a uint16 is bad
practice
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
When VNI RD changes, EAD/EVI routes with old RD should be withdrawn from
the global routing table and EAD/EVI routes in the VNI should be
advertised with the new RD.
Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
There are multiple problems:
- commit ef7c53e2 introduced a new return value 2 which broke things,
because a lot of code treats non-zero return as an error,
- there is an incorrect error returned when AS number mismatches.
This commit fixes both.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>