Data Structures, function declaration and Macros forSignalling
from BGPD to ZEBRA to enable or disable GR feature in ZEBRA
depending on bgp per peer gr configuration.
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
*After a restarting router comes up and the bgp session is
successfully established with the peer. If the restarting
router doesn’t have any route to send, it send EOR to
the peer immediately before receiving updates from its peers.
*Instead the restarting router should send EOR, if the
selection deferral timer is not running OR count of eor received
and eor required are matches then send EOR.
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
BGP disable EOR sending is a useful command for testing various
scenarios of BGP graceful restart.
* Added the hidden CLI command : bgp graceful-restart disable-eor
* The CLI will not be displayed in "show running-config" and will not
be stored in configuration file.
* When enabled, EOR will not be sent to peer
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
Signed-off-by: Soman K S <somanks@vmware.com>
When the peer router's gr mode had changed from helper/restart
to disable. The local bgp gr router should reset the peer
router's restart-time stored.
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
bgp tcp connection.
When the BGP peer is configured between two bgp routes both routers would create
peer structure , when they receive each other’s open message. In this event both
speakers, open duplicate TCP sessions and send OPEN messages on each socket
simultaneously, the BGP Identifier is used to resolve which socket should be closed.
If BGP GR is enabled the old tcp session is dumped and the new session is retained.
So while this transfer of connection is happening, if all the bgp gr config
is not migrated to the new connection, the new bgp gr mode will never get applied.
Fix Summary:
1. Replicate GR configuration from the old session to the new session in bgp_accept().
2. Replicate GR configuration from stub to full-fledged peer in bgp_establish().
3. Disable all NSF flags, clear stale routes (if present), stop restart & stale timers
(if they are running) when the bgp GR mode is changed to “Disabled”.
4. Disable R-bit in cap, if it is not set the received open message.
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
BGP GR Neighbor mode is showing the default string as “NotRecieved”,
as the bgp gr neighbour capability was not processed,
since the local mode is “Disable”.
However now it would be changed to “NotApplicable”.
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
& GR is enabled.
When GR with deferral is enabled and connected routes are
distributed then in one race condition route node gets added
in to both deferred queue and work queue. If deferred queue
gets processed first then it ends up delete only flag while
leaving the entry in the work queue as it is. When a new update
comes for the same route node next time from peer then it hits
assert. Assert check is added to ensure we don’t add to work queue
again while it is already present.
So, check before adding in to deferred queue if it is already present
in work queue and bail if so.
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
* Changing GR mode on a router needs a session reset from the
SAME router to negotiate new GR capability.
* The present GR implementation needs a session reset after every
new BGP GR mode change.
* When BGP session reset happens due to sending or receiving BGP
notification after changing BGP GR mode, there is no need of
explicit session reset.
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
* BGP GR Neighbour mode in show command would show as
“NotApplicable”, when local mode is “Disable”. As the bgp
gr neighbour capability was not processed, since the local mode
is “Disable”.
* Minor changes in show Selection Deferral Time.
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
* Selection Deferral Timer for Graceful Restart.
* Added selection deferral timer handling function.
* Route marking as selection defer when update message is received.
* Staggered processing of routes which are pending best selection.
* Fix for multi-path test case.
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
and DS.
* Added config commands and data structures for deferral timer
configuration and processing.
Cmd : bgp graceful-restart select-defer-time (0-3600)
Cmd : no bgp graceful-restart select-defertime (0-3600)
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
Signed-off-by: Soman K S <somanks@vmware.com>
* Added new show command to show the graceful restart
information for each neighbor.
Cmd: show bgp [<ipv4|ipv6>] neighbors [<A.B.C.D|X:X::X:X|WORD>] graceful-restart
* Changes to show neighbors commands for displaying
graceful restart information.
Cmd :show [ip] bgp [<view|vrf> VIEWVRFNAME] [<ipv4|ipv6>] neighbors [<A.B.C.D|X:X::X:X|
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
* Changes to the capability sending function to advertise
graceful restart capability in the bgp OPEN message.
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
* Added FSM for peer and global configuration for graceful restart
* Added debug option BGP_GRACEFUL_RESTART for logs specific to
graceful restart processing
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
Duplicated domain name capability messages cause memory leak. The amount
of leaked memory is proportional to the size of the duplicated
capabilities. This bug was introduced in 2015.
To hit this, a BGP OPEN message must contain multiple FQDN capabilities.
Memory is leaked when the hostname portion of the capability is of
length 0, but the domainname portion is not, for any of the duplicated
capabilities beyond the first one.
https://tools.ietf.org/html/draft-walton-bgp-hostname-capability-00
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
When withdrawing addpaths, adj_lookup was called to find the path that
needed to be withdrawn. It would lookup in the RB tree based on subgroup
pointer alone, often find the path with the wrong addpath ID, and return
null. Only the path highest in the tree sent to the subgroup could be
found, thus withdrawn.
Adding the addpath ID to the sort criteria for the RB tree allows us to
simplify the logic for adj_lookup, and address this problem. We are able
to remove the logic around non-addpath subgroups because the addpath ID
is consistently 0 for non-addpath adj_outs, so special logic to skip
matching the addpath ID isn't required. (As a side note, addpath will
also never use ID 0, so there won't be any ambiguity when looking at the
structure content.)
Signed-off-by: Mitchell Skiba <mskiba@amazon.com>
The vrrpd one conflicts with the standalone vrrpd package; also we're
installing daemons to /usr/lib/frr on some systems so they're not on
PATH.
Signed-off-by: David Lamparter <equinox@diac24.net>
bgpd already supports BGP Prefix-SID path attribute and
there are some sub-types of Prefix-SID path attribute.
This commits makes bgpd to support additional sub-types.
sub-Type-4 and sub-Type-5 for construct the VPNv4 SRv6 backend
with vpnv4-unicast address family.
This path attributes is already supported by Ciscos IOS-XR and NX-OS.
Prefix-SID sub-Type-4 and sub-Type-5 is defined on following
IETF-drafts.
Supports(A-part-of):
- https://tools.ietf.org/html/draft-dawra-idr-srv6-vpn-04
- https://tools.ietf.org/html/draft-dawra-idr-srv6-vpn-05
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
For evpn routes, nexthop and RMAC fileds are synced
in route add to zebra.
In case of EVPN routes display RMAC field in route add
debug log.
Reviewed By:CCR-9381
Testing Done:
BGP: nhop [1]: 27.0.0.11 if 30 VRF 26 RMAC 00:02:00:00:00:2e
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
This commit makes bgpd to support VPNv4's extended
nexthop capability for bgp-capability negotiation
when BGP open messaging.
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
uint8_t * cannot be cast to uint32_t * unless the
pointed-to address is aligned according to uint32_t's
alignment rules. And it usually is not.
Signed-off-by: Santosh P K <sapk@vmware.com>
With this change, we are able to set attributes via route-map to the default
route. It's useful in cases where we have two or more spines and we want to
prefer one router over others for leaves. This simplifies configuration instead
of using 'network 0.0.0.0/0' or 'ip route 0.0.0.0/0 ...' and 'redistribute
static' combination.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
uint8_t * cannot be cast to uint32_t * unless the pointed-to address is
aligned according to uint32_t's alignment rules. And it usually is not.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
advertise pip running configuration should
display ip followed by mac parameters value as defined
in cli signature.
advertise-pip is enabled by default, when displaying the
running configuration, there is '\n' added after
ip and mac parameters which was not guarded around
the non-default parameters.
Currently, for every bgp vrf instance it ends up
displaying l2vpn address-family section due to
unguarded newline.
running config:
router bgp 6004 vrf vrf1
!
address-family l2vpn evpn
exit-address-family
!
Ticket:CM-26964
Testing Done:
With fix when only 'router bgp 6004 vrf vrf1' configured,
running config looks like:
!
router bgp 6004 vrf vrf1
!
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
This commit is about #5629 's issue.
Before this commit, bgpd creates format string of
bgp-route-distinguisher as int32, but correctly format
is uint32. current bgpd's sh-run-cli generate int32 rd,
so if user sets the rd as 1:4294967295(0x1:0xffffffff),
sh-run cli generates 1: -1 as running-config. This
commit fix that issue.
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
Guess what - for a bounds check to work, it has to happen *before* you
read the data. We were trusting the attribute field received in a prefix
SID attribute and then checking if it was correct afterwards, but if was
wrong we'd crash before that.
This fixes the problem, and adds additional paranoid bounds checks.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
We should send a NOTIFICATION message with the Error Code Finite State
Machine Error if we receive NOTIFICATION in OpenSent state
as defined in https://tools.ietf.org/html/rfc4271#section-8.2.2
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
bgp nexthop cache update triggers RA for global ipv6
nexthop update.
In case of blackhole route type the outgoing interface
information is NULL which leads to bgpd crash.
Skip sending RA for blackhole nexthop type.
Ticket:CM-27299
Reviewed By:
Testing Done:
Configure bgp neighbor over global ipv6 address.
Configure static blackhole route with prefix includes
connected ipv6 global address.
Upon link flap, zebra sends nexthop update to bgp.
Bgp nexthop cache skips sending RA for blackhole nexthop type.
router bgp 65002
bgp router-id 91.189.93.190
...
neighbor 2001:67c:1360::b peer-group internal
static route:
ipv6 route 2001:67c:1360::/48 Null0 254
iface rowlink.4010
address 91.189.93.190/32
address 2001:67c:1360::a/128
Trigger ifdown rowlink.4010; ifup rowlink.4010
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
It's quite confusing when you see this:
```
exit1-debian-9(config-router)# bgp listen
listen Configure BGP defaults
```
And:
```
exit1-debian-9(config-router)# no bgp listen
listen unset maximum number of BGP Dynamic Neighbors that can be created
```
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
When passing a v4 multicast route to a peer send
the v4 nexthop as a preferred methodology.
Fixes: #5582
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Fixes:
```
exit1-debian-9(config-router)# no bgp listen range 192.168.10.0/24 peer-group TEST
% Peer-group does not exist
exit1-debian-9(config-router)#
```
Closes https://github.com/FRRouting/frr/issues/5570
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
'NOTIFICATION' string in this message incorrectly implies a BGP
Notification message was the cause of this log. Removing it to
reduce confusion and replacing with function name.
Signed-off-by: Trey Aspelund <taspelund@cumulusnetworks.com>
Before it was:
```
exit1-debian-9# show ip bgp regexp ^200a
Invalid character in as-path access-list ^200a
```
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
rfapi_descriptor_rfp_utils.c is already built into libbgp.a and these
include paths have no effect at all.
Signed-off-by: David Lamparter <equinox@diac24.net>
This should keep backward compatibility when bgp show-hostname is
enabled/disabled.
Also show the real originator IP instead of showing fqdn of the route
reflector.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
To keep the calling code agnostic of the DNS resolver libary used, pass
a strerror-style string instead of a status code that would need extra
handling.
Signed-off-by: David Lamparter <equinox@diac24.net>
libc-ares doesn't do IP literals, so we have to do that before running
off to do DNS. Since this isn't BMP specific, move to lib/ so NHRP can
benefit too.
Signed-off-by: David Lamparter <equinox@diac24.net>
Problem: BGP peer pointer is present in keepalive hash table
even when socket has been closed in some race condition.
When keepalive tries to access this peer it asserts.
RCA: Below sequence of events causing assert.
1. Config node peer has went down due to TCP reset
it's FD has been set to -1.
2. Doppelganger peer goes to established state and it has
been added to peer hash table for keepalive when it was
in openconfirm state.
3. Config node parameters including FD are exchanged with
doppelganger. Doppelganger will not have FD -1.
4. Doppelganger will be deleted as part of this it will
remove it from the keepalive peer hash table.
5. While removing from hash table it tries to acquire lock.
6. During this time keepalive thread has the lock and in
a loop trying to send keepalive for peers in hash table.
7. It tries to send keepalive for doppelganger peer with fd
set to -1 and asserts.
Signed-off-by: Santosh P K <sapk@vmware.com>
* Move VNC interning to the appropriate spot
* Use existing bgp_attr_flush_encap to free encap sets
* Assert that refcounts are correct before exiting to keep the demons
contained in their fiery prison
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Use a per-nexthop flag to indicate the presence of labels; add
some utility zapi encode/decode apis for nexthops; use the zapi
apis more consistently.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
This moves all the DFLT_BGP_* stuff over to the new defaults mechanism.
bgp_timers_nondefault() added to get better file-scoping.
v2: moved everything into bgp_vty.c so that the core BGP code is
independent of the CLI-specific defaults. This should make the future
northbound conversion easier.
Signed-off-by: David Lamparter <equinox@diac24.net>
There's no good reason to have this in bgpd.c; it's just there
historically. Move it to bgp_vty.c where it makes more sense.
Signed-off-by: David Lamparter <equinox@diac24.net>
Add a bit of code to allow hostname lookup failure to
not stall bmp communication.
Fixes: #5382
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The function bgp_table_range_lookup attempts to walk down
the table node data structures to find a list of matching
nodes. We need to guard against the current node from
not matching and not having anything in the child nodes.
Add a bit of code to guard against this.
Traceback that lead me down this path:
Nov 24 12:22:38 frr bgpd[20257]: Received signal 11 at 1574616158 (si_addr 0x2, PC 0x46cdc3); aborting...
Nov 24 12:22:38 frr bgpd[20257]: Backtrace for 11 stack frames:
Nov 24 12:22:38 frr bgpd[20257]: /lib64/libfrr.so.0(zlog_backtrace_sigsafe+0x67) [0x7fd1ad445957]
Nov 24 12:22:38 frr bgpd[20257]: /lib64/libfrr.so.0(zlog_signal+0x113) [0x7fd1ad445db3]1ad445957]
Nov 24 12:22:38 frr bgpd[20257]: /lib64/libfrr.so.0(+0x70e65) [0x7fd1ad465e65]ad445db3]1ad445957]
Nov 24 12:22:38 frr bgpd[20257]: /lib64/libpthread.so.0(+0xf5f0) [0x7fd1abd605f0]45db3]1ad445957]
Nov 24 12:22:38 frr bgpd[20257]: /usr/lib/frr/bgpd(bgp_table_range_lookup+0x63) [0x46cdc3]445957]
Nov 24 12:22:38 frr bgpd[20257]: /usr/lib64/frr/modules/bgpd_rpki.so(+0x4f0d) [0x7fd1a934ff0d]57]
Nov 24 12:22:38 frr bgpd[20257]: /lib64/libfrr.so.0(thread_call+0x60) [0x7fd1ad4736e0]934ff0d]57]
Nov 24 12:22:38 frr bgpd[20257]: /lib64/libfrr.so.0(frr_run+0x128) [0x7fd1ad443ab8]e0]934ff0d]57]
Nov 24 12:22:38 frr bgpd[20257]: /usr/lib/frr/bgpd(main+0x2e3) [0x41c043]1ad443ab8]e0]934ff0d]57]
Nov 24 12:22:38 frr bgpd[20257]: /lib64/libc.so.6(__libc_start_main+0xf5) [0x7fd1ab9a5505]f0d]57]
Nov 24 12:22:38 frr bgpd[20257]: /usr/lib/frr/bgpd() [0x41d9bb]main+0xf5) [0x7fd1ab9a5505]f0d]57]
Nov 24 12:22:38 frr bgpd[20257]: in thread bgpd_sync_callback scheduled from bgpd/bgp_rpki.c:351#012; aborting...
Nov 24 12:22:38 frr watchfrr[6779]: [EC 268435457] bgpd state -> down : read returned EOF
Nov 24 12:22:38 frr zebra[5952]: [EC 4043309116] Client 'bgp' encountered an error and is shutting down.
Nov 24 12:22:38 frr zebra[5952]: zebra/zebra_ptm.c:1345 failed to find process pid registration
Nov 24 12:22:38 frr zebra[5952]: client 15 disconnected. 0 bgp routes removed from the rib
I am not really 100% sure what we are really trying to do with this function, but we must
guard against child nodes not having any data.
Fixes: #5440
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When dumping a large bit of table data via bgp_show_table
and if there is no information to display for a particular
`struct bgp_node *` the data allocated via json_object_new_array()
is leaked. Not a big deal on small tables but if you have a full
bgp feed and issue a show command that does not match any of
the route nodes ( say `vtysh -c "show bgp ipv4 large-community-list FOO"`)
then we will leak memory.
Before code change and issuing the above show bgp large-community-list command 15-20 times:
Memory statistics for bgpd:
System allocator statistics:
Total heap allocated: > 2GB
Holding block headers: 0 bytes
Used small blocks: 0 bytes
Used ordinary blocks: > 2GB
Free small blocks: 31 MiB
Free ordinary blocks: 616 KiB
Ordinary blocks: 0
Small blocks: 0
Holding blocks: 0
After:
Memory statistics for bgpd:
System allocator statistics:
Total heap allocated: 924 MiB
Holding block headers: 0 bytes
Used small blocks: 0 bytes
Used ordinary blocks: 558 MiB
Free small blocks: 26 MiB
Free ordinary blocks: 340 MiB
Ordinary blocks: 0
Small blocks: 0
Holding blocks: 0
Please note the 340mb of free ordinary blocks is from the fact I issued a
`show bgp ipv4 uni json` command and generated a large amount of data.
Fixes: #5445
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Early exits without appropriate cleanup were causing obscure double
frees and other issues later on in the attribute parsing code. If we
return anything except a hard attribute parse error, we have cleanup and
refcounts to manage.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Without with fix we can't delete large-community-list using
no bgp large-community-list standard WORD, but no bgp large-community-list WORD
Let's keep this identical what we have with expanded lists as well.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
This patch allows using sequence numbers for community lists. We already have
this for prefix-lists and access-lists.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
In rare situations, the local route in a VNI may not get selected as the
best route. One situation is during a race between bgp and zebra which
was addressed in a prior commit. This change addresses another situation
where due to a change of tunnel IP, it is possible that a received route
may be selected as the best route if the path selection needs to take
next hop IPs into consideration. This is a pretty convoluted scenario,
but the code should handle it and delete and withdraw the local route
as well as (re)install the received route.
Ticket: CM-24114
Reviewed By: CCR-9487
Testing Done:
1. Manual tests - note, problem is not readily reproducible
2. evpn-smoke - results documented in the ticket
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
If a peer advertised capability addpath in their OPEN, but sent us an
UPDATE without an ADDPATH, we overflow a heap buffer.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
This changeset follows the PR
https://github.com/FRRouting/frr/pull/5334
Above PR adds nexthop tracking support for EVPN RT-5 nexthops.
This route is marked VALID only if the BGP route has a valid nexthop.
If the EVPN peer is an EBGP pee and "disable_connected_check" flag is not set,
"connected" check is performed for the EVPN nexthop.
But, usually EVPN nexthop is not the BGP peering address, but the VTEP address.
Also, NEXTHOP_UNCHANGED flag is enabled by default for EVPN.
As a result, in a common deployment for EVPN, EVPN nexthop is not connected.
Thus, adding a fix to remove the "connected" check for EVPN nexthops.
Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
Instead of CMD_WARNING, use CMD_WARNING_CONFIG_FAILED
for any mis-configuration scenario.
Testing Done:
TOR(config)# router bgp 5548
TOR(config-router)# address-family l2vpn evpn
TOR(config-router-af)# no advertise-pip
This command is supported under L3VNI BGP EVPN VRF
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
when a pip is disabled or mac-vlan is not present
use anycast MAC as RMAC value.
Ticket:CM-26923
Reviewed By:CCR-9417
Testing Done:
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
For self type-2 routes, do not assign system-rmac
as attribute RMAC value if advertise-pip is disable
or macvlan is not present.
Ticket:CM-26923
Reviewed By:CCR-9397
Testing Done:
pip is disabled under bgp vrf2 instance.
Trigger frr-restart.
Before fix:
*> [2]:[0]:[48]:[00:02:00:00:00:2e]:[32]:[45.0.4.4]
36.0.0.11 32768 i
ET:8 RT:5546:1004 RT:5546:4002 Rmac:00:02:00:00:00:2e
After fix:
*> [2]:[0]:[48]:[00:02:00:00:00:2e]:[32]:[45.0.4.4]
36.0.0.11 32768 i
ET:8 RT:5546:1004 RT:5546:4002 Rmac:44:38:39:ff:ff:01
TOR# ifquery vlan1004
auto vlan1004
iface vlan1004
address 45.0.4.4/24
vlan-id 1004
vrf vrf2
VNI: 4002 (known to the kernel)
Type: L3
Tenant VRF: vrf2
RD: 45.0.6.4:3
Originator IP: 36.0.0.11
Advertise-pip: Yes
System-IP: 27.0.0.11
System-MAC: 00:02:00:00:00:2e
Router-MAC: 44:38:39:ff:ff:01
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
By default announct Self Type-2 routes with
system IP as nexthop and system MAC as
nexthop.
An API to check type-2 is self route via
checking ipv4/ipv6 address from connected interfaces list.
An API to extract RMAC and nexthop for type-2
routes based on advertise-svi-ip knob is enabled.
When advertise-pip is enabled/disabled, trigger type-2
route update. For self type-2 routes to use
anycast or individual (rmac, nexthop) addresses.
Ticket:CM-26190
Reviewed By:
Testing Done:
Enable 'advertise-svi-ip' knob in bgp default instance.
the vrf instance svi ip is advertised with nexthop
as default instance router-id and RMAC as system MAC.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
In L3VNI add callback parse, vrr rmac value.
For non-zero vrr mac value, use it as anycast RMAC
and svi mac as individual rmac value.
If advertise-pip is disable or vrr rmac is not present
use svi mac as anycast rmac value for all routes.
Ticket:CM-26190
Reviewed By:
Testing Done:
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Evpn Primary IP advertisement feature uses
individual system IP and system MAC for prefix (type-5)
and self type-2 routes.
The PIP knob is enabled by default for bgp vrf instance.
Configuration CLI for enable/disable PIP feature knob.
User can configure PIP system IP and MAC to retain as
permanent values.
For the PIP IP, the default behavior is to accept bgp default
instance's router-id. When the default instance router-id change,
reflect PIP IP assignment.
Reflect type-5 to use system-IP and system MAC as nexthop and RMAC
values.
Ticket:CM-26190
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Spaces were not being accounted for in the heap buffer sizing, leading
to a heap buffer overflow when encoding large communities to their
string representations.
This patch also uses safer functions to do the encoding instead of
pointer math.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Tons of insane just-so pointer math here where it is not needed. This is
too smart. Use safer methods.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
The half and reuse variables can never be 1 but the
SA systems we have do not know this and think it is possible.
Provide the kick in the snarples that the SA needs to know
this is not true.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Copy paste leads to invalid read of 1 byte off the heap when converting
extended community attributes into strings.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Bug: While preparing the JSON output, 2 loops are traversed: the outer loop
loops through RD, and the inner loop loops through the prefixes of that RD.
We hit the bug (printing blank RD and stale or null prefix info) when the inner
loop exits with nothing to print, (without allocating json_routes) and the outer
loop still tries to attach it to the parent, json_adv. Thus, we have
key=<BLANK RD>, value=<junk or prev json_routes>
The fix: Avoid attaching json_routes to the parent json if there
is nothing to print.
Signed-off-by: Lakshman Krishnamoorthy <lkrishnamoor@vmware.com>
According to RFC 8277 IPv4 LU NLRI can be withdrawn using label 0x000000.
This RFC updates RFC3101 where it should be done only with 0x800000 label value.
Juniper implementation sets value 0x000000 when prefix is being withdrawn.
Page 12 RFC8277 states:
[RFC3107] also made it possible to withdraw a binding without
specifying the label explicitly, by setting the Compatibility field
to 0x800000. However, some implementations set it to 0x000000. In
order to ensure backwards compatibility, it is RECOMMENDED by this
document that the Compatibility field be set to 0x800000, but it is
REQUIRED that it be ignored upon reception.
Now FRR drops BGP session when receives such BGP update.
Signed-off-by: Aleksandr Klimenko <v00lk@bk.ru>
* IPv6 routes received via a ibgp session with one of its own interface as
nexthop are getting installed in the BGP table.
*A common table to be implemented should take cares of both
ipv4 and ipv6 connected addresses.
Signed-off-by: Biswajit Sadhu sadhub@vmware.com
The command `show ip bgp ipv4|ipv6 vpn neighbors <ip> prefix-counts`
caused a segfault, because the 2-level routing was not accounted for.
Signed-off-by: Juergen Werner <juergen@opensourcerouting.org>
The CLI was not parsing prefix format of ipv6 address.
This fixes the bug: https://github.com/FRRouting/frr/issues/5322
Signed-off-by: Lakshman Krishnamoorthy <lkrishnamoor@vmware.com>
The total_peercount table was created as a short cut for queries about
if addpath was enabled at all on a particular afi/safi. However, the
values weren't updated, so BGP would act as if addpath wasn't enabled
when determining if updates should be sent out. The error in behavior
was much more noticeable in tx-all than best-per-as, since changes in
what is sent by best-per-as would often trigger updates even if addpath
wasn't enabled.
Signed-off-by: Mitchell Skiba <mskiba@amazon.com>
Problem statement:
When IPv4/IPv6 prefixes are received in BGP, bgp_update function registers the
nexthop of the route with nexthop tracking module. The BGP route is marked as
valid only if the nexthop is resolved.
Even for EVPN RT-5, route should be marked as valid only if the the nexthop is
resolvable.
Code changes:
1. Add nexthop of EVPN RT-5 for nexthop tracking. Route will be marked as valid
only if the nexthop is resolved.
2. Only the valid EVPN routes are imported to the vrf.
3. When nht update is received in BGP, make sure that the EVPN routes are
imported/unimported based on the route becomes valid/invalid.
Testcases:
1. At rtr-1, advertise EVPN RT-5 with a nexthop 10.100.0.2.
10.100.0.2 is resolved at rtr-2 in default vrf.
At rtr-2, remote EVPN RT-5 should be marked as valid and should be imported into
vrfs.
2. Make the nexthop 10.100.0.2 unreachable at rtr-2
Remote EVPN RT-5 should be marked as invalid and should be unimported from the
vrfs. As this code change deals with EVPN type-5 routes only, other EVPN routes
should be valid.
3. At rtr-2, add a static route to make nexthop 10.100.0.2 reachable.
EVPN RT-5 should again become valid and should be imported into the vrfs.
Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
Before the fix:
2019/11/14 19:52:21 BGP: peer 192.168.2.5 deleted from subgroup s4peer
cnt 0 - missing space after s4 before peer
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
With this code change, we can now filter evpn routes based on RD using the
match statement: "match evpn rd XX"
Signed-off-by: Lakshman Krishnamoorthy <lkrishnamoor@vmware.com>
The bug:
As part of displaying advertised routes to a peer, in the outer loop, we
iterate through all prefixes in the evpn table. In the inner loop,
we iterate through adj_out of each prefix.
If a prefix which is present in the evpn table is not advertised to a peer,
its corresponding attr == NULL. Checking for this condition is the fix.
Signed-off-by: Lakshman Krishnamoorthy <lkrishnamoor@vmware.com>
There is a memory leak of the bgp node (route node)
in vni table while processing evpn remote route(s).
During the remote evpn route processing, a new route
is created in per vni route table, the refcount for
the route node is incremented twice. First refcount
is incremented during the node creation and the second
one when the bgp_info_add is added.
Post evpn route creation, the bgp node refcount needs
to be decremented.
Ticket:CM-26898,CM-26838,CM-27169
Reviewed By:CCR-9474
Testing Done:
In EVPN topology send 1K MAC routes then check the memory footprint
at the remote VTEP before sending 1K type-2 routes
and after flushing/withdrawal of the routes.
Before fix:
-----------
Initial memory footprint:
root@TOR1:~# vtysh -c "show memory" | grep "Hash Bucket \|BGP node \|BGP route"
Hash Bucket : 2008 32
BGP node : 182 152
BGP route : 96 112
With 1K MAC (type-2 routes)
root@TOR1:~# vtysh -c "show memory" | grep "Hash Bucket \|BGP node \|BGP route"
Hash Bucket : 6008 32
BGP node : 4182 152
BGP route : 2096 112
After cleaning up 1K MAC entries from source VTEP which triggers BGP withdraw
at the remote VTEP.
root@TOR1:~# vtysh -c "show memory" | grep "Hash Bucket \|BGP node \|BGP route"
Hash Bucket : 4008 32
BGP node : 2182 152 <-- Here 2K delta from initial count.
BGP route : 96 112
With fix:
---------
After 1K MAC entries cleaned up at the remote VTEP, the memory footprint
(BGP Node and Hash Bucket count) is equilibrium to start of the test.
root@TOR1:~# vtysh -c "show memory" | grep "Hash Bucket \|BGP node \|BGP route"
Hash Bucket : 2008 32
BGP node : 182 152
BGP route : 96 112
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
PR 5118 introduce additional (prepend) keywords
like 'ip' to Route Distinguisher output which
breaks existing evpn route show commands parsing.
Restore to original behavior.
Testing Done:
vtysh -c 'show bgp l2vpn evpn route'
Before fix:
Route Distinguisher: ip 27.0.0.15:44
Post fix:
Route Distinguisher: 27.0.0.15:44
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>