libc-ares doesn't do IP literals, so we have to do that before running
off to do DNS. Since this isn't BMP specific, move to lib/ so NHRP can
benefit too.
Signed-off-by: David Lamparter <equinox@diac24.net>
Problem: BGP peer pointer is present in keepalive hash table
even when socket has been closed in some race condition.
When keepalive tries to access this peer it asserts.
RCA: Below sequence of events causing assert.
1. Config node peer has went down due to TCP reset
it's FD has been set to -1.
2. Doppelganger peer goes to established state and it has
been added to peer hash table for keepalive when it was
in openconfirm state.
3. Config node parameters including FD are exchanged with
doppelganger. Doppelganger will not have FD -1.
4. Doppelganger will be deleted as part of this it will
remove it from the keepalive peer hash table.
5. While removing from hash table it tries to acquire lock.
6. During this time keepalive thread has the lock and in
a loop trying to send keepalive for peers in hash table.
7. It tries to send keepalive for doppelganger peer with fd
set to -1 and asserts.
Signed-off-by: Santosh P K <sapk@vmware.com>
* Move VNC interning to the appropriate spot
* Use existing bgp_attr_flush_encap to free encap sets
* Assert that refcounts are correct before exiting to keep the demons
contained in their fiery prison
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Use a per-nexthop flag to indicate the presence of labels; add
some utility zapi encode/decode apis for nexthops; use the zapi
apis more consistently.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
This moves all the DFLT_BGP_* stuff over to the new defaults mechanism.
bgp_timers_nondefault() added to get better file-scoping.
v2: moved everything into bgp_vty.c so that the core BGP code is
independent of the CLI-specific defaults. This should make the future
northbound conversion easier.
Signed-off-by: David Lamparter <equinox@diac24.net>
There's no good reason to have this in bgpd.c; it's just there
historically. Move it to bgp_vty.c where it makes more sense.
Signed-off-by: David Lamparter <equinox@diac24.net>
Add a bit of code to allow hostname lookup failure to
not stall bmp communication.
Fixes: #5382
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The function bgp_table_range_lookup attempts to walk down
the table node data structures to find a list of matching
nodes. We need to guard against the current node from
not matching and not having anything in the child nodes.
Add a bit of code to guard against this.
Traceback that lead me down this path:
Nov 24 12:22:38 frr bgpd[20257]: Received signal 11 at 1574616158 (si_addr 0x2, PC 0x46cdc3); aborting...
Nov 24 12:22:38 frr bgpd[20257]: Backtrace for 11 stack frames:
Nov 24 12:22:38 frr bgpd[20257]: /lib64/libfrr.so.0(zlog_backtrace_sigsafe+0x67) [0x7fd1ad445957]
Nov 24 12:22:38 frr bgpd[20257]: /lib64/libfrr.so.0(zlog_signal+0x113) [0x7fd1ad445db3]1ad445957]
Nov 24 12:22:38 frr bgpd[20257]: /lib64/libfrr.so.0(+0x70e65) [0x7fd1ad465e65]ad445db3]1ad445957]
Nov 24 12:22:38 frr bgpd[20257]: /lib64/libpthread.so.0(+0xf5f0) [0x7fd1abd605f0]45db3]1ad445957]
Nov 24 12:22:38 frr bgpd[20257]: /usr/lib/frr/bgpd(bgp_table_range_lookup+0x63) [0x46cdc3]445957]
Nov 24 12:22:38 frr bgpd[20257]: /usr/lib64/frr/modules/bgpd_rpki.so(+0x4f0d) [0x7fd1a934ff0d]57]
Nov 24 12:22:38 frr bgpd[20257]: /lib64/libfrr.so.0(thread_call+0x60) [0x7fd1ad4736e0]934ff0d]57]
Nov 24 12:22:38 frr bgpd[20257]: /lib64/libfrr.so.0(frr_run+0x128) [0x7fd1ad443ab8]e0]934ff0d]57]
Nov 24 12:22:38 frr bgpd[20257]: /usr/lib/frr/bgpd(main+0x2e3) [0x41c043]1ad443ab8]e0]934ff0d]57]
Nov 24 12:22:38 frr bgpd[20257]: /lib64/libc.so.6(__libc_start_main+0xf5) [0x7fd1ab9a5505]f0d]57]
Nov 24 12:22:38 frr bgpd[20257]: /usr/lib/frr/bgpd() [0x41d9bb]main+0xf5) [0x7fd1ab9a5505]f0d]57]
Nov 24 12:22:38 frr bgpd[20257]: in thread bgpd_sync_callback scheduled from bgpd/bgp_rpki.c:351#012; aborting...
Nov 24 12:22:38 frr watchfrr[6779]: [EC 268435457] bgpd state -> down : read returned EOF
Nov 24 12:22:38 frr zebra[5952]: [EC 4043309116] Client 'bgp' encountered an error and is shutting down.
Nov 24 12:22:38 frr zebra[5952]: zebra/zebra_ptm.c:1345 failed to find process pid registration
Nov 24 12:22:38 frr zebra[5952]: client 15 disconnected. 0 bgp routes removed from the rib
I am not really 100% sure what we are really trying to do with this function, but we must
guard against child nodes not having any data.
Fixes: #5440
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When dumping a large bit of table data via bgp_show_table
and if there is no information to display for a particular
`struct bgp_node *` the data allocated via json_object_new_array()
is leaked. Not a big deal on small tables but if you have a full
bgp feed and issue a show command that does not match any of
the route nodes ( say `vtysh -c "show bgp ipv4 large-community-list FOO"`)
then we will leak memory.
Before code change and issuing the above show bgp large-community-list command 15-20 times:
Memory statistics for bgpd:
System allocator statistics:
Total heap allocated: > 2GB
Holding block headers: 0 bytes
Used small blocks: 0 bytes
Used ordinary blocks: > 2GB
Free small blocks: 31 MiB
Free ordinary blocks: 616 KiB
Ordinary blocks: 0
Small blocks: 0
Holding blocks: 0
After:
Memory statistics for bgpd:
System allocator statistics:
Total heap allocated: 924 MiB
Holding block headers: 0 bytes
Used small blocks: 0 bytes
Used ordinary blocks: 558 MiB
Free small blocks: 26 MiB
Free ordinary blocks: 340 MiB
Ordinary blocks: 0
Small blocks: 0
Holding blocks: 0
Please note the 340mb of free ordinary blocks is from the fact I issued a
`show bgp ipv4 uni json` command and generated a large amount of data.
Fixes: #5445
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Early exits without appropriate cleanup were causing obscure double
frees and other issues later on in the attribute parsing code. If we
return anything except a hard attribute parse error, we have cleanup and
refcounts to manage.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Without with fix we can't delete large-community-list using
no bgp large-community-list standard WORD, but no bgp large-community-list WORD
Let's keep this identical what we have with expanded lists as well.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
This patch allows using sequence numbers for community lists. We already have
this for prefix-lists and access-lists.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
In rare situations, the local route in a VNI may not get selected as the
best route. One situation is during a race between bgp and zebra which
was addressed in a prior commit. This change addresses another situation
where due to a change of tunnel IP, it is possible that a received route
may be selected as the best route if the path selection needs to take
next hop IPs into consideration. This is a pretty convoluted scenario,
but the code should handle it and delete and withdraw the local route
as well as (re)install the received route.
Ticket: CM-24114
Reviewed By: CCR-9487
Testing Done:
1. Manual tests - note, problem is not readily reproducible
2. evpn-smoke - results documented in the ticket
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
If a peer advertised capability addpath in their OPEN, but sent us an
UPDATE without an ADDPATH, we overflow a heap buffer.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
This changeset follows the PR
https://github.com/FRRouting/frr/pull/5334
Above PR adds nexthop tracking support for EVPN RT-5 nexthops.
This route is marked VALID only if the BGP route has a valid nexthop.
If the EVPN peer is an EBGP pee and "disable_connected_check" flag is not set,
"connected" check is performed for the EVPN nexthop.
But, usually EVPN nexthop is not the BGP peering address, but the VTEP address.
Also, NEXTHOP_UNCHANGED flag is enabled by default for EVPN.
As a result, in a common deployment for EVPN, EVPN nexthop is not connected.
Thus, adding a fix to remove the "connected" check for EVPN nexthops.
Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
Instead of CMD_WARNING, use CMD_WARNING_CONFIG_FAILED
for any mis-configuration scenario.
Testing Done:
TOR(config)# router bgp 5548
TOR(config-router)# address-family l2vpn evpn
TOR(config-router-af)# no advertise-pip
This command is supported under L3VNI BGP EVPN VRF
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
when a pip is disabled or mac-vlan is not present
use anycast MAC as RMAC value.
Ticket:CM-26923
Reviewed By:CCR-9417
Testing Done:
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
For self type-2 routes, do not assign system-rmac
as attribute RMAC value if advertise-pip is disable
or macvlan is not present.
Ticket:CM-26923
Reviewed By:CCR-9397
Testing Done:
pip is disabled under bgp vrf2 instance.
Trigger frr-restart.
Before fix:
*> [2]:[0]:[48]:[00:02:00:00:00:2e]:[32]:[45.0.4.4]
36.0.0.11 32768 i
ET:8 RT:5546:1004 RT:5546:4002 Rmac:00:02:00:00:00:2e
After fix:
*> [2]:[0]:[48]:[00:02:00:00:00:2e]:[32]:[45.0.4.4]
36.0.0.11 32768 i
ET:8 RT:5546:1004 RT:5546:4002 Rmac:44:38:39:ff:ff:01
TOR# ifquery vlan1004
auto vlan1004
iface vlan1004
address 45.0.4.4/24
vlan-id 1004
vrf vrf2
VNI: 4002 (known to the kernel)
Type: L3
Tenant VRF: vrf2
RD: 45.0.6.4:3
Originator IP: 36.0.0.11
Advertise-pip: Yes
System-IP: 27.0.0.11
System-MAC: 00:02:00:00:00:2e
Router-MAC: 44:38:39:ff:ff:01
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
By default announct Self Type-2 routes with
system IP as nexthop and system MAC as
nexthop.
An API to check type-2 is self route via
checking ipv4/ipv6 address from connected interfaces list.
An API to extract RMAC and nexthop for type-2
routes based on advertise-svi-ip knob is enabled.
When advertise-pip is enabled/disabled, trigger type-2
route update. For self type-2 routes to use
anycast or individual (rmac, nexthop) addresses.
Ticket:CM-26190
Reviewed By:
Testing Done:
Enable 'advertise-svi-ip' knob in bgp default instance.
the vrf instance svi ip is advertised with nexthop
as default instance router-id and RMAC as system MAC.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
In L3VNI add callback parse, vrr rmac value.
For non-zero vrr mac value, use it as anycast RMAC
and svi mac as individual rmac value.
If advertise-pip is disable or vrr rmac is not present
use svi mac as anycast rmac value for all routes.
Ticket:CM-26190
Reviewed By:
Testing Done:
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Evpn Primary IP advertisement feature uses
individual system IP and system MAC for prefix (type-5)
and self type-2 routes.
The PIP knob is enabled by default for bgp vrf instance.
Configuration CLI for enable/disable PIP feature knob.
User can configure PIP system IP and MAC to retain as
permanent values.
For the PIP IP, the default behavior is to accept bgp default
instance's router-id. When the default instance router-id change,
reflect PIP IP assignment.
Reflect type-5 to use system-IP and system MAC as nexthop and RMAC
values.
Ticket:CM-26190
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Spaces were not being accounted for in the heap buffer sizing, leading
to a heap buffer overflow when encoding large communities to their
string representations.
This patch also uses safer functions to do the encoding instead of
pointer math.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Tons of insane just-so pointer math here where it is not needed. This is
too smart. Use safer methods.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
The half and reuse variables can never be 1 but the
SA systems we have do not know this and think it is possible.
Provide the kick in the snarples that the SA needs to know
this is not true.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Copy paste leads to invalid read of 1 byte off the heap when converting
extended community attributes into strings.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Bug: While preparing the JSON output, 2 loops are traversed: the outer loop
loops through RD, and the inner loop loops through the prefixes of that RD.
We hit the bug (printing blank RD and stale or null prefix info) when the inner
loop exits with nothing to print, (without allocating json_routes) and the outer
loop still tries to attach it to the parent, json_adv. Thus, we have
key=<BLANK RD>, value=<junk or prev json_routes>
The fix: Avoid attaching json_routes to the parent json if there
is nothing to print.
Signed-off-by: Lakshman Krishnamoorthy <lkrishnamoor@vmware.com>
According to RFC 8277 IPv4 LU NLRI can be withdrawn using label 0x000000.
This RFC updates RFC3101 where it should be done only with 0x800000 label value.
Juniper implementation sets value 0x000000 when prefix is being withdrawn.
Page 12 RFC8277 states:
[RFC3107] also made it possible to withdraw a binding without
specifying the label explicitly, by setting the Compatibility field
to 0x800000. However, some implementations set it to 0x000000. In
order to ensure backwards compatibility, it is RECOMMENDED by this
document that the Compatibility field be set to 0x800000, but it is
REQUIRED that it be ignored upon reception.
Now FRR drops BGP session when receives such BGP update.
Signed-off-by: Aleksandr Klimenko <v00lk@bk.ru>
* IPv6 routes received via a ibgp session with one of its own interface as
nexthop are getting installed in the BGP table.
*A common table to be implemented should take cares of both
ipv4 and ipv6 connected addresses.
Signed-off-by: Biswajit Sadhu sadhub@vmware.com
The command `show ip bgp ipv4|ipv6 vpn neighbors <ip> prefix-counts`
caused a segfault, because the 2-level routing was not accounted for.
Signed-off-by: Juergen Werner <juergen@opensourcerouting.org>
The CLI was not parsing prefix format of ipv6 address.
This fixes the bug: https://github.com/FRRouting/frr/issues/5322
Signed-off-by: Lakshman Krishnamoorthy <lkrishnamoor@vmware.com>
The total_peercount table was created as a short cut for queries about
if addpath was enabled at all on a particular afi/safi. However, the
values weren't updated, so BGP would act as if addpath wasn't enabled
when determining if updates should be sent out. The error in behavior
was much more noticeable in tx-all than best-per-as, since changes in
what is sent by best-per-as would often trigger updates even if addpath
wasn't enabled.
Signed-off-by: Mitchell Skiba <mskiba@amazon.com>
Problem statement:
When IPv4/IPv6 prefixes are received in BGP, bgp_update function registers the
nexthop of the route with nexthop tracking module. The BGP route is marked as
valid only if the nexthop is resolved.
Even for EVPN RT-5, route should be marked as valid only if the the nexthop is
resolvable.
Code changes:
1. Add nexthop of EVPN RT-5 for nexthop tracking. Route will be marked as valid
only if the nexthop is resolved.
2. Only the valid EVPN routes are imported to the vrf.
3. When nht update is received in BGP, make sure that the EVPN routes are
imported/unimported based on the route becomes valid/invalid.
Testcases:
1. At rtr-1, advertise EVPN RT-5 with a nexthop 10.100.0.2.
10.100.0.2 is resolved at rtr-2 in default vrf.
At rtr-2, remote EVPN RT-5 should be marked as valid and should be imported into
vrfs.
2. Make the nexthop 10.100.0.2 unreachable at rtr-2
Remote EVPN RT-5 should be marked as invalid and should be unimported from the
vrfs. As this code change deals with EVPN type-5 routes only, other EVPN routes
should be valid.
3. At rtr-2, add a static route to make nexthop 10.100.0.2 reachable.
EVPN RT-5 should again become valid and should be imported into the vrfs.
Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
Before the fix:
2019/11/14 19:52:21 BGP: peer 192.168.2.5 deleted from subgroup s4peer
cnt 0 - missing space after s4 before peer
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
With this code change, we can now filter evpn routes based on RD using the
match statement: "match evpn rd XX"
Signed-off-by: Lakshman Krishnamoorthy <lkrishnamoor@vmware.com>
The bug:
As part of displaying advertised routes to a peer, in the outer loop, we
iterate through all prefixes in the evpn table. In the inner loop,
we iterate through adj_out of each prefix.
If a prefix which is present in the evpn table is not advertised to a peer,
its corresponding attr == NULL. Checking for this condition is the fix.
Signed-off-by: Lakshman Krishnamoorthy <lkrishnamoor@vmware.com>
There is a memory leak of the bgp node (route node)
in vni table while processing evpn remote route(s).
During the remote evpn route processing, a new route
is created in per vni route table, the refcount for
the route node is incremented twice. First refcount
is incremented during the node creation and the second
one when the bgp_info_add is added.
Post evpn route creation, the bgp node refcount needs
to be decremented.
Ticket:CM-26898,CM-26838,CM-27169
Reviewed By:CCR-9474
Testing Done:
In EVPN topology send 1K MAC routes then check the memory footprint
at the remote VTEP before sending 1K type-2 routes
and after flushing/withdrawal of the routes.
Before fix:
-----------
Initial memory footprint:
root@TOR1:~# vtysh -c "show memory" | grep "Hash Bucket \|BGP node \|BGP route"
Hash Bucket : 2008 32
BGP node : 182 152
BGP route : 96 112
With 1K MAC (type-2 routes)
root@TOR1:~# vtysh -c "show memory" | grep "Hash Bucket \|BGP node \|BGP route"
Hash Bucket : 6008 32
BGP node : 4182 152
BGP route : 2096 112
After cleaning up 1K MAC entries from source VTEP which triggers BGP withdraw
at the remote VTEP.
root@TOR1:~# vtysh -c "show memory" | grep "Hash Bucket \|BGP node \|BGP route"
Hash Bucket : 4008 32
BGP node : 2182 152 <-- Here 2K delta from initial count.
BGP route : 96 112
With fix:
---------
After 1K MAC entries cleaned up at the remote VTEP, the memory footprint
(BGP Node and Hash Bucket count) is equilibrium to start of the test.
root@TOR1:~# vtysh -c "show memory" | grep "Hash Bucket \|BGP node \|BGP route"
Hash Bucket : 2008 32
BGP node : 182 152
BGP route : 96 112
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
PR 5118 introduce additional (prepend) keywords
like 'ip' to Route Distinguisher output which
breaks existing evpn route show commands parsing.
Restore to original behavior.
Testing Done:
vtysh -c 'show bgp l2vpn evpn route'
Before fix:
Route Distinguisher: ip 27.0.0.15:44
Post fix:
Route Distinguisher: 27.0.0.15:44
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Recently had a case where I was attempting to debug a nexthop tracking
issue across multiple bgp vrf's and since the setup vrf's in it with
overlapping address ranges, it became real fun real fast to track
vrf data associated. Add a bit of code to allow us to figure out
what vrf we are in when we print out debug messages.
Look through the rest of the code and find debugs where we are
not using bgp->name_pretty and switch it over.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Prevent IPv6 Link-local address being forward to IBGP peer,
which are not directly connected.
R1----IPV6-unnumbered-EBGP-------R2-----IPV6-IBGP-----R3
Configure route-map to set preferred global address on and apply
route-map-IN on R2 for R1-R2 session. Now check on R3's BGP and
RIB table has route nexthop as R1 link-local address, which is
not correct.
As of now we clear link-local address info from mp_nexthop_global,
only if mp_nexthop_global is populated with link-local address.
We should do it even if route-map is configured boz forwarding
link-local address from one link scope to another is violation of
the standards.
Signed-off-by: Biswajit Sadhu sadhub@vmware.com
This commit make bgpd to skip and ignore unsupported
sub-type of PREFIX_SID. (especially new defined sub-type)
Current bgpd can't parase unsupported sub-type of PREFIX_SID.
PREFIX_SID is drafted on draft-ietf-idr-bgp-prefix-sid-27.
There are already new sub-type drafted on
draft-dawra-idr-srv6-vpn-05. (Type5,6 is new defined.)
This commit fix the problem reported as #5277 on GitBub.
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
Running with --enable-address-sanitizer I am seeing this:
=================================================================
==19520==ERROR: AddressSanitizer: heap-use-after-free on address 0x6020003ef850 at pc 0x7fe9b8f7b57b bp 0x7fffbac6f9c0 sp 0x7fffbac6f170
READ of size 6 at 0x6020003ef850 thread T0
#0 0x7fe9b8f7b57a (/lib/x86_64-linux-gnu/libasan.so.5+0xb857a)
#1 0x55e33d1071e5 in bgp_process_mac_rescan_table bgpd/bgp_mac.c:159
#2 0x55e33d107c09 in bgp_mac_rescan_evpn_table bgpd/bgp_mac.c:252
#3 0x55e33d107e39 in bgp_mac_rescan_all_evpn_tables bgpd/bgp_mac.c:266
#4 0x55e33d108270 in bgp_mac_remove_ifp_internal bgpd/bgp_mac.c:291
#5 0x55e33d108893 in bgp_mac_del_mac_entry bgpd/bgp_mac.c:351
#6 0x55e33d21412d in bgp_ifp_down bgpd/bgp_zebra.c:257
#7 0x7fe9b8cbf3be in if_down_via_zapi lib/if.c:198
#8 0x7fe9b8db303a in zclient_interface_down lib/zclient.c:1549
#9 0x7fe9b8db8a06 in zclient_read lib/zclient.c:2693
#10 0x7fe9b8d7b95a in thread_call lib/thread.c:1599
#11 0x7fe9b8cd824e in frr_run lib/libfrr.c:1024
#12 0x55e33d09d463 in main bgpd/bgp_main.c:477
#13 0x7fe9b879409a in __libc_start_main ../csu/libc-start.c:308
#14 0x55e33d09c189 in _start (/usr/lib/frr/bgpd+0x168189)
0x6020003ef850 is located 0 bytes inside of 16-byte region [0x6020003ef850,0x6020003ef860)
freed by thread T0 here:
#0 0x7fe9b8fabfb0 in __interceptor_free (/lib/x86_64-linux-gnu/libasan.so.5+0xe8fb0)
#1 0x7fe9b8ce4ea9 in qfree lib/memory.c:129
#2 0x55e33d10825c in bgp_mac_remove_ifp_internal bgpd/bgp_mac.c:289
#3 0x55e33d108893 in bgp_mac_del_mac_entry bgpd/bgp_mac.c:351
#4 0x55e33d21412d in bgp_ifp_down bgpd/bgp_zebra.c:257
#5 0x7fe9b8cbf3be in if_down_via_zapi lib/if.c:198
#6 0x7fe9b8db303a in zclient_interface_down lib/zclient.c:1549
#7 0x7fe9b8db8a06 in zclient_read lib/zclient.c:2693
#8 0x7fe9b8d7b95a in thread_call lib/thread.c:1599
#9 0x7fe9b8cd824e in frr_run lib/libfrr.c:1024
#10 0x55e33d09d463 in main bgpd/bgp_main.c:477
#11 0x7fe9b879409a in __libc_start_main ../csu/libc-start.c:308
previously allocated by thread T0 here:
#0 0x7fe9b8fac518 in calloc (/lib/x86_64-linux-gnu/libasan.so.5+0xe9518)
#1 0x7fe9b8ce4d93 in qcalloc lib/memory.c:110
#2 0x55e33d106b29 in bgp_mac_hash_alloc bgpd/bgp_mac.c:96
#3 0x7fe9b8cb8350 in hash_get lib/hash.c:149
#4 0x55e33d10845b in bgp_mac_add_mac_entry bgpd/bgp_mac.c:303
#5 0x55e33d226757 in bgp_ifp_create bgpd/bgp_zebra.c:2644
#6 0x7fe9b8cbf1e6 in if_new_via_zapi lib/if.c:176
#7 0x7fe9b8db2d3b in zclient_interface_add lib/zclient.c:1481
#8 0x7fe9b8db87f8 in zclient_read lib/zclient.c:2659
#9 0x7fe9b8d7b95a in thread_call lib/thread.c:1599
#10 0x7fe9b8cd824e in frr_run lib/libfrr.c:1024
#11 0x55e33d09d463 in main bgpd/bgp_main.c:477
#12 0x7fe9b879409a in __libc_start_main ../csu/libc-start.c:308
Effectively we are passing to bgp_mac_remove_ifp_internal the macaddr
that is associated with the bsm data structure. There exists a path
where the bsm is freed and then we immediately pass the macaddr into
bgp_mac_rescan_all_evpn_tables. So just make a copy of the macaddr
data structure before we free the bsm
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
debian-9# show ip route 192.168.255.2/32 longer-prefixes
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued route, r - rejected route
B>* 192.168.255.2/32 [20/0] via 192.168.0.1, eth1, 00:15:22
debian-9# conf
debian-9(config)# router bgp 100
debian-9(config-router)# address-family ipv4
debian-9(config-router-af)# distance bgp 123 123 123
debian-9(config-router-af)# do show ip route 192.168.255.2/32 longer-prefixes
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued route, r - rejected route
B>* 192.168.255.2/32 [123/0] via 192.168.0.1, eth1, 00:00:09
debian-9(config-router-af)# no distance bgp
debian-9(config-router-af)# do show ip route 192.168.255.2/32 longer-prefixes
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued route, r - rejected route
B>* 192.168.255.2/32 [20/0] via 192.168.0.1, eth1, 00:00:02
debian-9(config-router-af)#
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
The sender side AS path loop detection code was implemented since the
import of Quagga code, however it was always disabled by a `ifdef`
guard.
Lets allow the user to decide whether or not to enable this feature on
run-time.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Scenarios where this code change is required:
1. BFD is un-configured from BGP at remote end.
Neighbour BFD sends ADMIN_DOWN state, but BFD on local side will send
DOWN to BGP, resulting in BGP session DOWN.
Removing BFD session administratively shouldn't bring DOWN BGP session
at local or remote.
2. BFD is un-configured from BGP or shutdown locally.
BFD will send state DOWN to BGP resulting in BGP session DOWN.
(This is akin to saying do not use BFD for BGP)
Removing BFD session administratively shouldn't bring DOWN BGP session at
local or remote.
Signed-off-by: Sayed Mohd Saquib sayed.saquib@broadcom.com
Since we don't set a value from the return of bgp_path_info_mpath_next
it is impossible for this function to do anything as such the if statement
is dead code as well.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Problem reported with error messages appearing in the log
complaining about invalid afi/safi combinations. Determined
that the error messages were recently added in the function
that turns afi and safi values to strings. Unfortunately,
the function is called from places using FOREACH_AFI_SAFI,
which spins thru every afi and safi number including some
that are not legal together (ipv4 evpn and l2vpn multicast
for example.) This fix removes these error messages since
it is not necessarily an error to call it with invalid
combinations.
Ticket: CM-26883
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Coverity has found a path where the attr.aspath may be NULL.
assert that the aspath is non-null so we can make this go away.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
We make the assumption that ->attr is not NULL throughout
the code base. We are totally inconsistent about application
of this though.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add -s X or --socket_size X to the bgp cli to allow
the end user to specify the outgoing bgp tcp kernel
socket buffer size.
It is recommended that this option is only used on
large scale operations.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
In bgp_create_evpn_bgp_path_info we create a bgp_path_info
that should be returned since we need it later.
Found by Coverity Scan.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When using soft reconfiguration inbound we are storing packet
data on the side for replaying when necessary. The problem here
is that we are just grabbing the first bgp_path_info and using
that as the base. What happens when we have soft-reconfig turned
on with multiple bgp_path_info's for a path? This was introduced
in commit 8692c50652, yes back
in 2012! I would argue, though, that it was just broken
in a different way before this.
Choose the correct bgp_path_info that corresponds to the peer
we received the data from for rethinking.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When a type 2/3 or 5 route is received, verified and the
resulting route generated is pushed into the appropriate vrf
the vni's associated with the route are also passed in.
This is showing up as a Remote label when you dump
the route in bgp:
BGP routing table entry for 0.0.0.0/0^M
Paths: (1 available, best #1, table third)
Advertised to non peer-group peers:
10.10.120.22
42001 42005 42006 42055
10.10.120.22 from 10.10.120.22 (10.10.255.193)
Origin IGP, valid, external, bestpath-from-AS 42001, best
Remote label: 62750
AddPath ID: RX 0, TX 2
Last update: Fri Oct 11 12:59:56 2019
The `Remote label: 62750` is the mpls label version of the
vni passed in. This is meaningless and confusing to the end
user. Do not display this information.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When creating a bgp_path_info for a type 4 route the pi->extra->parent
and the route node for the originating table were not being locked
properly. This will prevent BGP from not properly cleaning up
the data structures on cleanup.
Possibly every one of the functions that we use to create the
new bgp_path_info's should use an abstracted version of this code,
but I am unsure at this point in time if a type 4 should use the same
or not.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
bgp_process_packets has an assert to make sure an appropriate amount of
working space in the input buffer has been freed up for future reads.
However, this assert shouldn't be made when we have encountered an error
that's going to tear down the session, because in this case we may not
be able to process the full contents of the input buffer.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
bgp_process_packets has an assert to make sure an appropriate amount of
working space in the input buffer has been freed up for future reads.
However, this assert shouldn't be made when we have encountered an error
that's going to tear down the session, because in this case we may not
be able to process the full contents of the input buffer.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
There are cases where the table identifier is set on a bgp entry, mainly
due to route-map, and associate fib entry needs to be removed.
This change encompasses also the route-map reconfiguration that leads to
removing the previous entry, whereas bgp update had been triggered (
this happens when software inbound reconfiguration is handled).
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
this table identifier can be used for policy routing. incoming entries
are locally exported to that local table identifier.
note that so that the user applies the new table identifier to all
entries, the user should flush local tables first.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
BGP code assumes that the extra data is zero'ed out. Ensure that we
are not leaving any situation that the data on the stack is actually all
0's when we pass it around as a pointer later.
Please note in issue #5025, Lou reported a different valgrind
issue, which is not the same issue:
==7313== Conditional jump or move depends on uninitialised value(s)
==7313== at 0x181F9F: subgroup_announce_check (bgp_route.c:1555)
==7313== by 0x1A112B: subgroup_announce_table (bgp_updgrp_adv.c:641)
==7313== by 0x1A1340: subgroup_announce_route (bgp_updgrp_adv.c:704)
==7313== by 0x1A13E3: subgroup_coalesce_timer (bgp_updgrp_adv.c:331)
==7313== by 0x4EBA615: thread_call (thread.c:1531)
==7313== by 0x4E8AC37: frr_run (libfrr.c:1052)
==7313== by 0x1429E0: main (bgp_main.c:486)
==7313==
==7313== Conditional jump or move depends on uninitialised value(s)
==7313== at 0x201C0E: rfapi_vty_out_vncinfo (rfapi_vty.c:429)
==7313== by 0x18D0D6: route_vty_out (bgp_route.c:7481)
==7313== by 0x18DD76: bgp_show_table (bgp_route.c:9365)
==7313== by 0x1930C4: bgp_show_table_rd (bgp_route.c:9471)
==7313== by 0x1932A3: bgp_show (bgp_route.c:9510)
==7313== by 0x193E68: show_ip_bgp_json (bgp_route.c:10284)
==7313== by 0x4E6D024: cmd_execute_command_real.isra.2 (command.c:1072)
==7313== by 0x4E6F51E: cmd_execute_command (command.c:1131)
==7313== by 0x4E6F686: cmd_execute (command.c:1285)
==7313== by 0x4EBF9C4: vty_command (vty.c:516)
==7313== by 0x4EBFB9F: vty_execute (vty.c:1285)
==7313== by 0x4EC250F: vtysh_read (vty.c:2119)
==7313==
that is causing the actual crash.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Fixed memory leak and incorrect json output. Check the full output in the PR:
https://github.com/FRRouting/frr/pull/5118
Signed-off-by: Lakshman Krishnamoorthy <lkrishnamoor@vmware.com>
The bgp pointer may not be actually found. The debug
message that was using it could get the same value
another way. Convert over
Fixes Coverity Scan Issue:
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
We only have a uint32_t value here but clippy is wise and
gives us more data than we need. Tell the compiler we can
throw some stuff away.
This was found by inspecting CI results.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Under high load instances with hundreds of thousands of prefixes this
could result in very unstable systems.
When maximum-prefix is set, but restart timer is not set then the session
flaps between Idle(Pfx) -> Established -> Idle(Pfx) states.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Display output from adj_out instead of the rib table.
Also fixes crash for the json output. RCA: prefix is written to json object
using inet_ntop. But, this api returns null buffer for AF_EVPN address family
(it works only for AF_INET and AF_INET6). This null buffer is then deref'd
by json-object-to string api.
Full output shown in PR: https://github.com/FRRouting/frr/pull/5078
Crash issue: https://github.com/FRRouting/frr/issues/5010
Signed-off-by: Lakshman Krishnamoorthy <lkrishnamoor@vmware.com>
Problem reported that when a "neighbor x.x.x.x route-map FOO in"
set a next-hop value, that modified next-hop value was also sent
to eBGP peers. This is incorrect since bgp is expected to set
next-hop to self when sending to eBGP peers unless third party
next-hop on a shared segment is true. This fix modifies the
behavior to stop sending the modified next-hop to eBGP peers
if the route-map was applied inbound on another peer.
Ticket: CM-26025
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
The newly added PEER_RMAP_TYPE_AGGREGATE flag is setup to
be the 9th bit:
But the flag we are putting it into:
uint8_t rmap_type;
is 8 bits. Adjust the size.
Found by Coverity SA Scan
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
While configuring aggregate route prepare the hash table first,
then prepare the aggregated aspath value just like lcomm,
ecomm and standard community.
Signed-off-by: vishaldhingra<vdhingra@vmware.com>
While configuring aggregate route prepare the hash table first,
then prepare the aggregated ecomm value and then do the
unique sort once for ecommunity.
Signed-off-by: vishaldhingra<vdhingra@vmware.com>
While configuring aggregate route prepare the hash table
first, then prepare the aggregated standard comm value
and then do the unique sort once for standard community.
Signed-off-by: vishaldhingra<vdhingra@vmware.com>
While configuring aggregate route prepare the hash table first,
then prepare the aggregated lcomm value and then do the unique
sort once for large community.
Signed-off-by: vishaldhingra <vdhingra@vmware.com>
Include the coalesce time for the update group `show bgp update-group`
command as well as print out how long the coalesce timer waited
for on the timer pop.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The peer's outgoing routemap should not be displaying a 'X'
appended to the front of the name. This will create
confusion.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
RFC 4271 sec 6.3 p33, In the case of a BGP_NEXTHOP attribute with an
incorrect value, FRR is supposed to send a notification
and include 'Corresponding type, length and value of the NEXT_HOP
attribute in the notification data.
Fixes: #4997
Signed-off-by: Nikos <ntriantafillis@gmail.com>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
For all the places we have a zclient->interface_up convert
them to use the interface ifp_up callback instead.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Switch the zclient->interface_add functionality to have everyone
use the interface create callback in lib/if.c
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Start the conversion to allow zapi interface callbacks to be
controlled like vrf creation/destruction/change callbacks.
This will allow us to consolidate control into the interface.c
instead of having each daemon read the stream and react accordingly.
This will hopefully reduce a bunch of cut-n-paste stuff
Create 4 new callback functions that will be controlled by
lib/if.c
create -> A upper level protocol receives an interface creation event
The ifp is brand spanking newly created in the system.
up -> A upper level protocol receives a interface up event
This means the interface is up and ready to go.
down -> A upper level protocol receives a interface down
destroy -> A upper level protocol receives a destroy event
This means to delete the pointers associated with it.
At this point this is just boilerplate setup for future commits.
There is no new functionality.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Fixed the following:
1. Print the complete header for 'show bgp l2vpn evpn' command
2. Print the Route Distinguisher header
3. Print all relevant routes in json (some were being skipped)
Signed-off-by: Kishore Aramalla <karamalla@vmware.com>
This is annoying when editing a file and saving the file. IDEs like
VSCode can automatically remove trailing whitespaces, hence it would be better
having a clean code before pushing other changes.
I step onto this not the first time.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
RFC4271 specifies behavior when the hold timer is sent to zero - we
should not send keepalives or run a hold timer. But FRR, and other
vendors, allow the keepalive timer to be set to zero with a nonzero hold
timer. In this case we were sending keepalives constantly and maxing out
a pthread to do so. Instead behave similarly to other vendors and do not
send keepalives.
Unsure what the utility of this is, but blasting keepalives is
definitely the wrong thing to do.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
User pass the string match large-community 1 exact-match from CLI.
Now route map lib has got the string as "1 exact-match". It passes the string
to call back for compilation. BGP will parse this string and came to know
that for "1" it has to do exact match. Routemap lib has to save "1" in it’s
dependency table. Here routemap is saving this as a “1 exact-match”
which is wrong. The solution is used the compiled data.
Signed-off-by: vishaldhingra <vdhingra@vmware.com>
Allow bgp to set a local Administrative distance to use
for installing routes into the rib.
Example:
!
router bgp 9323
bgp router-id 1.2.3.4
neighbor enp0s8 interface remote-as external
!
address-family ipv4 unicast
neighbor enp0s8 route-map DISTANCE in
exit-address-family
!
route-map DISTANCE permit 10
set distance 153
!
line vty
!
end
eva# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued route, r - rejected route
B 0.0.0.0/0 [153/0] via fe80::a00:27ff:fe84:c2d6, enp0s8, 00:00:06
K>* 0.0.0.0/0 [0/100] via 10.0.2.2, enp0s3, 00:06:31
B>* 1.1.1.1/32 [153/0] via fe80::a00:27ff:fe84:c2d6, enp0s8, 00:00:06
B>* 1.1.1.2/32 [153/0] via fe80::a00:27ff:fe84:c2d6, enp0s8, 00:00:06
B>* 1.1.1.3/32 [153/0] via fe80::a00:27ff:fe84:c2d6, enp0s8, 00:00:06
C>* 10.0.2.0/24 is directly connected, enp0s3, 00:06:31
K>* 169.254.0.0/16 [0/1000] is directly connected, enp0s3, 00:06:31
eva#
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
`set ipv6 next-hop prefer-global` is not working on IPv4 peers.
In MP-BGP, bgp routers can advertising IPv6 routes over IPv4 peers.
Remove the peer's remote address AFI type checking.
Signed-off-by: shikenghua <kh_shi@edge-core.com>
This is the unusual case when you have global IPv6 address and no link-local
on interface attached. Like here:
eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP
link/ether 08:00:27:65:c6:82 brd ff:ff:ff:ff:ff:ff
inet6 2a02:4780:face::1/64 scope global
valid_lft forever preferred_lft forever
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
This change addresses the following:
1) Ensures logs under DEBUG macro checks are categorized
as zlog_debug instead of zlog_info.
2) Error logs are categorized as zlog_err instead of zlog_info.
3) Rephrasing certain logs to make them appear more intuitive.
Signed-off-by: NaveenThanikachalam <nthanikachal@vmware.com>
When L3vni is created with prefix-only flag,
the flag is set at bgp vrf instance level.
In the case of bgp instance is non auto created,
means user configured instance (i.e 'router bgp x vrf <name>')
Upon deletion of l3vni, clear the prefix-only flag from
bgp vrf instance.
Ticket:CM-21894
Reviewed By:CCR-9176
Testing Done:
vrf vrf1
vni 104001
exit-vrf
!
router bgp 650030 vrf vrf1
!
tor-21(config)# vrf vrf1
tor-21(config-vrf)# vni 104001 prefix-routes-only
tor-21(config-vrf)# no vni 104001 prefix-routes-only
tor-21(config-vrf)# end
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
There was a silly bug introduced when the command to show failed sessions
was added. A missing "," caused the wrong error message to be printed.
Debugging this led down a path that:
- Led to discovering one more error message that needed to be added
- Providing the error code along with the string in the JSON output
to allow programs to key off numbers rather than strings.
- Fixing the missing ","
- Changing the error message to "Waiting for Peer IPv6 LLA" to
make it clear that we're waiting for the link local addr.
Signed-off-by: Dinesh G Dutt <5016467+ddutt@users.noreply.github.com>