When a peer is bound to a peer-group, the GR flags set on the
peer are over-written.
Update the GR flags for the peer after it has been bound to a
peer-group.
Signed-off-by: NaveenThanikachalam <nthanikachal@vmware.com>
The code in the bgp extcommunity-list function was using
argv_find to get the correct idx. The problem was that
we had already done argv_finds before and idx was non-zero
thus having us always set the seq pointer to what was last
looked up. This causes us to pass in a value to the
underlying function and it would just wisely ignore it
causing a seq number of 0.
We would then write this seq number of 0 and then immediately
reject it on read in again. BOO!
Actually handle argv_find the way it was meant to be.
Ticket:CM-29926
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When issuing the command `match ip next-hop address`
bgp would crash. This is because the no form of the
command was making the address optional and we would
try to read data we should not be.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
bgp_accept() gets called over and over again when a VRF device is
deleted out from under a bgp listener socket that is bound to it.
Prevent this by noting the error and cancelling ourselves, allowing the
vrf status code to clean up the mess when it receives word about the
change from Zebra.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Try to give a bit more useful data about where we
think the connection is trying to come in from.
Hopefully this will let us debug connection issues
a bit faster in cases where there are config issues.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When received packet is processed in bgp_process_reads(), the data
is copied to static buffer and then copied to stream buffer.
The data can be copied directly to stream buffer which will avoid extra memcpy
Signed-off-by: kssoman <somanks@gmail.com>
Don't attempt to send BFD daemon a message to remove the peer
registration on daemon exit, otherwise we'll access a dangling
interface pointer and we'll crash.
This crash was not previosly possible because the function that built
the message was passing the interface pointer but not using it due to
the exit condition.
In `lib/bfd.c`:
```
void bfd_peer_sendmsg(struct zclient *zclient, struct bfd_info *bfd_info,
int family, void *dst_ip, void *src_ip, char *if_name,
int ttl, int multihop, int cbit, int command,
int set_flag, vrf_id_t vrf_id)
{
struct bfd_session_arg args = {};
size_t addrlen;
/* Individual reg/dereg messages are suppressed during shutdown. */
if (CHECK_FLAG(bfd_gbl.flags, BFD_GBL_FLAG_IN_SHUTDOWN)) {
if (bfd_debug)
zlog_debug(
"%s: Suppressing BFD peer reg/dereg messages",
__func__);
return;
}
```
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
To remove a BFD profile without removing the BFD configuration just call
`neighbor <A.B.C.D|X:X::X:X|WORD> bfd`.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Allow BGP to use the new API to configure BFD session profiles. Now it
is possible to preconfigure BFD sessions without needing to create the
peers.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
"set community accept-own-nexthop" returns "malformed communities"
error. This is because the token matching hits an earlier "accept-own"
and leaves "-nexthop" as a separate token to be processed.
Reorder the switch cases so that both are processed correctly.
Signed-off-by: Appu Joseph <apjo@kaloom.com>
We are crashing in thread_cancel on shutdown because
the thread pointer is NULL. Use the more appropriate
THREAD_CANCEL macro
Ticket: CM-29873
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Extend the next hop tracking for type-2 and type-3 EVPN routes also.
Updates: "bgpd: Add nexthop of received EVPN RT-5 for nexthop tracking"
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
When there is a NHT change and the paths dependent on that NHT are being
evaluated, skip those that are marked for removal or as history.
When a route gets withdrawn, its valid flag is cleared and it is flagged
for removal; in the case of an EVPN route, it is also unimported from
VRFs (L2 and/or L3). bgp_process is then scheduled. Under rare timing
conditions, an NHT update for the route's next hop may arrive right after,
and if routes flagged for removal are not skipped, they may not only be
incorrectly marked as valid but also re-imported in the case of EVPN,
which will be a serious error.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Ensure that only if there is a change to the path's validity based
on the NHT update, EVPN import or unimport is invoked.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Display next hop resolution information, whether the "detail" option is
specified or not as it is quite fundamental and only minimally increases
the output.
Introduce option to look at a specific NHT entry, which will also show
the paths associated with that entry.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Clean up a few lines of cli command installation; remove a
duplicate; follow the command grouping pattern better.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
There can be cases where evpn traffic is not meshed across various
endpoints, but sent to a central pe. For this situation, remove the
nexthop unchanged default behaviour for bgp evpn. Also add route
reflector commands to bgp evpn node.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Non-best paths (path info structures) also need to be freed during
table cleanup not only to release their memory but to also ensure
any linkages are updated correctly. One such example is for EVPN
where there is a link between the imported path info (in a L2 or
L3 vrf instance) and its parent path info.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
We had already removed the `ip as-path..` command
to have `bgp as-path` but for some reason a `no ip as-path..`
command ALIAS was still around. Kill with extreme prejudice.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Without specifying a default afi/safi we get a segfault:
```
(gdb) frame 4
bgp_table_stats (..., afi=32724, safi=SAFI_UNICAST, ...
11349 if (!bgp->rib[afi][safi]) {
(gdb)
```
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
L3VNI is configured with "prefix-routes-only" flag. Even in this case,
intermittently, we observed that local EVPN MACIP routes are installed and
advertised with 2 labels and 2 export RTs.
This is a sequencing issue. Consider following case where L2VNI 200 and L3VNI
1000 are configured for tenant vrf vrf-blue.
Bug is observed for following sequence of events:
1. vrf-blue BGP instance is created.
2. L2VNI is created in bgp for vni 200. It is linked to the tenant vrf vrf-blue
in function bgpevpn_link_to_l3vni.
Following code sets "VNI_FLAG_USE_TWO_LABELS" flag for vni 200 as L3VNI is not
yet attached to vrf-blue BGP instance.
/* check if we are advertising two labels for this vpn */
if (!CHECK_FLAG(bgp_vrf->vrf_flags, BGP_VRF_L3VNI_PREFIX_ROUTES_ONLY))
SET_FLAG(vpn->flags, VNI_FLAG_USE_TWO_LABELS);
2. Now L3VNI is attached to vrf-blue BGP instance. In this case, we set
BGP_VRF_L3VNI_PREFIX_ROUTES_ONLY flag for vrf-blue but we do not clear
VNI_FLAG_USE_TWO_LABELS flag set on the corresponding L2VNIs.
This fix resolves following 2 issues observed above.
1. When L2VNI is created in BGP, flag VNI_FLAG_USE_TWO_LABELS should not be set
for this VNI if BGP vrf is not attached to any L3VNI.
2. When L3VNI is attached to the BGP vrf, set "VNI_FLAG_USE_TWO_LABELS" flag
if "prefix-routes-only" is not for the vrf.
UT cases:
1. Flap "prefix-routes-only" config for a vrf.
2. Test following triggers for vrfs with and without "prefix-routes-only"
- Flap L2VNI from kernel.
- Flap L3VNI from kernel.
Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
The `bgp bestpath bandwidth` command should not be a legal
command. Pull out the `no` form to allow this. Allow
`no bgp bestpath bandwidth` to work as we would expect.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This is not the attribute involved in path selection and by rfc7606 it should
be just ignored.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Community attributes might have been removed by an inbound route map, so we
should check to ensure they still exist before trying to free them.
This fixes a segfault described in issue #6345.
Signed-off-by: Josh Cox <josh.cox@pureport.com>
The problem is that peer_af_array returns NULL when SAFI is changed to
unicast. We use unicast table, but peer is created and activated under
labeled-unicast, hence we should lookup with a proper SAFI id.
Without this patch peer_af_find() returns NULL and we can't show
PfxSnt in `show bgp summary`.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
It is possible that the if_lookup_by_index() call will return
a NULL value and calling zclient_send_interface_radv_req. Just
test that we have a valid interface pointer.
Found by Coverity
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
unicast and labeled-unicast share the same table, but configuration should
be visible for both independently. Without this fix it confuses a bit
because when you enter `network 10.0.0.0/24` under labeled-unicast it's
written in unicast family block.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Modify the import-check command to require the underlying prefix
to exist in the rib. General consensus is that this is the correct
behavior.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Problem reported that in many circumstances, RAs created in the
process of bringing up numbered IPv6 peers with extended-nexthop
capability enabled (for ipv4 over ipv6) were not stopped on the
interface when those peers were deleted. Found several circumstances
where this occurred and fix them in this patch.
Ticket: CM-26875
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
These are easy to get subtly wrong, and doing so can cause
nondeterministic failures when racing in parallel builds.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Issue:
Configuring default-originate when static default route is previously
advertised results in withdrawal of the route.
Fix :
Delete the adj-out entry for the previously advertised static
default route without sending explicit withdraw message.
Signed-off-by: kssoman <somanks@gmail.com>
the nlri flowspec above 240 bytes size was not handled.
Over 240 bytes, the length is 2 bytes length, and a calculation must be
done to obtain the real length. This commit handles it appropriately.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
- Fix 1 byte overflow when showing GR info in bgpd
- Use PATH_MAX for path buffers
- Use unsigned specifiers for uint16_t's in zebra pbr
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Replace sprintf with snprintf where straightforward to do so.
- sprintf's into local scope buffers of known size are replaced with the
equivalent snprintf call
- snprintf's into local scope buffers of known size that use the buffer
size expression now use sizeof(buffer)
- sprintf(buf + strlen(buf), ...) replaced with snprintf() into temp
buffer followed by strlcat
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
When we receive an UPDATE with MP_NEXTHOP len as 32 bytes, we shouldn't
check if the global (1st) nexthop is unspecified.
Peering between bird and FRRouting we receive from Bird something like:
```
rcvd UPDATE w/ attr: , origin i, mp_nexthop ::(fe80::a00:27ff:fe09:f8a3)
```
The link-local (2nd) nexthop is valid and validated later in the code.
Before it was marked:
```
IPv6 unicast -- DENIED due to: martian or self next-hop;
```
After it's a valid prefix:
```
spine1-debian-9# show bgp
BGP table version is 0, local router ID is 2.2.2.2, vrf id 0
Default local pref 100, local AS 65002
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
2a02:4780::/64 fe80::a00:27ff:fe09:f8a3
0 65001 i
Displayed 1 routes and 1 total paths
```
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Replace all `random()` calls with a function called `frr_weak_random()`
and make it clear that it is only supposed to be used for weak random
applications.
Use the annotation described by the Coverity Scan documentation to
ignore `random()` call warnings.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
In real world sometimes happens that bgp_nexthop_cache is NULL. Avoid
segfaulting when using `show [ip] bgp ...` CLI commands.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Rather than doing a f*gly hack for the RPKI code, let's do an on-exit
hook in cmd_node. Also allows replacing some special-casing in the vty
code.
Signed-off-by: David Lamparter <equinox@diac24.net>
And again for the name. Why on earth would we centralize this, just so
people can forget to update it?
Signed-off-by: David Lamparter <equinox@diac24.net>
Same as before, instead of shoving this into a big central list we can
just put the parent node in cmd_node.
Signed-off-by: David Lamparter <equinox@diac24.net>
There is really no reason to not put this in the cmd_node.
And while we're add it, rename from pointless ".func" to ".config_write".
[v2: fix forgotten ldpd config_write]
Signed-off-by: David Lamparter <equinox@diac24.net>
The only nodes that have this as 0 don't have a "->func" anyway, so the
entire thing is really just pointless.
Signed-off-by: David Lamparter <equinox@diac24.net>
The problem is when using kinda such topologies:
(192.168.1.1/32) r1 <-- eBGP --> r2 <-- iBGP --> r3
Looking at r3's nexthop for 192.168.1.1/32 we have it as r2, but really
it MUST be r1.
Checking if the nexthop is connected solves the problem even for cases
when route-reflectors are used.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
This fixes unnecessary whitespaces and makes capitalization
match for route type help strings.
Signed-off-by: Trey Aspelund <taspelund@cumulusnetworks.com>
Some competitive vendors like Cisco, Bird, OpenBGPD,
Nokia already have this by default enabled.
The list is here: https://github.com/bgp/RFC8212
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Problem Description:
=====================
+--+ +--+
|R1|-(192.201.202.1)----iBGP----(192.201.202.2)-|R2|
+--+ +--+
Routes on R2:
=============
S>* 202.202.202.202/32 [1/0] via 192.201.78.1, ens256, 00:40:48
Where, the next-hop network, 192.201.78.0/24, is a directly connected network address.
C>* 192.201.78.0/24 is directly connected, ens256, 00:40:48
Configurations on R1:
=====================
!
router bgp 201
bgp router-id 192.168.0.1
neighbor 192.201.202.2 remote-as 201
!
Configurations on R2:
=====================
!
ip route 202.202.202.202/32 192.201.78.1
!
router bgp 201
bgp router-id 192.168.0.2
neighbor 192.201.202.1 remote-as 201
!
address-family ipv4 unicast
redistribute static
exit-address-family
!
Step-1:
=======
R1 receives the route 202.202.202.202/32 from R2.
R1 installs the route in its BGP RIB.
Step-2:
=======
On R1, a connected interface address is added.
The address is the same as the next-hop of the BGP route received from R2 (192.201.78.1).
Point of Failure:
=================
R1 resolves the BGP route even though the route's next-hop is its own connected address.
Even though this appears to be a misconfiguration it would still be better to safeguard the code against it.
Fix:
====
When BGP receives a connected route from Zebra, it processes the
routes for the next-hop update.
While doing so, BGP must ignore routes whose next-hop address matches
the address of the connected route for which Zebra sent the next-hop update
message.
Signed-off-by: NaveenThanikachalam <nthanikachal@vmware.com>
Ensure that upon a link-bandwidth change - for e.g., due to change in
the number of multipaths - EVPN type-5 route injection is triggered.
In the absence of this, the proper link-bandwidth is not updated in
EVPN type-5 routes originated by the router.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
take into account polychaeta tips ono code style.
also, take into account miscellaneous code style recommandations like
braces usage.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Multiple different issues causing mostly UAFs but maybe other more
subtle things.
- Cluster lists were the only attributes whose pointers were not being
NULL'd when freed, resulting in heap UAF
- When performing an insert into the cluster hash, our temporary struct
used for hash_get() was inconsistent with our hash keying and
comparison functions. In the case of a zero length cluster list, the
->length field is 0 and the ->list field is NULL. When performing an
insert, we set the ->list field regardless of whether the length is 0.
This resulted in the two cluster lists hashing equal but not comparing
equal. Later, when removing one of them from the hash before freeing
it, because the key matched and the comparison succeeded (because it
was set to NULL *after* the search but *before* inserting into the
hash) we would sometimes release the duplicated copy of the struct,
and then free the one that remained in the hash table. Later accesses
constitute UAF. This is fixed by making sure the fields used for the
existence check match what is actually inserted into the hash when
that check fails.
This patch also makes cluster_unintern static, because it should be.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
... Oops ...
(for context, the defaults code originally didn't have a dedicated
"bool" variant and just used long for bools... I derp'd this when
adding bool as a separate case :( )
Reported-by: Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: David Lamparter <equinox@diac24.net>
This macro is undefined if vnc is disabled, and while it defaults to 0,
this is still wrong and causes issues with -Werror
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
This is a full rewrite of the "back end" logging code. It now uses a
lock-free list to iterate over logging targets, and the targets
themselves are as lock-free as possible. (syslog() may have a hidden
internal mutex in the C library; the file/fd targets use a single
write() call which should ensure atomicity kernel-side.)
Note that some functionality is lost in this patch:
- Solaris printstack() backtraces are ditched (unlikely to come back)
- the `log-filter` machinery is gone (re-added in followup commit)
- `terminal monitor` is temporarily stubbed out. The old code had a
race condition with VTYs going away. It'll likely come back rewritten
and with vtysh support.
- The `zebra_ext_log` hook is gone. Instead, it's now much easier to
add a "proper" logging target.
v2: TLS buffer to get some actual performance
Signed-off-by: David Lamparter <equinox@diac24.net>
- each statistics is encapsulated into concatenated "<afi><safi>" value.
- the json encoding for floating and double values is using json api
double api. this change is done for bgp statistics.
- the lines over 80 characters have been handled.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
this command is a shortcut to facilitate the extraction of statistics
for all afi/safi related to one bgp instance.
the command is: show bgp [vrf XX] statistics-all [json]
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
safis that use a route distinguisher in bgp tables, and as such
introduce a two level hierarchy on the bgp table, must be made available
to statistics too.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
add json support for show bgp statistics command.
The title of the stats entry is aggregated without spaces.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The BGP Router MAC extended community should be unique and not occur
multiple times. In a VRF-to-VRF route-leak scenario where EVPN routes
from a source VRF are leaked into the target VRF and then injected
back into EVPN from the target VRF, the resulting route had more than
one RMAC. With this fix, the resulting route will have only the
target VRF's RMAC.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
The EVPN advertise route-map may generate extended communities for an IPv4
or IPv6 route injected into EVPN as type-5. If so, allow for it and add
to it.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Support configurable options to control how link bandwidth is handled
by the receiver. The default behavior is to automatically honor the
link bandwidths received and use it to perform a weighted ECMP BUT only
if all paths in the multipath have associated link bandwidth; if one or
more paths do not have link bandwidth, normal ECMP is performed among
the multipaths. This behavior is as recommended by
https://tools.ietf.org/html/draft-ietf-idr-link-bandwidth.
The additional options available are to (a) completely ignore any link
bandwidth (i.e., weighted ECMP is effectively disabled), (b) skip paths
in the multipath which do not have link bandwidth and perform weighted
ECMP among the other paths (if at least some paths have the bandwidth)
or (c) use a default weight (value chosen is 1) for the paths which
do not have link bandwidth.
The command syntax is
bgp bestpath bandwidth <ignore|skip-missing|default-weight-for-missing>
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
When announcing ourselves as the next hop (e.g., to EBGP peers), if the
best path has the link bandwidth extended community and it is transitive,
change the value of the link bandwidth to the cumulative downstream
bandwidth (sum of the link bandwidths of all our multipaths) as this
makes the most sense. It is also implied by
https://tools.ietf.org/html/draft-mohanty-bess-ebgp-dmz. Of course, do
not override the link bandwidth if it has been specified by policy.
Note: Transitive extended communities will be automatically passed along
to EBGP peers; this commit is updating the value that is announced to
something that is the most appropriate.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
Implement the code to handle the other route-map options to generate
the link bandwidth, namely, to use the cumulative bandwidth or to
base this on the number of multipaths. In the latter case, a reference
bandwidth is internally chosen - the implementation uses a value of
1 Gbps.
These additional options mean that the prefix may need to be advertised
if there is a link bandwidth change, which is a new criteria. Define a
new path (change) flag to support this and implement the advertisement.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
The BGP link bandwidth extended community must not be repeated. If the
attribute already carries this and the route-map specifies a new value,
the implementation will honor the policy configuration and overwrite
the existing values.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Certain extended communities cannot be repeated. An example is the
BGP link bandwidth extended community. Enhance the extended community
add function to ensure uniqueness, if requested.
Note: This commit does not change the lack of uniqueness for any of
the already-supported extended communities. Many of them such as the
BGP route target can obviously be present multiple times. Others like
the Router's MAC should most probably be present only once. The portions
of the code which add these may already be structured such that duplicates
do not arise.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Perform weighted ECMP if the multipaths have link bandwidth. This involves
assigning weights to each of the next hops associated with the prefix based
on the link bandwidth of the corresponding path as a factor of the total
(cumulative) link bandwidth for the prefix. The weight values used are
between 1 and 100. Weights are assigned only if all paths in the multipath
have link bandwidth, otherwise any bandwidths are ignored and regular
ECMP is performed. This is as recommended in
https://tools.ietf.org/html/draft-ietf-idr-link-bandwidth
A subsequent commit will implement additional (user-configurable) behaviors.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
During multipath update, track the cumulative link bandwidth
as well as update flags appropriately.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
Introduce fields in the multipath structure for link bandwidth handling.
In the process, the mp_count field is changed to a uint16 as that is the
value set anyway.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Additional extended community definitions and display of link-bandwidth
extended community.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Implement route-map option to set the link-bandwidth extended
community. The command is of the form:
set extcommunity bandwidth <(1-26214400)|cumulative|num-multipaths>
[non-transitive]
The options available are to specify the actual bandwidth value in
Mbps, base it on the cumulative downstream bandwidth or base it on
the number of multipaths. The last option is based on
https://tools.ietf.org/html/draft-mohanty-bess-ebgp-dmz. Further,
in alignment with the use case described in this IETF draft, the
extended community is encoded as transitive by default. There is an
option available to specify that it should be non-transitive.
The link-bandwidth itself is carried in bytes per second as specifed in
https://tools.ietf.org/html/draft-ietf-idr-link-bandwidth
Note: This commit only handles the processing for bandwidth specifed
as a value; subsequent commits will handle the processing of the other
options.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
In the past, we always displayed the number of buffered ingress packets
as zero because there was no packet buffering in the input path and
therefore never any queue size to report. They're buffered now so we can
display something meaningful instead of 0.
Also change the inq / outq lookups to be atomic, since they can be
modified elsewhere. These should still compile down to an unfenced word
read but it's good to be explicit.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
https://lists.frrouting.org/pipermail/frog/2020-March/000776.html
It was pointed out that we are not properly passing the nexthop
through and instead we were replacing the nexthop as a Route Server
with our own.
https://tools.ietf.org/html/rfc4456#section-4
10. Implementation Considerations
Care should be taken to make sure that none of the BGP path
attributes defined above can be modified through configuration when
exchanging internal routing information between RRs and Clients and
Non-Clients. Their modification could potentially result in routing
loops.
In addition, when a RR reflects a route, it SHOULD NOT modify the
following path attributes: NEXT_HOP, AS_PATH, LOCAL_PREF, and MED.
Their modification could potentially result in routing loops.
Modify the code such that when FRR is instructed to act as a
Route-Server to pass through the nexthop.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add new function `bgp_node_get_prefix()` and modify
the bgp code base to use it.
This is prep work for the struct bgp_dest rework.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Show if this malformed under `show [ip] bgp <prefix>`:
```
eva# sh ip bgp 103.79.124.0/22
BGP routing table entry for 103.79.124.0/22
Paths: (1 available, best #1, table default)
Advertised to non peer-group peers:
192.168.201.136
64539 15096 6939 7545 7545 136001, (aggregated by 0(malformed) 0.0.0.0)
192.168.201.136 from 192.168.201.136 (192.168.201.136)
Origin IGP, valid, external, best (First path received)
Last update: Thu Mar 26 10:02:07 2020
```
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Having a full feed this leads to unknown. You can't point which prefix or
aspath has this malforming behavior.
Printing just `[EC 33554434] AGGREGATOR attribute is BGP_AS_ZERO(0)` isn't
enough, you can't directly pin-point where is the problem.
Additionally print at least aspath here:
```
[EC 33554434] AGGREGATOR AS number is 0 for aspath: 65000 65031
```
Overall the full table has only 6 such malformed prefixes:
```
aspath: 64539 15096 6939 45430 45458
aspath: 64539 15096 6939 1299 3257 34984 34984 34984 34984 34984 51174
aspath: 64539 15096 6939 286 34984 16135 16135 {16135}
aspath: 64539 15096 6939 7545 7545 136001
aspath: 64539 15096 6939 6762 3269 20746
aspath: 64539 15096 6939 7018 3379
```
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Line break at the end of the message is implicit for zlog_* and flog_*,
don't put it in the string. Mid-message line breaks are currently
unsupported. (LF is "end of message" in syslog.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Problem seen that if "import vrf route-map RMAP" was entered
without any vrfs being imported, the configuration was displayed
as "route-map vpn import RMAP". Additionally, if "import vrf
route-map" was entered without specifying a route-map name,
the command was accepted and the word "route-map" would be
treated as a vrf name. This fix resolves both of those issues
and also allows deleting the "import vrf route-map" line without
providing the route-map name.
Ticket: CM-28821
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Modify code to use lookup function agg_node_get_prefix()
as the abstraction layer. When we rework bgp_node to
bgp_dest this will allow us to greatly limit the amount
of work needed to do that.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Future work needs the ability to specify a
const struct prefix value. Iterate into
bgp a bit to get this started.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Defer the grabbing of the prefix for as long as is possible.
This is a long term rework of how we access the `struct bgp_node`
to only use accessor functions.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
More second order effects of cleaning up rn usage
in bgp. Sprinkle the fairy const's all over the place.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Modify more code to use `const struct prefix` throughout
bgp. This is all prep work for adding an accessor function
for bgp_node to get the prefix and reduce all the places that
code needs to be touched when we get that work done.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Tell the compiler that the prefix is being used for lookups
and it will never change.
Setup for future work.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Some were converted to bool, where true/false status is needed.
Converted to void only those, where the return status was only false or true.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Convert some status defines for the fsm to an enum
so that we cannot mix and match them in the future.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
In PR #6052 which fixes issue #5963 the bgp fsm events
were confused with the bgp fsm status leading
to a bug. Let's start separating those out
so these types of failures cannot just
easily occur.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ensure that the EVPN advertise route-map is applied on a copy of the
original path_info and associated attribute, so that if the route-map
has SET clauses, they can operate properly. This closely follows
the model already in use in other route-map application code.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
During VRF-to-VRF route leaking, strip any extraneous route targets. This
ensures that source-VRF-specific route targets or route targets that are
internally assigned for the VRF-to-VRF route leaking don't get attached
to the route in the target VRF.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
Extended communities like the BGP Route Target can be present multiple
times in a route's path attribute. Ensure that the strip function for a
particular extended community (type and subtype) handles this and
strips all occurrences.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
A BGP update-group is dynamically created to group together a set of peers
such that any BGP updates can be formed just once for the entire group and
only the next hop attribute may need to be modified when the update is sent
out to each peer in the group. The update formation code attempts to
determine as much as possible if the next hop will be set to our own IP
address for every peer in the group. This helps to avoid additional checks
at the point of sending the update (which happens on a per-peer basis) and
also because some other attributes may/could vary depending on whether the
next hop is set to our own IP or not. Resetting the next hop to our own IP
address is the most common behavior for EBGP peerings in the absence of
other user-configured or internal (e.g., for l2vpn/evpn) settings and
peerings on a shared subnet.
The code had a flaw in the multiaccess check to see if there are peers in
the update group which are on a shared subnet as the next hop of the path
being announced - the source peer could itself be in the same update group
and cause the check to give an incorrect result. Modify the check to skip
the source peer so that the check is more accurate.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
this command is missing, compared with 'match ipv6 next-hop' command
available. Adding it by taking into account the backward compatible
effect when supposing that some people have configured acls with name
being an ipv4 address.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The bgp reason code was being reset in bgp_best_selection
by rerunning bgp_path_info_cmp multiple times under certain
receiving patterns of data from peers.
This is the debugs that show this issue:
2020/03/16 19:17:22.523780 BGP: 2001:20:1:1::6 rcvd UPDATE w/ attr: nexthop 20.1.1.6, origin i, metric 600, community 1000:1006, path 20
2020/03/16 19:17:22.523819 BGP: 2001:20:1:1::6 rcvd 20.10.0.6/32 IPv4 unicast
2020/03/16 19:17:22.556168 BGP: 20.1.1.6 rcvd UPDATE w/ attr: nexthop 20.1.1.6, origin i, metric 500, community 1000:1006, path 20
2020/03/16 19:17:22.556209 BGP: 20.1.1.6 rcvd 20.10.0.6/32 IPv4 unicast
2020/03/16 19:17:22.572358 BGP: bgp_process_main_one: p=20.10.0.6/32 afi=IPv4, safi=unicast start
2020/03/16 19:17:22.572408 BGP: 20.10.0.6/32: Comparing path 2001:20:1:1::6 flags 0x410 with path 20.1.1.6 flags 0x410
2020/03/16 19:17:22.572415 BGP: 20.10.0.6/32: path 2001:20:1:1::6 loses to path 20.1.1.6 due to MED 600 > 500
2020/03/16 19:17:22.572422 BGP: 20.10.0.6/32: path 20.1.1.6 is the bestpath from AS 20
2020/03/16 19:17:22.572429 BGP: 20.10.0.6/32: path 20.1.1.6 is the initial bestpath
2020/03/16 19:17:22.572435 BGP: bgp_best_selection: pi 0x5627187c66c0 dmed
2020/03/16 19:17:22.572441 BGP: 20.10.0.6/32: After path selection, newbest is path 20.1.1.6 oldbest was NONE
2020/03/16 19:17:22.572447 BGP: 20.10.0.6/32: path 20.1.1.6 is the bestpath, add to the multipath list
2020/03/16 19:17:22.572453 BGP: 20.10.0.6/32: path 2001:20:1:1::6 has the same nexthop as the bestpath, skip it
2020/03/16 19:17:22.572460 BGP: 20.10.0.6/32: starting mpath update, newbest 20.1.1.6 num candidates 1 old-mpath-count 0 old-cum-bw u0
2020/03/16 19:17:22.572466 BGP: 20.10.0.6/32: comparing candidate 20.1.1.6 with existing mpath NONE
2020/03/16 19:17:22.572473 BGP: 20.10.0.6/32: New mpath count (incl newbest) 1 mpath-change NO all_paths_lb 0 cum_bw u0
Effectively if BGP receives 2 paths it could end up running bgp_path_info_cmp multiple times
and in some situations overwrite the reason selected the first time through.
In this example path selection is run and the MED is the reason for the choice.
Then in bgp_best_selection is run again this time clearing new_select
to NULL before calling path selection for the first time. This second
call into path selection resets the reason, since it is only passing in one
path. So save the last reason selected and restore in this case.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This scenario has been seen against microtik virtual machine with
bfd enabled. When remote microtik bgp reestablishes the bgp session
after a bgp reset, the bgp establishment comes first, then bfd is
initialising.
The second point is true for microtik, but not for frrouting, as the
frrouting, when receiving bfd down messages, is not at init state.
Actually, bfd state is up, and sees the first bfd down packet from
bfd as an issue. Consequently, the BGP session is cleared.
The fix consists in resetting the BFD session, only if bfd status is
considered as up, once BGP comes up.
That permits to align state machines of both local and remote bfd.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When bgp is updated with local source, the bgp session is reset; bfd
also must be reset. The bgp_stop() handler handles all kind of
unexpected failures, so the placeholder to deregister from bfd should be
ok, providing that when bgp establishes, a similar function in bgp will
recreate bfd context.
Note that the bfd session is not reset on one specific case, where BFD
down event is the last reset. In that case, we must let BFD to monitor
the link.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
If the default BGP instance is importing routes from another instance and
the latter has a router-id update, the update handler needs to handle the
default instance in a special way.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Chirag Shah <chirag@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
Ticket: CM-26007
Reviewed By: CCR-9108
Testing Done: Detailed verification in 3.x
Ensure that the late registration for NHT done for IPv4 route exchange
over IPv6 GUA peering is not attempted for peer-groups, only for peers.
Fixes: "bgpd: Late registration of Extended Nexthop"
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
This scenario has been seen against microtik virtual machine with bfd
enabled. When remote microtik bgp reestablishes the bgp session after a
bgp reset, the bgp establishment comes first, then bfd is initialising.
The second point is true for microtik, but not for frrouting, as the
frrouting, when receiving bfd down messages, is not at init state.
Actually, bfd state is up, and sees the first bfd down packet from bfd
as an issue. Consequently, the BGP session is cleared.
The fix consists in resetting the BFD session, once BGP comes up. That
permits to align state machines of both local and remote bfd.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When bgp is updated with local source, the bgp session is reset; bfd
also must be reset. The bgp_stop() handler handles all kind of
unexpected failures, so the placeholder to deregister from bfd should be
ok, providing that when bgp establishes, a similar function in bgp will
recreate bfd context.
Note that the bfd session is not reset on one specific case, where BFD
down event is the last reset. In that case, we must let BFD to monitor
the link.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This fixes a linking issue on Fedora Rawhide:
/usr/bin/ld: bgpd/libbgp.a(bgp_flowspec.o):/home/ruben/src/frr/./bgpd/bgp_attr_evpn.h:37: multiple definition of `eth_tag_id'; bgpd/bgp_btoa-bgp_btoa.o:/home/ruben/src/frr/./bgpd/bgp_attr_evpn.h:37: first defined here
collect2: error: ld returned 1 exit status
Signed-off-by: Ruben Kerkhof <ruben@rubenkerkhof.com>
We were using XMALLOC for these, and only initializing the refcount to 0
on one of them. Let's just use XCALLOC instead...
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
RCA: When doppelganger still around and clear bgp is issued
there are chances of peer getting deleted and next pointer
is a freed peer pointer.
Fix: Pass address of nnode to get next safe peer pointer.
Signed-off-by: Santosh P K <sapk@vmware.com>
It's been a year search and destroy.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
We already have a generic support for add/sub in route-maps. It's already
handled in route_value_compile().
Just convert to string (allow passing (-) minus sign) - works like expected.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Some code in bgp_route_refresh_receive was spread across several
lines because of an end of line commit. Move comment to a place
to allow better formating.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
SA has found a case where we did a table lookup of a rn( and
associated lock of that node ) where we did not unlock it.
Unlock the node before moving on to the next one.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Currently During bgp open collision resolution if both
the router-id's are the same, we correctly follow
the RFC and close the connection. The problem is of course
that there is no notification of the error in configuration
to the end user other than a subtle open debug message.
Explicitly call out the miss-configuration as an error message
as that this miss-config took several hours of debugging to notice.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
annie# show bgp ipv4 uni summ
BGP router identifier 192.168.201.136, local AS number 64539 vrf-id 0
BGP table version 22458946
RIB entries 1458006, using 178 MiB of memory
Peers 4, using 68 KiB of memory
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd
45.33.5.119 4 0 0 0 0 0 0 never Active
65.19.134.122 4 15096 4611832 108292 0 0 0 6d22h55m 800670
107.13.46.23 4 0 0 0 0 0 0 never Connect
robot(192.168.201.139) 4 64540 11159975 11365599 0 0 0 05w2d05h Connect
Total number of neighbors 4
On very busy systems The column output for MsgRcvd and MsgSent can quickly move past 7 columns.
Add a couple more to allow for even display.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
If the peer was shutdown locally, it doesn't show up as admin. shutdown.
Instead it's treated as "Waiting for peer OPEN".
The same applies to when the peer reaches maximum-prefix count.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
During route-map processing we return an enum, the rpki
code was doing some extra gyrations that were unnecessary.
Simplify.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Prefix-SID path attribute Label-index TLV (type-1) is
used by SR-MPLS. And Label-index TLV MUST ignored
if that path attribute is append on non-Labeled-unicast
UPDATE message described on [ref1].
There is a problem case exist arround this implementation.
This commit fix that.
Before this commit,
unfortunally, setting Label-Index value is skipped at somecases.
because, Label-Index TLV implementation check the AFI/SAFI pair.
by mp_update variable that is set by bgp_mp_reach_parse function.
if MP_REACH_NLRI is present after PREFIX_SID, bgp_attr_psid_sub
function can't understand AFI/SAFI pair. and the order of each
path attributes is never no-deterministic thing for receiver.[ref2]
In this commit,
I re-located checking code of AFI/SAFI pair after path-attr loop.
[ref1](https://tools.ietf.org/html/draft-ietf-idr-bgp-prefix-sid-27#section-3.2)
> The Originator SRGB TLV may only appear in a BGP Prefix-SID attribute
> attached to IPv4/IPv6 Labeled Unicast prefixes ([RFC8277]). It MUST
> be ignored when received for other BGP AFI/SAFI combinations.
[ref2](https://tools.ietf.org/html/rfc4271#section-5)
> The sender of an UPDATE message SHOULD order path attributes within
> the UPDATE message in ascending order of attribute type. The
> receiver of an UPDATE message MUST be prepared to handle path
> attributes within UPDATE messages that are out of order.
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
Prefix-SID is desined to capable for TLV array.
That behaviour is important to support SR-MPLS feature
and that supported by previous PR #5418.
In that implementation, but if some additional data
(such as next BGP update message or next path attributes)
was present after Prefix-SID path attribute,
bgpd will parse that addional data as Prefix-SID TLV.
This commit fix that. before this commit, loop condition
is determed by stream is readable or not. In more correct
implementatoin, the prefix-sid boundaly should be checked
additonally. the length of Prefix-sid path attribute can
be get by bgp_attr_parse_args.
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
Override ORIGIN attribute if defined.
E.g.: Cisco and Juniper set ORIGIN for aggregated address
to IGP which is not what rfc4271 says.
This enables the same behavior, optionally.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Track the returned peer_sorted value and use it where
we can and recalculate where necessary.
This is an effort to reduce the amount of work done here.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The act of peer_sort() being called always set this value
even when we are just looking it up. We need to seperate
out the idea of lookup from set.
For those places that this is immediately obvious that
this is a lookup switch over to using this function.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
A route where ESI, GW IP, MAC and Label are all zero at the same time SHOULD
be treat-as-withdraw.
Invalid MAC addresses are broadcast or multicast MAC addresses. The route
MUST be treat-as-withdraw in case of an invalid MAC address.
As FRR support Ethernet NVO Tunnels only.
Route will be withdrawn when ESI, GW IP and MAC are zero or Invalid MAC
Test cases:
1) ET-5 route with valid RMAC extended community
2) ET-5 route no RMAC extended community
3) ET-5 route with Multicast MAC in RMAC extended community
4) ET-5 route with Broadcast MAC in RMAC extended community
Signed-off-by: Kishore Aramalla <karamalla@vmware.com>
Current failed reasons for bgp when you have a peer that
is not online yet is `Waiting for NHT`, even if NHT has
succeeded. Add some code to differentiate this.
eva# show bgp ipv4 uni summ failed
BGP router identifier 192.168.201.135, local AS number 3923 vrf-id 0
BGP table version 0
RIB entries 0, using 0 bytes of memory
Peers 2, using 43 KiB of memory
Neighbor EstdCnt DropCnt ResetTime Reason
192.168.44.1 0 0 never Waiting for NHT
192.168.201.139 0 0 never Waiting for Open to Succeed
Total number of neighbors 2
eva#
eva# show bgp nexthop
Current BGP nexthop cache:
192.168.44.1 invalid, peer 192.168.44.1
Must be Connected
Last update: Mon Feb 10 19:05:19 2020
192.168.201.139 valid [IGP metric 0], #paths 0, peer 192.168.201.139
So 192.168.201.139 is a peer for a connected route that has not been
created on .139, while 44.1 nexthop tracking has not succeeded yet.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
RCA:
When we install IPv6 prefix imported from EVPN RT-5 in vrf, nexthop of the IPv6
route should be IPv4 mapped IPv6 address. In function
install_evpn_route_entry_in_vrf, we generate a new attribute with IPv4 mapped
IPv6 nexthop, but we use parent->attr while creating the actual route.
Thus, Ipv4 nexthop is assigned to this route.
Because of this incorrect nexthop, we observed a crash in function
update_ipv6nh_for_route_install.
Fix:
Pass the new attribute with Ipv4 mapped Ipv6 nexthop to
bgp_create_evpn_bgp_path_info
Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
For some reason we are getting a compile error around a variable I didn't
touch in the other commits. Make it happy.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
There is no need to have a temp variable to then store that
data in another temporary variable.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The new_afi and afi were being used over and over. Switch
to the end result we want and just use that from the get go.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The creation of a prefix pointer is unnecessary. Save the
prefix as part of the actual data structure. This will
reduce the data needed by 8 bytes per nexthop stored.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
According to https://tools.ietf.org/html/rfc7606 some of the attributes
MUST be handled as "treat-as-withdraw" approach.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
bgp flowspec packets are being forged correctly. There is no need to
check for bgp length, as the bgp nlri length is checked at reception.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
There is no need for a call into get_afi_safi_str for the
json side since we add it based upon the afi safi str below.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The coverity SA believes that the regex value can possibly
be NULL. Not possible so let's make it happy.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
If you have enums handled in a switch adding a default case
makes it fun to fix when new stuff is added later. Remove.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
For Graceful restart clients have to send GR capabilities
library functions are added to encode capabilities and
also for zebra to decode client capabilities.
Co-authored-by: Santosh P K <sapk@vmware.com>
Co-authored-by: Soman K S <somanks@vmware.com>
Signed-off-by: Santosh P K <sapk@vmware.com>
Two of the evpn show commands with json option has memory leak.
1) show bgp l2vpn evpn route vni all json
2) show bgp l2vpn evpn route esi json
Before fix:
----------
Executed 'show bgp l2vpn evpn route vni all json' multiple times
used ordinary blocks continue to increase.
Note at the time of show command capture there were 22 evpn routes
in vni evpn route table.
Memory statistics for bgpd:
System allocator statistics:
Total heap allocated: 9152 KiB
Holding block headers: 0 bytes
Used small blocks: 0 bytes
Used ordinary blocks: 7300 KiB
Free small blocks: 1760 bytes
Free ordinary blocks: 1852 KiB
Ordinary blocks: 880
Small blocks: 51
Holding blocks: 0
Ticket:CM-27920
Reviewed By:
Testing Done:
After fix:
---------
Executed 'show bgp l2vpn evpn route vni all json' multiple times
Used ordinary blocks remains low.
Memory statistics for bgpd:
System allocator statistics:
Total heap allocated: 8356 KiB
Holding block headers: 0 bytes
Used small blocks: 0 bytes
Used ordinary blocks: 6492 KiB
Free small blocks: 1840 bytes
Free ordinary blocks: 1864 KiB
Ordinary blocks: 939
Small blocks: 52
Holding blocks: 0
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Found memory leak in json output of evpn's route
commands.
After executing 'show bgp l2vpn evpn route type prefix json'
and 'show bgp l2vpn evpn route type macip json' few times
(6 times) with more than 600 routes in total seeing
memory footprint for bgpd continue to grow.
Memory statistics for bgpd:
System allocator statistics:
Total heap allocated: 12 MiB
Holding block headers: 0 bytes
Used small blocks: 0 bytes
Used ordinary blocks: 8390 KiB
Free small blocks: 1760 bytes
Free ordinary blocks: 3762 KiB
Ordinary blocks: 1161
Small blocks: 51
Holding blocks: 0
Ticket:CM-27920
Testing Done:
After fix:
excute few times,
'show bgp l2vpn evpn route type prefix json'
and 'show bgp l2vpn evpn route type macip json'
commands where used ordinary blocks (uordblks) is
in steady state.
Memory statistics for bgpd:
System allocator statistics:
Total heap allocated: 9968 KiB
Holding block headers: 0 bytes
Used small blocks: 0 bytes
Used ordinary blocks: 6486 KiB
Free small blocks: 1984 bytes
Free ordinary blocks: 3482 KiB
Ordinary blocks: 1110
Small blocks: 54
Holding blocks: 0
Memory statistics for bgpd:
System allocator statistics:
Total heap allocated: 10100 KiB
Holding block headers: 0 bytes
Used small blocks: 0 bytes
Used ordinary blocks: 6488 KiB
Free small blocks: 1984 bytes
Free ordinary blocks: 3612 KiB
Ordinary blocks: 1113
Small blocks: 54
Holding blocks: 0
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
* While the Deferral timer is running, signal route update pending
(ZEBRA_CLIENT_ROUTE_UPDATE_PENDING) from BGPD to Zebra.
* After expiry of the Deferral timer, the deferred routes are processed.
When the deferred route_list becomes empty, End-of-Rib is send to the
peer and route processing complete message (ZEBRA_CLIENT_ROUTE_UPDATE_COMPLETE)
is sent to Zebra. So that Zebra would delete any stale routes still
present in the rib.
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
* Added CLI commands to update rib-stale-time, running in
Cmd : "bgp gaceful-restart rib-stale-time (1-3000)".
Cmd : "no bgp gaceful-restart rib-stale-time".
* Integrating the hooks function for signalling from BGPD
to ZEBRA to ZEBRA to enable or disable GR feature in ZEBRA
depending on bgp per peer gr configuration.
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
*Adding helper caller hooks function for signalling from BGPD
to ZEBRA to enable or disable GR feature in ZEBRA depending
on bgp per peer gr configuration.
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
*Adding helper function for signalling from BGPD to ZEBRA to
enable or disable GR feature in ZEBRA depending on bgp per
peer gr configuration.
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
Data Structures, function declaration and Macros forSignalling
from BGPD to ZEBRA to enable or disable GR feature in ZEBRA
depending on bgp per peer gr configuration.
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
*After a restarting router comes up and the bgp session is
successfully established with the peer. If the restarting
router doesn’t have any route to send, it send EOR to
the peer immediately before receiving updates from its peers.
*Instead the restarting router should send EOR, if the
selection deferral timer is not running OR count of eor received
and eor required are matches then send EOR.
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
BGP disable EOR sending is a useful command for testing various
scenarios of BGP graceful restart.
* Added the hidden CLI command : bgp graceful-restart disable-eor
* The CLI will not be displayed in "show running-config" and will not
be stored in configuration file.
* When enabled, EOR will not be sent to peer
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
Signed-off-by: Soman K S <somanks@vmware.com>
When the peer router's gr mode had changed from helper/restart
to disable. The local bgp gr router should reset the peer
router's restart-time stored.
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
bgp tcp connection.
When the BGP peer is configured between two bgp routes both routers would create
peer structure , when they receive each other’s open message. In this event both
speakers, open duplicate TCP sessions and send OPEN messages on each socket
simultaneously, the BGP Identifier is used to resolve which socket should be closed.
If BGP GR is enabled the old tcp session is dumped and the new session is retained.
So while this transfer of connection is happening, if all the bgp gr config
is not migrated to the new connection, the new bgp gr mode will never get applied.
Fix Summary:
1. Replicate GR configuration from the old session to the new session in bgp_accept().
2. Replicate GR configuration from stub to full-fledged peer in bgp_establish().
3. Disable all NSF flags, clear stale routes (if present), stop restart & stale timers
(if they are running) when the bgp GR mode is changed to “Disabled”.
4. Disable R-bit in cap, if it is not set the received open message.
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
BGP GR Neighbor mode is showing the default string as “NotRecieved”,
as the bgp gr neighbour capability was not processed,
since the local mode is “Disable”.
However now it would be changed to “NotApplicable”.
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
& GR is enabled.
When GR with deferral is enabled and connected routes are
distributed then in one race condition route node gets added
in to both deferred queue and work queue. If deferred queue
gets processed first then it ends up delete only flag while
leaving the entry in the work queue as it is. When a new update
comes for the same route node next time from peer then it hits
assert. Assert check is added to ensure we don’t add to work queue
again while it is already present.
So, check before adding in to deferred queue if it is already present
in work queue and bail if so.
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
* Changing GR mode on a router needs a session reset from the
SAME router to negotiate new GR capability.
* The present GR implementation needs a session reset after every
new BGP GR mode change.
* When BGP session reset happens due to sending or receiving BGP
notification after changing BGP GR mode, there is no need of
explicit session reset.
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
* BGP GR Neighbour mode in show command would show as
“NotApplicable”, when local mode is “Disable”. As the bgp
gr neighbour capability was not processed, since the local mode
is “Disable”.
* Minor changes in show Selection Deferral Time.
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
* Selection Deferral Timer for Graceful Restart.
* Added selection deferral timer handling function.
* Route marking as selection defer when update message is received.
* Staggered processing of routes which are pending best selection.
* Fix for multi-path test case.
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
and DS.
* Added config commands and data structures for deferral timer
configuration and processing.
Cmd : bgp graceful-restart select-defer-time (0-3600)
Cmd : no bgp graceful-restart select-defertime (0-3600)
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
Signed-off-by: Soman K S <somanks@vmware.com>
* Added new show command to show the graceful restart
information for each neighbor.
Cmd: show bgp [<ipv4|ipv6>] neighbors [<A.B.C.D|X:X::X:X|WORD>] graceful-restart
* Changes to show neighbors commands for displaying
graceful restart information.
Cmd :show [ip] bgp [<view|vrf> VIEWVRFNAME] [<ipv4|ipv6>] neighbors [<A.B.C.D|X:X::X:X|
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
* Changes to the capability sending function to advertise
graceful restart capability in the bgp OPEN message.
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
* Added FSM for peer and global configuration for graceful restart
* Added debug option BGP_GRACEFUL_RESTART for logs specific to
graceful restart processing
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
Duplicated domain name capability messages cause memory leak. The amount
of leaked memory is proportional to the size of the duplicated
capabilities. This bug was introduced in 2015.
To hit this, a BGP OPEN message must contain multiple FQDN capabilities.
Memory is leaked when the hostname portion of the capability is of
length 0, but the domainname portion is not, for any of the duplicated
capabilities beyond the first one.
https://tools.ietf.org/html/draft-walton-bgp-hostname-capability-00
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
When withdrawing addpaths, adj_lookup was called to find the path that
needed to be withdrawn. It would lookup in the RB tree based on subgroup
pointer alone, often find the path with the wrong addpath ID, and return
null. Only the path highest in the tree sent to the subgroup could be
found, thus withdrawn.
Adding the addpath ID to the sort criteria for the RB tree allows us to
simplify the logic for adj_lookup, and address this problem. We are able
to remove the logic around non-addpath subgroups because the addpath ID
is consistently 0 for non-addpath adj_outs, so special logic to skip
matching the addpath ID isn't required. (As a side note, addpath will
also never use ID 0, so there won't be any ambiguity when looking at the
structure content.)
Signed-off-by: Mitchell Skiba <mskiba@amazon.com>
The vrrpd one conflicts with the standalone vrrpd package; also we're
installing daemons to /usr/lib/frr on some systems so they're not on
PATH.
Signed-off-by: David Lamparter <equinox@diac24.net>
bgpd already supports BGP Prefix-SID path attribute and
there are some sub-types of Prefix-SID path attribute.
This commits makes bgpd to support additional sub-types.
sub-Type-4 and sub-Type-5 for construct the VPNv4 SRv6 backend
with vpnv4-unicast address family.
This path attributes is already supported by Ciscos IOS-XR and NX-OS.
Prefix-SID sub-Type-4 and sub-Type-5 is defined on following
IETF-drafts.
Supports(A-part-of):
- https://tools.ietf.org/html/draft-dawra-idr-srv6-vpn-04
- https://tools.ietf.org/html/draft-dawra-idr-srv6-vpn-05
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
For evpn routes, nexthop and RMAC fileds are synced
in route add to zebra.
In case of EVPN routes display RMAC field in route add
debug log.
Reviewed By:CCR-9381
Testing Done:
BGP: nhop [1]: 27.0.0.11 if 30 VRF 26 RMAC 00:02:00:00:00:2e
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
This commit makes bgpd to support VPNv4's extended
nexthop capability for bgp-capability negotiation
when BGP open messaging.
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
uint8_t * cannot be cast to uint32_t * unless the
pointed-to address is aligned according to uint32_t's
alignment rules. And it usually is not.
Signed-off-by: Santosh P K <sapk@vmware.com>
With this change, we are able to set attributes via route-map to the default
route. It's useful in cases where we have two or more spines and we want to
prefer one router over others for leaves. This simplifies configuration instead
of using 'network 0.0.0.0/0' or 'ip route 0.0.0.0/0 ...' and 'redistribute
static' combination.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
uint8_t * cannot be cast to uint32_t * unless the pointed-to address is
aligned according to uint32_t's alignment rules. And it usually is not.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
advertise pip running configuration should
display ip followed by mac parameters value as defined
in cli signature.
advertise-pip is enabled by default, when displaying the
running configuration, there is '\n' added after
ip and mac parameters which was not guarded around
the non-default parameters.
Currently, for every bgp vrf instance it ends up
displaying l2vpn address-family section due to
unguarded newline.
running config:
router bgp 6004 vrf vrf1
!
address-family l2vpn evpn
exit-address-family
!
Ticket:CM-26964
Testing Done:
With fix when only 'router bgp 6004 vrf vrf1' configured,
running config looks like:
!
router bgp 6004 vrf vrf1
!
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
This commit is about #5629 's issue.
Before this commit, bgpd creates format string of
bgp-route-distinguisher as int32, but correctly format
is uint32. current bgpd's sh-run-cli generate int32 rd,
so if user sets the rd as 1:4294967295(0x1:0xffffffff),
sh-run cli generates 1: -1 as running-config. This
commit fix that issue.
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
Guess what - for a bounds check to work, it has to happen *before* you
read the data. We were trusting the attribute field received in a prefix
SID attribute and then checking if it was correct afterwards, but if was
wrong we'd crash before that.
This fixes the problem, and adds additional paranoid bounds checks.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
We should send a NOTIFICATION message with the Error Code Finite State
Machine Error if we receive NOTIFICATION in OpenSent state
as defined in https://tools.ietf.org/html/rfc4271#section-8.2.2
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
bgp nexthop cache update triggers RA for global ipv6
nexthop update.
In case of blackhole route type the outgoing interface
information is NULL which leads to bgpd crash.
Skip sending RA for blackhole nexthop type.
Ticket:CM-27299
Reviewed By:
Testing Done:
Configure bgp neighbor over global ipv6 address.
Configure static blackhole route with prefix includes
connected ipv6 global address.
Upon link flap, zebra sends nexthop update to bgp.
Bgp nexthop cache skips sending RA for blackhole nexthop type.
router bgp 65002
bgp router-id 91.189.93.190
...
neighbor 2001:67c:1360::b peer-group internal
static route:
ipv6 route 2001:67c:1360::/48 Null0 254
iface rowlink.4010
address 91.189.93.190/32
address 2001:67c:1360::a/128
Trigger ifdown rowlink.4010; ifup rowlink.4010
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
It's quite confusing when you see this:
```
exit1-debian-9(config-router)# bgp listen
listen Configure BGP defaults
```
And:
```
exit1-debian-9(config-router)# no bgp listen
listen unset maximum number of BGP Dynamic Neighbors that can be created
```
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
When passing a v4 multicast route to a peer send
the v4 nexthop as a preferred methodology.
Fixes: #5582
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Fixes:
```
exit1-debian-9(config-router)# no bgp listen range 192.168.10.0/24 peer-group TEST
% Peer-group does not exist
exit1-debian-9(config-router)#
```
Closes https://github.com/FRRouting/frr/issues/5570
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
'NOTIFICATION' string in this message incorrectly implies a BGP
Notification message was the cause of this log. Removing it to
reduce confusion and replacing with function name.
Signed-off-by: Trey Aspelund <taspelund@cumulusnetworks.com>
Before it was:
```
exit1-debian-9# show ip bgp regexp ^200a
Invalid character in as-path access-list ^200a
```
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
rfapi_descriptor_rfp_utils.c is already built into libbgp.a and these
include paths have no effect at all.
Signed-off-by: David Lamparter <equinox@diac24.net>
This should keep backward compatibility when bgp show-hostname is
enabled/disabled.
Also show the real originator IP instead of showing fqdn of the route
reflector.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
To keep the calling code agnostic of the DNS resolver libary used, pass
a strerror-style string instead of a status code that would need extra
handling.
Signed-off-by: David Lamparter <equinox@diac24.net>
libc-ares doesn't do IP literals, so we have to do that before running
off to do DNS. Since this isn't BMP specific, move to lib/ so NHRP can
benefit too.
Signed-off-by: David Lamparter <equinox@diac24.net>
Problem: BGP peer pointer is present in keepalive hash table
even when socket has been closed in some race condition.
When keepalive tries to access this peer it asserts.
RCA: Below sequence of events causing assert.
1. Config node peer has went down due to TCP reset
it's FD has been set to -1.
2. Doppelganger peer goes to established state and it has
been added to peer hash table for keepalive when it was
in openconfirm state.
3. Config node parameters including FD are exchanged with
doppelganger. Doppelganger will not have FD -1.
4. Doppelganger will be deleted as part of this it will
remove it from the keepalive peer hash table.
5. While removing from hash table it tries to acquire lock.
6. During this time keepalive thread has the lock and in
a loop trying to send keepalive for peers in hash table.
7. It tries to send keepalive for doppelganger peer with fd
set to -1 and asserts.
Signed-off-by: Santosh P K <sapk@vmware.com>
* Move VNC interning to the appropriate spot
* Use existing bgp_attr_flush_encap to free encap sets
* Assert that refcounts are correct before exiting to keep the demons
contained in their fiery prison
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Use a per-nexthop flag to indicate the presence of labels; add
some utility zapi encode/decode apis for nexthops; use the zapi
apis more consistently.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
This moves all the DFLT_BGP_* stuff over to the new defaults mechanism.
bgp_timers_nondefault() added to get better file-scoping.
v2: moved everything into bgp_vty.c so that the core BGP code is
independent of the CLI-specific defaults. This should make the future
northbound conversion easier.
Signed-off-by: David Lamparter <equinox@diac24.net>
There's no good reason to have this in bgpd.c; it's just there
historically. Move it to bgp_vty.c where it makes more sense.
Signed-off-by: David Lamparter <equinox@diac24.net>
Add a bit of code to allow hostname lookup failure to
not stall bmp communication.
Fixes: #5382
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The function bgp_table_range_lookup attempts to walk down
the table node data structures to find a list of matching
nodes. We need to guard against the current node from
not matching and not having anything in the child nodes.
Add a bit of code to guard against this.
Traceback that lead me down this path:
Nov 24 12:22:38 frr bgpd[20257]: Received signal 11 at 1574616158 (si_addr 0x2, PC 0x46cdc3); aborting...
Nov 24 12:22:38 frr bgpd[20257]: Backtrace for 11 stack frames:
Nov 24 12:22:38 frr bgpd[20257]: /lib64/libfrr.so.0(zlog_backtrace_sigsafe+0x67) [0x7fd1ad445957]
Nov 24 12:22:38 frr bgpd[20257]: /lib64/libfrr.so.0(zlog_signal+0x113) [0x7fd1ad445db3]1ad445957]
Nov 24 12:22:38 frr bgpd[20257]: /lib64/libfrr.so.0(+0x70e65) [0x7fd1ad465e65]ad445db3]1ad445957]
Nov 24 12:22:38 frr bgpd[20257]: /lib64/libpthread.so.0(+0xf5f0) [0x7fd1abd605f0]45db3]1ad445957]
Nov 24 12:22:38 frr bgpd[20257]: /usr/lib/frr/bgpd(bgp_table_range_lookup+0x63) [0x46cdc3]445957]
Nov 24 12:22:38 frr bgpd[20257]: /usr/lib64/frr/modules/bgpd_rpki.so(+0x4f0d) [0x7fd1a934ff0d]57]
Nov 24 12:22:38 frr bgpd[20257]: /lib64/libfrr.so.0(thread_call+0x60) [0x7fd1ad4736e0]934ff0d]57]
Nov 24 12:22:38 frr bgpd[20257]: /lib64/libfrr.so.0(frr_run+0x128) [0x7fd1ad443ab8]e0]934ff0d]57]
Nov 24 12:22:38 frr bgpd[20257]: /usr/lib/frr/bgpd(main+0x2e3) [0x41c043]1ad443ab8]e0]934ff0d]57]
Nov 24 12:22:38 frr bgpd[20257]: /lib64/libc.so.6(__libc_start_main+0xf5) [0x7fd1ab9a5505]f0d]57]
Nov 24 12:22:38 frr bgpd[20257]: /usr/lib/frr/bgpd() [0x41d9bb]main+0xf5) [0x7fd1ab9a5505]f0d]57]
Nov 24 12:22:38 frr bgpd[20257]: in thread bgpd_sync_callback scheduled from bgpd/bgp_rpki.c:351#012; aborting...
Nov 24 12:22:38 frr watchfrr[6779]: [EC 268435457] bgpd state -> down : read returned EOF
Nov 24 12:22:38 frr zebra[5952]: [EC 4043309116] Client 'bgp' encountered an error and is shutting down.
Nov 24 12:22:38 frr zebra[5952]: zebra/zebra_ptm.c:1345 failed to find process pid registration
Nov 24 12:22:38 frr zebra[5952]: client 15 disconnected. 0 bgp routes removed from the rib
I am not really 100% sure what we are really trying to do with this function, but we must
guard against child nodes not having any data.
Fixes: #5440
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When dumping a large bit of table data via bgp_show_table
and if there is no information to display for a particular
`struct bgp_node *` the data allocated via json_object_new_array()
is leaked. Not a big deal on small tables but if you have a full
bgp feed and issue a show command that does not match any of
the route nodes ( say `vtysh -c "show bgp ipv4 large-community-list FOO"`)
then we will leak memory.
Before code change and issuing the above show bgp large-community-list command 15-20 times:
Memory statistics for bgpd:
System allocator statistics:
Total heap allocated: > 2GB
Holding block headers: 0 bytes
Used small blocks: 0 bytes
Used ordinary blocks: > 2GB
Free small blocks: 31 MiB
Free ordinary blocks: 616 KiB
Ordinary blocks: 0
Small blocks: 0
Holding blocks: 0
After:
Memory statistics for bgpd:
System allocator statistics:
Total heap allocated: 924 MiB
Holding block headers: 0 bytes
Used small blocks: 0 bytes
Used ordinary blocks: 558 MiB
Free small blocks: 26 MiB
Free ordinary blocks: 340 MiB
Ordinary blocks: 0
Small blocks: 0
Holding blocks: 0
Please note the 340mb of free ordinary blocks is from the fact I issued a
`show bgp ipv4 uni json` command and generated a large amount of data.
Fixes: #5445
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Early exits without appropriate cleanup were causing obscure double
frees and other issues later on in the attribute parsing code. If we
return anything except a hard attribute parse error, we have cleanup and
refcounts to manage.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Without with fix we can't delete large-community-list using
no bgp large-community-list standard WORD, but no bgp large-community-list WORD
Let's keep this identical what we have with expanded lists as well.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
This patch allows using sequence numbers for community lists. We already have
this for prefix-lists and access-lists.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
In rare situations, the local route in a VNI may not get selected as the
best route. One situation is during a race between bgp and zebra which
was addressed in a prior commit. This change addresses another situation
where due to a change of tunnel IP, it is possible that a received route
may be selected as the best route if the path selection needs to take
next hop IPs into consideration. This is a pretty convoluted scenario,
but the code should handle it and delete and withdraw the local route
as well as (re)install the received route.
Ticket: CM-24114
Reviewed By: CCR-9487
Testing Done:
1. Manual tests - note, problem is not readily reproducible
2. evpn-smoke - results documented in the ticket
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
If a peer advertised capability addpath in their OPEN, but sent us an
UPDATE without an ADDPATH, we overflow a heap buffer.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
This changeset follows the PR
https://github.com/FRRouting/frr/pull/5334
Above PR adds nexthop tracking support for EVPN RT-5 nexthops.
This route is marked VALID only if the BGP route has a valid nexthop.
If the EVPN peer is an EBGP pee and "disable_connected_check" flag is not set,
"connected" check is performed for the EVPN nexthop.
But, usually EVPN nexthop is not the BGP peering address, but the VTEP address.
Also, NEXTHOP_UNCHANGED flag is enabled by default for EVPN.
As a result, in a common deployment for EVPN, EVPN nexthop is not connected.
Thus, adding a fix to remove the "connected" check for EVPN nexthops.
Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
Instead of CMD_WARNING, use CMD_WARNING_CONFIG_FAILED
for any mis-configuration scenario.
Testing Done:
TOR(config)# router bgp 5548
TOR(config-router)# address-family l2vpn evpn
TOR(config-router-af)# no advertise-pip
This command is supported under L3VNI BGP EVPN VRF
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
when a pip is disabled or mac-vlan is not present
use anycast MAC as RMAC value.
Ticket:CM-26923
Reviewed By:CCR-9417
Testing Done:
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
For self type-2 routes, do not assign system-rmac
as attribute RMAC value if advertise-pip is disable
or macvlan is not present.
Ticket:CM-26923
Reviewed By:CCR-9397
Testing Done:
pip is disabled under bgp vrf2 instance.
Trigger frr-restart.
Before fix:
*> [2]:[0]:[48]:[00:02:00:00:00:2e]:[32]:[45.0.4.4]
36.0.0.11 32768 i
ET:8 RT:5546:1004 RT:5546:4002 Rmac:00:02:00:00:00:2e
After fix:
*> [2]:[0]:[48]:[00:02:00:00:00:2e]:[32]:[45.0.4.4]
36.0.0.11 32768 i
ET:8 RT:5546:1004 RT:5546:4002 Rmac:44:38:39:ff:ff:01
TOR# ifquery vlan1004
auto vlan1004
iface vlan1004
address 45.0.4.4/24
vlan-id 1004
vrf vrf2
VNI: 4002 (known to the kernel)
Type: L3
Tenant VRF: vrf2
RD: 45.0.6.4:3
Originator IP: 36.0.0.11
Advertise-pip: Yes
System-IP: 27.0.0.11
System-MAC: 00:02:00:00:00:2e
Router-MAC: 44:38:39:ff:ff:01
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
By default announct Self Type-2 routes with
system IP as nexthop and system MAC as
nexthop.
An API to check type-2 is self route via
checking ipv4/ipv6 address from connected interfaces list.
An API to extract RMAC and nexthop for type-2
routes based on advertise-svi-ip knob is enabled.
When advertise-pip is enabled/disabled, trigger type-2
route update. For self type-2 routes to use
anycast or individual (rmac, nexthop) addresses.
Ticket:CM-26190
Reviewed By:
Testing Done:
Enable 'advertise-svi-ip' knob in bgp default instance.
the vrf instance svi ip is advertised with nexthop
as default instance router-id and RMAC as system MAC.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
In L3VNI add callback parse, vrr rmac value.
For non-zero vrr mac value, use it as anycast RMAC
and svi mac as individual rmac value.
If advertise-pip is disable or vrr rmac is not present
use svi mac as anycast rmac value for all routes.
Ticket:CM-26190
Reviewed By:
Testing Done:
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Evpn Primary IP advertisement feature uses
individual system IP and system MAC for prefix (type-5)
and self type-2 routes.
The PIP knob is enabled by default for bgp vrf instance.
Configuration CLI for enable/disable PIP feature knob.
User can configure PIP system IP and MAC to retain as
permanent values.
For the PIP IP, the default behavior is to accept bgp default
instance's router-id. When the default instance router-id change,
reflect PIP IP assignment.
Reflect type-5 to use system-IP and system MAC as nexthop and RMAC
values.
Ticket:CM-26190
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Spaces were not being accounted for in the heap buffer sizing, leading
to a heap buffer overflow when encoding large communities to their
string representations.
This patch also uses safer functions to do the encoding instead of
pointer math.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Tons of insane just-so pointer math here where it is not needed. This is
too smart. Use safer methods.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
The half and reuse variables can never be 1 but the
SA systems we have do not know this and think it is possible.
Provide the kick in the snarples that the SA needs to know
this is not true.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Copy paste leads to invalid read of 1 byte off the heap when converting
extended community attributes into strings.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Bug: While preparing the JSON output, 2 loops are traversed: the outer loop
loops through RD, and the inner loop loops through the prefixes of that RD.
We hit the bug (printing blank RD and stale or null prefix info) when the inner
loop exits with nothing to print, (without allocating json_routes) and the outer
loop still tries to attach it to the parent, json_adv. Thus, we have
key=<BLANK RD>, value=<junk or prev json_routes>
The fix: Avoid attaching json_routes to the parent json if there
is nothing to print.
Signed-off-by: Lakshman Krishnamoorthy <lkrishnamoor@vmware.com>
According to RFC 8277 IPv4 LU NLRI can be withdrawn using label 0x000000.
This RFC updates RFC3101 where it should be done only with 0x800000 label value.
Juniper implementation sets value 0x000000 when prefix is being withdrawn.
Page 12 RFC8277 states:
[RFC3107] also made it possible to withdraw a binding without
specifying the label explicitly, by setting the Compatibility field
to 0x800000. However, some implementations set it to 0x000000. In
order to ensure backwards compatibility, it is RECOMMENDED by this
document that the Compatibility field be set to 0x800000, but it is
REQUIRED that it be ignored upon reception.
Now FRR drops BGP session when receives such BGP update.
Signed-off-by: Aleksandr Klimenko <v00lk@bk.ru>
* IPv6 routes received via a ibgp session with one of its own interface as
nexthop are getting installed in the BGP table.
*A common table to be implemented should take cares of both
ipv4 and ipv6 connected addresses.
Signed-off-by: Biswajit Sadhu sadhub@vmware.com
The command `show ip bgp ipv4|ipv6 vpn neighbors <ip> prefix-counts`
caused a segfault, because the 2-level routing was not accounted for.
Signed-off-by: Juergen Werner <juergen@opensourcerouting.org>
The CLI was not parsing prefix format of ipv6 address.
This fixes the bug: https://github.com/FRRouting/frr/issues/5322
Signed-off-by: Lakshman Krishnamoorthy <lkrishnamoor@vmware.com>
The total_peercount table was created as a short cut for queries about
if addpath was enabled at all on a particular afi/safi. However, the
values weren't updated, so BGP would act as if addpath wasn't enabled
when determining if updates should be sent out. The error in behavior
was much more noticeable in tx-all than best-per-as, since changes in
what is sent by best-per-as would often trigger updates even if addpath
wasn't enabled.
Signed-off-by: Mitchell Skiba <mskiba@amazon.com>
Problem statement:
When IPv4/IPv6 prefixes are received in BGP, bgp_update function registers the
nexthop of the route with nexthop tracking module. The BGP route is marked as
valid only if the nexthop is resolved.
Even for EVPN RT-5, route should be marked as valid only if the the nexthop is
resolvable.
Code changes:
1. Add nexthop of EVPN RT-5 for nexthop tracking. Route will be marked as valid
only if the nexthop is resolved.
2. Only the valid EVPN routes are imported to the vrf.
3. When nht update is received in BGP, make sure that the EVPN routes are
imported/unimported based on the route becomes valid/invalid.
Testcases:
1. At rtr-1, advertise EVPN RT-5 with a nexthop 10.100.0.2.
10.100.0.2 is resolved at rtr-2 in default vrf.
At rtr-2, remote EVPN RT-5 should be marked as valid and should be imported into
vrfs.
2. Make the nexthop 10.100.0.2 unreachable at rtr-2
Remote EVPN RT-5 should be marked as invalid and should be unimported from the
vrfs. As this code change deals with EVPN type-5 routes only, other EVPN routes
should be valid.
3. At rtr-2, add a static route to make nexthop 10.100.0.2 reachable.
EVPN RT-5 should again become valid and should be imported into the vrfs.
Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
Before the fix:
2019/11/14 19:52:21 BGP: peer 192.168.2.5 deleted from subgroup s4peer
cnt 0 - missing space after s4 before peer
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
With this code change, we can now filter evpn routes based on RD using the
match statement: "match evpn rd XX"
Signed-off-by: Lakshman Krishnamoorthy <lkrishnamoor@vmware.com>
The bug:
As part of displaying advertised routes to a peer, in the outer loop, we
iterate through all prefixes in the evpn table. In the inner loop,
we iterate through adj_out of each prefix.
If a prefix which is present in the evpn table is not advertised to a peer,
its corresponding attr == NULL. Checking for this condition is the fix.
Signed-off-by: Lakshman Krishnamoorthy <lkrishnamoor@vmware.com>
There is a memory leak of the bgp node (route node)
in vni table while processing evpn remote route(s).
During the remote evpn route processing, a new route
is created in per vni route table, the refcount for
the route node is incremented twice. First refcount
is incremented during the node creation and the second
one when the bgp_info_add is added.
Post evpn route creation, the bgp node refcount needs
to be decremented.
Ticket:CM-26898,CM-26838,CM-27169
Reviewed By:CCR-9474
Testing Done:
In EVPN topology send 1K MAC routes then check the memory footprint
at the remote VTEP before sending 1K type-2 routes
and after flushing/withdrawal of the routes.
Before fix:
-----------
Initial memory footprint:
root@TOR1:~# vtysh -c "show memory" | grep "Hash Bucket \|BGP node \|BGP route"
Hash Bucket : 2008 32
BGP node : 182 152
BGP route : 96 112
With 1K MAC (type-2 routes)
root@TOR1:~# vtysh -c "show memory" | grep "Hash Bucket \|BGP node \|BGP route"
Hash Bucket : 6008 32
BGP node : 4182 152
BGP route : 2096 112
After cleaning up 1K MAC entries from source VTEP which triggers BGP withdraw
at the remote VTEP.
root@TOR1:~# vtysh -c "show memory" | grep "Hash Bucket \|BGP node \|BGP route"
Hash Bucket : 4008 32
BGP node : 2182 152 <-- Here 2K delta from initial count.
BGP route : 96 112
With fix:
---------
After 1K MAC entries cleaned up at the remote VTEP, the memory footprint
(BGP Node and Hash Bucket count) is equilibrium to start of the test.
root@TOR1:~# vtysh -c "show memory" | grep "Hash Bucket \|BGP node \|BGP route"
Hash Bucket : 2008 32
BGP node : 182 152
BGP route : 96 112
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
PR 5118 introduce additional (prepend) keywords
like 'ip' to Route Distinguisher output which
breaks existing evpn route show commands parsing.
Restore to original behavior.
Testing Done:
vtysh -c 'show bgp l2vpn evpn route'
Before fix:
Route Distinguisher: ip 27.0.0.15:44
Post fix:
Route Distinguisher: 27.0.0.15:44
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Recently had a case where I was attempting to debug a nexthop tracking
issue across multiple bgp vrf's and since the setup vrf's in it with
overlapping address ranges, it became real fun real fast to track
vrf data associated. Add a bit of code to allow us to figure out
what vrf we are in when we print out debug messages.
Look through the rest of the code and find debugs where we are
not using bgp->name_pretty and switch it over.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Prevent IPv6 Link-local address being forward to IBGP peer,
which are not directly connected.
R1----IPV6-unnumbered-EBGP-------R2-----IPV6-IBGP-----R3
Configure route-map to set preferred global address on and apply
route-map-IN on R2 for R1-R2 session. Now check on R3's BGP and
RIB table has route nexthop as R1 link-local address, which is
not correct.
As of now we clear link-local address info from mp_nexthop_global,
only if mp_nexthop_global is populated with link-local address.
We should do it even if route-map is configured boz forwarding
link-local address from one link scope to another is violation of
the standards.
Signed-off-by: Biswajit Sadhu sadhub@vmware.com
This commit make bgpd to skip and ignore unsupported
sub-type of PREFIX_SID. (especially new defined sub-type)
Current bgpd can't parase unsupported sub-type of PREFIX_SID.
PREFIX_SID is drafted on draft-ietf-idr-bgp-prefix-sid-27.
There are already new sub-type drafted on
draft-dawra-idr-srv6-vpn-05. (Type5,6 is new defined.)
This commit fix the problem reported as #5277 on GitBub.
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
Running with --enable-address-sanitizer I am seeing this:
=================================================================
==19520==ERROR: AddressSanitizer: heap-use-after-free on address 0x6020003ef850 at pc 0x7fe9b8f7b57b bp 0x7fffbac6f9c0 sp 0x7fffbac6f170
READ of size 6 at 0x6020003ef850 thread T0
#0 0x7fe9b8f7b57a (/lib/x86_64-linux-gnu/libasan.so.5+0xb857a)
#1 0x55e33d1071e5 in bgp_process_mac_rescan_table bgpd/bgp_mac.c:159
#2 0x55e33d107c09 in bgp_mac_rescan_evpn_table bgpd/bgp_mac.c:252
#3 0x55e33d107e39 in bgp_mac_rescan_all_evpn_tables bgpd/bgp_mac.c:266
#4 0x55e33d108270 in bgp_mac_remove_ifp_internal bgpd/bgp_mac.c:291
#5 0x55e33d108893 in bgp_mac_del_mac_entry bgpd/bgp_mac.c:351
#6 0x55e33d21412d in bgp_ifp_down bgpd/bgp_zebra.c:257
#7 0x7fe9b8cbf3be in if_down_via_zapi lib/if.c:198
#8 0x7fe9b8db303a in zclient_interface_down lib/zclient.c:1549
#9 0x7fe9b8db8a06 in zclient_read lib/zclient.c:2693
#10 0x7fe9b8d7b95a in thread_call lib/thread.c:1599
#11 0x7fe9b8cd824e in frr_run lib/libfrr.c:1024
#12 0x55e33d09d463 in main bgpd/bgp_main.c:477
#13 0x7fe9b879409a in __libc_start_main ../csu/libc-start.c:308
#14 0x55e33d09c189 in _start (/usr/lib/frr/bgpd+0x168189)
0x6020003ef850 is located 0 bytes inside of 16-byte region [0x6020003ef850,0x6020003ef860)
freed by thread T0 here:
#0 0x7fe9b8fabfb0 in __interceptor_free (/lib/x86_64-linux-gnu/libasan.so.5+0xe8fb0)
#1 0x7fe9b8ce4ea9 in qfree lib/memory.c:129
#2 0x55e33d10825c in bgp_mac_remove_ifp_internal bgpd/bgp_mac.c:289
#3 0x55e33d108893 in bgp_mac_del_mac_entry bgpd/bgp_mac.c:351
#4 0x55e33d21412d in bgp_ifp_down bgpd/bgp_zebra.c:257
#5 0x7fe9b8cbf3be in if_down_via_zapi lib/if.c:198
#6 0x7fe9b8db303a in zclient_interface_down lib/zclient.c:1549
#7 0x7fe9b8db8a06 in zclient_read lib/zclient.c:2693
#8 0x7fe9b8d7b95a in thread_call lib/thread.c:1599
#9 0x7fe9b8cd824e in frr_run lib/libfrr.c:1024
#10 0x55e33d09d463 in main bgpd/bgp_main.c:477
#11 0x7fe9b879409a in __libc_start_main ../csu/libc-start.c:308
previously allocated by thread T0 here:
#0 0x7fe9b8fac518 in calloc (/lib/x86_64-linux-gnu/libasan.so.5+0xe9518)
#1 0x7fe9b8ce4d93 in qcalloc lib/memory.c:110
#2 0x55e33d106b29 in bgp_mac_hash_alloc bgpd/bgp_mac.c:96
#3 0x7fe9b8cb8350 in hash_get lib/hash.c:149
#4 0x55e33d10845b in bgp_mac_add_mac_entry bgpd/bgp_mac.c:303
#5 0x55e33d226757 in bgp_ifp_create bgpd/bgp_zebra.c:2644
#6 0x7fe9b8cbf1e6 in if_new_via_zapi lib/if.c:176
#7 0x7fe9b8db2d3b in zclient_interface_add lib/zclient.c:1481
#8 0x7fe9b8db87f8 in zclient_read lib/zclient.c:2659
#9 0x7fe9b8d7b95a in thread_call lib/thread.c:1599
#10 0x7fe9b8cd824e in frr_run lib/libfrr.c:1024
#11 0x55e33d09d463 in main bgpd/bgp_main.c:477
#12 0x7fe9b879409a in __libc_start_main ../csu/libc-start.c:308
Effectively we are passing to bgp_mac_remove_ifp_internal the macaddr
that is associated with the bsm data structure. There exists a path
where the bsm is freed and then we immediately pass the macaddr into
bgp_mac_rescan_all_evpn_tables. So just make a copy of the macaddr
data structure before we free the bsm
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
debian-9# show ip route 192.168.255.2/32 longer-prefixes
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued route, r - rejected route
B>* 192.168.255.2/32 [20/0] via 192.168.0.1, eth1, 00:15:22
debian-9# conf
debian-9(config)# router bgp 100
debian-9(config-router)# address-family ipv4
debian-9(config-router-af)# distance bgp 123 123 123
debian-9(config-router-af)# do show ip route 192.168.255.2/32 longer-prefixes
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued route, r - rejected route
B>* 192.168.255.2/32 [123/0] via 192.168.0.1, eth1, 00:00:09
debian-9(config-router-af)# no distance bgp
debian-9(config-router-af)# do show ip route 192.168.255.2/32 longer-prefixes
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued route, r - rejected route
B>* 192.168.255.2/32 [20/0] via 192.168.0.1, eth1, 00:00:02
debian-9(config-router-af)#
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
The sender side AS path loop detection code was implemented since the
import of Quagga code, however it was always disabled by a `ifdef`
guard.
Lets allow the user to decide whether or not to enable this feature on
run-time.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Scenarios where this code change is required:
1. BFD is un-configured from BGP at remote end.
Neighbour BFD sends ADMIN_DOWN state, but BFD on local side will send
DOWN to BGP, resulting in BGP session DOWN.
Removing BFD session administratively shouldn't bring DOWN BGP session
at local or remote.
2. BFD is un-configured from BGP or shutdown locally.
BFD will send state DOWN to BGP resulting in BGP session DOWN.
(This is akin to saying do not use BFD for BGP)
Removing BFD session administratively shouldn't bring DOWN BGP session at
local or remote.
Signed-off-by: Sayed Mohd Saquib sayed.saquib@broadcom.com
Since we don't set a value from the return of bgp_path_info_mpath_next
it is impossible for this function to do anything as such the if statement
is dead code as well.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Problem reported with error messages appearing in the log
complaining about invalid afi/safi combinations. Determined
that the error messages were recently added in the function
that turns afi and safi values to strings. Unfortunately,
the function is called from places using FOREACH_AFI_SAFI,
which spins thru every afi and safi number including some
that are not legal together (ipv4 evpn and l2vpn multicast
for example.) This fix removes these error messages since
it is not necessarily an error to call it with invalid
combinations.
Ticket: CM-26883
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Coverity has found a path where the attr.aspath may be NULL.
assert that the aspath is non-null so we can make this go away.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
We make the assumption that ->attr is not NULL throughout
the code base. We are totally inconsistent about application
of this though.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add -s X or --socket_size X to the bgp cli to allow
the end user to specify the outgoing bgp tcp kernel
socket buffer size.
It is recommended that this option is only used on
large scale operations.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
In bgp_create_evpn_bgp_path_info we create a bgp_path_info
that should be returned since we need it later.
Found by Coverity Scan.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When using soft reconfiguration inbound we are storing packet
data on the side for replaying when necessary. The problem here
is that we are just grabbing the first bgp_path_info and using
that as the base. What happens when we have soft-reconfig turned
on with multiple bgp_path_info's for a path? This was introduced
in commit 8692c50652, yes back
in 2012! I would argue, though, that it was just broken
in a different way before this.
Choose the correct bgp_path_info that corresponds to the peer
we received the data from for rethinking.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When a type 2/3 or 5 route is received, verified and the
resulting route generated is pushed into the appropriate vrf
the vni's associated with the route are also passed in.
This is showing up as a Remote label when you dump
the route in bgp:
BGP routing table entry for 0.0.0.0/0^M
Paths: (1 available, best #1, table third)
Advertised to non peer-group peers:
10.10.120.22
42001 42005 42006 42055
10.10.120.22 from 10.10.120.22 (10.10.255.193)
Origin IGP, valid, external, bestpath-from-AS 42001, best
Remote label: 62750
AddPath ID: RX 0, TX 2
Last update: Fri Oct 11 12:59:56 2019
The `Remote label: 62750` is the mpls label version of the
vni passed in. This is meaningless and confusing to the end
user. Do not display this information.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When creating a bgp_path_info for a type 4 route the pi->extra->parent
and the route node for the originating table were not being locked
properly. This will prevent BGP from not properly cleaning up
the data structures on cleanup.
Possibly every one of the functions that we use to create the
new bgp_path_info's should use an abstracted version of this code,
but I am unsure at this point in time if a type 4 should use the same
or not.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
bgp_process_packets has an assert to make sure an appropriate amount of
working space in the input buffer has been freed up for future reads.
However, this assert shouldn't be made when we have encountered an error
that's going to tear down the session, because in this case we may not
be able to process the full contents of the input buffer.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
bgp_process_packets has an assert to make sure an appropriate amount of
working space in the input buffer has been freed up for future reads.
However, this assert shouldn't be made when we have encountered an error
that's going to tear down the session, because in this case we may not
be able to process the full contents of the input buffer.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
There are cases where the table identifier is set on a bgp entry, mainly
due to route-map, and associate fib entry needs to be removed.
This change encompasses also the route-map reconfiguration that leads to
removing the previous entry, whereas bgp update had been triggered (
this happens when software inbound reconfiguration is handled).
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
this table identifier can be used for policy routing. incoming entries
are locally exported to that local table identifier.
note that so that the user applies the new table identifier to all
entries, the user should flush local tables first.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
BGP code assumes that the extra data is zero'ed out. Ensure that we
are not leaving any situation that the data on the stack is actually all
0's when we pass it around as a pointer later.
Please note in issue #5025, Lou reported a different valgrind
issue, which is not the same issue:
==7313== Conditional jump or move depends on uninitialised value(s)
==7313== at 0x181F9F: subgroup_announce_check (bgp_route.c:1555)
==7313== by 0x1A112B: subgroup_announce_table (bgp_updgrp_adv.c:641)
==7313== by 0x1A1340: subgroup_announce_route (bgp_updgrp_adv.c:704)
==7313== by 0x1A13E3: subgroup_coalesce_timer (bgp_updgrp_adv.c:331)
==7313== by 0x4EBA615: thread_call (thread.c:1531)
==7313== by 0x4E8AC37: frr_run (libfrr.c:1052)
==7313== by 0x1429E0: main (bgp_main.c:486)
==7313==
==7313== Conditional jump or move depends on uninitialised value(s)
==7313== at 0x201C0E: rfapi_vty_out_vncinfo (rfapi_vty.c:429)
==7313== by 0x18D0D6: route_vty_out (bgp_route.c:7481)
==7313== by 0x18DD76: bgp_show_table (bgp_route.c:9365)
==7313== by 0x1930C4: bgp_show_table_rd (bgp_route.c:9471)
==7313== by 0x1932A3: bgp_show (bgp_route.c:9510)
==7313== by 0x193E68: show_ip_bgp_json (bgp_route.c:10284)
==7313== by 0x4E6D024: cmd_execute_command_real.isra.2 (command.c:1072)
==7313== by 0x4E6F51E: cmd_execute_command (command.c:1131)
==7313== by 0x4E6F686: cmd_execute (command.c:1285)
==7313== by 0x4EBF9C4: vty_command (vty.c:516)
==7313== by 0x4EBFB9F: vty_execute (vty.c:1285)
==7313== by 0x4EC250F: vtysh_read (vty.c:2119)
==7313==
that is causing the actual crash.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Fixed memory leak and incorrect json output. Check the full output in the PR:
https://github.com/FRRouting/frr/pull/5118
Signed-off-by: Lakshman Krishnamoorthy <lkrishnamoor@vmware.com>
The bgp pointer may not be actually found. The debug
message that was using it could get the same value
another way. Convert over
Fixes Coverity Scan Issue:
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
We only have a uint32_t value here but clippy is wise and
gives us more data than we need. Tell the compiler we can
throw some stuff away.
This was found by inspecting CI results.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>