The current change deals with installing EVPN routes into zebra.
In evpn_route_select_install() we invoke evpn_zebra_install/uninstall,
which calls zclient_send_message().
This is a continuation of earlier code changes (similar to
ccfe452763) but handles the EVPN part
of the code.
Ticket: #3390099
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
Currently BGP attempts to send route change information
to its peers *before* the route is installed into zebra.
This creates a bug in suppress-fib-pending in the following
scenario:
a) bgp suppress-fib-pending is configured and bgp has a route with
2-way ECMP.
b) bgp receives a route withdraw from peer 1. BGP
will send the route to zebra and mark the route as
FIB_INSTALL_PENDING.
c) bgp receives a route withdraw from peer 2. BGP
will see the route has FIB_INSTALL_PENDING set and
will not send the withdrawal of the route to the peer.
bgp will then send the route deletion to zebra and
clean up the bgp_path_info's.
At this point BGP is stuck where it has not sent
a route withdrawal to downstream peers.
Let's modify the code in bgp_process_main_one to
send the route notification to zebra first before
attempting to announce the route. The route withdrawal
will remove the FIB_INSTALL_PENDING flag from the dest
and this will allow group_announce_route to believe
it can send the route withdrawal.
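The ordering problem and the fix can be sketched in a few lines of C. This is a toy model and not the bgpd code: struct toy_dest, toy_zebra_withdraw(), toy_announce_withdraw() and the two toy_process_*() functions are hypothetical names, and a single boolean stands in for FIB_INSTALL_PENDING.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a bgp_dest; not the real structure. */
struct toy_dest {
	bool fib_install_pending; /* models FIB_INSTALL_PENDING */
	bool withdraw_sent;       /* did downstream peers hear the withdrawal? */
};

/* Models sending the route deletion to zebra: the pending flag is cleared. */
static void toy_zebra_withdraw(struct toy_dest *dest)
{
	assert(dest != NULL);
	dest->fib_install_pending = false;
}

/* Models announcing to peers: updates are held back while the flag is set. */
static void toy_announce_withdraw(struct toy_dest *dest)
{
	assert(dest != NULL);
	if (dest->fib_install_pending)
		return;
	dest->withdraw_sent = true;
}

/* Old order: announce first, then talk to zebra. */
static bool toy_process_old(struct toy_dest *dest)
{
	toy_announce_withdraw(dest);
	toy_zebra_withdraw(dest);
	return dest->withdraw_sent;
}

/* New order: talk to zebra first, then announce. */
static bool toy_process_new(struct toy_dest *dest)
{
	toy_zebra_withdraw(dest);
	toy_announce_withdraw(dest);
	return dest->withdraw_sent;
}
```

With the flag set on entry, toy_process_old() leaves withdraw_sent false (the stuck state described above), while toy_process_new() sends the withdrawal.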
For the master branch this is ok because the recent
backpressure commits are in place and nothing is going
to change from an ordering perspective in that regard.
Ostensibly this fix is also for operators of SONiC and
will be backported to the 8.5 branch as well. This will
change the order so that the send to peers happens after the
zebra installation, but SONiC users are using suppress-fib-pending
anyway, so updates won't go out until the rib ack has been
received.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
BGP is now keeping a list of dests with the dest having a pointer
to the bgp_path_info that it will be working on.
1) When bgp receives a prefix, process it, add the bgp_dest of the
prefix to the new FIFO list if not present, update the flags (e.g.
the prefix was advertised earlier and is now withdrawn),
increment the ref_count and DO NOT send the install/withdraw
to zebra yet.
2) Schedule an event to invoke the new function, which will
walk the list one by one and install/withdraw the routes in zebra.
a) if BUFFER_EMPTY, process the next item on the list
b) if BUFFER_PENDING, bail out and the callback in
zclient_flush_data() will invoke the same function when BUFFER_EMPTY
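A minimal sketch of the two steps above, assuming a singly linked FIFO. All names here (toy_dest, toy_fifo, toy_enqueue, toy_drain, toy_send, toy_room, TOY_BUFFER_*) are hypothetical and only model the behaviour described; the real code uses bgp_dest and the zclient buffer states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical write result; models BUFFER_EMPTY / BUFFER_PENDING. */
enum toy_buf_state { TOY_BUFFER_EMPTY, TOY_BUFFER_PENDING };

struct toy_dest {
	struct toy_dest *next; /* FIFO linkage */
	bool queued;           /* already on the list? */
	bool install;          /* latest intent: install (true) or withdraw */
	int refcount;
};

struct toy_fifo {
	struct toy_dest *head;
	struct toy_dest *tail;
};

/* Step 1: record the latest intent and queue the dest only once. */
static void toy_enqueue(struct toy_fifo *q, struct toy_dest *d, bool install)
{
	assert(q != NULL && d != NULL);
	d->install = install; /* a later withdraw overrides an earlier install */
	if (d->queued)
		return;
	d->queued = true;
	d->refcount++;
	d->next = NULL;
	if (q->tail)
		q->tail->next = d;
	else
		q->head = d;
	q->tail = d;
}

/*
 * Step 2: walk the list and hand each dest to send(). Stop as soon as the
 * writer reports TOY_BUFFER_PENDING; the caller is expected to call this
 * again once the buffer has drained. Returns the number of dests sent.
 */
static int toy_drain(struct toy_fifo *q,
		     enum toy_buf_state (*send)(struct toy_dest *d))
{
	int sent = 0;

	assert(q != NULL && send != NULL);
	while (q->head) {
		struct toy_dest *d = q->head;

		q->head = d->next;
		if (!q->head)
			q->tail = NULL;
		d->next = NULL;
		d->queued = false;
		d->refcount--;
		sent++;
		if (send(d) == TOY_BUFFER_PENDING)
			break;
	}
	return sent;
}

/* Example writer: reports PENDING once toy_room messages were accepted. */
static int toy_room = 2;

static enum toy_buf_state toy_send(struct toy_dest *d)
{
	(void)d;
	toy_room--;
	return toy_room > 0 ? TOY_BUFFER_EMPTY : TOY_BUFFER_PENDING;
}
```

Queuing the same dest twice keeps one list entry and one reference, so only the most recent install/withdraw intent reaches the writer.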
Changes
- rename old bgp_zebra_announce to bgp_zebra_announce_actual
- rename old bgp_zebra_withdraw to bgp_zebra_withdraw_actual
- Handle new fifo list cleanup in bgp_exit()
- New funcs: bgp_handle_route_announcements_to_zebra() and
bgp_zebra_route_install()
- Define a callback function to invoke
bgp_handle_route_announcements_to_zebra() when BUFFER_EMPTY in
zclient_flush_data()
The current change deals with bgp installing routes via
bgp_process_main_one()
Ticket: #3390099
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
Since installing/withdrawing routes into zebra is going to be changed
around to be dest based in a list,
- Retrieve the afi/safi to use based upon the dest's afi/safi
instead of passing it in.
- Prefix is known by the dest. Remove this arg as well
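As an illustration of the signature change, the sketch below derives afi, safi and the prefix from the dest alone. The types and fields (toy_table, toy_prefix, toy_dest, toy_request) and the enumerator values are assumptions made for this example, not the bgpd definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; enumerators and fields are illustrative only. */
enum toy_afi { TOY_AFI_IP = 1, TOY_AFI_IP6 = 2 };
enum toy_safi { TOY_SAFI_UNICAST = 1, TOY_SAFI_MULTICAST = 2 };

struct toy_table {
	enum toy_afi afi;
	enum toy_safi safi;
};

struct toy_prefix {
	unsigned char family;
	unsigned char prefixlen;
};

struct toy_dest {
	const struct toy_table *table; /* the table the dest lives in */
	struct toy_prefix p;           /* the prefix the dest represents */
};

struct toy_request {
	enum toy_afi afi;
	enum toy_safi safi;
	const struct toy_prefix *p;
};

/* Everything is derived from the dest; no afi/safi/prefix arguments. */
static struct toy_request toy_build_request(const struct toy_dest *dest)
{
	struct toy_request req;

	assert(dest != NULL && dest->table != NULL);
	req.afi = dest->table->afi;
	req.safi = dest->table->safi;
	req.p = &dest->p;
	return req;
}
```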
Ticket: #3390099
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
When running `show ip bgp` command, the 'route short status' and
'network' columns do not have white-space between them.
Old show:
Network Next Hop Metric LocPrf Weight Path
*>i1.1.1.1/32 10.1.12.111 0 100 0 i
New show:
Network Next Hop Metric LocPrf Weight Path
*>i 1.1.1.1/32 10.1.12.111 0 100 0 i
Added white-space to enhance readability between them.
Signed-off-by: Cassiano Campes <cassiano.campes@venkonetworks.com>
currently:
when as-path-loop-detection is set on a peer-group,
members of the peer-group do not use that functionality.
analysis:
as-path-loop-detection does not use the peer's flag
framework.
fix:
use the peer's flag framework for as-path-loop-detection.
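One way a flag framework can make group settings visible to members is sketched below. This is an assumption-laden toy: struct toy_peer, its fields, toy_peer_flag_check() and TOY_FLAG_AS_LOOP_DETECTION are hypothetical, and the real bgpd framework may propagate flags differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_FLAG_AS_LOOP_DETECTION (1U << 0) /* hypothetical flag bit */

struct toy_peer {
	uint32_t flags;               /* flags set on this peer or group */
	uint32_t flags_override;      /* bits the member configured itself */
	const struct toy_peer *group; /* NULL for a group or standalone peer */
};

/*
 * A member without its own override for the flag follows the group;
 * otherwise the peer's own setting is used.
 */
static bool toy_peer_flag_check(const struct toy_peer *peer, uint32_t flag)
{
	assert(peer != NULL);
	if (peer->group && !(peer->flags_override & flag))
		return (peer->group->flags & flag) != 0;
	return (peer->flags & flag) != 0;
}
```

A plain boolean stored outside such a framework has no equivalent of the group lookup, which matches the symptom described above.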
Signed-off-by: Francois Dumontet <francois.dumontet@6wind.com>
Customer has this valgrind trace:
Direct leak of 2829120 byte(s) in 70728 object(s) allocated from:
0 in community_new ../bgpd/bgp_community.c:39
1 in community_uniq_sort ../bgpd/bgp_community.c:170
2 in route_set_community ../bgpd/bgp_routemap.c:2342
3 in route_map_apply_ext ../lib/routemap.c:2673
4 in subgroup_announce_check ../bgpd/bgp_route.c:2367
5 in subgroup_process_announce_selected ../bgpd/bgp_route.c:2914
6 in group_announce_route_walkcb ../bgpd/bgp_updgrp_adv.c:199
7 in hash_walk ../lib/hash.c:285
8 in update_group_af_walk ../bgpd/bgp_updgrp.c:2061
9 in group_announce_route ../bgpd/bgp_updgrp_adv.c:1059
10 in bgp_process_main_one ../bgpd/bgp_route.c:3221
11 in bgp_process_wq ../bgpd/bgp_route.c:3221
12 in work_queue_run ../lib/workqueue.c:282
The above leak detected by valgrind was from a screenshot so I copied it
by hand. Any mistakes in line numbers are purely from my transcription.
Additionally this is against a slightly modified 8.5.1 version of FRR.
Code inspection of 8.5.1 -vs- latest master shows the same problem
exists. The code path can be followed from that version to this one.
What is happening:
There is a route-map being applied that modifies the outgoing community
to a peer. This is saved in the attr copy created in
subgroup_process_announce_selected. This community pointer is not
interned, so community->refcount is still 0. Normally when
a prefix is announced, the attr and the prefix are placed on an
adjacency out structure where the attribute is interned. This
causes the community to be saved in the community hash list as well.
In the non-normal case, where the decision to send is aborted after
the route-map application, the attribute is just dropped and the
pointer to the community is dropped too, so the
memory is leaked. Using bgp suppress-fib is
one case that causes the community to be leaked.
Additionally, in the previous commit an unsuppress-map is used
to modify the outgoing attribute, but since unsuppress-map was
not considered part of outgoing policy the attribute would be dropped as
well. This pointer drop also extends to any dynamically allocated
memory referenced by the attribute that was not yet interned.
So let's modify the return case where the decision is made to
not send the prefix to the peer to always just flush the attribute
to ensure memory is not leaked.
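The idea can be sketched as follows. This is a toy model under stated assumptions: toy_community, toy_attr, toy_attr_flush() and toy_announce_check() are hypothetical, and toy_live_communities is only an allocation counter for the demonstration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct toy_community {
	unsigned long refcnt; /* 0 until the community is interned */
};

struct toy_attr {
	struct toy_community *community;
};

static int toy_live_communities; /* allocation counter for the demo */

static struct toy_community *toy_community_new(void)
{
	struct toy_community *c = calloc(1, sizeof(*c));

	assert(c != NULL);
	toy_live_communities++;
	return c;
}

/* Free what the attr holds that was never interned (refcnt still 0). */
static void toy_attr_flush(struct toy_attr *attr)
{
	assert(attr != NULL);
	if (attr->community && attr->community->refcnt == 0) {
		free(attr->community);
		attr->community = NULL;
		toy_live_communities--;
	}
}

/* Models the route-map "set community" step plus the send decision. */
static bool toy_announce_check(struct toy_attr *attr, bool send_allowed)
{
	assert(attr != NULL);
	attr->community = toy_community_new(); /* not interned yet */
	if (!send_allowed) {
		toy_attr_flush(attr); /* the fix: never just drop the attr */
		return false;
	}
	return true; /* caller is expected to intern the attr */
}
```

Without the toy_attr_flush() call on the early return, the counter would stay at 1 after a refused announcement, which is the leak shown in the trace.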
Fixes: #15459
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
A route and its nexthop might belong to different VRFs. Therefore, we need
both the bgp and bgp_nexthop pointers.
Fixes: 8d51fafdcb ("bgpd: Drop bgp_static_update_safi() function")
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
This for loop has no chance of removing entries, so there is no
need for the more complicated code that handles the case where
an entry can be removed.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Move mp_nexthop_prefer_global boolean attribute to nh_flags. It does
not currently save memory because of the packing.
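A minimal sketch of folding a boolean into a flags byte. The struct and the bit names (toy_attr, TOY_NH_IFP_LL_VALID, TOY_NH_MP_PREFER_GLOBAL) are hypothetical and chosen only to show that setting or clearing one bit leaves the others untouched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NH_IFP_LL_VALID     (1U << 0) /* hypothetical existing flag */
#define TOY_NH_MP_PREFER_GLOBAL (1U << 1) /* hypothetical bit for the former boolean */

struct toy_attr {
	uint8_t nh_flags;
};

static void toy_set_prefer_global(struct toy_attr *attr, bool on)
{
	assert(attr != NULL);
	if (on)
		attr->nh_flags |= TOY_NH_MP_PREFER_GLOBAL;
	else
		attr->nh_flags &= (uint8_t)~TOY_NH_MP_PREFER_GLOBAL;
}

static bool toy_prefer_global(const struct toy_attr *attr)
{
	assert(attr != NULL);
	return (attr->nh_flags & TOY_NH_MP_PREFER_GLOBAL) != 0;
}
```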
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Locally leaked routes remain active after the nexthop VRF interface goes
down.
Update route leaking when the loopback or a VRF interface state change is
received from zebra.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
When a BGP flowspec peering stops, the BGP RIB entries for IPv6
flowspec entries are removed, but not the ZEBRA RIB IPv6 entries.
When bgp_zebra_withdraw() is called, only
AFI_IP is passed to bgp_pbr_update_entry(), the function
in charge of the Flowspec add/delete in zebra. Fix this by passing
the AFI parameter to the bgp_zebra_withdraw() function.
Note that topotests do not show the problem, as the
flowspec driver code is not present and was refused. Without that,
routes are not installed and cannot be uninstalled.
Fixes: 529efa2346 ("bgpd: allow flowspec entries to be announced to zebra")
Link: https://github.com/FRRouting/frr/pull/2025
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
By default, iBGP and eBGP-OAD peers exchange the RPKI extended community.
Add a command to disable sending RPKI extended community if needed.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
The structure size of bgp_path_info_extra when compiled
with vnc is 184 bytes. Reduce this size to 72 bytes
when compiled with vnc but vnc is not necessarily turned
on.
With 2 full bgp feeds this saves approximately 100 MB
when compiling with vnc and not using vnc.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Without this change, we never reinstall the route
when the route-map has changed.
We checked only some attributes like aspath, communities, large-communities,
extended-communities, and ignored the rest of the attributes.
With this change, let's check if the route-map has changed.
bgp_route_map_process_update() is triggered on route-map change, and we set
`changed` to true, which treats aggregated route as not the same as it was before.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Comparing pointers is not the appropriate way to know
if the label values are the same or not. Performing a
memcmp call instead is better.
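To illustrate the difference, the sketch below compares two label arrays by content. toy_label_t and toy_labels_same() are hypothetical names; the point is only that two distinct arrays can hold equal labels, which a pointer comparison would report as different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t toy_label_t; /* stand-in for an MPLS label value */

/* Compare label stacks by content rather than by address. */
static bool toy_labels_same(const toy_label_t *a, uint32_t num_a,
			    const toy_label_t *b, uint32_t num_b)
{
	if (num_a != num_b)
		return false;
	if (num_a == 0)
		return true;
	assert(a != NULL && b != NULL);
	return memcmp(a, b, num_a * sizeof(toy_label_t)) == 0;
}
```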
Fixes: 8ba7105057 ("bgpd: fix valgrind flagged errors")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
https://datatracker.ietf.org/doc/html/draft-uttaro-idr-bgp-oad#section-3.13
Extended communities which are non-transitive across an AS boundary MAY be
advertised over an EBGP-OAD session if allowed by explicit policy configuration.
If allowed, all the members of the OAD SHOULD be configured to use the same
criteria.
For example, the Origin Validation State Extended Community, defined as
non-transitive in [RFC8097], can be advertised to peers in the same OAD.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
If at least one of the candidate routes was received via EBGP, remove from
consideration all routes that were received via EBGP-OAD and IBGP.
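The rule can be sketched as a filter over a candidate set. This is a set-based illustration of the sentence above, with hypothetical names (toy_peer_sort, toy_path, toy_prefer_ebgp); it is not how the best-path comparison is structured in bgpd.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_peer_sort { TOY_IBGP, TOY_EBGP, TOY_EBGP_OAD };

struct toy_path {
	enum toy_peer_sort sort; /* how the route was received */
	bool candidate;          /* still in consideration? */
};

/*
 * If at least one candidate was received via EBGP, drop every candidate
 * received via EBGP-OAD or IBGP. Returns the number of candidates left.
 */
static size_t toy_prefer_ebgp(struct toy_path *paths, size_t n)
{
	bool have_ebgp = false;
	size_t remaining = 0;
	size_t i;

	assert(paths != NULL || n == 0);

	for (i = 0; i < n; i++)
		if (paths[i].candidate && paths[i].sort == TOY_EBGP)
			have_ebgp = true;

	for (i = 0; i < n; i++) {
		if (have_ebgp && paths[i].sort != TOY_EBGP)
			paths[i].candidate = false;
		if (paths[i].candidate)
			remaining++;
	}
	return remaining;
}
```

When no EBGP candidate exists, the set is left unchanged and EBGP-OAD and IBGP routes continue to later tie-breakers.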
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
When a bgp update is received for an EVPN prefix
where an existing path's nexthop becomes unreachable,
the path is marked as not VALID, but the routes
were not unimported from tenant vrfs, which led to
stale unicast route(s) and nexthop(s).
In a multipath scenario only a specific path may have been marked as
not VALID, so only that path info for the EVPN prefix needs to be
unimported from the tenant vrf.
Ticket: #3671288
Signed-off-by: Chirag Shah <chirag@nvidia.com>
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The MTYPE_BGP memory type was being overused as
both the handler for the bgp instance itself as
well as memory associated with name strings.
Let's separate out the two.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
moved loc-rib uptime field "bgp_rib_uptime" to struct bgp_path_info_extra for memory concerns
moved logic into bgp_route_update's callback bmp_route_update
written timestamp in per peer header
Signed-off-by: Maxence Younsi <mx.yns@outlook.fr>
TODOs that are done/un-necessary now deleted
refactored bmp_route_update to use a modified bmp_process_one function call instead of duplicating similar code
Signed-off-by: Maxence Younsi <mx.yns@outlook.fr>
added time_t field to bgp_path_info
set value before bgp dp hook is called
value not set in the msg yet, testing and double checking is needed before
Signed-off-by: Maxence Younsi <mx.yns@outlook.fr>
set peer type flag to 3 for loc rib monitoring
leave it at 0 in other cases as before, even though RFC7854 tells us to set it to 0, 1 or 2 depending on the case (global/rd/local instance)
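A small sketch of the behaviour described. The names (toy_bmp_peer_type, toy_bmp_put_peer_type) are hypothetical; values 0 to 2 are the Peer Type values from RFC 7854, value 3 is assumed here to be the Loc-RIB Instance Peer type from RFC 9069, and the Peer Type is taken to be the first byte of the per-peer header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Peer Type values of the BMP per-peer header. */
enum toy_bmp_peer_type {
	TOY_BMP_PEER_TYPE_GLOBAL = 0,  /* RFC 7854: Global Instance Peer */
	TOY_BMP_PEER_TYPE_RD = 1,      /* RFC 7854: RD Instance Peer */
	TOY_BMP_PEER_TYPE_LOCAL = 2,   /* RFC 7854: Local Instance Peer */
	TOY_BMP_PEER_TYPE_LOC_RIB = 3, /* RFC 9069: Loc-RIB Instance Peer */
};

/* Write the Peer Type byte: 3 for loc-rib monitoring, 0 otherwise. */
static void toy_bmp_put_peer_type(uint8_t *hdr, bool loc_rib_monitoring)
{
	assert(hdr != NULL);
	hdr[0] = loc_rib_monitoring ? TOY_BMP_PEER_TYPE_LOC_RIB
				    : TOY_BMP_PEER_TYPE_GLOBAL;
}
```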
Signed-off-by: Maxence Younsi <mx.yns@outlook.fr>