- for a given IP nexthop, dump all NH entries, including
colored entries, or entries with an ifindex.
- when a given IP nexthop is requested, the path is displayed.
For better readibility, remove the carriage return between
'Last update' and 'Paths', because ctime() function already
performs carriage return.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The issuing of `show bgp nexthop A.B.C.D` fails even if that
nexthop exists:
eva# show bgp nexthop 192.168.119.120
specified nexthop does not have entry
Fixed:
eva# show bgp nexthop 192.168.119.120
192.168.119.120 valid [IGP metric 0], #paths 0, peer 192.168.119.120
if enp39s0
Last update: Fri Sep 30 14:55:13 2022
Paths:
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Let's convert to our actual library call instead
of using yet another abstraction that makes it fun
for people to switch daemons.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
FRR should create a bnc per peer. Not have
one's that write over others. Currently when
FRR has multiple Interface based peering, BGP wa
creating a single BNC. This is insufficient in that
we were accidently overwriting the one LL with other
data. This causes issues when there are multiple and
there is weird starting issues with those interfaces
that you are peering over.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Firstly, *keep no change* for `hash_get()` with NULL
`alloc_func`.
Only focus on cases with non-NULL `alloc_func` of
`hash_get()`.
Since `hash_get()` with non-NULL `alloc_func` parameter
shall not fail, just ignore the returned value of it.
The returned value must not be NULL.
So in this case, remove the unnecessary checking NULL
or not for the returned value and add `void` in front
of it.
Importantly, also *keep no change* for the two cases with
non-NULL `alloc_func` -
1) Use `assert(<returned_data> == <searching_data>)` to
ensure it is a created node, not a found node.
Refer to `isis_vertex_queue_insert()` of isisd, there
are many examples of this case in isid.
2) Use `<returned_data> != <searching_data>` to judge it
is a found node, then free <searching_data>.
Refer to `aspath_intern()` of bgpd, there are many
examples of this case in bgpd.
Here, <returned_data> is the returned value from `hash_get()`,
and <searching_data> is the data, which is to be put into
hash table.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
When EVPN prefix route with a gateway IP overlay index is imported into the IP
vrf at the ingress PE, BGP nexthop of this route is set to the gateway IP.
For this vrf route to be valid, following conditions must be met.
- Gateway IP nexthop of this route should be L3 reachable, i.e., this route
should be resolved in RIB.
- A remote MAC/IP route should be present for the gateway IP address in the
EVI(L2VPN table).
To check for the first condition, gateway IP is registered with nht (nexthop
tracking) to receive the reachability notifications for this IP from zebra RIB.
If the gateway IP is reachable, zebra sends the reachability information (i.e.,
nexthop interface) for the gateway IP.
This nexthop interface should be the SVI interface.
Now, to find out type-2 route corresponding to the gateway IP, we need to fetch
the VNI for the above SVI.
To do this VNI lookup effitiently, define a hashtable of struct bgpevpn with
svi_ifindex as key.
struct hash *vni_svi_hash;
An EVI instance is added to vni_svi_hash if its svi_ifindex is nonzero.
Using this hash, we obtain struct bgpevpn corresponding to the gateway IP.
For gateway IP overlay index recursive lookup, once we find the correct EVI, we
have to lookup its route table for a MAC/IP prefix. As we have to iterate the
entire route table for every lookup, this lookup is expensive. We can optimize
this lookup by adding all the remote IP addresses in a hash table.
Following hash table is defined for this purpose in struct bgpevpn
Struct hash *remote_ip_hash;
When a MAC/IP route is installed in the EVI table, it is also added to
remote_ip_hash.
It is possible to have multiple MAC/IP routes with the same IP address because
of host move scenarios. Thus, for every address addr in remote_ip_hash, we
maintain list of all the MAC/IP routes having addr as their IP address.
Following structure defines an address in remote_ip_hash.
struct evpn_remote_ip {
struct ipaddr addr;
struct list *macip_path_list;
};
A Boolean field is added to struct bgp_nexthop_cache to indicate that the
nexthop is EVPN gateway IP overlay index.
bool is_evpn_gwip_nexthop;
A flag BGP_NEXTHOP_EVPN_INCOMPLETE is added to struct bgp_nexthop_cache.
This flag is set when the gateway IP is L3 reachable but not yet resolved by a
MAC/IP route.
Following table explains the combination of L3 and L2 reachability w.r.t.
BGP_NEXTHOP_VALID and BGP_NEXTHOP_EVPN_INCOMPLETE flags
* | MACIP resolved | MACIP unresolved
*----------------|----------------|------------------
* L3 reachable | VALID = 1 | VALID = 0
* | INCOMPLETE = 0 | INCOMPLETE = 1
* ---------------|----------------|--------------------
* L3 unreachable | VALID = 0 | VALID = 0
* | INCOMPLETE = 0 | INCOMPLETE = 0
Procedure that we use to check if the gateway IP is resolvable by a MAC/IP
route:
- Find the EVI/L2VRF that belongs to the nexthop SVI using vni_svi_hash.
- Check if the gateway IP is present in remote_ip_hash in this EVI.
When the gateway IP is L3 reachable and it is also resolved by a MAC/IP route,
unset BGP_NEXTHOP_EVPN_INCOMPLETE flag and set BGP_NEXTHOP_VALID flag.
Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
We are inconsistently using peer_establiahed(peer) with
sometimes using `peer->status == Established`. Just Convert
over to using the function for consistency.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
bgp is currently registering v6 LL as nexthops to be tracked
from zebra. This presents several problems.
a) zebra does not properly track multiple prefixes that match
the same route properly at this point in time.
b) BGP was receiving nexthops that were just incorrect because
of (a).
c) When a nexthop changed that really didn't affect the v6 LL
we were responding incorrectly because of this
Modify the code such that bgp nexthop tracking notices that
we are trying to register a v6 LL. When we do so, shortcut
and watch interface up/down events for this v6 LL and do
the work when an interface goes up / down for this type
of tracking.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Some of the `show memory` strings in bgp are longer than the
columns we have allocated for it. Shorten some strings to
make them fit and have the output pleasing to the eye.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The tip hash is only used when we are dealing with
evpn. In bgp_nexthop_self we are doing a memset
irrelevant of whether we will ever find data. Yes
hash_lookup will return pretty quickly.
Modify the code to avoid doing a memset in the case
where the tip hash is empty as that we know we'll
never find anything. With full BGP feeds this
small memset does take some time.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
since the addition of srte_color to the comparison for bgp nexthops
it is possible to have several nexthops per prefix but since zebra
only sores a per prefix registration we should not unregister for
nh notifications for a prefix unti all the nexthops for that prefix
have been deleted. Otherwise we can get into a deadlock situation
where BGP thinks we have registered but we have unregistered from zebra.
Signed-off-by: Pat Ruddy <pat@voltanet.io>
Extend the NHT code so that only the affected BGP routes are affected
whenever an SR-policy is updated on zebra.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Fist, routing tables aren't the most appropriate data structure
to store nexthops and imported routes since we don't need to do
longest prefix matches with that information.
Second, by converting the NHT code to use rb-trees, we can index
the nexthops using additional information, not only the destination
address. This will be useful later to index bgpd's nexthops by
both destination and SR-TE color.
Co-authored-by: Sebastien Merle <sebastien@netdef.org>
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
This is the bulk part extracted from "bgpd: Convert from `struct
bgp_node` to `struct bgp_dest`". It should not result in any functional
change.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Display next hop resolution information, whether the "detail" option is
specified or not as it is quite fundamental and only minimally increases
the output.
Introduce option to look at a specific NHT entry, which will also show
the paths associated with that entry.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Add new function `bgp_node_get_prefix()` and modify
the bgp code base to use it.
This is prep work for the struct bgp_dest rework.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Some were converted to bool, where true/false status is needed.
Converted to void only those, where the return status was only false or true.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
A BGP update-group is dynamically created to group together a set of peers
such that any BGP updates can be formed just once for the entire group and
only the next hop attribute may need to be modified when the update is sent
out to each peer in the group. The update formation code attempts to
determine as much as possible if the next hop will be set to our own IP
address for every peer in the group. This helps to avoid additional checks
at the point of sending the update (which happens on a per-peer basis) and
also because some other attributes may/could vary depending on whether the
next hop is set to our own IP or not. Resetting the next hop to our own IP
address is the most common behavior for EBGP peerings in the absence of
other user-configured or internal (e.g., for l2vpn/evpn) settings and
peerings on a shared subnet.
The code had a flaw in the multiaccess check to see if there are peers in
the update group which are on a shared subnet as the next hop of the path
being announced - the source peer could itself be in the same update group
and cause the check to give an incorrect result. Modify the check to skip
the source peer so that the check is more accurate.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
For some reason we are getting a compile error around a variable I didn't
touch in the other commits. Make it happy.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
There is no need to have a temp variable to then store that
data in another temporary variable.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The new_afi and afi were being used over and over. Switch
to the end result we want and just use that from the get go.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The creation of a prefix pointer is unnecessary. Save the
prefix as part of the actual data structure. This will
reduce the data needed by 8 bytes per nexthop stored.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
* IPv6 routes received via a ibgp session with one of its own interface as
nexthop are getting installed in the BGP table.
*A common table to be implemented should take cares of both
ipv4 and ipv6 connected addresses.
Signed-off-by: Biswajit Sadhu sadhub@vmware.com
When a BGP next hop tracking (NHT) entry is created for a peer,
display it in the corresponding "show" command output.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
It doesn't make much sense for a hash function to modify its argument,
so const the hash input.
BGP does it in a couple places, those cast away the const. Not great but
not any worse than it was.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Prevent the ebgp sender from changing the nexthop( which is same as the ebgp neighbour ipv6 address),
while sending updates to its ipv6 neighbor.So,if the nexthop of the ipv6 route is same as the ipv6
neighbour address do not change the next hop to your own ip.
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
Prevent IPv6 routes received via a ibgp session with one of its own interface
ip as nexthop from getting installed in the BGP table.
Implemented IPV6 HASH table, where we need to add any ipv6 address as they
gets configured and delete them from the HASH table as the ipv6 addresses
get unconfigured. The above hash table is used to verify if any route learned
via BGP has nexthop which is equal to one of its its connected ipv6 interface.
Signed-off-by: Biswajit Sadhu sadhub@vmware.com
The bgp_connected_set_node_info and bgp_connected_get_node_info
function names were slightly backwards lets fix them up
to bgp_node_set_bgp_connected_ref_info and bgp_node_get_bgp_connected_ref_info
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The bgp_nexthop_set_node_info and bgp_nexthop_get_node_info
function names were slightly backwards, rename to bgp_node_set and get
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
There is no reason that bgp should be including zebra
headers into it's code base, it is a violation of
their respective name spaces.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
If we attempt to register nexthops before we have the zebra
connection, they will not be installed. After we have noticed
that we are up, re-install them.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Problems were reported with the name of the default vrf and the
default bgp instance being different, creating confusion. This
fix changes both to "default" for consistency.
Ticket: CM-21791
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Reviewed-by: CCR-7658
Testing: manual testing and automated tests before pushing
The ->hash_cmp and linked list ->cmp functions were sometimes
being used interchangeably and this really is not a good
thing. So let's modify the hash_cmp function pointer to return
a boolean and convert everything to use the new syntax.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Recent changes to the nht code in bgp caused us to actually
keep a true count of v6 nexthop paths when using v4 over v6.
This change introduced a race condition on shutdown on who
got to the bnc cache first( the v4 table or not ). Effectively
we were allowing the continued existence of the path->nexthop
pointing to the freed bnc. This was especially true when
we had route leaking. So when we free the bnc make sure
we clean up the path->nexthop variables pointing at it too.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>