Re-work the bgp vni table to use separately keyed tables for type2
routes.
So, with type2 routes, we have the main table keyed off of the IP and a
new MAC table keyed off of MACs.
By separating out the two, we are able to run path selection separately
for the neigh and mac. Keeping the two separate is also more in-line
with what happens in zebra (they are managed comptletely seperate).
With this change type2 routes go into each table like so:
```
Remote MAC-IP -> IP Table & MAC Table
Remote MAC -> MAC Table
Local MAC-IP -> IP Table
Local MAC -> MAC Table
```
The difference for local is necessary because we should not ever allow
multiple paths for a local MAC.
Also cleaned up the commands for querying the vni tables:
```
show bgp vni all type ...
show bgp vni VNI type ...
```
Old commands will be deprecated in a separate commit.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
There are lib debugs being set but never show up in
`show debug` commands because there was no way to show
that they were being used. Add a bit of infrastructure
to allow this and then use it for `debug route-map`
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
It already "looks" like a bitmask, but we currently can't flag a command
both YANG and HIDDEN at the same time. It really should be a bitmask.
Also clarify DEPRECATED behaviour (or the absence thereof.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The typesafe hash data structure enforces items to be unique, but their
hash values may still collide. To this extent, when two items have the
same hash value, the compare function is called to see if it returns 0
(aka "equal").
While the _find() function handles this correctly, the _add() function
mistakenly only checked the first item with a colliding hash value for
equality, and if it was inequal proceeded to add the new item. There
may however be additional items with the same hash value collision, one
of which could still compare as equal. In that case, _add() would
mistakenly add the new element, failing to notice the already added
item. Breakage ensues.
Fix by looking for an equal element among *all* existing items with the
same hash value, not just the first.
Signed-off-by: Francois Dumontet <francois.dumontet@6wind.com>
[DL: rewrote commit message, fixed whitespace/formatting]
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
JSON object was generated, but not printed, because the function returned
immediatelly, even without freeing the memory.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
When bulk deleting prefix lists on shutdown the code
was calling plist_delete, which removed the item
from the master->str list, and then popping the next
item on the list and just dropping it on the floor.
The pop is not needed.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When a route imported from l3vpn is analysed, the nexthop from default
VRF is looked up against a valid MPLS path. Generally, this is done on
backbones with a MPLS signalisation transport layer like LDP. Generally,
the BGP connection is multiple hops away. That scenario is already
working.
There is case where it is possible to run L3VPN over GRE interfaces, and
where there is no LSP path over that GRE interface: GRE is just here to
tunnel MPLS traffic. On that case, the nexthop given in the path does not
have MPLS path, but should be authorized to convey MPLS traffic provided
that the user permits it via a configuration command.
That commit introduces a new command that can be activated in route-map:
> set l3vpn next-hop encapsulation gre
That command authorizes the nexthop tracking engine to accept paths that
o have a GRE interface as output, independently of the presence of an LSP
path or not.
A configuration example is given below. When bgp incoming vpnv4 updates
are received, the nexthop of NLRI is 192.168.0.2. Based on nexthop
tracking service from zebra, BGP knows that the output interface to reach
192.168.0.2 is r1-gre0. Because that interface is not MPLS based, but is
a GRE tunnel, then the update will be using that nexthop to be installed.
interface r1-gre0
ip address 192.168.0.1/24
exit
router bgp 65500
bgp router-id 1.1.1.1
neighbor 192.168.0.2 remote-as 65500
!
address-family ipv4 unicast
no neighbor 192.168.0.2 activate
exit-address-family
!
address-family ipv4 vpn
neighbor 192.168.0.2 activate
neighbor 192.168.0.2 route-map rmap in
exit-address-family
exit
!
router bgp 65500 vrf vrf1
bgp router-id 1.1.1.1
no bgp network import-check
!
address-family ipv4 unicast
network 10.201.0.0/24
redistribute connected
label vpn export 101
rd vpn export 444:1
rt vpn both 52:100
export vpn
import vpn
exit-address-family
exit
!
route-map rmap permit 1
set l3vpn next-hop encapsulation gre
exit
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Add an ability to match via route-maps. An additional route-map command
`match rpki-extcommunity <invalid|notfound|valid>` added.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
A couple of pointers in do_thread_cancel() we only inited at
the start of the function; make sure they're inited during
each iteration of the loop.
Signed-off-by: Mark Stapp <mstapp@nvidia.com>
- double the size of each new chunk request from zebra
- use bitfields to track label allocations in a chunk
- When allocating:
- skip chunks with no free labels
- search biggest chunks first
- start search in chunk where last search ended
- Improve API documentation in comments (bgp_lp_get() and callback)
- Tweak formatting of "show bgp labelpool chunks"
- Add test features (compiled conditionally on BGP_LABELPOOL_ENABLE_TESTS)
Signed-off-by: G. Paul Ziemba <paulz@labn.net>
Running `bgp_srv6l3vpn_to_bgp_vrf` and `bgp_srv6l3vpn_to_bgp_vrf2`
topotests with `--valgrind-memleaks` gives several memory leak errors.
This is due to the way FRR daemons pass local SIDs to zebra: to send a
local SID to zebra, FRR daemons call the `zclient_send_localsid()`
function.
The `zclient_send_localsid()` function performs the following sequence
of operations:
* create a temporary `struct nexthop`;
* call `nexthop_add_srv6_seg6local()` to fill the `struct nexthop` with
the proper local SID information;
* create a `struct zapi_route` and call `zapi_nexthop_from_nexthop()` to
copy the information from the `struct nexthop` to the
`struct zapi_route`;
* send the `struct zapi_route` to zebra through the ZAPI.
The `nexthop_add_srv6_seg6local()` function uses `XCALLOC()` to allocate
memory for the SRv6 nexthop. This memory is never freed.
Creating a temporary `struct nexthop` is unnecessary, as the local SID
information can be pushed directly to the `struct zapi_route`. This
patch simplifies the implementation of `zclient_send_localsid()` by
avoiding using the temporary `struct nexthop`. This eliminates the need
to use `nexthop_add_srv6_seg6local()` to fill the `struct nexthop` and
consequently fixes the memory leak.
Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
Handle matching type2/5 evpn routes via lookup in the optimized
route-maps used by plists.
Convert the evpn_prefix to ipv4/v6 prefix to perform longest
matching on in the tree.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Implement the ability to match type-2 and type-5 routes
via a route-map and a prefix-list.
Add some library code to convert an evpn prefix into
a general ipv4/ipv6 prefix for type-2 and type-5 routes.
evpn prefix is really just another subtype of prefix so all
the info needed can be extracted right there.
Add a special handler to bgp_routemap for evpn type routes
when applying the outbound route-map. This calls the library
code to convert the evpn_prefix to a ipv4/ipv6 prefix and
run it through the plist code. In this we assume type-2 routes
are a /32.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
ls_msg2edge calls ls_edge_del_all which will free the
edge variable. Ensure that FRR properly returns NULL.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The format message checks done by clippy/xrelfo were still guarded
behind `--enable-dev-build`. They've been clean and reliable, so it's
time to enable them unconditionally.
Fixes: #11680
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
In CSPF topo test, valgrind detects uninitialized bytes when exporting TE
Opaque information through ZEBRA. This is due to C pragma compilation directive
__attribute__(aligned(8)) in struct ls_node_id in link_state.h. Valgrind
consideris that struct ls_node_id nid = {} doesn't initialized the padding
bytes introduced by gcc.
This patch simply removes the C pragma compilation directive and also takes
opportunity to remove the transmission of remote node id for vertices and
subnets which is not known. Indeed, remote node id is only pertinent for
edges.
Signed-off-by: Olivier Dugeon <olivier.dugeon@orange.com>
New function setsockopt_tcp_keepalive() is added to enable TCP keepalive
mechanism for specified socket. Also TCP keepalive idle time, interval
and maximum probes are configured.
Signed-off-by: Xiaofeng Liu <xiaofeng.liu@6wind.com>
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Don't auto set the thread->arg pointer. It is private
and should be only accessed through the THREAD_ARG pointer.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
convert:
frr_with_mutex(..)
to:
frr_with_mutex (..)
To make all our code agree with what clang-format is going to produce
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
in agentx_events_update the timeout_thr is canceled
on line 88 just above. This already sets the pointer
to NULL. No need to do this again.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
resolver_resolve should check hostname is null or not.
if ares_gethostbyname() get null hostname string, the hostname string will access a null pointer and crash.
Signed-off-by: kevinshen <kevinshen@inspur.com>
Description:
- When there are multiple policies configured with
route-map then the first matching policy is not
getting applied on default route originated with
default-originate.
- In BGP we first run through the BGP RIB and then
pass it to the route-map to find if its permit or
deny. Due to this behaviour the first route in
BGP RIB that passes the route-map will be applied.
Fix:
- Passing extra parameter to routemap_apply so that
we can get the preference of the matching policy,
keep comparing it with the old preference and finally
consider the policy with less preference.
Co-authored-by: Abhinay Ramesh <rabhinay@vmware.com>
Signed-off-by: Iqra Siddiqui <imujeebsiddi@vmware.com>
Literally 4 minutes after hitting merge on Mark's previous fix for this
I remembered we have an `assume()` macro in compiler.h which seems like
the right tool for this.
(... and if I'm touching it, I might as well add a little text
explaining what's going on here.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
It will be used to allow/deny using IPv4 reserved ranges (Class E) for Zebra
(configuring interface address) or BGP (allow next-hop to be from this range).
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Staticd when run tells privs.c that it does not need any
priviledges. The lib/privs.c code was not downgrading
any and all permissions it may have been given at startup.
Since we don't need any let's actually tell the system that
FRR does not need the capabilities anymore in the case
where a daemon does not ask for any cap's.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When a nexthop is set RTNH_F_LINKDOWN, start noticing
that this flag is set. Allow FRR to know about this
flag but at this point do not do anything with it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
commit: 5609e70fb8
Added a new flag to the `struct nexthop` and
this addition of a flag caused the flags size to
be too small. Increase the size of flags to
allow more flags to be had.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add api is_ipv6_global_unicast to identify whether a given
ipv6 address is global unicast or not.
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
In sockunion.c let's eliminate the silent and unexpected failure
mode to let the end operator figure out something is terribly wrong.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
RFC9234 is a way to establish correct connection roles (Customer/
Provider, Peer or with RS) between bgp speakers. This patch:
- Add a new configuration/terminal option to set the appropriate local
role;
- Add a mechanism for checking used roles, implemented by exchanging
the corresponding capabilities in OPEN messages;
- Add strict mode to force other party to use this feature;
- Add basic support for a new transitive optional bgp attribute - OTC
(Only to Customer);
- Add logic for default setting OTC attribute and filtering routes with
this attribute by the edge speakers, if the appropriate conditions are
met;
- Add two test stands to check role negotiation and route filtering
during role usage.
Signed-off-by: Eugene Bogomazov <eb@qrator.net>
- The parent of the daemonizing fork reports memleaks for the early
northbound allocations (libyang). If these were real memleaks these
would show up in the child as well; however, ignoring all memleaks in
the parent of the fork is too hard a sale. Instead, spend some CPU
cycles cleaning up the allocations in the parent after the fork and
immeidatley prior to exiting the parent after the daemonizing fork.
Signed-off-by: Christian Hopps <chopps@labn.net>
Abstract the usage of '%pNHs' so that when nexthop groups get
a new special printfrr that it can take advantage of this
functionality too.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Multipath route may have mixed nexthops of EVPN and IP unicast. Move
EVPN flag to nexthop to support such cases.
Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
* sr_event_notif_send -> sr_notif_send
* sr_process_events -> sr_subscription_process_events
* sr_oper_get_items_subscribe -> sr_oper_get_subscribe
* Removed SR_SUBSCR_CTX_REUSE flag from the code at all
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
explicit_bzero() is available as an API to clean up sensitive data
and avoid compiler optimizations that remove calls to memset() or bzero().
Signed-off-by: Loganaden Velvindron <logan@cyberstorm.mu>
Using strtol() to compare two strings is a bad idea.
Before the patch, if_cmp_name_func() may confuse foo001 and foo1.
PR=79407
Fixes: 106d2fd572 ("2003-08-01 Cougar <cougar@random.ee>")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Tested-by: Aurélien Degeorges <aurelien.degeorges@6wind.com>
Acked-by: Philippe Guibert <philippe.guibert@6wind.com>
Firstly, *keep no change* for `hash_get()` with NULL
`alloc_func`.
Only focus on cases with non-NULL `alloc_func` of
`hash_get()`.
Since `hash_get()` with non-NULL `alloc_func` parameter
shall not fail, just ignore the returned value of it.
The returned value must not be NULL.
So in this case, remove the unnecessary checking NULL
or not for the returned value and add `void` in front
of it.
Importantly, also *keep no change* for the two cases with
non-NULL `alloc_func` -
1) Use `assert(<returned_data> == <searching_data>)` to
ensure it is a created node, not a found node.
Refer to `isis_vertex_queue_insert()` of isisd, there
are many examples of this case in isid.
2) Use `<returned_data> != <searching_data>` to judge it
is a found node, then free <searching_data>.
Refer to `aspath_intern()` of bgpd, there are many
examples of this case in bgpd.
Here, <returned_data> is the returned value from `hash_get()`,
and <searching_data> is the data, which is to be put into
hash table.
Signed-off-by: anlan_cs <vic.lan@pica8.com>