Add some `pragma`s to handle errors that the C/C++ extension is not able
to understand.
Move `TRANSPARENT_UNION` to `lib/compiler.h` for consistency.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Since the `echo PING` commands are from watchfrr and are sent
a whole bunch when an operator has `log commands` on the amount
of logging done is quite significant.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
We must disable the vrf before we start terminating interfaces.
On termination, we free the 'zebra_if' struct from the interface ->info
pointer. We rely on that for subsystems like vxlan for cleanup when
shutting down.
'''
==497406== Invalid read of size 8
==497406== at 0x47E70A: zebra_evpn_del (zebra_evpn.c:1103)
==497406== by 0x47F004: zebra_evpn_cleanup_all (zebra_evpn.c:1363)
==497406== by 0x4F2404: zebra_evpn_vxlan_cleanup_all (zebra_vxlan.c:1158)
==497406== by 0x4917041: hash_iterate (hash.c:267)
==497406== by 0x4F25E2: zebra_vxlan_cleanup_tables (zebra_vxlan.c:5676)
==497406== by 0x4D52EC: zebra_vrf_disable (zebra_vrf.c:209)
==497406== by 0x49A247F: vrf_disable (vrf.c:340)
==497406== by 0x49A2521: vrf_delete (vrf.c:245)
==497406== by 0x49A2E2B: vrf_terminate_single (vrf.c:533)
==497406== by 0x49A2D8F: vrf_terminate (vrf.c:561)
==497406== by 0x441240: sigint (main.c:192)
==497406== by 0x4981F6D: frr_sigevent_process (sigevent.c:130)
==497406== Address 0x6d68c68 is 200 bytes inside a block of size 272 free'd
==497406== at 0x48470E4: free (vg_replace_malloc.c:872)
==497406== by 0x4942CF0: qfree (memory.c:141)
==497406== by 0x49196A9: if_delete (if.c:293)
==497406== by 0x491C54C: if_terminate (if.c:1031)
==497406== by 0x49A2E22: vrf_terminate_single (vrf.c:532)
==497406== by 0x49A2D8F: vrf_terminate (vrf.c:561)
==497406== by 0x441240: sigint (main.c:192)
==497406== by 0x4981F6D: frr_sigevent_process (sigevent.c:130)
==497406== by 0x499A5F0: thread_fetch (thread.c:1775)
==497406== by 0x492850E: frr_run (libfrr.c:1197)
==497406== by 0x441746: main (main.c:476)
==497406== Block was alloc'd at
==497406== at 0x4849464: calloc (vg_replace_malloc.c:1328)
==497406== by 0x49429A5: qcalloc (memory.c:116)
==497406== by 0x491D971: if_new (if.c:174)
==497406== by 0x491ACC8: if_create_name (if.c:228)
==497406== by 0x491ABEB: if_get_by_name (if.c:613)
==497406== by 0x427052: netlink_interface (if_netlink.c:1178)
==497406== by 0x43BC18: netlink_parse_info (kernel_netlink.c:1188)
==497406== by 0x4266D7: interface_lookup_netlink (if_netlink.c:1288)
==497406== by 0x42B634: interface_list (if_netlink.c:2368)
==497406== by 0x4ABF83: zebra_ns_enable (zebra_ns.c:127)
==497406== by 0x4AC17E: zebra_ns_init (zebra_ns.c:216)
==497406== by 0x44166C: main (main.c:408)
'''
Signed-off-by: Stephen Worley <sworley@nvidia.com>
This commit adds ZAPI encoders & decoders for traffic control operations, which
include tc_qdisc, tc_class and tc_filter.
Signed-off-by: Siger Yang <siger.yang@outlook.com>
This allows Zebra to manage QDISC, TCLASS, TFILTER in kernel and do cleaning
jobs when it starts up.
Signed-off-by: Siger Yang <siger.yang@outlook.com>
Steps to reproduce:
--------------------------
1. ANVL: Establish full adjacency with DUT for neighbor Rtr-0-A on DIface-0 with DUT as DR.
2. ANVL: Listen (for up to 2 * <RxmtInterval> seconds) on DIface-0.
3. DUT: Send <OSPF-LSU> packet.
4. ANVL: Verify that the received <OSPF-LSU> packet contains a Network- LSA for network N1
originated by DUT, and the LS Sequence Number is set to <InitialSequenceNumber>.
5. ANVL: Establish full adjacency with DUT for neighbor Rtr-0-B on DIface-0 with DUT as DR.
6. ANVL: Listen (for up to 2 * <RxmtInterval> seconds) on DIface-0.
7. DUT: Send <OSPF-LSU> packet.
8. ANVL: Verify that the received <OSPF-LSU> packet contains a new instance of the
Network-LSA for network N1 originated by DUT, and the LS Sequence Number
is set to (<InitialSequenceNumber> + 1).
Both the test cases were failing while verifying the initial sequence number for network LSA.
This is because currently OSPF does not reset its LSA sequence number when it is going down.
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
The endpoint string is a 46 byte length buffer. Use a single
place to store the length of that buffer.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
In this commit, we introduce a new enumeration to encode the SRv6
Endpoint Behaviors codepoints defined in the IANA SRv6 Endpoint
Behaviors Registry
(https://www.iana.org/assignments/segment-routing/segment-routing.xhtml).
Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
Some results:
```
====
PCRE
====
% ./a.out "^65001" "65001"
comparing: ^65001 / 65001
ret status: 0
[14:31] donatas-pc donatas /home/donatas
% ./a.out "^65001_" "65001"
comparing: ^65001_ / 65001
ret status: 0
=====
PCRE2
=====
% ./a.out "^65001" "65001"
comparing: ^65001 / 65001
ret status: 0
[14:30] donatas-pc donatas /home/donatas
% ./a.out "^65001_" "65001"
comparing: ^65001_ / 65001
ret status: 1
```
Seems that if using PCRE2, we need to escape outer `()` chars and `|`. Sounds
like a bug.
But this is only with some older PCRE2 versions. With >= 10.36, I wasn't able
to reproduce this, everything is fine and working as expected.
Adding _FRR_PCRE2_POSIX definition because pcre2posix.h does not have
include's guard.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
At this point add abilty for the encode/decode of the
resilience down ZAPI to zebra. Just hookup sharpd
at this point in time.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This patch just introduces the callback mechanism for the
resilient nexthop changes so that upper level daemons
can take advantage of the change. This does nothing
at this point but just call some code.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When inserting to the front of a list with listnode_add_head
if the list is empty, the tail will not be properly set and
subsuquent calls to insert/remove will cause the function
to crash.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
FRR does not use the NLM_F_APPEND semantics ( in fact I would argue that
the NLM_F_APPEND semantics just introduce pain for all parties involved )
I would also argue that most people who use the kernel netlink api
have recognized that NLM_F_APPEND for a route is a recipe for disaster
that is well documented and as such it is not used as anything other
than a curiousity by operators.
See:
https://bugzilla.redhat.com/show_bug.cgi?id=1337855https://github.com/thom311/libnl/issues/226
Are 2 great examples of how confusing it is for anyone in user
space to know what the correct thing to do is. Given that
new fields can be added with no semantics to allow us to know
what has resulted in a change or not.
In an attempt to recognize this, let's note that FRR
believes it has gotten out of sync with the kernel.
Future commits will react to the desynchronized route
and request from the kernel a reload of that specific
route if possible.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The event system when executing a thread already
sets the pointer of it to NULL. No need to
do it again.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This commit changes `seg6local_context2str()` to use `%pI6`/`%pI4`
instead of `inet_ntop` to print the SRv6 seg6local context information.
Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
A programmer can use the `srv6_locator_chunk_free()` function to free
the memory allocated for a `struct srv6_locator_chunk`.
The programmer invokes `srv6_locator_chunk_free()` by passing a single
pointer to the `struct srv6_locator_chunk` to be freed.
`srv6_locator_chunk_free()` uses `XFREE()` to free the memory.
It is the responsibility of the programmer to set the
`struct srv6_locator_chunk` pointer to NULL after freeing memory with
`srv6_locator_chunk_free()`.
This commit modifies the `srv6_locator_chunk_free()` function to take a
double pointer instead of a single pointer. In this way, setting the
`struct srv6_locator_chunk` pointer to NULL is no longer the
programmer's responsibility but is the responsibility of
`srv6_locator_chunk_free()`. This prevents programmers from making
mistakes such as forgetting to set the pointer to NULL after invoking
`srv6_locator_chunk_free()`.
Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
In this commit, we extend the ZAPI to support encoding and decoding the
locator flags contained in the messages exchanged between zebra and the
routing daemons.
Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
In this commit, we add support for a new flag called
`SRV6_LOCATOR_USID`. When the `SRV6_LOCATOR_USID` flag is set, the
routing protocols will install SRv6 behaviors with the uSID in the
dataplane.
This flag is used to specify a locator as a uSID locator. When a locator
is specified as a uSID locator, all the SRv6 SIDs allocated from the
locator by the routing protocols (like BGP) are bound to the SRv6 uSID
behaviors and use the SRv6 uSID codepoints in the BGP update message.
We extend the SRv6 locator implementation to add support for a `usid`
flag. When the `usid` flag is set, the bgpd will install SRv6 behaviors
with the uSID in the dataplane and use the proper SRv6 Endpoint Behavior
codepoint in the BGP advertisement.
Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
In this commit, we introduce the ability to specify flags for an SRv6
locator. Flags can be used to specify the properties of the locator.
Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
When enabling the interface link-params, a default bandwidth is assigned
to the Max, Reservable and Unreserved Bandwidth variables. If the
bandwidth is set at in the interface context, this value is used.
Otherwise, a default bandwidth value of 10 Gbps is set.
Revert the default value to 10 Mbps as it was intended in the initial
commit. 10 Mbps is a low value so that the link will not be prioritized
when computing the paths.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
The code was working but the coverity scan reported a failure.
Clarify the code to make the coverity scan happy.
Fixes: fe0a129687 ("lib,zebra: link-params are not flushed after no enable")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Rather than running selected source files through the preprocessor and a
bunch of perl regex'ing to get the list of all DEFUNs, use the data
collected in frr.xref.
This not only eliminates issues we've been having with preprocessor
failures due to nonexistent header files, but is also much faster.
Where extract.pl would take 5s, this now finishes in 0.2s. And since
this is a non-parallelizable build step towards the end of the build
(dependent on a lot of other things being done already), the speedup is
actually noticeable.
Also files containing CLI no longer need to be listed in `vtysh_scan`
since the .xref data covers everything. `#ifndef VTYSH_EXTRACT_PL`
checks are equally obsolete.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
In the comparison function for a linked list code was
always checking against passed in NULL's. The comparison
function will never receive a NULL value for data from
the linklist.c code.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This commit adds the SRv6 locator's block length, node length and
argument length to the output of the command
"show segment-routing srv6 locator json"
Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
Daemons like isisd continue to use the previous link-params after they
are removed from zebra.
For example,
>r0# sh run zebra
> (...)
> interface eth-rt1
> link-params
> enable
> metric 100
> exit-link-params
> r0# conf
> r0(config)# interface eth-rt1
> r0(config-if)# link-params
> r0(config-link-params)# no enable
After "no enable", "sh run zebra" displays no more link-params context.
The "no enable" causes the release of the "link_params" pointer within
the "interface" structure. The zebra function to update daemons with
a ZEBRA_INTERFACE_LINK_PARAMS zapi message is called but the function
returns without doing anything because the "link_params" pointer is
NULL. Therefore, the "link_params" pointers are kept in daemons.
When the zebra "link_params" pointer is NULL:
- Send a zapi link param message that contains no link parameters
instead of sending no message.
- At reception in daemons, the absence of link parameters causes the
release of the "link_params" pointer.
Fixes: 16f1b9e ("Update Traffic Engineering Support for OSPFD")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
A given interface has no enabled link-params context. If a link-params
configuration command fails, the link-params is wrongly enabled:
> r4(config-link-params)# no enable
> r4(config-link-params)# delay
> (0-16777215) Average delay in micro-second as decimal (0...16777215)
> r4(config-link-params)# delay 50 min 300 max 500
> Average delay should be comprise between Min (300) and Max (500) delay
> r4(config-link-params)# do sh run zebra
> (...)
> interface eth-rt1
> link-params
> enable
> exit-link-params
link-params are enabled if and only if the interface structure has a
valid link_params pointer. Before checking the command validity,
if_link_params_get() is called to retrieve the link-params pointer.
However, this function initializes the pointer if it is NULL.
Only use if_link_params_get() to retrieve the pointer to avoid
confusion. In command setting functions, initialize the link_params
pointer if needed only after the validation of the command.
Fixes: 16f1b9e ("Update Traffic Engineering Support for OSPFD")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Re-work the bgp vni table to use separately keyed tables for type2
routes.
So, with type2 routes, we have the main table keyed off of the IP and a
new MAC table keyed off of MACs.
By separating out the two, we are able to run path selection separately
for the neigh and mac. Keeping the two separate is also more in-line
with what happens in zebra (they are managed comptletely seperate).
With this change type2 routes go into each table like so:
```
Remote MAC-IP -> IP Table & MAC Table
Remote MAC -> MAC Table
Local MAC-IP -> IP Table
Local MAC -> MAC Table
```
The difference for local is necessary because we should not ever allow
multiple paths for a local MAC.
Also cleaned up the commands for querying the vni tables:
```
show bgp vni all type ...
show bgp vni VNI type ...
```
Old commands will be deprecated in a separate commit.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
There are lib debugs being set but never show up in
`show debug` commands because there was no way to show
that they were being used. Add a bit of infrastructure
to allow this and then use it for `debug route-map`
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
It already "looks" like a bitmask, but we currently can't flag a command
both YANG and HIDDEN at the same time. It really should be a bitmask.
Also clarify DEPRECATED behaviour (or the absence thereof.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The typesafe hash data structure enforces items to be unique, but their
hash values may still collide. To this extent, when two items have the
same hash value, the compare function is called to see if it returns 0
(aka "equal").
While the _find() function handles this correctly, the _add() function
mistakenly only checked the first item with a colliding hash value for
equality, and if it was inequal proceeded to add the new item. There
may however be additional items with the same hash value collision, one
of which could still compare as equal. In that case, _add() would
mistakenly add the new element, failing to notice the already added
item. Breakage ensues.
Fix by looking for an equal element among *all* existing items with the
same hash value, not just the first.
Signed-off-by: Francois Dumontet <francois.dumontet@6wind.com>
[DL: rewrote commit message, fixed whitespace/formatting]
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
JSON object was generated, but not printed, because the function returned
immediatelly, even without freeing the memory.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
When bulk deleting prefix lists on shutdown the code
was calling plist_delete, which removed the item
from the master->str list, and then popping the next
item on the list and just dropping it on the floor.
The pop is not needed.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When a route imported from l3vpn is analysed, the nexthop from default
VRF is looked up against a valid MPLS path. Generally, this is done on
backbones with a MPLS signalisation transport layer like LDP. Generally,
the BGP connection is multiple hops away. That scenario is already
working.
There is case where it is possible to run L3VPN over GRE interfaces, and
where there is no LSP path over that GRE interface: GRE is just here to
tunnel MPLS traffic. On that case, the nexthop given in the path does not
have MPLS path, but should be authorized to convey MPLS traffic provided
that the user permits it via a configuration command.
That commit introduces a new command that can be activated in route-map:
> set l3vpn next-hop encapsulation gre
That command authorizes the nexthop tracking engine to accept paths that
o have a GRE interface as output, independently of the presence of an LSP
path or not.
A configuration example is given below. When bgp incoming vpnv4 updates
are received, the nexthop of NLRI is 192.168.0.2. Based on nexthop
tracking service from zebra, BGP knows that the output interface to reach
192.168.0.2 is r1-gre0. Because that interface is not MPLS based, but is
a GRE tunnel, then the update will be using that nexthop to be installed.
interface r1-gre0
ip address 192.168.0.1/24
exit
router bgp 65500
bgp router-id 1.1.1.1
neighbor 192.168.0.2 remote-as 65500
!
address-family ipv4 unicast
no neighbor 192.168.0.2 activate
exit-address-family
!
address-family ipv4 vpn
neighbor 192.168.0.2 activate
neighbor 192.168.0.2 route-map rmap in
exit-address-family
exit
!
router bgp 65500 vrf vrf1
bgp router-id 1.1.1.1
no bgp network import-check
!
address-family ipv4 unicast
network 10.201.0.0/24
redistribute connected
label vpn export 101
rd vpn export 444:1
rt vpn both 52:100
export vpn
import vpn
exit-address-family
exit
!
route-map rmap permit 1
set l3vpn next-hop encapsulation gre
exit
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Add an ability to match via route-maps. An additional route-map command
`match rpki-extcommunity <invalid|notfound|valid>` added.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
A couple of pointers in do_thread_cancel() we only inited at
the start of the function; make sure they're inited during
each iteration of the loop.
Signed-off-by: Mark Stapp <mstapp@nvidia.com>
- double the size of each new chunk request from zebra
- use bitfields to track label allocations in a chunk
- When allocating:
- skip chunks with no free labels
- search biggest chunks first
- start search in chunk where last search ended
- Improve API documentation in comments (bgp_lp_get() and callback)
- Tweak formatting of "show bgp labelpool chunks"
- Add test features (compiled conditionally on BGP_LABELPOOL_ENABLE_TESTS)
Signed-off-by: G. Paul Ziemba <paulz@labn.net>
Running `bgp_srv6l3vpn_to_bgp_vrf` and `bgp_srv6l3vpn_to_bgp_vrf2`
topotests with `--valgrind-memleaks` gives several memory leak errors.
This is due to the way FRR daemons pass local SIDs to zebra: to send a
local SID to zebra, FRR daemons call the `zclient_send_localsid()`
function.
The `zclient_send_localsid()` function performs the following sequence
of operations:
* create a temporary `struct nexthop`;
* call `nexthop_add_srv6_seg6local()` to fill the `struct nexthop` with
the proper local SID information;
* create a `struct zapi_route` and call `zapi_nexthop_from_nexthop()` to
copy the information from the `struct nexthop` to the
`struct zapi_route`;
* send the `struct zapi_route` to zebra through the ZAPI.
The `nexthop_add_srv6_seg6local()` function uses `XCALLOC()` to allocate
memory for the SRv6 nexthop. This memory is never freed.
Creating a temporary `struct nexthop` is unnecessary, as the local SID
information can be pushed directly to the `struct zapi_route`. This
patch simplifies the implementation of `zclient_send_localsid()` by
avoiding using the temporary `struct nexthop`. This eliminates the need
to use `nexthop_add_srv6_seg6local()` to fill the `struct nexthop` and
consequently fixes the memory leak.
Signed-off-by: Carmine Scarpitta <carmine.scarpitta@uniroma2.it>
Handle matching type2/5 evpn routes via lookup in the optimized
route-maps used by plists.
Convert the evpn_prefix to ipv4/v6 prefix to perform longest
matching on in the tree.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Implement the ability to match type-2 and type-5 routes
via a route-map and a prefix-list.
Add some library code to convert an evpn prefix into
a general ipv4/ipv6 prefix for type-2 and type-5 routes.
evpn prefix is really just another subtype of prefix so all
the info needed can be extracted right there.
Add a special handler to bgp_routemap for evpn type routes
when applying the outbound route-map. This calls the library
code to convert the evpn_prefix to a ipv4/ipv6 prefix and
run it through the plist code. In this we assume type-2 routes
are a /32.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
ls_msg2edge calls ls_edge_del_all which will free the
edge variable. Ensure that FRR properly returns NULL.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The format message checks done by clippy/xrelfo were still guarded
behind `--enable-dev-build`. They've been clean and reliable, so it's
time to enable them unconditionally.
Fixes: #11680
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
In CSPF topo test, valgrind detects uninitialized bytes when exporting TE
Opaque information through ZEBRA. This is due to C pragma compilation directive
__attribute__(aligned(8)) in struct ls_node_id in link_state.h. Valgrind
consideris that struct ls_node_id nid = {} doesn't initialized the padding
bytes introduced by gcc.
This patch simply removes the C pragma compilation directive and also takes
opportunity to remove the transmission of remote node id for vertices and
subnets which is not known. Indeed, remote node id is only pertinent for
edges.
Signed-off-by: Olivier Dugeon <olivier.dugeon@orange.com>
New function setsockopt_tcp_keepalive() is added to enable TCP keepalive
mechanism for specified socket. Also TCP keepalive idle time, interval
and maximum probes are configured.
Signed-off-by: Xiaofeng Liu <xiaofeng.liu@6wind.com>
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Don't auto set the thread->arg pointer. It is private
and should be only accessed through the THREAD_ARG pointer.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
convert:
frr_with_mutex(..)
to:
frr_with_mutex (..)
To make all our code agree with what clang-format is going to produce
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
in agentx_events_update the timeout_thr is canceled
on line 88 just above. This already sets the pointer
to NULL. No need to do this again.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
resolver_resolve should check hostname is null or not.
if ares_gethostbyname() get null hostname string, the hostname string will access a null pointer and crash.
Signed-off-by: kevinshen <kevinshen@inspur.com>
Description:
- When there are multiple policies configured with
route-map then the first matching policy is not
getting applied on default route originated with
default-originate.
- In BGP we first run through the BGP RIB and then
pass it to the route-map to find if its permit or
deny. Due to this behaviour the first route in
BGP RIB that passes the route-map will be applied.
Fix:
- Passing extra parameter to routemap_apply so that
we can get the preference of the matching policy,
keep comparing it with the old preference and finally
consider the policy with less preference.
Co-authored-by: Abhinay Ramesh <rabhinay@vmware.com>
Signed-off-by: Iqra Siddiqui <imujeebsiddi@vmware.com>
Literally 4 minutes after hitting merge on Mark's previous fix for this
I remembered we have an `assume()` macro in compiler.h which seems like
the right tool for this.
(... and if I'm touching it, I might as well add a little text
explaining what's going on here.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
It will be used to allow/deny using IPv4 reserved ranges (Class E) for Zebra
(configuring interface address) or BGP (allow next-hop to be from this range).
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
Staticd when run tells privs.c that it does not need any
priviledges. The lib/privs.c code was not downgrading
any and all permissions it may have been given at startup.
Since we don't need any let's actually tell the system that
FRR does not need the capabilities anymore in the case
where a daemon does not ask for any cap's.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When a nexthop is set RTNH_F_LINKDOWN, start noticing
that this flag is set. Allow FRR to know about this
flag but at this point do not do anything with it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
commit: 5609e70fb8
Added a new flag to the `struct nexthop` and
this addition of a flag caused the flags size to
be too small. Increase the size of flags to
allow more flags to be had.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add api is_ipv6_global_unicast to identify whether a given
ipv6 address is global unicast or not.
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
In sockunion.c let's eliminate the silent and unexpected failure
mode to let the end operator figure out something is terribly wrong.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
RFC9234 is a way to establish correct connection roles (Customer/
Provider, Peer or with RS) between bgp speakers. This patch:
- Add a new configuration/terminal option to set the appropriate local
role;
- Add a mechanism for checking used roles, implemented by exchanging
the corresponding capabilities in OPEN messages;
- Add strict mode to force other party to use this feature;
- Add basic support for a new transitive optional bgp attribute - OTC
(Only to Customer);
- Add logic for default setting OTC attribute and filtering routes with
this attribute by the edge speakers, if the appropriate conditions are
met;
- Add two test stands to check role negotiation and route filtering
during role usage.
Signed-off-by: Eugene Bogomazov <eb@qrator.net>
- The parent of the daemonizing fork reports memleaks for the early
northbound allocations (libyang). If these were real memleaks these
would show up in the child as well; however, ignoring all memleaks in
the parent of the fork is too hard a sale. Instead, spend some CPU
cycles cleaning up the allocations in the parent after the fork and
immeidatley prior to exiting the parent after the daemonizing fork.
Signed-off-by: Christian Hopps <chopps@labn.net>
Abstract the usage of '%pNHs' so that when nexthop groups get
a new special printfrr that it can take advantage of this
functionality too.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Multipath route may have mixed nexthops of EVPN and IP unicast. Move
EVPN flag to nexthop to support such cases.
Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
* sr_event_notif_send -> sr_notif_send
* sr_process_events -> sr_subscription_process_events
* sr_oper_get_items_subscribe -> sr_oper_get_subscribe
* Removed SR_SUBSCR_CTX_REUSE flag from the code at all
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
explicit_bzero() is available as an API to clean up sensitive data
and avoid compiler optimizations that remove calls to memset() or bzero().
Signed-off-by: Loganaden Velvindron <logan@cyberstorm.mu>
Using strtol() to compare two strings is a bad idea.
Before the patch, if_cmp_name_func() may confuse foo001 and foo1.
PR=79407
Fixes: 106d2fd572 ("2003-08-01 Cougar <cougar@random.ee>")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Tested-by: Aurélien Degeorges <aurelien.degeorges@6wind.com>
Acked-by: Philippe Guibert <philippe.guibert@6wind.com>
Firstly, *keep no change* for `hash_get()` with NULL
`alloc_func`.
Only focus on cases with non-NULL `alloc_func` of
`hash_get()`.
Since `hash_get()` with non-NULL `alloc_func` parameter
shall not fail, just ignore the returned value of it.
The returned value must not be NULL.
So in this case, remove the unnecessary checking NULL
or not for the returned value and add `void` in front
of it.
Importantly, also *keep no change* for the two cases with
non-NULL `alloc_func` -
1) Use `assert(<returned_data> == <searching_data>)` to
ensure it is a created node, not a found node.
Refer to `isis_vertex_queue_insert()` of isisd, there
are many examples of this case in isid.
2) Use `<returned_data> != <searching_data>` to judge it
is a found node, then free <searching_data>.
Refer to `aspath_intern()` of bgpd, there are many
examples of this case in bgpd.
Here, <returned_data> is the returned value from `hash_get()`,
and <searching_data> is the data, which is to be put into
hash table.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
Don't rely on the OS interface name length definition and use the FRR
definition instead.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Passing NULL for a `%pTVMs` would result in `(null)Ms`, i.e. the `Ms`
flags not eaten up. Change to eat those up, and print `-` instead for
NULL times.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
There's a common pattern of "get VRF context for CLI node" here, which
first got a helper macro in zebra that then permeated into pimd.
Unfortunately the pimd copy wasn't quite adjusted correctly and thus
caused two coverity warnings (CID 1517453, CID 1517454).
Fix the PIM one, and clean up by providing a common base macro in
`lib/vty.h`.
Also rename the macros (add `_VRF`) to make more clear what they do.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
By changing this API call to use a `struct ipaddr`, which encodes the
type of IP address with it. (And rename/remove the `IPV4` from the
command name.)
Also add a comment explaining that this function call is going to be
obsolete in the long run since pimd needs to move to proper MRIB NHT.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
If duplicate value is entered, the whole plist/alist just dropped.
Before:
```
$ grep prefix-list /etc/frr/frr.conf
ip prefix-list test seq 5 permit 1.1.1.1/32
ip prefix-list test seq 10 permit 1.1.1.1/32
$ systemctl restart frr
$ vtysh -c 'show run | include prefix-list'
$
```
After:
```
$ grep prefix-list /etc/frr/frr.conf
ip prefix-list test seq 5 permit 1.1.1.1/32
ip prefix-list test seq 10 permit 1.1.1.1/32
$ systemctl restart frr
$ vtysh -c 'show run | include prefix-list'
ip prefix-list test seq 5 permit 1.1.1.1/32
```
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
End operator is showing:
!
frr version 8.0.1
frr defaults traditional
hostname test.example.com
domainname
domainname should not be printed in this case at all. I do not
see any mechanism in current code that this could happen, but
what do I know? Put some extra stupid insurance in place
to prevent bad config from being generated.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Recent commit e92508a741 changed
the prefix_master->str to a RB tree. This introduced a condition
whnere on shutdown the prefix list was removed from the master list
and then operated on by passing around a name. Which was then used
to lookup the prefix list again when we operated on the code.
This change to a RB Tree first deleted the item from the RB tree
first thus introducing this crash
Crash:
(gdb) bt
index=0x556c07d59650, pentry=0x556c07d29380) at lib/routemap.c:2397
arg=0x7ffdbf84bc60) at lib/hash.c:267
event=RMAP_EVENT_PLIST_DELETED) at lib/routemap.c:2489
Grab the first item on the list, clean it and then remove it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Just simple helpers to get a scope value, never-forward, and is-SSM for
a given address.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
This has already been a requirement for Solaris, it is still a
requirement for some of the autoconf feature checks to work correctly,
and it will be a requirement for `-fms-extensions`.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The commands:
router isis 1
mpls-te on
no mpls-te on
mpls-te on
no mpls-te on
!
Will crash
Valgrind gives us this:
==652336== Invalid read of size 8
==652336== at 0x49AB25C: typed_rb_min (typerb.c:495)
==652336== by 0x4943B54: vertices_const_first (link_state.h:424)
==652336== by 0x493DCE4: vertices_first (link_state.h:424)
==652336== by 0x493DADC: ls_ted_del_all (link_state.c:1010)
==652336== by 0x47E77B: isis_instance_mpls_te_destroy (isis_nb_config.c:1871)
==652336== by 0x495BE20: nb_callback_destroy (northbound.c:1131)
==652336== by 0x495B5AC: nb_callback_configuration (northbound.c:1356)
==652336== by 0x4958127: nb_transaction_process (northbound.c:1473)
==652336== by 0x4958275: nb_candidate_commit_apply (northbound.c:906)
==652336== by 0x49585B8: nb_candidate_commit (northbound.c:938)
==652336== by 0x495CE4A: nb_cli_classic_commit (northbound_cli.c:64)
==652336== by 0x495D6C5: nb_cli_apply_changes_internal (northbound_cli.c:250)
==652336== Address 0x6f928e0 is 272 bytes inside a block of size 320 free'd
==652336== at 0x48399AB: free (vg_replace_malloc.c:538)
==652336== by 0x494BA30: qfree (memory.c:141)
==652336== by 0x493D99D: ls_ted_del (link_state.c:997)
==652336== by 0x493DC20: ls_ted_del_all (link_state.c:1018)
==652336== by 0x47E77B: isis_instance_mpls_te_destroy (isis_nb_config.c:1871)
==652336== by 0x495BE20: nb_callback_destroy (northbound.c:1131)
==652336== by 0x495B5AC: nb_callback_configuration (northbound.c:1356)
==652336== by 0x4958127: nb_transaction_process (northbound.c:1473)
==652336== by 0x4958275: nb_candidate_commit_apply (northbound.c:906)
==652336== by 0x49585B8: nb_candidate_commit (northbound.c:938)
==652336== by 0x495CE4A: nb_cli_classic_commit (northbound_cli.c:64)
==652336== by 0x495D6C5: nb_cli_apply_changes_internal (northbound_cli.c:250)
==652336== Block was alloc'd at
==652336== at 0x483AB65: calloc (vg_replace_malloc.c:760)
==652336== by 0x494B6F8: qcalloc (memory.c:116)
==652336== by 0x493D7D2: ls_ted_new (link_state.c:967)
==652336== by 0x47E4DD: isis_instance_mpls_te_create (isis_nb_config.c:1832)
==652336== by 0x495BB29: nb_callback_create (northbound.c:1034)
==652336== by 0x495B547: nb_callback_configuration (northbound.c:1348)
==652336== by 0x4958127: nb_transaction_process (northbound.c:1473)
==652336== by 0x4958275: nb_candidate_commit_apply (northbound.c:906)
==652336== by 0x49585B8: nb_candidate_commit (northbound.c:938)
==652336== by 0x495CE4A: nb_cli_classic_commit (northbound_cli.c:64)
==652336== by 0x495D6C5: nb_cli_apply_changes_internal (northbound_cli.c:250)
==652336== by 0x495D23E: nb_cli_apply_changes (northbound_cli.c:268)
Let's null out the pointer. After this change. Valgrind no longer reports issues
and isisd no longer crashes.
Fixes: #10939
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
These 3 values:
ONE_DAY_SECOND
ONE_WEEK_SECOND
ONE_YEAR_SECOND
Are defined based upon the number of seconds. Unfortunately doing math
on these values say something like:
days = t->tv_sec / ONE_DAY_SECOND;
Once you go over about a day causes the order of operations to cause the multiplication
to get messed up:
204 if (!t)
(gdb) n
207 w = d = h = m = ms = 0;
(gdb) set t->tv_sec = ONE_DAY_SECOND + 30
(gdb) n
208 memset(buf, 0, size);
(gdb)
210 us = t->tv_usec;
(gdb)
211 if (us >= 1000) {
(gdb)
212 ms = us / 1000;
(gdb)
213 us %= 1000;
(gdb)
217 if (ms >= 1000) {
(gdb)
222 if (t->tv_sec > ONE_WEEK_SECOND) {
(gdb)
227 if (t->tv_sec > ONE_DAY_SECOND) {
(gdb)
228 d = t->tv_sec / ONE_DAY_SECOND;
(gdb) n
229 t->tv_sec -= d * ONE_DAY_SECOND;
(gdb) n
232 if (t->tv_sec >= HOUR_IN_SECONDS) {
(gdb) p d
$6 = 2073600
(gdb) p t->tv_sec
$7 = -179158953570
(gdb)
Converting to adding paranthesis around around the ONE_DAY_SECOND causes
the order of operations to work as expected.
Fixes: #10880
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When using zlog_backtrace I am seeing this:
==66286== Syscall param write(buf) points to uninitialised byte(s)
==66286== at 0x4CDF48A: syscall (in /lib/libc.so.7)
==66286== by 0x4A0D409: ??? (in /usr/local/lib/libunwind.so.8.0.1)
==66286== by 0x4A0D694: ??? (in /usr/local/lib/libunwind.so.8.0.1)
==66286== by 0x4A0E2F4: _ULx86_64_step (in /usr/local/lib/libunwind.so.8.0.1)
==66286== by 0x49662DB: zlog_backtrace (log.c:250)
==66286== by 0x2AFFA6: if_get_mtu (ioctl.c:163)
==66286== by 0x2B2D9D: ifan_read (kernel_socket.c:457)
==66286== by 0x2B2D9D: kernel_read (kernel_socket.c:1406)
==66286== by 0x499F46E: thread_call (thread.c:2002)
==66286== by 0x495D2B7: frr_run (libfrr.c:1196)
==66286== by 0x2B4098: main (main.c:471)
==66286== Address 0x7fc000000 is on thread 1's stack
==66286== in frame #4, created by zlog_backtrace (log.c:239)
==66286==
Let's initialize some data
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When `terminal monitor` is issued I am seeing this for valgrind on freebsd:
2022/03/24 18:07:45 ZEBRA: [RHJDG-5FNSK][EC 100663304] can't open configuration file [/usr/local/etc/frr/zebra.conf]
==52993== Syscall param sendmsg(sendmsg.msg_control) points to uninitialised byte(s)
==52993== at 0x4CE268A: _sendmsg (in /lib/libc.so.7)
==52993== by 0x4B96245: ??? (in /lib/libthr.so.3)
==52993== by 0x4CDF329: sendmsg (in /lib/libc.so.7)
==52993== by 0x49A9994: vtysh_do_pass_fd (vty.c:2041)
==52993== by 0x49A9994: vtysh_flush (vty.c:2070)
==52993== by 0x499F4CE: thread_call (thread.c:2002)
==52993== by 0x495D317: frr_run (libfrr.c:1196)
==52993== by 0x2B4068: main (main.c:471)
==52993== Address 0x7fc000864 is on thread 1's stack
==52993== in frame #3, created by vtysh_flush (vty.c:2065)
Fix by initializing the memory to `0`
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The EAD-per-ES route carries ECs for all the ES-EVI RTs. As the number of VNIs
increase all RTs do not fit into a standard BGP UPDATE (4K) so the route needs
to be fragmented.
Each fragment is associated with a separate RD and frag-id -
1. Local ES-per-EAD -
ES route table - {ES-frag-ID, ESI, ET=0xffffffff, VTEP-IP}
global route table - {RD-=ES-frag-RD, ESI, ET=0xffffffff}
2. Remote ES-per-EAD -
VNI route table - {ESI, ET=0xffffffff, VTEP-IP}
global route table - {RD-=ES-frag-RD, ESI, ET=0xffffffff}
Note: The fragment ID is abandoned in the per-VNI routing table. At this
point that is acceptable as we dont expect more than one-ES-per-EAD fragment
to be imported into the per-VNI routing table. But that may need to be
re-worked at a later point.
CLI changes (sample with 4 VNIs per-fragment for experimental pruposes) -
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
root@torm-11:mgmt:~# vtysh -c "show bgp l2vpn evpn es 03:44:38:39:ff:ff:01:00:00:01"
ESI: 03:44:38:39:ff:ff:01:00:00:01
Type: LR
RD: 27.0.0.21:3
Originator-IP: 27.0.0.21
Local ES DF preference: 50000
VNI Count: 10
Remote VNI Count: 10
VRF Count: 3
MACIP EVI Path Count: 33
MACIP Global Path Count: 198
Inconsistent VNI VTEP Count: 0
Inconsistencies: -
Fragments: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
27.0.0.21:3 EVIs: 4
27.0.0.21:13 EVIs: 4
27.0.0.21:22 EVIs: 2
VTEPs:
27.0.0.22 flags: EA df_alg: preference df_pref: 32767
27.0.0.23 flags: EA df_alg: preference df_pref: 32767
root@torm-11:mgmt:~# vtysh -c "show bgp l2vpn evpn es-evi vni 1002 detail"
VNI: 1002 ESI: 03:44:38:39:ff:ff:01:00:00:01
Type: LR
ES fragment RD: 27.0.0.21:13 >>>>>>>>>>>>>>>>>>>>>>>>>
Inconsistencies: -
VTEPs: 27.0.0.22(EV),27.0.0.23(EV)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
PS: The number of EVIs per-fragment has been set to 128 and may need further
tuning.
Ticket: #2632967
Signed-off-by: Anuradha Karuppiah <anuradhak@nvidia.com>
The wheel data structure is a array of list pointers
but the alloc for it is using the sizeof (struct listnode *)
as the amount to allocate. Even though the (struct listnode *)
and (struct list *) sizes are the same, let's list the correct
values.
Signed-off-by: ron <lyq140hf2006@163.com>
- split NewRpcState object into 2, a Unary and a Streaming variant, which
then allows for the next.
- move all state machine details inside these new state objects
- use a template arg to allow for Streaming state tracking object
creation and deletion w/o requiring this in each specific RPC
hander.
- Code is more rugged by design now.
Thanks to Rafael Zalamena <rzalamena@opensourcerouting.org> for the cleanup
ideas/motivation.
Signed-off-by: Christian Hopps <chopps@labn.net>
Let's clean up the valgrind output even more by calling the protobuf
shutdown function that deallocates all library used memory.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Currently the nexthop tracking code is only sending to the requestor
what it was requested to match against. When the nexthop tracking
code was simplified to not need an import check and a nexthop check
in b8210849b8 for bgpd. It was not
noticed that a longer prefix could match but it would be seen
as a match because FRR was not sending up both the resolved
route prefix and the route FRR was asked to match against.
This code change causes the nexthop tracking code to pass
back up the matched requested route (so that the calling
protocol can figure out which one it is being told about )
as well as the actual prefix that was matched to.
Fixes: #10766
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
sockopt_cork is a no-op function that was cleaned up
in 2017. Since then it's still not being used. At
this point in time there is little point in keeping a
dead function that will not be used because of vagaries
between platforms
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
RB-tree and double-linked-list easily support backwards iteration, and
an use case seems to have popped up. Let's make it accessible.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The prefix_master->str data structure was a sorted
list of the prefix names. Not that big of a deal
other than insertion and deletion is insanely expensive
when you have a large number of unique prefix-lists.
In my test config file that I discovered this,
I have 587 unique prefix lists spread out acros
~26k lines of prefix-lists. When reading
this config file into FRR the read time goes
from 690 seconds to 650 seconds.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
`json_object_object_add()` adds keys/items to objects/dictionaries.
Useful to have a printfrr based variant for the key there.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The vtysh live logs don't try to buffer messages when vtysh isn't
reading them fast enough. Either the kernel has space and can accept
messages without delay, or it doesn't and we continue on.
While this is intentional (otherwise slow vtysh could block a routing
daemon), at least give the user an indication if messages were dropped.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
This provides direct raw log output with full metadata directly at
startup regardless of configuration details.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
This was the intent here to begin with, not sure where I managed to
forget this along the way...
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The timestamps used for the live log are wallclock, not monotonic. Also
some fields were left uninitialized.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
While running singlethreaded, the RCU code is "dormant" and rcu_free is
an immediate operation. This results in the log target loop accessing
free'd memory if a log target removes itself while a message is printed
(which is likely to happen on e.g. error conditions.)
Just use frr_each_safe to avoid this issue.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
- rather than coerce `const char *` to std:string&, just pass the
C ptr, as that's what is used anyway.
fixes#10578
Signed-off-by: Christian Hopps <chopps@labn.net>
Don't let open sockets hang for too long. This will fix an issue where a
improperly coded client (e.g. socat) could exaust the amount of open
file descriptors.
Documentation:
https://grpc.github.io/grpc/cpp/md_doc_keepalive.html
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
This issue is applicable to other protocols as well.
When user has used route-map, even though the prefixes are falling
under the permit rule, the prefixes were denied and were shown
as inactive route in zebra.
Reason being the parameter which is of type enum was passed to the api
route_map_get_index and was typecasted to uint8_t *.
This problem is visible in case of Big Endian systems because we are
accessing the most significant byte.
'match_ret' field is an enum in the caller and so it is of 4 bytes,
the typecasting it to 1 byte and passing it to the api made
the api to put the value in the most significant byte
which was already zero previously. Therefore the actual value
RMAP_NOMATCH which was 1 never gets reset in this case.
Therefore the api always returns 'RMAP_NOMATCH' and hence
the prefixes are always denied.
Fixes: #9782
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
Call `zlog_file_rotate` for command file lines as well otherwise on
`SIGUSR1` the old descriptor will still be used and no new log file will
be created for the rotation.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
If a operator issues a series of route-map deletions and
then re-adds, *and* this triggers the hash table to realloc
to grow to a larger size, then subsuquent route-map operations
will be against a corrupted hash table.
Why?
Effectively the route-map code was inserting each
route-map <NAME> into a hash for storage. Upon
deletion there is this concept of delayed processing
so the routemap code sets a bit `to-be-processed`
and marks the route-map for deletion. This is
1 entry in the hash table. Then if the operator
recreates the hash, FRR would add another hash
entry. If another deletion happens then there
now are 2 deletion entries that are indistinguishable
from a hash perspective.
FRR stores the deleted name of the route-map so that
any delayed processing can lookup the name and only process
those peers that are related to that route-map name.
This is good as that if in say BGP, we do not want
to reprocess all the peers that don't use the route-map.
Solution:
The whole purpose of the delay of deletion and the
storage of the route-map is to allow the using protocol
the ability to process the route-map at a later time
while still retaining the route-map name( for more efficient
reprocessing ). The problem exists because we are keeping
multiple copies of deletion events that are indistinguishable
from each other causing hash havoc.
The truth is that we only need to keep 1 copy of the
routemap in the table. If the series of events is:
a) delete ( schedule processing )
b) add ( reschedule processing )
Current code ends up processing the route-map two times
and in this event we really just need to reprocess everything
with the new route-map.
If the series of events is:
a) delete (schedule processing )
b) add (reschedule)
c) delete (reschedule)
d) add (reschedule)
All this really points to is that FRR just needs to keep the last
in the series of maps and ensuring that FRR knows that we need
to continue processing the route-map. So in the creation processing
if the hash has an entry for this map, the routemap code knows that
this is a deletion event. Mark this route-map for later processing
if it was marked so. Also in the lookup function do not return
a map if the map found was deleted.
Fixes: #10708
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
State-only and configuration presence-containers need to be treated
differently when iterating over YANG operational data. Currently the
get_elem() callback is used to know when a state-only p-container
exists or not, and configuration p-containers are assumed to always
exist, which is clearly wrong. Fix this by checking the running
configuration to know whether a rw p-container exists or not.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
On FreeBSD I have noticed that subsuquent calls to clock_gettime(..)
can return an after time that is before first calls value.
This in turn is generating CPU_HOG's because the subtraction
is wrapping into very very large numbers:
2022/02/28 20:12:58 SHARP: [PTDQA-70FG5] start: 35.741981000 now: 35.740581000
2022/02/28 20:12:58 SHARP: [XK9YH-ZD8FA][EC 100663313] CPU HOG: task zclient_read (800744240) ran for 0ms (cpu time 18446744073709550ms)
(Please note I added the first line of debug to figure this issue out).
I have been asked to open a FreeBSD bug report and have done so.
In the mean time I think that it is important that FRR does
not generate bogus CPU HOG's on FreeBSD ( especially since
this may or may not be easily fixed and FRR has no control
over what version of the operating system, operators are
going to be running with FRR.
So, add a bit of specialized code that checks to see if
the after time in FreeBSD is before the now time in
thread_consumed_time and do some quick manipulations
to not have this issue.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This adds the plumbing necessary to yield back a file descriptor to
vtysh. The fd is passed on the command status code bytes through
AF_UNIX SCM_RIGHTS.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Add the ability to inspect the timers and when they will pop
per daemon:
sharpd@eva ~/frr (thread_return_null)> vtysh -c "show thread timers"
Thread timers for zebra:
Showing timers for default
--------------------------
rtadv_timer 00:00:00.520
if_zebra_speed_update 00:00:02.745
if_zebra_speed_update 00:00:02.745
if_zebra_speed_update 00:00:02.745
if_zebra_speed_update 00:00:02.745
if_zebra_speed_update 00:00:02.745
if_zebra_speed_update 00:00:02.745
if_zebra_speed_update 00:00:02.746
if_zebra_speed_update 00:00:02.744
if_zebra_speed_update 00:00:02.745
Showing timers for Zebra dplane thread
--------------------------------------
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Since there are timers that are created based upon doing some
math and we know that unsigned values when doing math and we accidently
subtract a larger number from a smaller number causes the unsigned
number to wrap to very large numbers, let's put in a small catch
in place to see if there are any places in the system that
mistakes are made and FRR is accidently creating a problem
for itself.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
assert when if_lookup_address is passed with
a family that is not AF_INET or AF_INET6 as
that we are dead in the water and this is a
dev escape
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add a counter to the number of times a thread is starved from
a timer event and add the output to `show thread cpu`
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Problem Statement:
==================
Currently there is no support for configuring hash algorithm in
keychain.
RCA:
====
Not implemented yet.
Fix:
====
Changes are done to configure hash algorithm as part of keychain.
which will easy the configuration from modules using keychain.
Risk:
=====
Low risk
Tests Executed:
===============
Have tested the configuration and unconfiguration flow for newly
implemented CLI.
!
key chain abcd
key 100
key-string password
cryptographic-algorithm sha1
exit
key 200
key-string password
cryptographic-algorithm sha256
exit
!
Signed-off-by: Abhinay Ramesh <rabhinay@vmware.com>
Problem Statement:
=================
When modules use keychain there is no option for auto completion
of configured keychains.
RCA:
====
Not implemented.
Fix:
====
Changes to support auto completion of configured keychain names.
Risk:
=====
Low risk
Tests Executed:
===============
Have tested auto completion of configured keychain names with newly
implemented auth CLI.
frr(config-if)# ipv6 ospf6 authentication keychain
KEYCHAIN_NAME Keychain name
abcd pqr 12345
Signed-off-by: Abhinay Ramesh <rabhinay@vmware.com>
Multiple deletions from the hash_walk or hash_iteration calls
during a single invocation of the passed in function can and
will cause the program to crash. Warn against doing such a
thing.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add to lib/command.c the ability to remember the
release/version/system information and to allow
`show version` to dump some of it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
As helper function of Segment Routing Flex Algo or RSVP-TE
add Constrained Shortest Path First algorithm able to compute
path with constraints. Supported constraints are as follow:
- Standard IGP metric
- TE IGP metric
- Delay metric
- Bandwidth for given Class of Service for bandwidth reservation (RSVP-TE)
Usage of CSPF algorithms is detailed in the doc/developer/cspf.rst file
Signed-off-by: Olivier Dugeon <olivier.dugeon@orange.com>
When link-param is enabled for a given interface, TE metric is automatically
assigned to the metric of the interface. However, the metric of the interface
could be unassigned and keep the default value equal to 0. Thus, if the TE
metric is not explicitely modified within the `link-param metric` statement,
TE metric remains set to 0 which is not a valid value especially when
computing constrainted path.
This patch changes the assignement of the default value of the TE metric.
It is set to the metric of the interface only if the latter is not equal to 0.
TE topotests for OSPF and IS-IS have been adjusted accordingly.
Signed-off-by: Olivier Dugeon <olivier.dugeon@orange.com>
Replace custom implementation or call to ipaddr_isset with a call to
ipaddr_is_zero.
ipaddr_isset is not fully correct, because it's fine to have some
non-zero bytes at the end of the struct in case of IPv4 and the function
doesn't allow that.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
VRF name should not be printed in the config since 574445ec. The update
was done for NB config output but I missed it for regular vty output.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Add a thread_ignore_late_timer(struct thread *thread) function
that allows thread.c to ignore when timers are late to the party.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
If a thread timer should have popped CPU_CONSUMED_CHECK
seconds in the past, and we are only handling it now. Consider
the thread starved and notice it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
BGP EVPN custom `union gw_addr` is basically the same thing as a common
`struct ipaddr` but it lacks the address family which is needed in some
cases.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
elf_getdata_rawchunk() already endian-converts; doing it again is, uh,
counterproductive.
Fixes: #10051
Reported-by: Lucian Cristian <lucian.cristian@gmail.com>
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
This causes confusing/annoying log messages at startup otherwise:
`YANG model "ietf-inet-types@*" "*@*"not embedded, trying external file`
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
systemd sets up environment variables to allow autodetecting and
switching the log format to journald native. Make use of that for the
stdout logging target.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Not much to say here, user docs are coming up in a separate commit.
RFC5424 and (systemd's) journald allow passing structured key-value
data. This stuffs the metadata we have available into there.
The "does the system syslogd support RFC5424" question is unfortunately
not easily answered, so we can only give an affirmative answer on NetBSD
5.0+ or FreeBSD 12+.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Update ospfd and ospf6d to send opaque route attributes to
zebra. Those attributes are stored in the RIB and can be viewed
using the "show ip[v6] route" commands (other than that, they are
completely ignored by zebra).
Example:
```
debian# show ip route 192.168.1.0/24
Routing entry for 192.168.1.0/24
Known via "ospf", distance 110, metric 20, best
Last update 01:57:08 ago
* 10.0.1.2, via eth-rt2, weight 1
OSPF path type : External-2
OSPF tag : 0
debian#
debian# show ip route 192.168.1.0/24 json
{
"192.168.1.0\/24":[
{
"prefix":"192.168.1.0\/24",
"prefixLen":24,
"protocol":"ospf",
"vrfId":0,
"vrfName":"default",
"selected":true,
[snip]
"ospfPathType":"External-2",
"ospfTag":"0"
}
]
}
```
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Adding an `s` after these printfrr specifiers replaces 0.0.0.0 / :: in
the output with a star (`*`). This is primarily intended for use with
multicast, e.g. to print `(*,G)`.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Used for graceful-restart mostly.
Especially for bgp_show_neighbor_graceful_restart_capability_per_afi_safi()
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Since this is only used in very few places, moving it out of the way is
reasonable. (`%pSG` will be pim_sgaddr)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Currently `bfd_get_peer_info` should return invalid sp->family
and dp->family during fail cases.
Before this fix, in those fail cases `bfd_get_peer_info` maybe
return valid sp->family and dp->family.
This fix ensures all fail cases return invalid sp->family and
dp->family for outside callers.
Signed-off-by: anlan_cs <anlan_cs@tom.com>
Currently, it is possible to rename the default VRF either by passing
`-o` option to zebra or by creating a file in `/var/run/netns` and
binding it to `/proc/self/ns/net`.
In both cases, only zebra knows about the rename and other daemons learn
about it only after they connect to zebra. This is a problem, because
daemons may read their config before they connect to zebra. To handle
this rename after the config is read, we have some special code in every
single daemon, which is not very bad but not desirable in my opinion.
But things are getting worse when we need to handle this in northbound
layer as we have to manually rewrite the config nodes. This approach is
already hacky, but still works as every daemon handles its own NB
structures. But it is completely incompatible with the central
management daemon architecture we are aiming for, as mgmtd doesn't even
have a connection with zebra to learn from it. And it shouldn't have it,
because operational state changes should never affect configuration.
To solve the problem and simplify the code, I propose to expand the `-o`
option to all daemons. By using the startup option, we let daemons know
about the rename before they read their configs so we don't need any
special code to deal with it. There's an easy way to pass the option to
all daemons by using `frr_global_options` variable.
Unfortunately, the second way of renaming by creating a file in
`/var/run/netns` is incompatible with the new mgmtd architecture.
Theoretically, we could force daemons to read their configs only after
they connect to zebra, but it means adding even more code to handle a
very specific use-case. And anyway this won't work for mgmtd as it
doesn't have a connection with zebra. So I had to remove this option.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
If we're exiting before we finished initializing, we can end up trying
to shut down a NULL vrf here.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
New `FRR_NO_SPLIT_CONFIG` flag for newly added daemons where we're just
rolling without split config and always expect configs to be loaded via
vtysh/integrated config.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
In order to add Link State Traffic Engineering to IS-IS, Link State library
should have been updated:
- Correct Node and Edge RB Tree comparison functions to support key > 32 bits
- Change Subnet RB Tree comparison function to take into account host part of
the prefix i.e. 10.0.0.1/24 and 10.0.0.2/24 are considered as different
- Add new function to convert IS-IS ISO system ID into Vertex or Edge key that
take into account Endianness architecture
- Correct Vertex and Edge creation and search function accordingly
- Add extra Adjacency entries in Link State Attributes for IPv6 Segment Routing
- Update send/received and show TED functions accordingly
Signed-off-by: Olivier Dugeon <olivier.dugeon@orange.com>
Duplicate a couple of definitions in order to remove the bgpd
includes from this libfrr header. This is necessary to fix some
name collisions like PREFIX_LIST_IN being defined differently on
multiple daemons (as soon as other daemons start including
route_opaque.h).
Including daemon headers on libfrr headers is a bad practice and
should be avoided whenever possible.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
This is needed for the following two reasons:
1. To be able to remove the northbound HACK in if_update_to_new_vrf. It
is totally wrong to rewrite the configuration datastore when some
operational state changes. It is a hard blocker for storing a
configuration data in a management daemon which knows nothing about
the operational state.
2. To allow changing the VRF of the interface using FRR CLI or any other
frontend in the future. If the VRF is a part of the key, it can't be
changed. If the VRF is a simple leaf, it becomes possible to change
it and thus move the interface between VRFs. For now I mark the leaf
as a "config false" as it's not yet possible to control it from FRR.
But we can't simply remove the VRF from the key, because it is needed to
distinguish interfaces when using netns based VRFs, as it is possible to
have multiple interfaces with the same name in different namespaces. To
handle this, I came up with an idea to store both VRF and an interface
name in the "name" leaf using the pattern "vrfname:ifname". For example,
if there's an interface "eth0" in VRF "red" then its "name" leaf will be
"red:eth0".
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
add a parameter to resolver api that is the vrf identifier. this permits
to make resolution self to each vrf. in case vrf netns backend is used,
this is very practical, since resolution can happen on one netns, while
it is not the case in an other one.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Since f60a1188 we store a pointer to the VRF in the interface structure.
There's no need anymore to store a separate vrf_id field.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
For IPv4 matching, we have "match ip next-hop address A.B.C.D".
For IPv6 matching, we have "match ipv6 next-hop X:X::X:X".
To have consistency, let's add "address" keyword to IPv6 commands.
Old commands are preserved as hidden for backward compatibility.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
(This is mostly just to exercise the code, the actual replacement needs
to be a cocci script.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
... these should probably have been added ages ago.
`json_object_string_addf(json, "key", "%pFX", prefix)` is super useful.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
... this is copypasted all over the codebase & should've been a helper
to begin with really.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Most users of if_lookup_address_exact only cared about whether the
address is any local address. Split that off into a separate function.
For the users that actually need the ifp - which I'm about to add a few
of - change it to prefer returning interfaces that are UP.
(Function name changed due to slight change in behavior re. UP state, to
avoid possible bugs from this change.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
i.e. to whoever cares, since some unique IDs (from libfrr) are valid
everywhere but some others (from the daemons) only apply to specific
daemons.
(Default handling aborts on first error, so configuring any unique IDs
that don't exist on the first daemon vtysh connects to just failed
before this.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Why would this be in a vector to loop over with strcmp()'ing each
item... that just makes no sense. Use a hash instead.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
We should always treat the VRF interface as a loopback. Currently, this
is not the case, because in some old pre-VRF code we use if_is_loopback
instead of if_is_loopback_or_vrf. To avoid any future problems, the
proposal is to rename if_is_loopback_or_vrf to if_is_loopback and use it
everywhere. if_is_loopback is renamed to if_is_loopback_exact in case
it's ever needed, but currently it's not used anywhere.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Fixes the following compilation errors:
In file included from /home/ruslan/sdk/sysroots/corei7-64-poky-linux/usr/include/libyang/tree_data.h:30,
from /home/ruslan/sdk/sysroots/corei7-64-poky-linux/usr/include/libyang/context.h:22,
from /home/ruslan/sdk/sysroots/corei7-64-poky-linux/usr/include/libyang/libyang.h:24,
from ../../src/frr/lib/yang.h:25,
from ../../src/frr/lib/northbound.h:27,
from ../../src/frr/lib/vty.h:36,
from ../../src/frr/lib/ferr.h:28,
from ../../src/frr/lib/lib_errors.h:24,
from ../../src/frr/lib/northbound_confd.c:23:
../../src/frr/lib/northbound_confd.c: In function 'frr_confd_init_cdb':
../../src/frr/lib/northbound_confd.c:533:28: error: 'const struct lys_module' has no member named 'data'
533 | LY_LIST_FOR (module->info->data, snode) {
| ^~
../../src/frr/lib/northbound_confd.c: In function 'frr_confd_data_get_next_object':
../../src/frr/lib/northbound_confd.c:921:3: warning: assignment discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
921 | LY_LIST_FOR (lysc_node_child(nb_node->snode), child) {
| ^~~~~~~~~~~
Signed-off-by: Ruslan Babayev <ruslan@babayev.com>
We had various forms of min/max macros across multiple daemons
all of which duplicated what we have in compiler.h. Convert
everyone to use the `correct` ones
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Currently, we automatically delete an inactive VRF when its last
interface is deleted. This code introduces a couple of crashes because
of the following problems:
- vrf_delete is called before calling if_del hook, so daemons may try to
dereference an ifp->vrf pointer which is freed
- in if_terminate, we continue to use the VRF in the loop condition
after the last interface is deleted
This check is needed only when the interface is deleted by the user,
because if the interface is deleted by the system, VRF must still exist
in the system. Move the check to appropriate places to fix crashes.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
There are two APIs to control the expected number of hops for a BFD
session – `bfd_sess_set_mininum_ttl` and `bfd_sess_set_hop_count`.
The former is very confusing, as it takes an expected TTL in the
BFD packet which is actually a protocol internal value. The latter is
simple and straightforward – it takes an expected number of hops, which
is always 1 for single-hop and >1 for multi-hop.
As the former API is not used anywhere, just remove it to avoid any
confusion.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
... need to ignore TLS sections, their address is effectively
meaningless but can overlap other sections we actually need to access.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
`assert.h` -> `xref.h` -> `typesafe.h` -> `assert.h`
Might be possible to do this more cleanly some way, but that way is not
obvious, so here's the "simple & dumb" approach.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Looks much prettier if `libunwind` is available, but works with glibc or
libexecinfo's `backtrace()` too.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
... its only purpose was to serve as a footgun, and all such uses have
been eliminated now.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The `struct thread **ref` that the thread code takes is written to and
needs to stay valid over the lifetime of a thread. This does not hold
up if thread pointers are directly put in a `vector` since adding items
to a `vector` may reallocate the entire array. The thread code would
then write to a now-invalid `ref`, potentially corrupting entirely
unrelated data.
This should be extremely rare to trigger in practice since we only use
one c-ares channel, which will likely only ever use one fd, so the
vector is never resized. That said, c-ares using only one fd is just
plain fragile luck.
Either way, fix this by creating a resolver_fd tracking struct, and
clean up the code while we're at it.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
These are just used to iterate over active vty sessions, a vector is a
weird choice there.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
These had no remaining users for a while now. The logging backend has
its own list of receivers.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Constify some BFD library function parameters to signalize they are
not going to get modified.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
If the VRF is not enabled, if_terminate deletes the VRF after the last
interface is removed from it. Therefore daemons crash on the subsequent
call to vrf_delete. We should call vrf_delete only for enabled VRFs.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
When the netns is deleted, we should always clear the vrf->ns_ctxt
pointer. Currently, it is not cleared when there are interfaces in the
netns at the time of deletion.
If the netns is re-created, zebra crashes because it tries to use the
stale pointer.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
The script entries were being stored in a hash lookup with
the script name a pre-defined array of characters. The hash
lookup is succeeding since it is auto-installed at script
start time irrelevant if there is a handler function.
Modify the code so that if the scriptname is an empty
string "\0" just return a NULL so that zebra does
not attempt to actually load up the script
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
ls_node_same, ls_attributes_same and ls_prefix_same are not producing expected
result due to a wrong usage of memcmp. In addition, if respective structures
are not initialized with 0, there is a risk that the comparison failed.
This patch correct usage of memcmp and expand comparison to each invidual
parameters of the respective structure for safer result.
Signed-off-by: Olivier Dugeon <olivier.dugeon@orange.com>
This function doesn't work correctly with netns VRF backend as the same
index may be used in multiple netns simultaneously. So let's hide it
from the public API to reduce temptation to use it instead of writing
the correct code.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
The fact that the interface name is used in some nexthop config doesn't
mean that the interface is configured.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
```
exit1-debian-9(config-route-map)# match ip route-source prefix-list ?
<cr>
PREFIXLIST_NAME IP prefix-list name
p1 p2
```
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
zclient_send_localsid is called by various routing protocol daemons. To set the
srv6 endpoint function. Fix a hard-coded error in the initial implementation.
Before this PR, the srv6 function will be registered to zebra as a BGP route
even if isisd executes zclient_send_localsid.
Signed-off-by: Hiroki Shirokura <hiroki.shirokura@linecorp.com>
When writing the config from the NB-converted daemon, we must not rely
on the operational data. This commit changes the output of the interface
configuration to use only config data. As the code is the same for all
daemons, move it to the lib and remove all the duplicated code.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Do not return pointer to the newly created thread from various thread_add
functions. This should prevent developers from storing a thread pointer
into some variable without letting the lib know that the pointer is
stored. When the lib doesn't know that the pointer is stored, it doesn't
prevent rescheduling and it can lead to hard to find bugs. If someone
wants to store the pointer, they should pass a double pointer as the last
argument.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
The function thread_is_scheduled allows us to know if
the particular thread is scheduled for execution or not.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This removes a giant `switch { }` block from lib/zclient.c and
harmonizes all zclient callback function types to be the same (some had
a subset of the args, some had a void return, now they all have
ZAPI_CALLBACK_ARGS and int return.)
Apart from getting rid of the giant switch, this is a minor security
benefit since the function pointers are now in a `const` array, so they
can't be overwritten by e.g. heap overflows for code execution anymore.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
*_anywhere(item) returns whether an item is on _any_ container. Only
available for unsorted containers for now.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Using a non-NULL sentinel allows distinguishing between "end of list"
and "item not on any list". It's a compare either way, just the value
is different.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
This provides a "is this item on this list" check, which may or may not
be faster than using *_find() for the same purpose. (If the container
has no faster way of doing it, it falls back to using *_find().)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Some of the typesafe containers didn't null out their innards of items
after an item was deleted or popped off the container. This is both a
bit unsafe as well as hinders the upcoming _member() from working
efficiently.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
It allows FRR to read the interface config even when the necessary VRFs
are not yet created and interfaces are in "wrong" VRFs. Currently, such
config is rejected.
For VRF-lite backend, we don't care at all about the VRF of the inactive
interface. When the interface is created in the OS and becomes active,
we always use its actual VRF instead of the configured one. So there's
no need to reject the config.
For netns backend, we may have multiple interfaces with the same name in
different VRFs. So we care about the VRF of inactive interfaces. And we
must allow to preconfigure the interface in a VRF even before it is
moved to the corresponding netns. From now on, we allow to create
multiple configs for the same interface name in different VRFs and
the necessary config is applied once the OS interface is moved to the
corresponding netns.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
When something is used only from zebra and part of its description is
"should be called from zebra only" then it belongs to zebra, not lib.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
... to speed up vector_empty_slot() among other things.
Behavior should be 100% identical to previous.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
This function doesn't work correctly with netns VRF backend as the same
ifname may be used in multiple netns simultaneously. So let's hide it
from the public API to reduce temptation to use it instead of writing
the correct code.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
The fact that the interface name is used in some nexthop config doesn't
mean that the interface is configured.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
```
exit1-debian-9# show ip route 172.16.16.1/32
Routing entry for 172.16.16.1/32
Known via "bgp", distance 20, metric 0, best
Last update 00:00:28 ago
* 192.168.0.2, via eth1, weight 1
AS-Path : 65003
Communities : first 65001:2 65001:3
Large-Communities: 65001:1:1 65001:1:2 65001:1:3
Selection reason : First path received
```
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
To ensure this, add a const modifier to functions' arguments. Would be
great do this initially and avoid this large code change, but better
late than never.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
The switch statement over `enum zebra_link_type` had a default
and FRR was missing a few of the pre-defined types we cared about.
Remove the default statement and add the missing values.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
There's no more difference between number-named and word-named access-lists.
This commit removes separate arguments for number-named ACLs from CLI.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
We should never pass pointers to local variables to thread_add_* family.
When an event is executed, the library writes into this pointer, which
means it writes into some random memory on a stack.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
The current code passes an address of a local variable to `thread_add_read`
which stores it into `thread->ref` by the lib. The next time the thread
callback is executed, the lib stores NULL into the `thread->ref` which
means it writes into some random memory on the stack.
To fix this, we should pass a pointer to the vector entry to the lib.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
`yang_dnode_get` will `assert` if no YANG node/model exist, so lets test for
its existence first before trying to access it.
This `assert` is only acceptable for internal FRR usage otherwise we
might miss typos or unmatching YANG models nodes/leaves. For gRPC usage
we should let users attempt to use non existing models without
`assert`ing.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Since there's very few locations where the `frr-format` actually prints
false positive warnings, consensus seems to be to just work around the
false positives even if the code is correct.
In fact, there is only one pattern of false positives currently, in
`bfdd/dplane.c` which does `vty_out("%"PRIu64, (uint64_t)be64toh(...))`.
The workaround/fix for this is a replacement `be64toh` whose type is
always `uint64_t` regardless of what OS we're on, making the cast
unnecessary.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Pass down the safi for when we need address
resolution. At this point in time we are
hard coding the safi to SAFI_UNICAST.
Future commits will take advantage of this.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
These are no longer really needed. The client just needs
to call nexthop resolution instead.
So let's remove the zapi types.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
After `<daemon_name> is not running` message vtysh does not return
error. For example if you disable ospf in `/etc/frr/daemons` and run
`vtysh -c configure -c "router ospf"` it prints the message to stderr,
but returns 0.
This commit will make vtysh return error when not in interractive mode.
But if you run commands from vtysh, you will still be able to enter
views and exit them if daemon is not running.
Signed-off-by: Yaroslav Fedoriachenko <yar.fed99@gmail.com>
Instead of sorting each command one-by-one using listnode_add_sort, add
them to the list without sorting and then sort the list only once.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
frrmod_load() attempts to dlopen() several possible paths
(constructed from its basename argument) until one succeeds.
Each dlopen() attempt may fail for a different reason, and
the important one might not be the last one. Example:
dlopen(a/foo): file not found
dlopen(b/foo): symbol "bar" missing
dlopen(c/foo): file not found
Previous code reported only the most recent error. Now frrmod_load()
describes each dlopen() failure.
Signed-off-by: G. Paul Ziemba <paulz@labn.net>
Sometimes it's needed to match by fields of one object but set fields of
another object. The following commit is an example.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
vrf_name_to_id() returned VRF_DEFAULT when the vrf name was
unknown, hiding errors. Per community recommendation, vrf_name_to_id()
is now removed and the few callers now use vrf_lookup_by_name()
directly.
Signed-off-by: G. Paul Ziemba <paulz@labn.net>
Description:
Change is intended for fixing the following issues related to vrf route leaking:
Routes with special nexthops i.e. blackhole/sink routes when imported,
are not programmed into the FIB and corresponding nexthop is set as 'inactive',
nexthop interface as 'unknown'.
While importing/leaking routes between VRFs, in case of special nexthop(ipv4/ipv6)
once bgp announces route(s) to zebra, nexthop type is incorrectly set as
NEXTHOP_TYPE_IPV6_IFINDEX/NEXTHOP_TYPE_IFINDEX
i.e. directly connected even though we are not able to resolve through an interface.
This leads to nexthop_active_check marking nexthop !NEXTHOP_FLAG_ACTIVE.
Unable to find the active nexthop(s), route is not programmed into the FIB.
Whenever BGP leaks routes, set the correct nexthop type, so that route gets resolved
and correctly programmed into the FIB, in the imported vrf.
Co-authored-by: Kantesh Mundaragi <kmundaragi@vmware.com>
Signed-off-by: Iqra Siddiqui <imujeebsiddi@vmware.com>
Without this, the hook code creates functions with empty parameter lists
like "void hook_something()", which is not a proper C prototype. It
needs to be "void hook_something(void)". Add some macro shenanigans to
handle that.
... and make the plumbing functions "inline" too.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
This fixes some spurious warnings on *BSD, where `elffile_add_dynreloc`
isn't used since `elf_getdata_rawchunk` is not available.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Commit: 4b983eef2c
Modified the zapi send receive of the c-bit to only
be under the HAVE_BFDD. If you are using ptm-bfd
then the decoder function still expects this to be
sent down. This commit puts this behavior back
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This allows defining a CLI command like this:
`[no] some setting ![VALUE]`
with VALUE being optional for the "no" form, but required for the
positive form. It's just a `[...]` where the empty branch can only be
taken for commands starting with `no`.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Instead of adding a separate case clause for every node, just find the
node structure in the global list and get its parent node from there.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Insist on the fact that zclient neighbor state flags are
mapped over netlink state flags. List all the defines
currently known on kernel, and create a netlink API to
convert netlink values to zclient values. The function is
simplified as it is a 1-1 match.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
As NHRP expects some notification of neighboring entries on GRE
interface, when a new interface notification is encountered, the
exact neighbor state flag is found. Previously, the flag passed
to the upper layer was forced to NDM_STATE which is REACHABLE,
as can be seen on below trace:
2021/08/25 10:58:39 NHRP: [QQ0NK-1H449] Netlink: new-neigh 102.1.1.1 dev gre1 lladdr 10.125.0.2 nud 0x2 cache used 1 type 5
When passing the real value, NHRP received an other value like STALE.
2021/08/25 11:28:44 NHRP: [QQ0NK-1H449] Netlink: new-neigh 102.1.1.1 dev gre1 lladdr 10.125.0.2 nud 0x4 cache used 0 type 5
This flag is important for NHRP, as it permits to monitor the link
layer of NHRP entries.
Fixes: d603c0774e ("nhrp, zebra, lib: enforce usage of zapi_neigh_ip structure")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
"[no] netns NAME" commands are part of the lib, but they are actually
zebra-only:
- they are using vrf_netns_handler_create and its description clearly
says that it "should be called from zebra only"
- vtysh sends these commands only to zebra
- only zebra outputs the netns related config
- zebra notifies other daemons about netns attachment
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
There is a possibility that the same line can be matched as a command in
some node and its parent node. In this case, when reading the config,
this line is always executed as a command of the child node.
For example, with the following config:
```
router ospf
network 193.168.0.0/16 area 0
!
mpls ldp
discovery hello interval 111
!
```
Line `mpls ldp` is processed as command `mpls ldp-sync` inside the
`router ospf` node. This leads to a complete loss of `mpls ldp` node
configuration.
To eliminate this issue and all possible similar issues, let's print an
explicit "exit" at the end of every node config.
This commit also changes indentation for a couple of existing exit
commands so that all existing commands are on the same level as their
corresponding node-entering commands.
Fixes#9206.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
There were paths where the zmq wrapper lib could call user
callbacks that would free the internal context struct, but the
context was then used in the lib code. Use a boolean to avoid
freeing the context within an application callback.
Restore logic that frees the context within the 'cancel' api.
Signed-off-by: Mark Stapp <mjs.ietf@gmail.com>
The zeromq lib wrapper uses an internal context struct to help
interact with the libfrr event mechanism. When freeing that
context struct, ensure the caller's pointer is also cleared.
Signed-off-by: Mark Stapp <mjs.ietf@gmail.com>
While defaults are good picks for "reasonable" guesses, min and max
range values really aren't. Operators and experimenters often like to
configure "unreasonable" values to stress test, tests boundary
conditions and explore innovations.
With that in mind, change all ranges to 1..max (of type).
While we're here add optional ignored values in the "no" CLI forms.
Signed-off-by: Christian Hopps <chopps@labn.net>
When using "match evpn default-route" rule, match_arg is NULL and strcmp
is not happy with that. There's already a special function named rulecmp
that handles such situations.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Currently, when we check the new prefix-list entry for duplication, we
only take filled in fields into account and ignore optional fields.
For example, if we already have `ip prefix-list A 0.0.0.0/0 le 32` and
we try to add `ip prefix-list A 0.0.0.0/0`, it is treated as duplicate.
We should always compare all prefix-list fields when doing the check.
Fixes#9355.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Previous:
- frrscript_load: push Lua function onto stack
- frrscript_call: calls Lua function
Now:
- frrscript_load: checks Lua function
- frrscript_call: pushes and calls Lua function (first clear the stack)
So now we just need one frrscript_load for consecutive frrscript_call.
frrscript_call does not recompile or reload the script file, it just
keys it from the global Lua table (where it should already be).
Signed-off-by: Donald Lee <dlqs@gmx.com>
When we get a bad value for the opaque data length, instead
of stopping the program, discard the data and move on.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The only difference in daemons' interface node definition is the config
write function. No need to define the node in every daemon, just pass
the callback as an argument to a library function and define the node
there.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
When running `make check` against a build that zeromq enabled
the test_zmq unit test was crashing. The commit:
e08165def1
Introduced this crash. Removing the part of the commit
that was causing the crash in the test. This is mainly
to get `make check` working again for those people using
zeromq in their builds.
Fixes: #9176
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
like the other automake variables, setting `xyz_LDFLAGS` causes
`AM_LDFLAGS` to be ignored for `xyz`. For some reason I had in my mind
that automake doesn't do this for LDFLAGS, but... it does. (Which is
consistent with `_CFLAGS` and co.)
So, all the libraries and modules have been ignoring `AM_LDFLAGS` (which
includes `SAN_FLAGS` too). Set up new `LIB_LDFLAGS` and
`MODULE_LDFLAGS` to handle all of this correctly (and move these bits to
a central location.)
Fixes: #9034
Fixes: 0c4285d77e ("build: properly split CFLAGS from AC_CFLAGS")
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
When exiting from link-params node, we must not decrement xpath_index
because it is not incremented when entering the node.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Will be handy to filter BGP prefixes by using BGP community alias
instead of numerical community values.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
frrscript_load now loads a function instead of a file, so frrscript_unload
should be renamed since it does not unload a function.
Signed-off-by: Donald Lee <dlqs@gmx.com>
There is some rather heavy error checking logic in frrscript_call. Normally
I'd put this in the _frrscript_call function, but the error checking needs
to happen before the encoders/decoders in the macro are called.
The error checking looks messy but its really just nested ternary
expressions insite a larger statement expression.
Signed-off-by: Donald Lee <dlqs@gmx.com>
Instead of 1 frrscript : 1 lua state, it is changed to 1 : N.
The states are hashed with their function names.
Signed-off-by: Donald Lee <dlqs@gmx.com>
Move `is_default_prefix` variations to `lib/prefix.h` and make the code
use the library version instead of implementing it again.
NOTE
----
The function was split into per family versions to cover all types.
Using `union prefixconstptr` is not possible due to static analyzer
warnings which cause CI to fail.
The specific cases that would cause this failure were:
- Caller used `struct prefix_ipv4` and called the generic function.
- `is_default_prefix` with signature using `const struct prefix *` or
`union prefixconstptr`.
The compiler would complain about reading bytes outside of the memory
bounds even though it did not take into account the `prefix->family`
part.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
We are sending up to ZAPI_MESSAGE_OPAQUE_LENGTH but checking
for one less. We know the data will fit in it to that size.
Also we have asserts on the write to ensure we don't go over
it
Fixes: #8995
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
There's nothing that can be done here with an error. Try to make
Coverity understand that this is intentional.
(I don't know if the `(void)` will actually fix the coverity warning,
but I don't really have a better way to figure it out beyond just
getting this merged and waiting for a result...)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
While we do have `show ip prefix-list NAME A.B.C.D/M`, that doesn't
actually run the prefix list matching code. While the result would
hopefully be the same anyway, let's have a way to call the actual prefix
list match code and get a result.
(As an aside, this might be useful for scripting to do a quick "is this
prefix in that prefix list" check.)
Signed-off-by: David Lamparter <equinox@diac24.net>
... the PIM code is kinda misusing prefix lists to match addresses.
Considering the weird semantics of access-lists, I can't fault it.
However, prefix lists aren't great at matching addresses by default,
since they try to match the prefix length too. So, here's an "address
match mode" for prefix lists to get that to work more reasonably.
Fixes: #8492
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
RFC 3623 specifies the Graceful Restart enhancement to the OSPF
routing protocol. This PR implements support for the restarting mode,
whereas the helper mode was implemented by #6811.
This work is based on #6782, which implemented the pre-restart part
and settled the foundations for the post-restart part (behavioral
changes, GR exit conditions, and on-exit actions).
Here's a quick summary of how the GR restarting mode works:
* GR can be enabled on a per-instance basis using the `graceful-restart
[grace-period (1-1800)]` command;
* To perform a graceful shutdown, the `graceful-restart prepare ospf`
EXEC-level command needs to be issued before restarting the ospfd
daemon (there's no specific requirement on how the daemon should
be restarted);
* `graceful-restart prepare ospf` will initiate the graceful restart
for all GR-enabled instances by taking the following actions:
o Flooding Grace-LSAs over all interfaces
o Freezing the OSPF routes in the RIB
o Saving the end of the grace period in non-volatile memory (a JSON
file stored in `$frr_statedir`)
* Once ospfd is started again, it will follow the procedures
described in RFC 3623 until it detects it's time to exit the graceful
restart (either successfully or unsuccessfully).
Testing done:
* New topotest featuring a multi-area OSPF topology (including stub
and NSSA areas);
* Successful interop tests against IOS-XR routers acting as helpers.
Co-authored-by: GalaxyGorilla <sascha@netdef.org>
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
This replaces the external libsystemd dependency with... pretty much the
same amount of built-in code. But with one fewer dependency and build
switch needed.
Also check `JOURNAL_STREAM` for future logging integration.
Signed-off-by: David Lamparter <equinox@diac24.net>
There are a few places in the code where we use PREFIX_COPY(_IPV4/IPV6)
macro to copy a prefix. Let's always use prefix_copy function for this.
This should fix CID 1482142 and 1504610.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Adding defensive code to the interface_link_params zebra callback
to check if the link params changed before taking action.
Signed-off-by: Karen Schoener <karen@voltanet.io>
Adding the "clear ipv6 ospf6 command" . It resets
the ospfv3 datastructures and clears the database
as well as route tables. It resets the neighborship
by restarting the interface state machine.
If the user wants to change the router-id, this
command updates the router-id to the latest static
router-id and starts the neighbor formation with
the new router-id.
Signed-off-by: Yash Ranjan <ranjany@vmware.com>
Adding a `\n' should now produce a warning. Controlled by `-Werror` so
if you're doing a dev build and it's warning about some `prefix2str`
that should be converted to `%pFX`, you can turn off `-Werror` to fix it
later like with all other warnings.
Signed-off-by: David Lamparter <equinox@diac24.net>
This might be faster if at some point in the future the Linux vDSO
supports CLOCK_THREAD_CPUTIME_ID without making a syscall. (Same
applies for other OSes.)
Signed-off-by: David Lamparter <equinox@diac24.net>
...really no reason to force this into a compile time decision. The
only point is avoiding the getrusage() syscall, which can easily be a
runtime decision.
[v2: also split cputime & walltime limits]
Signed-off-by: David Lamparter <equinox@diac24.net>
Use this noop decoder for const values (since we can't mutate a const
value passed into frrscript_call anyways) or values we don't want to
write a decoder for.
Signed-off-by: Donald Lee <dlqs@gmx.com>
This is an example of creating encoders and decoders for user defined
structs and registering them in the ENCODE_ARGS DECODE_ARGS macro
in frrscript.
Signed-off-by: Donald Lee <dlqs@gmx.com>
Split existing lua_to* functions into two functions:
- lua_decode_*: unmarshall Lua values into an existing C data structure
- lua_to*: allocate *and* unmarshall Lua values into C data structure
This allows us to mutate C data structures passed into frrscript_call,
without allocation
Signed-off-by: Donald Lee <dlqs@gmx.com>
If we have the following configuration:
```
vrf red
smth
exit-vrf
!
interface red vrf red
smth
```
And we delete the VRF using "no vrf red" command, we end up with:
```
interface red
smth
```
Interface config is preserved but moved to the default VRF.
This is not an expected behavior. We should remove the interface config
when the VRF is deleted.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
glibc removed its pid cache a while back, and grabbing this from TLS is
faster than repeatedly calling the kernel...
Also use `intmax_t` which is more appropriate for both PID & TID.
Signed-off-by: David Lamparter <equinox@diac24.net>
Since the file targets append one anyway, save them some extra work.
syslog can use `%.*s` since it's "forced" printf by API anyway.
Signed-off-by: David Lamparter <equinox@diac24.net>
printfrr() recently acquired the capability to record start/end of
formatting outputs. Make use of this in the zlog code so logging
targets have access to this information.
(This also records how long the `[XXXXX-XXXXX][EC 9999999]` prefix was
so log targets can choose to skip over it.)
Signed-off-by: David Lamparter <equinox@diac24.net>
`log-filter WORD` was giving me a serious headache since it also matches
`log WORD` due to the way the CLI token handling works. This meant that
a mistyped `log something` command would silently be interpreted as a
filter string, causing me serious headscratching and WTFs until I
figured what was going on.
Remove this UX pitfall so noone else falls into it. (Since the command
was never saved to config, renaming it shouldn't cause trouble.)
[Also I apparently forgot to update the docs when I transferred this
over to the new zlog bits...]
TODO for a rainy day: since we collect all the CLI commands anyway, we
should warn somewhere for "2nd level ambiguous" commands like this.
Signed-off-by: David Lamparter <equinox@diac24.net>
Almost all functions currently marked with pure attribute acquire a
route_node lock. By marking them pure we allow compiler to optimize the
code and not call them when it already knows the return value. This is
completely incorrect.
Only two of eleven functions can be marked as pure. And they still won't
be optimized because they are never called from the same function twice.
Let's remove the ext_pure macro completely to reduce the chance of
repeating this mistake in the future.
Fixes#8866, #8809, #8595, #6992.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
This commit fixes the following problem:
- enter the interface node
- move the interface to another VRF
- try to continue configuring the interface
It is not possible to continue configuration because the XPath stored in
the vty doesn't correspond with the actual state of the system anymore.
For example:
```
nfware# conf
nfware(config)# interface enp2s0
<-- move the enp2s0 to a different VRF -->
nfware(config-if)# ip router isis 1
% Failed to get iface dnode in candidate DB
```
To fix the issue, go through all connected vty shells and update the
stored XPath.
Suggested-by: Renato Westphal <renato@opensourcerouting.org>
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Always terminate default VRF last during FRR shutdown.
On shutdown we were simply looping over the RB tree and terminating
VRFs from the ROOT. This is not guaranteed to be the default last ever.
Instead switch to RB_SAFE and skip the default VRF till the very end.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
The %p printf format specifier does already print the pointer address
with a leading "0x" prefix (indicating a hexadecimal number). There's
no need to add that prefix manually.
While here, replace explicit function names in log messages by
__func__.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
- Add following set clause for route-maps
"set evpn gateway-ip <ipv4|ipv6 >A.B.C.D|X:X::X:X"
- When this route-map is applied as outboubd policy in BGP, it will set the
gateway-ip in BGP attribute For EVPN type-5 routes.
Example configuration:
route-map RMAP-EVPN_GWIP permit 5
set evpn gateway-ip ipv4 50.0.2.12
set evpn gateway-ip ipv6 50:0:2::12
router bgp 101
bgp router-id 10.100.0.1
neighbor 10.0.1.2 remote-as 102
!
address-family l2vpn evpn
neighbor 10.0.1.2 activate
neighbor 10.0.1.2 route-map RMAP-EVPN_GWIP out
advertise-all-vni
exit-address-family
Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
Fix the following address sanitizer crash when running the command `find`:
ERROR: AddressSanitizer: dynamic-stack-buffer-overflow
WRITE of size 1 at 0x7fff4840fc1d thread T0
0 in print_cmd ../lib/command.c:1541
1 in cmd_find_cmds ../lib/command.c:2364
2 in find ../vtysh/vtysh.c:3732
3 in cmd_execute_command_real ../lib/command.c:995
4 in cmd_execute_command ../lib/command.c:1055
5 in cmd_execute ../lib/command.c:1219
6 in vtysh_execute_func ../vtysh/vtysh.c:486
7 in vtysh_execute ../vtysh/vtysh.c:671
8 in main ../vtysh/vtysh_main.c:721
9 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)
10 in _start (/usr/bin/vtysh+0x21f64d)
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Test uses staticd which required some C++ header protections.
Additionally, the test also runs in the ubuntu20 docker container as
grpc is supported there by the packaging system.
Signed-off-by: Christian Hopps <chopps@labn.net>
When the VRF node is exited using "exit" or "quit", there's still a VRF
pointer stored in the vty context. If you try to configure some router
related command, it will be applied to the previous VRF instead of the
default VRF. For example:
```
(config)# vrf test
(config-vrf)# ip router-id 1.1.1.1
(config-vrf)# do show run
...
!
vrf test
ip router-id 1.1.1.1
exit-vrf
!
...
(config-vrf)# exit
(config)# ip router-id 2.2.2.2
(config)# do show run
...
!
vrf test
ip router-id 2.2.2.2
exit-vrf
!
...
```
`vrf-exit` works correctly, because it stores a pointer to the default
VRF into the vty context (but weirdly keeping the VRF_NODE instead of
changing it to CONFIG_NODE).
Instead of relying on the behavior of exit function, always use the
default VRF when in CONFIG_NODE.
Another problem is missing `VTY_CHECK_CONTEXT`. If someone deletes the
VRF in which node the user enters the command, then zebra applies the
command to the default VRF instead of throwing an error.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Currently, we output the command exactly how it is defined in DEFUN.
We shouldn't output varnames and excessive whitespace.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
https://github.com/FRRouting/frr/pull/5865#discussion_r597670225
As this comment says. ZEBRA_FLAG_XXX should not have been used.
To communicate SRv6 Route Information. A simple Nexthop Flag would
have been sufficient for SRv6 information. And I fixed the whole
thing that way.
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
I agree with the PR review from mstap and merged
the following two functions into one.
- zapi_nexthop_seg6_cmp
- zapi_nexthop_seg6local_cmp
[note]
If both of them have more complex extensions in the future,
I think it will be less confusing to remove the integration
of these functions and make them as separate functions as
before.
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
This commit add usuful function to configure SRv6 localsid
which is represented with seg6local lwt route.
Now, it can support only NEXTHOP_TYPE_IFINDEX route.
Actual configurationof SRv6 localsid is performed with
ZEBRA_ROUTE_ADD. So this is just a wrapper function
for route-install.
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
This commit add new nexthop's addional object for SRv6
routing about seg6 route. Before this commit,
we can add MPLS info as additional object on nexthop.
This commit make it add more support about seg6 routes.
seg6 routes are ones of the LWT routing mechanism,
so configuration of seg6local routes is performed by
ZEBRA_ROUTE_SEND, it's same as MPLS configuration.
Real configuration implementation isn't implemented at
this commit. later commit add that. This commit add
only nexthop additional object and some misc functions.
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
This commit is a part of #5853 works that add new structures for
SRv6-locator. This structure will be used by zebra and another
routing daemon and its ZAPI messaging to manage SRv6-locator.
Encoder/decoder for ZAPI stream is also added by this commit.
Real configuration mechanism isn't implemented at this commit.
later commit add real configure implementation. This commit add only
SRv6-locator's structures and misc functions.
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
This commit is a part of #5853 that add new cmd-node for SRv6 configuration.
This commit just add cmd-node and moving node cli only, acutual SRv6 config
command isn't added. (that is added later commit. of this branch)
new cli nodes:
* SRv6
* SRv6-locators
* SRv6-locator
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
This commit is a part of #5853 works that add new nexthop's addional
object for SRv6 routing about seg6local route. Before this commit,
we can add MPLS info as additional object on nexthop.
This commit make it add more support about seg6local routes.
seg6local routes are ones of the LWT routing mechanism,
so configuration of seg6local routes is performed by
ZEBRA_ROUTE_SEND, it's same as MPLS configuration.
Real configuration implementation isn't implemented at this commit.
later commit add that. This commit add only nexthop additional object
and some misc functions.
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
Make seg6local_context2str function's prototype better.
This function is added on commit e496b4203, and function
interface's considering wasn't enough.
Signed-off-by: Hiroki Shirokura <slank.dev@gmail.com>
The backoff code assumed that yang operations always completed quickly.
It checked for > 100 YANG modeled commands happening in under 1 second
to enable batching. If 100 yang modeled commands always take longer than
1 second batching is never enabled. This is the exact opposite of what
we want to happen since batching speeds the operations up.
Here are the results for libyang2 code without and with batching.
| action | 1K rts | 2K rts | 1K rts | 2K rts | 20k rts |
| | nobatch | nobatch | batch | batch | batch |
| Add IPv4 | .881 | 1.28 | .703 | 1.04 | 8.16 |
| Add Same IPv4 | 28.7 | 113 | .590 | .860 | 6.09 |
| Rem 1/2 IPv4 | .376 | .442 | .379 | .435 | 1.44 |
| Add Same IPv4 | 28.7 | 113 | .576 | .841 | 6.02 |
| Rem All IPv4 | 17.4 | 71.8 | .559 | .813 | 5.57 |
(IPv6 numbers are basically the same as iPv4, a couple percent slower)
Clearly we need this. Please note the growth (1K to 2K) w/o batching is
non-linear and 100 times slower than batched.
Notes on code: The use of the new `nb_cli_apply_changes_clear_pending`
is to commit any pending changes (including the current one). This is
done when the code would not correctly handle a single diff that
included the current changes with possible following changes. For
example, a "no" command followed by a new value to replace it would be
merged into a change, and the code would not deal well with that. A good
example of this is BGP neighbor peer-group changing. The other use is
after entering a router level (e.g., "router bgp") where the follow-on
command handlers expect that router object to now exists. The code
eventually needs to be cleaned up to not fail in these cases, but that
is for future NB cleanup.
Signed-off-by: Christian Hopps <chopps@labn.net>
The code that actually calls FRR northbound functions needs to be running in the
master thread. The previous code was running on a GRPC pthread. While fixing
moved to more functional vs OOP to make this easier to see.
Also fix ly merge to merge siblings not throw the originals away.
Signed-off-by: Christian Hopps <chopps@labn.net>
Never send an interface name/index for multihop sessions. It breaks
"neighbor A.B.C.D update-source" config in BGP.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
There are two possible use-cases for the `vrf_bind` function:
- bind socket to an interface in a vrf
- bind socket to a vrf device
For the former case, there's one problem - success is returned when the
interface is not found. In that case, the socket is left unbound without
throwing an error.
For the latter case, there are multiple possible problems:
- If the name is not set, then the socket is left unbound (zebra, vrrp).
- If the name is "default" and there's an interface with that name in the
default VRF, then the socket is bound to that interface.
- In most daemons, if the router is configured before the VRF is actually
created, we're trying to open and bind the socket right after the
daemon receives a VRF registration from zebra. We may not receive the
VRF-interface registration from zebra yet at that point. Therefore,
`if_lookup_by_name` fails, and the socket is left unbound.
This commit fixes all the issues and updates the function description.
Suggested-by: Pat Ruddy <pat@voltanet.io>
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Prior to this commit, updating a prefix-list that is referenced by a
route-map clause will unconditionally delete the root node of that
route-map's prefix-tree (used with route-map optimization).
This is problematic because routes not matching a more specific node
in the tree (i.e. other prefix-list sequences) will not fall-back to
the default node, thus they will not hit any route-map sequences.
This commit ensures that an update to a prefix-list will only delete
the default node while adding the first/only seq to the list.
Example config:
========
ip prefix-list peer475-out-pfxlist seq 45 permit 2.138.0.0/16
ip prefix-list peer475-out-pfxlist seq 50 permit 0.0.0.0/0
!
route-map peer475-out permit 5
match ip address prefix-list peer475-out-pfxlist
Before:
========
ub20# do show route-map peer475-out prefix-table
ZEBRA:
IPv4 Prefix Route-map Index List
_______________ ____________________
0.0.0.0/0 (2)
(P)
peer475-out seq 5
2.138.0.0/16 (2)
(P) 0.0.0.0/0
peer475-out seq 5
IPv6 Prefix Route-map Index List
_______________ ____________________
BGP:
IPv4 Prefix Route-map Index List
_______________ ____________________
0.0.0.0/0 (2)
(P)
peer475-out seq 5
2.138.0.0/16 (2)
(P) 0.0.0.0/0
peer475-out seq 5
IPv6 Prefix Route-map Index List
_______________ ____________________
ub20# conf t
ub20(config)# ip prefix-list peer475-out-pfxlist seq 45 permit 2.138.0.0/16 le 32
ub20(config)# do show route-map peer475-out prefix-table
ZEBRA:
IPv4 Prefix Route-map Index List
_______________ ____________________
2.138.0.0/16 (2)
(P)
peer475-out seq 5
IPv6 Prefix Route-map Index List
_______________ ____________________
BGP:
IPv4 Prefix Route-map Index List
_______________ ____________________
2.138.0.0/16 (2)
(P)
peer475-out seq 5
IPv6 Prefix Route-map Index List
_______________ ____________________
ub20(config)#
After:
========
ub20(config)# do show route-map peer475-out prefix-table
ZEBRA:
IPv4 Prefix Route-map Index List
_______________ ____________________
0.0.0.0/0 (2)
(P)
peer475-out seq 5
2.138.0.0/16 (2)
(P) 0.0.0.0/0
peer475-out seq 5
IPv6 Prefix Route-map Index List
_______________ ____________________
BGP:
IPv4 Prefix Route-map Index List
_______________ ____________________
0.0.0.0/0 (2)
(P)
peer475-out seq 5
2.138.0.0/16 (2)
(P) 0.0.0.0/0
peer475-out seq 5
IPv6 Prefix Route-map Index List
_______________ ____________________
ub20(config)# ip prefix-list peer475-out-pfxlist seq 45 permit 2.138.0.0/16 le 32
ub20(config)# do show route-map peer475-out prefix-table
ZEBRA:
IPv4 Prefix Route-map Index List
_______________ ____________________
0.0.0.0/0 (2)
(P)
peer475-out seq 5
2.138.0.0/16 (2)
(P) 0.0.0.0/0
peer475-out seq 5
IPv6 Prefix Route-map Index List
_______________ ____________________
BGP:
IPv4 Prefix Route-map Index List
_______________ ____________________
0.0.0.0/0 (2)
(P)
peer475-out seq 5
2.138.0.0/16 (2)
(P) 0.0.0.0/0
peer475-out seq 5
IPv6 Prefix Route-map Index List
_______________ ____________________
ub20(config)#
Fixes: 8410
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
lyd_merge_tree replaces dest siblings with source siblings, not what we
want. Instead lyd_merge_siblings to keep both. Instead lyd_merge_siblings
to keep both.
Signed-off-by: Christian Hopps <chopps@labn.net>
This includes community and large-community data.
```
exit1-debian-9# show ip route 172.16.16.1/32
Routing entry for 172.16.16.1/32
Known via "bgp", distance 20, metric 0, best
Last update 00:00:23 ago
* 192.168.0.2, via eth1, weight 1
AS-Path : 65030
Communities : 65001:1 65001:2 65001:3 65001:4 65001:5 65001:6
Large-Communities: 65001:123:1 65001:123:2
```
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Compile with v2.0.0 tag of `libyang2` branch of:
https://github.com/CESNET/libyang
staticd init load time of 10k routes now 6s vs ly1 time of 150s
Signed-off-by: Christian Hopps <chopps@labn.net>
Track 'down' state of connected addresses with a new flag. We
may have multiple addresses on an interface that share a prefix;
in those cases, we need to determine when the first address
is valid, to install a connected route, and similarly detect
when the last address goes 'down', to remove the connected
route.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
When specifying only an "le" for an existing ip prefix-list qualified with
both an "le" and "ge" make sure to remove the "ge" property so it does
not stay in the tree.
E.g. Saying these two things in order:
ip prefix-list test seq 1 permit 1.1.0.0/16 ge 18 le 24
ip prefix-list test seq 1 permit 1.1.0.0/16 ge 18
... should result in the second statement "overwriting" the first like
this:
vxdev-arch# do show ip prefix-list
ZEBRA: ip prefix-list foobar: 3 entries
seq 1 permit 15.0.0.0/16 ge 18
Previously this did not happen and "le" would stick around since it was
never given NB_OP_DESTROY and purged from the data tree.
Signed-off-by: Wesley Coakley <wcoakley@nvidia.com>
Show alias name instead of numerical value in `show bgp <prefix>. E.g.:
```
root@exit1-debian-9:~/frr# vtysh -c 'sh run' | grep 'bgp community alias'
bgp community alias 65001:123 community-1
bgp community alias 65001:123:1 lcommunity-1
root@exit1-debian-9:~/frr#
```
```
exit1-debian-9# sh ip bgp 172.16.16.1/32
BGP routing table entry for 172.16.16.1/32, version 21
Paths: (2 available, best #2, table default)
Advertised to non peer-group peers:
65030
192.168.0.2 from home-spine1.donatas.net(192.168.0.2) (172.16.16.1)
Origin incomplete, metric 0, valid, external, best (Neighbor IP)
Community: 65001:12 65001:13 community-1 65001:65534
Large Community: lcommunity-1 65001:123:2
Last update: Fri Apr 16 12:51:27 2021
exit1-debian-9#
```
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
The distribute_list_init command is not used and is setup
code that will never be used because it makes assumptions about
how distribute-lists work that are fundamentally incorrect.
Remove the code.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Abstract the parsing of distribute lists so that we
don't have as much cut-n-paste code.
This is a setup commit for future work. In effect
current distribute-list handling is all kinds of messed up
a) eigrp and babel both attempt to use distribute-lists, they just plain
don't work.
b) `distribute-list` is only sent to rip. `ipv6 distribute-list`
is sent to ripngd. If you use `distribute-list` under `router ripng`
it sends the command to rip but ripd is in the wrong mode and it
never works.
c) Should ripngd care about v4 and v6 specific distribute-lists?
This dichotomy was added for babel but babel has been broke
about this since day 1( see a ).
All in all we need to unwind this whole mess. Make distribute-list
commands specific to the daemons( so that we can be in the right
sub-mode ). But the parsing is going to be the same across all
daemons. So let's provide that functionality in `lib/distribute.c`
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add a simpler, more limited nexthop comparison function. This
compares a few key attributes, such as vrf, gateway, labels.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Problem Statement:
=================
In scale setup BGP sessions start flapping.
RCA:
====
In virtualized environment there are multiple places where
MTU need to be set. If there are some places were MTU is not set
properly then there is chances that BGP packets get fragmented,
in scale setup this will lead to BGP session flap.
Fix:
====
A new tcp option is provided as part of this implementation,
which can be configured per neighbor and helps to set the TCP
max segment size. User need to derive the path MTU between the BGP
neighbors and set that value as part of tcp-mss setting.
1. CLI Configuration:
[no] neighbor <A.B.C.D|X:X::X:X|WORD> tcp-mss (1-65535)
2. Running config
frr# show running-config
router bgp 100
neighbor 198.51.100.2 tcp-mss 150 => new entry
neighbor 2001:DB8::2 tcp-mss 400 => new entry
3. Show command
frr# show bgp neighbors 198.51.100.2
BGP neighbor is 198.51.100.2, remote AS 100, local AS 100, internal link
Hostname: frr
Configured tcp-mss is 150, synced tcp-mss is 138 => new display
4. Show command json output
frr# show bgp neighbors 2001:DB8::2 json
{
"2001:DB8::2":{
"remoteAs":100,
"bgpTimerKeepAliveIntervalMsecs":60000,
"bgpTcpMssConfigured":400, => new entry
"bgpTcpMssSynced":388, => new entry
Risk:
=====
Low - This is a config driven feature and it sets the max segment
size for the TCP session between BGP peers.
Tests Executed:
===============
Have done manual testing with three router topology.
1. Executed basic config and un config scenarios
2. Verified if the config is updated in running config
during config and no config operation
3. Verified the show command output in both CLI format and
JSON format.
4. Verified if TCP SYN messages carry the max segment size
in their initial packets.
5. Verified the behaviour during clear bgp session.
6. done packet capture to see if the new segment size
takes effect.
Signed-off-by: Abhinay Ramesh <rabhinay@vmware.com>
pimd was the only user of this function, and that has gone away now.
So just kill the function.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
These hoops to get warnings for mis-printing `uint64_t` are apparently
breaking some C++ bits...
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The previous method, using zassert.h and hoping nothing includes
assert.h (which, on glibc at least, just does "#undef assert" and puts
its own definition in...) was fragile - and actually broke undetected.
Just provide our own assert.h and control overriding by putting it in a
separate directory to add to the include path (or not.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
When an operator encounters a situation where the number
of FD's open is greater than what we have been configured
to legitimately handle via uname or the `--limit-fds` command
line, abort with a message that they should be able to
debug and figure out what is going on.
Fixes: #8596
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
flush netlink related dependencies with gre information.
Add some linux headers required to compile with it.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
3 new gre commands are available:
- GRE_GET to permit a daemon to retrieve gre information.
- GRE_UPDATe is the reply message from zebra to the daemon. as it is a
syncronous request, the GRE_GET expected will have to match the vrf id
where the gre information is wished. this has an impact on label
manager with change in APIs.
- SET_GRE_SOURCE. this command will be stubbed for now, assuming that
the gre interface is set accordingly by external script.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Description:
This looks broken after NB changes in routemap. When routemap
action modified from permit to deny, it is expected to apply
the new action on the filtered routes before the action in the
routemap data structure has been changed. But currently this is
not handled by the corresponding northbound API.
Signed-off-by: Rajesh Girada <rgirada@vmware.com>
Use unsigned value for all RA requests to Zebra
- encoding signed int as unsigned is bad practice
- RA interval is never, and should never be, negative
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
`config.h` has all the defines from autoconf, which may include things
that switch behavior of other included headers (e.g. _GNU_SOURCE
enabling prototypes for additional functions.)
So, the first include in any `.c` file must be either `config.h` (with
the appropriate guard) or `zebra.h` (which includes `config.h` first
thing.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Don't uninstall sessions if the address, interface, VRF or TTL didn't change.
Update the library documentation to make it clear to other developers.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Currently this flag is only helpful in an extremely rare situation when
the BFD session registration was unsuccessful and after that zebra is
restarted. Let's remove this flag to simplify the API. If we ever want
to solve the problem of unsuccessful registration/deregistration, this
can be done using internal flags, without API modification.
Also add the error log to help user understand why the BFD session is
not working.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Creating any threads before we fork() into the background (if `-d` is
given) is an extremely dangerous footgun; the threads are created in
the parent and terminated when that exits.
This is extra dangerous because while testing, you'd often run the
daemon in foreground without `-d`, and everything works as expected.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
... for any initialization that needs to run after forking, but that
would be racy if it were just scheduled on the thread_master (since the
config load is also just a thread callback, ordering would be undefined
for another scheduled thread callback.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
very_late_init doesn't really say what this does, config_post is much
more descriptive. (A config_pre is coming in a jiffy.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The (legacy) code for reading split configs tries to execute config
commands in parent nodes, but doesn't call the node_exit function when
it goes up to a parent node. This breaks BGP RPKI setup (and extended
syslog, which is in the next commit.)
Doing this correctly is a slight bit involved since the node_exit
callbacks should only be called if the command is actually executed on a
parent node.
Signed-off-by: David Lamparter <equinox@diac24.net>
If the last message in a batched logging operation isn't printed due to
priority, this skips the code that flushes prepared messages through
writev() and can trigger the assert() at the end of zlog_fd().
Since any logmsg above info priority triggers a buffer flush, running
into this situation requires a log file target configured for info
priority, at least 1 message of info priority buffered, a debug message
buffered after that, and then a buffer flush (explicit or due to buffer
full).
I haven't seen this chain of events happen in the wild, but it needs
fixing anyway.
Signed-off-by: David Lamparter <equinox@diac24.net>