Fixes the valgrind error we were seeing on startup due to
initializing the msg header struct:
```
==2534283== Thread 3 zebra_dplane:
==2534283== Syscall param recvmsg(msg) points to uninitialised byte(s)
==2534283== at 0x4D616DD: recvmsg (in /usr/lib64/libpthread-2.31.so)
==2534283== by 0x43107C: netlink_recv_msg (kernel_netlink.c:744)
==2534283== by 0x4330E4: nl_batch_read_resp (kernel_netlink.c:1070)
==2534283== by 0x431D12: nl_batch_send (kernel_netlink.c:1201)
==2534283== by 0x431E8B: kernel_update_multi (kernel_netlink.c:1369)
==2534283== by 0x46019B: kernel_dplane_process_func (zebra_dplane.c:3979)
==2534283== by 0x45EB7F: dplane_thread_loop (zebra_dplane.c:4368)
==2534283== by 0x493F5CC: thread_call (thread.c:1585)
==2534283== by 0x48D3450: fpt_run (frr_pthread.c:303)
==2534283== by 0x48D3D41: frr_pthread_inner (frr_pthread.c:156)
==2534283== by 0x4D56431: start_thread (in /usr/lib64/libpthread-2.31.so)
==2534283== by 0x4E709D2: clone (in /usr/lib64/libc-2.31.so)
==2534283== Address 0x85cd850 is on thread 3's stack
==2534283== in frame #2, created by nl_batch_read_resp (kernel_netlink.c:1051)
==2534283==
==2534283== Syscall param recvmsg(msg.msg_control) points to unaddressable byte(s)
==2534283== at 0x4D616DD: recvmsg (in /usr/lib64/libpthread-2.31.so)
==2534283== by 0x43107C: netlink_recv_msg (kernel_netlink.c:744)
==2534283== by 0x4330E4: nl_batch_read_resp (kernel_netlink.c:1070)
==2534283== by 0x431D12: nl_batch_send (kernel_netlink.c:1201)
==2534283== by 0x431E8B: kernel_update_multi (kernel_netlink.c:1369)
==2534283== by 0x46019B: kernel_dplane_process_func (zebra_dplane.c:3979)
==2534283== by 0x45EB7F: dplane_thread_loop (zebra_dplane.c:4368)
==2534283== by 0x493F5CC: thread_call (thread.c:1585)
==2534283== by 0x48D3450: fpt_run (frr_pthread.c:303)
==2534283== by 0x48D3D41: frr_pthread_inner (frr_pthread.c:156)
==2534283== by 0x4D56431: start_thread (in /usr/lib64/libpthread-2.31.so)
==2534283== by 0x4E709D2: clone (in /usr/lib64/libc-2.31.so)
==2534283== Address 0xa0 is not stack'd, malloc'd or (recently) free'd
==2534283==
```
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Add Quentin's cocci patch to align code with the changes
to the event cancel api. Also added a README to explain what
this collection of cocci patches is for.
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Replace all lib/thread cancel macros, use thread_cancel()
everywhere. Only the THREAD_OFF macro and thread_cancel() api are
supported. Also adjust thread_cancel_async() to NULL caller's pointer (if
present).
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Add the following Anycast-SIDs on routers rt4 and rt5:
* segment-routing prefix 10.10.10.10/32 index 100 no-php-flag n-flag-clear
* segment-routing prefix 2001:db8:1000::10/128 index 101 no-php-flag n-flag-clear
The updated JSON data will then check whether the Anycast-SIDs are
being processed as expected (e.g. rt1 should use ECMP to rt2 and rt3,
rt2 should use rt4 only as it's directly connected, etc).
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Add the "n-flag-clear" option to the "segment-routing prefix"
command. The only thing that option does is to clear the node
flag of the Prefix-SID, even if it corresponds to a local loopback
address. No changes are necessary other than that in order to fully
support Anycast-SIDs. isisd already supports multiple routers
advertising the same route with the same Prefix-SID after the recent
refactoring. Clearing the node flag for such anycast routes isn't
strictly required, but failure to do so can lead to problems like
TI-LFA picking the wrong Prefix-SID when calculating repair paths.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
When computing backup nexthops for routes that contain a Prefix-SID,
the original Prefix-SID label should be present at the end of
backup label stacks (after the repair labels). This commit fixes
that oversight in the original TI-LFA code. The SPF unit tests and
TI-LFA topotes were also updated accordingly.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Embed Prefix-SID information inside SPF data structures so that
Prefix-SIDs can be installed together with their associated routes
at the end of the SPF algorithm. This is different from the current
implementation where Prefix-SIDs are parsed and processed separately,
which is vastly suboptimal.
Advantages of the new code:
* No need to parse the LSPDB an additional time to detect and process
SR-related changes;
* Routes are installed with their Prefix-SID labels in the same ZAPI
message. This can prevent packet dropping for a few milliseconds
after each SPF run if there are BGP-labeled routes (e.g. L3VPN) that
recurse on IGP labeled routes;
* Much easier to support Anycast-SIDs, as the SPF code will naturally
figure out the best nexthops and use only them (that can't be done
in any reasonable way if the Prefix-SID Sub-TVLs are processed
separately);
* Less code to maintain and reduced memory footprint;
The "show isis segment-routing prefix-sids" command was removed as
it doesn't make sense anymore now that "show isis route" exists.
Prefix-SIDs are a property of routes, so what was done was to extend
the "show isis route" command with a new "prefix-sid" option that
changes the output table to show the Prefix-SID information associated
to each route.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
This is preparatory change for the upcoming SR Prefix-SID
refactoring.
Since Prefix-SID information will be stored inside IS-IS routes
(instead of being maintained separately), it will be necessary to
have local routes in order to store local Prefix-SID information.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
When both old and new-style TLVs exist for a particular prefix, give
precedence to the new-style TLV (like JUNOS does) when generating
routes from the SPT. This changes the current behavior which is to
generate a route for both TLVs, whereas the first is overwritten by
the second in a non-deterministic order (i.e. either the old-style
or the new-style TLV can "win" depending on how the SPF TENTative
list is arranged).
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Change thread_cancel to take a ** to an event, NULL-check
before dereferencing, and NULL the caller's pointer. Update
many callers to use the new signature.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Instead of just counting the route suppressions, keep a reference for
all aggregations that are doing it. It should help the with the
following problems:
- Which aggregation suppressed the route.
- Double suppression
- Double unsuppression
- Avoids calling `bgp_process` if already suppressed/unsuppressed.
- Easier code maintenance and understanding
This also fixes a crash when modifying a route map that is
associated with a working aggregate-address.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Currently eigrp has a bunch of commands that are not fully
implemented yet. Tone down the yang code change of making
these in your face errors to zlog_warns, so the end-user
can not be freaked out by the message.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When `-r` is specified to zebra, on shutdown we should
not remove any routes from the fib. This was a problem
with nhg's on shutdown due to their ref-count behavior.
Introduce a methodology where on shutdown we don't mess
with the nexthop groups in the kernel. That way on
next startup things will be ok.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Remove py-ipaddr and ipsec-tools as deps in the Alpine build container,
as these were both Python 2 libraries and are not used here anymore
`ipsec-tools` is also no longer available in Alpine's test repos and was
causing breakage on this builder
Signed-off-by: Wesley Coakley <wcoakley@nvidia.com>
Add test for new aggregate address option: test aggregate address option
without converged routes, then test again with a different route map
with converged routes.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Add new aggregate-address option to selectively suppress routes based
on route map results.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
We currently have a global process queue for handling route
updates in bgp. This is fine, in general, except there are
places and times where we plug the queue for no new work
during certain peer states of bgp update delay. If we
happen to be processing multiple bgp instances on startup
why do we want to stop processing in vrf A when vrf B
is in a bit of a pickle?
Also this separation will allow us to start forward thinking
about how to fully integrate pthreads into route processing
in bgp.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>