Effectively a massive search and replace of
`struct thread` to `struct event`. Using the
term `thread` leads people to think that
this event system is a pthread when it is not.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This is the first in a series of commits whose goal is to rename
the thread system in FRR to an event system. There is a continual
problem where people are confusing `struct thread` with a true
pthread. In reality, our entire thread.c is an event system.
In this commit rename the thread.[ch] files to event.[ch].
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add a hash_clean_and_free() function and convert
the code to use it. This function takes a double
pointer to the hash so that it can set the caller's
pointer to NULL. It also cleanly does nothing if the
pointer is already NULL (a case that a bunch of code
tested for).
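The double-pointer pattern can be sketched roughly as follows. This is
a toy table, not FRR's `struct hash`; the names `toy_hash`,
`toy_hash_create()` and `toy_hash_clean_and_free()` are made up for
illustration only.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a hash table; NOT FRR's struct hash. */
struct toy_hash {
    void **items;
    size_t count;
};

/* Hypothetical constructor, used only for this illustration. */
struct toy_hash *toy_hash_create(size_t count)
{
    struct toy_hash *h;

    assert(count > 0);
    h = calloc(1, sizeof(*h));
    if (h == NULL)
        return NULL;
    h->items = calloc(count, sizeof(void *));
    if (h->items == NULL) {
        free(h);
        return NULL;
    }
    h->count = count;
    return h;
}

/*
 * Free every stored item, free the table, and set the caller's
 * pointer to NULL. Does nothing when hash or *hash is NULL.
 */
void toy_hash_clean_and_free(struct toy_hash **hash,
                             void (*free_func)(void *))
{
    if (hash == NULL || *hash == NULL)
        return;

    for (size_t i = 0; i < (*hash)->count; i++) {
        if ((*hash)->items[i] != NULL && free_func != NULL)
            free_func((*hash)->items[i]);
    }
    free((*hash)->items);
    free(*hash);
    *hash = NULL;
}
```

Callers no longer need their own NULL check or a separate assignment
after the free; calling the function twice on the same pointer is
harmless.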
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
First, `hash_get()` calls with a NULL `alloc_func` are
left unchanged.
This commit only touches `hash_get()` calls with a
non-NULL `alloc_func`.
Since `hash_get()` with a non-NULL `alloc_func` parameter
cannot fail, its return value is never NULL.
So in this case, remove the unnecessary NULL check on
the returned value and, where the value is otherwise
unused, add a `(void)` cast in front of the call.
Importantly, two cases with a non-NULL `alloc_func` are
also left unchanged:
1) `assert(<returned_data> == <searching_data>)` is used to
ensure the node was newly created, not found.
See `isis_vertex_queue_insert()` in isisd; there
are many examples of this case in isisd.
2) `<returned_data> != <searching_data>` is used to detect
a found node, after which <searching_data> is freed.
See `aspath_intern()` in bgpd; there are many
examples of this case in bgpd.
Here, <returned_data> is the value returned by `hash_get()`,
and <searching_data> is the data to be put into the
hash table.
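The call patterns discussed here can be illustrated with a toy lookup
function. This is only a sketch: `toy_table`, `toy_get()`,
`toy_alloc_copy()`, `toy_alloc_intern()` and `toy_intern()` are
hypothetical names, and the table is a plain array rather than FRR's
hash implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MAX 16

/* Toy table of strings; a stand-in for FRR's hash, not its real API. */
struct toy_table {
    char *slot[TOY_MAX];
    size_t used;
};

/* Allocation callback that stores a private copy of the key. */
void *toy_alloc_copy(void *key)
{
    char *copy = malloc(strlen(key) + 1);

    assert(copy != NULL);
    strcpy(copy, key);
    return copy;
}

/* Allocation callback that stores the key object itself. */
void *toy_alloc_intern(void *key)
{
    return key;
}

/*
 * Look up key. When it is absent: return NULL if alloc_func is NULL,
 * otherwise insert alloc_func(key) and return it (never NULL).
 */
void *toy_get(struct toy_table *t, void *key, void *(*alloc_func)(void *))
{
    for (size_t i = 0; i < t->used; i++) {
        if (strcmp(t->slot[i], key) == 0)
            return t->slot[i];
    }
    if (alloc_func == NULL)
        return NULL;
    assert(t->used < TOY_MAX);
    t->slot[t->used] = alloc_func(key);
    return t->slot[t->used++];
}

/*
 * Case 2 from the text: intern a heap-allocated candidate, freeing it
 * when an equal entry was already present in the table.
 */
char *toy_intern(struct toy_table *t, char *candidate)
{
    char *found = toy_get(t, candidate, toy_alloc_intern);

    if (found != candidate)
        free(candidate);
    return found;
}
```

A call such as `(void)toy_get(&t, "a", toy_alloc_copy);` corresponds
to the simplified case: the result cannot be NULL, so there is nothing
to check.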
Signed-off-by: anlan_cs <vic.lan@pica8.com>
This commit fixes a rather obscure bug that was causing the GR
topotest to fail on a frequent basis.
RFC 3623 specifies that a router acting as a helper to a restarting
neighbor should monitor topology changes and abort the GR procedures
when one is detected, falling back to normal OSPF operation.
ospfd uses the ospf_lsa_different() function to detect when the
content of an LSA has changed, which is considered as a topology
change. The problem is that ospf_lsa_different() can return true
even when the two LSAs passed as parameters are identical, provided
one LSA has the OSPF_LSA_RECEIVED flag set and the other not.
In the context of the ospf_gr_topo1 test, router rt6 performs
a graceful restart and a few seconds later acts as a helper for
router rt7. When it starts acting as a helper for rt7, it has not yet
translated its NSSA Type-7 LSAs, something that happens only 7
seconds (OSPF_ABR_TASK_DELAY) after the first SPF run. The translated
Type-5 LSAs on its LSDB were learned from the helping neighbors
(rt3 and rt7). It's then possible that the NSSA Type-7 LSAs might
be translated while rt6 is acting as helper for rt7, which causes
the daemon to detect a non-existent topology change only because
the OSPF_LSA_RECEIVED flag is unset in the recently originated
Type-5 LSA.
Fix this problem by ignoring the OSPF_LSA_RECEIVED flag when
comparing LSAs for the purpose of topology change detection.
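As a rough illustration of the idea (not the actual ospfd code), the
comparison can mask out the "received" bit before comparing flags.
The struct layout, the `TOY_LSA_RECEIVED` value and the
`ignore_rcvd_flag` parameter below are assumptions made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag bit and struct; the real ospfd definitions differ. */
#define TOY_LSA_RECEIVED 0x02

struct toy_lsa {
    uint8_t flags;
    uint16_t length;
    const uint8_t *body;
};

/*
 * Return true when two LSAs differ. When ignore_rcvd_flag is set, the
 * "received" bit is masked out, so a self-originated copy and a
 * received copy with identical contents compare as equal.
 */
bool toy_lsa_different(const struct toy_lsa *a, const struct toy_lsa *b,
                       bool ignore_rcvd_flag)
{
    uint8_t mask = ignore_rcvd_flag ? (uint8_t)~TOY_LSA_RECEIVED : 0xff;

    assert(a != NULL && b != NULL);

    if ((a->flags & mask) != (b->flags & mask))
        return true;
    if (a->length != b->length)
        return true;
    return memcmp(a->body, b->body, a->length) != 0;
}
```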
In short, the bug would only show up when the restarting router
started acting as a helper immediately after coming back up
(which is unlikely to happen in the real world). The topotest
failures became more frequent after commit 6255aad0bc because of
the removal of the 'sleep' calls, which used to give ospfd more time
to converge before it started acting as a helper for other routers.
Even with the 'sleep' calls, the problem still occurred from time
to time.
Fixes #9983.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Add a 'json' parameter to the 'show_opaque_info' callback definition,
and update all instances of that callback to not display plain-text
data when the user requested JSON data.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Description:
As per RFC 3623 section 3.2, the OSPF neighbor shouldn't be
deleted even on an unsuccessful helper exit.
1. Keep the neighbor even after helper exit.
2. Restart the dead timer after it expires in helper mode. Otherwise, the
restarter will stay in FULL state on the helper forever, until a hello is received.
Signed-off-by: Rajesh Girada <rgirada@vmware.com>
RFC 3623 specifies the Graceful Restart enhancement to the OSPF
routing protocol. This PR implements support for the restarting mode,
whereas the helper mode was implemented by #6811.
This work is based on #6782, which implemented the pre-restart part
and laid the foundations for the post-restart part (behavioral
changes, GR exit conditions, and on-exit actions).
Here's a quick summary of how the GR restarting mode works:
* GR can be enabled on a per-instance basis using the `graceful-restart
[grace-period (1-1800)]` command;
* To perform a graceful shutdown, the `graceful-restart prepare ospf`
EXEC-level command needs to be issued before restarting the ospfd
daemon (there's no specific requirement on how the daemon should
be restarted);
* `graceful-restart prepare ospf` will initiate the graceful restart
for all GR-enabled instances by taking the following actions:
o Flooding Grace-LSAs over all interfaces
o Freezing the OSPF routes in the RIB
o Saving the end of the grace period in non-volatile memory (a JSON
file stored in `$frr_statedir`)
* Once ospfd is started again, it will follow the procedures
described in RFC 3623 until it detects it's time to exit the graceful
restart (either successfully or unsuccessfully).
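Storing the absolute end of the grace period (rather than its length)
lets the restarted daemon work out how much time is left. A minimal
sketch of that arithmetic, assuming the saved value is a plain
timestamp; the function names are hypothetical and the JSON handling
is left out:

```c
#include <assert.h>
#include <time.h>

/*
 * Compute the timestamp to save before restarting. The 1-1800 range
 * mirrors the grace-period range accepted by the command.
 */
time_t toy_gr_end(time_t now, unsigned int grace_period)
{
    assert(grace_period >= 1 && grace_period <= 1800);
    return now + (time_t)grace_period;
}

/*
 * After the restart, return how many seconds of grace period remain.
 * Returns 0 when the grace period has already expired.
 */
long toy_gr_remaining(time_t saved_end, time_t now)
{
    double left = difftime(saved_end, now);

    return left > 0 ? (long)left : 0;
}
```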
Testing done:
* New topotest featuring a multi-area OSPF topology (including stub
and NSSA areas);
* Successful interop tests against IOS-XR routers acting as helpers.
Co-authored-by: GalaxyGorilla <sascha@netdef.org>
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Both the GR helper code and the upcoming GR restarting code are going
to share a lot of definitions. As such, rename ospf_gr_helper.h to
ospf_gr.h, which will be the central point of all GR definitions
and prototypes.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Remove previous log config
debug ospf graceful-restart helper
and just use
debug ospf graceful-restart
for everything related to OSPF GR.
Signed-off-by: GalaxyGorilla <sascha@netdef.org>
Log the LSA advertising router in addition to the LSA type and
ID in the places where that information is necessary to uniquely
identify the LSA in the LSDB.
This is useful, for example, to know exactly which LSA has changed
when the router is exiting from the GR helper mode when a topology
change was detected.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
When exiting from the helper mode for a given router after an
unsuccessful graceful restart, removing the neighborship to that
router straight away leads to a dangling pointer in the associated
interface, which inevitably leads to a crash. To solve this
problem, schedule the removal of the neighbor instead of removing
it immediately.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Change the "show_ospf_grace_lsa_info" callback to account for the
fact that the "vty" parameter can be null.
This fixes a crash that happens when "debug ospf packet ls-update
detail" is configured and a Grace-LSA is sent or received.
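The shape of the fix is roughly the following: check the pointer and
fall back to a logging path when no vty is attached. This is a sketch
only; `toy_vty` and `toy_show_grace_period()` are hypothetical and the
real callback has a different signature.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for FRR's struct vty. */
struct toy_vty {
    FILE *out;
};

/*
 * Print the grace period to the vty when one is attached; otherwise
 * format it into the caller's log buffer (the debug-logging path).
 * Returns the value reported by fprintf()/snprintf().
 */
int toy_show_grace_period(struct toy_vty *vty, unsigned int seconds,
                          char *logbuf, size_t logbuf_size)
{
    assert(logbuf != NULL && logbuf_size > 0);

    if (vty != NULL)
        return fprintf(vty->out, "  Grace period: %u\n", seconds);

    return snprintf(logbuf, logbuf_size, "Grace period: %u", seconds);
}
```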
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
When exiting from the GR helper mode, recalculate the DR only for
interfaces of the appropriate types (broadcast and NBMA).
This fixes a problem where the state of a neighbor reachable over a
p2p interface was changing from Full/DROther to Full/Backup across
a graceful restart.
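The check amounts to a simple predicate on the interface type. The
enum below is a hypothetical stand-in for ospfd's own interface type
constants:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical interface types; ospfd uses its own constants. */
enum toy_if_type {
    TOY_IF_POINTOPOINT,
    TOY_IF_BROADCAST,
    TOY_IF_NBMA,
    TOY_IF_POINTOMULTIPOINT,
    TOY_IF_VIRTUALLINK,
    TOY_IF_TYPE_MAX,
};

/* Only broadcast and NBMA networks elect a DR/BDR (RFC 2328). */
bool toy_if_elects_dr(enum toy_if_type type)
{
    assert(type < TOY_IF_TYPE_MAX);
    return type == TOY_IF_BROADCAST || type == TOY_IF_NBMA;
}
```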
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Since a single ospfd process can have multiple OSPF instances
configured, we need to separate the global GR initialization and
termination from per-instance initialization and termination.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
When browsing or parsing OSPF LSA TLVs, we need to use the LSA length which is
part of the LSA header. This length, encoded in 16 bits, must first be
converted to host byte order with the ntohs() function. However, Coverity Scan
considers that ntohs() returns tainted data. Thus, when the length is
used to control a for() loop, Coverity Scan flags this part of the code as a
defect with "Untrusted Loop Bound" due to the use of a tainted variable. Similar
problems occur when browsing sub-TLVs where the length is extracted with ntohs().
To overcome this limitation, a size attribute has been added to the ospf_lsa
structure. The size is set when the lsa->data buffer is allocated. In addition,
when an OSPF packet is received, the size of the payload is checked before
its content is processed. For OSPF LSAs, this allows a secure buffer allocation.
Thus, the new size attribute contains the exact buffer allocation size, allowing
strict control during TLV browsing.
This patch adds extra checks to bound the for() loop during TLV browsing to
avoid potential problems as suggested by Coverity Scan. The checks are based
on the new size attribute of the ospf_lsa structure to avoid any ambiguity.
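A minimal sketch of such a bounded walk, assuming TLVs with a 16-bit
type, a 16-bit length and a value padded to 4 bytes. The function name
`toy_count_tlvs()` is hypothetical; the point is that every step is
checked against the known buffer size, not against a length read from
the packet.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TLV_HDR_SIZE 4u

/*
 * Count the TLVs in buf. Each TLV is a 16-bit type and a 16-bit
 * length (network byte order) followed by a value padded to 4 bytes.
 * Returns -1 if any TLV would run past buf_size.
 */
int toy_count_tlvs(const uint8_t *buf, size_t buf_size)
{
    size_t off = 0;
    int count = 0;

    assert(buf != NULL || buf_size == 0);

    while (off < buf_size) {
        uint16_t len_net;
        size_t tlv_size;

        /* The header itself must fit in what is left. */
        if (buf_size - off < TLV_HDR_SIZE)
            return -1;
        memcpy(&len_net, buf + off + 2, sizeof(len_net));

        /* Round the value length up to a multiple of 4. */
        tlv_size = TLV_HDR_SIZE
                   + (((size_t)ntohs(len_net) + 3u) & ~(size_t)3u);

        /* The whole TLV must fit in what is left. */
        if (tlv_size > buf_size - off)
            return -1;
        off += tlv_size;
        count++;
    }
    return count;
}
```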
Signed-off-by: Olivier Dugeon <olivier.dugeon@orange.com>
gcc 10 complains about some of our format specs; fix them. Use
an atomic size_t in thread stats to work around platform
differences.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Use to-string functions for GR message codes instead of raw
string array indexing; the values used can come in packets
and are not validated.
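For illustration, a to-string helper of this kind might look as
follows. The enum and function names are made up; the numeric values
follow the restart reasons listed in RFC 3623.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Restart reason codes as listed in RFC 3623; names here are made up. */
enum toy_gr_reason {
    TOY_GR_UNKNOWN = 0,
    TOY_GR_SW_RESTART = 1,
    TOY_GR_SW_UPGRADE = 2,
    TOY_GR_SWITCH_CONTROL = 3,
};

/* Map a code taken from a packet to a string; never reads out of bounds. */
const char *toy_gr_reason2str(uint8_t code)
{
    switch (code) {
    case TOY_GR_UNKNOWN:
        return "Unknown";
    case TOY_GR_SW_RESTART:
        return "Software restart";
    case TOY_GR_SW_UPGRADE:
        return "Software reload/upgrade";
    case TOY_GR_SWITCH_CONTROL:
        return "Switch to redundant control processor";
    default:
        return "Invalid";
    }
}

/* Usage: an unvalidated value maps to a fixed string. */
void toy_gr_reason_demo(void)
{
    assert(strcmp(toy_gr_reason2str(200), "Invalid") == 0);
}
```

With a raw string array, the same out-of-range value would index past
the end of the array.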
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Description:
The following show commands are added to display helper-specific
information.
1. show ip ospf graceful-restart helper [detail] [json]
--> displays the user configuration and details of all helpers.
2. show ip ospf neighbour detail
--> displays helper details
Signed-off-by: Rajesh Girada <rgirada@vmware.com>
Description:
The following helper exit scenarios are handled.
1. A MaxAge Grace-LSA is received from the restarter.
2. Grace timer expiry.
3. A topology change is detected, if LSA check is enabled.
Signed-off-by: Rajesh Girada <rgirada@vmware.com>