Commit 76249532fa ("ospf6d: Handle Premature Aging of LSAs") added a
duplicate call to OSPF6_INTRA_PREFIX_LSA_EXECUTE_TRANSIT(), when the
interface state changes to "Down".
Fixes: #1738
Signed-off-by: David Ward <david.ward@ll.mit.edu>
Change timestamp parameter from int to time_t to avoid truncation.
Found by Coverity Scan (CID 1563226 and 1563222)
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
This command makes unplanned GR more reliable by manipulating the
sending of Grace-LSAs and Hello packets for a certain amount of time,
increasing the chance that the neighboring routers are aware of
the ongoing graceful restart before resuming normal OSPF operation.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
In practical terms, unplanned GR refers to the act of recovering
from a software crash without affecting the forwarding plane.
Unplanned GR and Planned GR work virtually the same, except for the
following difference: on planned GR, the router sends the Grace-LSAs
*before* restarting, whereas in unplanned GR the router sends the
Grace-LSAs immediately *after* restarting.
For unplanned GR to work, ospf6d was modified to send a
ZEBRA_CLIENT_GR_CAPABILITIES message to zebra as soon as GR is
enabled. This causes zebra to freeze the OSPF routes in the RIB as
soon as the ospf6d daemon dies, for as long as the configured grace
period (the defaults is 120 seconds). Similarly, ospf6d now stores in
non-volatile memory that GR is enabled as soon as GR is configured.
Those two things are no longer done during the GR preparation phase,
which only happens for planned GRs.
Unplanned GR will only take effect when the daemon is killed
abruptly (e.g. SIGSEGV, SIGKILL), otherwise all OSPF routes will be
uninstalled while ospf6d is exiting. Once ospf6d starts, it will
check whether GR is enabled and enter in the GR mode if necessary,
sending Grace-LSAs out all operational interfaces.
One disadvantage of unplanned GR is that the neighboring routers
might time out their corresponding adjacencies if ospf6d takes too
long to come back up. This is especially the case when short dead
intervals are used (or BFD). For this and other reasons, planned
GR should be preferred whenever possible.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
When ospf6 is started up and SPF is run depending on which route is
selected as the parent route we could miss adding a NH. If one
possible parent route has two equal cost paths and the second possible
parent route has only one depending on which one is selected first
determines if we have have one or two NHs.
In the network below when creating a route 2001:db8:3:4::/64 in R2.
When SPF is run there are two possible parent routes R3 and R4.
2001:db8:1:2 +-----+ 2001:db8:2:5
+--------------+ 2 +---------------+
| ::2 | | ::2 |
| +-----+ |
| |
::1| |
+-----+ |::5
| 1 |2001:db8:1:3+-----+2001:db8:3:5+-----+2001:db8:5:6+-----+
| +------------+ 3 +------------+ 5 +------------+ 6 |
+-----+ ::1 ::3 | |::3 ::5 | |::5 ::6| |
::1| +-----+ +-----+ +-----+
| |::3
| | 2001:db8:3:4
| |
| |::4
| 2001:db8:1:4 +-----+
+--------------+ 4 |
::4 | |
+-----+
The problem was if we first created the route to 2001:db8:3:4::/64 with R3
as the parent route all is fine. The code was merging the NH from the parent
route and R3 has 2 NH, one pointing to R1 and one to R5. But if route
2001:db8:3:4::/64 was first created with parent as R4, it has only one NH
pointing to R1, and then later a new vertex was created pointing to R3 the
code would only copy the nhs from the vertex not from the parent route. The
vertex always has just one NH. But the parent route could have more. So
when we would bringup this setup one time we would see one NH for
2001:db8:3:4::/64 and the next time we would see two NHs. The code has been
modified so that it behaves the same when the route is first created, or when
a vertex is created, it selects the NHs from the parent route.
Signed-off-by: Lynne Morrison <lynne@voltanet.io>
Effectively a massive search and replace of
`struct thread` to `struct event`. Using the
term `thread` gives people the thought that
this event system is a pthread when it is not
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This is a first in a series of commits, whose goal is to rename
the thread system in FRR to an event system. There is a continual
problem where people are confusing `struct thread` with a true
pthread. In reality, our entire thread.c is an event system.
In this commit rename the thread.[ch] files to event.[ch].
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add a hash_clean_and_free() function as well as convert
the code to use it. This function also takes a double
pointer to the hash to set it NULL. Also it cleanly
does nothing if the pointer is NULL( as a bunch of
code tested for ).
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The GR code should check for topology changes only upon the receipt
of Router-LSAs and Network-LSAs. Other LSAs types don't affect the
topology as far as a restarting router is concerned.
This optimization reduces unnecessary computations when the
restarting router receives thousands of inter-area LSAs or external
LSAs while coming back up.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Upon exiting the GR mode, force reorigination of intra-area-prefix-LSAs
on all attached areas. This is to ensure all configured areas will have
an associated intra-area-prefix-LSA at the end of the GR procedures,
even if the area doesn't have any full adjacency.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
This commit fixes a bug where self-originated NSSA/AS-External LSAs
would age out about one hour after exiting from the GR mode. The
reason is because received self-originated LSAs aren't registered
for periodic refreshing, so that needs to be done manually. Fix
this by explicitly reoriginating all NSSA/AS-External LSAs while
exiting from the GR mode.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
An ABR that is originating inter-area-prefix-LSAs should take into
account the fact that there might be self-originated LSAs for the
same prefixes that were originated prior to a graceful restart. When
that happens, the previous LSA-IDs should be reused to avoid having
duplicate LSAs.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
RFC 5340 says that Link-LSAs should not be originated for virtual
links. Add a check to prevent that from happening while exiting
from the GR mode.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Consider the case where a DR router is performing a graceful restart,
and all neighbors attached to the DR network have their priorities
set to zero.
According to RFC 3623, the router should reclaim its DR status while
coming back up once it receives a Hello packet from a neighbor
listing the router as the DR, and the associated interface is in
Waiting state.
The problem arises when the DR election starts. Since the router
is already elected the DR, and no BDR will be elected (since all
neighbors have their priorities set to zero), the AdjOk event won't
be triggered at the end of the DR election as it would normally
happen. That causes all neighbors reachable over the broadcast
interface to get stuck in the 2-Way state.
Fix this corner case by always triggering the AdjOk event at the
end of the DR election process when a GR is in progress. Triggering
the AdjOk event when not necessary should never be a problem as
the neighbor FSM is already prepared to deal with that.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
RFC 5340, Section 4.8.3 says:
"Prefixes having the NU-bit set in their PrefixOptions field should
be ignored by the inter-area route calculation".
Fix a bug where, in addition to the NU-bit, ospf6d was also ignoring
prefixes having the LA-bit set when computing inter-area routes. In
practice, this fixes interoperability issues with vendors that set
the LA-bit in loopback prefixes (among other cases).
While here, fix a copy-and-paste error where a log message wasn't
showing accurate information about what happened.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Do not assume that all redistributed routes have a nexthop address,
otherwise blackhole nexthops can be misinterpreted as IPv6 addresses,
leading to inconsistencies.
Also, change the signature of a few functions to allow const nexthop
addresses, such that in6addr_any can be used without type casts.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Originate AS-External LSAs with forwarding addresses whenever the
corresponding redistributed routes have a global nexthop address.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Don't directly use `time()` for generating sequence numbers for two
reasons:
1. `time()` can go backwards (due to NTP or time adjustments)
2. Coverity Scan warns every time we truncate a `time_t` variable for
good reason (verify that we are Y2K38 ready).
Found by Coverity Scan (CID 1519812, 1519786, 1519783 and 1519772)
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Problem Statement:
=================
Memory leak backtraces
2022-11-23 01:51:10,525 - ERROR: ==842== 1,100 (1,000 direct, 100 indirect) bytes in 5 blocks are definitely lost in loss record 29 of 31
2022-11-23 01:51:10,525 - ERROR: ==842== at 0x4C31FAC: calloc (vg_replace_malloc.c:762)
2022-11-23 01:51:10,525 - ERROR: ==842== by 0x4E8A1BF: qcalloc (memory.c:111)
2022-11-23 01:51:10,525 - ERROR: ==842== by 0x13555A: ospf6_lsa_alloc (ospf6_lsa.c:723)
2022-11-23 01:51:10,525 - ERROR: ==842== by 0x1355F3: ospf6_lsa_create_headeronly (ospf6_lsa.c:756)
2022-11-23 01:51:10,525 - ERROR: ==842== by 0x135702: ospf6_lsa_copy (ospf6_lsa.c:790)
2022-11-23 01:51:10,525 - ERROR: ==842== by 0x13B64B: ospf6_dbdesc_recv_slave (ospf6_message.c:976)
2022-11-23 01:51:10,525 - ERROR: ==842== by 0x13B64B: ospf6_dbdesc_recv (ospf6_message.c:1038)
2022-11-23 01:51:10,525 - ERROR: ==842== by 0x13B64B: ospf6_read_helper (ospf6_message.c:1838)
2022-11-23 01:51:10,525 - ERROR: ==842== by 0x13B64B: ospf6_receive (ospf6_message.c:1875)
2022-11-23 01:51:10,525 - ERROR: ==842== by 0x4EB741B: thread_call (thread.c:1692)
2022-11-23 01:51:10,526 - ERROR: ==842== by 0x4E85B17: frr_run (libfrr.c:1068)
2022-11-23 01:51:10,526 - ERROR: ==842== by 0x119585: main (ospf6_main.c:228)
2022-11-23 01:51:10,526 - ERROR: ==842==
2022-11-23 01:51:10,524 - ERROR: Found memory leak in module ospf6d
2022-11-23 01:51:10,525 - ERROR: ==842== 220 (200 direct, 20 indirect) bytes in 1 blocks are definitely lost in loss record 21 of 31
2022-11-23 01:51:10,525 - ERROR: ==842== at 0x4C31FAC: calloc (vg_replace_malloc.c:762)
2022-11-23 01:51:10,525 - ERROR: ==842== by 0x4E8A1BF: qcalloc (memory.c:111)
2022-11-23 01:51:10,525 - ERROR: ==842== by 0x13555A: ospf6_lsa_alloc (ospf6_lsa.c:723)
2022-11-23 01:51:10,525 - ERROR: ==842== by 0x1355F3: ospf6_lsa_create_headeronly (ospf6_lsa.c:756)
2022-11-23 01:51:10,525 - ERROR: ==842== by 0x135702: ospf6_lsa_copy (ospf6_lsa.c:790)
2022-11-23 01:51:10,525 - ERROR: ==842== by 0x13BBCE: ospf6_dbdesc_recv_master (ospf6_message.c:760)
2022-11-23 01:51:10,525 - ERROR: ==842== by 0x13BBCE: ospf6_dbdesc_recv (ospf6_message.c:1036)
2022-11-23 01:51:10,525 - ERROR: ==842== by 0x13BBCE: ospf6_read_helper (ospf6_message.c:1838)
2022-11-23 01:51:10,525 - ERROR: ==842== by 0x13BBCE: ospf6_receive (ospf6_message.c:1875)
2022-11-23 01:51:10,525 - ERROR: ==842== by 0x4EB741B: thread_call (thread.c:1692)
2022-11-23 01:51:10,525 - ERROR: ==842== by 0x4E85B17: frr_run (libfrr.c:1068)
2022-11-23 01:51:10,525 - ERROR: ==842== by 0x119585: main (ospf6_main.c:228)
2022-11-23 01:51:10,525 - ERROR: ==842==
RCA:
====
These memory leaks are beacuse of last lsa in neighbour's request_list is not
getting freed beacuse of lsa lock. The last request has an addtional lock which
is added as a part of ospf6_make_lsreq, this lock needs to be removed
in order for the lsa to get freed.
Fix:
====
Check and remove the lock on the last request in all the functions.
Signed-off-by: Manoj Naragund <mnaragund@vmware.com>
When using auth keys in ospfv3, there are some memory
leaks when you change the key or remove the interface
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The ospf6_route_cmp_nexthops function was returning 0 for same
and 1 for not same. Let's reverse the polarity and actually make
the returns useful long term.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Commit 8f359e1593 removed a check that prevented the same route
from being added twice. In certain topologies, that change resulted in
the following infinite loop when adding an ASBR route:
ospf6_route_add
ospf6_top_brouter_hook_add
ospf6_abr_examin_brouter
ospf6_abr_examin_summary
ospf6_route_add
(repeat until stack overflow)
Revert the offending commit and update `ospf6_route_is_identical()` to
not do comparison using `memcmp()`.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Problem:
Delay in ospfv3 route installation when area gets converted to regular
from NSSA.
RCA:
when area gets converted from NSSA to normal the type-7(NSSA_LSAs)
gets flushed from the area, as a result the external routes
learnt from these type-7s gets removed. Once the area is moved
to nomral the type 5 lsas needs to flooded through the area
so that routes are re-learnt. however there is a delay in
flooding of these routes until these routes are refreshed.
Due to this there is delay installation of these routes.
Fix:
The Fix involves refreshing of the type 5 lsas once the area
is changed from nssa to regular area.
Signed-off-by: Manoj Naragund <mnaragund@vmware.com>
donatas-pc# sh ipv6 ospf6 interface enp3s0
enp3s0 is up, type BROADCAST
Interface ID: 2
Internet Address:
inet : 192.168.10.17/24
inet6: fe80::ca5d:fd0d:cd8:1bb7/64
Instance ID 0, Interface MTU 1500 (autodetect: 1500)
MTU mismatch detection: enabled
Area ID 0.0.0.0, Cost 1000
State Waiting, Transmit Delay 1 sec, Priority 1
Timer intervals configured:
Hello 10(8.149), Dead 40, Retransmit 5
DR: 0.0.0.0 BDR: 0.0.0.0
Number of I/F scoped LSAs is 1
0 Pending LSAs for LSUpdate in Time 00:00:00 [thread off]
0 Pending LSAs for LSAck in Time 00:00:00 [thread off]
Authentication Trailer is disabled
donatas-pc# con
donatas-pc(config)# int enp3s0
donatas-pc(config-if)# ipv6 ospf6 passive
donatas-pc(config-if)# do sh ipv6 ospf6 interface enp3s0
enp3s0 is up, type BROADCAST
Interface ID: 2
Internet Address:
inet : 192.168.10.17/24
inet6: fe80::ca5d:fd0d:cd8:1bb7/64
Instance ID 0, Interface MTU 1500 (autodetect: 1500)
MTU mismatch detection: enabled
Area ID 0.0.0.0, Cost 1000
State Waiting, Transmit Delay 1 sec, Priority 1
Timer intervals configured:
No Hellos (Passive interface)
DR: 0.0.0.0 BDR: 0.0.0.0
Number of I/F scoped LSAs is 1
0 Pending LSAs for LSUpdate in Time 00:00:00 [thread off]
0 Pending LSAs for LSAck in Time 00:00:00 [thread off]
Authentication Trailer is disabled
donatas-pc(config-if)#
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
ospfd and ospf6d define the same metric-type route-map commands. Make
them have the same help string too.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>