This command makes unplanned GR more reliable by manipulating the
sending of Grace-LSAs and Hello packets for a certain amount of time,
increasing the chance that the neighboring routers are aware of
the ongoing graceful restart before resuming normal OSPF operation.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Effectively a massive search and replace of
`struct thread` to `struct event`. Using the
term `thread` gives people the thought that
this event system is a pthread when it is not
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This is a first in a series of commits, whose goal is to rename
the thread system in FRR to an event system. There is a continual
problem where people are confusing `struct thread` with a true
pthread. In reality, our entire thread.c is an event system.
In this commit rename the thread.[ch] files to event.[ch].
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Don't directly use `time()` for generating sequence numbers for two
reasons:
1. `time()` can go backwards (due to NTP or time adjustments)
2. Coverity Scan warns every time we truncate a `time_t` variable for
good reason (verify that we are Y2K38 ready).
Found by Coverity Scan (CID 1519812, 1519786, 1519783 and 1519772)
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
When using auth keys in ospfv3, there are some memory
leaks when you change the key or remove the interface
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
OSPFv3 packets can be fragmented and up to 64k long, regardless of
interface MTU. Trying to size these buffers to MTU is just plain wrong.
To not make this a super intrusive change during the 8.3 release freeze,
just code this into ospf6_iobuf_size().
Since the buffer is now always 64k, don't waste time zeroing the entire
thing in receive; instead just zero kind of a "sled" of 128 bytes after
the buffer as a security precaution.
Fixes: #11298
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
a) Remove setting of thread pointer to NULL after
thread invocation, this is already done.
b) Use thread_is_scheduled()
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
If a end users does something like this:
int enp39s0
ipv6 ospf6 hello-interval 65535
And then the timer pops and we send the hello and immediately
if the end user does this:
ipv6 ospf6 hello-interval 5
The timer is not being reset and FRR waits the full 65k seconds
before sending the hello again, which then immediately sets
the next hello to go out in 5 seconds.
When FRR receives the new timer value, look at how much time
is left on the timer in seconds. If this value is greater
than the new hello timer, stop the timer and set it too that
value.
This should fix a CI system test failure found, where the
system is testing setting timer from things like 12 seconds
to 65k seconds then back down to 12 and that the ospf6 neighbor
relationship stays up.
The code was also changed from thread_add_event to thread_add_timer
in all cases. I am not sure what would happen if a show command
comes in for a thread timer remaining with an event instead of a timer
just make it consistent.
This was chased down because the support bundle showed this:
r0# show ipv6 ospf6 vrf all interface
r0-r1-eth0 is up, type BROADCAST
Interface ID: 6
Internet Address:
inet6: fe80::a4ea:d3ff:fe35:cef1/64
inet6: fd00::1/64
Instance ID 0, Interface MTU 1500 (autodetect: 1500)
MTU mismatch detection: enabled
Area ID 0.0.0.0, Cost 10
State DR, Transmit Delay 1 sec, Priority 1
Timer intervals configured:
Hello 12(65480.960), Dead 48, Retransmit 5
And looking at the test code is doing stuff like this:
2022/05/16 17:08:15 OSPF6: [M7Q4P-46WDR] vty[5]@(config)# interface r1-r0-eth0
2022/05/16 17:08:15 OSPF6: [M7Q4P-46WDR] vty[5]@(config-if)# ipv6 ospf6 hello-interval 65535
2022/05/16 17:08:15 OSPF6: [M7Q4P-46WDR] vty[5]@(config-if)# no ipv6 ospf6 hello-interval
2022/05/16 17:08:16 OSPF6: [M7Q4P-46WDR] vty[5]@(config-if)# ipv6 ospf6 hello-interval 1
2022/05/16 17:08:16 OSPF6: [M7Q4P-46WDR] vty[5]@(config-if)# ipv6 ospf6 hello-interval 12
If the old timer value pops, the hello interval is set to 65k and never reset again.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Problem Statement:
==================
RFC 7166 support for OSPF6 in FRR code.
RCA:
====
This feature is newly supported in FRR
Fix:
====
Core functionality implemented in previous commit is
stitched with rest of ospf6 code as part of this commit.
Risk:
=====
Low risk
Tests Executed:
===============
Have executed the combination of commands.
Signed-off-by: Abhinay Ramesh <rabhinay@vmware.com>
RFC 3101 states both E-bit and N-bit need to be checked when receiving a Hello packet.
"To support the NSSA option an additional check must be made in the function
that handles the receiving of the Hello packet to verify that both the N-bit
and the E-bit found in the Hello packet's option field match the area type and
ExternalRoutingCapability of the area of the receiving interface."
This PR adds the check for the N-bit
Signed-off-by: ckishimo <carles.kishimoto@gmail.com>
Since f60a1188 we store a pointer to the VRF in the interface structure.
There's no need anymore to store a separate vrf_id field.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
I am seeing a crash of ospf6d with this stack trace:
OSPF6: Received signal 11 at 1636042827 (si_addr 0x0, PC 0x55efc2d09ec2); aborting...
OSPF6: zlog_signal+0x18c 7fe20c8ca19a 7ffd08035590 /lib/libfrr.so.0 (mapped at 0x7fe20c819000)
OSPF6: core_handler+0xe3 7fe20c90805e 7ffd080356b0 /lib/libfrr.so.0 (mapped at 0x7fe20c819000)
OSPF6: funlockfile+0x50 7fe20c7f8140 7ffd08035800 /lib/x86_64-linux-gnu/libpthread.so.0 (mapped at 0x7fe20c7e4000)
OSPF6: ---- signal ----
OSPF6: ospf6_neighbor_state_change+0xdc 55efc2d09ec2 7ffd08035d90 /usr/lib/frr/ospf6d (mapped at 0x55efc2c8e000)
OSPF6: exchange_done+0x15c 55efc2d0ab4a 7ffd08035dc0 /usr/lib/frr/ospf6d (mapped at 0x55efc2c8e000)
OSPF6: thread_call+0xc2 7fe20c91ee32 7ffd08035df0 /lib/libfrr.so.0 (mapped at 0x7fe20c819000)
OSPF6: frr_run+0x217 7fe20c8bf7f3 7ffd08035eb0 /lib/libfrr.so.0 (mapped at 0x7fe20c819000)
OSPF6: main+0xf3 55efc2cd7573 7ffd08035fc0 /usr/lib/frr/ospf6d (mapped at 0x55efc2c8e000)
OSPF6: __libc_start_main+0xea 7fe20c645d0a 7ffd08036000 /lib/x86_64-linux-gnu/libc.so.6 (mapped at 0x7fe20c61f000)
OSPF6: _start+0x2a 55efc2cd706a 7ffd080360d0 /usr/lib/frr/ospf6d (mapped at 0x55efc2c8e000)
OSPF6: in thread exchange_done scheduled from ospf6d/ospf6_message.c:2264 ospf6_dbdesc_send_newone()
The stack trace when decoded is:
(gdb) l *(ospf6_neighbor_state_change+0xdc)
0x7bec2 is in ospf6_neighbor_state_change (ospf6d/ospf6_neighbor.c:200).
warning: Source file is more recent than executable.
195 on->name, ospf6_neighbor_state_str[prev_state],
196 ospf6_neighbor_state_str[next_state],
197 ospf6_neighbor_event_string(event));
198 }
199
200 /* Optionally notify about adjacency changes */
201 if (CHECK_FLAG(on->ospf6_if->area->ospf6->config_flags,
202 OSPF6_LOG_ADJACENCY_CHANGES)
203 && (CHECK_FLAG(on->ospf6_if->area->ospf6->config_flags,
204 OSPF6_LOG_ADJACENCY_DETAIL)
OSPFv3 is creating the event without a managing thread and as such
if the event is not run before a deletion event comes in memory
will be freed up and we'll start trying to access memory we should
not. Modify ospfv3 to track the thread and appropriately stop
it when the memory is deleted or it is no longer need to run
that bit of code.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
FRR should only ever use the appropriate THREAD_ON/THREAD_OFF
semantics. This is espacially true for the functions we
end up calling the thread for.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
RFC 5187 specifies the Graceful Restart enhancement to the OSPFv3
routing protocol. This commit implements support for the GR
restarting mode.
Here's a quick summary of how the GR restarting mode works:
* GR can be enabled on a per-instance basis using the `graceful-restart
[grace-period (1-1800)]` command;
* To perform a graceful shutdown, the `graceful-restart prepare ipv6
ospf` EXEC-level command needs to be issued before restarting the
ospf6d daemon (there's no specific requirement on how the daemon
should be restarted);
* `graceful-restart prepare ospf` will initiate the graceful restart
for all GR-enabled instances by taking the following actions:
o Flooding Grace-LSAs over all interfaces
o Freezing the OSPF routes in the RIB
o Saving the end of the grace period in non-volatile memory (a JSON
file stored in `$frr_statedir`)
* Once ospf6d is started again, it will follow the procedures
described in RFC 3623 until it detects it's time to exit the graceful
restart (either successfully or unsuccessfully).
Testing done:
* New topotest featuring a multi-area OSPF topology (including stub
and NSSA areas);
* Successful interop tests against IOS-XR routers acting as helpers.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
The message about ignoring a one-way hello should only be logged
when the router is acting a helper for another one.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Description:
1. changes to process GRACE LSA packet.
2. Validation changes to enter Helper role.
3. Helper functionality during graceful restart.
Signed-off-by: Rajesh Girada <rgirada@vmware.com>
Somehow the hello message debugging code slipped outside the debug
guard. Lets just remove it.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Some unprotected debugs need to have macro protection,
Split these into the existing covering macro section to remove
a check per-packet from the main path.
Signed-off-by: Pat Ruddy <pat@voltanet.io>
Coverity flagged the possibility of an overflow in the latency
calculation, ensure that 64 bit integers are used in the
calculation to avoid this error.
Signed-off-by: Pat Ruddy <pat@voltanet.io>
When OSPFv3 router is configured in both default and non-default VRFs,
every packet destined to a non-default VRF is read twice. This makes it
impossible to establish neighborship because every DbDesc packet is
treated as duplicated and we end up infinitely exchanging DbDescs.
We should drop packets received in the default VRF if an interface we
received it on is bound to another VRF.
Same thing was done for OSPFv2 in 555691e.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
allow amount of work done by read and write threads in a single
invocation to be tuned to between 1 and 100 packets (default 20)
Signed-off-by: Pat Ruddy <pat@voltanet.io>
On transmit and receive calculate the time since the last hello was seen
and log a warning if it is late by more than the hello period.
Signed-off-by: Pat Ruddy <pat@voltanet.io>
queue outgoing lsupdate messages to the interface tx FIFO and schedule
the ospf_write task to deal with them.
Signed-off-by: Pat Ruddy <pat@voltanet.io>
enqueue outgoing dbdesc messages to the end of the tx FIFO and
schedule the ospf6_write task to deal with them.
Signed-off-by: Pat Ruddy <pat@voltanet.io>
Add per interface fifo and per instance write list as a precursor
to implementing fairer sharing of the ospf6 oscket resources.
Signed-off-by: Pat Ruddy <pat@voltanet.io>
To ensure we read all the datagrams availabe from a socket when the
read task is scheduled, make the read helper return and error or
continue enum and loop unitl an error is received.
This requires the read from the socket to be non blocking
Signed-off-by: Pat Ruddy <pat@voltanet.io>
Take the contents of ospf6_receive and split the funtionality that
deals with a single packet receipt and place it in a separate helper
function.
This is the first step in a refactor process to allow the ospf6_read
task to read until failure.
Signed-off-by: Pat Ruddy <pat@voltanet.io>
The logging in ospf6 is very verbose. If you turn on logging on a scaled
system you get too many logs. The problem is that there are some errors
that occur that are hidden behind the debug flags, and to see these errors
we currently need to turn on the debug logging. This change converts these
error logs to warnings and removes the debug flags.
Signed-off-by: Lynne Morrison <lynne@voltanet.io>