This patch includes:
* Implementation of RFC 5709 support in OSPF. Using
openssl library and FRR key-chain,
one can use SHA1, SHA256, SHA384, SHA512 and
keyed-MD5( backward compatibility with RFC 2328) HMAC algs.
* Updating documentation of OSPF
* add topotests for new HMAC algorithms
Signed-off-by: Mahdi Varasteh <varasteh@amnesh.ir>
Add support for "[no] ip ospf capbility opaque" at the interface
level with the default being capability opaque enabled. The command
"no ip ospf capability opaque" will disable opaque LSA database
exchange and flooding on the interface. A change in configuration
will result in the interface being flapped to update our options
for neighbors but no attempt will be made to purge existing LSAs
as in dense topologies, these may received by neighbors through
different interfaces.
Topotests are added to test both the configuration and the LSA
opaque flooding suppression.
Signed-off-by: Acee <aceelindem@gmail.com>
1. Fix OSPF opaque LSA processing to preserve the stale opaque
LSAs in the Link State Database for 60 seconds consistent with
what is done for other LSA types.
2. Add a topotest that tests for cases where ospfd is restarted
and a stale OSPF opaque LSA exists in the OSPF routing domain
both when the LSA is purged and when the LSA is reoriginagted
with a more recent instance.
Signed-off-by: Acee <aceelindem@gmail.com>
In practical terms, unplanned GR refers to the act of recovering
from a software crash without affecting the forwarding plane.
Unplanned GR and Planned GR work virtually the same, except for the
following difference: on planned GR, the router sends the Grace-LSAs
*before* restarting, whereas in unplanned GR the router sends the
Grace-LSAs immediately *after* restarting.
For unplanned GR to work, ospf6d was modified to send a
ZEBRA_CLIENT_GR_CAPABILITIES message to zebra as soon as GR is
enabled. This causes zebra to freeze the OSPF routes in the RIB as
soon as the ospfd daemon dies, for as long as the configured grace
period (the defaults is 120 seconds). Similarly, ospfd now stores in
non-volatile memory that GR is enabled as soon as GR is configured.
Those two things are no longer done during the GR preparation phase,
which only happens for planned GRs.
Unplanned GR will only take effect when the daemon is killed
abruptly (e.g. SIGSEGV, SIGKILL), otherwise all OSPF routes will
be uninstalled while ospfd is exiting. Once ospfd starts, it will
check whether GR is enabled and enter in the GR mode if necessary,
sending Grace-LSAs out all operational interfaces.
One disadvantage of unplanned GR is that the neighboring routers
might time out their corresponding adjacencies if ospfd takes too
long to come back up. This is especially the case when short dead
intervals are used (or BFD). For this and other reasons, planned
GR should be preferred whenever possible.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Add support for a write socket per interface, enabled by
default at the ospf instance level. An ospf instance-level
config allows this to be disabled, reverting to the older
behavior where a single per-instance socket is used for
sending and receiving packets.
Signed-off-by: Mark Stapp <mjs@labn.net>
Effectively a massive search and replace of
`struct thread` to `struct event`. Using the
term `thread` gives people the thought that
this event system is a pthread when it is not
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This is a first in a series of commits, whose goal is to rename
the thread system in FRR to an event system. There is a continual
problem where people are confusing `struct thread` with a true
pthread. In reality, our entire thread.c is an event system.
In this commit rename the thread.[ch] files to event.[ch].
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The GR code should check for topology changes only upon the receipt
of Router-LSAs and Network-LSAs. Other LSAs types don't affect the
topology as far as a restarting router is concerned.
This optimization reduces unnecessary computations when the
restarting router receives thousands of inter-area LSAs or external
LSAs while coming back up.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Description:
The changes involve setting DC bit on ospf hellos and
addition of new DO_NOT_AGE flag.
Signed-off-by: Manoj Naragund <mnaragund@vmware.com>
FRR implements a non-standard, but compatible approach for
sending update LSAs (it always send to 224.0.0.5) on P2MP
interfaces. This change makes it so acks are also sent to
224.0.0.5.
Since the acks are multicast, this allows an optimization
where we don't send back out the incoming P2MP interface
immediately allow time to rx multicast ack from neighbors
on the same net that rx'ed the original (multicast) update.
Signed-off-by: Lou Berger <lberger@labn.net>
ospf_write pulls an interface off the ospf->oi_write_q
then writes one packet and places it back on the queue,
keeping track of the first one sent. Then it will
stop sending packets even if we get back to the first
interface written too but before we have sent the full
pkt_count. I do not believe this makes a whole bunch
of sense and is very restrictive of how much data can
be sent especially if you have a limited number of peers
but large amounts of data. Why be so restrictive?
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Let's just use THREAD_OFF consistently in the code base
instead of each daemon having a special macro that needs to
be looked at and remembered what it does.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Since f60a1188 we store a pointer to the VRF in the interface structure.
There's no need anymore to store a separate vrf_id field.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Commit 3551ee9e90304 introduced a regression that causes GR to fail
under certain circumstances. In short, while ISM events should
be ignored while acting as a helper for a restarting router, the
DR/BDR fields of the neighbor structure should still be updated
while processing a Hello packet. If that isn't done, it can cause
the helper to elect the wrong DR while exiting from the helper mode,
leading to a situation where there are two DRs for the same network
segment (and a failed GR by consequence). Fix this.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
RFC 3623 says:
"If the restarting router determines that it was the Designated
Router on a given segment prior to the restart, it elects
itself as the Designated Router again. The restarting router
knows that it was the Designated Router if, while the
associated interface is in Waiting state, a Hello packet is
received from a neighbor listing the router as the Designated
Router".
Implement that logic when processing Hello messages to ensure DR
interfaces will preserve their DR status across a graceful restart.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Issue:
===================
OSPF neighbors are not going down even after 10 mins when
having a mismatch in hello and dead interval.
First neighbors are formed and then a mismatch in the interval
is created, it is observed that the neighbor is not going down.
Root Cause Analysis:
====================
The event HelloReceived defined in RFC 2328 was named as PacketReceived
and this event was scheduled whenever LS Update, LS Ack, LS Request,
DD description packet or Hello packet is received.
Although there is a mismatch in the Hello packet contents, the
event PacketReceived gets triggered due to LS Update received and the
dead timer gets reset and hence the neighbor was never going Down and
remains FULL.
Fix:
==================
As per RFC 2328, the HelloReceived needs to be triggered only when
valid OSPF Hello packet is received and not when other OSPF packets
are received. Modified the function name as well.
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
min lsa packet sizes are not always directly corresponding
to the actual LSA. Add a bit of comments so it's easier
for future people to figure out.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
In some cases FRR is receiving a lsa data packet
with a length set to the length of the header only.
If we are expecting data from a peer in the form
of lsa data. Let's enforce it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Problem Statement :
===================
LSA with InitialSequenceNumber is not originated
after MaxSequenceNumber.
ANVL Test case 25.33 states:
============================
As soon as this flooding of a LSA with LS sequence number
MaxSequenceNumber has been acknowledged by all adjacent neighbors,
a new instance can be originated with sequence number of InitialSequenceNumber.
RCA :
=====
DUT did not originated LSA with INITIAL_SEQUENCE number even
after receiving ACK for max sequence LSA.
Code is not present to handle this situation in the lsa ack flow.
Fix :
=====
Add code to originate LSA with initial sequence number in the
LSA ack flow in case of wrap around sequence number.
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
ANVL Test case 28.11
If the database copy has LS age equal to MaxAge and LS sequence number
equal to MaxSequenceNumber, simply discard the received LSA
without acknowledging it.
ANVL Test Case 25.22
When an attempt is made to increment the sequence number past the maximum
value of N - 1 (0x7fffffff; also referred to as MaxSequenceNumber),
the current instance of the LSA must first be flushed from the routing domain.
ANVL Test Case 25.23
As soon as this flooding of a LSA with LS sequence number MaxSequenceNumber
has been acknowledged by all adjacent neighbors, a new instance can be
originated with sequence number of InitialSequenceNumber.
RCA:
When IXIA sent LS Seq num as MAX and LS Age as (MAX - 3),
DUT dropped the packet instead of sending ACK.
In function ospf_ls_upd, at Line 2106 the code is there to drop the LSA.
Hence its failing.
Fix:
LSAs ACK must be sent when received LSA is having max sequence number
but not max-aged.
Considering /* CVE-2017-3224 */ issue, have corrected the existing
code to prevent attacker from sending LSAs with max sequence number
and higher checksum and blocking the flooding of the Max-sequence numbered LSAs.
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
Problem Statement:
===================
DUT selecting itself as DR when RR goes for reload.
Test Case 7.2
DUT (GR Helper) receives the Hello packet from the OSPF GR RESTARTER
(ANVL here) with DR and BDR set to 0.0.0.0 and DUT in its hello
neighbor list. DUT triggers the DR and BDR election although it is
in the Helper mode for that neighbor.
Root Cause Analysis:
====================
When hello packet is received with self router ID in the neighbor list,
there is no check in the code to handle this scenario. Hence the DR/BDR
election happens and it changes the DR although it is helper.
Fix:
===================
As per RFC 3623 Section 3. Operation of Helper Neighbor, below point,
we need to maintain the DR relationship.
Also, if X was the Designated Router on network segment S when the
helping relationship began, Y maintains X as the Designated Router
until the helping relationship is terminated.
Adding the check when DUT is under neighbor helper mode, we need to avoid
ISM state change when hello packet is received with DR/BDR set to 0.0.0.0.
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
RFC 3623 specifies the Graceful Restart enhancement to the OSPF
routing protocol. This PR implements support for the restarting mode,
whereas the helper mode was implemented by #6811.
This work is based on #6782, which implemented the pre-restart part
and settled the foundations for the post-restart part (behavioral
changes, GR exit conditions, and on-exit actions).
Here's a quick summary of how the GR restarting mode works:
* GR can be enabled on a per-instance basis using the `graceful-restart
[grace-period (1-1800)]` command;
* To perform a graceful shutdown, the `graceful-restart prepare ospf`
EXEC-level command needs to be issued before restarting the ospfd
daemon (there's no specific requirement on how the daemon should
be restarted);
* `graceful-restart prepare ospf` will initiate the graceful restart
for all GR-enabled instances by taking the following actions:
o Flooding Grace-LSAs over all interfaces
o Freezing the OSPF routes in the RIB
o Saving the end of the grace period in non-volatile memory (a JSON
file stored in `$frr_statedir`)
* Once ospfd is started again, it will follow the procedures
described in RFC 3623 until it detects it's time to exit the graceful
restart (either successfully or unsuccessfully).
Testing done:
* New topotest featuring a multi-area OSPF topology (including stub
and NSSA areas);
* Successful interop tests against IOS-XR routers acting as helpers.
Co-authored-by: GalaxyGorilla <sascha@netdef.org>
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Both the GR helper code and the upcoming GR restarting code are going
to share a lot of definitions. As such, rename ospf_gr_helper.h to
ospf_gr.h, which will be the central point of all GR definitions
and prototypes.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Remove previous log config
debug ospf graceful-restart helper
and just use
debug ospf graceful-restart
for everything related to OSPF GR.
Signed-off-by: GalaxyGorilla <sascha@netdef.org>