Steps to reproduce:
--------------------------
1. ANVL: Establish full adjacency with DUT for neighbor Rtr-0-A on DIface-0 with DUT as DR.
2. ANVL: Listen (for up to 2 * <RxmtInterval> seconds) on DIface-0.
3. DUT: Send <OSPF-LSU> packet.
4. ANVL: Verify that the received <OSPF-LSU> packet contains a Network- LSA for network N1
originated by DUT, and the LS Sequence Number is set to <InitialSequenceNumber>.
5. ANVL: Establish full adjacency with DUT for neighbor Rtr-0-B on DIface-0 with DUT as DR.
6. ANVL: Listen (for up to 2 * <RxmtInterval> seconds) on DIface-0.
7. DUT: Send <OSPF-LSU> packet.
8. ANVL: Verify that the received <OSPF-LSU> packet contains a new instance of the
Network-LSA for network N1 originated by DUT, and the LS Sequence Number
is set to (<InitialSequenceNumber> + 1).
Both the test cases were failing while verifying the initial sequence number for network LSA.
This is because currently OSPF does not reset its LSA sequence number when it is going down.
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
When using debug mode, the ei parameter may be NULL. In that
case, do not display the log trace, otherwise a crash will
happen.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Description:
Added hidden clis that will allow you to reset the default timers
for LSA refresh and LSA maxage remove delay, these will help in testing
LSA refresh scenarios in upcoming OSPFv2 Flood reduction feature(rfc4136).
IETF Link : https://datatracker.ietf.org/doc/html/rfc4136
Signed-off-by: Manoj Naragund <mnaragund@vmware.com>
Description:
ospf process is crashing when the current router acts
as GR helper and it received a new lsa.
Here, ospf_lsa_different() is being called without checking
'old' pointer. It is asserted in ospf_lsa_different() api
if the 'old' pointer is NULL.
corrected this by validaing old pointer before calling
ospf_lsa_different() api.
back tarce:
Program terminated with signal SIGABRT, Aborted.
0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
[Current thread is 1 (Thread 0x6b84348827c0 (LWP 3155))]
0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
1 0x00006b8433aa4801 in __GI_abort () at abort.c:79
2 0x00006b8433a9439a in __assert_fail_base (fmt=0x6b8433c1b7d8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x162ffc0630bc "l1", file=file@entry=0x162ffc062ff7 "ospfd/ospf_lsa.c", line=line@entry=3520, function=function@entry=0x162ffc0646f0 <__PRETTY_FUNCTION__.18732> "ospf_lsa_different") at assert.c:92
3 0x00006b8433a94412 in __GI___assert_fail (assertion=assertion@entry=0x162ffc0630bc "l1", file=file@entry=0x162ffc062ff7 "ospfd/ospf_lsa.c", line=line@entry=3520, function=function@entry=0x162ffc0646f0 <__PRETTY_FUNCTION__.18732> "ospf_lsa_different") at assert.c:101
4 0x0000162ffc008c25 in ospf_lsa_different (l1=l1@entry=0x0, l2=l2@entry=0x162ffe535c60, ignore_rcvd_flag=ignore_rcvd_flag@entry=true) at ospfd/ospf_lsa.c:3520
5 0x0000162ffc00a8e8 in ospf_lsa_install (ospf=ospf@entry=0x162ffe513650, oi=oi@entry=0x162ffe531c30, lsa=lsa@entry=0x162ffe535c60) at ospfd/ospf_lsa.c:2892
6 0x0000162ffc059d16 in ospf_flood (ospf=0x162ffe513650, nbr=nbr@entry=0x162ffe52cc90, current=current@entry=0x0, new=new@entry=0x162ffe535c60) at ospfd/ospf_flood.c:429
7 0x0000162ffc01838f in ospf_ls_upd (size=<optimized out>, oi=0x162ffe531c30, s=<optimized out>, ospfh=<optimized out>, iph=<optimized out>, ospf=<optimized out>) at ospfd/ospf_packet.c:2162
8 ospf_read_helper (ospf=<optimized out>) at ospfd/ospf_packet.c:3241
9 ospf_read (thread=<optimized out>) at ospfd/ospf_packet.c:3272
10 0x00006b843450139c in thread_call (thread=thread@entry=0x7780f42c7480) at lib/thread.c:1692
11 0x00006b84344cfb18 in frr_run (master=0x162ffe34d130) at lib/libfrr.c:1068
Signed-off-by: Rajesh Girada <rgirada@vmware.com>
In ospf_handle_exnl_lsa_lsId_chg there is a code path
where that we may be using uninitialized data for decisions.
Doubtful that this happens but let's make it less likely to
even more.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Let's just use THREAD_OFF consistently in the code base
instead of each daemon having a special macro that needs to
be looked at and remembered what it does.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Description:
Default route is not getting flushed from neighbours though originator
triggered flush and deleted LSA from its database. It become as stale
LSA in neighbours databse forever. This could seen in the following
sequence of configurations with less than a second interval b/w configs.
And this could happen only when originator shouldnt have default route
in its rib so it originates default route only when configure with 'always'
option.
step-1:default-information originate always
step-2:no default-information originate always
step-3:default-information originate
In step-1, default route will be originated to AS.
In step-2, default route will be flushed to AS, but neighbours will be
discarding this update due to minlsainterval condition.
And it is expected that DUT need to keep send this update
until it receives the ack from neighbours by adding each
neighbour's retransmission list.
In Step-3: It is deleting the lsas from nbr's retransmission list
by assuming it initiated the flush. This is cuasing to not
send the lsa update anymore to neighbours which makes
stale lsa in nbrs forever.
Fix:
Allowed to delete the lsa from retransmission list only when lsa is
not in maxage during flushing procedure.
Signed-off-by: Rajesh Girada <rgirada@vmware.com>
Description:
NULL pointer wrongly passed instead of 'ei' pointer to
ospf_external_lsa_originate() API in opaque capability enable/disable
which always make it to fail in origination.
Corrected it by passing actual ei pointer.
Signed-off-by: Rajesh Girada <rgirada@vmware.com>
Description:
This LSID alogithm added as per rcf2328 Appendex-E recommendation.
This applies only for AS-external lsas and summary lsas.
As an example of the algorithm, consider its operation when the
following sequence of events occurs in a single router (Router A).
(1) Router A wants to originate an AS-external-LSA for
[10.0.0.0,255.255.255.0]:
(a) A Link State ID of 10.0.0.0 is used.
(2) Router A then wants to originate an AS-external-LSA for
[10.0.0.0,255.255.0.0]:
(a) The LSA for [10.0.0,0,255.255.255.0] is reoriginated using a
new Link State ID of 10.0.0.255.
(b) A Link State ID of 10.0.0.0 is used for
[10.0.0.0,255.255.0.0].
(3) Router A then wants to originate an AS-external-LSA for
[10.0.0.0,255.0.0.0]:
(a) The LSA for [10.0.0.0,255.255.0.0] is reoriginated using a
new Link State ID of 10.0.255.255.
(b) A Link State ID of 10.0.0.0 is used for
[10.0.0.0,255.0.0.0].
(c) The network [10.0.0.0,255.255.255.0] keeps its Link State ID
of 10.0.0.255.
Signed-off-by: Rajesh Girada <rgirada@vmware.com>
The ospf instance code is not properly handling the default route
when using default-information originate. This is because
the code is looking for the default route to be saved with an
instance of <ospf instance id> but we always save it as a instance
id of 0. In fact OSPF asks zebra for the default route as a special
case in instance 0, always.
Here is the correct behavior:
eva# show ip ospf data
OSPF Instance: 3
OSPF Router with ID (192.168.122.1)
AS External Link States
Link ID ADV Router Age Seq# CkSum Route
0.0.0.0 192.168.122.1 8 0x80000001 0xdb08 E2 0.0.0.0/0 [0x0]
eva# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
K>* 0.0.0.0/0 [0/100] via 192.168.119.1, enp39s0, 00:02:03
Fixes: #10251
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This commit fixes a rather obscure bug that was causing the GR
topotest to fail on a frequent basis.
RFC 3623 specifies that a router acting as a helper to a restarting
neighbor should monitor topology changes and abort the GR procedures
when one is detected, falling back to normal OSPF operation.
ospfd uses the ospf_lsa_different() function to detect when the
content of an LSA has changed, which is considered as a topology
change. The problem is that ospf_lsa_different() can return true
even when the two LSAs passed as parameters are identical, provided
one LSA has the OSPF_LSA_RECEIVED flag set and the other not.
In the context of the ospf_gr_topo1 test, router rt6 performs
a graceful restart and a few seconds later acts as a helper for
router rt7. When it's acting as a helper for rt7, it still didn't
translate its NSSA Type-7 LSAs, something that happens only after 7
seconds (OSPF_ABR_TASK_DELAY) of the first SPF run. The translated
Type-5 LSAs on its LSDB were learned from the helping neighbors
(rt3 and rt7). It's then possible that the NSSA Type-7 LSAs might
be translated while rt6 is acting as helper for rt7, which causes
the daemon to detect a non-existent topology change only because
the OSPF_LSA_RECEIVED flag is unset in the recently originated
Type-5 LSA.
Fix this problem by ignoring the OSPF_LSA_RECEIVED flag when
comparing LSAs for the purpose of topology change detection.
In short, the bug would only show up when the restarting router
would start acting as a helper immediately after coming back up
(which would be hard to happen in the real world). The topotest
failures became more frequent after commit 6255aad0bc because of
the removal of the 'sleep' calls, which used to give ospfd more time
to converge before start acting as a helper for other routers. The
problem still occurred from time to time though.
Fixes#9983.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Do not return pointer to the newly created thread from various thread_add
functions. This should prevent developers from storing a thread pointer
into some variable without letting the lib know that the pointer is
stored. When the lib doesn't know that the pointer is stored, it doesn't
prevent rescheduling and it can lead to hard to find bugs. If someone
wants to store the pointer, they should pass a double pointer as the last
argument.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Problem Statement :
===================
LSA with InitialSequenceNumber is not originated
after MaxSequenceNumber.
ANVL Test case 25.33 states:
============================
As soon as this flooding of a LSA with LS sequence number
MaxSequenceNumber has been acknowledged by all adjacent neighbors,
a new instance can be originated with sequence number of InitialSequenceNumber.
RCA :
=====
DUT did not originated LSA with INITIAL_SEQUENCE number even
after receiving ACK for max sequence LSA.
Code is not present to handle this situation in the lsa ack flow.
Fix :
=====
Add code to originate LSA with initial sequence number in the
LSA ack flow in case of wrap around sequence number.
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
Fix CI Failure test_ospf_type5_summary_tc45_p0
Problem Statement:
==================
Summarised LSA is not flushed in OSPFv2 in below scenario:
1. Configure summary-address in ospfv2
2. redistribute static and connected.
3. Check the LSAs are received on neighbor.
4. Now remove all OSPFv2 configs, so neighbor will still have the summarised LSA.
5. Configure router ospf with redistribute static and connected.
6. Check the DB, summarised LSA is present although the configuration is not present.
7. Now configure the summary-address and remove the configuration after sometime.
8. The summarised LSA will be still present.
RCA:
==================
When self originated LSA is received from the neighbor and that
LSA is summarised one, the LSA is refreshed but a flag is not set
due to which it was not able to remove it later.
Fix:
==================
Set the originated flag when refreshing summarised LSA.
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
Move `is_default_prefix` variations to `lib/prefix.h` and make the code
use the library version instead of implementing it again.
NOTE
----
The function was split into per family versions to cover all types.
Using `union prefixconstptr` is not possible due to static analyzer
warnings which cause CI to fail.
The specific cases that would cause this failure were:
- Caller used `struct prefix_ipv4` and called the generic function.
- `is_default_prefix` with signature using `const struct prefix *` or
`union prefixconstptr`.
The compiler would complain about reading bytes outside of the memory
bounds even though it did not take into account the `prefix->family`
part.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
RFC 3623 specifies the Graceful Restart enhancement to the OSPF
routing protocol. This PR implements support for the restarting mode,
whereas the helper mode was implemented by #6811.
This work is based on #6782, which implemented the pre-restart part
and settled the foundations for the post-restart part (behavioral
changes, GR exit conditions, and on-exit actions).
Here's a quick summary of how the GR restarting mode works:
* GR can be enabled on a per-instance basis using the `graceful-restart
[grace-period (1-1800)]` command;
* To perform a graceful shutdown, the `graceful-restart prepare ospf`
EXEC-level command needs to be issued before restarting the ospfd
daemon (there's no specific requirement on how the daemon should
be restarted);
* `graceful-restart prepare ospf` will initiate the graceful restart
for all GR-enabled instances by taking the following actions:
o Flooding Grace-LSAs over all interfaces
o Freezing the OSPF routes in the RIB
o Saving the end of the grace period in non-volatile memory (a JSON
file stored in `$frr_statedir`)
* Once ospfd is started again, it will follow the procedures
described in RFC 3623 until it detects it's time to exit the graceful
restart (either successfully or unsuccessfully).
Testing done:
* New topotest featuring a multi-area OSPF topology (including stub
and NSSA areas);
* Successful interop tests against IOS-XR routers acting as helpers.
Co-authored-by: GalaxyGorilla <sascha@netdef.org>
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Log the LSA advertising router in addition to the LSA type and
ID in the places where that information is necessary to uniquely
identify the LSA in the LSDB.
This is useful, for example, to know exactly which LSA has changed
when the router is exiting from the GR helper mode when a topology
change was detected.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Description:
When opaque capability disabled/enabled , all the self-originated lsa will be
flushed and it will make the neighbours to renegotiate.
But here, external lsas are not being re-originated after negotiation
Fix:
Refresh/re-originate external lsas(Type-5 and Type-7) explicitly after
re-negotiation.
Signed-off-by: Rajesh Girada <rgirada@vmware.com>
The %p printf format specifier does already print the pointer address
with a leading "0x" prefix (indicating a hexadecimal number). There's
no need to add that prefix manually.
While here, replace explicit function names in log messages by
__func__.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
When browsing or parsing OSPF LSA TLVs, we need to use the LSA length which is
part of the LSA header. This length, encoded in 16 bits, must be first
converted to host byte order with ntohs() function. However, Coverity Scan
considers that ntohs() function return TAINTED data. Thus, when the length is
used to control for() loop, Coverity Scan marks this part of the code as defect
with "Untrusted Loop Bound" due to the usage of Tainted variable. Similar
problems occur when browsing sub-TLV where length is extracted with ntohs().
To overcome this limitation, a size attribute has been added to the ospf_lsa
structure. The size is set when lsa->data buffer is allocated. In addition,
when an OSPF packet is received, the size of the payload is controlled before
contains is processed. For OSPF LSA, this allow a secure buffer allocation.
Thus, new size attribute contains the exact buffer allocation allowing a
strict control during TLV browsing.
This patch adds extra control to bound for() loop during TLV browsing to
avoid potential problem as suggested by Coverity Scan. Controls are based
on new size attribute of the ospf_lsa structure to avoid any ambiguity.
Signed-off-by: Olivier Dugeon <olivier.dugeon@orange.com>
This command will trigger the OSPF forwarding address suppression in
translated type-5 LSAs, causing a NSSA ABR to use 0.0.0.0 as a forwarding
address instead of copying the address from the type-7 LSA
Example: In a topology like: R1 --- R2(ABR) --- R3(ASBR)
R3 is announcing a type-7 LSA that is translated to type-5 by the R2 ABR.
The forwarding address in the type-5 is by default copied from the type-7
r1# sh ip os da external
AS External Link States
LS age: 6
Options: 0x2 : *|-|-|-|-|-|E|-
LS Flags: 0x6
LS Type: AS-external-LSA
Link State ID: 3.3.3.3 (External Network Number)
Advertising Router: 10.0.25.2
LS Seq Number: 80000001
Checksum: 0xcf99
Length: 36
Network Mask: /32
Metric Type: 2 (Larger than any link state path)
TOS: 0
Metric: 20
Forward Address: 10.0.23.3 <--- address copied from type-7 lsa
External Route Tag: 0
r2# sh ip os database
NSSA-external Link States (Area 0.0.0.1 [NSSA])
Link ID ADV Router Age Seq# CkSum Route
3.3.3.3 10.0.23.3 8 0x80000001 0x431d E2 3.3.3.3/32 [0x0]
AS External Link States
Link ID ADV Router Age Seq# CkSum Route
3.3.3.3 10.0.25.2 0 0x80000001 0xcf99 E2 3.3.3.3/32 [0x0]
r2# conf t
r2(config)# router ospf
r2(config-router)# area 1 nssa suppress-fa
r2(config-router)# exit
r2(config)# exit
r2# sh ip os database
NSSA-external Link States (Area 0.0.0.1 [NSSA])
Link ID ADV Router Age Seq# CkSum Route
3.3.3.3 10.0.23.3 66 0x80000001 0x431d E2 3.3.3.3/32 [0x0]
AS External Link States
Link ID ADV Router Age Seq# CkSum Route
3.3.3.3 10.0.25.2 16 0x80000002 0x0983 E2 3.3.3.3/32 [0x0]
r1# sh ip os da external
OSPF Router with ID (11.11.11.11)
AS External Link States
LS age: 34
Options: 0x2 : *|-|-|-|-|-|E|-
LS Flags: 0x6
LS Type: AS-external-LSA
Link State ID: 3.3.3.3 (External Network Number)
Advertising Router: 10.0.25.2
LS Seq Number: 80000002
Checksum: 0x0983
Length: 36
Network Mask: /32
Metric Type: 2 (Larger than any link state path)
TOS: 0
Metric: 20
Forward Address: 0.0.0.0 <--- address set to 0
External Route Tag: 0
r2# conf t
r2(config)# router ospf
r2(config-router)# no area 1 nssa suppress-fa
r2(config-router)# exit
r1# sh ip os da external
OSPF Router with ID (11.11.11.11)
AS External Link States
LS age: 1
Options: 0x2 : *|-|-|-|-|-|E|-
LS Flags: 0x6
LS Type: AS-external-LSA
Link State ID: 3.3.3.3 (External Network Number)
Advertising Router: 10.0.25.2
LS Seq Number: 80000003
Checksum: 0xcb9b
Length: 36
Network Mask: /32
Metric Type: 2 (Larger than any link state path)
TOS: 0
Metric: 20
Forward Address: 0.0.0.0 <--- address set to 0
External Route Tag: 0
r2# conf t
r2(config)# router ospf
r2(config-router)# no area 1 nssa suppress-fa
r2(config-router)# exit
r1# sh ip os da external
OSPF Router with ID (11.11.11.11)
AS External Link States
LS age: 1
Options: 0x2 : *|-|-|-|-|-|E|-
LS Flags: 0x6
LS Type: AS-external-LSA
Link State ID: 3.3.3.3 (External Network Number)
Advertising Router: 10.0.25.2
LS Seq Number: 80000003
Checksum: 0xcb9b
Length: 36
Network Mask: /32
Metric Type: 2 (Larger than any link state path)
TOS: 0
Metric: 20
Forward Address: 10.0.23.3 <--- address copied from type-7 lsa
External Route Tag: 0
Signed-off-by: ckishimo <carles.kishimoto@gmail.com>
Neither tabs nor newlines are acceptable in syslog messages. They also
break line-based parsing of file logs.
Signed-off-by: David Lamparter <equinox@diac24.net>
The #if 0 code in ospfd, has not been compiled since at least
2012. If we are at least 9 years old at this point with no effort
to use or save, we should just get rid of it.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Implement the below 2 CLIs to clear the current data in the process
and neighbor data structure.
1. clear ip ospf process
2. clear ip ospf neighbor
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>