The problem occurred because of how ospf->oiflist behaves: each interface is removed and re-added at the end of the list in each ospf_network_run_subnet call, generating an infinite loop.
As a solution, a copy of the list is made and we iterate over that fixed copy instead.
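A minimal sketch of the pattern, not the actual ospfd change: `struct iflist`, `move_to_tail` and `visit_all_via_snapshot` are hypothetical stand-ins for the real list API. Walking a private snapshot means that re-ordering the live list inside the loop cannot extend the walk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an interface list; not FRR's struct list. */
struct iflist {
	int *ids;     /* interface ids, in list order */
	size_t count;
};

/* Mimics the problematic behaviour: remove the entry, re-add it at the tail. */
static void move_to_tail(struct iflist *l, int id)
{
	for (size_t i = 0; i < l->count; i++) {
		if (l->ids[i] != id)
			continue;
		memmove(&l->ids[i], &l->ids[i + 1],
			(l->count - i - 1) * sizeof(int));
		l->ids[l->count - 1] = id;
		return;
	}
}

/*
 * Walk a private copy of the list so that re-ordering the live list inside
 * the loop cannot make the walk revisit entries. Returns the number of
 * visits, or -1 on allocation failure.
 */
int visit_all_via_snapshot(struct iflist *l)
{
	int visits = 0;
	int *snapshot;
	size_t n;

	assert(l != NULL);
	if (l->count == 0)
		return 0;

	n = l->count;
	snapshot = malloc(n * sizeof(int));
	if (snapshot == NULL)
		return -1;
	memcpy(snapshot, l->ids, n * sizeof(int));

	for (size_t i = 0; i < n; i++) {
		move_to_tail(l, snapshot[i]); /* mutates only the live list */
		visits++;
	}

	free(snapshot);
	return visits;
}
```

Each entry is visited exactly once, no matter how often the live list is re-ordered during the walk.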
Signed-off-by: Rodrigo Nardi <rnardi@netdef.org>
Addressed a memory leak in OSPF by fixing the improper deallocation of
area range nodes when removed from the table. Introducing a new function,
`ospf_range_table_node_destroy`, for proper node cleanup resolved the issue.
The ASan leak log for reference:
```
Direct leak of 56 byte(s) in 2 object(s) allocated from:
#0 0x7faf661d1d28 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
#1 0x7faf65bce1e9 in qcalloc lib/memory.c:105
#2 0x55a66e0b61cd in ospf_area_range_new ospfd/ospf_abr.c:43
#3 0x55a66e0b61cd in ospf_area_range_set ospfd/ospf_abr.c:195
#4 0x55a66e07f2eb in ospf_area_range ospfd/ospf_vty.c:631
#5 0x7faf65b51548 in cmd_execute_command_real lib/command.c:993
#6 0x7faf65b51f79 in cmd_execute_command_strict lib/command.c:1102
#7 0x7faf65b51fd8 in command_config_read_one_line lib/command.c:1262
#8 0x7faf65b522bf in config_from_file lib/command.c:1315
#9 0x7faf65c832df in vty_read_file lib/vty.c:2605
#10 0x7faf65c83409 in vty_read_config lib/vty.c:2851
#11 0x7faf65bb0341 in frr_config_read_in lib/libfrr.c:977
#12 0x7faf65c6cceb in event_call lib/event.c:1979
#13 0x7faf65bb1488 in frr_run lib/libfrr.c:1213
#14 0x55a66dfb28c4 in main ospfd/ospf_main.c:249
#15 0x7faf651c9c86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)
SUMMARY: AddressSanitizer: 56 byte(s) leaked in 2 allocation(s).
```
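The idea behind the fix can be sketched as a per-node cleanup hook that the table invokes before freeing a node. This is an illustration under assumptions: `struct range_node`, `struct range_table`, `range_node_destroy` and `range_table_finish` are hypothetical names and do not reproduce FRR's route table API or the body of `ospf_range_table_node_destroy`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical table node carrying a heap-allocated payload. */
struct range_node {
	void *info;              /* payload owned by the node */
	struct range_node *next;
};

/* Hypothetical table with an optional per-node cleanup hook. */
struct range_table {
	struct range_node *head;
	void (*node_destroy)(struct range_node *node);
};

/* Cleanup hook: release the payload that the node owns. */
void range_node_destroy(struct range_node *node)
{
	assert(node != NULL);
	free(node->info);
	node->info = NULL;
}

/*
 * Tear down the table, giving every node a chance to free its payload.
 * Without the hook the payloads would leak, as in the log above.
 * Returns the number of nodes destroyed.
 */
int range_table_finish(struct range_table *t)
{
	int destroyed = 0;

	assert(t != NULL);
	while (t->head != NULL) {
		struct range_node *node = t->head;

		t->head = node->next;
		if (t->node_destroy != NULL)
			t->node_destroy(node);
		free(node);
		destroyed++;
	}
	return destroyed;
}
```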
Signed-off-by: Keelan Cannoo <keelan.cannoo@icloud.com>
Consider this config:
router ospf
redistribute kernel
Then you issue:
no router ospf
ospf will crash with a use after free.
The problem is that the events associated with the
ospf pointer were shut off and then ospf_external_delete
was called, which rescheduled the event. Let's just move
event deletion to the end of the no router ospf.
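To illustrate why the ordering matters, here is a simplified sketch. All names (`struct instance`, `external_delete`, `cancel_events` and the two teardown functions) are hypothetical; the real code deals with FRR events and `struct ospf`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified instance; not the real struct ospf. */
struct instance {
	bool event_pending; /* true while an event still references us */
	int externals;      /* redistributed routes still installed */
};

/* Stand-in for a cleanup step that schedules an event as a side effect. */
static void external_delete(struct instance *inst)
{
	inst->externals = 0;
	inst->event_pending = true;
}

/* Stand-in for cancelling every event that points at the instance. */
static void cancel_events(struct instance *inst)
{
	inst->event_pending = false;
}

/* Buggy order: events are cancelled first, then cleanup reschedules one. */
void teardown_cancel_first(struct instance *inst)
{
	assert(inst != NULL);
	cancel_events(inst);
	external_delete(inst);
	/* freeing inst now would leave a pending event with a dangling pointer */
}

/* Fixed order: run all cleanup, cancel events as the final step. */
void teardown_cancel_last(struct instance *inst)
{
	assert(inst != NULL);
	external_delete(inst);
	cancel_events(inst);
	/* nothing references inst any more, so freeing it is safe */
}
```

With the first ordering an event is still pending when the instance would be freed; with the second, none is.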
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Currently, when redistribution of routes is configured and external LSAs
have already been advertised to peers, changing default-metric does not
trigger a refresh of the external LSAs. In other words, the peers will not
receive the refreshed external LSAs with the new metric.
With this fix, changing default-metric will cause external LSAs to be
refreshed and flooded.
There is a similar task to refresh external LSAs when NSSA settings are
changed. And there is a function that accomplishes it -
ospf_schedule_asbr_nssa_redist_update(). Since the function does the
general work of refreshing external LSAs and is not specific to NSSA
settings, the idea is to give it a more general name and call it when
default-metric changes in order to fix the problem.
Signed-off-by: Alexander Chernavin <achernavin@netgate.com>
When running all daemons with config for most of them, FRR has
sharpd@janelle:~/frr$ vtysh -c "show debug hashtable" | grep "VRF BIT HASH" | wc -l
3570
3570 hashes for bitmaps associated with the vrf. This is a very
large number of hashes. Let's do two things:
a) Reduce the initial size of the created hashes to 2
instead of 32.
b) Delay generation of the hash *until* a set operation happens,
since the absence of a hash directly implies an unset value if/when checked.
This reduces the number of hashes to 61 in my setup for normal
operation.
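The two points above can be sketched roughly as follows. This is only an illustration: `struct lazy_bitmap` and its functions are hypothetical and use a plain array rather than FRR's hash implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define BITMAP_INITIAL_SLOTS 2 /* small initial size, as in point (a) */

/* Hypothetical lazily allocated set of ids; not FRR's vrf bitmap type. */
struct lazy_bitmap {
	unsigned int *ids; /* NULL until the first set operation */
	size_t used;
	size_t slots;
};

/* Checking never allocates: no storage simply means "unset". */
bool lazy_bitmap_check(const struct lazy_bitmap *bm, unsigned int id)
{
	assert(bm != NULL);
	for (size_t i = 0; i < bm->used; i++)
		if (bm->ids[i] == id)
			return true;
	return false;
}

/* Storage is created on the first set and doubled when it fills up. */
bool lazy_bitmap_set(struct lazy_bitmap *bm, unsigned int id)
{
	assert(bm != NULL);
	if (lazy_bitmap_check(bm, id))
		return true;
	if (bm->used == bm->slots) {
		size_t slots = bm->slots ? bm->slots * 2 : BITMAP_INITIAL_SLOTS;
		unsigned int *ids = realloc(bm->ids, slots * sizeof(*ids));

		if (ids == NULL)
			return false;
		bm->ids = ids;
		bm->slots = slots;
	}
	bm->ids[bm->used++] = id;
	return true;
}

/* Release the storage and return to the "nothing allocated" state. */
void lazy_bitmap_free(struct lazy_bitmap *bm)
{
	assert(bm != NULL);
	free(bm->ids);
	bm->ids = NULL;
	bm->used = 0;
	bm->slots = 0;
}
```

A bitmap that is only ever checked never allocates anything, which is where most of the saving comes from.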
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Currently, delayed reflooding on P2MP interfaces for LSAs received
from neighbors on the interface is unconditional (see commit
c706f0e32b). In some cases, this change wasn't desirable, so this
feature makes delayed reflooding configurable for P2MP interfaces
via the CLI command
"ip ospf network point-to-multipoint delay-reflood" in interface
submode.
Signed-off-by: Acee <aceelindem@gmail.com>
In practical terms, unplanned GR refers to the act of recovering
from a software crash without affecting the forwarding plane.
Unplanned GR and Planned GR work virtually the same, except for the
following difference: on planned GR, the router sends the Grace-LSAs
*before* restarting, whereas in unplanned GR the router sends the
Grace-LSAs immediately *after* restarting.
For unplanned GR to work, ospfd was modified to send a
ZEBRA_CLIENT_GR_CAPABILITIES message to zebra as soon as GR is
enabled. This causes zebra to freeze the OSPF routes in the RIB as
soon as the ospfd daemon dies, for as long as the configured grace
period (the default is 120 seconds). Similarly, ospfd now stores in
non-volatile memory that GR is enabled as soon as GR is configured.
Those two things are no longer done during the GR preparation phase,
which only happens for planned GRs.
Unplanned GR will only take effect when the daemon is killed
abruptly (e.g. SIGSEGV, SIGKILL), otherwise all OSPF routes will
be uninstalled while ospfd is exiting. Once ospfd starts, it will
check whether GR is enabled and enter GR mode if necessary,
sending Grace-LSAs out all operational interfaces.
One disadvantage of unplanned GR is that the neighboring routers
might time out their corresponding adjacencies if ospfd takes too
long to come back up. This is especially the case when short dead
intervals are used (or BFD). For this and other reasons, planned
GR should be preferred whenever possible.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Add support for a write socket per interface, enabled by
default at the ospf instance level. An ospf instance-level
config allows this to be disabled, reverting to the older
behavior where a single per-instance socket is used for
sending and receiving packets.
Signed-off-by: Mark Stapp <mjs@labn.net>
Implement NSSA address ranges as specified by RFC 3101:
NSSA border routers may be configured with Type-7 address ranges.
Each Type-7 address range is defined as an [address,mask] pair. Many
separate Type-7 networks may fall into a single Type-7 address range,
just as a subnetted network is composed of many separate subnets.
NSSA border routers may aggregate Type-7 routes by advertising a
single Type-5 LSA for each Type-7 address range. The Type-5 LSA
resulting from a Type-7 address range match will be distributed to
all Type-5 capable areas.
Syntax:
area A.B.C.D nssa range A.B.C.D/M [<not-advertise|cost (0-16777215)>]
Example:
router ospf
router-id 1.1.1.1
area 1 nssa
area 1 nssa range 172.16.0.0/16
area 1 nssa range 10.1.0.0/16
!
Since regular area ranges and NSSA ranges have a lot in common,
this commit reuses the existing infrastructure for area ranges as
much as possible to avoid code duplication.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Add the "default-information-originate" option to the "area X nssa"
command. That option allows the origination of Type-7 default routes
on NSSA ABRs and ASBRs.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Combine all variations of the "area nssa" command into a single
DEFPY to improve code maintainability.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Effectively a massive search and replace of
`struct thread` to `struct event`. Using the
term `thread` leads people to think that
this event system is a pthread when it is not.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This is the first in a series of commits, whose goal is to rename
the thread system in FRR to an event system. There is a continual
problem where people are confusing `struct thread` with a true
pthread. In reality, our entire thread.c is an event system.
In this commit rename the thread.[ch] files to event.[ch].
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Description:
The code changes involve the following:
1. An additional structure containing flood reduction related info
per area.
2. A knob variable in the ospf structure for enabling/disabling the feature.
3. Initialization of the above-mentioned variables.
Signed-off-by: Manoj Naragund <mnaragund@vmware.com>
After `free()`ing a table also set it to NULL so when the instance
release function is called we know whether the pointer is valid or not.
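A small sketch of the idiom, with hypothetical names (`struct table_owner`, `table_release`, `owner_release`) standing in for the real instance and table types:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical instance owning one heap-allocated table. */
struct table_owner {
	int *table;
};

/* Free the table early and clear the pointer so later code can test it. */
void table_release(struct table_owner *o)
{
	assert(o != NULL);
	free(o->table);
	o->table = NULL;
}

/*
 * Final release: safe to call whether or not table_release() already ran.
 * Returns 1 if the table was still allocated and has now been freed.
 */
int owner_release(struct table_owner *o)
{
	assert(o != NULL);
	if (o->table == NULL)
		return 0;
	table_release(o);
	return 1;
}
```

Because the pointer is cleared on the first free, the later release path can tell that there is nothing left to free and a double free is avoided.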
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Description:
As part of the signal handler ospf_finish_final(), LSAs that are originated
and added to the refresh queues are not freed.
One such leak is:
==2869285== 432 (40 direct, 392 indirect) bytes in 1 blocks are definitely lost in loss record 159 of 221
==2869285== at 0x484DA83: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==2869285== by 0x4910EC3: qcalloc (memory.c:116)
==2869285== by 0x199024: ospf_refresher_register_lsa (ospf_lsa.c:4017)
==2869285== by 0x199024: ospf_refresher_register_lsa (ospf_lsa.c:3979)
==2869285== by 0x19A37F: ospf_network_lsa_install (ospf_lsa.c:2680)
==2869285== by 0x19A37F: ospf_lsa_install (ospf_lsa.c:2941)
==2869285== by 0x19C18F: ospf_network_lsa_update (ospf_lsa.c:1099)
==2869285== by 0x1931ED: ism_change_state (ospf_ism.c:556)
==2869285== by 0x1931ED: ospf_ism_event (ospf_ism.c:596)
==2869285== by 0x494E0B0: thread_call (thread.c:2006)
==2869285== by 0x494E395: _thread_execute (thread.c:2098)
==2869285== by 0x19FBC6: nsm_change_state (ospf_nsm.c:695)
==2869285== by 0x19FBC6: ospf_nsm_event (ospf_nsm.c:861)
==2869285== by 0x494E0B0: thread_call (thread.c:2006)
==2869285== by 0x494E395: _thread_execute (thread.c:2098)
==2869285== by 0x19020B: ospf_if_cleanup (ospf_interface.c:322)
==2869285== by 0x192D0C: ism_interface_down (ospf_ism.c:393)
==2869285== by 0x193028: ospf_ism_event (ospf_ism.c:584)
==2869285== by 0x494E0B0: thread_call (thread.c:2006)
==2869285== by 0x494E395: _thread_execute (thread.c:2098)
==2869285== by 0x190F10: ospf_if_down (ospf_interface.c:851)
==2869285== by 0x1911D6: ospf_if_free (ospf_interface.c:341)
==2869285== by 0x1E6E98: ospf_finish_final (ospfd.c:748)
==2869285== by 0x1E6E98: ospf_deferred_shutdown_finish (ospfd.c:578)
==2869285== by 0x1E7727: ospf_finish (ospfd.c:682)
==2869285== by 0x1E7727: ospf_terminate (ospfd.c:652)
==2869285== by 0x18852B: sigint (ospf_main.c:105)
==2869285== by 0x493BE12: frr_sigevent_process (sigevent.c:130)
==2869285== by 0x494DCD4: thread_fetch (thread.c:1775)
==2869285== by 0x4905022: frr_run (libfrr.c:1197)
==2869285== by 0x187891: main (ospf_main.c:235)
Added a fix to clean up all these queue pointers and the corresponding LSAs in them.
Signed-off-by: Rajesh Girada <rgirada@vmware.com>
Description:
Added hidden CLI commands that allow you to reset the default timers
for LSA refresh and LSA maxage remove delay. These will help in testing
LSA refresh scenarios in the upcoming OSPFv2 Flood reduction feature (RFC 4136).
IETF Link: https://datatracker.ietf.org/doc/html/rfc4136
Signed-off-by: Manoj Naragund <mnaragund@vmware.com>
Let's just use THREAD_OFF consistently in the code base
instead of each daemon having a special macro that needs to
be looked at and remembered what it does.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Opaque data takes up a lot of memory when there are a lot of routes on
the box. Given that this is just cosmetic info, I propose to disable
it by default to not shock people who start using FRR for the first time
or upgrade from an old version.
Fixes #10101.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Update ospfd and ospf6d to send opaque route attributes to
zebra. Those attributes are stored in the RIB and can be viewed
using the "show ip[v6] route" commands (other than that, they are
completely ignored by zebra).
Example:
```
debian# show ip route 192.168.1.0/24
Routing entry for 192.168.1.0/24
Known via "ospf", distance 110, metric 20, best
Last update 01:57:08 ago
* 10.0.1.2, via eth-rt2, weight 1
OSPF path type : External-2
OSPF tag : 0
debian#
debian# show ip route 192.168.1.0/24 json
{
"192.168.1.0\/24":[
{
"prefix":"192.168.1.0\/24",
"prefixLen":24,
"protocol":"ospf",
"vrfId":0,
"vrfName":"default",
"selected":true,
[snip]
"ospfPathType":"External-2",
"ospfTag":"0"
}
]
}
```
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Currently, it is possible to rename the default VRF either by passing
`-o` option to zebra or by creating a file in `/var/run/netns` and
binding it to `/proc/self/ns/net`.
In both cases, only zebra knows about the rename and other daemons learn
about it only after they connect to zebra. This is a problem, because
daemons may read their config before they connect to zebra. To handle
this rename after the config is read, we have some special code in every
single daemon, which is not very bad but not desirable in my opinion.
But things get worse when we need to handle this in the northbound
layer, as we have to manually rewrite the config nodes. This approach is
already hacky, but still works as every daemon handles its own NB
structures. But it is completely incompatible with the central
management daemon architecture we are aiming for, as mgmtd doesn't even
have a connection with zebra to learn from it. And it shouldn't have it,
because operational state changes should never affect configuration.
To solve the problem and simplify the code, I propose to expand the `-o`
option to all daemons. By using the startup option, we let daemons know
about the rename before they read their configs so we don't need any
special code to deal with it. There's an easy way to pass the option to
all daemons by using `frr_global_options` variable.
Unfortunately, the second way of renaming by creating a file in
`/var/run/netns` is incompatible with the new mgmtd architecture.
Theoretically, we could force daemons to read their configs only after
they connect to zebra, but it means adding even more code to handle a
very specific use-case. And anyway this won't work for mgmtd as it
doesn't have a connection with zebra. So I had to remove this option.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Since f60a1188 we store a pointer to the VRF in the interface structure.
There's no need anymore to store a separate vrf_id field.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
When doing a normal exit from ospf we should close
the log file, since otherwise we leave a bunch of
unterminated logging processes behind.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Problem Statement:
==================
Summary LSA is not originated when router-id is modified or process is reset.
Root Cause Analysis:
====================
When router-id is modified or process is cleared, all the external LSAs are
flushed and then re-originated using ospf_external_lsa_rid_change.
When the LSAs are flushed, the aggregate flags are not reset.
Fix:
===============
Reset the aggregation flag when the LSAs
are flushed.
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
There are a couple of things that are not initialized if the OSPF router
is created in a non-existent VRF:
- ospf_lsa_maxage_walker
- ospf_lsa_refresh_walker
- ospf_opaque_type11_lsa_init
Rearrange some code to always initialize them and make it easier to find
similar problems in the future.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
When OSPF is disabled on an interface and enabled again, an IP address that
does not match the prefix-list is originated as an External LSA.
Fixes: #9362
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
RFC 3623 specifies the Graceful Restart enhancement to the OSPF
routing protocol. This PR implements support for the restarting mode,
whereas the helper mode was implemented by #6811.
This work is based on #6782, which implemented the pre-restart part
and laid the foundations for the post-restart part (behavioral
changes, GR exit conditions, and on-exit actions).
Here's a quick summary of how the GR restarting mode works:
* GR can be enabled on a per-instance basis using the `graceful-restart
[grace-period (1-1800)]` command;
* To perform a graceful shutdown, the `graceful-restart prepare ospf`
EXEC-level command needs to be issued before restarting the ospfd
daemon (there's no specific requirement on how the daemon should
be restarted);
* `graceful-restart prepare ospf` will initiate the graceful restart
for all GR-enabled instances by taking the following actions:
o Flooding Grace-LSAs over all interfaces
o Freezing the OSPF routes in the RIB
o Saving the end of the grace period in non-volatile memory (a JSON
file stored in `$frr_statedir`)
* Once ospfd is started again, it will follow the procedures
described in RFC 3623 until it detects it's time to exit the graceful
restart (either successfully or unsuccessfully).
Testing done:
* New topotest featuring a multi-area OSPF topology (including stub
and NSSA areas);
* Successful interop tests against IOS-XR routers acting as helpers.
Co-authored-by: GalaxyGorilla <sascha@netdef.org>
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Both the GR helper code and the upcoming GR restarting code are going
to share a lot of definitions. As such, rename ospf_gr_helper.h to
ospf_gr.h, which will be the central point of all GR definitions
and prototypes.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>