Function ospf_te_parse_te() and ospf_te_delete_te() browse TE TLV but also
subTLV. The loop that parse the subTLV check that cummulative read data doesn't
exceed the total size of the TLV. However, the sum variable that counts the
number of read data was wrongly intialize to 0 instead to 4 (i.e. the initial
TLV Header size that is located at the TOP of subTLV).
This patch adjust accordingly the initial value of the counter.
Signed-off-by: Olivier Dugeon <olivier.dugeon@orange.com>
The no-form should use the same arguments as the regular command, hence
replace "period" with "grace-period".
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Running ospf_topo_vrf1 leads us to this valgrind issue:
==2386518== Invalid read of size 8
==2386518== at 0x4971520: route_top (table.c:401)
==2386518== by 0x181F08: ospf_interface_bfd_apply (ospf_bfd.c:126)
==2386518== by 0x182069: ospf_interface_disable_bfd (ospf_bfd.c:158)
==2386518== by 0x18BF51: ospf_del_if_params (ospf_interface.c:557)
==2386518== by 0x18C584: ospf_if_delete_hook (ospf_interface.c:712)
==2386518== by 0x490CA0B: hook_call_if_del (if.c:61)
==2386518== by 0x490D1F3: if_delete_retain (if.c:286)
==2386518== by 0x490D337: if_delete (if.c:309)
==2386518== by 0x490CDED: if_destroy_via_zapi (if.c:200)
==2386518== by 0x49940A9: zclient_interface_delete (zclient.c:2237)
==2386518== by 0x4998062: zclient_read (zclient.c:3969)
==2386518== by 0x4979529: thread_call (thread.c:1908)
==2386518== by 0x4919918: frr_run (libfrr.c:1164)
==2386518== by 0x181AC7: main (ospf_main.c:235)
==2386518== Address 0x5df39a0 is 0 bytes inside a block of size 56 free'd
==2386518== at 0x48399AB: free (vg_replace_malloc.c:538)
==2386518== by 0x492A03E: qfree (memory.c:141)
==2386518== by 0x4970C6F: route_table_free (table.c:141)
==2386518== by 0x4970A36: route_table_finish (table.c:61)
==2386518== by 0x18C543: ospf_if_delete_hook (ospf_interface.c:708)
==2386518== by 0x490CA0B: hook_call_if_del (if.c:61)
==2386518== by 0x490D1F3: if_delete_retain (if.c:286)
==2386518== by 0x490D337: if_delete (if.c:309)
==2386518== by 0x490CDED: if_destroy_via_zapi (if.c:200)
==2386518== by 0x49940A9: zclient_interface_delete (zclient.c:2237)
==2386518== by 0x4998062: zclient_read (zclient.c:3969)
==2386518== by 0x4979529: thread_call (thread.c:1908)
==2386518== by 0x4919918: frr_run (libfrr.c:1164)
==2386518== by 0x181AC7: main (ospf_main.c:235)
==2386518== Block was alloc'd at
==2386518== at 0x483AB65: calloc (vg_replace_malloc.c:760)
==2386518== by 0x4929EFC: qcalloc (memory.c:116)
==2386518== by 0x49709F8: route_table_init_with_delegate (table.c:53)
==2386518== by 0x49717F4: route_table_init (table.c:528)
==2386518== by 0x18C328: ospf_if_new_hook (ospf_interface.c:659)
==2386518== by 0x490C97D: hook_call_if_add (if.c:60)
==2386518== by 0x490CE85: if_create_name (if.c:223)
==2386518== by 0x490DF32: if_get_by_name (if.c:622)
==2386518== by 0x4993F73: zclient_interface_add (zclient.c:2186)
==2386518== by 0x4998062: zclient_read (zclient.c:3969)
==2386518== by 0x4979529: thread_call (thread.c:1908)
==2386518== by 0x4919918: frr_run (libfrr.c:1164)
==2386518== by 0x181AC7: main (ospf_main.c:235)
==2386518==
Fix the ordering to do the individual node tree cleanup after we delete
the data we care about.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Currently, we automatically delete an inactive VRF when its last
interface is deleted. This code introduces a couple of crashes because
of the following problems:
- vrf_delete is called before calling if_del hook, so daemons may try to
dereference an ifp->vrf pointer which is freed
- in if_terminate, we continue to use the VRF in the loop condition
after the last interface is deleted
This check is needed only when the interface is deleted by the user,
because if the interface is deleted by the system, VRF must still exist
in the system. Move the check to appropriate places to fix crashes.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Issue #9983 explains what is wrong with the GR helper mode.
To unblock the CI that fails almost all the time on the ospf_gr_topo1
test, remove the commands and disable the test. Also add a reminder to
completely remove the helper mode if no one fixes the code in a month.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Description:
timerval datastructure is being used without initialization.
Using these uninitialized parameters can lead unexpected results
so initializing before using it.
Signed-off-by: Rajesh Girada <rgirada@vmware.com>
Description:
In PointToPoint networks, There wont be DR and BDR.
But by default, All neighbours ism state is shown as
DR_OTHER.
Changed the nbr state format to <nbrsate>/- (ex : FULL/-)
to P2pnetworks.
Signed-off-by: Rajesh Girada <rgirada@vmware.com>
Description:
1. Adding uptime to the 'show ip ospf neighbor' o/p.
2. Adding uptime and deadtime in string format for json consumption.
Signed-off-by: Rajesh Girada <rgirada@vmware.com>
Do not return pointer to the newly created thread from various thread_add
functions. This should prevent developers from storing a thread pointer
into some variable without letting the lib know that the pointer is
stored. When the lib doesn't know that the pointer is stored, it doesn't
prevent rescheduling and it can lead to hard to find bugs. If someone
wants to store the pointer, they should pass a double pointer as the last
argument.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
When doing a normal exit from ospf we should close
the log file as that we are leaving a bunch of
unterminated logging processes by not doing so.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This removes a giant `switch { }` block from lib/zclient.c and
harmonizes all zclient callback function types to be the same (some had
a subset of the args, some had a void return, now they all have
ZAPI_CALLBACK_ARGS and int return.)
Apart from getting rid of the giant switch, this is a minor security
benefit since the function pointers are now in a `const` array, so they
can't be overwritten by e.g. heap overflows for code execution anymore.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
It allows FRR to read the interface config even when the necessary VRFs
are not yet created and interfaces are in "wrong" VRFs. Currently, such
config is rejected.
For VRF-lite backend, we don't care at all about the VRF of the inactive
interface. When the interface is created in the OS and becomes active,
we always use its actual VRF instead of the configured one. So there's
no need to reject the config.
For netns backend, we may have multiple interfaces with the same name in
different VRFs. So we care about the VRF of inactive interfaces. And we
must allow to preconfigure the interface in a VRF even before it is
moved to the corresponding netns. From now on, we allow to create
multiple configs for the same interface name in different VRFs and
the necessary config is applied once the OS interface is moved to the
corresponding netns.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
if_lookup_by_name_all_vrf doesn't work correctly with netns VRF backend
as the same index may be used in multiple netns simultaneously.
Use the appropriate VRF when looking for the interface.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
The `show ip ospf neighbor json` command was displaying
state:`Full\/DR`
Where state was both the role and whether or not the neigbhor
was converged. While from a OSPF perspective this is the state.
This state is a combination of two things.
This creates a problem in testing because we have no guarantee
that a particular ospf router will actually have a particular role
given how loaded our topotest systems are. So add a bit of json
output to display both the converged status as well as the
role this router is playing on this neighbor/interface.
The above becomes:
state:`Full\/DR`
converged:`Full`
role:`DR`
Tests can now be modified to look for `Full` and allow it to
continue. Most of the tests do not actually care if this
router is the DR or Backup.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Commit 3551ee9e90304 introduced a regression that causes GR to fail
under certain circumstances. In short, while ISM events should
be ignored while acting as a helper for a restarting router, the
DR/BDR fields of the neighbor structure should still be updated
while processing a Hello packet. If that isn't done, it can cause
the helper to elect the wrong DR while exiting from the helper mode,
leading to a situation where there are two DRs for the same network
segment (and a failed GR by consequence). Fix this.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Before starting the graceful restart procedures, ospf_gr_prepare()
verifies for each configured OSPF instance whether it has the opaque
capability enabled (a pre-requisite for GR). If not, a warning is
emitted and GR isn't performed on that instance.
This PR introduces an additional opaque capability check that will
return a CLI error when the opaque capability isn't enabled. The
idea is to make it easier for the user to identify when the GR
activation has failed, instead of requiring him or her to check
the logs for errors.
The original opaque capability check from ospf_gr_prepare() was
retaining as it's possible that that function might be called from
other contexts in the future.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
The ospfd opaque LSA infrastruture has an issue where it can't store
different versions of the same Type-9 LSA for different interfaces.
When flushing the self-originated Grace-LSAs upon exiting from the GR
mode, the code was looking up the single self-originated Grace-LSA
from the LSDB, setting its age to MaxAge and sending it out on all
interfaces.
The problem is that Grace-LSAs sent on broadcast interfaces have
their own unique "IP interface address" TLV that is used to identify
the restarting router. That way, just reusing the same Grace-LSA for
all interfaces doesn't work.
Fix this by generating a new Grace-LSA with its age manually set
to MaxAge whenever one needs to be flushed. This will allow the "IP
interface address" TLV to be set correctly and make GR work even in
the presence of multiple broadcast interfaces.
In the long term, the opaque LSA infrastructure should be updated
to support Type-9 link-local LSAs correctly so that we don't need to
resort to hacks like this.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Add a 'json' parameter to the 'show_opaque_info' callback definition,
and update all instances of that callback to not display plain-text
data when the user requested JSON data.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
RFC 3623 says:
"If the restarting router determines that it was the Designated
Router on a given segment prior to the restart, it elects
itself as the Designated Router again. The restarting router
knows that it was the Designated Router if, while the
associated interface is in Waiting state, a Hello packet is
received from a neighbor listing the router as the Designated
Router".
Implement that logic when processing Hello messages to ensure DR
interfaces will preserve their DR status across a graceful restart.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
FRR should only ever use the appropriate THREAD_ON/THREAD_OFF
semantics. This is espacially true for the functions we
end up calling the thread for.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Issue:
===================
OSPF neighbors are not going down even after 10 mins when
having a mismatch in hello and dead interval.
First neighbors are formed and then a mismatch in the interval
is created, it is observed that the neighbor is not going down.
Root Cause Analysis:
====================
The event HelloReceived defined in RFC 2328 was named as PacketReceived
and this event was scheduled whenever LS Update, LS Ack, LS Request,
DD description packet or Hello packet is received.
Although there is a mismatch in the Hello packet contents, the
event PacketReceived gets triggered due to LS Update received and the
dead timer gets reset and hence the neighbor was never going Down and
remains FULL.
Fix:
==================
As per RFC 2328, the HelloReceived needs to be triggered only when
valid OSPF Hello packet is received and not when other OSPF packets
are received. Modified the function name as well.
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
Description:
As per the RFC 3623 section 3.2,
OSPF nbr shouldn't be deleted even in unsuccessful helper exit.
1. Made the changes to keep neighbour even after exit.
2. Restart the dead timer after expiry in helper. Otherwise, Restarter
will be in FULL state in helper forever until it receives the 'hello'.
Signed-off-by: Rajesh Girada <rgirada@vmware.com>
There's a helper function to check whether the interface is loopback or
VRF - if_is_loopback_or_vrf. Let's use it whenever we need to check that.
There's no functional change in this commit.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>