When lsp-mtu is configured larger than interface mtu and the interface
is brought up, the ISIS code would crash. When other vendors have this
misconfiguration they just continue ISIS running and allow the LSP
packets to be created but not sent. When the misconfiguration is corrected
the LSP packets start being sent. This change creates that same behavior
in FRR.
The startup issue I am hitting is when the isis lsp-mtu is larger that the interfaces mtu.
We run into this case when we are in the process of changing the mtu on a tunnel.
I issue a shutdown/no shutdown on the interface, because the tunnel MTU is smaller
than the lsp-mtu, it is considered an error and calls circuit_if_del. This deletes
part of the circuit information, which includes the circuit->ip_addr list. Later on we get
an address update from zebra and try to add the interface address to this list and crash.
2022/04/07 20:19:52.032 ISIS: [GTRPJ-X68CG] CSM_EVENT for tun_gw2: IF_UP_FROM_Z
calls isis_circuit_if_add
this initialize the circuit->ip_addrs
isis_circuit_up
has the mtu check circuit->area->lsp_mtu > isis_circuit_pdu_size(circuit) and fails
returns ISIS_ERROR
on failure call isis_circuit_if_del
this deletes the circiut->ip_addrs list <----
2022/04/07 20:19:52.032 ZEBRA: [NXYHN-ZKW2V] zebra_if_addr_update_ctx: INTF_ADDR_ADD: ifindex 3, addr 192.168.0.1/24
message to isisd to add address
isis_zebra_if_address_add
isis_circuit_add_addr
circuit->ip_addr we try to add the ip address to the list, but it was deleted above and isisd crashes
Signed-off-by: Lynne Morrison <lynne.morrison@ibm.com>
The commands:
router isis 1
mpls-te on
no mpls-te on
mpls-te on
no mpls-te on
!
Will crash
Valgrind gives us this:
==652336== Invalid read of size 8
==652336== at 0x49AB25C: typed_rb_min (typerb.c:495)
==652336== by 0x4943B54: vertices_const_first (link_state.h:424)
==652336== by 0x493DCE4: vertices_first (link_state.h:424)
==652336== by 0x493DADC: ls_ted_del_all (link_state.c:1010)
==652336== by 0x47E77B: isis_instance_mpls_te_destroy (isis_nb_config.c:1871)
==652336== by 0x495BE20: nb_callback_destroy (northbound.c:1131)
==652336== by 0x495B5AC: nb_callback_configuration (northbound.c:1356)
==652336== by 0x4958127: nb_transaction_process (northbound.c:1473)
==652336== by 0x4958275: nb_candidate_commit_apply (northbound.c:906)
==652336== by 0x49585B8: nb_candidate_commit (northbound.c:938)
==652336== by 0x495CE4A: nb_cli_classic_commit (northbound_cli.c:64)
==652336== by 0x495D6C5: nb_cli_apply_changes_internal (northbound_cli.c:250)
==652336== Address 0x6f928e0 is 272 bytes inside a block of size 320 free'd
==652336== at 0x48399AB: free (vg_replace_malloc.c:538)
==652336== by 0x494BA30: qfree (memory.c:141)
==652336== by 0x493D99D: ls_ted_del (link_state.c:997)
==652336== by 0x493DC20: ls_ted_del_all (link_state.c:1018)
==652336== by 0x47E77B: isis_instance_mpls_te_destroy (isis_nb_config.c:1871)
==652336== by 0x495BE20: nb_callback_destroy (northbound.c:1131)
==652336== by 0x495B5AC: nb_callback_configuration (northbound.c:1356)
==652336== by 0x4958127: nb_transaction_process (northbound.c:1473)
==652336== by 0x4958275: nb_candidate_commit_apply (northbound.c:906)
==652336== by 0x49585B8: nb_candidate_commit (northbound.c:938)
==652336== by 0x495CE4A: nb_cli_classic_commit (northbound_cli.c:64)
==652336== by 0x495D6C5: nb_cli_apply_changes_internal (northbound_cli.c:250)
==652336== Block was alloc'd at
==652336== at 0x483AB65: calloc (vg_replace_malloc.c:760)
==652336== by 0x494B6F8: qcalloc (memory.c:116)
==652336== by 0x493D7D2: ls_ted_new (link_state.c:967)
==652336== by 0x47E4DD: isis_instance_mpls_te_create (isis_nb_config.c:1832)
==652336== by 0x495BB29: nb_callback_create (northbound.c:1034)
==652336== by 0x495B547: nb_callback_configuration (northbound.c:1348)
==652336== by 0x4958127: nb_transaction_process (northbound.c:1473)
==652336== by 0x4958275: nb_candidate_commit_apply (northbound.c:906)
==652336== by 0x49585B8: nb_candidate_commit (northbound.c:938)
==652336== by 0x495CE4A: nb_cli_classic_commit (northbound_cli.c:64)
==652336== by 0x495D6C5: nb_cli_apply_changes_internal (northbound_cli.c:250)
==652336== by 0x495D23E: nb_cli_apply_changes (northbound_cli.c:268)
Let's null out the pointer. After this change. Valgrind no longer reports issues
and isisd no longer crashes.
Fixes: #10939
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Add Link State TED features to isis_te.c and new CLI to export LS TED and
show LS TED to IS-IS.
IS-IS LSPs are parse each time a new LSP event occurs in order to update
accordingly the Link State Traffic Engineering Database. LS TED could be
exported through the ZAPI Opaque message (see sharpd as example).
Signed-off-by: Olivier Dugeon <olivier.dugeon@orange.com>
- Add advertisement of Global IPv6 address in IIH pdu
- Add new CLI to set IPv6 Router ID
- Add advertisement of IPv6 Router ID
- Correctly advertise IPv6 local and neighbor addresses in Extended IS and MT
Reachability TLVs
- Correct output of Neighbor IPv6 address in 'show isis database detail'
- Manage IPv6 addresses advertisement and corresponiding Adjacency SID when
IS-IS is not using Multi-Topology by introducing a new ISIS_MT_DISABLE
value for mtid (== 4096 i.e. first reserved flag set to 1)
Signed-off-by: Olivier Dugeon <olivier.dugeon@orange.com>
Since f60a1188 we store a pointer to the VRF in the interface structure.
There's no need anymore to store a separate vrf_id field.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Currently, we have a lot of checks in CLI and NB layer to prevent
incompatible IS-types of circuits and areas. All these checks become
completely meaningless when the interface is moved between VRFs. If the
area IS-type is different in the new VRF, previously done checks mean
nothing and we still end up with incorrect circuit IS type. To actually
prevent incorrect IS type, all checks must be done in the processing
code.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
We can simply check whether the circuit exists already – if it exists,
then we forbid the area-tag modification.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
We have checks on NB validation stage to prevent configuring LDP sync on
interfaces in non-default VRFs. These checks are completely useless,
because the interface can be easily moved to another VRF after
configuring LDP sync. Instead, the check must be done in the actual code
to cover the case when the interface is moved between VRFs.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Currently, we have some checks in the CLI and NB layer to "protect" from
setting loopback interfaces into non-passive mode. These checks are not
correct, because we can not rely on operational data during config
reading and validation stage as this data doesn't exist yet. There's
nothing wrong in allowing "incorrect" configuration – it is already
correctly handled by the actual code.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
There are two checks done when configuring ldp-sync on an interface:
- interface is not a loopback
- interface is in the default VRF
Both checks are incorrectly done using the operational data.
The second check can be done using only config data - do that.
The first check can't be done using only configurational data, but it's
not necessary. LDP sync code doesn't operate on loopback interfaces
already. There's no harm in allowing this to be configured.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Don't rely on operational data to validate that configuration is applied
to the default VRF. The VRF name is stored in the config - use it instead.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Don't rely on operational data to check for system ID consistency. This
is purely configurational data thing.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
We only need an instance when we have at least one area configured in a
VRF. Currently we have the following issues:
- instance for the default VRF is always created
- instance is not removed after the last area config is removed
This commit fixes both issues.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Compile with v2.0.0 tag of `libyang2` branch of:
https://github.com/CESNET/libyang
staticd init load time of 10k routes now 6s vs ly1 time of 150s
Signed-off-by: Christian Hopps <chopps@labn.net>
The current implementation of TI-LFA computes link-protecting
repair paths (even when node protection is enabled) to have repair
paths to all destinations when no node-protecting repair has been
found. This may be desired or not. E.g. the link-protecting paths
may use the protected node and be, therefore, useless if the node
fails. Also, computing link-protecting repairs incurs extra
calculations.
With this patch, when node protection is enabled, link protecting
repair paths are only computed if "link-fallback" is specified in
the configuration, on a per interface and IS-IS level.
Signed-off-by: Fredi Raspall <fredi@voltanet.io>
YANG model and CLI commands allow user to configure LDP-sync per area.
But the actual implementation is incorrect - all commands are changing
the config for the whole VRF instead of a single area. This commit fixes
this issue by actually implementing per area configuration.
Fixes#8578.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Currently we don't allow to configure the interface before the area is
configured. This approach has the following issues:
1. The area config can be deleted even when we have an interface config
relying on it. The code is not ready for that - we'll have a whole
bunch of stale pointers if user does that.
2. The code doesn't correctly process the event of changing the VRF for
an interface. There is no mechanism to ensure that the area exists
in the new VRF so currently the circuit still stays in the old VRF.
This commit allows an arbitrary order of area/interface configuration.
There is no more need to configure the area before configuring the
interface.
This change fixes both the issues.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Convert most DEFINE_MTYPE into the _STATIC variant, and move the
remaining non-static ones to appropriate places.
Signed-off-by: David Lamparter <equinox@diac24.net>
There are places in the code where function nb_running_get_entry is used
with abort_if_not_found set to true during the config validation stage.
This is incorrect because when used in transactional CLI, the running
entry won't be set until the apply stage, and such usage leads to crash.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
when changing both ranges at the same time the order of the commands
matters, as we need to make sure that the intermediate state is valid.
This represents a problem when pushing configuration via frr-reload.
To fix this, the global-block command was extended to optionally
allow setting the local-block range as well. The local-block command
is deprecated with a 1-year notice.
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
The purpose of the Attach-bit is to accomplish inter-area routing. In other
venders, the Attached-bit is automatically set when a router is configured
as a L1|L2 router and has two adjacencies. When a L1 router receives a LSP
with the Attached-bit set it is supposed to create a default route pointing
toward the neighbor to provide a default path out of the L1 area.
ISIS implementation has been fixed to support the above definition:
Setting the Attach-bit is now the default behavior and we allow the user to
turn it off.
We will only set the Default Attach-bit when creating a L1 LSP, if we are
a L1|L2 router and have a L2 adjacency up.
When a L1 router receives a LSP with the Attach-bit set, we will create a
default route pointing to the L1|L2 router as the nexthop.
The default route will be removed if the LSP is received with the Attach-bit
cleared.
Signed-off-by: Lynne Morrison <lynne@voltanet.io>
Remote LFA (RFC 7490) is an extension to the base LFA mechanism
that uses dynamically determined tunnels to extend the IP-FRR
protection coverage.
RLFA is similar to TI-LFA in that it computes a post-convergence
SPT (with the protected interface pruned from the network topology)
and the P/Q spaces based on that SPT. There are a few differences
however:
* RLFAs can push at most one label, so the P/Q spaces need to
intersect otherwise the destination can't be protected (the
protection coverage is topology dependent).
* isisd needs to interface with ldpd to obtain the labels it needs to
create a tunnel to the PQ node. That interaction needs to be done
asynchronously to prevent blocking the daemon for too long. With
TI-LFA all required labels are already available in the LSPDB.
RLFA and TI-LFA have more similarities than differences though,
and thanks to that both features share a lot of code.
Limitations:
* Only RLFA link protection is implemented. The algorithm used
to find node-protecting RLFAs (RFC 8102) is too CPU intensive and
doesn't always work. Most vendors implement RLFA link protection
only.
* RFC 7490 says it should be a local matter whether the repair path
selection policy favors LFA repairs over RLFA repairs. It might be
desirable, for instance, to prefer RLFAs that satisfy the downstream
condition over LFAs that don't. In this implementation, however,
RLFAs are only computed for destinations that can't be protected
by local LFAs.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
When last area address is removed, resign if we were DR.
This fixes an issue where: when the ISIS area address is changed, ISIS fails
to elect a new DR.
Signed-off-by: Karen Schoener <karen@voltanet.io>
Removing the obsolete ldp-sync periodic 'hello' message.
When ldp-sync is configured, IGPs take action if the LDP process goes down.
The IGPs have been updated to use the zapi client close callback to detect
the LDP process going down.
Signed-off-by: Karen Schoener <karen@voltanet.io>
This is a second iteration of commit 10bdc68f0c. Some recent
commits introduced zlog calls in the northbound callbacks
inadvertently.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
isisd relies on its YANG module to prevent the same SID index
from being configured multiple times for different prefixes. It's
possible, however, to have different routers assigning the same SID
index for different prefixes. When that happens, we say we have a
Prefix-SID collision, which is ultimately a misconfiguration issue.
The problem with Prefix-SID collisions is that the Prefix-SID that
is processed later overwrites the previous ones. Then, once the
Prefix-SID collision is fixed in the configuration, the overwritten
Prefix-SID isn't reinstalled since it's already marked as installed
and it didn't change. To prevent such inconsistency from happening,
add a safeguard in the SPF code to detect Prefix-SID collisions and
handle them appropriately (i.e. log a warning + ignore the Prefix-SID
Sub-TLV since it's already in use by another prefix). That way,
once the configuration is fixed, no Prefix-SID label entry will be
missing in the LFIB.
Reported-by: Emanuele Di Pascale <emanuele@voltanet.io>
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Replace all lib/thread cancel macros, use thread_cancel()
everywhere. Only the THREAD_OFF macro and thread_cancel() api are
supported. Also adjust thread_cancel_async() to NULL caller's pointer (if
present).
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Add the "n-flag-clear" option to the "segment-routing prefix"
command. The only thing that option does is to clear the node
flag of the Prefix-SID, even if it corresponds to a local loopback
address. No changes are necessary other than that in order to fully
support Anycast-SIDs. isisd already supports multiple routers
advertising the same route with the same Prefix-SID after the recent
refactoring. Clearing the node flag for such anycast routes isn't
strictly required, but failure to do so can lead to problems like
TI-LFA picking the wrong Prefix-SID when calculating repair paths.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Interface area-tag is not supposed to be modified once defined, but the
necessary check is currently broken, because the circuit is never in
init_circ_list if the area-tag is already configured for the interface.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
TI-LFA is a modern fast-reroute (FRR) solution that leverages Segment
Routing to pre-compute backup nexthops for all destinations in the
network, helping to reduce traffic restoration times whenever a
failure occurs. The backup nexthops are expected to be installed
in the FIB so that they can be activated as soon as a failure
is detected, making sub-50ms recovery possible (assuming an
hierarchical FIB).
TI-LFA is a huge step forward compared to prior IP-FRR solutions,
like classic LFA and Remote LFA, as it guarantees 100% coverage
for all destinations. This is possible thanks to the source routing
capabilities of SR, which allows the backup nexthops to steer traffic
around the failures (using as many SIDs as necessary). In addition
to that, the repair paths always follow the post-convergence SPF
tree, which prevents transient congestions and suboptimal routing
from happening.
Deploying TI-LFA is very simple as it only requires a single
configuration command for each interface that needs to be protected
(both link protection and node protection are available). In addition
to IPv4 and IPv6 routes, SR Prefix-SIDs and Adj-SIDs are also
protected by the backup nexthops computed by the TI-LFA algorithms.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
The code in isisd uses `circuit->area->isis` all the time
but we know that circuit now has a valid `circuit->isis` pointer
so let's use that and cleanup the long dereference.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
1. Added isis with different vrf and it's dependecies.
2. Added new vrf leaf in yang.
3. A minor change for IF_DOWN_FROM_Z passing argrument is
replaced with ifp pointer in api "isis_if_delete_hook()".
4. Minor fix in the isisd spf unit test.
Co-authored-by: Kaushik <kaushik@niralnetworks.com>"
Signed-off-by: harios_niral <hari@niralnetworks.com>