The following causes a validation error.
> # cat config
> affinity-map green bit-position 0
> router isis 1
> flex-algo 129
> affinity exclude-any green
> # vtysh -f config
> Error type: validation
> Error description: affinity map green isn't found
> The following commands were dynamically grouped into the same transaction and rejected:
> - affinity-map green bit-position 0
> - router isis 1
> - flex-algo 129
> - affinity exclude-any green
Data does not exist in memory in validation state.
Get data from the candidate northbound config instead.
Fixes: 893882ee20 ("isisd: add isis flex-algo configuration backend")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
The following causes a isisd crash.
> # cat config
> affinity-map green bit-position 0
> router isis 1
> flex-algo 129
> affinity exclude-any green
> # vtysh -f config
> #0 raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:50
> #1 0x00007f650cd32756 in core_handler (signo=6, siginfo=0x7ffc56f93070, context=0x7ffc56f92f40) at lib/sigevent.c:258
> #2 <signal handler called>
> #3 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
> #4 0x00007f650c91c537 in __GI_abort () at abort.c:79
> #5 0x00007f650cd007c9 in nb_running_get_entry_worker (dnode=0x0, xpath=0x0, abort_if_not_found=true, rec_search=true) at lib/northbound.c:2531
> #6 0x00007f650cd007f9 in nb_running_get_entry (dnode=0x55d9ad406e00, xpath=0x0, abort_if_not_found=true) at lib/northbound.c:2537
> #7 0x000055d9ab302248 in isis_instance_flex_algo_affinity_set (args=0x7ffc56f947a0, type=2) at isisd/isis_nb_config.c:2998
> #8 0x000055d9ab3027c0 in isis_instance_flex_algo_affinity_exclude_any_create (args=0x7ffc56f947a0) at isisd/isis_nb_config.c:3155
> #9 0x00007f650ccfe284 in nb_callback_create (context=0x7ffc56f94d20, nb_node=0x55d9ad28b540, event=NB_EV_VALIDATE, dnode=0x55d9ad406e00, resource=0x0, errmsg=0x7ffc56f94de0 "",
> errmsg_len=8192) at lib/northbound.c:1487
> #10 0x00007f650ccff067 in nb_callback_configuration (context=0x7ffc56f94d20, event=NB_EV_VALIDATE, change=0x55d9ad406d40, errmsg=0x7ffc56f94de0 "", errmsg_len=8192) at lib/northbound.c:1884
> #11 0x00007f650ccfda31 in nb_candidate_validate_code (context=0x7ffc56f94d20, candidate=0x55d9ad20d710, changes=0x7ffc56f94d38, errmsg=0x7ffc56f94de0 "", errmsg_len=8192)
> at lib/northbound.c:1246
> #12 0x00007f650ccfdc67 in nb_candidate_commit_prepare (context=..., candidate=0x55d9ad20d710, comment=0x0, transaction=0x7ffc56f94da0, skip_validate=false, ignore_zero_change=false,
> errmsg=0x7ffc56f94de0 "", errmsg_len=8192) at lib/northbound.c:1317
> #13 0x00007f650ccfdec4 in nb_candidate_commit (context=..., candidate=0x55d9ad20d710, save_transaction=true, comment=0x0, transaction_id=0x0, errmsg=0x7ffc56f94de0 "", errmsg_len=8192)
> at lib/northbound.c:1381
> #14 0x00007f650cd045ba in nb_cli_classic_commit (vty=0x55d9ad3f7490) at lib/northbound_cli.c:57
> #15 0x00007f650cd04749 in nb_cli_pending_commit_check (vty=0x55d9ad3f7490) at lib/northbound_cli.c:96
> #16 0x00007f650cc94340 in cmd_execute_command_real (vline=0x55d9ad3eea10, vty=0x55d9ad3f7490, cmd=0x0, up_level=0) at lib/command.c:1000
> #17 0x00007f650cc94599 in cmd_execute_command (vline=0x55d9ad3eea10, vty=0x55d9ad3f7490, cmd=0x0, vtysh=0) at lib/command.c:1080
> #18 0x00007f650cc94a0c in cmd_execute (vty=0x55d9ad3f7490, cmd=0x55d9ad401d30 "XFRR_end_configuration", matched=0x0, vtysh=0) at lib/command.c:1228
> #19 0x00007f650cd523a4 in vty_command (vty=0x55d9ad3f7490, buf=0x55d9ad401d30 "XFRR_end_configuration") at lib/vty.c:625
> #20 0x00007f650cd5413d in vty_execute (vty=0x55d9ad3f7490) at lib/vty.c:1388
> #21 0x00007f650cd56353 in vtysh_read (thread=0x7ffc56f99370) at lib/vty.c:2400
> #22 0x00007f650cd4b6fd in event_call (thread=0x7ffc56f99370) at lib/event.c:1996
> #23 0x00007f650ccd1365 in frr_run (master=0x55d9ad103cf0) at lib/libfrr.c:1231
> #24 0x000055d9ab29036e in main (argc=2, argv=0x7ffc56f99598, envp=0x7ffc56f995b0) at isisd/isis_main.c:354
Configuring the same in vtysh configure interactive mode works properly.
When using "vtysh -f", the northbound compatible configuration is
committed together whereas, in interactive mode, it committed line by
line. In the first situation, in validation state nb_running_get_entry()
fails because the area not yet in running.
Do not use nb_running_get_entry() northbound validation state.
Fixes: 893882ee20 ("isisd: add isis flex-algo configuration backend")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
isisd is crashing when reading a ASLA sub-TLV with Application
Identifier Bit Mask length greater than 1 octet.
Set a limit of 8 bytes in accordance with RFC9479 and check that the
received value does not exceed the limit.
Reported-by: Iggy Frankovic <iggyfran@amazon.com>
Link: https://www.rfc-editor.org/rfc/rfc9479.html#name-application-identifier-bit-
Fixes: 5749ac83a8 ("isisd: add ASLA support")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
When trying to track down a MTYPE_TMP memory leak
it's harder to search for it when you happen to
have some usage of ttable_dump. Let's just give
it it's own memory type so that we can avoid
confusion in the future.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
If the link-params are set when the circuit not yet up, the link-params
are never updated.
isis_link_params_update() is called from isis_circuit_up() but returns
immediately because circuit->state != C_STATE_UP. circuit->state is
updated in isis_csm_state_change after isis_circuit_up().
> struct isis_circuit *isis_csm_state_change(enum isis_circuit_event event,
> struct isis_circuit *circuit,
> void *arg)
> {
> [...]
> if (isis_circuit_up(circuit) != ISIS_OK) {
> isis_circuit_deconfigure(circuit, area);
> break;
> }
> circuit->state = C_STATE_UP;
> isis_event_circuit_state_change(circuit, circuit->area,
> 1);
Do not return isis_link_params_update() if circuit->state != C_STATE_UP.
Fixes: 0fdd8b2b11 ("isisd: update link params after circuit is up")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
The adj_process_threeway() api may call the adj_state_change()
api, which may delete the adj struct being examined. Change the
signature so that callers pass a ptr-to-ptr so that they will
see that deletion.
Signed-off-by: Mark Stapp <mjs@cisco.com>
When an color affinity is set on an interface before configuring the
flex-algorithm, the ASLA (Application Specific Link-Attribute) sub-TLV
is not build. Flex-algo fails to build the paths when a affinity
constraint is required because of the lacking of information contained
in ASLA. There are no problems when the configuration order is reversed.
For example:
> affinity-map red bit-position 1
>
> interface eth2
> link-params
> affinity red
>
> router isis 1
> mpls-te on
> flex-algo 129
> dataplane sr-mpls
> advertise-definition
> affinity include-any green
In isis_link_params_update_asla(), the ASLA sub-TLV is not build when
the list of flex-algos is empty.
Update ASLA when the first flex-algorithm is configured.
Fixes: 893882ee20 ("isisd: add isis flex-algo configuration backend")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Move flex_algo_delete() content into isis_instance_flex_algo_destroy()
because it is called only once.
Rename _flex_algo_delete to flex_algo_free()
Cosmetic change.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
According to draft-ietf-lsr-isis-srv6-extensions draft,
the End SID should be available in link state prefix
information.
Add the SID information in the link state prefix, by
getting the END SID from the locator TLV information.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
1. When the root IS regenerates an LSP, it calls lsp_build() -> lsp_clear_data() to free the TLV memory of the first fragment and all other fragments. If the number of fragments in the regenerated LSP decreases or if no fragmentation is needed, the extra LSP fragments are not immediately deleted. Instead, lsp_seqno_update() -> lsp_purge() is called to set the remaining time to zero and start aging, while also notifying other IS nodes to age these fragments. lsp_purge() usually does not reset lsp->hdr.seqno to zero because the LSP might recover during the aging process.
2. When other IS nodes receive an LSP, they always call process_lsp() -> isis_unpack_tlvs() to allocate TLV memory for the LSP. This does not differentiate whether the received LSP has a remaining lifetime of zero. Therefore, it is rare for an LSP of a non-root IS to have empty TLVs. Of course, if an LSP with a remaining time of zero and already corrupted is received, lsp_update() -> lsp_purge() will be called to free the TLV memory of the LSP, but this scenario is rare.
3. In LFA calculations, neighbors of the root IS are traversed, and each neighbor is taken as a new root to compute the neighbor SPT. During this process, the old root IS will serve as a neighbor of the new root IS, triggering a call to isis_spf_process_lsp() to parse the LSP of the old root IS and obtain its IP vertices and neighboring IS vertices. However, isis_spf_process_lsp() only checks whether the TLVs in the first fragment of the LSP exist, and does not check the TLVs in the fragmented LSP. If the TLV memory of the fragmented LSP of the old root IS has been freed, it can lead to a null pointer access, causing the current crash.
Additionally, for the base SPT, there are only two places where the LSP of the root IS is parsed:
1. When obtaining the UP neighbors of the root IS via spf_adj_list_parse_lsp().
2. When preloading the IP vertices of the root IS via isis_lsp_iterate_ip_reach().
Both of these checks ensure that frag->tlvs is not null, and they do not subsequently call isis_spf_process_lsp() to parse the root IS's LSP. It is very rare for non-root IS LSPs to have empty TLVs unless they are corrupted LSPs awaiting deletion. If it happens, a crash will occur.
The backtrace is as follows:
(gdb) bt
#0 0x00007f3097281fe1 in raise () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00007f30973a2972 in core_handler (signo=11, siginfo=0x7ffce66c2870, context=0x7ffce66c2740) at ../lib/sigevent.c:261
#2 <signal handler called>
#3 0x000055dfa805512b in isis_spf_process_lsp (spftree=0x55dfa950eee0, lsp=0x55dfa94cb590, cost=10, depth=1, root_sysid=0x55dfa950ef6c "", parent=0x55dfa952fca0)
at ../isisd/isis_spf.c:898
#4 0x000055dfa805743b in isis_spf_loop (spftree=0x55dfa950eee0, root_sysid=0x55dfa950ef6c "") at ../isisd/isis_spf.c:1688
#5 0x000055dfa805784f in isis_run_spf (spftree=0x55dfa950eee0) at ../isisd/isis_spf.c:1808
#6 0x000055dfa8037ff5 in isis_spf_run_neighbors (spftree=0x55dfa9474440) at ../isisd/isis_lfa.c:1259
#7 0x000055dfa803ac17 in isis_spf_run_lfa (area=0x55dfa9477510, spftree=0x55dfa9474440) at ../isisd/isis_lfa.c:2300
#8 0x000055dfa8057964 in isis_run_spf_with_protection (area=0x55dfa9477510, spftree=0x55dfa9474440) at ../isisd/isis_spf.c:1827
#9 0x000055dfa8057c15 in isis_run_spf_cb (thread=0x7ffce66c38e0) at ../isisd/isis_spf.c:1889
#10 0x00007f30973bbf04 in thread_call (thread=0x7ffce66c38e0) at ../lib/thread.c:1990
#11 0x00007f309735497b in frr_run (master=0x55dfa91733c0) at ../lib/libfrr.c:1198
#12 0x000055dfa8029d5d in main (argc=5, argv=0x7ffce66c3b08, envp=0x7ffce66c3b38) at ../isisd/isis_main.c:273
(gdb) f 3
#3 0x000055dfa805512b in isis_spf_process_lsp (spftree=0x55dfa950eee0, lsp=0x55dfa94cb590, cost=10, depth=1, root_sysid=0x55dfa950ef6c "", parent=0x55dfa952fca0)
at ../isisd/isis_spf.c:898
898 ../isisd/isis_spf.c: No such file or directory.
(gdb) p te_neighs
$1 = (struct isis_item_list *) 0x120
(gdb) p lsp->tlvs
$2 = (struct isis_tlvs *) 0x0
(gdb) p lsp->hdr
$3 = {pdu_len = 27, rem_lifetime = 0, lsp_id = "\000\000\000\000\000\001\000\001", seqno = 4, checksum = 59918, lsp_bits = 1 '\001'}
The backtrace provided above pertains to version 8.5.4, but it seems that the same issue exists in the code of the master branch as well.
I have reviewed the process for calculating the SPT based on the LSP, and isis_spf_process_lsp() is the only function that does not check whether the TLVs in the fragments are empty. Therefore, I believe that modifying this function alone should be sufficient. If the TLVs of the current fragment are already empty, we do not need to continue processing subsequent fragments. This is consistent with the behavior where we do not process fragments if the TLVs of the first fragment are empty.
Of course, one could argue that lsp_purge() should still retain the TLV memory, freeing it and then reallocating it if needed. However, this is a debatable point because in some scenarios, it is permissible for the LSP to have empty TLVs. For example, after receiving an SNP (Sequence Number PDU) message, an empty LSP (with lsp->hdr.seqno = 0) might be created by calling lsp_new. If the corresponding LSP message is discarded due to domain or area authentication failure, the TLV memory wouldn't be allocated.
Test scenario:
In an LFA network, importing a sufficient number of static routes to cause LSP fragmentation, and then rolling back the imported static routes so that the LSP is no longer fragmented, can easily result in this issue.
Signed-off-by: zhou-run <zhou.run@h3c.com>
1. The lsp_update_data() function will check for the presence of the ISIS_TLV_DYNAMIC_HOSTNAME in the LSP, and then call isis_dynhn_insert() to add a hostname entry corresponding to the LSP ID. However, when the ISIS_TLV_DYNAMIC_HOSTNAME is not present in the LSP, the hostname entry corresponding to the LSP ID should also be deleted.
2. The command “show isis neighbor” invokes isis_adj_name() to display the System ID or hostname, but it does not check the area->dynhostname flag.
3. When the LSP expires and is removed, the corresponding hostname entry should also be deleted.
4. The TLV for LSP fragmentation will not contain the hostname and should be skipped.
Signed-off-by: zhou-run <zhou.run@h3c.com>
When a neighbor connection is disconnected, it may trigger LSP re-generation as a timer task, but this process may be delayed. As a result, the list of neighbors in area->adjacency_list may be inconsistent with the neighbors in lsp->tlvs->oldstyle_reach/extended_reach. For example, the area->adjacency_list may lack certain neighbors even though they are present in the LSP. When computing SPF, the call to isis_spf_build_adj_list() generates the spftree->sadj_list, which reflects the real neighbors in the area->adjacency_list. However, in the case of LAN links, spftree->sadj_list may include additional pseudo neighbors.
The pre-loading of tents through the call to isis_spf_preload_tent involves two steps:
1. isis_spf_process_lsp() is called to generate real neighbor vertices based on the root LSP and pseudo LSP.
2. isis_spf_add_local() is called to add corresponding next hops to the vertex->Adj_N list for the real neighbor vertices.
In the case of LAN links, the absence of corresponding real neighbors in the spftree->sadj_list prevents the execution of the second step. Consequently, the vertex->Adj_N list for the real neighbor vertices lacks corresponding next hops. This leads to a null pointer access when isis_lfa_compute() is called to calculate LFA.
As for P2P links, since there are no pseudo neighbors, only the second step is executed, which does not create real neighbor vertices and therefore does not encounter this issue.
The backtrace is as follows:
(gdb) bt
#0 0x00007fd065277fe1 in raise () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00007fd065398972 in core_handler (signo=11, siginfo=0x7ffc5c0636b0, context=0x7ffc5c063580) at ../lib/sigevent.c:261
#2 <signal handler called>
#3 0x00005564d82f8408 in isis_lfa_compute (area=0x5564d8b143f0, circuit=0x5564d8b21d10, spftree=0x5564d8b06bf0, resource=0x7ffc5c064410) at ../isisd/isis_lfa.c:2134
#4 0x00005564d82f8d78 in isis_spf_run_lfa (area=0x5564d8b143f0, spftree=0x5564d8b06bf0) at ../isisd/isis_lfa.c:2344
#5 0x00005564d8315964 in isis_run_spf_with_protection (area=0x5564d8b143f0, spftree=0x5564d8b06bf0) at ../isisd/isis_spf.c:1827
#6 0x00005564d8315c15 in isis_run_spf_cb (thread=0x7ffc5c064590) at ../isisd/isis_spf.c:1889
#7 0x00007fd0653b1f04 in thread_call (thread=0x7ffc5c064590) at ../lib/thread.c:1990
#8 0x00007fd06534a97b in frr_run (master=0x5564d88103c0) at ../lib/libfrr.c:1198
#9 0x00005564d82e7d5d in main (argc=5, argv=0x7ffc5c0647b8, envp=0x7ffc5c0647e8) at ../isisd/isis_main.c:273
(gdb) f 3
#3 0x00005564d82f8408 in isis_lfa_compute (area=0x5564d8b143f0, circuit=0x5564d8b21d10, spftree=0x5564d8b06bf0, resource=0x7ffc5c064410) at ../isisd/isis_lfa.c:2134
2134 ../isisd/isis_lfa.c: No such file or directory.
(gdb) p vadj_primary
$1 = (struct isis_vertex_adj *) 0x0
(gdb) p vertex->Adj_N->head
$2 = (struct listnode *) 0x0
(gdb) p (struct isis_vertex *)spftree->paths->l.list->head->next->next->next->next->data
$8 = (struct isis_vertex *) 0x5564d8b5b240
(gdb) p $8->type
$9 = VTYPE_NONPSEUDO_TE_IS
(gdb) p $8->N.id
$10 = "\000\000\000\000\000\002"
(gdb) p $8->Adj_N->count
$11 = 0
(gdb) p (struct isis_vertex *)spftree->paths->l.list->head->next->next->next->next->next->data
$12 = (struct isis_vertex *) 0x5564d8b73dd0
(gdb) p $12->type
$13 = VTYPE_NONPSEUDO_TE_IS
(gdb) p $12->N.id
$14 = "\000\000\000\000\000\003"
(gdb) p $12->Adj_N->count
$15 = 0
(gdb) p area->adjacency_list->count
$16 = 0
The backtrace provided above pertains to version 8.5.4, but it seems that the same issue exists in the code of the master branch as well.
The scenario where a vertex has no next hop is normal. For example, the "clear isis neighbor" command invokes isis_vertex_adj_del() to delete the next hop of a vertex. Upon reviewing all the instances where the vertex->Adj_N list is used, I found that only isis_lfa_compute() lacks a null check. Therefore, I believe that modifying this part will be sufficient. Additionally, the vertex->parents list for IP vertices is guaranteed not to be empty.
Test scenario:
Setting up LFA for LAN links and executing the "clear isis neighbor" command easily reproduces the issue.
Signed-off-by: zhou-run <zhou.run@h3c.com>
d5879267aa ("isisd: fix show database json format") renamed JSON keys to
a standard format but forgot to rename the neighbor-id key.
Fixes: d5879267aa ("isisd: fix show database json format")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
The variable flags_json was incorrectly named, leading to confusion and
causing the bug fixed in the previous commit.
Rename the variable to refer to SRv6 End SID instead. Cosmetic change.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
The json format for json routes should be compliant with caml format.
Before:
> "Prefix|Metric|Interface|Nexthop|SID|LabelOp|Algo":
> "Prefix|Metric|Interface|Nexthop|Label(s)");
After:
> "prefix|metric|interface|nextHop|segmentIdentifier|labelOperation|Algorithm":
> "prefix|metric|interface|nextHop|label(s)");
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The command "show isis topology" calls print_sys_hostname() to display the system ID or hostname, but it does not check the area->dynhostname flag.
Signed-off-by: zhou-run <zhou.run@h3c.com>
In the near future, some daemons may only register SIDs. This may be
the case for the pathd daemon when creating SRv6 binding SIDs.
When a locator is getting deleted at ZEBRA level, the daemon may have
an easy way to find out the SIds to unregister to.
This commit proposes to add the locator name to the SID_SRV6_NOTIFY
message whenever possible. Only case when an allocation failure happens,
the locator will not be present. In all other places, the notify API
at procol levels has the locator name extra-parameter.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
Zebra sends a SRV6_SID_NOTIFY notification to inform clients about the
result of a SID alloc/release operation. This commit adds a handler to
process a SRV6_SID_NOTIFY notification received from zebra.
If the notification indicates that a SID allocation operation was
successful, then it stores the allocated SID in the SRv6 database,
installs the SID into the RIB, and advertises the SID to the other IS-IS
routers.
If the notification indicates that an operation has failed, it logs the
error.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
Currently, IS-IS allocates SIDs without interacting with Zebra.
Recently, the SRv6 implementation has been improved. Now, the daemons
need to interact with Zebra through ZAPI to obtain and release SIDs.
This commit extends IS-IS to release SIDs to Zebra when they are no
longer needed.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
Currently, IS-IS allocates SIDs without interacting with Zebra.
Recently, the SRv6 implementation has been improved. Now, the daemons
need to interact with Zebra through ZAPI to obtain and release SIDs.
This commit extends IS-IS to request SIDs from Zebra instead of
allocating the SIDs on its own.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
This commit extends IS-IS to process locator information received from
SRv6 Manager (zebra) and save the locator info in the SRv6 database.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>