I'm seeing this crash in various forms:
Program terminated with signal SIGSEGV, Segmentation fault.
50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
[Current thread is 1 (Thread 0x7f418efbc7c0 (LWP 3580253))]
(gdb) bt
(gdb) f 4
267 (*func)(hb, arg);
(gdb) p hb
$1 = (struct hash_bucket *) 0x558cdaafb250
(gdb) p *hb
$2 = {len = 0, next = 0x0, key = 0, data = 0x0}
(gdb)
I've also seen a crash where data is 0x03.
My suspicion is that hash_iterate is calling zebra_nhg_sweep_entry which
does delete the particular entry we are looking at as well as possibly other
entries when the ref count for those entries gets set to 0 as well.
Then we have this loop in hash_iterate.c:
for (i = 0; i < hash->size; i++)
for (hb = hash->index[i]; hb; hb = hbnext) {
/* get pointer to next hash bucket here, in case (*func)
* decides to delete hb by calling hash_release
*/
hbnext = hb->next;
(*func)(hb, arg);
}
Suppose in the previous loop hbnext is set to hb->next and we call
zebra_nhg_sweep_entry. This deletes the previous entry and also
happens to cause the hbnext entry to be deleted as well, because of nhg
refcounts. At this point in time the memory pointed to by hbnext is
not owned by the pthread anymore and we can end up on a state where
it's overwritten by another pthread in zebra with data for other incoming events.
What to do? Let's change the sweep function to a hash_walk and have
it stop iterating and to start over if there is a possible double
delete operation.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 07b9ebca65)
This patch repairs the checking conditions on length in four functions:
babel_packet_examin, parse_hello_subtlv, parse_ihu_subtlv, and parse_update_subtlv
Signed-off-by: qingkaishi <qingkaishi@gmail.com>
(cherry picked from commit c3793352a8)
`show ip pim assert` shows S,G ifchannel information even when
there is no information available about the assert process.
Fix the code to not dump non-interesting cases.
Fixes: 10462
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 35e5ef55f1)
In all places that pim_nht_bsr_del is called, the code
needs to not unregister if the current_bsr is INADDR_ANY.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 5f010b1205)
I'm now seeing in my log file:
2022/01/28 11:20:05 PIM: [Q0PZ7-QBBN3] attempting to delete nonexistent NHT BSR entry 0.0.0.0
2022/01/28 11:20:05 PIM: [Q0PZ7-QBBN3] attempting to delete nonexistent NHT BSR entry 0.0.0.0
2022/01/28 11:20:06 PIM: [Q0PZ7-QBBN3] attempting to delete nonexistent NHT BSR entry 0.0.0.0
When I run pimd. Looking at the code there are 3 places where pim_bsm.c removes the
NHT BSR tracking. In 2 of them the code ensures that the address is already setup
in 1 place it is not. Fix.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 2d51f27f02)
The ctx->zd_is_update is being set in various
spots based upon the same value that we are
passing into dplane_ctx_ns_init. Let's just
consolidate all this into the dplane_ctx_ns_init
so that the zd_is_udpate value is set at the
same time that we increment the sequence numbers
to use.
As a note for future me's reading this. The sequence
number choosen for the seq number passed to the
kernel is that each context gets a copy of the
appropriate nlsock to use. Since it's a copy
at a point in time, we know we have a unique sequence
number value.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit e3ee55d4bd)
When nl_batch_read_resp gets a full on failure -1 or an implicit
ack 0 from the kernel for a batch of code. Let's immediately
mark all of those in the batch pass/fail as needed. Instead
of having them marked else where.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 00249e255e)
When adding a nhg to a route map, make sure to specify the `family`
of the rm by looking at the contents of the nhg. Installation in the
kernel (for DSCP rules in particular) relies on this being specified in
the netlink message.
Signed-off-by: Wesley Coakley <wcoakley@nvidia.com>
Signed-off-by: Stephen Worley <sworley@nvidia.com>
(cherry picked from commit 9a7ea213c0)
Skip marking routes as changed in ospf_if_down if there's now
new_table present, which might be the case when the instance is
being finished
The backtrace for the core was:
raise (sig=sig@entry=11) at ../sysdeps/unix/sysv/linux/raise.c:50
core_handler (signo=11, siginfo=0x7fffffffe170, context=<optimized out>) at lib/sigevent.c:262
<signal handler called>
route_top (table=0x0) at lib/table.c:401
ospf_if_down (oi=oi@entry=0x555555999090) at ospfd/ospf_interface.c:849
ospf_if_free (oi=0x555555999090) at ospfd/ospf_interface.c:339
ospf_finish_final (ospf=0x55555599c830) at ospfd/ospfd.c:749
ospf_deferred_shutdown_finish (ospf=0x55555599c830) at ospfd/ospfd.c:578
ospf_deferred_shutdown_check (ospf=<optimized out>) at ospfd/ospfd.c:627
ospf_finish (ospf=<optimized out>) at ospfd/ospfd.c:683
ospf_terminate () at ospfd/ospfd.c:653
sigint () at ospfd/ospf_main.c:109
quagga_sigevent_process () at lib/sigevent.c:130
thread_fetch (m=m@entry=0x5555556e45e0, fetch=fetch@entry=0x7fffffffe9b0) at lib/thread.c:1709
frr_run (master=0x5555556e45e0) at lib/libfrr.c:1174
main (argc=9, argv=0x7fffffffecb8) at ospfd/ospf_main.c:254
Signed-off-by: Tomi Salminen <tsalminen@forcepoint.com>
(cherry picked from commit d4e66f1485)
Current code treats all metaqueues as lists of route_node structures.
However, some queues contain other structures that need to be cleaned up
differently. Casting the elements of those queues to struct route_node
and dereferencing them leads to a crash. The crash may be seen when
executing bgp_multi_vrf_topo2.
Fix the code by using the proper list element types.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
(cherry picked from commit 0ef6eacc95)
Description:
Replacing memcmp at certain places,
to avoid the coverity issues caused by it.
Co-authored-by: Kantesh Mundargi <kmundaragi@vmware.com>
Signed-off-by: Iqra Siddiqui <imujeebsiddi@vmware.com>
FRR stores the route_tag in network byte order. Bug filed indicates
that the `show ip ospf route` command shows the correct value.
Every place route_tag is dumped in ospf_vty.c the ntohl function
is used first.
Fixes: #10450
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
EVPN route add should be queued to preserve the config order.
In particular, against deletion in rib_delete().
Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
Replace custom implementation or call to ipaddr_isset with a call to
ipaddr_is_zero.
ipaddr_isset is not fully correct, because it's fine to have some
non-zero bytes at the end of the struct in case of IPv4 and the function
doesn't allow that.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
When setting maximum-prefix-out on peer-group, the applied value on
member is 0.
Fix usage of maximum-prefix-out on peer-group.
The peer_maximum_prefix_out_(un)set functions are derived from
peer_maximum_prefix_(un)set.
Fixes: fde246e835 ("bgpd: Add an option to limit outgoing prefixes")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Test the ability to use the following configure command with a Y value:
no neighbor X.X.X.X maximum-prefix-out Y
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Specifying a number is not possible with command no neighbor X.X.X.X
maximum-prefix-out
> frr(config-router-af)# no neighbor 192.168.1.2 maximum-prefix-out 1
> % Unknown command: no neighbor 192.168.1.2 maximum-prefix-out 1
This patch allows it.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>