If the pimreg device exists but has not yet been assigned to the pim->pimreg
pointer, we can crash. Just prevent the crash, since this is some sort of
startup / network reorganization issue.
(gdb) bt
0 0x00007f0485b035cb in raise () from /lib/x86_64-linux-gnu/libpthread.so.0
1 0x00007f0485c0fbec in core_handler (signo=6, siginfo=0x7ffdc0198030, context=<optimized out>) at lib/sigevent.c:264
2 <signal handler called>
3 0x00007f04859668eb in raise () from /lib/x86_64-linux-gnu/libc.so.6
4 0x00007f0485951535 in abort () from /lib/x86_64-linux-gnu/libc.so.6
5 0x00007f0485c3af76 in _zlog_assert_failed (xref=xref@entry=0x55692269b940 <_xref.23164>, extra=extra@entry=0x0) at lib/zlog.c:680
6 0x00005569226150d0 in pim_if_new (ifp=0x556922c82900, gm=gm@entry=false, pim=pim@entry=false, ispimreg=ispimreg@entry=true,
is_vxlan_term=is_vxlan_term@entry=false) at pimd/pim_iface.c:124
7 0x0000556922615140 in pim_if_create_pimreg (pim=pim@entry=0x556922cc11e0) at pimd/pim_iface.c:1549
8 0x0000556922616bc8 in pim_if_create_pimreg (pim=0x556922cc11e0) at pimd/pim_iface.c:1613
9 pim_ifp_create (ifp=0x556922cc0e70) at pimd/pim_iface.c:1641
10 0x00007f0485c32cf9 in zclient_interface_add (cmd=<optimized out>, zclient=<optimized out>, length=<optimized out>, vrf_id=77) at lib/zclient.c:2214
11 0x00007f0485c3346a in zclient_read (thread=<optimized out>) at lib/zclient.c:4003
12 0x00007f0485c215ed in thread_call (thread=thread@entry=0x7ffdc0198880) at lib/thread.c:2008
13 0x00007f0485bdbbc8 in frr_run (master=0x556922a10470) at lib/libfrr.c:1223
14 0x000055692260312b in main (argc=<optimized out>, argv=0x7ffdc0198b98, envp=<optimized out>) at pimd/pim_main.c:176
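A minimal sketch of the kind of guard described above, using simplified stand-in types and a hypothetical helper name (not the actual FRR diff):
    /* Sketch only: simplified stand-ins for the FRR structures. */
    struct pim_interface;                       /* per-interface PIM state (opaque) */
    struct interface { struct pim_interface *info; };
    struct pim_instance { struct interface *pimreg; };
    /*
     * The pimreg netdev may already exist (e.g. after a startup / network
     * reorganization) while pim->pimreg was never populated.  Adopt the
     * existing device instead of asserting in pim_if_new().
     */
    static int pimreg_adopt_existing(struct pim_instance *pim,
                                     struct interface *existing_pimreg)
    {
        if (existing_pimreg && existing_pimreg->info && !pim->pimreg) {
            pim->pimreg = existing_pimreg;  /* reuse it, do not re-create */
            return 1;                       /* handled */
        }
        return 0;                           /* proceed with normal creation */
    }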
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Topology:
========
FHR----Source
Problem:
=======
When the FHR receives multicast traffic and no RP is configured,
PIMD registers NHT for RP address 0.0.0.0 and group 224.0.0.0/4, and
PIM6D registers NHT for RP address 0::0 and group FF00::0/8.
frr# show ip pim nexthop
Number of registered addresses: 1
Address Interface Nexthop
---------------------------------------------
frr# show ipv6 pim nexthop
Number of registered addresses: 1
Address Interface Nexthop
---------------------------------------------
Fix:
====
Don't track the nexthop for RP 0.0.0.0 and 0::0.
frr# show ip pim nexthop
Number of registered addresses: 0
frr# show ipv6 pim nexthop
Number of registered addresses: 0
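A minimal sketch of the check described in the fix (the helper and its placement are assumptions, not the actual FRR diff):
    #include <stdbool.h>
    #include <string.h>
    /* Return true when the RP address is all zeroes (0.0.0.0 or 0::0),
     * i.e. no real RP is configured and NHT registration should be skipped. */
    static bool rp_is_unset(const void *rp_addr, size_t addr_len)
    {
        static const unsigned char zero[16];
        return memcmp(rp_addr, zero, addr_len) == 0;
    }
    /* Caller sketch:
     *   if (rp_is_unset(&rp, sizeof(rp)))
     *       return;   // do not register nexthop tracking for 0.0.0.0 / 0::0
     */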
Issue: #12104
Signed-off-by: Sarita Patra <saritap@vmware.com>
Topology:
RP---FHR---Source
Problem Statement:
1. On the FHR, enable PIM and IGMP on the source connected interface.
2. Start multicast traffic. The (S,G) mroute and upstream will be created as expected.
3. Disable PIM on the source connected interface.
4. Disable IGMP on the source connected interface.
5. Stop the traffic.
The mroute will never time out.
Root Cause:
On the FHR, when PIMD receives a multicast data packet on a
source connected interface that is IGMP enabled but not PIM
enabled, PIMD processes the packet, installs the mroute and
starts the KAT timer.
Fix:
Don't process multicast data packets received on a PIM-disabled interface.
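A minimal sketch of the intended check, with assumed field names (not the actual FRR structures):
    #include <stdbool.h>
    /* Assumed, simplified per-interface state. */
    struct iface_mcast_state {
        bool pim_enabled;
        bool igmp_enabled;
    };
    /* Accept multicast data packets only when PIM is enabled on the incoming
     * interface; IGMP alone is not enough.  Otherwise the FHR installs an
     * (S,G) mroute and keeps restarting the KAT, so it never expires. */
    static bool mcast_data_pkt_accept(const struct iface_mcast_state *st)
    {
        return st != NULL && st->pim_enabled;
    }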
Signed-off-by: Sarita Patra <saritap@vmware.com>
Topology:
=========
RP---FHR<ens224>---Source
Problem Statement:
=================
Step 1:
Enable PIM and IGMP on source connected interface ens224
Step 2:
Start multicast traffic. (s,g) mroute and upstream will be created as expected.
dev# show ip mroute
IP Multicast Routing Table
Flags: S - Sparse, C - Connected, P - Pruned
R - SGRpt Pruned, F - Register flag, T - SPT-bit for SSM FHR
Source Group Flags Proto Input Output TTL Uptime
50.0.0.4 225.1.1.1 SF PIM ens224 pimreg 1 00:37:55
dev# show ip pim upstream
Iif Source Group State Uptime JoinTimer RSTimer KATimer RefCnt
ens224 50.0.0.4 225.1.1.1 NotJ,RegJ 00:37:57 --:--:-- --:--:-- 00:02:43 1
Step 3:
Disable PIM on source connected interface ens224
dev# show ip mroute
IP Multicast Routing Table
Flags: S - Sparse, C - Connected, P - Pruned
R - SGRpt Pruned, F - Register flag, T - SPT-bit for SSM FHR
Source Group Flags Proto Input Output TTL Uptime
50.0.0.4 225.1.1.1 SF PIM ens224 pimreg 1 00:38:05
dev# show ip pim upstream
Iif Source Group State Uptime JoinTimer RSTimer KATimer RefCnt
ens224 50.0.0.4 225.1.1.1 NotJ,RegJ 00:38:08 --:--:-- --:--:-- 00:02:32 1
Step 4:
Disable IGMP on source connected interface ens224
dev# show ip pim upstream
Iif Source Group State Uptime JoinTimer RSTimer KATimer RefCnt
ens224 50.0.0.4 225.1.1.1 NotJ,RegJ 00:38:15 --:--:-- --:--:-- 00:03:27 1
dev# show ip mroute
IP Multicast Routing Table
Flags: S - Sparse, C - Connected, P - Pruned
R - SGRpt Pruned, F - Register flag, T - SPT-bit for SSM FHR
Source Group Flags Proto Input Output TTL Uptime
50.0.0.4 225.1.1.1 SF PIM <iif?> pimreg 1 00:38:18
The PIM upstream IIF still points towards the source connected
interface, which is neither PIM enabled nor IGMP enabled. The
mroute is still present in the kernel and the KAT timer is still
running on the interface, where ifp->info is already set to NULL.
This leads to a crash.
Root Cause:
==========
When "no ip pim" commands get executed on source connected interface,
we are updating upstream IIF only when IGMP is not enabled on the same
interface.
Fix:
===
When PIM is disabled on the source connected interface, update the upstream
IIF regardless of whether IGMP is enabled on the same interface.
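A minimal sketch of the changed decision (hypothetical helper, not the actual FRR diff):
    #include <stdbool.h>
    /* Old behaviour: the upstream IIF was updated only when IGMP was also
     * disabled on the interface.  New behaviour: disabling PIM is enough. */
    static bool must_update_upstream_iif(bool pim_enabled, bool igmp_enabled)
    {
        (void)igmp_enabled;         /* no longer part of the decision */
        return !pim_enabled;
    }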
Issue: #12848
Issue: #10782
Signed-off-by: Sarita Patra <saritap@vmware.com>
Before fix:
==========
frr# show ipv6 mld interface
Interface State V Querier Timer Uptime
ens224 up 1 fe80::250:56ff:feb7:a7e3 query 00:00:24.219 00:00:07.031
After fix:
=========
frr(config-if)# do show ipv6 mld interface
Interface State Address V Querier QuerierIp Query Timer Uptime
ens224 up fe80::250:56ff:feb7:a7e3 1 local fe80::250:56ff:feb7:a7e3 00:01:22.263 00:08:00.237
Issue: #11241
Signed-off-by: Sarita Patra <saritap@vmware.com>
We should not display down interfaces or MLD-disabled interfaces in
the "show ipv6 mld interface" command.
Before fix:
==========
frr# show ipv6 mld interface
Interface State V Querier Timer Uptime
ens192 up 2 fe80::250:56ff:feb7:d04 query 00:00:25.432 00:00:07.038
ens224 up 1 fe80::250:56ff:feb7:a7e3 query 00:00:24.219 00:00:07.031
pim6reg down
After fix:
=========
frr# show ipv6 mld interface
Interface State V Querier Timer Uptime
ens192 up 2 fe80::250:56ff:feb7:d04 query 00:00:25.432 00:00:07.038
ens224 up 1 fe80::250:56ff:feb7:a7e3 query 00:00:24.219 00:00:07.031
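A minimal sketch of the display filter (assumed fields, not the actual FRR code):
    #include <stdbool.h>
    /* Assumed, simplified view of an interface for display purposes. */
    struct mld_show_iface {
        bool is_up;
        void *mld_state;   /* NULL when MLD is not enabled */
    };
    /* Skip interfaces that are down or have no MLD state, so they no longer
     * appear in "show ipv6 mld interface". */
    static bool mld_iface_displayable(const struct mld_show_iface *ifc)
    {
        return ifc->is_up && ifc->mld_state != NULL;
    }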
Issue: #11241
Signed-off-by: Sarita Patra <saritap@vmware.com>
When the upstream RPF address is a secondary address and the
neighborship is built with the primary address,
pim_neighbor_find() fails.
Verify that the upstream RPF address is present in the neighbor's
primary and secondary address lists.
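A minimal sketch of the lookup described: check the primary address first, then walk the neighbor's secondary address list (types and sizes are simplified assumptions, not the actual FRR code):
    #include <stdbool.h>
    #include <string.h>
    #include <netinet/in.h>
    /* Assumed, simplified neighbor representation. */
    struct nbr_entry {
        struct in6_addr primary;
        struct in6_addr secondary[8];   /* secondary address list */
        int n_secondary;
    };
    /* True when `addr` matches the neighbor's primary address or any of its
     * advertised secondary addresses. */
    static bool nbr_has_address(const struct nbr_entry *nbr,
                                const struct in6_addr *addr)
    {
        if (memcmp(&nbr->primary, addr, sizeof(*addr)) == 0)
            return true;
        for (int i = 0; i < nbr->n_secondary; i++)
            if (memcmp(&nbr->secondary[i], addr, sizeof(*addr)) == 0)
                return true;
        return false;
    }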
Signed-off-by: Sarita Patra <saritap@vmware.com>
When the upstream RPF address is a secondary address and the
neighborship is built with the primary address,
pim_neighbor_find() fails, so when there is an upstream
change it won't send a prune.
Verify that the nexthop is present in the neighbor's primary
and secondary address lists.
Signed-off-by: Sarita Patra <saritap@vmware.com>
When there is a mismatch between the nexthop address (secondary address)
and the neighborship address (primary address) on the same interface,
the RPF check fails.
This is fixed now.
Signed-off-by: Sarita Patra <saritap@vmware.com>
When the route to the RP has a secondary-address nexthop and the
neighborship is built with the primary address,
pim_neighbor_find() fails, which causes the RP IIF to be
Unknown.
Fix:
Verify PIM neighborship on the RP connected interface.
Issue: #11526
Signed-off-by: Sarita Patra <saritap@vmware.com>
Problem 1:
When the route to the BSR has a secondary-address nexthop and the
neighborship is built with the primary address,
pim_neighbor_find() fails, which causes the BSM packet
to be dropped.
Fix 1:
Verify PIM neighborship on the interface on which the BSM was received.
Problem 2:
The BSM source IP address is a primary address, whereas the
nexthop can be either a primary or a secondary address.
Fix 2:
Avoid the check (nhaddr == src_ip) for PIMv6.
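A rough sketch of Fix 2 (PIM_IPV is assumed to be the per-address-family build switch, 4 for pimd and 6 for pim6d; names are illustrative):
    #include <stdbool.h>
    #include <string.h>
    /* IPv4 keeps the strict (nhaddr == src_ip) check; IPv6 skips it because
     * the nexthop may legitimately be a secondary address that differs from
     * the BSM source address. */
    static bool bsm_nexthop_acceptable(const void *nhaddr, const void *src_ip,
                                       size_t len)
    {
    #if PIM_IPV == 4
        return memcmp(nhaddr, src_ip, len) == 0;
    #else
        (void)nhaddr; (void)src_ip; (void)len;
        return true;
    #endif
    }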
Issue: #11957
Signed-off-by: Sarita Patra <saritap@vmware.com>
For incoming no-receiver SSM traffic, there isn't going to be an RP, much
less an RPF. We should install an MFC entry with an empty oif regardless,
so we don't get swamped with further notifications.
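A sketch of the idea using the kernel's standard multicast-routing socket option directly, rather than FRR's own wrappers (variable names are illustrative):
    #include <string.h>
    #include <netinet/in.h>
    #include <sys/socket.h>
    #include <linux/mroute.h>
    /* Install an (S,G) MFC entry with an empty oif list: all TTLs stay 0, so
     * nothing is forwarded, but the kernel stops raising NOCACHE upcalls for
     * this flow. */
    static int mfc_install_empty_oil(int mroute_sock, struct in_addr src,
                                     struct in_addr grp, unsigned short iif)
    {
        struct mfcctl mc;
        memset(&mc, 0, sizeof(mc));     /* mfcc_ttls[] all zero -> empty oil */
        mc.mfcc_origin = src;
        mc.mfcc_mcastgrp = grp;
        mc.mfcc_parent = iif;
        return setsockopt(mroute_sock, IPPROTO_IP, MRT_ADD_MFC, &mc, sizeof(mc));
    }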
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Whether due to a pimd bug, some expiry, or someone just deleting MFC
entries, when we're in NOCACHE we *know* there's no MFC entry. Add an
install call to make sure pimd's MFC view aligns with the actual kernel
MFC.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
This path is pretty high up on the list of issues that operators
will run into and have to debug when setting up PIM. Make the log
messages actually tell what's going on. Also escalate some from
`debug mroute detail` to `debug mroute`.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Coverity complains that MLAG_MSG_NONE cannot be reached in
the switch statement, which is true, so let's make it happy.
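An illustration of the pattern (the enum values other than MLAG_MSG_NONE are hypothetical):
    /* Hypothetical message types for illustration only. */
    enum mlag_msg_type { MLAG_MSG_NONE = 0, MLAG_MSG_REGISTER, MLAG_MSG_UPDATE };
    static void mlag_msg_dispatch(enum mlag_msg_type type)
    {
        switch (type) {
        case MLAG_MSG_NONE:
            break;          /* unreachable in practice; keeps Coverity quiet */
        case MLAG_MSG_REGISTER:
            /* handle registration */
            break;
        case MLAG_MSG_UPDATE:
            /* handle update */
            break;
        }
    }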
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The files converted in this commit either had some random misspelling or
formatting weirdness that made them escape automated replacement, or
have a particularly "weird" licensing setup (e.g. dual-licensed.)
This also marks a bunch of "public domain" files as SPDX License "NONE".
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Passing a pre-formatted buffer in these places needs a `"%s"` in front
so it doesn't get formatted twice.
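For example, with printf() standing in for the logging call in question (the buffer contents are illustrative):
    #include <stdio.h>
    void log_example(const char *user_text)
    {
        char msg[256];
        snprintf(msg, sizeof(msg), "rejecting input: %s", user_text);
        printf(msg);          /* wrong: msg is treated as a format string, so
                                 any '%' that came in via user_text is expanded again */
        printf("%s\n", msg);  /* right: msg is printed verbatim */
    }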
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Use `getpid()` to initialize the sequence number. This change silences
the Coverity Scan warning about the truncated use of `time()`, which in
this case is not a problem.
Found by Coverity Scan (CID 1519828)
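A sketch of the change (the surrounding context and field are assumptions):
    #include <stdint.h>
    #include <unistd.h>
    /* Seed the sequence number from the process ID rather than time(); the
     * value only needs to differ between runs, and this avoids the Coverity
     * complaint about a truncated time_t conversion. */
    static uint32_t initial_sequence_number(void)
    {
        return (uint32_t)getpid();
    }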
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Topology:
========
Receiver----R1----(ens192)R2(ens224)----R3----R4----Source
             |__________________________|   (direct connected link between R1 and R3)
R1=LHR
R2=RP
R4=FHR
Problem:
=======
1. The direct connected link between R1 and R3 is down initially.
2. So the traffic flow path is R4<->R3<->R2<->R1<->Receiver.
3. Mroutes are properly created on all the nodes.
4. Bring up the direct connected link between R1 and R3.
5. Traffic now flows along both paths:
   R4<->R3<->R2<->R1<->Receiver
   R4<->R3<->R1<->Receiver
6. Duplicate traffic is received at the receiver.
Root Cause:
==========
Initially, when the direct connected link between R1 and R3 is
down, traffic flows via the RP (R2). So on the RP the (S,G) is
installed with IIF ens224 and OIF ens192 (reference = 2), with the
masks PIM_OIF_FLAG_PROTO_STAR and PIM_OIF_FLAG_PROTO_PIM.
Now when the direct link between R1 and R3 comes up, the LHR (R1)
sends an SGRpt prune. After the prune is received, the RP (R2)
removes OIF ens192 with mask PIM_OIF_FLAG_PROTO_STAR.
Since OIF ens192 is still present with mask PIM_OIF_FLAG_PROTO_PIM,
the RP (R2) will not send a prune towards R3.
So traffic continues to flow along the path R4<->R3<->R2<->R1<->Receiver.
Fix:
====
When an SGRpt prune is received, remove the OIF irrespective of whether it
is installed with mask "PIM_OIF_FLAG_PROTO_STAR" or "PIM_OIF_FLAG_PROTO_PIM".
Once the OIF is removed, the RP sends a prune towards R3.
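A minimal sketch of the fix (the flag names come from the text above; their numeric values and the helper are illustrative):
    /* Illustrative values; the real definitions live in pimd. */
    #define PIM_OIF_FLAG_PROTO_STAR 0x1
    #define PIM_OIF_FLAG_PROTO_PIM  0x2
    /* On an SGRpt prune, clear both protocol ownership flags so the OIF is
     * really removed; previously only PROTO_STAR was cleared, leaving the OIF
     * in place under PROTO_PIM and suppressing the prune towards the source side. */
    static unsigned int oif_flags_after_sgrpt_prune(unsigned int flags)
    {
        return flags & ~(unsigned int)(PIM_OIF_FLAG_PROTO_STAR |
                                       PIM_OIF_FLAG_PROTO_PIM);
    }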
Issue: #11347
Signed-off-by: Sarita Patra <saritap@vmware.com>
Add some safeguards to avoid crashes and alert us about programming
errors in packet building.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Increase the MSDP peer stream buffer size to handle the whole TLV
(the maximum is 65535 bytes due to the 16-bit length field). If the stream
is not resized, there will be a crash in the read function when it attempts
to put in more than 9192 (`PIM_MSDP_SA_TLV_MAX_SIZE`) bytes.
According to RFC 3618 Section 12 we should accept the TLV and we
should not reset the peer connection.
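A simplified stand-in for the resize-before-read idea (FRR's stream API is not reproduced here; this just shows the sizing rule):
    #include <stdint.h>
    #include <stdlib.h>
    /* Grow the receive buffer to the advertised TLV length (16-bit field, so
     * at most 65535 bytes) before reading the body, instead of capping it at
     * PIM_MSDP_SA_TLV_MAX_SIZE. */
    static void *ensure_tlv_buffer(void *buf, size_t *cur_size, uint16_t tlv_len)
    {
        if (*cur_size < tlv_len) {
            void *nbuf = realloc(buf, tlv_len);
            if (!nbuf)
                return buf;      /* keep the old buffer on allocation failure */
            buf = nbuf;
            *cur_size = tlv_len;
        }
        return buf;
    }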
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
As per RFC 7761, Section 4.9.1
The RPT (or Rendezvous Point Tree) bit is a 1-bit value for use
with PIM Join/Prune messages (see Section 4.9.5.1). If the
WC bit is 1, the RPT bit MUST be 1.
An ANVL conformance test case verifies this and is failing.
Issue: #12354
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
As per RFC 7761, Section 4.9.1
The RPT (or Rendezvous Point Tree) bit is a 1-bit value for use
with PIM Join/Prune messages (see Section 4.9.5.1). If the
WC bit is 1, the RPT bit MUST be 1.
An ANVL conformance test case verifies this and is failing.
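A sketch of the encoding rule (the bit values follow the Encoded Source Address layout in RFC 7761 Section 4.9.1; the surrounding code is illustrative):
    #include <stdint.h>
    /* Encoded Source Address flag bits: S (Sparse), W (WildCard), R (RPT). */
    #define PIM_ENC_SRC_S 0x04
    #define PIM_ENC_SRC_W 0x02
    #define PIM_ENC_SRC_R 0x01
    /* Build the flags for a (*,G) entry: if the WC bit is set, the RPT bit
     * MUST be set as well. */
    static uint8_t encoded_src_flags(int sparse, int wildcard)
    {
        uint8_t flags = sparse ? PIM_ENC_SRC_S : 0;
        if (wildcard)
            flags |= PIM_ENC_SRC_W | PIM_ENC_SRC_R;
        return flags;
    }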
Issue: #12354
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
While doing the nexthop lookup, only allow nexthop
interfaces that are PIM enabled.
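A minimal sketch of the filter (the per-nexthop fields are assumptions, not FRR's actual structures):
    #include <stdbool.h>
    #include <stddef.h>
    /* Assumed, simplified per-nexthop view. */
    struct nh_candidate {
        void *ifp;            /* nexthop interface, NULL if unknown */
        bool pim_enabled;     /* PIM configured on that interface   */
    };
    /* Pick the first nexthop whose interface runs PIM; the others cannot be
     * used as the RPF interface. */
    static const struct nh_candidate *
    pick_pim_nexthop(const struct nh_candidate *nh, size_t n)
    {
        for (size_t i = 0; i < n; i++)
            if (nh[i].ifp && nh[i].pim_enabled)
                return &nh[i];
        return NULL;
    }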
Issue: #10782
Issue: #11931
Signed-off-by: Sarita Patra <saritap@vmware.com>