Topology:
========
Receiver----R1----(ens192)R2(ens224)----R3----R4----Source
--------------------------
R1=LHR
R2=RP
R4=FHR
Problem:
=======
1. Direct connected link between R1 and R3 is down initially.
2. So traffic flow path is R4<->R3<->R2<->R1<->Receiver.
3. Mroutes are properly created on all the nodes.
4. Up the direct connected link between R1 and R3.
5. Traffic flows in both the paths.
R4<->R3<->R2<->R1<->Receiver
R4<->R3<->R1<->Receiver
6. Duplicate traffic received at the receiver.
Root Cause:
==========
Initially when the direct connected link between R1 and R3 is
down, traffic flows via RP(R2). So in RP (S,G) installed with
IIF as ens224 and OIF as ens192 (reference = 2) with mask
PIM_OIF_FLAG_PROTO_STAR and PIM_OIF_FLAG_PROTO_PIM.
Now when the direct link between R1 and R3 is Up, LHR(R1) sends
SGRPT prune. After prune received, RP(R2) will remove OIF ens224
with mask PIM_OIF_FLAG_PROTO_STAR.
Since OIF ens224 is still present with mask PIM_OIF_FLAG_PROTO_PIM,
RP(R2) will not send prune towards R3.
So traffic continues to flow in the path R4<->R3<->R2<->R1<->Receiver.
Fix:
====
When SGRpt prune received, remove OIF irrespective of the OIF is
installed with mask "PIM_OIF_FLAG_PROTO_STAR" or "PIM_OIF_FLAG_PROTO_PIM".
Once OIF is removed, RP sends prune towards R3.
Issue: #11347
Signed-off-by: Sarita Patra <saritap@vmware.com>
... the prefix length wasn't ignored as expected. Both S and G are
always /32. But expecting "le 32" in prefix-list config is unexpected &
counterintuitive.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
in pim_ifchannel.c there exists several spots where
the ch->upstream is assumed to be NULL. This is not
possible.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This is causing build issues on BSD by including (transitively)
`linux/mroute6.h` - try to address by disentangling the headers a bunch.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The only function these macros have is to make the code confusing.
"PIM_IF_DO_PIM" sounds like it triggers some action, but it doesn't.
Replace with "bool" fields in struct pim_interface.
(Note: PIM_IF_*_IGMP_LISTEN_ALLROUTERS was always set, without any way
to unset it. It is completely removed now and always enabled.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
This is just hitting the pim_mroute code with a hammer until it doesn't
print warnings anymore. This is NOT quite tested or working yet, it
just compiles.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Since `pim_sgaddr` is `pim_addr` now, that causes a whole lot of fallout
anywhere S,G pairs are handled.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Only pim_sgaddr uses are covered by this since regular in_addr is still
used for singular addresses, so only a part of pim_inet4_dump calls are
gone with this.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Need a separate constant that is IPv6 when needed. Also assign the
whole struct rather than just s_addr.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Replaces comparison against INADDR_ANY, so we can do IPv6 too.
(Renamed from "pim_is_addr_any" for "pim_addr_*" naming pattern, and
type fixed to bool.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
... and replace with `%pSG` printfrr specifier. This actually used a
static buffer in the formatting function, so subsequent formatting would
overwrite earlier uses.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Mostly just 2 sed calls:
- `sed -e 's%struct prefix_sg%pim_sgaddr%g'`
- `sed -e 's%memset(&sg, 0, sizeof(pim_sgaddr));%memset(\&sg, 0, sizeof(sg));%g'`
Plus a bunch of fixing whatever that broke.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
`pim_ifchannel_ifjoin_switch()` changes flags that `pim_forward_stop()`
looks at. This leads to data flow continuing until we have some reason
to sync state again.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Problem:
S,G entry has iif = oif in FHR is LHR case.
Setup:-
R11-----R2----R4
R11 :- FHR and LHR
R2 :- RP
R4 :- LHR
Issue :-
1) shut mapped interface in R11
2) wait for 5 min
3) do FRR restart
5) No shut of mapped interface
OIL is added for local interface also where OIL is same as IIF
and duplicate traffic observed on R4 receives in Ixia
RCA:
pim_ifchannel_local_membership_add adds inherited oif from starg when iif for
SG is unavailable.
When rpf for that SG resolves to this inherited oif from starg, iif is also in oif.
This results in dup traffic.
Fix:
If iif is not available, do not inherit from starg.
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
Problem Statement:
In Prune pending state, when Join is received, and there is hold timer change
the Expiry timer is not getting updated with new Hold timer.
Root Cause:
When thread_add_timer function is called and the thread is already in the list
the thread api just returns, it does not modify the timer value.
So when we want to change the timer, we need to first call THREAD_OFF and then
call thread_add_timer.
The Expiry timer thread is not cancelled in PIM_IFJOIN_PRUNE_PENDING state,
therefore the timer change is not taking effect.
Fix:
Call THREAD_OFF in that flow.
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
RCA: There were 2 problems.
1. SGRpt prune expiry didn't create S,G entry with none oil when no other
interfaces were part of the oil.
2. When restarting the timer with new hold value, comparision was missing and
old timer was not stopping.
Fix:
SGRpt Prune pending expiry will put SG entry with none oil if no other
Signed-off-by: Saravanan K <saravanank@vmware.com>
interfaces present. If present we will be deleting the inherited oif from oil.
Deleting the oif in that scenario will take care of changing mroute.
When alone interface expires in SGRpt prune pending state, we shall detect by
checking installed flag. if not installed, install mroute.
1. When a node changes from non-DR to DR in the given topology,
the node was receiving both PIM Join as well as IGMP join.
Since it was already receiving PIM Join previously, ifchannel was
already present. Hence when it becomes DR, the IGMP source flag is not
set due to issue in the code. Hence it never creates S,G entry thinking
that it is not DR.
2. When pim join expires, the pim flag is not reset when ifchannel is not
deleted.
Issue: #7752
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
problem :
=========
When (*,G) prune received where we have SGRpt state,
ifchannel goes to NO_INFO state and doesn't get removed.
Root cause :
============
During the processing of (*,G) prune, we are not removing the
ifchannel on PruneTmp or PrunePendingTmp state.
Fix :
=====
In that scenario, stop joinExpiry timer and delete the ifchannel.
issue #7347
Co-authored-by: Saravanan K <saravanank@vmware.com>
Signed-off-by: vishaldhingra <vdhingra@vmware.com>
Remove mid-string line breaks, cf. workflow doc:
.. [#tool_style_conflicts] For example, lines over 80 characters are allowed
for text strings to make it possible to search the code for them: please
see `Linux kernel style (breaking long lines and strings)
<https://www.kernel.org/doc/html/v4.10/process/coding-style.html#breaking-long-lines-and-strings>`_
and `Issue #1794 <https://github.com/FRRouting/frr/issues/1794>`_.
Scripted commit, idempotent to running:
```
python3 tools/stringmangle.py --unwrap `git ls-files | egrep '\.[ch]$'`
```
Signed-off-by: David Lamparter <equinox@diac24.net>
Problem:
When the ifchannel is in SGRpt prune, if we receive a join, we go into no info
state but mroute still present with none oil
Join Prune Expiry timer on the ifchannel was still running when
Prune pending expired. This causes ifchannel not to be deleted and hence mroute.
Fix:
Stop expiry timer when we move into NOINFO state and delete the ifchannel.
Signed-off-by: Saravanan K <saravanank@vmware.com>
RCA: starg join fell in to SGRpt join check and was treated as SGRpt join
so ifchannel state machine moved from prunepending to noinfo.
Fix: Check if it is starg join and process
Signed-off-by: Saravanan K <sarav511@vmware.com>
RCA:
1. Prune processing during join state was putting of join expiry timer
2. Join received during prune pending state was not comparing hold time with remaining expiry timer.
Fix:
Fixed as per RFC 4601/7761
Signed-off-by: Saravanan K <saravanank@vmware.com>
Issue: Client1------LHR-----(int-1)RP(int-2)------client2
Client2 send IGMP join for group G.
Client1 send IGMP join for group G.
verify show ip mroute in RP, will have 2 OIL.
Client2 send IGMP leave.
Verify show ip mroute in RP, will still have 2.
Root cause: When RP receives IGMP join from client2, it creates
a (s,g) channel oil and add the interface int-2 into oil list and
set the flag PIM_OIF_FLAG_PROTO_IGMP to int-2
Client1 send IGMP join, LHR will send a (*,G) join to RP. RP will
add the interface int-1 into the oil list of (s,g) channel_oil and
will set the flag PIM_OIF_FLAG_PROTO_IGMP and PIM_OIF_FLAG_PROTO_PIM
to the int-1 and set PIM_OIF_FLAG_PROTO_PIM to int-2 as well. It is
happening because of the pim_upstream_inherited_olist_decide() and
forward_on() get all the oil and update the flag wrongly.
So now when client 2 sends IGMP prune, RP will not remove the int-2
from oil list since both PIM_OIF_FLAG_PROTO_PIM & PIM_OIF_FLAG_PROTO_IGMP
are set, it just unset the flag PIM_OIF_FLAG_PROTO_IGMP.
Fix: Introduced new flags in if_channel, PIM_IF_FLAG_MASK_PROTO_PIM
& PIM_IF_FLAG_MASK_PROTO_IGMP. If a if_channel is created because of
pim join or pim (s,g,rpt) prune received, then set the flag
PIM_IF_FLAG_MASK_PROTO_PIM. If a if_channel is created becuase of IGMP
join received, then set the flag PIM_IF_FLAG_MASK_PROTO_IGMP.
When an interface needs to be added into the oil list check if
PIM_IF_FLAG_MASK_PROTO_PIM or PIM_IF_FLAG_MASK_PROTO_IGMP is set, then
update oil flag accordingly.
Signed-off-by: Sarita Patra <saritap@vmware.com>
Series of events leading to the problem -
1. (S,G) has been pruned on the rp on downlink-1
2. a (*,G) join is rxed on downlink-1 without the source S. This
results in the (S,G,rpt) prune state being cleared on downlink-1.
As a part of the clear the ifchannel associated with downlink-1
is deleted.
3. The ifchannel_delete handling is expected to add downlink-1
as an inherited OIF to the channel OIL (which it does). However
it is also added in as an immediate OIF (accidentally) as the
ifchannel is still present (in the process of being deleted).
To avoid the problem defer pim_upstream_update_join_desired
evaluation until after the channel is deleted.
Relevant debug logs -
PIM: pim_ifchannel_delete: ifchannel entry (27.0.0.15,239.1.1.106)(downlink-1) del start
PIM: pim_channel_add_oif(pim_ifchannel_delete): (S,G)=(27.0.0.15,239.1.1.106): proto_mask=4 OIF=downlink-1 vif_index=7: DONE
PIM: pimd/pim_oil.c pim_channel_del_oif: no existing protocol mask 2(4) for requested OIF downlink-1 (vif_index=7, min_ttl=1) for channel (S,G)=(27.0.0.15,239.1.1.106)
PIM: pim_upstream_switch: PIM_UPSTREAM_(27.0.0.15,239.1.1.106): (S,G) old: NotJoined new: Joined
PIM: pim_channel_add_oif(pim_upstream_inherited_olist_decide): (S,G)=(27.0.0.15,239.1.1.106): proto_mask=2 OIF=downlink-1 vif_index=7 added to 0x6 >>>>>>>>>>>>>>>>>>
PIM: pim_upstream_del(pim_ifchannel_delete): Delete (27.0.0.15,239.1.1.106)[default] ref count: 2 , flags: 81 c_oil ref count 1 (Pre decrement)
PIM: pim_ifchannel_delete: ifchannel entry (27.0.0.15,239.1.1.106)(downlink-1) del end
Ticket: CM-26732
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Additional protocols were being set on the OIF proto-mask without
logs. Added logs in that area.
Also added start and end logs to ifchannel_delete to help
identify state machine changes that play out as a part of this
event handling.
Ticket: CM-26732
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Initially, MLAG Sync is happened at pim_ifchannel, this is mainly to
support even config mismatches(missing configuration of dual active).
But this causes more syncs for each entry.
and also it is not In-line with PIM EVPN. to avoid that moving to
pm_upstream based syncing.
Signed-off-by: Satheesh Kumar K <sathk@cumulusnetworks.com>
There was some code missed during the upstreaming process
due to code squash. Identify and put into a commit
to keep code consistent and correct.
Signed-off-by: Satheesh Kumar K <sathk@cumulusnetworks.com>
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>