JD macro is defined by the RFC as -
bool JoinDesired(S,G) {
return (immediate_olist(S,G) != NULL
OR (KeepaliveTimer(S,G) is running
AND inherited_olist(S,G) != NULL))
}
However, for an MSDP synced SA the KAT will not be running, so an exception is
needed. Earlier I had done this by relaxing the KAT_run requirement entirely
on the RP. However, as that prevents the source from being aged out in some
cases, I have narrowed the check, i.e. the entry has to be one added by an
MSDP peer.
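The narrowed check can be sketched roughly as below. This is an illustration only, not the pimd code: `struct ex_sg`, its fields and `ex_join_desired` are hypothetical names, and how an MSDP peer-added entry is actually recognised is an assumption.

```c
#include <stdbool.h>

/* Hypothetical, simplified (S,G) state; field names are illustrative only. */
struct ex_sg {
	bool immediate_olist_nonempty;
	bool inherited_olist_nonempty;
	bool kat_running;
	bool msdp_peer_added; /* assumption: set when the SA came from an MSDP peer */
};

/* JoinDesired(S,G) with the KAT requirement waived only for MSDP
 * peer-added entries, instead of for every entry on the RP.
 */
static bool ex_join_desired(const struct ex_sg *sg)
{
	if (sg->immediate_olist_nonempty)
		return true;

	/* RFC condition plus the narrow MSDP exception */
	bool kat_ok = sg->kat_running || sg->msdp_peer_added;

	return kat_ok && sg->inherited_olist_nonempty;
}
```

With this shape an entry that is neither kept alive by traffic nor synced from an MSDP peer still evaluates to JD=false and can be aged out.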
Ticket: CM-24398
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Added event logs around add/del of upstream entries into the nbr's
jp-agg list. This is to help debug a problem with stale (deleted)
upstream entries being present in the list causing pimd to crash on
the periodic processing.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Today we are only pruning the SPT when the (S,G) upstream entry
switches from Joined to NotJoined. This leaves the source still
pruned along the RPT till the next periodic XG join-prune is sent
to the RPF(RP). Traffic from the source will be blackholed for this
duration. To prevent that we need to send a new JP message
to RPF(RP) immediately.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
It is now used to evaluate and display join-desired state for
each upstream entry -
root@spine-1:~# net show pim upstream-join-desired
Source Group EvalJD
* 239.1.1.111 yes
6.0.0.28 239.1.1.111 yes
6.0.0.29 239.1.1.111 no
6.0.0.30 239.1.1.111 yes
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
This re-naming was needed because the JD state on an upstream is
not just based on channel info i.e. we can have JD=true even if there
is no downstream channel. The "show ip upstream-join-desired" command
will be changed to display that info i.e. upstream's JD state instead
of downstream channel params. The downstream channel params are now
available via "show ip pim channel"
PS: This change may be reverted if upstream NAKs it. But there is a
pressing need for it to debug some not-so-reproducible problems.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
This was causing pimd to crash later; call-stack -
(gdb) bt
context=<optimized out>) at lib/sigevent.c:254
group=group@entry=0x7ffffa9797e0) at pimd/pim_rp.c:207
grp=grp@entry=0x7ffffa9799fe, sgs=sgs@entry=0x560ac069edb0, size=52)
at pimd/pim_msg.c:200
groups=<optimized out>) at pimd/pim_join.c:562
at pimd/pim_neighbor.c:288
at lib/thread.c:1599
at lib/libfrr.c:1024
envp=<optimized out>) at pimd/pim_main.c:162
(gdb) fr 4
group=group@entry=0x7ffffa9797e0) at pimd/pim_rp.c:207
207 pimd/pim_rp.c: No such file or directory.
(gdb) fr 6
grp=grp@entry=0x7ffffa9799fe, sgs=sgs@entry=0x560ac069edb0, size=52)
at pimd/pim_msg.c:200
200 pimd/pim_msg.c: No such file or directory.
(gdb) p source->up->sg_str
$1 = '\000' <repeats 31 times>, <incomplete sequence \361>
(gdb)
This problem can manifest in the following event sequence -
1. upstream RPF neighbor is resolved
2. upstream RPF neighbor becomes unresolved (but upstream entry
stays on the jp-agg list)
3. upstream entry is removed
On the next old-neighbor jp-agg-list processing the stale entry is
accessed, resulting in the crash.
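One way to keep a deleted entry off the list is to have the entry remember which neighbor list it sits on and to unlink it both when the RPF neighbor becomes unresolved and when the entry is removed. The sketch below assumes that approach; all `ex_` names and the fixed-size array are hypothetical and do not reflect the real pimd data structures or the exact fix in this commit.

```c
#include <stddef.h>

#define EX_MAX_ENTRIES 8

struct ex_upstream;

/* Hypothetical neighbor with a small jp-agg list of upstream pointers. */
struct ex_nbr {
	struct ex_upstream *jp_agg[EX_MAX_ENTRIES];
	size_t count;
};

/* Hypothetical upstream entry that remembers which list it is on. */
struct ex_upstream {
	struct ex_nbr *on_list_of; /* NULL when not on any jp-agg list */
};

static void ex_jp_agg_add(struct ex_nbr *nbr, struct ex_upstream *up)
{
	if (up->on_list_of || nbr->count == EX_MAX_ENTRIES)
		return;
	nbr->jp_agg[nbr->count++] = up;
	up->on_list_of = nbr;
}

/* Safe to call any number of times; a no-op if the entry is not listed. */
static void ex_jp_agg_del(struct ex_upstream *up)
{
	struct ex_nbr *nbr = up->on_list_of;

	if (!nbr)
		return;
	for (size_t i = 0; i < nbr->count; i++) {
		if (nbr->jp_agg[i] == up) {
			/* fill the hole with the last element */
			nbr->jp_agg[i] = nbr->jp_agg[--nbr->count];
			break;
		}
	}
	up->on_list_of = NULL;
}

/* Step 2 of the sequence: the RPF neighbor becomes unresolved. */
static void ex_rpf_unresolved(struct ex_upstream *up)
{
	ex_jp_agg_del(up);
}
```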
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Dumps while in problem state -
============================
[from "show ip pim state"]
Active Source Group RPT IIF OIL
1 6.0.0.31 239.1.1.111 n swp1 swp4( J * )
[from "show ip pim join"]
Interface Address Source Group State Uptime Expire Prune
swp3 6.0.0.22 6.0.0.31 239.1.1.111 JOIN --:--:-- 03:11 --:--
You can see from the dumps that the pim downstream router has joined on
swp3 but that OIF has not been added to the OIL with flag
PIM_OIF_FLAG_PROTO_PIM. This is because the join was rxed while the
ifchannel was in a prune-pending state.
Relevant logs -
===============
[
PIM: recv_prune: prune (S,G)=(6.0.0.31,239.1.1.111) rpt=1 wc=0 upstream=6.0.0.22 holdtime=210 from 6.0.0.28 on swp3
PIM: pim_upstream_ref(pim_ifchannel_add): upstream (6.0.0.31,239.1.1.111) ref count 3 increment
PIM: pim_upstream_add(pim_ifchannel_add): (6.0.0.31,239.1.1.111), iif 6.0.0.26/0 (swp1) found: 1: ref_count: 3
PIM: pim_ifchannel_add: ifchannel (6.0.0.31,239.1.1.111) is created
PIM: pim_joinprune_recv: SGRpt flag is set, del inherit oif from up (6.0.0.31,239.1.1.111)
PIM: pim_mroute_add(pim_channel_del_oif), vrf default Added Route: (6.0.0.31,239.1.1.111) IIF: swp1, OIFS: swp4
PIM: pim_channel_del_oif(pim_joinprune_recv): (S,G)=(6.0.0.31,239.1.1.111): proto_mask=4 IIF:1 OIF=swp3 vif_index=3
PIM: recv_join: join (S,G)=(6.0.0.31,239.1.1.111) rpt=0 wc=0 upstream=6.0.0.22 holdtime=210 from 6.0.0.28 on swp3
PIM: PIM_IFCHANNEL(swp3): (6.0.0.31,239.1.1.111) is switching from SGRpt(PP) to JOIN
PIM: Sending Request for New Channel Oil Information(6.0.0.31,239.1.1.111) VIIF 1(default)
]
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
This is needed for two reasons -
1. The inherited OIL needs to be setup independent of the RPF interface
to allow correct computation of the JoinDesired macro.
2. The RPF interface is computed at the time of MFC programming so
it is not possible to permanently evict the OIF at the time of oif_add.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
When an inherited OIL becomes empty, join-desired can go to false. So
we need to re-run join-desired evaluation on any inherited OIL changes.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
The macro was always returning non-empty because of comparing an
array of u8_t with an array of u32_t.
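A minimal sketch of an emptiness check that sizes the comparison from the array actually being checked. `EX_MAXVIFS`, `struct ex_oil` and `ex_oil_empty` are hypothetical names, and the one-byte-per-OIF layout is an assumption for illustration rather than the real pimd structure.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define EX_MAXVIFS 32 /* hypothetical size */

/* Hypothetical OIL: one TTL byte per OIF, 0 meaning "not in the OIL". */
struct ex_oil {
	uint8_t ttls[EX_MAXVIFS];
};

static bool ex_oil_empty(const struct ex_oil *oil)
{
	/* Same element type and length as the array being checked. */
	static const uint8_t zero[EX_MAXVIFS];

	/* Take the length from oil->ttls itself. Deriving it from an array
	 * of 32-bit elements would compare four times as many bytes and run
	 * past the end of the TTL array.
	 */
	return memcmp(oil->ttls, zero, sizeof(oil->ttls)) == 0;
}
```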
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
If a dummy upstream entry (no RPF nbr) which is already in a JOINED
state is resolved we were not triggering an immediate join via the
per-interface upstream switch list.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
A dummy pim upstream entry can be in a JOINED state before its RPF nbr is
added. Handle that case by triggering an immediate join.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Deviations -
1. Avoid using SPTbit setting. Replace that with Use_Spt macro.
2. If S is supposed to be forwarded along the RPT but has an empty OIL
prune it.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
1. KAT should be re-started only if traffic rxed along the SPT i.e.
IIF == RPF_Interface(S).
Only exception to the rule is if you are LHR.
2. KAT should be started on all routers (not just FHR, RP, LHR).
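The restart rule in item 1 can be sketched as a small predicate. `struct ex_kat_ctx` and `ex_should_restart_kat` are hypothetical names used only for illustration; interface indexes stand in for the real interface objects.

```c
#include <stdbool.h>

/* Hypothetical description of one traffic event for an (S,G) entry. */
struct ex_kat_ctx {
	int iif;           /* interface the traffic was received on */
	int rpf_ifindex_s; /* RPF_Interface(S) */
	bool i_am_lhr;     /* this router is the last-hop router */
};

/* Restart the KAT only for traffic received along the SPT, with the
 * LHR as the single exception.
 */
static bool ex_should_restart_kat(const struct ex_kat_ctx *c)
{
	if (c->i_am_lhr)
		return true;

	return c->iif == c->rpf_ifindex_s;
}
```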
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Criteria for switching to SPT is different on RP and LHR. Re-name
the functions to make that apparent.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Joined state is computed based on the downstream state and cannot be
changed if the RPF link flaps.
Reference: rfc 7761, section 4.5.5
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
This commit includes the following changes -
1. kat needs to be included when evaluating join desired on a (S,G)
entry.
2. there were cases where we were adding OIF based on joindesired
being true for unrelated reasons (on other OIFs). cleaned up those
cases.
3. make all calls to pim_upstream_switch conditional on the JoinDesired
macro.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
RP config change is a big hammer and use_rpt/spt needs to be
re-evaluated on all existing (S,G) entries.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
If a source is being forwarded along the RPT it uses the parent (*,G)'s
IIF. When the parent's IIF changes all the children need to be updated.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
mfcc_parent for an (S,G) entry was being updated on any upstream RPF
change. With the change to use RPT for (S,G) in some cases we can no
longer do that. Instead the upstream entry's RPF neighbor is managed
separately from the channel_oil's mfcc_parent, i.e. via NHT. And the
mfcc_parent is evaluated at the time of mroute programming.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
An (S,G) mroute can be created as a result of rpt prune. However that
entry needs to stay on the parent (*,G)'s tree (IIF) till a decision is
made to switch the source to the SPT.
The decision to stay on the RPT is made based on the SPTbit setting
according to - RFC7761, Section 4.2 “Data Packet Forwarding Rules”
However those rules are hard to achieve with hw acceleration, i.e. when the
control and data planes are separate. So instead of relying on data
we make the decision to use the SPT once we have decided to join the SPT -
Use_RPT(S,G) {
if (Joined(S,G) == TRUE // we have decided to join the SPT
OR Directly_Connected(S) == TRUE // source is directly connected
OR I_am_RP(G) == TRUE) // RP
//use_spt
return FALSE;
//use_rpt
return TRUE;
}
To make that change some re-org was needed -
1. pim static mroutes and dynamic (upstream mroutes) top level APIs
have been separated. This is to limit the state machine to dynamic
mroutes.
2. c_oil->oil.mfcc_parent is re-evaluated based on if we decided
to use the SPT or stay on the RPT.
3. upstream mroute re-eval is done when any of the criteria involved
in Use_RPT changes.
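The Use_RPT macro and the resulting choice of mfcc_parent (item 2) can be sketched in C as below. `struct ex_sg_info` and the `ex_` functions are hypothetical; the real code works on upstream and channel_oil objects rather than plain interface indexes.

```c
#include <stdbool.h>

/* Hypothetical, flattened view of the inputs to Use_RPT(S,G). */
struct ex_sg_info {
	bool joined;             /* Joined(S,G) */
	bool directly_connected; /* Directly_Connected(S) */
	bool i_am_rp;            /* I_am_RP(G) */
	int rpf_ifindex_s;       /* RPF interface towards S (SPT) */
	int parent_iif;          /* IIF of the parent (*,G) (RPT) */
};

static bool ex_use_rpt(const struct ex_sg_info *sg)
{
	if (sg->joined || sg->directly_connected || sg->i_am_rp)
		return false; /* use_spt */

	return true; /* use_rpt */
}

/* Evaluated at mroute programming time, not on every RPF change. */
static int ex_mfcc_parent(const struct ex_sg_info *sg)
{
	return ex_use_rpt(sg) ? sg->parent_iif : sg->rpf_ifindex_s;
}
```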
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Problem statement:
When IPv4/IPv6 prefixes are received in BGP, bgp_update function registers the
nexthop of the route with nexthop tracking module. The BGP route is marked as
valid only if the nexthop is resolved.
Even for EVPN RT-5, route should be marked as valid only if the nexthop is
resolvable.
Code changes:
1. Add nexthop of EVPN RT-5 for nexthop tracking. Route will be marked as valid
only if the nexthop is resolved.
2. Only the valid EVPN routes are imported to the vrf.
3. When nht update is received in BGP, make sure that the EVPN routes are
imported/unimported based on whether the route becomes valid/invalid.
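The three code changes amount to tying the import state of an RT-5 route to the resolution state of its nexthop. A minimal sketch, assuming a hypothetical `struct ex_rt5` and handler name (the real bgpd code paths are considerably more involved):

```c
#include <stdbool.h>

/* Hypothetical state kept per EVPN RT-5 route. */
struct ex_rt5 {
	bool valid;    /* nexthop currently resolved */
	bool imported; /* route currently imported into the vrf */
};

/* Called when nexthop tracking reports a change for the route's nexthop. */
static void ex_nht_update(struct ex_rt5 *rt, bool nexthop_resolved)
{
	rt->valid = nexthop_resolved;

	if (rt->valid && !rt->imported)
		rt->imported = true;  /* became valid: import into the vrf */
	else if (!rt->valid && rt->imported)
		rt->imported = false; /* became invalid: unimport */
}
```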
Testcases:
1. At rtr-1, advertise EVPN RT-5 with a nexthop 10.100.0.2.
10.100.0.2 is resolved at rtr-2 in default vrf.
At rtr-2, remote EVPN RT-5 should be marked as valid and should be imported into
vrfs.
2. Make the nexthop 10.100.0.2 unreachable at rtr-2
Remote EVPN RT-5 should be marked as invalid and should be unimported from the
vrfs. As this code change deals with EVPN type-5 routes only, other EVPN routes
should be valid.
3. At rtr-2, add a static route to make nexthop 10.100.0.2 reachable.
EVPN RT-5 should again become valid and should be imported into the vrfs.
Signed-off-by: Ameya Dharkar <adharkar@vmware.com>
Theoretically there should be no case where the channel-oil hangs
around after the upstream entry is removed. But currently there are
cases where it does. This is a precautionary fixup till we are
rid of all of those cases.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
We should be setting the ns->info pointer to NULL when we free
what it points to. Just use XFREE directly on the void * pointer
to do this.
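The free-and-clear pattern looks roughly like this. `EX_FREE` is a hypothetical stand-in built on plain `free()`; it assumes, as the message above implies, that XFREE also resets the pointer it is given.

```c
#include <stdlib.h>

/* Hypothetical free macro that also clears the caller's pointer. */
#define EX_FREE(ptr)                                                   \
	do {                                                           \
		free(ptr);                                             \
		(ptr) = NULL;                                          \
	} while (0)

/* Hypothetical namespace object carrying an opaque info pointer. */
struct ex_ns {
	void *info;
};

static void ex_ns_info_release(struct ex_ns *ns)
{
	/* Operate on ns->info itself, not on a local copy, so that the
	 * field does not keep pointing at freed memory.
	 */
	EX_FREE(ns->info);
}
```

Freeing through a local copy of the pointer would clear only the copy, which is why the macro is applied to the `void *` field directly.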
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
We were not connecting the default zebra_ns to the default
ns->info at namespace initialization in zebra. Thus, when
we tried to use the `ns_walk_func()` it would ignore the
default zebra_ns since there is no pointer to it from the
ns struct.
Fix this by connecting them in `zebra_ns_init()` and,
if the default ns is not found, exit with failure
since this is not recoverable.
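Why the missing link matters can be seen in a toy walker that only visits namespaces with something attached. Everything here is hypothetical (`struct ex_ns`, `ex_ns_walk`, the array of namespaces), and exactly where the real walk skips an unlinked namespace is an assumption.

```c
#include <stddef.h>

/* Hypothetical namespace with an opaque per-daemon info pointer. */
struct ex_ns {
	void *info;
};

typedef int (*ex_walk_cb)(struct ex_ns *ns, void *arg);

/* Visit every namespace that has info attached. */
static void ex_ns_walk(struct ex_ns *list, size_t n, ex_walk_cb cb, void *arg)
{
	for (size_t i = 0; i < n; i++) {
		if (!list[i].info)
			continue; /* nothing linked: silently skipped */
		cb(&list[i], arg);
	}
}

/* Example callback: count the namespaces that were visited. */
static int ex_count_cb(struct ex_ns *ns, void *arg)
{
	(void)ns;
	(*(int *)arg)++;
	return 0;
}
```

A namespace whose info pointer was never set is invisible to such a walk, so any cleanup done from the callback never reaches it.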
This was found during a crash where we fail to cancel the kernel_read
thread at termination (via the `ns_walk_func()`) and then we
get a netlink notification trying to use the zns struct that has
already been freed.
```
(gdb) bt
\#0 0x00007fc1134dc7bb in raise () from /lib/x86_64-linux-gnu/libc.so.6
\#1 0x00007fc1134c7535 in abort () from /lib/x86_64-linux-gnu/libc.so.6
\#2 0x00007fc113996f8f in core_handler (signo=11, siginfo=0x7ffe5429d070, context=<optimized out>) at lib/sigevent.c:254
\#3 <signal handler called>
\#4 0x0000561880e15449 in if_lookup_by_index_per_ns (ns=0x0, ifindex=174) at zebra/interface.c:269
\#5 0x0000561880e1642c in if_up (ifp=ifp@entry=0x561883076c50) at zebra/interface.c:1043
\#6 0x0000561880e10723 in netlink_link_change (h=0x7ffe5429d8f0, ns_id=<optimized out>, startup=<optimized out>) at zebra/if_netlink.c:1384
\#7 0x0000561880e17e68 in netlink_parse_info (filter=filter@entry=0x561880e17680 <netlink_information_fetch>, nl=nl@entry=0x561882497238, zns=zns@entry=0x7ffe542a5940,
count=count@entry=5, startup=startup@entry=0) at zebra/kernel_netlink.c:932
\#8 0x0000561880e186a5 in kernel_read (thread=<optimized out>) at zebra/kernel_netlink.c:406
\#9 0x00007fc1139a4416 in thread_call (thread=thread@entry=0x7ffe542a5b70) at lib/thread.c:1599
\#10 0x00007fc113974ef8 in frr_run (master=0x5618823c9510) at lib/libfrr.c:1024
\#11 0x0000561880e0b916 in main (argc=8, argv=0x7ffe542a5f78) at zebra/main.c:483
```
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
1. This avoids the need to re-run "muting" decisions.
2. Avoids the need to restore PIM's OIL after fixup and send to kernel
(this is getting harder to manage).
In the future we need to also move the PIM maintained channel OIL from
an array of MAXVIFs to a simple DLL. This will be a significant
optimization in memory usage and performance (OIL reads, copies etc).
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
If an mroute loses DF election (with the MLAG peer) it has to stop
forwarding traffic on active-active devices such as ipmr-lo used
for vxlan traffic termination. To achieve that this commit
introduces a concept of OIF muting. That way we can let the PIM and
IGMP state machines play out and silence OIFs after the fact.
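A minimal sketch of the idea: the protocol flags on an OIF stay untouched and a separate mute flag decides whether the OIF reaches the forwarding plane. The flag names, `EX_MAXVIFS` and `ex_build_kernel_ttls` are hypothetical and chosen for illustration only.

```c
#include <stdint.h>

#define EX_MAXVIFS 8
#define EX_OIF_FLAG_PROTO_PIM (1u << 0) /* hypothetical protocol flag */
#define EX_OIF_FLAG_MUTE (1u << 1)      /* hypothetical mute flag */

/* Hypothetical channel OIL holding per-OIF flags. */
struct ex_channel_oil {
	uint32_t oif_flags[EX_MAXVIFS];
};

/* Build the per-OIF TTLs that would be programmed; muted OIFs get 0
 * even though their protocol state is left in place.
 */
static void ex_build_kernel_ttls(const struct ex_channel_oil *oil,
				 uint8_t ttls[EX_MAXVIFS])
{
	for (int i = 0; i < EX_MAXVIFS; i++) {
		uint32_t f = oil->oif_flags[i];
		int in_oil = (f & ~EX_OIF_FLAG_MUTE) != 0;
		int muted = (f & EX_OIF_FLAG_MUTE) != 0;

		ttls[i] = (in_oil && !muted) ? 1 : 0;
	}
}
```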
Relevant outputs:
=================
1. muted OIFs are displayed with the M flag in "pim state" -
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
root@TORC12:~# net show pim state |grep "27.0.0.13"|grep 100
1 27.0.0.13 239.1.1.100 uplink-1 ipmr-lo( *M)
root@TORC12:~#
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2. And suppressed altogether in the mroute output -
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
root@TORC12:~# net show mroute |grep "27.0.0.13"|grep 100
27.0.0.13 239.1.1.100 none uplink-1 none 0 --:--:--
root@TORC12:~#
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
These logs were printing file name which has little value (is always
pim_oil.c). Instead print the caller.
add_oif/del_oif are being called directly from one too many places. Instead OIF
setup needs to be consolidated via the PIM state machine. These
debugs are expected to help in understanding what needs to be cleaned up.
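Printing the caller can be done by having a wrapper macro pass `__func__` down to the function that logs. The sketch below is hypothetical throughout (`ex_format_add_oif_log`, `EX_ADD_OIF_LOG`, `ex_joinprune_recv`) and only formats the line into a buffer instead of using the real logging API.

```c
#include <stdio.h>

/* Format a log line that names the caller rather than the source file. */
static int ex_format_add_oif_log(char *buf, size_t len, const char *caller,
				 const char *oif)
{
	return snprintf(buf, len, "ex_add_oif(%s): OIF=%s", caller, oif);
}

/* __func__ expands at the call site, so it names the calling function. */
#define EX_ADD_OIF_LOG(buf, len, oif)                                  \
	ex_format_add_oif_log((buf), (len), __func__, (oif))

/* Hypothetical caller, used to show what ends up in the log line. */
static void ex_joinprune_recv(char *buf, size_t len)
{
	EX_ADD_OIF_LOG(buf, len, "swp3");
}
```

Called from `ex_joinprune_recv`, the buffer holds `ex_add_oif(ex_joinprune_recv): OIF=swp3`.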
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Before the fix:
2019/11/14 19:52:21 BGP: peer 192.168.2.5 deleted from subgroup s4peer
cnt 0 - missing space after s4 before peer
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Previous error was misleading and made it seem like Null0,
reject, or blackhole nexthops on static routes are invalid.
This commit makes it more clear as to why the error is seen.
Signed-off-by: Trey Aspelund <taspelund@cumulusnetworks.com>
With this code change, we can now filter evpn routes based on RD using the
match statement: "match evpn rd XX"
Signed-off-by: Lakshman Krishnamoorthy <lkrishnamoor@vmware.com>
Recently a lot of issues have been seen in OSPF adjacency establishment;
sessions were torn down because of DD Sequence Number mismatches.
Add debugs to capture the Master and Slave generated sequence numbers.
Signed-off-by: Satheesh Kumar K <sathk@cumulusnetworks.com>