The entire `case NEXTHOP_TYPE_IPV6_IFINDEX:` block here was a bit of a
tripwire to stumble over, since there was no indication at all that this
concerns RFC5549 nexthop handling. So it got mis-adapted for PIM IPv6
support.
Clarify this a whole bunch that this is for v4-over-v6 nexthop mangling,
and nothing else.
This should really also use neighbor's secondary address lists for the
lookup, but that is probably going to break compatibility with other
boxes that don't include v6 addrs in v4 hellos and needs further
machinery to do properly, so for now just leave a breadcrumb.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Currently the nexthop tracking code is only sending to the requestor
what it was requested to match against. When the nexthop tracking
code was simplified to not need an import check and a nexthop check
in b8210849b8 for bgpd. It was not
noticed that a longer prefix could match but it would be seen
as a match because FRR was not sending up both the resolved
route prefix and the route FRR was asked to match against.
This code change causes the nexthop tracking code to pass
back up the matched requested route (so that the calling
protocol can figure out which one it is being told about )
as well as the actual prefix that was matched to.
Fixes: #10766
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Problem Statement:
=================
PIM Logs are coming at interval of 1 minute although pim
is not enabled.
Root Cause Analysis:
====================
By default, RCPM configures the PIM debugs when router comes up
via script. The product cannot disable PIM even though it is not required.
Hence moving these logs under a new debug option which will not be
enabled by default.
Fix:
====
Added a new option "detail" in the cli:
debug pim nht detail
Co-author: Mobashshera Rasool <mrasool@vmware.com>
Signed-off-by: Sai Gomathi <nsaigomathi@vmware.com>
We can use PIMADDR_ANY instead of INADDR_NONE to initalize rp->rpf_addr
when there is no rp configured for group_all.
Signed-off-by: sarita patra <saritap@vmware.com>
Since `pim_sgaddr` is `pim_addr` now, that causes a whole lot of fallout
anywhere S,G pairs are handled.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
In all places that pim_nht_bsr_del is called, the code
needs to not unregister if the current_bsr is INADDR_ANY.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Replaces comparison against INADDR_ANY, so we can do IPv6 too.
(Renamed from "pim_is_addr_any" for "pim_addr_*" naming pattern, and
type fixed to bool.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
NHT won't have a result yet when we get the first BSM from a new BSR.
Hence, the first packet(s) are lost, since their RPF validation fails.
Re-add the blocking RPF check that was there before (though in a much
more sensible manner.)
Also nuke the now-unused pim_nexthop_match* functions.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The Bootstrap message RX path needs a RPF check for the BSR address,
and this is implemented both incorrectly as well as quite ugly.
Clean up and fix case when we have multiple interfaces to the same LAN
and/or ECMP nexthops (both would cause message duplication, the former
can even cause BSM forwarding loops.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
VRF creation can happen from either cli or from
knowledged about the vrf learned from zebra.
In the case where we learn about the vrf from
the cli, the vrf id is UNKNOWN. Upon actual
creation of the vrf, lib/vrf.c touches up the vrf_id
and calls pim_vrf_enable to turn it on properly.
At this point in time we have a pim->vrf_id of
UNKNOWN and the vrf->vrf_id of the right value.
There is no point in duplicating this data. So just
remove all pim->vrf_id and use the vrf->vrf_id instead
since we keep a copy of the pim->vrf pointer.
This will remove some crashes where we expect the
pim->vrf_id to be usable and it's not.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The `enum zclient_send_status` enum needs to be extended
throughout the code base to use the new states and
to fix up places where we tested against the return
value being non zero.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This should never happen; no need to debug guard it and it's not a
warning, if this isn't working then NHT is not working at all.
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
Replace sprintf with snprintf where straightforward to do so.
- sprintf's into local scope buffers of known size are replaced with the
equivalent snprintf call
- snprintf's into local scope buffers of known size that use the buffer
size expression now use sizeof(buffer)
- sprintf(buf + strlen(buf), ...) replaced with snprintf() into temp
buffer followed by strlcat
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
It is possible that a if_lookup_by_index can return NULL
ensure that we handle this appropriately.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Issue:
Client---LHR---RP
1. Add kernel route for RP on LHR. Client send join
2. (*,G) will be get created in LHR and RP.
3. Kill the FRR on all the nodes
4. Start FRR only on LHR node
5. In LHR, (*, G) will be created with iif as unknown.
Root cause:
In the step 4, When LHR will receive igmp join, it will call
the function pim_ecmp_fib_lookup_if_vif_index which will look
for nexthop to RP with neighbor needed as false. So RPF lookup will
be true as the route is present in the kernel. It will create a
(*, G) channel_oil with incoming interface as the RPF interface
towards RP and install the (*,G) mroute in kernel.
Along with this (*,G) upstream gets craeted, which call the function
pim_rpf_update, which will look for the nexthop to RP with neighbor
needed as true. As the frr is not running in RP, no neighbor is present
on the nexthop interface. Due to which this will fail and will update
the channel_oil incoming interface as MAXVIFS(32).
Fix:
pim_ecmp_fib_lookup_if_vif_index() call the function pim_ecmp_nexthop_lookup
with neighbor_needed as true.
Signed-off-by: Sarita Patra <saritap@vmware.com>
PIM MLAG DF election API was not being triggered on cost change if the
upstream neighbor remained the same.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Convert the upstream_list and hash to a rb tree, Significant
time was being spent in the listnode_add_sort. This reduces
this time greatly.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This was causing pimd to crash later; call-stack -
(gdb) bt
context=<optimized out>) at lib/sigevent.c:254
group=group@entry=0x7ffffa9797e0) at pimd/pim_rp.c:207
grp=grp@entry=0x7ffffa9799fe, sgs=sgs@entry=0x560ac069edb0, size=52)
at pimd/pim_msg.c:200
groups=<optimized out>) at pimd/pim_join.c:562
at pimd/pim_neighbor.c:288
at lib/thread.c:1599
at lib/libfrr.c:1024
envp=<optimized out>) at pimd/pim_main.c:162
(gdb) fr 4
group=group@entry=0x7ffffa9797e0) at pimd/pim_rp.c:207
207 pimd/pim_rp.c: No such file or directory.
(gdb) fr 6
grp=grp@entry=0x7ffffa9799fe, sgs=sgs@entry=0x560ac069edb0, size=52)
at pimd/pim_msg.c:200
200 pimd/pim_msg.c: No such file or directory.
(gdb) p source->up->sg_str
$1 = '\000' <repeats 31 times>, <incomplete sequence \361>
(gdb)
This problem can manifest in the following event sequence -
1. upstream RPF neighbor is resolved
2. upstream RPF neighbor becomes unresolved (but upstream entry
stays on the jp-agg list)
3. upstream entry is removed
on the next old-neighbor jp-agg-list processing the stale entry is
accessed resulting in the crash.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
mfcc_parent for an (S, G) entry was being updated on any upstream RPF
change. With the change to use RPT for (S,G) in some cases we can no
longer do that. Instead the upstream entry's RPF neigbor is managed
separately form the channel_oil's mfcc_parent i.e. via NHT. And the
mfcc_parent is evaluated at the time of mroute programming.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Debian packaging when run finds a bunch of spelling errors:
I: frr: spelling-error-in-binary usr/bin/vtysh occurences occurrences
I: frr: spelling-error-in-binary usr/lib/frr/bfdd Amount of times Number of times
I: frr: spelling-error-in-binary usr/lib/frr/bgpd occurences occurrences
I: frr: spelling-error-in-binary usr/lib/frr/bgpd recieved received
I: frr: spelling-error-in-binary usr/lib/frr/isisd betweeen between
I: frr: spelling-error-in-binary usr/lib/frr/ospf6d Infomation Information
I: frr: spelling-error-in-binary usr/lib/frr/ospfd missmatch mismatch
I: frr: spelling-error-in-binary usr/lib/frr/pimd bootsrap bootstrap
I: frr: spelling-error-in-binary usr/lib/frr/pimd Unknwon Unknown
I: frr: spelling-error-in-binary usr/lib/frr/zebra Requsted Requested
I: frr: spelling-error-in-binary usr/lib/frr/zebra uknown unknown
I: frr: spelling-error-in-binary usr/lib/x86_64-linux-gnu/frr/libfrr.so.0.0.0 overriden overridden
This commit fixes all of them except the bgp `recieved` issue due to
it being part of json output. That one will need to go through
a deprecation cycle.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Pim will do the nexthop registration with "ip pim rp" static configuration
with this Zebra will advertise the Route Information.
But while processing this info at PIM, if Nexthop Interfaces are not PIM
enabled, currently PIM is dropping those paths. in case all paths are not
PIM enabled, there is no valid RPF Interface at PIM.
and PIM will be stuck at this state until Next update this to route, that
can happen only if there is a Routing change at Zebra for this prefix.
until that time PIM will not have any valid outgoing Interface.
This issue was mainly seen during Node bootup scenarios.
Fix Proposed
=============
store the paths in PIM PNC Data structure though they are not enabled
with PIM, because while selecting the Interface PIM checks for multicast
enabled Interface.
Tests Performed
===============
1. Verified fail Test case
2. Disabling the PIM on selected outgoing Interface, PIM is choosing
another path when Neighbor is down on this Interface.
3. Re-configure the PIM on above un-configured Interface, PIM is staying
with old NHop since it is valid.
Signed-off-by: Satheesh Kumar K <sathk@cumulusnetworks.com>
We already log whether or not we add nht tracking, having
an additional boolean to say to log another line is
a bit over the top.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
For each BSM packet, rpf check is performed. We will be accepting if the
source address match any of the next hop neighbor(in ecmp case) to reach
the Bootstrap Router.
1. pim_nexthop_match - this lookup in zebra and return true if any of the
next hop nbr is matching (in ecmp case).
2. pim_nexthop_match_nht_cache - this api searches the given address in local
pnc and return true if any of the next hop
nbr is matching (in ecmp case).
Signed-off-by: Saravanan K <saravanank@vmware.com>
This macro:
- Marks ZAPI callbacks for readability
- Standardizes argument names
- Makes it simple to add ZAPI arguments in the future
- Ensures proper types
- Looks better
- Shortens function declarations
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>