This should never happen; no need to debug guard it and it's not a
warning, if this isn't working then NHT is not working at all.
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
Replace sprintf with snprintf where straightforward to do so.
- sprintf's into local scope buffers of known size are replaced with the
equivalent snprintf call
- snprintf's into local scope buffers of known size that use the buffer
size expression now use sizeof(buffer)
- sprintf(buf + strlen(buf), ...) replaced with snprintf() into temp
buffer followed by strlcat
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
It is possible that a if_lookup_by_index can return NULL
ensure that we handle this appropriately.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Issue:
Client---LHR---RP
1. Add kernel route for RP on LHR. Client send join
2. (*,G) will be get created in LHR and RP.
3. Kill the FRR on all the nodes
4. Start FRR only on LHR node
5. In LHR, (*, G) will be created with iif as unknown.
Root cause:
In the step 4, When LHR will receive igmp join, it will call
the function pim_ecmp_fib_lookup_if_vif_index which will look
for nexthop to RP with neighbor needed as false. So RPF lookup will
be true as the route is present in the kernel. It will create a
(*, G) channel_oil with incoming interface as the RPF interface
towards RP and install the (*,G) mroute in kernel.
Along with this (*,G) upstream gets craeted, which call the function
pim_rpf_update, which will look for the nexthop to RP with neighbor
needed as true. As the frr is not running in RP, no neighbor is present
on the nexthop interface. Due to which this will fail and will update
the channel_oil incoming interface as MAXVIFS(32).
Fix:
pim_ecmp_fib_lookup_if_vif_index() call the function pim_ecmp_nexthop_lookup
with neighbor_needed as true.
Signed-off-by: Sarita Patra <saritap@vmware.com>
PIM MLAG DF election API was not being triggered on cost change if the
upstream neighbor remained the same.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Convert the upstream_list and hash to a rb tree, Significant
time was being spent in the listnode_add_sort. This reduces
this time greatly.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This was causing pimd to crash later; call-stack -
(gdb) bt
context=<optimized out>) at lib/sigevent.c:254
group=group@entry=0x7ffffa9797e0) at pimd/pim_rp.c:207
grp=grp@entry=0x7ffffa9799fe, sgs=sgs@entry=0x560ac069edb0, size=52)
at pimd/pim_msg.c:200
groups=<optimized out>) at pimd/pim_join.c:562
at pimd/pim_neighbor.c:288
at lib/thread.c:1599
at lib/libfrr.c:1024
envp=<optimized out>) at pimd/pim_main.c:162
(gdb) fr 4
group=group@entry=0x7ffffa9797e0) at pimd/pim_rp.c:207
207 pimd/pim_rp.c: No such file or directory.
(gdb) fr 6
grp=grp@entry=0x7ffffa9799fe, sgs=sgs@entry=0x560ac069edb0, size=52)
at pimd/pim_msg.c:200
200 pimd/pim_msg.c: No such file or directory.
(gdb) p source->up->sg_str
$1 = '\000' <repeats 31 times>, <incomplete sequence \361>
(gdb)
This problem can manifest in the following event sequence -
1. upstream RPF neighbor is resolved
2. upstream RPF neighbor becomes unresolved (but upstream entry
stays on the jp-agg list)
3. upstream entry is removed
on the next old-neighbor jp-agg-list processing the stale entry is
accessed resulting in the crash.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
mfcc_parent for an (S, G) entry was being updated on any upstream RPF
change. With the change to use RPT for (S,G) in some cases we can no
longer do that. Instead the upstream entry's RPF neigbor is managed
separately form the channel_oil's mfcc_parent i.e. via NHT. And the
mfcc_parent is evaluated at the time of mroute programming.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Debian packaging when run finds a bunch of spelling errors:
I: frr: spelling-error-in-binary usr/bin/vtysh occurences occurrences
I: frr: spelling-error-in-binary usr/lib/frr/bfdd Amount of times Number of times
I: frr: spelling-error-in-binary usr/lib/frr/bgpd occurences occurrences
I: frr: spelling-error-in-binary usr/lib/frr/bgpd recieved received
I: frr: spelling-error-in-binary usr/lib/frr/isisd betweeen between
I: frr: spelling-error-in-binary usr/lib/frr/ospf6d Infomation Information
I: frr: spelling-error-in-binary usr/lib/frr/ospfd missmatch mismatch
I: frr: spelling-error-in-binary usr/lib/frr/pimd bootsrap bootstrap
I: frr: spelling-error-in-binary usr/lib/frr/pimd Unknwon Unknown
I: frr: spelling-error-in-binary usr/lib/frr/zebra Requsted Requested
I: frr: spelling-error-in-binary usr/lib/frr/zebra uknown unknown
I: frr: spelling-error-in-binary usr/lib/x86_64-linux-gnu/frr/libfrr.so.0.0.0 overriden overridden
This commit fixes all of them except the bgp `recieved` issue due to
it being part of json output. That one will need to go through
a deprecation cycle.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Pim will do the nexthop registration with "ip pim rp" static configuration
with this Zebra will advertise the Route Information.
But while processing this info at PIM, if Nexthop Interfaces are not PIM
enabled, currently PIM is dropping those paths. in case all paths are not
PIM enabled, there is no valid RPF Interface at PIM.
and PIM will be stuck at this state until Next update this to route, that
can happen only if there is a Routing change at Zebra for this prefix.
until that time PIM will not have any valid outgoing Interface.
This issue was mainly seen during Node bootup scenarios.
Fix Proposed
=============
store the paths in PIM PNC Data structure though they are not enabled
with PIM, because while selecting the Interface PIM checks for multicast
enabled Interface.
Tests Performed
===============
1. Verified fail Test case
2. Disabling the PIM on selected outgoing Interface, PIM is choosing
another path when Neighbor is down on this Interface.
3. Re-configure the PIM on above un-configured Interface, PIM is staying
with old NHop since it is valid.
Signed-off-by: Satheesh Kumar K <sathk@cumulusnetworks.com>
We already log whether or not we add nht tracking, having
an additional boolean to say to log another line is
a bit over the top.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
For each BSM packet, rpf check is performed. We will be accepting if the
source address match any of the next hop neighbor(in ecmp case) to reach
the Bootstrap Router.
1. pim_nexthop_match - this lookup in zebra and return true if any of the
next hop nbr is matching (in ecmp case).
2. pim_nexthop_match_nht_cache - this api searches the given address in local
pnc and return true if any of the next hop
nbr is matching (in ecmp case).
Signed-off-by: Saravanan K <saravanank@vmware.com>
This macro:
- Marks ZAPI callbacks for readability
- Standardizes argument names
- Makes it simple to add ZAPI arguments in the future
- Ensures proper types
- Looks better
- Shortens function declarations
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Track whether or not we have received an answer from
our query to do nexthop tracking. This allows us to
go straight to doing a synchronous query for our
RPF.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Start the separation of tracking a Destination from the act
of looking it up. The cojoining of these two concepts led
to a bunch of code that had to think about both problems leading
to weird situations and code paths. Simplify the code by making
pim_ecmp_nexthop_search a static function and we only ever
call pim_ecmp_nexthop_lookup when we need to do a RPF().
pim_ecmp_nexthop_lookup will now attempt to find a stored pnc
and if it finds one it will report on the answer from it.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The pim_resolve_upstream_nh function call is no longer being used
let's remove it from the code base.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When RP gets deleted, find all the (*, G) upstream whose group belongs to
the deleted RP, release the upstream from pnc->upstream_hash in the function
pim_delete_tracked_nexthop().
Signed-off-by: Sarita Patra <saritap@vmware.com>
When route to RP gets modified, FRR receives a notification from
zebra, and call the function pim_resolve_upstream_nh() to compute the
nexthop and update upstream->rpf structure.
Issue: In case when RP becomes not reachable, FRR only uninstall
the mroute from the kernal, but not update the upstream->rpf structure.
Fix: When FRR receives a notification from zebra saying RP becomes
not reachable, then update the following fields.
1. update channel_oil incoming interface as MAXVIFS
2. Un-install the mroute from the kernel.
3. Switch upstream state from JOINED to NOTJOINED.
4. Clear the nexthop information of the upstream.
Signed-off-by: Sarita Patra <saritap@vmware.com>
When route to RP gets modified, FRR receives a notification from
zebra, and call the function pim_update_rp_nh() to compute the
new nexthop and will update the source_nexthop information of
rp_info. This is not working for the case when RP becomes not
reachable.
Fix: When FRR receives a notification from zebra saying RP becomes
not reachable, then delete the source_nexthop informatio of rp_info.
Signed-off-by: Sarita Patra <saritap@vmware.com>
When FRR receives IGMP/PIM (*, G) join and RP is not configured or not
reachable, then we are creating a dummy upstream with incoming interface
as NULL and upstream address as INADDR_ANY.
Added upstream address and incoming interface validation where it is necessary,
before doing any operation on the upstream.
Signed-off-by: Sarita Patra <saritap@vmware.com>
pimd/pim_nht.c: In function ‘pim_ecmp_nexthop_search’:
pimd/pim_nht.c:523:17: error: ‘nbr’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
nexthop->nbr = nbr;
(on gcc 5.4.0; this is the only warning with that version.)
Signed-off-by: David Lamparter <equinox@diac24.net>
Abstract the RPF change for upstream handling code so
that we do not have two copies of the code.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
After we have decided what has changed as part of a update
we need to send the j/p messages to our peers.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Both pim_ecmp_nexthop_lookup and pim_ecmp_fib_lookup_if_vif_index
pass the address in 2 times. Make function calls consistent
and just pass in the src once.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The pim_ecmp_fib_looikup_if_vif_index does practically
the same work as pim_ecmp_nexthop_lookup, refactor to
use that function so that we do not have more code
that must parse the results from zclient_lookup_nexthop.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When we are looking up a RPF with a ecmp path, there
are situations where we are failing to find a path change
because we were not considering the actual number of neighbors
we have available to us at the start of the loop.
Example:
Suppose 2 way ecmp with a neighbor on each path. We have
multiple upstreams that are strewn across both paths.
If we loose a pim neighbor on one of the paths we would
initiate a rescan of the upstreams. If the neighbor
we lost happened to be the last ecmp path we rescanned
we would not successfully find a new path and leave
the upstream stranded.
This code change looks at the number of available neighbors
that we have -vs- the number of paths we have and chooses
the smaller of the two for figuring out what to do.
There probably exist other failure scenarios as well that
I am missing here and quite frankly the current code muddies
the water between a RPF lookup failure -vs- a RPF lookup succeeded
and there are no paths. Further work is needed here imo.
Additionally this idea of a pim_ecmp_nexthop_lookup and
pim_ecmp_nexthop_search is bogus. They are the same function and
should be merged at some point in time.
Ticket: CM-21599
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Fix a couple of problems in my 1st fix for PIM nexthops reachable via a
connected route:
Use NEXTHOP_TYPE_IPV4_IFINDEX instead of NEXTHOP_TYPE_IPV4 since we add an
IPv4 address to an already known ifindex.
Assign nexthop_tab[num_ifindex].protocol_distance and .route_metric before
incrementing num_ifindex.
Revert the default: to individual switch case statement conversion in
zclient_read_nexthop() as requested by donaldsharp in #2347
Signed-off-by: Martin Buck <mb-tmp-tvguho.pbz@gromit.dyndns.org>
These commands were being accepted in all vrf's and
affecting all vrf's behavior globally, since they were
global variables.
Modify the code to make these two commands work
on a per-vrf basis.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When sending a PIM join upwards on the RP-based tree, it may get dropped on
the last hop before the RP if the RP is reachable via a connected route
(i.e. there's no associated nexthop). pimd needs to put the nexthop IP
address into the PIM join payload and fails to do that if that route has a
nexthop of 0.0.0.0. So whenever we look up a route to determine the nexthop
or we receive a nexthop tracking update from Zebra, use the destination
address as the nexthop address for connected routes.
Fixes#2326.
Signed-off-by: Martin Buck <mb-tmp-tvguho.pbz@gromit.dyndns.org>
Create a zapi_nexthop_update_decode function that both
pim and bgp use to decode the message from zebra.
There probably could be further optimizations but I opted
to keep the code as similiar as is possible between the
originals because they both make some assumptions about
code flow that I do not fully understand yet.
The real goal here is that I want to create a new
user of the nexthop tracking code from a higher level
daemon and I see no need to re-implement this damn
code again for a 3rd time.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Abstract the code that sends the zapi message into zebra
for the turn on/off of nexthop tracking for a prefix.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
A recent commit has shown that we were not consistent with
handling of the vrf lookup. Adjust pim to do the right
thing with vrf lookup to be consistent and to make SA
happier.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This improves code readability and also future-proofs our codebase
against new changes in the data structure used to store interfaces.
The FOR_ALL_INTERFACES_ADDRESSES macro was also moved to lib/ but
for now only babeld is using it.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
This is an important optimization for users running FRR on systems with
a large number of interfaces (e.g. thousands of tunnels). Red-black
trees scale much better than sorted linked-lists and also store the
elements in an ordered way (contrary to hash tables).
This is a big patch but the interesting bits are all in lib/if.[ch].
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
Convert the list_delete(struct list *) function to use
struct list **. This is to allow the list pointer to be nulled.
I keep running into uses of this list_delete function where we
forget to set the returned pointer to NULL and attempt to use
it and then experience a crash, usually after the developer
has long since left the building.
Let's make the api explicit in it setting the list pointer
to null.
Cynical Prediction: This code will expose a attempt
to use the NULL'ed list pointer in some obscure bit
of code.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Current cleanup is for unset values or variables that are not used anymore.
Regarding ospfd/ospf_vty.c: argv_find()
we'll never get it NULL, so get coststr = argv[idx]->arg;
This does three things:
1) When we get a RPF_FAILURE, remove the mroute associated
with it.
-> This way when the RPF comes back we can just add the
mroute in as part of the normal scanning process.
2) When we do a ecmp_nexthop_search return 1 when we found
something we can use.
3) Ignore output from pim_update_rp_nh
-> When we do a ecmp_nexthop_search ignore the return
code and do not attempt to gather it up to return
to the calling function. It is just ignored
and we were not taking into account the what of
multiple RP's we were looking at.
Ticket: CM-17218
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The NHT upstream list at scale is horribly inefficient due to keeping
a sorted list of upstream entries. The attempting to find
the upstream and the insertion of it into the upstream_list
was consuming a large amount of cpu cycles.
Convert to a hash, allow add/deletions to effectively become
O(1) events.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When we receive a new ecmp path and the old nexthop is still
valid. There existed some cases where we would continue looking
for a nexthop( and thus loose the fact that we had found it )
after found.
Ticket: CM-16983
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
Ensure that displayed (S,G) output in logs is
consistent for all debugs. This will make it
easier to grep for interesting data.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Be aware that we may not have pim configured on all interfaces when
we have a failure situation.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Move the upstream_list, hash and wheel into 'struct pim_instance'
Remove all pimg to pim in pim_upstream
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
A bunch of functions had return values that were never
checked for ( and not needed ) and opposite return values
for proper calling function boolean logic.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
-Upon Receving SGRpt Prune message, transitioning from Prune Pending state
to NOINFO state, ifchannel entry was getting deleted in prune pending timer
expiry. This can result in SGRpt ifhchannel deleted and recreated upon receving
triggered or periodic SGRpt received from downstream.
The automation test failed as it expected (check) SGRpt entry at RP after it triggers
SPT switchover.
- While transitioning from Prune-Pending state to NOINFO(Pruned) state, Trigger
SGRpt message towards RP.
- Add/del some of the debug traces
Ticket:CM-16057
Reviewed By:CCR-6198
Testing Done:
Rerun test08 multiple times and observed passing it.
Pim-smoke with hardnode
Ran 95 tests in 11219.420s
FAILED (SKIP=10, failures=4)
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
The FSF's address changed, and we had a mixture of comment styles for
the GPL file header. (The style with * at the beginning won out with
580 to 141 in existing files.)
Note: I've intentionally left intact other "variations" of the copyright
header, e.g. whether it says "Zebra", "Quagga", "FRR", or nothing.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
During PIM Neighbor change/UP event, pim_scan_oil api
scans all channel oil to see any rpf impacted. Instead of
passing current upstream's RPF it passes current RPF as 0 and
does query to rib for nexhtop (without ECMP/Rebalance). This creates
inconsist RPF between Upstream and Channel oil.
In Channel Oil keep backward pointer to upstream DB and fetch up's
RPF and passed to channel_oil scan.
Decrement channel_oil ref_count in upstream_del when decrementing
up ref_count and it is not the last.
Created ECMP based FIB lookup API.
Testing Done:
Performed following testing on tester setup:
5 x LHR, 4 x MSDP Spines, 6 Sources each sending to 1023 groups from one of the spines.
Total send rate 8Mpps.
Test that caused problems was to reboot every device at the same time.
After fix performed 5 iterations of reboot devices and show no sign of the problem.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
During neighbor down event, all upstream entries rpf lookup may result
into nhop address with 0.0.0.0 and rpf interface info being NULL.
Put preventin check where rpf interface info is accessed.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Execute ip igmp version 3 under swp interface,
verified show running displayed 'ip igmp' configuration.
Continuous sending group membership, performed 'no ip igmp'
and verified, group membership flushed. Performed
'ip igmp version 3', verified 'show ip igmp groups'
displaying igmp membership re-populated.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
In this patch, PIM nexthop tracking uses locally populated nexthop cached list
to determine ECMP based nexthop (w/ ECMP knob enabled), otherwise picks
the first nexthop as RPF.
Introduced '[no] ip pim ecmp' command to enable/disable PIM ECMP knob.
By default, PIM ECMP is disabled.
Intorudced '[no] ip pim ecmp rebalance' command to provide existing mcache
entry to switch new path based on hash chosen path.
Introduced, show command to display pim registered addresses and respective nexthops.
Introuduce, show command to find nexthop and out interface for (S,G) or (RP,G).
Re-Register an address with nexthop when Interface UP event received,
to ensure the PIM nexthop cache is updated (being PIM enabled).
During PIM neighbor UP, traverse all RPs and Upstreams nexthop and determine, if
any of nexthop's IPv4 address changes/resolves due to neigbor UP event.
Testing Done: Run various LHR, RP and FHR related cases to resolve RPF using
nexthop cache with ECMP knob disabled, performed interface/PIM neighbor flap events.
Executed pim-smoke with knob disabled.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
(cherry picked from commit cba4448178)
It is impossible for the list->cmp function to
ever be handed NULL values as the arguments.
Clean up this in the code.
Additionally consolidate the exact same two functions
into 1 function.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
There is no need for a function that calls another function.
Additionally, nexthop_updates from zebra can be either
ZEBRA_NEXTHOP_UPDATE -or-
ZEBRA_IMPORT_CHECK_UPDATE
If we were to receive a IMPORT_CHECK_UPDATE the code
would cause a immediate crash. Fix this
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add pim Nexthop tracking feature 1st part where, specific RP or Source address (unicast address)
register with Zebra. Once nexthop update received from Zebra for a given address, scan RP or upstream
entries impacted by the change in nexthop.
Reviewed By: CCR-5761, Donald Sharp <sharpd@cumulusnetworks.com>
Testing Done: Tested with multiple RPs and multiple *,G entries at LHR.
Add new Nexthop or remove one of the link towards RP and verify RP and upstream nexthop update.
similar test done at RP with multiple S,G entries to reach source.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>