Modify code base so that pim_upstream *always* creates a channel_oil
and as such we do not need to create it later or play other games.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Add a function that allows you to modify the channel oil's incoming
interface and to appropriately install/remove it from the kernel.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
zvni setup in zebra is controlled via bgpd i.e. advertise_all_vni
from bgpd triggers this setup. As a part of zvni creation we may need
to setup BUM mcast SG entries which are propagated to pimd for MDT setup.
Now pimd may not be present at the time of zvni creation or may restart
post zvni creation so we need a mechanism to replay (on pimd startup) and
to cleanup (on pimd stop). This is addressed via zebra_vxlan_sg_replay and
zebra_evpn_pim_cfg_clean_up.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
When we receive an igmp query on a interface, ensure that the
source address of the packet is connected to the incoming
interface. This will prevent a meanie from crafting a igmp
packet with a source address less than ours and causing
us to suspend query activities.
Fixes: #1692
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Fix details :
Added a utility cli to generate a igmp query on an interface.
This won't impact the existing query generation based on the
general query interval.
Signed-off-by: Rajesh Girada <rgirada@vmware.com>
In a pim-evpn setup (say TORC11<=>TORC12) an mroute can have a mix of
PIM and IGMP joins. The vxlan termination device ipmr-lo is IGMP
joined on termination mroutes and the peerlink-rif can be pim joined
on the same mroute if the MLAG peer (TORC11) loses all its uplinks to
underlay -
root@TORC12:~# net show pim state 239.1.1.101|grep pimreg
1 * 239.1.1.101 uplink-1
pimreg(I ), ipmr-lo( J ), peerlink-3.4094( J )
root@TORC12:~#
When the uplinks come back up on TORC11 it will prune the peerlink-rif
and join the RP (say spine) via the uplinks.
TORC12 is rxing the prune and removing the if_channel
(pim_ifchannel_delete). However it is not removing the OIF from
mfcc_ttl basically leaving behind a leaked OIF in the forwarding
entry. And this is because it is deriving the owner flag from the
parent upstream entry and incorrectly concluding that all OIFs are
IGMP joined.
Thix fix flushes out both PIM and IGMP ownership when the ifchannel is
deleted.
There is a second fix in the commit and that is to set the proto mask
correctly (to STAR) for inherited OIFs. (S,G) entries can inherit the
OIF from the (*, G) entry and this decision can change when the pim/igmp
ifchannel is removed. The earlier code was setting the proto-mask
incorrectly to PIM or IGMP.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
(cherry picked from commit d4d1d968dbbe61347393f7dace8b675496ff1024)
When a link goes down the vifi was being deleted but the OIF stayed
in the OIL with a stale vifi -
oroot@act-7726-03:~# net show pim state
Codes: J -> Pim Join, I -> IGMP Report, S -> Source, * -> Inherited from (*,G), V -> VxLAN
Installed Source Group IIF OIL
1 * 239.1.1.111 swp1s1 pimreg(I ), ipmr-lo( J )
1 6.0.0.28 239.1.1.111 lo pimreg( J ), ipmr-lo( *), swp1s1( J )
root@act-7726-03:~# ip link set swp1s1 down
root@act-7726-03:~# net show pim state
Codes: J -> Pim Join, I -> IGMP Report, S -> Source, * -> Inherited from (*,G), V -> VxLAN
Installed Source Group IIF OIL
1 * 239.1.1.111 swp1s0 pimreg(I ), ipmr-lo( J )
1 6.0.0.28 239.1.1.111 lo ipmr-lo( *), swp1s0( J ), <oif?>( J ) >>>>>>>>
root@act-7726-03:~#
The problem was as a part ifchannel_delete the join state of the channel
was checked to avoid incorrect OIF deletion this was preventing the OIF
from being flushed. Fix is to flip the channel join-state to NOINFO before
deleting it.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
There has never been a `debug igmp trace detail` but we have
had code to display this when we had the appropriate flags
set. Since we never can accept this, let's remove this.
This showed up because of commit:0ab16492d2d9fcc6cba7e001227deed6765ed261
where we re-arranged some debugs to combine them being turned on.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Position of address family and encoding type is defined wrongly in the
packed structure. This will correct the position as per RFC 4601 4.9.1
Signed-off-by: Saravanan K <saravanank@vmware.com>
If we create a channel_oil ensure that all paths that
we can go down will create one. Future commits
can remove the (up->channel_oil) tests.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
There is no need to check for ALLOC function failures
in the code base. If we cannot get more memory we
assert.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The vifi being displayed is just confusing. Display the
actual interface name being used in the mroute.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
This is mostly relevant for Solaris, where config.h sets up some #define
that affect overall header behaviour, so it needs to be before anything
else.
Signed-off-by: David Lamparter <equinox@diac24.net>
In Scenario where receiver is present in a subnet where 2 or more pim mrouters.
When IGMP query received on a DR interface and RP is reachable through non DR.
Currently we are blocking to create upstream where iif == oif. So pim join
not generated towards RP. We have to allow the DR router in the network to create an upstream.
Signed-off-by: Saravanan K <saravanank@vmware.com>
clippy can't process #ifdef or similar bits inside of an argument list
(e.g. within the braces of a DEFUN or DEFPY statement.) Improve error
reporting to catch these cases instead of generating broken C code.
Fixes: #3840
Signed-off-by: David Lamparter <equinox@diac24.net>
Field vrf_id is replaced by the pointer of the struct vrf *.
For that all other code referencing to (interface)->vrf_id is replaced.
This work should not change the behaviour.
It is just a continuation work toward having an interface API handling
vrf pointer only.
some new generic functions are created in vrf:
vrf_to_id, vrf_to_name,
a zebra function is also created:
zvrf_info_lookup
an ospf function is also created:
ospf_lookup_by_vrf
it is to be noted that now that interface has a vrf pointer, some more
optimisations could be thought through all the rest of the code. as
example, many structure store the vrf_id. those structures could get
the exact vrf structure if inherited from an interface vrf context.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
vrf_id parameter is replaced with struct vrf * parameter. It is
needed to create vrf structure before entering in the fuction.
an error is generated in case the vrf parameter is missing.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
the vrf_id parameter is replaced by struct vrf * parameter.
this impacts most of the daemons that look for an interface based on the
name and the vrf identifier.
Also, it fixes 2 lookup calls in zebra and sharpd, where the vrf_id was
ignored until now.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The pim ifchannel expiry timer was not setting any debug output.
Let's add something in to help us understand what is going on.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
We already log whether or not we add nht tracking, having
an additional boolean to say to log another line is
a bit over the top.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
While this is impossible, make the compilers a bit happier
for those of us having to use something old.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com
Various compilers in our CI system were complaining about various
auto-conversions. Let's get these cleaned up a bit more.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When pim is started and has nothing to talk to zebra about
over the zlookup socket and interface events are happening
then the zlookup socket is not being drained.
This eventually leads to a situation where the kernel send buffer
fills up and zebra is unable to write anything down the pipe.
At this time the zapi starts buffering data in `struct buffer`
which of course slowly fills up as pim has nothing to do.
As a bit of a hack allow the zlookup socket to wake up 1 time
a minute and ask for information about the default route
and do nothing with it. This will cause the socket buffers
to be drained and the system will be happy.
Long term we need to get rid of this synchronous/asynchronous
duality that pim has. This is on the radar but is not something
that could be fixed in an afternoon or a week of effort in my
opinion. Given time constraints right now. Let's put this
in place and then once we get pim completely async then
we can just remove the zlookup( I hope ) code.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
A couple of places of strncpy snuck in due to my confusion
about if Quentin's earlier change had gotten in. Just some
code in flux. This should fix the issue/warnings in our
CI system.
Recent commits rewrote the `clear mroute` command and this caused
these two two functions to no longer be used, remove.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The header inclusion in mtracebis.c should include zebra.h
and not config.h. There are other issues that need to
be worked through as well, but we can do this in the future.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Introduce new cli commands ip igmp last-member-query-count <1-7>
ip igmp last-member-query-interval <1-255> deciseconds.
Display the config in show running config and show ip igmp interface
Signed-off-by: Sarita Patra <saritap@vmware.com>
Introduced a new command "show ip mroute summary"
to display total number of (*, G) and (S, G) mroutes
created and number of mroutes installed in the kernel.
Signed-off-by: Sarita Patra <saritap@vmware.com>
Made changes to clean up the all upstreams and ifchannels
in FRR apart from cleanup datapath mroutes when this command
issued.
Signed-off-by: Rajesh Girada <rgirada@vmware.com>
Fix: When RP receives a (*, G) join and corresponding (s,g)
is present, then check for OIL is not-empty, then only switch
upstream (s, g) state to JOINED.
Signed-off-by: Sarita Patra <saritap@vmware.com>
This would show only bsm related statistics for now.
We shall add more statistics to this later.
Sw3# show ip pim statistics
BSM Statistics :
----------------
Number of Received BSMs : 1584
Number of Forwared BSMs : 793
Number of Dropped BSMs : 1320
Interface : ens192
-------------------
Number of BSMs dropped due to config miss : 0
Number of unicast BSMs dropped : 0
Number of BSMs dropped due to invalid scope zone : 0
Interface : ens224
-------------------
Number of BSMs dropped due to config miss : 0
Number of unicast BSMs dropped : 0
Number of BSMs dropped due to invalid scope zone : 0
Interface : ens256
-------------------
Number of BSMs dropped due to config miss : 0
Number of unicast BSMs dropped : 0
Number of BSMs dropped due to invalid scope zone : 0
Signed-off-by: Saravanan K <saravanank@vmware.com>
This command shows all the fragments of the last received preferred BSM.
This displayed in readable format.
Sw3# sh ip pim bsm-database
Scope Zone: Global
Number of the fragments: 1
BSM Fragment : 1
------------------
BSR-Address BSR-Priority Hashmask-len Fragment-Tag
30.0.0.100 0 0 3289
Group : 225.1.1.1/32
-------------------
Rp Count:9
Fragment Rp Count : 9
RpAddress HoldTime Priority
20.0.0.2 150 0
2.2.2.2 150 0
9.9.9.10 150 0
7.7.2.7 150 0
7.2.2.7 150 0
7.7.9.7 150 0
7.8.9.10 150 0
7.5.2.7 150 0
9.10.9.10 150 0
Group : 226.1.1.1/32
-------------------
Rp Count:1
Fragment Rp Count : 1
RpAddress HoldTime Priority
9.9.9.9 150 0
Signed-off-by: Saravanan K <saravanank@vmware.com>
If no_fwd bit not set,
forward on all interfaces including which it came.
store it in bsm list with size for forwarding it later to new neighbor.
calculate PIM mtu of the interface, if bsm size is more do sematic frag and send
Signed-off-by: Saravanan K <saravanank@vmware.com>
When all rp received on a partial list, this routine is called.
if static rp configured for the group range
if partial list is empty
clean main list and partial list
else
replace main with partial and start the g2rp timer with head of new main
return
if main list was empty
call rp new with head of partial list and start g2rp timer.
else
if partial list is empty
call rp del
else
stop g2rp timer of old elected rp.
call rp change with new rp(head of partial list) and start g2rp timer.
swap the lists and clean the old list(now partial list).
Signed-off-by: Saravanan K <saravanank@vmware.com>
Bootstrap rp table is route_table datastructure with group range as key.
Each node represents a group range.
Every node has two lists of rp nodes. partial list and active list(bsrp_list)
Whenever a rp is parsed from BSM, it is updated to partial list.
When partial list is full, we move it to main list(bsrp_list). This commit doesn't cover that.
Rp Election routine based on RFC 7761 Sec 4.7
Hash calculation for rp election based on RFC 7761 Sec 4.7.2
Signed-off-by: Saravanan K <saravanank@vmware.com>
1. Packet validation as per RFC 5059 Sec 3.1.3
We won't supporting scope zone BSM as of now, they are dropped now.
Order of the check slightly be changed in code for optimization.
if ((DirectlyConnected(BSM.src_ip_address) == FALSE) OR
(we have no Hello state for BSM.src_ip_address)) {
drop the Bootstrap message silently
}
if (BSM.dst_ip_address == ALL-PIM-ROUTERS) {
if (BSM.no_forward_bit == 0) {
if (BSM.src_ip_address != RPF_neighbor(BSM.BSR_ip_address)) {
drop the Bootstrap message silently
}
} else if ((any previous BSM for this scope has been accepted) OR
(more than BS_Period has elapsed since startup)) {
#only accept no-forward BSM if quick refresh on startup
drop the Bootstrap message silently
}
} else if ((Unicast BSM support enabled) AND
(BSM.dst_ip_address is one of my addresses)) {
if ((any previous BSM for this scope has been accepted) OR
(more than BS_Period has elapsed since startup)) {
#the packet was unicast, but this wasn't
#a quick refresh on startup
drop the Bootstrap message silently
}
} else {
drop the Bootstrap message silently
}
2. Nexthop tracking registration for BSR
3. RPF check for BSR Message.
Zebra Lookup based rpf check for new BSR
NHT cache(pnc) based lookup for old BSR
Signed-off-by: Saravanan K <saravanank@vmware.com>
This commit includes parsing of Nbit and contructing pim hdr with Nbit
Adding Nbit to PIm hdr structure
Adding Scope zone bit and Bidir bit to Encoded IPv4 Group Address
Signed-off-by: Saravanan K <saravanank@vmware.com>
When bs time out occurs,
1. Delete the bsm list
2. Reset the BSR address
3. delete nexthop tracking for the expired BSR
4. Give one more lease of life to all the bsr advertised rp with hold time
5. clear partial list of each grp node if not empty
Signed-off-by: Saravanan K <saravanank@vmware.com>
DS Overview:
Bootstrap RP table has grp node.
scope --> rp table --> grp node1 --> rp list --> rp nodes(g2rp timer)
|
-------> grp node2 --> rp list --> rp nodes(g2rp timer)
When grp2rp mapping expires, following has to be done.
1. delete the rp node from the active bs-rp list in the list
2. calculate the elapsed time for other rp nodes in the list
3. delete those nodes having more elapse time than their hold time
4. If the list is not empty and current rp src is not static
rp change with new rp(head) & start g2rp timer with value holdtime - elapse
5. If the list is empty and current rp src for the grp is not static
delete the rp
6. If the list is not empty and current rp is static, just start the
g2rp timer with value holdtime - elapse
7. If list is empty and pending list is empty, delete grp node.
Note: g2rp timer will be run only on elected RP node for optimization.
when it expires, other node are update with elapse time.
This list is sorted insuch way that elected RP is the HEAD of list
Signed-off-by: Saravanan K <saravanank@vmware.com>
Command to display current bsr, last received bsm ts, bsr uptime
Sw3# sh ip pim bsr
PIMv2 Bootstrap information
Current preferred BSR address: 30.0.0.100
Priority Fragment-Tag State UpTime
0 6390 ACCEPT_PREFERRED 91:26:24
Last BSM seen: 00:00:37
Signed-off-by: Saravanan K <saravanank@vmware.com>
pim_rp_new split into pim_rp_new_config and pim_rp_new.
pim_rp_new_config is called by CLI.
pim_rp_new will be called by pim_rp_new_config and bsm rp config.
pim_rp_del is split into pim_rp_del_config and pim_rp_del
pim_rp_del_config is called by CLI.
pim_rp_del is called by pim_rp_del_config and bsm rp config
Signed-off-by: Saravanan K <saravanank@vmware.com>
(intf)ip pim bsm - to enable bsm processing on the interface
(intf)no ip pim bsm - to disable bsm processing on the interface
(intf)ip pim unicast-bsm - to enable ucast bsm processing on the interface
(intf)no ip pim unicast-bsm - to disable ucast bsm processing on the interface
Note: bsm processing and ucast bsm processing is enabled by default on a
pim interface. The CLI is implemented as a security feature as recommended by
RFC 5059
Signed-off-by: Saravanan K <saravanank@vmware.com>
Sw3# sh ip pim rp-info
RP address group/prefix-list OIF I am RP Source
20.0.0.2 225.1.1.1/32 ens192 no BSR
9.9.9.9 226.1.1.1/32 (Unknown) no BSR
30.0.0.100 229.1.1.5/32 ens192 no Static
Signed-off-by: Saravanan K <saravanank@vmware.com>
For each BSM packet, rpf check is performed. We will be accepting if the
source address match any of the next hop neighbor(in ecmp case) to reach
the Bootstrap Router.
1. pim_nexthop_match - this lookup in zebra and return true if any of the
next hop nbr is matching (in ecmp case).
2. pim_nexthop_match_nht_cache - this api searches the given address in local
pnc and return true if any of the next hop
nbr is matching (in ecmp case).
Signed-off-by: Saravanan K <saravanank@vmware.com>
Apart from datastructure, bsm scope initialization and deinitialiation
routines called during pim instance init and deinit. Also makefile changes.
Signed-off-by: Saravanan K <saravanank@vmware.com>
It doesn't make much sense for a hash function to modify its argument,
so const the hash input.
BGP does it in a couple places, those cast away the const. Not great but
not any worse than it was.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
bfd cbit is a value carried out in bfd messages, that permit to keep or
not, the independence between control plane and dataplane. In other
words, while most of the cases plan to flush entries, when bfd goes
down, there are some cases where that bfd event should be ignored. this
is the case with non stop forwarding mechanisms where entries may be
kept. this is the case for BGP, when graceful restart capability is
used. If BFD event down happens, and bgp is in graceful restart mode, it
is wished to ignore the BFD event while waiting for the remote router to
restart.
The changes take into account the following:
- add a config flag across zebra layer so that daemon can set or not the
cbit capability.
- ability for daemons to read the remote bfd capability associated to a bfd
notification.
- in bfdd, according to the value, the cbit value is set
- in bfdd, the received value is retrived and stored in the bfd session
context.
- by default, the local cbit announced to remote is set to 1 while
preservation of the local path is not set.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The route_map_event_hook callback was passing the `route_map_event_t`
to each individual interested party. No-one is ever using this data
so let's cut to the chase a bit and remove the pass through of data.
This is considered ok in that the routemap.c code came this way
originally and after 15+ years no-one is using this functionality.
Nor do I see any `easy` way to do anything useful with this data.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
become stale entries.
Topology:
--------
Source
|
FHR
|
RP ------ LHR --- Recv1
|
Recv2
Root case :
-----------
When RP acts as a LHR i.e RP has a local receiver and registed for
the same group where LHR connected receiver also registered for the
same multicast group.When RP receives a (s,g) join form LHR , it
increments upstream ref count to two to track the Local membership
as well.But at the time of KAT expiry in RP , upstream reference
is not being removed Which is added to track local membership which
is causing to make these entries as stale in RP and FHR.
Fix : Made the change such that it removes the upstream reference
if it is added to track the local memberships.
Signed-off-by: Rajesh Girada <rgirada@vmware.com>
vrf_id parameter is added to the api of bfd_client_sendmsg().
this permits being registered to bfd from a separate vrf.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>