This is just hitting the pim_mroute code with a hammer until it doesn't
print warnings anymore. This is NOT quite tested or working yet, it
just compiles.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Just fixing a bunch of compiler errors, this will NOT actually configure
IPv6 PIM properly.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Since `pim_sgaddr` is `pim_addr` now, that causes a whole lot of fallout
anywhere S,G pairs are handled.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
These are sprinkled relatively widely through the PIM codebase, so for
the time being reduce the "compiler warning surface" by moving them
forward to proper types without actual implementations.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
`show ip pim assert` shows S,G ifchannel information even when
there is no information available about the assert process.
Fix the code to not dump non-interesting cases.
Fixes: 10462
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
As per RFC 2236 section 3, when the leave message is received at a querier,
it starts sending Query messages for "last Member Query Interval*query count"
During this time there should not be any querier to non-querier
transition and the same router needs to send the remaning queries.
Fixes: #10422
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
In all places that pim_nht_bsr_del is called, the code
needs to not unregister if the current_bsr is INADDR_ANY.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
I'm now seeing in my log file:
2022/01/28 11:20:05 PIM: [Q0PZ7-QBBN3] attempting to delete nonexistent NHT BSR entry 0.0.0.0
2022/01/28 11:20:05 PIM: [Q0PZ7-QBBN3] attempting to delete nonexistent NHT BSR entry 0.0.0.0
2022/01/28 11:20:06 PIM: [Q0PZ7-QBBN3] attempting to delete nonexistent NHT BSR entry 0.0.0.0
When I run pimd. Looking at the code there are 3 places where pim_bsm.c removes the
NHT BSR tracking. In 2 of them the code ensures that the address is already setup
in 1 place it is not. Fix.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
XPATH_MAXLEN denotes the maximum length of an XPATH. It does not make
sense to allocate a buffer intended to contain an XPATH with a size
larger than the maximum allowable size of an XPATH. Consequently this PR
removes buffers that do this. Prints into these buffers are now checked
for overflow.
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
VRF name should not be printed in the config since 574445ec. The update
was done for NB config output but I missed it for regular vty output.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Problem Statement:
==================
(rcv1)-----A----B---C
v3 enabled with src 90.0.0.1
|
(rcv2)--
v2 enable with src none
rcv1 sends the packet in INCLUDE mode, rcv2 sends the IGMPv2 report
and PIM convers this report into exclude mode.
As per the state machine the group structure was
getting added and deleted. As group gets deleted the mroute for 90.0.0.1
and recreated back.
This effects the end to end trafiic.
Root Cause Analysis:
====================
As per state machine
INCLUDE (A) IS_EX (B) EXCLUDE (A*B,B-A) (B-A)=0
Delete (A-B)
Group Timer=GMI
EXCLUDE (X,Y) TO_EX (A) EXCLUDE (A-Y,Y*A) (A-X-Y)=Group Timer
Delete (X-A)
Delete (Y-A)
Send Q(G,A-Y)
Group Timer=GMI
The above equations were getiing calulated for IP address
90.0.0.1 and 0.0.0.0
This results in group creation deletion.
Fix:
====
As per RFC 4604.
drop the exclude mode, IGMP reports, if destnation group is
SSM based.
EXCLUDE
mode does not apply to SSM addresses, and an SSM-aware router will
ignore MODE_IS_EXCLUDE and CHANGE_TO_EXCLUDE_MODE requests in the SSM
range,
Signed-off-by: Abhishek N R <abnr@vmware.com>
Signed-off-by: Vishal Dhingra <rac.vishaldhingra@gmail.com>
Problem Statement:
==================
(rcv1)-----A----B---C
v3 enabled with src 90.0.0.1
|
(rcv2)--
v3 enable with src none
rcv1 sends the packet in INCLUDE mode, rcv2 sends the IGMP report
in exclude mode. As per the state machine the group structure was
getting added and deleted. As group gets deleted the mroute for 90.0.0.1
and recreated back.
This effects the end to end trafiic.
Root Cause Analysis:
====================
As per state machine
INCLUDE (A) IS_EX (B) EXCLUDE (A*B,B-A) (B-A)=0
Delete (A-B)
Group Timer=GMI
EXCLUDE (X,Y) TO_EX (A) EXCLUDE (A-Y,Y*A) (A-X-Y)=Group Timer
Delete (X-A)
Delete (Y-A)
Send Q(G,A-Y)
Group Timer=GMI
The above equations were getiing calulated for IP address
90.0.0.1 and 0.0.0.0
This results in group creation deletion.
Fix:
====
As per RFC 4604.
drop the exclude mode, IGMP reports, if destnation group is
SSM based.
EXCLUDE
mode does not apply to SSM addresses, and an SSM-aware router will
ignore MODE_IS_EXCLUDE and CHANGE_TO_EXCLUDE_MODE requests in the SSM
range.
Signed-off-by: Abhishek N R <abnr@vmware.com>
Signed-off-by: Vishal Dhingra <rac.vishaldhingra@gmail.com>
These really don't serve much of a purpose, especially with how
inconsistently they're used.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Only pim_sgaddr uses are covered by this since regular in_addr is still
used for singular addresses, so only a part of pim_inet4_dump calls are
gone with this.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Renamed frr-igmp.yang to frr-gmp.yang, igmp to gmp container.
to support IGMP and MLD protocol.
frr-gmp.yang, created a list of address family under mgmd
container. For PIMV4 the key is IPV4, where as for PIMV6
the key is IPV6. This is done for PIMV6 development.
This commit will have all the northbound changes to support
IPV4 address family.
Signed-off-by: sarita patra <saritap@vmware.com>
Need a separate constant that is IPv6 when needed. Also assign the
whole struct rather than just s_addr.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Replaces comparison against INADDR_ANY, so we can do IPv6 too.
(Renamed from "pim_is_addr_any" for "pim_addr_*" naming pattern, and
type fixed to bool.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
... and replace with `%pSG` printfrr specifier. This actually used a
static buffer in the formatting function, so subsequent formatting would
overwrite earlier uses.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
frr-pim.yang, created a list of address family under pim
container. For PIMV4 the key is IPV4, where as for PIMV6
the key is IPV6. This is done for PIMV6 development.
This commit will have all the northbound changes to support
IPV4 address family.
Signed-off-by: sarita patra <saritap@vmware.com>
Fix the code as per RFC 2236 section 2.5:
Note that IGMP messages may be longer than 8 octets, especially
future backwards-compatible versions of IGMP. As long as the Type is
one that is recognized, an IGMPv2 implementation MUST ignore anything
past the first 8 octets while processing the packet. However, the
IGMP checksum is always computed over the whole IP payload, not just
over the first 8 octets.
Fixes: #10331
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
As per test case IGMP Conformance test case 5.6, report
messages longer than 8 octets should be accepted to support
future-backward compatibilty.
Fix the code as per RFC 2236 section 2.5:
Note that IGMP messages may be longer than 8 octets, especially
future backwards-compatible versions of IGMP. As long as the Type is
one that is recognized, an IGMPv2 implementation MUST ignore anything
past the first 8 octets while processing the packet. However, the
IGMP checksum is always computed over the whole IP payload, not just
over the first 8 octets.
Fixes: #10331
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
Mostly just 2 sed calls:
- `sed -e 's%struct prefix_sg%pim_sgaddr%g'`
- `sed -e 's%memset(&sg, 0, sizeof(pim_sgaddr));%memset(\&sg, 0, sizeof(sg));%g'`
Plus a bunch of fixing whatever that broke.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Just putting the infrastructure in place and having it disabled is
actually good progress here; have the compiler make itself useful and
tell us what we have to do to get the basics right.
(The next commit will cause a *lot* of compile errors as soon as
`PIM_V6_TEMP_BREAK` is set; but there is no reason to force everything
into a single step here.)
To enable `pim_addr = in6_addr`, run `make PIM_V6_TEMP_BREAK=1` (remove
previous compile results with `rm pimd/pim6d-*.o`)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Depending on whether we're compiling pimd or pim6d, these types take on
the appropriate AF being used.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Since this is only used in very few places, moving it out of the way is
reasonable. (`%pSG` will be pim_sgaddr)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
This newborn pim6d is essentially an empty husk, but it does build
without warnings or errors and has the build system integration prepared
and working.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Modifying the members of pim_interface which are to be used
for both IPv4 and IPv6 to common names(for both MLD and IGMP).
Issues: #10023
Co-authored-by: Mobashshera Rasool <mrasool@vmware.com>
Co-authored-by: Sarita Patra <saritap@vmware.com>
Signed-off-by: Sai Gomathi <nsaigomathi@vmware.com>
Changed struct in_addr source_addr to struct PIM_ADDR source_addr
which is to be used in both IPv4 and IPv6(Both MLD and IGMP).
Reviewed-by: Mobashshera Rasool <mrasool@vmware.com>
Signed-off-by: sarita patra <saritap@vmware.com>
Changed struct in_addr upstream_addr and struct in_addr upstream_register
to struct PIM_ADDR upstream_addr and struct PIM_ADDR upstream_register
which are to be used in both IPv4 and IPv6(Both MLD and IGMP).
Reviewed-by: Mobashshera Rasool <mrasool@vmware.com>
Signed-off-by: sarita patra <saritap@vmware.com>
Changed struct in_addr last_lookup to struct PIM_ADDR last_lookup
which is to be used in both IPv4 and IPv6(Both MLD and IGMP).
Reviewed-by: Mobashshera Rasool <mrasool@vmware.com>
Signed-off-by: sarita patra <saritap@vmware.com>
Changed struct in_addr group_addr to struct PIM_ADDR group_addr
in struct igmp_group which is to be used in
both IPv4 and IPv6(Both MLD and IGMP).
Reviewed-by: Mobashshera Rasool <mrasool@vmware.com>
Signed-off-by: sarita patra <saritap@vmware.com>
Changed struct in_addr address to struct pim_addr in struct
pim_iface_upstream_switch which is to be used in both IPv4
and IPv6(Both MLD and IGMP).
Reviewed-by: Mobashshera Rasool <mrasool@vmware.com>
Signed-off-by: sarita patra <saritap@vmware.com>
Changed struct in_addr group to struct pim_addr group
which is to be used in both IPv4 and IPv6(Both MLD and IGMP).
Reviewed-by: Sarita Patra <saritap@vmware.com>
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
Changed struct in_addr source_addr to pim_addr source_addr
in struct igmp_source which is to be used in
both IPv4 and IPv6(Both MLD and IGMP).
Reviewed-by: Sarita Patra <saritap@vmware.com>
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
Changed struct in_addr source_addr to pim_addr source_addr
in struct igmp_join which is to be used in
both IPv4 and IPv6(Both MLD and IGMP).
Reviewed-by: Sarita Patra <saritap@vmware.com>
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
Based on compiler option, pim_addr will be changed to in_addr
or in6_addr for pimd and pim6d respectively.
Reviewed-by: Sarita Patra <saritap@vmware.com>
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
This change is to accomodate IPv6 and IPv4 in the same code.
Based on pimd or pim6d, this will be compiled.
Reviewed-by: Sarita Patra <saritap@vmware.com>
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
Changed struct in_addr ifassert_winner to pim_addr
which will be used in both IPv4 and IPv6(Both MLD and IGMP).
Reviewed-by: Sarita Patra <saritap@vmware.com>
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
Added apis which will be decided on compile time
for pimd and pim6d daemon
Reviewed-by: Sarita Patra <saritap@vmware.com>
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
PIM_ADDR will be controlled at compile time for IPv4 and IPv6.
in_addr and in6_addr will be compiled for the respective daemons.
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
The zlog_warn used to be bounded by a debug guard
but the debug guard was removed but the code was
never fixed up to remove the open and close paranthesis.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Currently, it is possible to rename the default VRF either by passing
`-o` option to zebra or by creating a file in `/var/run/netns` and
binding it to `/proc/self/ns/net`.
In both cases, only zebra knows about the rename and other daemons learn
about it only after they connect to zebra. This is a problem, because
daemons may read their config before they connect to zebra. To handle
this rename after the config is read, we have some special code in every
single daemon, which is not very bad but not desirable in my opinion.
But things are getting worse when we need to handle this in northbound
layer as we have to manually rewrite the config nodes. This approach is
already hacky, but still works as every daemon handles its own NB
structures. But it is completely incompatible with the central
management daemon architecture we are aiming for, as mgmtd doesn't even
have a connection with zebra to learn from it. And it shouldn't have it,
because operational state changes should never affect configuration.
To solve the problem and simplify the code, I propose to expand the `-o`
option to all daemons. By using the startup option, we let daemons know
about the rename before they read their configs so we don't need any
special code to deal with it. There's an easy way to pass the option to
all daemons by using `frr_global_options` variable.
Unfortunately, the second way of renaming by creating a file in
`/var/run/netns` is incompatible with the new mgmtd architecture.
Theoretically, we could force daemons to read their configs only after
they connect to zebra, but it means adding even more code to handle a
very specific use-case. And anyway this won't work for mgmtd as it
doesn't have a connection with zebra. So I had to remove this option.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Problem:
=======
Generate query once cli is generating IGMPv2 report for IGMPv3 enabled interface
Description:
===========
If the version is not specified in the cli, it was taking version 2 as default
Fix:
===
If the version is not specified in the cli along with ip igmp generate-query-once,
the default will be the version enabled on that interface.
Signed-off-by: nsaigomathi <nsaigomathi@vmware.com>
Modifying name of struct igmp_sock to struct gm_sock, which is to be used
by both IPv4 and IPv6(for both MLD and IGMP).
Co-authored-by: Mobashshera Rasool <mrasool@vmware.com>
Co-authored-by: Sarita Patra <saritap@vmware.com>
Signed-off-by: Sai Gomathi <nsaigomathi@vmware.com>
Modifying name of struct igmp_group to struct gm_group, which is to be used
by both IPv4 and IPv6(for both MLD and IGMP).
Co-authored-by: Mobashshera Rasool <mrasool@vmware.com>
Co-authored-by: Sarita Patra <saritap@vmware.com>
Signed-off-by: Sai Gomathi <nsaigomathi@vmware.com>
Modifying name of struct igmp_source to struct gm_source, which is to be used
by both IPv4 and IPv6(for both MLD and IGMP).
Co-authored-by: Mobashshera Rasool <mrasool@vmware.com>
Co-authored-by: Sarita Patra <saritap@vmware.com>
Signed-off-by: Sai Gomathi <nsaigomathi@vmware.com>
Fix:
====
Modifying name of struct igmp_join to struct gm_join, which is to be used
by both IPv4 and IPv6(for both MLD and IGMP).
Co-authored-by: Abhishek N R abnr@vmware.com
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
Modifying the members of pim_interface which are to be used
for both IPv4 and IPv6 to common names(for both MLD and IGMP).
Issue: #10023
Co-authored-by: Mobashshera Rasool <mrasool@vmware.com>
Signed-off-by: Sai Gomathi <nsaigomathi@vmware.com>
RCA: When upstream transition from Joined to NotJoined due to SGRpt
prune, then only SGRpt prune was sent and SG Prune is missed.
Fix: Send SG Prune towards source as well as SGRpt prune towards RP.
Signed-off-by: sarita patra <saritap@vmware.com>
Since f60a1188 we store a pointer to the VRF in the interface structure.
There's no need anymore to store a separate vrf_id field.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Problem Statement:
==================
on rp_change, PIM processes all the upstream in a loop and for selected
upstreams PIM has to send join/prune based on the RPF changed.
join and prune packets are not getting aggregated in a single packet.
Root Cause Analysis:
====================
on pim_rp_change pim_upstream_update() gets called for selected upstreams.
This API calculates to whom it has to send join and to whom it has to
send prune via API pim_zebra_upstream_rpf_changed(). This API peprares
the upstream_switch_list list per interface and inserts the group and
sources.
Now PIM is still in the pim_upstream_update() API context, i.e PIM
is still processing the same upstream. In the last there is a
call to pim_zebra_update_all_interfaces() which processes the
upstream_switch_list list, sends the packets out and clears the list.
Fix:
====
Don't process the upstream_switch_list in the upstream context.
process all the upstreams prepare the upstream_switch_list and then
process in one go. This will club all the S,G entries.
It also saves list cleanup with respect to memory allocation and
deallocation multiple times.
Signed-off-by: Vishal Dhingra <rac.vishaldhingra@gmail.com>
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
`pim_ifchannel_ifjoin_switch()` changes flags that `pim_forward_stop()`
looks at. This leads to data flow continuing until we have some reason
to sync state again.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
rp-count==0 isn't a broken BSM, it just means the BSR no longer has any
Candidate RPs for the group range. Previous behavior is badly mistaken
since it stops processing the entire packet.
Fix to correctly remove group range on rp-count==0 and continue
processing remainder of the packet.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
NHT won't have a result yet when we get the first BSM from a new BSR.
Hence, the first packet(s) are lost, since their RPF validation fails.
Re-add the blocking RPF check that was there before (though in a much
more sensible manner.)
Also nuke the now-unused pim_nexthop_match* functions.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
... where it actually belongs. And make a bunch of stuff static, since
it's no longer used across files now.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
The Bootstrap message RX path needs a RPF check for the BSR address,
and this is implemented both incorrectly as well as quite ugly.
Clean up and fix case when we have multiple interfaces to the same LAN
and/or ECMP nexthops (both would cause message duplication, the former
can even cause BSM forwarding loops.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Most users of if_lookup_address_exact only cared about whether the
address is any local address. Split that off into a separate function.
For the users that actually need the ifp - which I'm about to add a few
of - change it to prefer returning interfaces that are UP.
(Function name changed due to slight change in behavior re. UP state, to
avoid possible bugs from this change.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
We should always treat the VRF interface as a loopback. Currently, this
is not the case, because in some old pre-VRF code we use if_is_loopback
instead of if_is_loopback_or_vrf. To avoid any future problems, the
proposal is to rename if_is_loopback_or_vrf to if_is_loopback and use it
everywhere. if_is_loopback is renamed to if_is_loopback_exact in case
it's ever needed, but currently it's not used anywhere.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
These really serve no purpose other than slowing our build down. If
there's a benefit to any of these, they can be readded.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
We had various forms of min/max macros across multiple daemons
all of which duplicated what we have in compiler.h. Convert
everyone to use the `correct` ones
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
enum based switches should never use default. It makes
it very hard to fix and find issues when the enum is
changed.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
pim-vxlan uses periodic null registers to bootstrap a new (VTEP, mcast-grp).
These null registers were not being sent if the (S, G) was already present
and in a register-join state.
Ticket: #2848079
Signed-off-by: Anuradha Karuppiah <anuradhak@nvidia.com>
frr-reload fails to recognize wildcard "*" for
member address in frr.conf/runing-config as cli
syntax expects in v4 address format.
Ticket: #2816923
Testing:
Without fix:
running config:
ip msdp mesh-group foo1 member *
Frr reoad failure log:
2021-11-02 11:05:04,317 INFO: Loading Config object from vtysh show running
line 5: % Unknown command: ip msdp mesh-group foo1 member *
Traceback (most recent call last):
File "/usr/lib/frr/frr-reload.py", line 1950, in <module>
With fix:
--------
running config displays:
ip msdp mesh-group foo1 member 0.0.0.0
Signed-off-by: Chirag Shah <chirag@nvidia.com>
Problem Statement:
==================
Mroutes are not recovered after shut/no shut of DUT to RP links
One interface is not added in OIL List in intermediate router,
hence traffic never received at LHR and mroutes not created for (S,G).
Root Cause Analysis:
====================
Generally (*,G) PIM Join is received first and then (S,G) joins are received.
This issue occurs when (S,G) join comes first and then the (*,G) Join.
When (S,G) PIM Join is received, ifchannel is created and channel_oil
OIF flag is set to PIM_OIF_FLAG_PROTO_PIM. Now when (*,G) join is received
the flag PIM_OIF_FLAG_PROTO_STAR is not inherited due to wrong check present in
function pim_upstream_inherited_olist_decide.
Fix:
===================
When (*,G) PIM Join is received, it should always add PIM_OIF_FLAG_PROTO_STAR
flag for all the (S,G) channel oils no matter what order the (*,G) or (S,G)
is received.
Fixes: #9918
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
Problem:
-------
(*,G) created on transit node where same groups are defined as SSM
At present FRR has SSM checks for IGMP report, but SSM check is missing for PIM join.
Fix:
----
Whenever there is a modification in prefix list for SSM range, then we need to browse the ifchannels (PIM joins)
and groups coming in SSM range with (and *,G) should be removed from ifchannel database and also withdraw those routes
from upstream routers.
Signed-off-by: Sai Gomathi <nsaigomathi@vmware.com>
Problem :
=======
(*,G) created on transit node where same groups are defined as SSM
At present FRR has SSM checks for IGMP report, but SSM check is missing for PIM join.
Fix :
===
When an interface receives the PIM (*,G)join with G as SSM group, then PIMd supposed to discard that join.
There is no need to maintain PIM state for this group.
Signed-off-by: Sai Gomathi <nsaigomathi@vmware.com>
This removes a giant `switch { }` block from lib/zclient.c and
harmonizes all zclient callback function types to be the same (some had
a subset of the args, some had a void return, now they all have
ZAPI_CALLBACK_ARGS and int return.)
Apart from getting rid of the giant switch, this is a minor security
benefit since the function pointers are now in a `const` array, so they
can't be overwritten by e.g. heap overflows for code execution anymore.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
It allows FRR to read the interface config even when the necessary VRFs
are not yet created and interfaces are in "wrong" VRFs. Currently, such
config is rejected.
For VRF-lite backend, we don't care at all about the VRF of the inactive
interface. When the interface is created in the OS and becomes active,
we always use its actual VRF instead of the configured one. So there's
no need to reject the config.
For netns backend, we may have multiple interfaces with the same name in
different VRFs. So we care about the VRF of inactive interfaces. And we
must allow to preconfigure the interface in a VRF even before it is
moved to the corresponding netns. From now on, we allow to create
multiple configs for the same interface name in different VRFs and
the necessary config is applied once the OS interface is moved to the
corresponding netns.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
FRR should only ever use the appropriate THREAD_ON/THREAD_OFF
semantics. This is espacially true for the functions we
end up calling the thread for.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Problem Statement:
==================
On new neighbor addition, the tx counter for hello msg is reset.
Fix:
=================
Do not reset the tx counter on new neighbor addition.
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
Problem Statement:
==================
pim maintains two counters hello tx and hello rx at interface level.
At present pim needs to send the hello message prior to other pim
message as per RFC. This logic is getting derived from the tx hello
counters. So when a new neighbor is added, tx counters are set to
zero and then based on this, it is further decided to send hello in
pim_hello_require function.
Fix:
====
Separating the hello statistics and the logic to decide when to send hello
based on a new flag. pim_ifstat_hello_sent will be used to note down
the hello stats while a new flag is added to decide whether to send hello
or not if it is the first packet to a neighbor.
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
Deleting a mesh-group member no longer deletes the mesh-group.
Complete bug description at:
https://github.com/FRRouting/frr/issues/9664
Signed-off-by: Adriano Marto Reis <adrianomarto@gmail.com>
This `XFREE()` call is in plainly in the wrong spot. `rp_all` (the
224.0.0.0/4 entry) isn't supposed to be free'd ever, and the
conditional above makes quite clear that it remains in use.
It may be possible to exploit this as a heap corruption bug, maybe even
as RCE. I haven't tried; I randomly noticed this while working on the
BSM code. Luckily this code is only run by the CLI for the clear
command, so the surface is very small.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
pim_msdp_peer_rpf_check creates an nexthop to do
a rpf search against and doesn't initialize it
sucht that the pim_nexthop_lookup function is
making decisions against the nexthop just
created that was uninitialized.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This makes a lot more sense semantically (and matches the way groups are
handled.) Also allows placing additional restrictions on source
creation (e.g. limit on number of sources or ACLs.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Intentionally separate here because the previous patch does a whole
bunch of "move stuff up 1 level of indentation", and reviewing that is
easier when you can use the ignore-whitespace option on diff.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
IGMP group/source memberships are a property of the interface; the
particular IP address that the querier used to collect the data is
irrelevant.
... and IGMP packets get delivered only once to pimd anyway, since we
receive them on the "global" per-VRF IGMP socket. (The one in igmp_sock
is only used for sending queries.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
pimd's include files are very interdependent. Let's chop that down a
bit to gain some flexibility.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Problem
======
In pim_msg_send_frame api, the while loop was executed only once.
Fix
===
while is changed to if, as in the code flow
the while part is getting executed only once.
Signed-off-by: Sai Gomathi <nsaigomathi@vmware.com>
Problem :
=======
When all the groups from Ixia are stopped,
groups still keep refreshing and not getting timeout
RCA:
====
IGMP Report is coming in include mode without any source address, this problem will come.
Fix :
===
If the requested filter mode is INCLUDE *and* the requested
source list is empty, then the entry corresponding to the
requested interface and multicast address is deleted if present.
If no such entry is present, the request is ignored.
When an interface receives the IGMP report without any source, then the group is deleted.
Signed-off-by: Sai Gomathi <nsaigomathi@vmware.com>
There is a possibility that the same line can be matched as a command in
some node and its parent node. In this case, when reading the config,
this line is always executed as a command of the child node.
For example, with the following config:
```
router ospf
network 193.168.0.0/16 area 0
!
mpls ldp
discovery hello interval 111
!
```
Line `mpls ldp` is processed as command `mpls ldp-sync` inside the
`router ospf` node. This leads to a complete loss of `mpls ldp` node
configuration.
To eliminate this issue and all possible similar issues, let's print an
explicit "exit" at the end of every node config.
This commit also changes indentation for a couple of existing exit
commands so that all existing commands are on the same level as their
corresponding node-entering commands.
Fixes#9206.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
While defaults are good picks for "reasonable" guesses, min and max
range values really aren't. Operators and experimenters often like to
configure "unreasonable" values to stress test, tests boundary
conditions and explore innovations.
With that in mind, change all ranges to 1..max (of type).
While we're here add optional ignored values in the "no" CLI forms.
Signed-off-by: Christian Hopps <chopps@labn.net>
Problem Statement:
==================
IGMP query is sent at irregular intervals
(more than 30 seconds) when "ip igmp query-max-response-time 100"
command is executed multiple times.
RCA:
=================
When "ip igmp query-max-response-time 100" is executed, the timers
are reset resulting in the delay of sending the query.
Fix:
=================
When there is no change in the config value, we should not reset
the timers.
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
When we decide that we do not need a item on the partial_bsrp_list
don't just drop the memory on the floor, free it up.
This was happening when we decided that a pending item has
a hold time of 0.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Some no commands were not accepting values and left us in
a situation where a cut-n-paste of the non-no line would
not be properly accepted.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The only difference in daemons' interface node definition is the config
write function. No need to define the node in every daemon, just pass
the callback as an argument to a library function and define the node
there.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
This commit is to correct the order in which the fields are
accessed while verifying it. First the fields should be
verified, and if it is valid then access it.
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
Allow the join-prune interval to be as small as 5 seconds instead
of limiting the value to 60.
This can and will come at a price of being able to converge less
mroutes.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
IGMPv3 packets with invalid TOS should be dropped.
Test Case ID: 4.10
TEST_DESCRIPTION
Every IGMP message described in this document is sent with
IP Precedence of Internetwork Control (e.g., Type of Service
0xc0)
(Tests that IGMPv3 Membership Query Message conforms to
above statement)
TEST_REFERENCE
NEGATIVE: RFC 3376, IGMP Version 3, s4 p7 Message Formats
Issue: #9071
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
IGMPv3 packets with invalid TTL should be dropped.
Test Case ID: 4.10
TEST_DESCRIPTION
Every IGMP message described in this document is sent with an IP
Time-to-Live of 1 (Tests that IGMPv3 Membership Report Message
conforms to above statement)
TEST_REFERENCE
NEGATIVE: RFC 3376, IGMP Version 3, s4 p7 Message Formats
Issue: #9070
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
Problem Statement:
==================
valgrind shows memleaks in rp_table, when pimd shuts down gracefully.
2020-05-05 22:09:29,451 ERROR: Memory leaks in router [r4] for daemon [pimd]
2020-05-05 22:09:29,451 ERROR: Memory leaks in router [r4] for daemon [zebra]
2020-05-05 22:09:29,637 ERROR: Found memory leak in module pimd
2020-05-05 22:09:29,638 ERROR: ==6178== 184 (56 direct, 128 indirect) bytes in 1 blocks are definitely lost in loss record 21 of 21
2020-05-05 22:09:29,638 ERROR: ==6178== at 0x4C2FFAC: calloc (vg_replace_malloc.c:762)
2020-05-05 22:09:29,638 ERROR: ==6178== by 0x4E855EE: qcalloc (memory.c:111)
2020-05-05 22:09:29,638 ERROR: ==6178== by 0x4EAA43C: route_table_init_with_delegate (table.c:52)
2020-05-05 22:09:29,638 ERROR: ==6178== by 0x1281A1: pim_rp_init (pim_rp.c:114)
2020-05-05 22:09:29,638 ERROR: ==6178== by 0x11D0F8: pim_instance_init (pim_instance.c:117)
2020-05-05 22:09:29,638 ERROR: ==6178== by 0x11D0F8: pim_vrf_new (pim_instance.c:150)
2020-05-05 22:09:29,638 ERROR: ==6178== by 0x4EB1BEC: vrf_get (vrf.c:209)
2020-05-05 22:09:29,638 ERROR: ==6178== by 0x4EB2B2F: vrf_init (vrf.c:493)
2020-05-05 22:09:29,638 ERROR: ==6178== by 0x11D227: pim_vrf_init (pim_instance.c:217)
2020-05-05 22:09:29,638 ERROR: ==6178== by 0x11BBAB: main (pim_main.c:121)
Fix:
====
rp_info is allocated in pim_rp_init API. rp_info pointer is present
in rp_list and rp_table. In rp_list cleanup, the memory for rp_info
gets freed. rp_table clean up should be done first and then rp_list.
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
Problem:
S,G entry has iif = oif in FHR is LHR case.
Setup:-
R11-----R2----R4
R11 :- FHR and LHR
R2 :- RP
R4 :- LHR
Issue :-
1) shut mapped interface in R11
2) wait for 5 min
3) do FRR restart
5) No shut of mapped interface
OIL is added for local interface also where OIL is same as IIF
and duplicate traffic observed on R4 receives in Ixia
RCA:
pim_ifchannel_local_membership_add adds inherited oif from starg when iif for
SG is unavailable.
When rpf for that SG resolves to this inherited oif from starg, iif is also in oif.
This results in dup traffic.
Fix:
If iif is not available, do not inherit from starg.
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
Problem Statement:
In Prune pending state, when Join is received, and there is hold timer change
the Expiry timer is not getting updated with new Hold timer.
Root Cause:
When thread_add_timer function is called and the thread is already in the list
the thread api just returns, it does not modify the timer value.
So when we want to change the timer, we need to first call THREAD_OFF and then
call thread_add_timer.
The Expiry timer thread is not cancelled in PIM_IFJOIN_PRUNE_PENDING state,
therefore the timer change is not taking effect.
Fix:
Call THREAD_OFF in that flow.
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
Making the interface holdtime range to 3.5 times the hello-time
As per 7761, Section 4.11:
The Holdtime in a Hello message should be set to
(3.5 * Hello_Period), giving a default value of 105 seconds.
Therefore providing the user also to configure max upto 3.5 times
the hello timer interval.
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
When we have a "192.0.2.1 peer 192.0.2.2/32" address on an interface, we
need to (a) recognize the local address as being on the link for our own
packets, and (b) do the IGMP socket lookup with the proper local address
rather than the peer prefix.
Fixes: efe6f18 ("pimd: fix IGMP receive handling")
Cc: Nathan Bahr <nbahr@atcorp.com>
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
1. Add the querierIP object to igmp_sock datastruct to save the IP address of the querier.
Management of the querierIP object is added.
2. To show the querier IP address in the CLI "show ip igmp interface".
3. To add the json object querierIP for querier IP address in the json CLI "show ip igmp interface json".
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
... the PIM code is kinda misusing prefix lists to match addresses.
Considering the weird semantics of access-lists, I can't fault it.
However, prefix lists aren't great at matching addresses by default,
since they try to match the prefix length too. So, here's an "address
match mode" for prefix lists to get that to work more reasonably.
Fixes: #8492
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
There's an IGMP socket per interface, so they should be bound to that
interface. Which also makes IGMP work in VRFs.
Fixes: #7889
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Rename functions (`pim_msdp_peer_new` => `pim_msdp_peer_add` and
`pim_msdp_peer_do_del` => `pim_msdp_peer_del`) to keep consistency and
update the `pim_msdp_peer_add` documentation to tell users that this is
also used for non meshed group peers.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Fully utilize the northbound to hold pointers to our private data
instead of searching for data structures every time we need to change a
configuration.
Highlights:
* Support multiple mesh groups per PIM instance (instead of one)
* Use DEFPY instead of DEFUN to reduce code complexity
* Use northbound private pointers to store data structures
* Reduce callback names size
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
When running pim on an interface and that interface has
state and we move that interface into a different vrf
there exists a call path where we have not created the pimreg
device yet. Prevent a crash in this rare situation.
Ticket: #2552763
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Compile with v2.0.0 tag of `libyang2` branch of:
https://github.com/CESNET/libyang
staticd init load time of 10k routes now 6s vs ly1 time of 150s
Signed-off-by: Christian Hopps <chopps@labn.net>
VRF creation can happen from either cli or from
knowledged about the vrf learned from zebra.
In the case where we learn about the vrf from
the cli, the vrf id is UNKNOWN. Upon actual
creation of the vrf, lib/vrf.c touches up the vrf_id
and calls pim_vrf_enable to turn it on properly.
At this point in time we have a pim->vrf_id of
UNKNOWN and the vrf->vrf_id of the right value.
There is no point in duplicating this data. So just
remove all pim->vrf_id and use the vrf->vrf_id instead
since we keep a copy of the pim->vrf pointer.
This will remove some crashes where we expect the
pim->vrf_id to be usable and it's not.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
When creating configuration for a vrf *Before* the vrf has been
created, pim will not properly create the pimreg device and
we will promptly crash when we try to pass data.
Put some code checks in place to ensure that the pimreg is
created for vrf's.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
The router->register_suppress_time is used to derive the
rp_keep_alive_time, but when the suppress time was changed, pim was
not recalculating the rp_keep_alive_time and left it at the old value.
This fix applies the changes when a new suppress_time is entered
(or removed.)
Signed-off-by: Don Slice <dslice@nvidia.com>
Problem reported that when certain pim commands were entered, they
showed up duplicated in the configuration both under default instance
and every vrf (whether pim was used there or not.) This was because
these particular parameters are global only and the function doing
the display would repeat for each vrf. This fix only displays those
in the default case (and removes them from the help for entering
under a vrf.)
Signed-off-by: Don Slice <dslice@nvidia.com>
Just some cleanup before I touch this code; switching to typesafe list
macros & putting the data directly inline.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
... yes we may need it later, but if and when that happens we can put it
back there. No point carrying around unused things.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Display the MSDP peer configuration in `show running-config` so it can
be saved on configuration write.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
encoding signed int as unsigned is bad practice; since we want to do
it here lets at least be explicit about it
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
`config.h` has all the defines from autoconf, which may include things
that switch behavior of other included headers (e.g. _GNU_SOURCE
enabling prototypes for additional functions.)
So, the first include in any `.c` file must be either `config.h` (with
the appropriate guard) or `zebra.h` (which includes `config.h` first
thing.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
... by referencing all autogenerated headers relative to the root
directory. (90% of the changes here is `version.h`.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Most of these are many, many years out of date. All of them vary
randomly in quality. They show up by default in packages where they
aren't really useful now that we use integrated config. Remove them.
The useful ones have been moved to the docs.
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
Issue:
User is allowed to configure only hello without hold timer but when undo
config, the hold timer is mandatory as shown below:
FRR-4(config-if)# ip pim hello 10
<cr>
(1-180) Time in seconds for Hold Interval
FRR-4(config-if)# ip pim hello 10
FRR-4(config-if)# no ip pim hello 10
(1-180) Time in seconds for Hold Interval
FRR-4(config-if)# no ip pim hello 10
% Command incomplete: no ip pim hello 20
Fix:
Making the hold timer as optional when undo config.
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
Also included display of hold time in CLI 'show ip pim int <intf>' cmd
and json commands.
Issue:
PIM neighbor not coming up if hold time is less than hello timer
since hello is sent every 4 sec and hold is 1 sec,
because of this nbr is flapping
Fix:
Do not allow configuration of hold timer less than hello timer
Also reset the value of hold timer to 3.5 times to hello whenever
only hello is modified so that the relationship holds good.
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
Back when I put this together in 2015, ISO C11 was still reasonably new
and we couldn't require it just yet. Without ISO C11, there is no
"good" way (only bad hacks) to require a semicolon after a macro that
ends with a function definition. And if you added one anyway, you'd get
"spurious semicolon" warnings on some compilers...
With C11, `_Static_assert()` at the end of a macro will make it so that
the semicolon is properly required, consumed, and not warned about.
Consistently requiring semicolons after "file-level" macros matches
Linux kernel coding style and helps some editors against mis-syntax'ing
these macros.
Signed-off-by: David Lamparter <equinox@diac24.net>
There are places in the code where function nb_running_get_entry is used
with abort_if_not_found set to true during the config validation stage.
This is incorrect because when used in transactional CLI, the running
entry won't be set until the apply stage, and such usage leads to crash.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Recent change in commit: 6b73800ba2
Caused this error to pop up in pim_igmp_mtrace.c:
error: taking address of packed member 'rsp_addr' of class or structure 'igmp_mtrace' may result in an unaligned pointer value [-Werror,-Waddress-of-packed-member]
Follow the pattern used in the code to solve this problem for clang
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This was found while doing libyang2 work (causes assert); however, it is
also incorrect for libyang1 (empty canonical value for incorrectly
referenced interface vs interface-name node).
While here, fix 2 other incorrect uses of "." on a container node.
Signed-off-by: Christian Hopps <chopps@labn.net>
We can proactively check whether this mroute will be nacked by loopfree
MFC checks so let's do it in the apply phase and emit a useful error
message.
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
When the control plane protocol is created, the vrf structure is
allocated, and its address is stored in the northbound node.
The vrf structure may later be deleted by the user, which will lead to
a stale pointer stored in this node.
Instead of this, allow daemons that use the vrf pointer to register the
dependency between the control plane protocol and vrf nodes. This will
guarantee that the nodes will always be created and deleted together, and
there won't be any stale pointers.
Add such registration to staticd and pimd.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Neither tabs nor newlines are acceptable in syslog messages. They also
break line-based parsing of file logs.
Signed-off-by: David Lamparter <equinox@diac24.net>
Modify code to add JSON format output in show command.
"show ip igmp [vrf NAME] join" and "show ip igmp vrf all join" with proper formatting
Signed-off-by: Sai Gomathi <nsaigomathi@vmware.com>
Valgrind reports:
469901-==469901==
469901-==469901== Conditional jump or move depends on uninitialised value(s)
469901:==469901== at 0x3A090D: bgp_bfd_dest_update (bgp_bfd.c:416)
469901-==469901== by 0x497469E: zclient_read (zclient.c:3701)
469901-==469901== by 0x4955AEC: thread_call (thread.c:1684)
469901-==469901== by 0x48FF64E: frr_run (libfrr.c:1126)
469901-==469901== by 0x213AB3: main (bgp_main.c:540)
469901-==469901== Uninitialised value was created by a stack allocation
469901:==469901== at 0x3A0725: bgp_bfd_dest_update (bgp_bfd.c:376)
469901-==469901==
469901-==469901== Conditional jump or move depends on uninitialised value(s)
469901:==469901== at 0x3A093C: bgp_bfd_dest_update (bgp_bfd.c:421)
469901-==469901== by 0x497469E: zclient_read (zclient.c:3701)
469901-==469901== by 0x4955AEC: thread_call (thread.c:1684)
469901-==469901== by 0x48FF64E: frr_run (libfrr.c:1126)
469901-==469901== by 0x213AB3: main (bgp_main.c:540)
469901-==469901== Uninitialised value was created by a stack allocation
469901:==469901== at 0x3A0725: bgp_bfd_dest_update (bgp_bfd.c:376)
On looking at bgp_bfd_dest_update the function call into bfd_get_peer_info
when it fails to lookup the ifindex ifp pointer just returns leaving
the dest and src prefix pointers pointing to whatever was passed in.
Let's do two things:
a) The src pointer was sometimes assumed to be passed in and sometimes not.
Forget that. Make it always be passed in
b) memset the src and dst pointers to be all zeros. Then when we look
at either of the pointers we are not making decisions based upon random
data in the pointers.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
RCA: There were 2 problems.
1. SGRpt prune expiry didn't create S,G entry with none oil when no other
interfaces were part of the oil.
2. When restarting the timer with new hold value, comparision was missing and
old timer was not stopping.
Fix:
SGRpt Prune pending expiry will put SG entry with none oil if no other
Signed-off-by: Saravanan K <saravanank@vmware.com>
interfaces present. If present we will be deleting the inherited oif from oil.
Deleting the oif in that scenario will take care of changing mroute.
When alone interface expires in SGRpt prune pending state, we shall detect by
checking installed flag. if not installed, install mroute.
Valgrind is reporting this:
==22220== Invalid read of size 4
==22220== at 0x11DC2B: pim_if_delete (pim_iface.c:215)
==22220== by 0x11DD71: pim_if_terminate (pim_iface.c:76)
==22220== by 0x128E03: pim_instance_terminate (pim_instance.c:66)
==22220== by 0x128E03: pim_vrf_delete (pim_instance.c:159)
==22220== by 0x48E0010: vrf_delete (vrf.c:251)
==22220== by 0x48E0010: vrf_delete (vrf.c:225)
==22220== by 0x48E02FE: vrf_terminate (vrf.c:551)
==22220== by 0x149495: pim_terminate (pimd.c:142)
==22220== by 0x13C61B: pim_sigint (pim_signals.c:44)
==22220== by 0x48CF862: quagga_sigevent_process (sigevent.c:103)
==22220== by 0x48DD324: thread_fetch (thread.c:1404)
==22220== by 0x48A926A: frr_run (libfrr.c:1122)
==22220== by 0x11B85E: main (pim_main.c:167)
==22220== Address 0x5912160 is 1,200 bytes inside a block of size 1,624 free'd
==22220== at 0x48369AB: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==22220== by 0x128E52: pim_instance_terminate (pim_instance.c:74)
==22220== by 0x128E52: pim_vrf_delete (pim_instance.c:159)
==22220== by 0x48E0010: vrf_delete (vrf.c:251)
==22220== by 0x48E0010: vrf_delete (vrf.c:225)
==22220== by 0x48F1353: zclient_vrf_delete (zclient.c:1896)
==22220== by 0x48F1353: zclient_read (zclient.c:3511)
==22220== by 0x48DD826: thread_call (thread.c:1585)
==22220== by 0x48A925F: frr_run (libfrr.c:1123)
==22220== by 0x11B85E: main (pim_main.c:167)
==22220== Block was alloc'd at
==22220== at 0x4837B65: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==22220== by 0x48ADA4F: qcalloc (memory.c:110)
==22220== by 0x128B9B: pim_instance_init (pim_instance.c:82)
==22220== by 0x128B9B: pim_vrf_new (pim_instance.c:142)
==22220== by 0x48E0C5A: vrf_get (vrf.c:217)
==22220== by 0x48F13C9: zclient_vrf_add (zclient.c:1863)
==22220== by 0x48F13C9: zclient_read (zclient.c:3508)
==22220== by 0x48DD826: thread_call (thread.c:1585)
==22220== by 0x48A925F: frr_run (libfrr.c:1123)
==22220== by 0x11B85E: main (pim_main.c:167)
On pim vrf deletion, ensure that the vrf->info pointers are NULL as well
as the free'd pim pointer for ->vrf is NULL as well.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Issue: #7892 has a startup config where only igmp
interfaces are created and a igmp report comes in.
Effectively we are not creating the regiface device unless
we do a `ip pim`. This is incorrect we should always create
the regiface.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
1. As per RFC 4601 Sec 4.5.7:
* JoinDesired(S,G) -> False, set SPTbit to false.
2. Change the debug type.
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
This command has been added in the context of
PIM BSM functionality. This command will clear the
data structs having bsr information.
Co-authored-by: Sarita Patra <saritap@vmware.com>
Signed-off-by: vishaldhingra <vdhingra@vmware.com>
Test case 5.10 sends leave message to unicast address, the leave
packet is accepted and a query message is sent in response to this.
No validation for address is present in the function
Add check for addresses as per RFC. Leave messages are allowed only
sent to either ALL-ROUTERS (224.0.0.2) or group address.
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
1. When a node changes from non-DR to DR in the given topology,
the node was receiving both PIM Join as well as IGMP join.
Since it was already receiving PIM Join previously, ifchannel was
already present. Hence when it becomes DR, the IGMP source flag is not
set due to issue in the code. Hence it never creates S,G entry thinking
that it is not DR.
2. When pim join expires, the pim flag is not reset when ifchannel is not
deleted.
Issue: #7752
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
The flag "R" in below display description denotes the
SGRpt Pruned but the description shows something else
hence correcting the definition.
R2(config)# do show ip mroute
IP Multicast Routing Table
Flags: S- Sparse, C - Connected, P - Pruned
R - RP-bit set, F - Register flag
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
Issue:
Configure msdp mesh mg1 for 2 groups.
ip msdp mesh-group mg1 member 1.1.1.1
ip msdp mesh-group mg1 member 1.1.1.2
Remove mg1 for 1.1.1.1
no ip msdp mesh-group mg1 member 1.1.1.1
mg1 for 1.1.1.1 still present. This is fixed.
Signed-off-by: Sarita Patra <saritap@vmware.com>
Issue:
Configure RP.
ip pim rp 20.0.0.1 239.1.1.1/32
ip pim rp 20.0.0.1 239.1.1.2/32
Remove RP.
no ip pim rp 20.0.0.1 239.1.1.1/32
Rp is not removed. This is fixed now.
Signed-off-by: Sarita Patra <saritap@vmware.com>
The pim_version.[c|h] files are never used and we are getting
warnings about PIM_VERSION changing pointer sizes from
newer versions of the compiler. I see no reason to keep this
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
pim is enabled internally/implicitly on the vxlan termination device
and displaying that can confuse the admin and tools (such as
frr-reload).
Ticket: CM-30180
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
If we screw up and don't have the right flags we'll print
out garbage. At the very least just print out nothing.
Signed-off-by: Donald Sharp <sharp@nvidia.com>
Issue: When an IGMPv2 leave packet is received, it did not validate
the checksum and hence the packet is accepted and group specific
query is sent out in response to this.
Due to this IGMP conformance test case 6.1 failed.
https://github.com/FRRouting/frr/issues/6868
Fix: Validate the checksum for all IGMP packets
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
The `enum zclient_send_status` enum needs to be extended
throughout the code base to use the new states and
to fix up places where we tested against the return
value being non zero.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
problem :
=========
When (*,G) prune received where we have SGRpt state,
ifchannel goes to NO_INFO state and doesn't get removed.
Root cause :
============
During the processing of (*,G) prune, we are not removing the
ifchannel on PruneTmp or PrunePendingTmp state.
Fix :
=====
In that scenario, stop joinExpiry timer and delete the ifchannel.
issue #7347
Co-authored-by: Saravanan K <saravanank@vmware.com>
Signed-off-by: vishaldhingra <vdhingra@vmware.com>
Replace all lib/thread cancel macros, use thread_cancel()
everywhere. Only the THREAD_OFF macro and thread_cancel() api are
supported. Also adjust thread_cancel_async() to NULL caller's pointer (if
present).
Signed-off-by: Mark Stapp <mjs@voltanet.io>
Create appropriate accessor functions for the rn->lock
data. We should be accessing this data through accessor
functions since it is private data to the data structure.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
We have this pattern in the code base:
if (thread)
THREAD_OFF(thread);
If we look at THREAD_OFF we check to see if thread
is non-null too. So we have a double check.
This is unnecessary. Convert to just using THREAD_OFF
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
* If the MSDP peer receives the SA from a non-RPF peer towards the
originating RP, it will drop the message.
* SA messages are forwarded away from the RP address only.
* SA messages are not forwarded within the mesh group.
* Preventing the MSDP connection from being dropped due to RPF check
failure (RFC3618, section 13 "MSDP Error Handling")
Signed-off-by: Adriano Marto Reis <adrianomarto@gmail.com>
Signed-off-by: Adriano Reis <areis@barrukka.local>
This should never happen; no need to debug guard it and it's not a
warning, if this isn't working then NHT is not working at all.
Signed-off-by: Quentin Young <qlyoung@nvidia.com>
Clean up the rare situation when bind fails to not
close the fd that was just opened and have the socket
leaked.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
We use the pim mroute socket for kernel notifications of events.
Currently this is limited to 8 bits of data. There are patches
coming down the pike in kernel land to allow this to expand.
Rather than fix this and all the other places we assume MAXVIFS < 256
in the pim code right now. Leave a land mine for the developer
doing this work to point them in the right direction.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Match by exact address rather than by prefix match to
determine if we generated the IGMPP query. Othwerwise
we will be ignoring IGMP queries coming from other
hosts on the same subnet.
Signed-off-by: Nathan Bahr <nbahr@atcorp.com>
Reviewed-by: Jafar Al-Gharaibeh <jafar@atcorp.com>
IGMP queries should contain the source address of the IGMP socket
they are being sent from.
Added binding the IGMP sockets to their specific source, otherwise
interfaces with multiple addresses will send multiple queries using
the same source, which is determined by the kernel.
Signed-off-by: Nathan Bahr <nbahr@atcorp.com>
Reviewed-by: Jafar Al-Gharaibeh <jafar@atcorp.com>
IGMP packets received from a source that does not match the subnet
of any configured addresses on the receive interface should be
ignored.
Also, find and use the correct IGMP socket object for the received
IGMP packet.
Signed-off-by: Nathan Bahr <nbahr@atcorp.com>
Signed-off-by: Jafar Al-Gharaibeh <jafar@atcorp.com>
Suppose you have more than 2 addresses on a pim interface:
lo up default 10.255.0.1/32
10.255.0.101/32
10.255.0.254/32
A `show ip pim int lo` gives us this:
eva# show ip pim interface lo
Interface : lo
State : up
Address : 10.255.0.1 (primary)
10.255.0.101/32
When we go look at the code that pulls secondary addresses in
we are using a prefix_cmp to know if we know about a secondary already
but were expecting true values instead of -1/0/1 being returned.
Modify code so that pim sees all secondary addresses
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
pimd crash at pim_msg_build_jp_groups (
grp=grp@entry=0x7ffca55b5d1e, sgs=sgs@entry=0x17821a0, size=20)
at pimd/pim_msg.c:198
Fix for https://github.com/FRRouting/frr/issues/6849
Root Cause:
===========
pimd has crashed because pim_upstream_rpf_clear function sets the
up->rpf.source_nexthop.interface pointer to NULL and has not removed
the upstream source node from the neighbor. When the upstream gets
deleted the source is not removed from neighbor
neigh->upstream_jp_agg->groups->sources list. This source node has
pointer to upstream freed memory. Hence when on_neighbor_jp_timer expires,
it tries to access the upstream pointer and crashed.
Fix:
====
Before setting the interface pointer to NULL, remove the node from
neigh->upstream_jp_agg->groups->sources list. Also the upstream state
has to be changed to Not joined.
Removed extra line changes.
Signed-off-by: Mobashshera Rasool <mrasool@vmware.com>
Move pim and igmp yang files registery to appropriate makefiles.
In yang directory makefile move under `PIMD`
Remove pimd yang files from library makefile instead move them
to pimd makefile.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
There are couple spots where group may be NULL and
when we output strings associated with it we should
ensure we are not doing something stupid.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When creating a pim instance, we were allocating table information
but never freeing it. Do so.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Remove mid-string line breaks, cf. workflow doc:
.. [#tool_style_conflicts] For example, lines over 80 characters are allowed
for text strings to make it possible to search the code for them: please
see `Linux kernel style (breaking long lines and strings)
<https://www.kernel.org/doc/html/v4.10/process/coding-style.html#breaking-long-lines-and-strings>`_
and `Issue #1794 <https://github.com/FRRouting/frr/issues/1794>`_.
Scripted commit, idempotent to running:
```
python3 tools/stringmangle.py --unwrap `git ls-files | egrep '\.[ch]$'`
```
Signed-off-by: David Lamparter <equinox@diac24.net>
BFD profiles can now be used on the interface level like this:
interface eth1
ip router isis 1
isis bfd
isis bfd profile default
Here the 'default' profile needs to be specified as usual in the
bfdd configuration.
Signed-off-by: GalaxyGorilla <sascha@netdef.org>
This section of code was ported incorrectly and is causing
issues when running against some tests. Fix the missed
code.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Don't crash when trying to `show running-config` because of missing
filter northbound integration.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Issue: After SPT switchover, do shut and no-shut the received connected
interface, traffic stops.
R2
| |
+ +
Client-----R1--------R3----Source
R2 is RP.
Root cause:
Client is sending join for G and Source is sending traffic for G.
Before SPT switchover, traffic flows R3-R2-R1, after SPT switchover,
traffic flows R3-R1. Now Check in R2, there will be 2 ifchannel gets
created. first is (*, G) ifchannel which gets created because of (*, G)
join received from R1, second is (S, G) ifchannel which gets created
because of (s,g,rpt) prune received from R1
Shut the receiver connected interface on R1, R1 will send a (*, G) prune
towards RP (R2). On receiving (*, G) prune, R2 deletes the (*, G) ifchannel.
(s,g) ifchannel with flag (s,g,rpt) set will be timeout after the prune timer
expires. Before this timer expires, do noshut the received connected inrterface
on R1. R1 will send a (*,G) join to R2(RP), So oil will be updated in (*, G),
but wont get updated in (s,g) since the flag (s,g,rpt) is set. So traffic flow
stops.
Fix: When (*, G) ifchannel is getting deleted because of (*, G) prune
received, as (*,G) prune indicates that the router no longer wishes
to receive shared tree traffic, so clear (S,G,RPT) flag on all the child (S,G)
ifchannel, which was created because of (S,G,RPT) prune received
Signed-off-by: Sarita Patra <saritap@vmware.com>
Found some more missing code that got dropped during the
upstreaming process causing issues with things actually
working.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When the mlag code was ported upstream the structure
of the calls in igmp_source_forward_start had been
reset and as a result we missed the call to check
that creating an ifchannel when we are DualActive
is allowed.
Ticket: CM-29941
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Description:
"show ip mroute" displays only installed(kernel) mroutes, where
as "show ip mroute json" diplays both installed and not installed
mroutes in the o/p.To make this consistant, diplaying only valid
routes in json o/p.
Signed-off-by: Rajesh Girada <rgirada@vmware.com>
Description:
Added json support for the following PIM commands.
1. show ip mroute [vrf NAME] count [json]
2. show ip mroute vrf all count [json]
3. show ip mroute [vrf NAME] summary [json]
4. show ip mroute vrf all summary [json]
Signed-off-by: Rajesh Girada <rgirada@vmware.com>
We were using thread_cancel() directly instead of
THREAD_OFF which correctly ensures the pointer is not NULL.
Ticket:CM-29866
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Added a new show command "show ip multicast", display the multicast data
packet in and out on interface level.
Signed-off-by: Sarita Patra <saritap@vmware.com>
These are easy to get subtly wrong, and doing so can cause
nondeterministic failures when racing in parallel builds.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Replace sprintf with snprintf where straightforward to do so.
- sprintf's into local scope buffers of known size are replaced with the
equivalent snprintf call
- snprintf's into local scope buffers of known size that use the buffer
size expression now use sizeof(buffer)
- sprintf(buf + strlen(buf), ...) replaced with snprintf() into temp
buffer followed by strlcat
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Replace all `random()` calls with a function called `frr_weak_random()`
and make it clear that it is only supposed to be used for weak random
applications.
Use the annotation described by the Coverity Scan documentation to
ignore `random()` call warnings.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
It is possible that a if_lookup_by_index can return NULL
ensure that we handle this appropriately.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
And again for the name. Why on earth would we centralize this, just so
people can forget to update it?
Signed-off-by: David Lamparter <equinox@diac24.net>
Same as before, instead of shoving this into a big central list we can
just put the parent node in cmd_node.
Signed-off-by: David Lamparter <equinox@diac24.net>
There is really no reason to not put this in the cmd_node.
And while we're add it, rename from pointless ".func" to ".config_write".
[v2: fix forgotten ldpd config_write]
Signed-off-by: David Lamparter <equinox@diac24.net>
The only nodes that have this as 0 don't have a "->func" anyway, so the
entire thing is really just pointless.
Signed-off-by: David Lamparter <equinox@diac24.net>
Issue: no ip msdp mesh-group <word> source command
deleting the mesh group, which might be used by the member.
Solution: no ip msdp mesh-group <word> source command, deletes
the mesh-group source.
Add a new cli command "no ip msdp mesh-group <word>" to delete
the mesh group.
Signed-off-by: Sarita Patra <saritap@vmware.com>
JSON output for igmp group display is modified as follows for better python handling.
1. Under each interface make array of group as an object
2. Include source and group inside each group object
These improvements are required for the set of topotest cases which will be upstreamed shortly.
Signed-off-by: Saravanan K <saravanank@vmware.com>
This CLI will allow user to configure a igmp group limit which will generate
a watermark warning when reached.
Though watermark may not make sense without setting a limit, this
implementation shall serve as a base to implementing limit in future and helps
tracking a particular scale currently.
Testing:
=======
ip igmp watermark-warn <10-60000>
on reaching the configured number of group, pim will issue warning
2019/09/18 18:30:55 PIM: SCALE ALERT: igmp group count reached watermak limit: 210(vrf: default)
Also added group count and watermark limit configured on cli - show ip igmp groups [json]
<snip>
Sw3# sh ip igmp groups json
{
"Total Groups":221, <=====
"Watermark limit":210, <=========
"ens224":{
"name":"ens224",
"state":"up",
"address":"40.0.0.1",
"index":6,
"flagMulticast":true,
"flagBroadcast":true,
"lanDelayEnabled":true,
"groups":[
{
"source":"40.0.0.1",
"group":"225.1.1.122",
"timer":"00:03:56",
"sourcesCount":1,
"version":2,
"uptime":"00:00:24"
<\snip>
<snip>
Sw3(config)# do sh ip igmp group
Total IGMP groups: 221
Watermark warn limit(Set) : 210
Interface Address Group Mode Timer Srcs V Uptime
ens224 40.0.0.1 225.1.1.122 ---- 00:04:06 1 2 00:13:22
ens224 40.0.0.1 225.1.1.144 ---- 00:04:02 1 2 00:13:22
ens224 40.0.0.1 225.1.1.57 ---- 00:04:01 1 2 00:13:22
ens224 40.0.0.1 225.1.1.210 ---- 00:04:06 1 2 00:13:22
<\snip>
Signed-off-by: Saravanan K <saravanank@vmware.com>
Valid range for hashmasklen is 0-32 under IPv4; failure to validate this
results in a negative bitshift later
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
This is a full rewrite of the "back end" logging code. It now uses a
lock-free list to iterate over logging targets, and the targets
themselves are as lock-free as possible. (syslog() may have a hidden
internal mutex in the C library; the file/fd targets use a single
write() call which should ensure atomicity kernel-side.)
Note that some functionality is lost in this patch:
- Solaris printstack() backtraces are ditched (unlikely to come back)
- the `log-filter` machinery is gone (re-added in followup commit)
- `terminal monitor` is temporarily stubbed out. The old code had a
race condition with VTYs going away. It'll likely come back rewritten
and with vtysh support.
- The `zebra_ext_log` hook is gone. Instead, it's now much easier to
add a "proper" logging target.
v2: TLS buffer to get some actual performance
Signed-off-by: David Lamparter <equinox@diac24.net>
Line break at the end of the message is implicit for zlog_* and flog_*,
don't put it in the string. Mid-message line breaks are currently
unsupported. (LF is "end of message" in syslog.)
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Problem: This happened in once in a while during testing the scenario multiple
times. When regstop timer expire and at that point if rpf interface doesn't
exist, the register state for the upstream gets struck in reg-prune state indefinitely.
This will not recover even when rpf comes back and traffic resumed because
register state is struck on prune.
RCA: Reg suppression expiry is keeping reg state unchanged when iif is absent.
Fix: When iif is absent during reg suppression expiry, treat it as couldreg
becoming false and move it NO_INFO state.
Signed-off-by: Saravanan K <saravanank@vmware.com>
Problem: output is cut short when prefix string all octets are 3 digit.
RCA: Buffer was allocated only to hold ip addr str.
Fix: Added 3 bytes more to hold prefix length and a /.
Modified buffer in 'show ip pim bsrp-info' and 'show ip pim bsm database'
Signed-off-by: Saravanan K <saravanank@vmware.com>
Zebra is currently sending messages on interface add/delete/update,
VRF add/delete, and interface address change - regardless of whether
its clients had requested them. This is problematic for lde and isis,
which only listens to label chunk messages, and only when it is
waiting for one (synchronous client). The effect is the that messages
accumulate on the lde synchronous message queue.
With this change:
- Zebra does not send unsolicited messages to synchronous clients.
- Synchronous clients send a ZEBRA_HELLO to zebra.
The ZEBRA_HELLO contains a new boolean field: sychronous.
- LDP and PIM have been updated to send a ZEBRA_HELLO for their
synchronous clients.
Signed-off-by: Karen Schoener <karen@voltanet.io>
This is as per RFC. This is identified when conformance suite catched join.
RCA:
Packets were processed without checking allowed dest IP for that packet.
Fix:
Added check for dest IP
Converted this check to a function
Signed-off-by: Saravanan K <saravanank@vmware.com>
RCA: preferred bsr routine, compare address in network byte order
Fix: changed to host format before comparision.
Testing:
Verified between 1.1.2.7 and 10.2.1.1, 10.2.1.1 is chosen as bsr
Initially:
R11# sh ip pim bsr
PIMv2 Bootstrap information
Current preferred BSR address: 1.1.2.7
Priority Fragment-Tag State UpTime
0 2862 ACCEPT_PREFERRED 00:00:30
Last BSM seen: 00:00:30
After next bsr started:
R11# sh ip pim bsr
PIMv2 Bootstrap information
Current preferred BSR address: 10.2.1.1
Priority Fragment-Tag State UpTime
0 3578 ACCEPT_PREFERRED 00:00:01
Last BSM seen: 00:00:01
R11# sh ip pim bsr
PIMv2 Bootstrap information
Current preferred BSR address: 10.2.1.1
Priority Fragment-Tag State UpTime
0 3578 ACCEPT_PREFERRED 00:00:04
Last BSM seen: 00:00:04
Signed-off-by: Saravanan K <saravanank@vmware.com>
Problem:
When the ifchannel is in SGRpt prune, if we receive a join, we go into no info
state but mroute still present with none oil
Join Prune Expiry timer on the ifchannel was still running when
Prune pending expired. This causes ifchannel not to be deleted and hence mroute.
Fix:
Stop expiry timer when we move into NOINFO state and delete the ifchannel.
Signed-off-by: Saravanan K <saravanank@vmware.com>
RCA: Upstreams which are in register state other than noinfo, doesnt remove
register tunnel from oif after it becomes nonDR
Fix: scan upstreams with iif as the old dr and check if couldReg becomes false.
If couldreg becomes false from true, remove regiface and stop reg timer.
Do not disturb the entry. Later the entry shall be removed by kat expiry.
Signed-off-by: Saravanan K <saravanank@vmware.com>
RCA:
Either JP timer is used to send join or join timer.
We are not removing the group from jp aggregate during suppression.
So even if join timer is restarted, jp aggregate expiry during suppression
is sending join for the group.
Fix:
Remove the group from jp aggregate on the neighbor during jp suppression.
Signed-off-by: Saravanan K <saravanank@vmware.com>
RCA: This was todo item in current code base
Fix: Hello sent with 0 hold time before we update the pim ifp primary address
Signed-off-by: Saravanan K <saravanank@vmware.com>
RCA: starg join fell in to SGRpt join check and was treated as SGRpt join
so ifchannel state machine moved from prunepending to noinfo.
Fix: Check if it is starg join and process
Signed-off-by: Saravanan K <sarav511@vmware.com>
RCA:
Trying to get primary address for the interface.
Unnumbered interface pick address from vrf device for non default.
While doing so it ends up in recursion without exit condition if vrf dev doesnt have any address.
Solution:
Break the recursion by checking if it is vrf device.
Signed-off-by: Saravanan K <saravanank@vmware.com>
RCA:
It has asserted because during neighbor delete on a interface,
pim_number_of_nonlandelay_neighbors count has become less less than 0.
This was due to not updating the count when hello option changed.
Fix:
During hello option update, check and increment or decrement when this option changes.
Signed-off-by: Saravanan K <saravanank@vmware.com>
RCA:
1. Prune processing during join state was putting of join expiry timer
2. Join received during prune pending state was not comparing hold time with remaining expiry timer.
Fix:
Fixed as per RFC 4601/7761
Signed-off-by: Saravanan K <saravanank@vmware.com>
RCA: We are ignoring (*,G) prune when the RP in packet is not RP(G)
Fix:
According to RFC 4601 Section 4.5.2:
Received Prune(*,G) messages are processed even if the RP in the message does not match RP(G).
We have allowed starg prune to be processed in the scenario.
Signed-off-by: Saravanan K <saravanank@vmware.com>
RCA: Periodic join is mostly sent by nbr jp timer except for few scenarios by upstream join timer
Fix: If join timer not running, we have to use nbr jp timer to calculate
remaining time for next join.
Signed-off-by: Saravanan K <saravanank@vmware.com>
Issue: Client1------LHR-----(int-1)RP(int-2)------client2
Client2 send IGMP join for group G.
Client1 send IGMP join for group G.
verify show ip mroute in RP, will have 2 OIL.
Client2 send IGMP leave.
Verify show ip mroute in RP, will still have 2.
Root cause: When RP receives IGMP join from client2, it creates
a (s,g) channel oil and add the interface int-2 into oil list and
set the flag PIM_OIF_FLAG_PROTO_IGMP to int-2
Client1 send IGMP join, LHR will send a (*,G) join to RP. RP will
add the interface int-1 into the oil list of (s,g) channel_oil and
will set the flag PIM_OIF_FLAG_PROTO_IGMP and PIM_OIF_FLAG_PROTO_PIM
to the int-1 and set PIM_OIF_FLAG_PROTO_PIM to int-2 as well. It is
happening because of the pim_upstream_inherited_olist_decide() and
forward_on() get all the oil and update the flag wrongly.
So now when client 2 sends IGMP prune, RP will not remove the int-2
from oil list since both PIM_OIF_FLAG_PROTO_PIM & PIM_OIF_FLAG_PROTO_IGMP
are set, it just unset the flag PIM_OIF_FLAG_PROTO_IGMP.
Fix: Introduced new flags in if_channel, PIM_IF_FLAG_MASK_PROTO_PIM
& PIM_IF_FLAG_MASK_PROTO_IGMP. If a if_channel is created because of
pim join or pim (s,g,rpt) prune received, then set the flag
PIM_IF_FLAG_MASK_PROTO_PIM. If a if_channel is created becuase of IGMP
join received, then set the flag PIM_IF_FLAG_MASK_PROTO_IGMP.
When an interface needs to be added into the oil list check if
PIM_IF_FLAG_MASK_PROTO_PIM or PIM_IF_FLAG_MASK_PROTO_IGMP is set, then
update oil flag accordingly.
Signed-off-by: Sarita Patra <saritap@vmware.com>
Issue 1: "show ip pim interface traffic" not show prune TX/RX
Rootcause : not added the variable in the json
Fix : add prune TX/RX in show ip pim interface traffic json
Issue 2: "show ip pim rp-info" not shows the key iAmRp when it is false
Rootcause: Only display the key when the value is true.
Fix: add iAmRp as false in show ip pim rp-info json
Issue 3: "show ip pim rp-info" not showing outbound interface if it is empty
Rootcause: Only display when there is any OIL
Fix: When RP is not reachable then, the outbound interface is Unknown
The command "show ip pim rp-info json" not displaying the outbound
interafce if it is unknown
Signed-off-by: Sarita Patra <saritap@vmware.com>
S - Sparse Mode
C - indicates there is a member of the group directly connected to the router.
R -set on an (S, G) by the receipt of an (S, G) RP bit prune message.
F -This indicates that this router is a FHR and send register messages to RP to inform RP of this active source
P - OIL list is NULL. That means the router will send a prune.
T - At least one packet received via SPT.
Signed-off-by: Sarita Patra <saritap@vmware.com>
Problem:
We are receiving PIM BSR packet over the pim interface which has no nbrs
According to RFC 5059 Sec 3.4
When a Bootstrap message is forwarded, it is forwarded out of every
multicast-capable interface that has PIM neighbors (including the one
over which the message was received).
RCA:
We are sending to all pim neighbors.
Fix:
We will avoid the interfaces which has no neighbors.
Verification: Manually verified that Pim router doesn't forward to intf with no nbrs
Signed-off-by: Saravanan K <saravanank@vmware.com>
RCA: When configured more than 32(MAXVIS), the inerfaces that are configured
after 32nd interfaces have the value of MAXVIF.
This is used as index to access the free vif tracker of array size 32(MAXVIFS).
So the channel oil list pointer which is present as the next field in pim structure get corrupt, when updating free vif.
This gets accessed during rpf update resulting in crash.
Fix: Refrain from allocating mcast interface structure and throw config error when more than MAXVIFS are attempted to configure.
Max vif checks are exempted for vrf device and pimreg as vrf device will be the first interface and not expected to fail and pimreg has reserved vif.
vxlan tunnel termination device creation has this check and throw warning on max vif.
All other creation are through CLI.
Signed-off-by: Saravanan K <saravanank@vmware.com>
Problem: Route node is not de referenced after search when pim debug events are
not enabled when pim_rp_find_match_group is called. So this memory will not get
released when route node is deleted after hitting this path.
RCA: Dereferencing is done inside debug condition.
Fix: Moving outside debug condition
Signed-off-by: Saravanan K <saravanank@vmware.com>
Root cause: The header display is put in common outside the vtysh/json if-else.
Fix: Brought inside vtysh condition.
Signed-off-by: Saravanan K <saravanank@vmware.com>
Issue: when (*,G) has some receiver and directly connected source sends traffic,
new (S,G) entry created is not inheriting the oil from (*,g)
RCA: pim_mroute_msg_nocache haven't assume FHR have (*,g) member ports
Fix : Added inherit oil from parent from (*,g) receivers to get added
Signed-off-by: Saravanan K <saravanank@vmware.com>
Issue: When any interface is getting added/deleted in the outgoing
interface list, it calls pim_mroute_add() which is updating the
mroute_creation time without checking if the mroute is already
installed in the kernel.
Fix: Check if mroute is already installed, then dont refresh the
mroute_creation timer.
Signed-off-by: Sarita Patra <saritap@vmware.com>
Issue: show ip mroute displays the mroute uptime (time when
mroute installed into the kernel) per oif.
This is confusing.
Fix: Display mroute uptime per (s,g) mroute entry.
Signed-off-by: Sarita Patra <saritap@vmware.com>
There exists a chain of events where calling pim_mlag_up_peer_deref
can free the up pointer. Prevent a use after free by returning
the up pointer as needed and checking to make sure we are
ok.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
On shutdown processing we may have gotten a interface down event
which might clear the rpf interface and we might trigger a
work queue item on the vxlan_sg to send a NULL register.
Ensure that we cannot attempt to do the impossible.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
VRF deletion events here calling hash_clean() with
nothing to clean up the vxlan_sg's associated with it.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When pim receives a register packet, we will apply the
received source to the prefix list. If accepted normal
processing continues. If denied we will send a register
stop message to the source.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
We already use the pim pointer a bunch off of pim_ifp->pim
just add another pim variable to allow us to shorten code
a bit.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The memory type PIM_SPT_PLIST_NAME is specific to
SPT but we are going to store more prefix-list names
in pim, make it generic to allow for less confusion
in the future.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When the (S,G) KAT expires we need to poll for activity before dropping the
entry as traffic may have been forwarded by the dataplane since the last
periodic poll cycle.
This only works if traffic is being forwarded by the kernel i.e. if the
entries were HW accelerated via an ASIC we may still miss out on last
minute activity on the mroute in the HW.
Ticket: CM-26871
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
An mroute can transition from non-origination to a vxlan origination
mroute. In that case we need to re-evaluate if the interfaces in the
OIL need to be muted; pimreg and termination device need to be muted (if
they were previously un-muted).
Dump in a problem state:
=======================
root@TORC11:~# net show pim state
Codes: J -> Pim Join, I -> IGMP Report, S -> Source, * -> Inherited from (*,G), V -> VxLAN, M -> Muted
Active Source Group RPT IIF OIL
1 * 239.1.1.100 y uplink-1 pimreg(I ), ipmr-lo( J )
1 36.0.0.11 239.1.1.100 n peerlink-3.4094 ipmr-lo( * ), uplink-1( J ), uplink-2( J ), peerlink-3.4094( V )
PS: ipmr-lo should have M set in (36.0.0.11,239.1.1.100)
Ticket: CM-26747
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Add a special catch to the test for pim_macro_chisin_pim_include
to allow the LHR to signal interest in joining upstream.
This will allow both the DR and non DR of the ActiveActive
situation to draw traffic to itself.
The non-DR will continue to not forward traffic.
Ticket: CM-26610
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Series of events leading to the problem -
1. (S,G) has been pruned on the rp on downlink-1
2. a (*,G) join is rxed on downlink-1 without the source S. This
results in the (S,G,rpt) prune state being cleared on downlink-1.
As a part of the clear the ifchannel associated with downlink-1
is deleted.
3. The ifchannel_delete handling is expected to add downlink-1
as an inherited OIF to the channel OIL (which it does). However
it is also added in as an immediate OIF (accidentally) as the
ifchannel is still present (in the process of being deleted).
To avoid the problem defer pim_upstream_update_join_desired
evaluation until after the channel is deleted.
Relevant debug logs -
PIM: pim_ifchannel_delete: ifchannel entry (27.0.0.15,239.1.1.106)(downlink-1) del start
PIM: pim_channel_add_oif(pim_ifchannel_delete): (S,G)=(27.0.0.15,239.1.1.106): proto_mask=4 OIF=downlink-1 vif_index=7: DONE
PIM: pimd/pim_oil.c pim_channel_del_oif: no existing protocol mask 2(4) for requested OIF downlink-1 (vif_index=7, min_ttl=1) for channel (S,G)=(27.0.0.15,239.1.1.106)
PIM: pim_upstream_switch: PIM_UPSTREAM_(27.0.0.15,239.1.1.106): (S,G) old: NotJoined new: Joined
PIM: pim_channel_add_oif(pim_upstream_inherited_olist_decide): (S,G)=(27.0.0.15,239.1.1.106): proto_mask=2 OIF=downlink-1 vif_index=7 added to 0x6 >>>>>>>>>>>>>>>>>>
PIM: pim_upstream_del(pim_ifchannel_delete): Delete (27.0.0.15,239.1.1.106)[default] ref count: 2 , flags: 81 c_oil ref count 1 (Pre decrement)
PIM: pim_ifchannel_delete: ifchannel entry (27.0.0.15,239.1.1.106)(downlink-1) del end
Ticket: CM-26732
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Additional protocols were being set on the OIF proto-mask without
logs. Added logs in that area.
Also added start and end logs to ifchannel_delete to help
identify state machine changes that play out as a part of this
event handling.
Ticket: CM-26732
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
This patch does two things:
1) Ensure the decoding of stream data between pim <-> zebra is properly
decoded and we don't read beyond the end of the stream.
2) In zebra when we are freeing memory alloced ensure that we
actually have memory to delete before we do so.
Ticket: CM-27055
Signed-off-by: Satheesh Kumar K <sathk@cumulusnetworks.com>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Initially, MLAG Sync is happened at pim_ifchannel, this is mainly to
support even config mismatches(missing configuration of dual active).
But this causes more syncs for each entry.
and also it is not In-line with PIM EVPN. to avoid that moving to
pm_upstream based syncing.
Signed-off-by: Satheesh Kumar K <sathk@cumulusnetworks.com>
(S,G) entries that inherit ipmr-lo into the OIL also inherit
the DF role from the parent (*, G) entry.
This change is done primarily to simplify the sync process and
to prevent the MLAG peers from having to track (S, G) activity etc.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
There exists the possibility that a RP exists as a anycast
pair for a lan segment. As such one side may receive
the register and properly handle the registration mechanics.
The one that does not receive the register packets will still
get S,G state and WRVIFWHOLE upcalls across the lan. In
this case notice that we have not received the Registration
packets and prevent nexthop lookups.
Ticket: CM-27466
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
There was some code missed during the upstreaming process
due to code squash. Identify and put into a commit
to keep code consistent and correct.
Signed-off-by: Satheesh Kumar K <sathk@cumulusnetworks.com>
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When the WRVIFWHOLE callback is made with a iifp of the pimreg
device we *know* that the packet is a PIM Register packet
( see net/ipv4/ipmr.c for kernel behavior ). As such
we know that we will shortly read the pim register packet
and handle it through those mechanics. There is nothing
to do here so we can move along.
Ticket: CM-27729
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Issue 1:
1. Enable pim on an interface.
2. Configure query-interval or query max response time,
which results in pimd crash.
Root cause:
1. When pim is enabled on an interface, it creates a igmp socket
with querier_timer and other_querier time as NULL.
2. When query-interval/max_response_time is configured, it call the
function igmp_sock_query_reschedule() to reshedule the query. This
function check either of querier_timer or other_querier timer should
be running. Since in this case both are NULL, it results in crash.
Issue 2:
1. Enable pim on an interface.
2. Execute no ip igmp query-interval or query max response time,
which results in pimd crash.
Root cause:
1. When pim is enabled on an interface, it creates a pim interface
with querier_timer and other_querier time as NULL.
2. When no ip igmp query-interval/max_response_time is executed, it will
check either of querier_timer or other_querier timer should be running.
Since in this case both are NULL, it results in crash.
Fix:
When pim is enabled on an interface, it creates a igmp socket with
mtrace_only as true. So add a check if mtrace_only is true, then don't
reshedule the query.
Signed-off-by: Sarita Patra <saritap@vmware.com>
Issue:
Client---LHR---RP
1. Add kernel route for RP on LHR. Client send join
2. (*,G) will be get created in LHR and RP.
3. Kill the FRR on all the nodes
4. Start FRR only on LHR node
5. In LHR, (*, G) will be created with iif as unknown.
Root cause:
In the step 4, When LHR will receive igmp join, it will call
the function pim_ecmp_fib_lookup_if_vif_index which will look
for nexthop to RP with neighbor needed as false. So RPF lookup will
be true as the route is present in the kernel. It will create a
(*, G) channel_oil with incoming interface as the RPF interface
towards RP and install the (*,G) mroute in kernel.
Along with this (*,G) upstream gets craeted, which call the function
pim_rpf_update, which will look for the nexthop to RP with neighbor
needed as true. As the frr is not running in RP, no neighbor is present
on the nexthop interface. Due to which this will fail and will update
the channel_oil incoming interface as MAXVIFS(32).
Fix:
pim_ecmp_fib_lookup_if_vif_index() call the function pim_ecmp_nexthop_lookup
with neighbor_needed as true.
Signed-off-by: Sarita Patra <saritap@vmware.com>
Issue: REGISTER-STOP Rx is always displaying 0.
Root-cause: pim_ifstat_reg_stop_recv is not getting
incremented when register stop message is received.
Fix: Increment pim_ifstat_reg_stop_recv on receiving
of pim register stop packet.
Signed-off-by: Sarita Patra <saritap@vmware.com>
It's been a year search and destroy.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
- Add extern modifier to some declarations in header file and move
qpim_all_pim_routers_addr definition to pimd/pimd.c
`GCC now defaults to -fno-common. As a result, global variable accesses
are more efficient on various targets. In C, global variables with
multiple tentative definitions now result in linker errors.`
Taken from https://gcc.gnu.org/gcc-10/changes.html
Signed-off-by: Tomas Korbar <tkorbar@redhat.com>
1. show ip pim mlag summary
provides MLAG session information and stats
2. show ip pim mlag upstream
displays the upstream entries synced across the MLAG switches
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
ipmr-lo is an internally added device used for multicast vxlan tunnel
termination. This device is not expected to be managed by the admin
however in the case it is accidentally shut we need to be able handle
it by recovering when it is "no shut" again.
Ticket: CM-24985
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
PIM MLAG DF election API was not being triggered on cost change if the
upstream neighbor remained the same.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
In an anycast VTEP setup the peerlink_rif is added as a static OIF
to the originating mroute (bypassing the pim state machine). This is
needed to ensure both MLAG switches rx a copy of encapsulated BUM flow.
We were not handling link state changes on this static OIF resulting
in the wrong vifi being used in the OIL (because of vifi re-allocation).
This commit re-acts to oper state changes by deleting the OIF on link
down and re-adding it on link up.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
A local membership is created on the vxlan termination device ipmr-lo. This
is done to -
1. Pull multicast vxlan tunnel traffic to the VTEP for termination by
triggering JoinDesired on the BUM multicast group.
2. Include the OIF in the mroute to signal to the dataplane component
that flow needs to be vxlan terminated.
Earlier we were overloading the PIM_UPSTREAM_FLAG_MASK_SRC_IGMP for
this local membership creation but that is creating confusion both in
the state machine and in the show outputs. To avoid that we use the
more apparent PIM_UPSTREAM_FLAG_MASK_SRC_VXLAN_TERM. With this change -
1. We get LHR functionality for VXLAN_TERM mroutes
2. OIF is populated with PIM_OIF_FLAG_PROTO_PIM only
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
When local member is added the (*, G) entry may already be in a JOINED
state. In that case the OIL is not updated i.e. pim_channel_add_oif is
not happening for ipmr-lo. Because of this the traffic associated with
the multicast vxlan tunnel is pulled down to the VTEP but not terminated
by the kernel.
This change force updates the OIL anytime ipmr-lo is added or removed
as a local member.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
This is not causing functional problems but has become a source
of confusion. DF status is only relevant to multicast tunnel decaps.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
The RPF cost is incremented by 10 if the RPF interface is the peerlink-rif.
This is used to force the MLAG switch with the lowest cost to the RPF
to become the MLAG DF. If a switch has to go via the peerlink-rif to get
to the RP or source it simplly cannot be the designated forwarder.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
DF election is only run for (*,G) entries i.e. election is skipped
for (S,G) entries that are setup as a result of SPT switchover. (S,G)
entries inherit the DF role from the parent (*,G) entry. So the DF is
responsible for terminating all sources associated with a group.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
1. Upstream entries associated with tunnel termination mroutes are
synced to the MLAG peer via the local MLAG daemon.
2. These entries are installed in the peer switch (via an upstream
ref flag).
3. DF (Designated Forwarder) election is run per-upstream entry by both
the MLAG switches -
a. The switch with the lowest RPF cost is the DF winner
b. If both switches have the same RPF cost the MLAG role is
used as a tie breaker with the MLAG primary becoming the DF
winner.
4. The DF winner terminates the multicast traffic by adding the tunnel
termination device to the OIL. The non-DF suppresses the termination
device from the OIL.
Note: Before the PIM-MLAG interface was available hidden config was
used to test the EVPN-PIM functionality with MLAG. I have removed the
code to persist that config to avoid confusion. The hidden commands are
still available.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Channel with the MLAG daemon is setup on the first VxLAN BUM MDT or
pim-mlag AA SVI.
This channel is used for -
1. rxing MLAG status status updates (peer state, role etc.)
2. for syncing active-active upstream entries with the peer MLAG
switch.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
The vrrpd one conflicts with the standalone vrrpd package; also we're
installing daemons to /usr/lib/frr on some systems so they're not on
PATH.
Signed-off-by: David Lamparter <equinox@diac24.net>
Update zclient_lookup_nexthop_once() to create the zapi
header using the vrf_id on the pim->vrf struct.
This is the one we do a check on a couple lines before, so
we should be using it when we actually create the header as
well.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Allow pimd to stop the lookup if zebra tells pimd that the
lookup failed due to a zapi error. Otherwise, it will keep
waiting for a nexthop message that will never come.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Convert the upstream_list and hash to a rb tree, Significant
time was being spent in the listnode_add_sort. This reduces
this time greatly.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
The channel_oil_list and hash are taking significant
cpu at scale when adding to the sorted list. Replace
with a RB_TREE.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Kernel might not hand us a bad packet, but better safe than sorry here.
Validate the IP header length field. Also adds an additional check that
the packet length is sufficient for an IGMP packet, and a check that we
actually have enough for an ip header at all.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
We check that the IGMP message is sufficently sized for an mtrace query,
but not a response, leading to uninitialized stack read.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Allow 'ip igmp join' to join group for any source if no source is
specified.
Disallow joining source "0.0.0.0" as it is used to define an
any-source multicast group.
Signed-off-by: Liam McBirnie <liam.mcbirnie@boeing.com>
Commit: ddbf3e6060
This commit modified the interface up handling code in
ZAPI such that the zclient handled the decoding for you.
Prior to this commit ospf assumed that it could use the
old ifp pointer to know state before reading the stream.
This lead to a situation where ospf would `smartly` track
and do the right thing in this situation. This commit
changed this assumption and in certain scenarios, say
a interface was changed after it was already up would
lead to situations where ospf would not properly handle
the new interface up.
Modify ospf to track data that is important to it in
it's interface->info pointer.
This code pattern was followed in both eigrp and pim.
In eigrp's case it was just behaving weirdly in any event
so fixing this pattern is not a big deal. In pim's
case it was not properly using this so it's a no-op
to fix.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
use_rpt macro depends on JoinDesired macro and is mostly independent of the
actual RPF interface i.e. doesn't change when the RPF interface changes.
There is however one exception to this handling and that is on the
first hop router (DR or non-DR). On the DR the FHR flag is set so the
RPF interface stays irrelevant to use_rpt eval. But on the non-DR the
IIF is the only way to know we are directly connected to the SG i.e.
to know that we must NOT switch the source to RPT.
This commit fixes up the order of use_rpt eval -
1. it is done before mroute programming
2. but after IIF setup, for SRC_NOCACHE and STATIC_IIF upstream entries
Note: drop an unnecessary check to verify that the RPF interface is
pim enabled. This is just to make the code consistent.
Ticket: CM-27446
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
A variety of buffer overflow reads and crashes
that could occur if you fed bad info into pim.
1) When type is setup incorrectly we were printing the first 8 bytes
of the pim_parse_addr_source, but the min encoding length is
4 bytes. As such we will read beyond end of buffer.
2) The RP(pim, grp) macro can return a NULL value
Do not automatically assume that we can deref
the data.
3) BSM parsing was not properly sanitizing data input from wire
and we could enter into situations where we would read beyond
the end of the buffer. Prevent this from happening, we are
probably left in a bad way.
4) The received bit length cannot be greater than 32 bits,
refuse to allow it to happen.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Inherited OIL is used as a part of the JoinDesired macro. And in FRR we
use the channel OIL as the inherited OIL (to reduce processing overhead
everytime JD needs to be re-evaluated). On a FHR pimreg is a part of the
channel-OIL but must not be used for JD computation.
This commit blacklists pimreg from the inherited_oil i.e. present but
ignored.
Note: This fixup is being done to address topotest failures.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
If a register packet is received that is less than the PIM_MSG_REGISTER_LEN
in size we can have a possible situation where the data being
checksummed is just random data from the buffer we read into.
2019/11/18 21:45:46 warnings: PIM: int pim_if_add_vif(struct interface *, _Bool, _Bool): could not get address for interface fuzziface ifindex=0
==27636== Invalid read of size 4
==27636== at 0x4E6EB0D: in_cksum (checksum.c:28)
==27636== by 0x4463CC: pim_pim_packet (pim_pim.c:194)
==27636== by 0x40E2B4: main (pim_main.c:117)
==27636== Address 0x771f818 is 0 bytes after a block of size 24 alloc'd
==27636== at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==27636== by 0x40E261: main (pim_main.c:112)
==27636==
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When you configure interface configuration without explicitly
configuring pim on that interface, we were not creating the pimreg
interface and as such we would crash in an attempted register
since the pimreg device is non-existent.
The crash is this:
==8823== Invalid read of size 8
==8823== at 0x468614: pim_channel_add_oif (pim_oil.c:392)
==8823== by 0x46D0F1: pim_register_join (pim_register.c:61)
==8823== by 0x449AB3: pim_mroute_msg_nocache (pim_mroute.c:242)
==8823== by 0x449AB3: pim_mroute_msg (pim_mroute.c:661)
==8823== by 0x449AB3: mroute_read (pim_mroute.c:707)
==8823== by 0x4FC0676: thread_call (thread.c:1549)
==8823== by 0x4EF3A2F: frr_run (libfrr.c:1064)
==8823== by 0x40DCB5: main (pim_main.c:162)
==8823== Address 0xc8 is not stack'd, malloc'd or (recently) free'd
pim_register_join calls pim_channel_add_oif with:
pim_channel_add_oif(up->channel_oil, pim->regiface,
PIM_OIF_FLAG_PROTO_PIM);
We just need to make srue pim->regiface exists once we start configuring
pim.
Fixes: #5358
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When configuring a RP, dissallow the choice of 0.0.0.0 or
255.255.255.255 as the address as that they make no sense
what so ever.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
We were adding a newline for the source in some cases
but not others and tighten up the display of data
eva# show ip pim rp-info
RP address group/prefix-list OIF I am RP Source
10.254.0.1 224.0.0.0/4 lo yes Static
4.4.4.4 225.1.2.3/32 abcdefghijklmno yes Static
10.0.20.45 226.200.100.100/32 r1-eth0 no Static
eva#
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
SPT switchover handling is done by adding pimreg in the OIL of the (*, G)
entry on the LHR. This causes multicast data with that destination to be
sent to pimd as IGMPMSG_WHOLEPKT. These packets trigger creation of (S,G)
and also register forwarding. However register forwarding must only be done
if the router is also a FHR. That FHR check was missing causing strange
source registrations from multicast routers that were not directly
connected to the source.
Relevant logs from LHR -
PIM: pim_mroute_msg: pim kernel upcall WHOLEPKT type=3 ip_p=0 from fd=9 for (S,G)=(6.0.0.30,239.1.1.111) on pimreg vifi=0 size=98
PIM: Sending (6.0.0.30,239.1.1.111) Register Packet to 81.0.0.5
PIM: pim_register_send: Sending (6.0.0.30,239.1.1.111) Register Packet to 81.0.0.5 on swp2
And 6.0.0.30 is clearly not directly connected on that router -
root@tor-11:~# ip route |grep 6.0.0.30 -A2
6.0.0.30 proto ospf metric 20
nexthop via 6.0.0.22 dev swp1 weight 1 onlink
nexthop via 6.0.0.23 dev swp2 weight 1 onlink
root@tor-11:~#
Ticket: CM-24549
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
It was causing a Join on (S,G) who's prune state was being cleared. This
was an inactive (KAT not running; no immediate OIL) entry that was being
flushed out but because of this incorrect Join (that was being done with
out join-state checks) the source was getting populated repeatedy i.e.
never aged.
Output of "ip monitor mroute"
=============================
(27.0.0.11,239.1.1.102) Iif: lo State: resolved Table: default
Deleted (27.0.0.11,239.1.1.102) Iif: lo State: resolved Table: default
(27.0.0.11,239.1.1.102) Iif: pimreg State: resolved Table: default
(27.0.0.11,239.1.1.102) Iif: uplink-1 State: resolved Table: default
(27.0.0.11,239.1.1.102) Iif: uplink-1 State: resolved Table: default
(27.0.0.11,239.1.1.102) Iif: uplink-1 State: resolved Table: default
(27.0.0.11,239.1.1.102) Iif: lo Oifs: uplink-1 State: resolved Table: default
(27.0.0.11,239.1.1.104) Iif: lo Oifs: pimreg uplink-1 State: resolved Table: default
(27.0.0.11,239.1.1.102) Iif: lo Oifs: pimreg uplink-1 State: resolved Table: default
Deleted (27.0.0.11,239.1.1.102) Iif: lo State: resolved Table: default
(27.0.0.11,239.1.1.102) Iif: pimreg State: resolved Table: default
(27.0.0.11,239.1.1.102) Iif: uplink-1 State: resolved Table: default
(27.0.0.11,239.1.1.102) Iif: uplink-1 State: resolved Table: default
(27.0.0.11,239.1.1.102) Iif: uplink-1 State: resolved Table: default
(27.0.0.11,239.1.1.102) Iif: lo Oifs: uplink-1 State: resolved Table: default
These mroute events (on a no longer existing multicast souce) continue in
a never ending loop.
Triggered joins/prunes MUST only done via state machine transitions i.e.
via pim_upstream_update_join_desired.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
JD macro is defined by the RFC as -
bool JoinDesired(S,G) {
return (immediate_olist(S,G) != NULL
OR (KeepaliveTimer(S,G) is running
AND inherited_olist(S,G) != NULL))
}
However for MSDP synced SA the KAT will not be running so an exception is
needed. Earlier I had done this by relaxing KAT_run requirements entirely
on the RP. However as that prevents the source from being aged out in some
cases I have made the check more narrow i.e. has to an MSDP peer added
entry.
Ticket: CM-24398
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Added event logs around add/del of upstream entries into the nbr's
jp-agg list. This is to help debug a problem with stale (deleted)
upstream entries being present in the list causing pimd to crash on
the periodic processing.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Today we are only pruning the SPT when (S,G) upstream entry
switches from Joined toNotJoined. This leaves the source still
pruned along the RPT till the next periodic XG join-prune is sent
to the RPF(RP). Traffic from the source will be blackholed for this
duration. To prevent that we need send a new JP message
to RPF(RP) immediately.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
It is now used to evaluate and display join-desired state for
each upstream entry -
root@spine-1:~# net show pim upstream-join-desired
Source Group EvalJD
* 239.1.1.111 yes
6.0.0.28 239.1.1.111 yes
6.0.0.29 239.1.1.111 no
6.0.0.30 239.1.1.111 yes
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
This re-naming was needed because the JD state on an upstream is
not just based on channel info i.e. we can have JD=true even if there
is no downstream channel. The "show ip upstream-join-desired" command
will be changed to display that info i.e. upstream's JD state instead
of downstream channel params. The downstream channel params are now
available via "show ip pim channel"
PS: This change maybe reverted if upstream NAKs it. But there is a
pressing need for it to debug some not-so-reproduible problems.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
This was causing pimd to crash later; call-stack -
(gdb) bt
context=<optimized out>) at lib/sigevent.c:254
group=group@entry=0x7ffffa9797e0) at pimd/pim_rp.c:207
grp=grp@entry=0x7ffffa9799fe, sgs=sgs@entry=0x560ac069edb0, size=52)
at pimd/pim_msg.c:200
groups=<optimized out>) at pimd/pim_join.c:562
at pimd/pim_neighbor.c:288
at lib/thread.c:1599
at lib/libfrr.c:1024
envp=<optimized out>) at pimd/pim_main.c:162
(gdb) fr 4
group=group@entry=0x7ffffa9797e0) at pimd/pim_rp.c:207
207 pimd/pim_rp.c: No such file or directory.
(gdb) fr 6
grp=grp@entry=0x7ffffa9799fe, sgs=sgs@entry=0x560ac069edb0, size=52)
at pimd/pim_msg.c:200
200 pimd/pim_msg.c: No such file or directory.
(gdb) p source->up->sg_str
$1 = '\000' <repeats 31 times>, <incomplete sequence \361>
(gdb)
This problem can manifest in the following event sequence -
1. upstream RPF neighbor is resolved
2. upstream RPF neighbor becomes unresolved (but upstream entry
stays on the jp-agg list)
3. upstream entry is removed
on the next old-neighbor jp-agg-list processing the stale entry is
accessed resulting in the crash.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Dumps while in problem state -
============================
[from "show ip pim state"]
Active Source Group RPT IIF OIL
1 6.0.0.31 239.1.1.111 n swp1 swp4( J * )
[from "show ip pim join"]
Interface Address Source Group State Uptime Expire Prune
swp3 6.0.0.22 6.0.0.31 239.1.1.111 JOIN --:--:-- 03:11 --:--
You can see from the dumps that the pim downstream router has joined on
swp3 but that OIF has not been added to the OIL with flag
PIM_OIF_FLAG_PROTO_PIM. This is because the join was rxed while the
ifchannel was in a prune-pending state.
Relevant logs -
===============
[
PIM: recv_prune: prune (S,G)=(6.0.0.31,239.1.1.111) rpt=1 wc=0 upstream=6.0.0.22 holdtime=210 from 6.0.0.28 on swp3
PIM: pim_upstream_ref(pim_ifchannel_add): upstream (6.0.0.31,239.1.1.111) ref count 3 increment
PIM: pim_upstream_add(pim_ifchannel_add): (6.0.0.31,239.1.1.111), iif 6.0.0.26/0 (swp1) found: 1: ref_count: 3
PIM: pim_ifchannel_add: ifchannel (6.0.0.31,239.1.1.111) is created
PIM: pim_joinprune_recv: SGRpt flag is set, del inherit oif from up (6.0.0.31,239.1.1.111)
PIM: pim_mroute_add(pim_channel_del_oif), vrf default Added Route: (6.0.0.31,239.1.1.111) IIF: swp1, OIFS: swp4
PIM: pim_channel_del_oif(pim_joinprune_recv): (S,G)=(6.0.0.31,239.1.1.111): proto_mask=4 IIF:1 OIF=swp3 vif_index=3
PIM: recv_join: join (S,G)=(6.0.0.31,239.1.1.111) rpt=0 wc=0 upstream=6.0.0.22 holdtime=210 from 6.0.0.28 on swp3
PIM: PIM_IFCHANNEL(swp3): (6.0.0.31,239.1.1.111) is switching from SGRpt(PP) to JOIN
PIM: Sending Request for New Channel Oil Information(6.0.0.31,239.1.1.111) VIIF 1(default)
]
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
This is needed for two reasons -
1. The inherited OIL needs to be setup independent of the RPF interface
to allow correct computation of the JoinDesired macro.
2. The RPF interface is computed at the time of MFC programming so
it is not possible to permanently evict the OIF at that time oif_add
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
When a inherited OIL becomes empty join-desired can go to false. So
we need to re-run join-desired evaluation on any inherited OIL changes.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
The macro was always returning non-empty because of comparing an
array of u8_t with an array of u32_t.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
If an dummy upstream entry (no RPF nbr) which is already in a JOINED
state is resolved we were not triggering an immediate join via the
per-interface upstream switch list.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
A dummy pim upstream entry can be in a JOINED state before its RPF nbr is
added. Handle that case by triggering an immediate join.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Deviations -
1. Avoid using SPTbit setting. Replace that with Use_Spt macro.
2. If S is supposed to be forwarded along the RPT but has an empty OIL
prune it.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
1. KAT should be re-started only if traffic rxed along the SPT i.e.
IIF == RPF_Interface(S).
Only exception to the rule is if you are LHR.
2. KAT should be started on all routers (not just FHR, RP, LHR).
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Criteria for switching to SPT is different on RP and LHR. Re-name
the functions to make that apparent.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Joined state is computed based on the downstream state and cannot be
changed if the RPF link flaps.
Reference: rfc 7761, section 4.5.5
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
This commit includes the following changes -
1. kat needs to be included when evaluting join desired on a (S,G)
entry.
2. there were cases where we were adding OIF based on joindesired
being true for unrelated reasons (on other OIFs). cleaned up those
cases.
3. make all calls to pim_upstream_switch conditional on the JoinDesired
macro.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
RP config change is a big hammer and use_rpt/spt needs to be
re-evaluated on all existing (S,G) entries.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
If a source is being forwarded along the RPT it uses the parent (*,G)'s
IIF. When the parent's IIF changes all the children need to be updated
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
mfcc_parent for an (S, G) entry was being updated on any upstream RPF
change. With the change to use RPT for (S,G) in some cases we can no
longer do that. Instead the upstream entry's RPF neigbor is managed
separately form the channel_oil's mfcc_parent i.e. via NHT. And the
mfcc_parent is evaluated at the time of mroute programming.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
An (S,G) mroute can be created as a result of rpt prune. However that
entry needs to stay on the parent (*,G)'s tree (IIF) till a decision is
made to switch the source to the SPT.
The decision to stay on the RPT is made based on the SPTbit setting
according to - RFC7761, Section 4.2 “Data Packet Forwarding Rules”
However those rules are hard to achieve when hw acceleration i.e.
control and data planes are separate. So instead of relying on data
we make the decision of using SPT if we have decided to join the SPT -
Use_RPT(S,G) {
if (Joined(S,G) == TRUE // we have decided to join the SPT
OR Directly_Connected(S) == TRUE // source is directly connected
OR I_am_RP(G) == TRUE) // RP
//use_spt
return FALSE;
//use_rpt
return TRUE;
}
To make that change some re-org was needed -
1. pim static mroutes and dynamic (upstream mroutes) top level APIs
have been separated. This is to limit the state machine to dynamic
mroutes.
2. c_oil->oil.mfcc_parent is re-evaluated based on if we decided
to use the SPT or stay on the RPT.
3. upstream mroute re-eval is done when any of the criteria involved
in Use_RPT changes.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Theoretically there should be no case where the channel-oil hangs
around after the upstream entry is removed. But currently there are
cases where it does. This is a precautionary fixup till we are
rid off all of those cases.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
1. This avoids the needs to re-run "muting" decisions.
2. Avoids the need to restore's pim OIL after fixup and send to kernel
(this is getting harder to manage).
In the future we need to also move the PIM maintained channel OIL from
an array of MAXVIFs to a simple DLL. This will be a significant
optimization in memory usage and preformance (OIL reads, copies etc).
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
If an mroute loses DF election (with the MLAG peer) it has to stop
forwarding traffic on active-active devices such as ipmr-lo used
for vxlan traffic termination. To acheive that this commit
introduces a concept of OIF muting. That way we can let the PIM and
IGMP state machines play out and silence OIFs after the fact.
Relevant outputs:
=================
1. muted OIFs are displayed with the M flag in "pim state" -
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
root@TORC12:~# net show pim state |grep "27.0.0.13"|grep 100
1 27.0.0.13 239.1.1.100 uplink-1 ipmr-lo( *M)
root@TORC12:~#
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2. And supressed altogether in the mroute output -
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
root@TORC12:~# net show mroute |grep "27.0.0.13"|grep 100
27.0.0.13 239.1.1.100 none uplink-1 none 0 --:--:--
root@TORC12:~#
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
These logs were printing file name which has little value (is always
pim_oil.c). Instead print the caller.
add_oif/del_oif are being called directly from one too many. Instead OIF
setup needs to be consolidated via the PIM state machine. These
debugs are expected to help in understanding what needs to be cleaned up.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
1. add the Mlag ProtoBuf Lib to Zebra Compilation
2. Encode the messages with protobuf before writing to MLAG
3. Decode the MLAG Messages using protobuf and write to clients
based on their subscrption.
Signed-off-by: Satheesh Kumar K <sathk@cumulusnetworks.com>
This includes:
1. Defining message formats
2. Stream Decoding after receiving the message at PIM
3. Handling MLAG UP & Down Notifications
Signed-off-by: Satheesh Kumar K <sathk@cumulusnetworks.com>
when ever a FRR Client wants to send any data to another node
using MLAG Channel, uses below mechanisam.
1. sends a MLAG Registration to zebra with interested messages that
it is intended to receive from peer.
2. In response to this request, Zebra opens communication channel with
MLAG. and also in Rx. diretion zebra forwards only those messages which
client shown interest during registration
3. when client is no-longer interested in communicating with MLAG, client
posts De-register to Zebra
4. if this is the last client which is interested for MLAG Communication,
zebra closes the channel.
why PIM Needs MLAG Communication
================================
1. In general on LAN Networks elecetd DR will send the Join towards
Multicast RP in case of a LHR and Register in case of FHR.
2. But in case DR Goes down, traffic will be re-converged only after
the New DR is elected, but this can take time based on Hold Timer to
detect the DR down.
3. this can be optimised by using MLAG Mecganisam.
4. and also Traffic can be forwarded more efficiently by knowing the cost
towards RP using MLAG
Signed-off-by: Satheesh Kumar K <sathk@cumulusnetworks.com>
When receiving igmp packets we are spitting out a lot of
debugs. Attempt to clean this up to allow us to understand
what is going on a bit better by just being able to look
at the log file.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When you turn on `debug igmp trace` we are seeing a bunch
of debugs associated with pim processing. This is because we were
using PIM_DEBUG_TRACE which is both `debug igmp trace` and `debug pim trace`
when tracing igmp code it would be nice to only see igmp work.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
We are seeing situations where PIM is sending a IGMP v3 query
and immediately receiving back up the pim kernel interface the
query from itself:
from `show int brief`:
swp7 up default 192.168.202.1/24
We are also receiving these debugs:
2019-11-11T20:52:40.452307+00:00 leaf02 pimd[1592]: Send IGMPv3 query to 224.4.0.8 on swp7 for group 224.4.0.8, sources=0 msg_size=12 s_flag=0 QRV=2 QQI=125 QQIC=7d
2019-11-11T20:52:40.452430+00:00 leaf02 pimd[1592]: pim_mroute_msg(default): igmp kernel upcall on swp7(0x55eaa7dc7dc0) for 192.168.202.1 -> 224.4.11.123
2019-11-11T20:52:40.452574+00:00 leaf02 pimd[1592]: Recv IP packet from 192.168.202.1 to 224.4.11.123 on swp7: size=40 ip_header_size=24 ip_proto=2
2019-11-11T20:52:40.452699+00:00 leaf02 pimd[1592]: Recv IGMP packet from 192.168.202.1 to 224.4.11.123 on swp7: ttl=1 msg_type=17 msg_size=16
2019-11-11T20:52:40.452824+00:00 leaf02 pimd[1592]: Recv IGMP query v3 from 192.168.202.1 on swp7 for group 224.4.11.123
This query is causing us to do some weird gyrations around the IGMP state machine for handling
queries. Let's just prevent it from happening.
Ticket: CM-27247
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When adding an OIF to the OIL, if we are not the DR
there is no need to install it then remove it.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
We have a zlog_warn that is unguarded ( and really is a debug message )
as that there is nothing the end user can do and nothing to note
here other than a debug message to track refcounts. Change
to an appropriate debug and zlog_debug it instead.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When you enter:
ip pim ssm prefix-list my-custom-ssm-range
ip pim ssm prefix-list my-custom-ssm-range
The second instance would cause a failure to happen which
should not happen w/ duplicate config.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>