zebra: fix crash in evpn neigh cleanup all

A zebra crash is seen during shutdown (frr restart).

During shutdown, remote neigh and remote MAC cleanup is
triggered first, followed by the per-VNI cleanup of all
neighs (including local) and MACs.

The crash occurs when a remote MAC is cleaned up first
while a local neigh still holds a reference to it.
When the local neigh later attempts to remove itself from its
associated MAC's neigh_list, it accesses freed memory and crashes.

The fix: during MAC deletion, if the MAC's neigh_list is non-empty,
retain the MAC in AUTO state instead of freeing it.
This situation can arise when a MAC and its neigh are in different
states (remote/local). Otherwise, the order of cleanup operations
is neighs followed by MACs.

The AUTO MAC is cleaned up later, when all per-VNI neighs and MACs
are cleaned up.
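
The ordering problem and the retain-as-AUTO approach described above can be
modeled outside zebra with a minimal sketch. Everything in it (`toy_mac`,
`toy_neigh`, the `toy_*` helpers, the flag values, and the plain counter that
stands in for `neigh_list`) is hypothetical and only illustrates the idea; it
is not the FRR code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical flag values, not the real ZEBRA_MAC_* definitions. */
#define TOY_MAC_REMOTE 0x1u
#define TOY_MAC_AUTO   0x2u

/* Simplified stand-in for a MAC entry. */
struct toy_mac {
	unsigned int flags;
	size_t neigh_count; /* stands in for listcount(mac->neigh_list) */
};

/* Simplified stand-in for a neigh entry. */
struct toy_neigh {
	struct toy_mac *mac; /* back-pointer that must never dangle */
};

/* Link a neigh to its MAC. */
static void toy_neigh_attach(struct toy_neigh *n, struct toy_mac *mac)
{
	n->mac = mac;
	mac->neigh_count++;
}

/* Delete a MAC. Returns true if it was freed, false if it was
 * retained as AUTO because neighs still reference it.
 */
static bool toy_mac_del(struct toy_mac *mac)
{
	if (mac->neigh_count > 0) {
		mac->flags |= TOY_MAC_AUTO;
		return false;
	}
	free(mac);
	return true;
}

/* Delete a neigh: it removes itself from its MAC's list. This is
 * only safe while the MAC is still allocated.
 */
static void toy_neigh_del(struct toy_neigh *n)
{
	n->mac->neigh_count--;
	n->mac = NULL;
}

/* Replay the shutdown order: remote MAC cleanup first, then the
 * per-VNI cleanup of neighs followed by MACs. Returns 0 on success.
 */
static int toy_shutdown_order(void)
{
	struct toy_mac *mac = calloc(1, sizeof(*mac));
	struct toy_neigh neigh = { NULL };

	if (!mac)
		return -1;
	mac->flags = TOY_MAC_REMOTE;
	toy_neigh_attach(&neigh, mac);

	/* Step 1: remote cleanup must not free a MAC that a neigh uses. */
	if (toy_mac_del(mac) || !(mac->flags & TOY_MAC_AUTO))
		return -1;

	/* Step 2: per-VNI cleanup, neighs first and then MACs. */
	toy_neigh_del(&neigh);
	if (!toy_mac_del(mac))
		return -1;

	return 0;
}
```

Calling `toy_shutdown_order()` walks both cleanup passes. The first delete
leaves the MAC allocated, so the neigh's later self-removal touches valid
memory, and the second delete frees the MAC once nothing references it.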

Ticket:CM-29826
Reviewed By:CCR-10369
Testing Done:

Configure an EVPN symmetric config where the
MAC is in remote state and the neigh is in local state.
Perform an frr restart and verify that the crash is not seen.

Signed-off-by: Chirag Shah <chirag@nvidia.com>
Chirag Shah 2020-06-04 09:41:31 -07:00 committed by Chirag Shah
parent 4190587a3f
commit d5fdae8f45

@@ -1141,14 +1141,6 @@ int zebra_evpn_mac_del(struct zebra_evpn *zevpn, struct zebra_mac *mac)
 							  sizeof(mac_buf)));
 	}
 
-	/* If the MAC is freed before the neigh we will end up
-	 * with a stale pointer against the neigh
-	 */
-	if (!list_isempty(mac->neigh_list))
-		zlog_warn("%s: MAC %pEA flags 0x%x neigh list not empty %d",
-			  __func__, &mac->macaddr, mac->flags,
-			  listcount(mac->neigh_list));
-
 	/* force de-ref any ES entry linked to the MAC */
 	zebra_evpn_es_mac_deref_entry(mac);
 
@@ -1161,6 +1153,26 @@ int zebra_evpn_mac_del(struct zebra_evpn *zevpn, struct zebra_mac *mac)
 	/* Cancel auto recovery */
 	THREAD_OFF(mac->dad_mac_auto_recovery_timer);
 
+	/* If the MAC is freed before the neigh we will end up
+	 * with a stale pointer against the neigh.
+	 * The situation can arise when a MAC is in remote state
+	 * and its associated neigh is local state.
+	 * zebra_evpn_cfg_cleanup() cleans up remote neighs and MACs.
+	 * Instead of deleting remote MAC, if its neigh list is non-empty
+	 * (associated to local neighs), mark the MAC as AUTO.
+	 */
+	if (!list_isempty(mac->neigh_list)) {
+		if (IS_ZEBRA_DEBUG_VXLAN)
+			zlog_debug(
+				"MAC %pEA (flags 0x%x vni %u) has non-empty neigh list "
+				"count %u, mark MAC as AUTO",
+				&mac->macaddr, mac->flags, zevpn->vni,
+				listcount(mac->neigh_list));
+
+		SET_FLAG(mac->flags, ZEBRA_MAC_AUTO);
+		return 0;
+	}
+
 	list_delete(&mac->neigh_list);
 
 	/* Free the VNI hash entry and allocated memory. */