Block remap for cloned blocks on device removal

When remapping block pointers after device removal, skip blocks that
might be cloned.  BRTs are indexed by the vdev id and offset from the
block pointer's DVA[0], so if we start addressing the same block by a
different DVA, we won't find its reference counter.  As a result, we
might remap the block twice, which may trigger an assertion during
indirect mapping condense; free it prematurely, which may let its data
be overwritten; or free it twice, which may trigger an assertion in
the space map code.
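
To make the keying problem above concrete, here is a minimal sketch of a
reference table keyed by (vdev id, offset).  It is not the real BRT
code: the toy_brt_* names, the fixed-size array, and the linear search
are all hypothetical and only illustrate why a lookup through a
different DVA misses the counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a BRT entry; the key is (vdev, offset). */
typedef struct toy_brt_entry {
	uint64_t vdev;
	uint64_t offset;
	uint64_t refcount;
} toy_brt_entry_t;

#define	TOY_BRT_MAX	8	/* arbitrary capacity for the sketch */

typedef struct toy_brt {
	toy_brt_entry_t entries[TOY_BRT_MAX];
	size_t count;
} toy_brt_t;

/* Return the entry for (vdev, offset), or NULL if it is not tracked. */
static toy_brt_entry_t *
toy_brt_find(toy_brt_t *brt, uint64_t vdev, uint64_t offset)
{
	for (size_t i = 0; i < brt->count; i++) {
		toy_brt_entry_t *e = &brt->entries[i];
		if (e->vdev == vdev && e->offset == offset)
			return (e);
	}
	return (NULL);
}

/* Record one more clone reference to the block at (vdev, offset). */
static void
toy_brt_addref(toy_brt_t *brt, uint64_t vdev, uint64_t offset)
{
	toy_brt_entry_t *e = toy_brt_find(brt, vdev, offset);

	if (e == NULL) {
		assert(brt->count < TOY_BRT_MAX);
		e = &brt->entries[brt->count++];
		e->vdev = vdev;
		e->offset = offset;
		e->refcount = 0;
	}
	e->refcount++;
}

/* Return the tracked reference count, or 0 if the key is unknown. */
static uint64_t
toy_brt_refcount(toy_brt_t *brt, uint64_t vdev, uint64_t offset)
{
	toy_brt_entry_t *e = toy_brt_find(brt, vdev, offset);

	return (e != NULL ? e->refcount : 0);
}
```

In this sketch, if a block cloned twice at (vdev 1, offset 0x4000) were
later addressed as (vdev 2, offset 0x9000), toy_brt_refcount() would
return 0 for the new key and the block would look unshared.  That is
the mismatch the commit avoids by not remapping possibly cloned blocks.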

Reviewed-by: Ameer Hamza <ahamza@ixsystems.com>
Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by:  Alexander Motin <mav@FreeBSD.org>
Sponsored by:   iXsystems, Inc.
Closes #15604
Closes #17180
Alexander Motin 2025-03-26 19:45:34 -04:00 committed by GitHub
parent 50d87fed6a
commit 4abc21b28c


@@ -28,6 +28,7 @@
*/
#include <sys/zfs_context.h>
#include <sys/brt.h>
#include <sys/dmu.h>
#include <sys/dmu_tx.h>
#include <sys/space_map.h>
@@ -5536,6 +5537,13 @@ spa_remap_blkptr(spa_t *spa, blkptr_t *bp, spa_remap_cb_t callback, void *arg)
	if (BP_GET_NDVAS(bp) < 1)
		return (B_FALSE);
	/*
	 * Cloned blocks can not be remapped since BRT depends on specific
	 * vdev id and offset in the DVA[0] for its reference counting.
	 */
	if (!BP_IS_METADATA(bp) && brt_maybe_exists(spa, bp))
		return (B_FALSE);
	/*
	 * Note: we only remap dva[0]. If we remapped other dvas, we
	 * would no longer know what their phys birth txg is.