* [PATCH v3 0/3] md/md-llbitmap: add CleanUnwritten support
From: Yu Kuai @ 2026-03-23 5:46 UTC
To: linux-raid, song; +Cc: linan122, xni
Patch 1 improves compatibility with older mdadm versions by detecting
on-disk bitmap version and falling back to the correct bitmap_ops when
there's a version mismatch.

Patch 2 adds support for proactive XOR parity building in RAID-456 arrays.
This allows users to pre-build parity for unwritten regions via sysfs
before any user data is written, which can improve write performance for
workloads that will eventually use all storage. New states (CleanUnwritten,
NeedSyncUnwritten, SyncingUnwritten) are added to track these regions
separately from normal dirty/syncing states.

Patch 3 optimizes initial array sync for RAID-456 arrays on devices that
support write_zeroes with unmap. By zeroing all disks upfront, parity is
automatically consistent (0 XOR 0 = 0), allowing the bitmap to be
initialized to BitCleanUnwritten and skipping the initial sync entirely.
This significantly reduces array initialization time on modern NVMe SSDs.

Yu Kuai (3):
  md: add fallback to correct bitmap_ops on version mismatch
  md/md-llbitmap: add CleanUnwritten state for RAID-5 proactive parity
    building
  md/md-llbitmap: optimize initial sync with write_zeroes_unmap support

 drivers/md/md-llbitmap.c | 202 ++++++++++++++++++++++++++++++++++++---
 drivers/md/md.c          | 111 ++++++++++++++++++++-
 2 files changed, 299 insertions(+), 14 deletions(-)
--
2.51.0
* [PATCH v3 1/3] md: add fallback to correct bitmap_ops on version mismatch
From: Yu Kuai @ 2026-03-23 5:46 UTC
To: linux-raid, song; +Cc: linan122, xni
If the default bitmap version and the on-disk version don't match, and
mdadm is not new enough to set bitmap_type, set bitmap_ops based on the
on-disk version.
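
For illustration only, the fallback decision can be sketched in plain
userspace C. This is not the kernel code: the sketch_* names, the
retry_backend() helper and the numeric version values are hypothetical
placeholders, and the real BITMAP_MAJOR_* constants may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's md_submodule_id values. */
enum sketch_backend {
	SKETCH_NONE,		/* no usable bitmap backend */
	SKETCH_BITMAP,		/* classic bitmap */
	SKETCH_LLBITMAP,	/* lockless bitmap */
};

/* Placeholder version numbers; the real BITMAP_MAJOR_* values may differ. */
enum {
	SKETCH_MAJOR_LO = 3,
	SKETCH_MAJOR_HI = 4,
	SKETCH_MAJOR_CLUSTERED = 5,
	SKETCH_MAJOR_LOCKLESS = 6,
};

/* Map an on-disk superblock version to the backend that understands it. */
static enum sketch_backend backend_from_version(uint32_t version)
{
	switch (version) {
	case SKETCH_MAJOR_LO:
	case SKETCH_MAJOR_HI:
	case SKETCH_MAJOR_CLUSTERED:
		return SKETCH_BITMAP;
	case SKETCH_MAJOR_LOCKLESS:
		return SKETCH_LLBITMAP;
	default:
		return SKETCH_NONE;
	}
}

/*
 * Called after create() failed with the configured backend. Returns the
 * backend worth retrying, or SKETCH_NONE to keep the original error.
 */
static enum sketch_backend retry_backend(enum sketch_backend configured,
					 uint32_t disk_version)
{
	enum sketch_backend on_disk = backend_from_version(disk_version);

	/* Unknown version, or same backend that already failed: give up. */
	if (on_disk == SKETCH_NONE || on_disk == configured)
		return SKETCH_NONE;
	return on_disk;
}

/* Small self-check showing the intended behaviour. */
static void fallback_self_check(void)
{
	assert(retry_backend(SKETCH_BITMAP, SKETCH_MAJOR_LOCKLESS) ==
	       SKETCH_LLBITMAP);
	assert(retry_backend(SKETCH_BITMAP, SKETCH_MAJOR_HI) == SKETCH_NONE);
}
```

The retry happens only when the superblock names a different, known
backend; in every other case the first create() error is returned as is.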
Signed-off-by: Yu Kuai <yukuai@fnnas.com>
---
drivers/md/md.c | 111 +++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 110 insertions(+), 1 deletion(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 521d9b34cd9e..c9597c6bd341 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -6449,15 +6449,124 @@ static void md_safemode_timeout(struct timer_list *t)
static int start_dirty_degraded;
+/*
+ * Read bitmap superblock and return the bitmap_id based on disk version.
+ * This is used as a fallback when the default bitmap version and the on-disk
+ * version don't match, and mdadm is not new enough to set bitmap_type.
+ */
+static enum md_submodule_id md_bitmap_get_id_from_sb(struct mddev *mddev)
+{
+ struct md_rdev *rdev;
+ struct page *sb_page;
+ bitmap_super_t *sb;
+ enum md_submodule_id id = ID_BITMAP_NONE;
+ sector_t sector;
+ u32 version;
+
+ if (!mddev->bitmap_info.offset)
+ return ID_BITMAP_NONE;
+
+ sb_page = alloc_page(GFP_KERNEL);
+ if (!sb_page) {
+ pr_warn("md: %s: failed to allocate memory for bitmap\n",
+ mdname(mddev));
+ return ID_BITMAP_NONE;
+ }
+
+ sector = mddev->bitmap_info.offset;
+
+ rdev_for_each(rdev, mddev) {
+ u32 iosize;
+
+ if (!test_bit(In_sync, &rdev->flags) ||
+ test_bit(Faulty, &rdev->flags) ||
+ test_bit(Bitmap_sync, &rdev->flags))
+ continue;
+
+ iosize = roundup(sizeof(bitmap_super_t),
+ bdev_logical_block_size(rdev->bdev));
+ if (sync_page_io(rdev, sector, iosize, sb_page, REQ_OP_READ,
+ true))
+ goto read_ok;
+ }
+ pr_warn("md: %s: failed to read bitmap from any device\n",
+ mdname(mddev));
+ goto out;
+
+read_ok:
+ sb = kmap_local_page(sb_page);
+ if (sb->magic != cpu_to_le32(BITMAP_MAGIC)) {
+ pr_warn("md: %s: invalid bitmap magic 0x%x\n",
+ mdname(mddev), le32_to_cpu(sb->magic));
+ goto out_unmap;
+ }
+
+ version = le32_to_cpu(sb->version);
+ switch (version) {
+ case BITMAP_MAJOR_LO:
+ case BITMAP_MAJOR_HI:
+ case BITMAP_MAJOR_CLUSTERED:
+ id = ID_BITMAP;
+ break;
+ case BITMAP_MAJOR_LOCKLESS:
+ id = ID_LLBITMAP;
+ break;
+ default:
+ pr_warn("md: %s: unknown bitmap version %u\n",
+ mdname(mddev), version);
+ break;
+ }
+
+out_unmap:
+ kunmap_local(sb);
+out:
+ __free_page(sb_page);
+ return id;
+}
+
static int md_bitmap_create(struct mddev *mddev)
{
+ enum md_submodule_id orig_id = mddev->bitmap_id;
+ enum md_submodule_id sb_id;
+ int err;
+
if (mddev->bitmap_id == ID_BITMAP_NONE)
return -EINVAL;
if (!mddev_set_bitmap_ops(mddev))
return -ENOENT;
- return mddev->bitmap_ops->create(mddev);
+ err = mddev->bitmap_ops->create(mddev);
+ if (!err)
+ return 0;
+
+ /*
+ * Create failed. If the default bitmap version and the on-disk
+ * version don't match, and mdadm is not new enough to set
+ * bitmap_type, set bitmap_ops based on the on-disk version.
+ */
+ mddev_clear_bitmap_ops(mddev);
+
+ sb_id = md_bitmap_get_id_from_sb(mddev);
+ if (sb_id == ID_BITMAP_NONE || sb_id == orig_id)
+ return err;
+
+ pr_info("md: %s: bitmap version mismatch, switching from %d to %d\n",
+ mdname(mddev), orig_id, sb_id);
+
+ mddev->bitmap_id = sb_id;
+ if (!mddev_set_bitmap_ops(mddev)) {
+ mddev->bitmap_id = orig_id;
+ return -ENOENT;
+ }
+
+ err = mddev->bitmap_ops->create(mddev);
+ if (err) {
+ mddev_clear_bitmap_ops(mddev);
+ mddev->bitmap_id = orig_id;
+ }
+
+ return err;
}
static void md_bitmap_destroy(struct mddev *mddev)
--
2.51.0
* [PATCH v3 2/3] md/md-llbitmap: add CleanUnwritten state for RAID-5 proactive parity building
From: Yu Kuai @ 2026-03-23 5:46 UTC
To: linux-raid, song; +Cc: linan122, xni
Add new states to the llbitmap state machine to support proactive XOR
parity building for RAID-5 arrays. This allows users to pre-build parity
data for unwritten regions before any user data is written.

New states added:
- BitNeedSyncUnwritten: Transitional state when proactive sync is triggered
  via sysfs on Unwritten regions.
- BitSyncingUnwritten: Proactive sync in progress for unwritten region.
- BitCleanUnwritten: XOR parity has been pre-built, but no user data
  written yet. When user writes to this region, it transitions to BitDirty.

New actions added:
- BitmapActionProactiveSync: Trigger for proactive XOR parity building.
- BitmapActionClearUnwritten: Convert CleanUnwritten/NeedSyncUnwritten/
  SyncingUnwritten states back to Unwritten before recovery starts.

State flows:
- Current (lazy): Unwritten -> (write) -> NeedSync -> (sync) -> Dirty -> Clean
- New (proactive): Unwritten -> (sysfs) -> NeedSyncUnwritten -> (sync) -> CleanUnwritten
- On write to CleanUnwritten: CleanUnwritten -> (write) -> Dirty -> Clean
- On disk replacement: CleanUnwritten regions are converted to Unwritten
  before recovery starts, so recovery only rebuilds regions with user data

A new sysfs interface is added at /sys/block/mdX/md/llbitmap/proactive_sync
(write-only) to trigger proactive sync. This only works for RAID-456 arrays.
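
The proactive flow above can be walked through with a small userspace
sketch of a table-driven state machine. It is an illustration, not the
kernel table: the S_* and A_* names and the step() helper are
hypothetical, only transitions that appear in this patch are filled in,
and the sketch assumes that a "none" entry leaves the state unchanged.

```c
#include <assert.h>

/* S_NONE is 0 so that unlisted table entries mean "no transition". */
enum sketch_state {
	S_NONE = 0, S_UNWRITTEN, S_CLEAN, S_DIRTY, S_NEED_SYNC, S_SYNCING,
	S_NEED_SYNC_UNWRITTEN, S_SYNCING_UNWRITTEN, S_CLEAN_UNWRITTEN,
	S_COUNT,
};

enum sketch_action {
	A_STARTWRITE, A_STARTSYNC, A_ENDSYNC,
	A_PROACTIVE_SYNC, A_CLEAR_UNWRITTEN,
	A_COUNT,
};

/* Subset of the transitions listed in the patch. */
static const unsigned char sketch_table[S_COUNT][A_COUNT] = {
	[S_UNWRITTEN] = {
		[A_PROACTIVE_SYNC]  = S_NEED_SYNC_UNWRITTEN,
	},
	[S_CLEAN] = {
		[A_STARTWRITE]      = S_DIRTY,
	},
	[S_NEED_SYNC_UNWRITTEN] = {
		[A_STARTWRITE]      = S_NEED_SYNC,
		[A_STARTSYNC]       = S_SYNCING_UNWRITTEN,
		[A_CLEAR_UNWRITTEN] = S_UNWRITTEN,
	},
	[S_SYNCING_UNWRITTEN] = {
		[A_STARTWRITE]      = S_SYNCING,
		[A_STARTSYNC]       = S_SYNCING_UNWRITTEN,
		[A_ENDSYNC]         = S_CLEAN_UNWRITTEN,
		[A_CLEAR_UNWRITTEN] = S_UNWRITTEN,
	},
	[S_CLEAN_UNWRITTEN] = {
		[A_STARTWRITE]      = S_DIRTY,
		[A_CLEAR_UNWRITTEN] = S_UNWRITTEN,
	},
};

/* Apply one action; an S_NONE entry keeps the current state. */
static enum sketch_state step(enum sketch_state cur, enum sketch_action act)
{
	enum sketch_state next = (enum sketch_state)sketch_table[cur][act];

	return next == S_NONE ? cur : next;
}

/* Walk the proactive flow, then a first user write to the region. */
static void proactive_flow_check(void)
{
	enum sketch_state s = S_UNWRITTEN;

	s = step(s, A_PROACTIVE_SYNC);	/* sysfs trigger */
	assert(s == S_NEED_SYNC_UNWRITTEN);
	s = step(s, A_STARTSYNC);
	assert(s == S_SYNCING_UNWRITTEN);
	s = step(s, A_ENDSYNC);
	assert(s == S_CLEAN_UNWRITTEN);
	s = step(s, A_STARTWRITE);	/* first user write */
	assert(s == S_DIRTY);
}
```

Making the "none" value zero keeps the sketch short, because designated
initializers zero-fill every entry that is not listed. The kernel table
instead spells out every entry explicitly.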
Signed-off-by: Yu Kuai <yukuai@fnnas.com>
---
drivers/md/md-llbitmap.c | 140 +++++++++++++++++++++++++++++++++++----
1 file changed, 128 insertions(+), 12 deletions(-)
diff --git a/drivers/md/md-llbitmap.c b/drivers/md/md-llbitmap.c
index cdfecaca216b..f10374242c9a 100644
--- a/drivers/md/md-llbitmap.c
+++ b/drivers/md/md-llbitmap.c
@@ -208,6 +208,20 @@ enum llbitmap_state {
BitNeedSync,
/* data is synchronizing */
BitSyncing,
+ /*
+ * Proactive sync requested for unwritten region (raid456 only).
+ * Triggered via sysfs when user wants to pre-build XOR parity
+ * for regions that have never been written.
+ */
+ BitNeedSyncUnwritten,
+ /* Proactive sync in progress for unwritten region */
+ BitSyncingUnwritten,
+ /*
+ * XOR parity has been pre-built for a region that has never had
+ * user data written. When user writes to this region, it transitions
+ * to BitDirty.
+ */
+ BitCleanUnwritten,
BitStateCount,
BitNone = 0xff,
};
@@ -232,6 +246,12 @@ enum llbitmap_action {
* BitNeedSync.
*/
BitmapActionStale,
+ /*
+ * Proactive sync trigger for raid456 - builds XOR parity for
+ * Unwritten regions without requiring user data write first.
+ */
+ BitmapActionProactiveSync,
+ BitmapActionClearUnwritten,
BitmapActionCount,
/* Init state is BitUnwritten */
BitmapActionInit,
@@ -304,6 +324,8 @@ static char state_machine[BitStateCount][BitmapActionCount] = {
[BitmapActionDaemon] = BitNone,
[BitmapActionDiscard] = BitNone,
[BitmapActionStale] = BitNone,
+ [BitmapActionProactiveSync] = BitNeedSyncUnwritten,
+ [BitmapActionClearUnwritten] = BitNone,
},
[BitClean] = {
[BitmapActionStartwrite] = BitDirty,
@@ -314,6 +336,8 @@ static char state_machine[BitStateCount][BitmapActionCount] = {
[BitmapActionDaemon] = BitNone,
[BitmapActionDiscard] = BitUnwritten,
[BitmapActionStale] = BitNeedSync,
+ [BitmapActionProactiveSync] = BitNone,
+ [BitmapActionClearUnwritten] = BitNone,
},
[BitDirty] = {
[BitmapActionStartwrite] = BitNone,
@@ -324,6 +348,8 @@ static char state_machine[BitStateCount][BitmapActionCount] = {
[BitmapActionDaemon] = BitClean,
[BitmapActionDiscard] = BitUnwritten,
[BitmapActionStale] = BitNeedSync,
+ [BitmapActionProactiveSync] = BitNone,
+ [BitmapActionClearUnwritten] = BitNone,
},
[BitNeedSync] = {
[BitmapActionStartwrite] = BitNone,
@@ -334,6 +360,8 @@ static char state_machine[BitStateCount][BitmapActionCount] = {
[BitmapActionDaemon] = BitNone,
[BitmapActionDiscard] = BitUnwritten,
[BitmapActionStale] = BitNone,
+ [BitmapActionProactiveSync] = BitNone,
+ [BitmapActionClearUnwritten] = BitNone,
},
[BitSyncing] = {
[BitmapActionStartwrite] = BitNone,
@@ -344,6 +372,44 @@ static char state_machine[BitStateCount][BitmapActionCount] = {
[BitmapActionDaemon] = BitNone,
[BitmapActionDiscard] = BitUnwritten,
[BitmapActionStale] = BitNeedSync,
+ [BitmapActionProactiveSync] = BitNone,
+ [BitmapActionClearUnwritten] = BitNone,
+ },
+ [BitNeedSyncUnwritten] = {
+ [BitmapActionStartwrite] = BitNeedSync,
+ [BitmapActionStartsync] = BitSyncingUnwritten,
+ [BitmapActionEndsync] = BitNone,
+ [BitmapActionAbortsync] = BitUnwritten,
+ [BitmapActionReload] = BitUnwritten,
+ [BitmapActionDaemon] = BitNone,
+ [BitmapActionDiscard] = BitUnwritten,
+ [BitmapActionStale] = BitUnwritten,
+ [BitmapActionProactiveSync] = BitNone,
+ [BitmapActionClearUnwritten] = BitUnwritten,
+ },
+ [BitSyncingUnwritten] = {
+ [BitmapActionStartwrite] = BitSyncing,
+ [BitmapActionStartsync] = BitSyncingUnwritten,
+ [BitmapActionEndsync] = BitCleanUnwritten,
+ [BitmapActionAbortsync] = BitUnwritten,
+ [BitmapActionReload] = BitUnwritten,
+ [BitmapActionDaemon] = BitNone,
+ [BitmapActionDiscard] = BitUnwritten,
+ [BitmapActionStale] = BitUnwritten,
+ [BitmapActionProactiveSync] = BitNone,
+ [BitmapActionClearUnwritten] = BitUnwritten,
+ },
+ [BitCleanUnwritten] = {
+ [BitmapActionStartwrite] = BitDirty,
+ [BitmapActionStartsync] = BitNone,
+ [BitmapActionEndsync] = BitNone,
+ [BitmapActionAbortsync] = BitNone,
+ [BitmapActionReload] = BitNone,
+ [BitmapActionDaemon] = BitNone,
+ [BitmapActionDiscard] = BitUnwritten,
+ [BitmapActionStale] = BitUnwritten,
+ [BitmapActionProactiveSync] = BitNone,
+ [BitmapActionClearUnwritten] = BitUnwritten,
},
};
@@ -376,6 +442,7 @@ static void llbitmap_infect_dirty_bits(struct llbitmap *llbitmap,
pctl->state[pos] = level_456 ? BitNeedSync : BitDirty;
break;
case BitClean:
+ case BitCleanUnwritten:
pctl->state[pos] = BitDirty;
break;
}
@@ -383,7 +450,7 @@ static void llbitmap_infect_dirty_bits(struct llbitmap *llbitmap,
}
static void llbitmap_set_page_dirty(struct llbitmap *llbitmap, int idx,
- int offset)
+ int offset, bool infect)
{
struct llbitmap_page_ctl *pctl = llbitmap->pctl[idx];
unsigned int io_size = llbitmap->io_size;
@@ -398,7 +465,7 @@ static void llbitmap_set_page_dirty(struct llbitmap *llbitmap, int idx,
* resync all the dirty bits, hence skip infect new dirty bits to
* prevent resync unnecessary data.
*/
- if (llbitmap->mddev->degraded) {
+ if (llbitmap->mddev->degraded || !infect) {
set_bit(block, pctl->dirty);
return;
}
@@ -438,7 +505,9 @@ static void llbitmap_write(struct llbitmap *llbitmap, enum llbitmap_state state,
llbitmap->pctl[idx]->state[bit] = state;
if (state == BitDirty || state == BitNeedSync)
- llbitmap_set_page_dirty(llbitmap, idx, bit);
+ llbitmap_set_page_dirty(llbitmap, idx, bit, true);
+ else if (state == BitNeedSyncUnwritten)
+ llbitmap_set_page_dirty(llbitmap, idx, bit, false);
}
static struct page *llbitmap_read_page(struct llbitmap *llbitmap, int idx)
@@ -627,11 +696,10 @@ static enum llbitmap_state llbitmap_state_machine(struct llbitmap *llbitmap,
goto write_bitmap;
}
- if (c == BitNeedSync)
+ if (c == BitNeedSync || c == BitNeedSyncUnwritten)
need_resync = !mddev->degraded;
state = state_machine[c][action];
-
write_bitmap:
if (unlikely(mddev->degraded)) {
/* For degraded array, mark new data as need sync. */
@@ -658,8 +726,7 @@ static enum llbitmap_state llbitmap_state_machine(struct llbitmap *llbitmap,
}
llbitmap_write(llbitmap, state, start);
-
- if (state == BitNeedSync)
+ if (state == BitNeedSync || state == BitNeedSyncUnwritten)
need_resync = !mddev->degraded;
else if (state == BitDirty &&
!timer_pending(&llbitmap->pending_timer))
@@ -1229,7 +1296,7 @@ static bool llbitmap_blocks_synced(struct mddev *mddev, sector_t offset)
unsigned long p = offset >> llbitmap->chunkshift;
enum llbitmap_state c = llbitmap_read(llbitmap, p);
- return c == BitClean || c == BitDirty;
+ return c == BitClean || c == BitDirty || c == BitCleanUnwritten;
}
static sector_t llbitmap_skip_sync_blocks(struct mddev *mddev, sector_t offset)
@@ -1243,6 +1310,10 @@ static sector_t llbitmap_skip_sync_blocks(struct mddev *mddev, sector_t offset)
if (c == BitUnwritten)
return blocks;
+ /* Skip CleanUnwritten - no user data, will be reset after recovery */
+ if (c == BitCleanUnwritten)
+ return blocks;
+
/* For degraded array, don't skip */
if (mddev->degraded)
return 0;
@@ -1261,14 +1332,25 @@ static bool llbitmap_start_sync(struct mddev *mddev, sector_t offset,
{
struct llbitmap *llbitmap = mddev->bitmap;
unsigned long p = offset >> llbitmap->chunkshift;
+ enum llbitmap_state state;
+
+ /*
+ * Before recovery starts, convert CleanUnwritten to Unwritten.
+ * This ensures the new disk won't have stale parity data.
+ */
+ if (offset == 0 && test_bit(MD_RECOVERY_RECOVER, &mddev->recovery) &&
+ !test_bit(MD_RECOVERY_LAZY_RECOVER, &mddev->recovery))
+ llbitmap_state_machine(llbitmap, 0, llbitmap->chunks - 1,
+ BitmapActionClearUnwritten);
+
/*
* Handle one bit at a time, this is much simpler. And it doesn't matter
* if md_do_sync() loop more times.
*/
*blocks = llbitmap->chunksize - (offset & (llbitmap->chunksize - 1));
- return llbitmap_state_machine(llbitmap, p, p,
- BitmapActionStartsync) == BitSyncing;
+ state = llbitmap_state_machine(llbitmap, p, p, BitmapActionStartsync);
+ return state == BitSyncing || state == BitSyncingUnwritten;
}
/* Something is wrong, sync_thread stop at @offset */
@@ -1474,9 +1556,15 @@ static ssize_t bits_show(struct mddev *mddev, char *page)
}
mutex_unlock(&mddev->bitmap_info.mutex);
- return sprintf(page, "unwritten %d\nclean %d\ndirty %d\nneed sync %d\nsyncing %d\n",
+ return sprintf(page,
+ "unwritten %d\nclean %d\ndirty %d\n"
+ "need sync %d\nsyncing %d\n"
+ "need sync unwritten %d\nsyncing unwritten %d\n"
+ "clean unwritten %d\n",
bits[BitUnwritten], bits[BitClean], bits[BitDirty],
- bits[BitNeedSync], bits[BitSyncing]);
+ bits[BitNeedSync], bits[BitSyncing],
+ bits[BitNeedSyncUnwritten], bits[BitSyncingUnwritten],
+ bits[BitCleanUnwritten]);
}
static struct md_sysfs_entry llbitmap_bits = __ATTR_RO(bits);
@@ -1549,11 +1637,39 @@ barrier_idle_store(struct mddev *mddev, const char *buf, size_t len)
static struct md_sysfs_entry llbitmap_barrier_idle = __ATTR_RW(barrier_idle);
+static ssize_t
+proactive_sync_store(struct mddev *mddev, const char *buf, size_t len)
+{
+ struct llbitmap *llbitmap;
+
+ /* Only for RAID-456 */
+ if (!raid_is_456(mddev))
+ return -EINVAL;
+
+ mutex_lock(&mddev->bitmap_info.mutex);
+ llbitmap = mddev->bitmap;
+ if (!llbitmap || !llbitmap->pctl) {
+ mutex_unlock(&mddev->bitmap_info.mutex);
+ return -ENODEV;
+ }
+
+ /* Trigger proactive sync on all Unwritten regions */
+ llbitmap_state_machine(llbitmap, 0, llbitmap->chunks - 1,
+ BitmapActionProactiveSync);
+
+ mutex_unlock(&mddev->bitmap_info.mutex);
+ return len;
+}
+
+static struct md_sysfs_entry llbitmap_proactive_sync =
+ __ATTR(proactive_sync, 0200, NULL, proactive_sync_store);
+
static struct attribute *md_llbitmap_attrs[] = {
&llbitmap_bits.attr,
&llbitmap_metadata.attr,
&llbitmap_daemon_sleep.attr,
&llbitmap_barrier_idle.attr,
+ &llbitmap_proactive_sync.attr,
NULL
};
--
2.51.0
* [PATCH v3 3/3] md/md-llbitmap: optimize initial sync with write_zeroes_unmap support
From: Yu Kuai @ 2026-03-23 5:46 UTC
To: linux-raid, song; +Cc: linan122, xni
For RAID-456 arrays with llbitmap, if all underlying disks support
write_zeroes with unmap, issue write_zeroes to zero all disk data
regions and initialize the bitmap to BitCleanUnwritten instead of
BitUnwritten.

This optimization skips the initial XOR parity building because:
1. write_zeroes with unmap guarantees zeroed reads after the operation
2. For RAID-456, when all data is zero, parity is automatically
   consistent (0 XOR 0 XOR ... = 0)
3. BitCleanUnwritten indicates parity is valid but no user data
   has been written
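
Point 2 can be checked with a few lines of userspace C. The sketch below
covers only the XOR (P) parity; SKETCH_CHUNK, xor_parity() and
stripe_consistent() are hypothetical names for this illustration and are
not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_CHUNK 16	/* bytes per chunk, tiny on purpose */

/*
 * Compute XOR parity over ndata chunks stored back to back in data[].
 */
static void xor_parity(const uint8_t *data, size_t ndata,
		       uint8_t parity[SKETCH_CHUNK])
{
	size_t i, j;

	memset(parity, 0, SKETCH_CHUNK);
	for (i = 0; i < ndata; i++)
		for (j = 0; j < SKETCH_CHUNK; j++)
			parity[j] ^= data[i * SKETCH_CHUNK + j];
}

/* Return 1 if the stored parity matches the parity of the data chunks. */
static int stripe_consistent(const uint8_t *data, size_t ndata,
			     const uint8_t stored[SKETCH_CHUNK])
{
	uint8_t expect[SKETCH_CHUNK];

	xor_parity(data, ndata, expect);
	return memcmp(expect, stored, SKETCH_CHUNK) == 0;
}

/* A fully zeroed stripe is consistent without any parity computation. */
static void zero_stripe_check(void)
{
	uint8_t data[3 * SKETCH_CHUNK] = { 0 };
	uint8_t parity[SKETCH_CHUNK] = { 0 };

	assert(stripe_consistent(data, 3, parity));

	/* Any non-zero data byte breaks consistency with zero parity. */
	data[5] = 0xff;
	assert(!stripe_consistent(data, 3, parity));
}
```

Once every member disk reads back zeroes, the stored parity chunk (also
zero) already equals the XOR of the data chunks, so there is nothing left
for an initial sync to compute.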

The implementation adds two helper functions:
- llbitmap_all_disks_support_wzeroes_unmap(): Checks if all active
  disks support write_zeroes with unmap
- llbitmap_zero_all_disks(): Issues blkdev_issue_zeroout() to each
  rdev's data region to zero all disks

The zeroing and bitmap state setting happens in llbitmap_init_state()
during bitmap initialization. If any disk fails to zero, we fall back
to BitUnwritten and normal lazy recovery.
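
The choice of initial state, including the fallback, can be modelled in
userspace as follows. This is a simplified sketch under stated
assumptions: struct sketch_disk and its fields are hypothetical stand-ins
for the per-rdev checks, and the skipping of faulty or inactive disks is
left out.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_init { INIT_UNWRITTEN, INIT_CLEAN, INIT_CLEAN_UNWRITTEN };

/* Hypothetical model of one member disk. */
struct sketch_disk {
	int supports_wzeroes_unmap;	/* non-zero if supported */
	int zeroout_ret;		/* 0 on success, negative on failure */
};

/* Return 1 only if every disk supports write_zeroes with unmap. */
static int all_support(const struct sketch_disk *d, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (!d[i].supports_wzeroes_unmap)
			return 0;
	return 1;
}

/* "Zero" every disk; stop at the first failure. */
static int zero_all(const struct sketch_disk *d, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (d[i].zeroout_ret)
			return 0;
	return 1;
}

/* Pick the state every bitmap chunk is initialized to. */
static enum sketch_init choose_init_state(int bitmap_clean, int is_raid456,
					  const struct sketch_disk *d,
					  size_t n)
{
	if (bitmap_clean)
		return INIT_CLEAN;
	if (is_raid456 && all_support(d, n) && zero_all(d, n))
		return INIT_CLEAN_UNWRITTEN;
	/* Unsupported or failed zeroing: normal lazy recovery. */
	return INIT_UNWRITTEN;
}

/* One disk failing to zero forces the fallback. */
static void init_state_self_check(void)
{
	struct sketch_disk ok[2] = { { 1, 0 }, { 1, 0 } };
	struct sketch_disk bad[2] = { { 1, 0 }, { 1, -5 } };

	assert(choose_init_state(0, 1, ok, 2) == INIT_CLEAN_UNWRITTEN);
	assert(choose_init_state(0, 1, bad, 2) == INIT_UNWRITTEN);
}
```

Falling back to the unwritten state is the safe default here: a partly
zeroed array cannot be assumed to have consistent parity, so parity is
built later by the normal lazy path instead.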

This significantly reduces array initialization time for RAID-456
arrays built on modern NVMe SSDs or other devices that support
write_zeroes with unmap.

Signed-off-by: Yu Kuai <yukuai@fnnas.com>
Reviewed-by: Xiao Ni <xni@redhat.com>
---
drivers/md/md-llbitmap.c | 62 +++++++++++++++++++++++++++++++++++++++-
1 file changed, 61 insertions(+), 1 deletion(-)
diff --git a/drivers/md/md-llbitmap.c b/drivers/md/md-llbitmap.c
index f10374242c9a..9e7e6b1a6f15 100644
--- a/drivers/md/md-llbitmap.c
+++ b/drivers/md/md-llbitmap.c
@@ -654,13 +654,73 @@ static int llbitmap_cache_pages(struct llbitmap *llbitmap)
return 0;
}
+/*
+ * Check if all underlying disks support write_zeroes with unmap.
+ */
+static bool llbitmap_all_disks_support_wzeroes_unmap(struct llbitmap *llbitmap)
+{
+ struct mddev *mddev = llbitmap->mddev;
+ struct md_rdev *rdev;
+
+ rdev_for_each(rdev, mddev) {
+ if (rdev->raid_disk < 0 || test_bit(Faulty, &rdev->flags))
+ continue;
+
+ if (bdev_write_zeroes_unmap_sectors(rdev->bdev) == 0)
+ return false;
+ }
+
+ return true;
+}
+
+/*
+ * Issue write_zeroes to all underlying disks to zero their data regions.
+ * This ensures parity consistency for RAID-456 (0 XOR 0 = 0).
+ * Returns true if all disks were successfully zeroed.
+ */
+static bool llbitmap_zero_all_disks(struct llbitmap *llbitmap)
+{
+ struct mddev *mddev = llbitmap->mddev;
+ struct md_rdev *rdev;
+ sector_t dev_sectors = mddev->dev_sectors;
+ int ret;
+
+ rdev_for_each(rdev, mddev) {
+ if (rdev->raid_disk < 0 || test_bit(Faulty, &rdev->flags))
+ continue;
+
+ ret = blkdev_issue_zeroout(rdev->bdev,
+ rdev->data_offset,
+ dev_sectors,
+ GFP_KERNEL, 0);
+ if (ret) {
+ pr_warn("md/llbitmap: failed to zero disk %pg: %d\n",
+ rdev->bdev, ret);
+ return false;
+ }
+ }
+
+ return true;
+}
+
static void llbitmap_init_state(struct llbitmap *llbitmap)
{
+ struct mddev *mddev = llbitmap->mddev;
enum llbitmap_state state = BitUnwritten;
unsigned long i;
- if (test_and_clear_bit(BITMAP_CLEAN, &llbitmap->flags))
+ if (test_and_clear_bit(BITMAP_CLEAN, &llbitmap->flags)) {
state = BitClean;
+ } else if (raid_is_456(mddev) &&
+ llbitmap_all_disks_support_wzeroes_unmap(llbitmap)) {
+ /*
+ * All disks support write_zeroes with unmap. Zero all disks
+ * to ensure parity consistency, then set BitCleanUnwritten
+ * to skip initial sync.
+ */
+ if (llbitmap_zero_all_disks(llbitmap))
+ state = BitCleanUnwritten;
+ }
for (i = 0; i < llbitmap->chunks; i++)
llbitmap_write(llbitmap, state, i);
--
2.51.0
* Re: [PATCH v3 0/3] md/md-llbitmap: add CleanUnwritten support
From: Yu Kuai @ 2026-04-07 5:29 UTC
To: linux-raid, song; +Cc: linan122, xni, yukuai
On 2026/3/23 13:46, Yu Kuai wrote:
> Patch 1 improves compatibility with older mdadm versions by detecting
> on-disk bitmap version and falling back to the correct bitmap_ops when
> there's a version mismatch.
>
> Patch 2 adds support for proactive XOR parity building in RAID-456 arrays.
> This allows users to pre-build parity for unwritten regions via sysfs
> before any user data is written, which can improve write performance for
> workloads that will eventually use all storage. New states (CleanUnwritten,
> NeedSyncUnwritten, SyncingUnwritten) are added to track these regions
> separately from normal dirty/syncing states.
>
> Patch 3 optimizes initial array sync for RAID-456 arrays on devices that
> support write_zeroes with unmap. By zeroing all disks upfront, parity is
> automatically consistent (0 XOR 0 = 0), allowing the bitmap to be
> initialized to BitCleanUnwritten and skipping the initial sync entirely.
> This significantly reduces array initialization time on modern NVMe SSDs.
>
> Yu Kuai (3):
> md: add fallback to correct bitmap_ops on version mismatch
> md/md-llbitmap: add CleanUnwritten state for RAID-5 proactive parity
> building
> md/md-llbitmap: optimize initial sync with write_zeroes_unmap support
>
> drivers/md/md-llbitmap.c | 202 ++++++++++++++++++++++++++++++++++++---
> drivers/md/md.c | 111 ++++++++++++++++++++-
> 2 files changed, 299 insertions(+), 14 deletions(-)
Applied to md-7.1
--
Thanks,
Kuai