dm-devel.redhat.com archive mirror
From: Yu Kuai <yukuai1@huaweicloud.com>
To: Li Nan <linan666@huaweicloud.com>,
	Yu Kuai <yukuai1@huaweicloud.com>,
	hch@infradead.org, corbet@lwn.net, agk@redhat.com,
	snitzer@kernel.org, mpatocka@redhat.com, song@kernel.org,
	xni@redhat.com, hare@suse.de, colyli@kernel.org
Cc: linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	dm-devel@lists.linux.dev, linux-raid@vger.kernel.org,
	yi.zhang@huawei.com, yangerkun@huawei.com,
	johnny.chenyi@huawei.com, "yukuai (C)" <yukuai3@huawei.com>
Subject: Re: [PATCH v6 md-6.18 11/11] md/md-llbitmap: introduce new lockless bitmap
Date: Fri, 29 Aug 2025 09:03:30 +0800
Message-ID: <dcec1dd2-903a-3569-30e4-7af916ecba4b@huaweicloud.com>
In-Reply-To: <93e96f14-dfe3-6390-5a91-f28e1cdb1783@huaweicloud.com>

Hi,

On 2025/08/28 19:24, Li Nan wrote:
> 
> 
> On 2025/8/26 16:52, Yu Kuai wrote:
>> From: Yu Kuai <yukuai3@huawei.com>
>>
>> Redundant data is used to enhance data fault tolerance, and the storage
>> method for redundant data varies depending on the RAID level. It is
>> important to maintain the consistency of redundant data.
>>
>> A bitmap is used to record which data blocks have been synchronized and
>> which ones need to be resynchronized or recovered. Each bit in the bitmap
>> represents a segment of data in the array. When a bit is set, it indicates
>> that the multiple redundant copies of that data segment may not be
>> consistent. Data synchronization can be performed based on the bitmap
>> after a power failure or after re-adding a disk. If there is no bitmap, a
>> full disk synchronization is required.
>>
>> Key Features:
>>
>>   - The IO fastpath is lockless: if the user issues many write IOs to the
>>   same bitmap bit in a short time, only the first write has the additional
>>   overhead of updating the bitmap bit; the following writes have none;
>>   - Only written data is resynced or recovered, which means that when
>>   creating a new array or replacing a disk with a new one, there is no
>>   need to do a full disk resync/recovery;
>>
>> Key Concept:
>>
>>   - State Machine:
>>
>> Each bit is one byte and can be in one of 6 different states, see
>> llbitmap_state. There are 8 different actions in total, see
>> llbitmap_action, that can change the state:
>>
>> llbitmap state machine: transitions between states
>>
>> |           | Startwrite | Startsync | Endsync | Abortsync |
>> | --------- | ---------- | --------- | ------- | --------- |
>> | Unwritten | Dirty      | x         | x       | x         |
>> | Clean     | Dirty      | x         | x       | x         |
>> | Dirty     | x          | x         | x       | x         |
>> | NeedSync  | x          | Syncing   | x       | x         |
>> | Syncing   | x          | Syncing   | Dirty   | NeedSync  |
>>
>> |           | Reload   | Daemon | Discard   | Stale     |
>> | --------- | -------- | ------ | --------- | --------- |
>> | Unwritten | x        | x      | x         | x         |
>> | Clean     | x        | x      | Unwritten | NeedSync  |
>> | Dirty     | NeedSync | Clean  | Unwritten | NeedSync  |
>> | NeedSync  | x        | x      | Unwritten | x         |
>> | Syncing   | NeedSync | x      | Unwritten | NeedSync  |
>>
>> Typical scenarios:
>>
>> 1) Create new array
>> By default, all bits are set to Unwritten; if --assume-clean is set,
>> all bits are set to Clean instead.
>>
>> 2) Write data. raid1/raid10 keep a full copy of the data, while raid456
>> does not and relies on xor data
>>
>> 2.1) write new data to raid1/raid10:
>> Unwritten --StartWrite--> Dirty
>>
>> 2.2) write new data to raid456:
>> Unwritten --StartWrite--> NeedSync
>>
>> Because the initial recovery for raid456 is skipped, the xor data is not
>> built yet. The bit must be set to NeedSync first, and after the lazy
>> initial recovery finishes, the bit is finally set to Dirty (see 5.1 and
>> 5.4);
>>
>> 2.3) overwrite existing data
>> Clean --StartWrite--> Dirty
>>
>> 3) daemon, if the array is not degraded:
>> Dirty --Daemon--> Clean
>>
>> For a degraded array, the Dirty bit will never be cleared, which prevents
>> a full disk recovery when re-adding a removed disk.
>>
>> 4) discard
>> {Clean, Dirty, NeedSync, Syncing} --Discard--> Unwritten
>>
>> 5) resync and recover
>>
>> 5.1) common process
>> NeedSync --Startsync--> Syncing --Endsync--> Dirty --Daemon--> Clean
> 
> There are some issues with the Dirty state:
> 1. A Dirty bit will not be synced when a disk is re-added.
> 2. It remains Dirty even after a full recovery -- it should be Clean.

We're setting new bits to Dirty for a degraded array, and there is no
further action that changes the state to NeedSync before recovery onto
the new disk.
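
As an illustration of that gap, here is a small Python model of the
transition table quoted above. It is only a sketch, not the kernel code:
the names TRANSITIONS, apply and run are made up for this example, and an
"x" entry is assumed to leave the state unchanged.

```python
# Transition table transcribed from the quoted commit message.
# An action missing from a state's dict corresponds to "x" in the table.
TRANSITIONS = {
    "Unwritten": {"Startwrite": "Dirty"},
    "Clean": {"Startwrite": "Dirty", "Discard": "Unwritten",
              "Stale": "NeedSync"},
    "Dirty": {"Reload": "NeedSync", "Daemon": "Clean",
              "Discard": "Unwritten", "Stale": "NeedSync"},
    "NeedSync": {"Startsync": "Syncing", "Discard": "Unwritten"},
    "Syncing": {"Startsync": "Syncing", "Endsync": "Dirty",
                "Abortsync": "NeedSync", "Reload": "NeedSync",
                "Discard": "Unwritten", "Stale": "NeedSync"},
}


def apply(state, action):
    """Return the next state (assumption: "x" means the state is unchanged)."""
    return TRANSITIONS[state].get(action, state)


def run(state, actions):
    """Apply a sequence of actions to one bit and return its final state."""
    for action in actions:
        state = apply(state, action)
    return state


# A Dirty bit is not picked up by Startsync/Endsync, so a recovery pass
# leaves it Dirty without ever moving it through Syncing.
dirty_after_recovery = run("Dirty", ["Startsync", "Endsync"])

# A NeedSync bit follows the common path from 5.1.
needsync_after_recovery = run("NeedSync", ["Startsync", "Endsync", "Daemon"])
```

In this model, only Reload or Stale would move a Dirty bit to NeedSync, and
neither is issued between the write and the recovery in the re-add case.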

This can be fixed by setting new bits directly to NeedSync for a degraded
array; I will do this in the next version.
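
A rough sketch of the intended behavior, in Python for brevity. The
function name and its parameters are hypothetical, it only models the
first write to an Unwritten bit, and the actual change in the next version
may look different:

```python
def state_after_first_write(is_raid456, degraded, fix_applied):
    """State chosen for an Unwritten bit on its first write (illustrative)."""
    if is_raid456:
        # 2.2 in the commit message: xor data is not built yet.
        return "NeedSync"
    if degraded and fix_applied:
        # Proposed change: go straight to NeedSync so that a later
        # recovery onto a new disk picks the bit up.
        return "NeedSync"
    # 2.1 in the commit message: raid1/raid10 write of new data.
    return "Dirty"


# Behavior described in this thread for raid1/raid10 on a degraded array:
before_fix = state_after_first_write(False, True, False)
after_fix = state_after_first_write(False, True, True)
```

With the change, the bit would then follow the common path from 5.1
(NeedSync, Syncing, Dirty, Clean) once recovery runs.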

Thanks,
Kuai
> 


Thread overview:
2025-08-26  8:51 [PATCH v6 md-6.18 00/11] md/llbitmap: md/md-llbitmap: introduce a new lockless bitmap Yu Kuai
2025-08-26  8:51 ` [PATCH v6 md-6.18 01/11] md: add a new parameter 'offset' to md_super_write() Yu Kuai
2025-08-26  8:51 ` [PATCH v6 md-6.18 02/11] md: factor out a helper raid_is_456() Yu Kuai
2025-08-26  8:51 ` [PATCH v6 md-6.18 03/11] md/md-bitmap: support discard for bitmap ops Yu Kuai
2025-08-26  8:51 ` [PATCH v6 md-6.18 04/11] md: add a new mddev field 'bitmap_id' Yu Kuai
2025-08-26  8:51 ` [PATCH v6 md-6.18 05/11] md/md-bitmap: add a new sysfs api bitmap_type Yu Kuai
2025-08-26  8:52 ` [PATCH v6 md-6.18 06/11] md/md-bitmap: delay registration of bitmap_ops until creating bitmap Yu Kuai
2025-08-26  8:52 ` [PATCH v6 md-6.18 07/11] md/md-bitmap: add a new method skip_sync_blocks() in bitmap_operations Yu Kuai
2025-08-26  8:52 ` [PATCH v6 md-6.18 08/11] md/md-bitmap: add a new method blocks_synced() " Yu Kuai
2025-08-26  8:52 ` [PATCH v6 md-6.18 09/11] md: add a new recovery_flag MD_RECOVERY_LAZY_RECOVER Yu Kuai
2025-08-26  8:52 ` [PATCH v6 md-6.18 10/11] md/md-bitmap: make method bitmap_ops->daemon_work optional Yu Kuai
2025-08-26  8:52 ` [PATCH v6 md-6.18 11/11] md/md-llbitmap: introduce new lockless bitmap Yu Kuai
2025-08-26  9:52   ` Paul Menzel
2025-08-27  3:44     ` Yu Kuai
2025-08-27  6:07       ` Paul Menzel
2025-08-28  7:10         ` Yu Kuai
2025-08-28  4:15   ` Randy Dunlap
2025-08-28 11:24   ` Li Nan
2025-08-29  1:03     ` Yu Kuai [this message]
