Linux RAID subsystem development
* [PATCH v3] md/md-bitmap: fix wrong bitmap_limit for clustermd when write sb
@ 2025-03-03  3:39 Su Yue
  2025-03-03  3:43 ` Heming Zhao
  2025-03-05  1:25 ` Yu Kuai
  0 siblings, 2 replies; 5+ messages in thread
From: Su Yue @ 2025-03-03  3:39 UTC
  To: linux-raid; +Cc: hch, ofir.gal, heming.zhao, yukuai3, l, Su Yue

In clustermd, separate write-intent-bitmaps are used for each cluster
node:

0                    4k                     8k                    12k
-------------------------------------------------------------------
| idle                | md super            | bm super [0] + bits |
| bm bits[0, contd]   | bm super[1] + bits  | bm bits[1, contd]   |
| bm super[2] + bits  | bm bits [2, contd]  | bm super[3] + bits  |
| bm bits [3, contd]  |                     |                     |

So on node 1, pg_index in __write_sb_page() can equal
bitmap->storage.file_pages. bitmap_limit is then calculated as 0, and
md_super_write() is called with a size of 0.
As a result, the first 4k sb area of node 1 is never updated
through filemap_write_page().
This bug causes mdadm/clustermd_tests/01r1_Grow_resize to hang.

Use (pg_index % bitmap->storage.file_pages) so that bitmap_limit is
calculated correctly.
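
The arithmetic can be checked in user space. The following is a minimal sketch, not kernel code. It assumes 4 KiB pages and, as the diagram is drawn, that node k's first bitmap page has index k * file_pages. The names SKETCH_PAGE_SHIFT, old_bitmap_limit(), new_bitmap_limit() and show_node() are hypothetical and exist only for this illustration.

```c
#include <stdio.h>

/* Assumption for this sketch: 4 KiB pages, matching the 4k cells in the diagram. */
#define SKETCH_PAGE_SHIFT 12

/* Limit as computed before the patch, using the same unsigned types. */
static unsigned int old_bitmap_limit(unsigned long file_pages,
				     unsigned long pg_index)
{
	return (unsigned int)((file_pages - pg_index) << SKETCH_PAGE_SHIFT);
}

/* Limit as computed by the patch: pg_index is reduced modulo file_pages. */
static unsigned int new_bitmap_limit(unsigned long file_pages,
				     unsigned long pg_index)
{
	return (unsigned int)((file_pages - pg_index % file_pages)
			      << SKETCH_PAGE_SHIFT);
}

/* Print both limits for every page belonging to one node. */
static void show_node(unsigned long file_pages, unsigned long node)
{
	/* Assumption read off the diagram: node k starts at page k * file_pages. */
	unsigned long first = node * file_pages;
	unsigned long i;

	for (i = 0; i < file_pages; i++)
		printf("node %lu pg_index %lu: old=%u new=%u\n",
		       node, first + i,
		       old_bitmap_limit(file_pages, first + i),
		       new_bitmap_limit(file_pages, first + i));
}
```

With file_pages = 2, calling show_node(2, 1) covers the case described above: for pg_index 2 the old expression yields 0, while the new one yields 8192, the same value node 0 gets for its first page.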

Fixes: ab99a87542f1 ("md/md-bitmap: fix writing non bitmap pages")
Signed-off-by: Su Yue <glass.su@suse.com>
---
Changelog:
v3:
    Amend commit message suggested by Heming.
v2:
    Remove unintended change calling md_super_write().
---
 drivers/md/md-bitmap.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
index 23c09d22fcdb..9ae6cc8e30cb 100644
--- a/drivers/md/md-bitmap.c
+++ b/drivers/md/md-bitmap.c
@@ -426,8 +426,8 @@ static int __write_sb_page(struct md_rdev *rdev, struct bitmap *bitmap,
 	struct block_device *bdev;
 	struct mddev *mddev = bitmap->mddev;
 	struct bitmap_storage *store = &bitmap->storage;
-	unsigned int bitmap_limit = (bitmap->storage.file_pages - pg_index) <<
-		PAGE_SHIFT;
+	unsigned long num_pages = bitmap->storage.file_pages;
+	unsigned int bitmap_limit = (num_pages - pg_index % num_pages) << PAGE_SHIFT;
 	loff_t sboff, offset = mddev->bitmap_info.offset;
 	sector_t ps = pg_index * PAGE_SIZE / SECTOR_SIZE;
 	unsigned int size = PAGE_SIZE;
@@ -436,7 +436,7 @@ static int __write_sb_page(struct md_rdev *rdev, struct bitmap *bitmap,
 
 	bdev = (rdev->meta_bdev) ? rdev->meta_bdev : rdev->bdev;
 	/* we compare length (page numbers), not page offset. */
-	if ((pg_index - store->sb_index) == store->file_pages - 1) {
+	if ((pg_index - store->sb_index) == num_pages - 1) {
 		unsigned int last_page_size = store->bytes & (PAGE_SIZE - 1);
 
 		if (last_page_size == 0)
-- 
2.47.1
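
The second hunk keeps the last-page test as a comparison of lengths relative to store->sb_index, so it needs no modulo. A small user-space sketch of that test and of the remainder computed in the hunk follows. It assumes 4 KiB pages; SKETCH_PAGE_SIZE, is_last_bitmap_page() and last_page_remainder() are hypothetical names. The hunk is cut off at the last_page_size == 0 branch, so the sketch does not model what that branch does.

```c
#include <stdbool.h>

/* Assumption for this sketch: 4 KiB pages. */
#define SKETCH_PAGE_SIZE 4096UL

/*
 * True when pg_index is the last page of the bitmap whose first page is
 * sb_index. This compares a length in pages, not an absolute page offset.
 */
static bool is_last_bitmap_page(unsigned long pg_index,
				unsigned long sb_index,
				unsigned long num_pages)
{
	return (pg_index - sb_index) == num_pages - 1;
}

/*
 * Bytes used in the last page, i.e. bytes & (PAGE_SIZE - 1) as in the hunk.
 * Returns 0 when bytes is an exact multiple of the page size; the handling
 * of that case is not visible in the quoted diff and is left out here.
 */
static unsigned int last_page_remainder(unsigned long bytes)
{
	return (unsigned int)(bytes & (SKETCH_PAGE_SIZE - 1));
}
```

For node 1 with two pages per node (sb_index 2), is_last_bitmap_page(3, 2, 2) is true and is_last_bitmap_page(2, 2, 2) is false.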



* Re: [PATCH v3] md/md-bitmap: fix wrong bitmap_limit for clustermd when write sb
  2025-03-03  3:39 [PATCH v3] md/md-bitmap: fix wrong bitmap_limit for clustermd when write sb Su Yue
@ 2025-03-03  3:43 ` Heming Zhao
  2025-03-05  1:25 ` Yu Kuai
  1 sibling, 0 replies; 5+ messages in thread
From: Heming Zhao @ 2025-03-03  3:43 UTC
  To: Su Yue, linux-raid; +Cc: hch, ofir.gal, yukuai3, l

On 3/3/25 11:39, Su Yue wrote:
> In clustermd, separate write-intent-bitmaps are used for each cluster
> node:
> 
> 0                    4k                     8k                    12k
> -------------------------------------------------------------------
> | idle                | md super            | bm super [0] + bits |
> | bm bits[0, contd]   | bm super[1] + bits  | bm bits[1, contd]   |
> | bm super[2] + bits  | bm bits [2, contd]  | bm super[3] + bits  |
> | bm bits [3, contd]  |                     |                     |
> 
> So in node 1, pg_index in __write_sb_page() could equal to
> bitmap->storage.file_pages. Then bitmap_limit will be calculated to
> 0. md_super_write() will be called with 0 size.
> That means the first 4k sb area of node 1 will never be updated
> through filemap_write_page().
> This bug causes hang of mdadm/clustermd_tests/01r1_Grow_resize.
> 
> Here use (pg_index % bitmap->storage.file_pages) to make calculation
> of bitmap_limit correct.
> 
> Fixes: ab99a87542f1 ("md/md-bitmap: fix writing non bitmap pages")
> Signed-off-by: Su Yue <glass.su@suse.com>

Looks good to me
Reviewed-by: Heming Zhao <heming.zhao@suse.com>




* Re: [PATCH v3] md/md-bitmap: fix wrong bitmap_limit for clustermd when write sb
  2025-03-03  3:39 [PATCH v3] md/md-bitmap: fix wrong bitmap_limit for clustermd when write sb Su Yue
  2025-03-03  3:43 ` Heming Zhao
@ 2025-03-05  1:25 ` Yu Kuai
  2025-03-05 10:47   ` Su Yue
  1 sibling, 1 reply; 5+ messages in thread
From: Yu Kuai @ 2025-03-05  1:25 UTC
  To: Su Yue, linux-raid
  Cc: hch, ofir.gal, heming.zhao, l, yukuai3 >> yukuai (C)

On 2025/03/03 11:39, Su Yue wrote:
> In clustermd, separate write-intent-bitmaps are used for each cluster
> node:
> 
> 0                    4k                     8k                    12k
> -------------------------------------------------------------------
> | idle                | md super            | bm super [0] + bits |
> | bm bits[0, contd]   | bm super[1] + bits  | bm bits[1, contd]   |
> | bm super[2] + bits  | bm bits [2, contd]  | bm super[3] + bits  |
> | bm bits [3, contd]  |                     |                     |
> 
> So in node 1, pg_index in __write_sb_page() could equal to
> bitmap->storage.file_pages. Then bitmap_limit will be calculated to
> 0. md_super_write() will be called with 0 size.
> That means the first 4k sb area of node 1 will never be updated
> through filemap_write_page().
> This bug causes hang of mdadm/clustermd_tests/01r1_Grow_resize.
> 
> Here use (pg_index % bitmap->storage.file_pages) to make calculation
> of bitmap_limit correct.
> 
> Fixes: ab99a87542f1 ("md/md-bitmap: fix writing non bitmap pages")
> Signed-off-by: Su Yue <glass.su@suse.com>
> ---
> Changelog:
> v3:
>      Amend commit message suggested by Heming.
> v2:
>      Remove unintended change calling md_super_write().
> ---
>   drivers/md/md-bitmap.c | 6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)
> 

Applied to md-6.15
Thanks,



* Re: [PATCH v3] md/md-bitmap: fix wrong bitmap_limit for clustermd when write sb
  2025-03-05  1:25 ` Yu Kuai
@ 2025-03-05 10:47   ` Su Yue
  2025-03-13  3:00     ` Yu Kuai
  0 siblings, 1 reply; 5+ messages in thread
From: Su Yue @ 2025-03-05 10:47 UTC
  To: Yu Kuai
  Cc: Su Yue, linux-raid, hch, ofir.gal, heming.zhao,
	yukuai3 >> yukuai (C)

On Wed 05 Mar 2025 at 09:25, Yu Kuai <yukuai1@huaweicloud.com> wrote:

> On 2025/03/03 11:39, Su Yue wrote:
>> In clustermd, separate write-intent-bitmaps are used for each cluster
>> node:
>>
>> 0                    4k                     8k                    12k
>> -------------------------------------------------------------------
>> | idle                | md super            | bm super [0] + bits |
>> | bm bits[0, contd]   | bm super[1] + bits  | bm bits[1, contd]   |
>> | bm super[2] + bits  | bm bits [2, contd]  | bm super[3] + bits  |
>> | bm bits [3, contd]  |                     |                     |
>>
>> So in node 1, pg_index in __write_sb_page() could equal to
>> bitmap->storage.file_pages. Then bitmap_limit will be calculated to
>> 0. md_super_write() will be called with 0 size.
>> That means the first 4k sb area of node 1 will never be updated
>> through filemap_write_page().
>> This bug causes hang of mdadm/clustermd_tests/01r1_Grow_resize.
>>
>> Here use (pg_index % bitmap->storage.file_pages) to make calculation
>> of bitmap_limit correct.
>>
>> Fixes: ab99a87542f1 ("md/md-bitmap: fix writing non bitmap pages")
>> Signed-off-by: Su Yue <glass.su@suse.com>
>> ---
>> Changelog:
>> v3:
>>      Amend commit message suggested by Heming.
>> v2:
>>      Remove unintended change calling md_super_write().
>> ---
>>   drivers/md/md-bitmap.c | 6 +++---
>>   1 file changed, 3 insertions(+), 3 deletions(-)
>>
>
> Applied to md-6.15
>
Since it's a bug fix, could you please queue it for 6.14 if the merge
window is still open?

--
Su


* Re: [PATCH v3] md/md-bitmap: fix wrong bitmap_limit for clustermd when write sb
  2025-03-05 10:47   ` Su Yue
@ 2025-03-13  3:00     ` Yu Kuai
  0 siblings, 0 replies; 5+ messages in thread
From: Yu Kuai @ 2025-03-13  3:00 UTC
  To: Su Yue, Yu Kuai
  Cc: Su Yue, linux-raid, hch, ofir.gal, heming.zhao, yukuai (C)

Hi,

On 2025/03/05 18:47, Su Yue wrote:
> Since it's a bug fix, could you please queue it to 6.14 if the merge window
> is still open?

Sorry that I forgot to reply.

This problem was not introduced in this merge window, and the current
6.14-rc6 is rather late, so this fix should be queued for 6.15-rc1.

Thanks,
Kuai


