Linux block layer
From: Yu Kuai <yukuai1@huaweicloud.com>
To: Ming Lei <ming.lei@redhat.com>, Yu Kuai <yukuai1@huaweicloud.com>
Cc: jack@suse.cz, hch@infradead.org, axboe@kernel.dk,
	linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	yi.zhang@huawei.com, yangerkun@huawei.com,
	"yukuai (C)" <yukuai3@huawei.com>
Subject: Re: [PATCH] block: don't set GD_NEED_PART_SCAN if scan partition failed
Date: Wed, 22 Mar 2023 17:12:48 +0800
Message-ID: <283fcc3b-bf24-e473-94c5-ffe7e73bfd47@huaweicloud.com>
In-Reply-To: <ZBq1K90+9ASVbdTu@ovpn-8-17.pek2.redhat.com>

Hi, Ming

On 2023/03/22 15:58, Ming Lei wrote:
> On Wed, Mar 22, 2023 at 11:59:26AM +0800, Yu Kuai wrote:
>> From: Yu Kuai <yukuai3@huawei.com>
>>
>> Currently, if disk_scan_partitions() fails, GD_NEED_PART_SCAN is still
>> set, and the partition scan will proceed again when blkdev_get_by_dev()
>> is called. However, this causes a problem: re-assembling a partitioned
>> raid device will create partitions for the underlying disk.
>>
>> Test procedure:
>>
>> mdadm -CR /dev/md0 -l 1 -n 2 /dev/sda /dev/sdb -e 1.0
>> sgdisk -n 0:0:+100MiB /dev/md0
>> blockdev --rereadpt /dev/sda
>> blockdev --rereadpt /dev/sdb
>> mdadm -S /dev/md0
>> mdadm -A /dev/md0 /dev/sda /dev/sdb
>>
>> Test result: the underlying disk partition and the raid partition can be
>> observed at the same time.
>>
>> Note that this can still happen in some corner cases where
>> GD_NEED_PART_SCAN is set for the underlying disk while the raid device
>> is being re-assembled.
>>
>> Fixes: e5cfefa97bcc ("block: fix scan partition for exclusively open device again")
>> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> 
> The issue still can't be avoided completely. For example, after rebooting,
> /dev/sda1 and /dev/md0p1 can be observed at the same time. In this case
> the underlying partitions are presumably scanned before the raid is
> re-assembled, and I guess that may not be easy to avoid.

Yes, this is possible and I don't know how to fix this yet...
> 
> Also, it seems the following change added in e5cfefa97bcc isn't necessary:
> 
>                  /* Make sure the first partition scan will be proceed */
>                  if (get_capacity(disk) && !(disk->flags & GENHD_FL_NO_PART) &&
>                      !test_bit(GD_SUPPRESS_PART_SCAN, &disk->state))
>                          set_bit(GD_NEED_PART_SCAN, &disk->state);
> 
> since the subsequent disk_scan_partitions() call in device_add_disk()
> should cover the partition scan.

This can't be guaranteed if someone else opens the device exclusively
after bdev_add() and before disk_scan_partitions():

t1: 			t2:
device_add_disk
  bdev_add
   insert_inode_hash
			// open device excl
  disk_scan_partitions
  // will fail

However, this is only a theoretical race, and it's unlikely to happen in
practice.
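
To make the interleaving above concrete, here is a minimal userspace
sketch in C. It is only a model under stated assumptions: the
model_* names are hypothetical and are not kernel APIs, need_part_scan
stands in for GD_NEED_PART_SCAN, and model_open() approximates the way
blkdev_get_whole() consumes the flag. It is meant to show why the flag
set in device_add_disk() still matters when t2 wins the race, not to
reproduce the real locking.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; none of these names exist in the kernel. */
struct model_disk {
	bool need_part_scan;     /* stands in for GD_NEED_PART_SCAN */
	const void *excl_holder; /* current exclusive claimer, or NULL */
	int scans_done;          /* completed partition scans */
};

/* Approximates blkdev_get_whole(): an open consumes the flag and scans. */
void model_open(struct model_disk *d)
{
	if (d->need_part_scan) {
		d->need_part_scan = false;
		d->scans_done++;
	}
}

/*
 * Approximates disk_scan_partitions() with the patch applied: the flag
 * is set only after the exclusive-claim check has passed.
 */
int model_scan_partitions(struct model_disk *d, const void *caller)
{
	if (d->excl_holder && d->excl_holder != caller)
		return -EBUSY;
	d->need_part_scan = true;
	model_open(d);
	return 0;
}

/*
 * Approximates device_add_disk() when t2 opens the device exclusively
 * after bdev_add and before disk_scan_partitions.
 */
int model_add_disk_raced(struct model_disk *d, bool set_flag_first,
			 const void *t2)
{
	if (set_flag_first)
		d->need_part_scan = true; /* "first partition scan" hunk */
	d->excl_holder = t2;              /* t2: open device excl */
	model_open(d);                    /* t2's open may consume the flag */
	return model_scan_partitions(d, NULL); /* t1: will fail */
}

/* Runs both variants of the race and checks the outcome; returns 0. */
int model_demo(void)
{
	int t2; /* only its address is used, as an owner token */
	struct model_disk with_flag = { 0 };
	struct model_disk without_flag = { 0 };
	int ret;

	ret = model_add_disk_raced(&with_flag, true, &t2);
	assert(ret == -EBUSY);
	assert(with_flag.scans_done == 1); /* t2's open did the scan */

	ret = model_add_disk_raced(&without_flag, false, &t2);
	assert(ret == -EBUSY);
	model_open(&without_flag); /* a later open does not scan either */
	assert(without_flag.scans_done == 0);
	return 0;
}
```

In this model, t1's scan fails with -EBUSY in both variants, but with
the flag set beforehand the partitions are still scanned once by t2's
open; without it, no scan happens at all.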

Thanks,
Kuai
> 
>> ---
>>   block/genhd.c | 8 +++++++-
>>   1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/block/genhd.c b/block/genhd.c
>> index 08bb1a9ec22c..a72e27d6779d 100644
>> --- a/block/genhd.c
>> +++ b/block/genhd.c
>> @@ -368,7 +368,6 @@ int disk_scan_partitions(struct gendisk *disk, fmode_t mode)
>>   	if (disk->open_partitions)
>>   		return -EBUSY;
>>   
>> -	set_bit(GD_NEED_PART_SCAN, &disk->state);
>>   	/*
>>   	 * If the device is opened exclusively by current thread already, it's
>>   	 * safe to scan partitons, otherwise, use bd_prepare_to_claim() to
>> @@ -381,12 +380,19 @@ int disk_scan_partitions(struct gendisk *disk, fmode_t mode)
>>   			return ret;
>>   	}
>>   
>> +	set_bit(GD_NEED_PART_SCAN, &disk->state);
>>   	bdev = blkdev_get_by_dev(disk_devt(disk), mode & ~FMODE_EXCL, NULL);
>>   	if (IS_ERR(bdev))
>>   		ret =  PTR_ERR(bdev);
>>   	else
>>   		blkdev_put(bdev, mode & ~FMODE_EXCL);
>>   
>> +	/*
>> +	 * If blkdev_get_by_dev() failed early, GD_NEED_PART_SCAN is still set,
>> +	 * and this will cause that re-assemble partitioned raid device will
>> +	 * creat partition for underlying disk.
>> +	 */
>> +	clear_bit(GD_NEED_PART_SCAN, &disk->state);
> 
> I feel GD_NEED_PART_SCAN is becoming a bit hard to follow.
> 
> So far, it is only consumed by blkdev_get_whole(), and cleared in
> bdev_disk_changed(). That means the partition scan can be retried
> if bdev_disk_changed() fails.
> 
> Another mess is that more drivers, such as nbd and sd, have started to
> touch this flag. It is probably better to convert them to a single API,
> blk_disk_need_partition_scan(), and hide the implementation detail
> from drivers.
> 
> 
> thanks,
> Ming
> 
> .
> 

