From: "heming.zhao@suse.com" <heming.zhao@suse.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: linux-raid@vger.kernel.org, song@kernel.org,
guoqing.jiang@cloud.ionos.com, lidong.zhong@suse.com,
xni@redhat.com, neilb@suse.de, colyli@suse.com
Subject: Re: [PATCH] md: don't create mddev in md_open
Date: Thu, 1 Apr 2021 00:42:04 +0800
Message-ID: <7bef7b86-ad8b-b503-59dc-0c9c69974237@suse.com>
In-Reply-To: <20210331065512.GA987842@infradead.org>
On 3/31/21 2:55 PM, Christoph Hellwig wrote:
>> -static struct mddev *mddev_find(dev_t unit)
>> +static struct mddev *mddev_find(dev_t unit, bool create)
>
> This just makes the mess that is mddev_find even worse. Please take
> a look at the patches at the beginning of the
>
> "move bd_mutex to the gendisk"
>
> series to try to clean this up properly.
>
Hello Christoph,
Because your series touches this md issue, I am using this mail thread to discuss it.
If you or others think the To & Cc lists should be extended, please do so.
If I understand the series correctly, the purpose of [PATCH 01/15]
is to remove the "return -ERESTARTSYS" path.
In the current md_open, all of the race-handling code is in the part below:
```md_open
	if (mddev->gendisk != bdev->bd_disk) {
		/* we are racing with mddev_put which is discarding this
		 * bd_disk.
		 */
		mddev_put(mddev);
		/* Wait until bdev->bd_disk is definitely gone */
		if (work_pending(&mddev->del_work))
			flush_workqueue(md_misc_wq);
		/* Then retry the open from the top */
		return -ERESTARTSYS;
	}
```
mddev_put removes the mddev from md's internal list; it is the function that
kicks off the job that discards the mddev.
Let's focus only on the "mddev->gendisk != bdev->bd_disk" case. There are two paths:
1> The creating path.
This case cannot be triggered here: the userspace md device (/dev/mdX) only becomes
valid after md_alloc completes successfully, and by then mddev->gendisk must equal
bdev->bd_disk.
2> The freeing path. (This is what Neil's patch really cared about.)
2.1>
md_open runs before the mddev is removed from md's internal list.
Neil wanted to wait for the queued work to finish the cleanup job and then return
-ERESTARTSYS. On the next attempt, md_open was meant to find no mddev and return
-ENODEV. (In practice mddev_find allocates a new one. This is a bug and not what
Neil intended.)
Your [PATCH 01/15] breaks this rule: it will mistakenly call mddev_get and block the
cleanup job.
In my opinion, the solution may be to simply return -EBUSY (instead of -ENODEV) to
fail the open path. (I show the code below.)
2.2>
Neil's patch has a bug (mentioned in 2.1) in the following case:
md_open is called after mddev_put has removed the mddev but before md_free() has
finished. At this point the mddev no longer exists in md's internal list, but
bdev->bd_disk still holds the mddev pointer. This scenario must not return
-ERESTARTSYS, because that makes __blkdev_get call md_open over and over and
triggers a soft lockup.
This case can be fixed by calling an mddev_find that does not create an mddev. That
corresponds to your new [PATCH 04/15], where mddev_find only searches.
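To make the difference between the two lookups concrete, here is a small user-space model of case 2.2. It is only a sketch under simplifying assumptions, not kernel code: a single slot stands in for md's internal list, plain integers stand in for gendisk pointers, and every name in it (model_mddev, find_only, find_or_create, open_with_create, open_search_only, count_attempts, MODEL_ERESTARTSYS) is made up for illustration. The capped loop plays the role of the caller that restarts the open.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel-internal ERESTARTSYS, which user space never sees. */
#define MODEL_ERESTARTSYS 512

/* Toy mddev: a single slot replaces md's internal list (assumption). */
struct model_mddev {
	int unit;
	int disk;	/* models mddev->gendisk as a plain id */
	bool on_list;	/* false once the put path has unlinked it */
};

static struct model_mddev old_dev;	/* device being torn down */
static struct model_mddev new_dev;	/* what a creating lookup hands back */

/* Search-only lookup: NULL when the unit is no longer on the list. */
static struct model_mddev *find_only(int unit)
{
	if (old_dev.on_list && old_dev.unit == unit)
		return &old_dev;
	return NULL;
}

/* Creating lookup: never fails, so a stale opener always gets an object. */
static struct model_mddev *find_or_create(int unit)
{
	struct model_mddev *m = find_only(unit);

	if (m)
		return m;
	/* Fresh object with no disk attached yet (disk id 0). */
	new_dev = (struct model_mddev){ .unit = unit, .disk = 0, .on_list = true };
	return &new_dev;
}

/* Open using the creating lookup; asks for a restart on a disk mismatch. */
static int open_with_create(int unit, int bdev_disk)
{
	struct model_mddev *m = find_or_create(unit);

	return m->disk != bdev_disk ? -MODEL_ERESTARTSYS : 0;
}

/* Open using the search-only lookup; fails hard instead of restarting. */
static int open_search_only(int unit, int bdev_disk)
{
	struct model_mddev *m = find_only(unit);

	if (!m)
		return -ENODEV;
	return m->disk != bdev_disk ? -EBUSY : 0;
}

/* Models the caller's restart loop; cap keeps the demo finite. */
static int count_attempts(int (*open_fn)(int, int), int unit, int bdev_disk,
			  int cap, int *last_err)
{
	int n = 0;

	assert(cap > 0);
	do {
		*last_err = open_fn(unit, bdev_disk);
		n++;
	} while (*last_err == -MODEL_ERESTARTSYS && n < cap);
	return n;
}
```

With old_dev unlinked (on_list false) while the block device still refers to its old disk id, open_with_create uses up every attempt the cap allows, because each restart gets a fresh object whose disk never matches. open_search_only stops on the first call with -ENODEV.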
Finally, the code (based on your [PATCH 01/15]) might look like:
```
static int md_open(struct block_device *bdev, fmode_t mode)
{
	/* ... */
	struct mddev *mddev = mddev_find(bdev->bd_dev); //hm: the new mddev_find, which only searches
	int err;

	if (!mddev) //hm: this covers freeing path 2.2
		return -ENODEV;

	if (mddev->gendisk != bdev->bd_disk) { //hm: for freeing path 2.1
		/* we are racing with mddev_put which is discarding this
		 * bd_disk.
		 */
		mddev_put(mddev);
		/* Wait until bdev->bd_disk is definitely gone */
		if (work_pending(&mddev->del_work))
			flush_workqueue(md_misc_wq);
		return -EBUSY; //hm: fail this path. userspace can retry later and get -ENODEV.
	}

	/* hm: below is the same as [PATCH 01/15] */
	err = mutex_lock_interruptible(&mddev->open_mutex);
	if (err)
		return err;

	if (test_bit(MD_CLOSING, &mddev->flags)) {
		mutex_unlock(&mddev->open_mutex);
		return -ENODEV;
	}

	mddev_get(mddev);
	atomic_inc(&mddev->openers);
	mutex_unlock(&mddev->open_mutex);

	bdev_check_media_change(bdev);
	return 0;
}
```
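From the userspace side, "try later" could look roughly like the sketch below. It is only an illustration of the idea, not code from mdadm or any existing tool: the helper name open_retry_busy and its max_tries and delay_ms parameters are assumptions made up here. It retries open(2) only while the error is EBUSY and hands any other error (for example ENODEV once the device is gone) straight back to the caller.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <time.h>

/*
 * Open path read-only, retrying only while open(2) fails with EBUSY.
 * Returns a file descriptor on success or a negative errno value.
 * max_tries and delay_ms are illustrative knobs (assumptions).
 */
static int open_retry_busy(const char *path, int max_tries, long delay_ms)
{
	struct timespec delay = {
		.tv_sec = delay_ms / 1000,
		.tv_nsec = (delay_ms % 1000) * 1000000L,
	};
	int err = EINVAL;	/* reported if max_tries <= 0 */

	assert(path != NULL);
	for (int i = 0; i < max_tries; i++) {
		int fd = open(path, O_RDONLY);

		if (fd >= 0)
			return fd;
		err = errno;
		if (err != EBUSY)
			break;		/* e.g. ENODEV: the device is gone */
		nanosleep(&delay, NULL);	/* give the teardown time to finish */
	}
	return -err;
}
```

A caller would pass a device node such as /dev/mdX and treat -EBUSY after the last try as "still being torn down", while -ENODEV means the teardown has completed.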
Thanks,
heming