From: Ming Lei <ming.lei@redhat.com>
To: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Jens Axboe <axboe@kernel.dk>, Keith Busch <keith.busch@intel.com>,
Christoph Hellwig <hch@lst.de>, Sagi Grimberg <sagi@grimberg.me>,
linux-nvme@lists.infradead.org, Zhang Yi <yizhan@redhat.com>,
linux-block@vger.kernel.org, stable@vger.kernel.org
Subject: Re: [PATCH 1/2] nvme: fix race between removing and reseting failure
Date: Wed, 17 May 2017 15:01:33 +0800
Message-ID: <20170517070132.GA14057@ming.t460p>
In-Reply-To: <3394ff16-2876-c9da-b457-98d6479516e0@suse.de>

On Wed, May 17, 2017 at 08:38:01AM +0200, Johannes Thumshirn wrote:
> On 05/17/2017 03:27 AM, Ming Lei wrote:
> > When an NVMe PCI device is being reset and the reset fails,
> > nvme_remove_dead_ctrl() is called to handle the failure: the blk-mq hw
> > queues are stopped first, then .remove_work is scheduled to release the
> > driver.
> >
> > Unfortunately, if the driver is released via a sysfs store
> > just before .remove_work runs, del_gendisk() from
> > nvme_remove() may hang forever because the hw queues are stopped and
> > the writeback IOs submitted by fsync_bdev() can never complete.
> >
> > This patch fixes the following issue[1][2] by moving nvme_kill_queues()
> > into nvme_remove_dead_ctrl(). This avoids the hang because nvme_remove()
> > flushes .reset_work, and it is reasonable and safe because
> > nvme_dev_disable() has already started suspending the queues and has
> > canceled the outstanding requests.
> >
> > [1] test script
> > fio -filename=$NVME_DISK -iodepth=1 -thread -rw=randwrite -ioengine=psync \
> > -bssplit=5k/10:9k/10:13k/10:17k/10:21k/10:25k/10:29k/10:33k/10:37k/10:41k/10 \
> > -bs_unaligned -runtime=1200 -size=-group_reporting -name=mytest -numjobs=60
>
> Nit: the actual size after the -size parameter is missing.

I forgot to mention: $NVME_DISK has to be a partition of the NVMe disk, and
its type needs to be 'filesystem' for reproduction; in that case the actual
size isn't needed.

>
> Anyways:
> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>

Thanks for the review!

Thanks,
Ming
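
As a rough illustration of the race described in the quoted patch description, here is a toy userspace model in C. It is not kernel code: the struct and every toy_* name are hypothetical stand-ins for nvme_dev_disable(), nvme_kill_queues() and del_gendisk(), and the model assumes that a killed queue lets pending requests finish (with an error) instead of waiting forever.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; none of these names exist in the kernel. */
struct toy_queue {
	bool stopped;	/* hw queue stopped: nothing gets dispatched */
	bool dying;	/* queue killed: requests are failed, not queued */
	int pending;	/* writeback requests waiting for completion */
};

/* Stand-in for nvme_dev_disable(): the hw queues end up stopped. */
static void toy_dev_disable(struct toy_queue *q)
{
	q->stopped = true;
}

/*
 * Stand-in for nvme_kill_queues(): mark the queue dying and restart it
 * so that requests are failed quickly instead of waiting.
 */
static void toy_kill_queues(struct toy_queue *q)
{
	q->dying = true;
	q->stopped = false;
}

/*
 * Stand-in for del_gendisk() -> fsync_bdev(): submit one writeback
 * request and wait for it. Returns true if the wait would finish,
 * false if it would hang because the queue is still stopped.
 */
static bool toy_del_gendisk(struct toy_queue *q)
{
	assert(q != NULL);
	q->pending++;
	if (q->stopped)
		return false;	/* request is never dispatched: hang */
	q->pending = 0;		/* dispatched; errors out if q->dying */
	return true;
}

/*
 * Reset failure followed by a sysfs removal that runs before
 * .remove_work. kill_in_dead_ctrl selects the patched ordering, where
 * the queues are killed in the reset-failure path itself.
 */
static bool toy_remove_after_reset_failure(bool kill_in_dead_ctrl)
{
	struct toy_queue q = { false, false, 0 };

	toy_dev_disable(&q);
	if (kill_in_dead_ctrl)
		toy_kill_queues(&q);
	/* .remove_work is scheduled here but has not run yet. */
	return toy_del_gendisk(&q);
}
```

In this model, toy_remove_after_reset_failure(false) returns false (the removal would hang on the stopped queue), while toy_remove_after_reset_failure(true) returns true, which is the ordering the patch aims for.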