All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ming Lei <ming.lei@redhat.com>
To: Jens Axboe <axboe@kernel.dk>
Cc: Keith Busch <keith.busch@intel.com>,
	Christoph Hellwig <hch@lst.de>, Sagi Grimberg <sagi@grimberg.me>,
	linux-block@vger.kernel.org, stable@vger.kernel.org,
	linux-nvme@lists.infradead.org, Zhang Yi <yizhan@redhat.com>
Subject: Re: [PATCH 1/2] nvme: fix race between removing and reseting failure
Date: Fri, 19 May 2017 23:10:32 +0800	[thread overview]
Message-ID: <20170519151027.GA4870@ming.t460p> (raw)
In-Reply-To: <d17e121f-799d-f555-8c7b-1d82ce61df53@kernel.dk>

On Fri, May 19, 2017 at 08:41:13AM -0600, Jens Axboe wrote:
> On 05/16/2017 07:27 PM, Ming Lei wrote:
> > When one NVMe PCI device is being resetted and found reset failue,
> > nvme_remove_dead_ctrl() is called to handle the failure: blk-mq hw queues
> > are put into stopped first, then schedule .remove_work to release the driver.
> > 
> > Unfortunately if the driver is being released via sysfs store
> > just before the .remove_work is run, del_gendisk() from
> > nvme_remove() may hang forever because hw queues are stopped and
> > the submitted writeback IOs from fsync_bdev() can't be completed at all.
> > 
> > This patch fixes the following issue[1][2] by moving nvme_kill_queues()
> > into nvme_remove_dead_ctrl() to avoid the issue because nvme_remove()
> > flushs .reset_work, and this way is reasonable and safe because
> > nvme_dev_disable() has started to suspend queues and canceled requests
> > already.
> > 
> > [1] test script
> > 	fio -filename=$NVME_DISK -iodepth=1 -thread -rw=randwrite -ioengine=psync \
> > 	    -bssplit=5k/10:9k/10:13k/10:17k/10:21k/10:25k/10:29k/10:33k/10:37k/10:41k/10 \
> > 	    -bs_unaligned -runtime=1200 -size=-group_reporting -name=mytest -numjobs=60
> > 
> > 	sleep 35
> > 	echo 1 > $SYSFS_NVME_PCI_PATH/rescan
> > 	echo 1 > $SYSFS_NVME_PCI_PATH/reset
> > 	echo 1 > $SYSFS_NVME_PCI_PATH/remove
> > 	echo 1 > /sys/bus/pci/rescan
> 
> The patch looks good to me. But since you have a nice reproducer, how about
> turning that into a blktests [1] test case?

This test has triggered several problems recently, and looks a good idea to
integrate into blktests.

I am a little busy recently, and may take a while to start that. If
anyone would like to integrate the case, please go ahead, and I am happy
to provide any details you need.


Thanks,
Ming

WARNING: multiple messages have this Message-ID (diff)
From: ming.lei@redhat.com (Ming Lei)
Subject: [PATCH 1/2] nvme: fix race between removing and reseting failure
Date: Fri, 19 May 2017 23:10:32 +0800	[thread overview]
Message-ID: <20170519151027.GA4870@ming.t460p> (raw)
In-Reply-To: <d17e121f-799d-f555-8c7b-1d82ce61df53@kernel.dk>

On Fri, May 19, 2017@08:41:13AM -0600, Jens Axboe wrote:
> On 05/16/2017 07:27 PM, Ming Lei wrote:
> > When one NVMe PCI device is being resetted and found reset failue,
> > nvme_remove_dead_ctrl() is called to handle the failure: blk-mq hw queues
> > are put into stopped first, then schedule .remove_work to release the driver.
> > 
> > Unfortunately if the driver is being released via sysfs store
> > just before the .remove_work is run, del_gendisk() from
> > nvme_remove() may hang forever because hw queues are stopped and
> > the submitted writeback IOs from fsync_bdev() can't be completed at all.
> > 
> > This patch fixes the following issue[1][2] by moving nvme_kill_queues()
> > into nvme_remove_dead_ctrl() to avoid the issue because nvme_remove()
> > flushs .reset_work, and this way is reasonable and safe because
> > nvme_dev_disable() has started to suspend queues and canceled requests
> > already.
> > 
> > [1] test script
> > 	fio -filename=$NVME_DISK -iodepth=1 -thread -rw=randwrite -ioengine=psync \
> > 	    -bssplit=5k/10:9k/10:13k/10:17k/10:21k/10:25k/10:29k/10:33k/10:37k/10:41k/10 \
> > 	    -bs_unaligned -runtime=1200 -size=-group_reporting -name=mytest -numjobs=60
> > 
> > 	sleep 35
> > 	echo 1 > $SYSFS_NVME_PCI_PATH/rescan
> > 	echo 1 > $SYSFS_NVME_PCI_PATH/reset
> > 	echo 1 > $SYSFS_NVME_PCI_PATH/remove
> > 	echo 1 > /sys/bus/pci/rescan
> 
> The patch looks good to me. But since you have a nice reproducer, how about
> turning that into a blktests [1] test case?

This test has triggered several problems recently, and looks a good idea to
integrate into blktests.

I am a little busy recently, and may take a while to start that. If
anyone would like to integrate the case, please go ahead, and I am happy
to provide any details you need.


Thanks,
Ming

  reply	other threads:[~2017-05-19 15:10 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-05-17  1:27 [PATCH 0/2] nvme: fix hang in path of removing disk Ming Lei
2017-05-17  1:27 ` Ming Lei
2017-05-17  1:27 ` [PATCH 1/2] nvme: fix race between removing and reseting failure Ming Lei
2017-05-17  1:27   ` Ming Lei
2017-05-17  6:38   ` Johannes Thumshirn
2017-05-17  6:38     ` Johannes Thumshirn
2017-05-17  7:01     ` Ming Lei
2017-05-17  7:01       ` Ming Lei
2017-05-18 13:47   ` Christoph Hellwig
2017-05-18 13:47     ` Christoph Hellwig
2017-05-18 15:04     ` Ming Lei
2017-05-18 15:04       ` Ming Lei
2017-05-18 14:13   ` Keith Busch
2017-05-18 14:13     ` Keith Busch
2017-05-19 12:52     ` Ming Lei
2017-05-19 12:52       ` Ming Lei
2017-05-19 15:15       ` Keith Busch
2017-05-19 15:15         ` Keith Busch
2017-05-19 14:41   ` Jens Axboe
2017-05-19 14:41     ` Jens Axboe
2017-05-19 15:10     ` Ming Lei [this message]
2017-05-19 15:10       ` Ming Lei
2017-05-19 16:40     ` Ming Lei
2017-05-19 16:40       ` Ming Lei
2017-05-19 16:55       ` yizhan
2017-05-19 16:55         ` yizhan
2017-05-17  1:27 ` [PATCH 2/2] nvme: avoid to hang in remove disk Ming Lei
2017-05-17  1:27   ` Ming Lei
2017-05-18 13:49   ` Christoph Hellwig
2017-05-18 13:49     ` Christoph Hellwig
2017-05-18 15:35     ` Ming Lei
2017-05-18 15:35       ` Ming Lei
2017-05-18 16:06       ` Keith Busch
2017-05-18 16:06         ` Keith Busch
2017-05-19 13:19         ` Ming Lei
2017-05-19 13:19           ` Ming Lei

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170519151027.GA4870@ming.t460p \
    --to=ming.lei@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=hch@lst.de \
    --cc=keith.busch@intel.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-nvme@lists.infradead.org \
    --cc=sagi@grimberg.me \
    --cc=stable@vger.kernel.org \
    --cc=yizhan@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.