From: Ming Lei <ming.lei@redhat.com>
To: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Jens Axboe <axboe@kernel.dk>,
linux-block@vger.kernel.org, Christoph Hellwig <hch@lst.de>,
"Martin K . Petersen" <martin.petersen@oracle.com>,
Oleksandr Natalenko <oleksandr@natalenko.name>,
Hannes Reinecke <hare@suse.com>,
Johannes Thumshirn <jthumshirn@suse.de>
Subject: Re: [PATCH v2 3/4] block, scsi: Make SCSI device suspend and resume work reliably
Date: Fri, 22 Sep 2017 06:04:03 +0800
Message-ID: <20170921220401.GC6854@ming.t460p>
In-Reply-To: <20170921212255.12788-4-bart.vanassche@wdc.com>

On Thu, Sep 21, 2017 at 02:22:54PM -0700, Bart Van Assche wrote:
> It is essential during suspend and resume that neither the filesystem
> state nor the filesystem metadata in RAM changes. This is why while
> the hibernation image is being written or restored that SCSI devices
> are quiesced. The SCSI core quiesces devices through scsi_device_quiesce()
> and scsi_device_resume(). In the SDEV_QUIESCE state execution of
> non-preempt requests is deferred. This is realized by returning
> BLKPREP_DEFER from inside scsi_prep_state_check() for quiesced SCSI
> devices. Avoid that a full queue prevents power management requests
> to be submitted by slowing down allocation of non-preempt requests for
> devices in the quiesced state. This patch has been tested by running
> the following commands and by verifying that after resume the fio job
> is still running:
>
> for d in /sys/class/block/sd*[a-z]; do
> hcil=$(readlink "$d/device")
> hcil=${hcil#../../../}
> echo 4 > "$d"
> echo 1 > "/sys/class/scsi_device/$hcil/device/queue_depth"
> done
> bdev=$(readlink /dev/disk/by-uuid/5217d83f-213e-4b42-b86e-20013325ba6c)
> bdev=${bdev#../../}
> hcil=$(readlink "/sys/block/$bdev/device")
> hcil=${hcil#../../../}
> fio --name="$bdev" --filename="/dev/$bdev" --buffered=0 --bs=512 --rw=randread \
> --ioengine=psync --numjobs=4 --iodepth=16 --iodepth_batch=1 --thread \
> --loops=$((2**31)) &
> pid=$!
> sleep 1
> systemctl hibernate
> sleep 10
> kill $pid
>
> Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name>
> References: "I/O hangs after resuming from suspend-to-ram" (https://marc.info/?l=linux-block&m=150340235201348).
> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
> Cc: Martin K. Petersen <martin.petersen@oracle.com>
> Cc: Ming Lei <ming.lei@redhat.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Hannes Reinecke <hare@suse.com>
> Cc: Johannes Thumshirn <jthumshirn@suse.de>
> ---
> block/blk-core.c | 13 ++++++++++---
> drivers/scsi/scsi_lib.c | 11 ++++++++---
> 2 files changed, 18 insertions(+), 6 deletions(-)
>
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 1ac337712bbd..6a190dd998aa 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -1429,11 +1429,18 @@ struct request *blk_get_request(struct request_queue *q, unsigned int op,
> gfp_t gfp_mask)
> {
> struct request *req;
> + const bool may_sleep = gfp_mask & __GFP_DIRECT_RECLAIM;
> +
> + if (unlikely(blk_queue_preempt_only(q) && !(op & REQ_PREEMPT))) {
The flag is set under queue_lock but checked here without any lock.
Do you think that is safe?
Also, this flag isn't checked in the normal I/O path, yet you unfreeze
the queue during scsi_device_quiesce(), so normal I/O can still arrive
from that point on.
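
To make the first concern concrete, here is a minimal userspace sketch of
the check-then-act window. It is not kernel code: toy_queue and every
toy_* helper are hypothetical names invented for illustration, and one
unlucky interleaving is replayed in a single thread so the outcome is
deterministic.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a request queue; not kernel code. */
struct toy_queue {
	atomic_bool preempt_only;	/* models the preempt-only queue flag */
	int normal_in_flight;		/* normal (non-preempt) requests admitted */
};

/* Step 1 of allocation: the flag is read without holding any lock. */
static bool toy_check_allowed(struct toy_queue *q, bool preempt)
{
	return preempt || !atomic_load(&q->preempt_only);
}

/* Step 2 of allocation: the request is actually admitted. */
static void toy_admit(struct toy_queue *q)
{
	q->normal_in_flight++;
}

/* Models the quiesce side setting the flag. */
static void toy_quiesce(struct toy_queue *q)
{
	atomic_store(&q->preempt_only, true);
}

/*
 * Replays one interleaving: the submitter passes the check, the
 * quiescer sets the flag, and the submitter then acts on its stale
 * decision. Returns the number of normal requests admitted even
 * though the queue is already preempt-only.
 */
static int toy_replay_race(void)
{
	struct toy_queue q = { .normal_in_flight = 0 };
	bool allowed;

	atomic_init(&q.preempt_only, false);
	allowed = toy_check_allowed(&q, false);	/* submitter: flag is clear */
	assert(allowed);
	toy_quiesce(&q);			/* quiescer: flag is now set */
	if (allowed)
		toy_admit(&q);			/* submitter: stale decision */
	return q.normal_in_flight;
}
```

Calling toy_replay_race() returns 1: one normal request slipped in after
the flag was set. Making the read atomic does not help, because nothing
ties the check to the admission that follows it.
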
> + if (may_sleep)
> + msleep(100);
This is definitely a hack. Why do you introduce the msleep() at all?
And why 100 ms rather than some other delay?
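
For contrast, a conventional alternative to polling with a fixed delay is
to sleep until the condition itself changes; in the kernel that is what
the wait_event() family is for. The sketch below shows only the general
idea in userspace with pthreads. It is an assumption made for
illustration, not something proposed in this thread, and toy_gate and
its helpers are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace model; not kernel code. */
struct toy_gate {
	pthread_mutex_t lock;
	pthread_cond_t cleared;		/* signalled when preempt_only drops */
	bool preempt_only;
};

static void toy_gate_init(struct toy_gate *g, bool preempt_only)
{
	int rc;

	rc = pthread_mutex_init(&g->lock, NULL);
	assert(rc == 0);
	rc = pthread_cond_init(&g->cleared, NULL);
	assert(rc == 0);
	(void)rc;
	g->preempt_only = preempt_only;
}

/* Resume side: clear the flag and wake every waiter. */
static void toy_gate_open(struct toy_gate *g)
{
	pthread_mutex_lock(&g->lock);
	g->preempt_only = false;
	pthread_cond_broadcast(&g->cleared);
	pthread_mutex_unlock(&g->lock);
}

/*
 * Submit side: block until the flag is clear. Preempt requests pass
 * immediately. The flag is read under the same lock that guards
 * writes to it, so a waiter cannot miss the wakeup.
 */
static void toy_gate_wait(struct toy_gate *g, bool preempt)
{
	pthread_mutex_lock(&g->lock);
	while (g->preempt_only && !preempt)
		pthread_cond_wait(&g->cleared, &g->lock);
	pthread_mutex_unlock(&g->lock);
}
```

With this shape a waiter wakes as soon as the resume side calls
toy_gate_open(), so there is no delay constant to justify, and the flag
is read and written under one lock.
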
> + else
> + return ERR_PTR(-EBUSY);
> + }
>
> if (q->mq_ops) {
> - req = blk_mq_alloc_request(q, op,
> - (gfp_mask & __GFP_DIRECT_RECLAIM) ?
> - 0 : BLK_MQ_REQ_NOWAIT);
> + req = blk_mq_alloc_request(q, op, may_sleep ?
> + 0 : BLK_MQ_REQ_NOWAIT);
> if (!IS_ERR(req) && q->mq_ops->initialize_rq_fn)
> q->mq_ops->initialize_rq_fn(req);
> } else {
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 6db8247577a0..e76fd6e89a81 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -2889,19 +2889,22 @@ static void scsi_wait_for_queuecommand(struct scsi_device *sdev)
> int
> scsi_device_quiesce(struct scsi_device *sdev)
> {
> + struct request_queue *q = sdev->request_queue;
> int err;
>
> mutex_lock(&sdev->state_mutex);
> err = scsi_device_set_state(sdev, SDEV_QUIESCE);
> + if (err == 0)
> + blk_set_preempt_only(q, true);
> mutex_unlock(&sdev->state_mutex);
>
> if (err)
> return err;
>
> - scsi_run_queue(sdev->request_queue);
> + scsi_run_queue(q);
> while (atomic_read(&sdev->device_busy)) {
> msleep_interruptible(200);
> - scsi_run_queue(sdev->request_queue);
> + scsi_run_queue(q);
> }
> return 0;
> }
> @@ -2924,8 +2927,10 @@ void scsi_device_resume(struct scsi_device *sdev)
> */
> mutex_lock(&sdev->state_mutex);
> if (sdev->sdev_state == SDEV_QUIESCE &&
> - scsi_device_set_state(sdev, SDEV_RUNNING) == 0)
> + scsi_device_set_state(sdev, SDEV_RUNNING) == 0) {
> + blk_set_preempt_only(sdev->request_queue, false);
> scsi_run_queue(sdev->request_queue);
> + }
> mutex_unlock(&sdev->state_mutex);
> }
> EXPORT_SYMBOL(scsi_device_resume);
> --
> 2.14.1
>
--
Ming