public inbox for linux-block@vger.kernel.org
From: Ming Lei <ming.lei@redhat.com>
To: Damien Le Moal <damien.lemoal@wdc.com>
Cc: linux-scsi@vger.kernel.org,
	"Martin K . Petersen" <martin.petersen@oracle.com>,
	linux-block@vger.kernel.org, Jens Axboe <axboe@kernel.dk>,
	Christoph Hellwig <hch@lst.de>,
	Bart Van Assche <Bart.VanAssche@wdc.com>
Subject: Re: [PATCH V2 10/12] scsi: sd_zbc: Disable zone write locking with scsi-mq
Date: Tue, 12 Sep 2017 17:26:30 +0800
Message-ID: <20170912092629.GB15792@ming.t460p>
In-Reply-To: <5a10d60d-b18c-0956-59cd-69cb865019df@wdc.com>

On Tue, Sep 12, 2017 at 05:24:02PM +0900, Damien Le Moal wrote:
> Ming,
> 
> On 9/10/17 14:10, Ming Lei wrote:
> > On Fri, Sep 08, 2017 at 09:53:53AM -0700, Damien Le Moal wrote:
> >> Ming,
> >>
> >> On 9/8/17 05:43, Ming Lei wrote:
> >>> Hi Damien,
> >>>
> >>> On Fri, Sep 08, 2017 at 01:16:38AM +0900, Damien Le Moal wrote:
> >>>> In the case of a ZBC disk used with scsi-mq, zone write locking does
> >>>> not prevent write reordering in sequential zones. Unlike the legacy
> >>>> case, zone locking can only be done after the command request is
> >>>> removed from the scheduler dispatch queue. That is, at the time of
> >>>> zone locking, the write command may already be out of order.
> >>>
> >>> As I understand it, in the legacy case it can be quite tricky to have
> >>> the existing I/O schedulers guarantee write order for ZBC disks.
> >>> I guess requeue might still cause write reordering even in the legacy
> >>> path, since requeue can happen in both scsi_request_fn() and
> >>> scsi_io_completion() with q->queue_lock released; meanwhile, a new rq
> >>> belonging to the same zone can arrive and be inserted into the queue.
> >>
> >> Yes, the write ordering will always depend on the scheduler doing the
> >> right thing. But cfq, deadline and even noop all do the right thing
> >> there, even considering the aging case. The next write for a zone will
> >> always be the oldest in the queue for that zone; if it is not, it means
> >> that the application did not write sequentially. Extensive testing in
> >> the legacy case never showed a problem due to the scheduler itself.
> > 
> > OK, I suggest documenting this guarantee of no write reordering for
> > ZBC somewhere, so that people keep it in mind when changing the
> > current code.
> 
> Have you looked at the comments in sd_zbc.c? That is explained there.
> Granted, this is a little deep in the stack, but this is after all
> dependent on the implementation of scsi_request_fn(). I can add comments
> there too if you prefer.

Yeah, I looked at that, but it seems too coarse.

> 
> >> scsi_requeue_command() does the unprep (zone unlock) and requeue while
> >> holding the queue lock. So this is atomic with new write command
> >> insertion. Requeued commands are added to the dispatch queue head, and
> >> since a zone will only have a single write in-flight, there is no
> >> reordering possible. The next write command for a zone to go again is
> >> the last requeued one or the next in lba order. It works.
> > 
> > One special case is a write with FLUSH/FUA, which may be added to
> > the front of q->queue_head directly. Suppose a write with FUA
> > arrives just between the requeue and the queue run: write
> > reordering may be triggered.
> 
> Zoned disks are recent and all of them support FUA. This means that a
> write with FUA will be processed like any other write request (if I read
> the code in blk-flush.c correctly). Well, at least for the mq case,
> which does a blk_mq_sched_insert_request(), while the direct call to

blk_mq_sched_bypass_insert() can be called for flush requests too:
requests in a flush sequence share one driver tag (rq->tag != -1),
so the rq can be added to the front of hctx->dispatch directly.
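
To make the concern concrete, here is a rough userspace model of the
ordering problem. It is not kernel code: every model_* name is made up
for illustration, and the array simply stands in for a dispatch list
such as hctx->dispatch. It assumes that a requeue and a bypass insert
both land at the head of the list, and checks whether the resulting
dispatch order still matches a zone's write pointer.

```c
/* Userspace model only: none of the model_* names are kernel APIs. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_RQ 8

struct model_rq {
	unsigned long lba;	/* start sector of the write */
	unsigned int nr;	/* length in sectors */
};

struct model_dispatch {
	struct model_rq rq[MAX_RQ];	/* index 0 is dispatched first */
	size_t count;
};

/* Tail insertion: a write that goes through the normal insert path. */
static void model_add_tail(struct model_dispatch *d, struct model_rq rq)
{
	assert(d->count < MAX_RQ);
	d->rq[d->count++] = rq;
}

/* Head insertion: models a requeue, or a bypass insert of a flush rq. */
static void model_add_head(struct model_dispatch *d, struct model_rq rq)
{
	assert(d->count < MAX_RQ);
	memmove(&d->rq[1], &d->rq[0], d->count * sizeof(d->rq[0]));
	d->rq[0] = rq;
	d->count++;
}

/*
 * Replay the list in dispatch order against a zone write pointer.
 * Returns the index of the first request that does not start at the
 * write pointer, or -1 if every write is sequential.
 */
static int model_first_unaligned(const struct model_dispatch *d,
				 unsigned long wp)
{
	for (size_t i = 0; i < d->count; i++) {
		if (d->rq[i].lba != wp)
			return (int)i;
		wp += d->rq[i].nr;
	}
	return -1;
}
```

With a write at LBA 8 queued and a requeued write at LBA 0 put back at
the head, model_first_unaligned(&d, 0) returns -1, i.e. the order is
still sequential. If a FUA write at LBA 16 is then added with
model_add_head() before the queue runs, the same check returns 0: the
first request to be dispatched no longer starts at the write pointer.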


-- 
Ming
