linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Tejun Heo <tj@kernel.org>
To: Mike Snitzer <snitzer@redhat.com>
Cc: jaxboe@fusionio.com, linux-fsdevel@vger.kernel.org,
	linux-scsi@vger.kernel.org, linux-ide@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org,
	hch@lst.de, James.Bottomley@suse.de, tytso@mit.edu,
	chris.mason@oracle.com, swhiteho@redhat.com,
	konishi.ryusuke@lab.ntt.co.jp, dm-devel@redhat.com, vst@vlnb.net,
	jack@suse.cz, rwheeler@redhat.com, hare@suse.de, neilb@suse.de,
	rusty@rustcorp.com.au, mst@redhat.com, Tejun Heo <tj@kernle.org>
Subject: Re: [PATCH 5/5] dm: implement REQ_FLUSH/FUA support
Date: Tue, 17 Aug 2010 11:33:52 +0200	[thread overview]
Message-ID: <4C6A5780.2090100@kernel.org> (raw)
In-Reply-To: <20100816190203.GA22299@redhat.com>

Hello,

On 08/16/2010 09:02 PM, Mike Snitzer wrote:
> On Mon, Aug 16 2010 at 12:52pm -0400,
> Tejun Heo <tj@kernel.org> wrote:
> 
>> From: Tejun Heo <tj@kernle.org>
>>
>> This patch converts dm to support REQ_FLUSH/FUA instead of now
>> deprecated REQ_HARDBARRIER.
> 
> What tree does this patch apply to?  I know it doesn't apply to
> v2.6.36-rc1, e.g.: http://git.kernel.org/linus/708e929513502fb0

(from the head message)
These patches are on top of

  block#for-2.6.36-post (c047ab2dddeeafbd6f7c00e45a13a5c4da53ea0b)
+ block-replace-barrier-with-sequenced-flush patchset[1]
+ block-fix-incorrect-bio-request-flag-conversion-in-md patch[2]

and available in the following git tree.

  git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git flush-fua

[1] http://thread.gmane.org/gmane.linux.kernel/1022363
[2] http://thread.gmane.org/gmane.linux.kernel/1023435

Probably fetching the git tree is the easist way to review?

>> For bio-based dm,
>> * -EOPNOTSUPP retry logic dropped.
>
> That logic wasn't just about retries (at least not in the latest
> kernel).  With commit 708e929513502fb0 the -EOPNOTSUPP checking also
> serves to optimize the barrier+discard case (when discards aren't
> supported).

With the patch applied, there's no second flush.  Those requests would
now be REQ_FLUSH + REQ_DISCARD.  The first can't be avoided anyway and
there won't be the second flush to begin with, so I don't think this
worsens anything.

>> * Nothing much changes.  It just needs to handle FLUSH requests as
>>   before.  It would be beneficial to advertise FUA capability so that
>>   it can propagate FUA flags down to member request_queues instead of
>>   sequencing it as WRITE + FLUSH at the top queue.
> 
> Can you expand on that TODO a bit?  What is the mechanism to propagate
> FUA down to a DM device's members?  I'm only aware of propagating member
> devices' features up to the top-level DM device's request-queue (not the
> opposite).
> 
> Are you saying that establishing the FUA capability on the top-level DM
> device's request_queue is sufficient?  If so then why not make the
> change?

Yeah, I think it would be enough to always advertise FLUSH|FUA if the
member devices support FLUSH (regardless of FUA support).  The reason
why I didn't do it was, umm, laziness, I suppose.

>> Lightly tested linear, stripe, raid1, snap and crypt targets.  Please
>> proceed with caution as I'm not familiar with the code base.
> 
> This is concerning...

Yeap, I want you to be concerned. :-) This was the first time I looked
at the dm code and there are many different disjoint code paths and I
couldn't fully follow or test all of them, so it definitely needs a
careful review from someone who understands the whole thing.

> if we're to offer more comprehensive review I think we need more
> detail on what guided your changes rather than details of what the
> resulting changes are.

I'll try to explain it.  If you have any further questions, please let
me know.

* For common part (dm-io, dm-log):

  * Empty REQ_HARDBARRIER is converted to empty REQ_FLUSH.

  * REQ_HARDBARRIER w/ data is converted either to data + REQ_FLUSH +
    REQ_FUA or data + REQ_FUA.  The former is the safe equivalent
    conversion but there could be cases where ther latter is enough.

* For bio based dm:

  * Unlike REQ_HARDBARRIER, REQ_FLUSH/FUA doesn't have any ordering
    requirements.  Remove assumptions of ordering and/or draining.

    A related question: Is dm_wait_for_completion() used in
    process_flush() safe against starvation under continuous influx of
    other commands?

  * As REQ_FLUSH/FUA doesn't require any ordering of requests before
    or after it, on array devices, the latter part - REQ_FUA - can be
    handled like other writes.  ie. REQ_FLUSH needs to be broadcasted
    to all devices but once that is complete the data/REQ_FUA bio can
    be sent to only the affected devices.  This needs some care as
    there are bio cloning/splitting code paths where REQ_FUA bit isn't
    preserved.

  * Guarantee that REQ_FLUSH w/ data never reaches targets (this in
    part is to put it in alignment with request based dm).

* For request based dm:

  * The sequencing is done by the block layer for the top level
    request_queue, so the only things request based dm needs to make
    sure is 1. handling empty REQ_FLUSH correctly (block layer will
    only send down empty REQ_FLUSHes) and 2. propagate REQ_FUA bit to
    member devices.

Thanks.

-- 
tejun

  reply	other threads:[~2010-08-17  9:33 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-08-16 16:51 [RFC PATCHSET block#for-2.6.36-post] block: convert to REQ_FLUSH/FUA Tejun Heo
2010-08-16 16:51 ` [PATCH 1/5] block/loop: implement REQ_FLUSH/FUA support Tejun Heo
2010-08-16 16:52 ` [PATCH 2/5] virtio_blk: " Tejun Heo
2010-08-16 18:33   ` Christoph Hellwig
2010-08-17  8:17     ` Tejun Heo
2010-08-17 13:23       ` Christoph Hellwig
2010-08-17 16:22         ` Tejun Heo
2010-08-18 10:22         ` Rusty Russell
2010-08-17  1:16   ` Rusty Russell
2010-08-17  8:18     ` Tejun Heo
2010-08-19 15:14   ` [PATCH 2/5 UPDATED] virtio_blk: drop REQ_HARDBARRIER support Tejun Heo
2010-08-16 16:52 ` [PATCH 3/5] lguest: replace VIRTIO_F_BARRIER support with VIRTIO_F_FLUSH/FUA support Tejun Heo
2010-08-19 15:15   ` [PATCH 3/5] lguest: replace VIRTIO_F_BARRIER support with VIRTIO_F_FLUSH support Tejun Heo
2010-08-16 16:52 ` [PATCH 4/5] md: implment REQ_FLUSH/FUA support Tejun Heo
2010-08-24  5:41   ` Neil Brown
2010-08-25 11:22     ` [PATCH UPDATED " Tejun Heo
2010-08-25 11:42       ` Neil Brown
2010-08-16 16:52 ` [PATCH 5/5] dm: implement " Tejun Heo
2010-08-16 19:02   ` Mike Snitzer
2010-08-17  9:33     ` Tejun Heo [this message]
2010-08-17 13:13       ` Christoph Hellwig
2010-08-17 14:07       ` Mike Snitzer
2010-08-17 16:51         ` Tejun Heo
2010-08-17 18:21           ` Mike Snitzer
2010-08-18  6:32             ` Tejun Heo
2010-08-19 10:32           ` Kiyoshi Ueda
2010-08-19 15:45             ` Tejun Heo
2010-08-18  9:53 ` [RFC PATCHSET block#for-2.6.36-post] block: convert to REQ_FLUSH/FUA Christoph Hellwig
2010-08-18 14:26   ` James Bottomley
2010-08-18 14:33     ` Christoph Hellwig
2010-08-19 15:37     ` FUJITA Tomonori
2010-08-19 15:41       ` Christoph Hellwig
2010-08-19 15:56         ` FUJITA Tomonori
2010-08-23 16:47 ` Christoph Hellwig
2010-08-24  9:51   ` Lars Ellenberg
2010-08-24 15:45   ` Philipp Reisner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4C6A5780.2090100@kernel.org \
    --to=tj@kernel.org \
    --cc=James.Bottomley@suse.de \
    --cc=chris.mason@oracle.com \
    --cc=dm-devel@redhat.com \
    --cc=hare@suse.de \
    --cc=hch@lst.de \
    --cc=jack@suse.cz \
    --cc=jaxboe@fusionio.com \
    --cc=konishi.ryusuke@lab.ntt.co.jp \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-ide@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-raid@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=mst@redhat.com \
    --cc=neilb@suse.de \
    --cc=rusty@rustcorp.com.au \
    --cc=rwheeler@redhat.com \
    --cc=snitzer@redhat.com \
    --cc=swhiteho@redhat.com \
    --cc=tj@kernle.org \
    --cc=tytso@mit.edu \
    --cc=vst@vlnb.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).