From: Tejun Heo <tj@kernel.org>
To: Lars Ellenberg <lars.ellenberg@linbit.com>
Cc: Jens Axboe <axboe@kernel.dk>,
drbd-dev@lists.linbit.com, Christoph Hellwig <hch@lst.de>,
Philipp Reisner <philipp.reisner@linbit.com>,
linux-kernel@vger.kernel.org
Subject: Re: [Drbd-dev] FLUSH/FUA documentation & code discrepancy
Date: Mon, 10 Sep 2012 15:54:42 -0700
Message-ID: <20120910225442.GE7677@google.com>
In-Reply-To: <20120907084221.GD7028@soda.linbit>

Hello, Lars.

On Fri, Sep 07, 2012 at 10:42:21AM +0200, Lars Ellenberg wrote:
> We have a kernel thread that is receiving data blocks,
> and some "boundary" information (in the sense that between such
> boundaries, we have a reorder domain, where requests may reorder freely,
> but no requests may be reordered across such boundaries).

What purpose does this boundary serve? Why is it necessary? Which
driver is this?

> This same thread submits the assembled bios.
>
> With the old, stronger, BIO_RW_BARRIER implementation,
> if it was supported, we could just submit the first bio of a reorder
> domain (plus some special cases) with that flag,
> and could keep receiving -> assembling -> submitting.

Yes, but the actual request processing would continue to stall, as
the block layer would have been draining requests continuously.

> Now, we assumed that with FLUSH/FUA, we can do the same.
> And we could, as long as it is supported through the whole stack.
>
> But if it is not supported at some level in the stack, we must first drain.
>
> And since it is all "transparent", we just cannot determine
> if the whole stack does or does not support it.
>
> So we have to drain always.

The driver was relying on BARRIER for draining. As that's gone now,
if you want the same behavior, the driver would need to drain itself.
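
As a rough illustration of what draining in the driver could look like,
here is a small user-space model in Python, not kernel code. The names
DrainingSubmitter, queue(), boundary(), complete() and BOUNDARY are all
made up for this sketch, and it assumes the driver gets a completion
callback for every bio it submitted. Bios inside one reorder domain go
down immediately; anything received after a boundary is held on a
private list until everything in flight has completed.

```python
from collections import deque

BOUNDARY = object()  # hypothetical marker separating two reorder domains


class DrainingSubmitter:
    """Toy model of a driver that drains at every reorder boundary."""

    def __init__(self):
        self.in_flight = set()  # submitted to the lower device, not yet completed
        self.held = deque()     # bios and boundary markers waiting for a drain
        self.submitted = []     # order in which bios reached the lower device

    def _submit(self, bio):
        self.in_flight.add(bio)
        self.submitted.append(bio)

    def queue(self, bio):
        # If a boundary is still pending, the bio has to wait behind it.
        if self.held:
            self.held.append(bio)
        else:
            self._submit(bio)

    def boundary(self):
        # With nothing in flight and nothing held, the boundary is already
        # satisfied; otherwise record it so later bios queue up behind it.
        if self.in_flight or self.held:
            self.held.append(BOUNDARY)

    def complete(self, bio):
        # Completion callback: may finish a drain and release held bios.
        self.in_flight.remove(bio)
        self._release()

    def _release(self):
        while self.held:
            if self.held[0] is BOUNDARY:
                if self.in_flight:
                    return  # previous domain not fully completed yet
                self.held.popleft()
            else:
                self._submit(self.held.popleft())


s = DrainingSubmitter()
s.queue("a1")
s.queue("a2")
s.boundary()
s.queue("b1")    # held: a1 and a2 are still in flight
s.complete("a2")
s.complete("a1")  # drain finished, b1 is submitted now
```

The receiving thread never blocks in this model; it only appends to the
held list, which is roughly the "own list" variant mentioned further
down in the quoted mail.
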
> We did not realize that.
> In certain cases, where we submitted in the right order, and even
> indicated what we thought would amount to at least a "soft barrier"
> (reorder boundary) for the elevator, we ended up with data corruption
> because the elevator never sees these indicators, and reorders.
>
> Fine, our mistake/misunderstanding of the drain requirement.
> That's fixed now, we do always drain
> (unless specifically configured not to, where the admin takes the blame
> if that does not work on his stack).
>
>
> To always drain is also a performance hit, as we would rather keep
> receiving data and assembling bios and submitting them.

Is the performance hit measurable? Block BARRIER support had some
optimizations, but it still had to drain constantly all the same.

> We can possibly work around that by introducing an additional submitter thread,
> or at least our own list where we queue assembled bios until the lower
> level device queue drains.
>
> But we'd rather have the elevator see the FLUSH/FUA,
> and treat them as at least a soft barrier/reorder boundary.
>
> I may be wrong here, but all the necessary bits for this seem to be in
> place already, if the information would even reach the elevator in one
> way or other, and not be completely stripped away early.
>
> What would you rather see, the elevator recognizing reorder boundaries?
> Or additional higher level queueing and extra thread/work queue/whatever?
>
> Both are fine with me, I'm just asking for an opinion.

First of all, using FLUSH/FUA for such a purpose is an error-prone
abuse. You're trying to exploit an implementation detail which may
change at any time. I think what you want is to be able to specify
REQ_SOFTBARRIER on bio submission, which shouldn't be too hard, but I'm
still lost as to why this is necessary. Can you please explain it a bit
more?

Thanks.

--
tejun