From: Tejun Heo <tj@kernel.org>
To: Lars Ellenberg <lars.ellenberg@linbit.com>
Cc: Jens Axboe <axboe@kernel.dk>,
drbd-dev@lists.linbit.com, Christoph Hellwig <hch@lst.de>,
Philipp Reisner <philipp.reisner@linbit.com>,
linux-kernel@vger.kernel.org
Subject: Re: [Drbd-dev] FLUSH/FUA documentation & code discrepancy
Date: Mon, 10 Sep 2012 15:54:42 -0700
Message-ID: <20120910225442.GE7677@google.com>
In-Reply-To: <20120907084221.GD7028@soda.linbit>
Hello, Lars.
On Fri, Sep 07, 2012 at 10:42:21AM +0200, Lars Ellenberg wrote:
> We have a kernel thread that is receiving data blocks,
> and some "boundary" information (in the sense that between such
> boundaries, we have a reorder domain, where requests may reorder freely,
> but no requests may be reordered across such boundaries).
What purpose does this boundary serve? Why is it necessary? Which
driver is this?
> This same thread submits the assembled bios.
>
> With the old, stronger, BIO_RW_BARRIER implementation,
> if it was supported, we could just submit the first bio of a reorder
> domain (plus some special cases) with that flag,
> and could keep receiving -> assembling -> submitting.
Yes, but the actual request processing would continue to stall as
the block layer would have been draining requests continuously.
> Now, we assumed that with FLUSH/FUA, we can do the same.
> And we could, as long as it is supported through the whole stack.
>
> But if it is not supported at some level in the stack, we must first drain.
>
> And since it is all "transparent", we just cannot determine
> if the whole stack does or does not support it.
>
> So we have to drain always.
The driver was hitching on BARRIER for draining. As that's gone now,
if you want the same behavior, the driver would need to drain itself.
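To make "the driver would need to drain itself" concrete, here is a minimal userspace sketch, not kernel code. It assumes a lower layer that may complete in-flight requests in any order, modelled here as last-in-first-out. Every name in it (`model_dev`, `model_submit`, `model_drain`, and so on) is hypothetical and exists only for illustration; none of them is a block-layer API.

```c
#include <stddef.h>

/* Hypothetical userspace model; these are NOT kernel interfaces. */
enum { MAX_INFLIGHT = 64, MAX_LOG = 256 };

struct model_dev {
    int inflight[MAX_INFLIGHT];  /* epoch tag of each in-flight request */
    size_t n_inflight;
    int completed[MAX_LOG];      /* epoch tags in completion order */
    size_t n_completed;
};

/* Submit one request tagged with its reorder-domain (epoch) number. */
static int model_submit(struct model_dev *d, int epoch)
{
    if (d->n_inflight == MAX_INFLIGHT)
        return -1;
    d->inflight[d->n_inflight++] = epoch;
    return 0;
}

/*
 * Complete one in-flight request. The lower layer is free to reorder,
 * which this model exaggerates by completing the newest request first.
 */
static int model_complete_one(struct model_dev *d)
{
    if (d->n_inflight == 0 || d->n_completed == MAX_LOG)
        return -1;
    d->completed[d->n_completed++] = d->inflight[--d->n_inflight];
    return 0;
}

/* Drain: let everything in flight complete before crossing a boundary. */
static void model_drain(struct model_dev *d)
{
    while (d->n_inflight > 0) {
        if (model_complete_one(d) != 0)
            break;  /* completion log full; stop rather than spin */
    }
}

/* Returns 1 if no request of a later epoch completed before an earlier one. */
static int model_epochs_in_order(const struct model_dev *d)
{
    for (size_t i = 1; i < d->n_completed; i++) {
        if (d->completed[i] < d->completed[i - 1])
            return 0;
    }
    return 1;
}
```

In this model, calling `model_drain()` between the last submit of one epoch and the first submit of the next keeps completions in epoch order, while submitting two epochs back to back lets the later epoch complete first. A real driver would presumably track in-flight bios with a counter and wait for it to reach zero; that part is an assumption and is not shown here.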
> We did not realize that.
> In certain cases, where we submitted in the right order, and even
> indicated what we thought would amount to at least a "soft barrier"
> (reorder boundary) for the elevator, we ended up with data corruption
> because the elevator never sees these indicators, and reorders.
>
> Fine, our mistake/misunderstanding of the drain requirement.
> That's fixed now, we do always drain
> (unless specifically configured not to, where the admin takes the blame
> if that does not work on his stack).
>
>
> To always drain is also a performance hit, as we would rather keep
> receiving data and assembling bios and submitting them.
Is the performance hit measurable? Block BARRIER support had some
optimizations but it still had to constantly drain all the same.
> We can possibly work around that by introducing an additional submitter thread,
> or at least our own list where we queue assembled bios until the lower
> level device queue drains.
>
> But we'd rather have the elevator see the FLUSH/FUA,
> and treat them as at least a soft barrier/reorder boundary.
>
> I may be wrong here, but all the necessary bits for this seem to be in
> place already, if the information would even reach the elevator in one
> way or other, and not be completely stripped away early.
>
> What would you rather see, the elevator recognizing reorder boundaries?
> Or additional higher level queueing and extra thread/work queue/whatever?
>
> Both are fine with me, I'm just asking for an opinion.
First of all, using FLUSH/FUA for such a purpose is an error-prone
abuse. You're trying to exploit an implementation detail which may
change at any time. I think what you want is to be able to specify
REQ_SOFTBARRIER on bio submission, which shouldn't be too hard but I'm
still lost why this is necessary. Can you please explain it a bit
more?
Thanks.
--
tejun