From: Brett Russ <brett@linux.vnet.ibm.com>
To: NeilBrown <neilb@suse.de>
Cc: linux-raid@vger.kernel.org
Subject: Re: [BUG,PATCH] raid1 behind write ordering (barrier) protection
Date: Thu, 12 Dec 2013 09:45:12 -0500
Message-ID: <52A9CBF8.3050004@linux.vnet.ibm.com>
In-Reply-To: <529D1941.6000507@linux.vnet.ibm.com>
On 12/02/2013 06:35 PM, Brett Russ wrote:
> On 12/02/2013 06:08 PM, NeilBrown wrote:
>> How about just keeping a record of whether there is a BIO_FLUSH request
>> outstanding on each "behind" leg. While there is we don't submit new
>> requests.
>> So we have a queue of bios for each leg which are waiting for a BIO_FLUSH to
>> complete, and we send them on down as soon as it does.
>
> In these circumstances, it's MD who's created the situation, not an upper
> layer's BIO_FLUSH. So, we can't key off of that. Additionally, the patch below
> also fixes another issue related to BIO_FLUSH:
>
> >>> + /* If this is a flush/fua request don't
> >>> + * ever let it go "behind". Keep all the
> >>> + * mirrors in sync.
> >>> + */
> >>> + if (bio_rw_flagged(bio, BIO_FLUSH | BIO_FUA)) {
> >>> + set_bit(R1BIO_BehindIO, &r1_bio->state);
> >>> + do_flush_fua = bio->bi_rw & (BIO_FLUSH | BIO_FUA);
> >>> + }
>
> so we avoid the BIO_FLUSH "behind" issue that way. This probably should be a
> separate patch...
>
> We could divide the behind write ordering problem into two:
> 1) detecting the condition to protect
> 2) protecting against that condition
>
> Solutions for (1) include:
> a) keeping a list of behind writes
> b) keeping a count of behind writes
> c) ?
One possible additional solution for (1), proposed by a colleague here, is to
use the bitmap as an indicator of an outstanding write to a region. I fear this
may be an incompatible overloading of the bitmap's in-sync vs. out-of-sync
role, though.
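To make the concern concrete, here is a rough userspace sketch of
region-granular tracking with the "outstanding behind write" state kept apart
from the sync state. Everything in it (struct region_state, CHUNK_SECTORS,
NR_CHUNKS, the region_* helpers) is made up for illustration and is not the md
bitmap's real layout or API; it also assumes len > 0 and ignores locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CHUNK_SECTORS 128u /* hypothetical region size, in sectors */
#define NR_CHUNKS     64u  /* hypothetical device size, in regions */

/* Hypothetical per-region state; NOT the md bitmap's real layout. */
struct region_state {
	bool     dirty[NR_CHUNKS];           /* in-sync vs. out-of-sync role */
	uint16_t behind_inflight[NR_CHUNKS]; /* outstanding behind writes    */
};

static uint64_t first_chunk(uint64_t start)
{
	return start / CHUNK_SECTORS;
}

static uint64_t last_chunk(uint64_t start, uint64_t len)
{
	return (start + len - 1) / CHUNK_SECTORS;
}

/* A behind write to sectors [start, start + len) is being submitted. */
static void region_behind_start(struct region_state *rs,
				uint64_t start, uint64_t len)
{
	assert(len > 0 && last_chunk(start, len) < NR_CHUNKS);
	for (uint64_t c = first_chunk(start); c <= last_chunk(start, len); c++) {
		rs->dirty[c] = true;
		rs->behind_inflight[c]++;
	}
}

/* The behind write completed. The dirty flag is deliberately left alone:
 * in this sketch, clearing it is a separate decision. */
static void region_behind_end(struct region_state *rs,
			      uint64_t start, uint64_t len)
{
	for (uint64_t c = first_chunk(start); c <= last_chunk(start, len); c++) {
		assert(rs->behind_inflight[c] > 0);
		rs->behind_inflight[c]--;
	}
}

/* True if new I/O to [start, start + len) touches a region that still has
 * a behind write outstanding. */
static bool region_must_wait(const struct region_state *rs,
			     uint64_t start, uint64_t len)
{
	for (uint64_t c = first_chunk(start); c <= last_chunk(start, len); c++)
		if (rs->behind_inflight[c])
			return true;
	return false;
}
```

Two things fall out of the sketch: the dirty flag can outlive the write, so
keying the block off of it alone would hold I/O longer than needed, and at
region granularity I/O gets blocked that doesn't overlap the behind write by
sector at all.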
> Solutions for (2) include:
> i) blocking the I/O
> j) ?
>
> The advantages to solution (a) are:
> -nothing gets blocked unless it overlaps (previously all reads would)
> -list depth limited to max behind writes allowed (typically small)
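For reference, a minimal userspace sketch of what (a) combined with (i) amounts
to: a small list of in-flight behind writes, and new I/O waits only if it
overlaps one of them. The names here (struct behind_list, MAX_BEHIND, the
behind_* helpers) are hypothetical and are not taken from the patch or from
raid1.c; locking and the actual wait/wakeup are left out, and len is assumed
to be nonzero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_BEHIND 8 /* hypothetical cap, standing in for the max behind writes limit */

/* One in-flight behind write covering sectors [start, start + len). */
struct behind_range {
	uint64_t start;
	uint64_t len;
};

/* Hypothetical per-leg tracker; not a structure from the md driver. */
struct behind_list {
	struct behind_range r[MAX_BEHIND];
	size_t count;
};

static bool ranges_overlap(uint64_t a_start, uint64_t a_len,
			   uint64_t b_start, uint64_t b_len)
{
	return a_start < b_start + b_len && b_start < a_start + a_len;
}

/* True if new I/O to [start, start + len) overlaps an in-flight behind
 * write and therefore has to wait. */
static bool behind_must_wait(const struct behind_list *bl,
			     uint64_t start, uint64_t len)
{
	for (size_t i = 0; i < bl->count; i++)
		if (ranges_overlap(bl->r[i].start, bl->r[i].len, start, len))
			return true;
	return false;
}

/* Record a behind write at submission; false means the cap is reached. */
static bool behind_add(struct behind_list *bl, uint64_t start, uint64_t len)
{
	if (bl->count == MAX_BEHIND)
		return false;
	bl->r[bl->count].start = start;
	bl->r[bl->count].len = len;
	bl->count++;
	return true;
}

/* Drop a behind write on completion; array order is not preserved. */
static void behind_del(struct behind_list *bl, uint64_t start, uint64_t len)
{
	for (size_t i = 0; i < bl->count; i++) {
		if (bl->r[i].start == start && bl->r[i].len == len) {
			bl->r[i] = bl->r[bl->count - 1];
			bl->count--;
			return;
		}
	}
	assert(!"completion for an untracked behind write");
}
```

Since the list never grows past the cap, the linear scan in behind_must_wait()
stays cheap, and a count alone (solution b) could only say that some behind
write is outstanding, not whether it overlaps.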
>
> I wish there were alternatives to solution (i) but recognize that since barriers
> were removed in favor of the filesystem owning the ordering problem, MD is
> effectively assuming the role of the filesystem in this case.
>
> Thanks,
> BR
Additional thoughts on the above, Neil?
Thanks,
BR
Thread overview: 5+ messages
2013-11-21 20:53 raid1 behind write ordering (barrier) protection Brett Russ
2013-12-02 17:13 ` [BUG,PATCH] " Brett Russ
2013-12-02 23:08 ` NeilBrown
2013-12-02 23:35 ` Brett Russ
2013-12-12 14:45 ` Brett Russ [this message]