linux-raid.vger.kernel.org archive mirror
From: Brett Russ <brett@linux.vnet.ibm.com>
To: NeilBrown <neilb@suse.de>
Cc: linux-raid@vger.kernel.org
Subject: Re: [BUG,PATCH] raid1 behind write ordering (barrier) protection
Date: Thu, 12 Dec 2013 09:45:12 -0500
Message-ID: <52A9CBF8.3050004@linux.vnet.ibm.com>
In-Reply-To: <529D1941.6000507@linux.vnet.ibm.com>

On 12/02/2013 06:35 PM, Brett Russ wrote:
> On 12/02/2013 06:08 PM, NeilBrown wrote:
>> How about just keeping a record of whether there is a BIO_FLUSH request
>> outstanding on each "behind" leg.  While there is we don't submit new
>> requests.
>> So we have a queue of bios for each leg which are waiting for a BIO_FLUSH to
>> complete, and we send them on down as soon as it does.
>
> In these circumstances, it's MD who's created the situation, not an upper
> layer's BIO_FLUSH.  So, we can't key off of that.  Additionally, the patch below
> also fixes another issue related to BIO_FLUSH:
>
>  >>> +    /* If this is a flush/fua request don't
>  >>> +     * ever let it go "behind".  Keep all the
>  >>> +     * mirrors in sync.
>  >>> +     */
>  >>> +    if (bio_rw_flagged(bio, BIO_FLUSH | BIO_FUA)) {
>  >>> +        set_bit(R1BIO_BehindIO, &r1_bio->state);
>  >>> +        do_flush_fua =  bio->bi_rw & (BIO_FLUSH | BIO_FUA);
>  >>> +    }
>
> so we avoid the BIO_FLUSH "behind" issue that way.  This probably should be a
> separate patch...
>
> We could divide the behind write ordering problem into two:
> 1) detecting the condition to protect
> 2) protecting against that condition
>
> Solutions for (1) include:
> a) keeping a list of behind writes
> b) keeping a count of behind writes
> c) ?

One possible additional solution for (1), proposed by a colleague here, is to
use the bitmap as an indicator of an outstanding write to a region.  I fear
this may be an incompatible overloading of the bitmap's in-sync vs. out-of-sync
role, though.
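
To make the overloading concern concrete, here is a rough userspace sketch, not
taken from the patch or from the md bitmap code.  It assumes a per-region
record that keeps the sync state and an outstanding-behind-write count as two
separate fields; every name and constant in it (region_state, CHUNK_SHIFT,
NR_CHUNKS, the helper functions) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CHUNK_SHIFT 7      /* hypothetical: 128 sectors per region */
#define NR_CHUNKS   1024   /* hypothetical device size, in regions */

struct region_state {
    bool     out_of_sync;      /* the role the bitmap already has */
    uint16_t behind_inflight;  /* separate: outstanding behind writes */
};

static struct region_state regions[NR_CHUNKS];

/* Note a behind write covering sectors [sector, sector + nr). */
static void behind_write_start(uint64_t sector, uint32_t nr)
{
    uint64_t first = sector >> CHUNK_SHIFT;
    uint64_t last = (sector + nr - 1) >> CHUNK_SHIFT;

    assert(nr > 0 && last < NR_CHUNKS);
    for (uint64_t c = first; c <= last; c++)
        regions[c].behind_inflight++;
}

/* Called when the behind leg completes the same write. */
static void behind_write_end(uint64_t sector, uint32_t nr)
{
    uint64_t first = sector >> CHUNK_SHIFT;
    uint64_t last = (sector + nr - 1) >> CHUNK_SHIFT;

    assert(nr > 0 && last < NR_CHUNKS);
    for (uint64_t c = first; c <= last; c++) {
        assert(regions[c].behind_inflight > 0);
        regions[c].behind_inflight--;
    }
}

/* True if any region touched by the range has a behind write pending. */
static bool region_busy(uint64_t sector, uint32_t nr)
{
    uint64_t first = sector >> CHUNK_SHIFT;
    uint64_t last = (sector + nr - 1) >> CHUNK_SHIFT;

    assert(nr > 0 && last < NR_CHUNKS);
    for (uint64_t c = first; c <= last; c++)
        if (regions[c].behind_inflight)
            return true;
    return false;
}
```

Two things stand out in the sketch.  The in-flight count and the out_of_sync
flag change on unrelated events, so a single bit per region cannot carry both
meanings.  Detection is also only region-granular: an I/O anywhere in a busy
region is treated as overlapping, even if the sectors do not actually overlap.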

> Solutions for (2) include:
> i) blocking the I/O
> j) ?
>
> The advantages to solution (a) are:
> -nothing gets blocked unless it overlaps (previously all reads would)
> -list depth limited to max behind writes allowed (typically small)
>
> I wish there were alternatives to solution (i) but recognize that since barriers
> were removed in favor of the filesystem owning the ordering problem, MD is
> effectively assuming the role of the filesystem in this case.
>
> Thanks,
> BR
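
For reference, a rough userspace sketch of (a) combined with (i): keep a small
table of outstanding behind writes and hold back only I/O whose sector range
overlaps one of them.  This is not the code from the patch; the struct,
function names and the MAX_BEHIND cap are hypothetical, and a real version
would need locking and a wait queue rather than a boolean answer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_BEHIND 256  /* hypothetical stand-in for the max behind writes limit */

struct behind_range {
    uint64_t sector;  /* first sector of the behind write */
    uint32_t nr;      /* length in sectors, always > 0 */
    bool     in_use;
};

struct behind_tracker {
    struct behind_range slot[MAX_BEHIND];
};

/* Half-open ranges [a, a+an) and [b, b+bn); sector wraparound is ignored. */
static bool ranges_overlap(uint64_t a, uint32_t an, uint64_t b, uint32_t bn)
{
    return a < b + bn && b < a + an;
}

/* Record a behind write. Returns its slot index, or -1 if the table is full. */
static int behind_add(struct behind_tracker *t, uint64_t sector, uint32_t nr)
{
    assert(nr > 0);
    for (int i = 0; i < MAX_BEHIND; i++) {
        if (!t->slot[i].in_use) {
            t->slot[i] = (struct behind_range){
                .sector = sector, .nr = nr, .in_use = true,
            };
            return i;
        }
    }
    return -1;
}

/* Drop the record once the behind leg has completed the write. */
static void behind_done(struct behind_tracker *t, int idx)
{
    assert(idx >= 0 && idx < MAX_BEHIND && t->slot[idx].in_use);
    t->slot[idx].in_use = false;
}

/* True if a new I/O overlaps an outstanding behind write and must be held. */
static bool behind_must_wait(const struct behind_tracker *t,
                             uint64_t sector, uint32_t nr)
{
    for (int i = 0; i < MAX_BEHIND; i++) {
        const struct behind_range *r = &t->slot[i];

        if (r->in_use && ranges_overlap(r->sector, r->nr, sector, nr))
            return true;
    }
    return false;
}
```

The linear scan is only reasonable because the table depth is bounded by the
number of behind writes allowed, which is typically small.  Non-overlapping
I/O never waits, which is the advantage claimed for (a) above.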

Additional thoughts on the above, Neil?

Thanks,
BR


