Linux RAID subsystem development
From: NeilBrown <neilb@suse.com>
To: Shaohua Li <shli@fb.com>
Cc: linux-raid@vger.kernel.org, songliubraving@fb.com,
	hch@infradead.org, dan.j.williams@intel.com, Kernel-team@fb.com
Subject: Re: [PATCH V4 00/13] MD: a caching layer for raid5/6
Date: Tue, 14 Jul 2015 08:22:54 +1000
Message-ID: <20150714082254.3889ef43@noble>
In-Reply-To: <20150710174835.GA1837928@devbig257.prn2.facebook.com>

On Fri, 10 Jul 2015 10:48:45 -0700 Shaohua Li <shli@fb.com> wrote:

> On Fri, Jul 10, 2015 at 04:42:09PM +1000, NeilBrown wrote:
> > On Thu, 9 Jul 2015 22:18:15 -0700 Shaohua Li <shli@fb.com> wrote:
> > 
> > > On Fri, Jul 10, 2015 at 03:10:44PM +1000, NeilBrown wrote:
> > > > On Thu, 9 Jul 2015 21:52:43 -0700 Shaohua Li <shli@fb.com> wrote:
> > > > 
> > > > > On Fri, Jul 10, 2015 at 02:36:56PM +1000, NeilBrown wrote:
> > > > > > On Thu, 9 Jul 2015 21:08:49 -0700 Shaohua Li <shli@fb.com> wrote:
> > > > > > 
> > > > 
> > > > > > There is also the issue of what action commits a previous transaction.
> > > > > > I'm not sure what you had.  I'm suggesting that each metadata block
> > > > > > commits previous transactions.  Is that a close-enough match to what
> > > > > > you had?
> > > > > 
> > > > > What do you mean by a transaction? In my implementation, a metadata
> > > > > block and the stripe data/parity that follow it make up an io unit. io
> > > > > units can finish out of order, but if an io unit has a flush request
> > > > > (the data has flush/flush bio or the metadata is a flush block), the io
> > > > > unit can only start after all previous io units and the disk cache flush
> > > > > have finished. Such an io unit is strictly ordered. The log patch
> > > > > describes this behavior. Does it match?
> > > > 
> > > > Yes, a "transaction" is an "io unit".  The flushing is the same.
> > > > I just couldn't remember how, when reading the log on restart, you
> > > > determined if a given "io unit" was reliably consistent, or whether it
> > > > should be ignored (having possibly only partially been written).
> > > 
> > > The metadata block has a checksum covering the block's own data. The
> > > data/parity checksums are stored in the metadata block. This way we can
> > > tell whether metadata and data are consistent.
> > > 
> > 
> > OK .. though I'm not totally sold on the value of checksums.  When a
> > checksum doesn't match, that means something.  When a checksum does
> > match, it could just be a co-incidence.
> > I'd rather have a process that made checksums unnecessary, and only use
> > the checksums as a double-check.
> 
> We could do something like: write metadata/data, wait, write another
> metadata. The second metadata indicates the first is on disk. But this
> can hurt performance a lot.

The performance consideration is why I suggested a double-buffered
approach.  Write metadata1, data1, metadata2, data2, then don't write
metadata3 until metadata1 and data1 have been written.
I haven't actually tried that so I don't know for certain it would help.
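
To make the ordering rule concrete, here is a minimal sketch in Python
(not kernel code, and not taken from the patch set).  The class
DoubleBufferedLog and the helper units_known_complete are hypothetical
names.  The sketch assumes io units carry increasing sequence numbers
and may complete out of order, and it reads the rule as "unit N may be
issued only once unit N-2 is on disk".  The recovery helper rests on a
further assumption: finding metadata N in the log then implies every
unit up to N-2 was fully written.

```python
class DoubleBufferedLog:
    """Toy model of the ordering rule: io unit N (metadata + data) may be
    issued only after io unit N-2 is durable, so at most two units are
    ever in flight.  Hypothetical illustration, not the raid5 cache code."""

    def __init__(self):
        self.next_seq = 1     # sequence number of the next unit to issue
        self.durable = set()  # units known to have reached stable storage

    def can_submit(self):
        """True if the next unit may be issued under the double-buffer rule."""
        seq = self.next_seq
        return seq <= 2 or (seq - 2) in self.durable

    def submit(self):
        """Issue the next io unit and return its sequence number."""
        if not self.can_submit():
            raise RuntimeError("unit %d is not yet durable" % (self.next_seq - 2))
        seq = self.next_seq
        self.next_seq += 1
        return seq

    def complete(self, seq):
        """Record that io unit `seq` is on disk (completion may be out of order)."""
        if seq >= self.next_seq or seq in self.durable:
            raise ValueError("unit %d was not in flight" % seq)
        self.durable.add(seq)


def units_known_complete(highest_seq_in_log):
    """Assumed recovery rule: metadata N was issued only after unit N-2 was
    durable, so units 1..N-2 can be trusted without consulting checksums."""
    return list(range(1, highest_seq_in_log - 1))
```

In this model only the last two io units found in the log would still
depend on their checksums at recovery time; everything older is vouched
for by the ordering itself.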

>                                    I think checksum should be fine. It
> might be just a coincidence, but the rate should be extremely low. jbd2
> is using checksum too now.

Maybe I'll have a look at jbd2 - do you know what sort of checksum it
uses?  I'd be surprised it didn't use something quite a bit stronger
than crc32 for a task like this.
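
For reference, the checksum scheme quoted above (metadata block
checksums itself and also stores the data/parity checksums) can be
sketched roughly as follows.  This is a Python illustration under
stated assumptions, not the patch's on-disk format: make_io_unit and
io_unit_is_consistent are hypothetical names, the dict layout is
invented for the example, and zlib.crc32 merely stands in for whatever
checksum the log really uses.

```python
import zlib


def make_io_unit(payloads):
    """Build a toy io unit.  The metadata body holds one 4-byte checksum per
    data/parity payload, and the metadata carries a checksum of its own body."""
    body = b"".join(zlib.crc32(p).to_bytes(4, "little") for p in payloads)
    meta = {"body": body, "checksum": zlib.crc32(body)}
    return meta, list(payloads)


def io_unit_is_consistent(meta, payloads):
    """Return True only if the metadata checksum and every payload checksum match."""
    body = meta["body"]
    if zlib.crc32(body) != meta["checksum"]:
        return False  # metadata block itself is torn or corrupt
    stored = [int.from_bytes(body[i:i + 4], "little")
              for i in range(0, len(body), 4)]
    if len(stored) != len(payloads):
        return False  # some data/parity never reached the log
    return all(zlib.crc32(p) == s for p, s in zip(payloads, stored))
```

A mismatch here proves the io unit is unusable.  A match only makes a
partial write very unlikely, which is the concern raised earlier in the
thread.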

NeilBrown


> 
> Thanks,
> Shaohua

