From: NeilBrown <neilb@suse.com>
To: Shaohua Li <shli@fb.com>
Cc: linux-raid@vger.kernel.org, songliubraving@fb.com,
hch@infradead.org, dan.j.williams@intel.com, Kernel-team@fb.com
Subject: Re: [PATCH V4 00/13] MD: a caching layer for raid5/6
Date: Tue, 14 Jul 2015 08:22:54 +1000
Message-ID: <20150714082254.3889ef43@noble>
In-Reply-To: <20150710174835.GA1837928@devbig257.prn2.facebook.com>
On Fri, 10 Jul 2015 10:48:45 -0700 Shaohua Li <shli@fb.com> wrote:
> On Fri, Jul 10, 2015 at 04:42:09PM +1000, NeilBrown wrote:
> > On Thu, 9 Jul 2015 22:18:15 -0700 Shaohua Li <shli@fb.com> wrote:
> >
> > > On Fri, Jul 10, 2015 at 03:10:44PM +1000, NeilBrown wrote:
> > > > On Thu, 9 Jul 2015 21:52:43 -0700 Shaohua Li <shli@fb.com> wrote:
> > > >
> > > > > On Fri, Jul 10, 2015 at 02:36:56PM +1000, NeilBrown wrote:
> > > > > > On Thu, 9 Jul 2015 21:08:49 -0700 Shaohua Li <shli@fb.com> wrote:
> > > > > >
> > > >
> > > > > > There is also the issue of what action commits a previous transaction.
> > > > > > I'm not sure what you had. I'm suggesting that each metadata block
> > > > > > commits previous transactions. Is that a close-enough match to what
> > > > > > you had?
> > > > >
> > > > > What did you mean by a transaction? In my implementation, a metadata
> > > > > block and the stripe data/parity that follow it make up an io unit. io
> > > > > units can finish out of order, but if an io unit has a flush request (the
> > > > > data has flush/flush bio or metadata is a flush block), the io unit can
> > > > > only start after all previous io units and the disk cache flush finish.
> > > > > Such an io unit is strictly ordered. The log patch describes this
> > > > > behavior. Does it match?
> > > >
> > > > Yes, a "transaction" is an "io unit". The flushing is the same.
> > > > I just couldn't remember how, when reading the log on restart, you
> > > > determined if a given "io unit" was reliably consistent, or whether it
> > > > should be ignored (having possibly only partially been written).
> > >
> > > The metadata block has a checksum covering the block itself. The
> > > data/parity checksums are stored in the metadata block. This way we can
> > > tell whether metadata and data are consistent.
> > >
> >
> > OK .. though I'm not totally sold on the value of checksums. When a
> > checksum doesn't match, that means something. When a checksum does
> > match, it could just be a co-incidence.
> > I'd rather have a process that made checksums unnecessary, and only use
> > the checksums as a double-check.
>
> We could do something like: write metadata/data, wait, write another
> metadata. The second metadata indicates the first is on disk. But this
> could hurt performance significantly.
The performance consideration is why I suggested a double-buffered
approach. Write metadata1, data1, metadata2, data2, then don't write
metadata3 until metadata1 and data1 have been written.
I haven't actually tried that so I don't know for certain it would help.
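
A minimal sketch of that gating rule, as a toy model only and not code from the
patch set. The class name, the `depth` parameter and the helper
`implied_durable` are all hypothetical. It shows that if unit N may only be
submitted once unit N-2 is fully written, then seeing metadata N in the log
implies every unit up to N-2 is complete, even when units finish out of order.

```python
class DoubleBufferedLog:
    """Toy model: io unit N may be submitted only after unit N - depth is durable."""

    def __init__(self, depth=2):
        self.depth = depth          # hypothetical knob; 2 gives double buffering
        self.submitted = 0          # highest sequence number submitted so far
        self.completed = set()      # sequence numbers fully written to the log

    def can_submit(self):
        # The unit that gates the next submission (none for the first `depth` units).
        gate = self.submitted + 1 - self.depth
        return gate < 1 or gate in self.completed

    def submit(self):
        if not self.can_submit():
            raise RuntimeError("wait for the gating io unit to complete")
        self.submitted += 1
        return self.submitted

    def complete(self, seq):
        # Units may complete in any order.
        if not 1 <= seq <= self.submitted:
            raise ValueError("unknown io unit")
        self.completed.add(seq)


def implied_durable(highest_metadata_seen, depth=2):
    """Units that must be complete if metadata `highest_metadata_seen` was written."""
    return list(range(1, max(0, highest_metadata_seen - depth) + 1))
```

The cost is that the writer stalls whenever the older of the two outstanding
units is slow, which is why the benefit would need measuring.
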
> I think checksum should be fine. It
> might be just a coincidence, but the rate should be extremely low. jbd2 is
> using checksum too now.
Maybe I'll have a look at jbd2 - do you know what sort of checksum it
uses? I'd be surprised if it didn't use something quite a bit stronger
than crc32 for a task like this.
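
For reference, the checksum-based recovery check described in the quoted text
could be sketched as below. This is an illustration under assumptions: the
record layout (sequence number, payload count, one crc32 per payload, then a
crc32 over the metadata itself) and the function names are invented here and
are not the on-disk format of the patch set. crc32 is used only because it is
the checksum named in this thread.

```python
import struct
import zlib

HEADER = "<QI"                      # hypothetical layout: u64 seq, u32 payload count
HEADER_SIZE = struct.calcsize(HEADER)


def make_unit(seq, payloads):
    """Build a toy io unit: (metadata block, list of data/parity payloads)."""
    body = struct.pack(HEADER, seq, len(payloads))
    body += b"".join(struct.pack("<I", zlib.crc32(p)) for p in payloads)
    # The metadata block carries a checksum over its own contents.
    return body + struct.pack("<I", zlib.crc32(body)), list(payloads)


def unit_is_consistent(meta, payloads):
    """True if the metadata checksum and every payload checksum match."""
    if len(meta) < HEADER_SIZE + 4:
        return False
    body, stored = meta[:-4], struct.unpack("<I", meta[-4:])[0]
    if zlib.crc32(body) != stored:
        return False
    _seq, count = struct.unpack_from(HEADER, body, 0)
    if count != len(payloads) or len(body) != HEADER_SIZE + 4 * count:
        return False
    sums = struct.unpack_from("<%dI" % count, body, HEADER_SIZE)
    return all(zlib.crc32(p) == s for p, s in zip(payloads, sums))


def recover(units):
    """Replay io units in log order, stopping at the first inconsistent one."""
    good = []
    for meta, payloads in units:
        if not unit_is_consistent(meta, payloads):
            break
        good.append(struct.unpack_from("<Q", meta, 0)[0])
    return good
```

A mismatch reliably marks a torn unit. A match is only probabilistic evidence
that the unit is intact, which is the concern raised earlier in the thread.
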
NeilBrown
>
> Thanks,
> Shaohua