From: NeilBrown <neilb@suse.com>
To: Shaohua Li <shli@fb.com>
Cc: linux-raid@vger.kernel.org, songliubraving@fb.com,
hch@infradead.org, dan.j.williams@intel.com, Kernel-team@fb.com
Subject: Re: [PATCH V4 00/13] MD: a caching layer for raid5/6
Date: Tue, 14 Jul 2015 08:22:54 +1000
Message-ID: <20150714082254.3889ef43@noble>
In-Reply-To: <20150710174835.GA1837928@devbig257.prn2.facebook.com>
On Fri, 10 Jul 2015 10:48:45 -0700 Shaohua Li <shli@fb.com> wrote:
> On Fri, Jul 10, 2015 at 04:42:09PM +1000, NeilBrown wrote:
> > On Thu, 9 Jul 2015 22:18:15 -0700 Shaohua Li <shli@fb.com> wrote:
> >
> > > On Fri, Jul 10, 2015 at 03:10:44PM +1000, NeilBrown wrote:
> > > > On Thu, 9 Jul 2015 21:52:43 -0700 Shaohua Li <shli@fb.com> wrote:
> > > >
> > > > > On Fri, Jul 10, 2015 at 02:36:56PM +1000, NeilBrown wrote:
> > > > > > On Thu, 9 Jul 2015 21:08:49 -0700 Shaohua Li <shli@fb.com> wrote:
> > > > > >
> > > >
> > > > > > There is also the issue of what action commits a previous transaction.
> > > > > > I'm not sure what you had. I'm suggesting that each metadata block
> > > > > > commits previous transactions. Is that a close-enough match to what
> > > > > > you had?
> > > > >
> > > > > What did you mean about a transaction? In my implementation, a metadata
> > > > > block and the stripe data/parity that follows it make up an io unit. io units
> > > > > can finish out of order, but if an io unit has a flush request (the data has
> > > > > flush/flush bio or metadata is a flush block), the io unit can only
> > > > > start after all previous io units and the disk cache flush finish. Such an io
> > > > > unit is strictly ordered. The log patch describes this behavior. Does it
> > > > > match?
> > > >
> > > > Yes, a "transaction" is an "io unit". The flushing is the same.
> > > > I just couldn't remember how, when reading the log on restart, you
> > > > determined if a given "io unit" was reliably consistent, or whether it
> > > > should be ignored (having possibly only partially been written).
> > >
> > > The metadata block has a checksum for the data of the block. data/parity have
> > > their checksums stored in the metadata block. This way we can know whether
> > > metadata and data are consistent.
> > >
> >
> > OK .. though I'm not totally sold on the value of checksums. When a
> > checksum doesn't match, that means something. When a checksum does
> > match, it could just be a co-incidence.
> > I'd rather have a process that made checksums unnecessary, and only use
> > the checksums as a double-check.
>
> We could do something like: write metadata/data, wait, write another
> metadata block. The second metadata block indicates the first is on disk.
> But this can hurt performance a lot.
The performance consideration is why I suggested a double-buffered
approach. Write metadata1, data1, metadata2, data2, then don't write
metadata3 until metadata1 and data1 have been written.
I haven't actually tried that so I don't know for certain it would help.
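
To make the ordering rule concrete, here is a rough user-space model of
the double-buffered idea in Python. It is only a sketch, not code from
the patch set: the class DoubleBufferedLog, the function
implied_durable() and the "depth" parameter are names made up for this
illustration. It assumes io unit N may be issued only once unit N-2 is
fully on disk, so finding metadata N in the log implies every unit up
to N-2 is complete.

```python
class DoubleBufferedLog:
    """Toy model: io unit N may only be issued once unit N - depth is durable."""

    def __init__(self, depth=2):
        self.depth = depth      # hypothetical knob; 2 gives double buffering
        self.next_seq = 1       # sequence number of the next io unit
        self.durable = set()    # units whose metadata and data are on disk

    def can_submit(self):
        gate = self.next_seq - self.depth
        # The first `depth` units have nothing to wait for.
        return gate < 1 or gate in self.durable

    def submit(self):
        if not self.can_submit():
            raise RuntimeError("must wait for an older io unit to reach disk")
        seq = self.next_seq
        self.next_seq += 1
        return seq

    def complete(self, seq):
        # Completions may arrive out of order.
        self.durable.add(seq)


def implied_durable(seen_seq, depth=2):
    """Units known to be complete once metadata `seen_seq` is found in the log."""
    return list(range(1, seen_seq - depth + 1))


log = DoubleBufferedLog()
first = log.submit()
second = log.submit()
blocked_with_two_in_flight = not log.can_submit()
log.complete(second)                 # unit 2 finishing first does not help
still_blocked = not log.can_submit()
log.complete(first)                  # unit 1 on disk unblocks unit 3
third = log.submit()
```

The point of gating on a specific older unit, rather than on the number
of units in flight, is that completions can be reordered: unit 2
finishing early must not allow unit 3 to be written while unit 1 is
still outstanding.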
> I think checksum should be fine. It
> might be just a coincidence, but the rate should be extremely low. jbd2 is
> using checksum too now.
Maybe I'll have a look at jbd2 - do you know what sort of checksum it
uses? I'd be surprised if it didn't use something quite a bit stronger
than crc32 for a task like this.
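
For reference, the checksum scheme described above can be modelled
roughly as follows. This is a sketch under stated assumptions: the
on-disk layout (a sequence number, a payload count, one crc32 per
payload, then a crc32 over the metadata body) and the names build_meta()
and unit_is_consistent() are invented for illustration and are not the
format used by the patches. It uses Python's zlib.crc32 simply because
it is readily available.

```python
import struct
import zlib

HEADER = "<QI"                          # hypothetical: 64-bit seq, 32-bit count
HEADER_SIZE = struct.calcsize(HEADER)   # 12 bytes, no padding with "<"


def build_meta(seq, payloads):
    """Build a metadata block carrying one crc32 per data/parity payload."""
    body = struct.pack(HEADER, seq, len(payloads))
    body += b"".join(struct.pack("<I", zlib.crc32(p)) for p in payloads)
    # The trailing checksum covers the metadata body itself.
    return body + struct.pack("<I", zlib.crc32(body))


def unit_is_consistent(meta, payloads):
    """Return True only if the metadata and every payload checksum match."""
    if len(meta) < HEADER_SIZE + 4:
        return False
    body = meta[:-4]
    (stored,) = struct.unpack("<I", meta[-4:])
    if zlib.crc32(body) != stored:
        return False                    # metadata block itself is torn
    _seq, count = struct.unpack_from(HEADER, body, 0)
    if count != len(payloads) or len(body) != HEADER_SIZE + 4 * count:
        return False
    sums = struct.unpack_from("<%dI" % count, body, HEADER_SIZE)
    return all(zlib.crc32(p) == s for p, s in zip(payloads, sums))


pages = [b"a" * 4096, b"b" * 4096]
meta = build_meta(7, pages)
torn_pages = [pages[0], b"b" * 4095 + b"x"]   # last byte never reached disk
```

A mismatch here is conclusive, but a match is only probabilistic: for
arbitrary stale contents a 32-bit checksum would match by chance about
once in 2^32 attempts, which is the coincidence I am wary of.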
NeilBrown
>
> Thanks,
> Shaohua