Re: [RFC PATCH] dm-csum: A new device mapper target that checks data integrity

linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Alberto Bertogli <albertito@blitiri.com.ar>
To: SandeepKsinha <sandeepksinha@gmail.com>
Cc: Goswin von Brederlow <goswin-v-b@web.de>,
	linux-kernel@vger.kernel.org, dm-devel@redhat.com,
	linux-raid@vger.kernel.org, agk@redhat.com, neilb@suse.de
Subject: Re: [RFC PATCH] dm-csum: A new device mapper target that checks data integrity
Date: Fri, 26 Jun 2009 19:36:34 -0300	[thread overview]
Message-ID: <20090626223634.GG5913@blitiri.com.ar> (raw)
In-Reply-To: <37d33d830906260026t60ae1c71h981b6f3bc0165053@mail.gmail.com>

On Fri, Jun 26, 2009 at 12:56:42PM +0530, SandeepKsinha wrote:
> Hi Alberto,
> 
> + * TODO: would it be better to have M1 and M2 apart, to improve the chances of
> + * recovery in case of a failure?
> + *
> 
> How do you guarantee consistency in case of any lost writes? Your
> checksum might hold an updated value while your data block might be
> lost or written to a wrong destination?

The module aims to detect corruption, not prevent or recover from it. If the
write bio returns an error, it should be properly handled and propagated to
the upper layers.

If it returns success but writes the wrong data to the disk, then there will
be a mismatch between the checksum and the data at the destination, which will
be detected when it is read.

If a write returns success but no write ever takes place on the disk, then
dm-csum (as it is now) will not detect it; although I'm not sure if that
qualifies as on-disk data corruption or is it a disk controller issue.

> When implementing such integrity solutions, IMO, it is always
> advisable to handle such error conditions else this might lead to
> issues. Since, checksums  are very tightly coupled with the data and
> any misleading can be quite dangerous unlike parity which can be
> recovered.

The code considers the possibility of bios failing, and propagates this
information to the upper layers. If you see any part of the code that lacks
error handling, please let me know so I can fix it.

> Calculate the data's CRC, and compare it to the one found in M1. If they
> + *    match, the reading is successful. If not, compare it to the one found in
> + *    M2. If they match, the reading is successful;
> 
> Also, I hope by M1 and M2 you refer to the entry for a particular
> block in the respective IMD sector. What kind of mechanism do you use

M1 and M2 refer to whole IMD sectors.

> to determine which is younger?
> Is it the timestamp or some generation count?
>
> I assume information is per_block_entry in the IMD sectors. Which I
> don't see in your implementation?

It's per IMD sector. More specifically, struct imd_sector_header's
last_updated contains the generation count for the entire IMD sector, which is
used to determine which one is younger for updating purposes.

On reads, both IMD sectors are loaded and CRCs are verified against both.

Thanks a lot,
		Alberto

next prev parent reply	other threads:[~2009-06-26 22:36 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-05-21 16:13 [RFC PATCH] dm-csum: A new device mapper target that checks data integrity Alberto Bertogli
2009-05-21 18:17 ` Greg Freemyer
2009-05-21 19:17   ` Alberto Bertogli
2009-05-25 12:22 ` Goswin von Brederlow
2009-05-25 17:46   ` Alberto Bertogli
2009-05-26 10:33     ` Goswin von Brederlow
2009-05-26 12:52       ` Alberto Bertogli
2009-05-28 19:29         ` Goswin von Brederlow
2009-06-26  7:26           ` SandeepKsinha
2009-06-26  8:50             ` SandeepKsinha
2009-06-26 22:36             ` Alberto Bertogli [this message]
2009-06-26 22:53               ` Alan Cox
2009-06-28  0:34         ` Neil Brown
2009-06-28 15:30           ` Alberto Bertogli
2009-06-28 22:59             ` Goswin von Brederlow
2009-05-26 19:48 ` [RFC PATCH v2] " Alberto Bertogli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090626223634.GG5913@blitiri.com.ar \
    --to=albertito@blitiri.com.ar \
    --cc=agk@redhat.com \
    --cc=dm-devel@redhat.com \
    --cc=goswin-v-b@web.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-raid@vger.kernel.org \
    --cc=neilb@suse.de \
    --cc=sandeepksinha@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).