From: "George Spelvin" <linux@horizon.com>
To: iordan@cdf.toronto.edu, linux@horizon.com
Cc: linux-raid@vger.kernel.org
Subject: Re: 3-way mirrors
Date: 7 Sep 2010 14:49:17 -0400
Message-ID: <20100907184917.8181.qmail@science.horizon.com>
In-Reply-To: <4C866354.3050904@cdf.toronto.edu>

George Spelvin wrote:
>> Anyway, one nice property of a 2-drive redundancy (3+-way mirror or
>> RAID-6) is error detection: in case of a mismatch, it's possible to
>> finger the offending drive.

> When we see a mismatch_cnt > 0, we would run a dd/cmp script which would 
> detect the drive and sector which is mismatched (i.e. we would craft a 
> script which runs three dd processes in parallel, reading from each 
> drive, and compares the data).

> When an inconsistency is discovered, we would have the sector which 
> doesn't match, and which drive it's on. However, even at 60MB/s, this 
> would take 5 hours to perform with our 1TB drives. So, it would be much 
> better if we could do this while we are up, somehow.

That was my hope, for the md software to do it automatically.

>> My understanding of the current code is that it just copies one mirror
>> (the first readable?) to the others.  Does someone have a patch to vote
>> on the data?  If not, can someone point me at the relevant bit of code
>> and orient me enough that I can create it?

> Resyncing an entire drive is probably not necessary with a mismatch, 
> because you already know the rest of the drive is synced and can simply 
> manually force a particular sector to match.

Ideally, I'd like ZFS-like checksums on the data, with a mismatch triggering
a read of all mirrors and a reconstruction attempt.  With that, a silently
corrupted sector on RAID-5 can be pinpointed and fixed.

But in the meantime, I'd like check/repair passes to tell me if 2 of the 3
mirrors agree, so I can blame the third.
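
A minimal userspace sketch of that 2-of-3 vote, not the md code itself.
The name find_odd_mirror is hypothetical, and the three buffers are
assumed to hold the same sector range as read from each mirror:

```c
#include <stddef.h>
#include <string.h>

/*
 * Compare the same range read from three mirrors.
 * Returns 0, 1 or 2: index of the mirror that disagrees with the other two.
 * Returns -1 if all three agree, -2 if all three differ (no majority).
 */
int find_odd_mirror(const void *m0, const void *m1, const void *m2, size_t len)
{
	int eq01 = memcmp(m0, m1, len) == 0;
	int eq02 = memcmp(m0, m2, len) == 0;
	int eq12 = memcmp(m1, m2, len) == 0;

	if (eq01 && eq02)
		return -1;	/* all copies identical */
	if (eq01)
		return 2;	/* mirrors 0 and 1 outvote mirror 2 */
	if (eq02)
		return 1;	/* mirrors 0 and 2 outvote mirror 1 */
	if (eq12)
		return 0;	/* mirrors 1 and 2 outvote mirror 0 */
	return -2;		/* three different copies: nobody to blame */
}
```

The -2 case is kept separate because a vote cannot pick a winner when
every copy is different; that still needs a human or a checksum.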

>> (The other thing I'd love is a more advanced that can accept a
>> block number found by "check" as a parameter to "repair" so I don't have
>> to wait while the array is re-scanned.  Um... I suppose this depends on
>> a local patch I have that logs the sector numbers of mismatches.)

> Yes, but don't you run the risk of syncing the "bad" data from the 
> mismatch drive to the other two drives if you do this automatically? 
> Don't you also need a parameter to specify which drive to sync from?

That's why I wanted the voting, so the RAID software could decide
automatically.  I don't see a practical way to identify the correct
block contents in isolation, although mapping up to a logical file
may find a file which can be checked for consistency.

(But debugfs takes forever to run icheck + ncheck on a large filesystem.)

> At any rate, if the mismatch sector(s) are also logged during the array 
> check, then resyncing this sector by hand would be easy and fast with 
> minimal downtime. It would be great to have this functionality to start 
> with.

I use the following patch.  Note that it reports the offset in 512-byte
sectors within a single component; multiply by the number of data drives
and divide by sectors per block to get a block offset within the RAID
array.

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index d1d6891..2dcffcd 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -1363,6 +1363,8 @@ static void sync_request_write(mddev_t *mddev, r10bio_t *r10_bio)
 					break;
 			if (j == vcnt)
 				continue;
+			printk(KERN_INFO "%s: Mismatch at sector %llu\n",
+			    mdname(mddev), (unsigned long long)r10_bio->sector);
 			mddev->resync_mismatches += r10_bio->sectors;
 		}
 		if (test_bit(MD_RECOVERY_CHECK, &mddev->recovery))
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 96c6902..a0a0b08 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -2732,6 +2732,8 @@ static void handle_parity_checks5(raid5_conf_t *conf, struct stripe_head *sh,
 			 */
 			set_bit(STRIPE_INSYNC, &sh->state);
 		else {
+			printk(KERN_INFO "%s: Mismatch at sector %llu\n",
+			    mdname(conf->mddev), (unsigned long long)sh->sector);
 			conf->mddev->resync_mismatches += STRIPE_SECTORS;
 			if (test_bit(MD_RECOVERY_CHECK, &conf->mddev->recovery))
 				/* don't try to repair!! */
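
The sector-to-block conversion described before the patch can be
sketched as below. This is only the rule of thumb stated there; the
function name is hypothetical, and it assumes a filesystem block that
is a whole number of 512-byte sectors. It ignores chunk layout and any
data offset, so treat the result as approximate:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a per-component offset (in 512-byte sectors, as logged by the
 * patch) into an approximate block offset within the whole array:
 * multiply by the number of data drives, divide by sectors per block.
 */
uint64_t component_sector_to_array_block(uint64_t component_sector,
					 unsigned int data_drives,
					 unsigned int sectors_per_block)
{
	assert(data_drives > 0 && sectors_per_block > 0);
	return component_sector * data_drives / sectors_per_block;
}
```

For example, a logged sector of 1000 on an array with 2 data drives and
4 KiB blocks (8 sectors per block) gives 1000 * 2 / 8 = block 250.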
