* what does md do if it finds an inconsistency?
@ 2007-05-06 0:45 martin f krafft
2007-05-06 9:06 ` martin f krafft
0 siblings, 1 reply; 7+ messages in thread
From: martin f krafft @ 2007-05-06 0:45 UTC (permalink / raw)
To: linux-raid mailing list
Neil,
With the check feature in recent md versions, the question came up
of what happens when an inconsistency is found. Does md fix it? If
so, which disk does it assume to be wrong?
Cheers,
--
martin; (greetings from the heart of the sun.)
\____ echo mailto: !#^."<*>"|tr "<*> mailto:" net@madduck
spamtraps: madduck.bogus@madduck.net
"frank harris has been received
in all the great houses -- once!"
-- oscar wilde
* Re: what does md do if it finds an inconsistency?
2007-05-06 0:45 what does md do if it finds an inconsistency? martin f krafft
@ 2007-05-06 9:06 ` martin f krafft
2007-05-06 11:07 ` Eyal Lebedinsky
0 siblings, 1 reply; 7+ messages in thread
From: martin f krafft @ 2007-05-06 9:06 UTC (permalink / raw)
To: linux-raid mailing list
also sprach martin f krafft <madduck@madduck.net> [2007.05.06.0245 +0200]:
> With the check feature in recent md versions, the question came up
> of what happens when an inconsistency is found. Does md fix it? If
> so, which disk does it assume to be wrong?
What I meant was of course
echo repair > sync_action
I am unsure what happens:
piper:/sys/block/md7/md# cat mismatch_cnt
128
piper:/sys/block/md7/md# echo repair > sync_action
piper:/sys/block/md7/md# cat sync_action
idle
piper:/sys/block/md7/md# cat mismatch_cnt
128
If I run the repair again, mismatch_cnt goes to 0, but not after the first run.
md7 : active raid10 sda2[0] sdc2[2] sdb2[1]
1373376 blocks 64K chunks 2 near-copies [3/3] [UUU]
--
martin; (greetings from the heart of the sun.)
\____ echo mailto: !#^."<*>"|tr "<*> mailto:" net@madduck
spamtraps: madduck.bogus@madduck.net
"the thought of suicide is a great consolation: by means of it one
gets successfully through many a bad night."
- friedrich nietzsche
* Re: what does md do if it finds an inconsistency?
2007-05-06 9:06 ` martin f krafft
@ 2007-05-06 11:07 ` Eyal Lebedinsky
2007-05-06 13:36 ` martin f krafft
0 siblings, 1 reply; 7+ messages in thread
From: Eyal Lebedinsky @ 2007-05-06 11:07 UTC (permalink / raw)
To: martin f krafft; +Cc: linux-raid mailing list
martin f krafft wrote:
> also sprach martin f krafft <madduck@madduck.net> [2007.05.06.0245 +0200]:
>
>>With the check feature in recent md versions, the question came up
>>of what happens when an inconsistency is found. Does md fix it? If
>>so, which disk does it assume to be wrong?
>
>
> What I meant was of course
>
> echo repair > sync_action
>
> I am unsure what happens:
>
> piper:/sys/block/md7/md# cat mismatch_cnt
> 128
> piper:/sys/block/md7/md# echo repair > sync_action
> piper:/sys/block/md7/md# cat sync_action
> idle
> piper:/sys/block/md7/md# cat mismatch_cnt
> 128
>
> If I run the repair again, mismatch_cnt goes to 0, but not after the first run.
>
> md7 : active raid10 sda2[0] sdc2[2] sdb2[1]
> 1373376 blocks 64K chunks 2 near-copies [3/3] [UUU]
The first time it reports that it found (and repaired) 128 items.
It does not mean that you now *have* 128 mismatches.
The next run ('repair' or 'check') will find none (hopefully...)
and report zero.
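
The sequence described above (start a pass, wait for it to finish, then
read the counter) could be sketched in shell roughly as follows. This is
only a sketch under assumptions: the function name run_sync_action, the
polling interval, and the short initial delay are made up for
illustration, and the sysfs directory of the array is passed in as a
parameter (for example /sys/block/md7/md).

```sh
#!/bin/sh
# Sketch: run one 'check' or 'repair' pass on an md array and print the
# mismatch count recorded for that pass. run_sync_action is a
# hypothetical helper name, not part of mdadm.

run_sync_action() {
    md_dir=$1      # sysfs directory of the array, e.g. /sys/block/md7/md
    action=$2      # 'check' only reads; 'repair' also rewrites mismatches

    case $action in
        check|repair) ;;
        *) echo "unsupported action: $action" >&2; return 2 ;;
    esac

    echo "$action" > "$md_dir/sync_action" || return 1

    # Assumption: give the kernel a moment to start the pass, so that an
    # early 'idle' reading is not mistaken for completion.
    sleep 1

    # Poll until the array reports 'idle' again.
    while [ "$(cat "$md_dir/sync_action")" != "idle" ]; do
        sleep 5
    done

    # mismatch_cnt now describes the pass that just finished.
    cat "$md_dir/mismatch_cnt"
}
```

Called twice with 'repair', the first call would print what was found
(and repaired) and the second should print 0, matching the behaviour
described above. Note that 'repair' writes to the component devices.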
--
Eyal Lebedinsky (eyal@eyal.emu.id.au) <http://samba.org/eyal/>
attach .zip as .dat
* Re: what does md do if it finds an inconsistency?
2007-05-06 11:07 ` Eyal Lebedinsky
@ 2007-05-06 13:36 ` martin f krafft
2007-05-06 15:59 ` Gavin McCullagh
2007-05-08 13:27 ` Bill Davidsen
0 siblings, 2 replies; 7+ messages in thread
From: martin f krafft @ 2007-05-06 13:36 UTC (permalink / raw)
To: linux-raid mailing list
> The first time it reports that it found (and repaired) 128 items.
> It does not mean that you now *have* 128 mismatches.
>
> The next run ('repair' or 'check') will find none (hopefully...)
> and report zero.
Oh, this makes perfect sense, thanks for the explanation.
As the mdadm maintainer for Debian, I would like to come up with a way to
handle mismatches somewhat intelligently. I already have the check
sync_action run once a month on all machines by default (can be turned
on/off via debconf), and now I would like to find a good way to react when
mismatch_cnt is non-zero. I don't want to write to the components
without the admin's consent though.
Maybe the ideal way would be to have mdadm --monitor send an email on
mismatch_cnt>0 or a cronjob that regularly sends reminders, until the
admin logs in and runs e.g. /usr/share/mdadm/repairarray.
Thoughts?
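
A read-only reminder of the kind suggested above might look like the
following sketch. It only reads sysfs and never writes to the
components. The function name report_mismatches and its optional root
argument (defaulting to /sys/block) are assumptions made for
illustration.

```sh
#!/bin/sh
# Sketch of a cron-driven reminder: list md arrays whose last check or
# repair pass recorded mismatches. report_mismatches is a hypothetical
# helper; the optional argument lets it be pointed at a test directory.

report_mismatches() {
    root=${1:-/sys/block}
    for f in "$root"/md*/md/mismatch_cnt; do
        [ -r "$f" ] || continue              # glob did not match, or unreadable
        count=$(cat "$f")
        # Skip arrays with a zero (or non-numeric) counter.
        [ "$count" -gt 0 ] 2>/dev/null || continue
        array=${f#"$root"/}                  # strip the root prefix
        array=${array%%/*}                   # keep the array name, e.g. md7
        echo "$array: mismatch_cnt=$count"
    done
    return 0
}
```

Run from cron, any output would normally be mailed to the job's owner,
so the admin keeps getting reminders until the counter is back to zero.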
Also, if a mismatch is found on a RAID1, how does md decide which copy is
mismatched and which is correct? What about RAID 5/6/10?
Thanks for your time!
-martin
>
> --
> Eyal Lebedinsky (eyal@eyal.emu.id.au) <http://samba.org/eyal/>
> attach .zip as .dat
>
>
* Re: what does md do if it finds an inconsistency?
2007-05-06 13:36 ` martin f krafft
@ 2007-05-06 15:59 ` Gavin McCullagh
2007-05-07 4:08 ` Neil Brown
2007-05-08 13:27 ` Bill Davidsen
1 sibling, 1 reply; 7+ messages in thread
From: Gavin McCullagh @ 2007-05-06 15:59 UTC (permalink / raw)
To: martin f krafft; +Cc: linux-raid mailing list
On Sun, 06 May 2007, martin f krafft wrote:
> Maybe the ideal way would be to have mdadm --monitor send an email on
> mismatch_cnt>0 or a cronjob that regularly sends reminders, until the
> admin logs in and runs e.g. /usr/share/mdadm/repairarray.
>
> Also, if a mismatch is found on a RAID1, how does md decide which copy is
> mismatched and which is correct? What about RAID 5/6/10?
I think it just picks one at random. After all, how could you reliably
know which is right in a raid1 array? With raid5, I understand it just
updates the parity.
I had an idea to write an interactive userspace program which ran through
each block on each disk device to figure out which ones didn't match up and
then figure out whether it's within allocated filesystem space and if so,
which file or filesystem data was affected. This would hopefully enable a
user to figure out which block is wrong and correct things.
Gavin
* Re: what does md do if it finds an inconsistency?
2007-05-06 15:59 ` Gavin McCullagh
@ 2007-05-07 4:08 ` Neil Brown
0 siblings, 0 replies; 7+ messages in thread
From: Neil Brown @ 2007-05-07 4:08 UTC (permalink / raw)
To: Gavin McCullagh; +Cc: martin f krafft, linux-raid mailing list
On Sunday May 6, gmccullagh@gmail.com wrote:
> On Sun, 06 May 2007, martin f krafft wrote:
>
> > Maybe the ideal way would be to have mdadm --monitor send an email on
> > mismatch_cnt>0 or a cronjob that regularly sends reminders, until the
> > admin logs in and runs e.g. /usr/share/mdadm/repairarray.
You could certainly do that. If you configure mdadm to run a program
for each 'monitor' event, you can detect the mismatch count from
argv[3] when argv[1] == RebuildFinished.
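
A handler along those lines could be sketched as below. mdadm's
--program option runs the given program with the event name, the md
device, and an optional third argument. How the mismatch count is
encoded in that third argument is an assumption here, so the sketch
simply takes the first run of digits it finds. The function names are
hypothetical.

```sh
#!/bin/sh
# Sketch of a handler for 'mdadm --monitor --program=...'.
# ASSUMPTION: for RebuildFinished the third argument contains the
# mismatch count somewhere as a decimal number; the exact format is
# not taken from the mdadm documentation.

mismatches_from_event() {
    event=${1-}; extra=${3-}
    [ "$event" = "RebuildFinished" ] || return 1
    # Extract the first integer from the third argument, if any.
    count=$(printf '%s\n' "$extra" |
        sed -n 's/^[^0-9]*\([0-9][0-9]*\).*$/\1/p')
    [ -n "$count" ] || return 1
    echo "$count"
}

handle_event() {
    # Print a one-line notice only when a finished pass found mismatches.
    if count=$(mismatches_from_event "$@") && [ "$count" -gt 0 ]; then
        echo "${2-}: $count mismatches found by the last check/repair"
    fi
}
```

A real handler script would end with a call such as handle_event "$@"
and would send the notice by mail rather than just printing it.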
Though I suspect many people would be happy with running the 'repair'
every month rather than just a 'check'. Maybe that should be a config
option.
> >
> > Also, if a mismatch is found on a RAID1, how does md decide which copy is
> > mismatched and which is correct? What about RAID 5/6/10?
>
> I think it just picks one at random. After all, how could you reliably
> know which is right in a raid1 array? With raid5, I understand it just
> updates the parity.
I prefer to say "arbitrary" rather than "random".
I think the current implementation uses the first readable device as
the 'correct' one.
Otherwise, this is correct.
>
> I had an idea to write an interactive userspace program which ran through
> each block on each disk device to figure out which ones didn't match up and
> then figure out whether it's within allocated filesystem space and if so,
> which file or filesystem data was affected. This would hopefully enable a
> user to figure out which block is wrong and correct things.
>
That would be awfully difficult as doing a reverse mapping (block ->
file) is non-trivial in almost any filesystem, and you would want to
(ultimately) do it for every filesystem...
Might be educational though :-)
NeilBrown
* Re: what does md do if it finds an inconsistency?
2007-05-06 13:36 ` martin f krafft
2007-05-06 15:59 ` Gavin McCullagh
@ 2007-05-08 13:27 ` Bill Davidsen
1 sibling, 0 replies; 7+ messages in thread
From: Bill Davidsen @ 2007-05-08 13:27 UTC (permalink / raw)
To: martin f krafft; +Cc: linux-raid mailing list
martin f krafft wrote:
>> The first time it reports that it found (and repaired) 128 items.
>> It does not mean that you now *have* 128 mismatches.
>>
>> The next run ('repair' or 'check') will find none (hopefully...)
>> and report zero.
>>
>
> Oh, this makes perfect sense, thanks for the explanation.
>
> As the mdadm maintainer for Debian, I would like to come up with a way to
> handle mismatches somewhat intelligently. I already have the check
> sync_action run once a month on all machines by default (can be turned
> on/off via debconf), and now I would like to find a good way to react when
> mismatch_cnt is non-zero. I don't want to write to the components
> without the admin's consent though.
>
That sounds right. Some arrays have persistent mismatches while they are
in use, so you are unlikely to want to even attempt to take corrective
action. You might want to have a config file and just run a program
which reads the config regularly and acts based on what it finds.
--
bill davidsen <davidsen@tmr.com>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979