linux-raid.vger.kernel.org archive mirror
From: Bas van Schaik <bas@tuxes.nl>
To: linux-raid@vger.kernel.org
Subject: Re: Redundancy check using "echo check > sync_action": error reporting?
Date: Thu, 20 Mar 2008 16:16:08 +0100
Message-ID: <47E27FB8.1070009@tuxes.nl>
In-Reply-To: <20080320144549.GB28114@cthulhu.home.robinhill.me.uk>

Robin Hill wrote:
> On Thu Mar 20, 2008 at 03:19:08PM +0100, Bas van Schaik wrote:
>> Robin Hill wrote:
>>> On Thu Mar 20, 2008 at 02:32:37PM +0100, Bas van Schaik wrote:
>>>> Anyone able to answer the last and most important question: does it
>>>> produce a message during resync in case of corruption? That would be great!
>>> There's no explicit message produced by the md module, no.  You need to
>>> check the /sys/block/md{X}/md/mismatch_cnt entry to find out how many
>>> mismatches there are.  Similarly, following a repair this will indicate
>>> how many mismatches it thinks have been fixed (by updating the parity
>>> block to match the data blocks).
>> Marvellous! I naively assumed that the module would warn me, but that's
>> not true. Wouldn't it be appropriate to print a message to dmesg if such
>> a mismatch occurs during a check? Such a mismatch clearly means that
>> there is something wrong with your hardware lying beneath md, doesn't it?
>>
> With a RAID5 then mostly, yes - there may be errors caused by transient
> situations (interference, cosmic rays, etc) which are entirely
> independent of the hardware.  With other RAID versions it's not quite as
> clear cut.  For example with RAID1 it's possible for the in-memory data
> to have been changed between writing to each disk (especially with swap
> disks) - this isn't necessarily an issue (and certainly not a hardware
> one).
>   
Maybe I am misunderstanding something, then. In an ideal situation, the
following should hold:
 - for RAID5: the data blocks of each stripe should XOR to the parity block
 - for RAID1: all mirrors should be bit-for-bit identical

If the redundancy check encounters an anomaly, something should be fixed.
If something should be fixed, clearly something went wrong somewhere in
the past. Or can you give an example where the statements mentioned
above don't hold and nothing is wrong?
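
A minimal sketch of those two invariants, assuming equally sized blocks
held in memory as Python bytes. The function names are made up for
illustration, and this is not a description of how md implements its check:

```python
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def raid5_stripe_consistent(data_blocks, parity: bytes) -> bool:
    """RAID5 invariant: the XOR of all data blocks equals the parity block.

    Expects at least one data block; reduce() raises TypeError on an
    empty list.
    """
    return reduce(xor_blocks, data_blocks) == parity


def raid1_consistent(mirrors) -> bool:
    """RAID1 invariant: every mirror holds exactly the same bytes."""
    return all(m == mirrors[0] for m in mirrors)
```

Conceptually, a mismatch is any stripe (or mirrored block) for which the
relevant function would return False.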

>>> I've no idea whether the checkarray script you're using is checking this
>>> counter - there seems little point in having a special script if it
>>> isn't though.
>> If I understand the meaning of this counter, it would be sufficient to
>> check the value of it _before_ the check operation and compare that
>> value to the counter value _after_ the check. If the counter has
>> increased: the check has encountered some inconsistencies which should
>> be reported.
>> Please correct me if I'm wrong.
> Depends on what the previous operation was.  After a repair, the counter
> will indicate the number of errors fixed, not the number remaining.
> Theoretically, after a repair there will be no errors remaining, so any
> value (> 0) in the counter after a check would indicate an issue to be
> reported.
>   
Bottom line: I just want to know if an md check (using "echo check >
sync_action") encountered any inconsistencies. If so, in my setup that
would probably mean there is something wrong (bits flipping somewhere
between md, the bus, the NIC, the network, the NIC of a storage server,
etc.)
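
A rough sketch of how that could be scripted, assuming the sysfs layout
mentioned above (/sys/block/md{X}/md/) and assuming that sync_action reads
back "idle" once the check has finished. The function names and the polling
interval are illustrative only, and writing to sync_action needs root:

```python
import time
from pathlib import Path


def start_check(md_dir) -> None:
    """Request a redundancy check (same as: echo check > sync_action)."""
    (Path(md_dir) / "sync_action").write_text("check\n")


def is_idle(md_dir) -> bool:
    """True when no check/repair/resync is reported as running."""
    return (Path(md_dir) / "sync_action").read_text().strip() == "idle"


def mismatch_count(md_dir) -> int:
    """Current value of the mismatch_cnt counter."""
    return int((Path(md_dir) / "mismatch_cnt").read_text().strip())


def wait_and_report(md_dir, poll_seconds: float = 30.0) -> int:
    """Start a check, poll until the array is idle, return mismatch_cnt."""
    start_check(md_dir)
    while not is_idle(md_dir):
        time.sleep(poll_seconds)  # assumption: simple polling is good enough
    return mismatch_count(md_dir)
```

On a real array this might be called as wait_and_report("/sys/block/md0/md"),
where md0 is only an example device name. A return value greater than zero
after a check is what would need to be reported.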

I just don't want to be surprised by any major filesystem corruption
anymore!

Cheers,

  Bas
