linux-raid.vger.kernel.org archive mirror
From: Jaromir Capik <jcapik@redhat.com>
To: stan@hardwarefreak.com
Cc: Linux RAID <linux-raid@vger.kernel.org>
Subject: Re: [RFE] Please, add optional RAID1 feature (= chunk checksums) to make it more robust
Date: Mon, 23 Jul 2012 05:34:17 -0400 (EDT)
Message-ID: <542194327.593466.1343036057599.JavaMail.root@redhat.com>
In-Reply-To: <500CD32E.4000800@hardwarefreak.com>

Hello Stan.

I received your reply without the Linux RAID list in Cc,
and thus I was unsure whether you wanted to discuss this privately or not.
I always choose "reply to all" unless I really want to remove
some of the recipients :]

Cheers,
Jaromir.

> 
> Please keep discussion on list.  This is probably an MUA issue.  Happens
> to me on occasion when I hit "reply to list" instead of "reply to all".
> vger doesn't provide a List-Post: header so "reply to list" doesn't
> work and you end up replying to the sender.
> 
> On 7/22/2012 5:11 PM, Jaromir Capik wrote:
> >>> I admit that the problem could lie elsewhere ... but that doesn't
> >>> change the fact that the data became corrupted without
> >>> me noticing.
> >>
> >> The key here I think is "without me noticing that".  Drives normally cry
> >> out in the night, spitting errors to logs, when they encounter problems.
> >> You may not receive an immediate error in your application, especially
> >> when the drive is a RAID member and the data can be shipped regardless
> >> of the drive error.  If you never check your logs, or simply don't see
> >> these disk errors, how will you know there's a problem?
> > 
> > Hello Stan.
> > 
> > I used to periodically check logs as well as S.M.A.R.T. attributes.
> > And I believe I've already mentioned two of the cases and how
> > I finally discovered the issues. Moreover, I switched from manual
> > checking to receiving emails from monitoring daemons. And even
> > if you receive such an email, it usually takes some time to replace
> > the failing drive. That time window might be fatal for your data
> > if junk is read from one of the drives and the read is then followed
> > by a write. Such a write would destroy the second, correct copy ...
> > 
> >>
> >> Likewise, if the checksumming you request is implemented in md/RAID1,
> >> and your application never sees a problem when a drive heads South, and
> >> you never check your logs and thus don't see the checksum errors...
> > 
> > You wouldn't have to ... because the corrupted chunks would be
> > immediately resynced with good data and you'd REALLY get some errors
> > in the logs, even if the hard drive, the controller or its driver doesn't
> > produce them for whatever reason.
> > 
> >>
> >> How is this new checksumming any better than the current situation?
> >> The drive is still failing and you're still unaware of it.
> > 
> > Do you believe that other causes of silent data corruption simply
> > do not exist? Try to imagine a case where the correct data isn't
> > written at all to one of the drives due to a bug in the drive's firmware,
> > a bug in the controller design, a bug in the controller driver, or
> > some other reason. Such a bug could be triggered by anything ... it could
> > be a delay in the read operation when the sector is not well readable,
> > or any race condition, etc. Especially new devices and their very first
> > versions are expected to be buggy. Checksumming would prevent them all
> > and would make the whole I/O really bulletproof.
> 
> 
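
The repair-on-read idea from the quoted text (verify each chunk against a
stored checksum, return a copy that verifies, rewrite any copy that does not,
and log the event) can be sketched roughly as follows. This is only a toy
in-memory model, not how md works or would implement it: the class name
ChecksummedMirror, the 4 KiB chunk size, the use of CRC32, and the single
checksum table that is assumed to survive intact are all assumptions made
for illustration.

```python
import zlib

CHUNK_SIZE = 4096  # assumed chunk size, chosen only for this sketch


class ChecksummedMirror:
    """Toy RAID1-like mirror that keeps one CRC32 per chunk (hypothetical)."""

    def __init__(self, num_chunks, copies=2):
        blank = bytes(CHUNK_SIZE)
        # Each "disk" is just a list of chunk-sized byte strings.
        self.disks = [[blank] * num_chunks for _ in range(copies)]
        # Assumption: checksums live in one table that is itself reliable.
        self.sums = [zlib.crc32(blank)] * num_chunks
        self.log = []

    def write(self, index, data):
        if len(data) != CHUNK_SIZE:
            raise ValueError("data must be exactly one chunk long")
        self.sums[index] = zlib.crc32(data)
        for disk in self.disks:
            disk[index] = data

    def read(self, index):
        expected = self.sums[index]
        good = None
        bad = []
        # Check every copy so that all damaged copies get repaired.
        for n, disk in enumerate(self.disks):
            if zlib.crc32(disk[index]) == expected:
                if good is None:
                    good = disk[index]
            else:
                bad.append(n)
        if good is None:
            raise IOError(f"chunk {index}: no copy matches its checksum")
        for n in bad:
            # Resync the damaged copy from the verified one and log it.
            self.disks[n][index] = good
            self.log.append(f"chunk {index}: checksum mismatch on disk {n}, repaired")
        return good
```

In this model, overwriting one disk's copy of a chunk behind the mirror's
back and then calling read() returns the data that was originally written,
restores the damaged copy, and leaves one entry in log. If every copy fails
verification, read() raises instead of returning junk.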
