RAID1 robust read and read/write correct and EVMS-BBR

linux-raid.vger.kernel.org archive mirror

* RAID1 robust read and read/write correct and EVMS-BBR
@ 2005-02-23 19:55 Nagpure, Dinesh
  2005-02-23 20:07 ` J. David Beutel
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Nagpure, Dinesh @ 2005-02-23 19:55 UTC (permalink / raw)
  To: 'evms-devel@lists.sourceforge.net'
  Cc: 'linux-raid@vger.kernel.org'

Hi,

I noticed the discussion about robust read on the RAID list and a similar one
on the EVMS list, so I am sending this mail to both lists. Latent media
faults, which prevent data from being read from portions of a disk, have
always been a concern for us. Such faults go undetected until the affected
block is read. RAID 1 depends on error-free mirrors for proper operation, and
undiscovered bad blocks give only an illusion of duplexity when in
reality the array should be degraded. Over the long run all the mirrors might
develop latent media faults, and then none can be replaced with a new disk.
Also, it is a disaster if the same block goes bad on all the mirrors in a
RAID 1 volume. With this concern in mind we developed what we call a
"disk-scrubber". The approach is to proactively look for bad spots on the
disk and, when one is discovered, read the correct data from the other mirror
and use it to repair the disk by way of a write. SCSI disks automatically
repair bad spots on write by internally remapping them to spare sectors
(being SCSI-centric might be one limitation of this solution).
The implementation consists of a thread that looks for bad spots by way of a
slow, repeated, continuous scan through all disks. The RAID error management
was extended to attempt a repair on a read error from a RAID 1 array, which
permits fixing user-discovered bad spots as well as those discovered by the
scrubber. The work is currently based on Linux kernel 2.4.26.
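
A rough sketch of the scrub-and-repair idea described above, written as a
small Python simulation rather than kernel code. It is not the 2.4.26
implementation: the `Disk` class, `ReadError`, and `scrub_pass` are
hypothetical names invented for illustration, and a write is simply assumed
to clear the bad spot, standing in for the drive remapping the sector to a
spare.

```python
class ReadError(Exception):
    """Raised when a simulated disk cannot read a block."""


class Disk:
    """Hypothetical in-memory disk: a list of blocks plus latent bad blocks."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.bad = set()  # block numbers that currently fail on read

    def read(self, n):
        if n in self.bad:
            raise ReadError(n)
        return self.blocks[n]

    def write(self, n, data):
        # Assumption: writing lets the drive remap the sector, so the
        # block becomes readable again.
        self.blocks[n] = data
        self.bad.discard(n)


def scrub_pass(mirrors):
    """Scan every block of every mirror once, repairing from a good copy.

    Returns (repaired, unrecoverable), each a list of (disk_index, block).
    """
    repaired, unrecoverable = [], []
    nblocks = len(mirrors[0].blocks)
    for n in range(nblocks):
        for i, disk in enumerate(mirrors):
            try:
                disk.read(n)
                continue  # block is readable, nothing to do
            except ReadError:
                pass
            # Read failed: look for a good copy on any other mirror.
            for j, other in enumerate(mirrors):
                if j == i:
                    continue
                try:
                    good = other.read(n)
                except ReadError:
                    continue
                disk.write(n, good)  # repair by rewriting the bad block
                repaired.append((i, n))
                break
            else:
                # No mirror could supply the data for this block.
                unrecoverable.append((i, n))
    return repaired, unrecoverable
```

Calling `scrub_pass` over and over at low priority would correspond to the
slow continuous scan. A block that is unreadable on every mirror cannot be
repaired this way and is only reported.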

I can go back and put together a patch over the weekend if anyone is
interested in using it. 

-dinesh
dinesh.nagpure@stratus.com


* Re: RAID1 robust read and read/write correct and EVMS-BBR
  2005-02-23 19:55 RAID1 robust read and read/write correct and EVMS-BBR Nagpure, Dinesh
@ 2005-02-23 20:07 ` J. David Beutel
  2005-02-23 20:52 ` Guy
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 6+ messages in thread
From: J. David Beutel @ 2005-02-23 20:07 UTC (permalink / raw)
  To: Nagpure, Dinesh; +Cc: 'linux-raid@vger.kernel.org'

Nagpure, Dinesh wrote, on 2005-Feb-23 9:55 AM:

>I can go back and put together a patch over the weekend if anyone is
>interested in using it. 
>  
>

Yes, please, I'm very interested in using it.

Cheers,
11011011


* RE: RAID1 robust read and read/write correct and EVMS-BBR
  2005-02-23 19:55 RAID1 robust read and read/write correct and EVMS-BBR Nagpure, Dinesh
  2005-02-23 20:07 ` J. David Beutel
@ 2005-02-23 20:52 ` Guy
  2005-02-23 21:01 ` Peter T. Breuer
  2005-02-23 21:22 ` bernd
  3 siblings, 0 replies; 6+ messages in thread
From: Guy @ 2005-02-23 20:52 UTC (permalink / raw)
  To: 'Nagpure, Dinesh', evms-devel; +Cc: linux-raid

This is very good!  But most of my disk space is RAID5.  Any chance you have
similar plans for RAID5?

Thanks,
Guy


* Re: RAID1 robust read and read/write correct and EVMS-BBR
  2005-02-23 19:55 RAID1 robust read and read/write correct and EVMS-BBR Nagpure, Dinesh
  2005-02-23 20:07 ` J. David Beutel
  2005-02-23 20:52 ` Guy
@ 2005-02-23 21:01 ` Peter T. Breuer
  2005-02-23 21:22 ` bernd
  3 siblings, 0 replies; 6+ messages in thread
From: Peter T. Breuer @ 2005-02-23 21:01 UTC (permalink / raw)
  To: linux-raid; +Cc: evms-devel

In gmane.linux.raid Nagpure, Dinesh <Dinesh.Nagpure@stratus.com> wrote:
> I noticed the discussion about robust read on the RAID list and similar one
> on the EVMS list so I am sending this mail to both the lists. Latent media
> faults which prevent data from being read from portions of a disk has always
> been a concern for us. Such faults will go undetected till the time that
> block is read.

Well, sure, unless you have some other test. Finding latent faults is
always a question of making them come out into the open. But do you
want to? Testing something to destruction does not make it more useful.

> RAID 1 depends on error free mirrors for proper operation and

Err, if one mirror has a read error you can always read from another one
instead.

> undiscovered bad blocks would only give pseudo illusion of duplexity when in

Well, undiscovered bad blocks are just that, nice and crypto! But I
take your point. The problem with your reasoning however is that it is
not raid-specific - undiscovered errors in ANYTHING are a problem
waiting to be discovered :).

Should we be concerned about that? Sometimes yes, sometimes no.

When we shouldn't be concerned about it is when our aim is merely to DO
BETTER.

When we should be concerned about it is when our aim is to BE PERFECT.

Personally, I am only looking to do better.

> reality the array should be degraded.

Why should we degrade a perfectly good mirror just because one of the
disks has a read error on a particular sector?  You've lost me there!

> Over long run all the mirrors might
> develop latent media faults

Sure they might.  But it's not a crime to have faults!  We all have them.
We don't kill ourselves as soon as we develop a blackhead, which seems
to be what you are suggesting!

Personally I'd launch resyncs every so often. Since robust-read makes
the array tolerant of read faults during resync too, you will reduce
the number of errors by 1/n (i.e. get rid of 50% of the errors in a
2-disk array) every time you do this.

And/or you can also help develop the write-correct addition to the
robust-read patch to make the read errors get corrected on the fly.

>  and none can be replaced with a new disk.

Sure they can. Whenever you like. But why?

> Also
> it is a disaster if the same block goes bad on all the mirrors in a RAID 1
> volume.

No it's not. It's an error. It's no worse than a block going bad on a
single disk. The world doesn't cave in when that happens. It takes
longer to happen on a 2 disk system because one needs to get both disks
with errors in the same place. So the 2 disk raid is a lot BETTER.
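
The comparison above can be made concrete with a back-of-the-envelope
calculation. This is only a sketch under a strong assumption that is not
stated in the thread: each block on each disk goes bad independently with
the same probability `p`. The function names and the example numbers are
hypothetical.

```python
def p_block_unreadable(p, n_mirrors):
    """Probability that one given block is bad on every mirror at once."""
    return p ** n_mirrors


def expected_unreadable_blocks(p, n_mirrors, n_blocks):
    """Expected number of blocks with no good copy left on any mirror."""
    return n_blocks * p_block_unreadable(p, n_mirrors)


if __name__ == "__main__":
    p = 1e-6            # hypothetical per-block fault probability
    n_blocks = 10 ** 8  # hypothetical volume size in blocks
    for n in (1, 2, 3):
        print(n, "mirror(s):", expected_unreadable_blocks(p, n, n_blocks))
```

Real drives in one array may share age, batch and workload, so their faults
are probably not independent and the true gain is likely smaller than this
estimate suggests.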

> With this concern we developed what we call "disk-scrubber". The

Well, then you are up a gum-tree, because your concerns appear to be
ill-reasoned. That's not to say that there isn't merit in what you
might now propose, but it won't be fully justified by your reasoning so
far, if it is what you have shown!

> approach was to proactively seek for bad spots on the disk and when one is
> discovered, read the correct data from the other mirror and use it to repair

There's nothing wrong with that, if you like your disk humming away
doing a resync in the background. One can do that. Just keep the raid1d
resync thread occupied. There are several possible strategies.

But I wouldn't say you "developed" this! Isn't it a standard tactic in
classical raid to do background tests and syncs? I thought the idea was
to combat the tendency of raid to develop errors that cannot be detected
by the array itself afterwards!

> the disk by way of a write. SCSI disks automatically repair bad spots on
> write by internally mapping the bad spots to spare sectors (Being SCSI

So do IDE. You seem to be a bit behind the times. Surely that's been
the case for at least five years? Or more?

> centric might be one limitation of this solution).

I don't think so.

> The implementation comprised of a thread that looks for bad spots by way of
> slow repeated continuous scan through all disks.

Brilliant, but it's trivial to make the resync thread active the whole
time.

> The RAID error management
> was extended to attempt a repair on read error from a RAID 1 array to permit
> fixing of user discovered bad spots as well as those discovered by the

Well, I'd like to see how you did that bit. I've only suggested code to
do it, not actually tried it!

> scrubber. The work is lk2.4.26 based as of now.
> 
> I can go back and put together a patch over the weekend if anyone is
> interested in using it. 

Go "back"? I don't understand .. how do you actually have the work if
not as a patch? But yes - of course I would be interested. Please show
the patch as soon as possible! Looks like a combined patch is in order!

Peter


* Re: RAID1 robust read and read/write correct and EVMS-BBR
  2005-02-23 19:55 RAID1 robust read and read/write correct and EVMS-BBR Nagpure, Dinesh
                   ` (2 preceding siblings ...)
  2005-02-23 21:01 ` Peter T. Breuer
@ 2005-02-23 21:22 ` bernd
  2005-02-23 21:30   ` Peter T. Breuer
  3 siblings, 1 reply; 6+ messages in thread
From: bernd @ 2005-02-23 21:22 UTC (permalink / raw)
  To: Nagpure Dinesh; +Cc: linux-raid

....
>I can go back and put together a patch over the weekend if anyone is
>interested in using it. 
>
>-dinesh
>dinesh.nagpure@stratus.com
>-

Oh yes, please make this patch. We are very very interested in it!

We are waiting for the one day where the same block on all mirrors has
read problems. Ok, we're now waiting for about 15 years because the
HPUX mirror strategy is the same. Quite a long time without disaster
but it will happen (till today Murphy was right in any case but one:-)).
If anything happens to a disk one must be warned as soon as possible.

Thanks
B. Rieke (bernd@rhm.de)


* Re: RAID1 robust read and read/write correct and EVMS-BBR
  2005-02-23 21:22 ` bernd
@ 2005-02-23 21:30   ` Peter T. Breuer
  0 siblings, 0 replies; 6+ messages in thread
From: Peter T. Breuer @ 2005-02-23 21:30 UTC (permalink / raw)
  To: linux-raid

bernd@rhm.de wrote:
> We are waiting for the one day where the same block on all mirrors has
> read problems. Ok, we're now waiting for about 15 years because the
> HPUX mirror strategy is the same. Quite a long time without disaster
> but it will happen (till today Murphy was right in any case but one:-)).
> If anything happens to a disk one must be warned as soon as possible.

Sarky :).

(or did I mean "scary"? |-)

Peter

