* Possible failures in raid5?
@ 2004-12-27 1:14 John McMonagle
2004-12-28 7:48 ` Peter T. Breuer
0 siblings, 1 reply; 9+ messages in thread
From: John McMonagle @ 2004-12-27 1:14 UTC (permalink / raw)
To: linux raid
I am building a server for backing up our other servers.
For now I have three 200 GB SATA drives.
I'm using Debian sarge with a 2.6.8 kernel.
It will have a UPS and be set to shut down on power failure.
I am concerned what will happen if the computer dies while writing a strip.
Is it possible that the stripe will be corrupted?
If so, will the rest of the RAID array be OK?
If so, is there anything one can do about it?
Thanks
John
* Re: Possible failures in raid5?
2004-12-27 1:14 Possible failures in raid5? John McMonagle
@ 2004-12-28 7:48 ` Peter T. Breuer
2004-12-28 23:23 ` John McMonagle
0 siblings, 1 reply; 9+ messages in thread
From: Peter T. Breuer @ 2004-12-28 7:48 UTC (permalink / raw)
To: linux-raid
John McMonagle <johnm@advocap.org> wrote:
> I am concerned what will happen if the computer dies while writing a strip.
Well, the strip will be partially written.
> Is it possible that the stripe will be corrupted?
No - it will be partially written. The parity data may or may not be
consistent with the real data at that point, and if you lose a disk
before the next resync (at next reboot) you may get some different data
reconstructed using parity than if you had used the data itself. OTOH
if you reboot with all disks intact the parity will be reconstructed
properly and the inconsistency will be removed. But any missed writes
will remain missing.
> If so, will the rest of the RAID array be OK?
Apart from that strip? I'm not sure exactly what you mean by "strip"
(probably some raid jargon? Don't they use "chunks" and "stripes"?)!
But clearly the data inconsistency will be confined to it.
> If so, is there anything one can do about it?
? Nothing has happened - you have simply not written all the
data you wanted to write. This happens all the time when writing to
disks and crashing your computer! But since raid writes at least twice,
once for data and once for parity, you have the extra possibility of
having missed some of the parity data (or having written parity but not
data). That produces an inconsistency in the redundant data. You can
fix it by rebooting with all disks intact. Hey - you even fix the
inconsistency by rebooting with one missing; you just have less chance
of reconstructing the intended data that way.
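(If you want to see that resync happening after the reboot, something along
these lines shows it - just an illustration, assuming the array is /dev/md0:)

  cat /proc/mdstat          # shows a progress indicator while the resync runs
  mdadm --detail /dev/md0   # state should go back to clean once parity is consistent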
Peter
* Re: Possible failures in raid5?
2004-12-28 7:48 ` Peter T. Breuer
@ 2004-12-28 23:23 ` John McMonagle
2004-12-28 23:40 ` Peter T. Breuer
2004-12-29 3:35 ` Guy
0 siblings, 2 replies; 9+ messages in thread
From: John McMonagle @ 2004-12-28 23:23 UTC (permalink / raw)
To: Peter T. Breuer; +Cc: linux-raid
Thanks Peter
If one is using a journaling file system like ext3, will that fix the
problem area?
My concern was that writes are being done in larger chunks than the
written data, but possibly that is not a problem.
Another issue is that it's good to know about any disk failures as soon as
possible.
I see someone has suggested running something like
dd if=/dev/sda of=/dev/null bs=64k
I have used that before and it does work. What I would prefer is to run
it periodically and email on failures.
I haven't seen any scripts for this.
Not that hard to do, but I'm wondering how to detect errors in the script.
When I ran it by hand in the past I got SCSI errors even if the read
ultimately worked.
Kind of hard to test without a bad drive on hand :)
If one redirects errors to a file, will the SCSI or IDE errors go to the
file?
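Something along these lines is roughly what I have in mind - just an untested
sketch, with the device list and mail address made up:

  #!/bin/sh
  # untested sketch: read each disk end to end, mail a report if dd fails
  ADMIN=root@localhost
  for dev in /dev/sda /dev/sdb /dev/sdc; do
      log=/tmp/ddcheck.$(basename $dev)
      if ! dd if=$dev of=/dev/null bs=64k 2>"$log"; then
          mail -s "read test failed on $dev" $ADMIN < "$log"
      fi
      rm -f "$log"
  done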
John
* Re: Possible failures in raid5?
2004-12-28 23:23 ` John McMonagle
@ 2004-12-28 23:40 ` Peter T. Breuer
2004-12-29 3:35 ` Guy
1 sibling, 0 replies; 9+ messages in thread
From: Peter T. Breuer @ 2004-12-28 23:40 UTC (permalink / raw)
To: linux-raid
John McMonagle <johnm@advocap.org> wrote:
> If one is using a journaling file system like ext3, will that fix the
> problem area?
No. Journalling systems perform no magic - they merely promise that
the file systems they run are consistent, not that they contain the
right data! They don't and can't compensate for corruptions in the
medium below them.
And the raid system lies below them.
But be careful not to put the journal on the raid system too, if you do
use a journalling file system! That would be extremely nasty.
> My concern was that writes are being done in larger chunks than the
> written data, but possibly that is not a problem.
?? I don't understand what you are saying here. Writes write what is
written. No more and no less.
Are you referring to the fact that block devices write in units of 1KB
(or other)? That simply means that in order to write 1B the kernel has
to read 1KB, modify it, then write 1KB back to the device.
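(To make that concrete: writing a single byte at offset 5000 on a device with
1KB blocks means reading block 4, which covers bytes 4096-5119, changing byte
904 within it, and writing the whole 1KB block back.)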
> Another issue is that it's good to know about any disk failures as soon as
> possible.
Oh, you will!
> I see someone has suggested running something like
> dd if=/dev/sda of=/dev/null bs=64k
Well that's a read test. You might just as well run e2fsck -c, which
does the same. But it doesn't fix anything, whereas you would fix it if
you simply wrote the block you couldn't read. The drive would fill it
in with a spare.
But why? You can interrogate the disk directly about its condition!
That's what SMART means.
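E.g., with the smartmontools package (device name purely for illustration):

  smartctl -H /dev/hda   # the drive's own health self-assessment
  smartctl -a /dev/hda   # full attribute table and error log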
> If one redirects errors to a file, will the SCSI or IDE errors go to the
> file?
Eh? What do you mean by "an error"? Anyway, whatever you redirect will
go wherever you redirect it to! That's what redirection means.
Peter
* RE: Possible failures in raid5?
2004-12-28 23:23 ` John McMonagle
2004-12-28 23:40 ` Peter T. Breuer
@ 2004-12-29 3:35 ` Guy
2004-12-29 21:38 ` Luca Berra
1 sibling, 1 reply; 9+ messages in thread
From: Guy @ 2004-12-29 3:35 UTC (permalink / raw)
To: 'John McMonagle', 'Peter T. Breuer'; +Cc: linux-raid
For testing the disk(s) status, look at smartd.
My disks are too old, but newer disks should support this.
I run a Seagate tool every night. I believe my disks relocate bad blocks on
read, but only if they can be read with error correction.
Before I started using the Seagate tool, I used dd.
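The idea would be a couple of lines in /etc/smartd.conf roughly like these
(untested here, since my own disks are too old for it; devices and address
made up):

  /dev/hda -H -m root@localhost   # mail a warning if the health check fails
  /dev/hdc -H -m root@localhost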
Guy
* Re: Possible failures in raid5?
2004-12-29 3:35 ` Guy
@ 2004-12-29 21:38 ` Luca Berra
2004-12-29 22:53 ` Guy
2004-12-29 23:40 ` Brad Campbell
0 siblings, 2 replies; 9+ messages in thread
From: Luca Berra @ 2004-12-29 21:38 UTC (permalink / raw)
To: linux-raid
On Tue, Dec 28, 2004 at 10:35:33PM -0500, Guy wrote:
>For testing the disk(s) status, look at smartd.
>My disks are too old, but newer disks should support this.
except that libata does not support S.M.A.R.T.
L.
--
Luca Berra -- bluca@comedia.it
Communication Media & Services S.r.l.
/"\
\ / ASCII RIBBON CAMPAIGN
X AGAINST HTML MAIL
/ \
* RE: Possible failures in raid5?
2004-12-29 21:38 ` Luca Berra
@ 2004-12-29 22:53 ` Guy
2004-12-29 23:40 ` Brad Campbell
1 sibling, 0 replies; 9+ messages in thread
From: Guy @ 2004-12-29 22:53 UTC (permalink / raw)
To: 'Luca Berra', linux-raid
Well then...
Well, humm...
Well, that's just dumb! :)
dd rules!
* Re: Possible failures in raid5?
2004-12-29 21:38 ` Luca Berra
2004-12-29 22:53 ` Guy
@ 2004-12-29 23:40 ` Brad Campbell
2004-12-30 6:53 ` Luca Berra
1 sibling, 1 reply; 9+ messages in thread
From: Brad Campbell @ 2004-12-29 23:40 UTC (permalink / raw)
To: Luca Berra; +Cc: linux-raid
Luca Berra wrote:
> On Tue, Dec 28, 2004 at 10:35:33PM -0500, Guy wrote:
>
>> For testing the disk(s) status, look at smartd.
>> My disks are too old, but newer disks should support this.
>
> except that libata does not support S.M.A.R.T.
There is an experimental patch in the libata-dev tree that appears (and I mean it works for me) to
support SMART reliably on a UP kernel. There is a known issue with SMP kernels ATM. I have been
using it every 20 minutes on a reasonably loaded server with 13 SATA and 1 ATA disks now for what
must be several months with no issues. (It has helped detect a pending disk failure quite nicely.)
I guess it will stay experimental until someone with an SMP machine can assist Andy in debugging the
SCSI passthrough code (it's not SMART-specific at this stage).
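For what it's worth, this is roughly what I poll from cron with the patched
kernel (device name just an example; -d ata tells smartctl to send ATA
commands through the SCSI layer):

  smartctl -d ata -H /dev/sda   # health check on a SATA disk behind libata
  smartctl -d ata -a /dev/sda   # full SMART attributes and error log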
--
Brad
/"\
Save the Forests \ / ASCII RIBBON CAMPAIGN
Burn a Greenie. X AGAINST HTML MAIL
/ \
* Re: Possible failures in raid5?
2004-12-29 23:40 ` Brad Campbell
@ 2004-12-30 6:53 ` Luca Berra
0 siblings, 0 replies; 9+ messages in thread
From: Luca Berra @ 2004-12-30 6:53 UTC (permalink / raw)
To: linux-raid
On Thu, Dec 30, 2004 at 03:40:22AM +0400, Brad Campbell wrote:
>Luca Berra wrote:
>>On Tue, Dec 28, 2004 at 10:35:33PM -0500, Guy wrote:
>>
>>>For testing the disk(s) status, look at smartd.
>>>My disks are too old, but newer disks should support this.
>>
>>except that libata does not support S.M.A.R.T.
>
>There is an experimental patch in the libata-dev tree that appears (and I
>mean it works for me) to support SMART reliably on a UP kernel. There is a
>known issue with SMP kernels ATM. I have been using it every 20 minutes on
>a reasonably loaded server with 13 SATA and 1 ATA disks now for what must
>be several months with no issues. (It has helped detect a pending disk
>failure quite nicely.)
>
>I guess it will stay experimental until someone with an SMP machine can
>assist Andy in debugging the SCSI passthrough code (it's not SMART-specific
>at this stage).
Thanks for the info, I will check it out (is there a snapshot of the
libata-dev tree for non-BK users? /pub/linux/kernel/people/jgarzik/libata on
kernel.org seems to be a bit behind).
Unfortunately I don't own an SMP machine with SATA drives.
L.
--
Luca Berra -- bluca@comedia.it
Communication Media & Services S.r.l.
/"\
\ / ASCII RIBBON CAMPAIGN
X AGAINST HTML MAIL
/ \
Thread overview: 9+ messages
2004-12-27 1:14 Possible failures in raid5? John McMonagle
2004-12-28 7:48 ` Peter T. Breuer
2004-12-28 23:23 ` John McMonagle
2004-12-28 23:40 ` Peter T. Breuer
2004-12-29 3:35 ` Guy
2004-12-29 21:38 ` Luca Berra
2004-12-29 22:53 ` Guy
2004-12-29 23:40 ` Brad Campbell
2004-12-30 6:53 ` Luca Berra