unreadable drives can be synchronized?

linux-raid.vger.kernel.org archive mirror

* unreadable drives can be synchronized?
@ 2007-05-16 15:50 Colin McCabe
  2007-05-16 17:22 ` Bill Davidsen
  2007-05-17  0:54 ` Neil Brown
  0 siblings, 2 replies; 10+ messages in thread
From: Colin McCabe @ 2007-05-16 15:50 UTC
  To: linux-raid

Hi all,

I am running software RAID on Linux 2.6.21.

While experimenting with adding and removing devices from the RAID array, I
noticed something very troubling. I have a bad drive (let's call it drive B)
which gets random read errors. I also have a good drive, call it drive A.

B can synchronize with A. But then, if I remove A from the raid array, A
cannot be re-added. This is because the bad drive, B, cannot be read from.

Basically, B appears to be "write-only"; it will never return an error on a
write, but just try to read from it, and you will be sorry.

Writing is fine:
[root@cmccabe-devel root]# dd if=/dev/zero of=/dev/sdb bs=524288
dd: writing `/dev/sdb': No space left on device
114464+0 records in
114463+0 records out

Reading is not:
[root@cmccabe-devel root]# dd if=/dev/sdb of=/dev/null bs=524288
ata1.00: exception Emask 0x0 SAct 0x3 SErr 0x0 action 0x2 frozen
ata1.00: cmd 60/00:00:00:b0:01/01:00:00:00:00/40 tag 0 cdb 0x0 data 131072 in
[ ... copious errors ... ]

I have disabled write caching using hdparm -W0.
Both drives are: Fujitsu MHV2060BH, 60 GB, Serial ATA
The SATA controller is: ICH6

My problem is that even though B gets into the synchronized state, it is no
good at all. This is potentially misleading, and if someone removes A after
synchronizing B, the system will probably crash, since there will be no good
drives left.

I wonder if anyone else is interested in a "paranoid recovery" mode where the
md layer tests the data that has been written. Even if this doubles the
recovery time, I think that it would be desirable for many applications.
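
To illustrate the idea outside the md layer, here is a minimal shell sketch
of a write-then-read-back test. The helper names (readback_matches,
write_and_verify) and the READBACK_IFLAG variable are made up for this
example; they are not part of md or mdadm. Without direct I/O the read may
be served from the page cache instead of the disk, so on a real device
something like READBACK_IFLAG=direct (a GNU dd flag) is probably needed for
the result to mean anything.

```shell
#!/bin/sh
# Hypothetical helpers; a sketch, not an md feature.

# Compare the first N bytes of DST against SRC, where N is the size of SRC.
# Returns 0 on a match, 1 on a mismatch, short read, or read error.
readback_matches() {
    src=$1
    dst=$2
    size=$(wc -c < "$src") || return 1
    size=$((size + 0))   # normalise any whitespace padding from wc
    # READBACK_IFLAG is optional; when set (e.g. "direct") it is passed to
    # dd as iflag=... so the read bypasses the page cache (GNU dd only).
    dd if="$dst" bs=524288 ${READBACK_IFLAG:+iflag=$READBACK_IFLAG} 2>/dev/null |
        head -c "$size" | cmp -s - "$src"
}

# Write SRC onto DST, flush, then read it back and compare.
# Returns 0 if the data reads back intact, 1 if it does not,
# and 2 if the write itself failed.
write_and_verify() {
    src=$1
    dst=$2
    dd if="$src" of="$dst" bs=524288 2>/dev/null || return 2
    sync
    readback_matches "$src" "$dst"
}
```

Usage would be along the lines of `write_and_verify test.img /dev/sdX`,
where test.img and /dev/sdX are placeholders. This overwrites the target, so
it only makes sense on a drive that holds nothing of value.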

Colin


* Re: unreadable drives can be synchronized?
  2007-05-16 15:50 unreadable drives can be synchronized? Colin McCabe
@ 2007-05-16 17:22 ` Bill Davidsen
  2007-05-16 20:09   ` Colin McCabe
  2007-05-17  0:54 ` Neil Brown
  1 sibling, 1 reply; 10+ messages in thread
From: Bill Davidsen @ 2007-05-16 17:22 UTC
  To: Colin McCabe; +Cc: linux-raid

Colin McCabe wrote:
> Hi all,
>
> I am running software RAID on Linux 2.6.21.
>
> While experimenting with adding and removing devices from the RAID 
> array, I
> noticed something very troubling. I have a bad drive (let's call it 
> drive B)
> which gets random read errors. I also have a good drive, call it drive A.
>
> B can synchronize with A. But then, if I remove A from the raid array, A
> cannot be re-added. This is because the bad drive, B, cannot be read 
> from.
>
> Basically, B appears to be "write-only"; it will never return an error 
> on a
> write, but just try to read from it, and you will be sorry.
>
You may be able to recover from this (why would you do such a thing?) by
stopping the array and reassembling it with only the "good" drive, leaving
the other one out as failed. Caution: I made this up. It should work, but I
have no bad drive to test it with; we have a good recycling system in my
area.
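
For concreteness, the stop-and-reassemble sequence might look like the
sketch below. It is as untested against a bad drive as the suggestion
itself. The array and member names passed in (/dev/md0, /dev/sda1 in the
usage note) are placeholders, and reassemble_degraded and MDADM_RUN are
names invented for this example. By default the function only prints the
mdadm commands, so they can be reviewed before anything is run.

```shell
#!/bin/sh
# Hypothetical helper: stop an array and reassemble it from the good
# member only. "mdadm --assemble --run" asks mdadm to start the array
# even though fewer members are given than were last active.
# Prints the commands unless MDADM_RUN=1 is set, in which case it runs them.
reassemble_degraded() {
    array=$1
    good=$2
    for cmd in "mdadm --stop $array" "mdadm --assemble --run $array $good"; do
        if [ "${MDADM_RUN:-0}" = 1 ]; then
            # Unquoted on purpose so the string splits into command words;
            # this assumes device paths without spaces.
            $cmd || return 1
        else
            echo "$cmd"
        fi
    done
}
```

For example, `reassemble_degraded /dev/md0 /dev/sda1` prints the two
commands; running it again with MDADM_RUN=1 in the environment executes them.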
> Writing is fine:
> [root@cmccabe-devel root]# dd if=/dev/zero of=/dev/sdb bs=524288
> dd: writing `/dev/sdb': No space left on device
> 114464+0 records in
> 114463+0 records out
>
> Reading is not:
> [root@cmccabe-devel root]# dd if=/dev/sdb of=/dev/null bs=524288
> ata1.00: exception Emask 0x0 SAct 0x3 SErr 0x0 action 0x2 frozen
> ata1.00: cmd 60/00:00:00:b0:01/01:00:00:00:00/40 tag 0 cdb 0x0 data 
> 131072 in
> [ ... copious errors ... ]
>
> I have disabled write caching using hdparm -W0.
> Both drives are: Fujitsu MHV2060BH, 60 GB, Serial ATA
> The SATA controller is: ICH6
>
> My problem is that even though B gets into the synchronized state, it 
> is no
> good at all. This is potentially misleading, and if someone removes A 
> after
> synchronizing B, the system will probably crash, since there will be 
> no good
> drives left.
>
> I wonder if anyone else is interested in a "paranoid recovery" mode 
> where the
> md layer tests the data that has been written. Even if this doubles the
> recovery time, I think that it would be desirable for many applications.


-- 
bill davidsen <davidsen@tmr.com>
  CTO TMR Associates, Inc
  Doing interesting things with small computers since 1979



* Re: unreadable drives can be synchronized?
  2007-05-16 17:22 ` Bill Davidsen
@ 2007-05-16 20:09   ` Colin McCabe
  2007-05-16 20:18     ` Colin McCabe
  0 siblings, 1 reply; 10+ messages in thread
From: Colin McCabe @ 2007-05-16 20:09 UTC
  To: Bill Davidsen; +Cc: linux-raid

On 5/16/07, Bill Davidsen <davidsen@tmr.com> wrote:
> Colin McCabe wrote:
> > Hi all,
> >
> > I am running software RAID on Linux 2.6.21.
> >
> > While experimenting with adding and removing devices from the RAID
> > array, I
> > noticed something very troubling. I have a bad drive (let's call it
> > drive B)
> > which gets random read errors. I also have a good drive, call it drive A.
> >
> > B can synchronize with A. But then, if I remove A from the raid array, A
> > cannot be re-added. This is because the bad drive, B, cannot be read
> > from.
> >
> > Basically, B appears to be "write-only"; it will never return an error
> > on a
> > write, but just try to read from it, and you will be sorry.
> >
> You may be able to recover from this (why would you do such a thing?) by
> stopping the array and reassembling the array with only the "good" drive
> and the other as failed. Caution, I made this up, it should work but I
> have no bad drive to use for a test, we have a good recycling system in
> my area.

This is an embedded systems application. There isn't any important
data on drives A or B at the moment.

What concerns me is that apparently these Hitachi disks have errors
that only show up when you try to read from them. I don't know if this
is a firmware bug or a physical limitation of the way the drive
detects errors. I actually have two different drives which could fill
the role of drive B in this scenario.

If I do a "check" on both drives, it speedily removes B once it
realizes that it can't read from it. But what bothers me is that it is
able to become active without ever being tested by being read from. So
it seems like at minimum, careful admins should do a "check"
immediately after adding a new disk to an array.

Colin


> > Writing is fine:
> > [root@cmccabe-devel root]# dd if=/dev/zero of=/dev/sdb bs=524288
> > dd: writing `/dev/sdb': No space left on device
> > 114464+0 records in
> > 114463+0 records out
> >
> > Reading is not:
> > [root@cmccabe-devel root]# dd if=/dev/sdb of=/dev/null bs=524288
> > ata1.00: exception Emask 0x0 SAct 0x3 SErr 0x0 action 0x2 frozen
> > ata1.00: cmd 60/00:00:00:b0:01/01:00:00:00:00/40 tag 0 cdb 0x0 data
> > 131072 in
> > [ ... copious errors ... ]
> >
> > I have disabled write caching using hdparm -W0.
> > Both drives are: Fujitsu MHV2060BH, 60 GB, Serial ATA
> > The SATA controller is: ICH6
> >
> > My problem is that even though B gets into the synchronized state, it
> > is no
> > good at all. This is potentially misleading, and if someone removes A
> > after
> > synchronizing B, the system will probably crash, since there will be
> > no good
> > drives left.
> >
> > I wonder if anyone else is interested in a "paranoid recovery" mode
> > where the
> > md layer tests the data that has been written. Even if this doubles the
> > recovery time, I think that it would be desirable for many applications.
>
>
> --
> bill davidsen <davidsen@tmr.com>
>   CTO TMR Associates, Inc
>   Doing interesting things with small computers since 1979
>
>


* Re: unreadable drives can be synchronized?
  2007-05-16 20:09   ` Colin McCabe
@ 2007-05-16 20:18     ` Colin McCabe
  0 siblings, 0 replies; 10+ messages in thread
From: Colin McCabe @ 2007-05-16 20:18 UTC
  To: Bill Davidsen; +Cc: linux-raid

On 5/16/07, Colin McCabe <colin.p.mccabe@gmail.com> wrote:

> What concerns me is that apparently these Hitachi disks have errors

Sorry, I meant to write "Fujitsu," not "Hitachi."
Doh!

Colin


* Re: unreadable drives can be synchronized?
  2007-05-16 15:50 unreadable drives can be synchronized? Colin McCabe
  2007-05-16 17:22 ` Bill Davidsen
@ 2007-05-17  0:54 ` Neil Brown
  1 sibling, 0 replies; 10+ messages in thread
From: Neil Brown @ 2007-05-17  0:54 UTC
  To: Colin McCabe; +Cc: linux-raid

On Wednesday May 16, colin.p.mccabe@gmail.com wrote:
> Hi all,
> 
> I am running software RAID on Linux 2.6.21.
> 
> While experimenting with adding and removing devices from the RAID array, I
> noticed something very troubling. I have a bad drive (let's call it drive B)
> which gets random read errors. I also have a good drive, call it drive A.
> 
> B can synchronize with A. But then, if I remove A from the raid array, A
> cannot be re-added. This is because the bad drive, B, cannot be read from.

Well, if you remove A, then you have the situation of trusting your
data to a single drive.  And we all know that isn't very safe.

> 
> I wonder if anyone else is interested in a "paranoid recovery" mode where the
> md layer tests the data that has been written. Even if this doubles the
> recovery time, I think that it would be desirable for many applications.

I think it is just as easy to run a 'check' pass after recovery has
completed.
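
A rough sketch of such a check pass, driven through the md sysfs files
sync_action, mismatch_cnt and degraded. The array name md0, the function
names, and the MD_SYSFS and MD_POLL_SECONDS variables are assumptions made
for illustration; adjust them to the system at hand. mismatch_cnt only
counts inconsistencies between members. A member that cannot be read at all
is expected to be kicked from the array instead, which is why the sketch
also reports degraded.

```shell
#!/bin/sh
# Assumed location of the array's md sysfs directory (md0 is a placeholder).
MD_SYSFS=${MD_SYSFS:-/sys/block/md0/md}

# Block until no resync, recovery, or check is running on the array.
md_wait_idle() {
    while [ "$(cat "$MD_SYSFS/sync_action")" != "idle" ]; do
        sleep "${MD_POLL_SECONDS:-5}"
    done
}

# Ask md to read every member and compare the copies.
md_start_check() {
    echo check > "$MD_SYSFS/sync_action"
}

# Print the mismatch counter and the number of missing/failed members.
md_report() {
    echo "mismatch_cnt=$(cat "$MD_SYSFS/mismatch_cnt") degraded=$(cat "$MD_SYSFS/degraded")"
}

# Full sequence: let recovery finish, run a check, wait for it, report.
md_check_pass() {
    md_wait_idle
    md_start_check || return 2
    md_wait_idle
    md_report
}
```

Run as root, `md_check_pass` should print something like
"mismatch_cnt=... degraded=..." once the check has finished. A non-zero
degraded value after a freshly completed recovery would point at exactly
the kind of drive described above.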

Whenever you are commissioning new hardware, it makes sense to do some
testing before you trust it with valuable data.
It sounds like this particular hardware failure lends itself to easy
discovery with a bit of testing - indeed, you found it while testing
your hardware.  So I don't think it is a failure mode we need to put
any extra care in to.
It is the failure modes that are hard to find with basic testing that
we should worry about.

NeilBrown


* Re: unreadable drives can be synchronized?
@ 2007-05-18 14:47 Andrew Burgess
  2007-05-18 15:04 ` Tomasz Chmielewski
  2007-05-18 18:10 ` Colin McCabe
  0 siblings, 2 replies; 10+ messages in thread
From: Andrew Burgess @ 2007-05-18 14:47 UTC
  To: colin.p.mccabe, linux-raid

>Basically, B appears to be "write-only"; it will never return an error on a
>write, but just try to read from it, and you will be sorry.

It would be interesting to see what SMART says about drive B, especially
the short and long self tests.



* Re: unreadable drives can be synchronized?
  2007-05-18 14:47 Andrew Burgess
@ 2007-05-18 15:04 ` Tomasz Chmielewski
  2007-05-18 18:18   ` Colin McCabe
  2007-05-18 18:10 ` Colin McCabe
  1 sibling, 1 reply; 10+ messages in thread
From: Tomasz Chmielewski @ 2007-05-18 15:04 UTC
  To: Andrew Burgess; +Cc: colin.p.mccabe, linux-raid

Andrew Burgess wrote:
>> Basically, B appears to be "write-only"; it will never return an error on a
>> write, but just try to read from it, and you will be sorry.
> 
> It would be interesting to see what SMART says about drive B, especially
> the short and long self tests.

I wouldn't rely on SMART.

I have a broken drive, which has lots of badblocks - but SMART happily 
claims it's fine (short/long tests are completed without errors).


-- 
Tomasz Chmielewski
http://wpkg.org



* Re: unreadable drives can be synchronized?
  2007-05-18 15:04 ` Tomasz Chmielewski
@ 2007-05-18 18:18   ` Colin McCabe
  2007-05-23 17:46     ` Bill Davidsen
  0 siblings, 1 reply; 10+ messages in thread
From: Colin McCabe @ 2007-05-18 18:18 UTC
  To: Tomasz Chmielewski; +Cc: Andrew Burgess, linux-raid

On 5/18/07, Tomasz Chmielewski <mangoo@wpkg.org> wrote:
> Andrew Burgess wrote:
> >> Basically, B appears to be "write-only"; it will never return an error on a
> >> write, but just try to read from it, and you will be sorry.
> >
> > It would be interesting to see what SMART says about drive B, especially
> > the short and long self tests.
>
> I wouldn't rely on SMART.
>
> I have a broken drive, which has lots of badblocks - but SMART happily
> claims it's fine (short/long tests are completed without errors).
>

If you haven't seen Google's hard drive study yet, you should take a look.
It's at http://labs.google.com/papers/disk_failures.pdf

The conclusion says that "some of the SMART parameters are
well-correlated with higher failure probabilities," but also that "a
large fraction of [google's] failed drives have shown no SMART error
signals whatsoever."

Colin


* Re: unreadable drives can be synchronized?
  2007-05-18 18:18   ` Colin McCabe
@ 2007-05-23 17:46     ` Bill Davidsen
  0 siblings, 0 replies; 10+ messages in thread
From: Bill Davidsen @ 2007-05-23 17:46 UTC
  To: Colin McCabe; +Cc: Tomasz Chmielewski, Andrew Burgess, linux-raid

Colin McCabe wrote:
> On 5/18/07, Tomasz Chmielewski <mangoo@wpkg.org> wrote:
>> Andrew Burgess wrote:
>> >> Basically, B appears to be "write-only"; it will never return an 
>> error on a
>> >> write, but just try to read from it, and you will be sorry.
>> >
>> > It would be interesting to see what SMART says about drive B, 
>> especially
>> > the short and long self tests.
>>
>> I wouldn't rely on SMART.
>>
>> I have a broken drive, which has lots of badblocks - but SMART happily
>> claims it's fine (short/long tests are completed without errors).
>>
>
> If you haven't seen Google's hard drive study yet, you should take a 
> look.
> It's at http://labs.google.com/papers/disk_failures.pdf
>
> The conclusion says that "some of the SMART parameters are
> well-correlated with higher failure probabilities," but also that "a
> large fraction of [google's] failed drives have shown no SMART error
> signals whatsoever."
Having covered that in a presentation on SMART to a user group, may I
offer a paraphrase which may be more obvious to people who are not native
speakers of English:

High counts of some SMART parameters indicate that the drive is likely
to fail. However, many drives fail without any SMART warning at all.

-- 
bill davidsen <davidsen@tmr.com>
  CTO TMR Associates, Inc
  Doing interesting things with small computers since 1979



* Re: unreadable drives can be synchronized?
  2007-05-18 14:47 Andrew Burgess
  2007-05-18 15:04 ` Tomasz Chmielewski
@ 2007-05-18 18:10 ` Colin McCabe
  1 sibling, 0 replies; 10+ messages in thread
From: Colin McCabe @ 2007-05-18 18:10 UTC
  Cc: linux-raid

Andrew Burgess wrote:
>> Basically, B appears to be "write-only"; it will never return an error on a
>> write, but just try to read from it, and you will be sorry.
> 
> It would be interesting to see what SMART says about drive B, especially
> the short and long self tests.
> 

On my hard drives, automatic online testing is turned on, and so is 
automatic offline testing. I run the long self-test once a week.

I have two drives which can play the role of B. One of them has this 
SMART output:

[root@cmccabe-devel root]# smartctl -d ata /dev/sdb -H
smartctl version 5.36 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: FAILED!
Drive failure expected in less than 24 hours. SAVE ALL DATA.
Failed Attributes:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   004   004   024    Pre-fail  Always   FAILING_NOW 133273031255

The other one passes SMART.
Both of them eat data, though.
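
For anyone scripting this, a small sketch that reduces the `smartctl -H`
output above to a single word. smart_health is a name invented for this
example, and the parsing assumes the "SMART overall-health self-assessment
test result:" line in the format shown above. As the second drive
demonstrates, PASSED here says nothing about whether the drive can actually
be read.

```shell
#!/bin/sh
# Hypothetical helper: read `smartctl -H` output on stdin and print
# PASSED, FAILED, or UNKNOWN depending on the overall-health line.
smart_health() {
    result=$(sed -n 's/^SMART overall-health self-assessment test result: *//p' |
        head -n 1)
    case $result in
        PASSED*) echo PASSED ;;
        FAILED*) echo FAILED ;;
        *)       echo UNKNOWN ;;
    esac
}
```

It would be used as `smartctl -d ata /dev/sdb -H | smart_health`, ideally
alongside a real read test and not in place of one.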

Colin

