linux-raid.vger.kernel.org archive mirror
* entire array lost when some blocks unreadable?
@ 2005-06-07 20:56 Tom Eicher
  2005-06-07 21:10 ` Brad Campbell
  2005-06-07 21:13 ` Mike Hardy
  0 siblings, 2 replies; 5+ messages in thread
From: Tom Eicher @ 2005-06-07 20:56 UTC (permalink / raw)
  To: linux-raid

Hi list,

I might be missing the point here... I lost my first Raid-5 array 
(apparently) because one drive was kicked out after a DriveSeek error. 
When reconstruction started at full speed, some blocks on another drive 
appeared to have uncorrectable errors, resulting in that drive also 
being kicked... you get it.

Now here is my question: On a normal drive, I would expect that a drive 
seek error or uncorrectable blocks would typically not take out the 
entire drive, but rather just corrupt the files that happen to be on 
those blocks. With RAID, a local error seems to render the entire array 
unusable. This would seem like an extreme measure to take just for some 
corrupt blocks.

- Is it correct that a relatively small corrupt area on a drive can 
cause the raid manager to kick out a drive?
- How does one prevent the scenario above?
  - periodically run drive tests (smartctl -t ...) to detect problems early, 
before multiple drives fail?
  - periodically run over the entire drives and copy the data around so 
the drives can sort out the bad blocks?

Thanks for any insight, tom


* Re: entire array lost when some blocks unreadable?
  2005-06-07 20:56 entire array lost when some blocks unreadable? Tom Eicher
@ 2005-06-07 21:10 ` Brad Campbell
  2005-06-07 21:21   ` Mike Hardy
  2005-06-07 21:13 ` Mike Hardy
  1 sibling, 1 reply; 5+ messages in thread
From: Brad Campbell @ 2005-06-07 21:10 UTC (permalink / raw)
  To: Tom Eicher; +Cc: linux-raid

Tom Eicher wrote:
> Hi list,
> 
> I might be missing the point here... I lost my first Raid-5 array 
> (apparently) because one drive was kicked out after a DriveSeek error. 
> When reconstruction started at full speed, some blocks on another drive 
> appeared to have uncorrectable errors, resulting in that drive also 
> being kicked... you get it.

Join the long line next to the club trophy cabinet :)

> Now here is my question: On a normal drive, I would expect that a drive 
> seek error or uncorrectable blocks would typically not take out the 
> entire drive, but rather just corrupt the files that happen to be on 
> those blocks. With RAID, a local error seems to render the entire array 
> unusable. This would seem like an extreme measure to take just for some 
> corrupt blocks.

Perhaps. I believe something may be on the cards with regard to doing a reconstruction 
re-write of the dodgy sector, to try to force a reallocation before kicking the drive, but I only 
recall a rumbling-of-the-ground sort of rumour there.

> - Is it correct that a relatively small corrupt area on a drive can 
> cause the raid manager to kick out a drive?

At the moment, yes..

> - How does one prevent the scenario above?
>  - periodically run drive tests (smartctl -t ...) to detect problems early, 
> before multiple drives fail?

I run a short test on every drive 6 days a week, and a long test on every drive every Sunday.
This does a good job of locating pending errors, and smartd e-mails me any issues it spots. My server 
is not a heavily loaded machine however, so I generally have a chance to trigger a write to 
reallocate the sectors before a read hits them and kicks the drive out (or, with the bug I have 
at the moment, kills the box - but that is not an md issue)
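
For what it's worth, the smartd.conf lines for that schedule look
roughly like the ones below; the device names, mail address and times
are placeholders to adjust for your own setup.

# short self-test at 02:00 Monday-Saturday, long self-test at 03:00 on
# Sunday, and mail a warning when smartd spots a problem
/dev/hda -a -m admin@example.com -s (S/../../[1-6]/02|L/../../7/03)
/dev/hdc -a -m admin@example.com -s (S/../../[1-6]/02|L/../../7/03)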

>  - periodically run over the entire drives and copy the data around so 
> the drives can sort out the bad blocks?

Something along those lines.
Generally if I get an error notification from smartd I pull the drive from the array and re-add it. 
This causes a rewrite of the entire disk and everyone is happy. (Unless the drive is dying, in which 
case the rewrite of the entire disk usually finishes it off nicely)
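
Roughly, the dance with mdadm looks like this (array and partition
names are just examples):

# mark the member faulty and pull it out of the array
mdadm /dev/md0 --fail /dev/hdc1
mdadm /dev/md0 --remove /dev/hdc1
# add it back; the resync rewrites the whole disk, which gives the
# drive firmware a chance to reallocate any pending sectors
mdadm /dev/md0 --add /dev/hdc1
# watch the rebuild
cat /proc/mdstat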

Another interesting thought is to unmount the array and run a badblocks non-destructive media test 
on it. This will read a stripe into memory (depending on how you configure badblocks), write a 
series of patterns to the stripe (which will re-write every sector in the stripe), and then write 
back the original data. Although, thinking about it, just reading the stripe 
will cause the drive to be kicked if it has a bad block anyway.. so scratch that. It's late here :)

Regards,
Brad
-- 
"Human beings, who are almost unique in having the ability
to learn from the experience of others, are also remarkable
for their apparent disinclination to do so." -- Douglas Adams


* Re: entire array lost when some blocks unreadable?
  2005-06-07 20:56 entire array lost when some blocks unreadable? Tom Eicher
  2005-06-07 21:10 ` Brad Campbell
@ 2005-06-07 21:13 ` Mike Hardy
  1 sibling, 0 replies; 5+ messages in thread
From: Mike Hardy @ 2005-06-07 21:13 UTC (permalink / raw)
  To: Tom Eicher, linux-raid


Linux raid considers one unreadable block on one drive sufficient
evidence to kick the whole device out of the array.

If at that point reconstruction finds other unreadable blocks, you have the
dreaded raid 5 double disk failure. Total loss, *seemingly*.

You already realize the important point though, which is that the
unreadable sections of disk are probably not in the same stripe on each
disk (that's highly improbable, at least, and you can check either way),
so you actually have enough information via the redundancy to recover all
of your data.

You just can't do it with the main raid tools.

I posted a number of things under the heading "the dreaded double disk
failure" on this list, including a script to create a loopback device
test array, and a perl script (several iterations, in fact) that
implements the raid5 algorithm in software and can read raid5 stripes
and tell you (via parity) what the data in a given device for a given
stripe should be.

Using a strategy similar to this, you can then re-write the data onto
the unreadable sectors, causing the drive firmware to remap those
sectors and fix that spot.

Repeat until you're clean, and you may have your data back.
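
This is not that script, just a bare sketch of the parity idea: for
raid5, the chunk you cannot read is simply the XOR of the corresponding
chunks on all the other members, parity included. Something along these
lines, where the chunk size, offset and device names are placeholders
and a real tool must also account for the superblock/data offset and
the array's parity layout:

CHUNK = 64 * 1024                 # assumed chunk size in bytes

def read_chunk(device, offset):
    f = open(device, 'rb')
    f.seek(offset)
    data = f.read(CHUNK)
    f.close()
    return data

def rebuild_chunk(good_members, offset):
    # XOR the same-offset chunk from every readable member (data + parity)
    result = bytearray(CHUNK)
    for dev in good_members:
        chunk = read_chunk(dev, offset)
        for i in range(CHUNK):
            result[i] ^= chunk[i]
    return bytes(result)

# example: the chunk at offset 123*CHUNK is unreadable on /dev/hdc1
recovered = rebuild_chunk(['/dev/hda1', '/dev/hdb1', '/dev/hdd1'], 123 * CHUNK)
# writing 'recovered' back over the bad spot (dd with seek=, say) is what
# makes the drive reallocate the failing sector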

In general though, your hunch is right. smartd
(http://smartmontools.sf.net) running scans (I do a short scan on each
disk staggered nightly, and a long scan on each disk once a week, also
staggered) with email notifications will help. mdadm running with email
notifications will tell you when you lose a disk, so you can take action
if necessary (for instance, a long scan of all the remaining disks ASAP).
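
The monitoring side is roughly this (the mail address is a placeholder):

# smartd runs the self-test schedule from /etc/smartd.conf and mails on trouble
smartd
# mdadm watches the arrays and mails on Fail/DegradedArray events
mdadm --monitor --scan --daemonise --mail admin@example.com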

Also, raid is no substitute for good backups.

Good luck - and if you use the scripts, please post any patches that
make them more useful. They are far from perfect, but worked for me.
Hopefully they can help you if you need them.
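
If you want somewhere safe to experiment with them first, a throw-away
loopback test array in the spirit of the loopback script can be put
together roughly like this (file sizes and device names are arbitrary):

# four 100 MB backing files on loop devices, assembled into a raid5
dd if=/dev/zero of=/tmp/d0 bs=1M count=100    # repeat for d1, d2, d3
losetup /dev/loop0 /tmp/d0                    # and loop1..loop3
mdadm --create /dev/md1 --level=5 --raid-devices=4 \
      /dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3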

-Mike

Tom Eicher wrote:
> Hi list,
> 
> I might be missing the point here... I lost my first Raid-5 array
> (apparently) because one drive was kicked out after a DriveSeek error.
> When reconstruction started at full speed, some blocks on another drive
> appeared to have uncorrectable errors, resulting in that drive also
> being kicked... you get it.
> 
> Now here is my question: On a normal drive, I would expect that a drive
> seek error or uncorrectable blocks would typically not take out the
> entire drive, but rather just corrupt the files that happen to be on
> those blocks. With RAID, a local error seems to render the entire array
> unusable. This would seem like an extreme measure to take just for some
> corrupt blocks.
> 
> - Is it correct that a relatively small corrupt area on a drive can
> cause the raid manager to kick out a drive?
> - How does one prevent the scenario above?
>  - periodically run drive tests (smartctl -t ...) to detect problems early,
> before multiple drives fail?
>  - periodically run over the entire drives and copy the data around so
> the drives can sort out the bad blocks?
> 
> Thanks for any insight, tom


* Re: entire array lost when some blocks unreadable?
  2005-06-07 21:10 ` Brad Campbell
@ 2005-06-07 21:21   ` Mike Hardy
  2005-06-08  2:27     ` Guy
  0 siblings, 1 reply; 5+ messages in thread
From: Mike Hardy @ 2005-06-07 21:21 UTC (permalink / raw)
  To: Brad Campbell; +Cc: Tom Eicher, linux-raid



Brad Campbell wrote:

> Join the long line next to the club trophy cabinet :)

It's a shame the line is this long - I wish I had the time to implement
the solution myself, but not having that, I can't really whine either.
It's still a shame though. Alas.

> Something along those lines.
> Generally if I get an error notification from smartd I pull the drive
> from the array and re-add it. This causes a rewrite of the entire disk
> and everyone is happy. (Unless the drive is dying, in which case the
> rewrite of the entire disk usually finishes it off nicely)

When I get one of those, the first thing I do is verify my backup :-).
The backup is a second array that's on the network, so I typically
remount it read-only at that point.

Then I start drive scans on all drives (primary and backup) to see if
I've got any other blocks that will stop reconstruction. If I find any
other bad blocks on other devices, I immediately remount the primary as
read-only to preserve the data (if it's not already gone) on all of the
disks. Note my disks almost never get written to, so this actually does
preserve the old data everywhere in all the cases I care about.

After that, a fail and re-add has done the trick for me in the past, but
once I actually got remapped into a bad block. Very annoying. Since
then, I fail the disk and do multiple badblocks passes on it.
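
Something like this, say (device names are examples; the write test
destroys what is on the member, which is fine since its data is stale
once it has been failed out of the array):

# destructive write test over the whole failed member: -w writes and
# verifies several patterns, -s shows progress, -v lists bad blocks
badblocks -wvs /dev/hdc1
# if it comes back clean, re-add it and let the resync rewrite it
mdadm /dev/md0 --add /dev/hdc1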

Being able to enable an "aggressively correct" raid mode where any
single-block read error triggered a reconstruct/write/re-read cycle
until either it worked or failed would be nice. Bonus points for extra
md status markers that mdadm could pick up and mail to folks depending
on policy configuration.

-Mike


* RE: entire array lost when some blocks unreadable?
  2005-06-07 21:21   ` Mike Hardy
@ 2005-06-08  2:27     ` Guy
  0 siblings, 0 replies; 5+ messages in thread
From: Guy @ 2005-06-08  2:27 UTC (permalink / raw)
  To: 'Mike Hardy', 'Brad Campbell'
  Cc: 'Tom Eicher', linux-raid

I have 17 SCSI disks, all 18 gig.  I run a full scan each night.  I find
about 1 error each week.  The rate was much lower until a few months ago.
Mostly the same few disks.  Anyway, the errors are corrected by the scan
before md finds them.  Otherwise the errors could sit dormant for months,
waiting for md to find one; then, while re-building to the spare, md would
find another and poof!  Array gone.  Manual override needed to correct it.
Since I started the scanning, md has not found any bad blocks.
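
One simple way to do such a nightly scan is just to read every member
end to end from cron, something like the line below (the device list is
illustrative).  Note a read pass only finds the bad sectors; forcing a
rewrite, as described earlier in the thread, is still a separate step.

# /etc/cron.d/disk-scan: read all 17 members at 02:30 every night; an
# unreadable sector shows up in the kernel log and in SMART
30 2 * * *  root  for d in /dev/sd[a-q]; do dd if=$d of=/dev/null bs=1M; done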

The sad part: this is why I say Linux is not ready for prime time.  OK for
home use, but not for a 24x7 system that my job depends on!!!  If your job
depends on it, get an external hardware RAID system with battery-backed memory.

Please, someone fix the bad block problems!!!
Without kicking out the disk as part of the solution!!!

Guy

> -----Original Message-----
> From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-
> owner@vger.kernel.org] On Behalf Of Mike Hardy
> Sent: Tuesday, June 07, 2005 5:22 PM
> To: Brad Campbell
> Cc: Tom Eicher; linux-raid@vger.kernel.org
> Subject: Re: entire array lost when some blocks unreadable?
> 
> 
> 
> Brad Campbell wrote:
> 
> > Join the long line next to the club trophy cabinet :)
> 
> It's a shame the line is this long - I wish I had the time to implement
> the solution myself, but not having that, I can't really whine either.
> It's still a shame though. Alas.
> 
> > Something along those lines.
> > Generally if I get an error notification from smartd I pull the drive
> > from the array and re-add it. This causes a rewrite of the entire disk
> > and everyone is happy. (Unless the drive is dying, in which case the
> > rewrite of the entire disk usually finishes it off nicely)
> 
> When I get one of those, the first thing I do is verify my backup :-).
> The backup is a second array that's on the network, so I typically
> remount it read-only at that point.
> 
> Then I start drive scans on all drives (primary and backup) to see if
> I've got any other blocks that will stop reconstruction. If I find any
> other bad blocks on other devices, I immediately remount the primary as
> read-only to preserve the data (if it's not already gone) on all of the
> disks. Note my disks almost never get written to, so this actually does
> preserve the old data everywhere in all the cases I care about.
> 
> After that, a fail and re-add has done the trick for me in the past, but
> once I actually got remapped into a bad block. Very annoying. Since
> then, I fail the disk and do multiple badblocks passes on it.
> 
> Being able to enable an "aggressively correct" raid mode where any
> single-block read error triggered a reconstruct/write/re-read cycle
> until either it worked or failed would be nice. Bonus points for extra
> md status markers that mdadm could pick up and mail to folks depending
> on policy configuration.
> 
> -Mike


