* Hardware vs Software and Bad Block Relocation
@ 2004-05-21 13:05 AndyLiebman
2004-05-21 14:38 ` Guy
0 siblings, 1 reply; 4+ messages in thread
From: AndyLiebman @ 2004-05-21 13:05 UTC (permalink / raw)
To: linux-raid
From the replies I got to my last question about Hardware versus Software
RAID, one of the big advantages of true hardware RAID can be the better handling
of bad blocks or read errors on RAID 1 and RAID 5 arrays.
I have encountered this situation a few times with Linux software RAID 5
where I will get a read error on a particular sector of a particular disk. Linux
software RAID will immediately throw this disk out of the array. And now, if I
get a read error on another disk before I replace the first disk (unlikely
but it did happen to me once -- about a day after getting the first error), the
array can be totally lost. Or at least it's not so obvious how to recover the
data.
Yesterday, I spoke with two tech support people at 3ware who explained that
their hardware RAID cards will remember where a read error is encountered and
next time you try to write to that sector the data will get relocated to
another sector instead. As long as there is still communication with the disk after
a read error (within 20 seconds) the disk won't get kicked out of the array
and the RAID won't go into degraded mode. An error report will get generated
that you can view in the 3ware 3dm or 3dm2 GUI interface -- so you can see that
you MIGHT have to start worrying about a particular disk. But the data will
still be intact and the array will still offer redundancy.
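
The behaviour described by 3ware support can be pictured with a toy model. This is only a sketch of the description above, not 3ware's firmware: the class `RemappingDisk`, its fields, and the spare-sector pool are all hypothetical names made up for illustration.

```python
class RemappingDisk:
    """Toy disk (hypothetical): a failed read is remembered, and the next
    write to that sector is redirected to a spare sector."""

    def __init__(self, num_sectors, num_spares):
        self.data = {}          # physical sector -> bytes
        self.unreadable = set() # physical sectors that fail on read
        self.pending = set()    # logical sectors with a recorded read error
        self.remap = {}         # logical sector -> spare physical sector
        # Spare sectors live past the end of the normal address range.
        self.spares = list(range(num_sectors, num_sectors + num_spares))

    def _physical(self, sector):
        return self.remap.get(sector, sector)

    def read(self, sector):
        phys = self._physical(sector)
        if phys in self.unreadable:
            self.pending.add(sector)  # remember where the error happened
            raise OSError(f"read error at sector {sector}")
        return self.data.get(phys, b"\x00")

    def write(self, sector, payload):
        if sector in self.pending:
            if not self.spares:
                raise OSError("no spare sectors left")
            # Relocate: future accesses to this sector go to the spare.
            self.remap[sector] = self.spares.pop(0)
            self.pending.discard(sector)
        self.data[self._physical(sector)] = payload
```

In this model the disk stays in service after the read error; only the one sector is marked, and it is moved the next time it is written.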
This seems like a HUGE advantage to data security -- especially in my
application. I am dealing with terabytes of video and audio files, and it's simply
not practical to back them up.
So, my question is: is there a "software equivalent" to what the 3ware card
does with bad sectors or bad blocks? Will EVMS do that? Will the latest LVM do
that? I have read that EVMS does have a bad block relocation function, but
does it work the same way as the 3ware card? Will it prevent an array from going
into degraded mode after a read error?
* RE: Hardware vs Software and Bad Block Relocation
2004-05-21 13:05 Hardware vs Software and Bad Block Relocation AndyLiebman
@ 2004-05-21 14:38 ` Guy
2004-05-22 15:23 ` Marcel de Riedmatten
0 siblings, 1 reply; 4+ messages in thread
From: Guy @ 2004-05-21 14:38 UTC (permalink / raw)
To: AndyLiebman, linux-raid
I have had this same problem. The funny thing is it could be fixed, but I
bet it is very hard to do.
With most or all modern disk drives (10 years old or less) if you write to
the bad block the disk drive will re-locate the bad block. The RAID5
software could do this:
    Read bad block, get a failure.
    Re-create missing data.
    Write missing data back over the bad block.
    If success then
        go on with life!
    Else
        Report the disk as needing to be replaced,
        but don't fail it for 1 bad block!
        Maybe have a threshold.
After all 99.99999% of the data is still there!
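
The steps above can be sketched in Python as a small simulation. This is a minimal illustration under stated assumptions, not md's code: `ToyDisk`, `read_with_repair`, the status strings, and the default threshold of 3 are all hypothetical. It relies only on the RAID 5 property that any block in a stripe equals the XOR of the other blocks in that stripe.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together (RAID 5 parity arithmetic)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


class ToyDisk:
    """Hypothetical in-memory disk used only for this illustration."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.bad = set()          # block numbers that fail on read
        self.can_relocate = True  # False models a drive that cannot remap
        self.failed_rewrites = 0

    def read(self, i):
        if i in self.bad:
            raise OSError(f"read error at block {i}")
        return self.blocks[i]

    def write(self, i, data):
        if i in self.bad and not self.can_relocate:
            raise OSError(f"write error at block {i}")
        self.blocks[i] = data
        self.bad.discard(i)  # a successful write models the drive remapping


def read_with_repair(disks, d, i, threshold=3):
    """Read block i of disks[d]; on a read error, rebuild and rewrite it.

    Returns (data, status), where status is "ok", "repaired",
    "report" (rewrite failed, keep the disk) or "fail" (threshold reached).
    """
    try:
        return disks[d].read(i), "ok"
    except OSError:
        pass
    # Re-create the missing data from every other disk in the stripe.
    others = [disk.read(i) for k, disk in enumerate(disks) if k != d]
    data = xor_blocks(others)
    try:
        disks[d].write(i, data)  # write it back over the bad block
        return data, "repaired"
    except OSError:
        disks[d].failed_rewrites += 1
        failed = disks[d].failed_rewrites >= threshold
        return data, "fail" if failed else "report"
```

Either way the caller gets the reconstructed data back; the status only decides whether the disk is left alone, flagged, or finally dropped.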
I have "corrected" disks with bad blocks by using dd to copy /dev/zero to
the disk. After that, test the disk by copying it to /dev/null. Works
every time.
Example:
    /dev/sdf has a bad block.
    And you are willing to lose the data on it!
    dd if=/dev/zero of=/dev/sdf bs=64k
    If success then
        dd if=/dev/sdf of=/dev/null bs=64k
        If success then
            The disk is as good as... well, it has no bad blocks
            for now.
If a disk has a bad block in 1 partition you could just dd zero to that
partition, but still verify 100% of the disk.
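
The same zero-then-verify procedure can be sketched in Python. This is a rough equivalent of the two dd commands under stated assumptions, not a tested disk tool: the function names `device_size`, `zero_fill`, and `read_verify` are hypothetical, and the code destroys whatever is stored at `path`, exactly as the dd version does.

```python
import os


def device_size(path):
    """Size in bytes of a regular file or (on Linux) a block device."""
    with open(path, "rb") as f:
        return f.seek(0, os.SEEK_END)


def zero_fill(path, size, block_size=64 * 1024):
    """Overwrite the first `size` bytes with zeros (dd if=/dev/zero of=path)."""
    zeros = bytes(block_size)
    with open(path, "r+b") as f:
        remaining = size
        while remaining > 0:
            n = min(block_size, remaining)
            f.write(zeros[:n])
            remaining -= n
        f.flush()
        os.fsync(f.fileno())  # push the writes out to the device


def read_verify(path, block_size=64 * 1024):
    """Read everything back (dd if=path of=/dev/null).

    Returns the number of bytes read; an unreadable block raises OSError.
    """
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                return total
            total += len(chunk)
```

Usage would be `zero_fill(path, device_size(path))` followed by `read_verify(path)`. On a small target the read-back may be served from the page cache rather than the platters, so the verify pass is most meaningful on a whole disk or after the cache has been dropped.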
I have corrected about 3 disks this way in the past 3 years. I have never
had any issues since then. So I know the raid software could automate this
and save some major headaches!
One gotcha: my disks had auto re-locate disabled. I installed a Seagate tool
that allowed me to adjust disk drive options. I enabled auto re-locate for
read and write. Since then I have not had a read error. I think the drive
re-locates blocks on reads if there is a retry on read. Of course it can't
re-locate the block if it can't read it.
A note about hardware RAID. Hardware RAID systems will test the disks from
time to time, so a bad block will be found at a time when you don't need
it. The chance of having 2 bad blocks on different drives is greatly reduced
by this extra scanning. I use a crontab script to read my disks each night.
It sends me an email status. This way I stand a good chance of knowing
about a bad block before md finds it.
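
A nightly read scan of this kind could look roughly like the sketch below. It is not the script mentioned above, which was not posted; the function names `scan` and `report` and the report wording are assumptions.

```python
def scan(path, block_size=64 * 1024):
    """Read one device end to end.

    Returns (bytes_read, error), where error is None if every read succeeded.
    """
    total = 0
    try:
        with open(path, "rb") as f:
            while True:
                chunk = f.read(block_size)
                if not chunk:
                    return total, None
                total += len(chunk)
    except OSError as exc:
        # Covers both "cannot open" and a read error partway through.
        return total, str(exc)


def report(paths):
    """Build a one-line-per-device status report."""
    lines = []
    for path in paths:
        total, error = scan(path)
        if error is None:
            lines.append(f"{path}: OK, {total} bytes read")
        else:
            lines.append(f"{path}: READ ERROR after {total} bytes: {error}")
    return "\n".join(lines)
```

Printing `report([...])` from a cron job is enough to get the email, since cron mails a job's output to the owning user (or to the `MAILTO` address) by default.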
Guy
* RE: Hardware vs Software and Bad Block Relocation
2004-05-21 14:38 ` Guy
@ 2004-05-22 15:23 ` Marcel de Riedmatten
2004-05-22 16:27 ` Guy
0 siblings, 1 reply; 4+ messages in thread
From: Marcel de Riedmatten @ 2004-05-22 15:23 UTC (permalink / raw)
To: linux-raid
On Fri 21/05/2004 at 16:38, Guy wrote:
>
> If a disk has a bad block in 1 partition you could just dd zero to that
> partition, but still verify 100% of the disk.
Hi,
For the record, a somewhat smarter way to do that manually is described at
http://smartmontools.sourceforge.net/BadBlockHowTo.txt
Cheers
--
Marcel de Riedmatten
* RE: Hardware vs Software and Bad Block Relocation
2004-05-22 15:23 ` Marcel de Riedmatten
@ 2004-05-22 16:27 ` Guy
0 siblings, 0 replies; 4+ messages in thread
From: Guy @ 2004-05-22 16:27 UTC (permalink / raw)
To: 'Marcel de Riedmatten', linux-raid
It would be smarter, but it does not cover RAID. The logic to determine
which file is bad does not work. And if you correct your disk this way you
may corrupt your array. Until the RAID software has support for bad blocks,
just remove the disk from the array, then correct the bad block. The web
site mentioned does have some good info. But not all drives support the
smartctl options needed. My disks do not.
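
The "remove the disk, then correct the bad block" sequence can be written down as an ordered command plan. This is a sketch under assumptions: the thread does not name a management tool, so the use of `mdadm` manage-mode options (`--fail`, `--remove`, `--add`) is an assumption, `repair_plan` is a hypothetical helper, and the device names are placeholders. The function only builds the commands; it does not run them.

```python
def repair_plan(array, member):
    """Return, in order, the commands for taking `member` out of `array`,
    rewriting and verifying it with dd, and adding it back."""
    return [
        ["mdadm", array, "--fail", member],    # mark the member faulty
        ["mdadm", array, "--remove", member],  # detach it from the array
        ["dd", "if=/dev/zero", f"of={member}", "bs=64k"],  # rewrite every block
        ["dd", f"if={member}", "of=/dev/null", "bs=64k"],  # verify by reading
        ["mdadm", array, "--add", member],     # re-add; md then resyncs it
    ]


# Placeholder device names; print the plan instead of executing it.
for command in repair_plan("/dev/md0", "/dev/sdf1"):
    print(" ".join(command))
```

The array runs degraded from the first command until the resync after the last one finishes, which is the window the original question is worried about.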
Guy