bad block management


* bad block management
@ 2008-04-01  5:03 kgp
  2008-04-01 18:55 ` Christian Kujau
  0 siblings, 1 reply; 19+ messages in thread
From: kgp @ 2008-04-01  5:03 UTC (permalink / raw)
  To: reiserfs-devel


How does ReiserFS manage bad blocks?
If it does not support bad block management, please tell me which approaches
we can use to implement bad block management in any file system. I am
implementing a bad block manager for UDF and would like some input.
-- 
View this message in context: http://www.nabble.com/bad-block-management-tp16413477p16413477.html
Sent from the ReiserFS - General mailing list archive at Nabble.com.



* Re: bad block management
  2008-04-01  5:03 bad block management kgp
@ 2008-04-01 18:55 ` Christian Kujau
  2008-04-01 19:32   ` Ric Wheeler
  0 siblings, 1 reply; 19+ messages in thread
From: Christian Kujau @ 2008-04-01 18:55 UTC (permalink / raw)
  To: kgp; +Cc: reiserfs-devel

On Mon, 31 Mar 2008, kgp wrote:
> How ReiserFS manages bad blocks?

Is reiserfsck's --badblocks option helpful?

> If it is not supporting bad block management, plz tell me the approches we
> can use to implement bad block management in any file system. I am
> implementing bad block manager for UDF. I want some inputs

Doesn't "Spared UDF" already provide some kind of bad block management?

C.
-- 
BOFH excuse #261:

The Usenet news is out of date


* Re: bad block management
  2008-04-01 18:55 ` Christian Kujau
@ 2008-04-01 19:32   ` Ric Wheeler
  2008-04-01 19:51     ` Jeff Mahoney
  0 siblings, 1 reply; 19+ messages in thread
From: Ric Wheeler @ 2008-04-01 19:32 UTC (permalink / raw)
  To: Christian Kujau; +Cc: kgp, reiserfs-devel


Christian Kujau wrote:
> On Mon, 31 Mar 2008, kgp wrote:
>> How ReiserFS manages bad blocks?
> 
> Is reiserfsck's --badblocks option helpful?
> 
>> If it is not supporting bad block management, plz tell me the 
>> approches we
>> can use to implement bad block management in any file system. I am
>> implementing bad block manager for UDF. I want some inputs
> 
> Doesn't "Spared UDF" already provide some kind of bad block management?
> 
> C.

I am not sure what you want to do with bad block management.

Most modern disk drives remap bad sectors dynamically for you, so the file
system layer can stay out of the bad block mapping business entirely.

ric


* Re: bad block management
  2008-04-01 19:32   ` Ric Wheeler
@ 2008-04-01 19:51     ` Jeff Mahoney
  2008-04-01 22:11       ` Edward Shishkin
  2008-04-04  0:14       ` Zan Lynx
  0 siblings, 2 replies; 19+ messages in thread
From: Jeff Mahoney @ 2008-04-01 19:51 UTC (permalink / raw)
  To: ric; +Cc: Christian Kujau, kgp, reiserfs-devel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Ric Wheeler wrote:
> 
> Christian Kujau wrote:
>> On Mon, 31 Mar 2008, kgp wrote:
>>> How ReiserFS manages bad blocks?
>>
>> Is reiserfsck's --badblocks option helpful?
>>
>>> If it is not supporting bad block management, plz tell me the
>>> approches we
>>> can use to implement bad block management in any file system. I am
>>> implementing bad block manager for UDF. I want some inputs
>>
>> Doesn't "Spared UDF" already provide some kind of bad block management?
>>
>> C.
> 
> I am not sure what you want to do with bad block management.
> 
> With most modern disk drives, they will remap bad disk sectors
> dynamically for you so the file system layer can stay out of the bad
> block mapping business entirely.

He's asking about UDF, though, so I'd imagine he's talking about optical
media. It's even cheaper than disk though, so I guess I don't see the
benefit.

Reiserfs handles bad blocks by allocating the blocks input as "known
bad" to a special file that's inaccessible. It does _not_ do this
automatically, and reiserfsck/mkreiserfs must be passed a list of blocks
to allocate to that file. It also doesn't recover files that have been
corrupted due to the block failure.
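
The "bad block file" idea above can be sketched roughly as follows. This is
only a toy model under stated assumptions, not reiserfs code: the names
BlockAllocator, BAD_BLOCK_FILE and reserve_bad_blocks are hypothetical, and the
free-space map is a plain Python set rather than an on-disk bitmap.

```python
BAD_BLOCK_FILE = "<bad-blocks>"   # hypothetical hidden owner, never visible to users


class BlockAllocator:
    """Toy free-space map; illustrative only, not the reiserfs allocator."""

    def __init__(self, total_blocks):
        self.free = set(range(total_blocks))
        self.owner = {}               # block number -> owner label

    def reserve_bad_blocks(self, bad_blocks):
        """Hand the listed blocks to the hidden owner so they are never reused.

        Returns the blocks that could not be reserved (already in use or out
        of range). Data already stored on them is not recovered.
        """
        skipped = []
        for blk in bad_blocks:
            if blk in self.free:
                self.free.remove(blk)
                self.owner[blk] = BAD_BLOCK_FILE
            elif self.owner.get(blk) != BAD_BLOCK_FILE:
                skipped.append(blk)
        return skipped

    def allocate(self, owner):
        """Give the lowest free block to `owner`."""
        blk = min(self.free)
        self.free.remove(blk)
        self.owner[blk] = owner
        return blk


alloc = BlockAllocator(8)
first = alloc.allocate("file-a")                  # takes block 0
skipped = alloc.reserve_bad_blocks([0, 3, 5])     # list supplied by the admin
handed_out = [alloc.allocate("file-b") for _ in range(5)]
```

Blocks 3 and 5 are never handed out afterwards, while block 0 is reported back
as skipped because it already belongs to a file, which mirrors the point that
existing data on a failed block is not recovered.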

Ric's right about disk drives, though. They'll remap the bad sectors
automatically at the hardware level. When you start to see bad sectors
at the file system level, it means that the sectors reserved for
remapping have been exhausted and you should replace the disk.
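
A minimal sketch of that behaviour, assuming a drastically simplified drive:
ToyDrive and its methods are hypothetical names for illustration, and real
firmware is far more involved. The only point shown is that errors stay
hidden until the spare pool runs out.

```python
class ToyDrive:
    """Illustrative model of drive-level sector remapping (not real firmware)."""

    def __init__(self, spare_sectors):
        self.spares_left = spare_sectors
        self.remapped = {}        # logical sector -> spare slot index
        self.defective = set()    # logical sectors whose media has failed

    def mark_defective(self, sector):
        self.defective.add(sector)

    def write(self, sector):
        """Return True on success; False means the file system sees the error."""
        if sector in self.remapped or sector not in self.defective:
            return True
        if self.spares_left == 0:
            return False          # spare pool exhausted: the error is now visible
        self.remapped[sector] = len(self.remapped)
        self.spares_left -= 1
        return True


drive = ToyDrive(spare_sectors=2)
for s in (10, 20, 30):
    drive.mark_defective(s)
results = [drive.write(s) for s in (10, 20, 30)]
```

The first two defective sectors are absorbed silently; the third write fails
because no spares remain, which is the situation described above where the
disk should be replaced.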

- -Jeff

- --
Jeff Mahoney
SUSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.4-svn0 (GNU/Linux)
Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org

iD8DBQFH8pJXLPWxlyuTD7IRAuX5AJ4xfX/hFL5i608lEZ9dBcXZ2cuxngCfTBq+
L0gkAhJOlbl4FUb7cgafaM8=
=YqBe
-----END PGP SIGNATURE-----


* Re: bad block management
  2008-04-01 19:51     ` Jeff Mahoney
@ 2008-04-01 22:11       ` Edward Shishkin
  2008-04-02  4:50         ` jyotiv
  2008-04-04  0:14       ` Zan Lynx
  1 sibling, 1 reply; 19+ messages in thread
From: Edward Shishkin @ 2008-04-01 22:11 UTC (permalink / raw)
  To: Jeff Mahoney; +Cc: ric, Christian Kujau, kgp, reiserfs-devel

Jeff Mahoney wrote:

> Ric Wheeler wrote:
>
> >Christian Kujau wrote:
>
> >>On Mon, 31 Mar 2008, kgp wrote:
> >>
> >>>How ReiserFS manages bad blocks?
> >>
> >>Is reiserfsck's --badblocks option helpful?
> >>
> >>>If it is not supporting bad block management, plz tell me the
> >>>approches we
> >>>can use to implement bad block management in any file system. I am
> >>>implementing bad block manager for UDF. I want some inputs
> >>
> >>Doesn't "Spared UDF" already provide some kind of bad block management?
> >>
> >>C.
>
> >I am not sure what you want to do with bad block management.
>
> >With most modern disk drives, they will remap bad disk sectors
> >dynamically for you so the file system layer can stay out of the bad
> >block mapping business entirely.
>
>
> He's asking about UDF, though, so I'd imagine he's talking about optical
> media. It's even cheaper than disk though, so I guess I don't see the
> benefit.
>
> Reiserfs handles bad blocks by allocating the blocks input as "known
> bad" to special file that's inaccessible. It does _not_ do this
> automatically, and reiserfsck/mkreiserfs must be passed a list of blocks
> to allocate to that file. It also doesn't recover files that have been
> corrupted to the block failure.


Here are the instructions:
http://chichkin_i.zelnet.ru/bad-block-handling.html

>
> Ric's right about disk drives, though. They'll remap the bad sectors
> automatically at the hardware level. When you start to see bad sectors
> at the file system level, it means that the sectors reserved for
> remapping have been exhausted and you should replace the disk.
>
> -Jeff
>
> --
> Jeff Mahoney
> SUSE Labs

--
To unsubscribe from this list: send the line "unsubscribe reiserfs-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html



* RE: bad block management
  2008-04-01 22:11       ` Edward Shishkin
@ 2008-04-02  4:50         ` jyotiv
  2008-04-02 10:43           ` Ric Wheeler
  2008-04-02 13:14           ` Jeff Mahoney
  0 siblings, 2 replies; 19+ messages in thread
From: jyotiv @ 2008-04-02  4:50 UTC (permalink / raw)
  To: 'Edward Shishkin', 'Jeff Mahoney'
  Cc: ric, 'Christian Kujau', reiserfs-devel

Thanks a lot.

Modern disk drives manage bad blocks (sectors) themselves.

We are keeping in mind that the hard drive the user is using may not be
modern.
In such a case, bad block management at the file system level is helpful.

And for modern hard drives it is helpful only after the sectors reserved for
remapping are exhausted.


The UDF we have implemented is for high-capacity hard disks.
Thank you, everybody.

Rgrds,
Kgp



* Re: bad block management
  2008-04-02  4:50         ` jyotiv
@ 2008-04-02 10:43           ` Ric Wheeler
  2008-04-02 11:22             ` jyotiv
  2008-04-02 13:14           ` Jeff Mahoney
  1 sibling, 1 reply; 19+ messages in thread
From: Ric Wheeler @ 2008-04-02 10:43 UTC (permalink / raw)
  To: jyotiv
  Cc: 'Edward Shishkin', 'Jeff Mahoney',
	'Christian Kujau', reiserfs-devel


jyotiv wrote:
> Thanks a lot.
> 
> Modern disk drives manages the bad blocks(sectors).
> 
> We are keeping in mind that hard drive, the  user is using may not be
> modern.
> In such case this bad block management at file system level is help ful.
> 
> And for modern hard drives it is helpful only after reserved sector for
> remapping are exhasted.
> 
> 
> UDF what we have implemented is for hard disk with high capacity.
> Thank you every body.
> 
> Rgrds,
> Kgp
> 

I think that you will not see many drives still running that don't 
handle bad block remapping.

Best of luck!

Ric


* RE: bad block management
  2008-04-02 10:43           ` Ric Wheeler
@ 2008-04-02 11:22             ` jyotiv
  2008-04-02 13:31               ` Ric Wheeler
  0 siblings, 1 reply; 19+ messages in thread
From: jyotiv @ 2008-04-02 11:22 UTC (permalink / raw)
  To: 'Ric Wheeler'
  Cc: 'Edward Shishkin', 'Jeff Mahoney',
	'Christian Kujau', reiserfs-devel

OK, I agree with you.
May I ask why reiserfs handles bad blocks?

Rgrds,
Kgp


* Re: bad block management
  2008-04-02  4:50         ` jyotiv
  2008-04-02 10:43           ` Ric Wheeler
@ 2008-04-02 13:14           ` Jeff Mahoney
  1 sibling, 0 replies; 19+ messages in thread
From: Jeff Mahoney @ 2008-04-02 13:14 UTC (permalink / raw)
  To: jyotiv
  Cc: 'Edward Shishkin', ric, 'Christian Kujau',
	reiserfs-devel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

jyotiv wrote:
> We are keeping in mind that hard drive, the  user is using may not be
> modern.

> UDF what we have implemented is for hard disk with high capacity.

These two statements seem to conflict. Best of luck, though.

- -Jeff

- --
Jeff Mahoney
SUSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.4-svn0 (GNU/Linux)
Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org

iD8DBQFH84ajLPWxlyuTD7IRAkUUAJ4+dA0haEXc62wb16hYtudjTtcDFwCeNx92
kcfvFCqU43f9nGrPftSPnx8=
=1hys
-----END PGP SIGNATURE-----


* Re: bad block management
  2008-04-02 11:22             ` jyotiv
@ 2008-04-02 13:31               ` Ric Wheeler
  0 siblings, 0 replies; 19+ messages in thread
From: Ric Wheeler @ 2008-04-02 13:31 UTC (permalink / raw)
  To: jyotiv
  Cc: 'Edward Shishkin', 'Jeff Mahoney',
	'Christian Kujau', reiserfs-devel

jyotiv wrote:
> ok, i agree with u.
> Can I know why reiserfs is handling bag blocks?
> 
> Rgrds,
> Kgp
> 

Bad block handling in UNIX file systems is a really, really old feature (dating
back to the 1970s). I assume that reiserfs, when it was originally written, still
had some exposure to older drives of that era and that its authors assumed it
was a useful feature to add.

Personally, I would not put this kind of feature into any new file system being 
written today ;-)

ric


* Re: bad block management
  2008-04-01 19:51     ` Jeff Mahoney
  2008-04-01 22:11       ` Edward Shishkin
@ 2008-04-04  0:14       ` Zan Lynx
  2008-04-04  4:21         ` Toby Thain
  1 sibling, 1 reply; 19+ messages in thread
From: Zan Lynx @ 2008-04-04  0:14 UTC (permalink / raw)
  To: Jeff Mahoney; +Cc: ric, Christian Kujau, kgp, reiserfs-devel

On Tue, 2008-04-01 at 15:51 -0400, Jeff Mahoney wrote:

> Ric's right about disk drives, though. They'll remap the bad sectors
> automatically at the hardware level. When you start to see bad sectors
> at the file system level, it means that the sectors reserved for
> remapping have been exhausted and you should replace the disk.

There are a couple of cases where you can see bad block errors on a good
drive.

If a block is written with a bad CRC for some reason...the write head
got a freak blip or it lost power as it was writing, or the data went
corrupt while sitting on disk, then it will read as a bad block, but
rewriting would fix it.

A RAID media verify or a badblocks -n run can usually fix these.
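
The "reads as bad, but rewriting fixes it" case can be sketched like this. It
is a toy model only: the Sector class is hypothetical, and CRC-32 from zlib
stands in for the drive's real per-sector error-correcting code.

```python
import zlib


class Sector:
    """Toy sector: payload plus a stored CRC, standing in for the drive's ECC."""

    def __init__(self, data: bytes):
        self.write(data)

    def write(self, data: bytes):
        # A complete write stores the payload and a matching check value.
        self.data = bytearray(data)
        self.crc = zlib.crc32(data)

    def read(self) -> bytes:
        if zlib.crc32(bytes(self.data)) != self.crc:
            raise IOError("unrecoverable read error")
        return bytes(self.data)


sector = Sector(b"A" * 512)
sector.data[7] ^= 0x01            # simulate a torn write or bit rot

try:
    sector.read()
    read_failed = False
except IOError:
    read_failed = True

sector.write(b"B" * 512)          # rewriting stores fresh data and a fresh CRC
recovered = sector.read()
```

Note that the rewrite clears the error but cannot bring back the old
contents; those have to come from somewhere else, such as a mirror or a
backup.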
-- 
Zan Lynx <zlynx@acm.org>


* Re: bad block management
  2008-04-04  0:14       ` Zan Lynx
@ 2008-04-04  4:21         ` Toby Thain
  2008-04-04 16:12           ` Zan Lynx
  2008-04-04 18:58           ` Ric Wheeler
  0 siblings, 2 replies; 19+ messages in thread
From: Toby Thain @ 2008-04-04  4:21 UTC (permalink / raw)
  To: Zan Lynx; +Cc: Jeff Mahoney, ric, Christian Kujau, kgp, reiserfs-devel


On 3-Apr-08, at 8:14 PM, Zan Lynx wrote:
> On Tue, 2008-04-01 at 15:51 -0400, Jeff Mahoney wrote:
>
>> Ric's right about disk drives, though. They'll remap the bad sectors
>> automatically at the hardware level. When you start to see bad  
>> sectors
>> at the file system level, it means that the sectors reserved for
>> remapping have been exhausted and you should replace the disk.
>
> There are a couple of cases where you can see bad block errors on a  
> good
> drive.
>
> If a block is written with a bad CRC for some reason...the write head
> got a freak blip or it lost power as it was writing, or the data went
> corrupt while sitting on disk, then it will read as a bad block, but
> rewriting would fix it.
>
> A RAID media verify or a badblocks -n run can usually fix these.

Only if your RAID uses CRCs (most don't).

ZFS is the real answer to undetected corruption.

--Toby

> -- 
> Zan Lynx <zlynx@acm.org>



* Re: bad block management
  2008-04-04  4:21         ` Toby Thain
@ 2008-04-04 16:12           ` Zan Lynx
  2008-04-04 22:41             ` Toby Thain
  2008-04-04 18:58           ` Ric Wheeler
  1 sibling, 1 reply; 19+ messages in thread
From: Zan Lynx @ 2008-04-04 16:12 UTC (permalink / raw)
  To: Toby Thain; +Cc: Jeff Mahoney, ric, Christian Kujau, kgp, reiserfs-devel

On Fri, 2008-04-04 at 00:21 -0400, Toby Thain wrote:
> On 3-Apr-08, at 8:14 PM, Zan Lynx wrote:

> > A RAID media verify or a badblocks -n run can usually fix these.
> 
> Only if your RAID uses CRCs (most don't).
> 
> ZFS is the real answer to undetected corruption.

If one hard disk returns a CRC read error for a block but the other
RAID-1 mirror disk or the parity disk(s) return good blocks, the array
controller should know which data is good and which is bad in order to
rewrite a good copy.
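
As a rough sketch of that read-repair logic (an assumption-laden toy, not the
MD driver or any controller firmware; Disk and mirror_read are hypothetical
names), the array relies entirely on the drive reporting the error:

```python
class Disk:
    """Toy disk that raises IOError for blocks flagged as unreadable."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)
        self.unreadable = set()

    def read(self, n):
        if n in self.unreadable:
            raise IOError(f"read error on block {n}")
        return self.blocks[n]

    def write(self, n, data):
        self.blocks[n] = data
        self.unreadable.discard(n)    # a successful rewrite clears the error


def mirror_read(disks, n):
    """Return block n from any readable mirror and rewrite the failed copies."""
    good, failed = None, []
    for disk in disks:
        try:
            value = disk.read(n)
        except IOError:
            failed.append(disk)
            continue
        if good is None:
            good = value
    if good is None:
        raise IOError(f"block {n} unreadable on every mirror")
    for disk in failed:
        disk.write(n, good)
    return good


a = Disk({0: b"payload"})
b = Disk({0: b"payload"})
a.unreadable.add(0)                   # disk a reports a CRC read error
data = mirror_read([a, b], 0)
repaired = a.read(0)                  # readable again after the rewrite
```

If a copy is wrong but the drive returns it without an error, this scheme has
no way to tell which side is good.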
-- 
Zan Lynx <zlynx@acm.org>


* Re: bad block management
  2008-04-04  4:21         ` Toby Thain
  2008-04-04 16:12           ` Zan Lynx
@ 2008-04-04 18:58           ` Ric Wheeler
  2008-04-04 22:42             ` Toby Thain
  1 sibling, 1 reply; 19+ messages in thread
From: Ric Wheeler @ 2008-04-04 18:58 UTC (permalink / raw)
  To: Toby Thain; +Cc: Zan Lynx, Jeff Mahoney, Christian Kujau, kgp, reiserfs-devel

Toby Thain wrote:
> 
> On 3-Apr-08, at 8:14 PM, Zan Lynx wrote:
>> On Tue, 2008-04-01 at 15:51 -0400, Jeff Mahoney wrote:
>>
>>> Ric's right about disk drives, though. They'll remap the bad sectors
>>> automatically at the hardware level. When you start to see bad sectors
>>> at the file system level, it means that the sectors reserved for
>>> remapping have been exhausted and you should replace the disk.
>>
>> There are a couple of cases where you can see bad block errors on a good
>> drive.
>>
>> If a block is written with a bad CRC for some reason...the write head
>> got a freak blip or it lost power as it was writing, or the data went
>> corrupt while sitting on disk, then it will read as a bad block, but
>> rewriting would fix it.
>>
>> A RAID media verify or a badblocks -n run can usually fix these.
> 
> Only if your RAID uses CRCs (most don't).
> 
> ZFS is the real answer to undetected corruption.
> 
> --Toby

Zan is right - even on a local drive, a write can repair some sectors with bad 
protection bits. All disks have per-sector data protection (Reed-Solomon 
encoding, etc.) and there are lots of those bits per sector.

There is work on adding DIF (Data Integrity Field), which is extra bytes that 
arrays or local drives can store for application-level protection. Martin 
Petersen has some good slides about this on Linux:

http://oss.oracle.com/projects/data-integrity/documentation/

ZFS, for example, or more specifically its lvm layer, could use DIF to add this 
kind of protection.
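
For readers unfamiliar with DIF, here is a minimal sketch of the extra bytes
involved. It assumes the common layout of an 8-byte tuple per 512-byte sector
(a 16-bit guard CRC, a 16-bit application tag, and a 32-bit reference tag
typically derived from the sector's block address). The function names are
hypothetical and this is not the Linux block-layer API.

```python
import struct


def crc16_t10dif(data: bytes) -> int:
    """Bitwise CRC-16 with polynomial 0x8BB7, initial value 0, no reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x8BB7) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def make_dif(sector: bytes, app_tag: int, ref_tag: int) -> bytes:
    """Build the 8-byte tuple: guard tag, application tag, reference tag."""
    return struct.pack(">HHI", crc16_t10dif(sector), app_tag, ref_tag)


def verify_dif(sector: bytes, dif: bytes, expected_ref: int) -> bool:
    """Check the payload against the guard tag and the expected reference tag."""
    guard, _app, ref = struct.unpack(">HHI", dif)
    return guard == crc16_t10dif(sector) and ref == expected_ref


sector = bytes(range(256)) * 2             # one 512-byte sector
dif = make_dif(sector, app_tag=0, ref_tag=1234)

corrupted = bytearray(sector)
corrupted[100] ^= 0x40                     # damage somewhere along the data path
ok = verify_dif(sector, dif, 1234)
bad_data = verify_dif(bytes(corrupted), dif, 1234)
wrong_place = verify_dif(sector, dif, 1235)   # block landed at the wrong address
```

The guard tag catches a damaged payload, and the reference tag catches a
sector that was written to or read from the wrong location.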

The other way to go is to use an enterprise-class array - they all have multiple 
layers of data integrity baked in to deal with and correct these kinds of errors.

ric


* Re: bad block management
  2008-04-04 16:12           ` Zan Lynx
@ 2008-04-04 22:41             ` Toby Thain
  0 siblings, 0 replies; 19+ messages in thread
From: Toby Thain @ 2008-04-04 22:41 UTC (permalink / raw)
  To: Zan Lynx; +Cc: Jeff Mahoney, ric, Christian Kujau, kgp, reiserfs-devel


On 4-Apr-08, at 12:12 PM, Zan Lynx wrote:
> On Fri, 2008-04-04 at 00:21 -0400, Toby Thain wrote:
>> On 3-Apr-08, at 8:14 PM, Zan Lynx wrote:
>
>>> A RAID media verify or a badblocks -n run can usually fix these.
>>
>> Only if your RAID uses CRCs (most don't).
>>
>> ZFS is the real answer to undetected corruption.
>
> If one hard disk returns a CRC read error for a block but the other
> RAID-1 mirror disk or the parity disk(s) return good blocks, the array
> controller should know which data is good and which is bad in order to
> rewrite a good copy.


That's a lot of IFs. I prefer ZFS' approach: Check which side of the  
mirror is bad, and take the good side.
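
A rough sketch of that approach, under stated assumptions: the checksum of
each block is kept apart from the block itself (ZFS keeps it in the parent
block pointer), so a copy that reads back without any I/O error can still be
recognised as wrong. The function names are hypothetical and SHA-256 is used
only for convenience; this is not ZFS code.

```python
import hashlib


def checksum(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def self_healing_read(copies, expected):
    """Pick the mirror copy that matches `expected` and repair the others.

    copies:   list of bytearrays, one per side of the mirror
    expected: checksum stored separately from the data
    """
    good = next((bytes(c) for c in copies if checksum(bytes(c)) == expected), None)
    if good is None:
        raise IOError("no mirror copy matches the checksum")
    for c in copies:
        if bytes(c) != good:
            c[:] = good               # overwrite the bad side in place
    return good


block = b"important data"
expected = checksum(block)            # kept with the parent pointer, not the block
side_a, side_b = bytearray(block), bytearray(block)
side_a[0] ^= 0xFF                     # silent corruption: no I/O error is raised
healed = self_healing_read([side_a, side_b], expected)
```

Unlike the error-driven repair discussed earlier in the thread, nothing here
depends on the drive noticing the problem.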

--Toby

> -- 
> Zan Lynx <zlynx@acm.org>



* Re: bad block management
  2008-04-04 18:58           ` Ric Wheeler
@ 2008-04-04 22:42             ` Toby Thain
  2008-04-05 12:31               ` Ric Wheeler
  0 siblings, 1 reply; 19+ messages in thread
From: Toby Thain @ 2008-04-04 22:42 UTC (permalink / raw)
  To: ric; +Cc: Zan Lynx, Jeff Mahoney, Christian Kujau, kgp, reiserfs-devel


On 4-Apr-08, at 2:58 PM, Ric Wheeler wrote:
>
> Toby Thain wrote:
>> On 3-Apr-08, at 8:14 PM, Zan Lynx wrote:
>>> On Tue, 2008-04-01 at 15:51 -0400, Jeff Mahoney wrote:
>>>
>>>> Ric's right about disk drives, though. They'll remap the bad  
>>>> sectors
>>>> automatically at the hardware level. When you start to see bad  
>>>> sectors
>>>> at the file system level, it means that the sectors reserved for
>>>> remapping have been exhausted and you should replace the disk.
>>>
>>> There are a couple of cases where you can see bad block errors on  
>>> a good
>>> drive.
>>>
>>> If a block is written with a bad CRC for some reason...the write  
>>> head
>>> got a freak blip or it lost power as it was writing, or the data  
>>> went
>>> corrupt while sitting on disk, then it will read as a bad block, but
>>> rewriting would fix it.
>>>
>>> A RAID media verify or a badblocks -n run can usually fix these.
>> Only if your RAID uses CRCs (most don't).
>> ZFS is the real answer to undetected corruption.
>> --Toby
>
> Zan is right - even on a local drive, a write can repair some  
> sectors with bad protection bits. All disks have per sector data  
> protection (reed solomon encoding, etc) and there are lots of those  
> bits per sector.


That does not protect against writing bad data, only against some errors  
internal to the drive. There is a long way to travel between CPU and  
drive: cable, controller, RAM, and so on. ZFS protects the entire  
data path.

--Toby

>
> There is work on adding DIF (data integrity f?) which is extra  
> bytes that arrays or local drives can store for application level  
> protection. Martin Petersen has some good slides about this on linux:
>
> http://oss.oracle.com/projects/data-integrity/documentation/
>
> ZFS, for example, or more specifically its lvm layer, could use DIF  
> to add this kind of protection.
>
> The other way to go is to use an enterprise class array - they all  
> have multiple layers of data integrity baked in to deal with and  
> correct these kind of errors.
>
> ric



* Re: bad block management
  2008-04-04 22:42             ` Toby Thain
@ 2008-04-05 12:31               ` Ric Wheeler
  2008-04-05 14:07                 ` Toby Thain
  0 siblings, 1 reply; 19+ messages in thread
From: Ric Wheeler @ 2008-04-05 12:31 UTC (permalink / raw)
  To: Toby Thain; +Cc: Zan Lynx, Jeff Mahoney, Christian Kujau, kgp, reiserfs-devel

Toby Thain wrote:
> 
> On 4-Apr-08, at 2:58 PM, Ric Wheeler wrote:
>>
>> Toby Thain wrote:
>>> On 3-Apr-08, at 8:14 PM, Zan Lynx wrote:
>>>> On Tue, 2008-04-01 at 15:51 -0400, Jeff Mahoney wrote:
>>>>
>>>>> Ric's right about disk drives, though. They'll remap the bad sectors
>>>>> automatically at the hardware level. When you start to see bad sectors
>>>>> at the file system level, it means that the sectors reserved for
>>>>> remapping have been exhausted and you should replace the disk.
>>>>
>>>> There are a couple of cases where you can see bad block errors on a 
>>>> good
>>>> drive.
>>>>
>>>> If a block is written with a bad CRC for some reason...the write head
>>>> got a freak blip or it lost power as it was writing, or the data went
>>>> corrupt while sitting on disk, then it will read as a bad block, but
>>>> rewriting would fix it.
>>>>
>>>> A RAID media verify or a badblocks -n run can usually fix these.
>>> Only if your RAID uses CRCs (most don't).
>>> ZFS is the real answer to undetected corruption.
>>> --Toby
>>
>> Zan is right - even on a local drive, a write can repair some sectors 
>> with bad protection bits. All disks have per sector data protection 
>> (reed solomon encoding, etc) and there are lots of those bits per sector.
> 
> 
> That does not protect against writing bad data, only some errors 
> internal to drive. There is a long way to travel between CPU and drive. 
> Cable, controller, RAM, etc, etc, etc. ZFS protects the entire data path.
> 
> --Toby

If you want to protect the entire data path, you are looking at 
something like DIF which protects even more of the data path than ZFS 
since it adds a check from application space to the IO stack ;-)

ZFS does not export its protection bits up the stack.

ric


> 
>>
>> There is work on adding DIF (data integrity f?) which is extra bytes 
>> that arrays or local drives can store for application level 
>> protection. Martin Petersen has some good slides about this on linux:
>>
>> http://oss.oracle.com/projects/data-integrity/documentation/
>>
>> ZFS, for example, or more specifically its lvm layer, could use DIF to 
>> add this kind of protection.
>>
>> The other way to go is to use an enterprise class array - they all 
>> have multiple layers of data integrity baked in to deal with and 
>> correct these kind of errors.
>>
>> ric
> 
> 



* Re: bad block management
  2008-04-05 12:31               ` Ric Wheeler
@ 2008-04-05 14:07                 ` Toby Thain
  2008-04-05 15:08                   ` Ric Wheeler
  0 siblings, 1 reply; 19+ messages in thread
From: Toby Thain @ 2008-04-05 14:07 UTC (permalink / raw)
  To: Ric Wheeler; +Cc: Zan Lynx, Jeff Mahoney, Christian Kujau, kgp, reiserfs-devel


On 5-Apr-08, at 8:31 AM, Ric Wheeler wrote:
> Toby Thain wrote:
>> On 4-Apr-08, at 2:58 PM, Ric Wheeler wrote:
>>>
>>> Toby Thain wrote:
>>>> On 3-Apr-08, at 8:14 PM, Zan Lynx wrote:
>>>>> On Tue, 2008-04-01 at 15:51 -0400, Jeff Mahoney wrote:
>>>>>
>>>>>> Ric's right about disk drives, though. They'll remap the bad  
>>>>>> sectors
>>>>>> automatically at the hardware level. When you start to see bad  
>>>>>> sectors
>>>>>> at the file system level, it means that the sectors reserved for
>>>>>> remapping have been exhausted and you should replace the disk.
>>>>>
>>>>> There are a couple of cases where you can see bad block errors  
>>>>> on a good
>>>>> drive.
>>>>>
>>>>> If a block is written with a bad CRC for some reason...the  
>>>>> write head
>>>>> got a freak blip or it lost power as it was writing, or the  
>>>>> data went
>>>>> corrupt while sitting on disk, then it will read as a bad  
>>>>> block, but
>>>>> rewriting would fix it.
>>>>>
>>>>> A RAID media verify or a badblocks -n run can usually fix these.
>>>> Only if your RAID uses CRCs (most don't).
>>>> ZFS is the real answer to undetected corruption.
>>>> --Toby
>>>
>>> Zan is right - even on a local drive, a write can repair some  
>>> sectors with bad protection bits. All disks have per sector data  
>>> protection (reed solomon encoding, etc) and there are lots of  
>>> those bits per sector.
>> That does not protect against writing bad data, only some errors  
>> internal to drive. There is a long way to travel between CPU and  
>> drive. Cable, controller, RAM, etc, etc, etc. ZFS protects the  
>> entire data path.
>> --Toby
>
> If you want to protect the entire data path, you are looking at  
> something like DIF which protects even more of the data path than  
> ZFS since it adds a check from application space to the IO stack ;-)
>
> ZFS does not export its protection bits up the stack.

Correct, but it protects everything up to the system call. RAID does  
not even get close, even with perfect error reporting (which doesn't  
really exist anyway). :)

--Toby

>
> ric
>
>
>>>
>>> There is work on adding DIF (data integrity f?) which is extra  
>>> bytes that arrays or local drives can store for application level  
>>> protection. Martin Petersen has some good slides about this on  
>>> linux:
>>>
>>> http://oss.oracle.com/projects/data-integrity/documentation/
>>>
>>> ZFS, for example, or more specifically its lvm layer, could use  
>>> DIF to add this kind of protection.
>>>
>>> The other way to go is to use an enterprise class array - they  
>>> all have multiple layers of data integrity baked in to deal with  
>>> and correct these kind of errors.
>>>
>>> ric
>



* Re: bad block management
  2008-04-05 14:07                 ` Toby Thain
@ 2008-04-05 15:08                   ` Ric Wheeler
  0 siblings, 0 replies; 19+ messages in thread
From: Ric Wheeler @ 2008-04-05 15:08 UTC (permalink / raw)
  To: Toby Thain; +Cc: Zan Lynx, Jeff Mahoney, Christian Kujau, kgp, reiserfs-devel

Toby Thain wrote:
> 
> On 5-Apr-08, at 8:31 AM, Ric Wheeler wrote:
>> Toby Thain wrote:
>>> On 4-Apr-08, at 2:58 PM, Ric Wheeler wrote:
>>>>
>>>> Toby Thain wrote:
>>>>> On 3-Apr-08, at 8:14 PM, Zan Lynx wrote:
>>>>>> On Tue, 2008-04-01 at 15:51 -0400, Jeff Mahoney wrote:
>>>>>>
>>>>>>> Ric's right about disk drives, though. They'll remap the bad sectors
>>>>>>> automatically at the hardware level. When you start to see bad 
>>>>>>> sectors
>>>>>>> at the file system level, it means that the sectors reserved for
>>>>>>> remapping have been exhausted and you should replace the disk.
>>>>>>
>>>>>> There are a couple of cases where you can see bad block errors on 
>>>>>> a good
>>>>>> drive.
>>>>>>
>>>>>> If a block is written with a bad CRC for some reason...the write head
>>>>>> got a freak blip or it lost power as it was writing, or the data went
>>>>>> corrupt while sitting on disk, then it will read as a bad block, but
>>>>>> rewriting would fix it.
>>>>>>
>>>>>> A RAID media verify or a badblocks -n run can usually fix these.
>>>>> Only if your RAID uses CRCs (most don't).
>>>>> ZFS is the real answer to undetected corruption.
>>>>> --Toby
>>>>
>>>> Zan is right - even on a local drive, a write can repair some 
>>>> sectors with bad protection bits. All disks have per sector data 
>>>> protection (Reed-Solomon encoding, etc.) and there are lots of those
>>>> bits per sector.
>>> That does not protect against writing bad data, only some errors 
>>> internal to the drive. There is a long way to travel between CPU and
>>> drive. Cable, controller, RAM, etc, etc, etc. ZFS protects the entire 
>>> data path.
>>> --Toby
>>
>> If you want to protect the entire data path, you are looking at 
>> something like DIF which protects even more of the data path than ZFS 
>> since it adds a check from application space to the IO stack ;-)
>>
>> ZFS does not export its protection bits up the stack.
> 
> Correct, but it protects everything up to the system call. RAID does not 
> even get close, even with perfect error reporting (which doesn't really 
> exist anyway). :)
> 
> --Toby
> 

When you look in detail at how data is lost in working systems, it is 
always interesting to look at the big buckets of common failures and 
make sure that we balance the complexity and cost (in money or in 
performance) against the realized improvement.

What RAID does well is to protect against the leading and really, really 
common error case which is single or few sector errors on a disk drive. 
Those errors are almost always reported as IO errors and RAID systems 
(including our MD software RAID) will do the right thing when only one 
sector in a stripe is bad with a media error.
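
A minimal sketch of that recovery step, assuming a simple single-parity (RAID-5 style) stripe. The helper names are made up for illustration and this is not how MD is implemented; it only shows why one unreadable sector per stripe is recoverable once the drive has reported which sector failed.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild(stripe, bad_index):
    """Recompute the block at bad_index from every other block in the stripe."""
    survivors = [blk for i, blk in enumerate(stripe) if i != bad_index]
    return xor_blocks(survivors)

# Three data sectors plus one parity sector make up the stripe.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)
stripe = data + [parity]

# The drive reports a media error on sector 1; XOR of the rest restores it.
recovered = rebuild(stripe, 1)
```

The scheme depends on the error being reported: if a sector returns wrong data without an IO error, the parity alone cannot say which block in the stripe is bad.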

The interesting question is what failure is the second most common.

That, from what I see, is normally application/SW errors. It can be bugs 
in the fs or IO stack, but it is also common to lose data to bad 
applications.

I don't have first hand experience with ZFS, but in any complicated 
system you risk increasing the error rate (certainly for early 
adopters ;-)) while the developers figure out how their 
implementation differs from their pristine design (or what the design 
concept missed).

My measured results of the reliability of reiserfs (v3) over a really 
large population show that we do quite well (when you use barriers or 
disable the write cache).

It will be interesting to look for the first ZFS study (like the CMU 
paper by Bianca on disk failure, the Google paper on failures and the 
recent NetApp/UWisc papers on IO stack failures) to see how ZFS does in 
the wild.

ric


Thread overview: 19+ messages
2008-04-01  5:03 bad block management kgp
2008-04-01 18:55 ` Christian Kujau
2008-04-01 19:32   ` Ric Wheeler
2008-04-01 19:51     ` Jeff Mahoney
2008-04-01 22:11       ` Edward Shishkin
2008-04-02  4:50         ` jyotiv
2008-04-02 10:43           ` Ric Wheeler
2008-04-02 11:22             ` jyotiv
2008-04-02 13:31               ` Ric Wheeler
2008-04-02 13:14           ` Jeff Mahoney
2008-04-04  0:14       ` Zan Lynx
2008-04-04  4:21         ` Toby Thain
2008-04-04 16:12           ` Zan Lynx
2008-04-04 22:41             ` Toby Thain
2008-04-04 18:58           ` Ric Wheeler
2008-04-04 22:42             ` Toby Thain
2008-04-05 12:31               ` Ric Wheeler
2008-04-05 14:07                 ` Toby Thain
2008-04-05 15:08                   ` Ric Wheeler
