* bad block management @ 2008-04-01 5:03 kgp 2008-04-01 18:55 ` Christian Kujau 0 siblings, 1 reply; 19+ messages in thread From: kgp @ 2008-04-01 5:03 UTC (permalink / raw) To: reiserfs-devel How does ReiserFS manage bad blocks? If it does not support bad block management, please tell me what approaches can be used to implement bad block management in a file system. I am implementing a bad block manager for UDF and would appreciate some input. -- View this message in context: http://www.nabble.com/bad-block-management-tp16413477p16413477.html Sent from the ReiserFS - General mailing list archive at Nabble.com. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: bad block management 2008-04-01 5:03 bad block management kgp @ 2008-04-01 18:55 ` Christian Kujau 2008-04-01 19:32 ` Ric Wheeler 0 siblings, 1 reply; 19+ messages in thread From: Christian Kujau @ 2008-04-01 18:55 UTC (permalink / raw) To: kgp; +Cc: reiserfs-devel On Mon, 31 Mar 2008, kgp wrote: > How ReiserFS manages bad blocks? Is reiserfsck's --badblocks option helpful? > If it is not supporting bad block management, plz tell me the approches we > can use to implement bad block management in any file system. I am > implementing bad block manager for UDF. I want some inputs Doesn't "Spared UDF" already provide some kind of bad block management? C. -- BOFH excuse #261: The Usenet news is out of date ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: bad block management 2008-04-01 18:55 ` Christian Kujau @ 2008-04-01 19:32 ` Ric Wheeler 2008-04-01 19:51 ` Jeff Mahoney 0 siblings, 1 reply; 19+ messages in thread From: Ric Wheeler @ 2008-04-01 19:32 UTC (permalink / raw) To: Christian Kujau; +Cc: kgp, reiserfs-devel Christian Kujau wrote: > On Mon, 31 Mar 2008, kgp wrote: >> How ReiserFS manages bad blocks? > > Is reiserfsck's --badblocks option helpful? > >> If it is not supporting bad block management, plz tell me the >> approches we >> can use to implement bad block management in any file system. I am >> implementing bad block manager for UDF. I want some inputs > > Doesn't "Spared UDF" already provide some kind of bad block management? > > C. I am not sure what you want to do with bad block management. With most modern disk drives, they will remap bad disk sectors dynamically for you so the file system layer can stay out of the bad block mapping business entirely. ric ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: bad block management 2008-04-01 19:32 ` Ric Wheeler @ 2008-04-01 19:51 ` Jeff Mahoney 2008-04-01 22:11 ` Edward Shishkin 2008-04-04 0:14 ` Zan Lynx 0 siblings, 2 replies; 19+ messages in thread From: Jeff Mahoney @ 2008-04-01 19:51 UTC (permalink / raw) To: ric; +Cc: Christian Kujau, kgp, reiserfs-devel Ric Wheeler wrote: > > Christian Kujau wrote: >> On Mon, 31 Mar 2008, kgp wrote: >>> How ReiserFS manages bad blocks? >> >> Is reiserfsck's --badblocks option helpful? >> >>> If it is not supporting bad block management, plz tell me the >>> approches we >>> can use to implement bad block management in any file system. I am >>> implementing bad block manager for UDF. I want some inputs >> >> Doesn't "Spared UDF" already provide some kind of bad block management? >> >> C. > > I am not sure what you want to do with bad block management. > > With most modern disk drives, they will remap bad disk sectors > dynamically for you so the file system layer can stay out of the bad > block mapping business entirely. He's asking about UDF, though, so I'd imagine he's talking about optical media. It's even cheaper than disk though, so I guess I don't see the benefit. Reiserfs handles bad blocks by allocating the blocks input as "known bad" to a special file that's inaccessible. It does _not_ do this automatically, and reiserfsck/mkreiserfs must be passed a list of blocks to allocate to that file. It also doesn't recover files that have been corrupted due to the block failure. Ric's right about disk drives, though. They'll remap the bad sectors automatically at the hardware level. When you start to see bad sectors at the file system level, it means that the sectors reserved for remapping have been exhausted and you should replace the disk.
-Jeff -- Jeff Mahoney SUSE Labs ^ permalink raw reply [flat|nested] 19+ messages in thread
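The scheme Jeff describes (pin the known-bad blocks to an owner that ordinary allocation never touches) can be sketched roughly as below. This is an illustrative model only, not ReiserFS's on-disk format or code: the class name BlockAllocator, the in-memory bitmap, and the bad_blocks set standing in for the inaccessible file are all assumptions made for the example.

```python
class BlockAllocator:
    """Toy block allocator with a reserved bad-block owner (hypothetical)."""

    def __init__(self, total_blocks, bad_blocks=()):
        # One flag per block: True means "in use".
        self.used = [False] * total_blocks
        # Stands in for the hidden file that owns every known-bad block.
        self.bad_blocks = set()
        for block in bad_blocks:
            self.mark_bad(block)

    def mark_bad(self, block):
        """Pin a block supplied by the administrator as known bad."""
        if not 0 <= block < len(self.used):
            raise ValueError(f"block {block} is outside the device")
        if self.used[block] and block not in self.bad_blocks:
            # Mirrors the caveat in the message: data already stored in a
            # failing block is not recovered by this mechanism.
            raise RuntimeError(f"block {block} already holds data")
        self.used[block] = True
        self.bad_blocks.add(block)

    def allocate(self):
        """Return the first free block; bad blocks are never handed out."""
        for block, in_use in enumerate(self.used):
            if not in_use:
                self.used[block] = True
                return block
        raise RuntimeError("no free blocks left")


allocator = BlockAllocator(6, bad_blocks=[0, 2])
handed_out = [allocator.allocate() for _ in range(4)]
```

Nothing here detects bad blocks on its own; as in the message, the list has to be supplied from outside.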
* Re: bad block management 2008-04-01 19:51 ` Jeff Mahoney @ 2008-04-01 22:11 ` Edward Shishkin 2008-04-02 4:50 ` jyotiv 2008-04-04 0:14 ` Zan Lynx 1 sibling, 1 reply; 19+ messages in thread From: Edward Shishkin @ 2008-04-01 22:11 UTC (permalink / raw) To: Jeff Mahoney; +Cc: ric, Christian Kujau, kgp, reiserfs-devel Jeff Mahoney wrote: > Ric Wheeler wrote: > > >Christian Kujau wrote: > > >>On Mon, 31 Mar 2008, kgp wrote: > >> > >>>How ReiserFS manages bad blocks? > >> > >>Is reiserfsck's --badblocks option helpful? > >> > >>>If it is not supporting bad block management, plz tell me the > >>>approches we > >>>can use to implement bad block management in any file system. I am > >>>implementing bad block manager for UDF. I want some inputs > >> > >>Doesn't "Spared UDF" already provide some kind of bad block management? > >> > >>C. > > >I am not sure what you want to do with bad block management. > > >With most modern disk drives, they will remap bad disk sectors > >dynamically for you so the file system layer can stay out of the bad > >block mapping business entirely. > > > He's asking about UDF, though, so I'd imagine he's talking about optical > media. It's even cheaper than disk though, so I guess I don't see the > benefit. > > Reiserfs handles bad blocks by allocating the blocks input as "known > bad" to special file that's inaccessible. It does _not_ do this > automatically, and reiserfsck/mkreiserfs must be passed a list of blocks > to allocate to that file. It also doesn't recover files that have been > corrupted to the block failure. Here are the instructions: http://chichkin_i.zelnet.ru/bad-block-handling.html > > Ric's right about disk drives, though. They'll remap the bad sectors > automatically at the hardware level. When you start to see bad sectors > at the file system level, it means that the sectors reserved for > remapping have been exhausted and you should replace the disk. 
> > -Jeff > > -- > Jeff Mahoney > SUSE Labs -- To unsubscribe from this list: send the line "unsubscribe reiserfs-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: bad block management 2008-04-01 22:11 ` Edward Shishkin @ 2008-04-02 4:50 ` jyotiv 2008-04-02 10:43 ` Ric Wheeler 2008-04-02 13:14 ` Jeff Mahoney 0 siblings, 2 replies; 19+ messages in thread From: jyotiv @ 2008-04-02 4:50 UTC (permalink / raw) To: 'Edward Shishkin', 'Jeff Mahoney' Cc: ric, 'Christian Kujau', reiserfs-devel Thanks a lot. Modern disk drives manage bad blocks (sectors) themselves. We are keeping in mind that the hard drive the user is using may not be modern. In that case, bad block management at the file system level is helpful. And for modern hard drives, it is helpful only after the sectors reserved for remapping are exhausted. The UDF we have implemented is for high-capacity hard disks. Thank you, everybody. Rgrds, Kgp
The information contained in this electronic message and any attachments to this message are intended for the exclusive use of the addressee(s) and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you should not disseminate, distribute or copy this e-mail. Please notify the sender immediately and destroy all copies of this message and any attachments contained in it. Contact your Administrator for further information. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: bad block management 2008-04-02 4:50 ` jyotiv @ 2008-04-02 10:43 ` Ric Wheeler 2008-04-02 11:22 ` jyotiv 2008-04-02 13:14 ` Jeff Mahoney 1 sibling, 1 reply; 19+ messages in thread From: Ric Wheeler @ 2008-04-02 10:43 UTC (permalink / raw) To: jyotiv Cc: 'Edward Shishkin', 'Jeff Mahoney', 'Christian Kujau', reiserfs-devel jyotiv wrote: > Thanks a lot. > > Modern disk drives manages the bad blocks(sectors). > > We are keeping in mind that hard drive, the user is using may not be > modern. > In such case this bad block management at file system level is help ful. > > And for modern hard drives it is helpful only after reserved sector for > remapping are exhasted. > > > UDF what we have implemented is for hard disk with high capacity. > Thank you every body. > > Rgrds, > Kgp > I think that you will not see many drives still running that don't handle bad block remapping. Best of luck! Ric ^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: bad block management 2008-04-02 10:43 ` Ric Wheeler @ 2008-04-02 11:22 ` jyotiv 2008-04-02 13:31 ` Ric Wheeler 0 siblings, 1 reply; 19+ messages in thread From: jyotiv @ 2008-04-02 11:22 UTC (permalink / raw) To: 'Ric Wheeler' Cc: 'Edward Shishkin', 'Jeff Mahoney', 'Christian Kujau', reiserfs-devel OK, I agree with you. May I ask why reiserfs handles bad blocks? Rgrds, Kgp ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: bad block management 2008-04-02 11:22 ` jyotiv @ 2008-04-02 13:31 ` Ric Wheeler 0 siblings, 0 replies; 19+ messages in thread From: Ric Wheeler @ 2008-04-02 13:31 UTC (permalink / raw) To: jyotiv Cc: 'Edward Shishkin', 'Jeff Mahoney', 'Christian Kujau', reiserfs-devel jyotiv wrote: > ok, i agree with u. > Can I know why reiserfs is handling bag blocks? > > Rgrds, > Kgp > Bad block handling in UNIX file systems is a really, really old feature (dating back to the 1970s). I assume that reiserfs, when it was originally written, still had some exposure to older drives of that era, and its authors assumed it was a useful feature to add. Personally, I would not put this kind of feature into any new file system being written today ;-) ric ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: bad block management 2008-04-02 4:50 ` jyotiv 2008-04-02 10:43 ` Ric Wheeler @ 2008-04-02 13:14 ` Jeff Mahoney 1 sibling, 0 replies; 19+ messages in thread From: Jeff Mahoney @ 2008-04-02 13:14 UTC (permalink / raw) To: jyotiv Cc: 'Edward Shishkin', ric, 'Christian Kujau', reiserfs-devel jyotiv wrote: > We are keeping in mind that hard drive, the user is using may not be > modern. > UDF what we have implemented is for hard disk with high capacity. These two statements seem to conflict. Best of luck, though. -Jeff -- Jeff Mahoney SUSE Labs ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: bad block management 2008-04-01 19:51 ` Jeff Mahoney 2008-04-01 22:11 ` Edward Shishkin @ 2008-04-04 0:14 ` Zan Lynx 2008-04-04 4:21 ` Toby Thain 1 sibling, 1 reply; 19+ messages in thread From: Zan Lynx @ 2008-04-04 0:14 UTC (permalink / raw) To: Jeff Mahoney; +Cc: ric, Christian Kujau, kgp, reiserfs-devel On Tue, 2008-04-01 at 15:51 -0400, Jeff Mahoney wrote: > Ric's right about disk drives, though. They'll remap the bad sectors > automatically at the hardware level. When you start to see bad sectors > at the file system level, it means that the sectors reserved for > remapping have been exhausted and you should replace the disk. There are a couple of cases where you can see bad block errors on a good drive. If a block is written with a bad CRC for some reason...the write head got a freak blip or it lost power as it was writing, or the data went corrupt while sitting on disk, then it will read as a bad block, but rewriting would fix it. A RAID media verify or a badblocks -n run can usually fix these. -- Zan Lynx <zlynx@acm.org> ^ permalink raw reply [flat|nested] 19+ messages in thread
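Zan's point, that a sector whose stored check code no longer matches its data reads as bad until it is rewritten, can be illustrated with a small simulation. This is a hedged sketch, not how any drive firmware or the badblocks tool is implemented: SimulatedDisk, corrupt() and rewrite_unreadable() are hypothetical names, and zlib.crc32 merely stands in for the drive's internal per-sector code.

```python
import zlib


class SimulatedDisk:
    """In-memory stand-in for a drive that keeps a check code per sector."""

    def __init__(self, sectors, sector_size=512):
        self.sector_size = sector_size
        self.data = [bytes(sector_size) for _ in range(sectors)]
        self.crc = [zlib.crc32(d) for d in self.data]

    def write(self, lba, payload):
        if len(payload) != self.sector_size:
            raise ValueError("payload must be exactly one sector")
        # A successful write stores fresh data and a matching check code.
        self.data[lba] = payload
        self.crc[lba] = zlib.crc32(payload)

    def read(self, lba):
        # A mismatch is reported as a media error, like a bad sector.
        if zlib.crc32(self.data[lba]) != self.crc[lba]:
            raise IOError(f"media error at sector {lba}")
        return self.data[lba]

    def corrupt(self, lba):
        """Simulate a torn or glitched write: data changes, code goes stale."""
        old = self.data[lba]
        self.data[lba] = bytes([old[0] ^ 0xFF]) + old[1:]


def rewrite_unreadable(disk, replacement):
    """Rewrite every sector that fails to read, using replacement(lba)."""
    repaired = []
    for lba in range(len(disk.data)):
        try:
            disk.read(lba)
        except IOError:
            disk.write(lba, replacement(lba))
            repaired.append(lba)
    return repaired


disk = SimulatedDisk(4)
disk.write(1, b"a" * 512)
disk.corrupt(1)
repaired = rewrite_unreadable(disk, lambda lba: b"a" * 512)
```

The replacement data has to come from somewhere else (a mirror, parity, or a backup); the rewrite only clears the error condition on the medium.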
* Re: bad block management 2008-04-04 0:14 ` Zan Lynx @ 2008-04-04 4:21 ` Toby Thain 2008-04-04 16:12 ` Zan Lynx 2008-04-04 18:58 ` Ric Wheeler 0 siblings, 2 replies; 19+ messages in thread From: Toby Thain @ 2008-04-04 4:21 UTC (permalink / raw) To: Zan Lynx; +Cc: Jeff Mahoney, ric, Christian Kujau, kgp, reiserfs-devel On 3-Apr-08, at 8:14 PM, Zan Lynx wrote: > On Tue, 2008-04-01 at 15:51 -0400, Jeff Mahoney wrote: > >> Ric's right about disk drives, though. They'll remap the bad sectors >> automatically at the hardware level. When you start to see bad >> sectors >> at the file system level, it means that the sectors reserved for >> remapping have been exhausted and you should replace the disk. > > There are a couple of cases where you can see bad block errors on a > good > drive. > > If a block is written with a bad CRC for some reason...the write head > got a freak blip or it lost power as it was writing, or the data went > corrupt while sitting on disk, then it will read as a bad block, but > rewriting would fix it. > > A RAID media verify or a badblocks -n run can usually fix these. Only if your RAID uses CRCs (most don't). ZFS is the real answer to undetected corruption. --Toby > -- > Zan Lynx <zlynx@acm.org> ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: bad block management 2008-04-04 4:21 ` Toby Thain @ 2008-04-04 16:12 ` Zan Lynx 2008-04-04 22:41 ` Toby Thain 2008-04-04 18:58 ` Ric Wheeler 1 sibling, 1 reply; 19+ messages in thread From: Zan Lynx @ 2008-04-04 16:12 UTC (permalink / raw) To: Toby Thain; +Cc: Jeff Mahoney, ric, Christian Kujau, kgp, reiserfs-devel On Fri, 2008-04-04 at 00:21 -0400, Toby Thain wrote: > On 3-Apr-08, at 8:14 PM, Zan Lynx wrote: > > A RAID media verify or a badblocks -n run can usually fix these. > > Only if your RAID uses CRCs (most don't). > > ZFS is the real answer to undetected corruption. If one hard disk returns a CRC read error for a block but the other RAID-1 mirror disk or the parity disk(s) return good blocks, the array controller should know which data is good and which is bad in order to rewrite a good copy. -- Zan Lynx <zlynx@acm.org> ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: bad block management 2008-04-04 16:12 ` Zan Lynx @ 2008-04-04 22:41 ` Toby Thain 0 siblings, 0 replies; 19+ messages in thread From: Toby Thain @ 2008-04-04 22:41 UTC (permalink / raw) To: Zan Lynx; +Cc: Jeff Mahoney, ric, Christian Kujau, kgp, reiserfs-devel On 4-Apr-08, at 12:12 PM, Zan Lynx wrote: > On Fri, 2008-04-04 at 00:21 -0400, Toby Thain wrote: >> On 3-Apr-08, at 8:14 PM, Zan Lynx wrote: > >>> A RAID media verify or a badblocks -n run can usually fix these. >> >> Only if your RAID uses CRCs (most don't). >> >> ZFS is the real answer to undetected corruption. > > If one hard disk returns a CRC read error for a block but the other > RAID-1 mirror disk or the parity disk(s) return good blocks, the array > controller should know which data is good and which is bad in order to > rewrite a good copy. That's a lot of IFs. I prefer ZFS' approach: Check which side of the mirror is bad, and take the good side. --Toby > -- > Zan Lynx <zlynx@acm.org> ^ permalink raw reply [flat|nested] 19+ messages in thread
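The approach Toby refers to, deciding which side of a mirror is good by checking each copy against an independently stored checksum, might look roughly like this. It is a simplified sketch under stated assumptions and not ZFS code: ChecksummedMirror and its dictionaries are hypothetical, SHA-256 is chosen only for convenience, and the checksum table stands in for checksums kept separately from the data they cover.

```python
import hashlib


def checksum(block):
    return hashlib.sha256(block).digest()


class ChecksummedMirror:
    """Two-way mirror that verifies reads against a separate checksum."""

    def __init__(self):
        self.sides = [{}, {}]   # address -> block, one dict per mirror side
        self.checksums = {}     # address -> expected checksum, kept apart

    def write(self, addr, block):
        self.checksums[addr] = checksum(block)
        for side in self.sides:
            side[addr] = block

    def read(self, addr):
        expected = self.checksums[addr]
        for side in self.sides:
            candidate = side[addr]
            if checksum(candidate) == expected:
                # Found a good copy: repair any side that disagrees.
                for other in self.sides:
                    if checksum(other[addr]) != expected:
                        other[addr] = candidate
                return candidate
        raise IOError(f"every copy of block {addr} fails its checksum")


mirror = ChecksummedMirror()
mirror.write(7, b"payload")
# Silent corruption: the device would return this without any I/O error.
mirror.sides[0][7] = b"pAyload"
result = mirror.read(7)
```

The difference from the RAID-1 case in the preceding message is that no drive has to report an error: the bad side is identified even when both copies read back "successfully".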
* Re: bad block management 2008-04-04 4:21 ` Toby Thain 2008-04-04 16:12 ` Zan Lynx @ 2008-04-04 18:58 ` Ric Wheeler 2008-04-04 22:42 ` Toby Thain 1 sibling, 1 reply; 19+ messages in thread From: Ric Wheeler @ 2008-04-04 18:58 UTC (permalink / raw) To: Toby Thain; +Cc: Zan Lynx, Jeff Mahoney, Christian Kujau, kgp, reiserfs-devel Toby Thain wrote: > > On 3-Apr-08, at 8:14 PM, Zan Lynx wrote: >> On Tue, 2008-04-01 at 15:51 -0400, Jeff Mahoney wrote: >> >>> Ric's right about disk drives, though. They'll remap the bad sectors >>> automatically at the hardware level. When you start to see bad sectors >>> at the file system level, it means that the sectors reserved for >>> remapping have been exhausted and you should replace the disk. >> >> There are a couple of cases where you can see bad block errors on a good >> drive. >> >> If a block is written with a bad CRC for some reason...the write head >> got a freak blip or it lost power as it was writing, or the data went >> corrupt while sitting on disk, then it will read as a bad block, but >> rewriting would fix it. >> >> A RAID media verify or a badblocks -n run can usually fix these. > > Only if your RAID uses CRCs (most don't). > > ZFS is the real answer to undetected corruption. > > --Toby Zan is right - even on a local drive, a write can repair some sectors with bad protection bits. All disks have per-sector data protection (Reed-Solomon encoding, etc.) and there are lots of those bits per sector. There is work on adding DIF (Data Integrity Field), which is extra bytes that arrays or local drives can store for application-level protection. Martin Petersen has some good slides about this on Linux: http://oss.oracle.com/projects/data-integrity/documentation/ ZFS, for example, or more specifically its LVM layer, could use DIF to add this kind of protection. The other way to go is to use an enterprise-class array - they all have multiple layers of data integrity baked in to deal with and correct these kinds of errors.
ric ^ permalink raw reply [flat|nested] 19+ messages in thread
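To make the "extra bytes" Ric mentions concrete, here is a rough sketch of a DIF-style protection tuple for one 512-byte sector. The layout assumed here (a 2-byte CRC guard tag using polynomial 0x8BB7, a 2-byte application tag, and a 4-byte reference tag holding the low 32 bits of the target LBA, all big-endian) follows the T10 Type 1 format as commonly described; treat it as an assumption to check against the standard, and the function names as hypothetical.

```python
import struct

T10_DIF_POLY = 0x8BB7  # assumed guard-tag polynomial (CRC-16, init 0, no reflection)


def crc16_t10dif(data):
    """Bitwise CRC-16 over data, MSB first, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ T10_DIF_POLY) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def make_protection_info(sector, lba, app_tag=0):
    """Build the 8 bytes stored alongside a sector: guard, app tag, ref tag."""
    guard = crc16_t10dif(sector)
    return struct.pack(">HHI", guard, app_tag, lba & 0xFFFFFFFF)


def verify(sector, protection_info, lba):
    """Check both the data (guard tag) and where it landed (reference tag)."""
    guard, _app_tag, ref = struct.unpack(">HHI", protection_info)
    return guard == crc16_t10dif(sector) and ref == (lba & 0xFFFFFFFF)


sector = bytes(range(256)) * 2            # 512 bytes of sample data
info = make_protection_info(sector, lba=1234)
ok = verify(sector, info, lba=1234)
misdirected = verify(sector, info, lba=1235)
damaged = verify(b"\xff" + sector[1:], info, lba=1234)
```

The reference tag is what lets a receiver catch a write that arrived intact but at the wrong address, which a data checksum alone would not reveal.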
* Re: bad block management 2008-04-04 18:58 ` Ric Wheeler @ 2008-04-04 22:42 ` Toby Thain 2008-04-05 12:31 ` Ric Wheeler 0 siblings, 1 reply; 19+ messages in thread From: Toby Thain @ 2008-04-04 22:42 UTC (permalink / raw) To: ric; +Cc: Zan Lynx, Jeff Mahoney, Christian Kujau, kgp, reiserfs-devel On 4-Apr-08, at 2:58 PM, Ric Wheeler wrote: > > Toby Thain wrote: >> On 3-Apr-08, at 8:14 PM, Zan Lynx wrote: >>> On Tue, 2008-04-01 at 15:51 -0400, Jeff Mahoney wrote: >>> >>>> Ric's right about disk drives, though. They'll remap the bad >>>> sectors >>>> automatically at the hardware level. When you start to see bad >>>> sectors >>>> at the file system level, it means that the sectors reserved for >>>> remapping have been exhausted and you should replace the disk. >>> >>> There are a couple of cases where you can see bad block errors on >>> a good >>> drive. >>> >>> If a block is written with a bad CRC for some reason...the write >>> head >>> got a freak blip or it lost power as it was writing, or the data >>> went >>> corrupt while sitting on disk, then it will read as a bad block, but >>> rewriting would fix it. >>> >>> A RAID media verify or a badblocks -n run can usually fix these. >> Only if your RAID uses CRCs (most don't). >> ZFS is the real answer to undetected corruption. >> --Toby > > Zan is right - even on a local drive, a write can repair some > sectors with bad protection bits. All disks have per sector data > protection (reed solomon encoding, etc) and there are lots of those > bits per sector. That does not protect against writing bad data, only some errors internal to drive. There is a long way to travel between CPU and drive. Cable, controller, RAM, etc, etc, etc. ZFS protects the entire data path. --Toby > > There is work on adding DIF (data integrity f?) which is extra > bytes that arrays or local drives can store for application level > protection. 
Martin Petersen has some good slides about this on linux: > > http://oss.oracle.com/projects/data-integrity/documentation/ > > ZFS, for example, or more specifically its lvm layer, could use DIF > to add this kind of protection. > > The other way to go is to use an enterprise class array - they all > have multiple layers of data integrity baked in to deal with and > correct these kind of errors. > > ric ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: bad block management 2008-04-04 22:42 ` Toby Thain @ 2008-04-05 12:31 ` Ric Wheeler 2008-04-05 14:07 ` Toby Thain 0 siblings, 1 reply; 19+ messages in thread From: Ric Wheeler @ 2008-04-05 12:31 UTC (permalink / raw) To: Toby Thain; +Cc: Zan Lynx, Jeff Mahoney, Christian Kujau, kgp, reiserfs-devel Toby Thain wrote: > > On 4-Apr-08, at 2:58 PM, Ric Wheeler wrote: >> >> Toby Thain wrote: >>> On 3-Apr-08, at 8:14 PM, Zan Lynx wrote: >>>> On Tue, 2008-04-01 at 15:51 -0400, Jeff Mahoney wrote: >>>> >>>>> Ric's right about disk drives, though. They'll remap the bad sectors >>>>> automatically at the hardware level. When you start to see bad sectors >>>>> at the file system level, it means that the sectors reserved for >>>>> remapping have been exhausted and you should replace the disk. >>>> >>>> There are a couple of cases where you can see bad block errors on a >>>> good >>>> drive. >>>> >>>> If a block is written with a bad CRC for some reason...the write head >>>> got a freak blip or it lost power as it was writing, or the data went >>>> corrupt while sitting on disk, then it will read as a bad block, but >>>> rewriting would fix it. >>>> >>>> A RAID media verify or a badblocks -n run can usually fix these. >>> Only if your RAID uses CRCs (most don't). >>> ZFS is the real answer to undetected corruption. >>> --Toby >> >> Zan is right - even on a local drive, a write can repair some sectors >> with bad protection bits. All disks have per sector data protection >> (reed solomon encoding, etc) and there are lots of those bits per sector. > > > That does not protect against writing bad data, only some errors > internal to drive. There is a long way to travel between CPU and drive. > Cable, controller, RAM, etc, etc, etc. ZFS protects the entire data path. 
> > --Toby If you want to protect the entire data path, you are looking at something like DIF which protects even more of the data path than ZFS since it adds a check from application space to the IO stack ;-) ZFS does not export its protection bits up the stack. ric > >> >> There is work on adding DIF (data integrity f?) which is extra bytes >> that arrays or local drives can store for application level >> protection. Martin Petersen has some good slides about this on linux: >> >> http://oss.oracle.com/projects/data-integrity/documentation/ >> >> ZFS, for example, or more specifically its lvm layer, could use DIF to >> add this kind of protection. >> >> The other way to go is to use an enterprise class array - they all >> have multiple layers of data integrity baked in to deal with and >> correct these kind of errors. >> >> ric > > ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: bad block management 2008-04-05 12:31 ` Ric Wheeler @ 2008-04-05 14:07 ` Toby Thain 2008-04-05 15:08 ` Ric Wheeler 0 siblings, 1 reply; 19+ messages in thread From: Toby Thain @ 2008-04-05 14:07 UTC (permalink / raw) To: Ric Wheeler; +Cc: Zan Lynx, Jeff Mahoney, Christian Kujau, kgp, reiserfs-devel On 5-Apr-08, at 8:31 AM, Ric Wheeler wrote: > Toby Thain wrote: >> On 4-Apr-08, at 2:58 PM, Ric Wheeler wrote: >>> >>> Toby Thain wrote: >>>> On 3-Apr-08, at 8:14 PM, Zan Lynx wrote: >>>>> On Tue, 2008-04-01 at 15:51 -0400, Jeff Mahoney wrote: >>>>> >>>>>> Ric's right about disk drives, though. They'll remap the bad >>>>>> sectors >>>>>> automatically at the hardware level. When you start to see bad >>>>>> sectors >>>>>> at the file system level, it means that the sectors reserved for >>>>>> remapping have been exhausted and you should replace the disk. >>>>> >>>>> There are a couple of cases where you can see bad block errors >>>>> on a good >>>>> drive. >>>>> >>>>> If a block is written with a bad CRC for some reason...the >>>>> write head >>>>> got a freak blip or it lost power as it was writing, or the >>>>> data went >>>>> corrupt while sitting on disk, then it will read as a bad >>>>> block, but >>>>> rewriting would fix it. >>>>> >>>>> A RAID media verify or a badblocks -n run can usually fix these. >>>> Only if your RAID uses CRCs (most don't). >>>> ZFS is the real answer to undetected corruption. >>>> --Toby >>> >>> Zan is right - even on a local drive, a write can repair some >>> sectors with bad protection bits. All disks have per sector data >>> protection (reed solomon encoding, etc) and there are lots of >>> those bits per sector. >> That does not protect against writing bad data, only some errors >> internal to drive. There is a long way to travel between CPU and >> drive. Cable, controller, RAM, etc, etc, etc. ZFS protects the >> entire data path. 
>> --Toby > > If you want to protect the entire data path, you are looking at > something like DIF which protects even more of the data path than > ZFS since it adds a check from application space to the IO stack ;-) > > ZFS does not export its protection bits up the stack. Correct, but it protects everything up to the system call. RAID does not even get close, even with perfect error reporting (which doesn't really exist anyway). :) --Toby > > ric > > >>> >>> There is work on adding DIF (data integrity f?) which is extra >>> bytes that arrays or local drives can store for application level >>> protection. Martin Petersen has some good slides about this on >>> linux: >>> >>> http://oss.oracle.com/projects/data-integrity/documentation/ >>> >>> ZFS, for example, or more specifically its lvm layer, could use >>> DIF to add this kind of protection. >>> >>> The other way to go is to use an enterprise class array - they >>> all have multiple layers of data integrity baked in to deal with >>> and correct these kind of errors. >>> >>> ric > ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: bad block management 2008-04-05 14:07 ` Toby Thain @ 2008-04-05 15:08 ` Ric Wheeler 0 siblings, 0 replies; 19+ messages in thread From: Ric Wheeler @ 2008-04-05 15:08 UTC (permalink / raw) To: Toby Thain; +Cc: Zan Lynx, Jeff Mahoney, Christian Kujau, kgp, reiserfs-devel Toby Thain wrote: > > On 5-Apr-08, at 8:31 AM, Ric Wheeler wrote: >> Toby Thain wrote: >>> On 4-Apr-08, at 2:58 PM, Ric Wheeler wrote: >>>> >>>> Toby Thain wrote: >>>>> On 3-Apr-08, at 8:14 PM, Zan Lynx wrote: >>>>>> On Tue, 2008-04-01 at 15:51 -0400, Jeff Mahoney wrote: >>>>>> >>>>>>> Ric's right about disk drives, though. They'll remap the bad sectors >>>>>>> automatically at the hardware level. When you start to see bad >>>>>>> sectors >>>>>>> at the file system level, it means that the sectors reserved for >>>>>>> remapping have been exhausted and you should replace the disk. >>>>>> >>>>>> There are a couple of cases where you can see bad block errors on >>>>>> a good >>>>>> drive. >>>>>> >>>>>> If a block is written with a bad CRC for some reason...the write head >>>>>> got a freak blip or it lost power as it was writing, or the data went >>>>>> corrupt while sitting on disk, then it will read as a bad block, but >>>>>> rewriting would fix it. >>>>>> >>>>>> A RAID media verify or a badblocks -n run can usually fix these. >>>>> Only if your RAID uses CRCs (most don't). >>>>> ZFS is the real answer to undetected corruption. >>>>> --Toby >>>> >>>> Zan is right - even on a local drive, a write can repair some >>>> sectors with bad protection bits. All disks have per sector data >>>> protection (reed solomon encoding, etc) and there are lots of those >>>> bits per sector. >>> That does not protect against writing bad data, only some errors >>> internal to drive. There is a long way to travel between CPU and >>> drive. Cable, controller, RAM, etc, etc, etc. ZFS protects the entire >>> data path. 
>>> --Toby >> >> If you want to protect the entire data path, you are looking at >> something like DIF which protects even more of the data path than ZFS >> since it adds a check from application space to the IO stack ;-) >> >> ZFS does not export its protection bits up the stack. > > Correct, but it protects everything up to the system call. RAID does not > even get close, even with perfect error reporting (which doesn't really > exist anyway). :) > > --Toby > When you look in detail at how data is lost in working systems, it is always interesting to look at the big buckets of common failures and make sure that we balance the complexity and cost (in money or in performance) against the realized improvement. What RAID does well is to protect against the leading and really, really common error case, which is single-sector or few-sector errors on a disk drive. Those errors are almost always reported as IO errors, and RAID systems (including our MD software RAID) will do the right thing when only one sector in a stripe is bad with a media error. The interesting question is what failure is the second most common. That, from what I see, is normally application/SW errors. It can be bugs in the fs or IO stack, but it is also common to lose data from bad applications. I don't have first-hand experience with ZFS, but in any complicated system you run the risk of increasing the error rate (certainly for early adopters ;-)) while the developers try to figure out how their implementation differs from their pristine design (or what the design concept missed). My measured results of the reliability of reiserfs (v3) over a really large population show that we do quite well (when you use barriers or disable the write cache). It will be interesting to look for the first ZFS study (like the CMU paper by Bianca on disk failure, the Google paper on failures and the recent NetApp/UWisc papers on IO stack failures) to see how ZFS does in the wild.
ric ^ permalink raw reply [flat|nested] 19+ messages in thread
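The case Ric says RAID handles well, a single sector in a stripe failing with a reported media error, comes down to XOR parity in single-parity layouts. The sketch below is a generic illustration under that assumption, not the MD driver's code; the helper names xor_blocks, make_stripe and reconstruct are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Return the data blocks followed by one parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def reconstruct(stripe, failed_index):
    """Rebuild the one block that reported a media error from the survivors."""
    survivors = [block for i, block in enumerate(stripe) if i != failed_index]
    return xor_blocks(survivors)


data = [b"AAAA", b"BBBB", b"CCCC"]
stripe = make_stripe(data)
rebuilt_data = reconstruct(stripe, failed_index=1)
rebuilt_parity = reconstruct(stripe, failed_index=3)
```

This only works because the array is told which block failed; if a drive silently returns wrong data, XOR parity can show that the stripe is inconsistent but not which member is at fault, which is the gap the checksum discussion earlier in the thread is about.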
end of thread, other threads:[~2008-04-05 15:08 UTC | newest]

Thread overview: 19+ messages
2008-04-01  5:03 bad block management kgp
2008-04-01 18:55 ` Christian Kujau
2008-04-01 19:32   ` Ric Wheeler
2008-04-01 19:51     ` Jeff Mahoney
2008-04-01 22:11       ` Edward Shishkin
2008-04-02  4:50         ` jyotiv
2008-04-02 10:43           ` Ric Wheeler
2008-04-02 11:22             ` jyotiv
2008-04-02 13:31               ` Ric Wheeler
2008-04-02 13:14           ` Jeff Mahoney
2008-04-04  0:14       ` Zan Lynx
2008-04-04  4:21         ` Toby Thain
2008-04-04 16:12           ` Zan Lynx
2008-04-04 22:41             ` Toby Thain
2008-04-04 18:58           ` Ric Wheeler
2008-04-04 22:42             ` Toby Thain
2008-04-05 12:31               ` Ric Wheeler
2008-04-05 14:07                 ` Toby Thain
2008-04-05 15:08                   ` Ric Wheeler