* Bad blocks
@ 2004-04-01 14:06 Kalev Lember
  2004-04-01 14:36 ` David Woodhouse
  0 siblings, 1 reply; 27+ messages in thread
From: Kalev Lember @ 2004-04-01 14:06 UTC (permalink / raw)
  To: linux-mtd

Hi,

I am going to use DOC Millennium Plus and I do not want to use M-Systems'
proprietary kernel modules.
Having read the mailing list archives, I have some questions.  Does the current
INFTL code support bad block handling? Without that I would say it is
virtually useless. Am I correct?

-- 
Best regards,
Kalev Lember

* Bad Blocks
@ 2013-03-20 18:55 Dyweni - Ceph-Devel
  2013-03-28 15:54 ` Gregory Farnum
  0 siblings, 1 reply; 27+ messages in thread
From: Dyweni - Ceph-Devel @ 2013-03-20 18:55 UTC (permalink / raw)
  To: ceph-devel

Hi All,

I would like to understand how Ceph handles and recovers from bad 
blocks.  Would someone mind explaining this to me?  It wasn't very 
apparent from the docs.

My ultimate goal is to be able to get some extra life out of my disks
after I detect that they may be failing.  (I'm talking about those disks
that may have a small number of bad blocks, but otherwise seem fine and
still perform well.)

Here's what I've put together:

1. BBR Hardware
     - All hard disks come with a set number of blocks that are reserved 
for remapping of failed blocks.  This is handled transparently by the 
hard disk.  The hard disk may not begin reporting failed blocks until 
all the reserved blocks are used up.

2. BBR Device Mapper Target
     - Back in the EVMS days, IBM wrote a kernel module (dm-bbr) and an
EVMS plugin to manage that kernel module.  I have updated that kernel
module to work with the 3.6.11 kernel.  I have also rewritten some
portions of the EVMS plugin as a standalone bash script to allow me to
initialize the BBR layer and start the BBR device mapper target on that
layer.  (So far it seems to run fine, but it requires more testing.)

3. BTRFS
     - I've read that BTRFS can perform data scrubbing and repair 
damaged files from redundant copies.

4. CEPH
     - I've read that CEPH can perform a deep scrub to find damaged 
copies.  I assume by the distributed nature of CEPH, it can repair the 
damaged copy from the other OSDs.

One thing I am not clear on is when BTRFS / CEPH finds damaged data, 
what do they do to prevent data from being written to the same area?

Also, I'm wondering if any parts to my layered approach are redundant / 
unnecessary...  For instance if BTRFS marks the block bad internally, 
then perhaps the BBR DM Target isn't needed...


In my testing recently, I had the following setup:
   Disk -> DM-Crypt -> DM-BBR -> BTRFS -> OSD

When the OSD hit a bad block, the DM-BBR target successfully remapped 
it to one of its own reserved blocks, BTRFS then reported data 
corruption, and the OSD daemon crashed.


-- 
Thanks,
Dyweni

* Bad Blocks...
@ 2005-04-26 20:33 Eddie Dawydiuk
  2005-04-26 22:43 ` Charles Manning
  0 siblings, 1 reply; 27+ messages in thread
From: Eddie Dawydiuk @ 2005-04-26 20:33 UTC (permalink / raw)
  To: linux-mtd

Hello Yaffers,

After running some stress tests on a 128MB NAND Flash I have found some
strange behavior while using the Yaffs filesystem... The stress test
creates a ring-buffer of 5 directories, each directory contains 10,000
files with a size of 1248 bytes (Please find the source code attached). 
When running this application on a 32MB NAND Flash I am able to fill the
disk and then delete the files as expected... Although when running the
test on a 128MB NAND Flash(with the same kernel) I find that after
creating slightly over 35,000 files I am unable to write any more files to
disk (my board hangs). After rebooting the board, when I attempt to delete
the files, only some of them are deleted successfully (on the first
attempt). After attempting several more times I am able to delete all of
the files, but I find that I have hundreds of bad blocks (there are no error
messages when an attempt to delete a file fails).
I have provided the output of /proc/yaffs below (after running the stress
test multiple times) and am using a 2.4.26 kernel... I have read the other
posts referring to bad block management
(http://www.aleph1.co.uk/pipermail/yaffs/2005q1/000955.html) and have
ensured the fixes suggested have been made. If anyone has any suggestions
I would appreciate them...

$ cat /proc/yaffs
YAFFS built:Apr 26 2005 10:44:45
$Id: yaffs_fs.c,v 1.3 2005/01/25 00:38:25 eddie Exp $
$Id: yaffs_guts.c,v 1.41 2005/04/24 08:54:36 charles Exp $

Device yaffs
startBlock......... 1
endBlock........... 7999
chunkGroupBits..... 2
chunkGroupSize..... 4
nErasedBlocks...... 1575
nTnodesCreated..... 35000
nFreeTnodes........ 21790
nObjectsCreated.... 34400
nFreeObjects....... 21795
nFreeChunks........ 142801
nPageWrites........ 71420
nPageReads......... 2573700
nBlockErasures..... 4021
nGCCopies.......... 317
garbageCollections. 3599
passiveGCs......... 3599
nRetriedWrites..... 0
nRetireBlocks...... 2397
eccFixed........... 0
eccUnfixed......... 0
tagsEccFixed....... 0
tagsEccUnfixed..... 654
cacheHits.......... 0
nDeletedFiles...... 22161
nUnlinkedFiles..... 22162
nBackgroudDeletions 0
useNANDECC......... 1
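
For comparing runs, the counters above can be pulled into a dictionary.
This is only a rough sketch: the function names are made up for
illustration, and it assumes that nRetireBlocks counts blocks retired as
bad and that startBlock..endBlock is the full block range of the device.

```python
import re

# Matches lines such as "nErasedBlocks...... 1575" or "nBackgroudDeletions 0".
_COUNTER = re.compile(r"^([A-Za-z]+)[. ]+(-?\d+)$")


def parse_yaffs_counters(text):
    """Return {counter_name: int} for every 'name.... value' line in text."""
    counters = {}
    for line in text.splitlines():
        match = _COUNTER.match(line.strip())
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def retired_fraction(counters):
    """Retired blocks as a fraction of the blocks in [startBlock, endBlock]."""
    total = counters["endBlock"] - counters["startBlock"] + 1
    return counters["nRetireBlocks"] / total
```

Feeding it the saved contents of /proc/yaffs after each run gives one
number per run to watch.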

Thanks,
Eddie

* Re: Question regarding mdadm.conf
@ 2005-02-17  7:14 Michael Tokarev
  2005-02-17  8:30 ` Bad blocks Guy
  0 siblings, 1 reply; 27+ messages in thread
From: Michael Tokarev @ 2005-02-17  7:14 UTC (permalink / raw)
  To: Lajber Zoltan; +Cc: Torsten E., linux-raid

Lajber Zoltan wrote:
> Hi!
> 
> On Thu, 17 Feb 2005, Torsten E. wrote:
> 
> 
>>How do I get that UUID information, to add it to the new
>>/etc/mdadm.conf?
> 
> Try this one: mdadm --detail /dev/md1 | grep UUID

I'd say

   mdadm --detail --brief /dev/md1 | grep -v '^[[:space:]]*devices='

-- this will give you all information necessary for mdadm.conf,
you can just redirect output into that file.

Note the grep usage.  Someone will disagree with me here, but
there is a reason to remove the devices= line.  Without the grep,
output from mdadm looks like (on my system anyway):

ARRAY /dev/md1 level=raid1 num-devices=4 UUID=11e92e45:15fcc4a0:cf62e981:a79de494
    devices=/dev/sda1,/dev/sdb1,/dev/sdc1,/dev/sdd1

I.e., it lists all the devices which are parts of the array.
The problem with this is: if, for any reason (dead drive,
adding/removing drives/controllers etc.), the devices
change, and some /dev/sdXY points to another device which
is a part of some other raid array, mdadm will refuse to
assemble this array, saying something along the lines of "the
UUIDs does not match, aborting".  Without the "devices="
part but with the --scan option, mdadm will search all devices
on its own (based on the DEVICE line in mdadm.conf) - this
is somewhat slower as it will try to open each device in
turn, but safer, as it will find all the present components
no matter what.
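
The same filtering can be sketched in Python for anyone generating
mdadm.conf from a script. The function name is hypothetical. It drops
only lines that begin with devices= (after leading whitespace), because
the ARRAY line in the sample output above also contains the substring
"devices=" inside num-devices= and must be kept.

```python
def strip_devices_lines(brief_output):
    """Remove 'devices=...' continuation lines from mdadm --brief output.

    Only lines whose first non-blank text is 'devices=' are dropped, so an
    ARRAY line carrying 'num-devices=4' is kept.
    """
    kept = [
        line
        for line in brief_output.splitlines()
        if not line.lstrip().startswith("devices=")
    ]
    return "".join(line + "\n" for line in kept)
```

The result can be appended to mdadm.conf as-is, on the assumption that a
DEVICE line is already present there.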

Someone correct me if I'm wrong... ;)

/mjt

* RE: Bad Blocks
@ 2002-09-24  7:54 Aleksander Kujbida
  0 siblings, 0 replies; 27+ messages in thread
From: Aleksander Kujbida @ 2002-09-24  7:54 UTC (permalink / raw)
  To: linux-admin

Thanks for the clarification, Glynn.
Aleksander


>From: Glynn Clements <glynn.clements@virgin.net>
>To: "Aleksander Kujbida" <akujbida@hotmail.com>
>CC: linux-admin@vger.kernel.org
>Subject: RE: Bad Blocks
>Date: Tue, 24 Sep 2002 05:54:28 +0100
>
>
>Aleksander Kujbida wrote:
>
> > > > Continuing on the same subject, How does the linux box decide that it
> > > > requires a manual fsck, during bootup
> > >
> > > If fsck returns an error code other than 0 (no errors) or 1 (some
> > > errors, but they were all fixed), the boot sequence will normally be
> > > interrupted before the root filesystem is mounted read-write.
> >
> > Also, if a specified period of time has elapsed since the last fsck (or is
> > it last boot?), it will fsck. Can't remember where the time period is set.
>
>The boot sequence *always* runs fsck; but fsck itself won't actually
>perform the check if the filesystem was cleanly unmounted and neither
>the maximum mount count nor the check interval have been reached.
>
>--
>Glynn Clements <glynn.clements@virgin.net>
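
The rule Glynn describes can be written out as a small decision
function. This is a simplified sketch, not the actual fsck code: the
parameter names are invented, and it assumes that a maximum mount count
or check interval of zero or less means that limit is disabled.

```python
import time


def fsck_would_check(cleanly_unmounted, mount_count, max_mount_count,
                     last_check, check_interval, now=None):
    """Return True if a full check is due, False if fsck may skip it.

    last_check and now are seconds since the epoch; check_interval is in
    seconds. A limit that is zero or negative is treated as disabled.
    """
    if now is None:
        now = time.time()
    if not cleanly_unmounted:
        return True  # unclean filesystem: always check
    if max_mount_count > 0 and mount_count >= max_mount_count:
        return True  # maximum mount count reached
    if check_interval > 0 and now - last_check >= check_interval:
        return True  # check interval has elapsed
    return False
```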






* RE: Bad Blocks
@ 2002-09-24  4:14 Aleksander Kujbida
  2002-09-24  4:54 ` Glynn Clements
  0 siblings, 1 reply; 27+ messages in thread
From: Aleksander Kujbida @ 2002-09-24  4:14 UTC (permalink / raw)
  To: linux-admin

Also, if a specified period of time has elapsed since the last fsck (or is 
it last boot?), it will fsck. Can't remember where the time period is set.

Aleksander


>From: Glynn Clements <glynn.clements@virgin.net>
>To: "Abiy,Mike [Edm]" <Mike.Abiy@EC.gc.ca>
>CC: "'Jorge R . Csapo'" <jorge@completo.com.br>,linux-admin@vger.kernel.org
>Subject: RE: Bad Blocks
>Date: Mon, 23 Sep 2002 21:33:10 +0100
>
>
>Abiy,Mike [Edm] wrote:
>
> > Continuing on the same subject, How does the linux box decide that it
> > requires a manual fsck, during bootup
>
>If fsck returns an error code other than 0 (no errors) or 1 (some
>errors, but they were all fixed), the boot sequence will normally be
>interrupted before the root filesystem is mounted read-write.
>
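
For reference, the fsck exit status is a sum of flag bits as listed in
the fsck(8) manual page. A minimal sketch of decoding it follows. The
function names are made up, and boot_may_continue simply mirrors the
"0 or 1" rule described above rather than any particular distribution's
boot scripts.

```python
# Exit status bits as documented in fsck(8).
FSCK_FLAGS = {
    1: "errors corrected",
    2: "system should be rebooted",
    4: "errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "checking canceled by user request",
    128: "shared-library error",
}


def describe_fsck_status(status):
    """Return the list of conditions encoded in an fsck exit status."""
    if status == 0:
        return ["no errors"]
    return [text for bit, text in FSCK_FLAGS.items() if status & bit]


def boot_may_continue(status):
    """True for 0 (no errors) or 1 (errors found, all fixed)."""
    return status in (0, 1)
```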




* RE: Bad Blocks
@ 2002-09-23 14:17 Abiy,Mike [Edm]
  2002-09-23 20:33 ` Glynn Clements
  0 siblings, 1 reply; 27+ messages in thread
From: Abiy,Mike [Edm] @ 2002-09-23 14:17 UTC (permalink / raw)
  To: 'Jorge R . Csapo', Abiy,Mike [Edm]; +Cc: linux-admin

Continuing on the same subject, how does the Linux box decide during bootup
that it requires a manual fsck? Is there a specified number of bad
blocks that it has to come across before it decides to halt the bootup
process and wait for a manual fsck? If so, is there a way to
change that? Again, the whole reason is to avoid having to drive over six
and a half hours just to do a manual fsck, and the loss of the Linux box
(system) during that time.

Thus spoke Abiy,Mike [Edm] (on 19/09/2002):
> I would like to apologize, first, if this happens to be a simple question.
> 
> 1. How does one test for bad blocks on a hard drive in linux.
> 
> 2. More important, how does one render a bad block on hard drive unreadable,
> so that the bad block utility that was used( whatever it may be)  and/or
> program does not try to write or read to this same bad block, resulting in
> the same errors happening again and again.

mkfs -c does just that, addressing both 1. and 2. 

> 
> 3. This was necessitated by trips to a remote site from remote power bootup
> ( sometime it is absolutely necessary to do that) to do a manual fsck,
> because the linux box stops the normal bootup process awaiting manual
> intervention to do manual fsck.

This is totally configurable, meaning the necessity for a manual fsck can
simply be removed. You can either prevent Linux from fsck'ing at boot (not a
really good idea) or force fsck at every boot but with options that don't
require manual intervention. The way to do this depends on your distro, but
it may involve editing /etc/inittab, a number of /etc/rc.d files, re-creating
your filesystems or all of the above...

-- 
Jorge R. Csapo
--------------------------------------------------
 /"\
 \ / CAMPANHA DA FITA ASCII - CONTRA MAIL HTML
  X  ASCII RIBBON CAMPAIGN - AGAINST HTML MAIL
 / \
--------------------------------------------------
http://www.completo.com.br/~jorge
===========================================
With a PC, I always felt limited
by the software available.
On Unix, I am limited only by my knowledge.
--Peter J. Schoenster

[parent not found: <F8500ECEBD66D211A3D20008C724A29C01CEFD9D@SR-EDM-EXCH4.edm. ab.ec.gc.ca>]
* Bad Blocks
@ 2002-09-19 14:48 Abiy,Mike [Edm]
  2002-09-19 15:23 ` Jorge R . Csapo
  0 siblings, 1 reply; 27+ messages in thread
From: Abiy,Mike [Edm] @ 2002-09-19 14:48 UTC (permalink / raw)
  To: linux-admin

I would like to apologize, first, if this happens to be a simple question.

1. How does one test for bad blocks on a hard drive in linux.

2. More important, how does one render a bad block on hard drive unreadable,
so that the bad block utility that was used( whatever it may be)  and/or
program does not try to write or read to this same bad block, resulting in
the same errors happening again and again.

3. This was necessitated by trips to a remote site from remote power bootup
( sometime it is absolutely necessary to do that) to do a manual fsck,
because the linux box stops the normal bootup process awaiting manual
intervention to do manual fsck.

any information on this topic would be greatly appreciated.

thanks
Michael

* bad blocks
@ 2002-06-17 15:19 Alexander Saers
  2002-06-17 15:23 ` Oleg Drokin
  0 siblings, 1 reply; 27+ messages in thread
From: Alexander Saers @ 2002-06-17 15:19 UTC (permalink / raw)
  To: reiserfs-list

Hello

It seems that my hard drive has some bad sectors. I therefore did some
research on how to handle it. I found this piece of information:

http://www.reiserfs.org/bad-block-handling.html

But this only marks the sectors busy, and the next reiserfsck will then
detect this and remove the busy mark. Isn't there a way to create files on
the bad sectors so that nobody uses them, not even in the future after a
reiserfsck? For example, if I had a folder on the drive that contained
/badblocks/sector8004
/badblocks/sector8010
etc.

Thanks for a good filesystem

/Alexander



* Bad blocks
@ 2002-04-16  1:13 Sam Vilain
  2002-04-16 11:43 ` Matthias Andree
                   ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Sam Vilain @ 2002-04-16  1:13 UTC (permalink / raw)
  To: reiserfs-list

I seem to have some bad blocks on my laptop's hard drive, how do I mark
them as bad to reiserfs?

root@hoffman:/usr/share/doc/RFC/unclassified# ls
hda: read_intr: status=0x5b { DriveReady SeekComplete DataRequest Index Error }
hda: read_intr: error=0x40 { UncorrectableError }, LBAsect=25866987, sector=4211304
end_request: I/O error, dev 03:06 (hda), sector 4211304
vs-13070: reiserfs_read_inode2: i/o failure occurred trying to find stat data of [66745 67408 0x0 SD]
ls: rfc470.txt.gz: Permission denied
hda: read_intr: status=0x5b { DriveReady SeekComplete DataRequest Index Error }
hda: read_intr: error=0x40 { UncorrectableError }, LBAsect=25866987, sector=4211304
end_request: I/O error, dev 03:06 (hda), sector 4211304
vs-13070: reiserfs_read_inode2: i/o failure occurred trying to find stat data of [66745 67410 0x0 SD]
ls: rfc486.txt.gz: Permission denied
hda: read_intr: status=0x5b { DriveReady SeekComplete DataRequest Index Error }
hda: read_intr: error=0x40 { UncorrectableError }, LBAsect=25866987, sector=4211304
end_request: I/O error, dev 03:06 (hda), sector 4211304
vs-13070: reiserfs_read_inode2: i/o failure occurred trying to find stat data of [66745 67409 0x0 SD]
ls: rfc478.txt.gz: Permission denied
rfc1.txt.gz    rfc239.txt.gz  rfc370.txt.gz  rfc501.txt.gz  rfc697.txt.gz  rfc84.txt.gz
   [...]

I know I have 8 bad sectors:

hoffman:/usr/share/doc/RFC/unclassified# dd if=/dev/hda6 of=test skip=4211303 count=2
dd: reading `/dev/hda6': Input/output error
1+0 records in
1+0 records out
hoffman:/usr/share/doc/RFC/unclassified# dd if=/dev/hda6 of=test skip=4211311 count=2
dd: reading `/dev/hda6': Input/output error
0+0 records in
0+0 records out
hoffman:/usr/share/doc/RFC/unclassified# dd if=/dev/hda6 of=test skip=4211312 count=2
2+0 records in
2+0 records out
hoffman:/usr/share/doc/RFC/unclassified# 
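
The dd probing above can be automated by reading one sector at a time
and noting which reads fail. This is a minimal sketch: the function name
is hypothetical, it assumes 512-byte sectors as dd does by default, and
it needs a Unix system for os.pread. On a buffered block device the
kernel may read in larger units, so the result can be coarser than one
sector.

```python
import os


def find_unreadable_sectors(path, first_sector, count, sector_size=512):
    """Return the sector numbers in the given range whose read fails."""
    bad = []
    fd = os.open(path, os.O_RDONLY)
    try:
        for sector in range(first_sector, first_sector + count):
            try:
                # Read exactly one sector at its byte offset.
                os.pread(fd, sector_size, sector * sector_size)
            except OSError:
                # An I/O error here corresponds to dd's "Input/output error".
                bad.append(sector)
    finally:
        os.close(fd)
    return bad
```

Something like find_unreadable_sectors("/dev/hda6", 4211300, 16), run as
root, would cover the range probed by hand above.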

`mkreiserfs' is missing the -c and -l options.

`reiserfsck' aborts during the scan with the message:

pass_through_tree: unable to read 526413 block on device 0x3

Apparently, ext2 handles marking sectors as bad automatically.

How can I deal with this?  If anyone knows of a tool to re-format just
8 sectors (to let the disk re-map the blocks elsewhere), that also
would be helpful.

Cheers,
--
   Sam Vilain, sam@vilain.net     WWW: http://sam.vilain.net/
    7D74 2A09 B2D3 C30F F78E      GPG: http://sam.vilain.net/sam.asc
    278A A425 30A9 05B5 2F13



Thread overview: 27+ messages
2004-04-01 14:06 Bad blocks Kalev Lember
2004-04-01 14:36 ` David Woodhouse
2004-04-02  0:01   ` patch for AMD am29dl800b David Updegraff
2004-04-02  5:57     ` David Woodhouse
2004-04-02 13:30       ` David Updegraff
2004-04-02 13:39         ` David Woodhouse
2004-04-02  0:14   ` Bad blocks Greg Ungerer
2004-04-04 11:16   ` Kalev Lember
  -- strict thread matches above, loose matches on Subject: below --
2013-03-20 18:55 Bad Blocks Dyweni - Ceph-Devel
2013-03-28 15:54 ` Gregory Farnum
2005-04-26 20:33 Eddie Dawydiuk
2005-04-26 22:43 ` Charles Manning
2005-02-17  7:14 Question regarding mdadm.conf Michael Tokarev
2005-02-17  8:30 ` Bad blocks Guy
2002-09-24  7:54 Bad Blocks Aleksander Kujbida
2002-09-24  4:14 Aleksander Kujbida
2002-09-24  4:54 ` Glynn Clements
2002-09-23 14:17 Abiy,Mike [Edm]
2002-09-23 20:33 ` Glynn Clements
     [not found] <F8500ECEBD66D211A3D20008C724A29C01CEFD9D@SR-EDM-EXCH4.edm. ab.ec.gc.ca>
2002-09-19 15:03 ` Scott Taylor
2002-09-19 14:48 Abiy,Mike [Edm]
2002-09-19 15:23 ` Jorge R . Csapo
2002-06-17 15:19 bad blocks Alexander Saers
2002-06-17 15:23 ` Oleg Drokin
2002-04-16  1:13 Bad blocks Sam Vilain
2002-04-16 11:43 ` Matthias Andree
2002-04-16 11:47 ` Oleg Drokin
2002-04-16 12:01 ` Ed Tomlinson
