Hard disk crash and solution

* Hard disk crash and solution
@ 2003-01-22 21:35 Niek
  2003-01-22 21:57 ` Hans Reiser
                   ` (5 more replies)
  0 siblings, 6 replies; 26+ messages in thread
From: Niek @ 2003-01-22 21:35 UTC (permalink / raw)
  To: reiserfs-list

Title: IBM DTLA 307045 Hard disk crash

I bought this disk (46 GB) about two years ago. One of the best, they
claimed.
Anyway, I put it into my Linux box and yes, it worked for a while.
After about 4 months it started to make weird noises: krrk-screech etc.
Shit, a lot of data gone. I sent it to IBM and got a refurbished one back
from Taiwan.
This drive went back into my Linux box and worked happily for more than
1.3 years. Then it happened again: kkrr-screech etc. Shit happens.
What is the fucking MTBF of these drives?? Is it close to one year, like I
experienced?

Anyway, I want my data back. I mailed Ontrack, a nice company specializing in
getting data back from crashed HDs. Their response (by phone) was very, very
quick (within 2 minutes after I completed their online form)!
It would cost me euro 96 for them to have a look, and if I want the data back
it'll cost me about euro 700 - euro 1500 excluding VAT. (Since I'm not a
company, they gave me 20% off their price.)

Anyway, I'll try to do it myself.

First I reinstalled SuSE Linux 8.1 on a new disk and auto-installed all the
patches (using YOU).
Then I attached the broken disk (with the kkkggrr sounds) to a free IDE port.
Fdisk showed me the partitions. I tried to mount all of them, but I could
mount only one. I rescued a few files.
But the bulk of the data was located on /dev/hdd9, a partition of around
32 GB. I could not mount it: mount bailed out because of one bad block (well,
there must be thousands of them, but mount needs only one to bail out).

The following command created an image:

mymachine:/rescued # dd_rescue -b 32768 -B 512 -l dd_rescue_log -a -v /dev/hdd9 hdd9.img

It took around 2 hours to make this image. The program reported many errors
and the HD made awful noises, but I got an image. Hum, now what?

mymachine:/rescued # l
total 29054814
drwxr-xr-x    2 root     root          104 Jan 22 17:18 ./
drwxr-xr-x   26 root     root          600 Jan 22 18:14 ../
-rw-r-----    1 root     root       741121 Jan 22 20:22 dd_rescue.log
-rw-r-----    1 root     root     32020627456 Jan 22 20:22 hdd9.img

>>>>>>>>>>Let's try to mount

mymachine:/rescued # mount -o loop -r hdd9.img /d9 -t reiserfs
mount: wrong fs type, bad option, bad superblock on /dev/loop0,
       or too many mounted file systems

>>>>>>>>>>Check it then....

mymachine:/rescued # reiserfsck  hdd9.img
reiserfsck 3.6.2 (2002)
Will read-only check consistency of the filesystem on hdd9.img
Will put log info to 'stdout'

Do you want to run this program?[N/Yes] (note need to type Yes):Yes
###########
reiserfsck --check started at Wed Jan 22 20:29:18 2003
###########
Replaying journal..
No transactions found
Checking S+tree../  1 (of 166)/ 82 (of  86)node (12285) with wrong level (0)
found in the tree (should be 1)
whole subtree skipped
/  2 (of 166)/ 19 (of  85)node (39901) with wrong level (0) found in the
tree (should be 1)
whole subtree skipped
/  7 (of 166)/ 14 (of 170)node (8381) with wrong level (0) found in the tree
(should be 1)

---- snipped lines -------

/152 (of 166)/152 (of 170)node (31645) with wrong level (0) found in the
tree (should be 1)
whole subtree skipped
ok
Comparing bitmaps..free block count 1718350 mismatches with a correct one
2289481.
on-disk bitmap does not match to the correct one.
Bad nodes were found, Semantic pass skipped
There were found 17 corruptions which can be fixed only
during --rebuild-tree
###########
reiserfsck finished at Wed Jan 22 20:29:33 2003
###########

>>>>>>> Well rebuild it then:

mymachine:/rescued # reiserfsck  --rebuild-tree hdd9.img
reiserfsck 3.6.2 (2002)
  **********************************************************
  ** This  is  an  experimental  version  of  reiserfsck, **
  **              !! MAKE A BACKUP FIRST !!               **
  ** Don't run this program unless something  is  broken. **
  ** Some types of random FS damage can be recovered from **
  ** by  this  program,   which  basically   throws  away **
  ** the internal nodes of the tree and then reconstructs **
  ** them. This program is for use only by the desperate, **
  ** and is  of only beta quality.  If you are using  the **
  ** latest  reiserfsprogs  and  it  fails  please  email **
  ** bug reports to reiserfs-list@namesys.com.            **
  **********************************************************

Will rebuild the filesystem (hdd9.img) tree
Will put log info to 'stdout'

Do you want to run this program?[N/Yes] (note need to type Yes):Yes
Replaying journal..
No transactions found
###########
reiserfsck --rebuild-tree started at Wed Jan 22 20:29:53 2003
###########

Pass 0:
####### Pass 0 #######
Loading on-disk bitmap .. ok, 6099186 blocks marked used
Skipping 8449 blocks (super block, journal, bitmaps) 6090737 blocks will be
read
0%....20%....40%....60%....80%....100%                        left 0, 8627
/sec
"r5" got 96833 hits
        "r5" hash is selected
Flushing..done
        Read blocks (but not data blocks) 6090737
                Leaves among those 25618
                Objectids found 96822

Pass 1 (will try to insert 25618 leaves):
####### Pass 1 #######
Looking for allocable blocks .. ok
0%....20%....40%....60%....80%....100%                         left 0, 419
/sec
Flushing..done
        25618 leaves read
                25608 inserted
                10 not inserted
####### Pass 2 #######

Pass2:
0%....20%....40%....60%....80%....100%                          left 0, 20
/sec
Flushing..done
        Leaves inserted item by item 10
Pass 3 (semantic):
####### Pass 3 #########
name "cache3B65EDC20112F11.png" in directory 20882 21317 points to nowhere
21317 21321 - removed
dir 20882 21317 has wrong sd_size 316, has to be 196
name "04" in directory 189 20882 points to nowhere 20882 73124 - removed
name "config" in directory 20875 20883 points to nowhere 20883 20887 -
removed
dir 20875 20883 has wrong sd_size 188, has to be 146
/database/gotocode/SQLname "mysql_gotocode.sql" in directory 87643 88651
points to nowhere 88651 88654 - removed

----- snip a lot of errors ----

dir 73444 32462 has wrong sd_size 281, has to be 228
Flushing..done
        Files found: 92045
        Directories found: 4310
        Symlinks found: 24
        Others: 1
        Broken (of files/symlinks/others): 1
        Names pointing to nowhere (removed): 103
Pass 3a (looking for lost dir/files):
####### Pass 3a (lost+found pass) #########
Looking for lost directories:
/20883_20887name "kwriterc" in directory 20883 20887 points to nowhere 20887
9941 - removed

---- snip again loads of errors -----

/22523_36802get_next_directory_item: 22523 36802 0x1 DIR (3): ".." points to
[20884 22523], should point to [2 206] - fixed
Looking for lost files:1 /sec
Flushing..done 9137, 147 /sec
        Objects without names 112
        Empty lost dirs removed 390
        Dirs linked to /lost+found: 11
                Dirs without stat data found 1
        Files linked to /lost+found 71
Pass 4 - done     done 16947, 264 /sec
        Deleted unreachable items 179
Flushing..done
Syncing..done
###########
reiserfsck finished at Wed Jan 22 20:43:58 2003
###########
mymachine:/rescued #

>>>>>>> Now I tried to mount it again.

mymachine:/rescued # mount -treiserfs -o loop -r hdd9.img /d9
mount: wrong fs type, bad option, bad superblock on /dev/loop0,
       or too many mounted file systems

>>>>>>> What is wrong?????

debugreiserfs hdd9.img
debugreiserfs 3.6.2 (2002)

Filesystem state: consistent

Reiserfs super block in block 16 on 0x0 of format 3.5 with standard journal
Count of blocks on the device: 7817536
Number of bitmaps: 239
Blocksize: 4096
Free blocks (count of blocks - used [journal, bitmaps, data, reserved]
blocks): 1884129
Root block: 8222
Filesystem is clean
Tree height: 4
Hash function used to sort names: "r5"
Objectid map size 196, max 1004
Journal parameters:
        Device [0x0]
        Magic [0x0]
        Size 8193 blocks (including 1 for journal header) (first block 18)
        Max transaction length 1024 blocks
        Max batch size 900 blocks
        Max commit age 30
Blocks reserved by journal: 0
Fs state field: 0x0
sb_version: 0
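
The superblock figures above can be cross-checked against the size of
hdd9.img and against the Pass 0 output with a little arithmetic. The
following is only a sketch: every number is copied from this session, and
the idea that 16 blocks precede the super block is an inference from
"super block in block 16", not something the tools state.

```python
# Figures copied from the ls, debugreiserfs and reiserfsck output in this message.
IMAGE_BYTES = 32020627456      # size of hdd9.img
BLOCK_SIZE = 4096              # "Blocksize: 4096"
BLOCK_COUNT = 7817536          # "Count of blocks on the device"
BITMAPS = 239                  # "Number of bitmaps"
JOURNAL_BLOCKS = 8193          # journal size, including its header block
USED_BLOCKS = 6099186          # Pass 0: "blocks marked used"

# Assumption: blocks 0..15 precede the super block, which sits in block 16.
LEADING_BLOCKS = 16
SUPER_BLOCKS = 1


def bitmaps_needed(block_count, block_size):
    """One bitmap block holds block_size * 8 bits, one bit per block."""
    bits_per_bitmap = block_size * 8
    return -(-block_count // bits_per_bitmap)  # ceiling division


image_blocks = IMAGE_BYTES // BLOCK_SIZE
skipped = LEADING_BLOCKS + SUPER_BLOCKS + JOURNAL_BLOCKS + BITMAPS
to_read = USED_BLOCKS - skipped

print("blocks in image:", image_blocks)
print("bitmap blocks needed:", bitmaps_needed(BLOCK_COUNT, BLOCK_SIZE))
print("skipped blocks:", skipped)
print("blocks left to read:", to_read)
```

If the image held fewer blocks than the superblock claims, that would point
to an incomplete copy rather than to damage inside the filesystem.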

>>>>>>>> There is nothing wrong !!!!!!!, Let's update the reiserfs utilities
>>>>>>>>  from their website and try again

mymachine:/rescued # debugreiserfs hdd9.img

<-------------debugreiserfs, 2002------------->
reiserfsprogs 3.6.4

Filesystem state: consistent

Reiserfs super block in block 16 on 0x0 of format 3.5 with standard journal
Count of blocks on the device: 7817536
Number of bitmaps: 239
Blocksize: 4096
Free blocks (count of blocks - used [journal, bitmaps, data, reserved]
blocks): 1884129
Root block: 8222
Filesystem is cleanly umounted
Tree height: 4
Hash function used to sort names: "r5"
Objectid map size 196, max 1004
Journal parameters:
        Device [0x0]
        Magic [0x0]
        Size 8193 blocks (including 1 for journal header) (first block 18)
        Max transaction length 1024 blocks
        Max batch size 900 blocks
        Max commit age 30
Blocks reserved by journal: 0
Fs state field: 0x0
sb_version: 0

>>>>>>>>>>> Hummm, there is still nothing wrong, both debugreiserfs versions
>>>>>>>>>>> report the same.
>>>>>>>>>>> Let's recheck with the new utility

mymachine:/rescued # reiserfsck hdd9.img

<-------------reiserfsck, 2002------------->
reiserfsprogs 3.6.4

  *************************************************************
  ** If you are using the latest reiserfsprogs and  it fails **
  ** please  email bug reports to reiserfs-list@namesys.com, **
  ** providing  as  much  information  as  possible --  your **
  ** hardware,  kernel,  patches,  settings,  all  reiserfsk **
  ** messages  (including version),  the reiserfsck logfile, **
  ** check  the  syslog file  for  any  related information. **
  ** If you would like advice on using this program, support **
  ** is available  for $25 at  www.namesys.com/support.html. **
  *************************************************************

Will read-only check consistency of the filesystem on hdd9.img
Will put log info to 'stdout'

Do you want to run this program?[N/Yes] (note need to type Yes if you
do):Yes
###########
reiserfsck --check started at Wed Jan 22 21:31:19 2003
###########
Replaying journal..
trans replayed: mountid 126, transid 167823, desc 4265, len 3, commit 4269,
next trans offset 4252
1 transactions replayed
Checking internal tree../ 34 (of 165)/ 79 (of 105)bad_item: vpf-10570: block
11065: The item header (6) has not cleaned flags.
finished
Comparing bitmaps..finished
Checking Semantic tree:
/n/qcad-1.4.7-i386-setup.tar.gzvpf-10680: The file [3 21341] has the wrong
block count in the StatData (7464), should be (2104)
/n/windows_vmware/win98.dskvpf-10680: The file [21732 21734] has the wrong
block count in the StatData (1365888), should be (1354496)
/n/papers.part1.rarvpf-10680: The file [3 70216] has the wrong block count
in the StatData (1269536), should be (56040)
/data_2/f4.bakvpf-10680: The file [10533 10585] has the wrong block count in
the StatData (3656), should be (744)
finished
5 found corruptions can be fixed with --fix-fixable
###########
reiserfsck finished at Wed Jan 22 21:31:55 2003
###########

>>>>>>>> Heee, still errors? What did I miss? debugreiserfs thinks it's ok
>>>>>>>> and the newer reiserfsck found some more errors. Stupid program........

mymachine:/rescued # reiserfsck --fix-fixable hdd9.img

<-------------reiserfsck, 2002------------->
reiserfsprogs 3.6.4

  *************************************************************
  ** If you are using the latest reiserfsprogs and  it fails **
  ** please  email bug reports to reiserfs-list@namesys.com, **
  ** providing  as  much  information  as  possible --  your **
  ** hardware,  kernel,  patches,  settings,  all  reiserfsk **
  ** messages  (including version),  the reiserfsck logfile, **
  ** check  the  syslog file  for  any  related information. **
  ** If you would like advice on using this program, support **
  ** is available  for $25 at  www.namesys.com/support.html. **
  *************************************************************

Will check consistency of the filesystem on hdd9.img
and will fix what can be fixed w/o --rebuild-tree
Will put log info to 'stdout'

Do you want to run this program?[N/Yes] (note need to type Yes if you
do):Yes
###########
reiserfsck --fix-fixable started at Wed Jan 22 21:32:47 2003
###########
Replaying journal..
0 transactions replayed
Checking internal tree../ 34 (of 165)/ 79 (of 105)bad_item: vpf-10580: block
11065: Flags in the item header (6) were cleaned
finished
Comparing bitmaps..finished
Checking Semantic tree:
/n/qcad-1.4.7-i386-setup.tar.gzvpf-10680: The file [3 21341] has the wrong
block count in the StatData (7464) - corrected to (2104)
/n/windows_vmware/win98.dskvpf-10680: The file [21732 21734] has the wrong
block count in the StatData (1365888) - corrected to (1354496)
/n/papers.part1.rarvpf-10680: The file [3 70216] has the wrong block count
in the StatData (1269536) - corrected to (56040)
/data_2/f4.bakvpf-10680: The file [10533 10585] has the wrong block count in
the StatData (3656) - corrected to (744)
finished
No corruptions found
There are on the filesystem:
        Leaves 25460
        Internal nodes 166
        Directories 4340
        Other files 92460
        Data block pointers 5899332 (0 of them are zero)
        Safe links 0
###########
reiserfsck finished at Wed Jan 22 21:34:09 2003
###########

mymachine:/rescued # mount -treiserfs -o loop -r hdd9.img /d9
mymachine:/rescued #

------- Hee, no errors any more -------- let's check its contents

Data is there. I'm very happy!!
What did we learn? A lot. One lesson is that computers are even more
unreliable than humans, and humans are not very reliable. Look e.g. at SuSE:
their auto update should have updated the reiserfs programs to the newest
versions, but they forgot to put that in. I'll mail them.
One thing that Namesys should do is make the software more user-friendly
and put this type of info on their web pages.
Crashed HDs are a pain in the ass to repair. This type of IBM drive has a
flaw. IBM confirmed this, but they did not remove the drives from the market.
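
Since the whole difference here was 3.6.2 versus 3.6.4, it may help to
compare the version in a tool's banner before trusting its verdict. A
minimal sketch, assuming the banner formats shown in this session; the
threshold 3.6.4 is simply the version that worked here, not an official
minimum, and the function names are made up for illustration.

```python
import re


def parse_version(banner):
    """Return the first dotted version number in a banner line as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)+)", banner)
    if match is None:
        raise ValueError("no version number in: %r" % banner)
    return tuple(int(part) for part in match.group(1).split("."))


# Hypothetical threshold: the version that worked in this session.
KNOWN_GOOD = (3, 6, 4)


def is_outdated(banner, minimum=KNOWN_GOOD):
    """True if the version found in the banner is older than the minimum."""
    return parse_version(banner) < minimum


# Banner lines as printed by the two tool versions used above.
for banner in ("reiserfsck 3.6.2 (2002)", "reiserfsprogs 3.6.4"):
    state = "outdated" if is_outdated(banner) else "ok"
    print(banner, "->", state)
```

Comparing tuples of integers rather than strings keeps a version such as
3.6.10 ordered after 3.6.4.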

Have a lot of fun,

Nick

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hard disk crash and solution
  2003-01-22 21:35 Hard disk crash and solution Niek
@ 2003-01-22 21:57 ` Hans Reiser
  2003-01-23  6:43   ` Ookhoi
  2003-01-22 22:01 ` Dieter Nützel
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 26+ messages in thread
From: Hans Reiser @ 2003-01-22 21:57 UTC (permalink / raw)
  To: Niek; +Cc: reiserfs-list, Vitaly Fertman, mason

Niek wrote:

>Data is there. I'm very happy!!
>What did we learn. A lot. One of them is that computers are more unreliable
>than humans and they are not very reliable. Look e.g. at Suse. Their auto
>update should have updated the reiserfs programs to the newest versions, but
>they forgot to put that in. I'll mail them.
>
>
Vitaly and Chris, can you two follow up on that with SuSE?

>One thing that Namesys should do is to make the software more user-friendly
>and put this type of info in their webpages
>
If you could define exactly what you'd like and where, it would be helpful.

I think we should have fsck ask the user to check and see if they are 
using the latest fsck.

"You really want to be using the very latest stable version of fsck when 
you run it.  Please go to www.namesys.com, click on the download button, 
and see if you have the latest version before you continue."

I think debugreiserfs should print some explanation of how it does not 
do what fsck does but does something else.  We should remember that 
utilities like debugreiserfs and fsck are rarely used twice by most 
users, so explanations will not be annoyingly repetitive for most of 
them.  If you think about it, debugreiserfs sounds like it does what 
fsck actually does.

>Crashed HD's are a pain in the ass to repair. This type of IBM drives have a
>flaw. Ibm confirmed this, but they did not remove them from the market.
>
Yes, recent IBMs from Hungary are known to be especially bad.


-- 
Hans




* Re: Hard disk crash and solution
  2003-01-22 21:57 ` Hans Reiser
@ 2003-01-23  6:43   ` Ookhoi
  2003-01-23  6:52     ` Oleg Drokin
  0 siblings, 1 reply; 26+ messages in thread
From: Ookhoi @ 2003-01-23  6:43 UTC (permalink / raw)
  To: Hans Reiser; +Cc: Niek, reiserfs-list, Vitaly Fertman, mason

Hans Reiser wrote (ao):
> I think we should have fsck ask the user to check and see if they are
> using the latest fsck.
> 
> "You really want to be using the very latest stable version of fsck
> when you run it.  Please go to www.namesys.com, click on the download
> button, and see if you have the latest version before you continue."

If XFree86 crashes, it says something like, 

"This is release foo, date bar. If it is more than six months old, or if
your video card is newer than this release, please see if a newer
release fixes it before you report a bug"

Maybe fsck should say the same. It could very well be that you don't
have access to the internet when you have to perform a fsck :-)  And
then you still would like to know how old your tools are. If you release
new tools at least every two months, you can say two months. You can
also point to the -pre releases (with the usual warning).

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hard disk crash and solution
  2003-01-23  6:43   ` Ookhoi
@ 2003-01-23  6:52     ` Oleg Drokin
  2003-01-23  7:00       ` Ookhoi
  0 siblings, 1 reply; 26+ messages in thread
From: Oleg Drokin @ 2003-01-23  6:52 UTC (permalink / raw)
  To: Ookhoi; +Cc: Hans Reiser, Niek, reiserfs-list, Vitaly Fertman, mason

Hello!

On Thu, Jan 23, 2003 at 07:43:57AM +0100, Ookhoi wrote:
> > I think we should have fsck ask the user to check and see if they are
> > using the latest fsck.
> > "You really want to be using the very latest stable version of fsck
> > when you run it.  Please go to www.namesys.com, click on the download
> > button, and see if you have the latest version before you continue."
> If XFree86 crashes, it says something like, 
> "This is release foo, date bar. If it is more than six months old, or if
> your video card is newer than this release, please see if a newer
> release fixes it before you report a bug"

I think it does already, along with suggesting where to send bug reports
if this is the latest release.

> Maybe fsck should say the same. It could very well be that you don't
> have access to the internet when you have to perform a fsck :-)  And
> then you still would like to know how old your tools are. If you release
> new tools at least every two months, you can say two months. You can
> also point to the -pre releases (with the usual warning).

main.c:  ** If you are using the latest reiserfsprogs and  it fails **\n\
main.c-  ** please  email bug reports to reiserfs-list@namesys.com, **\n\
main.c-  ** providing  as  much  information  as  possible --  your **\n\

Though this warning is printed at the beginning.

Bye,
    Oleg

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hard disk crash and solution
  2003-01-23  6:52     ` Oleg Drokin
@ 2003-01-23  7:00       ` Ookhoi
  0 siblings, 0 replies; 26+ messages in thread
From: Ookhoi @ 2003-01-23  7:00 UTC (permalink / raw)
  To: Oleg Drokin
  Cc: Ookhoi, Hans Reiser, Niek, reiserfs-list, Vitaly Fertman, mason

Oleg Drokin wrote (ao):
> Hello!
> 
> On Thu, Jan 23, 2003 at 07:43:57AM +0100, Ookhoi wrote:
> > > I think we should have fsck ask the user to check and see if they
> > > are using the latest fsck.
> > > "You really want to be using the very latest stable version of
> > > fsck when you run it. Please go to www.namesys.com, click on the
> > > download button, and see if you have the latest version before you
> > > continue."
> >
> > If XFree86 crashes, it says something like, 
> > "This is release foo, date bar. If it is more than six months old,
> > or if your video card is newer than this release, please see if a
> > newer release fixes it before you report a bug"
> 
> I think it does already. Along with suggesting on where to send
> bugreports if this is the latest release.
> 
> > Maybe fsck should say the same. It could very well be that you don't
> > have access to the internet when you have to perform a fsck :-)  And
> > then you still would like to know how old your tools are. If you
> > release new tools at least every two months, you can say two months.
> > You can also point to the -pre releases (with the usual warning).
> 
> main.c:  ** If you are using the latest reiserfsprogs and  it fails **\n\
> main.c-  ** please  email bug reports to reiserfs-list@namesys.com, **\n\
> main.c-  ** providing  as  much  information  as  possible --  your **\n\
> 
> Though this warning is printed at the beginning.

What I mean is that it should print the release date also. Now it says:

<-------------reiserfsck, 2002------------->
reiserfsprogs 3.6.5-pre1

or

fsck.reiser4 0.2.0
Copyright (C) 2001, 2002 by Hans Reiser, licensing governed by
reiser4progs/COPYING.

It doesn't say when it was released, so you can't determine whether these
tools are old or not. This can be a problem if you don't have access to the
internet because you need to fsck your one and only computer. If you
notice that you have (very) old tools, you might want to make some extra
effort and go to a place where you do have internet access to fetch
newer tools.
And it doesn't say that you shouldn't use fsck if it is older than, say,
two months.
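
A minimal sketch of that idea, in Python rather than the C used by the real
tools. Everything here is an assumption for illustration: the names
RELEASE_DATE, MAX_AGE_DAYS and age_warning are hypothetical, and the release
date is a placeholder, not the actual date of any reiserfsprogs release.

```python
from datetime import date

# Hypothetical values for illustration; not taken from reiserfsprogs.
RELEASE_DATE = date(2003, 1, 1)   # placeholder release date
MAX_AGE_DAYS = 60                 # roughly the "two months" suggested above


def age_warning(today, released=RELEASE_DATE, max_age_days=MAX_AGE_DAYS):
    """Return a banner line with the release date, plus a warning if stale."""
    age = (today - released).days
    lines = [f"released {released.isoformat()} ({age} days ago)"]
    if age > max_age_days:
        # Works without network access: the date is baked into the binary.
        lines.append(
            f"This release is more than {max_age_days} days old; "
            "please look for a newer version before you continue."
        )
    return "\n".join(lines)
```

Calling age_warning(date.today()) at startup would print the date and, for an
old build, the warning. It only compares against the clock of the local
machine, so a wrong system date gives a wrong answer.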

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hard disk crash and solution
  2003-01-22 21:35 Hard disk crash and solution Niek
  2003-01-22 21:57 ` Hans Reiser
@ 2003-01-22 22:01 ` Dieter Nützel
  2003-01-22 22:02 ` Hans Reiser
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 26+ messages in thread
From: Dieter Nützel @ 2003-01-22 22:01 UTC (permalink / raw)
  To: Niek, reiserfs-list

Am Mittwoch, 22. Januar 2003 22:35 schrieb Niek:
> Title: IBM DTLA 307045 Hard disk crash

[Nice story snipped]

> mymachine:/rescued # mount -treiserfs -o loop -r hdd9.img /d9
> mymachine:/rescued #
>
>
> ------- Hee, no errors any more -------- let's check it's contents
>
>
>
> Data is there. I'm very happy!!
> What did we learn. A lot. One of them is that computers are more unreliable
> than humans and they are
> not very reliable. Look e.g. at Suse. Their auto update should have updated
> the reiserfs programs to the
> newest versions, but they forgot to put that in. I'll mail them.
> One thing that Namesys should do is to make the software more user-friendly
> and put this type of info in their webpages
> Crashed HD's are a pain in the ass to repair. This type of IBM drives have
> a flaw. Ibm confirmed this, but they did not remove them from the market.

You should have learnt to start with the "latest" available tools from the 
InterNet ;-)

3.6.5-pre1 is latest.
ftp://ftp.namesys.com/pub/reiserfsprogs/pre

Greetings,
	Dieter

PS There is documentation.

-- 
Dieter Nützel
Graduate Student, Computer Science

University of Hamburg
Department of Computer Science
@home: Dieter.Nuetzel at hamburg.de (replace at with @)


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hard disk crash and solution
  2003-01-22 21:35 Hard disk crash and solution Niek
  2003-01-22 21:57 ` Hans Reiser
  2003-01-22 22:01 ` Dieter Nützel
@ 2003-01-22 22:02 ` Hans Reiser
  2003-01-23  0:16 ` Rudy L. Zijlstra
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 26+ messages in thread
From: Hans Reiser @ 2003-01-22 22:02 UTC (permalink / raw)
  To: Niek; +Cc: reiserfs-list, Vitaly Fertman

I forgot to say, thanks a lot for your detailed story.  It was helpful 
for me; I have only run fsck myself once (after I LVM'd my reiserfs 
partition and scragged the beginning of it containing the root 
directory; oh, what a misery it was to reconstruct by sorting through a 
bunch of unnamed files to be sure none of them were valuable).

Vitaly, I think we probably should have specific instructions for what 
to do if you destroy the beginning of the partition and need to 
reconstruct only some of the bitmaps.  It is the most common thing that 
people do (as opposed to disks doing) that requires fsck.  What do you 
think?

-- 
Hans

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hard disk crash and solution
  2003-01-22 21:35 Hard disk crash and solution Niek
                   ` (2 preceding siblings ...)
  2003-01-22 22:02 ` Hans Reiser
@ 2003-01-23  0:16 ` Rudy L. Zijlstra
  2003-01-23  6:47   ` Ookhoi
  2003-01-23  7:45 ` Todd Lyons
  2003-01-26 19:42 ` Zygo Blaxell
  5 siblings, 1 reply; 26+ messages in thread
From: Rudy L. Zijlstra @ 2003-01-23  0:16 UTC (permalink / raw)
  To: Niek; +Cc: reiserfs-list

Niek wrote:
<snip>

>Data is there. I'm very happy!!
>What did we learn. A lot. One of them is that computers are more unreliable
>than humans and they are
>not very reliable. Look e.g. at Suse. Their auto update should have updated
>the reiserfs programs to the
>newest versions, but they forgot to put that in. I'll mail them.
>One thing that Namesys should do is to make the software more user-friendly
>and put this type of info in their webpages
>Crashed HD's are a pain in the ass to repair. This type of IBM drives have a
>flaw. Ibm confirmed this, but they did not remove them from the market.
>
>
>
>Have a lot of fun,
>
>Nick
>  
>
I am missing yet another lesson:

Make backups of important data!

Cheers,

Rudy

P.S. I run a nightly incremental backup to a different machine... I 
*hate* losing data.
And yes, I have been saved by my backups at least twice already. I've 
had disks fail on me (including SCSI disks, and including that idiot IBM 
DTLA), and I've had the original user fault (overwriting of file with 
different content, then a few weeks later: "Oops, that file does not 
contain what it should")


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hard disk crash and solution
  2003-01-23  0:16 ` Rudy L. Zijlstra
@ 2003-01-23  6:47   ` Ookhoi
  2003-01-23  8:02     ` Rudy L. Zijlstra
  0 siblings, 1 reply; 26+ messages in thread
From: Ookhoi @ 2003-01-23  6:47 UTC (permalink / raw)
  To: Rudy L. Zijlstra; +Cc: Niek, reiserfs-list

Rudy L. Zijlstra wrote (ao):
> I am missing yet another lesson:
> 
> Make backups of important data!

Then you didn't pay attention to the mail from Francois-Rene
yesterday. ;-)  It contained highly religious phrases about The Almighty
Baah-Kuhp.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hard disk crash and solution
  2003-01-23  6:47   ` Ookhoi
@ 2003-01-23  8:02     ` Rudy L. Zijlstra
  2003-01-23 13:14       ` Dieter Nützel
  0 siblings, 1 reply; 26+ messages in thread
From: Rudy L. Zijlstra @ 2003-01-23  8:02 UTC (permalink / raw)
  To: ookhoi; +Cc: Niek, reiserfs-list

[-- Attachment #1: Type: text/html, Size: 821 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hard disk crash and solution
  2003-01-23  8:02     ` Rudy L. Zijlstra
@ 2003-01-23 13:14       ` Dieter Nützel
  0 siblings, 0 replies; 26+ messages in thread
From: Dieter Nützel @ 2003-01-23 13:14 UTC (permalink / raw)
  To: Rudy L. Zijlstra; +Cc: reiserfs-list

Am Donnerstag, 23. Januar 2003 09:02 schrieb Rudy L. Zijlstra:

_NO_ HTML mails to mailing lists, please!

Thank you.

-Dieter 

> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
> <html>
> <head>
>   <title></title>
> </head>
> <body>
> Ookhoi wrote:<br>
> <blockquote type="cite" cite="mid20030123074755.B2442@humilis">
>   <pre wrap="">Rudy L. Zijlstra wrote (ao):
>   </pre>
>   <blockquote type="cite">
>     <pre wrap="">I am missing yet another lesson:
>
> Make backups of important data!
>     </pre>
>   </blockquote>
>   <pre wrap=""><!---->
> Then you didn't pay attention to the mail from Francois-Rene
> yesterday. ;-)  It contained highly religious phrases about The Almighty
> Baah-Kuhp.
>   </pre>
> </blockquote>
> Doh. <span class="moz-smiley-s3"><span> ;-) </span></span>&nbsp;I was
> talking about the lessons Niek had learned.. Do not know whether Niek
> has learned to pray yet! Judging from his lessons list he is still a
> firm unbeliever.<br>
> </body>
> </html>

-- 
Dieter Nützel
Graduate Student, Computer Science

University of Hamburg
Department of Computer Science
@home: Dieter.Nuetzel at hamburg.de (replace at with @)

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hard disk crash and solution
  2003-01-22 21:35 Hard disk crash and solution Niek
                   ` (3 preceding siblings ...)
  2003-01-23  0:16 ` Rudy L. Zijlstra
@ 2003-01-23  7:45 ` Todd Lyons
  2003-01-23  9:40   ` Hans Reiser
  2003-01-26 19:42 ` Zygo Blaxell
  5 siblings, 1 reply; 26+ messages in thread
From: Todd Lyons @ 2003-01-23  7:45 UTC (permalink / raw)
  To: reiserfs-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Niek wanted us to know:

>Title: IBM DTLA 307045 Hard disk crash

I've been told that the IBM DTLA 3070xx series drives are part of a
class action suit because of IBM's claim that they are not certified for
24x7 usage, only 8 hours per day.

Personally, I have 6 machines with 307030 drives (2 drives per machine
in RAID 0 configuration) and lose about one drive per month.
- -- 
Blue skies...		Todd
| Get a bigger hammer!   |  Free Linux accounts!  Ssh to 127.0.0.1.  |
| http://www.mrball.net  |  Use your existing name and password.     |
| http://faq.mrball.net  |                         --Paul Timmins    |
   Linux kernel 2.4.19-16mdk   3 users,  load average: 0.02, 0.01, 0.00
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.7 (GNU/Linux)

iD8DBQE+L52CIBT1264ScBURAprLAKCZPrxhDjVLK0iWljpn6eBu4lnwEwCbBgDv
G10EN0eBCvxK3TTF2m0aYO4=
=VRfQ
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hard disk crash and solution
  2003-01-23  7:45 ` Todd Lyons
@ 2003-01-23  9:40   ` Hans Reiser
  0 siblings, 0 replies; 26+ messages in thread
From: Hans Reiser @ 2003-01-23  9:40 UTC (permalink / raw)
  To: Todd Lyons; +Cc: reiserfs-list

Todd Lyons wrote:

>-----BEGIN PGP SIGNED MESSAGE-----
>Hash: SHA1
>
>Niek wanted us to know:
>
>  
>
>>Title: IBM DTLA 307045 Hard disk crash
>>    
>>
>
>I've been told that the IBM DTLA 3070xx series drives are part of a
>class action suit because of IBM's claim that they are not certified for
>24x7 usage, only 8 hours per day.
>
>Personally, I have 6 machines with 307030 drives (2 drives per machine
>in RAID 0 configuration) and lose about one drive per month.
>- -- 
>Blue skies...		Todd
>| Get a bigger hammer!   |  Free Linux accounts!  Ssh to 127.0.0.1.  |
>| http://www.mrball.net  |  Use your existing name and password.     |
>| http://faq.mrball.net  |                         --Paul Timmins    |
>   Linux kernel 2.4.19-16mdk   3 users,  load average: 0.02, 0.01, 0.00
>-----BEGIN PGP SIGNATURE-----
>Version: GnuPG v1.0.7 (GNU/Linux)
>
>iD8DBQE+L52CIBT1264ScBURAprLAKCZPrxhDjVLK0iWljpn6eBu4lnwEwCbBgDv
>G10EN0eBCvxK3TTF2m0aYO4=
>=VRfQ
>-----END PGP SIGNATURE-----
>
>
>  
>
They should be forced to recall them.

-- 
Hans



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hard disk crash and solution
  2003-01-22 21:35 Hard disk crash and solution Niek
                   ` (4 preceding siblings ...)
  2003-01-23  7:45 ` Todd Lyons
@ 2003-01-26 19:42 ` Zygo Blaxell
  2003-01-27  4:53   ` Ookhoi
  2003-01-29 11:24   ` Hans Reiser
  5 siblings, 2 replies; 26+ messages in thread
From: Zygo Blaxell @ 2003-01-26 19:42 UTC (permalink / raw)
  To: reiserfs-list

In article <IKENJBHCILNPNAGHKCHFOEDMCCAA.Art@chello.nl>,
Niek <Art@chello.nl> wrote:
>Title: IBM DTLA 307045 Hard disk crash
>
>I bought this disk (46 GB) about two years ago. One of the best they
>claimed.
[...]
>What is the fucking MTBF of these drives?? Is it close to one year like I
>experienced?

My employer used a total of 13 of these drives (various sizes, but all the
same family) for RAID arrays.  We originally purchased 10, and replaced
the first 3 to die under IBM warranty.  After the first 3, we started
replacing dead disks with some other brand of drive.  In the end 9 of the
IBM drives died.  Some days two or three disks would fail at a time.
We didn't bother waiting for the last 4, but presumably they would have
died if we hadn't replaced all of them.  We took the rest of them apart
to use as cubicle wall decorations, shaving mirrors, etc.

Even worse, for about 6 hours before some of them died, they randomly
flipped a few bits here and there in the data, which made RAID 
redundancy useless.
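
One way to read that remark, sketched in Python under stated assumptions: a
plain mirror can notice that two copies differ, but it has no way to tell
which copy is the good one. The stored CRC below is an assumption added for
contrast, not something the arrays described here had, and the function names
are hypothetical.

```python
import zlib


def mirror_read(copy_a, copy_b):
    """Plain mirror: return the data if both copies agree, else None."""
    if copy_a == copy_b:
        return copy_a
    return None  # mismatch seen, but nothing says which copy is correct


def checksummed_read(copy_a, copy_b, crc):
    """Mirror plus a stored CRC32: return the first copy that verifies."""
    for copy in (copy_a, copy_b):
        if zlib.crc32(copy) == crc:
            return copy
    return None  # both copies are damaged


good = b"reiserfs block"
crc = zlib.crc32(good)                    # recorded when the block was written
bad = bytes([good[0] ^ 0x01]) + good[1:]  # one silently flipped bit
```

Here mirror_read(bad, good) cannot choose a copy, while
checksummed_read(bad, good, crc) returns the intact one.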

-- 
Zygo Blaxell (Laptop) <zblaxell@feedme.hungrycats.org>
GPG = D13D 6651 F446 9787 600B AD1E CCF3 6F93 2823 44AD

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hard disk crash and solution
  2003-01-26 19:42 ` Zygo Blaxell
@ 2003-01-27  4:53   ` Ookhoi
  2003-01-27  7:03     ` Oleg Drokin
  2003-01-29 11:24   ` Hans Reiser
  1 sibling, 1 reply; 26+ messages in thread
From: Ookhoi @ 2003-01-27  4:53 UTC (permalink / raw)
  To: Zygo Blaxell; +Cc: reiserfs-list

Zygo Blaxell wrote (ao):
> >Title: IBM DTLA 307045 Hard disk crash
> >
> >I bought this disk (46 GB) about two years ago. One of the best they
> >claimed.
> [...]
> >What is the fucking MTBF of these drives?? Is it close to one year
> >like I experienced?

That is quite good for those drives :-)

> My employer used a total of 13 of these drives (various sizes, but all
> the same family) for RAID arrays. We originally purchased 10, and
> replaced the first 3 to die under IBM warranty. After the first 3, we
> started replacing dead disks with some other brand of drive. In the
> end 9 of the IBM drives died. Some days two or three disks would fail
> at a time. We didn't bother waiting for the last 4, but presumably
> they would have died if we hadn't replaced all of them. We took the
> rest of them apart to use as cubicle wall decorations, shaving
> mirrors, etc.

We had about 25% fail within a few months (30 systems). I must say that
the systems ran a bit hot inside though. Ours weren't the infamous
deathstar disks btw.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hard disk crash and solution
  2003-01-27  4:53   ` Ookhoi
@ 2003-01-27  7:03     ` Oleg Drokin
  2003-01-27 23:34       ` Zygo Blaxell
  2003-02-02 21:11       ` tim fairchild
  0 siblings, 2 replies; 26+ messages in thread
From: Oleg Drokin @ 2003-01-27  7:03 UTC (permalink / raw)
  To: Ookhoi; +Cc: Zygo Blaxell, reiserfs-list

Hello!

On Mon, Jan 27, 2003 at 05:53:31AM +0100, Ookhoi wrote:
> > >Title: IBM DTLA 307045 Hard disk crash
> > >
> > >I bought this disk (46 GB) about two years ago. One of the best they
> > >claimed.
> > [...]
> > >What is the fucking MTBF of these drives?? Is it close to one year
> > >like I experienced?
> That is quite good for those drives :-)

I bought an IBM DTLA-307030 made in Hungary 2 years ago.
It is still working (though it already has ~1500 bad sectors remapped),
aside from making unusual noises when remapping bad sectors ;)
I may be just lucky.
Also, I try to run it in a cool environment, so that may help it too.

Bye,
    Oleg

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hard disk crash and solution
  2003-01-27  7:03     ` Oleg Drokin
@ 2003-01-27 23:34       ` Zygo Blaxell
  2003-02-02 21:11       ` tim fairchild
  1 sibling, 0 replies; 26+ messages in thread
From: Zygo Blaxell @ 2003-01-27 23:34 UTC (permalink / raw)
  To: reiserfs-list

In article <20030127100349.B22720@namesys.com>,
Oleg Drokin  <green@namesys.com> wrote:
>I bought IBM DTLA-307030 made in Hungary 2 years ago.
>It is still working (though it already have ~1500 bad sectors remapped)
>aside of making unusual noises when remapping bad sectors ;)
>I may be just lucky.
>Also I try to run it in cool environment, so that may help it too.

Indeed, IBM's disks do work a little better if they are aggressively
cooled.  I went looking for where our surviving 4 disks went and it
turns out that they found homes in desktops.  They'll last for several
months longer (and counting) if you put a powerful fan on top of them.

Still, most other disk brands don't need a dedicated cooling fan just
to work properly in a single-disk configuration, and usually more than
100 bad sectors remapped after purchase is a sign of imminent total
failure.
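
The rule of thumb above can be written down as a tiny check. This is a sketch
only: the function name and the status strings are hypothetical, the
threshold of 100 is the figure quoted above rather than a vendor
specification, and the count is assumed to come from whatever tool reports
remapped sectors for the drive.

```python
# Hypothetical helper; 100 is the rule-of-thumb figure quoted above.
REMAP_WARN_THRESHOLD = 100


def remap_status(remapped_sectors, threshold=REMAP_WARN_THRESHOLD):
    """Classify a drive by its count of remapped (reallocated) sectors."""
    if remapped_sectors < 0:
        raise ValueError("sector count cannot be negative")
    if remapped_sectors > threshold:
        return "suspect: back up now and plan a replacement"
    if remapped_sectors > 0:
        return "watch: some sectors remapped"
    return "ok: no remapped sectors"
```

By this measure a drive with ~1500 remapped sectors lands firmly in the
"suspect" bucket, even if it is still working.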

-- 
Zygo Blaxell (Laptop) <zblaxell@feedme.hungrycats.org>
GPG = D13D 6651 F446 9787 600B AD1E CCF3 6F93 2823 44AD

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hard disk crash and solution
  2003-01-27  7:03     ` Oleg Drokin
  2003-01-27 23:34       ` Zygo Blaxell
@ 2003-02-02 21:11       ` tim fairchild
  2003-02-03  4:49         ` Ookhoi
  1 sibling, 1 reply; 26+ messages in thread
From: tim fairchild @ 2003-02-02 21:11 UTC (permalink / raw)
  To: reiserfs-list

On Monday 27 Jan 2003 5:03 pm, Oleg Drokin wrote:

> I bought IBM DTLA-307030 made in Hungary 2 years ago.
> It is still working (though it already have ~1500 bad sectors remapped)
> aside of making unusual noises when remapping bad sectors ;)
> I may be just lucky.
> Also I try to run it in cool environment, so that may help it too.

Sorry to go back off topic, but does anyone have any experience with the more 
recent 40 GB IBM 120GP (IC35L040AVVN07) drives? I have one a few weeks old and 
it's already making some evil-sounding noises...

tim

-- 
---------------------------------------------------------
  Tim & Therese Fairchild
  Atchafalaya Border Collies.
  Kuttabul, Queensland, Australia.
---------------------------------------------------------
 Email       mailto:amosf@mrbean.net.au
 Homepage    http://www.bcs4me.com
---------------------------------------------------------
 


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hard disk crash and solution
  2003-02-02 21:11       ` tim fairchild
@ 2003-02-03  4:49         ` Ookhoi
  2003-02-04  1:31           ` tim fairchild
  2003-02-04  3:08           ` Todd Lyons
  0 siblings, 2 replies; 26+ messages in thread
From: Ookhoi @ 2003-02-03  4:49 UTC (permalink / raw)
  To: tim fairchild; +Cc: reiserfs-list

tim fairchild wrote (ao):
> On Monday 27 Jan 2003 5:03 pm, Oleg Drokin wrote:
> > I bought IBM DTLA-307030 made in Hungary 2 years ago.
> > It is still working (though it already have ~1500 bad sectors
> > remapped) aside of making unusual noises when remapping bad sectors
> > ;) I may be just lucky.
> > Also I try to run it in cool environment, so that may help it too.
>
> Sorry to go back off topic, but does anyone have any experience with
> the more recent 40gb IBM 120GP (IC35L040AVVN07) drives. I have one a
> few weeks old and it's already making some evil sounding noises...

A lot? Sometimes you can hear a disk recalibrate, which is not bad, but
that should be only now and then.

Do you have disk related errors in your logs?

Try to run an ibm drive fitness program and see what it tells you about
the disk.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hard disk crash and solution
  2003-02-03  4:49         ` Ookhoi
@ 2003-02-04  1:31           ` tim fairchild
  2003-02-04  3:05             ` Manuel Krause
  2003-02-04  3:08           ` Todd Lyons
  1 sibling, 1 reply; 26+ messages in thread
From: tim fairchild @ 2003-02-04  1:31 UTC (permalink / raw)
  To: ookhoi; +Cc: reiserfs-list

On Monday 03 Feb 2003 2:49 pm, Ookhoi wrote:

> A lot? Sometimes you can hear a disk recalibrate, which is not bad, but
> that should be only now and then.

It has always made an occasional whir which I assumed was a disk recalibrate - 
this happens a couple of times an hour I suppose. 

Just lately it has made a loud double clicking sound. So far only heard 
twice... 

> Do you have disk related errors in your logs?

Nothing yet. So far appears in perfect condition. I have reiser, ext3 and fat 
partitions. 

Maybe just worried by all the comments on IBM drives :-)

tim

-- 
---------------------------------------------------------
  Tim & Therese Fairchild
  Atchafalaya Border Collies.
  Kuttabul, Queensland, Australia.
---------------------------------------------------------
 Email       mailto:amosf@mrbean.net.au
 Homepage    http://www.bcs4me.com
---------------------------------------------------------
 


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hard disk crash and solution
  2003-02-04  1:31           ` tim fairchild
@ 2003-02-04  3:05             ` Manuel Krause
  0 siblings, 0 replies; 26+ messages in thread
From: Manuel Krause @ 2003-02-04  3:05 UTC (permalink / raw)
  Cc: reiserfs-list

On 02/04/2003 02:31 AM, tim fairchild wrote:
> On Monday 03 Feb 2003 2:49 pm, Ookhoi wrote:
> 
>>A lot? Sometimes you can hear a disk recalibrate, which is not bad, but
>>that should be only now and then.
> 
> It has always made an occasional whir which I assumed was a disk recalibrate - 
> this happens a couple of times an hour I suppose. 

A "whir(r/l)"? I'm not very used to English sound descriptions. My 
notebook's spare disk, an IBM-DARA-218000, makes that noise when coming out of 
any kind of standby state.

> Just lately it has made a loud double clicking sound. So far only heard 
> twice... 

My above-mentioned disk does this too, randomly: sometimes under 
high load, sometimes after only a simple sync. Dunno. Maybe _that's_ the 
"recalibrating". No errors so far. But this disk may be quite old (18GB, 
purchased approx. 4 years ago) - made of steel.

>>Do you have disk related errors in your logs?
> 
> Nothing yet. So far appears in perfect condition. I have reiser, ext3 and fat 
> partitions. 
> 
> Maybe just worried by all the comments on IBM drives :-)
> 
> tim

Best wishes!

Manuel

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hard disk crash and solution
  2003-02-03  4:49         ` Ookhoi
  2003-02-04  1:31           ` tim fairchild
@ 2003-02-04  3:08           ` Todd Lyons
  1 sibling, 0 replies; 26+ messages in thread
From: Todd Lyons @ 2003-02-04  3:08 UTC (permalink / raw)
  To: reiserfs-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Ookhoi wanted us to know:

>A lot? Sometimes you can hear a disk recalibrate, which is not bad, but
>that should be only now and then.

If you're hearing what I have heard, it's the sound your ears hear when
you're scratching your head.  At that point it's trying to read a bad
sector and multiple reads/writes seemingly have stacked up in the queue
and the machine has come nearly to a halt.  It's still running, but
barely, more like crawling.

>Try to run an ibm drive fitness program and see what it tells you about
>the disk.

They refer to it as DFT and they have a couple of different versions as
I remember.  It's good software too.  Boots from PC-DOS :)
- -- 
Blue skies...	Todd 	Public key: http://www.mrball.net/todd.asc
...and I will strike down upon thee with great vengeance and furious
 anger, those who attempt to poison and destroy my binaries, and you 
    will know my name is root, when I lay my vengeance upon thee.
   Linux kernel 2.4.19-16mdk   3 users,  load average: 0.00, 0.02, 0.18
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.7 (GNU/Linux)

iD8DBQE+Py6RIBT1264ScBURAqd5AKCoW/QHfvqAiIQNmdcMxUJVT4VKiACgucMx
PeceAr3qu73ZAU08mssAUDY=
=VJP/
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hard disk crash and solution
  2003-01-26 19:42 ` Zygo Blaxell
  2003-01-27  4:53   ` Ookhoi
@ 2003-01-29 11:24   ` Hans Reiser
  2003-02-06  3:04     ` Zygo Blaxell
  1 sibling, 1 reply; 26+ messages in thread
From: Hans Reiser @ 2003-01-29 11:24 UTC (permalink / raw)
  To: Zygo Blaxell; +Cc: reiserfs-list

Zygo Blaxell wrote:

>In article <IKENJBHCILNPNAGHKCHFOEDMCCAA.Art@chello.nl>,
>Niek <Art@chello.nl> wrote:
>  
>
>>Title: IBM DTLA 307045 Hard disk crash
>>
>>I bought this disk (46 GB) about two years ago. One of the best they
>>claimed.
>>    
>>
>[...]
>  
>
>>What is the fucking MTBF of these drives?? Is it close to one year like I
>>experienced?
>>    
>>
>
>My employer used a total of 13 of these drives (various sizes, but all the
>same family) for RAID arrays.  We originally purchased 10, and replaced
>the first 3 to die under IBM warranty.  After the first 3, we started
>replacing dead disks with some other brand of drive.  In the end 9 of the
>IBM drives died.  Some days two or three disks would fail at a time.
>We didn't bother waiting for the last 4, but presumably they would have
>died if we hadn't replaced all of them.  We took the rest of them apart
>to use as cubicle wall decorations, shaving mirrors, etc.
>
>Even worse, for about 6 hours before some of them died, they randomly
>flipped a few bits here and there in the data, which made RAID 
>redundancy useless.
>
>  
>
Maybe we should warn against them in our FAQ?

-- 
Hans



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hard disk crash and solution
  2003-01-29 11:24   ` Hans Reiser
@ 2003-02-06  3:04     ` Zygo Blaxell
  2003-02-06  9:56       ` Hans Reiser
  0 siblings, 1 reply; 26+ messages in thread
From: Zygo Blaxell @ 2003-02-06  3:04 UTC (permalink / raw)
  To: reiserfs-list

In article <3E37B9DC.20103@namesys.com>,
Hans Reiser  <reiser@namesys.com> wrote:
>Zygo Blaxell wrote:
>>My employer used a total of 13 of these drives (various sizes, but all the
>>same family) for RAID arrays.  We originally purchased 10, and replaced
>>the first 3 to die...

>Maybe we should warn against them in our FAQ?

I wouldn't be so quick to condemn all IBM disks.  I have a 342MB IBM IDE
disk which is 11 years old (and counting), and some 1-2 year old IBM laptop
drives that survived longer than their laptops did.

But there were a few notorious ones all built around the same time,
and I hear there was a class-action lawsuit...

-- 
Zygo Blaxell (Laptop) <zblaxell@feedme.hungrycats.org>
GPG = D13D 6651 F446 9787 600B AD1E CCF3 6F93 2823 44AD

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hard disk crash and solution
  2003-02-06  3:04     ` Zygo Blaxell
@ 2003-02-06  9:56       ` Hans Reiser
  2003-02-06 10:21         ` Oleg Drokin
  0 siblings, 1 reply; 26+ messages in thread
From: Hans Reiser @ 2003-02-06  9:56 UTC (permalink / raw)
  To: Zygo Blaxell; +Cc: reiserfs-list, Oleg Drokin

Zygo Blaxell wrote:

>In article <3E37B9DC.20103@namesys.com>,
>Hans Reiser  <reiser@namesys.com> wrote:
>  
>
>>Zygo Blaxell wrote:
>>    
>>
>>>My employer used a total of 13 of these drives (various sizes, but all the
>>>same family) for RAID arrays.  We originally purchased 10, and replaced
>>>the first 3 to die...
>>>      
>>>
>
>  
>
>>Maybe we should warn against them in our FAQ?
>>    
>>
>
>I wouldn't be so quick to condemn all IBM disks.  I have a 342MB IBM IDE
>disk which is 11 years old (and counting), and some 1-2 year old IBM laptop
>drives that survived longer than their laptops did.
>
>But there were a few notorious ones all built around the same time,
>and I hear there was a class-action lawsuit...
>
>  
>
Oleg, please see if you can construct a faq entry warning against the 
particular known bad hard drives.

-- 
Hans



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Hard disk crash and solution
  2003-02-06  9:56       ` Hans Reiser
@ 2003-02-06 10:21         ` Oleg Drokin
  0 siblings, 0 replies; 26+ messages in thread
From: Oleg Drokin @ 2003-02-06 10:21 UTC (permalink / raw)
  To: Hans Reiser; +Cc: Zygo Blaxell, reiserfs-list

Hello!

On Thu, Feb 06, 2003 at 12:56:37PM +0300, Hans Reiser wrote:

> Oleg, please see if you can construct a faq entry warning against the 
> particular known bad hard drives.

Hmm...
Something like

Q: Are there any recommendations for or against particular hard drive manufacturers
for use with reiserfs?

A: There is basically no preference; the general rule "the faster the drive and the lower
the seek time, the better" applies as always. On the other hand, almost every hard drive
manufacturer has a "widely known" broken series of hard drives. The most recent example is
IBM's "Deskstar" series disks, especially DTLA models produced in Hungary in 2000-2001. These
are known to fail very often. Other Deskstar drives also seem not to be a very good
choice. IBM released a note that Deskstar drives should not run for more than 8 hours/day
on average. These drives are also known to be very sensitive to temperature conditions
and are known to fail when overheated. A class action lawsuit against IBM over that drive
series is in progress.

Bye,
    Oleg

^ permalink raw reply	[flat|nested] 26+ messages in thread
