Red Hat 8.0, Kernel Version 2.4.29-pre2, ASUS P4P800SE MB, 3.2 GHz Intel-P4 CPU, 1 GB RAM, ICH5R S-ATA only. SATA Disk = Maxtor 6Y160M0.

This is really busting my balls. Around August or September of last year, a hard drive on my server that was a year and one month old (and out of warranty) gave out and crashed the system. This was a Maxtor 4R120L4 120GB disk, my newest at the time, which I had installed as the server's root drive. It has run really hot since the crash, and it has (I assume) taken out 3 different brand new power supplies (230, 350 & 450 Watts).

I bought a new computer, this one, with two new SATA disks and a 550 Watt power supply, and copied the partitions of the bad disk (hdc) onto the second SATA disk (sdb) as giant 60GB files. I was able to run e2fsck on the files and recovered most of my data by hand. Just about everything turned up in the lost+found directories, and I was able to move a lot of the stuff back to where it belonged. This took me most of the Christmas Holiday break.

Since my grades at school suffered severely during the quarter while I was trying to rebuild my server, I promised my academic advisor I would not touch the computer again until spring break. That is now, and my new SATA drive (with all of my recovered data and two weeks' worth of Christmas Holiday work) seems to have also fortuitously crashed just before I could transfer the data back to my new server. ARRGGGHHHH!!!!

The setup is like this. I have a firewall with http and ftp software that uses nfs to serve up files from this computer's brand new sdb1 partition. Not that it has anything to do with the problem, nor was it being used at the time, but one of those directories was writable as an ftp "incoming" directory. All of the other directories were served as read-only. The incoming directory was /linux2/ftp/incoming on this system, where sdb1 was mounted in the root directory as /linux2.
The recovered 60GB partitions were mounted as /linux2/hdcN.img to /somerootdirectory using the loopback device. I haven't touched any part of the loopback files since December and my recovery work. Except for the /linux2/ftp/incoming directory having some occasional write accesses from time to time, nothing has been going on with this drive until I had time to transfer my recovered files this week. This has been working very well, and without any problems, since December.

A couple of weeks ago, on March 6, I shut down the system to do some school work on Win98, and when I rebooted linux, the system said (incorrectly) that I had not unmounted my sdb1 partition cleanly and needed to run fsck. I dutifully performed the fsck because it usually doesn't cause a serious problem. This time was an exception because it found a bunch of errors, which I then elected to let it fix. When I mounted the sdb1 partition afterwards, there were no files to be found... not even the lost+found directory.

I have a feeling that Maxtor's SMARTs in this drive are not so smart after all. Since this drive is still under warranty, it is no problem to replace. I just need to know that I can still get the data back somehow. E2fsed or whatever it was called.... isn't here anymore, and I guess that doesn't really matter anyway because my 60GB files exceed the old 2GB e2fsed limit.

"debugfs" says the following:

[root@Sharlie bkup 20:05:27]# debugfs /dev/sdb1
debugfs 1.27 (8-Mar-2002)
/dev/sdb1: Attempt to read block from filesystem resulted in short read while opening filesystem
debugfs: ls
ls: Filesystem not open
debugfs: open /dev/sdb1
/dev/sdb1: Attempt to read block from filesystem resulted in short read while opening filesystem
debugfs: ls
...

I noticed that the last good bootup was around March 3, and after that I started getting ata2 error messages. F-me! I left the G-dxxxxx "less /var/log/messages.17" on the screen last night while the syslog rotated, and now it's gone too.
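
The "short read while opening filesystem" message above suggests that one of the very first reads from the partition is failing. As a rough, read-only cross-check, the following is a minimal Python sketch (an illustration only, not something taken from the logs) that tries to read the primary ext2/ext3 superblock, which starts 1024 bytes into the partition, and checks its magic number 0xEF53. The function name probe_superblock is hypothetical, and the sketch assumes a modern Python 3 rather than whatever ships with Red Hat 8.0.

```python
import struct

EXT2_SUPERBLOCK_OFFSET = 1024  # primary superblock starts 1024 bytes into the partition
EXT2_SUPERBLOCK_SIZE = 1024    # the superblock structure occupies 1024 bytes
EXT2_MAGIC_OFFSET = 56         # s_magic lives at byte 56 of the superblock
EXT2_MAGIC = 0xEF53            # ext2/ext3 magic number, stored little-endian


def probe_superblock(path):
    """Read the primary superblock of `path` (read-only) and describe what was found.

    `path` may be a block device or an image file. Nothing is written.
    """
    try:
        with open(path, "rb") as dev:
            dev.seek(EXT2_SUPERBLOCK_OFFSET)
            raw = dev.read(EXT2_SUPERBLOCK_SIZE)
    except OSError as err:
        # The kernel refused the open or the read, e.g. an I/O error from the drive.
        return "read error: %s" % err

    if len(raw) < EXT2_SUPERBLOCK_SIZE:
        # Fewer bytes came back than were asked for.
        return "short read: got %d of %d bytes" % (len(raw), EXT2_SUPERBLOCK_SIZE)

    (magic,) = struct.unpack_from("<H", raw, EXT2_MAGIC_OFFSET)
    if magic != EXT2_MAGIC:
        return "bad magic: 0x%04X" % magic
    return "ok"
```

Calling probe_superblock("/dev/sdb1") as root would report "ok" if the primary superblock is readable and carries the right magic. A "read error" or "short read" result would point at the drive or the partition table rather than at the filesystem contents, while "bad magic" would mean the block is readable but no longer looks like an ext2 superblock. The same function can be pointed at one of the hdcN.img files to compare.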
Anyway, I have included as attachments in this message the fsck error messages that were recorded. The system log error messages can be found at this web site:

http://Christoffur.HopTo.Org/Photos/Uploads/

The main points of interest would probably be these files:

http://Christoffur.HopTo.Org/Photos/Uploads/LastGoodBoot.txt  28530 Bytes
http://Christoffur.HopTo.Org/Photos/Uploads/FirstErrorMessages.txt  144961 Bytes
http://Christoffur.HopTo.Org/Photos/Uploads/Prtl-ErrorMessagesB4+AfterFsck.txt  48158 Bytes

Funny, there were no ata2 error messages on the 5th of March even though I performed two bootups. Unfortunately, when the system booted at 15:57 and I performed the two fscks, system logging does not seem to have been enabled. There are no system log messages between 15:57, when I shut down the computer to reboot and run fsck, and 17:01, when the system came back up with the sdb1 entry removed from /etc/fstab.

The above files have been significantly trimmed to reduce the download size. If you want access to the full and complete message logs, you can find them in the Upload directory given above. The problem with the message logs is that the messages from the new SATA ata2 drive look exactly the same as the messages from the old bad hdc drive. You can find those errors in the messages logs 99, 100 and 101 in the same directory as above. It took three days to copy the data off of my bad hdc drive onto the new drive.

I don't suspect an OS or SATA driver problem at this time at all. It would appear that the drive itself, while being only 3 or 4 months old, has developed some kind of hardware problem. I am not quite sure what exactly the problem is, but it won't be until April before I can afford another replacement disk.

I am going to power down and kick the cables on the bad drives and take as many of the good drives out of this computer as I can, and put the good ones back in the new 300MHz Pentium-3 server machine, which is basically a September copy of this machine.
That will probably take all day, if I am lucky. Hopefully 550 Watts should be enough power to drive the two bad disks and the one good SATA root disk. Then I can hopefully copy the data off of the bad drives onto the server where they belong. I have one week to get this done, and then my next quarter starts at school again, and I really need to focus and concentrate on that.

I am especially worried about the root drive on this computer and the one on the server, because they are exactly the same as the one that just crapped out. Except that the one on the server is a regular ATA drive and not an S-ATA. The rest of the drive is supposed to be exactly the same. If either one of those drives blows up, I am really going to be sunk.

I am taking recommendations as to what type of drives to use to replace my Maxtors. I thought that Maxtor was one of the most reliable drives around, and I have been using them exclusively for 10 years or more. It would now appear that I may have been wrong about the Maxtor reliability factor, or that something may have changed in their manufacturing processes. I've got drives that have been running 24 hours a day, 365 days a year, for over 6 years without so much as a hiccup. Now all of a sudden, in two years, I've got two new drives giving me conniption fits and driving me crazy.

I don't have the time, nor can I afford the anger management classes, that would alleviate the serious anti-social behaviors I have developed over the last few months in relation to these incidents. My poor defenseless little 486 server, victim of the first hard drive crash, sitting harmlessly on the floor awaiting a simple drive transplant, just happened to get in the way of my big foot immediately following the last crash on my new machine, and suffered a fatal swift kick that caused a compound processor neck bone fracture. Obviously this situation cannot continue.
If simple inanimate objects like my beloved little Shelby are not safe from my personal and infuriating bouts of computer rage, I hate to think of what the future will bring to the latest and greatest Pentium-4's that happen to lie in close proximity to my next coming affliction.

If someone could please decrypt some of the error messages and let me know if the 6Y160M0 data is salvageable and how that might be accomplished, it might save me a couple of weeks' worth of re-work to re-recover my hdc data, and save future unsuspecting computer systems from further days of unearned relentless hammering torment, retribution and the rubbish heap.

Any other suggestions, observations or criticisms are, as always, welcome. Prayers should also be helpful, if anyone is so disposed, as this can certainly not be an act of random chance; given the critical timing of these events, this must be an act of the hand of God. And I know not why. All I am trying to do is save the world and the entire human race!

Regards from,
--
Christopher R. Thompson
Student of Mechanical Engineering
California State Polytechnic University, Pomona
home: http://www.csupomona.edu/~cthompson1/

SNAFU and .... Who was that guy who said.."If it can go wrong... it will go wrong!"?