* Re: ide drive dying?
2002-09-06 15:13 DevilKin
@ 2002-09-06 15:26 ` Mike Dresser
2002-09-06 15:39 ` Alan Cox
` (2 more replies)
2002-09-06 15:36 ` jbradford
` (5 subsequent siblings)
6 siblings, 3 replies; 63+ messages in thread
From: Mike Dresser @ 2002-09-06 15:26 UTC (permalink / raw)
To: DevilKin; +Cc: linux-kernel
> The drive involved is an IBM-DTLA-307060, which has served me without problems
> now for about 2 years.
IBM DeathStar 75gxp.
One of the worst hard drives ever made. It's quite likely it's failed,
and in fact, two years is pretty impressive out of one of these.
Make backups immediately. Run ibm's DFT tool, get the code to RMA this
thing back to IBM. Sell the replacement they send you to a sucker on
eBay, and buy yourself a new drive. You can pick up 80 gig drives for
around 80 bucks nowadays. I used to recommend Maxtors, until they said
they're cutting their warranty to one year from three. I don't know what
to use anymore.
Mike
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: ide drive dying?
2002-09-06 15:26 ` Mike Dresser
@ 2002-09-06 15:39 ` Alan Cox
2002-09-06 15:42 ` Mike Dresser
2002-09-06 15:44 ` Richard B. Johnson
2002-09-06 17:28 ` Daniel Egger
2 siblings, 1 reply; 63+ messages in thread
From: Alan Cox @ 2002-09-06 15:39 UTC (permalink / raw)
To: Mike Dresser; +Cc: DevilKin, linux-kernel
On Fri, 2002-09-06 at 16:26, Mike Dresser wrote:
> eBay, and buy yourself a new drive. You can pick up 80 gig drives for
> around 80 bucks nowadays. I used to recommend Maxtors, until they said
> they're cutting their warranty to one year from three. I don't know what
> to use anymore.
At current drive density and reliabilities - raid. Software raid setups
are so cheap there is little point not running RAID on IDE nowadays.
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: ide drive dying?
2002-09-06 15:39 ` Alan Cox
@ 2002-09-06 15:42 ` Mike Dresser
2002-09-06 16:14 ` Billy Harvey
0 siblings, 1 reply; 63+ messages in thread
From: Mike Dresser @ 2002-09-06 15:42 UTC (permalink / raw)
To: Alan Cox; +Cc: DevilKin, linux-kernel
On 6 Sep 2002, Alan Cox wrote:
> On Fri, 2002-09-06 at 16:26, Mike Dresser wrote:
> > eBay, and buy yourself a new drive. You can pick up 80 gig drives for
> > around 80 bucks nowadays. I used to recommend Maxtors, until they said
> > they're cutting their warranty to one year from three. I don't know what
> > to use anymore.
>
> At current drive density and reliabilities - raid. Software raid setups
> are so cheap there is little point not running RAID on IDE nowadays
>
Well, I was looking more on the side of the Windows PC's here at the
office, it's a bit expensive to start running raid on those.
Mike
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: ide drive dying?
2002-09-06 15:42 ` Mike Dresser
@ 2002-09-06 16:14 ` Billy Harvey
2002-09-06 16:41 ` Mike Dresser
2002-09-06 18:00 ` jbradford
0 siblings, 2 replies; 63+ messages in thread
From: Billy Harvey @ 2002-09-06 16:14 UTC (permalink / raw)
To: Linux Kernel
On Fri, 2002-09-06 at 11:42, Mike Dresser wrote:
> On 6 Sep 2002, Alan Cox wrote:
>
> > On Fri, 2002-09-06 at 16:26, Mike Dresser wrote:
> > > eBay, and buy yourself a new drive. You can pick up 80 gig drives for
> > > around 80 bucks nowadays. I used to recommend Maxtors, until they said
> > > they're cutting their warranty to one year from three. I don't know what
> > > to use anymore.
> >
> > At current drive density and reliabilities - raid. Software raid setups
> > are so cheap there is little point not running RAID on IDE nowadays
> >
> Well, I was looking more on the side of the Windows PC's here at the
> office, it's a bit expensive to start running raid on those.
>
> Mike
Well, I haven't examined this empirically, but as the quantity of disk
drives in an organization continues increasing, so does the probability
of disk failure, any one of which can mean lost time/money, etc. Drive
reliability is likely not increasing at the same rate that density is,
so the likelihood of lost data is probably increasing. Since LAN speeds
continue to increase, it might start making sense now in clusters of
more than a few machines to make each machine less reliant on its own
disk storage (to the point of not at all other than big swap space) and
use the LAN more. On the LAN put the money into a quality shared
resource - a heavy duty UPS'd, etc. RAID system. Especially if a RAID
system is as easy to build/maintain/use as Alan alludes to (don't know -
never built one).
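As a rough illustration of the scaling argument above (a sketch, not a measurement): if each drive fails independently with probability p over some period, the chance that at least one of n drives fails is 1 - (1 - p)^n. The function name and the 5% figure below are assumptions chosen for the example, not numbers from this thread.

```python
def p_any_failure(p_single: float, n_drives: int) -> float:
    """Probability that at least one of n independent drives fails."""
    return 1.0 - (1.0 - p_single) ** n_drives


# Hypothetical 5% per-drive failure probability; purely illustrative.
for n in (1, 10, 50):
    print(n, round(p_any_failure(0.05, n), 3))
```

Real drives in one office share power, heat and production batches, so the independence assumption is optimistic; treat the result as a lower bound on how quickly the risk grows.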
Billy
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: ide drive dying?
2002-09-06 16:14 ` Billy Harvey
@ 2002-09-06 16:41 ` Mike Dresser
2002-09-06 18:00 ` jbradford
1 sibling, 0 replies; 63+ messages in thread
From: Mike Dresser @ 2002-09-06 16:41 UTC (permalink / raw)
To: Billy Harvey; +Cc: Linux Kernel
On 6 Sep 2002, Billy Harvey wrote:
> use the LAN more. On the LAN put the money into a quality shared
> resource - a heavy duty UPS'd, etc. RAID system. Especially if a RAID
> system is as easy to build/maintain/use as Alan alludes to (don't know -
> never built one).
>
> Billy
And don't forget the cost of cluebats to beat the users over the head
with. I've been trying for 3 years to get people to save their documents
to the H: drive. Still find stuff stored wherever they feel like storing
it.
So each facility has a backup server that nightly grabs their entire
drive, gzip's it, and then dumps it to a DDS-4 tape. Also keeps X days of
daily full backups, and X weeks as well.
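A retention rule of this kind could be sketched roughly as follows. This is only an illustration: the thread does not say what "X" is or which day's backup becomes the weekly one, so `keep_days`, `keep_weeks` and the choice of Sunday are all assumptions.

```python
from datetime import date, timedelta


def backups_to_keep(dates, today, keep_days, keep_weeks):
    """Return the backup dates to retain, oldest first.

    Keeps every backup from the last keep_days days, plus the Sunday
    backup from each of the last keep_weeks weeks. Picking Sunday as the
    weekly backup is an assumption made for this sketch.
    """
    keep = set()
    for d in dates:
        age = (today - d).days
        if 0 <= age < keep_days:
            keep.add(d)                      # recent daily backup
        elif d.weekday() == 6 and 0 <= age < keep_weeks * 7:
            keep.add(d)                      # older weekly (Sunday) backup
    return sorted(keep)


today = date(2002, 9, 6)
nightly = [today - timedelta(days=i) for i in range(30)]
for d in backups_to_keep(nightly, today, keep_days=7, keep_weeks=3):
    print(d.isoformat())
```

Anything not returned would be a candidate for deletion or tape reuse.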
Aside from Windows filesharing being so slow (1500kps via smbtar is average
here), it works quite nicely. Even with a P4/2.53, I still can't get
more than the 1500kps that a P133 is capable of. All the P4 gives me is
the ability to gzip -9 or even bzip2 the files, instead of the gzip -1
that the P133 is capable of in real time.
Mike
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: ide drive dying?
2002-09-06 16:14 ` Billy Harvey
2002-09-06 16:41 ` Mike Dresser
@ 2002-09-06 18:00 ` jbradford
2002-09-06 17:58 ` Mike Dresser
1 sibling, 1 reply; 63+ messages in thread
From: jbradford @ 2002-09-06 18:00 UTC (permalink / raw)
To: Billy Harvey; +Cc: linux-kernel, alan
> On Fri, 2002-09-06 at 11:42, Mike Dresser wrote:
> > On 6 Sep 2002, Alan Cox wrote:
> >
> > > On Fri, 2002-09-06 at 16:26, Mike Dresser wrote:
> > > > eBay, and buy yourself a new drive. You can pick up 80 gig drives for
> > > > around 80 bucks nowadays. I used to recommend Maxtors, until they said
> > > > they're cutting their warranty to one year from three. I don't know what
> > > > to use anymore.
> > >
> > > At current drive density and reliabilities - raid. Software raid setups
> > > are so cheap there is little point not running RAID on IDE nowadays
> > >
> > Well, I was looking more on the side of the Windows PC's here at the
> > office, it's a bit expensive to start running raid on those.
> >
> > Mike
>
> Well, I haven't examined this empirically, but as the quantity of disk
> drives in an organization continues increasing, so does the probability
> of disk failure, any one of which can mean lost time/money, etc. Drive
> reliability is likely not increasing at the same rate that density is,
> so the likelihood of lost data is probably increasing. Since LAN speeds
> continue to increase, it might start making sense now in clusters of
> more than a few machines to make each machine less reliant on its own
> disk storage (to the point of not at all other than big swap space) and
> use the LAN more. On the LAN put the money into a quality shared
> resource - a heavy duty UPS'd, etc. RAID system. Especially if a RAID
> system is as easy to build/maintain/use as Alan alludes to (don't know -
> never built one).
A RAID array isn't a universal solution to all disk related problems, though, is it? I mean, we were talking about buggy firmware earlier on in this thread - if a drive which is part of an array returns corrupted data, without acknowledging it, then you'll read corrupted data from the RAID array. Also, an array of unreliable drives doesn't make a reliable array.
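One way to catch this kind of silent corruption, independent of RAID, is to record a checksum when data is written and compare it when the data is read back. The sketch below only illustrates the idea; the helper names and the choice of SHA-256 are assumptions, not anything proposed in this thread.

```python
import hashlib


def digest(block: bytes) -> str:
    """Checksum recorded alongside the data at write time."""
    return hashlib.sha256(block).hexdigest()


def is_intact(block: bytes, recorded_digest: str) -> bool:
    """True if the block read back still matches its recorded checksum."""
    return digest(block) == recorded_digest


original = b"important data" * 100
recorded = digest(original)

# Simulate a drive silently flipping one bit.
corrupted = bytearray(original)
corrupted[10] ^= 0x01

print(is_intact(original, recorded), is_intact(bytes(corrupted), recorded))
```

A check like this detects the damage but does not repair it; a good copy from a backup or a second mirror is still needed.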
Now that the Smart Suite S.M.A.R.T. applications are unmaintained, would there be any chance of implementing S.M.A.R.T. in the kernel IDE code? I know the IDE code is already a nightmare, but it would be a nice feature. S.M.A.R.T. is terribly underused at the moment - most people don't even know what it is. In fact, I could be wrong, but isn't a subset of S.M.A.R.T. implemented on modern SCSI disks, too?
Monitoring of any kind is always a nice feature to have...
John.
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: ide drive dying?
2002-09-06 15:26 ` Mike Dresser
2002-09-06 15:39 ` Alan Cox
@ 2002-09-06 15:44 ` Richard B. Johnson
2002-09-06 16:19 ` Craig Ruff
2002-09-06 17:28 ` Daniel Egger
2 siblings, 1 reply; 63+ messages in thread
From: Richard B. Johnson @ 2002-09-06 15:44 UTC (permalink / raw)
To: Mike Dresser; +Cc: DevilKin, linux-kernel
On Fri, 6 Sep 2002, Mike Dresser wrote:
> > The drive involved is an IBM-DTLA-307060, which has served me without problems
> > now for about 2 years.
>
> IBM DeathStar 75gxp.
>
> One of the worst hard drives ever made. It's quite likely it's failed,
> and in fact, two years is pretty impressive out of one of these.
>
> Make backups immediately. Run ibm's DFT tool, get the code to RMA this
> thing back to IBM. Sell the replacement they send you to a sucker on
> eBay, and buy yourself a new drive. You can pick up 80 gig drives for
> around 80 bucks nowadays. I used to recommend Maxtors, until they said
> they're cutting their warranty to one year from three. I don't know what
> to use anymore.
>
> Mike
>
IBM DeathStar 75gxp.
Well put. Also, don't turn off this drive --ever. If possible, back-up
to something on a network, not to anything on the IDE bus. If you don't
have anything available, borrow something from work and make a temporary
LAN. With bad sectors and a relocation list already full, this drive
will seize the IDE bus and never let go once you trip it into failure.
Cheers,
Dick Johnson
Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).
The US military has given us many words, FUBAR, SNAFU, now ENRON.
Yes, top management were graduates of West Point and Annapolis.
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: ide drive dying?
2002-09-06 15:44 ` Richard B. Johnson
@ 2002-09-06 16:19 ` Craig Ruff
0 siblings, 0 replies; 63+ messages in thread
From: Craig Ruff @ 2002-09-06 16:19 UTC (permalink / raw)
To: linux-kernel
On Fri, Sep 06, 2002 at 11:44:52AM -0400, Richard B. Johnson wrote:
> IBM DeathStar 75gxp.
>
> Well put. Also, don't turn off this drive --ever. If possible, back-up
> to something on a network, not to anything on the IDE bus.
I had one of these drives fail recently with the dread "clicking of death"
sounds (while it was retrying reads). What I discovered, while backing up
the disk, is that continuing sequential reads past the bad sectors without
an intervening operation would eventually cause the drive to get into a
messed up state where it erroneously reported the following good sectors
as bad.
My strategy to recover the good data was to read sequentially until I
got an error, then explicitly seek to the next good sector and continue
from there. This enabled me to copy the good data.
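The strategy described above might look roughly like this in code. It is a sketch under assumptions: the 512-byte sector size, the function name and the `FlakySource` stand-in (used here in place of a real failing drive) are all hypothetical, and the actual tooling used for the recovery is not stated in this thread.

```python
import io


def rescue_copy(src, dst, total_sectors, sector_size=512):
    """Copy readable sectors from src to dst; return the list of bad sectors."""
    bad = []
    sector = 0
    while sector < total_sectors:
        try:
            data = src.read(sector_size)        # plain sequential read
        except OSError:
            bad.append(sector)
            sector += 1
            src.seek(sector * sector_size)      # explicit seek past the bad sector
            dst.seek(sector * sector_size)      # leave a gap for it in the copy
            continue
        if not data:
            break                               # end of device
        dst.write(data)
        sector += 1
    return bad


class FlakySource:
    """Hypothetical stand-in for a drive with unreadable sectors."""

    def __init__(self, data, bad_sectors, sector_size=512):
        self.data = data
        self.bad = set(bad_sectors)
        self.size = sector_size
        self.pos = 0

    def seek(self, offset):
        self.pos = offset

    def read(self, n):
        if self.pos // self.size in self.bad:
            raise OSError("unreadable sector")
        chunk = self.data[self.pos:self.pos + n]
        self.pos += len(chunk)
        return chunk


image = bytes(range(256)) * 8                   # four 512-byte sectors
copy = io.BytesIO()
print(rescue_copy(FlakySource(image, bad_sectors=[2]), copy, total_sectors=4))
```

The point of the explicit seek is that the drive sees a fresh operation after each error rather than a continued sequential read into the damaged area.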
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: ide drive dying?
2002-09-06 15:26 ` Mike Dresser
2002-09-06 15:39 ` Alan Cox
2002-09-06 15:44 ` Richard B. Johnson
@ 2002-09-06 17:28 ` Daniel Egger
2 siblings, 0 replies; 63+ messages in thread
From: Daniel Egger @ 2002-09-06 17:28 UTC (permalink / raw)
To: Mike Dresser; +Cc: linux-kernel
[-- Attachment #1: Type: text/plain, Size: 785 bytes --]
On Fri, 2002-09-06 at 17:26, Mike Dresser wrote:
> Make backups immediately. Run ibm's DFT tool, get the code to RMA this
> thing back to IBM. Sell the replacement they send you to a sucker on
> eBay, and buy yourself a new drive. You can pick up 80 gig drives for
> around 80 bucks nowadays. I used to recommend Maxtors, until they said
> they're cutting their warranty to one year from three. I don't know what
> to use anymore.
I did exactly this and bought a 80gig Maxtor for EUR 100 (don't know why
it would be so much cheaper at your place, but anyway). Unfortunately
the drive was broken right away, let's see how long the replacement
drive keeps running...
Seems like every major brand is just producing crap nowadays....
--
Servus,
Daniel
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: ide drive dying?
2002-09-06 15:13 DevilKin
2002-09-06 15:26 ` Mike Dresser
@ 2002-09-06 15:36 ` jbradford
2002-09-06 15:55 ` DevilKin
2002-09-06 15:37 ` mbs
` (4 subsequent siblings)
6 siblings, 1 reply; 63+ messages in thread
From: jbradford @ 2002-09-06 15:36 UTC (permalink / raw)
To: DevilKin; +Cc: linux-kernel
> I've looked up these errors on the net, and as far as i can tell it means
> that the drive has some bad sectors at the given addresses and that it will
> probably die on me sooner or later.
>
> Can someone either confirm this to me or tell me what to do to fix it?
>
> The drive involved is an IBM-DTLA-307060, which has served me without
> problems now for about 2 years.
Have a look at:
http://csl.cse.ucsc.edu/smart.shtml
there you will find software for interrogating and monitoring the S.M.A.R.T. data available from your drive. It's a little late to start monitoring it, if the drive is already dying, but if, for example, it shows a lot of re-allocated sectors, or spin retries, you'll know something is wrong.
John.
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: ide drive dying?
2002-09-06 15:36 ` jbradford
@ 2002-09-06 15:55 ` DevilKin
2002-09-06 17:22 ` jbradford
2002-09-07 7:08 ` Andre Hedrick
0 siblings, 2 replies; 63+ messages in thread
From: DevilKin @ 2002-09-06 15:55 UTC (permalink / raw)
To: jbradford; +Cc: Linux Kernel Mailing List
On Friday 06 September 2002 17:36, jbradford@dial.pipex.com wrote:
> > I've looked up these errors on the net, and as far as i can tell it means
> > that the drive has some bad sectors at the given addresses and that it
> > will probably die on me sooner or later.
> >
> > Can someone either confirm this to me or tell me what to do to fix it?
> >
> > The drive involved is an IBM-DTLA-307060, which has served me without
> > problems now for about 2 years.
>
> Have a look at:
>
> http://csl.cse.ucsc.edu/smart.shtml
>
> there you will find software for interrogating and monitoring the
> S.M.A.R.T. data available from your drive. It's a little late to start
> monitoring it, if the drive is already dying, but if, for example, it shows
> a lot of re-allocated sectors, or spin retries, you'll know something is
> wrong.
>
OK, I downloaded that and installed it, but well, frankly, it shows me very
little useful stuff.
Or i'm just not good at interpreting this.
DK
--
"I gained nothing at all from Supreme Enlightenment, and for that very
reason it is called Supreme Enlightenment."
-- Gotama Buddha
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: ide drive dying?
2002-09-06 15:55 ` DevilKin
@ 2002-09-06 17:22 ` jbradford
2002-09-06 19:22 ` DevilKin
2002-09-07 7:08 ` Andre Hedrick
1 sibling, 1 reply; 63+ messages in thread
From: jbradford @ 2002-09-06 17:22 UTC (permalink / raw)
To: DevilKin; +Cc: linux-kernel
> OK, I downloaded that and installed it, but well, frankly, it shows me very
> little useful stuff.
>
> Or i'm just not good at interpreting this.
Post the output of smartctl -a /dev/hda? to me, and I'll tell you what I can, but it's best to monitor the stats from when the drive is new, (I.E. every drive you buy from now on :-) ).
John.
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: ide drive dying?
2002-09-06 17:22 ` jbradford
@ 2002-09-06 19:22 ` DevilKin
2002-09-07 9:30 ` Anders Fugmann
2002-09-07 12:31 ` Daniel Egger
0 siblings, 2 replies; 63+ messages in thread
From: DevilKin @ 2002-09-06 19:22 UTC (permalink / raw)
To: jbradford; +Cc: Linux Kernel Mailing List
On Friday 06 September 2002 19:22, jbradford@dial.pipex.com wrote:
> > OK, I downloaded that and installed it, but well, frankly, it shows me
> > very little useful stuff.
> >
> > Or i'm just not good at interpreting this.
>
> Post the output of smartctl -a /dev/hda? to me, and I'll tell you what I
> can, but it's best to monitor the stats from when the drive is new, (I.E.
> every drive you buy from now on :-) ).
>
Well, there were 21 ATA errors, and it showed 5 error blocks, with disk 'live'
times of 629 hours.
Luckily I've been able to back up everything from the disk, and I'm running the
DFT now. The tests showed bad sectors; I'm currently running a disk erase.
DK
--
"What's that thing?"
"Well, it's a highly technical, sensitive instrument we use in
computer repair. Being a layman, you probably can't grasp exactly what
it does. We call it a two-by-four."
-- Jeff MacNelley, "Shoe"
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: ide drive dying?
2002-09-06 19:22 ` DevilKin
@ 2002-09-07 9:30 ` Anders Fugmann
2002-09-07 9:37 ` Udo A. Steinberg
2002-09-07 15:54 ` Holger Lubitz
2002-09-07 12:31 ` Daniel Egger
1 sibling, 2 replies; 63+ messages in thread
From: Anders Fugmann @ 2002-09-07 9:30 UTC (permalink / raw)
To: DevilKin; +Cc: Linux Kernel Mailing List
DevilKin wrote:
> Luckily I've been able to backup everything from the disk, and I'm running the
> DFT now. The tests showed bad sectors, i'm currently running a disk erase.
I have had success in firmware-upgrading these drives, after which all
problems were gone forever.
You can download the firmware programs from
http://anders.fugmann.dhs.org/ibm. There are upgrades for both the 75GXP
and the 60GXP, or you could contact IBM for the firmware upgrade - they
are not available on the IBM site. The programs are Windows thingies,
which create a floppy to be booted.
Regards
Anders Fugmann
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: ide drive dying?
2002-09-07 9:30 ` Anders Fugmann
@ 2002-09-07 9:37 ` Udo A. Steinberg
2002-09-07 15:54 ` Holger Lubitz
1 sibling, 0 replies; 63+ messages in thread
From: Udo A. Steinberg @ 2002-09-07 9:37 UTC (permalink / raw)
To: Linux-Kernel Mailing List
[-- Attachment #1: Type: text/plain, Size: 498 bytes --]
On Sat, 07 Sep 2002 11:30:01 +0200 Anders Fugmann (AF) wrote:
AF> You can download the firmware programs from
AF> http://anders.fugmann.dhs.org/ibm. There are both upgrade for 75GXP and
AF> 60GXP, or you could contact IBM for the firmware upgrade - They are not
AF> available on the ibm site. The programs are Windows thingies, which
AF> creates a floppy to be booted.
They are on the IBM site, but a bit hard to find:
http://www-1.ibm.com/support/docview.wss?rs=0&uid=psg1MIGR-39082
-Udo.
[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: ide drive dying?
2002-09-07 9:30 ` Anders Fugmann
2002-09-07 9:37 ` Udo A. Steinberg
@ 2002-09-07 15:54 ` Holger Lubitz
2002-09-07 16:31 ` jbradford
1 sibling, 1 reply; 63+ messages in thread
From: Holger Lubitz @ 2002-09-07 15:54 UTC (permalink / raw)
To: linux-kernel
Anders Fugmann wrote:
> I have had success in firmware-upgrading these drives, after which all
> problems were gone forever.
Which firmware version do your drives show? I ran the firmware upgrade
on my two DTLA half a year ago, and ended up with this:
Model=IBM-DTLA-307045, FwRev=TX6OA59A
Model=IBM-DTLA-305040, FwRev=TW4OA69A
(from hdparm -i output - the former 0A changed to 9A after the upgrade,
rest stayed the same)
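For anyone comparing firmware revisions across several drives, the two fields above can be pulled out of saved `hdparm -i` output with a small script. This is a sketch that assumes the `Model=..., FwRev=...` layout shown in this message; the function name is made up for the example.

```python
import re

# Matches the "Model=..., FwRev=..." pair as it appears in the lines above.
IDENT_RE = re.compile(r"Model=(?P<model>[^,]+),\s*FwRev=(?P<fwrev>[^,\s]+)")


def parse_ident(text):
    """Return a list of (model, firmware_revision) pairs found in text."""
    return [
        (m.group("model").strip(), m.group("fwrev"))
        for m in IDENT_RE.finditer(text)
    ]


sample = """\
Model=IBM-DTLA-307045, FwRev=TX6OA59A
Model=IBM-DTLA-305040, FwRev=TW4OA69A
"""
for model, fwrev in parse_ident(sample):
    print(model, fwrev)
```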
Both work fine (they never failed me before the upgrade either).
However, at least the second drive still clicks often enough for me to
notice. I am still worried, though smartsuite says I'm fine - if I read
the output correctly.
It seems to click only when doing lots of write requests for extended
periods of time (like unbatching and spooling several megabytes of news
- one or two usually don't trigger it, larger batches do).
I wonder if it would be possible for the driver to monitor SMART and
lighten the load on the drive when things don't seem normal.
What is normal, anyway? For example, my Seagate Barracuda IV shows
continually increasing raw values for "Raw Read Error Rate", "Seek Error
Rate" and "Hardware ECC Recovered". It works fine, though. The older U5
I still have running has a high but pretty constant raw value for the
first, a slower rate of increase for the second and doesn't show the
third.
I don't really believe the 310617 power on hours my Maxtor (the old 60
gig with 4 platters) claims, either.
Holger
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: ide drive dying?
2002-09-07 15:54 ` Holger Lubitz
@ 2002-09-07 16:31 ` jbradford
0 siblings, 0 replies; 63+ messages in thread
From: jbradford @ 2002-09-07 16:31 UTC (permalink / raw)
To: Holger Lubitz; +Cc: linux-kernel
> I wonder if it would be possible for the driver to monitor SMART and
> lighten the load on the drive when things don't seem normal.
I think it would be fun to have SMART monitoring in the driver, but I'm not sure it's worth the bloat. It *can* be done in userspace, after all.
> What is normal, anyway?
Not sure what 'normal' is, but the manufacturer defines thresholds, which are to be interpreted as 'drive is failing' if they are exceeded.
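A userspace check against those thresholds could be sketched as below. It relies on my understanding of the ATA convention, which is that normalised attribute values count down and an attribute is treated as failed once its value is at or below the manufacturer's threshold. The attribute labels and readings are made up for illustration and are not from any drive in this thread.

```python
def failing_attributes(attributes):
    """Return names of attributes whose normalised value has reached its threshold.

    attributes maps name -> (normalised_value, threshold).
    """
    return sorted(
        name
        for name, (value, threshold) in attributes.items()
        if value <= threshold
    )


# Made-up readings, for illustration only.
sample = {
    "Reallocated_Sector_Ct": (4, 5),    # at or below threshold: flagged
    "Spin_Retry_Count": (100, 60),      # comfortably above threshold
    "Power_On_Hours": (98, 0),          # threshold 0: never flagged
}
print(failing_attributes(sample))
```

Raw values, such as the ones that keep increasing on the Seagate, are vendor-specific, which is why the comparison uses the normalised value instead.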
> I don't really believe the 310617 power on hours my Maxtor (the old 60
> gig with 4 platters) claims, either.
That's because it's reporting power on time in minutes :-)
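As a quick sanity check on that reading: 310617 minutes is about 5177 hours, or roughly 216 days of power-on time, whereas 310617 hours would be more than 35 years. The helper below is just that arithmetic; its name is made up for the example.

```python
def minutes_to_hours(minutes):
    """Convert a power-on time reported in minutes to hours."""
    return minutes / 60


raw = 310617
hours = minutes_to_hours(raw)
print(f"{raw} minutes = {hours:.1f} hours = {hours / 24:.1f} days")
```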
John.
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: ide drive dying?
2002-09-06 19:22 ` DevilKin
2002-09-07 9:30 ` Anders Fugmann
@ 2002-09-07 12:31 ` Daniel Egger
2002-09-07 13:08 ` jbradford
1 sibling, 1 reply; 63+ messages in thread
From: Daniel Egger @ 2002-09-07 12:31 UTC (permalink / raw)
To: DevilKin; +Cc: jbradford, Linux Kernel Mailing List
[-- Attachment #1: Type: text/plain, Size: 683 bytes --]
On Fri, 2002-09-06 at 21:22, DevilKin wrote:
> Well, there were 21 ATA errors, and it showed 5 error blocks, with disk 'live'
> times of 629 hours.
No wonder it ran for 2 years. Are you using this machine frequently at
all? :)
> The tests showed bad sectors, i'm currently running a disk erase.
This is exactly the mistake I've been meaning to warn you of.
The disk will corrupt sooner or later again and you'll have to go
through all the torture (possible backup/restore, missing data) again
and if you're unlucky (which is quite possible with your frequency of
use) the warranty will have expired by the time the problems appear again.
--
Servus,
Daniel
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: ide drive dying?
2002-09-07 12:31 ` Daniel Egger
@ 2002-09-07 13:08 ` jbradford
2002-09-07 13:50 ` Daniel Egger
0 siblings, 1 reply; 63+ messages in thread
From: jbradford @ 2002-09-07 13:08 UTC (permalink / raw)
To: Daniel Egger; +Cc: linux-kernel
> > The tests showed bad sectors, i'm currently running a disk erase.
>
> This is exactly the mistake I've been meaning to warn you of.
> The disk will corrupt sooner or later again and you'll have to go
> through all the torture (possible backup/restore, missing data) again
> and if you're unlucky (which is quite possible with your frequency of
> use) the warranty is void until the problems appear the next time.
There are two separate issues here, though:
* Buggy firmware
* Unreliable media
We have confirmed, (I believe), that the drive did have the buggy firmware. We do not know yet whether the media is defective or not, but we do know that the drives are not the best in the world.
Alan also confirmed that the errors were direct from the device, and so it is not a kernel bug.
However, I raise the question of whether the new kernel version caused different access patterns to the device, and showed up the firmware bug that was there all the time. Or maybe the compilation of the new kernel thrashed the disk and showed up the firmware bug. If the machine has been on for some time, (months), doing not very much, maybe a lot of disk data was cached in RAM, and the kernel compile caused it to be re-read from disk, showing up media defects.
I was hoping that he would actually post the output of:
smartctl -a /dev/hda?
because that tells you all sorts of things, like, for example, reallocated sector count, and calibration retry count.
Obviously, it is not a good idea to use the drive for anything important until it has been tested in a non-critical application first.
Besides, you *do* backup, don't you? (Or do what Linus suggested a while ago, and upload your stuff to an ftp site that is mirrored worldwide.)
I don't see the point of returning a disk that turns out not to be faulty after the firmware upgrade, for replacement under the warranty, even if it qualifies for a warranty replacement, (which it shouldn't do), because you might be exchanging a good disk for a bad disk.
John.
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: ide drive dying?
2002-09-07 13:08 ` jbradford
@ 2002-09-07 13:50 ` Daniel Egger
2002-09-07 15:02 ` jbradford
0 siblings, 1 reply; 63+ messages in thread
From: Daniel Egger @ 2002-09-07 13:50 UTC (permalink / raw)
To: jbradford; +Cc: linux-kernel
[-- Attachment #1: Type: text/plain, Size: 1293 bytes --]
On Sat, 2002-09-07 at 15:08, jbradford@dial.pipex.com wrote:
> Besides, you *do* backup, don't you?
I do but besides that there is still data loss involved and my time is
expensive and limited, so I'd rather go for a hasslefree solution than
to poke around in mud with a stick in the hope it might clear up.
> (Or do what Linus suggested a while ago, and upload your stuff to an
> ftp site that is mirrored worldwide.)
Very practicable advice.
> I don't see the point of returning a disk that turns out not to be
> faulty after the firmware upgrade,
The point is that until you know whether it really was the firmware,
you've spent so much time that it is much easier to return the drive.
> even if it qualifies for a warranty replacement, (which it shouldn't do)
A faulty drive is a faulty drive and thus qualifies for a
free replacement (at least in Germany). Nobody here can force
you to try several costly things which might solve the problem;
it is rather the manufacturer's duty to fix it at their cost.
> because you might be exchanging a good disk for a bad disk.
Very doubtful considering past experience. Also it's not very
probable (though it has happened) to receive a disk which is
more broken than broken.
--
Servus,
Daniel
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: ide drive dying?
2002-09-07 13:50 ` Daniel Egger
@ 2002-09-07 15:02 ` jbradford
2002-09-07 20:19 ` Daniel Egger
0 siblings, 1 reply; 63+ messages in thread
From: jbradford @ 2002-09-07 15:02 UTC (permalink / raw)
To: Daniel Egger; +Cc: linux-kernel
> > Besides, you *do* backup, don't you?
>
> I do but besides that there is still data loss involved and my time is
> expensive and limited, so I'd rather go for a hasslefree solution than
> to poke around in mud with a stick in the hope it might clear up.
Fair enough, if you don't have the time to devote to it, it's best to replace the drive.
I assumed from the size of this thread, which has nothing to do with the kernel anymore, that we were trying to find out what was to blame.
If this is going to become a flamewar, please remove the cc: to the kernel list, as I doubt that it interests them.
> > (Or do what Linus suggested a while ago, and upload your stuff to an
> > ftp site that is mirrored worldwide.)
>
> Very practicable advice.
Whatever - it was a joke.
The reason I brought up backups, was because even if you have a RAID array, of high quality drives, with non-sequential serial numbers, on hot-pluggable interfaces, with known good firmware, you can still get silent data corruption.
Fact - *NO* SLED, or RAID array, can ever be guaranteed never to silently flip a bit.
> > I don't see the point of returning a disk that turns out not to be
> > faulty after the firmware upgrade,
>
> The point is that until you know whether it really was the firmware,
> you've spent so much time that it is much easier to return the drive.
And the chances are you will get another drive of the same model, back from IBM. How does that help?
I already pointed out that there are two known issues here with these drives - firmware bugs, and media defects.
So far, all we can say is that the firmware problem is now fixed. On a replacement drive, you can't even say that.
The 'media errors' could have been caused entirely by the buggy firmware.
> > even if it qualifies for a warranty replacement, (which it shouldn't do)
>
> A faulty drive is a faulty drive and thus qualifies for a
> free replacement (at least in Germany). Nobody here can force
> you to try several costly things which might solve the problem;
> it is rather the manufacturer's duty to fix it at their cost.
No, but you've upgraded the firmware, right? If that has fixed the problem, then it is not a faulty drive. If it is not a faulty drive, then what is the point in sending it back? If it is not a faulty drive, IBM would be justified in sending it right back to you at your expense. Oh, and it might get damaged in transit.
> > because you might be exchanging a good disk for a bad disk.
>
> Very doubtful considering past experience. Also it's not very
> probable (though it has happened) to receive a disk which is
> more broken than broken.
No, I would say it is very possible that you could receive a disk with the old firmware on it. So, you'll just plug in your 'new' disk, and in a few months, bad sectors will start appearing.
John.
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: ide drive dying?
2002-09-07 15:02 ` jbradford
@ 2002-09-07 20:19 ` Daniel Egger
2002-09-07 20:41 ` jbradford
` (2 more replies)
0 siblings, 3 replies; 63+ messages in thread
From: Daniel Egger @ 2002-09-07 20:19 UTC (permalink / raw)
To: jbradford; +Cc: linux-kernel
[-- Attachment #1: Type: text/plain, Size: 757 bytes --]
On Sat, 2002-09-07 at 17:02, jbradford@dial.pipex.com wrote:
> No, but you've upgraded the firmware, right?
Not exactly. According to IBM technical support there is no such thing
as a new firmware. The drives are alright, the OS is broken.
> If that has fixed the problem, then it is not a faulty drive.
Right, and how would you notice without sacrificing more data?
> So, you'll just plug in your 'new' disk, and in a few months,
> bad sectors will start appearing.
Not if you sold it at Ebay, which is what I did with all *new*
drives I received from IBM. I just kept the "serviceable used part"
one in case I need to install Windows to upgrade the firmware of
some drive or anything else in range.
--
Servus,
Daniel
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: ide drive dying?
2002-09-07 20:19 ` Daniel Egger
@ 2002-09-07 20:41 ` jbradford
2002-09-07 21:41 ` Alan Cox
2002-09-07 21:41 ` Alan Cox
2002-09-07 22:05 ` Andre Hedrick
2 siblings, 1 reply; 63+ messages in thread
From: jbradford @ 2002-09-07 20:41 UTC (permalink / raw)
To: Daniel Egger; +Cc: linux-kernel
This discussion is becoming stupid, but here we go:
> > No, but you've upgraded the firmware, right?
>
> Not exactly.
??? Either you did or didn't.
> According to IBM technical support there is no such thing
> as a new firmware. The drives are alright, the OS is broken.
Right, so you're calling Alan Cox a liar, then? I know who I believe.
> > If that has fixed the problem, then it is not a faulty drive.
> Right, and how would you notice without sacrificing more data?
smartctl -X /dev/hda?
'Execute Extended Self Test' might be a good start
or you could just copy data to/from it, generally hammer it and spin it up, down, and sideways, generally try to make it go wrong, and if your data is intact, then I would trust it more than a disk that arrived in a jiffy bag, with an assurance that 'this one works'.
> > So, you'll just plug in your 'new' disk, and in a few months,
> > bad sectors will start appearing.
>
> Not if you sold it on eBay,
The bad sectors are just as likely to appear, but somebody else's data will be lost. Very nice gesture, not to mention that you probably violate the eBay T&C by selling a product that you suspect is faulty.
> which is what I did with all *new* drives I received from IBM.
Well, I won't buy a second hand drive from you then :-).
> I just kept the "serviceable used part" one in case I need to install
> Windows to upgrade the firmware of some drive or anything else in range.
Fine, if that's what floats your boat.
In fact, I was completely wrong, OK? You were right all along, so there is no need to continue this pointless thread.
John.
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: ide drive dying?
2002-09-07 20:41 ` jbradford
@ 2002-09-07 21:41 ` Alan Cox
0 siblings, 0 replies; 63+ messages in thread
From: Alan Cox @ 2002-09-07 21:41 UTC (permalink / raw)
To: jbradford; +Cc: Daniel Egger, linux-kernel
On Sat, 2002-09-07 at 21:41, jbradford@dial.pipex.com wrote:
> > According to IBM technical support there is no such thing
> > as a new firmware. The drives are alright, the OS is broken.
>
> Right, so you're calling Alan Cox a liar, then? I know who I believe.
Hardly. He said IBM tech support told him one thing, and they told me
another. Give it a rest
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: ide drive dying?
2002-09-07 20:19 ` Daniel Egger
2002-09-07 20:41 ` jbradford
@ 2002-09-07 21:41 ` Alan Cox
2002-09-07 22:00 ` jbradford
2002-09-07 22:05 ` Andre Hedrick
2 siblings, 1 reply; 63+ messages in thread
From: Alan Cox @ 2002-09-07 21:41 UTC (permalink / raw)
To: Daniel Egger; +Cc: jbradford, linux-kernel
On Sat, 2002-09-07 at 21:19, Daniel Egger wrote:
> On Sat, 2002-09-07 at 17:02, jbradford@dial.pipex.com wrote:
>
> > No, but you've upgraded the firmware, right?
>
> Not exactly. According to IBM technical support there is no such thing
> as a new firmware. The drives are alright, the OS is broken.
The IBM technical support I dealt with not only confirmed there was new
firmware, the tools updated it and said they had 8)
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: ide drive dying?
2002-09-07 21:41 ` Alan Cox
@ 2002-09-07 22:00 ` jbradford
2002-09-07 23:19 ` David Forrest
0 siblings, 1 reply; 63+ messages in thread
From: jbradford @ 2002-09-07 22:00 UTC (permalink / raw)
To: Alan Cox; +Cc: degger, linux-kernel
> On Sat, 2002-09-07 at 21:19, Daniel Egger wrote:
> > On Sat, 2002-09-07 at 17:02, jbradford@dial.pipex.com wrote:
> >
> > > No, but you've upgraded the firmware, right?
> >
> > Not exactly. According to IBM technical support there is no such thing
> > as a new firmware. The drives are alright, the OS is broken.
>
> The IBM technical support I dealt with not only confirmed there was new
> firmware, the tools updated it and said they had 8)
Here is the URL:
http://www-1.ibm.com/support/docview.wss?uid=psg1MIGR-39082
it expressly states that the firmware is intended for the DTLA-307060.
The page mentions that it enhances stability and SMART data collection.
John.
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: ide drive dying?
2002-09-07 22:00 ` jbradford
@ 2002-09-07 23:19 ` David Forrest
2002-09-08 10:56 ` Henning P. Schmiedehausen
0 siblings, 1 reply; 63+ messages in thread
From: David Forrest @ 2002-09-07 23:19 UTC (permalink / raw)
To: jbradford; +Cc: Alan Cox, degger, linux-kernel
On Sat, 7 Sep 2002 jbradford@dial.pipex.com wrote:
...
> Here is the URL:
>
> http://www-1.ibm.com/support/docview.wss?uid=psg1MIGR-39082
>
> it expressly states that the firmware is intended for the DTLA-307060.
The firmware update is for many more drives than that. My own
Model=IBM-DTLA-305040, FwRev=TW4OA60A
is also recommended, as well as many with a FwRev=xxxOyzzz with zzz<66A.
Now I have to find a Windows machine to try it out on...
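As a rough illustration of the rule above (a sketch only, not IBM's own
check): the Python below splits a FwRev string into the xxxOyzzz fields and
flags it when zzz sorts below 66A. The function name, the regular expression
and the plain string comparison of zzz are assumptions made for this example;
the authoritative list of affected revisions is the IBM page quoted above.

```python
import re

# Layout taken from the message: three characters, a literal "O",
# one character, then a three-character revision suffix (e.g. "60A").
# The exact meaning of each field is NOT documented in this thread.
FWREV_RE = re.compile(r"^(?P<prefix>.{3})O(?P<variant>.)(?P<rev>.{3})$")

# Revisions below this suffix are the ones said to be covered by the update.
FIXED_REV = "66A"


def needs_firmware_update(fwrev):
    """Return True/False for a FwRev in xxxOyzzz form, None if it does not match.

    Assumption: comparing the three-character suffix as a plain string
    reproduces the "zzz<66A" rule from the message.
    """
    match = FWREV_RE.match(fwrev.strip())
    if match is None:
        return None  # not in the xxxOyzzz form, so the rule says nothing
    return match.group("rev") < FIXED_REV
```

With the string from hdparm-style identify output, needs_firmware_update("TW4OA60A")
reports that the drive falls under the rule, since "60A" sorts below "66A".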
Dave,
--
Dave Forrest drf5n@virginia.edu
(804)642-0662h (434)924-3954w http://mug.sys.virginia.edu/~drf5n/
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: ide drive dying?
2002-09-07 23:19 ` David Forrest
@ 2002-09-08 10:56 ` Henning P. Schmiedehausen
2002-09-08 14:14 ` jbradford
0 siblings, 1 reply; 63+ messages in thread
From: Henning P. Schmiedehausen @ 2002-09-08 10:56 UTC (permalink / raw)
To: linux-kernel
David Forrest <drf5n@mug.sys.virginia.edu> writes:
>On Sat, 7 Sep 2002 jbradford@dial.pipex.com wrote:
>...
>> Here is the URL:
>>
>> http://www-1.ibm.com/support/docview.wss?uid=psg1MIGR-39082
>>
>> it expressly states that the firmware is intended for the DTLA-307060.
>The firmware update is for many more drives than that. My own
> Model=IBM-DTLA-305040, FwRev=TW4OA60A
>is also recommended, as well as many with a FwRev=xxxOyzzz with zzz<66A.
>Now I have to find a Windows machine to try it out on...
You don't need to. All you need is someone to run this tool and send you
the image it creates. I put mine as boot.img on a CD so I can upgrade
all the disks I have in boxes without floppy disk drives. It's a self
booting DOS disk.
Regards
Henning
--
Dipl.-Inf. (Univ.) Henning P. Schmiedehausen -- Geschaeftsfuehrer
INTERMETA - Gesellschaft fuer Mehrwertdienste mbH hps@intermeta.de
Am Schwabachgrund 22 Fon.: 09131 / 50654-0 info@intermeta.de
D-91054 Buckenhof Fax.: 09131 / 50654-20
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: ide drive dying?
2002-09-08 10:56 ` Henning P. Schmiedehausen
@ 2002-09-08 14:14 ` jbradford
2002-09-09 21:59 ` Alan Cox
0 siblings, 1 reply; 63+ messages in thread
From: jbradford @ 2002-09-08 14:14 UTC (permalink / raw)
To: linux-kernel
> >The firmware update is for many more drives than that. My own
>
> > Model=IBM-DTLA-305040, FwRev=TW4OA60A
>
> >is also recommended, as well as many with a FwRev=xxxOyzzz with zzz<66A.
> >Now I have to find a Windows machine to try it out on...
>
> You don't need to. All you need is someone to run this tool and send you
> the image it creates. I put mine as boot.img on a CD so I can upgrade
> all the disks I have in boxes without floppy disk drives. It's a self
> booting DOS disk.
As the old firmware is known to be buggy, and those bugs are relevant when using Linux, and updated firmware is available, is it worth checking for the known buggy firmware version in the ide driver?
I realise that we cannot check every drive in the world for compatibility, but if this is a known issue...
John.
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: ide drive dying?
2002-09-07 20:19 ` Daniel Egger
2002-09-07 20:41 ` jbradford
2002-09-07 21:41 ` Alan Cox
@ 2002-09-07 22:05 ` Andre Hedrick
2 siblings, 0 replies; 63+ messages in thread
From: Andre Hedrick @ 2002-09-07 22:05 UTC (permalink / raw)
To: Daniel Egger; +Cc: jbradford, linux-kernel
On 7 Sep 2002, Daniel Egger wrote:
> On Sat, 2002-09-07 at 17:02, jbradford@dial.pipex.com wrote:
>
> > No, but you've upgraded the firmware, right?
>
> Not exactly. According to IBM technical support there is no such thing
> as a new firmware. The drives are alright, the OS is broken.
They are full of CRAP!
IBM ran TASKFILE IO through their bus analyzers and it came up clean.
IBM also introduced FLAGGED versions of the diagnostic TASKFILE transport
for eventual use of their DFT (Drive Fitness Test).
You tell the service tech he is smoking crack.
The kernel passed with flying colors in their disk labs. If you read
in ide-taskfile.c version 0.33 and above, you will see they did some work
on the driver and verified issues.
Earlier I published a method for stabilizing the drive once you have
backed up all the data you can off of it. Since I do not yet have a source
version of DFT-Linux, or a binary, I cannot offer much more natively.
Cheers,
Andre Hedrick
LAD Storage Consulting Group
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: ide drive dying?
2002-09-06 15:55 ` DevilKin
2002-09-06 17:22 ` jbradford
@ 2002-09-07 7:08 ` Andre Hedrick
1 sibling, 0 replies; 63+ messages in thread
From: Andre Hedrick @ 2002-09-07 7:08 UTC (permalink / raw)
To: DevilKin; +Cc: jbradford, Linux Kernel Mailing List
Send me the results offline
On Fri, 6 Sep 2002, DevilKin wrote:
> On Friday 06 September 2002 17:36, jbradford@dial.pipex.com wrote:
> > > I've looked up these errors on the net, and as far as i can tell it means
> > > that the drive has some bad sectors at the given addresses and that it
> > > will probably die on me sooner or later.
> > >
> > > Can someone either confirm this to me or tell me what to do to fix it?
> > >
> > > The drive involved is an IBM-DTLA-307060, which has served me without
> > > problems now for about 2 years.
> >
> > Have a look at:
> >
> > http://csl.cse.ucsc.edu/smart.shtml
> >
> > there you will find software for interrogating and monitoring the
> > S.M.A.R.T. data available from your drive. It's a little late to start
> > monitoring it, if the drive is already dying, but if, for example, it shows
> > a lot of re-allocated sectors, or spin retries, you'll know something is
> > wrong.
> >
>
> OK, I downloaded that and installed it, but well, frankly, it shows me very
> little useful stuff.
>
> Or I'm just not good at interpreting this.
>
> DK
>
> --
> "I gained nothing at all from Supreme Enlightenment, and for that very
> reason it is called Supreme Enlightenment."
> -- Gotama Buddha
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
Andre Hedrick
LAD Storage Consulting Group
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: ide drive dying?
2002-09-06 15:13 DevilKin
2002-09-06 15:26 ` Mike Dresser
2002-09-06 15:36 ` jbradford
@ 2002-09-06 15:37 ` mbs
2002-09-06 15:54 ` Alan Cox
2002-09-06 15:38 ` Alan Cox
` (3 subsequent siblings)
6 siblings, 1 reply; 63+ messages in thread
From: mbs @ 2002-09-06 15:37 UTC (permalink / raw)
To: DevilKin, linux-kernel
same problem I was having with 2.4.20-pre4-ac2-preempt.
alan didn't want to hear it from me due to the -preempt
my system was e7500 chipset, dual xeon, WD 40g drive, ext2 or ext3.
from this we can glean: preempt not a factor, HD manufacturer not a factor,
FS not a factor. don't know what chipset you are using.
I was also getting BadCRC errors.
On Friday 06 September 2002 11:13, DevilKin wrote:
> Hello kernel people,
>
> Kernel running: 2.4.20-pre1ac3 or -pre5ac2 (same under both)
>
> Today I discovered a stale copy of qt-3.0.3 lying about on my disk. When I
> tried to delete it, this started showing up in my log files:
>
> hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
> hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=7072862,
> sector=1803472
> end_request: I/O error, dev 03:06 (hda), sector 1803472
> vs-13070: reiserfs_read_inode2: i/o failure occurred trying to find stat
> data of [612671 612672 0x0 SD]
> hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
> hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=7072862,
> sector=1803472
> end_request: I/O error, dev 03:06 (hda), sector 1803472
> vs-13070: reiserfs_read_inode2: i/o failure occurred trying to find stat
> data of [612671 612677 0x0 SD]
>
> and rm just reported me 'Permission denied'.
>
> I've looked up these errors on the net, and as far as i can tell it means
> that the drive has some bad sectors at the given addresses and that it will
> probably die on me sooner or later.
>
> Can someone either confirm this to me or tell me what to do to fix it?
>
> The drive involved is an IBM-DTLA-307060, which has served me without
> problems now for about 2 years.
>
> Thanks!
>
> DK
--
/**************************************************
** Mark Salisbury || mbs@mc.com **
** If you would like to sponsor me for the **
** Mass Getaway, a 150 mile bicycle ride to for **
** MS, contact me to donate by cash or check or **
** click the link below to donate by credit card **
**************************************************/
https://www.nationalmssociety.org/pledge/pledge.asp?participantid=86736
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: ide drive dying?
2002-09-06 15:13 DevilKin
` (2 preceding siblings ...)
2002-09-06 15:37 ` mbs
@ 2002-09-06 15:38 ` Alan Cox
2002-09-06 17:33 ` Daniel Egger
2002-09-06 15:44 ` mbs
` (2 subsequent siblings)
6 siblings, 1 reply; 63+ messages in thread
From: Alan Cox @ 2002-09-06 15:38 UTC (permalink / raw)
To: DevilKin; +Cc: linux-kernel
On Fri, 2002-09-06 at 16:13, DevilKin wrote:
> hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
> hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=7072862,
> sector=1803472
That certainly looks like a drive error.
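For reference, the bracketed names in those log lines are simply the set bits
of the ATA status and error registers. A minimal sketch of the decoding
follows. The bit names are, to the best of my knowledge, the ones the 2.4 IDE
driver prints, and the helper names are made up for this example. It also
shows the arithmetic relating LBAsect to sector, on the assumption that
sector is counted from the start of the partition (dev 03:06 is major 3,
minor 6, which should correspond to hda6).

```python
# ATA status register bits, highest bit first (names as printed in the log).
STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]

# ATA error register bits. Bit 0x80 is reported as BadCRC on DMA transfers.
ERROR_BITS = [
    (0x80, "BadSector/BadCRC"),
    (0x40, "UncorrectableError"),
    (0x10, "SectorIdNotFound"),
    (0x04, "DriveStatusError"),
    (0x02, "TrackZeroNotFound"),
    (0x01, "AddrMarkNotFound"),
]


def decode_bits(value, table):
    """Return the names of all bits from `table` that are set in `value`."""
    return [name for mask, name in table if value & mask]


def partition_start(lba_sect, sector):
    """Assumed relation: absolute LBA minus partition-relative sector."""
    return lba_sect - sector


status_names = decode_bits(0x51, STATUS_BITS)
error_names = decode_bits(0x40, ERROR_BITS)
start = partition_start(7072862, 1803472)
```

Here 0x51 is 0x40 + 0x10 + 0x01, which gives the three status names in the
log, and the same LBAsect appearing in both reports points at one unreadable
sector rather than a spread of failures.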
> The drive involved is an IBM-DTLA-307060, which has served me without problems
> now for about 2 years.
Get the IBM disk tools, upgrade the firmware and see what the ibm tools
have to say. IBM drives have had some problems with spontaneous bad
blocks appearing that go away with new firmware and a run of the disk
tools. More importantly, if that's the problem, with the firmware update
they don't come back until the drive really dies.
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: ide drive dying?
2002-09-06 15:38 ` Alan Cox
@ 2002-09-06 17:33 ` Daniel Egger
2002-09-06 20:31 ` Alan Cox
0 siblings, 1 reply; 63+ messages in thread
From: Daniel Egger @ 2002-09-06 17:33 UTC (permalink / raw)
To: Alan Cox; +Cc: linux-kernel
[-- Attachment #1: Type: text/plain, Size: 992 bytes --]
On Fri, 2002-09-06 at 17:38, Alan Cox wrote:
> Get the IBM disk tools, upgrade the firmware and see what the ibm tools
> have to say. IBM drives have had some problems with spontaneous bad
> blocks appearing that go away with new firmware and a run of the disk
> tools.
The "run of the disk tools" that does away with the bad blocks is a
low-level format; a tedious way to spend one's time on a hard drive
that will die soon anyway.
> More importantly, if that's the problem, with the firmware update
> they don't come back until the drive really dies.
Right, which is probably shortly after. Especially on a two years
old drive I wouldn't go through all the trouble to back up 60GB
of data, low-level format the drive, restore the data and hope the
problems are gone; instead I'd rather get a new drive within the
warranty and cross fingers.
BTW: I did the backup way exactly once and the drive got back to me
with new errors two weeks after.
--
Servus,
Daniel
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: ide drive dying?
2002-09-06 17:33 ` Daniel Egger
@ 2002-09-06 20:31 ` Alan Cox
0 siblings, 0 replies; 63+ messages in thread
From: Alan Cox @ 2002-09-06 20:31 UTC (permalink / raw)
To: Daniel Egger; +Cc: linux-kernel
On Fri, 2002-09-06 at 18:33, Daniel Egger wrote:
> On Fri, 2002-09-06 at 17:38, Alan Cox wrote:
>
> > Get the IBM disk tools, upgrade the firmware and see what the ibm tools
> > have to say. IBM drives have had some problems with spontaneous bad
> > blocks appearing that go away with new firmware and a run of the disk
> > tools.
>
> The "run of the disk tools" that does away with the bad blocks is a
> low-level format; a tedious way to spend one's time on a hard drive
> that will die soon anyway.
For the IBMs it depends what the problem is. Spontaneous bad blocks
appearing during power off appear to be fixed by the firmware update.
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: ide drive dying?
2002-09-06 15:13 DevilKin
` (3 preceding siblings ...)
2002-09-06 15:38 ` Alan Cox
@ 2002-09-06 15:44 ` mbs
2002-09-06 17:46 ` mbs
2002-09-07 7:02 ` Andre Hedrick
6 siblings, 0 replies; 63+ messages in thread
From: mbs @ 2002-09-06 15:44 UTC (permalink / raw)
To: DevilKin, linux-kernel
forgot to say: my drive worked fine with 2.4.19-pre3-ac5-preempt before the
move to the -20 kernel.
also worked fine after a fdisk/reinstall and continued to work fine till the
first time I booted on a (freshly built) -20-ac version.
I thought it was the drive so I replaced it with a brand new drive, and had
_EXACTLY_ the same failure pattern.
------
same problem I was having with 2.4.20-pre4-ac2-preempt.
alan didn't want to hear it from me due to the -preempt
my system was e7500 chipset, dual xeon, WD 40g drive, ext2 or ext3.
from this we can glean: preempt not a factor, HD manufacturer not a factor,
FS not a factor. don't know what chipset you are using.
I was also getting BadCRC errors.
On Friday 06 September 2002 11:13, DevilKin wrote:
> Hello kernel people,
>
> Kernel running: 2.4.20-pre1ac3 or -pre5ac2 (same under both)
>
> Today I discovered a stale copy of qt-3.0.3 lying about on my disk. When I
> tried to delete it, this started showing up in my log files:
>
> hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
> hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=7072862,
> sector=1803472
> end_request: I/O error, dev 03:06 (hda), sector 1803472
> vs-13070: reiserfs_read_inode2: i/o failure occurred trying to find stat
> data of [612671 612672 0x0 SD]
> hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
> hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=7072862,
> sector=1803472
> end_request: I/O error, dev 03:06 (hda), sector 1803472
> vs-13070: reiserfs_read_inode2: i/o failure occurred trying to find stat
> data of [612671 612677 0x0 SD]
>
> and rm just reported me 'Permission denied'.
>
> I've looked up these errors on the net, and as far as i can tell it means
> that the drive has some bad sectors at the given addresses and that it will
> probably die on me sooner or later.
>
> Can someone either confirm this to me or tell me what to do to fix it?
>
> The drive involved is an IBM-DTLA-307060, which has served me without
> problems now for about 2 years.
>
> Thanks!
>
> DK
--
/**************************************************
** Mark Salisbury || mbs@mc.com **
** If you would like to sponsor me for the **
** Mass Getaway, a 150 mile bicycle ride to for **
** MS, contact me to donate by cash or check or **
** click the link below to donate by credit card **
**************************************************/
https://www.nationalmssociety.org/pledge/pledge.asp?participantid=86736
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: ide drive dying?
2002-09-06 15:13 DevilKin
` (4 preceding siblings ...)
2002-09-06 15:44 ` mbs
@ 2002-09-06 17:46 ` mbs
2002-09-06 20:32 ` Alan Cox
2002-09-07 7:02 ` Andre Hedrick
6 siblings, 1 reply; 63+ messages in thread
From: mbs @ 2002-09-06 17:46 UTC (permalink / raw)
To: DevilKin, linux-kernel
fdisk/format and reinstall but stick with a 2.4.19 or 2.4.19-ac kernel.
I would bet money that the problem is purely a .20-preX-acX thing.
run it a while on 2.4.19 to verify that life is good. then build a new
2.4.20-pre1-ac3 and boot it. I bet that within minutes of normal use, you
will have a problem.
(I have done this loop 3 times.)
On Friday 06 September 2002 11:13, DevilKin wrote:
> Hello kernel people,
>
> Kernel running: 2.4.20-pre1ac3 or -pre5ac2 (same under both)
>
> Today I discovered a stale copy of qt-3.0.3 lying about on my disk. When I
> tried to delete it, this started showing up in my log files:
>
> hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
> hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=7072862,
> sector=1803472
> end_request: I/O error, dev 03:06 (hda), sector 1803472
> vs-13070: reiserfs_read_inode2: i/o failure occurred trying to find stat
> data of [612671 612672 0x0 SD]
> hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
> hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=7072862,
> sector=1803472
> end_request: I/O error, dev 03:06 (hda), sector 1803472
> vs-13070: reiserfs_read_inode2: i/o failure occurred trying to find stat
> data of [612671 612677 0x0 SD]
>
> and rm just reported me 'Permission denied'.
>
> I've looked up these errors on the net, and as far as i can tell it means
> that the drive has some bad sectors at the given addresses and that it will
> probably die on me sooner or later.
>
> Can someone either confirm this to me or tell me what to do to fix it?
>
> The drive involved is an IBM-DTLA-307060, which has served me without
> problems now for about 2 years.
>
> Thanks!
>
> DK
--
/**************************************************
** Mark Salisbury || mbs@mc.com **
** If you would like to sponsor me for the **
** Mass Getaway, a 150 mile bicycle ride to for **
** MS, contact me to donate by cash or check or **
** click the link below to donate by credit card **
**************************************************/
https://www.nationalmssociety.org/pledge/pledge.asp?participantid=86736
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: ide drive dying?
2002-09-06 15:13 DevilKin
` (5 preceding siblings ...)
2002-09-06 17:46 ` mbs
@ 2002-09-07 7:02 ` Andre Hedrick
2002-09-07 7:42 ` jbradford
6 siblings, 1 reply; 63+ messages in thread
From: Andre Hedrick @ 2002-09-07 7:02 UTC (permalink / raw)
To: DevilKin; +Cc: linux-kernel
First BACK up what is left.
Next dig out smartsuite from http://www.linux-ide.org/smart.html
Run it in full capture mode, please use another disk to run root, or the
system will tank.
Read and save smart logs.
cat /dev/zero > /dev/hd{IBM-DTLA-307060}x
Rerun Smart in full capture mode.
Reread smart logs and compare.
cat /dev/urandom > /dev/hd{IBM-DTLA-307060}x
If you get no errors you can reuse the drive, for how long? Maybe 6 months
to a year.
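The "compare" step in the procedure above can be sketched as a simple diff
of two attribute snapshots. This assumes the SMART attributes from each run
have already been parsed into a name-to-raw-value mapping; the smartsuite
output format is not shown in this thread, so the parsing is left out, and
the function name is hypothetical.

```python
def changed_attributes(before, after):
    """Return {name: (old, new)} for every attribute whose value differs.

    `before` and `after` are dicts mapping an attribute name to its raw
    value, taken before and after the disk was overwritten. An attribute
    present in only one snapshot is reported with None on the other side.
    """
    names = sorted(set(before) | set(after))
    return {
        name: (before.get(name), after.get(name))
        for name in names
        if before.get(name) != after.get(name)
    }
```

A growing reallocated-sector or spin-retry count between the two snapshots
would be the kind of change worth worrying about; an empty result means the
overwrite did not move any of the recorded values.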
Now, I can not tell you what, why, how things are going on.
Sheesh, I expect to be in a deep six for this series of events already.
Sorry, I can not say anymore.
If you do not like the above, you need to run out and buy another drive
fast.
Cheers,
On Fri, 6 Sep 2002, DevilKin wrote:
> Hello kernel people,
>
> Kernel running: 2.4.20-pre1ac3 or -pre5ac2 (same under both)
>
> Today I discovered a stale copy of qt-3.0.3 lying about on my disk. When I
> tried to delete it, this started showing up in my log files:
>
> hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
> hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=7072862,
> sector=1803472
> end_request: I/O error, dev 03:06 (hda), sector 1803472
> vs-13070: reiserfs_read_inode2: i/o failure occurred trying to find stat data
> of [612671 612672 0x0 SD]
> hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
> hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=7072862,
> sector=1803472
> end_request: I/O error, dev 03:06 (hda), sector 1803472
> vs-13070: reiserfs_read_inode2: i/o failure occurred trying to find stat data
> of [612671 612677 0x0 SD]
>
> and rm just reported me 'Permission denied'.
>
> I've looked up these errors on the net, and as far as i can tell it means that
> the drive has some bad sectors at the given addresses and that it will
> probably die on me sooner or later.
>
> Can someone either confirm this to me or tell me what to do to fix it?
>
> The drive involved is an IBM-DTLA-307060, which has served me without problems
> now for about 2 years.
>
> Thanks!
>
> DK
> --
> If all the Chinese simultaneously jumped into the Pacific off a 10 foot
> platform erected 10 feet off their coast, it would cause a tidal wave
> that would destroy everything in this country west of Nebraska.
>
Andre Hedrick
LAD Storage Consulting Group
^ permalink raw reply [flat|nested] 63+ messages in thread