public inbox for linux-kernel@vger.kernel.org
* 2.4.26: IDE drives become unavailable randomly
@ 2004-06-30 11:41 Andre Costa
  2004-06-30 13:59 ` tom st denis
  0 siblings, 1 reply; 8+ messages in thread
From: Andre Costa @ 2004-06-30 11:41 UTC (permalink / raw)
  To: linux-kernel

(please cc me on any replies, because I am not subscribed to this list;
if I do need to subscribe, just let me know)

Hi,

I am using 2.4.26 SMP on an ABIT AT7 mobo, with a 2.8GHz P4 processor
with hyper-threading enabled. I have one 80GB Seagate IDE disk
as /dev/hda, and from time to time it seems to "disappear", usually
after these messages appear a couple of times in /var/log/messages:

Jun 27 17:15:00 dali kernel: hda: status timeout: status=0x80 { Busy }
Jun 27 17:15:00 dali kernel: 
Jun 27 17:15:00 dali kernel: hda: drive not ready for command
Jun 27 17:15:03 dali kernel: ide0: reset: success
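
To see whether these timeouts cluster at particular hours, the log can be tallied by hour of day. The following is only a minimal sketch: the regular expression is modelled on the four lines quoted above, and the name timeouts_by_hour is hypothetical, not part of any existing tool.

```python
import re
from collections import Counter

# Modelled on syslog lines such as:
#   Jun 27 17:15:00 dali kernel: hda: status timeout: status=0x80 { Busy }
TIMEOUT_RE = re.compile(
    r"^\w{3}\s+\d+\s+(?P<hour>\d{2}):\d{2}:\d{2}\s+\S+\s+kernel:\s+"
    r"(?P<dev>hd[a-z]): status timeout: status=(?P<status>0x[0-9a-fA-F]+)"
)


def timeouts_by_hour(lines):
    """Count 'status timeout' events per hour of day (0-23)."""
    hours = Counter()
    for line in lines:
        match = TIMEOUT_RE.match(line)
        if match:
            hours[int(match.group("hour"))] += 1
    return hours


# The four log lines from the message above, used as sample input.
SAMPLE = [
    "Jun 27 17:15:00 dali kernel: hda: status timeout: status=0x80 { Busy }",
    "Jun 27 17:15:00 dali kernel: ",
    "Jun 27 17:15:00 dali kernel: hda: drive not ready for command",
    "Jun 27 17:15:03 dali kernel: ide0: reset: success",
]

if __name__ == "__main__":
    for hour, count in sorted(timeouts_by_hour(SAMPLE).items()):
        print(f"{hour:02d}:00  {count}")
```

On a real system the same function could be fed the lines of /var/log/messages instead of SAMPLE.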

I already had some ide-related issues, namely the one mentioned here:

http://www.x86-64.org/lists/discuss/msg04679.html

Due to that, I am booting with:

hdc=ide-scsi apm=off acpi=ht noapic

Turning off APIC and keeping ACPI to a minimum seems to have fixed the
"dma status == 0x24" problem, but I still experience the "status
timeout" above, which is very frustrating because this is supposed to be
a server for our intranet.

I tried turning off APM for this disk with 'hdparm -B255 /dev/hda', but
it didn't work:

hda: drive_cmd: status=0x51 { DriveReady SeekComplete Error }
hda: drive_cmd: error=0x04 { DriveStatusError }
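
For reference, the status byte in these messages is a bit field, and the names in braces are the bits that are set. The sketch below decodes it. The bit positions are those of the ATA status register; the names Busy, DriveReady, SeekComplete and Error are taken from the log lines in this message, while the other four names are assumptions recalled from the driver's labels. The helper name decode_status is hypothetical.

```python
# ATA status register bits, highest bit first.
STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),     # assumed label
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),     # assumed label
    (0x04, "CorrectedError"),  # assumed label
    (0x02, "Index"),           # assumed label
    (0x01, "Error"),
]


def decode_status(value):
    """Return the names of the bits set in an ATA status byte."""
    return [name for mask, name in STATUS_BITS if value & mask]


if __name__ == "__main__":
    for status in (0x80, 0x51):
        print(f"status=0x{status:02x} {{ {' '.join(decode_status(status))} }}")
```

Note that the 'hdparm -I' output further down lists a "Power Management feature set" but no Advanced Power Management entry, which would be consistent with the drive simply rejecting the -B command rather than with a fault.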

I have turned off spindown with 'hdparm -S0 /dev/hda', but frankly I am
not sure this will help (besides being bad for harddisk lifetime).

So, given this scenario, I would really appreciate any suggestions on
how to work around this issue... Please, let me know if you need
additional info. I am attaching below the output of 'hdparm -I /dev/hda'
in case it helps. I am running Fedora Core 1.

TIA

Andre

-------- output of 'hdparm -I /dev/hda' -------- 

/dev/hda:

ATA device, with non-removable media
	Model Number:       ST380011A                               
	Serial Number:      3JV78385            
	Firmware Revision:  3.06    
Standards:
	Used: ATA/ATAPI-6 T13 1410D revision 2 
	Supported: 6 5 4 3 
Configuration:
	Logical		max	current
	cylinders	16383	65535
	heads		16	1
	sectors/track	63	63
	--
	CHS current addressable sectors:    4128705
	LBA    user addressable sectors:  156301488
	LBA48  user addressable sectors:  156301488
	device size with M = 1024*1024:       76319 MBytes
	device size with M = 1000*1000:       80026 MBytes (80 GB)
Capabilities:
	LBA, IORDY(can be disabled)
	bytes avail on r/w long: 4	Queue depth: 1
	Standby timer values: spec'd by Standard
	R/W multiple sector transfer: Max = 16	Current = 16
	Recommended acoustic management value: 128, current value: 0
	DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 *udma5 
	     Cycle time: min=120ns recommended=120ns
	PIO: pio0 pio1 pio2 pio3 pio4 
	     Cycle time: no flow control=240ns  IORDY flow control=120ns
Commands/features:
	Enabled	Supported:
	   *	READ BUFFER cmd
	   *	WRITE BUFFER cmd
	   *	Host Protected Area feature set
	   *	Look-ahead
	   *	Write cache
	   *	Power Management feature set
		Security Mode feature set
	   *	SMART feature set
	   *	FLUSH CACHE EXT command
	   *	Mandatory FLUSH CACHE command 
	   *	Device Configuration Overlay feature set 
	   *	48-bit Address feature set 
		SET MAX security extension
	   *	DOWNLOAD MICROCODE cmd
	   *	SMART self-test 
	   *	SMART error logging 
Security: 
		supported
	not	enabled
	not	locked
	not	frozen
	not	expired: security count
	not	supported: enhanced erase
HW reset results:
	CBLID- above Vih
	Device num = 0 determined by the jumper
Checksum: correct


-- 
Andre Oliveira da Costa
(costa@tecgraf.puc-rio.br)


* Re: 2.4.26: IDE drives become unavailable randomly
  2004-06-30 11:41 Andre Costa
@ 2004-06-30 13:59 ` tom st denis
  2004-06-30 14:51   ` Andre Costa
  0 siblings, 1 reply; 8+ messages in thread
From: tom st denis @ 2004-06-30 13:59 UTC (permalink / raw)
  To: linux-kernel; +Cc: Andre Costa

--- Andre Costa <costa@tecgraf.puc-rio.br> wrote:
> (please cc me on any replies, because I am not subscribed to this
> list;
> if I do need to subscribe, just let me know)
> 
> Hi,
> 
> I am using 2.4.26 SMP on an ABIT AT7 mobo, with a 2.8GHz P4 processor
> with hyper-threading enabled. I have one 80GB Seagate IDE disk
> as /dev/hda, and from time to time it seems to "disappear", usually
> after these messages appear a couple of times in /var/log/messages:

I get a similar problem on my Presario laptop.  In my case "all of a
sudden" hda3 becomes write-only.  Next time it happens I'll see if I
can capture a dmesg log or something.  It only seems to happen when I
enable my wifi and do a lot of disk activity [but only once in a
while].  Could be that my wifi and IDE0 share an IRQ?
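
One way to check the shared-IRQ guess is to look at /proc/interrupts and list the lines that name more than one device. A rough sketch follows. It assumes the 2.4-style layout (IRQ number, one counter per CPU, a single controller-type field, then a comma-separated device list); the sample text and the device names in it are made up for illustration, and shared_irqs is a hypothetical helper.

```python
def shared_irqs(text):
    """Map IRQ number -> device list for IRQs claimed by more than one device."""
    shared = {}
    for line in text.splitlines():
        irq, sep, rest = line.partition(":")
        irq = irq.strip()
        if not sep or not irq.isdigit():
            continue  # skip the CPU header and rows such as NMI/ERR
        fields = rest.split()
        # Drop the per-CPU interrupt counters (leading all-digit fields).
        while fields and fields[0].isdigit():
            fields.pop(0)
        # Assumption: exactly one controller-type field (e.g. "XT-PIC") follows.
        devices = [d.strip() for d in " ".join(fields[1:]).split(",") if d.strip()]
        if len(devices) > 1:
            shared[int(irq)] = devices
    return shared


# Made-up sample in the assumed layout; not taken from any real machine.
SAMPLE = """\
           CPU0
  0:     123456          XT-PIC  timer
 11:       4242          XT-PIC  eth1, usb-uhci
 14:      98765          XT-PIC  ide0
NMI:          0
"""

if __name__ == "__main__":
    for irq, devices in sorted(shared_irqs(SAMPLE).items()):
        print(f"IRQ {irq}: {', '.join(devices)}")
```

If ide0 showed up on the same line as the wifi driver, that would support the theory; the real input would be the contents of /proc/interrupts.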

Of course I'm more apt to blame my laptop than Linux since the same
kernel [well diff build options but you know what I mean] works just
fine on my two desktops in the house...

Tom



		


* Re: 2.4.26: IDE drives become unavailable randomly
  2004-06-30 13:59 ` tom st denis
@ 2004-06-30 14:51   ` Andre Costa
  0 siblings, 0 replies; 8+ messages in thread
From: Andre Costa @ 2004-06-30 14:51 UTC (permalink / raw)
  To: tom st denis; +Cc: linux-kernel

(please cc me on any replies, because I am not subscribed to this list)

Hi Tom,

On Wed, 30 Jun 2004 06:59:07 -0700 (PDT)
tom st denis <tomstdenis@yahoo.com> wrote:

> --- Andre Costa <costa@tecgraf.puc-rio.br> wrote:
> > (please cc me on any replies, because I am not subscribed to this
> > list;
> > if I do need to subscribe, just let me know)
> > 
> > Hi,
> > 
> > I am using 2.4.26 SMP on an ABIT AT7 mobo, with a 2.8GHz P4 processor
> > with hyper-threading enabled. I have one 80GB Seagate IDE disk
> > as /dev/hda, and from time to time it seems to "disappear", usually
> > after these messages appear a couple of times in /var/log/messages:
> 
> I get a similar problem on my Presario laptop.  In my case "all of a
> sudden" hda3 becomes write-only.  Next time it happens I'll see if I
> can capture a dmesg log or something.  It only seems to happen when I
> enable my wifi and do a lot of disk activity [but only once in a
> while].  Could be that my wifi and IDE0 share an IRQ?

I can't say the situation is the same here -- actually, it seems to be
more related (in my case) to idle times: usually this happens when the
system is under light load (or under no load at all), like between
0:00am and 6:00am. This is why my primary suspect is APM (I could be
completely wrong, of course). Also, I don't have wifi.

> Of course I'm more apt to blame my laptop than Linux since the same
> kernel [well diff build options but you know what I mean] works just
> fine on my two desktops in the house...

Yeah, I know what you mean: the same Linux distro has been running
flawlessly on other boxes around here for months (with different
hardware specs, though). Mine OTOH has a sad uptime record of 5 days...
=(

I agree Linux works (actually, it rocks =)), been using it for years
both at home and at work, but sometimes a specific hardware combination
(or buggy hardware/BIOS/firmware etc.) pushes it to the edge, reaching
some weak spots that need to be "hardened". Some hardware simply doesn't
work at all (I hope that's not my case...)

Best,

Andre

-- 
Andre Oliveira da Costa
(costa@tecgraf.puc-rio.br)


* Re: 2.4.26: IDE drives become unavailable randomly
@ 2004-06-30 15:18 Nick Warne
  2004-07-01  9:48 ` Bruce Allen
  0 siblings, 1 reply; 8+ messages in thread
From: Nick Warne @ 2004-06-30 15:18 UTC (permalink / raw)
  To: linux-kernel

I was getting this problem, and advice from smartmontools people was 
to clean out the box and reseat all cables etc.  Seemed to work for 
me on the box at work with this DMA timeout issue - BTW, always 
happened at idle, like 2:15am in the middle of the night etc.

Reference:

http://sourceforge.net/mailarchive/message.php?msg_id=8660397

http://sourceforge.net/mailarchive/forum.php?thread_id=4908273&forum_id=12495

Nick

-- 
"When you're chewing on life's gristle,
Don't grumble, Give a whistle..."



* Re: 2.4.26: IDE drives become unavailable randomly
  2004-06-30 15:18 2.4.26: IDE drives become unavailable randomly Nick Warne
@ 2004-07-01  9:48 ` Bruce Allen
  2004-07-01 10:49   ` Andre Costa
  0 siblings, 1 reply; 8+ messages in thread
From: Bruce Allen @ 2004-07-01  9:48 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: costa, tomstdenis, Nick Warne

> I was getting this problem, and advice from smartmontools people was 
> to clean out the box and reseat all cables etc.  Seemed to work for 
> me on the box at work with this DMA timeout issue - BTW, always 
> happened at idle, like 2:15am in the middle of the night etc.
> 
> Reference: 
> http://sourceforge.net/mailarchive/message.php?msg_id=8660397
> http://sourceforge.net/mailarchive/forum.php?thread_id=4908273&forum_id=12495

An additional reference. See the entry that starts "System freezes under
heavy load" in:
http://cvs.sourceforge.net/viewcvs.py/smartmontools/sm5/WARNINGS?view=markup

Cheers,
	Bruce



* Re: 2.4.26: IDE drives become unavailable randomly
  2004-07-01  9:48 ` Bruce Allen
@ 2004-07-01 10:49   ` Andre Costa
  2004-07-01 18:43     ` Nick Warne
  0 siblings, 1 reply; 8+ messages in thread
From: Andre Costa @ 2004-07-01 10:49 UTC (permalink / raw)
  To: Bruce Allen; +Cc: linux-kernel, tomstdenis, nick

(please cc me on any replies, I am not subscribed to this list)

On Thu, 1 Jul 2004 04:48:30 -0500 (CDT)
Bruce Allen <ballen@gravity.phys.uwm.edu> wrote:

> > I was getting this problem, and advice from smartmontools people was
> > to clean out the box and reseat all cables etc.  Seemed to work for 
> > me on the box at work with this DMA timeout issue - BTW, always 
> > happened at idle, like 2:15am in the middle of the night etc.
> > 
> > Reference: 
> > http://sourceforge.net/mailarchive/message.php?msg_id=8660397
> > http://sourceforge.net/mailarchive/forum.php?thread_id=4908273&forum_id=12495
> 
> An additional reference. See the entry that starts "System freezes
> under heavy load" in:
> http://cvs.sourceforge.net/viewcvs.py/smartmontools/sm5/WARNINGS?view=markup
> 
> Cheers,
> 	Bruce

Thks, folks, I wouldn't really have suspected bad cables/PSU; this was an
eye-opener. I have just opened the box and reseated the 80-wire IDE
cable to my hda device, and I will consider replacing it, just in case.
The PSU is brand new, 450W -- although it could be bad quality, I will
try to check this out.

BTW: Nick, I missed your msg because you didn't cc me. My hda also
usually gets disconnected at early hours in the morning, as you pointed
out. I arrived at work today and it had happened again =/ Last entry
in /var/log/messages was around 1:30am, and it was about an NFS mount that
had expired.

Best,

Andre

-- 
Andre Oliveira da Costa
(costa@tecgraf.puc-rio.br)


* Re: 2.4.26: IDE drives become unavailable randomly
  2004-07-01 10:49   ` Andre Costa
@ 2004-07-01 18:43     ` Nick Warne
  2004-07-01 19:03       ` Andre Costa
  0 siblings, 1 reply; 8+ messages in thread
From: Nick Warne @ 2004-07-01 18:43 UTC (permalink / raw)
  To: linux-kernel; +Cc: Andre Costa, ballen

> (please cc me on any replies, I am not subscribed to this list)
> 
> On Thu, 1 Jul 2004 04:48:30 -0500 (CDT)
> Bruce Allen <ballen@gravity.phys.uwm.edu> wrote:
> 
> > > I was getting this problem, and advice from smartmontools people was
> > > to clean out the box and reseat all cables etc.  Seemed to work for 
> > > me on the box at work with this DMA timeout issue - BTW, always 
> > > happened at idle, like 2:15am in the middle of the night etc.
> > > 
> > > Reference: 
> > > http://sourceforge.net/mailarchive/message.php?msg_id=8660397
> > > http://sourceforge.net/mailarchive/forum.php?thread_id=4908273&forum_id=12495
> > 
> > An additional reference. See the entry that starts "System freezes
> > under heavy load" in:
> > http://cvs.sourceforge.net/viewcvs.py/smartmontools/sm5/WARNINGS?view=markup
> > 
> > Cheers,
> > 	Bruce
> 
> Thks, folks, I wouldn't really have suspected bad cables/PSU; this was an
> eye-opener. I have just opened the box and reseated the 80-wire IDE
> cable to my hda device, and I will consider replacing it, just in case.
> The PSU is brand new, 450W -- although it could be bad quality, I will
> try to check this out.
> 
> BTW: Nick, I missed your msg because you didn't cc me. My hda also
> usually gets disconnected at early hours in the morning, as you pointed
> out. I arrived at work today and it had happened again =/ Last entry
> in /var/log/messages was around 1:30am, and it was about an NFS mount that
> had expired.
> 
> Best,
> 
> Andre

Hi Andre,

Sorry, I too am not subscribed to the list, and I read (and reply to) 
from:

http://marc.theaimsgroup.com/?l=linux-kernel

I totally overlooked CC'ing you.

Anyway, new IDE cable did fix the box at work for me.  Also I only 
used smartd AFTER the problem arose, not before, so it was not smartd 
that caused it.

Regards,

Nick

-- 
"When you're chewing on life's gristle,
Don't grumble, Give a whistle..."


* Re: 2.4.26: IDE drives become unavailable randomly
  2004-07-01 18:43     ` Nick Warne
@ 2004-07-01 19:03       ` Andre Costa
  0 siblings, 0 replies; 8+ messages in thread
From: Andre Costa @ 2004-07-01 19:03 UTC (permalink / raw)
  To: Nick Warne; +Cc: linux-kernel, ballen

(please cc me on any replies, I am not subscribed to this list)

Hi Nick,

On Thu, 01 Jul 2004 19:43:58 +0100
"Nick Warne" <nick@ukfsn.org> wrote:

[snip]
> > Thks, folks, I wouldn't really have suspected bad cables/PSU; this was
> > an eye-opener. I have just opened the box and reseated the 80-wire
> > IDE cable to my hda device, and I will consider replacing it, just
> > in case. The PSU is brand new, 450W -- although it could be bad
> > quality, I will try to check this out.
> > 
> > BTW: Nick, I missed your msg because you didn't cc me. My hda also
> > usually gets disconnected at early hours in the morning, as you
> > pointed out. I arrived at work today and it had happened again =/
> > Last entry in /var/log/messages was around 1:30am, and it was about an
> > NFS mount that had expired.
> > 
> > Best,
> > 
> > Andre
> 
> Hi Andre,
> 
> Sorry, I too am not subscribed to the list, and I read (and reply to) 
> from:
> 
> http://marc.theaimsgroup.com/?l=linux-kernel
> 
> I totally overlooked CC'ing you.

No worries. I have to remind myself over and over again to put the
"please cc me" header on every msg =)

> Anyway, new IDE cable did fix the box at work for me.  Also I only 
> used smartd AFTER the problem arose, not before, so it was not smartd 
> that caused it.

Thks, this is definitely valuable info. I will surely go after some new
cables to see if it improves my situation (I hope it does, otherwise I
will have to move services away from this box and relocate them to other
servers, which will be a PITA...)

On a side note, I browsed the 2.4.27-rc2 changelogs today, and there is some
interesting stuff there about interrupts, ACPI etc. Looking forward
to trying it out.

Thks again,

Andre

-- 
Andre Oliveira da Costa
(costa@tecgraf.puc-rio.br)
