public inbox for linux-kernel@vger.kernel.org
* Random IDE Lock ups with via IDE
@ 2005-05-24 23:27 Jim Gifford
  2005-05-25 17:50 ` Jim Gifford
  0 siblings, 1 reply; 8+ messages in thread
From: Jim Gifford @ 2005-05-24 23:27 UTC (permalink / raw)
  To: LKML

I have been using the 2.6.10.x series kernel for a while on my other 
systems with no issues at all.

But on my laptop, which has the VIA 686 chipset, I started having some 
weird issues. They show up after about 2 weeks of uptime without a shutdown.

Here is a sample of the data from my kernel log.
About the device:
May 24 16:22:22 laptop kernel: Uniform Multi-Platform E-IDE driver 
Revision: 7.00alpha2
May 24 16:22:22 laptop kernel: ide: Assuming 33MHz system bus speed for 
PIO modes; override with idebus=xx
May 24 16:22:22 laptop kernel: VP_IDE: IDE controller at PCI slot 
0000:00:07.1
May 24 16:22:22 laptop kernel: VP_IDE: chipset revision 16
May 24 16:22:22 laptop kernel: VP_IDE: not 100%% native mode: will probe 
irqs later
May 24 16:22:22 laptop kernel: VP_IDE: VIA vt82c686a (rev 22) IDE UDMA66 
controller on pci0000:00:07.1
May 24 16:22:22 laptop kernel:     ide0: BM-DMA at 0x1100-0x1107, BIOS 
settings: hda:DMA, hdb:pio
May 24 16:22:22 laptop kernel:     ide1: BM-DMA at 0x1108-0x110f, BIOS 
settings: hdc:DMA, hdd:pio
May 24 16:22:22 laptop kernel: Probing IDE interface ide0...
May 24 16:22:22 laptop kernel: hda: FUJITSU MHS2030AT, ATA DISK drive
May 24 16:22:22 laptop kernel: ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
May 24 16:22:22 laptop kernel: Probing IDE interface ide1...
May 24 16:22:22 laptop kernel: hdc: DW-224E, ATAPI CD/DVD-ROM drive
May 24 16:22:22 laptop kernel: ide1 at 0x170-0x177,0x376 on irq 15
May 24 16:22:22 laptop kernel: hda: max request size: 128KiB
May 24 16:22:22 laptop kernel: hda: 58605120 sectors (30005 MB) 
w/2048KiB Cache, CHS=58140/16/63
May 24 16:22:22 laptop kernel: hda: cache flushes supported
May 24 16:22:22 laptop kernel:  hda: hda1 hda2 hda3 hda4

Error messages
First sign of the problem:
May 24 01:37:03 laptop kernel: ide: failed opcode was: unknown
May 24 01:37:03 laptop kernel: ide0: reset: success
May 24 01:38:15 laptop kernel: hda: irq timeout: status=0xd0 { Busy }
May 24 01:38:15 laptop kernel:
May 24 01:38:15 laptop kernel: ide: failed opcode was: unknown
May 24 01:38:17 laptop kernel: ide0: reset: success
May 24 01:47:57 laptop kernel: hda: irq timeout: status=0xd0 { Busy }
May 24 01:47:57 laptop kernel:
May 24 01:47:57 laptop kernel: ide: failed opcode was: unknown
May 24 01:48:32 laptop kernel: ide0: reset timed-out, status=0xd0
May 24 01:48:32 laptop kernel: hda: status timeout: status=0xd0 { Busy }
May 24 01:48:32 laptop kernel:
May 24 01:48:32 laptop kernel: ide: failed opcode was: unknown
May 24 01:48:32 laptop kernel: hda: drive not ready for command
May 24 01:48:32 laptop kernel: ide0: reset: success
May 24 01:50:59 laptop kernel: hda: irq timeout: status=0xd0 { Busy }
May 24 01:50:59 laptop kernel:
May 24 01:50:59 laptop kernel: ide: failed opcode was: unknown
May 24 01:51:04 laptop kernel: ide0: reset: success
May 24 01:53:44 laptop kernel: hda: irq timeout: status=0xd0 { Busy }
May 24 01:53:49 laptop kernel:
May 24 01:53:49 laptop kernel: ide: failed opcode was: unknown
May 24 01:53:49 laptop kernel: ide0: reset: success
May 24 01:54:14 laptop kernel: hda: irq timeout: status=0xd0 { Busy }
May 24 01:54:25 laptop kernel:
May 24 01:54:25 laptop kernel: ide: failed opcode was: unknown
May 24 01:54:25 laptop kernel: ide0: reset: success
May 24 02:00:12 laptop kernel: hda: irq timeout: status=0xd0 { Busy }
May 24 02:00:12 laptop kernel:
May 24 02:00:12 laptop kernel: ide: failed opcode was: unknown
May 24 02:00:16 laptop kernel: ide0: reset: success
May 24 02:00:35 laptop kernel: hda: irq timeout: status=0xd0 { Busy }
May 24 02:00:35 laptop kernel:
May 24 02:00:35 laptop kernel: ide: failed opcode was: unknown
May 24 02:00:47 laptop kernel: ide0: reset: success
May 24 02:23:36 laptop kernel: hda: status timeout: status=0xd0 { Busy }

Different error messages:
May 24 16:10:47 laptop kernel: hda: status timeout: status=0xd0 { Busy }
May 24 16:10:47 laptop kernel:
May 24 16:10:47 laptop kernel: ide: failed opcode was: unknown
May 24 16:10:47 laptop kernel: hda: no DRQ after issuing MULTWRITE
May 24 16:10:50 laptop kernel: ide0: reset: success
May 24 16:14:50 laptop kernel: hda: status timeout: status=0xd0 { Busy }
May 24 16:14:50 laptop kernel:
May 24 16:14:50 laptop kernel: ide: failed opcode was: unknown
May 24 16:14:50 laptop kernel: hda: no DRQ after issuing MULTWRITE
May 24 16:14:55 laptop kernel: ide0: reset: success
May 24 16:15:35 laptop kernel: hda: irq timeout: status=0xd0 { Busy }
May 24 16:15:35 laptop kernel:
May 24 16:15:35 laptop kernel: ide: failed opcode was: unknown
May 24 16:15:37 laptop kernel: ide0: reset: success
May 24 16:16:01 laptop kernel: hda: status timeout: status=0xd0 { Busy }
May 24 16:16:01 laptop kernel:
May 24 16:16:01 laptop kernel: ide: failed opcode was: unknown
May 24 16:16:01 laptop kernel: hda: no DRQ after issuing MULTWRITE
May 24 16:16:01 laptop kernel: ide0: reset: success
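
For anyone reading these messages: status=0xd0 is the raw value of the drive's ATA status register. Below is a minimal sketch in C of how that byte breaks down into individual status bits. The masks follow the standard ATA status register layout; the helper name decode_ata_status is hypothetical and is not a kernel function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Standard ATA status register bits. */
#define ATA_STAT_BSY   0x80  /* Busy */
#define ATA_STAT_DRDY  0x40  /* DriveReady */
#define ATA_STAT_DF    0x20  /* DeviceFault */
#define ATA_STAT_DSC   0x10  /* SeekComplete */
#define ATA_STAT_DRQ   0x08  /* DataRequest */
#define ATA_STAT_CORR  0x04  /* CorrectedError */
#define ATA_STAT_IDX   0x02  /* Index */
#define ATA_STAT_ERR   0x01  /* Error */

/*
 * Hypothetical helper (not from the kernel): writes the names of all
 * bits set in `status` into `out`, separated by single spaces.
 */
void decode_ata_status(uint8_t status, char *out, size_t len)
{
    static const struct { uint8_t mask; const char *name; } bits[] = {
        { ATA_STAT_BSY,  "Busy" },
        { ATA_STAT_DRDY, "DriveReady" },
        { ATA_STAT_DF,   "DeviceFault" },
        { ATA_STAT_DSC,  "SeekComplete" },
        { ATA_STAT_DRQ,  "DataRequest" },
        { ATA_STAT_CORR, "CorrectedError" },
        { ATA_STAT_IDX,  "Index" },
        { ATA_STAT_ERR,  "Error" },
    };
    size_t i;

    assert(out != NULL && len > 0);
    out[0] = '\0';

    for (i = 0; i < sizeof(bits) / sizeof(bits[0]); i++) {
        if (!(status & bits[i].mask))
            continue;
        /* Add a separator before every name except the first. */
        if (out[0] != '\0')
            strncat(out, " ", len - strlen(out) - 1);
        strncat(out, bits[i].name, len - strlen(out) - 1);
    }
}
```

Calling decode_ata_status(0xd0, buf, sizeof buf) sets the Busy, DriveReady and SeekComplete names. While the Busy bit is set the remaining bits are not considered valid, which is consistent with the log lines showing only { Busy }.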

Someone else describes the same issue in this forum post:
http://forums.viaarena.com/messageview.aspx?catid=28&threadid=66084&enterthread=y

-- 
----
Jim Gifford
maillist@jg555.com



* Re: Random IDE Lock ups with via IDE
  2005-05-24 23:27 Random IDE Lock ups with via IDE Jim Gifford
@ 2005-05-25 17:50 ` Jim Gifford
  2005-05-25 20:34   ` Ross Biro
  0 siblings, 1 reply; 8+ messages in thread
From: Jim Gifford @ 2005-05-25 17:50 UTC (permalink / raw)
  To: LKML

I tested the hard drive and it passes. Any other suggestions?


* Re: Random IDE Lock ups with via IDE
  2005-05-25 17:50 ` Jim Gifford
@ 2005-05-25 20:34   ` Ross Biro
  2005-05-25 20:46     ` Jim Gifford
  0 siblings, 1 reply; 8+ messages in thread
From: Ross Biro @ 2005-05-25 20:34 UTC (permalink / raw)
  To: Jim Gifford; +Cc: LKML

That's not a bad platter issue.  It could be that the electronics on
the drive have a problem, but more likely something happened, such as
the drive spinning down.  If that is the case, the reset at the end
should have woken it up.  Does the drive work correctly after the reset?

    Ross



* Re: Random IDE Lock ups with via IDE
  2005-05-25 20:34   ` Ross Biro
@ 2005-05-25 20:46     ` Jim Gifford
  2005-05-25 23:57       ` Ross Biro
  0 siblings, 1 reply; 8+ messages in thread
From: Jim Gifford @ 2005-05-25 20:46 UTC (permalink / raw)
  To: Ross Biro; +Cc: LKML

Ross,
    I thought of that too. I happen to have 2 identical laptops here 
right now, and I have placed a different drive in the other one so we 
can test it. Looking through all the logs I backed up, though, this 
problem started when I put 2.6.8 on the laptop. I'm wondering if it could 
be an ACPI issue or an IDE issue, but I have no idea what to really look for.

May 24 16:22:31 laptop kernel: mtrr: 0x40000000,0x800000 overlaps 
existing 0x40000000,0x400000
May 24 18:25:11 laptop kernel: hda: status timeout: status=0xd0 { Busy }
May 24 18:25:11 laptop kernel:
May 24 18:25:11 laptop kernel: ide: failed opcode was: unknown
May 24 18:25:11 laptop kernel: hda: no DRQ after issuing MULTWRITE
May 24 18:25:12 laptop kernel: ide0: reset: success

May 24 12:21:15 laptop2 kernel: mtrr: 0x40000000,0x800000 overlaps 
existing 0x40000000,0x400000
May 24 16:55:53 laptop2 kernel: hda: status timeout: status=0xd0 { Busy }
May 24 16:55:53 laptop2 kernel:
May 24 16:55:54 laptop2 kernel: ide: failed opcode was: unknown
May 24 16:55:54 laptop2 kernel: hda: no DRQ after issuing MULTWRITE
May 24 16:55:54 laptop2 kernel: ide0: reset: success

Here is a link to my .config file
http://ftp.jg555.com/configs/laptop.config


* Re: Random IDE Lock ups with via IDE
  2005-05-25 20:46     ` Jim Gifford
@ 2005-05-25 23:57       ` Ross Biro
  2005-05-26 18:28         ` Jim Gifford
  0 siblings, 1 reply; 8+ messages in thread
From: Ross Biro @ 2005-05-25 23:57 UTC (permalink / raw)
  To: Jim Gifford; +Cc: LKML

Since you say the problem started with 2.6.8, perhaps the easiest
thing to do is to take the kernel that worked, along with its config,
and diff it against 2.6.8.

I took a second look at your original message.  Since it's an irq
timeout, it could just be that your drive is slow to answer a command
and is then confused by the error recovery code.  Having two identical
laptops makes this easy to check.  If your other laptop has the same
problem, then it's software (or buggy hardware that the software hasn't
figured out how to work around).  If it doesn't, then it's hardware, and
there are a couple of things that can be tweaked to see if that fixes things.

    Ross



* Re: Random IDE Lock ups with via IDE
  2005-05-25 23:57       ` Ross Biro
@ 2005-05-26 18:28         ` Jim Gifford
  2005-05-26 19:35           ` Ross Biro
  0 siblings, 1 reply; 8+ messages in thread
From: Jim Gifford @ 2005-05-26 18:28 UTC (permalink / raw)
  To: Ross Biro; +Cc: LKML

What do you recommend trying?


* Re: Random IDE Lock ups with via IDE
  2005-05-26 18:28         ` Jim Gifford
@ 2005-05-26 19:35           ` Ross Biro
  2005-05-26 19:50             ` Jeff Garzik
  0 siblings, 1 reply; 8+ messages in thread
From: Ross Biro @ 2005-05-26 19:35 UTC (permalink / raw)
  To: Jim Gifford; +Cc: LKML

If you are using the legacy IDE layer you want to tweak WAIT_CMD.  For
testing, you can make it really large and see if that impacts your
problem.  A drive vendor once told me that it could take more than a
minute for an IDE drive to complete a command.  I no longer purchase
drives from that vendor.

I'm not sure what libata uses, but my guess is that it takes its
default from the SCSI layer.

If tweaking these timeouts makes your problem go away, odds are that
your drive remapped a few more bad sectors and now takes a little too
long to complete commands.  The Linux IDE error recovery code does a
WIN_IDLE_IMMEDIATE when there is a problem.  This is allowed by the
ATA-2 spec, but confuses most modern drives.  So once you start getting
errors, the drive often gets so confused that they never stop.
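
To make the WAIT_CMD suggestion concrete, here is a rough sketch in C of the arithmetic involved. It assumes that WAIT_CMD is defined in include/linux/ide.h as 10*HZ (10 seconds), which is how it appears in 2.6-era trees as far as I recall; check your own source before relying on that. HZ = 1000, the 120 second test value, and the helper timeout_covers are all illustrative assumptions, not kernel code.

```c
#include <stdbool.h>

/* Assumption: kernel tick rate; 1000 was a common i386 value in early 2.6. */
#define HZ 1000

/* Assumed stock value, believed to match include/linux/ide.h: 10 seconds. */
#define WAIT_CMD_DEFAULT (10 * HZ)

/* Hypothetical "really large" value for testing only: 120 seconds. */
#define WAIT_CMD_TEST (120 * HZ)

/*
 * Hypothetical helper: returns true if a timeout expressed in jiffies
 * is at least as long as a command that takes `worst_case_secs` seconds.
 */
bool timeout_covers(unsigned long timeout_jiffies, unsigned long worst_case_secs)
{
    return timeout_jiffies >= worst_case_secs * HZ;
}
```

With these assumed numbers, timeout_covers(WAIT_CMD_DEFAULT, 60) is false and timeout_covers(WAIT_CMD_TEST, 60) is true, so a drive that needs a minute to finish a command would trip the stock timeout but not the enlarged one. If the errors disappear with the larger value, that points at a slow drive rather than a driver bug.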

    Ross



* Re: Random IDE Lock ups with via IDE
  2005-05-26 19:35           ` Ross Biro
@ 2005-05-26 19:50             ` Jeff Garzik
  0 siblings, 0 replies; 8+ messages in thread
From: Jeff Garzik @ 2005-05-26 19:50 UTC (permalink / raw)
  To: Ross Biro; +Cc: Jim Gifford, LKML

Ross Biro wrote:
> problem.  A drive vendor once told me that it could take more than a
> minute for an IDE drive to complete a command.  I no longer purchase
> drives from that vendor.


All vendors could potentially take more than 30 seconds to complete ATA 
commands such as FLUSH CACHE EXT, in extreme cases.

	Jeff



