public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: bbee <bumble.bee@xs4all.nl>
To: linux-kernel@vger.kernel.org
Subject: Re: ata1: spurious interrupt (irq_stat 0x8 active_tag -84148995 sactive 0x0) r0xj0
Date: Wed, 3 Jan 2007 01:44:42 +0000 (UTC)	[thread overview]
Message-ID: <loom.20070103T020347-255@post.gmane.org> (raw)
In-Reply-To: 45933A53.1090702@gmail.com

Tejun Heo <htejun <at> gmail.com> writes:
> Andrew Lyon wrote:
> > My system is gigabyte ds3 motherboard with onboard SATA JMicron
> > 20360/20363 AHCI Controller (rev 02), drive connected is WDC
> > WD740ADFD-00 20.0, I am running 2.6.18.6 32 bit, under heavy i/o I get
> > the following messaegs:
> > 
> > ata1: spurious interrupt (irq_stat 0x8 active_tag -84148995 sactive 0x0)
> > ata1: spurious interrupt (irq_stat 0x8 active_tag -84148995 sactive 0x0)
> > 
> > Is this condition dangerous?
> 
> Not usually.  Might indicate something is going wrong in some really
> rare cases.  I think vendors are getting NCQ right these days.  Maybe
> it's time to remove that printk.

Hi Tejun, it's funny you should say that, because in the subthread at
http://thread.gmane.org/gmane.linux.ide/10264/focus=10334
you seemed to have major issues with this very error and were saying there
could even be data corruption.

I too have this error, on a Asrock 939Dual-SATA2 board wich has the same
controller. Syslog lines like
ata1: spurious interrupt (irq_stat 0x8 active_tag -84148995 sactive 0xf4)
every so often.

However, in my case it gets a lot worse. The following happens infrequently,
usually within 15 days of uptime on a light I/O load:

ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata1.00: (irq_stat 0x48000000, interface fatal error)
ata1.00: tag 0 cmd 0xea Emask 0x12 stat 0x37 err 0x0 (ATA bus error)
ata1: soft resetting port
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: qc timeout (cmd 0xec)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x104)
ata1.00: revalidation failed (errno=-5)
ata1: failed to recover some devices, retrying in 5 secs
ata1: hard resetting port
ata1: SATA link down (SStatus 0 SControl 300)
ata1: failed to recover some devices, retrying in 5 secs
ata1: hard resetting port
ata1: SATA link down (SStatus 0 SControl 300)
ata1.00: disabled
ata1: EH complete
ata1.00: detaching (SCSI 0:0:0:0)
scsi 0:0:0:0: rejecting I/O to dead device

The drive then dissapears from the system. This is not preceded by any
spurious interrupt messages, but I have a hunch it is related because
following your grave comments in the referenced thread, I looked for a kernel
option to disable NCQ. Astonished to find none, I changed the source using the
flag you added in this patch:
http://article.gmane.org/gmane.linux.ide/11527
With NCQ disabled, the spurious interrupt messages as well as the exceptions
go away.

This has been happening for a few months on a box whose log I'd been neglecting
and I hadn't even noticed the issue since the drive is part of a md array. The
drive would get re-detected when I rebooted the box and md would rebuild the
array.

Here comes the weird part. When I discovered the problem, I backtracked through
the syslog to see when the problems started. They started a few months ago when
I added a DVB card to the system (it is a mythtv box). I noticed in Andrew's
dmesg that he also has a DVB card.

Could the DVB subsystem have anything to do with this? I realize the systems
are completely unrelated..
Perhaps the JMicron chip has noise issues? These are often triggered by adding
tuner cards..

It probably won't make any difference to system performance, but it would be
nice if we could resolve this so I can re-enable NCQ and stop patching my
kernels ;)

Thanks for reading and please CC,

bbee


  parent reply	other threads:[~2007-01-03  1:50 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-12-28  2:12 ata1: spurious interrupt (irq_stat 0x8 active_tag -84148995 sactive 0x0) r0xj0 Andrew Lyon
2006-12-28  3:30 ` Tejun Heo
2006-12-31 21:39   ` Andrew Lyon
2007-01-03  1:44   ` bbee [this message]
2007-01-03  2:25     ` Tejun Heo
2007-01-03  3:38       ` bbee
2007-01-03  4:15         ` Tejun Heo
2007-01-03 16:50           ` bbee
2007-01-04  1:01             ` Andrew Lyon
2007-01-04  3:18               ` Tejun Heo
2007-01-04  3:16             ` Tejun Heo
     [not found]         ` <f4527be0701030825m3e07a38dm67d2c21fd25b1978@mail.gmail.com>
2007-01-03 16:26           ` Fwd: " Andrew Lyon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=loom.20070103T020347-255@post.gmane.org \
    --to=bumble.bee@xs4all.nl \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox