* Dreaded PDC irq nobody cared
From: Erik Slagter @ 2005-09-30 11:32 UTC (permalink / raw)
To: Linux IDE
Hi,
I've been running a PDC (sata150tx2plus) for a few months now, with two
hard disks attached in a software RAID-1 config.
This has been working perfectly, contrary to the many bug reports
posted here.
I have now moved to another house and connected the server up again. I
found that the disk cooling fans (bought just to be sure, not standard)
were making way too much noise, so I disconnected them.
From that moment on, I have had the dreaded "irq xx nobody cared" problem. I
installed 2.6.14-rc2 in the hope that it includes all patches that might
fix this issue, and all of the relevant patches I saw here are in.
Still no joy!
The problem notably occurs when the array is resyncing, which
matches other people's reports. Sometimes the machine doesn't even
manage to boot completely.
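For anyone wanting to confirm which interrupt line is affected, the following is a minimal sketch that counts "nobody cared" reports per IRQ in saved kernel log text. The pattern and the sample line are assumptions based on the message wording quoted above, not captured output, and `find_stuck_irqs` is a hypothetical helper name.

```python
import re

# Assumed wording: the kernel reports "irq <number>: nobody cared".
# Adjust the pattern if your kernel version words the message differently.
NOBODY_CARED = re.compile(r"irq (\d+): nobody cared")


def find_stuck_irqs(log_text):
    """Return a dict mapping each reported IRQ number to its report count."""
    counts = {}
    for match in NOBODY_CARED.finditer(log_text):
        irq = int(match.group(1))
        counts[irq] = counts.get(irq, 0) + 1
    return counts


# Made-up sample text for illustration only (not taken from a real log).
sample = "irq 11: nobody cared\nDisabling IRQ #11\n"
print(find_stuck_irqs(sample))
```

Feeding it the saved output of dmesg from a failed resync should show whether the same IRQ is reported every time.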
So... I have an observation and two questions:
- it seems to matter what the temperature of the hard disks is!?
- are there any newer patches than those in 2.6.14-rc2 that I might
try?
- is there any chance booting with noapic would make a difference? I
must say my APIC is a bit flaky (according to the logs)
Thx.
* Re: Dreaded PDC irq nobody cared
From: Erik Slagter @ 2005-10-03 8:47 UTC (permalink / raw)
To: Linux IDE
> - is there any chance booting with noapic would make a difference? I
> must say my apic is a bit flaky (according to the logs)
BTW I tried noapic. It seemed to work for two days, but then the machine
crashed again with the same problem :-(
* Re: Dreaded PDC irq nobody cared
From: Erik Slagter @ 2005-10-05 10:04 UTC (permalink / raw)
To: Linux IDE
On Mon, 2005-10-03 at 10:47 +0200, Erik Slagter wrote:
> > - is there any chance booting with noapic would make a difference? I
> > must say my apic is a bit flaky (according to the logs)
>
> BTW I tried noapic, it did seem to work for two days, but then again
> crashed with the same problem :-(
I have played around a little more and found these interesting results.
- The difference between getting the stuck interrupt or not seems
to be the forced cooling of the attached hard disks; I also find this
very hard to believe, but it is very reproducible; so to anyone having
problems with the Promise SATA controllers I'd recommend buying
dedicated hard disk coolers.
- I did a test run with the kernel's disabling of "stuck" interrupts
turned off and I can say: the interrupt really is stuck, it's no kernel
bug; after some time I rebooted and the controller (or hard disk) was
still in a confused state (no booting possible at all), and only after an
hour did it start working again; it does indeed look like a heat problem.
* Re: Dreaded PDC irq nobody cared
From: Tejun Heo @ 2005-10-06 6:09 UTC (permalink / raw)
To: Erik Slagter; +Cc: Linux IDE
Erik Slagter wrote:
> On Mon, 2005-10-03 at 10:47 +0200, Erik Slagter wrote:
>
>>> - is there any chance booting with noapic would make a difference? I
>>>must say my apic is a bit flaky (according to the logs)
>>
>>BTW I tried noapic, it did seem to work for two days, but then again
>>crashed with the same problem :-(
>
>
> I have played around a little more and found these interesting results.
>
> - The difference between getting the stuck interrupt or not seems
> to be the forced cooling of the attached hard disks; I also find this
> very hard to believe, but it is very reproducible; so to anyone having
> problems with the Promise SATA controllers I'd recommend buying
> dedicated hard disk coolers.
> - I did a test run with the kernel's disabling of "stuck" interrupts
> turned off and I can say: the interrupt really is stuck, it's no kernel
> bug; after some time I rebooted and the controller (or hard disk) was
> still in a confused state (no booting possible at all), and only after an
> hour did it start working again; it does indeed look like a heat problem.
Sounds like you got a faulty drive or really bad ventilation in your case.
--
tejun
* Re: Dreaded PDC irq nobody cared
From: Erik Slagter @ 2005-10-06 9:20 UTC (permalink / raw)
To: Tejun Heo; +Cc: Linux IDE
On Thu, 2005-10-06 at 15:09 +0900, Tejun Heo wrote:
> >>> - is there any chance booting with noapic would make a difference? I
> >>>must say my apic is a bit flaky (according to the logs)
> >>
> >>BTW I tried noapic, it did seem to work for two days, but then again
> >>crashed with the same problem :-(
> >
> > I have played around a little more and found these interesting results.
> >
> > - The difference between getting the stuck interrupt or not seems
> > to be the forced cooling of the attached hard disks; I also find this
> > very hard to believe, but it is very reproducible; so to anyone having
> > problems with the Promise SATA controllers I'd recommend buying
> > dedicated hard disk coolers.
> > - I did a test run with the kernel's disabling of "stuck" interrupts
> > turned off and I can say: the interrupt really is stuck, it's no kernel
> > bug; after some time I rebooted and the controller (or hard disk) was
> > still in a confused state (no booting possible at all), and only after an
> > hour did it start working again; it does indeed look like a heat problem.
>
> Sounds like you got a faulty drive or really bad ventilation in your case.
That would be the logical conclusion.
What still puzzles me is:
- various people have this problem with various brands and models of disks
- the temperature of the disks is ~40 C without cooling, and also ~40
C with cooling (?!) according to smartctl attribute 195, so either the
temperature sensors are wrong (silly location?) or it really doesn't
matter?!
- I cannot pinpoint the problem to one hard disk, and probably never
will, because the problem only occurs with > 1 hard disk
connected :-( (as other people have experienced too)
- the temperature in the case is not very cool but also not extremely
hot; IMHO it shouldn't bother the hard disks.
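To compare readings with and without the fans, a minimal sketch like the one below can pull one attribute's raw value out of saved `smartctl -A` output. It assumes the usual ten-column attribute table; the sample rows, the attribute name, and the helper `smart_raw_value` are made up for illustration, and the attribute ID that carries the temperature varies by drive, so it is passed as a parameter.

```python
def smart_raw_value(smartctl_output, attribute_id):
    """Return the first token of the raw value for one SMART attribute.

    Assumes the ten-column table printed by `smartctl -A`:
    ID#, name, flag, value, worst, thresh, type, updated, when_failed, raw.
    Returns None if the attribute is not present.
    """
    wanted = str(attribute_id)
    for line in smartctl_output.splitlines():
        # Split into at most ten fields so a raw value with spaces stays whole.
        fields = line.split(None, 9)
        if len(fields) == 10 and fields[0] == wanted:
            return fields[9].split()[0]
    return None


# Made-up sample table for illustration only (not output from a real drive).
sample = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       1234
195 Example_Temperature     0x0022   040   040   000    Old_age   Always       -       40
"""
print(smart_raw_value(sample, 195))
```

Logging this value for each disk at intervals during a resync, once with fans and once without, would show whether the reported temperature really stays flat.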
I am very interested in the experiences of other people who have
problems with the Promise SATA controller cards.
* Re: Dreaded PDC irq nobody cared
From: Mark Hahn @ 2005-10-08 17:17 UTC (permalink / raw)
To: Linux IDE
> > > - The difference between getting the stuck interrupt or not seems
> > > to be the forced cooling of the attached hard disks; I also find this
...
> - various people have this problem with various brands and models of disks
> - the temperature of the disks is ~40 C, without cooling, and also ~40
> C with cooling (?!) according to smartctl attribute 195, so either the
> temperature sensors are wrong (silly location?) or it really doesn't
> matter?!
> - I cannot pinpoint the problem to one harddisk, and probably never
> will, because the problem only occurs with > 1 harddisk
> connected :-( (as the other people experience)
> - the temperature in the case is not very cool but also not extremely
> hot, imho it shouldn't bother the harddisks.
you added ventilation to your case, and the problem went away.
this doesn't mean the problem was disk temperature, especially
since your disks report no real change. instead, I suggest that
your PSU was too hot and regulating poorly. greater airflow
resulted in better regulation, and made the disks happier.
(note another recent message here regarding a similar problem machine
which was fixed by providing better power...)
regards, mark hahn.
* Re: Dreaded PDC irq nobody cared
From: Erik Slagter @ 2005-10-10 10:31 UTC (permalink / raw)
To: Mark Hahn; +Cc: Linux IDE
On Sat, 2005-10-08 at 13:17 -0400, Mark Hahn wrote:
> > > > - The difference between getting the stuck interrupt or not seems
> > > > to be the forced cooling of the attached hard disks; I also find this
> ...
> > - various people have this problem with various brands and models of disks
> > - the temperature of the disks is ~40 C, without cooling, and also ~40
> > C with cooling (?!) according to smartctl attribute 195, so either the
> > temperature sensors are wrong (silly location?) or it really doesn't
> > matter?!
> > - I cannot pinpoint the problem to one harddisk, and probably never
> > will, because the problem only occurs with > 1 harddisk
> > connected :-( (as the other people experience)
> > - the temperature in the case is not very cool but also not extremely
> > hot, imho it shouldn't bother the harddisks.
>
> you added ventilation to your case, and the problem went away.
> this doesn't mean the problem was disk temperature, especially
> since your disks report no real change. instead, I suggest that
> your PSU was too hot and regulating poorly. greater airflow
> resulted in better regulation, and made the disks happier.
> (note another recent message here regarding a similar problem machine
> which was fixed by providing better power...)
This may very well have something to do with the problem. I
must say that the air coming from the PSU is rather warm. Anyway, a new
PSU is on its way (high-tech/high quality, replacing the existing no-name
PSU).