Longstanding bug in our IRQ code (irqbalance HPMCs parisc SMP machines)

* Longstanding bug in our IRQ code (irqbalance HPMCs parisc SMP machines)
@ 2010-04-24 13:29 Thibaut VARÈNE
  2010-04-24 14:36 ` Grant Grundler
  0 siblings, 1 reply; 6+ messages in thread
From: Thibaut VARÈNE @ 2010-04-24 13:29 UTC (permalink / raw)
  To: linux-parisc

Pa-ckers,

Just for the record, I'd like to raise some attention to what seems
like a pretty old bug in our IRQ code that is apparently still
affecting us.

Long story short: while trying to figure out why the recently attached
10-disk bay was killing the Debian "lafayette" autobuilder during raid
resync, I noticed that irqbalance was part of the default Debian
autobuilder setup.

The nastiness of irqbalance has been discussed before, and I
remembered having had issues in the past (5+ years ago) on my parisc
machines with that daemon. I couldn't find a pointer to a m-l thread,
I don't remember if I discussed that on IRC or elsewhere.

Anyway, turned out disabling irqbalance "fixed" the crash (and by
crash I mean HPMC). IIRC, the general idea is that when irqbalance
reroutes IRQ under heavy interrupt load, a race occurs by which one
interrupt request might end up delivered to the wrong CPU, HPMC'ing
the machine.

I have no particular opinion on whether it should be expected that
something as stupid as irqbalance could crash a system, but others
seem to believe it shouldn't (claiming "it works on *real* [read: x86]
hardware").

Now, I'm quite convinced that irqbalance could be one of the (major?)
causes of instability of the parisc autobuilders. AFAIU, they've
decided to disable it on their setup, maybe the situation will improve
there. Still, irqbalance is only the messenger, and I'm wondering
whether that apparent bug in our IRQ code could also be responsible
for other issues we're still having.

It's been a very long time since I last touched that code, and tbh I
never fully mastered it anyway, but I thought it'd be a good thing to
have a trace that this bug is still there, and maybe it will ring a
bell to others...

HTH

T-Bone

-- 
Thibaut Varène
http://www.parisc-linux.org/~varenet/
--
To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Re: Longstanding bug in our IRQ code (irqbalance HPMCs parisc SMP machines)
  2010-04-24 13:29 Longstanding bug in our IRQ code (irqbalance HPMCs parisc SMP machines) Thibaut VARÈNE
@ 2010-04-24 14:36 ` Grant Grundler
  2010-04-24 14:44   ` Thibaut VARÈNE
  0 siblings, 1 reply; 6+ messages in thread
From: Grant Grundler @ 2010-04-24 14:36 UTC (permalink / raw)
  To: Thibaut VARÈNE; +Cc: linux-parisc

On Sat, Apr 24, 2010 at 03:29:01PM +0200, Thibaut VARÈNE wrote:
> Pa-ckers,
>
> Just for the record, I'd like to raise some attention to what seems
> like a pretty old bug in our IRQ code that is apparently still
> affecting us.
>
> Long story short: while trying to figure out why the recently attached
> 10-disk bay was killing the Debian "lafayette" autobuilder during raid
> resync, I noticed that irqbalance was part of the default Debian
> autobuilder setup.
>
> The nastiness of irqbalance has been discussed before, and I
> remembered having had issues in the past (5+ years ago) on my parisc
> machines with that daemon. I couldn't find a pointer to a m-l thread,
> I don't remember if I discussed that on IRC or elsewhere.
>
> Anyway, turned out disabling irqbalance "fixed" the crash (and by
> crash I mean HPMC). IIRC, the general idea is that when irqbalance
> reroutes IRQ under heavy interrupt load, a race occurs by which one
> interrupt request might end up delivered to the wrong CPU, HPMC'ing
> the machine.

I'm not seeing how an IRQ message getting delivered to the "wrong" CPU
would cause an HPMC. Sounds more like MSI or other mask is getting built
wrong and sending the IRQ transaction to an invalid physical address.

> I have no particular opinion on whether it should be expected that
> something as stupid as irqbalance could crash a system, but others
> seem to believe it shouldn't (claiming "it works on *real* [read: x86]
> hardware").

It definitely should not.

> Now, I'm quite convinced that irqbalance could be one of the (major?)
> causes of instability of the parisc autobuilders. AFAIU, they've
> decided to disable it on their setup, maybe the situation will improve
> there. Still, irqbalance is only the messenger, and I'm wondering
> whether that apparent bug in our IRQ code could also be responsible
> for other issues we're still having.

Sounds like it. Though the HPMCs are clearly different than the PTE
issues that jda/carlos are seeing.

> It's been a very long time since I last touched that code, and tbh I
> never fully mastered it anyway, but I thought it'd be a good thing to
> have a trace that this bug is still there, and maybe it will ring a
> bell to others...

No matter what crap irqbalanced is doing, the box shouldn't crash.
I can take a look at the code path and see if something looks broken.

thanks,
grant

* Re: Longstanding bug in our IRQ code (irqbalance HPMCs parisc SMP machines)
  2010-04-24 14:36 ` Grant Grundler
@ 2010-04-24 14:44   ` Thibaut VARÈNE
  2010-04-24 15:13     ` Grant Grundler
  2010-04-24 15:48     ` John David Anglin
  0 siblings, 2 replies; 6+ messages in thread
From: Thibaut VARÈNE @ 2010-04-24 14:44 UTC (permalink / raw)
  To: Grant Grundler; +Cc: linux-parisc

On 24 Apr 2010, at 16:36, Grant Grundler wrote:

> On Sat, Apr 24, 2010 at 03:29:01PM +0200, Thibaut VARÈNE wrote:
>>
>> Anyway, turned out disabling irqbalance "fixed" the crash (and by
>> crash I mean HPMC). IIRC, the general idea is that when irqbalance
>> reroutes IRQ under heavy interrupt load, a race occurs by which one
>> interrupt request might end up delivered to the wrong CPU, HPMC'ing
>> the machine.
>
> I'm not seeing how an IRQ message getting delivered to the "wrong" CPU
> would cause an HPMC. Sounds more like MSI or other mask is getting
> built wrong and sending the IRQ transaction to an invalid physical
> address.

I'm not sure, it's been a very long time since I last tracked down
this bug. Maybe I'm remembering it wrong.
FWIW, no MSI on this machine (L1000) and PCI card (sym53c896).

>> Now, I'm quite convinced that irqbalance could be one of the (major?)
>> causes of instability of the parisc autobuilders. AFAIU, they've
>> decided to disable it on their setup, maybe the situation will improve
>> there. Still, irqbalance is only the messenger, and I'm wondering
>> whether that apparent bug in our IRQ code could also be responsible
>> for other issues we're still having.
>
> Sounds like it. Though the HPMCs are clearly different than the PTE
> issues that jda/carlos are seeing.

True, but I remember Debian staff complaining about random unexplained
hangs, I wouldn't be too surprised if this came into play...

>> It's been a very long time since I last touched that code, and tbh I
>> never fully mastered it anyway, but I thought it'd be a good thing to
>> have a trace that this bug is still there, and maybe it will ring a
>> bell to others...
>
> No matter what crap irqbalanced is doing, the box shouldn't crash.
> I can take a look at the code path and see if something looks broken.
>
> thanks,


You're welcome ;)

-- 
Thibaut VARÈNE
http://www.parisc-linux.org/~varenet/

* Re: Longstanding bug in our IRQ code (irqbalance HPMCs parisc SMP machines)
  2010-04-24 14:44   ` Thibaut VARÈNE
@ 2010-04-24 15:13     ` Grant Grundler
  2010-04-24 15:48     ` John David Anglin
  1 sibling, 0 replies; 6+ messages in thread
From: Grant Grundler @ 2010-04-24 15:13 UTC (permalink / raw)
  To: Thibaut VARÈNE; +Cc: Grant Grundler, linux-parisc

On Sat, Apr 24, 2010 at 04:44:27PM +0200, Thibaut VARÈNE wrote:
...
> I'm not sure, it's been a very long time since I last tracked down
> this bug. Maybe I'm remembering it wrong.
> FWIW, no MSI on this machine (L1000) and PCI card (sym53c896).

Ok - true. We don't support MSI AFAIK.
That means whatever is going wrong, is happening in the IO SAPIC.

>
>>> Now, I'm quite convinced that irqbalance could be one of the (major?)
>>> causes of instability of the parisc autobuilders. AFAIU, they've
>>> decided to disable it on their setup, maybe the situation will improve
>>> there. Still, irqbalance is only the messenger, and I'm wondering
>>> whether that apparent bug in our IRQ code could also be responsible
>>> for other issues we're still having.
>>
>> Sounds like it. Though the HPMCs are clearly different than the PTE
>> issues that jda/carlos are seeing.
>
> True, but I remember Debian staff complaining about random unexplained
> hangs, I wouldn't be too surprised if this came into play...

Yes, especially given the fact that "hangs" generally means
the "machine suddenly stopped responding".

cheers,
grant

* Re: Longstanding bug in our IRQ code (irqbalance HPMCs parisc SMP machines)
  2010-04-24 14:44   ` Thibaut VARÈNE
  2010-04-24 15:13     ` Grant Grundler
@ 2010-04-24 15:48     ` John David Anglin
  2010-04-24 16:44       ` PTE/TLB issues (was Re: Longstanding bug in our IRQ code (irqbalance HPMCs parisc SMP machines)) Thibaut VARÈNE
  1 sibling, 1 reply; 6+ messages in thread
From: John David Anglin @ 2010-04-24 15:48 UTC (permalink / raw)
  To: Thibaut VARÈNE; +Cc: linux-parisc, grundler

> > Sounds like it. Though the HPMCs are clearly different than the PTE
> > issues that jda/carlos are seeing.
>
> True, but I remember Debian staff complaining about random unexplained
> hangs, I wouldn't be too surprised if this came into play...

I have been looking at the PTE/TLB code for the last couple of weeks.
I now believe that the majority of the random hangs and segvs at program
startup are related to problems with the PTE/TLB code.

1) minifail bug

As previously identified, there is a cache flush problem in
ptep_set_wrprotect.  I did a kernel build with a
"WARN_ON(pte_present(old_pte) && pte_dirty(old_pte))" in
ptep_set_wrprotect and it triggered immediately.  So, we have to deal
with a dirty cache in ptep_set_wrprotect.  As far as I can tell, this
problem is fixed by putting the flush inside
preempt_disable()/preempt_enable().

Helge is triggering other PTE issues when he runs multiple copies of
minifail.  The minifail program fails on a UP system.  With the above fix,
it doesn't fail.

2) SMP page table entry corruption

There needs to be a lock around pte updates to ensure that pte modifications
are consistent on SMP machines.  I have done this and it helps stability
on rp3440.
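
A toy userspace sketch of one way unlocked pte updates could become
inconsistent: two CPUs each do a read-modify-write on the same entry and
one update is lost.  This is an assumption for illustration only; the
message does not say which updates collide, and the bit layout and all
demo_* names are hypothetical, not kernel code.

```c
#include <assert.h>

#define DEMO_PTE_WRITE 0x1UL   /* hypothetical "writable" bit */
#define DEMO_PTE_DIRTY 0x2UL   /* hypothetical "dirty" bit */

/* Unlocked: both CPUs read the pte before either writes it back. */
static unsigned long demo_unlocked_update(unsigned long pte)
{
	unsigned long a = pte;        /* cpu1 reads */
	unsigned long b = pte;        /* cpu2 reads the same old value */

	a &= ~DEMO_PTE_WRITE;         /* cpu1 wants to write-protect */
	b |= DEMO_PTE_DIRTY;          /* cpu2 wants to mark dirty */

	pte = a;                      /* cpu1 writes back */
	pte = b;                      /* cpu2 overwrites: write-protect lost */
	return pte;
}

/* Locked: each read-modify-write completes before the next one starts. */
static unsigned long demo_locked_update(unsigned long pte)
{
	pte &= ~DEMO_PTE_WRITE;       /* cpu1, holding the lock */
	pte |= DEMO_PTE_DIRTY;        /* cpu2, after cpu1 released it */
	return pte;
}

/* Self-check of the two replays above. */
static void demo_selfcheck(void)
{
	unsigned long start = 0x1000UL | DEMO_PTE_WRITE;

	assert(demo_unlocked_update(start) & DEMO_PTE_WRITE);
	assert(!(demo_locked_update(start) & DEMO_PTE_WRITE));
	assert(demo_locked_update(start) & DEMO_PTE_DIRTY);
}
```

Calling demo_selfcheck() from a small main() replays both orderings; in
the unlocked replay the entry stays writable even though cpu1 cleared
the bit.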

3) SMP PTE/TLB consistency

We are not purging existing translations when an update to a PTE value
is done such as in ptep_set_wrprotect.  As a result, the parent of a fork
doesn't immediately trigger a COW break on a page when a write occurs
if there is an existing translation.

Adding a TLB purge helps, but even this may be lost.  For example,

page fault (cpu1)
  -> load old pte (cpu1)
    -> store new pte (cpu2)
      -> global tlb purge (cpu2)
        -> insert old pte into tlb (cpu1)

It may be possible to use locking to ensure that purges aren't lost.  However,
this is a bit tricky...
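
The interleaving above can be replayed deterministically in a toy
userspace model.  This is only a sketch: the struct, the bit layout and
all toy_* names are hypothetical, and the "serialized" variant is just
one possible reading of what locking would have to guarantee, not the
actual kernel fix.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_PTE_WRITE 0x1UL   /* hypothetical "writable" bit */

/* Toy state: one pte in memory plus the translation cached by cpu1. */
struct toy_mmu {
	unsigned long pte;   /* page table entry in memory */
	unsigned long tlb;   /* copy held in cpu1's tlb */
	bool tlb_valid;
};

/* A translation is stale if it is cached and differs from the pte. */
static bool toy_stale(const struct toy_mmu *m)
{
	return m->tlb_valid && m->tlb != m->pte;
}

/* Replay the five steps of the interleaving in order. */
static bool toy_replay_lost_purge(void)
{
	struct toy_mmu m = { .pte = 0x1000UL | TOY_PTE_WRITE };
	unsigned long old;

	old = m.pte;               /* cpu1: page fault, load old pte */
	m.pte &= ~TOY_PTE_WRITE;   /* cpu2: store new (write-protected) pte */
	m.tlb_valid = false;       /* cpu2: global tlb purge */
	m.tlb = old;               /* cpu1: insert old pte into tlb */
	m.tlb_valid = true;

	return toy_stale(&m);      /* the purge ran too early and was lost */
}

/* Same work, but load+insert and store+purge each run as one unit. */
static bool toy_replay_serialized(bool cpu1_first)
{
	struct toy_mmu m = { .pte = 0x1000UL | TOY_PTE_WRITE };

	if (cpu1_first) {
		m.tlb = m.pte;             /* cpu1: load + insert */
		m.tlb_valid = true;
		m.pte &= ~TOY_PTE_WRITE;   /* cpu2: store + purge */
		m.tlb_valid = false;
	} else {
		m.pte &= ~TOY_PTE_WRITE;   /* cpu2: store + purge */
		m.tlb_valid = false;
		m.tlb = m.pte;             /* cpu1: load + insert sees new pte */
		m.tlb_valid = true;
	}
	return toy_stale(&m);
}

/* Self-check: only the racy replay leaves a stale translation. */
static void toy_selfcheck(void)
{
	assert(toy_replay_lost_purge());
	assert(!toy_replay_serialized(true));
	assert(!toy_replay_serialized(false));
}
```

In the racy replay cpu1 ends up with a writable translation for a page
whose pte is already write-protected, which matches the missed COW
break described above.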

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

* PTE/TLB issues (was Re: Longstanding bug in our IRQ code (irqbalance HPMCs parisc SMP machines))
  2010-04-24 15:48     ` John David Anglin
@ 2010-04-24 16:44       ` Thibaut VARÈNE
  0 siblings, 0 replies; 6+ messages in thread
From: Thibaut VARÈNE @ 2010-04-24 16:44 UTC (permalink / raw)
  To: John David Anglin; +Cc: linux-parisc

On 24 Apr 2010, at 17:48, John David Anglin wrote:

>>> Sounds like it. Though the HPMCs are clearly different than the PTE
>>> issues that jda/carlos are seeing.
>>
>> True, but I remember Debian staff complaining about random
>> unexplained hangs, I wouldn't be too surprised if this came into
>> play...
>
> I have been looking at the PTE/TLB code for the last couple of weeks.
> I now believe that the majority of the random hangs, segvs at program
> startup are related to problems with the PTE/TLB code.

[snip]

Thanks for this detailed explanation. I've quoted your remarks to this
wiki page (as a reference point):
http://wiki.parisc-linux.org/KnownIssues#head-fc724f5d1b9a93f4b531a05fc1aab3314dde8c87

People, feel free to make sure everything there is as accurate as it
can be, btw... Also taking the opportunity to point at the "SMP perf /
cache issue" paragraph there... Might be related.

I've also added a link to the irqbalance bug on that same page.

FWIW, the hangs I was referring to are those that kill the machine
without any output. Hopefully PTE/TLB issues are a bit more verbose? ;-)

HTH

-- 
Thibaut VARÈNE
http://www.parisc-linux.org/~varenet/
