From: Andrew Morton <akpm@osdl.org>
To: Ingo Molnar <mingo@elte.hu>
Cc: michael@ellerman.id.au, serue@us.ibm.com,
linuxppc64-dev@ozlabs.org, paulus@au1.ibm.com, anton@au1.ibm.com,
linux-kernel@vger.kernel.org
Subject: Re: 2.6.15-mm4 failure on power5
Date: Tue, 17 Jan 2006 22:53:04 -0800
Message-ID: <20060117225304.4b6dd045.akpm@osdl.org>
In-Reply-To: <20060118063732.GA21003@elte.hu>

Ingo Molnar <mingo@elte.hu> wrote:
>
>
> * Dave C Boutcher <sleddog@us.ibm.com> wrote:
>
> > On Wed, Jan 18, 2006 at 11:19:36AM +1100, Michael Ellerman wrote:
> > > It booted fine _with_ the patch applied, with DEBUG_MUTEXES=y and n.
> > >
> > > Boutcher, to be clear, you can't boot with kernel-kernel-cpuc-to-mutexes.patch
> > > applied and DEBUG_MUTEXES=y ?
> > >
> > > But if you revert kernel-kernel-cpuc-to-mutexes.patch it boots ok?
> > >
> > > This is looking quite similar to another hang we're seeing on Power4 iSeries
> > > on mainline git:
> > > http://ozlabs.org/pipermail/linuxppc64-dev/2006-January/007679.html
> >
> > Correct...I die in exactly the same place every time with
> > DEBUG_MUTEXES=Y. I posted a backtrace that points into the _lock_cpu
> > code, but I haven't really dug into the issue yet. I believe this is
> > very timing related (Serge was dying slightly differently).
>
> so my question still is: _without_ the workaround patch, i.e. with
> vanilla -mm4, and DEBUG_MUTEXES=n, do you get a hang?
>
> the reason for my question is that DEBUG_MUTEXES=y will e.g. enable
> interrupts

That used to kill ppc64 and yes, it died in timer interrupts.

> - so buggy early bootup code which relies on interrupts being
> off might be surprised by it.

I don't think it's necessarily buggy that bootup code needs interrupts
disabled. It _is_ buggy that bootup code which needs interrupts disabled
is calling lock_cpu_hotplug().
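
To make the hazard concrete, here is a minimal userspace sketch, not kernel
code. It assumes the debug lock path brackets its bookkeeping with an
unconditional disable/enable pair, which is one way a lock call could turn
interrupts on behind a caller's back. Every name below (model_irqs_enabled,
model_debug_lock_irq, model_debug_lock_irqsave,
model_early_boot_keeps_irqs_off) is hypothetical and only models the
behaviour being discussed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one CPU's interrupt-enable flag. */
static bool model_irqs_enabled;

/* Debug-style lock path with an unconditional disable/enable pair. */
static void model_debug_lock_irq(void)
{
    model_irqs_enabled = false;  /* plays the role of local_irq_disable() */
    /* ... debug bookkeeping would happen here ... */
    model_irqs_enabled = true;   /* plays the role of local_irq_enable():
                                    the caller's previous state is lost */
}

/* Save/restore variant: leaves the flag exactly as the caller had it. */
static void model_debug_lock_irqsave(void)
{
    bool saved = model_irqs_enabled;  /* like local_irq_save(flags) */
    model_irqs_enabled = false;
    /* ... debug bookkeeping would happen here ... */
    model_irqs_enabled = saved;       /* like local_irq_restore(flags) */
}

/*
 * Early-boot caller that needs interrupts off for its whole duration.
 * Returns true if interrupts are still off after taking the lock.
 */
static bool model_early_boot_keeps_irqs_off(void (*lock_fn)(void))
{
    assert(lock_fn);
    model_irqs_enabled = false;  /* boot code starts with interrupts off */
    lock_fn();
    return !model_irqs_enabled;
}
```

In this model, passing model_debug_lock_irq leaves interrupts enabled on
return, while model_debug_lock_irqsave preserves the caller's state. Either
way, the caller is still taking a sleeping lock from a context that must not
sleep, which is the actual bug.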
> The fact that you observed that it's
> somehow related to the timer interrupt seems to strengthen this
> suspicion. DEBUG_MUTEXES=n on the other hand should have no such
> interrupt-enabling effects.
>
> [ if this indeed is the case then i'll add irqs_off() checks to
> DEBUG_MUTEXES=y, to ensure that the mutex APIs are never called with
> interrupts disabled. ]

Yes, I suppose so. But we're already calling might_sleep(), and
might_sleep() checks for that. Perhaps the might_sleep() check is being
defeated by the nasty system_running check.

There's a sad story behind that system_running check in might_sleep().
Because the kernel early boot is running in an in_atomic() state, a great
number of bogus might_sleep() warnings come out because of various code
doing potentially-sleepy things. I ended up adding the system_running
test, with the changelog "OK, I give up. Kill all the might_sleep warnings
from the early boot process." Undoing that and fixing up the fallout would
be a lot of nasty work.
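
As a rough illustration of how that test defeats the check, here is a small
userspace model, assuming the check simply returns early until the system is
marked as running. It is a sketch of the idea, not the kernel's
might_sleep(); the struct, enum and function names are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the boot-state flag discussed above. */
enum model_system_state { MODEL_BOOTING, MODEL_RUNNING };

/* Hypothetical per-CPU context that the check inspects. */
struct model_cpu {
    enum model_system_state state;
    bool irqs_disabled;
    bool in_atomic;
    int warnings;  /* number of warnings issued so far */
};

/* Returns true if a warning was issued for this call. */
static bool model_might_sleep(struct model_cpu *cpu, const char *caller)
{
    assert(cpu);

    /* Early-boot escape hatch: nothing is checked until the system runs. */
    if (cpu->state != MODEL_RUNNING)
        return false;

    if (cpu->in_atomic || cpu->irqs_disabled) {
        cpu->warnings++;
        fprintf(stderr, "model: sleeping call from invalid context in %s\n",
                caller);
        return true;
    }
    return false;
}
```

With state set to MODEL_BOOTING, a call made with interrupts disabled
produces no warning. The identical call with state set to MODEL_RUNNING is
reported, which is why a bad lock_cpu_hotplug() caller in early boot can slip
through unnoticed.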