public inbox for linux-kernel@vger.kernel.org
From: Andrew Morton <akpm@osdl.org>
To: Ingo Molnar <mingo@elte.hu>
Cc: michael@ellerman.id.au, serue@us.ibm.com,
	linuxppc64-dev@ozlabs.org, paulus@au1.ibm.com, anton@au1.ibm.com,
	linux-kernel@vger.kernel.org
Subject: Re: 2.6.15-mm4 failure on power5
Date: Tue, 17 Jan 2006 22:53:04 -0800
Message-ID: <20060117225304.4b6dd045.akpm@osdl.org>
In-Reply-To: <20060118063732.GA21003@elte.hu>

Ingo Molnar <mingo@elte.hu> wrote:
>
> 
> * Dave C Boutcher <sleddog@us.ibm.com> wrote:
> 
> > On Wed, Jan 18, 2006 at 11:19:36AM +1100, Michael Ellerman wrote:
> > > It booted fine _with_ the patch applied, with DEBUG_MUTEXES=y and n.
> > > 
> > > Boutcher, to be clear, you can't boot with kernel-kernel-cpuc-to-mutexes.patch 
> > > applied and DEBUG_MUTEXES=y ?
> > > 
> > > But if you revert kernel-kernel-cpuc-to-mutexes.patch it boots ok?
> > > 
> > > This is looking quite similar to another hang we're seeing on Power4 iSeries 
> > > on mainline git:
> > > http://ozlabs.org/pipermail/linuxppc64-dev/2006-January/007679.html
> > 
> > Correct...I die in exactly the same place every time with 
> > DEBUG_MUTEXES=Y.  I posted a backtrace that points into the _lock_cpu 
> > code, but I haven't really dug into the issue yet.  I believe this is 
> > very timing related (Serge was dying slightly differently).
> 
> so my question still is: _without_ the workaround patch, i.e. with 
> vanilla -mm4, and DEBUG_MUTEXES=n, do you get a hang?
> 
> the reason for my question is that DEBUG_MUTEXES=y will e.g. enable 
> interrupts

That used to kill ppc64 and yes, it died in timer interrupts.

> - so buggy early bootup code which relies on interrupts being 
> off might be surprised by it.

I don't think it's necessarily buggy that bootup code needs interrupts
disabled.  It _is_ buggy that bootup code which needs interrupts disabled
is calling lock_cpu_hotplug().

> The fact that you observed that it's 
> somehow related to the timer interrupt seems to strengthen this 
> suspicion. DEBUG_MUTEXES=n on the other hand should have no such 
> interrupt-enabling effects.
> 
> [ if this indeed is the case then i'll add irqs_off() checks to
>   DEBUG_MUTEXES=y, to ensure that the mutex APIs are never called with 
>   interrupts disabled. ]

Yes, I suppose so.  But we're already calling might_sleep(), and
might_sleep() checks for that.  Perhaps the might_sleep() check is being
defeated by the nasty system_running check.

There's a sad story behind that system_running check in might_sleep(). 
Because the kernel early boot is running in an in_atomic() state, a great
number of bogus might_sleep() warnings come out because of various code
doing potentially-sleepy things.  I ended up adding the system_running
test, with the changelog "OK, I give up.  Kill all the might_sleep warnings
from the early boot process." Undoing that and fixing up the fallout would
be a lot of nasty work.
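
To make the interaction concrete, here is a minimal userspace sketch of how a
boot-time exemption can hide exactly the case discussed above. It is a model
under stated assumptions, not kernel source: every name prefixed `model_` is
hypothetical, and the `state` field only stands in for the system_running
test described in this message.

```c
/* Userspace model only; none of this is kernel code. */
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the "is the system fully up yet?" state. */
enum model_system_state { MODEL_BOOTING, MODEL_RUNNING };

/* Hypothetical snapshot of the context a sleeping API is called from. */
struct model_ctx {
	enum model_system_state state; /* models the system_running test */
	bool in_atomic;                /* early boot runs in an in_atomic() state */
	bool irqs_disabled;            /* caller has interrupts turned off */
};

/*
 * Models a might_sleep()-style check that carries the boot exemption:
 * it warns about an unsafe context only once the system is running.
 */
static bool model_might_sleep_warns(const struct model_ctx *c)
{
	bool unsafe = c->in_atomic || c->irqs_disabled;

	return unsafe && c->state == MODEL_RUNNING;
}

/* Models the same check with the boot exemption removed. */
static bool model_strict_warns(const struct model_ctx *c)
{
	return c->in_atomic || c->irqs_disabled;
}
```

In this model, a caller that takes a sleeping lock during early boot with
interrupts off (state MODEL_BOOTING, irqs_disabled true) gets no warning from
model_might_sleep_warns(), while model_strict_warns() flags it. The strict
variant would also flag every other early-boot caller, which is the fallout
mentioned above.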



