From: Johannes Weiner <hannes@saeurebad.de>
To: "Vegard Nossum" <vegard.nossum@gmail.com>
Cc: a.p.zijlstra@chello.nl, arjan@linux.intel.com,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] softirq softlockup debugging
Date: Tue, 24 Jun 2008 15:06:56 +0200 [thread overview]
Message-ID: <87lk0v9e33.fsf@skyscraper.fehenstaub.lan> (raw)
In-Reply-To: <19f34abd0806240445i6282c49cia22a700140dd27eb@mail.gmail.com> (Vegard Nossum's message of "Tue, 24 Jun 2008 13:45:27 +0200")
Hi,
"Vegard Nossum" <vegard.nossum@gmail.com> writes:
> On 6/24/08, Johannes Weiner <hannes@saeurebad.de> wrote:
>> After more staring at the code in question, I think that the approach is
>> not correct (or I didn't understand it, which is not unlikely).
>>
>> I hunted down the address of the traces from kerneloops.org
>> (__do_softirq+0x6d) on a kernel image with a fedora config and it's at
>> the local_irq_enable() right after the restart:label in __do_softirq().
>>
>
> Are you quite sure? I didn't use the fedora image, but I compiled with
> the fedora config and got a function size for __do_softirq that was
> exactly equal to some of the reports. And there, the EIP pointed to
> the second "pop" instruction after calling... oh, true.
> local_irq_enable(). And those pops come from calling into paravirt
> ops.
>
> Right, I fully agree.
>
>> So if the softirq handler had disabled interrupts, the softlockup would
>> have been detected still within the handler (when it reenables irqs and
>> the timer irq runs) and the stackframe should be there.
>>
>> do_softirq()
>> local_irq_save() 1)
>> local_softirq_pending()
>> __do_softirq()
>> restart: 2)
>> local_irq_enable() 3)
>> run a handler
>> local_irq_disable() 4)
>> jnz restart
>>
>> So the lockup must be caused somewhere
>> between 1) and 3)
>> or
>> between 4) and 3) [when we jump back]
>>
>> These functions are in the path and possible candidates for causing it:
>>
>> - local_softirq_pending()
>> - account_system_vtime()
>> - __local_bh_disable()
>> - trace_softirq_enter()
>> - smp_processor_id()
>> - set_softirq_pending()
>>
>> What do you think? You said you actually used your patch already for
>> debugging lockups in softirq handlers, so it confuses me why the
>> stackframe of the handler was no longer present.
>
> What didn't make sense to me when first looking at the oops reports
> was that there was nowhere inside __do_softirq() that could actually
> lock up. As far as I can see, there is no infinite (or very
> long-running) loop in there, so I concluded (probably wrongly) that it
> must be the handler; it was the only logical explanation I could find.
> I believe that the functions you listed all run in constant time, but
> maybe we should check it.
>
> It also seems that the timer interrupt (where softlockup detection
> kicks in) has a small delay after irqs are re-enabled before it
> actually interrupts the CPU. Can you post a disassembly of
> __do_softirq() where you pinpoint the exact instruction that was
> interrupted, e.g. in this case __do_softirq+0x6d? Again, for me, this
> was a pop %ebx or so, which doesn't change interruptibility, so maybe
> I simply had the wrong disassembly.
gdb associated the line to be the asm statement in
include/asm-x86/paravirt.h::raw_local_irq_enable().
And looking by hand with objdump got the second pop, too, just like you,
IIRC (dropped the object file, will have another look at it later).
Hannes
next prev parent reply other threads:[~2008-06-24 13:07 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-06-22 12:28 [PATCH] softirq softlockup debugging Vegard Nossum
2008-06-22 21:18 ` Arjan van de Ven
2008-06-23 19:31 ` Vegard Nossum
2008-06-24 10:41 ` Johannes Weiner
2008-06-24 11:45 ` Vegard Nossum
2008-06-24 13:06 ` Johannes Weiner [this message]
2008-06-24 12:06 ` Ingo Molnar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87lk0v9e33.fsf@skyscraper.fehenstaub.lan \
--to=hannes@saeurebad.de \
--cc=a.p.zijlstra@chello.nl \
--cc=arjan@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=vegard.nossum@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox