public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Johannes Weiner <hannes@saeurebad.de>
To: "Vegard Nossum" <vegard.nossum@gmail.com>
Cc: a.p.zijlstra@chello.nl, arjan@linux.intel.com,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] softirq softlockup debugging
Date: Tue, 24 Jun 2008 15:06:56 +0200	[thread overview]
Message-ID: <87lk0v9e33.fsf@skyscraper.fehenstaub.lan> (raw)
In-Reply-To: <19f34abd0806240445i6282c49cia22a700140dd27eb@mail.gmail.com> (Vegard Nossum's message of "Tue, 24 Jun 2008 13:45:27 +0200")

Hi,

"Vegard Nossum" <vegard.nossum@gmail.com> writes:

> On 6/24/08, Johannes Weiner <hannes@saeurebad.de> wrote:
>> After more staring at the code in question, I think that the approach is
>> not correct (or I didn't understand it, which is not unlikely).
>>
>> I hunted down the address of the traces from kerneloops.org
>> (__do_softirq+0x6d) on a kernel image with a fedora config and it's at
>> the local_irq_enable() right after the restart:label in __do_softirq().
>>
>
> Are you quite sure? I didn't use the fedora image, but I compiled with
> the fedora config and got a function size for __do_softirq that was
> exactly equal to some of the reports. And there, the EIP pointed to
> the second "pop" instruction after calling... oh, true.
> local_irq_enable(). And those pops come from calling into paravirt
> ops.
>
> Right, I fully agree.
>
>> So if the softirq handler had disabled interrupts, the softlockup would
>> have been detected still within the handler (when it reenables irqs and
>> the timer irq runs) and the stackframe should be there.
>>
>> do_softirq()
>>  local_irq_save()                      1)
>>  local_softirq_pending()
>>  __do_softirq()
>>   restart:                             2)
>>    local_irq_enable()                  3)
>>    run a handler
>>    local_irq_disable()                 4)
>>    jnz restart
>>
>> So the lockup must be caused somewhere
>>  between 1) and 3)
>> or
>>  between 4) and 3) [when we jump back]
>>
>> These functions are in the path and possible candidates for causing it:
>>
>> - local_softirq_pending()
>> - account_system_vtime()
>> - __local_bh_disable()
>> - trace_softirq_enter()
>> - smp_processor_id()
>> - set_softirq_pending()
>>
>> What do you think?  You said you actually used your patch already for
>> debugging lockups in softirq handlers, so it confuses me why the
>> stackframe of the handler was no longer present.
>
> What didn't make sense to me when first looking at the oops reports
> was that there was nowhere inside __do_softirq() that could actually
> lock up. As far as I can see, there is no infinite (or very
> long-running) loop in there, so I concluded (probably wrongly) that it
> must be the handler; it was the only logical explanation I could find.
> I believe that the functions you listed all run in constant time, but
> maybe we should check it.
>
> It also seems that the timer interrupt (where softlockup detection
> kicks in) has a small delay after irqs are re-enabled before it
> actually interrupts the CPU. Can you post a disassembly of
> __do_softirq() where you pinpoint the exact instruction that was
> interrupted, e.g. in this case __do_softirq+0x6d? Again, for me, this
> was a pop %ebx or so, which doesn't change interruptibility, so maybe
> I simply had the wrong disassembly.

gdb associated the line to be the asm statement in
include/asm-x86/paravirt.h::raw_local_irq_enable().

And looking by hand with objdump got the second pop, too, just like you,
IIRC (dropped the object file, will have another look at it later).

	Hannes

  reply	other threads:[~2008-06-24 13:07 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-06-22 12:28 [PATCH] softirq softlockup debugging Vegard Nossum
2008-06-22 21:18 ` Arjan van de Ven
2008-06-23 19:31   ` Vegard Nossum
2008-06-24 10:41 ` Johannes Weiner
2008-06-24 11:45   ` Vegard Nossum
2008-06-24 13:06     ` Johannes Weiner [this message]
2008-06-24 12:06 ` Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87lk0v9e33.fsf@skyscraper.fehenstaub.lan \
    --to=hannes@saeurebad.de \
    --cc=a.p.zijlstra@chello.nl \
    --cc=arjan@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=vegard.nossum@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox