From: Ingo Molnar <mingo@elte.hu>
To: Mark_H_Johnson@raytheon.com
Cc: "K.R. Foley" <kr@cybsft.com>,
linux-kernel@vger.kernel.org, Lee Revell <rlrevell@joe-job.com>,
Thomas Charbonnel <thomas@undata.org>
Subject: Re: [patch] voluntary-preempt-2.6.9-rc1-bk4-Q7
Date: Thu, 2 Sep 2004 15:37:43 +0200
Message-ID: <20040902133743.GA9096@elte.hu>
In-Reply-To: <OF4A93C101.C3FFA1E6-ON86256F03.004851E5@raytheon.com>
* Mark_H_Johnson@raytheon.com <Mark_H_Johnson@raytheon.com> wrote:
> >In any case, please enable nmi_watchdog=1 so that we can see (in -Q7)
> >what happens on the other CPUs during such long delays.
>
> Booted with nmi_watchdog=1, saw the kernel message indicating that
> NMI was checked OK.
>
> The first trace looks something like this...
>
> latency 518 us, entries: 79
> ...
> started at schedule+0x51/0x740
> ended at schedule+0x337/0x740
>
> 00000001 0.000ms (+0.000ms): schedule (io_schedule)
> 00000001 0.000ms (+0.000ms): sched_clock (schedule)
> 00010001 0.478ms (+0.478ms): do_nmi (sched_clock)
> 00010001 0.478ms (+0.000ms): do_nmi (<08049b21>)
> 00010001 0.482ms (+0.003ms): profile_tick (nmi_watchdog_tick)
> ...
> and a few entries later ends up at do_IRQ (sched_clock).
>
> The second trace goes from dequeue_task to __switch_to with a
> similar pattern - the line with do_nmi has +0.282ms duration and
> the line notifier_call_chain (profile_hook) as +0.135ms duration.
>
> I don't see how this provides any additional information but will
> provide several additional traces when the test gets done in a few
> minutes.

thanks. The NMI gives us two kinds of information, both useful:
- if the number of do_nmi()'s within such sections roughly matches the
number of NMIs we'd expect during the sum of these sections then it
means that the delay is most likely wall-clock time and not some
measurement artifact (RDTSC artifact or tracing bug). The NMIs are
triggered (indirectly) by the PIT, and the PIT is an independent clock
whose frequency does not depend on the rest of the system
(the CPU clock, DMA activity, IRQ load, etc.).
since most of the codepaths in question (the scheduler's
dequeue_task(), etc.) run with interrupts disabled, the normal timer
interrupts (smp_apic_timer_interrupt() and do_IRQ(00000000)) cannot
'sample' these codepaths. Only the NMI can interrupt them.
- the NMIs also sample what happens on the other CPU. In your above
trace this gives:
> 00010001 0.478ms (+0.478ms): do_nmi (sched_clock)
> 00010001 0.478ms (+0.000ms): do_nmi (<08049b21>)
the other CPU was executing userspace code during the last NMI tick,
i.e. nothing that could be suspect. 'suspect' code would be some sort
of kernel code that could in theory interact with this CPU's scheduler
code.
this too is statistical sampling so we'll need as many of these
traces as possible.
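
The two checks above can be sketched in a few lines of Python. This is only an illustration and not part of the tracer. It assumes HZ=1000 (so that with nmi_watchdog=1 each CPU takes roughly one NMI per millisecond) and the default i386 3G/1G split (kernel addresses start at 0xC0000000). The function names are made up for this sketch.

```python
import re

HZ = 1000                 # assumption: timer (and thus NMI) rate in ticks/sec
PAGE_OFFSET = 0xC0000000  # assumption: i386 3G/1G user/kernel split


def expected_nmis(latencies_us, hz=HZ):
    """Number of NMIs we'd expect inside sections of the given lengths (usecs)."""
    return sum(latencies_us) * hz / 1_000_000


def other_cpu_context(trace_line, page_offset=PAGE_OFFSET):
    """Classify the raw address in a 'do_nmi (<xxxxxxxx>)' trace line.

    Returns "user" or "kernel", or None if the line carries no raw address.
    """
    m = re.search(r"\(<([0-9a-fA-F]{8})>\)", trace_line)
    if m is None:
        return None
    return "user" if int(m.group(1), 16) < page_offset else "kernel"
```

For the one trace quoted above, expected_nmis([518]) is about 0.5, so seeing a single do_nmi in that section is plausible, and other_cpu_context() reports "user" for <08049b21>. A single sample proves little; the comparison only becomes meaningful summed over many traces.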

some wacky guess based on the above single sampling point: it seems the
delays are real wall-clock delays, and the only theory that fits so far
is that DMA traffic on the memory bus somehow stalls this CPU's memory
traffic for up to 500 usecs. How else could userspace running on
CPU#0 impact the kernel's scheduler code on CPU#1?

Ingo