From: Peter Zijlstra <peterz@infradead.org>
To: Venkatesh Pallipadi <venki@google.com>
Cc: Ingo Molnar <mingo@elte.hu>, "H. Peter Anvin" <hpa@zytor.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Balbir Singh <balbir@linux.vnet.ibm.com>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>,
	linux-kernel@vger.kernel.org, Paul Turner <pjt@google.com>,
	Eric Dumazet <eric.dumazet@gmail.com>,
	Len Brown <len.brown@intel.com>
Subject: Re: [PATCH 3/7] Add IRQ_TIME_ACCOUNTING, finer accounting of irq time -v3
Date: Sat, 02 Oct 2010 12:53:55 +0200
Message-ID: <1286016835.2144.86.camel@laptop>
In-Reply-To: <AANLkTikcPX6m01F1rg--AKkSj_69kJJVsTrqB_FF6sJf@mail.gmail.com>

On Fri, 2010-10-01 at 16:32 -0700, Venkatesh Pallipadi wrote:
> On Fri, Oct 1, 2010 at 4:14 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> > On Fri, 2010-10-01 at 10:29 -0700, Venkatesh Pallipadi wrote:
> >> So, on x86, sched_clock_stable is not set on all other kind of CPUs
> >> and my test system happens to be one of them. So, sched_clock_cpu()
> >> falls back to tick based even when TSC is not marked unstable and
> >> clocksource is using TSC for timing.
> >
> > It is never tick based!! It's tick augmented! Because TSC is such a
> > piece of crap we use external (slow) means of determining a window in
> > which the TSC should live and then use the TSC to generate high
> > resolution offsets inside that.
> >
> > So even if your usage is in the hardirq context that moves that window
> > it should all work out.
> >
> 
> You mean there should not be any "jumps" noticed with
> sched_clock_cpu() when we are idle and get an interrupt?
> At least that's what I am seeing. Maybe there is some other bug
> somewhere causing that.
> 
> Looking at one snapshot from my earlier log
> 
> <idle>-0     []  1697.915040: : START 1700899887146
> // We were idle, got an interrupt, and recorded sched_clock_cpu() as 1700899887146
> <idle>-0     []  1697.915047: : HARD STOP 1700902008678, delta 2121532
> // We finished handling the interrupt and recorded sched_clock_cpu() as 1700902008678
> // So, delta we see is > 2ms
> // This is trace_printk based on local clock, which is using sched_clock()
> // So, the trace timing shows delta of 7 us, which is kind of expected time here
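
To make the discrepancy in the quoted snapshot concrete, here is a small arithmetic check. It only uses the four numbers from the log above; the variable names are made up for illustration.

```python
from decimal import Decimal

# Values copied from the quoted trace snapshot.
start_ns = 1700899887146          # sched_clock_cpu() at interrupt entry
stop_ns = 1700902008678           # sched_clock_cpu() at interrupt exit
trace_start_s = Decimal("1697.915040")  # trace_printk timestamp of START
trace_stop_s = Decimal("1697.915047")   # trace_printk timestamp of HARD STOP

# Delta according to sched_clock_cpu(), in nanoseconds.
clock_delta_ns = stop_ns - start_ns

# Delta according to the trace timestamps, converted from seconds to nanoseconds.
trace_delta_ns = int((trace_stop_s - trace_start_s) * 1_000_000_000)

print(clock_delta_ns, trace_delta_ns)
```

The two clocks disagree by a factor of roughly 300 over the same interrupt, which is the "jump" being reported.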

Egads, yes that would be a kernel/sched_clock.c buglet..

So we're in NOHZ and an IRQ/NMI happens that ends up calling
sched_clock_cpu() and friends without us leaving NOHZ.

drivers/acpi/processor_idle.c:acpi_idle_enter_simple() calls
sched_clock_idle_{sleep,wakeup}_event() around the idle loop -- are
there idle methods missing this, and/or do we handle the interrupt
before the wakeup event?

Also, in irq_enter() we call __irq_enter() which does
account_system_vtime() before tick_check_idle() which restarts various
timers and resets jiffies and in fact already calls
sched_clock_idle_wakeup_event().

Gah what a mess.. we could try a code shuffle to restart timer/clock
bits before calling into account_system_vtime(), although I bet that'll
be interesting. But I see no way to fix the NMI during NOHZ problem; it's
not like we can actually do GTOD from NMI context :/
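
The ordering problem described above can be illustrated with a toy model. This is a sketch only, not the code in kernel/sched_clock.c: the class, the function names, the 1 ms tick and the 50 ms idle period are all assumptions chosen for illustration. The model keeps a trusted base time plus a raw offset, and clamps the result to at most one tick past the base. It then measures the same 7 us interrupt twice: once reading the clock before the window is refreshed, and once after.

```python
TICK_NSEC = 1_000_000  # assumed tick period in ns (HZ=1000), illustration only


class WindowedClock:
    """Toy tick-augmented clock: trusted slow base plus a clamped raw offset."""

    def __init__(self):
        self.tick_raw = 0   # raw counter value when the window was last refreshed
        self.tick_gtod = 0  # trusted (slow) time when the window was last refreshed
        self.clock = 0      # last value returned, so the clock never goes backwards

    def refresh_window(self, raw_now, gtod_now):
        """Stand-in for the tick or idle-wakeup path that re-anchors the window."""
        self.tick_raw = raw_now
        self.tick_gtod = gtod_now

    def read(self, raw_now):
        """Return base + raw offset, clamped to [base, base + one tick]."""
        candidate = self.tick_gtod + (raw_now - self.tick_raw)
        low = max(self.tick_gtod, self.clock)
        high = max(self.clock, self.tick_gtod + TICK_NSEC)
        self.clock = min(max(candidate, low), high)
        return self.clock


def irq_delta(refresh_first):
    """Measure one 7 us interrupt that arrives after 50 ms of tickless idle."""
    clk = WindowedClock()
    irq_entry, irq_exit = 50_000_000, 50_007_000  # true times in ns
    if refresh_first:
        clk.refresh_window(irq_entry, irq_entry)  # wakeup event first
        start = clk.read(irq_entry)
    else:
        start = clk.read(irq_entry)               # stale window, value is clamped
        clk.refresh_window(irq_entry, irq_entry)  # wakeup event happens too late
    stop = clk.read(irq_exit)
    return stop - start


stale_order = irq_delta(refresh_first=False)
shuffled_order = irq_delta(refresh_first=True)
print(stale_order, shuffled_order)
```

In this model the stale ordering charges almost the whole idle period to the interrupt, while refreshing the window first yields the true 7 us. The model says nothing about the NMI case, where the window cannot be refreshed at all.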

The thing is, I really do _NOT_ trust TSC to be sane enough to use like
you want to do, it's really proven itself to be reliably crap.


