From: Ingo Molnar <mingo@elte.hu>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Steven Rostedt <rostedt@goodmis.org>,
Martin Bligh <mbligh@google.com>,
Peter Zijlstra <peterz@infradead.org>,
Martin Bligh <mbligh@mbligh.org>,
linux-kernel@vger.kernel.org,
Thomas Gleixner <tglx@linutronix.de>,
Andrew Morton <akpm@linux-foundation.org>,
prasad@linux.vnet.ibm.com,
Mathieu Desnoyers <compudj@krystal.dyndns.org>,
"Frank Ch. Eigler" <fche@redhat.com>,
David Wilder <dwilder@us.ibm.com>,
hch@lst.de, Tom Zanussi <zanussi@comcast.net>,
Steven Rostedt <srostedt@redhat.com>
Subject: Re: [RFC PATCH 1/3] Unified trace buffer
Date: Thu, 25 Sep 2008 22:52:18 +0200
Message-ID: <20080925205218.GA8997@elte.hu>
In-Reply-To: <alpine.LFD.1.10.0809251318270.3265@nehalem.linux-foundation.org>
* Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Thu, 25 Sep 2008, Ingo Molnar wrote:
> >
> > You seem to dismiss that angle by calling my arguments bullshit, but
> > i dont know on what basis you dismiss it. Sure, a feature and extra
> > complexity _always_ has a robustness cost. If your argument is that
> > we should move cpu_clock() to assembly to make it more dependable -
> > i'm all for it.
>
> Umm. cpu_clock() isn't even cross-cpu synchronized, and has actually
> thrown away all the information that can make it so, afaik. At least
> the comments say "never more than 2 jiffies difference"). You do
> realize that if you want to order events across CPU's, we're not
> talking about "jiffies" here, we're talking about 50-100 CPU _cycles_.
Steve got the _worst-case_ cpu_clock() difference down to 60 usecs not
so long ago. It might have regressed since then - it's really hard to do
it without cross-CPU synchronization.

( But it's not impossible, as Steve has proven, because physical time
  goes on linearly on each CPU so we have a chance to do it: by
  accurately correlating the GTOD timestamps we get at to-idle/from-idle
  times to the TSC. )
And note that i'm not only talking about cross-CPU synchronization, i'm
also talking about _single CPU_ timestamps. How do you get it right with
TSCs via a pure postprocessing method? A very large body of modern CPUs
will halt the TSC when they go into idle. (about 70% of the installed
base or so)
Note, we absolutely cannot do accurate timings in a pure
TSC-post-processing environment: unless you want to trace _every_
to-idle and from-idle event, which can easily be tens of thousands of
extra events per second.
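
A minimal sketch of why this bites, as a toy userspace model rather than kernel code: the TSC below only advances while the CPU is busy, so converting a raw TSC delta to time silently loses every idle period. The 2 GHz rate, the phase list and all function names are assumptions made up for illustration.

```python
# Hypothetical model: a TSC that stops counting while the CPU is idle.
TSC_HZ = 2_000_000_000  # assumed 2 GHz TSC rate while the CPU is running
NSEC_PER_SEC = 1_000_000_000


def simulate(phases):
    """phases: list of (state, duration_ns), state is "busy" or "idle".

    Returns (wall_ns, tsc_cycles) after all phases have elapsed.
    """
    wall_ns = 0
    tsc = 0
    for state, duration_ns in phases:
        wall_ns += duration_ns
        if state == "busy":
            # The counter only advances while the CPU is not idle.
            tsc += duration_ns * TSC_HZ // NSEC_PER_SEC
    return wall_ns, tsc


def naive_tsc_to_ns(tsc):
    """Pure post-processing: pretend the TSC never stopped."""
    return tsc * NSEC_PER_SEC // TSC_HZ


# 1 ms busy, 9 ms idle, 0.5 ms busy
phases = [("busy", 1_000_000), ("idle", 9_000_000), ("busy", 500_000)]
wall_ns, tsc = simulate(phases)
error_ns = wall_ns - naive_tsc_to_ns(tsc)
print(wall_ns, naive_tsc_to_ns(tsc), error_ns)
```

In this model the error equals the total idle time, which is exactly the information a post-processor does not have unless the idle transitions are recorded somewhere.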
What we could do perhaps is a hybrid method:

 - save a GTOD+TSC pair at important events, such as to-idle and
   from-idle, and in the periodic sched_tick(). [ perhaps also save it
   when we change cpufreq. ]

 - save the (last_GTOD, _relative_-TSC) pair in the trace entry

with that we have a chance to do good post-processed correlation - at
the cost of having 12-16 bytes of timestamp, per trace entry.
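
The correlation step of such a hybrid scheme could look roughly like this in a post-processing tool. This is a sketch under assumptions, not the format of any posted patch: the names Anchor, TraceEntry, record() and entry_time_ns(), the 2 GHz TSC rate, and the 8+4 / 8+8 byte layouts (one possible reading of the 12-16 byte figure) are all hypothetical.

```python
import struct
from dataclasses import dataclass

TSC_HZ = 2_000_000_000  # assumed: TSC rate known and constant between anchors
NSEC_PER_SEC = 1_000_000_000


@dataclass
class Anchor:
    gtod_ns: int  # GTOD time sampled at to-idle / from-idle / tick
    tsc: int      # TSC value sampled at the same moment


@dataclass
class TraceEntry:
    last_gtod_ns: int  # GTOD of the most recent anchor
    rel_tsc: int       # TSC cycles elapsed since that anchor
    event: str


def record(anchor, tsc_now, event):
    """Build a trace entry relative to the most recent anchor."""
    return TraceEntry(anchor.gtod_ns, tsc_now - anchor.tsc, event)


def entry_time_ns(entry):
    """Post-processing: anchor GTOD plus the scaled relative TSC."""
    return entry.last_gtod_ns + entry.rel_tsc * NSEC_PER_SEC // TSC_HZ


# Anchor taken at from-idle, then an event 200 cycles later.
anchor = Anchor(gtod_ns=10_000_000, tsc=3_000_000)
entry = record(anchor, tsc_now=3_000_200, event="sched_switch")
print(entry_time_ns(entry))

# Timestamp cost per entry: 64-bit GTOD plus a 32-bit or 64-bit relative TSC.
small = struct.calcsize("<QI")
large = struct.calcsize("<QQ")
print(small, large)
```

Because the relative TSC is only ever scaled across a window in which the counter was running, the idle gaps are absorbed by the GTOD half of the pair.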
Or we could upscale the GTOD to 'TSC time', at to-idle and from-idle.
Which is rather complicated with cpufreq - which frequency do we want to
upscale to if we have a box with three available frequencies? We could
ignore cpufreq altogether - but then there goes dependable tracing on
another range of boxes.
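
To make the cpufreq ambiguity concrete: the same idle period, measured in GTOD nanoseconds, maps to very different 'TSC time' depending on which frequency is picked. The three frequencies and the helper name below are invented for the example.

```python
# Hypothetical box with three available frequencies, in kHz.
AVAILABLE_KHZ = [1_000_000, 1_600_000, 2_400_000]


def gtod_ns_to_tsc_cycles(idle_ns, freq_khz):
    """Upscale a GTOD interval to TSC cycles at the given frequency.

    cycles = ns * (kHz * 1000 cycles/sec) / (10^9 ns/sec) = ns * kHz / 10^6
    """
    return idle_ns * freq_khz // 1_000_000


idle_ns = 5_000_000  # a 5 msec idle period as seen by GTOD
cycles = [gtod_ns_to_tsc_cycles(idle_ns, khz) for khz in AVAILABLE_KHZ]
spread = max(cycles) - min(cycles)
print(cycles, spread)
```

Whichever frequency is chosen, entries recorded while the CPU ran at one of the other two are scaled wrongly, which is why ignoring cpufreq costs accuracy on such boxes.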
> You also ignore the early trace issues, and have apparently not used
> it for FTRACE. [...]
i very much used early code tracing with ftrace in the past. In fact
once i debugged an early boot hang that happened so early that
_PRINTK_ was not functional yet (!).
So, to solve this bug, i hacked ftrace to use early_printk(), to print
out the last 10,000 functions executed before the hang - and that's how
i found the reason for the hang - i captured a huge trace via a serial
console. It was dead slow to capture, but it worked and sched_clock()
worked just fine in that kind of usecase as well.
[ Note that we added tracing/fastboot recently (for v2.6.28), to enable
  the tracing of early boot code timings. Haven't had a problem with it
  yet on x86. ]
> [...] You also ignore the fact that without TSC, it goes into the same
> "crap mode" that is appropriate for the scheduler, but totally useless
> for tracing.
i haven't used a TSC-less CPU in 10 years, so i'm not sure i get this
point of yours. (and IIRC the division by zero was exactly on such CPUs
where we divided by cpu_khz - that's why it could even regress.)
note that sched_clock() will use the TSC whenever it is there physically
- even if GTOD does not use it anymore.
> IOW, you say that I call your arguments BS without telling you why,
> but that's just because you apparently cut out all the things I _did_
> tell you why about!
>
> The fact is, people who do tracing will want better clocks - and have
> gotten with other infrastructure - than you have apparently cared
> about. You've worried about scheduler tracing, and you seem to want to
> just have everybody use a simple but known-bad approach that was good
> enough for you.
i wrote my first -pg/mcount based tracer about 11 years ago, to learn
more about the kernel. I traced everything with it. I then used it to
find performance bottlenecks in the kernel, and i used it to learn
kernel internals - when i saw a function in the trace that i did not
recognize, i read the source code.
Scheduler tracing came much later into the picture - the -pg tracer was
written well _before_ it was used for latency tracing purposes. But it
is indeed a pretty popular use of it. (but by no means the only one)
Ingo