From: Linus Torvalds <torvalds@linux-foundation.org>
To: Mathieu Desnoyers <compudj@krystal.dyndns.org>
Cc: Roland Dreier <rdreier@cisco.com>,
Masami Hiramatsu <mhiramat@redhat.com>,
Martin Bligh <mbligh@google.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Steven Rostedt <rostedt@goodmis.org>,
darren@dvhart.com, "Frank Ch. Eigler" <fche@redhat.com>,
systemtap-ml <systemtap@sources.redhat.com>
Subject: Re: Unified tracing buffer
Date: Mon, 22 Sep 2008 21:05:20 -0700 (PDT)
Message-ID: <alpine.LFD.1.10.0809222050370.3265@nehalem.linux-foundation.org>
In-Reply-To: <20080923033635.GK24937@Krystal>
On Mon, 22 Sep 2008, Mathieu Desnoyers wrote:
>
> Unless I am missing something, in the case we use an atomic operation
> which implies memory barriers (cmpxchg and atomic_add_return does), one
> can be sure that all memory operations done before the barrier are
> completed at the barrier and that all memory ops following the barrier
> will happen after.
Sure (if you have a barrier - not all architectures will imply that for an
increment).
But that still doesn't mean a thing.
You have two events (a) and (b), and you put trace-points on each. In your
trace, you see (a) before (b) by comparing the numbers. But what does that
mean?
The actual event that you traced is not the trace-point - the trace-point
is more like a fancy "printk". And the fact that one showed up before
another in the trace buffer, doesn't mean that the events _around_ the
trace happened in the same order.
You can use the barriers to make a partial ordering, and if you have a
separate tracepoint for entry into a region and exit, you can perhaps show
that they were totally disjoint. Or maybe they were partially overlapping,
and you'll never know exactly how they overlapped.
Example:
	trace(..);
	do_X();
being executed on two different CPU's. In the trace, CPU#1 was before
CPU#2. Does that mean that "do_X()" happened first on CPU#1?
No.
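To make that concrete, here is a minimal sketch in plain C that replays one
possible interleaving by hand. It is illustrative only: replay(), NCPUS and
the schedule array are hypothetical names, the CPUs are numbered 0 and 1
rather than #1 and #2, and "trace" and "do_X" are reduced to recording which
CPU got there first.

```c
#include <stddef.h>

#define NCPUS 2

/*
 * Replay a hand-written schedule. Each entry names the CPU that takes its
 * next step. A CPU's first step stands in for trace(), its second step
 * stands in for do_X(). The two output arrays record the order in which
 * the CPUs reached each step.
 */
void replay(const int *schedule, size_t n,
            int trace_order[NCPUS], int op_order[NCPUS])
{
    int steps_done[NCPUS] = {0};
    size_t nt = 0, no = 0;

    for (size_t i = 0; i < n; i++) {
        int cpu = schedule[i];

        if (cpu < 0 || cpu >= NCPUS)
            continue;                      /* ignore bogus entries */
        if (steps_done[cpu] == 0 && nt < NCPUS)
            trace_order[nt++] = cpu;       /* this CPU's trace() */
        else if (steps_done[cpu] == 1 && no < NCPUS)
            op_order[no++] = cpu;          /* this CPU's do_X() */
        steps_done[cpu]++;
    }
}
```

With the schedule {0, 1, 1, 0}, CPU 0 traces first, but CPU 1 runs its do_X()
first: the trace order and the operation order disagree, and nothing in the
trace reveals that.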
The only way to show that would be to put a lock around the whole trace
_and_ operation X, ie
	spin_lock(..);
	trace(..);
	do_X();
	spin_unlock(..);
and now, if CPU#1 shows up in the trace first, then you know that do_X()
really did happen first on CPU#1. Otherwise you basically know *nothing*,
and the ordering of the trace events was totally and utterly meaningless.
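A userspace sketch of the locked variant, assuming POSIX threads as a
stand-in for kernel spinlocks. Everything here (traced_do_X(), run_demo(),
the two log arrays) is a hypothetical illustration, not kernel code: trace()
and do_X() just append the caller's id to separate logs so the two orders
can be compared afterwards.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_EVENTS  64
#define MAX_THREADS 8

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int trace_log[MAX_EVENTS];   /* ids in the order trace() ran */
static int op_log[MAX_EVENTS];      /* ids in the order do_X() ran  */
static size_t trace_len, op_len;

static void trace(int cpu) { if (trace_len < MAX_EVENTS) trace_log[trace_len++] = cpu; }
static void do_X(int cpu)  { if (op_len < MAX_EVENTS) op_log[op_len++] = cpu; }

/* The trace and the operation sit inside the same critical section. */
void traced_do_X(int cpu)
{
    pthread_mutex_lock(&lock);
    trace(cpu);
    do_X(cpu);
    pthread_mutex_unlock(&lock);
}

/* Returns 1 if the trace order equals the operation order. */
int orders_match(void)
{
    if (trace_len != op_len)
        return 0;
    for (size_t i = 0; i < trace_len; i++)
        if (trace_log[i] != op_log[i])
            return 0;
    return 1;
}

struct worker { int cpu; int iters; };

static void *worker_main(void *arg)
{
    const struct worker *w = arg;
    for (int i = 0; i < w->iters; i++)
        traced_do_X(w->cpu);
    return NULL;
}

/* Runs nthreads workers; returns orders_match(), or -1 on thread-creation failure. */
int run_demo(int nthreads, int iters)
{
    pthread_t tid[MAX_THREADS];
    struct worker w[MAX_THREADS];
    int created = 0;

    if (nthreads > MAX_THREADS)
        nthreads = MAX_THREADS;
    trace_len = op_len = 0;
    for (; created < nthreads; created++) {
        w[created].cpu = created;
        w[created].iters = iters;
        if (pthread_create(&tid[created], NULL, worker_main, &w[created]) != 0)
            break;
    }
    for (int i = 0; i < created; i++)
        pthread_join(tid[i], NULL);
    return created == nthreads ? orders_match() : -1;
}
```

Because both steps happen under one lock, the two logs always agree, whatever
the scheduler does. Drop the lock and that guarantee goes away (and the logs
themselves become racy).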
See? Trace events themselves may be ordered, but the point of the trace
event is never to know the ordering of the trace itself - it's to know the
ordering of the code we're interested in tracing. The ordering of the
trace events themselves is irrelevant and not useful.
And I'd rather see people _understand_ that, than if they think the
ordering is somehow something they can trust.
Btw, if you _do_ have locking, then you can also know that the "do_X()"
operations will be essentially as far apart in some theoretical notion of
"time" (let's imagine that we do have global time, even if we don't) as
the cost of the trace operation and do_X() itself.
So if we _do_ have locking (and thus a valid ordering that actually can
matter), then the TSC doesn't even have to be synchronized on a cycle
basis across CPU's - it just needs to be close enough that you can tell
which one happened first (and with ordering, that's a valid thing to do).
So you don't even need "perfect" synchronization, you just need something
reasonably close, and you'll be able to see ordering from TSC counts
without having that horrible bouncing cross-CPU thing that will impact
performance a lot.
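The "close enough" condition can be written down as a small check. This is a
sketch under stated assumptions: min_gap_cycles is a hypothetical lower bound
on how far apart two locked sections are (roughly the cost of the trace plus
do_X()), and max_skew_cycles is a hypothetical bound on how far the two CPUs'
TSCs disagree. Neither number comes from this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * If the real gap between two events is at least min_gap_cycles and the
 * clocks disagree by at most max_skew_cycles, then the observed difference
 * is at least (min_gap_cycles - max_skew_cycles). The comparison below is
 * therefore trustworthy exactly when the skew is smaller than the gap.
 */
bool tsc_order_is_trustworthy(uint64_t min_gap_cycles, uint64_t max_skew_cycles)
{
    return max_skew_cycles < min_gap_cycles;
}

/* Returns <0 if a came first, >0 if b came first, 0 if the stamps are equal. */
int tsc_compare(uint64_t tsc_a, uint64_t tsc_b)
{
    return (tsc_a > tsc_b) - (tsc_a < tsc_b);
}
```

For example, with a 2000-cycle locked section and TSCs that agree to within
500 cycles, tsc_compare() gives the right answer; with a 100-cycle section it
might not.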
Quite frankly, I suspect that anybody who wants to have a global counter
might as well almost just have a global ring-buffer. The trace events
aren't going to be CPU-local anyway if you need to always update a shared
cacheline - and you might as well make the shared cacheline be the ring
buffer head with a spinlock in it.
That may not be _quite_ true, but it's probably close enough.
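A rough userspace sketch of what "ring buffer head with a spinlock in it"
could look like, assuming a pthread mutex as a stand-in for the spinlock.
The struct layout, RING_SIZE, ring_write() and ring_read() are all
hypothetical names for illustration, not an existing kernel interface.

```c
#include <pthread.h>
#include <stdint.h>

#define RING_SIZE 256               /* number of slots; oldest get overwritten */

struct trace_entry {
    uint64_t seq;                   /* global order, taken from the head */
    int cpu;
    int event;
};

struct global_ring {
    pthread_mutex_t lock;           /* protects head and entries */
    uint64_t head;                  /* next sequence number to hand out */
    struct trace_entry entries[RING_SIZE];
};

static struct global_ring ring = { .lock = PTHREAD_MUTEX_INITIALIZER };

/* Append one event; the returned sequence number doubles as the global counter. */
uint64_t ring_write(int cpu, int event)
{
    pthread_mutex_lock(&ring.lock);
    uint64_t seq = ring.head++;
    struct trace_entry *e = &ring.entries[seq % RING_SIZE];
    e->seq = seq;
    e->cpu = cpu;
    e->event = event;
    pthread_mutex_unlock(&ring.lock);
    return seq;
}

/* Copy out entry seq if it has been written and not yet overwritten; returns 1 on success. */
int ring_read(uint64_t seq, struct trace_entry *out)
{
    int ok = 0;

    pthread_mutex_lock(&ring.lock);
    if (seq < ring.head && ring.head - seq <= RING_SIZE) {
        *out = ring.entries[seq % RING_SIZE];
        ok = 1;
    }
    pthread_mutex_unlock(&ring.lock);
    return ok;
}
```

Every writer already has to take the shared cacheline to bump the head, so
the sequence number comes for free with the buffer slot, which is the point
being made above about a separate global counter.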
Linus