From: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
To: "Luck, Tony" <tony.luck@intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Andrew Morton <akpm@linux-foundation.org>,
Ingo Molnar <mingo@elte.hu>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Thomas Gleixner <tglx@linutronix.de>,
David Miller <davem@davemloft.net>,
Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
"ltt-dev@lists.casi.polymtl.ca" <ltt-dev@lists.casi.polymtl.ca>
Subject: Re: [RFC patch 15/15] LTTng timestamp x86
Date: Wed, 22 Oct 2008 12:51:29 -0400
Message-ID: <20081022165129.GD12650@Krystal>
In-Reply-To: <57C9024A16AD2D4C97DC78E552063EA353346068@orsmsx505.amr.corp.intel.com>
* Luck, Tony (tony.luck@intel.com) wrote:
> > And what do we say when we detect this ? "sorry, please upgrade your
> > hardware to get a reliable trace" ? ;)
>
> My employer might be happy with that answer ;-) ... but I think
> we could tell the user to:
>
> 1) adjust something in /sys/...
> 2) boot with some special option
> 3) rebuild kernel with CONFIG_INSANE_TSC=y
>
> to switch over to a heavyweight workaround in s/w. Systems
> that require this are already in the minority ... and I
> think (hope!) that current and future generations of cpus
> won't have these challenges.
>
> So this is mostly a campaign for the default code path to
> be based on current (sane) TSC behaviour ... with the workarounds
> for past problems kept to one side.
>
This is exactly what I do in this patchset actually :) The common case,
when a synchronized TSC is detected, is to do a plain TSC read. However,
if a non-synchronized TSC is detected, a warning message is written to
the console (pointing to some documentation to get precise timestamping)
and the heavyweight cmpxchg-based workaround is enabled.
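A minimal user-space sketch of the two paths, using C11 atomics rather
than kernel primitives. All names here (trace_clock_read,
tsc_is_synchronized, last_tsc, read_raw_counter, fake_counter) are
hypothetical and not taken from the patchset; read_raw_counter stands in
for the raw TSC read so that the example runs anywhere.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag: would be set once by a TSC synchronicity test. */
static bool tsc_is_synchronized = true;

/* Last timestamp handed out on the slow path, shared by all CPUs. */
static _Atomic uint64_t last_tsc;

/* Stand-in for a raw per-CPU cycle counter read (e.g. rdtsc on x86). */
static uint64_t fake_counter;
static uint64_t read_raw_counter(void)
{
    return fake_counter;
}

static uint64_t trace_clock_read(void)
{
    uint64_t now = read_raw_counter();

    if (tsc_is_synchronized)
        return now;                 /* common case: plain counter read */

    /*
     * Workaround path: never return a value lower than or equal to one
     * already handed out, even if this CPU's counter lags another's.
     */
    uint64_t last = atomic_load(&last_tsc);
    for (;;) {
        uint64_t next = now > last ? now : last + 1;

        if (atomic_compare_exchange_weak(&last_tsc, &last, next))
            return next;
        /* The failed cmpxchg reloaded 'last'; recompute and retry. */
    }
}
```

The slow path costs one shared cache line bounce per timestamp, which is
why it makes sense to keep it off the default code path.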
> > Nope, this is not required. I removed the heartbeat event from LTTng two
> > weeks ago, implementing detection of the delta from the last timestamp
> > written into the trace. If we detect that the new timestamp is too far
> > from the previous one, we write the full 64 bits TSC in an extended
> > event header. Therefore, we have no dependency on interrupt latency to
> > get a sane time-base.
>
> Neat. Could you grab the HPET value here too?
>
Yes, I could. When I detect that the TSC value is too far apart from the
previous one, I reserve extra space for the header (this could include
an extra 64 bits for the HPET). At that moment, I could also sample the
HPET, given that this happens relatively rarely.
Given that the frequency is expected to be about 1GHz, the 27 bits would
overflow 7-8 times per second. The only thing is that I only need this
extended field when there are absolutely no events in the stream for
a whole overflow period, which is the only case that makes the overflow
impossible to detect without having more TSC bits. In the common case
where there is a steady flow of events, we would never have such a "large
TSC header" event.
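For reference, 2^27 cycles at 1GHz is about 0.134 s, i.e. roughly 7.45
overflows per second. A minimal sketch of the compact/extended header
decision and of the reader-side reconstruction follows. The struct
layout and the names (event_header, make_header, rebuild_tsc) are
hypothetical and do not reflect the actual LTTng trace format.

```c
#include <stdbool.h>
#include <stdint.h>

#define TSC_BITS 27
#define TSC_MASK ((UINT64_C(1) << TSC_BITS) - 1)

/* Hypothetical in-memory view of an event header. */
struct event_header {
    bool     extended;  /* true when full_tsc is valid */
    uint32_t tsc_low;   /* low 27 bits of the TSC */
    uint64_t full_tsc;  /* full 64-bit TSC, extended headers only */
};

/* Writer side: go extended only when a whole overflow period elapsed. */
static struct event_header make_header(uint64_t *last_tsc, uint64_t tsc)
{
    struct event_header h = { false, (uint32_t)(tsc & TSC_MASK), 0 };

    if (tsc - *last_tsc > TSC_MASK) {
        h.extended = true;
        h.full_tsc = tsc;
    }
    *last_tsc = tsc;
    return h;
}

/* Reader side: rebuild the 64-bit TSC from the previous event's value. */
static uint64_t rebuild_tsc(uint64_t prev, const struct event_header *h)
{
    if (h->extended)
        return h->full_tsc;

    uint64_t t = (prev & ~TSC_MASK) | h->tsc_low;
    if (t < prev)
        t += TSC_MASK + 1;  /* the low bits wrapped exactly once */
    return t;
}
```

Since a compact header is only written when the delta is below 2^27
cycles, the reader never has to guess between zero and several wraps.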
However, I could do something slightly different from the large TSC
header detection. I could make sure an HPET sample is taken
"periodically", or at least periodically when there are events saved to
the buffer, by saving, for each buffer, the last TSC value at which the
HPET was sampled. When we log subsequent events, we take an HPET
sample (and write an extended event header) if we are too far apart
from the previous sample.
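A minimal sketch of that per-buffer check. The names (trace_buffer,
need_hpet_sample, max_gap) are hypothetical, and the threshold is left
to the caller because no value is given here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-buffer state for HPET sampling. */
struct trace_buffer {
    uint64_t last_hpet_sample_tsc;  /* TSC at the last HPET sample */
    bool     has_sample;            /* false until the first sample */
};

/*
 * Returns true when the event logged at 'tsc' should also carry an HPET
 * sample (and therefore an extended header). 'max_gap' is the largest
 * TSC distance tolerated between two samples, in cycles.
 */
static bool need_hpet_sample(struct trace_buffer *buf, uint64_t tsc,
                             uint64_t max_gap)
{
    if (buf->has_sample && tsc - buf->last_hpet_sample_tsc < max_gap)
        return false;

    buf->last_hpet_sample_tsc = tsc;
    buf->has_sample = true;
    return true;
}
```

The HPET is only touched when an event is actually being logged, so an
idle buffer generates no HPET traffic at all.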
We would probably need to sample the HPET at subbuffer switch too to
allow fast time-based seek on the trace when we read it.
We could then do a pre-processing pass on the trace buffers which would
calculate the linear interpolation of cycle counters between the
per-buffer HPET values. The nice thing is that we know the _next_ value
coming _after_ an event (which is not the case for a standard kernel
time-base), so we can be a bit more precise and we do not suffer from
things like "the TSC of a given cpu accelerates and time appears to go
backwards when the next HPET sample is taken".
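A minimal sketch of that post-processing step, assuming each event sits
between two (TSC, HPET) sample pairs taken on the same buffer. The names
(hpet_sample, interpolate_hpet) are hypothetical. Plain 64-bit integer
math is used for brevity, which assumes the two spans are small enough
for their product not to overflow.

```c
#include <stdint.h>

/* Hypothetical pair recorded whenever the HPET is sampled. */
struct hpet_sample {
    uint64_t tsc;   /* cycle counter at sampling time */
    uint64_t hpet;  /* HPET counter at sampling time */
};

/*
 * Map an event's TSC onto the HPET time-base by interpolating linearly
 * between the sample taken before the event and the one taken after it.
 * Requires before.tsc <= tsc <= after.tsc.
 */
static uint64_t interpolate_hpet(struct hpet_sample before,
                                 struct hpet_sample after, uint64_t tsc)
{
    uint64_t tsc_span = after.tsc - before.tsc;
    uint64_t hpet_span = after.hpet - before.hpet;

    if (tsc_span == 0)
        return before.hpet;  /* degenerate interval, nothing to scale */

    /* Assumes (tsc - before.tsc) * hpet_span fits in 64 bits. */
    return before.hpet + (tsc - before.tsc) * hpet_span / tsc_span;
}
```

Because both endpoints are known, the result always stays within
[before.hpet, after.hpet], so time cannot step backwards across a
sample boundary.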
But I am not sure this would be sufficient to ensure generally correct
event order; the maximum interpolation error can become quite large on
systems which have different clock speeds in halt states and which
happen to have a non-steady flow of events.
>
> > (8 cores up)
>
> Interesting results. I'm not at all sure why HPET scales so badly.
> Maybe some h/w throttling/synchronizing going on???
>
As Linus said, there is probably some contention at the IO hub level.
But it implies that we have to be careful about the frequency at which
we sample the HPET, otherwise it wouldn't scale.
Mathieu
> -Tony
>
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68