* [patch 00/17] Trace Clock v3
@ 2008-11-12 23:15 Mathieu Desnoyers
2008-11-12 23:15 ` [patch 01/17] get_cycles() : kconfig HAVE_GET_CYCLES Mathieu Desnoyers
` (16 more replies)
0 siblings, 17 replies; 21+ messages in thread
From: Mathieu Desnoyers @ 2008-11-12 23:15 UTC (permalink / raw)
To: Linus Torvalds, akpm, Ingo Molnar, Peter Zijlstra, Steven Rostedt,
linux-kernel
Hi,
In this new version, I have integrated the changes I made following the comments
I received for v2. I also reimplemented the "generic" trace clock so it is less
intrusive and simply uses a standard timer. It is not as bound to the xtime_lock
as it previously was, and it no longer has to modify kernel/time.c.
I think this is pretty close to a mergeable state. I plan to keep more exotic
features, such as dealing with non-synchronized TSCs without cache-line
bouncing, for future improvement. As a reminder, when the trace clock detects an
unsynchronized TSC on the machine, it prints which kernel command-line arguments
must be used so the user can get synchronized timestamp counters. However, if
the user wants to run the system with non-synchronized TSCs, the
cache-line-bouncing fallback is used.
This patchset applies on top of 2.6.28-rc4.
Mathieu
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 21+ messages in thread
* [patch 01/17] get_cycles() : kconfig HAVE_GET_CYCLES
2008-11-12 23:15 [patch 00/17] Trace Clock v3 Mathieu Desnoyers
@ 2008-11-12 23:15 ` Mathieu Desnoyers
2008-11-12 23:15 ` [patch 02/17] get_cycles() : x86 HAVE_GET_CYCLES Mathieu Desnoyers
` (15 subsequent siblings)
16 siblings, 0 replies; 21+ messages in thread
From: Mathieu Desnoyers @ 2008-11-12 23:15 UTC (permalink / raw)
To: Linus Torvalds, akpm, Ingo Molnar, Peter Zijlstra, Steven Rostedt,
linux-kernel
Cc: Mathieu Desnoyers, David Miller, Ingo Molnar, Thomas Gleixner,
linux-arch
[-- Attachment #1: get-cycles-kconfig-have-get-cycles.patch --]
[-- Type: text/plain, Size: 1935 bytes --]
Create a new "HAVE_GET_CYCLES" architecture option to specify which
architectures provide 64-bit TSC counters readable with get_cycles(). It's
principally useful for enabling high-precision tracing code only on such
architectures, without even bothering to build it on architectures which lack
such support.
It also requires architectures to provide get_cycles_barrier() and
get_cycles_rate().
I mainly use it for the "priority-sifting rwlock" latency tracing code, which
traces the worst-case latency induced by the locking. It also provides the basic
changes needed for the LTTng timestamping infrastructure.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: David Miller <davem@davemloft.net>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Ingo Molnar <mingo@redhat.com>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: linux-arch@vger.kernel.org
---
init/Kconfig | 10 ++++++++++
1 file changed, 10 insertions(+)
Index: linux.trees.git/init/Kconfig
===================================================================
--- linux.trees.git.orig/init/Kconfig 2008-11-07 00:34:11.000000000 -0500
+++ linux.trees.git/init/Kconfig 2008-11-12 17:58:05.000000000 -0500
@@ -330,6 +330,16 @@ config CPUSETS
config HAVE_UNSTABLE_SCHED_CLOCK
bool
+#
+# Architectures with a 64-bits get_cycles() should select this.
+# They should also define
+# get_cycles_barrier() : instruction synchronization barrier if required
+# get_cycles_rate() : cycle counter rate, in HZ. If 0, TSC are not synchronized
+# across CPUs or their frequency may vary due to frequency scaling.
+#
+config HAVE_GET_CYCLES
+ def_bool n
+
config GROUP_SCHED
bool "Group CPU scheduler"
depends on EXPERIMENTAL
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
* [patch 02/17] get_cycles() : x86 HAVE_GET_CYCLES
2008-11-12 23:15 [patch 00/17] Trace Clock v3 Mathieu Desnoyers
2008-11-12 23:15 ` [patch 01/17] get_cycles() : kconfig HAVE_GET_CYCLES Mathieu Desnoyers
@ 2008-11-12 23:15 ` Mathieu Desnoyers
2008-11-13 7:15 ` Geert Uytterhoeven
2008-11-12 23:15 ` [patch 03/17] get_cycles() : sparc64 HAVE_GET_CYCLES Mathieu Desnoyers
` (14 subsequent siblings)
16 siblings, 1 reply; 21+ messages in thread
From: Mathieu Desnoyers @ 2008-11-12 23:15 UTC (permalink / raw)
To: Linus Torvalds, akpm, Ingo Molnar, Peter Zijlstra, Steven Rostedt,
linux-kernel
Cc: Mathieu Desnoyers, David Miller, Ingo Molnar, Thomas Gleixner,
linux-arch
[-- Attachment #1: get-cycles-x86-have-get-cycles.patch --]
[-- Type: text/plain, Size: 1875 bytes --]
This patch selects HAVE_GET_CYCLES and makes sure get_cycles_barrier() and
get_cycles_rate() are implemented.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: David Miller <davem@davemloft.net>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Ingo Molnar <mingo@redhat.com>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: linux-arch@vger.kernel.org
---
arch/x86/Kconfig | 1 +
arch/x86/include/asm/tsc.h | 12 ++++++++++++
2 files changed, 13 insertions(+)
Index: linux.trees.git/arch/x86/Kconfig
===================================================================
--- linux.trees.git.orig/arch/x86/Kconfig 2008-11-12 18:15:25.000000000 -0500
+++ linux.trees.git/arch/x86/Kconfig 2008-11-12 18:15:28.000000000 -0500
@@ -20,6 +20,7 @@ config X86
def_bool y
select HAVE_AOUT if X86_32
select HAVE_UNSTABLE_SCHED_CLOCK
+ select HAVE_GET_CYCLES
select HAVE_IDE
select HAVE_OPROFILE
select HAVE_IOREMAP_PROT
Index: linux.trees.git/arch/x86/include/asm/tsc.h
===================================================================
--- linux.trees.git.orig/arch/x86/include/asm/tsc.h 2008-11-12 18:15:25.000000000 -0500
+++ linux.trees.git/arch/x86/include/asm/tsc.h 2008-11-12 18:15:28.000000000 -0500
@@ -56,6 +56,18 @@ extern void mark_tsc_unstable(char *reas
extern int unsynchronized_tsc(void);
int check_tsc_unstable(void);
+static inline cycles_t get_cycles_rate(void)
+{
+ if (check_tsc_unstable())
+ return 0;
+ return tsc_khz;
+}
+
+static inline void get_cycles_barrier(void)
+{
+ rdtsc_barrier();
+}
+
/*
* Boot-time check whether the TSCs are synchronized across
* all CPUs/cores:
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
* [patch 03/17] get_cycles() : sparc64 HAVE_GET_CYCLES
2008-11-12 23:15 [patch 00/17] Trace Clock v3 Mathieu Desnoyers
2008-11-12 23:15 ` [patch 01/17] get_cycles() : kconfig HAVE_GET_CYCLES Mathieu Desnoyers
2008-11-12 23:15 ` [patch 02/17] get_cycles() : x86 HAVE_GET_CYCLES Mathieu Desnoyers
@ 2008-11-12 23:15 ` Mathieu Desnoyers
2008-11-12 23:15 ` [patch 04/17] get_cycles() : powerpc64 HAVE_GET_CYCLES Mathieu Desnoyers
` (13 subsequent siblings)
16 siblings, 0 replies; 21+ messages in thread
From: Mathieu Desnoyers @ 2008-11-12 23:15 UTC (permalink / raw)
To: Linus Torvalds, akpm, Ingo Molnar, Peter Zijlstra, Steven Rostedt,
linux-kernel
Cc: Mathieu Desnoyers, David S. Miller, Ingo Molnar, Thomas Gleixner,
linux-arch
[-- Attachment #1: get-cycles-sparc64-have-get-cycles.patch --]
[-- Type: text/plain, Size: 2763 bytes --]
This patch selects HAVE_GET_CYCLES and makes sure get_cycles_barrier() and
get_cycles_rate() are implemented.
Changelog :
- Use tb_ticks_per_usec * 1000000 in get_cycles_rate().
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Acked-by: David S. Miller <davem@davemloft.net>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Ingo Molnar <mingo@redhat.com>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: linux-arch@vger.kernel.org
---
arch/sparc/include/asm/timex_64.h | 19 ++++++++++++++++++-
arch/sparc64/Kconfig | 1 +
arch/sparc64/kernel/time.c | 3 ++-
3 files changed, 21 insertions(+), 2 deletions(-)
Index: linux.trees.git/arch/sparc64/Kconfig
===================================================================
--- linux.trees.git.orig/arch/sparc64/Kconfig 2008-11-07 00:34:11.000000000 -0500
+++ linux.trees.git/arch/sparc64/Kconfig 2008-11-12 18:00:04.000000000 -0500
@@ -13,6 +13,7 @@ config SPARC64
default y
select HAVE_FUNCTION_TRACER
select HAVE_IDE
+ select HAVE_GET_CYCLES
select HAVE_LMB
select HAVE_ARCH_KGDB
select USE_GENERIC_SMP_HELPERS if SMP
Index: linux.trees.git/arch/sparc/include/asm/timex_64.h
===================================================================
--- linux.trees.git.orig/arch/sparc/include/asm/timex_64.h 2008-11-07 00:34:11.000000000 -0500
+++ linux.trees.git/arch/sparc/include/asm/timex_64.h 2008-11-12 18:00:04.000000000 -0500
@@ -12,7 +12,24 @@
/* Getting on the cycle counter on sparc64. */
typedef unsigned long cycles_t;
-#define get_cycles() tick_ops->get_tick()
+
+static inline cycles_t get_cycles(void)
+{
+ return tick_ops->get_tick();
+}
+
+/* get_cycles instruction is synchronized on sparc64 */
+static inline void get_cycles_barrier(void)
+{
+ return;
+}
+
+extern unsigned long tb_ticks_per_usec;
+
+static inline cycles_t get_cycles_rate(void)
+{
+ return tb_ticks_per_usec * 1000000UL;
+}
#define ARCH_HAS_READ_CURRENT_TIMER
Index: linux.trees.git/arch/sparc64/kernel/time.c
===================================================================
--- linux.trees.git.orig/arch/sparc64/kernel/time.c 2008-11-07 00:34:11.000000000 -0500
+++ linux.trees.git/arch/sparc64/kernel/time.c 2008-11-12 18:00:04.000000000 -0500
@@ -793,7 +793,8 @@ static void __init setup_clockevent_mult
sparc64_clockevent.mult = mult;
}
-static unsigned long tb_ticks_per_usec __read_mostly;
+unsigned long tb_ticks_per_usec __read_mostly;
+EXPORT_SYMBOL_GPL(tb_ticks_per_usec);
void __delay(unsigned long loops)
{
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
* [patch 04/17] get_cycles() : powerpc64 HAVE_GET_CYCLES
2008-11-12 23:15 [patch 00/17] Trace Clock v3 Mathieu Desnoyers
` (2 preceding siblings ...)
2008-11-12 23:15 ` [patch 03/17] get_cycles() : sparc64 HAVE_GET_CYCLES Mathieu Desnoyers
@ 2008-11-12 23:15 ` Mathieu Desnoyers
2008-11-12 23:15 ` [patch 05/17] get_cycles() : MIPS HAVE_GET_CYCLES_32 Mathieu Desnoyers
` (12 subsequent siblings)
16 siblings, 0 replies; 21+ messages in thread
From: Mathieu Desnoyers @ 2008-11-12 23:15 UTC (permalink / raw)
To: Linus Torvalds, akpm, Ingo Molnar, Peter Zijlstra, Steven Rostedt,
linux-kernel
Cc: Mathieu Desnoyers, benh, paulus, David Miller, Ingo Molnar,
Thomas Gleixner, linux-arch
[-- Attachment #1: get-cycles-powerpc-have-get-cycles.patch --]
[-- Type: text/plain, Size: 1977 bytes --]
This patch selects HAVE_GET_CYCLES and makes sure get_cycles_barrier() and
get_cycles_rate() are implemented.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: benh@kernel.crashing.org
CC: paulus@samba.org
CC: David Miller <davem@davemloft.net>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Ingo Molnar <mingo@redhat.com>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: linux-arch@vger.kernel.org
---
arch/powerpc/Kconfig | 1 +
arch/powerpc/include/asm/timex.h | 11 +++++++++++
2 files changed, 12 insertions(+)
Index: linux.trees.git/arch/powerpc/Kconfig
===================================================================
--- linux.trees.git.orig/arch/powerpc/Kconfig 2008-11-07 00:34:11.000000000 -0500
+++ linux.trees.git/arch/powerpc/Kconfig 2008-11-12 18:00:10.000000000 -0500
@@ -121,6 +121,7 @@ config PPC
select HAVE_DMA_ATTRS if PPC64
select USE_GENERIC_SMP_HELPERS if SMP
select HAVE_OPROFILE
+ select HAVE_GET_CYCLES if PPC64
config EARLY_PRINTK
bool
Index: linux.trees.git/arch/powerpc/include/asm/timex.h
===================================================================
--- linux.trees.git.orig/arch/powerpc/include/asm/timex.h 2008-11-07 00:34:11.000000000 -0500
+++ linux.trees.git/arch/powerpc/include/asm/timex.h 2008-11-12 18:00:10.000000000 -0500
@@ -7,6 +7,7 @@
* PowerPC architecture timex specifications
*/
+#include <asm/time.h>
#include <asm/cputable.h>
#include <asm/reg.h>
@@ -46,5 +47,15 @@ static inline cycles_t get_cycles(void)
#endif
}
+static inline cycles_t get_cycles_rate(void)
+{
+ return tb_ticks_per_sec;
+}
+
+static inline void get_cycles_barrier(void)
+{
+ isync();
+}
+
#endif /* __KERNEL__ */
#endif /* _ASM_POWERPC_TIMEX_H */
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
* [patch 05/17] get_cycles() : MIPS HAVE_GET_CYCLES_32
2008-11-12 23:15 [patch 00/17] Trace Clock v3 Mathieu Desnoyers
` (3 preceding siblings ...)
2008-11-12 23:15 ` [patch 04/17] get_cycles() : powerpc64 HAVE_GET_CYCLES Mathieu Desnoyers
@ 2008-11-12 23:15 ` Mathieu Desnoyers
2008-11-12 23:15 ` [patch 06/17] Trace clock core Mathieu Desnoyers
` (11 subsequent siblings)
16 siblings, 0 replies; 21+ messages in thread
From: Mathieu Desnoyers @ 2008-11-12 23:15 UTC (permalink / raw)
To: Linus Torvalds, akpm, Ingo Molnar, Peter Zijlstra, Steven Rostedt,
linux-kernel
Cc: Mathieu Desnoyers, Ralf Baechle, David Miller, Ingo Molnar,
Thomas Gleixner, linux-arch
[-- Attachment #1: get-cycles-mips-have-get-cycles.patch --]
[-- Type: text/plain, Size: 2784 bytes --]
This partly reverts commit efb9ca08b5a2374b29938cdcab417ce4feb14b54. It selects
HAVE_GET_CYCLES_32 only on CPUs where it is safe to use it.
It currently considers the "_WORKAROUND" cases for the R4000 and R4400 to be
unsafe; other sub-architectures should probably be added to the blacklist.
HAVE_GET_CYCLES is not defined because MIPS does not provide a 64-bit TSC (only
32 bits).
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Ralf Baechle <ralf@linux-mips.org>
CC: David Miller <davem@davemloft.net>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Ingo Molnar <mingo@redhat.com>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: linux-arch@vger.kernel.org
---
arch/mips/Kconfig | 4 ++++
arch/mips/include/asm/timex.h | 25 +++++++++++++++++++++++++
2 files changed, 29 insertions(+)
Index: linux.trees.git/arch/mips/include/asm/timex.h
===================================================================
--- linux.trees.git.orig/arch/mips/include/asm/timex.h 2008-11-07 00:34:11.000000000 -0500
+++ linux.trees.git/arch/mips/include/asm/timex.h 2008-11-12 18:00:23.000000000 -0500
@@ -29,14 +29,39 @@
* which isn't an evil thing.
*
* We know that all SMP capable CPUs have cycle counters.
+ *
+ * Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
+ * HAVE_GET_CYCLES makes sure that this case is handled properly :
+ *
+ * Ralf Baechle <ralf@linux-mips.org> :
+ * This avoids us executing an mfc0 c0_count instruction on processors which
+ * don't have but also on certain R4000 and R4400 versions where reading from
+ * the count register just in the very moment when its value equals c0_compare
+ * will result in the timer interrupt getting lost.
*/
typedef unsigned int cycles_t;
+#ifdef HAVE_GET_CYCLES_32
+static inline cycles_t get_cycles(void)
+{
+ return read_c0_count();
+}
+
+static inline void get_cycles_barrier(void)
+{
+}
+
+static inline cycles_t get_cycles_rate(void)
+{
+ return CLOCK_TICK_RATE;
+}
+#else
static inline cycles_t get_cycles(void)
{
return 0;
}
+#endif
#endif /* __KERNEL__ */
Index: linux.trees.git/arch/mips/Kconfig
===================================================================
--- linux.trees.git.orig/arch/mips/Kconfig 2008-11-07 00:34:11.000000000 -0500
+++ linux.trees.git/arch/mips/Kconfig 2008-11-12 18:00:23.000000000 -0500
@@ -1611,6 +1611,10 @@ config CPU_R4000_WORKAROUNDS
config CPU_R4400_WORKAROUNDS
bool
+config HAVE_GET_CYCLES_32
+ def_bool y
+ depends on !CPU_R4400_WORKAROUNDS
+
#
# Use the generic interrupt handling code in kernel/irq/:
#
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
* [patch 06/17] Trace clock core
2008-11-12 23:15 [patch 00/17] Trace Clock v3 Mathieu Desnoyers
` (4 preceding siblings ...)
2008-11-12 23:15 ` [patch 05/17] get_cycles() : MIPS HAVE_GET_CYCLES_32 Mathieu Desnoyers
@ 2008-11-12 23:15 ` Mathieu Desnoyers
2008-11-12 23:15 ` [patch 07/17] Trace clock generic Mathieu Desnoyers
` (10 subsequent siblings)
16 siblings, 0 replies; 21+ messages in thread
From: Mathieu Desnoyers @ 2008-11-12 23:15 UTC (permalink / raw)
To: Linus Torvalds, akpm, Ingo Molnar, Peter Zijlstra, Steven Rostedt,
linux-kernel
Cc: Mathieu Desnoyers, Nicolas Pitre, Ralf Baechle, benh, paulus,
David Miller, Ingo Molnar, Thomas Gleixner, linux-arch
[-- Attachment #1: trace-clock-core.patch --]
[-- Type: text/plain, Size: 13796 bytes --]
32- to 64-bit clock extension. Extracts a 64-bit TSC from a [1..32]-bit
counter, kept up to date by a periodic timer interrupt. Lockless.
It's actually a specialized version of cnt_32_to_63.h which does the following
in addition :
- Uses per-cpu data to keep track of the counters.
- Limits cache-line bouncing.
- Supports machines with non-synchronized TSCs.
- Does not require read barriers, which can be slow on some architectures.
- Supports a full 64-bit counter (well, just one bit more than 63 is not really
a big deal when we talk about timestamp counters). If 2^64 is considered long
enough between overflows, 2^63 is normally considered long enough too.
- The periodic update of the value is ensured by the infrastructure. There is
no assumption that the counter is read frequently, because the events for
which tracing is enabled can be dynamically selected.
- Supports counters of various widths (32 bits and below) by changing the
HW_BITS define.
What cnt_32_to_63.h does that this patch doesn't do :
- It has a global counter, which removes the need to do an update periodically
on _each_ cpu. This can be important in a dynamic tick system where CPUs need
to sleep to save power. It is therefore well suited for systems reading a
global clock expected to be _exactly_ synchronized across cores (where time
can never ever go backward).
Q:
> do you actually use the RCU internals? or do you just reimplement an RCU
> algorithm?
>
A:
Nope, I don't use RCU internals in this code. Preempt disable seemed
like the best way to handle this utterly short code path and I wanted
the write side to be fast enough to be called periodically. What I do is:
- Disable preemption at the read-side :
it makes sure the pointer I get will point to a data structure that
will never change while I am in the preempt disabled code. (see *)
- I use per-cpu data to allow the read-side to be as fast as possible
(only need to disable preemption, does not race against other CPUs and
won't generate cache line bouncing). It also allows dealing with
unsynchronized TSCs if needed.
- Periodical write side : it's called from an IPI running on each CPU.
(*) We expect the read-side (preempt-off region) to last less than the
interval between IPI updates, so we can guarantee the data structure
it uses won't be modified underneath it. Since the IPI update is
launched every second or so (depending on the frequency of the counter we
are trying to extend), this is more than enough.
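The two-snapshot overflow handling described above can be sketched in
user space. This is a single-threaded illustration only: the kernel version
keeps this state per-CPU, disables preemption on the read side, and drives
the update from a timer/IPI at least once per hardware-counter wrap:

```c
#include <assert.h>
#include <stdint.h>

#define HW_BITS    32
#define HW_BITMASK ((1ULL << HW_BITS) - 1)

struct synthetic_tsc {
	uint64_t val[2];     /* two snapshots: current and next */
	unsigned int index;  /* which snapshot is current */
};

/* Periodic update: must run at least once per hardware-counter wrap. */
static void update_synthetic(struct synthetic_tsc *s, uint32_t hw)
{
	uint64_t cur = s->val[s->index];

	if (hw < (uint32_t)(cur & HW_BITMASK)) {
		/* Overflow: build the new value aside, then flip the index. */
		unsigned int new_index = 1 - s->index;

		s->val[new_index] = ((cur & ~HW_BITMASK) | hw)
				    + (1ULL << HW_BITS);
		s->index = new_index;
	} else {
		/* No overflow: only the low-order bits change. */
		s->val[s->index] = (cur & ~HW_BITMASK) | hw;
	}
}

/*
 * Read side: combine the saved upper bits with a fresh hardware read,
 * detecting a wrap that happened since the last periodic update.
 */
static uint64_t read_synthetic(const struct synthetic_tsc *s, uint32_t hw)
{
	uint64_t cur = s->val[s->index];

	if (hw < (uint32_t)(cur & HW_BITMASK))
		return ((cur & ~HW_BITMASK) | hw) + (1ULL << HW_BITS);
	return (cur & ~HW_BITMASK) | hw;
}
```

A wrap between two updates is caught on the read side (the fresh hardware
value is smaller than the saved low bits), so readers never see time go
backward as long as the updater runs at least once per wrap period.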
Changelog:
- Support [1..32] bits -> 64 bits.
I voluntarily limit the code to use at most 32 bits of the hardware clock for
performance considerations. If this is a problem it could be changed. Also, the
algorithm is aimed at 32-bit architectures; the code becomes much simpler on a
64-bit arch, since the updates can be done atomically.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Nicolas Pitre <nico@cam.org>
CC: Ralf Baechle <ralf@linux-mips.org>
CC: benh@kernel.crashing.org
CC: paulus@samba.org
CC: David Miller <davem@davemloft.net>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Ingo Molnar <mingo@redhat.com>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: linux-arch@vger.kernel.org
---
init/Kconfig | 12 +
kernel/Makefile | 3
kernel/trace/Makefile | 1
kernel/trace/trace-clock-32-to-64.c | 281 ++++++++++++++++++++++++++++++++++++
4 files changed, 295 insertions(+), 2 deletions(-)
Index: linux.trees.git/kernel/trace/Makefile
===================================================================
--- linux.trees.git.orig/kernel/trace/Makefile 2008-11-07 00:34:10.000000000 -0500
+++ linux.trees.git/kernel/trace/Makefile 2008-11-12 18:00:47.000000000 -0500
@@ -24,5 +24,6 @@ obj-$(CONFIG_NOP_TRACER) += trace_nop.o
obj-$(CONFIG_STACK_TRACER) += trace_stack.o
obj-$(CONFIG_MMIOTRACE) += trace_mmiotrace.o
obj-$(CONFIG_BOOT_TRACER) += trace_boot.o
+obj-$(CONFIG_HAVE_TRACE_CLOCK_32_TO_64) += trace-clock-32-to-64.o
libftrace-y := ftrace.o
Index: linux.trees.git/kernel/trace/trace-clock-32-to-64.c
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ linux.trees.git/kernel/trace/trace-clock-32-to-64.c 2008-11-12 18:00:31.000000000 -0500
@@ -0,0 +1,281 @@
+/*
+ * kernel/trace/trace-clock-32-to-64.c
+ *
+ * (C) Copyright 2006,2007,2008 -
+ * Mathieu Desnoyers (mathieu.desnoyers@polymtl.ca)
+ *
+ * Extends a 32 bits clock source to a full 64 bits count, readable atomically
+ * from any execution context.
+ *
+ * notes :
+ * - trace clock 32->64 bits extended timer-based clock cannot be used for early
+ * tracing in the boot process, as it depends on timer interrupts.
+ * - The timer is only on one CPU to support hotplug.
+ * - We have the choice between schedule_delayed_work_on and an IPI to get each
+ * CPU to write the heartbeat. IPI has been chosen because it is considered
+ * faster than passing through the timer to get the work scheduled on all the
+ * CPUs.
+ */
+
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/delay.h>
+#include <linux/timer.h>
+#include <linux/workqueue.h>
+#include <linux/cpu.h>
+#include <linux/timex.h>
+#include <linux/bitops.h>
+#include <linux/trace-clock.h>
+#include <linux/smp.h>
+#include <linux/sched.h> /* needed due to include order problem on m68k */
+
+/*
+ * Number of hardware clock bits. The higher order bits are expected to be 0.
+ * If the hardware clock source has more than 32 bits, the bits higher than the
+ * 32nd will be truncated by a cast to a 32 bits unsigned. Range : 1 - 32.
+ * (too few bits would be unrealistic though, since we depend on the timer to
+ * detect the overflows).
+ */
+#define HW_BITS 32
+
+#define HW_BITMASK ((1ULL << HW_BITS) - 1)
+#define HW_LS32(hw) ((hw) & HW_BITMASK)
+#define SW_MS32(sw) ((sw) & ~HW_BITMASK)
+
+/* Expected maximum interrupt latency in ms : 15ms, *2 for security */
+#define EXPECTED_INTERRUPT_LATENCY 30
+
+static DEFINE_MUTEX(synthetic_tsc_mutex);
+static int synthetic_tsc_refcount; /* Number of readers */
+static int synthetic_tsc_enabled; /* synth. TSC enabled on all online CPUs */
+
+static DEFINE_PER_CPU(struct timer_list, tsc_timer);
+static unsigned int precalc_expire;
+
+struct synthetic_tsc_struct {
+ union {
+ u64 val;
+ struct {
+#ifdef __BIG_ENDIAN
+ u32 ms32;
+ u32 ls32;
+#else
+ u32 ls32;
+ u32 ms32;
+#endif
+ } sel;
+ } tsc[2];
+ unsigned int index; /* Index of the current synth. tsc. */
+};
+
+static DEFINE_PER_CPU(struct synthetic_tsc_struct, synthetic_tsc);
+
+/* Called from IPI or timer interrupt */
+static void update_synthetic_tsc(void)
+{
+ struct synthetic_tsc_struct *cpu_synth;
+ u32 tsc;
+
+ cpu_synth = &per_cpu(synthetic_tsc, smp_processor_id());
+ tsc = trace_clock_read32(); /* Hardware clocksource read */
+
+ if (tsc < HW_LS32(cpu_synth->tsc[cpu_synth->index].sel.ls32)) {
+ unsigned int new_index = 1 - cpu_synth->index; /* 0 <-> 1 */
+ /*
+ * Overflow
+ * Non atomic update of the non current synthetic TSC, followed
+ * by an atomic index change. There is no write concurrency,
+ * so the index read/write does not need to be atomic.
+ */
+ cpu_synth->tsc[new_index].val =
+ (SW_MS32(cpu_synth->tsc[cpu_synth->index].val)
+ | (u64)tsc) + (1ULL << HW_BITS);
+ cpu_synth->index = new_index; /* atomic change of index */
+ } else {
+ /*
+ * No overflow : We know that the only bits changed are
+ * contained in the 32 LS32s, which can be written to atomically.
+ */
+ cpu_synth->tsc[cpu_synth->index].sel.ls32 =
+ SW_MS32(cpu_synth->tsc[cpu_synth->index].sel.ls32) | tsc;
+ }
+}
+
+/* Called from buffer switch : in _any_ context (even NMI) */
+u64 notrace trace_clock_read_synthetic_tsc(void)
+{
+ struct synthetic_tsc_struct *cpu_synth;
+ u64 ret;
+ unsigned int index;
+ u32 tsc;
+
+ preempt_disable_notrace();
+ cpu_synth = &per_cpu(synthetic_tsc, smp_processor_id());
+ index = cpu_synth->index; /* atomic read */
+ tsc = trace_clock_read32(); /* Hardware clocksource read */
+
+ /* Overflow detection */
+ if (unlikely(tsc < HW_LS32(cpu_synth->tsc[index].sel.ls32)))
+ ret = (SW_MS32(cpu_synth->tsc[index].val) | (u64)tsc)
+ + (1ULL << HW_BITS);
+ else
+ ret = SW_MS32(cpu_synth->tsc[index].val) | (u64)tsc;
+ preempt_enable_notrace();
+ return ret;
+}
+EXPORT_SYMBOL_GPL(trace_clock_read_synthetic_tsc);
+
+static void synthetic_tsc_ipi(void *info)
+{
+ update_synthetic_tsc();
+}
+
+/*
+ * tsc_timer_fct : - Timer function synchronizing synthetic TSC.
+ * @data: unused
+ *
+ * Guarantees at least 1 execution before low word of TSC wraps.
+ */
+static void tsc_timer_fct(unsigned long data)
+{
+ update_synthetic_tsc();
+
+ per_cpu(tsc_timer, smp_processor_id()).expires =
+ jiffies + precalc_expire;
+ add_timer_on(&per_cpu(tsc_timer, smp_processor_id()),
+ smp_processor_id());
+}
+
+/*
+ * precalc_stsc_interval: - Precalculates the interval between the clock
+ * wraparounds.
+ */
+static int __init precalc_stsc_interval(void)
+{
+ precalc_expire =
+ (HW_BITMASK / ((trace_clock_frequency() / HZ
+ * trace_clock_freq_scale()) << 1)
+ - 1 - (EXPECTED_INTERRUPT_LATENCY * HZ / 1000)) >> 1;
+ WARN_ON(precalc_expire == 0);
+ printk(KERN_DEBUG "Synthetic TSC timer will fire each %u jiffies.\n",
+ precalc_expire);
+ return 0;
+}
+
+static void prepare_synthetic_tsc(int cpu)
+{
+ struct synthetic_tsc_struct *cpu_synth;
+ u64 local_count;
+
+ cpu_synth = &per_cpu(synthetic_tsc, cpu);
+ local_count = trace_clock_read_synthetic_tsc();
+ cpu_synth->tsc[0].val = local_count;
+ cpu_synth->index = 0;
+ smp_wmb(); /* Writing in data of CPU about to come up */
+ init_timer(&per_cpu(tsc_timer, cpu));
+ per_cpu(tsc_timer, cpu).function = tsc_timer_fct;
+ per_cpu(tsc_timer, cpu).expires = jiffies + precalc_expire;
+}
+
+static void enable_synthetic_tsc(int cpu)
+{
+ smp_call_function_single(cpu, synthetic_tsc_ipi, NULL, 1);
+ add_timer_on(&per_cpu(tsc_timer, cpu), cpu);
+}
+
+static void disable_synthetic_tsc(int cpu)
+{
+ del_timer_sync(&per_cpu(tsc_timer, cpu));
+}
+
+/*
+ * hotcpu_callback - CPU hotplug callback
+ * @nb: notifier block
+ * @action: hotplug action to take
+ * @hcpu: CPU number
+ *
+ * Sets the new CPU's current synthetic TSC to the same value as the
+ * currently running CPU.
+ *
+ * Returns the success/failure of the operation. (NOTIFY_OK, NOTIFY_BAD)
+ */
+static int __cpuinit hotcpu_callback(struct notifier_block *nb,
+ unsigned long action,
+ void *hcpu)
+{
+ unsigned int hotcpu = (unsigned long)hcpu;
+
+ switch (action) {
+ case CPU_UP_PREPARE:
+ case CPU_UP_PREPARE_FROZEN:
+ if (synthetic_tsc_refcount)
+ prepare_synthetic_tsc(hotcpu);
+ break;
+ case CPU_ONLINE:
+ case CPU_ONLINE_FROZEN:
+ if (synthetic_tsc_refcount)
+ enable_synthetic_tsc(hotcpu);
+ break;
+#ifdef CONFIG_HOTPLUG_CPU
+ case CPU_UP_CANCELED:
+ case CPU_UP_CANCELED_FROZEN:
+ case CPU_DEAD:
+ case CPU_DEAD_FROZEN:
+ if (synthetic_tsc_refcount)
+ disable_synthetic_tsc(hotcpu);
+ break;
+#endif /* CONFIG_HOTPLUG_CPU */
+ }
+ return NOTIFY_OK;
+}
+
+void get_synthetic_tsc(void)
+{
+ int cpu;
+
+ get_online_cpus();
+ mutex_lock(&synthetic_tsc_mutex);
+ if (synthetic_tsc_refcount++)
+ goto end;
+
+ synthetic_tsc_enabled = 1;
+ for_each_online_cpu(cpu) {
+ prepare_synthetic_tsc(cpu);
+ enable_synthetic_tsc(cpu);
+ }
+end:
+ mutex_unlock(&synthetic_tsc_mutex);
+ put_online_cpus();
+}
+EXPORT_SYMBOL_GPL(get_synthetic_tsc);
+
+void put_synthetic_tsc(void)
+{
+ int cpu;
+
+ get_online_cpus();
+ mutex_lock(&synthetic_tsc_mutex);
+ WARN_ON(synthetic_tsc_refcount <= 0);
+ if (synthetic_tsc_refcount != 1 || !synthetic_tsc_enabled)
+ goto end;
+
+ for_each_online_cpu(cpu)
+ disable_synthetic_tsc(cpu);
+ synthetic_tsc_enabled = 0;
+end:
+ synthetic_tsc_refcount--;
+ mutex_unlock(&synthetic_tsc_mutex);
+ put_online_cpus();
+}
+EXPORT_SYMBOL_GPL(put_synthetic_tsc);
+
+/* Called from CPU 0, before any tracing starts, to init each structure */
+static int __init init_synthetic_tsc(void)
+{
+ precalc_stsc_interval();
+ hotcpu_notifier(hotcpu_callback, 3);
+ return 0;
+}
+
+/* Before SMP is up */
+early_initcall(init_synthetic_tsc);
Index: linux.trees.git/init/Kconfig
===================================================================
--- linux.trees.git.orig/init/Kconfig 2008-11-12 17:58:05.000000000 -0500
+++ linux.trees.git/init/Kconfig 2008-11-12 18:00:31.000000000 -0500
@@ -340,6 +340,18 @@ config HAVE_UNSTABLE_SCHED_CLOCK
config HAVE_GET_CYCLES
def_bool n
+#
+# Architectures with a specialized tracing clock should select this.
+#
+config HAVE_TRACE_CLOCK
+ def_bool n
+
+#
+# Architectures with only a 32-bits clock source should select this.
+#
+config HAVE_TRACE_CLOCK_32_TO_64
+ def_bool n
+
config GROUP_SCHED
bool "Group CPU scheduler"
depends on EXPERIMENTAL
Index: linux.trees.git/kernel/Makefile
===================================================================
--- linux.trees.git.orig/kernel/Makefile 2008-11-07 00:34:10.000000000 -0500
+++ linux.trees.git/kernel/Makefile 2008-11-12 18:01:11.000000000 -0500
@@ -88,8 +88,7 @@ obj-$(CONFIG_MARKERS) += marker.o
obj-$(CONFIG_TRACEPOINTS) += tracepoint.o
obj-$(CONFIG_LATENCYTOP) += latencytop.o
obj-$(CONFIG_HAVE_GENERIC_DMA_COHERENT) += dma-coherent.o
-obj-$(CONFIG_FUNCTION_TRACER) += trace/
-obj-$(CONFIG_TRACING) += trace/
+obj-y += trace/
obj-$(CONFIG_SMP) += sched_cpupri.o
ifneq ($(CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER),y)
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
* [patch 07/17] Trace clock generic
2008-11-12 23:15 [patch 00/17] Trace Clock v3 Mathieu Desnoyers
` (5 preceding siblings ...)
2008-11-12 23:15 ` [patch 06/17] Trace clock core Mathieu Desnoyers
@ 2008-11-12 23:15 ` Mathieu Desnoyers
2008-11-12 23:15 ` [patch 08/17] Powerpc : Trace clock Mathieu Desnoyers
` (9 subsequent siblings)
16 siblings, 0 replies; 21+ messages in thread
From: Mathieu Desnoyers @ 2008-11-12 23:15 UTC (permalink / raw)
To: Linus Torvalds, akpm, Ingo Molnar, Peter Zijlstra, Steven Rostedt,
linux-kernel
Cc: Mathieu Desnoyers, Ralf Baechle, benh, paulus, David Miller,
Ingo Molnar, Thomas Gleixner, linux-arch
[-- Attachment #1: trace-clock-generic.patch --]
[-- Type: text/plain, Size: 7690 bytes --]
Wrapper to use the lower-level clock sources available on the system. For
architectures lacking CPU timestamp counters, it falls back on a counter
incremented by a timer interrupt every jiffy, or'd with a logical clock.
A generic fallback based on a logical clock and the timer interrupt is
available.
generic - Uses jiffies or'd with a logical clock extended to 64 bits by
trace-clock-32-to-64.
i386 - Uses TSC. If it detects a non-synchronized TSC, uses a mixed TSC-logical
clock.
mips - Uses a TSC extended atomically from 32 to 64 bits by
trace-clock-32-to-64.
powerpc - Uses TSC.
sparc64 - Uses TSC.
x86_64 - Uses TSC. If it detects a non-synchronized TSC, uses a mixed
TSC-logical clock.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Ralf Baechle <ralf@linux-mips.org>
CC: benh@kernel.crashing.org
CC: paulus@samba.org
CC: David Miller <davem@davemloft.net>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Ingo Molnar <mingo@redhat.com>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: linux-arch@vger.kernel.org
---
include/asm-generic/trace-clock.h | 64 +++++++++++++++++++++++++
include/linux/trace-clock.h | 17 ++++++
init/Kconfig | 6 ++
kernel/trace/Makefile | 1
kernel/trace/trace-clock.c | 97 ++++++++++++++++++++++++++++++++++++++
5 files changed, 185 insertions(+)
Index: linux.trees.git/include/asm-generic/trace-clock.h
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ linux.trees.git/include/asm-generic/trace-clock.h 2008-11-12 18:01:18.000000000 -0500
@@ -0,0 +1,64 @@
+#ifndef _ASM_GENERIC_TRACE_CLOCK_H
+#define _ASM_GENERIC_TRACE_CLOCK_H
+
+/*
+ * include/asm-generic/trace-clock.h
+ *
+ * Copyright (C) 2007 - Mathieu Desnoyers (mathieu.desnoyers@polymtl.ca)
+ *
+ * Generic tracing clock for architectures without TSC.
+ */
+
+#include <linux/param.h> /* For HZ */
+#include <asm/atomic.h>
+
+#define TRACE_CLOCK_SHIFT 13
+
+extern atomic_long_t trace_clock;
+
+static inline u32 trace_clock_read32(void)
+{
+ return (u32)atomic_long_add_return(1, &trace_clock);
+}
+
+#ifdef CONFIG_HAVE_TRACE_CLOCK_32_TO_64
+extern u64 trace_clock_read_synthetic_tsc(void);
+extern void get_synthetic_tsc(void);
+extern void put_synthetic_tsc(void);
+
+static inline u64 trace_clock_read64(void)
+{
+ return trace_clock_read_synthetic_tsc();
+}
+#else
+static inline void get_synthetic_tsc(void)
+{
+}
+
+static inline void put_synthetic_tsc(void)
+{
+}
+
+static inline u64 trace_clock_read64(void)
+{
+ return atomic_long_add_return(1, &trace_clock);
+}
+#endif
+
+static inline unsigned int trace_clock_frequency(void)
+{
+ return HZ << TRACE_CLOCK_SHIFT;
+}
+
+static inline u32 trace_clock_freq_scale(void)
+{
+ return 1;
+}
+
+extern void get_trace_clock(void);
+extern void put_trace_clock(void);
+
+static inline void set_trace_clock_is_sync(int state)
+{
+}
+#endif /* _ASM_GENERIC_TRACE_CLOCK_H */
Index: linux.trees.git/include/linux/trace-clock.h
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ linux.trees.git/include/linux/trace-clock.h 2008-11-12 18:01:18.000000000 -0500
@@ -0,0 +1,17 @@
+#ifndef _LINUX_TRACE_CLOCK_H
+#define _LINUX_TRACE_CLOCK_H
+
+/*
+ * Trace clock
+ *
+ * Chooses between an architecture specific clock or an atomic logical clock.
+ *
+ * Copyright (C) 2007,2008 Mathieu Desnoyers (mathieu.desnoyers@polymtl.ca)
+ */
+
+#ifdef CONFIG_HAVE_TRACE_CLOCK
+#include <asm/trace-clock.h>
+#else
+#include <asm-generic/trace-clock.h>
+#endif /* CONFIG_HAVE_TRACE_CLOCK */
+#endif /* _LINUX_TRACE_CLOCK_H */
Index: linux.trees.git/kernel/trace/trace-clock.c
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ linux.trees.git/kernel/trace/trace-clock.c 2008-11-12 18:01:18.000000000 -0500
@@ -0,0 +1,97 @@
+/*
+ * kernel/trace/trace-clock.c
+ *
+ * (C) Copyright 2008 -
+ * Mathieu Desnoyers (mathieu.desnoyers@polymtl.ca)
+ *
+ * Generic kernel tracing clock for architectures without TSC.
+ */
+
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/delay.h>
+#include <linux/timer.h>
+#include <linux/workqueue.h>
+#include <linux/cpu.h>
+#include <linux/timex.h>
+#include <linux/bitops.h>
+#include <linux/trace-clock.h>
+#include <linux/jiffies.h>
+
+static int trace_clock_refcount;
+static DEFINE_MUTEX(trace_clock_mutex);
+static struct timer_list trace_clock_timer;
+/*
+ * bits 0..12 : counter, atomically incremented
+ * bits 13..{32,64} : time counter, incremented each jiffy.
+ */
+atomic_long_t trace_clock;
+EXPORT_SYMBOL(trace_clock);
+
+static void trace_clock_update(void)
+{
+ long old_clock, new_clock;
+ unsigned long ticks;
+
+ /*
+ * Make sure we keep track of delayed timer.
+ */
+ ticks = jiffies - trace_clock_timer.expires + 1;
+ /* Don't update if ticks is zero, time would go backward. */
+ if (unlikely(!ticks))
+ return;
+ do {
+ old_clock = atomic_long_read(&trace_clock);
+ new_clock = (old_clock + (ticks << TRACE_CLOCK_SHIFT))
+ & (~((1 << TRACE_CLOCK_SHIFT) - 1));
+ } while (atomic_long_cmpxchg(&trace_clock, old_clock, new_clock)
+ != old_clock);
+}
+
+static void trace_clock_timer_fct(unsigned long data)
+{
+ trace_clock_update();
+ trace_clock_timer.expires = jiffies + 1;
+ add_timer(&trace_clock_timer);
+}
+
+static void enable_trace_clock(void)
+{
+ init_timer(&trace_clock_timer);
+ /* trace_clock_update() reads expires */
+ trace_clock_timer.function = trace_clock_timer_fct;
+ trace_clock_timer.expires = jiffies + 1;
+ trace_clock_update();
+ add_timer(&trace_clock_timer);
+}
+
+static void disable_trace_clock(void)
+{
+ del_timer_sync(&trace_clock_timer);
+}
+
+void get_trace_clock(void)
+{
+ get_synthetic_tsc();
+ mutex_lock(&trace_clock_mutex);
+ if (trace_clock_refcount++)
+ goto end;
+ enable_trace_clock();
+end:
+ mutex_unlock(&trace_clock_mutex);
+}
+EXPORT_SYMBOL_GPL(get_trace_clock);
+
+void put_trace_clock(void)
+{
+ mutex_lock(&trace_clock_mutex);
+ WARN_ON(trace_clock_refcount <= 0);
+ if (trace_clock_refcount != 1)
+ goto end;
+ disable_trace_clock();
+end:
+ trace_clock_refcount--;
+ mutex_unlock(&trace_clock_mutex);
+ put_synthetic_tsc();
+}
+EXPORT_SYMBOL_GPL(put_trace_clock);
Index: linux.trees.git/init/Kconfig
===================================================================
--- linux.trees.git.orig/init/Kconfig 2008-11-12 18:00:31.000000000 -0500
+++ linux.trees.git/init/Kconfig 2008-11-12 18:01:18.000000000 -0500
@@ -346,6 +346,12 @@ config HAVE_GET_CYCLES
config HAVE_TRACE_CLOCK
def_bool n
+config HAVE_TRACE_CLOCK_GENERIC
+ bool
+ default y if (!HAVE_TRACE_CLOCK)
+ default n if HAVE_TRACE_CLOCK
+ select HAVE_TRACE_CLOCK_32_TO_64 if (!64BIT)
+
#
# Architectures with only a 32-bit clock source should select this.
#
Index: linux.trees.git/kernel/trace/Makefile
===================================================================
--- linux.trees.git.orig/kernel/trace/Makefile 2008-11-12 18:00:47.000000000 -0500
+++ linux.trees.git/kernel/trace/Makefile 2008-11-12 18:01:18.000000000 -0500
@@ -25,5 +25,6 @@ obj-$(CONFIG_STACK_TRACER) += trace_stac
obj-$(CONFIG_MMIOTRACE) += trace_mmiotrace.o
obj-$(CONFIG_BOOT_TRACER) += trace_boot.o
obj-$(CONFIG_HAVE_TRACE_CLOCK_32_TO_64) += trace-clock-32-to-64.o
+obj-$(CONFIG_HAVE_TRACE_CLOCK_GENERIC) += trace-clock.o
libftrace-y := ftrace.o
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 21+ messages in thread
* [patch 08/17] Powerpc : Trace clock
2008-11-12 23:15 [patch 00/17] Trace Clock v3 Mathieu Desnoyers
` (6 preceding siblings ...)
2008-11-12 23:15 ` [patch 07/17] Trace clock generic Mathieu Desnoyers
@ 2008-11-12 23:15 ` Mathieu Desnoyers
2008-11-12 23:16 ` [patch 09/17] Sparc64 " Mathieu Desnoyers
` (8 subsequent siblings)
16 siblings, 0 replies; 21+ messages in thread
From: Mathieu Desnoyers @ 2008-11-12 23:15 UTC (permalink / raw)
To: Linus Torvalds, akpm, Ingo Molnar, Peter Zijlstra, Steven Rostedt,
linux-kernel
Cc: Mathieu Desnoyers, benh, paulus, linux-arch
[-- Attachment #1: powerpc-trace-clock.patch --]
[-- Type: text/plain, Size: 2068 bytes --]
Powerpc implementation of trace clock with get_tb().
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: benh@kernel.crashing.org
CC: paulus@samba.org
CC: linux-arch@vger.kernel.org
---
arch/powerpc/Kconfig | 1
arch/powerpc/include/asm/trace-clock.h | 47 +++++++++++++++++++++++++++++++++
2 files changed, 48 insertions(+)
Index: linux.trees.git/arch/powerpc/Kconfig
===================================================================
--- linux.trees.git.orig/arch/powerpc/Kconfig 2008-11-12 18:00:10.000000000 -0500
+++ linux.trees.git/arch/powerpc/Kconfig 2008-11-12 18:01:20.000000000 -0500
@@ -114,6 +114,7 @@ config PPC
select HAVE_IOREMAP_PROT
select HAVE_EFFICIENT_UNALIGNED_ACCESS
select HAVE_KPROBES
+ select HAVE_TRACE_CLOCK
select HAVE_ARCH_KGDB
select HAVE_KRETPROBES
select HAVE_ARCH_TRACEHOOK
Index: linux.trees.git/arch/powerpc/include/asm/trace-clock.h
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ linux.trees.git/arch/powerpc/include/asm/trace-clock.h 2008-11-12 18:01:20.000000000 -0500
@@ -0,0 +1,47 @@
+/*
+ * Copyright (C) 2005,2008 Mathieu Desnoyers
+ *
+ * Trace clock PowerPC definitions.
+ *
+ * Use get_tb() directly to ensure reading a 64-bit value on 32-bit powerpc.
+ */
+
+#ifndef _ASM_TRACE_CLOCK_H
+#define _ASM_TRACE_CLOCK_H
+
+#include <linux/timex.h>
+#include <linux/time.h>
+#include <asm/processor.h>
+
+static inline u32 trace_clock_read32(void)
+{
+ return get_tbl();
+}
+
+static inline u64 trace_clock_read64(void)
+{
+ return get_tb();
+}
+
+static inline unsigned int trace_clock_frequency(void)
+{
+ return get_cycles_rate();
+}
+
+static inline u32 trace_clock_freq_scale(void)
+{
+ return 1;
+}
+
+static inline void get_trace_clock(void)
+{
+}
+
+static inline void put_trace_clock(void)
+{
+}
+
+static inline void set_trace_clock_is_sync(int state)
+{
+}
+#endif /* _ASM_TRACE_CLOCK_H */
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 21+ messages in thread
* [patch 09/17] Sparc64 : Trace clock
2008-11-12 23:15 [patch 00/17] Trace Clock v3 Mathieu Desnoyers
` (7 preceding siblings ...)
2008-11-12 23:15 ` [patch 08/17] Powerpc : Trace clock Mathieu Desnoyers
@ 2008-11-12 23:16 ` Mathieu Desnoyers
2008-11-12 23:16 ` [patch 10/17] LTTng timestamp sh Mathieu Desnoyers
` (7 subsequent siblings)
16 siblings, 0 replies; 21+ messages in thread
From: Mathieu Desnoyers @ 2008-11-12 23:16 UTC (permalink / raw)
To: Linus Torvalds, akpm, Ingo Molnar, Peter Zijlstra, Steven Rostedt,
linux-kernel
Cc: Mathieu Desnoyers, David S. Miller, linux-arch
[-- Attachment #1: sparc64-trace-clock.patch --]
[-- Type: text/plain, Size: 1937 bytes --]
Implement sparc64 trace clock.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Acked-by: David S. Miller <davem@davemloft.net>
CC: linux-arch@vger.kernel.org
---
arch/sparc/include/asm/trace-clock.h | 43 +++++++++++++++++++++++++++++++++++
arch/sparc64/Kconfig | 1
2 files changed, 44 insertions(+)
Index: linux.trees.git/arch/sparc64/Kconfig
===================================================================
--- linux.trees.git.orig/arch/sparc64/Kconfig 2008-11-12 18:00:04.000000000 -0500
+++ linux.trees.git/arch/sparc64/Kconfig 2008-11-12 18:01:51.000000000 -0500
@@ -18,6 +18,7 @@ config SPARC64
select HAVE_ARCH_KGDB
select USE_GENERIC_SMP_HELPERS if SMP
select HAVE_ARCH_TRACEHOOK
+ select HAVE_TRACE_CLOCK
select ARCH_WANT_OPTIONAL_GPIOLIB
select RTC_CLASS
select RTC_DRV_M48T59
Index: linux.trees.git/arch/sparc/include/asm/trace-clock.h
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ linux.trees.git/arch/sparc/include/asm/trace-clock.h 2008-11-12 18:01:27.000000000 -0500
@@ -0,0 +1,43 @@
+/*
+ * Copyright (C) 2008, Mathieu Desnoyers
+ *
+ * Trace clock definitions for Sparc64.
+ */
+
+#ifndef _ASM_SPARC_TRACE_CLOCK_H
+#define _ASM_SPARC_TRACE_CLOCK_H
+
+#include <linux/timex.h>
+
+static inline u32 trace_clock_read32(void)
+{
+ return get_cycles();
+}
+
+static inline u64 trace_clock_read64(void)
+{
+ return get_cycles();
+}
+
+static inline unsigned int trace_clock_frequency(void)
+{
+ return get_cycles_rate();
+}
+
+static inline u32 trace_clock_freq_scale(void)
+{
+ return 1;
+}
+
+static inline void get_trace_clock(void)
+{
+}
+
+static inline void put_trace_clock(void)
+{
+}
+
+static inline void set_trace_clock_is_sync(int state)
+{
+}
+#endif /* _ASM_SPARC_TRACE_CLOCK_H */
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 21+ messages in thread
* [patch 10/17] LTTng timestamp sh
2008-11-12 23:15 [patch 00/17] Trace Clock v3 Mathieu Desnoyers
` (8 preceding siblings ...)
2008-11-12 23:16 ` [patch 09/17] Sparc64 " Mathieu Desnoyers
@ 2008-11-12 23:16 ` Mathieu Desnoyers
2008-11-12 23:16 ` [patch 11/17] LTTng - TSC synchronicity test Mathieu Desnoyers
` (6 subsequent siblings)
16 siblings, 0 replies; 21+ messages in thread
From: Mathieu Desnoyers @ 2008-11-12 23:16 UTC (permalink / raw)
To: Linus Torvalds, akpm, Ingo Molnar, Peter Zijlstra, Steven Rostedt,
linux-kernel
Cc: Giuseppe Cavallaro, Mathieu Desnoyers, Paul Mundt, linux-sh
[-- Attachment #1: sh-trace-clock.patch --]
[-- Type: text/plain, Size: 3705 bytes --]
This patch adds the timestamping mechanism to the trace-clock.h arch header
file. The new timestamp functions use TMU channel 1.
This code only works if TMU channel 1 is initialized during kernel boot.
Big fat warning(TM) from Mathieu Desnoyers:
This patch seems to assume TMU channel 1 is set up at boot. Is that always true
on all SuperH boards? Is there some Kconfig selection that should be done here?
Make sure this patch does not break get_cycles() on SuperH before merging.
From: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Paul Mundt <lethal@linux-sh.org>
CC: linux-sh@vger.kernel.org
---
arch/sh/Kconfig | 2 +
arch/sh/include/asm/timex.h | 7 +++-
arch/sh/include/asm/trace-clock.h | 58 ++++++++++++++++++++++++++++++++++++++
3 files changed, 65 insertions(+), 2 deletions(-)
Index: linux.trees.git/arch/sh/include/asm/timex.h
===================================================================
--- linux.trees.git.orig/arch/sh/include/asm/timex.h 2008-11-07 00:34:10.000000000 -0500
+++ linux.trees.git/arch/sh/include/asm/timex.h 2008-11-12 18:01:56.000000000 -0500
@@ -6,13 +6,16 @@
#ifndef __ASM_SH_TIMEX_H
#define __ASM_SH_TIMEX_H
-#define CLOCK_TICK_RATE (CONFIG_SH_PCLK_FREQ / 4) /* Underlying HZ */
+#include <linux/io.h>
+#include <asm/cpu/timer.h>
+
+#define CLOCK_TICK_RATE (HZ * 100000UL)
typedef unsigned long long cycles_t;
static __inline__ cycles_t get_cycles (void)
{
- return 0;
+ return 0xffffffff - ctrl_inl(TMU1_TCNT);
}
#endif /* __ASM_SH_TIMEX_H */
Index: linux.trees.git/arch/sh/Kconfig
===================================================================
--- linux.trees.git.orig/arch/sh/Kconfig 2008-11-07 00:34:10.000000000 -0500
+++ linux.trees.git/arch/sh/Kconfig 2008-11-12 18:01:56.000000000 -0500
@@ -11,6 +11,8 @@ config SUPERH
select HAVE_CLK
select HAVE_IDE
select HAVE_OPROFILE
+ select HAVE_TRACE_CLOCK
+ select HAVE_TRACE_CLOCK_32_TO_64
select HAVE_GENERIC_DMA_COHERENT
select HAVE_IOREMAP_PROT if MMU
help
Index: linux.trees.git/arch/sh/include/asm/trace-clock.h
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ linux.trees.git/arch/sh/include/asm/trace-clock.h 2008-11-12 18:01:56.000000000 -0500
@@ -0,0 +1,58 @@
+/*
+ * Copyright (C) 2007,2008 Giuseppe Cavallaro <peppe.cavallaro@st.com>
+ * Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
+ *
+ * Trace clock definitions for SuperH.
+ */
+
+#ifndef _ASM_SH_TRACE_CLOCK_H
+#define _ASM_SH_TRACE_CLOCK_H
+
+#include <linux/timer.h>
+#include <asm/clock.h>
+
+extern u64 trace_clock_read_synthetic_tsc(void);
+
+static inline u32 trace_clock_get_timestamp32(void)
+{
+ return get_cycles();
+}
+
+static inline u64 trace_clock_get_timestamp64(void)
+{
+ return trace_clock_read_synthetic_tsc();
+}
+
+static inline unsigned int trace_clock_frequency(void)
+{
+ unsigned long rate;
+ struct clk *tmu1_clk;
+
+ tmu1_clk = clk_get(NULL, "tmu1_clk");
+ rate = clk_get_rate(tmu1_clk);
+
+ return (unsigned int)rate;
+}
+
+static inline u32 trace_clock_freq_scale(void)
+{
+ return 1;
+}
+
+extern void get_synthetic_tsc(void);
+extern void put_synthetic_tsc(void);
+
+static inline void get_trace_clock(void)
+{
+ get_synthetic_tsc();
+}
+
+static inline void put_trace_clock(void)
+{
+ put_synthetic_tsc();
+}
+
+static inline void set_trace_clock_is_sync(int state)
+{
+}
+#endif /* _ASM_SH_TRACE_CLOCK_H */
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 21+ messages in thread
* [patch 11/17] LTTng - TSC synchronicity test
2008-11-12 23:15 [patch 00/17] Trace Clock v3 Mathieu Desnoyers
` (9 preceding siblings ...)
2008-11-12 23:16 ` [patch 10/17] LTTng timestamp sh Mathieu Desnoyers
@ 2008-11-12 23:16 ` Mathieu Desnoyers
2008-11-12 23:16 ` [patch 12/17] x86 : remove arch-specific tsc_sync.c Mathieu Desnoyers
` (5 subsequent siblings)
16 siblings, 0 replies; 21+ messages in thread
From: Mathieu Desnoyers @ 2008-11-12 23:16 UTC (permalink / raw)
To: Linus Torvalds, akpm, Ingo Molnar, Peter Zijlstra, Steven Rostedt,
linux-kernel
Cc: Mathieu Desnoyers, Ingo Molnar, Jan Kiszka, Thomas Gleixner
[-- Attachment #1: test-tsc-sync.patch --]
[-- Type: text/plain, Size: 13469 bytes --]
Test TSC synchronization across CPUs. Architecture-independent, so it can be
used on various architectures. Aims at testing the TSC synchronization on a
running system (not only at early boot), with minimal impact on interrupt
latency.
I wrote this code before x86 tsc_sync.c existed and, given that it worked well
for my needs, I never switched to tsc_sync.c. Although it has the same goal, it
does it a bit differently:
tsc_sync looks at the cycle counters on two CPUs to see whether one goes
backward compared to the other when read in a loop. The LTTng code synchronizes
both cores with a counter used as a memory barrier and then reads the two TSCs
at a delta equal to the cache-line exchange. Instruction and data caches are
primed. This test is repeated in loops to ensure we deal with MCEs and NMIs,
which could skew the results.
The problem I see with tsc_sync.c is that if one of the two CPUs is delayed by
an interrupt handler (for way too long) while the other CPU is doing its
check_tsc_warp() execution, and if the CPU with the lowest TSC values runs
first, the code will fail to detect unsynchronized CPUs.
This sync test code does not have this problem.
A following patch replaces the x86 tsc_sync.c code with this
architecture-independent code.
This code also adds the kernel parameter
force_tsc_sync=1
which forces resynchronization of CPU TSCs when a CPU is hotplugged.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Ingo Molnar <mingo@redhat.com>
CC: Jan Kiszka <jan.kiszka@siemens.com>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Steven Rostedt <rostedt@goodmis.org>
---
Documentation/kernel-parameters.txt | 4
init/Kconfig | 7
kernel/time/Makefile | 1
kernel/time/tsc-sync.c | 313 ++++++++++++++++++++++++++++++++++++
4 files changed, 325 insertions(+)
Index: linux.trees.git/kernel/time/tsc-sync.c
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ linux.trees.git/kernel/time/tsc-sync.c 2008-11-12 18:01:59.000000000 -0500
@@ -0,0 +1,313 @@
+/*
+ * kernel/time/tsc-sync.c
+ *
+ * Test TSC synchronization
+ *
+ * Marks the TSC as unstable _and_ keeps a simple "_tsc_is_sync" variable, which
+ * is fast to read when a simple test must determine which clock source to use
+ * for kernel tracing.
+ *
+ * - CPU init :
+ *
+ * We check whether all boot CPUs have their TSC's synchronized,
+ * print a warning if not and turn off the TSC clock-source.
+ *
+ * Only two CPUs may participate - they can enter in any order.
+ * ( The serial nature of the boot logic and the CPU hotplug lock
+ * protects against more than 2 CPUs entering this code. )
+ *
+ * - When CPUs are up :
+ *
+ * TSC synchronicity of all CPUs can be checked later at run-time by calling
+ * test_tsc_synchronization().
+ *
+ * Copyright 2007, 2008
+ * Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
+ */
+#include <linux/module.h>
+#include <linux/timer.h>
+#include <linux/timex.h>
+#include <linux/jiffies.h>
+#include <linux/trace-clock.h>
+#include <linux/cpu.h>
+#include <linux/kthread.h>
+#include <linux/mutex.h>
+#include <linux/cpu.h>
+
+#define MAX_CYCLES_DELTA 1000ULL
+
+/*
+ * Number of loops to take care of MCE, NMIs, SMIs.
+ */
+#define NR_LOOPS 10
+
+static DEFINE_MUTEX(tscsync_mutex);
+
+struct sync_data {
+ int nr_waits;
+ int wait_sync;
+ cycles_t tsc_count;
+} ____cacheline_aligned;
+
+/* 0 is master, 1 is slave */
+static struct sync_data sync_data[2] = {
+ [0 ... 1] = {
+ .nr_waits = 3 * NR_LOOPS + 1,
+ .wait_sync = 3 * NR_LOOPS + 1,
+ },
+};
+
+int _tsc_is_sync = 1;
+EXPORT_SYMBOL(_tsc_is_sync);
+
+static int force_tsc_sync;
+static cycles_t slave_offset;
+static int slave_offset_ready; /* for 32-bits architectures */
+
+static int __init force_tsc_sync_setup(char *str)
+{
+ force_tsc_sync = simple_strtoul(str, NULL, 0);
+ return 1;
+}
+__setup("force_tsc_sync=", force_tsc_sync_setup);
+
+/*
+ * Mark it noinline so we make sure it is not unrolled.
+ * Wait until value is reached.
+ */
+static noinline void tsc_barrier(long this_cpu)
+{
+ sync_core();
+ sync_data[this_cpu].wait_sync--;
+ smp_mb(); /* order master/slave sync_data read/write */
+ while (unlikely(sync_data[1 - this_cpu].wait_sync >=
+ sync_data[this_cpu].nr_waits))
+ barrier(); /*
+ * barrier is used because faster and
+ * more predictable than cpu_idle().
+ */
+ smp_mb(); /* order master/slave sync_data read/write */
+ sync_data[this_cpu].nr_waits--;
+ get_cycles_barrier();
+ sync_data[this_cpu].tsc_count = get_cycles();
+ get_cycles_barrier();
+}
+
+/*
+ * Worker thread called on each CPU.
+ * First wait with interrupts enabled, then wait with interrupt disabled,
+ * for precision. We are already bound to one CPU.
+ * this_cpu 0 : master
+ * this_cpu 1 : slave
+ */
+static void test_sync(void *arg)
+{
+ long this_cpu = (long)arg;
+ unsigned long flags;
+
+ local_irq_save(flags);
+ /* Make sure the instructions are in I-CACHE */
+ tsc_barrier(this_cpu);
+ tsc_barrier(this_cpu);
+ sync_data[this_cpu].wait_sync--;
+ smp_mb(); /* order master/slave sync_data read/write */
+ while (unlikely(sync_data[1 - this_cpu].wait_sync >=
+ sync_data[this_cpu].nr_waits))
+ barrier(); /*
+ * barrier is used because faster and
+ * more predictable than cpu_idle().
+ */
+ smp_mb(); /* order master/slave sync_data read/write */
+ sync_data[this_cpu].nr_waits--;
+ /*
+ * Here, only the master will wait for the slave to reach this barrier.
+ * This makes sure that the master, which holds the mutex and will reset
+ * the barriers, waits for the slave to stop using the barrier values
+ * before it continues. This is only done at the complete end of all the
+ * loops. This is why there is a + 1 in original wait_sync value.
+ */
+ if (sync_data[this_cpu].nr_waits == 1)
+ sync_data[this_cpu].wait_sync--;
+ local_irq_restore(flags);
+}
+
+/*
+ * Each CPU (master and target) must decrement the wait_sync value twice (one
+ * for priming in cache), and also once after the get_cycles. After all the
+ * loops, one last synchronization is required to make sure the master waits
+ * for the slave before resetting the barriers.
+ */
+static void reset_barriers(void)
+{
+ int i;
+
+ /*
+ * Wait until slave is done so that we don't overwrite
+ * wait_end_sync prematurely.
+ */
+ smp_mb(); /* order master/slave sync_data read/write */
+ while (unlikely(sync_data[1].wait_sync >= sync_data[0].nr_waits))
+ barrier(); /*
+ * barrier is used because faster and
+ * more predictable than cpu_idle().
+ */
+ smp_mb(); /* order master/slave sync_data read/write */
+
+ for (i = 0; i < 2; i++) {
+ WARN_ON(sync_data[i].wait_sync != 0);
+ WARN_ON(sync_data[i].nr_waits != 1);
+ sync_data[i].wait_sync = 3 * NR_LOOPS + 1;
+ sync_data[i].nr_waits = 3 * NR_LOOPS + 1;
+ }
+}
+
+/*
+ * Do loops (making sure no unexpected event changes the timing), keep the best
+ * one. The result of each loop is the highest tsc delta between the master CPU
+ * and the slaves. Stop CPU hotplug when this code is executed to make sure we
+ * are concurrency-safe wrt CPU hotplug also using this code. Test TSC
+ * synchronization even if we already "know" CPUs were not synchronized. This
+ * can be used as a test to check if, for some reason, the CPUs eventually got
+ * in sync after a CPU has been unplugged. This code is kept separate from the
+ * CPU hotplug code because the slave CPU executes in an IPI, which we want to
+ * keep as short as possible (this is happening while the system is running).
+ * Therefore, we do not send a single IPI for all the test loops, but rather
+ * send one IPI per loop.
+ */
+int test_tsc_synchronization(void)
+{
+ long cpu, master;
+ cycles_t max_diff = 0, diff, best_loop, worse_loop = 0;
+ int i;
+
+ mutex_lock(&tscsync_mutex);
+ get_online_cpus();
+
+ printk(KERN_INFO
+ "checking TSC synchronization across all online CPUs:");
+
+ preempt_disable();
+ master = smp_processor_id();
+ for_each_online_cpu(cpu) {
+ if (master == cpu)
+ continue;
+ best_loop = (cycles_t)ULLONG_MAX;
+ for (i = 0; i < NR_LOOPS; i++) {
+ smp_call_function_single(cpu, test_sync,
+ (void *)1UL, 0);
+ test_sync((void *)0UL);
+ diff = abs(sync_data[1].tsc_count
+ - sync_data[0].tsc_count);
+ best_loop = min(best_loop, diff);
+ worse_loop = max(worse_loop, diff);
+ }
+ reset_barriers();
+ max_diff = max(best_loop, max_diff);
+ }
+ preempt_enable();
+ if (max_diff >= MAX_CYCLES_DELTA) {
+ printk(KERN_WARNING
+ "Measured %llu cycles TSC offset between CPUs,"
+ " turning off TSC clock.\n", (u64)max_diff);
+ mark_tsc_unstable("check_tsc_sync_source failed");
+ _tsc_is_sync = 0;
+ } else {
+ printk(" passed.\n");
+ }
+ put_online_cpus();
+ mutex_unlock(&tscsync_mutex);
+ return max_diff < MAX_CYCLES_DELTA;
+}
+EXPORT_SYMBOL_GPL(test_tsc_synchronization);
+
+/*
+ * Test synchronicity of a single core when it is hotplugged.
+ * Source CPU calls into this - waits for the freshly booted target CPU to
+ * arrive and then start the measurement:
+ */
+void __cpuinit check_tsc_sync_source(int cpu)
+{
+ cycles_t diff, abs_diff,
+ best_loop = (cycles_t)ULLONG_MAX, worse_loop = 0;
+ int i;
+
+ /*
+ * No need to check if we already know that the TSC is not synchronized:
+ */
+ if (!force_tsc_sync && unsynchronized_tsc()) {
+ /*
+ * Make sure we mark _tsc_is_sync to 0 if the TSC is found
+ * to be unsynchronized for other causes than non-synchronized
+ * TSCs across CPUs.
+ */
+ _tsc_is_sync = 0;
+ set_trace_clock_is_sync(0);
+ return;
+ }
+
+ printk(KERN_INFO "checking TSC synchronization [CPU#%d -> CPU#%d]:",
+ smp_processor_id(), cpu);
+
+ for (i = 0; i < NR_LOOPS; i++) {
+ test_sync((void *)0UL);
+ diff = sync_data[1].tsc_count - sync_data[0].tsc_count;
+ abs_diff = abs(diff);
+ best_loop = min(best_loop, abs_diff);
+ worse_loop = max(worse_loop, abs_diff);
+ if (force_tsc_sync && best_loop == abs_diff)
+ slave_offset = diff;
+ }
+ reset_barriers();
+
+ if (!force_tsc_sync && best_loop >= MAX_CYCLES_DELTA) {
+ printk(" failed.\n");
+ printk(KERN_WARNING
+ "Measured %llu cycles TSC offset between CPUs,"
+ " turning off TSC clock.\n", (u64)best_loop);
+ mark_tsc_unstable("check_tsc_sync_source failed");
+ _tsc_is_sync = 0;
+ set_trace_clock_is_sync(0);
+ } else {
+ printk(" %s.\n", !force_tsc_sync ? "passed" : "forced");
+ }
+ if (force_tsc_sync) {
+ /* order slave_offset and slave_offset_ready writes */
+ smp_wmb();
+ slave_offset_ready = 1;
+ }
+}
+
+/*
+ * Freshly booted CPUs call into this:
+ */
+void __cpuinit check_tsc_sync_target(void)
+{
+ int i;
+
+ if (!force_tsc_sync && unsynchronized_tsc())
+ return;
+
+ for (i = 0; i < NR_LOOPS; i++)
+ test_sync((void *)1UL);
+
+ /*
+ * Force slave synchronization if requested.
+ */
+ if (force_tsc_sync) {
+ unsigned long flags;
+ cycles_t new_tsc;
+
+ while (!slave_offset_ready)
+ cpu_relax();
+ /* order slave_offset and slave_offset_ready reads */
+ smp_rmb();
+ local_irq_save(flags);
+ /*
+ * slave_offset is read when master has finished writing to it,
+ * and is protected by cpu hotplug serialization.
+ */
+ new_tsc = get_cycles() - slave_offset;
+ write_tsc((u32)new_tsc, (u32)((u64)new_tsc >> 32));
+ local_irq_restore(flags);
+ }
+}
Index: linux.trees.git/kernel/time/Makefile
===================================================================
--- linux.trees.git.orig/kernel/time/Makefile 2008-11-07 00:34:10.000000000 -0500
+++ linux.trees.git/kernel/time/Makefile 2008-11-12 18:01:59.000000000 -0500
@@ -6,3 +6,4 @@ obj-$(CONFIG_GENERIC_CLOCKEVENTS_BROADCA
obj-$(CONFIG_TICK_ONESHOT) += tick-oneshot.o
obj-$(CONFIG_TICK_ONESHOT) += tick-sched.o
obj-$(CONFIG_TIMER_STATS) += timer_stats.o
+obj-$(CONFIG_HAVE_UNSYNCHRONIZED_TSC) += tsc-sync.o
Index: linux.trees.git/init/Kconfig
===================================================================
--- linux.trees.git.orig/init/Kconfig 2008-11-12 18:01:18.000000000 -0500
+++ linux.trees.git/init/Kconfig 2008-11-12 18:01:59.000000000 -0500
@@ -358,6 +358,13 @@ config HAVE_TRACE_CLOCK_GENERIC
config HAVE_TRACE_CLOCK_32_TO_64
def_bool n
+#
+# Architectures which need to dynamically detect if their TSC is unsynchronized
+# across CPUs should select this.
+#
+config HAVE_UNSYNCHRONIZED_TSC
+ def_bool n
+
config GROUP_SCHED
bool "Group CPU scheduler"
depends on EXPERIMENTAL
Index: linux.trees.git/Documentation/kernel-parameters.txt
===================================================================
--- linux.trees.git.orig/Documentation/kernel-parameters.txt 2008-11-07 00:34:10.000000000 -0500
+++ linux.trees.git/Documentation/kernel-parameters.txt 2008-11-12 18:01:59.000000000 -0500
@@ -765,6 +765,10 @@ and is between 256 and 4096 characters.
parameter will force ia64_sal_cache_flush to call
ia64_pal_cache_flush instead of SAL_CACHE_FLUSH.
+ force_tsc_sync
+ Force TSC resynchronization when SMP CPUs go online.
+ See also idle=poll and disable frequency scaling.
+
gamecon.map[2|3]=
[HW,JOY] Multisystem joystick and NES/SNES/PSX pad
support via parallel port (up to 5 devices per port)
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 21+ messages in thread
* [patch 12/17] x86 : remove arch-specific tsc_sync.c
2008-11-12 23:15 [patch 00/17] Trace Clock v3 Mathieu Desnoyers
` (10 preceding siblings ...)
2008-11-12 23:16 ` [patch 11/17] LTTng - TSC synchronicity test Mathieu Desnoyers
@ 2008-11-12 23:16 ` Mathieu Desnoyers
2008-11-12 23:16 ` [patch 13/17] MIPS use tsc_sync.c Mathieu Desnoyers
` (4 subsequent siblings)
16 siblings, 0 replies; 21+ messages in thread
From: Mathieu Desnoyers @ 2008-11-12 23:16 UTC (permalink / raw)
To: Linus Torvalds, akpm, Ingo Molnar, Peter Zijlstra, Steven Rostedt,
linux-kernel
Cc: Mathieu Desnoyers, Thomas Gleixner, Ingo Molnar, H. Peter Anvin
[-- Attachment #1: x86-remove-arch-specific-tsc_sync.patch --]
[-- Type: text/plain, Size: 7870 bytes --]
Depends on the new architecture-independent kernel/time/tsc-sync.c.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ingo Molnar <mingo@redhat.com>
CC: H. Peter Anvin <hpa@zytor.com>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Steven Rostedt <rostedt@goodmis.org>
---
arch/x86/Kconfig | 2
arch/x86/include/asm/tsc.h | 8 +
arch/x86/kernel/Makefile | 4
arch/x86/kernel/tsc_sync.c | 189 ---------------------------------------------
4 files changed, 11 insertions(+), 192 deletions(-)
Index: linux.trees.git/arch/x86/kernel/Makefile
===================================================================
--- linux.trees.git.orig/arch/x86/kernel/Makefile 2008-11-12 18:14:22.000000000 -0500
+++ linux.trees.git/arch/x86/kernel/Makefile 2008-11-12 18:15:37.000000000 -0500
@@ -56,9 +56,9 @@ obj-$(CONFIG_PCI) += early-quirks.o
apm-y := apm_32.o
obj-$(CONFIG_APM) += apm.o
obj-$(CONFIG_X86_SMP) += smp.o
-obj-$(CONFIG_X86_SMP) += smpboot.o tsc_sync.o ipi.o tlb_$(BITS).o
+obj-$(CONFIG_X86_SMP) += smpboot.o ipi.o tlb_$(BITS).o
obj-$(CONFIG_X86_32_SMP) += smpcommon.o
-obj-$(CONFIG_X86_64_SMP) += tsc_sync.o smpcommon.o
+obj-$(CONFIG_X86_64_SMP) += smpcommon.o
obj-$(CONFIG_X86_TRAMPOLINE) += trampoline_$(BITS).o
obj-$(CONFIG_X86_MPPARSE) += mpparse.o
obj-$(CONFIG_X86_LOCAL_APIC) += apic.o nmi.o
Index: linux.trees.git/arch/x86/kernel/tsc_sync.c
===================================================================
--- linux.trees.git.orig/arch/x86/kernel/tsc_sync.c 2008-11-12 18:14:22.000000000 -0500
+++ /dev/null 1970-01-01 00:00:00.000000000 +0000
@@ -1,189 +0,0 @@
-/*
- * check TSC synchronization.
- *
- * Copyright (C) 2006, Red Hat, Inc., Ingo Molnar
- *
- * We check whether all boot CPUs have their TSC's synchronized,
- * print a warning if not and turn off the TSC clock-source.
- *
- * The warp-check is point-to-point between two CPUs, the CPU
- * initiating the bootup is the 'source CPU', the freshly booting
- * CPU is the 'target CPU'.
- *
- * Only two CPUs may participate - they can enter in any order.
- * ( The serial nature of the boot logic and the CPU hotplug lock
- * protects against more than 2 CPUs entering this code. )
- */
-#include <linux/spinlock.h>
-#include <linux/kernel.h>
-#include <linux/init.h>
-#include <linux/smp.h>
-#include <linux/nmi.h>
-#include <asm/tsc.h>
-
-/*
- * Entry/exit counters that make sure that both CPUs
- * run the measurement code at once:
- */
-static __cpuinitdata atomic_t start_count;
-static __cpuinitdata atomic_t stop_count;
-
-/*
- * We use a raw spinlock in this exceptional case, because
- * we want to have the fastest, inlined, non-debug version
- * of a critical section, to be able to prove TSC time-warps:
- */
-static __cpuinitdata raw_spinlock_t sync_lock = __RAW_SPIN_LOCK_UNLOCKED;
-static __cpuinitdata cycles_t last_tsc;
-static __cpuinitdata cycles_t max_warp;
-static __cpuinitdata int nr_warps;
-
-/*
- * TSC-warp measurement loop running on both CPUs:
- */
-static __cpuinit void check_tsc_warp(void)
-{
- cycles_t start, now, prev, end;
- int i;
-
- start = get_cycles();
- /*
- * The measurement runs for 20 msecs:
- */
- end = start + tsc_khz * 20ULL;
- now = start;
-
- for (i = 0; ; i++) {
- /*
- * We take the global lock, measure TSC, save the
- * previous TSC that was measured (possibly on
- * another CPU) and update the previous TSC timestamp.
- */
- __raw_spin_lock(&sync_lock);
- prev = last_tsc;
- now = get_cycles();
- last_tsc = now;
- __raw_spin_unlock(&sync_lock);
-
- /*
- * Be nice every now and then (and also check whether
- * measurement is done [we also insert a 10 million
- * loops safety exit, so we dont lock up in case the
- * TSC readout is totally broken]):
- */
- if (unlikely(!(i & 7))) {
- if (now > end || i > 10000000)
- break;
- cpu_relax();
- touch_nmi_watchdog();
- }
- /*
- * Outside the critical section we can now see whether
- * we saw a time-warp of the TSC going backwards:
- */
- if (unlikely(prev > now)) {
- __raw_spin_lock(&sync_lock);
- max_warp = max(max_warp, prev - now);
- nr_warps++;
- __raw_spin_unlock(&sync_lock);
- }
- }
- WARN(!(now-start),
- "Warning: zero tsc calibration delta: %Ld [max: %Ld]\n",
- now-start, end-start);
-}
-
-/*
- * Source CPU calls into this - it waits for the freshly booted
- * target CPU to arrive and then starts the measurement:
- */
-void __cpuinit check_tsc_sync_source(int cpu)
-{
- int cpus = 2;
-
- /*
- * No need to check if we already know that the TSC is not
- * synchronized:
- */
- if (unsynchronized_tsc())
- return;
-
- printk(KERN_INFO "checking TSC synchronization [CPU#%d -> CPU#%d]:",
- smp_processor_id(), cpu);
-
- /*
- * Reset it - in case this is a second bootup:
- */
- atomic_set(&stop_count, 0);
-
- /*
- * Wait for the target to arrive:
- */
- while (atomic_read(&start_count) != cpus-1)
- cpu_relax();
- /*
- * Trigger the target to continue into the measurement too:
- */
- atomic_inc(&start_count);
-
- check_tsc_warp();
-
- while (atomic_read(&stop_count) != cpus-1)
- cpu_relax();
-
- if (nr_warps) {
- printk("\n");
- printk(KERN_WARNING "Measured %Ld cycles TSC warp between CPUs,"
- " turning off TSC clock.\n", max_warp);
- mark_tsc_unstable("check_tsc_sync_source failed");
- } else {
- printk(" passed.\n");
- }
-
- /*
- * Reset it - just in case we boot another CPU later:
- */
- atomic_set(&start_count, 0);
- nr_warps = 0;
- max_warp = 0;
- last_tsc = 0;
-
- /*
- * Let the target continue with the bootup:
- */
- atomic_inc(&stop_count);
-}
-
-/*
- * Freshly booted CPUs call into this:
- */
-void __cpuinit check_tsc_sync_target(void)
-{
- int cpus = 2;
-
- if (unsynchronized_tsc())
- return;
-
- /*
- * Register this CPU's participation and wait for the
- * source CPU to start the measurement:
- */
- atomic_inc(&start_count);
- while (atomic_read(&start_count) != cpus)
- cpu_relax();
-
- check_tsc_warp();
-
- /*
- * Ok, we are done:
- */
- atomic_inc(&stop_count);
-
- /*
- * Wait for the source CPU to print stuff:
- */
- while (atomic_read(&stop_count) != cpus)
- cpu_relax();
-}
-#undef NR_LOOPS
-
Index: linux.trees.git/arch/x86/Kconfig
===================================================================
--- linux.trees.git.orig/arch/x86/Kconfig 2008-11-12 18:15:28.000000000 -0500
+++ linux.trees.git/arch/x86/Kconfig 2008-11-12 18:15:37.000000000 -0500
@@ -169,6 +169,7 @@ config X86_SMP
bool
depends on SMP && ((X86_32 && !X86_VOYAGER) || X86_64)
select USE_GENERIC_SMP_HELPERS
+ select HAVE_UNSYNCHRONIZED_TSC
default y
config X86_32_SMP
@@ -178,6 +179,7 @@ config X86_32_SMP
config X86_64_SMP
def_bool y
depends on X86_64 && SMP
+ select HAVE_UNSYNCHRONIZED_TSC
config X86_HT
bool
Index: linux.trees.git/arch/x86/include/asm/tsc.h
===================================================================
--- linux.trees.git.orig/arch/x86/include/asm/tsc.h 2008-11-12 18:15:28.000000000 -0500
+++ linux.trees.git/arch/x86/include/asm/tsc.h 2008-11-12 18:15:37.000000000 -0500
@@ -54,7 +54,7 @@ static __always_inline cycles_t vget_cyc
extern void tsc_init(void);
extern void mark_tsc_unstable(char *reason);
extern int unsynchronized_tsc(void);
-int check_tsc_unstable(void);
+extern int check_tsc_unstable(void);
static inline cycles_t get_cycles_rate(void)
{
@@ -77,4 +77,10 @@ extern void check_tsc_sync_target(void);
extern int notsc_setup(char *);
+extern int test_tsc_synchronization(void);
+extern int _tsc_is_sync;
+static inline int tsc_is_sync(void)
+{
+ return _tsc_is_sync;
+}
#endif /* _ASM_X86_TSC_H */
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
* [patch 13/17] MIPS use tsc_sync.c
2008-11-12 23:15 [patch 00/17] Trace Clock v3 Mathieu Desnoyers
` (11 preceding siblings ...)
2008-11-12 23:16 ` [patch 12/17] x86 : remove arch-specific tsc_sync.c Mathieu Desnoyers
@ 2008-11-12 23:16 ` Mathieu Desnoyers
2008-11-12 23:16 ` [patch 14/17] MIPS : export hpt frequency for trace_clock Mathieu Desnoyers
` (3 subsequent siblings)
16 siblings, 0 replies; 21+ messages in thread
From: Mathieu Desnoyers @ 2008-11-12 23:16 UTC (permalink / raw)
To: Linus Torvalds, akpm, Ingo Molnar, Peter Zijlstra, Steven Rostedt,
linux-kernel
Cc: Mathieu Desnoyers, Ralf Baechle
[-- Attachment #1: mips-use-tsc_sync.patch --]
[-- Type: text/plain, Size: 2120 bytes --]
tsc-sync.c is now available to test whether the TSC is synchronized across cores. Since
I currently don't have access to a MIPS board myself, help with hooking this up
when CPUs go online and with testing the implementation would be welcome.
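For reference, the core warp check that the generic tsc-sync.c performs can be
sketched as a single-threaded simulation (not the actual kernel code; the real
version loops on get_cycles() on two live CPUs under a raw spinlock, and the
array-driven form and names here are hypothetical):

```c
#include <stdint.h>
#include <stddef.h>

/*
 * Each sample stands for a timestamp taken under the global sync lock,
 * possibly on another CPU.  A warp is recorded whenever the previously
 * published timestamp is ahead of the one just read, i.e. time appears
 * to go backwards across CPUs.
 */
static uint64_t max_warp;
static unsigned int nr_warps;

static void check_tsc_warp(const uint64_t *samples, size_t n)
{
	uint64_t last_tsc = 0;

	for (size_t i = 0; i < n; i++) {
		uint64_t prev = last_tsc;
		uint64_t now = samples[i];

		last_tsc = now;
		if (prev > now) {		/* time went backwards */
			if (prev - now > max_warp)
				max_warp = prev - now;
			nr_warps++;
		}
	}
}
```

If nr_warps ends up non-zero, the TSCs are not synchronized and the trace clock
must fall back on the cache-line bouncing workaround.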
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Ralf Baechle <ralf@linux-mips.org>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
arch/mips/include/asm/timex.h | 26 ++++++++++++++++++++++++++
arch/mips/kernel/smp.c | 1 +
2 files changed, 27 insertions(+)
Index: linux.trees.git/arch/mips/kernel/smp.c
===================================================================
--- linux.trees.git.orig/arch/mips/kernel/smp.c 2008-11-07 00:34:10.000000000 -0500
+++ linux.trees.git/arch/mips/kernel/smp.c 2008-11-12 18:03:16.000000000 -0500
@@ -178,6 +178,7 @@ void __init smp_cpus_done(unsigned int m
{
mp_ops->cpus_done();
synchronise_count_master();
+ test_tsc_synchronization();
}
/* called from main before smp_init() */
Index: linux.trees.git/arch/mips/include/asm/timex.h
===================================================================
--- linux.trees.git.orig/arch/mips/include/asm/timex.h 2008-11-12 18:00:23.000000000 -0500
+++ linux.trees.git/arch/mips/include/asm/timex.h 2008-11-12 18:03:16.000000000 -0500
@@ -56,13 +56,39 @@ static inline cycles_t get_cycles_rate(v
{
return CLOCK_TICK_RATE;
}
+
+extern int test_tsc_synchronization(void);
+extern int _tsc_is_sync;
+static inline int tsc_is_sync(void)
+{
+ return _tsc_is_sync;
+}
#else
static inline cycles_t get_cycles(void)
{
return 0;
}
+static inline int test_tsc_synchronization(void)
+{
+ return 0;
+}
+static inline int tsc_is_sync(void)
+{
+ return 0;
+}
#endif
+#define DELAY_INTERRUPT 100
+/*
+ * Only updates 32 LSB.
+ */
+static inline void write_tsc(u32 val1, u32 val2)
+{
+ write_c0_count(val1);
+ /* Arrange for an interrupt in a short while */
+ write_c0_compare(read_c0_count() + DELAY_INTERRUPT);
+}
+
#endif /* __KERNEL__ */
#endif /* _ASM_TIMEX_H */
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
* [patch 14/17] MIPS : export hpt frequency for trace_clock.
2008-11-12 23:15 [patch 00/17] Trace Clock v3 Mathieu Desnoyers
` (12 preceding siblings ...)
2008-11-12 23:16 ` [patch 13/17] MIPS use tsc_sync.c Mathieu Desnoyers
@ 2008-11-12 23:16 ` Mathieu Desnoyers
2008-11-12 23:16 ` [patch 15/17] MIPS create empty sync_core() Mathieu Desnoyers
` (2 subsequent siblings)
16 siblings, 0 replies; 21+ messages in thread
From: Mathieu Desnoyers @ 2008-11-12 23:16 UTC (permalink / raw)
To: Linus Torvalds, akpm, Ingo Molnar, Peter Zijlstra, Steven Rostedt,
linux-kernel
Cc: Mathieu Desnoyers, Ralf Baechle
[-- Attachment #1: mips-export-hpt-frequency-for-trace-clock.patch --]
[-- Type: text/plain, Size: 1393 bytes --]
The trace clock needs the hpt frequency to be exported to modules (e.g. LTTng).
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Ralf Baechle <ralf@linux-mips.org>
---
arch/mips/include/asm/timex.h | 2 ++
arch/mips/kernel/time.c | 1 +
2 files changed, 3 insertions(+)
Index: linux.trees.git/arch/mips/include/asm/timex.h
===================================================================
--- linux.trees.git.orig/arch/mips/include/asm/timex.h 2008-11-12 18:03:16.000000000 -0500
+++ linux.trees.git/arch/mips/include/asm/timex.h 2008-11-12 18:03:33.000000000 -0500
@@ -89,6 +89,8 @@ static inline void write_tsc(u32 val1, u
write_c0_compare(read_c0_count() + DELAY_INTERRUPT);
}
+extern unsigned int mips_hpt_frequency;
+
#endif /* __KERNEL__ */
#endif /* _ASM_TIMEX_H */
Index: linux.trees.git/arch/mips/kernel/time.c
===================================================================
--- linux.trees.git.orig/arch/mips/kernel/time.c 2008-11-07 00:34:10.000000000 -0500
+++ linux.trees.git/arch/mips/kernel/time.c 2008-11-12 18:03:33.000000000 -0500
@@ -70,6 +70,7 @@ EXPORT_SYMBOL(perf_irq);
*/
unsigned int mips_hpt_frequency;
+EXPORT_SYMBOL(mips_hpt_frequency);
void __init clocksource_set_clock(struct clocksource *cs, unsigned int clock)
{
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
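The point of exporting mips_hpt_frequency is that a tracer module reading raw
cycle counts needs the counter frequency to turn them into wall-clock time. A
minimal sketch of that conversion (hypothetical helper, not part of the patch;
the 128-bit intermediate, which needs a compiler supporting __uint128_t, avoids
overflow for large cycle counts):

```c
#include <stdint.h>

/* Convert a raw cycle count to nanoseconds given the counter rate. */
static inline uint64_t cycles_to_ns(uint64_t cycles, uint32_t freq_hz)
{
	return (uint64_t)(((__uint128_t)cycles * 1000000000ULL) / freq_hz);
}
```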
* [patch 15/17] MIPS create empty sync_core()
2008-11-12 23:15 [patch 00/17] Trace Clock v3 Mathieu Desnoyers
` (13 preceding siblings ...)
2008-11-12 23:16 ` [patch 14/17] MIPS : export hpt frequency for trace_clock Mathieu Desnoyers
@ 2008-11-12 23:16 ` Mathieu Desnoyers
2008-11-12 23:16 ` [patch 16/17] MIPS : Trace clock Mathieu Desnoyers
2008-11-12 23:16 ` [patch 17/17] x86 trace clock Mathieu Desnoyers
16 siblings, 0 replies; 21+ messages in thread
From: Mathieu Desnoyers @ 2008-11-12 23:16 UTC (permalink / raw)
To: Linus Torvalds, akpm, Ingo Molnar, Peter Zijlstra, Steven Rostedt,
linux-kernel
Cc: Mathieu Desnoyers, Ralf Baechle
[-- Attachment #1: mips-create-empty-sync_core.patch --]
[-- Type: text/plain, Size: 1009 bytes --]
Needed by the architecture-independent tsc-sync.c.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Ralf Baechle <ralf@linux-mips.org>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
arch/mips/include/asm/barrier.h | 6 ++++++
1 file changed, 6 insertions(+)
Index: linux.trees.git/arch/mips/include/asm/barrier.h
===================================================================
--- linux.trees.git.orig/arch/mips/include/asm/barrier.h 2008-11-07 00:34:10.000000000 -0500
+++ linux.trees.git/arch/mips/include/asm/barrier.h 2008-11-12 18:03:51.000000000 -0500
@@ -152,4 +152,10 @@
#define smp_llsc_rmb() __asm__ __volatile__(__WEAK_LLSC_MB : : :"memory")
#define smp_llsc_wmb() __asm__ __volatile__(__WEAK_LLSC_MB : : :"memory")
+/*
+ * MIPS does not have any instruction to serialize instruction execution on the
+ * core.
+ */
+#define sync_core()
+
#endif /* __ASM_BARRIER_H */
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
* [patch 16/17] MIPS : Trace clock
2008-11-12 23:15 [patch 00/17] Trace Clock v3 Mathieu Desnoyers
` (14 preceding siblings ...)
2008-11-12 23:16 ` [patch 15/17] MIPS create empty sync_core() Mathieu Desnoyers
@ 2008-11-12 23:16 ` Mathieu Desnoyers
2008-11-12 23:16 ` [patch 17/17] x86 trace clock Mathieu Desnoyers
16 siblings, 0 replies; 21+ messages in thread
From: Mathieu Desnoyers @ 2008-11-12 23:16 UTC (permalink / raw)
To: Linus Torvalds, akpm, Ingo Molnar, Peter Zijlstra, Steven Rostedt,
linux-kernel
Cc: Mathieu Desnoyers, Ralf Baechle
[-- Attachment #1: mips-trace-clock.patch --]
[-- Type: text/plain, Size: 10311 bytes --]
MIPS get_cycles() only returns a 32-bit TSC (see timex.h), on the assumption
that a reschedule happens at least every 8 seconds or so. Given that tracing
needs to detect delays longer than 8 seconds, we need a full 64-bit TSC, which
is provided by trace-clock-32-to-64.
I leave the "depends on !CPU_R4400_WORKAROUNDS" in Kconfig because the solution
proposed by Ralf to deal with the R4400 bug is racy, so let's just not support
these broken CPUs. :(
This patch uses the same cache-line bouncing algorithm used for x86. This is a
best effort to support architectures lacking a synchronized TSC without adding
too much complexity too soon. This keeps room for improvement in a second phase.
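The 32-to-64 bit extension idea behind trace-clock-32-to-64 can be sketched as
follows (hypothetical names, shown without the concurrency handling; the kernel
version must also guarantee the counter is sampled at least once per 32-bit
period, ~8 seconds here):

```c
#include <stdint.h>

/*
 * Keep a software copy of the last full 64-bit value and bump the high
 * word whenever the 32-bit hardware counter is seen to wrap, i.e. the
 * new reading is below the previously recorded low word.
 */
static uint64_t synthetic_last;

static uint64_t synthetic_tsc_update(uint32_t hw_count)
{
	uint32_t last_lsb = (uint32_t)synthetic_last;
	uint64_t msb = synthetic_last & ~0xffffffffULL;

	if (hw_count < last_lsb)	/* 32-bit counter wrapped */
		msb += 1ULL << 32;
	synthetic_last = msb | hw_count;
	return synthetic_last;
}
```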
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Ralf Baechle <ralf@linux-mips.org>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
arch/mips/Kconfig | 3
arch/mips/include/asm/timex.h | 17 +++
arch/mips/include/asm/trace-clock.h | 65 +++++++++++++
arch/mips/kernel/Makefile | 2
arch/mips/kernel/trace-clock.c | 172 ++++++++++++++++++++++++++++++++++++
5 files changed, 258 insertions(+), 1 deletion(-)
Index: linux.trees.git/arch/mips/Kconfig
===================================================================
--- linux.trees.git.orig/arch/mips/Kconfig 2008-11-12 18:00:23.000000000 -0500
+++ linux.trees.git/arch/mips/Kconfig 2008-11-12 18:04:03.000000000 -0500
@@ -1614,6 +1614,9 @@ config CPU_R4400_WORKAROUNDS
config HAVE_GET_CYCLES_32
def_bool y
depends on !CPU_R4400_WORKAROUNDS
+ select HAVE_TRACE_CLOCK
+ select HAVE_TRACE_CLOCK_32_TO_64
+ select HAVE_UNSYNCHRONIZED_TSC
#
# Use the generic interrupt handling code in kernel/irq/:
Index: linux.trees.git/arch/mips/include/asm/trace-clock.h
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ linux.trees.git/arch/mips/include/asm/trace-clock.h 2008-11-12 18:04:03.000000000 -0500
@@ -0,0 +1,65 @@
+/*
+ * Copyright (C) 2005,2008 Mathieu Desnoyers
+ *
+ * Trace clock MIPS definitions.
+ */
+
+#ifndef _ASM_MIPS_TRACE_CLOCK_H
+#define _ASM_MIPS_TRACE_CLOCK_H
+
+#include <linux/timex.h>
+#include <asm/processor.h>
+
+#define TRACE_CLOCK_MIN_PROBE_DURATION 200
+
+extern u64 trace_clock_read_synthetic_tsc(void);
+
+/*
+ * MIPS get_cycles only returns a 32 bits TSC (see timex.h). The assumption
+ * there is that the reschedule is done every 8 seconds or so. Given that
+ * tracing needs to detect delays longer than 8 seconds, we need a full 64-bits
+ * TSC, which is provided by trace-clock-32-to-64.
+ */
+extern u64 trace_clock_async_tsc_read(void);
+
+static inline u32 trace_clock_read32(void)
+{
+ u32 cycles;
+
+ if (likely(tsc_is_sync()))
+ cycles = (u32)get_cycles(); /* only need the 32 LSB */
+ else
+ cycles = (u32)trace_clock_async_tsc_read();
+ return cycles;
+}
+
+static inline u64 trace_clock_read64(void)
+{
+ u64 cycles;
+
+ if (likely(tsc_is_sync()))
+ cycles = trace_clock_read_synthetic_tsc();
+ else
+ cycles = trace_clock_async_tsc_read();
+ return cycles;
+}
+
+static inline unsigned int trace_clock_frequency(void)
+{
+ return mips_hpt_frequency;
+}
+
+static inline u32 trace_clock_freq_scale(void)
+{
+ return 1;
+}
+
+extern void get_trace_clock(void);
+extern void put_trace_clock(void);
+extern void get_synthetic_tsc(void);
+extern void put_synthetic_tsc(void);
+
+static inline void set_trace_clock_is_sync(int state)
+{
+}
+#endif /* _ASM_MIPS_TRACE_CLOCK_H */
Index: linux.trees.git/arch/mips/kernel/trace-clock.c
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ linux.trees.git/arch/mips/kernel/trace-clock.c 2008-11-12 18:04:03.000000000 -0500
@@ -0,0 +1,172 @@
+/*
+ * arch/mips/kernel/trace-clock.c
+ *
+ * Trace clock for mips.
+ *
+ * Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>, October 2008
+ */
+
+#include <linux/module.h>
+#include <linux/trace-clock.h>
+#include <linux/jiffies.h>
+#include <linux/mutex.h>
+#include <linux/timer.h>
+#include <linux/spinlock.h>
+
+static u64 trace_clock_last_tsc;
+static DEFINE_PER_CPU(struct timer_list, update_timer);
+static DEFINE_MUTEX(async_tsc_mutex);
+static int async_tsc_refcount; /* Number of readers */
+static int async_tsc_enabled; /* Async TSC enabled on all online CPUs */
+
+/*
+ * Support for architectures with non-sync TSCs.
+ * When the local TSC is discovered to lag behind the highest TSC counter, we
+ * increment the TSC count of an amount that should be, ideally, lower than the
+ * execution time of this routine, in cycles : this is the granularity we look
+ * for : we must be able to order the events.
+ */
+
+#if BITS_PER_LONG == 64
+notrace u64 trace_clock_async_tsc_read(void)
+{
+ u64 new_tsc, last_tsc;
+
+ WARN_ON(!async_tsc_refcount || !async_tsc_enabled);
+ new_tsc = trace_clock_read_synthetic_tsc();
+ do {
+ last_tsc = trace_clock_last_tsc;
+ if (new_tsc < last_tsc)
+ new_tsc = last_tsc + TRACE_CLOCK_MIN_PROBE_DURATION;
+ /*
+ * If cmpxchg fails with a value higher than the new_tsc, don't
+ * retry : the value has been incremented and the events
+ * happened almost at the same time.
+ * We must retry if cmpxchg fails with a lower value :
+ * it means that we are the CPU with highest frequency and
+ * therefore MUST update the value.
+ */
+ } while (cmpxchg64(&trace_clock_last_tsc, last_tsc, new_tsc) < new_tsc);
+ return new_tsc;
+}
+EXPORT_SYMBOL_GPL(trace_clock_async_tsc_read);
+#else
+/*
+ * Emulate an atomic 64-bits update with a spinlock.
+ * Note : preempt_disable or irq save must be explicit with raw_spinlock_t.
+ * Given we use a spinlock for this time base, we should never be called from
+ * NMI context.
+ */
+static raw_spinlock_t trace_clock_lock =
+ (raw_spinlock_t)__RAW_SPIN_LOCK_UNLOCKED;
+
+static inline u64 trace_clock_cmpxchg64(u64 *ptr, u64 old, u64 new)
+{
+ u64 val;
+
+ val = *ptr;
+ if (likely(val == old))
+ *ptr = val = new;
+ return val;
+}
+
+notrace u64 trace_clock_async_tsc_read(void)
+{
+ u64 new_tsc, last_tsc;
+ unsigned long flags;
+
+ WARN_ON(!async_tsc_refcount || !async_tsc_enabled);
+ local_irq_save(flags);
+ __raw_spin_lock(&trace_clock_lock);
+ new_tsc = trace_clock_read_synthetic_tsc();
+ do {
+ last_tsc = trace_clock_last_tsc;
+ if (new_tsc < last_tsc)
+ new_tsc = last_tsc + TRACE_CLOCK_MIN_PROBE_DURATION;
+ /*
+ * If cmpxchg fails with a value higher than the new_tsc, don't
+ * retry : the value has been incremented and the events
+ * happened almost at the same time.
+ * We must retry if cmpxchg fails with a lower value :
+ * it means that we are the CPU with highest frequency and
+ * therefore MUST update the value.
+ */
+ } while (trace_clock_cmpxchg64(&trace_clock_last_tsc, last_tsc,
+ new_tsc) < new_tsc);
+ __raw_spin_unlock(&trace_clock_lock);
+ local_irq_restore(flags);
+ return new_tsc;
+}
+EXPORT_SYMBOL_GPL(trace_clock_async_tsc_read);
+#endif
+
+
+static void update_timer_ipi(void *info)
+{
+ (void)trace_clock_async_tsc_read();
+}
+
+/*
+ * update_timer_fct : - Timer function to resync the clocks
+ * @data: unused
+ *
+ * Fires every jiffy.
+ */
+static void update_timer_fct(unsigned long data)
+{
+ (void)trace_clock_async_tsc_read();
+
+ per_cpu(update_timer, smp_processor_id()).expires = jiffies + 1;
+ add_timer_on(&per_cpu(update_timer, smp_processor_id()),
+ smp_processor_id());
+}
+
+static void enable_trace_clock(int cpu)
+{
+ init_timer(&per_cpu(update_timer, cpu));
+ per_cpu(update_timer, cpu).function = update_timer_fct;
+ per_cpu(update_timer, cpu).expires = jiffies + 1;
+ smp_call_function_single(cpu, update_timer_ipi, NULL, 1);
+ add_timer_on(&per_cpu(update_timer, cpu), cpu);
+}
+
+static void disable_trace_clock(int cpu)
+{
+ del_timer_sync(&per_cpu(update_timer, cpu));
+}
+
+void get_trace_clock(void)
+{
+ int cpu;
+
+ get_synthetic_tsc();
+ mutex_lock(&async_tsc_mutex);
+ if (async_tsc_refcount++ || tsc_is_sync())
+ goto end;
+
+ async_tsc_enabled = 1;
+ for_each_online_cpu(cpu)
+ enable_trace_clock(cpu);
+end:
+ mutex_unlock(&async_tsc_mutex);
+}
+EXPORT_SYMBOL_GPL(get_trace_clock);
+
+void put_trace_clock(void)
+{
+ int cpu;
+
+ mutex_lock(&async_tsc_mutex);
+ WARN_ON(async_tsc_refcount <= 0);
+ if (async_tsc_refcount != 1 || !async_tsc_enabled)
+ goto end;
+
+ for_each_online_cpu(cpu)
+ disable_trace_clock(cpu);
+ async_tsc_enabled = 0;
+end:
+ async_tsc_refcount--;
+ mutex_unlock(&async_tsc_mutex);
+ put_synthetic_tsc();
+}
+EXPORT_SYMBOL_GPL(put_trace_clock);
Index: linux.trees.git/arch/mips/include/asm/timex.h
===================================================================
--- linux.trees.git.orig/arch/mips/include/asm/timex.h 2008-11-12 18:03:33.000000000 -0500
+++ linux.trees.git/arch/mips/include/asm/timex.h 2008-11-12 18:04:03.000000000 -0500
@@ -42,7 +42,7 @@
typedef unsigned int cycles_t;
-#ifdef HAVE_GET_CYCLES_32
+#ifdef CONFIG_HAVE_GET_CYCLES_32
static inline cycles_t get_cycles(void)
{
return read_c0_count();
@@ -91,6 +91,21 @@ static inline void write_tsc(u32 val1, u
extern unsigned int mips_hpt_frequency;
+/*
+ * Currently unused, should update internal tsc-related timekeeping sources.
+ */
+static inline void mark_tsc_unstable(char *reason)
+{
+}
+
+/*
+ * Currently simply use the tsc_is_sync value.
+ */
+static inline int unsynchronized_tsc(void)
+{
+ return !tsc_is_sync();
+}
+
#endif /* __KERNEL__ */
#endif /* _ASM_TIMEX_H */
Index: linux.trees.git/arch/mips/kernel/Makefile
===================================================================
--- linux.trees.git.orig/arch/mips/kernel/Makefile 2008-11-07 00:34:10.000000000 -0500
+++ linux.trees.git/arch/mips/kernel/Makefile 2008-11-12 18:04:03.000000000 -0500
@@ -85,6 +85,8 @@ obj-$(CONFIG_GPIO_TXX9) += gpio_txx9.o
obj-$(CONFIG_KEXEC) += machine_kexec.o relocate_kernel.o
obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
+obj-$(CONFIG_HAVE_TRACE_CLOCK) += trace-clock.o
+
CFLAGS_cpu-bugs64.o = $(shell if $(CC) $(KBUILD_CFLAGS) -Wa,-mdaddi -c -o /dev/null -xc /dev/null >/dev/null 2>&1; then echo "-DHAVE_AS_SET_DADDI"; fi)
obj-$(CONFIG_HAVE_STD_PC_SERIAL_PORT) += 8250-platform.o
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
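The monotonic fallback used by the MIPS trace clock above (and shared with the
x86 version) can be sketched as a plain function: the globally last published
timestamp is never allowed to go backwards, and a reader whose local counter
lags is pushed just past the last value instead. The kernel version publishes
with cmpxchg64 and retries on concurrent updates; the names here are
hypothetical and the update is shown without concurrency:

```c
#include <stdint.h>

/* Lower bound on the advance per read, in cycles, so that two
 * back-to-back reads on a lagging CPU still produce ordered values. */
#define MIN_PROBE_DURATION 200

static uint64_t last_published;

static uint64_t monotonic_read(uint64_t local_tsc)
{
	uint64_t new_tsc = local_tsc;

	if (new_tsc < last_published)	/* local counter lags behind */
		new_tsc = last_published + MIN_PROBE_DURATION;
	last_published = new_tsc;
	return new_tsc;
}
```

The cost is that every reader touches the same cache line holding
last_published, which is exactly the "cache-line bouncing" the cover letter
refers to.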
* [patch 17/17] x86 trace clock
2008-11-12 23:15 [patch 00/17] Trace Clock v3 Mathieu Desnoyers
` (15 preceding siblings ...)
2008-11-12 23:16 ` [patch 16/17] MIPS : Trace clock Mathieu Desnoyers
@ 2008-11-12 23:16 ` Mathieu Desnoyers
16 siblings, 0 replies; 21+ messages in thread
From: Mathieu Desnoyers @ 2008-11-12 23:16 UTC (permalink / raw)
To: Linus Torvalds, akpm, Ingo Molnar, Peter Zijlstra, Steven Rostedt,
linux-kernel
Cc: Mathieu Desnoyers, Thomas Gleixner, Ingo Molnar, H. Peter Anvin
[-- Attachment #1: x86-trace-clock.patch --]
[-- Type: text/plain, Size: 11154 bytes --]
X86 trace clock. Depends on tsc_sync to detect whether the timestamp counters
are synchronized on the machine.
I am leaving this poorly scalable solution in place for now because it is the
simplest working solution I found (compared to using the HPET, which also scales
very poorly, probably due to bus contention). This should be a good start and
let us trace a good amount of the machines out there.
A "Big Fat" (TM) warning is shown on the console when the trace clock is used on
systems without synchronized TSCs, telling the user to
- use force_tsc_sync=1
- use idle=poll
- disable PowerNow or SpeedStep
in order to get accurate and fast timestamps.
This keeps room for further improvement in a second phase.
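On 32-bit kernels a 64-bit load of the shared timestamp can tear, which is why
the patch's read_last_tsc() reads the value twice and retries until both reads
agree (relying on the counter never being published twice with the same value).
A sketch of that double-read, with the successive observations modeled as an
array since there is no concurrent writer here (hypothetical form, not the
patch's code):

```c
#include <stdint.h>

/*
 * obs_seq[] stands in for successive reads of the concurrently-updated
 * 64-bit variable.  Retry until two consecutive reads return the same
 * value, which proves the read was coherent.
 */
static uint64_t read_last_tsc(const uint64_t *obs_seq)
{
	unsigned int i = 0;
	uint64_t val1 = obs_seq[i++], val2;

	for (;;) {
		val2 = val1;
		val1 = obs_seq[i++];
		if (val1 == val2)	/* two identical reads: coherent */
			break;
	}
	return val1;
}
```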
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ingo Molnar <mingo@redhat.com>
CC: H. Peter Anvin <hpa@zytor.com>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Steven Rostedt <rostedt@goodmis.org>
---
arch/x86/Kconfig | 1
arch/x86/kernel/Makefile | 1
arch/x86/kernel/trace-clock.c | 248 ++++++++++++++++++++++++++++++++++++++++++
include/asm-x86/trace-clock.h | 70 +++++++++++
4 files changed, 320 insertions(+)
Index: linux.trees.git/arch/x86/Kconfig
===================================================================
--- linux.trees.git.orig/arch/x86/Kconfig 2008-11-12 18:02:33.000000000 -0500
+++ linux.trees.git/arch/x86/Kconfig 2008-11-12 18:04:25.000000000 -0500
@@ -27,6 +27,7 @@ config X86
select HAVE_KPROBES
select ARCH_WANT_OPTIONAL_GPIOLIB
select HAVE_KRETPROBES
+ select HAVE_TRACE_CLOCK
select HAVE_FTRACE_MCOUNT_RECORD
select HAVE_DYNAMIC_FTRACE
select HAVE_FUNCTION_TRACER
Index: linux.trees.git/arch/x86/kernel/Makefile
===================================================================
--- linux.trees.git.orig/arch/x86/kernel/Makefile 2008-11-12 18:02:33.000000000 -0500
+++ linux.trees.git/arch/x86/kernel/Makefile 2008-11-12 18:04:09.000000000 -0500
@@ -36,6 +36,7 @@ obj-y += bootflag.o e820.o
obj-y += pci-dma.o quirks.o i8237.o topology.o kdebugfs.o
obj-y += alternative.o i8253.o pci-nommu.o
obj-y += tsc.o io_delay.o rtc.o
+obj-$(CONFIG_HAVE_TRACE_CLOCK) += trace-clock.o
obj-$(CONFIG_X86_TRAMPOLINE) += trampoline.o
obj-y += process.o
Index: linux.trees.git/arch/x86/kernel/trace-clock.c
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ linux.trees.git/arch/x86/kernel/trace-clock.c 2008-11-12 18:04:09.000000000 -0500
@@ -0,0 +1,248 @@
+/*
+ * arch/x86/kernel/trace-clock.c
+ *
+ * Trace clock for x86.
+ *
+ * Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>, October 2008
+ */
+
+#include <linux/module.h>
+#include <linux/trace-clock.h>
+#include <linux/jiffies.h>
+#include <linux/mutex.h>
+#include <linux/timer.h>
+#include <linux/cpu.h>
+
+static cycles_t trace_clock_last_tsc;
+static DEFINE_PER_CPU(struct timer_list, update_timer);
+static DEFINE_MUTEX(async_tsc_mutex);
+static int async_tsc_refcount; /* Number of readers */
+static int async_tsc_enabled; /* Async TSC enabled on all online CPUs */
+
+int _trace_clock_is_sync = 1;
+EXPORT_SYMBOL_GPL(_trace_clock_is_sync);
+
+/*
+ * Called by check_tsc_sync_source from CPU hotplug.
+ */
+void set_trace_clock_is_sync(int state)
+{
+ _trace_clock_is_sync = state;
+}
+
+#if BITS_PER_LONG == 64
+static cycles_t read_last_tsc(void)
+{
+ return trace_clock_last_tsc;
+}
+#else
+/*
+ * A cmpxchg64 update can happen concurrently. Based on the assumption that
+ * two cmpxchg64 will never update it to the same value (the count always
+ * increases), reading it twice insures that we read a coherent value with the
+ * same "sequence number".
+ */
+static cycles_t read_last_tsc(void)
+{
+ cycles_t val1, val2;
+
+ val1 = trace_clock_last_tsc;
+ for (;;) {
+ val2 = val1;
+ barrier();
+ val1 = trace_clock_last_tsc;
+ if (likely(val1 == val2))
+ break;
+ }
+ return val1;
+}
+#endif
+
+/*
+ * Support for architectures with non-sync TSCs.
+ * When the local TSC is discovered to lag behind the highest TSC counter, we
+ * increment the TSC count of an amount that should be, ideally, lower than the
+ * execution time of this routine, in cycles : this is the granularity we look
+ * for : we must be able to order the events.
+ */
+notrace cycles_t trace_clock_async_tsc_read(void)
+{
+ cycles_t new_tsc, last_tsc;
+
+ WARN_ON(!async_tsc_refcount || !async_tsc_enabled);
+ rdtsc_barrier();
+ new_tsc = get_cycles();
+ rdtsc_barrier();
+ last_tsc = read_last_tsc();
+ do {
+ if (new_tsc < last_tsc)
+ new_tsc = last_tsc + TRACE_CLOCK_MIN_PROBE_DURATION;
+ /*
+ * If cmpxchg fails with a value higher than the new_tsc, don't
+ * retry : the value has been incremented and the events
+ * happened almost at the same time.
+ * We must retry if cmpxchg fails with a lower value :
+ * it means that we are the CPU with highest frequency and
+ * therefore MUST update the value.
+ */
+ last_tsc = cmpxchg64(&trace_clock_last_tsc, last_tsc, new_tsc);
+ } while (unlikely(last_tsc < new_tsc));
+ return new_tsc;
+}
+EXPORT_SYMBOL_GPL(trace_clock_async_tsc_read);
+
+static void update_timer_ipi(void *info)
+{
+ (void)trace_clock_async_tsc_read();
+}
+
+/*
+ * update_timer_fct : - Timer function to resync the clocks
+ * @data: unused
+ *
+ * Fires every jiffy.
+ */
+static void update_timer_fct(unsigned long data)
+{
+ (void)trace_clock_async_tsc_read();
+
+ per_cpu(update_timer, smp_processor_id()).expires = jiffies + 1;
+ add_timer_on(&per_cpu(update_timer, smp_processor_id()),
+ smp_processor_id());
+}
+
+static void enable_trace_clock(int cpu)
+{
+ init_timer(&per_cpu(update_timer, cpu));
+ per_cpu(update_timer, cpu).function = update_timer_fct;
+ per_cpu(update_timer, cpu).expires = jiffies + 1;
+ smp_call_function_single(cpu, update_timer_ipi, NULL, 1);
+ add_timer_on(&per_cpu(update_timer, cpu), cpu);
+}
+
+static void disable_trace_clock(int cpu)
+{
+ del_timer_sync(&per_cpu(update_timer, cpu));
+}
+
+/*
+ * hotcpu_callback - CPU hotplug callback
+ * @nb: notifier block
+ * @action: hotplug action to take
+ * @hcpu: CPU number
+ *
+ * Returns the success/failure of the operation. (NOTIFY_OK, NOTIFY_BAD)
+ */
+static int __cpuinit hotcpu_callback(struct notifier_block *nb,
+ unsigned long action,
+ void *hcpu)
+{
+ unsigned int hotcpu = (unsigned long)hcpu;
+ int cpu;
+
+ mutex_lock(&async_tsc_mutex);
+ switch (action) {
+ case CPU_UP_PREPARE:
+ case CPU_UP_PREPARE_FROZEN:
+ break;
+ case CPU_ONLINE:
+ case CPU_ONLINE_FROZEN:
+ /*
+ * trace_clock_is_sync() is updated by set_trace_clock_is_sync()
+ * code, protected by cpu hotplug disable.
+ * It is ok to let the hotplugged CPU read the timebase before
+ * the CPU_ONLINE notification. It's just there to give a
+ * maximum bound to the TSC error.
+ */
+ if (async_tsc_refcount && !trace_clock_is_sync()) {
+ if (!async_tsc_enabled) {
+ async_tsc_enabled = 1;
+ for_each_online_cpu(cpu)
+ enable_trace_clock(cpu);
+ } else {
+ enable_trace_clock(hotcpu);
+ }
+ }
+ break;
+#ifdef CONFIG_HOTPLUG_CPU
+ case CPU_UP_CANCELED:
+ case CPU_UP_CANCELED_FROZEN:
+ if (!async_tsc_refcount && num_online_cpus() == 1)
+ set_trace_clock_is_sync(1);
+ break;
+ case CPU_DEAD:
+ case CPU_DEAD_FROZEN:
+ /*
+ * We cannot stop the trace clock on other CPUs when readers are
+ * active even if we go back to a synchronized state (1 CPU)
+ * because the CPU left could be the one lagging behind.
+ */
+ if (async_tsc_refcount && async_tsc_enabled)
+ disable_trace_clock(hotcpu);
+ if (!async_tsc_refcount && num_online_cpus() == 1)
+ set_trace_clock_is_sync(1);
+ break;
+#endif /* CONFIG_HOTPLUG_CPU */
+ }
+ mutex_unlock(&async_tsc_mutex);
+
+ return NOTIFY_OK;
+}
+
+void get_trace_clock(void)
+{
+ int cpu;
+
+ if (!trace_clock_is_sync()) {
+ printk(KERN_WARNING
+ "Trace clock falls back on cache-line bouncing\n"
+ "workaround due to non-synchronized TSCs.\n"
+ "This workaround preserves event order across CPUs.\n"
+ "Please consider disabling Speedstep or PowerNow and\n"
+ "using kernel parameters "
+ "\"force_tsc_sync=1 idle=poll\"\n"
+		       "for an accurate and fast trace clock source.\n");
+ }
+
+ get_online_cpus();
+ mutex_lock(&async_tsc_mutex);
+ if (async_tsc_refcount++ || trace_clock_is_sync())
+ goto end;
+
+ async_tsc_enabled = 1;
+ for_each_online_cpu(cpu)
+ enable_trace_clock(cpu);
+end:
+ mutex_unlock(&async_tsc_mutex);
+ put_online_cpus();
+}
+EXPORT_SYMBOL_GPL(get_trace_clock);
+
+void put_trace_clock(void)
+{
+ int cpu;
+
+ get_online_cpus();
+ mutex_lock(&async_tsc_mutex);
+ WARN_ON(async_tsc_refcount <= 0);
+ if (async_tsc_refcount != 1 || !async_tsc_enabled)
+ goto end;
+
+ for_each_online_cpu(cpu)
+ disable_trace_clock(cpu);
+ async_tsc_enabled = 0;
+end:
+ async_tsc_refcount--;
+ if (!async_tsc_refcount && num_online_cpus() == 1)
+ set_trace_clock_is_sync(1);
+ mutex_unlock(&async_tsc_mutex);
+ put_online_cpus();
+}
+EXPORT_SYMBOL_GPL(put_trace_clock);
+
+static __init int init_unsync_trace_clock(void)
+{
+ hotcpu_notifier(hotcpu_callback, 4);
+ return 0;
+}
+early_initcall(init_unsync_trace_clock);
Index: linux.trees.git/include/asm-x86/trace-clock.h
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ linux.trees.git/include/asm-x86/trace-clock.h 2008-11-12 18:04:09.000000000 -0500
@@ -0,0 +1,70 @@
+#ifndef _ASM_X86_TRACE_CLOCK_H
+#define _ASM_X86_TRACE_CLOCK_H
+
+/*
+ * linux/include/asm-x86/trace-clock.h
+ *
+ * Copyright (C) 2005,2006,2008
+ * Mathieu Desnoyers (mathieu.desnoyers@polymtl.ca)
+ *
+ * Trace clock definitions for x86.
+ */
+
+#include <linux/timex.h>
+#include <asm/system.h>
+#include <asm/processor.h>
+#include <asm/atomic.h>
+
+/* Minimum duration of a probe, in cycles */
+#define TRACE_CLOCK_MIN_PROBE_DURATION 200
+
+extern cycles_t trace_clock_async_tsc_read(void);
+
+extern int _trace_clock_is_sync;
+static inline int trace_clock_is_sync(void)
+{
+ return _trace_clock_is_sync;
+}
+
+static inline u32 trace_clock_read32(void)
+{
+ u32 cycles;
+
+ if (likely(trace_clock_is_sync())) {
+ get_cycles_barrier();
+ cycles = (u32)get_cycles(); /* only need the 32 LSB */
+ get_cycles_barrier();
+ } else
+ cycles = (u32)trace_clock_async_tsc_read();
+ return cycles;
+}
+
+static inline u64 trace_clock_read64(void)
+{
+ u64 cycles;
+
+ if (likely(trace_clock_is_sync())) {
+ get_cycles_barrier();
+ cycles = get_cycles();
+ get_cycles_barrier();
+ } else
+ cycles = trace_clock_async_tsc_read();
+ return cycles;
+}
+
+static inline unsigned int trace_clock_frequency(void)
+{
+ return cpu_khz;
+}
+
+static inline u32 trace_clock_freq_scale(void)
+{
+ return 1000;
+}
+
+extern void get_trace_clock(void);
+extern void put_trace_clock(void);
+
+extern void set_trace_clock_is_sync(int state);
+
+#endif /* _ASM_X86_TRACE_CLOCK_H */
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [patch 02/17] get_cycles() : x86 HAVE_GET_CYCLES
2008-11-12 23:15 ` [patch 02/17] get_cycles() : x86 HAVE_GET_CYCLES Mathieu Desnoyers
@ 2008-11-13 7:15 ` Geert Uytterhoeven
2008-11-13 13:34 ` Mathieu Desnoyers
0 siblings, 1 reply; 21+ messages in thread
From: Geert Uytterhoeven @ 2008-11-13 7:15 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: Linus Torvalds, akpm, Ingo Molnar, Peter Zijlstra, Steven Rostedt,
linux-kernel, David Miller, Ingo Molnar, Thomas Gleixner,
linux-arch
On Wed, 12 Nov 2008, Mathieu Desnoyers wrote:
> --- linux.trees.git.orig/arch/x86/include/asm/tsc.h 2008-11-12 18:15:25.000000000 -0500
> +++ linux.trees.git/arch/x86/include/asm/tsc.h 2008-11-12 18:15:28.000000000 -0500
> @@ -56,6 +56,18 @@ extern void mark_tsc_unstable(char *reas
> extern int unsynchronized_tsc(void);
> int check_tsc_unstable(void);
>
> +static inline cycles_t get_cycles_rate(void)
> +{
> + if (check_tsc_unstable())
> + return 0;
> + return tsc_khz;
^^^
The comment in Kconfig says:
| get_cycles_rate() : cycle counter rate, in HZ
^^
So what should it be? Hz or kHz?
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
* Re: [patch 02/17] get_cycles() : x86 HAVE_GET_CYCLES
2008-11-13 7:15 ` Geert Uytterhoeven
@ 2008-11-13 13:34 ` Mathieu Desnoyers
0 siblings, 0 replies; 21+ messages in thread
From: Mathieu Desnoyers @ 2008-11-13 13:34 UTC (permalink / raw)
To: Geert Uytterhoeven
Cc: Linus Torvalds, akpm, Ingo Molnar, Peter Zijlstra, Steven Rostedt,
linux-kernel, David Miller, Ingo Molnar, Thomas Gleixner,
linux-arch
* Geert Uytterhoeven (geert@linux-m68k.org) wrote:
> On Wed, 12 Nov 2008, Mathieu Desnoyers wrote:
> > --- linux.trees.git.orig/arch/x86/include/asm/tsc.h 2008-11-12 18:15:25.000000000 -0500
> > +++ linux.trees.git/arch/x86/include/asm/tsc.h 2008-11-12 18:15:28.000000000 -0500
> > @@ -56,6 +56,18 @@ extern void mark_tsc_unstable(char *reas
> > extern int unsynchronized_tsc(void);
> > int check_tsc_unstable(void);
> >
> > +static inline cycles_t get_cycles_rate(void)
> > +{
> > + if (check_tsc_unstable())
> > + return 0;
> > + return tsc_khz;
> ^^^
>
> The comment in Kconfig says:
>
> | get_cycles_rate() : cycle counter rate, in HZ
> ^^
>
> So what should it be? Hz or kHz?
>
HZ, for consistency.
So it becomes:
+static inline cycles_t get_cycles_rate(void)
+{
+ if (check_tsc_unstable())
+ return 0;
+ return (cycles_t)tsc_khz * 1000;
+}
+
Thanks for spotting this, will be integrated in the next release.
As a side-note, I noticed that I used CLOCK_TICK_RATE on MIPS when I
should use mips_hpt_frequency. This too will be fixed in the next
release.
Mathieu
> Gr{oetje,eeting}s,
>
> Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
> -- Linus Torvalds
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
* [patch 02/17] get_cycles() : x86 HAVE_GET_CYCLES
2008-11-26 12:42 [patch 00/17] Trace Clock v4 Mathieu Desnoyers
@ 2008-11-26 12:42 ` Mathieu Desnoyers
0 siblings, 0 replies; 21+ messages in thread
From: Mathieu Desnoyers @ 2008-11-26 12:42 UTC (permalink / raw)
To: Ingo Molnar, akpm, Linus Torvalds, linux-kernel
Cc: Mathieu Desnoyers, David Miller, Ingo Molnar, Peter Zijlstra,
Thomas Gleixner, Steven Rostedt, linux-arch
[-- Attachment #1: get-cycles-x86-have-get-cycles.patch --]
[-- Type: text/plain, Size: 1892 bytes --]
This patch selects HAVE_GET_CYCLES and makes sure get_cycles_barrier() and
get_cycles_rate() are implemented.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: David Miller <davem@davemloft.net>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Ingo Molnar <mingo@redhat.com>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: linux-arch@vger.kernel.org
---
arch/x86/Kconfig | 1 +
arch/x86/include/asm/tsc.h | 12 ++++++++++++
2 files changed, 13 insertions(+)
Index: linux.trees.git/arch/x86/Kconfig
===================================================================
--- linux.trees.git.orig/arch/x86/Kconfig 2008-11-26 05:46:29.000000000 -0500
+++ linux.trees.git/arch/x86/Kconfig 2008-11-26 06:48:49.000000000 -0500
@@ -20,6 +20,7 @@ config X86
def_bool y
select HAVE_AOUT if X86_32
select HAVE_UNSTABLE_SCHED_CLOCK
+ select HAVE_GET_CYCLES
select HAVE_IDE
select HAVE_OPROFILE
select HAVE_IOREMAP_PROT
Index: linux.trees.git/arch/x86/include/asm/tsc.h
===================================================================
--- linux.trees.git.orig/arch/x86/include/asm/tsc.h 2008-11-14 17:38:28.000000000 -0500
+++ linux.trees.git/arch/x86/include/asm/tsc.h 2008-11-26 06:48:49.000000000 -0500
@@ -50,6 +50,18 @@ extern void mark_tsc_unstable(char *reas
extern int unsynchronized_tsc(void);
int check_tsc_unstable(void);
+static inline cycles_t get_cycles_rate(void)
+{
+ if (check_tsc_unstable())
+ return 0;
+ return (cycles_t)tsc_khz * 1000;
+}
+
+static inline void get_cycles_barrier(void)
+{
+ rdtsc_barrier();
+}
+
/*
* Boot-time check whether the TSCs are synchronized across
* all CPUs/cores:
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
end of thread, other threads:[~2008-11-26 13:03 UTC | newest]
Thread overview: 21+ messages
2008-11-12 23:15 [patch 00/17] Trace Clock v3 Mathieu Desnoyers
2008-11-12 23:15 ` [patch 01/17] get_cycles() : kconfig HAVE_GET_CYCLES Mathieu Desnoyers
2008-11-12 23:15 ` [patch 02/17] get_cycles() : x86 HAVE_GET_CYCLES Mathieu Desnoyers
2008-11-13 7:15 ` Geert Uytterhoeven
2008-11-13 13:34 ` Mathieu Desnoyers
2008-11-12 23:15 ` [patch 03/17] get_cycles() : sparc64 HAVE_GET_CYCLES Mathieu Desnoyers
2008-11-12 23:15 ` [patch 04/17] get_cycles() : powerpc64 HAVE_GET_CYCLES Mathieu Desnoyers
2008-11-12 23:15 ` [patch 05/17] get_cycles() : MIPS HAVE_GET_CYCLES_32 Mathieu Desnoyers
2008-11-12 23:15 ` [patch 06/17] Trace clock core Mathieu Desnoyers
2008-11-12 23:15 ` [patch 07/17] Trace clock generic Mathieu Desnoyers
2008-11-12 23:15 ` [patch 08/17] Powerpc : Trace clock Mathieu Desnoyers
2008-11-12 23:16 ` [patch 09/17] Sparc64 " Mathieu Desnoyers
2008-11-12 23:16 ` [patch 10/17] LTTng timestamp sh Mathieu Desnoyers
2008-11-12 23:16 ` [patch 11/17] LTTng - TSC synchronicity test Mathieu Desnoyers
2008-11-12 23:16 ` [patch 12/17] x86 : remove arch-specific tsc_sync.c Mathieu Desnoyers
2008-11-12 23:16 ` [patch 13/17] MIPS use tsc_sync.c Mathieu Desnoyers
2008-11-12 23:16 ` [patch 14/17] MIPS : export hpt frequency for trace_clock Mathieu Desnoyers
2008-11-12 23:16 ` [patch 15/17] MIPS create empty sync_core() Mathieu Desnoyers
2008-11-12 23:16 ` [patch 16/17] MIPS : Trace clock Mathieu Desnoyers
2008-11-12 23:16 ` [patch 17/17] x86 trace clock Mathieu Desnoyers
-- strict thread matches above, loose matches on Subject: below --
2008-11-26 12:42 [patch 00/17] Trace Clock v4 Mathieu Desnoyers
2008-11-26 12:42 ` [patch 02/17] get_cycles() : x86 HAVE_GET_CYCLES Mathieu Desnoyers