* [PATCH 1/2] tracing/function-return-tracer: Make the function return tracer lockless
@ 2008-11-12 21:47 Frederic Weisbecker
2008-11-12 22:15 ` Ingo Molnar
2008-11-13 14:53 ` Andi Kleen
0 siblings, 2 replies; 27+ messages in thread
From: Frederic Weisbecker @ 2008-11-12 21:47 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Steven Rostedt, Linux Kernel
Impact: remove spinlocks and irq disabling in function return tracer.
I've tried to figure out all of the race condition that could happen when
the tracer pushes or pops a return address trace to/from the current
thread_info.
Theory:
_ One thread can only execute on one cpu at a time. So this code doesn't need
to be SMP-safe. Just drop the spinlock.
_ The only race could happen between the current thread and an interrupt. If an
interrupt is raised, it will increase the index of the return stack storage and
then execute until the end of the tracing to finally free the index it used.
We don't need to disable irqs.
This is theorical. In practice, I've tested it with a two-core SMP and had no
problem at all. Perhaps -tip testing could confirm it.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
---
arch/x86/kernel/ftrace.c | 43 +++++--------------------------------------
1 files changed, 5 insertions(+), 38 deletions(-)
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 16a571d..1db0e12 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -44,62 +44,37 @@ void ftrace_nmi_exit(void)
atomic_dec(&in_nmi);
}
-/*
- * Synchronize accesses to return adresses stack with
- * interrupts.
- */
-static raw_spinlock_t ret_stack_lock;
-
/* Add a function return address to the trace stack on thread info.*/
static int push_return_trace(unsigned long ret, unsigned long long time,
unsigned long func)
{
int index;
- struct thread_info *ti;
- unsigned long flags;
- int err = 0;
-
- raw_local_irq_save(flags);
- __raw_spin_lock(&ret_stack_lock);
+ struct thread_info *ti = current_thread_info();
- ti = current_thread_info();
/* The return trace stack is full */
- if (ti->curr_ret_stack == FTRACE_RET_STACK_SIZE - 1) {
- err = -EBUSY;
- goto out;
- }
+ if (ti->curr_ret_stack == FTRACE_RET_STACK_SIZE - 1)
+ return -EBUSY;
index = ++ti->curr_ret_stack;
ti->ret_stack[index].ret = ret;
ti->ret_stack[index].func = func;
ti->ret_stack[index].calltime = time;
-out:
- __raw_spin_unlock(&ret_stack_lock);
- raw_local_irq_restore(flags);
- return err;
+ return 0;
}
/* Retrieve a function return address to the trace stack on thread info.*/
static void pop_return_trace(unsigned long *ret, unsigned long long *time,
unsigned long *func)
{
- struct thread_info *ti;
int index;
- unsigned long flags;
-
- raw_local_irq_save(flags);
- __raw_spin_lock(&ret_stack_lock);
- ti = current_thread_info();
+ struct thread_info *ti = current_thread_info();
index = ti->curr_ret_stack;
*ret = ti->ret_stack[index].ret;
*func = ti->ret_stack[index].func;
*time = ti->ret_stack[index].calltime;
ti->curr_ret_stack--;
-
- __raw_spin_unlock(&ret_stack_lock);
- raw_local_irq_restore(flags);
}
/*
@@ -175,14 +150,6 @@ void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr)
*parent = old;
}
-static int __init init_ftrace_function_return(void)
-{
- ret_stack_lock = (raw_spinlock_t)__RAW_SPIN_LOCK_UNLOCKED;
- return 0;
-}
-device_initcall(init_ftrace_function_return);
-
-
#endif
#ifdef CONFIG_DYNAMIC_FTRACE
--
1.5.2.5
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] tracing/function-return-tracer: Make the function return tracer lockless
2008-11-12 21:47 [PATCH 1/2] tracing/function-return-tracer: Make the function return tracer lockless Frederic Weisbecker
@ 2008-11-12 22:15 ` Ingo Molnar
2008-11-13 8:50 ` Frédéric Weisbecker
2008-11-13 14:53 ` Andi Kleen
1 sibling, 1 reply; 27+ messages in thread
From: Ingo Molnar @ 2008-11-12 22:15 UTC (permalink / raw)
To: Frederic Weisbecker; +Cc: Steven Rostedt, Linux Kernel
* Frederic Weisbecker <fweisbec@gmail.com> wrote:
> Impact: remove spinlocks and irq disabling in function return tracer.
>
> I've tried to figure out all of the race condition that could happen when
> the tracer pushes or pops a return address trace to/from the current
> thread_info.
>
> Theory:
>
> _ One thread can only execute on one cpu at a time. So this code doesn't need
> to be SMP-safe. Just drop the spinlock.
> _ The only race could happen between the current thread and an interrupt. If an
> interrupt is raised, it will increase the index of the return stack storage and
> then execute until the end of the tracing to finally free the index it used.
> We don't need to disable irqs.
>
> This is theorical. In practice, I've tested it with a two-core SMP
> and had no problem at all. Perhaps -tip testing could confirm it.
>
> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: Steven Rostedt <rostedt@goodmis.org>
applied to tip/tracing/function-return-tracer, thanks Frederic!
Ingo
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] tracing/function-return-tracer: Make the function return tracer lockless
2008-11-12 22:15 ` Ingo Molnar
@ 2008-11-13 8:50 ` Frédéric Weisbecker
2008-11-13 8:55 ` Ingo Molnar
0 siblings, 1 reply; 27+ messages in thread
From: Frédéric Weisbecker @ 2008-11-13 8:50 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Steven Rostedt, Linux Kernel
2008/11/12 Ingo Molnar <mingo@elte.hu>:
>
> * Frederic Weisbecker <fweisbec@gmail.com> wrote:
>
>> Impact: remove spinlocks and irq disabling in function return tracer.
>>
>> I've tried to figure out all of the race condition that could happen when
>> the tracer pushes or pops a return address trace to/from the current
>> thread_info.
>>
>> Theory:
>>
>> _ One thread can only execute on one cpu at a time. So this code doesn't need
>> to be SMP-safe. Just drop the spinlock.
>> _ The only race could happen between the current thread and an interrupt. If an
>> interrupt is raised, it will increase the index of the return stack storage and
>> then execute until the end of the tracing to finally free the index it used.
>> We don't need to disable irqs.
>>
>> This is theorical. In practice, I've tested it with a two-core SMP
>> and had no problem at all. Perhaps -tip testing could confirm it.
>>
>> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
>> Cc: Steven Rostedt <rostedt@goodmis.org>
>
> applied to tip/tracing/function-return-tracer, thanks Frederic!
>
> Ingo
>
BTW I'm wondering about consistency in time capturing.
When I look into kernel/sched_clock.c I see this in introduction:
"The clock: sched_clock_cpu() is monotonic per cpu, and should be somewhat
consistent between cpus (never more than 2 jiffies difference)."
Two Jiffies, that could result in a lot of inconsistency in the way of
nanosec capturing. The current task
can be preempted between the call time and the return time and I'm
doing a cpu_clock(raw_smp_processor_id)
on these two times.
Should I keep the same processor_id for these two captures? But what
would happen if this cpu is shut
down between these two times?
One other solution would be to plan time capture in usec but I would
mostly lose the interest of function cost measuring....
What do you think?
Thanks.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] tracing/function-return-tracer: Make the function return tracer lockless
2008-11-13 8:50 ` Frédéric Weisbecker
@ 2008-11-13 8:55 ` Ingo Molnar
2008-11-13 9:16 ` Frédéric Weisbecker
0 siblings, 1 reply; 27+ messages in thread
From: Ingo Molnar @ 2008-11-13 8:55 UTC (permalink / raw)
To: Frédéric Weisbecker
Cc: Steven Rostedt, Linux Kernel, Peter Zijlstra
* Frédéric Weisbecker <fweisbec@gmail.com> wrote:
> BTW I'm wondering about consistency in time capturing. When I look
> into kernel/sched_clock.c I see this in introduction:
>
> "The clock: sched_clock_cpu() is monotonic per cpu, and should be
> somewhat consistent between cpus (never more than 2 jiffies
> difference)."
>
> Two Jiffies, that could result in a lot of inconsistency in the way
> of nanosec capturing. The current task can be preempted between the
> call time and the return time and I'm doing a
> cpu_clock(raw_smp_processor_id) on these two times. Should I keep
> the same processor_id for these two captures? But what would happen
> if this cpu is shut down between these two times? One other solution
> would be to plan time capture in usec but I would mostly lose the
> interest of function cost measuring....
>
> What do you think?
in practice the jitter is much lower - a couple of microseconds - up
to a few dozen at most.
But it's a possibility, and i think the best solution is something
that Steve suggested yesterday: a /debug/tracing/trace_options flag
that turns on global ordering for tracing timestamps. Something like:
echo global_timestamps > /debug/tracing/trace_options
tracers could also change the default of this flag. The function-cost
tracer will probably want to default to globally synchronous
timestamps, while the preempt and irqsoff tracers want to default to
local timestamps only.
Would something like this work for you?
Ingo
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] tracing/function-return-tracer: Make the function return tracer lockless
2008-11-13 8:55 ` Ingo Molnar
@ 2008-11-13 9:16 ` Frédéric Weisbecker
2008-11-13 9:23 ` Ingo Molnar
0 siblings, 1 reply; 27+ messages in thread
From: Frédéric Weisbecker @ 2008-11-13 9:16 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Steven Rostedt, Linux Kernel, Peter Zijlstra
2008/11/13 Ingo Molnar <mingo@elte.hu>:
>
> * Frédéric Weisbecker <fweisbec@gmail.com> wrote:
>
>> BTW I'm wondering about consistency in time capturing. When I look
>> into kernel/sched_clock.c I see this in introduction:
>>
>> "The clock: sched_clock_cpu() is monotonic per cpu, and should be
>> somewhat consistent between cpus (never more than 2 jiffies
>> difference)."
>>
>> Two Jiffies, that could result in a lot of inconsistency in the way
>> of nanosec capturing. The current task can be preempted between the
>> call time and the return time and I'm doing a
>> cpu_clock(raw_smp_processor_id) on these two times. Should I keep
>> the same processor_id for these two captures? But what would happen
>> if this cpu is shut down between these two times? One other solution
>> would be to plan time capture in usec but I would mostly lose the
>> interest of function cost measuring....
>>
>> What do you think?
>
> in practice the jitter is much lower - a couple of microseconds - up
> to a few dozen at most.
>
> But it's a possibility, and i think the best solution is something
> that Steve suggested yesterday: a /debug/tracing/trace_options flag
> that turns on global ordering for tracing timestamps. Something like:
>
> echo global_timestamps > /debug/tracing/trace_options
>
> tracers could also change the default of this flag. The function-cost
> tracer will probably want to default to globally synchronous
> timestamps, while the preempt and irqsoff tracers want to default to
> local timestamps only.
>
> Would something like this work for you?
>
> Ingo
>
But I guess this flag would apply on the timestamp inserted by the
ring-buffer. Unfortunately I can't use it
since I have to capture the clock for two times and not only during
insertion in the ring-buffer.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] tracing/function-return-tracer: Make the function return tracer lockless
2008-11-13 9:16 ` Frédéric Weisbecker
@ 2008-11-13 9:23 ` Ingo Molnar
2008-11-13 9:27 ` Frédéric Weisbecker
0 siblings, 1 reply; 27+ messages in thread
From: Ingo Molnar @ 2008-11-13 9:23 UTC (permalink / raw)
To: Frédéric Weisbecker
Cc: Steven Rostedt, Linux Kernel, Peter Zijlstra
* Frédéric Weisbecker <fweisbec@gmail.com> wrote:
> 2008/11/13 Ingo Molnar <mingo@elte.hu>:
> >
> > * Frédéric Weisbecker <fweisbec@gmail.com> wrote:
> >
> >> BTW I'm wondering about consistency in time capturing. When I look
> >> into kernel/sched_clock.c I see this in introduction:
> >>
> >> "The clock: sched_clock_cpu() is monotonic per cpu, and should be
> >> somewhat consistent between cpus (never more than 2 jiffies
> >> difference)."
> >>
> >> Two Jiffies, that could result in a lot of inconsistency in the way
> >> of nanosec capturing. The current task can be preempted between the
> >> call time and the return time and I'm doing a
> >> cpu_clock(raw_smp_processor_id) on these two times. Should I keep
> >> the same processor_id for these two captures? But what would happen
> >> if this cpu is shut down between these two times? One other solution
> >> would be to plan time capture in usec but I would mostly lose the
> >> interest of function cost measuring....
> >>
> >> What do you think?
> >
> > in practice the jitter is much lower - a couple of microseconds - up
> > to a few dozen at most.
> >
> > But it's a possibility, and i think the best solution is something
> > that Steve suggested yesterday: a /debug/tracing/trace_options flag
> > that turns on global ordering for tracing timestamps. Something like:
> >
> > echo global_timestamps > /debug/tracing/trace_options
> >
> > tracers could also change the default of this flag. The function-cost
> > tracer will probably want to default to globally synchronous
> > timestamps, while the preempt and irqsoff tracers want to default to
> > local timestamps only.
> >
> > Would something like this work for you?
> >
> > Ingo
> >
>
>
> But I guess this flag would apply on the timestamp inserted by the
> ring-buffer. Unfortunately I can't use it since I have to capture
> the clock for two times and not only during insertion in the
> ring-buffer.
i think the clock should be a property of the tracer, not of the ring
buffer. Hence if a tracer has the option set, it will get coherent
timestamps - including ringbuffer insertion timestamps.
Ingo
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] tracing/function-return-tracer: Make the function return tracer lockless
2008-11-13 9:23 ` Ingo Molnar
@ 2008-11-13 9:27 ` Frédéric Weisbecker
2008-11-13 9:40 ` Ingo Molnar
0 siblings, 1 reply; 27+ messages in thread
From: Frédéric Weisbecker @ 2008-11-13 9:27 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Steven Rostedt, Linux Kernel, Peter Zijlstra
2008/11/13 Ingo Molnar <mingo@elte.hu>:
>
> * Frédéric Weisbecker <fweisbec@gmail.com> wrote:
>
>> 2008/11/13 Ingo Molnar <mingo@elte.hu>:
>> >
>> > * Frédéric Weisbecker <fweisbec@gmail.com> wrote:
>> >
>> >> BTW I'm wondering about consistency in time capturing. When I look
>> >> into kernel/sched_clock.c I see this in introduction:
>> >>
>> >> "The clock: sched_clock_cpu() is monotonic per cpu, and should be
>> >> somewhat consistent between cpus (never more than 2 jiffies
>> >> difference)."
>> >>
>> >> Two Jiffies, that could result in a lot of inconsistency in the way
>> >> of nanosec capturing. The current task can be preempted between the
>> >> call time and the return time and I'm doing a
>> >> cpu_clock(raw_smp_processor_id) on these two times. Should I keep
>> >> the same processor_id for these two captures? But what would happen
>> >> if this cpu is shut down between these two times? One other solution
>> >> would be to plan time capture in usec but I would mostly lose the
>> >> interest of function cost measuring....
>> >>
>> >> What do you think?
>> >
>> > in practice the jitter is much lower - a couple of microseconds - up
>> > to a few dozen at most.
>> >
>> > But it's a possibility, and i think the best solution is something
>> > that Steve suggested yesterday: a /debug/tracing/trace_options flag
>> > that turns on global ordering for tracing timestamps. Something like:
>> >
>> > echo global_timestamps > /debug/tracing/trace_options
>> >
>> > tracers could also change the default of this flag. The function-cost
>> > tracer will probably want to default to globally synchronous
>> > timestamps, while the preempt and irqsoff tracers want to default to
>> > local timestamps only.
>> >
>> > Would something like this work for you?
>> >
>> > Ingo
>> >
>>
>>
>> But I guess this flag would apply on the timestamp inserted by the
>> ring-buffer. Unfortunately I can't use it since I have to capture
>> the clock for two times and not only during insertion in the
>> ring-buffer.
>
> i think the clock should be a property of the tracer, not of the ring
> buffer. Hence if a tracer has the option set, it will get coherent
> timestamps - including ringbuffer insertion timestamps.
>
> Ingo
>
If so that would be suitable. And for his purpose, the ring-buffer
would propose a function for time snapshot
adapted to the current flags and that rely on sched_clock?
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] tracing/function-return-tracer: Make the function return tracer lockless
2008-11-13 9:27 ` Frédéric Weisbecker
@ 2008-11-13 9:40 ` Ingo Molnar
2008-11-13 12:36 ` Frédéric Weisbecker
0 siblings, 1 reply; 27+ messages in thread
From: Ingo Molnar @ 2008-11-13 9:40 UTC (permalink / raw)
To: Frédéric Weisbecker
Cc: Steven Rostedt, Linux Kernel, Peter Zijlstra
* Frédéric Weisbecker <fweisbec@gmail.com> wrote:
> If so that would be suitable. And for his purpose, the ring-buffer
> would propose a function for time snapshot adapted to the current
> flags and that rely on sched_clock?
yeah - and it would be all means by a global entity in the beginning -
i.e. we'd just generalize the code around ring_buffer_time_stamp() to
listen to the "globally coherent" flag, and allow it to be used for
callgraph cost measurement code too.
If the "globally coherent" flag is set, then the implementation would
be something like:
A simple "last global timestamp" value combined with a "last local
timestamp" value, and the global timestamp is only ever moved forward.
It is updated via cmpxchg loop. This gives coherency and a monotonic
clock. The local timestamp would be taken from cpu_clock(cpu), and a
global timestamp would be constructed out of it. Or something like
that.
Would that work? [ Would you be interested in sending patches? :-) ]
Ingo
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] tracing/function-return-tracer: Make the function return tracer lockless
2008-11-13 9:40 ` Ingo Molnar
@ 2008-11-13 12:36 ` Frédéric Weisbecker
2008-11-13 12:54 ` Ingo Molnar
0 siblings, 1 reply; 27+ messages in thread
From: Frédéric Weisbecker @ 2008-11-13 12:36 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Steven Rostedt, Linux Kernel, Peter Zijlstra
2008/11/13 Ingo Molnar <mingo@elte.hu>:
> yeah - and it would be all means by a global entity in the beginning -
> i.e. we'd just generalize the code around ring_buffer_time_stamp() to
> listen to the "globally coherent" flag, and allow it to be used for
> callgraph cost measurement code too.
>
> If the "globally coherent" flag is set, then the implementation would
> be something like:
>
> A simple "last global timestamp" value combined with a "last local
> timestamp" value, and the global timestamp is only ever moved forward.
> It is updated via cmpxchg loop. This gives coherency and a monotonic
> clock. The local timestamp would be taken from cpu_clock(cpu), and a
> global timestamp would be constructed out of it. Or something like
> that.
>
> Would that work? [ Would you be interested in sending patches? :-) ]
>
> Ingo
>
Ok, so correct me if I'm wrong.
Global timestamp would be captured by using sched_clock().
That's what is done currently in ring_buffer_time_stamp()
And the global timestamp would be combination of a last global
timestamp and a relative position from now to this last at
each insertion in the ring-buffer (or tracing time capture). Am I right?
I don't really understand why you want to update with a cmpxchg loop...
And the local timestamp would be built through cpu_clock() with
absolute values at each captures? Because we can't consider
relative values since this is loosing sense between cpu. Unless we
have per_cpu relative values.....
Then, depending of the current_tracer, ring_buffer_time_stamp would
act as a wrapper:
if (current_tracer->time_flag == TIME_GLOBAL_COHERENT)
return ring_buffer_global_timestamp();
else
return ring_buffer_local_timestamp();
Or more efficient with a function pointer set at the same we change
the time flag for the current tracer.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] tracing/function-return-tracer: Make the function return tracer lockless
2008-11-13 12:36 ` Frédéric Weisbecker
@ 2008-11-13 12:54 ` Ingo Molnar
2008-11-13 12:59 ` Peter Zijlstra
2008-11-13 17:26 ` Frédéric Weisbecker
0 siblings, 2 replies; 27+ messages in thread
From: Ingo Molnar @ 2008-11-13 12:54 UTC (permalink / raw)
To: Frédéric Weisbecker
Cc: Steven Rostedt, Linux Kernel, Peter Zijlstra
* Frédéric Weisbecker <fweisbec@gmail.com> wrote:
> Ok, so correct me if I'm wrong. Global timestamp would be captured
> by using sched_clock(). That's what is done currently in
> ring_buffer_time_stamp() And the global timestamp would be
> combination of a last global timestamp and a relative position from
> now to this last at each insertion in the ring-buffer (or tracing
> time capture). Am I right? I don't really understand why you want to
> update with a cmpxchg loop...
the cmpxchg loop would be needed to ensure timestamp monotonicity:
every new "global time" is cmpxchg-ed with the "previous global time"
(and is first monotonicity checked).
"prev_global_time" also acts as a global serializer: it ensures that
events are timestamped in a monotonic and ordered way.
i.e. something like this (pseudocode, without the cmpxchg):
u64 prev_global_time;
DEFINE_PER_CPU(prev_local_time);
u64 global_time()
{
u64 now, delta, now_global;
prev_global = prev_global_time;
now = sched_clock();
delta = now - per_cpu(prev_local_time, this_cpu);
per_cpu(prev_local_time, this_cpu) = now;
now_global = prev_global + delta;
prev_global = now_global;
return now_global;
}
note how we build "global time" out of "local time".
The cmpxchg would be used to put the above one into a loop, and
instead of updating the global time in a racy way:
prev_global = now_global;
We'd update it via the cmpxchg:
atomic64_t prev_global_time;
...
while (atomic64_cmpxchg(&prev_global_time,
prev_global, now_global) != prev_global) {
[...]
}
To make sure the global time goes monotonic. (this way we also avoid a
spinlock - locks are fragile for instrumentation)
Ingo
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] tracing/function-return-tracer: Make the function return tracer lockless
2008-11-13 12:54 ` Ingo Molnar
@ 2008-11-13 12:59 ` Peter Zijlstra
2008-11-13 13:02 ` Ingo Molnar
2008-11-13 17:26 ` Frédéric Weisbecker
1 sibling, 1 reply; 27+ messages in thread
From: Peter Zijlstra @ 2008-11-13 12:59 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Frédéric Weisbecker, Steven Rostedt, Linux Kernel
On Thu, 2008-11-13 at 13:54 +0100, Ingo Molnar wrote:
> * Frédéric Weisbecker <fweisbec@gmail.com> wrote:
>
> > Ok, so correct me if I'm wrong. Global timestamp would be captured
> > by using sched_clock(). That's what is done currently in
> > ring_buffer_time_stamp() And the global timestamp would be
> > combination of a last global timestamp and a relative position from
> > now to this last at each insertion in the ring-buffer (or tracing
> > time capture). Am I right? I don't really understand why you want to
> > update with a cmpxchg loop...
>
> the cmpxchg loop would be needed to ensure timestamp monotonicity:
> every new "global time" is cmpxchg-ed with the "previous global time"
> (and is first monotonicity checked).
>
> "prev_global_time" also acts as a global serializer: it ensures that
> events are timestamped in a monotonic and ordered way.
>
> i.e. something like this (pseudocode, without the cmpxchg):
>
> u64 prev_global_time;
>
> DEFINE_PER_CPU(prev_local_time);
>
> u64 global_time()
> {
> u64 now, delta, now_global;
>
> prev_global = prev_global_time;
> now = sched_clock();
> delta = now - per_cpu(prev_local_time, this_cpu);
> per_cpu(prev_local_time, this_cpu) = now;
>
> now_global = prev_global + delta;
> prev_global = now_global;
>
> return now_global;
> }
>
> note how we build "global time" out of "local time".
This goes down shit-creek real fast if the TSC goes funny and jumps fwd
or something.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] tracing/function-return-tracer: Make the function return tracer lockless
2008-11-13 12:59 ` Peter Zijlstra
@ 2008-11-13 13:02 ` Ingo Molnar
0 siblings, 0 replies; 27+ messages in thread
From: Ingo Molnar @ 2008-11-13 13:02 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Frédéric Weisbecker, Steven Rostedt, Linux Kernel
* Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
> On Thu, 2008-11-13 at 13:54 +0100, Ingo Molnar wrote:
> > * Frédéric Weisbecker <fweisbec@gmail.com> wrote:
> >
> > > Ok, so correct me if I'm wrong. Global timestamp would be captured
> > > by using sched_clock(). That's what is done currently in
> > > ring_buffer_time_stamp() And the global timestamp would be
> > > combination of a last global timestamp and a relative position from
> > > now to this last at each insertion in the ring-buffer (or tracing
> > > time capture). Am I right? I don't really understand why you want to
> > > update with a cmpxchg loop...
> >
> > the cmpxchg loop would be needed to ensure timestamp monotonicity:
> > every new "global time" is cmpxchg-ed with the "previous global time"
> > (and is first monotonicity checked).
> >
> > "prev_global_time" also acts as a global serializer: it ensures that
> > events are timestamped in a monotonic and ordered way.
> >
> > i.e. something like this (pseudocode, without the cmpxchg):
> >
> > u64 prev_global_time;
> >
> > DEFINE_PER_CPU(prev_local_time);
> >
> > u64 global_time()
> > {
> > u64 now, delta, now_global;
> >
> > prev_global = prev_global_time;
> > now = sched_clock();
> > delta = now - per_cpu(prev_local_time, this_cpu);
> > per_cpu(prev_local_time, this_cpu) = now;
> >
> > now_global = prev_global + delta;
> > prev_global = now_global;
> >
> > return now_global;
> > }
> >
> > note how we build "global time" out of "local time".
>
> This goes down shit-creek real fast if the TSC goes funny and jumps
> fwd or something.
yes, it was simplified pseudocode: instead of raw sched_clock() it
should use a more reliable source like cpu_clock(this_cpu).
the important bit is how to build a global clock out of the local
clock.
Ingo
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] tracing/function-return-tracer: Make the function return tracer lockless
2008-11-12 21:47 [PATCH 1/2] tracing/function-return-tracer: Make the function return tracer lockless Frederic Weisbecker
2008-11-12 22:15 ` Ingo Molnar
@ 2008-11-13 14:53 ` Andi Kleen
2008-11-13 17:01 ` Frédéric Weisbecker
1 sibling, 1 reply; 27+ messages in thread
From: Andi Kleen @ 2008-11-13 14:53 UTC (permalink / raw)
To: Frederic Weisbecker; +Cc: Ingo Molnar, Steven Rostedt, Linux Kernel
Frederic Weisbecker <fweisbec@gmail.com> writes:
O
> _ The only race could happen between the current thread and an interrupt. If an
> interrupt is raised, it will increase the index of the return stack storage and
> then execute until the end of the tracing to finally free the index it used.
> We don't need to disable irqs.
>
> This is theorical. In practice, I've tested it with a two-core SMP and had no
> problem at all. Perhaps -tip testing could confirm it.
The problem I think is that you assume the ++ is atomic against
interrupts, which is not guaranteed by the C compiler. e.g.
it would be perfectly legal for the compiler to generate code like
local register i
i = index;
write to index'ed array using i
<--------- interrupt here would overwrite data
...
index = i + 1;
You would need to convert the index access to a "local_add_return()" and
possibly also add memory barriers.
-Andi
--
ak@linux.intel.com
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] tracing/function-return-tracer: Make the function return tracer lockless
2008-11-13 14:53 ` Andi Kleen
@ 2008-11-13 17:01 ` Frédéric Weisbecker
2008-11-13 17:12 ` Andi Kleen
0 siblings, 1 reply; 27+ messages in thread
From: Frédéric Weisbecker @ 2008-11-13 17:01 UTC (permalink / raw)
To: Andi Kleen; +Cc: Ingo Molnar, Steven Rostedt, Linux Kernel
2008/11/13 Andi Kleen <andi@firstfloor.org>:
> The problem I think is that you assume the ++ is atomic against
> interrupts, which is not guaranteed by the C compiler. e.g.
> it would be perfectly legal for the compiler to generate code like
>
> local register i
> i = index;
> write to index'ed array using i
> <--------- interrupt here would overwrite data
> ...
> index = i + 1;
Yes in the common case that would be a danger. But here, if an
interrupt is raised, it will increment
the counter and then decrement it at return time without dropping the
cpu. So after the interrupt, the
value will remain the same...
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] tracing/function-return-tracer: Make the function return tracer lockless
2008-11-13 17:01 ` Frédéric Weisbecker
@ 2008-11-13 17:12 ` Andi Kleen
2008-11-13 17:21 ` Steven Rostedt
0 siblings, 1 reply; 27+ messages in thread
From: Andi Kleen @ 2008-11-13 17:12 UTC (permalink / raw)
To: Frédéric Weisbecker
Cc: Andi Kleen, Ingo Molnar, Steven Rostedt, Linux Kernel
> > local register i
> > i = index;
> > write to index'ed array using i
> > <--------- interrupt here would overwrite data
> > ...
> > index = i + 1;
>
>
> Yes in the common case that would be a danger. But here, if an
> interrupt is raised, it will increment
> the counter and then decrement it at return time without dropping the
> cpu. So after the interrupt, the
> value will remain the same...
The buffer contents will not be necessarily
the same. See the scenario above. The interrupt would use the
same i as the current function and would overwrite the
partially written entry.
-Andi
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] tracing/function-return-tracer: Make the function return tracer lockless
2008-11-13 17:12 ` Andi Kleen
@ 2008-11-13 17:21 ` Steven Rostedt
2008-11-13 17:34 ` Andi Kleen
0 siblings, 1 reply; 27+ messages in thread
From: Steven Rostedt @ 2008-11-13 17:21 UTC (permalink / raw)
To: Andi Kleen; +Cc: Frédéric Weisbecker, Ingo Molnar, Linux Kernel
On Thu, 13 Nov 2008, Andi Kleen wrote:
> > > local register i
> > > i = index;
> > > write to index'ed array using i
> > > <--------- interrupt here would overwrite data
> > > ...
> > > index = i + 1;
> >
> >
> > Yes in the common case that would be a danger. But here, if an
> > interrupt is raised, it will increment
> > the counter and then decrement it at return time without dropping the
> > cpu. So after the interrupt, the
> > value will remain the same...
>
> The buffer contents will not be necessarily
> the same. See the scenario above. The interrupt would use the
> same i as the current function and would overwrite the
> partially written entry.
So the answer to this is:
i = index++;
barrier();
write to index i (not index);
-- Steve
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] tracing/function-return-tracer: Make the function return tracer lockless
2008-11-13 12:54 ` Ingo Molnar
2008-11-13 12:59 ` Peter Zijlstra
@ 2008-11-13 17:26 ` Frédéric Weisbecker
2008-11-13 18:57 ` Ingo Molnar
1 sibling, 1 reply; 27+ messages in thread
From: Frédéric Weisbecker @ 2008-11-13 17:26 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Steven Rostedt, Linux Kernel, Peter Zijlstra
2008/11/13 Ingo Molnar <mingo@elte.hu>:
> "prev_global_time" also acts as a global serializer: it ensures that
> events are timestamped in a monotonic and ordered way.
>
> i.e. something like this (pseudocode, without the cmpxchg):
>
> u64 prev_global_time;
>
> DEFINE_PER_CPU(prev_local_time);
>
> u64 global_time()
> {
> u64 now, delta, now_global;
>
> prev_global = prev_global_time;
> now = sched_clock();
> delta = now - per_cpu(prev_local_time, this_cpu);
> per_cpu(prev_local_time, this_cpu) = now;
>
> now_global = prev_global + delta;
> prev_global = now_global;
>
> return now_global;
> }
>
> note how we build "global time" out of "local time".
>
> The cmpxchg would be used to put the above one into a loop, and
> instead of updating the global time in a racy way:
>
> prev_global = now_global;
>
> We'd update it via the cmpxchg:
>
> atomic64_t prev_global_time;
>
> ...
>
> while (atomic64_cmpxchg(&prev_global_time,
> prev_global, now_global) != prev_global) {
> [...]
> }
>
> To make sure the global time goes monotonic. (this way we also avoid a
> spinlock - locks are fragile for instrumentation)
Ok, I understand better.
But consider the following:
u64 global_time()
{
u64 now, delta, now_global;
prev_global = prev_global_time;
while (atomic64_cmpxchg(&prev_global_time,
prev_global, now_global) != prev_global) {
now = sched_clock();
delta = now - per_cpu(prev_local_time, this_cpu);
per_cpu(prev_local_time, this_cpu) = now;
now_global = prev_global + delta;
prev_global = now_global;
}
return now_global;
}
Sarting with prev_global_time = 0
If we have two cpu and the above function is executed 5 times on the first cpu.
We couldl have per_cpu(prev_local_time) = 50 for example. And so
prev_global_time will be equal to 50.
Just after that, almost at the same time, cpu2 calls global_time()
delta will be equal to 50 (sched_clock() - per_cpu(prev_local_time)
which is 0) and prev_global_time will be 50 + 50 = 100.
This is not consistent.
I don't know where but I'm pretty sure I missed something....
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] tracing/function-return-tracer: Make the function return tracer lockless
2008-11-13 17:34 ` Andi Kleen
@ 2008-11-13 17:32 ` Steven Rostedt
2008-11-13 17:46 ` Frédéric Weisbecker
2008-11-13 18:29 ` Andi Kleen
0 siblings, 2 replies; 27+ messages in thread
From: Steven Rostedt @ 2008-11-13 17:32 UTC (permalink / raw)
To: Andi Kleen; +Cc: Frédéric Weisbecker, Ingo Molnar, Linux Kernel
On Thu, 13 Nov 2008, Andi Kleen wrote:
> > So the answer to this is:
> >
> > i = index++;
> > barrier();
> > write to index i (not index);
>
> That was my first thought when I wrote the original email,
> but the disadvantage is that barrier() is a big hammer
> that flushes everything and can make the code much worse.
> That is why I suggested local_add_return() instead.
barrier() is a compiler barrier, does nothing with the caches, and is
quite cheap. We only need a compiler barrier because we are only
protecting ourselves from things that happen on the current CPU. No other
devices or other CPUs are involved.
-- Steve
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] tracing/function-return-tracer: Make the function return tracer lockless
2008-11-13 17:21 ` Steven Rostedt
@ 2008-11-13 17:34 ` Andi Kleen
2008-11-13 17:32 ` Steven Rostedt
0 siblings, 1 reply; 27+ messages in thread
From: Andi Kleen @ 2008-11-13 17:34 UTC (permalink / raw)
To: Steven Rostedt
Cc: Andi Kleen, Frédéric Weisbecker, Ingo Molnar,
Linux Kernel
> So the answer to this is:
>
> i = index++;
> barrier();
> write to index i (not index);
That was my first thought when I wrote the original email,
but the disadvantage is that barrier() is a big hammer
that flushes everything and can make the code much worse.
That is why I suggested local_add_return() instead.
-Andi
--
ak@linux.intel.com
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] tracing/function-return-tracer: Make the function return tracer lockless
2008-11-13 17:32 ` Steven Rostedt
@ 2008-11-13 17:46 ` Frédéric Weisbecker
2008-11-13 18:29 ` Andi Kleen
1 sibling, 0 replies; 27+ messages in thread
From: Frédéric Weisbecker @ 2008-11-13 17:46 UTC (permalink / raw)
To: Steven Rostedt; +Cc: Andi Kleen, Ingo Molnar, Linux Kernel
2008/11/13 Steven Rostedt <rostedt@goodmis.org>:
>
> On Thu, 13 Nov 2008, Andi Kleen wrote:
>
>> > So the answer to this is:
>> >
>> > i = index++;
>> > barrier();
>> > write to index i (not index);
>>
>> That was my first thought when I wrote the original email,
>> but the disadvantage is that barrier() is a big hammer
>> that flushes everything and can make the code much worse.
>> That is why I suggested local_add_return() instead.
>
> barrier() is a compiler barrier, does nothing with the caches, and is
> quite cheap. We only need a compiler barrier because we are only
> protecting ourselves from things that happen on the current CPU. No other
> devices or other CPUs are involved.
Oh I see the issue now. The value of the index could have been
incremented in a register and not yet
in the memory...
So yes, a barrier() to make these operations flushed in memory before
using the index.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] tracing/function-return-tracer: Make the function return tracer lockless
2008-11-13 17:32 ` Steven Rostedt
2008-11-13 17:46 ` Frédéric Weisbecker
@ 2008-11-13 18:29 ` Andi Kleen
2008-11-13 18:35 ` Steven Rostedt
1 sibling, 1 reply; 27+ messages in thread
From: Andi Kleen @ 2008-11-13 18:29 UTC (permalink / raw)
To: Steven Rostedt
Cc: Andi Kleen, Frédéric Weisbecker, Ingo Molnar,
Linux Kernel
On Thu, Nov 13, 2008 at 12:32:23PM -0500, Steven Rostedt wrote:
>
> On Thu, 13 Nov 2008, Andi Kleen wrote:
>
> > > So the answer to this is:
> > >
> > > i = index++;
> > > barrier();
> > > write to index i (not index);
> >
> > That was my first thought when I wrote the original email,
> > but the disadvantage is that barrier() is a big hammer
> > that flushes everything and can make the code much worse.
> > That is why I suggested local_add_return() instead.
>
> barrier() is a compiler barrier, does nothing with the caches, and is
> quite cheap. We only need a compiler barrier because we are only
I did not refer to CPU caches, but the compiler's register allocation
[ok if you want the registers are the "level 0 cache"]. A memory barrier
all messes it up. That is why it is better to only clobber specific
memory regions, which is what local_* does.
-Andi
--
ak@linux.intel.com
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] tracing/function-return-tracer: Make the function return tracer lockless
2008-11-13 18:29 ` Andi Kleen
@ 2008-11-13 18:35 ` Steven Rostedt
2008-11-13 18:55 ` Andi Kleen
0 siblings, 1 reply; 27+ messages in thread
From: Steven Rostedt @ 2008-11-13 18:35 UTC (permalink / raw)
To: Andi Kleen; +Cc: Frédéric Weisbecker, Ingo Molnar, Linux Kernel
On Thu, 13 Nov 2008, Andi Kleen wrote:
> On Thu, Nov 13, 2008 at 12:32:23PM -0500, Steven Rostedt wrote:
> >
> > On Thu, 13 Nov 2008, Andi Kleen wrote:
> >
> > > > So the answer to this is:
> > > >
> > > > i = index++;
> > > > barrier();
> > > > write to index i (not index);
> > >
> > > That was my first thought when I wrote the original email,
> > > but the disadvantage is that barrier() is a big hammer
> > > that flushes everything and can make the code much worse.
> > > That is why I suggested local_add_return() instead.
> >
> > barrier() is a compiler barrier, does nothing with the caches, and is
> > quite cheap. We only need a compiler barrier because we are only
>
> I did not refer to CPU caches, but the compiler's register allocation
> [ok if you want the registers are the "level 0 cache"]. A memory barrier
> all messes it up. That is why it is better to only clobber specific
> memory regions, which is what local_* does.
>
#define barrier() __asm__ __volatile__("": : :"memory")
static inline long local_add_return(long i, local_t *l)
{
long __i;
#ifdef CONFIG_M386
unsigned long flags;
if (unlikely(boot_cpu_data.x86 <= 3))
goto no_xadd;
#endif
/* Modern 486+ processor */
__i = i;
asm volatile(_ASM_XADD "%0, %1;"
: "+r" (i), "+m" (l->a.counter)
: : "memory");
return i + __i;
#ifdef CONFIG_M386
no_xadd: /* Legacy 386 processor */
local_irq_save(flags);
__i = local_read(l);
local_set(l, i + __i);
local_irq_restore(flags);
return i + __i;
#endif
}
Now tell me again how local_* is more efficient than barrier?
Not to mention, if this is ever used on other archs with load-linked and
store-conditional, it gets even worse.
-- Steve
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] tracing/function-return-tracer: Make the function return tracer lockless
2008-11-13 18:35 ` Steven Rostedt
@ 2008-11-13 18:55 ` Andi Kleen
2008-11-13 19:03 ` Steven Rostedt
0 siblings, 1 reply; 27+ messages in thread
From: Andi Kleen @ 2008-11-13 18:55 UTC (permalink / raw)
To: Steven Rostedt
Cc: Andi Kleen, Frédéric Weisbecker, Ingo Molnar,
Linux Kernel
> Now tell me again how local_* is more efficient than barrier?
Look at the generated assembler, not the source. The M386 check is two instructions
with the i386 stuff just moved away somewhere out of line that is
about never taken. The rest is short and straight forward in asm too.
And a lot of kernels are not compiled with 386 support anyways.
-Andi
--
ak@linux.intel.com
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] tracing/function-return-tracer: Make the function return tracer lockless
2008-11-13 17:26 ` Frédéric Weisbecker
@ 2008-11-13 18:57 ` Ingo Molnar
2008-11-13 20:06 ` Frédéric Weisbecker
0 siblings, 1 reply; 27+ messages in thread
From: Ingo Molnar @ 2008-11-13 18:57 UTC (permalink / raw)
To: Frédéric Weisbecker
Cc: Steven Rostedt, Linux Kernel, Peter Zijlstra
* Frédéric Weisbecker <fweisbec@gmail.com> wrote:
> 2008/11/13 Ingo Molnar <mingo@elte.hu>:
> > "prev_global_time" also acts as a global serializer: it ensures that
> > events are timestamped in a monotonic and ordered way.
> >
> > i.e. something like this (pseudocode, without the cmpxchg):
> >
> > u64 prev_global_time;
> >
> > DEFINE_PER_CPU(prev_local_time);
> >
> > u64 global_time()
> > {
> > u64 now, delta, now_global;
> >
> > prev_global = prev_global_time;
> > now = sched_clock();
> > delta = now - per_cpu(prev_local_time, this_cpu);
> > per_cpu(prev_local_time, this_cpu) = now;
> >
> > now_global = prev_global + delta;
> > prev_global = now_global;
> >
> > return now_global;
> > }
> >
> > note how we build "global time" out of "local time".
> >
> > The cmpxchg would be used to put the above one into a loop, and
> > instead of updating the global time in a racy way:
> >
> > prev_global = now_global;
> >
> > We'd update it via the cmpxchg:
> >
> > atomic64_t prev_global_time;
> >
> > ...
> >
> > while (atomic64_cmpxchg(&prev_global_time,
> > prev_global, now_global) != prev_global) {
> > [...]
> > }
> >
> > To make sure the global time goes monotonic. (this way we also avoid a
> > spinlock - locks are fragile for instrumentation)
>
> Ok, I understand better.
> But consider the following:
>
> u64 global_time()
> {
> u64 now, delta, now_global;
> prev_global = prev_global_time;
>
> while (atomic64_cmpxchg(&prev_global_time,
> prev_global, now_global) != prev_global) {
>
> now = sched_clock();
> delta = now - per_cpu(prev_local_time, this_cpu);
> per_cpu(prev_local_time, this_cpu) = now;
> now_global = prev_global + delta;
> prev_global = now_global;
> }
> return now_global;
> }
>
> Sarting with prev_global_time = 0 If we have two cpu and the above
> function is executed 5 times on the first cpu. We couldl have
> per_cpu(prev_local_time) = 50 for example. And so prev_global_time
> will be equal to 50.
>
> Just after that, almost at the same time, cpu2 calls global_time()
>
> delta will be equal to 50 (sched_clock() - per_cpu(prev_local_time)
> which is 0) and prev_global_time will be 50 + 50 = 100. This is not
> consistent. I don't know where but I'm pretty sure I missed
> something....
you are right - it needs a bit more logic.
I think the simplest would be something like this:
atomic64_t global_clock = INIT_ATOMIC64(0);
u64 global_time()
{
u64 now, delta, now_global, prev_global;
do {
prev_global = atomic64_read(&global_clock);
now = cpu_clock(raw_smp_processor_id());
if ((s64)(now - prev_global) < 0) {
now = prev_global;
break;
}
} while (atomic64_cmpxchg(&global_clock,
prev_global, now) != prev_global);
return now;
}
This is the simplest way of implementing monotonic time: we only allow
global_clock to go forwards. If all cpu_clock()s are perfectly in
sync, we've got no problem: then "now - prev_global" will never be
negative and we can return the local clock as the latest global time.
But if one of the CPU clocks is "behind", the function returns the
latest global time up until the local clock catches up. Time wont be
allowed to jump around by going back. If the clock is behind for a
long time, then we get a lot of timestamps with the same value - that
will be very visible in the trace and we'll then work in improving the
cpu_clock() implementation.
So i think we could start with this simplest approach, and see how
often we get the same timestamp for a long time (indication of the
clocks being not perfectly in sync).
Ingo
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] tracing/function-return-tracer: Make the function return tracer lockless
2008-11-13 18:55 ` Andi Kleen
@ 2008-11-13 19:03 ` Steven Rostedt
2008-11-13 19:19 ` Andi Kleen
0 siblings, 1 reply; 27+ messages in thread
From: Steven Rostedt @ 2008-11-13 19:03 UTC (permalink / raw)
To: Andi Kleen; +Cc: Frédéric Weisbecker, Ingo Molnar, Linux Kernel
On Thu, 13 Nov 2008, Andi Kleen wrote:
>>>
>>> I did not refer to CPU caches, but the compiler's register allocation
>>> [ok if you want the registers are the "level 0 cache"]. A memory barrier
>>> all messes it up. That is why it is better to only clobber specific
>>> memory regions, which is what local_* does.
>>>
>> Now tell me again how local_* is more efficient than barrier?
>
> Look at the generated assembler, not the source. The M386 check is two instructions
> with the i386 stuff just moved away somewhere out of line that is
> about never taken. The rest is short and straight forward in asm too.
> And a lot of kernels are not compiled with 386 support anyways.
>
OK, lets look at just the asm.
barrier:
__asm__ __volatile__("": : :"memory")
local_add_return:
asm volatile(_ASM_XADD "%0, %1;"
: "+r" (i), "+m" (l->a.counter)
: : "memory");
Your argument was that barrier clobbers memory, but it looks like the
local_add_return does the same.
Lets go look at the code that we are talking about:
index = ++ti->curr_ret_stack;
ti->ret_stack[index].ret = ret;
ti->ret_stack[index].func = func;
ti->ret_stack[index].calltime = time;
Which as you stated was incorrect, and it is. But you suggested that we
should switch to using local_add_return which would cause us to change a
lot of code to handle the type change. When this is ported to other archs,
they will copy that too, and then be using a lesser efficient algorithm.
All that is needed is to change the pointer to always point to the next
item, and then do:
index = ti->curr_ret_stack++;
barrier();
ti->ret_stack[index].ret = ret;
ti->ret_stack[index].func = func;
ti->ret_stack[index].calltime = time;
I don't see any major gain in switching to local_add_return.
-- Steve
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] tracing/function-return-tracer: Make the function return tracer lockless
2008-11-13 19:03 ` Steven Rostedt
@ 2008-11-13 19:19 ` Andi Kleen
0 siblings, 0 replies; 27+ messages in thread
From: Andi Kleen @ 2008-11-13 19:19 UTC (permalink / raw)
To: Steven Rostedt
Cc: Andi Kleen, Frédéric Weisbecker, Ingo Molnar,
Linux Kernel
> barrier:
> __asm__ __volatile__("": : :"memory")
>
> local_add_return:
> asm volatile(_ASM_XADD "%0, %1;"
> : "+r" (i), "+m" (l->a.counter)
> : : "memory");
>
>
> Your argument was that barrier clobbers memory, but it looks like the
> local_add_return does the same.
That's a fair point. I'm not sure why the local_add_* has a memory
barrier at all, imho it shouldn't need it. I bet it's just
a hold over from the atomic code this was based one.
> I don't see any major gain in switching to local_add_return.
That's true with the current implementation. It would
be a good idea to consider dropping that explicit
barrier though, then it would be likely a slight win.
-Andi
--
ak@linux.intel.com
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] tracing/function-return-tracer: Make the function return tracer lockless
2008-11-13 18:57 ` Ingo Molnar
@ 2008-11-13 20:06 ` Frédéric Weisbecker
0 siblings, 0 replies; 27+ messages in thread
From: Frédéric Weisbecker @ 2008-11-13 20:06 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Steven Rostedt, Linux Kernel, Peter Zijlstra
2008/11/13 Ingo Molnar <mingo@elte.hu>:
> you are right - it needs a bit more logic.
>
> I think the simplest would be something like this:
>
> atomic64_t global_clock = INIT_ATOMIC64(0);
>
> u64 global_time()
> {
> u64 now, delta, now_global, prev_global;
>
> do {
> prev_global = atomic64_read(&global_clock);
> now = cpu_clock(raw_smp_processor_id());
>
> if ((s64)(now - prev_global) < 0) {
> now = prev_global;
> break;
> }
> } while (atomic64_cmpxchg(&global_clock,
> prev_global, now) != prev_global);
>
> return now;
> }
>
> This is the simplest way of implementing monotonic time: we only allow
> global_clock to go forwards. If all cpu_clock()s are perfectly in
> sync, we've got no problem: then "now - prev_global" will never be
> negative and we can return the local clock as the latest global time.
>
> But if one of the CPU clocks is "behind", the function returns the
> latest global time up until the local clock catches up. Time wont be
> allowed to jump around by going back. If the clock is behind for a
> long time, then we get a lot of timestamps with the same value -
Ok! I understand now this approach.
So, if global ordering flag is set, we return this kind of protected value,
and on the opposite, we return the normal cpu_clock() local value.
> Would that work? [ Would you be interested in sending patches? :-) ]
Yes :-)
> that
> will be very visible in the trace and we'll then work in improving the
> cpu_clock() implementation.
>
> So i think we could start with this simplest approach, and see how
> often we get the same timestamp for a long time (indication of the
> clocks being not perfectly in sync).
Ok, good idea.
Thanks for the explanations!
^ permalink raw reply [flat|nested] 27+ messages in thread
end of thread, other threads:[~2008-11-13 20:06 UTC | newest]
Thread overview: 27+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-11-12 21:47 [PATCH 1/2] tracing/function-return-tracer: Make the function return tracer lockless Frederic Weisbecker
2008-11-12 22:15 ` Ingo Molnar
2008-11-13 8:50 ` Frédéric Weisbecker
2008-11-13 8:55 ` Ingo Molnar
2008-11-13 9:16 ` Frédéric Weisbecker
2008-11-13 9:23 ` Ingo Molnar
2008-11-13 9:27 ` Frédéric Weisbecker
2008-11-13 9:40 ` Ingo Molnar
2008-11-13 12:36 ` Frédéric Weisbecker
2008-11-13 12:54 ` Ingo Molnar
2008-11-13 12:59 ` Peter Zijlstra
2008-11-13 13:02 ` Ingo Molnar
2008-11-13 17:26 ` Frédéric Weisbecker
2008-11-13 18:57 ` Ingo Molnar
2008-11-13 20:06 ` Frédéric Weisbecker
2008-11-13 14:53 ` Andi Kleen
2008-11-13 17:01 ` Frédéric Weisbecker
2008-11-13 17:12 ` Andi Kleen
2008-11-13 17:21 ` Steven Rostedt
2008-11-13 17:34 ` Andi Kleen
2008-11-13 17:32 ` Steven Rostedt
2008-11-13 17:46 ` Frédéric Weisbecker
2008-11-13 18:29 ` Andi Kleen
2008-11-13 18:35 ` Steven Rostedt
2008-11-13 18:55 ` Andi Kleen
2008-11-13 19:03 ` Steven Rostedt
2008-11-13 19:19 ` Andi Kleen
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.