public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
To: john stultz <johnstul@us.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	LKML <linux-kernel@vger.kernel.org>, Ingo Molnar <mingo@elte.hu>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Christoph Hellwig <hch@infradead.org>,
	Gregory Haskins <ghaskins@novell.com>,
	Arnaldo Carvalho de Melo <acme@ghostprotocols.net>,
	Thomas Gleixner <tglx@linutronix.de>,
	Tim Bird <tim.bird@am.sony.com>, Sam Ravnborg <sam@ravnborg.org>,
	"Frank Ch. Eigler" <fche@redhat.com>,
	Steven Rostedt <srostedt@redhat.com>,
	Paul Mackerras <paulus@samba.org>,
	Daniel Walker <dwalker@mvista.com>
Subject: Re: [RFC PATCH 16/22 -v2] add get_monotonic_cycles
Date: Wed, 16 Jan 2008 21:35:57 -0500	[thread overview]
Message-ID: <20080117023557.GC2322@Krystal> (raw)
In-Reply-To: <1200536409.6127.47.camel@localhost.localdomain>

* john stultz (johnstul@us.ibm.com) wrote:
> On Wed, 2008-01-16 at 18:39 -0500, Mathieu Desnoyers wrote:
> > I would disable preemption in clocksource_get_basecycles. We would not
> > want to be scheduled out while we hold a pointer to the old array
> > element.
> > 
> > > +	int num = cs->base_num;
> > 
> > Since you deal with base_num in a shared manner (not per cpu), you will
> > need a smp_read_barrier_depend() here after the cs->base_num read.
> > 
> > You should think about reading the cs->base_num first, and _after_ that
> > read the real clocksource. Here, the clocksource value is passed as
> > parameter. It means that the read clocksource may have been read in the
> > previous RCU window.
> 
> Here's an updated version of the patch w/ the suggested memory barrier
> changes and favored (1-x) inversion change. ;)  Let me know if you see
> any other holes, or have any other suggestions or ideas.
> 
> Still un-tested (my test box will free up soon, I promise!), but builds.
> 
> Signed-off-by: John Stultz <johstul@us.ibm.com>
> 
> Index: monotonic-cleanup/include/linux/clocksource.h
> ===================================================================
> --- monotonic-cleanup.orig/include/linux/clocksource.h	2008-01-16 12:22:04.000000000 -0800
> +++ monotonic-cleanup/include/linux/clocksource.h	2008-01-16 18:12:53.000000000 -0800
> @@ -87,9 +87,17 @@
>  	 * more than one cache line.
>  	 */
>  	struct {
> -		cycle_t cycle_last, cycle_accumulated, cycle_raw;
> -	} ____cacheline_aligned_in_smp;
> +		cycle_t cycle_last, cycle_accumulated;
>  
> +		/* base structure provides lock-free read
> +		 * access to a virtualized 64bit counter
> +		 * Uses RCU-like update.
> +		 */
> +		struct {
> +			cycle_t cycle_base_last, cycle_base;
> +		} base[2];
> +		int base_num;
> +	} ____cacheline_aligned_in_smp;
>  	u64 xtime_nsec;
>  	s64 error;
>  
> @@ -175,19 +183,29 @@
>  }
>  
>  /**
> - * clocksource_get_cycles: - Access the clocksource's accumulated cycle value
> + * clocksource_get_basecycles: - get the clocksource's accumulated cycle value
>   * @cs:		pointer to clocksource being read
>   * @now:	current cycle value
>   *
>   * Uses the clocksource to return the current cycle_t value.
>   * NOTE!!!: This is different from clocksource_read, because it
> - * returns the accumulated cycle value! Must hold xtime lock!
> + * returns a 64bit wide accumulated value.
>   */
>  static inline cycle_t
> -clocksource_get_cycles(struct clocksource *cs, cycle_t now)
> +clocksource_get_basecycles(struct clocksource *cs)
>  {
> -	cycle_t offset = (now - cs->cycle_last) & cs->mask;
> -	offset += cs->cycle_accumulated;
> +	int num;
> +	cycle_t now, offset;
> +
> +	preempt_disable();
> +	num = cs->base_num;
> +	smp_read_barrier_depends();
> +	now = clocksource_read(cs);
> +	offset = (now - cs->base[num].cycle_base_last);
> +	offset &= cs->mask;
> +	offset += cs->base[num].cycle_base;
> +	preempt_enable();
> +
>  	return offset;
>  }
>  
> @@ -197,14 +215,26 @@
>   * @now:	current cycle value
>   *
>   * Used to avoids clocksource hardware overflow by periodically
> - * accumulating the current cycle delta. Must hold xtime write lock!
> + * accumulating the current cycle delta. Uses RCU-like update, but
> + * ***still requires the xtime_lock is held for writing!***
>   */
>  static inline void clocksource_accumulate(struct clocksource *cs, cycle_t now)
>  {
> -	cycle_t offset = (now - cs->cycle_last) & cs->mask;
> +	/* First update the monotonic base portion.
> +	 * The dual array update method allows for lock-free reading.
> +	 */
> +	int num = 1 - cs->base_num;

(nitpick)
right here, you could probably express 1-num with cs->base_num, since we
are the only ones supposed to touch it.

> +	cycle_t offset = (now - cs->base[1-num].cycle_base_last);
> +	offset &= cs->mask;

here too.

> +	cs->base[num].cycle_base = cs->base[1-num].cycle_base + offset;
> +	cs->base[num].cycle_base_last = now;
> +	wmb();

As I just emailed : smp_smb() *should* be enough. I don't see which
architecture could reorder writes wrt local interrupts ? (please tell me
if I am grossly mistaken)

Mathieu

> +	cs->base_num = num;
> +
> +	/* Now update the cycle_accumulated portion */
> +	offset = (now - cs->cycle_last) & cs->mask;
>  	cs->cycle_last = now;
>  	cs->cycle_accumulated += offset;
> -	cs->cycle_raw += offset;
>  }
>  
>  /**
> Index: monotonic-cleanup/kernel/time/timekeeping.c
> ===================================================================
> --- monotonic-cleanup.orig/kernel/time/timekeeping.c	2008-01-16 12:21:46.000000000 -0800
> +++ monotonic-cleanup/kernel/time/timekeeping.c	2008-01-16 17:51:50.000000000 -0800
> @@ -71,10 +71,12 @@
>   */
>  static inline s64 __get_nsec_offset(void)
>  {
> -	cycle_t cycle_delta;
> +	cycle_t now, cycle_delta;
>  	s64 ns_offset;
>  
> -	cycle_delta = clocksource_get_cycles(clock, clocksource_read(clock));
> +	now = clocksource_read(clock);
> +	cycle_delta = (now - clock->cycle_last) & clock->mask;
> +	cycle_delta += clock->cycle_accumulated;
>  	ns_offset = cyc2ns(clock, cycle_delta);
>  
>  	return ns_offset;
> @@ -105,35 +107,7 @@
>  
>  cycle_t notrace get_monotonic_cycles(void)
>  {
> -	cycle_t cycle_now, cycle_delta, cycle_raw, cycle_last;
> -
> -	do {
> -		/*
> -		 * cycle_raw and cycle_last can change on
> -		 * another CPU and we need the delta calculation
> -		 * of cycle_now and cycle_last happen atomic, as well
> -		 * as the adding to cycle_raw. We don't need to grab
> -		 * any locks, we just keep trying until get all the
> -		 * calculations together in one state.
> -		 *
> -		 * In fact, we __cant__ grab any locks. This
> -		 * function is called from the latency_tracer which can
> -		 * be called anywhere. To grab any locks (including
> -		 * seq_locks) we risk putting ourselves into a deadlock.
> -		 */
> -		cycle_raw = clock->cycle_raw;
> -		cycle_last = clock->cycle_last;
> -
> -		/* read clocksource: */
> -		cycle_now = clocksource_read(clock);
> -
> -		/* calculate the delta since the last update_wall_time: */
> -		cycle_delta = (cycle_now - cycle_last) & clock->mask;
> -
> -	} while (cycle_raw != clock->cycle_raw ||
> -		 cycle_last != clock->cycle_last);
> -
> -	return cycle_raw + cycle_delta;
> +	return clocksource_get_basecycles(clock);
>  }
>  
>  unsigned long notrace cycles_to_usecs(cycle_t cycles)
> 
> 

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

  reply	other threads:[~2008-01-17  2:36 UTC|newest]

Thread overview: 100+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-01-09 23:29 [RFC PATCH 00/22 -v2] mcount and latency tracing utility -v2 Steven Rostedt
2008-01-09 23:29 ` [RFC PATCH 01/22 -v2] Add basic support for gcc profiler instrumentation Steven Rostedt
2008-01-10 18:19   ` Jan Kiszka
2008-01-10 19:54     ` Steven Rostedt
2008-01-10 23:02     ` Steven Rostedt
2008-01-10 18:28   ` Sam Ravnborg
2008-01-10 19:10     ` Steven Rostedt
2008-01-09 23:29 ` [RFC PATCH 02/22 -v2] Annotate core code that should not be traced Steven Rostedt
2008-01-09 23:29 ` [RFC PATCH 03/22 -v2] x86_64: notrace annotations Steven Rostedt
2008-01-09 23:29 ` [RFC PATCH 04/22 -v2] add notrace annotations to vsyscall Steven Rostedt
2008-01-09 23:29 ` [RFC PATCH 05/22 -v2] add notrace annotations for NMI routines Steven Rostedt
2008-01-09 23:29 ` [RFC PATCH 06/22 -v2] mcount based trace in the form of a header file library Steven Rostedt
2008-01-09 23:29 ` [RFC PATCH 07/22 -v2] tracer add debugfs interface Steven Rostedt
2008-01-09 23:29 ` [RFC PATCH 08/22 -v2] mcount tracer output file Steven Rostedt
2008-01-09 23:29 ` [RFC PATCH 09/22 -v2] mcount tracer show task comm and pid Steven Rostedt
2008-01-09 23:29 ` [RFC PATCH 10/22 -v2] Add a symbol only trace output Steven Rostedt
2008-01-09 23:29 ` [RFC PATCH 11/22 -v2] Reset the tracer when started Steven Rostedt
2008-01-09 23:29 ` [RFC PATCH 12/22 -v2] separate out the percpu date into a percpu struct Steven Rostedt
2008-01-09 23:29 ` [RFC PATCH 13/22 -v2] handle accurate time keeping over long delays Steven Rostedt
2008-01-10  0:00   ` john stultz
2008-01-10  0:09     ` Steven Rostedt
2008-01-10 19:54     ` Tony Luck
2008-01-10 20:15       ` Steven Rostedt
2008-01-10 20:41         ` john stultz
2008-01-10 20:29       ` john stultz
2008-01-10 20:42         ` Mathieu Desnoyers
2008-01-10 21:25           ` john stultz
2008-01-10 22:00             ` Mathieu Desnoyers
2008-01-10 22:40               ` Steven Rostedt
2008-01-10 22:51               ` john stultz
2008-01-10 23:05                 ` john stultz
2008-01-10 21:33         ` [RFC PATCH 13/22 -v2] handle accurate time keeping over longdelays Luck, Tony
2008-01-10  0:19   ` [RFC PATCH 13/22 -v2] handle accurate time keeping over long delays john stultz
2008-01-10  0:25     ` Steven Rostedt
2008-01-09 23:29 ` [RFC PATCH 14/22 -v2] time keeping add cycle_raw for actual incrementation Steven Rostedt
2008-01-09 23:29 ` [RFC PATCH 15/22 -v2] initialize the clock source to jiffies clock Steven Rostedt
2008-01-09 23:29 ` [RFC PATCH 16/22 -v2] add get_monotonic_cycles Steven Rostedt
2008-01-10  3:28   ` Daniel Walker
2008-01-15 21:46   ` Mathieu Desnoyers
2008-01-15 22:01     ` Steven Rostedt
2008-01-15 22:03       ` Steven Rostedt
2008-01-15 22:08       ` Mathieu Desnoyers
2008-01-16  1:38         ` Steven Rostedt
2008-01-16  3:17           ` Mathieu Desnoyers
2008-01-16 13:17             ` Steven Rostedt
2008-01-16 14:56               ` Mathieu Desnoyers
2008-01-16 15:06                 ` Steven Rostedt
2008-01-16 15:28                   ` Mathieu Desnoyers
2008-01-16 15:58                     ` Steven Rostedt
2008-01-16 17:00                       ` Mathieu Desnoyers
2008-01-16 17:49                         ` Mathieu Desnoyers
2008-01-16 19:43                         ` Steven Rostedt
2008-01-16 20:17                           ` Mathieu Desnoyers
2008-01-16 20:45                             ` Tim Bird
2008-01-16 20:49                             ` Steven Rostedt
2008-01-17 20:08                             ` Steven Rostedt
2008-01-17 20:37                               ` Frank Ch. Eigler
2008-01-17 21:03                                 ` Steven Rostedt
2008-01-18 22:26                                   ` Mathieu Desnoyers
2008-01-18 22:49                                     ` Steven Rostedt
2008-01-18 23:19                                       ` Mathieu Desnoyers
2008-01-19  3:36                                         ` Frank Ch. Eigler
2008-01-19  3:55                                           ` Steven Rostedt
2008-01-19  4:23                                             ` Frank Ch. Eigler
2008-01-19 15:29                                               ` Mathieu Desnoyers
2008-01-19  3:32                                       ` Frank Ch. Eigler
2008-01-16 18:01                       ` Tim Bird
2008-01-16 22:36                 ` john stultz
2008-01-16 22:51                   ` john stultz
2008-01-16 23:33                     ` Steven Rostedt
2008-01-17  2:28                       ` john stultz
2008-01-17  2:40                         ` Mathieu Desnoyers
2008-01-17  2:50                           ` Mathieu Desnoyers
2008-01-17  3:02                             ` Steven Rostedt
2008-01-17  3:21                             ` Paul Mackerras
2008-01-17  3:39                               ` Steven Rostedt
2008-01-17  4:22                                 ` Mathieu Desnoyers
2008-01-17  4:25                                 ` Mathieu Desnoyers
2008-01-17  4:14                               ` Mathieu Desnoyers
2008-01-17 15:22                                 ` Steven Rostedt
2008-01-17 17:46                                 ` Linus Torvalds
2008-01-17  2:51                           ` Steven Rostedt
2008-01-16 23:39                     ` Mathieu Desnoyers
2008-01-16 23:50                       ` Steven Rostedt
2008-01-17  0:36                         ` Steven Rostedt
2008-01-17  0:33                       ` john stultz
2008-01-17  2:20                         ` Mathieu Desnoyers
2008-01-17  1:03                       ` Linus Torvalds
2008-01-17  1:35                         ` Mathieu Desnoyers
2008-01-17  2:20                       ` john stultz
2008-01-17  2:35                         ` Mathieu Desnoyers [this message]
2008-01-09 23:29 ` [RFC PATCH 17/22 -v2] Add timestamps to tracer Steven Rostedt
2008-01-09 23:29 ` [RFC PATCH 18/22 -v2] Sort trace by timestamp Steven Rostedt
2008-01-09 23:29 ` [RFC PATCH 19/22 -v2] speed up the output of the tracer Steven Rostedt
2008-01-09 23:29 ` [RFC PATCH 20/22 -v2] Add latency_trace format tor tracer Steven Rostedt
2008-01-10  3:41   ` Daniel Walker
2008-01-09 23:29 ` [RFC PATCH 21/22 -v2] Split out specific tracing functions Steven Rostedt
2008-01-09 23:29 ` [RFC PATCH 22/22 -v2] Trace irq disabled critical timings Steven Rostedt
2008-01-10  3:58   ` Daniel Walker
2008-01-10 14:45     ` Steven Rostedt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080117023557.GC2322@Krystal \
    --to=mathieu.desnoyers@polymtl.ca \
    --cc=a.p.zijlstra@chello.nl \
    --cc=acme@ghostprotocols.net \
    --cc=akpm@linux-foundation.org \
    --cc=dwalker@mvista.com \
    --cc=fche@redhat.com \
    --cc=ghaskins@novell.com \
    --cc=hch@infradead.org \
    --cc=johnstul@us.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=paulus@samba.org \
    --cc=rostedt@goodmis.org \
    --cc=sam@ravnborg.org \
    --cc=srostedt@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=tim.bird@am.sony.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox