From: Steven Rostedt <rostedt@goodmis.org>
To: LKML <linux-kernel@vger.kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Christoph Hellwig <hch@infradead.org>,
	Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>,
	Gregory Haskins <ghaskins@novell.com>,
	Arnaldo Carvalho de Melo <acme@ghostprotocols.net>,
	Thomas Gleixner <tglx@linutronix.de>,
	Tim Bird <tim.bird@am.sony.com>, Sam Ravnborg <sam@ravnborg.org>,
	"Frank Ch. Eigler" <fche@redhat.com>,
	Jan Kiszka <jan.kiszka@siemens.com>,
	John Stultz <johnstul@us.ibm.com>,
	John Stultz <johstul@us.ibm.com>,
	Steven Rostedt <srostedt@redhat.com>
Subject: [RFC PATCH 12/23 -v4] Use RCU algorithm for monotonic cycles.
Date: Mon, 21 Jan 2008 10:22:43 -0500
Message-ID: <20080121152352.789802471@goodmis.org>
In-Reply-To: 20080121152231.579118762@goodmis.org

[-- Attachment #1: get-monotonic-rcu-experimental.patch --]
[-- Type: text/plain, Size: 6194 bytes --]

From: john stultz <johnstul@us.ibm.com>

On Wed, 2008-01-16 at 18:39 -0500, Mathieu Desnoyers wrote:
> I would disable preemption in clocksource_get_basecycles. We would not
> want to be scheduled out while we hold a pointer to the old array
> element.
> 
> > +	int num = cs->base_num;
> 
> Since you deal with base_num in a shared manner (not per cpu), you will
> need a smp_read_barrier_depends() here after the cs->base_num read.
> 
> You should think about reading the cs->base_num first, and _after_ that
> read the real clocksource. Here, the clocksource value is passed as
> parameter. It means that the read clocksource may have been read in the
> previous RCU window.

Here's an updated version of the patch with the suggested memory barrier
changes and the favored (1-x) index inversion. ;)  Let me know if you see
any other holes, or have any other suggestions or ideas.

Still untested (my test box will free up soon, I promise!), but it builds.

[ Steven Rostedt has been running this patch with no problems ]
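
For readers following the thread, the two-slot scheme can be sketched in
user space roughly as below. This is an illustration under assumptions,
not the patch itself: C11 release/acquire atomics stand in for wmb() and
smp_read_barrier_depends(), and struct mono_counter, fake_counter,
read_counter(), mono_accumulate() and mono_read() are made-up names.
Updates must still be serialized by the caller, as with xtime_lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define LOAD(p)		atomic_load_explicit((p), memory_order_relaxed)
#define STORE(p, v)	atomic_store_explicit((p), (v), memory_order_relaxed)

/* Hypothetical stand-in for the base[] pair added to struct clocksource. */
struct base_pair {
	_Atomic uint64_t cycle_base_last;	/* raw counter at last update */
	_Atomic uint64_t cycle_base;		/* 64-bit total at last update */
};

struct mono_counter {
	struct base_pair base[2];
	atomic_int base_num;	/* slot that readers should use */
	uint64_t mask;		/* width of the raw counter, 2^n - 1 */
};

/* Hypothetical raw counter, standing in for clocksource_read(). */
static uint64_t fake_counter;

static uint64_t read_counter(struct mono_counter *mc)
{
	return fake_counter & mc->mask;
}

/* Writer: fill the inactive slot, then publish it by flipping the index. */
static void mono_accumulate(struct mono_counter *mc)
{
	int cur = LOAD(&mc->base_num);
	int next = 1 - cur;
	uint64_t now = read_counter(mc);
	uint64_t offset;

	assert(cur == 0 || cur == 1);
	offset = (now - LOAD(&mc->base[cur].cycle_base_last)) & mc->mask;
	STORE(&mc->base[next].cycle_base,
	      LOAD(&mc->base[cur].cycle_base) + offset);
	STORE(&mc->base[next].cycle_base_last, now);
	/* Release store: slot contents become visible before the new index. */
	atomic_store_explicit(&mc->base_num, next, memory_order_release);
}

/*
 * Reader: takes no lock. The index is read first and the raw counter
 * only afterwards, so the counter value is never older than the slot.
 * A reader stalled across two updates could still see a slot being
 * rewritten; the patch narrows that window with preempt_disable().
 */
static uint64_t mono_read(struct mono_counter *mc)
{
	int num = atomic_load_explicit(&mc->base_num, memory_order_acquire);
	uint64_t now = read_counter(mc);
	uint64_t offset;

	offset = (now - LOAD(&mc->base[num].cycle_base_last)) & mc->mask;
	return LOAD(&mc->base[num].cycle_base) + offset;
}
```

The writer never touches the slot that base_num currently points at, so
a reader that picked up the old index keeps seeing a consistent pair for
as long as no second update lands.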

Signed-off-by: John Stultz <johstul@us.ibm.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>

---
 include/linux/clocksource.h |   50 +++++++++++++++++++++++++++++++++++---------
 kernel/time/timekeeping.c   |   36 ++++---------------------------
 2 files changed, 45 insertions(+), 41 deletions(-)

Index: linux-mcount.git/include/linux/clocksource.h
===================================================================
--- linux-mcount.git.orig/include/linux/clocksource.h	2008-01-18 23:48:45.000000000 -0500
+++ linux-mcount.git/include/linux/clocksource.h	2008-01-18 23:48:46.000000000 -0500
@@ -87,9 +87,17 @@ struct clocksource {
 	 * more than one cache line.
 	 */
 	struct {
-		cycle_t cycle_last, cycle_accumulated, cycle_raw;
-	} ____cacheline_aligned_in_smp;
+		cycle_t cycle_last, cycle_accumulated;
 
+		/* base structure provides lock-free read
+		 * access to a virtualized 64bit counter
+		 * Uses RCU-like update.
+		 */
+		struct {
+			cycle_t cycle_base_last, cycle_base;
+		} base[2];
+		int base_num;
+	} ____cacheline_aligned_in_smp;
 	u64 xtime_nsec;
 	s64 error;
 
@@ -175,19 +183,29 @@ static inline cycle_t clocksource_read(s
 }
 
 /**
- * clocksource_get_cycles: - Access the clocksource's accumulated cycle value
+ * clocksource_get_basecycles: - get the clocksource's accumulated cycle value
  * @cs:		pointer to clocksource being read
  * @now:	current cycle value
  *
  * Uses the clocksource to return the current cycle_t value.
  * NOTE!!!: This is different from clocksource_read, because it
- * returns the accumulated cycle value! Must hold xtime lock!
+ * returns a 64bit wide accumulated value.
  */
 static inline cycle_t
-clocksource_get_cycles(struct clocksource *cs, cycle_t now)
+clocksource_get_basecycles(struct clocksource *cs)
 {
-	cycle_t offset = (now - cs->cycle_last) & cs->mask;
-	offset += cs->cycle_accumulated;
+	int num;
+	cycle_t now, offset;
+
+	preempt_disable();
+	num = cs->base_num;
+	smp_read_barrier_depends();
+	now = clocksource_read(cs);
+	offset = (now - cs->base[num].cycle_base_last);
+	offset &= cs->mask;
+	offset += cs->base[num].cycle_base;
+	preempt_enable();
+
 	return offset;
 }
 
@@ -197,14 +215,26 @@ clocksource_get_cycles(struct clocksourc
  * @now:	current cycle value
  *
  * Used to avoids clocksource hardware overflow by periodically
- * accumulating the current cycle delta. Must hold xtime write lock!
+ * accumulating the current cycle delta. Uses RCU-like update, but
+ * ***still requires that xtime_lock be held for writing!***
  */
 static inline void clocksource_accumulate(struct clocksource *cs, cycle_t now)
 {
-	cycle_t offset = (now - cs->cycle_last) & cs->mask;
+	/* First update the monotonic base portion.
+	 * The dual array update method allows for lock-free reading.
+	 */
+	int num = 1 - cs->base_num;
+	cycle_t offset = (now - cs->base[1-num].cycle_base_last);
+	offset &= cs->mask;
+	cs->base[num].cycle_base = cs->base[1-num].cycle_base + offset;
+	cs->base[num].cycle_base_last = now;
+	wmb();
+	cs->base_num = num;
+
+	/* Now update the cycle_accumulated portion */
+	offset = (now - cs->cycle_last) & cs->mask;
 	cs->cycle_last = now;
 	cs->cycle_accumulated += offset;
-	cs->cycle_raw += offset;
 }
 
 /**
Index: linux-mcount.git/kernel/time/timekeeping.c
===================================================================
--- linux-mcount.git.orig/kernel/time/timekeeping.c	2008-01-18 23:48:45.000000000 -0500
+++ linux-mcount.git/kernel/time/timekeeping.c	2008-01-18 23:48:46.000000000 -0500
@@ -71,10 +71,12 @@ static struct clocksource *clock = &cloc
  */
 static inline s64 __get_nsec_offset(void)
 {
-	cycle_t cycle_delta;
+	cycle_t now, cycle_delta;
 	s64 ns_offset;
 
-	cycle_delta = clocksource_get_cycles(clock, clocksource_read(clock));
+	now = clocksource_read(clock);
+	cycle_delta = (now - clock->cycle_last) & clock->mask;
+	cycle_delta += clock->cycle_accumulated;
 	ns_offset = cyc2ns(clock, cycle_delta);
 
 	return ns_offset;
@@ -105,35 +107,7 @@ static inline void __get_realtime_clock_
 
 cycle_t notrace get_monotonic_cycles(void)
 {
-	cycle_t cycle_now, cycle_delta, cycle_raw, cycle_last;
-
-	do {
-		/*
-		 * cycle_raw and cycle_last can change on
-		 * another CPU and we need the delta calculation
-		 * of cycle_now and cycle_last happen atomic, as well
-		 * as the adding to cycle_raw. We don't need to grab
-		 * any locks, we just keep trying until get all the
-		 * calculations together in one state.
-		 *
-		 * In fact, we __cant__ grab any locks. This
-		 * function is called from the latency_tracer which can
-		 * be called anywhere. To grab any locks (including
-		 * seq_locks) we risk putting ourselves into a deadlock.
-		 */
-		cycle_raw = clock->cycle_raw;
-		cycle_last = clock->cycle_last;
-
-		/* read clocksource: */
-		cycle_now = clocksource_read(clock);
-
-		/* calculate the delta since the last update_wall_time: */
-		cycle_delta = (cycle_now - cycle_last) & clock->mask;
-
-	} while (cycle_raw != clock->cycle_raw ||
-		 cycle_last != clock->cycle_last);
-
-	return cycle_raw + cycle_delta;
+	return clocksource_get_basecycles(clock);
 }
 
 unsigned long notrace cycles_to_usecs(cycle_t cycles)
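
The "(now - last) & cs->mask" idiom used throughout the diff is what
makes periodic accumulation necessary. A small sketch of the arithmetic,
assuming a power-of-two counter width; masked_delta() is a made-up
helper name, not something in the kernel:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Ticks elapsed from 'last' to 'now' on a free-running counter that is
 * (mask + 1) ticks wide. Unsigned subtraction wraps modulo 2^64 and the
 * mask folds the result back into the counter's own width, so a single
 * wrap between the two samples is handled correctly.
 */
static uint64_t masked_delta(uint64_t now, uint64_t last, uint64_t mask)
{
	assert((mask & (mask + 1)) == 0);	/* mask must be 2^n - 1 */
	return (now - last) & mask;
}
```

With a 32-bit counter, masked_delta(0x5, 0xfffffff0, 0xffffffff) gives
0x15 even though the raw value went backwards. The limit is a full wrap:
on an 8-bit counter, 300 real ticks after last = 40 the raw value reads
84 and the delta comes out as 44, so the accumulate step has to run at
least once per wrap period.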

-- 
