public inbox for linux-kernel@vger.kernel.org
From: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
To: Daniel Walker <dwalker@mvista.com>
Cc: Ingo Molnar <mingo@elte.hu>,
	mbligh@google.com, linux-kernel@vger.kernel.org,
	johnstul@us.ibm.com, Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [RFC] Fast assurate clock readable from user space and NMI handler
Date: Tue, 27 Feb 2007 14:04:42 -0500
Message-ID: <20070227190442.GA11272@Krystal>
In-Reply-To: <1172597055.5517.233.camel@imap.mvista.com>

* Daniel Walker (dwalker@mvista.com) wrote:
> On Tue, 2007-02-27 at 11:02 -0500, Mathieu Desnoyers wrote:
> > * Daniel Walker (dwalker@mvista.com) wrote:
> > > On Tue, 2007-02-27 at 02:38 -0500, Mathieu Desnoyers wrote:
> > > 
> > > > 
> > > > I am concerned about the automatic fallback to the PIT when no other
> > > > clock source is available. A clocksource read would be atomic when TSC
> > > > or HPET are available, but would fall back on PIT otherwise. There
> > > > should be some way to specify that a caller is only interested in atomic
> > > > clock sources (if none are available, the call should simply return an
> > > > error, or 0).
> > > > 
> > > I'm not sure what you mean by using the RCU
> > 
> > The original proposal of this thread uses a RCU (read-copy-update) style
> > update of the previous 64 bits counter : it swaps a pointer (atomically)
> > upon update by incrementing a word-sized counter that is used, by the
> > reader, to get the offset in the array (with a modulo operation) for the
> > current readable data and as a way to detect incorrect reads of
> > overwritten information (we re-read the word-sized counter after having
> > read the data structure to make sure it has not been incremented. If we
> > detect an increment, we redo the whole operation).
> 
> I didn't see RCU at all in your original message, so I'm not sure how
> you propose to use it .. My understanding of the RCU was that it
> couldn't be used from interrupt context, that could be totally wrong so
> I'll let you explain how you planned to use it.
> 

1 - I do not plan to use the rcupdate.h API, because it is oriented
towards allocating/freeing data structures after a quiescent state. I
don't need that. I only want a 64-bit data structure that is always
valid for reading and is updated atomically. Therefore, I keep an array
of two 64-bit structures. At all times, one is used as the "readable"
value and the other as the "writable" one. The roles are exchanged at
each update. The word-sized counter is used to select the current read
and write slots through a mask, and is also used to detect bad reads
(if a read is preempted and two updates then occur, the reader could
read a bad value without knowing it). By keeping a word-sized counter of
the number of updates, we have 32 or 64 bits (depending on the
architecture) before wrap-around, which should not happen even in the
far future.
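
A minimal user-space sketch of the two-slot scheme described above,
written with C11 atomics rather than kernel primitives. All names
(dual_clock, dual_clock_update, dual_clock_read) are hypothetical and do
not come from the original patch. It assumes a single writer, and it uses
sequentially consistent atomics for simplicity where kernel code would
presumably use explicit memory barriers. Each 64-bit value is modelled as
two 32-bit words to mimic a 32-bit CPU.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical types: a 64-bit value that a 32-bit CPU can only
 * access as two separate words. */
struct clock_slot {
	_Atomic uint32_t lo;
	_Atomic uint32_t hi;
};

struct dual_clock {
	atomic_uint update_count;	/* word-sized count of updates */
	struct clock_slot slot[2];	/* one readable, one writable */
};

/* Assumption: only one writer ever calls this (e.g. one update path). */
static void dual_clock_update(struct dual_clock *dc, uint64_t value)
{
	unsigned int next = atomic_load(&dc->update_count) + 1;
	struct clock_slot *w = &dc->slot[next & 1];	/* current write slot */

	atomic_store(&w->lo, (uint32_t)value);
	atomic_store(&w->hi, (uint32_t)(value >> 32));
	/* Publish: after this store the two slots exchange roles. */
	atomic_store(&dc->update_count, next);
}

static uint64_t dual_clock_read(struct dual_clock *dc)
{
	unsigned int seen;
	uint32_t lo, hi;

	do {
		seen = atomic_load(&dc->update_count);
		lo = atomic_load(&dc->slot[seen & 1].lo);
		hi = atomic_load(&dc->slot[seen & 1].hi);
		/* Retry if any update was published while we were reading. */
	} while (atomic_load(&dc->update_count) != seen);

	return ((uint64_t)hi << 32) | lo;
}
```

The reader takes no lock and does not disable interrupts; it only
retries when the counter moved during the read, which is what would make
this style of read usable from contexts such as an NMI handler.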



> > > > I still think that an RCU style update mechanism would be a good way to
> > > > fix the current clocksource read issue. Another, slower and non NMI
> > > > safe way to do this would be with a read seqlock and with IRQ disabling.
> > > 
> > > , but the pit clocksource
> > > does disable interrupts with a spin_lock_irqsave().
> > > 
> > 
> > When I say "clocksource read issue", I am talking about
> > race between the function you proposed earlier, which you say is used in
> > -rt kernels for latency tracing (get_monotonic_cycles), and HPET and TSC
> > "last cycles" updates.
> 
> Right .. You said that regular interrupts would cause this non-atomic
> 64-bit update race , but the pit disabled interrupts, and the
> last_cycles update is done with interrupts off .. So I think we're back
> to only the NMI case ..
> 
> Did you have another scenario ?
> 

__get_nsec_offset : reads clock->cycle_last. Should be called with
xtime_lock held. (ok so far, but see below)

change_clocksource
clock->cycle_last = now; (non-atomic 64-bit update. Not protected by
any lock?) -> this would race with __get_nsec_offset?

update_wall_time
Called from timer interrupt. Holds xtime_lock and has a priority higher
than other interrupts. Other clock->cycle_last updates are protected by
write_seqlock_irqsave.

get_monotonic_cycles (as you proposed, in -rt kernels) :
reads clock->cycle_last. Not protected by any read seqlock and does not
disable interrupts. Races with change_clocksource, update_wall_time and
all other time update functions. For instance, if someone uses
get_monotonic_cycles in process context and the timer interrupt runs
update_wall_time right in the middle of the two 32-bit reads, the value
will be wrong.
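
To make the last case concrete, here is a small deterministic model of
the torn read, as a sketch only. The struct and function names are
hypothetical, the interrupt is simulated by an ordinary store placed
between the two 32-bit loads, and the low word is assumed to be read
first (the real order depends on the compiler and architecture).

```c
#include <stdint.h>

/* Hypothetical model of a 64-bit field as a 32-bit CPU sees it. */
struct split64 {
	uint32_t lo;
	uint32_t hi;
};

static void split64_store(struct split64 *s, uint64_t v)
{
	s->lo = (uint32_t)v;
	s->hi = (uint32_t)(v >> 32);
}

/*
 * Simulates an unprotected reader of cycle_last that is interrupted
 * between its two 32-bit loads by an update writing updated_value.
 */
static uint64_t torn_read(struct split64 *cycle_last, uint64_t updated_value)
{
	uint32_t lo = cycle_last->lo;			/* first 32-bit read */

	split64_store(cycle_last, updated_value);	/* "timer interrupt" */

	uint32_t hi = cycle_last->hi;			/* second 32-bit read */

	return ((uint64_t)hi << 32) | lo;
}
```

With an old value of 0x00000000FFFFFFFF and an update to
0x0000000100000000, this reader returns 0x00000001FFFFFFFF, which matches
neither the old nor the new value and is off by almost 2^32 cycles.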

Mathieu

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
