From: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
To: Daniel Walker <dwalker@mvista.com>
Cc: Ingo Molnar <mingo@elte.hu>,
	mbligh@google.com, linux-kernel@vger.kernel.org,
	johnstul@us.ibm.com, Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [RFC] Fast assurate clock readable from user space and NMI handler
Date: Tue, 27 Feb 2007 14:04:42 -0500
Message-ID: <20070227190442.GA11272@Krystal>
In-Reply-To: <1172597055.5517.233.camel@imap.mvista.com>

* Daniel Walker (dwalker@mvista.com) wrote:
> On Tue, 2007-02-27 at 11:02 -0500, Mathieu Desnoyers wrote:
> > * Daniel Walker (dwalker@mvista.com) wrote:
> > > On Tue, 2007-02-27 at 02:38 -0500, Mathieu Desnoyers wrote:
> > > 
> > > > 
> > > > I am concerned about the automatic fallback to the PIT when no other
> > > > clock source is available. A clocksource read would be atomic when TSC
> > > > or HPET are available, but would fall back on PIT otherwise. There
> > > > should be some way to specify that a caller is only interested in atomic
> > > > clock sources (if none are available, the call should simply return an
> > > > error, or 0).
> > > > 
> > > I'm not sure what you mean by using the RCU
> > 
> > The original proposal of this thread uses an RCU (read-copy-update) style
> > update of the previous 64-bit counter: it swaps a pointer (atomically)
> > upon update by incrementing a word-sized counter that is used, by the
> > reader, to get the offset in the array (with a modulo operation) for the
> > current readable data and as a way to detect incorrect reads of
> > overwritten information (we re-read the word-sized counter after having
> > read the data structure to make sure it has not been incremented. If we
> > detect an increment, we redo the whole operation).
> 
> I didn't see RCU at all in your original message, so I'm not sure how
> you propose to use it .. My understanding of the RCU was that it
> couldn't be used from interrupt context, that could be totally wrong so
> I'll let you explain how you planned to use it.
> 

1 - I do not plan to use the rcupdate.h API, because it is oriented
towards allocating/freeing data structures after a quiescent state. I
don't need that. I only want a 64-bit data structure that is always
valid for reading, with atomic update. Therefore, I keep an array of two
64-bit structures. At all times, one is used as the "readable" value and
the other as the "writeable" one. The roles are exchanged at each update.
The word-sized counter is used to select the current read and write
pointers through a mask, and is also used to detect bad reads (if a read
is preempted and two updates then occur, the reader could otherwise read
a bad value without knowing it). By keeping a word-sized counter of the
number of updates, we have 32 (or 64) bits, depending on the
architecture, before the wrap-around, which should not happen even in
the far future.
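
The scheme described above can be sketched in portable C11 as follows.
This is only an illustration under stated assumptions, not the code from
the original proposal: the names (struct dual_clock, dual_clock_update,
dual_clock_read) are hypothetical, a single writer is assumed, C11
atomics stand in for the kernel's own barriers, and each 64-bit value is
stored as two 32-bit halves to model a machine whose word is 32 bits.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* One 64-bit value kept as two word-sized halves, as on a 32-bit CPU. */
struct slot64 {
	_Atomic uint32_t lo;
	_Atomic uint32_t hi;
};

/* Hypothetical layout: nothing here is taken from the kernel tree. */
struct dual_clock {
	atomic_ulong nr_updates;	/* word-sized update counter */
	struct slot64 slot[2];		/* slot[nr_updates & 1] is readable */
};

static inline void dual_clock_init(struct dual_clock *c, uint64_t v)
{
	atomic_init(&c->nr_updates, 0);
	for (int i = 0; i < 2; i++) {
		atomic_init(&c->slot[i].lo, (uint32_t)v);
		atomic_init(&c->slot[i].hi, (uint32_t)(v >> 32));
	}
}

/* Single writer assumed: fill the writeable slot, then swap the roles. */
static inline void dual_clock_update(struct dual_clock *c, uint64_t v)
{
	unsigned long n = atomic_load_explicit(&c->nr_updates,
					       memory_order_relaxed);
	struct slot64 *w = &c->slot[(n + 1) & 1];

	/* A reader that sees the new halves must also see the counter >= n. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&w->lo, (uint32_t)v, memory_order_relaxed);
	atomic_store_explicit(&w->hi, (uint32_t)(v >> 32),
			      memory_order_relaxed);
	/* Publish: the slot writes are ordered before the role swap. */
	atomic_store_explicit(&c->nr_updates, n + 1, memory_order_release);
}

/* Lock-free read: redo everything if the counter moved during the read. */
static inline uint64_t dual_clock_read(struct dual_clock *c)
{
	unsigned long before, after;
	uint32_t lo, hi;

	do {
		before = atomic_load_explicit(&c->nr_updates,
					      memory_order_acquire);
		struct slot64 *r = &c->slot[before & 1];

		lo = atomic_load_explicit(&r->lo, memory_order_relaxed);
		hi = atomic_load_explicit(&r->hi, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		after = atomic_load_explicit(&c->nr_updates,
					     memory_order_relaxed);
	} while (before != after);

	return ((uint64_t)hi << 32) | lo;
}

/* Single-threaded sanity check of the read and update paths. */
static inline void dual_clock_selftest(void)
{
	struct dual_clock c;

	dual_clock_init(&c, 5);
	assert(dual_clock_read(&c) == 5);
	dual_clock_update(&c, 0x1FFFFFFFFull);
	assert(dual_clock_read(&c) == 0x1FFFFFFFFull);
}
```

The reader takes no lock and never writes shared state, so in this
sketch it can be interrupted at any point and simply retries when the
counter has moved. The mask "& 1" plays the role of the modulo operation
on the two-entry array.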



> > > > I still think that an RCU style update mechanism would be a good way to
> > > > fix the current clocksource read issue. Another, slower and non NMI
> > > > safe way to do this would be with a read seqlock and with IRQ disabling.
> > > 
> > > , but the pit clocksource
> > > does disable interrupts with a spin_lock_irqsave().
> > > 
> > 
> > When I say "clocksource read issue", I am talking about the race
> > between the function you proposed earlier, which you say is used in
> > -rt kernels for latency tracing (get_monotonic_cycles), and the HPET
> > and TSC "last cycles" updates.
> 
> Right .. You said that regular interrupts would cause this non-atomic
> 64-bit update race , but the pit disabled interrupts, and the
> last_cycles update is done with interrupts off .. So I think we're back
> to only the NMI case ..
> 
> Did you have another scenario ?
> 

__get_nsec_offset : reads clock->cycle_last. Should be called with
xtime_lock held. (ok so far, but see below)

change_clocksource
clock->cycle_last = now; (non-atomic 64-bit update. Not protected by
any lock?) -> would this race with __get_nsec_offset?

update_wall_time
Called from the timer interrupt. Holds xtime_lock and has a higher
priority than other interrupts. Other clock->cycle_last updates are
protected by write_seqlock_irqsave.

get_monotonic_cycles (as you proposed, in -rt kernels):
reads clock->cycle_last. Not protected by any read seqlock and does not
disable interrupts. Races with change_clocksource, update_wall_time and
all other time update functions. For instance, if someone uses
get_monotonic_cycles in process context and the timer interrupt runs
update_wall_time right in the middle of the two 32-bit reads, the value
will be wrong.
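
To make the last case concrete, here is a small deterministic simulation
of such a torn read. It is a sketch only: struct split_u64, split_store
and torn_read are hypothetical names, and the "interrupt" is simulated
by an ordinary store placed between the two 32-bit reads rather than by
a real timer interrupt.

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit cycle count as a 32-bit machine sees it: two separate words. */
struct split_u64 {
	uint32_t lo;
	uint32_t hi;
};

/* Non-atomic 64-bit store: two independent 32-bit writes. */
static inline void split_store(struct split_u64 *v, uint64_t x)
{
	v->lo = (uint32_t)x;
	v->hi = (uint32_t)(x >> 32);
}

/*
 * Unprotected read of the two halves. The call to split_store() stands
 * in for an update that lands between the two 32-bit reads.
 */
static inline uint64_t torn_read(struct split_u64 *v, uint64_t update)
{
	uint32_t lo = v->lo;		/* first 32-bit read: old low half */
	split_store(v, update);		/* simulated update in the middle */
	uint32_t hi = v->hi;		/* second 32-bit read: new high half */

	return ((uint64_t)hi << 32) | lo;
}

/* The old value is 0xFFFFFFFF, the new one 0x100000000. */
static inline void torn_read_demo(void)
{
	struct split_u64 cycle_last;
	uint64_t seen;

	split_store(&cycle_last, 0xFFFFFFFFull);
	seen = torn_read(&cycle_last, 0x100000000ull);

	/* The reader gets neither the old nor the new value. */
	assert(seen != 0xFFFFFFFFull);
	assert(seen != 0x100000000ull);
	assert(seen == 0x1FFFFFFFFull);
}
```

With the low word about to carry into the high word, the mixed result
is off by almost 2^32 cycles from either valid value.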

Mathieu

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
