Re: [PATCH 5/8] seqlock: Better document raw_write_seqcount_latch()

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: mingo@kernel.org, rusty@rustcorp.com.au, oleg@redhat.com,
	paulmck@linux.vnet.ibm.com, torvalds@linux-foundation.org,
	linux-kernel@vger.kernel.org, andi@firstfloor.org,
	rostedt@goodmis.org, tglx@linutronix.de,
	David Woodhouse <David.Woodhouse@intel.com>,
	Rik van Riel <riel@redhat.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Michel Lespinasse <walken@google.com>
Subject: Re: [PATCH 5/8] seqlock: Better document raw_write_seqcount_latch()
Date: Wed, 18 Mar 2015 14:29:35 +0000 (UTC)	[thread overview]
Message-ID: <80740165.31353.1426688975785.JavaMail.zimbra@efficios.com> (raw)
In-Reply-To: <20150318134631.874407995@infradead.org>

----- Original Message -----
> Improve the documentation of the latch technique as used in the
> current timekeeping code, such that it can be readily employed
> elsewhere.
> 
> Borrow from the comments in timekeeping and replace those with a
> reference to this more generic comment.
> 
> Cc: David Woodhouse <David.Woodhouse@intel.com>
> Cc: Rik van Riel <riel@redhat.com>
> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> Cc: Oleg Nesterov <oleg@redhat.com>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Acked-by: Michel Lespinasse <walken@google.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  include/linux/seqlock.h   |   77
>  +++++++++++++++++++++++++++++++++++++++++++++-
>  kernel/time/timekeeping.c |   27 ----------------
>  2 files changed, 77 insertions(+), 27 deletions(-)
> 
> --- a/include/linux/seqlock.h
> +++ b/include/linux/seqlock.h
> @@ -233,9 +233,84 @@ static inline void raw_write_seqcount_en
>  	s->sequence++;
>  }
>  
> -/*
> +/**
>   * raw_write_seqcount_latch - redirect readers to even/odd copy
>   * @s: pointer to seqcount_t
> + *
> + * The latch technique is a multiversion concurrency control method that
> allows
> + * queries during non atomic modifications. If you can guarantee queries
> never
> + * interrupt the modification -- e.g. the concurrency is strictly between
> CPUs
> + * -- you most likely do not need this.
> + *
> + * Where the traditional RCU/lockless data structures rely on atomic
> + * modifications to ensure queries observe either the old or the new state
> the
> + * latch allows the same for non atomic updates. The trade-off is doubling
> the
> + * cost of storage; we have to maintain two copies of the entire data
> + * structure.
> + *
> + * Very simply put: we first modify one copy and then the other. This
> ensures
> + * there is always one copy in a stable state, ready to give us an answer.
> + *
> + * The basic form is a data structure like:
> + *
> + * struct latch_struct {
> + *	seqcount_t		seq;
> + *	struct data_struct	data[2];
> + * };
> + *
> + * Where a modification, which is assumed to be externally serialized, does
> the
> + * following:
> + *
> + * void latch_modify(struct latch_struct *latch, ...)
> + * {
> + *	smp_wmb();	<- Ensure that the last data[1] update is visible
> + *	latch->seq++;
> + *	smp_wmb();	<- Ensure that the seqcount update is visible
> + *
> + *	modify(latch->data[0], ...);
> + *
> + *	smp_wmb();	<- Ensure that the data[0] update is visible
> + *	latch->seq++;
> + *	smp_wmb();	<- Ensure that the seqcount update is visible
> + *
> + *	modify(latch->data[1], ...);
> + * }
> + *
> + * The query will have a form like:
> + *
> + * struct entry *latch_query(struct latch_struct *latch, ...)
> + * {
> + *	struct entry *entry;
> + *	unsigned seq;
> + *	int idx;

very minor nit: why is seq unsigned, but idx a signed int ?
Could we do:

  unsigned seq, idx;   instead ?

Other than that:

Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

> + *
> + *	do {
> + *		seq = latch->seq;
> + *		smp_rmb();
> + *
> + *		idx = seq & 0x01;
> + *		entry = data_query(latch->data[idx], ...);
> + *
> + *		smp_rmb();
> + *	} while (seq != latch->seq);
> + *
> + *	return entry;
> + * }
> + *
> + * So during the modification, queries are first redirected to data[1]. Then
> we
> + * modify data[0]. When that is complete, we redirect queries back to
> data[0]
> + * and we can modify data[1].
> + *
> + * NOTE: The non-requirement for atomic modifications does _NOT_ include
> + *       the publishing of new entries in the case where data is a dynamic
> + *       data structure.
> + *
> + *       An iteration might start in data[0] and get suspended long enough
> + *       to miss an entire modification sequence, once it resumes it might
> + *       observe the new entry.
> + *
> + * NOTE: When data is a dynamic data structure; one should use regular RCU
> + *       patterns to manage the lifetimes of the objects within.
>   */
>  static inline void raw_write_seqcount_latch(seqcount_t *s)
>  {
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -339,32 +339,7 @@ static inline s64 timekeeping_get_ns_raw
>   * We want to use this from any context including NMI and tracing /
>   * instrumenting the timekeeping code itself.
>   *
> - * So we handle this differently than the other timekeeping accessor
> - * functions which retry when the sequence count has changed. The
> - * update side does:
> - *
> - * smp_wmb();	<- Ensure that the last base[1] update is visible
> - * tkf->seq++;
> - * smp_wmb();	<- Ensure that the seqcount update is visible
> - * update(tkf->base[0], tkr);
> - * smp_wmb();	<- Ensure that the base[0] update is visible
> - * tkf->seq++;
> - * smp_wmb();	<- Ensure that the seqcount update is visible
> - * update(tkf->base[1], tkr);
> - *
> - * The reader side does:
> - *
> - * do {
> - *	seq = tkf->seq;
> - *	smp_rmb();
> - *	idx = seq & 0x01;
> - *	now = now(tkf->base[idx]);
> - *	smp_rmb();
> - * } while (seq != tkf->seq)
> - *
> - * As long as we update base[0] readers are forced off to
> - * base[1]. Once base[0] is updated readers are redirected to base[0]
> - * and the base[1] update takes place.
> + * Employ the latch technique; see @raw_write_seqcount_latch.
>   *
>   * So if a NMI hits the update of base[0] then it will use base[1]
>   * which is still consistent. In the worst case this can result is a
> 
> 
> 

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

next prev parent reply	other threads:[~2015-03-18 14:29 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-03-18 13:36 [PATCH 0/8] latched RB-trees and __module_address() Peter Zijlstra
2015-03-18 13:36 ` [PATCH 1/8] module: Sanitize RCU usage and locking Peter Zijlstra
2015-03-20  3:36   ` Rusty Russell
2015-03-18 13:36 ` [PATCH 2/8] module: Annotate module version magic Peter Zijlstra
2015-03-18 13:36 ` [PATCH 3/8] module, jump_label: Fix module locking Peter Zijlstra
2015-03-20  4:26   ` Rusty Russell
2015-03-20  9:11     ` Peter Zijlstra
2015-03-18 13:36 ` [PATCH 4/8] rbtree: Make lockless searches non-fatal Peter Zijlstra
2015-03-18 13:36 ` [PATCH 5/8] seqlock: Better document raw_write_seqcount_latch() Peter Zijlstra
2015-03-18 14:29   ` Mathieu Desnoyers [this message]
2015-03-18 14:50     ` Peter Zijlstra
2015-03-18 13:36 ` [PATCH 6/8] rbtree: Implement generic latch_tree Peter Zijlstra
2015-03-18 16:20   ` Ingo Molnar
2015-03-19  5:14   ` Andrew Morton
2015-03-19  7:25     ` Peter Zijlstra
2015-03-19 20:58       ` Andrew Morton
2015-03-19 21:36         ` Steven Rostedt
2015-03-19 21:38           ` Steven Rostedt
2015-03-19 21:47           ` Andrew Morton
2015-03-19 21:54             ` Steven Rostedt
2015-03-19 22:12           ` Peter Zijlstra
2015-03-19 22:04         ` Peter Zijlstra
2015-03-20  9:50           ` Peter Zijlstra
2015-03-19 22:10         ` Peter Zijlstra
2015-03-18 13:36 ` [PATCH 7/8] module: Optimize __module_address() using a latched RB-tree Peter Zijlstra
2015-03-18 13:36 ` [PATCH 8/8] module: Use __module_address() for module_address_lookup() Peter Zijlstra
2015-03-18 14:21 ` [PATCH 0/8] latched RB-trees and __module_address() Christoph Hellwig
2015-03-18 14:38   ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=80740165.31353.1426688975785.JavaMail.zimbra@efficios.com \
    --to=mathieu.desnoyers@efficios.com \
    --cc=David.Woodhouse@intel.com \
    --cc=aarcange@redhat.com \
    --cc=andi@firstfloor.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=oleg@redhat.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peterz@infradead.org \
    --cc=riel@redhat.com \
    --cc=rostedt@goodmis.org \
    --cc=rusty@rustcorp.com.au \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=walken@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).