All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: rusty@rustcorp.com.au, mathieu.desnoyers@efficios.com,
	oleg@redhat.com, paulmck@linux.vnet.ibm.com,
	torvalds@linux-foundation.org, linux-kernel@vger.kernel.org,
	andi@firstfloor.org, rostedt@goodmis.org, tglx@linutronix.de,
	laijs@cn.fujitsu.com, linux@horizon.com,
	David Woodhouse <David.Woodhouse@intel.com>,
	Rik van Riel <riel@redhat.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Michel Lespinasse <walken@google.com>
Subject: Re: [PATCH v5 04/10] rbtree: Make lockless searches non-fatal
Date: Mon, 13 Apr 2015 17:50:08 +0200	[thread overview]
Message-ID: <20150413155008.GB6040@gmail.com> (raw)
In-Reply-To: <20150413141213.432590551@infradead.org>


* Peter Zijlstra <peterz@infradead.org> wrote:

> Change the insert and erase code such that lockless searches are
> non-fatal.
> 
> In and of itself an rbtree cannot be correctly searched while
> in-modification, we can however provide weaker guarantees that will
> allow the rbtree to be used in conjunction with other techniques, such
> as latches; see 9b0fd802e8c0 ("seqcount: Add raw_write_seqcount_latch()").
> 
> For this to work we need the following guarantees from the rbtree
> code:
> 
>  1) a lockless reader must not see partial stores, this would allow it
>     to observe nodes that are invalid memory.
> 
>  2) there must not be (temporary) loops in the tree structure in the
>     modifier's program order, this would cause a lookup which
>     interrupts the modifier to get stuck indefinitely.
> 
> For 1) we must use WRITE_ONCE() for all updates to the tree structure;
> in particular this patch only does rb_{left,right} as those are the
> only element required for simple searches.
> 
> It generates slightly worse code, probably because volatile. But in
> pointer chasing heavy code a few instructions more should not matter.

So I had a look at code generation on x86/64-defconfig, it adds 2 more 
instructions, out of 900+ instructions total:

   text    data     bss     dec     hex filename
   3299       0       0    3299     ce3 rbtree.o.before
   3308       0       0    3308     cec rbtree.o.after

One of the instructions is a MOV, the other AFAICS is a NOP due to 
changed jump target alignment.

Interestingly, when compiled with -Os then your patch actually 
_shrinks_ the code:

   text    data     bss     dec     hex filename
   2524       0       0    2524     9dc rbtree.o.before
   2440       0       0    2440     988 rbtree.o.after

and rather significantly so. This is with GCC 4.9. Possibly your patch 
unconfused GCC somehow.

So just for kicks I applied my patch-set that fixes up jump target 
alignments, and the numbers with the regular -O2 became:

   text    data     bss     dec     hex filename
   2995       0       0    2995     bb3 rbtree.o.before
   2981       0       0    2981     ba5 rbtree.o.after

so your patch shrinks rbtree.o even without -Os, so it's probably a 
speedup and doesn't generate worse code once GCC's alignment sillies 
are righted!

>  	*rb_link = node;
>  }
>  
> +static inline void rb_link_node_rcu(struct rb_node * node, struct rb_node * parent,
> +				    struct rb_node ** rb_link)

Minor stylistic nit, the standard pattern I suspect has spaces fewer 
by three:

static inline void rb_link_node_rcu(struct rb_node *node, struct rb_node *parent,
				    struct rb_node **rb_link)

> +/*
> + * Notes on lockless lookups:
> + *
> + * All stores to the tree structure (rb_left and rb_right) must be done using
> + * WRITE_ONCE(). And we must not inadvertently cause (temporary) loops in the
> + * tree structure as seen in program order.
> + *
> + * These two requirements will allow lockless iteration of the tree -- not
> + * correct iteration mind you, tree rotations are not atomic so a lookup might
> + * miss entire subtrees.
> + *
> + * But they do guarantee that any such traversal will only see valid elements
> + * and that it will indeed complete -- does not get stuck in a loop.
> + *
> + * It also guarantees that if the lookup returns an element it is the 'correct'
> + * one. But not returning an element does _NOT_ mean its not present.

s/its/it's

Thanks,

	Ingo

  reply	other threads:[~2015-04-13 15:50 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-04-13 14:11 [PATCH v5 00/10] latched RB-trees and __module_address() Peter Zijlstra
2015-04-13 14:11 ` [PATCH v5 01/10] module: Sanitize RCU usage and locking Peter Zijlstra
2015-04-13 15:32   ` Ingo Molnar
2015-04-13 15:40     ` Peter Zijlstra
2015-04-13 16:32       ` Ingo Molnar
2015-04-13 14:11 ` [PATCH v5 02/10] module: Annotate module version magic Peter Zijlstra
2015-04-13 14:11 ` [PATCH v5 03/10] module, jump_label: Fix module locking Peter Zijlstra
2015-04-13 14:11 ` [PATCH v5 04/10] rbtree: Make lockless searches non-fatal Peter Zijlstra
2015-04-13 15:50   ` Ingo Molnar [this message]
2015-04-13 19:10     ` Peter Zijlstra
2015-04-13 14:11 ` [PATCH v5 05/10] seqlock: Better document raw_write_seqcount_latch() Peter Zijlstra
2015-04-13 16:32   ` Ingo Molnar
2015-04-13 17:08     ` Mathieu Desnoyers
2015-04-13 17:43       ` Paul E. McKenney
2015-04-13 18:21         ` Linus Torvalds
2015-04-13 18:42           ` Paul E. McKenney
2015-04-14 10:25             ` Ingo Molnar
2015-04-14 13:04               ` Paul E. McKenney
2015-04-14 14:31                 ` Ingo Molnar
2015-04-14 15:11                   ` Paul E. McKenney
2015-04-13 19:17     ` Peter Zijlstra
2015-04-13 19:47       ` Peter Zijlstra
2015-04-14 10:26         ` Ingo Molnar
2015-04-13 14:11 ` [PATCH v5 06/10] rbtree: Implement generic latch_tree Peter Zijlstra
2015-04-13 16:43   ` Ingo Molnar
2015-04-13 19:20     ` Peter Zijlstra
2015-04-13 14:11 ` [PATCH v5 07/10] module: Optimize __module_address() using a latched RB-tree Peter Zijlstra
2015-04-13 16:49   ` Ingo Molnar
2015-04-14 12:31     ` Peter Zijlstra
2015-04-13 14:11 ` [PATCH v5 08/10] module: Make the mod_tree stuff conditional on PERF_EVENTS || TRACING Peter Zijlstra
2015-04-13 16:53   ` Ingo Molnar
2015-04-13 14:11 ` [PATCH v5 09/10] module: Use __module_address() for module_address_lookup() Peter Zijlstra
2015-04-13 14:11 ` [PATCH v5 10/10] module: Rework module_addr_{min,max} Peter Zijlstra
2015-04-13 16:56   ` Ingo Molnar
2015-04-14  2:55     ` Rusty Russell
2015-04-14  6:42       ` Peter Zijlstra
2015-04-14 12:56     ` Peter Zijlstra
2015-04-14 13:00       ` Ingo Molnar
2015-04-13 17:02 ` [PATCH v5 00/10] latched RB-trees and __module_address() Ingo Molnar
2015-04-14  2:57 ` Rusty Russell
2015-04-14  6:41   ` Peter Zijlstra
2015-04-15  4:41     ` Rusty Russell
2015-04-15  9:41       ` Peter Zijlstra
2015-05-28  2:07     ` Rusty Russell
2015-05-28 11:20       ` Peter Zijlstra
2015-05-28 23:49         ` Rusty Russell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150413155008.GB6040@gmail.com \
    --to=mingo@kernel.org \
    --cc=David.Woodhouse@intel.com \
    --cc=aarcange@redhat.com \
    --cc=andi@firstfloor.org \
    --cc=laijs@cn.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@horizon.com \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=oleg@redhat.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peterz@infradead.org \
    --cc=riel@redhat.com \
    --cc=rostedt@goodmis.org \
    --cc=rusty@rustcorp.com.au \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=walken@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.