From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932766AbbDMPuT (ORCPT ); Mon, 13 Apr 2015 11:50:19 -0400 Received: from mail-wg0-f48.google.com ([74.125.82.48]:35472 "EHLO mail-wg0-f48.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932119AbbDMPuO (ORCPT ); Mon, 13 Apr 2015 11:50:14 -0400 Date: Mon, 13 Apr 2015 17:50:08 +0200 From: Ingo Molnar To: Peter Zijlstra Cc: rusty@rustcorp.com.au, mathieu.desnoyers@efficios.com, oleg@redhat.com, paulmck@linux.vnet.ibm.com, torvalds@linux-foundation.org, linux-kernel@vger.kernel.org, andi@firstfloor.org, rostedt@goodmis.org, tglx@linutronix.de, laijs@cn.fujitsu.com, linux@horizon.com, David Woodhouse , Rik van Riel , Andrea Arcangeli , Michel Lespinasse Subject: Re: [PATCH v5 04/10] rbtree: Make lockless searches non-fatal Message-ID: <20150413155008.GB6040@gmail.com> References: <20150413141126.756350256@infradead.org> <20150413141213.432590551@infradead.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150413141213.432590551@infradead.org> User-Agent: Mutt/1.5.23 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org * Peter Zijlstra wrote: > Change the insert and erase code such that lockless searches are > non-fatal. > > In and of itself an rbtree cannot be correctly searched while > in-modification, we can however provide weaker guarantees that will > allow the rbtree to be used in conjunction with other techniques, such > as latches; see 9b0fd802e8c0 ("seqcount: Add raw_write_seqcount_latch()"). > > For this to work we need the following guarantees from the rbtree > code: > > 1) a lockless reader must not see partial stores, this would allow it > to observe nodes that are invalid memory. > > 2) there must not be (temporary) loops in the tree structure in the > modifier's program order, this would cause a lookup which > interrupts the modifier to get stuck indefinitely. > > For 1) we must use WRITE_ONCE() for all updates to the tree structure; > in particular this patch only does rb_{left,right} as those are the > only element required for simple searches. > > It generates slightly worse code, probably because volatile. But in > pointer chasing heavy code a few instructions more should not matter. So I had a look at code generation on x86/64-defconfig, it adds 2 more instructions, out of 900+ instructions total: text data bss dec hex filename 3299 0 0 3299 ce3 rbtree.o.before 3308 0 0 3308 cec rbtree.o.after One of the instructions is a MOV, the other AFAICS is a NOP due to changed jump target alignment. Interestingly, when compiled with -Os then your patch actually _shrinks_ the code: text data bss dec hex filename 2524 0 0 2524 9dc rbtree.o.before 2440 0 0 2440 988 rbtree.o.after and rather significantly so. This is with GCC 4.9. Possibly your patch unconfused GCC somehow. So just for kicks I applied my patch-set that fixes up jump target alignments, and the numbers with the regular -O2 became: text data bss dec hex filename 2995 0 0 2995 bb3 rbtree.o.before 2981 0 0 2981 ba5 rbtree.o.after so your patch shrinks rbtree.o even without -Os, so it's probably a speedup and doesn't generate worse code once GCC's alignment sillies are righted! > *rb_link = node; > } > > +static inline void rb_link_node_rcu(struct rb_node * node, struct rb_node * parent, > + struct rb_node ** rb_link) Minor stylistic nit, the standard pattern I suspect has spaces fewer by three: static inline void rb_link_node_rcu(struct rb_node *node, struct rb_node *parent, struct rb_node **rb_link) > +/* > + * Notes on lockless lookups: > + * > + * All stores to the tree structure (rb_left and rb_right) must be done using > + * WRITE_ONCE(). And we must not inadvertently cause (temporary) loops in the > + * tree structure as seen in program order. > + * > + * These two requirements will allow lockless iteration of the tree -- not > + * correct iteration mind you, tree rotations are not atomic so a lookup might > + * miss entire subtrees. > + * > + * But they do guarantee that any such traversal will only see valid elements > + * and that it will indeed complete -- does not get stuck in a loop. > + * > + * It also guarantees that if the lookup returns an element it is the 'correct' > + * one. But not returning an element does _NOT_ mean its not present. s/its/it's Thanks, Ingo