public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@kernel.org>
To: Michel Lespinasse <walken@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	rusty@rustcorp.com.au, mathieu.desnoyers@efficios.com,
	oleg@redhat.com, paulmck@linux.vnet.ibm.com,
	linux-kernel@vger.kernel.org, andi@firstfloor.org,
	rostedt@goodmis.org, tglx@linutronix.de,
	Andrea Arcangeli <aarcange@redhat.com>,
	David Woodhouse <David.Woodhouse@intel.com>,
	Rik van Riel <riel@redhat.com>,
	Josh Triplett <josh@joshtriplett.org>
Subject: Re: [RFC][PATCH 5/9] rbtree: Make lockless searches non-fatal
Date: Mon, 2 Mar 2015 08:46:12 +0100	[thread overview]
Message-ID: <20150302074612.GA3609@gmail.com> (raw)
In-Reply-To: <CANN689G1c42yz9qPM-bdbR9eAt=SSSVzD0ZjLyzJ3Niy-DBbHA@mail.gmail.com>


* Michel Lespinasse <walken@google.com> wrote:

> On Sat, Feb 28, 2015 at 1:24 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> > Change the insert and erase code such that lockless searches are
> > non-fatal.
> >
> > In and of itself an rbtree cannot be correctly searched while
> > in-modification, we can however provide weaker guarantees that will
> > allow the rbtree to be used in conjunction with other techniques, such
> > as latches; see 9b0fd802e8c0 ("seqcount: Add raw_write_seqcount_latch()").
> >
> > For this to work we need the following guarantees from the rbtree
> > code:
> >
> >  1) a lockless reader must not see partial stores, this would allow it
> >     to observe nodes that are invalid memory.
> >
> >  2) there must not be (temporary) loops in the tree structure in the
> >     modifier's program order, this would cause a lookup which
> >     interrupts the modifier to get stuck indefinitely.
> >
> > For 1) we must use WRITE_ONCE() for all updates to the tree structure;
> > in particular this patch only does rb_{left,right} as those are the
> > only element required for simple searches.
> >
> > It generates slightly worse code, probably because gcc stinks at
> > volatile. But in pointer chasing heavy code a few instructions more
> > should not matter.
> 
> So, I was worried that this would penalize all rbtree users, for the 
> benefit of just the one you're adding later in this series. We have 
> several rbtrees where we care about performance a lot, such as the 
> ones used in the scheduler or for indexing vmas.
> 
> That said, I checked with the compiler we are using here (gcc 4.7 
> variant) and I didn't see any change in the generated code. So, no 
> issue here for me.

Yeah, I had that worry too. With gcc 4.9.1 on x86-64 defconfig I get 
this comparison:

lib/rbtree.o:

   text    data     bss     dec     hex filename
   3299       0       0    3299     ce3 rbtree.o.before
   3308       0       0    3308     cec rbtree.o.after

+9 bytes.

Most of the difference seems minimal, involves an extra register move 
saving addresses in registers and using them there:

  449:  4c 8b 60 10             mov    0x10(%rax),%r12
- 44d:  49 39 fc                cmp    %rdi,%r12
- 450:  0f 84 aa 00 00 00       je     500 <__rb_insert_augmented+0x1a0>
- 456:  49 89 c6                mov    %rax,%r14
- 459:  4d 85 e4                test   %r12,%r12
- 45c:  4c 89 63 08             mov    %r12,0x8(%rbx)
- 460:  48 89 58 10             mov    %rbx,0x10(%rax)
- 464:  74 0b                   je     471 <__rb_insert_augmented+0x111>


  449:  4c 8b 60 10             mov    0x10(%rax),%r12
+ 44d:  48 89 c6                mov    %rax,%rsi
+ 450:  49 39 fc                cmp    %rdi,%r12
+ 453:  0f 84 97 00 00 00       je     4f0 <__rb_insert_augmented+0x190>
+ 459:  49 89 c6                mov    %rax,%r14
+ 45c:  4d 85 e4                test   %r12,%r12
+ 45f:  4c 89 63 08             mov    %r12,0x8(%rbx)
+ 463:  48 89 5e 10             mov    %rbx,0x10(%rsi)
+ 467:  74 0b                   je     474 <__rb_insert_augmented+0x114>

So here instead of using 0x10(%rax) again, GCC moved %rax into %rsi 
and used 0x10(%rsi). That seems to be plain compiler stupidity (that 
move to %rsi is senseless with or without volatile), and gcc might 
improve in the future.

In my (admittedly quick) comparison I saw no serious changes like 
extra memory loads or stack spills.

Thanks,

	Ingo

  reply	other threads:[~2015-03-02  7:46 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-02-28 21:24 [RFC][PATCH 0/9] latched RB-trees and __module_address() Peter Zijlstra
2015-02-28 21:24 ` [RFC][PATCH 1/9] klp: Fix obvious RCU fail Peter Zijlstra
2015-03-01 20:09   ` Jiri Kosina
2015-03-02  8:35     ` Peter Zijlstra
2015-03-02  9:13       ` Jiri Kosina
2015-03-02 10:00         ` Peter Zijlstra
2015-03-02  9:21       ` Petr Mladek
2015-03-02  1:31   ` Masami Hiramatsu
2015-03-02 19:21   ` Paul E. McKenney
2015-03-02 21:07   ` Josh Poimboeuf
2015-02-28 21:24 ` [RFC][PATCH 2/9] module: Sanitize RCU usage and locking Peter Zijlstra
2015-03-02 11:16   ` Rusty Russell
2015-03-02 12:37     ` Peter Zijlstra
2015-03-02 19:37   ` Paul E. McKenney
2015-03-17 17:13     ` Peter Zijlstra
2015-02-28 21:24 ` [RFC][PATCH 3/9] module: Annotate module version magic Peter Zijlstra
2015-03-02 19:38   ` Paul E. McKenney
2015-02-28 21:24 ` [RFC][PATCH 4/9] module, jump_label: Fix module locking Peter Zijlstra
2015-03-02 19:39   ` Paul E. McKenney
2015-02-28 21:24 ` [RFC][PATCH 5/9] rbtree: Make lockless searches non-fatal Peter Zijlstra
2015-03-01 13:52   ` Mathieu Desnoyers
2015-03-02  8:27     ` Peter Zijlstra
2015-03-01 21:11   ` Michel Lespinasse
2015-03-02  7:46     ` Ingo Molnar [this message]
2015-03-02  8:23     ` Peter Zijlstra
2015-03-02  9:53       ` Peter Zijlstra
2015-02-28 21:24 ` [RFC][PATCH 6/9] seqlock: Better document raw_write_seqcount_latch() Peter Zijlstra
2015-03-01 14:02   ` Mathieu Desnoyers
2015-03-02  8:33     ` Peter Zijlstra
2015-03-02  8:51       ` Peter Zijlstra
2015-03-02 19:46         ` Paul E. McKenney
2015-03-01 21:12   ` Michel Lespinasse
2015-02-28 21:24 ` [RFC][PATCH 7/9] rbtree: Implement generic latch_tree Peter Zijlstra
2015-03-01 21:17   ` Michel Lespinasse
2015-03-02  8:05     ` Peter Zijlstra
2015-03-02 19:53   ` Paul E. McKenney
2015-03-17 17:24     ` Peter Zijlstra
2015-02-28 21:24 ` [RFC][PATCH 8/9] module: Optimize __module_address() using a latched RB-tree Peter Zijlstra
2015-02-28 21:24 ` [RFC][PATCH 9/9] module: Use __module_address() for module_address_lookup() Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150302074612.GA3609@gmail.com \
    --to=mingo@kernel.org \
    --cc=David.Woodhouse@intel.com \
    --cc=aarcange@redhat.com \
    --cc=andi@firstfloor.org \
    --cc=josh@joshtriplett.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=oleg@redhat.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peterz@infradead.org \
    --cc=riel@redhat.com \
    --cc=rostedt@goodmis.org \
    --cc=rusty@rustcorp.com.au \
    --cc=tglx@linutronix.de \
    --cc=walken@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox