linux-api.vger.kernel.org archive mirror
From: Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
To: Steven Rostedt <rostedt-nx8X9YLhiw1AfugRpC6u6w@public.gmane.org>
Cc: Thomas Gleixner <tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>,
	Darren Hart <dvhart-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>,
	Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>,
	Andi Kleen <andi-Vw/NltI1exuRpAAqCnN02g@public.gmane.org>,
	Waiman Long <Waiman.Long-VXdhtT5mjnY@public.gmane.org>,
	Ingo Molnar <mingo-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	Davidlohr Bueso <davidlohr-VXdhtT5mjnY@public.gmane.org>,
	Heiko Carstens
	<heiko.carstens-tA70FqPdS9bQT0dZR+AlfA@public.gmane.org>,
	"linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Linux API <linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	"linux-doc-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-doc-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Jason Low <jason.low2-VXdhtT5mjnY@public.gmane.org>,
	Scott J Norton <scott.norton-VXdhtT5mjnY@public.gmane.org>,
	Robert Haas <robertmhaas-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Subject: Re: [RFC PATCH 0/5] futex: introduce an optimistic spinning futex
Date: Tue, 22 Jul 2014 09:47:19 +0200
Message-ID: <20140722074719.GV3935@laptop>
In-Reply-To: <20140721213457.46623e2f-f9ZlEuEWxVcJvu8Pb33WZ0EMvNT87kid@public.gmane.org>

On Mon, Jul 21, 2014 at 09:34:57PM -0400, Steven Rostedt wrote:

> I just want to point out that I was having a very nice conversation
> with Robert Haas (Cc'd) in Napa Valley at Linux Collaboration about
> this very topic. Robert is a PostgreSQL developer who told me that they
> implement their spin locks completely in userspace (no futex, just raw
> spinning on shared memory). This is because the sleep on contention of a
> futex has been shown to be very expensive in their benchmarks. His work is
> not a micro benchmark but for a very popular database where locking is
> crucial.

Userspace spinlocks are a clusterfuck. It's impossible to solve the
priority inversion trainwrecks they cause _ever_.

We've had -- as I think Mike already pointed out -- tons of 'fun' with
psql exactly because it's doing this :-(
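
For reference, a minimal sketch of the kind of "raw spinning on shared
memory" lock under discussion, assuming C11 atomics. The type and
function names are hypothetical; this is not PostgreSQL's actual
implementation. The point to notice is that the waiter has no way to
tell whether the owner is running or has been preempted.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical type for illustration; not PostgreSQL's or the kernel's. */
typedef struct {
	atomic_int locked;	/* 0 = free, 1 = held */
} raw_spinlock_sketch;

/* Try once to take the lock; returns true on success. */
static bool sketch_trylock(raw_spinlock_sketch *l)
{
	int expected = 0;

	return atomic_compare_exchange_strong_explicit(&l->locked, &expected, 1,
			memory_order_acquire, memory_order_relaxed);
}

/*
 * Spin until the lock is taken. Nothing here can tell whether the owner
 * is on a CPU or has been preempted, so a waiter may burn its whole
 * timeslice while the owner is not running.
 */
static void sketch_lock(raw_spinlock_sketch *l)
{
	while (!sketch_trylock(l)) {
		/* Read-only wait so we do not hammer the cache line. */
		while (atomic_load_explicit(&l->locked, memory_order_relaxed))
			;
	}
}

/* Release the lock so a spinning waiter can take it. */
static void sketch_unlock(raw_spinlock_sketch *l)
{
	atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

If the waiter has a higher priority than a preempted owner and both are
bound to the same CPU, the inner loop above can keep the owner from ever
running again, which is the inversion case the kernel cannot see or fix.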

> I was telling Robert that if futexes get optimistic spinning, he should
> reconsider their use of userspace spinlocks in favor of this, because
> I'm pretty sure that they will see a great improvement.
> 
> Now Robert will be the best one to answer if the system call is indeed
> more expensive than doing full spins in userspace. If the spin is done
> in the kernel and they still get better performance by just spinning
> blindly in userspace even if the owner is asleep, I think we will have
> our answer.

No, the best way is to measure the exact syscall cost. If he still gets
better performance we need to analyze why; there might be something else
hiding there.
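
One rough way to get a number for the bare syscall cost is to time a
futex call that has nothing to do. The sketch below assumes Linux with
glibc and uses FUTEX_WAKE on a word nobody waits on; the helper name is
hypothetical, and the result covers only kernel entry/exit plus the
futex lookup, not contention or sleeping.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: average wall-clock cost, in nanoseconds, of one
 * futex(2) call that has no work to do (FUTEX_WAKE with no waiters).
 */
static double futex_wake_cost_ns(long iterations)
{
	static uint32_t word;	/* nobody ever waits on this word */
	struct timespec start, end;
	double elapsed_ns;

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (long i = 0; i < iterations; i++)
		syscall(SYS_futex, &word, FUTEX_WAKE_PRIVATE, 1, NULL, NULL, 0);
	clock_gettime(CLOCK_MONOTONIC, &end);

	elapsed_ns = (end.tv_sec - start.tv_sec) * 1e9 +
		     (end.tv_nsec - start.tv_nsec);
	return elapsed_ns / iterations;
}
```

Calling futex_wake_cost_ns(1000000) and comparing the result against
the time spent per acquisition in a pure userspace spin would show
whether the syscall itself is the expensive part.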

> Note, I believe they only care about shared threads, and this
> optimistic spinning does not need to be something done between
> processes.

There's no reason not to provide it for shared futexes; in fact, I
suspect not doing it for shared futexes is going to make the code
uglier.


Anyway, there is one big fail in the entire futex stack that we 'need'
to sort some day and that is NUMA. Some people (again database people)
explicitly do not use futexes and instead use sysvsem because of this.

The problem with NUMA futexes is that because they're vaddr-based there
is no (persistent) node information. You always end up having to fall
back to looking in all nodes before you can guarantee there is no
matching futex.

One way to fix this is to extend the futex value to include a node
number, but that's obviously a complete ABI break. Then again, it should
be pretty straightforward, since the node number doesn't need to be part
of the actual atomic update, just part of the userspace storage.
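
To make the storage idea concrete, here is a sketch of what such a
userspace object could look like, assuming C11 atomics. The struct
layout and names are hypothetical and not an existing ABI; how the
kernel would read the node field is left out entirely.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical layout for illustration; not an existing kernel ABI. */
struct numa_futex_sketch {
	_Atomic uint32_t val;	/* the word userspace does atomics on */
	uint32_t node;		/* NUMA node hint, written once at init */
};

/* Set up an unlocked futex that records which node it belongs to. */
static void numa_futex_init(struct numa_futex_sketch *f, uint32_t node)
{
	atomic_init(&f->val, 0);
	f->node = node;
}

/* Lock attempt touches only val; node is never part of the atomic update. */
static int numa_futex_trylock(struct numa_futex_sketch *f)
{
	uint32_t expected = 0;

	return atomic_compare_exchange_strong(&f->val, &expected, 1);
}
```

With the node stored next to the value, a lookup could go straight to
that node's hash instead of searching every node.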
