From: Andrew Hunter <ahh-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
To: Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>
Cc: Michael Kerrisk
<mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
Mathieu Desnoyers
<mathieu.desnoyers-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>,
Paul Turner <pjt-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
Ben Maurer <bmaurer-b10kYP2dOMg@public.gmane.org>,
Linux Kernel
<linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>,
Ingo Molnar <mingo-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
Steven Rostedt <rostedt-nx8X9YLhiw1AfugRpC6u6w@public.gmane.org>,
"Paul E. McKenney"
<paulmck-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>,
Josh Triplett <josh-iaAMLnmF4UmaiuxdJuQwMA@public.gmane.org>,
Lai Jiangshan <laijs-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>,
Linus Torvalds
<torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
Andrew Morton
<akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
Linux API <linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: [RFC PATCH] percpu system call: fast userspace percpu critical sections
Date: Fri, 22 May 2015 15:06:47 -0700 [thread overview]
Message-ID: <CADroS=4KKb4SKWNToqsqYm5qPq10OPy00yOTBkzaZO65kib1-w@mail.gmail.com> (raw)
In-Reply-To: <CALCETrUSBqHG3tbOq1yFz33v1_ckEgLNorgAxwLFi7MkjNcwLA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
On Fri, May 22, 2015 at 1:53 PM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote:
> Create an array of user-managed locks, one per cpu. Call them lock[i]
> for 0 <= i < ncpus.
>
> To acquire, look up your CPU number. Then, atomically, check that
> lock[cpu] isn't held and, if so, mark it held and record both your tid
> and your lock acquisition count. If you learn that the lock *was*
> held after all, signal the holder (with kill or your favorite other
> mechanism), telling it which lock acquisition count is being aborted.
> Then atomically steal the lock, but only if the lock acquisition count
> hasn't changed.
>
We had to deploy the userspace percpu API (percpu sharded locks,
{double,}compare-and-swap, atomic-increment, etc) universally to the
fleet without waiting for 100% kernel penetration, not to mention
wanting to disable the kernel acceleration in case of kernel bugs.
(Since this is mostly used in core infrastructure--malloc, various
statistics platforms, etc--in userspace, checking for availability
isn't feasible. The primitives have to work 100% of the time or it
would be too complex for our developers to bother using them.)
So we did basically this (without the lock stealing...): we have a
single per-cpu spin lock manipulated with atomics, which we take very
briefly to implement (e.g.) compare-and-swap. The performance is
hugely worse; typical overheads are in the 10x range _without_ any
on-cpu contention. Uncontended atomics are much cheaper than they were
on pre-Nehalem chips, but they still can't hold a candle to
unsynchronized instructions.
As a fallback path for userspace, this is fine--if 5% of binaries on
busted kernels aren't quite as fast, we can work with that in exchange
for being able to write a percpu op without worrying about what to do
on -ENOSYS. But it's just not fast enough to compete as the intended
way to do things.
AHH
prev parent reply other threads:[~2015-05-22 22:06 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <1432219487-13364-1-git-send-email-mathieu.desnoyers@efficios.com>
[not found] ` <1432219487-13364-1-git-send-email-mathieu.desnoyers-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>
2015-05-22 20:26 ` [RFC PATCH] percpu system call: fast userspace percpu critical sections Michael Kerrisk
[not found] ` <CAHO5Pa0Kok4_QN0v3JNWyzGT=GbZNZcRyLhu02R2npV9hSdt7g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-05-22 20:53 ` Andy Lutomirski
2015-05-22 21:34 ` Mathieu Desnoyers
2015-05-22 22:24 ` Andy Lutomirski
[not found] ` <CALCETrUxp-dP-kaTy4prEdciM-=sTXjpqnMbkvk38g5BTEvX0g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-05-23 17:09 ` Mathieu Desnoyers
2015-05-23 19:15 ` Andy Lutomirski
[not found] ` <CALCETrWzoFX7hXqvQqDEq=r=7PNaGKVjZeHEBWxPvC28Zi1AKA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-05-25 18:30 ` Mathieu Desnoyers
[not found] ` <1184354091.7499.1432578613872.JavaMail.zimbra-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>
2015-05-25 18:54 ` Andy Lutomirski
[not found] ` <CALCETrW3_Hv0jc3cpiwsHTinBqJzvab_EiPS8BVJhX-xe5D8qw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-05-26 19:57 ` Andy Lutomirski
[not found] ` <CALCETrXzmO=fQC=UdCh5b0zWiGWAJScEtdT4QDJkoqLgtgEVig-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-05-26 21:04 ` Mathieu Desnoyers
[not found] ` <821493560.8531.1432674243321.JavaMail.zimbra-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>
2015-05-26 21:18 ` Andy Lutomirski
2015-05-26 21:44 ` Andy Lutomirski
2015-05-26 20:38 ` Mathieu Desnoyers
[not found] ` <933886515.8478.1432672739485.JavaMail.zimbra-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>
2015-05-26 20:58 ` Andy Lutomirski
2015-05-26 21:20 ` Andi Kleen
[not found] ` <20150526212041.GQ19417-1g7Xle2YJi4/4alezvVtWx2eb7JE58TQ@public.gmane.org>
2015-05-26 21:26 ` Andy Lutomirski
[not found] ` <CALCETrUSBqHG3tbOq1yFz33v1_ckEgLNorgAxwLFi7MkjNcwLA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-05-22 22:06 ` Andrew Hunter [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CADroS=4KKb4SKWNToqsqYm5qPq10OPy00yOTBkzaZO65kib1-w@mail.gmail.com' \
--to=ahh-hpiqsd4aklfqt0dzr+alfa@public.gmane.org \
--cc=akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org \
--cc=bmaurer-b10kYP2dOMg@public.gmane.org \
--cc=josh-iaAMLnmF4UmaiuxdJuQwMA@public.gmane.org \
--cc=laijs-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org \
--cc=linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org \
--cc=mathieu.desnoyers-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org \
--cc=mingo-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
--cc=mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org \
--cc=paulmck-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org \
--cc=peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org \
--cc=pjt-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org \
--cc=rostedt-nx8X9YLhiw1AfugRpC6u6w@public.gmane.org \
--cc=torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).