From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
Boqun Feng <boqun.feng@gmail.com>,
Andy Lutomirski <luto@amacapital.net>,
Dave Watson <davejwatson@fb.com>,
linux-kernel <linux-kernel@vger.kernel.org>,
linux-api <linux-api@vger.kernel.org>,
Paul Turner <pjt@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
Russell King <linux@arm.linux.org.uk>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
Andrew Hunter <ahh@google.com>, Andi Kleen <andi@firstfloor.org>,
Chris Lameter <cl@linux.com>, Ben Maurer <bmaurer@fb.com>,
rostedt <rostedt@goodmis.org>,
Josh Triplett <josh@joshtriplett.org>,
Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will.deacon@arm.com>,
Michael Kerrisk <mtk.manpages@gmail.com>
Subject: Re: [RFC PATCH for 4.18 12/23] cpu_opv: Provide cpu_opv system call (v7)
Date: Fri, 13 Apr 2018 08:16:49 -0400 (EDT)
Message-ID: <625160026.9658.1523621809662.JavaMail.zimbra@efficios.com>
In-Reply-To: <CA+55aFwFUeNA6_0a1MOU-7VGzCw5T99cfVCPiyTpZQJdd6z=bw@mail.gmail.com>
----- On Apr 12, 2018, at 4:07 PM, Linus Torvalds torvalds@linux-foundation.org wrote:
> On Thu, Apr 12, 2018 at 12:59 PM, Mathieu Desnoyers
> <mathieu.desnoyers@efficios.com> wrote:
>>
>> What are your concerns about page pinning ?
>
> Pretty much everything.
>
> It's the most complex part by far, and the vmalloc space is a limited
> resource on 32-bit architectures.
The vmalloc space needed by cpu_opv is bounded by the number of pages
a cpu_opv call can touch. On architectures with virtually aliased
dcache, we also need to add a few extra pages worth of address space
to account for SHMLBA alignment.
So on ARM32, with SHMLBA=4 pages, this means at most 1 MB of virtual
address space is temporarily needed for a cpu_opv system call in the very
worst-case scenario:

  16 ops * 2 uaddr * 8 pages per uaddr * 4096 bytes per page

(8 pages per uaddr if we're unlucky and find ourselves aligned across
two SHMLBA.)
If this amount of vmalloc space happens to be our limiting factor, we can
change the max cpu_opv ops array size supported, e.g. bringing it from 16 down
to 4. The largest number of operations I currently need in the cpu-opv library
is 4. With 4 ops, the worst-case vmalloc space used by a cpu_opv system call
becomes 256 kB.
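
The arithmetic behind both figures can be checked with a small sketch.
All identifiers here (the EX_* constants and the helper function) are
made up for illustration and are not taken from the patch; the numbers
are the ones quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative names only; none of these come from the cpu_opv patch. */
#define EX_MAX_OPS          16u    /* current max size of the ops array */
#define EX_UADDR_PER_OP     2u     /* user addresses touched per op */
#define EX_PAGES_PER_UADDR  8u     /* worst case across two SHMLBA (4 pages each) */
#define EX_PAGE_SIZE        4096u  /* bytes per page */

/* Worst-case bytes of vmalloc address space for one call with nr_ops ops. */
static size_t ex_worst_case_vmalloc_bytes(size_t nr_ops)
{
	assert(nr_ops <= EX_MAX_OPS);
	return nr_ops * EX_UADDR_PER_OP * EX_PAGES_PER_UADDR * EX_PAGE_SIZE;
}
```

Calling ex_worst_case_vmalloc_bytes(16) yields 1048576 bytes (1 MB), and
ex_worst_case_vmalloc_bytes(4) yields 262144 bytes (256 kB).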
>
>> Do you have an alternative approach in mind ?
>
> Do everything in user space.
I wish we could disable preemption and cpu hotplug in user-space.
Unfortunately, that does not seem to be a viable solution for many
technical reasons, starting with page fault handling.
>
> And even if you absolutely want cpu_opv at all, why not do it in the
> user space *mapping* without the aliasing into kernel space?
That's because cpu_opv needs to execute the entire array of operations
with preemption disabled, and we cannot take a page fault with preemption
off.
Page pinning and aliasing user-space pages in the kernel linear mapping
ensure that we don't end up in trouble in page fault scenarios, such as
having the pages we need to touch swapped out under our feet.
>
> The cpu_opv approach isn't even fast. It's *really* slow if it has to
> do VM crap.
>
> The whole rseq thing was billed as "faster than atomics". I
> *guarantee* that the cpu_opv's aren't faster than atomics.
Yes, and here is the good news: cpu_opv speed does not even matter. rseq
assembler instruction sequences are very fast, but cannot deal with
infrequent corner-cases.
cpu_opv is slow, but is guaranteed to deal with the occasional corner-case
situations.
This is similar to pthread mutex/futex fast/slow paths. The common case is fast
(rseq), and the speed of the infrequent case (cpu_opv) does not matter as long
as it's used infrequently enough, which is the case here.
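
The fast/slow split described above can be sketched in plain C. This is
only a control-flow illustration under stated assumptions: the ex_*
functions are hypothetical stand-ins, not the rseq or cpu_opv
interfaces, and the "abort" of the fast path is simulated by a flag
rather than by real preemption or migration.

```c
#include <stdbool.h>

/* Hypothetical bookkeeping, used only to show which path was taken. */
struct ex_stats {
	unsigned long fast_hits;  /* completed on the fast path */
	unsigned long slow_hits;  /* fell back to the slow path */
};

/*
 * Stand-in for an rseq-style fast path: returns false when the sequence
 * could not complete. The abort is simulated by the caller's flag.
 */
static bool ex_fast_path_add(long *counter, long v, bool simulate_abort)
{
	if (simulate_abort)
		return false;
	*counter += v;
	return true;
}

/* Stand-in for the slow path: costlier in reality, but always completes. */
static void ex_slow_path_add(long *counter, long v)
{
	*counter += v;
}

/* Try the fast path first; use the slow path only when it fails. */
static void ex_counter_add(long *counter, long v, bool simulate_abort,
			   struct ex_stats *st)
{
	if (ex_fast_path_add(counter, v, simulate_abort)) {
		st->fast_hits++;
		return;
	}
	ex_slow_path_add(counter, v);
	st->slow_hits++;
}
```

As long as slow_hits stays small relative to fast_hits, the cost of the
slow path has little effect on overall throughput.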
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com