From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Cc: Peter Zijlstra <peterz@infradead.org>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
Boqun Feng <boqun.feng@gmail.com>,
Andy Lutomirski <luto@amacapital.net>,
Dave Watson <davejwatson@fb.com>,
linux-kernel <linux-kernel@vger.kernel.org>,
linux-api <linux-api@vger.kernel.org>,
Paul Turner <pjt@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
Russell King <linux@arm.linux.org.uk>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
Andrew Hunter <ahh@google.com>, Andi Kleen <andi@firstfloor.org>,
Chris Lameter <cl@linux.com>, Ben Maurer <bmaurer@fb.com>,
rostedt <rostedt@goodmis.org>,
Josh Triplett <josh@joshtriplett.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
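Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will.deacon@arm.com>,
Michael Kerrisk <mtk.manpages@gmail.com>,
Alexander Viro <viro@zeniv.linux.org.uk>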
Subject: Re: [RFC PATCH for 4.17 02/21] rseq: Introduce restartable sequences system call (v12)
Date: Tue, 3 Apr 2018 12:36:27 -0400 (EDT)
Message-ID: <17439540.2334.1522773387555.JavaMail.zimbra@efficios.com>
In-Reply-To: <1890356924.1736.1522683188833.JavaMail.zimbra@efficios.com>
----- On Apr 2, 2018, at 11:33 AM, Mathieu Desnoyers mathieu.desnoyers@efficios.com wrote:
> ----- On Apr 1, 2018, at 12:13 PM, One Thousand Gnomes
> gnomes@lxorguk.ukuu.org.uk wrote:
>
>> On Tue, 27 Mar 2018 12:05:23 -0400
>> Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:
>>
>>> Expose a new system call allowing each thread to register one userspace
>>> memory area to be used as an ABI between kernel and user-space for two
>>> purposes: user-space restartable sequences and quick access to read the
>>> current CPU number value from user-space.
>>
>> What is the *worst* case timing achievable by using the atomics ? What
>> does it do to real time performance requirements ?
>
> Given that there are two system calls introduced in this series (rseq and
> cpu_opv), can you clarify which system call you refer to in the two questions
> above ?
>
> For rseq, given that its userspace works pretty much like a read seqlock
> (it retries on failure), it has no impact whatsoever on scheduler behavior.
> So characterizing its worst case timing does not appear to be relevant.
>
>> For cpu_opv you now
>> give an answer but your answer is assuming there isn't another thread
>> actively thrashing the cache or store buffers, and that the user didn't
>> sneakily pass in a page of uncacheable memory (eg framebuffer, or GPU
>> space).
>
> Are those considered as device pages ?
>
>>
>> I don't see anything that restricts it to cached pages. With that check
>> in place for x86 at least it would probably be ok and I think the sneaky
>> attacks to make it uncacheable would fail because you've got the pages
>> locked so trying to give them to an accelerator will block until you are
>> done.
>>
>> I still like the idea it's just the latencies concern me.
>
> Indeed, cpu_opv touches pages that are shared with user-space with
> preemption off, so this one affects the scheduler latency. The worst-case
> timings I measured for cpu_opv were with cache-cold memory. So I expect that
> another thread actively thrashing the cache would be in the same ballpark
> figure. It does not account for a concurrent thread thrashing the store
> buffers though.
>
> The checks enforcing which pages can be touched by cpu_opv operations are
> done within cpu_op_check_page(). is_zone_device_page() is used to ensure no
> device page is touched with preempt disabled. I understand that you would
> prefer to disallow pages of uncacheable memory as well, which I'm fine with.
> Is there an API similar to is_zone_device_page() to check whether a page is
> uncacheable ?

Looking into this a bit more, I notice the following: The pgprot_noncached
(_PAGE_NOCACHE on x86) pgprot is part of the vma->vm_page_prot. Therefore,
in order to have userspace provide pointers to noncached pages as input
to cpu_opv, they need to be part of a userspace vma which has a
pgprot_noncached vm_page_prot.

The cpu_opv system call uses get_user_pages_fast() to grab the struct page
from the userspace addresses, and then passes those pages to vm_map_ram(),
with a PAGE_KERNEL pgprot. This creates a temporary kernel mapping to those
pages, which is then used to read/write from/to those pages with preemption
disabled.

Therefore, with the proposed cpu_opv implementation, the kernel is not
touching noncached mappings with preemption disabled, which should take
care of your latency concern.
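
To illustrate the aliasing idea in userspace terms, here is a minimal
sketch. It is an analogy only, not the cpu_opv code: it maps the same
backing page at two distinct virtual addresses and shows that a store
through one mapping is visible through the other. Userspace cannot choose
cacheability here, so the sketch only demonstrates that attributes attach
to a mapping while the data lives in the shared page. The names
alias_demo(), user_view and alias_view are hypothetical, chosen for this
example.

```c
#define _GNU_SOURCE
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Userspace analogy (assumption: Linux with a writable /tmp).
 * Returns 0 when two distinct mappings alias the same page,
 * 1 when they do not, and -1 on setup failure.
 */
int alias_demo(void)
{
	static const char msg[] = "written via alias";
	char path[] = "/tmp/alias-demo-XXXXXX";
	long page_size = sysconf(_SC_PAGESIZE);
	char *user_view, *alias_view;
	int fd, ok;

	if (page_size <= 0)
		return -1;
	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);	/* the file lives on until the mappings go away */
	if (ftruncate(fd, page_size) < 0) {
		close(fd);
		return -1;
	}

	/* First mapping: stands in for the userspace vma. */
	user_view = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
			 MAP_SHARED, fd, 0);
	/* Second mapping: stands in for the temporary kernel mapping. */
	alias_view = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
			  MAP_SHARED, fd, 0);
	close(fd);
	if (user_view == MAP_FAILED || alias_view == MAP_FAILED) {
		if (user_view != MAP_FAILED)
			munmap(user_view, page_size);
		if (alias_view != MAP_FAILED)
			munmap(alias_view, page_size);
		return -1;
	}

	/* Store through the alias, then read back through the other view. */
	memcpy(alias_view, msg, sizeof(msg));
	ok = user_view != alias_view &&
	     memcmp(user_view, msg, sizeof(msg)) == 0;

	munmap(user_view, page_size);
	munmap(alias_view, page_size);
	return ok ? 0 : 1;
}
```

Calling alias_demo() from a test program's main() is enough to try it. In
the cpu_opv case the second mapping would be the one created by
vm_map_ram() with PAGE_KERNEL, independent of the vm_page_prot of the
userspace vma.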

Am I missing something?

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com