From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
Boqun Feng <boqun.feng@gmail.com>,
Andy Lutomirski <luto@amacapital.net>,
Dave Watson <davejwatson@fb.com>,
linux-kernel <linux-kernel@vger.kernel.org>,
linux-api <linux-api@vger.kernel.org>,
Paul Turner <pjt@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
Russell King <linux@arm.linux.org.uk>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
Andrew Hunter <ahh@google.com>, Andi Kleen <andi@firstfloor.org>,
Chris Lameter <cl@linux.com>, Ben Maurer <bmaurer@fb.com>,
rostedt <rostedt@goodmis.org>,
Josh Triplett <josh@joshtriplett.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will.deacon@arm.com>,
Michael Kerrisk <mtk.manpages@gmail.com>
Subject: Re: [RFC PATCH for 4.17 10/21] cpu_opv: Provide cpu_opv system call (v6)
Date: Wed, 28 Mar 2018 13:54:41 -0400 (EDT)
Message-ID: <1109208604.169.1522259681295.JavaMail.zimbra@efficios.com>
In-Reply-To: <20180328152203.GW4043@hirez.programming.kicks-ass.net>
----- On Mar 28, 2018, at 11:22 AM, Peter Zijlstra peterz@infradead.org wrote:
> On Tue, Mar 27, 2018 at 12:05:31PM -0400, Mathieu Desnoyers wrote:
>
>> 1) Allow algorithms to perform per-cpu data migration without relying on
>> sched_setaffinity()
>>
>> The use-cases are migrating memory between per-cpu memory free-lists, or
>> stealing tasks from other per-cpu work queues: each requires accesses
>> to remote per-cpu data structures.
>
> I think that one completely reduces to the per-cpu (spin)lock case,
> right? Because, as per the below, your logging case (8) can 'easily' be
> done without the cpu_opv monstrosity.
>
> And if you can construct a per-cpu lock, that can be used to construct
> arbitrary logic.

A per-cpu spinlock does not have the same performance characteristics
as the lock-free alternatives for various operations. For instance, an
rseq compare-and-store is faster than an rseq spinlock for linked-list
operations.
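
To make the shape of that comparison concrete, here is a minimal sketch
of a linked-list push done both ways. It is an assumption-laden stand-in,
not rseq code: C11 atomics replace the rseq critical section, and all
names (struct node, push_cmpxchg, push_locked) are hypothetical. Real
rseq fast paths operate on per-cpu data from an assembly restartable
sequence and avoid atomic instructions altogether.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical list node, for illustration only. */
struct node {
	struct node *next;
	int value;
};

/*
 * Lock-free push: a single compare-and-store publishes the new head.
 * With rseq, the equivalent store would be the commit instruction of
 * a restartable sequence rather than an atomic cmpxchg.
 */
static void push_cmpxchg(_Atomic(struct node *) *head, struct node *n)
{
	struct node *old = atomic_load_explicit(head, memory_order_relaxed);

	do {
		n->next = old;	/* 'old' is refreshed on each failed attempt */
	} while (!atomic_compare_exchange_weak_explicit(head, &old, n,
			memory_order_release, memory_order_relaxed));
}

/* Lock-based push: acquire the lock, update the list, release. */
struct locked_list {
	atomic_flag lock;
	struct node *head;
};

static void push_locked(struct locked_list *l, struct node *n)
{
	while (atomic_flag_test_and_set_explicit(&l->lock,
						 memory_order_acquire))
		;	/* spin until the lock is free */
	n->next = l->head;
	l->head = n;
	atomic_flag_clear_explicit(&l->lock, memory_order_release);
}
```

The lock-based variant needs two separate updates of shared state
(acquire and release) around the list update, where the compare-and-store
variant needs one.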
>
> And the difficult case for the per-cpu lock is the remote acquire; all
> the other cases are (relatively) trivial.
>
> I've not really managed to get anything sensible to work, I've tried
> several variations of split lock, but you invariably end up with
> barriers in the fast (local) path, which sucks.
>
> But I feel this should be solvable without cpu_opv. As in, I really hate
> that thing ;-)

I did not develop cpu_opv out of any kind of love for that solution.
I simply realized that it solved all my issues, after failing for quite
some time to implement acceptable solutions to the remote access
problem and to ensuring forward progress when single-stepping with
current debuggers that don't know about the rseq_table section.
>
>> 8) Allow libraries with multi-part algorithms to work on same per-cpu
>> data without affecting the allowed cpu mask
>>
>> The lttng-ust tracer presents an interesting use-case for per-cpu
>> buffers: the algorithm needs to update a "reserve" counter, serialize
>> data into the buffer, and then update a "commit" counter _on the same
>> per-cpu buffer_. Using rseq for both reserve and commit can bring
>> significant performance benefits.
>>
>> Clearly, if rseq reserve fails, the algorithm can retry on a different
>> per-cpu buffer. However, it's not that easy for the commit. It needs to
>> be performed on the same per-cpu buffer as the reserve.
>>
>> The cpu_opv system call solves that problem by receiving the cpu number
>> on which the operation needs to be performed as argument. It can push
>> the task to the right CPU if needed, and perform the operations there
>> with preemption disabled.
>>
>> Changing the allowed cpu mask for the current thread is not an
>> acceptable alternative for a tracing library, because the application
>> being traced does not expect that mask to be changed by libraries.
>
> We talked about this use-case, and it can be solved without cpu_opv if
> you keep a dual commit counter, one local and one (atomic) remote.

Right.
>
> We retain the cpu_id from the first rseq, and the second part will, when
> it (unlikely) finds it runs remotely, do an atomic increment on the
> remote counter. The consumer of the counter will then have to sum both
> the local and remote counter parts.

Yes, I prototyped this specific case with split counters a while ago.
However, if we need cpu_opv as a fallback for other reasons (e.g. remote
accesses), then the split counters are not needed, and there is no need
to change the layout of user-space data to accommodate the extra per-cpu
counter.
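
For reference, the split-counter scheme discussed above can be sketched
roughly as follows. This is only an illustration under stated
assumptions, not the lttng-ust prototype: the struct layout and the
commit_add()/commit_read() helpers are hypothetical, the current CPU is
passed in as a parameter, and the local fast path is a plain increment
standing in for an rseq-protected one.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical per-cpu commit state with a split counter. */
struct commit_counter {
	uint64_t local;			/* only touched on the owning CPU */
	_Atomic uint64_t remote;	/* atomically updated from other CPUs */
};

/*
 * Commit 'n' units to the buffer owned by 'reserve_cpu', the cpu_id
 * retained from the earlier reserve step. In a real implementation
 * the CPU check and the local increment must sit inside one rseq
 * critical section; here they are plain C for illustration.
 */
static void commit_add(struct commit_counter *c, int reserve_cpu,
		       int current_cpu, uint64_t n)
{
	if (current_cpu == reserve_cpu)
		c->local += n;		/* fast path: no atomic, no barrier */
	else
		atomic_fetch_add_explicit(&c->remote, n,
					  memory_order_release);
}

/* The consumer has to sum both parts to get the commit count. */
static uint64_t commit_read(struct commit_counter *c)
{
	return c->local +
	       atomic_load_explicit(&c->remote, memory_order_acquire);
}
```

The extra 'remote' field is the change in user-space data layout
mentioned above.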
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com