public inbox for linux-kernel@vger.kernel.org
From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <andi@firstfloor.org>,
	Peter Zijlstra <peterz@infradead.org>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Boqun Feng <boqun.feng@gmail.com>,
	Andy Lutomirski <luto@amacapital.net>,
	Dave Watson <davejwatson@fb.com>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	linux-api <linux-api@vger.kernel.org>,
	Paul Turner <pjt@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Russell King <linux@arm.linux.org.uk>,
	Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	Andrew Hunter <ahh@google.com>, Chris Lameter <cl@linux.com>,
	Ben Maurer <bmaurer@fb.com>, rostedt <rostedt@goodmis.org>,
	Josh Triplett <josh@joshtriplett.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will.deacon@arm.com>,
	Michael Kerrisk <mtk.manpages@gmail.com>
Subject: Re: [RFC PATCH for 4.15 v12 00/22] Restartable sequences and CPU op vector
Date: Wed, 22 Nov 2017 12:36:59 +0000 (UTC)
Message-ID: <809252084.19901.1511354219731.JavaMail.zimbra@efficios.com>
In-Reply-To: <alpine.DEB.2.20.1711212315450.2399@nanos>

----- On Nov 21, 2017, at 5:59 PM, Thomas Gleixner tglx@linutronix.de wrote:

> On Tue, 21 Nov 2017, Mathieu Desnoyers wrote:
>> ----- On Nov 21, 2017, at 12:21 PM, Andi Kleen andi@firstfloor.org wrote:
>> 
>> > On Tue, Nov 21, 2017 at 09:18:38AM -0500, Mathieu Desnoyers wrote:
>> >> Hi,
>> >> 
>> >> Following changes based on a thorough coding style and patch changelog
>> >> review from Thomas Gleixner and Peter Zijlstra, I'm respinning this
>> >> series for another RFC.
>> >> 
>> > My suggestion would be that you also split out the opv system call.
>> > That seems to be the main contention point currently, and the restartable
>> > sequences should be useful without it.
>> 
>> I consider rseq to be incomplete and a pain to use in various scenarios
>> without cpu_opv.
>> 
>> About the contention point you refer to:
>> 
>> Using vDSO as an example of how things should be done is just wrong: the
>> vDSO interaction with debugger instruction single-stepping is broken,
>> as I detailed in my previous email.
> 
> Let me turn that around. You're lamenting about a conditional branch in
> your rseq thing for performance reasons and at the same time you want to
> force extra code into the VDSO? clock_gettime() is one of the hottest
> vsyscalls in certain scenarios. So why would we want to have extra code
> there? Just to make debuggers happy. You really can't be serious about
> that.

There is *already* an existing branch in the clock_gettime vsyscall:
it's a loop. It won't hurt the fast-path to use that branch and
make it do something else instead. It could even help the vDSO fast-path
for some non-x86 architectures where branch prediction assumes that
backward branches are always taken (adding an unlikely() does not help
in those cases).
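
As a rough illustration of the kind of loop referred to above (this is
not the actual vDSO code; the structure and function names are
hypothetical and use C11 atomics rather than kernel primitives), a
sequence-counter read side looks roughly like this. The backward branch
at the end of the do/while is the branch that already exists on the
fast path:

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical time data page; names are illustrative only. */
struct demo_timedata {
	atomic_uint seq;	/* odd while a writer is updating */
	_Atomic uint64_t ns;	/* payload protected by seq */
};

/* Reader: retry until a consistent snapshot is observed. */
static uint64_t demo_read_ns(struct demo_timedata *d)
{
	unsigned int start;
	uint64_t ns;

	do {
		/* Wait for an even sequence: no writer in progress. */
		while ((start = atomic_load_explicit(&d->seq,
				memory_order_acquire)) & 1)
			;
		ns = atomic_load_explicit(&d->ns, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		/* Backward branch: taken only if a writer ran meanwhile. */
	} while (atomic_load_explicit(&d->seq,
			memory_order_relaxed) != start);

	return ns;
}

/* Single writer: bump seq to odd, update, bump seq back to even. */
static void demo_write_ns(struct demo_timedata *d, uint64_t ns)
{
	unsigned int s = atomic_load_explicit(&d->seq, memory_order_relaxed);

	atomic_store_explicit(&d->seq, s + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&d->ns, ns, memory_order_relaxed);
	atomic_store_explicit(&d->seq, s + 2, memory_order_release);
}
```

In this sketch the retry branch is almost never taken, which is why
reusing it for another purpose should not add cost to the common case.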

> 
>> Thomas' proposal of handling single-stepping with a user-space locking
>> fallback, which is pretty much what I had in 2016, pushes a lot of
>> complexity to user-space, requires an extra branch in the fast-path,
>> as well as additional store-release/load-acquire semantics for consistency.
>> I don't plan going down that route.
>>
>> Other than that, I have not received any concrete alternative proposal to
>> properly handle single-stepping.
> 
> You provided the details today. Up to that point all we had was handwaving
> and inconsistent information.

I mistakenly presumed you had followed the discussions of the past two
years. It appears I was wrong, and that information needed to be
summarized in my changelog. This was my mistake and I fixed it.

> 
>> The only opposition against cpu_opv is that there *should* be a hypothetical
>> simpler solution. The rseq idea is not new: it's been presented by Paul Turner
>> in 2012 at LPC. And so far, cpu_opv is the overall simplest and most
>> efficient way I encountered to handle single-stepping, and it gives extra
>> benefits, as described in my changelog.
> 
> That's how you define it and that does not make cpu_opv less complex and
> more debuggable. There is no way to debug that and still you claim that it
> removes complexity from user space.

So I should ask: what kind of observability within cpu_opv() do you want?
I can add a tracepoint for each operation, which would technically take care
of your concern. Your main counter-argument seems to be a tooling issue.

> That ops stuff comes from user space and
> is not magically constructed by the kernel. In some of your use cases it
> even has different semantics than the rseq section code. So how is that
> removing any complexity from user space? All it buys you is an extra branch
> less in your rseq hotpath and that's your justification to shove that
> thing into the kernel.

Actually, the cpu-op user-space library can hide this difference from the
user: I implemented the equivalent rseq algorithm using a compare-and-store:

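int cpu_op_cmpnev_storeoffp_load(intptr_t *v, intptr_t expectnot,
                off_t voffp, intptr_t *load, int cpu)
{
        /* Snapshot the current value of *v. */
        intptr_t oldv = READ_ONCE(*v);
        /* The new value lives voffp bytes past the address held in *v. */
        intptr_t *newp = (intptr_t *)(oldv + voffp);
        int ret;

        /* *v holds the value we must not see: nothing to do. */
        if (oldv == expectnot)
                return 1;
        /*
         * On the given CPU: if *v still equals oldv, store *newp into *v.
         * Reading newp is allowed to fault.
         */
        ret = cpu_op_cmpeqv_storep_expect_fault(v, oldv, newp, cpu);
        if (!ret) {
                *load = oldv;
                return 0;
        }
        /* Comparison failed: *v changed under us, the caller may retry. */
        if (ret > 0) {
                errno = EAGAIN;
                return -1;
        }
        /* ret < 0: error, errno is left as set by the callee. */
        return -1;
}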

So from a library user perspective, the fast-path and slow-path are
exactly the same.
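
To make that concrete, here is a minimal sketch of how a library entry
point could hide the two paths behind one call. Everything in it is an
assumption made for illustration: the demo_* names are hypothetical,
the two helpers are plain C stand-ins rather than the real rseq and
cpu_opv wrappers, and the return convention (0 on success, 1 when the
comparison fails, -1 on abort or error) is assumed:

```c
#include <stdint.h>

/* Test knob: when non-zero, the simulated fast path always aborts. */
static int demo_fast_abort;

/* Stand-in for an rseq critical section doing compare-and-store. */
static int demo_rseq_cmpeqv_storev(intptr_t *v, intptr_t expect,
		intptr_t newv, int cpu)
{
	(void)cpu;
	if (demo_fast_abort)
		return -1;	/* e.g. preempted, migrated, single-stepped */
	if (*v != expect)
		return 1;	/* comparison failed */
	*v = newv;
	return 0;
}

/* Stand-in for the cpu_opv-based slow path with the same semantics. */
static int demo_cpu_op_cmpeqv_storev(intptr_t *v, intptr_t expect,
		intptr_t newv, int cpu)
{
	(void)cpu;
	if (*v != expect)
		return 1;
	*v = newv;
	return 0;
}

/* What the library user calls: same signature, same result codes. */
static int demo_percpu_cmpeqv_storev(intptr_t *v, intptr_t expect,
		intptr_t newv, int cpu)
{
	int ret = demo_rseq_cmpeqv_storev(v, expect, newv, cpu);

	if (ret >= 0)
		return ret;	/* fast path completed (stored or mismatch) */
	/* Fast path aborted: fall back to the slow path. */
	return demo_cpu_op_cmpeqv_storev(v, expect, newv, cpu);
}
```

The caller only ever sees demo_percpu_cmpeqv_storev() and its result
codes, regardless of which path performed the update.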

> 
> The version I reviewed was just indigestible.

Thanks for the thorough coding style review by the way.

> I did not have time to look
> at the hastily cobbled together version of today. Aside of that the
> scheduler portion of it has not seen any review from scheduler folks
> either.

True. It appears that it really takes a merge window to get some
people's attention. That's OK, you guys are really busy on other
stuff. It's just unfortunate that the feedback about the cpu_opv
concept did not come sooner, e.g. during first rounds of patches
where the cpu_opv design was presented, or even at KS.

> 
> AFAICT there is not a single reviewed-by tag on the sys_rseq and the
> sys_opv patches either.

Very good point! Can anyone in CC who cares about getting this in
find time to do an official review?

> 
> Are you seriously expecting that new syscalls of that kind are going to be
> merged without a deep and thorough review just based on your decision to
> declare them ready?

In my reply to Andi, I merely stated that I'm not willing to push a
half-baked user-space ABI into the kernel, and rseq without cpu_opv
is only part of the solution.

Let's see if others find time to do an official review.

Thanks,

Mathieu



> 
> Thanks,
> 
> 	tglx

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
