From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Andy Lutomirski <luto@amacapital.net>
Cc: Peter Zijlstra <peterz@infradead.org>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
Ingo Molnar <mingo@redhat.com>, Paul Turner <commonly@gmail.com>,
Andi Kleen <andi@firstfloor.org>, Chris Lameter <cl@linux.com>,
Dave Watson <davejwatson@fb.com>,
Josh Triplett <josh@joshtriplett.org>,
linux-api <linux-api@vger.kernel.org>,
linux-kernel@vger.kernel.org, Andrew Hunter <ahh@google.com>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [RFC PATCH 0/3] restartable sequences v2: fast user-space percpu critical sections
Date: Fri, 8 Apr 2016 17:46:29 +0000 (UTC)
Message-ID: <65466698.51122.1460137589499.JavaMail.zimbra@efficios.com>
In-Reply-To: <427613474.49955.1460081105607.JavaMail.zimbra@efficios.com>
----- On Apr 7, 2016, at 10:05 PM, Mathieu Desnoyers mathieu.desnoyers@efficios.com wrote:
> ----- On Apr 7, 2016, at 9:21 PM, Andy Lutomirski luto@amacapital.net wrote:
>
>> On Thu, Apr 7, 2016 at 6:11 PM, Mathieu Desnoyers
>> <mathieu.desnoyers@efficios.com> wrote:
>>> ----- On Apr 7, 2016, at 6:05 PM, Andy Lutomirski luto@amacapital.net wrote:
>>>
>>>> On Thu, Apr 7, 2016 at 1:11 PM, Peter Zijlstra <peterz@infradead.org> wrote:
>>>>> On Thu, Apr 07, 2016 at 09:43:33AM -0700, Andy Lutomirski wrote:
>>> [...]
>>>>>
>>>>>> it's inherently debuggable,
>>>>>
>>>>> It is more debuggable, agreed.
>>>>>
>>>>>> and it allows multiple independent
>>>>>> rseq-protected things to coexist without forcing each other to abort.
>>>
>>> [...]
>>>
>>> My understanding is that the main goal of this rather more complex
>>> proposal is to make interaction with debuggers more straightforward in
>>> cases of single-stepping through the rseq critical section.
>>
>> The things I like about my proposal are both that you can single-step
>> through it just like any other code as long as you pin the thread to a
>> CPU and that it doesn't make preemption magical. (Of course, you can
>> *force* it to do something on resume and/or preemption by sticking a
>> bogus value in the expected event count field, but that's not the
>> intended use. Hmm, I guess it does need to hook preemption and/or
>> resume for all processes that enable the thing so it can know to check
>> for an enabled post_commit_rip, just like all the other proposals.)
>>
>> Also, mine lets you have a fairly long-running critical section that
>> doesn't get aborted under heavy load and can interleave with other
>> critical sections that don't conflict.
>
> Yes, those would be nice advantages. I'll have to do a few more
> pseudo-code and execution scenarios to get a better understanding of
> your idea.
>
>>
>>>
>>> I recently came up with a scheme that should allow us to handle such
>>> situations in a fashion similar to debuggers handling ll/sc
>>> restartable sequences of instructions on e.g. powerpc. The good news
>>> is that my scheme does not require anything at the kernel level.
>>>
>>> The idea is simple: the userspace rseq critical sections now
>>> become marked by 3 inline functions (rather than 2 in Paul's proposal):
>>>
>>> rseq_start(void *rseq_key)
>>> rseq_finish(void *rseq_key)
>>> rseq_abort(void *rseq_key)
>>
>> How do you use this thing? What are its semantics?
>
> You define one rseq_key variable (dummy 1 byte variable, can be an
> empty structure) for each rseq critical section you have in your
> program.
>
> A rseq critical section will typically have one entry point (rseq_start),
> and one exit point (rseq_finish). I'm saying "typically" because there
> may be more than one entry point, and more than one exit point per
> critical section.
>
> Entry and exit points mark the beginning and end of each rseq critical
> section. rseq_start loads the sequence counter from the TLS and copies
> it onto the stack. It then gets passed to rseq_finish() to be compared
> with the final seqnum TLS value just before the commit. rseq_finish is
> the one responsible for storing into the post_commit_instr field of the
> TLS and populating rcx with the failure insn label address. rseq_finish()
> does the commit.
>
> And there is rseq_abort(), which would need to be called if we just want
> to exit from a rseq critical section without doing the commit (no matching
> call to rseq_finish after a rseq_start).
>
> Each of rseq_start, finish, and abort would need to receive a pointer
> to the rseq_key as parameter.
>
> rseq_start would return the sequence number read from the TLS.
>
> rseq_finish would also receive as parameter that sequence number that has
> been returned by rseq_start.
>
> Does it make sense ?
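
To make the semantics described above a bit more concrete, here is a
rough user-space model in C. It is only a sketch under assumptions:
rseq_seqnum stands in for the TLS sequence counter that the kernel would
bump on preemption, simulate_preemption() and increment() are made-up
helpers, and the extra target/newval parameters of rseq_finish() are a
guess at how the commit store would be passed in. The real rseq_finish()
would need the post_commit_instr and failure-label machinery in assembly
to be restartable; this model is not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the TLS sequence counter. In the proposal the
 * kernel would change it on preemption; here we bump it by hand. */
static _Thread_local uint32_t rseq_seqnum;

/* Entry point: snapshot the sequence counter. The key is only there to
 * tag this entry point for the debugger. */
static inline uint32_t rseq_start(void *rseq_key)
{
	assert(rseq_key != NULL);
	return rseq_seqnum;
}

/* Exit point with commit: store newval only if the counter is unchanged
 * since rseq_start(). Assumption: target/newval describe the commit. */
static inline bool rseq_finish(void *rseq_key, uint32_t start_seq,
			       intptr_t *target, intptr_t newval)
{
	assert(rseq_key != NULL);
	if (rseq_seqnum != start_seq)
		return false;	/* caller restarts the critical section */
	*target = newval;	/* the commit */
	return true;
}

/* Exit point without commit. */
static inline void rseq_abort(void *rseq_key)
{
	assert(rseq_key != NULL);
}

/* Hypothetical helper: pretend the thread was preempted. */
static inline void simulate_preemption(void)
{
	rseq_seqnum++;
}

static char counter_key;	/* dummy 1 byte key for this critical section */

/* Hypothetical example section: increment *counter, return retry count. */
static inline int increment(intptr_t *counter, bool preempt_once)
{
	int retries = 0;

	for (;;) {
		uint32_t seq = rseq_start(&counter_key);
		intptr_t v = *counter + 1;

		if (preempt_once) {
			simulate_preemption();
			preempt_once = false;
		}
		if (rseq_finish(&counter_key, seq, counter, v))
			return retries;
		retries++;
	}
}
```

In this model, increment(&counter, true) retries exactly once, because
the simulated preemption changes the sequence number between
rseq_start() and rseq_finish().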

By the way, the debugger can always choose to single-step through the
first iteration of the rseq critical section and then, once it loops
back, stop single-stepping until one of the exit points is reached.
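
On the debugger side, the lookup could be as simple as the following C
sketch. It assumes the (RSEQ_TYPE, insn address, rseq_key) tuples from
the non-loadable section have already been read into an array.
struct rseq_entry, rseq_key_at_start() and rseq_exit_points() are
hypothetical names, not part of gdb or of any existing patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum rseq_type { RSEQ_START, RSEQ_FINISH, RSEQ_ABORT };

/* One tuple from the special section (layout is an assumption). */
struct rseq_entry {
	enum rseq_type type;
	uintptr_t insn_addr;	/* address of the tagged instruction */
	uintptr_t key;		/* address of the rseq_key object */
};

/* Return the key of the critical section entered at pc, or 0 if pc is
 * not a recorded entry point. */
static inline uintptr_t rseq_key_at_start(const struct rseq_entry *tab,
					  size_t n, uintptr_t pc)
{
	assert(tab != NULL);
	for (size_t i = 0; i < n; i++)
		if (tab[i].type == RSEQ_START && tab[i].insn_addr == pc)
			return tab[i].key;
	return 0;
}

/* Collect the addresses of all exit points (finish and abort) that share
 * the given key. Writes at most max addresses, returns how many. */
static inline size_t rseq_exit_points(const struct rseq_entry *tab, size_t n,
				      uintptr_t key, uintptr_t *out,
				      size_t max)
{
	size_t found = 0;

	assert(tab != NULL && out != NULL);
	for (size_t i = 0; i < n && found < max; i++)
		if (tab[i].key == key && tab[i].type != RSEQ_START)
			out[found++] = tab[i].insn_addr;
	return found;
}
```

When the debugger is about to step onto an entry point, it would place
breakpoints on every address returned by rseq_exit_points() and let the
thread run until one of them is hit.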

Thanks,

Mathieu
>
> Thanks,
>
> Mathieu
>
>
>>
>> --Andy
>>
>>>
>>> We associate each critical section with a unique "key" (dummy
>>> 1 byte object in the process address space), so we can group
>>> them. The new "rseq_abort" would mark exit points that would
>>> exit the critical section without executing the final commit
>>> instruction.
>>>
>>> Within each of rseq_start, rseq_finish and rseq_abort,
>>> we declare a non-loadable section that gets populated
>>> with the following tuples:
>>>
>>> (RSEQ_TYPE, insn address, rseq_key)
>>>
>>> Where RSEQ_TYPE is either RSEQ_START, RSEQ_FINISH, or RSEQ_ABORT.
>>>
>>> That special section would be found in the executable by the
>>> debugger, which can then skip over entire restartable critical
>>> sections when it encounters them by placing breakpoints at
>>> all exit points (finish and abort) associated with the same
>>> rseq_key as the entry point (start).
>>>
>>> This way we don't need to complexify the runtime code, neither
>>> at kernel nor user-space level, and we get debuggability using
>>> a trick similar to what ll/sc architectures already need to do.
>>>
>>> Of course, this requires extending gdb, which should not be
>>> a show-stopper.
>>>
>>> Thoughts ?
>>>
>>> Thanks,
>>>
>>> Mathieu
>>>
>>> --
>>> Mathieu Desnoyers
>>> EfficiOS Inc.
>>> http://www.efficios.com
>>
>>
>>
>> --
>> Andy Lutomirski
>> AMA Capital Management, LLC
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com