From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Florian Weimer <fw@deneb.enyo.de>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	paulmck <paulmck@linux.ibm.com>,
	Boqun Feng <boqun.feng@gmail.com>,
	"H. Peter Anvin" <hpa@zytor.com>, Paul Turner <pjt@google.com>,
	linux-api <linux-api@vger.kernel.org>,
	stable <stable@vger.kernel.org>,
	Dmitry Vyukov <dvyukov@google.com>,
	Neel Natu <neelnatu@google.com>
Subject: Re: [PATCH for 5.5 1/2] rseq: Fix: Clarify rseq.h UAPI rseq_cs memory reclaim requirements
Date: Fri, 20 Dec 2019 16:15:00 -0500 (EST)
Message-ID: <669061171.14506.1576876500152.JavaMail.zimbra@efficios.com>
In-Reply-To: <875zian2a2.fsf@mid.deneb.enyo.de>

----- On Dec 20, 2019, at 3:57 PM, Florian Weimer fw@deneb.enyo.de wrote:

> * Mathieu Desnoyers:
> 
>> ----- On Dec 20, 2019, at 3:37 PM, Florian Weimer fw@deneb.enyo.de wrote:
>>
>>> * Mathieu Desnoyers:
>>> 
>>>> diff --git a/include/uapi/linux/rseq.h b/include/uapi/linux/rseq.h
>>>> index 9a402fdb60e9..6f26b0b148a6 100644
>>>> --- a/include/uapi/linux/rseq.h
>>>> +++ b/include/uapi/linux/rseq.h
>>>> @@ -100,7 +100,9 @@ struct rseq {
>>>>  	 * instruction sequence block, as well as when the kernel detects that
>>>>  	 * it is preempting or delivering a signal outside of the range
>>>>  	 * targeted by the rseq_cs. Also needs to be set to NULL by user-space
>>>> -	 * before reclaiming memory that contains the targeted struct rseq_cs.
>>>> +	 * before reclaiming memory that contains the targeted struct rseq_cs
>>>> +	 * or reclaiming memory that contains the code refered to by the
>>>> +	 * start_ip and post_commit_offset fields of struct rseq_cs.
>>> 
>>> Maybe mention that it's good practice to clear rseq_cs before
>>> returning from a function that contains a restartable sequence?
>>
>> Unfortunately, clearing it is not free. Considering that rseq is meant to
>> be used in very hot code paths, it would be preferable that applications
>> clear it in the very infrequent case where the rseq_cs or code will
>> vanish (e.g. dlclose or JIT reclaim), and not require it to be cleared
>> after each critical section. I am therefore reluctant to document the
>> behavior you describe as a "good practice" for rseq.
> 
> You already have to write to rseq_cs before entering the critical
> section, right?  Then you've already determined the address, and the
> cache line is already hot, so it really should be close to zero cost.

Considering that an rseq critical section executes in a fraction of a
nanosecond on some architectures, the absolute cost of an extra store
may be close to zero, but it still significantly degrades performance
relative to that baseline.
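
To make the trade-off concrete, here is a minimal sketch that only counts
the stores to rseq_cs under both policies. It is not the selftests code:
the demo_* names, the nr_stores counter and the one-field struct are made
up for illustration, and only the rseq_cs field name comes from the UAPI
header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct rseq: only the field discussed here. */
struct demo_rseq {
	uint64_t rseq_cs;	/* address of the current descriptor, or 0 */
	uint64_t nr_stores;	/* demo-only counter of stores to rseq_cs */
};

/* Store issued before entering every critical section. */
static void demo_set_rseq_cs(struct demo_rseq *r, uint64_t cs_addr)
{
	r->rseq_cs = cs_addr;
	r->nr_stores++;
}

/* Clear rseq_cs: on every exit (eager) or only before reclaim (lazy). */
static void demo_clear_rseq_cs(struct demo_rseq *r)
{
	r->rseq_cs = 0;
	r->nr_stores++;
}

/* Suggested "good practice": clear after each critical section. */
static uint64_t demo_run_eager(struct demo_rseq *r, uint64_t cs_addr,
			       unsigned int nr_iter)
{
	unsigned int i;

	assert(r != NULL);
	r->nr_stores = 0;
	for (i = 0; i < nr_iter; i++) {
		demo_set_rseq_cs(r, cs_addr);
		/* ... critical section body would run here ... */
		demo_clear_rseq_cs(r);
	}
	return r->nr_stores;
}

/* Requirement as documented: clear once, before the memory goes away. */
static uint64_t demo_run_lazy(struct demo_rseq *r, uint64_t cs_addr,
			      unsigned int nr_iter)
{
	unsigned int i;

	assert(r != NULL);
	r->nr_stores = 0;
	for (i = 0; i < nr_iter; i++) {
		demo_set_rseq_cs(r, cs_addr);
		/* ... critical section body would run here ... */
	}
	demo_clear_rseq_cs(r);	/* e.g. right before dlclose or JIT reclaim */
	return r->nr_stores;
}
```

For N critical sections the eager policy issues 2N stores to rseq_cs and
the lazy one N + 1. Both leave rseq_cs cleared by the time the memory is
reclaimed; the difference is entirely on the hot path.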

> 
> I mean, you can still discard the advice, but you do so at your own
> peril …

I am also uncomfortable leaving this to the end user. One possibility
would be to extend rseq or membarrier to add a kind of "rseq-clear"
barrier, which would ensure that the kernel will have cleared the
rseq_cs field for each thread belonging to the current process. glibc
could then call this barrier before dlclose.

This is slightly different from another rseq-barrier that has been
requested by Paul Turner: a way to ensure that all previously
running rseq critical sections have completed or aborted.

AFAIU, the desiderata for each of the 2 use-cases are as follows:

rseq-barrier: guarantee that all prior rseq critical sections have
completed or aborted for the current process or for a set of registered
processes. Allows doing RCU-like algorithms within rseq critical sections.

rseq-clear: guarantee that the rseq_cs field is cleared for each thread
belonging to the current process before the barrier system call returns
to the caller. Aborts currently running rseq critical sections for all
threads belonging to the current process. The use-case is to allow
dlclose and JIT reclaim to clear any leftover reference to a struct
rseq_cs or to code that is going to be reclaimed.
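
As a toy user-space model of the difference between the two (neither
barrier exists as an API at this point; all demo_* names and the
per-thread struct are hypothetical, and a real implementation would live
in the kernel):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy per-thread state, standing in for what the kernel tracks. */
struct demo_thread {
	uint64_t rseq_cs;	/* user-space rseq_cs field: descriptor address or 0 */
	bool in_cs;		/* thread is currently inside a critical section */
	unsigned int nr_aborts;	/* demo-only: times the thread was aborted */
};

/* Abort a running critical section; rseq_cs is set to NULL on restart. */
static void demo_abort(struct demo_thread *t)
{
	t->in_cs = false;
	t->rseq_cs = 0;
	t->nr_aborts++;
}

/*
 * rseq-barrier model: on return, no critical section that was running
 * when the barrier was issued is still running. Threads outside a
 * critical section are not touched, so a stale rseq_cs may survive.
 */
static void demo_rseq_barrier(struct demo_thread *threads, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		if (threads[i].in_cs)
			demo_abort(&threads[i]);
	}
}

/*
 * rseq-clear model: abort running critical sections and additionally
 * clear rseq_cs for every thread before returning.
 */
static void demo_rseq_clear(struct demo_thread *threads, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		if (threads[i].in_cs)
			demo_abort(&threads[i]);
		threads[i].rseq_cs = 0;
	}
}

/* True if any thread still points at cs_addr, i.e. reclaim is unsafe. */
static bool demo_any_reference(const struct demo_thread *threads, size_t nr,
			       uint64_t cs_addr)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		if (threads[i].rseq_cs == cs_addr)
			return true;
	}
	return false;
}
```

In this model, a thread that left its critical section without clearing
rseq_cs still references the descriptor after demo_rseq_barrier(), but
not after demo_rseq_clear(). That leftover reference is the case dlclose
and JIT reclaim need covered.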

Thoughts?

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

Thread overview: 11+ messages
2019-12-20 20:12 [PATCH for 5.5 1/2] rseq: Fix: Clarify rseq.h UAPI rseq_cs memory reclaim requirements Mathieu Desnoyers
2019-12-20 20:12 ` [PATCH for 5.5 2/2] rseq/selftests: Clarify rseq_prepare_unload() helper requirements Mathieu Desnoyers
2019-12-20 20:27   ` Shuah Khan
2019-12-20 20:32     ` Mathieu Desnoyers
2019-12-20 20:37 ` [PATCH for 5.5 1/2] rseq: Fix: Clarify rseq.h UAPI rseq_cs memory reclaim requirements Florian Weimer
2019-12-20 20:54   ` Mathieu Desnoyers
2019-12-20 20:57     ` Florian Weimer
2019-12-20 21:15       ` Mathieu Desnoyers [this message]
2020-01-06 19:08         ` Mathieu Desnoyers
2020-01-06 19:30           ` Florian Weimer
2020-01-06 20:25             ` Mathieu Desnoyers
