From mboxrd@z Thu Jan 1 00:00:00 1970 From: Mathieu Desnoyers Subject: Re: [PATCH for 5.5 1/2] rseq: Fix: Clarify rseq.h UAPI rseq_cs memory reclaim requirements Date: Mon, 6 Jan 2020 14:08:37 -0500 (EST) Message-ID: <1025393027.850.1578337717165.JavaMail.zimbra@efficios.com> References: <20191220201207.17389-1-mathieu.desnoyers@efficios.com> <87imman36g.fsf@mid.deneb.enyo.de> <173832695.14381.1576875253374.JavaMail.zimbra@efficios.com> <875zian2a2.fsf@mid.deneb.enyo.de> <669061171.14506.1576876500152.JavaMail.zimbra@efficios.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Return-path: In-Reply-To: <669061171.14506.1576876500152.JavaMail.zimbra@efficios.com> Sender: linux-kernel-owner@vger.kernel.org To: Florian Weimer Cc: Thomas Gleixner , linux-kernel , Peter Zijlstra , paulmck , Boqun Feng , "H. Peter Anvin" , Paul Turner , linux-api , stable , Dmitry Vyukov , Neel Natu List-Id: linux-api@vger.kernel.org ----- On Dec 20, 2019, at 4:15 PM, Mathieu Desnoyers mathieu.desnoyers@effi= cios.com wrote: > ----- On Dec 20, 2019, at 3:57 PM, Florian Weimer fw@deneb.enyo.de wrote: >=20 >> * Mathieu Desnoyers: >>=20 >>> ----- On Dec 20, 2019, at 3:37 PM, Florian Weimer fw@deneb.enyo.de wrot= e: >>> >>>> * Mathieu Desnoyers: >>>>=20 >>>>> diff --git a/include/uapi/linux/rseq.h b/include/uapi/linux/rseq.h >>>>> index 9a402fdb60e9..6f26b0b148a6 100644 >>>>> --- a/include/uapi/linux/rseq.h >>>>> +++ b/include/uapi/linux/rseq.h >>>>> @@ -100,7 +100,9 @@ struct rseq { >>>>> =09 * instruction sequence block, as well as when the kernel detects= that >>>>> =09 * it is preempting or delivering a signal outside of the range >>>>> =09 * targeted by the rseq_cs. Also needs to be set to NULL by user-= space >>>>> -=09 * before reclaiming memory that contains the targeted struct rse= q_cs. >>>>> +=09 * before reclaiming memory that contains the targeted struct rse= q_cs >>>>> +=09 * or reclaiming memory that contains the code refered to by the >>>>> +=09 * start_ip and post_commit_offset fields of struct rseq_cs. >>>>=20 >>>> Maybe mention that it's good practice to clear rseq_cs before >>>> returning from a function that contains a restartable sequence? >>> >>> Unfortunately, clearing it is not free. Considering that rseq is meant = to >>> be used in very hot code paths, it would be preferable that application= s >>> clear it in the very infrequent case where the rseq_cs or code will >>> vanish (e.g. dlclose or JIT reclaim), and not require it to be cleared >>> after each critical section. I am therefore reluctant to document the >>> behavior you describe as a "good practice" for rseq. >>=20 >> You already have to write to rseq_cs before entering the critical >> section, right? Then you've already determined the address, and the >> cache line is already hot, so it really should be close to zero cost. >=20 > Considering that overall rseq executes in fraction of nanoseconds on > some architectures, adding an extra store is perhaps close to zero, > but still significantly degrades performance. >=20 >>=20 >> I mean, you can still discard the advice, but you do so ad your own >> peril =E2=80=A6 >=20 > I am also uncomfortable leaving this to the end user. One possibility > would be to extend rseq or membarrier to add a kind of "rseq-clear" > barrier, which would ensure that the kernel will have cleared the > rseq_cs field for each thread belonging to the current process. glibc > could then call this barrier before dlclose. >=20 > This is slightly different from another rseq-barrier that has been > requested by Paul Turner: a way to ensure that all previously > running rseq critical sections have completed or aborted. >=20 > AFAIU, the desiderata for each of the 2 use-cases is as follows: >=20 > rseq-barrier: guarantee that all prior rseq critical sections have > completed or aborted for the current process or for a set of registered > processes. Allows doing RCU-like algorithms within rseq critical sections= . >=20 > rseq-clear: guarantee that the rseq_cs field is cleared for each thread > belonging to the current process before the barrier system call returns > to the caller. Aborts currently running rseq critical sections for all > threads belonging to the current process. The use-case is to allow > dlclose and JIT reclaim to clear any leftover reference to struct > rseq_cs or code which are going to be reclaimed. Just to clarify: should the discussion here prevent the UAPI documentation change from being merged into the Linux kernel ? Our discussion seems to be related to integration of rseq into glibc, rather than the kernel UAPI per = se. Thanks, Mathieu --=20 Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com