From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Szabolcs Nagy <Szabolcs.Nagy@arm.com>
Cc: nd <nd@arm.com>, Peter Zijlstra <peterz@infradead.org>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Boqun Feng <boqun.feng@gmail.com>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	linux-api <linux-api@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Andy Lutomirski <luto@amacapital.net>,
	Dave Watson <davejwatson@fb.com>, Paul Turner <pjt@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Russell King <linux@arm.linux.org.uk>,
	Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	Andi Kleen <andi@firstfloor.org>, Chris Lameter <cl@linux.com>,
	Ben Maurer <bmaurer@fb.com>, rostedt <rostedt@goodmis.org>,
	Josh Triplett <josh@joshtriplett.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Catalin Marinas <Catalin.Marinas@arm.com>
Subject: Re: [RFC PATCH for 4.21 01/16] rseq/selftests: Add reference counter to coexist with glibc
Date: Thu, 11 Oct 2018 12:37:26 -0400 (EDT)
Message-ID: <1680616760.2469.1539275846360.JavaMail.zimbra@efficios.com>
In-Reply-To: <3896e4f5-aab1-ae79-5360-088fd15ed380@arm.com>

----- On Oct 11, 2018, at 12:20 PM, Szabolcs Nagy Szabolcs.Nagy@arm.com wrote:

> On 11/10/18 16:13, Mathieu Desnoyers wrote:
>> ----- On Oct 11, 2018, at 6:37 AM, Szabolcs Nagy Szabolcs.Nagy@arm.com wrote:
>> 
>>> On 10/10/18 20:19, Mathieu Desnoyers wrote:
>>>> In order to integrate rseq into user-space applications, add a reference
>>>> counter field after the struct rseq TLS ABI so many rseq users can be
>>>> linked into the same application (e.g. librseq and glibc). The
>>>> reference count ensures that rseq syscall registration/unregistration
>>>> happens only for the earliest/latest user on each thread, thus ensuring
>>>> that rseq is registered across the lifetime of all rseq users for a
>>>> given thread.
>>> ...
>>>> +__attribute__((visibility("hidden"))) __thread
>>>> +volatile struct libc_rseq __lib_rseq_abi = {
>>> ...
>>>> +extern __attribute__((weak, alias("__lib_rseq_abi"))) __thread
>>>> +volatile struct rseq __rseq_abi;
>>> ...
>>>> @@ -70,7 +86,7 @@ int rseq_register_current_thread(void)
>>>>  	sigset_t oldset;
>>>>  
>>>>  	signal_off_save(&oldset);
>>>> -	if (refcount++)
>>>> +	if (__lib_rseq_abi.refcount++)
>>>>  		goto end;
>>>>  	rc = sys_rseq(&__rseq_abi, sizeof(struct rseq), 0, RSEQ_SIG);
>>>
>>> why do you use a local refcounter instead of the __rseq_abi one?
>> 
>> There is no refcount in struct rseq (the ABI between kernel and user-space).
>> The registration refcount was part of an earlier version of the rseq system
>> call, but we decided against keeping it in the kernel.
>> 
>> So I'm adding one _after_ struct rseq, purely to allow interaction between
>> various user-space components (program/libraries).
> 
> then all those components must use the same
> 
>  rseq_register_current_thread
>  rseq_unregister_current_thread
> 
> functions and not call the syscall on their own.

Not quite. Each user (program or shared object) that wishes to invoke the
syscall by itself must handle the refcount in a similar way. Alternatively,
users can rely on the librseq APIs if they do not wish to carry a local
implementation of the reference counting and syscall
registration/unregistration.
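
A minimal sketch of what a local implementation of that protocol could look
like, assuming a per-thread counter shared by every rseq user. The names
fake_sys_rseq_register, fake_sys_rseq_unregister, thread_refcount and the
sketch_ prefixed functions are hypothetical stand-ins so the example runs
without a kernel; the signal blocking done in the patch (signal_off_save) and
overflow handling are left out:

```c
#include <assert.h>

/* Hypothetical stand-ins for the rseq syscall; they only record state. */
static __thread int kernel_registered;
static __thread int nr_syscalls;

static int fake_sys_rseq_register(void)
{
	nr_syscalls++;
	kernel_registered = 1;
	return 0;
}

static int fake_sys_rseq_unregister(void)
{
	nr_syscalls++;
	kernel_registered = 0;
	return 0;
}

/* Shared per-thread counter; in the patch it lives in __lib_rseq_abi. */
static __thread int thread_refcount;

int sketch_rseq_register_current_thread(void)
{
	/* Only the first user on this thread performs the syscall. */
	if (thread_refcount++)
		return 0;
	if (fake_sys_rseq_register()) {
		/* Registration failed: give the reference back. */
		thread_refcount--;
		return -1;
	}
	return 0;
}

int sketch_rseq_unregister_current_thread(void)
{
	/* Catch unbalanced unregistration in this sketch. */
	assert(thread_refcount > 0);
	/* Only the last user on this thread performs the syscall. */
	if (--thread_refcount)
		return 0;
	return fake_sys_rseq_unregister();
}
```

With two users on the same thread, the two register calls and two unregister
calls result in one registration and one unregistration in total, and the
thread stays registered until the second unregister.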

> 
> in which case the refcount could be a static __thread variable.

Yes, but I want to limit the number of symbols we need to export
from glibc by appending the refcount field at the end of struct rseq.
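
A rough layout sketch of that idea, assuming the kernel-visible fields come
first and the user-space refcount follows them. The struct and field names
below (sketch_rseq, sketch_libc_rseq, kernel_abi, kernel_view) are
illustrative stand-ins, not the definitions from the kernel header or from
the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel-visible struct rseq; fields are illustrative. */
struct sketch_rseq {
	uint32_t cpu_id_start;
	uint32_t cpu_id;
	uint64_t rseq_cs;
	uint32_t flags;
};

/* Extended object: kernel-visible part first, user-space refcount after. */
struct sketch_libc_rseq {
	struct sketch_rseq kernel_abi;
	uint32_t refcount;
};

/* The kernel-visible part starts at the very beginning of the object... */
static_assert(offsetof(struct sketch_libc_rseq, kernel_abi) == 0,
	      "kernel ABI must come first");
/* ...and the refcount lies entirely past it. */
static_assert(offsetof(struct sketch_libc_rseq, refcount) >=
	      sizeof(struct sketch_rseq),
	      "refcount must follow the kernel ABI");

/* The address handed to the kernel is the address of the whole object. */
struct sketch_rseq *kernel_view(struct sketch_libc_rseq *obj)
{
	return &obj->kernel_abi;
}
```

Because both views share one address, a single exported object can serve the
kernel registration and the user-space refcount, which is the reason a second
exported symbol is not needed in this sketch.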

> 
> but it's in a magic struct that's called "abi" which is confusing,
> the counter is not abi, it's in a hidden object.

No, it really is an ABI between user-space apps/libs. It's not meant to be
hidden. glibc implements its own register/unregister functions (it does not
link against librseq). librseq exposes register/unregister functions as public
APIs. Those also use the refcount. I also plan to have existing libraries, e.g.
liblttng-ust and possibly liburcu flavors, implement the
registration/unregistration and refcount handling on their own, so we don't
have to require pre-existing libraries to link against librseq.

So that refcount is not an ABI between kernel and user-space, but it's a
user-space ABI nevertheless (between program and shared objects).

> 
>>> what prevents calling rseq_register_current_thread more than 4G times?
>> 
>> Nothing. It would indeed be cleaner to error out if we detect that refcount is
>> at INT_MAX. Is that what you have in mind?
> 
> yes

All right, will fix.
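
A minimal sketch of such a check, assuming the counter is a plain int as the
INT_MAX remark suggests. The helper name refcount_inc_checked and the choice
of EOVERFLOW as the error code are assumptions made for illustration, not
taken from the patch:

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical helper: increment *refcount unless it is already at
 * INT_MAX. Returns the previous value on success, or -1 with errno set
 * to EOVERFLOW when the counter is saturated (the counter is left as is).
 */
int refcount_inc_checked(int *refcount)
{
	assert(refcount != NULL);
	if (*refcount == INT_MAX) {
		errno = EOVERFLOW;
		return -1;
	}
	return (*refcount)++;
}
```

A register function could call this helper first and return the error to its
caller instead of letting the counter wrap around.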

> 
>>> why cant the kernel see that the same address is registered again and succeed?
>> 
>> It can, and it does. However, refcounting at user-level is needed to ensure
>> the registration "lifetime" for rseq covers its entire use. If we have two
>> libraries using rseq, we end up with the following scenario:
>> 
>> Thread 1
>> 
>>   libA registers rseq
>>   libB registers rseq
>>   libB unregisters rseq
>>   libA uses rseq -> bug! it's been unregistered by libB.
>>   libA unregisters rseq -> unexpected, it's already been unregistered.
>>  
>> The same applies if libA unregisters rseq before libB (and libB tries to use
>> rseq after libA has unregistered).
>> 
>> The refcount in user-space fixes this.
> 
> i see.
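
The quoted scenario can be replayed with a toy model of a single thread. This
is only a sketch: struct thread_model and all function names are hypothetical,
and the "registered" flag stands in for the kernel-side registration state:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one thread: registration state plus user-space refcount. */
struct thread_model {
	bool registered;
	int refcount;
};

/* Naive scheme: every library registers and unregisters directly. */
void naive_register(struct thread_model *t)   { t->registered = true; }
void naive_unregister(struct thread_model *t) { t->registered = false; }

/* Refcounted scheme: only the first/last user changes the registration. */
void counted_register(struct thread_model *t)
{
	if (t->refcount++ == 0)
		t->registered = true;
}

void counted_unregister(struct thread_model *t)
{
	assert(t->refcount > 0);
	if (--t->refcount == 0)
		t->registered = false;
}

/*
 * Replays the sequence up to the point where libA uses rseq and reports
 * whether the thread is still registered at that point.
 */
bool liba_usable_after_libb_leaves(bool counted)
{
	struct thread_model t = { false, 0 };
	void (*reg)(struct thread_model *) =
		counted ? counted_register : naive_register;
	void (*unreg)(struct thread_model *) =
		counted ? counted_unregister : naive_unregister;

	reg(&t);		/* libA registers rseq */
	reg(&t);		/* libB registers rseq */
	unreg(&t);		/* libB unregisters rseq */
	return t.registered;	/* can libA still use rseq? */
}
```

In this model the naive scheme leaves libA without a registration after libB
unregisters, while the refcounted scheme keeps it registered until libA
unregisters as well.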

Thanks for the feedback!

Mathieu

> 
>> Thoughts ?
>> 
>> Thanks,
>> 
>> Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
