From: Mathieu Desnoyers <mathieu.desnoyers-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>
To: "H. Peter Anvin" <hpa-YMNOUZJC4hwAvxtiuMwx3w@public.gmane.org>
Cc: Andrew Morton
	<akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
	Russell King <linux-lFZ/pmaqli7XmaaqVzeoHQ@public.gmane.org>,
	Thomas Gleixner <tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>,
	Ingo Molnar <mingo-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-api <linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Paul Turner <pjt-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	Andrew Hunter <ahh-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>,
	Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>,
	Andi Kleen <andi-Vw/NltI1exuRpAAqCnN02g@public.gmane.org>,
	Dave Watson <davejwatson-b10kYP2dOMg@public.gmane.org>,
	Chris Lameter <cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org>,
	Ben Maurer <bmaurer-b10kYP2dOMg@public.gmane.org>,
	rostedt <rostedt-nx8X9YLhiw1AfugRpC6u6w@public.gmane.org>,
	"Paul E. McKenney"
	<paulmck-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>,
	Josh Triplett <josh-iaAMLnmF4UmaiuxdJuQwMA@public.gmane.org>,
	Linus Torvalds
	<torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
	Catalin Marinas <catalin.marinas-5wv7dgnIgG8@public.gmane.org>,
	Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org>,
	Michael Kerrisk
	<mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	Boqun Feng <boqun.feng-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Subject: Re: [RFC PATCH v6 1/5] Thread-local ABI system call: cache CPU number of running thread
Date: Mon, 4 Apr 2016 20:48:23 +0000 (UTC)
Message-ID: <856357054.45028.1459802903401.JavaMail.zimbra@efficios.com>
In-Reply-To: <492303698.44994.1459799188052.JavaMail.zimbra-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>

----- On Apr 4, 2016, at 3:46 PM, Mathieu Desnoyers mathieu.desnoyers-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org wrote:

> ----- On Apr 4, 2016, at 1:11 PM, H. Peter Anvin hpa-YMNOUZJC4hwAvxtiuMwx3w@public.gmane.org wrote:
> 
>> On 04/04/16 10:01, Mathieu Desnoyers wrote:
>>> 
>>> Changes since v5:
>>> - Rename "getcpu_cache" to "thread_local_abi", allowing to extend
>>>   this system call to cover future features such as restartable critical
>>>   sections. Generalizing this system call ensures that we can add
>>>   features similar to the cpu_id field within the same cache-line
>>>   without having to track one pointer per feature within the task
>>>   struct.
>>> - Add a tlabi_nr parameter to the system call, thus allowing to extend
>>>   the ABI beyond the initial 64-byte structure by registering structures
>>>   with tlabi_nr greater than 0. The initial ABI structure is associated
>>>   with tlabi_nr 0.
>>> - Rebased on kernel v4.5.
>>> 
>> 
>> This seems absolutely insanely complex, both for the kernel and for
>> userspace.
>> 
>> A much saner way would be for userspace to query the kernel for the size
>> of the structure; userspace then allocates the maximum of what it knows
>> and what the kernel knows.  That way, the kernel doesn't need to
>> conditionalize its accesses to user space, and libc doesn't need to
>> conditionalize its accesses either.
> 
> If we go down the route of having user-space dynamically allocating
> the structure, my understanding is that we need to associate the
> user-space TLS symbol with a pointer to the structure, and test for
> NULL each time, thus requiring user-space to touch one more cache-line
> (read the pointer), and add one conditional per user-space fast-path,
> compared to a statically-sized definition approach. Or perhaps you have
> some clever trick in mind for "allocation by user-space" that I'm missing?
> 
> Besides the NULL pointer check, another issue is feature detection.
> As we extend the feature set, my proposal has a 32-bit features
> mask at the beginning of the TLS structure, within the same
> cache-line containing the structure fields, so user-space can quickly
> check whether the required feature is enabled (adds one conditional
> on the user-space fast path, but does not require touching another
> cache-line). This allows adding new features without having to
> reserve the value "0" within each field of the structure to mean
> "feature unavailable", which I find terminally unaesthetic.
> 
> I propose here a fixed-size 64-byte layout for the first structure,
> for which a 32-bit feature mask should be enough. If we ever fill
> up these 64 bytes, we can then use the next tlabi_nr number (1),
> which will define its own structure size and feature mask. This
> seems like a good compromise between fast-path speed, feature detection
> flexibility, optimal use of cache-lines, and extensibility.
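
To make the layout quoted above concrete, here is a minimal sketch of
what a fixed-size 64-byte structure with a leading 32-bit feature mask
could look like. The structure name, the field order after the mask,
and the bit assignment are assumptions made for illustration only;
they are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names and values, for illustration only. */
#define TLABI_SKETCH_LEN        64
#define TLABI_SKETCH_F_CPU_ID   (1u << 0)   /* assumed bit assignment */

struct tlabi_sketch {
	/* Feature mask first, so it shares a cache line with the fields. */
	_Alignas(TLABI_SKETCH_LEN) uint32_t features;
	/* Only meaningful when TLABI_SKETCH_F_CPU_ID is set in features. */
	uint32_t cpu_id;
	/* Room for future fields covered by the same 32-bit mask. */
	uint8_t reserved[TLABI_SKETCH_LEN - 2 * sizeof(uint32_t)];
};

/* The size is part of the contract: exactly one 64-byte block. */
static_assert(sizeof(struct tlabi_sketch) == TLABI_SKETCH_LEN,
	      "structure must be exactly 64 bytes");
static_assert(offsetof(struct tlabi_sketch, features) == 0,
	      "feature mask must come first");

/* One conditional, and no cache line touched beyond the structure. */
static inline int tlabi_sketch_has(const struct tlabi_sketch *t,
				   uint32_t mask)
{
	return (t->features & mask) == mask;
}
```

With this shape, a field never needs a reserved "unavailable" value:
a caller tests the mask bit before reading the field.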

Moreover, the feature sets that the application, glibc, and the
kernel know about are three different things.
My intent here is to have glibc stay out of the way as much as possible,
since this is really an interface between various applications/libraries
and the kernel.

Even if glibc allocates a structure large enough for the union of
the features it knows about and the features the kernel implements,
the application could be built against kernel headers that expose
more features than glibc knows about. It would therefore need a
structure length check, which adds a branch on the fast path
if we dynamically allocate the tlabi structure.

A statically sized structure allows applications and libraries to
skip the pointer load, the NULL check, and the structure length check
on the user-space fast path.
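
As a rough illustration of the difference, the sketch below compares
the two fast paths. All identifiers (sk_tlabi, sk_static_area,
sk_dyn_area, sk_dyn_len, SK_F_CPU_ID) are hypothetical, and returning
-1 stands in for whatever slow path the caller would take; none of
this is the code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_F_CPU_ID (1u << 0)   /* assumed feature bit */

struct sk_tlabi {
	uint32_t features;
	uint32_t cpu_id;
	uint8_t reserved[56];
};

/* Static approach: the area itself is a TLS variable of known size. */
static _Thread_local struct sk_tlabi sk_static_area;

/* Dynamic approach: TLS only holds a pointer and a length. */
static _Thread_local struct sk_tlabi *sk_dyn_area;
static _Thread_local size_t sk_dyn_len;

/* Fast path with a static area: a single feature-mask test. */
static inline int32_t sk_cpu_id_static(void)
{
	if (!(sk_static_area.features & SK_F_CPU_ID))
		return -1;              /* caller takes its slow path */
	return (int32_t)sk_static_area.cpu_id;
}

/* Fast path with a dynamic area: pointer load, NULL check and
 * length check come before the same feature-mask test. */
static inline int32_t sk_cpu_id_dynamic(void)
{
	struct sk_tlabi *p = sk_dyn_area;       /* extra load */

	if (!p)                                 /* extra branch */
		return -1;
	if (sk_dyn_len < offsetof(struct sk_tlabi, cpu_id) + sizeof(p->cpu_id))
		return -1;                      /* extra branch */
	if (!(p->features & SK_F_CPU_ID))
		return -1;
	return (int32_t)p->cpu_id;
}
```

Both functions return the same value once the area is populated; the
dynamic variant simply pays for two more branches and one more load
on every call.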

Thanks,

Mathieu

> 
> Thanks,
> 
> Mathieu
> 
> 
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
