From: Mathieu Desnoyers <mathieu.desnoyers-vg+e7yoeK/dWk0Htik3J/w@public.gmane.org>
To: Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
Cc: Florian Weimer <fweimer-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
"H. Peter Anvin" <hpa-YMNOUZJC4hwAvxtiuMwx3w@public.gmane.org>,
Andrew Morton
<akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
Russell King <linux-lFZ/pmaqli7XmaaqVzeoHQ@public.gmane.org>,
Thomas Gleixner <tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>,
Ingo Molnar <mingo-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-api <linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
Paul Turner <pjt-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
Andrew Hunter <ahh-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>,
Andi Kleen <andi-Vw/NltI1exuRpAAqCnN02g@public.gmane.org>,
Dave Watson <davejwatson-b10kYP2dOMg@public.gmane.org>,
Chris Lameter <cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org>,
Ben Maurer <bmaurer-b10kYP2dOMg@public.gmane.org>,
rostedt <rostedt-nx8X9YLhiw1AfugRpC6u6w@public.gmane.org>,
"Paul E. McKenney"
<paulmck-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>,
Josh Triplett <josh-iaAMLnmF4UmaiuxdJuQwMA@public.gmane.org>,
Linus Torvalds
<torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
Catalin Marinas <catalin.marinas-5wv7dgnIgG8@public.gmane.org>,
Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org>,
Michael Kerrisk
<mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
Boqun Feng <boqun.feng-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Subject: Re: [RFC PATCH v6 1/5] Thread-local ABI system call: cache CPU number of running thread
Date: Thu, 7 Apr 2016 15:59:10 +0000 (UTC)
Message-ID: <966747921.48958.1460044750563.JavaMail.zimbra@efficios.com>
In-Reply-To: <20160407122528.GS3430-ndre7Fmf5hadTX5a5knrm8zTDFooKrT+cvkQGrU6aU0@public.gmane.org>
----- On Apr 7, 2016, at 8:25 AM, Peter Zijlstra peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org wrote:
> On Thu, Apr 07, 2016 at 02:03:53PM +0200, Florian Weimer wrote:
>> > struct tlabi {
>> > union {
>> > __u8[64] __foo;
>> > struct {
>> > /* fields go here */
>> > };
>> > };
>> > } __aligned__(64);
>>
>> That's not really “fixed size” as far as an ABI is concerned, due to the
>> possibility of future extensions.
>
> sizeof(struct tlabi) is always the same, right? How is that not fixed?
>
>> > People objected against the fixed size scheme, but it being possible to
>> > get a fixed TCB offset and reduce indirections is a big win IMO.
>>
>> It's a difficult trade-off. It's not an indirection as such, it's avoid
>> loading the dynamic TLS offset.
>
> What we _want_ is being able to use %[gf]s:offset and have it work (I
> forever forget which segment register userspace TLS uses).
>
>> Let me repeat that the ELF TLS GNU ABI has very limited support for
>> static offsets at present, and it is difficult to make them available
>> more widely without code generation at run time (in the form of text
>> relocations, but still).
>
> Do you have a pointer to something I can read? Because I'm clearly not
> understanding the full issue here.
For what it is worth, here are a couple of objdump snippets from my
test program, built without and with -fPIC:
* Compiled with -O2, *without* -fPIC, x86-64:
__thread __attribute__((weak)) volatile struct thread_local_abi __thread_local_abi;

static
int32_t read_cpu_id(void)
{
        if (unlikely(!(__thread_local_abi.features & TLABI_FEATURE_CPU_ID)))
  40064e:       64 8b 04 25 c0 ff ff    mov    %fs:0xffffffffffffffc0,%eax
  400655:       ff
  400656:       a8 01                   test   $0x1,%al
  400658:       74 71                   je     4006cb <main+0xab>
                return sched_getcpu();
        return __thread_local_abi.cpu_id;
  40065a:       64 8b 14 25 c4 ff ff    mov    %fs:0xffffffffffffffc4,%edx
  400661:       ff
}
* Compiled with -O2, *with* -fPIC, x86-64:
__thread __attribute__((weak)) volatile struct thread_local_abi __thread_local_abi;
  4006de:       64 48 8b 04 25 00 00    mov    %fs:0x0,%rax
  4006e5:       00 00

static
int32_t read_cpu_id(void)
{
        if (unlikely(!(__thread_local_abi.features & TLABI_FEATURE_CPU_ID)))
  4006e7:       48 8d 80 c0 ff ff ff    lea    -0x40(%rax),%rax
  4006ee:       8b 10                   mov    (%rax),%edx
  4006f0:       83 e2 01                and    $0x1,%edx
  4006f3:       0f 84 80 00 00 00       je     400779 <main+0xc9>
                return sched_getcpu();
        return __thread_local_abi.cpu_id;
  4006f9:       8b 50 04                mov    0x4(%rax),%edx
}
So with -fPIC (libraries), TLS adds an extra indirection: the thread
pointer is first loaded from %fs:0x0. However, the code only needs to
load that base address once, and can then access both the "features"
and "cpu_id" fields as offsets from it. For executables compiled
without -fPIC, there is no indirection: each field is read directly
at a fixed %fs-relative offset.
This justifies the following paragraph in the proposed man page:
The symbol __thread_local_abi is recommended to be used across
libraries and applications wishing to register the thread-local
ABI structure for tlabi_nr 0. The attribute "weak" is recommended
when declaring this variable in libraries. Applications can
choose to define their own version of this symbol without the weak
attribute as a performance improvement.
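For readers who want to reproduce the listings, here is a minimal sketch of
what such a test program could look like. The struct layout is an assumption
inferred from the disassembly (features at offset 0, cpu_id at offset 4), the
flag value comes from the "test $0x1" instruction, and
tlabi_simulate_registration() is a hypothetical stand-in for the kernel
updating the registered area; it is not part of the proposed ABI.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout, inferred from the offsets seen in the disassembly. */
struct thread_local_abi {
	uint32_t features;	/* bitmask of enabled features */
	uint32_t cpu_id;	/* CPU number cache */
};

/* Assumed value: the listings test bit 0 of "features". */
#define TLABI_FEATURE_CPU_ID	(1U << 0)

_Static_assert(offsetof(struct thread_local_abi, features) == 0,
	       "features expected at offset 0");
_Static_assert(offsetof(struct thread_local_abi, cpu_id) == 4,
	       "cpu_id expected at offset 4");

/* Weak TLS definition, as the man page text recommends for libraries. */
__thread __attribute__((weak)) volatile struct thread_local_abi __thread_local_abi;

/* Fast path reads the cached value; slow path falls back to sched_getcpu(). */
static int32_t read_cpu_id(void)
{
	if (__builtin_expect(!(__thread_local_abi.features & TLABI_FEATURE_CPU_ID), 0))
		return sched_getcpu();
	return (int32_t)__thread_local_abi.cpu_id;
}

/* Hypothetical helper: mimics what a registered area would look like. */
static void tlabi_simulate_registration(uint32_t cpu)
{
	__thread_local_abi.cpu_id = cpu;
	__thread_local_abi.features |= TLABI_FEATURE_CPU_ID;
}
```

Building this with "gcc -O2" and again with "gcc -O2 -fPIC", then inspecting
it with "objdump -S", should show the same difference in how the two fields
are addressed, although exact addresses and register choices will vary with
the compiler version.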
Thoughts?
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com