From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
	id 1NEsGw-0005bY-94
	for qemu-devel@nongnu.org; Sun, 29 Nov 2009 17:29:38 -0500
Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
	id 1NEsGq-0005b6-Mx
	for qemu-devel@nongnu.org; Sun, 29 Nov 2009 17:29:37 -0500
Received: from [199.232.76.173] (port=43885 helo=monty-python.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1NEsGq-0005ax-H4
	for qemu-devel@nongnu.org; Sun, 29 Nov 2009 17:29:32 -0500
Received: from mail2.shareable.org ([80.68.89.115]:45610)
	by monty-python.gnu.org with esmtps (TLS-1.0:RSA_AES_256_CBC_SHA1:32)
	(Exim 4.60) (envelope-from <jamie@shareable.org>) id 1NEsGq-00021w-5K
	for qemu-devel@nongnu.org; Sun, 29 Nov 2009 17:29:32 -0500
Date: Sun, 29 Nov 2009 22:29:24 +0000
From: Jamie Lokier <jamie@shareable.org>
Subject: Re: [Qemu-devel] [PATCH 2/7] store thread-specific env information
Message-ID: <20091129222924.GA12299@shareable.org>
References: <1259256300-23937-1-git-send-email-glommer@redhat.com>
	<1259256300-23937-2-git-send-email-glommer@redhat.com>
	<1259256300-23937-3-git-send-email-glommer@redhat.com>
	<4B129372.1070204@redhat.com>
	<5E6C2888-0B2C-4BBE-A0E6-B9ECAB50F5F0@web.de>
	<4B129661.1000808@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <4B129661.1000808@redhat.com>
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/pipermail/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Avi Kivity <avi@redhat.com>
Cc: qemu-devel@nongnu.org, Andreas =?iso-8859-1?Q?F=E4rber?= <andreas.faerber@web.de>, Glauber Costa <glommer@redhat.com>, aliguori@us.ibm.com

> On 11/29/2009 05:38 PM, Andreas Färber wrote:
>> Am 29.11.2009 um 16:29 schrieb Avi Kivity:
>>> Where is __thread not supported?
>> Apple, Sun.

Some flavours of uClinux :-)

Avi Kivity wrote:
> Well, pthread_getspecific is around 130 bytes of code, whereas __thread 
> is just one instruction.  Maybe we should support both.

Supporting both is easy enough; they are quite similar.  The main
difference is that pthread_key_create lets you provide a destructor,
which is called as each thread is destroyed.  Unfortunately there is
no matching constructor for new threads.  If you need a destructor and
speed together, you can use both methods at once.
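
A minimal sketch of using both methods at once, assuming GCC or Clang
on a POSIX system.  All names here (thread_state, state_get,
run_workers) are made up for illustration.  The __thread pointer is
the fast path; the pthread key exists only so the destructor fires.
Since there is no constructor hook, the state is allocated lazily on
first use:

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-thread state; illustrative only. */
struct thread_state { int counter; };

static pthread_key_t state_key;
static pthread_once_t state_once = PTHREAD_ONCE_INIT;
static __thread struct thread_state *state_fast;   /* fast-path cache */
static int destroyed_count;                        /* bumped by the destructor */
static pthread_mutex_t destroyed_lock = PTHREAD_MUTEX_INITIALIZER;

/* Called by the pthread library at thread exit if the value is non-NULL. */
static void state_destroy(void *p)
{
    free(p);
    pthread_mutex_lock(&destroyed_lock);
    destroyed_count++;
    pthread_mutex_unlock(&destroyed_lock);
}

static void state_key_init(void)
{
    int rc = pthread_key_create(&state_key, state_destroy);
    assert(rc == 0);
    (void)rc;
}

/* No constructor exists for new threads, so allocate on first use. */
static struct thread_state *state_get(void)
{
    if (state_fast == NULL) {
        pthread_once(&state_once, state_key_init);
        state_fast = calloc(1, sizeof(*state_fast));
        assert(state_fast != NULL);
        /* Register with the key only so the destructor runs. */
        pthread_setspecific(state_key, state_fast);
    }
    return state_fast;
}

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 1000; i++)
        state_get()->counter++;
    return (void *)(intptr_t)state_get()->counter;
}

/* Runs n threads (n <= 16); returns total destructor calls seen so far. */
int run_workers(int n)
{
    pthread_t tids[16];
    assert(n >= 0 && n <= 16);
    for (int i = 0; i < n; i++) {
        int rc = pthread_create(&tids[i], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < n; i++) {
        void *ret;
        pthread_join(tids[i], &ret);
        /* Each thread only ever saw its own counter. */
        assert((intptr_t)ret == 1000);
    }
    pthread_mutex_lock(&destroyed_lock);
    int seen = destroyed_count;
    pthread_mutex_unlock(&destroyed_lock);
    return seen;
}
```

Calling run_workers(4) once from main() should report four destructor
calls, one per exited thread.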

__thread is not always one instruction - access is more complicated in
shared libraries - but it's always close to that.

Anyway, I decided to measure them both as I wondered about this for
another program.

On my 2.0GHz Core Duo (32-bit), tight unrolled loop, everything in cache:

     Read void *__thread variable        ~ 0.6 ns
     Call pthread_getspecific(key)       ~ 8.8 ns

__thread is preferable (roughly 15 times faster here), but in absolute
terms it's not much overhead to call pthread_getspecific().
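
For anyone who wants to repeat the measurement, here is a rough sketch
of such a loop.  It is not the program that produced the numbers
above: the function names are hypothetical, the loop is not manually
unrolled, and the results will depend heavily on CPU, compiler and
flags.  The volatile qualifiers are there only to stop the compiler
from hoisting the __thread read out of the loop:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

/* volatile forces one real load per iteration */
static __thread void *volatile tls_slot;
static pthread_key_t bench_key;
static pthread_once_t bench_once = PTHREAD_ONCE_INIT;
static void *volatile sink;            /* keeps each result observable */
static int dummy;                      /* something to point at */

static void bench_key_init(void)
{
    int rc = pthread_key_create(&bench_key, NULL);
    assert(rc == 0);
    (void)rc;
}

static double now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1e9 + ts.tv_nsec;
}

/* Average nanoseconds per read of a __thread variable (iters > 0). */
double bench_tls(long iters)
{
    tls_slot = &dummy;
    double t0 = now_ns();
    for (long i = 0; i < iters; i++)
        sink = tls_slot;
    double t1 = now_ns();
    assert(sink == (void *)&dummy);
    return (t1 - t0) / iters;
}

/* Average nanoseconds per pthread_getspecific() call (iters > 0). */
double bench_getspecific(long iters)
{
    pthread_once(&bench_once, bench_key_init);
    pthread_setspecific(bench_key, &dummy);
    double t0 = now_ns();
    for (long i = 0; i < iters; i++)
        sink = pthread_getspecific(bench_key);
    double t1 = now_ns();
    assert(sink == (void *)&dummy);
    return (t1 - t0) / iters;
}
```

Call each function with a few million iterations and print the two
results to compare them on your own machine.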

IMHO it's not worth making the code less portable or more complicated
just to handle both, but supporting both is a nice touch.

However, I did notice that the compiler optimises away references to
__thread variables much better, such as hoisting from inside loops.

In my programs I have taken to wrapping everything inside a
thread_specific(var) macro, similar to the one in the kernel, which
expands to either a pthread_getspecific() call or a __thread
access[*].  That keeps the complexity in one place: the macro
definition.

( [*] - Windows has __thread, but it sometimes crashes when used in a
DLL, so I use the Windows equivalent of pthread_getspecific() in the
same wrapper macro, which is fine. )
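
A cut-down sketch of what such a wrapper can look like.  This is an
illustration, not the macro from my programs: the
DEFINE_THREAD_SPECIFIC helper and the THREAD_SPECIFIC_USE_PTHREAD
switch are invented names, and the Windows branch is left out.  In
both variants thread_specific(var) is an lvalue, so callers do not
care which method is underneath:

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

#ifndef THREAD_SPECIFIC_USE_PTHREAD   /* hypothetical config switch */

/* Fast variant: a plain __thread variable. */
#define DEFINE_THREAD_SPECIFIC(type, var) static __thread type var##_tls
#define thread_specific(var) (var##_tls)

#else

/* Portable variant: zero-initialised storage, allocated on first use
 * in each thread and released by the key's destructor. */
#define DEFINE_THREAD_SPECIFIC(type, var)                        \
    static pthread_key_t var##_key;                              \
    static pthread_once_t var##_once = PTHREAD_ONCE_INIT;        \
    static void var##_key_init(void)                             \
    {                                                            \
        int rc = pthread_key_create(&var##_key, free);           \
        assert(rc == 0);                                         \
        (void)rc;                                                \
    }                                                            \
    static type *var##_ptr(void)                                 \
    {                                                            \
        pthread_once(&var##_once, var##_key_init);               \
        type *p = pthread_getspecific(var##_key);                \
        if (p == NULL) {                                         \
            p = calloc(1, sizeof(*p));                           \
            assert(p != NULL);                                   \
            pthread_setspecific(var##_key, p);                   \
        }                                                        \
        return p;                                                \
    }                                                            \
    static type *var##_ptr(void)  /* absorbs the caller's ';' */
#define thread_specific(var) (*var##_ptr())

#endif

DEFINE_THREAD_SPECIFIC(int, hits);

static void *count_hits(void *arg)
{
    int n = (int)(intptr_t)arg;
    for (int i = 0; i < n; i++)
        thread_specific(hits)++;
    return (void *)(intptr_t)thread_specific(hits);
}

/* Counts n hits in a fresh thread and returns that thread's total. */
int hits_in_new_thread(int n)
{
    pthread_t tid;
    void *ret;

    thread_specific(hits) = 12345;    /* caller's copy must not leak */
    int rc = pthread_create(&tid, NULL, count_hits, (void *)(intptr_t)n);
    assert(rc == 0);
    (void)rc;
    pthread_join(tid, &ret);
    assert(thread_specific(hits) == 12345);
    return (int)(intptr_t)ret;
}
```

Building with -DTHREAD_SPECIFIC_USE_PTHREAD switches the same calling
code to the pthread_getspecific() path; hits_in_new_thread(7) should
return 7 either way.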

-- Jamie