From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Ingo Molnar <mingo@elte.hu>
Cc: Arjan van de Ven <arjan@infradead.org>, Andi Kleen <ak@suse.de>,
Eric Dumazet <dada1@cosmosbay.com>,
akpm@osdl.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] i386-pda UP optimization
Date: Thu, 16 Nov 2006 16:24:53 -0800
Message-ID: <455D0155.9000305@goop.org>
In-Reply-To: <20061115190606.GB9303@elte.hu>
Ingo Molnar wrote:
> what point would there be in using it? It's not like the kernel could
> make use of the __thread keyword anytime soon (it would need /all/
> architectures to support it) ...
The plan was to implement the x86 arch-specific percpu stuff to use it,
since it allows gcc better optimisation opportunities.
> and the kernel doesn't mind how the
> current per_cpu() primitives are implemented, via assembly or via C. In
> any case, it very much matters to see the precise cost of having the pda
> selector value in %gs versus %fs.
>
Hm, well, unfortunately for me, there is a small but distinct advantage
to using %fs rather than %gs (around 0-5ns per iteration on most CPUs,
and up to 9ns on the Pentium 4 with an LDT selector). The notable
exception is the "AMD-K6(tm) 3D+ Processor", where %gs is about 25%
(15ns) faster.
I'll revise the patches to use %fs and resubmit.
J
[-- Attachment #2: results-mixed.txt --]
[-- Type: text/plain, Size: 3720 bytes --]
"Genuine Intel(R) CPU T2400 @ 1.83GHz" @1000Mhz (6,14,8):
ds=7b fs=0 gs=33 ldt=f gdt=3b CPUTIME
<none> with data selector: 0ns/iteration
fs with data selector: 26ns/iteration
gs with data selector: 30ns/iteration
<none> with LDT selector: 0ns/iteration
fs with LDT selector: 26ns/iteration
gs with LDT selector: 26ns/iteration
<none> with GDT selector: 0ns/iteration
fs with GDT selector: 26ns/iteration
gs with GDT selector: 30ns/iteration
"Intel(R) Pentium(R) 4 CPU 1.80GHz" @1817.9Mhz (15,2,4):
ds=7b fs=0 gs=33 ldt=f gdt=3b CPUTIME
<none> with data selector: 0ns/iteration
fs with data selector: 33ns/iteration
gs with data selector: 34ns/iteration
<none> with LDT selector: 0ns/iteration
fs with LDT selector: 43ns/iteration
gs with LDT selector: 52ns/iteration
<none> with GDT selector: 0ns/iteration
fs with GDT selector: 33ns/iteration
gs with GDT selector: 34ns/iteration
"Intel(R) Celeron(R) CPU 2.40GHz" @2394.47Mhz (15,2,9):
ds=7b fs=0 gs=33 ldt=f gdt=3b CPUTIME
<none> with data selector: 0ns/iteration
fs with data selector: 20ns/iteration
gs with data selector: 24ns/iteration
<none> with LDT selector: 0ns/iteration
fs with LDT selector: 21ns/iteration
gs with LDT selector: 26ns/iteration
<none> with GDT selector: 0ns/iteration
fs with GDT selector: 21ns/iteration
gs with GDT selector: 26ns/iteration
"Pentium 75 - 200" @166.206Mhz (5,2,12):
ds=7b fs=0 gs=33 ldt=f gdt=3b GTOD
<none> with data selector: 1ns/iteration
fs with data selector: 74ns/iteration
gs with data selector: 75ns/iteration
<none> with LDT selector: 1ns/iteration
fs with LDT selector: 74ns/iteration
gs with LDT selector: 75ns/iteration
<none> with GDT selector: 1ns/iteration
fs with GDT selector: 74ns/iteration
gs with GDT selector: 74ns/iteration
"AMD-K6(tm) 3D+ Processor" @451.105Mhz (5,9,1):
ds=7b fs=0 gs=33 ldt=f gdt=3b GTOD
<none> with data selector: 0ns/iteration
fs with data selector: 59ns/iteration
gs with data selector: 44ns/iteration
<none> with LDT selector: 0ns/iteration
fs with LDT selector: 59ns/iteration
gs with LDT selector: 44ns/iteration
<none> with GDT selector: 0ns/iteration
fs with GDT selector: 59ns/iteration
gs with GDT selector: 44ns/iteration
"AMD Athlon(tm) XP 3000+" @2162.74Mhz (6,10,0):
ds=7b fs=0 gs=33 ldt=f gdt=3b CPUTIME
<none> with data selector: 0ns/iteration
fs with data selector: 10ns/iteration
gs with data selector: 11ns/iteration
<none> with LDT selector: 0ns/iteration
fs with LDT selector: 11ns/iteration
gs with LDT selector: 11ns/iteration
<none> with GDT selector: 0ns/iteration
fs with GDT selector: 11ns/iteration
gs with GDT selector: 11ns/iteration
"AMD Athlon(tm) 64 Processor 3500+" @2210.23Mhz (15,31,0):
ds=2b fs=0 gs=63 ldt=f gdt=6b GTOD
<none> with data selector: 0ns/iteration
fs with data selector: 11ns/iteration
gs with data selector: 11ns/iteration
<none> with LDT selector: 0ns/iteration
fs with LDT selector: 10ns/iteration
gs with LDT selector: 11ns/iteration
<none> with GDT selector: 0ns/iteration
fs with GDT selector: 10ns/iteration
gs with GDT selector: 11ns/iteration
"Pentium III (Coppermine)" @700Mhz (6,8,6):
ds=7b fs=0 gs=33 ldt=f gdt=3b CPUTIME
<none> with data selector: 0ns/iteration
fs with data selector: 38ns/iteration
gs with data selector: 45ns/iteration
<none> with LDT selector: 0ns/iteration
fs with LDT selector: 39ns/iteration
gs with LDT selector: 41ns/iteration
<none> with GDT selector: 0ns/iteration
fs with GDT selector: 39ns/iteration
gs with GDT selector: 44ns/iteration
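
The fs-versus-gs differences quoted in the message can be recomputed from the figures above. The following is a minimal sketch, not part of the original benchmark: the (fs, gs) pairs are copied by hand from results-mixed.txt, and the names RESULTS, gs_penalty and summarize are made up for illustration.

```python
# (fs, gs) timings in ns/iteration for the data, LDT and GDT selector runs,
# copied by hand from results-mixed.txt. CPU names are shortened.
RESULTS = {
    "Intel T2400": [(26, 30), (26, 26), (26, 30)],
    "Pentium 4 1.80GHz": [(33, 34), (43, 52), (33, 34)],
    "Celeron 2.40GHz": [(20, 24), (21, 26), (21, 26)],
    "Pentium 75 - 200": [(74, 75), (74, 75), (74, 74)],
    "AMD-K6 3D+": [(59, 44), (59, 44), (59, 44)],
    "Athlon XP 3000+": [(10, 11), (11, 11), (11, 11)],
    "Athlon 64 3500+": [(11, 11), (10, 11), (10, 11)],
    "Pentium III (Coppermine)": [(38, 45), (39, 41), (39, 44)],
}


def gs_penalty(runs):
    """Return gs - fs in ns for each run; a positive value means %fs is faster."""
    return [gs - fs for fs, gs in runs]


def summarize(results):
    """Map each CPU name to its list of gs - fs deltas (data, LDT, GDT)."""
    return {cpu: gs_penalty(runs) for cpu, runs in results.items()}


if __name__ == "__main__":
    for cpu, deltas in summarize(RESULTS).items():
        print(f"{cpu:26s} gs-fs (data, LDT, GDT): {deltas} ns")
```

On these figures most deltas fall between 0 and 5ns in favour of %fs. The larger gaps are the Pentium 4 with an LDT selector (9ns) and the Pentium III with a data selector (7ns), and the AMD-K6 is the only CPU where %gs wins (by 15ns in all three runs).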