From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Ingo Molnar <mingo@elte.hu>
Cc: Arjan van de Ven <arjan@infradead.org>, Andi Kleen <ak@suse.de>,
Eric Dumazet <dada1@cosmosbay.com>,
akpm@osdl.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] i386-pda UP optimization
Date: Thu, 16 Nov 2006 16:24:53 -0800
Message-ID: <455D0155.9000305@goop.org>
In-Reply-To: <20061115190606.GB9303@elte.hu>
[-- Attachment #1: Type: text/plain, Size: 843 bytes --]
Ingo Molnar wrote:
> what point would there be in using it? It's not like the kernel could
> make use of the thread keyword anytime soon (it would need /all/
> architectures to support it) ...

The plan was to implement the x86 arch-specific percpu stuff to use it,
since it allows gcc better optimisation opportunities.

> and the kernel doesn't mind how the
> current per_cpu() primitives are implemented, via assembly or via C. In
> any case, it very much matters to see the precise cost of having the pda
> selector value in %gs versus %fs.
>

Hm, well, unfortunately for me, there is a small but distinct advantage
to using %fs rather than %gs (around 0-5ns per iteration). The notable
exception being the "AMD-K6(tm) 3D+ Processor", where %gs is about 25%
(15ns) faster.

I'll revise the patches to use %fs and resubmit.

J

[-- Attachment #2: results-mixed.txt --]
[-- Type: text/plain, Size: 3720 bytes --]
"Genuine Intel(R) CPU T2400 @ 1.83GHz" @1000Mhz (6,14,8):
ds=7b fs=0 gs=33 ldt=f gdt=3b CPUTIME
<none> with data selector: 0ns/iteration
fs with data selector: 26ns/iteration
gs with data selector: 30ns/iteration
<none> with LDT selector: 0ns/iteration
fs with LDT selector: 26ns/iteration
gs with LDT selector: 26ns/iteration
<none> with GDT selector: 0ns/iteration
fs with GDT selector: 26ns/iteration
gs with GDT selector: 30ns/iteration
"Intel(R) Pentium(R) 4 CPU 1.80GHz" @1817.9Mhz (15,2,4):
ds=7b fs=0 gs=33 ldt=f gdt=3b CPUTIME
<none> with data selector: 0ns/iteration
fs with data selector: 33ns/iteration
gs with data selector: 34ns/iteration
<none> with LDT selector: 0ns/iteration
fs with LDT selector: 43ns/iteration
gs with LDT selector: 52ns/iteration
<none> with GDT selector: 0ns/iteration
fs with GDT selector: 33ns/iteration
gs with GDT selector: 34ns/iteration
"Intel(R) Celeron(R) CPU 2.40GHz" @2394.47Mhz (15,2,9):
ds=7b fs=0 gs=33 ldt=f gdt=3b CPUTIME
<none> with data selector: 0ns/iteration
fs with data selector: 20ns/iteration
gs with data selector: 24ns/iteration
<none> with LDT selector: 0ns/iteration
fs with LDT selector: 21ns/iteration
gs with LDT selector: 26ns/iteration
<none> with GDT selector: 0ns/iteration
fs with GDT selector: 21ns/iteration
gs with GDT selector: 26ns/iteration
"Pentium 75 - 200" @166.206Mhz (5,2,12):
ds=7b fs=0 gs=33 ldt=f gdt=3b GTOD
<none> with data selector: 1ns/iteration
fs with data selector: 74ns/iteration
gs with data selector: 75ns/iteration
<none> with LDT selector: 1ns/iteration
fs with LDT selector: 74ns/iteration
gs with LDT selector: 75ns/iteration
<none> with GDT selector: 1ns/iteration
fs with GDT selector: 74ns/iteration
gs with GDT selector: 74ns/iteration
"AMD-K6(tm) 3D+ Processor" @451.105Mhz (5,9,1):
ds=7b fs=0 gs=33 ldt=f gdt=3b GTOD
<none> with data selector: 0ns/iteration
fs with data selector: 59ns/iteration
gs with data selector: 44ns/iteration
<none> with LDT selector: 0ns/iteration
fs with LDT selector: 59ns/iteration
gs with LDT selector: 44ns/iteration
<none> with GDT selector: 0ns/iteration
fs with GDT selector: 59ns/iteration
gs with GDT selector: 44ns/iteration
"AMD Athlon(tm) XP 3000+" @2162.74Mhz (6,10,0):
ds=7b fs=0 gs=33 ldt=f gdt=3b CPUTIME
<none> with data selector: 0ns/iteration
fs with data selector: 10ns/iteration
gs with data selector: 11ns/iteration
<none> with LDT selector: 0ns/iteration
fs with LDT selector: 11ns/iteration
gs with LDT selector: 11ns/iteration
<none> with GDT selector: 0ns/iteration
fs with GDT selector: 11ns/iteration
gs with GDT selector: 11ns/iteration
"AMD Athlon(tm) 64 Processor 3500+" @2210.23Mhz (15,31,0):
ds=2b fs=0 gs=63 ldt=f gdt=6b GTOD
<none> with data selector: 0ns/iteration
fs with data selector: 11ns/iteration
gs with data selector: 11ns/iteration
<none> with LDT selector: 0ns/iteration
fs with LDT selector: 10ns/iteration
gs with LDT selector: 11ns/iteration
<none> with GDT selector: 0ns/iteration
fs with GDT selector: 10ns/iteration
gs with GDT selector: 11ns/iteration
"Pentium III (Coppermine)" @700Mhz (6,8,6):
ds=7b fs=0 gs=33 ldt=f gdt=3b CPUTIME
<none> with data selector: 0ns/iteration
fs with data selector: 38ns/iteration
gs with data selector: 45ns/iteration
<none> with LDT selector: 0ns/iteration
fs with LDT selector: 39ns/iteration
gs with LDT selector: 41ns/iteration
<none> with GDT selector: 0ns/iteration
fs with GDT selector: 39ns/iteration
gs with GDT selector: 44ns/iteration
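
The fs-versus-gs gap in the results above can be tabulated with a short script. This is only a sketch: the timings are transcribed by hand from results-mixed.txt, and the names RESULTS, KINDS and gs_minus_fs are made up for illustration; they are not part of the original benchmark.

```python
# fs/gs timings in ns per iteration, transcribed from results-mixed.txt.
# Each CPU maps to three (fs, gs) pairs, in the order:
# data selector, LDT selector, GDT selector.
RESULTS = {
    "Genuine Intel(R) CPU T2400 @ 1.83GHz": [(26, 30), (26, 26), (26, 30)],
    "Intel(R) Pentium(R) 4 CPU 1.80GHz": [(33, 34), (43, 52), (33, 34)],
    "Intel(R) Celeron(R) CPU 2.40GHz": [(20, 24), (21, 26), (21, 26)],
    "Pentium 75 - 200": [(74, 75), (74, 75), (74, 74)],
    "AMD-K6(tm) 3D+ Processor": [(59, 44), (59, 44), (59, 44)],
    "AMD Athlon(tm) XP 3000+": [(10, 11), (11, 11), (11, 11)],
    "AMD Athlon(tm) 64 Processor 3500+": [(11, 11), (10, 11), (10, 11)],
    "Pentium III (Coppermine)": [(38, 45), (39, 41), (39, 44)],
}
KINDS = ("data", "LDT", "GDT")


def gs_minus_fs(results):
    """Return {cpu: [gs - fs per selector kind]}; positive means %fs is faster."""
    return {cpu: [gs - fs for fs, gs in rows] for cpu, rows in results.items()}


if __name__ == "__main__":
    for cpu, deltas in gs_minus_fs(RESULTS).items():
        cells = ", ".join(f"{kind}: {d:+d}ns" for kind, d in zip(KINDS, deltas))
        print(f"{cpu:40s} {cells}")
```

On these figures the K6 is the only CPU where %gs wins (15ns in every row, about 25% of the 59ns %fs time), and the largest gap in favour of %fs is 9ns, on the Pentium 4 with an LDT selector.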