* [RFC, patch] i386: vgetcpu()
@ 2006-06-20 20:25 Chuck Ebbert
2006-06-20 20:57 ` Andi Kleen
0 siblings, 1 reply; 3+ messages in thread
From: Chuck Ebbert @ 2006-06-20 20:25 UTC (permalink / raw)
To: linux-kernel; +Cc: Andi Kleen, Linus Torvalds, Andrew Morton
Use the limit field of a GDT entry to store the current CPU
number for fast userspace access. This still leaves 12 bits
free for other information.
/* vgetcpu.c: get CPU number we are running on.
 * Prints -1 if error occurred.
 */
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char * const argv[])
{
	int cpu;

	asm (
		"       lsl     %1,%0   \n"
		"       jnz     2f      \n"
		"       and     $0xff,%0 \n"
		"2:                     \n"
		: "=&r" (cpu)
		: "r" ((27 << 3) | 3), "0" (-1)
	);

	printf("cpu: %d\n", cpu);

	return 0;
}
Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
--- 2.6.17-32.orig/arch/i386/kernel/cpu/common.c
+++ 2.6.17-32/arch/i386/kernel/cpu/common.c
@@ -642,6 +642,9 @@ void __cpuinit cpu_init(void)
 		((((__u64)stk16_off) << 32) & 0xff00000000000000ULL) |
 		(CPU_16BIT_STACK_SIZE - 1);
+	/* Set up GDT entry for per-cpu data */
+	*(__u64 *)(&gdt[27]) |= cpu;
+
 	cpu_gdt_descr->size = GDT_SIZE - 1;
 	cpu_gdt_descr->address = (unsigned long)gdt;
--- 2.6.17-32.orig/arch/i386/kernel/head.S
+++ 2.6.17-32/arch/i386/kernel/head.S
@@ -525,7 +525,13 @@ ENTRY(cpu_gdt_table)
 	.quad 0x004092000000ffff	/* 0xc8 APM DS    data */
 	.quad 0x0000920000000000	/* 0xd0 - ESPFIX 16-bit SS */
-	.quad 0x0000000000000000	/* 0xd8 - unused */
+
+	/*
+	 * Use a GDT entry to store per-cpu data for user space (DPL 3.)
+	 * 32-bit data segment, byte granularity, base 0, limit set at runtime.
+	 */
+	.quad 0x0040f20000000000	/* 0xd8 - for per-cpu user data */
+
 	.quad 0x0000000000000000	/* 0xe0 - unused */
 	.quad 0x0000000000000000	/* 0xe8 - unused */
 	.quad 0x0000000000000000	/* 0xf0 - unused */
--
Chuck
"You can't read a newspaper if you can't read." --George W. Bush
* Re: [RFC, patch] i386: vgetcpu()
2006-06-20 20:25 [RFC, patch] i386: vgetcpu() Chuck Ebbert
@ 2006-06-20 20:57 ` Andi Kleen
0 siblings, 0 replies; 3+ messages in thread
From: Andi Kleen @ 2006-06-20 20:57 UTC (permalink / raw)
To: Chuck Ebbert; +Cc: linux-kernel, Linus Torvalds, Andrew Morton
On Tuesday 20 June 2006 22:25, Chuck Ebbert wrote:
> Use the limit field of a GDT entry to store the current CPU
> number for fast userspace access. This still leaves 12 bits
> free for other information.
Nice trick. Maybe I'll even add that to the x86-64 implementation
if it's fast enough. Do you have numbers?
But it needs to be encapsulated in a wrapper I think. Just exposing
it to user space is the wrong way to do this.
-Andi
* Re: [RFC, patch] i386: vgetcpu()
@ 2006-06-20 22:40 Chuck Ebbert
0 siblings, 0 replies; 3+ messages in thread
From: Chuck Ebbert @ 2006-06-20 22:40 UTC (permalink / raw)
To: Andi Kleen; +Cc: Andrew Morton, Linus Torvalds, linux-kernel
In-Reply-To: <200606202257.16033.ak@suse.de>
On Tue, 20 Jun 2006 22:57:16 +0200, Andi Kleen wrote:
> On Tuesday 20 June 2006 22:25, Chuck Ebbert wrote:
> > Use the limit field of a GDT entry to store the current CPU
> > number for fast userspace access. This still leaves 12 bits
> > free for other information.
>
> Nice trick. Maybe I'll even add that to the x86-64 implementation
> if it's fast enough. Do you have numbers?
>
I got ~13 clocks on x86_64 and 21 on PII. Test program below.
> But it needs to be encapsulated in a wrapper I think. Just exposing
> it to user space is the wrong way to do this.
Well there's no real way to hide it, but something more is definitely
needed. A new vdso entry point is easy but how do you tell userspace
it's there? Does glibc look at the kernel version in the note?
/* test how fast lsl, test for success + mask result runs
 * leave DO_TEST undefined to measure overhead
 * 32-bit only (pushl %ds, "=A" for rdtsc): build with gcc -m32
 */
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>

#define rdtscll(t) asm volatile ("rdtsc" : "=A" (t))

#ifndef ITERS
#define ITERS 1000000
#endif

int main(int argc, char * const argv[])
{
	unsigned long long tsc1, tsc2;
	int count, cpu, junk;

	rdtscll(tsc1);
	/* volatile keeps the loop between the two rdtsc reads */
	asm volatile (
		"       pushl   %%ds    \n"
		"       popl    %2      \n"
		"1:                     \n"
#ifdef DO_TEST
		"       lsl     %2,%0   \n"
		"       jnz     2f      \n"
		"       and     $0xff,%0 \n"
#endif
		"       dec     %1      \n"
		"       jnz     1b      \n"
		"2:                     \n"
		: "=&r" (cpu), "=&r" (count), "=&r" (junk)
		: "1" (ITERS), "0" (-1)
	);
	rdtscll(tsc2);

	/* count is nonzero only if lsl failed and the loop bailed out */
	if (count == 0)
		printf("loops: %d, avg: %llu clocks\n",
		       ITERS, (tsc2 - tsc1) / ITERS);

	return 0;
}
--
Chuck
"You can't read a newspaper if you can't read." --George W. Bush