* [PATCH RFC 0/6] Implement per-processor data areas for i386.
@ 2006-08-27  8:44 Jeremy Fitzhardinge
  2006-08-27  8:44 ` [PATCH RFC 1/6] Basic definitions for i386-pda Jeremy Fitzhardinge
                   ` (8 more replies)
  0 siblings, 9 replies; 44+ messages in thread
From: Jeremy Fitzhardinge @ 2006-08-27  8:44 UTC
  To: linux-kernel
  Cc: Chuck Ebbert, Zachary Amsden, Jan Beulich, Andi Kleen,
	Andrew Morton

This patch implements per-processor data areas by using %gs as the
base segment of the per-processor memory.  This has two principal
advantages:

- It allows very simple direct access to per-processor data by
  using an effective address of the form %gs:offset, where offset is
  the offset into struct i386_pda.  These sequences are faster and
  smaller than the current mechanism using current_thread_info().

- It also allows per-CPU data to be allocated as each CPU is brought
  up, rather than statically allocating it based on the maximum number
  of CPUs which could be brought up.
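
For readers who want the two lookups side by side, here is a minimal
user-space sketch.  It assumes an 8 KiB kernel stack with thread_info at
its lowest address, which is how current_thread_info() finds it by
masking the stack pointer.  Every name below (thread_union layout,
pda_sketch, the helper functions) is hypothetical and not taken from the
patch series.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 8 KiB kernel stacks, thread_info at the lowest address. */
#define THREAD_SIZE 8192u

struct task_struct { int pid; };

struct thread_info {
	struct task_struct *task;
	int cpu;
};

/* Stack and thread_info share one THREAD_SIZE-aligned block. */
union thread_union {
	struct thread_info info;
	_Alignas(THREAD_SIZE) unsigned char stack[THREAD_SIZE];
};

/* Existing scheme: mask the stack pointer down to the block start. */
static struct thread_info *thread_info_from_sp(uintptr_t sp)
{
	return (struct thread_info *)(sp & ~(uintptr_t)(THREAD_SIZE - 1));
}

static struct task_struct *current_via_stack(uintptr_t sp)
{
	struct thread_info *ti = thread_info_from_sp(sp);

	assert(ti != 0);
	return ti->task;	/* mask, then one dependent load */
}

/* PDA scheme (hypothetical layout): one load at a fixed offset. */
struct pda_sketch {
	struct task_struct *pcurrent;
	int cpu_number;
};

static struct task_struct *current_via_pda(const struct pda_sketch *pda)
{
	return pda->pcurrent;	/* stands in for: mov %gs:PDA_current, %reg */
}
```

The stack-based path needs a mask plus a dependent load; the PDA path is
a single load relative to a base that the segment register supplies.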

I haven't measured performance yet, but when using the PDA for "current"
and "smp_processor_id", I see a 5715 byte reduction in .text segment
size for my kernel.

Unfortunately, these patches don't actually work yet.  I'm not sure why;
I'm hoping review will turn something up.


Some background for people unfamiliar with x86 segmentation:

This uses x86 segmentation in a way similar to how NPTL implements
Thread-Local Storage.  It relies on the fact that each CPU has its own
Global Descriptor Table (GDT), which is basically an array of
base-length pairs (with some extra flags).  When a segment register
is loaded with a selector (approximately, an index into the GDT), and
you use that segment register for a memory access, the base of the
selected entry is added to the address, and the resulting address is
used.

In other words, if you imagine the GDT containing an entry:
	Index	Base
	123:	0xc0211000 (allocated PDA)
and you load %gs with this selector:
	mov $123, %gs
and then use GS later on:
	mov %gs:4, %eax
This has the effect of
	mov 0xc0211004, %eax
and because the GDT is per-CPU, the base (= 0xc0211000 = memory
allocated for this CPU's PDA) can be a CPU-specific value while leaving
everything else constant.

This means that something like "current" or "smp_processor_id()" can
collapse to a single instruction:
	mov %gs:PDA_current, %reg
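
The address arithmetic described above can be modelled in plain C.
This is only a sketch under the same simplification as the text: the
value in %gs is treated as a direct GDT index, whereas a real selector
also carries table-indicator and privilege bits.  The struct and
function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define GDT_ENTRIES 128

/* Simplified descriptor: just the base-length pair. */
struct gdt_entry {
	uint32_t base;
	uint32_t limit;
};

/* Each CPU has its own GDT and its own %gs value. */
struct cpu_sketch {
	struct gdt_entry gdt[GDT_ENTRIES];
	uint16_t gs;	/* simplified: used directly as a GDT index */
};

/* Linear address produced by a %gs:offset access on this CPU. */
static uint32_t gs_linear(const struct cpu_sketch *cpu, uint32_t offset)
{
	assert(cpu->gs < GDT_ENTRIES);

	const struct gdt_entry *e = &cpu->gdt[cpu->gs];

	assert(offset <= e->limit);	/* hardware would fault instead */
	return e->base + offset;
}
```

With entry 123 of one CPU's GDT set to base 0xc0211000, an access at
offset 4 resolves to 0xc0211004, matching the example.  A second CPU
whose entry 123 holds a different base resolves the identical
instruction to its own PDA.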


TODO: 
- Make it work.  It works UP on a test QEMU machine, but it doesn't
  yet work on real hardware, or SMP (though not working SMP on QEMU is
  more likely to be a QEMU problem).  Not sure what the problem is yet;
  I'm hoping review will reveal something.
- Measure performance impact.  The patch adds a segment register
  save/restore on entry/exit to the kernel.  This expense should be
  offset by savings in using the PDA while in the kernel, but I haven't
  measured this yet.  Space savings are already appealing though.
- Modify more things to use the PDA.  The more that uses it, the more
  the cost of the %gs save/restore is amortized.  smp_processor_id and
  current are the obvious first choices, which are implemented in this
  series.
- Make it a config option?  UP systems don't need to do any of this,
  other than having a single pre-allocated PDA.  Unfortunately, it gets
  a bit messy to do this given the changes needed in handling %gs.
--


* Re: [PATCH RFC 0/6] Implement per-processor data areas for i386.
@ 2006-08-28  9:06 Chuck Ebbert
  0 siblings, 0 replies; 44+ messages in thread
From: Chuck Ebbert @ 2006-08-28  9:06 UTC
  To: Andreas Mohr
  Cc: Andrew Morton, Andi Kleen, Jan Beulich, Zachary Amsden,
	linux-kernel

In-Reply-To: <20060827172155.GA21724@rhlx01.fht-esslingen.de>

On Sun, 27 Aug 2006 19:21:55 +0200, Andreas Mohr wrote:

> Something like that had to be done eventually about the inefficient
> current_thread_info() mechanism, but I wasn't sure what exactly.

In 2.6.18 it's done in C and the optimizer does a pretty good job
with it in recent compilers.

-- 
Chuck


* [PATCH RFC 0/6] Implement per-processor data areas for i386.
@ 2006-08-30  9:00 Chuck Ebbert
  2006-08-30  9:17 ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 44+ messages in thread
From: Chuck Ebbert @ 2006-08-30  9:00 UTC
  To: Jeremy Fitzhardinge
  Cc: Andrew Morton, Andi Kleen, Jan Beulich, Zachary Amsden,
	linux-kernel

In-Reply-To: <20060827084417.918992193@goop.org>

On Sun, 27 Aug 2006 01:44:17 -0700, Jeremy Fitzhardinge wrote:

> This patch implements per-processor data areas by using %gs as the
> base segment of the per-processor memory.

This changes the ABI for signals and ptrace() and that seems like
a bad idea to me.

And the way things are done now is so ingrained into the i386
kernel that I'm not sure it can be done.  E.g. I found two
open-coded implementations of current, one in kernel_fpu_begin()
and one in math_state_restore().

> - It also allows per-CPU data to be allocated as each CPU is brought
>   up, rather than statically allocating it based on the maximum number
>   of CPUs which could be brought up.

Can you describe what it is about the way things work now that
prevents dynamic allocation?

-- 
Chuck

* Re: [PATCH RFC 0/6] Implement per-processor data areas for i386.
@ 2006-08-30 12:33 Chuck Ebbert
  2006-08-30 12:54 ` Andi Kleen
  2006-08-30 16:39 ` Jeremy Fitzhardinge
  0 siblings, 2 replies; 44+ messages in thread
From: Chuck Ebbert @ 2006-08-30 12:33 UTC
  To: Jeremy Fitzhardinge
  Cc: linux-kernel, Zachary Amsden, Jan Beulich, Andi Kleen,
	Andrew Morton

In-Reply-To: <44F557A8.1030605@goop.org>

On Wed, 30 Aug 2006 02:17:28 -0700, Jeremy Fitzhardinge wrote:

> > This changes the ABI for signals and ptrace() and that seems like
> > a bad idea to me.
> >   
> 
> I don't believe it does; it certainly shouldn't change the usermode 
> ABI.  How do you see it changing?

Never mind.  I thought because you changed struct pt_regs in ptrace_abi.h
it meant a user ABI change.

> > And the way things are done now is so ingrained into the i386
> > kernel that I'm not sure it can be done.  E.g. I found two
> > open-coded implementations of current, one in kernel_fpu_begin()
> > and one in math_state_restore().
> >   
> 
> That's OK.  The current task will still be available in thread_info; 

But they can get out of sync, e.g. when switch_to() restores the new
task's esp, the PDA still contains the old pcurrent and they don't get
synchronized until the write_pda() in __switch_to().
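
To make the window concrete, here is a small model of the two steps.
It is a sketch only: cpu_state, switch_stack() and update_pda() are
invented names standing in for the stack switch in switch_to() and the
write_pda() in __switch_to(), and nothing here is code from the patches.

```c
#include <assert.h>
#include <stddef.h>

struct task_struct { int pid; };
struct thread_info { struct task_struct *task; };
struct pda_sketch  { struct task_struct *pcurrent; };

/* The two places one CPU could read "current" from. */
struct cpu_state {
	struct thread_info *stack_ti;	/* what masking esp would find */
	struct pda_sketch pda;		/* what %gs:PDA_current would find */
};

/* Step 1: the stack pointer now belongs to the next task. */
static void switch_stack(struct cpu_state *cpu, struct thread_info *next_ti)
{
	cpu->stack_ti = next_ti;
}

/* Step 2: the PDA is brought up to date some instructions later. */
static void update_pda(struct cpu_state *cpu, struct task_struct *next)
{
	cpu->pda.pcurrent = next;
}

/* Do the stack-derived and PDA-derived views of "current" match? */
static int views_agree(const struct cpu_state *cpu)
{
	assert(cpu->stack_ti != NULL);
	return cpu->stack_ti->task == cpu->pda.pcurrent;
}
```

Between the two steps views_agree() returns 0, so any open-coded
"current" that goes through the stack pointer sees the new task while
the PDA still names the old one.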

> To be honest, I haven't looked at percpu.h in great detail.  I was 
> making assumptions about how it works, but it looks like they were wrong.

Would it make any sense to replace the 'cpu' field in thread_info with
a pointer to a PDA-like structure?  We could even embed the static per_cpu
data directly into that struct instead of chasing pointers...
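
Roughly what I have in mind, as a sketch with hypothetical names
(thread_info_sketch, pda_sketch and the irq_count field are invented
for illustration and are not existing kernel definitions):

```c
#include <assert.h>
#include <stddef.h>

struct task_struct { int pid; };

/* PDA-like structure with per-CPU data embedded directly. */
struct pda_sketch {
	int cpu_number;
	unsigned long irq_count;	/* hypothetical embedded per-CPU datum */
};

/* thread_info with the 'cpu' field replaced by a pointer. */
struct thread_info_sketch {
	struct task_struct *task;
	struct pda_sketch *pda;		/* was: int cpu */
};

/* The CPU number is now one extra dereference away. */
static int cpu_of(const struct thread_info_sketch *ti)
{
	assert(ti->pda != NULL);
	return ti->pda->cpu_number;
}

/* Per-CPU data is reached without indexing a separate per-CPU table. */
static void count_irq(struct thread_info_sketch *ti)
{
	assert(ti->pda != NULL);
	ti->pda->irq_count++;
}

/* Moving a task to another CPU means repointing its thread_info. */
static void set_task_cpu_sketch(struct thread_info_sketch *ti,
				struct pda_sketch *dst)
{
	ti->pda = dst;
}
```

The cost is that the pointer has to be rewritten wherever the 'cpu'
field is updated today.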

-- 
Chuck



Thread overview: 44+ messages
2006-08-27  8:44 [PATCH RFC 0/6] Implement per-processor data areas for i386 Jeremy Fitzhardinge
2006-08-27  8:44 ` [PATCH RFC 1/6] Basic definitions for i386-pda Jeremy Fitzhardinge
2006-08-27  8:44 ` [PATCH RFC 2/6] Initialize the per-CPU data area Jeremy Fitzhardinge
2006-08-27  8:44 ` [PATCH RFC 3/6] Use %gs as the PDA base-segment in the kernel Jeremy Fitzhardinge
2006-08-27  9:49   ` Keith Owens
2006-08-27 10:01     ` Jeremy Fitzhardinge
2006-08-27 15:57   ` Andi Kleen
2006-08-27 16:36     ` Jeremy Fitzhardinge
2006-08-27 17:20     ` Jeremy Fitzhardinge
2006-08-27 18:19       ` Andi Kleen
2006-08-27 20:03         ` Jan Engelhardt
2006-08-27 23:38         ` Jeremy Fitzhardinge
2006-08-28  9:51         ` Jan Beulich
2006-08-28 14:54           ` H. J. Lu
2006-08-28 17:24         ` H. Peter Anvin
2006-08-27  8:44 ` [PATCH RFC 4/6] Fix places where using %gs changes the usermode ABI Jeremy Fitzhardinge
2006-08-27 15:59   ` Andi Kleen
2006-08-27 16:37     ` Jeremy Fitzhardinge
2006-08-27  8:44 ` [PATCH RFC 5/6] Implement smp_processor_id() with the PDA Jeremy Fitzhardinge
2006-08-27  8:44 ` [PATCH RFC 6/6] Implement "current" " Jeremy Fitzhardinge
2006-08-27 16:01   ` Andi Kleen
2006-08-27 16:38     ` Jeremy Fitzhardinge
2006-08-27  9:47 ` [PATCH RFC 0/6] Implement per-processor data areas for i386 Arjan van de Ven
2006-08-27 16:46   ` Jeremy Fitzhardinge
2006-08-27 17:44     ` Arjan van de Ven
2006-08-27 18:07       ` Andi Kleen
2006-08-27 18:27         ` Jeremy Fitzhardinge
2006-08-27 16:01 ` Andi Kleen
2006-08-27 16:41   ` Jeremy Fitzhardinge
2006-08-27 17:21 ` Andreas Mohr
2006-08-27 17:34   ` Jeremy Fitzhardinge
2006-08-27 18:23     ` Andreas Mohr
2006-08-27 18:04   ` Andi Kleen
2006-08-27 18:27     ` Andreas Mohr
2006-08-27 18:35       ` Andi Kleen
  -- strict thread matches above, loose matches on Subject: below --
2006-08-28  9:06 Chuck Ebbert
2006-08-30  9:00 Chuck Ebbert
2006-08-30  9:17 ` Jeremy Fitzhardinge
2006-08-30 12:33 Chuck Ebbert
2006-08-30 12:54 ` Andi Kleen
2006-08-30 16:39 ` Jeremy Fitzhardinge
2006-08-30 16:48   ` Andi Kleen
2006-08-30 17:13     ` Jeremy Fitzhardinge
2006-08-30 17:32       ` Andi Kleen
