public inbox for linux-kernel@vger.kernel.org
From: Jeremy Fitzhardinge <jeremy@goop.org>
To: linux-kernel@vger.kernel.org
Cc: Chuck Ebbert <76306.1226@compuserve.com>,
	Zachary Amsden <zach@vmware.com>,
	Jan Beulich <jbeulich@novell.com>, Andi Kleen <ak@suse.de>,
	Andrew Morton <akpm@osdl.org>
Subject: [PATCH 0/8] Implement per-processor data areas for i386.
Date: Wed, 30 Aug 2006 16:52:01 -0700
Message-ID: <20060830235201.106319215@goop.org>

(Changes since previous post:
 - now works
 - fixed sys_vm86
 - performance measurements)

Implement per-processor data areas for i386.

This patch implements per-processor data areas by using %gs as the
base segment of the per-processor memory.  This has two principal
advantages:

- It allows very simple direct access to per-processor data using an
  effective address of the form %gs:offset, where offset is the offset
  into struct i386_pda.  These sequences are faster and smaller than
  the current mechanism using current_thread_info().

- It also allows per-CPU data to be allocated as each CPU is brought
  up, rather than statically allocating it based on the maximum number
  of CPUs which could be brought up. (Though the existing per-cpu
  mechanism could be changed to do this.)

Performance:

I've done some simple performance tests on an Intel Core Duo running
at 1GHz (to emphasize any performance delta).  The results for the
lmbench null syscall latency test, which should show the most negative
effect from this change, show a ~8-9ns slowdown (.237us -> .245us).
This corresponds to around 9 CPU cycles, and correlates well with
the addition of the push/load/pop %gs into the hot path.

I have not yet measured the effect on other types of processor or
more complex syscalls (though I would expect the push/pop overhead
to be drowned out by longer times spent in the kernel, and mitigated
by actual use of the PDA).

The size improvements on the kernel text are nice as well: 
    2889361 -> 2883936 = 5425 bytes saved


Some background for people unfamiliar with x86 segmentation:

This uses the x86 segmentation stuff in a way similar to NPTL's way of
implementing Thread-Local Storage.  It relies on the fact that each CPU
has its own Global Descriptor Table (GDT), which is basically an array
of base-length pairs (with some extra stuff).  When a segment register
is loaded with a selector (approximately, an index into the GDT), and
you use that segment register for a memory access, the descriptor's
base is added to the address, and the resulting address is used.

In other words, if you imagine the GDT containing an entry:
	Index	Base
	123:	0xc0211000 (allocated PDA)
and you load %gs with a selector for this entry:
	mov $123, %gs
and then use GS later on:
	mov %gs:4, %eax
This has the effect of
	mov 0xc0211004, %eax
and because the GDT is per-CPU, the base (= 0xc0211000 = memory
allocated for this CPU's PDA) can be a CPU-specific value while leaving
everything else constant.

This means that something like "current" or "smp_processor_id()" can
collapse to a single instruction:
	mov %gs:PDA_current, %reg


TODO: 
- Modify more things to use the PDA.  The more that uses it, the more
  the cost of the %gs save/restore is amortized.  smp_processor_id and
  current are the obvious first choices, which are implemented in this
  series.
--



Thread overview: 25+ messages
2006-08-30 23:52 Jeremy Fitzhardinge [this message]
2006-08-30 23:52 ` [PATCH 1/8] Use asm-offsets for the offsets of registers into the pt_regs struct, rather than having hard-coded constants Jeremy Fitzhardinge
2006-08-30 23:52 ` [PATCH 2/8] Basic definitions for i386-pda Jeremy Fitzhardinge
2006-08-30 23:52 ` [PATCH 3/8] Initialize the per-CPU data area Jeremy Fitzhardinge
2006-08-30 23:52 ` [PATCH 4/8] Use %gs as the PDA base-segment in the kernel Jeremy Fitzhardinge
2006-08-30 23:52 ` [PATCH 5/8] Fix places where using %gs changes the usermode ABI Jeremy Fitzhardinge
2006-08-31  7:11   ` Andi Kleen
2006-08-31  7:22     ` Jeremy Fitzhardinge
2006-08-31  7:36       ` Andi Kleen
2006-08-31  8:04         ` Jeremy Fitzhardinge
2006-08-31  8:13           ` Andi Kleen
2006-08-31  8:39             ` Jeremy Fitzhardinge
2006-08-30 23:52 ` [PATCH 6/8] Update sys_vm86 to cope with changed pt_regs and %gs usage Jeremy Fitzhardinge
2006-08-30 23:52 ` [PATCH 7/8] Implement smp_processor_id() with the PDA Jeremy Fitzhardinge
2006-08-31 12:35   ` Ian Campbell
2006-08-31 16:04     ` Jeremy Fitzhardinge
2006-08-31 19:10     ` Jeremy Fitzhardinge
2006-08-31 21:34       ` Ian Campbell
2006-08-31 21:39         ` Jeremy Fitzhardinge
2006-08-30 23:52 ` [PATCH 8/8] Implement "current" " Jeremy Fitzhardinge
  -- strict thread matches above, loose matches on Subject: below --
2006-09-01  6:47 [PATCH 0/8] Implement per-processor data areas for i386 Jeremy Fitzhardinge
2006-09-01  8:16 ` Andi Kleen
2006-09-01  8:26   ` Jeremy Fitzhardinge
2006-09-01  8:30     ` Andi Kleen
2006-09-01 19:08       ` Jeremy Fitzhardinge
