public inbox for linux-kernel@vger.kernel.org
* [PATCH 0 of 7] x86/paravirt: optimise pvop calls and register use
@ 2009-01-28 22:35 Jeremy Fitzhardinge
  2009-01-28 22:35 ` [PATCH 1 of 7] xen: move remaining mmu-related stuff into mmu.c Jeremy Fitzhardinge
                   ` (7 more replies)
  0 siblings, 8 replies; 31+ messages in thread
From: Jeremy Fitzhardinge @ 2009-01-28 22:35 UTC
  To: Ingo Molnar
  Cc: linux-kernel, Xen-devel, the arch/x86 maintainers, Ian Campbell,
	Zachary Amsden, Rusty Russell, Ravikiran Thirumalai

Hi Ingo,

This series implements a sequence of optimisations to reduce the
impact of enabling CONFIG_PARAVIRT while running native.

They are:
  (0. Move some Xen code around to make later changes work better.)

  1. For a number of the pvops, the native implementation is a simple
     identity function which returns its argument.  Add a specific
     paravirt identity function which the patcher can treat specially,
     by directly inlining either nops (32-bit) or a mov (64-bit)
     into the instruction stream.

  2. When a pvop is called from asm code, it also provides a hint
     about what registers are available to be clobbered by the called
     code.  Until now, that information was ignored, and all
     caller-save registers were saved.  Now, don't bother
     saving/restoring registers which are clobberable.

  3. The C calling convention lists which registers the caller can
     expect to survive a function call, and which the callee is
     allowed to clobber.  The latter set is quite large, especially on
     64-bit.  This means that converting a pile of simple inline
     functions into function calls caused a lot more register
     pressure, making the generated code much worse.

     I introduce a new "callee-save" calling convention which makes
     only the return register (eax:edx on 32-bit, rax on 64-bit)
     callee-clobberable; the callee must preserve all other registers,
     including the argument registers.

     This makes the callsites for these functions clobber many fewer
     registers, giving the compiler a chance to generate better code.

     This allows small asm functions, which generally only use one or
     two registers anyway, to be called directly.  C code can also be
     called via a thunk, which does the necessary register
     saving/restoring (generated by PV_CALLEE_SAVE_REGS_THUNK(func)).

     The irq enable/disable/save/restore functions are the first to
     make use of this calling convention, since they are the most
     commonly used in the kernel, and are also called from asm code.

  4. Convert the pte_val/make_pte identity functions to use the
     callee-save convention; they're only identity functions anyway,
     so they have no need to trash lots of registers.
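
The register bookkeeping behind items 2 and 3 can be sketched in plain
C.  This is only an illustration, not code from the series: the REG_*
names, the two mask macros and both helper functions are hypothetical,
and only the 64-bit case is modelled.  It assumes the usual x86-64 C
calling convention, in which rax, rcx, rdx, rsi, rdi and r8-r11 may be
clobbered by a callee.

```c
#include <stdint.h>

/* Hypothetical one-bit-per-register masks (illustrative names only). */
enum {
	REG_RAX = 1u << 0,
	REG_RCX = 1u << 1,
	REG_RDX = 1u << 2,
	REG_RSI = 1u << 3,
	REG_RDI = 1u << 4,
	REG_R8  = 1u << 5,
	REG_R9  = 1u << 6,
	REG_R10 = 1u << 7,
	REG_R11 = 1u << 8,
};

/* Registers an ordinary x86-64 C function is allowed to clobber. */
#define NORMAL_CALL_CLOBBERS \
	(REG_RAX | REG_RCX | REG_RDX | REG_RSI | REG_RDI | \
	 REG_R8 | REG_R9 | REG_R10 | REG_R11)

/* Under the callee-save convention only the return register is lost. */
#define CALLEE_SAVE_CALL_CLOBBERS REG_RAX

/*
 * Item 2: an asm call site says which registers it can afford to lose.
 * Only registers the callee may clobber AND the caller still needs
 * have to be saved and restored around the call.
 */
static inline uint32_t regs_to_save(uint32_t callee_may_clobber,
				    uint32_t caller_can_lose)
{
	return callee_may_clobber & ~caller_can_lose;
}

/* Count the registers named in a mask. */
static inline unsigned int count_regs(uint32_t mask)
{
	unsigned int n = 0;

	while (mask) {
		mask &= mask - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}
```

With these definitions, a call site that can lose rax and rdi only has
to preserve the other seven registers around a normal call, and a call
made under the callee-save convention treats one register as
clobbered instead of nine.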

I had to make some adjustments to VSMP and lguest to match the new
calling conventions.  I wasn't sure how I should change VMI, so I'm
waiting for Zach's input on that (VMI doesn't compile at the moment).

In testing, the net result was that the overhead dropped by about 75%,
though I found it hard to really get stable results.  The most obvious
improvement was a reduction in L2 references, presumably meaning that
L1 was getting a better hit rate.  Each of these transforms is an
unambiguous improvement in generated code for the native case, so I'm
curious to see what other people see.

Thanks,
	J



Thread overview: 31+ messages
2009-01-28 22:35 [PATCH 0 of 7] x86/paravirt: optimise pvop calls and register use Jeremy Fitzhardinge
2009-01-28 22:35 ` [PATCH 1 of 7] xen: move remaining mmu-related stuff into mmu.c Jeremy Fitzhardinge
2009-01-28 22:35 ` [PATCH 2 of 7] x86/pvops: add a paravirt_ident functions to allow special patching Jeremy Fitzhardinge
2009-01-29  8:05   ` Rusty Russell
2009-01-29  9:26     ` Jeremy Fitzhardinge
2009-01-29 10:46     ` Jeremy Fitzhardinge
2009-01-28 22:35 ` [PATCH 3 of 7] x86: fix paravirt clobber in entry_64.S Jeremy Fitzhardinge
2009-01-29  8:39   ` Rusty Russell
2009-01-29  9:28     ` Jeremy Fitzhardinge
2009-01-28 22:35 ` [PATCH 4 of 7] x86/paravirt: selectively save/restore regs around pvops calls Jeremy Fitzhardinge
2009-01-29  8:47   ` Rusty Russell
2009-01-29  9:30     ` Jeremy Fitzhardinge
2009-01-30  0:27       ` Rusty Russell
2009-01-28 22:35 ` [PATCH 5 of 7] x86/paravirt: add register-saving thunks to reduce caller register pressure Jeremy Fitzhardinge
2009-02-06  7:28   ` Andi Kleen
2009-02-06 16:37     ` Jeremy Fitzhardinge
2009-01-28 22:35 ` [PATCH 6 of 7] x86/paravirt: implement PVOP_CALL macros for callee-save functions Jeremy Fitzhardinge
2009-01-28 22:35 ` [PATCH 7 of 7] x86/paravirt: use callee-saved convention for pte_val/make_pte/etc Jeremy Fitzhardinge
2009-01-29  7:14 ` [PATCH 0 of 7] x86/paravirt: optimise pvop calls and register use H. Peter Anvin
2009-01-29  9:51   ` Jeremy Fitzhardinge
2009-01-31  5:04     ` H. Peter Anvin
2009-01-31  7:16       ` Jeremy Fitzhardinge
2009-01-31 22:43         ` H. Peter Anvin
2009-01-31  7:17       ` [PATCH 1/2] x86/paravirt: don't restore second return reg Jeremy Fitzhardinge
2009-01-31  7:18       ` [PATCH 2/2] x86/vmi: fix interrupt enable/disable/save/restore calling convention Jeremy Fitzhardinge
2009-01-31 16:12       ` [PATCH 0 of 7] x86/paravirt: optimise pvop calls and register use Ingo Molnar
2009-01-31 17:00         ` Jeremy Fitzhardinge
2009-02-04  2:10         ` Ingo Molnar
2009-02-04  2:16           ` Jeremy Fitzhardinge
2009-02-04 14:26             ` Ingo Molnar
2009-02-04 17:06               ` Ingo Molnar
