Re: Performance overhead of paravirt_ops on native identified

From: "H. Peter Anvin" <hpa@zytor.com>
To: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Ingo Molnar <mingo@elte.hu>,
	"Xin, Xiaohui" <xiaohui.xin@intel.com>,
	"Li, Xin" <xin.li@intel.com>,
	"Nakajima, Jun" <jun.nakajima@intel.com>,
	Nick Piggin <npiggin@suse.de>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Xen-devel <xen-devel@lists.xensource.com>
Subject: Re: Performance overhead of paravirt_ops on native identified
Date: Wed, 13 May 2009 18:10:52 -0700
Message-ID: <4A0B6F9C.4060405@zytor.com>
In-Reply-To: <4A0B62F7.5030802@goop.org>

Jeremy Fitzhardinge wrote:
> 
> So, what's the fix?
> 
> Paravirt patching turns all the pvops calls into direct calls, so
> _spin_lock etc do end up having direct calls.  For example, the compiler
> generated code for paravirtualized _spin_lock is:
> 
> <_spin_lock+0>:		mov    %gs:0xb4c8,%rax
> <_spin_lock+9>:		incl   0xffffffffffffe044(%rax)
> <_spin_lock+15>:	callq  *0xffffffff805a5b30
> <_spin_lock+22>:	retq
> 
> The indirect call will get patched to:
> <_spin_lock+0>:		mov    %gs:0xb4c8,%rax
> <_spin_lock+9>:		incl   0xffffffffffffe044(%rax)
> <_spin_lock+15>:	callq <__ticket_spin_lock>
> <_spin_lock+20>:	nop; nop		/* or whatever 2-byte nop */
> <_spin_lock+22>:	retq
> 
> One possibility is to inline _spin_lock, etc, when building an
> optimised kernel (ie, when there's no spinlock/preempt
> instrumentation/debugging enabled).  That will remove the outer
> call/return pair, returning the instruction stream to a single
> call/return, which will presumably execute the same as the non-pvops
> case.  The downsides are: 1) it will replicate the
> preempt_disable/enable code at each lock/unlock callsite; this code is
> fairly small, but not nothing; and 2) the spinlock definitions are
> already a very heavily tangled mass of #ifdefs and other preprocessor
> magic, and making any changes will be non-trivial.
> 

The other obvious option, it would seem to me, would be to eliminate the
*inner* call/return pair, i.e. merging the _spin_lock setup code in with
the internals of each available implementation (in the case above,
__ticket_spin_lock).  This is effectively what happens on native.  The
one problem with that is that every callsite now becomes a patching target.
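
As a rough userspace sketch of the two layerings (not kernel code; every
demo_* name below is a hypothetical stand-in, and the "lock" is only
meaningful single-threaded), the layered form pays for an inner call
while the merged form folds the setup into the backend:

```c
#include <assert.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
struct demo_lock { unsigned int next, owner; };
static int demo_preempt_count;

/* Backend body: take a ticket and wait for our turn. */
void demo_ticket_acquire(struct demo_lock *l)
{
	unsigned int ticket = l->next++;

	while (l->owner != ticket)
		;	/* single-threaded demo: never actually spins */
}

/* Layered: the outer wrapper does the setup, then makes the inner call. */
void demo_spin_lock_layered(struct demo_lock *l)
{
	demo_preempt_count++;		/* stands in for preempt_disable() */
	demo_ticket_acquire(l);		/* the inner call/return pair */
}

/* Merged: setup and backend body in one function, so one call/return
 * per lock.  Callers now have to reach this backend directly, which is
 * why each of them becomes a patching target. */
void demo_ticket_spin_lock_merged(struct demo_lock *l)
{
	unsigned int ticket;

	demo_preempt_count++;
	ticket = l->next++;
	while (l->owner != ticket)
		;
}

void demo_spin_unlock(struct demo_lock *l)
{
	l->owner++;
	demo_preempt_count--;		/* stands in for preempt_enable() */
}
```

Both forms leave the lock and the counter in the same state; the only
difference is how many call/return pairs sit between the caller and the
backend.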

That brings me to a somewhat half-arsed thought I have been walking
around with for a while.

Consider a paravirt function (or, for that matter, any other call which
is runtime-static; this isn't limited to paravirt) which looks to the C
compiler just like any other external function, with no indirection.
By default we can point it at a function which is really just an
indirect jump to the appropriate handler; that handles the pre-patching
case.  However, a link-time pass over vmlinux.o can find all the points
where this function is called and turn them into a list of patch
sites(*).  The advantages are:

1. [minor] no additional nop padding due to indirect function calls.
2. [major] no need for a ton of wrapper macros manifest in the code.
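
To make the shape of this concrete, here is a userspace simulation, under
loud assumptions: real code patching is not portable C, so each "call
site" is modelled as a function-pointer slot standing in for the target
field of a call instruction, and the list a link-time pass would emit is
written out by hand.  All demo_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*demo_fn)(int);

/* Two possible backends; which one is used is fixed early and never
 * changes afterwards ("runtime-static"). */
int demo_native_impl(int x) { return x + 1; }
int demo_hv_impl(int x)     { return x + 100; }

static demo_fn demo_handler = demo_native_impl;

/* Default target: just forwards through the handler pointer.  This
 * covers any call made before patching has happened. */
int demo_trampoline(int x) { return demo_handler(x); }

/* Each slot stands in for the target of one call instruction. */
static demo_fn demo_site_a = demo_trampoline;
static demo_fn demo_site_b = demo_trampoline;

/* Stand-in for the patch-site list a link-time pass would generate. */
static demo_fn *const demo_patch_sites[] = { &demo_site_a, &demo_site_b };

/* Rewrite every recorded site to call the chosen handler directly.
 * Returns the number of sites patched. */
size_t demo_apply_patches(void)
{
	size_t n = sizeof(demo_patch_sites) / sizeof(demo_patch_sites[0]);

	for (size_t i = 0; i < n; i++)
		*demo_patch_sites[i] = demo_handler;
	return n;
}

/* Callers look like ordinary calls; no wrapper macros are involved. */
int demo_caller_a(int x) { return demo_site_a(x); }
int demo_caller_b(int x) { return demo_site_b(x) * 2; }
```

Callers get the same answers before and after demo_apply_patches(); the
only change is that the trampoline hop disappears.  In the simulation the
slots are still indirect, of course; the real thing would rewrite the
call instructions themselves.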

paravirt_ops that turn into pure inline code in the native case are
obviously another ball of wax entirely; there, inline assembly wrappers
are simply unavoidable.

	-hpa

(*) If patching code on SMP were cheaper, we could actually do this
lazily, and wouldn't have to store a list of patch sites.  I don't feel
brave enough to go down that route.
