All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Joerg Roedel" <joerg.roedel@amd.com>
To: "Ingo Molnar" <mingo@elte.hu>
Cc: "Andi Kleen" <ak@suse.de>,
	dor.laor@qumranet.com,
	"kvm-devel" <kvm-devel@lists.sourceforge.net>,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>
Subject: Re: [kvm-devel] Performance overhead of get_cycles_sync
Date: Tue, 11 Dec 2007 22:26:28 +0100	[thread overview]
Message-ID: <20071211212628.GB6537@amd.com> (raw)
In-Reply-To: <20071211142717.GA15903@elte.hu>

On Tue, Dec 11, 2007 at 03:27:17PM +0100, Ingo Molnar wrote:
> 
> * Dor Laor <dor.laor@gmail.com> wrote:
> 
> > Here [include/asm-x86/tsc.h]:
> >
> > /* Like get_cycles, but make sure the CPU is synchronized. */
> > static __always_inline cycles_t get_cycles_sync(void)
> > {
> >    unsigned long long ret;
> >    unsigned eax, edx;
> >
> >    /*
> >       * Use RDTSCP if possible; it is guaranteed to be synchronous
> >      * and doesn't cause a VMEXIT on Hypervisors
> >     */
> >    alternative_io(ASM_NOP3, ".byte 0x0f,0x01,0xf9", X86_FEATURE_RDTSCP,
> >               ASM_OUTPUT2("=a" (eax), "=d" (edx)),
> >               "a" (0U), "d" (0U) : "ecx", "memory");
> >    ret = (((unsigned long long)edx) << 32) | ((unsigned long long)eax);
> >    if (ret)
> >        return ret;
> >
> >    /*
> >     * Don't do an additional sync on CPUs where we know
> >     * RDTSC is already synchronous:
> >     */
> > //    alternative_io("cpuid", ASM_NOP2, X86_FEATURE_SYNC_RDTSC,
> > //              "=a" (eax), "0" (1) : "ebx","ecx","edx","memory");
> >    rdtscll(ret);
> 
> The patch below should resolve this - could you please test and Ack it? 
> But this CPUID was present in v2.6.23 too, so why did it only show up in 
> 2.6.24-rc for you?
> 
> 	Ingo
> 
> -------------->
> Subject: x86: fix get_cycles_sync() overhead
> From: Ingo Molnar <mingo@elte.hu>
> 
> get_cycles_sync() is causing massive overhead in KVM networking:
> 
>    http://lkml.org/lkml/2007/12/11/54
> 
> remove the explicit CPUID serialization - it causes VM exits and is
> pointless: we care about GTOD coherency but that goes to user-space
> via a syscall, and syscalls are serialization points anyway.
> 
> Signed-off-by: Ingo Molnar <mingo@elte.hu>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  include/asm-x86/tsc.h |   12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> Index: linux-x86.q/include/asm-x86/tsc.h
> ===================================================================
> --- linux-x86.q.orig/include/asm-x86/tsc.h
> +++ linux-x86.q/include/asm-x86/tsc.h
> @@ -39,8 +39,8 @@ static __always_inline cycles_t get_cycl
>  	unsigned eax, edx;
>  
>  	/*
> -  	 * Use RDTSCP if possible; it is guaranteed to be synchronous
> - 	 * and doesn't cause a VMEXIT on Hypervisors
> +	 * Use RDTSCP if possible; it is guaranteed to be synchronous
> +	 * and doesn't cause a VMEXIT on Hypervisors
>  	 */
>  	alternative_io(ASM_NOP3, ".byte 0x0f,0x01,0xf9", X86_FEATURE_RDTSCP,
>  		       ASM_OUTPUT2("=a" (eax), "=d" (edx)),
> @@ -50,11 +50,11 @@ static __always_inline cycles_t get_cycl
>  		return ret;
>  
>  	/*
> -	 * Don't do an additional sync on CPUs where we know
> -	 * RDTSC is already synchronous:
> +	 * Use RDTSC on other CPUs. This might not be fully synchronous,
> +	 * but it's not a problem: the only coherency we care about is
> +	 * the GTOD output to user-space, and syscalls are synchronization
> +	 * points anyway:
>  	 */
> -	alternative_io("cpuid", ASM_NOP2, X86_FEATURE_SYNC_RDTSC,
> -			  "=a" (eax), "0" (1) : "ebx","ecx","edx","memory");
>  	rdtscll(ret);
>  
>  	return ret;

I don't think this is a good idea. I discussed exactly this item with
Andi Kleen a while ago and afair the serializing instruction was
necessary to fix a backwards walking gettimeofday() on some K8
revisions. Andi Kleen can tell more details, I added him to the CC list.

Joerg

-- 
           |           AMD Saxony Limited Liability Company & Co. KG
 Operating |         Wilschdorfer Landstr. 101, 01109 Dresden, Germany
 System    |                  Register Court Dresden: HRA 4896
 Research  |              General Partner authorized to represent:
 Center    |             AMD Saxony LLC (Wilmington, Delaware, US)
           | General Manager of AMD Saxony LLC: Dr. Hans-R. Deppe, Thomas McCoy



WARNING: multiple messages have this Message-ID (diff)
From: "Joerg Roedel" <joerg.roedel-5C7GfCeVMHo@public.gmane.org>
To: "Ingo Molnar" <mingo-X9Un+BFzKDI@public.gmane.org>
Cc: kvm-devel
	<kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org>,
	Andi Kleen <ak-l3A5Bk7waGM@public.gmane.org>,
	Linux Kernel Mailing List
	<linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: Performance overhead of get_cycles_sync
Date: Tue, 11 Dec 2007 22:26:28 +0100	[thread overview]
Message-ID: <20071211212628.GB6537@amd.com> (raw)
In-Reply-To: <20071211142717.GA15903-X9Un+BFzKDI@public.gmane.org>

On Tue, Dec 11, 2007 at 03:27:17PM +0100, Ingo Molnar wrote:
> 
> * Dor Laor <dor.laor-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> 
> > Here [include/asm-x86/tsc.h]:
> >
> > /* Like get_cycles, but make sure the CPU is synchronized. */
> > static __always_inline cycles_t get_cycles_sync(void)
> > {
> >    unsigned long long ret;
> >    unsigned eax, edx;
> >
> >    /*
> >       * Use RDTSCP if possible; it is guaranteed to be synchronous
> >      * and doesn't cause a VMEXIT on Hypervisors
> >     */
> >    alternative_io(ASM_NOP3, ".byte 0x0f,0x01,0xf9", X86_FEATURE_RDTSCP,
> >               ASM_OUTPUT2("=a" (eax), "=d" (edx)),
> >               "a" (0U), "d" (0U) : "ecx", "memory");
> >    ret = (((unsigned long long)edx) << 32) | ((unsigned long long)eax);
> >    if (ret)
> >        return ret;
> >
> >    /*
> >     * Don't do an additional sync on CPUs where we know
> >     * RDTSC is already synchronous:
> >     */
> > //    alternative_io("cpuid", ASM_NOP2, X86_FEATURE_SYNC_RDTSC,
> > //              "=a" (eax), "0" (1) : "ebx","ecx","edx","memory");
> >    rdtscll(ret);
> 
> The patch below should resolve this - could you please test and Ack it? 
> But this CPUID was present in v2.6.23 too, so why did it only show up in 
> 2.6.24-rc for you?
> 
> 	Ingo
> 
> -------------->
> Subject: x86: fix get_cycles_sync() overhead
> From: Ingo Molnar <mingo-X9Un+BFzKDI@public.gmane.org>
> 
> get_cycles_sync() is causing massive overhead in KVM networking:
> 
>    http://lkml.org/lkml/2007/12/11/54
> 
> remove the explicit CPUID serialization - it causes VM exits and is
> pointless: we care about GTOD coherency but that goes to user-space
> via a syscall, and syscalls are serialization points anyway.
> 
> Signed-off-by: Ingo Molnar <mingo-X9Un+BFzKDI@public.gmane.org>
> Signed-off-by: Thomas Gleixner <tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>
> ---
>  include/asm-x86/tsc.h |   12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> Index: linux-x86.q/include/asm-x86/tsc.h
> ===================================================================
> --- linux-x86.q.orig/include/asm-x86/tsc.h
> +++ linux-x86.q/include/asm-x86/tsc.h
> @@ -39,8 +39,8 @@ static __always_inline cycles_t get_cycl
>  	unsigned eax, edx;
>  
>  	/*
> -  	 * Use RDTSCP if possible; it is guaranteed to be synchronous
> - 	 * and doesn't cause a VMEXIT on Hypervisors
> +	 * Use RDTSCP if possible; it is guaranteed to be synchronous
> +	 * and doesn't cause a VMEXIT on Hypervisors
>  	 */
>  	alternative_io(ASM_NOP3, ".byte 0x0f,0x01,0xf9", X86_FEATURE_RDTSCP,
>  		       ASM_OUTPUT2("=a" (eax), "=d" (edx)),
> @@ -50,11 +50,11 @@ static __always_inline cycles_t get_cycl
>  		return ret;
>  
>  	/*
> -	 * Don't do an additional sync on CPUs where we know
> -	 * RDTSC is already synchronous:
> +	 * Use RDTSC on other CPUs. This might not be fully synchronous,
> +	 * but it's not a problem: the only coherency we care about is
> +	 * the GTOD output to user-space, and syscalls are synchronization
> +	 * points anyway:
>  	 */
> -	alternative_io("cpuid", ASM_NOP2, X86_FEATURE_SYNC_RDTSC,
> -			  "=a" (eax), "0" (1) : "ebx","ecx","edx","memory");
>  	rdtscll(ret);
>  
>  	return ret;

I don't think this is a good idea. I discussed exactly this item with
Andi Kleen a while ago and afair the serializing instruction was
necessary to fix a backwards walking gettimeofday() on some K8
revisions. Andi Kleen can tell more details, I added him to the CC list.

Joerg

-- 
           |           AMD Saxony Limited Liability Company & Co. KG
 Operating |         Wilschdorfer Landstr. 101, 01109 Dresden, Germany
 System    |                  Register Court Dresden: HRA 4896
 Research  |              General Partner authorized to represent:
 Center    |             AMD Saxony LLC (Wilmington, Delaware, US)
           | General Manager of AMD Saxony LLC: Dr. Hans-R. Deppe, Thomas McCoy



-------------------------------------------------------------------------
SF.Net email is sponsored by:
Check out the new SourceForge.net Marketplace.
It's the best place to buy or sell services for
just about anything Open Source.
http://sourceforge.net/services/buy/index.php

  parent reply	other threads:[~2007-12-11 21:30 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-12-11 13:11 Performance overhead of get_cycles_sync Dor Laor
2007-12-11 13:11 ` Dor Laor
2007-12-11 13:37 ` Ingo Molnar
     [not found]   ` <20071211133738.GA8150-X9Un+BFzKDI@public.gmane.org>
2007-12-11 14:11     ` Dor Laor
2007-12-11 14:27       ` Ingo Molnar
2007-12-11 14:27         ` Ingo Molnar
2007-12-11 15:03         ` Dor Laor
2007-12-11 15:03           ` Dor Laor
2007-12-11 16:35         ` Arjan van de Ven
2007-12-11 17:03           ` Ingo Molnar
2007-12-11 17:03             ` Ingo Molnar
2007-12-11 17:23             ` Andi Kleen
2007-12-11 20:19               ` Ingo Molnar
2007-12-11 20:29                 ` Ingo Molnar
2007-12-11 20:29                   ` Ingo Molnar
2007-12-11 21:26         ` Joerg Roedel [this message]
2007-12-11 21:26           ` Joerg Roedel
     [not found]           ` <20071211212628.GB6537-5C7GfCeVMHo@public.gmane.org>
2007-12-12  0:19             ` Dor Laor
2007-12-11 14:14   ` Dor Laor
2007-12-11 14:14     ` Dor Laor
2007-12-11 14:27 ` Andi Kleen
2007-12-11 14:57   ` Dor Laor
2007-12-11 15:02     ` Andi Kleen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20071211212628.GB6537@amd.com \
    --to=joerg.roedel@amd.com \
    --cc=ak@suse.de \
    --cc=dor.laor@qumranet.com \
    --cc=kvm-devel@lists.sourceforge.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.