From: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
To: Marcelo Tosatti <mtosatti@redhat.com>
Cc: peterz@infradead.org, mingo@elte.hu, avi@redhat.com,
	raghukt@linux.vnet.ibm.com, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, x86@kernel.org, jeremy@goop.org,
	vatsa@linux.vnet.ibm.com, hpa@zytor.com
Subject: Re: [PATCH v2 3/7] KVM: Add paravirt kvm_flush_tlb_others
Date: Fri, 06 Jul 2012 15:17:53 +0530
Message-ID: <87hatlxlti.fsf@linux.vnet.ibm.com>
In-Reply-To: <20120703075535.GA13291@amt.cnet>

On Tue, 3 Jul 2012 04:55:35 -0300, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> On Mon, Jun 04, 2012 at 10:37:24AM +0530, Nikunj A. Dadhania wrote:
> > flush_tlb_others_ipi depends on lot of statics in tlb.c.  Replicated
> > the flush_tlb_others_ipi as kvm_flush_tlb_others to further adapt to
> > paravirtualization.
> > 
> > Use the vcpu state information inside the kvm_flush_tlb_others to
> > avoid sending ipi to pre-empted vcpus.
> > 
> > * Do not send ipi's to offline vcpus and set flush_on_enter flag
> > * For online vcpus: Wait for them to clear the flag
> > 
> > The approach was discussed here: https://lkml.org/lkml/2012/2/20/157
> > 
> > Suggested-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> > Signed-off-by: Nikunj A. Dadhania <nikunj@linux.vnet.ibm.com>
> > 
> > --
> > Pseudo Algo:
> > 
> >    Write()
> >    ======
> > 
> > 	   guest_exit()
> > 		   flush_on_enter[i]=0;
> > 		   running[i] = 0;
> > 
> > 	   guest_enter()
> > 		   running[i] = 1;
> > 		   smp_mb();
> > 		   if(flush_on_enter[i]) {
> > 			   tlb_flush()
> > 			   flush_on_enter[i]=0;
> > 		   }
> > 
> > 
> >    Read()
> >    ======
> > 
> > 	   GUEST                                                KVM-HV
> > 
> >    f->flushcpumask = cpumask - me;
> > 
> > again:
> >    for_each_cpu(i, f->flushmask) {
> > 
> > 	   if (!running[i]) {
> > 						   case 1:
> > 
> > 						   running[n]=1
> > 
> > 						   (cpuN does not see
> > 						   flush_on_enter set,
> > 						   guest later finds it
> > 						   running and sends ipi,
> > 						   we are fine here, need
> > 						   to clear the flag on
> > 						   guest_exit)
> > 
> > 		  flush_on_enter[i] = 1;
> > 						   case2:
> > 
> > 						   running[n]=1
> > 						   (cpuN - will see flush
> > 						   on enter and an IPI as
> > 						   well - addressed in patch-4)
> > 
> > 		  if (!running[i])
> > 		     cpu_clear(f->flushmask);      All is well, vm_enter
> > 						   will do the fixup
> > 	   }
> > 						   case 3:
> > 						   running[n] = 0;
> > 
> > 						   (cpuN went to sleep,
> > 						   we saw it as awake,
> > 						   ipi sent, but wait
> > 						   will break without
> > 						   zero_mask and goto
> > 						   again will take care)
> > 
> >    }
> >    send_ipi(f->flushmask)
> > 
> >    wait_a_while_for_zero_mask();
> > 
> >    if (!zero_mask)
> > 	   goto again;
> 
> Can you please measure increased vmentry/vmexit overhead? x86/vmexit.c 
> of git://git.kernel.org/pub/scm/virt/kvm/kvm-unit-tests.git should 
> help.
> 
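For readers following the quoted pseudo-algorithm, here is a minimal user-space sketch of its flag handshake, written with C11 atomics. It is not the code from the patch series: the struct vcpu_flags type, the NR_VCPUS size, the deferred_flushes counter (standing in for tlb_flush()) and the flush_needs_ipi() helper are all hypothetical names chosen for illustration. It covers only the "set flush_on_enter, then re-check running" step, not the IPI wait or the "goto again" retry.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_VCPUS 4	/* illustrative size, not taken from the patch */

/* Hypothetical stand-in for the per-vcpu state shared by host and guest. */
struct vcpu_flags {
	atomic_int running;
	atomic_int flush_on_enter;
};

static struct vcpu_flags vcpu[NR_VCPUS];
static int deferred_flushes[NR_VCPUS];	/* stands in for tlb_flush() */

/* Host side: vcpu i stops running guest code. */
static void guest_exit(int i)
{
	assert(i >= 0 && i < NR_VCPUS);
	atomic_store(&vcpu[i].flush_on_enter, 0);
	atomic_store(&vcpu[i].running, 0);
}

/* Host side: vcpu i is about to run guest code again. */
static void guest_enter(int i)
{
	assert(i >= 0 && i < NR_VCPUS);
	atomic_store(&vcpu[i].running, 1);
	atomic_thread_fence(memory_order_seq_cst);	/* smp_mb() */
	if (atomic_load(&vcpu[i].flush_on_enter)) {
		deferred_flushes[i]++;			/* tlb_flush() */
		atomic_store(&vcpu[i].flush_on_enter, 0);
	}
}

/*
 * Guest side, for one target vcpu: returns true if the target still
 * needs an IPI, false if the flush was deferred to its next entry.
 */
static bool flush_needs_ipi(int i)
{
	assert(i >= 0 && i < NR_VCPUS);
	if (!atomic_load(&vcpu[i].running)) {
		atomic_store(&vcpu[i].flush_on_enter, 1);
		atomic_thread_fence(memory_order_seq_cst);
		/* Re-check: the vcpu may have entered in the meantime. */
		if (!atomic_load(&vcpu[i].running))
			return false;	/* guest_enter() will do the fixup */
	}
	return true;
}
```

Calling guest_enter(0) followed by flush_needs_ipi(0) reports that an IPI is needed, while guest_exit(1) followed by flush_needs_ipi(1) defers the flush, and the next guest_enter(1) performs it and clears the flag. The sketch runs single-threaded, so it shows the state transitions rather than proving the ordering under real concurrency.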

Please find the results below (a debug patch that enables registration
of kvm_vcpu_state is attached).

I collected results for 1 and 4 vcpus, with $i set to the vcpu count,
using the following command to start the tests:

/usr/libexec/qemu-kvm -smp $i -device testdev,chardev=testlog \
    -chardev file,id=testlog,path=vmexit.out -serial stdio \
    -kernel ./x86/vmexit.flat

Machine : IBM xSeries with Intel(R) Xeon(R) X7560 2.27GHz CPU,
          32 cores, 32 online cpus and 4*64GB RAM.

x  base  - unpatched host kernel
+  wo_vs - patched host kernel, vcpu_state not registered
*  w_vs  - patched host kernel, vcpu_state registered

1 vcpu results:
---------------
    cpuid
    =====
           N        Avg       Stddev
    x     10     2135.1      17.8975
    +     10       2188      18.3666
    *     10     2448.9      43.9910
    
    vmcall
    ======
           N        Avg       Stddev
    x     10     2025.5      38.1641
    +     10     2047.5      24.8205
    *     10     2306.2      40.3066
    
    mov_from_cr8
    ============
           N        Avg       Stddev
    x     10         12       0.0000
    +     10         12       0.0000
    *     10         12       0.0000
    
    mov_to_cr8
    ==========
           N        Avg       Stddev
    x     10       19.4       0.5164
    +     10       19.1       0.3162
    *     10       19.2       0.4216
    
    inl_from_pmtimer
    ================
           N        Avg       Stddev
    x     10    18093.2     462.0543
    +     10    16579.7    1448.8892
    *     10    18577.7     266.2676
    
    ple-round-robin
    ===============
           N        Avg       Stddev
    x     10       16.1       0.3162
    +     10       16.2       0.4216
    *     10       15.3       0.4830

4 vcpu results:
---------------
    cpuid
    =====
           N        Avg       Stddev
    x     10     2135.8      10.0642
    +     10       2165       6.4118
    *     10     2423.7      12.5526
    
    vmcall
    ======
           N        Avg       Stddev
    x     10     2028.3      19.6641
    +     10     2024.7       7.2273
    *     10     2276.1      13.8680
    
    mov_from_cr8
    ============
           N        Avg       Stddev
    x     10         12       0.0000
    +     10         12       0.0000
    *     10         12       0.0000
    
    mov_to_cr8
    ==========
           N        Avg       Stddev
    x     10         19       0.0000
    +     10         19       0.0000
    *     10         19       0.0000
    
    inl_from_pmtimer
    ================
           N        Avg       Stddev
    x     10    25574.2    1693.5374
    +     10    25190.7    2219.9223
    *     10      23044    1230.8737
    
    ipi
    ===
           N        Avg       Stddev
    x     20   31996.75    7290.1777
    +     20   33683.25    9795.1601
    *     20    34563.5    8338.7826
    
    ple-round-robin
    ===============
           N        Avg       Stddev
    x     10     6281.7    1543.8601
    +     10     6149.8    1207.7928
    *     10     6433.3    2304.5377
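To make the cost of registering vcpu_state easier to read off the tables, the relative change between the base (x) and registered (*) averages can be computed directly. The following is a small sketch of that arithmetic only; overhead_pct(), print_overhead() and report() are hypothetical helper names, and the values are the averages copied from the tables above (assumed to be in the units vmexit.c prints).

```c
#include <assert.h>
#include <stdio.h>

/* Relative change of `patched` over `base`, in percent. */
static double overhead_pct(double base, double patched)
{
	assert(base > 0.0);
	return (patched - base) / base * 100.0;
}

/* Print one row: test name, both averages and the relative change. */
static void print_overhead(const char *test, double base, double patched)
{
	printf("%-10s %8.1f -> %8.1f  %+6.1f%%\n",
	       test, base, patched, overhead_pct(base, patched));
}

/* Averages from the tables: base (x) vs. vcpu_state registered (*). */
static void report(void)
{
	print_overhead("cpuid/1",  2135.1, 2448.9);
	print_overhead("vmcall/1", 2025.5, 2306.2);
	print_overhead("cpuid/4",  2135.8, 2423.7);
	print_overhead("vmcall/4", 2028.3, 2276.1);
}
```

With these inputs the increase comes to roughly 14.7% for cpuid and 13.9% for vmcall on 1 vcpu, and roughly 13.5% and 12.2% on 4 vcpus. The inl_from_pmtimer, ipi and ple-round-robin rows have standard deviations large enough that a single percentage would say little, so they are left out here.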

Thanks
Nikunj


[-- Attachment #2: enable_vcpu_state.diff --]
[-- Type: text/x-patch, Size: 1895 bytes --]


Enable and register vcpu_state information to the host
    
Signed-off-by: Nikunj A. Dadhania <nikunj@linux.vnet.ibm.com>

diff --git a/x86/vmexit.c b/x86/vmexit.c
index ad8ab55..a9823c9 100644
--- a/x86/vmexit.c
+++ b/x86/vmexit.c
@@ -3,6 +3,7 @@
 #include "smp.h"
 #include "processor.h"
 #include "atomic.h"
+#include "vm.h"
 
 static unsigned int inl(unsigned short port)
 {
@@ -173,10 +174,45 @@ static void enable_nx(void *junk)
 		wrmsr(MSR_EFER, rdmsr(MSR_EFER) | EFER_NX_MASK);
 }
 
+#define KVM_MSR_ENABLED                 1
+#define KVM_FEATURE_VCPU_STATE          7
+#define MSR_KVM_VCPU_STATE              0x4b564d04
+
+struct kvm_vcpu_state {
+        int state;
+        int flush_on_enter;
+        int pad[14];
+};
+
+struct kvm_vcpu_state test[4];
+
+static inline void my_wrmsr(unsigned int msr,
+			  unsigned low, unsigned high)
+{
+  asm volatile("wrmsr" : : "c" (msr), "a"(low), "d" (high) : "memory");
+}
+#define wrmsrl(msr, val) my_wrmsr(msr, (u32)((u64)(val)), ((u64)(val))>>32)
+
+static void enable_vcpu_state(void *junk)
+{
+	struct kvm_vcpu_state *vs;
+	int me = smp_id();
+
+	if (cpuid(0x40000001).a & (1 << KVM_FEATURE_VCPU_STATE)) {
+		vs = &test[me];
+		memset(vs, 0, sizeof(struct kvm_vcpu_state));
+		
+		wrmsrl(MSR_KVM_VCPU_STATE, ((unsigned long)(vs) | KVM_MSR_ENABLED));
+		printf("%d: Done vcpu state %p\n", me, virt_to_phys((void*)vs));
+	}
+}
+
 bool test_wanted(struct test *test, char *wanted[], int nwanted)
 {
 	int i;
 
+	return true;
+
 	if (!nwanted)
 		return true;
 
@@ -192,11 +228,16 @@ int main(int ac, char **av)
 	int i;
 
 	smp_init();
+	setup_vm();
+
 	nr_cpus = cpu_count();
 
 	for (i = cpu_count(); i > 0; i--)
 		on_cpu(i-1, enable_nx, 0);
 
+	for (i = cpu_count(); i > 0; i--)
+		on_cpu(i-1, enable_vcpu_state, 0);
+
 	for (i = 0; i < ARRAY_SIZE(tests); ++i)
 		if (test_wanted(&tests[i], av + 1, ac - 1))
 			do_test(&tests[i]);
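The registration step in the patch above packs the structure address and the enable bit into one 64-bit MSR value, which the wrmsrl() macro then splits into the EAX (low) and EDX (high) halves. A minimal user-space sketch of that packing follows; it issues no wrmsr, and vcpu_state_msr_value(), msr_low() and msr_high() are hypothetical helper names, not part of kvm-unit-tests.

```c
#include <assert.h>
#include <stdint.h>

#define KVM_MSR_ENABLED 1ULL

/* Same layout as in the debug patch: 16 ints. */
struct kvm_vcpu_state {
	int state;
	int flush_on_enter;
	int pad[14];
};

/* Assumes a 4-byte int, as on x86: the structure is then 64 bytes. */
_Static_assert(sizeof(struct kvm_vcpu_state) == 64,
	       "kvm_vcpu_state is expected to be 64 bytes");

/* Combine the structure address with the enable bit (bit 0). */
static uint64_t vcpu_state_msr_value(uint64_t addr)
{
	/* Bit 0 doubles as the enable flag, so the address must leave it clear. */
	assert((addr & KVM_MSR_ENABLED) == 0);
	return addr | KVM_MSR_ENABLED;
}

/* Low 32 bits, loaded into EAX by wrmsr. */
static uint32_t msr_low(uint64_t val)
{
	return (uint32_t)val;
}

/* High 32 bits, loaded into EDX by wrmsr. */
static uint32_t msr_high(uint64_t val)
{
	return (uint32_t)(val >> 32);
}
```

For an address of 0x123456000, vcpu_state_msr_value() yields 0x123456001, which splits into a low half of 0x23456001 and a high half of 0x1. Because the structure contains only ints, any instance is at least 4-byte aligned, so bit 0 of its address is always free for the enable flag.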

