thread/core siblings info for guests

kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* thread/core siblings info for guests
@ 2008-10-02 20:35 Kamble, Nitin A
  2008-10-05  9:44 ` Avi Kivity
  0 siblings, 1 reply; 24+ messages in thread
From: Kamble, Nitin A @ 2008-10-02 20:35 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm@vger.kernel.org

Avi,
   There are some OSes like windows server which impose licensing restriction on number of cpus. These licensing restrictions are based on number of packages/sockets. And look into the cupid data to decide which cpus are thread siblings, core siblings and packages siblings. With current KVM/Qemu implementation it is hiding all this cupid information from the guests. So guest sees each cpu as a single package. And the license restrictions inside the OS is limiting no of cpus the guest can run.
    These cupid bits should be exposed to the guest so that the OS would see the thread or core sibling information correctly to utilize more cpus.

   Is anybody is working on this?


Thanks & Regards,
Nitin
Linux Open Source Technology Center, Intel Corporation
----------------------------------------------------------------------------
The Mind is like a parachute; it works much better when it's open.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: thread/core siblings info for guests
  2008-10-02 20:35 thread/core siblings info for guests Kamble, Nitin A
@ 2008-10-05  9:44 ` Avi Kivity
  2008-10-06 10:23   ` Gerd Hoffmann
                     ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Avi Kivity @ 2008-10-05  9:44 UTC (permalink / raw)
  To: Kamble, Nitin A; +Cc: kvm@vger.kernel.org

Kamble, Nitin A wrote:
> Avi,
>    There are some OSes like windows server which impose licensing restriction on number of cpus. These licensing restrictions are based on number of packages/sockets. And look into the cupid data to decide which cpus are thread siblings, core siblings and packages siblings. With current KVM/Qemu implementation it is hiding all this cupid information from the guests. So guest sees each cpu as a single package. And the license restrictions inside the OS is limiting no of cpus the guest can run.
>     These cupid bits should be exposed to the guest so that the OS would see the thread or core sibling information correctly to utilize more cpus.
>
>    Is anybody is working on this?
>   

Not that I know of.  Indeed finer control over cpuid is needed.  We need 
to support at least three modes:

- default: expose some machine that is likely to be widely supported
- host: expose as much of the host cpu as we can
- managed: management application controls everything


-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: thread/core siblings info for guests
  2008-10-05  9:44 ` Avi Kivity
@ 2008-10-06 10:23   ` Gerd Hoffmann
  2008-10-06 18:01   ` Kamble, Nitin A
  2008-10-17 22:45   ` Nitin A Kamble
  2 siblings, 0 replies; 24+ messages in thread
From: Gerd Hoffmann @ 2008-10-06 10:23 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Kamble, Nitin A, kvm@vger.kernel.org

Avi Kivity wrote:
> 
> Not that I know of.  Indeed finer control over cpuid is needed.  We need
> to support at least three modes:
> 
> - default: expose some machine that is likely to be widely supported
> - host: expose as much of the host cpu as we can
> - managed: management application controls everything

qemu has an array with predefined cpu types, selectable via -cpu <name>.
 IMHO it would be a good idea to reuse that, especially with the
upstream merge in mind ...

cheers,
  Gerd

^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: thread/core siblings info for guests
  2008-10-05  9:44 ` Avi Kivity
  2008-10-06 10:23   ` Gerd Hoffmann
@ 2008-10-06 18:01   ` Kamble, Nitin A
  2008-10-06 20:54     ` Chris Wright
  2008-10-17 22:45   ` Nitin A Kamble
  2 siblings, 1 reply; 24+ messages in thread
From: Kamble, Nitin A @ 2008-10-06 18:01 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm@vger.kernel.org

>From: Avi Kivity [mailto:avi@redhat.com]
>
>Indeed finer control over cpuid is needed.  We need
>to support at least three modes:
>
>- default: expose some machine that is likely to be widely supported
>- host: expose as much of the host cpu as we can
>- managed: management application controls everything
>

[Nitin>]
Avi,
   Would you like to elaborate more on these three mode's usage scenario?

As I see, there are 2 things to look for while providing the cpuid to the guest.
   1. Guest migration across different hosts. This limits exposed cpu features.
   2. Utilizing host CPU features. This expands exposed cpu features.

Both these needs should be accommodated keeping the flexibility to override.

Thanks & Regards,
Nitin
Linux Open Source Technology Center, Intel Corporation
-----------------------------------------------------------------------------
The Mind is like a parachute; it works much better when it's open.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: thread/core siblings info for guests
  2008-10-06 18:01   ` Kamble, Nitin A
@ 2008-10-06 20:54     ` Chris Wright
  0 siblings, 0 replies; 24+ messages in thread
From: Chris Wright @ 2008-10-06 20:54 UTC (permalink / raw)
  To: Kamble, Nitin A; +Cc: Avi Kivity, kvm@vger.kernel.org

* Kamble, Nitin A (nitin.a.kamble@intel.com) wrote:
> >From: Avi Kivity [mailto:avi@redhat.com]
> >Indeed finer control over cpuid is needed.  We need
> >to support at least three modes:
> >
> >- default: expose some machine that is likely to be widely supported
> >- host: expose as much of the host cpu as we can
> >- managed: management application controls everything
> Avi,
>    Would you like to elaborate more on these three mode's usage scenario?
> 
> As I see, there are 2 things to look for while providing the cpuid to the guest.
>    1. Guest migration across different hosts. This limits exposed cpu features.

This falls under managed above, as it's smth the mgmt app knows (e.g. what's the
least common denominator in your migration pool?).

>    2. Utilizing host CPU features. This expands exposed cpu features.

This falls under host above.

And the default mode above is just about picking a sane default.

thanks,
-chris

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: thread/core siblings info for guests
  2008-10-05  9:44 ` Avi Kivity
  2008-10-06 10:23   ` Gerd Hoffmann
  2008-10-06 18:01   ` Kamble, Nitin A
@ 2008-10-17 22:45   ` Nitin A Kamble
  2008-10-18  0:18     ` Alexander Graf
  2008-10-19  9:29     ` Avi Kivity
  2 siblings, 2 replies; 24+ messages in thread
From: Nitin A Kamble @ 2008-10-17 22:45 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm@vger.kernel.org, kraxel, chrisw


[-- Attachment #1.1: Type: text/plain, Size: 2189 bytes --]

Hi Avi,
 Attached is a patch to handle the "host mode" as you mentioned earlier.

It works this way:
   Qemu's -cpu option is extended to take a new paramenter: -cpu _host_.
With that parameter the cpuid bits are collected from the host cpu. And
few bits of cleared to get it working with the kvm.
   I tested the patch with Windows & Linux guests, and I see the host
topology is propagating into the guest. With that windows server is not
imposing the incorrect licensing restriction to guest, as it was doing
earlier. With the patch I could bring up the windows server guest with
16 cpus on kvm.
  I also fixed implementation of indexed cpuid leaves (0x4 & 0xb) in
qemu. And also some code cleanup are there for consistent terminology to
have more readable code, 

  Please check-in the patch in kvm-usermode.git tree. There are no
changes for the kernel side or kvm.git tree for this.

Signed-Off-By: Nitin A Kamble <nitin.a.kamble@intel.com>

Next things:
 * Default mode: I am not planning to do anything for this. I see the
current situation is good enough for the default mode.

 * Managed mode: For this, I am thinking few things. 
    1. Either an option can be added to qemu to take (almost) all of 
       it's cpuid bits from a file.
    2. Or use command line options to modify particular bits of guest  
       cpuid.
    3. Or do the both.

What do you think about the managed mode implementation? you probably
have better idea regarding how and what kind of tools will be managing
it.

-- 
Thanks & Regards,
Nitin
Open Source Technology Center, Intel Corporation
-----------------------------------------------------------------
The mind is like a parachute; it works much better when it's open

On Sun, 2008-10-05 at 02:44 -0700, Avi Kivity wrote:
> Not that I know of.  Indeed finer control over cpuid is needed.  We need
> to support at least three modes:
> 
> - default: expose some machine that is likely to be widely supported
> - host: expose as much of the host cpu as we can
> - managed: management application controls everything
> 
> 
> --
> error compiling committee.c: too many arguments to function
> 


[-- Attachment #1.2: host_cpuid.diff --]
[-- Type: text/x-patch, Size: 16270 bytes --]

diff --git a/libkvm/libkvm-x86.c b/libkvm/libkvm-x86.c
index a8cca15..5e6e5ad 100644
--- a/libkvm/libkvm-x86.c
+++ b/libkvm/libkvm-x86.c
@@ -458,7 +458,7 @@ __u64 kvm_get_cr8(kvm_context_t kvm, int vcpu)
 }
 
 int kvm_setup_cpuid(kvm_context_t kvm, int vcpu, int nent,
-		    struct kvm_cpuid_entry *entries)
+		    struct kvm_cpuid_entry2 *entries)
 {
 	struct kvm_cpuid *cpuid;
 	int r;
@@ -469,7 +469,7 @@ int kvm_setup_cpuid(kvm_context_t kvm, int vcpu, int nent,
 
 	cpuid->nent = nent;
 	memcpy(cpuid->entries, entries, nent * sizeof(*entries));
-	r = ioctl(kvm->vcpu_fd[vcpu], KVM_SET_CPUID, cpuid);
+	r = ioctl(kvm->vcpu_fd[vcpu], KVM_SET_CPUID2, cpuid);
 
 	free(cpuid);
 	return r;
diff --git a/libkvm/libkvm.h b/libkvm/libkvm.h
index 60d6d08..a0def52 100644
--- a/libkvm/libkvm.h
+++ b/libkvm/libkvm.h
@@ -374,7 +374,7 @@ int kvm_guest_debug(kvm_context_t, int vcpu, struct kvm_debug_guest *dbg);
  * \return 0 on success, or -errno on error
  */
 int kvm_setup_cpuid(kvm_context_t kvm, int vcpu, int nent,
-		    struct kvm_cpuid_entry *entries);
+		    struct kvm_cpuid_entry2 *entries);
 
 /*!
  * \brief Setting the number of shadow pages to be allocated to the vm
diff --git a/qemu/qemu-kvm-x86.c b/qemu/qemu-kvm-x86.c
index bf62e18..0317c4a 100644
--- a/qemu/qemu-kvm-x86.c
+++ b/qemu/qemu-kvm-x86.c
@@ -444,7 +444,7 @@ void kvm_arch_save_regs(CPUState *env)
     }
 }
 
-static void host_cpuid(uint32_t function, uint32_t *eax, uint32_t *ebx,
+void host_cpuid(uint32_t function, uint32_t index, uint32_t *eax, uint32_t *ebx,
 		       uint32_t *ecx, uint32_t *edx)
 {
     uint32_t vec[4];
@@ -453,7 +453,7 @@ static void host_cpuid(uint32_t function, uint32_t *eax, uint32_t *ebx,
     asm volatile("cpuid"
 		 : "=a"(vec[0]), "=b"(vec[1]),
 		   "=c"(vec[2]), "=d"(vec[3])
-		 : "0"(function) : "cc");
+		 : "0"(function), "2"(index) : "cc");
 #else
     asm volatile("pusha \n\t"
 		 "cpuid \n\t"
@@ -462,7 +462,7 @@ static void host_cpuid(uint32_t function, uint32_t *eax, uint32_t *ebx,
 		 "mov %%ecx, 8(%1) \n\t"
 		 "mov %%edx, 12(%1) \n\t"
 		 "popa"
-		 : : "a"(function), "S"(vec)
+		 : : "a"(function), "S"(vec), "c"(index)
 		 : "memory", "cc");
 #endif
 
@@ -477,20 +477,23 @@ static void host_cpuid(uint32_t function, uint32_t *eax, uint32_t *ebx,
 }
 
 
-static void do_cpuid_ent(struct kvm_cpuid_entry *e, uint32_t function,
-			 CPUState *env)
+static void do_cpuid_ent(struct kvm_cpuid_entry2 *e, uint32_t function,
+			 uint32_t index, CPUState *env)
 {
     env->regs[R_EAX] = function;
+    env->regs[R_ECX] = index;
     qemu_kvm_cpuid_on_env(env);
     e->function = function;
+    e->index = index;
     e->eax = env->regs[R_EAX];
     e->ebx = env->regs[R_EBX];
     e->ecx = env->regs[R_ECX];
     e->edx = env->regs[R_EDX];
+    e->flags = 0;
     if (function == 0x80000001) {
 	uint32_t h_eax, h_edx;
 
-	host_cpuid(function, &h_eax, NULL, NULL, &h_edx);
+	host_cpuid(function, 0, &h_eax, NULL, NULL, &h_edx);
 
 	// long mode
 	if ((h_edx & 0x20000000) == 0 || !lm_capable_kernel)
@@ -512,7 +515,7 @@ static void do_cpuid_ent(struct kvm_cpuid_entry *e, uint32_t function,
     if (function == 0) {
 	uint32_t bcd[3];
 
-	host_cpuid(0, NULL, &bcd[0], &bcd[1], &bcd[2]);
+	host_cpuid(0, 0, NULL, &bcd[0], &bcd[1], &bcd[2]);
 	e->ebx = bcd[0];
 	e->ecx = bcd[1];
 	e->edx = bcd[2];
@@ -524,6 +527,10 @@ static void do_cpuid_ent(struct kvm_cpuid_entry *e, uint32_t function,
     // 3dnow isn't properly emulated yet
     if (function == 0x80000001)
 	e->edx &= ~0xc0000000;
+
+    if ((function == 4) || (function == 0xb)) {
+       e->flags = KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
+    }
 }
 
 struct kvm_para_features {
@@ -559,9 +566,9 @@ static int get_para_features(kvm_context_t kvm_context)
 
 int kvm_arch_qemu_init_env(CPUState *cenv)
 {
-    struct kvm_cpuid_entry cpuid_ent[100];
+    struct kvm_cpuid_entry2 cpuid_ent[100];
 #ifdef KVM_CPUID_SIGNATURE
-    struct kvm_cpuid_entry *pv_ent;
+    struct kvm_cpuid_entry2 *pv_ent;
     uint32_t signature[3];
 #endif
     int cpuid_nent = 0;
@@ -587,19 +594,29 @@ int kvm_arch_qemu_init_env(CPUState *cenv)
     pv_ent->eax = get_para_features(kvm_context);
 #endif
 
-    copy.regs[R_EAX] = 0;
-    qemu_kvm_cpuid_on_env(&copy);
-    limit = copy.regs[R_EAX];
+    limit = copy.cpuid_level;
 
     for (i = 0; i <= limit; ++i)
-	do_cpuid_ent(&cpuid_ent[cpuid_nent++], i, &copy);
+	do_cpuid_ent(&cpuid_ent[cpuid_nent++], i, 0, &copy);
+
+    if (limit >= 4) { /* get the per index values */
+        int i = 1;
+        do
+            do_cpuid_ent(&cpuid_ent[cpuid_nent++], 4, i++, &copy);
+        while(copy.regs[R_EAX] & 0x1f); /* untill last index */
+    }
+
+    if (limit >= 0xb) { /* get the per index values */
+        int i = 1;
+        do
+            do_cpuid_ent(&cpuid_ent[cpuid_nent++], 0xb, i++, &copy);
+        while(copy.regs[R_ECX] & 0xff00); /* untill last index */
+    }
 
-    copy.regs[R_EAX] = 0x80000000;
-    qemu_kvm_cpuid_on_env(&copy);
-    limit = copy.regs[R_EAX];
+    limit = copy.cpuid_xlevel;
 
     for (i = 0x80000000; i <= limit; ++i)
-	do_cpuid_ent(&cpuid_ent[cpuid_nent++], i, &copy);
+        do_cpuid_ent(&cpuid_ent[cpuid_nent++], i, 0, &copy);
 
     kvm_setup_cpuid(kvm_context, cenv->cpu_index, cpuid_nent, cpuid_ent);
     return 0;
diff --git a/qemu/target-i386/cpu.h b/qemu/target-i386/cpu.h
index 11bc2c1..879bddf 100644
--- a/qemu/target-i386/cpu.h
+++ b/qemu/target-i386/cpu.h
@@ -600,18 +600,19 @@ typedef struct CPUX86State {
     CPU_COMMON
 
     /* processor features (e.g. for CPUID insn) */
-    uint32_t cpuid_level;
-    uint32_t cpuid_vendor1;
-    uint32_t cpuid_vendor2;
-    uint32_t cpuid_vendor3;
-    uint32_t cpuid_version;
-    uint32_t cpuid_features;
-    uint32_t cpuid_ext_features;
-    uint32_t cpuid_xlevel;
-    uint32_t cpuid_model[12];
-    uint32_t cpuid_ext2_features;
-    uint32_t cpuid_ext3_features;
-    uint32_t cpuid_apic_id;
+    uint32_t cpuid_level; /* leaf 0x00000000:EAX */
+    uint32_t cpuid_vendor1; /* leaf 0x00000000:EBX */
+    uint32_t cpuid_vendor2; /* leaf 0x00000000:ECX */
+    uint32_t cpuid_vendor3; /* leaf 0x00000000:EDX */
+    uint32_t cpuid_version; /* leaf 0x00000001:EAX */
+    uint32_t cpuid_features; /* leaf 0x00000001:EDX */
+    uint32_t cpuid_ext_features; /* leaf 0x00000001:ECX */
+    uint32_t cpuid_xlevel; /* leaf 0x00000001:EDX */
+    uint32_t cpuid_model[12]; /* leaves 0x80000002 - 0x80000004 */
+    uint32_t cpuid_ext2_features; /* leaf 0x80000001:EDX */
+    uint32_t cpuid_ext3_features; /* leaf 0x80000001:ECX */
+    uint32_t cpuid_apic_id; /* leaf 0x00000001:EBX.bits 24-31 */
+    uint32_t cpu_host_cpu; /* guest cpuid =:=  host cpuid */
 
 #ifdef USE_KQEMU
     int kqemu_enabled;
diff --git a/qemu/target-i386/helper.c b/qemu/target-i386/helper.c
index 506a67d..d273940 100644
--- a/qemu/target-i386/helper.c
+++ b/qemu/target-i386/helper.c
@@ -150,6 +150,9 @@ typedef struct x86_def_t {
           CPUID_PAT | CPUID_FXSR | CPUID_MMX | CPUID_SSE | CPUID_SSE2 | \
           CPUID_PAE | CPUID_SEP | CPUID_APIC)
 static x86_def_t x86_defs[] = {
+    {
+        .name = "_host_",
+    },
 #ifdef TARGET_X86_64
     {
         .name = "qemu64",
@@ -386,33 +389,88 @@ void x86_cpu_list (FILE *f, int (*cpu_fprintf)(FILE *f, const char *fmt, ...))
         (*cpu_fprintf)(f, "x86 %16s\n", x86_defs[i].name);
 }
 
+
+void host_cpuid(uint32_t function, uint32_t index, uint32_t *eax, uint32_t *ebx,
+		       uint32_t *ecx, uint32_t *edx);
+
+static int fill_x86_defs_for_host(CPUX86State *env, x86_def_t * def)
+{
+    uint32 eax, ebx, ecx, edx;
+
+    env->cpu_host_cpu = 1;
+
+    host_cpuid(0, 0, &eax, &ebx, &ecx, &edx);
+    env->cpuid_level = eax;
+    env->cpuid_vendor1 = ebx;
+    env->cpuid_vendor2 = ecx;
+    env->cpuid_vendor3 = edx;
+
+    host_cpuid(1, 0, &eax, &ebx, &ecx, &edx);
+    env->cpuid_version = eax;
+    env->cpuid_features = edx & 0xdfffffff; /* disable Thermal Monitro */
+    env->cpuid_ext_features = ecx;
+
+    host_cpuid(0x80000000, 0, &eax, &ebx, &ecx, &edx);
+    env->cpuid_xlevel = eax;
+
+    host_cpuid(0x80000001, 0, &eax, &ebx, &ecx, &edx);
+    env->cpuid_ext3_features = ecx;
+    env->cpuid_ext2_features = edx & 0xf7ffffff; /* disable rdtscp */
+
+    host_cpuid(0x80000002, 0, &eax, &ebx, &ecx, &edx);
+    env->cpuid_model[0] = eax;
+    env->cpuid_model[1] = ebx;
+    env->cpuid_model[2] = ecx;
+    env->cpuid_model[3] = edx;
+
+    host_cpuid(0x80000003, 0, &eax, &ebx, &ecx, &edx);
+    env->cpuid_model[4] = eax;
+    env->cpuid_model[5] = ebx;
+    env->cpuid_model[6] = ecx;
+    env->cpuid_model[7] = edx;
+
+    host_cpuid(0x80000004, 0, &eax, &ebx, &ecx, &edx);
+    env->cpuid_model[8] = eax;
+    env->cpuid_model[9] = ebx;
+    env->cpuid_model[10] = ecx;
+    env->cpuid_model[11] = edx;
+
+    return 0;
+}
+
 static int cpu_x86_register (CPUX86State *env, const char *cpu_model)
 {
     x86_def_t def1, *def = &def1;
 
     if (cpu_x86_find_by_name(def, cpu_model) < 0)
         return -1;
-    if (def->vendor1) {
-        env->cpuid_vendor1 = def->vendor1;
-        env->cpuid_vendor2 = def->vendor2;
-        env->cpuid_vendor3 = def->vendor3;
-    } else {
-        env->cpuid_vendor1 = CPUID_VENDOR_INTEL_1;
-        env->cpuid_vendor2 = CPUID_VENDOR_INTEL_2;
-        env->cpuid_vendor3 = CPUID_VENDOR_INTEL_3;
-    }
-    env->cpuid_level = def->level;
-    env->cpuid_version = (def->family << 8) | (def->model << 4) | def->stepping;
-    env->cpuid_features = def->features;
+
     env->pat = 0x0007040600070406ULL;
-    env->cpuid_ext_features = def->ext_features;
     env->cpuid_ext2_features = def->ext2_features;
-    env->cpuid_xlevel = def->xlevel;
     env->cpuid_ext3_features = def->ext3_features;
-    {
+
+    if (strcmp(cpu_model, "_host_") == 0) {
+	fill_x86_defs_for_host(env, def);
+    } else {
         const char *model_id = def->model_id;
         int c, len, i;
 
+    	env->cpu_host_cpu = 0;
+        env->cpuid_level = def->level;
+        env->cpuid_xlevel = def->xlevel;
+        if (def->vendor1) {
+            env->cpuid_vendor1 = def->vendor1;
+            env->cpuid_vendor2 = def->vendor2;
+            env->cpuid_vendor3 = def->vendor3;
+        } else {
+            env->cpuid_vendor1 = CPUID_VENDOR_INTEL_1;
+            env->cpuid_vendor2 = CPUID_VENDOR_INTEL_2;
+            env->cpuid_vendor3 = CPUID_VENDOR_INTEL_3;
+        }
+        env->cpuid_features = def->features;
+        env->cpuid_ext_features = def->ext_features;
+
+    	env->cpuid_version = (def->family << 8) | (def->model << 4) | def->stepping;
         if (cpu_vendor_string != NULL)
             model_id = cpu_vendor_string;
         if (!model_id)
diff --git a/qemu/target-i386/op_helper.c b/qemu/target-i386/op_helper.c
index e9a6942..3ff85ba 100644
--- a/qemu/target-i386/op_helper.c
+++ b/qemu/target-i386/op_helper.c
@@ -1883,23 +1883,26 @@ void helper_single_step(void)
     raise_exception(EXCP01_SSTP);
 }
 
+void host_cpuid(uint32_t function, uint32_t index, uint32_t *eax, uint32_t *ebx,
+		       uint32_t *ecx, uint32_t *edx);
+
 void helper_cpuid(void)
 {
-    uint32_t index;
+    uint32_t function;
 
     helper_svm_check_intercept_param(SVM_EXIT_CPUID, 0);
     
-    index = (uint32_t)EAX;
-    /* test if maximum index reached */
-    if (index & 0x80000000) {
-        if (index > env->cpuid_xlevel)
-            index = env->cpuid_level;
+    function = (uint32_t)EAX;
+    /* test if maximum function reached */
+    if (function & 0x80000000) {
+        if (function > env->cpuid_xlevel)
+            function = env->cpuid_level;
     } else {
-        if (index > env->cpuid_level)
-            index = env->cpuid_level;
+        if (function > env->cpuid_level)
+            function = env->cpuid_level;
     }
 
-    switch(index) {
+    switch(function) {
     case 0:
         EAX = env->cpuid_level;
         EBX = env->cpuid_vendor1;
@@ -1911,9 +1914,17 @@ void helper_cpuid(void)
         EBX = (env->cpuid_apic_id << 24) | 8 << 8; /* CLFLUSH size in quad words, Linux wants it. */
         ECX = env->cpuid_ext_features;
         EDX = env->cpuid_features;
+        if (env->cpu_host_cpu)  {
+           host_cpuid(function, 0, NULL, &EBX, NULL, NULL);
+	   EBX = (EBX & 0x00ffffff) | (env->cpuid_apic_id << 24);
+	}
         break;
     case 2:
         /* cache info: needed for Pentium Pro compatibility */
+        if (env->cpu_host_cpu)  {
+           host_cpuid(function, ECX, &EAX, &EBX, &ECX, &EDX);
+           break;
+        }
         EAX = 1;
         EBX = 0;
         ECX = 0;
@@ -1921,6 +1932,10 @@ void helper_cpuid(void)
         break;
     case 4:
         /* cache info: needed for Core compatibility */
+        if (env->cpu_host_cpu)  {
+           host_cpuid(function, ECX, &EAX, &EBX, &ECX, &EDX);
+           break;
+        }
         switch (ECX) {
             case 0: /* L1 dcache info */
                 EAX = 0x0000121;
@@ -1950,6 +1965,10 @@ void helper_cpuid(void)
 
         break;
     case 5:
+        if (env->cpu_host_cpu)  {
+           host_cpuid(function, 0, &EAX, &EBX, &ECX, &EDX);
+           break;
+        }
         /* mwait info: needed for Core compatibility */
         EAX = 0; /* Smallest monitor-line size in bytes */
         EBX = 0; /* Largest monitor-line size in bytes */
@@ -1957,6 +1976,10 @@ void helper_cpuid(void)
         EDX = 0;
         break;
     case 6:
+        if (env->cpu_host_cpu)  {
+           host_cpuid(function, 0, &EAX, &EBX, &ECX, &EDX);
+           break;
+        }
         /* Thermal and Power Leaf */
         EAX = 0;
         EBX = 0;
@@ -1965,6 +1988,10 @@ void helper_cpuid(void)
         break;
     case 9:
         /* Direct Cache Access Information Leaf */
+        if (env->cpu_host_cpu)  {
+           host_cpuid(function, 0, &EAX, &EBX, &ECX, &EDX);
+           break;
+        }
         EAX = 0; /* Bits 0-31 in DCA_CAP MSR */
         EBX = 0;
         ECX = 0;
@@ -1972,6 +1999,10 @@ void helper_cpuid(void)
         break;
     case 0xA:
         /* Architectural Performance Monitoring Leaf */
+        if (env->cpu_host_cpu)  {
+           host_cpuid(function, 0, &EAX, &EBX, &ECX, &EDX);
+           break;
+        }
         EAX = 0;
         EBX = 0;
         ECX = 0;
@@ -1992,10 +2023,10 @@ void helper_cpuid(void)
     case 0x80000002:
     case 0x80000003:
     case 0x80000004:
-        EAX = env->cpuid_model[(index - 0x80000002) * 4 + 0];
-        EBX = env->cpuid_model[(index - 0x80000002) * 4 + 1];
-        ECX = env->cpuid_model[(index - 0x80000002) * 4 + 2];
-        EDX = env->cpuid_model[(index - 0x80000002) * 4 + 3];
+        EAX = env->cpuid_model[(function - 0x80000002) * 4 + 0];
+        EBX = env->cpuid_model[(function - 0x80000002) * 4 + 1];
+        ECX = env->cpuid_model[(function - 0x80000002) * 4 + 2];
+        EDX = env->cpuid_model[(function - 0x80000002) * 4 + 3];
         break;
     case 0x80000005:
         /* cache info (L1 cache) */
@@ -2006,6 +2037,10 @@ void helper_cpuid(void)
         break;
     case 0x80000006:
         /* cache info (L2 cache) */
+        if (env->cpu_host_cpu)  {
+           host_cpuid(function, 0, &EAX, &EBX, &ECX, &EDX);
+           break;
+        }
         EAX = 0;
         EBX = 0x42004200;
         ECX = 0x02008140;
@@ -2034,12 +2069,20 @@ void helper_cpuid(void)
         EDX = 0;
         break;
     case 0x8000000A:
+        if (env->cpu_host_cpu)  {
+           host_cpuid(function, 0, &EAX, &EBX, &ECX, &EDX);
+           break;
+        }
         EAX = 0x00000001;
         EBX = 0;
         ECX = 0;
         EDX = 0;
         break;
     default:
+        if (env->cpu_host_cpu)  {
+           host_cpuid(function, ECX, &EAX, &EBX, &ECX, &EDX);
+           break;
+	}
         /* reserved values: zero */
         EAX = 0;
         EBX = 0;

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: thread/core siblings info for guests
  2008-10-17 22:45   ` Nitin A Kamble
@ 2008-10-18  0:18     ` Alexander Graf
  2008-10-20 16:46       ` Kamble, Nitin A
  2008-10-19  9:29     ` Avi Kivity
  1 sibling, 1 reply; 24+ messages in thread
From: Alexander Graf @ 2008-10-18  0:18 UTC (permalink / raw)
  To: nitin.a.kamble; +Cc: Avi Kivity, kvm@vger.kernel.org, kraxel, chrisw

Hi Nitin,

first of all thanks for taking on that work - it's something that's  
direly necessary and I appreciate what you're doing.

On 18.10.2008, at 00:45, Nitin A Kamble wrote:

> Hi Avi,
> Attached is a patch to handle the "host mode" as you mentioned  
> earlier.
>
> It works this way:
>   Qemu's -cpu option is extended to take a new paramenter: -cpu  
> _host_.
> With that parameter the cpuid bits are collected from the host cpu.  
> And
> few bits of cleared to get it working with the kvm.
>   I tested the patch with Windows & Linux guests, and I see the host
> topology is propagating into the guest. With that windows server is  
> not
> imposing the incorrect licensing restriction to guest, as it was doing
> earlier. With the patch I could bring up the windows server guest with
> 16 cpus on kvm.

This sounds like qemu-only source to me. Getting the cpuid from the  
host is nothing that involves the KVM interface whatsoever - even  
though I have to admit that it's only useful when used with opcode  
passthrough to the CPU, but that could also happen with kqemu.

> I also fixed implementation of indexed cpuid leaves (0x4 & 0xb) in
> qemu. And also some code cleanup are there for consistent  
> terminology to
> have more readable code,

Please - don't mix cleanup with functional changes. The patch becomes  
unreadable - and I don't see any problems with leave 4 (b is not  
implemented, right?) in qemu. The qemu<->kvm cpuid interface is  
broken, yes. But that's two completely separate problems: CPUID bits  
being wrong in qemu should go to the qemu mailinglist, while libkvm  
breakage should be handled on the kvm list IMHO.

>  Please check-in the patch in kvm-usermode.git tree. There are no
> changes for the kernel side or kvm.git tree for this.
>
> Signed-Off-By: Nitin A Kamble <nitin.a.kamble@intel.com>
>
> Next things:
> * Default mode: I am not planning to do anything for this. I see the
> current situation is good enough for the default mode.
>
> * Managed mode: For this, I am thinking few things.
>    1. Either an option can be added to qemu to take (almost) all of
>       it's cpuid bits from a file.
>    2. Or use command line options to modify particular bits of guest
>       cpuid.
>    3. Or do the both.
>
> What do you think about the managed mode implementation? you probably
> have better idea regarding how and what kind of tools will be managing
> it.

Well, the case where you really need CPUID feature hiding is when you  
have a datacenter with a lot of different hosts in it. Then the  
management software would need to have a way to tell qemu the least  
common set of bits available throughout the whole center. Sounds like  
we really need config files ;-).

Some more comments:

- top-posting is evil ;-)
- please inline patches - it's hard to comment on any code like that  
(at least with my mail client)

So thanks to the lack of inlining I'll try to explain some rough bits  
on what I grasped while looking through the qemu parts of the patch:

1. void helper_cpuid(void)

You put a call like

+        if (env->cpu_host_cpu)  {
+           host_cpuid(function, ECX, &EAX, &EBX, &ECX, &EDX);
+           break;
+        }

in almost every CPUID leaf. I don't think that's the way to go -  
there's got to be some more generic way to do what you're trying to  
achieve. Maybe just replace the whole helper_cpuid function with  
something special for the host case?

2. index->function cleanup

Is there any way you rename the variable apart from personal taste? I  
don't really like changes like that - they just break other's patches  
and it's not that hard to understand what index means here.

3. Do not use tabs in qemu code

4. The identifier "_host_" is ... arguable. I don't have any strong  
opinion on what to choose here, but the underscores look strange.

5.

+void host_cpuid(uint32_t function, uint32_t index, uint32_t *eax,  
uint32_t *ebx,
+		       uint32_t *ecx, uint32_t *edx);
+

Are you seriously defining a function stub in the middle of a c file?

--

Sorry if I'm too harsh - usually I'm nicer when reviewing patches

Alex

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: thread/core siblings info for guests
  2008-10-17 22:45   ` Nitin A Kamble
  2008-10-18  0:18     ` Alexander Graf
@ 2008-10-19  9:29     ` Avi Kivity
  2008-10-20 18:37       ` Nitin A Kamble
  2008-10-31 16:47       ` Nitin A Kamble
  1 sibling, 2 replies; 24+ messages in thread
From: Avi Kivity @ 2008-10-19  9:29 UTC (permalink / raw)
  To: nitin.a.kamble; +Cc: kvm@vger.kernel.org, kraxel, chrisw

Nitin A Kamble wrote:
> Hi Avi,
>  Attached is a patch to handle the "host mode" as you mentioned earlier.
>
> It works this way:
>    Qemu's -cpu option is extended to take a new paramenter: -cpu _host_.
> With that parameter the cpuid bits are collected from the host cpu. And
> few bits of cleared to get it working with the kvm.
>    I tested the patch with Windows & Linux guests, and I see the host
> topology is propagating into the guest. With that windows server is not
> imposing the incorrect licensing restriction to guest, as it was doing
> earlier. With the patch I could bring up the windows server guest with
> 16 cpus on kvm.
>   I also fixed implementation of indexed cpuid leaves (0x4 & 0xb) in
> qemu. And also some code cleanup are there for consistent terminology to
> have more readable code, 
>
>   Please check-in the patch in kvm-usermode.git tree. There are no
> changes for the kernel side or kvm.git tree for this.
>
> Signed-Off-By: Nitin A Kamble <nitin.a.kamble@intel.com>
>
> Next things:
>  * Default mode: I am not planning to do anything for this. I see the
> current situation is good enough for the default mode.
>
>  * Managed mode: For this, I am thinking few things. 
>     1. Either an option can be added to qemu to take (almost) all of 
>        it's cpuid bits from a file.
>     2. Or use command line options to modify particular bits of guest  
>        cpuid.
>     3. Or do the both.
>
> What do you think about the managed mode implementation? you probably
> have better idea regarding how and what kind of tools will be managing
> it.
>
>   

> +static int fill_x86_defs_for_host(CPUX86State *env, x86_def_t * def)
> +{
> +    uint32 eax, ebx, ecx, edx;
> +
> +    env->cpu_host_cpu = 1;
> +
> +    host_cpuid(0, 0, &eax, &ebx, &ecx, &edx);
> +    env->cpuid_level = eax;
> +    env->cpuid_vendor1 = ebx;
> +    env->cpuid_vendor2 = ecx;
> +    env->cpuid_vendor3 = edx;
> +
> +    host_cpuid(1, 0, &eax, &ebx, &ecx, &edx);
> +    env->cpuid_version = eax;
> +    env->cpuid_features = edx & 0xdfffffff; /* disable Thermal Monitro */
> +    env->cpuid_ext_features = ecx;
> +
> +    host_cpuid(0x80000000, 0, &eax, &ebx, &ecx, &edx);
> +    env->cpuid_xlevel = eax;
> +
> +    host_cpuid(0x80000001, 0, &eax, &ebx, &ecx, &edx);
> +    env->cpuid_ext3_features = ecx;
> +    env->cpuid_ext2_features = edx & 0xf7ffffff; /* disable rdtscp */
> +
> +    host_cpuid(0x80000002, 0, &eax, &ebx, &ecx, &edx);
> +    env->cpuid_model[0] = eax;
> +    env->cpuid_model[1] = ebx;
> +    env->cpuid_model[2] = ecx;
> +    env->cpuid_model[3] = edx;
> +
> +    host_cpuid(0x80000003, 0, &eax, &ebx, &ecx, &edx);
> +    env->cpuid_model[4] = eax;
> +    env->cpuid_model[5] = ebx;
> +    env->cpuid_model[6] = ecx;
> +    env->cpuid_model[7] = edx;
> +
> +    host_cpuid(0x80000004, 0, &eax, &ebx, &ecx, &edx);
> +    env->cpuid_model[8] = eax;
> +    env->cpuid_model[9] = ebx;
> +    env->cpuid_model[10] = ecx;
> +    env->cpuid_model[11] = edx;
> +
> +    return 0;
> +}
> +

There is already a kernel function for this (KVM_GET_SUPPORTED_CPUID, 
IIRC) that does this.  It has the advantage of knowing which bits the 
kernel supports, so there's no need to keep the kernel and userspace at 
the same version.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: thread/core siblings info for guests
  2008-10-18  0:18     ` Alexander Graf
@ 2008-10-20 16:46       ` Kamble, Nitin A
  0 siblings, 0 replies; 24+ messages in thread
From: Kamble, Nitin A @ 2008-10-20 16:46 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Avi Kivity, kvm@vger.kernel.org, kraxel@redhat.com,
	chrisw@sous-sol.org

Hi Alex,
   Thanks a lot for your detailed review. I think I can do some changes as per your lines, which will make it friendlier for the qemu tree merge. And Avi also has some comments, which may need change in userland code. So hold on and I will cut another patch soon.

Thanks & Regards,
Nitin
Linux Open Source Technology Center, Intel Corporation
---------------------------------------------------------------------------
The Mind is like a parachute; it works much better when it's open.

>-----Original Message-----
>From: Alexander Graf [mailto:agraf@suse.de]
>Sent: Friday, October 17, 2008 5:19 PM
>To: Kamble, Nitin A
>Cc: Avi Kivity; kvm@vger.kernel.org; kraxel@redhat.com; chrisw@sous-sol.org
>Subject: Re: thread/core siblings info for guests
>
>Hi Nitin,
>
>first of all thanks for taking on that work - it's something that's
>direly necessary and I appreciate what you're doing.
>
>On 18.10.2008, at 00:45, Nitin A Kamble wrote:
>
>> Hi Avi,
>> Attached is a patch to handle the "host mode" as you mentioned
>> earlier.
>>
>> It works this way:
>>   Qemu's -cpu option is extended to take a new paramenter: -cpu
>> _host_.
>> With that parameter the cpuid bits are collected from the host cpu.
>> And
>> few bits of cleared to get it working with the kvm.
>>   I tested the patch with Windows & Linux guests, and I see the host
>> topology is propagating into the guest. With that windows server is
>> not
>> imposing the incorrect licensing restriction to guest, as it was doing
>> earlier. With the patch I could bring up the windows server guest with
>> 16 cpus on kvm.
>
>This sounds like qemu-only source to me. Getting the cpuid from the
>host is nothing that involves the KVM interface whatsoever - even
>though I have to admit that it's only useful when used with opcode
>passthrough to the CPU, but that could also happen with kqemu.
>
>> I also fixed implementation of indexed cpuid leaves (0x4 & 0xb) in
>> qemu. And also some code cleanup are there for consistent
>> terminology to
>> have more readable code,
>
>Please - don't mix cleanup with functional changes. The patch becomes
>unreadable - and I don't see any problems with leave 4 (b is not
>implemented, right?) in qemu. The qemu<->kvm cpuid interface is
>broken, yes. But that's two completely separate problems: CPUID bits
>being wrong in qemu should go to the qemu mailinglist, while libkvm
>breakage should be handled on the kvm list IMHO.
>
>>  Please check-in the patch in kvm-usermode.git tree. There are no
>> changes for the kernel side or kvm.git tree for this.
>>
>> Signed-Off-By: Nitin A Kamble <nitin.a.kamble@intel.com>
>>
>> Next things:
>> * Default mode: I am not planning to do anything for this. I see the
>> current situation is good enough for the default mode.
>>
>> * Managed mode: For this, I am thinking few things.
>>    1. Either an option can be added to qemu to take (almost) all of
>>       it's cpuid bits from a file.
>>    2. Or use command line options to modify particular bits of guest
>>       cpuid.
>>    3. Or do the both.
>>
>> What do you think about the managed mode implementation? you probably
>> have better idea regarding how and what kind of tools will be managing
>> it.
>
>Well, the case where you really need CPUID feature hiding is when you
>have a datacenter with a lot of different hosts in it. Then the
>management software would need to have a way to tell qemu the least
>common set of bits available throughout the whole center. Sounds like
>we really need config files ;-).
>
>Some more comments:
>
>- top-posting is evil ;-)
>- please inline patches - it's hard to comment on any code like that
>(at least with my mail client)
>
>So thanks to the lack of inlining I'll try to explain some rough bits
>on what I grasped while looking through the qemu parts of the patch:
>
>1. void helper_cpuid(void)
>
>You put a call like
>
>+        if (env->cpu_host_cpu)  {
>+           host_cpuid(function, ECX, &EAX, &EBX, &ECX, &EDX);
>+           break;
>+        }
>
>in almost every CPUID leaf. I don't think that's the way to go -
>there's got to be some more generic way to do what you're trying to
>achieve. Maybe just replace the whole helper_cpuid function with
>something special for the host case?
>
>2. index->function cleanup
>
>Is there any way you rename the variable apart from personal taste? I
>don't really like changes like that - they just break other's patches
>and it's not that hard to understand what index means here.
>
>3. Do not use tabs in qemu code
>
>4. The identifier "_host_" is ... arguable. I don't have any strong
>opinion on what to choose here, but the underscores look strange.
>
>5.
>
>+void host_cpuid(uint32_t function, uint32_t index, uint32_t *eax,
>uint32_t *ebx,
>+                      uint32_t *ecx, uint32_t *edx);
>+
>
>Are you seriously defining a function stub in the middle of a c file?
>
>--
>
>Sorry if I'm too harsh - usually I'm nicer when reviewing patches
>
>Alex

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: thread/core siblings info for guests
  2008-10-19  9:29     ` Avi Kivity
@ 2008-10-20 18:37       ` Nitin A Kamble
  2008-10-21  8:52         ` Avi Kivity
  2008-10-31 16:47       ` Nitin A Kamble
  1 sibling, 1 reply; 24+ messages in thread
From: Nitin A Kamble @ 2008-10-20 18:37 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm@vger.kernel.org, kraxel@redhat.com, chrisw@sous-sol.org

[-- Attachment #1: Type: text/plain, Size: 800 bytes --]

Hi Avi,
 I do find that ioctl in the kvm kernel module code. and the qemu code
can do the ioctl to get the cpuinfo bits. But seems like nobody is using
this ioctl yet. What was the purpose of this code? 


On Sun, 2008-10-19 at 02:29 -0700, Avi Kivity wrote:
> There is already a kernel function for this (KVM_GET_SUPPORTED_CPUID,
> IIRC) that does this.  It has the advantage of knowing which bits the
> kernel supports, so there's no need to keep the kernel and userspace at
> the same version.
> 
> --
> error compiling committee.c: too many arguments to function
> 
-- 
Thanks & Regards,
Nitin
Open Source Technology Center, Intel Corporation
-----------------------------------------------------------------
The mind is like a parachute; it works much better when it's open

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: thread/core siblings info for guests
  2008-10-20 18:37       ` Nitin A Kamble
@ 2008-10-21  8:52         ` Avi Kivity
  0 siblings, 0 replies; 24+ messages in thread
From: Avi Kivity @ 2008-10-21  8:52 UTC (permalink / raw)
  To: nitin.a.kamble
  Cc: kvm@vger.kernel.org, kraxel@redhat.com, chrisw@sous-sol.org

Nitin A Kamble wrote:
> Hi Avi,
>  I do find that ioctl in the kvm kernel module code. and the qemu code
> can do the ioctl to get the cpuinfo bits. But seems like nobody is using
> this ioctl yet. What was the purpose of this code? 
>
>   

It's for -cpu host and for implementing greatest common denominator type 
of calculations.  Qemu doesn't use it yet.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: thread/core siblings info for guests
  2008-10-19  9:29     ` Avi Kivity
  2008-10-20 18:37       ` Nitin A Kamble
@ 2008-10-31 16:47       ` Nitin A Kamble
  2008-11-04 14:59         ` Avi Kivity
  1 sibling, 1 reply; 24+ messages in thread
From: Nitin A Kamble @ 2008-10-31 16:47 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm@vger.kernel.org, kraxel@redhat.com, chrisw@sous-sol.org


[-- Attachment #1.1: Type: text/plain, Size: 678 bytes --]

Avi & Alex,
  Thanks for your suggestions. Now I have 2 attached patches to export
the host cpuid data to guests. 
  It reuses the ioctl KVM_GET_SUPPORTED_CPUID as Avi pointed out. Also
as per Alex's suggestions the changes have been simplified for easier
merge with the qemu tree.

the kvm_kernel_patch is for the kvm.git tree.
And the kvm_userspace_patch is for the kvm-userspace.git tree.

Signed-Off-By: Nitin A Kamble <nitin.a.kamble@intel.com>

-- 
Thanks & Regards,
Nitin
Open Source Technology Center, Intel Corporation
-----------------------------------------------------------------
The mind is like a parachute; it works much better when it's open

[-- Attachment #1.2: kvm_kernel_patch3.diff --]
[-- Type: text/x-patch, Size: 6238 bytes --]

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 09e6c56..e50db11 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -86,7 +86,7 @@
 #define KVM_NUM_MMU_PAGES (1 << KVM_MMU_HASH_SHIFT)
 #define KVM_MIN_FREE_MMU_PAGES 5
 #define KVM_REFILL_PAGES 25
-#define KVM_MAX_CPUID_ENTRIES 40
+#define KVM_MAX_CPUID_ENTRIES 100
 #define KVM_NR_FIXED_MTRR_REGION 88
 #define KVM_NR_VAR_MTRR 8
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 38f79b6..91187a2 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1194,36 +1194,14 @@ static void do_cpuid_1_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 			 u32 index, int *nent, int maxnent)
 {
-	const u32 kvm_supported_word0_x86_features = bit(X86_FEATURE_FPU) |
-		bit(X86_FEATURE_VME) | bit(X86_FEATURE_DE) |
-		bit(X86_FEATURE_PSE) | bit(X86_FEATURE_TSC) |
-		bit(X86_FEATURE_MSR) | bit(X86_FEATURE_PAE) |
-		bit(X86_FEATURE_CX8) | bit(X86_FEATURE_APIC) |
-		bit(X86_FEATURE_SEP) | bit(X86_FEATURE_PGE) |
-		bit(X86_FEATURE_CMOV) | bit(X86_FEATURE_PSE36) |
-		bit(X86_FEATURE_CLFLSH) | bit(X86_FEATURE_MMX) |
-		bit(X86_FEATURE_FXSR) | bit(X86_FEATURE_XMM) |
-		bit(X86_FEATURE_XMM2) | bit(X86_FEATURE_SELFSNOOP);
-	const u32 kvm_supported_word1_x86_features = bit(X86_FEATURE_FPU) |
-		bit(X86_FEATURE_VME) | bit(X86_FEATURE_DE) |
-		bit(X86_FEATURE_PSE) | bit(X86_FEATURE_TSC) |
-		bit(X86_FEATURE_MSR) | bit(X86_FEATURE_PAE) |
-		bit(X86_FEATURE_CX8) | bit(X86_FEATURE_APIC) |
-		bit(X86_FEATURE_PGE) |
-		bit(X86_FEATURE_CMOV) | bit(X86_FEATURE_PSE36) |
-		bit(X86_FEATURE_MMX) | bit(X86_FEATURE_FXSR) |
-		bit(X86_FEATURE_SYSCALL) |
-		(bit(X86_FEATURE_NX) && is_efer_nx()) |
-#ifdef CONFIG_X86_64
+	const u32 kvm_unsupported_word0_x86_features = bit(X86_FEATURE_ACC);
+	const u32 kvm_unsupported_word1_x86_features = bit(X86_FEATURE_RDTSCP) |
+#ifndef CONFIG_X86_64
 		bit(X86_FEATURE_LM) |
 #endif
-		bit(X86_FEATURE_MMXEXT) |
-		bit(X86_FEATURE_3DNOWEXT) |
-		bit(X86_FEATURE_3DNOW);
-	const u32 kvm_supported_word3_x86_features =
-		bit(X86_FEATURE_XMM3) | bit(X86_FEATURE_CX16);
-	const u32 kvm_supported_word6_x86_features =
-		bit(X86_FEATURE_LAHF_LM) | bit(X86_FEATURE_CMP_LEGACY);
+                0;
+	const u32 kvm_unsupported_word3_x86_features = 0;
+	const u32 kvm_unsupported_word6_x86_features = 0;
 
 	/* all func 2 cpuid_count() should be called on the same cpu */
 	get_cpu();
@@ -1235,8 +1213,8 @@ static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 		entry->eax = min(entry->eax, (u32)0xb);
 		break;
 	case 1:
-		entry->edx &= kvm_supported_word0_x86_features;
-		entry->ecx &= kvm_supported_word3_x86_features;
+		entry->edx &= ~(kvm_unsupported_word0_x86_features);
+		entry->ecx &= ~(kvm_unsupported_word3_x86_features);
 		break;
 	/* function 2 entries are STATEFUL. That is, repeated cpuid commands
 	 * may return different values. This forces us to get_cpu() before
@@ -1246,6 +1224,7 @@ static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 		int t, times = entry->eax & 0xff;
 
 		entry->flags |= KVM_CPUID_FLAG_STATEFUL_FUNC;
+		entry->flags |= KVM_CPUID_FLAG_STATE_READ_NEXT;
 		for (t = 1; t < times && *nent < maxnent; ++t) {
 			do_cpuid_1_ent(&entry[t], function, 0);
 			entry[t].flags |= KVM_CPUID_FLAG_STATEFUL_FUNC;
@@ -1276,7 +1255,7 @@ static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 		entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
 		/* read more entries until level_type is zero */
 		for (i = 1; *nent < maxnent; ++i) {
-			level_type = entry[i - 1].ecx & 0xff;
+			level_type = entry[i - 1].ecx & 0xff00;
 			if (!level_type)
 				break;
 			do_cpuid_1_ent(&entry[i], function, i);
@@ -1290,8 +1269,8 @@ static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 		entry->eax = min(entry->eax, 0x8000001a);
 		break;
 	case 0x80000001:
-		entry->edx &= kvm_supported_word1_x86_features;
-		entry->ecx &= kvm_supported_word6_x86_features;
+		entry->edx &= ~(kvm_unsupported_word1_x86_features);
+		entry->ecx &= ~(kvm_unsupported_word6_x86_features);
 		break;
 	}
 	put_cpu();
@@ -1302,10 +1281,14 @@ static int kvm_dev_ioctl_get_supported_cpuid(struct kvm_cpuid2 *cpuid,
 {
 	struct kvm_cpuid_entry2 *cpuid_entries;
 	int limit, nent = 0, r = -E2BIG;
+	int sizer = 0;
 	u32 func;
 
-	if (cpuid->nent < 1)
-		goto out;
+	if (cpuid->nent == 0) {
+		sizer = 1;
+		cpuid->nent = KVM_MAX_CPUID_ENTRIES; 
+        }
+
 	r = -ENOMEM;
 	cpuid_entries = vmalloc(sizeof(struct kvm_cpuid_entry2) * cpuid->nent);
 	if (!cpuid_entries)
@@ -1326,9 +1309,11 @@ static int kvm_dev_ioctl_get_supported_cpuid(struct kvm_cpuid2 *cpuid,
 		do_cpuid_ent(&cpuid_entries[nent], func, 0,
 			       &nent, cpuid->nent);
 	r = -EFAULT;
-	if (copy_to_user(entries, cpuid_entries,
+	if (!sizer) {
+		if (copy_to_user(entries, cpuid_entries,
 			nent * sizeof(struct kvm_cpuid_entry2)))
-		goto out_free;
+			goto out_free;
+        }
 	cpuid->nent = nent;
 	r = 0;
 
@@ -2801,7 +2786,7 @@ static int move_to_next_stateful_cpuid_entry(struct kvm_vcpu *vcpu, int i)
 
 	e->flags &= ~KVM_CPUID_FLAG_STATE_READ_NEXT;
 	/* when no next entry is found, the current entry[i] is reselected */
-	for (j = i + 1; j == i; j = (j + 1) % nent) {
+	for (j = i + 1; ; j = (j + 1) % nent) {
 		struct kvm_cpuid_entry2 *ej = &vcpu->arch.cpuid_entries[j];
 		if (ej->function == e->function) {
 			ej->flags |= KVM_CPUID_FLAG_STATE_READ_NEXT;
@@ -2860,6 +2845,13 @@ void kvm_emulate_cpuid(struct kvm_vcpu *vcpu)
 		kvm_register_write(vcpu, VCPU_REGS_RCX, best->ecx);
 		kvm_register_write(vcpu, VCPU_REGS_RDX, best->edx);
 	}
+	if (function == 0x1) { /* set the per-cpu apicid */
+		u32 ebx = kvm_register_read(vcpu, VCPU_REGS_RBX);
+		ebx &= 0x00ffffff;
+		ebx |= (vcpu->vcpu_id << 24);
+		kvm_register_write(vcpu, VCPU_REGS_RBX, ebx);
+	}
+          
 	kvm_x86_ops->skip_emulated_instruction(vcpu);
 	KVMTRACE_5D(CPUID, vcpu, function,
 		    (u32)kvm_register_read(vcpu, VCPU_REGS_RAX),

[-- Attachment #1.3: kvm_userspace_patch10.diff --]
[-- Type: text/x-patch, Size: 11671 bytes --]

diff --git a/libkvm/libkvm-x86.c b/libkvm/libkvm-x86.c
index a8cca15..7aafa20 100644
--- a/libkvm/libkvm-x86.c
+++ b/libkvm/libkvm-x86.c
@@ -379,6 +379,34 @@ int kvm_set_msrs(kvm_context_t kvm, int vcpu, struct kvm_msr_entry *msrs,
     return r;
 }
 
+/*
+ * Returns available host cpuid entries.  User must free.
+ */
+struct kvm_cpuid2 *kvm_get_host_cpuid_entries(kvm_context_t kvm)
+{
+	struct kvm_cpuid2 sizer, *cpuids;
+	int r, e;
+
+	sizer.nent = 0;
+	r = ioctl(kvm->fd, KVM_GET_SUPPORTED_CPUID, &sizer);
+	if (r == -1 && errno != E2BIG)
+		return NULL;
+	cpuids = malloc(sizeof *cpuids + sizer.nent * sizeof *cpuids->entries);
+	if (!cpuids) {
+		errno = ENOMEM;
+		return NULL;
+	}
+	cpuids->nent = sizer.nent;
+	r = ioctl(kvm->fd, KVM_GET_SUPPORTED_CPUID, cpuids);
+	if (r == -1) {
+		e = errno;
+		free(cpuids);
+		errno = e;
+		return NULL;
+	}
+	return cpuids;
+}
+
 static void print_seg(FILE *file, const char *name, struct kvm_segment *seg)
 {
     	fprintf(stderr,
@@ -458,9 +486,9 @@ __u64 kvm_get_cr8(kvm_context_t kvm, int vcpu)
 }
 
 int kvm_setup_cpuid(kvm_context_t kvm, int vcpu, int nent,
-		    struct kvm_cpuid_entry *entries)
+		    struct kvm_cpuid_entry2 *entries)
 {
-	struct kvm_cpuid *cpuid;
+	struct kvm_cpuid2 *cpuid;
 	int r;
 
 	cpuid = malloc(sizeof(*cpuid) + nent * sizeof(*entries));
@@ -469,7 +497,7 @@ int kvm_setup_cpuid(kvm_context_t kvm, int vcpu, int nent,
 
 	cpuid->nent = nent;
 	memcpy(cpuid->entries, entries, nent * sizeof(*entries));
-	r = ioctl(kvm->vcpu_fd[vcpu], KVM_SET_CPUID, cpuid);
+	r = ioctl(kvm->vcpu_fd[vcpu], KVM_SET_CPUID2, cpuid);
 
 	free(cpuid);
 	return r;
diff --git a/libkvm/libkvm.h b/libkvm/libkvm.h
index 423ce31..f84d524 100644
--- a/libkvm/libkvm.h
+++ b/libkvm/libkvm.h
@@ -27,6 +27,9 @@ typedef struct kvm_context *kvm_context_t;
 struct kvm_msr_list *kvm_get_msr_list(kvm_context_t);
 int kvm_get_msrs(kvm_context_t, int vcpu, struct kvm_msr_entry *msrs, int n);
 int kvm_set_msrs(kvm_context_t, int vcpu, struct kvm_msr_entry *msrs, int n);
+struct kvm_cpuid2 *kvm_get_host_cpuid_entries(kvm_context_t);
+void get_host_cpuid_entry(uint32_t function, uint32_t index, 
+                struct kvm_cpuid_entry2 * e);
 #endif
 
 /*!
@@ -374,7 +377,7 @@ int kvm_guest_debug(kvm_context_t, int vcpu, struct kvm_debug_guest *dbg);
  * \return 0 on success, or -errno on error
  */
 int kvm_setup_cpuid(kvm_context_t kvm, int vcpu, int nent,
-		    struct kvm_cpuid_entry *entries);
+		    struct kvm_cpuid_entry2 *entries);
 
 /*!
  * \brief Setting the number of shadow pages to be allocated to the vm
diff --git a/qemu/qemu-kvm-x86.c b/qemu/qemu-kvm-x86.c
index bf62e18..b776ebf 100644
--- a/qemu/qemu-kvm-x86.c
+++ b/qemu/qemu-kvm-x86.c
@@ -21,6 +21,7 @@
 #define MSR_IA32_TSC		0x10
 
 static struct kvm_msr_list *kvm_msr_list;
+static struct kvm_cpuid2 *kvm_host_cpuid_entries;
 extern unsigned int kvm_shadow_memory;
 extern kvm_context_t kvm_context;
 static int kvm_has_msr_star;
@@ -52,11 +53,17 @@ int kvm_arch_qemu_create_context(void)
 
     kvm_msr_list = kvm_get_msr_list(kvm_context);
     if (!kvm_msr_list)
-		return -1;
+        return -1;
+
     for (i = 0; i < kvm_msr_list->nmsrs; ++i)
-	if (kvm_msr_list->indices[i] == MSR_STAR)
-	    kvm_has_msr_star = 1;
-	return 0;
+        if (kvm_msr_list->indices[i] == MSR_STAR)
+        kvm_has_msr_star = 1;
+
+    kvm_host_cpuid_entries = kvm_get_host_cpuid_entries(kvm_context);
+    if (!kvm_host_cpuid_entries)
+        return -1;
+
+    return 0;
 }
 
 static void set_msr_entry(struct kvm_msr_entry *entry, uint32_t index,
@@ -476,13 +483,61 @@ static void host_cpuid(uint32_t function, uint32_t *eax, uint32_t *ebx,
 	*edx = vec[3];
 }
 
+void get_host_cpuid_entry(uint32_t function, uint32_t index, 
+                struct kvm_cpuid_entry2 * e)
+{
+    int i;
+    struct kvm_cpuid_entry2 *entries;
+
+    memset(e, 0, (sizeof *e));
+    e->function = function;
+    e->index = index;
+    
+    if (!kvm_host_cpuid_entries) 
+        return;
+
+    entries = kvm_host_cpuid_entries->entries;
+
+    for (i=0; i<kvm_host_cpuid_entries->nent; i++) {
+        struct kvm_cpuid_entry2 *ent = &entries[i];
+        if (ent->function != function)
+           continue; 
+        if ((ent->flags & KVM_CPUID_FLAG_SIGNIFCANT_INDEX) && 
+                (ent->index != index))
+           continue;
+        if ((ent->flags & KVM_CPUID_FLAG_STATEFUL_FUNC) &&
+                !(ent->flags & KVM_CPUID_FLAG_STATE_READ_NEXT))
+           continue;
+
+        memcpy(e, ent, sizeof (*e));
+
+        if (ent->flags & KVM_CPUID_FLAG_STATEFUL_FUNC) {
+                int j;
+		ent->flags &= ~KVM_CPUID_FLAG_STATE_READ_NEXT;
+                for (j=i+1; ; j=(j+1)%(kvm_host_cpuid_entries->nent)) {
+                    struct kvm_cpuid_entry2 *entj = &entries[j];
+                    if (entj->function == ent->function) {
+                        entj->flags |= KVM_CPUID_FLAG_STATE_READ_NEXT;
+                        break;
+                    }
+                }
+        }
+        break;
+    }
+}
 
-static void do_cpuid_ent(struct kvm_cpuid_entry *e, uint32_t function,
-			 CPUState *env)
+static void do_cpuid_ent(struct kvm_cpuid_entry2 *e, uint32_t function,
+			 uint32_t index, CPUState *env)
 {
+    if (env->cpuid_host_cpu) {
+        get_host_cpuid_entry(function, index, e);
+	return;
+    }
+    e->function = function;
+    e->index = index;
     env->regs[R_EAX] = function;
+    env->regs[R_ECX] = index;
     qemu_kvm_cpuid_on_env(env);
-    e->function = function;
     e->eax = env->regs[R_EAX];
     e->ebx = env->regs[R_EBX];
     e->ecx = env->regs[R_ECX];
@@ -521,6 +576,11 @@ static void do_cpuid_ent(struct kvm_cpuid_entry *e, uint32_t function,
     if (function == 1)
 	e->ecx |= (1u << 31);
 
+    if ((function == 4) || (function == 0xb)) 
+       e->flags = KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
+    else
+       e->flags = 0;
+
     // 3dnow isn't properly emulated yet
     if (function == 0x80000001)
 	e->edx &= ~0xc0000000;
@@ -559,17 +619,30 @@ static int get_para_features(kvm_context_t kvm_context)
 
 int kvm_arch_qemu_init_env(CPUState *cenv)
 {
-    struct kvm_cpuid_entry cpuid_ent[100];
+    struct kvm_cpuid_entry2 *cpuid_ent, entry, *e;
+    int cpuid_nent = 0, malloc_size = 0;
+    CPUState copy;
+    uint32_t i, limit; 
 #ifdef KVM_CPUID_SIGNATURE
-    struct kvm_cpuid_entry *pv_ent;
+    struct kvm_cpuid_entry2 *pv_ent;
     uint32_t signature[3];
+
+    malloc_size += 2;
 #endif
-    int cpuid_nent = 0;
-    CPUState copy;
-    uint32_t i, limit;
 
     copy = *cenv;
 
+    if (copy.cpuid_host_cpu) {
+        if (!kvm_host_cpuid_entries)
+	   return -EINVAL;
+        malloc_size += kvm_host_cpuid_entries->nent;
+    } else
+           malloc_size += 100;
+
+    cpuid_ent = malloc(malloc_size * sizeof (struct kvm_cpuid_entry2));
+    if (!cpuid_ent)
+       return -ENOMEM;
+
 #ifdef KVM_CPUID_SIGNATURE
     /* Paravirtualization CPUIDs */
     memcpy(signature, "KVMKVMKVM", 12);
@@ -587,21 +660,48 @@ int kvm_arch_qemu_init_env(CPUState *cenv)
     pv_ent->eax = get_para_features(kvm_context);
 #endif
 
-    copy.regs[R_EAX] = 0;
-    qemu_kvm_cpuid_on_env(&copy);
-    limit = copy.regs[R_EAX];
+    limit = copy.cpuid_level;
+    for (i=0; ((i<2) && (i<limit)) ; i++) {
+        e = &cpuid_ent[cpuid_nent++];
+	do_cpuid_ent(e, i, 0, &copy);
+    }
+
+    if (limit >= 2) { /* get the multiple stateful leaf values */
+        do_cpuid_ent(&entry, 2, 0, &copy);
+	cpuid_ent[cpuid_nent++] = entry;
+        for (i = 1; i<(entry.eax & 0xff); i++) {
+            e = &cpuid_ent[cpuid_nent++];
+            do_cpuid_ent(e, 2, 0, &copy);
+        } 
+    }
 
-    for (i = 0; i <= limit; ++i)
-	do_cpuid_ent(&cpuid_ent[cpuid_nent++], i, &copy);
+    for (i = 3; i <= limit; i++) {
+        e = &cpuid_ent[cpuid_nent++];
+	do_cpuid_ent(e, i, 0, &copy);
+    }
 
-    copy.regs[R_EAX] = 0x80000000;
-    qemu_kvm_cpuid_on_env(&copy);
-    limit = copy.regs[R_EAX];
+    if (limit >= 4) { /* get the per index values */
+        int i = 1;
+        do {
+            e = &cpuid_ent[cpuid_nent++];
+            do_cpuid_ent(e, 4, i++, &copy);
+        } while(e->eax & 0x1f); /* until the last index */
+    }
+
+    if (limit >= 0xb) { /* get the per index values */
+        int i = 1;
+        do {
+            e = &cpuid_ent[cpuid_nent++];
+            do_cpuid_ent(e, 0xb, i++, &copy);
+        } while(e->ecx & 0xff00); /* until the last index */
+    }
 
+    limit = copy.cpuid_xlevel;
     for (i = 0x80000000; i <= limit; ++i)
-	do_cpuid_ent(&cpuid_ent[cpuid_nent++], i, &copy);
+	do_cpuid_ent(&cpuid_ent[cpuid_nent++], i, 0, &copy);
 
     kvm_setup_cpuid(kvm_context, cenv->cpu_index, cpuid_nent, cpuid_ent);
+    free(cpuid_ent);
     return 0;
 }
 
diff --git a/qemu/target-i386/cpu.h b/qemu/target-i386/cpu.h
index 11bc2c1..42d646a 100644
--- a/qemu/target-i386/cpu.h
+++ b/qemu/target-i386/cpu.h
@@ -612,6 +612,7 @@ typedef struct CPUX86State {
     uint32_t cpuid_ext2_features;
     uint32_t cpuid_ext3_features;
     uint32_t cpuid_apic_id;
+    uint32_t cpuid_host_cpu;
 
 #ifdef USE_KQEMU
     int kqemu_enabled;
diff --git a/qemu/target-i386/helper.c b/qemu/target-i386/helper.c
index 68efd4d..c23e16e 100644
--- a/qemu/target-i386/helper.c
+++ b/qemu/target-i386/helper.c
@@ -152,6 +152,9 @@ typedef struct x86_def_t {
 static x86_def_t x86_defs[] = {
 #ifdef TARGET_X86_64
     {
+        .name = "host",
+    },
+    {
         .name = "qemu64",
         .level = 2,
         .vendor1 = CPUID_VENDOR_AMD_1,
@@ -405,10 +408,59 @@ void x86_cpu_list (FILE *f, int (*cpu_fprintf)(FILE *f, const char *fmt, ...))
         (*cpu_fprintf)(f, "x86 %16s\n", x86_defs[i].name);
 }
 
+int fill_x86_defs_for_host(CPUX86State *env, x86_def_t * def)
+{
+    struct kvm_cpuid_entry2 e;
+
+    get_host_cpuid_entry(0, 0, &e);
+    env->cpuid_level = e.eax;
+    env->cpuid_vendor1 = e.ebx;
+    env->cpuid_vendor2 = e.ecx;
+    env->cpuid_vendor3 = e.edx;
+
+    get_host_cpuid_entry(1, 0, &e);
+    env->cpuid_version = e.eax;
+    env->cpuid_features = e.edx;
+    env->cpuid_ext_features = e.ecx;
+
+    get_host_cpuid_entry(0x80000000, 0, &e);
+    env->cpuid_xlevel = e.eax;
+
+    get_host_cpuid_entry(0x80000001, 0, &e);
+    env->cpuid_ext3_features = e.ecx;
+    env->cpuid_ext2_features = e.edx;
+
+    get_host_cpuid_entry(0x80000002, 0, &e);
+    env->cpuid_model[0] = e.eax;
+    env->cpuid_model[1] = e.ebx;
+    env->cpuid_model[2] = e.ecx;
+    env->cpuid_model[3] = e.edx;
+
+    get_host_cpuid_entry(0x80000003, 0, &e);
+    env->cpuid_model[4] = e.eax;
+    env->cpuid_model[5] = e.ebx;
+    env->cpuid_model[6] = e.ecx;
+    env->cpuid_model[7] = e.edx;
+
+    get_host_cpuid_entry(0x80000004, 0, &e);
+    env->cpuid_model[8] = e.eax;
+    env->cpuid_model[9] = e.ebx;
+    env->cpuid_model[10] = e.ecx;
+    env->cpuid_model[11] = e.edx;
+
+    return 0;
+}
+
 static int cpu_x86_register (CPUX86State *env, const char *cpu_model)
 {
     x86_def_t def1, *def = &def1;
 
+    if (strcmp(cpu_model, "host") == 0) {
+        env->cpuid_host_cpu = 1;
+        fill_x86_defs_for_host(env, def);
+        return 0;
+    } /* else follow through */
+    env->cpuid_host_cpu = 0;
     if (cpu_x86_find_by_name(def, cpu_model) < 0)
         return -1;
     if (def->vendor1) {

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: thread/core siblings info for guests
  2008-10-31 16:47       ` Nitin A Kamble
@ 2008-11-04 14:59         ` Avi Kivity
  2008-11-04 19:02           ` Nitin A Kamble
  0 siblings, 1 reply; 24+ messages in thread
From: Avi Kivity @ 2008-11-04 14:59 UTC (permalink / raw)
  To: nitin.a.kamble
  Cc: kvm@vger.kernel.org, kraxel@redhat.com, chrisw@sous-sol.org

Nitin A Kamble wrote:
> Avi & Alex,
>   Thanks for your suggestions. Now I have 2 attached patches to export
> the host cpuid data to guests. 
>   It reuses the ioctl KVM_GET_SUPPORTED_CPUID as Avi pointed out. Also
> as per Alex's suggestions the changes have been simplified for easier
> merge with the qemu tree.
>
> the kvm_kernel_patch is for the kvm.git tree.
> And the kvm_userspace_patch is for the kvm-userspace.git tree.
>
>   

You need changelog entries that describe the patches.  Also, one patch 
per email.

> @ -1194,36 +1194,14 @@ static void do_cpuid_1_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>  static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>  			 u32 index, int *nent, int maxnent)
>  {
> -	const u32 kvm_supported_word0_x86_features = bit(X86_FEATURE_FPU) |
> -		bit(X86_FEATURE_VME) | bit(X86_FEATURE_DE) |
> -		bit(X86_FEATURE_PSE) | bit(X86_FEATURE_TSC) |
> -		bit(X86_FEATURE_MSR) | bit(X86_FEATURE_PAE) |
> -		bit(X86_FEATURE_CX8) | bit(X86_FEATURE_APIC) |
> -		bit(X86_FEATURE_SEP) | bit(X86_FEATURE_PGE) |
> -		bit(X86_FEATURE_CMOV) | bit(X86_FEATURE_PSE36) |
> -		bit(X86_FEATURE_CLFLSH) | bit(X86_FEATURE_MMX) |
> -		bit(X86_FEATURE_FXSR) | bit(X86_FEATURE_XMM) |
> -		bit(X86_FEATURE_XMM2) | bit(X86_FEATURE_SELFSNOOP);
> -	const u32 kvm_supported_word1_x86_features = bit(X86_FEATURE_FPU) |
> -		bit(X86_FEATURE_VME) | bit(X86_FEATURE_DE) |
> -		bit(X86_FEATURE_PSE) | bit(X86_FEATURE_TSC) |
> -		bit(X86_FEATURE_MSR) | bit(X86_FEATURE_PAE) |
> -		bit(X86_FEATURE_CX8) | bit(X86_FEATURE_APIC) |
> -		bit(X86_FEATURE_PGE) |
> -		bit(X86_FEATURE_CMOV) | bit(X86_FEATURE_PSE36) |
> -		bit(X86_FEATURE_MMX) | bit(X86_FEATURE_FXSR) |
> -		bit(X86_FEATURE_SYSCALL) |
> -		(bit(X86_FEATURE_NX) && is_efer_nx()) |
> -#ifdef CONFIG_X86_64
> +	const u32 kvm_unsupported_word0_x86_features = bit(X86_FEATURE_ACC);
> +	const u32 kvm_unsupported_word1_x86_features = bit(X86_FEATURE_RDTSCP) |
> +#ifndef CONFIG_X86_64
>   

What's the motivation for this change?

A potential problem is that a newer cpu might define new bits, which we 
don't support, yet we report them as supported.

>  get_cpu() before
> @@ -1246,6 +1224,7 @@ static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>  		int t, times = entry->eax & 0xff;
>  
>  		entry->flags |= KVM_CPUID_FLAG_STATEFUL_FUNC;
> +		entry->flags |= KVM_CPUID_FLAG_STATE_READ_NEXT;
>  		for (t = 1; t < times && *nent < maxnent; ++t) {
>  			do_cpuid_1_ent(&entry[t], function, 0);
>  			entry[t].flags |= KVM_CPUID_FLAG_STATEFUL_FUNC;
>   

Why?

> @@ -1276,7 +1255,7 @@ static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>  		entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
>  		/* read more entries until level_type is zero */
>  		for (i = 1; *nent < maxnent; ++i) {
> -			level_type = entry[i - 1].ecx & 0xff;
> +			level_type = entry[i - 1].ecx & 0xff00;
>  			if (!level_type)
>  				break;
>  			do_cpuid_1_ent(&entry[i], function, i);
>   

Why?

> @@ -1302,10 +1281,14 @@ static int kvm_dev_ioctl_get_supported_cpuid(struct kvm_cpuid2 *cpuid,
>  {
>  	struct kvm_cpuid_entry2 *cpuid_entries;
>  	int limit, nent = 0, r = -E2BIG;
> +	int sizer = 0;
>  	u32 func;
>  
> -	if (cpuid->nent < 1)
> -		goto out;
> +	if (cpuid->nent == 0) {
> +		sizer = 1;
> +		cpuid->nent = KVM_MAX_CPUID_ENTRIES; 
> +        }
> +
>   

That's an ABI change.  New userspace could call an older kernel with 
nent = 0, and get an unexpected response.

> @@ -2860,6 +2845,13 @@ void kvm_emulate_cpuid(struct kvm_vcpu *vcpu)
>  		kvm_register_write(vcpu, VCPU_REGS_RCX, best->ecx);
>  		kvm_register_write(vcpu, VCPU_REGS_RDX, best->edx);
>  	}
> +	if (function == 0x1) { /* set the per-cpu apicid */
> +		u32 ebx = kvm_register_read(vcpu, VCPU_REGS_RBX);
> +		ebx &= 0x00ffffff;
> +		ebx |= (vcpu->vcpu_id << 24);
> +		kvm_register_write(vcpu, VCPU_REGS_RBX, ebx);
> +	}
> +          
>   

ABI change again.  This is userspace's job.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: thread/core siblings info for guests
  2008-11-04 14:59         ` Avi Kivity
@ 2008-11-04 19:02           ` Nitin A Kamble
  2008-11-05  6:38             ` Avi Kivity
  0 siblings, 1 reply; 24+ messages in thread
From: Nitin A Kamble @ 2008-11-04 19:02 UTC (permalink / raw)
  To: Avi Kivity, Nakajima, Jun
  Cc: kvm@vger.kernel.org, kraxel@redhat.com, chrisw@sous-sol.org

[-- Attachment #1: Type: text/plain, Size: 5978 bytes --]

Hi Avi,
 Thanks for the comments on the patches. please see bellow for my
response to your comments. 

Thanks & Regards,
Nitin



On Tue, 2008-11-04 at 07:59 -0700, Avi Kivity wrote:

> You need changelog entries that describe the patches.  Also, one patch
> per email.
> 
sounds right. I will cut my patch into smaller patches and send them one
by one.


> > @ -1194,36 +1194,14 @@ static void do_cpuid_1_ent(struct kvm_cpuid_entry2 *entry, u32 function,
> >  static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
> >                        u32 index, int *nent, int maxnent)
> >  {
> > -     const u32 kvm_supported_word0_x86_features = bit(X86_FEATURE_FPU) |
> > -             bit(X86_FEATURE_VME) | bit(X86_FEATURE_DE) |
> > -             bit(X86_FEATURE_PSE) | bit(X86_FEATURE_TSC) |
> > -             bit(X86_FEATURE_MSR) | bit(X86_FEATURE_PAE) |
> > -             bit(X86_FEATURE_CX8) | bit(X86_FEATURE_APIC) |
> > -             bit(X86_FEATURE_SEP) | bit(X86_FEATURE_PGE) |
> > -             bit(X86_FEATURE_CMOV) | bit(X86_FEATURE_PSE36) |
> > -             bit(X86_FEATURE_CLFLSH) | bit(X86_FEATURE_MMX) |
> > -             bit(X86_FEATURE_FXSR) | bit(X86_FEATURE_XMM) |
> > -             bit(X86_FEATURE_XMM2) | bit(X86_FEATURE_SELFSNOOP);
> > -     const u32 kvm_supported_word1_x86_features = bit(X86_FEATURE_FPU) |
> > -             bit(X86_FEATURE_VME) | bit(X86_FEATURE_DE) |
> > -             bit(X86_FEATURE_PSE) | bit(X86_FEATURE_TSC) |
> > -             bit(X86_FEATURE_MSR) | bit(X86_FEATURE_PAE) |
> > -             bit(X86_FEATURE_CX8) | bit(X86_FEATURE_APIC) |
> > -             bit(X86_FEATURE_PGE) |
> > -             bit(X86_FEATURE_CMOV) | bit(X86_FEATURE_PSE36) |
> > -             bit(X86_FEATURE_MMX) | bit(X86_FEATURE_FXSR) |
> > -             bit(X86_FEATURE_SYSCALL) |
> > -             (bit(X86_FEATURE_NX) && is_efer_nx()) |
> > -#ifdef CONFIG_X86_64
> > +     const u32 kvm_unsupported_word0_x86_features = bit(X86_FEATURE_ACC);
> > +     const u32 kvm_unsupported_word1_x86_features = bit(X86_FEATURE_RDTSCP) |
> > +#ifndef CONFIG_X86_64
> >
> 
> What's the motivation for this change?
> 
> A potential problem is that a newer cpu might define new bits, which we
> don't support, yet we report them as supported.
The motivation is: IMO There would be very few features which KVM will
not be able to support by default. Also lots of cpuid bits are being
blocked unnecessarily by the current code. 
  If there is a new feature which would break KVM implementation then it
will easy to notice, and easy to block that bit in the KVM at that time.
But if the new feature is not affecting KVM then it will not be noticed,
and KVM will end up unsupporting that cpu feature.


> 
> >  get_cpu() before
> > @@ -1246,6 +1224,7 @@ static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
> >               int t, times = entry->eax & 0xff;
> >
> >               entry->flags |= KVM_CPUID_FLAG_STATEFUL_FUNC;
> > +             entry->flags |= KVM_CPUID_FLAG_STATE_READ_NEXT;
> >               for (t = 1; t < times && *nent < maxnent; ++t) {
> >                       do_cpuid_1_ent(&entry[t], function, 0);
> >                       entry[t].flags |= KVM_CPUID_FLAG_STATEFUL_FUNC;
> >
> 
> Why?
This is a bug fix. Without this change the code does not find a leaf 2
entry in the list.


> 
> > @@ -1276,7 +1255,7 @@ static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
> >               entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
> >               /* read more entries until level_type is zero */
> >               for (i = 1; *nent < maxnent; ++i) {
> > -                     level_type = entry[i - 1].ecx & 0xff;
> > +                     level_type = entry[i - 1].ecx & 0xff00;
> >                       if (!level_type)
> >                               break;
> >                       do_cpuid_1_ent(&entry[i], function, i);
> >
> 
> Why?
This is also a bugfix. The type field is in bits 8-15. As per the code,
the bits 0-7 would not define the end of counting. You can look at IA32
processor Software developer's manual for this.


> 
> > @@ -1302,10 +1281,14 @@ static int kvm_dev_ioctl_get_supported_cpuid(struct kvm_cpuid2 *cpuid,
> >  {
> >       struct kvm_cpuid_entry2 *cpuid_entries;
> >       int limit, nent = 0, r = -E2BIG;
> > +     int sizer = 0;
> >       u32 func;
> >
> > -     if (cpuid->nent < 1)
> > -             goto out;
> > +     if (cpuid->nent == 0) {
> > +             sizer = 1;
> > +             cpuid->nent = KVM_MAX_CPUID_ENTRIES;
> > +        }
> > +
> >
> 
> That's an ABI change.  New userspace could call an older kernel with
> nent = 0, and get an unexpected response.
That is right. What do you suggest for solution? Currently this ABI is
not utilized. Would adding another ioctl would work?



> > @@ -2860,6 +2845,13 @@ void kvm_emulate_cpuid(struct kvm_vcpu *vcpu)
> >               kvm_register_write(vcpu, VCPU_REGS_RCX, best->ecx);
> >               kvm_register_write(vcpu, VCPU_REGS_RDX, best->edx);
> >       }
> > +     if (function == 0x1) { /* set the per-cpu apicid */
> > +             u32 ebx = kvm_register_read(vcpu, VCPU_REGS_RBX);
> > +             ebx &= 0x00ffffff;
> > +             ebx |= (vcpu->vcpu_id << 24);
> > +             kvm_register_write(vcpu, VCPU_REGS_RBX, ebx);
> > +     }
> > +
> >
> 
> ABI change again.  This is userspace's job.
This does not look like an ABI change to me. I will look if vcpu_id is
available in the userspace, if yes, then this code can be moved into
userspace.


> 
> --
> error compiling committee.c: too many arguments to function
> 
-- 
Thanks & Regards,
Nitin
Open Source Technology Center, Intel Corporation
-----------------------------------------------------------------
The mind is like a parachute; it works much better when it's open

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: thread/core siblings info for guests
  2008-11-04 19:02           ` Nitin A Kamble
@ 2008-11-05  6:38             ` Avi Kivity
  2008-11-05 23:37               ` [Patch 1] " Nitin A Kamble
                                 ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Avi Kivity @ 2008-11-05  6:38 UTC (permalink / raw)
  To: nitin.a.kamble
  Cc: Nakajima, Jun, kvm@vger.kernel.org, kraxel@redhat.com,
	chrisw@sous-sol.org

Nitin A Kamble wrote:
>>> @ -1194,36 +1194,14 @@ static void do_cpuid_1_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>>>  static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>>>                        u32 index, int *nent, int maxnent)
>>>  {
>>> -     const u32 kvm_supported_word0_x86_features = bit(X86_FEATURE_FPU) |
>>> -             bit(X86_FEATURE_VME) | bit(X86_FEATURE_DE) |
>>> -             bit(X86_FEATURE_PSE) | bit(X86_FEATURE_TSC) |
>>> -             bit(X86_FEATURE_MSR) | bit(X86_FEATURE_PAE) |
>>> -             bit(X86_FEATURE_CX8) | bit(X86_FEATURE_APIC) |
>>> -             bit(X86_FEATURE_SEP) | bit(X86_FEATURE_PGE) |
>>> -             bit(X86_FEATURE_CMOV) | bit(X86_FEATURE_PSE36) |
>>> -             bit(X86_FEATURE_CLFLSH) | bit(X86_FEATURE_MMX) |
>>> -             bit(X86_FEATURE_FXSR) | bit(X86_FEATURE_XMM) |
>>> -             bit(X86_FEATURE_XMM2) | bit(X86_FEATURE_SELFSNOOP);
>>> -     const u32 kvm_supported_word1_x86_features = bit(X86_FEATURE_FPU) |
>>> -             bit(X86_FEATURE_VME) | bit(X86_FEATURE_DE) |
>>> -             bit(X86_FEATURE_PSE) | bit(X86_FEATURE_TSC) |
>>> -             bit(X86_FEATURE_MSR) | bit(X86_FEATURE_PAE) |
>>> -             bit(X86_FEATURE_CX8) | bit(X86_FEATURE_APIC) |
>>> -             bit(X86_FEATURE_PGE) |
>>> -             bit(X86_FEATURE_CMOV) | bit(X86_FEATURE_PSE36) |
>>> -             bit(X86_FEATURE_MMX) | bit(X86_FEATURE_FXSR) |
>>> -             bit(X86_FEATURE_SYSCALL) |
>>> -             (bit(X86_FEATURE_NX) && is_efer_nx()) |
>>> -#ifdef CONFIG_X86_64
>>> +     const u32 kvm_unsupported_word0_x86_features = bit(X86_FEATURE_ACC);
>>> +     const u32 kvm_unsupported_word1_x86_features = bit(X86_FEATURE_RDTSCP) |
>>> +#ifndef CONFIG_X86_64
>>>
>>>       
>> What's the motivation for this change?
>>
>> A potential problem is that a newer cpu might define new bits, which we
>> don't support, yet we report them as supported.
>>     
> The motivation is: IMO There would be very few features which KVM will
> not be able to support by default. Also lots of cpuid bits are being
> blocked unnecessarily by the current code. 
>   If there is a new feature which would break KVM implementation then it
> will easy to notice, and easy to block that bit in the KVM at that time.
> But if the new feature is not affecting KVM then it will not be noticed,
> and KVM will end up unsupporting that cpu feature.
>   

Having an extra cpuid bit means that a guest will crash, while missing 
one slows it down a bit.  We need to be conservative and not risk 
passing incorrect data.

>   
>>>  get_cpu() before
>>> @@ -1246,6 +1224,7 @@ static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>>>               int t, times = entry->eax & 0xff;
>>>
>>>               entry->flags |= KVM_CPUID_FLAG_STATEFUL_FUNC;
>>> +             entry->flags |= KVM_CPUID_FLAG_STATE_READ_NEXT;
>>>               for (t = 1; t < times && *nent < maxnent; ++t) {
>>>                       do_cpuid_1_ent(&entry[t], function, 0);
>>>                       entry[t].flags |= KVM_CPUID_FLAG_STATEFUL_FUNC;
>>>
>>>       
>> Why?
>>     
> This is a bug fix. Without this change the code does not find a leaf 2
> entry in the list.
>
>   

Please send it as a separate patch, with a changelog entry describing 
why it is needed.

>   
>>> @@ -1276,7 +1255,7 @@ static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>>>               entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
>>>               /* read more entries until level_type is zero */
>>>               for (i = 1; *nent < maxnent; ++i) {
>>> -                     level_type = entry[i - 1].ecx & 0xff;
>>> +                     level_type = entry[i - 1].ecx & 0xff00;
>>>                       if (!level_type)
>>>                               break;
>>>                       do_cpuid_1_ent(&entry[i], function, i);
>>>
>>>       
>> Why?
>>     
> This is also a bugfix. The type field is in bits 8-15. As per the code,
> the bits 0-7 would not define the end of counting. You can look at IA32
> processor Software developer's manual for this.
>
>
>   

Separate patch, please.

>>> @@ -1302,10 +1281,14 @@ static int kvm_dev_ioctl_get_supported_cpuid(struct kvm_cpuid2 *cpuid,
>>>  {
>>>       struct kvm_cpuid_entry2 *cpuid_entries;
>>>       int limit, nent = 0, r = -E2BIG;
>>> +     int sizer = 0;
>>>       u32 func;
>>>
>>> -     if (cpuid->nent < 1)
>>> -             goto out;
>>> +     if (cpuid->nent == 0) {
>>> +             sizer = 1;
>>> +             cpuid->nent = KVM_MAX_CPUID_ENTRIES;
>>> +        }
>>> +
>>>
>>>       
>> That's an ABI change.  New userspace could call an older kernel with
>> nent = 0, and get an unexpected response.
>>     
> That is right. What do you suggest for solution? Currently this ABI is
> not utilized. Would adding another ioctl would work?
>
>   

We could have the response to KVM_CHECK_EXTENSION(KVM_CAP_CPUID2 (or 
however it's called)) return the number of entries, and document that 1 
means "unknown" (well, we could have done it with your proposal too).

>
>   
>>> @@ -2860,6 +2845,13 @@ void kvm_emulate_cpuid(struct kvm_vcpu *vcpu)
>>>               kvm_register_write(vcpu, VCPU_REGS_RCX, best->ecx);
>>>               kvm_register_write(vcpu, VCPU_REGS_RDX, best->edx);
>>>       }
>>> +     if (function == 0x1) { /* set the per-cpu apicid */
>>> +             u32 ebx = kvm_register_read(vcpu, VCPU_REGS_RBX);
>>> +             ebx &= 0x00ffffff;
>>> +             ebx |= (vcpu->vcpu_id << 24);
>>> +             kvm_register_write(vcpu, VCPU_REGS_RBX, ebx);
>>> +     }
>>> +
>>>
>>>       
>> ABI change again.  This is userspace's job.
>>     
> This does not look like an ABI change to me. 

Right, it isn't an ABI change.  But we should avoid situations where the 
same call does different things in different kernel versions.

> I will look if vcpu_id is
> available in the userspace, if yes, then this code can be moved into
> userspace.
>
>   

Yes, it's available in userspace.


-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [Patch 1]  thread/core siblings info for guests
  2008-11-05  6:38             ` Avi Kivity
@ 2008-11-05 23:37               ` Nitin A Kamble
  2008-11-06 13:10                 ` Avi Kivity
  2008-11-05 23:56               ` [patch 2] " Nitin A Kamble
  2008-11-06  0:25               ` [Patch 3] " Nitin A Kamble
  2 siblings, 1 reply; 24+ messages in thread
From: Nitin A Kamble @ 2008-11-05 23:37 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Nakajima, Jun, kvm@vger.kernel.org, kraxel@redhat.com,
	chrisw@sous-sol.org


[-- Attachment #1.1: Type: text/plain, Size: 359 bytes --]

Hi Avi,
  Attached is the patch for the bugfix1, as we discussed in the earlier
email.


For cpuid leaf 0xb the bits 8-15 in ECX register define the end of
counting leaf.
The previous code was using bits 0-7 for this purpose, which is a bug.
    
Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com>


Please Apply,
Thanks & Regards,
Nitin

[-- Attachment #1.2: kvm_kernel_cpuid_patch1.diff --]
[-- Type: text/x-patch, Size: 919 bytes --]

commit 96ab287a9d90fc4c04438f6943b36049bbdfe854
Author: Nitin A Kamble <nitin.a.kamble@intel.com>
Date:   Wed Nov 5 15:28:13 2008 -0800

    For cpuid leaf 0xb the bits 8-15 in ECX register define the end of counting leaf.
    The previous code was using bits 0-7 for this purpose, which is a bug.
    
    Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com>

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 38f79b6..7ac7d5f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1276,7 +1276,7 @@ static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 		entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
 		/* read more entries until level_type is zero */
 		for (i = 1; *nent < maxnent; ++i) {
-			level_type = entry[i - 1].ecx & 0xff;
+			level_type = entry[i - 1].ecx & 0xff00;
 			if (!level_type)
 				break;
 			do_cpuid_1_ent(&entry[i], function, i);

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [patch 2] thread/core siblings info for guests
  2008-11-05  6:38             ` Avi Kivity
  2008-11-05 23:37               ` [Patch 1] " Nitin A Kamble
@ 2008-11-05 23:56               ` Nitin A Kamble
  2008-11-06 13:18                 ` Avi Kivity
  2008-11-06  0:25               ` [Patch 3] " Nitin A Kamble
  2 siblings, 1 reply; 24+ messages in thread
From: Nitin A Kamble @ 2008-11-05 23:56 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Nakajima, Jun, kvm@vger.kernel.org, kraxel@redhat.com,
	chrisw@sous-sol.org


[-- Attachment #1.1: Type: text/plain, Size: 678 bytes --]

Hi Avi,
 This patches the 2nd bug in the kernel code.

The code to traverse the cpuid data array list for counting type of
leaves is currently broken.
This patches fixes the 2 things in it.
 1. Set the 1st counting entry's flag KVM_CPUID_FLAG_STATE_READ_NEXT.
    Without it the code will never find a valid entry.
 2. Also the stop condition in the for loop while looking for the 
    next unflaged entry is broken. It needs to stop when it find one
    matching entry; and in the case of count of 1, it will be the same
    entry found in this iteration.
    
Signed-Off-By: Nitin A Kamble <nitin.a.kamble@intel.com>

Please apply.

Thanks & Regards,
Nitin

[-- Attachment #1.2: kvm_kernel_cpuid_patch2.diff --]
[-- Type: text/x-patch, Size: 1755 bytes --]

commit 61fd85ec73d691d080a0adecab2f61840eef1a3e
Author: Nitin A Kamble <nitin.a.kamble@intel.com>
Date:   Wed Nov 5 15:47:35 2008 -0800

    The code to traverse the cpuid data array list for counting type of leaves is
    currently broken.
    This patches fixes the 2 things in it.
      1. Set the 1st counting entry's flag KVM_CPUID_FLAG_STATE_READ_NEXT. Without
         it the code will never find a valid entry.
      2. Also the stop condition in the for loop while looking for the next unflaged
         entry is broken. It needs to stop when it find one matching entry;
         and in the case of count of 1, it will be the same entry found in this
         iteration.
    
    Signed-Off-By: Nitin A Kamble <nitin.a.kamble@intel.com>

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7ac7d5f..bf7461b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1246,6 +1246,7 @@ static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 		int t, times = entry->eax & 0xff;
 
 		entry->flags |= KVM_CPUID_FLAG_STATEFUL_FUNC;
+		entry->flags |= KVM_CPUID_FLAG_STATE_READ_NEXT;
 		for (t = 1; t < times && *nent < maxnent; ++t) {
 			do_cpuid_1_ent(&entry[t], function, 0);
 			entry[t].flags |= KVM_CPUID_FLAG_STATEFUL_FUNC;
@@ -2801,7 +2802,7 @@ static int move_to_next_stateful_cpuid_entry(struct kvm_vcpu *vcpu, int i)
 
 	e->flags &= ~KVM_CPUID_FLAG_STATE_READ_NEXT;
 	/* when no next entry is found, the current entry[i] is reselected */
-	for (j = i + 1; j == i; j = (j + 1) % nent) {
+	for (j = i + 1; ; j = (j + 1) % nent) {
 		struct kvm_cpuid_entry2 *ej = &vcpu->arch.cpuid_entries[j];
 		if (ej->function == e->function) {
 			ej->flags |= KVM_CPUID_FLAG_STATE_READ_NEXT;

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [Patch 3] thread/core siblings info for guests
  2008-11-05  6:38             ` Avi Kivity
  2008-11-05 23:37               ` [Patch 1] " Nitin A Kamble
  2008-11-05 23:56               ` [patch 2] " Nitin A Kamble
@ 2008-11-06  0:25               ` Nitin A Kamble
  2008-11-06 13:20                 ` Avi Kivity
  2 siblings, 1 reply; 24+ messages in thread
From: Nitin A Kamble @ 2008-11-06  0:25 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Nakajima, Jun, kvm@vger.kernel.org, kraxel@redhat.com,
	chrisw@sous-sol.org


[-- Attachment #1.1: Type: text/plain, Size: 500 bytes --]

Hi Avi,
   This patch cover's the kernel ABI change as we discussed in the
earlier email.

Patch Description:

Change the ioctl KVM_GET_SUPPORTED_CPUID, such that it will return the
no of entries in the list when requested no of entries (nent) is 0.
    
Also add another KVM_CHECK_EXTENSION, KVM_CAP_CPUID_SIZER to determine
if the running kernel supports the above changed ABI.
    
Signed-Off-By: Nitin A Kamble <nitin.a.kamble@intel.com>

Please Apply.

Thanks & Regards,
Nitin

[-- Attachment #1.2: kvm_kernel_cpuid_patch3.diff --]
[-- Type: text/x-patch, Size: 2736 bytes --]

commit 70e4e65bc591eb9cf25c1cbc0d16b2cbdb089a6f
Author: Nitin A Kamble <nitin.a.kamble@intel.com>
Date:   Wed Nov 5 16:17:46 2008 -0800

    Change the ioctl KVM_GET_SUPPORTED_CPUID, such that it will return the
    no of entries in the list when request no of entries (nent) is 0.
    
    Also add another KVM_CHECK_EXTENSION, KVM_CAP_CPUID_SIZER to determine
    if the running kernel supports the above changed ABI.
    
    Signed-Off-By: Nitin A Kamble <nitin.a.kamble@intel.com>

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 09e6c56..e50db11 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -86,7 +86,7 @@
 #define KVM_NUM_MMU_PAGES (1 << KVM_MMU_HASH_SHIFT)
 #define KVM_MIN_FREE_MMU_PAGES 5
 #define KVM_REFILL_PAGES 25
-#define KVM_MAX_CPUID_ENTRIES 40
+#define KVM_MAX_CPUID_ENTRIES 100
 #define KVM_NR_FIXED_MTRR_REGION 88
 #define KVM_NR_VAR_MTRR 8
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index bf7461b..52e6207 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -969,6 +969,7 @@ int kvm_dev_ioctl_check_extension(long ext)
 	case KVM_CAP_NOP_IO_DELAY:
 	case KVM_CAP_MP_STATE:
 	case KVM_CAP_SYNC_MMU:
+	case KVM_CAP_CPUID_SIZER:
 		r = 1;
 		break;
 	case KVM_CAP_COALESCED_MMIO:
@@ -1303,10 +1304,14 @@ static int kvm_dev_ioctl_get_supported_cpuid(struct kvm_cpuid2 *cpuid,
 {
 	struct kvm_cpuid_entry2 *cpuid_entries;
 	int limit, nent = 0, r = -E2BIG;
+	int sizer = 0;
 	u32 func;
 
-	if (cpuid->nent < 1)
-		goto out;
+	if (cpuid->nent == 0) {
+		sizer = 1;
+		cpuid->nent = KVM_MAX_CPUID_ENTRIES; 
+        }
+
 	r = -ENOMEM;
 	cpuid_entries = vmalloc(sizeof(struct kvm_cpuid_entry2) * cpuid->nent);
 	if (!cpuid_entries)
@@ -1327,9 +1332,11 @@ static int kvm_dev_ioctl_get_supported_cpuid(struct kvm_cpuid2 *cpuid,
 		do_cpuid_ent(&cpuid_entries[nent], func, 0,
 			       &nent, cpuid->nent);
 	r = -EFAULT;
-	if (copy_to_user(entries, cpuid_entries,
+	if (!sizer) {
+		if (copy_to_user(entries, cpuid_entries,
 			nent * sizeof(struct kvm_cpuid_entry2)))
-		goto out_free;
+			goto out_free;
+        }
 	cpuid->nent = nent;
 	r = 0;
 
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 44fd7fa..d4cb8b1 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -392,6 +392,9 @@ struct kvm_trace_rec {
 #endif
 #define KVM_CAP_IOMMU 18
 #define KVM_CAP_NMI 19
+#define KVM_CAP_CPUID_SIZER 20 /* return of 1 means the KVM_GET_SUPPORTED_CPUID */
+                               /* ioctl will return the size of list when input */
+                               /* list size (nent) is 0 */
 
 /*
  * ioctls for VM fds

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [Patch 1]  thread/core siblings info for guests
  2008-11-05 23:37               ` [Patch 1] " Nitin A Kamble
@ 2008-11-06 13:10                 ` Avi Kivity
  0 siblings, 0 replies; 24+ messages in thread
From: Avi Kivity @ 2008-11-06 13:10 UTC (permalink / raw)
  To: nitin.a.kamble
  Cc: Nakajima, Jun, kvm@vger.kernel.org, kraxel@redhat.com,
	chrisw@sous-sol.org

Nitin A Kamble wrote:
> Hi Avi,
>   Attached is the patch for the bugfix1, as we discussed in the earlier
> email.
>
>
> For cpuid leaf 0xb the bits 8-15 in ECX register define the end of
> counting leaf.
> The previous code was using bits 0-7 for this purpose, which is a bug.
>     
>   

Applied, thanks.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [patch 2] thread/core siblings info for guests
  2008-11-05 23:56               ` [patch 2] " Nitin A Kamble
@ 2008-11-06 13:18                 ` Avi Kivity
  0 siblings, 0 replies; 24+ messages in thread
From: Avi Kivity @ 2008-11-06 13:18 UTC (permalink / raw)
  To: nitin.a.kamble
  Cc: Nakajima, Jun, kvm@vger.kernel.org, kraxel@redhat.com,
	chrisw@sous-sol.org

Nitin A Kamble wrote:
> Hi Avi,
>  This patches the 2nd bug in the kernel code.
>
> The code to traverse the cpuid data array list for counting type of
> leaves is currently broken.
> This patches fixes the 2 things in it.
>  1. Set the 1st counting entry's flag KVM_CPUID_FLAG_STATE_READ_NEXT.
>     Without it the code will never find a valid entry.
>  2. Also the stop condition in the for loop while looking for the 
>     next unflaged entry is broken. It needs to stop when it find one
>     matching entry; and in the case of count of 1, it will be the same
>     entry found in this iteration.
>     
>   

Applied, thanks.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Patch 3] thread/core siblings info for guests
  2008-11-06  0:25               ` [Patch 3] " Nitin A Kamble
@ 2008-11-06 13:20                 ` Avi Kivity
  2008-11-06 18:01                   ` Kamble, Nitin A
  0 siblings, 1 reply; 24+ messages in thread
From: Avi Kivity @ 2008-11-06 13:20 UTC (permalink / raw)
  To: nitin.a.kamble
  Cc: Nakajima, Jun, kvm@vger.kernel.org, kraxel@redhat.com,
	chrisw@sous-sol.org

Nitin A Kamble wrote:
> Hi Avi,
>    This patch cover's the kernel ABI change as we discussed in the
> earlier email.
>
> Patch Description:
>
> Change the ioctl KVM_GET_SUPPORTED_CPUID, such that it will return the
> no of entries in the list when requested no of entries (nent) is 0.
>     
> Also add another KVM_CHECK_EXTENSION, KVM_CAP_CPUID_SIZER to determine
> if the running kernel supports the above changed ABI.
>   

The capability itself can return the count; for example

  case KVM_CAP_NR_CPUID_LEAVES:
      return KVM_MAX_CPUID_ENTRIES;

which is simpler to use and shorter.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [Patch 3] thread/core siblings info for guests
  2008-11-06 13:20                 ` Avi Kivity
@ 2008-11-06 18:01                   ` Kamble, Nitin A
  2008-11-16 13:26                     ` Avi Kivity
  0 siblings, 1 reply; 24+ messages in thread
From: Kamble, Nitin A @ 2008-11-06 18:01 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Nakajima, Jun, kvm@vger.kernel.org, kraxel@redhat.com,
	chrisw@sous-sol.org

>The capability itself can return the count; for example
>
>  case KVM_CAP_NR_CPUID_LEAVES:
>      return KVM_MAX_CPUID_ENTRIES;
>
>which is simpler to use and shorter.

Avi,
   Yes, it is simpler and shorter, but is returning a constant. It will be wasting space for the unused entries. Also it put's a restriction that the list count can not go more than that.
   The dynamic size finding, the patch I sent, is not complicated or long to be an issue. Would you like to give another consideration for the same patch (patch-3) I sent earlier?

Thanks & Regards,
Nitin

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Patch 3] thread/core siblings info for guests
  2008-11-06 18:01                   ` Kamble, Nitin A
@ 2008-11-16 13:26                     ` Avi Kivity
  2008-11-18  0:09                       ` Nitin A Kamble
  0 siblings, 1 reply; 24+ messages in thread
From: Avi Kivity @ 2008-11-16 13:26 UTC (permalink / raw)
  To: Kamble, Nitin A
  Cc: Nakajima, Jun, kvm@vger.kernel.org, kraxel@redhat.com,
	chrisw@sous-sol.org

Kamble, Nitin A wrote:
>> The capability itself can return the count; for example
>>
>>  case KVM_CAP_NR_CPUID_LEAVES:
>>      return KVM_MAX_CPUID_ENTRIES;
>>
>> which is simpler to use and shorter.
>>     
>
> Avi,
>    Yes, it is simpler and shorter, but is returning a constant. It will be wasting space for the unused entries. Also it put's a restriction that the list count can not go more than that.
>    The dynamic size finding, the patch I sent, is not complicated or long to be an issue. Would you like to give another consideration for the same patch (patch-3) I sent earlier?
>
>   

Oh, I confused the sizing code with something else.

We can add this, but older kernels will still miss the code so we have 
to work around it.  I guess userspace can start with a large number, and 
double it each time the ioctl fails.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Patch 3] thread/core siblings info for guests
  2008-11-16 13:26                     ` Avi Kivity
@ 2008-11-18  0:09                       ` Nitin A Kamble
  0 siblings, 0 replies; 24+ messages in thread
From: Nitin A Kamble @ 2008-11-18  0:09 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Nakajima, Jun, kvm@vger.kernel.org, kraxel@redhat.com,
	chrisw@sous-sol.org

[-- Attachment #1: Type: text/plain, Size: 1578 bytes --]

On Sun, 2008-11-16 at 05:26 -0800, Avi Kivity wrote:
> Kamble, Nitin A wrote:
> >> The capability itself can return the count; for example
> >>
> >>  case KVM_CAP_NR_CPUID_LEAVES:
> >>      return KVM_MAX_CPUID_ENTRIES;
> >>
> >> which is simpler to use and shorter.
> >>
> >
> > Avi,
> >    Yes, it is simpler and shorter, but is returning a constant. It will be wasting space for the unused entries. Also it put's a restriction that the list count can not go more than that.
> >    The dynamic size finding, the patch I sent, is not complicated or long to be an issue. Would you like to give another consideration for the same patch (patch-3) I sent earlier?
> >
> >
> 
> Oh, I confused the sizing code with something else.
> 
> We can add this, but older kernels will still miss the code so we have
> to work around it.  I guess userspace can start with a large number, and
> double it each time the ioctl fails.

Avi,
   You mean older kernels which does not support the KVM_CAP_CPUID_SIZER
ioctl. 
  Two ways to handle it: 
1. qemu giving error that "-cpu host" option is not supported on this
kernel. 
2. or as per your suggestion guess it in userland.

Either way it is code for kvm-userspace.git tree.

I am attaching the patch again. Please apply.

BTW I did not get your opinion for the patch for the kvm-userspace.git
tree. Would you comment on it please?


-- 
Thanks & Regards,
Nitin
Open Source Technology Center, Intel Corporation
-----------------------------------------------------------------
The mind is like a parachute; it works much better when it's open

[-- Attachment #2: kvm_kernel_cpuid_patch3.diff --]
[-- Type: text/x-patch, Size: 2656 bytes --]

commit 70e4e65bc591eb9cf25c1cbc0d16b2cbdb089a6f
Author: Nitin A Kamble <nitin.a.kamble@intel.com>
Date:   Wed Nov 5 16:17:46 2008 -0800

    Change the ioctl KVM_GET_SUPPORTED_CPUID, such that it will return the
    no of entries in the list when requested no of entries (nent) is 0.
    
    Also add another KVM_CHECK_EXTENSION, KVM_CAP_CPUID_SIZER to determine
    if the running kernel supports the above changed ABI.
    
    Signed-Off-By: Nitin A Kamble <nitin.a.kamble@intel.com>

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 09e6c56..e50db11 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -86,7 +86,7 @@
 #define KVM_NUM_MMU_PAGES (1 << KVM_MMU_HASH_SHIFT)
 #define KVM_MIN_FREE_MMU_PAGES 5
 #define KVM_REFILL_PAGES 25
-#define KVM_MAX_CPUID_ENTRIES 40
+#define KVM_MAX_CPUID_ENTRIES 100
 #define KVM_NR_FIXED_MTRR_REGION 88
 #define KVM_NR_VAR_MTRR 8
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index bf7461b..52e6207 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -969,6 +969,7 @@ int kvm_dev_ioctl_check_extension(long ext)
 	case KVM_CAP_NOP_IO_DELAY:
 	case KVM_CAP_MP_STATE:
 	case KVM_CAP_SYNC_MMU:
+	case KVM_CAP_CPUID_SIZER:
 		r = 1;
 		break;
 	case KVM_CAP_COALESCED_MMIO:
@@ -1303,10 +1304,14 @@ static int kvm_dev_ioctl_get_supported_cpuid(struct kvm_cpuid2 *cpuid,
 {
 	struct kvm_cpuid_entry2 *cpuid_entries;
 	int limit, nent = 0, r = -E2BIG;
+	int sizer = 0;
 	u32 func;
 
-	if (cpuid->nent < 1)
-		goto out;
+	if (cpuid->nent == 0) {
+		sizer = 1;
+		cpuid->nent = KVM_MAX_CPUID_ENTRIES; 
+        }
+
 	r = -ENOMEM;
 	cpuid_entries = vmalloc(sizeof(struct kvm_cpuid_entry2) * cpuid->nent);
 	if (!cpuid_entries)
@@ -1327,9 +1332,11 @@ static int kvm_dev_ioctl_get_supported_cpuid(struct kvm_cpuid2 *cpuid,
 		do_cpuid_ent(&cpuid_entries[nent], func, 0,
 			       &nent, cpuid->nent);
 	r = -EFAULT;
-	if (copy_to_user(entries, cpuid_entries,
+	if (!sizer) {
+		if (copy_to_user(entries, cpuid_entries,
 			nent * sizeof(struct kvm_cpuid_entry2)))
-		goto out_free;
+			goto out_free;
+        }
 	cpuid->nent = nent;
 	r = 0;
 
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 44fd7fa..d4cb8b1 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -392,6 +392,9 @@ struct kvm_trace_rec {
 #endif
 #define KVM_CAP_IOMMU 18
 #define KVM_CAP_NMI 19
+#define KVM_CAP_CPUID_SIZER 20 /* return of 1 means the KVM_GET_SUPPORTED_CPUID */
+                               /* ioctl will return the size of list when input */
+                               /* list size (nent) is 0 */
 
 /*
  * ioctls for VM fds

^ permalink raw reply related	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2008-11-18  0:09 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-10-02 20:35 thread/core siblings info for guests Kamble, Nitin A
2008-10-05  9:44 ` Avi Kivity
2008-10-06 10:23   ` Gerd Hoffmann
2008-10-06 18:01   ` Kamble, Nitin A
2008-10-06 20:54     ` Chris Wright
2008-10-17 22:45   ` Nitin A Kamble
2008-10-18  0:18     ` Alexander Graf
2008-10-20 16:46       ` Kamble, Nitin A
2008-10-19  9:29     ` Avi Kivity
2008-10-20 18:37       ` Nitin A Kamble
2008-10-21  8:52         ` Avi Kivity
2008-10-31 16:47       ` Nitin A Kamble
2008-11-04 14:59         ` Avi Kivity
2008-11-04 19:02           ` Nitin A Kamble
2008-11-05  6:38             ` Avi Kivity
2008-11-05 23:37               ` [Patch 1] " Nitin A Kamble
2008-11-06 13:10                 ` Avi Kivity
2008-11-05 23:56               ` [patch 2] " Nitin A Kamble
2008-11-06 13:18                 ` Avi Kivity
2008-11-06  0:25               ` [Patch 3] " Nitin A Kamble
2008-11-06 13:20                 ` Avi Kivity
2008-11-06 18:01                   ` Kamble, Nitin A
2008-11-16 13:26                     ` Avi Kivity
2008-11-18  0:09                       ` Nitin A Kamble

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).