From mboxrd@z Thu Jan  1 00:00:00 1970
From: Avi Kivity <avi@redhat.com>
Subject: Re: thread/core siblings info for guests
Date: Wed, 05 Nov 2008 08:38:58 +0200
Message-ID: <49113F82.5050409@redhat.com>
References: <8EA2C2C4116BF44AB370468FBF85A7779B418DEF@orsmsx504.amr.corp.intel.com>	 <48E88C62.9030103@redhat.com>	 <1224283519.18637.23.camel@lnitindesktop.sc.intel.com>	 <48FAFE0B.5030209@redhat.com>	 <1225471648.1521.7.camel@lnitindesktop.sc.intel.com>	 <49106364.1070909@redhat.com> <1225825339.23171.25.camel@lnitindesktop.sc.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: "Nakajima, Jun" <jun.nakajima@intel.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"kraxel@redhat.com" <kraxel@redhat.com>,
	"chrisw@sous-sol.org" <chrisw@sous-sol.org>
To: nitin.a.kamble@intel.com
Return-path: <kvm-owner@vger.kernel.org>
Received: from mx2.redhat.com ([66.187.237.31]:46461 "EHLO mx2.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1750754AbYKEGi6 (ORCPT <rfc822;kvm@vger.kernel.org>);
	Wed, 5 Nov 2008 01:38:58 -0500
In-Reply-To: <1225825339.23171.25.camel@lnitindesktop.sc.intel.com>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

Nitin A Kamble wrote:
>>> @ -1194,36 +1194,14 @@ static void do_cpuid_1_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>>>  static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>>>                        u32 index, int *nent, int maxnent)
>>>  {
>>> -     const u32 kvm_supported_word0_x86_features = bit(X86_FEATURE_FPU) |
>>> -             bit(X86_FEATURE_VME) | bit(X86_FEATURE_DE) |
>>> -             bit(X86_FEATURE_PSE) | bit(X86_FEATURE_TSC) |
>>> -             bit(X86_FEATURE_MSR) | bit(X86_FEATURE_PAE) |
>>> -             bit(X86_FEATURE_CX8) | bit(X86_FEATURE_APIC) |
>>> -             bit(X86_FEATURE_SEP) | bit(X86_FEATURE_PGE) |
>>> -             bit(X86_FEATURE_CMOV) | bit(X86_FEATURE_PSE36) |
>>> -             bit(X86_FEATURE_CLFLSH) | bit(X86_FEATURE_MMX) |
>>> -             bit(X86_FEATURE_FXSR) | bit(X86_FEATURE_XMM) |
>>> -             bit(X86_FEATURE_XMM2) | bit(X86_FEATURE_SELFSNOOP);
>>> -     const u32 kvm_supported_word1_x86_features = bit(X86_FEATURE_FPU) |
>>> -             bit(X86_FEATURE_VME) | bit(X86_FEATURE_DE) |
>>> -             bit(X86_FEATURE_PSE) | bit(X86_FEATURE_TSC) |
>>> -             bit(X86_FEATURE_MSR) | bit(X86_FEATURE_PAE) |
>>> -             bit(X86_FEATURE_CX8) | bit(X86_FEATURE_APIC) |
>>> -             bit(X86_FEATURE_PGE) |
>>> -             bit(X86_FEATURE_CMOV) | bit(X86_FEATURE_PSE36) |
>>> -             bit(X86_FEATURE_MMX) | bit(X86_FEATURE_FXSR) |
>>> -             bit(X86_FEATURE_SYSCALL) |
>>> -             (bit(X86_FEATURE_NX) && is_efer_nx()) |
>>> -#ifdef CONFIG_X86_64
>>> +     const u32 kvm_unsupported_word0_x86_features = bit(X86_FEATURE_ACC);
>>> +     const u32 kvm_unsupported_word1_x86_features = bit(X86_FEATURE_RDTSCP) |
>>> +#ifndef CONFIG_X86_64
>>>
>>>       
>> What's the motivation for this change?
>>
>> A potential problem is that a newer cpu might define new bits, which we
>> don't support, yet we report them as supported.
>>     
> The motivation is: IMO There would be very few features which KVM will
> not be able to support by default. Also lots of cpuid bits are being
> blocked unnecessarily by the current code. 
>   If there is a new feature which would break KVM implementation then it
> will easy to notice, and easy to block that bit in the KVM at that time.
> But if the new feature is not affecting KVM then it will not be noticed,
> and KVM will end up unsupporting that cpu feature.
>   

Having an extra cpuid bit means that a guest will crash, while missing 
one slows it down a bit.  We need to be conservative and not risk 
passing incorrect data.

>   
>>>  get_cpu() before
>>> @@ -1246,6 +1224,7 @@ static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>>>               int t, times = entry->eax & 0xff;
>>>
>>>               entry->flags |= KVM_CPUID_FLAG_STATEFUL_FUNC;
>>> +             entry->flags |= KVM_CPUID_FLAG_STATE_READ_NEXT;
>>>               for (t = 1; t < times && *nent < maxnent; ++t) {
>>>                       do_cpuid_1_ent(&entry[t], function, 0);
>>>                       entry[t].flags |= KVM_CPUID_FLAG_STATEFUL_FUNC;
>>>
>>>       
>> Why?
>>     
> This is a bug fix. Without this change the code does not find a leaf 2
> entry in the list.
>
>   

Please send it as a separate patch, with a changelog entry describing 
why it is needed.

>   
>>> @@ -1276,7 +1255,7 @@ static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>>>               entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
>>>               /* read more entries until level_type is zero */
>>>               for (i = 1; *nent < maxnent; ++i) {
>>> -                     level_type = entry[i - 1].ecx & 0xff;
>>> +                     level_type = entry[i - 1].ecx & 0xff00;
>>>                       if (!level_type)
>>>                               break;
>>>                       do_cpuid_1_ent(&entry[i], function, i);
>>>
>>>       
>> Why?
>>     
> This is also a bugfix. The type field is in bits 8-15. As per the code,
> the bits 0-7 would not define the end of counting. You can look at IA32
> processor Software developer's manual for this.
>
>
>   

Separate patch, please.

>>> @@ -1302,10 +1281,14 @@ static int kvm_dev_ioctl_get_supported_cpuid(struct kvm_cpuid2 *cpuid,
>>>  {
>>>       struct kvm_cpuid_entry2 *cpuid_entries;
>>>       int limit, nent = 0, r = -E2BIG;
>>> +     int sizer = 0;
>>>       u32 func;
>>>
>>> -     if (cpuid->nent < 1)
>>> -             goto out;
>>> +     if (cpuid->nent == 0) {
>>> +             sizer = 1;
>>> +             cpuid->nent = KVM_MAX_CPUID_ENTRIES;
>>> +        }
>>> +
>>>
>>>       
>> That's an ABI change.  New userspace could call an older kernel with
>> nent = 0, and get an unexpected response.
>>     
> That is right. What do you suggest for solution? Currently this ABI is
> not utilized. Would adding another ioctl would work?
>
>   

We could have the response to KVM_CHECK_EXTENSION(KVM_CAP_CPUID2 (or 
however it's called)) return the number of entries, and document that 1 
means "unknown" (well, we could have done it with your proposal too).

>
>   
>>> @@ -2860,6 +2845,13 @@ void kvm_emulate_cpuid(struct kvm_vcpu *vcpu)
>>>               kvm_register_write(vcpu, VCPU_REGS_RCX, best->ecx);
>>>               kvm_register_write(vcpu, VCPU_REGS_RDX, best->edx);
>>>       }
>>> +     if (function == 0x1) { /* set the per-cpu apicid */
>>> +             u32 ebx = kvm_register_read(vcpu, VCPU_REGS_RBX);
>>> +             ebx &= 0x00ffffff;
>>> +             ebx |= (vcpu->vcpu_id << 24);
>>> +             kvm_register_write(vcpu, VCPU_REGS_RBX, ebx);
>>> +     }
>>> +
>>>
>>>       
>> ABI change again.  This is userspace's job.
>>     
> This does not look like an ABI change to me. 

Right, it isn't an ABI change.  But we should avoid situations where the 
same call does different things in different kernel versions.

> I will look if vcpu_id is
> available in the userspace, if yes, then this code can be moved into
> userspace.
>
>   

Yes, it's available in userspace.


-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.