From: Boris Ostrovsky <boris.ostrovsky@amd.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: "xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>
Subject: Re: [PATCH] x86/AMD: Add support for AMD's OSVW feature in guests
Date: Tue, 17 Jan 2012 12:54:14 -0500 [thread overview]
Message-ID: <4F15B5C6.6070808@amd.com> (raw)
In-Reply-To: <4F153EBA020000780006D248@nat28.tlf.novell.com>
Responses inline.
On 01/17/12 03:26, Jan Beulich wrote:
>>>> On 16.01.12 at 22:11, Boris Ostrovsky<boris.ostrovsky@amd.com> wrote:
>> --- a/tools/libxc/xc_cpuid_x86.c Mon Jan 09 16:01:44 2012 +0100
>> +++ b/tools/libxc/xc_cpuid_x86.c Mon Jan 16 22:08:09 2012 +0100
>> @@ -108,6 +108,7 @@ static void amd_xc_cpuid_policy(
>> bitmaskof(X86_FEATURE_SSE4A) |
>> bitmaskof(X86_FEATURE_MISALIGNSSE) |
>> bitmaskof(X86_FEATURE_3DNOWPREFETCH) |
>> + bitmaskof(X86_FEATURE_OSVW) |
>
> Indentation.
>
> Also, is this really meaningful to PV guests?
Does amd_xc_cpuid_policy() get called for PV guests? I thought this is
HVM path only.
xc_cpuid_pv_policy() is where clear_bit(X86_FEATURE_OSVW, regs[2]) was
removed to handle PV.
I believe this (i.e. OSVW changes) is meaningful for PV. Taking erratum
400 as an example -- we don't need a Linux PV guest reading an MSR
before going to idle (in amd_e400_idle()).
> And valid for pre-Fam10?
No, it indeed shouldn't be set in those cases.
I could set the bit conditionally, based on host family but the trouble
is I don't necessarily know what family the guest will be. Or is this
known at this point?
>
>> bitmaskof(X86_FEATURE_XOP) |
>> bitmaskof(X86_FEATURE_FMA4) |
>> bitmaskof(X86_FEATURE_TBM) |
>> --- a/xen/arch/x86/cpu/amd.c Mon Jan 09 16:01:44 2012 +0100
>> +++ b/xen/arch/x86/cpu/amd.c Mon Jan 16 22:08:09 2012 +0100
>> @@ -32,6 +32,13 @@
>> static char opt_famrev[14];
>> string_param("cpuid_mask_cpu", opt_famrev);
>>
>> +/*
>> + * Set osvw_len to higher number when updated Revision Guides
>> + * are published and we know what the new status bits are
>> + */
>
> This is ugly, as it requires permanent attention. The value to start
> with should really be 64 (beyond which other code adjustments are
> going to be necessary anyway).
I went back and forth on this. The reason I settled on 4 is because we
obviously don't know what errata the higher bits will cover and because
it is (at least theoretically) possible that a guest shouldn't see the
same status bits as the host.
But maybe code maintenance is more burden than it's worth?
>
>> +static uint64_t osvw_length = 4, osvw_status;
>> +static DEFINE_SPINLOCK(amd_lock);
>> +
>> static inline void wrmsr_amd(unsigned int index, unsigned int lo,
>> unsigned int hi)
>> {
>> @@ -182,6 +189,35 @@ static void __devinit set_cpuidmask(cons
>> }
>> }
>>
>> +static void amd_guest_osvw_init(struct vcpu *vcpu)
>> +{
>> + if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD)
>> + return;
>
> Shouldn't we bail here for pre-Fam10 CPUs too?
As you noted below the vendor check should be removed anyway since this
is being called from under such a check already.
As for family --- I think I need to solve how X86_FEATURE_OSVW is set
and that will take care of whether anything that gets initialized in
this routine is ever used.
>
> Further, as asked above already - is all this really meaningful to PV
> guests? Otherwise the code would better go somewhere under
> xen/arch/x86/hvm/svm/ ...
>
> Also, throughout this file indentation needs to be changed to Linux
> style (tabs instead of spaces), unless you want to contribute a patch
> to convert the whole file to Xen style (in which case you'd still need
> to make adjustments here).
>
>> +
>> + /*
>> + * Guests should see errata 400 and 415 as fixed (assuming that
>> + * HLT and IO instructions are intercepted).
>> + */
>> + vcpu->arch.amd.osvw.length = (osvw_length>= 3) ? (osvw_length) : 3;
>
> An expression consisting of just an identifier for sure doesn't need
> parentheses.
>
>> + vcpu->arch.amd.osvw.status = osvw_status& ~(6ULL);
>> +
>> + /*
>> + * By increasing VCPU's osvw.length to 3 we are telling the guest that
>> + * all osvw.status bits inside that length, including bit 0 (which is
>> + * reserved for erratum 298), are valid. However, if host processor's
>> + * osvw_len is 0 then osvw_status[0] carries no information. We need to
>> + * be conservative here and therefore we tell the guest that erratum 298
>> + * is present (because we really don't know).
>> + */
>> + if (osvw_length == 0&& boot_cpu_data.x86 == 0x10)
>
> Why do you check the family here? If osvw_length can only ever be
> zero on Fam10, then the first check is sufficient. If osvw_length can
> be zero on other than Fam10 (no matter whether we bail early older
> CPUs), then the check is actually wrong.
10h is the only family affected by this erratum.
>
>> + vcpu->arch.amd.osvw.status |= 1;
>> +}
>> +
>> +void amd_vcpu_initialise(struct vcpu *vcpu)
>> +{
>> + amd_guest_osvw_init(vcpu);
>> +}
>> +
>> /*
>> * Check for the presence of an AMD erratum. Arguments are defined in amd.h
>>
>> * for each known erratum. Return 1 if erratum is found.
>> @@ -512,6 +548,30 @@ static void __devinit init_amd(struct cp
>> set_cpuidmask(c);
>>
>> check_syscfg_dram_mod_en();
>> +
>> + /*
>> + * Get OSVW bits. If bits are not the same on different processors then
>> + * choose the worst case (i.e. if erratum is present on one processor and
>> + * not on another assume that the erratum is present everywhere).
>> + */
>> + if (test_bit(X86_FEATURE_OSVW,&boot_cpu_data.x86_capability)) {
>> + uint64_t len, status;
>> +
>> + if (rdmsr_safe(MSR_AMD_OSVW_ID_LENGTH, len) ||
>> + rdmsr_safe(MSR_AMD_OSVW_STATUS, status))
>> + len = status = 0;
>> +
>> + spin_lock(&amd_lock);
>
> What is the locking here supposed to protect against? The AP call tree
> is smp_callin() -> identify_cpu() -> init_amd(), which cannot race with
> anything (as the initiating processor is waiting for cpu_state to change
> to CPU_STATE_CALLIN).
>
>> +
>> + if (len< osvw_length)
>> + osvw_length = len;
>> +
>> + osvw_status |= status;
>> + osvw_status&= (1ULL<< osvw_length) - 1;
>> +
>> + spin_unlock(&amd_lock);
>> + } else
>> + osvw_length = osvw_status = 0;
>> }
>>
>> static struct cpu_dev amd_cpu_dev __cpuinitdata = {
>> --- a/xen/arch/x86/domain.c Mon Jan 09 16:01:44 2012 +0100
>> +++ b/xen/arch/x86/domain.c Mon Jan 16 22:08:09 2012 +0100
>> @@ -422,6 +422,10 @@ int vcpu_initialise(struct vcpu *v)
>> if ( (rc = vcpu_init_fpu(v)) != 0 )
>> return rc;
>>
>> + /* Vendor-specific initialization */
>> + if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
>
> If you check the vendor here, you don't need to anywhere in the
> descendant functions.
>
>> + amd_vcpu_initialise(v);
>> +
>> if ( is_hvm_domain(d) )
>> {
>> rc = hvm_vcpu_initialise(v);
>> --- a/xen/arch/x86/hvm/svm/svm.c Mon Jan 09 16:01:44 2012 +0100
>> +++ b/xen/arch/x86/hvm/svm/svm.c Mon Jan 16 22:08:09 2012 +0100
>> @@ -1044,6 +1044,27 @@ static void svm_init_erratum_383(struct
>> }
>> }
>>
>> +static int svm_handle_osvw(struct vcpu *v, uint32_t msr, uint64_t *val, uint read)
>> +{
>> + uint eax, ebx, ecx, edx;
>> +
>> + /* Guest OSVW support */
>> + hvm_cpuid(0x80000001,&eax,&ebx,&ecx,&edx);
>> + if (!test_bit((X86_FEATURE_OSVW& 31),&ecx))
>
> Formatting of changes to this file should be changed to Xen style.
>
>> + return -1;
>> +
>> + if (read) {
>> + if (msr == MSR_AMD_OSVW_ID_LENGTH)
>> + *val = v->arch.amd.osvw.length;
>> + else
>> + *val = v->arch.amd.osvw.status;
>> + }
>> + /* Writes are ignored */
>> +
>> + return 0;
>> +}
>> +
>> +
>> static int svm_cpu_up(void)
>> {
>> uint64_t msr_content;
>> --- a/xen/arch/x86/traps.c Mon Jan 09 16:01:44 2012 +0100
>> +++ b/xen/arch/x86/traps.c Mon Jan 16 22:08:09 2012 +0100
>> @@ -2542,6 +2542,15 @@ static int emulate_privileged_op(struct
>> if ( wrmsr_safe(regs->ecx, msr_content) != 0 )
>> goto fail;
>> break;
>> + case MSR_AMD_OSVW_ID_LENGTH:
>> + case MSR_AMD_OSVW_STATUS:
>> + if ( boot_cpu_data.x86_vendor == X86_VENDOR_AMD ) {
>> + if (!boot_cpu_has(X86_FEATURE_OSVW))
>> + goto fail;
>> + else
>
> Bogus "else" after a "goto". And I wonder, provided there is some
> point in doing all this for PV guests in the first place, whether this
> shouldn't be kept getting handled by the default case.
If we go into default case we will emit "Domain attempted WRMSR ..."
message in the log. I was trying to avoid that since writing to these
registers is not really an error, it just gets ignored.
I'll fix the rest.
Thanks for review.
-boris
>
>> + break; /* Writes are ignored */
>> + }
>> + /* Fall through to default case */
>> default:
>> if ( wrmsr_hypervisor_regs(regs->ecx, msr_content) )
>> break;
>> @@ -2573,6 +2582,7 @@ static int emulate_privileged_op(struct
>> break;
>>
>> case 0x32: /* RDMSR */
>> +
>
> Stray addition of a newline.
>
>> switch ( (u32)regs->ecx )
>> {
>> #ifdef CONFIG_X86_64
>> @@ -2632,6 +2642,23 @@ static int emulate_privileged_op(struct
>> regs->eax = (uint32_t)msr_content;
>> regs->edx = (uint32_t)(msr_content>> 32);
>> break;
>> + case MSR_AMD_OSVW_ID_LENGTH:
>> + case MSR_AMD_OSVW_STATUS:
>> + if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) {
>> + if (!boot_cpu_has(X86_FEATURE_OSVW))
>> + goto fail;
>> + else {
>
> Bogus "else" after a "goto".
>
> Jan
>
>> + if ((u32)regs->ecx == MSR_AMD_OSVW_ID_LENGTH)
>> + msr_content = v->arch.amd.osvw.length;
>> + else
>> + msr_content = v->arch.amd.osvw.status;
>> +
>> + regs->eax = (uint32_t)msr_content;
>> + regs->edx = (uint32_t)(msr_content>> 32);
>> + }
>> + } else
>> + goto rdmsr_normal;
>> + break;
>> default:
>> if ( rdmsr_hypervisor_regs(regs->ecx,&val) )
>> {
>
>
next prev parent reply other threads:[~2012-01-17 17:54 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-01-16 21:11 [PATCH] x86/AMD: Add support for AMD's OSVW feature in guests Boris Ostrovsky
2012-01-17 8:26 ` Jan Beulich
2012-01-17 17:54 ` Boris Ostrovsky [this message]
2012-01-18 9:50 ` Jan Beulich
2012-01-18 18:26 ` Boris Ostrovsky
2012-01-19 8:10 ` Jan Beulich
2012-01-19 8:46 ` Keir Fraser
2012-01-19 15:57 ` Boris Ostrovsky
2012-01-19 15:22 ` Boris Ostrovsky
2012-01-19 16:29 ` Jan Beulich
-- strict thread matches above, loose matches on Subject: below --
2012-01-30 17:05 Boris Ostrovsky
2012-01-31 10:13 ` Jan Beulich
2012-01-31 10:36 ` Christoph Egger
2012-01-31 10:39 ` Keir Fraser
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4F15B5C6.6070808@amd.com \
--to=boris.ostrovsky@amd.com \
--cc=JBeulich@suse.com \
--cc=xen-devel@lists.xensource.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.