From: Boris Ostrovsky <boris.ostrovsky@amd.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: "xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>
Subject: Re: [PATCH] x86/AMD: Add support for AMD's OSVW feature in guests
Date: Tue, 17 Jan 2012 12:54:14 -0500 [thread overview]
Message-ID: <4F15B5C6.6070808@amd.com> (raw)
In-Reply-To: <4F153EBA020000780006D248@nat28.tlf.novell.com>
Responses inline.
On 01/17/12 03:26, Jan Beulich wrote:
>>>> On 16.01.12 at 22:11, Boris Ostrovsky<boris.ostrovsky@amd.com> wrote:
>> --- a/tools/libxc/xc_cpuid_x86.c Mon Jan 09 16:01:44 2012 +0100
>> +++ b/tools/libxc/xc_cpuid_x86.c Mon Jan 16 22:08:09 2012 +0100
>> @@ -108,6 +108,7 @@ static void amd_xc_cpuid_policy(
>> bitmaskof(X86_FEATURE_SSE4A) |
>> bitmaskof(X86_FEATURE_MISALIGNSSE) |
>> bitmaskof(X86_FEATURE_3DNOWPREFETCH) |
>> + bitmaskof(X86_FEATURE_OSVW) |
>
> Indentation.
>
> Also, is this really meaningful to PV guests?
Does amd_xc_cpuid_policy() get called for PV guests? I thought this is
HVM path only.
xc_cpuid_pv_policy() is where clear_bit(X86_FEATURE_OSVW, regs[2]) was
removed to handle PV.
I believe this (i.e. OSVW changes) is meaningful for PV. Taking erratum
400 as an example -- we don't need a Linux PV guest reading an MSR
before going to idle (in amd_e400_idle()).
> And valid for pre-Fam10?
No, it indeed shouldn't be set in those cases.
I could set the bit conditionally, based on host family but the trouble
is I don't necessarily know what family the guest will be. Or is this
known at this point?
>
>> bitmaskof(X86_FEATURE_XOP) |
>> bitmaskof(X86_FEATURE_FMA4) |
>> bitmaskof(X86_FEATURE_TBM) |
>> --- a/xen/arch/x86/cpu/amd.c Mon Jan 09 16:01:44 2012 +0100
>> +++ b/xen/arch/x86/cpu/amd.c Mon Jan 16 22:08:09 2012 +0100
>> @@ -32,6 +32,13 @@
>> static char opt_famrev[14];
>> string_param("cpuid_mask_cpu", opt_famrev);
>>
>> +/*
>> + * Set osvw_len to higher number when updated Revision Guides
>> + * are published and we know what the new status bits are
>> + */
>
> This is ugly, as it requires permanent attention. The value to start
> with should really be 64 (beyond which other code adjustments are
> going to be necessary anyway).
I went back and forth on this. The reason I settled on 4 is because we
obviously don't know what errata the higher bits will cover and because
it is (at least theoretically) possible that a guest shouldn't see the
same status bits as the host.
But maybe code maintenance is more burden than it's worth?
>
>> +static uint64_t osvw_length = 4, osvw_status;
>> +static DEFINE_SPINLOCK(amd_lock);
>> +
>> static inline void wrmsr_amd(unsigned int index, unsigned int lo,
>> unsigned int hi)
>> {
>> @@ -182,6 +189,35 @@ static void __devinit set_cpuidmask(cons
>> }
>> }
>>
>> +static void amd_guest_osvw_init(struct vcpu *vcpu)
>> +{
>> + if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD)
>> + return;
>
> Shouldn't we bail here for pre-Fam10 CPUs too?
As you noted below the vendor check should be removed anyway since this
is being called from under such a check already.
As for family --- I think I need to solve how X86_FEATURE_OSVW is set
and that will take care of whether anything that gets initialized in
this routine is ever used.
>
> Further, as asked above already - is all this really meaningful to PV
> guests? Otherwise the code would better go somewhere under
> xen/arch/x86/hvm/svm/ ...
>
> Also, throughout this file indentation needs to be changed to Linux
> style (tabs instead of spaces), unless you want to contribute a patch
> to convert the whole file to Xen style (in which case you'd still need
> to make adjustments here).
>
>> +
>> + /*
>> + * Guests should see errata 400 and 415 as fixed (assuming that
>> + * HLT and IO instructions are intercepted).
>> + */
>> + vcpu->arch.amd.osvw.length = (osvw_length>= 3) ? (osvw_length) : 3;
>
> An expression consisting of just an identifier for sure doesn't need
> parentheses.
>
>> + vcpu->arch.amd.osvw.status = osvw_status& ~(6ULL);
>> +
>> + /*
>> + * By increasing VCPU's osvw.length to 3 we are telling the guest that
>> + * all osvw.status bits inside that length, including bit 0 (which is
>> + * reserved for erratum 298), are valid. However, if host processor's
>> + * osvw_len is 0 then osvw_status[0] carries no information. We need to
>> + * be conservative here and therefore we tell the guest that erratum 298
>> + * is present (because we really don't know).
>> + */
>> + if (osvw_length == 0&& boot_cpu_data.x86 == 0x10)
>
> Why do you check the family here? If osvw_length can only ever be
> zero on Fam10, then the first check is sufficient. If osvw_length can
> be zero on other than Fam10 (no matter whether we bail early older
> CPUs), then the check is actually wrong.
10h is the only family affected by this erratum.
>
>> + vcpu->arch.amd.osvw.status |= 1;
>> +}
>> +
>> +void amd_vcpu_initialise(struct vcpu *vcpu)
>> +{
>> + amd_guest_osvw_init(vcpu);
>> +}
>> +
>> /*
>> * Check for the presence of an AMD erratum. Arguments are defined in amd.h
>>
>> * for each known erratum. Return 1 if erratum is found.
>> @@ -512,6 +548,30 @@ static void __devinit init_amd(struct cp
>> set_cpuidmask(c);
>>
>> check_syscfg_dram_mod_en();
>> +
>> + /*
>> + * Get OSVW bits. If bits are not the same on different processors then
>> + * choose the worst case (i.e. if erratum is present on one processor and
>> + * not on another assume that the erratum is present everywhere).
>> + */
>> + if (test_bit(X86_FEATURE_OSVW,&boot_cpu_data.x86_capability)) {
>> + uint64_t len, status;
>> +
>> + if (rdmsr_safe(MSR_AMD_OSVW_ID_LENGTH, len) ||
>> + rdmsr_safe(MSR_AMD_OSVW_STATUS, status))
>> + len = status = 0;
>> +
>> + spin_lock(&amd_lock);
>
> What is the locking here supposed to protect against? The AP call tree
> is smp_callin() -> identify_cpu() -> init_amd(), which cannot race with
> anything (as the initiating processor is waiting for cpu_state to change
> to CPU_STATE_CALLIN).
>
>> +
>> + if (len< osvw_length)
>> + osvw_length = len;
>> +
>> + osvw_status |= status;
>> + osvw_status&= (1ULL<< osvw_length) - 1;
>> +
>> + spin_unlock(&amd_lock);
>> + } else
>> + osvw_length = osvw_status = 0;
>> }
>>
>> static struct cpu_dev amd_cpu_dev __cpuinitdata = {
>> --- a/xen/arch/x86/domain.c Mon Jan 09 16:01:44 2012 +0100
>> +++ b/xen/arch/x86/domain.c Mon Jan 16 22:08:09 2012 +0100
>> @@ -422,6 +422,10 @@ int vcpu_initialise(struct vcpu *v)
>> if ( (rc = vcpu_init_fpu(v)) != 0 )
>> return rc;
>>
>> + /* Vendor-specific initialization */
>> + if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
>
> If you check the vendor here, you don't need to anywhere in the
> descendant functions.
>
>> + amd_vcpu_initialise(v);
>> +
>> if ( is_hvm_domain(d) )
>> {
>> rc = hvm_vcpu_initialise(v);
>> --- a/xen/arch/x86/hvm/svm/svm.c Mon Jan 09 16:01:44 2012 +0100
>> +++ b/xen/arch/x86/hvm/svm/svm.c Mon Jan 16 22:08:09 2012 +0100
>> @@ -1044,6 +1044,27 @@ static void svm_init_erratum_383(struct
>> }
>> }
>>
>> +static int svm_handle_osvw(struct vcpu *v, uint32_t msr, uint64_t *val, uint read)
>> +{
>> + uint eax, ebx, ecx, edx;
>> +
>> + /* Guest OSVW support */
>> + hvm_cpuid(0x80000001,&eax,&ebx,&ecx,&edx);
>> + if (!test_bit((X86_FEATURE_OSVW& 31),&ecx))
>
> Formatting of changes to this file should be changed to Xen style.
>
>> + return -1;
>> +
>> + if (read) {
>> + if (msr == MSR_AMD_OSVW_ID_LENGTH)
>> + *val = v->arch.amd.osvw.length;
>> + else
>> + *val = v->arch.amd.osvw.status;
>> + }
>> + /* Writes are ignored */
>> +
>> + return 0;
>> +}
>> +
>> +
>> static int svm_cpu_up(void)
>> {
>> uint64_t msr_content;
>> --- a/xen/arch/x86/traps.c Mon Jan 09 16:01:44 2012 +0100
>> +++ b/xen/arch/x86/traps.c Mon Jan 16 22:08:09 2012 +0100
>> @@ -2542,6 +2542,15 @@ static int emulate_privileged_op(struct
>> if ( wrmsr_safe(regs->ecx, msr_content) != 0 )
>> goto fail;
>> break;
>> + case MSR_AMD_OSVW_ID_LENGTH:
>> + case MSR_AMD_OSVW_STATUS:
>> + if ( boot_cpu_data.x86_vendor == X86_VENDOR_AMD ) {
>> + if (!boot_cpu_has(X86_FEATURE_OSVW))
>> + goto fail;
>> + else
>
> Bogus "else" after a "goto". And I wonder, provided there is some
> point in doing all this for PV guests in the first place, whether this
> shouldn't be kept getting handled by the default case.
If we go into default case we will emit "Domain attempted WRMSR ..."
message in the log. I was trying to avoid that since writing to these
registers is not really an error, it just gets ignored.
I'll fix the rest.
Thanks for review.
-boris
>
>> + break; /* Writes are ignored */
>> + }
>> + /* Fall through to default case */
>> default:
>> if ( wrmsr_hypervisor_regs(regs->ecx, msr_content) )
>> break;
>> @@ -2573,6 +2582,7 @@ static int emulate_privileged_op(struct
>> break;
>>
>> case 0x32: /* RDMSR */
>> +
>
> Stray addition of a newline.
>
>> switch ( (u32)regs->ecx )
>> {
>> #ifdef CONFIG_X86_64
>> @@ -2632,6 +2642,23 @@ static int emulate_privileged_op(struct
>> regs->eax = (uint32_t)msr_content;
>> regs->edx = (uint32_t)(msr_content>> 32);
>> break;
>> + case MSR_AMD_OSVW_ID_LENGTH:
>> + case MSR_AMD_OSVW_STATUS:
>> + if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) {
>> + if (!boot_cpu_has(X86_FEATURE_OSVW))
>> + goto fail;
>> + else {
>
> Bogus "else" after a "goto".
>
> Jan
>
>> + if ((u32)regs->ecx == MSR_AMD_OSVW_ID_LENGTH)
>> + msr_content = v->arch.amd.osvw.length;
>> + else
>> + msr_content = v->arch.amd.osvw.status;
>> +
>> + regs->eax = (uint32_t)msr_content;
>> + regs->edx = (uint32_t)(msr_content>> 32);
>> + }
>> + } else
>> + goto rdmsr_normal;
>> + break;
>> default:
>> if ( rdmsr_hypervisor_regs(regs->ecx,&val) )
>> {
>
>
next prev parent reply other threads:[~2012-01-17 17:54 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-01-16 21:11 [PATCH] x86/AMD: Add support for AMD's OSVW feature in guests Boris Ostrovsky
2012-01-17 8:26 ` Jan Beulich
2012-01-17 17:54 ` Boris Ostrovsky [this message]
2012-01-18 9:50 ` Jan Beulich
2012-01-18 18:26 ` Boris Ostrovsky
2012-01-19 8:10 ` Jan Beulich
2012-01-19 8:46 ` Keir Fraser
2012-01-19 15:57 ` Boris Ostrovsky
2012-01-19 15:22 ` Boris Ostrovsky
2012-01-19 16:29 ` Jan Beulich
-- strict thread matches above, loose matches on Subject: below --
2012-01-30 17:05 Boris Ostrovsky
2012-01-31 10:13 ` Jan Beulich
2012-01-31 10:36 ` Christoph Egger
2012-01-31 10:39 ` Keir Fraser
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4F15B5C6.6070808@amd.com \
--to=boris.ostrovsky@amd.com \
--cc=JBeulich@suse.com \
--cc=xen-devel@lists.xensource.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).