Re: [PATCH] mcheck, vmce: Allow vmce_amd_* functions to handle AMD thresolding MSRs

From: "Egger, Christoph" <chegger@amazon.de>
To: Jan Beulich <JBeulich@suse.com>,
	Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com>
Cc: jinsong.liu@intel.com, boris.ostrovsky@oracle.com,
	suravee.suthikulpanit@amd.com, xen-devel@lists.xen.org
Subject: Re: [PATCH] mcheck, vmce: Allow vmce_amd_* functions to handle AMD thresolding MSRs
Date: Wed, 12 Feb 2014 10:58:58 +0100
Message-ID: <52FB45E2.8090703@amazon.de>
In-Reply-To: <52F890CD020000780011A95D@nat28.tlf.novell.com>

On 10.02.14 08:41, Jan Beulich wrote:
>>>> On 07.02.14 at 22:27, Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com>
> wrote:
>> On Fri, Feb 07, 2014 at 11:05:17AM +0000, Jan Beulich wrote:
>>>>>> On 07.02.14 at 01:32, Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com> 
>> wrote:
>>>> -	case MSR_F10_MC4_MISC1: /* DRAM error type */
>>>> -		v->arch.vmce.bank[1].mci_misc = val; 
>>>> -		mce_printk(MCE_VERBOSE, "MCE: wr msr %#"PRIx64"\n", val);
>>>> -		break;
>>>> -	case MSR_F10_MC4_MISC2: /* Link error type */
>>>> -	case MSR_F10_MC4_MISC3: /* L3 cache error type */
>>>> -		/* ignore write: we do not emulate link and l3 cache errors
>>>> -		 * to the guest.
>>>> -		 */
>>>> -		mce_printk(MCE_VERBOSE, "MCE: wr msr %#"PRIx64"\n", val);
>>>> -		break;
>>>> -	default:
>>>> -		return 0;
>>>> -	}
>>>> +    /* If not present, #GP fault, else do nothing as we don't emulate */
>>>> +    if ( !amd_thresholding_reg_present(msr) )
>>>> +        return -1;
>>>
>>> The one thing I'm concerned about making this #GP in the guest is
>>> migration: With it being _newer_ CPUs implementing fewer of these
>>> MSRs, it would be impossible to migrate a guest from an older system
>>> to a newer one - a direction that (as long as the newer system
>>> provides all the hardware capabilities the older one has) is generally
>>> assumed to work. Bottom line - we're probably better off always
>>> dropping writes, and always returning zero for reads. Which will
>>> eliminate the need for amd_thresholding_reg_present().
>>>
>>
>> Before I go ahead and remove the function, few questions-
>>
>> Assuming there is a tool in the guest that accesses these MSRs,
>> wouldn't it be fair to expect that the tool keep in mind these MSRs
>> exist only in certain families?
>>
>> For example:
>> if there's a guest running on F10 that accesses 0xc000040a, that would
>> be fine. But once we migrate to a newer family, then the guest should
>> not even generate accesses to the MSR.
> 
> All correct, provided the family check and the MSR access aren't
> separated by a migration.
> 
>> Also, returning #GP to guests would mean keeping it consistent with HW
>> behavior. If we return zero for reads, (IMHO) it's not necessarily
>> correct information as the register does not even exist.. 
>>
>> Bare-metal cases will face same problems too.. but if a register doesn't
>> exist, then shouldn't OS/hypervisor just say so and let whoever
>> generated the access deal with it?
> 
> That's all valid argumentation as long as you leave migration out
> of the picture.

I agree with Jan. All of these arguments are valid from a hardware
perspective.

Apart from migration, there is another perspective you are missing
completely: the vmce_amd_* functions (and also the corresponding Intel
functions) deal with *virtual* MSRs, that is, with what should happen
to the guest when the guest accesses them.

This has absolutely nothing to do with what the hardware does or does
not provide. The point is, the guest knows (or rather assumes) which
MSRs exist from the CPU family/model information it gets via CPUID. The
question is what should happen when the guest accesses these MSRs.
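
For illustration only, the "always drop writes, always return zero for
reads" policy Jan suggests could be sketched roughly as below. This is not
the patch and not Xen's code: the sketch_* names, the result enum and the
MISC1 MSR number are assumptions made up for this example (only
0xc000040a is mentioned in this thread). The point it shows is that the
decision depends only on the virtual MSR number, never on what the host
CPU implements.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed MSR range for the MC4_MISC1..3 thresholding registers.
 * Only 0xc000040a appears in this thread; the lower bound is an
 * assumption for the purpose of the sketch. */
#define SKETCH_MC4_MISC1 0xc0000408u
#define SKETCH_MC4_MISC3 0xc000040au

/* Hypothetical result codes, not Xen's actual return convention. */
enum vmsr_result {
    VMSR_NOT_OURS, /* not a thresholding MSR; the caller decides */
    VMSR_HANDLED,  /* access emulated, no fault injected */
};

static bool sketch_is_thresholding_msr(uint32_t msr)
{
    return msr >= SKETCH_MC4_MISC1 && msr <= SKETCH_MC4_MISC3;
}

/* Read: always report zero, independent of the host CPU family. */
static enum vmsr_result sketch_vmce_rdmsr(uint32_t msr, uint64_t *val)
{
    if ( !sketch_is_thresholding_msr(msr) )
        return VMSR_NOT_OURS;

    *val = 0;
    return VMSR_HANDLED;
}

/* Write: silently drop the value, never raise #GP. */
static enum vmsr_result sketch_vmce_wrmsr(uint32_t msr, uint64_t val)
{
    (void)val; /* intentionally ignored */

    return sketch_is_thresholding_msr(msr) ? VMSR_HANDLED : VMSR_NOT_OURS;
}
```

With such a policy, a guest that checked the family before a migration
and touches the MSR afterwards sees the same behaviour on either host.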

To get this right, the questions are:
What should the hypervisor do for recovery?
Does it make sense to make the guest aware of it?

Christoph

Thread overview: 6+ messages
2014-02-07  0:32 [PATCH] mcheck, vmce: Allow vmce_amd_* functions to handle AMD thresolding MSRs Aravind Gopalakrishnan
2014-02-07 11:05 ` Jan Beulich
2014-02-07 21:27   ` Aravind Gopalakrishnan
2014-02-10  7:41     ` Jan Beulich
2014-02-10 16:54       ` Aravind Gopalakrishnan
2014-02-12  9:58       ` Egger, Christoph [this message]