From mboxrd@z Thu Jan 1 00:00:00 1970 From: Aravind Gopalakrishnan Subject: Re: [PATCH] mcheck, vmce: Allow vmce_amd_* functions to handle AMD thresolding MSRs Date: Fri, 7 Feb 2014 15:27:25 -0600 Message-ID: <20140207212724.GD8837@arav-dinar> References: <1391733176-2941-1-git-send-email-aravind.gopalakrishnan@amd.com> <52F4CBFD020000780011A2F1@nat28.tlf.novell.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Content-Disposition: inline In-Reply-To: <52F4CBFD020000780011A2F1@nat28.tlf.novell.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Jan Beulich Cc: jinsong.liu@intel.com, boris.ostrovsky@oracle.com, chegger@amazon.de, suravee.suthikulpanit@amd.com, xen-devel@lists.xen.org List-Id: xen-devel@lists.xenproject.org On Fri, Feb 07, 2014 at 11:05:17AM +0000, Jan Beulich wrote: > >>> On 07.02.14 at 01:32, Aravind Gopalakrishnan wrote: > > - case MSR_F10_MC4_MISC1: /* DRAM error type */ > > - v->arch.vmce.bank[1].mci_misc = val; > > - mce_printk(MCE_VERBOSE, "MCE: wr msr %#"PRIx64"\n", val); > > - break; > > - case MSR_F10_MC4_MISC2: /* Link error type */ > > - case MSR_F10_MC4_MISC3: /* L3 cache error type */ > > - /* ignore write: we do not emulate link and l3 cache errors > > - * to the guest. > > - */ > > - mce_printk(MCE_VERBOSE, "MCE: wr msr %#"PRIx64"\n", val); > > - break; > > - default: > > - return 0; > > - } > > + /* If not present, #GP fault, else do nothing as we don't emulate */ > > + if ( !amd_thresholding_reg_present(msr) ) > > + return -1; > > The one thing I'm concerned about making this #GP in the guest is > migration: With it being _newer_ CPUs implementing fewer of these > MSRs, it would be impossible to migrate a guest from an older system > to a newer one - a direction that (as long as the newer system > provides all the hardware capabilities the older one has) is generally > assumed to work. Bottom line - we're probably better off always > dropping writes, and always returning zero for reads. Which will > eliminate the need for amd_thresholding_reg_present(). > Before I go ahead and remove the function, few questions- Assuming there is a tool in the guest that accesses these MSRs, wouldn't it be fair to expect that the tool keep in mind these MSRs exist only in certain families? For example: if there's a guest running on F10 that accesses 0xc000040a, that would be fine. But once we migrate to a newer family, then the guest should not even generate accesses to the MSR. Also, returning #GP to guests would mean keeping it consistent with HW behavior. If we return zero for reads, (IMHO) it's not necessarily correct information as the register does not even exist.. Bare-metal cases will face same problems too.. but if a register doesn't exist, then shouldn't OS/hypervisor just say so and let whoever generated the access deal with it? Thanks, -Aravind.