From mboxrd@z Thu Jan 1 00:00:00 1970 From: Egger Christoph Subject: Re: [PATCH] x86/MCE: Present MSR_IA32_MCx_MISC(2-6) as invalid on AMD Date: Wed, 13 Mar 2013 10:16:30 +0100 Message-ID: <514043EE.3040801@amazon.de> References: <1363102363-1719-1-git-send-email-boris.ostrovsky@oracle.com> <513F674202000078000C504A@nat28.tlf.novell.com> <513F5F1F.6080107@oracle.com> <51403B9E02000078000C5384@nat28.tlf.novell.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <51403B9E02000078000C5384@nat28.tlf.novell.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Jan Beulich Cc: bp@alien.de, Boris Ostrovsky , xen-devel@lists.xen.org List-Id: xen-devel@lists.xenproject.org On 13.03.13 08:41, Jan Beulich wrote: >>>> On 12.03.13 at 18:00, Boris Ostrovsky wrote: >> On 03/12/2013 12:34 PM, Jan Beulich wrote: >>>>>> On 12.03.13 at 16:32, Boris Ostrovsky wrote: >>>> MSR_IA32_MCx_MISC(4) register on AMD processors is used for error >>>> thresholding. PV guests may try to set it up for threshold >>>> interrupts which will fail and result in these warnings in the log: >>>> >>>> [Firmware Bug]: cpu 0, try to use APIC510 (LVT offset 1) for vector >>>> 0xf9, but the register is already in use for vector 0x0 on this cpu >>>> >>>> Mark this register as invalid to avoid this. While at it, also present >>>> other MSR_IA32_MCx_MISC() registers as invalid (except for the first >>>> GUEST_MC_BANK_NUM which are emulated). >>> Hmm, I'm not convinced. A PV guest shouldn't, by definition, try to >>> set up APIC LVTs (or else it is only partially PV). >> >> In Linux, bank 4 is assumed to support LVT interrupts >> (lvt_interrupt_supported()) and that leads to the guest trying to set it >> up via mce_amd_feature_init()->setup_APIC_mce()->setup_APIC_eilvt(). > > So for the PV case this call chain needs to be broken at some > suitable point. Nothing LAPIC related should be done in a PV > kernel. > >>>> --- a/xen/arch/x86/cpu/mcheck/mce.h >>>> +++ b/xen/arch/x86/cpu/mcheck/mce.h >>>> @@ -166,6 +166,7 @@ static inline int mce_vendor_bank_msr(const struct vcpu >> *v, uint32_t msr) >>>> case MSR_F10_MC4_MISC1: >>>> case MSR_F10_MC4_MISC2: >>>> case MSR_F10_MC4_MISC3: >>>> + case MSR_IA32_MCx_MISC(GUEST_MC_BANK_NUM)...MSR_IA32_MCx_MISC(6): >>> And if we take this, then I'd like to see an explanation of the magic >>> 6 here, including rationale why going forward there wouldn't be a >>> need to bump this to 7, 8, etc. >> >> As of today, 6 banks are supported (although bank #6, MSR0000_041B, >> appears to >> be a stub and in fact APM lists only 5 banks). >> >> I suppose we can look at MCG_CAP[BANK_CNT]. Is that what you are suggesting. > > Yes, that sounds like the right thing. > >>> Plus, even if it happens to work, it's not intended for architectural >>> MSRs to be dealt with in mce_vendor_bank_msr() (as that code is >>> expected to match the default cases in bank_mce_{rd,wr}msr()). >>> In other words, the change as is would create a latent bug. >> >> Not sure I follow this. My understanding is that mce_vendor_bank_msr() >> is there to >> indicate that the register is a bank register and should be dealt with >> in bank_mce_rdmsr(). > > No. CTL, STATUS, ADDR, and MISC aren't vendor specific, and > hence should be dealt with in bank_mce_rdmsr() in any case. The > upper bound here is intentionally > > MSR_IA32_MCx_CTL(v->arch.vmce.mcg_cap & MCG_CAP_COUNT) > > as that's what the guest gets announced through MCG_CAP. > >> And with this change bank_mce_rdmsr() will indeed deal with it by >> returning 0. > > But the guest shouldn't be accessing higher banks' MSRs, as it > never was told these banks exist. Or else we have a problem > _there_, not in the handling of the MSR accesses. Good point Jan. My fault was I only considered the HVM part. I fully agree with you and withdraw my Acked-by. Christoph