From mboxrd@z Thu Jan 1 00:00:00 1970 From: Eduardo Habkost Subject: Re: MCG_CAP ABI breakage (was Re: [Qemu-devel] [PATCH] target-i386: Do not set MCG_SER_P by default) Date: Tue, 24 Nov 2015 14:36:20 -0200 Message-ID: <20151124163620.GP23717@thinpad.lan.raisama.net> References: <1448060471-14128-1-git-send-email-bp@alien8.de> <564FA8A7.3000500@suse.de> <20151121010925.GB3994@pd.tnic> <20151123132237.GM23717@thinpad.lan.raisama.net> <20151123151127.GM20436@thinpad.lan.raisama.net> <20151123164314.GF5134@pd.tnic> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: KVM , Marcelo Tosatti , qemu-devel , Huang Ying , Paolo Bonzini , Andreas =?iso-8859-1?Q?F=E4rber?= , Richard Henderson To: Borislav Petkov Return-path: Received: from mx1.redhat.com ([209.132.183.28]:43757 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751301AbbKXQgX (ORCPT ); Tue, 24 Nov 2015 11:36:23 -0500 Content-Disposition: inline In-Reply-To: <20151123164314.GF5134@pd.tnic> Sender: kvm-owner@vger.kernel.org List-ID: On Mon, Nov 23, 2015 at 05:43:14PM +0100, Borislav Petkov wrote: > On Mon, Nov 23, 2015 at 01:11:27PM -0200, Eduardo Habkost wrote: > > On Mon, Nov 23, 2015 at 11:22:37AM -0200, Eduardo Habkost wrote: > > [...] > > > In the case of this code, it looks like it's already broken > > > because the resulting mcg_cap depends on host kernel capabilities > > > (the ones reported by kvm_get_mce_cap_supported()), and the data > > > initialized by target-i386/cpu.c:mce_init() is silently > > > overwritten by kvm_arch_init_vcpu(). So we would need to fix that > > > before implementing a proper compatibility mechanism for > > > mcg_cap. > > > > Fortunately, when running Linux v2.6.37 and later, > > kvm_arch_init_vcpu() won't actually change mcg_cap (see details > > below). > > > > But the code is broken if running on Linux between v2.6.32 and > > v2.6.36: it will clear MCG_SER_P silently (and silently enable > > MCG_SER_P when migrating to a newer host). > > > > But I don't know what we should do on those cases. If we abort > > initialization when the host doesn't support MCG_SER_P, all CPU > > models with MCE and MCA enabled will become unrunnable on Linux > > between v2.6.32 and v2.6.36. Should we do that, and simply ask > > people to upgrade their kernels (or explicitly disable MCE) if > > they want to run latest QEMU? > > > > For reference, these are the capabilities returned by Linux: > > * KVM_MAX_MCE_BANKS is 32 since > > 890ca9aefa78f7831f8f633cab9e4803636dffe4 (v2.6.32-rc1~693^2~199) > > * KVM_MCE_CAP_SUPPORTED is (MCG_CTL_P | MCG_SER_P) since > > 5854dbca9b235f8cdd414a0961018763d2d5bf77 (v2.6.37-rc1~142^2~3) > > The commit message of that one says that there is MCG_SER_P support in > the kernel. > > The previous commit talks about MCE injection with KVM_X86_SET_MCE > ioctl but frankly, I don't see that. From looking at the current code, > KVM_X86_SET_MCE does kvm_vcpu_ioctl_x86_setup_mce() which simply sets > MCG_CAP. And it gets those from KVM_X86_GET_MCE_CAP_SUPPORTED which is KVM_X86_SET_MCE does not call kvm_vcpu_ioctl_x86_setup_mce(). It calls kvm_vcpu_ioctl_x86_set_mce(), which stores the IA32_MCi_{STATUS,ADDR,MISC} register contents at vcpu->arch.mce_banks. > > #define KVM_MCE_CAP_SUPPORTED (MCG_CTL_P | MCG_SER_P) > > So it basically sets those two supported bits. But how is > > supported == actually present > > ?!?! > > That soo doesn't make any sense. > I didn't check the QEMU MCE code to confirm that, but I assume it is implemented there. In that case, MCG_SER_P in KVM_MCE_CAP_SUPPORTED just indicates it can be implemented by userspace, as long as it makes the appropriate KVM_X86_SET_MCE (or maybe KVM_SET_MSRS?) calls. -- Eduardo