From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andre Przywara
Subject: Re: OpenBSD 5.0 kernel panic in AMD K10 cpu power state
Date: Thu, 10 Nov 2011 23:52:41 +0100
Message-ID: <4EBC55B9.1080407@amd.com>
References: <4EB8F576.9040203@gmx.at> <4EBA5848.7070404@redhat.com> <4EBA82BA.1090602@redhat.com> <4EBAD609.4050307@gmx.at> <4EBB8F5C.9050305@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Walter Haidinger, KVM list
To: Avi Kivity
Return-path:
Received: from tx2ehsobe004.messaging.microsoft.com ([65.55.88.14]:38513
    "EHLO TX2EHSOBE007.bigfish.com" rhost-flags-OK-OK-OK-OK)
    by vger.kernel.org with ESMTP id S1754824Ab1KJWwg (ORCPT );
    Thu, 10 Nov 2011 17:52:36 -0500
In-Reply-To: <4EBB8F5C.9050305@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 11/10/2011 09:46 AM, Avi Kivity wrote:
> (re-adding cc)
>
>
> On 11/09/2011 09:35 PM, Walter Haidinger wrote:
>> On 09.11.2011 14:40, Avi Kivity wrote:
>>> Actually, it looks like an OpenBSD bug. According to the AMD
>>> documentation:
>>
>> Well, the OpenBSD developers are very confident that is
>> a bug in the KVM cpu emulation and _not_ in OpenBSD.
>>
>> Basically they say that [despite -cpu host], the emulated
>> cpu does not look like a real, but _non-existant_ cpu.
>> Virtualization should look like _existing_ hardware.
>
> That is true. But OpenBSD is not following the vendor's recommendation
> for how software should access the hardware.
>
>> Since the list archive at
>> http://marc.info/?l=openbsd-misc&m=132077741910464&w=2
>> lags a bit, I'm attaching some parts of the thread below:
>>
>> However, please remember it's OpenBSD, so the tone is, let's just
>> say, rough.
>
> Less than expected, actually.
>
>>> The panic you hit is for an msr read, not a write. I'm aware those
>>> registers are read-only. The CPUID check isn't done, it matches on
>>> all family 10 and/or higher AMD processors.
>>> They're pretending to be an AMD K10 processor. On all real hardware
>>> I've tested this works fine. If you wish to be pedantic, patches
>>> are welcome.

Avi, thanks for taking care of that.
The manual is clear here: no CPUID bit, no MSRs. Besides that, the
emulated ACPI tables probably don't provide any info here either, right?

The fact that it runs "on all family 10 and/or higher AMD processors"
is just an empirical observation, not a law. You would be astonished
what can be fused off...
We already had a similar discussion about unconditional AMD Northbridge
PCI accesses in the Linux kernel, triggered by detecting certain AMD CPU
family/model/steppings (...but every AMD CPU has a northbridge...)

We (as virtualization guys) should not step back so easily here,
especially if the spec is so clear. That spec argument should actually
appeal to the OpenBSD guys, too. I got the impression that their design
is, well, actually well designed.

>
> So they're actually open to adding the cpuid check.
>
>> They sent me a patch as a workaround, which:
>>
>>> The previous patch avoids touching the msr at all if ACPI indicates
>>> speed scaling is unavailable, this should prevent your panic.
>>
>> with -cpu host, OpenBSD dmesg showed the 1100T:
>>>> cpu0: AMD Phenom(tm) II X6 1100T Processor ("AuthenticAMD" 686-class, 512KB L2 cache) 3.31 GHz
>>>> cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3,CX16,POPCNT
>>>> ...
>>>> bios0: vendor Bochs version "Bochs" date 01/01/2007
>>>> bios0: Bochs Bochs
>>> They shouldn't be pretending to be AMD, especially if that emulation
>>> is very incompatible.
>>
>> but the bug is in the Linux KVM:
>>
>>>> They're pretending to be an AMD K10 processor.
>>>>
>>> Exactly. What they are doing is wrong. They are pretending to be a
>>> AMD K10 processor _badly_, and then they think they can say "oh, but
>>> you need to check all these other registers too".
>>> A machine with that setup has never physically existed.
>>
>> Is this all because I used -cpu host?
>>
>
> -cpu host is not to blame, you could get the same result from other
> combinations of cpu model and family.
>
> I'll look at adding support for this MSR; should be simple. But in
> general processor features need to be qualified by cpuid, not by model.

I guess emulating part of the P-states will open up a can of worms.
Besides the generic MSRs (0xC001006[1-3]) there are family-specific
ones, which are selected by the CPUID family. So you would end up
emulating them, too. I have a hard time thinking of a strategy for
emulating this in general.
So unless there is a real framework for dealing with P-state "hints"
from the guest OS, I'd be reluctant to add quick-and-dirty emulation.

Thanks,
Andre.

-- 
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
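
Illustrative note on "no CPUID bit, no MSRs": the rule argued for in this message can be sketched roughly as below. This is only a minimal sketch, not the OpenBSD or KVM code. It assumes that hardware P-state support is advertised in CPUID Fn8000_0007 EDX bit 7 (HwPstate), which is how the AMD CPUID documentation is generally read; the function and macro names are hypothetical and chosen for illustration. Only the generic MSR range 0xC0010061 to 0xC0010063 named in the message is covered, not the family-specific registers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: CPUID Fn8000_0007 EDX bit 7 (HwPstate) advertises
 * hardware P-state control. Macro name is hypothetical. */
#define CPUID_8000_0007_EDX_HWPSTATE (1u << 7)

/* Generic P-state MSRs named in the message: 0xC0010061..0xC0010063. */
#define MSR_PSTATE_FIRST 0xC0010061u
#define MSR_PSTATE_LAST  0xC0010063u

/* True if msr is one of the generic P-state MSRs. */
bool is_generic_pstate_msr(uint32_t msr)
{
    return msr >= MSR_PSTATE_FIRST && msr <= MSR_PSTATE_LAST;
}

/* Hypothetical guest-side gate: touch a generic P-state MSR only when
 * the CPUID feature bit is set, regardless of family/model. The caller
 * passes in the EDX value it obtained from CPUID leaf 0x80000007. */
bool may_access_pstate_msr(uint32_t cpuid_8000_0007_edx, uint32_t msr)
{
    if (!is_generic_pstate_msr(msr))
        return false;
    return (cpuid_8000_0007_edx & CPUID_8000_0007_EDX_HWPSTATE) != 0;
}
```

A guest would call may_access_pstate_msr() with the EDX value from CPUID leaf 0x80000007 before any rdmsr, and skip the P-state code entirely when it returns false. Keying the decision on the feature bit rather than on the family number is the point of the sketch.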