* [Qemu-devel] CPU type qemu64 breaks guest code
From: Alexander Graf @ 2011-03-14 14:13 UTC
To: QEMU-devel List; +Cc: Andre Przywara, Michael Matz

Hi guys,

While I was off on vacation a pretty nasty bug emerged. It's our old friend the non-existent -cpu qemu64 CPU type. To refresh your memories, this is the definition of the default 64-bit CPU type in Qemu:

    {
        .name = "qemu64",
        .level = 4,
        .vendor1 = CPUID_VENDOR_AMD_1,
        .vendor2 = CPUID_VENDOR_AMD_2,
        .vendor3 = CPUID_VENDOR_AMD_3,
        .family = 6,
        .model = 2,
        .stepping = 3,
        .features = PPRO_FEATURES |
            CPUID_MTRR | CPUID_CLFLUSH | CPUID_MCA |
            CPUID_PSE36,
        .ext_features = CPUID_EXT_SSE3 | CPUID_EXT_CX16 | CPUID_EXT_POPCNT,
        .ext2_features = (PPRO_FEATURES & EXT2_FEATURE_MASK) |
            CPUID_EXT2_LM | CPUID_EXT2_SYSCALL | CPUID_EXT2_NX,
        .ext3_features = CPUID_EXT3_LAHF_LM | CPUID_EXT3_SVM |
            CPUID_EXT3_ABM | CPUID_EXT3_SSE4A,
        .xlevel = 0x8000000A,
        .model_id = "QEMU Virtual CPU version " QEMU_VERSION,
    },

Before I left, we already had weird build breakages where gcc simply abort()ed when running inside of KVM. This breakage has been tracked down to libgmp, which has this code (http://gmplib.org:8000/gmp-5.0/file/1ebe39104437/mpn/x86_64/fat/fat.c):

    if (strcmp (vendor_string, "GenuineIntel") == 0)
      {
        ....
      }
    else if (strcmp (vendor_string, "AuthenticAMD") == 0)
      {
        switch (family)
          {
          case 5:
          case 6:
            abort ();
            break;
          case 15:
            /* CPUVEC_SETUP_athlon */
            break;
          }

The reasoning behind it is that for AMD, all 64-bit CPUs were family 15.

This breakage is obviously pretty bad for guests running qemu and shows once again that deriving from real hardware is a bad idea. I guess the best way to move forward is to change the CPU type to something that actually exists in the real world - even if we have to strip off some features.

Any ideas? Comments?

Alex
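For readers following the family/model/stepping discussion, here is a minimal sketch in C of how a guest program typically derives those three numbers from the EAX value of CPUID leaf 1. It is not the libgmp code; the struct and function names are hypothetical, and the sketch assumes the AMD convention of folding in the extended family and extended model fields only when the base family is 15.

```c
#include <stdint.h>

/* Decoded CPUID leaf 1 signature. Illustrative type, not from QEMU or gmp. */
struct cpu_signature {
    unsigned family;
    unsigned model;
    unsigned stepping;
};

/*
 * Decode family/model/stepping from the EAX value of CPUID leaf 1.
 * Assumption: the extended family (bits 27:20) and extended model
 * (bits 19:16) are applied only when the base family is 15.
 */
static struct cpu_signature decode_cpu_signature(uint32_t eax)
{
    unsigned base_family = (eax >> 8) & 0xf;
    unsigned base_model  = (eax >> 4) & 0xf;
    unsigned ext_family  = (eax >> 20) & 0xff;
    unsigned ext_model   = (eax >> 16) & 0xf;
    struct cpu_signature sig;

    sig.stepping = eax & 0xf;
    sig.family = base_family;
    sig.model = base_model;
    if (base_family == 0xf) {
        sig.family += ext_family;
        sig.model |= ext_model << 4;
    }
    return sig;
}
```

Under these assumptions, an EAX value of 0x623 decodes to family 6, model 2, stepping 3, which is the combination in the qemu64 definition above and the one that reaches the abort() branch when the vendor string is "AuthenticAMD".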
* [Qemu-devel] Re: CPU type qemu64 breaks guest code
From: Andre Przywara @ 2011-03-21 21:46 UTC
To: Alexander Graf; +Cc: Michael Matz, QEMU-devel List

On 03/14/2011 03:13 PM, Alexander Graf wrote:
> Hi guys,
>
> While I was off on vacation a pretty nasty bug emerged. It's our old
> friend the non-existent -cpu qemu64 CPU type. To refresh your memories,
> this is the definition of the default 64-bit CPU type in Qemu:
>
> {
>     .name = "qemu64",
>     .level = 4,
>     .vendor1 = CPUID_VENDOR_AMD_1,
>     .vendor2 = CPUID_VENDOR_AMD_2,
>     .vendor3 = CPUID_VENDOR_AMD_3,
>     .family = 6,
>     .model = 2,
>     .stepping = 3,
> ...

Well, yes, this is strange, and a different CPU model mimicking some really existing CPU would be better. But in my experience this opens up a can of worms, since a different family will trigger a lot of other nasty code that was silent before. Although I think that eventually we will not get around doing so, we should do a lot of testing before triggering tons of regressions.

What about the kvm64 CPU model? Back then I did quite some testing to find a model which runs pretty well, so I came up with 15/6/1, which seemed to be relatively sane. We could try copying this F/M/S over to qemu64; I suppose with emulation the issues are easier to fix than in KVM.

Another idea, which I think I already posted some time ago, was to make kvm64 the new default if KVM is enabled. This wouldn't solve the above issue for TCG of course, but would make further development easier, since we would have a better separation between KVM and TCG and could tweak each independently.

> Before I left, we already had weird build breakages where gcc simply
> abort()ed when running inside of KVM. This breakage has been tracked
> down to libgmp, which has this code
> (http://gmplib.org:8000/gmp-5.0/file/1ebe39104437/mpn/x86_64/fat/fat.c):
> ....
>
> if (strcmp (vendor_string, "GenuineIntel") == 0)
>   {
>     ....
>   }
> else if (strcmp (vendor_string, "AuthenticAMD") == 0)
>   {
>     switch (family)
>       {
>       case 5:
>       case 6:
>         abort ();
>         break;
>       case 15:
>         /* CPUVEC_SETUP_athlon */
>         break;
>       }
>
> The reasoning behind it is that for AMD, all 64-bit CPUs were family
> 15.

I guess that should be fixed there, as it is obviously nonsense. I haven't checked what they actually need that family 6 does not provide; if it is long mode, then this bit should be checked.

> This breakage is obviously pretty bad for guests running qemu and
> shows once again that deriving from real hardware is a bad idea. I guess
> the best way to move forward is to change the CPU type to something that
> actually exists in the real world - even if we have to strip off some
> features.

I found that there is no valid combination for both Intel and AMD. We had to go to family 15, and it seems that there is no F/M/S combination that is both a valid K8 and a Pentium 4 processor. The 15/6/1 mentioned before was the closest match I could find.

Summarizing I would suggest:
1) Make kvm64 the new default model if KVM is used. Maybe we could tie
   this to the qemu-0.15 machine type.
2) Test as many OSes as possible to see whether they show strange
   behavior. In my experience we could catch most of the stuff with just
   booting the system; with Linux, "-kernel vmlinuz" suffices (without a
   rootfs).
3) If we are happy with that, tweak the qemu64 model to those values,
   too.
4) Write some note somewhere to let users know what they could do to
   fix problems that arise due to the new model.

I can easily send patches and will try to contribute to testing, if that plan sounds OK.

--
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
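The suggestion above to test the long mode bit rather than the family can be sketched as follows. This is a hedged illustration, not code from libgmp or QEMU: the function and macro names are hypothetical, and the caller is assumed to have already obtained EAX from CPUID leaf 0x80000000 (the highest extended leaf) and EDX from CPUID leaf 0x80000001. Long mode is reported in bit 29 of that EDX value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 29 of EDX from CPUID leaf 0x80000001 is the long mode (LM) flag. */
#define LONG_MODE_BIT (1u << 29)

/*
 * max_ext_leaf: EAX returned by CPUID leaf 0x80000000.
 * ext_edx:      EDX returned by CPUID leaf 0x80000001.
 * Returns true only if the extended feature leaf exists and LM is set.
 */
static bool has_long_mode(uint32_t max_ext_leaf, uint32_t ext_edx)
{
    if (max_ext_leaf < 0x80000001u)
        return false; /* extended feature leaf is not implemented */
    return (ext_edx & LONG_MODE_BIT) != 0;
}
```

With the qemu64 definition quoted above (xlevel 0x8000000A and CPUID_EXT2_LM set), a check of this shape would report long mode regardless of the family value.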
* [Qemu-devel] Re: CPU type qemu64 breaks guest code
From: Alexander Graf @ 2011-03-22 7:18 UTC
To: Andre Przywara; +Cc: Michael Matz, QEMU-devel List

On 21.03.2011, at 22:46, Andre Przywara wrote:

> On 03/14/2011 03:13 PM, Alexander Graf wrote:
>> Hi guys,
>>
>> While I was off on vacation a pretty nasty bug emerged. It's our old
>> friend the non-existent -cpu qemu64 CPU type. To refresh your memories,
>> this is the definition of the default 64-bit CPU type in Qemu:
>>
>> {
>>     .name = "qemu64",
>>     .level = 4,
>>     .vendor1 = CPUID_VENDOR_AMD_1,
>>     .vendor2 = CPUID_VENDOR_AMD_2,
>>     .vendor3 = CPUID_VENDOR_AMD_3,
>>     .family = 6,
>>     .model = 2,
>>     .stepping = 3,
>> ...
>
> Well, yes, this is strange, and a different CPU model mimicking some really existing CPU would be better. But in my experience this opens up a can of worms, since a different family will trigger a lot of other nasty code that was silent before. Although I think that eventually we will not get around doing so, we should do a lot of testing before triggering tons of regressions.
>
> What about the kvm64 CPU model? Back then I did quite some testing to find a model which runs pretty well, so I came up with 15/6/1, which seemed to be relatively sane. We could try copying this F/M/S over to qemu64; I suppose with emulation the issues are easier to fix than in KVM.
>
> Another idea, which I think I already posted some time ago, was to make kvm64 the new default if KVM is enabled. This wouldn't solve the above issue for TCG of course, but would make further development easier, since we would have a better separation between KVM and TCG and could tweak each independently.
>
>> Before I left, we already had weird build breakages where gcc simply
>> abort()ed when running inside of KVM. This breakage has been tracked
>> down to libgmp, which has this code
>> (http://gmplib.org:8000/gmp-5.0/file/1ebe39104437/mpn/x86_64/fat/fat.c):
>> ....
>>
>> if (strcmp (vendor_string, "GenuineIntel") == 0)
>>   {
>>     ....
>>   }
>> else if (strcmp (vendor_string, "AuthenticAMD") == 0)
>>   {
>>     switch (family)
>>       {
>>       case 5:
>>       case 6:
>>         abort ();
>>         break;
>>       case 15:
>>         /* CPUVEC_SETUP_athlon */
>>         break;
>>       }
>>
>> The reasoning behind it is that for AMD, all 64-bit CPUs were family
>> 15.
>
> I guess that should be fixed there, as it is obviously nonsense. I haven't checked what they actually need that family 6 does not provide; if it is long mode, then this bit should be checked.

Michael (IIRC) already tried that one, but the libgmp guys refuse to change the code here, as it works on real hardware...

>> This breakage is obviously pretty bad for guests running qemu and
>> shows once again that deriving from real hardware is a bad idea. I guess
>> the best way to move forward is to change the CPU type to something that
>> actually exists in the real world - even if we have to strip off some
>> features.
>
> I found that there is no valid combination for both Intel and AMD. We had to go to family 15, and it seems that there is no F/M/S combination that is both a valid K8 and a Pentium 4 processor. The 15/6/1 mentioned before was the closest match I could find.
>
> Summarizing I would suggest:
> 1) Make kvm64 the new default model if KVM is used. Maybe we could tie
>    this to the qemu-0.15 machine type.
> 2) Test as many OSes as possible to see whether they show strange
>    behavior. In my experience we could catch most of the stuff with just
>    booting the system; with Linux, "-kernel vmlinuz" suffices (without a
>    rootfs).
> 3) If we are happy with that, tweak the qemu64 model to those values,
>    too.
> 4) Write some note somewhere to let users know what they could do to
>    fix problems that arise due to the new model.
>
> I can easily send patches and will try to contribute to testing, if that plan sounds OK.

I love that plan! Please make sure to provide the current qemu64 type when -M pc-0.14 is selected, so people can still choose the old type for migration.

Alex
* [Qemu-devel] Re: CPU type qemu64 breaks guest code
From: Michael Matz @ 2011-03-22 14:13 UTC
To: Andre Przywara; +Cc: Alexander Graf, QEMU-devel List

Hi,

On Mon, 21 Mar 2011, Andre Przywara wrote:

> > .name = "qemu64",
> > .level = 4,
> > .vendor1 = CPUID_VENDOR_AMD_1,
> > .vendor2 = CPUID_VENDOR_AMD_2,
> > .vendor3 = CPUID_VENDOR_AMD_3,
> > .family = 6,
> > .model = 2,
> > .stepping = 3,
> > ...
>
> Well, yes, this is strange, and a different CPU model mimicking some really
> existing CPU would be better. But in my experience this opens up a can of
> worms, since a different family will trigger a lot of other nasty code that
> was silent before.

Mimicking a really existing CPU would trigger exactly those code paths that are triggered when that same code is running on real hardware. If such hypothetical problems were real they would occur for non-emulated hardware already. But they don't.

> > else if (strcmp (vendor_string, "AuthenticAMD") == 0)
> >   switch (family)
> >     {
> >     case 5:
> >     case 6:
> >       abort ();
> >
> > The reasoning behind it is that for AMD, all 64-bit CPUs were family
> > 15.
>
> I guess that should be fixed there, as it is obviously nonsense.

Not really. The above is for the x86_64 code paths, i.e. 64bit code. There never were, and there never will be, AMD processors capable of running 64bit code that have family 5 or 6. The abort can't ever trigger on hardware or correctly emulated hardware.

> I haven't check what they actually need that family 6 does not provide,
> if it is long mode, then this bit should be checked.

It's not about need. It's about tuning and expectations. gmp wants to tune itself according to the hardware it runs on, hence it switches on the vendor and family/model. And in order not to have to handle combinations that can't exist in the real world (which seems sane to me) the aborts were put in place. Now there's something to be said about being lenient in what you accept, but it's not wrong to be strict.

> I found that there is no valid combination for both Intel and AMD.

Of course not. Why should there be? The FMS combination obviously can't exist independent of the (claimed) vendor. Trying to go for one FMS that fits all is going to fail; how could it be different?

Ciao,
Michael.
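To make the link between the claimed vendor and the family/model/stepping concrete, the following sketch shows how guest code commonly assembles the 12-character vendor string that the libgmp excerpt compares against. It assumes the usual CPUID leaf 0 layout, with the string stored in EBX, EDX, ECX in that order and the least significant byte first. The function names are hypothetical, and the register values in the usage note are simply the ASCII encodings of the strings, not values copied from the QEMU headers.

```c
#include <stdint.h>

/* Copy the four bytes of a register into dst, least significant byte first. */
static void put_register_bytes(char *dst, uint32_t reg)
{
    for (int i = 0; i < 4; i++)
        dst[i] = (char)((reg >> (8 * i)) & 0xff);
}

/*
 * Build the NUL-terminated vendor string from the CPUID leaf 0 registers.
 * out must have room for 13 bytes.
 */
static void cpuid_vendor_string(uint32_t ebx, uint32_t edx, uint32_t ecx,
                                char out[13])
{
    put_register_bytes(out, ebx);
    put_register_bytes(out + 4, edx);
    put_register_bytes(out + 8, ecx);
    out[12] = '\0';
}
```

For example, the register values 0x68747541, 0x69746e65 and 0x444d4163 (for EBX, EDX, ECX) spell "AuthenticAMD". A guest that sees that string together with family 6 takes the AMD branch of the switch, which is why the family/model/stepping cannot be chosen independently of the vendor.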
* Re: [Qemu-devel] CPU type qemu64 breaks guest code
From: Avi Kivity @ 2011-03-22 12:34 UTC
To: Alexander Graf; +Cc: Andre Przywara, Michael Matz, QEMU-devel List

On 03/14/2011 04:13 PM, Alexander Graf wrote:
> Hi guys,
>
> While I was off on vacation a pretty nasty bug emerged. It's our old friend the non-existent -cpu qemu64 CPU type. To refresh your memories, this is the definition of the default 64-bit CPU type in Qemu:
>
> {
>     .name = "qemu64",
>     .level = 4,
>     .vendor1 = CPUID_VENDOR_AMD_1,
>     .vendor2 = CPUID_VENDOR_AMD_2,
>     .vendor3 = CPUID_VENDOR_AMD_3,
>     .family = 6,
>     .model = 2,
>     .stepping = 3,
>     .features = PPRO_FEATURES |
>         CPUID_MTRR | CPUID_CLFLUSH | CPUID_MCA |
>         CPUID_PSE36,
>     .ext_features = CPUID_EXT_SSE3 | CPUID_EXT_CX16 | CPUID_EXT_POPCNT,
>     .ext2_features = (PPRO_FEATURES & EXT2_FEATURE_MASK) |
>         CPUID_EXT2_LM | CPUID_EXT2_SYSCALL | CPUID_EXT2_NX,
>     .ext3_features = CPUID_EXT3_LAHF_LM | CPUID_EXT3_SVM |
>         CPUID_EXT3_ABM | CPUID_EXT3_SSE4A,
>     .xlevel = 0x8000000A,
>     .model_id = "QEMU Virtual CPU version " QEMU_VERSION,
> },
>
>
> Before I left, we already had weird build breakages where gcc simply abort()ed when running inside of KVM. This breakage has been tracked down to libgmp, which has this code (http://gmplib.org:8000/gmp-5.0/file/1ebe39104437/mpn/x86_64/fat/fat.c):
>
> if (strcmp (vendor_string, "GenuineIntel") == 0)
>   {
>     ....
>   }
> else if (strcmp (vendor_string, "AuthenticAMD") == 0)
>   {
>     switch (family)
>       {
>       case 5:
>       case 6:
>         abort ();
>         break;
>       case 15:
>         /* CPUVEC_SETUP_athlon */
>         break;
>       }
>
> The reasoning behind it is that for AMD, all 64-bit CPUs were family 15.
>
> This breakage is obviously pretty bad for guests running qemu and shows once again that deriving from real hardware is a bad idea. I guess the best way to move forward is to change the CPU type to something that actually exists in the real world - even if we have to strip off some features.

Agree.

> Any ideas? Comments?
>

The libgmp code should be fixed. If they want to take some specific action for specific processor families, that's fine, but abort()?

--
error compiling committee.c: too many arguments to function
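One possible shape of the lenient alternative to abort() raised in this thread is sketched below. It is purely illustrative and is not libgmp's code: the enum values and the function name are hypothetical, and the only idea shown is that an unrecognized vendor or family selects a generic code path instead of terminating the process.

```c
#include <string.h>

/* Hypothetical tuning choices; the names are not taken from libgmp. */
enum tuning {
    TUNE_GENERIC,
    TUNE_INTEL,
    TUNE_AMD_K8
};

/*
 * Pick a tuning from the CPUID vendor string and family.
 * Unknown or unexpected combinations fall back to TUNE_GENERIC
 * rather than calling abort().
 */
static enum tuning select_tuning(const char *vendor, unsigned family)
{
    if (strcmp(vendor, "GenuineIntel") == 0)
        return TUNE_INTEL;

    if (strcmp(vendor, "AuthenticAMD") == 0) {
        switch (family) {
        case 15:
            return TUNE_AMD_K8;
        default:
            /* Family 5, 6 or anything else unexpected: stay generic. */
            return TUNE_GENERIC;
        }
    }

    return TUNE_GENERIC;
}
```

With this shape, the qemu64 combination ("AuthenticAMD", family 6) would select the generic path. The trade-off is the one debated above: a strict check surfaces impossible hardware descriptions early, while a lenient one keeps guests running on CPU models that do not correspond to real silicon.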