From: Paolo Bonzini
Subject: Re: nVMX regression v3.13+, bisected
Date: Thu, 27 Feb 2014 11:51:08 +0100
Message-ID: <530F189C.5000200@redhat.com>
References: <530E43EC.7000600@canonical.com> <530E4DB9.5050001@redhat.com> <530E4E25.4050508@canonical.com>
In-Reply-To: <530E4E25.4050508@canonical.com>
To: Stefan Bader, kvm@vger.kernel.org
Cc: Anthoine Bourgeois

On 26/02/2014 21:27, Stefan Bader wrote:
> On 26.02.2014 21:25, Paolo Bonzini wrote:
>> On 26/02/2014 20:43, Stefan Bader wrote:
>>> Hi,
>>>
>>> I was looking at a bug report[1] about a regression on nested VMX that started
>>> with kernel v3.13 (same issue still existed with v3.14-rc4). The problem shows
>>> up when running a v3.13 kernel in L0 and then trying to launch an L2 (L1 was
>>> either a v3.2 kernel or v3.13, so seemed to have no immediate influence). L2 is
>>> trying to boot an ISO image and hangs before the isolinux boot loader displays
>>> anything. A preinstalled hd image fails to boot, too.
>>>
>>> I bisected this and ended up on the following commit which, when reverted, made
>>> the launch work again:
>>>
>>> Author: Anthoine Bourgeois
>>> Date:   Wed Nov 13 11:45:37 2013 +0100
>>>
>>>     kvm, vmx: Fix lazy FPU on nested guest
>>>
>>>     If a nested guest does a NM fault but its CR0 doesn't contain the TS
>>>     flag (because it was already cleared by the guest with L1 aid) then we
>>>     have to activate FPU ourselves in L0 and then continue to L2.
>>>     If TS flag is set then we fallback on the previous behavior, forward the
>>>     fault to L1 if it asked for.
>>>
>>>     Signed-off-by: Anthoine Bourgeois
>>>     Signed-off-by: Paolo Bonzini
>>>
>>> The condition to exit to L0 seems to be according to what the description says.
>>> Could it be that the handling in L0 is doing something wrong?
>>
>> Thanks, I'll look at it tomorrow or Friday.
>>
>> Paolo
>>
> Great thanks. And maybe it helps if I actually add the link to the bug report as
> I had intended... :-P

I don't have my usual test machine available, but here is a possible guess.
nested_read_cr0 is the CR0 as read by L2, but here we want to look at the CR0
value reflecting L1's setup.

This would suggest the following untested patch:

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index a06f101ef64b..0d90601a2681 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -6688,7 +6688,7 @@ static bool nested_vmx_exit_handled(struct kvm_vcpu *vcpu)
 	else if (is_page_fault(intr_info))
 		return enable_ept;
 	else if (is_no_device(intr_info) &&
-		 !(nested_read_cr0(vmcs12) & X86_CR0_TS))
+		 !(vmcs12->guest_cr0 & X86_CR0_TS))
 		return 0;

 	return vmcs12->exception_bitmap &
 				(1u << (intr_info & INTR_INFO_VECTOR_MASK));

Paolo
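
To make the guess above more concrete, here is a small userspace sketch (not kernel code, and as untested on real hardware as the patch). It assumes that the CR0 value L2 reads is built the usual VMX way: bits that L1 owns through its CR0 guest/host mask come from the read shadow, and all other bits come from guest_cr0. The struct vmcs12_sketch, read_cr0_as_l2() and the two nm_goes_to_l1_*() helpers are hypothetical names made up for this illustration; only the fields they model and X86_CR0_TS come from the patch context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define X86_CR0_TS (1UL << 3)   /* CR0.TS is bit 3 */
#define NM_VECTOR  7            /* #NM, device not available */

/* Hypothetical, cut-down stand-in for the fields of vmcs12 used here. */
struct vmcs12_sketch {
	unsigned long guest_cr0;           /* CR0 that L1 set up for running L2 */
	unsigned long cr0_guest_host_mask; /* CR0 bits owned by L1 */
	unsigned long cr0_read_shadow;     /* what L2 reads for the owned bits */
	uint32_t exception_bitmap;         /* exceptions L1 wants to intercept */
};

/* Assumed model of "CR0 as read by L2": owned bits come from the shadow. */
static unsigned long read_cr0_as_l2(const struct vmcs12_sketch *v)
{
	return (v->guest_cr0 & ~v->cr0_guest_host_mask) |
	       (v->cr0_read_shadow & v->cr0_guest_host_mask);
}

/* Condition before the patch: decide on the CR0 value L2 sees. */
static bool nm_goes_to_l1_before(const struct vmcs12_sketch *v)
{
	if (!(read_cr0_as_l2(v) & X86_CR0_TS))
		return false;           /* L0 handles the #NM itself */
	return (v->exception_bitmap & (1u << NM_VECTOR)) != 0;
}

/* Condition after the patch: decide on the CR0 value L1 set up. */
static bool nm_goes_to_l1_after(const struct vmcs12_sketch *v)
{
	if (!(v->guest_cr0 & X86_CR0_TS))
		return false;           /* L0 handles the #NM itself */
	return (v->exception_bitmap & (1u << NM_VECTOR)) != 0;
}

/* L1 does lazy FPU for L2: TS is set behind L2's back and #NM is trapped. */
void lazy_fpu_example(void)
{
	struct vmcs12_sketch v = {
		.guest_cr0           = X86_CR0_TS,
		.cr0_guest_host_mask = X86_CR0_TS,
		.cr0_read_shadow     = 0,
		.exception_bitmap    = 1u << NM_VECTOR,
	};

	assert(!nm_goes_to_l1_before(&v)); /* #NM never reaches L1 */
	assert(nm_goes_to_l1_after(&v));   /* #NM is reflected to L1 */
}
```

In this scenario L2 believes TS is clear while L1 has set it and asked for #NM. The old check keeps the fault in L0, so L1 never gets the chance to restore L2's FPU state; the patched check reflects it to L1. Whether this is what actually happens in the reported hang is still only a guess.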