From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jeroen Groenewegen van der Weyden
Subject: Re: [PATCH 20 of 20] n2 MSR handling and capability exposure
Date: Tue, 26 Jul 2011 15:33:57 +0200
Message-ID: <4E2EC245.2030905@grosc.com>
References: <1A42CE6F5F474C41B63392A5F80372B212DAB9DD82@shsmsx501.ccr.corp.intel.com>
	<4E258DC4.4050106@grosc.com>
	<1A42CE6F5F474C41B63392A5F80372B212DAC025EB@shsmsx501.ccr.corp.intel.com>
	<4E26E23D.4030000@grosc.com>
	<20110725140843.GC8970@whitby.uk.xensource.com>
	<20110725161657.GF8970@whitby.uk.xensource.com>
	<4E2E6992.9030407@grosc.com>
	<20110726100018.GH8970@whitby.uk.xensource.com>
	<20110726101140.GI8970@whitby.uk.xensource.com>
	<4E2E9B05.7030206@grosc.com>
	<20110726114233.GJ8970@whitby.uk.xensource.com>
In-Reply-To: <20110726114233.GJ8970@whitby.uk.xensource.com>
To: Tim Deegan
Cc: "Christoph.Egger@amd.com", "xen-devel@lists.xensource.com", "Dong, Eddie"
List-Id: xen-devel@lists.xenproject.org

Tim,

This improved a lot. The domU does not become unresponsive anymore.
All four L2 guests now start, but 2 out of 4 hang and become
unresponsive after 30 to 60 seconds.

L1 domU (sles11sp1)
 -> seems to be OK

L2-1 proprietary OS, seems to be OK
L2-2 proprietary OS, seems to be OK
L2-3 sles10sp3, hanging/unresponsive
L2-4 sles10sp3, hanging/unresponsive

No debug/error messages in dmesg.

Kind regards,
Jeroen

On 26-7-2011 13:42, Tim Deegan wrote:
> At 12:46 +0200 on 26 Jul (1311684389), Jeroen Groenewegen van der Weyden wrote:
>> Here is my input
> Thanks.  Looks very similar to the bug I thought I fixed with my patch.
>
> VCPUs 1 and 3 are always at 0xa01a1c9d: <0f> 01 c2 eb 03 0f 01 c3
> which is probably this fragment of code from KVM's vmx_vcpu_run():
>
>         /* Enter guest mode */
>         "jne .Llaunched \n\t"
>         __ex(ASM_VMX_VMLAUNCH) "\n\t"
>         "jmp .Lkvm_vmx_return \n\t"
>         ".Llaunched: " __ex(ASM_VMX_VMRESUME) "\n\t"
>         ".Lkvm_vmx_return: "
>
> So, just like the case I saw, they're trying to VMLAUNCH a VMCS and
> failing.  That should only fail if the VMCS is already launched.
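
The byte dump quoted above can be checked by hand. Below is a minimal
Python sketch, assuming the byte wrapped in <> is the one at the guest's
instruction pointer (as in Linux-style code dumps) and using the Intel
SDM encodings 0F 01 C2 (VMLAUNCH), 0F 01 C3 (VMRESUME) and EB rel8
(short JMP). The name decode_dump is hypothetical, and the decoder only
knows these three encodings; it is not a general disassembler.

```python
# Encodings from the Intel SDM; only the ones needed for this dump.
VMX_OPCODES = {
    bytes.fromhex("0f01c2"): "vmlaunch",
    bytes.fromhex("0f01c3"): "vmresume",
}


def decode_dump(dump):
    """Decode a dump such as '<0f> 01 c2 eb 03 0f 01 c3'.

    Returns (instructions, rip_index): a list of (offset, text) pairs and
    the index of the instruction that starts at the marked byte.
    """
    tokens = dump.split()
    # Assumption: the token wrapped in <> is the byte at the instruction pointer.
    rip_offset = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    code = bytes(int(t.strip("<>"), 16) for t in tokens)

    instructions = []
    rip_index = None
    pos = 0
    while pos < len(code):
        if pos == rip_offset:
            rip_index = len(instructions)
        if code[pos:pos + 3] in VMX_OPCODES:
            instructions.append((pos, VMX_OPCODES[code[pos:pos + 3]]))
            pos += 3
        elif code[pos] == 0xEB and pos + 1 < len(code):
            # Short jump: signed 8-bit displacement from the next instruction.
            rel = int.from_bytes(code[pos + 1:pos + 2], "little", signed=True)
            instructions.append((pos, f"jmp {pos + 2 + rel:#x}"))
            pos += 2
        else:
            # Anything else is left undecoded.
            instructions.append((pos, f".byte {code[pos]:#04x}"))
            pos += 1
    return instructions, rip_index
```

For the dump above this yields vmlaunch at offset 0 (the marked byte), a
short jump to offset 8, and vmresume at offset 5. Offset 8 is the byte
right after vmresume, which matches the shape of the KVM fragment: the
jump skips VMRESUME and lands on the return label.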

> I think the reason they're _stuck_ is that error paths for VMLAUNCH and
> VMRESUME emulation are wrong; I can fix them up a little but I suspect
> that won't solve the problem; just change it from a hang to some other
> failure mode.
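
As a reference point for what the emulated error paths are expected to
report, here is a toy Python model of the architectural behaviour as I
read it in the Intel SDM: VMLAUNCH requires the current VMCS launch
state to be "clear", VMRESUME requires it to be "launched", and a
failure with a valid current VMCS (VMfailValid) sets ZF and stores an
error number in the VM-instruction error field (4 for VMLAUNCH with a
non-clear VMCS, 5 for VMRESUME with a non-launched VMCS). All class and
function names are hypothetical. This is not Xen's nested VMX code, and
it models only the launch-state check and the CF/ZF flags, not the other
entry checks or the VM entry itself.

```python
from dataclasses import dataclass

# VM-instruction error numbers from the Intel SDM.
VMXERR_VMLAUNCH_NONCLEAR_VMCS = 4
VMXERR_VMRESUME_NONLAUNCHED_VMCS = 5


@dataclass
class ToyVmcs:
    launched: bool = False         # launch state: clear (False) or launched (True)
    vm_instruction_error: int = 0  # field the L1 hypervisor can read back


@dataclass
class Flags:
    cf: bool = False
    zf: bool = False


def vmfail_valid(vmcs, flags, error):
    """VMfailValid: ZF set, CF clear, error number stored in the VMCS."""
    flags.cf, flags.zf = False, True
    vmcs.vm_instruction_error = error
    return False


def emulate_vmentry(vmcs, flags, launch):
    """Model the launch-state check of VMLAUNCH (launch=True) or VMRESUME."""
    if launch and vmcs.launched:
        return vmfail_valid(vmcs, flags, VMXERR_VMLAUNCH_NONCLEAR_VMCS)
    if not launch and not vmcs.launched:
        return vmfail_valid(vmcs, flags, VMXERR_VMRESUME_NONLAUNCHED_VMCS)
    # VMsucceed: flags cleared; after a successful VMLAUNCH the VMCS is launched.
    flags.cf = flags.zf = False
    vmcs.launched = True
    return True
```

In this model a second VMLAUNCH on the same VMCS keeps failing with
error 4 until the VMCS is cleared (VMCLEAR is not modelled), while
VMRESUME on it succeeds.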

> Can you try the attached patch instead of the previous one?
>
> Tim.
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
