From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Mian M. Hamayun"
Subject: Re: Can we force a KVM VCPU in Guest Mode to Exit to User Mode From User Mode ?
Date: Thu, 26 Jul 2012 15:39:36 +0200
Message-ID: <50114898.5080402@imag.fr>
References: <5011102E.5020302@imag.fr> <501116A1.90607@redhat.com> <50111D51.7050809@imag.fr> <50111F59.9080103@redhat.com>
Reply-To: mian-muhammad.hamayun@imag.fr
Mime-Version: 1.0
Content-Type: multipart/signed; protocol="application/pkcs7-signature"; micalg=sha1; boundary="------------ms040807010503000400040207"
Cc: kvm@vger.kernel.org
To: Avi Kivity
Return-path:
Received: from mx1.imag.fr ([129.88.30.5]:56799 "EHLO shiva.imag.fr" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751049Ab2GZNjo (ORCPT ); Thu, 26 Jul 2012 09:39:44 -0400
In-Reply-To: <50111F59.9080103@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

This is a cryptographically signed message in MIME format.

--------------ms040807010503000400040207
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: quoted-printable

On 07/26/2012 12:43 PM, Avi Kivity wrote:
> On 07/26/2012 01:34 PM, Mian M. Hamayun wrote:
>> On 07/26/2012 12:06 PM, Avi Kivity wrote:
>>> On 07/26/2012 12:38 PM, Mian M. Hamayun wrote:
>>>
>>>
>>>
>>>> This mechanism 'seems' to work fine when both vcpu threads are in User
>>>> Mode. But when booting an SMP Guest, the boot processor (BSP) initially
>>>> executes the bootstrap code while the non-boot processors (APs) are
>>>> waiting for initial INIT-SIPI-SIPI messages.
>>>>
>>>> What I fail to understand is if an AP is currently waiting for an INIT
>>>> signal, and we call the "run_on_cpu" function above for this cpu, it
>>>> blocks the whole system, as the AP is in Guest mode and cannot call the
>>>> "flush_queued_work" and the BSP is waiting for this to happen.
>>>>
>>>> How can we resolve this deadlock ? Is there a way to force the AP to
>>>> quit the Guest Mode, by using some specific mechanism from the User
>>>> mode ?
>>> When a vcpu is waiting for an INIT, it still responds to signals and
>>> will return to userspace if a signal is received. Did you observe
>>> something different?
>>>
>> Hi Avi,
>>
>> So it means that when we execute the following:
>>
>> err = pthread_kill(env->thread, SIG_IPI);
>>
>> then this VCPU thread should wake-up and force the VCPU to quit the
>> guest mode ?
> Yes.
>
>> But I am not getting this behavior as the VCPU thread remains blocked in
>> Guest Mode.
>> What could be wrong here ?
>>
> Perhaps signals are blocked?

No, that's not the case.

>
> Can you share your reproducer?

Actually, it's based on kvm-tool, and I have integrated some code from
qemu-kvm to add debug support to kvm-tool.
I don't have a smaller example that reproduces the same problem.

>
>> Many Thanks,
>> Hamayun
>>
>> P.S. Please see the following trace; As it might help understanding the
>> problem. (thread 2 is the BSP and thread 3 is for AP)
>>
>> run_on_cpu: Kicked CPU#1 ... Waiting on qemu_work_cond
>>
>> Program received signal SIGUSR1, User defined signal 1.
>> ^C
>> Program received signal SIGINT, Interrupt.
>> 0xb7fdd424 in __kernel_vsyscall ()
>> (gdb) info threads
>> Id Target Id Frame
>> 3 Thread 0xaea17b40 (LWP 2843) "arch.x" 0xb7fdd424 in
>> __kernel_vsyscall ()
>> 2 Thread 0xaf218b40 (LWP 2842) "arch.x" 0xb7fdd424 in
>> __kernel_vsyscall ()
>> * 1 Thread 0xb7359940 (LWP 2822) "arch.x" 0xb7fdd424 in
>> __kernel_vsyscall ()
>> (gdb) thread 2
>> [Switching to thread 2 (Thread 0xaf218b40 (LWP 2842))]
>> #0 0xb7fdd424 in __kernel_vsyscall ()
>> (gdb) bt
>> #0 0xb7fdd424 in __kernel_vsyscall ()
>> #1 0xb7f7096b in pthread_cond_wait@@GLIBC_2.3.2 () from
>> /lib/i386-linux-gnu/libpthread.so.0
>> #2 0xb7fc5715 in qemu_cond_wait (cond=0xb7fda640, lock=0xb7fda670) at
>> gdb_srv_arch.c:578
>> #3 0xb7fc5981 in run_on_cpu (env=0x8465818, func=0xb7fc5417
>> , data=0xaf20fe50) at gdb_srv_arch.c:782
>> #4 0xb7fc5b90 in kvm_update_guest_debug (env=0x8465818,
>> reinject_trap=0) at gdb_srv_arch.c:863
>> #5 0xb7fc5ef1 in kvm_remove_all_breakpoints (current_env=0x8465418) at
>> gdb_srv_arch.c:983
>> #6 0xb7fc2e0a in gdb_breakpoint_remove_all (env=0x8465418) at
>> gdb_srv.c:369
>> #7 0xb7fc3139 in gdb_handle_packet (s=0x80ac3e0, line_buf=0x80ac3fc
>> "?") at gdb_srv.c:497
>> #8 0xb7fc439c in gdb_read_byte (s=0x80ac3e0, ch=102) at gdb_srv.c:830
>> #9 0xb7fc4522 in gdb_loop (env=0x8465418) at gdb_srv.c:878
>> #10 0xb7fc4621 in gdb_srv_handle_debug (env=0x8465418) at gdb_srv.c:924
>> #11 0xb7fc5265 in kvm_arch_handle_debug (env=0x8465418) at
>> gdb_srv_arch.c:424
>> #12 0xb7fac3a8 in kvm_cpu__start (cpu=0x8465418) at kvm-cpu.c:588
>> #13 0xb7fc6521 in kvm_cpu_thread (arg=0x8465418) at libkvm-main.c:276
>> #14 0xb7f6cd4c in start_thread () from /lib/i386-linux-gnu/libpthread.so.0
>> #15 0xb7d7bace in clone () from /lib/i386-linux-gnu/libc.so.6
>> (gdb) thread 3
>> [Switching to thread 3 (Thread 0xaea17b40 (LWP 2843))]
>> #0 0xb7fdd424 in __kernel_vsyscall ()
>> (gdb) bt
>> #0 0xb7fdd424 in __kernel_vsyscall ()
>> #1 0xb7d73869 in ioctl () from /lib/i386-linux-gnu/libc.so.6
>> #2 0xb7fabe9c in kvm_cpu__run (vcpu=0x8465818) at kvm-cpu.c:429
>> #3 0xb7fac376 in kvm_cpu__start (cpu=0x8465818) at kvm-cpu.c:578
>> #4 0xb7fc6521 in kvm_cpu_thread (arg=0x8465818) at libkvm-main.c:276
>> #5 0xb7f6cd4c in start_thread () from /lib/i386-linux-gnu/libpthread.so.0
>> #6 0xb7d7bace in clone () from /lib/i386-linux-gnu/libc.so.6
>>
> Are you sure thread 3 did not receive the signal?

Thread 3 does actually receive the signal, but the order is not right.
When the BSP (Thread 2) starts, it locks "qemu_global_mutex" and
releases it only when it calls "run_on_cpu" and starts waiting on
"qemu_work_cond".

The AP (Thread 3) wakes up due to the SIG_IPI signal from Thread 2,
acquires the lock on "qemu_global_mutex" and only then enters guest
mode, so the kick has already been delivered before the VCPU is in the
guest. (This is the deadlock case.)

If we do not lock "qemu_global_mutex" at the beginning of each cpu
thread, and only lock it when we quit guest mode, the problem seems to
go away: we now get the SIG_IPI while Thread 3 is actually in guest
mode, and it exits to user mode.

But I am not sure if this is the right way to do it, as in qemu-kvm we
_always_ start each cpu thread by locking "qemu_global_mutex", i.e.

static void *qemu_kvm_cpu_thread_fn(void *arg)
{
    CPUArchState *env = arg;
    int r;

    qemu_mutex_lock(&qemu_global_mutex);
    qemu_thread_get_self(env->thread);
    env->thread_id = qemu_get_thread_id();
    cpu_single_env = env;

    r = kvm_init_vcpu(env);
    if (r < 0) {
        fprintf(stderr, "kvm_init_vcpu failed: %s\n", strerror(-r));
        exit(1);
    }

    qemu_kvm_init_cpu_signals(env);

    /* signal CPU creation */
    env->created = 1;
    qemu_cond_signal(&qemu_cpu_cond);

    while (1) {
        if (cpu_can_run(env)) {
            r = kvm_cpu_exec(env);
            if (r == EXCP_DEBUG) {
                cpu_handle_guest_debug(env);
            }
        }
        qemu_kvm_wait_io_event(env);
    }

    return NULL;
}

>
> Try stracing the run.
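The ordering problem described above looks like a lost-wakeup race: the
kick arrives before the thread reaches the blocking call. Below is a
minimal userspace sketch of one common way to avoid that, assuming the
kick signal is kept blocked in the thread and is unblocked only
atomically inside the blocking call. This appears to be what qemu-kvm
arranges for KVM_RUN via KVM_SET_SIGNAL_MASK in
qemu_kvm_init_cpu_signals(), but that reading is an assumption and has
not been checked against kvm-tool. In the sketch, sigsuspend() stands in
for the KVM_RUN ioctl, SIG_KICK stands in for SIG_IPI, and kick_demo()
and worker_fn() are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <string.h>

#define SIG_KICK SIGUSR1          /* stand-in for SIG_IPI (assumption) */

struct worker {
    pthread_t thread;
    int wait_errno;               /* errno seen after the blocking call */
};

/* Empty handler: the signal only needs to interrupt the blocking call. */
static void dummy_handler(int sig) { (void)sig; }

static void *worker_fn(void *arg)
{
    struct worker *w = arg;
    sigset_t run_mask;

    /* SIG_KICK is blocked here (inherited from the creating thread).
     * Build the mask used only while "in the guest": the current mask
     * with SIG_KICK removed. */
    pthread_sigmask(SIG_BLOCK, NULL, &run_mask);
    sigdelset(&run_mask, SIG_KICK);

    /* Stand-in for ioctl(vcpu_fd, KVM_RUN, 0): atomically install
     * run_mask and sleep. A kick that is already pending is delivered
     * immediately, so it cannot be lost. */
    sigsuspend(&run_mask);
    w->wait_errno = errno;        /* sigsuspend() fails with EINTR */
    return NULL;
}

/* Returns 0 if the worker was woken by the kick, -1 otherwise. */
int kick_demo(void)
{
    struct worker w = { .wait_errno = 0 };
    struct sigaction sa;
    sigset_t block, old;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = dummy_handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIG_KICK, &sa, NULL) != 0)
        return -1;

    /* Block SIG_KICK before creating the worker so it starts blocked. */
    sigemptyset(&block);
    sigaddset(&block, SIG_KICK);
    if (pthread_sigmask(SIG_BLOCK, &block, &old) != 0)
        return -1;

    if (pthread_create(&w.thread, NULL, worker_fn, &w) != 0) {
        pthread_sigmask(SIG_SETMASK, &old, NULL);
        return -1;
    }

    /* The kick may land before or after the worker reaches
     * sigsuspend(); it stays pending in the first case. */
    pthread_kill(w.thread, SIG_KICK);
    pthread_join(w.thread, NULL);

    pthread_sigmask(SIG_SETMASK, &old, NULL);
    return w.wait_errno == EINTR ? 0 : -1;
}
```

kick_demo() returns 0 once the worker has been woken by the kick.
Because the signal stays pending while it is blocked, the outcome is the
same whether pthread_kill() runs before or after the worker reaches
sigsuspend(), which is the property the VCPU kick needs.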
--------------ms040807010503000400040207--