From: Dario Faggioli
Subject: Re: [PATCH v2 1/2] VMX: fix VMCS race on context-switch paths
Date: Thu, 09 Nov 2017 11:36:32 +0100
To: Jan Beulich, Igor Druzhinin
Cc: Sergey Dyasli, Kevin Tian, George Dunlap, Andrew Cooper, Anshul Makkar, Jun Nakajima, xen-devel

On Thu, 2017-11-09 at 03:05 -0700, Jan Beulich wrote:
> > > > On 07.11.17 at 16:52, wrote:
> >
> > There is one thing that I'm worrying about with this approach:
> >
> > At this place we just sync the idle context because we know that we
> > are going to deal with the VMCS later. But what about other
> > potential cases (perhaps some softirqs) in which we are accessing a
> > vCPU data structure that is currently shared between different
> > pCPUs? Maybe we'd better sync the context as soon as possible after
> > we switched to idle from a migrated vCPU.
>
> Short of feedback from the scheduler maintainers I've looked
> into this some more.
>
Sorry, as you may have seen from the other email, I was looking into
this today.

> Calling sync_vcpu_execstate() out of vcpu_move_locked() would seem
> feasible, but there are a number of other places where ->processor
> of a vCPU is being updated, and where the vCPU was not (obviously)
> put to sleep already:
>
> - credit1's csched_runq_steal()
> - credit2's migrate()
> - csched2_schedule()
> - null's vcpu_assign() when called out of null_schedule()
> - rt_schedule()
>
> I don't think it is reasonable to call sync_vcpu_execstate() in all
> of those places,
>
Yes, I agree.

> and hence VMX's special VMCS management may indeed better be taken
> care of by VMX-specific code (i.e., as previously indicated, the
> sync_local_execstate() invocation moved from vcpu_destroy() - where
> my most recent patch draft had put it - to vmx_vcpu_destroy()).
>
I was still trying to form an opinion about the issue, and was leaning
toward suggesting exactly the same.

In fact, the whole point of lazy context switch is exactly that: trying
to avoid syncing the state. Of course, that requires that we identify
all the places and occasions where having the state out of sync may be
a problem, and sync it there!
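Just to make the idea concrete, this is roughly how I picture the
VMX-specific variant. It's a sketch only, built from the function names
mentioned in this thread; I haven't looked at the actual draft patch,
so the exact body of vmx_vcpu_destroy() is my assumption:

    /*
     * Sketch only: reusing the names mentioned in this thread
     * (sync_local_execstate(), vmx_destroy_vmcs()); this is not
     * the actual draft patch.
     */
    static void vmx_vcpu_destroy(struct vcpu *v)
    {
        /*
         * If this pCPU is still lazily carrying a guest vCPU's state
         * (we went idle without a full context switch), force the
         * switch to the idle context now, so that tearing down the
         * VMCS below cannot race with that stale, still-loaded state.
         */
        sync_local_execstate();

        vmx_destroy_vmcs(v);

        /* ... rest of the usual destroy path ... */
    }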
What seems to me to be happening here is as follows:

 a. a pCPU becomes idle;
 b. we do the lazy switch, i.e., the context of the previously running
    vCPU stays loaded on the pCPU;
 c. *something* happens which wants to either play with or alter the
    context currently loaded on the pCPU (in this case it's the VMX
    bits of the context, but it could be other parts of it too).

Well, I'm afraid I only see two solutions:

 1) we get rid of lazy context switch;
 2) whatever it is that is happening at point c above, it needs to be
    aware that we use lazy context switch, and make sure to sync the
    context before playing with or altering it.

(A rough sketch of what I mean by a+b and by 2 is at the bottom of
this mail.)

> And we'd have to live with other VMX code paths having similar
> asynchronous behavior needing to similarly take care of the
> requirement.
>
Exactly. And in fact, in this thread, migration of vCPUs between pCPUs
was mentioned, and it was being considered to treat that in a special
way.

But it looks to me that something very similar may, at least in theory,
happen any time a lazy context switch occurs, no matter whether the
pCPU has become idle because the previously running vCPU wants to move,
or because it blocked for whatever other reason.

Regards,
Dario
--
<> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
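P.S. Here is the sketch mentioned above of how I understand the lazy
switch (points a+b) and where 2) would hook in. It is heavily
simplified and written from memory (is_idle_vcpu(), __context_switch()
and the hook itself are how I remember/imagine them), so take it as an
illustration of the idea rather than as actual Xen code:

    /* Heavily simplified; illustration only, not actual Xen code. */
    void context_switch_sketch(struct vcpu *prev, struct vcpu *next)
    {
        if ( is_idle_vcpu(next) )
        {
            /*
             * a + b: the pCPU goes idle, but we skip the full state
             * save/restore; prev's context (registers, FPU, VMCS, ...)
             * stays loaded on this pCPU.
             */
            return;
        }

        /* A real vCPU is about to run: do the full switch. */
        __context_switch();
    }

    void something_at_point_c(void)
    {
        /*
         * 2): any path that wants to play with or alter the context
         * currently loaded on this pCPU must first be aware that it
         * may be a lazily left-behind one, and sync it.
         */
        sync_local_execstate();

        /* ... now it is safe to touch the loaded context ... */
    }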