From mboxrd@z Thu Jan 1 00:00:00 1970 From: Dario Faggioli Subject: Re: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning PI hooks Date: Wed, 14 Sep 2016 16:51:53 +0200 Message-ID: <1473864713.6339.142.camel@citrix.com> References: <1472615791-8664-1-git-send-email-feng.wu@intel.com> <1472615791-8664-5-git-send-email-feng.wu@intel.com> <57C8030F020000780010ABEF@prv-mh.provo.novell.com> <57C94099020000780010B115@prv-mh.provo.novell.com> <57C95162020000780010B1B3@prv-mh.provo.novell.com> <57C961B3020000780010B27E@prv-mh.provo.novell.com> <57C97470020000780010B354@prv-mh.provo.novell.com> <57C9A0C1020000780010B642@prv-mh.provo.novell.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============4013222792805653803==" Return-path: In-Reply-To: List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xen.org Sender: "Xen-devel" To: "Wu, Feng" , Jan Beulich Cc: "george.dunlap@eu.citrix.com" , "andrew.cooper3@citrix.com" , "Tian, Kevin" , "xen-devel@lists.xen.org" List-Id: xen-devel@lists.xenproject.org --===============4013222792805653803== Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="=-o4FN/x1K+7+dQx2k/a9l" --=-o4FN/x1K+7+dQx2k/a9l Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Wed, 2016-09-14 at 02:23 +0000, Wu, Feng wrote: > Then I tried to implement the function like the following: >=20 > /* This function is called when pcidevs_lock is held */ > void vmx_pi_hooks_assign(struct domain *d) > { > =C2=A0=C2=A0=C2=A0=C2=A0if ( !iommu_intpost || !has_hvm_container_domain(= d) ) > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0return; >=20 > =C2=A0=C2=A0=C2=A0=C2=A0ASSERT(!d->arch.hvm_domain.vmx.vcpu_block); >=20 > =C2=A0=C2=A0=C2=A0=C2=A0/* > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* We carefully handle the timing here: > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* - Install the context switch first > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* - Then set the NDST field > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* - Install the block and resume hooks in t= he end > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* This can make sure the PI (especially the= NDST feild) is > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* in proper state when we call vmx_vcpu_blo= ck(). > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0*/ > =C2=A0=C2=A0=C2=A0=C2=A0d->arch.hvm_domain.vmx.pi_switch_from =3D vmx_pi_= switch_from; > =C2=A0=C2=A0=C2=A0=C2=A0d->arch.hvm_domain.vmx.pi_switch_to =3D vmx_pi_sw= itch_to; >=20 > =C2=A0=C2=A0=C2=A0=C2=A0for_each_vcpu ( d, v ) > =C2=A0=C2=A0=C2=A0=C2=A0{ > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0unsigned int dest =3D cpu= _physical_id(v->processor); > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0struct pi_desc *pi_desc = =3D &v->arch.hvm_vmx.pi_desc; >=20 > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0/* spot 1 */ >=20 > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0write_atomic(&pi_desc->nd= st, > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0x2apic_enabled ? dest= : MASK_INSR(dest, > PI_xAPIC_NDST_MASK)); > =C2=A0=C2=A0=C2=A0=C2=A0} >=20 > =C2=A0=C2=A0=C2=A0=C2=A0d->arch.hvm_domain.vmx.vcpu_block =3D vmx_vcpu_bl= ock; > =C2=A0=C2=A0=C2=A0=C2=A0d->arch.hvm_domain.vmx.pi_do_resume =3D vmx_pi_do= _resume; > } >=20 > However, I still think it is problematic. Consider the _spot 1_ > above, we get > the pCPU of the vCPU and update the NDST, but the vCPU can be happily > rescheduled to another pCPU before updating the NDST. The problem is > that it is not safe to update NDST here, since vCPU can be scheduled > behind us at any time. And if this is the case, it is hard to safely > do this > without guarantee the vCPU is in a known state. Any further advice > on this? Thanks a lot! >=20 So, I'm sorry if this is me missing or not remembering something... but I do remember that, at some point, in the early days of this series, there were concerns about the use of v->processor behind the back of the scheduler, i.e., without holding the proper scheduler related locks. Pausing the domain was not even being remotely considered at the time, it (again, at least AFAICR) came up later for trying to address other issues. No, the whole point of why that was not a problem in the first place is that what counts here is on the wait list of what pcpu the vcpu is put, not really where the vcpu is being or has been scheduled last. Of course it'd be better --and it would also be true most of the times-- if there where a match, but that was not a correctness concern. So why this is all of the sudden becoming one? Am I completely off with my recollection (or in general :-P)? Or what am I missing about the issue we are trying to address with this new bits of the work? Thanks and Regards, Dario --=20 <> (Raistlin Majere) ----------------------------------------------------------------- Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK) --=-o4FN/x1K+7+dQx2k/a9l Content-Type: application/pgp-signature; name="signature.asc" Content-Description: This is a digitally signed message part Content-Transfer-Encoding: 7bit -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJX2WQKAAoJEBZCeImluHPut+AQAN9k81NWjRS4+2wz+ufyMZZd Zb0EIsYfwBmO9CyT4G/joujPIzoQft11zsBhPtQb4gqYneEI62ZESIthTAmL38Q0 OATpk2XNAladEyKRATXsvHdTa9vQmFQDfkzz5BEx++I6DbnLHJLkFS9wa5xINxwA iCdDzEwb0oWyJ0kY9pW6fE14RPmcuU9IUTntuO4mqNzhqhGLBGfNmKfUZUQCk/no kDPRS8BSFP0CW0jOBFFXpqx8bNvtNoKJnyClwVbdvdy1NmQDJwiJTQffEF8P+4wE /0gL/pS53BzQSgp3hO/jze4L3ZSW/QHY4FiB3AlU91SXM16wX06zxUfce6BKsq9x fICc81dSmcqfRKtHVIJ8NvCDqWwtR7VfQjxmvGdtfI+DAx0y58zCTMCKhC+apITB NS6cyk83l0lYUvcWjzPEo/1LqUAgmOETzgDS2w2ORizLASIxooTGtUU/9TrBPm4E ZINK/iWS7D7Px0UMDEeUkS4ge7k+JB/kXJ++iFDcg9M90blwHEvtKwtKCzT4Spj8 pw5qwjAeA7LlUdtGquJxmLSW94NTb2XjiDttxhC5T7jZolioPVx1Is9Qe8q8IThO FYWEjZX/EVgEN/0019BXHieuNEm14y0BOZqPj7YEtPyAxTvzu1nb1WC8xVkE9/Kr Jy140BZM2LAP8SEJeKtm =W+FR -----END PGP SIGNATURE----- --=-o4FN/x1K+7+dQx2k/a9l-- --===============4013222792805653803== Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: base64 Content-Disposition: inline X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KWGVuLWRldmVs IG1haWxpbmcgbGlzdApYZW4tZGV2ZWxAbGlzdHMueGVuLm9yZwpodHRwczovL2xpc3RzLnhlbi5v cmcveGVuLWRldmVsCg== --===============4013222792805653803==--