From: Dario Faggioli
Subject: Re: Notes on stubdoms and latency on ARM
Date: Sat, 8 Jul 2017 16:26:00 +0200
To: Volodymyr Babchuk
Cc: Artem_Mygaiev@epam.com, xen-devel@lists.xensource.com, Andrii Anisov, George Dunlap, Julien Grall, Stefano Stabellini

On Fri, 2017-07-07 at 10:03 -0700, Volodymyr Babchuk wrote:
> On 7 July 2017 at 09:41, Dario Faggioli wrote:
> >
> > Also, are you sure (e.g., because of how the Linux driver is done)
> > that this always happens on one vCPU?
>
> No, I can't guarantee that. The Linux driver is single threaded, but
> I did nothing to pin it to a certain CPU.
>
Ok, it was just to understand.

> >
> > > - In total there are 6 vCPUs active
> > >
> > > I run the test in DomU:
> > > real 113.08
> > > user 0.00
> > > sys 113.04
> > >
> >
> > Ok, so there's contention for pCPUs. Dom0's vCPUs are CPU hogs,
> > while, if my assumption above is correct, the "SMC vCPU" of the
> > DomU is I/O bound, in the sense that it blocks on an operation
> > (which turns out to be an SMC call to MiniOS), then resumes and
> > blocks again almost immediately.
> >
> > Since you are using Credit, can you try to disable context switch
> > rate limiting?
> > Something like:
> >
> >   # xl sched-credit -s -r 0
> >
> > should work.
>
> Yep. You are right. In the environment described above (Case 2) I
> now get much better results:
>
> real 1.85
> user 0.00
> sys 1.85
>
Ok, glad to hear it worked! :-)

> > This looks to me like one of those typical scenarios where rate
> > limiting is counterproductive. In fact, every time your SMC vCPU
> > is woken up, despite being boosted, it finds all the pCPUs busy,
> > and it can't preempt any of the vCPUs running there until the rate
> > limit expires.
> >
> > That means it has to wait an interval of time that varies between
> > 0 and 1ms. This happens 100000 times, and 1ms * 100000 is 100
> > seconds... which is roughly how long the test takes, in the
> > overcommitted case.
>
> Yes, looks like that was the case. Does this mean that rate limiting
> should be disabled for any domain that is backed by a device model?
> AFAIK, device models work in exactly the same way.
>
Rate limiting is a scheduler-wide thing. If it's on, the context
switching rate of all domains is limited. If it's off, none is.

We'll have to see when we have something that is less of a proof of
concept, but it is very likely that, for your use case, rate limiting
should just be kept disabled (you can do that with a Xen boot time
parameter, so that you don't have to issue the command every time).

> > Yes, but it again makes sense. In fact, now there are 3 pCPUs in
> > Pool-0, and all are always kept busy by the 3 DomU vCPUs running
> > endless loops. So, when the DomU's SMC vCPU wakes up, it again has
> > to wait for the rate limit to expire on one of them.
>
> Yes, as this was caused by the rate limit, this makes perfect sense.
> Thank you.
>
> I tried a number of different cases. Now execution time depends
> linearly on the number of over-committed vCPUs (about +200ms for
> every busy vCPU). That is what I expected.
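For reference, the Xen boot time parameter in question is
`sched_ratelimit_us` (per Xen's command-line documentation; 0 disables
rate limiting, the default is 1000, i.e. 1ms). A minimal sketch,
assuming a GRUB2-based Dom0 -- the config file location and variable
name may differ on your distro:

```sh
# /etc/default/grub: append the option to the Xen hypervisor command
# line (GRUB_CMDLINE_XEN_DEFAULT is picked up by /etc/grub.d/20_linux_xen
# on Debian/Ubuntu-style setups).
GRUB_CMDLINE_XEN_DEFAULT="$GRUB_CMDLINE_XEN_DEFAULT sched_ratelimit_us=0"
```

followed by `update-grub` and a reboot. The runtime command above
(`xl sched-credit -s -r 0`) has the same effect, but does not persist
across reboots.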
>
Is this the case even when MiniOS is in its own cpupool?

If yes, it means that the slowdown is caused by the contention between
the vCPU that is doing the SMC calls and the other vCPUs (of either
the same or other domains). That should not really happen in this case
(or, at least, not grow linearly), since you are on Credit1, and in
there the SMC vCPU should pretty much always be boosted, and hence get
scheduled almost immediately, no matter how many CPU hogs there are
around.

Depending on the specific details of your use case/product, we can try
to assign different weights to the various domains... but I need to
think a bit more about this...

Regards,
Dario
-- 
<> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
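P.S. Just to illustrate the weights idea: under Credit, weights are
relative, with a default of 256 per domain. A sketch (the domain name
"domu-smc" is only a placeholder):

```sh
# Give the domain that serves the SMC calls four times the default
# share of pCPU time under the Credit scheduler (weights are relative;
# a domain with weight 512 gets twice as much CPU as one with 256).
xl sched-credit -d domu-smc -w 1024

# Show the scheduler parameters of all domains, to verify.
xl sched-credit
```

Note that weights only matter under contention; an idle pCPU is always
given to whichever vCPU wants it, regardless of weight.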