From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dario Faggioli
Subject: Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
Date: Fri, 16 Dec 2016 02:30:18 +0100
Message-ID: <1481851818.3445.390.camel@citrix.com>
References: <20160928182457.12433-1-andre.przywara@arm.com>
 <0fce93d4-605b-78b9-9146-b4d65eb4e86a@arm.com>
 <4a8bb842-bac5-942f-ca84-d223f43ab50b@arm.com>
 <1481059976.3445.98.camel@citrix.com>
 <09a541d6-a6ec-180d-0f24-400fbb0a3ea4@arm.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============1155970724603855794=="
Return-path:
Received: from mail6.bemta6.messagelabs.com ([193.109.254.103])
 by lists.xenproject.org with esmtp (Exim 4.84_2) (envelope-from )
 id 1cHhLx-0001u1-Ty
 for xen-devel@lists.xenproject.org; Fri, 16 Dec 2016 01:30:30 +0000
In-Reply-To:
List-Unsubscribe:
List-Post:
List-Help:
List-Subscribe:
Errors-To: xen-devel-bounces@lists.xen.org
Sender: "Xen-devel"
To: George Dunlap, Stefano Stabellini
Cc: Andre Przywara, Julien Grall, Steve Capper, Vijay Kilari, xen-devel
List-Id: xen-devel@lists.xenproject.org

--===============1155970724603855794==
Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="=-XwHX39RGjYmgFy2tLKBo"

--=-XwHX39RGjYmgFy2tLKBo
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Wed, 2016-12-14 at 03:39 +0100, George Dunlap wrote:
> > On Dec 10, 2016, at 4:18 AM, Stefano Stabellini wrote:
> > > > The issue with spreading interrupt migrations over time is that it
> > > > makes interrupt latency less deterministic. It is OK, in the
> > > > uncommon case of vCPU migration with interrupts, to take a hit for
> > > > a short time. This "hit" can be measured. It can be known. If your
> > > > workload cannot tolerate it, vCPUs can be pinned. It should be a
> > > > rare event anyway.
> > > > On the other hand, by spreading interrupt migrations, we make it
> > > > harder to predict latency. Aside from determinism, another problem
> > > > with this approach is that it ensures that every interrupt assigned
> > > > to a vCPU will first hit the wrong pCPU, then it will be moved. It
> > > > guarantees the worst-case scenario for interrupt latency for the
> > > > vCPU that has been moved. If we migrated all interrupts as soon as
> > > > possible, we would minimize the number of interrupts delivered to
> > > > the wrong pCPU. Most interrupts would be delivered to the new pCPU
> > > > right away, reducing interrupt latency.
>
> Another approach which one might take:
> 3. Eagerly migrate a subset of the interrupts and lazily migrate the
> others. For instance, we could eagerly migrate all the interrupts
> which have fired since the last vcpu migration. In a system where
> migrations happen frequently, this should only be a handful; in a
> system that migrates infrequently, this will be more, but it won't
> matter, because it will happen less often.
>
Yes, if doable (e.g., I don't know how easy and practical it is to know
and keep track of fired interrupts), this looks like a good solution to
me too.

> So at the moment, the scheduler already tries to avoid migrating
> things *a little bit* if it can (see migrate_resist). It's not clear
> to me at the moment whether this is enough or not.
>
Well, true, but migration resistance, in Credit2, is just a fixed value
which:
 1. is set at boot time;
 2. is always the same for all vcpus;
 3. is always the same, no matter what a vcpu is doing.

And even if we make it tunable and changeable at runtime (which I
intend to do), it's still something pretty "static" because of 2 and 3.
And even if we make it tunable per-vcpu (which is doable), it would be
rather hard to decide what value to set it to, for each vcpu. And, of
course, 3 would still apply (i.e., it would not change according to the
vcpu workload or characteristics).

So, it's guessing. More or less fine grained, but always guessing.

On the other hand, using something proportional to the number of routed
interrupts as the migration resistance threshold would overcome all of
1, 2 and 3. It would give us a migrate_resist value which is adaptive,
and is determined according to the actual workload or properties of a
specific vcpu.

Feeding routed interrupt info to the load balancer comes from similar
reasoning (and we actually may want to do both).

FTR, Credit1 has a similar mechanism, i.e., it makes *even wilder
guesses* about whether a vcpu could still have some of its data in
cache, and tries not to migrate it if that's likely (see
__csched_vcpu_is_cache_hot()). We can improve that too, although it is
a lot more complex and less predictable, as usual with Credit1.

> Or to put it a different way -- how long should the scheduler try to
> wait before moving one of these vcpus?
>
Yep, it's similar to the "anticipation" problem in I/O schedulers
(where "excessive seeks" ~= "too frequent migrations").

 https://en.wikipedia.org/wiki/Anticipatory_scheduling

> At the moment I haven't seen a good way of calculating this.
>
Exactly, and basing the calculation on the number of routed interrupts
--and, if possible, other metrics too-- could be that "good way" we're
looking for.

It would need experimenting, of course, but I like the idea.

> #3 to me has the feeling of being somewhat more satisfying, but also
> potentially fairly complicated. Since the scheduler already does
> migration resistance somewhat, #1 would be simpler to implement in
> the short run. If it turns out that #1 has other drawbacks, we can
> implement #3 as and when needed.
>
> Thoughts?
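To make the adaptive migration resistance idea discussed above a bit
more concrete, here is a minimal sketch in C. It is not Xen code: the
struct, the function names and all three constants are hypothetical,
and the linear "base + per-interrupt increment, capped" formula is just
one assumed way of making the threshold proportional to the number of
routed interrupts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-vcpu bookkeeping; not the real Credit2 structure. */
struct vcpu_sketch {
    unsigned int nr_routed_irqs;  /* interrupts currently routed to this vcpu */
};

/* Assumed tunables, chosen only for illustration. */
#define BASE_RESIST_US     500u   /* fixed part, like a boot-time default   */
#define PER_IRQ_RESIST_US   50u   /* extra resistance per routed interrupt  */
#define MAX_RESIST_US     5000u   /* cap, so a vcpu never becomes unmovable */

/*
 * Resistance grows linearly with the number of routed interrupts and is
 * clamped to MAX_RESIST_US. The sum is computed in 64 bits so that a
 * large interrupt count cannot overflow before the clamp is applied.
 */
static unsigned int adaptive_migrate_resist_us(const struct vcpu_sketch *v)
{
    assert(v != NULL);
    uint64_t r = (uint64_t)BASE_RESIST_US +
                 (uint64_t)v->nr_routed_irqs * PER_IRQ_RESIST_US;
    return r > MAX_RESIST_US ? MAX_RESIST_US : (unsigned int)r;
}

/*
 * Migrate only if the expected gain (expressed in the same unit as the
 * resistance, which is an assumption of this sketch) exceeds the
 * per-vcpu threshold.
 */
static int should_migrate(const struct vcpu_sketch *v, unsigned int gain_us)
{
    return gain_us > adaptive_migrate_resist_us(v);
}
```

With these assumed constants, a vcpu with no routed interrupts keeps
the base threshold, while one with 10 routed interrupts needs twice the
gain before the balancer would move it. Whether a linear scaling is the
right shape is exactly the kind of thing that would need experimenting.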
Yes, we can do things incrementally, which is always good. I like your
#1 proposal because it has the really positive side effect of bringing
us into the camp of adaptive migration resistance, which is something
pretty advanced and pretty cool, if we manage to do it right. :-)

Regards,
Dario
-- 
<> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

--=-XwHX39RGjYmgFy2tLKBo
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: This is a digitally signed message part
Content-Transfer-Encoding: 7bit

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQIcBAABCAAGBQJYU0OrAAoJEBZCeImluHPuBNIQANGUvlO83JoKVSDjO4gI76Cz
4wakl+n1U4t/Kwt5Lx52OXh1ssoK8T2rVV7tavxC4CF6tZzv+BAIM7SQxzO+d36c
gGSWGSwTzVelYLes9vCUlEbX+L47sUp+H27ji1ZUfgFHtEuyDFfBgL4qlYi4T+Dv
3GGp7cLZYpyUaFE15rWxLLBay10LMNoGDbSctAZCSeaSl6/xTv1YxJ0tPIlApFaf
NB3sdjTpNaghmO6rpIBaxIHFUwyGrhi6O2+zPPJH1kgc12pZe/dJVMcvJBco6oN4
JZeweQ29WccbKDJQw02cBJHwikDQAvigwoX0jaL50mUP8C20X4KMJwoFYwxAK9MX
E9Pkd4HlbB0qAAjKD2B9+KFwP7wEId9ZZ8HZP0TYZ1GBAD+lsKNusv3pDkVi6nkw
Qe8EuSeMr1IYBYKZf9PyLleuoWYri+EWJV5s/ZX1byJZ085FWg0f1oQewsRGB2rk
nDTrtvdJBVi+neeU41ZhfFvd+PXDUTzN3k8svHbd/wNKfKzxMqmj9V30MsmvalxW
Ym9Z3veku0QkEyfKog+OAzFBum9B3aIhrXbVllW8+DpYzOs2X1CTVBs8xSsBbK9K
XWbpiFbK6E3GJunuF5+rsP/uBlFFN1F1w3rx6SfsZk03CIY/yEdUBgZwJg87bboi
rXgl3yM2GA/RJklyhQN+
=2O5a
-----END PGP SIGNATURE-----

--=-XwHX39RGjYmgFy2tLKBo--

--===============1155970724603855794==
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: base64
Content-Disposition: inline

X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KWGVuLWRldmVs
IG1haWxpbmcgbGlzdApYZW4tZGV2ZWxAbGlzdHMueGVuLm9yZwpodHRwczovL2xpc3RzLnhlbi5v
cmcveGVuLWRldmVsCg==

--===============1155970724603855794==--