From: Dario Faggioli
Subject: Re: [RFC 0/5] xen/arm: support big.little SoC
Date: Wed, 21 Sep 2016 17:45:40 +0200
Message-ID: <1474472740.4393.281.camel@citrix.com>
In-Reply-To: <6bd7d587-f9ba-c3bf-db96-46a2958d9e5b@arm.com>
To: Julien Grall, George Dunlap, Stefano Stabellini
Cc: Juergen Gross, Peng Fan, Steve Capper, George Dunlap, Andrew Cooper, Punit Agrawal, "xen-devel@lists.xen.org", Jan Beulich
List-Id: xen-devel@lists.xenproject.org

On Wed, 2016-09-21 at 14:06 +0100, Julien Grall wrote:
> (CC a couple of ARM folks)
>
Yay, thanks for this! :-)

> I had few discussions and more thought about big.LITTLE support in Xen.
> The main goal of big.LITTLE is power efficiency by moving task around
> and been able to idle one cluster.
> All the solutions suggested (including mine) so far, can be replicated
> by hand (except the VPIDR) so they are mostly an automatic way.
>
I'm sorry, how is this (going to be) handled in Linux? Is it that any
arbitrary task executing any arbitrary binary code can be run on both
big and LITTLE pcpus, depending on the scheduler's and energy
management's decisions?

This does not seem to match what has been said at some point in this
thread... And if it's like that, how's that possible, if the pcpus'
ISAs are (even only slightly) different?

> This will also remove the real benefits of big.LITTLE because Xen will
> not be able to migrate vCPU across cluster for power efficiency.
>
> If we care about power efficiency, we would have to handle seamlessly
> big.LITTLE in Xen (i.e a guess would only see a kind of CPU).
>
Well, I'm a big fan of an approach that leaves the guests' scheduler
dumb about things like these (i.e., load balancing, energy efficiency,
etc), and hence puts Xen in charge. In fact, on a Xen system, it is
only Xen that has all the info necessary to make wise decisions (e.g.,
the load of the _whole_ host, the effect of any decisions on the
_whole_ host, etc).

But this case may be a LITTLE.bit ( :-PP ) different.

Anyway, I guess I'll wait for your reply to my question above before
commenting more.

> This arise quite few problem, nothing insurmountable, similar to
> migration across two platforms with different micro-architecture (e.g
> processors): errata, features supported... The guest would have to
> know the union of all the errata (this is done so far via the MIDR, so
> we would a PV way to do it), and only the intersection of features
> would be exposed to the guest. This also means the scheduler would
> have to be modified to handle power efficiency (not strictly necessary
> at the beginning).
>
> I agree that a such solution would require some work to implement,
> although Xen will have a better control of the energy consumption of
> the platform.
>
> So the question here, is what do we want to achieve with big.LITTLE?
>
Just thinking out loud here. So, instead of "just", as George
suggested:

 vcpuclass=["0-1:A35","2-5:A53", "6-7:A72"]

we can allow something like the following (note that I'm tossing out
random numbers next to the 'A's):

 vcpuclass = ["0-1:A35", "2-5:A53,A17", "6-7:A72,A24,A31", "12-13:A8"]

with the following meaning:
 - vcpus 0, 1 can only run on pcpus of class A35
 - vcpus 2,3,4,5 can run on pcpus of class A53 _and_ on pcpus of class
   A17
 - vcpus 6,7 can run on pcpus of class A72, A24, A31
 - vcpus 8,9,10,11, since they're not mentioned, can run on pcpus of
   any class
 - vcpus 12,13 can only run on pcpus of class A8

This will set the "boundaries" for each vcpu.
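
To make the semantics above a bit more concrete, here is a rough Python
sketch of how such a list could be turned into per-vcpu boundaries. This
is only an illustration, not xl/libxl code: the function names
(parse_vcpuclass, can_run_on) and the num_vcpus parameter are made up
for the example, and the choice of None to mean "any class" is an
assumption.

```python
def parse_vcpuclass(spec, num_vcpus):
    """Map each vcpu ID to the set of pcpu classes it may run on.

    spec is a list of strings such as "2-5:A53,A17". A value of None
    means the vcpu was not mentioned, i.e. it may run on any class
    (an assumption made for this sketch).
    """
    allowed = {vcpu: None for vcpu in range(num_vcpus)}
    for entry in spec:
        vcpu_range, sep, classes = entry.partition(":")
        if not sep:
            raise ValueError("missing ':' in entry %r" % entry)
        # "0-1" is a range, "3" is a single vcpu.
        first, _, last = vcpu_range.partition("-")
        first = int(first)
        last = int(last) if last else first
        class_set = {c.strip() for c in classes.split(",") if c.strip()}
        if not class_set:
            raise ValueError("no pcpu class in entry %r" % entry)
        for vcpu in range(first, last + 1):
            if vcpu not in allowed:
                raise ValueError("vcpu %d out of range" % vcpu)
            allowed[vcpu] = set(class_set)
    return allowed


def can_run_on(allowed, vcpu, pcpu_class):
    """True if vcpu is within its boundaries on a pcpu of pcpu_class."""
    classes = allowed[vcpu]
    return classes is None or pcpu_class in classes


spec = ["0-1:A35", "2-5:A53,A17", "6-7:A72,A24,A31", "12-13:A8"]
boundaries = parse_vcpuclass(spec, num_vcpus=14)
```

With this, a scheduler-side check would only ever need to ask
can_run_on(boundaries, vcpu, pcpu_class) before considering a pcpu;
any policy on top of that stays free to pick among the pcpus that pass.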
Then, within these boundaries, once in the (Xen's) scheduler, we can
implement whatever complex/magic/silly logic we want, e.g.:
 - only use a pcpu of class A53 for vcpus that have an average load
   above 50%
 - only use a pcpu of class A31 if there are no idle pcpus of class A24
 - only use a pcpu of class A17 for a vcpu if the total system load
   divided by the vcpu ID gives 42 as a result
 - whatever

This allows us to achieve both the following goals:
 - allow Xen to take smart decisions, considering the load and the
   efficiency of the host as a whole
 - allow the guest to take smart decisions, like running lightweight
   tasks on low power vcpus (which then Xen will run on low power
   pcpus, at least on a properly configured system)

Of course this **requires** that, for instance, vcpu 6 must be able to
run on A72, A24 and A31 just fine, i.e., it must be possible for it to
block on I/O when executing on an A72 pcpu, and, later, after wakeup,
restart executing on an A24 pcpu.

If that is not possible, and doing such vcpu movement, instead of just
calling schedule.c:vcpu_migrate() (or equivalent), requires some more
complex fiddling, involving local migration (or similar) techniques,
then I honestly don't think this is something that can be solved at
the scheduler level anyway... :-O

> [1] https://lwn.net/Articles/699569/
>
I tried to have a quick look, but I don't have the time right now, and
furthermore, it's all about ARM, and I still speak too little ARM to
properly understand what's going on...
:-(

Regards,
Dario
-- 
<> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)