From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dario Faggioli
Subject: Re: Notes on stubdoms and latency on ARM
Date: Thu, 20 Jul 2017 11:25:25 +0200
Message-ID: <1500542725.20438.8.camel@citrix.com>
References: <8c63069d-c909-e82c-ecba-5451f822a5cc@citrix.com>
 <1497953518.7405.21.camel@citrix.com>
 <1499445690.3620.8.camel@citrix.com>
 <1499840091.7756.12.camel@citrix.com>
 <3121c88c-fbda-a494-ce91-b06fa0fc10f3@citrix.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============6098132848930160186=="
Errors-To: xen-devel-bounces@lists.xen.org
Sender: "Xen-devel"
To: Julien Grall, George Dunlap, Stefano Stabellini, Volodymyr Babchuk
Cc: Artem_Mygaiev@epam.com, xen-devel@lists.xensource.com, Andrii Anisov
List-Id: xen-devel@lists.xenproject.org

--===============6098132848930160186==
Content-Type: multipart/signed; micalg=pgp-sha256;
 protocol="application/pgp-signature"; boundary="=-Nrbbow2ALoTReO+vMHNj"

--=-Nrbbow2ALoTReO+vMHNj
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Wed, 2017-07-19 at 12:21 +0100, Julien Grall wrote:
> On 17/07/17 12:28, George Dunlap wrote:
> > Just checking -- you do mean its own core, as opposed to its own
> > socket? (Or NUMA node?)
>
> I don't know much about the scheduler, so I might say something
> stupid here :). Below the code we have for ARM
>
> /* XXX these seem awfully x86ish... */
> /* representing HT siblings of each logical CPU */
> DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_sibling_mask);
> /* representing HT and core siblings of each logical CPU */
> DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_core_mask);
>
> static void setup_cpu_sibling_map(int cpu)
> {
>     if ( !zalloc_cpumask_var(&per_cpu(cpu_sibling_mask, cpu)) ||
>          !zalloc_cpumask_var(&per_cpu(cpu_core_mask, cpu)) )
>         panic("No memory for CPU sibling/core maps");
>
>     /* A CPU is a sibling with itself and is always on its own core. */
>     cpumask_set_cpu(cpu, per_cpu(cpu_sibling_mask, cpu));
>     cpumask_set_cpu(cpu, per_cpu(cpu_core_mask, cpu));
> }
>
> #define cpu_to_socket(_cpu) (0)
>
> After calling setup_cpu_sibling_map, we never touch cpu_sibling_mask
> and cpu_core_mask for a given pCPU. So I would say that each logical
> CPU is in its own core, but they are all in the same socket at the
> moment.
>
Ah, fine... so you're in the exact opposite of the situation I was
thinking about and reasoning about in the reply to George I've just
sent! :-P

Ok, this basically means that, by default, on any ARM system, no matter
how big or small, Credit2 will always use just one runqueue, from which
_all_ the pCPUs will fish vCPUs to run.

As said already, it's impossible to tell whether this is good or bad in
the general case. It's good for fairness and load distribution (load
balancing happens automatically, without the actual load balancing
logic and code having to do anything at all!), but it's bad for lock
contention (every runq operation, e.g., wakeup, schedule, etc., has to
take the same lock).
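
To make the single-runqueue point concrete, here is a minimal sketch of
the effect. It is not Xen code: it only assumes that runqueues are
grouped one per socket, which is what the reasoning above relies on.
count_runqueues(), arm_stub_cpu_to_socket(), four_per_socket() and
MAX_SOCKETS are all hypothetical names made up for illustration; only
the "always socket 0" behaviour comes from the quoted ARM code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SOCKETS 64  /* hypothetical upper bound, for this sketch only */

/* Hypothetical topology hook: maps a pCPU number to a socket ID. */
typedef unsigned int (*cpu_to_socket_fn)(unsigned int cpu);

/* Mirrors the quoted ARM definition: every pCPU reports socket 0. */
static unsigned int arm_stub_cpu_to_socket(unsigned int cpu)
{
    (void)cpu;
    return 0;
}

/* Hypothetical accurate topology: 4 pCPUs per socket. */
static unsigned int four_per_socket(unsigned int cpu)
{
    return cpu / 4;
}

/*
 * Count how many runqueues a one-runqueue-per-socket policy would
 * create for pCPUs 0..ncpus-1. Returns 0 if a socket ID is out of
 * range for this sketch.
 */
static unsigned int count_runqueues(unsigned int ncpus,
                                    cpu_to_socket_fn to_socket)
{
    bool seen[MAX_SOCKETS] = { false };
    unsigned int cpu, nr = 0;

    assert(to_socket != NULL);

    for ( cpu = 0; cpu < ncpus; cpu++ )
    {
        unsigned int s = to_socket(cpu);

        if ( s >= MAX_SOCKETS )
            return 0;
        if ( !seen[s] )
        {
            /* First pCPU seen on this socket: a new runqueue. */
            seen[s] = true;
            nr++;
        }
    }

    return nr;
}
```

With the ARM stub, count_runqueues(8, arm_stub_cpu_to_socket) yields 1,
i.e., all eight pCPUs share one runqueue and one lock. With the
hypothetical four_per_socket mapping, the same eight pCPUs yield 2.
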
I think this explains at least part of why Stefano's wakeup latency
numbers are rather bad with Credit2 on ARM, while that is not the case
in my tests on x86.

> > All that to say: It shouldn't be a major issue if you are
> > mis-reporting sockets. :-)
>
> Good to know, thank you for the explanation! We might want to parse
> the bindings correctly to get a bit of improvement. I will add a task
> on jira.
>
Yes, we should. Credit1 does not care about it, but Credit2 is
specifically designed to take advantage of this (and possibly even
more!) topology information, so it needs to be accurate. :-D

Thanks and Regards,
Dario
-- 
<> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

--=-Nrbbow2ALoTReO+vMHNj--

--===============6098132848930160186==--