From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dario Faggioli
Subject: Re: [PATCH 3/6] xen: credit1: increase efficiency and scalability of load balancing.
Date: Thu, 2 Mar 2017 12:35:12 +0100
Message-ID: <1488454512.5548.191.camel@citrix.com>
References: <148844531279.23452.17528540110704914171.stgit@Solace.fritz.box> <148845109955.23452.14312315410693510946.stgit@Solace.fritz.box> <57001cc1-de50-eea3-da2c-737b27981c1b@citrix.com>
In-Reply-To: <57001cc1-de50-eea3-da2c-737b27981c1b@citrix.com>
To: Andrew Cooper, xen-devel@lists.xenproject.org
Cc: George Dunlap
List-Id: xen-devel@lists.xenproject.org

On Thu, 2017-03-02 at 11:06 +0000, Andrew Cooper wrote:
> On 02/03/17 10:38, Dario Faggioli wrote:
> > signed-off-by: Dario Faggioli
> > ---
> > Cc: George Dunlap
> > Cc: Andrew Cooper
>
> Malcolm’s solution to this problem is
> https://github.com/xenserver/xen-4.7.pg/commit/0f830b9f229fa6472accc9630ad16cfa42258966
> This has been in 2 releases of XenServer now, and has a very visible
> improvement for aggregate multi-queue multi-vm intrahost network
> performance (although I can't find the numbers right now).
>
Yep, as you know, I had become aware of that.

> The root of the performance problems is that pcpu_schedule_trylock()
> is expensive even for the local case, while cross-cpu locking is much
> worse.  Locking every single pcpu in turn is terribly expensive, in
> terms of hot cacheline pingpong, and the lock is frequently
> contended.
>
> As a first opinion of this patch, you are adding another cpumask
> which is going to play hot cacheline pingpong.
>
Can you clarify? Inside csched_load_balance(), I'm actually trading one
currently existing cpumask_and() for another. Sure, this new mask needs
updating, but that only happens when a pCPU acquires or leaves the
"overloaded" status.

Malcolm's patch uses an atomic counter, which does not fit very well in
the logic of Credit1's load balancer, and in fact it (potentially)
requires an additional cpumask_cycle(). And it also comes with cache
coherency implications, doesn't it?

Regards,
Dario
-- 
<> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
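To make the point about mask updates concrete, here is a minimal, standalone sketch of an "overloaded" mask that is written only when a pCPU crosses the overloaded threshold. It is not the code from the patch: the 64-bit word standing in for a cpumask, the struct balancer_state type, and the runq_insert(), runq_remove() and steal_candidates() helpers are all hypothetical names chosen for illustration, and "overloaded" is assumed here to mean "more than one runnable vCPU".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 64u /* assumption: every pCPU fits in one 64-bit word */

/* Hypothetical stand-in for a cpumask: bit N set means pCPU N is in the set. */
typedef uint64_t cpumask64_t;

struct balancer_state {
    cpumask64_t online;                /* pCPUs that are online */
    cpumask64_t overloaded;            /* pCPUs with more than one runnable vCPU */
    unsigned int nr_runnable[NR_CPUS]; /* per-pCPU runqueue length */
};

/* Assumed definition of "overloaded": more than one runnable vCPU. */
static inline bool is_overloaded(const struct balancer_state *s, unsigned int cpu)
{
    return s->nr_runnable[cpu] > 1;
}

/*
 * Add a runnable vCPU to cpu's runqueue. The shared mask is written only
 * if the pCPU just became overloaded. Returns true when the mask changed.
 */
static inline bool runq_insert(struct balancer_state *s, unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    bool was = is_overloaded(s, cpu);
    s->nr_runnable[cpu]++;
    if (was == is_overloaded(s, cpu))
        return false;                  /* no status change: mask untouched */
    s->overloaded |= (cpumask64_t)1 << cpu;
    return true;
}

/*
 * Remove a runnable vCPU from cpu's runqueue. The shared mask is written
 * only if the pCPU just stopped being overloaded.
 */
static inline bool runq_remove(struct balancer_state *s, unsigned int cpu)
{
    assert(cpu < NR_CPUS && s->nr_runnable[cpu] > 0);
    bool was = is_overloaded(s, cpu);
    s->nr_runnable[cpu]--;
    if (was == is_overloaded(s, cpu))
        return false;
    s->overloaded &= ~((cpumask64_t)1 << cpu);
    return true;
}

/*
 * Peers worth trying to steal from: online, overloaded, and not ourselves.
 * This single AND takes the place of scanning (and locking) every online pCPU.
 */
static inline cpumask64_t steal_candidates(const struct balancer_state *s,
                                           unsigned int self)
{
    assert(self < NR_CPUS);
    return s->online & s->overloaded & ~((cpumask64_t)1 << self);
}
```

In this sketch a pCPU whose runqueue goes from two to three vCPUs never touches the shared word; only the 1 -> 2 and 2 -> 1 transitions do. A real hypervisor would also need atomic bit operations or locking around the mask updates, which the sketch leaves out.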