From mboxrd@z Thu Jan 1 00:00:00 1970 From: Dario Faggioli Subject: Re: schedulers' use of cpumask_scratch Date: Fri, 9 Dec 2016 12:56:04 +0100 Message-ID: <1481284564.3445.234.camel@citrix.com> References: <58497D3C0200007800126D0A@prv-mh.provo.novell.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============1501988116446086633==" Return-path: Received: from mail6.bemta6.messagelabs.com ([193.109.254.103]) by lists.xenproject.org with esmtp (Exim 4.84_2) (envelope-from ) id 1cFJmi-0005hZ-Uc for xen-devel@lists.xenproject.org; Fri, 09 Dec 2016 11:56:17 +0000 In-Reply-To: <58497D3C0200007800126D0A@prv-mh.provo.novell.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xen.org Sender: "Xen-devel" To: Jan Beulich Cc: xen-devel List-Id: xen-devel@lists.xenproject.org --===============1501988116446086633== Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="=-AkQN9cuTUS1MA0o0rRAg" --=-AkQN9cuTUS1MA0o0rRAg Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Thu, 2016-12-08 at 07:33 -0700, Jan Beulich wrote: > Dario, >=20 Hi, > While it's > already > in use in credit1's __runq_tickle(), I then got puzzled by the > different > ways credit1 and credit2 use the variable: The former always uses > scratch_cpumask_cpu() with the subject CPU as argument, while > the latter always uses plain scratch_cpumask. What guarantees > that these two uses never conflict, if both schedulers are active for > some part of the system? >=20 So, what's important, when using the scratch masks of one CPU is that: =C2=A0- the CPU belongs to your cpupool and scheduler, =C2=A0- you own the runqueue lock (the one you take via=C2=A0 =C2=A0 =C2=A0{v,p}cpu_schedule_lock()) for that CPU. So, in Credit1: __runq_tickle(): cpu is what's in new's processor. new comes from v in vcpu_wake(); we take its lock, but we may not be on v->processor, so we need scratch_cpumask_cpu(). csched_runq_steal(): cpu comes from cpu in=C2=A0csched_load_balance(), whic= h comes from cpu in csched_schedule(), which is smp_processor_id(). So, in this case, I could have used cpumask_scratch (and probably should make a patch to that effect). In Credit2: get_fallback_cpu(): it's called by=C2=A0csched2_cpu_pick(). That comes from either vcpu_migrate() (via csched2_cpu_pick()) or from=C2=A0csched2_vcpu_insert(), and in both cases, we can't be sure we either are on or hold the lock of smp_processor_id. So, yes, this indeed must be converted in cpumask_scratch_cpu(cpu), with cpu being one of the CPUs for which we hold the runqueue lock (which in case of Credit2 is per runqueue, not per-CPU, but that does not make the current code less wrong, I think)! csched2_cpu_pick(): should be fixed and use cpumask_scratch_cpu(), for the same reasons listed above. migrate(): when it comes from balance_load(), it's ok, because that comes from csched2_schedule(), and hence we can use this_cpu(), as using cpumask_scratch directly does, because we have the lock on smp_processor_id()'s runqueue. When it comes from=C2=A0csched2_vcpu_migrate(), however, it's not, because vcpu_migrate() can be called from other places. So, this needs fixing too. :-( csched2_schedule(): here, it's obviously ok to just use cpumask_scratch. So, indeed it looks we are at least in need for a patch. What must have happened is that I changed the logic of how scratch space was used when implementing it in schedule.c, for all schedulers, but did not update the Credit2 patches accordingly. I'll send a fix ASAP (not sure if by today, as today I'm not working... well :-P), and I guess that would be a 4.8.1 candidate. :-/ Thanks a lot for reporting this, Dario --=20 <> (Raistlin Majere) ----------------------------------------------------------------- Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK) --=-AkQN9cuTUS1MA0o0rRAg Content-Type: application/pgp-signature; name="signature.asc" Content-Description: This is a digitally signed message part Content-Transfer-Encoding: 7bit -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJYSpvWAAoJEBZCeImluHPu10IQAL+SraGuTEpPcVoo2lA7uwqw RVSwPVKeCrOjSe4ComamQFyofa2iFIxS4CAX3f6lWJOM0kG97VPzz0xrOD8UeO6R QKCohp/uaNZuAZDXmZvz6AgyP7v/eZQwQIoIJAKsF0MEk+k3aNgPA7VZE7ZH/Dx3 pq6F9gk//zBPkDNnfBboNumAYmw6Z+o8F0V9Soy4RE7Jv8prmwjpxTuamU4Gb/F9 F8mviIAStSlcWUXQBdCos7kVe54x9cabDVwTnoUtanSiLUoBd55Pp/nmjzubcOgS fiwyzT2OIdLZJuLZjLKiYE1bzk8b/XAUvN35gkug5Hp19hDUDLkqVCh3D9ocpS8m dp1q68KQDsmGbOcrV0HlgewDCHyHtaWiCXvUCQWfFR1GtW89XCQa6jqIrm7ykBMb F/OSu2orqvuGneOpfiQK+UnnBWwVrJrOUjutIbbbs3Vr9cS6xbsJJD9cgHNv/WVr XLGEnH2lclzQvmouLC790VOq+RP94T67byycR82JsFY5xmfpVaiPKtkfw9fVZIrL HTScpIyU8JPGRGudXY0FOVKM2puUcldL8Gpu9rm8lSdD2OfLReAW/M5cM77uoH1k eQ3qIcnc1Y9qqIYPLBrX4JcU6Klj5jDyc65/R4iLOZGVyXU+SSKSLjcpDGplCGgt FLJXYeKqOevlCFZMgTXx =iOQZ -----END PGP SIGNATURE----- --=-AkQN9cuTUS1MA0o0rRAg-- --===============1501988116446086633== Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: base64 Content-Disposition: inline X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KWGVuLWRldmVs IG1haWxpbmcgbGlzdApYZW4tZGV2ZWxAbGlzdHMueGVuLm9yZwpodHRwczovL2xpc3RzLnhlbi5v cmcveGVuLWRldmVsCg== --===============1501988116446086633==--