From mboxrd@z Thu Jan 1 00:00:00 1970 From: Dario Faggioli Subject: Re: [PATCH 3/4] xen: credit1: properly deal with pCPUs not in any cpupool Date: Thu, 2 Jul 2015 18:01:58 +0200 Message-ID: <1435852918.14347.38.camel@citrix.com> References: <20150625103457.3353.39292.stgit@Solace.station> <20150625121527.3353.32071.stgit@Solace.station> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============1688000829232779279==" Return-path: Received: from mail6.bemta3.messagelabs.com ([195.245.230.39]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1ZAgwJ-0006hg-1D for xen-devel@lists.xenproject.org; Thu, 02 Jul 2015 16:02:15 +0000 In-Reply-To: List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: George Dunlap Cc: Juergen Gross , xen-devel List-Id: xen-devel@lists.xenproject.org --===============1688000829232779279== Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="=-TLd9CD/EU2nAgaMsFKmR" --=-TLd9CD/EU2nAgaMsFKmR Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Thu, 2015-07-02 at 16:24 +0100, George Dunlap wrote: > On Thu, Jun 25, 2015 at 1:15 PM, Dario Faggioli > wrote: > > Ideally, the pCPUs that are 'free', i.e., not assigned > > to any cpupool, should not be considred by the scheduler > > for load balancing or anything. In Credit1, we fail at > > this, because of how we use cpupool_scheduler_cpumask(). > > In fact, for a free pCPU, cpupool_scheduler_cpumask() > > returns a pointer to cpupool_free_cpus, and hence, near > > the top of csched_load_balance(): > > > > if ( unlikely(!cpumask_test_cpu(cpu, online)) ) > > goto out; > > > > is false (the pCPU _is_ free!), and we therefore do not > > jump to the end right away, as we should. This, causes > > the following splat when resuming from ACPI S3 with > > pCPUs not assigned to any pool: >=20 > What I haven't quite twigged yet with regard to this specific bug is > why csched_load_balance() is being run on this cpu at all if it hasn't > been added to the cpupool yet. =20 > Because the cpu is online already. That happens in start_secondary(), right after having notified the CPU_STARTING phase of CPU bringup. OTOH, adding a cpu to a pool only happens during the CPU_ONLINE phase, which is after that. > AFAICT it's only called from > csched_schedule() -- why is that being called on a cpu that isn't > online yet? =20 > Because, sadly, it is online. :-/ > If we can't fix that before the code freeze (or if we > wouldn't want to avoid it anyway), would it make more sense to check > for that condition in csched_schedule() and just return the idle > domain? > I tried to move things around (i.e., moving the assignment to a cpupool ahead in the bringup process), but that introduces irq-safe vs. non irq-safe locking issues (ISTR that was on the spin locks protecting the memory being allocated from inside schedule_cpu_switch(), which is called by the cpu to cpupool assignment routine). So, yeah, it's something I also don't like much, but it's hard to fix properly, even without considering 4.6's freeze. :-/ > (Or schedule() even?) >=20 Perhaps, but: - this is (I think) pretty orthogonal wrt this patch, - I tried something like that as well, but then I did not like having a check for "is this cpu outside of any cpupool?" in such code, and going through it all the time (and this is an hot path!), for the sake of a corner case (as I consider having cpus outside of any cpupool a transient situation, as it is system start). The approach implemented seemed the best one in term of both how the code looks and performs for the vast majority of use cases and of the time. Thanks and Regards, Dario --=20 <> (Raistlin Majere) ----------------------------------------------------------------- Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK) --=-TLd9CD/EU2nAgaMsFKmR Content-Type: application/pgp-signature; name="signature.asc" Content-Description: This is a digitally signed message part Content-Transfer-Encoding: 7bit -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iEYEABECAAYFAlWVYH8ACgkQk4XaBE3IOsQ8sgCeNRMv2zTju9feryUYAGoh4x31 LW0AoIzuGRuEbVzp9EU0U9/rSArBwSL8 =yXAR -----END PGP SIGNATURE----- --=-TLd9CD/EU2nAgaMsFKmR-- --===============1688000829232779279== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel --===============1688000829232779279==--