From mboxrd@z Thu Jan 1 00:00:00 1970 From: Juergen Gross Subject: Re: [PATCH 3/4] xen: credit1: properly deal with pCPUs not in any cpupool Date: Fri, 26 Jun 2015 16:05:27 +0200 Message-ID: <558D5C27.5050100@suse.com> References: <20150625103457.3353.39292.stgit@Solace.station> <20150625121527.3353.32071.stgit@Solace.station> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: Received: from mail6.bemta5.messagelabs.com ([195.245.231.135]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1Z8UG2-0008Ex-Lz for xen-devel@lists.xenproject.org; Fri, 26 Jun 2015 14:05:30 +0000 In-Reply-To: <20150625121527.3353.32071.stgit@Solace.station> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Dario Faggioli , xen-devel@lists.xenproject.org Cc: George Dunlap List-Id: xen-devel@lists.xenproject.org On 06/25/2015 02:15 PM, Dario Faggioli wrote: > Ideally, the pCPUs that are 'free', i.e., not assigned > to any cpupool, should not be considred by the scheduler > for load balancing or anything. In Credit1, we fail at > this, because of how we use cpupool_scheduler_cpumask(). > In fact, for a free pCPU, cpupool_scheduler_cpumask() > returns a pointer to cpupool_free_cpus, and hence, near > the top of csched_load_balance(): > > if ( unlikely(!cpumask_test_cpu(cpu, online)) ) > goto out; > > is false (the pCPU _is_ free!), and we therefore do not > jump to the end right away, as we should. This, causes > the following splat when resuming from ACPI S3 with > pCPUs not assigned to any pool: > > (XEN) ----[ Xen-4.6-unstable x86_64 debug=y Tainted: C ]---- > (XEN) ... ... ... > (XEN) Xen call trace: > (XEN) [] csched_load_balance+0x213/0x794 > (XEN) [] csched_schedule+0x321/0x452 > (XEN) [] schedule+0x12a/0x63c > (XEN) [] __do_softirq+0x82/0x8d > (XEN) [] do_softirq+0x13/0x15 > (XEN) [] idle_loop+0x5b/0x6b > (XEN) > (XEN) > (XEN) **************************************** > (XEN) Panic on CPU 8: > (XEN) GENERAL PROTECTION FAULT > (XEN) [error_code=0000] > (XEN) **************************************** > > The cure is: > * use cpupool_online_cpumask(), as a better guard to the > case when the cpu is being offlined; > * explicitly check whether the cpu is free. > > SEDF is in a similar situation, so fix it too. > > Still in Credit1, we must make sure that free (or offline) > CPUs are not considered "ticklable". Not doing so would impair > the load balancing algorithm, making the scheduler think that > it is possible to 'ask' the pCPU to pick up some work, while > in reallity, that will never happen! Evidence of such behavior > is shown in this trace: > > Name CPU list > Pool-0 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14 > > 0.112998198 | ||.|| -|x||-|- d0v0 runstate_change d0v4 offline->runnable > ] 0.112998198 | ||.|| -|x||-|- d0v0 22006(2:2:6) 1 [ f ] > ] 0.112999612 | ||.|| -|x||-|- d0v0 28004(2:8:4) 2 [ 0 4 ] > 0.113003387 | ||.|| -||||-|x d32767v15 runstate_continue d32767v15 running->running > > where "22006(2:2:6) 1 [ f ]" means that pCPU 15, which is > free from any pool, is tickled. > > The cure, in this case, is to filter out the free pCPUs, > within __runq_tickle(). > > Signed-off-by: Dario Faggioli Acked-by: Juergen Gross > --- > Cc: George Dunlap > Cc: Juergen Gross > --- > xen/common/sched_credit.c | 23 ++++++++++++++++------- > xen/common/sched_sedf.c | 3 ++- > 2 files changed, 18 insertions(+), 8 deletions(-) > > diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c > index 953ecb0..a1945ac 100644 > --- a/xen/common/sched_credit.c > +++ b/xen/common/sched_credit.c > @@ -366,12 +366,17 @@ __runq_tickle(unsigned int cpu, struct csched_vcpu *new) > { > struct csched_vcpu * const cur = CSCHED_VCPU(curr_on_cpu(cpu)); > struct csched_private *prv = CSCHED_PRIV(per_cpu(scheduler, cpu)); > - cpumask_t mask, idle_mask; > + cpumask_t mask, idle_mask, *online; > int balance_step, idlers_empty; > > ASSERT(cur); > cpumask_clear(&mask); > - idlers_empty = cpumask_empty(prv->idlers); > + > + /* cpu is vc->processor, so it must be in a cpupool. */ > + ASSERT(per_cpu(cpupool, cpu) != NULL); > + online = cpupool_online_cpumask(per_cpu(cpupool, cpu)); > + cpumask_and(&idle_mask, prv->idlers, online); > + idlers_empty = cpumask_empty(&idle_mask); > > > /* > @@ -408,8 +413,8 @@ __runq_tickle(unsigned int cpu, struct csched_vcpu *new) > /* Are there idlers suitable for new (for this balance step)? */ > csched_balance_cpumask(new->vcpu, balance_step, > csched_balance_mask); > - cpumask_and(&idle_mask, prv->idlers, csched_balance_mask); > - new_idlers_empty = cpumask_empty(&idle_mask); > + cpumask_and(csched_balance_mask, csched_balance_mask, &idle_mask); > + new_idlers_empty = cpumask_empty(csched_balance_mask); > > /* > * Let's not be too harsh! If there aren't idlers suitable > @@ -1510,6 +1515,7 @@ static struct csched_vcpu * > csched_load_balance(struct csched_private *prv, int cpu, > struct csched_vcpu *snext, bool_t *stolen) > { > + struct cpupool *c = per_cpu(cpupool, cpu); > struct csched_vcpu *speer; > cpumask_t workers; > cpumask_t *online; > @@ -1517,10 +1523,13 @@ csched_load_balance(struct csched_private *prv, int cpu, > int node = cpu_to_node(cpu); > > BUG_ON( cpu != snext->vcpu->processor ); > - online = cpupool_scheduler_cpumask(per_cpu(cpupool, cpu)); > + online = cpupool_online_cpumask(c); > > - /* If this CPU is going offline we shouldn't steal work. */ > - if ( unlikely(!cpumask_test_cpu(cpu, online)) ) > + /* > + * If this CPU is going offline, or is not (yet) part of any cpupool > + * (as it happens, e.g., during cpu bringup), we shouldn't steal work. > + */ > + if ( unlikely(!cpumask_test_cpu(cpu, online) || c == NULL) ) > goto out; > > if ( snext->pri == CSCHED_PRI_IDLE ) > diff --git a/xen/common/sched_sedf.c b/xen/common/sched_sedf.c > index a1a4cb7..cad5b39 100644 > --- a/xen/common/sched_sedf.c > +++ b/xen/common/sched_sedf.c > @@ -790,7 +790,8 @@ static struct task_slice sedf_do_schedule( > if ( tasklet_work_scheduled || > (list_empty(runq) && list_empty(waitq)) || > unlikely(!cpumask_test_cpu(cpu, > - cpupool_scheduler_cpumask(per_cpu(cpupool, cpu)))) ) > + cpupool_online_cpumask(per_cpu(cpupool, cpu))) || > + per_cpu(cpupool, cpu) == NULL) ) > { > ret.task = IDLETASK(cpu); > ret.time = SECONDS(1); > >