From mboxrd@z Thu Jan 1 00:00:00 1970 From: Juergen Gross Subject: Re: [PATCH] Avoid endless loop for vcpu migration Date: Tue, 15 Mar 2011 09:46:08 +0100 Message-ID: <4D7F2750.4010607@ts.fujitsu.com> References: <9d164ce877a75cab847b.1300113594@nehalem1> <4D7E3C640200007800036564@vpn.id2.novell.com> <4D7EFE43.7070900@ts.fujitsu.com> <4D7F29ED02000078000367F4@vpn.id2.novell.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="------------050703000005090402090803" Return-path: In-Reply-To: <4D7F29ED02000078000367F4@vpn.id2.novell.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: Jan Beulich Cc: xen-devel@lists.xensource.com List-Id: xen-devel@lists.xenproject.org This is a multi-part message in MIME format. --------------050703000005090402090803 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit On 03/15/11 08:57, Jan Beulich wrote: >>>> On 15.03.11 at 06:50, Juergen Gross wrote: >> On 03/14/11 16:03, Jan Beulich wrote: >>>>>> On 14.03.11 at 15:39, Juergen Gross wrote: >>>> On multi-thread multi-core systems an endless loop can occur in vcpu_migrate() >>>> with credit scheduler. Avoid this loop by changing the interface of pick_cpu >>>> to indicate a repeated call in this case. >>> >>> But you're not changing in any way the loop that doesn't get >>> exited - did you perhaps read my original description as the >>> pick function itself looping (which - afaict - it doesn't)? >> >> I'm changing the way the pick_cpu function is reacting on multiple calls in >> a loop. If I've understood the idle_bias correctly, updating it in each >> loop iteration did result in returning another cpu for each call. >> By updating idle_bias only once, it should return the same cpu in subsequent >> calls. This should exit the loop in vcpu_migrate. > > You're only decreasing the likelihood of a live lock, as the return > value of pick_cpu not only depends on idle_bias. Hmm, then another solution would be to let pick_cpu really return the proposed cpu from the first iteration, if it doesn't contradict the allowed settings. It could be sub-optimal, but I don't think this is critical, as vcpu_migrate is called rarely. Patch attached. > >>> Further, the change still isn't consistent with idle_bias - the >>> updating ought to happen on the last iteration (if you need >>> to call the function more than once), not the first one, which >>> creates a chicken-and-egg problem for you as you will know >>> it's the last one only when it returned. >> >> Is it really so important idle_bias is reflecting the last cpu selected? >> I was under the impression it should be okay when this is true in most >> cases. With my patch idle_bias might be "wrong" if there is a race with >> other cpus forcing a selection of a different cpu in the second iteration >> of the loop in vcpu_migrate. Is this really critical? I doubt it. > > It's not critical, and not affecting correctness. But with updating > idle_bias on the first invocation you're (on the right hardware) > basically guaranteeing the second invocation to return a > different CPU. That way, your loop will be run minimally three > times on such systems. I already find it odd to require two > iterations when previously this was a strait code path. This was wrong. It was always required to hold the schedule lock of the picked cpu as well, otherwise a race with cpu hotplug would be possible. > > If there's really no way around the iterative approach, one > possibility might be to not take into consideration idle_bias > on non-initial invocations at all. This would be a side effect of my suggestion. Juergen -- Juergen Gross Principal Developer Operating Systems TSP ES&S SWE OS6 Telephone: +49 (0) 89 3222 2967 Fujitsu Technology Solutions e-mail: juergen.gross@ts.fujitsu.com Domagkstr. 28 Internet: ts.fujitsu.com D-80807 Muenchen Company details: ts.fujitsu.com/imprint.html --------------050703000005090402090803 Content-Type: text/x-patch; name="vcpu_migrate.patch" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="vcpu_migrate.patch" diff -r 9f6dec0d25cd xen/common/sched_arinc653.c --- a/xen/common/sched_arinc653.c Mon Mar 14 17:20:11 2011 +0000 +++ b/xen/common/sched_arinc653.c Tue Mar 15 09:39:10 2011 +0100 @@ -612,7 +612,7 @@ a653sched_do_schedule( * @return Number of selected physical CPU */ static int -a653sched_pick_cpu(const struct scheduler *ops, struct vcpu *vc) +a653sched_pick_cpu(const struct scheduler *ops, struct vcpu *vc, int proposal) { /* this implementation only supports one physical CPU */ return 0; diff -r 9f6dec0d25cd xen/common/sched_credit.c --- a/xen/common/sched_credit.c Mon Mar 14 17:20:11 2011 +0000 +++ b/xen/common/sched_credit.c Tue Mar 15 09:39:10 2011 +0100 @@ -459,7 +459,7 @@ __csched_vcpu_is_migrateable(struct vcpu } static int -_csched_cpu_pick(const struct scheduler *ops, struct vcpu *vc, bool_t commit) +_csched_cpu_pick(const struct scheduler *ops, struct vcpu *vc, int proposal, bool_t commit) { cpumask_t cpus; cpumask_t idlers; @@ -473,6 +473,10 @@ _csched_cpu_pick(const struct scheduler */ online = CSCHED_CPUONLINE(vc->domain->cpupool); cpus_and(cpus, *online, vc->cpu_affinity); + + if ( (proposal >= 0) && cpu_isset(proposal, *online) ) + return proposal; + cpu = cpu_isset(vc->processor, cpus) ? vc->processor : cycle_cpu(vc->processor, cpus); @@ -548,9 +552,9 @@ _csched_cpu_pick(const struct scheduler } static int -csched_cpu_pick(const struct scheduler *ops, struct vcpu *vc) +csched_cpu_pick(const struct scheduler *ops, struct vcpu *vc, int proposal) { - return _csched_cpu_pick(ops, vc, 1); + return _csched_cpu_pick(ops, vc, proposal, 1); } static inline void @@ -635,7 +639,7 @@ csched_vcpu_acct(struct csched_private * { __csched_vcpu_acct_start(prv, svc); } - else if ( _csched_cpu_pick(ops, current, 0) != cpu ) + else if ( _csched_cpu_pick(ops, current, -1, 0) != cpu ) { CSCHED_VCPU_STAT_CRANK(svc, migrate_r); CSCHED_STAT_CRANK(migrate_running); diff -r 9f6dec0d25cd xen/common/sched_credit2.c --- a/xen/common/sched_credit2.c Mon Mar 14 17:20:11 2011 +0000 +++ b/xen/common/sched_credit2.c Tue Mar 15 09:39:10 2011 +0100 @@ -1350,7 +1350,7 @@ out: } static int -csched_cpu_pick(const struct scheduler *ops, struct vcpu *vc) +csched_cpu_pick(const struct scheduler *ops, struct vcpu *vc, int proposal) { struct csched_vcpu * const svc = CSCHED_VCPU(vc); int new_cpu; diff -r 9f6dec0d25cd xen/common/sched_sedf.c --- a/xen/common/sched_sedf.c Mon Mar 14 17:20:11 2011 +0000 +++ b/xen/common/sched_sedf.c Tue Mar 15 09:39:10 2011 +0100 @@ -442,13 +442,15 @@ static void sedf_destroy_domain(const st sedf_free_domdata(ops, d->sched_priv); } -static int sedf_pick_cpu(const struct scheduler *ops, struct vcpu *v) +static int sedf_pick_cpu(const struct scheduler *ops, struct vcpu *v, int proposal) { cpumask_t online_affinity; cpumask_t *online; online = SEDF_CPUONLINE(v->domain->cpupool); cpus_and(online_affinity, v->cpu_affinity, *online); + if ( (proposal >= 0) && cpu_isset(proposal, online_affinity) ) + return proposal; return first_cpu(online_affinity); } diff -r 9f6dec0d25cd xen/common/schedule.c --- a/xen/common/schedule.c Mon Mar 14 17:20:11 2011 +0000 +++ b/xen/common/schedule.c Tue Mar 15 09:39:10 2011 +0100 @@ -393,9 +393,10 @@ static void vcpu_migrate(struct vcpu *v) static void vcpu_migrate(struct vcpu *v) { unsigned long flags; - unsigned int old_cpu, new_cpu; + unsigned int old_cpu, new_cpu, proposal; spinlock_t *old_lock, *new_lock; + proposal = -1; old_cpu = new_cpu = v->processor; for ( ; ; ) { @@ -430,10 +431,11 @@ static void vcpu_migrate(struct vcpu *v) old_cpu = v->processor; if ( old_lock == per_cpu(schedule_data, old_cpu).schedule_lock ) { - new_cpu = SCHED_OP(VCPU2OP(v), pick_cpu, v); + new_cpu = SCHED_OP(VCPU2OP(v), pick_cpu, v, proposal); if ( (new_lock == per_cpu(schedule_data, new_cpu).schedule_lock) && cpu_isset(new_cpu, v->domain->cpupool->cpu_valid) ) break; + proposal = new_cpu; } if ( old_lock != new_lock ) diff -r 9f6dec0d25cd xen/include/xen/sched-if.h --- a/xen/include/xen/sched-if.h Mon Mar 14 17:20:11 2011 +0000 +++ b/xen/include/xen/sched-if.h Tue Mar 15 09:39:10 2011 +0100 @@ -167,7 +167,7 @@ struct scheduler { struct task_slice (*do_schedule) (const struct scheduler *, s_time_t, bool_t tasklet_work_scheduled); - int (*pick_cpu) (const struct scheduler *, struct vcpu *); + int (*pick_cpu) (const struct scheduler *, struct vcpu *, int); int (*adjust) (const struct scheduler *, struct domain *, struct xen_domctl_scheduler_op *); int (*adjust_global) (const struct scheduler *, --------------050703000005090402090803 Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel --------------050703000005090402090803--