Re: [PATCH] Avoid endless loop for vcpu migration

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Juergen Gross <juergen.gross@ts.fujitsu.com>
To: Jan Beulich <JBeulich@novell.com>
Cc: xen-devel@lists.xensource.com
Subject: Re: [PATCH] Avoid endless loop for vcpu migration
Date: Tue, 15 Mar 2011 09:46:08 +0100	[thread overview]
Message-ID: <4D7F2750.4010607@ts.fujitsu.com> (raw)
In-Reply-To: <4D7F29ED02000078000367F4@vpn.id2.novell.com>

[-- Attachment #1: Type: text/plain, Size: 3204 bytes --]

On 03/15/11 08:57, Jan Beulich wrote:
>>>> On 15.03.11 at 06:50, Juergen Gross<juergen.gross@ts.fujitsu.com>  wrote:
>> On 03/14/11 16:03, Jan Beulich wrote:
>>>>>> On 14.03.11 at 15:39, Juergen Gross<juergen.gross@ts.fujitsu.com>   wrote:
>>>> On multi-thread multi-core systems an endless loop can occur in vcpu_migrate()
>>>> with credit scheduler. Avoid this loop by changing the interface of pick_cpu
>>>> to indicate a repeated call in this case.
>>>
>>> But you're not changing in any way the loop that doesn't get
>>> exited - did you perhaps read my original description as the
>>> pick function itself looping (which - afaict - it doesn't)?
>>
>> I'm changing the way the pick_cpu function is reacting on multiple calls in
>> a loop. If I've understood the idle_bias correctly, updating it in each
>> loop iteration did result in returning another cpu for each call.
>> By updating idle_bias only once, it should return the same cpu in subsequent
>> calls. This should exit the loop in vcpu_migrate.
>
> You're only decreasing the likelihood of a live lock, as the return
> value of pick_cpu not only depends on idle_bias.

Hmm, then another solution would be to let pick_cpu really return the
proposed cpu from the first iteration, if it doesn't contradict the
allowed settings. It could be sub-optimal, but I don't think this is
critical, as vcpu_migrate is called rarely.

Patch attached.

>
>>> Further, the change still isn't consistent with idle_bias - the
>>> updating ought to happen on the last iteration (if you need
>>> to call the function more than once), not the first one, which
>>> creates a chicken-and-egg problem for you as you will know
>>> it's the last one only when it returned.
>>
>> Is it really so important idle_bias is reflecting the last cpu selected?
>> I was under the impression it should be okay when this is true in most
>> cases. With my patch idle_bias might be "wrong" if there is a race with
>> other cpus forcing a selection of a different cpu in the second iteration
>> of the loop in vcpu_migrate. Is this really critical? I doubt it.
>
> It's not critical, and not affecting correctness. But with updating
> idle_bias on the first invocation you're (on the right hardware)
> basically guaranteeing the second invocation to return a
> different CPU. That way, your loop will be run minimally three
> times on such systems. I already find it odd to require two
> iterations when previously this was a strait code path.

This was wrong. It was always required to hold the schedule lock of the
picked cpu as well, otherwise a race with cpu hotplug would be possible.

>
> If there's really no way around the iterative approach, one
> possibility might be to not take into consideration idle_bias
> on non-initial invocations at all.

This would be a side effect of my suggestion.


Juergen

-- 
Juergen Gross                 Principal Developer Operating Systems
TSP ES&S SWE OS6                       Telephone: +49 (0) 89 3222 2967
Fujitsu Technology Solutions              e-mail: juergen.gross@ts.fujitsu.com
Domagkstr. 28                           Internet: ts.fujitsu.com
D-80807 Muenchen                 Company details: ts.fujitsu.com/imprint.html

[-- Attachment #2: vcpu_migrate.patch --]
[-- Type: text/x-patch, Size: 5033 bytes --]

diff -r 9f6dec0d25cd xen/common/sched_arinc653.c
--- a/xen/common/sched_arinc653.c	Mon Mar 14 17:20:11 2011 +0000
+++ b/xen/common/sched_arinc653.c	Tue Mar 15 09:39:10 2011 +0100
@@ -612,7 +612,7 @@ a653sched_do_schedule(
  * @return          Number of selected physical CPU
  */
 static int
-a653sched_pick_cpu(const struct scheduler *ops, struct vcpu *vc)
+a653sched_pick_cpu(const struct scheduler *ops, struct vcpu *vc, int proposal)
 {
     /* this implementation only supports one physical CPU */
     return 0;
diff -r 9f6dec0d25cd xen/common/sched_credit.c
--- a/xen/common/sched_credit.c	Mon Mar 14 17:20:11 2011 +0000
+++ b/xen/common/sched_credit.c	Tue Mar 15 09:39:10 2011 +0100
@@ -459,7 +459,7 @@ __csched_vcpu_is_migrateable(struct vcpu
 }
 
 static int
-_csched_cpu_pick(const struct scheduler *ops, struct vcpu *vc, bool_t commit)
+_csched_cpu_pick(const struct scheduler *ops, struct vcpu *vc, int proposal, bool_t commit)
 {
     cpumask_t cpus;
     cpumask_t idlers;
@@ -473,6 +473,10 @@ _csched_cpu_pick(const struct scheduler 
      */
     online = CSCHED_CPUONLINE(vc->domain->cpupool);
     cpus_and(cpus, *online, vc->cpu_affinity);
+
+    if ( (proposal >= 0) && cpu_isset(proposal, *online) )
+        return proposal;
+
     cpu = cpu_isset(vc->processor, cpus)
             ? vc->processor
             : cycle_cpu(vc->processor, cpus);
@@ -548,9 +552,9 @@ _csched_cpu_pick(const struct scheduler 
 }
 
 static int
-csched_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
+csched_cpu_pick(const struct scheduler *ops, struct vcpu *vc, int proposal)
 {
-    return _csched_cpu_pick(ops, vc, 1);
+    return _csched_cpu_pick(ops, vc, proposal, 1);
 }
 
 static inline void
@@ -635,7 +639,7 @@ csched_vcpu_acct(struct csched_private *
     {
         __csched_vcpu_acct_start(prv, svc);
     }
-    else if ( _csched_cpu_pick(ops, current, 0) != cpu )
+    else if ( _csched_cpu_pick(ops, current, -1, 0) != cpu )
     {
         CSCHED_VCPU_STAT_CRANK(svc, migrate_r);
         CSCHED_STAT_CRANK(migrate_running);
diff -r 9f6dec0d25cd xen/common/sched_credit2.c
--- a/xen/common/sched_credit2.c	Mon Mar 14 17:20:11 2011 +0000
+++ b/xen/common/sched_credit2.c	Tue Mar 15 09:39:10 2011 +0100
@@ -1350,7 +1350,7 @@ out:
 }
 
 static int
-csched_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
+csched_cpu_pick(const struct scheduler *ops, struct vcpu *vc, int proposal)
 {
     struct csched_vcpu * const svc = CSCHED_VCPU(vc);
     int new_cpu;
diff -r 9f6dec0d25cd xen/common/sched_sedf.c
--- a/xen/common/sched_sedf.c	Mon Mar 14 17:20:11 2011 +0000
+++ b/xen/common/sched_sedf.c	Tue Mar 15 09:39:10 2011 +0100
@@ -442,13 +442,15 @@ static void sedf_destroy_domain(const st
     sedf_free_domdata(ops, d->sched_priv);
 }
 
-static int sedf_pick_cpu(const struct scheduler *ops, struct vcpu *v)
+static int sedf_pick_cpu(const struct scheduler *ops, struct vcpu *v, int proposal)
 {
     cpumask_t online_affinity;
     cpumask_t *online;
 
     online = SEDF_CPUONLINE(v->domain->cpupool);
     cpus_and(online_affinity, v->cpu_affinity, *online);
+    if ( (proposal >= 0) && cpu_isset(proposal, online_affinity) )
+        return proposal;
     return first_cpu(online_affinity);
 }
 
diff -r 9f6dec0d25cd xen/common/schedule.c
--- a/xen/common/schedule.c	Mon Mar 14 17:20:11 2011 +0000
+++ b/xen/common/schedule.c	Tue Mar 15 09:39:10 2011 +0100
@@ -393,9 +393,10 @@ static void vcpu_migrate(struct vcpu *v)
 static void vcpu_migrate(struct vcpu *v)
 {
     unsigned long flags;
-    unsigned int old_cpu, new_cpu;
+    unsigned int old_cpu, new_cpu, proposal;
     spinlock_t *old_lock, *new_lock;
 
+    proposal = -1;
     old_cpu = new_cpu = v->processor;
     for ( ; ; )
     {
@@ -430,10 +431,11 @@ static void vcpu_migrate(struct vcpu *v)
         old_cpu = v->processor;
         if ( old_lock == per_cpu(schedule_data, old_cpu).schedule_lock )
         {
-            new_cpu = SCHED_OP(VCPU2OP(v), pick_cpu, v);
+            new_cpu = SCHED_OP(VCPU2OP(v), pick_cpu, v, proposal);
             if ( (new_lock == per_cpu(schedule_data, new_cpu).schedule_lock) &&
                  cpu_isset(new_cpu, v->domain->cpupool->cpu_valid) )
                 break;
+            proposal = new_cpu;
         }
 
         if ( old_lock != new_lock )
diff -r 9f6dec0d25cd xen/include/xen/sched-if.h
--- a/xen/include/xen/sched-if.h	Mon Mar 14 17:20:11 2011 +0000
+++ b/xen/include/xen/sched-if.h	Tue Mar 15 09:39:10 2011 +0100
@@ -167,7 +167,7 @@ struct scheduler {
     struct task_slice (*do_schedule) (const struct scheduler *, s_time_t,
                                       bool_t tasklet_work_scheduled);
 
-    int          (*pick_cpu)       (const struct scheduler *, struct vcpu *);
+    int          (*pick_cpu)       (const struct scheduler *, struct vcpu *, int);
     int          (*adjust)         (const struct scheduler *, struct domain *,
                                     struct xen_domctl_scheduler_op *);
     int          (*adjust_global)  (const struct scheduler *,

[-- Attachment #3: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

next prev parent reply	other threads:[~2011-03-15  8:46 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-03-14 14:39 [PATCH] Avoid endless loop for vcpu migration Juergen Gross
2011-03-14 15:03 ` Jan Beulich
2011-03-14 15:06   ` Tim Deegan
2011-03-15  5:50   ` Juergen Gross
2011-03-15  7:57     ` Jan Beulich
2011-03-15  8:46       ` Juergen Gross [this message]
2011-03-15  8:50         ` Keir Fraser
2011-03-15  8:53           ` Juergen Gross
2011-03-15  9:01         ` Jan Beulich
2011-03-15  9:21           ` Juergen Gross
2011-03-15  9:34             ` Jan Beulich
2011-03-15  9:58             ` Keir Fraser
2011-03-15 10:29               ` Juergen Gross
2011-03-14 15:06 ` Keir Fraser

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4D7F2750.4010607@ts.fujitsu.com \
    --to=juergen.gross@ts.fujitsu.com \
    --cc=JBeulich@novell.com \
    --cc=xen-devel@lists.xensource.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.