xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Dario Faggioli <dario.faggioli@citrix.com>
To: xen-devel@lists.xenproject.org
Cc: George Dunlap <george.dunlap@eu.citrix.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Jan Beulich <JBeulich@suse.com>
Subject: [PATCH v2 1/2] xen: credit1: fix a race when picking initial pCPU for a vCPU
Date: Thu, 18 Aug 2016 12:00:52 +0200	[thread overview]
Message-ID: <147151445223.29674.955105994328843699.stgit@Solace.fritz.box> (raw)
In-Reply-To: <147151337343.29674.1081345215393715232.stgit@Solace.fritz.box>

In the Credit1 hunk of 9f358ddd69463 ("xen: Have
schedulers revise initial placement") csched_cpu_pick()
is called without taking the runqueue lock of the
(temporary) pCPU that the vCPU has been assigned to
(e.g., in XEN_DOMCTL_max_vcpus).

However, although 'hidden' in the IS_RUNQ_IDLE() macro,
that function does access the runq (for doing load
balancing calculations). Two scenarios are possible:
 1) we are on cpu X, and IS_RUNQ_IDLE() peeks at cpu's
    X own runq;
 2) we are on cpu X, but IS_RUNQ_IDLE() peeks at some
    other cpu's runq.

Scenario 2) absolutely requies that the appropriate
runq lock is taken. Scenario 1) works even without
taking the cpu's own runq lock, and this is important
for the case when _csched_pick_cpu() is called from
csched_vcpu_acct() which in turn is called by
csched_tick().

Races have been observed and reported (by both XenServer
own testing and OSSTest [1]), in the form of
IS_RUNQ_IDLE() falling over LIST_POISON, because we're
not currently holding the proper lock, in
csched_vcpu_insert(), when scenario 1) occurs.

Since this is all very tricky, in addition to fix things,
add both an ASSERT() and a comment in IS_RUNQ_IDLE() (which
is also becoming static inline instead of macro).

[1] https://lists.xen.org/archives/html/xen-devel/2016-08/msg02144.html

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
---
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Jan Beulich <JBeulich@suse.com>
---
Changes from v1:
 - macro IS_RUNQ_IDLE() to static inline is_runq_idle(), as suggested
   during review;
 - add an ASSERT() and a comment, as suggested during review;
 - take into account what's described in the changelog as "scenario 1)",
   which wasn't being considered in v1.
---
 xen/common/sched_credit.c |   38 +++++++++++++++++++++++++++++++++-----
 1 file changed, 33 insertions(+), 5 deletions(-)

diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
index 220ff0d..daace81 100644
--- a/xen/common/sched_credit.c
+++ b/xen/common/sched_credit.c
@@ -84,9 +84,6 @@
 #define CSCHED_VCPU(_vcpu)  ((struct csched_vcpu *) (_vcpu)->sched_priv)
 #define CSCHED_DOM(_dom)    ((struct csched_dom *) (_dom)->sched_priv)
 #define RUNQ(_cpu)          (&(CSCHED_PCPU(_cpu)->runq))
-/* Is the first element of _cpu's runq its idle vcpu? */
-#define IS_RUNQ_IDLE(_cpu)  (list_empty(RUNQ(_cpu)) || \
-                             is_idle_vcpu(__runq_elem(RUNQ(_cpu)->next)->vcpu))
 
 
 /*
@@ -248,6 +245,33 @@ __runq_elem(struct list_head *elem)
     return list_entry(elem, struct csched_vcpu, runq_elem);
 }
 
+/* Is the first element of cpu's runq (if any) cpu's idle vcpu? */
+static inline bool_t is_runq_idle(unsigned int cpu)
+{
+    /*
+     * If we are on cpu, and we are peeking at our own runq while cpu itself
+     * is not idle, that's fine even if we don't hold the runq lock. In fact,
+     * the fact that there is a (non idle!) vcpu running means that at least
+     * the idle vcpu is in the runq. And since only cpu itself (via work
+     * stealing) can add stuff to the runq, and no other cpu will ever steal
+     * our idle vcpu, that maks the runq manipulations done below safe, even
+     * without locks.
+     *
+     * On the other hand, if we're peeking at another cpu's runq, we must hold
+     * its the proper runq lock.
+     *
+     * As a matter of fact, the former scenario describes what happens when
+     * _cshced_cpu_pick() (which then calls us) is called from csched_tick(),
+     * the latter one describes what actually happen when it is called from
+     * csched_vcpu_insert().
+     */
+    ASSERT((!is_idle_vcpu(curr_on_cpu(cpu)) && cpu == smp_processor_id()) ||
+           spin_is_locked(per_cpu(schedule_data, cpu).schedule_lock));
+
+    return list_empty(RUNQ(cpu)) ||
+           is_idle_vcpu(__runq_elem(RUNQ(cpu)->next)->vcpu);
+}
+
 static inline void
 __runq_insert(struct csched_vcpu *svc)
 {
@@ -771,7 +795,7 @@ _csched_cpu_pick(const struct scheduler *ops, struct vcpu *vc, bool_t commit)
          * runnable vcpu on cpu, we add cpu to the idlers.
          */
         cpumask_and(&idlers, &cpu_online_map, CSCHED_PRIV(ops)->idlers);
-        if ( vc->processor == cpu && IS_RUNQ_IDLE(cpu) )
+        if ( vc->processor == cpu && is_runq_idle(cpu) )
             __cpumask_set_cpu(cpu, &idlers);
         cpumask_and(&cpus, &cpus, &idlers);
 
@@ -998,9 +1022,13 @@ csched_vcpu_insert(const struct scheduler *ops, struct vcpu *vc)
 
     BUG_ON( is_idle_vcpu(vc) );
 
-    /* This is safe because vc isn't yet being scheduled */
+    /* csched_cpu_pick() looks in vc->processor's runq, so we need the lock. */
+    lock = vcpu_schedule_lock_irq(vc);
+
     vc->processor = csched_cpu_pick(ops, vc);
 
+    spin_unlock_irq(lock);
+
     lock = vcpu_schedule_lock_irq(vc);
 
     if ( !__vcpu_on_runq(svc) && vcpu_runnable(vc) && !vc->is_running )


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

  reply	other threads:[~2016-08-18 10:00 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-08-18 10:00 [PATCH v2 0/2] xen: credit1: fix a race when picking initial pCPU for a vCPU Dario Faggioli
2016-08-18 10:00 ` Dario Faggioli [this message]
2016-08-19 12:23   ` [PATCH v2 1/2] " George Dunlap
2016-08-19 14:02     ` Dario Faggioli
2016-08-18 10:00 ` [PATCH v2 2/2] xen: credit1: no need to check for is_idle_vcpu() in csched_vcpu_acct() Dario Faggioli
2016-08-19 12:40   ` George Dunlap

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=147151445223.29674.955105994328843699.stgit@Solace.fritz.box \
    --to=dario.faggioli@citrix.com \
    --cc=JBeulich@suse.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=george.dunlap@eu.citrix.com \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).