From: Dario Faggioli <dario.faggioli@citrix.com>
To: Juergen Gross <jgross@suse.com>, xen-devel@lists.xenproject.org
Cc: George Dunlap <George.Dunlap@eu.citrix.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Jan Beulich <jbeulich@suse.com>
Subject: Re: [PATCH v2 1/2] xen: fix a (latent) cpupool-related race during domain destroy
Date: Fri, 15 Jul 2016 13:52:42 +0200
Message-ID: <1468583562.13039.96.camel@citrix.com>
In-Reply-To: <5788BCA2.4020404@suse.com>


On Fri, 2016-07-15 at 12:36 +0200, Juergen Gross wrote:
> On 15/07/16 12:14, Dario Faggioli wrote:
> > In particular, I'm probably not fully understanding, from that
> > commit changelog, what is the set of operations/commands that I
> > should run to check whether or not I have reintroduced the issue.
> You need to create a domain in a cpupool and destroy it again while
> some dom0 process is still holding a reference to it (resulting in a
> zombie domain). Then try to destroy the cpupool.
> 
Ah, I see. I hadn't gathered from anywhere that it needed to be a
zombie domain.
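
So, IIUC, the sequence to test would be something like this (just my
reading of your description above, so correct me if the sketch is off):

  create a cpupool P
  create a domain d inside P
  have some dom0 process grab (and keep) a reference to d
  destroy d    --> d lingers as a zombie, as the reference is held
  destroy P    --> this has to move the zombie d to cpupool0, which
                   is the step that was failing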

> > What am I missing?
> The domain being a zombie domain might change the picture. Moving it
> to cpupool0 was failing before my patch, and it might do so again
> with your patch applied.
> 
Mmmm... I don't immediately see the reason why moving a zombie domain
fails either, but I guess I'll have to try.

But then, correct me if I'm wrong, the situation is like this:
 - right now there's a (potential) race between domain's scheduling 
   data destruction and domain removal from a cpupool;
 - with my patch, the race goes away, but we risk not being able to 
   destroy a cpupool with a zombie domain in it.

I don't think we want to be in either of these two situations. :-(
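
To make the window concrete, the ordering we have right now is roughly
the below (the function names are the real ones from
xen/common/domain.c, but the sketch is simplified, so take it as an
illustration rather than the actual code):

  domain_destroy(d)
      cpupool_rm_domain(d)        <-- d->cpupool becomes NULL
      call_rcu(&d->rcu, complete_domain_destroy)

  ... window: the scheduler can still run work for d's vcpus, and
      anything calling cpupool_domain_cpumask(d) in here finds
      d->cpupool == NULL ...

  complete_domain_destroy(d)      <-- RCU callback, some time later
      sched_destroy_domain(d)     <-- scheduling data freed here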

The race has never triggered so far, but I already have patches for
Credit2 (finishing the implementation of soft affinity for it) that
make it a real thing.

I think I can work around that specific case by doing something like
the below:

diff --git a/xen/include/xen/sched-if.h b/xen/include/xen/sched-if.h
index bc0e794..d91fd70 100644
--- a/xen/include/xen/sched-if.h
+++ b/xen/include/xen/sched-if.h
@@ -193,12 +193,9 @@ struct cpupool
 
 static inline cpumask_t* cpupool_domain_cpumask(struct domain *d)
 {
-    /*
-     * d->cpupool is NULL only for the idle domain, and no one should
-     * be interested in calling this for the idle domain.
-     */
-    ASSERT(d->cpupool != NULL);
-    return d->cpupool->cpu_valid;
+    /* No one should be interested in calling this for the idle domain! */
+    ASSERT(!is_idle_domain(d));
+    return d->cpupool ? d->cpupool->cpu_valid : cpupool0->cpu_valid;
 }
 
 #endif /* __XEN_SCHED_IF_H__ */
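
FWIW, the kind of caller this is meant to keep safe would be something
like the below. Mind, this is a made-up example (there is no
pick_cpu_for() in the tree); it just uses the usual idioms:

/* Hypothetical scheduler call site, for illustration only. */
static unsigned int pick_cpu_for(struct vcpu *v)
{
    /*
     * With the fallback above, this is safe even if we get here
     * after cpupool_rm_domain() has already zapped d->cpupool:
     * we just end up picking from cpupool0's mask.
     */
    cpumask_t *online = cpupool_domain_cpumask(v->domain);

    return cpumask_any(online);
}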

But even if that were acceptable (what do you think?), I still don't
like having the race lurking there. :-/

Therefore, I still think this patch is correct, but I'm up for
investigating further and finding a way to solve the "zombie in
cpupool" issue as well.

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

