All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dario Faggioli <dario.faggioli@citrix.com>
To: Juergen Gross <jgross@suse.com>, xen-devel@lists.xenproject.org
Cc: George Dunlap <George.Dunlap@eu.citrix.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Jan Beulich <jbeulich@suse.com>
Subject: Re: [PATCH v2 1/2] xen: fix a (latent) cpupool-related race during domain destroy
Date: Fri, 15 Jul 2016 13:52:42 +0200	[thread overview]
Message-ID: <1468583562.13039.96.camel@citrix.com> (raw)
In-Reply-To: <5788BCA2.4020404@suse.com>


[-- Attachment #1.1: Type: text/plain, Size: 2876 bytes --]

On Fri, 2016-07-15 at 12:36 +0200, Juergen Gross wrote:
> On 15/07/16 12:14, Dario Faggioli wrote:
> > In particular, I'm probably not fully understanding, from that
> > commit
> > changelog, what is the set of operations/command that I should run
> > to
> > check whether or not I reintroduced the issue back.
> You need to create a domain in a cpupool and destroy it again while
> some dom0 process still is holding a reference to it (resulting in a
> zombie domain). Then try to destroy the cpupool.
> 
Ah, I see. I wasn't get the fact that it needed to be a zombie domain
from anywhere.

> > What am I missing?
> The domain being a zombie domain might change the picture. Moving it
> to
> cpupool0 was failing before my patch and it might do so again with
> your
> patch applied.
> 
Mmmm... I don't immediately see the reason why moving a zombie domain
fails either, but I guess I'll have to try.

But then, correct me if I'm wrong, the situation is like this:
 - right now there's a (potential) race between domain's scheduling 
   data destruction and domain removal from a cpupool;
 - with my patch, the race goes away, but we risk not being able to 
   destroy a cpupool with a zombie domain in it.

I don't think we want to be in either of these two situations. :-(

The race is never triggering so far, but I've already patches to
Credit2 (finishing implementing soft affinity for it) that make it a
real thing.

I think I can work around that specific case by doing something like
the below:

diff --git a/xen/include/xen/sched-if.h b/xen/include/xen/sched-if.h
index bc0e794..d91fd70 100644
--- a/xen/include/xen/sched-if.h
+++ b/xen/include/xen/sched-if.h
@@ -193,12 +193,9 @@ struct cpupool
 
 static inline cpumask_t* cpupool_domain_cpumask(struct domain *d)
 {
-    /*
-     * d->cpupool is NULL only for the idle domain, and no one should
-     * be interested in calling this for the idle domain.
-     */
-    ASSERT(d->cpupool != NULL);
-    return d->cpupool->cpu_valid;
+    /* No one should be interested in calling this for the idle domain! */
+    ASSERT(!is_idle_domain(d));
+    return d->cpupool ? d->cpupool->cpu_valid : cpupool0->cpu_valid;
 }
 
 #endif /* __XEN_SCHED_IF_H__ */

But even if that would be acceptable (what do you think?), I still
don't like to have the race there lurking. :-/

Therefore, I still think this patch is correct, but I'm up for
investigating further and finding a way to solve the "zombie in
cpupool" issue as well.

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

[-- Attachment #2: Type: text/plain, Size: 127 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

  reply	other threads:[~2016-07-15 11:53 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-14 16:17 [PATCH v2 0/2] xen: cpupool (small) improvement and (latent) bug fix Dario Faggioli
2016-07-14 16:18 ` [PATCH v2 1/2] xen: fix a (latent) cpupool-related race during domain destroy Dario Faggioli
2016-07-14 17:08   ` Andrew Cooper
2016-07-15  9:38   ` Juergen Gross
2016-07-15 10:14     ` Dario Faggioli
2016-07-15 10:36       ` Juergen Gross
2016-07-15 11:52         ` Dario Faggioli [this message]
2016-07-15 12:52           ` Juergen Gross
2016-07-15 14:23             ` Dario Faggioli
2016-07-18 14:03               ` Dario Faggioli
2016-07-18 14:09                 ` Juergen Gross
2016-07-28 17:29                   ` Dario Faggioli
2016-08-03 11:54                     ` George Dunlap
2016-08-03 12:27                   ` George Dunlap
2016-07-14 16:18 ` [PATCH v2 2/2] xen: cpupool: small optimization when moving between pools Dario Faggioli
2016-07-15  9:39   ` Juergen Gross

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1468583562.13039.96.camel@citrix.com \
    --to=dario.faggioli@citrix.com \
    --cc=George.Dunlap@eu.citrix.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=jbeulich@suse.com \
    --cc=jgross@suse.com \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.