From: Juergen Gross
Subject: Re: [PATCH] move domain to cpupool0 before destroying it
Date: Tue, 20 May 2014 06:28:54 +0200
Message-ID: <537ADA06.4070903@ts.fujitsu.com>
References: <1400129994-26297-1-git-send-email-juergen.gross@ts.fujitsu.com>
 <537A40C00200007800013C2C@mail.emea.novell.com>
To: George Dunlap, Jan Beulich
Cc: "xen-devel@lists.xen.org"
List-Id: xen-devel@lists.xenproject.org

On 19.05.2014 18:19, George Dunlap wrote:
> On Mon, May 19, 2014 at 4:34 PM, Jan Beulich wrote:
>>>>> On 19.05.14 at 16:57, wrote:
>>> On Thu, May 15, 2014 at 5:59 AM, Juergen Gross wrote:
>>>> Currently when a domain is destroyed it is removed from the domain_list
>>>> before all of its resources, including the cpupool membership, are freed.
>>>> This can lead to a situation where the domain is still a member of a
>>>> cpupool without for_each_domain_in_cpupool() (or even for_each_domain())
>>>> being able to find it any more. This in turn can result in rejection of
>>>> removing the last cpu from a cpupool, because there still seems to be a
>>>> domain in the cpupool, even though it can't be found by scanning through
>>>> all domains.
>>>>
>>>> This situation can be avoided by moving the domain to be destroyed to
>>>> cpupool0 first and then removing it from this cpupool BEFORE deleting it
>>>> from the domain_list. As cpupool0 is always active and a domain without
>>>> any cpupool membership is implicitly regarded as belonging to cpupool0,
>>>> this poses no problem.
>>>
>>> I'm a bit unclear why we're doing *both* a sched_move_domain(), *and*
>>> moving the "cpupool_rm_domain()".
>>>
>>> The sched_move_domain() only happens in domain_kill(), which is only
>>> initiated (AFAICT) by hypercall: does that mean that if a VM dies for
>>> some other reason (i.e., crashes), you may still have the same race?
>>> If not, then just this change alone should be sufficient. If it does,
>>> then this change is redundant.
>>
>> No, a crashed domain is merely being reported as crashed to the
>> tool stack. It's the tool stack that then actually invokes the killing
>> of it (or else e.g. "on_crash=preserve" would be rather hard to handle).
>
> Right, I see.
>
>>> Moving the cpupool_rm_domain() will change things so that there is now
>>> a period of time where the VM is not listed as being in cpupool0's
>>> pool, but may still be in that pool's scheduler's list of domains. Is
>>> that OK? If it is OK, it seems like that change alone should be
>>> sufficient.
>>
>> Moving this earlier was a requirement to avoid the race that the
>> earlier (much different) patch tried to address. Also I think the
>> patch's description already addresses that question (see the last
>> sentence of the quoted original mail contents above).
>
> But we're avoiding that race by instead moving the dying domain to
> cpupool0, which is never going to disappear.
>
> Or, moving the domain to cpupool0 *won't* sufficiently solve the race,
> and this will -- in which case, why are we bothering to move it to
> cpupool0 at all? Why not just remove it from the cpupool when
> removing it from the domain list? Wouldn't that also solve the
> original problem?
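
To make the intended ordering concrete, here is a simplified sketch
(illustrative only, not the literal patch; the two helper names below are
made up, and locking, error paths and the exact call sites are my own
assumptions):

/* Roughly the domain_kill() side: pull the dying domain back into
 * cpupool0. sched_move_domain() can fail, and domain_kill() can be
 * re-invoked, which is why this step cannot wait until destruction. */
static int kill_side(struct domain *d)          /* think: domain_kill() */
{
    int rc = sched_move_domain(d, cpupool0);

    if ( rc )
        return rc;       /* tool stack re-invokes the hypercall later */
    /* ... continue with the rest of the kill path ... */
    return 0;
}

/* Roughly the domain_destroy() side: drop the cpupool membership BEFORE
 * the domain is unlinked from domain_list, so a pool's domain count and
 * for_each_domain_in_cpupool() can no longer disagree. */
static void destroy_side(struct domain *d)      /* think: domain_destroy() */
{
    cpupool_rm_domain(d);
    /* ... only now remove d from domain_list ... */
}

With both halves in place a dying domain stays reachable via cpupool0 right
up to the point where its pool membership is dropped, so removing the last
cpu from another pool can no longer be blocked by a domain that is invisible
to for_each_domain_in_cpupool().
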
> Regarding the last bit, "...a domain without any cpupool membership is
> implicitly regarded as belonging to cpupool0...":
>
> 1. At a quick glance through the code, I couldn't find any evidence
> that this was the case; I couldn't find an instance where d->cpupool
> == NULL => assumed cpupool0.

xen/common/schedule.c:

#define DOM2OP(_d)    (((_d)->cpupool == NULL) ? &ops : ((_d)->cpupool->sched))

together with:

struct scheduler *scheduler_get_default(void)
{
    return &ops;
}

> 2. If in reality d->cpupool is never (or almost never) actually NULL,
> then the "implicitly belongs to cpupool0" assumption will bitrot.
> Having that kind of assumption without some way of making sure it's
> maintained is a bug waiting to happen.

That's not going to happen: this assumption is tested every time the idle
domain is being referenced by the scheduler...

>>> I've been trying to trace through the twisty little passages of domain
>>> destruction, and I'm still not quite sure: would it be OK if we just
>>> called sched_move_domain() in domain_destroy() instead of calling
>>> cpupool_rm_domain()?
>>
>> No, it would not, because then again we wouldn't be able to deal
>> with potential failure, needing re-invocation of the function.
>
> Right.


Juergen

-- 
Juergen Gross                  Principal Developer Operating Systems
PSO PM&D ES&S SWE OS6          Telephone: +49 (0) 89 62060 2932
Fujitsu                        e-mail: juergen.gross@ts.fujitsu.com
Mies-van-der-Rohe-Str. 8       Internet: ts.fujitsu.com
D-80807 Muenchen               Company details: ts.fujitsu.com/imprint.html