From: Andrew Cooper <andrew.cooper3@citrix.com>
To: George Dunlap <george.dunlap@eu.citrix.com>,
Julien Grall <julien.grall@linaro.org>,
Ian Campbell <Ian.Campbell@citrix.com>
Cc: jgross@suse.com,
Stefano Stabellini <stefano.stabellini@eu.citrix.com>,
Dario Faggioli <dario.faggioli@citrix.com>,
Tim Deegan <tim@xen.org>,
george.dunlap@citrix.com, xen-devel <xen-devel@lists.xen.org>
Subject: Re: Xen crashing when killing a domain with no VCPUs allocated
Date: Mon, 21 Jul 2014 11:42:43 +0100
Message-ID: <53CCEEA3.5080305@citrix.com>
In-Reply-To: <53CCEC64.7040304@eu.citrix.com>
On 21/07/14 11:33, George Dunlap wrote:
> On 07/18/2014 09:26 PM, Julien Grall wrote:
>>
>> On 18/07/14 17:39, Ian Campbell wrote:
>>> On Fri, 2014-07-18 at 14:27 +0100, Julien Grall wrote:
>>>> Hi all,
>>>>
>>>> I've been playing with the function alloc_vcpu on ARM, and I hit
>>>> one case where this function can fail.
>>>>
>>>> During domain creation, the toolstack will call DOMCTL_max_vcpus,
>>>> which may fail, for instance because alloc_vcpu didn't succeed. In
>>>> this case, the toolstack will call DOMCTL_domaindestroy, and I get
>>>> the stack trace below.
>>>>
>>>> It can be reproduced on Xen 4.5 (and I suspect also on Xen 4.4) by
>>>> returning an error in vcpu_initialize.
>>>>
>>>> I'm not sure how to fix it correctly.
>>> I think a simple check at the head of the function would be ok.
>>>
>>> Alternatively perhaps in sched_move_domain, which could either detect
>>> this or could detect a domain in pool0 being moved to pool0 and short
>>> circuit.
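To illustrate that alternative: a minimal, untested sketch of such a
short-circuit at the head of sched_move_domain(). The early-return
values and the d->cpupool comparison are my assumptions, not a tested
patch.

int sched_move_domain(struct domain *d, struct cpupool *c)
{
    /* No vcpus were ever allocated; nothing to move. */
    if ( d->vcpu == NULL || d->vcpu[0] == NULL )
        return 0;

    /* Moving a domain into the pool it already belongs to is a no-op. */
    if ( c == d->cpupool )
        return 0;

    /* ... existing body unchanged ... */
}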
>> I was thinking about the small fix below. If it's fine for everyone,
>> I can send a patch next week.
>>
>> diff --git a/xen/common/schedule.c b/xen/common/schedule.c
>> index e9eb0bc..c44d047 100644
>> --- a/xen/common/schedule.c
>> +++ b/xen/common/schedule.c
>> @@ -311,7 +311,7 @@ int sched_move_domain(struct domain *d, struct cpupool *c)
>>      }
>>
>>      /* Do we have vcpus already? If not, no need to update node-affinity */
>> -    if ( d->vcpu )
>> +    if ( d->vcpu && d->vcpu[0] != NULL )
>>          domain_update_node_affinity(d);
>
> So is the problem that we're allocating the vcpu array area, but not
> putting any vcpus in it?
The problem (as I recall) was that domain_create() got midway through
and alloc_vcpu(0) failed with -ENOMEM. Following that failure, the
toolstack called domain_destroy().
Having d->vcpu properly allocated but containing only NULL pointers is
a valid state to be in, especially in error or teardown paths.
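As a simplified sketch of how we end up in that state (illustrative
only; not the actual domain_create()/DOMCTL_max_vcpus code paths):

/* The vcpu pointer array is zero-allocated, so every slot starts NULL. */
d->vcpu = xzalloc_array(struct vcpu *, max_vcpus);
if ( d->vcpu == NULL )
    goto fail;

for ( i = 0; i < max_vcpus; i++ )
{
    /* alloc_vcpu() returns NULL on failure (e.g. -ENOMEM for vcpu 0). */
    if ( alloc_vcpu(d, i, 0) == NULL )
        goto fail; /* d->vcpu is allocated, but d->vcpu[0] is still NULL */
}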
>
> Overall it seems like those checks for the existence of vcpus should be
> moved into domain_update_node_affinity(). The ASSERT() there I think
> is just a sanity check to make sure we're not getting a ridiculous
> result out of our calculation; but of course if there actually are no
> vcpus, it's not ridiculous at all.
>
> One solution might be to change the ASSERT to
> ASSERT(!cpumask_empty(dom_cpumask) || !d->vcpu || !d->vcpu[0]). Then
> we could probably even remove the d->vcpu conditional when calling it.
If you were going along this line, the pointer checks are substantially
less expensive than cpumask_empty(), so the ||'s should be reordered.
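Concretely, something like this (sketch only; the same condition, just
with the cheap pointer tests first so the short-circuit avoids the
cpumask scan):

    ASSERT(!d->vcpu || !d->vcpu[0] || !cpumask_empty(dom_cpumask));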
However, I am not convinced that it is necessarily the best solution,
given my previous observation.
~Andrew
Thread overview: 11+ messages
2014-07-18 13:27 Xen crashing when killing a domain with no VCPUs allocated Julien Grall
2014-07-18 16:39 ` Ian Campbell
2014-07-18 20:26 ` Julien Grall
2014-07-21 10:33 ` George Dunlap
2014-07-21 10:42 ` Andrew Cooper [this message]
2014-07-21 10:49 ` George Dunlap
2014-07-21 11:46 ` Julien Grall
2014-07-21 12:57 ` Dario Faggioli
2014-07-23 15:31 ` Jan Beulich
2014-07-24 14:04 ` Julien Grall
2014-07-21 10:12 ` George Dunlap