* Domains not being destroyed properly
@ 2011-06-17 11:29 Anthony Wright
2011-06-17 18:12 ` Nathan March
` (2 more replies)
0 siblings, 3 replies; 23+ messages in thread
From: Anthony Wright @ 2011-06-17 11:29 UTC (permalink / raw)
To: xen-devel
If I create a domain with 'xl create -e' and then shut the domain down
with 'xl shutdown', then according to 'xl list' it gets stuck in state
'--ps-d', with a name of '(none)' and 0 RAM ('xm list' doesn't show the
domain).
If I destroy the domain with 'xl destroy' the domain is destroyed properly.
If I create a domain with 'xl create' (without the '-e' option) and then
use 'xl shutdown', the domain is destroyed properly.
Since the 'xl shutdown' & 'xl destroy' give different results I presume
this is a bug.
As an extra question... Is there a way to be notified when a domain is
destroyed other than leaving the 'xl create' process lying around? I'd
like to know when any domain is destroyed, and leaving a large number of
processes lying around just to be able to do this seems rather ugly. In
the past I've edited some of the Python code to achieve this, but my
patch doesn't work with 4.1, so I'm seeing if there's an official way to
do this before I work out a new patch.
thanks,
Anthony.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Domains not being destroyed properly
2011-06-17 11:29 Domains not being destroyed properly Anthony Wright
@ 2011-06-17 18:12 ` Nathan March
2011-06-21 11:51 ` Ian Campbell
2011-06-21 13:27 ` Ian Jackson
2 siblings, 0 replies; 23+ messages in thread
From: Nathan March @ 2011-06-17 18:12 UTC (permalink / raw)
To: Anthony Wright; +Cc: xen-devel
On 6/17/2011 4:29 AM, Anthony Wright wrote:
> As an extra question... Is there a way to be notified when a domain is
> destroyed other than leaving the 'xl create' process lying around? I'd
> like to know when any domain is destroyed, and leaving a large number of
> processes lying around just to be able to do this seems rather ugly. In
> the past I've edited some of the Python code to achieve this, but my
> patch doesn't work with 4.1, so I'm seeing if there's an official way to
> do this before I work out a new patch.
Not ideal, but my approach was to add hook scripts into the block device
script and consider a vm down if xvda1 has been removed:
Line 229 (below the FRONTEND_UUID):
/path/to/block add ${XENBUS_PATH}
Line 324 (below "remove)"):
/path/to/block remove ${XENBUS_PATH}
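xen1 scripts # cat block
#!/bin/bash
set -x -e
STATE="$1"      # "add" or "remove", as passed by the hook lines above
XS_PATH="$2"    # xenbus path; not named PATH, which would clobber the command search path
VMNAME=$(/usr/bin/xenstore-read "${XS_PATH}/domain")
DEV=$(/usr/bin/xenstore-read "${XS_PATH}/dev")
if [[ "$DEV" == "xvda1" ]]; then
    # Do stuff (e.g. act on $STATE and $VMNAME)
    :   # no-op placeholder so the otherwise empty branch still parses
fi
exit 0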
- Nathan
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Domains not being destroyed properly
2011-06-17 11:29 Domains not being destroyed properly Anthony Wright
2011-06-17 18:12 ` Nathan March
@ 2011-06-21 11:51 ` Ian Campbell
2011-06-21 12:57 ` Anthony Wright
2011-06-21 13:27 ` Ian Jackson
2 siblings, 1 reply; 23+ messages in thread
From: Ian Campbell @ 2011-06-21 11:51 UTC (permalink / raw)
To: Anthony Wright; +Cc: xen-devel@lists.xensource.com
On Fri, 2011-06-17 at 12:29 +0100, Anthony Wright wrote:
> If I create a domain with 'xl create -e', and then shut the domain down
> with 'xl shutdown' according to 'xl list' it gets stuck in state
> '--ps-d', with a name of '(none)' and 0 ram, ('xm list' doesn't show the
> domain).
>
> If I destroy the domain with 'xl destroy' the domain is destroyed properly.
>
> If I create a domain with 'xl create' (without the '-e' option) and then
> use 'xl shutdown', the domain is destroyed properly.
>
> Since the 'xl shutdown' & 'xl destroy' give different results I presume
> this is a bug.
The -e option to xl create means don't daemonize to babysit this domain.
One of the key bits of functionality of the daemon is to destroy the
domain after it has shut down. So if you use -e you need to do the destroy
manually. So effectively you have gotten what you asked for ;-)
> As an extra question... Is there a way to be notified when a domain is
> destroyed other than leaving the 'xl create' process lying around? I'd
> like to know when any domain is destroyed, and leaving a large number of
> processes lying around just to be able to do this seems rather ugly. In
> the past I've edited some of the Python code to achieve this, but my
> patch doesn't work with 4.1, so I'm seeing if there's an official way to
> do this before I work out a new patch.
You can take a xenstore watch on the @releaseDomain pseudo node, does
that do what you want?
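
A minimal sketch of such a watch, assuming the xenstore-watch command-line
tool is installed and prints one line per watch event. The handler name
on_release_events is hypothetical. Note that the event carries no domain id,
so the handler can only treat it as a hint to go and re-scan.

```sh
#!/bin/sh
# Sketch: react to each firing of the @releaseDomain watch.
# Assumption: xenstore-watch writes the watched node name to stdout per event.

# on_release_events (hypothetical name): read event lines on stdin and
# print one note per event. A real handler would re-scan the domain list here.
on_release_events() {
    while read -r node; do
        echo "domain state change signalled via ${node}"
    done
}

# Usage (needs a running xenstored, so it is left commented out):
#   xenstore-watch @releaseDomain | on_release_events
```

Watches normally fire once when they are registered, so the first line is
best treated as a "watch is now active" marker rather than a domain death.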
Ian.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Domains not being destroyed properly
2011-06-21 11:51 ` Ian Campbell
@ 2011-06-21 12:57 ` Anthony Wright
2011-06-21 13:13 ` Ian Jackson
2011-06-21 13:39 ` Konrad Rzeszutek Wilk
0 siblings, 2 replies; 23+ messages in thread
From: Anthony Wright @ 2011-06-21 12:57 UTC (permalink / raw)
To: Ian Campbell; +Cc: xen-devel@lists.xensource.com
On 21/06/2011 12:51, Ian Campbell wrote:
> On Fri, 2011-06-17 at 12:29 +0100, Anthony Wright wrote:
>> If I create a domain with 'xl create -e', and then shut the domain down
>> with 'xl shutdown' according to 'xl list' it gets stuck in state
>> '--ps-d', with a name of '(none)' and 0 ram, ('xm list' doesn't show the
>> domain).
>> If I destroy the domain with 'xl destroy' the domain is destroyed properly.
>>
>> If I create a domain with 'xl create' (without the '-e' option) and then
>> use 'xl shutdown', the domain is destroyed properly.
>> Since the 'xl shutdown' & 'xl destroy' give different results I presume
>> this is a bug.
> The -e option to xl create means don't daemonize to babysit this domain.
> One of the key bits of functionality of the daemon is to destroy the
> domain after it is shutdown. So if you use -e you need to do the destroy
> manually. So effectively you have gotten what you asked for ;-)
>
However I can't destroy the domain with 'xl destroy'. The command runs,
but the domain is still there afterwards.
>> As an extra question... Is there a way to be notified when a domain is
>> destroyed other than leaving the 'xl create' process lying around? I'd
>> like to know when any domain is destroyed, and leaving a large number of
>> processes lying around just to be able to do this seems rather ugly. In
>> the past I've edited some of the Python code to achieve this, but my
>> patch doesn't work with 4.1, so I'm seeing if there's an official way to
>> do this before I work out a new patch.
> You can take a xenstore watch on the @releaseDomain pseudo node, does
> that do what you want?
>
I found the @releaseDomain xenstore watch, and have modified my code to
use it. I was wondering if there's a mechanism to find out which domain
is dying. The only mechanism I can find at the moment is to scan through
all the domains to see which one isn't there any more. Assuming 'xl
destroy' works, it would make things a little easier. Is there a better
way to get the domain states other than doing a 'xl list' and parsing
the result?
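
The "scan and compare" approach described above can be sketched as follows.
This assumes the usual 'xl list' layout with a header line followed by
"Name ID Mem VCPUs State Time(s)" columns; the function names and snapshot
files are hypothetical.

```sh
#!/bin/sh
# Sketch: find domains that disappeared between two 'xl list' snapshots.

# parse_xl_list: read 'xl list' output on stdin, skip the header line,
# and print "domid name" per domain in a stable sort order.
parse_xl_list() {
    awk 'NR > 1 { print $2, $1 }' | LC_ALL=C sort
}

# vanished_domains BEFORE AFTER: print entries that are in file BEFORE
# but no longer in file AFTER. Both files must come from parse_xl_list.
vanished_domains() {
    LC_ALL=C comm -23 "$1" "$2"
}

# Usage (commented out because it needs a Xen host):
#   xl list | parse_xl_list > before.txt
#   ... wait for a @releaseDomain event ...
#   xl list | parse_xl_list > after.txt
#   vanished_domains before.txt after.txt
```

Because the "before" snapshot still holds the name, this also gives the
id-to-name mapping for a domain that has already gone.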
thanks,
Anthony.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Domains not being destroyed properly
2011-06-21 12:57 ` Anthony Wright
@ 2011-06-21 13:13 ` Ian Jackson
2011-06-21 13:39 ` Konrad Rzeszutek Wilk
1 sibling, 0 replies; 23+ messages in thread
From: Ian Jackson @ 2011-06-21 13:13 UTC (permalink / raw)
To: Anthony Wright; +Cc: xen-devel@lists.xensource.com, Ian Campbell
Anthony Wright writes ("Re: [Xen-devel] Domains not being destroyed properly"):
> On 21/06/2011 12:51, Ian Campbell wrote:
> > On Fri, 2011-06-17 at 12:29 +0100, Anthony Wright wrote:
> >> If I destroy the domain with 'xl destroy' the domain is destroyed
> >> properly.
...
> However I can't destroy the domain with 'xl destroy'. The command runs,
> but the domain is still there afterwards.
Which of the above statements is true ? Or, alternatively, what is
the difference between the two situations ?
Ian.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Domains not being destroyed properly
2011-06-17 11:29 Domains not being destroyed properly Anthony Wright
2011-06-17 18:12 ` Nathan March
2011-06-21 11:51 ` Ian Campbell
@ 2011-06-21 13:27 ` Ian Jackson
2 siblings, 0 replies; 23+ messages in thread
From: Ian Jackson @ 2011-06-21 13:27 UTC (permalink / raw)
To: Anthony Wright; +Cc: xen-devel
Anthony Wright writes ("[Xen-devel] Domains not being destroyed properly"):
> As an extra question... Is there a way to be notified when a domain is
> destroyed other than leaving the 'xl create' process lying around? I'd
> like to know when any domain is destroyed, and leaving a large number of
> processes lying around just to be able to do this seems rather ugly. In
> the past I've edited some of the Python code to achieve this, but my
> patch doesn't work with 4.1, so I'm seeing if there's an official way to
> do this before I work out a new patch.
Processes are cheap. But, the facilities should be there in libxl to
code this up as part of "xl" perhaps, or as a separate ad-hoc utility.
Ian.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Domains not being destroyed properly
2011-06-21 12:57 ` Anthony Wright
2011-06-21 13:13 ` Ian Jackson
@ 2011-06-21 13:39 ` Konrad Rzeszutek Wilk
2011-06-21 14:52 ` Anthony Wright
1 sibling, 1 reply; 23+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-06-21 13:39 UTC (permalink / raw)
To: Anthony Wright; +Cc: xen-devel@lists.xensource.com, Ian Campbell
On Tue, Jun 21, 2011 at 01:57:20PM +0100, Anthony Wright wrote:
> On 21/06/2011 12:51, Ian Campbell wrote:
> > On Fri, 2011-06-17 at 12:29 +0100, Anthony Wright wrote:
> >> If I create a domain with 'xl create -e', and then shut the domain down
> >> with 'xl shutdown' according to 'xl list' it gets stuck in state
> >> '--ps-d', with a name of '(none)' and 0 ram, ('xm list' doesn't show the
> >> domain).
> >> If I destroy the domain with 'xl destroy' the domain is destroyed properly.
> >>
> >> If I create a domain with 'xl create' (without the '-e' option) and then
> >> use 'xl shutdown', the domain is destroyed properly.
> >> Since the 'xl shutdown' & 'xl destroy' give different results I presume
> >> this is a bug.
> > The -e option to xl create means don't daemonize to babysit this domain.
> > One of the key bits of functionality of the daemon is to destroy the
> > domain after it is shutdown. So if you use -e you need to do the destroy
> > manually. So effectively you have gotten what you asked for ;-)
> >
> However I can't destroy the domain with 'xl destroy'. The command runs,
> but the domain is still there afterwards.
Is qemu-dm running for that domain? What happens if you kill it?
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Domains not being destroyed properly
2011-06-21 13:39 ` Konrad Rzeszutek Wilk
@ 2011-06-21 14:52 ` Anthony Wright
2011-06-21 15:44 ` Ian Campbell
0 siblings, 1 reply; 23+ messages in thread
From: Anthony Wright @ 2011-06-21 14:52 UTC (permalink / raw)
To: Konrad Rzeszutek Wilk; +Cc: xen-devel@lists.xensource.com, Ian Campbell
On 21/06/2011 14:39, Konrad Rzeszutek Wilk wrote:
> On Tue, Jun 21, 2011 at 01:57:20PM +0100, Anthony Wright wrote:
>> On 21/06/2011 12:51, Ian Campbell wrote:
>>> On Fri, 2011-06-17 at 12:29 +0100, Anthony Wright wrote:
>>>> If I create a domain with 'xl create -e', and then shut the domain down
>>>> with 'xl shutdown' according to 'xl list' it gets stuck in state
>>>> '--ps-d', with a name of '(none)' and 0 ram, ('xm list' doesn't show the
>>>> domain).
>>>> If I destroy the domain with 'xl destroy' the domain is destroyed properly.
>>>>
>>>> If I create a domain with 'xl create' (without the '-e' option) and then
>>>> use 'xl shutdown', the domain is destroyed properly.
>>>> Since the 'xl shutdown' & 'xl destroy' give different results I presume
>>>> this is a bug.
>>> The -e option to xl create means don't daemonize to babysit this domain.
>>> One of the key bits of functionality of the daemon is to destroy the
>>> domain after it is shutdown. So if you use -e you need to do the destroy
>>> manually. So effectively you have gotten what you asked for ;-)
>>>
>> However I can't destroy the domain with 'xl destroy'. The command runs,
>> but the domain is still there afterwards.
> Is qemu-dm running for that domain? what happens if you kill it?
Ah ha.... that makes the domain go away.
Ok, to be clear what I did....
xl create -e domain-details
xl shutdown <domain>
(End up with a domain called (null) in state --ps-d according to xl list)
kill the qemu-dm process associated with the domain
the domain goes away
note that I didn't need to do an 'xl destroy'; the kill of qemu-dm was
sufficient
If I do....
xl create -e domain-details
xl destroy <domain>
the domain goes away
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Domains not being destroyed properly
2011-06-21 14:52 ` Anthony Wright
@ 2011-06-21 15:44 ` Ian Campbell
2011-06-21 16:04 ` Anthony Wright
2011-06-21 16:26 ` Ian Jackson
0 siblings, 2 replies; 23+ messages in thread
From: Ian Campbell @ 2011-06-21 15:44 UTC (permalink / raw)
To: Anthony Wright; +Cc: xen-devel@lists.xensource.com, Konrad Rzeszutek Wilk
On Tue, 2011-06-21 at 15:52 +0100, Anthony Wright wrote:
> On 21/06/2011 14:39, Konrad Rzeszutek Wilk wrote:
> > On Tue, Jun 21, 2011 at 01:57:20PM +0100, Anthony Wright wrote:
> >> On 21/06/2011 12:51, Ian Campbell wrote:
> >>> On Fri, 2011-06-17 at 12:29 +0100, Anthony Wright wrote:
> >>>> If I create a domain with 'xl create -e', and then shut the domain down
> >>>> with 'xl shutdown' according to 'xl list' it gets stuck in state
> >>>> '--ps-d', with a name of '(none)' and 0 ram, ('xm list' doesn't show the
> >>>> domain).
> >>>> If I destroy the domain with 'xl destroy' the domain is destroyed properly.
> >>>>
> >>>> If I create a domain with 'xl create' (without the '-e' option) and then
> >>>> use 'xl shutdown', the domain is destroyed properly.
> >>>> Since the 'xl shutdown' & 'xl destroy' give different results I presume
> >>>> this is a bug.
> >>> The -e option to xl create means don't daemonize to babysit this domain.
> >>> One of the key bits of functionality of the daemon is to destroy the
> >>> domain after it is shutdown. So if you use -e you need to do the destroy
> >>> manually. So effectively you have gotten what you asked for ;-)
> >>>
> >> However I can't destroy the domain with 'xl destroy'. The command runs,
> >> but the domain is still there afterwards.
> > Is qemu-dm running for that domain? what happens if you kill it?
> Ah ha.... that makes the domain go away.
>
> Ok, to be clear what I did....
>
> xl create -e domain-details
> xl shutdown <domain>
> (End up with a domain called (null) in state --ps-d according to xl list
> kill the qemu-dm process associated with the domain
> the domain goes away
>
> note that I didn't need to do an 'xl destroy' the kill of qemu-dm was
> sufficient
>
> If I do....
>
> xl create -e domain-details
> xl destroy <domain>
> the domain goes away
The difference is that shutdown is only a request to the guest to shut
itself down, while destroy is the big hammer which kills the guest with
prejudice and tidies up any detritus (or at least that is the intention
IIRC).
I suspect that you will probably find that only the hypervisor parts of
the domain are being cleaned up by killing qemu -- e.g. the xenstore
entries and some other stuff (e.g. libxl user data) are not being
cleaned up.
If you do:
xl create -e domain-details
xl shutdown <domain>
xl destroy <domain>
Does everything get cleaned up?
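
For scripting that sequence without a babysitting daemon, a small helper to
test the state column of 'xl list' may be useful. This is only a sketch: it
assumes the six state letters are r, b, p, s, c and d (running, blocked,
paused, shutdown, crashed, dying), and the name state_has is hypothetical.

```sh
#!/bin/sh
# Sketch: test whether a flag letter is set in an 'xl list' state string.

# state_has FLAGS LETTER: succeed if LETTER appears in FLAGS,
# e.g. FLAGS="--ps-d" has p, s and d set.
state_has() {
    case "$1" in
        *"$2"*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Usage (commented out because it needs a Xen host; $dom is a domain id):
#   xl shutdown "$dom"
#   until state_has "$(xl list "$dom" | awk 'NR == 2 { print $5 }')" s; do
#       sleep 1
#   done
#   xl destroy "$dom"
```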
If you don't want to run an xl daemon for each domain (although I agree
with IanJ that they should be pretty cheap) I think a new xl sub-command
which caused it to behave as a single daemon reaping domains would be a
fine thing to have in our toolkit.
Ian.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Domains not being destroyed properly
2011-06-21 15:44 ` Ian Campbell
@ 2011-06-21 16:04 ` Anthony Wright
2011-06-22 7:56 ` Ian Campbell
2011-06-21 16:26 ` Ian Jackson
1 sibling, 1 reply; 23+ messages in thread
From: Anthony Wright @ 2011-06-21 16:04 UTC (permalink / raw)
To: Ian Campbell; +Cc: xen-devel@lists.xensource.com, Konrad Rzeszutek Wilk
On 21/06/2011 16:44, Ian Campbell wrote:
> On Tue, 2011-06-21 at 15:52 +0100, Anthony Wright wrote:
>> Ok, to be clear what I did....
>> xl create -e domain-details
>> xl shutdown <domain>
>> (End up with a domain called (null) in state --ps-d according to xl list
>> kill the qemu-dm process associated with the domain
>> the domain goes away
>>
>> note that I didn't need to do an 'xl destroy' the kill of qemu-dm was
>> sufficient
>>
>> If I do....
>>
>> xl create -e domain-details
>> xl destroy <domain>
>> the domain goes away
> The difference is that shutdown is only a request to the guest to shut
> itself down, while destroy is the big hammer which kills the guest with
> prejudice and tidies up any detritus (or at least that is the intention
> IIRC).
>
> I suspect that you will probably find that only the hypervisor parts of
> the domain are being cleaned up by killing qemu -- e.g. the xenstore
> entries and some other stuff (e.g. libxl user data) are not being
> cleaned up.
>
> If you do:
> xl create -e domain-details
> xl shutdown <domain>
> xl destroy <domain>
>
> Does everything get cleaned up?
If I do:
xl create -e domain-details
xl shutdown <domain>
[ Wait for the domain to go to state --ps-d ]
xl destroy <domain>
The 'xl destroy' has no effect. The only thing that has an effect is
killing the qemu-dm processes which destroys the domain even without
running 'xl destroy'.
> If you don't want to run an xl daemon for each domain (although I agree
> with IanJ that they should be pretty cheap) I think a new xl sub-command
> which caused it to behave as a single daemon reaping domains would be a
> fine thing to have in our toolkit.
That would be really helpful: something that would inform me when a
domain died, supplying its domain id & domain name. The name is really
important because that's my reference, and I can't convert from a
domain id to a domain name after the domain has died. I used to have a
script that sat in /etc/xen/scripts/domain-destroyed which was called on
domain death with the dom-id and dom-name as arguments.
Anthony
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Domains not being destroyed properly
2011-06-21 15:44 ` Ian Campbell
2011-06-21 16:04 ` Anthony Wright
@ 2011-06-21 16:26 ` Ian Jackson
2011-06-21 16:42 ` Ian Campbell
2011-06-21 19:35 ` Tim Deegan
1 sibling, 2 replies; 23+ messages in thread
From: Ian Jackson @ 2011-06-21 16:26 UTC (permalink / raw)
To: Ian Campbell
Cc: xen-devel@lists.xensource.com, Anthony Wright,
Konrad Rzeszutek Wilk
Ian Campbell writes ("Re: [Xen-devel] Domains not being destroyed properly"):
> I suspect that you will probably find that only the hypervisor parts of
> the domain are being cleaned up by killing qemu -- e.g. the xenstore
> entries and some other stuff (e.g. libxl user data) are not being
> cleaned up.
It is puzzling that the hypervisor domain gets destroyed at all. What
is making the destroy domain hypercall ?
Ian.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Domains not being destroyed properly
2011-06-21 16:26 ` Ian Jackson
@ 2011-06-21 16:42 ` Ian Campbell
2011-06-21 17:01 ` Keir Fraser
2011-06-21 19:35 ` Tim Deegan
1 sibling, 1 reply; 23+ messages in thread
From: Ian Campbell @ 2011-06-21 16:42 UTC (permalink / raw)
To: Ian Jackson
Cc: xen-devel@lists.xensource.com, Anthony Wright,
Konrad Rzeszutek Wilk
On Tue, 2011-06-21 at 17:26 +0100, Ian Jackson wrote:
> Ian Campbell writes ("Re: [Xen-devel] Domains not being destroyed properly"):
> > I suspect that you will probably find that only the hypervisor parts of
> > the domain are being cleaned up by killing qemu -- e.g. the xenstore
> > entries and some other stuff (e.g. libxl user data) are not being
> > cleaned up.
>
> It is puzzling that the hypervisor domain gets destroyed at all. What
> is making the destroy domain hypercall ?
I wondered that and handwaved it to myself as being the
HLT-with-interrupt-disabled logic kicking in and shutting down the guest
but now that I look it seems that this just triggers a domain shutdown
event to dom0 and doesn't actually destroy the domain (which is actually
as expected before I handwaved). So I'm a bit surprised again now too.
Ian.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Domains not being destroyed properly
2011-06-21 16:42 ` Ian Campbell
@ 2011-06-21 17:01 ` Keir Fraser
0 siblings, 0 replies; 23+ messages in thread
From: Keir Fraser @ 2011-06-21 17:01 UTC (permalink / raw)
To: Ian Campbell, Ian Jackson
Cc: xen-devel@lists.xensource.com, Anthony Wright,
Konrad Rzeszutek Wilk
On 21/06/2011 17:42, "Ian Campbell" <Ian.Campbell@eu.citrix.com> wrote:
> On Tue, 2011-06-21 at 17:26 +0100, Ian Jackson wrote:
>> Ian Campbell writes ("Re: [Xen-devel] Domains not being destroyed properly"):
>>> I suspect that you will probably find that only the hypervisor parts of
>>> the domain are being cleaned up by killing qemu -- e.g. the xenstore
>>> entries and some other stuff (e.g. libxl user data) are not being
>>> cleaned up.
>>
>> It is puzzling that the hypervisor domain gets destroyed at all. What
>> is making the destroy domain hypercall ?
>
> I wondered that and handwaved it to myself as being the
> HLT-with-interrupt-disabled logic kicking in and shutting down the guest
> but now that I look it seems that this just triggers a domain shutdown
> event to dom0 and doesn't actually destroy the domain (which is actually
> as expected before I handwaved). So I'm a bit surprised again now too.
A domain can only be destroyed via the destroydomain domctl from dom0.
-- Keir
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Domains not being destroyed properly
2011-06-21 16:26 ` Ian Jackson
2011-06-21 16:42 ` Ian Campbell
@ 2011-06-21 19:35 ` Tim Deegan
2011-06-22 8:02 ` Ian Campbell
1 sibling, 1 reply; 23+ messages in thread
From: Tim Deegan @ 2011-06-21 19:35 UTC (permalink / raw)
To: Ian Jackson
Cc: Ian Campbell, xen-devel@lists.xensource.com, Anthony Wright,
Konrad Rzeszutek Wilk
Hi,
At 17:26 +0100 on 21 Jun (1308677212), Ian Jackson wrote:
> Ian Campbell writes ("Re: [Xen-devel] Domains not being destroyed properly"):
> > I suspect that you will probably find that only the hypervisor parts of
> > the domain are being cleaned up by killing qemu -- e.g. the xenstore
> > entries and some other stuff (e.g. libxl user data) are not being
> > cleaned up.
>
> It is puzzling that the hypervisor domain gets destroyed at all. What
> is making the destroy domain hypercall ?
I suspect that xl destroy is doing it, but the domain can't actually
disappear until the last references to its memory are dropped by
qemu-dm.
Tim.
--
Tim Deegan <tim@xen.org>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd. (Company #02937203, SL9 0BG)
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Domains not being destroyed properly
2011-06-21 16:04 ` Anthony Wright
@ 2011-06-22 7:56 ` Ian Campbell
2011-06-24 12:54 ` Anthony Wright
0 siblings, 1 reply; 23+ messages in thread
From: Ian Campbell @ 2011-06-22 7:56 UTC (permalink / raw)
To: Anthony Wright; +Cc: xen-devel@lists.xensource.com, Konrad Rzeszutek Wilk
On Tue, 2011-06-21 at 17:04 +0100, Anthony Wright wrote:
> On 21/06/2011 16:44, Ian Campbell wrote:
> > On Tue, 2011-06-21 at 15:52 +0100, Anthony Wright wrote:
> >> Ok, to be clear what I did....
> >> xl create -e domain-details
> >> xl shutdown <domain>
> >> (End up with a domain called (null) in state --ps-d according to xl list
> >> kill the qemu-dm process associated with the domain
> >> the domain goes away
> >>
> >> note that I didn't need to do an 'xl destroy' the kill of qemu-dm was
> >> sufficient
> >>
> >> If I do....
> >>
> >> xl create -e domain-details
> >> xl destroy <domain>
> >> the domain goes away
> > The difference is that shutdown is only a request to the guest to shut
> > itself down, while destroy is the big hammer which kills the guest with
> > prejudice and tidies up any detritus (or at least that is the intention
> > IIRC).
> >
> > I suspect that you will probably find that only the hypervisor parts of
> > the domain are being cleaned up by killing qemu -- e.g. the xenstore
> > entries and some other stuff (e.g. libxl user data) are not being
> > cleaned up.
> >
> > If you do:
> > xl create -e domain-details
> > xl shutdown <domain>
> > xl destroy <domain>
> >
> > Does everything get cleaned up?
> If I do:
> xl create -e domain-details
> xl shutdown <domain>
> [ Wait for the domain to go to state --ps-d ]
> xl destroy <domain>
>
> The 'xl destroy' has no effect. The only thing that has an effect is
> killing the qemu-dm processes which destroys the domain even without
> running 'xl destroy'.
This sounds like a bug in "xl destroy" to me.
>
> > If you don't want to run an xl daemon for each domain (although I agree
> > with IanJ that they should be pretty cheap) I think a new xl sub-command
> > which caused it to behave as a single daemon reaping domains would be a
> > fine thing to have in our toolkit.
> That would be really helpful, something that would inform me when a
> > domain died supplying its domain id & domain name - the name is really
> > important because that's my reference, and I can't convert from a
> domain id to a domain name after the domain has died.
In this case I think you can do so before the call to "xl destroy" since
that is when all the xenstore gubbins needed to do the translation is
called. In the past xend probably got there before you could and nuked
it all.
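
One way to do that translation early is to record the name while the
xenstore entry still exists. A rough sketch, assuming the name lives at
/local/domain/<domid>/name and that xenstore-read is available; the function
names and the cache file are hypothetical, and XENSTORE_READ is only a hook
so the reader command can be swapped out.

```sh
#!/bin/sh
# Sketch: remember domid -> name mappings before the domain is destroyed.

# Command used to read xenstore; overridable, defaults to xenstore-read.
XENSTORE_READ=${XENSTORE_READ:-xenstore-read}

# domain_name DOMID: print the name stored in xenstore for DOMID.
domain_name() {
    "$XENSTORE_READ" "/local/domain/$1/name"
}

# record_name DOMID CACHE: append "DOMID NAME" to the file CACHE.
record_name() {
    printf '%s %s\n' "$1" "$(domain_name "$1")" >> "$2"
}

# lookup_name DOMID CACHE: print the recorded name for DOMID, if any.
lookup_name() {
    awk -v id="$1" '$1 == id { print $2 }' "$2"
}
```

Calling record_name after 'xl shutdown' but before 'xl destroy' keeps the
name available for whatever notification runs once the domain is gone.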
BTW, I was trying to imply that we would accept a patch which
implemented this new subcommand ;-)
> I used to have a
> script that sat in /etc/xen/scripts/domain-destroyed which was called on
> domain death with the dom-id and dom-name as arguments.
Including some sort of callout mechanism on shutdown but before destroy
would be a reasonable feature for a daemon which did this kind of global
babysitting to have.
Ian.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Domains not being destroyed properly
2011-06-21 19:35 ` Tim Deegan
@ 2011-06-22 8:02 ` Ian Campbell
0 siblings, 0 replies; 23+ messages in thread
From: Ian Campbell @ 2011-06-22 8:02 UTC (permalink / raw)
To: Tim Deegan
Cc: xen-devel@lists.xensource.com, Ian Jackson, Anthony Wright,
Konrad Rzeszutek Wilk
On Tue, 2011-06-21 at 20:35 +0100, Tim Deegan wrote:
> Hi,
>
> At 17:26 +0100 on 21 Jun (1308677212), Ian Jackson wrote:
> > Ian Campbell writes ("Re: [Xen-devel] Domains not being destroyed properly"):
> > > I suspect that you will probably find that only the hypervisor parts of
> > > the domain are being cleaned up by killing qemu -- e.g. the xenstore
> > > entries and some other stuff (e.g. libxl user data) are not being
> > > cleaned up.
> >
> > It is puzzling that the hypervisor domain gets destroyed at all. What
> > is making the destroy domain hypercall ?
>
> I suspect that xl destroy is doing it, but the domain can't actually
> disappear until the last references to its memory are dropped by
> qemu-dm.
Anthony reckons that just killing the qemu after an "xl shutdown" is
sufficient, without an "xl destroy". I can't see anywhere that "xl
shutdown" would issue the destroydomain hypercall though (which is
expected, since it shouldn't be doing it).
Ian.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Domains not being destroyed properly
2011-06-22 7:56 ` Ian Campbell
@ 2011-06-24 12:54 ` Anthony Wright
2011-06-24 13:01 ` Tim Deegan
2011-06-24 13:45 ` Ian Campbell
0 siblings, 2 replies; 23+ messages in thread
From: Anthony Wright @ 2011-06-24 12:54 UTC (permalink / raw)
To: Ian Campbell; +Cc: xen-devel@lists.xensource.com, Konrad Rzeszutek Wilk
On 22/06/2011 08:56, Ian Campbell wrote:
> On Tue, 2011-06-21 at 17:04 +0100, Anthony Wright wrote:
>> On 21/06/2011 16:44, Ian Campbell wrote:
>>> If you don't want to run an xl daemon for each domain (although I agree
>>> with IanJ that they should be pretty cheap) I think a new xl sub-command
>>> which caused it to behave as a single daemon reaping domains would be a
>>> fine thing to have in our toolkit.
>> That would be really helpful, something that would inform me when a
>> domain died supplying its domain id & domain name - the name is really
>> important because that's my reference, and I can't convert from a
>> domain id to a domain name after the domain has died.
> In this case I think you can do so before the call to "xl destroy" since
> that is when all the xenstore gubbins needed to do the translation is
> called. In the past xend probably got there before you could and nuked
> it all.
>
> BTW, I was trying to imply that we would accept a patch which
> implemented this new subcommand ;-)
The problem is that I don't think there's enough information available
from watching @releaseDomain to implement a reaper well. The
xenstore-watch on @releaseDomain only tells you that a domain has died
(to make things more complicated I actually get two notifications when a
domain gets shut down); it doesn't tell you which domain has died. A
reaper would have to maintain a list of domains as they were before the
notification to compare against the list after the notification to be
able to issue a notification for the domain that has died. If that list
got out of sync it would start issuing incorrect domain death notifications.
Looking at the watch code, while the code in xenstored_domain has the
domain id & name available, there doesn't seem to be a way to add
arguments to @releaseDomain to let it tell the watchers which domain
has been released, which would avoid having to maintain this list.
Therefore the simplest change would seem to be to modify
xenstored_domain.c so that if a script existed (e.g.
/etc/xen/scripts/domain-destroyed) at domain destruction this script was
called with the domain id & name as arguments. Would it be acceptable?
Anthony.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Domains not being destroyed properly
2011-06-24 12:54 ` Anthony Wright
@ 2011-06-24 13:01 ` Tim Deegan
2011-06-24 13:36 ` Anthony Wright
2011-06-24 13:45 ` Ian Campbell
1 sibling, 1 reply; 23+ messages in thread
From: Tim Deegan @ 2011-06-24 13:01 UTC (permalink / raw)
To: Anthony Wright
Cc: Ian Campbell, xen-devel@lists.xensource.com,
Konrad Rzeszutek Wilk
Hi,
At 13:54 +0100 on 24 Jun (1308923697), Anthony Wright wrote:
> The problem is that I don't think there's enough information available
> from watching @releaseDomain to implement a reaper well. The
> xenstore-watch on @releaseDomain only tells you that a domain has died
> (to make things more complicated I actually get two notifications when a
> domain gets shut down); it doesn't tell you which domain has died. A
> reaper would have to maintain a list of domains as they were before the
> notification to compare against the list after the notification to be
> able to issue a notification for the domain that has died.
That sounds awfully fragile. It could get a list of all living domains
and reap qemus/xenstore data for any domain not in the list. (Or, more
safely, enumerate all qemus and xenstore entries and check for each
whether the domain is alive).
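
The second variant (enumerate what exists, then check each entry for
liveness) might look like the sketch below. It assumes xenstore-list on
/local/domain yields one domain id per line; the function name stale_domids
and the file of live ids are hypothetical.

```sh
#!/bin/sh
# Sketch: report domain ids that still have entries but are no longer alive.

# stale_domids LIVE_FILE: read candidate domain ids on stdin (one per line)
# and print those that do not appear as a whole line in LIVE_FILE.
stale_domids() {
    while read -r id; do
        grep -qxF -- "$id" "$1" || echo "$id"
    done
}

# Usage (commented out because it needs a Xen host):
#   xl list | awk 'NR > 1 { print $2 }' > live.txt
#   xenstore-list /local/domain | stale_domids live.txt
```

This keeps no state between events, so it cannot drift out of sync the way a
remembered "before" list could.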
> Therefore the simplest change would seem to be to modify
> xenstored_domain.c so that if a script existed (e.g.
> /etc/xen/scripts/domain-destroyed) at domain destruction this script was
> called with the domain id & name as arguments. Would it be acceptable?
I don't think so. At the moment it's possible (with a bit of effort) to
run xenstored in a chroot jail or in its own VM; plumbing it into the
rest of the toolstack that way would break that.
Cheers,
Tim.
--
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd. (Company #02937203, SL9 0BG)
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Domains not being destroyed properly
2011-06-24 13:01 ` Tim Deegan
@ 2011-06-24 13:36 ` Anthony Wright
2011-06-24 14:21 ` Ian Campbell
0 siblings, 1 reply; 23+ messages in thread
From: Anthony Wright @ 2011-06-24 13:36 UTC (permalink / raw)
To: Tim Deegan
Cc: Ian Campbell, xen-devel@lists.xensource.com,
Konrad Rzeszutek Wilk
On 24/06/2011 14:01, Tim Deegan wrote:
> Hi,
>
> At 13:54 +0100 on 24 Jun (1308923697), Anthony Wright wrote:
>> The problem is that I don't think there's enough information available
>> from watching @releaseDomain to implement a reaper well. The
>> xenstore-watch on @releaseDomain only tells you that a domain has died
>> (to make things more complicated I actually get two notifications when a
>> domain gets shut down), it doesn't tell you which domain has died. A
>> reaper would have to maintain a list of domains as they were before the
>> notification to compare against the list after the notification to be
>> able to issue a notification for the domain that has died.
> That sounds awfully fragile. It could get a list of all living domains
> and reap qemus/xenstore data for any domain not in the list. (Or, more
> safely, enumerate all qemus and xenstore entries and check for each
> whether the domain is alive).
Is this information still available after a domain has been destroyed? I
was expecting that the watcher of @releaseDomain was only notified of a
domain destruction after the domain had been completely destroyed so
there would be no information about the domain available any more. Are
you saying there would still be qemus/xenstore information available for
the domain, and if so how does it get tidied up?
Anthony.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Domains not being destroyed properly
2011-06-24 12:54 ` Anthony Wright
2011-06-24 13:01 ` Tim Deegan
@ 2011-06-24 13:45 ` Ian Campbell
2011-06-24 14:15 ` Anthony Wright
1 sibling, 1 reply; 23+ messages in thread
From: Ian Campbell @ 2011-06-24 13:45 UTC (permalink / raw)
To: Anthony Wright; +Cc: xen-devel@lists.xensource.com, Konrad Rzeszutek Wilk
On Fri, 2011-06-24 at 13:54 +0100, Anthony Wright wrote:
> The problem is that I don't think there's enough information available
> from watching @releaseDomain to implement a reaper well. The
> xenstore-watch on @releaseDomain only tells you that a domain has died
> (to make things more complicated I actually get two notifications when a
> domain gets shut down), it doesn't tell you which domain has died.
This has proven to be sufficient for all toolstacks I know of.
> A
> reaper would have to maintain a list of domains as they were before the
> notification to compare against the list after the notification to be
> able to issue a notification for the domain that has died. If that list
> got out of sync it would start issuing incorrect domain death notifications.
You don't need to maintain a list, simply call libxl_list_domain() and
for each returned domain examine the flags and shutdown reason to
determine if it is dead (and why), dying or still running.
Since you are explicitly arranging that nothing else will destroy
domains once they die, they will always be in that list when you get
to them.
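
The stateless scan described above might look roughly like the following
sketch. It does not call libxl; DomInfo is a hypothetical stand-in for the
per-domain record a toolstack would get back, and its field names
(shutdown, dying, shutdown_reason) are assumptions modelled on the flags
mentioned here, not the actual libxl structure definition.

```python
from collections import namedtuple

# Hypothetical per-domain record; not the real libxl structure.
DomInfo = namedtuple("DomInfo", "domid shutdown dying shutdown_reason")


def classify(info):
    """Map one domain record to 'dying', 'dead (<reason>)' or 'running'."""
    if info.dying:
        return "dying"
    if info.shutdown:
        return "dead (%s)" % info.shutdown_reason
    return "running"


def scan(domains):
    """Classify every domain in a freshly fetched list; no state is kept."""
    return {d.domid: classify(d) for d in domains}


# Example list, invented for illustration only.
domains = [
    DomInfo(domid=0, shutdown=False, dying=False, shutdown_reason=None),
    DomInfo(domid=4, shutdown=True, dying=False, shutdown_reason="poweroff"),
    DomInfo(domid=5, shutdown=False, dying=True, shutdown_reason=None),
]
states = scan(domains)
```

Each scan works only from the list it is handed, so there is no
before/after snapshot to keep consistent between notifications.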
> Looking at the watch code while the code in xenstored_domain has the
> domain id & name available, there doesn't seem to be the ability to be
> able to add arguments to @releaseDomain to let it tell the watchers
> which domain has been released to avoid having to maintain this list.
It shouldn't matter due to the above, but it is possible to add extra
data to the vector included in the watch (other than XS_WATCH_PATH and
XS_WATCH_TOKEN). IOW you can add more stuff to the "data" vector in
add_event. Linux's drivers/xen/xenfs/xenbus.c:watch_fired() even
supports exporting the extra to userspace.
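
To illustrate what consuming such an extended vector could look like on
the receiving side, here is a small sketch. XS_WATCH_PATH and
XS_WATCH_TOKEN are the usual first two slots of a watch event; everything
after them is an assumption, since an unmodified xenstored sends only the
path and the token. The domid and name in the example are hypothetical
extras that would only exist if xenstored were patched to add them.

```python
# Index of the path and token within a watch event vector.
XS_WATCH_PATH = 0
XS_WATCH_TOKEN = 1


def split_watch_event(vec):
    """Split a watch event vector into (path, token, extra).

    'extra' is whatever follows the token; it is an empty list for
    events that carry only the two standard entries.
    """
    if len(vec) <= XS_WATCH_TOKEN:
        raise ValueError("watch event needs at least a path and a token")
    extra = list(vec[XS_WATCH_TOKEN + 1:])
    return vec[XS_WATCH_PATH], vec[XS_WATCH_TOKEN], extra


# A standard event, and a hypothetical extended one carrying domid and name.
plain = split_watch_event(["@releaseDomain", "reaper"])
extended = split_watch_event(["@releaseDomain", "reaper", "12", "guest-a"])
```

A watcher written this way keeps working against an unpatched xenstored,
because it treats the extra entries as optional.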
> Therefore the simplest change would seem to be to modify
> xenstored_domain.c so that if a script existed (e.g.
> /etc/xen/scripts/domain-destroyed) at domain destruction this script was
> called with the domain id & name as arguments. Would it be acceptable?
>
> Anthony.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Domains not being destroyed properly
2011-06-24 13:45 ` Ian Campbell
@ 2011-06-24 14:15 ` Anthony Wright
2011-06-24 14:24 ` Ian Campbell
0 siblings, 1 reply; 23+ messages in thread
From: Anthony Wright @ 2011-06-24 14:15 UTC (permalink / raw)
To: Ian Campbell; +Cc: xen-devel@lists.xensource.com, Konrad Rzeszutek Wilk
On 24/06/2011 14:45, Ian Campbell wrote:
> On Fri, 2011-06-24 at 13:54 +0100, Anthony Wright wrote:
>> A reaper would have to maintain a list of domains as they were before the
>> notification to compare against the list after the notification to be
>> able to issue a notification for the domain that has died. If that list
>> got out of sync it would start issuing incorrect domain death notifications.
> You don't need to maintain a list, simply call libxl_list_domain() and
> for each returned domain examine the flags and shutdown reason to
> determine if it is dead (and why), dying or still running.
>
> Since you are explicitly arranging that nothing else will destroy
> domains once they die so they will always be in that list when you get
> to them.
My original thought on reaper was that it was only responsible for
helping notify user space applications of a domain's death, but you
seem to be suggesting that it should also be responsible for tidying up
after a domain death as well. Isn't some other part of the system
already fulfilling this role?
It seems that for 'xl create' this tidy up is already being done, and I
had presumed the code for 'xl create -e' tidy up already exists but
wasn't working correctly. If you're saying the 'xl create -e' tidy up
code needs to be written, that's probably beyond my knowledge of xen,
and in any case I'm not sure it should be in reaper. Reaper should
notify user space of any domain destruction whether created by 'xl
create -e' or 'xl create'. If reaper did the tidy up for 'xl create -e'
it would have access to information needed to pass to the user space
applications, however for an 'xl create' domain death this information
may no longer be available since 'xl create' may have already tidied up.
>> Looking at the watch code while the code in xenstored_domain has the
>> domain id & name available, there doesn't seem to be the ability to be
>> able to add arguments to @releaseDomain to let it tell the watchers
>> which domain has been released to avoid having to maintain this list.
> It shouldn't matter due to the above but it is possible to add extra
> data to the vector included in the watch other than XS_WATCH_PATH and
> XS_WATCH_TOKEN). IOW you can add more stuff to the "data" vector in
> add_event. Linux's drivers/xen/xenfs/xenbus.c:watch_fired() even
> supports exporting the extra to userspace.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Domains not being destroyed properly
2011-06-24 13:36 ` Anthony Wright
@ 2011-06-24 14:21 ` Ian Campbell
0 siblings, 0 replies; 23+ messages in thread
From: Ian Campbell @ 2011-06-24 14:21 UTC (permalink / raw)
To: Anthony Wright
Cc: Tim Deegan, xen-devel@lists.xensource.com, Konrad Rzeszutek Wilk
On Fri, 2011-06-24 at 14:36 +0100, Anthony Wright wrote:
> On 24/06/2011 14:01, Tim Deegan wrote:
> > Hi,
> >
> > At 13:54 +0100 on 24 Jun (1308923697), Anthony Wright wrote:
> >> The problem is that I don't think there's enough information available
> >> from watching @releaseDomain to implement a reaper well. The
> >> xenstore-watch on @releaseDomain only tells you that a domain has died
> >> (to make things more complicated I actually get two notifications when a
> >> domain gets shut down), it doesn't tell you which domain has died. A
> >> reaper would have to maintain a list of domains as they were before the
> >> notification to compare against the list after the notification to be
> >> able to issue a notification for the domain that has died.
> > That sounds awfully fragile. It could get a list of all living domains
> > and reap qemus/xenstore data for any domain not in the list. (Or, more
> > safely, enumerate all qemus and xenstore entries and check for each
> > whether the domain is alive).
> Is this information still available after a domain has been destroyed? I
> was expecting that the watcher of @releaseDomain was only notified of a
> domain destruction after the domain had been completely destroyed
@releaseDomain fires after the domain has shutdown but _before_ the
domain is destroyed. This is necessary because @releaseDomain is the
signal to the toolstack that it is time to destroy the domain.
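
Given that ordering, a single reaper's handler for @releaseDomain could be
sketched as below: on each notification it lists the domains, and for every
one that has shut down it first sends the notification (the domain's
details are still available at that point) and then destroys it. The three
callables and the in-memory table are hypothetical stand-ins, not real Xen
or libxl interfaces.

```python
def on_release_domain(list_domains, notify, destroy):
    """Handle one @releaseDomain event; return the domids that were reaped."""
    reaped = []
    for dom in list_domains():
        if dom["shutdown"]:
            # Notify before destroying, while id, name and reason still exist.
            notify(dom["domid"], dom["name"], dom["shutdown_reason"])
            destroy(dom["domid"])
            reaped.append(dom["domid"])
    return reaped


# In-memory stand-in for the domain table (invented for illustration).
table = {
    1: {"domid": 1, "name": "web", "shutdown": True,
        "shutdown_reason": "poweroff"},
    2: {"domid": 2, "name": "db", "shutdown": False,
        "shutdown_reason": None},
}
log = []


def list_domains():
    # Return a snapshot so destroy() can safely modify the table.
    return list(table.values())


def notify(domid, name, reason):
    log.append((domid, name, reason))


def destroy(domid):
    del table[domid]


# Two notifications for the same shutdown, as reported earlier in the thread.
first = on_release_domain(list_domains, notify, destroy)
second = on_release_domain(list_domains, notify, destroy)
```

The second call finds nothing left to reap, so receiving two notifications
for one shutdown is harmless with this structure.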
> so
> there would be no information about the domain available any more. Are
> you saying there would still be qemus/xenstore information available for
> the domain, and if so how does it get tidied up?
>
> Anthony.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Domains not being destroyed properly
2011-06-24 14:15 ` Anthony Wright
@ 2011-06-24 14:24 ` Ian Campbell
0 siblings, 0 replies; 23+ messages in thread
From: Ian Campbell @ 2011-06-24 14:24 UTC (permalink / raw)
To: Anthony Wright; +Cc: xen-devel@lists.xensource.com, Rzeszutek Wilk, Konrad
On Fri, 2011-06-24 at 15:15 +0100, Anthony Wright wrote:
> On 24/06/2011 14:45, Ian Campbell wrote:
> > On Fri, 2011-06-24 at 13:54 +0100, Anthony Wright wrote:
> >> A reaper would have to maintain a list of domains as they were before the
> >> notification to compare against the list after the notification to be
> >> able to issue a notification for the domain that has died. If that list
> >> got out of sync it would start issuing incorrect domain death notifications.
> > You don't need to maintain a list, simply call libxl_list_domain() and
> > for each returned domain examine the flags and shutdown reason to
> > determine if it is dead (and why), dying or still running.
> >
> > Since you are explicitly arranging that nothing else will destroy
> > domains once they die so they will always be in that list when you get
> > to them.
> My original thought on reaper was that it was only responsible for
> helping notify user space applications of a domain's death, but you
> seem to be suggesting that it should also be responsible for tidying up
> after a domain death as well. Isn't some other part of the system
> already fulfilling this role?
It would be xl, but you are using xl create -e which explicitly turns
off that behaviour.
> It seems that for 'xl create' this tidy up is already being done, and I
> had presumed the code for 'xl create -e' tidy up already exists but
> wasn't working correctly.
No, "xl create -e" means "don't babysit this domain waiting for it to
die. I, the user, will take care of cleaning it up manually when it
dies".
You said you were using -e because you didn't want a load of processes
sitting around to fulfil this requirement and so I suggested you might
like a single reaper daemon which takes care of it instead.
Ian.
> If you're saying the 'xl create -e' tidy up
> code needs to be written, that's probably beyond my knowledge of xen,
> and in any case I'm not sure it should be in reaper. Reaper should
> notify user space of any domain destruction whether created by 'xl
> create -e' or 'xl create'. If reaper did the tidy up for 'xl create -e'
> it would have access to information needed to pass to the user space
> applications, however for an 'xl create' domain death this information
> may no longer be available since 'xl create' may have already tidied up.
^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2011-06-24 14:24 UTC | newest]
Thread overview: 23+ messages
2011-06-17 11:29 Domains not being destroyed properly Anthony Wright
2011-06-17 18:12 ` Nathan March
2011-06-21 11:51 ` Ian Campbell
2011-06-21 12:57 ` Anthony Wright
2011-06-21 13:13 ` Ian Jackson
2011-06-21 13:39 ` Konrad Rzeszutek Wilk
2011-06-21 14:52 ` Anthony Wright
2011-06-21 15:44 ` Ian Campbell
2011-06-21 16:04 ` Anthony Wright
2011-06-22 7:56 ` Ian Campbell
2011-06-24 12:54 ` Anthony Wright
2011-06-24 13:01 ` Tim Deegan
2011-06-24 13:36 ` Anthony Wright
2011-06-24 14:21 ` Ian Campbell
2011-06-24 13:45 ` Ian Campbell
2011-06-24 14:15 ` Anthony Wright
2011-06-24 14:24 ` Ian Campbell
2011-06-21 16:26 ` Ian Jackson
2011-06-21 16:42 ` Ian Campbell
2011-06-21 17:01 ` Keir Fraser
2011-06-21 19:35 ` Tim Deegan
2011-06-22 8:02 ` Ian Campbell
2011-06-21 13:27 ` Ian Jackson