* earlier remove the backend of tapdisk device in xenstore to release the resource allocated in backend driver lies in dom0'kernel
@ 2010-04-22 8:08 James (song wei)
2010-04-22 11:13 ` Jan Beulich
0 siblings, 1 reply; 8+ messages in thread
From: James (song wei) @ 2010-04-22 8:08 UTC (permalink / raw)
To: xen-devel
Blktapctl thread will use qemu-dm connection instead of tapdisk-ioemu in the
case of FV VM. We found the resource like memory allocated for this Guest
can't be free for backend driver couldn't be closed in qemu-dm.
This patch would remove the backend of tapdisk device earlier in xenstore
to triger qemu-dm to notify the backend driver to release the resource
allocated.
I have tested this patch at the case of
1, save && restore
2, destory && shutdown
3, snapshot
regards,
-James (Song Wei)
Signed-off-by: James ( Song Wei ) <jsong@novell.com>
diff -r fadf63ab49e7 tools/python/xen/xend/XendDomainInfo.py
--- a/tools/python/xen/xend/XendDomainInfo.py Mon Apr 19 17:57:28 2010
+0100
+++ b/tools/python/xen/xend/XendDomainInfo.py Thu Apr 22 15:54:01 2010
+0800
@@ -2406,8 +2406,13 @@
def _releaseDevices(self, suspend = False):
"""Release all domain's devices. Nothrow guarantee."""
+ t = xstransact("%s/device" % self.vmpath)
if self.image:
try:
+ for dev in t.list('tap'):
+ log.debug("Early removing %s", dev);
+ self.getDeviceController('tap').destroyDevice(dev,
True)
+ time.sleep(0.1)
log.debug("Destroying device model")
self.image.destroyDeviceModel()
except Exception, e:
@@ -2416,9 +2421,10 @@
log.debug("No device model")
log.debug("Releasing devices")
- t = xstransact("%s/device" % self.vmpath)
try:
for devclass in XendDevices.valid_devices():
+ if devclass is 'tap':
+ continue
for dev in t.list(devclass):
try:
log.debug("Removing %s", dev);
http://old.nabble.com/file/p28325456/tapdisk-close.patch tapdisk-close.patch
--
View this message in context: http://old.nabble.com/earlier-remove-the-backend-of-tapdisk-device-in-xenstore-to-release-the-resource-allocated-in-backend-driver-lies-in-dom0%27kernel-tp28325456p28325456.html
Sent from the Xen - Dev mailing list archive at Nabble.com.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: earlier remove the backend of tapdisk device in xenstore to release the resource allocated in backend driver lies in dom0'kernel
2010-04-22 8:08 earlier remove the backend of tapdisk device in xenstore to release the resource allocated in backend driver lies in dom0'kernel James (song wei)
@ 2010-04-22 11:13 ` Jan Beulich
2010-04-22 17:14 ` Jim Fehlig
2010-04-23 3:24 ` James (song wei)
0 siblings, 2 replies; 8+ messages in thread
From: Jan Beulich @ 2010-04-22 11:13 UTC (permalink / raw)
To: James Song; +Cc: xen-devel
>>> "James (song wei)" <jsong@novell.com> 22.04.10 10:08 >>>
>--- a/tools/python/xen/xend/XendDomainInfo.py Mon Apr 19 17:57:28 2010 +0100
>+++ b/tools/python/xen/xend/XendDomainInfo.py Thu Apr 22 15:54:01 2010 +0800
>@@ -2406,8 +2406,13 @@
>
> def _releaseDevices(self, suspend = False):
> """Release all domain's devices. Nothrow guarantee."""
>+ t = xstransact("%s/device" % self.vmpath)
> if self.image:
> try:
>+ for dev in t.list('tap'):
>+ log.debug("Early removing %s", dev);
>+ self.getDeviceController('tap').destroyDevice(dev, True)
>+ time.sleep(0.1)
> log.debug("Destroying device model")
> self.image.destroyDeviceModel()
> except Exception, e:
>@@ -2416,9 +2421,10 @@
> log.debug("No device model")
>
> log.debug("Releasing devices")
>- t = xstransact("%s/device" % self.vmpath)
> try:
> for devclass in XendDevices.valid_devices():
>+ if devclass is 'tap':
>+ continue
> for dev in t.list(devclass):
> try:
> log.debug("Removing %s", dev);
This seems more like a hack than a solution: Surely qemu-dm gets
sent some sort of signal to shut down and clean up. The question
thus really is why that cleanup doesn't include cleaning up blktap
related resources. That is, I would expect the fix to be in qemu-dm,
or at most in the xend code that reaps qemu-dm.
More generally any solution should be generic, or it should be
explained properly why the device class needs treatment
different from any of the other ones (and, for this specific
case, why moving the device destruction a little ahead will
now *guarantee* that the cleanup can happen as expected).
Jan
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: earlier remove the backend of tapdisk device in xenstore to release the resource allocated in backend driver lies in dom0'kernel
2010-04-22 11:13 ` Jan Beulich
@ 2010-04-22 17:14 ` Jim Fehlig
2010-04-23 1:48 ` James (song wei)
2010-04-23 3:24 ` James (song wei)
1 sibling, 1 reply; 8+ messages in thread
From: Jim Fehlig @ 2010-04-22 17:14 UTC (permalink / raw)
To: Jan Beulich; +Cc: xen-devel, James Song
Jan Beulich wrote:
>>>> "James (song wei)" <jsong@novell.com> 22.04.10 10:08 >>>
>>>>
>> --- a/tools/python/xen/xend/XendDomainInfo.py Mon Apr 19 17:57:28 2010 +0100
>> +++ b/tools/python/xen/xend/XendDomainInfo.py Thu Apr 22 15:54:01 2010 +0800
>> @@ -2406,8 +2406,13 @@
>>
>> def _releaseDevices(self, suspend = False):
>> """Release all domain's devices. Nothrow guarantee."""
>> + t = xstransact("%s/device" % self.vmpath)
>> if self.image:
>> try:
>> + for dev in t.list('tap'):
>> + log.debug("Early removing %s", dev);
>> + self.getDeviceController('tap').destroyDevice(dev, True)
>> + time.sleep(0.1)
>> log.debug("Destroying device model")
>> self.image.destroyDeviceModel()
>> except Exception, e:
>> @@ -2416,9 +2421,10 @@
>> log.debug("No device model")
>>
>> log.debug("Releasing devices")
>> - t = xstransact("%s/device" % self.vmpath)
>> try:
>> for devclass in XendDevices.valid_devices():
>> + if devclass is 'tap':
>> + continue
>> for dev in t.list(devclass):
>> try:
>> log.debug("Removing %s", dev);
>>
>
> This seems more like a hack than a solution: Surely qemu-dm gets
> sent some sort of signal to shut down and clean up. The question
> thus really is why that cleanup doesn't include cleaning up blktap
> related resources. That is, I would expect the fix to be in qemu-dm,
> or at most in the xend code that reaps qemu-dm.
>
Agreed. qemu-dm should clean up these resources on shutdown. AFAICT,
it currently relies on receiving CTRMSG_CLOSE from blktapctrl (see
ioemu-dir/hw/xen_blktap.c), which it may never receive before exiting.
Regards,
Jim
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: earlier remove the backend of tapdisk device in xenstore to release the resource allocated in backend driver lies in dom0'kernel
2010-04-22 17:14 ` Jim Fehlig
@ 2010-04-23 1:48 ` James (song wei)
0 siblings, 0 replies; 8+ messages in thread
From: James (song wei) @ 2010-04-23 1:48 UTC (permalink / raw)
To: xen-devel
Jim Fehlig wrote:
>
>>Agreed. qemu-dm should clean up these resources on shutdown. AFAICT,
>>it currently relies on receiving CTRMSG_CLOSE from blktapctrl (see
>>ioemu-dir/hw/xen_blktap.c), which it may never receive before exiting.
>
> No, I don't think so. The root cause in this issue is qemu-dm exit too
> earlier.
> Qemu-dm has been destroyed before blktapctrl send "CTRMSG_CLOSE",
> moreover, removing the backend node in xenstore trigers the callback in
> blktapctrl to send "CTRMSG_CLOSE".
> It's xend to decide who ought to be occured earlier between sending SIGHUP
> to qemu-dm and removing the backend node in xenstore. Obviously, xend is
> the role of manager. So I think fix this issue in xend is appropriate.
> In addition, If we really want to fix it in qemu, we have to intercept
> the exit signal sent by kill() in xend. I don't think intercept this
> singal in qemu for this point is proper.
>
> -Jame (Song Wei)
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
>
>
--
View this message in context: http://old.nabble.com/earlier-remove-the-backend-of-tapdisk-device-in-xenstore-to-release-the-resource-allocated-in-backend-driver-lies-in-dom0%27kernel-tp28325456p28336546.html
Sent from the Xen - Dev mailing list archive at Nabble.com.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: earlier remove the backend of tapdisk device in xenstore to release the resource allocated in backend driver lies in dom0'kernel
2010-04-22 11:13 ` Jan Beulich
2010-04-22 17:14 ` Jim Fehlig
@ 2010-04-23 3:24 ` James (song wei)
2010-04-23 8:13 ` Jan Beulich
1 sibling, 1 reply; 8+ messages in thread
From: James (song wei) @ 2010-04-23 3:24 UTC (permalink / raw)
To: xen-devel
Jan Beulich wrote:
>
>
>
>
>> More generally any solution should be generic, or it should be
>> explained properly why the device class needs treatment
>> different from any of the other ones (and, for this specific
>> case, why moving the device destruction a little ahead will
>> now *guarantee* that the cleanup can happen as expected).
>
>
> Moving the device destruction a little ahead of killing qemu-dm would
> triger blktapctl thread send CTRMSG_CLOSE to "qemu-dm" before it exit.
> And then, qemu-dm would notify backend driver to release resouce by
> calling release() of driver through closing the opened device file of
> "/dev/xen/blktapN"
>
> thanks Jim and Jan,
>
> -James
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
>
>
--
View this message in context: http://old.nabble.com/earlier-remove-the-backend-of-tapdisk-device-in-xenstore-to-release-the-resource-allocated-in-backend-driver-lies-in-dom0%27kernel-tp28325456p28337027.html
Sent from the Xen - Dev mailing list archive at Nabble.com.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: earlier remove the backend of tapdisk device in xenstore to release the resource allocated in backend driver lies in dom0'kernel
2010-04-23 3:24 ` James (song wei)
@ 2010-04-23 8:13 ` Jan Beulich
2010-04-23 8:54 ` Keir Fraser
0 siblings, 1 reply; 8+ messages in thread
From: Jan Beulich @ 2010-04-23 8:13 UTC (permalink / raw)
To: James Song; +Cc: xen-devel
>>> "James (song wei)" <jsong@novell.com> 23.04.10 05:24 >>>
>Jan Beulich wrote:
>> More generally any solution should be generic, or it should be
>> explained properly why the device class needs treatment
>> different from any of the other ones (and, for this specific
>> case, why moving the device destruction a little ahead will
>> now *guarantee* that the cleanup can happen as expected).
>
>
> Moving the device destruction a little ahead of killing qemu-dm would
> triger blktapctl thread send CTRMSG_CLOSE to "qemu-dm" before it exit.
> And then, qemu-dm would notify backend driver to release resouce by
> calling release() of driver through closing the opened device file of
> "/dev/xen/blktapN"
Even though Keir already applied your patch, I continue to disagree:
The only function you modified is XendDomainInfo._releaseDevices(),
which itself doesn't kill qemu-dm afaict (I didn't actually spot where
it gets killed). Hence the only effect you achieve is that the window
in time for the CTRLMSG_CLOSE to arrive gets enlarged (the insertion
of time.sleep(0.1) is also a sign of just using heuristics instead of
proper sequencing of events). You still can't guarantee that the
message will arrive in time (i.e. if qemu-dm doesn't get scheduled
soon enough), and hence you just decreased the likelihood of the
original issue.
Imo, there's no way around doing proper cleanup from qemu-dm's
signal handler, or there needs to be better handshaking between
xend and qemu-dm.
Jan
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: earlier remove the backend of tapdisk device in xenstore to release the resource allocated in backend driver lies in dom0'kernel
2010-04-23 8:13 ` Jan Beulich
@ 2010-04-23 8:54 ` Keir Fraser
2010-04-23 9:09 ` James Song
0 siblings, 1 reply; 8+ messages in thread
From: Keir Fraser @ 2010-04-23 8:54 UTC (permalink / raw)
To: Jan Beulich, James Song; +Cc: xen-devel@lists.xensource.com
On 23/04/2010 09:13, "Jan Beulich" <JBeulich@novell.com> wrote:
>> Moving the device destruction a little ahead of killing qemu-dm would
>> triger blktapctl thread send CTRMSG_CLOSE to "qemu-dm" before it exit.
>> And then, qemu-dm would notify backend driver to release resouce by
>> calling release() of driver through closing the opened device file of
>> "/dev/xen/blktapN"
>
> Even though Keir already applied your patch, I continue to disagree:
I agreed with you and Jim, and subsequently reverted it.
K.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: earlier remove the backend of tapdisk device in xenstore to release the resource allocated in backend driver lies in dom0'kernel
2010-04-23 8:54 ` Keir Fraser
@ 2010-04-23 9:09 ` James Song
0 siblings, 0 replies; 8+ messages in thread
From: James Song @ 2010-04-23 9:09 UTC (permalink / raw)
To: Keir Fraser, Jan Beulich; +Cc: xen-devel@lists.xensource.com
[-- Attachment #1.1: Type: text/plain, Size: 598 bytes --]
>>> Keir Fraser <keir.fraser@eu.citrix.com> 2010-4-23 16:54 >>>
>> Moving the device destruction a little ahead of killing qemu-dm would
>> triger blktapctl thread send CTRMSG_CLOSE to "qemu-dm" before it exit.
>> And then, qemu-dm would notify backend driver to release resouce by
>> calling release() of driver through closing the opened device file of
>> "/dev/xen/blktapN"
>
> Even though Keir already applied your patch, I continue to disagree:
>>I agreed with you and Jim, and subsequently reverted it.
>> K.
Ok, fine, I'll rewrite the patch and send again soon.
[-- Attachment #1.2: Type: text/html, Size: 965 bytes --]
[-- Attachment #2: Type: text/plain, Size: 138 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2010-04-23 9:09 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-04-22 8:08 earlier remove the backend of tapdisk device in xenstore to release the resource allocated in backend driver lies in dom0'kernel James (song wei)
2010-04-22 11:13 ` Jan Beulich
2010-04-22 17:14 ` Jim Fehlig
2010-04-23 1:48 ` James (song wei)
2010-04-23 3:24 ` James (song wei)
2010-04-23 8:13 ` Jan Beulich
2010-04-23 8:54 ` Keir Fraser
2010-04-23 9:09 ` James Song
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.