From: "Charles Coffing" <ccoffing@novell.com>
To: Andrew Warfield <andrew.warfield@cl.cam.ac.uk>,
	John Byrne <john.l.byrne@hp.com>
Cc: xen-devel@lists.xensource.com
Subject: Re: vbd flushing during migration?
Date: Tue, 01 Aug 2006 15:28:20 -0400
Message-ID: <44CF56C2.D169.003C.0@novell.com>
In-Reply-To: <44CE83B1.1090605@hp.com>
[-- Attachment #1: Type: text/plain, Size: 3173 bytes --]
I've got a patch in our tree that does (basically) what John is
describing.

The exact bug we hit was that an "xm shutdown -w vm" did not wait until
the vbds were cleared out before returning. So now I wait until the
backend/vbd nodes go away before returning.

This could probably be done more cleanly with watches, and should be
abstracted out to be sure it applies equally to migration, and so forth.
But for the sake of discussion, the patch is attached.
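As a rough illustration of what an abstracted-out wait might look like
(this is not the attached patch), here is a minimal sketch. It assumes a
caller-supplied list_devices callable that returns the entries under a
xenstore path, the role xstransact.List plays in the patch. The helper
name wait_for_backend_gone and the timeout, interval, sleep and clock
parameters are hypothetical.

```python
import time


def wait_for_backend_gone(list_devices, domid, timeout=30.0, interval=1.0,
                          sleep=time.sleep, clock=time.monotonic):
    """Poll until no vbd backend entries remain for domid.

    list_devices is an assumed callable: it takes a xenstore path and
    returns a list of child names (empty or None once the path is gone).
    Returns True when the directory is empty, False if timeout expires.
    """
    path = '/local/domain/0/backend/vbd/%d' % domid
    deadline = clock() + timeout
    while True:
        if not list_devices(path):
            return True          # backend directory has been torn down
        if clock() >= deadline:
            return False         # give up rather than block forever
        sleep(interval)
```

Taking the listing function as a parameter would let the same helper be
called from both the shutdown and the migration paths, and the timeout
keeps a stuck backend from blocking the caller indefinitely.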
-Charles
>>> On Mon, Jul 31, 2006 at 4:26 PM, in message <44CE83B1.1090605@hp.com>,
John Byrne <john.l.byrne@hp.com> wrote:
> It would be a bit ugly, but mostly straightforward to watch for the
> destruction of the vbds (or all devices) after the destroyDomain() is
> done and then send an all-clear. (The last time I looked there wasn't
> a waitForDomainDestroy() anywhere, so it would probably be best to write
> one.) This would guarantee correctness, which is the most important thing.
>
> The problem I see with that strategy is the effect on downtime during a
> live-move. Ideally you'd like to start the vbd cleanup when the final
> suspend is done and hope to parallelize any final device operations
> with the final pass of live-move. How to do that and play nice with
> domain destruction on the normal path and handle errors seems a lot less
> clear to me.
>
> So, are you just ignoring the notion of minimizing downtime for the
> moment or is there something I'm missing?
>
> John
>
> Andrew Warfield wrote:
>> It's slightly more than a flush that's required. The migration
>> protocol needs to be extended so that execution on the target host
>> doesn't start until all of the outstanding (i.e. issued by the
>> backend) block requests have been either cancelled or acknowledged.
>> This should be pretty straightforward given that the backend driver
>> ref counts a blkif's state based on pending requests, and won't tear
>> down the backend directory in xenstore until all the outstanding
>> requests have cleared. All that is likely required is to have the
>> migration code register watches on the backend vbd directories, and
>> wait for them to disappear before giving the all-clear to the new
>> host.
>>
>> We've talked about this enough to know how to fix it, but haven't had
>> a chance to hack it up. (I think Julian has looked into the problem a
>> bit for blktap, but not yet done a general fix.) Patches would
>> certainly be welcome though. ;)
>>
>> a.
>>
>> On 7/31/06, John Byrne <john.l.byrne@hp.com> wrote:
>>>
>>> Hi,
>>>
>>> I don't see any obvious flush to disk taking place for vbd's on the
>>> source host in XendCheckpoint.py before the domain is started on the new
>>> host. Is there a guarantee that all written data is on disk somewhere
>>> else or is something needed?
>>>
>>> Thanks,
>>>
>>> John Byrne
>>>
>>>
>>> _______________________________________________
>>> Xen-devel mailing list
>>> Xen-devel@lists.xensource.com
>>> http://lists.xensource.com/xen-devel
>>>
>>
[-- Attachment #2: xen-shutdown-wait.diff --]
[-- Type: application/octet-stream, Size: 1287 bytes --]
Index: xen-unstable/tools/python/xen/xm/shutdown.py
===================================================================
--- xen-unstable.orig/tools/python/xen/xm/shutdown.py
+++ xen-unstable/tools/python/xen/xm/shutdown.py
@@ -52,6 +52,8 @@ def shutdown(opts, doms, mode, wait):
     for d in doms:
         server.xend.domain.shutdown(d, mode)
     if wait:
+        from xen.xend.xenstore.xstransact import xstransact
+        doms_to_cleanup = doms[:]
         while doms:
             alive = server.xend.domains(0)
             dead = []
@@ -62,6 +64,17 @@ def shutdown(opts, doms, mode, wait):
                 opts.info("Domain %s terminated" % d)
                 doms.remove(d)
             time.sleep(1)
+        # Now all the domains are terminated, but wait until the devices are
+        # cleaned up.
+        for d in doms_to_cleanup:
+            info = server.xend.domain(d)
+            domid = int(sxp.child_value(info, 'domid', '-1'))
+            device_class_path = '/local/domain/0/backend/vbd/%d/' % domid
+            while True:
+                devices = xstransact.List(device_class_path)
+                if len(devices) == 0:
+                    break
+                time.sleep(1)
         opts.info("All domains terminated")
 
 def shutdown_mode(opts):
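The polling loop in the patch could instead be driven by watches, as
suggested earlier in the thread. The following is only a sketch of that
idea, not xend code: FakeStore is a hypothetical in-memory stand-in for
xenstore (its list, watch and remove methods are invented for this
example), and wait_for_vbds is an assumed helper name. A real version
would register watches through the xenstore bindings instead.

```python
import threading


class FakeStore:
    """Hypothetical in-memory stand-in for xenstore (illustration only)."""

    def __init__(self, entries):
        self._entries = dict(entries)   # path -> list of child names
        self._watchers = {}             # path -> list of callbacks
        self._lock = threading.Lock()

    def list(self, path):
        with self._lock:
            return list(self._entries.get(path, []))

    def watch(self, path, callback):
        with self._lock:
            self._watchers.setdefault(path, []).append(callback)
        callback(path)  # fire once on registration, as xenstore watches do

    def remove(self, path, child):
        with self._lock:
            self._entries[path].remove(child)
            callbacks = list(self._watchers.get(path, []))
        for cb in callbacks:
            cb(path)


def wait_for_vbds(store, domids, timeout=None):
    """Block until every domain's vbd backend directory is empty.

    Returns True on the all-clear, False if timeout (seconds) expires.
    """
    pending = set('/local/domain/0/backend/vbd/%d' % d for d in domids)
    if not pending:
        return True
    lock = threading.Lock()
    all_clear = threading.Event()

    def on_change(path):
        # Called on registration and on every change under path.
        if not store.list(path):
            with lock:
                pending.discard(path)
                if not pending:
                    all_clear.set()

    for path in list(pending):
        store.watch(path, on_change)
    return all_clear.wait(timeout)
```

Compared with sleeping one second per iteration, the waiter wakes as soon
as the last backend entry disappears, which matters for the downtime
concern raised about live migration.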