From: Ian Campbell <Ian.Campbell@citrix.com>
To: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [PATCH v3 00/18] libxl: domain save/restore: run in a separate process
Date: Wed, 13 Jun 2012 11:30:42 +0100
Message-ID: <1339583442.24104.178.camel@zakaz.uk.xensource.com>
In-Reply-To: <1339582969.24104.175.camel@zakaz.uk.xensource.com>
On Wed, 2012-06-13 at 11:22 +0100, Ian Campbell wrote:
> On Wed, 2012-06-13 at 09:59 +0100, Ian Campbell wrote:
> > On Fri, 2012-06-08 at 18:34 +0100, Ian Jackson wrote:
> > > This is v3 of my series to asyncify save/restore, rebased to current
> > > tip, retested, and with all comments addressed.
> >
> > There's quite a lot of combinations which need testing here (PV, HVM,
> > HVM w/ stub dm, old vs new qemu etc etc), which of those have you tried?
> >
> > I tried a simple localhost migrate of a PV guest and:
> > # xl -vvv migrate d32-1 localhost
> > migration target: Ready to receive domain.
> > Saving to migration stream new xl format (info 0x0/0x0/3541)
> > libxl: debug: libxl.c:722:libxl_domain_suspend: ao 0x8069720: create: how=(nil) callback=(nil) poller=0x80696c8
> > Loading new save file <incoming migration stream> (new xl fmt info 0x0/0x0/3541)
> > Savefile contains xl domain config
> > libxl: debug: libxl_dom.c:969:libxl__toolstack_save: domain=2 toolstack data size=8
> > libxl: debug: libxl.c:745:libxl_domain_suspend: ao 0x8069720: inprogress: poller=0x80696c8, flags=i
> > libxl-save-helper: debug: starting save: Success
> > xc: detail: Had 0 unexplained entries in p2m table
> > xc: Saving memory: iter 0 (last sent 0 skipped 0): 0/131072 0%
> >
> > at which point it appears to just stop.
> >
> > # strace -p 2872 # /usr/lib/xen/bin/libxl-save-helper --save-domain 8 2 0 0 1 0 0 12 8 72
> > Process 2872 attached - interrupt to quit
> > write(8, 0xb5d31000, 1974272^C <unfinished ...>
> > Process 2872 detached
> > # strace -p 2866 # /usr/lib/xen/bin/libxl-save-helper --restore-domain 0 3 1 0 2 0 0 1 0 0 0
>
> The first zero here is restore_fd, I think. But I read in the comment in
> the helper:
> > + * The helper talks on stdin and stdout, in binary in machine
> > + * endianness. The helper speaks first, and only when it has a
> > + * callback to make. It writes a 16-bit number being the message
> > + * length, and then the message body.
>
> So restore_fd == stdin => running two protocols over the same fd?
Oh, right, migrate-receive takes the migration fd on stdin, doesn't it,
so that's where the 0 comes from. I still suspect it is wrong. We might
need to dup the input onto a safe fd?
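
To make "dup the input onto a safe fd" concrete, here is a minimal sketch
in plain POSIX C. It is an illustration only: the function name is made up
for this example and is not an existing libxl or xl function, and whether
such a call belongs in xl migrate-receive or inside libxl is left open. The
idea is just to move the migration stream off fds 0-2 so that the helper's
stdin/stdout protocol and the migration data cannot end up on the same fd.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of libxl: move `fd` to a descriptor
 * numbered 3 or higher so that it cannot collide with stdin, stdout or
 * stderr. Returns the new descriptor, or -1 with errno set.
 */
static int move_fd_clear_of_stdio(int fd)
{
    int newfd;

    assert(fd >= 0);
    if (fd > STDERR_FILENO)
        return fd;              /* already clear of 0, 1 and 2 */

    /* F_DUPFD returns the lowest free descriptor >= the third argument. */
    newfd = fcntl(fd, F_DUPFD, STDERR_FILENO + 1);
    if (newfd < 0)
        return -1;

    close(fd);                  /* the low slot is now free for reuse */
    return newfd;
}
```

With something like this, a receiver that was handed the stream on fd 0
would pass the returned descriptor as restore_fd instead of 0.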

BTW, since I've been ctrl-c'ing "xl migrate" a bunch, I noticed that we
seem to leak an "xl migrate-receive" and the restore-side helper
process. Probably pre-existing, but I thought it worth mentioning.

>
> > Process 2866 attached - interrupt to quit
> > read(0, ^C <unfinished ...>
> > # strace -p 4070 # xl -vvv migrate d32-1 localhost
> > Process 4070 attached - interrupt to quit
> > restart_syscall(<... resuming interrupted call ...>
> > # strace -p 4074 # xl migrate-receive
> > Process 4074 attached - interrupt to quit
> > restart_syscall(<... resuming interrupted call ...>
> >
> > So the saver seems to be blocked writing to fd 8, which is argv[1] == io_fd.
> >
> > Also FWIW:
> > # xl list
> > Name ID Mem VCPUs State Time(s)
> > Domain-0 0 511 4 r----- 24.5
> > d32-1 2 128 4 -b---- 0.4
> > d32-1--incoming 3 0 0 --p--- 0.0
> >
> > /var/log/xen/xl-d32-1.log is just "Waiting for domain d32-1 (domid 9) to
> > die [pid 4045]" (nb: this was a newer attempt than the ones above, to be
> > sure I was looking at the right log, so the domid's don't match, 9 ==
> > d32-1 not the incoming one). There is no xl log for the incoming domain.
> >
> > Also it'd be worth pinging/CCing Shriram next time to get him to sanity
> > test the Remus cases too.
> >
> > I'm in the middle of reviewing #5/19 (the meat), I'll keep going
> > although I doubt I'll spot the cause of this...
> >
> > Ian.
> >
> >
> >
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@lists.xen.org
> > http://lists.xen.org/xen-devel