* [Qemu-devel] Postcopy failures
From: Gary Hook @ 2014-10-24 16:48 UTC
To: qemu-devel@nongnu.org
I see this went by:
On 07/10/2014 12:29, Dr. David Alan Gilbert wrote:
> You mean something like this (untested) ?
>
> if (mis->postcopy_ram_state != POSTCOPY_RAM_INCOMING_NONE) {
>     if (mis->postcopy_ram_state == POSTCOPY_RAM_INCOMING_ADVISE) {
>         /*
>          * Where a migration had postcopy enabled (and thus went to advise)
>          * but managed to complete within the precopy period
>          */
>         postcopy_ram_incoming_cleanup(mis);
>     } else if (ret >= 0) {
>         /*
>          * Postcopy was started, cleanup should happen at the end of the
>          * postcopy thread.
>          */
>         DPRINTF("process_incoming_migration_co: exiting main branch");
>         return;
>     }
> }
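The control flow of the quoted snippet can be restated as a small decision function, which makes the three outcomes easier to see. This is only a sketch under stated assumptions: `POSTCOPY_RAM_INCOMING_NONE` and `POSTCOPY_RAM_INCOMING_ADVISE` come from the quoted code, while `SKETCH_POSTCOPY_STARTED`, the `IncomingAction` enum and `incoming_action()` are hypothetical names invented for illustration and are not part of QEMU.

```c
#include <assert.h>

/* Stand-in for the incoming postcopy state. Only NONE and ADVISE appear in
 * the quoted snippet; SKETCH_POSTCOPY_STARTED is a hypothetical placeholder
 * for "any state past ADVISE". */
typedef enum {
    POSTCOPY_RAM_INCOMING_NONE,
    POSTCOPY_RAM_INCOMING_ADVISE,
    SKETCH_POSTCOPY_STARTED
} SketchPostcopyState;

/* Hypothetical summary of what the caller does next. */
typedef enum {
    ACTION_CONTINUE,              /* carry on with the normal exit path */
    ACTION_CLEANUP_THEN_CONTINUE, /* clean up postcopy now, then carry on */
    ACTION_RETURN_EARLY           /* postcopy thread cleans up later */
} IncomingAction;

/* Mirrors the branch structure of the quoted snippet; ret is the result of
 * loading the incoming migration stream (negative on failure). */
static IncomingAction incoming_action(SketchPostcopyState state, int ret)
{
    assert(state <= SKETCH_POSTCOPY_STARTED);

    if (state == POSTCOPY_RAM_INCOMING_NONE) {
        return ACTION_CONTINUE;   /* postcopy was never enabled */
    }
    if (state == POSTCOPY_RAM_INCOMING_ADVISE) {
        /* Postcopy was enabled but precopy finished first. */
        return ACTION_CLEANUP_THEN_CONTINUE;
    }
    if (ret >= 0) {
        /* Postcopy actually started and the load succeeded. */
        return ACTION_RETURN_EARLY;
    }
    /* Postcopy started but the load failed: neither branch is taken. */
    return ACTION_CONTINUE;
}
```

Note the last case: when postcopy has started and `ret` is negative, the quoted code takes neither branch and falls through to whatever follows it.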
I wonder whether this will solve the problem of a peer-to-peer migration with non-shared storage failing because it appears to take a bit too long. I see in other threads that Dr. Gilbert is making changes related to postcopy, and I am very interested in resolving what appears to be a timeout problem.
Any comments would be appreciated by this newbie to Qemu.
* Re: [Qemu-devel] Postcopy failures
From: Dr. David Alan Gilbert @ 2014-10-27 12:41 UTC
To: Gary Hook; +Cc: qemu-devel@nongnu.org
* Gary Hook (gary.hook@nimboxx.com) wrote:
> I see this went by:
>
> On 07/10/2014 12:29, Dr. David Alan Gilbert wrote:
> > You mean something like this (untested) ?
> >
> > if (mis->postcopy_ram_state != POSTCOPY_RAM_INCOMING_NONE) {
> >     if (mis->postcopy_ram_state == POSTCOPY_RAM_INCOMING_ADVISE) {
> >         /*
> >          * Where a migration had postcopy enabled (and thus went to advise)
> >          * but managed to complete within the precopy period
> >          */
> >         postcopy_ram_incoming_cleanup(mis);
> >     } else if (ret >= 0) {
> >         /*
> >          * Postcopy was started, cleanup should happen at the end of the
> >          * postcopy thread.
> >          */
> >         DPRINTF("process_incoming_migration_co: exiting main branch");
> >         return;
> >     }
> > }
>
> I wonder whether this will solve the problem of a peer-to-peer migration with non-shared storage failing because it appears to take a bit too long. I see in other threads that Dr. Gilbert is making changes related to postcopy, and I am very interested in resolving what appears to be a timeout problem.
>
> Any comments would be appreciated by this newbie to Qemu.
It should be possible to postcopy block storage as well, if that's
the question (it might take some work to make sure that they play
nicely together; e.g. making the page transfer higher priority
than the block transfer).
However, I thought there were other ways to do block storage
migration (using blockcopy, I think, but I'm not a block guy).
Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
* Re: [Qemu-devel] Postcopy failures
From: Gary Hook @ 2014-10-27 21:18 UTC
To: qemu-devel@nongnu.org
On 10/27/14, 7:41 AM, "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
>It should be possible to postcopy block storage as well, if that's
>the question (it might take some work to make sure that they play
>nicely together; e.g. wanting to make the page transfer higher
>priority than block transfer).
>However, I thought there were other ways to do block storage
>migration (using blockcopy I think? but I'm not a block guy).
Honestly (and frankly), it is a complete *ahem* "challenge" to walk into a
project that is in a state of high flux and attempt to ascertain the
terminology in use with little usable documentation. I am trying to learn,
however. I believe I understand the above statement, but there's a more
fundamental issue for which I can find no discussion anywhere on the
internet. So perhaps it would be sensible if I explain my context.
Here's the problem: a (working) peer-to-peer transfer of a modest VM
succeeds (PEER2PEER, PERSIST_DEST, NON_SHARED_INC) (running under
libvirt), regardless of its size. As soon as the TUNNELLED [sic] flag is
added (because security is a good thing) the transfer of the qcow2 file
completes but the sending side throws an error. The migration therefore
fails with an "unexpected error" according to libvirt logging.
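For reference, the two flag combinations described above can be written out as bitmasks. This is a minimal sketch, not libvirt code: the `SKETCH_` constants are hypothetical local names, and their numeric values are an assumption based on libvirt's `virDomainMigrateFlags` enum that should be checked against the libvirt headers actually in use. `describe_flags()` is a helper invented purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed to mirror VIR_MIGRATE_PEER2PEER, VIR_MIGRATE_TUNNELLED,
 * VIR_MIGRATE_PERSIST_DEST and VIR_MIGRATE_NON_SHARED_INC; verify the
 * values against the installed libvirt headers before relying on them. */
enum {
    SKETCH_PEER2PEER      = 1 << 1,
    SKETCH_TUNNELLED      = 1 << 2,
    SKETCH_PERSIST_DEST   = 1 << 3,
    SKETCH_NON_SHARED_INC = 1 << 7
};

/* The combination reported as working, and the same with tunnelling added. */
#define WORKING_FLAGS \
    (SKETCH_PEER2PEER | SKETCH_PERSIST_DEST | SKETCH_NON_SHARED_INC)
#define FAILING_FLAGS (WORKING_FLAGS | SKETCH_TUNNELLED)

static const struct {
    unsigned int bit;
    const char *name;
} flag_names[] = {
    { SKETCH_PEER2PEER,      "PEER2PEER" },
    { SKETCH_TUNNELLED,      "TUNNELLED" },
    { SKETCH_PERSIST_DEST,   "PERSIST_DEST" },
    { SKETCH_NON_SHARED_INC, "NON_SHARED_INC" },
};

/* Write a '|'-separated list of the flag names set in flags into buf,
 * truncating if buf is too small. */
static void describe_flags(unsigned int flags, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    buf[0] = '\0';
    for (size_t i = 0; i < sizeof(flag_names) / sizeof(flag_names[0]); i++) {
        if (!(flags & flag_names[i].bit)) {
            continue;
        }
        if (buf[0] != '\0') {
            strncat(buf, "|", len - strlen(buf) - 1);
        }
        strncat(buf, flag_names[i].name, len - strlen(buf) - 1);
    }
}
```

The only difference between the two masks is the single tunnelling bit, which is why the failure points at the tunnelled transport rather than at the storage copy itself.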
A smaller (about 1.9 GB) VM succeeds, ostensibly because it completes
before some timeout value is reached.
My conclusion is that the timeout raises a spurious error and, based upon
patch 47, that conclusion seems reasonable.
I'd like to:
1) understand if/when the patches that went by on 10/3 (47 pieces?) are
going to be committed, and if so, how I can access them. A git clone today
didn't include that code, and I'd like to test that modification to see if
it addresses my failure.
2) understand how to turn on tracing in what becomes the
qemu-system-x86_64 executable on Ubuntu 14.04. Command line options are
of no use when I don't control the invocation of the process, so I
wonder whether there is a config file mechanism or environment variable that
can be used. That presumes, of course, that I've even managed to build the
system with tracing enabled...
Searching wiki.qemu.org has been so far fruitless. The word "trace" shows
up in exactly one document, in the discussion of command line parameters.
I need to crawl into the engine to find and resolve this failure; at this
point the challenge is getting the hood open. Any advice/pointers from any
corner would be greatly appreciated.