From: David Gibson <david@gibson.dropbear.id.au>
To: Thomas Huth <thuth@redhat.com>
Cc: Amit Shah <amit.shah@redhat.com>,
Juan Quintela <quintela@redhat.com>,
qemu-devel@nongnu.org,
"Dr. David Alan Gilbert" <dgilbert@redhat.com>
Subject: Re: [Qemu-devel] [PATCH for-2.8] migration: Fix return code of ram_save_iterate()
Date: Thu, 10 Nov 2016 00:08:55 +1100
Message-ID: <20161109130855.GA18060@umbus.fritz.box>
In-Reply-To: <1283dfcc-2f4a-299d-6ecb-16ccd5eff89e@redhat.com>
On Wed, Nov 09, 2016 at 08:46:34AM +0100, Thomas Huth wrote:
> On 09.11.2016 08:18, Amit Shah wrote:
> > On (Fri) 04 Nov 2016 [14:10:17], Thomas Huth wrote:
> >> qemu_savevm_state_iterate() expects the iterators to return 1
> >> when they are done, and 0 if there is still something left to do.
> >> However, ram_save_iterate() does not obey this rule and returns
> >> the number of saved pages instead. This causes a fatal hang with
> >> ppc64 guests when you run QEMU like this (also works with TCG):
> >
> > "works with" -- does that mean reproduces with?
>
> Yes, that's what I've meant: You can reproduce it with TCG (e.g. running
> on a x86 system), too, there's no need for a real POWER machine with KVM
> here.
>
> >> qemu-img create -f qcow2 /tmp/test.qcow2 1M
> >> qemu-system-ppc64 -nographic -nodefaults -m 256 \
> >> -hda /tmp/test.qcow2 -serial mon:stdio
> >>
> >> ... then switch to the monitor by pressing CTRL-a c and try to
> >> save a snapshot with "savevm test1" for example.
> >>
> >> After the first iteration, ram_save_iterate() always returns 0 here,
> >> so that qemu_savevm_state_iterate() hangs in an endless loop and you
> >> can only "kill -9" the QEMU process.
> >> Fix it by using proper return values in ram_save_iterate().
> >>
> >> Signed-off-by: Thomas Huth <thuth@redhat.com>
> >> ---
> >> migration/ram.c | 6 +++---
> >> 1 file changed, 3 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/migration/ram.c b/migration/ram.c
> >> index fb9252d..a1c8089 100644
> >> --- a/migration/ram.c
> >> +++ b/migration/ram.c
> >> @@ -1987,7 +1987,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
> >> int ret;
> >> int i;
> >> int64_t t0;
> >> - int pages_sent = 0;
> >> + int done = 0;
> >>
> >> rcu_read_lock();
> >> if (ram_list.version != last_version) {
> >> @@ -2007,9 +2007,9 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
> >> pages = ram_find_and_save_block(f, false, &bytes_transferred);
> >> /* no more pages to sent */
> >> if (pages == 0) {
> >> + done = 1;
> >> break;
> >> }
> >> - pages_sent += pages;
> >> acct_info.iterations++;
> >>
> >> /* we want to check in the 1st loop, just in case it was the 1st time
> >> @@ -2044,7 +2044,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
> >> return ret;
> >> }
> >>
> >> - return pages_sent;
> >> + return done;
> >> }
> >
> > I agree with David, we can just remove the return value. The first
> > patch of the series can do that; and this one could become the 2nd
> > patch. Should be OK for the soft freeze.
>
> Sorry, I still did not quite get it - if I'd change the return type of
> ram_save_iterate() and the other iterate functions to "void", how is
> qemu_savevm_state_iterate() supposed to know whether all iterators are
> done or not?
It doesn't - its return value is, in turn, mostly ignored by the
caller.
On the migration path we already determine whether to proceed or not
based purely on the separate state_pending callbacks.
For the savevm path, we don't really need the iteration phase at all -
we can jump straight to the completion phase, since downtime is not an
issue.
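
The savevm hang comes from the one caller that does look at the return
value. As a rough illustration of that mismatch, here is a toy model.
Everything in it (struct toy_ram, toy_send, the two toy_iterate_*
variants, toy_save_loop and its round cap) is made up for this sketch
and is not QEMU code:

```c
#include <assert.h>

/* Toy stand-in for the RAM iterator's state; not QEMU's real structure. */
struct toy_ram {
    int dirty_pages;
};

/* Send up to `budget` pages and return how many were sent. */
static int toy_send(struct toy_ram *r, int budget)
{
    assert(budget > 0);
    int n = r->dirty_pages < budget ? r->dirty_pages : budget;
    r->dirty_pages -= n;
    return n;
}

/* Pre-patch convention: return the number of pages sent. */
static int toy_iterate_pages(struct toy_ram *r, int budget)
{
    return toy_send(r, budget);
}

/* Post-patch convention: return 1 when nothing is left, 0 otherwise. */
static int toy_iterate_done(struct toy_ram *r, int budget)
{
    toy_send(r, budget);
    return r->dirty_pages == 0;
}

/*
 * Caller that loops until the iterator reports "done" (> 0).
 * Returns the number of rounds used, or -1 if max_rounds is reached.
 * The cap exists only so that this sketch cannot hang.
 */
static int toy_save_loop(int (*iterate)(struct toy_ram *, int),
                         struct toy_ram *r, int budget, int max_rounds)
{
    for (int round = 1; round <= max_rounds; round++) {
        if (iterate(r, budget) > 0) {
            return round;
        }
    }
    return -1;
}
```

With the page-count convention, an iterator that has nothing left to
send keeps returning 0, which such a caller reads as "not done yet", so
only the cap ends the loop. The same convention also reports "done"
too early whenever it sends at least one page.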
> And other iterators also use negative return values to
> signal errors
Ah.. that's a good point. Possibly we should leave in the negative
codes for errors and just remove all positive return values.
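
To make that concrete, here is a minimal sketch of what "negative for
errors, nothing positive" could look like, with the decision to keep
going taken from a pending-style query instead of the iterator's return
value. The names (struct toy_state, toy_pending, toy_iterate_errno,
toy_run) and the threshold parameter are assumptions for illustration
only; the real handlers and the real migration loop look different:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Toy iterator state; not QEMU's real structure. */
struct toy_state {
    uint64_t remaining;   /* bytes still to send */
    uint64_t chunk;       /* bytes sent per iteration */
    int broken;           /* nonzero simulates a failing stream */
};

/* Pending-style query: how much is left to send. */
static uint64_t toy_pending(struct toy_state *s)
{
    return s->remaining;
}

/* Iterator: returns 0 on success or a negative errno, never a positive. */
static int toy_iterate_errno(struct toy_state *s)
{
    assert(s->chunk > 0);
    if (s->broken) {
        return -EIO;
    }
    uint64_t n = s->remaining < s->chunk ? s->remaining : s->chunk;
    s->remaining -= n;
    return 0;
}

/*
 * Driver: iterate while more than `threshold` bytes are pending.
 * The return value of the iterator is consulted only for errors.
 */
static int toy_run(struct toy_state *s, uint64_t threshold)
{
    while (toy_pending(s) > threshold) {
        int ret = toy_iterate_errno(s);
        if (ret < 0) {
            return ret;
        }
    }
    return 0;
}
```

In this shape there is no "done" value for an iterator to get wrong:
progress is judged from the pending query alone, and a negative return
still aborts the loop.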
> - should that then be handled via an "Error **" parameter
> instead? ... my gut feeling still says that such a bigger rework (we've
> got to touch all iterators for this!) should rather not be done right in
> the middle of the freeze period...
Yeah, the errors could - and probably should - be handled with Error **
instead of return codes, but I also wonder if that's too much for soft
freeze. I guess that's the call of the migration guys.
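
For a feel of what that conversion would mean for an iterator, here is
a self-contained sketch of the general Error ** calling pattern. It
deliberately does not use QEMU's real Error API (declared in
include/qapi/error.h); ToyError, toy_error_set, toy_error_free and
toy_iterate_errp are stand-ins invented for this example:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy error object; a stand-in, not QEMU's real Error type. */
typedef struct ToyError {
    char msg[128];
} ToyError;

/* Store a message in *errp unless the caller passed NULL to ignore errors. */
static void toy_error_set(ToyError **errp, const char *msg)
{
    if (errp == NULL) {
        return;
    }
    assert(*errp == NULL);   /* an earlier error must not be overwritten */
    ToyError *err = malloc(sizeof(*err));
    if (err == NULL) {
        abort();
    }
    snprintf(err->msg, sizeof(err->msg), "%s", msg);
    *errp = err;
}

static void toy_error_free(ToyError *err)
{
    free(err);
}

/* Iterator with no return value: failure is reported only through errp. */
static void toy_iterate_errp(int *remaining, int chunk, int broken,
                             ToyError **errp)
{
    if (broken) {
        toy_error_set(errp, "toy stream write failed");
        return;
    }
    *remaining -= (*remaining < chunk) ? *remaining : chunk;
}
```

A caller would pass the address of a local ToyError pointer initialised
to NULL and check it after each call. That is also why every iterator
and every call site would have to be touched, which is the part that
looks heavy for soft freeze.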
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson