qemu-devel.nongnu.org archive mirror
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Thomas Huth <thuth@redhat.com>
Cc: Amit Shah <amit.shah@redhat.com>,
	Juan Quintela <quintela@redhat.com>,
	qemu-devel@nongnu.org, David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [Qemu-devel] [PATCH for-2.8] migration: Fix return code of ram_save_iterate()
Date: Wed, 9 Nov 2016 15:13:51 +0000
Message-ID: <20161109151351.GC7738@work-vm>
In-Reply-To: <1283dfcc-2f4a-299d-6ecb-16ccd5eff89e@redhat.com>

* Thomas Huth (thuth@redhat.com) wrote:
> On 09.11.2016 08:18, Amit Shah wrote:
> > On (Fri) 04 Nov 2016 [14:10:17], Thomas Huth wrote:
> >> qemu_savevm_state_iterate() expects the iterators to return 1
> >> when they are done, and 0 if there is still something left to do.
> >> However, ram_save_iterate() does not obey this rule and returns
> >> the number of saved pages instead. This causes a fatal hang with
> >> ppc64 guests when you run QEMU like this (also works with TCG):
> > 
> > "works with" -- does that mean reproduces with?
> 
> Yes, that's what I meant: you can reproduce it with TCG (e.g. running
> on an x86 system), too; there's no need for a real POWER machine with
> KVM here.

How did you trigger it on x86?

Dave

> >>  qemu-img create -f qcow2  /tmp/test.qcow2 1M
> >>  qemu-system-ppc64 -nographic -nodefaults -m 256 \
> >>                    -hda /tmp/test.qcow2 -serial mon:stdio
> >>
> >> ... then switch to the monitor by pressing CTRL-a c and try to
> >> save a snapshot with "savevm test1" for example.
> >>
> >> After the first iteration, ram_save_iterate() always returns 0 here,
> >> so that qemu_savevm_state_iterate() hangs in an endless loop and you
> >> can only "kill -9" the QEMU process.
> >> Fix it by using proper return values in ram_save_iterate().
> >>
> >> Signed-off-by: Thomas Huth <thuth@redhat.com>
> >> ---
> >>  migration/ram.c | 6 +++---
> >>  1 file changed, 3 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/migration/ram.c b/migration/ram.c
> >> index fb9252d..a1c8089 100644
> >> --- a/migration/ram.c
> >> +++ b/migration/ram.c
> >> @@ -1987,7 +1987,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
> >>      int ret;
> >>      int i;
> >>      int64_t t0;
> >> -    int pages_sent = 0;
> >> +    int done = 0;
> >>  
> >>      rcu_read_lock();
> >>      if (ram_list.version != last_version) {
> >> @@ -2007,9 +2007,9 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
> >>          pages = ram_find_and_save_block(f, false, &bytes_transferred);
> >>          /* no more pages to sent */
> >>          if (pages == 0) {
> >> +            done = 1;
> >>              break;
> >>          }
> >> -        pages_sent += pages;
> >>          acct_info.iterations++;
> >>  
> >>          /* we want to check in the 1st loop, just in case it was the 1st time
> >> @@ -2044,7 +2044,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
> >>          return ret;
> >>      }
> >>  
> >> -    return pages_sent;
> >> +    return done;
> >>  }
> > 
> > I agree with David, we can just remove the return value.  The first
> > patch of the series can do that; and this one could become the 2nd
> > patch.  Should be OK for the soft freeze.
> 
> Sorry, I still don't quite get it - if I changed the return type of
> ram_save_iterate() and the other iterate functions to "void", how is
> qemu_savevm_state_iterate() supposed to know whether all iterators are
> done or not? And other iterators also use negative return values to
> signal errors - should that then be handled via an "Error **" parameter
> instead? ... my gut feeling still says that such a bigger rework (we've
> got to touch all iterators for this!) should rather not be done right in
> the middle of the freeze period...
> 
>  Thomas
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
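
The return-value convention described in the quoted commit message (1 when
an iterator is done, 0 while work remains) can be illustrated with a toy
model. This is only a sketch under stated assumptions: struct toy_ram,
drive() and both iterate_* functions are hypothetical names invented for
illustration, and drive() is a simplified stand-in for a caller that loops
until an iterator reports completion, not QEMU's actual code.

```c
#include <assert.h>

/* Hypothetical stand-in for the state a RAM iterator works on. */
struct toy_ram {
    int dirty_pages;   /* pages still waiting to be sent */
    int batch;         /* at most this many pages are sent per call */
};

typedef int (*iterate_fn)(struct toy_ram *);

/* Pre-patch convention: return the number of pages sent by this call. */
static int iterate_returns_pages(struct toy_ram *s)
{
    int pages_sent = 0;

    for (int i = 0; i < s->batch; i++) {
        if (s->dirty_pages == 0) {
            break;                /* no more pages to send */
        }
        s->dirty_pages--;
        pages_sent++;
    }
    return pages_sent;
}

/* Patched convention: return 1 once nothing is left, 0 otherwise. */
static int iterate_returns_done(struct toy_ram *s)
{
    int done = 0;

    for (int i = 0; i < s->batch; i++) {
        if (s->dirty_pages == 0) {
            done = 1;             /* no more pages to send */
            break;
        }
        s->dirty_pages--;
    }
    return done;
}

/*
 * Toy caller: keep calling the iterator until it returns > 0 ("done").
 * The max_calls cap exists only so the sketch terminates; it returns the
 * number of calls made, or -1 if the cap was reached without "done".
 */
static int drive(iterate_fn iterate, struct toy_ram *s, int max_calls)
{
    assert(max_calls > 0);

    for (int calls = 1; calls <= max_calls; calls++) {
        if (iterate(s) > 0) {
            return calls;
        }
    }
    return -1;
}
```

In this toy model, a page-count return value is misread in both directions:
with no pages left it is always 0, so drive() never sees completion, and
with pages still pending a positive count is taken as "done". The done-flag
variant only reports completion after a call finds nothing left to send.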

Thread overview: 16+ messages
2016-11-04 13:10 [Qemu-devel] [PATCH for-2.8] migration: Fix return code of ram_save_iterate() Thomas Huth
2016-11-08  1:14 ` David Gibson
2016-11-08  6:57   ` Thomas Huth
2016-11-09  7:18 ` Amit Shah
2016-11-09  7:46   ` Thomas Huth
2016-11-09 13:08     ` David Gibson
2016-11-09 15:13     ` Dr. David Alan Gilbert [this message]
2016-11-09 15:28       ` Thomas Huth
2016-11-09 15:32         ` Dr. David Alan Gilbert
2016-11-14 18:34 ` Juan Quintela
2016-11-17  3:45   ` David Gibson
2016-11-18  8:13     ` Thomas Huth
2016-12-16 16:55       ` [Qemu-devel] Is block_save_iterate() dead code? (was: migration: Fix return code of ram_save_iterate() ) Thomas Huth
2016-12-16 17:03         ` Dr. David Alan Gilbert
2016-12-19 16:30           ` [Qemu-devel] Is block_save_iterate() dead code? Thomas Huth
2016-12-19 20:19             ` John Snow