From: Thomas Huth
Date: Wed, 9 Nov 2016 08:46:34 +0100
Subject: Re: [Qemu-devel] [PATCH for-2.8] migration: Fix return code of ram_save_iterate()
To: Amit Shah
Cc: Juan Quintela, qemu-devel@nongnu.org, "Dr. David Alan Gilbert", David Gibson
Message-ID: <1283dfcc-2f4a-299d-6ecb-16ccd5eff89e@redhat.com>
In-Reply-To: <20161109071800.GA1888@amit-lp.rh>
References: <1478265017-5700-1-git-send-email-thuth@redhat.com> <20161109071800.GA1888@amit-lp.rh>

On 09.11.2016 08:18, Amit Shah wrote:
> On (Fri) 04 Nov 2016 [14:10:17], Thomas Huth wrote:
>> qemu_savevm_state_iterate() expects the iterators to return 1
>> when they are done, and 0 if there is still something left to do.
>> However, ram_save_iterate() does not obey this rule and returns
>> the number of saved pages instead. This causes a fatal hang with
>> ppc64 guests when you run QEMU like this (also works with TCG):
>
> "works with" -- does that mean reproduces with?

Yes, that's what I meant: you can reproduce it with TCG (e.g. running
on an x86 system), too; there's no need for a real POWER machine with
KVM here.

>>  qemu-img create -f qcow2 /tmp/test.qcow2 1M
>>  qemu-system-ppc64 -nographic -nodefaults -m 256 \
>>                    -hda /tmp/test.qcow2 -serial mon:stdio
>>
>> ... then switch to the monitor by pressing CTRL-a c and try to
>> save a snapshot with "savevm test1" for example.
>>
>> After the first iteration, ram_save_iterate() always returns 0 here,
>> so that qemu_savevm_state_iterate() hangs in an endless loop and you
>> can only "kill -9" the QEMU process.
>> Fix it by using proper return values in ram_save_iterate().
>>
>> Signed-off-by: Thomas Huth
>> ---
>>  migration/ram.c | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/migration/ram.c b/migration/ram.c
>> index fb9252d..a1c8089 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -1987,7 +1987,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>>      int ret;
>>      int i;
>>      int64_t t0;
>> -    int pages_sent = 0;
>> +    int done = 0;
>>
>>      rcu_read_lock();
>>      if (ram_list.version != last_version) {
>> @@ -2007,9 +2007,9 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>>          pages = ram_find_and_save_block(f, false, &bytes_transferred);
>>          /* no more pages to sent */
>>          if (pages == 0) {
>> +            done = 1;
>>              break;
>>          }
>> -        pages_sent += pages;
>>          acct_info.iterations++;
>>
>>          /* we want to check in the 1st loop, just in case it was the 1st time
>> @@ -2044,7 +2044,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>>          return ret;
>>      }
>>
>> -    return pages_sent;
>> +    return done;
>>  }
>
> I agree with David, we can just remove the return value.  The first
> patch of the series can do that; and this one could become the 2nd
> patch.  Should be OK for the soft freeze.

Sorry, I still don't quite get it - if I changed the return type of
ram_save_iterate() and the other iterate functions to "void", how is
qemu_savevm_state_iterate() supposed to know whether all iterators are
done or not?
And other iterators also use negative return values to
signal errors - should that then be handled via an "Error **" parameter
instead? ... My gut feeling still says that a rework this big (we've
got to touch all iterators for this!) should not be done right in
the middle of the freeze period...

 Thomas
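
For illustration, the return-value convention discussed in this thread can
be modelled in a few lines of standalone C. This is only a sketch under
stated assumptions, not QEMU code: struct toy_ram, toy_iterate_page_count(),
toy_iterate_done_flag() and toy_drive() are hypothetical names, and the
caller is assumed to keep calling the iterator for as long as it returns 0.
A round limit stands in for the endless loop so that the example terminates.

```c
#include <assert.h>

/* Hypothetical stand-in for the RAM state; not a QEMU structure. */
struct toy_ram {
    int pages_left; /* pages still waiting to be sent */
    int budget;     /* at most this many pages are sent per call */
};

/*
 * Old behaviour as described in the commit message: return the number of
 * pages sent. Once nothing is left this is always 0, which the caller
 * reads as "still something left to do".
 */
static int toy_iterate_page_count(struct toy_ram *ram)
{
    int sent = 0;

    while (sent < ram->budget && ram->pages_left > 0) {
        ram->pages_left--;
        sent++;
    }
    return sent;
}

/* Patched behaviour: return 1 once no more pages are found, 0 otherwise. */
static int toy_iterate_done_flag(struct toy_ram *ram)
{
    int done = 0;

    for (int i = 0; i < ram->budget; i++) {
        if (ram->pages_left == 0) {
            done = 1;
            break;
        }
        ram->pages_left--;
    }
    return done;
}

/*
 * Toy caller (assumed convention): < 0 is an error, 0 means "call me
 * again", > 0 means "done". Returns the number of calls it took to finish,
 * 0 if max_rounds calls passed without completion (the stand-in for the
 * hang), or the negative error code from the iterator.
 */
static int toy_drive(int (*iterate)(struct toy_ram *), struct toy_ram *ram,
                     int max_rounds)
{
    assert(max_rounds > 0);

    for (int round = 1; round <= max_rounds; round++) {
        int ret = iterate(ram);

        if (ret < 0) {
            return ret;
        }
        if (ret > 0) {
            return round;
        }
    }
    return 0;
}
```

With no pages left to send, toy_drive() gives up on the page-count variant
after max_rounds calls, while the done-flag variant finishes on the first
call. The sketch also shows why the caller needs some kind of completion
signal from each iterator, whatever form it takes.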