Date: Thu, 10 Nov 2016 00:08:55 +1100
From: David Gibson
Message-ID: <20161109130855.GA18060@umbus.fritz.box>
References: <1478265017-5700-1-git-send-email-thuth@redhat.com> <20161109071800.GA1888@amit-lp.rh> <1283dfcc-2f4a-299d-6ecb-16ccd5eff89e@redhat.com>
In-Reply-To: <1283dfcc-2f4a-299d-6ecb-16ccd5eff89e@redhat.com>
Subject: Re: [Qemu-devel] [PATCH for-2.8] migration: Fix return code of ram_save_iterate()
To: Thomas Huth
Cc: Amit Shah, Juan Quintela, qemu-devel@nongnu.org, "Dr. David Alan Gilbert"

On Wed, Nov 09, 2016 at 08:46:34AM +0100, Thomas Huth wrote:
> On 09.11.2016 08:18, Amit Shah wrote:
> > On (Fri) 04 Nov 2016 [14:10:17], Thomas Huth wrote:
> >> qemu_savevm_state_iterate() expects the iterators to return 1
> >> when they are done, and 0 if there is still something left to do.
> >> However, ram_save_iterate() does not obey this rule and returns
> >> the number of saved pages instead.
> >> This causes a fatal hang with
> >> ppc64 guests when you run QEMU like this (also works with TCG):
> >
> > "works with" -- does that mean reproduces with?
>
> Yes, that's what I've meant: You can reproduce it with TCG (e.g. running
> on a x86 system), too, there's no need for a real POWER machine with KVM
> here.
>
> >> qemu-img create -f qcow2 /tmp/test.qcow2 1M
> >> qemu-system-ppc64 -nographic -nodefaults -m 256 \
> >>     -hda /tmp/test.qcow2 -serial mon:stdio
> >>
> >> ... then switch to the monitor by pressing CTRL-a c and try to
> >> save a snapshot with "savevm test1" for example.
> >>
> >> After the first iteration, ram_save_iterate() always returns 0 here,
> >> so that qemu_savevm_state_iterate() hangs in an endless loop and you
> >> can only "kill -9" the QEMU process.
> >> Fix it by using proper return values in ram_save_iterate().
> >>
> >> Signed-off-by: Thomas Huth
> >> ---
> >>  migration/ram.c | 6 +++---
> >>  1 file changed, 3 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/migration/ram.c b/migration/ram.c
> >> index fb9252d..a1c8089 100644
> >> --- a/migration/ram.c
> >> +++ b/migration/ram.c
> >> @@ -1987,7 +1987,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
> >>      int ret;
> >>      int i;
> >>      int64_t t0;
> >> -    int pages_sent = 0;
> >> +    int done = 0;
> >>
> >>      rcu_read_lock();
> >>      if (ram_list.version != last_version) {
> >> @@ -2007,9 +2007,9 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
> >>          pages = ram_find_and_save_block(f, false, &bytes_transferred);
> >>          /* no more pages to sent */
> >>          if (pages == 0) {
> >> +            done = 1;
> >>              break;
> >>          }
> >> -        pages_sent += pages;
> >>          acct_info.iterations++;
> >>
> >>          /* we want to check in the 1st loop, just in case it was the 1st time
> >> @@ -2044,7 +2044,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
> >>          return ret;
> >>      }
> >>
> >> -    return pages_sent;
> >> +    return done;
> >>  }
> >
> > I agree with David, we
> > can just remove the return value.  The first
> > patch of the series can do that; and this one could become the 2nd
> > patch.  Should be OK for the soft freeze.
>
> Sorry, I still did not quite get it - if I'd change the return type of
> ram_save_iterate() and the other iterate functions to "void", how is
> qemu_savevm_state_iterate() supposed to know whether all iterators are
> done or not?

It doesn't - its return value is, in turn, mostly ignored by the
caller.  On the migration path we already determine whether to proceed
or not based purely on the separate state_pending callbacks.  For the
savevm path, we don't really need the iteration phase at all - we can
jump straight to the completion phase, since downtime is not an issue.

> And other iterators also use negative return values to
> signal errors

Ah.. that's a good point.  Possibly we should leave in the negative
codes for errors and just remove all positive return values.

> - should that then be handled via an "Error **" parameter
> instead? ... my gut feeling still says that such a bigger rework (we've
> got to touch all iterators for this!) should rather not be done right in
> the middle of the freeze period...

Yeah, the errors could - and probably should - be handled with Error **
instead of return codes, but I also wonder if that's too much for soft
freeze.  I guess that's the call of the migration guys.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
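
For readers following the thread, the return convention under discussion
(1 when an iterator is done, 0 when there is still something left to do)
can be illustrated with a minimal, self-contained C sketch.  This is not
QEMU code: pages_left, find_and_save_page(), iterate_buggy(),
iterate_fixed(), run_until_done() and the per-call budget of 4 are all
hypothetical stand-ins, and the round cap exists only so that the
demonstration terminates where a real caller would presumably keep
spinning.

```c
#include <assert.h>

/* Hypothetical stand-in for the RAM state; not QEMU's real data. */
static int pages_left;

/* Hypothetical helper: saves one page, returns 0 if none remain. */
static int find_and_save_page(void)
{
    if (pages_left == 0) {
        return 0;
    }
    pages_left--;
    return 1;
}

/* Old convention: return the number of pages sent in this call. */
static int iterate_buggy(int budget)
{
    int pages_sent = 0;

    while (budget-- > 0) {
        int pages = find_and_save_page();
        if (pages == 0) {
            break;
        }
        pages_sent += pages;
    }
    /* Once nothing is left this is 0, which the caller reads as "not done". */
    return pages_sent;
}

/* Convention the caller expects: 1 = done, 0 = more to do. */
static int iterate_fixed(int budget)
{
    int done = 0;

    while (budget-- > 0) {
        int pages = find_and_save_page();
        if (pages == 0) {
            done = 1;
            break;
        }
    }
    return done;
}

/*
 * Simplified caller: keeps iterating until the iterator reports > 0.
 * Returns the number of rounds used, or -1 if max_rounds was exhausted
 * (the cap stands in for the endless loop described in the thread).
 */
static int run_until_done(int (*iterate)(int), int start_pages, int max_rounds)
{
    assert(max_rounds > 0);
    pages_left = start_pages;

    for (int round = 1; round <= max_rounds; round++) {
        if (iterate(4) > 0) {
            return round;
        }
    }
    return -1;
}
```

Under these assumptions, run_until_done(iterate_buggy, 0, 100) returns -1
(the cap is hit because the iterator keeps returning 0 once no pages
remain), while run_until_done(iterate_fixed, 10, 100) returns 3.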