From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([140.186.70.92]:49818) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1QpEZO-0003AT-Oa for qemu-devel@nongnu.org; Fri, 05 Aug 2011 03:11:48 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1QpEZN-0007aj-6W for qemu-devel@nongnu.org; Fri, 05 Aug 2011 03:11:46 -0400 Received: from fmmailgate02.web.de ([217.72.192.227]:57862) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1QpEZM-0007Ws-GW for qemu-devel@nongnu.org; Fri, 05 Aug 2011 03:11:45 -0400 Message-ID: <4E3B97A5.6090802@web.de> Date: Fri, 05 Aug 2011 09:11:33 +0200 From: Jan Kiszka MIME-Version: 1.0 References: <1312383104-9565-1-git-send-email-mjt@msgid.tls.msk.ru> <20110804162441.0045215c@doriath> <4E3AF860.2070005@msgid.tls.msk.ru> <4E3AFA5F.4040708@msgid.tls.msk.ru> <4E3B1B07.2000400@web.de> <4E3B92F0.4050801@msgid.tls.msk.ru> In-Reply-To: <4E3B92F0.4050801@msgid.tls.msk.ru> Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="------------enigDA1F20C25F016C425B51D02E" Sender: jan.kiszka@web.de Subject: Re: [Qemu-devel] [PATCH] do not call monitor_resume() from migrate_fd_put_buffer() error path List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Michael Tokarev Cc: mtosatti@redhat.com, qemu-devel@nongnu.org, Luiz Capitulino This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --------------enigDA1F20C25F016C425B51D02E Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable On 2011-08-05 08:51, Michael Tokarev wrote: > 05.08.2011 02:19, Jan Kiszka wrote: > [] >> Resume on error in migrate_fd_put_buffer raced with migrate_fd_cleanup= >> triggered via migrate_fd_put_ready called from migrate_fd_connect. >> >> This migration code is a horrible maze. Patch below tries to move >> monitor_resume a bit out of this. Please check if it works for you, th= en >> I'll send it out properly. >=20 > Yes it's a maze. >=20 > The patch was mime-damaged, I had to apply it manually, but it > wasn't difficult (since it's understandable what it does etc). Sorry, likely forgot to disable line-wrapping before hitting send. >=20 > And now I can't trigger the original problem anymore, with any > of my variants. >=20 > Here we go: > (qemu) migrate "exec:foo" > sh: foo: not found > (qemu) migrate "exec:foo" > sh: foo: not found >=20 > (qemu) migrate "exec:foo" > sh: foo: not found >=20 > (qemu) migrate "exec:foo" > sh: foo: not found >=20 > (qemu) migrate "exec:foo" > sh: foo: not found > (qemu) migrate "exec:foo" > sh: foo: not found >=20 > (qemu) migrate "exec:foo" > sh: foo: not found >=20 > (qemu) migrate "exec:foo" > sh: foo: not found >=20 > (qemu) migrate "exec:foo" > sh: foo: not found >=20 > (qemu) migrate "exec:foo" > sh: foo: not found > (qemu) migrate "exec:foo" > sh: foo: not found > (qemu) migrate "exec:foo" > sh: foo: not found > (qemu) migrate "exec:foo" > sh: foo: not found >=20 > (qemu) migrate "exec:foo" > sh: foo: not found > (qemu) migrate "exec:foo" > sh: foo: not found > (qemu) migrate "exec:foo" > sh: foo: not found > (qemu) migrate "exec:foo" > sh: foo: not found > (qemu) migrate "exec:foo" > sh: foo: not found >=20 > (qemu) migrate "exec:foo" > sh: foo: not found > (qemu) migrate "exec:foo" > sh: foo: not found >=20 > (qemu) migrate "exec:foo" > sh: foo: not found >=20 > (qemu) migrate "exec:foo" > sh: foo: not found > (qemu) migrate "exec:foo" > sh: foo: not found > (qemu) migrate "exec:foo" > sh: foo: not found >=20 > (qemu) migrate "exec:foo" > sh: foo: not found > (qemu) migrate "exec:foo" > sh: foo: not found >=20 > (qemu) migrate "exec:foo" > sh: foo: not found > (qemu) migrate "exec:foo" > sh: foo: not found > (qemu) migrate "exec:foo" > sh: foo: not found >=20 > (qemu) migrate "exec:foo" > sh: foo: not found > (qemu) migrate "exec:foo" > sh: foo: not found >=20 > (qemu) migrate "exec:foo" > sh: foo: not found >=20 > (qemu) migrate "exec:foo" > sh: foo: not found >=20 > (qemu) migrate "exec:foo" > sh: foo: not found >=20 > (qemu) migrate "exec:foo" > sh: foo: not found > (qemu) migrate "exec:foo" > sh: foo: not found > (qemu) migrate "exec:foo" > sh: foo: not found > (qemu) migrate "exec:foo" > sh: foo: not found > (qemu) migrate "exec:foo" > sh: foo: not found > (qemu) migrate "exec:foo" > sh: foo: not found > (qemu) migrate "exec:foo" > sh: foo: not found >=20 > (qemu) migrate "exec:foo" > sh: foo: not found > (qemu) migrate "exec:foo" > sh: foo: not found > (qemu) migrate "exec:foo" > sh: foo: not found >=20 > (qemu) migrate "exec:foo" > sh: foo: not found > (qemu) migrate "exec:foo" > sh: foo: not found >=20 > (qemu) migrate "exec:foo" > sh: foo: not found >=20 > (qemu) migrate "exec:foo" > sh: foo: not found > (qemu) migrate "exec:foo" > sh: foo: not found > (qemu) migrate "exec:foo" > sh: foo: not found > (qemu) migrate "exec:foo" > sh: foo: not found >=20 > (qemu) migrate "exec:foo" > sh: foo: not found >=20 > (qemu) migrate "exec:foo" > sh: foo: not found >=20 > (qemu) migrate "exec:foo" > sh: foo: not found > (qemu) migrate "exec:foo" > sh: foo: not found > (qemu) migrate "exec:cat > /tmp/foo" > (qemu) migrate "exec:cat > /tmp/foo" > (qemu) migrate "exec:foo" > sh: foo: not found > (qemu) migrate "exec:foo" > sh: foo: not found >=20 > (qemu) q >=20 > As you can see, I can hit either of the two cases - with > and without extra newline, and in both cases (and in > successful case too) it works correctly. >=20 > There's a difference still between the two - namely that > extra newline - but that's something else and merely > cosmetic. >=20 > I also verified that -incoming migration works as expected, > in both failure and success cases - and indeed it works. >=20 > Speaking of the patch: shouldn't migrate_fd_close() be > called from migrate_fd_cleanup(), Maybe. But it requires careful review what the migration close callbacks are doing and if that is always equivalent (or harmless). > or vise versa, or both > be combined into one? In migrate_fd_cleanup(), the new > logic is not clear still. New version of it: >=20 > int migrate_fd_cleanup(FdMigrationState *s) > { > int ret =3D 0; >=20 > qemu_set_fd_handler2(s->fd, NULL, NULL, NULL, NULL); >=20 > if (s->file) { > DPRINTF("closing file\n"); > if (qemu_fclose(s->file) !=3D 0) { > ret =3D -1; > } > s->file =3D NULL; > } else { > if (s->mon) { > monitor_resume(s->mon); > } > } >=20 > if (s->fd !=3D -1) { > close(s->fd); > s->fd =3D -1; > } >=20 > return ret; > } >=20 > Why it's EITHER s->file OR s->mon, but not both? Shouldn't > the s->mon thing be unconditional? See description of the patch I just sent out. >=20 > And the whole thing is waiting coroutine conversion badly, > since the logic is indeed a maze ;) Yeah, though I think this will not magically remove all windings here. Jan --------------enigDA1F20C25F016C425B51D02E Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk47l6kACgkQitSsb3rl5xR3uACgm9U2Edm5ELPSx0/hWMTjn8LV zRUAoIjn25nUzDojYHUkkiRDvssJaqn7 =Tl1S -----END PGP SIGNATURE----- --------------enigDA1F20C25F016C425B51D02E--