From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:60049)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1ZHVLN-0001Ku-En
	for qemu-devel@nongnu.org; Tue, 21 Jul 2015 07:04:19 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from )
	id 1ZHVLK-0001hD-89
	for qemu-devel@nongnu.org; Tue, 21 Jul 2015 07:04:17 -0400
Received: from mx1.redhat.com ([209.132.183.28]:52290)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1ZHVLK-0001gT-1C
	for qemu-devel@nongnu.org; Tue, 21 Jul 2015 07:04:14 -0400
Received: from int-mx13.intmail.prod.int.phx2.redhat.com
	(int-mx13.intmail.prod.int.phx2.redhat.com [10.5.11.26])
	by mx1.redhat.com (Postfix) with ESMTPS id 6D0C93500FF
	for ; Tue, 21 Jul 2015 11:04:13 +0000 (UTC)
Date: Tue, 21 Jul 2015 12:04:09 +0100
From: Stefan Hajnoczi
Message-ID: <20150721110409.GA22717@stefanha-thinkpad.redhat.com>
References: <1437370031-9070-1-git-send-email-pbonzini@redhat.com>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature"; boundary="mP3DRpeJDSE+ciuQ"
Content-Disposition: inline
In-Reply-To: <1437370031-9070-1-git-send-email-pbonzini@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v2] AioContext: fix broken placement of
	event_notifier_test_and_clear
To: Paolo Bonzini
Cc: kwolf@redhat.com, lersek@redhat.com, famz@redhat.com,
	qemu-devel@nongnu.org, rjones@redhat.com

--mP3DRpeJDSE+ciuQ
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Mon, Jul 20, 2015 at 07:27:11AM +0200, Paolo Bonzini wrote:
> event_notifier_test_and_clear must be called before processing events.
> Otherwise, an aio_poll could "eat" the notification before the main
> I/O thread invokes ppoll().  The main I/O thread then never wakes up.
> This is an example of what could happen:
>
>    i/o thread         vcpu thread                   worker thread
>    ---------------------------------------------------------------------
>    lock_iothread
>    notify_me = 1
>    ...
>    unlock_iothread
>                       lock_iothread
>                       notify_me = 3
>                       ppoll
>                       notify_me = 1
>                                                     bh->scheduled = 1
>                                                     event_notifier_set
>                       event_notifier_test_and_clear
>    ppoll
>    *** hang ***
>
> In contrast with the previous bug, there wasn't really much debugging
> to do here.  "Tracing" with qemu_clock_get_ns shows pretty much the
> same behavior as in the previous patch, so the only way to find the
> bug is staring at the code.
>
> One could also use a formal model, of course.  The included one shows
> this with three processes: notifier corresponds to a QEMU thread pool
> worker, temporary_waiter to a VCPU thread that invokes aio_poll(),
> waiter to the main I/O thread.  I would be happy to say that the
> formal model found the bug for me, but actually I wrote it after the
> fact.
>
> This patch is a bit of a big hammer.  Fam is looking into optimizations.
>
> Reported-by: Richard W. M. Jones
> Signed-off-by: Paolo Bonzini
> ---
>         v1->v2: only do event_notifier_test_and_clear once for Win32
>
>  aio-posix.c                 |   2 +
>  aio-win32.c                 |   7 ++-
>  async.c                     |   8 ++-
>  docs/aio_notify_bug.promela | 140 ++++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 153 insertions(+), 4 deletions(-)
>  create mode 100644 docs/aio_notify_bug.promela

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan

--mP3DRpeJDSE+ciuQ
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQEcBAEBAgAGBQJVricpAAoJEJykq7OBq3PIWcgH/jwEMy5U+xSgaPZNGgt8+bVg
pPu51PwSuMQzS65xrjp4WPuXdhY1XO/ggRQqvws1F1vpcfHflAsRTrzyHiQlSiSQ
/9cC2DwA3vy+KQxX9cDFdMou1fqs4fysL+leR3vV/k/G4nEJGwoOWKqrDCK6+aGw
JZBCQXEdZPpc+qAK3VKdLCAa9002YRKa4gxzhO5danJoUBEXP20oZBM8c9K7Gmzt
/7nIOM8OTPg7M4iKACg1vtu8MKVZwmXfDEdoOs61fkAob3WWMphEDVuo8RYhjMkK
TZxIlHcOR0MyUwGy3fU5RVjQNie5dK7rjrMybVaFFWwpLWlgYxRrZ/+KWDXLF9s=
=eVIi
-----END PGP SIGNATURE-----

--mP3DRpeJDSE+ciuQ--
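
For readers who want to see why the ordering matters without running
Promela, here is a rough sketch in Python.  It is not QEMU code and it is
not the model from docs/aio_notify_bug.promela; it is a toy that assumes
the race can be reduced to four atomic steps.  The step names are borrowed
from the trace quoted above, except "process events", which is a
hypothetical stand-in for whatever the polling thread does to run pending
bottom halves.  The helper names (interleavings, main_thread_hangs,
hanging_schedules) are made up for this sketch.  It enumerates every
interleaving of the worker with a thread running aio_poll, then asks
whether the main I/O thread's next ppoll would block with work still
pending.

```python
from itertools import combinations


def interleavings(a, b):
    """Yield every merge of sequences a and b that keeps each one's order."""
    n = len(a) + len(b)
    for slots in combinations(range(n), len(a)):
        ia, ib = iter(a), iter(b)
        yield [next(ia) if i in slots else next(ib) for i in range(n)]


def main_thread_hangs(schedule):
    """Replay one interleaving, then report whether the main I/O thread's
    ppoll() would block even though a bottom half is still scheduled."""
    bh_scheduled = False  # toy stand-in for bh->scheduled
    notified = False      # toy stand-in for the event notifier's state
    for step in schedule:
        if step == "bh->scheduled = 1":
            bh_scheduled = True
        elif step == "event_notifier_set":
            notified = True
        elif step == "event_notifier_test_and_clear":
            notified = False
        elif step == "process events":
            bh_scheduled = False  # any pending bottom half is run here
    return bh_scheduled and not notified


# Steps taken by the thread pool worker (the "notifier" process).
WORKER = ["bh->scheduled = 1", "event_notifier_set"]
# Assumed step order of the aio_poll caller before and after the patch.
BUGGY_POLLER = ["process events", "event_notifier_test_and_clear"]
FIXED_POLLER = ["event_notifier_test_and_clear", "process events"]


def hanging_schedules(poller):
    """Return the interleavings of poller and WORKER that lose the wakeup."""
    return [s for s in interleavings(poller, WORKER) if main_thread_hangs(s)]


if __name__ == "__main__":
    for name, poller in (("buggy", BUGGY_POLLER), ("fixed", FIXED_POLLER)):
        bad = hanging_schedules(poller)
        print(f"{name}: {len(bad)} hanging interleaving(s)")
        for s in bad:
            print("    " + " -> ".join(s))
```

Under these assumptions, clearing the notifier after processing events
leaves one interleaving out of six where the notification is eaten, and it
is the one in the trace: the worker schedules the bottom half and sets the
notifier between the two poller steps.  Clearing first leaves none, because
any bottom half scheduled before the clear is still picked up by the
processing step that follows.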