From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:35430) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Yg8mM-0004Mh-N7 for qemu-devel@nongnu.org; Thu, 09 Apr 2015 05:29:48 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Yg8mI-0000EJ-Jk for qemu-devel@nongnu.org; Thu, 09 Apr 2015 05:29:42 -0400 Date: Thu, 9 Apr 2015 10:29:35 +0100 From: Stefan Hajnoczi Message-ID: <20150409092935.GA18382@stefanha-thinkpad.redhat.com> References: <1428571270-11723-1-git-send-email-pbonzini@redhat.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="9amGYk9869ThD9tj" Content-Disposition: inline In-Reply-To: <1428571270-11723-1-git-send-email-pbonzini@redhat.com> Subject: Re: [Qemu-devel] [PATCH] aio: strengthen memory barriers for bottom half scheduling List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Paolo Bonzini Cc: jsnow@redhat.com, qemu-devel@nongnu.org, qemu-stable@nongnu.org, stefanha@redhat.com, lersek@redhat.com, Paul.Leveille@stratus.com --9amGYk9869ThD9tj Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Apr 09, 2015 at 11:21:10AM +0200, Paolo Bonzini wrote: > There are two problems with memory barriers in async.c. The fix is > to use atomic_xchg in order to achieve sequential consistency between > the scheduling of a bottom half and the corresponding execution. >=20 > First, if bh->scheduled is already 1 in qemu_bh_schedule, QEMU does > not execute a memory barrier to order any writes needed by the callback > before the read of bh->scheduled. If the other side sees req->state as > THREAD_ACTIVE, the callback is not invoked and you get deadlock. >=20 > Second, the memory barrier in aio_bh_poll is too weak. Without this > patch, it is possible that bh->scheduled =3D 0 is not "published" until > after the callback has returned. Another thread wants to schedule the > bottom half, but it sees bh->scheduled =3D 1 and does nothing. This caus= es > a lost wakeup. The memory barrier should have been changed to smp_mb() > in commit 924fe12 (aio: fix qemu_bh_schedule() bh->ctx race condition, > 2014-06-03) together with qemu_bh_schedule()'s. Guess who reviewed > that patch? >=20 > Both of these involve a store and a load, so they are reproducible > on x86_64 as well. Paul Leveille however reported how to trigger the > problem within 15 minutes on x86_64 as well. His (untested) recipe, > reproduced here for reference, is the following: >=20 > 1) Qcow2 (or 3) is critical =E2=80=93 raw files alone seem to avoid th= e problem. >=20 > 2) Use =E2=80=9Ccache=3Ddirectsync=E2=80=9D rather than the default of > =E2=80=9Ccache=3Dnone=E2=80=9D to make it happen easier. >=20 > 3) Use a server with a write-back RAID controller to allow for rapid > IO rates. >=20 > 4) Run a random-access load that (mostly) writes chunks to various > files on the virtual block device. >=20 > a. I use =E2=80=98diskload.exe c:25=E2=80=99, a Microsoft HCT load > generator, on Windows VMs. >=20 > b. Iometer can probably be configured to generate a similar load. >=20 > 5) Run multiple VMs in parallel, against the same storage device, > to shake the failure out sooner. >=20 > 6) IvyBridge and Haswell processors for certain; not sure about others. >=20 > A similar patch survived over 12 hours of testing, where an unpatched > QEMU would fail within 15 minutes. >=20 > This bug is, most likely, also involved in the failures in the libguestfs > testsuite on AArch64 (reported by Laszlo Ersek and Richard Jones). Howev= er, > the patch is not enough to fix that. >=20 > Thanks to Stefan Hajnoczi for suggesting closer examination of > qemu_bh_schedule, and to Paul for providing test input and a prototype > patch. >=20 > Cc: qemu-stable@nongnu.org > Reported-by: Laszlo Ersek > Reported-by: Paul Leveille > Reported-by: John Snow > Suggested-by: Paul Leveille > Suggested-by: Stefan Hajnoczi > Tested-by: Paul Leveille > Signed-off-by: Paolo Bonzini > --- > async.c | 28 ++++++++++++---------------- > 1 file changed, 12 insertions(+), 16 deletions(-) Thanks, applied to my block tree: https://github.com/stefanha/qemu/commits/block Stefan --9amGYk9869ThD9tj Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAEBAgAGBQJVJkZ/AAoJEJykq7OBq3PIwykH/jAAqsw/o2U1THRkkuouOrUw 1o397aPYsBiP3sxm2bLH28Mc4b8vIhdbSgWyhmdALLnOEBFhRxU2eCFJbbPcemVN WEf5N79GAG7/Gz5liJz1qe2UNtgjdHXJN5hLbLglwSxY6nMxfFblEz6Kq+dItXTK VChveC05sAJMYWJfEstfgd7Ts608fSFt/xdAscgBQFSN3mDI/sMb9p7VlO+SFGD5 tx+7FJEWnJyoTbuD3gIvTz72L5/aZf+0EgotqFsoW7YcYZpGKFm0lSzUvPkkAqXB lyCxbwY2YadB97LlOcH0ilDgCnSjCjiOEO1+0eESlkeiaMGsdAxqwEOKA6+oUqk= =gxzd -----END PGP SIGNATURE----- --9amGYk9869ThD9tj--