Date: Mon, 23 Feb 2015 23:35:30 +0100
From: Marc Marí
Message-ID: <20150223233530.32ef6bcc@crunchbang>
In-Reply-To: <54EBA841.8070803@redhat.com>
References: <54EBA841.8070803@redhat.com>
Subject: Re: [Qemu-devel] virtio-blk-test failure
To: John Snow
Cc: qemu-devel

On Mon, 23 Feb 2015 17:22:57 -0500
John Snow wrote:

> I've been seeing this failure pop up very occasionally and I can
> usually get the test to pass again by just re-running, but every now
> and again:
>
> GTESTER check-qtest-x86_64
> blkdebug: Suspended request 'A'
> blkdebug: Resuming request 'A'
> main-loop: WARNING: I/O thread spun for 1000 iterations
> main-loop: WARNING: I/O thread spun for 1000 iterations
> **
> ERROR:/home/bos/jhuston/src/qemu/tests/libqos/virtio.c:91:qvirtio_wait_queue_isr:
> assertion failed: (g_get_monotonic_time() - start_time <= timeout_us)
> GTester: last random seed: R02S3ba253e130ac76bbcb0bade0a2d54b2f
> [vmxnet3][WR][vmxnet3_peer_has_vnet_hdr]: Peer has no virtio
> extension. Task offloads will be emulated.
> make: *** [check-qtest-x86_64] Error 1
>
>
> I wrote a test loop that runs virtio-blk-test over and over again in
> a loop and saw it fail after 137 runs.
>
> It looks like the culprit is /virtio/blk/pci/msix; if you run only
> that test it could take anywhere from 20-250 runs before you see it
> fail.
>
> I only did a little bit of debugging, but the QMP command that
> immediately precedes the wait_config_isr call here appears to execute
> successfully.
>
> Any hunches, Marc?

This is very similar to the failure that caused the virtio MMIO patch
to be taken back. The reason is the same, although nobody reported it:

tests/libqos/virtio-pci.c:162
    data = readl(dev->config_msix_addr);
    writel(dev->config_msix_addr, 0);
    return data == dev->config_msix_data;

If my memory is correct, this write acks the interrupt. But it always
acks, without first checking what was read. If the device raises the
interrupt between the read and the write, the write clears an interrupt
the test never saw. There might be a race condition there.

Tomorrow I'll send a patch.

Thanks
Marc
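
For illustration only, here is a minimal sketch of how the suspected race could lose an interrupt, and of one possible guard (ack only when the expected value was actually read). It is not the libqos code and not necessarily the patch that will be sent: msix_cell, EXPECTED_DATA, mock_readl, mock_writel, device_fires and both isr_* functions are hypothetical names, and the hook argument just models the device writing between the read and the write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical value the device writes when it raises the interrupt. */
#define EXPECTED_DATA 0x12345678u

/* Hypothetical stand-in for the memory word targeted by the MSI-X write. */
static uint32_t msix_cell;

/* Mock accessors; they only mimic the argument order of the snippet above. */
uint32_t mock_readl(uint32_t *addr) { return *addr; }
void mock_writel(uint32_t *addr, uint32_t value) { *addr = value; }

/* Models the device raising the interrupt inside the read/write window. */
typedef void (*race_hook)(void);
void device_fires(void) { msix_cell = EXPECTED_DATA; }

/* Same shape as the quoted code: the ack is unconditional. */
bool isr_always_ack(uint32_t *addr, uint32_t expected, race_hook hook)
{
    uint32_t data = mock_readl(addr);
    if (hook != NULL) {
        hook();                 /* interrupt arrives after the read */
    }
    mock_writel(addr, 0);       /* clears an interrupt that was never seen */
    return data == expected;
}

/* Possible guard (an assumption): ack only what was actually observed. */
bool isr_ack_if_seen(uint32_t *addr, uint32_t expected, race_hook hook)
{
    uint32_t data = mock_readl(addr);
    if (hook != NULL) {
        hook();
    }
    if (data != expected) {
        return false;           /* leave the cell alone for the next poll */
    }
    mock_writel(addr, 0);
    return true;
}

/* Walks through both variants with the interrupt landing in the window. */
void run_demo(void)
{
    msix_cell = 0;
    assert(!isr_always_ack(&msix_cell, EXPECTED_DATA, device_fires));
    assert(msix_cell == 0);     /* interrupt lost: later polls time out */

    msix_cell = 0;
    assert(!isr_ack_if_seen(&msix_cell, EXPECTED_DATA, device_fires));
    assert(isr_ack_if_seen(&msix_cell, EXPECTED_DATA, NULL));
}
```

With the unconditional ack, a poll loop that missed the interrupt once never sees it again, which would match a timeout assertion that only fires occasionally. Whether this is really what happens in /virtio/blk/pci/msix still has to be confirmed.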