From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <54EBABA8.3050908@redhat.com>
Date: Mon, 23 Feb 2015 17:37:28 -0500
From: John Snow
References: <54EBA841.8070803@redhat.com> <20150223233530.32ef6bcc@crunchbang>
In-Reply-To: <20150223233530.32ef6bcc@crunchbang>
Subject: Re: [Qemu-devel] virtio-blk-test failure
To: Marc Marí
Cc: qemu-devel

On 02/23/2015 05:35 PM, Marc Marí wrote:
> On Mon, 23 Feb 2015 17:22:57 -0500
> John Snow wrote:
>> I've been seeing this failure pop up very occasionally and I can
>> usually get the test to pass again by just re-running, but every now
>> and again:
>>
>> GTESTER check-qtest-x86_64
>> blkdebug: Suspended request 'A'
>> blkdebug: Resuming request 'A'
>> main-loop: WARNING: I/O thread spun for 1000 iterations
>> main-loop: WARNING: I/O thread spun for 1000 iterations
>> **
>> ERROR:/home/bos/jhuston/src/qemu/tests/libqos/virtio.c:91:qvirtio_wait_queue_isr:
>> assertion failed: (g_get_monotonic_time() - start_time <= timeout_us)
>> GTester: last random seed: R02S3ba253e130ac76bbcb0bade0a2d54b2f
>> [vmxnet3][WR][vmxnet3_peer_has_vnet_hdr]: Peer has no virtio
>> extension. Task offloads will be emulated.
>> make: *** [check-qtest-x86_64] Error 1
>>
>>
>> I wrote a test loop that runs virtio-blk-test over and over again in
>> a loop and saw it fail after 137 runs.
>>
>> It looks like the culprit is /virtio/blk/pci/msix; if you run only
>> that test it could take anywhere from 20-250 runs before you see it
>> fail.
>>
>> I only did a little bit of debugging, but the QMP command that
>> immediately precedes the wait_config_isr call here appears to execute
>> successfully.
>>
>> Any hunches, Marc?
>
> This is very similar to the one that took back the virtio MMIO patch.
> And the reason is the same, although nobody reported it:
>
> test/libqos/virtio-pci.c:162
>
> data = readl(dev->config_msix_addr);
> writel(dev->config_msix_addr, 0);
> return data == dev->config_msix_data;
>
> If my memory is correct, this write is acking the interrupt. But it is
> always acking, without checking first what was read. There might be a
> race condition there.
>
> Tomorrow I'll send a patch.
>
> Thanks
> Marc
>

Awesome, CC me on it and I will run tests, thanks!

--js
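
For reference, a minimal standalone sketch of the race Marc suspects. It is a simulation, not the libqos code. The names fake_readl, fake_writel, msix_word, inject_after_read, MSIX_DATA and poll_sees_interrupt are all hypothetical, and the "device" is modelled as a write that lands right after the test's read. The checked variant is only one possible way to avoid the unconditional ack, not necessarily the patch that will be sent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical expected MSI-X message value (assumption for this sketch). */
#define MSIX_DATA 0x12345678u

/* Stand-in for the guest memory word the device writes its message to. */
static uint32_t msix_word;

/* If nonzero, the simulated device writes this value immediately after
 * the next read, i.e. between the test's readl and its writel. */
static uint32_t inject_after_read;

static uint32_t fake_readl(void)
{
    uint32_t v = msix_word;
    if (inject_after_read) {
        msix_word = inject_after_read;   /* interrupt arrives "late" */
        inject_after_read = 0;
    }
    return v;
}

static void fake_writel(uint32_t v)
{
    msix_word = v;
}

/* Pattern quoted in the thread: ack unconditionally after reading. */
static bool isr_always_ack(void)
{
    uint32_t data = fake_readl();
    fake_writel(0);                      /* may wipe a message that just arrived */
    return data == MSIX_DATA;
}

/* Alternative: ack only when the expected message was actually read. */
static bool isr_checked_ack(void)
{
    uint32_t data = fake_readl();
    if (data == MSIX_DATA) {
        fake_writel(0);
        return true;
    }
    return false;
}

/* Poll up to `polls` times, with the interrupt landing between the first
 * read and the first write. Returns true if it is ever observed. */
static bool poll_sees_interrupt(bool (*isr)(void), int polls)
{
    assert(polls > 0);
    msix_word = 0;
    inject_after_read = MSIX_DATA;
    for (int i = 0; i < polls; i++) {
        if (isr()) {
            return true;
        }
    }
    return false;
}
```

In this model, poll_sees_interrupt(isr_always_ack, 10) never observes the interrupt, because the unconditional write of 0 erases the message that arrived after the read. poll_sees_interrupt(isr_checked_ack, 10) observes it on the second poll. That would be consistent with a wait loop occasionally running into its timeout, though the real timing in QEMU may differ.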