From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:54115)
	by lists.gnu.org with esmtp (Exim 4.71) id 1ZC8Wo-0006hx-7T
	for qemu-devel@nongnu.org; Mon, 06 Jul 2015 11:41:55 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	id 1ZC8Wj-0000so-1x
	for qemu-devel@nongnu.org; Mon, 06 Jul 2015 11:41:54 -0400
Message-ID: <559AA1BB.8020709@redhat.com>
Date: Mon, 06 Jul 2015 11:41:47 -0400
From: John Snow
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Subject: Re: [Qemu-devel] qtest hang in /x86_64/ahci/io/ncq/simple (ppc64 host)
To: Peter Maydell, QEMU Developers, Paolo Bonzini, Stefan Hajnoczi,
	"qemu-ppc@nongnu.org"

On 07/06/2015 11:35 AM, Peter Maydell wrote:
> I'm seeing a qtest hang in the /x86_64/ahci/io/ncq/simple
> test case. It looks like QEMU is running OK, but the qtest test
> is busy-looping in ahci_command_wait():
>
> #0 ahci_command_wait (ahci=0x1003f3f9400, cmd=0x1003f401810) at
> /home/pm215/qemu/tests/libqos/ahci.c:929
> #1 0x000000001001ba10 in ahci_command_issue (ahci=0x1003f3f9400,
> cmd=0x1003f401810)
> at /home/pm215/qemu/tests/libqos/ahci.c:937
> #2 0x0000000010019f18 in ahci_guest_io (ahci=0x1003f3f9400, port=5
> '\005', ide_cmd=97 'a', buffer=1097728,
> bufsize=4096, sector=0) at /home/pm215/qemu/tests/libqos/ahci.c:632
> #3 0x000000001000b640 in ahci_test_io_rw_simple (ahci=0x1003f3f9400,
> bufsize=4096, sector=0, read_cmd=
> 96 '`', write_cmd=97 'a') at /home/pm215/qemu/tests/ahci-test.c:886
> #4 0x000000001000d434 in test_ncq_simple () at
> /home/pm215/qemu/tests/ahci-test.c:1439
> #5 0x00000080753db01c in 000000ca.plt_call.strncasecmp@@GLIBC_2.3+0
> () from /lib64/libglib-2.0.so.0
> #6 0x00000080753db1f8 in 000000ca.plt_call.strncasecmp@@GLIBC_2.3+0
> () from /lib64/libglib-2.0.so.0
> #7 0x00000080753db1f8 in 000000ca.plt_call.strncasecmp@@GLIBC_2.3+0
> () from /lib64/libglib-2.0.so.0
> #8 0x00000080753db1f8 in 000000ca.plt_call.strncasecmp@@GLIBC_2.3+0
> () from /lib64/libglib-2.0.so.0
> #9 0x00000080753db1f8 in 000000ca.plt_call.strncasecmp@@GLIBC_2.3+0
> () from /lib64/libglib-2.0.so.0
> #10 0x00000080753db6c0 in .g_test_run_suite () from /lib64/libglib-2.0.so.0
> #11 0x00000080753db778 in .g_test_run () from /lib64/libglib-2.0.so.0
> #12 0x000000001000de70 in main (argc=1, argv=0x3fffeeb5a578) at
> /home/pm215/qemu/tests/ahci-test.c:1703
>
> If you singlestep, we just loop round and round. Presumably
> the condition we're expecting just never becomes true.
>
> I've only seen this on a ppc64 host system; my x86-64 and arm
> 'make check' runs have been fine. Have you tested your AHCI
> qtest code on a big endian system? (Not necessarily the
> problem, Paolo also said he'd seen an intermittent failure
> in one of these ahci tests on an x86 host. But the ppc fail
> seems to be reliably always on the same test.)
>
> thanks
> -- PMM
>

I'll take a look. More than possible there's a race in the wait
conditional that I have just never seen before.

I tweaked it a little recently to wait on both NCQ and traditional
completion, but it doesn't have a timeout or anything. I'll go ahead and
fix that while I'm here.

Should be something simple, hopefully.

--js
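
For illustration only, here is a rough sketch of what a wait loop with a timeout could look like in general. It is not the libqos code and not the eventual fix: poll_with_timeout(), poll_fn, now_ns() and the two example predicates are hypothetical names made up for this sketch, and it assumes a POSIX host with clock_gettime() and nanosleep(). The idea is just that the loop gives up and reports failure once a deadline passes, rather than spinning forever when the awaited condition never becomes true.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime() and nanosleep() */
#endif

#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical predicate type: returns true once the awaited condition holds. */
typedef bool (*poll_fn)(void *opaque);

/* Monotonic time in nanoseconds, so wall-clock changes cannot move the deadline. */
static int64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/*
 * Poll done(opaque) until it returns true or timeout_ns has elapsed.
 * Returns true if the condition was met, false if the wait timed out.
 */
static bool poll_with_timeout(poll_fn done, void *opaque, int64_t timeout_ns)
{
    assert(done != NULL);

    const int64_t deadline = now_ns() + timeout_ns;
    const struct timespec pause = { .tv_sec = 0, .tv_nsec = 1000000 }; /* 1 ms */

    for (;;) {
        if (done(opaque)) {
            return true;            /* condition met */
        }
        if (now_ns() >= deadline) {
            return false;           /* give up instead of looping forever */
        }
        nanosleep(&pause, NULL);    /* back off briefly between polls */
    }
}

/* Example predicate: counts down on each poll and succeeds at zero. */
static bool counter_reached_zero(void *opaque)
{
    int *remaining = opaque;
    if (*remaining > 0) {
        (*remaining)--;
    }
    return *remaining == 0;
}

/* Example predicate that never succeeds, standing in for a stuck command. */
static bool never_done(void *opaque)
{
    (void)opaque;
    return false;
}
```

A caller would treat a false return as a test failure (for instance by asserting on it), which turns a silent hang like the one in the backtrace into a visible, bounded failure.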