* [Qemu-devel] qtest hang in /x86_64/ahci/io/ncq/simple (ppc64 host) @ 2015-07-06 15:35 Peter Maydell 2015-07-06 15:41 ` John Snow 0 siblings, 1 reply; 4+ messages in thread From: Peter Maydell @ 2015-07-06 15:35 UTC (permalink / raw) To: QEMU Developers, John Snow, Paolo Bonzini, Stefan Hajnoczi, qemu-ppc@nongnu.org I'm seeing a qtest hang in the /x86_64/ahci/io/ncq/simple test case. It looks like QEMU is running OK, but the qtest test is busy-looping in ahci_command_wait(): #0 ahci_command_wait (ahci=0x1003f3f9400, cmd=0x1003f401810) at /home/pm215/qemu/tests/libqos/ahci.c:929 #1 0x000000001001ba10 in ahci_command_issue (ahci=0x1003f3f9400, cmd=0x1003f401810) at /home/pm215/qemu/tests/libqos/ahci.c:937 #2 0x0000000010019f18 in ahci_guest_io (ahci=0x1003f3f9400, port=5 '\005', ide_cmd=97 'a', buffer=1097728, bufsize=4096, sector=0) at /home/pm215/qemu/tests/libqos/ahci.c:632 #3 0x000000001000b640 in ahci_test_io_rw_simple (ahci=0x1003f3f9400, bufsize=4096, sector=0, read_cmd= 96 '`', write_cmd=97 'a') at /home/pm215/qemu/tests/ahci-test.c:886 #4 0x000000001000d434 in test_ncq_simple () at /home/pm215/qemu/tests/ahci-test.c:1439 #5 0x00000080753db01c in 000000ca.plt_call.strncasecmp@@GLIBC_2.3+0 () from /lib64/libglib-2.0.so.0 #6 0x00000080753db1f8 in 000000ca.plt_call.strncasecmp@@GLIBC_2.3+0 () from /lib64/libglib-2.0.so.0 #7 0x00000080753db1f8 in 000000ca.plt_call.strncasecmp@@GLIBC_2.3+0 () from /lib64/libglib-2.0.so.0 #8 0x00000080753db1f8 in 000000ca.plt_call.strncasecmp@@GLIBC_2.3+0 () from /lib64/libglib-2.0.so.0 #9 0x00000080753db1f8 in 000000ca.plt_call.strncasecmp@@GLIBC_2.3+0 () from /lib64/libglib-2.0.so.0 #10 0x00000080753db6c0 in .g_test_run_suite () from /lib64/libglib-2.0.so.0 #11 0x00000080753db778 in .g_test_run () from /lib64/libglib-2.0.so.0 #12 0x000000001000de70 in main (argc=1, argv=0x3fffeeb5a578) at /home/pm215/qemu/tests/ahci-test.c:1703 If you singlestep, we just loop round and round. Presumably the condition we're expecting just never becomes true. I've only seen this on a ppc64 host system; my x86-64 and arm 'make check' runs have been fine. Have you tested your AHCI qtest code on a big endian system? (Not necessarily the problem, Paolo also said he'd seen an intermittent failure in one of these ahci tests on an x86 host. But the ppc fail seems to be reliably always on the same test.) thanks -- PMM ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [Qemu-devel] qtest hang in /x86_64/ahci/io/ncq/simple (ppc64 host) 2015-07-06 15:35 [Qemu-devel] qtest hang in /x86_64/ahci/io/ncq/simple (ppc64 host) Peter Maydell @ 2015-07-06 15:41 ` John Snow 2015-07-06 16:24 ` Paolo Bonzini 0 siblings, 1 reply; 4+ messages in thread From: John Snow @ 2015-07-06 15:41 UTC (permalink / raw) To: Peter Maydell, QEMU Developers, Paolo Bonzini, Stefan Hajnoczi, qemu-ppc@nongnu.org On 07/06/2015 11:35 AM, Peter Maydell wrote: > I'm seeing a qtest hang in the /x86_64/ahci/io/ncq/simple > test case. It looks like QEMU is running OK, but the qtest test > is busy-looping in ahci_command_wait(): > > #0 ahci_command_wait (ahci=0x1003f3f9400, cmd=0x1003f401810) at > /home/pm215/qemu/tests/libqos/ahci.c:929 > #1 0x000000001001ba10 in ahci_command_issue (ahci=0x1003f3f9400, > cmd=0x1003f401810) > at /home/pm215/qemu/tests/libqos/ahci.c:937 > #2 0x0000000010019f18 in ahci_guest_io (ahci=0x1003f3f9400, port=5 > '\005', ide_cmd=97 'a', buffer=1097728, > bufsize=4096, sector=0) at /home/pm215/qemu/tests/libqos/ahci.c:632 > #3 0x000000001000b640 in ahci_test_io_rw_simple (ahci=0x1003f3f9400, > bufsize=4096, sector=0, read_cmd= > 96 '`', write_cmd=97 'a') at /home/pm215/qemu/tests/ahci-test.c:886 > #4 0x000000001000d434 in test_ncq_simple () at > /home/pm215/qemu/tests/ahci-test.c:1439 > #5 0x00000080753db01c in 000000ca.plt_call.strncasecmp@@GLIBC_2.3+0 > () from /lib64/libglib-2.0.so.0 > #6 0x00000080753db1f8 in 000000ca.plt_call.strncasecmp@@GLIBC_2.3+0 > () from /lib64/libglib-2.0.so.0 > #7 0x00000080753db1f8 in 000000ca.plt_call.strncasecmp@@GLIBC_2.3+0 > () from /lib64/libglib-2.0.so.0 > #8 0x00000080753db1f8 in 000000ca.plt_call.strncasecmp@@GLIBC_2.3+0 > () from /lib64/libglib-2.0.so.0 > #9 0x00000080753db1f8 in 000000ca.plt_call.strncasecmp@@GLIBC_2.3+0 > () from /lib64/libglib-2.0.so.0 > #10 0x00000080753db6c0 in .g_test_run_suite () from /lib64/libglib-2.0.so.0 > #11 0x00000080753db778 in .g_test_run () from /lib64/libglib-2.0.so.0 > #12 0x000000001000de70 in main (argc=1, argv=0x3fffeeb5a578) at > /home/pm215/qemu/tests/ahci-test.c:1703 > > If you singlestep, we just loop round and round. Presumably > the condition we're expecting just never becomes true. > > I've only seen this on a ppc64 host system; my x86-64 and arm > 'make check' runs have been fine. Have you tested your AHCI > qtest code on a big endian system? (Not necessarily the > problem, Paolo also said he'd seen an intermittent failure > in one of these ahci tests on an x86 host. But the ppc fail > seems to be reliably always on the same test.) > > thanks > -- PMM > I'll take a look. More than possible there's a race in the wait conditional that I have just never seen before. I tweaked it a little recently to wait on both NCQ and traditional completion, but it doesn't have a timeout or anything. I'll go ahead and fix that while I'm here. Should be something simple, hopefully. --js ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [Qemu-devel] qtest hang in /x86_64/ahci/io/ncq/simple (ppc64 host) 2015-07-06 15:41 ` John Snow @ 2015-07-06 16:24 ` Paolo Bonzini 2015-07-06 19:21 ` John Snow 0 siblings, 1 reply; 4+ messages in thread From: Paolo Bonzini @ 2015-07-06 16:24 UTC (permalink / raw) To: John Snow, Peter Maydell, QEMU Developers, Stefan Hajnoczi, qemu-ppc@nongnu.org On 06/07/2015 17:41, John Snow wrote: > I'll take a look. More than possible there's a race in the wait > conditional that I have just never seen before. > > I tweaked it a little recently to wait on both NCQ and traditional > completion, but it doesn't have a timeout or anything. I'll go ahead and > fix that while I'm here. > > Should be something simple, hopefully. My hang in test_halted_ncq is different: (gdb) up #1 0x00007f3aaadb4476 in read (__nbytes=1, __buf=0x7ffc0652abbf, __fd=<optimized out>) at /usr/include/bits/unistd.h:44 44 return __read_alias (__fd, __buf, __nbytes); (gdb) #2 qtest_qmp_receive (s=s@entry=0x7f3aac017360) at /home/work/upstream/qemu/tests/libqtest.c:371 371 len = read(s->qmp_fd, &c, 1); (gdb) #3 0x00007f3aaadb4c48 in qtest_qmp_eventwait (s=0x7f3aac017360, event=event@entry=0x7f3aaadbdc68 "STOP") at /home/work/upstream/qemu/tests/libqtest.c:477 477 response = qtest_qmp_receive(s); (gdb) #4 0x00007f3aaadb3514 in qmp_eventwait (event=0x7f3aaadbdc68 "STOP") at /home/work/upstream/qemu/tests/libqtest.h:498 498 return qtest_qmp_eventwait(global_qtest, event); (gdb) #5 ahci_guest_io_halt (ahci=ahci@entry=0x7f3aac014820, port=port@entry=5 '\005', ide_cmd=ide_cmd@entry=97 'a', buffer=buffer@entry=1097728, bufsize=bufsize@entry=4096, sector=sector@entry=0) at /home/work/upstream/qemu/tests/libqos/ahci.c:604 604 qmp_eventwait("STOP"); (gdb) #6 0x00007f3aaada9e09 in ahci_halted_io_test (cmd_read=<optimized out>, cmd_write=<optimized out>) at /home/work/upstream/qemu/tests/ahci-test.c:1244 1244 cmd = ahci_guest_io_halt(ahci, port, cmd_write, In QEMU, current_run_state is RUN_STATE_RUNNING, so the halt didn't happen. It's 100% reproducible with "make check-qtest-{i386,x86_64} V=1 -j12". This is the final part of the qtest log: [R +0.066539] readl 0xe0000390 [S +0.066553] OK 0x0000000000000000 [R +0.066582] memset 0x10b000 0x100 0x00 [S +0.066597] OK [R +0.067627] write 0x10c000 0x1000 0x00fffefdfcfbfaf9f8f7f6f5f4f3f2f1f0efeeedecebeae9e8e7e6e5e4e3e2e1e0dfdedddcdbdad9d8d7d6d5d4d3d2d1d0cfcecdcccbcac9c8c7c6c5c4c3c2c1c0bfbebdbcbbbab9b8b7b6b5b4b3b2b1b0afaeadacabaaa9a8a7a6a5a4a3a2a1a09f9e9d9c9b9a999897969594939291908f8e8d8c8b8a898887868584838281807f7e7d7c7b7a797877767574737271706f6e6d6c6b6a696867666564636261605f5e5d5c5b5a595857565554535251504f4e4d4c4b4a494847464544434241403f3e3d3c3b3a393837363534333231302f2e2d2c2b2a292827262524232221201f1e1d1c1b1a191817161514131211100f0e0d0c0b0a09080706050403020100fffefdfcfbfaf9f8f7f6f5f4f3f2f1f0efeeedecebeae9e8e7e6e5e4e3e2e1e0dfdedddcdbdad9d8d7d6d5d4d3d2d1d0cfcecdcccbcac9c8c7c6c5c4c3c2c1c0bfbebdbcbbbab9b8b7b6b5b4b3b2b1b0afaeadacabaaa9a8a7a6a5a4a3a2a1a09f9e9d9c9b9a999897969594939291908f8e8d8c8b8a898887868584838281807f7e7d7c7b7a797877767574737271706f6e6d6c6b6a696867666564636261605f5e5d5c5b5a595857565554535251504f4e4d4c4b4a494847464544434241403f3e3d3c3b3a393837363534333231302f2e2d2c2b2a292827262524232221201f1e1d1c1b1a191817161514131211100f0 e0d0c0b0a09080706050403020100fffefdfcfbfaf9f8f7f6f5f4f3f2f1f0efeeedecebeae9e8e7e6e5e4e3e2e1e0dfdedddcdbdad9d8d7d6d5d4d3d2d1d0cfcecdcccbcac9c8c7c6c5c4c3c2c1c0bfbebdbcbbbab9b8b7b6b5b4b3b2b1b0afaeadacabaaa9a8a7a6a5a4a3a2a1a09f9e9d9c9b9a999897969594939291908f8e8d8c8b8a898887868584838281807f7e7d7c7b7a797877767574737271706f6e6d6c6b6a696867666564636261605f5e5d5c5b5a595857565554535251504f4e4d4c4b4a494847464544434241403f3e3d3c3b3a393837363534333231302f2e2d2c2b2a292827262524232221201f1e1d1c1b1a191817161514131211100f0e0d0c0b0a09080706050403020100fffefdfcfbfaf9f8f7f6f5f4f3f2f1f0efeeedecebeae9e8e7e6e5e4e3e2e1e0dfdedddcdbdad9d8d7d6d5d4d3d2d1d0cfcecdcccbcac9c8c7c6c5c4c3c2c1c0bfbebdbcbbbab9b8b7b6b5b4b3b2b1b0afaeadacabaaa9a8a7a6a5a4a3a2a1a09f9e9d9c9b9a999897969594939291908f8e8d8c8b8a898887868584838281807f7e7d7c7b7a797877767574737271706f6e6d6c6b6a696867666564636261605f5e5d5c5b5a595857565554535251504f4e4d4c4b4a494847464544434241403f3e3d3c3b3a393837363534333231302f2e2d2c2b2a292827262524232221201f1e1d1c1b 1a191817161514131211100f0e0d0c0b0a09080706050403020100fffefdfcfbfaf9f8f7f6f5f4f3f2f1f0efeeedecebeae9e8e7e6e5e4e3e2e1e0dfdedddcdbdad9d8d7d6d5d4d3d2d1d0cfcecdcccbcac9c8c7c6c5c4c3c2c1c0bfbebdbcbbbab9b8b7b6b5b4b3b2b1b0afaeadacabaaa9a8a7a6a5a4a3a2a1a09f9e9d9c9b9a999897969594939291908f8e8d8c8b8a898887868584838281807f7e7d7c7b7a797877767574737271706f6e6d6c6b6a696867666564636261605f5e5d5c5b5a595857565554535251504f4e4d4c4b4a494847464544434241403f3e3d3c3b3a393837363534333231302f2e2d2c2b2a292827262524232221201f1e1d1c1b1a191817161514131211100f0e0d0c0b0a09080706050403020100fffefdfcfbfaf9f8f7f6f5f4f3f2f1f0efeeedecebeae9e8e7e6e5e4e3e2e1e0dfdedddcdbdad9d8d7d6d5d4d3d2d1d0cfcecdcccbcac9c8c7c6c5c4c3c2c1c0bfbebdbcbbbab9b8b7b6b5b4b3b2b1b0afaeadacabaaa9a8a7a6a5a4a3a2a1a09f9e9d9c9b9a999897969594939291908f8e8d8c8b8a898887868584838281807f7e7d7c7b7a797877767574737271706f6e6d6c6b6a696867666564636261605f5e5d5c5b5a595857565554535251504f4e4d4c4b4a494847464544434241403f3e3d3c3b3a393837363534333231302f2e2d2c2b2a29282 7262524232221201f1e1d1c1b1a191817161514131211100f0e0d0c0b0a09080706050403020100fffefdfcfbfaf9f8f7f6f5f4f3f2f1f0efeeedecebeae9e8e7e6e5e4e3e2e1e0dfdedddcdbdad9d8d7d6d5d4d3d2d1d0cfcecdcccbcac9c8c7c6c5c4c3c2c1c0bfbebdbcbbbab9b8b7b6b5b4b3b2b1b0afaeadacabaaa9a8a7a6a5a4a3a2a1a09f9e9d9c9b9a999897969594939291908f8e8d8c8b8a898887868584838281807f7e7d7c7b7a797877767574737271706f6e6d6c6b6a696867666564636261605f5e5d5c5b5a595857565554535251504f4e4d4c4b4a494847464544434241403f3e3d3c3b3a393837363534333231302f2e2d2c2b2a292827262524232221201f1e1d1c1b1a191817161514131211100f0e0d0c0b0a09080706050403020100fffefdfcfbfaf9f8f7f6f5f4f3f2f1f0efeeedecebeae9e8e7e6e5e4e3e2e1e0dfdedddcdbdad9d8d7d6d5d4d3d2d1d0cfcecdcccbcac9c8c7c6c5c4c3c2c1c0bfbebdbcbbbab9b8b7b6b5b4b3b2b1b0afaeadacabaaa9a8a7a6a5a4a3a2a1a09f9e9d9c9b9a999897969594939291908f8e8d8c8b8a898887868584838281807f7e7d7c7b7a797877767574737271706f6e6d6c6b6a696867666564636261605f5e5d5c5b5a595857565554535251504f4e4d4c4b4a494847464544434241403f3e3d3c3b3a393837363534 333231302f2e2d2c2b2a292827262524232221201f1e1d1c1b1a191817161514131211100f0e0d0c0b0a09080706050403020100fffefdfcfbfaf9f8f7f6f5f4f3f2f1f0efeeedecebeae9e8e7e6e5e4e3e2e1e0dfdedddcdbdad9d8d7d6d5d4d3d2d1d0cfcecdcccbcac9c8c7c6c5c4c3c2c1c0bfbebdbcbbbab9b8b7b6b5b4b3b2b1b0afaeadacabaaa9a8a7a6a5a4a3a2a1a09f9e9d9c9b9a999897969594939291908f8e8d8c8b8a898887868584838281807f7e7d7c7b7a797877767574737271706f6e6d6c6b6a696867666564636261605f5e5d5c5b5a595857565554535251504f4e4d4c4b4a494847464544434241403f3e3d3c3b3a393837363534333231302f2e2d2c2b2a292827262524232221201f1e1d1c1b1a191817161514131211100f0e0d0c0b0a09080706050403020100fffefdfcfbfaf9f8f7f6f5f4f3f2f1f0efeeedecebeae9e8e7e6e5e4e3e2e1e0dfdedddcdbdad9d8d7d6d5d4d3d2d1d0cfcecdcccbcac9c8c7c6c5c4c3c2c1c0bfbebdbcbbbab9b8b7b6b5b4b3b2b1b0afaeadacabaaa9a8a7a6a5a4a3a2a1a09f9e9d9c9b9a999897969594939291908f8e8d8c8b8a898887868584838281807f7e7d7c7b7a797877767574737271706f6e6d6c6b6a696867666564636261605f5e5d5c5b5a595857565554535251504f4e4d4c4b4a4948474645444342414 03f3e3d3c3b3a393837363534333231302f2e2d2c2b2a292827262524232221201f1e1d1c1b1a191817161514131211100f0e0d0c0b0a09080706050403020100fffefdfcfbfaf9f8f7f6f5f4f3f2f1f0efeeedecebeae9e8e7e6e5e4e3e2e1e0dfdedddcdbdad9d8d7d6d5d4d3d2d1d0cfcecdcccbcac9c8c7c6c5c4c3c2c1c0bfbebdbcbbbab9b8b7b6b5b4b3b2b1b0afaeadacabaaa9a8a7a6a5a4a3a2a1a09f9e9d9c9b9a999897969594939291908f8e8d8c8b8a898887868584838281807f7e7d7c7b7a797877767574737271706f6e6d6c6b6a696867666564636261605f5e5d5c5b5a595857565554535251504f4e4d4c4b4a494847464544434241403f3e3d3c3b3a393837363534333231302f2e2d2c2b2a292827262524232221201f1e1d1c1b1a191817161514131211100f0e0d0c0b0a09080706050403020100fffefdfcfbfaf9f8f7f6f5f4f3f2f1f0efeeedecebeae9e8e7e6e5e4e3e2e1e0dfdedddcdbdad9d8d7d6d5d4d3d2d1d0cfcecdcccbcac9c8c7c6c5c4c3c2c1c0bfbebdbcbbbab9b8b7b6b5b4b3b2b1b0afaeadacabaaa9a8a7a6a5a4a3a2a1a09f9e9d9c9b9a999897969594939291908f8e8d8c8b8a898887868584838281807f7e7d7c7b7a797877767574737271706f6e6d6c6b6a696867666564636261605f5e5d5c5b5a595857565554535251504f4e4d 4c4b4a494847464544434241403f3e3d3c3b3a393837363534333231302f2e2d2c2b2a292827262524232221201f1e1d1c1b1a191817161514131211100f0e0d0c0b0a09080706050403020100fffefdfcfbfaf9f8f7f6f5f4f3f2f1f0efeeedecebeae9e8e7e6e5e4e3e2e1e0dfdedddcdbdad9d8d7d6d5d4d3d2d1d0cfcecdcccbcac9c8c7c6c5c4c3c2c1c0bfbebdbcbbbab9b8b7b6b5b4b3b2b1b0afaeadacabaaa9a8a7a6a5a4a3a2a1a09f9e9d9c9b9a999897969594939291908f8e8d8c8b8a898887868584838281807f7e7d7c7b7a797877767574737271706f6e6d6c6b6a696867666564636261605f5e5d5c5b5a595857565554535251504f4e4d4c4b4a494847464544434241403f3e3d3c3b3a393837363534333231302f2e2d2c2b2a292827262524232221201f1e1d1c1b1a191817161514131211100f0e0d0c0b0a09080706050403020100fffefdfcfbfaf9f8f7f6f5f4f3f2f1f0efeeedecebeae9e8e7e6e5e4e3e2e1e0dfdedddcdbdad9d8d7d6d5d4d3d2d1d0cfcecdcccbcac9c8c7c6c5c4c3c2c1c0bfbebdbcbbbab9b8b7b6b5b4b3b2b1b0afaeadacabaaa9a8a7a6a5a4a3a2a1a09f9e9d9c9b9a999897969594939291908f8e8d8c8b8a898887868584838281807f7e7d7c7b7a797877767574737271706f6e6d6c6b6a696867666564636261605f5e5d5c5b5a5 95857565554535251504f4e4d4c4b4a494847464544434241403f3e3d3c3b3a393837363534333231302f2e2d2c2b2a292827262524232221201f1e1d1c1b1a191817161514131211100f0e0d0c0b0a09080706050403020100fffefdfcfbfaf9f8f7f6f5f4f3f2f1f0efeeedecebeae9e8e7e6e5e4e3e2e1e0dfdedddcdbdad9d8d7d6d5d4d3d2d1d0cfcecdcccbcac9c8c7c6c5c4c3c2c1c0bfbebdbcbbbab9b8b7b6b5b4b3b2b1b0afaeadacabaaa9a8a7a6a5a4a3a2a1a09f9e9d9c9b9a999897969594939291908f8e8d8c8b8a898887868584838281807f7e7d7c7b7a797877767574737271706f6e6d6c6b6a696867666564636261605f5e5d5c5b5a595857565554535251504f4e4d4c4b4a494847464544434241403f3e3d3c3b3a393837363534333231302f2e2d2c2b2a292827262524232221201f1e1d1c1b1a191817161514131211100f0e0d0c0b0a09080706050403020100fffefdfcfbfaf9f8f7f6f5f4f3f2f1f0efeeedecebeae9e8e7e6e5e4e3e2e1e0dfdedddcdbdad9d8d7d6d5d4d3d2d1d0cfcecdcccbcac9c8c7c6c5c4c3c2c1c0bfbebdbcbbbab9b8b7b6b5b4b3b2b1b0afaeadacabaaa9a8a7a6a5a4a3a2a1a09f9e9d9c9b9a999897969594939291908f8e8d8c8b8a898887868584838281807f7e7d7c7b7a797877767574737271706f6e6d6c6b6a69686766 6564636261605f5e5d5c5b5a595857565554535251504f4e4d4c4b4a494847464544434241403f3e3d3c3b3a393837363534333231302f2e2d2c2b2a292827262524232221201f1e1d1c1b1a191817161514131211100f0e0d0c0b0a090807060504030201 [S +0.067748] OK [R +0.067782] readl 0xe00003b8 [S +0.067796] OK 0x0000000000000000 [R +0.067824] read 0x10a020 0x20 [S +0.067843] OK 0x0000000000000000000000000000000000000000000000000000000000000000 [R +0.067874] write 0x10a020 0x20 0x0000000000000000000000000000000000000000000000000000000000000000 [S +0.067897] OK [R +0.067926] write 0x10a020 0x20 0x450401000000000000e010000000000000000000000000000000000000000000 [S +0.067940] OK [R +0.067964] write 0x10e000 0x14 0x2780610800000040000000000800000000000000 [S +0.067980] OK [R +0.068004] write 0x10e080 0x10 0x00c010000000000000000000ff0f0080 [S +0.068018] OK [R +0.068042] writel 0xe00003b4 0x2 [S +0.068056] OK [R +0.068079] writel 0xe00003b8 0x2 [S +0.069500] OK This is the /tmp/qtest-blkdebug.INHp7s file: [inject-error] event = "write_aio" errno = "5" state = "1" immediately = "off" once = "on" [set-state] event = "write_aio" new_state = "2" There is no trace of the write payload in the .qcow2 file. Paolo ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [Qemu-devel] qtest hang in /x86_64/ahci/io/ncq/simple (ppc64 host) 2015-07-06 16:24 ` Paolo Bonzini @ 2015-07-06 19:21 ` John Snow 0 siblings, 0 replies; 4+ messages in thread From: John Snow @ 2015-07-06 19:21 UTC (permalink / raw) To: Paolo Bonzini, Peter Maydell, QEMU Developers, Stefan Hajnoczi, qemu-ppc@nongnu.org On 07/06/2015 12:24 PM, Paolo Bonzini wrote: > > > On 06/07/2015 17:41, John Snow wrote: >> I'll take a look. More than possible there's a race in the wait >> conditional that I have just never seen before. >> >> I tweaked it a little recently to wait on both NCQ and traditional >> completion, but it doesn't have a timeout or anything. I'll go ahead and >> fix that while I'm here. >> >> Should be something simple, hopefully. The one Peter Maydell reported is indeed something simple. Sent out a patch. > > My hang in test_halted_ncq is different: ... Great :) > > (gdb) up > #1 0x00007f3aaadb4476 in read (__nbytes=1, __buf=0x7ffc0652abbf, > __fd=<optimized out>) at /usr/include/bits/unistd.h:44 > 44 return __read_alias (__fd, __buf, __nbytes); > (gdb) > #2 qtest_qmp_receive (s=s@entry=0x7f3aac017360) at > /home/work/upstream/qemu/tests/libqtest.c:371 > 371 len = read(s->qmp_fd, &c, 1); > (gdb) > #3 0x00007f3aaadb4c48 in qtest_qmp_eventwait (s=0x7f3aac017360, > event=event@entry=0x7f3aaadbdc68 "STOP") at > /home/work/upstream/qemu/tests/libqtest.c:477 > 477 response = qtest_qmp_receive(s); > (gdb) > #4 0x00007f3aaadb3514 in qmp_eventwait (event=0x7f3aaadbdc68 "STOP") at > /home/work/upstream/qemu/tests/libqtest.h:498 > 498 return qtest_qmp_eventwait(global_qtest, event); > (gdb) > #5 ahci_guest_io_halt (ahci=ahci@entry=0x7f3aac014820, > port=port@entry=5 '\005', ide_cmd=ide_cmd@entry=97 'a', > buffer=buffer@entry=1097728, bufsize=bufsize@entry=4096, > sector=sector@entry=0) > at /home/work/upstream/qemu/tests/libqos/ahci.c:604 > 604 qmp_eventwait("STOP"); > (gdb) > #6 0x00007f3aaada9e09 in ahci_halted_io_test (cmd_read=<optimized out>, > cmd_write=<optimized out>) at > /home/work/upstream/qemu/tests/ahci-test.c:1244 > 1244 cmd = ahci_guest_io_halt(ahci, port, cmd_write, > > > In QEMU, current_run_state is RUN_STATE_RUNNING, so the halt didn't > happen. It's 100% reproducible with "make check-qtest-{i386,x86_64} V=1 > -j12". This is the final part of the qtest log: > > [R +0.066539] readl 0xe0000390 > [S +0.066553] OK 0x0000000000000000 > [R +0.066582] memset 0x10b000 0x100 0x00 > [S +0.066597] OK > [R +0.067627] write 0x10c000 0x1000 [snip] > [S +0.067748] OK > [R +0.067782] readl 0xe00003b8 > [S +0.067796] OK 0x0000000000000000 > [R +0.067824] read 0x10a020 0x20 > [S +0.067843] OK > 0x0000000000000000000000000000000000000000000000000000000000000000 > [R +0.067874] write 0x10a020 0x20 > 0x0000000000000000000000000000000000000000000000000000000000000000 > [S +0.067897] OK > [R +0.067926] write 0x10a020 0x20 > 0x450401000000000000e010000000000000000000000000000000000000000000 > [S +0.067940] OK > [R +0.067964] write 0x10e000 0x14 0x2780610800000040000000000800000000000000 > [S +0.067980] OK > [R +0.068004] write 0x10e080 0x10 0x00c010000000000000000000ff0f0080 > [S +0.068018] OK > [R +0.068042] writel 0xe00003b4 0x2 > [S +0.068056] OK > [R +0.068079] writel 0xe00003b8 0x2 > [S +0.069500] OK > > This is the /tmp/qtest-blkdebug.INHp7s file: > > [inject-error] > event = "write_aio" > errno = "5" > state = "1" > immediately = "off" > once = "on" > [set-state] > event = "write_aio" > new_state = "2" > > There is no trace of the write payload in the .qcow2 file. Smells like the injected error prevented that successfully, but... > > Paolo > Can't reproduce: Fedora 21, 4.0.5, Today's QEMU HEAD (7edd8e56) ... So it's a race condition of some sort. Obvious questions: You've tried a distclean? What was your configure invocation? My attempt: make distclean ../../configure --target-list="x86_64-softmmu i386-softmmu" --enable-debug make -j9 make check-qtest-{i386,x86_64} V=1 -j12 Completes just fine for me. That aside, I can't figure out why the machine HALT *and* the write wouldn't occur. Can you stick a bunch of printfs inside the NCQ error handling function and see if there's something goofing up in there, or if it's even making it there? diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c index bb6a92f..f1af6f9 100644 --- a/hw/ide/ahci.c +++ b/hw/ide/ahci.c @@ -921,6 +921,7 @@ static void ncq_cb(void *opaque, int ret) IDEState *ide_state = &ncq_tfs->drive->port.ifs[0]; if (ret == -ECANCELED) { + fprintf(stderr, "ncq_cb: -ECANCELED\n"); return; } @@ -928,12 +929,17 @@ static void ncq_cb(void *opaque, int ret) bool is_read = ncq_tfs->cmd == READ_FPDMA_QUEUED; BlockErrorAction action = blk_get_error_action(ide_state->blk, is_read, -ret); + fprintf(stderr, "ncq_cb: ret (%d) is < 0; action is %s\n", + ret, BlockErrorAction_lookup[action]); if (action == BLOCK_ERROR_ACTION_STOP) { ncq_tfs->halt = true; ide_state->bus->error_status = IDE_RETRY_HBA; + fprintf(stderr, "ncq_cb: BLOCK_ERROR_ACTION_STOP\n"); } else if (action == BLOCK_ERROR_ACTION_REPORT) { ncq_err(ncq_tfs); + fprintf(stderr, "ncq_cb: BLOCK_ERROR_ACTION_REPORT\n"); } + fprintf(stderr, "ncq_cb: -> blk_error_action\n"); blk_error_action(ide_state->blk, action, is_read, -ret); } else { ide_state->status = READY_STAT | SEEK_STAT; My guess is I have goofed something up above there in the error handling section of ncq_cb, but "worksforme," so I am not sure what's going on. If no prints show up during testing, maybe turn on AHCI_DEBUG as well and see if any prints are showing up in the success case. --js ^ permalink raw reply related [flat|nested] 4+ messages in thread
end of thread, other threads:[~2015-07-06 19:21 UTC | newest] Thread overview: 4+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-07-06 15:35 [Qemu-devel] qtest hang in /x86_64/ahci/io/ncq/simple (ppc64 host) Peter Maydell 2015-07-06 15:41 ` John Snow 2015-07-06 16:24 ` Paolo Bonzini 2015-07-06 19:21 ` John Snow
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).