* iotest40 problem
@ 2022-03-24 13:47 Li Zhang
2022-03-25 10:17 ` Li Zhang
0 siblings, 1 reply; 11+ messages in thread
From: Li Zhang @ 2022-03-24 13:47 UTC (permalink / raw)
To: QEMU Developers
Hi,
When I run the testsuit on our buidling system, it reports a timeout
sometimes not always as the following.
It couldn't connect qmp socket. Any ideas about this problem?
[ 1989s] ---
/home/abuild/rpmbuild/BUILD/qemu-6.2.0/tests/qemu-iotests/040.out
[ 1989s] +++ 040.out.bad
[ 1989s] @@ -1,5 +1,55 @@
[ 1989s] -.................................................................
[ 1989s] +....ERROR:qemu.aqmp.qmp_client.qemu-6471:Failed to establish
connection: asyncio.exceptions.CancelledError
[ 1989s]
+E..................................ERROR:qemu.aqmp.qmp_client.qemu-6471:Failed
to establish connection: asyncio.exceptions.CancelledError
[ 1989s] +E.........................
[ 1989s]
+======================================================================
[ 1989s] +ERROR: test_commit_node (__main__.TestActiveZeroLengthImage)
[ 1989s]
+----------------------------------------------------------------------
[ 1989s] +Traceback (most recent call last):
[ 1989s] + File
"/home/abuild/rpmbuild/BUILD/qemu-6.2.0/tests/qemu-iotests/040", line
94, in setUp
[ 1989s] + self.vm.launch()
[ 1989s] + File
"/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
line 399, in launch
[ 1989s] + self._launch()
[ 1989s] + File
"/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
line 434, in _launch
[ 1989s] + self._post_launch()
[ 1989s] + File
"/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/qtest.py",
line 147, in _post_launch
[ 1989s] + super()._post_launch()
[ 1989s] + File
"/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
line 340, in _post_launch
[ 1989s] + self._qmp.accept(self._qmp_timer)
[ 1989s] + File
"/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
line 67, in accept
[ 1989s] + self._sync(
[ 1989s] + File
"/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
line 41, in _sync
[ 1989s] + return self._aloop.run_until_complete(
[ 1989s] + File "/usr/lib64/python3.8/asyncio/base_events.py", line
616, in run_until_complete
[ 1989s] + return future.result()
[ 1989s] + File "/usr/lib64/python3.8/asyncio/tasks.py", line 501, in
wait_for
[ 1989s] + raise exceptions.TimeoutError()
[ 1989s] +asyncio.exceptions.TimeoutError
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: iotest40 problem
2022-03-24 13:47 iotest40 problem Li Zhang
@ 2022-03-25 10:17 ` Li Zhang
2022-03-29 16:49 ` Li Zhang
0 siblings, 1 reply; 11+ messages in thread
From: Li Zhang @ 2022-03-25 10:17 UTC (permalink / raw)
To: QEMU Developers, kwolf, Hanna Reitz
Hi,
I backport some iotests patches to the tree and change timeout.
It doesn't work.
Sometimes, iotest41 also reports the errors.
[ 1347s]
+======================================================================
[ 1347s] +ERROR: test_top_node_in_wrong_chain (__main__.TestSingleDrive)
[ 1347s]
+----------------------------------------------------------------------
[ 1347s] +Traceback (most recent call last):
[ 1347s] + File
"/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
line 399, in launch
[ 1347s] + self._launch()
[ 1347s] + File
"/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
line 434, in _launch
[ 1347s] + self._post_launch()
[ 1347s] + File
"/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/qtest.py",
line 147, in _post_launch
[ 1347s] + super()._post_launch()
[ 1347s] + File
"/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
line 340, in _post_launch
[ 1347s] + self._qmp.accept(self._qmp_timer)
[ 1347s] + File
"/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
line 69, in accept
[ 1347s] + timeout
[ 1347s] + File
"/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
line 42, in _sync
[ 1347s] + asyncio.wait_for(future, timeout=timeout)
[ 1347s] + File "/usr/lib64/python3.6/asyncio/base_events.py", line
488, in run_until_complete
[ 1347s] + return future.result()
[ 1347s] + File "/usr/lib64/python3.6/asyncio/tasks.py", line 362, in
wait_for
[ 1347s] + raise futures.TimeoutError()
[ 1347s] +concurrent.futures._base.TimeoutError
I can see other errors like this, it's the problem of the socket.
[ 1535s] socket_accept failed: Resource temporarily unavailable
[ 1535s] **
[ 1535s]
ERROR:../tests/qtest/libqtest.c:321:qtest_init_without_qmp_handshake:
assertion failed: (s->fd >= 0 && s->qmp_fd >= 0)
The script is running the command as this:
/usr/bin/make -O -j4 check-block V=1
I can see the errors on ppc, arm or x86.
But I couldn't reproduce it when I run it manually.
It will be appreciated if any suggestions. Thanks.
On 3/24/22 14:47, Li Zhang wrote:
> Hi,
>
> When I run the testsuit on our buidling system, it reports a timeout
> sometimes not always as the following.
> It couldn't connect qmp socket. Any ideas about this problem?
>
>
> [ 1989s] ---
> /home/abuild/rpmbuild/BUILD/qemu-6.2.0/tests/qemu-iotests/040.out
> [ 1989s] +++ 040.out.bad
> [ 1989s] @@ -1,5 +1,55 @@
> [ 1989s] -.................................................................
> [ 1989s] +....ERROR:qemu.aqmp.qmp_client.qemu-6471:Failed to establish
> connection: asyncio.exceptions.CancelledError
> [ 1989s]
> +E..................................ERROR:qemu.aqmp.qmp_client.qemu-6471:Failed
> to establish connection: asyncio.exceptions.CancelledError
> [ 1989s] +E.........................
> [ 1989s]
> +======================================================================
> [ 1989s] +ERROR: test_commit_node (__main__.TestActiveZeroLengthImage)
> [ 1989s]
> +----------------------------------------------------------------------
> [ 1989s] +Traceback (most recent call last):
> [ 1989s] + File
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/tests/qemu-iotests/040", line
> 94, in setUp
> [ 1989s] + self.vm.launch()
> [ 1989s] + File
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
> line 399, in launch
> [ 1989s] + self._launch()
> [ 1989s] + File
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
> line 434, in _launch
> [ 1989s] + self._post_launch()
> [ 1989s] + File
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/qtest.py",
> line 147, in _post_launch
> [ 1989s] + super()._post_launch()
> [ 1989s] + File
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
> line 340, in _post_launch
> [ 1989s] + self._qmp.accept(self._qmp_timer)
> [ 1989s] + File
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
> line 67, in accept
> [ 1989s] + self._sync(
> [ 1989s] + File
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
> line 41, in _sync
> [ 1989s] + return self._aloop.run_until_complete(
> [ 1989s] + File "/usr/lib64/python3.8/asyncio/base_events.py", line
> 616, in run_until_complete
> [ 1989s] + return future.result()
> [ 1989s] + File "/usr/lib64/python3.8/asyncio/tasks.py", line 501, in
> wait_for
> [ 1989s] + raise exceptions.TimeoutError()
> [ 1989s] +asyncio.exceptions.TimeoutError
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: iotest40 problem
2022-03-25 10:17 ` Li Zhang
@ 2022-03-29 16:49 ` Li Zhang
2022-03-31 6:10 ` Hanna Reitz
0 siblings, 1 reply; 11+ messages in thread
From: Li Zhang @ 2022-03-29 16:49 UTC (permalink / raw)
To: QEMU Developers, kwolf, Hanna Reitz
Update what I observed.
It seems that aqmp is not stable when running test cases.
So I revert the patches as the following, iotest40/41 test cases work well.
commit 76cd358671e6b8e7c435ec65b1c44200254514a9
Author: John Snow <jsnow@redhat.com>
Date: Tue Oct 26 13:56:12 2021 -0400
python, iotests: replace qmp with aqmp
Swap out the synchronous QEMUMonitorProtocol from qemu.qmp with the
sync
wrapper from qemu.aqmp instead.
Add an escape hatch in the form of the environment variable
QEMU_PYTHON_LEGACY_QMP which allows you to cajole QEMUMachine into
using
the old implementation, proving that both implementations work
concurrently.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-id: 20211026175612.4127598-9-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
commit 1611e6cf4e7163f6102b37010a8b7e7120f468b5
Author: John Snow <jsnow@redhat.com>
Date: Thu Nov 18 15:46:18 2021 -0500
python/machine: handle "fast" QEMU terminations
In the case that the QEMU process actually launches -- but then dies so
quickly that we can't establish a QMP connection to it -- QEMUMachine
currently calls _post_shutdown() assuming that it never launched the VM
process.
This isn't true, though: it "merely" may have failed to establish a QMP
connection and the process is in the middle of its own exit path.
If we don't wait for the subprocess, the caller may get a bogus `None`
return for .exitcode(). This behavior was observed from
device-crash-test; after the switch to Async QMP, the timings were
changed such that it was now seemingly possible to witness the failure
of "vm.launch()" *prior* to the exitcode becoming available.
The semantic of the `_launched` property is changed in this
patch. Instead of representing the condition "launch() executed
successfully", it will now represent "has forked a child process
successfully". This way, wait() when called in the exit path won't
become a no-op.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Willian Rampazzo <willianr@redhat.com>
Message-id: 20211118204620.1897674-6-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
On 3/25/22 11:17, Li Zhang wrote:
> Hi,
>
> I backport some iotests patches to the tree and change timeout.
> It doesn't work.
>
> Sometimes, iotest41 also reports the errors.
> [ 1347s]
> +======================================================================
> [ 1347s] +ERROR: test_top_node_in_wrong_chain (__main__.TestSingleDrive)
> [ 1347s]
> +----------------------------------------------------------------------
> [ 1347s] +Traceback (most recent call last):
> [ 1347s] + File
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
> line 399, in launch
> [ 1347s] + self._launch()
> [ 1347s] + File
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
> line 434, in _launch
> [ 1347s] + self._post_launch()
> [ 1347s] + File
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/qtest.py",
> line 147, in _post_launch
> [ 1347s] + super()._post_launch()
> [ 1347s] + File
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
> line 340, in _post_launch
> [ 1347s] + self._qmp.accept(self._qmp_timer)
> [ 1347s] + File
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
> line 69, in accept
> [ 1347s] + timeout
> [ 1347s] + File
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
> line 42, in _sync
> [ 1347s] + asyncio.wait_for(future, timeout=timeout)
> [ 1347s] + File "/usr/lib64/python3.6/asyncio/base_events.py", line
> 488, in run_until_complete
> [ 1347s] + return future.result()
> [ 1347s] + File "/usr/lib64/python3.6/asyncio/tasks.py", line 362, in
> wait_for
> [ 1347s] + raise futures.TimeoutError()
> [ 1347s] +concurrent.futures._base.TimeoutError
>
>
> I can see other errors like this, it's the problem of the socket.
>
> [ 1535s] socket_accept failed: Resource temporarily unavailable
> [ 1535s] **
> [ 1535s]
> ERROR:../tests/qtest/libqtest.c:321:qtest_init_without_qmp_handshake:
> assertion failed: (s->fd >= 0 && s->qmp_fd >= 0)
>
>
> The script is running the command as this:
> /usr/bin/make -O -j4 check-block V=1
>
> I can see the errors on ppc, arm or x86.
> But I couldn't reproduce it when I run it manually.
>
> It will be appreciated if any suggestions. Thanks.
>
>
> On 3/24/22 14:47, Li Zhang wrote:
>> Hi,
>>
>> When I run the testsuit on our buidling system, it reports a timeout
>> sometimes not always as the following.
>> It couldn't connect qmp socket. Any ideas about this problem?
>>
>>
>> [ 1989s] ---
>> /home/abuild/rpmbuild/BUILD/qemu-6.2.0/tests/qemu-iotests/040.out
>> [ 1989s] +++ 040.out.bad
>> [ 1989s] @@ -1,5 +1,55 @@
>> [ 1989s]
>> -.................................................................
>> [ 1989s] +....ERROR:qemu.aqmp.qmp_client.qemu-6471:Failed to establish
>> connection: asyncio.exceptions.CancelledError
>> [ 1989s]
>> +E..................................ERROR:qemu.aqmp.qmp_client.qemu-6471:Failed
>> to establish connection: asyncio.exceptions.CancelledError
>> [ 1989s] +E.........................
>> [ 1989s]
>> +======================================================================
>> [ 1989s] +ERROR: test_commit_node (__main__.TestActiveZeroLengthImage)
>> [ 1989s]
>> +----------------------------------------------------------------------
>> [ 1989s] +Traceback (most recent call last):
>> [ 1989s] + File
>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/tests/qemu-iotests/040", line
>> 94, in setUp
>> [ 1989s] + self.vm.launch()
>> [ 1989s] + File
>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>> line 399, in launch
>> [ 1989s] + self._launch()
>> [ 1989s] + File
>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>> line 434, in _launch
>> [ 1989s] + self._post_launch()
>> [ 1989s] + File
>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/qtest.py",
>> line 147, in _post_launch
>> [ 1989s] + super()._post_launch()
>> [ 1989s] + File
>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>> line 340, in _post_launch
>> [ 1989s] + self._qmp.accept(self._qmp_timer)
>> [ 1989s] + File
>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
>> line 67, in accept
>> [ 1989s] + self._sync(
>> [ 1989s] + File
>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
>> line 41, in _sync
>> [ 1989s] + return self._aloop.run_until_complete(
>> [ 1989s] + File "/usr/lib64/python3.8/asyncio/base_events.py", line
>> 616, in run_until_complete
>> [ 1989s] + return future.result()
>> [ 1989s] + File "/usr/lib64/python3.8/asyncio/tasks.py", line 501, in
>> wait_for
>> [ 1989s] + raise exceptions.TimeoutError()
>> [ 1989s] +asyncio.exceptions.TimeoutError
>>
>
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: iotest40 problem
2022-03-29 16:49 ` Li Zhang
@ 2022-03-31 6:10 ` Hanna Reitz
2022-03-31 10:44 ` Li Zhang
0 siblings, 1 reply; 11+ messages in thread
From: Hanna Reitz @ 2022-03-31 6:10 UTC (permalink / raw)
To: Li Zhang, QEMU Developers, kwolf, John Snow
On 29.03.22 18:49, Li Zhang wrote:
> Update what I observed.
>
> It seems that aqmp is not stable when running test cases.
> So I revert the patches as the following, iotest40/41 test cases work
> well.
Thanks for bisecting. I haven’t seen this problem before, so I didn’t
look into it; CC-ing John, perhaps he figures something.
Hanna
> commit 76cd358671e6b8e7c435ec65b1c44200254514a9
>
> Author: John Snow <jsnow@redhat.com>
>
> Date: Tue Oct 26 13:56:12 2021 -0400
>
>
>
> python, iotests: replace qmp with aqmp
>
>
>
> Swap out the synchronous QEMUMonitorProtocol from qemu.qmp with
> the sync
>
> wrapper from qemu.aqmp instead.
>
>
>
> Add an escape hatch in the form of the environment variable
>
> QEMU_PYTHON_LEGACY_QMP which allows you to cajole QEMUMachine into
> using
>
> the old implementation, proving that both implementations work
>
> concurrently.
>
>
>
> Signed-off-by: John Snow <jsnow@redhat.com>
>
> Reviewed-by: Kevin Wolf <kwolf@redhat.com>
>
> Reviewed-by: Hanna Reitz <hreitz@redhat.com>
>
> Message-id: 20211026175612.4127598-9-jsnow@redhat.com
>
> Signed-off-by: John Snow <jsnow@redhat.com>
>
>
> commit 1611e6cf4e7163f6102b37010a8b7e7120f468b5
>
> Author: John Snow <jsnow@redhat.com>
>
> Date: Thu Nov 18 15:46:18 2021 -0500
>
>
>
> python/machine: handle "fast" QEMU terminations
>
>
>
> In the case that the QEMU process actually launches -- but then
> dies so
>
> quickly that we can't establish a QMP connection to it -- QEMUMachine
>
> currently calls _post_shutdown() assuming that it never launched
> the VM
>
> process.
>
>
>
> This isn't true, though: it "merely" may have failed to establish
> a QMP
>
> connection and the process is in the middle of its own exit path.
>
>
>
> If we don't wait for the subprocess, the caller may get a bogus
> `None`
>
> return for .exitcode(). This behavior was observed from
>
> device-crash-test; after the switch to Async QMP, the timings were
>
> changed such that it was now seemingly possible to witness the
> failure
>
> of "vm.launch()" *prior* to the exitcode becoming available.
>
>
>
> The semantic of the `_launched` property is changed in this
>
> patch. Instead of representing the condition "launch() executed
>
> successfully", it will now represent "has forked a child process
>
> successfully". This way, wait() when called in the exit path won't
>
> become a no-op.
>
>
>
> Signed-off-by: John Snow <jsnow@redhat.com>
>
> Reviewed-by: Willian Rampazzo <willianr@redhat.com>
>
> Message-id: 20211118204620.1897674-6-jsnow@redhat.com
>
> Signed-off-by: John Snow <jsnow@redhat.com>
>
>
>
>
>
> On 3/25/22 11:17, Li Zhang wrote:
>> Hi,
>>
>> I backport some iotests patches to the tree and change timeout.
>> It doesn't work.
>>
>> Sometimes, iotest41 also reports the errors.
>> [ 1347s]
>> +======================================================================
>> [ 1347s] +ERROR: test_top_node_in_wrong_chain (__main__.TestSingleDrive)
>> [ 1347s]
>> +----------------------------------------------------------------------
>> [ 1347s] +Traceback (most recent call last):
>> [ 1347s] + File
>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>> line 399, in launch
>> [ 1347s] + self._launch()
>> [ 1347s] + File
>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>> line 434, in _launch
>> [ 1347s] + self._post_launch()
>> [ 1347s] + File
>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/qtest.py",
>> line 147, in _post_launch
>> [ 1347s] + super()._post_launch()
>> [ 1347s] + File
>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>> line 340, in _post_launch
>> [ 1347s] + self._qmp.accept(self._qmp_timer)
>> [ 1347s] + File
>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
>> line 69, in accept
>> [ 1347s] + timeout
>> [ 1347s] + File
>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
>> line 42, in _sync
>> [ 1347s] + asyncio.wait_for(future, timeout=timeout)
>> [ 1347s] + File "/usr/lib64/python3.6/asyncio/base_events.py", line
>> 488, in run_until_complete
>> [ 1347s] + return future.result()
>> [ 1347s] + File "/usr/lib64/python3.6/asyncio/tasks.py", line 362,
>> in wait_for
>> [ 1347s] + raise futures.TimeoutError()
>> [ 1347s] +concurrent.futures._base.TimeoutError
>>
>>
>> I can see other errors like this, it's the problem of the socket.
>>
>> [ 1535s] socket_accept failed: Resource temporarily unavailable
>> [ 1535s] **
>> [ 1535s]
>> ERROR:../tests/qtest/libqtest.c:321:qtest_init_without_qmp_handshake:
>> assertion failed: (s->fd >= 0 && s->qmp_fd >= 0)
>>
>>
>> The script is running the command as this:
>> /usr/bin/make -O -j4 check-block V=1
>>
>> I can see the errors on ppc, arm or x86.
>> But I couldn't reproduce it when I run it manually.
>>
>> It will be appreciated if any suggestions. Thanks.
>>
>>
>> On 3/24/22 14:47, Li Zhang wrote:
>>> Hi,
>>>
>>> When I run the testsuit on our buidling system, it reports a timeout
>>> sometimes not always as the following.
>>> It couldn't connect qmp socket. Any ideas about this problem?
>>>
>>>
>>> [ 1989s] ---
>>> /home/abuild/rpmbuild/BUILD/qemu-6.2.0/tests/qemu-iotests/040.out
>>> [ 1989s] +++ 040.out.bad
>>> [ 1989s] @@ -1,5 +1,55 @@
>>> [ 1989s]
>>> -.................................................................
>>> [ 1989s] +....ERROR:qemu.aqmp.qmp_client.qemu-6471:Failed to
>>> establish connection: asyncio.exceptions.CancelledError
>>> [ 1989s]
>>> +E..................................ERROR:qemu.aqmp.qmp_client.qemu-6471:Failed
>>> to establish connection: asyncio.exceptions.CancelledError
>>> [ 1989s] +E.........................
>>> [ 1989s]
>>> +======================================================================
>>> [ 1989s] +ERROR: test_commit_node (__main__.TestActiveZeroLengthImage)
>>> [ 1989s]
>>> +----------------------------------------------------------------------
>>> [ 1989s] +Traceback (most recent call last):
>>> [ 1989s] + File
>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/tests/qemu-iotests/040",
>>> line 94, in setUp
>>> [ 1989s] + self.vm.launch()
>>> [ 1989s] + File
>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>>> line 399, in launch
>>> [ 1989s] + self._launch()
>>> [ 1989s] + File
>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>>> line 434, in _launch
>>> [ 1989s] + self._post_launch()
>>> [ 1989s] + File
>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/qtest.py",
>>> line 147, in _post_launch
>>> [ 1989s] + super()._post_launch()
>>> [ 1989s] + File
>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>>> line 340, in _post_launch
>>> [ 1989s] + self._qmp.accept(self._qmp_timer)
>>> [ 1989s] + File
>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
>>> line 67, in accept
>>> [ 1989s] + self._sync(
>>> [ 1989s] + File
>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
>>> line 41, in _sync
>>> [ 1989s] + return self._aloop.run_until_complete(
>>> [ 1989s] + File "/usr/lib64/python3.8/asyncio/base_events.py", line
>>> 616, in run_until_complete
>>> [ 1989s] + return future.result()
>>> [ 1989s] + File "/usr/lib64/python3.8/asyncio/tasks.py", line 501,
>>> in wait_for
>>> [ 1989s] + raise exceptions.TimeoutError()
>>> [ 1989s] +asyncio.exceptions.TimeoutError
>>>
>>
>>
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: iotest40 problem
2022-03-31 6:10 ` Hanna Reitz
@ 2022-03-31 10:44 ` Li Zhang
2022-03-31 14:37 ` John Snow
0 siblings, 1 reply; 11+ messages in thread
From: Li Zhang @ 2022-03-31 10:44 UTC (permalink / raw)
To: Hanna Reitz, QEMU Developers, kwolf, John Snow
On 3/31/22 08:10, Hanna Reitz wrote:
> On 29.03.22 18:49, Li Zhang wrote:
>> Update what I observed.
>>
>> It seems that aqmp is not stable when running test cases.
>> So I revert the patches as the following, iotest40/41 test cases work
>> well.
>
> Thanks for bisecting. I haven’t seen this problem before, so I didn’t
> look into it; CC-ing John, perhaps he figures something.
>
Thanks a lot.
It happens occassionally. I didn't find a way to reproduce it manually.
> Hanna
>
>> commit 76cd358671e6b8e7c435ec65b1c44200254514a9
>>
>> Author: John Snow <jsnow@redhat.com>
>>
>> Date: Tue Oct 26 13:56:12 2021 -0400
>>
>>
>>
>> python, iotests: replace qmp with aqmp
>>
>>
>>
>> Swap out the synchronous QEMUMonitorProtocol from qemu.qmp with
>> the sync
>>
>> wrapper from qemu.aqmp instead.
>>
>>
>>
>> Add an escape hatch in the form of the environment variable
>>
>> QEMU_PYTHON_LEGACY_QMP which allows you to cajole QEMUMachine into
>> using
>>
>> the old implementation, proving that both implementations work
>>
>> concurrently.
>>
>>
>>
>> Signed-off-by: John Snow <jsnow@redhat.com>
>>
>> Reviewed-by: Kevin Wolf <kwolf@redhat.com>
>>
>> Reviewed-by: Hanna Reitz <hreitz@redhat.com>
>>
>> Message-id: 20211026175612.4127598-9-jsnow@redhat.com
>>
>> Signed-off-by: John Snow <jsnow@redhat.com>
>>
>>
>> commit 1611e6cf4e7163f6102b37010a8b7e7120f468b5
>>
>> Author: John Snow <jsnow@redhat.com>
>>
>> Date: Thu Nov 18 15:46:18 2021 -0500
>>
>>
>>
>> python/machine: handle "fast" QEMU terminations
>>
>>
>>
>> In the case that the QEMU process actually launches -- but then
>> dies so
>>
>> quickly that we can't establish a QMP connection to it -- QEMUMachine
>>
>> currently calls _post_shutdown() assuming that it never launched
>> the VM
>>
>> process.
>>
>>
>>
>> This isn't true, though: it "merely" may have failed to establish
>> a QMP
>>
>> connection and the process is in the middle of its own exit path.
>>
>>
>>
>> If we don't wait for the subprocess, the caller may get a bogus
>> `None`
>>
>> return for .exitcode(). This behavior was observed from
>>
>> device-crash-test; after the switch to Async QMP, the timings were
>>
>> changed such that it was now seemingly possible to witness the
>> failure
>>
>> of "vm.launch()" *prior* to the exitcode becoming available.
>>
>>
>>
>> The semantic of the `_launched` property is changed in this
>>
>> patch. Instead of representing the condition "launch() executed
>>
>> successfully", it will now represent "has forked a child process
>>
>> successfully". This way, wait() when called in the exit path won't
>>
>> become a no-op.
>>
>>
>>
>> Signed-off-by: John Snow <jsnow@redhat.com>
>>
>> Reviewed-by: Willian Rampazzo <willianr@redhat.com>
>>
>> Message-id: 20211118204620.1897674-6-jsnow@redhat.com
>>
>> Signed-off-by: John Snow <jsnow@redhat.com>
>>
>>
>>
>>
>>
>> On 3/25/22 11:17, Li Zhang wrote:
>>> Hi,
>>>
>>> I backport some iotests patches to the tree and change timeout.
>>> It doesn't work.
>>>
>>> Sometimes, iotest41 also reports the errors.
>>> [ 1347s]
>>> +======================================================================
>>> [ 1347s] +ERROR: test_top_node_in_wrong_chain (__main__.TestSingleDrive)
>>> [ 1347s]
>>> +----------------------------------------------------------------------
>>> [ 1347s] +Traceback (most recent call last):
>>> [ 1347s] + File
>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>>> line 399, in launch
>>> [ 1347s] + self._launch()
>>> [ 1347s] + File
>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>>> line 434, in _launch
>>> [ 1347s] + self._post_launch()
>>> [ 1347s] + File
>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/qtest.py", line
>>> 147, in _post_launch
>>> [ 1347s] + super()._post_launch()
>>> [ 1347s] + File
>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>>> line 340, in _post_launch
>>> [ 1347s] + self._qmp.accept(self._qmp_timer)
>>> [ 1347s] + File
>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
>>> line 69, in accept
>>> [ 1347s] + timeout
>>> [ 1347s] + File
>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
>>> line 42, in _sync
>>> [ 1347s] + asyncio.wait_for(future, timeout=timeout)
>>> [ 1347s] + File "/usr/lib64/python3.6/asyncio/base_events.py", line
>>> 488, in run_until_complete
>>> [ 1347s] + return future.result()
>>> [ 1347s] + File "/usr/lib64/python3.6/asyncio/tasks.py", line 362,
>>> in wait_for
>>> [ 1347s] + raise futures.TimeoutError()
>>> [ 1347s] +concurrent.futures._base.TimeoutError
>>>
>>>
>>> I can see other errors like this, it's the problem of the socket.
>>>
>>> [ 1535s] socket_accept failed: Resource temporarily unavailable
>>> [ 1535s] **
>>> [ 1535s]
>>> ERROR:../tests/qtest/libqtest.c:321:qtest_init_without_qmp_handshake:
>>> assertion failed: (s->fd >= 0 && s->qmp_fd >= 0)
>>>
>>>
>>> The script is running the command as this:
>>> /usr/bin/make -O -j4 check-block V=1
>>>
>>> I can see the errors on ppc, arm or x86.
>>> But I couldn't reproduce it when I run it manually.
>>>
>>> It will be appreciated if any suggestions. Thanks.
>>>
>>>
>>> On 3/24/22 14:47, Li Zhang wrote:
>>>> Hi,
>>>>
>>>> When I run the testsuit on our buidling system, it reports a timeout
>>>> sometimes not always as the following.
>>>> It couldn't connect qmp socket. Any ideas about this problem?
>>>>
>>>>
>>>> [ 1989s] ---
>>>> /home/abuild/rpmbuild/BUILD/qemu-6.2.0/tests/qemu-iotests/040.out
>>>> [ 1989s] +++ 040.out.bad
>>>> [ 1989s] @@ -1,5 +1,55 @@
>>>> [ 1989s]
>>>> -.................................................................
>>>> [ 1989s] +....ERROR:qemu.aqmp.qmp_client.qemu-6471:Failed to
>>>> establish connection: asyncio.exceptions.CancelledError
>>>> [ 1989s]
>>>> +E..................................ERROR:qemu.aqmp.qmp_client.qemu-6471:Failed
>>>> to establish connection: asyncio.exceptions.CancelledError
>>>> [ 1989s] +E.........................
>>>> [ 1989s]
>>>> +======================================================================
>>>> [ 1989s] +ERROR: test_commit_node (__main__.TestActiveZeroLengthImage)
>>>> [ 1989s]
>>>> +----------------------------------------------------------------------
>>>> [ 1989s] +Traceback (most recent call last):
>>>> [ 1989s] + File
>>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/tests/qemu-iotests/040",
>>>> line 94, in setUp
>>>> [ 1989s] + self.vm.launch()
>>>> [ 1989s] + File
>>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>>>> line 399, in launch
>>>> [ 1989s] + self._launch()
>>>> [ 1989s] + File
>>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>>>> line 434, in _launch
>>>> [ 1989s] + self._post_launch()
>>>> [ 1989s] + File
>>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/qtest.py",
>>>> line 147, in _post_launch
>>>> [ 1989s] + super()._post_launch()
>>>> [ 1989s] + File
>>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>>>> line 340, in _post_launch
>>>> [ 1989s] + self._qmp.accept(self._qmp_timer)
>>>> [ 1989s] + File
>>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
>>>> line 67, in accept
>>>> [ 1989s] + self._sync(
>>>> [ 1989s] + File
>>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
>>>> line 41, in _sync
>>>> [ 1989s] + return self._aloop.run_until_complete(
>>>> [ 1989s] + File "/usr/lib64/python3.8/asyncio/base_events.py", line
>>>> 616, in run_until_complete
>>>> [ 1989s] + return future.result()
>>>> [ 1989s] + File "/usr/lib64/python3.8/asyncio/tasks.py", line 501,
>>>> in wait_for
>>>> [ 1989s] + raise exceptions.TimeoutError()
>>>> [ 1989s] +asyncio.exceptions.TimeoutError
>>>>
>>>
>>>
>>
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: iotest40 problem
2022-03-31 10:44 ` Li Zhang
@ 2022-03-31 14:37 ` John Snow
2022-03-31 14:46 ` John Snow
2022-03-31 15:16 ` Li Zhang
0 siblings, 2 replies; 11+ messages in thread
From: John Snow @ 2022-03-31 14:37 UTC (permalink / raw)
To: Li Zhang; +Cc: Kevin Wolf, Hanna Reitz, QEMU Developers
[-- Attachment #1: Type: text/plain, Size: 9146 bytes --]
On Thu, Mar 31, 2022, 6:47 AM Li Zhang <lizhang@suse.de> wrote:
> On 3/31/22 08:10, Hanna Reitz wrote:
> > On 29.03.22 18:49, Li Zhang wrote:
> >> Update what I observed.
> >>
> >> It seems that aqmp is not stable when running test cases.
> >> So I revert the patches as the following, iotest40/41 test cases work
> >> well.
> >
> > Thanks for bisecting. I haven’t seen this problem before, so I didn’t
> > look into it; CC-ing John, perhaps he figures something.
> >
> Thanks a lot.
> It happens occassionally. I didn't find a way to reproduce it manually.
>
Sorry for the instability.
A few questions then:
- what operating system and version
- kernel version
- architecture (looks like x86 and others?)
- what commit/version of QEMU are you testing?
- what python version are you using to run iotests?
> > Hanna
> >
> >> commit 76cd358671e6b8e7c435ec65b1c44200254514a9
> >>
> >> Author: John Snow <jsnow@redhat.com>
> >>
> >> Date: Tue Oct 26 13:56:12 2021 -0400
> >>
> >>
> >>
> >> python, iotests: replace qmp with aqmp
> >>
> >>
> >>
> >> Swap out the synchronous QEMUMonitorProtocol from qemu.qmp with
> >> the sync
> >>
> >> wrapper from qemu.aqmp instead.
> >>
> >>
> >>
> >> Add an escape hatch in the form of the environment variable
> >>
> >> QEMU_PYTHON_LEGACY_QMP which allows you to cajole QEMUMachine into
> >> using
> >>
> >> the old implementation, proving that both implementations work
> >>
> >> concurrently.
> >>
> >>
> >>
> >> Signed-off-by: John Snow <jsnow@redhat.com>
> >>
> >> Reviewed-by: Kevin Wolf <kwolf@redhat.com>
> >>
> >> Reviewed-by: Hanna Reitz <hreitz@redhat.com>
> >>
> >> Message-id: 20211026175612.4127598-9-jsnow@redhat.com
> >>
> >> Signed-off-by: John Snow <jsnow@redhat.com>
> >>
> >>
> >> commit 1611e6cf4e7163f6102b37010a8b7e7120f468b5
> >>
> >> Author: John Snow <jsnow@redhat.com>
> >>
> >> Date: Thu Nov 18 15:46:18 2021 -0500
> >>
> >>
> >>
> >> python/machine: handle "fast" QEMU terminations
> >>
> >>
> >>
> >> In the case that the QEMU process actually launches -- but then
> >> dies so
> >>
> >> quickly that we can't establish a QMP connection to it --
> QEMUMachine
> >>
> >> currently calls _post_shutdown() assuming that it never launched
> >> the VM
> >>
> >> process.
> >>
> >>
> >>
> >> This isn't true, though: it "merely" may have failed to establish
> >> a QMP
> >>
> >> connection and the process is in the middle of its own exit path.
> >>
> >>
> >>
> >> If we don't wait for the subprocess, the caller may get a bogus
> >> `None`
> >>
> >> return for .exitcode(). This behavior was observed from
> >>
> >> device-crash-test; after the switch to Async QMP, the timings were
> >>
> >> changed such that it was now seemingly possible to witness the
> >> failure
> >>
> >> of "vm.launch()" *prior* to the exitcode becoming available.
> >>
> >>
> >>
> >> The semantic of the `_launched` property is changed in this
> >>
> >> patch. Instead of representing the condition "launch() executed
> >>
> >> successfully", it will now represent "has forked a child process
> >>
> >> successfully". This way, wait() when called in the exit path won't
> >>
> >> become a no-op.
> >>
> >>
> >>
> >> Signed-off-by: John Snow <jsnow@redhat.com>
> >>
> >> Reviewed-by: Willian Rampazzo <willianr@redhat.com>
> >>
> >> Message-id: 20211118204620.1897674-6-jsnow@redhat.com
> >>
> >> Signed-off-by: John Snow <jsnow@redhat.com>
> >>
> >>
> >>
> >>
> >>
> >> On 3/25/22 11:17, Li Zhang wrote:
> >>> Hi,
> >>>
> >>> I backport some iotests patches to the tree and change timeout.
> >>> It doesn't work.
> >>>
> >>> Sometimes, iotest41 also reports the errors.
> >>> [ 1347s]
> >>> +======================================================================
> >>> [ 1347s] +ERROR: test_top_node_in_wrong_chain
> (__main__.TestSingleDrive)
> >>> [ 1347s]
> >>> +----------------------------------------------------------------------
> >>> [ 1347s] +Traceback (most recent call last):
> >>> [ 1347s] + File
> >>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
> >>> line 399, in launch
> >>> [ 1347s] + self._launch()
> >>> [ 1347s] + File
> >>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
> >>> line 434, in _launch
> >>> [ 1347s] + self._post_launch()
> >>> [ 1347s] + File
> >>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/qtest.py",
> line
> >>> 147, in _post_launch
> >>> [ 1347s] + super()._post_launch()
> >>> [ 1347s] + File
> >>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
> >>> line 340, in _post_launch
> >>> [ 1347s] + self._qmp.accept(self._qmp_timer)
> >>> [ 1347s] + File
> >>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
> >>> line 69, in accept
> >>> [ 1347s] + timeout
> >>> [ 1347s] + File
> >>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
> >>> line 42, in _sync
> >>> [ 1347s] + asyncio.wait_for(future, timeout=timeout)
> >>> [ 1347s] + File "/usr/lib64/python3.6/asyncio/base_events.py", line
> >>> 488, in run_until_complete
> >>> [ 1347s] + return future.result()
> >>> [ 1347s] + File "/usr/lib64/python3.6/asyncio/tasks.py", line 362,
> >>> in wait_for
> >>> [ 1347s] + raise futures.TimeoutError()
> >>> [ 1347s] +concurrent.futures._base.TimeoutError
> >>>
> >>>
> >>> I can see other errors like this, it's the problem of the socket.
> >>>
> >>> [ 1535s] socket_accept failed: Resource temporarily unavailable
> >>> [ 1535s] **
> >>> [ 1535s]
> >>> ERROR:../tests/qtest/libqtest.c:321:qtest_init_without_qmp_handshake:
> >>> assertion failed: (s->fd >= 0 && s->qmp_fd >= 0)
> >>>
> >>>
> >>> The script is running the command as this:
> >>> /usr/bin/make -O -j4 check-block V=1
> >>>
> >>> I can see the errors on ppc, arm or x86.
> >>> But I couldn't reproduce it when I run it manually.
> >>>
> >>> It will be appreciated if any suggestions. Thanks.
> >>>
> >>>
> >>> On 3/24/22 14:47, Li Zhang wrote:
> >>>> Hi,
> >>>>
> >>>> When I run the testsuit on our buidling system, it reports a timeout
> >>>> sometimes not always as the following.
> >>>> It couldn't connect qmp socket. Any ideas about this problem?
> >>>>
> >>>>
> >>>> [ 1989s] ---
> >>>> /home/abuild/rpmbuild/BUILD/qemu-6.2.0/tests/qemu-iotests/040.out
> >>>> [ 1989s] +++ 040.out.bad
> >>>> [ 1989s] @@ -1,5 +1,55 @@
> >>>> [ 1989s]
> >>>> -.................................................................
> >>>> [ 1989s] +....ERROR:qemu.aqmp.qmp_client.qemu-6471:Failed to
> >>>> establish connection: asyncio.exceptions.CancelledError
> >>>> [ 1989s]
> >>>>
> +E..................................ERROR:qemu.aqmp.qmp_client.qemu-6471:Failed
>
> >>>> to establish connection: asyncio.exceptions.CancelledError
> >>>> [ 1989s] +E.........................
> >>>> [ 1989s]
> >>>>
> +======================================================================
> >>>> [ 1989s] +ERROR: test_commit_node (__main__.TestActiveZeroLengthImage)
> >>>> [ 1989s]
> >>>>
> +----------------------------------------------------------------------
> >>>> [ 1989s] +Traceback (most recent call last):
> >>>> [ 1989s] + File
> >>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/tests/qemu-iotests/040",
> >>>> line 94, in setUp
> >>>> [ 1989s] + self.vm.launch()
> >>>> [ 1989s] + File
> >>>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
> >>>> line 399, in launch
> >>>> [ 1989s] + self._launch()
> >>>> [ 1989s] + File
> >>>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
> >>>> line 434, in _launch
> >>>> [ 1989s] + self._post_launch()
> >>>> [ 1989s] + File
> >>>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/qtest.py",
> >>>> line 147, in _post_launch
> >>>> [ 1989s] + super()._post_launch()
> >>>> [ 1989s] + File
> >>>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
> >>>> line 340, in _post_launch
> >>>> [ 1989s] + self._qmp.accept(self._qmp_timer)
> >>>> [ 1989s] + File
> >>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
> >>>> line 67, in accept
> >>>> [ 1989s] + self._sync(
> >>>> [ 1989s] + File
> >>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
> >>>> line 41, in _sync
> >>>> [ 1989s] + return self._aloop.run_until_complete(
> >>>> [ 1989s] + File "/usr/lib64/python3.8/asyncio/base_events.py", line
> >>>> 616, in run_until_complete
> >>>> [ 1989s] + return future.result()
> >>>> [ 1989s] + File "/usr/lib64/python3.8/asyncio/tasks.py", line 501,
> >>>> in wait_for
> >>>> [ 1989s] + raise exceptions.TimeoutError()
> >>>> [ 1989s] +asyncio.exceptions.TimeoutError
> >>>>
> >>>
> >>>
> >>
> >
>
>
>
[-- Attachment #2: Type: text/html, Size: 13918 bytes --]
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: iotest40 problem
2022-03-31 14:37 ` John Snow
@ 2022-03-31 14:46 ` John Snow
2022-03-31 15:18 ` Li Zhang
2022-04-06 14:48 ` Li Zhang
2022-03-31 15:16 ` Li Zhang
1 sibling, 2 replies; 11+ messages in thread
From: John Snow @ 2022-03-31 14:46 UTC (permalink / raw)
To: Li Zhang; +Cc: Kevin Wolf, Hanna Reitz, QEMU Developers
[-- Attachment #1: Type: text/plain, Size: 9972 bytes --]
On Thu, Mar 31, 2022, 10:37 AM John Snow <jsnow@redhat.com> wrote:
>
>
> On Thu, Mar 31, 2022, 6:47 AM Li Zhang <lizhang@suse.de> wrote:
>
>> On 3/31/22 08:10, Hanna Reitz wrote:
>> > On 29.03.22 18:49, Li Zhang wrote:
>> >> Update what I observed.
>> >>
>> >> It seems that aqmp is not stable when running test cases.
>> >> So I revert the patches as the following, iotest40/41 test cases work
>> >> well.
>> >
>> > Thanks for bisecting. I haven’t seen this problem before, so I didn’t
>> > look into it; CC-ing John, perhaps he figures something.
>> >
>> Thanks a lot.
>> It happens occassionally. I didn't find a way to reproduce it manually.
>>
>
> Sorry for the instability.
>
> A few questions then:
>
> - what operating system and version
> - kernel version
> - architecture (looks like x86 and others?)
> - what commit/version of QEMU are you testing?
> - what python version are you using to run iotests?
>
I'll also point out these patchsets, because we're in different timezones
and if I'm lucky it will be the answer:
[1] https://patchew.org/QEMU/20220201041134.1237016-1-jsnow@redhat.com/
[2] https://patchew.org/QEMU/20220225205948.3693480-1-jsnow@redhat.com/
#1 wasn't sufficient to fix the problem, but I had hoped #2 was. The second
patchset was merged somewhat recently, just before rc testing for 7.0.
Thanks,
--js
>
>
>> > Hanna
>> >
>> >> commit 76cd358671e6b8e7c435ec65b1c44200254514a9
>> >>
>> >> Author: John Snow <jsnow@redhat.com>
>> >>
>> >> Date: Tue Oct 26 13:56:12 2021 -0400
>> >>
>> >>
>> >>
>> >> python, iotests: replace qmp with aqmp
>> >>
>> >>
>> >>
>> >> Swap out the synchronous QEMUMonitorProtocol from qemu.qmp with
>> >> the sync
>> >>
>> >> wrapper from qemu.aqmp instead.
>> >>
>> >>
>> >>
>> >> Add an escape hatch in the form of the environment variable
>> >>
>> >> QEMU_PYTHON_LEGACY_QMP which allows you to cajole QEMUMachine into
>> >> using
>> >>
>> >> the old implementation, proving that both implementations work
>> >>
>> >> concurrently.
>> >>
>> >>
>> >>
>> >> Signed-off-by: John Snow <jsnow@redhat.com>
>> >>
>> >> Reviewed-by: Kevin Wolf <kwolf@redhat.com>
>> >>
>> >> Reviewed-by: Hanna Reitz <hreitz@redhat.com>
>> >>
>> >> Message-id: 20211026175612.4127598-9-jsnow@redhat.com
>> >>
>> >> Signed-off-by: John Snow <jsnow@redhat.com>
>> >>
>> >>
>> >> commit 1611e6cf4e7163f6102b37010a8b7e7120f468b5
>> >>
>> >> Author: John Snow <jsnow@redhat.com>
>> >>
>> >> Date: Thu Nov 18 15:46:18 2021 -0500
>> >>
>> >>
>> >>
>> >> python/machine: handle "fast" QEMU terminations
>> >>
>> >>
>> >>
>> >> In the case that the QEMU process actually launches -- but then
>> >> dies so
>> >>
>> >> quickly that we can't establish a QMP connection to it --
>> QEMUMachine
>> >>
>> >> currently calls _post_shutdown() assuming that it never launched
>> >> the VM
>> >>
>> >> process.
>> >>
>> >>
>> >>
>> >> This isn't true, though: it "merely" may have failed to establish
>> >> a QMP
>> >>
>> >> connection and the process is in the middle of its own exit path.
>> >>
>> >>
>> >>
>> >> If we don't wait for the subprocess, the caller may get a bogus
>> >> `None`
>> >>
>> >> return for .exitcode(). This behavior was observed from
>> >>
>> >> device-crash-test; after the switch to Async QMP, the timings were
>> >>
>> >> changed such that it was now seemingly possible to witness the
>> >> failure
>> >>
>> >> of "vm.launch()" *prior* to the exitcode becoming available.
>> >>
>> >>
>> >>
>> >> The semantic of the `_launched` property is changed in this
>> >>
>> >> patch. Instead of representing the condition "launch() executed
>> >>
>> >> successfully", it will now represent "has forked a child process
>> >>
>> >> successfully". This way, wait() when called in the exit path won't
>> >>
>> >> become a no-op.
>> >>
>> >>
>> >>
>> >> Signed-off-by: John Snow <jsnow@redhat.com>
>> >>
>> >> Reviewed-by: Willian Rampazzo <willianr@redhat.com>
>> >>
>> >> Message-id: 20211118204620.1897674-6-jsnow@redhat.com
>> >>
>> >> Signed-off-by: John Snow <jsnow@redhat.com>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> On 3/25/22 11:17, Li Zhang wrote:
>> >>> Hi,
>> >>>
>> >>> I backport some iotests patches to the tree and change timeout.
>> >>> It doesn't work.
>> >>>
>> >>> Sometimes, iotest41 also reports the errors.
>> >>> [ 1347s]
>> >>>
>> +======================================================================
>> >>> [ 1347s] +ERROR: test_top_node_in_wrong_chain
>> (__main__.TestSingleDrive)
>> >>> [ 1347s]
>> >>>
>> +----------------------------------------------------------------------
>> >>> [ 1347s] +Traceback (most recent call last):
>> >>> [ 1347s] + File
>> >>>
>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>> >>> line 399, in launch
>> >>> [ 1347s] + self._launch()
>> >>> [ 1347s] + File
>> >>>
>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>> >>> line 434, in _launch
>> >>> [ 1347s] + self._post_launch()
>> >>> [ 1347s] + File
>> >>>
>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/qtest.py", line
>> >>> 147, in _post_launch
>> >>> [ 1347s] + super()._post_launch()
>> >>> [ 1347s] + File
>> >>>
>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>> >>> line 340, in _post_launch
>> >>> [ 1347s] + self._qmp.accept(self._qmp_timer)
>> >>> [ 1347s] + File
>> >>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
>> >>> line 69, in accept
>> >>> [ 1347s] + timeout
>> >>> [ 1347s] + File
>> >>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
>> >>> line 42, in _sync
>> >>> [ 1347s] + asyncio.wait_for(future, timeout=timeout)
>> >>> [ 1347s] + File "/usr/lib64/python3.6/asyncio/base_events.py", line
>> >>> 488, in run_until_complete
>> >>> [ 1347s] + return future.result()
>> >>> [ 1347s] + File "/usr/lib64/python3.6/asyncio/tasks.py", line 362,
>> >>> in wait_for
>> >>> [ 1347s] + raise futures.TimeoutError()
>> >>> [ 1347s] +concurrent.futures._base.TimeoutError
>> >>>
>> >>>
>> >>> I can see other errors like this, it's the problem of the socket.
>> >>>
>> >>> [ 1535s] socket_accept failed: Resource temporarily unavailable
>> >>> [ 1535s] **
>> >>> [ 1535s]
>> >>> ERROR:../tests/qtest/libqtest.c:321:qtest_init_without_qmp_handshake:
>> >>> assertion failed: (s->fd >= 0 && s->qmp_fd >= 0)
>> >>>
>> >>>
>> >>> The script is running the command as this:
>> >>> /usr/bin/make -O -j4 check-block V=1
>> >>>
>> >>> I can see the errors on ppc, arm or x86.
>> >>> But I couldn't reproduce it when I run it manually.
>> >>>
>> >>> It will be appreciated if any suggestions. Thanks.
>> >>>
>> >>>
>> >>> On 3/24/22 14:47, Li Zhang wrote:
>> >>>> Hi,
>> >>>>
>> >>>> When I run the testsuit on our buidling system, it reports a timeout
>> >>>> sometimes not always as the following.
>> >>>> It couldn't connect qmp socket. Any ideas about this problem?
>> >>>>
>> >>>>
>> >>>> [ 1989s] ---
>> >>>> /home/abuild/rpmbuild/BUILD/qemu-6.2.0/tests/qemu-iotests/040.out
>> >>>> [ 1989s] +++ 040.out.bad
>> >>>> [ 1989s] @@ -1,5 +1,55 @@
>> >>>> [ 1989s]
>> >>>> -.................................................................
>> >>>> [ 1989s] +....ERROR:qemu.aqmp.qmp_client.qemu-6471:Failed to
>> >>>> establish connection: asyncio.exceptions.CancelledError
>> >>>> [ 1989s]
>> >>>>
>> +E..................................ERROR:qemu.aqmp.qmp_client.qemu-6471:Failed
>>
>> >>>> to establish connection: asyncio.exceptions.CancelledError
>> >>>> [ 1989s] +E.........................
>> >>>> [ 1989s]
>> >>>>
>> +======================================================================
>> >>>> [ 1989s] +ERROR: test_commit_node
>> (__main__.TestActiveZeroLengthImage)
>> >>>> [ 1989s]
>> >>>>
>> +----------------------------------------------------------------------
>> >>>> [ 1989s] +Traceback (most recent call last):
>> >>>> [ 1989s] + File
>> >>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/tests/qemu-iotests/040",
>> >>>> line 94, in setUp
>> >>>> [ 1989s] + self.vm.launch()
>> >>>> [ 1989s] + File
>> >>>>
>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>> >>>> line 399, in launch
>> >>>> [ 1989s] + self._launch()
>> >>>> [ 1989s] + File
>> >>>>
>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>> >>>> line 434, in _launch
>> >>>> [ 1989s] + self._post_launch()
>> >>>> [ 1989s] + File
>> >>>>
>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/qtest.py",
>> >>>> line 147, in _post_launch
>> >>>> [ 1989s] + super()._post_launch()
>> >>>> [ 1989s] + File
>> >>>>
>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>> >>>> line 340, in _post_launch
>> >>>> [ 1989s] + self._qmp.accept(self._qmp_timer)
>> >>>> [ 1989s] + File
>> >>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
>> >>>> line 67, in accept
>> >>>> [ 1989s] + self._sync(
>> >>>> [ 1989s] + File
>> >>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
>> >>>> line 41, in _sync
>> >>>> [ 1989s] + return self._aloop.run_until_complete(
>> >>>> [ 1989s] + File "/usr/lib64/python3.8/asyncio/base_events.py", line
>> >>>> 616, in run_until_complete
>> >>>> [ 1989s] + return future.result()
>> >>>> [ 1989s] + File "/usr/lib64/python3.8/asyncio/tasks.py", line 501,
>> >>>> in wait_for
>> >>>> [ 1989s] + raise exceptions.TimeoutError()
>> >>>> [ 1989s] +asyncio.exceptions.TimeoutError
>> >>>>
>> >>>
>> >>>
>> >>
>> >
>>
>>
>>
[-- Attachment #2: Type: text/html, Size: 15491 bytes --]
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: iotest40 problem
2022-03-31 14:37 ` John Snow
2022-03-31 14:46 ` John Snow
@ 2022-03-31 15:16 ` Li Zhang
1 sibling, 0 replies; 11+ messages in thread
From: Li Zhang @ 2022-03-31 15:16 UTC (permalink / raw)
To: John Snow; +Cc: Kevin Wolf, Hanna Reitz, QEMU Developers
On 3/31/22 16:37, John Snow wrote:
>
>
> On Thu, Mar 31, 2022, 6:47 AM Li Zhang <lizhang@suse.de
> <mailto:lizhang@suse.de>> wrote:
>
> On 3/31/22 08:10, Hanna Reitz wrote:
> > On 29.03.22 18:49, Li Zhang wrote:
> >> Update what I observed.
> >>
> >> It seems that aqmp is not stable when running test cases.
> >> So I revert the patches as the following, iotest40/41 test cases
> work
> >> well.
> >
> > Thanks for bisecting. I haven’t seen this problem before, so I
> didn’t
> > look into it; CC-ing John, perhaps he figures something.
> >
> Thanks a lot.
> It happens occassionally. I didn't find a way to reproduce it manually.
>
>
> Sorry for the instability.
>
> A few questions then:
>
> - what operating system and version
> - kernel version
> - architecture (looks like x86 and others?)
> - what commit/version of QEMU are you testing?
> - what python version are you using to run iotests?
>
Hi John,
Thanks for your reply, here is the answer.
- openSUSE15.3, openSUSE15.4
- Kernel version:
openSUSE15.3: kernel-obs-build-5.3.18-150300.59.60.4
openSUSE15.4: kernel-obs-build-5.14.21-150400.15.2
- architecture: X86, aarch64, s390
- QEMU version is based on qemu6.2.0, I also backport iotest patches
after QEMU6.2.0
- python:
openSUSE 15.3: python3-base-3.6.15-150300.10.21.1
openSuse15.4: python3-base-3.6.15-150300.10.21.1
This issue happens on S390, aarch64 a lot.
>
>
> > Hanna
> >
> >> commit 76cd358671e6b8e7c435ec65b1c44200254514a9
> >>
> >> Author: John Snow <jsnow@redhat.com <mailto:jsnow@redhat.com>>
> >>
> >> Date: Tue Oct 26 13:56:12 2021 -0400
> >>
> >>
> >>
> >> python, iotests: replace qmp with aqmp
> >>
> >>
> >>
> >> Swap out the synchronous QEMUMonitorProtocol from qemu.qmp with
> >> the sync
> >>
> >> wrapper from qemu.aqmp instead.
> >>
> >>
> >>
> >> Add an escape hatch in the form of the environment variable
> >>
> >> QEMU_PYTHON_LEGACY_QMP which allows you to cajole
> QEMUMachine into
> >> using
> >>
> >> the old implementation, proving that both implementations work
> >>
> >> concurrently.
> >>
> >>
> >>
> >> Signed-off-by: John Snow <jsnow@redhat.com
> <mailto:jsnow@redhat.com>>
> >>
> >> Reviewed-by: Kevin Wolf <kwolf@redhat.com
> <mailto:kwolf@redhat.com>>
> >>
> >> Reviewed-by: Hanna Reitz <hreitz@redhat.com
> <mailto:hreitz@redhat.com>>
> >>
> >> Message-id: 20211026175612.4127598-9-jsnow@redhat.com
> <mailto:20211026175612.4127598-9-jsnow@redhat.com>
> >>
> >> Signed-off-by: John Snow <jsnow@redhat.com
> <mailto:jsnow@redhat.com>>
> >>
> >>
> >> commit 1611e6cf4e7163f6102b37010a8b7e7120f468b5
> >>
> >> Author: John Snow <jsnow@redhat.com <mailto:jsnow@redhat.com>>
> >>
> >> Date: Thu Nov 18 15:46:18 2021 -0500
> >>
> >>
> >>
> >> python/machine: handle "fast" QEMU terminations
> >>
> >>
> >>
> >> In the case that the QEMU process actually launches -- but then
> >> dies so
> >>
> >> quickly that we can't establish a QMP connection to it --
> QEMUMachine
> >>
> >> currently calls _post_shutdown() assuming that it never
> launched
> >> the VM
> >>
> >> process.
> >>
> >>
> >>
> >> This isn't true, though: it "merely" may have failed to
> establish
> >> a QMP
> >>
> >> connection and the process is in the middle of its own exit
> path.
> >>
> >>
> >>
> >> If we don't wait for the subprocess, the caller may get a bogus
> >> `None`
> >>
> >> return for .exitcode(). This behavior was observed from
> >>
> >> device-crash-test; after the switch to Async QMP, the
> timings were
> >>
> >> changed such that it was now seemingly possible to witness the
> >> failure
> >>
> >> of "vm.launch()" *prior* to the exitcode becoming available.
> >>
> >>
> >>
> >> The semantic of the `_launched` property is changed in this
> >>
> >> patch. Instead of representing the condition "launch() executed
> >>
> >> successfully", it will now represent "has forked a child process
> >>
> >> successfully". This way, wait() when called in the exit path
> won't
> >>
> >> become a no-op.
> >>
> >>
> >>
> >> Signed-off-by: John Snow <jsnow@redhat.com
> <mailto:jsnow@redhat.com>>
> >>
> >> Reviewed-by: Willian Rampazzo <willianr@redhat.com
> <mailto:willianr@redhat.com>>
> >>
> >> Message-id: 20211118204620.1897674-6-jsnow@redhat.com
> <mailto:20211118204620.1897674-6-jsnow@redhat.com>
> >>
> >> Signed-off-by: John Snow <jsnow@redhat.com
> <mailto:jsnow@redhat.com>>
> >>
> >>
> >>
> >>
> >>
> >> On 3/25/22 11:17, Li Zhang wrote:
> >>> Hi,
> >>>
> >>> I backport some iotests patches to the tree and change timeout.
> >>> It doesn't work.
> >>>
> >>> Sometimes, iotest41 also reports the errors.
> >>> [ 1347s]
> >>>
> +======================================================================
> >>> [ 1347s] +ERROR: test_top_node_in_wrong_chain
> (__main__.TestSingleDrive)
> >>> [ 1347s]
> >>>
> +----------------------------------------------------------------------
> >>> [ 1347s] +Traceback (most recent call last):
> >>> [ 1347s] + File
> >>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>
> >>> line 399, in launch
> >>> [ 1347s] + self._launch()
> >>> [ 1347s] + File
> >>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>
> >>> line 434, in _launch
> >>> [ 1347s] + self._post_launch()
> >>> [ 1347s] + File
> >>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/qtest.py",
> line
> >>> 147, in _post_launch
> >>> [ 1347s] + super()._post_launch()
> >>> [ 1347s] + File
> >>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>
> >>> line 340, in _post_launch
> >>> [ 1347s] + self._qmp.accept(self._qmp_timer)
> >>> [ 1347s] + File
> >>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
> >>> line 69, in accept
> >>> [ 1347s] + timeout
> >>> [ 1347s] + File
> >>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
> >>> line 42, in _sync
> >>> [ 1347s] + asyncio.wait_for(future, timeout=timeout)
> >>> [ 1347s] + File "/usr/lib64/python3.6/asyncio/base_events.py",
> line
> >>> 488, in run_until_complete
> >>> [ 1347s] + return future.result()
> >>> [ 1347s] + File "/usr/lib64/python3.6/asyncio/tasks.py", line
> 362,
> >>> in wait_for
> >>> [ 1347s] + raise futures.TimeoutError()
> >>> [ 1347s] +concurrent.futures._base.TimeoutError
> >>>
> >>>
> >>> I can see other errors like this, it's the problem of the socket.
> >>>
> >>> [ 1535s] socket_accept failed: Resource temporarily unavailable
> >>> [ 1535s] **
> >>> [ 1535s]
> >>>
> ERROR:../tests/qtest/libqtest.c:321:qtest_init_without_qmp_handshake:
> >>> assertion failed: (s->fd >= 0 && s->qmp_fd >= 0)
> >>>
> >>>
> >>> The script is running the command as this:
> >>> /usr/bin/make -O -j4 check-block V=1
> >>>
> >>> I can see the errors on ppc, arm or x86.
> >>> But I couldn't reproduce it when I run it manually.
> >>>
> >>> It will be appreciated if any suggestions. Thanks.
> >>>
> >>>
> >>> On 3/24/22 14:47, Li Zhang wrote:
> >>>> Hi,
> >>>>
> >>>> When I run the testsuit on our buidling system, it reports a
> timeout
> >>>> sometimes not always as the following.
> >>>> It couldn't connect qmp socket. Any ideas about this problem?
> >>>>
> >>>>
> >>>> [ 1989s] ---
> >>>> /home/abuild/rpmbuild/BUILD/qemu-6.2.0/tests/qemu-iotests/040.out
> >>>> [ 1989s] +++ 040.out.bad
> >>>> [ 1989s] @@ -1,5 +1,55 @@
> >>>> [ 1989s]
> >>>> -.................................................................
> >>>> [ 1989s] +....ERROR:qemu.aqmp.qmp_client.qemu-6471:Failed to
> >>>> establish connection: asyncio.exceptions.CancelledError
> >>>> [ 1989s]
> >>>>
> +E..................................ERROR:qemu.aqmp.qmp_client.qemu-6471:Failed
>
> >>>> to establish connection: asyncio.exceptions.CancelledError
> >>>> [ 1989s] +E.........................
> >>>> [ 1989s]
> >>>>
> +======================================================================
> >>>> [ 1989s] +ERROR: test_commit_node
> (__main__.TestActiveZeroLengthImage)
> >>>> [ 1989s]
> >>>>
> +----------------------------------------------------------------------
> >>>> [ 1989s] +Traceback (most recent call last):
> >>>> [ 1989s] + File
> >>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/tests/qemu-iotests/040",
> >>>> line 94, in setUp
> >>>> [ 1989s] + self.vm.launch()
> >>>> [ 1989s] + File
> >>>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>
> >>>> line 399, in launch
> >>>> [ 1989s] + self._launch()
> >>>> [ 1989s] + File
> >>>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>
> >>>> line 434, in _launch
> >>>> [ 1989s] + self._post_launch()
> >>>> [ 1989s] + File
> >>>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/qtest.py",
> >>>> line 147, in _post_launch
> >>>> [ 1989s] + super()._post_launch()
> >>>> [ 1989s] + File
> >>>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>
> >>>> line 340, in _post_launch
> >>>> [ 1989s] + self._qmp.accept(self._qmp_timer)
> >>>> [ 1989s] + File
> >>>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
> >>>> line 67, in accept
> >>>> [ 1989s] + self._sync(
> >>>> [ 1989s] + File
> >>>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
> >>>> line 41, in _sync
> >>>> [ 1989s] + return self._aloop.run_until_complete(
> >>>> [ 1989s] + File
> "/usr/lib64/python3.8/asyncio/base_events.py", line
> >>>> 616, in run_until_complete
> >>>> [ 1989s] + return future.result()
> >>>> [ 1989s] + File "/usr/lib64/python3.8/asyncio/tasks.py", line
> 501,
> >>>> in wait_for
> >>>> [ 1989s] + raise exceptions.TimeoutError()
> >>>> [ 1989s] +asyncio.exceptions.TimeoutError
> >>>>
> >>>
> >>>
> >>
> >
>
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: iotest40 problem
2022-03-31 14:46 ` John Snow
@ 2022-03-31 15:18 ` Li Zhang
2022-04-06 14:48 ` Li Zhang
1 sibling, 0 replies; 11+ messages in thread
From: Li Zhang @ 2022-03-31 15:18 UTC (permalink / raw)
To: John Snow; +Cc: Kevin Wolf, Hanna Reitz, QEMU Developers
On 3/31/22 16:46, John Snow wrote:
>
>
> On Thu, Mar 31, 2022, 10:37 AM John Snow <jsnow@redhat.com
> <mailto:jsnow@redhat.com>> wrote:
>
>
>
> On Thu, Mar 31, 2022, 6:47 AM Li Zhang <lizhang@suse.de
> <mailto:lizhang@suse.de>> wrote:
>
> On 3/31/22 08:10, Hanna Reitz wrote:
> > On 29.03.22 18:49, Li Zhang wrote:
> >> Update what I observed.
> >>
> >> It seems that aqmp is not stable when running test cases.
> >> So I revert the patches as the following, iotest40/41 test
> cases work
> >> well.
> >
> > Thanks for bisecting. I haven’t seen this problem before, so
> I didn’t
> > look into it; CC-ing John, perhaps he figures something.
> >
> Thanks a lot.
> It happens occassionally. I didn't find a way to reproduce it
> manually.
>
>
> Sorry for the instability.
>
> A few questions then:
>
> - what operating system and version
> - kernel version
> - architecture (looks like x86 and others?)
> - what commit/version of QEMU are you testing?
> - what python version are you using to run iotests?
>
>
> I'll also point out these patchsets, because we're in different
> timezones and if I'm lucky it will be the answer:
>
> [1] https://patchew.org/QEMU/20220201041134.1237016-1-jsnow@redhat.com/
> <https://patchew.org/QEMU/20220201041134.1237016-1-jsnow@redhat.com/>
>
> [2] https://patchew.org/QEMU/20220225205948.3693480-1-jsnow@redhat.com/
> <https://patchew.org/QEMU/20220225205948.3693480-1-jsnow@redhat.com/>
>
> #1 wasn't sufficient to fix the problem, but I had hoped #2 was. The
> second patchset was merged somewhat recently, just before rc testing for
> 7.0.
>
That's great, thanks. I consider to backport them and do some testing.
> Thanks,
> --js
>
>
>
>
>
> > Hanna
> >
> >> commit 76cd358671e6b8e7c435ec65b1c44200254514a9
> >>
> >> Author: John Snow <jsnow@redhat.com <mailto:jsnow@redhat.com>>
> >>
> >> Date: Tue Oct 26 13:56:12 2021 -0400
> >>
> >>
> >>
> >> python, iotests: replace qmp with aqmp
> >>
> >>
> >>
> >> Swap out the synchronous QEMUMonitorProtocol from
> qemu.qmp with
> >> the sync
> >>
> >> wrapper from qemu.aqmp instead.
> >>
> >>
> >>
> >> Add an escape hatch in the form of the environment variable
> >>
> >> QEMU_PYTHON_LEGACY_QMP which allows you to cajole
> QEMUMachine into
> >> using
> >>
> >> the old implementation, proving that both
> implementations work
> >>
> >> concurrently.
> >>
> >>
> >>
> >> Signed-off-by: John Snow <jsnow@redhat.com
> <mailto:jsnow@redhat.com>>
> >>
> >> Reviewed-by: Kevin Wolf <kwolf@redhat.com
> <mailto:kwolf@redhat.com>>
> >>
> >> Reviewed-by: Hanna Reitz <hreitz@redhat.com
> <mailto:hreitz@redhat.com>>
> >>
> >> Message-id: 20211026175612.4127598-9-jsnow@redhat.com
> <mailto:20211026175612.4127598-9-jsnow@redhat.com>
> >>
> >> Signed-off-by: John Snow <jsnow@redhat.com
> <mailto:jsnow@redhat.com>>
> >>
> >>
> >> commit 1611e6cf4e7163f6102b37010a8b7e7120f468b5
> >>
> >> Author: John Snow <jsnow@redhat.com <mailto:jsnow@redhat.com>>
> >>
> >> Date: Thu Nov 18 15:46:18 2021 -0500
> >>
> >>
> >>
> >> python/machine: handle "fast" QEMU terminations
> >>
> >>
> >>
> >> In the case that the QEMU process actually launches --
> but then
> >> dies so
> >>
> >> quickly that we can't establish a QMP connection to it
> -- QEMUMachine
> >>
> >> currently calls _post_shutdown() assuming that it never
> launched
> >> the VM
> >>
> >> process.
> >>
> >>
> >>
> >> This isn't true, though: it "merely" may have failed to
> establish
> >> a QMP
> >>
> >> connection and the process is in the middle of its own
> exit path.
> >>
> >>
> >>
> >> If we don't wait for the subprocess, the caller may get
> a bogus
> >> `None`
> >>
> >> return for .exitcode(). This behavior was observed from
> >>
> >> device-crash-test; after the switch to Async QMP, the
> timings were
> >>
> >> changed such that it was now seemingly possible to
> witness the
> >> failure
> >>
> >> of "vm.launch()" *prior* to the exitcode becoming available.
> >>
> >>
> >>
> >> The semantic of the `_launched` property is changed in this
> >>
> >> patch. Instead of representing the condition "launch()
> executed
> >>
> >> successfully", it will now represent "has forked a child
> process
> >>
> >> successfully". This way, wait() when called in the exit
> path won't
> >>
> >> become a no-op.
> >>
> >>
> >>
> >> Signed-off-by: John Snow <jsnow@redhat.com
> <mailto:jsnow@redhat.com>>
> >>
> >> Reviewed-by: Willian Rampazzo <willianr@redhat.com
> <mailto:willianr@redhat.com>>
> >>
> >> Message-id: 20211118204620.1897674-6-jsnow@redhat.com
> <mailto:20211118204620.1897674-6-jsnow@redhat.com>
> >>
> >> Signed-off-by: John Snow <jsnow@redhat.com
> <mailto:jsnow@redhat.com>>
> >>
> >>
> >>
> >>
> >>
> >> On 3/25/22 11:17, Li Zhang wrote:
> >>> Hi,
> >>>
> >>> I backport some iotests patches to the tree and change timeout.
> >>> It doesn't work.
> >>>
> >>> Sometimes, iotest41 also reports the errors.
> >>> [ 1347s]
> >>>
> +======================================================================
> >>> [ 1347s] +ERROR: test_top_node_in_wrong_chain
> (__main__.TestSingleDrive)
> >>> [ 1347s]
> >>>
> +----------------------------------------------------------------------
> >>> [ 1347s] +Traceback (most recent call last):
> >>> [ 1347s] + File
> >>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>
> >>> line 399, in launch
> >>> [ 1347s] + self._launch()
> >>> [ 1347s] + File
> >>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>
> >>> line 434, in _launch
> >>> [ 1347s] + self._post_launch()
> >>> [ 1347s] + File
> >>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/qtest.py",
> line
> >>> 147, in _post_launch
> >>> [ 1347s] + super()._post_launch()
> >>> [ 1347s] + File
> >>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>
> >>> line 340, in _post_launch
> >>> [ 1347s] + self._qmp.accept(self._qmp_timer)
> >>> [ 1347s] + File
> >>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
>
> >>> line 69, in accept
> >>> [ 1347s] + timeout
> >>> [ 1347s] + File
> >>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
>
> >>> line 42, in _sync
> >>> [ 1347s] + asyncio.wait_for(future, timeout=timeout)
> >>> [ 1347s] + File
> "/usr/lib64/python3.6/asyncio/base_events.py", line
> >>> 488, in run_until_complete
> >>> [ 1347s] + return future.result()
> >>> [ 1347s] + File "/usr/lib64/python3.6/asyncio/tasks.py",
> line 362,
> >>> in wait_for
> >>> [ 1347s] + raise futures.TimeoutError()
> >>> [ 1347s] +concurrent.futures._base.TimeoutError
> >>>
> >>>
> >>> I can see other errors like this, it's the problem of the
> socket.
> >>>
> >>> [ 1535s] socket_accept failed: Resource temporarily unavailable
> >>> [ 1535s] **
> >>> [ 1535s]
> >>>
> ERROR:../tests/qtest/libqtest.c:321:qtest_init_without_qmp_handshake:
>
> >>> assertion failed: (s->fd >= 0 && s->qmp_fd >= 0)
> >>>
> >>>
> >>> The script is running the command as this:
> >>> /usr/bin/make -O -j4 check-block V=1
> >>>
> >>> I can see the errors on ppc, arm or x86.
> >>> But I couldn't reproduce it when I run it manually.
> >>>
> >>> It will be appreciated if any suggestions. Thanks.
> >>>
> >>>
> >>> On 3/24/22 14:47, Li Zhang wrote:
> >>>> Hi,
> >>>>
> >>>> When I run the testsuit on our buidling system, it reports
> a timeout
> >>>> sometimes not always as the following.
> >>>> It couldn't connect qmp socket. Any ideas about this problem?
> >>>>
> >>>>
> >>>> [ 1989s] ---
> >>>>
> /home/abuild/rpmbuild/BUILD/qemu-6.2.0/tests/qemu-iotests/040.out
> >>>> [ 1989s] +++ 040.out.bad
> >>>> [ 1989s] @@ -1,5 +1,55 @@
> >>>> [ 1989s]
> >>>>
> -.................................................................
> >>>> [ 1989s] +....ERROR:qemu.aqmp.qmp_client.qemu-6471:Failed to
> >>>> establish connection: asyncio.exceptions.CancelledError
> >>>> [ 1989s]
> >>>>
> +E..................................ERROR:qemu.aqmp.qmp_client.qemu-6471:Failed
>
> >>>> to establish connection: asyncio.exceptions.CancelledError
> >>>> [ 1989s] +E.........................
> >>>> [ 1989s]
> >>>>
> +======================================================================
> >>>> [ 1989s] +ERROR: test_commit_node
> (__main__.TestActiveZeroLengthImage)
> >>>> [ 1989s]
> >>>>
> +----------------------------------------------------------------------
> >>>> [ 1989s] +Traceback (most recent call last):
> >>>> [ 1989s] + File
> >>>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/tests/qemu-iotests/040",
> >>>> line 94, in setUp
> >>>> [ 1989s] + self.vm.launch()
> >>>> [ 1989s] + File
> >>>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>
> >>>> line 399, in launch
> >>>> [ 1989s] + self._launch()
> >>>> [ 1989s] + File
> >>>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>
> >>>> line 434, in _launch
> >>>> [ 1989s] + self._post_launch()
> >>>> [ 1989s] + File
> >>>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/qtest.py",
>
> >>>> line 147, in _post_launch
> >>>> [ 1989s] + super()._post_launch()
> >>>> [ 1989s] + File
> >>>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>
> >>>> line 340, in _post_launch
> >>>> [ 1989s] + self._qmp.accept(self._qmp_timer)
> >>>> [ 1989s] + File
> >>>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
>
> >>>> line 67, in accept
> >>>> [ 1989s] + self._sync(
> >>>> [ 1989s] + File
> >>>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
>
> >>>> line 41, in _sync
> >>>> [ 1989s] + return self._aloop.run_until_complete(
> >>>> [ 1989s] + File
> "/usr/lib64/python3.8/asyncio/base_events.py", line
> >>>> 616, in run_until_complete
> >>>> [ 1989s] + return future.result()
> >>>> [ 1989s] + File "/usr/lib64/python3.8/asyncio/tasks.py",
> line 501,
> >>>> in wait_for
> >>>> [ 1989s] + raise exceptions.TimeoutError()
> >>>> [ 1989s] +asyncio.exceptions.TimeoutError
> >>>>
> >>>
> >>>
> >>
> >
>
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: iotest40 problem
2022-03-31 14:46 ` John Snow
2022-03-31 15:18 ` Li Zhang
@ 2022-04-06 14:48 ` Li Zhang
2022-04-06 15:11 ` John Snow
1 sibling, 1 reply; 11+ messages in thread
From: Li Zhang @ 2022-04-06 14:48 UTC (permalink / raw)
To: John Snow; +Cc: Kevin Wolf, Hanna Reitz, QEMU Developers
On 3/31/22 16:46, John Snow wrote:
>
>
> On Thu, Mar 31, 2022, 10:37 AM John Snow <jsnow@redhat.com
> <mailto:jsnow@redhat.com>> wrote:
>
>
>
> On Thu, Mar 31, 2022, 6:47 AM Li Zhang <lizhang@suse.de
> <mailto:lizhang@suse.de>> wrote:
>
> On 3/31/22 08:10, Hanna Reitz wrote:
> > On 29.03.22 18:49, Li Zhang wrote:
> >> Update what I observed.
> >>
> >> It seems that aqmp is not stable when running test cases.
> >> So I revert the patches as the following, iotest40/41 test
> cases work
> >> well.
> >
> > Thanks for bisecting. I haven’t seen this problem before, so
> I didn’t
> > look into it; CC-ing John, perhaps he figures something.
> >
> Thanks a lot.
> It happens occassionally. I didn't find a way to reproduce it
> manually.
>
>
> Sorry for the instability.
>
> A few questions then:
>
> - what operating system and version
> - kernel version
> - architecture (looks like x86 and others?)
> - what commit/version of QEMU are you testing?
> - what python version are you using to run iotests?
>
>
> I'll also point out these patchsets, because we're in different
> timezones and if I'm lucky it will be the answer:
>
> [1] https://patchew.org/QEMU/20220201041134.1237016-1-jsnow@redhat.com/
> <https://patchew.org/QEMU/20220201041134.1237016-1-jsnow@redhat.com/>
>
> [2] https://patchew.org/QEMU/20220225205948.3693480-1-jsnow@redhat.com/
> <https://patchew.org/QEMU/20220225205948.3693480-1-jsnow@redhat.com/>
Thanks a lot.
I backport aqmp related patches, it looks good to run the cases
iotest040/041.
>
> #1 wasn't sufficient to fix the problem, but I had hoped #2 was. The
> second patchset was merged somewhat recently, just before rc testing for
> 7.0.
>
> Thanks,
> --js
>
>
>
>
>
> > Hanna
> >
> >> commit 76cd358671e6b8e7c435ec65b1c44200254514a9
> >>
> >> Author: John Snow <jsnow@redhat.com <mailto:jsnow@redhat.com>>
> >>
> >> Date: Tue Oct 26 13:56:12 2021 -0400
> >>
> >>
> >>
> >> python, iotests: replace qmp with aqmp
> >>
> >>
> >>
> >> Swap out the synchronous QEMUMonitorProtocol from
> qemu.qmp with
> >> the sync
> >>
> >> wrapper from qemu.aqmp instead.
> >>
> >>
> >>
> >> Add an escape hatch in the form of the environment variable
> >>
> >> QEMU_PYTHON_LEGACY_QMP which allows you to cajole
> QEMUMachine into
> >> using
> >>
> >> the old implementation, proving that both
> implementations work
> >>
> >> concurrently.
> >>
> >>
> >>
> >> Signed-off-by: John Snow <jsnow@redhat.com
> <mailto:jsnow@redhat.com>>
> >>
> >> Reviewed-by: Kevin Wolf <kwolf@redhat.com
> <mailto:kwolf@redhat.com>>
> >>
> >> Reviewed-by: Hanna Reitz <hreitz@redhat.com
> <mailto:hreitz@redhat.com>>
> >>
> >> Message-id: 20211026175612.4127598-9-jsnow@redhat.com
> <mailto:20211026175612.4127598-9-jsnow@redhat.com>
> >>
> >> Signed-off-by: John Snow <jsnow@redhat.com
> <mailto:jsnow@redhat.com>>
> >>
> >>
> >> commit 1611e6cf4e7163f6102b37010a8b7e7120f468b5
> >>
> >> Author: John Snow <jsnow@redhat.com <mailto:jsnow@redhat.com>>
> >>
> >> Date: Thu Nov 18 15:46:18 2021 -0500
> >>
> >>
> >>
> >> python/machine: handle "fast" QEMU terminations
> >>
> >>
> >>
> >> In the case that the QEMU process actually launches --
> but then
> >> dies so
> >>
> >> quickly that we can't establish a QMP connection to it
> -- QEMUMachine
> >>
> >> currently calls _post_shutdown() assuming that it never
> launched
> >> the VM
> >>
> >> process.
> >>
> >>
> >>
> >> This isn't true, though: it "merely" may have failed to
> establish
> >> a QMP
> >>
> >> connection and the process is in the middle of its own
> exit path.
> >>
> >>
> >>
> >> If we don't wait for the subprocess, the caller may get
> a bogus
> >> `None`
> >>
> >> return for .exitcode(). This behavior was observed from
> >>
> >> device-crash-test; after the switch to Async QMP, the
> timings were
> >>
> >> changed such that it was now seemingly possible to
> witness the
> >> failure
> >>
> >> of "vm.launch()" *prior* to the exitcode becoming available.
> >>
> >>
> >>
> >> The semantic of the `_launched` property is changed in this
> >>
> >> patch. Instead of representing the condition "launch()
> executed
> >>
> >> successfully", it will now represent "has forked a child
> process
> >>
> >> successfully". This way, wait() when called in the exit
> path won't
> >>
> >> become a no-op.
> >>
> >>
> >>
> >> Signed-off-by: John Snow <jsnow@redhat.com
> <mailto:jsnow@redhat.com>>
> >>
> >> Reviewed-by: Willian Rampazzo <willianr@redhat.com
> <mailto:willianr@redhat.com>>
> >>
> >> Message-id: 20211118204620.1897674-6-jsnow@redhat.com
> <mailto:20211118204620.1897674-6-jsnow@redhat.com>
> >>
> >> Signed-off-by: John Snow <jsnow@redhat.com
> <mailto:jsnow@redhat.com>>
> >>
> >>
> >>
> >>
> >>
> >> On 3/25/22 11:17, Li Zhang wrote:
> >>> Hi,
> >>>
> >>> I backport some iotests patches to the tree and change timeout.
> >>> It doesn't work.
> >>>
> >>> Sometimes, iotest41 also reports the errors.
> >>> [ 1347s]
> >>>
> +======================================================================
> >>> [ 1347s] +ERROR: test_top_node_in_wrong_chain
> (__main__.TestSingleDrive)
> >>> [ 1347s]
> >>>
> +----------------------------------------------------------------------
> >>> [ 1347s] +Traceback (most recent call last):
> >>> [ 1347s] + File
> >>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>
> >>> line 399, in launch
> >>> [ 1347s] + self._launch()
> >>> [ 1347s] + File
> >>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>
> >>> line 434, in _launch
> >>> [ 1347s] + self._post_launch()
> >>> [ 1347s] + File
> >>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/qtest.py",
> line
> >>> 147, in _post_launch
> >>> [ 1347s] + super()._post_launch()
> >>> [ 1347s] + File
> >>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>
> >>> line 340, in _post_launch
> >>> [ 1347s] + self._qmp.accept(self._qmp_timer)
> >>> [ 1347s] + File
> >>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
>
> >>> line 69, in accept
> >>> [ 1347s] + timeout
> >>> [ 1347s] + File
> >>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
>
> >>> line 42, in _sync
> >>> [ 1347s] + asyncio.wait_for(future, timeout=timeout)
> >>> [ 1347s] + File
> "/usr/lib64/python3.6/asyncio/base_events.py", line
> >>> 488, in run_until_complete
> >>> [ 1347s] + return future.result()
> >>> [ 1347s] + File "/usr/lib64/python3.6/asyncio/tasks.py",
> line 362,
> >>> in wait_for
> >>> [ 1347s] + raise futures.TimeoutError()
> >>> [ 1347s] +concurrent.futures._base.TimeoutError
> >>>
> >>>
> >>> I can see other errors like this, it's the problem of the
> socket.
> >>>
> >>> [ 1535s] socket_accept failed: Resource temporarily unavailable
> >>> [ 1535s] **
> >>> [ 1535s]
> >>>
> ERROR:../tests/qtest/libqtest.c:321:qtest_init_without_qmp_handshake:
>
> >>> assertion failed: (s->fd >= 0 && s->qmp_fd >= 0)
> >>>
> >>>
> >>> The script is running the command as this:
> >>> /usr/bin/make -O -j4 check-block V=1
> >>>
> >>> I can see the errors on ppc, arm or x86.
> >>> But I couldn't reproduce it when I run it manually.
> >>>
> >>> It will be appreciated if any suggestions. Thanks.
> >>>
> >>>
> >>> On 3/24/22 14:47, Li Zhang wrote:
> >>>> Hi,
> >>>>
> >>>> When I run the testsuit on our buidling system, it reports
> a timeout
> >>>> sometimes not always as the following.
> >>>> It couldn't connect qmp socket. Any ideas about this problem?
> >>>>
> >>>>
> >>>> [ 1989s] ---
> >>>>
> /home/abuild/rpmbuild/BUILD/qemu-6.2.0/tests/qemu-iotests/040.out
> >>>> [ 1989s] +++ 040.out.bad
> >>>> [ 1989s] @@ -1,5 +1,55 @@
> >>>> [ 1989s]
> >>>>
> -.................................................................
> >>>> [ 1989s] +....ERROR:qemu.aqmp.qmp_client.qemu-6471:Failed to
> >>>> establish connection: asyncio.exceptions.CancelledError
> >>>> [ 1989s]
> >>>>
> +E..................................ERROR:qemu.aqmp.qmp_client.qemu-6471:Failed
>
> >>>> to establish connection: asyncio.exceptions.CancelledError
> >>>> [ 1989s] +E.........................
> >>>> [ 1989s]
> >>>>
> +======================================================================
> >>>> [ 1989s] +ERROR: test_commit_node
> (__main__.TestActiveZeroLengthImage)
> >>>> [ 1989s]
> >>>>
> +----------------------------------------------------------------------
> >>>> [ 1989s] +Traceback (most recent call last):
> >>>> [ 1989s] + File
> >>>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/tests/qemu-iotests/040",
> >>>> line 94, in setUp
> >>>> [ 1989s] + self.vm.launch()
> >>>> [ 1989s] + File
> >>>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>
> >>>> line 399, in launch
> >>>> [ 1989s] + self._launch()
> >>>> [ 1989s] + File
> >>>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>
> >>>> line 434, in _launch
> >>>> [ 1989s] + self._post_launch()
> >>>> [ 1989s] + File
> >>>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/qtest.py",
>
> >>>> line 147, in _post_launch
> >>>> [ 1989s] + super()._post_launch()
> >>>> [ 1989s] + File
> >>>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py",
>
> >>>> line 340, in _post_launch
> >>>> [ 1989s] + self._qmp.accept(self._qmp_timer)
> >>>> [ 1989s] + File
> >>>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
>
> >>>> line 67, in accept
> >>>> [ 1989s] + self._sync(
> >>>> [ 1989s] + File
> >>>>
> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py",
>
> >>>> line 41, in _sync
> >>>> [ 1989s] + return self._aloop.run_until_complete(
> >>>> [ 1989s] + File
> "/usr/lib64/python3.8/asyncio/base_events.py", line
> >>>> 616, in run_until_complete
> >>>> [ 1989s] + return future.result()
> >>>> [ 1989s] + File "/usr/lib64/python3.8/asyncio/tasks.py",
> line 501,
> >>>> in wait_for
> >>>> [ 1989s] + raise exceptions.TimeoutError()
> >>>> [ 1989s] +asyncio.exceptions.TimeoutError
> >>>>
> >>>
> >>>
> >>
> >
>
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: iotest40 problem
2022-04-06 14:48 ` Li Zhang
@ 2022-04-06 15:11 ` John Snow
0 siblings, 0 replies; 11+ messages in thread
From: John Snow @ 2022-04-06 15:11 UTC (permalink / raw)
To: Li Zhang; +Cc: Kevin Wolf, Hanna Reitz, QEMU Developers
[-- Attachment #1: Type: text/plain, Size: 1964 bytes --]
On Wed, Apr 6, 2022, 10:53 AM Li Zhang <lizhang@suse.de> wrote:
> On 3/31/22 16:46, John Snow wrote:
> >
> >
> > On Thu, Mar 31, 2022, 10:37 AM John Snow <jsnow@redhat.com
> > <mailto:jsnow@redhat.com>> wrote:
> >
> >
> >
> > On Thu, Mar 31, 2022, 6:47 AM Li Zhang <lizhang@suse.de
> > <mailto:lizhang@suse.de>> wrote:
> >
> > On 3/31/22 08:10, Hanna Reitz wrote:
> > > On 29.03.22 18:49, Li Zhang wrote:
> > >> Update what I observed.
> > >>
> > >> It seems that aqmp is not stable when running test cases.
> > >> So I revert the patches as the following, iotest40/41 test
> > cases work
> > >> well.
> > >
> > > Thanks for bisecting. I haven’t seen this problem before, so
> > I didn’t
> > > look into it; CC-ing John, perhaps he figures something.
> > >
> > Thanks a lot.
> > It happens occassionally. I didn't find a way to reproduce it
> > manually.
> >
> >
> > Sorry for the instability.
> >
> > A few questions then:
> >
> > - what operating system and version
> > - kernel version
> > - architecture (looks like x86 and others?)
> > - what commit/version of QEMU are you testing?
> > - what python version are you using to run iotests?
> >
> >
> > I'll also point out these patchsets, because we're in different
> > timezones and if I'm lucky it will be the answer:
> >
> > [1] https://patchew.org/QEMU/20220201041134.1237016-1-jsnow@redhat.com/
> > <https://patchew.org/QEMU/20220201041134.1237016-1-jsnow@redhat.com/>
> >
> > [2] https://patchew.org/QEMU/20220225205948.3693480-1-jsnow@redhat.com/
> > <https://patchew.org/QEMU/20220225205948.3693480-1-jsnow@redhat.com/>
>
> Thanks a lot.
>
> I backport aqmp related patches, it looks good to run the cases
> iotest040/041.
>
Excellent, thanks for reporting back!
[-- Attachment #2: Type: text/html, Size: 3534 bytes --]
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2022-04-06 15:13 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-03-24 13:47 iotest40 problem Li Zhang
2022-03-25 10:17 ` Li Zhang
2022-03-29 16:49 ` Li Zhang
2022-03-31 6:10 ` Hanna Reitz
2022-03-31 10:44 ` Li Zhang
2022-03-31 14:37 ` John Snow
2022-03-31 14:46 ` John Snow
2022-03-31 15:18 ` Li Zhang
2022-04-06 14:48 ` Li Zhang
2022-04-06 15:11 ` John Snow
2022-03-31 15:16 ` Li Zhang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).