qemu-devel.nongnu.org archive mirror
From: Li Zhang <lizhang@suse.de>
To: QEMU Developers <qemu-devel@nongnu.org>,
	kwolf@redhat.com, Hanna Reitz <hreitz@redhat.com>
Subject: Re: iotest40 problem
Date: Tue, 29 Mar 2022 18:49:01 +0200
Message-ID: <37635a04-b717-b7c6-88a0-1b3cecc0c4f5@suse.de>
In-Reply-To: <32ff3a63-fb50-8038-3f2e-5bfd70b01344@suse.de>

Here is an update on what I observed.

It seems that aqmp is not stable when running the test cases.
After I reverted the following two patches, the iotest40/41 test cases
work well.

commit 76cd358671e6b8e7c435ec65b1c44200254514a9
Author: John Snow <jsnow@redhat.com>
Date:   Tue Oct 26 13:56:12 2021 -0400

    python, iotests: replace qmp with aqmp

    Swap out the synchronous QEMUMonitorProtocol from qemu.qmp with the sync
    wrapper from qemu.aqmp instead.

    Add an escape hatch in the form of the environment variable
    QEMU_PYTHON_LEGACY_QMP which allows you to cajole QEMUMachine into using
    the old implementation, proving that both implementations work
    concurrently.

    Signed-off-by: John Snow <jsnow@redhat.com>
    Reviewed-by: Kevin Wolf <kwolf@redhat.com>
    Reviewed-by: Hanna Reitz <hreitz@redhat.com>
    Message-id: 20211026175612.4127598-9-jsnow@redhat.com
    Signed-off-by: John Snow <jsnow@redhat.com>

commit 1611e6cf4e7163f6102b37010a8b7e7120f468b5
Author: John Snow <jsnow@redhat.com>
Date:   Thu Nov 18 15:46:18 2021 -0500

    python/machine: handle "fast" QEMU terminations

    In the case that the QEMU process actually launches -- but then dies so
    quickly that we can't establish a QMP connection to it -- QEMUMachine
    currently calls _post_shutdown() assuming that it never launched the VM
    process.

    This isn't true, though: it "merely" may have failed to establish a QMP
    connection and the process is in the middle of its own exit path.

    If we don't wait for the subprocess, the caller may get a bogus `None`
    return for .exitcode(). This behavior was observed from
    device-crash-test; after the switch to Async QMP, the timings were
    changed such that it was now seemingly possible to witness the failure
    of "vm.launch()" *prior* to the exitcode becoming available.

    The semantic of the `_launched` property is changed in this
    patch. Instead of representing the condition "launch() executed
    successfully", it will now represent "has forked a child process
    successfully". This way, wait() when called in the exit path won't
    become a no-op.

    Signed-off-by: John Snow <jsnow@redhat.com>
    Reviewed-by: Willian Rampazzo <willianr@redhat.com>
    Message-id: 20211118204620.1897674-6-jsnow@redhat.com
    Signed-off-by: John Snow <jsnow@redhat.com>
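
The "bogus None exit code" described in this commit message can be
reproduced with the standard subprocess module alone. The sketch below
does not use QEMUMachine; a short-lived Python child stands in for a QEMU
process that exits before a QMP connection is established, and the exit
status 3 is an arbitrary choice for the example.

```python
import subprocess
import sys

# Child that exits almost immediately with a non-zero status, standing in
# for a QEMU process that dies before a QMP connection can be made.
child = subprocess.Popen([sys.executable, "-c", "import sys; sys.exit(3)"])

# poll() does not block: it returns None while the child is still running,
# so a caller that asks for the exit code too early may see None.
early = child.poll()

# wait() blocks until the child has exited and its status was collected.
final = child.wait()

print("poll() right after spawning:", early)
print("wait():", final)
```

Whether the early poll() sees None depends on timing, which is the point of
the commit: only after waiting for the child is the exit code reliable.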





On 3/25/22 11:17, Li Zhang wrote:
> Hi,
> 
> I backported some iotests patches to the tree and changed the timeout,
> but it doesn't help.
> 
> Sometimes iotest41 also reports errors:
> [ 1347s] +======================================================================
> [ 1347s] +ERROR: test_top_node_in_wrong_chain (__main__.TestSingleDrive)
> [ 1347s] +----------------------------------------------------------------------
> [ 1347s] +Traceback (most recent call last):
> [ 1347s] +  File "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py", line 399, in launch
> [ 1347s] +    self._launch()
> [ 1347s] +  File "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py", line 434, in _launch
> [ 1347s] +    self._post_launch()
> [ 1347s] +  File "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/qtest.py", line 147, in _post_launch
> [ 1347s] +    super()._post_launch()
> [ 1347s] +  File "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py", line 340, in _post_launch
> [ 1347s] +    self._qmp.accept(self._qmp_timer)
> [ 1347s] +  File "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py", line 69, in accept
> [ 1347s] +    timeout
> [ 1347s] +  File "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py", line 42, in _sync
> [ 1347s] +    asyncio.wait_for(future, timeout=timeout)
> [ 1347s] +  File "/usr/lib64/python3.6/asyncio/base_events.py", line 488, in run_until_complete
> [ 1347s] +    return future.result()
> [ 1347s] +  File "/usr/lib64/python3.6/asyncio/tasks.py", line 362, in wait_for
> [ 1347s] +    raise futures.TimeoutError()
> [ 1347s] +concurrent.futures._base.TimeoutError
> 
> 
> I also see other errors like the following; they look like a socket problem.
> 
> [ 1535s] socket_accept failed: Resource temporarily unavailable
> [ 1535s] **
> [ 1535s] ERROR:../tests/qtest/libqtest.c:321:qtest_init_without_qmp_handshake: assertion failed: (s->fd >= 0 && s->qmp_fd >= 0)
> 
> 
> The build script runs the following command:
> /usr/bin/make -O -j4 check-block V=1
> 
> I can see the errors on ppc, arm or x86,
> but I couldn't reproduce them when running the tests manually.
> 
> Any suggestions would be appreciated. Thanks.
> 
> 
> On 3/24/22 14:47, Li Zhang wrote:
>> Hi,
>>
>> When I run the test suite on our build system, it sometimes (but not
>> always) reports a timeout like the following.
>> It couldn't connect to the QMP socket. Any ideas about this problem?
>>
>>
>> [ 1989s] --- /home/abuild/rpmbuild/BUILD/qemu-6.2.0/tests/qemu-iotests/040.out
>> [ 1989s] +++ 040.out.bad
>> [ 1989s] @@ -1,5 +1,55 @@
>> [ 1989s] -.................................................................
>> [ 1989s] +....ERROR:qemu.aqmp.qmp_client.qemu-6471:Failed to establish connection: asyncio.exceptions.CancelledError
>> [ 1989s] +E..................................ERROR:qemu.aqmp.qmp_client.qemu-6471:Failed to establish connection: asyncio.exceptions.CancelledError
>> [ 1989s] +E.........................
>> [ 1989s] +======================================================================
>> [ 1989s] +ERROR: test_commit_node (__main__.TestActiveZeroLengthImage)
>> [ 1989s] +----------------------------------------------------------------------
>> [ 1989s] +Traceback (most recent call last):
>> [ 1989s] +  File "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/tests/qemu-iotests/040", line 94, in setUp
>> [ 1989s] +    self.vm.launch()
>> [ 1989s] +  File "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py", line 399, in launch
>> [ 1989s] +    self._launch()
>> [ 1989s] +  File "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py", line 434, in _launch
>> [ 1989s] +    self._post_launch()
>> [ 1989s] +  File "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/qtest.py", line 147, in _post_launch
>> [ 1989s] +    super()._post_launch()
>> [ 1989s] +  File "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py", line 340, in _post_launch
>> [ 1989s] +    self._qmp.accept(self._qmp_timer)
>> [ 1989s] +  File "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py", line 67, in accept
>> [ 1989s] +    self._sync(
>> [ 1989s] +  File "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py", line 41, in _sync
>> [ 1989s] +    return self._aloop.run_until_complete(
>> [ 1989s] +  File "/usr/lib64/python3.8/asyncio/base_events.py", line 616, in run_until_complete
>> [ 1989s] +    return future.result()
>> [ 1989s] +  File "/usr/lib64/python3.8/asyncio/tasks.py", line 501, in wait_for
>> [ 1989s] +    raise exceptions.TimeoutError()
>> [ 1989s] +asyncio.exceptions.TimeoutError
>>
> 
> 
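
Both quoted tracebacks end in asyncio.wait_for() raising a timeout from
inside run_until_complete(). The pattern can be sketched with plain asyncio
as below. The names never_connects and sync_call are hypothetical stand-ins
chosen for this example, and the code only mirrors the call shape visible
in the traceback, not the real qemu.aqmp.legacy implementation.

```python
import asyncio


async def never_connects():
    # Stand-in for waiting on a QMP connection that never arrives.
    await asyncio.sleep(3600)


def sync_call(loop, coro, timeout):
    # Run the coroutine to completion on the loop, giving up after `timeout`
    # seconds; wait_for() cancels the coroutine when the timeout expires.
    return loop.run_until_complete(asyncio.wait_for(coro, timeout=timeout))


loop = asyncio.new_event_loop()
try:
    sync_call(loop, never_connects(), timeout=0.1)
    outcome = "connected"
except asyncio.TimeoutError:
    outcome = "timed out"
finally:
    loop.close()

print(outcome)
```

Because wait_for() cancels the pending coroutine before raising, a
CancelledError logged by the client next to the TimeoutError, as in the
040 output, would be consistent with a timeout of this kind rather than
with a separate failure.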



