From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 7DD33C433EF for ; Thu, 31 Mar 2022 14:53:25 +0000 (UTC) Received: from localhost ([::1]:36298 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nZwAu-0001Af-MQ for qemu-devel@archiver.kernel.org; Thu, 31 Mar 2022 10:53:24 -0400 Received: from eggs.gnu.org ([209.51.188.92]:46288) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nZw4v-0002Xv-Q1 for qemu-devel@nongnu.org; Thu, 31 Mar 2022 10:47:13 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:27408) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nZw4s-00044t-SN for qemu-devel@nongnu.org; Thu, 31 Mar 2022 10:47:13 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1648738030; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=GWZ5Co1F2VnDjAJUFnjIieI39WlhiFN/uQiUDRvLRio=; b=EndXkCzQVLtNRs9jB0lh5pah4yX9sL35e38y9vZrv5LUxu5+BR2wC0VR0SIYrZuvPKPPk8 sCBEUaSpnmu0Gm3anP77tr5h8IAUt170bF9q8VxlhNh0THk/tTOoKU8CgHm5wZ6DZpvk1E nmt8RcfvjNUuDNJmjIbm8PI+d4S+ijA= Received: from mail-ua1-f71.google.com (mail-ua1-f71.google.com [209.85.222.71]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-454-GeXIVlV0P1yybzh8JIjbVw-1; Thu, 31 Mar 2022 10:47:06 -0400 X-MC-Unique: GeXIVlV0P1yybzh8JIjbVw-1 Received: by mail-ua1-f71.google.com with SMTP id i4-20020ab04744000000b003520c239119so9178710uac.18 for ; Thu, 31 Mar 2022 07:47:06 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=GWZ5Co1F2VnDjAJUFnjIieI39WlhiFN/uQiUDRvLRio=; b=LVuB+kMwpckFKokeh7dzR72DEldRi+goqvX30rNIxX67RBwBT2SN89Dynq0mPTI9bJ BVGqV3t7ZMBLG7SNCJ6VCtM/IK/dzjGO5oQh9+kV5jrCGQcY7veSFnVu5g5lFJOFAJoX 77GH+cveYWnDHfMtAqrvcWACn4pAoX8ZnsANByrPH30Z/JKzf4rV5YLlSLopVacDj4Oq XS2hRu5vkDhXeCqFX88gXY5IUGzBSTEDmg/ssVpwZxwFRX8hdjJk7fMYZayAG4MILQ4r muK1vDC2Avcenz/mUJz6rIWXn16KSj9aKiiJW27KXUI/JN7tbdH+MfS6dBRrYoiBGDuW MhtA== X-Gm-Message-State: AOAM532wv3AF79tFXNk24Vidp+0F/wK/eRbfeJiUcXpBuPpqiYyrAiz+ 6Qddga+OszkYQdRZU1NB1OygSl5mqzlMOa6qZ/rUd3SOmonNCbp/3NU/bkNHRgGA4GYk4+i2naX bI5U5StCGqmDhWorpenu63+PbC/MBMIg= X-Received: by 2002:a05:6122:887:b0:332:699e:7e67 with SMTP id 7-20020a056122088700b00332699e7e67mr2190858vkf.35.1648738025804; Thu, 31 Mar 2022 07:47:05 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwL7J1Rt/gZVuQ3Tl66nHU2PUr6dKR1efuxpgvzUHcuihsV7kIdokHqi2XmL0ekLw6LfPVXGbm+lM/o7wNNlPg= X-Received: by 2002:a05:6122:887:b0:332:699e:7e67 with SMTP id 7-20020a056122088700b00332699e7e67mr2190849vkf.35.1648738025482; Thu, 31 Mar 2022 07:47:05 -0700 (PDT) MIME-Version: 1.0 References: <1c4fd757-5d08-0099-b842-4295c430a176@suse.de> <32ff3a63-fb50-8038-3f2e-5bfd70b01344@suse.de> <37635a04-b717-b7c6-88a0-1b3cecc0c4f5@suse.de> <8a898f41-98e1-5a02-4cd8-a697bcb6d9a1@redhat.com> <7a952654-c163-00c9-8538-e07afef47ca8@suse.de> In-Reply-To: From: John Snow Date: Thu, 31 Mar 2022 10:46:55 -0400 Message-ID: Subject: Re: iotest40 problem To: Li Zhang Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=jsnow@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: multipart/alternative; boundary="0000000000005642ef05db84bca1" Received-SPF: pass client-ip=170.10.133.124; envelope-from=jsnow@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -28 X-Spam_score: -2.9 X-Spam_bar: -- X-Spam_report: (-2.9 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.082, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Hanna Reitz , QEMU Developers Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" --0000000000005642ef05db84bca1 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Thu, Mar 31, 2022, 10:37 AM John Snow wrote: > > > On Thu, Mar 31, 2022, 6:47 AM Li Zhang wrote: > >> On 3/31/22 08:10, Hanna Reitz wrote: >> > On 29.03.22 18:49, Li Zhang wrote: >> >> Update what I observed. >> >> >> >> It seems that aqmp is not stable when running test cases. >> >> So I revert the patches as the following, iotest40/41 test cases work >> >> well. >> > >> > Thanks for bisecting. I haven=E2=80=99t seen this problem before, so = I didn=E2=80=99t >> > look into it; CC-ing John, perhaps he figures something. >> > >> Thanks a lot. >> It happens occassionally. I didn't find a way to reproduce it manually. >> > > Sorry for the instability. > > A few questions then: > > - what operating system and version > - kernel version > - architecture (looks like x86 and others?) > - what commit/version of QEMU are you testing? > - what python version are you using to run iotests? > I'll also point out these patchsets, because we're in different timezones and if I'm lucky it will be the answer: [1] https://patchew.org/QEMU/20220201041134.1237016-1-jsnow@redhat.com/ [2] https://patchew.org/QEMU/20220225205948.3693480-1-jsnow@redhat.com/ #1 wasn't sufficient to fix the problem, but I had hoped #2 was. The second patchset was merged somewhat recently, just before rc testing for 7.0. Thanks, --js > > >> > Hanna >> > >> >> commit 76cd358671e6b8e7c435ec65b1c44200254514a9 >> >> >> >> Author: John Snow >> >> >> >> Date: Tue Oct 26 13:56:12 2021 -0400 >> >> >> >> >> >> >> >> python, iotests: replace qmp with aqmp >> >> >> >> >> >> >> >> Swap out the synchronous QEMUMonitorProtocol from qemu.qmp with >> >> the sync >> >> >> >> wrapper from qemu.aqmp instead. >> >> >> >> >> >> >> >> Add an escape hatch in the form of the environment variable >> >> >> >> QEMU_PYTHON_LEGACY_QMP which allows you to cajole QEMUMachine int= o >> >> using >> >> >> >> the old implementation, proving that both implementations work >> >> >> >> concurrently. >> >> >> >> >> >> >> >> Signed-off-by: John Snow >> >> >> >> Reviewed-by: Kevin Wolf >> >> >> >> Reviewed-by: Hanna Reitz >> >> >> >> Message-id: 20211026175612.4127598-9-jsnow@redhat.com >> >> >> >> Signed-off-by: John Snow >> >> >> >> >> >> commit 1611e6cf4e7163f6102b37010a8b7e7120f468b5 >> >> >> >> Author: John Snow >> >> >> >> Date: Thu Nov 18 15:46:18 2021 -0500 >> >> >> >> >> >> >> >> python/machine: handle "fast" QEMU terminations >> >> >> >> >> >> >> >> In the case that the QEMU process actually launches -- but then >> >> dies so >> >> >> >> quickly that we can't establish a QMP connection to it -- >> QEMUMachine >> >> >> >> currently calls _post_shutdown() assuming that it never launched >> >> the VM >> >> >> >> process. >> >> >> >> >> >> >> >> This isn't true, though: it "merely" may have failed to establish >> >> a QMP >> >> >> >> connection and the process is in the middle of its own exit path. >> >> >> >> >> >> >> >> If we don't wait for the subprocess, the caller may get a bogus >> >> `None` >> >> >> >> return for .exitcode(). This behavior was observed from >> >> >> >> device-crash-test; after the switch to Async QMP, the timings wer= e >> >> >> >> changed such that it was now seemingly possible to witness the >> >> failure >> >> >> >> of "vm.launch()" *prior* to the exitcode becoming available. >> >> >> >> >> >> >> >> The semantic of the `_launched` property is changed in this >> >> >> >> patch. Instead of representing the condition "launch() executed >> >> >> >> successfully", it will now represent "has forked a child process >> >> >> >> successfully". This way, wait() when called in the exit path won'= t >> >> >> >> become a no-op. >> >> >> >> >> >> >> >> Signed-off-by: John Snow >> >> >> >> Reviewed-by: Willian Rampazzo >> >> >> >> Message-id: 20211118204620.1897674-6-jsnow@redhat.com >> >> >> >> Signed-off-by: John Snow >> >> >> >> >> >> >> >> >> >> >> >> On 3/25/22 11:17, Li Zhang wrote: >> >>> Hi, >> >>> >> >>> I backport some iotests patches to the tree and change timeout. >> >>> It doesn't work. >> >>> >> >>> Sometimes, iotest41 also reports the errors. >> >>> [ 1347s] >> >>> >> +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D >> >>> [ 1347s] +ERROR: test_top_node_in_wrong_chain >> (__main__.TestSingleDrive) >> >>> [ 1347s] >> >>> >> +---------------------------------------------------------------------- >> >>> [ 1347s] +Traceback (most recent call last): >> >>> [ 1347s] + File >> >>> >> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py", >> >>> line 399, in launch >> >>> [ 1347s] + self._launch() >> >>> [ 1347s] + File >> >>> >> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py", >> >>> line 434, in _launch >> >>> [ 1347s] + self._post_launch() >> >>> [ 1347s] + File >> >>> >> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/qtest.py", l= ine >> >>> 147, in _post_launch >> >>> [ 1347s] + super()._post_launch() >> >>> [ 1347s] + File >> >>> >> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py", >> >>> line 340, in _post_launch >> >>> [ 1347s] + self._qmp.accept(self._qmp_timer) >> >>> [ 1347s] + File >> >>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py", >> >>> line 69, in accept >> >>> [ 1347s] + timeout >> >>> [ 1347s] + File >> >>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py", >> >>> line 42, in _sync >> >>> [ 1347s] + asyncio.wait_for(future, timeout=3Dtimeout) >> >>> [ 1347s] + File "/usr/lib64/python3.6/asyncio/base_events.py", line >> >>> 488, in run_until_complete >> >>> [ 1347s] + return future.result() >> >>> [ 1347s] + File "/usr/lib64/python3.6/asyncio/tasks.py", line 362, >> >>> in wait_for >> >>> [ 1347s] + raise futures.TimeoutError() >> >>> [ 1347s] +concurrent.futures._base.TimeoutError >> >>> >> >>> >> >>> I can see other errors like this, it's the problem of the socket. >> >>> >> >>> [ 1535s] socket_accept failed: Resource temporarily unavailable >> >>> [ 1535s] ** >> >>> [ 1535s] >> >>> ERROR:../tests/qtest/libqtest.c:321:qtest_init_without_qmp_handshake= : >> >>> assertion failed: (s->fd >=3D 0 && s->qmp_fd >=3D 0) >> >>> >> >>> >> >>> The script is running the command as this: >> >>> /usr/bin/make -O -j4 check-block V=3D1 >> >>> >> >>> I can see the errors on ppc, arm or x86. >> >>> But I couldn't reproduce it when I run it manually. >> >>> >> >>> It will be appreciated if any suggestions. Thanks. >> >>> >> >>> >> >>> On 3/24/22 14:47, Li Zhang wrote: >> >>>> Hi, >> >>>> >> >>>> When I run the testsuit on our buidling system, it reports a timeou= t >> >>>> sometimes not always as the following. >> >>>> It couldn't connect qmp socket. Any ideas about this problem? >> >>>> >> >>>> >> >>>> [ 1989s] --- >> >>>> /home/abuild/rpmbuild/BUILD/qemu-6.2.0/tests/qemu-iotests/040.out >> >>>> [ 1989s] +++ 040.out.bad >> >>>> [ 1989s] @@ -1,5 +1,55 @@ >> >>>> [ 1989s] >> >>>> -................................................................. >> >>>> [ 1989s] +....ERROR:qemu.aqmp.qmp_client.qemu-6471:Failed to >> >>>> establish connection: asyncio.exceptions.CancelledError >> >>>> [ 1989s] >> >>>> >> +E..................................ERROR:qemu.aqmp.qmp_client.qemu-6471= :Failed >> >> >>>> to establish connection: asyncio.exceptions.CancelledError >> >>>> [ 1989s] +E......................... >> >>>> [ 1989s] >> >>>> >> +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D >> >>>> [ 1989s] +ERROR: test_commit_node >> (__main__.TestActiveZeroLengthImage) >> >>>> [ 1989s] >> >>>> >> +---------------------------------------------------------------------- >> >>>> [ 1989s] +Traceback (most recent call last): >> >>>> [ 1989s] + File >> >>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/tests/qemu-iotests/040", >> >>>> line 94, in setUp >> >>>> [ 1989s] + self.vm.launch() >> >>>> [ 1989s] + File >> >>>> >> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py", >> >>>> line 399, in launch >> >>>> [ 1989s] + self._launch() >> >>>> [ 1989s] + File >> >>>> >> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py", >> >>>> line 434, in _launch >> >>>> [ 1989s] + self._post_launch() >> >>>> [ 1989s] + File >> >>>> >> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/qtest.py", >> >>>> line 147, in _post_launch >> >>>> [ 1989s] + super()._post_launch() >> >>>> [ 1989s] + File >> >>>> >> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py", >> >>>> line 340, in _post_launch >> >>>> [ 1989s] + self._qmp.accept(self._qmp_timer) >> >>>> [ 1989s] + File >> >>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py"= , >> >>>> line 67, in accept >> >>>> [ 1989s] + self._sync( >> >>>> [ 1989s] + File >> >>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py"= , >> >>>> line 41, in _sync >> >>>> [ 1989s] + return self._aloop.run_until_complete( >> >>>> [ 1989s] + File "/usr/lib64/python3.8/asyncio/base_events.py", lin= e >> >>>> 616, in run_until_complete >> >>>> [ 1989s] + return future.result() >> >>>> [ 1989s] + File "/usr/lib64/python3.8/asyncio/tasks.py", line 501, >> >>>> in wait_for >> >>>> [ 1989s] + raise exceptions.TimeoutError() >> >>>> [ 1989s] +asyncio.exceptions.TimeoutError >> >>>> >> >>> >> >>> >> >> >> > >> >> >> --0000000000005642ef05db84bca1 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable


On Thu, Mar 31, 2022, 10:37 AM John Snow <jsnow@redhat.com> wrote:


On Thu, Mar 31, 2022, 6:47 A= M Li Zhang <lizhang@suse.de> wrote:
On 3/31/22 08:10, Hanna Reitz wrote:
> On 29.03.22 18:49, Li Zhang wrote:
>> Update what I observed.
>>
>> It seems that aqmp is not stable when running test cases.
>> So I revert the patches as the following, iotest40/41 test cases w= ork
>> well.
>
> Thanks for bisecting.=C2=A0 I haven=E2=80=99t seen this problem before= , so I didn=E2=80=99t
> look into it; CC-ing John, perhaps he figures something.
>
Thanks a lot.
It happens occassionally. I didn't find a way to reproduce it manually.=

= Sorry for the instability.

A few questions then:

- what operating system and version
- kernel versio= n=C2=A0
- architecture (looks like x86 and others?)<= /div>
- what commit/version of QEMU are you testing?
=
- what python version are you using to run iotests?
=

I'll also point out these patchsets, because we're in different t= imezones and if I'm lucky it will be the answer:


[2]=C2=A0https://patchew.org/QEMU/20220225205948.36934= 80-1-jsnow@redhat.com/

#1 wasn't sufficient to fix the problem, but I had hoped #2 was. The= second patchset was merged somewhat recently, just before rc testing for 7= .0.

Thanks,
--js





> Hanna
>
>> commit 76cd358671e6b8e7c435ec65b1c44200254514a9
>>
>> Author: John Snow <jsnow@redhat.com>=
>>
>> Date:=C2=A0=C2=A0 Tue Oct 26 13:56:12 2021 -0400
>>
>>
>>
>> =C2=A0=C2=A0=C2=A0 python, iotests: replace qmp with aqmp
>>
>>
>>
>> =C2=A0=C2=A0=C2=A0 Swap out the synchronous QEMUMonitorProtocol fr= om qemu.qmp with
>> the sync
>>
>> =C2=A0=C2=A0=C2=A0 wrapper from qemu.aqmp instead.
>>
>>
>>
>> =C2=A0=C2=A0=C2=A0 Add an escape hatch in the form of the environm= ent variable
>>
>> =C2=A0=C2=A0=C2=A0 QEMU_PYTHON_LEGACY_QMP which allows you to cajo= le QEMUMachine into
>> using
>>
>> =C2=A0=C2=A0=C2=A0 the old implementation, proving that both imple= mentations work
>>
>> =C2=A0=C2=A0=C2=A0 concurrently.
>>
>>
>>
>> =C2=A0=C2=A0=C2=A0 Signed-off-by: John Snow <jsnow@redhat.com>
>>
>> =C2=A0=C2=A0=C2=A0 Reviewed-by: Kevin Wolf <kwolf@redhat.com>
>>
>> =C2=A0=C2=A0=C2=A0 Reviewed-by: Hanna Reitz <hreitz@redhat.com>
>>
>> =C2=A0=C2=A0=C2=A0 Message-id: 20211026175612.4127598-9-jsnow@redhat.com
>>
>> =C2=A0=C2=A0=C2=A0 Signed-off-by: John Snow <jsnow@redhat.com>
>>
>>
>> commit 1611e6cf4e7163f6102b37010a8b7e7120f468b5
>>
>> Author: John Snow <jsnow@redhat.com>=
>>
>> Date:=C2=A0=C2=A0 Thu Nov 18 15:46:18 2021 -0500
>>
>>
>>
>> =C2=A0=C2=A0=C2=A0 python/machine: handle "fast" QEMU te= rminations
>>
>>
>>
>> =C2=A0=C2=A0=C2=A0 In the case that the QEMU process actually laun= ches -- but then
>> dies so
>>
>> =C2=A0=C2=A0=C2=A0 quickly that we can't establish a QMP conne= ction to it -- QEMUMachine
>>
>> =C2=A0=C2=A0=C2=A0 currently calls _post_shutdown() assuming that = it never launched
>> the VM
>>
>> =C2=A0=C2=A0=C2=A0 process.
>>
>>
>>
>> =C2=A0=C2=A0=C2=A0 This isn't true, though: it "merely&qu= ot; may have failed to establish
>> a QMP
>>
>> =C2=A0=C2=A0=C2=A0 connection and the process is in the middle of = its own exit path.
>>
>>
>>
>> =C2=A0=C2=A0=C2=A0 If we don't wait for the subprocess, the ca= ller may get a bogus
>> `None`
>>
>> =C2=A0=C2=A0=C2=A0 return for .exitcode(). This behavior was obser= ved from
>>
>> =C2=A0=C2=A0=C2=A0 device-crash-test; after the switch to Async QM= P, the timings were
>>
>> =C2=A0=C2=A0=C2=A0 changed such that it was now seemingly possible= to witness the
>> failure
>>
>> =C2=A0=C2=A0=C2=A0 of "vm.launch()" *prior* to the exitc= ode becoming available.
>>
>>
>>
>> =C2=A0=C2=A0=C2=A0 The semantic of the `_launched` property is cha= nged in this
>>
>> =C2=A0=C2=A0=C2=A0 patch. Instead of representing the condition &q= uot;launch() executed
>>
>> =C2=A0=C2=A0=C2=A0 successfully", it will now represent "= ;has forked a child process
>>
>> =C2=A0=C2=A0=C2=A0 successfully". This way, wait() when calle= d in the exit path won't
>>
>> =C2=A0=C2=A0=C2=A0 become a no-op.
>>
>>
>>
>> =C2=A0=C2=A0=C2=A0 Signed-off-by: John Snow <jsnow@redhat.com>
>>
>> =C2=A0=C2=A0=C2=A0 Reviewed-by: Willian Rampazzo <willianr@redhat.com>
>>
>> =C2=A0=C2=A0=C2=A0 Message-id: 20211118204620.1897674-6-jsnow@redhat.com
>>
>> =C2=A0=C2=A0=C2=A0 Signed-off-by: John Snow <jsnow@redhat.com>
>>
>>
>>
>>
>>
>> On 3/25/22 11:17, Li Zhang wrote:
>>> Hi,
>>>
>>> I backport some iotests patches to the tree and change timeout= .
>>> It doesn't work.
>>>
>>> Sometimes, iotest41 also reports the errors.
>>> [ 1347s]
>>> +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
>>> [ 1347s] +ERROR: test_top_node_in_wrong_chain (__main__.TestSi= ngleDrive)
>>> [ 1347s]
>>> +-------------------------------------------------------------= ---------
>>> [ 1347s] +Traceback (most recent call last):
>>> [ 1347s] +=C2=A0 File
>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machi= ne/machine.py",
>>> line 399, in launch
>>> [ 1347s] +=C2=A0=C2=A0=C2=A0 self._launch()
>>> [ 1347s] +=C2=A0 File
>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machi= ne/machine.py",
>>> line 434, in _launch
>>> [ 1347s] +=C2=A0=C2=A0=C2=A0 self._post_launch()
>>> [ 1347s] +=C2=A0 File
>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machi= ne/qtest.py", line
>>> 147, in _post_launch
>>> [ 1347s] +=C2=A0=C2=A0=C2=A0 super()._post_launch()
>>> [ 1347s] +=C2=A0 File
>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machi= ne/machine.py",
>>> line 340, in _post_launch
>>> [ 1347s] +=C2=A0=C2=A0=C2=A0 self._qmp.accept(self._qmp_timer)=
>>> [ 1347s] +=C2=A0 File
>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/= legacy.py",
>>> line 69, in accept
>>> [ 1347s] +=C2=A0=C2=A0=C2=A0 timeout
>>> [ 1347s] +=C2=A0 File
>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/= legacy.py",
>>> line 42, in _sync
>>> [ 1347s] +=C2=A0=C2=A0=C2=A0 asyncio.wait_for(future, timeout= =3Dtimeout)
>>> [ 1347s] +=C2=A0 File "/usr/lib64/python3.6/asyncio/base_= events.py", line
>>> 488, in run_until_complete
>>> [ 1347s] +=C2=A0=C2=A0=C2=A0 return future.result()
>>> [ 1347s] +=C2=A0 File "/usr/lib64/python3.6/asyncio/tasks= .py", line 362,
>>> in wait_for
>>> [ 1347s] +=C2=A0=C2=A0=C2=A0 raise futures.TimeoutError()
>>> [ 1347s] +concurrent.futures._base.TimeoutError
>>>
>>>
>>> I can see other errors like this, it's the problem of the = socket.
>>>
>>> [ 1535s] socket_accept failed: Resource temporarily unavailabl= e
>>> [ 1535s] **
>>> [ 1535s]
>>> ERROR:../tests/qtest/libqtest.c:321:qtest_init_without_qmp_han= dshake:
>>> assertion failed: (s->fd >=3D 0 && s->qmp_fd = >=3D 0)
>>>
>>>
>>> The script is running the command as this:
>>> /usr/bin/make -O -j4 check-block V=3D1
>>>
>>> I can see the errors on ppc, arm or x86.
>>> But I couldn't reproduce it when I run it manually.
>>>
>>> It will be appreciated if any suggestions. Thanks.
>>>
>>>
>>> On 3/24/22 14:47, Li Zhang wrote:
>>>> Hi,
>>>>
>>>> When I run the testsuit on our buidling system, it reports= a timeout
>>>> sometimes not always as the following.
>>>> It couldn't connect qmp socket. Any ideas about this p= roblem?
>>>>
>>>>
>>>> [ 1989s] ---
>>>> /home/abuild/rpmbuild/BUILD/qemu-6.2.0/tests/qemu-iotests/= 040.out
>>>> [ 1989s] +++ 040.out.bad
>>>> [ 1989s] @@ -1,5 +1,55 @@
>>>> [ 1989s]
>>>> -.........................................................= ........
>>>> [ 1989s] +....ERROR:qemu.aqmp.qmp_client.qemu-6471:Failed = to
>>>> establish connection: asyncio.exceptions.CancelledError >>>> [ 1989s]
>>>> +E..................................ERROR:qemu.aqmp.qmp_cl= ient.qemu-6471:Failed
>>>> to establish connection: asyncio.exceptions.CancelledError=
>>>> [ 1989s] +E.........................
>>>> [ 1989s]
>>>> +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D
>>>> [ 1989s] +ERROR: test_commit_node (__main__.TestActiveZero= LengthImage)
>>>> [ 1989s]
>>>> +---------------------------------------------------------= -------------
>>>> [ 1989s] +Traceback (most recent call last):
>>>> [ 1989s] +=C2=A0 File
>>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/tests/qemu-io= tests/040",
>>>> line 94, in setUp
>>>> [ 1989s] +=C2=A0=C2=A0=C2=A0 self.vm.launch()
>>>> [ 1989s] +=C2=A0 File
>>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/m= achine/machine.py",
>>>> line 399, in launch
>>>> [ 1989s] +=C2=A0=C2=A0=C2=A0 self._launch()
>>>> [ 1989s] +=C2=A0 File
>>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/m= achine/machine.py",
>>>> line 434, in _launch
>>>> [ 1989s] +=C2=A0=C2=A0=C2=A0 self._post_launch()
>>>> [ 1989s] +=C2=A0 File
>>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/m= achine/qtest.py",
>>>> line 147, in _post_launch
>>>> [ 1989s] +=C2=A0=C2=A0=C2=A0 super()._post_launch()
>>>> [ 1989s] +=C2=A0 File
>>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/m= achine/machine.py",
>>>> line 340, in _post_launch
>>>> [ 1989s] +=C2=A0=C2=A0=C2=A0 self._qmp.accept(self._qmp_ti= mer)
>>>> [ 1989s] +=C2=A0 File
>>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/a= qmp/legacy.py",
>>>> line 67, in accept
>>>> [ 1989s] +=C2=A0=C2=A0=C2=A0 self._sync(
>>>> [ 1989s] +=C2=A0 File
>>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/a= qmp/legacy.py",
>>>> line 41, in _sync
>>>> [ 1989s] +=C2=A0=C2=A0=C2=A0 return self._aloop.run_until_= complete(
>>>> [ 1989s] +=C2=A0 File "/usr/lib64/python3.8/asyncio/b= ase_events.py", line
>>>> 616, in run_until_complete
>>>> [ 1989s] +=C2=A0=C2=A0=C2=A0 return future.result()
>>>> [ 1989s] +=C2=A0 File "/usr/lib64/python3.8/asyncio/t= asks.py", line 501,
>>>> in wait_for
>>>> [ 1989s] +=C2=A0=C2=A0=C2=A0 raise exceptions.TimeoutError= ()
>>>> [ 1989s] +asyncio.exceptions.TimeoutError
>>>>
>>>
>>>
>>
>


--0000000000005642ef05db84bca1--