From: Juan Quintela <quintela@redhat.com>
To: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Cc: Thomas Huth <thuth@redhat.com>,
Peter Maydell <peter.maydell@linaro.org>,
Kevin Wolf <kwolf@redhat.com>,
hreitz@redhat.com, qemu-devel@nongnu.org,
Daniel Berrange <berrange@redhat.com>,
richard.henderson@linaro.org,
Qemu-block <qemu-block@nongnu.org>
Subject: Re: How to tame CI?
Date: Thu, 05 Oct 2023 16:36:10 +0200
Message-ID: <87ttr5jfvp.fsf@secure.mitica>
In-Reply-To: <602039f4-2a22-49ed-ab19-5ca62c9f2b47@yandex-team.ru> (Vladimir Sementsov-Ogievskiy's message of "Thu, 5 Oct 2023 15:35:15 +0300")
Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> wrote:
> On 26.07.23 16:32, Thomas Huth wrote:
>> On 26/07/2023 15.00, Peter Maydell wrote:
>>> On Wed, 26 Jul 2023 at 13:06, Juan Quintela <quintela@redhat.com> wrote:
>>>> To make things easier, this is the part that shows how it breaks (this is
>>>> the gcov test):
>>>>
>>>> 357/423 qemu:block / io-qcow2-copy-before-write ERROR 6.38s exit status 1
>>>> >>> PYTHON=/builds/juan.quintela/qemu/build/pyvenv/bin/python3
>>>> MALLOC_PERTURB_=44
>>>> /builds/juan.quintela/qemu/build/pyvenv/bin/python3
>>>> /builds/juan.quintela/qemu/build/../tests/qemu-iotests/check -tap
>>>> -qcow2 copy-before-write --source-dir
>>>> /builds/juan.quintela/qemu/tests/qemu-iotests --build-dir
>>>> /builds/juan.quintela/qemu/build/tests/qemu-iotests
>>>> ――――――――――――――――――――――――――――――――――――― ✀ ―――――――――――――――――――――――――――――――――――――
>>>> stderr:
>>>> --- /builds/juan.quintela/qemu/tests/qemu-iotests/tests/copy-before-write.out
>>>> +++ /builds/juan.quintela/qemu/build/scratch/qcow2-file-copy-before-write/copy-before-write.out.bad
>>>> @@ -1,5 +1,21 @@
>>>> -....
>>>> +...F
>>>> +======================================================================
>>>> +FAIL: test_timeout_break_snapshot (__main__.TestCbwError)
>>>> +----------------------------------------------------------------------
>>>> +Traceback (most recent call last):
>>>> + File "/builds/juan.quintela/qemu/tests/qemu-iotests/tests/copy-before-write", line 210, in test_timeout_break_snapshot
>>>> + self.assertEqual(log, """\
>>>> +AssertionError: 'wrot[195 chars]read 1048576/1048576 bytes at
>>>> offset 0\n1 MiB,[46 chars]c)\n' != 'wrot[195 chars]read failed:
>>>> Permission denied\n'
>>>> + wrote 524288/524288 bytes at offset 0
>>>> + 512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>>>> + wrote 524288/524288 bytes at offset 524288
>>>> + 512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>>>> ++ read failed: Permission denied
>>>> +- read 1048576/1048576 bytes at offset 0
>>>> +- 1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>>>> +
>>>
>>> This iotest failing is an intermittent that I've seen running
>>> pullreqs on master. I tend to see it on the s390 host. I
>>> suspect a race condition somewhere where it fails if the host
>>> is heavily loaded.
>> It's obviously a failure in an iotest, so let's CC: the
>> corresponding people (done now).
>>
>
> Sorry for the long delay.
>
> Does it still fail?
>
> In the test we expect the copy-before-write operation to fail (because
> of throttling and the timeout), so the snapshot becomes broken and the
> next read from the snapshot should fail.
>
> But most probably the copy-before-write operation succeeded in this
> case for some reason. I don't think that throttling and timeouts in
> the block layer can guarantee determinism, but usually it works.
>
> We use throttling with bps-write = 300 * 1024, i.e. 300 KiB per second, and cbw-timeout is set to 1 second.
>
> Then we write 512K,
>
> then the comment says:
> # We need second write to trigger throttling
>
> and we write another 512K.
>
> The first 512K are written, and we should wait 512/300 = 1.7 seconds since
> the _start_ of that write before issuing the second one. But if the write was
> slow, we may have to wait less than a second from the finish of the first
> write to start the second one. Then the timeout will not fire.
>
> ====
>
> I see two possible ways to fix that:
>
> 1. Decrease bps-write a bit, for example to 200 * 1024 (200 KiB per second).
>
> 2. rework the test to use null-co instead of real images. This way we will not suffer from unstable IO duration.
>
>
> So, does the problem still fire sometimes?
For me it is random. When it happens, it keeps happening every time.
And then it stops, and doesn't happen for a while.
It is not happening for me now.
Later, Juan.
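
The timing argument quoted above can be checked with a small Python
sketch. It is only a simplified model of the explanation in the thread,
not the real throttling code in the block layer: it assumes the second
write is released 512K / bps-write seconds after the *start* of the first
write, and that the cbw-timeout fires only if the second write has to
wait longer than the timeout. All function names are hypothetical and
exist only for this illustration.

```python
KIB = 1024
WRITE_SIZE = 512 * KIB   # size of each write in the test
CBW_TIMEOUT = 1.0        # cbw-timeout, in seconds


def throttle_wait(first_write_duration, bps_write):
    """Seconds the second write waits under the simplified model.

    The throttle releases the second write write_size / bps_write
    seconds after the START of the first write, so a slow first
    write eats into the wait seen by the second one.
    """
    release_time = WRITE_SIZE / bps_write
    return max(0.0, release_time - first_write_duration)


def timeout_fires(first_write_duration, bps_write):
    """True if the second write waits longer than cbw-timeout."""
    return throttle_wait(first_write_duration, bps_write) > CBW_TIMEOUT


def max_tolerated_duration(bps_write):
    """Longest first-write duration for which the timeout still fires."""
    return WRITE_SIZE / bps_write - CBW_TIMEOUT


if __name__ == "__main__":
    for bps in (300 * KIB, 200 * KIB):
        print(f"bps-write = {bps // KIB} KiB/s: "
              f"release after {WRITE_SIZE / bps:.2f} s, "
              f"first write may take up to "
              f"{max_tolerated_duration(bps):.2f} s")
```

Under these assumptions, the current limit leaves roughly 0.7 seconds of
slack for the first write, which a heavily loaded host could plausibly
exceed; lowering bps-write to 200 * 1024 would widen that slack to about
1.5 seconds.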
Thread overview: 8+ messages
2023-07-26 12:06 How to tame CI? Juan Quintela
2023-07-26 13:00 ` Peter Maydell
2023-07-26 13:32 ` Thomas Huth
2023-10-05 12:35 ` Vladimir Sementsov-Ogievskiy
2023-10-05 14:36 ` Juan Quintela [this message]
2023-07-26 14:17 ` Daniel P. Berrangé
2023-07-26 14:36 ` Juan Quintela
2023-07-26 14:43 ` Daniel P. Berrangé