* qemu iotest 161 and make check
@ 2022-02-10 7:57 Christian Borntraeger
2022-02-10 14:47 ` Vladimir Sementsov-Ogievskiy
0 siblings, 1 reply; 12+ messages in thread
From: Christian Borntraeger @ 2022-02-10 7:57 UTC (permalink / raw)
To: qemu-devel, qemu block, qemu-s390x
Hello,
I see spurious failures of iotest 161 in our CI, but only when I run
make check with parallelism (-j).
I have not yet figured out which other test case could interfere:
@@ -34,6 +34,8 @@
*** Commit and then change an option on the backing file
Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=1048576
+qemu-img: TEST_DIR/t.IMGFMT.base: Failed to get "write" lock
+Is another process using the image [TEST_DIR/t.IMGFMT.base]?
Formatting 'TEST_DIR/t.IMGFMT.int', fmt=IMGFMT size=1048576 backing_file=TEST_DIR/t.IMGFMT.base backing_fmt=IMGFMT
Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576 backing_file=TEST_DIR/t.IMGFMT.int backing_fmt=IMGFMT
{ 'execute': 'qmp_capabilities' }
any ideas?
Christian
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: qemu iotest 161 and make check
2022-02-10 7:57 qemu iotest 161 and make check Christian Borntraeger
@ 2022-02-10 14:47 ` Vladimir Sementsov-Ogievskiy
2022-02-10 14:51 ` Christian Borntraeger
2022-02-14 9:08 ` Christian Borntraeger
0 siblings, 2 replies; 12+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-02-10 14:47 UTC (permalink / raw)
To: Christian Borntraeger, qemu-devel, qemu block, qemu-s390x
10.02.2022 10:57, Christian Borntraeger wrote:
> Hello,
>
> I do see spurious failures of 161 in our CI, but only when I use
> make check with parallelism (-j).
> I have not yet figured out which other testcase could interfere
>
> @@ -34,6 +34,8 @@
> *** Commit and then change an option on the backing file
>
> Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=1048576
> +qemu-img: TEST_DIR/t.IMGFMT.base: Failed to get "write" lock
> +Is another process using the image [TEST_DIR/t.IMGFMT.base]?
> Formatting 'TEST_DIR/t.IMGFMT.int', fmt=IMGFMT size=1048576 backing_file=TEST_DIR/t.IMGFMT.base backing_fmt=IMGFMT
> Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576 backing_file=TEST_DIR/t.IMGFMT.int backing_fmt=IMGFMT
> { 'execute': 'qmp_capabilities' }
>
>
> any ideas?
>
Hmm, interesting. Is it always 161, and always exactly this diff?
First, this place in 161 is nothing unusual: we just create an image, like in many other tests.
Second, why would _make_test_img trigger "Failed to get write lock"? It should just create an image. Hmm, it probably also starts QSD (qemu-storage-daemon) if the protocol is fuse, so that start of QSD may be what fails. Is that the case? Which image format and protocol are used in the test run?
But anyway, tests running in parallel should not break each other, as each test has its own TEST_DIR and SOCK_DIR.
--
Best regards,
Vladimir
* Re: qemu iotest 161 and make check
2022-02-10 14:47 ` Vladimir Sementsov-Ogievskiy
@ 2022-02-10 14:51 ` Christian Borntraeger
2022-02-10 17:13 ` Thomas Huth
2022-02-14 9:08 ` Christian Borntraeger
1 sibling, 1 reply; 12+ messages in thread
From: Christian Borntraeger @ 2022-02-10 14:51 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy, qemu-devel, qemu block, qemu-s390x
Am 10.02.22 um 15:47 schrieb Vladimir Sementsov-Ogievskiy:
> 10.02.2022 10:57, Christian Borntraeger wrote:
>> Hello,
>>
>> I do see spurious failures of 161 in our CI, but only when I use
>> make check with parallelism (-j).
>> I have not yet figured out which other testcase could interfere
>>
>> @@ -34,6 +34,8 @@
>> *** Commit and then change an option on the backing file
>>
>> Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=1048576
>> +qemu-img: TEST_DIR/t.IMGFMT.base: Failed to get "write" lock
>> +Is another process using the image [TEST_DIR/t.IMGFMT.base]?
>> Formatting 'TEST_DIR/t.IMGFMT.int', fmt=IMGFMT size=1048576 backing_file=TEST_DIR/t.IMGFMT.base backing_fmt=IMGFMT
>> Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576 backing_file=TEST_DIR/t.IMGFMT.int backing_fmt=IMGFMT
>> { 'execute': 'qmp_capabilities' }
>>
>>
>> any ideas?
>>
>
> Hmm, interesting.. Is it always 161 and always exactly this diff?
It's always 161 and only 161. I would need to check whether it's always the same error.
>
> First, this place in 161 is usual: we just create and image, like in many other tests.
>
> Second, why _make_test_img trigger "Failed to get write lock"? It should just create an image. Hmm. And probably starts QSD if protocol is fuse. So, that start of QSD may probably fail.. Is that the case? What is image format and protocol used in test run?
>
> But anyway, tests running in parallel should not break each other as each test has own TEST_DIR and SOCK_DIR..
* Re: qemu iotest 161 and make check
2022-02-10 14:51 ` Christian Borntraeger
@ 2022-02-10 17:13 ` Thomas Huth
2022-02-10 17:44 ` Vladimir Sementsov-Ogievskiy
0 siblings, 1 reply; 12+ messages in thread
From: Thomas Huth @ 2022-02-10 17:13 UTC (permalink / raw)
To: Christian Borntraeger, Vladimir Sementsov-Ogievskiy, qemu-devel,
qemu block, qemu-s390x
On 10/02/2022 15.51, Christian Borntraeger wrote:
>
>
> Am 10.02.22 um 15:47 schrieb Vladimir Sementsov-Ogievskiy:
>> 10.02.2022 10:57, Christian Borntraeger wrote:
>>> Hello,
>>>
>>> I do see spurious failures of 161 in our CI, but only when I use
>>> make check with parallelism (-j).
>>> I have not yet figured out which other testcase could interfere
>>>
>>> @@ -34,6 +34,8 @@
>>> *** Commit and then change an option on the backing file
>>>
>>> Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=1048576
>>> +qemu-img: TEST_DIR/t.IMGFMT.base: Failed to get "write" lock
>>> +Is another process using the image [TEST_DIR/t.IMGFMT.base]?
>>> Formatting 'TEST_DIR/t.IMGFMT.int', fmt=IMGFMT size=1048576
>>> backing_file=TEST_DIR/t.IMGFMT.base backing_fmt=IMGFMT
>>> Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
>>> backing_file=TEST_DIR/t.IMGFMT.int backing_fmt=IMGFMT
>>> { 'execute': 'qmp_capabilities' }
>>>
>>>
>>> any ideas?
>>>
>>
>> Hmm, interesting.. Is it always 161 and always exactly this diff?
>
> Its always 161 and only 161. I would need to check if its always the same
> error.
>
>>
>> First, this place in 161 is usual: we just create and image, like in many
>> other tests.
>>
>> Second, why _make_test_img trigger "Failed to get write lock"? It should
>> just create an image. Hmm. And probably starts QSD if protocol is fuse.
>> So, that start of QSD may probably fail.. Is that the case? What is image
>> format and protocol used in test run?
>>
>> But anyway, tests running in parallel should not break each other as each
>> test has own TEST_DIR and SOCK_DIR..
Unless you run into the issue that Hanna described here:
https://lists.gnu.org/archive/html/qemu-devel/2022-02/msg01735.html
Thomas
* Re: qemu iotest 161 and make check
2022-02-10 17:13 ` Thomas Huth
@ 2022-02-10 17:44 ` Vladimir Sementsov-Ogievskiy
2022-02-21 10:27 ` Christian Borntraeger
0 siblings, 1 reply; 12+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-02-10 17:44 UTC (permalink / raw)
To: Thomas Huth, Christian Borntraeger, qemu-devel, qemu block,
qemu-s390x
10.02.2022 20:13, Thomas Huth wrote:
> On 10/02/2022 15.51, Christian Borntraeger wrote:
>>
>>
>> Am 10.02.22 um 15:47 schrieb Vladimir Sementsov-Ogievskiy:
>>> 10.02.2022 10:57, Christian Borntraeger wrote:
>>>> Hello,
>>>>
>>>> I do see spurious failures of 161 in our CI, but only when I use
>>>> make check with parallelism (-j).
>>>> I have not yet figured out which other testcase could interfere
>>>>
>>>> @@ -34,6 +34,8 @@
>>>> *** Commit and then change an option on the backing file
>>>>
>>>> Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=1048576
>>>> +qemu-img: TEST_DIR/t.IMGFMT.base: Failed to get "write" lock
>>>> +Is another process using the image [TEST_DIR/t.IMGFMT.base]?
>>>> Formatting 'TEST_DIR/t.IMGFMT.int', fmt=IMGFMT size=1048576 backing_file=TEST_DIR/t.IMGFMT.base backing_fmt=IMGFMT
>>>> Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576 backing_file=TEST_DIR/t.IMGFMT.int backing_fmt=IMGFMT
>>>> { 'execute': 'qmp_capabilities' }
>>>>
>>>>
>>>> any ideas?
>>>>
>>>
>>> Hmm, interesting.. Is it always 161 and always exactly this diff?
>>
>> Its always 161 and only 161. I would need to check if its always the same error.
>>
>>>
>>> First, this place in 161 is usual: we just create and image, like in many other tests.
>>>
>>> Second, why _make_test_img trigger "Failed to get write lock"? It should just create an image. Hmm. And probably starts QSD if protocol is fuse. So, that start of QSD may probably fail.. Is that the case? What is image format and protocol used in test run?
>>>
>>> But anyway, tests running in parallel should not break each other as each test has own TEST_DIR and SOCK_DIR..
>
> Unless you run into the issue that Hanna described here:
>
> https://lists.gnu.org/archive/html/qemu-devel/2022-02/msg01735.html
>
Yes, we can't execute the same test several times (for different formats) in parallel. But that applies to any test, not only 161.
And I don't think it is currently possible that we run the same test several times in parallel anywhere, is it? In tests/check-block.sh we have a sequential loop through $format_list.
--
Best regards,
Vladimir
* Re: qemu iotest 161 and make check
2022-02-10 14:47 ` Vladimir Sementsov-Ogievskiy
2022-02-10 14:51 ` Christian Borntraeger
@ 2022-02-14 9:08 ` Christian Borntraeger
1 sibling, 0 replies; 12+ messages in thread
From: Christian Borntraeger @ 2022-02-14 9:08 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy, qemu-devel, qemu block, qemu-s390x
Am 10.02.22 um 15:47 schrieb Vladimir Sementsov-Ogievskiy:
> 10.02.2022 10:57, Christian Borntraeger wrote:
>> Hello,
>>
>> I do see spurious failures of 161 in our CI, but only when I use
>> make check with parallelism (-j).
>> I have not yet figured out which other testcase could interfere
>>
>> @@ -34,6 +34,8 @@
>> *** Commit and then change an option on the backing file
>>
>> Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=1048576
>> +qemu-img: TEST_DIR/t.IMGFMT.base: Failed to get "write" lock
>> +Is another process using the image [TEST_DIR/t.IMGFMT.base]?
>> Formatting 'TEST_DIR/t.IMGFMT.int', fmt=IMGFMT size=1048576 backing_file=TEST_DIR/t.IMGFMT.base backing_fmt=IMGFMT
>> Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576 backing_file=TEST_DIR/t.IMGFMT.int backing_fmt=IMGFMT
>> { 'execute': 'qmp_capabilities' }
>>
>>
>> any ideas?
>>
>
> Hmm, interesting.. Is it always 161 and always exactly this diff?
Seems to be always this diff.
* Re: qemu iotest 161 and make check
2022-02-10 17:44 ` Vladimir Sementsov-Ogievskiy
@ 2022-02-21 10:27 ` Christian Borntraeger
2022-03-31 7:44 ` Christian Borntraeger
0 siblings, 1 reply; 12+ messages in thread
From: Christian Borntraeger @ 2022-02-21 10:27 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy, Thomas Huth, qemu-devel, qemu block,
qemu-s390x, Paolo Bonzini
Am 10.02.22 um 18:44 schrieb Vladimir Sementsov-Ogievskiy:
> 10.02.2022 20:13, Thomas Huth wrote:
>> On 10/02/2022 15.51, Christian Borntraeger wrote:
>>>
>>>
>>> Am 10.02.22 um 15:47 schrieb Vladimir Sementsov-Ogievskiy:
>>>> 10.02.2022 10:57, Christian Borntraeger wrote:
>>>>> Hello,
>>>>>
>>>>> I do see spurious failures of 161 in our CI, but only when I use
>>>>> make check with parallelism (-j).
>>>>> I have not yet figured out which other testcase could interfere
>>>>>
>>>>> @@ -34,6 +34,8 @@
>>>>> *** Commit and then change an option on the backing file
>>>>>
>>>>> Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=1048576
>>>>> +qemu-img: TEST_DIR/t.IMGFMT.base: Failed to get "write" lock
>>>>> +Is another process using the image [TEST_DIR/t.IMGFMT.base]?
>>>>> Formatting 'TEST_DIR/t.IMGFMT.int', fmt=IMGFMT size=1048576 backing_file=TEST_DIR/t.IMGFMT.base backing_fmt=IMGFMT
>>>>> Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576 backing_file=TEST_DIR/t.IMGFMT.int backing_fmt=IMGFMT
>>>>> { 'execute': 'qmp_capabilities' }
>>>>>
>>>>>
>>>>> any ideas?
>>>>>
>>>>
>>>> Hmm, interesting.. Is it always 161 and always exactly this diff?
>>>
>>> Its always 161 and only 161. I would need to check if its always the same error.
>>>
>>>>
>>>> First, this place in 161 is usual: we just create and image, like in many other tests.
>>>>
>>>> Second, why _make_test_img trigger "Failed to get write lock"? It should just create an image. Hmm. And probably starts QSD if protocol is fuse. So, that start of QSD may probably fail.. Is that the case? What is image format and protocol used in test run?
>>>>
>>>> But anyway, tests running in parallel should not break each other as each test has own TEST_DIR and SOCK_DIR..
>>
>> Unless you run into the issue that Hanna described here:
>>
>> https://lists.gnu.org/archive/html/qemu-devel/2022-02/msg01735.html
>>
>
> Yes, we can't execute same test several times (for different formats) in parallel.. But that's about any test, not only 161.
>
> And I don't think that it's currently possible that we run same test in parallel several times somewhere, do we? In tests/check-block.sh we have a sequential loop through $format_list ..
FWIW, I was able to bisect this; it came in with the following commit:
bcda7b178fde7797f476e3b066fe5fc76bfa1c43 is the first bad commit
commit bcda7b178fde7797f476e3b066fe5fc76bfa1c43
Author: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Date: Thu Dec 23 19:39:33 2021 +0100
check-block.sh: passthrough -jN flag of make to -j N flag of check
This improves performance of running iotests during "make -jN check".
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20211223183933.1497037-1-vsementsov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
tests/check-block.sh | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
With
make check-block -j 100
it reproduced pretty quickly for me.
* Re: qemu iotest 161 and make check
2022-02-21 10:27 ` Christian Borntraeger
@ 2022-03-31 7:44 ` Christian Borntraeger
2022-03-31 8:25 ` Christian Borntraeger
2022-03-31 9:59 ` Li Zhang
0 siblings, 2 replies; 12+ messages in thread
From: Christian Borntraeger @ 2022-03-31 7:44 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy, Thomas Huth, qemu-devel, qemu block,
qemu-s390x, Paolo Bonzini
Am 21.02.22 um 11:27 schrieb Christian Borntraeger:
>
> Am 10.02.22 um 18:44 schrieb Vladimir Sementsov-Ogievskiy:
>> 10.02.2022 20:13, Thomas Huth wrote:
>>> On 10/02/2022 15.51, Christian Borntraeger wrote:
>>>>
>>>>
>>>> Am 10.02.22 um 15:47 schrieb Vladimir Sementsov-Ogievskiy:
>>>>> 10.02.2022 10:57, Christian Borntraeger wrote:
>>>>>> Hello,
>>>>>>
>>>>>> I do see spurious failures of 161 in our CI, but only when I use
>>>>>> make check with parallelism (-j).
>>>>>> I have not yet figured out which other testcase could interfere
>>>>>>
>>>>>> @@ -34,6 +34,8 @@
>>>>>> *** Commit and then change an option on the backing file
>>>>>>
>>>>>> Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=1048576
>>>>>> +qemu-img: TEST_DIR/t.IMGFMT.base: Failed to get "write" lock
FWIW, qemu_lock_fd_test returns -11 (-EAGAIN),
and raw_check_lock_bytes emits this error.
Is this just some overload situation that we do not recover from because we do not handle EAGAIN specially?
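One way to read "handling EAGAIN specially" is a bounded retry around the lock probe. The sketch below only illustrates that reading; nothing in this thread proposes it as a fix, and it is not QEMU code. The names `retry_on_eagain`, `probe_fn` and `fake_probe` are hypothetical, and `fake_probe` merely stands in for a probe such as qemu_lock_fd_test.

```c
#include <errno.h>
#include <time.h>

/* Hypothetical probe signature: returns 0, or a negative errno value. */
typedef int (*probe_fn)(void *opaque);

/*
 * Call probe() up to max_tries times, sleeping delay_ms between attempts,
 * but only while it keeps reporting -EAGAIN. Any other result (success or
 * a different error) is returned immediately.
 */
int retry_on_eagain(probe_fn probe, void *opaque, int max_tries, long delay_ms)
{
    int ret = -EAGAIN;

    for (int i = 0; i < max_tries; i++) {
        ret = probe(opaque);
        if (ret != -EAGAIN) {
            break;
        }
        if (i + 1 < max_tries) {
            struct timespec ts = {
                .tv_sec  = delay_ms / 1000,
                .tv_nsec = (delay_ms % 1000) * 1000000L,
            };
            nanosleep(&ts, NULL);
        }
    }
    return ret;
}

/*
 * Stand-in probe for experiments: *opaque holds the number of calls that
 * should still report contention before the "lock" becomes free.
 */
int fake_probe(void *opaque)
{
    int *remaining = opaque;

    if (*remaining > 0) {
        (*remaining)--;
        return -EAGAIN;
    }
    return 0;
}
```

A retry like this would only paper over a lock that is released late; it would not explain who still holds it.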
* Re: qemu iotest 161 and make check
2022-03-31 7:44 ` Christian Borntraeger
@ 2022-03-31 8:25 ` Christian Borntraeger
2022-10-27 5:54 ` Christian Borntraeger
2022-03-31 9:59 ` Li Zhang
1 sibling, 1 reply; 12+ messages in thread
From: Christian Borntraeger @ 2022-03-31 8:25 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy, Thomas Huth, qemu-devel, qemu block,
qemu-s390x, Paolo Bonzini
Am 31.03.22 um 09:44 schrieb Christian Borntraeger:
>
>
> Am 21.02.22 um 11:27 schrieb Christian Borntraeger:
>>
>> Am 10.02.22 um 18:44 schrieb Vladimir Sementsov-Ogievskiy:
>>> 10.02.2022 20:13, Thomas Huth wrote:
>>>> On 10/02/2022 15.51, Christian Borntraeger wrote:
>>>>>
>>>>>
>>>>> Am 10.02.22 um 15:47 schrieb Vladimir Sementsov-Ogievskiy:
>>>>>> 10.02.2022 10:57, Christian Borntraeger wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> I do see spurious failures of 161 in our CI, but only when I use
>>>>>>> make check with parallelism (-j).
>>>>>>> I have not yet figured out which other testcase could interfere
>>>>>>>
>>>>>>> @@ -34,6 +34,8 @@
>>>>>>> *** Commit and then change an option on the backing file
>>>>>>>
>>>>>>> Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=1048576
>>>>>>> +qemu-img: TEST_DIR/t.IMGFMT.base: Failed to get "write" lock
>
> FWIW, qemu_lock_fd_test returns -11 (EAGAIN)
> and raw_check_lock_bytes spits this error.
And it's coming from here (ret is 0):
int qemu_lock_fd_test(int fd, int64_t start, int64_t len, bool exclusive)
{
    int ret;
    struct flock fl = {
        .l_whence = SEEK_SET,
        .l_start  = start,
        .l_len    = len,
        .l_type   = exclusive ? F_WRLCK : F_RDLCK,
    };
    qemu_probe_lock_ops();
    ret = fcntl(fd, fcntl_op_getlk, &fl);
    if (ret == -1) {
        return -errno;
    } else {
----->  return fl.l_type == F_UNLCK ? 0 : -EAGAIN;
    }
}
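The marked return path can be reproduced outside QEMU. The following is a minimal sketch, not QEMU code: it assumes 64-bit Linux with open file description (OFD) locks and assumes that fcntl_op_getlk resolves to F_OFD_GETLK. The helper names `lock_fd_test`, `lock_fd` and `demo_probe`, the temp file path, and byte offset 100 are all made up for illustration.

```c
#define _GNU_SOURCE             /* glibc exposes F_OFD_* only with this */
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Fallback (Linux values) in case _GNU_SOURCE came too late to take effect. */
#ifndef F_OFD_GETLK
#define F_OFD_GETLK 36
#define F_OFD_SETLK 37
#endif

/* Same logic as the quoted probe, hard-wired to OFD locks (an assumption). */
int lock_fd_test(int fd, int64_t start, int64_t len, bool exclusive)
{
    struct flock fl = {
        .l_whence = SEEK_SET,
        .l_start  = start,
        .l_len    = len,
        .l_type   = exclusive ? F_WRLCK : F_RDLCK,
    };  /* l_pid stays 0, as the OFD commands require */

    if (fcntl(fd, F_OFD_GETLK, &fl) == -1) {
        return -errno;
    }
    /* The kernel rewrites l_type: F_UNLCK means nothing is in the way. */
    return fl.l_type == F_UNLCK ? 0 : -EAGAIN;
}

/* Take a non-blocking OFD lock on [start, start + len). */
int lock_fd(int fd, int64_t start, int64_t len, bool exclusive)
{
    struct flock fl = {
        .l_whence = SEEK_SET,
        .l_start  = start,
        .l_len    = len,
        .l_type   = exclusive ? F_WRLCK : F_RDLCK,
    };

    return fcntl(fd, F_OFD_SETLK, &fl) == -1 ? -errno : 0;
}

/*
 * Open one temp file twice, lock a byte through the first descriptor and
 * probe through the second, both while the lock is held and after the
 * first descriptor has been closed.
 */
int demo_probe(int *while_held, int *after_close)
{
    char path[] = "/tmp/ofd-probe-XXXXXX";
    int holder = mkstemp(path);

    if (holder < 0) {
        return -errno;
    }
    int prober = open(path, O_RDWR);
    int ret = prober < 0 ? -errno : lock_fd(holder, 100, 1, true);
    if (ret == 0) {
        *while_held = lock_fd_test(prober, 100, 1, true);
        close(holder);          /* last reference gone, lock released */
        holder = -1;
        *after_close = lock_fd_test(prober, 100, 1, true);
    }
    if (holder >= 0) {
        close(holder);
    }
    if (prober >= 0) {
        close(prober);
    }
    unlink(path);
    return ret;
}
```

In this sketch fcntl() itself succeeds (ret is 0) in both probes; the -EAGAIN comes purely from l_type still naming a conflicting lock, which matches the observation above.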
>
>
> Is this just some overload situation that we do not recover because we do not handle EAGAIN any special.
* Re: qemu iotest 161 and make check
2022-03-31 7:44 ` Christian Borntraeger
2022-03-31 8:25 ` Christian Borntraeger
@ 2022-03-31 9:59 ` Li Zhang
1 sibling, 0 replies; 12+ messages in thread
From: Li Zhang @ 2022-03-31 9:59 UTC (permalink / raw)
To: Christian Borntraeger, Vladimir Sementsov-Ogievskiy, Thomas Huth,
qemu-devel, qemu block, qemu-s390x, Paolo Bonzini
On 3/31/22 09:44, Christian Borntraeger wrote:
>
>
> Am 21.02.22 um 11:27 schrieb Christian Borntraeger:
>>
>> Am 10.02.22 um 18:44 schrieb Vladimir Sementsov-Ogievskiy:
>>> 10.02.2022 20:13, Thomas Huth wrote:
>>>> On 10/02/2022 15.51, Christian Borntraeger wrote:
>>>>>
>>>>>
>>>>> Am 10.02.22 um 15:47 schrieb Vladimir Sementsov-Ogievskiy:
>>>>>> 10.02.2022 10:57, Christian Borntraeger wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> I do see spurious failures of 161 in our CI, but only when I use
>>>>>>> make check with parallelism (-j).
>>>>>>> I have not yet figured out which other testcase could interfere
>>>>>>>
>>>>>>> @@ -34,6 +34,8 @@
>>>>>>> *** Commit and then change an option on the backing file
>>>>>>>
>>>>>>> Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=1048576
>>>>>>> +qemu-img: TEST_DIR/t.IMGFMT.base: Failed to get "write" lock
>
> FWIW, qemu_lock_fd_test returns -11 (EAGAIN)
> and raw_check_lock_bytes spits this error.
>
I also run into this issue on s390 when running test cases.
I think it reports this "write" lock error when different processes
are using the same image.
>
> Is this just some overload situation that we do not recover because we
> do not handle EAGAIN any special.
>
* Re: qemu iotest 161 and make check
2022-03-31 8:25 ` Christian Borntraeger
@ 2022-10-27 5:54 ` Christian Borntraeger
2022-12-05 13:49 ` Christian Borntraeger
0 siblings, 1 reply; 12+ messages in thread
From: Christian Borntraeger @ 2022-10-27 5:54 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy, Thomas Huth, qemu-devel, qemu block,
qemu-s390x, Paolo Bonzini
Am 31.03.22 um 10:25 schrieb Christian Borntraeger:
>
>
> Am 31.03.22 um 09:44 schrieb Christian Borntraeger:
>>
>>
>> Am 21.02.22 um 11:27 schrieb Christian Borntraeger:
>>>
>>> Am 10.02.22 um 18:44 schrieb Vladimir Sementsov-Ogievskiy:
>>>> 10.02.2022 20:13, Thomas Huth wrote:
>>>>> On 10/02/2022 15.51, Christian Borntraeger wrote:
>>>>>>
>>>>>>
>>>>>> Am 10.02.22 um 15:47 schrieb Vladimir Sementsov-Ogievskiy:
>>>>>>> 10.02.2022 10:57, Christian Borntraeger wrote:
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> I do see spurious failures of 161 in our CI, but only when I use
>>>>>>>> make check with parallelism (-j).
>>>>>>>> I have not yet figured out which other testcase could interfere
>>>>>>>>
>>>>>>>> @@ -34,6 +34,8 @@
>>>>>>>> *** Commit and then change an option on the backing file
>>>>>>>>
>>>>>>>> Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=1048576
>>>>>>>> +qemu-img: TEST_DIR/t.IMGFMT.base: Failed to get "write" lock
>>
>> FWIW, qemu_lock_fd_test returns -11 (EAGAIN)
>> and raw_check_lock_bytes spits this error.
>
>
> And its coming from here (ret is 0)
>
> int qemu_lock_fd_test(int fd, int64_t start, int64_t len, bool exclusive)
> {
> int ret;
> struct flock fl = {
> .l_whence = SEEK_SET,
> .l_start = start,
> .l_len = len,
> .l_type = exclusive ? F_WRLCK : F_RDLCK,
> };
> qemu_probe_lock_ops();
> ret = fcntl(fd, fcntl_op_getlk, &fl);
> if (ret == -1) {
> return -errno;
> } else {
> -----> return fl.l_type == F_UNLCK ? 0 : -EAGAIN;
> }
> }
>
>>
>>
>> Is this just some overload situation that we do not recover because we do not handle EAGAIN any special.
I restarted my investigation. It looks like the file lock from qemu is not yet fully cleaned up when the process is gone.
Something like
diff --git a/tests/qemu-iotests/common.qemu b/tests/qemu-iotests/common.qemu
index 0f1fecc68e..b28a6c187c 100644
--- a/tests/qemu-iotests/common.qemu
+++ b/tests/qemu-iotests/common.qemu
@@ -403,4 +403,5 @@ _cleanup_qemu()
         unset QEMU_IN[$i]
         unset QEMU_OUT[$i]
     done
+    sleep 0.5
 }
makes the problem go away.
It looks like we use the OFD variant of the file lock, so any clone, fork, etc. that still holds the file descriptor will keep the lock.
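The OFD behaviour described here can be shown in isolation. This is a minimal sketch under stated assumptions (64-bit Linux with OFD locks), not a reproduction of the iotests failure: the names `probe_write_lock` and `demo_fork_keeps_lock`, the temp file path, and byte offset 100 are hypothetical, and which process actually inherits the descriptor in the failing test is still unknown.

```c
#define _GNU_SOURCE             /* glibc exposes F_OFD_* only with this */
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fallback (Linux values) in case _GNU_SOURCE came too late to take effect. */
#ifndef F_OFD_GETLK
#define F_OFD_GETLK 36
#define F_OFD_SETLK 37
#endif

/* 0 if a write lock on byte 100 could be taken, -EAGAIN if it is held. */
int probe_write_lock(int fd)
{
    struct flock fl = {
        .l_type = F_WRLCK, .l_whence = SEEK_SET, .l_start = 100, .l_len = 1,
    };

    if (fcntl(fd, F_OFD_GETLK, &fl) == -1) {
        return -errno;
    }
    return fl.l_type == F_UNLCK ? 0 : -EAGAIN;
}

/*
 * The lock owner closes its descriptor while a forked child still holds an
 * inherited copy. Probe once at that point and once after the child exited.
 */
int demo_fork_keeps_lock(int *after_parent_close, int *after_child_exit)
{
    char path[] = "/tmp/ofd-fork-XXXXXX";
    struct flock fl = {
        .l_type = F_WRLCK, .l_whence = SEEK_SET, .l_start = 100, .l_len = 1,
    };
    int gate[2];
    int holder = mkstemp(path);

    if (holder < 0) {
        return -errno;
    }
    if (fcntl(holder, F_OFD_SETLK, &fl) == -1 || pipe(gate) == -1) {
        int err = -errno;
        close(holder);
        unlink(path);
        return err;
    }

    pid_t child = fork();
    if (child < 0) {
        int err = -errno;
        close(gate[0]);
        close(gate[1]);
        close(holder);
        unlink(path);
        return err;
    }
    if (child == 0) {
        /* Child: keeps its inherited copy of 'holder' until the pipe closes. */
        char c;
        close(gate[1]);
        while (read(gate[0], &c, 1) > 0) {
        }
        _exit(0);
    }

    close(gate[0]);
    close(holder);              /* the original owner lets go */
    int prober = open(path, O_RDWR);
    *after_parent_close = probe_write_lock(prober);

    close(gate[1]);             /* EOF on the pipe lets the child exit */
    waitpid(child, NULL, 0);
    *after_child_exit = probe_write_lock(prober);

    close(prober);
    unlink(path);
    return 0;
}
```

The lock belongs to the open file description rather than to a process, so it is only dropped once the last descriptor referring to it is closed, here when the child exits.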
So I tested the following:
diff --git a/tests/qemu-iotests/common.qemu b/tests/qemu-iotests/common.qemu
index 0f1fecc68e..01bdb05575 100644
--- a/tests/qemu-iotests/common.qemu
+++ b/tests/qemu-iotests/common.qemu
@@ -388,7 +388,7 @@ _cleanup_qemu()
                 kill -KILL ${QEMU_PID} 2>/dev/null
             fi
             if [ -n "${QEMU_PID}" ]; then
-                wait ${QEMU_PID} 2>/dev/null # silent kill
+                wait 2>/dev/null # silent kill
             fi
         fi
And this also helps. I am still trying to find out what clone/fork happens here.
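The difference between the two shell forms is that wait with a PID returns as soon as that one child has exited, while a bare wait blocks until all background children of the shell are gone. A rough C analogue is sketched below, assuming a POSIX system; `spawn_sleeper`, `wait_for_one`, `wait_for_all` and `child_is_running` are hypothetical helpers, and the sketch says nothing about which extra child exists in the iotests case.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Fork a child that sleeps for delay_ms and then exits. */
pid_t spawn_sleeper(long delay_ms)
{
    pid_t pid = fork();

    if (pid == 0) {
        struct timespec ts = {
            .tv_sec  = delay_ms / 1000,
            .tv_nsec = (delay_ms % 1000) * 1000000L,
        };
        nanosleep(&ts, NULL);
        _exit(0);
    }
    return pid;
}

/* Counterpart of `wait $PID`: returns once this particular child is gone. */
int wait_for_one(pid_t pid)
{
    return waitpid(pid, NULL, 0) == pid ? 0 : -errno;
}

/* Counterpart of a bare `wait`: reap every remaining child, return the count. */
int wait_for_all(void)
{
    int reaped = 0;

    for (;;) {
        pid_t r = wait(NULL);
        if (r > 0) {
            reaped++;
        } else if (errno == EINTR) {
            continue;
        } else {
            break;              /* ECHILD: no children left */
        }
    }
    return reaped;
}

/* Non-blocking check: 1 while the child has not exited yet, else 0. */
int child_is_running(pid_t pid)
{
    return waitpid(pid, NULL, WNOHANG) == 0;
}
```

With one quick and one slow child, wait_for_one() on the quick child returns while the slow one is still alive, and only wait_for_all() guarantees that nothing inherited from the parent is left running.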
* Re: qemu iotest 161 and make check
2022-10-27 5:54 ` Christian Borntraeger
@ 2022-12-05 13:49 ` Christian Borntraeger
0 siblings, 0 replies; 12+ messages in thread
From: Christian Borntraeger @ 2022-12-05 13:49 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy, Thomas Huth, qemu-devel, qemu block,
qemu-s390x, Paolo Bonzini
Am 27.10.22 um 07:54 schrieb Christian Borntraeger:
[...]
> diff --git a/tests/qemu-iotests/common.qemu b/tests/qemu-iotests/common.qemu
> index 0f1fecc68e..01bdb05575 100644
> --- a/tests/qemu-iotests/common.qemu
> +++ b/tests/qemu-iotests/common.qemu
> @@ -388,7 +388,7 @@ _cleanup_qemu()
> kill -KILL ${QEMU_PID} 2>/dev/null
> fi
> if [ -n "${QEMU_PID}" ]; then
> - wait ${QEMU_PID} 2>/dev/null # silent kill
> + wait 2>/dev/null # silent kill
> fi
> fi
>
>
> And this also helps. Still trying to find out what clone/fork happens here.
As new information: the problem only exists on Ubuntu;
I cannot reproduce it with Fedora or RHEL. I also changed
the kernel; it's not the cause. As soon as I add tracing,
the different timing also makes the problem go away.