From: Peter Lieven <pl@kamp.de>
To: Kevin Wolf <kwolf@redhat.com>
Cc: "Daniel P. Berrange" <berrange@redhat.com>,
qemu-block <qemu-block@nongnu.org>,
Christian Theune <ct@flyingcircus.io>,
qemu-devel@nongnu.org, Max Reitz <mreitz@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Ilya Dryomov <idryomov@gmail.com>
Subject: Re: [PATCH v5 0/6] block/rbd: migrate to coroutines and add write zeroes support
Date: Mon, 15 Nov 2021 12:17:01 +0100 [thread overview]
Message-ID: <40c6e10b-b8e3-bb5c-cde3-b4a4e2261d1f@kamp.de> (raw)
In-Reply-To: <23e2ebe9-b600-cc60-0962-7e7d153a4d4d@kamp.de>
Am 26.10.21 um 16:53 schrieb Peter Lieven:
> Am 25.10.21 um 14:58 schrieb Kevin Wolf:
>> Am 25.10.2021 um 13:39 hat Peter Lieven geschrieben:
>>> Am 16.09.21 um 14:34 schrieb Peter Lieven:
>>>> Am 09.07.21 um 12:21 schrieb Kevin Wolf:
>>>>> Am 08.07.2021 um 20:23 hat Peter Lieven geschrieben:
>>>>>> Am 08.07.2021 um 14:18 schrieb Kevin Wolf <kwolf@redhat.com>:
>>>>>>> Am 07.07.2021 um 20:13 hat Peter Lieven geschrieben:
>>>>>>>>> Am 06.07.2021 um 17:25 schrieb Kevin Wolf <kwolf@redhat.com>:
>>>>>>>>> Am 06.07.2021 um 16:55 hat Peter Lieven geschrieben:
>>>>>>>>>> I will have a decent look after my vacation.
>>>>>>>>> Sounds good, thanks. Enjoy your vacation!
>>>>>>>> As I had to fire up my laptop to look into another issue anyway, I
>>>>>>>> have sent two patches for updating MAINTAINERS and to fix the int vs.
>>>>>>>> bool mix for task->complete.
>>>>>>> I think you need to reevaluate your definition of vacation. ;-)
>>>>>> Lets talk about this when the kids are grown up. Sometimes sending
>>>>>> patches can be quite relaxing :-)
>>>>> Heh, fair enough. :-)
>>>>>
>>>>>>> But thanks anyway.
>>>>>>>
>>>>>>>> As Paolos fix (5f50be9b5) is relatively new and there are maybe other
>>>>>>>> non obvious problems when removing the BH indirection and we are close
>>>>>>>> to soft freeze I would leave the BH removal change for 6.2.
>>>>>>> Sure, code cleanups aren't urgent.
>>>>>> Isn’t the indirection also a slight performance drop?
>>>>> Yeah, I guess technically it is, though I doubt it's measurable.
>>>> As promised I was trying to remove the indirection through the BH after Qemu 6.1 release.
>>>>
>>>> However, if I remove the BH I run into the following assertion while running some fio tests:
>>>>
>>>>
>>>> qemu-system-x86_64: ../block/block-backend.c:1197: blk_wait_while_drained: Assertion `blk->in_flight > 0' failed.
>>>>
>>>>
>>>> Any idea?
>>>>
>>>>
>>>> This is what I changed:
>>>>
>>>>
>>>> diff --git a/block/rbd.c b/block/rbd.c
>>>> index 3cb24f9981..bc1dbc20f7 100644
>>>> --- a/block/rbd.c
>>>> +++ b/block/rbd.c
>>>> @@ -1063,13 +1063,6 @@ static int qemu_rbd_resize(BlockDriverState *bs, uint64_t size)
>>>> return 0;
>>>> }
>>>>
>>>> -static void qemu_rbd_finish_bh(void *opaque)
>>>> -{
>>>> - RBDTask *task = opaque;
>>>> - task->complete = true;
>>>> - aio_co_wake(task->co);
>>>> -}
>>>> -
>>>> /*
>>>> * This is the completion callback function for all rbd aio calls
>>>> * started from qemu_rbd_start_co().
>>>> @@ -1083,8 +1076,8 @@ static void qemu_rbd_completion_cb(rbd_completion_t c, RBDTask *task)
>>>> {
>>>> task->ret = rbd_aio_get_return_value(c);
>>>> rbd_aio_release(c);
>>>> - aio_bh_schedule_oneshot(bdrv_get_aio_context(task->bs),
>>>> - qemu_rbd_finish_bh, task);
>>>> + task->complete = true;
>>>> + aio_co_wake(task->co);
>>>> }
>>> Kevin, Paolo, any idea?
>> Not really, I don't see the connection between both places.
>>
>> Do you have a stack trace for the crash?
>
> The crash seems not to be limited to that assertion. I have also seen:
>
>
> qemu-system-x86_64: ../block/block-backend.c:1497: blk_aio_write_entry: Assertion `!qiov || qiov->size == acb->bytes' failed.
>
>
> Altough harder to trigger I catch this backtrace in gdb:
>
>
> qemu-system-x86_64: ../block/block-backend.c:1497: blk_aio_write_entry: Assertion `!qiov || qiov->size == acb->bytes' failed.
> [Wechseln zu Thread 0x7ffff7fa8f40 (LWP 17852)]
>
> Thread 1 "qemu-system-x86" hit Breakpoint 1, __GI_abort () at abort.c:49
> 49 abort.c: Datei oder Verzeichnis nicht gefunden.
> (gdb) bt
> #0 0x00007ffff42567e0 in __GI_abort () at abort.c:49
> #1 0x00007ffff424648a in __assert_fail_base (fmt=0x7ffff43cd750 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x555555e638e0 "!qiov || qiov->size == acb->bytes", file=file@entry=0x555555e634b2 "../block/block-backend.c", line=line@entry=1497, function=function@entry=0x555555e63b20 <__PRETTY_FUNCTION__.32450> "blk_aio_write_entry") at assert.c:92
> #2 0x00007ffff4246502 in __GI___assert_fail (assertion=assertion@entry=0x555555e638e0 "!qiov || qiov->size == acb->bytes", file=file@entry=0x555555e634b2 "../block/block-backend.c", line=line@entry=1497, function=function@entry=0x555555e63b20 <__PRETTY_FUNCTION__.32450> "blk_aio_write_entry") at assert.c:101
> #3 0x0000555555becc78 in blk_aio_write_entry (opaque=0x555556b534f0) at ../block/block-backend.c:1497
> #4 0x0000555555cf0e4c in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at ../util/coroutine-ucontext.c:173
> #5 0x00007ffff426e7b0 in __start_context () at /lib/x86_64-linux-gnu/libc.so.6
> #6 0x00007fffffffd5a0 in ()
> #7 0x0000000000000000 in ()
>
any ideas? Or should we just abandon the idea of removing the BH?
Peter
prev parent reply other threads:[~2021-11-15 11:18 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-07-02 17:23 [PATCH v5 0/6] block/rbd: migrate to coroutines and add write zeroes support Ilya Dryomov
2021-07-02 17:23 ` [PATCH v5 1/6] block/rbd: bump librbd requirement to luminous release Ilya Dryomov
2021-07-02 17:23 ` [PATCH v5 2/6] block/rbd: store object_size in BDRVRBDState Ilya Dryomov
2021-07-02 17:23 ` [PATCH v5 3/6] block/rbd: update s->image_size in qemu_rbd_getlength Ilya Dryomov
2021-07-02 17:23 ` [PATCH v5 4/6] block/rbd: migrate from aio to coroutines Ilya Dryomov
2021-07-02 17:23 ` [PATCH v5 5/6] block/rbd: add write zeroes support Ilya Dryomov
2021-07-02 17:23 ` [PATCH v5 6/6] block/rbd: drop qemu_rbd_refresh_limits Ilya Dryomov
2021-07-02 19:55 ` [PATCH v5 0/6] block/rbd: migrate to coroutines and add write zeroes support Ilya Dryomov
2021-07-06 13:19 ` Kevin Wolf
2021-07-06 14:55 ` Peter Lieven
2021-07-06 15:25 ` Kevin Wolf
2021-07-06 15:48 ` Peter Lieven
2021-07-07 18:13 ` Peter Lieven
2021-07-08 12:18 ` Kevin Wolf
2021-07-08 18:23 ` Peter Lieven
2021-07-09 10:21 ` Kevin Wolf
2021-09-16 12:34 ` Peter Lieven
2021-10-25 11:39 ` Peter Lieven
2021-10-25 12:58 ` Kevin Wolf
2021-10-26 14:53 ` Peter Lieven
2021-11-15 11:17 ` Peter Lieven [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=40c6e10b-b8e3-bb5c-cde3-b4a4e2261d1f@kamp.de \
--to=pl@kamp.de \
--cc=berrange@redhat.com \
--cc=ct@flyingcircus.io \
--cc=idryomov@gmail.com \
--cc=kwolf@redhat.com \
--cc=mreitz@redhat.com \
--cc=pbonzini@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).