* [PATCH] block/file-posix: don't use functions calling AIO_WAIT_WHILE in worker threads
From: Emanuele Giuseppe Esposito @ 2023-02-09 15:45 UTC
To: qemu-block
Cc: Kevin Wolf, Hanna Reitz, Ninad Palsule,
Philippe Mathieu-Daudé, qemu-devel,
Emanuele Giuseppe Esposito
When handle_aiocb_write_zeroes() calls bdrv_getlength(), the latter
creates a new coroutine and then waits for it to finish using
AIO_WAIT_WHILE.
The problem is that this call can also happen in a worker thread, which
has a different AioContext from the main loop and the iothreads. In
AIO_WAIT_WHILE we will therefore have
in_aio_context_home_thread(ctx) == false, and so
    assert(qemu_get_current_aio_context() == qemu_get_aio_context());
in the else branch will fail, crashing QEMU.

Aside from that, using bdrv_getlength() here is also conceptually wrong,
because it reads the BDS graph from another thread and is not protected
by any lock.

Replace it with raw_co_getlength(), which doesn't create a coroutine and
doesn't read the BDS graph.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
block/file-posix.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/file-posix.c b/block/file-posix.c
index d3073a7caa..9a99111f45 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1738,7 +1738,7 @@ static int handle_aiocb_write_zeroes(void *opaque)
 #ifdef CONFIG_FALLOCATE
     /* Last resort: we are trying to extend the file with zeroed data. This
      * can be done via fallocate(fd, 0) */
-    len = bdrv_getlength(aiocb->bs);
+    len = raw_co_getlength(aiocb->bs);
     if (s->has_fallocate && len >= 0 && aiocb->aio_offset >= len) {
         int ret = do_fallocate(s->fd, 0, aiocb->aio_offset, aiocb->aio_nbytes);
         if (ret == 0 || ret != -ENOTSUP) {
--
2.39.1
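
As a reading aid for the commit message above, here is a minimal toy model of the branch selection it describes. It is only a sketch and not the QEMU macro: every `toy_*` / `Toy*` name is hypothetical, the "current context" is a plain variable rather than thread-local state, and the home-thread check is reduced to a pointer comparison. It assumes, as the commit message states, that a worker thread sees a context that is neither the main loop's nor an iothread's.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for an AioContext; NOT the real QEMU type. */
typedef struct ToyAioContext {
    const char *name;
} ToyAioContext;

static ToyAioContext toy_main_ctx   = { "main-loop" };
static ToyAioContext toy_iothread   = { "iothread0" };
static ToyAioContext toy_worker_ctx = { "worker" };

/* What the running thread reports as its current context (hypothetical). */
static ToyAioContext *toy_current_ctx = &toy_main_ctx;

typedef enum {
    TOY_POLL_OWN_CTX,   /* home thread of ctx: poll ctx directly */
    TOY_POLL_MAIN_CTX,  /* else branch, the assertion holds */
    TOY_ASSERT_FAILS    /* else branch, the assertion would abort */
} ToyWaitPath;

/* Simplified model of the home-thread check (assumption). */
static bool toy_in_home_thread(const ToyAioContext *ctx)
{
    return toy_current_ctx == ctx;
}

/* Which path a wait on ctx would take from the current thread. */
static ToyWaitPath toy_wait_path(const ToyAioContext *ctx)
{
    if (toy_in_home_thread(ctx)) {
        return TOY_POLL_OWN_CTX;
    }
    /* Models: assert(current context == main loop context) */
    if (toy_current_ctx == &toy_main_ctx) {
        return TOY_POLL_MAIN_CTX;
    }
    return TOY_ASSERT_FAILS;
}
```

With `toy_current_ctx` set to `&toy_worker_ctx`, `toy_wait_path()` returns `TOY_ASSERT_FAILS` for both the main-loop and the iothread context, which corresponds to the crash the patch avoids.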
* Re: [PATCH] block/file-posix: don't use functions calling AIO_WAIT_WHILE in worker threads
From: Philippe Mathieu-Daudé @ 2023-02-09 16:06 UTC
To: Emanuele Giuseppe Esposito, qemu-block
Cc: Kevin Wolf, Hanna Reitz, Ninad Palsule, qemu-devel
On 9/2/23 16:45, Emanuele Giuseppe Esposito wrote:
> When handle_aiocb_write_zeroes() calls bdrv_getlength(), the latter
> creates a new coroutine and then waits for it to finish using
> AIO_WAIT_WHILE.
> The problem is that this call can also happen in a worker thread, which
> has a different AioContext from the main loop and the iothreads. In
> AIO_WAIT_WHILE we will therefore have
> in_aio_context_home_thread(ctx) == false, and so
>     assert(qemu_get_current_aio_context() == qemu_get_aio_context());
> in the else branch will fail, crashing QEMU.
>
> Aside from that, using bdrv_getlength() here is also conceptually wrong,
> because it reads the BDS graph from another thread and is not protected
> by any lock.
>
> Replace it with raw_co_getlength(), which doesn't create a coroutine and
> doesn't read the BDS graph.
Reported-by: Ninad Palsule <ninad@linux.vnet.ibm.com>
Suggested-by: Kevin Wolf <kwolf@redhat.com>
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> ---
> block/file-posix.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/block/file-posix.c b/block/file-posix.c
> index d3073a7caa..9a99111f45 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -1738,7 +1738,7 @@ static int handle_aiocb_write_zeroes(void *opaque)
>  #ifdef CONFIG_FALLOCATE
>      /* Last resort: we are trying to extend the file with zeroed data. This
>       * can be done via fallocate(fd, 0) */
> -    len = bdrv_getlength(aiocb->bs);
> +    len = raw_co_getlength(aiocb->bs);
>      if (s->has_fallocate && len >= 0 && aiocb->aio_offset >= len) {
>          int ret = do_fallocate(s->fd, 0, aiocb->aio_offset, aiocb->aio_nbytes);
>          if (ret == 0 || ret != -ENOTSUP) {
* Re: [PATCH] block/file-posix: don't use functions calling AIO_WAIT_WHILE in worker threads
From: Kevin Wolf @ 2023-02-09 17:31 UTC
To: Emanuele Giuseppe Esposito
Cc: qemu-block, Hanna Reitz, Ninad Palsule,
Philippe Mathieu-Daudé, qemu-devel
On 09.02.2023 at 16:45, Emanuele Giuseppe Esposito wrote:
> When handle_aiocb_write_zeroes() calls bdrv_getlength(), the latter
> creates a new coroutine and then waits for it to finish using
> AIO_WAIT_WHILE.
> The problem is that this call can also happen in a worker thread, which
> has a different AioContext from the main loop and the iothreads. In
> AIO_WAIT_WHILE we will therefore have
> in_aio_context_home_thread(ctx) == false, and so
>     assert(qemu_get_current_aio_context() == qemu_get_aio_context());
> in the else branch will fail, crashing QEMU.
>
> Aside from that, using bdrv_getlength() here is also conceptually wrong,
> because it reads the BDS graph from another thread and is not protected
> by any lock.
>
> Replace it with raw_co_getlength(), which doesn't create a coroutine and
> doesn't read the BDS graph.
>
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> ---
> block/file-posix.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/block/file-posix.c b/block/file-posix.c
> index d3073a7caa..9a99111f45 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -1738,7 +1738,7 @@ static int handle_aiocb_write_zeroes(void *opaque)
>  #ifdef CONFIG_FALLOCATE
>      /* Last resort: we are trying to extend the file with zeroed data. This
>       * can be done via fallocate(fd, 0) */
> -    len = bdrv_getlength(aiocb->bs);
> +    len = raw_co_getlength(aiocb->bs);
>      if (s->has_fallocate && len >= 0 && aiocb->aio_offset >= len) {
>          int ret = do_fallocate(s->fd, 0, aiocb->aio_offset, aiocb->aio_nbytes);
>          if (ret == 0 || ret != -ENOTSUP) {
Obviously this relies on the fact that raw_co_getlength() doesn't
actually depend on running in coroutine context. Could be done in a
separate patch, but I think we should rename it back to raw_getlength()
and remove the coroutine_fn annotation again. Seems commit c86422c5549
was a little too eager.
Kevin
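
To illustrate the point that a length lookup need not depend on coroutine context, here is a minimal sketch of a length query built only on a plain system call, so it can be called from any thread. It is not QEMU's raw_co_getlength(): `toy_getlength()` and `toy_extends_file()` are hypothetical helpers, and the use of fstat() is an assumption chosen for the sketch because it leaves the file offset untouched.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical helper: return the size of the file behind fd, or a
 * negative errno value on failure. It uses no coroutine, no event loop
 * and no shared graph state, only the file descriptor it is given.
 */
static int64_t toy_getlength(int fd)
{
    struct stat st;

    if (fstat(fd, &st) < 0) {
        return -errno;
    }
    return (int64_t)st.st_size;
}

/*
 * Mirrors the "len >= 0 && offset >= len" test from the patched hunk:
 * returns 1 if a write starting at offset would begin at or past the
 * current end of the file, 0 otherwise (including on lookup errors).
 */
static int toy_extends_file(int fd, int64_t offset)
{
    int64_t len = toy_getlength(fd);

    return len >= 0 && offset >= len;
}
```

A worker thread could call `toy_extends_file(fd, offset)` directly before deciding whether to fall back to fallocate(); whether the real function stays this simple on every platform is not something this sketch tries to show.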
* Re: [PATCH] block/file-posix: don't use functions calling AIO_WAIT_WHILE in worker threads
From: Kevin Wolf @ 2023-02-15 12:51 UTC
To: Emanuele Giuseppe Esposito
Cc: qemu-block, Hanna Reitz, Ninad Palsule,
Philippe Mathieu-Daudé, qemu-devel
On 09.02.2023 at 16:45, Emanuele Giuseppe Esposito wrote:
> When handle_aiocb_write_zeroes() calls bdrv_getlength(), the latter
> creates a new coroutine and then waits for it to finish using
> AIO_WAIT_WHILE.
> The problem is that this call can also happen in a worker thread, which
> has a different AioContext from the main loop and the iothreads. In
> AIO_WAIT_WHILE we will therefore have
> in_aio_context_home_thread(ctx) == false, and so
>     assert(qemu_get_current_aio_context() == qemu_get_aio_context());
> in the else branch will fail, crashing QEMU.
>
> Aside from that, using bdrv_getlength() here is also conceptually wrong,
> because it reads the BDS graph from another thread and is not protected
> by any lock.
>
> Replace it with raw_co_getlength(), which doesn't create a coroutine and
> doesn't read the BDS graph.
>
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Thanks, applied to the block branch.
Kevin