All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] block: switch to co_wrapper_mixed for blk_lock_medium() and blk_eject()
@ 2026-05-30 17:58 Boudewijn van der Heide
  2026-06-08 12:17 ` Kevin Wolf
  0 siblings, 1 reply; 3+ messages in thread
From: Boudewijn van der Heide @ 2026-05-30 17:58 UTC (permalink / raw)
  To: qemu-block, qemu-devel; +Cc: kwolf, hreitz, Boudewijn van der Heide

scsi_disk_emulate_command() can be called from a coroutine;
when req->cmd.buf[0] is ALLOW_MEDIUM_REMOVAL, the synchronous
blk_lock_medium() is called, which hits assert(!qemu_in_coroutine()),
and crashes:

	qemu-system-hppa: block/block-gen.c:1692: blk_lock_medium:
	Assertion `!qemu_in_coroutine()' failed.

blk_eject() has the same problem, it can be called from coroutine,
because the same vtable entry (SCSIReqOps.send_command,
scsi_disk_emulate_command() here) calls blk_eject when req->cmd.buf[0]
is START_STOP.

Fix by switching to co_wrapper_mixed for blk_lock_medium() and
blk_eject() instead of just co_wrapper.

Signed-off-by: Boudewijn van der Heide <boudewijn@delta-utec.com>
---
Observed crash on fedora qemu-10.1.5-1.fc43 on x86_64 host using qemu-system-hppa.

trace:

Thread 1 (...):
        # 5 __assert_fail (...) at assert.c
        # 6 blk_lock_medium (...) at block/block-gen.c
        # 7 scsi_disk_emulate_command (...) at ../hw/scsi/scsi-disk.c
        # 8 scsi_req_enqueue (...) at ../hw/scsi/scsi-bus.c
        # 9 lsi_do_command (...) at ../hw/scsi/lsi53c895a.c
        # 10 lsi_execute_script (...) at ../hw/scsi/lsi53c895a.c
        # 11 scsi_read_complete_noio (...) at ../hw/scsi/scsi-disk.c
        # 12 blk_aio_complete (...) at ../block/block-backend.c
        # 14 blk_aio_read_entry (...) at ../block/block-backend.c
        # 15 in coroutine_trampoline (...) at ../util/coroutine-ucontext.c

blk_eject() has the same path, but req->cmd.buf[0] is START_STOP instead
of ALLOW_MEDIUM_REMOVAL inside scsi_disk_emulate_command(),
so fix that aswell.
---
 include/system/block-backend-io.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/system/block-backend-io.h b/include/system/block-backend-io.h
index fd84723d9d..7368ad5c09 100644
--- a/include/system/block-backend-io.h
+++ b/include/system/block-backend-io.h
@@ -81,10 +81,10 @@ bool coroutine_fn GRAPH_RDLOCK blk_co_is_available(BlockBackend *blk);
 bool co_wrapper_mixed_bdrv_rdlock blk_is_available(BlockBackend *blk);
 
 void coroutine_fn blk_co_lock_medium(BlockBackend *blk, bool locked);
-void co_wrapper blk_lock_medium(BlockBackend *blk, bool locked);
+void co_wrapper_mixed blk_lock_medium(BlockBackend *blk, bool locked);
 
 void coroutine_fn blk_co_eject(BlockBackend *blk, bool eject_flag);
-void co_wrapper blk_eject(BlockBackend *blk, bool eject_flag);
+void co_wrapper_mixed blk_eject(BlockBackend *blk, bool eject_flag);
 
 int64_t coroutine_fn blk_co_getlength(BlockBackend *blk);
 int64_t co_wrapper_mixed blk_getlength(BlockBackend *blk);
-- 
2.54.0



^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] block: switch to co_wrapper_mixed for blk_lock_medium() and blk_eject()
  2026-05-30 17:58 [PATCH] block: switch to co_wrapper_mixed for blk_lock_medium() and blk_eject() Boudewijn van der Heide
@ 2026-06-08 12:17 ` Kevin Wolf
  2026-06-09 15:49   ` Stefan Hajnoczi
  0 siblings, 1 reply; 3+ messages in thread
From: Kevin Wolf @ 2026-06-08 12:17 UTC (permalink / raw)
  To: Boudewijn van der Heide
  Cc: qemu-block, qemu-devel, hreitz, pbonzini, stefanha

[Cc: Paolo, Stefan]

Am 30.05.2026 um 19:58 hat Boudewijn van der Heide geschrieben:
> scsi_disk_emulate_command() can be called from a coroutine;
> when req->cmd.buf[0] is ALLOW_MEDIUM_REMOVAL, the synchronous
> blk_lock_medium() is called, which hits assert(!qemu_in_coroutine()),
> and crashes:
> 
> 	qemu-system-hppa: block/block-gen.c:1692: blk_lock_medium:
> 	Assertion `!qemu_in_coroutine()' failed.
> 
> blk_eject() has the same problem, it can be called from coroutine,
> because the same vtable entry (SCSIReqOps.send_command,
> scsi_disk_emulate_command() here) calls blk_eject when req->cmd.buf[0]
> is START_STOP.
> 
> Fix by switching to co_wrapper_mixed for blk_lock_medium() and
> blk_eject() instead of just co_wrapper.
> 
> Signed-off-by: Boudewijn van der Heide <boudewijn@delta-utec.com>

Yes, that will fix your immediate symptoms. I'm not completely sure if
that's the level where we want to fix things, though, which is why I
added Paolo and Stefan to Cc.

> ---
> Observed crash on fedora qemu-10.1.5-1.fc43 on x86_64 host using qemu-system-hppa.
> 
> trace:
> 
> Thread 1 (...):
>         # 5 __assert_fail (...) at assert.c
>         # 6 blk_lock_medium (...) at block/block-gen.c
>         # 7 scsi_disk_emulate_command (...) at ../hw/scsi/scsi-disk.c
>         # 8 scsi_req_enqueue (...) at ../hw/scsi/scsi-bus.c
>         # 9 lsi_do_command (...) at ../hw/scsi/lsi53c895a.c
>         # 10 lsi_execute_script (...) at ../hw/scsi/lsi53c895a.c
>         # 11 scsi_read_complete_noio (...) at ../hw/scsi/scsi-disk.c
>         # 12 blk_aio_complete (...) at ../block/block-backend.c
>         # 14 blk_aio_read_entry (...) at ../block/block-backend.c
>         # 15 in coroutine_trampoline (...) at ../util/coroutine-ucontext.c
> 
> blk_eject() has the same path, but req->cmd.buf[0] is START_STOP instead
> of ALLOW_MEDIUM_REMOVAL inside scsi_disk_emulate_command(),
> so fix that aswell.

This would be useful information to have in the commit message proper.

The step from #10 to #11 seems to be a rather big one that left out
intermediate function calls, so I'm not sure if I fully understand it.

Either way, I don't think lsi_execute_script() and the functions called
by it were ever written with the intention to run in a coroutine. I'm
almost sure that problems can be in more than just the two functions
you're changing here. So my question is mostly, should the path involve
a BH somewhere to break out of coroutine context, instead of making one
or two specific cases work?

If the general feeling is that we do want to make essentially all of the
SCSI code coroutine_mixed_fn and are prepared to fix any issues arising
from it, then this patch is fine as far as I am concerned.

Kevin

> ---
>  include/system/block-backend-io.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/include/system/block-backend-io.h b/include/system/block-backend-io.h
> index fd84723d9d..7368ad5c09 100644
> --- a/include/system/block-backend-io.h
> +++ b/include/system/block-backend-io.h
> @@ -81,10 +81,10 @@ bool coroutine_fn GRAPH_RDLOCK blk_co_is_available(BlockBackend *blk);
>  bool co_wrapper_mixed_bdrv_rdlock blk_is_available(BlockBackend *blk);
>  
>  void coroutine_fn blk_co_lock_medium(BlockBackend *blk, bool locked);
> -void co_wrapper blk_lock_medium(BlockBackend *blk, bool locked);
> +void co_wrapper_mixed blk_lock_medium(BlockBackend *blk, bool locked);
>  
>  void coroutine_fn blk_co_eject(BlockBackend *blk, bool eject_flag);
> -void co_wrapper blk_eject(BlockBackend *blk, bool eject_flag);
> +void co_wrapper_mixed blk_eject(BlockBackend *blk, bool eject_flag);
>  
>  int64_t coroutine_fn blk_co_getlength(BlockBackend *blk);
>  int64_t co_wrapper_mixed blk_getlength(BlockBackend *blk);
> -- 
> 2.54.0
> 



^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] block: switch to co_wrapper_mixed for blk_lock_medium() and blk_eject()
  2026-06-08 12:17 ` Kevin Wolf
@ 2026-06-09 15:49   ` Stefan Hajnoczi
  0 siblings, 0 replies; 3+ messages in thread
From: Stefan Hajnoczi @ 2026-06-09 15:49 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Boudewijn van der Heide, qemu-block, qemu-devel, hreitz, pbonzini

[-- Attachment #1: Type: text/plain, Size: 4731 bytes --]

On Mon, Jun 08, 2026 at 02:17:46PM +0200, Kevin Wolf wrote:
> [Cc: Paolo, Stefan]
> 
> Am 30.05.2026 um 19:58 hat Boudewijn van der Heide geschrieben:
> > scsi_disk_emulate_command() can be called from a coroutine;
> > when req->cmd.buf[0] is ALLOW_MEDIUM_REMOVAL, the synchronous
> > blk_lock_medium() is called, which hits assert(!qemu_in_coroutine()),
> > and crashes:
> > 
> > 	qemu-system-hppa: block/block-gen.c:1692: blk_lock_medium:
> > 	Assertion `!qemu_in_coroutine()' failed.
> > 
> > blk_eject() has the same problem, it can be called from coroutine,
> > because the same vtable entry (SCSIReqOps.send_command,
> > scsi_disk_emulate_command() here) calls blk_eject when req->cmd.buf[0]
> > is START_STOP.
> > 
> > Fix by switching to co_wrapper_mixed for blk_lock_medium() and
> > blk_eject() instead of just co_wrapper.
> > 
> > Signed-off-by: Boudewijn van der Heide <boudewijn@delta-utec.com>
> 
> Yes, that will fix your immediate symptoms. I'm not completely sure if
> that's the level where we want to fix things, though, which is why I
> added Paolo and Stefan to Cc.
> 
> > ---
> > Observed crash on fedora qemu-10.1.5-1.fc43 on x86_64 host using qemu-system-hppa.
> > 
> > trace:
> > 
> > Thread 1 (...):
> >         # 5 __assert_fail (...) at assert.c
> >         # 6 blk_lock_medium (...) at block/block-gen.c
> >         # 7 scsi_disk_emulate_command (...) at ../hw/scsi/scsi-disk.c
> >         # 8 scsi_req_enqueue (...) at ../hw/scsi/scsi-bus.c
> >         # 9 lsi_do_command (...) at ../hw/scsi/lsi53c895a.c
> >         # 10 lsi_execute_script (...) at ../hw/scsi/lsi53c895a.c
> >         # 11 scsi_read_complete_noio (...) at ../hw/scsi/scsi-disk.c
> >         # 12 blk_aio_complete (...) at ../block/block-backend.c
> >         # 14 blk_aio_read_entry (...) at ../block/block-backend.c
> >         # 15 in coroutine_trampoline (...) at ../util/coroutine-ucontext.c
> > 
> > blk_eject() has the same path, but req->cmd.buf[0] is START_STOP instead
> > of ALLOW_MEDIUM_REMOVAL inside scsi_disk_emulate_command(),
> > so fix that aswell.
> 
> This would be useful information to have in the commit message proper.
> 
> The step from #10 to #11 seems to be a rather big one that left out
> intermediate function calls, so I'm not sure if I fully understand it.
> 
> Either way, I don't think lsi_execute_script() and the functions called
> by it were ever written with the intention to run in a coroutine. I'm
> almost sure that problems can be in more than just the two functions
> you're changing here. So my question is mostly, should the path involve
> a BH somewhere to break out of coroutine context, instead of making one
> or two specific cases work?

The code path in this bug is from the block layer (blk_aio_read_entry ->
blk_aio_complete) to scsi-disk (scsi_read_complete_noio) into the LSI
device (probably lsi_command_complete -> lsi_resume_script ->
lsi_execute_script).

I'm not sure if the LSI device, which is unaware of coroutines, should
have to work around this. It seems cleaner for blk_aio_preadv() to
invoke cb() from outside coroutine context since that API does not say
anything about coroutine contexts.

On the other hand, there will be a performance overhead for leaving
coroutine context even when it would be fine to run in a coroutine...

> If the general feeling is that we do want to make essentially all of the
> SCSI code coroutine_mixed_fn and are prepared to fix any issues arising
> from it, then this patch is fine as far as I am concerned.
> 
> Kevin
> 
> > ---
> >  include/system/block-backend-io.h | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/include/system/block-backend-io.h b/include/system/block-backend-io.h
> > index fd84723d9d..7368ad5c09 100644
> > --- a/include/system/block-backend-io.h
> > +++ b/include/system/block-backend-io.h
> > @@ -81,10 +81,10 @@ bool coroutine_fn GRAPH_RDLOCK blk_co_is_available(BlockBackend *blk);
> >  bool co_wrapper_mixed_bdrv_rdlock blk_is_available(BlockBackend *blk);
> >  
> >  void coroutine_fn blk_co_lock_medium(BlockBackend *blk, bool locked);
> > -void co_wrapper blk_lock_medium(BlockBackend *blk, bool locked);
> > +void co_wrapper_mixed blk_lock_medium(BlockBackend *blk, bool locked);
> >  
> >  void coroutine_fn blk_co_eject(BlockBackend *blk, bool eject_flag);
> > -void co_wrapper blk_eject(BlockBackend *blk, bool eject_flag);
> > +void co_wrapper_mixed blk_eject(BlockBackend *blk, bool eject_flag);
> >  
> >  int64_t coroutine_fn blk_co_getlength(BlockBackend *blk);
> >  int64_t co_wrapper_mixed blk_getlength(BlockBackend *blk);
> > -- 
> > 2.54.0
> > 
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2026-06-09 15:49 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-30 17:58 [PATCH] block: switch to co_wrapper_mixed for blk_lock_medium() and blk_eject() Boudewijn van der Heide
2026-06-08 12:17 ` Kevin Wolf
2026-06-09 15:49   ` Stefan Hajnoczi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.