Re: Some iotests are failing with -luks

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>
Cc: Thomas Huth <thuth@redhat.com>,
	QEMU Developers <qemu-devel@nongnu.org>,
	hreitz@redhat.com, Maxim Levitsky <mlevitsk@redhat.com>,
	Hyman Huang <yong.huang@smartx.com>
Subject: Re: Some iotests are failing with -luks
Date: Fri, 12 Sep 2025 19:35:11 +0100	[thread overview]
Message-ID: <aMRn31T1Id5ceFjX@redhat.com> (raw)
In-Reply-To: <aMQz383L7sjWiPV_@redhat.com>

On Fri, Sep 12, 2025 at 04:53:19PM +0200, Kevin Wolf wrote:
> Am 12.09.2025 um 16:23 hat Daniel P. Berrangé geschrieben:
> > On Thu, Sep 11, 2025 at 02:13:47PM +0200, Kevin Wolf wrote:
> > > Hm, so block_crypto_read_func() isn't prepared to be called in coroutine
> > > context, but block_crypto_co_amend_luks() still calls it from a
> > > coroutine. The indirection of going through QCrypto won't make it easier
> > > to fix this properly.
> > 
> > Historically block_crypto_read_func() didn't care/know whether it
> > was in a coroutine or not. Bisect tells me the regression was caused
> > by
> > 
> >   commit 1f051dcbdf2e4b6f518db731c84e304b2b9d15ce
> >   Author: Kevin Wolf <kwolf@redhat.com>
> >   Date:   Fri Oct 27 17:53:33 2023 +0200
> > 
> >     block: Protect bs->file with graph_lock
> > 
> > which added
> >  
> >     GLOBAL_STATE_CODE();
> >     GRAPH_RDLOCK_GUARD_MAINLOOP();
> > 
> > > It seems to me that while block_crypto_read/write_func are effectively
> > > no_coroutine_fn, qcrypto_block_amend_options() should really take
> > > function pointers that can be called from coroutines. It is called from
> > > both coroutine and non-coroutine code paths, so should the function
> > > pointers be coroutine_mixed_fn or do we want to change the callers?
> > >
> > > Either way, we should add the appropriate coroutine markers to the
> > > QCrypto interfaces to show the intention at least.
> > 
> > I'm unclear why QCrypto needs to know about coroutines at all ?
> > It just wants a function pointer that will send or recv a blob
> > of data. In the case of the block layer these functions end up
> > doing I/O via the block APIs, but QCrypto doesn't care about
> > this impl detail.
> 
> Does a case where it's not in the context of the block layer even exist?

Only the unit tests.

> The only callers of qcrypto_block_amend_options() are in block/crypto.c
> and block/qcow2.c. Apart from a test case, qcrypto_block_open() is the
> same.

Yep

> And even ignoring the block layer, doing synchronous I/O outside of
> coroutines is arguably a bug anyway because that's a blocking operation
> that stops the mainloop from making progress.

More generally, simply opening a LUKS volume can also impose a significant
delay because key validation is intentionally slow in wallclock time. So we
should get a minimum of 1 second delay, but if given an image created on a
significantly faster machine (or a malicious image), the larger 'iterations'
count could make us take way longer to open the image.

I guess that's a potential problem too ?

Amending the keys has the same performance penalty too as that involves
same intentionally slow crypto

> But if we don't want to fix it at the QCrypto level, that makes the
> function pointer implicitly coroutine_mixed_fn and the function needs to
> be rewritten to check qemu_in_coroutine() and probably take the graph
> lock internally before calling bdrv_co_pread() in the coroutine case,
> unless we can prove that all callers already hold it. Unfortunately,
> function pointers still defeat TSA, so this proof will have to be
> manual.

So IIUC the 'open' operation is not in a coroutine, while the
'amend' operation is in a coroutine ?

IIUC the coroutine_mixed_fn is expanding to a no-op. What is the
actual functional fix needed to stop the crash ?

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

next prev parent reply	other threads:[~2025-09-12 18:36 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-10 15:16 Some iotests are failing with -luks Thomas Huth
2025-09-10 16:08 ` Kevin Wolf
2025-09-10 18:38   ` Kevin Wolf
2025-09-11  2:33     ` Yong Huang
2025-09-11 10:04       ` Kevin Wolf
2025-09-11 10:38         ` Daniel P. Berrangé
2025-09-15 13:18           ` Kevin Wolf
2025-09-11  8:43     ` Daniel P. Berrangé
2025-09-11 11:21   ` Thomas Huth
2025-09-11 12:13     ` Kevin Wolf
2025-09-12 14:23       ` Daniel P. Berrangé
2025-09-12 14:53         ` Kevin Wolf
2025-09-12 18:35           ` Daniel P. Berrangé [this message]
2025-09-15 12:45             ` Kevin Wolf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aMRn31T1Id5ceFjX@redhat.com \
    --to=berrange@redhat.com \
    --cc=hreitz@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mlevitsk@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=thuth@redhat.com \
    --cc=yong.huang@smartx.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).