From: Kevin Wolf <kwolf@redhat.com>
To: Fiona Ebner <f.ebner@proxmox.com>
Cc: "open list:Network Block Dev..." <qemu-block@nongnu.org>,
Hanna Czenczek <hreitz@redhat.com>,
eesposit@redhat.com, Paolo Bonzini <pbonzini@redhat.com>,
QEMU Developers <qemu-devel@nongnu.org>,
Thomas Lamprecht <t.lamprecht@proxmox.com>
Subject: Re: Question about graph locking preconditions regarding qemu_in_main_thread()
Date: Tue, 9 May 2023 15:53:32 +0200
Message-ID: <ZFpQXPqW0s95/guu@redhat.com>
In-Reply-To: <8af6f1ef-9b91-7024-36a1-e98ba87a9feb@proxmox.com>
On 09.05.2023 at 12:26, Fiona Ebner wrote:
> On 08.05.23 at 12:40, Kevin Wolf wrote:
> > On 05.05.2023 at 11:35, Fiona Ebner wrote:
> >> Hi,
> >> I noticed that the bdrv_graph_co_rd_lock() and bdrv_graph_co_rd_unlock()
> >> functions use qemu_in_main_thread() as a condition to return early.
> >> What high-level requirements ensure that qemu_in_main_thread() will
> >> evaluate to the same value during locking and unlocking?
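(For context, the fast path in question looks roughly like this; a
simplified sketch of bdrv_graph_co_rdlock(), with the actual reader-count
bookkeeping elided:)

    void coroutine_fn bdrv_graph_co_rdlock(void)
    {
        /* The early return under discussion: readers running in the
         * main thread skip the reader count entirely. */
        if (qemu_in_main_thread()) {
            return;
        }

        /* ... otherwise register this reader in the per-AioContext
         * reader count and synchronize with any pending writer ... */
    }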
> >
> > I think it's actually wrong.
> >
> > There are some conditions under which we don't actually need to do
> > anything for locking: Writing the graph is only possible from the main
> > context while holding the BQL. So if a reader is running in the main
> > context under the BQL and knows that it won't be interrupted until the
> > next writer runs, we don't actually need to do anything.
> >
> > This is the case if the reader code neither has a nested event loop
> > (this is forbidden anyway while you hold the lock) nor is a coroutine
> > (because a writer could run when the coroutine has yielded).
>
> Sorry, I don't understand the first part. If bdrv_writev_vmstate() is
> called, then, because it is a generated coroutine wrapper,
> AIO_WAIT_WHILE()/aio_poll() is used. And that is the case regardless of
> whether the lock is held or not, i.e. there is a nested event loop even
> if the lock is held?
With "lock" you mean the graph lock here, not the BQL, right?
These generated coroutine wrappers are a bit ugly because they behave
differently when called from a coroutine and when called outside of
coroutine context:
* In coroutine context, the caller MUST hold the lock
* Outside of coroutine context, the caller MUST NOT hold the lock
The second requirement comes from AIO_WAIT_WHILE(), like you say. If you
hold the lock and you're not in a coroutine, you simply can't call such
functions.
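To illustrate, here is a stripped-down sketch of what the generated
wrapper for bdrv_writev_vmstate() looks like (names like WriteVMStateCo
and writev_vmstate_entry are made up for the sketch; the real code comes
out of scripts/block-coroutine-wrapper.py):

    int bdrv_writev_vmstate(BlockDriverState *bs, QEMUIOVector *qiov,
                            int64_t pos)
    {
        if (qemu_in_coroutine()) {
            /* Case 1: already in coroutine context; the caller must
             * hold the graph lock and we can just call the
             * coroutine_fn directly. */
            return bdrv_co_writev_vmstate(bs, qiov, pos);
        } else {
            /* Case 2: the caller must NOT hold the graph lock,
             * because we run a nested event loop here until the
             * spawned coroutine has finished. */
            WriteVMStateCo s = {
                .bs = bs, .qiov = qiov, .pos = pos, .ret = -EINPROGRESS,
            };
            Coroutine *co = qemu_coroutine_create(writev_vmstate_entry, &s);
            bdrv_coroutine_enter(bs, co);
            AIO_WAIT_WHILE(bdrv_get_aio_context(bs), s.ret == -EINPROGRESS);
            return s.ret;
        }
    }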
However, as bdrv_graph_co_rdlock() is a coroutine_fn, you won't usually
hold the lock outside of coroutine context anyway. But it's something to
be careful with when you have a writer lock.
> > These conditions are exactly what bdrv_graph_rdlock_main_loop() asserts.
> > They are not fulfilled in bdrv_graph_co_rdlock, which always runs in a
> > coroutine.
> >
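For reference, the assertions meant here are essentially the following
(modulo the exact macro spelling; GLOBAL_STATE_CODE() boils down to a
main-thread check):

    void bdrv_graph_rdlock_main_loop(void)
    {
        GLOBAL_STATE_CODE();           /* main thread, under the BQL */
        assert(!qemu_in_coroutine());  /* and not in a coroutine */
    }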
>
> Thank you for the explanation!
>
> >> In the output below, the boolean value after the backtraces of
> >> bdrv_graph_co_rd(un)lock is the value of qemu_in_main_thread().
> >>
> >> AFAICT, what happened below is:
> >> 1. QMP function is executed in the main thread and drops the BQL.
> >> 2. bdrv_co_writev_vmstate_entry is called, increasing the reader count,
> >> because qemu_in_main_thread() is false.
> >> 3. A vCPU thread issued a write, not increasing the reader count,
> >> because qemu_in_main_thread() is true.
> >> 4. The write is finished in the main thread, decreasing the reader
> >> count, because qemu_in_main_thread() is false.
> >
> > This one is weird. Why don't we hold the BQL there?
> >
> > Ah, I actually have an idea: Maybe it is executed by the nested event
> > loop in bdrv_writev_vmstate(). This nested event loop runs now without
> > the BQL, too, and it can call any sort of callback in QEMU that really
> > expects to be called with the BQL. Scary!
> >
> > If this is what happens, you actually have your proof that not holding
> > the BQL can cause problems even if there is no other thread active that
> > could interfere.
> >
> > Can you try to confirm this in gdb? In case you haven't debugged
> > coroutines much yet: If you have this state in gdb for a running
> > instance, you can repeat 'fin' until you reach the next coroutine
> > switch, and then you should be able to get the stack trace to see who
> > entered the coroutine.
> >
>
> Thank you for the guidance! Hope I did it correctly, but AFAICS, you
> are spot-on :)
Yes, this looks like what I suspected. Thanks for confirming!
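In case someone wants to reproduce this kind of analysis, the gdb steps
condensed (standard gdb commands; the exact frame where the switch shows
up depends on the build):

    (gdb) fin    # repeat until you return through the coroutine switch,
    (gdb) fin    # e.g. qemu_coroutine_switch()
    (gdb) bt     # the backtrace now shows who entered the coroutine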
Kevin