Re: Question about graph locking preconditions regarding qemu_in_main_thread()

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Kevin Wolf <kwolf@redhat.com>
To: Fiona Ebner <f.ebner@proxmox.com>
Cc: "open list:Network Block Dev..." <qemu-block@nongnu.org>,
	Hanna Czenczek <hreitz@redhat.com>,
	eesposit@redhat.com, Paolo Bonzini <pbonzini@redhat.com>,
	QEMU Developers <qemu-devel@nongnu.org>,
	Thomas Lamprecht <t.lamprecht@proxmox.com>
Subject: Re: Question about graph locking preconditions regarding qemu_in_main_thread()
Date: Tue, 9 May 2023 15:53:32 +0200	[thread overview]
Message-ID: <ZFpQXPqW0s95/guu@redhat.com> (raw)
In-Reply-To: <8af6f1ef-9b91-7024-36a1-e98ba87a9feb@proxmox.com>

Am 09.05.2023 um 12:26 hat Fiona Ebner geschrieben:
> Am 08.05.23 um 12:40 schrieb Kevin Wolf:
> > Am 05.05.2023 um 11:35 hat Fiona Ebner geschrieben:
> >> Hi,
> >> I noticed that the bdrv_graph_co_rd_lock() and bdrv_graph_co_rd_unlock()
> >> functions use qemu_in_main_thread() as a conditional to return early.
> >> What high-level requirements ensure that qemu_in_main_thread() will
> >> evaluate to the same value during locking and unlocking?
> > 
> > I think it's actually wrong.
> > 
> > There are some conditions under which we don't actually need to do
> > anything for locking: Writing the graph is only possible from the main
> > context while holding the BQL. So if a reader is running in the main
> > contextunder the BQL and knows that it won't be interrupted until the
> > next writer runs, we don't actually need to do anything.
> > 
> > This is the case if the reader code neither has a nested event loop
> > (this is forbidden anyway while you hold the lock) nor is a coroutine
> > (because a writer could run when the coroutine has yielded).
> 
> Sorry, I don't understand the first part. If bdrv_writev_vmstate() is
> called, then, because it is a generated coroutine wrapper,
> AIO_WAIT_WHILE()/aio_poll() is used. And that is the case regardless of
> whether the lock is held or not, i.e. there is a nested event loop even
> if the lock is held?

With "lock" you mean the graph lock here, not the BQL, right?

These generated coroutine wrapper are a bit ugly because they behave
different when called from a coroutine and when called outside of
coroutine context:

* In coroutine context, the caller MUST hold the lock
* Outside of coroutine context, the caller MUST NOT hold the lock

The second requirement comes from AIO_WAIT_WHILE(), like you say. If you
hold the lock and you're not in a coroutine, you simply can't call such
functions.

However, as bdrv_graph_co_rdlock() is a coroutine_fn, you won't usually
hold the lock outside of coroutine context anyway. But it's something to
be careful with when you have a writer lock.

> > These conditions are exactly what bdrv_graph_rdlock_main_loop() asserts.
> > They are not fulfilled in bdrv_graph_co_rdlock, which always runs in a
> > coroutine.
> > 
> 
> Thank you for the explanation!
> 
> >> In the output below, the boolean value after the backtraces of
> >> bdrv_graph_co_rd(un)lock is the value of qemu_in_main_thread().
> >>
> >> AFAICT, what happened below is:
> >> 1. QMP function is executed in the main thread and drops the BQL.
> >> 2. bdrv_co_writev_vmstate_entry is called, increasing the reader count,
> >> because qemu_in_main_thread() is false.
> >> 3. A vCPU thread issued a write, not increasing the reader count,
> >> because qemu_in_main_thread() is true.
> >> 4. The write is finished in the main thread, decreasing the reader
> >> count, because qemu_in_main_thread() is false.
> > 
> > This one is weird. Why don't we hold the BQL there?
> > 
> > Ah, I actually have an idea: Maybe it is executed by the nested event
> > loop in bdrv_writev_vmstate(). This nested event loop runs now without
> > the BQL, too, and it can call any sort of callback in QEMU that really
> > expects to be called with the BQL. Scary!
> > 
> > If this is what happens, you actually have your proof that not holding
> > the BQL can cause problems even if there is no other thread active that
> > could interfere.
> > 
> > Can you try to confirm this in gdb? In case you haven't debugged
> > coroutines much yet: If you have this state in gdb for a running
> > instance, you can repeat 'fin' until you reach the next coroutine
> > switch, and then you should be able to get the stack trace to see who
> > entered the coroutine.
> > 
> 
> Thank you for the guidance! Hope I did it correctly, but AFAICS, you
> are spot-on :)

Yes, this looks like what I suspected. Thanks for confirming!

Kevin

next prev parent reply	other threads:[~2023-05-09 13:54 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-05  9:35 Question about graph locking preconditions regarding qemu_in_main_thread() Fiona Ebner
2023-05-08 10:40 ` Kevin Wolf
2023-05-09 10:26   ` Fiona Ebner
2023-05-09 13:53     ` Kevin Wolf [this message]
2023-05-09 14:20       ` Fiona Ebner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZFpQXPqW0s95/guu@redhat.com \
    --to=kwolf@redhat.com \
    --cc=eesposit@redhat.com \
    --cc=f.ebner@proxmox.com \
    --cc=hreitz@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=t.lamprecht@proxmox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.