Re: Best practice for issuing blocking calls in response to an event

From: Miles Glenn <milesg@linux.ibm.com>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: qemu-devel@nongnu.org, "Philippe Mathieu-Daudé" <philmd@linaro.org>
Subject: Re: Best practice for issuing blocking calls in response to an event
Date: Tue, 25 Mar 2025 10:08:04 -0500
Message-ID: <f0fdbf4b2de592383979eb2e58855b2d6fc7c33a.camel@linux.ibm.com>
In-Reply-To: <CAJSP0QX-T=8Fw=x_De2HdiNVKNQf2nTbrHp5cnUeJfFzxVONwQ@mail.gmail.com>

On Mon, 2025-03-24 at 14:35 -0400, Stefan Hajnoczi wrote:
> On Fri, Mar 21, 2025 at 11:17 AM Miles Glenn <milesg@linux.ibm.com> wrote:
> > On Thu, 2025-03-20 at 16:09 -0400, Stefan Hajnoczi wrote:
> > > On Thu, Mar 20, 2025 at 12:34 PM Miles Glenn <milesg@linux.ibm.com> wrote:
> > > > Hello,
> > > > 
> > > > I am attempting to simulate a system with multiple CPU
> > > > architectures.  To do this I am starting a unique QEMU process for each
> > > > CPU architecture that is needed. I'm also developing some QEMU code
> > > > that aids in transporting MMIO transactions across the process
> > > > boundaries using sockets.
> > > 
> > > I have CCed Phil. He has been working on heterogeneous target emulation
> > > and might be interested.
> > > 
> > > > The design takes MMIO request messages off of a socket, services the
> > > > request by calling address_space_ldq_be(), then sends a response
> > > > message (containing the requested data) over the same
> > > > socket.  Currently, this is all done inside the socket IOReadHandler
> > > > callback function.
> > > 
> > > At a high level this is similar to the vfio-user feature where a PCI
> > > device is emulated in a separate process. This also involves sending
> > > messages describing QEMU's MemoryRegion accesses. See the "remote"
> > > machine type in QEMU to look at the code.
> > > 
> > > > This works as long as the targeted register exists in the same QEMU
> > > > process that received the request.  However, if the register exists in
> > > > another QEMU process, then the call to address_space_ldq_be() results
> > > > in another socket message being sent to that QEMU process, requesting
> > > > the data, and then waiting (blocking) for the response message
> > > > containing the data.  In other words, it ends up blocking inside the
> > > > event handler and even though the QEMU process containing the target
> > > > register was able to receive the request and send the response, the
> > > > originator of the request is unable to receive the response until it
> > > > eventually times out and stops blocking.  Once it times out and stops
> > > > blocking, it does receive the response, but now it is too late.
> > > > 
> > > > Here's a summary of the stack up to where the code blocks:
> > > > 
> > > > IOReadHandler callback
> > > >   calls address_space_ldq_be()
> > > >     resolves to mmio read op of a remote device
> > > >       sends request over socket and waits (blocks) for response
> > > > 
> > > > So, I'm looking for a way to handle the work of calling
> > > > address_space_ldq_be(), which might block when attempting to read a
> > > > register of a remote device, without blocking inside the IOReadHandler
> > > > callback context.
> > > > 
> > > > I've done a lot of searches and reading about how to do this on the web
> > > > and in the QEMU code but it's still not really clear to me how this
> > > > should be done in QEMU.  I've seen a lot about using coroutines to
> > > > handle cases like this. Is that what I should be using here?
> > > 
> > > The fundamental problem is that address_space_ldq_be() is synchronous,
> > > so there is no way to return back to the caller until the response has
> > > been received.
> > > 
> > > vfio-user didn't solve this problem. It simply blocks until the
> > > response is received, but it does drop the Big QEMU Lock during this
> > > time so that other vCPU threads can run. For example, see
> > > hw/remote/proxy.c:send_bar_access_msg() and
> > > mpqemu_msg_send_and_await_reply().
> > > 
> > > QEMU supports nested event loops, but they come with their own set of
> > > gotchas. The way a nested event loop might help here is to send the
> > > request and then call aio_poll() to receive the response in another
> > > IOReadHandler. This way other event loop processing can take place
> > > while waiting in address_space_ldq_be().
> > > 
> > > The second problem is that this approach where QEMU processes send
> > > requests to each other needs to be implemented carefully to avoid
> > > deadlocks. For example, devices that do DMA could load/store memory
> > > belonging to another device handled by another QEMU. Once there is an
> > > A -> B -> A situation it could deadlock.
> > > 
> > > Both vfio-user and vhost-user have similar issues with their
> > > bi-directional communication where a device emulation process can send
> > > a message to QEMU while processing a message from QEMU. Deadlock can
> > > be avoided if the code is structured so that QEMU is able to receive
> > > new requests during the time when it is waiting for a response.
> > > 
> > > Stefan
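[Editor's sketch] The deadlock-avoidance rule quoted above (keep receiving new requests while waiting for a response) can be modelled in a few lines. This is a language-neutral illustration in Python, not QEMU code: the message tuples, `wait_for_response`, and the `serve` callback are all hypothetical names, and a deque stands in for the socket. An empty inbox is treated as a timeout purely to keep the example finite.

```python
from collections import deque

def wait_for_response(inbox, want_id, serve):
    """Wait for the response tagged want_id, serving requests that arrive first.

    inbox   -- deque of (kind, msg_id, payload) tuples; stands in for the socket
    serve   -- callback that answers an incoming request (hypothetical)
    """
    while inbox:
        kind, msg_id, payload = inbox.popleft()
        if kind == "response" and msg_id == want_id:
            return payload
        if kind == "request":
            # The peer may be unable to answer us until we answer it
            # (the A -> B -> A case), so service its request now.
            serve(msg_id, payload)
    raise TimeoutError("no response for request %d" % want_id)

# Usage: the peer's request (id 7) arrives before the response we want (id 1).
registers = {0x10: 0x99}
sent = []
inbox = deque([("request", 7, 0x10), ("response", 1, 0x42)])
result = wait_for_response(
    inbox, 1, lambda msg_id, addr: sent.append(("response", msg_id, registers[addr]))
)
```

A wait loop that ignored everything except the matching response would leave the peer's request unanswered, which is the shape of the deadlock described above.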
> > 
> > Stefan, thank you for the quick response and great information!
> > 
> > I'm not sure if this is the best way, but I was able to get things
> > working today using the coroutine approach.
> > 
> > Now, the aforementioned stack looks like this:
> > 
> > IOReadHandler callback receives request
> >   enters coroutine
> >     calls address_space_ldq_be()
> >       resolves to mmio read op of a remote device
> >         sends request over socket
> >         detects coroutine context and
> >         calls qemu_coroutine_yield() instead of blocking
> >   returns to callback
> > 
> > <time passes>
> > 
> > IOReadHandler callback receives response
> >   re-enters coroutine
> >         mmio read op returns data received in response message
> >     address_space_ldq_be() returns
> >   coroutine completes and returns to callback
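[Editor's sketch] The yield and re-enter flow in the two stacks above can be modelled with a Python generator standing in for the coroutine. This is only an illustration of the control flow, not the QEMU implementation: `service_request`, `on_readable`, `run`, and the message tuples are hypothetical, and a real handler would track one pending coroutine per outstanding request rather than a single slot.

```python
def service_request(addr, outbox):
    """Coroutine body: service one MMIO read that targets a remote device."""
    outbox.append(("read_req", addr))         # forward the read to the remote peer
    data = yield                              # suspend here instead of blocking
    outbox.append(("read_resp", addr, data))  # answer the original requester

def on_readable(msg, state, outbox):
    """Stand-in for the socket read handler; it returns without blocking."""
    if msg[0] == "request":
        co = service_request(msg[1], outbox)
        next(co)                  # enter the coroutine; runs until it yields
        state["pending"] = co     # remember it so the response can resume it
    elif msg[0] == "response":
        co = state.pop("pending")
        try:
            co.send(msg[1])       # re-enter the coroutine with the data
        except StopIteration:
            pass                  # coroutine ran to completion

def run(inbox):
    """Deliver each message to the handler, as an event loop would."""
    state, outbox = {}, []
    for msg in inbox:
        on_readable(msg, state, outbox)
    return outbox
```

Calling `run([("request", 0x1000), ("response", 0xABCD)])` produces the forwarded read followed by the reply carrying `0xABCD`; the handler returns to the loop between the two messages, which is the "time passes" step in the flow.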
> > 
> > While this works, I couldn't help but notice that the coroutine concept
> > seems to be like a form of multithreading.  Is there some advantage to
> > using coroutines over doing the work in another thread?  Does QEMU
> > offer an interface that allows for a callback to queue up work that can
> > be handled by another thread or a pool of threads?
> 
> Coroutines make it easier to write concurrent code in an event loop.
> The alternative is to write asynchronous callback functions, which is
> tedious for sequences with multiple steps that need to wait for I/O.
> 
> Coroutines do not offer parallelism, so they are not a replacement for
> multi-threading. QEMU is mostly event-driven rather than
> multi-threaded. Usually only computation in QEMU that really needs its
> own CPU runs in its own thread (vCPUs, compression, blocking syscalls
> when there is no alternative, etc).
> 
> There are advantages to using coroutines: less synchronization is
> necessary than with threads (you can be sure no other coroutine will
> run in the same thread while your code is running) and this eliminates
> most thread-safety issues. Also, event loops are seen as more scalable
> than threads (lots of historical resources, for example
> http://www.kegel.com/c10k.html). One QEMU-specific advantage of
> coroutines: coroutine code has access to all of QEMU's APIs that
> require the event loop whereas threads need to take extra steps to
> interact with the rest of QEMU.
> 
> Stefan
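[Editor's sketch] The synchronization point quoted above (no other coroutine runs in the same thread while your code is running) can be illustrated with two generator-based coroutines sharing a counter. This is a toy round-robin scheduler in Python, assumed purely for illustration; `worker` and `run_round_robin` are hypothetical names and nothing here is a QEMU API.

```python
def worker(shared, n):
    """Increment a shared counter n times, yielding after each update."""
    for _ in range(n):
        value = shared["count"]       # read
        shared["count"] = value + 1   # write; nothing else ran in between
        yield                         # the only point where control can switch

def run_round_robin(coroutines):
    """Run coroutines one step at a time on a single thread until all finish."""
    ready = list(coroutines)
    while ready:
        co = ready.pop(0)
        try:
            next(co)
            ready.append(co)          # still has work; schedule it again
        except StopIteration:
            pass                      # finished; drop it

shared = {"count": 0}
run_round_robin([worker(shared, 1000), worker(shared, 1000)])
```

The read-modify-write needs no lock because control only changes hands at `yield`. The same two workers on preemptive threads could lose updates without one. The sketch also shows the limitation: the workers interleave but never run in parallel.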

Thanks for the explanation, Stefan!

Glenn

Thread overview: 5+ messages
2025-03-20 16:34 Best practice for issuing blocking calls in response to an event Miles Glenn
2025-03-20 20:09 ` Stefan Hajnoczi
2025-03-21 15:17   ` Miles Glenn
2025-03-24 18:35     ` Stefan Hajnoczi
2025-03-25 15:08       ` Miles Glenn [this message]
