* [Qemu-devel] Linux multiqueue block layer thoughts
@ 2013-11-27 10:16 Stefan Hajnoczi
From: Stefan Hajnoczi @ 2013-11-27 10:16 UTC (permalink / raw)
To: qemu-devel; +Cc: Kevin Wolf, Jens Axboe, Michael Roth, Paolo Bonzini
I finally got around to reading the Linux multiqueue block layer paper
and wanted to share some thoughts about how it relates to QEMU and
dataplane/QContext:
http://kernel.dk/blk-mq.pdf
I think Jens has virtio-blk multiqueue patches. So let's imagine that
the virtio-blk device has multiple virtqueues. (virtio-scsi is
already multiqueue BTW.)
The paper focusses on two queue mappings: 1 queue per core and 1 queue
per node. In both cases the idea is to keep the block I/O code path
localized. This makes block I/O scale as the number of CPUs
increases.
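For what it's worth, the two mappings can be sketched roughly like this
(illustrative Python, not QEMU or kernel code; the function name and
shape are made up):

```python
def make_queue_map(nr_cpus, cpus_per_node, per_node=False):
    """Return queue_map[cpu] -> queue index for each CPU."""
    if per_node:
        # 1 queue per node: all CPUs on a node share one queue.
        return [cpu // cpus_per_node for cpu in range(nr_cpus)]
    # 1 queue per core: every CPU submits to its own queue.
    return list(range(nr_cpus))

# 8 CPUs, 4 per node:
print(make_queue_map(8, 4))        # [0, 1, 2, 3, 4, 5, 6, 7]
print(make_queue_map(8, 4, True))  # [0, 0, 0, 0, 1, 1, 1, 1]
```

Either way a CPU only ever touches its own queue's state, which is what
keeps the submission path free of lock and cache-line contention.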
In QEMU we'd want to set up a mapping for the virtio-blk mq device:
each guest vcpu or guest node has a virtio-blk virtqueue which is
serviced by a dataplane/QContext thread.
QEMU would then process requests across these queues in parallel,
although BlockDriverState is currently not thread-safe. At least for
the raw format we should be able to submit requests in parallel from
QEMU.
Unfortunately there are some complications in the QEMU block layer:
QEMU's own accounting, request tracking, and throttling features are
global. We'd eventually need to do something similar to the multiqueue
block layer changes in the kernel to untangle this state.
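To make the idea concrete, here is the kind of sharding I mean, in
throwaway Python rather than the actual QEMU structures:

```python
class PerQueueStats:
    """Per-queue counters instead of one global, lock-protected set."""

    def __init__(self, nr_queues):
        # Each queue owns its counters; the hot path takes no shared lock.
        self.reads = [0] * nr_queues
        self.writes = [0] * nr_queues

    def account(self, queue, is_write, count=1):
        if is_write:
            self.writes[queue] += count
        else:
            self.reads[queue] += count

    def totals(self):
        # Aggregate only on the slow path, e.g. when stats are queried.
        return sum(self.reads), sum(self.writes)

stats = PerQueueStats(4)
stats.account(0, is_write=False)
stats.account(3, is_write=True, count=2)
print(stats.totals())  # (1, 2)
```

Throttling is harder because it's a policy spanning all queues rather
than a counter you can merge after the fact, but the same
per-queue-first structure should apply.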
Doing multiqueue for image formats is much more challenging - we'd
have to tackle thread-safety in qcow2 and friends. For network block
drivers like Gluster or NBD it's also not 100% clear what the best
approach is. But I think the target here is local SSDs that are
capable of high IOPS together with an SMP guest.
At the end of all this we'd arrive at the following architecture:
1. Guest virtio device has multiple queues (1 per node or vcpu).
2. QEMU has multiple dataplane/QContext threads that process virtqueue
kicks, they are bound to host CPUs/nodes.
3. Linux kernel has multiqueue block I/O.
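Point 2 in miniature, with plain threads and queues standing in for
dataplane threads and virtqueues (CPU binding omitted; on Linux the
threads would additionally pin themselves, e.g. via sched_setaffinity):

```python
import queue
import threading

def worker(q, completed):
    # Each thread drains exactly one queue; requests never cross threads.
    for req in iter(q.get, None):   # None is the shutdown sentinel
        completed.append(req)       # stand-in for actually submitting I/O

queues = [queue.Queue() for _ in range(2)]
done = [[] for _ in range(2)]
threads = [threading.Thread(target=worker, args=(q, d))
           for q, d in zip(queues, done)]
for t in threads:
    t.start()

queues[0].put("req-a")              # guest kick on virtqueue 0
queues[1].put("req-b")              # guest kick on virtqueue 1
for q in queues:
    q.put(None)                     # shut the workers down
for t in threads:
    t.join()
print(done)  # [['req-a'], ['req-b']]
```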
Jens: when experimenting with multiqueue virtio-blk, how far did you
modify QEMU to eliminate global request processing state from block.c?
Stefan
* Re: [Qemu-devel] Linux multiqueue block layer thoughts
From: Jens Axboe @ 2013-11-28 2:15 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: Kevin Wolf, Paolo Bonzini, qemu-devel, Michael Roth
On Wed, Nov 27 2013, Stefan Hajnoczi wrote:
> I finally got around to reading the Linux multiqueue block layer paper
> and wanted to share some thoughts about how it relates to QEMU and
> dataplane/QContext:
> http://kernel.dk/blk-mq.pdf
>
> I think Jens has virtio-blk multiqueue patches. So let's imagine that
> the virtio-blk device has multiple virtqueues. (virtio-scsi is
> already multiqueue BTW.)
>
> The paper focusses on two queue mappings: 1 queue per core and 1 queue
> per node. In both cases the idea is to keep the block I/O code path
> localized. This makes block I/O scale as the number of CPUs
> increases.
>
> In QEMU we'd want to set up a mapping for the virtio-blk mq device:
> each guest vcpu or guest node has a virtio-blk virtqueue which is
> serviced by a dataplane/QContext thread.
>
> QEMU would then process requests across these queues in parallel,
> although BlockDriverState is currently not thread-safe. At least for
> the raw format we should be able to submit requests in parallel from
> QEMU.
>
> Unfortunately there are some complications in the QEMU block layer:
> QEMU's own accounting, request tracking, and throttling features are
> global. We'd eventually need to do something similar to the multiqueue
> block layer changes in the kernel to untangle this state.
>
> Doing multiqueue for image formats is much more challenging - we'd
> have to tackle thread-safety in qcow2 and friends. For network block
> drivers like Gluster or NBD it's also not 100% clear what the best
> approach is. But I think the target here is local SSDs that are
> capable of high IOPS together with an SMP guest.
>
> At the end of all this we'd arrive at the following architecture:
> 1. Guest virtio device has multiple queues (1 per node or vcpu).
> 2. QEMU has multiple dataplane/QContext threads that process virtqueue
> kicks, they are bound to host CPUs/nodes.
> 3. Linux kernel has multiqueue block I/O.
I think that sounds very reasonable. Let me know if there's anything you
need help or advice with.
> Jens: when experimenting with multiqueue virtio-blk, how far did you
> modify QEMU to eliminate global request processing state from block.c?
I did very little scaling testing on virtio-blk; it was more a demo case
for the conversion than anything else. So it's probably not of much use
for what you are looking for...
--
Jens Axboe
* Re: [Qemu-devel] Linux multiqueue block layer thoughts
From: Stefan Hajnoczi @ 2013-12-03 14:55 UTC (permalink / raw)
To: Jens Axboe; +Cc: Kevin Wolf, Paolo Bonzini, qemu-devel, Michael Roth
On Wed, Nov 27, 2013 at 07:15:13PM -0700, Jens Axboe wrote:
> On Wed, Nov 27 2013, Stefan Hajnoczi wrote:
> > At the end of all this we'd arrive at the following architecture:
> > 1. Guest virtio device has multiple queues (1 per node or vcpu).
> > 2. QEMU has multiple dataplane/QContext threads that process virtqueue
> > kicks, they are bound to host CPUs/nodes.
> > 3. Linux kernel has multiqueue block I/O.
>
> I think that sounds very reasonable. Let me know if there's anything you
> need help or advice with.
>
> > Jens: when experimenting with multiqueue virtio-blk, how far did you
> > modify QEMU to eliminate global request processing state from block.c?
>
> I did very little scaling testing on virtio-blk; it was more a demo case
> for the conversion than anything else. So it's probably not of much use
> for what you are looking for...
Okay, thanks. It will be a while before the whole stack supports
multiqueue but it's good to know this approach sounds reasonable.
Stefan