qemu-devel.nongnu.org archive mirror
* [Qemu-devel] Dataplane and vhost-blk
@ 2013-03-05 14:18 Benoît Canet
  2013-03-05 15:59 ` Stefan Hajnoczi
  0 siblings, 1 reply; 5+ messages in thread
From: Benoît Canet @ 2013-03-05 14:18 UTC (permalink / raw)
  To: qemu-devel


Hello,

I am looking for a way to help improve QEMU block performance.

APIC-V is a work in progress, and the two options with public code are vhost-*
and virtio-blk-dataplane.

The two approaches look very similar: each bypasses the QEMU global lock and
dedicates a thread to each emulated virtio block device.

vhost-* is in kernel while dataplane is in qemu.

Performance seems similar.

Dataplane seems to be a demonstrator, to be replaced eventually by a
thread-friendly evolution of the QEMU block layer, while vhost-blk is not
upstream yet.

This leaves me with the following questions:

Do dataplane and vhost-blk serve the same purpose (speed), even though both
are pushed by the same company (Red Hat)?

What is the best path I can take to help improve QEMU block performance?

Best regards

Benoît


* Re: [Qemu-devel] Dataplane and vhost-blk
  2013-03-05 14:18 [Qemu-devel] Dataplane and vhost-blk Benoît Canet
@ 2013-03-05 15:59 ` Stefan Hajnoczi
  2013-03-05 20:46   ` Benoît Canet
  2013-03-06  3:23   ` Liu Yuan
  0 siblings, 2 replies; 5+ messages in thread
From: Stefan Hajnoczi @ 2013-03-05 15:59 UTC (permalink / raw)
  To: Benoît Canet; +Cc: qemu-devel

On Tue, Mar 5, 2013 at 3:18 PM, Benoît Canet <benoit.canet@irqsave.net> wrote:
> I am looking for a way to help improve QEMU block performance.
>
> APIC-V is a work in progress, and the two options with public code are vhost-*
> and virtio-blk-dataplane.
>
> The two approaches look very similar: each bypasses the QEMU global lock and
> dedicates a thread to each emulated virtio block device.
>
> vhost-* is in kernel while dataplane is in qemu.

Yes, they take a similar approach.  The main difference is using a
vhost kernel thread versus a QEMU userspace thread.

> Performance seems similar.
>
> Dataplane seems to be a demonstrator, to be replaced eventually by a
> thread-friendly evolution of the QEMU block layer, while vhost-blk is not
> upstream yet.
>
> This leaves me with the following questions:
>
> Do dataplane and vhost-blk serve the same purpose (speed), even though both
> are pushed by the same company (Red Hat)?

Both approaches tackle high IOPS scalability.  Both approaches were
prototyped over a period of 1 or 2 years.  They are not associated
with just one contributor or company - vhost_blk and virtio-blk data
plane were pushed along by various folks as time went on.  vhost_blk
had at least two independent implementations :).

Since they were relatively long-term efforts, the overlap or
duplication was actually good.  It allowed comparisons and both
approaches benefitted from competition.

> What is the best path I can take to help improve QEMU block performance?

You need to set a more specific goal.  Some questions to get started:
 * Which workloads do you care about and what are their
characteristics (sequential or random I/O, queue depth)?
 * Do you care about 1 vcpu guests or 4+ vcpu guests?  (SMP scalability)
 * Are you using an image format?

Once you have decided what needs to be improved it should be easier to
figure out what to work on.

I haven't run latency tracing on the full stack recently.  The goal is
to match host latency, but we have an overhead due to virtio-blk and
QEMU block I/O.  Many changes have been made, like the introduction of
coroutines, since I posted measurements on the KVM wiki
(http://www.linux-kvm.org/page/Virtio/Block/Latency).  Perhaps this is
an area you care about?

Stefan


* Re: [Qemu-devel] Dataplane and vhost-blk
  2013-03-05 15:59 ` Stefan Hajnoczi
@ 2013-03-05 20:46   ` Benoît Canet
  2013-03-06 10:08     ` Stefan Hajnoczi
  2013-03-06  3:23   ` Liu Yuan
  1 sibling, 1 reply; 5+ messages in thread
From: Benoît Canet @ 2013-03-05 20:46 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Benoît Canet, qemu-devel

> You need to set a more specific goal.  Some questions to get started:
>  * Which workloads do you care about and what are their
> characteristics (sequential or random I/O, queue depth)?
>  * Do you care about 1 vcpu guests or 4+ vcpu guests?  (SMP scalability)
>  * Are you using an image format?

The usage would be a typical HPC workload: 4 vcpus in SMP and random I/O on raw
devices.

Benoît


* Re: [Qemu-devel] Dataplane and vhost-blk
  2013-03-05 15:59 ` Stefan Hajnoczi
  2013-03-05 20:46   ` Benoît Canet
@ 2013-03-06  3:23   ` Liu Yuan
  1 sibling, 0 replies; 5+ messages in thread
From: Liu Yuan @ 2013-03-06  3:23 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Benoît Canet, qemu-devel

On 03/05/2013 11:59 PM, Stefan Hajnoczi wrote:
>> I am looking for a way to help improve QEMU block performance.
>>
>> APIC-V is a work in progress, and the two options with public code are vhost-*
>> and virtio-blk-dataplane.
>>
>> The two approaches look very similar: each bypasses the QEMU global lock and
>> dedicates a thread to each emulated virtio block device.
>>
>> vhost-* is in kernel while dataplane is in qemu.
>
> Yes, they take a similar approach.  The main difference is using a
> vhost kernel thread versus a QEMU userspace thread.

Another merit of blk-dataplane over in-kernel vhost_blk that I can think
of is that various underlying protocols, such as Sheepdog, would benefit
from it without adding code, assuming the final goal of blk-dataplane
(integration into the QEMU block layer) is fully achieved.  For
vhost_blk, only a local backing file will benefit without adding more
code to the kernel.

Thanks,
Yuan


* Re: [Qemu-devel] Dataplane and vhost-blk
  2013-03-05 20:46   ` Benoît Canet
@ 2013-03-06 10:08     ` Stefan Hajnoczi
  0 siblings, 0 replies; 5+ messages in thread
From: Stefan Hajnoczi @ 2013-03-06 10:08 UTC (permalink / raw)
  To: Benoît Canet; +Cc: qemu-devel

On Tue, Mar 05, 2013 at 09:46:30PM +0100, Benoît Canet wrote:
> > You need to set a more specific goal.  Some questions to get started:
> >  * Which workloads do you care about and what are their
> > characteristics (sequential or random I/O, queue depth)?
> >  * Do you care about 1 vcpu guests or 4+ vcpu guests?  (SMP scalability)
> >  * Are you using an image format?
> 
> The usage would be a typical HPC workload: 4 vcpus in SMP and random I/O on raw
> devices.

Okay.  If you want to do performance analysis on the existing stack,
then the next step is to choose a benchmark that represents this
workload.  Then you can collect profiles and see where there is room for
improvement.
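For the workload described above, an fio job along these lines could serve as the representative benchmark (a sketch: the device path is a placeholder and block size, depth, and runtime would need tuning to match the real application):

```ini
; hypothetical fio job matching the described HPC workload
[global]
ioengine=libaio
direct=1
rw=randrw
bs=4k
iodepth=32
runtime=60
time_based

[hpc-random]
filename=/dev/vdb   ; placeholder: raw virtio-blk device inside the guest
numjobs=4           ; one job per vcpu, to stress SMP scalability
group_reporting
```

Running the same job on the host device gives the baseline to profile against.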

In virtio-blk data plane world I'm currently converting core QEMU code
to support multiple AioContexts.  This is needed in order to use
BlockDriverStates from threads (without holding the global mutex).  This
is a pretty linear task.

The second half of this work is enabling device emulation code to run in
threads without the global mutex.  The trickiest thing here is probably
guest memory access: making guest memory access and DMA work safely
from a thread.  I haven't really started on this; Ping Fan Liu has
worked on it in the past, so you could chat with him to find out the
current status.

Stefan

