Date: Thu, 22 Nov 2012 13:16:52 +0100
From: Stefan Hajnoczi
Message-ID: <20121122121652.GE13571@stefanha-thinkpad.redhat.com>
References: <1352992746-8767-1-git-send-email-stefanha@redhat.com> <50AB470F.7050408@redhat.com> <50AC650E.2080207@redhat.com>
In-Reply-To: <50AC650E.2080207@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 0/7] virtio: virtio-blk data plane
To: Asias He
Cc: Kevin Wolf, Anthony Liguori, "Michael S. Tsirkin", Stefan Hajnoczi,
 qemu-devel, Khoa Huynh, Paolo Bonzini

On Wed, Nov 21, 2012 at 01:22:22PM +0800, Asias He wrote:
> On 11/20/2012 08:21 PM, Stefan Hajnoczi wrote:
> > On Tue, Nov 20, 2012 at 10:02 AM, Asias He wrote:
> >> Hello Stefan,
> >>
> >> On 11/15/2012 11:18 PM, Stefan Hajnoczi wrote:
> >>> This series adds the -device virtio-blk-pci,x-data-plane=on property
> >>> that enables a high performance I/O codepath.  A dedicated thread is
> >>> used to process virtio-blk requests outside the global mutex and
> >>> without going through the QEMU block layer.
> >>>
> >>> Khoa Huynh reported an increase from 140,000 IOPS to 600,000 IOPS
> >>> for a single VM using virtio-blk-data-plane in July:
> >>>
> >>>   http://comments.gmane.org/gmane.comp.emulators.kvm.devel/94580
> >>>
> >>> The virtio-blk-data-plane approach was originally presented at Linux
> >>> Plumbers Conference 2010.  The following slides contain a brief
> >>> overview:
> >>>
> >>>   http://linuxplumbersconf.org/2010/ocw/system/presentations/651/original/Optimizing_the_QEMU_Storage_Stack.pdf
> >>>
> >>> The basic approach is:
> >>> 1. Each virtio-blk device has a thread dedicated to handling ioeventfd
> >>>    signalling when the guest kicks the virtqueue.
> >>> 2. Requests are processed without going through the QEMU block layer
> >>>    using Linux AIO directly.
> >>> 3. Completion interrupts are injected via irqfd from the dedicated
> >>>    thread.
> >>>
> >>> To try it out:
> >>>
> >>>   qemu -drive if=none,id=drive0,cache=none,aio=native,format=raw,file=...
> >>>        -device virtio-blk-pci,drive=drive0,scsi=off,x-data-plane=on
> >>
> >> Is this the latest dataplane bits:
> >> (git://github.com/stefanha/qemu.git virtio-blk-data-plane)
> >>
> >> commit 7872075c24fa01c925d4f41faa9d04ce69bf5328
> >> Author: Stefan Hajnoczi
> >> Date:   Wed Nov 14 15:45:38 2012 +0100
> >>
> >>     virtio-blk: add x-data-plane=on|off performance feature
> >>
> >> With this commit on a ramdisk based box, I am seeing about 10K IOPS
> >> with x-data-plane on and 90K IOPS with x-data-plane off.
> >>
> >> Any ideas?
> >>
> >> Command line I used:
> >>
> >> IMG=/dev/ram0
> >> x86_64-softmmu/qemu-system-x86_64 \
> >> -drive file=/root/img/sid.img,if=ide \
> >> -drive file=${IMG},if=none,cache=none,aio=native,id=disk1 \
> >> -device virtio-blk-pci,x-data-plane=off,drive=disk1,scsi=off \
> >> -kernel $KERNEL -append "root=/dev/sdb1 console=tty0" \
> >> -L /tmp/qemu-dataplane/share/qemu/ -nographic -vnc :0 -enable-kvm \
> >> -m 2048 -smp 4 -cpu qemu64,+x2apic -M pc
> >
> > I was just about to send out the latest patch series, which addresses
> > the review comments, so I have tested the latest code
> > (61b70fef489ce51ecd18d69afb9622c110b9315c).
> >
> > I was unable to reproduce a ramdisk performance regression on Linux
> > 3.6.6-3.fc18.x86_64 with an Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz
> > and 8 GB RAM.
>
> I am using the latest upstream kernel.
>
> > The ramdisk is 4 GB and I used your QEMU command line with a RHEL 6.3
> > guest.
> >
> > Summary results:
> > x-data-plane-on:  iops=132856 aggrb=1039.1MB/s
> > x-data-plane-off: iops=126236 aggrb=988.40MB/s
> >
> > virtio-blk-data-plane is ~5% faster in this benchmark.
> >
> > fio jobfile:
> > [global]
> > filename=/dev/vda
> > blocksize=8k
> > ioengine=libaio
> > direct=1
> > iodepth=8
> > runtime=120
> > time_based=1
> >
> > [reads]
> > readwrite=randread
> > numjobs=4
> >
> > Perf top (data-plane-on):
> >   3.71%  [kvm]               [k] kvm_arch_vcpu_ioctl_run
> >   3.27%  [kernel]            [k] memset                     <--- ramdisk
> >   2.98%  [kernel]            [k] do_blockdev_direct_IO
> >   2.82%  [kvm_intel]         [k] vmx_vcpu_run
> >   2.66%  [kernel]            [k] _raw_spin_lock_irqsave
> >   2.06%  [kernel]            [k] put_compound_page
> >   2.06%  [kernel]            [k] __get_page_tail
> >   1.83%  [i915]              [k] __gen6_gt_force_wake_mt_get
> >   1.75%  [kernel]            [k] _raw_spin_unlock_irqrestore
> >   1.33%  qemu-system-x86_64  [.] vring_pop                  <--- virtio-blk-data-plane
> >   1.19%  [kernel]            [k] compound_unlock_irqrestore
> >   1.13%  [kernel]            [k] gup_huge_pmd
> >   1.11%  [kernel]            [k] __audit_syscall_exit
> >   1.07%  [kernel]            [k] put_page_testzero
> >   1.01%  [kernel]            [k] fget
> >   1.01%  [kernel]            [k] do_io_submit
> >
> > Since the ramdisk (memset and page-related functions) is so prominent
> > in perf top, I also tried a 1-job 8k dd sequential write test on a
> > Samsung 830 Series SSD, where virtio-blk-data-plane was 9% faster than
> > virtio-blk.  Optimizing against ramdisk isn't a good idea IMO because
> > it acts very differently from real hardware, where the driver relies
> > on mmio, DMA, and interrupts (vs synchronous memcpy/memset).
>
> For the memset in the ramdisk, you can simply patch drivers/block/brd.c
> to do a nop instead of the memset for testing.
>
> Yes, if you have a fast SSD device (sometimes you need multiple, which
> I do not have), it makes more sense to test on real hardware.  However,
> a ramdisk test is still useful: it gives rough performance numbers, and
> if A and B are both tested against the ramdisk, the difference between
> A and B is still meaningful.

Optimizing the difference between A and B on ramdisk is only guaranteed
to optimize the ramdisk case.  On real hardware the bottleneck might be
elsewhere and we'd be chasing the wrong lead.

I don't think it's a waste of time, but I think to stay healthy we need
to focus on real disks and SSDs most of the time.

Stefan
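
P.S. For anyone skimming the thread who hasn't read the hw/dataplane
patches, the approach described at the top (per-device thread, ioeventfd
kick, Linux AIO submission, irqfd completion) boils down to roughly the
loop below.  This is a simplified, self-contained sketch with placeholder
names and all setup (io_setup(), eventfd creation, KVM ioeventfd/irqfd
registration) assumed to have happened elsewhere; it is not the actual
QEMU code:

  /*
   * Sketch: a dedicated thread waits on an ioeventfd for guest kicks,
   * submits I/O with Linux AIO, and injects completion interrupts
   * through an irqfd.  Virtqueue handling is omitted.
   */
  #include <libaio.h>          /* io_submit, io_getevents */
  #include <stdint.h>
  #include <unistd.h>

  struct dataplane {
      int ioeventfd;           /* signalled by KVM when the guest kicks the vq */
      int irqfd;               /* written by us to raise a guest interrupt */
      io_context_t aio_ctx;    /* Linux AIO context for the backing file/device */
  };

  static void *dataplane_thread(void *opaque)
  {
      struct dataplane *s = opaque;
      struct io_event events[128];
      uint64_t val;

      for (;;) {
          /* 1. Block until the guest kicks the virtqueue (ioeventfd). */
          if (read(s->ioeventfd, &val, sizeof(val)) != sizeof(val)) {
              continue;
          }

          /*
           * 2. Pop requests from the vring and submit them directly with
           *    Linux AIO, bypassing the QEMU block layer (omitted here):
           *
           *      struct iocb *iocbs[] = { ... };
           *      io_submit(s->aio_ctx, nr, iocbs);
           */

          /* 3. Reap completions and inject the interrupt via irqfd. */
          int nr = io_getevents(s->aio_ctx, 0, 128, events, NULL);
          if (nr > 0) {
              uint64_t one = 1;
              /* ...fill in used ring entries for the nr completions... */
              write(s->irqfd, &one, sizeof(one));
          }
      }
      return NULL;
  }

The real series also has to parse and complete vring descriptors and
handle errors; the sketch only shows how ioeventfd, Linux AIO, and irqfd
fit together without ever taking the global mutex.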