From: Asias He <asias@redhat.com>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
Anthony Liguori <aliguori@us.ibm.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
qemu-devel <qemu-devel@nongnu.org>, Khoa Huynh <khoa@us.ibm.com>,
Stefan Hajnoczi <stefanha@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 0/7] virtio: virtio-blk data plane
Date: Wed, 21 Nov 2012 13:22:22 +0800
Message-ID: <50AC650E.2080207@redhat.com>
In-Reply-To: <CAJSP0QXx5VVCU+zs-N1J5g7t1DkC5k7+S35pmfWGzytP1GL0Tg@mail.gmail.com>
On 11/20/2012 08:21 PM, Stefan Hajnoczi wrote:
> On Tue, Nov 20, 2012 at 10:02 AM, Asias He <asias@redhat.com> wrote:
>> Hello Stefan,
>>
>> On 11/15/2012 11:18 PM, Stefan Hajnoczi wrote:
>>> This series adds the -device virtio-blk-pci,x-data-plane=on property that
>>> enables a high performance I/O codepath. A dedicated thread is used to process
>>> virtio-blk requests outside the global mutex and without going through the QEMU
>>> block layer.
>>>
>>> Khoa Huynh <khoa@us.ibm.com> reported an increase from 140,000 IOPS to 600,000
>>> IOPS for a single VM using virtio-blk-data-plane in July:
>>>
>>> http://comments.gmane.org/gmane.comp.emulators.kvm.devel/94580
>>>
>>> The virtio-blk-data-plane approach was originally presented at Linux Plumbers
>>> Conference 2010. The following slides contain a brief overview:
>>>
>>> http://linuxplumbersconf.org/2010/ocw/system/presentations/651/original/Optimizing_the_QEMU_Storage_Stack.pdf
>>>
>>> The basic approach is:
>>> 1. Each virtio-blk device has a thread dedicated to handling ioeventfd
>>> signalling when the guest kicks the virtqueue.
>>> 2. Requests are processed without going through the QEMU block layer using
>>> Linux AIO directly.
>>> 3. Completion interrupts are injected via irqfd from the dedicated thread.
>>>
>>> To try it out:
>>>
>>> qemu -drive if=none,id=drive0,cache=none,aio=native,format=raw,file=...
>>> -device virtio-blk-pci,drive=drive0,scsi=off,x-data-plane=on
>>
>>
>> Is this the latest dataplane bits:
>> (git://github.com/stefanha/qemu.git virtio-blk-data-plane)
>>
>> commit 7872075c24fa01c925d4f41faa9d04ce69bf5328
>> Author: Stefan Hajnoczi <stefanha@redhat.com>
>> Date: Wed Nov 14 15:45:38 2012 +0100
>>
>> virtio-blk: add x-data-plane=on|off performance feature
>>
>>
>> With this commit on a ramdisk based box, I am seeing about 10K IOPS with
>> x-data-plane on and 90K IOPS with x-data-plane off.
>>
>> Any ideas?
>>
>> Command line I used:
>>
>> IMG=/dev/ram0
>> x86_64-softmmu/qemu-system-x86_64 \
>> -drive file=/root/img/sid.img,if=ide \
>> -drive file=${IMG},if=none,cache=none,aio=native,id=disk1 \
>> -device virtio-blk-pci,x-data-plane=off,drive=disk1,scsi=off \
>> -kernel $KERNEL -append "root=/dev/sdb1 console=tty0" \
>> -L /tmp/qemu-dataplane/share/qemu/ -nographic -vnc :0 -enable-kvm \
>> -m 2048 -smp 4 -cpu qemu64,+x2apic -M pc
>
> Was just about to send out the latest patch series which addresses
> review comments, so I have tested the latest code
> (61b70fef489ce51ecd18d69afb9622c110b9315c).
>
> I was unable to reproduce a ramdisk performance regression on Linux
> 3.6.6-3.fc18.x86_64 with Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz with
> 8 GB RAM.
I am using the latest upstream kernel.
> The ramdisk is 4 GB and I used your QEMU command-line with a RHEL 6.3 guest.
>
> Summary results:
> x-data-plane-on: iops=132856 aggrb=1039.1MB/s
> x-data-plane-off: iops=126236 aggrb=988.40MB/s
>
> virtio-blk-data-plane is ~5% faster in this benchmark.
>
> fio jobfile:
> [global]
> filename=/dev/vda
> blocksize=8k
> ioengine=libaio
> direct=1
> iodepth=8
> runtime=120
> time_based=1
>
> [reads]
> readwrite=randread
> numjobs=4
>
> Perf top (data-plane-on):
> 3.71% [kvm] [k] kvm_arch_vcpu_ioctl_run
> 3.27% [kernel] [k] memset <--- ramdisk
> 2.98% [kernel] [k] do_blockdev_direct_IO
> 2.82% [kvm_intel] [k] vmx_vcpu_run
> 2.66% [kernel] [k] _raw_spin_lock_irqsave
> 2.06% [kernel] [k] put_compound_page
> 2.06% [kernel] [k] __get_page_tail
> 1.83% [i915] [k] __gen6_gt_force_wake_mt_get
> 1.75% [kernel] [k] _raw_spin_unlock_irqrestore
> 1.33% qemu-system-x86_64 [.] vring_pop <--- virtio-blk-data-plane
> 1.19% [kernel] [k] compound_unlock_irqrestore
> 1.13% [kernel] [k] gup_huge_pmd
> 1.11% [kernel] [k] __audit_syscall_exit
> 1.07% [kernel] [k] put_page_testzero
> 1.01% [kernel] [k] fget
> 1.01% [kernel] [k] do_io_submit
>
> Since the ramdisk (memset and page-related functions) is so prominent
> in perf top, I also tried a 1-job 8k dd sequential write test on a
> Samsung 830 Series SSD where virtio-blk-data-plane was 9% faster than
> virtio-blk. Optimizing against ramdisk isn't a good idea IMO because
> it acts very differently from real hardware where the driver relies on
> mmio, DMA, and interrupts (vs synchronous memcpy/memset).
For the memset in the ramdisk, you can simply patch drivers/block/brd.c to
do a nop instead of the memset for testing.
Yes, if you have a fast SSD device (sometimes you need several, which I
do not have), it makes more sense to test on real hardware. However, a
ramdisk test is still useful: it gives rough performance numbers. If A
and B are both tested against a ramdisk, the difference between A and B
is still meaningful.
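To illustrate the A-versus-B comparison, here is a minimal sketch that computes the relative difference from the two summary IOPS figures quoted earlier in this message. The helper name `relative_gain` is made up for this example and is not part of fio or QEMU.

```python
def relative_gain(baseline: float, candidate: float) -> float:
    """Return the gain of candidate over baseline, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (candidate - baseline) / baseline * 100.0


# Summary IOPS from the fio runs quoted in this message.
iops_off = 126236  # x-data-plane=off
iops_on = 132856   # x-data-plane=on

gain = relative_gain(iops_off, iops_on)
print(f"x-data-plane=on vs off: {gain:+.1f}% IOPS")
```

With those two numbers the result is a little over 5%, consistent with the "~5% faster" summary. The same calculation works for any pair of runs, as long as both use the same backing device and jobfile.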
> Full results:
> $ cat data-plane-off
> reads: (g=0): rw=randread, bs=8K-8K/8K-8K, ioengine=libaio, iodepth=8
> ...
> reads: (g=0): rw=randread, bs=8K-8K/8K-8K, ioengine=libaio, iodepth=8
> fio 1.57
> Starting 4 processes
>
> reads: (groupid=0, jobs=1): err= 0: pid=1851
> read : io=29408MB, bw=250945KB/s, iops=31368 , runt=120001msec
> slat (usec): min=2 , max=27829 , avg=11.06, stdev=78.05
> clat (usec): min=1 , max=28028 , avg=241.41, stdev=388.47
> lat (usec): min=33 , max=28035 , avg=253.17, stdev=396.66
> bw (KB/s) : min=197141, max=335365, per=24.78%, avg=250797.02,
> stdev=29376.35
> cpu : usr=6.55%, sys=31.34%, ctx=310932, majf=0, minf=41
> IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
> submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
> complete : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
> issued r/w/d: total=3764202/0/0, short=0/0/0
> lat (usec): 2=0.01%, 4=0.01%, 20=0.01%, 50=1.78%, 100=27.11%
> lat (usec): 250=38.97%, 500=27.11%, 750=2.09%, 1000=0.71%
> lat (msec): 2=1.32%, 4=0.70%, 10=0.20%, 20=0.01%, 50=0.01%
> reads: (groupid=0, jobs=1): err= 0: pid=1852
> read : io=29742MB, bw=253798KB/s, iops=31724 , runt=120001msec
> slat (usec): min=2 , max=17007 , avg=10.61, stdev=67.51
> clat (usec): min=1 , max=41531 , avg=239.00, stdev=379.03
> lat (usec): min=32 , max=41547 , avg=250.33, stdev=385.21
> bw (KB/s) : min=194336, max=347497, per=25.02%, avg=253204.25,
> stdev=31172.37
> cpu : usr=6.66%, sys=32.58%, ctx=327250, majf=0, minf=41
> IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
> submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
> complete : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
> issued r/w/d: total=3806999/0/0, short=0/0/0
> lat (usec): 2=0.01%, 20=0.01%, 50=1.54%, 100=26.45%, 250=40.04%
> lat (usec): 500=27.15%, 750=1.95%, 1000=0.71%
> lat (msec): 2=1.29%, 4=0.68%, 10=0.18%, 20=0.01%, 50=0.01%
> reads: (groupid=0, jobs=1): err= 0: pid=1853
> read : io=29859MB, bw=254797KB/s, iops=31849 , runt=120001msec
> slat (usec): min=2 , max=16821 , avg=11.35, stdev=76.54
> clat (usec): min=1 , max=17659 , avg=237.25, stdev=375.31
> lat (usec): min=31 , max=17673 , avg=249.27, stdev=383.62
> bw (KB/s) : min=194864, max=345280, per=25.15%, avg=254534.63,
> stdev=30549.32
> cpu : usr=6.52%, sys=31.84%, ctx=303763, majf=0, minf=39
> IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
> submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
> complete : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
> issued r/w/d: total=3821989/0/0, short=0/0/0
> lat (usec): 2=0.01%, 10=0.01%, 20=0.01%, 50=2.09%, 100=29.19%
> lat (usec): 250=37.31%, 500=26.41%, 750=2.08%, 1000=0.71%
> lat (msec): 2=1.32%, 4=0.70%, 10=0.20%, 20=0.01%
> reads: (groupid=0, jobs=1): err= 0: pid=1854
> read : io=29598MB, bw=252565KB/s, iops=31570 , runt=120001msec
> slat (usec): min=2 , max=26413 , avg=11.21, stdev=78.32
> clat (usec): min=16 , max=27993 , avg=239.56, stdev=381.67
> lat (usec): min=34 , max=28006 , avg=251.49, stdev=390.13
> bw (KB/s) : min=194256, max=369424, per=24.94%, avg=252462.86,
> stdev=29420.58
> cpu : usr=6.57%, sys=31.33%, ctx=305623, majf=0, minf=41
> IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
> submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
> complete : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
> issued r/w/d: total=3788507/0/0, short=0/0/0
> lat (usec): 20=0.01%, 50=2.13%, 100=28.30%, 250=37.74%, 500=26.66%
> lat (usec): 750=2.17%, 1000=0.75%
> lat (msec): 2=1.35%, 4=0.70%, 10=0.19%, 20=0.01%, 50=0.01%
>
> Run status group 0 (all jobs):
> READ: io=118607MB, aggrb=988.40MB/s, minb=256967KB/s,
> maxb=260912KB/s, mint=120001msec, maxt=120001msec
>
> Disk stats (read/write):
> vda: ios=15148328/0, merge=0/0, ticks=1550570/0, in_queue=1536232, util=96.56%
>
> $ cat data-plane-on
> reads: (g=0): rw=randread, bs=8K-8K/8K-8K, ioengine=libaio, iodepth=8
> ...
> reads: (g=0): rw=randread, bs=8K-8K/8K-8K, ioengine=libaio, iodepth=8
> fio 1.57
> Starting 4 processes
>
> reads: (groupid=0, jobs=1): err= 0: pid=1796
> read : io=32081MB, bw=273759KB/s, iops=34219 , runt=120001msec
> slat (usec): min=1 , max=20404 , avg=21.08, stdev=125.49
> clat (usec): min=10 , max=135743 , avg=207.62, stdev=532.90
> lat (usec): min=21 , max=136055 , avg=229.60, stdev=556.82
> bw (KB/s) : min=56480, max=951952, per=25.49%, avg=271488.81,
> stdev=149773.57
> cpu : usr=7.01%, sys=43.26%, ctx=336854, majf=0, minf=41
> IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
> submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
> complete : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
> issued r/w/d: total=4106413/0/0, short=0/0/0
> lat (usec): 20=0.01%, 50=2.46%, 100=61.13%, 250=21.58%, 500=3.11%
> lat (usec): 750=3.04%, 1000=3.88%
> lat (msec): 2=4.50%, 4=0.13%, 10=0.11%, 20=0.06%, 50=0.01%
> lat (msec): 250=0.01%
> reads: (groupid=0, jobs=1): err= 0: pid=1797
> read : io=30104MB, bw=256888KB/s, iops=32110 , runt=120001msec
> slat (usec): min=1 , max=17595 , avg=22.20, stdev=120.29
> clat (usec): min=13 , max=136264 , avg=221.21, stdev=528.19
> lat (usec): min=22 , max=136280 , avg=244.35, stdev=551.73
> bw (KB/s) : min=57312, max=838880, per=23.93%, avg=254798.51,
> stdev=139546.57
> cpu : usr=6.82%, sys=41.87%, ctx=360348, majf=0, minf=41
> IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
> submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
> complete : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
> issued r/w/d: total=3853351/0/0, short=0/0/0
> lat (usec): 20=0.01%, 50=2.10%, 100=58.47%, 250=22.38%, 500=3.68%
> lat (usec): 750=3.69%, 1000=4.52%
> lat (msec): 2=4.87%, 4=0.14%, 10=0.11%, 20=0.05%, 250=0.01%
> reads: (groupid=0, jobs=1): err= 0: pid=1798
> read : io=31698MB, bw=270487KB/s, iops=33810 , runt=120001msec
> slat (usec): min=1 , max=17457 , avg=20.93, stdev=125.33
> clat (usec): min=16 , max=134663 , avg=210.19, stdev=535.77
> lat (usec): min=21 , max=134671 , avg=232.02, stdev=559.27
> bw (KB/s) : min=57248, max=841952, per=25.29%, avg=269330.21,
> stdev=148661.08
> cpu : usr=6.92%, sys=42.81%, ctx=337799, majf=0, minf=39
> IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
> submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
> complete : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
> issued r/w/d: total=4057340/0/0, short=0/0/0
> lat (usec): 20=0.01%, 50=1.98%, 100=62.00%, 250=20.70%, 500=3.22%
> lat (usec): 750=3.23%, 1000=4.16%
> lat (msec): 2=4.41%, 4=0.13%, 10=0.10%, 20=0.06%, 250=0.01%
> reads: (groupid=0, jobs=1): err= 0: pid=1799
> read : io=30913MB, bw=263789KB/s, iops=32973 , runt=120000msec
> slat (usec): min=1 , max=17565 , avg=21.52, stdev=120.17
> clat (usec): min=15 , max=136064 , avg=215.53, stdev=529.56
> lat (usec): min=27 , max=136070 , avg=237.99, stdev=552.50
> bw (KB/s) : min=57632, max=900896, per=24.74%, avg=263431.57,
> stdev=148379.15
> cpu : usr=6.90%, sys=42.56%, ctx=348217, majf=0, minf=41
> IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
> submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
> complete : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
> issued r/w/d: total=3956830/0/0, short=0/0/0
> lat (usec): 20=0.01%, 50=1.76%, 100=59.96%, 250=22.21%, 500=3.45%
> lat (usec): 750=3.35%, 1000=4.33%
> lat (msec): 2=4.65%, 4=0.13%, 10=0.11%, 20=0.05%, 250=0.01%
>
> Run status group 0 (all jobs):
> READ: io=124796MB, aggrb=1039.1MB/s, minb=263053KB/s,
> maxb=280328KB/s, mint=120000msec, maxt=120001msec
>
> Disk stats (read/write):
> vda: ios=15942789/0, merge=0/0, ticks=336240/0, in_queue=317832, util=97.47%
>
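For anyone who wants to compare runs without reading the full output by eye, the following is a rough sketch that pulls the per-job bandwidth and IOPS out of fio output in the format quoted above. The regular expression is an assumption based only on the lines shown in this message (fio 1.57) and may not match other fio versions. The sample is the four per-job lines of the data-plane-on run.

```python
import re

# Matches the bw/iops fields of per-job summary lines as they appear in
# the quoted fio 1.57 output (assumed format, not a general fio parser).
JOB_LINE = re.compile(r"bw=(\d+)KB/s,\s*iops=(\d+)")


def parse_jobs(text: str) -> list[tuple[int, int]]:
    """Return (bandwidth in KB/s, iops) for every per-job line in text."""
    return [(int(bw), int(iops)) for bw, iops in JOB_LINE.findall(text)]


# Per-job read lines from the data-plane-on run quoted above.
SAMPLE = """\
read : io=32081MB, bw=273759KB/s, iops=34219 , runt=120001msec
read : io=30104MB, bw=256888KB/s, iops=32110 , runt=120001msec
read : io=31698MB, bw=270487KB/s, iops=33810 , runt=120001msec
read : io=30913MB, bw=263789KB/s, iops=32973 , runt=120000msec
"""

BLOCK_KB = 8  # blocksize=8k from the jobfile

jobs = parse_jobs(SAMPLE)
total_iops = sum(iops for _, iops in jobs)
print("sum of per-job iops:", total_iops)
for bw, iops in jobs:
    # With a fixed block size, bandwidth should be close to iops * blocksize.
    print(f"bw={bw} KB/s, iops*bs={iops * BLOCK_KB} KB/s")
```

The sum of the per-job IOPS is close to, but not exactly, the aggregate figure in the summary, so treat it as a sanity check rather than a replacement for fio's own totals.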
--
Asias