* [Qemu-devel] poor virtio-scsi performance (fio testing)
  From: Vasiliy Tolstov @ 2015-11-19 8:16 UTC
  To: qemu-devel

I'm testing virtio-scsi on various kernels (with and without scsi-mq) using
the deadline I/O scheduler (best performance). I'm testing with an LVM thin
volume and with Sheepdog storage. The data goes to an SSD that does about
30K IOPS on the host system.

I'm testing via fio with this job file:

[randrw]
blocksize=4k
filename=/dev/sdb
rw=randrw
direct=1
buffered=0
ioengine=libaio
iodepth=32
group_reporting
numjobs=10
runtime=600

I'm always stuck at 11K-12K IOPS, with Sheepdog or with LVM.
When I switch to virtio-blk and enable data-plane I get around 16K IOPS.
I tried to enable virtio-scsi data-plane but may be missing something
(I get around 13K IOPS).
I'm using libvirt 1.2.16 and qemu 2.4.1.

What can I do to get near 20K-25K IOPS?

(The QEMU test drive has cache=none io=native.)

--
Vasiliy Tolstov,
e-mail: v.tolstov@selfip.ru
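For reference, the configuration being described - a virtio-scsi disk with
cache=none and aio=native - corresponds roughly to the following QEMU
command-line sketch. The image path, memory size and device IDs here are
illustrative placeholders, not taken from the report:

  # sketch: virtio-scsi disk with cache=none, aio=native (paths/IDs illustrative)
  qemu-system-x86_64 -enable-kvm -m 4096 -smp 4 \
    -drive file=/dev/vg0/test-lv,if=none,id=drive0,format=raw,cache=none,aio=native \
    -device virtio-scsi-pci,id=scsi0 \
    -device scsi-hd,drive=drive0,bus=scsi0.0 \
    ...

The fio job above can then be saved inside the guest (for example as
randrw.fio) and started with "fio randrw.fio" against the resulting /dev/sdb.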
* Re: [Qemu-devel] poor virtio-scsi performance (fio testing)
  From: Stefan Hajnoczi @ 2015-11-25 9:35 UTC
  To: Vasiliy Tolstov; Cc: qemu-devel

On Thu, Nov 19, 2015 at 11:16:22AM +0300, Vasiliy Tolstov wrote:
> I'm always stuck at 11K-12K IOPS, with Sheepdog or with LVM.
> When I switch to virtio-blk and enable data-plane I get around 16K IOPS.
> I tried to enable virtio-scsi data-plane but may be missing something
> (I get around 13K IOPS).
> I'm using libvirt 1.2.16 and qemu 2.4.1.
>
> What can I do to get near 20K-25K IOPS?
>
> (The QEMU test drive has cache=none io=native.)

If the workload is just fio to a single disk then dataplane (-object
iothread) may not help massively.

The scalability of dataplane kicks in when doing many different types of
I/O or accessing many disks. If you have just 1 disk and the VM is only
running fio, then dataplane simply shifts the I/O work from the QEMU main
loop to a dedicated thread. This results in an improvement but it may not
be very dramatic for a single disk.

You can get better aio=native performance with qemu.git/master. Please
see commit fc73548e444ae3239f6cef44a5200b5d2c3e85d1 ("virtio-blk: use
blk_io_plug/unplug for Linux AIO batching").

Stefan
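As a concrete illustration of the "-object iothread" dataplane setup Stefan
mentions, the virtio-scsi controller can be bound to a dedicated iothread
roughly like this (IDs and paths are illustrative):

  # sketch: virtio-scsi dataplane via a dedicated iothread
  qemu-system-x86_64 ... \
    -object iothread,id=iothread0 \
    -drive file=/dev/vg0/test-lv,if=none,id=drive0,format=raw,cache=none,aio=native \
    -device virtio-scsi-pci,id=scsi0,iothread=iothread0 \
    -device scsi-hd,drive=drive0,bus=scsi0.0

With a single disk and a single fio workload this mainly moves request
handling out of the QEMU main loop into iothread0, which matches Stefan's
point that the gain for one disk may be modest.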
* Re: [Qemu-devel] poor virtio-scsi performance (fio testing)
  From: Vasiliy Tolstov @ 2015-11-25 10:10 UTC
  To: Stefan Hajnoczi; Cc: qemu-devel

2015-11-25 12:35 GMT+03:00 Stefan Hajnoczi <stefanha@gmail.com>:
> You can get better aio=native performance with qemu.git/master. Please
> see commit fc73548e444ae3239f6cef44a5200b5d2c3e85d1 ("virtio-blk: use
> blk_io_plug/unplug for Linux AIO batching").

Thanks Stefan! Is this patch only for virtio-blk, or can it also increase
IOPS for virtio-scsi?

--
Vasiliy Tolstov,
e-mail: v.tolstov@selfip.ru
* Re: [Qemu-devel] poor virtio-scsi performance (fio testing)
  From: Fam Zheng @ 2015-11-26 2:53 UTC
  To: Vasiliy Tolstov; Cc: Stefan Hajnoczi, qemu-devel

On Wed, 11/25 13:10, Vasiliy Tolstov wrote:
> 2015-11-25 12:35 GMT+03:00 Stefan Hajnoczi <stefanha@gmail.com>:
> > You can get better aio=native performance with qemu.git/master. Please
> > see commit fc73548e444ae3239f6cef44a5200b5d2c3e85d1 ("virtio-blk: use
> > blk_io_plug/unplug for Linux AIO batching").
>
> Thanks Stefan! Is this patch only for virtio-blk, or can it also increase
> IOPS for virtio-scsi?

That patch was for virtio-blk, but qemu.git/master also has another patch
for virtio-scsi (5170f40 "virtio-scsi: Call bdrv_io_plug/bdrv_io_unplug in
cmd request handling").

On the other hand, it would be helpful to compare the difference between
virtio-blk and virtio-scsi in your environment. Could you do that?

Fam
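One way to make the comparison Fam asks for is to attach the same backing
device to the guest in two separate test runs - once as virtio-blk and once
as virtio-scsi - and run the identical fio job against each. A sketch with
illustrative IDs and paths:

  # run 1: virtio-blk
  -drive file=/dev/vg0/test-lv,if=none,id=drive0,format=raw,cache=none,aio=native \
  -device virtio-blk-pci,drive=drive0

  # run 2: virtio-scsi
  -drive file=/dev/vg0/test-lv,if=none,id=drive0,format=raw,cache=none,aio=native \
  -device virtio-scsi-pci,id=scsi0 \
  -device scsi-hd,drive=drive0,bus=scsi0.0

Keeping the backing storage, cache/aio settings and fio job identical
isolates the difference to the virtio transport itself.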
* Re: [Qemu-devel] poor virtio-scsi performance (fio testing)
  From: Alexandre DERUMIER @ 2015-11-25 10:08 UTC
  To: Vasiliy Tolstov; Cc: qemu-devel

Maybe you could try to create 2 disks in your VM, each with 1 dedicated
iothread, then run fio on both disks at the same time and see if
performance improves.

But maybe there is some write overhead with lvmthin (because of
copy-on-write) and with Sheepdog. Have you tried with classic LVM or a
raw file?
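A sketch of the two-disk, two-iothread setup suggested above, using
illustrative volume paths and IDs. With virtio-blk each disk can be given
its own iothread directly; with virtio-scsi the iothread property belongs
to the controller, so two controllers would be needed for the same
separation:

  -object iothread,id=iothread0 \
  -object iothread,id=iothread1 \
  -drive file=/dev/vg0/test-lv0,if=none,id=drive0,format=raw,cache=none,aio=native \
  -drive file=/dev/vg0/test-lv1,if=none,id=drive1,format=raw,cache=none,aio=native \
  -device virtio-blk-pci,drive=drive0,iothread=iothread0 \
  -device virtio-blk-pci,drive=drive1,iothread=iothread1

Inside the guest, one fio job is then run against each disk at the same
time and the aggregate IOPS compared with the single-disk result.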
* Re: [Qemu-devel] poor virtio-scsi performance (fio testing)
  From: Vasiliy Tolstov @ 2015-11-25 10:12 UTC
  To: Alexandre DERUMIER; Cc: qemu-devel

2015-11-25 13:08 GMT+03:00 Alexandre DERUMIER <aderumier@odiso.com>:
> Maybe you could try to create 2 disks in your VM, each with 1 dedicated
> iothread, then run fio on both disks at the same time and see if
> performance improves.

That's fine, but by default I have only one disk inside the VM, so I
prefer to increase single-disk speed.

> But maybe there is some write overhead with lvmthin (because of
> copy-on-write) and with Sheepdog. Have you tried with classic LVM or a
> raw file?

I tried with classic LVM - sometimes I get more IOPS, but the stable
result is the same =)

--
Vasiliy Tolstov,
e-mail: v.tolstov@selfip.ru
* Re: [Qemu-devel] poor virtio-scsi performance (fio testing)
  From: Alexandre DERUMIER @ 2015-11-25 10:27 UTC
  To: Vasiliy Tolstov; Cc: qemu-devel

>> I tried with classic LVM - sometimes I get more IOPS, but the stable
>> result is the same =)

I have tested with a raw file, qemu 2.4 + virtio-scsi (without iothread);
I get around 25K IOPS with an Intel SSD 3500. (The host CPUs are Xeon v3,
3.1 GHz.)

randrw: (g=0): rw=randrw, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=32
...
fio-2.1.11
Starting 10 processes
Jobs: 4 (f=3): [m(2),_(3),m(1),_(3),m(1)] [100.0% done] [96211KB/96639KB/0KB /s] [24.6K/24.2K/0 iops] [eta 00m:00s]
randrw: (groupid=0, jobs=10): err= 0: pid=25662: Wed Nov 25 11:25:22 2015
  read : io=5124.7MB, bw=97083KB/s, iops=24270, runt= 54047msec
    slat (usec): min=1, max=34577, avg=181.90, stdev=739.20
    clat (usec): min=177, max=49641, avg=6511.31, stdev=3176.16
     lat (usec): min=185, max=52810, avg=6693.55, stdev=3247.36
    clat percentiles (usec):
     |  1.00th=[ 1704],  5.00th=[ 2576], 10.00th=[ 3184], 20.00th=[ 4016],
     | 30.00th=[ 4704], 40.00th=[ 5344], 50.00th=[ 5984], 60.00th=[ 6688],
     | 70.00th=[ 7456], 80.00th=[ 8512], 90.00th=[10304], 95.00th=[12224],
     | 99.00th=[17024], 99.50th=[19584], 99.90th=[26240], 99.95th=[29568],
     | 99.99th=[37632]
    bw (KB /s): min= 6690, max=12432, per=10.02%, avg=9728.49, stdev=796.49
  write: io=5115.1MB, bw=96929KB/s, iops=24232, runt= 54047msec
    slat (usec): min=1, max=37270, avg=188.68, stdev=756.21
    clat (usec): min=98, max=54737, avg=6246.50, stdev=3078.27
     lat (usec): min=109, max=56078, avg=6435.53, stdev=3134.23
    clat percentiles (usec):
     |  1.00th=[ 1960],  5.00th=[ 2640], 10.00th=[ 3120], 20.00th=[ 3856],
     | 30.00th=[ 4512], 40.00th=[ 5088], 50.00th=[ 5664], 60.00th=[ 6304],
     | 70.00th=[ 7072], 80.00th=[ 8160], 90.00th=[ 9920], 95.00th=[11712],
     | 99.00th=[16768], 99.50th=[19328], 99.90th=[26496], 99.95th=[31104],
     | 99.99th=[42752]
    bw (KB /s): min= 7424, max=12712, per=10.02%, avg=9712.21, stdev=768.32
    lat (usec) : 100=0.01%, 250=0.01%, 500=0.02%, 750=0.03%, 1000=0.05%
    lat (msec) : 2=1.49%, 4=19.18%, 10=68.69%, 20=10.10%, 50=0.45%
    lat (msec) : 100=0.01%
  cpu          : usr=1.28%, sys=8.94%, ctx=329299, majf=0, minf=76
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued    : total=r=1311760/w=1309680/d=0, short=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=32

Run status group 0 (all jobs):
   READ: io=5124.7MB, aggrb=97082KB/s, minb=97082KB/s, maxb=97082KB/s, mint=54047msec, maxt=54047msec
  WRITE: io=5115.1MB, aggrb=96928KB/s, minb=96928KB/s, maxb=96928KB/s, mint=54047msec, maxt=54047msec

Disk stats (read/write):
  sdb: ios=1307835/1305417, merge=2523/2770, ticks=4309296/3954628, in_queue=8274916, util=99.95%
* Re: [Qemu-devel] poor virtio-scsi performance (fio testing)
  From: Vasiliy Tolstov @ 2015-11-25 10:48 UTC
  To: Alexandre DERUMIER; Cc: qemu-devel

2015-11-25 13:27 GMT+03:00 Alexandre DERUMIER <aderumier@odiso.com>:
> I have tested with a raw file, qemu 2.4 + virtio-scsi (without iothread);
> I get around 25K IOPS with an Intel SSD 3500. (The host CPUs are Xeon v3,
> 3.1 GHz.)

What scheduler do you have on the host system? Maybe my default cfq is
slowing things down?

--
Vasiliy Tolstov,
e-mail: v.tolstov@selfip.ru
* Re: [Qemu-devel] poor virtio-scsi performance (fio testing)
  From: Alexandre DERUMIER @ 2015-11-26 14:13 UTC
  To: Vasiliy Tolstov; Cc: qemu-devel

>> Maybe my default cfq is slowing things down?

Yes! (In your first mail you said that you use the deadline scheduler?)

cfq does not play well with a lot of concurrent jobs:

cfq      + numjobs=10 : 10000 iops
cfq      + numjobs=1  : 25000 iops
deadline + numjobs=1  : 25000 iops
deadline + numjobs=10 : 25000 iops
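For reference, the active scheduler for the SSD backing the test can be
checked and switched on the host roughly like this (sdX stands for the
actual block device):

  # show available schedulers; the one in brackets is active
  cat /sys/block/sdX/queue/scheduler
  # switch to deadline
  echo deadline > /sys/block/sdX/queue/scheduler

The change applies per device and does not survive a reboot unless made
persistent, for example via a udev rule or the elevator=deadline kernel
boot parameter.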