* [Qemu-devel] Why qemu write/rw speed is so low?
@ 2011-09-09  9:44 Zhi Yong Wu
  2011-09-09 10:38 ` Stefan Hajnoczi
  0 siblings, 1 reply; 21+ messages in thread

From: Zhi Yong Wu @ 2011-09-09 9:44 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, ryanh, aliguro, stefanha

Hi,

Today I did some basic I/O testing and suddenly found that qemu write and
rw speed is very low now.  My qemu binary is built on commit
344eecf6995f4a0ad1d887cec922f6806f91a3f8.

Does qemu have a regression?

The test data is shown below:

1.) write

test: (g=0): rw=write, bs=512-512/512-512, ioengine=libaio, iodepth=1
Starting 1 process
Jobs: 1 (f=1): [W] [100.0% done] [0K/2K /s] [0/4 iops] [eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=2694
  write: io=51,200KB, bw=58,751B/s, iops=114, runt=892381msec
    slat (usec): min=19, max=376K, avg=68.30, stdev=1411.60
    clat (msec): min=1, max=375, avg= 8.63, stdev= 4.71
     lat (msec): min=1, max=433, avg= 8.70, stdev= 5.08
    bw (KB/s) : min=    1, max=   60, per=100.80%, avg=57.46, stdev= 6.36
  cpu          : usr=0.04%, sys=0.65%, ctx=102616, majf=0, minf=52
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w: total=0/102400, short=0/0
     lat (msec): 2=0.01%, 4=0.02%, 10=98.82%, 20=0.20%, 50=0.76%
     lat (msec): 100=0.17%, 250=0.01%, 500=0.01%

Run status group 0 (all jobs):
  WRITE: io=51,200KB, aggrb=57KB/s, minb=58KB/s, maxb=58KB/s, mint=892381msec, maxt=892381msec

Disk stats (read/write):
  dm-0: ios=18/103166, merge=0/0, ticks=1143/910779, in_queue=911921, util=99.74%, aggrios=18/102881, aggrmerge=0/294, aggrticks=1143/900778, aggrin_queue=901855, aggrutil=99.72%
    vda: ios=18/102881, merge=0/294, ticks=1143/900778, in_queue=901855, util=99.72%

2.) read and write

test: (g=0): rw=rw, bs=512-512/512-512, ioengine=libaio, iodepth=1
Starting 1 process
Jobs: 1 (f=1): [M] [100.0% done] [60K/61K /s] [117/119 iops] [eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=2731
  read : io=25,559KB, bw=58,883B/s, iops=115, runt=444473msec
    slat (usec): min=13, max=24,781, avg=32.04, stdev=349.25
    clat (usec): min=1, max=123K, avg=121.98, stdev=807.49
     lat (usec): min=75, max=123K, avg=154.66, stdev=879.19
    bw (KB/s) : min=    1, max=  105, per=100.86%, avg=57.49, stdev=12.40
  write: io=25,642KB, bw=59,074B/s, iops=115, runt=444473msec
    slat (usec): min=18, max=60,087, avg=59.29, stdev=830.81
    clat (msec): min=1, max=392, avg= 8.44, stdev= 5.60
     lat (msec): min=1, max=392, avg= 8.50, stdev= 5.77
    bw (KB/s) : min=    1, max=   60, per=101.40%, avg=57.80, stdev= 5.87
  cpu          : usr=0.10%, sys=0.99%, ctx=102007, majf=0, minf=38
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w: total=51117/51283, short=0/0
     lat (usec): 2=0.18%, 4=0.49%, 10=0.01%, 20=0.01%, 100=14.25%
     lat (usec): 250=34.74%, 500=0.11%, 750=0.05%, 1000=0.03%
     lat (msec): 2=0.03%, 4=0.02%, 10=49.64%, 20=0.12%, 50=0.25%
     lat (msec): 100=0.05%, 250=0.01%, 500=0.01%

Run status group 0 (all jobs):
   READ: io=25,558KB, aggrb=57KB/s, minb=58KB/s, maxb=58KB/s, mint=444473msec, maxt=444473msec
  WRITE: io=25,641KB, aggrb=57KB/s, minb=59KB/s, maxb=59KB/s, mint=444473msec, maxt=444473msec

Disk stats (read/write):
  dm-0: ios=51105/51689, merge=0/0, ticks=6302/454018, in_queue=460323, util=99.44%, aggrios=51130/51542, aggrmerge=0/174, aggrticks=6264/447128, aggrin_queue=453328, aggrutil=99.40%
    vda: ios=51130/51542, merge=0/174, ticks=6264/447128, in_queue=453328, util=99.40%

3.) read

test: (g=0): rw=read, bs=512-512/512-512, ioengine=libaio, iodepth=1
Starting 1 process
Jobs: 1 (f=1): [R] [100.0% done] [4,004K/0K /s] [8K/0 iops] [eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=2800
  read : io=51,200KB, bw=3,925KB/s, iops=7,850, runt= 13044msec
    slat (usec): min=13, max=1,191, avg=22.25, stdev=28.32
    clat (usec): min=1, max=37,035, avg=102.70, stdev=177.61
     lat (usec): min=78, max=37,123, avg=125.47, stdev=178.64
    bw (KB/s) : min= 3368, max= 5151, per=100.02%, avg=3925.73, stdev=362.41
  cpu          : usr=2.61%, sys=22.80%, ctx=102591, majf=0, minf=24
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w: total=102400/0, short=0/0
     lat (usec): 2=2.20%, 4=0.80%, 10=0.01%, 20=0.01%, 50=0.01%
     lat (usec): 100=73.93%, 250=22.45%, 500=0.14%, 750=0.03%, 1000=0.05%
     lat (msec): 2=0.32%, 4=0.04%, 10=0.01%, 50=0.01%

Run status group 0 (all jobs):
   READ: io=51,200KB, aggrb=3,925KB/s, minb=4,019KB/s, maxb=4,019KB/s, mint=13044msec, maxt=13044msec

Disk stats (read/write):
  dm-0: ios=101230/24, merge=0/0, ticks=10002/2082, in_queue=12083, util=77.25%, aggrios=102400/4, aggrmerge=0/20, aggrticks=10103/244, aggrin_queue=10341, aggrutil=77.03%
    vda: ios=102400/4, merge=0/20, ticks=10103/244, in_queue=10341, util=77.03%

Regards,

Zhi Yong Wu

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [Qemu-devel] Why qemu write/rw speed is so low?

From: Stefan Hajnoczi @ 2011-09-09 10:38 UTC (permalink / raw)
To: Zhi Yong Wu; +Cc: kwolf, ryanh, aliguro, qemu-devel

On Fri, Sep 09, 2011 at 05:44:36PM +0800, Zhi Yong Wu wrote:
> Today, i did some basical I/O testing, and suddenly found that qemu write
> and rw speed is so low now, my qemu binary is built on commit
> 344eecf6995f4a0ad1d887cec922f6806f91a3f8.
>
> Do qemu have regression?
>
> The testing data is shown as below:
>
> 1.) write
>
> test: (g=0): rw=write, bs=512-512/512-512, ioengine=libaio, iodepth=1

Please post your QEMU command-line.  If your -drive is using
cache=writethrough then small writes are slow because they require the
physical disk to write and then synchronize its write cache.  Typically
cache=none is a good setting to use for local disks.

The block size of 512 bytes is too small.  Ext4 uses a 4 KB block size,
so I think a 512 byte write from the guest could cause a 4 KB
read-modify-write operation on the host filesystem.

You can check this by running btrace(8) on the host during the
benchmark.  The blktrace output and the summary statistics will show
what I/O pattern the host is issuing.

I suggest changing your fio block size to 8 KB if you want to try a
small block size.  If you want a large block size, try 64 KB or 128 KB.

Stefan
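[Editorial aside: the retest Stefan suggests can be expressed as a fio job file along these lines. This is a sketch, not the poster's actual job; the job name, filename, and size are illustrative placeholders.]

```ini
; 8 KB sequential-write job, per the suggestion above.
; The [seq-write] name, filename, and size are placeholders.
[global]
ioengine=libaio
direct=1        ; O_DIRECT, matching the cache=none guest setup
iodepth=1
bs=8k           ; change to 64k or 128k for the large-block variant

[seq-write]
rw=write
filename=/mnt/testfile
size=50m
```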
* Re: [Qemu-devel] Why qemu write/rw speed is so low?

From: Zhi Yong Wu @ 2011-09-09 13:48 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: kwolf, Zhi Yong Wu, aliguro, qemu-devel, ryanh

On Fri, Sep 9, 2011 at 6:38 PM, Stefan Hajnoczi
<stefanha@linux.vnet.ibm.com> wrote:
> Please post your QEMU command-line.  If your -drive is using
> cache=writethrough then small writes are slow because they require the
> physical disk to write and then synchronize its write cache.  Typically
> cache=none is a good setting to use for local disks.

I cannot access my workstation in the office right now.  The drive option is:

-drive if=virtio,cache=none,file=xxxx

> The block size of 512 bytes is too small.  Ext4 uses a 4 KB block size,
> so I think a 512 byte write from the guest could cause a 4 KB
> read-modify-write operation on the host filesystem.

Do you mean RMW (read-modify-write)?  How does it work?  Could you
explain in more detail when you have time?

> You can check this by running btrace(8) on the host during the
> benchmark.  The blktrace output and the summary statistics will show
> what I/O pattern the host is issuing.

OK, I will try next Tuesday.

> I suggest changing your fio block size to 8 KB if you want to try a
> small block size.  If you want a large block size, try 64 KB or 128 KB.

OK

Regards,

Zhi Yong Wu
* Re: [Qemu-devel] Why qemu write/rw speed is so low?

From: Stefan Hajnoczi @ 2011-09-09 13:54 UTC (permalink / raw)
To: Zhi Yong Wu
Cc: kwolf, aliguro, Stefan Hajnoczi, Zhi Yong Wu, qemu-devel, ryanh

On Fri, Sep 9, 2011 at 2:48 PM, Zhi Yong Wu <zwu.kernel@gmail.com> wrote:
>> The block size of 512 bytes is too small.  Ext4 uses a 4 KB block size,
>> so I think a 512 byte write from the guest could cause a 4 KB
>> read-modify-write operation on the host filesystem.
> Do you mean RMW (read-modify-write)?  How does it work?  Could you
> explain in more detail when you have time?

If the host file system manages space in 4 KB blocks, then a 512 byte
write to an unallocated part of the file causes the file system to find
4 KB of free space for this data.  Since the write is only 512 bytes and
does not cover the entire 4 KB region, the file system initializes the
remaining 3.5 KB with zeros and writes out the full 4 KB block.

Now if a 512 byte write comes in for an allocated 4 KB block, then we
need to read in the existing 4 KB, modify the 512 bytes in place, and
write out the 4 KB block again.  This is read-modify-write.  In this
worst-case scenario a 512 byte write turns into a 4 KB read followed by
a 4 KB write.

Stefan
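[Editorial aside: the two cases Stefan describes can be sketched as a small cost model. The helper below is hypothetical and illustrative only; real costs depend on the filesystem and caching.]

```python
# Sketch of sub-block write cost on a filesystem with 4 KB blocks.
# Returns the (bytes_read, bytes_written) the filesystem must move to
# service one guest write of `write_size` bytes.

FS_BLOCK = 4096  # host filesystem block size in bytes

def rmw_bytes(write_size: int, allocated: bool) -> tuple:
    if write_size % FS_BLOCK == 0:
        return (0, write_size)        # block-aligned write: no RMW needed
    if allocated:
        return (FS_BLOCK, FS_BLOCK)   # read old block, modify in place, write back
    return (0, FS_BLOCK)              # unallocated: zero-fill rest, write full block

# Worst case: a 512-byte write to an allocated block costs 4 KB read + 4 KB write.
print(rmw_bytes(512, allocated=True))   # (4096, 4096)
print(rmw_bytes(512, allocated=False))  # (0, 4096)
```

So in the worst case the 512-byte benchmark moves sixteen times more data through the filesystem than the guest requested.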
* Re: [Qemu-devel] Why qemu write/rw speed is so low?

From: Kevin Wolf @ 2011-09-09 14:04 UTC (permalink / raw)
To: Stefan Hajnoczi
Cc: aliguro, Stefan Hajnoczi, Zhi Yong Wu, qemu-devel, Zhi Yong Wu, ryanh

On 09.09.2011 15:54, Stefan Hajnoczi wrote:
> If the host file system manages space in 4 KB blocks, then a 512 byte
> write to an unallocated part of the file causes the file system to find
> 4 KB of free space for this data.  Since the write is only 512 bytes and
> does not cover the entire 4 KB region, the file system initializes the
> remaining 3.5 KB with zeros and writes out the full 4 KB block.
>
> Now if a 512 byte write comes in for an allocated 4 KB block, then we
> need to read in the existing 4 KB, modify the 512 bytes in place, and
> write out the 4 KB block again.  This is read-modify-write.  In this
> worst-case scenario a 512 byte write turns into a 4 KB read followed by
> a 4 KB write.

But that should only happen with a 4k sector size, otherwise there's no
reason for RMW.

Kevin
* Re: [Qemu-devel] Why qemu write/rw speed is so low?

From: Stefan Hajnoczi @ 2011-09-09 15:27 UTC (permalink / raw)
To: Kevin Wolf
Cc: aliguro, Stefan Hajnoczi, qemu-devel, ryanh, Zhi Yong Wu, Zhi Yong Wu

On Fri, Sep 09, 2011 at 04:04:07PM +0200, Kevin Wolf wrote:
> But that should only happen with a 4k sector size, otherwise there's no
> reason for RMW.

You're right.  For cache=none (O_DIRECT), the host file system should
not need to do read-modify-write because it can write the single sector
without caring what is in the surrounding 3.5 KB.

Stefan
* Re: [Qemu-devel] Why qemu write/rw speed is so low?

From: Christoph Hellwig @ 2011-09-11 13:32 UTC (permalink / raw)
To: Kevin Wolf
Cc: aliguro, Stefan Hajnoczi, Stefan Hajnoczi, qemu-devel, ryanh, Zhi Yong Wu, Zhi Yong Wu

On Fri, Sep 09, 2011 at 04:04:07PM +0200, Kevin Wolf wrote:
> But that should only happen with a 4k sector size, otherwise there's no
> reason for RMW.

There might not be a need for RMW, but if you're doing 512 byte writes
to a sparse file on a 4k filesystem, that filesystem will have to
serialize the I/O to prevent races from happening during block
allocation.
* Re: [Qemu-devel] Why qemu write/rw speed is so low?

From: Zhi Yong Wu @ 2011-09-09 14:09 UTC (permalink / raw)
To: Stefan Hajnoczi
Cc: kwolf, aliguro, Stefan Hajnoczi, Zhi Yong Wu, qemu-devel, ryanh

On Fri, Sep 9, 2011 at 9:54 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> Now if a 512 byte write comes in for an allocated 4 KB block, then we
> need to read in the existing 4 KB, modify the 512 bytes in place, and
> write out the 4 KB block again.  This is read-modify-write.  In this
> worst-case scenario a 512 byte write turns into a 4 KB read followed by
> a 4 KB write.

So a 512 B write leads to a 4 KB read, a 512 B in-place modify, and a
4 KB write.  Got it, thanks.

Regards,

Zhi Yong Wu
* Re: [Qemu-devel] Why qemu write/rw speed is so low?

From: Zhi Yong Wu @ 2011-09-13 2:38 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: kwolf, Zhi Yong Wu, aliguro, qemu-devel, ryanh

On Fri, Sep 9, 2011 at 6:38 PM, Stefan Hajnoczi
<stefanha@linux.vnet.ibm.com> wrote:
> You can check this by running btrace(8) on the host during the
> benchmark.  The blktrace output and the summary statistics will show
> what I/O pattern the host is issuing.

  8,2   0   1   0.000000000   337  A  WS 425081504 + 8 <- (253,1) 42611360
  8,0   0   2   0.000000896   337  A  WS 426107552 + 8 <- (8,2) 425081504
  8,2   0   3   0.000001772   337  Q  WS 426107552 + 8 [jbd2/dm-1-8]
  8,2   0   4   0.000006617   337  G  WS 426107552 + 8 [jbd2/dm-1-8]
  8,2   0   5   0.000007862   337  P   N [jbd2/dm-1-8]
  8,2   0   6   0.000010481   337  I  WS 426107552 + 8 [jbd2/dm-1-8]
  .....

CPU0 (8,2):
 Reads Queued:        11,   416KiB   Writes Queued:        20,    72KiB
 Read Dispatches:     12,   440KiB   Write Dispatches:      8,    72KiB
 Reads Requeued:       0              Writes Requeued:       0
 Reads Completed:     14,   448KiB   Writes Completed:     12,    72KiB
 Read Merges:          0,     0KiB   Write Merges:         10,    40KiB
 Read depth:           2              Write depth:           2
 IO unplugs:          11              Timer unplugs:         0
CPU1 (8,2):
 Reads Queued:         8,    32KiB   Writes Queued:         0,     0KiB
 Read Dispatches:      2,     8KiB   Write Dispatches:      0,     0KiB
 Reads Requeued:       0              Writes Requeued:       0
 Reads Completed:      0,     0KiB   Writes Completed:      0,     0KiB
 Read Merges:          5,    20KiB   Write Merges:          0,     0KiB
 Read depth:           2              Write depth:           2
 IO unplugs:           0              Timer unplugs:         0

Total (8,2):
 Reads Queued:        19,   448KiB   Writes Queued:        20,    72KiB
 Read Dispatches:     14,   448KiB   Write Dispatches:      8,    72KiB
 Reads Requeued:       0              Writes Requeued:       0
 Reads Completed:     14,   448KiB   Writes Completed:     12,    72KiB
 Read Merges:          5,    20KiB   Write Merges:         10,    40KiB
 IO unplugs:          11              Timer unplugs:         0

Throughput (R/W): 69KiB/s / 11KiB/s
Events (8,2): 411 entries
Skips: 50 forward (5,937 -  93.5%)

From this log, the host writes 8 blocks each time.  What is each
block's size?

> I suggest changing your fio block size to 8 KB if you want to try a
> small block size.  If you want a large block size, try 64 KB or 128 KB.

The following runs use -drive if=virtio,cache=none,file=xxx,bps=1000000
(note that bps is in bytes).

test: (g=0): rw=write, bs=512-512/512-512, ioengine=libaio, iodepth=1
Starting 1 process
^Cbs: 1 (f=1): [W] [1.9% done] [0K/61K /s] [0/120 iops] [eta 01h:00m:57s]
fio: terminating on signal 2

test: (groupid=0, jobs=1): err= 0: pid=27515
  write: io=3,960KB, bw=56,775B/s, iops=110, runt= 71422msec
    slat (usec): min=19, max=31,032, avg=65.03, stdev=844.57
    clat (msec): min=1, max=353, avg= 8.93, stdev=11.91
     lat (msec): min=1, max=353, avg= 8.99, stdev=12.00
    bw (KB/s) : min=    2, max=   60, per=102.06%, avg=56.14, stdev=10.89
  cpu          : usr=0.04%, sys=0.61%, ctx=7936, majf=0, minf=26
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w: total=0/7920, short=0/0
     lat (msec): 2=0.03%, 4=0.09%, 10=98.48%, 20=0.54%, 50=0.67%
     lat (msec): 100=0.03%, 250=0.05%, 500=0.11%

Run status group 0 (all jobs):
  WRITE: io=3,960KB, aggrb=55KB/s, minb=56KB/s, maxb=56KB/s, mint=71422msec, maxt=71422msec

Disk stats (read/write):
  dm-0: ios=6/8007, merge=0/0, ticks=179/78114, in_queue=78272, util=99.58%, aggrios=4/7975, aggrmerge=0/44, aggrticks=179/75153, aggrin_queue=75304, aggrutil=99.53%
    vda: ios=4/7975, merge=0/44, ticks=179/75153, in_queue=75304, util=99.53%

test: (g=0): rw=write, bs=8K-8K/8K-8K, ioengine=libaio, iodepth=1
Starting 1 process
Jobs: 1 (f=1): [W] [100.0% done] [0K/752K /s] [0/91 iops] [eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=27526
  write: io=51,200KB, bw=668KB/s, iops=83, runt= 76622msec
    slat (usec): min=20, max=570K, avg=386.16, stdev=11400.53
    clat (msec): min=1, max=1,699, avg=11.57, stdev=29.85
     lat (msec): min=1, max=1,699, avg=11.96, stdev=33.24
    bw (KB/s) : min=   20, max=  968, per=104.93%, avg=700.95, stdev=245.18
  cpu          : usr=0.08%, sys=0.41%, ctx=6418, majf=0, minf=25
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w: total=0/6400, short=0/0
     lat (msec): 2=0.05%, 4=0.09%, 10=94.08%, 20=0.78%, 50=3.39%
     lat (msec): 100=0.33%, 250=1.12%, 500=0.11%, 750=0.03%, 2000=0.02%

Run status group 0 (all jobs):
  WRITE: io=51,200KB, aggrb=668KB/s, minb=684KB/s, maxb=684KB/s, mint=76622msec, maxt=76622msec

Disk stats (read/write):
  dm-0: ios=913/6731, merge=0/0, ticks=161809/696060, in_queue=858086, util=100.00%, aggrios=1070/6679, aggrmerge=316/410, aggrticks=163975/1587245, aggrin_queue=1751267, aggrutil=100.00%
    vda: ios=1070/6679, merge=316/410, ticks=163975/1587245, in_queue=1751267, util=100.00%

test: (g=0): rw=write, bs=64K-64K/64K-64K, ioengine=libaio, iodepth=1
Starting 1 process
Jobs: 1 (f=1): [W] [100.0% done] [0K/458K /s] [0/6 iops] [eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=27614
  write: io=51,200KB, bw=428KB/s, iops=6, runt=119618msec
    slat (usec): min=28, max=5,507K, avg=7618.90, stdev=194782.69
    clat (msec): min=14, max=9,418, avg=140.49, stdev=328.96
     lat (msec): min=14, max=9,418, avg=148.11, stdev=382.21
    bw (KB/s) : min=   11, max=  664, per=114.06%, avg=488.19, stdev=53.97
  cpu          : usr=0.03%, sys=0.04%, ctx=825, majf=0, minf=27
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w: total=0/800, short=0/0
     lat (msec): 20=0.38%, 50=2.50%, 250=97.00%, >=2000=0.12%

Run status group 0 (all jobs):
  WRITE: io=51,200KB, aggrb=428KB/s, minb=438KB/s, maxb=438KB/s, mint=119618msec, maxt=119618msec

Disk stats (read/write):
  dm-0: ios=0/938, merge=0/0, ticks=0/517622, in_queue=517712, util=100.00%, aggrios=0/883, aggrmerge=0/55, aggrticks=0/351498, aggrin_queue=351497, aggrutil=100.00%
    vda: ios=0/883, merge=0/55, ticks=0/351498, in_queue=351497, util=100.00%

test: (g=0): rw=write, bs=128K-128K/128K-128K, ioengine=libaio, iodepth=1
Starting 1 process
Jobs: 1 (f=1): [W] [100.0% done] [0K/374K /s] [0/2 iops] [eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=27623
  write: io=51,200KB, bw=422KB/s, iops=3, runt=121420msec
    slat (usec): min=29, max=5,484K, avg=17456.88, stdev=274747.99
    clat (msec): min=174, max=9,559, avg=283.02, stdev=465.37
     lat (msec): min=176, max=9,559, avg=300.47, stdev=538.58
    bw (KB/s) : min=   22, max=  552, per=114.21%, avg=480.81, stdev=49.10
  cpu          : usr=0.00%, sys=0.03%, ctx=425, majf=0, minf=27
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w: total=0/400, short=0/0
     lat (msec): 250=9.50%, 500=90.25%, >=2000=0.25%

Run status group 0 (all jobs):
  WRITE: io=51,200KB, aggrb=421KB/s, minb=431KB/s, maxb=431KB/s, mint=121420msec, maxt=121420msec

Disk stats (read/write):
  dm-0: ios=0/541, merge=0/0, ticks=0/573761, in_queue=574004, util=100.00%, aggrios=0/484, aggrmerge=0/57, aggrticks=0/396662, aggrin_queue=396662, aggrutil=100.00%
    vda: ios=0/484, merge=0/57, ticks=0/396662, in_queue=396662, util=100.00%

Regards,

Zhi Yong Wu
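[Editorial aside: blktrace/btrace reports request sizes in 512-byte sectors, so the `+ 8` in the trace above is an 8-sector, i.e. 4 KB, request, which matches the ext4/jbd2 block size. The parser below is a sketch of that arithmetic, assuming the "offset + length" field layout shown in the trace.]

```python
# blktrace sizes are in 512-byte sectors: "426107552 + 8" means a request
# starting at sector 426107552 that is 8 sectors (8 * 512 = 4096 bytes) long.

SECTOR = 512  # bytes per sector in blktrace output

def request_bytes(trace_field: str) -> int:
    """Parse an 'offset + length' field from a blktrace line into bytes."""
    _offset, length = trace_field.split("+")
    return int(length) * SECTOR

print(request_bytes("426107552 + 8"))    # 4096   -> one 4 KB journal block
print(request_bytes("80984576 + 256"))   # 131072 -> one 128 KB fio request
```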
* Re: [Qemu-devel] Why qemu write/rw speed is so low?

From: Zhi Yong Wu @ 2011-09-13 2:52 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: kwolf, Zhi Yong Wu, aliguro, qemu-devel, ryanh

This is the real log when fio is issued with bs=128K and bps=1000000
(block I/O throttling):

  8,2   0   1   0.000000000 24332  A  WS 79958528 + 256 <- (253,2) 71830016
  8,0   0   2   0.000000912 24332  A  WS 80984576 + 256 <- (8,2) 79958528
  8,2   0   3   0.000001778 24332  Q  WS 80984576 + 256 [qemu-system-x86]
  8,2   0   4   0.000006527 24332  G  WS 80984576 + 256 [qemu-system-x86]
  8,2   0   5   0.000007817 24332  P   N [qemu-system-x86]
  8,2   0   6   0.000011234 24332  I  WS 80984576 + 256 [qemu-system-x86]

CPU0 (8,2):
 Reads Queued:         0,     0KiB   Writes Queued:       558, 25,244KiB
 Read Dispatches:      0,     0KiB   Write Dispatches:    265, 25,440KiB
 Reads Requeued:       0              Writes Requeued:       0
 Reads Completed:      0,     0KiB   Writes Completed:  1,027, 56,420KiB
 Read Merges:          0,     0KiB   Write Merges:         19,     76KiB
 Read depth:           0              Write depth:           3
 IO unplugs:         217              Timer unplugs:       268
CPU1 (8,2):
 Reads Queued:         0,     0KiB   Writes Queued:       483, 31,176KiB
 Read Dispatches:      0,     0KiB   Write Dispatches:    262, 30,980KiB
 Reads Requeued:       0              Writes Requeued:       0
 Reads Completed:      0,     0KiB   Writes Completed:      0,      0KiB
 Read Merges:          0,     0KiB   Write Merges:         20,     80KiB
 Read depth:           0              Write depth:           3
 IO unplugs:         265              Timer unplugs:       181

Total (8,2):
 Reads Queued:         0,     0KiB   Writes Queued:     1,041, 56,420KiB
 Read Dispatches:      0,     0KiB   Write Dispatches:    527, 56,420KiB
 Reads Requeued:       0              Writes Requeued:       0
 Reads Completed:      0,     0KiB   Writes Completed:  1,027, 56,420KiB
 Read Merges:          0,     0KiB   Write Merges:         39,    156KiB
 IO unplugs:         482              Timer unplugs:       449

Throughput (R/W): 0KiB/s / 482KiB/s
Events (8,2): 17,661 entries
Skips: 1,820 forward (3,918,005 -  99.6%)

I found that the I/O pattern the host issues differs when fio is run
with a different bs value.
16=0.0%, 32=0.0%, >=64=0.0% > submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% > complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% > issued r/w: total=0/6400, short=0/0 > > lat (msec): 2=0.05%, 4=0.09%, 10=94.08%, 20=0.78%, 50=3.39% > lat (msec): 100=0.33%, 250=1.12%, 500=0.11%, 750=0.03%, 2000=0.02% > > Run status group 0 (all jobs): > WRITE: io=51,200KB, aggrb=668KB/s, minb=684KB/s, maxb=684KB/s, > mint=76622msec, maxt=76622msec > > Disk stats (read/write): > dm-0: ios=913/6731, merge=0/0, ticks=161809/696060, in_queue=858086, > util=100.00%, aggrios=1070/6679, aggrmerge=316/410, > aggrticks=163975/1587245, aggrin_queue=1751267, aggrutil=100.00% > vda: ios=1070/6679, merge=316/410, ticks=163975/1587245, > in_queue=1751267, util=100.00% > test: (g=0): rw=write, bs=64K-64K/64K-64K, ioengine=libaio, iodepth=1 > Starting 1 process > Jobs: 1 (f=1): [W] [100.0% done] [0K/458K /s] [0/6 iops] [eta 00m:00s] > test: (groupid=0, jobs=1): err= 0: pid=27614 > write: io=51,200KB, bw=428KB/s, iops=6, runt=119618msec > slat (usec): min=28, max=5,507K, avg=7618.90, stdev=194782.69 > clat (msec): min=14, max=9,418, avg=140.49, stdev=328.96 > lat (msec): min=14, max=9,418, avg=148.11, stdev=382.21 > bw (KB/s) : min= 11, max= 664, per=114.06%, avg=488.19, stdev=53.97 > cpu : usr=0.03%, sys=0.04%, ctx=825, majf=0, minf=27 > IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% > submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% > complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% > issued r/w: total=0/800, short=0/0 > > lat (msec): 20=0.38%, 50=2.50%, 250=97.00%, >=2000=0.12% > > Run status group 0 (all jobs): > WRITE: io=51,200KB, aggrb=428KB/s, minb=438KB/s, maxb=438KB/s, > mint=119618msec, maxt=119618msec > > Disk stats (read/write): > dm-0: ios=0/938, merge=0/0, ticks=0/517622, in_queue=517712, > util=100.00%, aggrios=0/883, aggrmerge=0/55, aggrticks=0/351498, > 
aggrin_queue=351497, aggrutil=100.00% > vda: ios=0/883, merge=0/55, ticks=0/351498, in_queue=351497, util=100.00% > test: (g=0): rw=write, bs=128K-128K/128K-128K, ioengine=libaio, iodepth=1 > Starting 1 process > Jobs: 1 (f=1): [W] [100.0% done] [0K/374K /s] [0/2 iops] [eta 00m:00s] > test: (groupid=0, jobs=1): err= 0: pid=27623 > write: io=51,200KB, bw=422KB/s, iops=3, runt=121420msec > slat (usec): min=29, max=5,484K, avg=17456.88, stdev=274747.99 > clat (msec): min=174, max=9,559, avg=283.02, stdev=465.37 > lat (msec): min=176, max=9,559, avg=300.47, stdev=538.58 > bw (KB/s) : min= 22, max= 552, per=114.21%, avg=480.81, stdev=49.10 > cpu : usr=0.00%, sys=0.03%, ctx=425, majf=0, minf=27 > IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% > submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% > complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% > issued r/w: total=0/400, short=0/0 > > lat (msec): 250=9.50%, 500=90.25%, >=2000=0.25% > > Run status group 0 (all jobs): > WRITE: io=51,200KB, aggrb=421KB/s, minb=431KB/s, maxb=431KB/s, > mint=121420msec, maxt=121420msec > > Disk stats (read/write): > dm-0: ios=0/541, merge=0/0, ticks=0/573761, in_queue=574004, > util=100.00%, aggrios=0/484, aggrmerge=0/57, aggrticks=0/396662, > aggrin_queue=396662, aggrutil=100.00% > vda: ios=0/484, merge=0/57, ticks=0/396662, in_queue=396662, util=100.00% > > >> >> Stefan >> >> > > > > -- > Regards, > > Zhi Yong Wu > -- Regards, Zhi Yong Wu ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Qemu-devel] Why qemu write/rw speed is so low? 2011-09-13 2:52 ` Zhi Yong Wu @ 2011-09-13 7:14 ` Stefan Hajnoczi 2011-09-13 9:25 ` Zhi Yong Wu 0 siblings, 1 reply; 21+ messages in thread From: Stefan Hajnoczi @ 2011-09-13 7:14 UTC (permalink / raw) To: Zhi Yong Wu; +Cc: kwolf, Zhi Yong Wu, aliguro, qemu-devel, ryanh On Tue, Sep 13, 2011 at 10:52:44AM +0800, Zhi Yong Wu wrote: > This is real log when fio issued with bs=128K and bps=1000000(block > I/O throttling): I would use 1024 * 1024 instead of 1000000 as the throughput limit. 10^6 is not a multiple of 512 bytes and is not a nice value in KB/s (976.5625). > > 8,2 0 1 0.000000000 24332 A WS 79958528 + 256 <- > (253,2) 71830016 256 blocks = 256 * 512 bytes = 128 KB per request. We know the maximum request size from Linux is 128 KB so this makes sense. > Throughput (R/W): 0KiB/s / 482KiB/s What throughput do you get without I/O throttling? Either I/O throttling is limiting too aggressively here or the physical disk is the bottleneck (I doubt that since the write throughput value is very low). We need to compare against the throughput when throttling is not enabled. Stefan ^ permalink raw reply [flat|nested] 21+ messages in thread
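[Editorial note: Stefan's arithmetic above can be checked directly. A quick sketch using only the numbers quoted in the thread; the key assumption, stated here rather than in the thread, is that blktrace counts request sizes in 512-byte sectors:]

```python
SECTOR = 512  # blktrace/blkparse report request sizes in 512-byte sectors

# 1000000 bytes/s is neither sector-aligned nor a round KB/s value:
print(1000000 % SECTOR)    # 64 -> not a multiple of 512
print(1000000 / 1024)      # 976.5625 KB/s

# The "+ 256" in the quoted trace line is 256 sectors:
print(256 * SECTOR // 1024)  # 128 -> 128 KB, the Linux maximum request size
```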
* Re: [Qemu-devel] Why qemu write/rw speed is so low? 2011-09-13 7:14 ` Stefan Hajnoczi @ 2011-09-13 9:25 ` Zhi Yong Wu 2011-09-13 10:14 ` Stefan Hajnoczi 0 siblings, 1 reply; 21+ messages in thread From: Zhi Yong Wu @ 2011-09-13 9:25 UTC (permalink / raw) To: Stefan Hajnoczi; +Cc: kwolf, Zhi Yong Wu, aliguro, qemu-devel, ryanh On Tue, Sep 13, 2011 at 3:14 PM, Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> wrote: > On Tue, Sep 13, 2011 at 10:52:44AM +0800, Zhi Yong Wu wrote: >> This is real log when fio issued with bs=128K and bps=1000000(block >> I/O throttling): > > I would use 1024 * 1024 instead of 1000000 as the throughput limit. > 10^5 is not a multiple of 512 bytes and is not a nice value in KB/s > (976.5625). OK. next time, i will adopt this. > >> >> 8,2 0 1 0.000000000 24332 A WS 79958528 + 256 <- >> (253,2) 71830016 > > 256 blocks = 256 * 512 bytes = 128 KB per request. We know the maximum > request size from Linux is 128 KB so this makes sense. > >> Throughput (R/W): 0KiB/s / 482KiB/s > > What throughput do you get without I/O throttling? Either I/O > throttling is limiting too aggressively here or the physical disk is the > bottleneck (I double that since the write throughput value is very low). > We need to compare against the throughput when throttling is not > enabled. Without block I/O throttling. 
test: (g=0): rw=write, bs=512-512/512-512, ioengine=libaio, iodepth=1 Starting 1 process Jobs: 1 (f=1): [W] [100.0% done] [0K/58K /s] [0/114 iops] [eta 00m:00s] test: (groupid=0, jobs=1): err= 0: pid=2659 write: io=51,200KB, bw=59,936B/s, iops=117, runt=874741msec slat (usec): min=25, max=44,515, avg=69.77, stdev=774.19 clat (usec): min=778, max=216K, avg=8460.67, stdev=2417.70 lat (usec): min=845, max=216K, avg=8531.11, stdev=2778.62 bw (KB/s) : min= 11, max= 60, per=100.89%, avg=58.52, stdev= 3.14 cpu : usr=0.04%, sys=0.76%, ctx=102601, majf=0, minf=49 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued r/w: total=0/102400, short=0/0 lat (usec): 1000=0.01% lat (msec): 2=0.01%, 4=0.01%, 10=99.17%, 20=0.24%, 50=0.53% lat (msec): 100=0.04%, 250=0.01% Run status group 0 (all jobs): WRITE: io=51,200KB, aggrb=58KB/s, minb=59KB/s, maxb=59KB/s, mint=874741msec, maxt=874741msec Disk stats (read/write): dm-0: ios=37/103237, merge=0/0, ticks=1935/901887, in_queue=903811, util=99.67%, aggrios=37/102904, aggrmerge=0/351, aggrticks=1935/889769, aggrin_queue=891623, aggrutil=99.64% vda: ios=37/102904, merge=0/351, ticks=1935/889769, in_queue=891623, util=99.64% test: (g=0): rw=write, bs=8K-8K/8K-8K, ioengine=libaio, iodepth=1 Starting 1 process Jobs: 1 (f=1): [W] [100.0% done] [0K/973K /s] [0/118 iops] [eta 00m:00s] test: (groupid=0, jobs=1): err= 0: pid=2716 write: io=51,200KB, bw=926KB/s, iops=115, runt= 55291msec slat (usec): min=20, max=36,133, avg=68.68, stdev=920.02 clat (msec): min=1, max=58, avg= 8.52, stdev= 1.99 lat (msec): min=1, max=66, avg= 8.58, stdev= 2.48 bw (KB/s) : min= 587, max= 972, per=100.23%, avg=928.14, stdev=54.43 cpu : usr=0.04%, sys=0.59%, ctx=6416, majf=0, minf=26 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 
16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued r/w: total=0/6400, short=0/0 lat (msec): 2=0.06%, 4=0.06%, 10=99.00%, 20=0.25%, 50=0.61% lat (msec): 100=0.02% Run status group 0 (all jobs): WRITE: io=51,200KB, aggrb=926KB/s, minb=948KB/s, maxb=948KB/s, mint=55291msec, maxt=55291msec Disk stats (read/write): dm-0: ios=3/6507, merge=0/0, ticks=33/68470, in_queue=68508, util=99.51%, aggrios=3/6462, aggrmerge=0/60, aggrticks=33/64291, aggrin_queue=64322, aggrutil=99.48% vda: ios=3/6462, merge=0/60, ticks=33/64291, in_queue=64322, util=99.48% test: (g=0): rw=write, bs=64K-64K/64K-64K, ioengine=libaio, iodepth=1 Starting 1 process Jobs: 1 (f=1): [W] [100.0% done] [0K/7,259K /s] [0/110 iops] [eta 00m:00s] test: (groupid=0, jobs=1): err= 0: pid=2727 write: io=51,200KB, bw=7,050KB/s, iops=110, runt= 7262msec slat (usec): min=30, max=46,393, avg=90.62, stdev=1639.10 clat (msec): min=2, max=39, avg= 8.98, stdev= 1.82 lat (msec): min=2, max=85, avg= 9.07, stdev= 3.08 bw (KB/s) : min= 6003, max= 7252, per=100.13%, avg=7058.86, stdev=362.31 cpu : usr=0.00%, sys=0.61%, ctx=801, majf=0, minf=23 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued r/w: total=0/800, short=0/0 lat (msec): 4=0.25%, 10=92.38%, 20=7.00%, 50=0.38% Run status group 0 (all jobs): WRITE: io=51,200KB, aggrb=7,050KB/s, minb=7,219KB/s, maxb=7,219KB/s, mint=7262msec, maxt=7262msec Disk stats (read/write): dm-0: ios=0/808, merge=0/0, ticks=0/8216, in_queue=8225, util=98.31%, aggrios=0/804, aggrmerge=0/18, aggrticks=0/7363, aggrin_queue=7363, aggrutil=98.19% vda: ios=0/804, merge=0/18, ticks=0/7363, in_queue=7363, util=98.19% test: (g=0): rw=write, bs=128K-128K/128K-128K, ioengine=libaio, iodepth=1 Starting 1 process Jobs: 1 (f=1): [W] [100.0% done] [0K/13M /s] [0/103 iops] 
[eta 00m:00s] test: (groupid=0, jobs=1): err= 0: pid=2734 write: io=51,200KB, bw=12,933KB/s, iops=101, runt= 3959msec slat (usec): min=35, max=37,845, avg=132.90, stdev=1890.38 clat (msec): min=3, max=39, avg= 9.76, stdev= 2.56 lat (msec): min=3, max=77, avg= 9.89, stdev= 3.98 bw (KB/s) : min=13029, max=13660, per=103.33%, avg=13362.14, stdev=227.96 cpu : usr=0.43%, sys=0.15%, ctx=401, majf=0, minf=23 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued r/w: total=0/400, short=0/0 lat (msec): 4=0.50%, 10=84.25%, 20=14.50%, 50=0.75% Run status group 0 (all jobs): WRITE: io=51,200KB, aggrb=12,932KB/s, minb=13,242KB/s, maxb=13,242KB/s, mint=3959msec, maxt=3959msec Disk stats (read/write): dm-0: ios=0/415, merge=0/0, ticks=0/4834, in_queue=4838, util=97.39%, aggrios=0/404, aggrmerge=0/18, aggrticks=0/4068, aggrin_queue=4068, aggrutil=97.08% vda: ios=0/404, merge=0/18, ticks=0/4068, in_queue=4068, util=97.08% test: (g=0): rw=write, bs=256K-256K/256K-256K, ioengine=libaio, iodepth=1 Starting 1 process Jobs: 1 (f=1): [W] [-.-% done] [0K/23M /s] [0/90 iops] [eta 00m:00s] test: (groupid=0, jobs=1): err= 0: pid=2741 write: io=51,200KB, bw=23,073KB/s, iops=90, runt= 2219msec slat (usec): min=45, max=246, avg=49.48, stdev=15.27 clat (msec): min=4, max=58, avg=11.04, stdev= 3.73 lat (msec): min=4, max=58, avg=11.09, stdev= 3.75 bw (KB/s) : min=21841, max=23920, per=100.02%, avg=23077.75, stdev=1005.31 cpu : usr=0.41%, sys=0.23%, ctx=200, majf=0, minf=23 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued r/w: total=0/200, short=0/0 lat (msec): 10=16.50%, 20=83.00%, 100=0.50% Run status group 0 (all jobs): WRITE: io=51,200KB, 
aggrb=23,073KB/s, minb=23,627KB/s, maxb=23,627KB/s, mint=2219msec, maxt=2219msec Disk stats (read/write): dm-0: ios=0/198, merge=0/0, ticks=0/2841, in_queue=2849, util=95.10%, aggrios=0/206, aggrmerge=0/0, aggrticks=0/2944, aggrin_queue=2944, aggrutil=94.61% vda: ios=0/206, merge=0/0, ticks=0/2944, in_queue=2944, util=94.61% > > Stefan > -- Regards, Zhi Yong Wu ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Qemu-devel] Why qemu write/rw speed is so low? 2011-09-13 9:25 ` Zhi Yong Wu @ 2011-09-13 10:14 ` Stefan Hajnoczi 2011-09-13 10:27 ` Zhi Yong Wu 2011-09-14 2:42 ` Zhi Yong Wu 0 siblings, 2 replies; 21+ messages in thread From: Stefan Hajnoczi @ 2011-09-13 10:14 UTC (permalink / raw) To: Zhi Yong Wu Cc: kwolf, aliguro, Stefan Hajnoczi, Zhi Yong Wu, qemu-devel, ryanh On Tue, Sep 13, 2011 at 10:25 AM, Zhi Yong Wu <zwu.kernel@gmail.com> wrote: > On Tue, Sep 13, 2011 at 3:14 PM, Stefan Hajnoczi > <stefanha@linux.vnet.ibm.com> wrote: >> On Tue, Sep 13, 2011 at 10:52:44AM +0800, Zhi Yong Wu wrote: >>> This is real log when fio issued with bs=128K and bps=1000000(block >>> I/O throttling): >> >> I would use 1024 * 1024 instead of 1000000 as the throughput limit. >> 10^5 is not a multiple of 512 bytes and is not a nice value in KB/s >> (976.5625). > OK. next time, i will adopt this. >> >>> >>> 8,2 0 1 0.000000000 24332 A WS 79958528 + 256 <- >>> (253,2) 71830016 >> >> 256 blocks = 256 * 512 bytes = 128 KB per request. We know the maximum >> request size from Linux is 128 KB so this makes sense. >> >>> Throughput (R/W): 0KiB/s / 482KiB/s >> >> What throughput do you get without I/O throttling? Either I/O >> throttling is limiting too aggressively here or the physical disk is the >> bottleneck (I double that since the write throughput value is very low). >> We need to compare against the throughput when throttling is not >> enabled. > Without block I/O throttling. [...] > test: (g=0): rw=write, bs=128K-128K/128K-128K, ioengine=libaio, iodepth=1 > Starting 1 process > Jobs: 1 (f=1): [W] [100.0% done] [0K/13M /s] [0/103 iops] [eta 00m:00s] > test: (groupid=0, jobs=1): err= 0: pid=2734 > write: io=51,200KB, bw=12,933KB/s, iops=101, runt= 3959msec This shows that the physical disk is capable of far exceeding 1 MB/s when I/O is not limited. So the earlier result where the guest only gets 482 KiB/s under 1000000 bps limit shows that I/O limits are being too aggressive. 
For some reason the algorithm is causing the guest to get lower throughput than expected. It would be interesting to try with bps=$((10 * 1024 * 1024)). I wonder if the algorithm has a constant overhead of a couple hundred KB/s or if it changes with the much larger bps value. Stefan ^ permalink raw reply [flat|nested] 21+ messages in thread
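[Editorial note: the gap Stefan describes can be quantified from the figures already in the thread. A small sketch; the 482 KiB/s value is the blktrace write throughput reported earlier, and treating the bps limit as an exact ceiling is an assumption of this sketch:]

```python
limit_bps = 1000000            # -drive ...,bps=1000000 (bytes per second)
observed_kib = 482             # "Throughput (R/W): 0KiB/s / 482KiB/s"

limit_kib = limit_bps / 1024   # ~976.6 KiB/s permitted by the limit
shortfall = 1 - observed_kib / limit_kib
print(f"guest reached {observed_kib} of {limit_kib:.1f} KiB/s "
      f"({shortfall:.0%} below the limit)")

# Stefan's suggested follow-up limit, a clean power-of-two value:
print(10 * 1024 * 1024)        # 10485760, i.e. bps=$((10 * 1024 * 1024))
```

The guest reaches roughly half of the configured limit, which is why a constant per-request overhead in the throttling algorithm is a plausible suspect.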
* Re: [Qemu-devel] Why qemu write/rw speed is so low? 2011-09-13 10:14 ` Stefan Hajnoczi @ 2011-09-13 10:27 ` Zhi Yong Wu 2011-09-14 2:42 ` Zhi Yong Wu 1 sibling, 0 replies; 21+ messages in thread From: Zhi Yong Wu @ 2011-09-13 10:27 UTC (permalink / raw) To: Stefan Hajnoczi Cc: kwolf, aliguro, Stefan Hajnoczi, Zhi Yong Wu, qemu-devel, ryanh On Tue, Sep 13, 2011 at 6:14 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote: > On Tue, Sep 13, 2011 at 10:25 AM, Zhi Yong Wu <zwu.kernel@gmail.com> wrote: >> On Tue, Sep 13, 2011 at 3:14 PM, Stefan Hajnoczi >> <stefanha@linux.vnet.ibm.com> wrote: >>> On Tue, Sep 13, 2011 at 10:52:44AM +0800, Zhi Yong Wu wrote: >>>> This is real log when fio issued with bs=128K and bps=1000000(block >>>> I/O throttling): >>> >>> I would use 1024 * 1024 instead of 1000000 as the throughput limit. >>> 10^5 is not a multiple of 512 bytes and is not a nice value in KB/s >>> (976.5625). >> OK. next time, i will adopt this. >>> >>>> >>>> 8,2 0 1 0.000000000 24332 A WS 79958528 + 256 <- >>>> (253,2) 71830016 >>> >>> 256 blocks = 256 * 512 bytes = 128 KB per request. We know the maximum >>> request size from Linux is 128 KB so this makes sense. >>> >>>> Throughput (R/W): 0KiB/s / 482KiB/s >>> >>> What throughput do you get without I/O throttling? Either I/O >>> throttling is limiting too aggressively here or the physical disk is the >>> bottleneck (I double that since the write throughput value is very low). >>> We need to compare against the throughput when throttling is not >>> enabled. >> Without block I/O throttling. > [...] >> test: (g=0): rw=write, bs=128K-128K/128K-128K, ioengine=libaio, iodepth=1 >> Starting 1 process >> Jobs: 1 (f=1): [W] [100.0% done] [0K/13M /s] [0/103 iops] [eta 00m:00s] >> test: (groupid=0, jobs=1): err= 0: pid=2734 >> write: io=51,200KB, bw=12,933KB/s, iops=101, runt= 3959msec > > This shows that the physical disk is capable of far exceeding 1 MB/s > when I/O is not limited. 
So the earlier result where the guest only > gets 482 KiB/s under 1000000 bps limit shows that I/O limits are being > too aggressive. For some reason the algorithm is causing the guest to > get lower throughput than expected. > > It would be interesting to try with bps=$((10 * 1024 * 1024)). I > wonder if the algorithm has a constant overhead of a couple hundred > KB/s or if it changes with the much larger bps value. OK, I will try it tomorrow. By the way, I/O throttling can now be enabled from the libvirt guest XML file when the guest is started up. I have pushed the code changes to my git tree. git branch: ssh://wuzhy@repo.or.cz/srv/git/libvirt/zwu.git dev > > Stefan > -- Regards, Zhi Yong Wu ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Qemu-devel] Why qemu write/rw speed is so low? 2011-09-13 10:14 ` Stefan Hajnoczi 2011-09-13 10:27 ` Zhi Yong Wu @ 2011-09-14 2:42 ` Zhi Yong Wu 2011-09-14 14:17 ` Stefan Hajnoczi 1 sibling, 1 reply; 21+ messages in thread From: Zhi Yong Wu @ 2011-09-14 2:42 UTC (permalink / raw) To: Stefan Hajnoczi Cc: kwolf, aliguro, Stefan Hajnoczi, Zhi Yong Wu, qemu-devel, ryanh Log for bps=((10 * 1024 * 1024)). test: (g=0): rw=write, bs=512-512/512-512, ioengine=libaio, iodepth=1 Starting 1 process Jobs: 1 (f=1): [W] [100.0% done] [0K/58K /s] [0/114 iops] [eta 00m:00s] test: (groupid=0, jobs=1): err= 0: pid=2657 write: io=51,200KB, bw=58,527B/s, iops=114, runt=895793msec slat (usec): min=26, max=376K, avg=81.69, stdev=2104.09 clat (usec): min=859, max=757K, avg=8648.07, stdev=8278.64 lat (usec): min=921, max=1,133K, avg=8730.49, stdev=9239.57 bw (KB/s) : min= 0, max= 60, per=101.03%, avg=57.59, stdev= 7.41 cpu : usr=0.05%, sys=0.75%, ctx=102611, majf=0, minf=51 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued r/w: total=0/102400, short=0/0 lat (usec): 1000=0.01% lat (msec): 2=0.01%, 4=0.02%, 10=98.99%, 20=0.24%, 50=0.66% lat (msec): 100=0.03%, 250=0.01%, 500=0.05%, 1000=0.01% Run status group 0 (all jobs): WRITE: io=51,200KB, aggrb=57KB/s, minb=58KB/s, maxb=58KB/s, mint=895793msec, maxt=895793msec Disk stats (read/write): dm-0: ios=28/103311, merge=0/0, ticks=1318/950537, in_queue=951852, util=99.63%, aggrios=28/102932, aggrmerge=0/379, aggrticks=1316/929743, aggrin_queue=930987, aggrutil=99.60% vda: ios=28/102932, merge=0/379, ticks=1316/929743, in_queue=930987, util=99.60% test: (g=0): rw=write, bs=8K-8K/8K-8K, ioengine=libaio, iodepth=1 Starting 1 process Jobs: 1 (f=1): [W] [100.0% done] [0K/892K /s] [0/108 iops] [eta 00m:00s] test: (groupid=0, jobs=1): err= 0: pid=2782 write: io=51,200KB, 
bw=926KB/s, iops=115, runt= 55269msec slat (usec): min=20, max=32,160, avg=66.43, stdev=935.62 clat (msec): min=1, max=157, avg= 8.53, stdev= 2.55 lat (msec): min=1, max=158, avg= 8.60, stdev= 2.93 bw (KB/s) : min= 539, max= 968, per=100.12%, avg=927.09, stdev=63.89 cpu : usr=0.10%, sys=0.47%, ctx=6415, majf=0, minf=26 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued r/w: total=0/6400, short=0/0 lat (msec): 2=0.06%, 4=0.05%, 10=99.19%, 20=0.06%, 50=0.62% lat (msec): 250=0.02% Run status group 0 (all jobs): WRITE: io=51,200KB, aggrb=926KB/s, minb=948KB/s, maxb=948KB/s, mint=55269msec, maxt=55269msec Disk stats (read/write): dm-0: ios=3/6546, merge=0/0, ticks=117/65262, in_queue=65387, util=99.58%, aggrios=3/6472, aggrmerge=0/79, aggrticks=117/62063, aggrin_queue=62178, aggrutil=99.54% vda: ios=3/6472, merge=0/79, ticks=117/62063, in_queue=62178, util=99.54% test: (g=0): rw=write, bs=64K-64K/64K-64K, ioengine=libaio, iodepth=1 Starting 1 process Jobs: 1 (f=1): [W] [100.0% done] [0K/7,332K /s] [0/111 iops] [eta 00m:00s] test: (g=0): rw=write, bs=64K-64K/64K-64K, ioengine=libaio, iodepth=1 Starting 1 process Jobs: 1 (f=1): [W] [100.0% done] [0K/7,332K /s] [0/111 iops] [eta 00m:00s] test: (groupid=0, jobs=1): err= 0: pid=2793 write: io=51,200KB, bw=7,074KB/s, iops=110, runt= 7238msec slat (usec): min=23, max=37,715, avg=82.08, stdev=1332.25 clat (msec): min=2, max=34, avg= 8.96, stdev= 1.54 lat (msec): min=2, max=58, avg= 9.04, stdev= 2.31 bw (KB/s) : min= 6361, max= 7281, per=100.13%, avg=7082.07, stdev=274.31 cpu : usr=0.08%, sys=0.53%, ctx=801, majf=0, minf=23 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued r/w: 
total=0/800, short=0/0 lat (msec): 4=0.25%, 10=92.12%, 20=7.25%, 50=0.38% Run status group 0 (all jobs): WRITE: io=51,200KB, aggrb=7,073KB/s, minb=7,243KB/s, maxb=7,243KB/s, mint=7238msec, maxt=7238msec Disk stats (read/write): dm-0: ios=0/811, merge=0/0, ticks=0/8003, in_queue=8003, util=98.35%, aggrios=0/804, aggrmerge=0/17, aggrticks=0/7319, aggrin_queue=7319, aggrutil=98.19% vda: ios=0/804, merge=0/17, ticks=0/7319, in_queue=7319, util=98.19% test: (g=0): rw=write, bs=128K-128K/128K-128K, ioengine=libaio, iodepth=1 Starting 1 process Jobs: 1 (f=1): [W] [83.3% done] [0K/10M /s] [0/81 iops] [eta 00m:01s] test: (groupid=0, jobs=1): err= 0: pid=2800 write: io=51,200KB, bw=10,113KB/s, iops=79, runt= 5063msec slat (usec): min=36, max=35,279, avg=130.55, stdev=1761.93 clat (msec): min=3, max=134, avg=12.52, stdev=16.93 lat (msec): min=3, max=134, avg=12.65, stdev=17.14 bw (KB/s) : min= 7888, max=13128, per=100.41%, avg=10153.00, stdev=1607.48 cpu : usr=0.00%, sys=0.51%, ctx=401, majf=0, minf=23 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued r/w: total=0/400, short=0/0 lat (msec): 4=0.50%, 10=81.25%, 20=14.75%, 50=0.75%, 250=2.75% Run status group 0 (all jobs): WRITE: io=51,200KB, aggrb=10,112KB/s, minb=10,355KB/s, maxb=10,355KB/s, mint=5063msec, maxt=5063msec Disk stats (read/write): dm-0: ios=0/403, merge=0/0, ticks=0/6216, in_queue=6225, util=97.83%, aggrios=0/404, aggrmerge=0/17, aggrticks=0/5228, aggrin_queue=5228, aggrutil=97.62% vda: ios=0/404, merge=0/17, ticks=0/5228, in_queue=5228, util=97.62% test: (g=0): rw=write, bs=256K-256K/256K-256K, ioengine=libaio, iodepth=1 Starting 1 process Jobs: 1 (f=1): [W] [100.0% done] [0K/9,165K /s] [0/34 iops] [eta 00m:00s] test: (groupid=0, jobs=1): err= 0: pid=2807 write: io=51,200KB, bw=8,883KB/s, iops=34, runt= 5764msec slat (usec): min=37, 
max=36,481, avg=240.22, stdev=2575.78 clat (msec): min=4, max=164, avg=28.57, stdev=39.97 lat (msec): min=4, max=164, avg=28.81, stdev=40.03 bw (KB/s) : min= 7613, max= 9678, per=98.98%, avg=8791.27, stdev=569.82 cpu : usr=0.10%, sys=0.17%, ctx=201, majf=0, minf=23 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued r/w: total=0/200, short=0/0 lat (msec): 10=14.50%, 20=68.00%, 50=1.00%, 250=16.50% Run status group 0 (all jobs): WRITE: io=51,200KB, aggrb=8,882KB/s, minb=9,095KB/s, maxb=9,095KB/s, mint=5764msec, maxt=5764msec Disk stats (read/write): dm-0: ios=0/216, merge=0/0, ticks=0/6446, in_queue=6484, util=98.10%, aggrios=0/204, aggrmerge=0/16, aggrticks=0/5861, aggrin_queue=5861, aggrutil=97.92% vda: ios=0/204, merge=0/16, ticks=0/5861, in_queue=5861, util=97.92% On Tue, Sep 13, 2011 at 6:14 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote: > On Tue, Sep 13, 2011 at 10:25 AM, Zhi Yong Wu <zwu.kernel@gmail.com> wrote: >> On Tue, Sep 13, 2011 at 3:14 PM, Stefan Hajnoczi >> <stefanha@linux.vnet.ibm.com> wrote: >>> On Tue, Sep 13, 2011 at 10:52:44AM +0800, Zhi Yong Wu wrote: >>>> This is real log when fio issued with bs=128K and bps=1000000(block >>>> I/O throttling): >>> >>> I would use 1024 * 1024 instead of 1000000 as the throughput limit. >>> 10^5 is not a multiple of 512 bytes and is not a nice value in KB/s >>> (976.5625). >> OK. next time, i will adopt this. >>> >>>> >>>> 8,2 0 1 0.000000000 24332 A WS 79958528 + 256 <- >>>> (253,2) 71830016 >>> >>> 256 blocks = 256 * 512 bytes = 128 KB per request. We know the maximum >>> request size from Linux is 128 KB so this makes sense. >>> >>>> Throughput (R/W): 0KiB/s / 482KiB/s >>> >>> What throughput do you get without I/O throttling? 
Either I/O >>> throttling is limiting too aggressively here or the physical disk is the >>> bottleneck (I double that since the write throughput value is very low). >>> We need to compare against the throughput when throttling is not >>> enabled. >> Without block I/O throttling. > [...] >> test: (g=0): rw=write, bs=128K-128K/128K-128K, ioengine=libaio, iodepth=1 >> Starting 1 process >> Jobs: 1 (f=1): [W] [100.0% done] [0K/13M /s] [0/103 iops] [eta 00m:00s] >> test: (groupid=0, jobs=1): err= 0: pid=2734 >> write: io=51,200KB, bw=12,933KB/s, iops=101, runt= 3959msec > > This shows that the physical disk is capable of far exceeding 1 MB/s > when I/O is not limited. So the earlier result where the guest only > gets 482 KiB/s under 1000000 bps limit shows that I/O limits are being > too aggressive. For some reason the algorithm is causing the guest to > get lower throughput than expected. > > It would be interesting to try with bps=$((10 * 1024 * 1024)). I > wonder if the algorithm has a constant overhead of a couple hundred > KB/s or if it changes with the much larger bps value. > > Stefan > -- Regards, Zhi Yong Wu ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Qemu-devel] Why qemu write/rw speed is so low? 2011-09-14 2:42 ` Zhi Yong Wu @ 2011-09-14 14:17 ` Stefan Hajnoczi 2011-09-15 9:04 ` Zhi Yong Wu 0 siblings, 1 reply; 21+ messages in thread From: Stefan Hajnoczi @ 2011-09-14 14:17 UTC (permalink / raw) To: Zhi Yong Wu Cc: kwolf, aliguro, Stefan Hajnoczi, Zhi Yong Wu, qemu-devel, ryanh On Wed, Sep 14, 2011 at 3:42 AM, Zhi Yong Wu <zwu.kernel@gmail.com> wrote: > Log for bps=((10 * 1024 * 1024)). Okay, I think this data shows that the I/O limits are too aggressive. There seems to be some "overhead" amount so the guest is never able to reach its bps limit: > test: (g=0): rw=write, bs=64K-64K/64K-64K, ioengine=libaio, iodepth=1 > WRITE: io=51,200KB, aggrb=7,073KB/s, minb=7,243KB/s, maxb=7,243KB/s, > test: (g=0): rw=write, bs=128K-128K/128K-128K, ioengine=libaio, iodepth=1 > WRITE: io=51,200KB, aggrb=10,112KB/s, minb=10,355KB/s, > test: (g=0): rw=write, bs=256K-256K/256K-256K, ioengine=libaio, iodepth=1 > WRITE: io=51,200KB, aggrb=8,882KB/s, minb=9,095KB/s, maxb=9,095KB/s, bs=128 KB worked nicely. The 64 KB and 256 KB cases don't look so good. I worry a little that the benchmark duration is quite short so a fluctuation would affect the results more than if the benchmark duration was extended to 30 secs or 1 minute. Zhi Yong: Do you get similar results each time you run this benchmark or do they vary by more than +/- 512 KB? If the results are stable and the benchmark is able to exceed 10 MB/s when running without I/O throttling, then it's important to figure out why the guest isn't achieving 10 MB/s. Stefan ^ permalink raw reply [flat|nested] 21+ messages in thread
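[Editorial note: one way to act on the duration concern is a time-based fio run instead of a fixed 51,200 KB of I/O. A sketch of a fio job file assembled from the parameters used throughout this thread; the filename and the direct=1 setting are illustrative assumptions, not taken from the thread:]

```
[test]
rw=write
bs=128k
ioengine=libaio
iodepth=1
direct=1            ; assumption: bypass the guest page cache
filename=/dev/vdb   ; illustrative target device
time_based=1        ; keep writing for the full runtime even if the file is covered
runtime=60          ; 60 seconds, per the suggestion above
```

With time_based and runtime set, each data point averages over a fixed wall-clock window, so a short stall no longer dominates the reported bandwidth.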
* Re: [Qemu-devel] Why qemu write/rw speed is so low? 2011-09-14 14:17 ` Stefan Hajnoczi @ 2011-09-15 9:04 ` Zhi Yong Wu 0 siblings, 0 replies; 21+ messages in thread From: Zhi Yong Wu @ 2011-09-15 9:04 UTC (permalink / raw) To: Stefan Hajnoczi Cc: kwolf, aliguro, Stefan Hajnoczi, Zhi Yong Wu, qemu-devel, ryanh On Wed, Sep 14, 2011 at 10:17 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote: > On Wed, Sep 14, 2011 at 3:42 AM, Zhi Yong Wu <zwu.kernel@gmail.com> wrote: >> Log for bps=((10 * 1024 * 1024)). > > Okay, I think this data shows that I/O limits is too aggressive. > There seems to be some "overhead" amount so the guest is never able to > reach its bps limit: > >> test: (g=0): rw=write, bs=64K-64K/64K-64K, ioengine=libaio, iodepth=1 >> WRITE: io=51,200KB, aggrb=7,073KB/s, minb=7,243KB/s, maxb=7,243KB/s, > >> test: (g=0): rw=write, bs=128K-128K/128K-128K, ioengine=libaio, iodepth=1 >> WRITE: io=51,200KB, aggrb=10,112KB/s, minb=10,355KB/s, > >> test: (g=0): rw=write, bs=256K-256K/256K-256K, ioengine=libaio, iodepth=1 >> WRITE: io=51,200KB, aggrb=8,882KB/s, minb=9,095KB/s, maxb=9,095KB/s, > > bs=128 KB worked nicely. The 64 KB and 256 KB cases don't look so good. > > I worry a little that the benchmark duration is quite short so a > fluctuation would affect the results more than if the benchmark > duration was extended to 30 secs or 1 minute. > > Zhi Yong: Do you get similar results each time you run this benchmark Right > or do they vary by more than +/- 512 KB? If the results are stable The results vary by less than +/-512 KB when I run the benchmark multiple times with the same bs value. > and the benchmark is able to exceed 10 MB/s when running without I/O > throttling, then it's important to figure out why the guest isn't > achieving 10 MB/s. > > Stefan > -- Regards, Zhi Yong Wu ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Qemu-devel] Why qemu write/rw speed is so low?
  2011-09-13  2:38     ` Zhi Yong Wu
  2011-09-13  2:52       ` Zhi Yong Wu
@ 2011-09-13  7:15       ` Stefan Hajnoczi
  2011-09-13  8:31         ` Zhi Yong Wu
  1 sibling, 1 reply; 21+ messages in thread
From: Stefan Hajnoczi @ 2011-09-13  7:15 UTC (permalink / raw)
  To: Zhi Yong Wu; +Cc: kwolf, Zhi Yong Wu, aliguro, qemu-devel, ryanh

On Tue, Sep 13, 2011 at 10:38:28AM +0800, Zhi Yong Wu wrote:
> On Fri, Sep 9, 2011 at 6:38 PM, Stefan Hajnoczi
> <stefanha@linux.vnet.ibm.com> wrote:
> > On Fri, Sep 09, 2011 at 05:44:36PM +0800, Zhi Yong Wu wrote:
> >> Today I did some basic I/O testing and suddenly found that qemu write
> >> and rw speed is very low now; my qemu binary is built on commit
> >> 344eecf6995f4a0ad1d887cec922f6806f91a3f8.
> >>
> >> Does qemu have a regression?
> >>
> >> The testing data is shown below:
> >>
> >> 1.) write
> >>
> >> test: (g=0): rw=write, bs=512-512/512-512, ioengine=libaio, iodepth=1
> >
> > Please post your QEMU command line.  If your -drive is using
> > cache=writethrough, then small writes are slow because they require the
> > physical disk to write and then synchronize its write cache.  Typically
> > cache=none is a good setting to use for local disks.
> >
> > The block size of 512 bytes is too small.  Ext4 uses a 4 KB block size,
> > so I think a 512 byte write from the guest could cause a 4 KB
> > read-modify-write operation on the host filesystem.
> >
> > You can check this by running btrace(8) on the host during the
> > benchmark.  The blktrace output and the summary statistics will show
> > what I/O pattern the host is issuing.
> 8,2    0        1     0.000000000   337  A  WS 425081504 + 8 <- (253,1) 42611360

8 blocks = 8 * 512 bytes = 4 KB

So we are not performing 512 byte writes.  Some layer is changing the
I/O pattern.

Stefan

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [Qemu-devel] Why qemu write/rw speed is so low?
  2011-09-13  7:15       ` Stefan Hajnoczi
@ 2011-09-13  8:31         ` Zhi Yong Wu
  2011-09-13  8:49           ` Stefan Hajnoczi
  0 siblings, 1 reply; 21+ messages in thread
From: Zhi Yong Wu @ 2011-09-13  8:31 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: kwolf, Zhi Yong Wu, aliguro, qemu-devel, ryanh

On Tue, Sep 13, 2011 at 3:15 PM, Stefan Hajnoczi
<stefanha@linux.vnet.ibm.com> wrote:
> On Tue, Sep 13, 2011 at 10:38:28AM +0800, Zhi Yong Wu wrote:
>> On Fri, Sep 9, 2011 at 6:38 PM, Stefan Hajnoczi
>> <stefanha@linux.vnet.ibm.com> wrote:
>> > On Fri, Sep 09, 2011 at 05:44:36PM +0800, Zhi Yong Wu wrote:
>> >> Today I did some basic I/O testing and suddenly found that qemu write
>> >> and rw speed is very low now; my qemu binary is built on commit
>> >> 344eecf6995f4a0ad1d887cec922f6806f91a3f8.
>> >>
>> >> Does qemu have a regression?
>> >>
>> >> The testing data is shown below:
>> >>
>> >> 1.) write
>> >>
>> >> test: (g=0): rw=write, bs=512-512/512-512, ioengine=libaio, iodepth=1
>> >
>> > Please post your QEMU command line.  If your -drive is using
>> > cache=writethrough, then small writes are slow because they require the
>> > physical disk to write and then synchronize its write cache.  Typically
>> > cache=none is a good setting to use for local disks.
>> >
>> > The block size of 512 bytes is too small.  Ext4 uses a 4 KB block size,
>> > so I think a 512 byte write from the guest could cause a 4 KB
>> > read-modify-write operation on the host filesystem.
>> >
>> > You can check this by running btrace(8) on the host during the
>> > benchmark.  The blktrace output and the summary statistics will show
>> > what I/O pattern the host is issuing.
>> 8,2    0        1     0.000000000   337  A  WS 425081504 + 8 <- (253,1) 42611360
>
> 8 blocks = 8 * 512 bytes = 4 KB

How do you know each block size is 512 bytes?

> So we are not performing 512 byte writes.  Some layer is changing the
> I/O pattern.
>
> Stefan

-- 
Regards,

Zhi Yong Wu

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [Qemu-devel] Why qemu write/rw speed is so low?
  2011-09-13  8:31         ` Zhi Yong Wu
@ 2011-09-13  8:49           ` Stefan Hajnoczi
  2011-09-13  8:54             ` Zhi Yong Wu
  0 siblings, 1 reply; 21+ messages in thread
From: Stefan Hajnoczi @ 2011-09-13  8:49 UTC (permalink / raw)
  To: Zhi Yong Wu
  Cc: kwolf, aliguro, Stefan Hajnoczi, Zhi Yong Wu, qemu-devel, ryanh

On Tue, Sep 13, 2011 at 9:31 AM, Zhi Yong Wu <zwu.kernel@gmail.com> wrote:
> On Tue, Sep 13, 2011 at 3:15 PM, Stefan Hajnoczi
> <stefanha@linux.vnet.ibm.com> wrote:
>> On Tue, Sep 13, 2011 at 10:38:28AM +0800, Zhi Yong Wu wrote:
>>> On Fri, Sep 9, 2011 at 6:38 PM, Stefan Hajnoczi
>>> <stefanha@linux.vnet.ibm.com> wrote:
>>> > On Fri, Sep 09, 2011 at 05:44:36PM +0800, Zhi Yong Wu wrote:
>>> >> Today I did some basic I/O testing and suddenly found that qemu write
>>> >> and rw speed is very low now; my qemu binary is built on commit
>>> >> 344eecf6995f4a0ad1d887cec922f6806f91a3f8.
>>> >>
>>> >> Does qemu have a regression?
>>> >>
>>> >> The testing data is shown below:
>>> >>
>>> >> 1.) write
>>> >>
>>> >> test: (g=0): rw=write, bs=512-512/512-512, ioengine=libaio, iodepth=1
>>> >
>>> > Please post your QEMU command line.  If your -drive is using
>>> > cache=writethrough, then small writes are slow because they require the
>>> > physical disk to write and then synchronize its write cache.  Typically
>>> > cache=none is a good setting to use for local disks.
>>> >
>>> > The block size of 512 bytes is too small.  Ext4 uses a 4 KB block size,
>>> > so I think a 512 byte write from the guest could cause a 4 KB
>>> > read-modify-write operation on the host filesystem.
>>> >
>>> > You can check this by running btrace(8) on the host during the
>>> > benchmark.  The blktrace output and the summary statistics will show
>>> > what I/O pattern the host is issuing.
>>> 8,2    0        1     0.000000000   337  A  WS 425081504 + 8 <- (253,1) 42611360
>>
>> 8 blocks = 8 * 512 bytes = 4 KB
> How do you know each block size is 512 bytes?

The blkparse format specifier for blocks is 'n'.  Here is the code that
prints it, from blkparse_fmt.c:

        case 'n':
                fprintf(ofp, strcat(format, "u"), t_sec(t));

And t_sec() is:

        #define t_sec(t) ((t)->bytes >> 9)

So it divides the byte count by 512.  Block size == sector size == 512 bytes.

You can get the blktrace source code here:

http://brick.kernel.dk/snaps/

Stefan

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [Qemu-devel] Why qemu write/rw speed is so low?
  2011-09-13  8:49           ` Stefan Hajnoczi
@ 2011-09-13  8:54             ` Zhi Yong Wu
  0 siblings, 0 replies; 21+ messages in thread
From: Zhi Yong Wu @ 2011-09-13  8:54 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: kwolf, aliguro, Stefan Hajnoczi, Zhi Yong Wu, qemu-devel, ryanh

On Tue, Sep 13, 2011 at 4:49 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> On Tue, Sep 13, 2011 at 9:31 AM, Zhi Yong Wu <zwu.kernel@gmail.com> wrote:
>> On Tue, Sep 13, 2011 at 3:15 PM, Stefan Hajnoczi
>> <stefanha@linux.vnet.ibm.com> wrote:
>>> On Tue, Sep 13, 2011 at 10:38:28AM +0800, Zhi Yong Wu wrote:
>>>> On Fri, Sep 9, 2011 at 6:38 PM, Stefan Hajnoczi
>>>> <stefanha@linux.vnet.ibm.com> wrote:
>>>> > On Fri, Sep 09, 2011 at 05:44:36PM +0800, Zhi Yong Wu wrote:
>>>> >> Today I did some basic I/O testing and suddenly found that qemu write
>>>> >> and rw speed is very low now; my qemu binary is built on commit
>>>> >> 344eecf6995f4a0ad1d887cec922f6806f91a3f8.
>>>> >>
>>>> >> Does qemu have a regression?
>>>> >>
>>>> >> The testing data is shown below:
>>>> >>
>>>> >> 1.) write
>>>> >>
>>>> >> test: (g=0): rw=write, bs=512-512/512-512, ioengine=libaio, iodepth=1
>>>> >
>>>> > Please post your QEMU command line.  If your -drive is using
>>>> > cache=writethrough, then small writes are slow because they require the
>>>> > physical disk to write and then synchronize its write cache.  Typically
>>>> > cache=none is a good setting to use for local disks.
>>>> >
>>>> > The block size of 512 bytes is too small.  Ext4 uses a 4 KB block size,
>>>> > so I think a 512 byte write from the guest could cause a 4 KB
>>>> > read-modify-write operation on the host filesystem.
>>>> >
>>>> > You can check this by running btrace(8) on the host during the
>>>> > benchmark.  The blktrace output and the summary statistics will show
>>>> > what I/O pattern the host is issuing.
>>>> 8,2    0        1     0.000000000   337  A  WS 425081504 + 8 <- (253,1) 42611360
>>>
>>> 8 blocks = 8 * 512 bytes = 4 KB
>> How do you know each block size is 512 bytes?
>
> The blkparse format specifier for blocks is 'n'.  Here is the code that
> prints it, from blkparse_fmt.c:
>
>         case 'n':
>                 fprintf(ofp, strcat(format, "u"), t_sec(t));
>
> And t_sec() is:
>
>         #define t_sec(t) ((t)->bytes >> 9)

Great, it shifts right by 9 bits, i.e. the unit changes from bytes to
512-byte blocks.  Got it, thanks.

> So it divides the byte count by 512.  Block size == sector size == 512 bytes.
>
> You can get the blktrace source code here:
>
> http://brick.kernel.dk/snaps/
>
> Stefan

-- 
Regards,

Zhi Yong Wu

^ permalink raw reply	[flat|nested] 21+ messages in thread
end of thread, other threads:[~2011-09-15  9:04 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-09-09  9:44 [Qemu-devel] Why qemu write/rw speed is so low? Zhi Yong Wu
2011-09-09 10:38 ` Stefan Hajnoczi
2011-09-09 13:48   ` Zhi Yong Wu
2011-09-09 13:54     ` Stefan Hajnoczi
2011-09-09 14:04       ` Kevin Wolf
2011-09-09 15:27         ` Stefan Hajnoczi
2011-09-11 13:32           ` Christoph Hellwig
2011-09-09 14:09       ` Zhi Yong Wu
2011-09-13  2:38     ` Zhi Yong Wu
2011-09-13  2:52       ` Zhi Yong Wu
2011-09-13  7:14         ` Stefan Hajnoczi
2011-09-13  9:25           ` Zhi Yong Wu
2011-09-13 10:14             ` Stefan Hajnoczi
2011-09-13 10:27               ` Zhi Yong Wu
2011-09-14  2:42                 ` Zhi Yong Wu
2011-09-14 14:17                   ` Stefan Hajnoczi
2011-09-15  9:04                     ` Zhi Yong Wu
2011-09-13  7:15         ` Stefan Hajnoczi
2011-09-13  8:31           ` Zhi Yong Wu
2011-09-13  8:49             ` Stefan Hajnoczi
2011-09-13  8:54               ` Zhi Yong Wu