Bug using ramp_time shows cpu over 100%

Flexible I/O Tester development
 help / color / mirror / Atom feed

* Bug using ramp_time shows cpu over 100%
@ 2015-11-25 19:36 Deyoung Hong (dhong)
  2015-11-30 21:45 ` Jens Axboe
  0 siblings, 1 reply; 6+ messages in thread
From: Deyoung Hong (dhong) @ 2015-11-25 19:36 UTC (permalink / raw)
  To: fio@vger.kernel.org

I think this is an fio bug.

Description:   When specifying ramp_time, cpu may show over 100% (when using a polling mode driver).
If ramp_time is not specified, cpu values appear to be correct.  
You normally don't see this problem because most drivers don't utilize high CPU usage. 
You will have to use some custom driver that does polling, for example (something like http://www.dpdk.org/).

OS:  Any (but tested on Linux CentOS 6.6)

---------
Job file:

[global]
thread=1
group_reporting=1
direct=1
ioengine=libaio
time_based=1
ramp_time=10
runtime=10
bs=4096
filename=/dev/polling

[test]
rw=read
numjobs=1
iodepth=1

-------------
Fio output:

test: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=1
fio-2.2.12
Starting 1 thread

test: (groupid=0, jobs=1): err= 0: pid=27094: Wed Nov 25 11:23:07 2015
  read : io=486652KB, bw=48660KB/s, iops=12165, runt= 10001msec
    slat (usec): min=0, max=16, avg= 2.15, stdev= 0.60
    clat (usec): min=34, max=2323, avg=79.77, stdev=20.54
     lat (usec): min=36, max=2325, avg=81.79, stdev=20.54
    clat percentiles (usec):
     |  1.00th=[   58],  5.00th=[   66], 10.00th=[   69], 20.00th=[   73],
     | 30.00th=[   74], 40.00th=[   74], 50.00th=[   76], 60.00th=[   85],
     | 70.00th=[   88], 80.00th=[   89], 90.00th=[   90], 95.00th=[   91],
     | 99.00th=[   92], 99.50th=[   93], 99.90th=[   94], 99.95th=[  121],
     | 99.99th=[ 1048]
    bw (KB  /s): min=    0, max=48904, per=95.03%, avg=46242.40, stdev=10885.36
    lat (usec) : 50=0.92%, 100=99.01%, 250=0.05%, 500=0.01%, 750=0.01%
    lat (usec) : 1000=0.01%
    lat (msec) : 2=0.01%, 4=0.01%
  cpu          : usr=164.11%, sys=4.55%, ctx=27, majf=0, minf=3013
  IO depths    : 1=163.5%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=121663/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: io=486652KB, aggrb=48660KB/s, minb=48660KB/s, maxb=48660KB/s, mint=1000
1msec, maxt=10001msec


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Bug using ramp_time shows cpu over 100%
  2015-11-25 19:36 Deyoung Hong (dhong)
@ 2015-11-30 21:45 ` Jens Axboe
  0 siblings, 0 replies; 6+ messages in thread
From: Jens Axboe @ 2015-11-30 21:45 UTC (permalink / raw)
  To: Deyoung Hong (dhong), fio@vger.kernel.org

On 11/25/2015 12:36 PM, Deyoung Hong (dhong) wrote:
> I think this is an fio bug.
>
> Description:   When specifying ramp_time, cpu may show over 100% (when using a polling mode driver).
> If ramp_time is not specified, cpu values appear to be correct.
> You normally don't see this problem because most drivers don't utilize high CPU usage.
> You will have to use some custom driver that does polling, for example (something like http://www.dpdk.org/).
>
> OS:  Any (but tested on Linux CentOS 6.6)

This should fix it:

http://git.kernel.dk/cgit/fio/commit/?id=210dd0fc66b8956065e098bb9314febe8993e849

Try current -git and hopefully it'll behave better.

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: Bug using ramp_time shows cpu over 100%
@ 2015-12-08 21:45 Deyoung Hong (dhong)
  2015-12-09  2:50 ` Jens Axboe
  0 siblings, 1 reply; 6+ messages in thread
From: Deyoung Hong (dhong) @ 2015-12-08 21:45 UTC (permalink / raw)
  To: fio@vger.kernel.org

I saw the response on http://www.spinics.net/lists/fio/msg04392.html (didn't get the email), so I'm pasting it here.

I git pulled the latest fio code and observed the specified changes.  I compiled and ran the latest code but still seeing the same problem:

===============
test: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=1
fio-2.2.12-50-g5261
Starting 1 thread

test: (groupid=0, jobs=1): err= 0: pid=10726: Tue Dec  8 13:40:21 2015
  read : io=799740KB, bw=79966KB/s, iops=19991, runt= 10001msec
    slat (usec): min=0, max=11, avg= 0.10, stdev= 0.30
    clat (usec): min=26, max=2624, avg=49.03, stdev=34.07
     lat (usec): min=35, max=2624, avg=49.12, stdev=34.07
    clat percentiles (usec):
     |  1.00th=[   36],  5.00th=[   36], 10.00th=[   36], 20.00th=[   37],
     | 30.00th=[   37], 40.00th=[   38], 50.00th=[   38], 60.00th=[   39],
     | 70.00th=[   40], 80.00th=[   70], 90.00th=[   79], 95.00th=[   87],
     | 99.00th=[   92], 99.50th=[   94], 99.90th=[  225], 99.95th=[  298],
     | 99.99th=[ 2128]
    bw (KB  /s): min=    0, max=81072, per=95.01%, avg=75974.80, stdev=17896.56
    lat (usec) : 50=71.32%, 100=28.42%, 250=0.19%, 500=0.04%, 750=0.01%
    lat (usec) : 1000=0.01%
    lat (msec) : 2=0.01%, 4=0.01%
  cpu          : usr=99.99%, sys=0.00%, ctx=8, majf=0, minf=0
  IO depths    : 1=157.5%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=199935/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: io=799740KB, aggrb=79966KB/s, minb=79966KB/s, maxb=79966KB/s, mint=10001msec, maxt=10001msec


===============
Jens Axboe wrote:

On 11/25/2015 12:36 PM, Deyoung Hong (dhong) wrote:
> I think this is an fio bug.
> 
> Description:   When specifying ramp_time, cpu may show over 100% (when using a polling mode driver).
> If ramp_time is not specified, cpu values appear to be correct.
> You normally don't see this problem because most drivers don't utilize high CPU usage.
> You will have to use some custom driver that does polling, for example (something like http://www.dpdk.org/).
> 
> OS:  Any (but tested on Linux CentOS 6.6)

This should fix it:

http://git.kernel.dk/cgit/fio/commit/?id=210dd0fc66b8956065e098bb9314febe8993e849

Try current -git and hopefully it'll behave better.

--
Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe fio" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Bug using ramp_time shows cpu over 100%
  2015-12-08 21:45 Bug using ramp_time shows cpu over 100% Deyoung Hong (dhong)
@ 2015-12-09  2:50 ` Jens Axboe
  2015-12-09  9:56   ` Deyoung Hong (dhong)
  0 siblings, 1 reply; 6+ messages in thread
From: Jens Axboe @ 2015-12-09  2:50 UTC (permalink / raw)
  To: Deyoung Hong (dhong), fio@vger.kernel.org

On 12/08/2015 02:45 PM, Deyoung Hong (dhong) wrote:
> I saw the response on http://www.spinics.net/lists/fio/msg04392.html (didn't get the email), so I'm pasting it here.
>
> I git pulled the latest fio code and observed the specified changes.  I compiled and ran the latest code but still seeing the same problem:
>
> ===============
> test: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=1
> fio-2.2.12-50-g5261
> Starting 1 thread
>
> test: (groupid=0, jobs=1): err= 0: pid=10726: Tue Dec  8 13:40:21 2015
>    read : io=799740KB, bw=79966KB/s, iops=19991, runt= 10001msec
>      slat (usec): min=0, max=11, avg= 0.10, stdev= 0.30
>      clat (usec): min=26, max=2624, avg=49.03, stdev=34.07
>       lat (usec): min=35, max=2624, avg=49.12, stdev=34.07
>      clat percentiles (usec):
>       |  1.00th=[   36],  5.00th=[   36], 10.00th=[   36], 20.00th=[   37],
>       | 30.00th=[   37], 40.00th=[   38], 50.00th=[   38], 60.00th=[   39],
>       | 70.00th=[   40], 80.00th=[   70], 90.00th=[   79], 95.00th=[   87],
>       | 99.00th=[   92], 99.50th=[   94], 99.90th=[  225], 99.95th=[  298],
>       | 99.99th=[ 2128]
>      bw (KB  /s): min=    0, max=81072, per=95.01%, avg=75974.80, stdev=17896.56
>      lat (usec) : 50=71.32%, 100=28.42%, 250=0.19%, 500=0.04%, 750=0.01%
>      lat (usec) : 1000=0.01%
>      lat (msec) : 2=0.01%, 4=0.01%
>    cpu          : usr=99.99%, sys=0.00%, ctx=8, majf=0, minf=0
>    IO depths    : 1=157.5%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>       submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>       complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>       issued    : total=r=199935/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
>       latency   : target=0, window=0, percentile=100.00%, depth=1
>
> Run status group 0 (all jobs):
>     READ: io=799740KB, aggrb=79966KB/s, minb=79966KB/s, maxb=79966KB/s, mint=10001msec, maxt=10001msec

This is showing 99.99% usr time, for a polled driver that doesn't seem 
crazy. Why do you think it's broken?

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: Bug using ramp_time shows cpu over 100%
  2015-12-09  2:50 ` Jens Axboe
@ 2015-12-09  9:56   ` Deyoung Hong (dhong)
  2015-12-09 15:06     ` Jens Axboe
  0 siblings, 1 reply; 6+ messages in thread
From: Deyoung Hong (dhong) @ 2015-12-09  9:56 UTC (permalink / raw)
  To: Jens Axboe, fio@vger.kernel.org

Sorry, not sure how I made that mistake (didn't even check my email response).
I ran various fio versions (including the latest Download ZIP) and thought I checked and saw cpu way over 100%.
It now looks like my custom engine does not run  with the latest fio code anymore.

Thanks for the fix.

-----Original Message-----
From: fio-owner@vger.kernel.org [mailto:fio-owner@vger.kernel.org] On Behalf Of Jens Axboe
Sent: Tuesday, December 08, 2015 6:50 PM
To: Deyoung Hong (dhong); fio@vger.kernel.org
Subject: Re: Bug using ramp_time shows cpu over 100%

On 12/08/2015 02:45 PM, Deyoung Hong (dhong) wrote:
> I saw the response on http://www.spinics.net/lists/fio/msg04392.html (didn't get the email), so I'm pasting it here.
>
> I git pulled the latest fio code and observed the specified changes.  I compiled and ran the latest code but still seeing the same problem:
>
> ===============
> test: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=1
> fio-2.2.12-50-g5261
> Starting 1 thread
>
> test: (groupid=0, jobs=1): err= 0: pid=10726: Tue Dec  8 13:40:21 2015
>    read : io=799740KB, bw=79966KB/s, iops=19991, runt= 10001msec
>      slat (usec): min=0, max=11, avg= 0.10, stdev= 0.30
>      clat (usec): min=26, max=2624, avg=49.03, stdev=34.07
>       lat (usec): min=35, max=2624, avg=49.12, stdev=34.07
>      clat percentiles (usec):
>       |  1.00th=[   36],  5.00th=[   36], 10.00th=[   36], 20.00th=[   37],
>       | 30.00th=[   37], 40.00th=[   38], 50.00th=[   38], 60.00th=[   39],
>       | 70.00th=[   40], 80.00th=[   70], 90.00th=[   79], 95.00th=[   87],
>       | 99.00th=[   92], 99.50th=[   94], 99.90th=[  225], 99.95th=[  298],
>       | 99.99th=[ 2128]
>      bw (KB  /s): min=    0, max=81072, per=95.01%, avg=75974.80, stdev=17896.56
>      lat (usec) : 50=71.32%, 100=28.42%, 250=0.19%, 500=0.04%, 750=0.01%
>      lat (usec) : 1000=0.01%
>      lat (msec) : 2=0.01%, 4=0.01%
>    cpu          : usr=99.99%, sys=0.00%, ctx=8, majf=0, minf=0
>    IO depths    : 1=157.5%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>       submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>       complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>       issued    : total=r=199935/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
>       latency   : target=0, window=0, percentile=100.00%, depth=1
>
> Run status group 0 (all jobs):
>     READ: io=799740KB, aggrb=79966KB/s, minb=79966KB/s, maxb=79966KB/s, mint=10001msec, maxt=10001msec

This is showing 99.99% usr time, for a polled driver that doesn't seem 
crazy. Why do you think it's broken?

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Bug using ramp_time shows cpu over 100%
  2015-12-09  9:56   ` Deyoung Hong (dhong)
@ 2015-12-09 15:06     ` Jens Axboe
  0 siblings, 0 replies; 6+ messages in thread
From: Jens Axboe @ 2015-12-09 15:06 UTC (permalink / raw)
  To: Deyoung Hong (dhong), fio@vger.kernel.org

On 12/09/2015 02:56 AM, Deyoung Hong (dhong) wrote:
> Sorry, not sure how I made that mistake (didn't even check my email response).
> I ran various fio versions (including the latest Download ZIP) and thought I checked and saw cpu way over 100%.

No worries.

> It now looks like my custom engine does not run  with the latest fio code anymore.

How so? We haven't had any changes in that area, and the other IO 
engines haven't needed an update for a long time. Maybe you just need to 
recompile it, if structure sizes changed.

In general, I'd recommend submitting your IO engine so it can be carried 
in fio proper.

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2015-12-09 15:06 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-12-08 21:45 Bug using ramp_time shows cpu over 100% Deyoung Hong (dhong)
2015-12-09  2:50 ` Jens Axboe
2015-12-09  9:56   ` Deyoung Hong (dhong)
2015-12-09 15:06     ` Jens Axboe
  -- strict thread matches above, loose matches on Subject: below --
2015-11-25 19:36 Deyoung Hong (dhong)
2015-11-30 21:45 ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox