Flexible I/O Tester development
 help / color / mirror / Atom feed
* Every 13 hours sees a spike in fio CPU consumption, and a drop in   IOPS/throughput
@ 2009-10-30 15:20 Chris Worley
  2009-10-30 16:49 ` Chris Worley
  0 siblings, 1 reply; 3+ messages in thread
From: Chris Worley @ 2009-10-30 15:20 UTC (permalink / raw)
  To: fio

I'm running fio benchmarks for 120 hours:

fio --rw=randrw --bs=4k --rwmixread=60 --numjobs=64 --iodepth=64
--sync=0 --direct=1 --randrepeat=0 --ioengine=libaio
--filename=/dev/sde --filename=/dev/sdf --name=test --loops=10000
--size=322122547200 --runtime=432000 --group_reporting

The fio threads are generally reporting 100K IOPS and each of the 64
fio threads uses less than 10% CPU.

But, every 13 hours (nearly to the minute), for ~200 seconds, the FIO
threads start consuming large amounts of CPU, and the IOPS drop to
~40K:

top - 07:36:59 up 8 days,  8:12,  0 users,  load average: 57.42,
58.54, 58.00Tasks: 352 total,  17 running, 335 sleeping,   0 stopped,
 0 zombie
Cpu(s):  1.2%us, 14.8%sy,  0.0%ni, 72.4%id, 10.1%wa,  1.3%hi,  0.2%si,
 0.0%stMem:  49449752k total,  6643996k used, 42805756k free,
205832k buffers
Swap:        0k total,        0k used,        0k free,  1432664k cached
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+
COMMAND22108 root      18   0 86144  10m 9948 R 80.6  0.0 137:55.21
fio
22074 root      18   0 86144  10m 9952 R 70.8  0.0 138:07.68 fio
22101 root      18   0 86144  10m 9952 R 61.0  0.0 138:18.43 fio
22095 root      16   0 86144  10m 9948 D 51.1  0.0 137:54.02 fio
22134 root      16   0 86144  10m 9948 R 51.1  0.0 138:03.73 fio
22068 root      16   0 86144  10m 9952 R 49.2  0.0 138:04.16 fio
22083 root      16   0 86144  10m 9952 D 45.2  0.0 138:09.32 fio
22071 root      16   0 86144  10m 9948 D 29.5  0.0 137:37.75 fio
22128 root      16   0 86144  10m 9952 D 25.6  0.0 137:48.11 fio
22092 root      16   0 86144  10m 9948 D 11.8  0.0 137:54.14 fio
22097 root      16   0 86144  10m 9948 D  9.8  0.0 138:07.81 fio
...

There is nothing in the system to account for this (not occurring in
conjunction with i.e. logrotate or cron).

I've repeated this with 1.21 and 1.31, on RHEL 5.3 and RHEL 5.4.  The
1.31/RHEL5.4 case, the period was ~15 hours... this was also a slower
machine.

Any idea what's going on?

Thanks,

Chris

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Every 13 hours sees a spike in fio CPU consumption, and a drop in   IOPS/throughput
  2009-10-30 15:20 Every 13 hours sees a spike in fio CPU consumption, and a drop in IOPS/throughput Chris Worley
@ 2009-10-30 16:49 ` Chris Worley
  2009-10-31  6:58   ` Jens Axboe
  0 siblings, 1 reply; 3+ messages in thread
From: Chris Worley @ 2009-10-30 16:49 UTC (permalink / raw)
  To: fio

On Fri, Oct 30, 2009 at 9:20 AM, Chris Worley <worleys@gmail.com> wrote:
> I'm running fio benchmarks for 120 hours:
>
> fio --rw=randrw --bs=4k --rwmixread=60 --numjobs=64 --iodepth=64
> --sync=0 --direct=1 --randrepeat=0 --ioengine=libaio
> --filename=/dev/sde --filename=/dev/sdf --name=test --loops=10000
> --size=322122547200 --runtime=432000 --group_reporting
>
> The fio threads are generally reporting 100K IOPS and each of the 64
> fio threads uses less than 10% CPU.
>
> But, every 13 hours (nearly to the minute), for ~200 seconds, the FIO
> threads start consuming large amounts of CPU, and the IOPS drop to
> ~40K:

Josh is brilliant:

(322122547200[bytes per test]*64[threads])/(4096[block
size]*100000[IOPS]*3600[seconds per hour])=13 hours

...it's restarting the next loop and recalculating the random maps
every 13 hours

Chris
>
> top - 07:36:59 up 8 days, �8:12, �0 users, �load average: 57.42,
> 58.54, 58.00Tasks: 352 total, �17 running, 335 sleeping, � 0 stopped,
> �0 zombie
> Cpu(s): �1.2%us, 14.8%sy, �0.0%ni, 72.4%id, 10.1%wa, �1.3%hi, �0.2%si,
> �0.0%stMem: �49449752k total, �6643996k used, 42805756k free,
> 205832k buffers
> Swap: � � � �0k total, � � � �0k used, � � � �0k free, �1432664k cached
> �PID USER � � �PR �NI �VIRT �RES �SHR S %CPU %MEM � �TIME+
> COMMAND22108 root � � �18 � 0 86144 �10m 9948 R 80.6 �0.0 137:55.21
> fio
> 22074 root � � �18 � 0 86144 �10m 9952 R 70.8 �0.0 138:07.68 fio
> 22101 root � � �18 � 0 86144 �10m 9952 R 61.0 �0.0 138:18.43 fio
> 22095 root � � �16 � 0 86144 �10m 9948 D 51.1 �0.0 137:54.02 fio
> 22134 root � � �16 � 0 86144 �10m 9948 R 51.1 �0.0 138:03.73 fio
> 22068 root � � �16 � 0 86144 �10m 9952 R 49.2 �0.0 138:04.16 fio
> 22083 root � � �16 � 0 86144 �10m 9952 D 45.2 �0.0 138:09.32 fio
> 22071 root � � �16 � 0 86144 �10m 9948 D 29.5 �0.0 137:37.75 fio
> 22128 root � � �16 � 0 86144 �10m 9952 D 25.6 �0.0 137:48.11 fio
> 22092 root � � �16 � 0 86144 �10m 9948 D 11.8 �0.0 137:54.14 fio
> 22097 root � � �16 � 0 86144 �10m 9948 D �9.8 �0.0 138:07.81 fio
> ...
>
> There is nothing in the system to account for this (not occurring in
> conjunction with i.e. logrotate or cron).
>
> I've repeated this with 1.21 and 1.31, on RHEL 5.3 and RHEL 5.4. �The
> 1.31/RHEL5.4 case, the period was ~15 hours... this was also a slower
> machine.
>
> Any idea what's going on?
>
> Thanks,
>
> Chris
>

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Every 13 hours sees a spike in fio CPU consumption, and a drop  in   IOPS/throughput
  2009-10-30 16:49 ` Chris Worley
@ 2009-10-31  6:58   ` Jens Axboe
  0 siblings, 0 replies; 3+ messages in thread
From: Jens Axboe @ 2009-10-31  6:58 UTC (permalink / raw)
  To: Chris Worley; +Cc: fio

On Fri, Oct 30 2009, Chris Worley wrote:
> On Fri, Oct 30, 2009 at 9:20 AM, Chris Worley <worleys@gmail.com> wrote:
> > I'm running fio benchmarks for 120 hours:
> >
> > fio --rw=randrw --bs=4k --rwmixread=60 --numjobs=64 --iodepth=64
> > --sync=0 --direct=1 --randrepeat=0 --ioengine=libaio
> > --filename=/dev/sde --filename=/dev/sdf --name=test --loops=10000
> > --size=322122547200 --runtime=432000 --group_reporting
> >
> > The fio threads are generally reporting 100K IOPS and each of the 64
> > fio threads uses less than 10% CPU.
> >
> > But, every 13 hours (nearly to the minute), for ~200 seconds, the FIO
> > threads start consuming large amounts of CPU, and the IOPS drop to
> > ~40K:
> 
> Josh is brilliant:
> 
> (322122547200[bytes per test]*64[threads])/(4096[block
> size]*100000[IOPS]*3600[seconds per hour])=13 hours
> 
> ...it's restarting the next loop and recalculating the random maps
> every 13 hours

Awesome! It should just give it or re-clean the random map, that is
likely where the bug is. Looking forward to a tested patch :-)

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2009-10-31  7:06 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-10-30 15:20 Every 13 hours sees a spike in fio CPU consumption, and a drop in IOPS/throughput Chris Worley
2009-10-30 16:49 ` Chris Worley
2009-10-31  6:58   ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox