* I/O tests using elvtune to improve interactive performance
  [not found] <138.49c8e42.29247804@aol.com>
@ 2001-11-17  8:06 ` rwhron
  2001-11-19  7:09 ` Jens Axboe
  0 siblings, 1 reply; 4+ messages in thread
From: rwhron @ 2001-11-17  8:06 UTC (permalink / raw)
To: linux-kernel, ltp-list

Kernel: 2.4.15-pre5

Test: Run growfiles tests from the Linux Test Project that really hurt
interactive performance. Simultaneously run "ls -laR /". Change the
elevator read latency value with elvtune. Also run mp3blaster tests.

Summary: Smaller values for the I/O elevator read latency have a
significant positive impact on interactive performance, and throughput
is as good as or better than with the default value of 8192.

The idea for this came from Andrea Arcangeli's excellent doc at
http://tux.u-strasbg.fr/jl3/features-2.3-1.html . That page shows that
dbench throughput can be good with low values for read latency too.

My initial tests were just to run growfiles and issue commands that
were slow to respond in the past: things like "ls -l", "login",
"ps aux", etc. I didn't time these tests, but it was amazing what a
difference using elvtune to set read latency to 128 or 32 made. Each
growfiles test prints the number of iterations for a 120 second
interval, and I was happy to see that the number of iterations went up
while interactive performance was dramatically better.

Of course, running ls -l in big directories isn't exactly scientific,
so I tried to come up with something to measure interactive
performance. For these tests, ls -laR / runs at the same time as some
growfiles tests. I picked ls for a few reasons:

1) It's slow to respond when I/O is high.
2) It's easy to measure and repeat.
3) My disk has 5 partitions with lots of files spread across each
   partition, which requires some seeking on the disk.

ls -laR / is not ideal though; it isn't interactive. Total time for
the 4 growfiles tests is 8 minutes (120 seconds per test).
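For reference, the knob being tuned throughout this thread is the
per-device elevator read latency. A minimal sketch of the commands
involved, assuming the elvtune utility from util-linux and an IDE disk
at /dev/hda (the device path is an assumption; adjust for your system):

```shell
#!/bin/sh
# Sketch of the elevator tuning commands (assumes elvtune from
# util-linux and a disk at /dev/hda -- both assumptions; adjust
# for your own system, and run as root).
DEV=${DEV:-/dev/hda}
if command -v elvtune >/dev/null 2>&1; then
    elvtune "$DEV"          # print the current read/write latency
    elvtune -r 32 "$DEV"    # lower the read latency to 32
    # write latency is left at its default (16384) throughout
else
    echo "elvtune not found; would have run: elvtune -r 32 $DEV"
fi
DONE=1
```

The guard on `command -v` just makes the sketch safe to paste on a box
without util-linux installed.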
The ls command finished before the last growfiles test completed in
each run. I rebooted between each of these tests.

read_latency = 2
----------------
The ls was the slowest here, and none of the growfiles were the fastest.

ls -laR / > /var/tmp/ls-laR2
Elapsed (wall clock) time (h:mm:ss or m:ss): 7:40.52
Percent of CPU this job got: 4%

growfiles -b -e 1 -i 0 -L 120 -u -g 4090 -T 100 -t 408990 -l -C 10 -c 1000 -S 10 -f Lgf02_
13969 iterations to 10 files. Hit time value of 120

growfiles -b -e 1 -i 0 -L 120 -u -g 5000 -T 100 -t 499990 -l -C 10 -c 1000 -S 10 -f Lgf03_
12252 iterations to 10 files. Hit time value of 120

growfiles -b -e 1 -u -r 1-49600 -I r -u -i 0 -L 120 Lgfile1
48352 iterations to 1 files. Hit time value of 120

growfiles -b -e 1 -i 0 -L 120 -w -u -r 10-5000 -I r -T 10 -l -S 2 -f Lgf04_
59807 iterations to 2 files. Hit time value of 120

read_latency = 32
-----------------
This value had 3 of the best results for growfiles. ls was 16% slower
than with the default read latency, but interactive performance was
great.

ls -laR / > /var/tmp/ls-laR32
Elapsed (wall clock) time (h:mm:ss or m:ss): 5:08.23
Percent of CPU this job got: 6%

growfiles -b -e 1 -i 0 -L 120 -u -g 4090 -T 100 -t 408990 -l -C 10 -c 1000 -S 10 -f Lgf02_
14181 iterations to 10 files. Hit time value of 120

growfiles -b -e 1 -i 0 -L 120 -u -g 5000 -T 100 -t 499990 -l -C 10 -c 1000 -S 10 -f Lgf03_
11691 iterations to 10 files. Hit time value of 120

growfiles -b -e 1 -u -r 1-49600 -I r -u -i 0 -L 120 Lgfile1
54768 iterations to 1 files. Hit time value of 120

growfiles -b -e 1 -i 0 -L 120 -w -u -r 10-5000 -I r -T 10 -l -S 2 -f Lgf04_
68342 iterations to 2 files. Hit time value of 120

read_latency = 8192 (default)
-----------------------------

ls -laR / > /var/tmp/ls-laR8192
Elapsed (wall clock) time (h:mm:ss or m:ss): 4:26.13
Percent of CPU this job got: 7%

growfiles -b -e 1 -i 0 -L 120 -u -g 4090 -T 100 -t 408990 -l -C 10 -c 1000 -S 10 -f Lgf02_
11085 iterations to 10 files.
Hit time value of 120

growfiles -b -e 1 -i 0 -L 120 -u -g 5000 -T 100 -t 499990 -l -C 10 -c 1000 -S 10 -f Lgf03_
13797 iterations to 10 files. Hit time value of 120

growfiles -b -e 1 -u -r 1-49600 -I r -u -i 0 -L 120 Lgfile1
53198 iterations to 1 files. Hit time value of 120

growfiles -b -e 1 -i 0 -L 120 -w -u -r 10-5000 -I r -T 10 -l -S 2 -f Lgf04_
63542 iterations to 2 files. Hit time value of 120

mtest01 and mmap001
-------------------
I also ran the mtest01 and mmap001 tests playing mp3blaster with
various elevator settings. These are the same tests I've run before.
Below is just the total time for the test and the percentage of the
run during which the mp3 played. read_latency = 16 was best here: the
test was fastest and had the highest mp3 playtime.

read_latency = 2
mtest01 - mp3 played 280 seconds of 316 second run. (88%)
mmap001 not run because changing elvtune didn't seem to affect this test.

read_latency = 16
mtest01 - mp3 played 280 seconds of 309 second run. (91%)
mmap001 - mp3 played 908 seconds of 908 second run.

read_latency = 64
mtest01 - mp3 played 280 seconds of 309 second run. (80%)
mmap001 - mp3 played 908 seconds of 908 second run.

read_latency = 8192
mtest01 - mp3 played 262 seconds of 314 second run. (83%)
mmap001 - mp3 played 901 seconds of 901 second run.

Hardware
--------
Athlon 1333
512 Mb RAM
(1) 40 Gb IDE hard drive with 5 partitions

It's exciting to see Linux have good interactive performance under
heavy disk load. Have fun!
-- 
Randy Hron
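The playtime percentages are just played seconds divided by total run
seconds. A quick way to recompute one of them from the raw numbers
(play_pct is a throwaway helper named here for illustration, not part
of any test suite):

```shell
# Recompute an mp3 playtime percentage from the raw seconds reported
# above. (play_pct is a hypothetical helper, defined only here.)
play_pct() {
    awk -v p="$1" -v t="$2" 'BEGIN { printf "%.0f\n", p / t * 100 }'
}
play_pct 262 314   # read_latency=8192 mtest01 run -> 83
```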
* Re: I/O tests using elvtune to improve interactive performance
  2001-11-17  8:06 ` I/O tests using elvtune to improve interactive performance rwhron
@ 2001-11-19  7:09 ` Jens Axboe
  2001-11-19 15:26 ` rwhron
  2001-11-20  7:32 ` rwhron
  0 siblings, 2 replies; 4+ messages in thread
From: Jens Axboe @ 2001-11-19  7:09 UTC (permalink / raw)
To: rwhron; +Cc: linux-kernel, ltp-list

On Sat, Nov 17 2001, rwhron@earthlink.net wrote:
> Kernel: 2.4.15-pre5
>
> Test: Run growfiles tests from Linux Test Project that really hurt
> interactive performance. Simultaneously run "ls -laR /".
> Change the elevator read latency value with elvtune.
> Also run mp3blaster tests.

Interesting tests, thanks. I wonder if you could be convinced to do
bonnie++ and dbench tests with the same read_latency values used?
Also, I'm assuming you kept write latency at its default of 16384?

-- 
Jens Axboe
* Re: I/O tests using elvtune to improve interactive performance
  2001-11-19  7:09 ` Jens Axboe
@ 2001-11-19 15:26 ` rwhron
  2001-11-20  7:32 ` rwhron
  1 sibling, 0 replies; 4+ messages in thread
From: rwhron @ 2001-11-19 15:26 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-kernel, ltp-list

On Mon, Nov 19, 2001 at 08:09:22AM +0100, Jens Axboe wrote:
> > Test: Run growfiles tests from Linux Test Project that really hurt
> > interactive performance. Simultaneously run "ls -laR /".
> > Change the elevator read latency value with elvtune.
> > Also run mp3blaster tests.
>
> Interesting tests, thanks. I wonder if you could be convinced to do
> bonnie++ and dbench tests with the same read_latency values used? Also,
> I'm assuming you kept write latency at its default of 16384?

Thanks for the feedback. Write latency was 16384 for all tests. I'm
downloading dbench and bonnie++ now; I'll check them out.

I'm still not sure how to measure/quantify interactive performance.
My ideal test will have these components:

1) Simulate and measure user interactive response time.
2) Disk I/O patterns capable of making interactive performance slow.
3) Measurement of I/O throughput.
4) Note how changes with elvtune affect throughput and response time.
5) It's not too boring (i.e. type something, use a stopwatch).

It's the "measure interactive response time" part that I haven't got a
handle on yet. I'm looking at the SSBA benchmarks for something that
simulates users, but I don't know if it measures response time. I
could resort to a stopwatch to test interactive response, but
hopefully something better will come to mind.
-- 
Randy Hron
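One crude way to approximate component 1 without a stopwatch is a
shell loop that periodically times a metadata-heavy command and logs
how long each sample took; rising numbers while the I/O load runs
indicate degrading interactive response. A sketch under those
assumptions (the probe function, its parameters, and the directory
choice are made up for illustration, not an established benchmark):

```shell
#!/bin/sh
# Crude interactive-response probe: time a small metadata-heavy
# command at one-second intervals while a heavy I/O load runs
# elsewhere. (Entirely a sketch; the function name and parameters
# are invented for illustration.)
probe() {
    dir=$1      # directory to list on each sample
    samples=$2  # number of samples to take
    i=0
    while [ "$i" -lt "$samples" ]; do
        start=$(date +%s)
        ls -l "$dir" >/dev/null 2>&1
        end=$(date +%s)
        echo "sample $i: $((end - start))s"
        i=$((i + 1))
        sleep 1
    done
}
OUT=$(probe /etc 3)
echo "$OUT"
```

Second granularity from `date +%s` is coarse, but under the kind of
load described in this thread the stalls were tens of seconds, so it
would still show the effect.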
* Re: I/O tests using elvtune to improve interactive performance
  2001-11-19  7:09 ` Jens Axboe
  2001-11-19 15:26 ` rwhron
@ 2001-11-20  7:32 ` rwhron
  1 sibling, 0 replies; 4+ messages in thread
From: rwhron @ 2001-11-20  7:32 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-kernel

On Mon, Nov 19, 2001 at 08:09:22AM +0100, Jens Axboe wrote:
> Interesting tests, thanks. I wonder if you could be convinced to do
> bonnie++ and dbench tests with the same read_latency values used?

Jens,
I'm sure this isn't what you had in mind, but ... :)

Kernel: 2.4.15-pre6

Test: dbench 775 on 5 partitions. Time ls -l on big directories.
Test with read_latency at 8192 (default) and at 32.

Summary: With a load average of 775, console IRC clients perform
great. Lower read latency reduces throughput, but big directory
listings are faster.

This is really a crazy test, but it's a testament to the amazing work
of the kernel hackers. I was looking for the I/O load that makes
interactive response poor. There are a couple of growfiles tests in
the Linux Test Project that do that with a load average of less than 5.
dbench is different; the dbench load the kernel can handle is
remarkable.

Hardware:
1 Athlon 1333
1 GB RAM
1 GB swap
1 40 GB IDE disk

A reasonable test may be dbench 36 or 144, which return:

Throughput 90.636 MB/sec (NB=113.295 MB/sec 906.36 MBit/sec) 8 procs
Throughput 56.0331 MB/sec (NB=70.0413 MB/sec 560.331 MBit/sec) 36 procs
Throughput 25.7869 MB/sec (NB=32.2336 MB/sec 257.869 MBit/sec) 144 procs

Instead, I figured out roughly how many simultaneous dbench processes
would run with the amount of free disk space I have.

 8:01pm up 53 min, 12 users, load average: 779.12, 778.68, 737.68

I had 3 console irc sessions up. Occasionally there was a very slight
delay. "ls -l", on the other hand, was very slow on big directories;
timings are below.
Summary: read latency=8192 compared to read latency=32

dbench 50   10% more throughput
dbench 50    6% more throughput
dbench 75   22% more throughput
dbench 150  25% more throughput
dbench 450  24% more throughput
ls -l time  48% longer

ls times are interspersed with the dbench results in chronological
order.

read_latency = 8192
-------------------
/usr/share/man/man3  real 30m48.908s
# /home/dbench$ ./dbench 50 completes
Throughput 1.91472 MB/sec (NB=2.39339 MB/sec 19.1472 MBit/sec) 50 procs
# /usr/local/dbench$ ./dbench 50 completes
Throughput 1.84434 MB/sec (NB=2.30543 MB/sec 18.4434 MBit/sec) 50 procs
# /dbench$ ./dbench 75 completes
Throughput 2.50039 MB/sec (NB=3.12548 MB/sec 25.0039 MBit/sec) 75 procs
/usr/src/linux ls -laR  real 10m11.953s
# /usr/src/sources/d/dbench$ ./dbench 150 completes
Throughput 3.51881 MB/sec (NB=4.39852 MB/sec 35.1881 MBit/sec) 150 procs
/usr/X11R6/lib/X11/fonts/75dpi  real 28m22.315s
/usr/X11R6/lib/X11/fonts/100dpi  real 12m27.915s
# /opt/dbench$ ./dbench 450 completes
Throughput 4.64194 MB/sec (NB=5.80242 MB/sec 46.4194 MBit/sec) 450 procs

read_latency = 32
-----------------
/usr/share/man/man3  real 10m8.684s
# /home/dbench$ ./dbench 50 completes
Throughput 1.74518 MB/sec (NB=2.18147 MB/sec 17.4518 MBit/sec) 50 procs
# /usr/local/dbench$ ./dbench 50 completes
Throughput 1.73985 MB/sec (NB=2.17481 MB/sec 17.3985 MBit/sec) 50 procs
/usr/src/linux ls -laR  real 5m57.340s
# /dbench$ ./dbench 75 completes
Throughput 2.0441 MB/sec (NB=2.55513 MB/sec 20.441 MBit/sec) 75 procs
/usr/X11R6/lib/X11/fonts/75dpi  real 13m32.822s
# /usr/src/sources/d/dbench$ ./dbench 150 completes
Throughput 2.8047 MB/sec (NB=3.50587 MB/sec 28.047 MBit/sec) 150 procs
/usr/X11R6/lib/X11/fonts/100dpi  real 14m14.336s
# /opt/dbench$ ./dbench 450 completes
Throughput 3.74463 MB/sec (NB=4.68079 MB/sec 37.4463 MBit/sec) 450 procs

Filesystems (test not running)
------------------------------
Filesystem   Type      Size  Used  Avail  Use%  Mounted on
/dev/hda12   reiserfs  4.2G  1.2G   3.0G   27%  /
/dev/hda11   reiserfs   15G  3.9G
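As a sanity check, the dbench lines in the summary follow directly
from the raw MB/sec figures: each percentage is
(8192 result - 32 result) / 32 result. A quick recomputation (pct is
a throwaway helper defined here for illustration, not part of dbench):

```shell
# Recompute the summary percentages from the raw dbench MB/sec
# figures quoted above. (pct is a hypothetical helper, not a
# dbench tool.)
pct() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.0f\n", (a - b) / b * 100 }'
}
pct 1.91472 1.74518   # dbench 50, first run  -> 10
pct 1.84434 1.73985   # dbench 50, second run -> 6
pct 2.50039 2.0441    # dbench 75             -> 22
pct 3.51881 2.8047    # dbench 150            -> 25
pct 4.64194 3.74463   # dbench 450            -> 24
```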
 11G   26%  /opt
/dev/hda5    reiserfs   10G  5.6G   4.9G   53%  /usr/src
/dev/hda6    reiserfs  5.2G  3.4G   1.8G   64%  /home
/dev/hda8    reiserfs  2.1G  200M   1.8G   10%  /usr/local

Conclusion: Load average 775! The box is solid. IRC clients perform
great. Total throughput goes down as load goes up.

It may have made more sense to do a shorter test with fewer processes
and more values for read_latency, but it turned out this way.
Hopefully it's entertaining nonetheless. :)
-- 
Randy Hron