public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Ram Pai <linuxram@us.ibm.com>
To: Andrew Morton <akpm@osdl.org>
Cc: alexeyk@mysql.com, nickpiggin@yahoo.com.au, peter@mysql.com,
	linux-kernel@vger.kernel.org, axboe@suse.de
Subject: Re: Random file I/O regressions in 2.6 [patch+results]
Date: 17 May 2004 10:30:11 -0700	[thread overview]
Message-ID: <1084815010.13559.3.camel@localhost.localdomain> (raw)
In-Reply-To: <1084480888.22208.26.camel@dyn319386.beaverton.ibm.com>

[-- Attachment #1: Type: text/plain, Size: 651 bytes --]

On Thu, 2004-05-13 at 13:41, Ram Pai wrote:
> On Tue, 2004-05-11 at 14:17, Andrew Morton wrote:
> > Ram Pai <linuxram@us.ibm.com> wrote:
>  
> I am yet to get my machine fully set up to run a DSS benchmark. But
> thought I will update you on the following comment.

Attached the cleaned up patch and the performance results of the patch.

Overall Observation:
        1.Small improvement with iozone with the patch, and overall
                        much better performance than 2.4
        2.Small/neglegible improvement with DSS workload.
        3.Negligible impact with sysbench, but results worser than
                        2.4 kernels

RP


[-- Attachment #2: seeky-readahead-speedups.patch --]
[-- Type: text/plain, Size: 7487 bytes --]


	Results of iozone,sysbench and DSS workload with the 
		seeky-readahead-speedups.patch
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^


Overall Observation: 
	1.Small improvement with iozone with the patch, and overall
			much better performance than 2.4
	2.Small/neglegible improvement with DSS workload.
	3.Negligible impact with sysbench, but results worser than
			2.4 kernels

	The cleaned-up patch is included towards the end of this report.

Details:

**********************************************************************
			IOZONE 

	run on a nfs mounted filesystem:
	client machine 2proc, 733MHz, 2GB memory
	server machine 8proc, 700Mhz, 8GB memory

./iozone -c -t1 -s 4096m -r 128k


---------------------------------------------------------
|		| throughput |	throughput | throughput |
|		| KB/sec     |	KB/sec     | KB/sec     |
|		| 266	     |	266+patch  | 2.4.20     |
---------------------------------------------------------
|sequential read| 11697.55   |	11700.98   | 10846.87   |
| 		|	     |             |            |
|re-read	| 11698.39   |	11691.84   | 10865.39   |
|		|	     |             |            |
|reverse read	| 20002.71   |	20099.86   | 10340.34   |
|               |            |             |            |
|stride read	| 13813.01   |	13850.28   | 10193.87   |
|		|	     |             |            |
|random read	| 19705.06   |	19978.00   | 10839.57   |
|               |            |             |            |
|random mix	| 28465.68   |	29964.38   | 10779.17   |
|		|	     |             |            |
|pread		| 11692.95   |	11697.29   | 10863.56   |
---------------------------------------------------------


**************************************************************

			SYSBENCH

	run on machine 2proc, 733MHz, 256MB memory


---------------------------------------------------------
|		| 266	     |	266+patch  | 2.4.21     |
---------------------------------------------------------
|time spent     | 79.6253    |	79.8176    | 73.2605sec |
| 		|	     |             |            |
|Mb/sec		| 1.959Mb.sec|	1.954Mb/sec| 2.129Mb/sec|
|		|	     |             |            |
|requests/sec 	| 125.59     |	125.29     | 136.54	|
|               |            |             |            |
|no of Reads 	| 6001       |	6001	   | 6008	|
|		|	     |             |            |
|no of Writes 	| 3999	     |	3999       | 3995	|
|               |            |             |            |
---------------------------------------------------------

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
266 sysbench output:

Operations performed:  6001 Read, 3999 Write, 12800 Other = 22800 Total
Read 93Mb  Written 62Mb  Total Transferred 156Mb
   1.959Mb/sec  Transferred
  125.59 Requests/sec executed

Test execution Statistics summary:
Time spent for test:  79.6253s

Per Request statistics:
Min:   0.0000s  Avg:   0.0467s  Max:   0.9802s    Events tracked: 10000
Total time taken by event execution: 467.1493s
Threads fairness: 87.41/94.20  distribution,  88.68/94.45 execution
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
266+patch sysbench output:

Operations performed:  6001 Read, 3999 Write, 12800 Other = 22800 Total
Read 93Mb  Written 62Mb  Total Transferred 156Mb
   1.954Mb/sec  Transferred
  125.29 Requests/sec executed

Test execution Statistics summary:
Time spent for test:  79.8176s

Per Request statistics:
Min:   0.0000s  Avg:   0.0482s  Max:   0.8481s    Events tracked: 10000
Total time taken by event execution: 481.7572s
Threads fairness: 85.27/93.25  distribution,  85.15/94.91 execution

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2.4.21 sysbench output:

Operations performed:  6008 Read, 3995 Write, 12800 Other = 22803 Total
Read 93Mb  Written 62Mb  Total Transferred 156Mb
   2.129Mb/sec  Transferred
  136.54 Requests/sec executed

Test execution Statistics summary:
Time spent for test:  73.2605s

Per Request statistics:
Min:   0.0000s  Avg:   0.0380s  Max:   0.3712s    Events tracked: 10003
Total time taken by event execution: 380.4081s
Threads fairness: 79.04/91.95  distribution,  82.52/92.44 execution
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~




**************************************************************

DSS WORKLOAD

	Got 1% improvement with the patch

**************************************************************




diff -urNp linux-2.6.6/mm/readahead.c linux-2.6.6.new/mm/readahead.c
--- linux-2.6.6/mm/readahead.c	2004-05-11 20:41:28.000000000 -0700
+++ linux-2.6.6.new/mm/readahead.c	2004-05-17 17:33:51.145040472 -0700
@@ -353,7 +353,7 @@ page_cache_readahead(struct address_spac
 	unsigned orig_next_size;
 	unsigned actual;
 	int first_access=0;
-	unsigned long preoffset=0;
+	unsigned long average;
 
 	/*
 	 * Here we detect the case where the application is performing
@@ -394,10 +394,17 @@ page_cache_readahead(struct address_spac
 		if (ra->serial_cnt <= (max * 2))
 			ra->serial_cnt++;
 	} else {
-		ra->average = (ra->average + ra->serial_cnt) / 2;
+		/* 
+		 * to avoid rounding errors, ensure that 'average' 
+		 * tends towards the value of ra->serial_cnt.
+		 */
+		average = ra->average;
+		if (average < ra->serial_cnt) {
+			average++;
+		}
+		ra->average = (average + ra->serial_cnt) / 2;
 		ra->serial_cnt = 1;
 	}
-	preoffset = ra->prev_page;
 	ra->prev_page = offset;
 
 	if (offset >= ra->start && offset <= (ra->start + ra->size)) {
@@ -457,18 +464,13 @@ do_io:
 		 * ahead window and get some I/O underway for the new
 		 * current window.
 		 */
-		if (!first_access && preoffset >= ra->start &&
-				preoffset < (ra->start + ra->size)) {
-			 /* Heuristic:  If 'n' pages were
-			  * accessed in the current window, there
-			  * is a high probability that around 'n' pages
-			  * shall be used in the next current window.
-			  *
-			  * To minimize lazy-readahead triggered
-			  * in the next current window, read in
-			  * an extra page.
+		if (!first_access) {
+			 /* Heuristic: there is a high probability 
+			  * that around  ra->average number of
+			  * pages shall be accessed in the next
+			  * current window.
 			  */
-			ra->next_size = preoffset - ra->start + 2;
+			ra->next_size = min(ra->average , (unsigned long)max);
 		}
 		ra->start = offset;
 		ra->size = ra->next_size;
@@ -492,21 +494,19 @@ do_io:
 		 */
 		if (ra->ahead_start == 0) {
 			/*
-			 * if the average io-size is less than maximum
+			 * If the average io-size is more than maximum
 			 * readahead size of the file the io pattern is
 			 * sequential. Hence  bring in the readahead window
-			 * immediately.
-			 * Else the i/o pattern is random. Bring
-			 * in the readahead window only if the last page of
-			 * the current window is accessed (lazy readahead).
+			 * immediately. 
+			 * If the average io-size is less than maximum
+			 * readahead size of the file the io pattern is
+			 * random. Hence don't bother to readahead.
 			 */
-			unsigned long average = ra->average;
-
+			average = ra->average;
 			if (ra->serial_cnt > average)
-				average = (ra->serial_cnt + ra->average) / 2;
+				average = (ra->serial_cnt + ra->average + 1) / 2;
 
-			if ((average >= max) || (offset == (ra->start +
-							ra->size - 1))) {
+			if (average > max) {
 				ra->ahead_start = ra->start + ra->size;
 				ra->ahead_size = ra->next_size;
 				actual = do_page_cache_readahead(mapping, filp,

  reply	other threads:[~2004-05-17 17:33 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-05-02 19:57 Random file I/O regressions in 2.6 Alexey Kopytov
2004-05-03 11:14 ` Nick Piggin
2004-05-03 18:08   ` Andrew Morton
2004-05-03 20:22     ` Ram Pai
2004-05-03 20:57       ` Andrew Morton
2004-05-03 21:37         ` Peter Zaitsev
2004-05-03 21:50           ` Ram Pai
2004-05-03 22:01             ` Peter Zaitsev
2004-05-03 21:59           ` Andrew Morton
2004-05-03 22:07             ` Ram Pai
2004-05-03 23:58             ` Nick Piggin
2004-05-04  0:10               ` Andrew Morton
2004-05-04  0:19                 ` Nick Piggin
2004-05-04  0:50                   ` Ram Pai
2004-05-04  6:29                     ` Andrew Morton
2004-05-04 15:03                       ` Ram Pai
2004-05-04 19:39                         ` Ram Pai
2004-05-04 19:48                           ` Andrew Morton
2004-05-04 19:58                             ` Ram Pai
2004-05-04 21:51                               ` Ram Pai
2004-05-04 22:29                                 ` Ram Pai
2004-05-04 23:01                           ` Alexey Kopytov
2004-05-04 23:20                             ` Andrew Morton
2004-05-05 22:04                               ` Alexey Kopytov
2004-05-06  8:43                                 ` Andrew Morton
2004-05-06 18:13                                   ` Peter Zaitsev
2004-05-06 21:49                                     ` Andrew Morton
2004-05-06 23:49                                       ` Nick Piggin
2004-05-07  1:29                                         ` Peter Zaitsev
2004-05-10 19:50                                   ` Ram Pai
2004-05-10 20:21                                     ` Andrew Morton
2004-05-10 22:39                                       ` Ram Pai
2004-05-10 23:07                                         ` Andrew Morton
2004-05-11 20:51                                           ` Ram Pai
2004-05-11 21:17                                             ` Andrew Morton
2004-05-13 20:41                                               ` Ram Pai
2004-05-17 17:30                                                 ` Ram Pai [this message]
2004-05-20  1:06                                                   ` Random file I/O regressions in 2.6 [patch+results] Alexey Kopytov
2004-05-20  1:31                                                     ` Ram Pai
2004-05-21 19:32                                                       ` Alexey Kopytov
2004-05-20  5:49                                                     ` Andrew Morton
2004-05-20 21:59                                                     ` Andrew Morton
2004-05-20 22:23                                                       ` Andrew Morton
2004-05-21  7:31                                                         ` Nick Piggin
2004-05-21  7:50                                                           ` Jens Axboe
2004-05-21  8:40                                                             ` Nick Piggin
2004-05-21  8:56                                                             ` Spam: " Andrew Morton
2004-05-21 22:24                                                               ` Alexey Kopytov
2004-05-21 21:13                                                       ` Alexey Kopytov
2004-05-26  4:43                                                         ` Alexey Kopytov
2004-05-11 22:26                                           ` Random file I/O regressions in 2.6 Bill Davidsen
2004-05-04  1:15                   ` Andrew Morton
2004-05-04 11:39                     ` Nick Piggin
2004-05-04  8:27                 ` Arjan van de Ven
2004-05-04  8:47                   ` Andrew Morton
2004-05-04  8:50                     ` Arjan van de Ven

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1084815010.13559.3.camel@localhost.localdomain \
    --to=linuxram@us.ibm.com \
    --cc=akpm@osdl.org \
    --cc=alexeyk@mysql.com \
    --cc=axboe@suse.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nickpiggin@yahoo.com.au \
    --cc=peter@mysql.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox