All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrew Morton <akpm@osdl.org>
To: Ram Pai <linuxram@us.ibm.com>
Cc: alexeyk@mysql.com, nickpiggin@yahoo.com.au, peter@mysql.com,
	linux-kernel@vger.kernel.org, axboe@suse.de
Subject: Re: Random file I/O regressions in 2.6
Date: Mon, 10 May 2004 13:21:51 -0700	[thread overview]
Message-ID: <20040510132151.238b8d0c.akpm@osdl.org> (raw)
In-Reply-To: <1084218659.6140.459.camel@localhost.localdomain>

Ram Pai <linuxram@us.ibm.com> wrote:
>
> > Ram, can you take a look at fixing that up please?  Something clean, not
> > more hacks ;) I'd also be interested in an explanation of what the extra
> > page is for.  The little comment in there doesn't really help.
> 
> 
> The reason for the extra page read is as follows:
> 
> Consider 16k random reads i/os. Reads are generated 4pages at a time.
> 
> the readahead is triggered when the 4th page in the 'current-window' is
> touched.

Right.  We've added two whole unsigned longs to the file_struct to track
the access patterns.  That should be sufficient for us to detect when the
access pattern is random, and to then not perform readahead due to a
current-window miss *at all*.

So that extra page can go away, and:

--- 25/mm/readahead.c~a	Mon May 10 13:16:59 2004
+++ 25-akpm/mm/readahead.c	Mon May 10 13:17:22 2004
@@ -492,21 +492,17 @@ do_io:
 		 */
 		if (ra->ahead_start == 0) {
 			/*
-			 * if the average io-size is less than maximum
+			 * If the average io-size is less than maximum
 			 * readahead size of the file the io pattern is
 			 * sequential. Hence  bring in the readahead window
 			 * immediately.
-			 * Else the i/o pattern is random. Bring
-			 * in the readahead window only if the last page of
-			 * the current window is accessed (lazy readahead).
 			 */
 			unsigned long average = ra->average;
 
 			if (ra->serial_cnt > average)
 				average = (ra->serial_cnt + ra->average) / 2;
 
-			if ((average >= max) || (offset == (ra->start +
-							ra->size - 1))) {
+			if (average >= max) {
 				ra->ahead_start = ra->start + ra->size;
 				ra->ahead_size = ra->next_size;
 				actual = do_page_cache_readahead(mapping, filp,

_


That way, we read the correct amount of data, and we only start I/O when we
know the application is going to actually use the data.

This may cause problems when the application transitions from seeky-access
to linear-access.

Does it sound feasible?

>
> Probably we may see marginal degradation of this optimization with 16k
> i/o but the amount of wastage avoided by this optimization (hack) 
> is great when random i/o is of larger size. I think it was 4% better
> performance on DSS workload with 64k random reads.

64k sounds unusually large.  We need top performance at 8k too.

> Do you still think its a hack?

yup ;)

> Also I think  with sysbench workload and Andrew's ra-copy patch, we
> might be loosing some benefits of some of the optimization because 
> if two threads simulteously work with copies of the same ra structure
> and update it, the optimization effect reflected in one of the
> ra-structure is lost depending on which ra structure gets copied back
> last.

hm, maybe.  That only makes a difference if two threads are accessing the
same fd at the same time, and it was really bad before the patch.  The IO
patterns seemed OK to me with the patch.  Except it's reading one page too
many.


  reply	other threads:[~2004-05-10 20:21 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-05-02 19:57 Random file I/O regressions in 2.6 Alexey Kopytov
2004-05-03 11:14 ` Nick Piggin
2004-05-03 18:08   ` Andrew Morton
2004-05-03 20:22     ` Ram Pai
2004-05-03 20:57       ` Andrew Morton
2004-05-03 21:37         ` Peter Zaitsev
2004-05-03 21:50           ` Ram Pai
2004-05-03 22:01             ` Peter Zaitsev
2004-05-03 21:59           ` Andrew Morton
2004-05-03 22:07             ` Ram Pai
2004-05-03 23:58             ` Nick Piggin
2004-05-04  0:10               ` Andrew Morton
2004-05-04  0:19                 ` Nick Piggin
2004-05-04  0:50                   ` Ram Pai
2004-05-04  6:29                     ` Andrew Morton
2004-05-04 15:03                       ` Ram Pai
2004-05-04 19:39                         ` Ram Pai
2004-05-04 19:48                           ` Andrew Morton
2004-05-04 19:58                             ` Ram Pai
2004-05-04 21:51                               ` Ram Pai
2004-05-04 22:29                                 ` Ram Pai
2004-05-04 23:01                           ` Alexey Kopytov
2004-05-04 23:20                             ` Andrew Morton
2004-05-05 22:04                               ` Alexey Kopytov
2004-05-06  8:43                                 ` Andrew Morton
2004-05-06 18:13                                   ` Peter Zaitsev
2004-05-06 21:49                                     ` Andrew Morton
2004-05-06 23:49                                       ` Nick Piggin
2004-05-07  1:29                                         ` Peter Zaitsev
2004-05-10 19:50                                   ` Ram Pai
2004-05-10 20:21                                     ` Andrew Morton [this message]
2004-05-10 22:39                                       ` Ram Pai
2004-05-10 23:07                                         ` Andrew Morton
2004-05-11 20:51                                           ` Ram Pai
2004-05-11 21:17                                             ` Andrew Morton
2004-05-13 20:41                                               ` Ram Pai
2004-05-17 17:30                                                 ` Random file I/O regressions in 2.6 [patch+results] Ram Pai
2004-05-20  1:06                                                   ` Alexey Kopytov
2004-05-20  1:31                                                     ` Ram Pai
2004-05-21 19:32                                                       ` Alexey Kopytov
2004-05-20  5:49                                                     ` Andrew Morton
2004-05-20 21:59                                                     ` Andrew Morton
2004-05-20 22:23                                                       ` Andrew Morton
2004-05-21  7:31                                                         ` Nick Piggin
2004-05-21  7:50                                                           ` Jens Axboe
2004-05-21  8:40                                                             ` Nick Piggin
2004-05-21  8:56                                                             ` Spam: " Andrew Morton
2004-05-21 22:24                                                               ` Alexey Kopytov
2004-05-21 21:13                                                       ` Alexey Kopytov
2004-05-26  4:43                                                         ` Alexey Kopytov
2004-05-11 22:26                                           ` Random file I/O regressions in 2.6 Bill Davidsen
2004-05-04  1:15                   ` Andrew Morton
2004-05-04 11:39                     ` Nick Piggin
2004-05-04  8:27                 ` Arjan van de Ven
2004-05-04  8:47                   ` Andrew Morton
2004-05-04  8:50                     ` Arjan van de Ven

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20040510132151.238b8d0c.akpm@osdl.org \
    --to=akpm@osdl.org \
    --cc=alexeyk@mysql.com \
    --cc=axboe@suse.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxram@us.ibm.com \
    --cc=nickpiggin@yahoo.com.au \
    --cc=peter@mysql.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.