public inbox for linux-kernel@vger.kernel.org
From: Ram Pai <linuxram@us.ibm.com>
To: Andrew Morton <akpm@osdl.org>
Cc: alexeyk@mysql.com, nickpiggin@yahoo.com.au, peter@mysql.com,
	linux-kernel@vger.kernel.org, axboe@suse.de
Subject: Re: Random file I/O regressions in 2.6
Date: 10 May 2004 15:39:28 -0700
Message-ID: <1084228767.6140.832.camel@localhost.localdomain>
In-Reply-To: <20040510132151.238b8d0c.akpm@osdl.org>

On Mon, 2004-05-10 at 13:21, Andrew Morton wrote:
> Ram Pai <linuxram@us.ibm.com> wrote:
> >
> > > Ram, can you take a look at fixing that up please?  Something clean, not
> > > more hacks ;) I'd also be interested in an explanation of what the extra
> > > page is for.  The little comment in there doesn't really help.
> > 
> > 
> > The reason for the extra page read is as follows:
> > 
> > Consider 16k random read I/Os. Reads are generated 4 pages at a time.
> > 
> > The readahead is triggered when the 4th page in the 'current-window' is
> > touched.
> 
> Right.  We've added two whole unsigned longs to the file_struct to track
> the access patterns.  That should be sufficient for us to detect when the
> access pattern is random, and to then not perform readahead due to a
> current-window miss *at all*.
> 
> So that extra page can go away, and:
> 
> --- 25/mm/readahead.c~a	Mon May 10 13:16:59 2004
> +++ 25-akpm/mm/readahead.c	Mon May 10 13:17:22 2004
> @@ -492,21 +492,17 @@ do_io:
>  		 */
>  		if (ra->ahead_start == 0) {
>  			/*
> -			 * if the average io-size is less than maximum
> +			 * If the average io-size is less than maximum
>  			 * readahead size of the file the io pattern is
>  			 * sequential. Hence  bring in the readahead window
>  			 * immediately.
> -			 * Else the i/o pattern is random. Bring
> -			 * in the readahead window only if the last page of
> -			 * the current window is accessed (lazy readahead).
>  			 */
>  			unsigned long average = ra->average;
>  
>  			if (ra->serial_cnt > average)
>  				average = (ra->serial_cnt + ra->average) / 2;
>  
> -			if ((average >= max) || (offset == (ra->start +
> -							ra->size - 1))) {
> +			if (average >= max) {
>  				ra->ahead_start = ra->start + ra->size;
>  				ra->ahead_size = ra->next_size;
>  				actual = do_page_cache_readahead(mapping, filp,
> 
> _
> 
> 
> That way, we read the correct amount of data, and we only start I/O when we
> know the application is going to actually use the data.
> 
> This may cause problems when the application transitions from seeky-access
> to linear-access.
> 
> Does it sound feasible?
I am nervous about this change. You are totally getting rid of
lazy-readahead and that was the optimization which gave the best
possible boost in performance. 
Let me see how this patch does with a DSS benchmark.
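
For reference, the two trigger conditions in the diff above can be
written out as standalone C. This is only a minimal sketch, not the
kernel code: struct ra_sketch and all three function names are
hypothetical, and the fields simply mirror the ones visible in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the readahead fields used in the diff. */
struct ra_sketch {
	unsigned long start;      /* first page of the current window */
	unsigned long size;       /* number of pages in the current window */
	unsigned long average;    /* average io-size seen so far */
	unsigned long serial_cnt; /* length of the current sequential run */
};

/* Blend the running sequential count into the average, as the diff does. */
static unsigned long effective_average(const struct ra_sketch *ra)
{
	unsigned long average = ra->average;

	if (ra->serial_cnt > average)
		average = (ra->serial_cnt + ra->average) / 2;
	return average;
}

/*
 * Condition before the patch: start the ahead window when the average
 * reaches max, OR when the last page of the current window is touched
 * (the lazy-readahead case).
 */
static bool trigger_before_patch(const struct ra_sketch *ra,
				 unsigned long max, unsigned long offset)
{
	assert(ra->size > 0);
	return effective_average(ra) >= max ||
	       offset == ra->start + ra->size - 1;
}

/* Condition after the patch: only the average test remains. */
static bool trigger_after_patch(const struct ra_sketch *ra,
				unsigned long max)
{
	return effective_average(ra) >= max;
}
```

With a 4-page window starting at page 100 and a small average, the first
form fires only when page 103 is touched; the second form never fires
until the average reaches max.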

> 
> >
> > We may see marginal degradation from this optimization with 16k
> > i/o, but the amount of wastage avoided by this optimization (hack)
> > is great when random i/o is of larger size. I think it was 4% better
> > performance on DSS workload with 64k random reads.
> 
> 64k sounds unusually large.  We need top performance at 8k too.
> 
> > Do you still think it's a hack?
> 
> yup ;)
> 

:-(


> > Also I think with sysbench workload and Andrew's ra-copy patch, we
> > might be losing some of the benefit of the optimization, because
> > if two threads simultaneously work with copies of the same ra structure
> > and update it, the optimization effect reflected in one of the
> > ra structures is lost, depending on which ra structure gets copied back
> > last.
> 
> hm, maybe.  That only makes a difference if two threads are accessing the
> same fd at the same time, and it was really bad before the patch.  The IO
> patterns seemed OK to me with the patch.  Except it's reading one page too
> many.

In the normal large random workload this extra page would have
compensated for all the wasted readaheads. However, in the case of
sysbench with Andrew's ra-copy patch, the readahead calculation is not
happening quite right. Is it worth trying to get a marginal gain
with sysbench at the cost of taking a big hit on DSS benchmarks,
aio-tests, iozone, and probably others? Or am I making an unsubstantiated
claim? I will get back with results.
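
To make the ra-copy concern concrete, here is a small single-threaded
sketch of the interleaving I have in mind. It is an illustration under
assumptions, not the actual patch: struct ra_state, copy_in(),
copy_back() and interleaved_updates() are hypothetical names, and the
two "threads" are just two local copies updated in a fixed order.

```c
#include <assert.h>

/* Hypothetical subset of the per-file readahead state. */
struct ra_state {
	unsigned long serial_cnt;
	unsigned long average;
};

/* Each reader takes a private copy of the shared state... */
static struct ra_state copy_in(const struct ra_state *shared)
{
	return *shared;
}

/* ...and later writes the whole structure back. */
static void copy_back(struct ra_state *shared, const struct ra_state *local)
{
	*shared = *local;
}

/*
 * Readers A and B both copy in, each updates a different field, then A
 * copies back followed by B. B's stale serial_cnt overwrites A's update.
 */
static struct ra_state interleaved_updates(struct ra_state shared)
{
	struct ra_state a = copy_in(&shared);
	struct ra_state b = copy_in(&shared);

	a.serial_cnt += 4;	/* A observes a sequential run */
	b.average = 16;		/* B recomputes the average */

	copy_back(&shared, &a);
	copy_back(&shared, &b);

	/* Whoever copies back last wins: A's serial_cnt update is gone. */
	assert(shared.serial_cnt == b.serial_cnt);
	return shared;
}
```

Starting from serial_cnt = 0 and average = 8, the result keeps B's
average of 16 but serial_cnt is back at 0, so A's update is lost.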


RP

