From: Ram Pai <linuxram@us.ibm.com>
To: Andrew Morton <akpm@osdl.org>
Cc: alexeyk@mysql.com, nickpiggin@yahoo.com.au, peter@mysql.com,
	linux-kernel@vger.kernel.org, axboe@suse.de
Subject: Re: Random file I/O regressions in 2.6
Date: 10 May 2004 15:39:28 -0700
Message-ID: <1084228767.6140.832.camel@localhost.localdomain>
In-Reply-To: <20040510132151.238b8d0c.akpm@osdl.org>

On Mon, 2004-05-10 at 13:21, Andrew Morton wrote:
> Ram Pai <linuxram@us.ibm.com> wrote:
> >
> > > Ram, can you take a look at fixing that up please?  Something clean, not
> > > more hacks ;) I'd also be interested in an explanation of what the extra
> > > page is for.  The little comment in there doesn't really help.
> > 
> > 
> > The reason for the extra page read is as follows:
> > 
> > Consider 16k random read i/os. Reads are generated 4 pages at a time.
> > 
> > The readahead is triggered when the 4th page in the 'current-window' is
> > touched.
> 
> Right.  We've added two whole unsigned longs to the file_struct to track
> the access patterns.  That should be sufficient for us to detect when the
> access pattern is random, and to then not perform readahead due to a
> current-window miss *at all*.
> 
> So that extra page can go away, and:
> 
> --- 25/mm/readahead.c~a	Mon May 10 13:16:59 2004
> +++ 25-akpm/mm/readahead.c	Mon May 10 13:17:22 2004
> @@ -492,21 +492,17 @@ do_io:
>  		 */
>  		if (ra->ahead_start == 0) {
>  			/*
> -			 * if the average io-size is less than maximum
> +			 * If the average io-size is less than maximum
>  			 * readahead size of the file the io pattern is
>  			 * sequential. Hence  bring in the readahead window
>  			 * immediately.
> -			 * Else the i/o pattern is random. Bring
> -			 * in the readahead window only if the last page of
> -			 * the current window is accessed (lazy readahead).
>  			 */
>  			unsigned long average = ra->average;
>  
>  			if (ra->serial_cnt > average)
>  				average = (ra->serial_cnt + ra->average) / 2;
>  
> -			if ((average >= max) || (offset == (ra->start +
> -							ra->size - 1))) {
> +			if (average >= max) {
>  				ra->ahead_start = ra->start + ra->size;
>  				ra->ahead_size = ra->next_size;
>  				actual = do_page_cache_readahead(mapping, filp,
> 
> _
> 
> 
> That way, we read the correct amount of data, and we only start I/O when we
> know the application is going to actually use the data.
> 
> This may cause problems when the application transitions from seeky-access
> to linear-access.
> 
> Does it sound feasible?

I am nervous about this change. It removes lazy-readahead entirely, and
that was the optimization that gave the biggest boost in performance.
Let me see how this patch does with a DSS benchmark.
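
To make the difference concrete, here is a minimal sketch of the trigger
condition before and after the quoted diff. It is an illustration only:
struct ra_sketch and the function names are made up for this example, only
the fields visible in the diff are modeled, the surrounding
ra->ahead_start == 0 check is left out, and the field meanings in the
comments are assumptions based on the diff's own comment.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the readahead state; not the kernel struct. */
struct ra_sketch {
	unsigned long start;      /* first page of the current window */
	unsigned long size;       /* number of pages in the current window */
	unsigned long average;    /* assumed: running average io-size */
	unsigned long serial_cnt; /* assumed: length of the current sequential run */
};

/* Same averaging step as in the quoted hunk. */
unsigned long effective_average(const struct ra_sketch *ra)
{
	unsigned long average = ra->average;

	if (ra->serial_cnt > average)
		average = (ra->serial_cnt + ra->average) / 2;
	return average;
}

/* Before the patch: sequential pattern, OR the last page of the
 * current window was touched (lazy readahead). */
bool start_ahead_before(const struct ra_sketch *ra,
			unsigned long offset, unsigned long max)
{
	return effective_average(ra) >= max ||
	       offset == ra->start + ra->size - 1;
}

/* After the patch: sequential pattern only. */
bool start_ahead_after(const struct ra_sketch *ra,
		       unsigned long offset, unsigned long max)
{
	(void)offset; /* the offset no longer takes part in the decision */
	return effective_average(ra) >= max;
}
```

With a 4-page window starting at page 100, average and serial_cnt both 4,
and max 32, touching page 103 starts the ahead window under the old
condition but not under the new one. That is the lazy-readahead case the
patch drops.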

> 
> >
> > We may see marginal degradation from this optimization with 16k
> > i/o, but the amount of wastage avoided by this optimization (hack)
> > is large when the random i/o is of larger size. I think it was 4% better
> > performance on a DSS workload with 64k random reads.
> 
> 64k sounds unusually large.  We need top performance at 8k too.
> 
> > Do you still think it's a hack?
> 
> yup ;)
> 

:-(


> > Also I think with the sysbench workload and Andrew's ra-copy patch, we
> > might be losing some of the benefit of the optimization, because
> > if two threads simultaneously work with copies of the same ra structure
> > and update it, the optimization effect reflected in one of the
> > ra structures is lost, depending on which ra structure gets copied back
> > last.
> 
> hm, maybe.  That only makes a difference if two threads are accessing the
> same fd at the same time, and it was really bad before the patch.  The IO
> patterns seemed OK to me with the patch.  Except it's reading one page too
> many.

In the normal large random workload this extra page would have
compensated for all the wasted readaheads.  However in the case of
sysbench with Andrew's ra-copy patch the readahead calculation is not
happening quite right. Is it worth trying to get a marginal gain
with sysbench at the cost of taking a big hit on DSS benchmarks,
aio-tests, iozone and probably others? Or am I making an unsubstantiated
claim? I will get back with results.
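
The copy-back effect I described in the quoted paragraph above can be
sketched as a plain lost update. This is only a model under assumptions:
the struct, the single field, and the function name are hypothetical, and
the interleaving is written out sequentially rather than with real
threads. The actual ra-copy patch is not shown in this message.

```c
/* Hypothetical one-field model of the shared readahead state. */
struct ra_copy_sketch {
	unsigned long serial_cnt;
};

/*
 * Two readers on the same fd each take a private copy, update it, and
 * copy it back. Returns the shared value after both copy-backs.
 */
unsigned long copy_back_race(unsigned long initial,
			     unsigned long pages_a, unsigned long pages_b)
{
	struct ra_copy_sketch shared = { initial };
	struct ra_copy_sketch a = shared; /* thread A takes its copy */
	struct ra_copy_sketch b = shared; /* thread B copies before A writes back */

	a.serial_cnt += pages_a;          /* A records its pages */
	b.serial_cnt += pages_b;          /* B records its pages */

	shared = a;                       /* A copies back first */
	shared = b;                       /* B copies back last; A's update is gone */
	return shared.serial_cnt;
}
```

Starting from 0, with A recording 4 pages and B recording 8, the shared
count ends at 8 rather than 12. Whichever copy lands last decides what the
next readahead calculation sees.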


RP

