From: Steven Pratt <slpratt@austin.ibm.com>
To: Andrew Morton <akpm@osdl.org>
Cc: linux-kernel@vger.kernel.org
Subject: Re: [PATCH/RFC] Simplified Readahead
Date: Fri, 24 Sep 2004 17:43:03 -0500
Message-ID: <4154A2F7.1050909@austin.ibm.com>
In-Reply-To: <20040924150523.4853465b.akpm@osdl.org>
Andrew Morton wrote:
>Steven Pratt <slpratt@austin.ibm.com> wrote:
>
>
>>>The advantage of the current page-at-a-time code is that the readahead code
>>>behaves exactly the same, whether the application is doing 256 4k reads or
>>>one 1M read. Plus it fits the old pagefault requirement.
>>>
>>>
>>>
>>Yes, but it accomplishes this by possibly making the 1M read slower. And I
>>must admit that I don't know what the "old pagefault requirement" is.
>>Is that something we still need to worry about?
>>
>>
>
>The "old pagefault requirement": the code in there used to perform
>readaround at pagefault time as well as readahead at read() time. Hence it
>had to work well for single-page requests. That requirement isn't there
>any more but some of the code to support it is still there, perhaps.
>
>
>
>>>>1. pages already in cache
>>>>
>>>>
>>>>
>>>>
>>>Yes, we need to handle this. All pages in cache with lots of CPUs
>>>hammering the same file is a common case.
>>>
>>>Maybe not so significant on small x86, but on large power4 with a higher
>>>lock-versus-memcpy cost ratio, that extra locking will hurt.
>>>
>>>
>>>
>>Ok, we have some data from larger machines. I will collect it all and
>>summarize separately.
>>
>>
>
>SDET would be interesting, as well as explicit testing of lots of processes
>reading the same fully-cached file.
>
>
We don't have SDET, but we have been working on the multiple-processes-
reading-the-same-file case on a large (16-way POWER4 with 128GB) machine.
We had to apply some fdrcu patches to get past problems in fget_light,
which were causing 80% spin time on the file_lock. We then end up with
about 45% spin time on mapping->tree_lock. This is on vanilla rc2. I know
you changed that to a rwlock in your tree, and we need to try that as
well. The data is not consistent enough to draw any conclusions, but I
don't see a dramatic change from turning off readahead. I need to do more
testing on this to get better results.
In any case, I agree that we should handle this.
>>>>cache we should just immediately turn off readahead. What is this
>>>>trigger point? 4 I/Os in a row? 400?
>>>>
>>>>
>>>>
>>>Hard call.
>>>
>>>
>>>
>>I know, but we have to come up with something if we really want to avoid
>>the double lookup.
>>
>>
>
>As long as readahead gets fully disabled at some stage, we should be OK.
>
>
I am attaching a reworked patch which now shuts off readahead once 10MB
(an arbitrary value for now) of consecutive I/O has been satisfied from
the page cache. Any actual I/O will restart readahead.
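The core of the logic is roughly the following (a simplified sketch, not
the patch itself; cache_hit_bytes, default_ra_pages and ra_account_page
are illustrative names, assuming a byte counter added to struct
file_ra_state):

/*
 * Sketch only -- illustrative names, not the actual patch.
 * Assumes a cache_hit_bytes counter added to struct file_ra_state.
 * Called once per page of a read request.
 */
#define RA_CACHE_HIT_MAX	(10 * 1024 * 1024)	/* 10MB, arbitrary */

static void ra_account_page(struct file_ra_state *ra, int page_was_cached)
{
	if (page_was_cached) {
		ra->cache_hit_bytes += PAGE_CACHE_SIZE;
		if (ra->cache_hit_bytes >= RA_CACHE_HIT_MAX)
			ra->ra_pages = 0;		/* readahead off */
	} else {
		/* real I/O happened: reset and re-enable readahead */
		ra->cache_hit_bytes = 0;
		if (ra->ra_pages == 0)
			ra->ra_pages = default_ra_pages; /* illustrative */
	}
}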
>We should probably compare i_size with mapping->nrpages at open() time,
>too. No point in enabling readahead if it's all cached. But doing that
>would make performance testing harder, so do it later.
>
>
Ok. Sounds good.
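Just so we're on the same page, I assume you mean something along these
lines at open() time (untested sketch):

/* Untested sketch: if the whole file already appears to be in the
 * page cache at open() time, don't bother enabling readahead.
 * Note nrpages matching the file size is only a heuristic -- the
 * cached pages aren't necessarily the ones a reader will want. */
loff_t isize = i_size_read(inode);
pgoff_t end_index = (isize + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;

if (mapping->nrpages >= end_index)
	file->f_ra.ra_pages = 0;	/* fully cached: readahead off */

As you say, that would skew any benchmark that primes the cache first,
so it can wait.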
>
>
>>>I do think we should skip the I/O for POSIX_FADV_WILLNEED against a
>>>congested queue. I can't immediately think of a good reason for skipping
>>>the I/O for normal readahead.
>>>
>>>
>>>
>>>
>>Can you expand on the POSIX_FADV_WILLNEED.
>>
>>
>
>It's an application-specified readahead hint. It should ideally be
>asynchronous so the application can get some I/O underway while it's
>crunching on something else. If the queue is congested then the
>application will accidentally block when launching the readahead, which
>kinda defeats the purpose.
>
>
Well, if the app really does this asynchronously, does it matter that we
block?
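For reference, the pattern in question is something like this from user
space (sketch only; next_off, CHUNK_SIZE and process_chunk() are
illustrative):

#include <fcntl.h>

/* Overlap I/O with computation: hint the kernel to start reading
 * the next chunk, then crunch on the current one. */
posix_fadvise(fd, next_off, CHUNK_SIZE, POSIX_FADV_WILLNEED);
process_chunk(buf);	/* compute while the readahead I/O proceeds */

If the WILLNEED call itself blocks on a congested queue, that overlap is
lost, which I take to be your point. But an app could also issue the
hint from a separate thread.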
>Yes, the application will block when it does the subsequent read() anyway,
>but applications expect to block in read(). Seems saner this way.
>
Just to be sure I have this correct: the readahead code will be invoked
once on the POSIX_FADV_WILLNEED request, which looks mostly like a
regular read to it, and then invoked again for the same pages on the
real read?
Steve