Re: [PATCH] mm/readahead.c: update the LRU positions of in-core pages, too

From: Wu Fengguang <fengguang.wu@intel.com>
To: Chris Frost <frost@cs.ucla.edu>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Steve Dickson <steved@redhat.com>,
	David Howells <dhowells@redhat.com>,
	Xu Chenfeng <xcf@ustc.edu.cn>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Steve VanDeBogart <vandebo-lkml@nerdbox.net>
Subject: Re: [PATCH] mm/readahead.c: update the LRU positions of in-core pages, too
Date: Tue, 26 Jan 2010 21:02:12 +0800
Message-ID: <20100126130211.GA25407@localhost>
In-Reply-To: <20100125223635.GC2822@frostnet.net>

On Mon, Jan 25, 2010 at 03:36:35PM -0700, Chris Frost wrote:
> I changed Wu's patch to add a PageLRU() guard that I believe is required

Good catch, Thanks!

> and optimized zone lock acquisition to only unlock and lock at zone changes.
> This optimization seems to provide a 10-20% system time improvement for
> some of my GIMP benchmarks and no improvement for other benchmarks.

OK.
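
For readers following along, the lock batching described above can be
modelled in user space. This is only a rough sketch, not kernel code:
struct model_zone, model_lock(), model_unlock() and visit_pages_batched()
are hypothetical names made up for illustration, and the "lock" is a plain
flag plus a counter rather than a real spinlock.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a zone and its lru_lock; not a kernel type. */
struct model_zone {
	int locked;		/* 1 while the modelled lock is held */
	int acquisitions;	/* how many times the lock was taken */
};

static void model_lock(struct model_zone *z)
{
	assert(!z->locked);
	z->locked = 1;
	z->acquisitions++;
}

static void model_unlock(struct model_zone *z)
{
	assert(z->locked);
	z->locked = 0;
}

/*
 * Visit n pages, where zones[i] is the zone of page i.  The lock is
 * dropped and retaken only when the zone differs from the one held.
 * Returns the total number of lock acquisitions.
 */
static int visit_pages_batched(struct model_zone **zones, size_t n)
{
	struct model_zone *locked_zone = NULL;
	int total = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		struct model_zone *zone = zones[i];

		if (zone != locked_zone) {
			if (locked_zone)
				model_unlock(locked_zone);
			locked_zone = zone;
			model_lock(locked_zone);
			total++;
		}
		/* per-page work under the lock would go here */
	}
	if (locked_zone)
		model_unlock(locked_zone);
	return total;
}
```

In this model, a run of pages from the same zone costs one acquisition
instead of one per page. The saving depends on how often consecutive
pages change zones.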

> I agree that the remove and add lru list entry code looks correct.
> putback_lru_page() has to worry about a page's evictable status
> changing, but I think this code does not because it holds the page
> zone lock.
> 
> Wu removed the ClearPageReadahead(page) call on in-core pages that
> Kamezawa's change added. This removal, not making this call, looks
> ok to me.
> 
> Thanks Wu and Kamezawa.
> 
> 
> What's next?

I happen to be preparing a readahead series, will include this one :)

Thanks,
Fengguang

> ---
> readahead: retain inactive lru pages to be accessed soon
> From: Chris Frost <frost@cs.ucla.edu>
> 
> Ensure that cached pages in the inactive list are not prematurely evicted;
> move such pages to lru head when they are covered by
> - in-kernel heuristic readahead
> - a posix_fadvise(POSIX_FADV_WILLNEED) hint from an application
> 
> Before this patch, pages already in core may be evicted before the
> pages covered by the same prefetch scan but that were not yet in core.
> Many small read requests may be forced on the disk because of this behavior.
> 
> In particular, posix_fadvise(... POSIX_FADV_WILLNEED) on an in-core page
> has no effect on the page's location in the LRU list, even if it is the
> next victim on the inactive list.
> 
> This change helps address the performance problems we encountered
> while modifying SQLite and the GIMP to use large file prefetching.
> Overall these prefetching techniques improved the runtime of large
> benchmarks by 10-17x for these applications. More in the publication
> _Reducing Seek Overhead with Application-Directed Prefetching_ in
> USENIX ATC 2009 and at http://libprefetch.cs.ucla.edu/.
> 
> Signed-off-by: Chris Frost <frost@cs.ucla.edu>
> Signed-off-by: Steve VanDeBogart <vandebo@cs.ucla.edu>
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
> ---
>  readahead.c |   47 +++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 47 insertions(+)
> 
> diff --git a/mm/readahead.c b/mm/readahead.c
> index aa1aa23..c1d67ab 100644
> --- a/mm/readahead.c
> +++ b/mm/readahead.c
> @@ -9,7 +9,9 @@
>  
>  #include <linux/kernel.h>
>  #include <linux/fs.h>
> +#include <linux/memcontrol.h>
>  #include <linux/mm.h>
> +#include <linux/mm_inline.h>
>  #include <linux/module.h>
>  #include <linux/blkdev.h>
>  #include <linux/backing-dev.h>
> @@ -133,6 +135,43 @@ out:
>  }
>  
>  /*
> + * The file range is expected to be accessed in near future.  Move pages
> + * (possibly in inactive lru tail) to lru head, so that they are retained
> + * in memory for some reasonable time.
> + */
> +static void retain_inactive_pages(struct address_space *mapping,
> +				  pgoff_t index, int len)
> +{
> +	int i;
> +	struct page *page;
> +	struct zone *zone;
> +	struct zone *locked_zone = NULL;
> +
> +	for (i = 0; i < len; i++) {
> +		page = find_get_page(mapping, index + i);
> +		if (!page)
> +			continue;
> +		zone = page_zone(page);
> +		if (zone != locked_zone) {
> +			if (locked_zone)
> +				spin_unlock_irq(&locked_zone->lru_lock);
> +			locked_zone = zone;
> +			spin_lock_irq(&locked_zone->lru_lock);
> +		}
> +		if (!PageActive(page) && !PageUnevictable(page) &&
> +		    PageLRU(page)) {
> +			int lru = page_lru_base_type(page);
> +
> +			del_page_from_lru_list(zone, page, lru);
> +			add_page_to_lru_list(zone, page, lru);
> +		}
> +		put_page(page);
> +	}
> +	if (locked_zone)
> +		spin_unlock_irq(&locked_zone->lru_lock);
> +}
> +
> +/*
>   * __do_page_cache_readahead() actually reads a chunk of disk.  It allocates all
>   * the pages first, then submits them all for I/O. This avoids the very bad
>   * behaviour which would occur if page allocations are causing VM writeback.
> @@ -184,6 +223,14 @@ __do_page_cache_readahead(struct address_space *mapping, struct file *filp,
>  	}
>  
>  	/*
> +	 * Normally readahead will auto stop on cached segments, so we won't
> +	 * hit many cached pages. If it does happen, bring the inactive pages
> +	 * adjacent to the newly prefetched ones (if any).
> +	 */
> +	if (ret < nr_to_read)
> +		retain_inactive_pages(mapping, offset, page_idx);
> +
> +	/*
>  	 * Now start the IO.  We ignore I/O errors - if the page is not
>  	 * uptodate then the caller will launch readpage again, and
>  	 * will then handle the error.
> 
> -- 
> Chris Frost
> http://www.frostnet.net/chris/
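
For reference, the application-side hint that the changelog mentions looks
roughly like this from user space. This is a minimal sketch assuming a
Linux/glibc system; hint_willneed() is a hypothetical wrapper name, not
something from the patch or from libprefetch.

```c
#define _POSIX_C_SOURCE 200112L
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>

/*
 * Hypothetical helper: tell the kernel that the byte range
 * [offset, offset + len) of fd is expected to be read soon.
 * Returns 0 on success or an errno value on failure, as
 * posix_fadvise() reports errors through its return value
 * rather than through errno.
 */
static int hint_willneed(int fd, off_t offset, off_t len)
{
	return posix_fadvise(fd, offset, len, POSIX_FADV_WILLNEED);
}
```

A caller would open the file, call hint_willneed(fd, 0, nbytes) for the
range it is about to read, and then issue ordinary read() calls. The hint
is advisory only, so a zero return does not guarantee the pages stay in
core.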
