From: Balbir Singh <balbir@linux.vnet.ibm.com>
To: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: linux-mm@kvack.org, akpm@linux-foundation.org, mel@csn.ul.ie,
	clameter@sgi.com, riel@redhat.com, andrea@suse.de,
	a.p.zijlstra@chello.nl, eric.whitney@hp.com, npiggin@suse.de
Subject: Re: [PATCH/RFC 6/14] Reclaim Scalability: "No Reclaim LRU Infrastructure"
Date: Wed, 19 Sep 2007 11:30:16 +0530
Message-ID: <46F0BAF0.2020806@linux.vnet.ibm.com>
In-Reply-To: <20070914205438.6536.49500.sendpatchset@localhost>

Lee Schermerhorn wrote:
> PATCH/RFC 06/14 Reclaim Scalability: "No Reclaim LRU Infrastructure"
> 
> Against:  2.6.23-rc4-mm1
> 
> Infrastructure to manage pages excluded from reclaim--i.e., hidden
> from vmscan.  Based on a patch by Larry Woodman of Red Hat. Reworked
> to maintain "nonreclaimable" pages on a separate per-zone LRU list,
> to "hide" them from vmscan.  A separate noreclaim pagevec is provided
> for shrink_active_list() to move nonreclaimable pages to the noreclaim
> list without over burdening the zone lru_lock.
> 
> Pages on the noreclaim list have both PG_noreclaim and PG_lru set.
> Thus, PG_noreclaim is analogous to and mutually exclusive with
> PG_active--it specifies which LRU list the page is on.  
> 
> The noreclaim infrastructure is enabled by a new mm Kconfig option
> [CONFIG_]NORECLAIM.
> 

Could we use a different name? CONFIG_NORECLAIM could be misread as
meaning that reclaim is disabled on the system altogether.


> 
> 4.  TODO:  Memory Controllers maintain separate active and inactive lists.
>     Need to consider whether they should also maintain a noreclaim list.  
>     Also, convert to use Christoph's array of indexed lru variables?
> 
>     See //TODO note in mm/memcontrol.c re:  isolating non-reclaimable
>     pages. 
> 

Thanks, I'll look into exploiting this in the memory controller.

> Index: Linux/mm/swap.c
> ===================================================================
> --- Linux.orig/mm/swap.c	2007-09-14 10:21:45.000000000 -0400
> +++ Linux/mm/swap.c	2007-09-14 10:21:48.000000000 -0400
> @@ -116,14 +116,14 @@ int rotate_reclaimable_page(struct page 
>  		return 1;
>  	if (PageDirty(page))
>  		return 1;
> -	if (PageActive(page))
> +	if (PageActive(page) | PageNoreclaim(page))

Did you intend this to be a bitwise OR (|) rather than a logical OR (||)?
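
To illustrate the difference being asked about, here is a minimal
user-space sketch. The struct page, the flag values and the call counter
are hypothetical stand-ins, not the kernel definitions, and the flag tests
are assumed to return 0 or 1. Both forms then yield the same truth value,
but the bitwise form always evaluates both tests, while the logical form
short-circuits.

```c
/* Hypothetical stand-ins for the kernel's struct page and flag tests. */
struct page { unsigned int flags; };
enum { PG_ACTIVE = 1 << 0, PG_NORECLAIM = 1 << 1 };

static int noreclaim_tests;	/* counts how often the second test runs */

static int PageActive(const struct page *page)
{
	return (page->flags & PG_ACTIVE) != 0;
}

static int PageNoreclaim(const struct page *page)
{
	noreclaim_tests++;
	return (page->flags & PG_NORECLAIM) != 0;
}

/* Bitwise OR: both operands are always evaluated. */
static int skip_bitwise(const struct page *page)
{
	return PageActive(page) | PageNoreclaim(page);
}

/* Logical OR: the second test is skipped when the first is true. */
static int skip_logical(const struct page *page)
{
	return PageActive(page) || PageNoreclaim(page);
}
```

For an active page, skip_bitwise() still runs the PageNoreclaim() test,
whereas skip_logical() does not.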

> -	if (PageLRU(page) && !PageActive(page)) {
> +	if (PageLRU(page) && !PageActive(page) && !PageNoreclaim(page)) {

Since the same test is used again below, would it make sense to wrap it
in an inline function, say check_page_lru_inactive_reclaimable()?
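
A minimal sketch of such a helper, written against a hypothetical
user-space struct page so that it compiles on its own. The flag values
and the PageLRU()/PageActive()/PageNoreclaim() bodies here are
assumptions for illustration, not the kernel's definitions.

```c
/* Hypothetical stand-ins for the kernel's struct page and flag tests. */
struct page { unsigned int flags; };
enum { PG_LRU = 1 << 0, PG_ACTIVE = 1 << 1, PG_NORECLAIM = 1 << 2 };

static inline int PageLRU(const struct page *page)
{
	return (page->flags & PG_LRU) != 0;
}

static inline int PageActive(const struct page *page)
{
	return (page->flags & PG_ACTIVE) != 0;
}

static inline int PageNoreclaim(const struct page *page)
{
	return (page->flags & PG_NORECLAIM) != 0;
}

/*
 * True only for a page that is on an LRU list and is neither active
 * nor nonreclaimable, i.e. a page on the inactive list.
 */
static inline int check_page_lru_inactive_reclaimable(const struct page *page)
{
	return PageLRU(page) && !PageActive(page) && !PageNoreclaim(page);
}
```

Both call sites could then read
"if (check_page_lru_inactive_reclaimable(page)) {".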


>  void lru_add_drain(void)
> @@ -277,14 +312,18 @@ void release_pages(struct page **pages, 
> 
>  		if (PageLRU(page)) {
>  			struct zone *pagezone = page_zone(page);
> +			int is_lru_page;
> +
>  			if (pagezone != zone) {
>  				if (zone)
>  					spin_unlock_irq(&zone->lru_lock);
>  				zone = pagezone;
>  				spin_lock_irq(&zone->lru_lock);
>  			}
> -			VM_BUG_ON(!PageLRU(page));
> -			__ClearPageLRU(page);
> +			is_lru_page = PageLRU(page);
> +			VM_BUG_ON(!(is_lru_page));
> +			if (is_lru_page)

This is a little confusing: after asserting that the page is indeed
on the LRU, why check is_lru_page again? A comment would be helpful
here.


> +#ifdef CONFIG_NORECLAIM
> +void __pagevec_lru_add_noreclaim(struct pagevec *pvec)
> +{
> +	int i;
> +	struct zone *zone = NULL;
> +
> +	for (i = 0; i < pagevec_count(pvec); i++) {
> +		struct page *page = pvec->pages[i];
> +		struct zone *pagezone = page_zone(page);
> +
> +		if (pagezone != zone) {
> +			if (zone)
> +				spin_unlock_irq(&zone->lru_lock);
> +			zone = pagezone;
> +			spin_lock_irq(&zone->lru_lock);
> +		}
> +		VM_BUG_ON(PageLRU(page));
> +		SetPageLRU(page);

> +		VM_BUG_ON(PageActive(page) || PageNoreclaim(page));
> +		SetPageNoreclaim(page);
> +		add_page_to_noreclaim_list(zone, page);

These two calls seem to be the only difference between __pagevec_lru_add()
and this routine. Any chance we could refactor to reuse most of the
code? Something like: call __pagevec_lru_add_prepare(), do the
list-specific work, and then call __pagevec_lru_add_finish().
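
One way the suggested split could look, as a user-space sketch under
stated assumptions: struct zone and struct page are hypothetical
stand-ins, the "locked" field stands in for zone->lru_lock, list
insertion is reduced to a counter, and the final page release and
pagevec reset are left out. The helper names follow the suggestion
above but drop the leading underscores.

```c
#include <stddef.h>

/* Hypothetical stand-ins; "locked" models zone->lru_lock. */
struct zone { int locked; int nr_inactive; int nr_noreclaim; };
struct page { struct zone *zone; unsigned int flags; };
enum { PG_LRU = 1 << 0, PG_NORECLAIM = 1 << 2 };

/* Shared step: take the lock of the page's zone if it changed, set PG_lru. */
static struct zone *pagevec_lru_add_prepare(struct page *page,
					    struct zone *zone)
{
	if (page->zone != zone) {
		if (zone)
			zone->locked = 0;	/* spin_unlock_irq() */
		zone = page->zone;
		zone->locked = 1;		/* spin_lock_irq() */
	}
	page->flags |= PG_LRU;
	return zone;
}

/* Shared step: drop the lock of the last zone visited, if any. */
static void pagevec_lru_add_finish(struct zone *zone)
{
	if (zone)
		zone->locked = 0;
}

/* Inactive-list variant: only the middle of the loop is specific. */
static void lru_add(struct page **pages, size_t n)
{
	struct zone *zone = NULL;

	for (size_t i = 0; i < n; i++) {
		zone = pagevec_lru_add_prepare(pages[i], zone);
		zone->nr_inactive++;
	}
	pagevec_lru_add_finish(zone);
}

/* Noreclaim variant: sets the extra flag and uses the other list. */
static void lru_add_noreclaim(struct page **pages, size_t n)
{
	struct zone *zone = NULL;

	for (size_t i = 0; i < n; i++) {
		zone = pagevec_lru_add_prepare(pages[i], zone);
		pages[i]->flags |= PG_NORECLAIM;
		zone->nr_noreclaim++;
	}
	pagevec_lru_add_finish(zone);
}
```

The zone-switching logic then lives in one place, and each variant
contributes only its flag and list handling.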


> +/*
> + * move_to_lru() - place @page onto appropriate lru list
> + * based on preserved page flags:  active, noreclaim, none
> + */
>  static inline void move_to_lru(struct page *page)
>  {
> -	if (PageActive(page)) {
> +	if (PageNoreclaim(page)) {
> +		VM_BUG_ON(PageActive(page));
> +		ClearPageNoreclaim(page);
> +		lru_cache_add_noreclaim(page);

I know that lru_cache_add_noreclaim() does the right thing
by looking at PageNoreclaim(), but the sequence is a little
confusing to read.


> -int __isolate_lru_page(struct page *page, int mode)
> +int __isolate_lru_page(struct page *page, int mode, int take_nonreclaimable)
>  {
>  	int ret = -EINVAL;
> 
> @@ -652,12 +660,27 @@ int __isolate_lru_page(struct page *page
>  		return ret;
> 
>  	/*
> -	 * When checking the active state, we need to be sure we are
> -	 * dealing with comparible boolean values.  Take the logical not
> -	 * of each.
> +	 * Non-reclaimable pages shouldn't make it onto the inactive list,
> +	 * so if we encounter one, we should be scanning either the active
> +	 * list--e.g., after splicing noreclaim list to end of active list--
> +	 * or nearby pages [lumpy reclaim].  Take it only if scanning active
> +	 * list.
>  	 */
> -	if (mode != ISOLATE_BOTH && (!PageActive(page) != !mode))
> -		return ret;
> +	if (PageNoreclaim(page)) {
> +		if (!take_nonreclaimable)
> +			return -EBUSY;	/* lumpy reclaim -- skip this page */
> +		/*
> +		 * else fall thru' and try to isolate
> +		 */

I think we need to distinguish between the types of nonreclaimable
pages. Is it the heavily mapped pages that you pass on further?
As a casual reader, I find it hard to understand how lumpy reclaim
might end up trying to reclaim a non-reclaimable page :-)
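
For reference, the branch quoted above reduces to the following decision.
This is a user-space sketch covering only the lines shown in the hunk;
the function name and its plain int parameters are hypothetical, and
whatever the patch does after falling through is not modelled.

```c
#include <errno.h>

/*
 * Returns -EBUSY when a nonreclaimable page must be skipped, and 0 when
 * the caller should carry on and try to isolate the page.
 */
static int noreclaim_isolate_check(int page_is_noreclaim,
				   int take_nonreclaimable)
{
	if (page_is_noreclaim && !take_nonreclaimable)
		return -EBUSY;	/* lumpy reclaim: skip this page */
	return 0;		/* fall through and try to isolate */
}
```

Per the comment in the patch, a nonreclaimable page reached while
scanning nearby pages for lumpy reclaim is skipped, and one reached while
scanning the active list is taken.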

-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL