From: Andrew Morton <akpm@linux-foundation.org>
To: Shaohua Li <shaohua.li@intel.com>
Cc: linux-mm <linux-mm@kvack.org>,
riel@redhat.com, Andi Kleen <andi@firstfloor.org>,
hughd@google.com
Subject: Re: [RFC]vmscan: doing page_referenced() in batch way
Date: Wed, 6 Oct 2010 13:10:52 -0700
Message-ID: <20101006131052.e3ae026f.akpm@linux-foundation.org>
In-Reply-To: <1285729053.27440.13.camel@sli10-conroe.sh.intel.com>
On Wed, 29 Sep 2010 10:57:33 +0800
Shaohua Li <shaohua.li@intel.com> wrote:
> When memory pressure is high, page_referenced() causes a lot of lock contention
> on anon_vma->lock or mapping->i_mmap_lock. Since pages from one file
> usually live side by side in the LRU list, we can lock several pages in
> shrink_page_list() and do page_referenced() on them in a batch to avoid some
> lock/unlock cycles, which should reduce lock contention a lot. The locking
> rule documented in rmap.c is:
>   page_lock
>     mapping->i_mmap_lock
>       anon_vma->lock
> For a batch of pages, we take the page lock for all of them first, check
> their references, and then release their i_mmap_lock or anon_vma lock. This
> does not seem to break the rule to me.
> Before I polish the patch further, I'd like to know if there is anything
> preventing us from doing such batching here.
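If I've understood the claim, the sequence for a whole batch is

	lock_page(page1) ... lock_page(pageN)	/* page locks first, in LRU order */
	spin_lock(&mapping->i_mmap_lock);	/* or anon_vma->lock */
	/* walk the rmap for page1..pageN under the one shared lock */
	spin_unlock(&mapping->i_mmap_lock);
	unlock_page(page1) ... unlock_page(pageN)

which preserves the page_lock -> i_mmap_lock -> anon_vma->lock ordering, as
long as no new page lock is taken while one of the rmap locks is still held.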
The patch adds quite a bit of complexity, so we'd need to see benchmark
testing results which justify it, please.
Also, the entire patch is irrelevant for uniprocessor machines, so the
runtime overhead and code-size increases for CONFIG_SMP=n builds should
be as low as possible - ideally zero. Please quantify this as well
within the changelog if you pursue this work.
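One way of keeping UP builds clean (just a sketch of the idea - the right
fallback depends on how shrink_page_list() ends up consuming the results):

	#ifdef CONFIG_SMP
	#define PRC_PAGE_NUM 8
	struct page_reference_control {
		int num;
		struct page *pages[PRC_PAGE_NUM];
		int references[PRC_PAGE_NUM];
		struct anon_vma *anon_vma;
		struct address_space *mapping;
	};
	#else
	/*
	 * UP: there's no lock contention to amortise, so don't build the
	 * batching machinery at all - callers pass prc == NULL and take
	 * the existing unbatched page_check_references() path.
	 */
	struct page_reference_control;
	#endif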
>
> ...
>
> +#define PRC_PAGE_NUM 8
> +struct page_reference_control {
> + int num;
> + struct page *pages[PRC_PAGE_NUM];
> + int references[PRC_PAGE_NUM];
> + struct anon_vma *anon_vma;
> + struct address_space *mapping;
> + /* no ksm */
> +};
hm, 120 bytes of stack consumed, deep in page reclaim.
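(On 64-bit that's 4 bytes for `num' plus 4 of padding, 64 for pages[], 32
for references[], and 8 each for the two pointers: 4 + 4 + 64 + 32 + 8 + 8
= 120.)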
> #endif
>
> extern int hwpoison_filter(struct page *p);
>
> ...
>
> static int page_referenced_file(struct page *page,
> struct mem_cgroup *mem_cont,
> - unsigned long *vm_flags)
> + unsigned long *vm_flags,
> + struct page_reference_control *prc)
> {
> unsigned int mapcount;
> struct address_space *mapping = page->mapping;
> @@ -603,8 +623,25 @@ static int page_referenced_file(struct p
> */
> BUG_ON(!PageLocked(page));
>
> - spin_lock(&mapping->i_mmap_lock);
> + if (prc) {
> + if (mapping == prc->mapping) {
> + goto skip_lock;
> + }
> + if (prc->anon_vma) {
> + page_unlock_anon_vma(prc->anon_vma);
> + prc->anon_vma = NULL;
> + }
> + if (prc->mapping) {
> + spin_unlock(&prc->mapping->i_mmap_lock);
> + prc->mapping = NULL;
> + }
> + prc->mapping = mapping;
> +
> + spin_lock(&mapping->i_mmap_lock);
> + } else
> + spin_lock(&mapping->i_mmap_lock);
Move the spin_lock() outside, remove the `else' part.
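ie, something like (untested):

	if (prc) {
		if (mapping == prc->mapping)
			goto skip_lock;
		/* switching mappings: drop whatever the batch still holds */
		if (prc->anon_vma) {
			page_unlock_anon_vma(prc->anon_vma);
			prc->anon_vma = NULL;
		}
		if (prc->mapping)
			spin_unlock(&prc->mapping->i_mmap_lock);
		prc->mapping = mapping;
	}
	spin_lock(&mapping->i_mmap_lock);
skip_lock: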
> +skip_lock:
> /*
> * i_mmap_lock does not stabilize mapcount at all, but mapcount
> * is more likely to be accurate if we note it after spinning.
> @@ -628,7 +665,8 @@ static int page_referenced_file(struct p
> break;
> }
>
> - spin_unlock(&mapping->i_mmap_lock);
> + if (!prc)
> + spin_unlock(&mapping->i_mmap_lock);
> return referenced;
> }
>
>
> ...
>
> +static void do_prc_batch(struct scan_control *sc,
> + struct page_reference_control *prc)
> +{
> + int i;
> + for (i = 0; i < prc->num; i++)
> + prc->references[i] = page_check_references(prc->pages[i], sc,
> + prc);
> + /*
> + * we must release all locks here; the lock ordering requires
> + * page_lock ->
> + * mapping->i_mmap_lock ->
> + * anon_vma->lock
> + * releasing them guarantees we don't break the rule on the next run
> + */
> + if (prc->anon_vma) {
> + page_unlock_anon_vma(prc->anon_vma);
> + prc->anon_vma = NULL;
> + }
> + if (prc->mapping) {
> + spin_unlock(&prc->mapping->i_mmap_lock);
> + prc->mapping = NULL;
> + }
> +}
I didn't check the locking alterations.
> +static int page_check_references_batch(struct page *page, struct scan_control *sc,
> + struct page_reference_control *prc)
> +{
> + /* bypass ksm pages */
> + if (PageKsm(page))
> + return 1;
A general point about code comments: the comment shouldn't describe
"what" the code is doing unless that is unobvious. The comment should
explain "why" the code is doing something. Because that is something
which the code itself cannot explain.
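Here, for example, anyone can see *that* KSM pages are skipped. The useful
comment would explain *why* - presumably that the prc caches only a single
i_mmap_lock or anon_vma lock, while a KSM page's rmap walk can visit
multiple anon_vmas via the stable tree, so such pages can't share the
batch's cached lock.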
> + if (prc->num < PRC_PAGE_NUM)
> + prc->pages[prc->num] = page;
> + prc->num++;
> + if (prc->num == PRC_PAGE_NUM)
> + return 1;
> + return 0;
> +}
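If I'm reading the calling convention correctly, the shrink_page_list()
side ends up looking something like this (hypothetical - that hunk isn't
quoted above):

	if (page_check_references_batch(page, sc, &prc)) {
		/* batch full, or an unbatchable KSM page: flush it */
		do_prc_batch(sc, &prc);
		/* ... consume prc->references[], then reset prc->num ... */
	}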
>
> ...
>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org