From: Andrew Morton <akpm@linux-foundation.org>
To: Johannes Weiner <hannes@saeurebad.de>
Cc: torvalds@linux-foundation.org, riel@redhat.com,
	kosaki.motohiro@jp.fujitsu.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [patch v2] vmscan: protect zone rotation stats by lru lock
Date: Mon, 1 Dec 2008 13:41:12 -0800
Message-ID: <20081201134112.24c647ff.akpm@linux-foundation.org>
In-Reply-To: <E1L6y5T-0003q3-M3@cmpxchg.org>

On Mon, 01 Dec 2008 03:00:35 +0100
Johannes Weiner <hannes@saeurebad.de> wrote:

> The zone's rotation statistics must not be accessed without the
> corresponding LRU lock held.  Fix an unprotected write in
> shrink_active_list().
> 

I don't think it really matters.  It's quite common in that code to do
unlocked, racy updates to statistics such as this, because on those
rare occasions where a race does happen, the result is a small glitch
in the reclaim logic which nobody will notice anyway.

Of course, this does need to be done with some care, to ensure the
glitch _will_ be small.  If such a race would cause the scanner to go
off and reclaim 2^32 pages, well, that's not so good.

> 
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1243,32 +1243,32 @@ static void shrink_active_list(unsigned 
>  		/* page_referenced clears PageReferenced */
>  		if (page_mapping_inuse(page) &&
>  		    page_referenced(page, 0, sc->mem_cgroup))
>  			pgmoved++;
>  
>  		list_add(&page->lru, &l_inactive);
>  	}
>  
> +	spin_lock_irq(&zone->lru_lock);
>  	/*
>  	 * Count referenced pages from currently used mappings as
>  	 * rotated, even though they are moved to the inactive list.
>  	 * This helps balance scan pressure between file and anonymous
>  	 * pages in get_scan_ratio.
>  	 */
>  	zone->recent_rotated[!!file] += pgmoved;
>  
>  	/*
>  	 * Move the pages to the [file or anon] inactive list.
>  	 */
>  	pagevec_init(&pvec, 1);
>  
>  	pgmoved = 0;
>  	lru = LRU_BASE + file * LRU_FILE;
> -	spin_lock_irq(&zone->lru_lock);

We've unnecessarily moved a pile of other things inside the locked
region as well, needlessly extending the lock hold times.
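
To illustrate the narrower alternative, here is a rough userspace sketch.
It is not the kernel code: a pthread mutex stands in for zone->lru_lock,
and struct rot_stats and account_rotated() are made-up names.  The point
is only that local setup happens before the lock is taken and that the
rotated count is kept in its own variable, so the lock can stay where it
was.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the relevant part of struct zone. */
struct rot_stats {
	pthread_mutex_t lru_lock;		/* plays the role of zone->lru_lock */
	unsigned long recent_rotated[2];	/* [0] = anon, [1] = file */
};

/*
 * Add 'rotated' pages to the anon or file counter and return the new
 * total.  Only the shared counter is touched inside the critical section.
 */
static unsigned long account_rotated(struct rot_stats *zone, int file,
				     unsigned long rotated)
{
	unsigned long total;
	int idx = !!file;	/* local setup: done before taking the lock */

	assert(zone != NULL);

	pthread_mutex_lock(&zone->lru_lock);
	zone->recent_rotated[idx] += rotated;
	total = zone->recent_rotated[idx];
	/* the list-moving loop would run here, under the same lock */
	pthread_mutex_unlock(&zone->lru_lock);

	return total;
}
```

In the real function this would presumably mean leaving spin_lock_irq()
at its original position and moving the statistics update below it.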

>  	while (!list_empty(&l_inactive)) {
>  		page = lru_to_page(&l_inactive);
>  		prefetchw_prev_lru_page(page, &l_inactive, flags);
>  		VM_BUG_ON(PageLRU(page));
>  		SetPageLRU(page);
>  		VM_BUG_ON(!PageActive(page));
>  		ClearPageActive(page);
>  

You'll note that the code which _uses_ these values does so without
holding the lock.  So get_scan_ratio() sees incoherent values of
recent_scanned[0] and recent_scanned[1].  As is common in this code,
that is OK and deliberate.
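
A small sketch of why a slightly stale read is tolerable.  The formula is
only loosely modelled on the shape of a scanned/rotated ratio and is an
assumption for illustration, not a copy of get_scan_ratio(); struct
ratio_in and anon_percent() are made-up names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical snapshot of the counters, possibly read at different times. */
struct ratio_in {
	unsigned long scanned[2];	/* [0] = anon, [1] = file */
	unsigned long rotated[2];
};

/*
 * Percentage of scan pressure assigned to anon pages.  The +1 terms keep
 * the divisions away from zero.
 */
static unsigned long anon_percent(const struct ratio_in *s,
				  unsigned long anon_prio,
				  unsigned long file_prio)
{
	unsigned long ap, fp;

	assert(s != NULL);

	ap = (anon_prio + 1) * (s->scanned[0] + 1) / (s->rotated[0] + 1);
	fp = (file_prio + 1) * (s->scanned[1] + 1) / (s->rotated[1] + 1);

	return 100 * ap / (ap + fp + 1);
}
```

With scanned = {99, 99}, rotated = {9, 49} and priorities 60 and 140 this
gives 68; if the reader sees rotated[0] as 8 because it missed one
update, the result is 70.  That is the kind of small glitch meant above.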

It's also racy here:

	if (unlikely(zone->recent_scanned[0] > anon / 4)) {
		spin_lock_irq(&zone->lru_lock);
		zone->recent_scanned[0] /= 2;
		zone->recent_rotated[0] /= 2;
		spin_unlock_irq(&zone->lru_lock);
	}

failing to recheck the comparison after taking the lock.
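
The recheck could look something like the sketch below.  Again this is a
userspace analogue with a pthread mutex in place of zone->lru_lock, and
struct scan_stats and age_anon_stats() are made-up names.  The unlocked
test is kept as a cheap hint, and the decision to halve is only made
once the lock is held.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the relevant part of struct zone. */
struct scan_stats {
	pthread_mutex_t lru_lock;		/* plays the role of zone->lru_lock */
	unsigned long recent_scanned[2];	/* [0] = anon, [1] = file */
	unsigned long recent_rotated[2];
};

/* Halve the anon statistics if they have grown too large; returns 1 if so. */
static int age_anon_stats(struct scan_stats *zone, unsigned long anon)
{
	int aged = 0;

	assert(zone != NULL);

	/*
	 * Unlocked check: may be stale, so it is only a hint.  Portable
	 * C11 would use an atomic load here; a plain read is kept to
	 * mirror the snippet above.
	 */
	if (zone->recent_scanned[0] > anon / 4) {
		pthread_mutex_lock(&zone->lru_lock);
		/* Recheck: someone else may have halved the stats already. */
		if (zone->recent_scanned[0] > anon / 4) {
			zone->recent_scanned[0] /= 2;
			zone->recent_rotated[0] /= 2;
			aged = 1;
		}
		pthread_mutex_unlock(&zone->lru_lock);
	}

	return aged;
}
```

Without the inner test, two CPUs that both pass the unlocked comparison
would each halve the counters, so the statistics get divided by four
instead of two.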
