All of lore.kernel.org
 help / color / mirror / Atom feed
From: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
To: Nick Piggin <npiggin@suse.de>
Cc: Linux Memory Management <linux-mm@kvack.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Hugh Dickins <hugh@veritas.com>, Andrew Morton <akpm@osdl.org>,
	Andrea Arcangeli <andrea@suse.de>,
	Linus Torvalds <torvalds@osdl.org>,
	David Miller <davem@davemloft.net>
Subject: Re: [patch 3/3] mm: PageActive no testset
Date: Wed, 18 Jan 2006 12:13:46 -0200	[thread overview]
Message-ID: <20060118141346.GB7048@dmt.cnet> (raw)
In-Reply-To: <20060118024139.10241.73020.sendpatchset@linux.site>

Hi Nick,

On Wed, Jan 18, 2006 at 11:40:58AM +0100, Nick Piggin wrote:
> PG_active is protected by zone->lru_lock, it does not need TestSet/TestClear
> operations.

page->flags bits (including PG_active and PG_lru bits) are touched by
several codepaths which do not hold zone->lru_lock. 

AFAICT zone->lru_lock guards access to the LRU list, and no more than
that.

Moreover, what about consistency of the rest of page->flags bits?

PPC for example implements test_and_set_bit() with:

	lwarx	reg, addr   (load and create reservation for 32-bit addr)
	or 	reg, BITOP_MASK(nr)	
	stwcx	reg, addr  (store word upon reservation validation, otherwise loop)

If you don't use atomic operations on page->flags, unrelated bits other
than that you're working with can have their updates lost, given that 
in reality page->flags is not protected by the lru_lock.

For example:

/*
 * shrink_list adds the number of reclaimed pages to sc->nr_reclaimed
 */
static int shrink_list(struct list_head *page_list, struct scan_control *sc)
{
...
                BUG_ON(PageActive(page));
...
activate_locked:
                SetPageActive(page);
                pgactivate++;
keep_locked:
                unlock_page(page);
keep:
                list_add(&page->lru, &ret_pages);
                BUG_ON(PageLRU(page));
}

And recently:

#ifdef CONFIG_MIGRATION
static inline void move_to_lru(struct page *page)
{
        list_del(&page->lru);
        if (PageActive(page)) {
                /*
                 * lru_cache_add_active checks that
                 * the PG_active bit is off.
                 */ 
                ClearPageActive(page);
                lru_cache_add_active(page);

Not relying on zone->lru_lock allows interesting optimizations
such as moving active/inactive pgflag bit setting from inside
__pagevec_lru_add/__pagevec_lru_add_active to the caller, and merging
the two.

Comments?


> Signed-off-by: Nick Piggin <npiggin@suse.de>
> 
> Index: linux-2.6/mm/vmscan.c
> ===================================================================
> --- linux-2.6.orig/mm/vmscan.c
> +++ linux-2.6/mm/vmscan.c
> @@ -997,8 +997,9 @@ refill_inactive_zone(struct zone *zone, 
>  		prefetchw_prev_lru_page(page, &l_inactive, flags);
>  		BUG_ON(PageLRU(page));
>  		SetPageLRU(page);
> -		if (!TestClearPageActive(page))
> -			BUG();
> +		BUG_ON(!PageActive(page));
> +		ClearPageActive(page);
> +
>  		list_move(&page->lru, &zone->inactive_list);
>  		pgmoved++;
>  		if (!pagevec_add(&pvec, page)) {
> Index: linux-2.6/mm/swap.c
> ===================================================================
> --- linux-2.6.orig/mm/swap.c
> +++ linux-2.6/mm/swap.c
> @@ -356,8 +356,8 @@ void __pagevec_lru_add_active(struct pag
>  		}
>  		BUG_ON(PageLRU(page));
>  		SetPageLRU(page);
> -		if (TestSetPageActive(page))
> -			BUG();
> +		BUG_ON(PageActive(page));
> +		SetPageActive(page);
>  		add_page_to_active_list(zone, page);
>  	}
>  	if (zone)
> Index: linux-2.6/include/linux/page-flags.h
> ===================================================================
> --- linux-2.6.orig/include/linux/page-flags.h
> +++ linux-2.6/include/linux/page-flags.h
> @@ -251,8 +251,6 @@ extern void __mod_page_state_offset(unsi
>  #define PageActive(page)	test_bit(PG_active, &(page)->flags)
>  #define SetPageActive(page)	set_bit(PG_active, &(page)->flags)
>  #define ClearPageActive(page)	clear_bit(PG_active, &(page)->flags)
> -#define TestClearPageActive(page) test_and_clear_bit(PG_active, &(page)->flags)
> -#define TestSetPageActive(page) test_and_set_bit(PG_active, &(page)->flags)
>  
>  #define PageSlab(page)		test_bit(PG_slab, &(page)->flags)
>  #define SetPageSlab(page)	set_bit(PG_slab, &(page)->flags)
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)
From: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
To: Nick Piggin <npiggin@suse.de>
Cc: Linux Memory Management <linux-mm@kvack.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Hugh Dickins <hugh@veritas.com>, Andrew Morton <akpm@osdl.org>,
	Andrea Arcangeli <andrea@suse.de>,
	Linus Torvalds <torvalds@osdl.org>,
	David Miller <davem@davemloft.net>
Subject: Re: [patch 3/3] mm: PageActive no testset
Date: Wed, 18 Jan 2006 12:13:46 -0200	[thread overview]
Message-ID: <20060118141346.GB7048@dmt.cnet> (raw)
In-Reply-To: <20060118024139.10241.73020.sendpatchset@linux.site>

Hi Nick,

On Wed, Jan 18, 2006 at 11:40:58AM +0100, Nick Piggin wrote:
> PG_active is protected by zone->lru_lock, it does not need TestSet/TestClear
> operations.

page->flags bits (including PG_active and PG_lru bits) are touched by
several codepaths which do not hold zone->lru_lock. 

AFAICT zone->lru_lock guards access to the LRU list, and no more than
that.

Moreover, what about consistency of the rest of page->flags bits?

PPC for example implements test_and_set_bit() with:

	lwarx	reg, addr   (load and create reservation for 32-bit addr)
	or 	reg, BITOP_MASK(nr)	
	stwcx	reg, addr  (store word upon reservation validation, otherwise loop)

If you don't use atomic operations on page->flags, unrelated bits other
than that you're working with can have their updates lost, given that 
in reality page->flags is not protected by the lru_lock.

For example:

/*
 * shrink_list adds the number of reclaimed pages to sc->nr_reclaimed
 */
static int shrink_list(struct list_head *page_list, struct scan_control *sc)
{
...
                BUG_ON(PageActive(page));
...
activate_locked:
                SetPageActive(page);
                pgactivate++;
keep_locked:
                unlock_page(page);
keep:
                list_add(&page->lru, &ret_pages);
                BUG_ON(PageLRU(page));
}

And recently:

#ifdef CONFIG_MIGRATION
static inline void move_to_lru(struct page *page)
{
        list_del(&page->lru);
        if (PageActive(page)) {
                /*
                 * lru_cache_add_active checks that
                 * the PG_active bit is off.
                 */ 
                ClearPageActive(page);
                lru_cache_add_active(page);

Not relying on zone->lru_lock allows interesting optimizations
such as moving active/inactive pgflag bit setting from inside
__pagevec_lru_add/__pagevec_lru_add_active to the caller, and merging
the two.

Comments?


> Signed-off-by: Nick Piggin <npiggin@suse.de>
> 
> Index: linux-2.6/mm/vmscan.c
> ===================================================================
> --- linux-2.6.orig/mm/vmscan.c
> +++ linux-2.6/mm/vmscan.c
> @@ -997,8 +997,9 @@ refill_inactive_zone(struct zone *zone, 
>  		prefetchw_prev_lru_page(page, &l_inactive, flags);
>  		BUG_ON(PageLRU(page));
>  		SetPageLRU(page);
> -		if (!TestClearPageActive(page))
> -			BUG();
> +		BUG_ON(!PageActive(page));
> +		ClearPageActive(page);
> +
>  		list_move(&page->lru, &zone->inactive_list);
>  		pgmoved++;
>  		if (!pagevec_add(&pvec, page)) {
> Index: linux-2.6/mm/swap.c
> ===================================================================
> --- linux-2.6.orig/mm/swap.c
> +++ linux-2.6/mm/swap.c
> @@ -356,8 +356,8 @@ void __pagevec_lru_add_active(struct pag
>  		}
>  		BUG_ON(PageLRU(page));
>  		SetPageLRU(page);
> -		if (TestSetPageActive(page))
> -			BUG();
> +		BUG_ON(PageActive(page));
> +		SetPageActive(page);
>  		add_page_to_active_list(zone, page);
>  	}
>  	if (zone)
> Index: linux-2.6/include/linux/page-flags.h
> ===================================================================
> --- linux-2.6.orig/include/linux/page-flags.h
> +++ linux-2.6/include/linux/page-flags.h
> @@ -251,8 +251,6 @@ extern void __mod_page_state_offset(unsi
>  #define PageActive(page)	test_bit(PG_active, &(page)->flags)
>  #define SetPageActive(page)	set_bit(PG_active, &(page)->flags)
>  #define ClearPageActive(page)	clear_bit(PG_active, &(page)->flags)
> -#define TestClearPageActive(page) test_and_clear_bit(PG_active, &(page)->flags)
> -#define TestSetPageActive(page) test_and_set_bit(PG_active, &(page)->flags)
>  
>  #define PageSlab(page)		test_bit(PG_slab, &(page)->flags)
>  #define SetPageSlab(page)	set_bit(PG_slab, &(page)->flags)
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2006-01-18 16:13 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-01-18 10:40 [patch 0/4] mm: de-skew page refcount Nick Piggin
2006-01-18 10:40 ` Nick Piggin
2006-01-18 10:40 ` [patch 1/4] mm: page refcount use atomic primitives Nick Piggin
2006-01-18 10:40   ` Nick Piggin
2006-01-18 10:40 ` [patch 2/4] mm: PageLRU no testset Nick Piggin
2006-01-18 10:40   ` Nick Piggin
2006-01-19 17:48   ` Nikita Danilov
2006-01-19 18:10     ` Nick Piggin
2006-01-18 10:40 ` [patch 3/3] mm: PageActive " Nick Piggin
2006-01-18 10:40   ` Nick Piggin
2006-01-18 14:13   ` Marcelo Tosatti [this message]
2006-01-18 14:13     ` Marcelo Tosatti
2006-01-19 14:50     ` Nick Piggin
2006-01-19 14:50       ` Nick Piggin
2006-01-19 16:52       ` Marcelo Tosatti
2006-01-19 16:52         ` Marcelo Tosatti
2006-01-19 20:02         ` Nick Piggin
2006-01-19 20:02           ` Nick Piggin
2006-01-19 21:41           ` Marcelo Tosatti
2006-01-19 21:41             ` Marcelo Tosatti
2006-01-18 10:41 ` [patch 4/4] mm: less atomic ops Nick Piggin
2006-01-18 10:41   ` Nick Piggin
2006-01-18 16:38 ` [patch 0/4] mm: de-skew page refcount Linus Torvalds
2006-01-18 16:38   ` Linus Torvalds
2006-01-18 17:05   ` Nick Piggin
2006-01-18 17:05     ` Nick Piggin
2006-01-18 19:27     ` Linus Torvalds
2006-01-18 19:27       ` Linus Torvalds
2006-01-19 14:00       ` Nick Piggin
2006-01-19 14:00         ` Nick Piggin
2006-01-19 16:36         ` Linus Torvalds
2006-01-19 16:36           ` Linus Torvalds
2006-01-19 17:06           ` Nick Piggin
2006-01-19 17:06             ` Nick Piggin
2006-01-19 17:27             ` Linus Torvalds
2006-01-19 17:27               ` Linus Torvalds
2006-01-19 17:38               ` Nick Piggin
2006-01-19 17:38                 ` Nick Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20060118141346.GB7048@dmt.cnet \
    --to=marcelo.tosatti@cyclades.com \
    --cc=akpm@osdl.org \
    --cc=andrea@suse.de \
    --cc=davem@davemloft.net \
    --cc=hugh@veritas.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=npiggin@suse.de \
    --cc=torvalds@osdl.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.