From: Nick Piggin <npiggin@suse.de>
To: Linus Torvalds <torvalds@osdl.org>, Andrew Morton <akpm@osdl.org>
Cc: Nick Piggin <npiggin@suse.de>,
Linux Memory Management <linux-mm@kvack.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: [patch 5/6] mm: simplify vmscan vs release refcounting
Date: Thu, 19 Jan 2006 20:23:29 +0100 (CET) [thread overview]
Message-ID: <20060119192219.11913.30071.sendpatchset@linux.site> (raw)
In-Reply-To: <20060119192131.11913.27564.sendpatchset@linux.site>
The VM has an interesting race where a page refcount can drop to zero, but
it is still on the LRU lists for a short time. This was solved by testing
a 0->1 refcount transition when picking up pages from the LRU, and dropping
the refcount in that case.
Instead, use atomic_inc_not_zero to ensure we never pick up a 0 refcount
page from the LRU, thus a 0 refcount page will never have its refcount elevated
until it is allocated again. This simplifies the handling of the race
because vmscan now *never* touches the refcount of a released page.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Index: linux-2.6/include/linux/mm.h
===================================================================
--- linux-2.6.orig/include/linux/mm.h
+++ linux-2.6/include/linux/mm.h
@@ -286,32 +286,26 @@ struct page {
*
* Also, many kernel routines increase the page count before a critical
* routine so they can be sure the page doesn't go away from under them.
- *
- * Since 2.6.6 (approx), a free page has ->_count = -1. This is so that we
- * can use atomic_add_negative(-1, page->_count) to detect when the page
- * becomes free and so that we can also use atomic_inc_and_test to atomically
- * detect when we just tried to grab a ref on a page which some other CPU has
- * already deemed to be freeable.
- *
- * NO code should make assumptions about this internal detail! Use the provided
- * macros which retain the old rules: page_count(page) == 0 is a free page.
*/
/*
* Drop a ref, return true if the logical refcount fell to zero (the page has
* no users)
*/
-#define put_page_testzero(p) \
- ({ \
- BUG_ON(page_count(p) == 0); \
- atomic_add_negative(-1, &(p)->_count); \
- })
+static inline int put_page_testzero(struct page *page)
+{
+ BUG_ON(atomic_read(&page->_count) == -1);
+ return atomic_dec_and_test(&page->_count);
+}
/*
- * Grab a ref, return true if the page previously had a logical refcount of
- * zero. ie: returns true if we just grabbed an already-deemed-to-be-free page
+ * Try to grab a ref unless the page has a refcount of zero, return false if
+ * that is the case.
*/
-#define get_page_testone(p) atomic_inc_and_test(&(p)->_count)
+static inline int get_page_unless_zero(struct page *page)
+{
+ return atomic_add_unless(&page->_count, 1, -1);
+}
#define set_page_count(p,v) atomic_set(&(p)->_count, (v) - 1)
#define __put_page(p) atomic_dec(&(p)->_count)
Index: linux-2.6/mm/vmscan.c
===================================================================
--- linux-2.6.orig/mm/vmscan.c
+++ linux-2.6/mm/vmscan.c
@@ -821,29 +821,26 @@ static int isolate_lru_pages(int nr_to_s
int scan = 0;
while (scan++ < nr_to_scan && !list_empty(src)) {
+ struct list_head *target;
page = lru_to_page(src);
prefetchw_prev_lru_page(page, src, flags);
BUG_ON(!PageLRU(page));
list_del(&page->lru);
- if (unlikely(get_page_testone(page))) {
+ target = src;
+ if (likely(get_page_unless_zero(page))) {
/*
- * It is being freed elsewhere
+ * Be careful not to clear PageLRU until after we're
+ * sure the page is not being freed elsewhere -- the
+ * page release code relies on it.
*/
- __put_page(page);
- list_add(&page->lru, src);
- continue;
- }
+ ClearPageLRU(page);
+ target = dst;
+ nr_taken++;
+ } /* else it is being freed elsewhere */
- /*
- * Be careful not to clear PageLRU until after we're sure
- * the page is not being freed elsewhere -- the page release
- * code relies on it.
- */
- ClearPageLRU(page);
- list_add(&page->lru, dst);
- nr_taken++;
+ list_add(&page->lru, target);
}
*scanned = scan;
WARNING: multiple messages have this Message-ID (diff)
From: Nick Piggin <npiggin@suse.de>
To: Linus Torvalds <torvalds@osdl.org>, Andrew Morton <akpm@osdl.org>
Cc: Nick Piggin <npiggin@suse.de>,
Linux Memory Management <linux-mm@kvack.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: [patch 5/6] mm: simplify vmscan vs release refcounting
Date: Thu, 19 Jan 2006 20:23:29 +0100 (CET) [thread overview]
Message-ID: <20060119192219.11913.30071.sendpatchset@linux.site> (raw)
In-Reply-To: <20060119192131.11913.27564.sendpatchset@linux.site>
The VM has an interesting race where a page refcount can drop to zero, but
it is still on the LRU lists for a short time. This was solved by testing
a 0->1 refcount transition when picking up pages from the LRU, and dropping
the refcount in that case.
Instead, use atomic_inc_not_zero to ensure we never pick up a 0 refcount
page from the LRU, thus a 0 refcount page will never have its refcount elevated
until it is allocated again. This simplifies the handling of the race
because vmscan now *never* touches the refcount of a released page.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Index: linux-2.6/include/linux/mm.h
===================================================================
--- linux-2.6.orig/include/linux/mm.h
+++ linux-2.6/include/linux/mm.h
@@ -286,32 +286,26 @@ struct page {
*
* Also, many kernel routines increase the page count before a critical
* routine so they can be sure the page doesn't go away from under them.
- *
- * Since 2.6.6 (approx), a free page has ->_count = -1. This is so that we
- * can use atomic_add_negative(-1, page->_count) to detect when the page
- * becomes free and so that we can also use atomic_inc_and_test to atomically
- * detect when we just tried to grab a ref on a page which some other CPU has
- * already deemed to be freeable.
- *
- * NO code should make assumptions about this internal detail! Use the provided
- * macros which retain the old rules: page_count(page) == 0 is a free page.
*/
/*
* Drop a ref, return true if the logical refcount fell to zero (the page has
* no users)
*/
-#define put_page_testzero(p) \
- ({ \
- BUG_ON(page_count(p) == 0); \
- atomic_add_negative(-1, &(p)->_count); \
- })
+static inline int put_page_testzero(struct page *page)
+{
+ BUG_ON(atomic_read(&page->_count) == -1);
+ return atomic_dec_and_test(&page->_count);
+}
/*
- * Grab a ref, return true if the page previously had a logical refcount of
- * zero. ie: returns true if we just grabbed an already-deemed-to-be-free page
+ * Try to grab a ref unless the page has a refcount of zero, return false if
+ * that is the case.
*/
-#define get_page_testone(p) atomic_inc_and_test(&(p)->_count)
+static inline int get_page_unless_zero(struct page *page)
+{
+ return atomic_add_unless(&page->_count, 1, -1);
+}
#define set_page_count(p,v) atomic_set(&(p)->_count, (v) - 1)
#define __put_page(p) atomic_dec(&(p)->_count)
Index: linux-2.6/mm/vmscan.c
===================================================================
--- linux-2.6.orig/mm/vmscan.c
+++ linux-2.6/mm/vmscan.c
@@ -821,29 +821,26 @@ static int isolate_lru_pages(int nr_to_s
int scan = 0;
while (scan++ < nr_to_scan && !list_empty(src)) {
+ struct list_head *target;
page = lru_to_page(src);
prefetchw_prev_lru_page(page, src, flags);
BUG_ON(!PageLRU(page));
list_del(&page->lru);
- if (unlikely(get_page_testone(page))) {
+ target = src;
+ if (likely(get_page_unless_zero(page))) {
/*
- * It is being freed elsewhere
+ * Be careful not to clear PageLRU until after we're
+ * sure the page is not being freed elsewhere -- the
+ * page release code relies on it.
*/
- __put_page(page);
- list_add(&page->lru, src);
- continue;
- }
+ ClearPageLRU(page);
+ target = dst;
+ nr_taken++;
+ } /* else it is being freed elsewhere */
- /*
- * Be careful not to clear PageLRU until after we're sure
- * the page is not being freed elsewhere -- the page release
- * code relies on it.
- */
- ClearPageLRU(page);
- list_add(&page->lru, dst);
- nr_taken++;
+ list_add(&page->lru, target);
}
*scanned = scan;
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2006-01-19 19:23 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-01-19 19:22 [patch 0/6] mm: optimisations and page ref simplifications Nick Piggin
2006-01-19 19:22 ` Nick Piggin
2006-01-19 19:22 ` [patch 1/6] mm: never ClearPageLRU released pages Nick Piggin
2006-01-19 19:22 ` Nick Piggin
2006-01-19 19:23 ` [patch 2/6] mm: PageLRU no testset Nick Piggin
2006-01-19 19:23 ` Nick Piggin
2006-01-19 19:23 ` [patch 3/6] mm: PageActive " Nick Piggin
2006-01-19 19:23 ` Nick Piggin
2006-01-19 19:23 ` [patch 4/6] mm: less atomic ops Nick Piggin
2006-01-19 19:23 ` Nick Piggin
2006-01-19 19:23 ` Nick Piggin [this message]
2006-01-19 19:23 ` [patch 5/6] mm: simplify vmscan vs release refcounting Nick Piggin
2006-01-19 19:35 ` Linus Torvalds
2006-01-19 19:35 ` Linus Torvalds
2006-01-19 19:53 ` Nick Piggin
2006-01-19 19:53 ` Nick Piggin
2006-01-19 19:23 ` [patch 6/6] mm: de-skew page refcounting Nick Piggin
2006-01-19 19:23 ` Nick Piggin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20060119192219.11913.30071.sendpatchset@linux.site \
--to=npiggin@suse.de \
--cc=akpm@osdl.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=torvalds@osdl.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.