From: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
To: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: linux-mm@kvack.org, akpm@linux-foundation.org, mel@csn.ul.ie,
clameter@sgi.com, riel@redhat.com, balbir@linux.vnet.ibm.com,
andrea@suse.de, eric.whitney@hp.com, npiggin@suse.de
Subject: Re: [PATCH/RFC 3/14] Reclaim Scalability: move isolate_lru_page() to vmscan.c
Date: Mon, 17 Sep 2007 10:11:27 -0400
Message-ID: <1190038287.5460.30.camel@localhost>
In-Reply-To: <1189805699.5826.19.camel@lappy>
On Fri, 2007-09-14 at 23:34 +0200, Peter Zijlstra wrote:
> On Fri, 2007-09-14 at 16:54 -0400, Lee Schermerhorn wrote:
>
> > Note that we now have '__isolate_lru_page()', that does
> > something quite different, visible outside of vmscan.c
> > for use with memory controller. Methinks we need to
> > rationalize these names/purposes. --lts
> >
>
> Actually it comes from lumpy reclaim, and does something very similar to
> what this one does.
Sorry. My statement was a bit ambiguous. I meant that the visibility
of __isolate_lru_page() outside of vmscan.c comes about from the mem
controller patches. Lumpy reclaim did add the "isolation mode" [active,
inactive, both].
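
To make the "isolation mode" idea concrete, here is a minimal user-space sketch of a mode-filtered isolate. It is not the kernel code: struct toy_page, toy_isolate_lru_page(), the numeric mode values and the return codes are all assumptions chosen for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Isolation modes; the numeric values are an assumption for this sketch. */
enum { ISOLATE_INACTIVE = 0, ISOLATE_ACTIVE = 1, ISOLATE_BOTH = 2 };

/* Hypothetical stand-in for struct page: only what the sketch needs. */
struct toy_page {
	bool lru;	/* models PageLRU() */
	bool active;	/* models PageActive() */
	int count;	/* models the page reference count */
};

/*
 * Isolate a page if it sits on an LRU list that matches 'mode'.
 * Returns 0 on success (reference taken, LRU flag cleared),
 * -EINVAL if the page is not on the LRU or is on the wrong list,
 * -EBUSY if its reference count has already dropped to zero.
 */
int toy_isolate_lru_page(struct toy_page *page, int mode)
{
	assert(page != NULL);

	if (!page->lru)
		return -EINVAL;

	/* With ISOLATE_BOTH either list is acceptable. */
	if (mode != ISOLATE_BOTH && page->active != (mode == ISOLATE_ACTIVE))
		return -EINVAL;

	/* A page whose count is zero is on its way to being freed. */
	if (page->count == 0)
		return -EBUSY;

	page->count++;
	page->lru = false;
	return 0;
}
```

In this model a caller that scans only the inactive list passes ISOLATE_INACTIVE and skips active pages, while a caller like the isolate_lru_page() sketched in the quote below would pass ISOLATE_BOTH.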
> When one looks at the mainline version one could
> write:
>
> int isolate_lru_page(struct page *page, struct list_head *pagelist)
> {
> 	int ret = -EBUSY;
>
> 	if (PageLRU(page)) {
> 		struct zone *zone = page_zone(page);
>
> 		spin_lock_irq(&zone->lru_lock);
> 		ret = __isolate_lru_page(page, ISOLATE_BOTH);
> 		if (!ret) {
> 			__dec_zone_state(zone, PageActive(page)
> 				? NR_ACTIVE : NR_INACTIVE);
> 			list_move_tail(&page->lru, pagelist);
> 		}
> 		spin_unlock_irq(&zone->lru_lock);
> 	}
>
> 	return ret;
> }
>
In its initial form, yes. Later [in the first noreclaim patch] you'll
see that I hacked both isolate_lru_page() and __isolate_lru_page() to
handle non-reclaimable pages. The former to recognize
non-reclaimable pages and isolate them from the noreclaim list; the
latter to allow isolation of non-reclaimable pages only when scanning
the active list, but not during lumpy reclaim.
I had to allow __isolate_lru_page() to accept non-reclaimable pages from
the active list in order to splice the noreclaim list back there when we
want to scan it--as you mentioned to me was discussed at the vm summit.
I'm not very happy with the result, and think we need to revisit how we
scan the noreclaim list for various conditions. I plan to fork off a
separate discussion on this point, real soon now.
> Obviously the container stuff somewhat complicates matters in -mm.
>
> >  /*
> > - * Isolate one page from the LRU lists. If successful put it onto
> > - * the indicated list with elevated page count.
> > - *
> > - * Result:
> > - *  -EBUSY: page not on LRU list
> > - *  0: page removed from LRU list and added to the specified list.
> > - */
> > -int isolate_lru_page(struct page *page, struct list_head *pagelist)
> > -{
> > -	int ret = -EBUSY;
> > -
> > -	if (PageLRU(page)) {
> > -		struct zone *zone = page_zone(page);
> > -
> > -		spin_lock_irq(&zone->lru_lock);
> > -		if (PageLRU(page) && get_page_unless_zero(page)) {
> > -			ret = 0;
> > -			ClearPageLRU(page);
> > -			if (PageActive(page))
> > -				del_page_from_active_list(zone, page);
> > -			else
> > -				del_page_from_inactive_list(zone, page);
> > -			list_add_tail(&page->lru, pagelist);
> > -		}
> > -		spin_unlock_irq(&zone->lru_lock);
> > -	}
> > -	return ret;
> > -}
>
> A remarkable change is the disappearance of get_page_unless_zero() in the
> new version.
Good catch! What happened here is this:
The original version of isolate_lru_page() that Nick's patch moved had a
get_page() in the "if (PageLRU(page))" block--no get_page_unless_zero().
This was fine for Christoph's migration usage, because it was always
called in task context, holding the mm semaphore. Mel and Kame-san want
to use migration for defragmentation and hotplug from outside task
context, so one or the other of them [not sure] removed the get_page()
and added the get_page_unless_zero() into the if condition--around
mid-June. Apparently, during resolution of a forced patch conflict, I
managed to drop the get_page(), but not pick up the
get_page_unless_zero(). So much for following "established protocol for
handling pages on the LRU lists", huh?
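
For anyone following along, the difference between the two calls can be modelled in user space with C11 atomics. This is only a sketch under assumptions: struct counted_page, toy_get_page() and toy_get_page_unless_zero() are hypothetical names, and the kernel's real implementation differs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: just a reference count. */
struct counted_page {
	atomic_int count;
};

/*
 * Models get_page(): an unconditional increment. Only safe when the
 * caller already knows the page cannot be freed underneath it, e.g.
 * because it holds a reference through a mapping.
 */
void toy_get_page(struct counted_page *page)
{
	assert(page != NULL);
	atomic_fetch_add(&page->count, 1);
}

/*
 * Models get_page_unless_zero(): take a reference only if the count
 * is still non-zero. A zero count means the page is being freed, so
 * the caller must leave it alone.
 */
bool toy_get_page_unless_zero(struct counted_page *page)
{
	int old;

	assert(page != NULL);
	old = atomic_load(&page->count);
	while (old != 0) {
		/* On failure 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&page->count, &old, old + 1))
			return true;
	}
	return false;
}
```

A caller outside task context has no mapping pinning the page, so it needs the second form: if the call returns false, the page is skipped rather than resurrected with a bogus reference.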
<snip new, botched version>
Below is a patch to add back the get_page_unless_zero(). I'll roll this
into the move_and_rework... patch for the next posting, but in the
meantime, if anyone wants to try these, here's a quick fix.
I just tested with this and my tests ran much better. I still managed
to push my system into OOM during mbind() migration, but I am repeatedly
locking and unlocking 16G, sometimes in 8G chunks, of an 18G anon
segment to force swapping and such. Another test is creating 256MB anon
and private file-backed segments, binding them down and migrating them
around the platform. Eventually, this second test dies with OOM because
of CONSTRAINT_MEMORY_POLICY--insufficient memory on the target node.
The noreclaim statistics seemed to be behaving better as well, but once
the memtoy/mlock test went OOM with locked pages, quite a few pages
remained non-reclaimable after I killed off the other tests. Still a
lot of work to do on reviving non-reclaimable pages.
Thanks,
Lee
======================
PATCH move and rework isolate_lru_page fix
I accidentally dropped the recently added "get_page_unless_zero(page)"
from isolate_lru_page() during resolution of a forced patch
conflict.
Put it back!!!
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
mm/vmscan.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Index: Linux/mm/vmscan.c
===================================================================
--- Linux.orig/mm/vmscan.c 2007-09-17 09:06:01.000000000 -0400
+++ Linux/mm/vmscan.c 2007-09-17 09:07:37.000000000 -0400
@@ -838,7 +838,7 @@ int isolate_lru_page(struct page *page)
 		struct zone *zone = page_zone(page);
 
 		spin_lock_irq(&zone->lru_lock);
-		if (PageLRU(page)) {
+		if (PageLRU(page) && get_page_unless_zero(page)) {
 			ret = 0;
 			ClearPageLRU(page);
 			if (PageActive(page))