* Page Migration patchsets overview
From: Christoph Lameter @ 2006-04-29 3:22 UTC (permalink / raw)
To: akpm
Cc: linux-mm, KAMEZAWA Hiroyuki, Lee Schermerhorn, Christoph Lameter,
Hugh Dickins
The following are three patchsets for page migration.

The first patchset is a series of cleanups that also includes the proper fix for the PageDirty problem.

The second patchset implements read/write migration entries. This removes the dependency on the swap code (page migration currently does not work if no swap volume is defined) and enables additional features: the speed of page migration increases by 20%, page migration can now preserve the write-enable bit of the ptes, useless COW faults no longer occur, and the kernel can be compiled without SWAP support while page migration still works.
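
For orientation: a migration entry is a swap-style pte that records the pfn of the page under migration plus whether the original pte was writable. A minimal sketch of how one is installed (the real definitions come with [PATCH 1/3] below):

	/*
	 * Sketch only, not part of the patches themselves: while the page
	 * is locked, its pte is replaced by a migration entry that
	 * preserves the write permission (see the mm/rmap.c hunk of
	 * [PATCH 1/3] for the actual code).
	 */
	swp_entry_t entry = make_migration_entry(page, pte_write(pteval));
	set_pte_at(mm, address, pte, swp_entry_to_pte(entry));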
The third patchset contains two improvements based on the read/write migration entries. First, we stop incrementing/decrementing rss during migration. Second, we use migration entries for file-backed pages. This preserves file ptes during migration and allows repeated migration of processes. The old code removed those ptes, and people were a bit surprised when a process suddenly got very small.
The patchsets are against 2.6.17-rc3. There seem to be some leftover bits from the earlier patches (the removal of the page migration pagecache checks?) in Andrew's tree.
* [PATCH 1/7] PM cleanup: Rename "ignrefs" to "migration"
From: Christoph Lameter @ 2006-04-29 3:22 UTC (permalink / raw)
To: akpm
Cc: linux-mm, Hugh Dickins, Lee Schermerhorn, Christoph Lameter,
KAMEZAWA Hiroyuki
"migration" is a better name than "ignrefs", since the parameter is only used by page migration.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
---
mm/rmap.c | 18 +++++++++---------
1 files changed, 9 insertions(+), 9 deletions(-)
diff -puN mm/rmap.c~swapless-v2-try_to_unmap-rename-ignrefs-to-migration mm/rmap.c
--- devel/mm/rmap.c~swapless-v2-try_to_unmap-rename-ignrefs-to-migration 2006-04-13 17:09:50.000000000 -0700
+++ devel-akpm/mm/rmap.c 2006-04-13 17:10:01.000000000 -0700
@@ -578,7 +578,7 @@ void page_remove_rmap(struct page *page)
* repeatedly from either try_to_unmap_anon or try_to_unmap_file.
*/
static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
- int ignore_refs)
+ int migration)
{
struct mm_struct *mm = vma->vm_mm;
unsigned long address;
@@ -602,7 +602,7 @@ static int try_to_unmap_one(struct page
*/
if ((vma->vm_flags & VM_LOCKED) ||
(ptep_clear_flush_young(vma, address, pte)
- && !ignore_refs)) {
+ && !migration)) {
ret = SWAP_FAIL;
goto out_unmap;
}
@@ -736,7 +736,7 @@ static void try_to_unmap_cluster(unsigne
pte_unmap_unlock(pte - 1, ptl);
}
-static int try_to_unmap_anon(struct page *page, int ignore_refs)
+static int try_to_unmap_anon(struct page *page, int migration)
{
struct anon_vma *anon_vma;
struct vm_area_struct *vma;
@@ -747,7 +747,7 @@ static int try_to_unmap_anon(struct page
return ret;
list_for_each_entry(vma, &anon_vma->head, anon_vma_node) {
- ret = try_to_unmap_one(page, vma, ignore_refs);
+ ret = try_to_unmap_one(page, vma, migration);
if (ret == SWAP_FAIL || !page_mapped(page))
break;
}
@@ -764,7 +764,7 @@ static int try_to_unmap_anon(struct page
*
* This function is only called from try_to_unmap for object-based pages.
*/
-static int try_to_unmap_file(struct page *page, int ignore_refs)
+static int try_to_unmap_file(struct page *page, int migration)
{
struct address_space *mapping = page->mapping;
pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
@@ -778,7 +778,7 @@ static int try_to_unmap_file(struct page
spin_lock(&mapping->i_mmap_lock);
vma_prio_tree_foreach(vma, &iter, &mapping->i_mmap, pgoff, pgoff) {
- ret = try_to_unmap_one(page, vma, ignore_refs);
+ ret = try_to_unmap_one(page, vma, migration);
if (ret == SWAP_FAIL || !page_mapped(page))
goto out;
}
@@ -863,16 +863,16 @@ out:
* SWAP_AGAIN - we missed a mapping, try again later
* SWAP_FAIL - the page is unswappable
*/
-int try_to_unmap(struct page *page, int ignore_refs)
+int try_to_unmap(struct page *page, int migration)
{
int ret;
BUG_ON(!PageLocked(page));
if (PageAnon(page))
- ret = try_to_unmap_anon(page, ignore_refs);
+ ret = try_to_unmap_anon(page, migration);
else
- ret = try_to_unmap_file(page, ignore_refs);
+ ret = try_to_unmap_file(page, migration);
if (!page_mapped(page))
ret = SWAP_SUCCESS;
_
* [PATCH 2/7] PM cleanup: Group functions
From: Christoph Lameter @ 2006-04-29 3:22 UTC (permalink / raw)
To: akpm
Cc: linux-mm, KAMEZAWA Hiroyuki, Lee Schermerhorn, Christoph Lameter,
Hugh Dickins
page migration: Reorder functions in migrate.c
Group all migration functions for struct address_space_operations
together.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Index: linux-2.6.17-rc3/mm/migrate.c
===================================================================
--- linux-2.6.17-rc3.orig/mm/migrate.c 2006-04-28 17:11:42.779439413 -0700
+++ linux-2.6.17-rc3/mm/migrate.c 2006-04-28 17:24:03.935627108 -0700
@@ -120,15 +120,6 @@
}
/*
- * Non migratable page
- */
-int fail_migrate_page(struct page *newpage, struct page *page)
-{
- return -EIO;
-}
-EXPORT_SYMBOL(fail_migrate_page);
-
-/*
* swapout a single page
* page is locked upon entry, unlocked on exit
*/
@@ -297,6 +288,17 @@
}
EXPORT_SYMBOL(migrate_page_copy);
+/************************************************************
+ * Migration functions
+ ***********************************************************/
+
+/* Always fail migration. Used for mappings that are not movable */
+int fail_migrate_page(struct page *newpage, struct page *page)
+{
+ return -EIO;
+}
+EXPORT_SYMBOL(fail_migrate_page);
+
/*
* Common logic to directly migrate a single page suitable for
* pages that do not use PagePrivate.
@@ -330,6 +332,67 @@
EXPORT_SYMBOL(migrate_page);
/*
+ * Migration function for pages with buffers. This function can only be used
+ * if the underlying filesystem guarantees that no other references to "page"
+ * exist.
+ */
+int buffer_migrate_page(struct page *newpage, struct page *page)
+{
+ struct address_space *mapping = page->mapping;
+ struct buffer_head *bh, *head;
+ int rc;
+
+ if (!mapping)
+ return -EAGAIN;
+
+ if (!page_has_buffers(page))
+ return migrate_page(newpage, page);
+
+ head = page_buffers(page);
+
+ rc = migrate_page_remove_references(newpage, page, 3);
+
+ if (rc)
+ return rc;
+
+ bh = head;
+ do {
+ get_bh(bh);
+ lock_buffer(bh);
+ bh = bh->b_this_page;
+
+ } while (bh != head);
+
+ ClearPagePrivate(page);
+ set_page_private(newpage, page_private(page));
+ set_page_private(page, 0);
+ put_page(page);
+ get_page(newpage);
+
+ bh = head;
+ do {
+ set_bh_page(bh, newpage, bh_offset(bh));
+ bh = bh->b_this_page;
+
+ } while (bh != head);
+
+ SetPagePrivate(newpage);
+
+ migrate_page_copy(newpage, page);
+
+ bh = head;
+ do {
+ unlock_buffer(bh);
+ put_bh(bh);
+ bh = bh->b_this_page;
+
+ } while (bh != head);
+
+ return 0;
+}
+EXPORT_SYMBOL(buffer_migrate_page);
+
+/*
* migrate_pages
*
* Two lists are passed to this function. The first list
@@ -529,67 +592,6 @@
}
/*
- * Migration function for pages with buffers. This function can only be used
- * if the underlying filesystem guarantees that no other references to "page"
- * exist.
- */
-int buffer_migrate_page(struct page *newpage, struct page *page)
-{
- struct address_space *mapping = page->mapping;
- struct buffer_head *bh, *head;
- int rc;
-
- if (!mapping)
- return -EAGAIN;
-
- if (!page_has_buffers(page))
- return migrate_page(newpage, page);
-
- head = page_buffers(page);
-
- rc = migrate_page_remove_references(newpage, page, 3);
-
- if (rc)
- return rc;
-
- bh = head;
- do {
- get_bh(bh);
- lock_buffer(bh);
- bh = bh->b_this_page;
-
- } while (bh != head);
-
- ClearPagePrivate(page);
- set_page_private(newpage, page_private(page));
- set_page_private(page, 0);
- put_page(page);
- get_page(newpage);
-
- bh = head;
- do {
- set_bh_page(bh, newpage, bh_offset(bh));
- bh = bh->b_this_page;
-
- } while (bh != head);
-
- SetPagePrivate(newpage);
-
- migrate_page_copy(newpage, page);
-
- bh = head;
- do {
- unlock_buffer(bh);
- put_bh(bh);
- bh = bh->b_this_page;
-
- } while (bh != head);
-
- return 0;
-}
-EXPORT_SYMBOL(buffer_migrate_page);
-
-/*
* Migrate the list 'pagelist' of pages to a certain destination.
*
* Specify destination with either non-NULL vma or dest_node >= 0
* [PATCH 3/7] PM cleanup: Remove useless definitions
From: Christoph Lameter @ 2006-04-29 3:23 UTC (permalink / raw)
To: akpm
Cc: linux-mm, Hugh Dickins, Lee Schermerhorn, Christoph Lameter,
KAMEZAWA Hiroyuki
page migration: Remove unnecessarily exported functions
Remove the exports for migrate_page_remove_references() and
migrate_page_copy(), which are unlikely to be used directly by
filesystems implementing migration. The exports were useful while
buffer_migrate_page() lived in fs/buffer.c, but it has now been
moved to migrate.c in the migration reorg.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Index: linux-2.6.17-rc3/mm/migrate.c
===================================================================
--- linux-2.6.17-rc3.orig/mm/migrate.c 2006-04-28 17:24:03.935627108 -0700
+++ linux-2.6.17-rc3/mm/migrate.c 2006-04-28 17:26:13.866044400 -0700
@@ -169,7 +169,7 @@
* Remove references for a page and establish the new page with the correct
* basic settings to be able to stop accesses to the page.
*/
-int migrate_page_remove_references(struct page *newpage,
+static int migrate_page_remove_references(struct page *newpage,
struct page *page, int nr_refs)
{
struct address_space *mapping = page_mapping(page);
@@ -246,12 +246,11 @@
return 0;
}
-EXPORT_SYMBOL(migrate_page_remove_references);
/*
* Copy the page to its new location
*/
-void migrate_page_copy(struct page *newpage, struct page *page)
+static void migrate_page_copy(struct page *newpage, struct page *page)
{
copy_highpage(newpage, page);
@@ -286,7 +285,6 @@
if (PageWriteback(newpage))
end_page_writeback(newpage);
}
-EXPORT_SYMBOL(migrate_page_copy);
/************************************************************
* Migration functions
Index: linux-2.6.17-rc3/include/linux/migrate.h
===================================================================
--- linux-2.6.17-rc3.orig/include/linux/migrate.h 2006-04-26 19:19:25.000000000 -0700
+++ linux-2.6.17-rc3/include/linux/migrate.h 2006-04-28 17:26:13.867020902 -0700
@@ -8,8 +8,6 @@
extern int isolate_lru_page(struct page *p, struct list_head *pagelist);
extern int putback_lru_pages(struct list_head *l);
extern int migrate_page(struct page *, struct page *);
-extern void migrate_page_copy(struct page *, struct page *);
-extern int migrate_page_remove_references(struct page *, struct page *, int);
extern int migrate_pages(struct list_head *l, struct list_head *t,
struct list_head *moved, struct list_head *failed);
extern int migrate_pages_to(struct list_head *pagelist,
* [PATCH 4/7] PM cleanup: Drop nr_refs in remove_references()
From: Christoph Lameter @ 2006-04-29 3:23 UTC (permalink / raw)
To: akpm
Cc: linux-mm, KAMEZAWA Hiroyuki, Lee Schermerhorn, Christoph Lameter,
Hugh Dickins
page migration: Drop nr_refs parameter from migrate_page_remove_references()
The nr_refs parameter is not really useful, since the number of remaining
references is always:
1 for anonymous pages without a mapping
2 for pages with a mapping
3 for pages with a mapping and PagePrivate set.
Remove the early check for the number of references since we are
checking page_mapcount() earlier. Ultimately only the refcount
matters after the tree_lock has been obtained.
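
A hypothetical helper, shown only to make this invariant explicit (it is not part of the patch):

	/*
	 * Illustration only: the migration path holds one reference
	 * itself, the radix tree holds another for pages with a mapping,
	 * and the buffer heads hold a third via PagePrivate.
	 */
	static inline int expected_page_refs(struct page *page)
	{
		return 1 + !!page_mapping(page) + !!PagePrivate(page);
	}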
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Index: linux-2.6.17-rc3/mm/migrate.c
===================================================================
--- linux-2.6.17-rc3.orig/mm/migrate.c 2006-04-28 17:26:13.866044400 -0700
+++ linux-2.6.17-rc3/mm/migrate.c 2006-04-28 17:31:10.325193799 -0700
@@ -168,19 +168,19 @@
/*
* Remove references for a page and establish the new page with the correct
* basic settings to be able to stop accesses to the page.
+ *
+ * The number of remaining references must be:
+ * 1 for anonymous pages without a mapping
+ * 2 for pages with a mapping
+ * 3 for pages with a mapping and PagePrivate set.
*/
static int migrate_page_remove_references(struct page *newpage,
- struct page *page, int nr_refs)
+ struct page *page)
{
struct address_space *mapping = page_mapping(page);
struct page **radix_pointer;
- /*
- * Avoid doing any of the following work if the page count
- * indicates that the page is in use or truncate has removed
- * the page.
- */
- if (!mapping || page_mapcount(page) + nr_refs != page_count(page))
+ if (!mapping)
return -EAGAIN;
/*
@@ -218,7 +218,8 @@
&mapping->page_tree,
page_index(page));
- if (!page_mapping(page) || page_count(page) != nr_refs ||
+ if (!page_mapping(page) ||
+ page_count(page) != 2 + !!PagePrivate(page) ||
*radix_pointer != page) {
write_unlock_irq(&mapping->tree_lock);
return -EAGAIN;
@@ -309,7 +310,7 @@
BUG_ON(PageWriteback(page)); /* Writeback must be complete */
- rc = migrate_page_remove_references(newpage, page, 2);
+ rc = migrate_page_remove_references(newpage, page);
if (rc)
return rc;
@@ -348,7 +349,7 @@
head = page_buffers(page);
- rc = migrate_page_remove_references(newpage, page, 3);
+ rc = migrate_page_remove_references(newpage, page);
if (rc)
return rc;
* [PATCH 5/7] PM cleanup: Extract try_to_unmap from migration functions
From: Christoph Lameter @ 2006-04-29 3:23 UTC (permalink / raw)
To: akpm
Cc: linux-mm, Hugh Dickins, Lee Schermerhorn, Christoph Lameter,
KAMEZAWA Hiroyuki
page migration: Extract try_to_unmap and rename remove_references -> move_mapping
try_to_unmap() may significantly change the page state, for example by
setting the dirty bit. It is therefore best to unmap in migrate_pages()
before calling any migration function.

migrate_page_remove_references() then only moves the new page into the
place of the old page in the mapping. Rename the function to
migrate_page_move_mapping().

This allows us to get rid of the special unmapping in the
fallback path.
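
The resulting control flow in migrate_pages() looks roughly like this (simplified sketch based on the diff below; locking and error paths abbreviated):

	lock_page(page);
	if (try_to_unmap(page, 1) == SWAP_FAIL)
		rc = -EPERM;	/* a vma has VM_LOCKED set */
	else if (page_mapped(page))
		rc = -EAGAIN;	/* could not remove all mappings */
	else
		/* all ptes are gone; safe to call the migration function */
		rc = mapping->a_ops->migratepage(newpage, page);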
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Index: linux-2.6.17-rc3/mm/migrate.c
===================================================================
--- linux-2.6.17-rc3.orig/mm/migrate.c 2006-04-28 17:31:10.325193799 -0700
+++ linux-2.6.17-rc3/mm/migrate.c 2006-04-28 17:42:24.342949272 -0700
@@ -166,15 +166,14 @@
}
/*
- * Remove references for a page and establish the new page with the correct
- * basic settings to be able to stop accesses to the page.
+ * Remove or replace the page in the mapping.
*
* The number of remaining references must be:
* 1 for anonymous pages without a mapping
* 2 for pages with a mapping
* 3 for pages with a mapping and PagePrivate set.
*/
-static int migrate_page_remove_references(struct page *newpage,
+static int migrate_page_move_mapping(struct page *newpage,
struct page *page)
{
struct address_space *mapping = page_mapping(page);
@@ -183,35 +182,6 @@
if (!mapping)
return -EAGAIN;
- /*
- * Establish swap ptes for anonymous pages or destroy pte
- * maps for files.
- *
- * In order to reestablish file backed mappings the fault handlers
- * will take the radix tree_lock which may then be used to stop
- * processses from accessing this page until the new page is ready.
- *
- * A process accessing via a swap pte (an anonymous page) will take a
- * page_lock on the old page which will block the process until the
- * migration attempt is complete. At that time the PageSwapCache bit
- * will be examined. If the page was migrated then the PageSwapCache
- * bit will be clear and the operation to retrieve the page will be
- * retried which will find the new page in the radix tree. Then a new
- * direct mapping may be generated based on the radix tree contents.
- *
- * If the page was not migrated then the PageSwapCache bit
- * is still set and the operation may continue.
- */
- if (try_to_unmap(page, 1) == SWAP_FAIL)
- /* A vma has VM_LOCKED set -> permanent failure */
- return -EPERM;
-
- /*
- * Give up if we were unable to remove all mappings.
- */
- if (page_mapcount(page))
- return -EAGAIN;
-
write_lock_irq(&mapping->tree_lock);
radix_pointer = (struct page **)radix_tree_lookup_slot(
@@ -310,7 +280,7 @@
BUG_ON(PageWriteback(page)); /* Writeback must be complete */
- rc = migrate_page_remove_references(newpage, page);
+ rc = migrate_page_move_mapping(newpage, page);
if (rc)
return rc;
@@ -349,7 +319,7 @@
head = page_buffers(page);
- rc = migrate_page_remove_references(newpage, page);
+ rc = migrate_page_move_mapping(newpage, page);
if (rc)
return rc;
@@ -482,6 +452,33 @@
lock_page(newpage);
/*
+ * Establish swap ptes for anonymous pages or destroy pte
+ * maps for files.
+ *
+ * In order to reestablish file backed mappings the fault handlers
+ * will take the radix tree_lock which may then be used to stop
+ * processses from accessing this page until the new page is ready.
+ *
+ * A process accessing via a swap pte (an anonymous page) will take a
+ * page_lock on the old page which will block the process until the
+ * migration attempt is complete. At that time the PageSwapCache bit
+ * will be examined. If the page was migrated then the PageSwapCache
+ * bit will be clear and the operation to retrieve the page will be
+ * retried which will find the new page in the radix tree. Then a new
+ * direct mapping may be generated based on the radix tree contents.
+ *
+ * If the page was not migrated then the PageSwapCache bit
+ * is still set and the operation may continue.
+ */
+ rc = -EPERM;
+ if (try_to_unmap(page, 1) == SWAP_FAIL)
+ /* A vma has VM_LOCKED set -> permanent failure */
+ goto unlock_both;
+
+ rc = -EAGAIN;
+ if (page_mapped(page))
+ goto unlock_both;
+ /*
* Pages are properly locked and writeback is complete.
* Try to migrate the page.
*/
@@ -501,17 +498,6 @@
goto unlock_both;
}
- /* Make sure the dirty bit is up to date */
- if (try_to_unmap(page, 1) == SWAP_FAIL) {
- rc = -EPERM;
- goto unlock_both;
- }
-
- if (page_mapcount(page)) {
- rc = -EAGAIN;
- goto unlock_both;
- }
-
/*
* Default handling if a filesystem does not provide
* a migration function. We can only migrate clean
* [PATCH 6/7] PM cleanup: Pass "mapping" to migration functions
From: Christoph Lameter @ 2006-04-29 3:23 UTC (permalink / raw)
To: akpm
Cc: linux-mm, KAMEZAWA Hiroyuki, Lee Schermerhorn, Christoph Lameter,
Hugh Dickins
page migration: Change handling of address spaces.
Pass a pointer to the address space in which the page is being migrated
to all migration functions. This avoids repeatedly having to retrieve
the address space pointer from the page and check it for validity.

The old page's mapping will change once migration has reached a certain
step, so it is less confusing to have the pointer always available.

Move the setting of the mapping and index for the new page into
migrate_pages().
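
With the new calling convention, a filesystem hook would look like this (hypothetical filesystem, shown only to illustrate the signature introduced here):

	/* Hypothetical example, not part of the patch: a filesystem with
	 * no special migration needs forwards to migrate_page() using the
	 * mapping it is now handed directly. */
	static int myfs_migratepage(struct address_space *mapping,
			struct page *newpage, struct page *page)
	{
		return migrate_page(mapping, newpage, page);
	}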
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Index: linux-2.6.17-rc3/mm/migrate.c
===================================================================
--- linux-2.6.17-rc3.orig/mm/migrate.c 2006-04-28 18:12:57.786776854 -0700
+++ linux-2.6.17-rc3/mm/migrate.c 2006-04-28 18:40:22.939488970 -0700
@@ -173,15 +173,11 @@
* 2 for pages with a mapping
* 3 for pages with a mapping and PagePrivate set.
*/
-static int migrate_page_move_mapping(struct page *newpage,
- struct page *page)
+static int migrate_page_move_mapping(struct address_space *mapping,
+ struct page *newpage, struct page *page)
{
- struct address_space *mapping = page_mapping(page);
struct page **radix_pointer;
- if (!mapping)
- return -EAGAIN;
-
write_lock_irq(&mapping->tree_lock);
radix_pointer = (struct page **)radix_tree_lookup_slot(
@@ -197,15 +193,8 @@
/*
* Now we know that no one else is looking at the page.
- *
- * Certain minimal information about a page must be available
- * in order for other subsystems to properly handle the page if they
- * find it through the radix tree update before we are finished
- * copying the page.
*/
get_page(newpage);
- newpage->index = page->index;
- newpage->mapping = page->mapping;
if (PageSwapCache(page)) {
SetPageSwapCache(newpage);
set_page_private(newpage, page_private(page));
@@ -262,7 +251,8 @@
***********************************************************/
/* Always fail migration. Used for mappings that are not movable */
-int fail_migrate_page(struct page *newpage, struct page *page)
+int fail_migrate_page(struct address_space *mapping,
+ struct page *newpage, struct page *page)
{
return -EIO;
}
@@ -274,13 +264,14 @@
*
* Pages are locked upon entry and exit.
*/
-int migrate_page(struct page *newpage, struct page *page)
+int migrate_page(struct address_space *mapping,
+ struct page *newpage, struct page *page)
{
int rc;
BUG_ON(PageWriteback(page)); /* Writeback must be complete */
- rc = migrate_page_move_mapping(newpage, page);
+ rc = migrate_page_move_mapping(mapping, newpage, page);
if (rc)
return rc;
@@ -305,21 +296,18 @@
* if the underlying filesystem guarantees that no other references to "page"
* exist.
*/
-int buffer_migrate_page(struct page *newpage, struct page *page)
+int buffer_migrate_page(struct address_space *mapping,
+ struct page *newpage, struct page *page)
{
- struct address_space *mapping = page->mapping;
struct buffer_head *bh, *head;
int rc;
- if (!mapping)
- return -EAGAIN;
-
if (!page_has_buffers(page))
- return migrate_page(newpage, page);
+ return migrate_page(mapping, newpage, page);
head = page_buffers(page);
- rc = migrate_page_move_mapping(newpage, page);
+ rc = migrate_page_move_mapping(mapping, newpage, page);
if (rc)
return rc;
@@ -448,9 +436,6 @@
goto next;
}
- newpage = lru_to_page(to);
- lock_page(newpage);
-
/*
* Establish swap ptes for anonymous pages or destroy pte
* maps for files.
@@ -473,11 +458,18 @@
rc = -EPERM;
if (try_to_unmap(page, 1) == SWAP_FAIL)
/* A vma has VM_LOCKED set -> permanent failure */
- goto unlock_both;
+ goto unlock_page;
rc = -EAGAIN;
if (page_mapped(page))
- goto unlock_both;
+ goto unlock_page;
+
+ newpage = lru_to_page(to);
+ lock_page(newpage);
+ /* Prepare mapping for the new page.*/
+ newpage->index = page->index;
+ newpage->mapping = page->mapping;
+
/*
* Pages are properly locked and writeback is complete.
* Try to migrate the page.
@@ -494,7 +486,8 @@
* own migration function. This is the most common
* path for page migration.
*/
- rc = mapping->a_ops->migratepage(newpage, page);
+ rc = mapping->a_ops->migratepage(mapping,
+ newpage, page);
goto unlock_both;
}
@@ -524,7 +517,7 @@
*/
if (!page_has_buffers(page) ||
try_to_release_page(page, GFP_KERNEL)) {
- rc = migrate_page(newpage, page);
+ rc = migrate_page(mapping, newpage, page);
goto unlock_both;
}
@@ -553,12 +546,17 @@
unlock_page(page);
next:
- if (rc == -EAGAIN) {
- retry++;
- } else if (rc) {
- /* Permanent failure */
- list_move(&page->lru, failed);
- nr_failed++;
+ if (rc) {
+ if (newpage)
+ newpage->mapping = NULL;
+
+ if (rc == -EAGAIN)
+ retry++;
+ else {
+ /* Permanent failure */
+ list_move(&page->lru, failed);
+ nr_failed++;
+ }
} else {
if (newpage) {
/* Successful migration. Return page to LRU */
Index: linux-2.6.17-rc3/include/linux/fs.h
===================================================================
--- linux-2.6.17-rc3.orig/include/linux/fs.h 2006-04-26 19:19:25.000000000 -0700
+++ linux-2.6.17-rc3/include/linux/fs.h 2006-04-28 18:12:58.597273361 -0700
@@ -373,7 +373,8 @@
struct page* (*get_xip_page)(struct address_space *, sector_t,
int);
/* migrate the contents of a page to the specified target */
- int (*migratepage) (struct page *, struct page *);
+ int (*migratepage) (struct address_space *,
+ struct page *, struct page *);
};
struct backing_dev_info;
@@ -1768,7 +1769,8 @@
extern ssize_t simple_read_from_buffer(void __user *, size_t, loff_t *, const void *, size_t);
#ifdef CONFIG_MIGRATION
-extern int buffer_migrate_page(struct page *, struct page *);
+extern int buffer_migrate_page(struct address_space *,
+ struct page *, struct page *);
#else
#define buffer_migrate_page NULL
#endif
Index: linux-2.6.17-rc3/include/linux/migrate.h
===================================================================
--- linux-2.6.17-rc3.orig/include/linux/migrate.h 2006-04-28 18:12:55.854279761 -0700
+++ linux-2.6.17-rc3/include/linux/migrate.h 2006-04-28 18:12:58.598249863 -0700
@@ -7,12 +7,14 @@
#ifdef CONFIG_MIGRATION
extern int isolate_lru_page(struct page *p, struct list_head *pagelist);
extern int putback_lru_pages(struct list_head *l);
-extern int migrate_page(struct page *, struct page *);
+extern int migrate_page(struct address_space *,
+ struct page *, struct page *);
extern int migrate_pages(struct list_head *l, struct list_head *t,
struct list_head *moved, struct list_head *failed);
extern int migrate_pages_to(struct list_head *pagelist,
struct vm_area_struct *vma, int dest);
-extern int fail_migrate_page(struct page *, struct page *);
+extern int fail_migrate_page(struct address_space *,
+ struct page *, struct page *);
extern int migrate_prep(void);
* [PATCH 7/7] PM cleanup: Move fallback handling into special function
From: Christoph Lameter @ 2006-04-29 3:23 UTC (permalink / raw)
To: akpm
Cc: linux-mm, Hugh Dickins, Lee Schermerhorn, Christoph Lameter,
KAMEZAWA Hiroyuki
page migration: Add new fallback function
Move the fallback code into a new fallback function and make that
function behave like any other migration function. This requires
retaking the page lock if pageout() drops it.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Index: linux-2.6.17-rc3/mm/migrate.c
===================================================================
--- linux-2.6.17-rc3.orig/mm/migrate.c 2006-04-28 18:12:58.595320357 -0700
+++ linux-2.6.17-rc3/mm/migrate.c 2006-04-28 18:25:47.893218995 -0700
@@ -349,6 +349,42 @@
}
EXPORT_SYMBOL(buffer_migrate_page);
+static int fallback_migrate_page(struct address_space *mapping,
+ struct page *newpage, struct page *page)
+{
+ /*
+ * Default handling if a filesystem does not provide
+ * a migration function. We can only migrate clean
+ * pages so try to write out any dirty pages first.
+ */
+ if (PageDirty(page)) {
+ switch (pageout(page, mapping)) {
+ case PAGE_KEEP:
+ case PAGE_ACTIVATE:
+ return -EAGAIN;
+
+ case PAGE_SUCCESS:
+ /* Relock since we lost the lock */
+ lock_page(page);
+ /* Must retry since page state may have changed */
+ return -EAGAIN;
+
+ case PAGE_CLEAN:
+ ; /* try to migrate the page below */
+ }
+ }
+
+ /*
+ * Buffers are managed in a filesystem specific way.
+ * We must have no buffers or drop them.
+ */
+ if (page_has_buffers(page) &&
+ !try_to_release_page(page, GFP_KERNEL))
+ return -EAGAIN;
+
+ return migrate_page(mapping, newpage, page);
+}
+
/*
* migrate_pages
*
@@ -478,7 +514,7 @@
if (!mapping)
goto unlock_both;
- if (mapping->a_ops->migratepage) {
+ if (mapping->a_ops->migratepage)
/*
* Most pages have a mapping and most filesystems
* should provide a migration function. Anonymous
@@ -488,56 +524,8 @@
*/
rc = mapping->a_ops->migratepage(mapping,
newpage, page);
- goto unlock_both;
- }
-
- /*
- * Default handling if a filesystem does not provide
- * a migration function. We can only migrate clean
- * pages so try to write out any dirty pages first.
- */
- if (PageDirty(page)) {
- switch (pageout(page, mapping)) {
- case PAGE_KEEP:
- case PAGE_ACTIVATE:
- goto unlock_both;
-
- case PAGE_SUCCESS:
- unlock_page(newpage);
- goto next;
-
- case PAGE_CLEAN:
- ; /* try to migrate the page below */
- }
- }
-
- /*
- * Buffers are managed in a filesystem specific way.
- * We must have no buffers or drop them.
- */
- if (!page_has_buffers(page) ||
- try_to_release_page(page, GFP_KERNEL)) {
- rc = migrate_page(mapping, newpage, page);
- goto unlock_both;
- }
-
- /*
- * On early passes with mapped pages simply
- * retry. There may be a lock held for some
- * buffers that may go away. Later
- * swap them out.
- */
- if (pass > 4) {
- /*
- * Persistently unable to drop buffers..... As a
- * measure of last resort we fall back to
- * swap_page().
- */
- unlock_page(newpage);
- newpage = NULL;
- rc = swap_page(page);
- goto next;
- }
+ else
+ rc = fallback_migrate_page(mapping, newpage, page);
unlock_both:
unlock_page(newpage);
* [PATCH 1/3] Swapless PM: add R/W migration entries
From: Christoph Lameter @ 2006-04-29 3:23 UTC (permalink / raw)
To: akpm
Cc: linux-mm, KAMEZAWA Hiroyuki, Lee Schermerhorn, Christoph Lameter,
Hugh Dickins
swapless page migration: Implement read/write migration ptes
We take the upper two swap types for the two kinds of migration ptes
and define a series of macros in swapops.h.

The VM is modified to handle the migration entries. Migration entries
can only be encountered while the page they point to is locked.
This limits the number of places one has to fix. We also check for
migration ptes in copy_pte_range() and in mprotect_pte_range().

We check for migration ptes in do_swap_page() and call a function
that will then wait on the page lock. This allows us to effectively
stop all accesses to a page.

Migration entries are created by try_to_unmap() when called for
migration and removed by local functions in migrate.c.
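
The fault-side usage is small; this sketch condenses the check added to do_swap_page() by the mm/memory.c hunk below:

	entry = pte_to_swp_entry(orig_pte);
	if (is_migration_entry(entry)) {
		/* wait until migration completes, then retry the fault */
		migration_entry_wait(mm, pmd, address);
		goto out;
	}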
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Index: linux-2.6.17-rc3/include/linux/swap.h
===================================================================
--- linux-2.6.17-rc3.orig/include/linux/swap.h 2006-04-28 20:20:19.104969106 -0700
+++ linux-2.6.17-rc3/include/linux/swap.h 2006-04-28 20:20:35.501412608 -0700
@@ -29,7 +29,14 @@
* the type/offset into the pte as 5/27 as well.
*/
#define MAX_SWAPFILES_SHIFT 5
+#ifndef CONFIG_MIGRATION
#define MAX_SWAPFILES (1 << MAX_SWAPFILES_SHIFT)
+#else
+/* Use last entry for page migration swap entries */
+#define MAX_SWAPFILES ((1 << MAX_SWAPFILES_SHIFT)-2)
+#define SWP_MIGRATION_READ MAX_SWAPFILES
+#define SWP_MIGRATION_WRITE (MAX_SWAPFILES + 1)
+#endif
/*
* Magic header for a swap area. The first part of the union is
Index: linux-2.6.17-rc3/include/linux/swapops.h
===================================================================
--- linux-2.6.17-rc3.orig/include/linux/swapops.h 2006-04-26 19:19:25.000000000 -0700
+++ linux-2.6.17-rc3/include/linux/swapops.h 2006-04-28 20:20:35.502389109 -0700
@@ -67,3 +67,56 @@
BUG_ON(pte_file(__swp_entry_to_pte(arch_entry)));
return __swp_entry_to_pte(arch_entry);
}
+
+#ifdef CONFIG_MIGRATION
+static inline swp_entry_t make_migration_entry(struct page *page, int write)
+{
+ BUG_ON(!PageLocked(page));
+ return swp_entry(write ? SWP_MIGRATION_WRITE : SWP_MIGRATION_READ,
+ page_to_pfn(page));
+}
+
+static inline int is_migration_entry(swp_entry_t entry)
+{
+ return unlikely(swp_type(entry) == SWP_MIGRATION_READ ||
+ swp_type(entry) == SWP_MIGRATION_WRITE);
+}
+
+static inline int is_write_migration_entry(swp_entry_t entry)
+{
+ return unlikely(swp_type(entry) == SWP_MIGRATION_WRITE);
+}
+
+static inline struct page *migration_entry_to_page(swp_entry_t entry)
+{
+ struct page *p = pfn_to_page(swp_offset(entry));
+ /*
+ * Any use of migration entries may only occur while the
+ * corresponding page is locked
+ */
+ BUG_ON(!PageLocked(p));
+ return p;
+}
+
+static inline void make_migration_entry_read(swp_entry_t *entry)
+{
+ *entry = swp_entry(SWP_MIGRATION_READ, swp_offset(*entry));
+}
+
+extern void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd,
+ unsigned long address);
+#else
+
+#define make_migration_entry(page, write) swp_entry(0, 0)
+#define is_migration_entry(swp) 0
+#define migration_entry_to_page(swp) NULL
+static inline void make_migration_entry_read(swp_entry_t *entryp) { }
+static inline void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd,
+ unsigned long address) { }
+static inline int is_write_migration_entry(swp_entry_t entry)
+{
+ return 0;
+}
+
+#endif
+
Index: linux-2.6.17-rc3/mm/swapfile.c
===================================================================
--- linux-2.6.17-rc3.orig/mm/swapfile.c 2006-04-28 20:20:19.487757853 -0700
+++ linux-2.6.17-rc3/mm/swapfile.c 2006-04-28 20:20:35.503365611 -0700
@@ -395,6 +395,9 @@
struct swap_info_struct * p;
struct page *page = NULL;
+ if (is_migration_entry(entry))
+ return;
+
p = swap_info_get(entry);
if (p) {
if (swap_entry_free(p, swp_offset(entry)) == 1) {
@@ -1702,6 +1705,9 @@
unsigned long offset, type;
int result = 0;
+ if (is_migration_entry(entry))
+ return 1;
+
type = swp_type(entry);
if (type >= nr_swapfiles)
goto bad_file;
Index: linux-2.6.17-rc3/mm/rmap.c
===================================================================
--- linux-2.6.17-rc3.orig/mm/rmap.c 2006-04-28 20:20:25.123150353 -0700
+++ linux-2.6.17-rc3/mm/rmap.c 2006-04-28 20:20:35.504342113 -0700
@@ -620,17 +620,27 @@
if (PageAnon(page)) {
swp_entry_t entry = { .val = page_private(page) };
- /*
- * Store the swap location in the pte.
- * See handle_pte_fault() ...
- */
- BUG_ON(!PageSwapCache(page));
- swap_duplicate(entry);
- if (list_empty(&mm->mmlist)) {
- spin_lock(&mmlist_lock);
- if (list_empty(&mm->mmlist))
- list_add(&mm->mmlist, &init_mm.mmlist);
- spin_unlock(&mmlist_lock);
+
+ if (PageSwapCache(page)) {
+ /*
+ * Store the swap location in the pte.
+ * See handle_pte_fault() ...
+ */
+ swap_duplicate(entry);
+ if (list_empty(&mm->mmlist)) {
+ spin_lock(&mmlist_lock);
+ if (list_empty(&mm->mmlist))
+ list_add(&mm->mmlist, &init_mm.mmlist);
+ spin_unlock(&mmlist_lock);
+ }
+ } else {
+ /*
+ * Store the pfn of the page in a special migration
+ * pte. do_swap_page() will wait until the migration
+ * pte is removed and then restart fault handling.
+ */
+ BUG_ON(!migration);
+ entry = make_migration_entry(page, pte_write(pteval));
}
set_pte_at(mm, address, pte, swp_entry_to_pte(entry));
BUG_ON(pte_file(*pte));
Index: linux-2.6.17-rc3/mm/migrate.c
===================================================================
--- linux-2.6.17-rc3.orig/mm/migrate.c 2006-04-28 20:20:30.437273724 -0700
+++ linux-2.6.17-rc3/mm/migrate.c 2006-04-28 20:20:35.505318615 -0700
@@ -15,6 +15,7 @@
#include <linux/migrate.h>
#include <linux/module.h>
#include <linux/swap.h>
+#include <linux/swapops.h>
#include <linux/pagemap.h>
#include <linux/buffer_head.h>
#include <linux/mm_inline.h>
@@ -23,7 +24,6 @@
#include <linux/topology.h>
#include <linux/cpu.h>
#include <linux/cpuset.h>
-#include <linux/swapops.h>
#include "internal.h"
@@ -119,6 +119,136 @@
return count;
}
+static inline int is_swap_pte(pte_t pte)
+{
+ return !pte_none(pte) && !pte_present(pte) && !pte_file(pte);
+}
+
+/*
+ * Restore a potential migration pte to a working pte entry for
+ * anonymous pages.
+ */
+static void remove_migration_pte(struct vm_area_struct *vma, unsigned long addr,
+ struct page *old, struct page *new)
+{
+ struct mm_struct *mm = vma->vm_mm;
+ swp_entry_t entry;
+ pgd_t *pgd;
+ pud_t *pud;
+ pmd_t *pmd;
+ pte_t *ptep, pte;
+ spinlock_t *ptl;
+
+ pgd = pgd_offset(mm, addr);
+ if (!pgd_present(*pgd))
+ return;
+
+ pud = pud_offset(pgd, addr);
+ if (!pud_present(*pud))
+ return;
+
+ pmd = pmd_offset(pud, addr);
+ if (!pmd_present(*pmd))
+ return;
+
+ ptep = pte_offset_map(pmd, addr);
+
+ if (!is_swap_pte(*ptep)) {
+ pte_unmap(ptep);
+ return;
+ }
+
+ ptl = pte_lockptr(mm, pmd);
+ spin_lock(ptl);
+ pte = *ptep;
+ if (!is_swap_pte(pte))
+ goto out;
+
+ entry = pte_to_swp_entry(pte);
+
+ if (!is_migration_entry(entry) || migration_entry_to_page(entry) != old)
+ goto out;
+
+ inc_mm_counter(mm, anon_rss);
+ get_page(new);
+ pte = pte_mkold(mk_pte(new, vma->vm_page_prot));
+ if (is_write_migration_entry(entry))
+ pte = pte_mkwrite(pte);
+ set_pte_at(mm, addr, ptep, pte);
+ page_add_anon_rmap(new, vma, addr);
+out:
+ pte_unmap_unlock(pte, ptl);
+}
+
+/*
+ * Get rid of all migration entries and replace them by
+ * references to the indicated page.
+ *
+ * Must hold mmap_sem lock on at least one of the vmas containing
+ * the page so that the anon_vma cannot vanish.
+ */
+static void remove_migration_ptes(struct page *old, struct page *new)
+{
+ struct anon_vma *anon_vma;
+ struct vm_area_struct *vma;
+ unsigned long mapping;
+
+ mapping = (unsigned long)new->mapping;
+
+ if (!mapping || (mapping & PAGE_MAPPING_ANON) == 0)
+ return;
+
+ /*
+ * We hold the mmap_sem lock. So no need to call page_lock_anon_vma.
+ */
+ anon_vma = (struct anon_vma *) (mapping - PAGE_MAPPING_ANON);
+ spin_lock(&anon_vma->lock);
+
+ list_for_each_entry(vma, &anon_vma->head, anon_vma_node)
+ remove_migration_pte(vma, page_address_in_vma(new, vma),
+ old, new);
+
+ spin_unlock(&anon_vma->lock);
+}
+
+/*
+ * Something used the pte of a page under migration. We need to
+ * get to the page and wait until migration is finished.
+ * When we return from this function the fault will be retried.
+ *
+ * This function is called from do_swap_page().
+ */
+void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd,
+ unsigned long address)
+{
+ pte_t *ptep, pte;
+ spinlock_t *ptl;
+ swp_entry_t entry;
+ struct page *page;
+
+ ptep = pte_offset_map_lock(mm, pmd, address, &ptl);
+ pte = *ptep;
+ if (!is_swap_pte(pte))
+ goto out;
+
+ entry = pte_to_swp_entry(pte);
+ if (!is_migration_entry(entry))
+ goto out;
+
+ page = migration_entry_to_page(entry);
+
+ /* Pages with migration entries are always locked */
+ BUG_ON(!PageLocked(page));
+
+ get_page(page);
+ pte_unmap_unlock(ptep, ptl);
+ wait_on_page_locked(page);
+ put_page(page);
+ return;
+out:
+ pte_unmap_unlock(ptep, ptl);
+}
+
/*
* swapout a single page
* page is locked upon entry, unlocked on exit
Index: linux-2.6.17-rc3/mm/memory.c
===================================================================
--- linux-2.6.17-rc3.orig/mm/memory.c 2006-04-28 20:20:19.437956256 -0700
+++ linux-2.6.17-rc3/mm/memory.c 2006-04-28 20:21:17.571067703 -0700
@@ -434,7 +434,9 @@
/* pte contains position in swap or file, so copy. */
if (unlikely(!pte_present(pte))) {
if (!pte_file(pte)) {
- swap_duplicate(pte_to_swp_entry(pte));
+ swp_entry_t entry = pte_to_swp_entry(pte);
+
+ swap_duplicate(entry);
/* make sure dst_mm is on swapoff's mmlist. */
if (unlikely(list_empty(&dst_mm->mmlist))) {
spin_lock(&mmlist_lock);
@@ -443,6 +445,19 @@
&src_mm->mmlist);
spin_unlock(&mmlist_lock);
}
+ if (is_migration_entry(entry) &&
+ is_cow_mapping(vm_flags)) {
+ page = migration_entry_to_page(entry);
+
+ /*
+ * COW mappings require pages in both parent
+ * and child to be set to read.
+ */
+ entry = make_migration_entry(page,
+ SWP_MIGRATION_READ);
+ pte = swp_entry_to_pte(entry);
+ set_pte_at(src_mm, addr, src_pte, pte);
+ }
}
goto out_set_pte;
}
@@ -1879,6 +1894,10 @@
goto out;
entry = pte_to_swp_entry(orig_pte);
+ if (is_migration_entry(entry)) {
+ migration_entry_wait(mm, pmd, address);
+ goto out;
+ }
page = lookup_swap_cache(entry);
if (!page) {
swapin_readahead(entry, address, vma);
Index: linux-2.6.17-rc3/mm/mprotect.c
===================================================================
--- linux-2.6.17-rc3.orig/mm/mprotect.c 2006-04-28 20:20:19.450650781 -0700
+++ linux-2.6.17-rc3/mm/mprotect.c 2006-04-28 20:20:35.508248121 -0700
@@ -19,7 +19,8 @@
#include <linux/mempolicy.h>
#include <linux/personality.h>
#include <linux/syscalls.h>
-
+#include <linux/swap.h>
+#include <linux/swapops.h>
#include <asm/uaccess.h>
#include <asm/pgtable.h>
#include <asm/cacheflush.h>
@@ -28,12 +29,13 @@
static void change_pte_range(struct mm_struct *mm, pmd_t *pmd,
unsigned long addr, unsigned long end, pgprot_t newprot)
{
- pte_t *pte;
+ pte_t *pte, oldpte;
spinlock_t *ptl;
pte = pte_offset_map_lock(mm, pmd, addr, &ptl);
do {
- if (pte_present(*pte)) {
+ oldpte = *pte;
+ if (pte_present(oldpte)) {
pte_t ptent;
/* Avoid an SMP race with hardware updated dirty/clean
@@ -43,7 +45,20 @@
ptent = pte_modify(ptep_get_and_clear(mm, addr, pte), newprot);
set_pte_at(mm, addr, pte, ptent);
lazy_mmu_prot_update(ptent);
+ } else if (!pte_file(oldpte)) {
+ swp_entry_t entry = pte_to_swp_entry(oldpte);
+
+ if (is_write_migration_entry(entry)) {
+ /*
+ * A protection check is difficult so
+ * just be safe and disable write
+ */
+ make_migration_entry_read(&entry);
+ set_pte_at(mm, addr, pte,
+ swp_entry_to_pte(entry));
+ }
}
+
} while (pte++, addr += PAGE_SIZE, addr != end);
pte_unmap_unlock(pte - 1, ptl);
}
* [PATCH 2/3] Swapless PM: Rip out swap based logic
From: Christoph Lameter @ 2006-04-29 3:23 UTC (permalink / raw)
To: akpm
Cc: linux-mm, Hugh Dickins, Lee Schermerhorn, Christoph Lameter,
KAMEZAWA Hiroyuki
Rip out the swap-based page migration logic.

Remove all code that has to do with swapping during page migration.
This also guts the ability to migrate pages to swap. No one used that,
so let's let it go for good.

Page migration will be a bit broken after this patch; the following
patch makes it work again on top of migration entries.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Index: linux-2.6.17-rc3/include/linux/rmap.h
===================================================================
--- linux-2.6.17-rc3.orig/include/linux/rmap.h 2006-04-26 19:19:25.000000000 -0700
+++ linux-2.6.17-rc3/include/linux/rmap.h 2006-04-28 20:06:12.757091693 -0700
@@ -92,7 +92,6 @@
*/
int page_referenced(struct page *, int is_locked);
int try_to_unmap(struct page *, int ignore_refs);
-void remove_from_swap(struct page *page);
/*
* Called from mm/filemap_xip.c to unmap empty zero page
Index: linux-2.6.17-rc3/mm/migrate.c
===================================================================
--- linux-2.6.17-rc3.orig/mm/migrate.c 2006-04-28 20:05:11.674943774 -0700
+++ linux-2.6.17-rc3/mm/migrate.c 2006-04-28 20:11:39.065947928 -0700
@@ -70,10 +70,6 @@
*/
int migrate_prep(void)
{
- /* Must have swap device for migration */
- if (nr_swap_pages <= 0)
- return -ENODEV;
-
/*
* Clear the LRU lists so pages can be isolated.
* Note that pages may be moved off the LRU after we have
@@ -250,52 +246,6 @@
}
/*
- * swapout a single page
- * page is locked upon entry, unlocked on exit
- */
-static int swap_page(struct page *page)
-{
- struct address_space *mapping = page_mapping(page);
-
- if (page_mapped(page) && mapping)
- if (try_to_unmap(page, 1) != SWAP_SUCCESS)
- goto unlock_retry;
-
- if (PageDirty(page)) {
- /* Page is dirty, try to write it out here */
- switch(pageout(page, mapping)) {
- case PAGE_KEEP:
- case PAGE_ACTIVATE:
- goto unlock_retry;
-
- case PAGE_SUCCESS:
- goto retry;
-
- case PAGE_CLEAN:
- ; /* try to free the page below */
- }
- }
-
- if (PagePrivate(page)) {
- if (!try_to_release_page(page, GFP_KERNEL) ||
- (!mapping && page_count(page) == 1))
- goto unlock_retry;
- }
-
- if (remove_mapping(mapping, page)) {
- /* Success */
- unlock_page(page);
- return 0;
- }
-
-unlock_retry:
- unlock_page(page);
-
-retry:
- return -EAGAIN;
-}
-
-/*
* Remove or replace the page in the mapping.
*
* The number of remaining references must be:
@@ -521,8 +471,7 @@
* Two lists are passed to this function. The first list
* contains the pages isolated from the LRU to be migrated.
* The second list contains new pages that the pages isolated
- * can be moved to. If the second list is NULL then all
- * pages are swapped out.
+ * can be moved to.
*
* The function returns after 10 attempts or if no pages
* are movable anymore because to has become empty
@@ -578,29 +527,11 @@
* Only wait on writeback if we have already done a pass where
* we we may have triggered writeouts for lots of pages.
*/
- if (pass > 0) {
+ if (pass > 0)
wait_on_page_writeback(page);
- } else {
+ else
if (PageWriteback(page))
goto unlock_page;
- }
-
- /*
- * Anonymous pages must have swap cache references otherwise
- * the information contained in the page maps cannot be
- * preserved.
- */
- if (PageAnon(page) && !PageSwapCache(page)) {
- if (!add_to_swap(page, GFP_KERNEL)) {
- rc = -ENOMEM;
- goto unlock_page;
- }
- }
-
- if (!to) {
- rc = swap_page(page);
- goto next;
- }
/*
* Establish swap ptes for anonymous pages or destroy pte
Index: linux-2.6.17-rc3/mm/rmap.c
===================================================================
--- linux-2.6.17-rc3.orig/mm/rmap.c 2006-04-28 20:05:11.673967272 -0700
+++ linux-2.6.17-rc3/mm/rmap.c 2006-04-28 20:06:12.759044697 -0700
@@ -205,44 +205,6 @@
return anon_vma;
}
-#ifdef CONFIG_MIGRATION
-/*
- * Remove an anonymous page from swap replacing the swap pte's
- * through real pte's pointing to valid pages and then releasing
- * the page from the swap cache.
- *
- * Must hold page lock on page and mmap_sem of one vma that contains
- * the page.
- */
-void remove_from_swap(struct page *page)
-{
- struct anon_vma *anon_vma;
- struct vm_area_struct *vma;
- unsigned long mapping;
-
- if (!PageSwapCache(page))
- return;
-
- mapping = (unsigned long)page->mapping;
-
- if (!mapping || (mapping & PAGE_MAPPING_ANON) == 0)
- return;
-
- /*
- * We hold the mmap_sem lock. So no need to call page_lock_anon_vma.
- */
- anon_vma = (struct anon_vma *) (mapping - PAGE_MAPPING_ANON);
- spin_lock(&anon_vma->lock);
-
- list_for_each_entry(vma, &anon_vma->head, anon_vma_node)
- remove_vma_swap(vma, page);
-
- spin_unlock(&anon_vma->lock);
- delete_from_swap_cache(page);
-}
-EXPORT_SYMBOL(remove_from_swap);
-#endif
-
/*
* At what user virtual address is page expected in vma?
*/
Index: linux-2.6.17-rc3/mm/swapfile.c
===================================================================
--- linux-2.6.17-rc3.orig/mm/swapfile.c 2006-04-28 20:05:11.673967272 -0700
+++ linux-2.6.17-rc3/mm/swapfile.c 2006-04-28 20:06:12.760021199 -0700
@@ -618,15 +618,6 @@
return 0;
}
-#ifdef CONFIG_MIGRATION
-int remove_vma_swap(struct vm_area_struct *vma, struct page *page)
-{
- swp_entry_t entry = { .val = page_private(page) };
-
- return unuse_vma(vma, entry, page);
-}
-#endif
-
/*
* Scan swap_map from current position to next entry still in use.
* Recycle to start on reaching the end, returning 0 when empty.
* [PATCH 3/3] Swapless PM: Modify core logic
From: Christoph Lameter @ 2006-04-29 3:23 UTC (permalink / raw)
To: akpm
Cc: linux-mm, KAMEZAWA Hiroyuki, Lee Schermerhorn, Christoph Lameter,
Hugh Dickins
Use the migration entries for page migration
This modifies the migration code to use the new migration entries. It now
becomes possible to migrate anonymous pages without having to add a swap
entry.
We add a couple of new functions to replace migration entries with the proper
ptes.
We can no longer take the tree_lock when migrating anonymous pages,
since they have no mapping. However, we know that we hold the only
remaining reference to the page when the page count reaches 1.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Index: linux-2.6.17-rc3/mm/Kconfig
===================================================================
--- linux-2.6.17-rc3.orig/mm/Kconfig 2006-04-26 19:19:25.000000000 -0700
+++ linux-2.6.17-rc3/mm/Kconfig 2006-04-28 20:11:44.644703353 -0700
@@ -138,8 +138,8 @@
#
config MIGRATION
bool "Page migration"
- def_bool y if NUMA
- depends on SWAP && NUMA
+ def_bool y
+ depends on NUMA
help
Allows the migration of the physical location of pages of processes
while the virtual addresses are not changed. This is useful for
Index: linux-2.6.17-rc3/mm/migrate.c
===================================================================
--- linux-2.6.17-rc3.orig/mm/migrate.c 2006-04-28 20:11:39.065947928 -0700
+++ linux-2.6.17-rc3/mm/migrate.c 2006-04-28 20:12:15.213119206 -0700
@@ -258,6 +258,13 @@
{
struct page **radix_pointer;
+ if (!mapping) {
+ /* Anonymous page */
+ if (page_count(page) != 1 || !page->mapping)
+ return -EAGAIN;
+ return 0;
+ }
+
write_lock_irq(&mapping->tree_lock);
radix_pointer = (struct page **)radix_tree_lookup_slot(
@@ -275,10 +282,12 @@
* Now we know that no one else is looking at the page.
*/
get_page(newpage);
+#ifdef CONFIG_SWAP
if (PageSwapCache(page)) {
SetPageSwapCache(newpage);
set_page_private(newpage, page_private(page));
}
+#endif
*radix_pointer = newpage;
__put_page(page);
@@ -312,7 +321,9 @@
set_page_dirty(newpage);
}
+#ifdef CONFIG_SWAP
ClearPageSwapCache(page);
+#endif
ClearPageActive(page);
ClearPagePrivate(page);
set_page_private(page, 0);
@@ -357,16 +368,6 @@
return rc;
migrate_page_copy(newpage, page);
-
- /*
- * Remove auxiliary swap entries and replace
- * them with real ptes.
- *
- * Note that a real pte entry will allow processes that are not
- * waiting on the page lock to use the new page via the page tables
- * before the new page is unlocked.
- */
- remove_from_swap(newpage);
return 0;
}
EXPORT_SYMBOL(migrate_page);
@@ -534,23 +535,7 @@
goto unlock_page;
/*
- * Establish swap ptes for anonymous pages or destroy pte
- * maps for files.
- *
- * In order to reestablish file backed mappings the fault handlers
- * will take the radix tree_lock which may then be used to stop
- * processses from accessing this page until the new page is ready.
- *
- * A process accessing via a swap pte (an anonymous page) will take a
- * page_lock on the old page which will block the process until the
- * migration attempt is complete. At that time the PageSwapCache bit
- * will be examined. If the page was migrated then the PageSwapCache
- * bit will be clear and the operation to retrieve the page will be
- * retried which will find the new page in the radix tree. Then a new
- * direct mapping may be generated based on the radix tree contents.
- *
- * If the page was not migrated then the PageSwapCache bit
- * is still set and the operation may continue.
+ * Establish migration ptes or remove ptes
*/
rc = -EPERM;
if (try_to_unmap(page, 1) == SWAP_FAIL)
@@ -573,9 +558,9 @@
*/
mapping = page_mapping(page);
if (!mapping)
- goto unlock_both;
+ rc = migrate_page(mapping, newpage, page);
- if (mapping->a_ops->migratepage)
+ else if (mapping->a_ops->migratepage)
/*
* Most pages have a mapping and most filesystems
* should provide a migration function. Anonymous
@@ -588,10 +573,15 @@
else
rc = fallback_migrate_page(mapping, newpage, page);
-unlock_both:
+ if (!rc)
+ remove_migration_ptes(page, newpage);
+
unlock_page(newpage);
unlock_page:
+ if (rc)
+ remove_migration_ptes(page, page);
+
unlock_page(page);
next:
* [PATCH 1/2] More PM: do not inc/dec rss counters
2006-04-29 3:22 Page Migration patchsets overview Christoph Lameter
` (9 preceding siblings ...)
2006-04-29 3:23 ` [PATCH 3/3] Swapless PM: Modify core logic Christoph Lameter
@ 2006-04-29 3:23 ` Christoph Lameter
2006-04-29 3:23 ` [PATCH 2/2] More PM: use migration entries for file pages Christoph Lameter
11 siblings, 0 replies; 19+ messages in thread
From: Christoph Lameter @ 2006-04-29 3:23 UTC (permalink / raw)
To: akpm
Cc: linux-mm, Hugh Dickins, Lee Schermerhorn, Christoph Lameter,
KAMEZAWA Hiroyuki
more page migration: Do not dec/inc rss counters
If we install a migration entry then the rss does not really decrease,
since the page is just being moved somewhere else. We can save ourselves
the work of decrementing and later re-incrementing the counter, which
would only cause cacheline bouncing.
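As a minimal illustration of the invariant (a userspace model, not
kernel code; anon_rss and the entry kinds are stand-ins for the real
per-mm counter and pte contents):

#include <assert.h>
#include <stdbool.h>

static int anon_rss;	/* models the per-mm anonymous rss counter */

/* try_to_unmap_one(), reduced to its rss effect */
static void unmap_one(bool migration)
{
	if (migration) {
		/* migration entry: the page still belongs to this mm */
	} else {
		/* swap entry: the page really leaves the mm */
		anon_rss--;
	}
}

/* remove_migration_pte(), reduced to its rss effect */
static void map_new_page(void)
{
	/* restore a real pte; rss was never decremented, nothing to add */
}

int main(void)
{
	anon_rss = 1;		/* one mapped anonymous page */
	unmap_one(true);	/* unmap for migration */
	map_new_page();		/* map the migrated copy */
	assert(anon_rss == 1);	/* no dec/inc pair, no cacheline bouncing */
	return 0;
}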
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Index: linux-2.6.17-rc3/mm/migrate.c
===================================================================
--- linux-2.6.17-rc3.orig/mm/migrate.c 2006-04-28 19:33:55.335332017 -0700
+++ linux-2.6.17-rc3/mm/migrate.c 2006-04-28 19:37:31.941975877 -0700
@@ -165,7 +165,6 @@
if (!is_migration_entry(entry) || migration_entry_to_page(entry) != old)
goto out;
- inc_mm_counter(mm, anon_rss);
get_page(new);
pte = pte_mkold(mk_pte(new, vma->vm_page_prot));
if (is_write_migration_entry(entry))
Index: linux-2.6.17-rc3/mm/rmap.c
===================================================================
--- linux-2.6.17-rc3.orig/mm/rmap.c 2006-04-28 19:22:56.064823992 -0700
+++ linux-2.6.17-rc3/mm/rmap.c 2006-04-28 19:37:31.941975877 -0700
@@ -595,6 +595,7 @@
list_add(&mm->mmlist, &init_mm.mmlist);
spin_unlock(&mmlist_lock);
}
+ dec_mm_counter(mm, anon_rss);
} else {
/*
* Store the pfn of the page in a special migration
@@ -606,7 +607,6 @@
}
set_pte_at(mm, address, pte, swp_entry_to_pte(entry));
BUG_ON(pte_file(*pte));
- dec_mm_counter(mm, anon_rss);
} else
dec_mm_counter(mm, file_rss);
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 2/2] More PM: use migration entries for file pages
2006-04-29 3:22 Page Migration patchsets overview Christoph Lameter
` (10 preceding siblings ...)
2006-04-29 3:23 ` [PATCH 1/2] More PM: do not inc/dec rss counters Christoph Lameter
@ 2006-04-29 3:23 ` Christoph Lameter
11 siblings, 0 replies; 19+ messages in thread
From: Christoph Lameter @ 2006-04-29 3:23 UTC (permalink / raw)
To: akpm
Cc: linux-mm, KAMEZAWA Hiroyuki, Lee Schermerhorn, Christoph Lameter,
Hugh Dickins
more page migration: Use migration entries for file backed pages
This implements the use of migration entries to preserve ptes of
file backed pages during migration. Processes can therefore
be migrated back and forth without losing their connection to
pagecache pages.
Note that we implement the migration entries only for linear
mappings. Nonlinear mappings still require the unmapping of the ptes
for migration.
And another writepage() ugliness shows up. writepage() can drop
the page lock. Therefore we have to remove migration ptes
before calling writepage() in order to avoid having migration entries
point to unlocked pages.
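With the writeback hunk below applied, the dirty-page path reads
roughly like this (a reconstruction from the diff, for readability):

	if (PageDirty(page)) {
		/*
		 * Back out the migration entries first: pageout() may
		 * unlock the page, and a migration entry must never
		 * point to an unlocked page.
		 */
		remove_migration_ptes(page, page);

		if (pageout(page, mapping) == PAGE_SUCCESS)
			/* pageout() unlocked the page; relock it */
			lock_page(page);

		return -EAGAIN;		/* retry the migration later */
	}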
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Index: linux-2.6.17-rc3/mm/migrate.c
===================================================================
--- linux-2.6.17-rc3.orig/mm/migrate.c 2006-04-28 19:37:31.941975877 -0700
+++ linux-2.6.17-rc3/mm/migrate.c 2006-04-28 19:44:13.640685963 -0700
@@ -170,19 +170,44 @@
if (is_write_migration_entry(entry))
pte = pte_mkwrite(pte);
set_pte_at(mm, addr, ptep, pte);
- page_add_anon_rmap(new, vma, addr);
+
+ if (PageAnon(new))
+ page_add_anon_rmap(new, vma, addr);
+ else
+ page_add_file_rmap(new);
+
out:
pte_unmap_unlock(pte, ptl);
}
/*
- * Get rid of all migration entries and replace them by
- * references to the indicated page.
- *
+ * Note that remove_file_migration_ptes() will only work on regular
+ * mappings; other specialized mappings will simply be unmapped and do
+ * not use migration entries.
+ */
+static void remove_file_migration_ptes(struct page *old, struct page *new)
+{
+ struct vm_area_struct *vma;
+ struct address_space *mapping = page_mapping(new);
+ struct prio_tree_iter iter;
+ pgoff_t pgoff = new->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
+
+ if (!mapping)
+ return;
+
+ spin_lock(&mapping->i_mmap_lock);
+
+ vma_prio_tree_foreach(vma, &iter, &mapping->i_mmap, pgoff, pgoff)
+ remove_migration_pte(vma, page_address_in_vma(new, vma), old, new);
+
+ spin_unlock(&mapping->i_mmap_lock);
+}
+
+/*
* Must hold mmap_sem lock on at least one of the vmas containing
* the page so that the anon_vma cannot vanish.
*/
-static void remove_migration_ptes(struct page *old, struct page *new)
+static void remove_anon_migration_ptes(struct page *old, struct page *new)
{
struct anon_vma *anon_vma;
struct vm_area_struct *vma;
@@ -207,6 +232,18 @@
}
/*
+ * Get rid of all migration entries and replace them by
+ * references to the indicated page.
+ */
+static void remove_migration_ptes(struct page *old, struct page *new)
+{
+ if (PageAnon(new))
+ remove_anon_migration_ptes(old, new);
+ else
+ remove_file_migration_ptes(old, new);
+}
+
+/*
* Something used the pte of a page under migration. We need to
* get to the page and wait until migration is finished.
* When we return from this function the fault will be retried.
@@ -438,20 +475,18 @@
* pages so try to write out any dirty pages first.
*/
if (PageDirty(page)) {
- switch (pageout(page, mapping)) {
- case PAGE_KEEP:
- case PAGE_ACTIVATE:
- return -EAGAIN;
+ /*
+ * Remove the migration entries because pageout() may
+ * unlock which may result in migration entries pointing
+ * to unlocked pages.
+ */
+ remove_migration_ptes(page, page);
- case PAGE_SUCCESS:
- /* Relock since we lost the lock */
+ if (pageout(page, mapping) == PAGE_SUCCESS)
+ /* unlocked. Relock */
lock_page(page);
- /* Must retry since page state may have changed */
- return -EAGAIN;
- case PAGE_CLEAN:
- ; /* try to migrate the page below */
- }
+ return -EAGAIN;
}
/*
Index: linux-2.6.17-rc3/mm/rmap.c
===================================================================
--- linux-2.6.17-rc3.orig/mm/rmap.c 2006-04-28 19:37:31.941975877 -0700
+++ linux-2.6.17-rc3/mm/rmap.c 2006-04-28 19:41:02.442586074 -0700
@@ -607,8 +607,14 @@
}
set_pte_at(mm, address, pte, swp_entry_to_pte(entry));
BUG_ON(pte_file(*pte));
- } else
+ } else if (!migration)
dec_mm_counter(mm, file_rss);
+ else {
+ /* Establish migration entry for a file page */
+ swp_entry_t entry;
+ entry = make_migration_entry(page, pte_write(pteval));
+ set_pte_at(mm, address, pte, swp_entry_to_pte(entry));
+ }
page_remove_rmap(page);
page_cache_release(page);
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 4/7] PM cleanup: Drop nr_refs in remove_references()
2006-04-29 3:23 ` [PATCH 4/7] PM cleanup: Drop nr_refs in remove_references() Christoph Lameter
@ 2006-05-01 16:09 ` Lee Schermerhorn
2006-05-01 16:15 ` Christoph Lameter
0 siblings, 1 reply; 19+ messages in thread
From: Lee Schermerhorn @ 2006-05-01 16:09 UTC (permalink / raw)
To: Christoph Lameter; +Cc: akpm, linux-mm, KAMEZAWA Hiroyuki, Hugh Dickins
On Fri, 2006-04-28 at 20:23 -0700, Christoph Lameter wrote:
> page migration: Drop nr_refs parameter from migrate_page_remove_references()
>
> The nr_refs parameter is not really useful since the number of remaining
> references is always
>
> 1 for anonymous pages without a mapping
> 2 for pages with a mapping
> 3 for pages with a mapping and PagePrivate set.
>
> Remove the early check for the number of references since we are
> checking page_mapcount() earlier. Ultimately only the refcount
> matters after the tree_lock has been obtained.
True for direct migration. I'll still need to know whether we're in the
fault path for migrate-on-fault. I don't think I can count on using the
mapcount as you now already remove the mapping before calling migrate_page(),
even for direct migration...
>
> Signed-off-by: Christoph Lameter <clameter@sgi.com>
>
> Index: linux-2.6.17-rc3/mm/migrate.c
> ===================================================================
> --- linux-2.6.17-rc3.orig/mm/migrate.c 2006-04-28 17:26:13.866044400 -0700
> +++ linux-2.6.17-rc3/mm/migrate.c 2006-04-28 17:31:10.325193799 -0700
> @@ -168,19 +168,19 @@
> /*
> * Remove references for a page and establish the new page with the correct
> * basic settings to be able to stop accesses to the page.
> + *
> + * The number of remaining references must be:
> + * 1 for anonymous pages without a mapping
> + * 2 for pages with a mapping
> + * 3 for pages with a mapping and PagePrivate set.
> */
> static int migrate_page_remove_references(struct page *newpage,
> - struct page *page, int nr_refs)
> + struct page *page)
> {
> struct address_space *mapping = page_mapping(page);
> struct page **radix_pointer;
>
> - /*
> - * Avoid doing any of the following work if the page count
> - * indicates that the page is in use or truncate has removed
> - * the page.
> - */
> - if (!mapping || page_mapcount(page) + nr_refs != page_count(page))
> + if (!mapping)
> return -EAGAIN;
>
> /*
> @@ -218,7 +218,8 @@
> &mapping->page_tree,
> page_index(page));
>
> - if (!page_mapping(page) || page_count(page) != nr_refs ||
> + if (!page_mapping(page) ||
^^^^^^^^^^^^^^^^^
As part of patch 6/7, can you change this to just 'mapping'--i.e., the
added address_space argument?
> + page_count(page) != 2 + !!PagePrivate(page) ||
> *radix_pointer != page) {
> write_unlock_irq(&mapping->tree_lock);
> return -EAGAIN;
> @@ -309,7 +310,7 @@
>
> BUG_ON(PageWriteback(page)); /* Writeback must be complete */
>
> - rc = migrate_page_remove_references(newpage, page, 2);
> + rc = migrate_page_remove_references(newpage, page);
>
> if (rc)
> return rc;
> @@ -348,7 +349,7 @@
>
> head = page_buffers(page);
>
> - rc = migrate_page_remove_references(newpage, page, 3);
> + rc = migrate_page_remove_references(newpage, page);
>
> if (rc)
> return rc;
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 4/7] PM cleanup: Drop nr_refs in remove_references()
2006-05-01 16:09 ` Lee Schermerhorn
@ 2006-05-01 16:15 ` Christoph Lameter
2006-05-01 17:51 ` Lee Schermerhorn
0 siblings, 1 reply; 19+ messages in thread
From: Christoph Lameter @ 2006-05-01 16:15 UTC (permalink / raw)
To: Lee Schermerhorn; +Cc: akpm, linux-mm, KAMEZAWA Hiroyuki, Hugh Dickins
On Mon, 1 May 2006, Lee Schermerhorn wrote:
> > Remove the early check for the number of references since we are
> > checking page_mapcount() earlier. Ultimately only the refcount
> > matters after the tree_lock has been obtained.
> True for direct migration. I'll still need to know whether we're in the
> fault path for migrate-on-fault. I don't think I can count on using the
> mapcount as you now already remove the mapping before calling migrate_page(),
> even for direct migration...
Well, there is currently agreement that we won't include your patch because
it is not clear that the patch will be beneficial.
And AFAIK your patch relies on only migrating pages with mapcount = 0. In
that case I think you can call the migration functions directly without
having to unmap. I thought this would actually be better for your case.
> > - if (!page_mapping(page) || page_count(page) != nr_refs ||
> > + if (!page_mapping(page) ||
> ^^^^^^^^^^^^^^^^^
> As part of patch 6/7, can you change this to just 'mapping'--i.e., the
> added address_space argument?
No. The mapping may have been removed and this check is necessary to not
migrate a page that is already gone.
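For reference, the check in question reads like this with patch 4/7
applied (assembled from the hunk quoted above):

	write_lock_irq(&mapping->tree_lock);

	radix_pointer = (struct page **)radix_tree_lookup_slot(
					&mapping->page_tree,
					page_index(page));

	if (!page_mapping(page) ||
			page_count(page) != 2 + !!PagePrivate(page) ||
			*radix_pointer != page) {
		/* truncate got here first, or the page is still in use */
		write_unlock_irq(&mapping->tree_lock);
		return -EAGAIN;
	}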
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 4/7] PM cleanup: Drop nr_refs in remove_references()
2006-05-01 16:15 ` Christoph Lameter
@ 2006-05-01 17:51 ` Lee Schermerhorn
2006-05-01 18:04 ` Christoph Lameter
0 siblings, 1 reply; 19+ messages in thread
From: Lee Schermerhorn @ 2006-05-01 17:51 UTC (permalink / raw)
To: Christoph Lameter; +Cc: akpm, linux-mm, KAMEZAWA Hiroyuki, Hugh Dickins
On Mon, 2006-05-01 at 09:15 -0700, Christoph Lameter wrote:
> On Mon, 1 May 2006, Lee Schermerhorn wrote:
>
> > > Remove the early check for the number of references since we are
> > > checking page_mapcount() earlier. Ultimately only the refcount
> > > matters after the tree_lock has been obtained.
> > True for direct migration. I'll still need to know whether we're in the
> > fault path for migrate-on-fault. I don't think I can count on using the
> > mapcount as you now already remove the mapping before calling migrate_page(),
> > even for direct migration...
>
> Well, there is currently agreement that we won't include your patch because
> it is not clear that the patch will be beneficial.
Ouch! That's harsh! Guess I missed that meeting... ;-)
Seriously, of course, the onus is on me to show benefit. And I hope to,
once the base migration code stabilizes...
>
> And AFAIK your patch relies on only migrating pages with mapcount = 0. In
> that case I think you can call the migration functions directly without
> having to unmap. I thought this would actually be better for your case.
This only occurs if I find a cached, "misplaced" page in the fault path
with mapcount==0. But the fault path does add another ref on lookup,
so the refcounts are all one higher in this case.
>
> > > - if (!page_mapping(page) || page_count(page) != nr_refs ||
> > > + if (!page_mapping(page) ||
> > ^^^^^^^^^^^^^^^^^
> > As part of patch 6/7, can you change this to just 'mapping'--i.e., the
> > added address_space argument?
>
> No. The mapping may have been removed and this check is necessary to not
> migrate a page that is already gone.
OK, I couldn't see how a page could be removed from its mapping while
we hold it locked. I'll look closer...
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 4/7] PM cleanup: Drop nr_refs in remove_references()
2006-05-01 17:51 ` Lee Schermerhorn
@ 2006-05-01 18:04 ` Christoph Lameter
2006-05-01 18:34 ` Lee Schermerhorn
0 siblings, 1 reply; 19+ messages in thread
From: Christoph Lameter @ 2006-05-01 18:04 UTC (permalink / raw)
To: Lee Schermerhorn; +Cc: akpm, linux-mm, KAMEZAWA Hiroyuki, Hugh Dickins
On Mon, 1 May 2006, Lee Schermerhorn wrote:
> > And AFAIK your patch relies on only migrating pages with mapcount = 0. In
> > that case I think you can call the migration functions directly without
> > having to unmap. I thought this would actually be better for your case.
>
> This only occurs if I find a cached, "misplaced" page in the fault path
> with mapcount==0. But the fault path does add another ref on lookup,
> so the refcounts are all one higher in this case.
I sent you a set of patches that split migrate_pages(). Maybe that is what
you are looking for?
You only need one additional refcount to hold the page. This is the same
as in the case of migrate_pages(). Where does the second refcount come from?
> > No. The mapping may have been removed and this check is necessary to not
> > migrate a page that is already gone.
>
> OK I couldn't see how a page could be removed from its mapping while
> we hold it locked. I'll look closer...
zap_pte_range() can remove a mapcount without obtaining a lock. Hmmm...
Seems to do nothing with the mapping though. Removal of anonymous mappings
is deferred until we reach free_page(), so that does not apply. Check the
file I/O functions.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 4/7] PM cleanup: Drop nr_refs in remove_references()
2006-05-01 18:04 ` Christoph Lameter
@ 2006-05-01 18:34 ` Lee Schermerhorn
2006-05-01 18:53 ` Christoph Lameter
0 siblings, 1 reply; 19+ messages in thread
From: Lee Schermerhorn @ 2006-05-01 18:34 UTC (permalink / raw)
To: Christoph Lameter; +Cc: akpm, linux-mm, KAMEZAWA Hiroyuki, Hugh Dickins
On Mon, 2006-05-01 at 11:04 -0700, Christoph Lameter wrote:
> On Mon, 1 May 2006, Lee Schermerhorn wrote:
>
> > > And AFAIK your patch relies on only migrating pages with mapcount = 0. In
> > > that case I think you can call the migration functions directly without
> > > having to unmap. I thought this would actually be better for your case.
> >
> > This only occurs if I find a cached, "misplaced" page in the fault path
> > with mapcount==0. But the fault path does add another ref on lookup,
> > so the refcounts are all one higher in this case.
>
> I send you a set of patches that split migrate_pages(). Maybe that is what
> you are looking for?
>
> You only need one additional refcount to hold the page. This is the same
> as in the case of migrate_pages(). Where does the second refcount come from?
One from the cache [where I find the page on fault], one from
find_get_page(). Plus 1 from page_private(page) buf refs, if any.
Also, to match the page state for direct migration, I isolate the page
from the lru so it can't be found, except through the cache, while I'm
migrating it; and that adds yet another ref. Net is 1 extra ref in the
fault path.
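For the record, the arithmetic as a standalone model (assuming the
fault-path lookup really is the only additional reference):

#include <stdio.h>

/* expected page_count() under tree_lock before migration may proceed */
static int expected_refs(int has_private, int fault_path)
{
	/* 1 from the page cache + 1 from isolating the page */
	int refs = 2 + (has_private ? 1 : 0);

	/* find_get_page() in the fault path pins the page once more */
	return refs + (fault_path ? 1 : 0);
}

int main(void)
{
	printf("direct migration:         %d\n", expected_refs(0, 0)); /* 2 */
	printf("direct, with PagePrivate: %d\n", expected_refs(1, 0)); /* 3 */
	printf("migrate-on-fault:         %d\n", expected_refs(0, 1)); /* 3 */
	return 0;
}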
>
> > > No. The mapping may have been removed and this check is necessary to not
> > > migrate a page that is already gone.
> >
> > OK I couldn't see how a page could be removed from its mapping while
> > we hold it locked. I'll look closer...
>
> zap_pte_range() can remove a mapcount without obtaining a lock. Hmmm...
> Seems to do nothing with the mapping though. Removal of anonymous mappings
> is deferred until we reach free_page(), so that does not apply. Check the
> file I/O functions.
I did. Looked like page->mapping only gets NULLed out in
[__]remove_from_page_cache(). I backtracked all of the
refs I could find, and all seemed to hold page lock. But, again,
I could have missed some [cscope can lie, as can my eyes].
Lee
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 4/7] PM cleanup: Drop nr_refs in remove_references()
2006-05-01 18:34 ` Lee Schermerhorn
@ 2006-05-01 18:53 ` Christoph Lameter
0 siblings, 0 replies; 19+ messages in thread
From: Christoph Lameter @ 2006-05-01 18:53 UTC (permalink / raw)
To: Lee Schermerhorn; +Cc: akpm, linux-mm, KAMEZAWA Hiroyuki, Hugh Dickins
On Mon, 1 May 2006, Lee Schermerhorn wrote:
> > You only need one additional refcount to hold the page. This is the same
> > as in the case of migrate_pages(). Where does the second refcount come from?
>
> One from the cache [where I find the page on fault], one from
> find_get_page().
The function already considers the reference from the mapping. The
find_get_page() ref is basically the same as the one that isolate_lru_page() takes.
> Plus 1 from page_private(page) buf refs, if any. Also, to match the
That is also considered in the function.
> page state for direct migration, I isolate the page from the lru so it
> can't be found, except through the cache, while I'm migrating it; and
> that adds yet another ref. Net is 1 extra ref in the fault path.
Since you already have a ref from find_get_page(), I would not think that
you would need an additional one from isolate_lru_page().
> > zap_pte_range() can remove a mapcount without obtaining a lock. Hmmm...
> > Seems to do nothing with the mapping though. Removal of anonymous mappings
> > are deferred until we reach free_page() so that does not apply. Check the
> > file I/O functions.
>
> I did. Looked like page->mapping only gets NULLed out in
> [__]remove_from_page_cache(). I backtracked all of the
> refs I could find, and all seemed to hold page lock. But, again,
> I could have missed some [cscope can lie, as can my eyes].
If there is none, then we can completely remove that check. We already
check for the mapping to be non-null earlier.
^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads: [~2006-05-01 18:53 UTC | newest]
Thread overview: 19+ messages
-- links below jump to the message on this page --
2006-04-29 3:22 Page Migration patchsets overview Christoph Lameter
2006-04-29 3:22 ` [PATCH 1/7] PM cleanup: Rename "ignrefs" to "migration" Christoph Lameter
2006-04-29 3:22 ` [PATCH 2/7] PM cleanup: Group functions Christoph Lameter
2006-04-29 3:23 ` [PATCH 3/7] PM cleanup: Remove useless definitions Christoph Lameter
2006-04-29 3:23 ` [PATCH 4/7] PM cleanup: Drop nr_refs in remove_references() Christoph Lameter
2006-05-01 16:09 ` Lee Schermerhorn
2006-05-01 16:15 ` Christoph Lameter
2006-05-01 17:51 ` Lee Schermerhorn
2006-05-01 18:04 ` Christoph Lameter
2006-05-01 18:34 ` Lee Schermerhorn
2006-05-01 18:53 ` Christoph Lameter
2006-04-29 3:23 ` [PATCH 5/7] PM cleanup: Extract try_to_unmap from migration functions Christoph Lameter
2006-04-29 3:23 ` [PATCH 6/7] PM cleanup: Pass "mapping" to " Christoph Lameter
2006-04-29 3:23 ` [PATCH 7/7] PM cleanup: Move fallback handling into special function Christoph Lameter
2006-04-29 3:23 ` [PATCH 1/3] Swapless PM: add R/W migration entries Christoph Lameter
2006-04-29 3:23 ` [PATCH 2/3] Swapless PM: Rip out swap based logic Christoph Lameter
2006-04-29 3:23 ` [PATCH 3/3] Swapless PM: Modify core logic Christoph Lameter
2006-04-29 3:23 ` [PATCH 1/2] More PM: do not inc/dec rss counters Christoph Lameter
2006-04-29 3:23 ` [PATCH 2/2] More PM: use migration entries for file pages Christoph Lameter