From: Christoph Lameter <clameter@sgi.com>
To: akpm@osdl.org
Cc: Hugh Dickins <hugh@veritas.com>,
linux-kernel@vger.kernel.org,
Lee Schermerhorn <lee.schermerhorn@hp.com>,
linux-mm@kvack.org, Christoph Lameter <clameter@sgi.com>,
Hirokazu Takahashi <taka@valinux.co.jp>,
Marcelo Tosatti <marcelo.tosatti@cyclades.com>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Subject: [PATCH 5/5] Swapless V2: Revise main migration logic
Date: Thu, 13 Apr 2006 16:54:32 -0700 (PDT)
Message-ID: <20060413235432.15398.23912.sendpatchset@schroedinger.engr.sgi.com>
In-Reply-To: <20060413235406.15398.42233.sendpatchset@schroedinger.engr.sgi.com>
Use the migration entries for page migration
This modifies the migration code to use the new migration entries. It now
becomes possible to migrate anonymous pages without having to add a swap
entry first.

We add a couple of new functions that replace migration entries with the
proper ptes once migration is complete.

We can no longer take the tree_lock when migrating anonymous pages, since
they have no address-space mapping. However, when the page count reaches 1
we know that we hold the only remaining reference to the page.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Index: linux-2.6.17-rc1-mm2/mm/migrate.c
===================================================================
--- linux-2.6.17-rc1-mm2.orig/mm/migrate.c 2006-04-13 15:58:54.000000000 -0700
+++ linux-2.6.17-rc1-mm2/mm/migrate.c 2006-04-13 16:36:28.000000000 -0700
@@ -15,6 +15,7 @@
#include <linux/migrate.h>
#include <linux/module.h>
#include <linux/swap.h>
+#include <linux/swapops.h>
#include <linux/pagemap.h>
#include <linux/buffer_head.h>
#include <linux/mm_inline.h>
@@ -23,7 +24,6 @@
#include <linux/topology.h>
#include <linux/cpu.h>
#include <linux/cpuset.h>
-#include <linux/swapops.h>
#include "internal.h"
@@ -115,6 +115,95 @@ int putback_lru_pages(struct list_head *
return count;
}
+static inline int is_swap_pte(pte_t pte)
+{
+ return !pte_none(pte) && !pte_present(pte) && !pte_file(pte);
+}
+
+/*
+ * Restore a potential migration pte to a working pte entry for
+ * anonymous pages.
+ */
+static void remove_migration_pte(struct vm_area_struct *vma, unsigned long addr,
+ struct page *old, struct page *new)
+{
+ struct mm_struct *mm = vma->vm_mm;
+ swp_entry_t entry;
+ pgd_t *pgd;
+ pud_t *pud;
+ pmd_t *pmd;
+ pte_t *ptep, pte;
+ spinlock_t *ptl;
+
+ pgd = pgd_offset(mm, addr);
+ if (!pgd_present(*pgd))
+ return;
+
+ pud = pud_offset(pgd, addr);
+ if (!pud_present(*pud))
+ return;
+
+ pmd = pmd_offset(pud, addr);
+ if (!pmd_present(*pmd))
+ return;
+
+ ptep = pte_offset_map(pmd, addr);
+
+ if (!is_swap_pte(*ptep)) {
+ pte_unmap(ptep);
+ return;
+ }
+
+ ptl = pte_lockptr(mm, pmd);
+ spin_lock(ptl);
+ pte = *ptep;
+ if (!is_swap_pte(pte))
+ goto out;
+
+ entry = pte_to_swp_entry(pte);
+
+ if (!is_migration_entry(entry) || migration_entry_to_page(entry) != old)
+ goto out;
+
+ inc_mm_counter(mm, anon_rss);
+ get_page(new);
+ set_pte_at(mm, addr, ptep, pte_mkold(mk_pte(new, vma->vm_page_prot)));
+ page_add_anon_rmap(new, vma, addr);
+out:
+ pte_unmap_unlock(ptep, ptl);
+}
+
+/*
+ * Get rid of all migration entries and replace them by
+ * references to the indicated page.
+ *
+ * Must hold mmap_sem lock on at least one of the vmas containing
+ * the page so that the anon_vma cannot vanish.
+ */
+static void remove_migration_ptes(struct page *old, struct page *new)
+{
+ struct anon_vma *anon_vma;
+ struct vm_area_struct *vma;
+ unsigned long mapping;
+
+ mapping = (unsigned long)new->mapping;
+
+ if (!mapping || (mapping & PAGE_MAPPING_ANON) == 0)
+ return;
+
+ /*
+ * We hold the mmap_sem lock. So no need to call page_lock_anon_vma.
+ */
+ anon_vma = (struct anon_vma *) (mapping - PAGE_MAPPING_ANON);
+ spin_lock(&anon_vma->lock);
+
+ list_for_each_entry(vma, &anon_vma->head, anon_vma_node)
+ remove_migration_pte(vma, page_address_in_vma(new, vma),
+ old, new);
+
+ spin_unlock(&anon_vma->lock);
+}
+
/*
* Non migratable page
*/
@@ -125,8 +214,9 @@ int fail_migrate_page(struct page *newpa
EXPORT_SYMBOL(fail_migrate_page);
/*
- * Remove references for a page and establish the new page with the correct
- * basic settings to be able to stop accesses to the page.
+ * Remove or replace all references to a page so that future accesses to
+ * the page can be blocked. Establish the new page
+ * with the basic settings to be able to stop accesses to the page.
*/
int migrate_page_remove_references(struct page *newpage,
struct page *page, int nr_refs)
@@ -139,38 +229,51 @@ int migrate_page_remove_references(struc
* indicates that the page is in use or truncate has removed
* the page.
*/
- if (!mapping || page_mapcount(page) + nr_refs != page_count(page))
- return -EAGAIN;
+ if (!page->mapping ||
+ page_mapcount(page) + nr_refs != page_count(page))
+ return -EAGAIN;
/*
- * Establish swap ptes for anonymous pages or destroy pte
+ * Establish migration ptes for anonymous pages or destroy pte
* maps for files.
*
* In order to reestablish file backed mappings the fault handlers
* will take the radix tree_lock which may then be used to stop
* processes from accessing this page until the new page is ready.
*
- * A process accessing via a swap pte (an anonymous page) will take a
- * page_lock on the old page which will block the process until the
- * migration attempt is complete. At that time the PageSwapCache bit
- * will be examined. If the page was migrated then the PageSwapCache
- * bit will be clear and the operation to retrieve the page will be
- * retried which will find the new page in the radix tree. Then a new
- * direct mapping may be generated based on the radix tree contents.
- *
- * If the page was not migrated then the PageSwapCache bit
- * is still set and the operation may continue.
+ * A process accessing via a migration pte (an anonymous page) will
+ * take a page_lock on the old page which will block the process
+ * until the migration attempt is complete.
*/
if (try_to_unmap(page, 1) == SWAP_FAIL)
/* A vma has VM_LOCKED set -> permanent failure */
return -EPERM;
/*
- * Give up if we were unable to remove all mappings.
+ * Retry if we were unable to remove all mappings.
*/
if (page_mapcount(page))
return -EAGAIN;
+ if (!mapping) {
+ /*
+ * Anonymous page without swap mapping.
+ * User space cannot access the page anymore since we
+ * removed the ptes. Now check if the kernel still has
+ * pending references.
+ */
+ if (page_count(page) != nr_refs)
+ return -EAGAIN;
+
+ /* We are holding the only remaining reference */
+ newpage->index = page->index;
+ newpage->mapping = page->mapping;
+ return 0;
+ }
+
+ /*
+ * The page has a mapping that we need to change
+ */
write_lock_irq(&mapping->tree_lock);
radix_pointer = (struct page **)radix_tree_lookup_slot(
&mapping->page_tree, page_index(page));
@@ -194,10 +297,13 @@ int migrate_page_remove_references(struc
get_page(newpage);
newpage->index = page->index;
newpage->mapping = page->mapping;
+
+#ifdef CONFIG_SWAP
if (PageSwapCache(page)) {
SetPageSwapCache(newpage);
set_page_private(newpage, page_private(page));
}
+#endif
*radix_pointer = newpage;
__put_page(page);
@@ -232,7 +338,9 @@ void migrate_page_copy(struct page *newp
set_page_dirty(newpage);
}
+#ifdef CONFIG_SWAP
ClearPageSwapCache(page);
+#endif
ClearPageActive(page);
ClearPagePrivate(page);
set_page_private(page, 0);
@@ -259,22 +367,16 @@ int migrate_page(struct page *newpage, s
BUG_ON(PageWriteback(page)); /* Writeback must be complete */
- rc = migrate_page_remove_references(newpage, page, 2);
+ rc = migrate_page_remove_references(newpage, page,
+ page_mapping(page) ? 2 : 1);
- if (rc)
+ if (rc) {
+ remove_migration_ptes(page, page);
return rc;
+ }
migrate_page_copy(newpage, page);
-
- /*
- * Remove auxiliary swap entries and replace
- * them with real ptes.
- *
- * Note that a real pte entry will allow processes that are not
- * waiting on the page lock to use the new page via the page tables
- * before the new page is unlocked.
- */
- remove_from_swap(newpage);
+ remove_migration_ptes(page, newpage);
return 0;
}
EXPORT_SYMBOL(migrate_page);
@@ -356,9 +458,11 @@ redo:
* Try to migrate the page.
*/
mapping = page_mapping(page);
- if (!mapping)
+ if (!mapping) {
+ rc = migrate_page(newpage, page);
goto unlock_both;
+ } else
if (mapping->a_ops->migratepage) {
/*
* Most pages have a mapping and most filesystems
Index: linux-2.6.17-rc1-mm2/mm/Kconfig
===================================================================
--- linux-2.6.17-rc1-mm2.orig/mm/Kconfig 2006-04-02 20:22:10.000000000 -0700
+++ linux-2.6.17-rc1-mm2/mm/Kconfig 2006-04-13 15:58:56.000000000 -0700
@@ -138,8 +138,8 @@ config SPLIT_PTLOCK_CPUS
#
config MIGRATION
bool "Page migration"
- def_bool y if NUMA
- depends on SWAP && NUMA
+ def_bool y
+ depends on NUMA
help
Allows the migration of the physical location of pages of processes
while the virtual addresses are not changed. This is useful for