From: Christoph Lameter <clameter@sgi.com>
To: akpm@osdl.org
Cc: Hugh Dickins <hugh@veritas.com>,
	linux-kernel@vger.kernel.org,
	Lee Schermerhorn <lee.schermerhorn@hp.com>,
	linux-mm@kvack.org, Christoph Lameter <clameter@sgi.com>,
	Hirokazu Takahashi <taka@valinux.co.jp>,
	Marcelo Tosatti <marcelo.tosatti@cyclades.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Subject: [PATCH 5/5] Swapless V2: Revise main migration logic
Date: Thu, 13 Apr 2006 16:54:32 -0700 (PDT)
Message-ID: <20060413235432.15398.23912.sendpatchset@schroedinger.engr.sgi.com>
In-Reply-To: <20060413235406.15398.42233.sendpatchset@schroedinger.engr.sgi.com>

Use the migration entries for page migration

This modifies the migration code to use the new migration entries.
It now becomes possible to migrate anonymous pages without having to
add a swap entry.

We add two new functions, remove_migration_pte() and remove_migration_ptes(),
to replace migration entries with the proper ptes.
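
As a rough user-space illustration of that replacement step (not kernel code:
struct toy_page, struct toy_pte and toy_remove_migration_pte() are hypothetical
stand-ins for struct page, pte_t and remove_migration_pte()), the logic could
be sketched as follows. Only an entry that is a migration entry and refers to
the old page is rewritten to point at the new page:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only the counters the sketch needs. */
struct toy_page {
	int count;	/* models page_count() */
	int mapcount;	/* models page_mapcount() */
};

/* Hypothetical stand-in for a pte: empty, present, or a migration entry. */
enum toy_pte_kind { TOY_NONE, TOY_PRESENT, TOY_MIGRATION };

struct toy_pte {
	enum toy_pte_kind kind;
	struct toy_page *page;	/* mapped page, or the page under migration */
};

/*
 * Replace a migration entry that refers to old_page with a present
 * entry for new_page. Returns true if the entry was replaced.
 */
static bool toy_remove_migration_pte(struct toy_pte *pte,
		struct toy_page *old_page, struct toy_page *new_page)
{
	assert(pte != NULL && old_page != NULL && new_page != NULL);

	/* Same filter as the patch: skip anything that is not ours. */
	if (pte->kind != TOY_MIGRATION || pte->page != old_page)
		return false;

	new_page->count++;	/* models get_page(new) */
	new_page->mapcount++;	/* models page_add_anon_rmap(new, ...) */
	pte->kind = TOY_PRESENT;
	pte->page = new_page;
	return true;
}
```

Passing the same page as both old and new models the failure path, where the
entries are pointed back at the original page.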

Anonymous pages no longer have a swap mapping during migration, so we cannot
take the tree_lock for them anymore. However, we know that we hold the only
remaining reference to the page when the page count reaches 1.
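
The reference check can be sketched in user space as below. This is a toy
model under stated assumptions, not the kernel implementation: struct toy_refs
and both helpers are hypothetical names, and the counters stand in for
page_count(), page_mapcount() and page_mapping(). The expected count is 2 for
a page with a mapping (ours plus the radix tree's) and 1 for a page without.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical counters standing in for the struct page accessors. */
struct toy_refs {
	int count;		/* all references, including ours */
	int mapcount;		/* references held by page tables */
	bool has_mapping;	/* true if the page sits in a radix tree */
};

/* Models the "page_mapping(page) ? 2 : 1" argument used by migrate_page(). */
static int toy_expected_refs(const struct toy_refs *p)
{
	return p->has_mapping ? 2 : 1;
}

/* Returns 0 if the page may be migrated, -EAGAIN if it is still in use. */
static int toy_check_refs(const struct toy_refs *p)
{
	assert(p != NULL);

	if (p->mapcount != 0)
		return -EAGAIN;	/* not all ptes were removed */
	if (p->count != toy_expected_refs(p))
		return -EAGAIN;	/* someone else still holds a reference */
	return 0;
}
```

With no mapping, a count of 1 means the caller holds the only reference, so no
lock is needed to keep others away from the page.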

Signed-off-by: Christoph Lameter <clameter@sgi.com>

Index: linux-2.6.17-rc1-mm2/mm/migrate.c
===================================================================
--- linux-2.6.17-rc1-mm2.orig/mm/migrate.c	2006-04-13 15:58:54.000000000 -0700
+++ linux-2.6.17-rc1-mm2/mm/migrate.c	2006-04-13 16:36:28.000000000 -0700
@@ -15,6 +15,7 @@
 #include <linux/migrate.h>
 #include <linux/module.h>
 #include <linux/swap.h>
+#include <linux/swapops.h>
 #include <linux/pagemap.h>
 #include <linux/buffer_head.h>
 #include <linux/mm_inline.h>
@@ -23,7 +24,6 @@
 #include <linux/topology.h>
 #include <linux/cpu.h>
 #include <linux/cpuset.h>
-#include <linux/swapops.h>
 
 #include "internal.h"
 
@@ -115,6 +115,95 @@ int putback_lru_pages(struct list_head *
 	return count;
 }
 
+static inline int is_swap_pte(pte_t pte)
+{
+	return !pte_none(pte) && !pte_present(pte) && !pte_file(pte);
+}
+
+/*
+ * Restore a potential migration pte to a working pte entry for
+ * anonymous pages.
+ */
+static void remove_migration_pte(struct vm_area_struct *vma, unsigned long addr,
+		struct page *old, struct page *new)
+{
+	struct mm_struct *mm = vma->vm_mm;
+	swp_entry_t entry;
+	pgd_t *pgd;
+	pud_t *pud;
+	pmd_t *pmd;
+	pte_t *ptep, pte;
+	spinlock_t *ptl;
+
+	pgd = pgd_offset(mm, addr);
+	if (!pgd_present(*pgd))
+		return;
+
+	pud = pud_offset(pgd, addr);
+	if (!pud_present(*pud))
+		return;
+
+	pmd = pmd_offset(pud, addr);
+	if (!pmd_present(*pmd))
+		return;
+
+	ptep = pte_offset_map(pmd, addr);
+
+	if (!is_swap_pte(*ptep)) {
+		pte_unmap(ptep);
+		return;
+	}
+
+	ptl = pte_lockptr(mm, pmd);
+	spin_lock(ptl);
+	pte = *ptep;
+	if (!is_swap_pte(pte))
+		goto out;
+
+	entry = pte_to_swp_entry(pte);
+
+	if (!is_migration_entry(entry) || migration_entry_to_page(entry) != old)
+		goto out;
+
+	inc_mm_counter(mm, anon_rss);
+	get_page(new);
+	set_pte_at(mm, addr, ptep, pte_mkold(mk_pte(new, vma->vm_page_prot)));
+	page_add_anon_rmap(new, vma, addr);
+out:
+	pte_unmap_unlock(ptep, ptl);
+}
+
+/*
+ * Get rid of all migration entries and replace them by
+ * references to the indicated page.
+ *
+ * Must hold mmap_sem lock on at least one of the vmas containing
+ * the page so that the anon_vma cannot vanish.
+ */
+static void remove_migration_ptes(struct page *old, struct page *new)
+{
+	struct anon_vma *anon_vma;
+	struct vm_area_struct *vma;
+	unsigned long mapping;
+
+	mapping = (unsigned long)new->mapping;
+
+	if (!mapping || (mapping & PAGE_MAPPING_ANON) == 0)
+		return;
+
+	/*
+	 * We hold the mmap_sem lock. So no need to call page_lock_anon_vma.
+	 */
+	anon_vma = (struct anon_vma *) (mapping - PAGE_MAPPING_ANON);
+	spin_lock(&anon_vma->lock);
+
+	list_for_each_entry(vma, &anon_vma->head, anon_vma_node)
+		remove_migration_pte(vma, page_address_in_vma(new, vma),
+					old, new);
+
+	spin_unlock(&anon_vma->lock);
+}
+
 /*
  * Non migratable page
  */
@@ -125,8 +214,9 @@ int fail_migrate_page(struct page *newpa
 EXPORT_SYMBOL(fail_migrate_page);
 
 /*
- * Remove references for a page and establish the new page with the correct
- * basic settings to be able to stop accesses to the page.
+ * Remove or replace all references to a page so that future accesses to
+ * it can be blocked, and establish the new page with the basic settings
+ * needed to stop accesses to the page.
  */
 int migrate_page_remove_references(struct page *newpage,
 				struct page *page, int nr_refs)
@@ -139,38 +229,51 @@ int migrate_page_remove_references(struc
 	 * indicates that the page is in use or truncate has removed
 	 * the page.
 	 */
-	if (!mapping || page_mapcount(page) + nr_refs != page_count(page))
-		return -EAGAIN;
+	if (!page->mapping ||
+	    page_mapcount(page) + nr_refs != page_count(page))
+		return -EAGAIN;
 
 	/*
-	 * Establish swap ptes for anonymous pages or destroy pte
+	 * Establish migration ptes for anonymous pages or destroy pte
 	 * maps for files.
 	 *
 	 * In order to reestablish file backed mappings the fault handlers
 	 * will take the radix tree_lock which may then be used to stop
   	 * processses from accessing this page until the new page is ready.
 	 *
-	 * A process accessing via a swap pte (an anonymous page) will take a
-	 * page_lock on the old page which will block the process until the
-	 * migration attempt is complete. At that time the PageSwapCache bit
-	 * will be examined. If the page was migrated then the PageSwapCache
-	 * bit will be clear and the operation to retrieve the page will be
-	 * retried which will find the new page in the radix tree. Then a new
-	 * direct mapping may be generated based on the radix tree contents.
-	 *
-	 * If the page was not migrated then the PageSwapCache bit
-	 * is still set and the operation may continue.
+	 * A process accessing via a migration pte (an anonymous page) will
+	 * take a page_lock on the old page which will block the process
+	 * until the migration attempt is complete.
 	 */
 	if (try_to_unmap(page, 1) == SWAP_FAIL)
 		/* A vma has VM_LOCKED set -> permanent failure */
 		return -EPERM;
 
 	/*
-	 * Give up if we were unable to remove all mappings.
+	 * Retry if we were unable to remove all mappings.
 	 */
 	if (page_mapcount(page))
 		return -EAGAIN;
 
+	if (!mapping) {
+		/*
+		 * Anonymous page without swap mapping.
+		 * User space cannot access the page anymore since we
+		 * removed the ptes. Now check if the kernel still has
+		 * pending references.
+		 */
+		if (page_count(page) != nr_refs)
+			return -EAGAIN;
+
+		/* We are holding the only remaining reference */
+		newpage->index = page->index;
+		newpage->mapping = page->mapping;
+		return 0;
+	}
+
+	/*
+	 * The page has a mapping that we need to change
+	 */
 	write_lock_irq(&mapping->tree_lock);
 
 	radix_pointer = (struct page **)radix_tree_lookup_slot(
@@ -194,10 +297,13 @@ int migrate_page_remove_references(struc
 	get_page(newpage);
 	newpage->index = page->index;
 	newpage->mapping = page->mapping;
+
+#ifdef CONFIG_SWAP
 	if (PageSwapCache(page)) {
 		SetPageSwapCache(newpage);
 		set_page_private(newpage, page_private(page));
 	}
+#endif
 
 	*radix_pointer = newpage;
 	__put_page(page);
@@ -232,7 +338,9 @@ void migrate_page_copy(struct page *newp
 		set_page_dirty(newpage);
  	}
 
+#ifdef CONFIG_SWAP
 	ClearPageSwapCache(page);
+#endif
 	ClearPageActive(page);
 	ClearPagePrivate(page);
 	set_page_private(page, 0);
@@ -259,22 +367,16 @@ int migrate_page(struct page *newpage, s
 
 	BUG_ON(PageWriteback(page));	/* Writeback must be complete */
 
-	rc = migrate_page_remove_references(newpage, page, 2);
+	rc = migrate_page_remove_references(newpage, page,
+			page_mapping(page) ? 2 : 1);
 
-	if (rc)
+	if (rc) {
+		remove_migration_ptes(page, page);
 		return rc;
+	}
 
 	migrate_page_copy(newpage, page);
-
-	/*
-	 * Remove auxiliary swap entries and replace
-	 * them with real ptes.
-	 *
-	 * Note that a real pte entry will allow processes that are not
-	 * waiting on the page lock to use the new page via the page tables
-	 * before the new page is unlocked.
-	 */
-	remove_from_swap(newpage);
+	remove_migration_ptes(page, newpage);
 	return 0;
 }
 EXPORT_SYMBOL(migrate_page);
@@ -356,9 +458,11 @@ redo:
 		 * Try to migrate the page.
 		 */
 		mapping = page_mapping(page);
-		if (!mapping)
+		if (!mapping) {
+			rc = migrate_page(newpage, page);
 			goto unlock_both;
 
+		} else
 		if (mapping->a_ops->migratepage) {
 			/*
 			 * Most pages have a mapping and most filesystems
Index: linux-2.6.17-rc1-mm2/mm/Kconfig
===================================================================
--- linux-2.6.17-rc1-mm2.orig/mm/Kconfig	2006-04-02 20:22:10.000000000 -0700
+++ linux-2.6.17-rc1-mm2/mm/Kconfig	2006-04-13 15:58:56.000000000 -0700
@@ -138,8 +138,8 @@ config SPLIT_PTLOCK_CPUS
 #
 config MIGRATION
 	bool "Page migration"
-	def_bool y if NUMA
-	depends on SWAP && NUMA
+	def_bool y
+	depends on NUMA
 	help
 	  Allows the migration of the physical location of pages of processes
 	  while the virtual addresses are not changed. This is useful for


