linux-mm.kvack.org archive mirror
* [PATCH 0/4] shmem: radix-tree cleanups and swapoff optimizations
@ 2012-02-10 19:42 Konstantin Khlebnikov
  2012-02-10 19:42 ` [PATCH 1/4] shmem: simplify shmem_unlock_mapping Konstantin Khlebnikov
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Konstantin Khlebnikov @ 2012-02-10 19:42 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, Hugh Dickins, linux-kernel; +Cc: Linus Torvalds

Here are some shmem patches related to the radix-tree iterator patchset.
They clean up radix-tree usage in shmem and notably optimize the swapoff
operation. The last patch is slightly off-topic, but it shares test results
with the previous patch.

---

Konstantin Khlebnikov (4):
      shmem: simplify shmem_unlock_mapping
      shmem: tag swap entries in radix tree
      shmem: use radix-tree iterator in shmem_unuse_inode()
      mm: use swap readahead at swapoff


 include/linux/radix-tree.h |    1 
 lib/radix-tree.c           |   93 --------------------------------------------
 mm/shmem.c                 |   60 ++++++++++++++++++++--------
 mm/swapfile.c              |    3 -
 4 files changed, 44 insertions(+), 113 deletions(-)



* [PATCH 1/4] shmem: simplify shmem_unlock_mapping
  2012-02-10 19:42 [PATCH 0/4] shmem: radix-tree cleanups and swapoff optimizations Konstantin Khlebnikov
@ 2012-02-10 19:42 ` Konstantin Khlebnikov
  2012-02-10 19:42 ` [PATCH 2/4] shmem: tag swap entries in radix tree Konstantin Khlebnikov
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Konstantin Khlebnikov @ 2012-02-10 19:42 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, Hugh Dickins, linux-kernel; +Cc: Linus Torvalds

find_get_pages() can now skip an unlimited number of exceptional entries,
so shmem_find_get_pages_and_swap() is no longer required here.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
 mm/shmem.c |   12 ++----------
 1 files changed, 2 insertions(+), 10 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 269d049..4af8e85 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -397,7 +397,6 @@ static void shmem_deswap_pagevec(struct pagevec *pvec)
 void shmem_unlock_mapping(struct address_space *mapping)
 {
 	struct pagevec pvec;
-	pgoff_t indices[PAGEVEC_SIZE];
 	pgoff_t index = 0;
 
 	pagevec_init(&pvec, 0);
@@ -405,16 +404,9 @@ void shmem_unlock_mapping(struct address_space *mapping)
 	 * Minor point, but we might as well stop if someone else SHM_LOCKs it.
 	 */
 	while (!mapping_unevictable(mapping)) {
-		/*
-		 * Avoid pagevec_lookup(): find_get_pages() returns 0 as if it
-		 * has finished, if it hits a row of PAGEVEC_SIZE swap entries.
-		 */
-		pvec.nr = shmem_find_get_pages_and_swap(mapping, index,
-					PAGEVEC_SIZE, pvec.pages, indices);
-		if (!pvec.nr)
+		if (!pagevec_lookup(&pvec, mapping, index, PAGEVEC_SIZE))
 			break;
-		index = indices[pvec.nr - 1] + 1;
-		shmem_deswap_pagevec(&pvec);
+		index = pvec.pages[pvec.nr - 1]->index + 1;
 		check_move_unevictable_pages(pvec.pages, pvec.nr);
 		pagevec_release(&pvec);
 		cond_resched();
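
For reference, here is how the whole loop reads once this hunk is applied
(reconstructed from the diff above; 3.3-era pagevec API):

void shmem_unlock_mapping(struct address_space *mapping)
{
	struct pagevec pvec;
	pgoff_t index = 0;

	pagevec_init(&pvec, 0);
	/*
	 * Minor point, but we might as well stop if someone else SHM_LOCKs it.
	 */
	while (!mapping_unevictable(mapping)) {
		/* find_get_pages() now skips swap entries by itself */
		if (!pagevec_lookup(&pvec, mapping, index, PAGEVEC_SIZE))
			break;
		/* resume the scan right after the last page returned */
		index = pvec.pages[pvec.nr - 1]->index + 1;
		check_move_unevictable_pages(pvec.pages, pvec.nr);
		pagevec_release(&pvec);
		cond_resched();
	}
}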


* [PATCH 2/4] shmem: tag swap entries in radix tree
  2012-02-10 19:42 [PATCH 0/4] shmem: radix-tree cleanups and swapoff optimizations Konstantin Khlebnikov
  2012-02-10 19:42 ` [PATCH 1/4] shmem: simplify shmem_unlock_mapping Konstantin Khlebnikov
@ 2012-02-10 19:42 ` Konstantin Khlebnikov
  2012-02-10 19:42 ` [PATCH 3/4] shmem: use radix-tree iterator in shmem_unuse_inode() Konstantin Khlebnikov
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Konstantin Khlebnikov @ 2012-02-10 19:42 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, Hugh Dickins, linux-kernel; +Cc: Linus Torvalds

Shmem does not use any radix-tree tags so far. Let's use one of them to
mark the swap entries stored in the radix tree as exceptional entries.
This allows us to simplify and speed up the truncate and swapoff operations.

Also put the tag manipulation, shmem_unuse(), shmem_unuse_inode() and
shmem_writepage() under CONFIG_SWAP: they are useless without swap.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
 mm/shmem.c |   21 +++++++++++++++++++--
 1 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 4af8e85..b8e5f90 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -76,6 +76,9 @@ static struct vfsmount *shm_mnt;
 /* Symlink up to this size is kmalloc'ed instead of using a swappable page */
 #define SHORT_SYMLINK_LEN 128
 
+/* Radix-tree tag for swap-entries */
+#define SHMEM_TAG_SWAP		0
+
 struct shmem_xattr {
 	struct list_head list;	/* anchored by shmem_inode_info->xattr_list */
 	char *name;		/* xattr name */
@@ -239,9 +242,17 @@ static int shmem_radix_tree_replace(struct address_space *mapping,
 							&mapping->tree_lock);
 	if (item != expected)
 		return -ENOENT;
-	if (replacement)
+	if (replacement) {
+#ifdef CONFIG_SWAP
+		if (radix_tree_exceptional_entry(replacement))
+			radix_tree_tag_set(&mapping->page_tree,
+					index, SHMEM_TAG_SWAP);
+		else if (radix_tree_exceptional_entry(expected))
+			radix_tree_tag_clear(&mapping->page_tree,
+					index, SHMEM_TAG_SWAP);
+#endif
 		radix_tree_replace_slot(pslot, replacement);
-	else
+	} else
 		radix_tree_delete(&mapping->page_tree, index);
 	return 0;
 }
@@ -592,6 +603,8 @@ static void shmem_evict_inode(struct inode *inode)
 	end_writeback(inode);
 }
 
+#ifdef CONFIG_SWAP
+
 /*
  * If swap found in inode, free it and move page from swapcache to filecache.
  */
@@ -760,6 +773,8 @@ redirty:
 	return 0;
 }
 
+#endif /* CONFIG_SWAP */
+
 #ifdef CONFIG_NUMA
 #ifdef CONFIG_TMPFS
 static void shmem_show_mpol(struct seq_file *seq, struct mempolicy *mpol)
@@ -2281,7 +2296,9 @@ static void shmem_destroy_inodecache(void)
 }
 
 static const struct address_space_operations shmem_aops = {
+#ifdef CONFIG_SWAP
 	.writepage	= shmem_writepage,
+#endif
 	.set_page_dirty	= __set_page_dirty_no_writeback,
 #ifdef CONFIG_TMPFS
 	.write_begin	= shmem_write_begin,
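
A minimal standalone sketch (illustration only, not part of the patch) of
the invariant the replace path above maintains: a slot is tagged
SHMEM_TAG_SWAP exactly when it holds an exceptional swap entry, which is
what lets later patches scan swap entries without visiting page slots.
The helper name is hypothetical; the caller is assumed to hold
mapping->tree_lock, as shmem_radix_tree_replace() does:

static void shmem_sync_swap_tag(struct radix_tree_root *root,
				pgoff_t index, void *entry)
{
	if (radix_tree_exceptional_entry(entry))
		radix_tree_tag_set(root, index, SHMEM_TAG_SWAP);
	else
		radix_tree_tag_clear(root, index, SHMEM_TAG_SWAP);
}

Note that the delete path needs no explicit tag clearing:
radix_tree_delete() drops the tags of the slot it removes.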


* [PATCH 3/4] shmem: use radix-tree iterator in shmem_unuse_inode()
  2012-02-10 19:42 [PATCH 0/4] shmem: radix-tree cleanups and swapoff optimizations Konstantin Khlebnikov
  2012-02-10 19:42 ` [PATCH 1/4] shmem: simplify shmem_unlock_mapping Konstantin Khlebnikov
  2012-02-10 19:42 ` [PATCH 2/4] shmem: tag swap entries in radix tree Konstantin Khlebnikov
@ 2012-02-10 19:42 ` Konstantin Khlebnikov
  2012-02-10 19:42 ` [PATCH 4/4] mm: use swap readahead at swapoff Konstantin Khlebnikov
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Konstantin Khlebnikov @ 2012-02-10 19:42 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, Hugh Dickins, linux-kernel; +Cc: Linus Torvalds

This patch rewrites the shmem swap-entry search to use the radix-tree
iterator, and removes radix_tree_locate_item(), which was used only by
shmem. Iterating over tagged entries skips normal pages far more
effectively.

Test: push a 1Gb tmpfs file into swap and run # time swapoff.
Virtual machine: 35 seconds without the patch, 7 seconds with it.
Real hardware: 180 seconds without the patch, 100 seconds with it.
(VM: qemu, swap on SSD, mostly cached in host RAM; real hardware: swap on HDD.)

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
 include/linux/radix-tree.h |    1 
 lib/radix-tree.c           |   93 --------------------------------------------
 mm/shmem.c                 |   27 ++++++++++---
 3 files changed, 22 insertions(+), 99 deletions(-)

diff --git a/include/linux/radix-tree.h b/include/linux/radix-tree.h
index f59d6c8..665396f 100644
--- a/include/linux/radix-tree.h
+++ b/include/linux/radix-tree.h
@@ -250,7 +250,6 @@ unsigned long radix_tree_range_tag_if_tagged(struct radix_tree_root *root,
 		unsigned long nr_to_tag,
 		unsigned int fromtag, unsigned int totag);
 int radix_tree_tagged(struct radix_tree_root *root, unsigned int tag);
-unsigned long radix_tree_locate_item(struct radix_tree_root *root, void *item);
 
 static inline void radix_tree_preload_end(void)
 {
diff --git a/lib/radix-tree.c b/lib/radix-tree.c
index b0158a2..d193d27 100644
--- a/lib/radix-tree.c
+++ b/lib/radix-tree.c
@@ -1120,99 +1120,6 @@ radix_tree_gang_lookup_tag_slot(struct radix_tree_root *root, void ***results,
 }
 EXPORT_SYMBOL(radix_tree_gang_lookup_tag_slot);
 
-#if defined(CONFIG_SHMEM) && defined(CONFIG_SWAP)
-#include <linux/sched.h> /* for cond_resched() */
-
-/*
- * This linear search is at present only useful to shmem_unuse_inode().
- */
-static unsigned long __locate(struct radix_tree_node *slot, void *item,
-			      unsigned long index, unsigned long *found_index)
-{
-	unsigned int shift, height;
-	unsigned long i;
-
-	height = slot->height;
-	shift = (height-1) * RADIX_TREE_MAP_SHIFT;
-
-	for ( ; height > 1; height--) {
-		i = (index >> shift) & RADIX_TREE_MAP_MASK;
-		for (;;) {
-			if (slot->slots[i] != NULL)
-				break;
-			index &= ~((1UL << shift) - 1);
-			index += 1UL << shift;
-			if (index == 0)
-				goto out;	/* 32-bit wraparound */
-			i++;
-			if (i == RADIX_TREE_MAP_SIZE)
-				goto out;
-		}
-
-		shift -= RADIX_TREE_MAP_SHIFT;
-		slot = rcu_dereference_raw(slot->slots[i]);
-		if (slot == NULL)
-			goto out;
-	}
-
-	/* Bottom level: check items */
-	for (i = 0; i < RADIX_TREE_MAP_SIZE; i++) {
-		if (slot->slots[i] == item) {
-			*found_index = index + i;
-			index = 0;
-			goto out;
-		}
-	}
-	index += RADIX_TREE_MAP_SIZE;
-out:
-	return index;
-}
-
-/**
- *	radix_tree_locate_item - search through radix tree for item
- *	@root:		radix tree root
- *	@item:		item to be found
- *
- *	Returns index where item was found, or -1 if not found.
- *	Caller must hold no lock (since this time-consuming function needs
- *	to be preemptible), and must check afterwards if item is still there.
- */
-unsigned long radix_tree_locate_item(struct radix_tree_root *root, void *item)
-{
-	struct radix_tree_node *node;
-	unsigned long max_index;
-	unsigned long cur_index = 0;
-	unsigned long found_index = -1;
-
-	do {
-		rcu_read_lock();
-		node = rcu_dereference_raw(root->rnode);
-		if (!radix_tree_is_indirect_ptr(node)) {
-			rcu_read_unlock();
-			if (node == item)
-				found_index = 0;
-			break;
-		}
-
-		node = indirect_to_ptr(node);
-		max_index = radix_tree_maxindex(node->height);
-		if (cur_index > max_index)
-			break;
-
-		cur_index = __locate(node, item, cur_index, &found_index);
-		rcu_read_unlock();
-		cond_resched();
-	} while (cur_index != 0 && cur_index <= max_index);
-
-	return found_index;
-}
-#else
-unsigned long radix_tree_locate_item(struct radix_tree_root *root, void *item)
-{
-	return -1;
-}
-#endif /* CONFIG_SHMEM && CONFIG_SWAP */
-
 /**
  *	radix_tree_shrink    -    shrink height of a radix tree to minimal
  *	@root		radix tree root
diff --git a/mm/shmem.c b/mm/shmem.c
index b8e5f90..7a3fe08 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -613,14 +613,31 @@ static int shmem_unuse_inode(struct shmem_inode_info *info,
 {
 	struct address_space *mapping = info->vfs_inode.i_mapping;
 	void *radswap;
-	pgoff_t index;
+	struct radix_tree_iter iter;
+	void **slot;
 	int error;
 
 	radswap = swp_to_radix_entry(swap);
-	index = radix_tree_locate_item(&mapping->page_tree, radswap);
-	if (index == -1)
-		return 0;
 
+	rcu_read_lock();
+	radix_tree_for_each_chunk(slot, &mapping->page_tree, &iter, 0,
+				RADIX_TREE_ITER_TAGGED | SHMEM_TAG_SWAP) {
+		radix_tree_for_each_chunk_slot(slot, &iter,
+						RADIX_TREE_ITER_TAGGED) {
+			if (*slot != radswap)
+				continue;
+			rcu_read_unlock();
+			goto found;
+		}
+		rcu_read_unlock();
+		cond_resched();
+		rcu_read_lock();
+	}
+	rcu_read_unlock();
+
+	return 0;
+
+found:
 	/*
 	 * Move _head_ to start search for next from here.
 	 * But be careful: shmem_evict_inode checks list_empty without taking
@@ -635,7 +652,7 @@ static int shmem_unuse_inode(struct shmem_inode_info *info,
 	 * but also to hold up shmem_evict_inode(): so inode cannot be freed
 	 * beneath us (pagelock doesn't help until the page is in pagecache).
 	 */
-	error = shmem_add_to_page_cache(page, mapping, index,
+	error = shmem_add_to_page_cache(page, mapping, iter.index,
 						GFP_NOWAIT, radswap);
 	/* which does mem_cgroup_uncharge_cache_page on error */
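
Abstracted from the hunk above, the search pattern is: visit only the
slots tagged SHMEM_TAG_SWAP, chunk by chunk, and drop the RCU lock
between chunks so a long scan stays preemptible. A sketch assuming the
radix-tree iterator patchset this series builds on (function name
hypothetical):

static pgoff_t shmem_find_swap_sketch(struct address_space *mapping,
				      void *radswap)
{
	struct radix_tree_iter iter;
	void **slot;
	pgoff_t index = -1;	/* same "not found" convention as the
				   old radix_tree_locate_item() */

	rcu_read_lock();
	radix_tree_for_each_chunk(slot, &mapping->page_tree, &iter, 0,
				RADIX_TREE_ITER_TAGGED | SHMEM_TAG_SWAP) {
		radix_tree_for_each_chunk_slot(slot, &iter,
						RADIX_TREE_ITER_TAGGED) {
			if (*slot == radswap) {
				index = iter.index;
				goto out;
			}
		}
		/* breathe between chunks: the mapping may be huge */
		rcu_read_unlock();
		cond_resched();
		rcu_read_lock();
	}
out:
	rcu_read_unlock();
	return index;
}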
 


* [PATCH 4/4] mm: use swap readahead at swapoff
  2012-02-10 19:42 [PATCH 0/4] shmem: radix-tree cleanups and swapoff optimizations Konstantin Khlebnikov
                   ` (2 preceding siblings ...)
  2012-02-10 19:42 ` [PATCH 3/4] shmem: use radix-tree iterator in shmem_unuse_inode() Konstantin Khlebnikov
@ 2012-02-10 19:42 ` Konstantin Khlebnikov
  2012-02-11  7:43 ` [PATCH 5/4] shmem: put shmem_delete_from_page_cache under CONFIG_SWAP Konstantin Khlebnikov
  2012-02-11  7:44 ` [PATCH 6/4] shmem: simplify shmem_truncate_range Konstantin Khlebnikov
  5 siblings, 0 replies; 7+ messages in thread
From: Konstantin Khlebnikov @ 2012-02-10 19:42 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, Hugh Dickins, linux-kernel; +Cc: Linus Torvalds

try_to_unuse() iterates over swap entries sequentially, so readahead
here will not hurt.

Test results (on top of the previous patch):
Virtual machine: 7 seconds without the patch, 4 seconds with it.
Real hardware: 100 seconds without the patch, 70 seconds with it.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
 mm/swapfile.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index d999f09..4c99689 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1106,8 +1106,7 @@ static int try_to_unuse(unsigned int type)
 		 */
 		swap_map = &si->swap_map[i];
 		entry = swp_entry(type, i);
-		page = read_swap_cache_async(entry,
-					GFP_HIGHUSER_MOVABLE, NULL, 0);
+		page = swapin_readahead(entry, GFP_HIGHUSER_MOVABLE, NULL, 0);
 		if (!page) {
 			/*
 			 * Either swap_duplicate() failed because entry
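
A simplified sketch (not the kernel's actual loop, which advances via
find_next_to_unuse()) of the access pattern that makes readahead a win
here: swap offsets are visited in increasing order, so the cluster that
swapin_readahead() pulls in around offset i is exactly what the next few
iterations will ask for. Function name is hypothetical:

static void unuse_walk_sketch(struct swap_info_struct *si, unsigned int type)
{
	unsigned int i;

	for (i = 1; i < si->max; i++) {
		struct page *page;

		if (!si->swap_map[i])
			continue;	/* slot is free, nothing to read */
		/* also primes the swap cache for offsets i+1, i+2, ... */
		page = swapin_readahead(swp_entry(type, i),
					GFP_HIGHUSER_MOVABLE, NULL, 0);
		if (page)
			page_cache_release(page);
	}
}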


* [PATCH 5/4] shmem: put shmem_delete_from_page_cache under CONFIG_SWAP
  2012-02-10 19:42 [PATCH 0/4] shmem: radix-tree cleanups and swapoff optimizations Konstantin Khlebnikov
                   ` (3 preceding siblings ...)
  2012-02-10 19:42 ` [PATCH 4/4] mm: use swap readahead at swapoff Konstantin Khlebnikov
@ 2012-02-11  7:43 ` Konstantin Khlebnikov
  2012-02-11  7:44 ` [PATCH 6/4] shmem: simplify shmem_truncate_range Konstantin Khlebnikov
  5 siblings, 0 replies; 7+ messages in thread
From: Konstantin Khlebnikov @ 2012-02-11  7:43 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, Hugh Dickins, linux-kernel; +Cc: Linus Torvalds

Fix the warning introduced by the patch "shmem: tag swap entries in radix
tree": with CONFIG_SWAP disabled, shmem_delete_from_page_cache() is left
without callers, so move it into the CONFIG_SWAP section.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
 mm/shmem.c |   38 +++++++++++++++++++-------------------
 1 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 7a3fe08..709e3d8 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -302,25 +302,6 @@ static int shmem_add_to_page_cache(struct page *page,
 }
 
 /*
- * Like delete_from_page_cache, but substitutes swap for page.
- */
-static void shmem_delete_from_page_cache(struct page *page, void *radswap)
-{
-	struct address_space *mapping = page->mapping;
-	int error;
-
-	spin_lock_irq(&mapping->tree_lock);
-	error = shmem_radix_tree_replace(mapping, page->index, page, radswap);
-	page->mapping = NULL;
-	mapping->nrpages--;
-	__dec_zone_page_state(page, NR_FILE_PAGES);
-	__dec_zone_page_state(page, NR_SHMEM);
-	spin_unlock_irq(&mapping->tree_lock);
-	page_cache_release(page);
-	BUG_ON(error);
-}
-
-/*
  * Like find_get_pages, but collecting swap entries as well as pages.
  */
 static unsigned shmem_find_get_pages_and_swap(struct address_space *mapping,
@@ -718,6 +699,25 @@ out:
 }
 
 /*
+ * Like delete_from_page_cache, but substitutes swap for page.
+ */
+static void shmem_delete_from_page_cache(struct page *page, void *radswap)
+{
+	struct address_space *mapping = page->mapping;
+	int error;
+
+	spin_lock_irq(&mapping->tree_lock);
+	error = shmem_radix_tree_replace(mapping, page->index, page, radswap);
+	page->mapping = NULL;
+	mapping->nrpages--;
+	__dec_zone_page_state(page, NR_FILE_PAGES);
+	__dec_zone_page_state(page, NR_SHMEM);
+	spin_unlock_irq(&mapping->tree_lock);
+	page_cache_release(page);
+	BUG_ON(error);
+}
+
+/*
  * Move the page from the page cache to the swap cache.
  */
 static int shmem_writepage(struct page *page, struct writeback_control *wbc)


* [PATCH 6/4] shmem: simplify shmem_truncate_range
  2012-02-10 19:42 [PATCH 0/4] shmem: radix-tree cleanups and swapoff optimizations Konstantin Khlebnikov
                   ` (4 preceding siblings ...)
  2012-02-11  7:43 ` [PATCH 5/4] shmem: put shmem_delete_from_page_cache under CONFIG_SWAP Konstantin Khlebnikov
@ 2012-02-11  7:44 ` Konstantin Khlebnikov
  5 siblings, 0 replies; 7+ messages in thread
From: Konstantin Khlebnikov @ 2012-02-11  7:44 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, Hugh Dickins, linux-kernel; +Cc: Linus Torvalds

find_get_pages() can now skip an unlimited number of exceptional entries,
so truncate_inode_pages_range() can truncate pages from shmem inodes.
Thus shmem_truncate_range() can be simplified.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
 mm/shmem.c |  199 +++++++++++++++---------------------------------------------
 1 files changed, 49 insertions(+), 150 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 709e3d8..b981fa9 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -302,57 +302,6 @@ static int shmem_add_to_page_cache(struct page *page,
 }
 
 /*
- * Like find_get_pages, but collecting swap entries as well as pages.
- */
-static unsigned shmem_find_get_pages_and_swap(struct address_space *mapping,
-					pgoff_t start, unsigned int nr_pages,
-					struct page **pages, pgoff_t *indices)
-{
-	unsigned int i;
-	unsigned int ret;
-	unsigned int nr_found;
-
-	rcu_read_lock();
-restart:
-	nr_found = radix_tree_gang_lookup_slot(&mapping->page_tree,
-				(void ***)pages, indices, start, nr_pages);
-	ret = 0;
-	for (i = 0; i < nr_found; i++) {
-		struct page *page;
-repeat:
-		page = radix_tree_deref_slot((void **)pages[i]);
-		if (unlikely(!page))
-			continue;
-		if (radix_tree_exception(page)) {
-			if (radix_tree_deref_retry(page))
-				goto restart;
-			/*
-			 * Otherwise, we must be storing a swap entry
-			 * here as an exceptional entry: so return it
-			 * without attempting to raise page count.
-			 */
-			goto export;
-		}
-		if (!page_cache_get_speculative(page))
-			goto repeat;
-
-		/* Has the page moved? */
-		if (unlikely(page != *((void **)pages[i]))) {
-			page_cache_release(page);
-			goto repeat;
-		}
-export:
-		indices[ret] = indices[i];
-		pages[ret] = page;
-		ret++;
-	}
-	if (unlikely(!ret && nr_found))
-		goto restart;
-	rcu_read_unlock();
-	return ret;
-}
-
-/*
  * Remove swap entry from radix tree, free the swap and its page cache.
  */
 static int shmem_free_swap(struct address_space *mapping,
@@ -369,21 +318,6 @@ static int shmem_free_swap(struct address_space *mapping,
 }
 
 /*
- * Pagevec may contain swap entries, so shuffle up pages before releasing.
- */
-static void shmem_deswap_pagevec(struct pagevec *pvec)
-{
-	int i, j;
-
-	for (i = 0, j = 0; i < pagevec_count(pvec); i++) {
-		struct page *page = pvec->pages[i];
-		if (!radix_tree_exceptional_entry(page))
-			pvec->pages[j++] = page;
-	}
-	pvec->nr = j;
-}
-
-/*
  * SysV IPC SHM_UNLOCK restore Unevictable pages to their evictable lists.
  */
 void shmem_unlock_mapping(struct address_space *mapping)
@@ -406,6 +340,50 @@ void shmem_unlock_mapping(struct address_space *mapping)
 }
 
 /*
+ * Remove range of swap entries from radix tree, and free them.
+ */
+static long shmem_truncate_swap_range(struct address_space *mapping,
+				      pgoff_t start, pgoff_t end)
+{
+	struct radix_tree_iter iter;
+	void **slot, *data, *radswaps[PAGEVEC_SIZE];
+	unsigned long indices[PAGEVEC_SIZE];
+	long nr_swaps_freed = 0;
+	int i, nr;
+
+next:
+	rcu_read_lock();
+	nr = 0;
+restart:
+	radix_tree_for_each_tagged(slot, &mapping->page_tree, &iter,
+						start, SHMEM_TAG_SWAP) {
+		if (iter.index > end)
+			break;
+		data = radix_tree_deref_slot(slot);
+		if (!data || !radix_tree_exception(data))
+			continue;
+		if (radix_tree_deref_retry(data))
+			goto restart;
+		radswaps[nr] = data;
+		indices[nr] = iter.index;
+		if (++nr == PAGEVEC_SIZE)
+			break;
+	}
+	rcu_read_unlock();
+
+	for (i = 0; i < nr; i++) {
+		if (!shmem_free_swap(mapping, indices[i], radswaps[i]))
+			nr_swaps_freed++;
+		start = indices[i] + 1;
+	}
+
+	if (nr == PAGEVEC_SIZE && start)
+		goto next;
+
+	return nr_swaps_freed;
+}
+
+/*
  * Remove range of pages and swap entries from radix tree, and free them.
  */
 void shmem_truncate_range(struct inode *inode, loff_t lstart, loff_t lend)
@@ -415,52 +393,11 @@ void shmem_truncate_range(struct inode *inode, loff_t lstart, loff_t lend)
 	pgoff_t start = (lstart + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
 	unsigned partial = lstart & (PAGE_CACHE_SIZE - 1);
 	pgoff_t end = (lend >> PAGE_CACHE_SHIFT);
-	struct pagevec pvec;
-	pgoff_t indices[PAGEVEC_SIZE];
 	long nr_swaps_freed = 0;
-	pgoff_t index;
-	int i;
 
 	BUG_ON((lend & (PAGE_CACHE_SIZE - 1)) != (PAGE_CACHE_SIZE - 1));
 
-	pagevec_init(&pvec, 0);
-	index = start;
-	while (index <= end) {
-		pvec.nr = shmem_find_get_pages_and_swap(mapping, index,
-			min(end - index, (pgoff_t)PAGEVEC_SIZE - 1) + 1,
-							pvec.pages, indices);
-		if (!pvec.nr)
-			break;
-		mem_cgroup_uncharge_start();
-		for (i = 0; i < pagevec_count(&pvec); i++) {
-			struct page *page = pvec.pages[i];
-
-			index = indices[i];
-			if (index > end)
-				break;
-
-			if (radix_tree_exceptional_entry(page)) {
-				nr_swaps_freed += !shmem_free_swap(mapping,
-								index, page);
-				continue;
-			}
-
-			if (!trylock_page(page))
-				continue;
-			if (page->mapping == mapping) {
-				VM_BUG_ON(PageWriteback(page));
-				truncate_inode_page(mapping, page);
-			}
-			unlock_page(page);
-		}
-		shmem_deswap_pagevec(&pvec);
-		pagevec_release(&pvec);
-		mem_cgroup_uncharge_end();
-		cond_resched();
-		index++;
-	}
-
-	if (partial) {
+	if (IS_ENABLED(CONFIG_SWAP) && partial) {
 		struct page *page = NULL;
 		shmem_getpage(inode, start - 1, &page, SGP_READ, NULL);
 		if (page) {
@@ -469,51 +406,13 @@ void shmem_truncate_range(struct inode *inode, loff_t lstart, loff_t lend)
 			unlock_page(page);
 			page_cache_release(page);
 		}
+		lstart += PAGE_CACHE_SIZE - partial;
 	}
 
-	index = start;
-	for ( ; ; ) {
-		cond_resched();
-		pvec.nr = shmem_find_get_pages_and_swap(mapping, index,
-			min(end - index, (pgoff_t)PAGEVEC_SIZE - 1) + 1,
-							pvec.pages, indices);
-		if (!pvec.nr) {
-			if (index == start)
-				break;
-			index = start;
-			continue;
-		}
-		if (index == start && indices[0] > end) {
-			shmem_deswap_pagevec(&pvec);
-			pagevec_release(&pvec);
-			break;
-		}
-		mem_cgroup_uncharge_start();
-		for (i = 0; i < pagevec_count(&pvec); i++) {
-			struct page *page = pvec.pages[i];
-
-			index = indices[i];
-			if (index > end)
-				break;
+	truncate_inode_pages_range(mapping, lstart, lend);
 
-			if (radix_tree_exceptional_entry(page)) {
-				nr_swaps_freed += !shmem_free_swap(mapping,
-								index, page);
-				continue;
-			}
-
-			lock_page(page);
-			if (page->mapping == mapping) {
-				VM_BUG_ON(PageWriteback(page));
-				truncate_inode_page(mapping, page);
-			}
-			unlock_page(page);
-		}
-		shmem_deswap_pagevec(&pvec);
-		pagevec_release(&pvec);
-		mem_cgroup_uncharge_end();
-		index++;
-	}
+	if (IS_ENABLED(CONFIG_SWAP))
+		nr_swaps_freed = shmem_truncate_swap_range(mapping, start, end);
 
 	spin_lock(&info->lock);
 	info->swapped -= nr_swaps_freed;
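
Pieced together from the hunk context above (plus the surrounding shmem.c
of that era), the resulting function reads roughly as follows; the
partial-page zeroing and the tail of the locked accounting are elided by
the diff and kept as comments here:

void shmem_truncate_range(struct inode *inode, loff_t lstart, loff_t lend)
{
	struct address_space *mapping = inode->i_mapping;
	struct shmem_inode_info *info = SHMEM_I(inode);
	pgoff_t start = (lstart + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
	unsigned partial = lstart & (PAGE_CACHE_SIZE - 1);
	pgoff_t end = (lend >> PAGE_CACHE_SHIFT);
	long nr_swaps_freed = 0;

	BUG_ON((lend & (PAGE_CACHE_SIZE - 1)) != (PAGE_CACHE_SIZE - 1));

	if (IS_ENABLED(CONFIG_SWAP) && partial) {
		struct page *page = NULL;
		shmem_getpage(inode, start - 1, &page, SGP_READ, NULL);
		if (page) {
			/* ... zero the tail of the partial page ... */
			unlock_page(page);
			page_cache_release(page);
		}
		/* round lstart up so the generic code skips this page */
		lstart += PAGE_CACHE_SIZE - partial;
	}

	/* generic truncation copes with exceptional entries now */
	truncate_inode_pages_range(mapping, lstart, lend);

	/* the tagged walk frees the swap entries in [start, end] */
	if (IS_ENABLED(CONFIG_SWAP))
		nr_swaps_freed = shmem_truncate_swap_range(mapping, start, end);

	spin_lock(&info->lock);
	info->swapped -= nr_swaps_freed;
	/* ... recalc inode accounting and unlock ... */
}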

