From: Rik van Riel <riel@redhat.com>
To: linux-kernel@vger.kernel.org
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Lee Schermerhorn <lee.schermerhorn@hp.com>,
	Kosaki Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Subject: [PATCH -mm 16/16] SHM_LOCKED pages are nonreclaimable
Date: Fri, 23 May 2008 15:55:22 -0400
Message-ID: <20080523195535.917456536@redhat.com>
In-Reply-To: <20080523195506.084894989@redhat.com>

[-- Attachment #1: rvr-14-lts-noreclaim-SHM_LOCKED-pages-are-nonreclaimable.patch --]
[-- Type: text/plain, Size: 9408 bytes --]

From: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

While working with Nick Piggin's mlock patches, I noticed that
shmem segments locked via shmctl(SHM_LOCK) were not being handled.
SHM_LOCKed pages work like ramdisk pages--the writeback function
just redirties the page so that it can't be reclaimed.  Deal with
these using the same approach as for ramdisk pages.

Use the AS_NORECLAIM flag to mark address_space of SHM_LOCKed
shared memory regions as non-reclaimable.  Then these pages
will be culled off the normal LRU lists during vmscan.

Add a new wrapper function, mapping_clear_noreclaim(), to clear the
mapping's noreclaim state when/if the shared memory segment is
unlocked via shmctl(SHM_UNLOCK).

Add 'scan_mapping_noreclaim_pages()' to mm/vmscan.c to scan all
pages in the shmem segment's mapping [struct address_space] and
check them for reclaimability now that they're no longer locked.
Pages that have become reclaimable are moved to the appropriate
zone lru list.

Changes depend on [CONFIG_]NORECLAIM_LRU.

Signed-off-by:  Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by:  Rik van Riel <riel@redhat.com>

---
V2 -> V3:
+ rebase to 23-mm1 atop RvR's split LRU series.
+ Use scan_mapping_noreclaim_pages() on unlock.  See below.

V1 -> V2:
+  modify to use reworked 'scan_all_zones_noreclaim_pages()'
   See 'TODO' below - still pending.
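
The return value of shmem_lock() changes meaning with this patch, so
a small userspace model may help when reading the ipc/shm.c hunk.
As the diff reads, the function can hand back an ERR_PTR() on
failure, NULL after a successful lock, or the mapping after an
unlock so the caller can rescan it.  The sketch below is only an
illustration under assumptions: the ERR_PTR helpers are userspace
stand-ins for <linux/err.h>, and 'struct mapping', 'charge_ok',
model_shmem_lock() and model_shmctl() are hypothetical names that
do not appear in the patch.  It ignores the VM_LOCKED state checks
and the locking.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel's ERR_PTR helpers (assumption). */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/* Hypothetical stand-in for struct address_space. */
struct mapping {
	int noreclaim;		/* models the AS_NORECLAIM bit */
};

/*
 * Model of the three outcomes:
 *   lock fails (accounting refused) -> ERR_PTR(-ENOMEM)
 *   lock succeeds                   -> NULL, mapping marked noreclaim
 *   unlock                          -> the mapping, mark cleared
 * 'charge_ok' is a hypothetical flag standing in for user_shm_lock().
 */
static struct mapping *model_shmem_lock(struct mapping *m, int lock,
					int charge_ok)
{
	if (lock) {
		if (!charge_ok)
			return ERR_PTR(-ENOMEM);
		m->noreclaim = 1;
		return NULL;
	}
	m->noreclaim = 0;
	return m;
}

/*
 * Caller-side handling in the shape of the sys_shmctl() hunk.
 * Returns 0 or a negative errno; *rescanned is set where the real
 * code would call scan_mapping_noreclaim_pages().
 */
static int model_shmctl(struct mapping *m, int lock, int charge_ok,
			int *rescanned)
{
	struct mapping *ret = model_shmem_lock(m, lock, charge_ok);

	*rescanned = 0;
	if (IS_ERR(ret))
		return (int)PTR_ERR(ret);
	if (ret)
		*rescanned = 1;
	return 0;
}
```

In this model only the unlock path ever triggers a rescan, which is
the property the real caller relies on when it tests the returned
pointer after dropping the shm lock.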

 include/linux/mm.h      |    7 ++-
 include/linux/pagemap.h |   10 ++++-
 include/linux/swap.h    |    4 ++
 ipc/shm.c               |   11 ++++-
 mm/shmem.c              |   10 +++--
 mm/vmscan.c             |   92 ++++++++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 124 insertions(+), 10 deletions(-)

Index: linux-2.6.26-rc2-mm1/mm/shmem.c
===================================================================
--- linux-2.6.26-rc2-mm1.orig/mm/shmem.c	2008-05-23 15:14:03.000000000 -0400
+++ linux-2.6.26-rc2-mm1/mm/shmem.c	2008-05-23 15:19:28.000000000 -0400
@@ -1458,23 +1458,27 @@ static struct mempolicy *shmem_get_polic
 }
 #endif
 
-int shmem_lock(struct file *file, int lock, struct user_struct *user)
+struct address_space *shmem_lock(struct file *file, int lock,
+				 struct user_struct *user)
 {
 	struct inode *inode = file->f_path.dentry->d_inode;
 	struct shmem_inode_info *info = SHMEM_I(inode);
-	int retval = -ENOMEM;
+	struct address_space *retval = ERR_PTR(-ENOMEM);
 
 	spin_lock(&info->lock);
 	if (lock && !(info->flags & VM_LOCKED)) {
 		if (!user_shm_lock(inode->i_size, user))
 			goto out_nomem;
 		info->flags |= VM_LOCKED;
+		mapping_set_noreclaim(file->f_mapping);
+		retval = NULL;
 	}
 	if (!lock && (info->flags & VM_LOCKED) && user) {
 		user_shm_unlock(inode->i_size, user);
 		info->flags &= ~VM_LOCKED;
+		mapping_clear_noreclaim(file->f_mapping);
+		retval = file->f_mapping;
 	}
-	retval = 0;
 out_nomem:
 	spin_unlock(&info->lock);
 	return retval;
Index: linux-2.6.26-rc2-mm1/include/linux/pagemap.h
===================================================================
--- linux-2.6.26-rc2-mm1.orig/include/linux/pagemap.h	2008-05-23 15:19:21.000000000 -0400
+++ linux-2.6.26-rc2-mm1/include/linux/pagemap.h	2008-05-23 15:19:28.000000000 -0400
@@ -38,14 +38,20 @@ static inline void mapping_set_noreclaim
 	set_bit(AS_NORECLAIM, &mapping->flags);
 }
 
+static inline void mapping_clear_noreclaim(struct address_space *mapping)
+{
+	clear_bit(AS_NORECLAIM, &mapping->flags);
+}
+
 static inline int mapping_non_reclaimable(struct address_space *mapping)
 {
-	if (mapping && (mapping->flags & AS_NORECLAIM))
-		return 1;
+	if (mapping)
+		return test_bit(AS_NORECLAIM, &mapping->flags);
 	return 0;
 }
 #else
 static inline void mapping_set_noreclaim(struct address_space *mapping) { }
+static inline void mapping_clear_noreclaim(struct address_space *mapping) { }
 static inline int mapping_non_reclaimable(struct address_space *mapping)
 {
 	return 0;
Index: linux-2.6.26-rc2-mm1/mm/vmscan.c
===================================================================
--- linux-2.6.26-rc2-mm1.orig/mm/vmscan.c	2008-05-23 15:19:21.000000000 -0400
+++ linux-2.6.26-rc2-mm1/mm/vmscan.c	2008-05-23 15:19:28.000000000 -0400
@@ -2317,4 +2317,96 @@ int page_reclaimable(struct page *page, 
 
 	return 1;
 }
+
+/**
+ * check_move_noreclaim_page - check page for reclaimability and move to appropriate zone lru list
+ * @page: page to check reclaimability and move to appropriate lru list
+ * @zone: zone page is in
+ *
+ * Checks a page for reclaimability and moves the page to the appropriate
+ * zone lru list.
+ *
+ * Restrictions: zone->lru_lock must be held, page must be on LRU and must
+ * have PageNoreclaim set.
+ */
+static void check_move_noreclaim_page(struct page *page, struct zone *zone)
+{
+
+	ClearPageNoreclaim(page); /* for page_reclaimable() */
+	if (page_reclaimable(page, NULL)) {
+		enum lru_list l = LRU_INACTIVE_ANON + page_file_cache(page);
+		__dec_zone_state(zone, NR_NORECLAIM);
+		list_move(&page->lru, &zone->list[l]);
+		__inc_zone_state(zone, NR_INACTIVE_ANON + l);
+	} else {
+		/*
+		 * rotate noreclaim list
+		 */
+		SetPageNoreclaim(page);
+		list_move(&page->lru, &zone->list[LRU_NORECLAIM]);
+	}
+}
+
+/**
+ * scan_mapping_noreclaim_pages - scan an address space for reclaimable pages
+ * @mapping: struct address_space to scan for reclaimable pages
+ *
+ * Scan all pages in mapping.  Check non-reclaimable pages for
+ * reclaimability and move them to the appropriate zone lru list.
+ */
+void scan_mapping_noreclaim_pages(struct address_space *mapping)
+{
+	pgoff_t next = 0;
+	pgoff_t end   = i_size_read(mapping->host);
+	struct zone *zone;
+	struct pagevec pvec;
+
+	if (mapping->nrpages == 0)
+		return;
+
+	pagevec_init(&pvec, 0);
+	while (next < end &&
+		pagevec_lookup(&pvec, mapping, next, PAGEVEC_SIZE)) {
+		int i;
+
+		zone = NULL;
+
+		for (i = 0; i < pagevec_count(&pvec); i++) {
+			struct page *page = pvec.pages[i];
+			pgoff_t page_index = page->index;
+			struct zone *pagezone = page_zone(page);
+
+			if (page_index > next)
+				next = page_index;
+			next++;
+
+			if (TestSetPageLocked(page)) {
+				/*
+				 * OK, let's do it the hard way...
+				 */
+				if (zone)
+					spin_unlock_irq(&zone->lru_lock);
+				zone = NULL;
+				lock_page(page);
+			}
+
+			if (pagezone != zone) {
+				if (zone)
+					spin_unlock_irq(&zone->lru_lock);
+				zone = pagezone;
+				spin_lock_irq(&zone->lru_lock);
+			}
+
+			if (PageLRU(page) && PageNoreclaim(page))
+				check_move_noreclaim_page(page, zone);
+
+			unlock_page(page);
+
+		}
+		if (zone)
+			spin_unlock_irq(&zone->lru_lock);
+		pagevec_release(&pvec);
+	}
+
+}
 #endif
Index: linux-2.6.26-rc2-mm1/include/linux/swap.h
===================================================================
--- linux-2.6.26-rc2-mm1.orig/include/linux/swap.h	2008-05-23 15:14:03.000000000 -0400
+++ linux-2.6.26-rc2-mm1/include/linux/swap.h	2008-05-23 15:19:28.000000000 -0400
@@ -232,12 +232,16 @@ static inline int zone_reclaim(struct zo
 
 #ifdef CONFIG_NORECLAIM_LRU
 extern int page_reclaimable(struct page *page, struct vm_area_struct *vma);
+extern void scan_mapping_noreclaim_pages(struct address_space *);
 #else
 static inline int page_reclaimable(struct page *page,
 						struct vm_area_struct *vma)
 {
 	return 1;
 }
+static inline void scan_mapping_noreclaim_pages(struct address_space *mapping)
+{
+}
 #endif
 
 extern int kswapd_run(int nid);
Index: linux-2.6.26-rc2-mm1/include/linux/mm.h
===================================================================
--- linux-2.6.26-rc2-mm1.orig/include/linux/mm.h	2008-05-23 15:14:03.000000000 -0400
+++ linux-2.6.26-rc2-mm1/include/linux/mm.h	2008-05-23 15:19:28.000000000 -0400
@@ -694,10 +694,11 @@ static inline int page_mapped(struct pag
 extern void show_free_areas(void);
 
 #ifdef CONFIG_SHMEM
-int shmem_lock(struct file *file, int lock, struct user_struct *user);
+extern struct address_space *shmem_lock(struct file *file, int lock,
+					struct user_struct *user);
 #else
-static inline int shmem_lock(struct file *file, int lock,
-			     struct user_struct *user)
+static inline struct address_space *shmem_lock(struct file *file, int lock,
+					struct user_struct *user)
 {
 	return 0;
 }
Index: linux-2.6.26-rc2-mm1/ipc/shm.c
===================================================================
--- linux-2.6.26-rc2-mm1.orig/ipc/shm.c	2008-05-23 15:14:03.000000000 -0400
+++ linux-2.6.26-rc2-mm1/ipc/shm.c	2008-05-23 15:19:28.000000000 -0400
@@ -736,6 +736,8 @@ asmlinkage long sys_shmctl(int shmid, in
 	case SHM_LOCK:
 	case SHM_UNLOCK:
 	{
+		struct address_space *mapping = NULL;
+
 		shp = shm_lock_check(ns, shmid);
 		if (IS_ERR(shp)) {
 			err = PTR_ERR(shp);
@@ -763,18 +765,23 @@ asmlinkage long sys_shmctl(int shmid, in
 		if(cmd==SHM_LOCK) {
 			struct user_struct * user = current->user;
 			if (!is_file_hugepages(shp->shm_file)) {
-				err = shmem_lock(shp->shm_file, 1, user);
+				mapping = shmem_lock(shp->shm_file, 1, user);
+				if (IS_ERR(mapping))
+					err = PTR_ERR(mapping);
+				mapping = NULL;
 				if (!err && !(shp->shm_perm.mode & SHM_LOCKED)){
 					shp->shm_perm.mode |= SHM_LOCKED;
 					shp->mlock_user = user;
 				}
 			}
 		} else if (!is_file_hugepages(shp->shm_file)) {
-			shmem_lock(shp->shm_file, 0, shp->mlock_user);
+			mapping = shmem_lock(shp->shm_file, 0, shp->mlock_user);
 			shp->shm_perm.mode &= ~SHM_LOCKED;
 			shp->mlock_user = NULL;
 		}
 		shm_unlock(shp);
+		if (mapping)
+			scan_mapping_noreclaim_pages(mapping);
 		goto out;
 	}
 	case IPC_RMID:

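A note on the pagemap.h hunk: AS_NORECLAIM is passed to set_bit(),
so it is a bit number rather than a mask, and the old test
'mapping->flags & AS_NORECLAIM' checked the wrong bits.  The
userspace sketch below illustrates the difference.  It is a model
under assumptions: MODEL_AS_NORECLAIM is a hypothetical bit position
(the real value is defined elsewhere in the series), and the
model_*_bit() helpers are plain non-atomic stand-ins for the kernel
bitops.

```c
#include <assert.h>

/* Hypothetical bit position, chosen only for illustration. */
#define MODEL_AS_NORECLAIM 22

/* Non-atomic stand-ins for set_bit()/clear_bit()/test_bit(). */
static void model_set_bit(int nr, unsigned long *flags)
{
	*flags |= 1UL << nr;
}

static void model_clear_bit(int nr, unsigned long *flags)
{
	*flags &= ~(1UL << nr);
}

static int model_test_bit(int nr, const unsigned long *flags)
{
	return (int)((*flags >> nr) & 1UL);
}

/* Old-style check: misuses the bit number as if it were a mask. */
static int old_check(unsigned long flags)
{
	return (flags & MODEL_AS_NORECLAIM) != 0;
}

/* New-style check: tests the bit at the given position. */
static int new_check(unsigned long flags)
{
	return model_test_bit(MODEL_AS_NORECLAIM, &flags);
}
```

With the bit at position 22 set, new_check() reports it and
old_check() misses it; with an unrelated low bit set (for example
flags == 2), old_check() reports a false positive.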
-- 
All Rights Reversed

