linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Daniel Forrest <dan.forrest@ssec.wisc.edu>
To: Rik van Riel <riel@redhat.com>
Cc: linux-kernel@vger.kernel.org, Hugh Dickins <hughd@google.com>,
	linux-mm <linux-mm@kvack.org>
Subject: Re: Repeated fork() causes SLAB to grow without bound
Date: Fri, 17 Aug 2012 19:03:12 -0500	[thread overview]
Message-ID: <20120818000312.GA4262@evergreen.ssec.wisc.edu> (raw)
In-Reply-To: <502D42E5.7090403@redhat.com>

On Thu, Aug 16, 2012 at 02:58:45PM -0400, Rik van Riel wrote:

> Oh dear.
> 
> Basically, what happens is that at fork time, a new
> "level" is created for the anon_vma hierarchy. This
> works great for normal forking daemons, since the
> parent process just keeps running, and forking off
> children.
> 
> Look at anon_vma_fork() in mm/rmap.c for the details.
> 
> Having each child become the new parent, and the
> previous parent exit, can result in an "infinite"
> stack of anon_vmas.
> 
> Now, the parent anon_vma we cannot get rid of,
> because that is where the anon_vma lock lives.
> 
> However, in your case you have many more anon_vma
> levels than you have processes!
> 
> I wonder if it may be possible to fix your bug
> by adding a refcount to the struct anon_vma,
> one count for each VMA that is directly attached
> to the anon_vma (ie. vma->anon_vma == anon_vma),
> and one for each page that points to the anon_vma.
> 
> If the reference count on an anon_vma reaches 0,
> we can skip that anon_vma in anon_vma_clone, and
> the child process should not get that anon_vma.
> 
> A scheme like that may be enough to avoid the trouble
> you are running into.
> 
> Does this sound realistic?

Based on your comments, I came up with the following patch.  It boots
and the anon_vma/anon_vma_chain SLAB usage is stable, but I don't know
if I've overlooked something.  I'm not a kernel hacker.


--- include/linux/rmap.h.ORIG	2011-08-05 04:59:21.000000000 +0000
+++ include/linux/rmap.h	2012-08-16 22:52:25.000000000 +0000
@@ -35,6 +35,7 @@ struct anon_vma {
 	 * anon_vma if they are the last user on release
 	 */
 	atomic_t refcount;
+	atomic_t pagecount;
 
 	/*
 	 * NOTE: the LSB of the head.next is set by
--- mm/rmap.c.ORIG	2011-08-05 04:59:21.000000000 +0000
+++ mm/rmap.c	2012-08-17 23:55:13.000000000 +0000
@@ -85,6 +85,7 @@ static inline struct anon_vma *anon_vma_
 static inline void anon_vma_free(struct anon_vma *anon_vma)
 {
 	VM_BUG_ON(atomic_read(&anon_vma->refcount));
+	VM_BUG_ON(atomic_read(&anon_vma->pagecount));
 
 	/*
 	 * Synchronize against page_lock_anon_vma() such that
@@ -176,6 +177,7 @@ int anon_vma_prepare(struct vm_area_stru
 		spin_lock(&mm->page_table_lock);
 		if (likely(!vma->anon_vma)) {
 			vma->anon_vma = anon_vma;
+			atomic_inc(&anon_vma->pagecount);
 			avc->anon_vma = anon_vma;
 			avc->vma = vma;
 			list_add(&avc->same_vma, &vma->anon_vma_chain);
@@ -262,7 +264,10 @@ int anon_vma_clone(struct vm_area_struct
 		}
 		anon_vma = pavc->anon_vma;
 		root = lock_anon_vma_root(root, anon_vma);
-		anon_vma_chain_link(dst, avc, anon_vma);
+		if (!atomic_read(&anon_vma->pagecount))
+			anon_vma_chain_free(avc);
+		else
+			anon_vma_chain_link(dst, avc, anon_vma);
 	}
 	unlock_anon_vma_root(root);
 	return 0;
@@ -314,6 +319,7 @@ int anon_vma_fork(struct vm_area_struct
 	get_anon_vma(anon_vma->root);
 	/* Mark this anon_vma as the one where our new (COWed) pages go. */
 	vma->anon_vma = anon_vma;
+	atomic_set(&anon_vma->pagecount, 1);
 	anon_vma_lock(anon_vma);
 	anon_vma_chain_link(vma, avc, anon_vma);
 	anon_vma_unlock(anon_vma);
@@ -341,6 +347,8 @@ void unlink_anon_vmas(struct vm_area_str
 
 		root = lock_anon_vma_root(root, anon_vma);
 		list_del(&avc->same_anon_vma);
+		if (vma->anon_vma == anon_vma)
+			atomic_dec(&anon_vma->pagecount);
 
 		/*
 		 * Leave empty anon_vmas on the list - we'll need
@@ -375,6 +383,7 @@ static void anon_vma_ctor(void *data)
 
 	mutex_init(&anon_vma->mutex);
 	atomic_set(&anon_vma->refcount, 0);
+	atomic_set(&anon_vma->pagecount, 0);
 	INIT_LIST_HEAD(&anon_vma->head);
 }
 
@@ -996,6 +1005,7 @@ static void __page_set_anon_rmap(struct
 	if (!exclusive)
 		anon_vma = anon_vma->root;
 
+	atomic_inc(&anon_vma->pagecount);
 	anon_vma = (void *) anon_vma + PAGE_MAPPING_ANON;
 	page->mapping = (struct address_space *) anon_vma;
 	page->index = linear_page_index(vma, address);
@@ -1142,6 +1152,11 @@ void page_remove_rmap(struct page *page)
 	if (unlikely(PageHuge(page)))
 		return;
 	if (PageAnon(page)) {
+		struct anon_vma *anon_vma;
+
+		anon_vma = page_anon_vma(page);
+		if (anon_vma)
+			atomic_dec(&anon_vma->pagecount);
 		mem_cgroup_uncharge_page(page);
 		if (!PageTransHuge(page))
 			__dec_zone_page_state(page, NR_ANON_PAGES);
@@ -1747,6 +1762,7 @@ static void __hugepage_set_anon_rmap(str
 	if (!exclusive)
 		anon_vma = anon_vma->root;
 
+	atomic_inc(&anon_vma->pagecount);
 	anon_vma = (void *) anon_vma + PAGE_MAPPING_ANON;
 	page->mapping = (struct address_space *) anon_vma;
 	page->index = linear_page_index(vma, address);

-- 
Daniel K. Forrest		Space Science and
dan.forrest@ssec.wisc.edu	Engineering Center
(608) 890 - 0558		University of Wisconsin, Madison

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2012-08-18  0:03 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20120816024610.GA5350@evergreen.ssec.wisc.edu>
2012-08-16 18:58 ` Repeated fork() causes SLAB to grow without bound Rik van Riel
2012-08-18  0:03   ` Daniel Forrest [this message]
2012-08-18  3:46     ` Rik van Riel
2012-08-18  4:07       ` Daniel Forrest
2012-08-18  4:10         ` Rik van Riel
2012-08-20  8:00       ` Hugh Dickins
2012-08-20  9:39         ` Michel Lespinasse
2012-08-20 11:11           ` Andi Kleen
2012-08-20 11:17           ` Rik van Riel
2012-08-20 11:53             ` Michel Lespinasse
2012-08-20 19:11               ` Michel Lespinasse
2012-08-22  3:20           ` [RFC PATCH] " Michel Lespinasse
2012-08-22  3:29             ` Rik van Riel
2013-06-03 19:50               ` Daniel Forrest
2013-06-04 10:37                 ` Rik van Riel
2013-06-05 14:02                   ` Andrea Arcangeli
2014-11-14 16:30                 ` [PATCH] " Daniel Forrest
2014-11-18  0:02                   ` Andrew Morton
2014-11-18  1:41                     ` Daniel Forrest
2014-11-18  2:41                       ` Rik van Riel
2014-11-18 20:19                         ` Andrew Morton
2014-11-18 22:15                           ` Konstantin Khlebnikov
2014-11-18 23:02                             ` Konstantin Khlebnikov
2014-11-18 23:50                               ` Vlastimil Babka
2014-11-19 14:36                                 ` Konstantin Khlebnikov
2014-11-19 16:09                                   ` Vlastimil Babka
2014-11-19 16:58                                     ` Konstantin Khlebnikov
2014-11-19 23:14                                       ` Michel Lespinasse
2014-11-20 14:42                                         ` Konstantin Khlebnikov
2014-11-20 14:50                                           ` Rik van Riel
2014-11-20 15:03                                             ` Konstantin Khlebnikov
2014-11-24  7:09                                               ` Konstantin Khlebnikov
2014-11-25 10:59                                                 ` Michal Hocko
2014-11-25 12:13                                                   ` Konstantin Khlebnikov
2014-11-25 15:00                                                     ` Michal Hocko
2014-11-26 17:35                                                       ` Michal Hocko
2014-12-05 15:44                                                         ` Jerome Marchand
2014-11-20 15:27                                           ` Michel Lespinasse
2014-11-19  2:48                           ` Rik van Riel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120818000312.GA4262@evergreen.ssec.wisc.edu \
    --to=dan.forrest@ssec.wisc.edu \
    --cc=hughd@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=riel@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).