Re: [PATCH] mm: fix page_lock_anon_vma leaving mutex locked

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Hugh Dickins <hughd@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Rik van Riel <riel@redhat.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH] mm: fix page_lock_anon_vma leaving mutex locked
Date: Sun, 29 May 2011 10:33:44 +0200	[thread overview]
Message-ID: <1306658024.1200.1222.camel@twins> (raw)
In-Reply-To: <alpine.LSU.2.00.1105281634440.14257@sister.anvils>

On Sat, 2011-05-28 at 17:12 -0700, Hugh Dickins wrote:

> Though I think I'm arriving at the conclusion that this patch
> is correct as is, despite the doubts that have arisen.
> 
> One argument is by induction: since we've noticed no problems before
> Peter's patchset, and actually Peter's patchset plus my patch is not
> really making any difference to this "anon_vma changing beneath you"
> case, is it?

I'll buy that argument, and on those grounds would have Acked the patch,
but since Linus already committed it, that's a tad moot.

Still understanding why the previous code was right (if so) is
important.

> > I'm wondering - wouldn't it be nicer to just re-check (after getting
> > the anon_vma lock) that page->mapping still matches anon_mapping?
> 
> I toyed with that: it seemed a better idea than relying on the refcount,
> which wasn't giving the guarantee we needed (the refcount is perfectly
> good in other respects, it just isn't good for this particular check).
> 
> However, the problem (if there is one) goes a bit further than that:
> if we don't actually have serialization against page->anon_vma (okay,
> it's actually page->mapping, but simpler to express this way) being
> changed at any instant, i.e. we're serving the page_referenced() without
> PageLocked case, then what good is the "anon_vma" that page_lock_anon_vma()
> returns?  If that can be freed and reused at any moment?

Agreed, all except page_referenced() are serialized using PageLock.

> I believe that although it may no longer be the anon_vma that the page
> is pointing to, it remains stable.  Because even if page->anon_vma is
> updated, it will certainly have the same anon_vma->root as before
> (see the first BUG_ON in __page_check_anon_rmap() for reassurance),
> so the mutex locking holds good.
> 
> And the structure itself won't be freed: although the page is now
> pointing to a less inclusive, more optimal anon_vma for reclaim to use,
> the anon_vma which was originally pointed to remains on the same vma's
> chains as it ever was, and only gets freed up when they're all gone.
> 
> So, when there's this race with moving anon_vma, page_lock_anon_vma()
> may end up returning a less than optimal anon_vma, but it's still valid
> as a good though longer list of vmas to look through.

Yes, and I think I see what you mean, if a page's anon_vma is changed
while it remains mapped it will only ever be moved to a child of the
original anon_vma. And because of the anon_vma ref-counting, the
original anon_vma will stick around until that too is dead, which won't
happen for as long as the page remains mapped.

Therefore, for as long as we observe page_mapped(), any anon_vma
obtained from it remains valid.

Talk about tricky.. shees. I bet that wants a comment or so.

> The previous code would have broken horribly, wouldn't it, were that
> not the case?

It would have, yes.

---
Subject: mm, rmap: Add yet more comments to page_get_anon_vma/page_lock_anon_vma

Inspired by an analysis from Hugh on why again all this doesn't explode
in our face.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 mm/rmap.c |    9 +++++++--
 1 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index 6bada99..487d5cc 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -350,7 +350,12 @@ void __init anon_vma_init(void)
  * have been relevant to this page.
  *
  * The page might have been remapped to a different anon_vma or the anon_vma
- * returned may already be freed (and even reused).
+ * returned may already be freed (and even reused). 
+ *
+ * In case it was remapped to a different anon_vma, the new anon_vma will be a
+ * child of the old anon_vma, and the anon_vma lifetime rules will therefore
+ * ensure that any anon_vma obtained from the page will still be valid for as
+ * long as we observe page_mapped() [ hence all those page_mapped() tests ].
  *
  * All users of this function must be very careful when walking the anon_vma
  * chain and verify that the page in question is indeed mapped in it
@@ -421,7 +426,7 @@ struct anon_vma *page_lock_anon_vma(struct page *page)
 		/*
 		 * If the page is still mapped, then this anon_vma is still
 		 * its anon_vma, and holding the mutex ensures that it will
-		 * not go away, see __put_anon_vma().
+		 * not go away, see anon_vma_free().
 		 */
 		if (!page_mapped(page)) {
 			mutex_unlock(&root_anon_vma->mutex);

WARNING: multiple messages have this Message-ID (diff)

From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Hugh Dickins <hughd@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Rik van Riel <riel@redhat.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH] mm: fix page_lock_anon_vma leaving mutex locked
Date: Sun, 29 May 2011 10:33:44 +0200	[thread overview]
Message-ID: <1306658024.1200.1222.camel@twins> (raw)
In-Reply-To: <alpine.LSU.2.00.1105281634440.14257@sister.anvils>

On Sat, 2011-05-28 at 17:12 -0700, Hugh Dickins wrote:

> Though I think I'm arriving at the conclusion that this patch
> is correct as is, despite the doubts that have arisen.
> 
> One argument is by induction: since we've noticed no problems before
> Peter's patchset, and actually Peter's patchset plus my patch is not
> really making any difference to this "anon_vma changing beneath you"
> case, is it?

I'll buy that argument, and on those grounds would have Acked the patch,
but since Linus already committed it, that's a tad moot.

Still understanding why the previous code was right (if so) is
important.

> > I'm wondering - wouldn't it be nicer to just re-check (after getting
> > the anon_vma lock) that page->mapping still matches anon_mapping?
> 
> I toyed with that: it seemed a better idea than relying on the refcount,
> which wasn't giving the guarantee we needed (the refcount is perfectly
> good in other respects, it just isn't good for this particular check).
> 
> However, the problem (if there is one) goes a bit further than that:
> if we don't actually have serialization against page->anon_vma (okay,
> it's actually page->mapping, but simpler to express this way) being
> changed at any instant, i.e. we're serving the page_referenced() without
> PageLocked case, then what good is the "anon_vma" that page_lock_anon_vma()
> returns?  If that can be freed and reused at any moment?

Agreed, all except page_referenced() are serialized using PageLock.

> I believe that although it may no longer be the anon_vma that the page
> is pointing to, it remains stable.  Because even if page->anon_vma is
> updated, it will certainly have the same anon_vma->root as before
> (see the first BUG_ON in __page_check_anon_rmap() for reassurance),
> so the mutex locking holds good.
> 
> And the structure itself won't be freed: although the page is now
> pointing to a less inclusive, more optimal anon_vma for reclaim to use,
> the anon_vma which was originally pointed to remains on the same vma's
> chains as it ever was, and only gets freed up when they're all gone.
> 
> So, when there's this race with moving anon_vma, page_lock_anon_vma()
> may end up returning a less than optimal anon_vma, but it's still valid
> as a good though longer list of vmas to look through.

Yes, and I think I see what you mean, if a page's anon_vma is changed
while it remains mapped it will only ever be moved to a child of the
original anon_vma. And because of the anon_vma ref-counting, the
original anon_vma will stick around until that too is dead, which won't
happen for as long as the page remains mapped.

Therefore, for as long as we observe page_mapped(), any anon_vma
obtained from it remains valid.

Talk about tricky.. shees. I bet that wants a comment or so.

> The previous code would have broken horribly, wouldn't it, were that
> not the case?

It would have, yes.

---
Subject: mm, rmap: Add yet more comments to page_get_anon_vma/page_lock_anon_vma

Inspired by an analysis from Hugh on why again all this doesn't explode
in our face.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 mm/rmap.c |    9 +++++++--
 1 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index 6bada99..487d5cc 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -350,7 +350,12 @@ void __init anon_vma_init(void)
  * have been relevant to this page.
  *
  * The page might have been remapped to a different anon_vma or the anon_vma
- * returned may already be freed (and even reused).
+ * returned may already be freed (and even reused). 
+ *
+ * In case it was remapped to a different anon_vma, the new anon_vma will be a
+ * child of the old anon_vma, and the anon_vma lifetime rules will therefore
+ * ensure that any anon_vma obtained from the page will still be valid for as
+ * long as we observe page_mapped() [ hence all those page_mapped() tests ].
  *
  * All users of this function must be very careful when walking the anon_vma
  * chain and verify that the page in question is indeed mapped in it
@@ -421,7 +426,7 @@ struct anon_vma *page_lock_anon_vma(struct page *page)
 		/*
 		 * If the page is still mapped, then this anon_vma is still
 		 * its anon_vma, and holding the mutex ensures that it will
-		 * not go away, see __put_anon_vma().
+		 * not go away, see anon_vma_free().
 		 */
 		if (!page_mapped(page)) {
 			mutex_unlock(&root_anon_vma->mutex);

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

next prev parent reply	other threads:[~2011-05-29  8:34 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-05-28 20:20 [PATCH] mm: fix page_lock_anon_vma leaving mutex locked Hugh Dickins
2011-05-28 20:20 ` Hugh Dickins
2011-05-28 21:14 ` Peter Zijlstra
2011-05-28 21:14   ` Peter Zijlstra
2011-05-28 22:02   ` Hugh Dickins
2011-05-28 22:02     ` Hugh Dickins
2011-05-28 23:24     ` Linus Torvalds
2011-05-28 23:24       ` Linus Torvalds
2011-05-28 23:56       ` Linus Torvalds
2011-05-28 23:56         ` Linus Torvalds
2011-05-29  0:12       ` Hugh Dickins
2011-05-29  0:12         ` Hugh Dickins
2011-05-29  0:23         ` Linus Torvalds
2011-05-29  0:23           ` Linus Torvalds
2011-05-29  0:43           ` Hugh Dickins
2011-05-29  0:43             ` Hugh Dickins
2011-05-29  8:35             ` Peter Zijlstra
2011-05-29  8:35               ` Peter Zijlstra
2011-05-29  8:33         ` Peter Zijlstra [this message]
2011-05-29  8:33           ` Peter Zijlstra
2011-05-29 20:53           ` Hugh Dickins
2011-05-29 20:53             ` Hugh Dickins

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:6bada99 dfblob:487d5cc dfblob:6bada99 dfblob:487d5cc )
 OR (
bs:"Re: [PATCH] mm: fix page_lock_anon_vma leaving mutex locked" )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1306658024.1200.1222.camel@twins \
    --to=a.p.zijlstra@chello.nl \
    --cc=akpm@linux-foundation.org \
    --cc=hughd@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=riel@redhat.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.