From: Michel Lespinasse <walken@google.com>
To: Sasha Levin <levinsasha928@gmail.com>
Cc: linux-mm@kvack.org, riel@redhat.com, peterz@infradead.org,
	aarcange@redhat.com, hughd@google.com, daniel.santos@pobox.com,
	linux-kernel@vger.kernel.org, akpm@linux-foundation.org,
	Dave Jones <davej@redhat.com>, Jiri Slaby <jslaby@suse.cz>
Subject: Re: [PATCH 6/7] mm: add CONFIG_DEBUG_VM_RB build option
Date: Fri, 14 Sep 2012 17:00:29 -0700
Message-ID: <20120915000029.GA29426@google.com>
In-Reply-To: <CANN689Ff3W4z=+3J8aGO-2GrPHGJ=ote_f5q9jzRQRAP+b0T4Q@mail.gmail.com>

On Fri, Sep 14, 2012 at 3:46 PM, Michel Lespinasse <walken@google.com> wrote:
> On Fri, Sep 14, 2012 at 3:14 PM, Sasha Levin <levinsasha928@gmail.com> wrote:
>> On 09/04/2012 11:20 AM, Michel Lespinasse wrote:
>>> Add a CONFIG_DEBUG_VM_RB build option for the previously existing
>>> DEBUG_MM_RB code. Now that Andi Kleen modified it to avoid using
>>> recursive algorithms, we can expose it a bit more.
>>>
>>> Also extend this code to validate_mm() after stack expansion, and to
>>> check that the vma's start and last pgoffs have not changed since the
>>> nodes were inserted on the anon vma interval tree (as it is important
>>> that the nodes be reindexed after each such update).
>>
>> This patch exposes the following warning:
>>
>> [   24.977502] ------------[ cut here ]------------
>> [   24.979089] WARNING: at mm/interval_tree.c:110 anon_vma_interval_tree_verify+0x81/0xa0()
>> [   24.981765] Pid: 5928, comm: trinity-child37 Tainted: G        W 3.6.0-rc5-next-20120914-sasha-00003-g7deb7fa-dirty #333
>> [   24.985501] Call Trace:
>> [   24.986345]  [<ffffffff81224c91>] ? anon_vma_interval_tree_verify+0x81/0xa0
>> [   24.988535]  [<ffffffff81106766>] warn_slowpath_common+0x86/0xb0
>> [   24.990636]  [<ffffffff81106855>] warn_slowpath_null+0x15/0x20
>> [   24.992658]  [<ffffffff81224c91>] anon_vma_interval_tree_verify+0x81/0xa0
>> [   24.994980]  [<ffffffff8122e6e8>] validate_mm+0x58/0x1e0
>> [   24.996772]  [<ffffffff8122e934>] vma_link+0x94/0xe0
>> [   24.997719]  [<ffffffff812315e9>] copy_vma+0x279/0x2e0
>> [   24.998522]  [<ffffffff8117a7fd>] ? trace_hardirqs_off+0xd/0x10
>> [   25.000772]  [<ffffffff81232e89>] move_vma+0xa9/0x260
>> [   25.002499]  [<ffffffff812334b5>] sys_mremap+0x475/0x540
>> [   25.004364]  [<ffffffff8374b6e8>] tracesys+0xe1/0xe6
>> [   25.006108] ---[ end trace 7c901670963aa6e2 ]---
>>
>> The code line is
>>
>>         WARN_ON_ONCE(node->cached_vma_last != avc_last_pgoff(node));
>
> That's very interesting (and potentially relevant to another bug
> that's been reported too).
>
> I'd like to know what workload you used to trigger this.
> (I find it hard to test mremap, as I don't know of enough users of it.)

All right. Hugh managed to reproduce the issue on his SUSE laptop, and
I came up with a fix.

The problem was that in mremap, the new vma's vm_{start,end,pgoff}
fields need to be updated before calling anon_vma_clone() so that the
new vma will be properly indexed.

Patch attached. I expect this should also explain Jiri's reported
failure involving splitting THP pages during mremap(), even though we
did not manage to reproduce that one.

---------------------------------8<-------------------------------

From: Michel Lespinasse <walken@google.com>
Date: Fri, 14 Sep 2012 16:43:49 -0700
Subject: [PATCH] mm anon rmap: in mremap, set the new vma's position before
 anon_vma_clone()

anon_vma_clone() expects new_vma->vm_{start,end,pgoff} to be correctly set
so that the new vma can be indexed on the anon interval tree.

copy_vma() was failing to do that, which broke mremap().

Signed-off-by: Michel Lespinasse <walken@google.com>

---
 mm/mmap.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index cc8c64077a42..7e672800b5d4 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2446,16 +2446,16 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
 		new_vma = kmem_cache_alloc(vm_area_cachep, GFP_KERNEL);
 		if (new_vma) {
 			*new_vma = *vma;
+			new_vma->vm_start = addr;
+			new_vma->vm_end = addr + len;
+			new_vma->vm_pgoff = pgoff;
 			pol = mpol_dup(vma_policy(vma));
 			if (IS_ERR(pol))
 				goto out_free_vma;
+			vma_set_policy(new_vma, pol);
 			INIT_LIST_HEAD(&new_vma->anon_vma_chain);
 			if (anon_vma_clone(new_vma, vma))
 				goto out_free_mempol;
-			vma_set_policy(new_vma, pol);
-			new_vma->vm_start = addr;
-			new_vma->vm_end = addr + len;
-			new_vma->vm_pgoff = pgoff;
 			if (new_vma->vm_file) {
 				get_file(new_vma->vm_file);
 
-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.

