All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrea Arcangeli <aarcange@redhat.com>
To: Mel Gorman <mel@csn.ul.ie>
Cc: Linux-MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Minchan Kim <minchan.kim@gmail.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Christoph Lameter <cl@linux.com>, Rik van Riel <riel@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH 2/3] mm,migration: Prevent rmap_walk_[anon|ksm] seeing the wrong VMA information
Date: Wed, 28 Apr 2010 17:35:25 +0200	[thread overview]
Message-ID: <20100428153525.GR510@random.random> (raw)
In-Reply-To: <20100428091555.GB15815@csn.ul.ie>

On Wed, Apr 28, 2010 at 10:15:55AM +0100, Mel Gorman wrote:
> It became unconditional because I wasn't sure of the optimisation versus the
> new anon_vma changes (doesn't matter, should have been safe). At the time

Changeset 287d97ac032136724143cde8d5964b414d562ee3 is meant to explain
the removal of the lock but I don't get it from the comments. Or at
least I don't get from that comment why we can't resurrect the plain
old deleted code that looked fine to me. Like there is no reason to
take the lock if start == vma->vm_start.

> So, the VMA list does not appear to be messed up but there still needs
> to be protection against modification of VMA details that are already on
> the list. For that, the seq counter would have been enough and
> lighter-weight than acquiring the anon_vma->lock every time in
> vma_adjust().
> 
> I'll drop this patch again as the execve race looks the most important.

You mean you're dropping patch 2 too? I agree dropping patch 1 but
to me the having to take all the anon_vma locks for every
vma->anon_vma->lock that we walk seems a must, otherwise
expand_downwards and vma_adjust won't be ok, plus we need to re-add
the anon_vma lock to vma_adjust, it can't be safe to alter vm_pgoff
and vm_start outside of the anon_vma->lock. Or I am mistaken?

Patch 2 wouldn't help the swapops crash we reproduced because at that
point the anon_vma of the stack is the local one, it's just after
execve.

vma_adjust and expand_downards would alter vm_pgoff and vm_start while
taking only the vma->anon_vma->lock where the vma->anon_vma is the
_local_ one of the vma.  But a vma in mainline can be indexed in
infinite anon_vmas, so to prevent breaking migration
vma_adjust/expand_downards the rmap_walk would need to take _all_
anon_vma->locks for every anon_vma that the vma is indexed into. Or
alternatively like you implemented rmap_walk would need to check if
the vma we found in the rmap_walk is different from the original
anon_vma and to take the vma->anon_vma->lock (so taking the
anon_vma->lock of the local anon_vma of every vma) before it can
actually read the vma->vm_pgoff/vm_start inside vma_address.

If the above is right it also means the new anon-vma changes also break
the whole locking of transparent hugepage, see wait_split_huge_page,
it does a spin_unlock_wait(&anon_vma->lock) thinking that waiting the
"local" anon-vma is enough, when in fact the hugepage may be shared
and belonging to the parent parent_vma->anon_vma and not to the local
one of the last child that is waiting on the wrong lock. So I may have
to rewrite this part of the thp locking to solve this. And for me it's
not enough to just taking more locks inside the rmap walks inside
split_huge_page as I used the anon_vma lock outside too.

WARNING: multiple messages have this Message-ID (diff)
From: Andrea Arcangeli <aarcange@redhat.com>
To: Mel Gorman <mel@csn.ul.ie>
Cc: Linux-MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Minchan Kim <minchan.kim@gmail.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Christoph Lameter <cl@linux.com>, Rik van Riel <riel@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH 2/3] mm,migration: Prevent rmap_walk_[anon|ksm] seeing the wrong VMA information
Date: Wed, 28 Apr 2010 17:35:25 +0200	[thread overview]
Message-ID: <20100428153525.GR510@random.random> (raw)
In-Reply-To: <20100428091555.GB15815@csn.ul.ie>

On Wed, Apr 28, 2010 at 10:15:55AM +0100, Mel Gorman wrote:
> It became unconditional because I wasn't sure of the optimisation versus the
> new anon_vma changes (doesn't matter, should have been safe). At the time

Changeset 287d97ac032136724143cde8d5964b414d562ee3 is meant to explain
the removal of the lock but I don't get it from the comments. Or at
least I don't get from that comment why we can't resurrect the plain
old deleted code that looked fine to me. Like there is no reason to
take the lock if start == vma->vm_start.

> So, the VMA list does not appear to be messed up but there still needs
> to be protection against modification of VMA details that are already on
> the list. For that, the seq counter would have been enough and
> lighter-weight than acquiring the anon_vma->lock every time in
> vma_adjust().
> 
> I'll drop this patch again as the execve race looks the most important.

You mean you're dropping patch 2 too? I agree dropping patch 1 but
to me the having to take all the anon_vma locks for every
vma->anon_vma->lock that we walk seems a must, otherwise
expand_downwards and vma_adjust won't be ok, plus we need to re-add
the anon_vma lock to vma_adjust, it can't be safe to alter vm_pgoff
and vm_start outside of the anon_vma->lock. Or I am mistaken?

Patch 2 wouldn't help the swapops crash we reproduced because at that
point the anon_vma of the stack is the local one, it's just after
execve.

vma_adjust and expand_downards would alter vm_pgoff and vm_start while
taking only the vma->anon_vma->lock where the vma->anon_vma is the
_local_ one of the vma.  But a vma in mainline can be indexed in
infinite anon_vmas, so to prevent breaking migration
vma_adjust/expand_downards the rmap_walk would need to take _all_
anon_vma->locks for every anon_vma that the vma is indexed into. Or
alternatively like you implemented rmap_walk would need to check if
the vma we found in the rmap_walk is different from the original
anon_vma and to take the vma->anon_vma->lock (so taking the
anon_vma->lock of the local anon_vma of every vma) before it can
actually read the vma->vm_pgoff/vm_start inside vma_address.

If the above is right it also means the new anon-vma changes also break
the whole locking of transparent hugepage, see wait_split_huge_page,
it does a spin_unlock_wait(&anon_vma->lock) thinking that waiting the
"local" anon-vma is enough, when in fact the hugepage may be shared
and belonging to the parent parent_vma->anon_vma and not to the local
one of the last child that is waiting on the wrong lock. So I may have
to rewrite this part of the thp locking to solve this. And for me it's
not enough to just taking more locks inside the rmap walks inside
split_huge_page as I used the anon_vma lock outside too.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2010-04-28 15:36 UTC|newest]

Thread overview: 132+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-04-27 21:30 [PATCH 0/3] Fix migration races in rmap_walk() V2 Mel Gorman
2010-04-27 21:30 ` Mel Gorman
2010-04-27 21:30 ` [PATCH 1/3] mm,migration: During fork(), wait for migration to end if migration PTE is encountered Mel Gorman
2010-04-27 21:30   ` Mel Gorman
2010-04-27 22:22   ` Andrea Arcangeli
2010-04-27 22:22     ` Andrea Arcangeli
2010-04-27 23:52     ` KAMEZAWA Hiroyuki
2010-04-27 23:52       ` KAMEZAWA Hiroyuki
2010-04-28  0:18       ` Andrea Arcangeli
2010-04-28  0:18         ` Andrea Arcangeli
2010-04-28  0:19         ` Andrea Arcangeli
2010-04-28  0:19           ` Andrea Arcangeli
2010-04-28  0:28           ` KAMEZAWA Hiroyuki
2010-04-28  0:28             ` KAMEZAWA Hiroyuki
2010-04-28  0:59             ` Andrea Arcangeli
2010-04-28  0:59               ` Andrea Arcangeli
2010-04-28  8:24       ` Mel Gorman
2010-04-28  8:24         ` Mel Gorman
2010-04-27 21:30 ` [PATCH 2/3] mm,migration: Prevent rmap_walk_[anon|ksm] seeing the wrong VMA information Mel Gorman
2010-04-27 21:30   ` Mel Gorman
2010-04-27 23:10   ` Andrea Arcangeli
2010-04-27 23:10     ` Andrea Arcangeli
2010-04-28  9:15     ` Mel Gorman
2010-04-28  9:15       ` Mel Gorman
2010-04-28 15:35       ` Andrea Arcangeli [this message]
2010-04-28 15:35         ` Andrea Arcangeli
2010-04-28 15:39         ` Andrea Arcangeli
2010-04-28 15:39           ` Andrea Arcangeli
2010-04-28 15:55         ` Mel Gorman
2010-04-28 15:55           ` Mel Gorman
2010-04-28 16:23           ` Andrea Arcangeli
2010-04-28 16:23             ` Andrea Arcangeli
2010-04-28 17:34             ` Mel Gorman
2010-04-28 17:34               ` Mel Gorman
2010-04-28 17:58               ` Andrea Arcangeli
2010-04-28 17:58                 ` Andrea Arcangeli
2010-04-28 17:47             ` [RFC PATCH] take all anon_vma locks in anon_vma_lock Rik van Riel
2010-04-28 17:47               ` Rik van Riel
2010-04-28 18:03               ` Andrea Arcangeli
2010-04-28 18:03                 ` Andrea Arcangeli
2010-04-28 18:09                 ` Rik van Riel
2010-04-28 18:09                   ` Rik van Riel
2010-04-28 18:25               ` [RFC PATCH -v2] " Rik van Riel
2010-04-28 18:25                 ` Rik van Riel
2010-04-28 19:07                 ` Mel Gorman
2010-04-28 19:07                   ` Mel Gorman
2010-04-28 20:17                 ` [RFC PATCH -v3] " Rik van Riel
2010-04-28 20:17                   ` Rik van Riel
2010-04-28 20:57                   ` Rik van Riel
2010-04-28 20:57                     ` Rik van Riel
2010-04-29  0:28                     ` Minchan Kim
2010-04-29  0:28                       ` Minchan Kim
2010-04-29  2:10                       ` Rik van Riel
2010-04-29  2:10                         ` Rik van Riel
2010-04-29  2:55                         ` Minchan Kim
2010-04-29  2:55                           ` Minchan Kim
2010-04-29  6:42                           ` Minchan Kim
2010-04-29  6:42                             ` Minchan Kim
2010-04-29 15:39                           ` Rik van Riel
2010-04-29 15:39                             ` Rik van Riel
2010-04-29  7:37                       ` Mel Gorman
2010-04-29  7:37                         ` Mel Gorman
2010-04-29  8:15                     ` Mel Gorman
2010-04-29  8:15                       ` Mel Gorman
2010-04-29  8:32                       ` Minchan Kim
2010-04-29  8:32                         ` Minchan Kim
2010-04-29  8:44                         ` Mel Gorman
2010-04-29  8:44                           ` Mel Gorman
2010-04-27 21:30 ` [PATCH 3/3] mm,migration: Remove straggling migration PTEs when page tables are being moved after the VMA has already moved Mel Gorman
2010-04-27 21:30   ` Mel Gorman
2010-04-27 22:30   ` Andrea Arcangeli
2010-04-27 22:30     ` Andrea Arcangeli
2010-04-27 22:58     ` Andrea Arcangeli
2010-04-27 22:58       ` Andrea Arcangeli
2010-04-28  0:39       ` KAMEZAWA Hiroyuki
2010-04-28  0:39         ` KAMEZAWA Hiroyuki
2010-04-28  1:05         ` Andrea Arcangeli
2010-04-28  1:05           ` Andrea Arcangeli
2010-04-28  1:09           ` Andrea Arcangeli
2010-04-28  1:09             ` Andrea Arcangeli
2010-04-28  1:18           ` KAMEZAWA Hiroyuki
2010-04-28  1:18             ` KAMEZAWA Hiroyuki
2010-04-28  1:36             ` Andrea Arcangeli
2010-04-28  1:36               ` Andrea Arcangeli
2010-04-28  1:29       ` KAMEZAWA Hiroyuki
2010-04-28  1:29         ` KAMEZAWA Hiroyuki
2010-04-28  1:44         ` Andrea Arcangeli
2010-04-28  1:44           ` Andrea Arcangeli
2010-04-28  2:12           ` KAMEZAWA Hiroyuki
2010-04-28  2:12             ` KAMEZAWA Hiroyuki
2010-04-28  2:42             ` Andrea Arcangeli
2010-04-28  2:42               ` Andrea Arcangeli
2010-04-28  2:49               ` KAMEZAWA Hiroyuki
2010-04-28  2:49                 ` KAMEZAWA Hiroyuki
2010-04-28  7:28                 ` KAMEZAWA Hiroyuki
2010-04-28  7:28                   ` KAMEZAWA Hiroyuki
2010-04-28 10:48                   ` Mel Gorman
2010-04-28 10:48                     ` Mel Gorman
2010-04-28  0:03   ` KAMEZAWA Hiroyuki
2010-04-28  0:03     ` KAMEZAWA Hiroyuki
2010-04-28  0:08     ` Andrea Arcangeli
2010-04-28  0:08       ` Andrea Arcangeli
2010-04-28  0:36       ` KAMEZAWA Hiroyuki
2010-04-28  0:36         ` KAMEZAWA Hiroyuki
2010-04-28  8:30   ` KAMEZAWA Hiroyuki
2010-04-28  8:30     ` KAMEZAWA Hiroyuki
2010-04-28 14:46     ` Andrea Arcangeli
2010-04-28 14:46       ` Andrea Arcangeli
2010-04-27 22:27 ` [PATCH 0/3] Fix migration races in rmap_walk() V2 Christoph Lameter
2010-04-27 22:27   ` Christoph Lameter
2010-04-27 22:32   ` Andrea Arcangeli
2010-04-27 22:32     ` Andrea Arcangeli
2010-04-28  0:13     ` KAMEZAWA Hiroyuki
2010-04-28  0:13       ` KAMEZAWA Hiroyuki
2010-04-28  0:20       ` Andrea Arcangeli
2010-04-28  0:20         ` Andrea Arcangeli
2010-04-28 14:23         ` Mel Gorman
2010-04-28 14:23           ` Mel Gorman
2010-04-28 14:57           ` Mel Gorman
2010-04-28 14:57             ` Mel Gorman
2010-04-28 15:16             ` Andrea Arcangeli
2010-04-28 15:16               ` Andrea Arcangeli
2010-04-28 15:23               ` Mel Gorman
2010-04-28 15:23                 ` Mel Gorman
2010-04-28 15:45                 ` Andrea Arcangeli
2010-04-28 15:45                   ` Andrea Arcangeli
2010-04-28 20:40                   ` Andrea Arcangeli
2010-04-28 20:40                     ` Andrea Arcangeli
2010-04-28 21:05                     ` Andrea Arcangeli
2010-04-28 21:05                       ` Andrea Arcangeli
2010-04-28  9:17     ` Mel Gorman
2010-04-28  9:17       ` Mel Gorman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100428153525.GR510@random.random \
    --to=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=cl@linux.com \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mel@csn.ul.ie \
    --cc=minchan.kim@gmail.com \
    --cc=riel@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.