From: Mel Gorman <mel@csn.ul.ie>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>,
Linux-MM <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>,
Minchan Kim <minchan.kim@gmail.com>,
Christoph Lameter <cl@linux.com>, Rik van Riel <riel@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH 3/3] mm,migration: Remove straggling migration PTEs when page tables are being moved after the VMA has already moved
Date: Wed, 28 Apr 2010 11:48:13 +0100 [thread overview]
Message-ID: <20100428104813.GD15815@csn.ul.ie> (raw)
In-Reply-To: <20100428162838.c762fcda.kamezawa.hiroyu@jp.fujitsu.com>
Thanks to you both for looking into this. I far prefer this general approach
than cleaning up the migration PTEs as the page tables get copied. While it
might "work", it's sloppy in the same way as having migration_entry_wait()
do the cleanup was sloppy. It's far preferable to make the VMA move and
page table copy atomic with anon_vma->lock.
On Wed, Apr 28, 2010 at 04:28:38PM +0900, KAMEZAWA Hiroyuki wrote:
> On Wed, 28 Apr 2010 11:49:44 +0900
> KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
>
> > On Wed, 28 Apr 2010 04:42:27 +0200
> > Andrea Arcangeli <aarcange@redhat.com> wrote:
>
> > > migrate.c requires rmap to be able to find all ptes mapping a page at
> > > all times, otherwise the migration entry can be instantiated, but it
> > > can't be removed if the second rmap_walk fails to find the page.
> > >
> > > So shift_arg_pages must run atomically with respect of rmap_walk, and
> > > it's enough to run it under the anon_vma lock to make it atomic.
> > >
> > > And split_huge_page() will have the same requirements as migrate.c
> > > already has.
> > >
> > > Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
> >
> > Seems good.
> > Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> >
> > I'll test this and report if I see trouble again.
> >
> > Unfortunately, I'll have a week of holidays (in Japan) in 4/29-5/05,
> > my office is nearly closed. So, please consider no-mail-from-me is
> > good information.
> >
> Here is bad news. When move_page_tables() fails, "some ptes" are moved
> but others are not and....there is no rollback routine.
>
The biggest problem is that the reverse mapping is temporarily out of
sync until do_exit gets rid of the mess, but how serious is that really?
If there is a migration entry in there, the mapcount should already be zero and
migration holds a reference to the page to prevent it going away. rmap_walk()
may then miss the migration_pte so it gets left behind. Ordinarily this
would be bad but in exec(), we cannot be faulting this page so we won't
trigger the bug in swapops. Instead, do_exit ultimately will skip over the
migration PTE doing nothing with the page but as the mapcount is still zero,
the page won't leak.
> I bet the best way to fix this mess up is
> - disable overlap moving of arg pages
> - use do_mremap().
>
> But maybe you guys want to fix this directly.
> Here is a temporal fix from me. But don't trust me..
I see the point of your patch but I'm not yet seeing why it is
necessary to back out if move_page_tables fails.
That said, both patches have a greater problem. Both of them hold a spinlock
(anon_vma->lock) while calling into the page allocator with GFP_KERNEL (to
allocate the page tables). We don't want to change that to GFP_ATOMIC so
either we need to allocate the pages in advance or special case rmap_walk()
to not walk processes that are in exec.
--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab
WARNING: multiple messages have this Message-ID (diff)
From: Mel Gorman <mel@csn.ul.ie>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>,
Linux-MM <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>,
Minchan Kim <minchan.kim@gmail.com>,
Christoph Lameter <cl@linux.com>, Rik van Riel <riel@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH 3/3] mm,migration: Remove straggling migration PTEs when page tables are being moved after the VMA has already moved
Date: Wed, 28 Apr 2010 11:48:13 +0100 [thread overview]
Message-ID: <20100428104813.GD15815@csn.ul.ie> (raw)
In-Reply-To: <20100428162838.c762fcda.kamezawa.hiroyu@jp.fujitsu.com>
Thanks to you both for looking into this. I far prefer this general approach
than cleaning up the migration PTEs as the page tables get copied. While it
might "work", it's sloppy in the same way as having migration_entry_wait()
do the cleanup was sloppy. It's far preferable to make the VMA move and
page table copy atomic with anon_vma->lock.
On Wed, Apr 28, 2010 at 04:28:38PM +0900, KAMEZAWA Hiroyuki wrote:
> On Wed, 28 Apr 2010 11:49:44 +0900
> KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
>
> > On Wed, 28 Apr 2010 04:42:27 +0200
> > Andrea Arcangeli <aarcange@redhat.com> wrote:
>
> > > migrate.c requires rmap to be able to find all ptes mapping a page at
> > > all times, otherwise the migration entry can be instantiated, but it
> > > can't be removed if the second rmap_walk fails to find the page.
> > >
> > > So shift_arg_pages must run atomically with respect of rmap_walk, and
> > > it's enough to run it under the anon_vma lock to make it atomic.
> > >
> > > And split_huge_page() will have the same requirements as migrate.c
> > > already has.
> > >
> > > Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
> >
> > Seems good.
> > Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> >
> > I'll test this and report if I see trouble again.
> >
> > Unfortunately, I'll have a week of holidays (in Japan) in 4/29-5/05,
> > my office is nearly closed. So, please consider no-mail-from-me is
> > good information.
> >
> Here is bad news. When move_page_tables() fails, "some ptes" are moved
> but others are not and....there is no rollback routine.
>
The biggest problem is that the reverse mapping is temporarily out of
sync until do_exit gets rid of the mess, but how serious is that really?
If there is a migration entry in there, the mapcount should already be zero and
migration holds a reference to the page to prevent it going away. rmap_walk()
may then miss the migration_pte so it gets left behind. Ordinarily this
would be bad but in exec(), we cannot be faulting this page so we won't
trigger the bug in swapops. Instead, do_exit ultimately will skip over the
migration PTE doing nothing with the page but as the mapcount is still zero,
the page won't leak.
> I bet the best way to fix this mess up is
> - disable overlap moving of arg pages
> - use do_mremap().
>
> But maybe you guys want to fix this directly.
> Here is a temporal fix from me. But don't trust me..
I see the point of your patch but I'm not yet seeing why it is
necessary to back out if move_page_tables fails.
That said, both patches have a greater problem. Both of them hold a spinlock
(anon_vma->lock) while calling into the page allocator with GFP_KERNEL (to
allocate the page tables). We don't want to change that to GFP_ATOMIC so
either we need to allocate the pages in advance or special case rmap_walk()
to not walk processes that are in exec.
--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2010-04-28 10:48 UTC|newest]
Thread overview: 132+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-04-27 21:30 [PATCH 0/3] Fix migration races in rmap_walk() V2 Mel Gorman
2010-04-27 21:30 ` Mel Gorman
2010-04-27 21:30 ` [PATCH 1/3] mm,migration: During fork(), wait for migration to end if migration PTE is encountered Mel Gorman
2010-04-27 21:30 ` Mel Gorman
2010-04-27 22:22 ` Andrea Arcangeli
2010-04-27 22:22 ` Andrea Arcangeli
2010-04-27 23:52 ` KAMEZAWA Hiroyuki
2010-04-27 23:52 ` KAMEZAWA Hiroyuki
2010-04-28 0:18 ` Andrea Arcangeli
2010-04-28 0:18 ` Andrea Arcangeli
2010-04-28 0:19 ` Andrea Arcangeli
2010-04-28 0:19 ` Andrea Arcangeli
2010-04-28 0:28 ` KAMEZAWA Hiroyuki
2010-04-28 0:28 ` KAMEZAWA Hiroyuki
2010-04-28 0:59 ` Andrea Arcangeli
2010-04-28 0:59 ` Andrea Arcangeli
2010-04-28 8:24 ` Mel Gorman
2010-04-28 8:24 ` Mel Gorman
2010-04-27 21:30 ` [PATCH 2/3] mm,migration: Prevent rmap_walk_[anon|ksm] seeing the wrong VMA information Mel Gorman
2010-04-27 21:30 ` Mel Gorman
2010-04-27 23:10 ` Andrea Arcangeli
2010-04-27 23:10 ` Andrea Arcangeli
2010-04-28 9:15 ` Mel Gorman
2010-04-28 9:15 ` Mel Gorman
2010-04-28 15:35 ` Andrea Arcangeli
2010-04-28 15:35 ` Andrea Arcangeli
2010-04-28 15:39 ` Andrea Arcangeli
2010-04-28 15:39 ` Andrea Arcangeli
2010-04-28 15:55 ` Mel Gorman
2010-04-28 15:55 ` Mel Gorman
2010-04-28 16:23 ` Andrea Arcangeli
2010-04-28 16:23 ` Andrea Arcangeli
2010-04-28 17:34 ` Mel Gorman
2010-04-28 17:34 ` Mel Gorman
2010-04-28 17:58 ` Andrea Arcangeli
2010-04-28 17:58 ` Andrea Arcangeli
2010-04-28 17:47 ` [RFC PATCH] take all anon_vma locks in anon_vma_lock Rik van Riel
2010-04-28 17:47 ` Rik van Riel
2010-04-28 18:03 ` Andrea Arcangeli
2010-04-28 18:03 ` Andrea Arcangeli
2010-04-28 18:09 ` Rik van Riel
2010-04-28 18:09 ` Rik van Riel
2010-04-28 18:25 ` [RFC PATCH -v2] " Rik van Riel
2010-04-28 18:25 ` Rik van Riel
2010-04-28 19:07 ` Mel Gorman
2010-04-28 19:07 ` Mel Gorman
2010-04-28 20:17 ` [RFC PATCH -v3] " Rik van Riel
2010-04-28 20:17 ` Rik van Riel
2010-04-28 20:57 ` Rik van Riel
2010-04-28 20:57 ` Rik van Riel
2010-04-29 0:28 ` Minchan Kim
2010-04-29 0:28 ` Minchan Kim
2010-04-29 2:10 ` Rik van Riel
2010-04-29 2:10 ` Rik van Riel
2010-04-29 2:55 ` Minchan Kim
2010-04-29 2:55 ` Minchan Kim
2010-04-29 6:42 ` Minchan Kim
2010-04-29 6:42 ` Minchan Kim
2010-04-29 15:39 ` Rik van Riel
2010-04-29 15:39 ` Rik van Riel
2010-04-29 7:37 ` Mel Gorman
2010-04-29 7:37 ` Mel Gorman
2010-04-29 8:15 ` Mel Gorman
2010-04-29 8:15 ` Mel Gorman
2010-04-29 8:32 ` Minchan Kim
2010-04-29 8:32 ` Minchan Kim
2010-04-29 8:44 ` Mel Gorman
2010-04-29 8:44 ` Mel Gorman
2010-04-27 21:30 ` [PATCH 3/3] mm,migration: Remove straggling migration PTEs when page tables are being moved after the VMA has already moved Mel Gorman
2010-04-27 21:30 ` Mel Gorman
2010-04-27 22:30 ` Andrea Arcangeli
2010-04-27 22:30 ` Andrea Arcangeli
2010-04-27 22:58 ` Andrea Arcangeli
2010-04-27 22:58 ` Andrea Arcangeli
2010-04-28 0:39 ` KAMEZAWA Hiroyuki
2010-04-28 0:39 ` KAMEZAWA Hiroyuki
2010-04-28 1:05 ` Andrea Arcangeli
2010-04-28 1:05 ` Andrea Arcangeli
2010-04-28 1:09 ` Andrea Arcangeli
2010-04-28 1:09 ` Andrea Arcangeli
2010-04-28 1:18 ` KAMEZAWA Hiroyuki
2010-04-28 1:18 ` KAMEZAWA Hiroyuki
2010-04-28 1:36 ` Andrea Arcangeli
2010-04-28 1:36 ` Andrea Arcangeli
2010-04-28 1:29 ` KAMEZAWA Hiroyuki
2010-04-28 1:29 ` KAMEZAWA Hiroyuki
2010-04-28 1:44 ` Andrea Arcangeli
2010-04-28 1:44 ` Andrea Arcangeli
2010-04-28 2:12 ` KAMEZAWA Hiroyuki
2010-04-28 2:12 ` KAMEZAWA Hiroyuki
2010-04-28 2:42 ` Andrea Arcangeli
2010-04-28 2:42 ` Andrea Arcangeli
2010-04-28 2:49 ` KAMEZAWA Hiroyuki
2010-04-28 2:49 ` KAMEZAWA Hiroyuki
2010-04-28 7:28 ` KAMEZAWA Hiroyuki
2010-04-28 7:28 ` KAMEZAWA Hiroyuki
2010-04-28 10:48 ` Mel Gorman [this message]
2010-04-28 10:48 ` Mel Gorman
2010-04-28 0:03 ` KAMEZAWA Hiroyuki
2010-04-28 0:03 ` KAMEZAWA Hiroyuki
2010-04-28 0:08 ` Andrea Arcangeli
2010-04-28 0:08 ` Andrea Arcangeli
2010-04-28 0:36 ` KAMEZAWA Hiroyuki
2010-04-28 0:36 ` KAMEZAWA Hiroyuki
2010-04-28 8:30 ` KAMEZAWA Hiroyuki
2010-04-28 8:30 ` KAMEZAWA Hiroyuki
2010-04-28 14:46 ` Andrea Arcangeli
2010-04-28 14:46 ` Andrea Arcangeli
2010-04-27 22:27 ` [PATCH 0/3] Fix migration races in rmap_walk() V2 Christoph Lameter
2010-04-27 22:27 ` Christoph Lameter
2010-04-27 22:32 ` Andrea Arcangeli
2010-04-27 22:32 ` Andrea Arcangeli
2010-04-28 0:13 ` KAMEZAWA Hiroyuki
2010-04-28 0:13 ` KAMEZAWA Hiroyuki
2010-04-28 0:20 ` Andrea Arcangeli
2010-04-28 0:20 ` Andrea Arcangeli
2010-04-28 14:23 ` Mel Gorman
2010-04-28 14:23 ` Mel Gorman
2010-04-28 14:57 ` Mel Gorman
2010-04-28 14:57 ` Mel Gorman
2010-04-28 15:16 ` Andrea Arcangeli
2010-04-28 15:16 ` Andrea Arcangeli
2010-04-28 15:23 ` Mel Gorman
2010-04-28 15:23 ` Mel Gorman
2010-04-28 15:45 ` Andrea Arcangeli
2010-04-28 15:45 ` Andrea Arcangeli
2010-04-28 20:40 ` Andrea Arcangeli
2010-04-28 20:40 ` Andrea Arcangeli
2010-04-28 21:05 ` Andrea Arcangeli
2010-04-28 21:05 ` Andrea Arcangeli
2010-04-28 9:17 ` Mel Gorman
2010-04-28 9:17 ` Mel Gorman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20100428104813.GD15815@csn.ul.ie \
--to=mel@csn.ul.ie \
--cc=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=cl@linux.com \
--cc=kamezawa.hiroyu@jp.fujitsu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=minchan.kim@gmail.com \
--cc=riel@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.