From: Mel Gorman <mel@csn.ul.ie>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linux-MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Minchan Kim <minchan.kim@gmail.com>,
	Christoph Lameter <cl@linux.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Rik van Riel <riel@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: [PATCH 2/2] mm,migration: Fix race between shift_arg_pages and rmap_walk by guaranteeing rmap_walk finds PTEs created within the temporary stack
Date: Sun, 9 May 2010 20:21:45 +0100	[thread overview]
Message-ID: <20100509192145.GI4859@csn.ul.ie> (raw)
In-Reply-To: <alpine.LFD.2.00.1005061905230.901@i5.linux-foundation.org>

On Thu, May 06, 2010 at 07:12:59PM -0700, Linus Torvalds wrote:
> 
> 
> On Fri, 7 May 2010, KAMEZAWA Hiroyuki wrote:
> > 
> > IIUC, move_page_tables() may call "page table allocation" and it cannot be
> > done under spinlock.
> 
> Bah. It only does a "alloc_new_pmd()", and we could easily move that out 
> of the loop and pre-allocate the pmd's.
> 
> If that's the only reason, then it's a really weak one, methinks.
> 

It turns out not to be easy to preallocate the PUDs, PMDs and PTEs that
move_page_tables() needs.  To avoid overallocating, it has to follow the same
logic as move_page_tables(), duplicating some code in the process. The ugliest
aspect of all is passing those pre-allocated pages back into
move_page_tables(), where they need to be passed down to such functions as
__pte_alloc(). It gets extremely messy.
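
To give a feel for why the counting alone is awkward, here is a rough
userspace sketch (not kernel code, and not taken from any of the patches in
this thread). It computes only an upper bound on the number of PTE tables a
destination range could need. The helper name sketch_max_pte_tables() and the
SKETCH_* constants are made up for illustration, and the 2 MiB span per PTE
table assumes x86-64 with 4 KiB pages.

```c
#include <assert.h>

/* Assumption: x86-64 with 4 KiB pages, so one PTE table maps 2 MiB. */
#define SKETCH_PMD_SHIFT 21
#define SKETCH_PMD_SIZE  (1UL << SKETCH_PMD_SHIFT)

/*
 * Hypothetical helper: upper bound on the number of PTE tables needed to
 * map [addr, addr + len). It counts every 2 MiB-aligned region the range
 * touches, whether or not a table already exists there.
 */
static unsigned long sketch_max_pte_tables(unsigned long addr,
					   unsigned long len)
{
	unsigned long first, last;

	if (len == 0)
		return 0;

	/* The range must not wrap around the address space. */
	assert(addr + len - 1 >= addr);

	first = addr >> SKETCH_PMD_SHIFT;
	last = (addr + len - 1) >> SKETCH_PMD_SHIFT;
	return last - first + 1;
}
```

This bound overallocates whenever a table is already present in the
destination, and it ignores the PMD and PUD levels entirely. Getting an exact
count means checking which tables exist at each level, which is the same walk
move_page_tables() already does.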

I stopped working on it about half way through as it was already too ugly
to live and would have similar cost to Kamezawa's much more straight-forward
approach of using move_vma().

While using move_vma() is straight-forward and solves the problem, it's
not as cheap as Andrea's solution. Andrea allocates a temporary VMA and
puts it on a list and very little else. It didn't show up any problems
in microbenchmarks. Calling move_vma() does a lot more work, particularly in
copy_vma(), and this slows down exec.

With Kamezawa's patch, kernbench was fine on wall time but in System Time,
it slowed by up to 1.48% in comparison to Andrea's slowing by up to 0.64%[1].

AIM9 was slowed as well. Kamezawa's slowed by 2.77% whereas Andrea's was
reported as faster by 2.58%. While AIM9 is flaky and these figures are barely
outside the noise, calling move_vma() is obviously more expensive.

While my solution at http://lkml.org/lkml/2010/4/30/198 is cheapest as it
does not touch exec() at all, is_vma_temporary_stack() could be broken in
the future if any of the assumptions it makes change.

So what you have is an inverse relationship between magic and
performance. Mine has the most magic and is fastest. Kamezawa's has the
least magic but is slowest, and Andrea's has the goldilocks factor. Which do
you prefer?

[1] One caveat of the performance tests was that a lot of debugging such
    as lockdep was enabled. Disabling these would give different results
    but it should still be the case that calling move_vma is more expensive
    than calling kmem_cache_alloc.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

