linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Rik van Riel <riel@redhat.com>
To: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>,
	lwoodman@redhat.com,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: FWD:  [PATCH v2] vmscan: limit concurrent reclaimers in shrink_zone
Date: Fri, 18 Dec 2009 12:43:32 -0500	[thread overview]
Message-ID: <4B2BBF44.2090104@redhat.com> (raw)
In-Reply-To: <20091218162332.GR29790@random.random>

On 12/18/2009 11:23 AM, Andrea Arcangeli wrote:
> On Thu, Dec 17, 2009 at 09:05:23PM +0000, Hugh Dickins wrote:

>> An rwlock there has been proposed on several occasions, but
>> we resist because that change benefits this case but performs
>> worse on more common cases (I believe: no numbers to back that up).
>
> I think rwlock for anon_vma is a must. Whatever higher overhead of the
> fast path with no contention is practically zero, and in large smp it
> allows rmap on long chains to run in parallel, so very much worth it
> because downside is practically zero and upside may be measurable
> instead in certain corner cases. I don't think it'll be enough, but I
> definitely like it.

I agree, changing the anon_vma lock to an rwlock should
work a lot better than what we have today.  The tradeoff
is a tiny slowdown in medium contention cases, at the
benefit of avoiding catastrophic slowdown in some cases.

With Nick Piggin's fair rwlocks, there should be no issue
at all.

> Rik suggested to me to have a cowed newly allocated page to use its
> own anon_vma. Conceptually Rik's idea is fine one, but the only
> complication then is how to chain the same vma into multiple anon_vma
> (in practice insert/removal will be slower and more metadata will be
> needed for additional anon_vmas and vams queued in more than
> anon_vma). But this only will help if the mapcount of the page is 1,
> if the mapcount is 10000 no change to anon_vma or prio_tree will solve
> this,

It's even more complex than this for anonymous pages.

Anonymous pages get COW copied in child (and parent)
processes, potentially resulting in one page, at each
offset into the anon_vma, for every process attached
to the anon_vma.

As a result, with 10000 child processes, page_referenced
can end up searching through 10000 VMAs even for pages
with a mapcount of 1!

-- 
All rights reversed.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2009-12-18 17:43 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-12-11 21:46 [PATCH v2] vmscan: limit concurrent reclaimers in shrink_zone Rik van Riel
2009-12-14  0:14 ` Minchan Kim
2009-12-14  4:09   ` Rik van Riel
2009-12-14  4:19     ` Minchan Kim
2009-12-14  4:29       ` Rik van Riel
2009-12-14  5:00         ` Minchan Kim
2009-12-14 12:22 ` KOSAKI Motohiro
2009-12-14 12:23   ` [cleanup][PATCH 1/8] vmscan: Make shrink_zone_begin/end helper function KOSAKI Motohiro
2009-12-14 14:34     ` Rik van Riel
2009-12-14 22:39     ` Minchan Kim
2009-12-14 12:24   ` [PATCH 2/8] Mark sleep_on as deprecated KOSAKI Motohiro
2009-12-14 13:03     ` Christoph Hellwig
2009-12-14 16:04       ` Arjan van de Ven
2009-12-14 14:34     ` Rik van Riel
2009-12-14 22:44     ` Minchan Kim
2009-12-14 12:29   ` [PATCH 3/8] Don't use sleep_on() KOSAKI Motohiro
2009-12-14 14:35     ` Rik van Riel
2009-12-14 22:46     ` Minchan Kim
2009-12-14 12:30   ` [PATCH 4/8] Use prepare_to_wait_exclusive() instead prepare_to_wait() KOSAKI Motohiro
2009-12-14 14:33     ` Rik van Riel
2009-12-15  0:45       ` KOSAKI Motohiro
2009-12-15  5:32         ` Mike Galbraith
2009-12-15  8:28           ` Mike Galbraith
2009-12-15 14:36             ` Mike Galbraith
2009-12-15 14:58           ` Rik van Riel
2009-12-15 18:17             ` Mike Galbraith
2009-12-15 18:43             ` Mike Galbraith
2009-12-15 19:33               ` Rik van Riel
2009-12-16  0:48             ` KOSAKI Motohiro
2009-12-16  2:44               ` Rik van Riel
2009-12-16  5:43               ` Mike Galbraith
2009-12-14 23:03     ` Minchan Kim
2009-12-14 12:30   ` [PATCH 5/8] Use io_schedule() instead schedule() KOSAKI Motohiro
2009-12-14 14:37     ` Rik van Riel
2009-12-14 23:46     ` Minchan Kim
2009-12-15  0:56       ` KOSAKI Motohiro
2009-12-15  1:13         ` Minchan Kim
2009-12-14 12:31   ` [PATCH 6/8] Stop reclaim quickly when the task reclaimed enough lots pages KOSAKI Motohiro
2009-12-14 14:45     ` Rik van Riel
2009-12-14 23:51       ` KOSAKI Motohiro
2009-12-15  0:11     ` Minchan Kim
2009-12-15  0:35       ` KOSAKI Motohiro
2009-12-14 12:32   ` [PATCH 7/8] Use TASK_KILLABLE instead TASK_UNINTERRUPTIBLE KOSAKI Motohiro
2009-12-14 14:47     ` Rik van Riel
2009-12-14 23:52     ` Minchan Kim
2009-12-14 12:32   ` [PATCH 8/8] mm: Give up allocation if the task have fatal signal KOSAKI Motohiro
2009-12-14 14:48     ` Rik van Riel
2009-12-14 23:54     ` Minchan Kim
2009-12-15  0:50       ` KOSAKI Motohiro
2009-12-15  1:03         ` Minchan Kim
2009-12-15  1:16           ` KOSAKI Motohiro
2009-12-14 12:40   ` [PATCH v2] vmscan: limit concurrent reclaimers in shrink_zone KOSAKI Motohiro
2009-12-14 17:08 ` Larry Woodman
2009-12-15  0:49   ` KOSAKI Motohiro
     [not found]   ` <20091217193818.9FA9.A69D9226@jp.fujitsu.com>
2009-12-17 12:23     ` FWD: " Larry Woodman
2009-12-17 14:43       ` Rik van Riel
2009-12-17 19:55       ` Rik van Riel
2009-12-17 21:05         ` Hugh Dickins
2009-12-17 22:52           ` Rik van Riel
2009-12-18 16:23           ` Andrea Arcangeli
2009-12-18 17:43             ` Rik van Riel [this message]
2009-12-18 10:27       ` KOSAKI Motohiro
2009-12-18 14:09         ` Rik van Riel
2009-12-18 13:38 ` Avi Kivity
2009-12-18 14:12   ` Rik van Riel
2009-12-18 14:13     ` Avi Kivity

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4B2BBF44.2090104@redhat.com \
    --to=riel@redhat.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=hugh.dickins@tiscali.co.uk \
    --cc=kosaki.motohiro@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lwoodman@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).