From: Rik van Riel <riel@redhat.com>
To: lwoodman@redhat.com
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
    linux-kernel <linux-kernel@vger.kernel.org>,
    linux-mm <linux-mm@kvack.org>,
    Andrew Morton <akpm@linux-foundation.org>,
    Andrea Arcangeli <aarcange@redhat.com>,
    Hugh Dickins <hugh.dickins@tiscali.co.uk>
Subject: Re: FWD: [PATCH v2] vmscan: limit concurrent reclaimers in shrink_zone
Date: Thu, 17 Dec 2009 14:55:20 -0500
Message-ID: <4B2A8CA8.6090704@redhat.com>
In-Reply-To: <4B2A22C0.8080001@redhat.com>

After removing some more immediate bottlenecks with
the patches by Kosaki and me, Larry ran into a really
big one:

Larry Woodman wrote:
> Finally, having said all that, the system still struggles reclaiming
> memory with ~10000 processes trying at the same time, you fix one
> bottleneck and it moves somewhere else. The latest run showed all but
> one running process spinning in page_lock_anon_vma() trying for the
> anon_vma_lock. I noticed that there are ~5000 vma's linked to one
> anon_vma, this seems excessive!!!
>
> I changed the anon_vma->lock to a rwlock_t and page_lock_anon_vma() to use
> read_lock() so multiple callers could execute the page_reference_anon code.
> This seems to help quite a bit.

The system has 10000 processes, all of which are child
processes of the same parent.

Pretty much all memory is anonymous memory.

This means that pretty much every anonymous page in the
system:

1) belongs to just one process, but
2) belongs to an anon_vma which is attached to 10,000 VMAs!

This results in page_referenced scanning 10,000 VMAs for
every page, despite the fact that each page is typically
only mapped into one process.

This seems to be our real scalability issue.
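
To make the cost concrete, here is a toy user-space model in Python. It is
not kernel code: `Vma`, `AnonVma`, `build_shared` and `page_referenced_cost`
are names made up for this illustration, and the only behaviour modelled is
that a reverse-map walk visits every VMA linked to the page's anon_vma,
whether or not that VMA actually maps the page.

```python
from dataclasses import dataclass, field


@dataclass
class Vma:
    owner: int                                  # pid owning this mapping
    pages: set = field(default_factory=set)     # page ids mapped by this VMA


@dataclass
class AnonVma:
    vmas: list = field(default_factory=list)    # every VMA linked to this anon_vma


def page_referenced_cost(anon_vma, page):
    """Walk all VMAs on the anon_vma; return (vmas_scanned, vmas_mapping_page)."""
    scanned = 0
    hits = 0
    for vma in anon_vma.vmas:
        scanned += 1
        if page in vma.pages:
            hits += 1
    return scanned, hits


def build_shared(nprocs):
    """All processes share one anon_vma; each maps one private page (id == pid)."""
    anon_vma = AnonVma()
    for pid in range(nprocs):
        anon_vma.vmas.append(Vma(owner=pid, pages={pid}))
    return anon_vma


if __name__ == "__main__":
    av = build_shared(10_000)
    scanned, hits = page_referenced_cost(av, page=42)
    print(f"scanned {scanned} VMAs, page mapped in {hits}")
```

In this model every walk touches all 10,000 VMAs to find the single one
that maps the page, so the work per page grows with the number of
processes rather than with the number of mappings of that page.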

The only way out I can think of is to have a new anon_vma
when we start a child process and to have COW place new
pages in the new anon_vma.

However, this is a bit of a paradigm shift in our object
rmap system and I am wondering if somebody else has a
better idea :)
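
One possible reading of the per-child anon_vma idea, again as a toy Python
model rather than kernel code. All names here (`AnonVma`, `Vma`, `Page`,
`fork_child`, `cow_fault`, `vmas_scanned`) are hypothetical. The sketch
assumes that the child's VMA stays linked to the parent's anon_vma so
inherited pages can still be found, and that pages created by COW in the
child point at the child's own anon_vma. Grandchildren and teardown are
ignored.

```python
class AnonVma:
    def __init__(self):
        self.vmas = []              # VMAs that may map pages of this anon_vma


class Vma:
    def __init__(self, pid):
        self.pid = pid
        self.pages = set()          # Page objects mapped by this VMA


class Page:
    def __init__(self, anon_vma):
        self.anon_vma = anon_vma    # anon_vma used for the reverse-map walk


def fork_child(parent_anon_vma, pid):
    """Create the child's VMA plus a fresh anon_vma for its future COW pages."""
    child_vma = Vma(pid)
    parent_anon_vma.vmas.append(child_vma)   # inherited pages stay findable
    child_anon_vma = AnonVma()
    child_anon_vma.vmas.append(child_vma)    # only the child is linked here
    return child_vma, child_anon_vma


def cow_fault(vma, anon_vma):
    """Place a newly allocated page in the given anon_vma and map it in vma."""
    page = Page(anon_vma)
    vma.pages.add(page)
    return page


def vmas_scanned(page):
    """Number of VMAs a reverse-map walk for this page would visit."""
    return len(page.anon_vma.vmas)


if __name__ == "__main__":
    parent_av = AnonVma()
    parent_vma = Vma(0)
    parent_av.vmas.append(parent_vma)
    inherited = cow_fault(parent_vma, parent_av)    # page created before fork

    children = [fork_child(parent_av, pid) for pid in range(1, 10_001)]
    private = cow_fault(*children[0])               # page COWed in one child

    print("inherited page walk:", vmas_scanned(inherited))
    print("child-private page walk:", vmas_scanned(private))
```

Under these assumptions a page that was COWed in a child is found by
scanning one VMA, while pages still shared with the parent keep the long
walk. Whether that trade-off holds up in the real rmap code is exactly
the open question.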