Re: [RFC PATCH 0/3] Change how we determine when to hand out THPs

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Mel Gorman <mgorman@suse.de>
To: Alex Thorlton <athorlton@sgi.com>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Rik van Riel <riel@redhat.com>,
	Wanpeng Li <liwanp@linux.vnet.ibm.com>,
	Michel Lespinasse <walken@google.com>,
	Benjamin LaHaise <bcrl@kvack.org>,
	Oleg Nesterov <oleg@redhat.com>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Andy Lutomirski <luto@amacapital.net>,
	Al Viro <viro@zeniv.linux.org.uk>,
	David Rientjes <rientjes@google.com>,
	Zhang Yanfei <zhangyanfei@cn.fujitsu.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@suse.cz>, Jiang Liu <jiang.liu@huawei.com>,
	Cody P Schafer <cody@linux.vnet.ibm.com>,
	Glauber Costa <glommer@parallels.com>,
	Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 0/3] Change how we determine when to hand out THPs
Date: Thu, 19 Dec 2013 14:55:11 +0000	[thread overview]
Message-ID: <20131219145511.GO11295@suse.de> (raw)
In-Reply-To: <20131212180037.GA134240@sgi.com>

On Thu, Dec 12, 2013 at 12:00:37PM -0600, Alex Thorlton wrote:
> This patch changes the way we decide whether or not to give out THPs to
> processes when they fault in pages.  The way things are right now,
> touching one byte in a 2M chunk where no pages have been faulted in
> results in a process being handed a 2M hugepage, which, in some cases,
> is undesirable.  The most common issue seems to arise when a process
> uses many cores to work on small portions of an allocated chunk of
> memory.
> 
> <SNIP>
> 
> As you can see there's a significant performance increase when running
> this test with THP off.  Here's a pointer to the test, for those who are
> interested:
> 
> http://oss.sgi.com/projects/memtests/thp_pthread.tar.gz
> 
> My proposed solution to the problem is to allow users to set a
> threshold at which THPs will be handed out.  The idea here is that, when
> a user faults in a page in an area where they would usually be handed a
> THP, we pull 512 pages off the free list, as we would with a regular
> THP, but we only fault in single pages from that chunk, until the user
> has faulted in enough pages to pass the threshold we've set. 

I have not read this thread yet so this is just me initial reaction to
just this part.

First, you say that the propose solution is to allow users to set a
threshold at which THPs will be handed out but you actually allocate all
the pages up front so it's not just that. There a few things in play

1. Deferred zeroing cost
2. Deferred THP set cost
3. Different TLB pressure
4. Alignment issues and NUMA

All are important. It is common for there to be fewer large TLB entries
than small ones. Workloads that sparsely reference data may suffer badly
when using large pages as the TLB gets trashed. Your workload could be
specifically testing for the TLB pressure (optimising point 3 above) in
which case the procesor used for benchmarking is a major factor and it's
not a universal win.

For example, your workload may optimise 3 but other workloads may suffer
because more faults are incurred until the threshold is reached, the
page tables must be walked to initialse the remaining pages and then the
THP setup and TLB flushed. 

Keep these details in mind when measuring your patches if at all possible.

Otherwise, on the face of it this is actually a similar proposal to "page
reservation" described one of the more important large page papers written
by Talluri (http://dl.acm.org/citation.cfm?id=195531). Right now you could
consider Linux to be reserving pages with a promotion threshold of 1 and
you're aiming to alter that threshold. Seems like a reasonable idea that
will eventually work out even though I have not seen the implementation yet.

-- 
Mel Gorman
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

     prev parent reply	other threads:[~2013-12-19 14:55 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-12-12 18:00 [RFC PATCH 0/3] Change how we determine when to hand out THPs Alex Thorlton
2013-12-12 20:33 ` Alex Thorlton
2013-12-14  5:44 ` Andrew Morton
2013-12-16 17:12   ` Alex Thorlton
2013-12-16 17:51     ` Andrea Arcangeli
2013-12-17 16:20       ` Alex Thorlton
2013-12-17 17:55         ` Andrea Arcangeli
2013-12-18 17:15           ` Rik van Riel
2013-12-25 19:07           ` Alex Thorlton
2013-12-17  1:43     ` Andy Lutomirski
2013-12-17 16:04       ` Alex Thorlton
2013-12-17 16:54         ` Andy Lutomirski
2013-12-17 17:47           ` Alex Thorlton
2013-12-17 22:25             ` Andy Lutomirski
2013-12-19 15:29     ` Mel Gorman
2013-12-25 16:38       ` Alex Thorlton
2013-12-19 14:55 ` Mel Gorman [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20131219145511.GO11295@suse.de \
    --to=mgorman@suse.de \
    --cc=akpm@linux-foundation.org \
    --cc=athorlton@sgi.com \
    --cc=bcrl@kvack.org \
    --cc=benh@kernel.crashing.org \
    --cc=cody@linux.vnet.ibm.com \
    --cc=ebiederm@xmission.com \
    --cc=glommer@parallels.com \
    --cc=hannes@cmpxchg.org \
    --cc=jiang.liu@huawei.com \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=liwanp@linux.vnet.ibm.com \
    --cc=luto@amacapital.net \
    --cc=mhocko@suse.cz \
    --cc=n-horiguchi@ah.jp.nec.com \
    --cc=oleg@redhat.com \
    --cc=peterz@infradead.org \
    --cc=riel@redhat.com \
    --cc=rientjes@google.com \
    --cc=viro@zeniv.linux.org.uk \
    --cc=walken@google.com \
    --cc=zhangyanfei@cn.fujitsu.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).