From: Lin Ming <ming.m.lin@intel.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>,
Nick Piggin <npiggin@suse.de>, Mel Gorman <mel@csn.ul.ie>,
Pekka Enberg <penberg@cs.helsinki.fi>,
Linux Memory Management List <linux-mm@kvack.org>,
Rik van Riel <riel@redhat.com>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Christoph Lameter <cl@linux-foundation.org>,
Johannes Weiner <hannes@cmpxchg.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Peter Zijlstra <peterz@infradead.org>
Subject: Re: [RFC PATCH 00/19] Cleanup and optimise the page allocator V2
Date: Fri, 06 Mar 2009 16:33:08 +0800
Message-ID: <1236328388.11608.35.camel@minggr.sh.intel.com>
In-Reply-To: <20090305103403.GB32407@elte.hu>
On Thu, 2009-03-05 at 18:34 +0800, Ingo Molnar wrote:
> * Zhang, Yanmin <yanmin_zhang@linux.intel.com> wrote:
>
> > On Wed, 2009-03-04 at 10:07 +0100, Nick Piggin wrote:
> > > On Wed, Mar 04, 2009 at 10:05:07AM +0800, Zhang, Yanmin wrote:
> > > > On Mon, 2009-03-02 at 11:21 +0000, Mel Gorman wrote:
> > > > > (Added Ingo as a second scheduler guy as there are queries on tg_shares_up)
> > > > >
> > > > > On Fri, Feb 27, 2009 at 04:44:43PM +0800, Lin Ming wrote:
> > > > > > On Thu, 2009-02-26 at 19:22 +0800, Mel Gorman wrote:
> > > > > > > In that case, Lin, could I also get the profiles for UDP-U-4K please so I
> > > > > > > can see how time is being spent and why it might have gotten worse?
> > > > > >
> > > > > > I have done the profiling (oltp and UDP-U-4K) with and without your v2
> > > > > > patches applied to 2.6.29-rc6.
> > > > > > I also enabled CONFIG_DEBUG_INFO so you can translate address to source
> > > > > > line with addr2line.
> > > > > >
> > > > > > You can download the oprofile data and vmlinux from below link,
> > > > > > http://www.filefactory.com/file/af2330b/
> > > > > >
> > > > >
> > > > > Perfect, thanks a lot for profiling this. It is a big help in figuring out
> > > > > how the allocator is actually being used for your workloads.
> > > > >
> > > > > The OLTP results had the following things to say about the page allocator.
> > > > In case we might mislead you guys, I want to clarify that here OLTP is
> > > > sysbench (oltp)+mysql, not the famous OLTP which needs lots of disks and big
> > > > memory.
> > > >
> > > > Ma Chinang, another Intel guy, does work on the famous OLTP running.
> > >
> > > OK, so my comments WRT cache sensitivity probably don't apply here,
> > > but probably cache hotness of pages coming out of the allocator
> > > might still be important for this one.
> > Yes. We need check it.
> >
> > >
> > > How many runs are you doing of these tests?
> > We start sysbench with different thread number, for example, 8 12 16 32 64 128 for
> > 4*4 tigerton, then get an average value in case there might be a scalability issue.
> >
> > As for this sysbench oltp testing, we reran it for 7 times on
> > tigerton this week and found the results have fluctuations.
> > Now we could only say there is a trend that the result with
> > the patches is a little worse than the one without the
> > patches.
>
> Could you try "perfstat -s" perhaps and see whether any other of
> the metrics outside of tx/sec has less natural noise?

Thanks, I have used "perfstat -s" to collect cache-miss data.

2.6.29-rc7-tip: tip/perfcounters/core (b5e8acf)
2.6.29-rc7-tip-mg2: v2 patches applied to tip/perfcounters/core

I ran netperf UDP-U-4k 5 times with and without the mg-v2 patches
applied to tip/perfcounters/core on a 4p quad-core Tigerton machine.
The results are below; "value" is the UDP-U-4k test result.

2.6.29-rc7-tip
---------------
value      cache misses    CPU migrations    cachemisses/migrations
5329.71    391094656       1710              228710
5641.59    239552767       2138              112045
5580.87    132474745       2172              60992
5547.19    86911457        2099              41406
5626.38    196751217       2050              95976

2.6.29-rc7-tip-mg2
-------------------
value      cache misses    CPU migrations    cachemisses/migrations
4749.80    649929463       1132              574142
4327.06    484100170       1252              386661
4649.51    374201508       1489              251310
5655.82    405511551       1848              219432
5571.58    90222256        2159              41788

Lin Ming
>
> I think a more invariant number might be the ratio of "LLC
> cachemisses" divided by "CPU migrations".
>
> The fluctuation in tx/sec comes from threads bouncing - but you
> can normalize that away by using the cachemisses/migrations
> ratio.
>
> Perhaps. It's definitely a difficult thing to measure.
>
> Ingo
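
A minimal sketch of the normalization discussed above: it recomputes the
cachemisses/migrations column from the posted cache-miss and CPU-migration
counts and averages it per kernel. The names RUNS, misses_per_migration and
summarize are illustrative only and do not come from the thread. The posted
ratios appear to be truncated rather than rounded, so integer division is
assumed here.

```python
from statistics import mean

# (value, cache_misses, cpu_migrations) for each run, copied from the
# tables posted in this message.
RUNS = {
    "2.6.29-rc7-tip": [
        (5329.71, 391094656, 1710),
        (5641.59, 239552767, 2138),
        (5580.87, 132474745, 2172),
        (5547.19, 86911457, 2099),
        (5626.38, 196751217, 2050),
    ],
    "2.6.29-rc7-tip-mg2": [
        (4749.80, 649929463, 1132),
        (4327.06, 484100170, 1252),
        (4649.51, 374201508, 1489),
        (5655.82, 405511551, 1848),
        (5571.58, 90222256, 2159),
    ],
}


def misses_per_migration(cache_misses, migrations):
    """Cache misses per CPU migration, truncated to an integer.

    Truncation (not rounding) is an assumption that matches the
    ratios shown in the posted tables.
    """
    return cache_misses // migrations


def summarize(runs):
    """Return the per-run ratios plus the mean value and mean ratio."""
    ratios = [misses_per_migration(m, g) for _, m, g in runs]
    return {
        "ratios": ratios,
        "mean_value": mean(v for v, _, _ in runs),
        "mean_ratio": mean(ratios),
    }


if __name__ == "__main__":
    for kernel, runs in RUNS.items():
        s = summarize(runs)
        print(f"{kernel}: mean value {s['mean_value']:.2f}, "
              f"mean misses/migration {s['mean_ratio']:.0f}")
```

With only five runs per kernel the means are a rough guide; comparing the
per-run ratios side by side is probably more informative than any single
summary number.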