linux-mm.kvack.org archive mirror
From: Andrea Arcangeli <aarcange@redhat.com>
To: Dan Smith <danms@us.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@elte.hu>,
	Paul Turner <pjt@google.com>,
	Suresh Siddha <suresh.b.siddha@intel.com>,
	Mike Galbraith <efault@gmx.de>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Lai Jiangshan <laijs@cn.fujitsu.com>,
	Bharata B Rao <bharata.rao@gmail.com>,
	Lee Schermerhorn <Lee.Schermerhorn@hp.com>,
	Rik van Riel <riel@redhat.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [RFC] AutoNUMA alpha6
Date: Wed, 21 Mar 2012 23:52:43 +0100
Message-ID: <20120321225242.GL24602@redhat.com>
In-Reply-To: <87limtboet.fsf@danplanet.com>

On Wed, Mar 21, 2012 at 03:05:30PM -0700, Dan Smith wrote:
> something isn't right about my setup, point it out. I've even gone so
> far as to print debug from inside numa01 and numa02 to make sure the
> -DFOO's are working.

That's a good check indeed.

> Re-running all the configurations with THP disabled seems to yield very
> similar results to what I reported before:
> 
>         mainline autonuma numasched hard inverse same_node
> numa01  483      366      335       335  483     483

I assume you didn't run the numa01_same_node on the "numasched" kernel
here.

Now, if you want, I can fix this and boost autonuma for numa01 run
without any parameter.

With the first 5 sec of runtime, I thought I'd be ok with the
MPOL_DEFAULT behavior unchanged (where autonuma behaves as a bypass
for those initial seconds).

Now if we're going to measure who places memory better within the
first 10 seconds of startup, I may have to resurrect
autonuma_balance_blind. I disabled that function because I didn't want
blind heuristics that may backfire for some apps.

It's really numa01_same_node that is the interesting benchmark: it is
meant to start from a fixed position, and it is the thing that really
exercises the ability of the algorithm to converge.

> The inverse and same_node numbers above are on mainline, and both are
> lower on autonuma and numasched:
> 
>            numa01_hard numa01_inverse numa01_same_node
> mainline   335         483            483
> autonuma   335         356            377
> numasched  335         375            491

In these numbers the numa01_inverse column is suspect for
autonuma/numasched.

You should duplicate the numa01_inverse and numa01_hard values from
mainline to be sure. They are a "hardware", not software, measurement.

The exact numbers should look like this:

            numa01_hard numa01_inverse numa01_same_node
 mainline   335         483            483
 autonuma   335         483            377
 numasched  335         483            491



And it pretty much matches what I get. Well, I tried many more times
but I couldn't complete any more numa01 runs with numasched; I was
really lucky last night. It never ends... it becomes incredibly slow and
misbehaves until it's almost unusable and I reboot it. So I stopped
worrying about benchmarking numasched as it's too unstable for that.

> I also ran your numa02, which seems to correlate to your findings:
> 
>         mainline autonuma numasched hard inverse
> numa02  54       42       55        37   53
> 
> So, I'm not seeing the twofold penalty of running with numasched, and in
> fact, it seems to basically do no worse than current mainline (within
> the error interval). However, I hope the matching trend somewhat
> validates the fact that I'm running your stuff correctly.

I still see it even in your numbers:

numasched 55
mainline  54
autonuma  42
hard      37

numasched 491
mainline  483
autonuma  377
hard      335
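
To make that ordering explicit, the slowdown of each configuration
relative to the hard bind can be worked out from the numbers listed
above. This is only a minimal Python sketch: the dictionary layout and
the helper name overhead_vs_hard are illustrative, and the values are
simply the runtimes quoted in this thread.

```python
# Runtimes as quoted in this thread (lower is better).
# The second group is the numa01 list exactly as given above.
RUNTIMES = {
    "numa02": {"numasched": 55, "mainline": 54, "autonuma": 42, "hard": 37},
    "numa01": {"numasched": 491, "mainline": 483, "autonuma": 377, "hard": 335},
}


def overhead_vs_hard(results):
    """Return the percent slowdown of each entry relative to the hard bind."""
    best = results["hard"]
    return {name: round(100.0 * (t - best) / best, 1) for name, t in results.items()}


if __name__ == "__main__":
    for bench, results in RUNTIMES.items():
        ranked = sorted(overhead_vs_hard(results).items(), key=lambda kv: -kv[1])
        for name, pct in ranked:
            print(f"{bench} {name:10s} +{pct}%")
```

In both groups the ranking is the same: numasched and mainline sit
roughly 45% above the hard bind, autonuma roughly 13% above it.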

Yes I think you're running everything correctly.

I'm only wondering why numa01_inverse is faster than on upstream when
run on autonuma (and numasched), I'll try to reproduce it. I thought I
wasn't messing with anything except MPOL_DEFAULT but I'll have to
re-check that.

> I also ran your numa01 with my system clamped to 16G and saw no change
> in the positioning of the metrics (i.e. same_node was still higher than
> inverse and everything was shifted slightly up linearly).

Yes, it should run fine on all kernels. But for me, running that on
numasched (and only on numasched) never ends.

> Well, it's bad in either case, because it means either it's too
> temperamental to behave the same on two similar but differently-sized
> machines, or that it doesn't properly balance the load for machines with
> differing topologies.

Your three mainline numbers looked ok, though it's still strange that
numa01_same_node is identical to numa01_inverse_bind. It
shouldn't be: same_node uses 1 NUMA node, while inverse uses both nodes
but always with remote memory. It's surprising to see an identical value
there.
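
The difference between the bindings can be pictured with a toy model.
This is a hypothetical sketch, not what numa01.c actually does: each
binding is reduced to two worker groups described as (node the threads
run on, node holding their memory), and the exact pairs for "hard" and
"same_node" are assumptions based on the description above.

```python
# Hypothetical model: (node the threads run on, node holding their memory).
PLACEMENTS = {
    "hard": [(0, 0), (1, 1)],       # assumed: both nodes, memory always local
    "inverse": [(0, 1), (1, 0)],    # both nodes, memory always remote
    "same_node": [(0, 0), (0, 0)],  # assumed: everything packed on one node
}


def cpu_nodes_used(placement):
    """Count the distinct nodes whose CPUs run threads."""
    return len({cpu_node for cpu_node, _ in placement})


def remote_fraction(placement):
    """Return the fraction of worker groups accessing memory on another node."""
    remote = sum(1 for cpu_node, mem_node in placement if cpu_node != mem_node)
    return remote / len(placement)


if __name__ == "__main__":
    for name, placement in PLACEMENTS.items():
        print(name, cpu_nodes_used(placement), remote_fraction(placement))
```

In this model inverse pays for remote accesses while same_node pays for
using only half of the CPUs, so there is no obvious reason for the two
to land on the same runtime.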

> I'll be glad to post details of the topology if you tell me specifically
> what you want (above and beyond what I've already posted).

It should look like this to be correct for my -DHARD_BIND and
-DINVERSE_BIND to work as intended:

numactl --hardware

available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 12 13 14 15 16 17
node 1 cpus: 6 7 8 9 10 11 18 19 20 21 22 23

If your topology is different from the above, then numa*.c needs to be
updated to match.
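
A quick way to check this is to compare the "node N cpus:" lines against
the layout shown above. The following is a minimal Python sketch under
that assumption; the names EXPECTED, parse_node_cpus and
topology_matches are made up for illustration, and SAMPLE is just the
output quoted above.

```python
import re

# CPU layout from the message above that -DHARD_BIND / -DINVERSE_BIND expect.
EXPECTED = {
    0: [0, 1, 2, 3, 4, 5, 12, 13, 14, 15, 16, 17],
    1: [6, 7, 8, 9, 10, 11, 18, 19, 20, 21, 22, 23],
}


def parse_node_cpus(text):
    """Map node id -> list of CPU ids from `numactl --hardware` style text."""
    nodes = {}
    for line in text.splitlines():
        m = re.match(r"\s*node (\d+) cpus:(.*)$", line)
        if m:
            nodes[int(m.group(1))] = [int(c) for c in m.group(2).split()]
    return nodes


def topology_matches(text, expected=EXPECTED):
    """Return True if the parsed node-to-CPU map equals the expected layout."""
    return parse_node_cpus(text) == expected


SAMPLE = """\
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 12 13 14 15 16 17
node 1 cpus: 6 7 8 9 10 11 18 19 20 21 22 23
"""

if __name__ == "__main__":
    print(topology_matches(SAMPLE))
```

On a real machine the text would come from the output of
numactl --hardware instead of SAMPLE; lines other than the "cpus:" ones
are ignored by the parser.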

> Me too. Unless you have specific things for me to try, it's probably
> best to let someone else step in with more interesting and
> representative benchmarks, as all of my numbers seem to continue to
> point in the same direction...

It's all good! Thanks for the help.

If you want to keep benchmarking, I'm about to upload the autonuma-dev
branch (same git tree) with alpha8, based on the post-3.3 scheduler
codebase and with some more fixes.

Andrea


