From: Mel Gorman <mgorman@suse.de>
To: Ingo Molnar <mingo@kernel.org>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Paul Turner <pjt@google.com>,
Lee Schermerhorn <Lee.Schermerhorn@hp.com>,
Christoph Lameter <cl@linux.com>, Rik van Riel <riel@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
Andrea Arcangeli <aarcange@redhat.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Thomas Gleixner <tglx@linutronix.de>,
Johannes Weiner <hannes@cmpxchg.org>,
Hugh Dickins <hughd@google.com>
Subject: Re: [PATCH 00/27] Latest numa/core release, v16
Date: Mon, 19 Nov 2012 21:37:08 +0000 [thread overview]
Message-ID: <20121119213708.GN8218@suse.de> (raw)
In-Reply-To: <20121119200707.GA12381@gmail.com>
On Mon, Nov 19, 2012 at 09:07:07PM +0100, Ingo Molnar wrote:
>
> * Mel Gorman <mgorman@suse.de> wrote:
>
> > > [ SPECjbb transactions/sec ] |
> > > [ higher is better ] |
> > > |
> > > SPECjbb single-1x32 524k 507k | 638k +21.7%
> > > -----------------------------------------------------------------------
> > >
> >
> > I was not able to run a full sets of tests today as I was
> > distracted so all I have is a multi JVM comparison. I'll keep
> > it shorter than average
> >
> > 3.7.0 3.7.0
> > rc5-stats-v4r2 rc5-schednuma-v16r1
> > TPut 1 101903.00 ( 0.00%) 77651.00 (-23.80%)
> > TPut 2 213825.00 ( 0.00%) 160285.00 (-25.04%)
> > TPut 3 307905.00 ( 0.00%) 237472.00 (-22.87%)
> > TPut 4 397046.00 ( 0.00%) 302814.00 (-23.73%)
> > TPut 5 477557.00 ( 0.00%) 364281.00 (-23.72%)
> > TPut 6 542973.00 ( 0.00%) 420810.00 (-22.50%)
> > TPut 7 540466.00 ( 0.00%) 448976.00 (-16.93%)
> > TPut 8 543226.00 ( 0.00%) 463568.00 (-14.66%)
> > TPut 9 513351.00 ( 0.00%) 468238.00 ( -8.79%)
> > TPut 10 484126.00 ( 0.00%) 457018.00 ( -5.60%)
>
> These figures are IMO way too low for a 64-way system. I have a
> 32-way system with midrange server CPUs and get 650k+/sec
> easily.
>
48-way as I said here https://lkml.org/lkml/2012/11/3/109. If I said
64-way somewhere else, it was a mistake. The lack of THP would account
for some of the difference. As I was looking for potential
locking-related issues, I also had CONFIG_DEBUG_VMA nd
CONFIG_DEBUG_MUTEXES set which would account for more overhead. Any
options set are set for all tests that make up a group.
> Have you tried to analyze the root cause, what does 'perf top'
> show during the run and how much idle time is there?
>
No, I haven't and the machine is currently occupied. However, a second
profile run was run as part of the test above. The figures I reported are
based on a run without profiling. With profiling, oprofile reported
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 6000
samples % image name app name symbol name
176552 42.9662 vmlinux-3.7.0-rc5-schednuma-v16r1 vmlinux-3.7.0-rc5-schednuma-v16r1 intel_idle
22790 5.5462 vmlinux-3.7.0-rc5-schednuma-v16r1 vmlinux-3.7.0-rc5-schednuma-v16r1 find_busiest_group
10533 2.5633 vmlinux-3.7.0-rc5-schednuma-v16r1 vmlinux-3.7.0-rc5-schednuma-v16r1 update_blocked_averages
10489 2.5526 vmlinux-3.7.0-rc5-schednuma-v16r1 vmlinux-3.7.0-rc5-schednuma-v16r1 rb_get_reader_page
9514 2.3154 vmlinux-3.7.0-rc5-schednuma-v16r1 vmlinux-3.7.0-rc5-schednuma-v16r1 native_write_msr_safe
8511 2.0713 vmlinux-3.7.0-rc5-schednuma-v16r1 vmlinux-3.7.0-rc5-schednuma-v16r1 ring_buffer_consume
7406 1.8023 vmlinux-3.7.0-rc5-schednuma-v16r1 vmlinux-3.7.0-rc5-schednuma-v16r1 idle_cpu
6549 1.5938 vmlinux-3.7.0-rc5-schednuma-v16r1 vmlinux-3.7.0-rc5-schednuma-v16r1 update_cfs_rq_blocked_load
6482 1.5775 vmlinux-3.7.0-rc5-schednuma-v16r1 vmlinux-3.7.0-rc5-schednuma-v16r1 rebalance_domains
5212 1.2684 vmlinux-3.7.0-rc5-schednuma-v16r1 vmlinux-3.7.0-rc5-schednuma-v16r1 run_rebalance_domains
5037 1.2258 perl perl /usr/bin/perl
4167 1.0141 vmlinux-3.7.0-rc5-schednuma-v16r1 vmlinux-3.7.0-rc5-schednuma-v16r1 page_fault
3885 0.9455 vmlinux-3.7.0-rc5-schednuma-v16r1 vmlinux-3.7.0-rc5-schednuma-v16r1 cpumask_next_and
3704 0.9014 vmlinux-3.7.0-rc5-schednuma-v16r1 vmlinux-3.7.0-rc5-schednuma-v16r1 find_next_bit
3498 0.8513 vmlinux-3.7.0-rc5-schednuma-v16r1 vmlinux-3.7.0-rc5-schednuma-v16r1 getnstimeofday
3345 0.8140 vmlinux-3.7.0-rc5-schednuma-v16r1 vmlinux-3.7.0-rc5-schednuma-v16r1 __update_cpu_load
3175 0.7727 vmlinux-3.7.0-rc5-schednuma-v16r1 vmlinux-3.7.0-rc5-schednuma-v16r1 load_balance
3018 0.7345 vmlinux-3.7.0-rc5-schednuma-v16r1 vmlinux-3.7.0-rc5-schednuma-v16r1 menu_select
> Trying to reproduce your findings I have done 4x JVM tests
> myself, using 4x 8-warehouse setups, with a sizing of -Xms8192m
> -Xmx8192m -Xss256k, and here are the results:
>
> v3.7 v3.7
> SPECjbb single-1x32 524k 638k +21.7%
> SPECjbb multi-4x8 633k 655k +3.4%
>
I'll re-run with THP enabled the next time and see what I find.
> So while here we are only marginally better than the
> single-instance numbers (I will try to improve that in numa/core
> v17), they are still better than mainline - and they are
> definitely not slower as your numbers suggest ...
>
> So we need to go back to the basics to figure this out: please
> outline exactly which commit ID of the numa/core tree you have
> booted. Also, how does 'perf top' look like on your box?
>
I'll find out what perf top looks like ASAP.
--
Mel Gorman
SUSE Labs
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2012-11-19 21:37 UTC|newest]
Thread overview: 101+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-11-19 2:14 [PATCH 00/27] Latest numa/core release, v16 Ingo Molnar
2012-11-19 2:14 ` [PATCH 01/27] mm/generic: Only flush the local TLB in ptep_set_access_flags() Ingo Molnar
2012-11-19 2:14 ` [PATCH 02/27] x86/mm: Only do a local tlb flush " Ingo Molnar
2012-11-19 2:14 ` [PATCH 03/27] x86/mm: Introduce pte_accessible() Ingo Molnar
2012-11-19 2:14 ` [PATCH 04/27] mm: Only flush the TLB when clearing an accessible pte Ingo Molnar
2012-11-19 2:14 ` [PATCH 05/27] x86/mm: Completely drop the TLB flush from ptep_set_access_flags() Ingo Molnar
2012-11-19 2:14 ` [PATCH 06/27] mm: Count the number of pages affected in change_protection() Ingo Molnar
2012-11-19 2:14 ` [PATCH 07/27] mm: Optimize the TLB flush of sys_mprotect() and change_protection() users Ingo Molnar
2012-11-19 2:14 ` [PATCH 08/27] sched, numa, mm: Add last_cpu to page flags Ingo Molnar
2012-11-19 2:14 ` [PATCH 09/27] sched, mm, numa: Create generic NUMA fault infrastructure, with architectures overrides Ingo Molnar
2012-11-19 2:14 ` [PATCH 10/27] sched: Make find_busiest_queue() a method Ingo Molnar
2012-11-19 2:14 ` [PATCH 11/27] sched, numa, mm: Describe the NUMA scheduling problem formally Ingo Molnar
2012-11-19 2:14 ` [PATCH 12/27] numa, mm: Support NUMA hinting page faults from gup/gup_fast Ingo Molnar
2012-11-19 2:14 ` [PATCH 13/27] mm/migrate: Introduce migrate_misplaced_page() Ingo Molnar
2012-11-19 2:14 ` [PATCH 14/27] sched, numa, mm, arch: Add variable locality exception Ingo Molnar
2012-11-19 2:14 ` [PATCH 15/27] sched, numa, mm: Add credits for NUMA placement Ingo Molnar
2012-11-19 2:14 ` [PATCH 16/27] sched, mm, x86: Add the ARCH_SUPPORTS_NUMA_BALANCING flag Ingo Molnar
2012-11-19 2:14 ` [PATCH 17/27] sched, numa, mm: Add the scanning page fault machinery Ingo Molnar
2012-11-19 2:14 ` [PATCH 18/27] sched: Add adaptive NUMA affinity support Ingo Molnar
2012-11-19 2:14 ` [PATCH 19/27] sched: Implement constant, per task Working Set Sampling (WSS) rate Ingo Molnar
2012-11-19 2:14 ` [PATCH 20/27] sched, numa, mm: Count WS scanning against present PTEs, not virtual memory ranges Ingo Molnar
2012-11-19 2:14 ` [PATCH 21/27] sched: Implement slow start for working set sampling Ingo Molnar
2012-11-19 2:14 ` [PATCH 22/27] sched, numa, mm: Interleave shared tasks Ingo Molnar
2012-11-19 2:14 ` [PATCH 23/27] sched: Implement NUMA scanning backoff Ingo Molnar
2012-11-19 2:14 ` [PATCH 24/27] sched: Improve convergence Ingo Molnar
2012-11-19 2:14 ` [PATCH 25/27] sched: Introduce staged average NUMA faults Ingo Molnar
2012-11-19 2:14 ` [PATCH 26/27] sched: Track groups of shared tasks Ingo Molnar
2012-11-19 2:14 ` [PATCH 27/27] sched: Use the best-buddy 'ideal cpu' in balancing decisions Ingo Molnar
2012-11-19 16:29 ` [PATCH 00/27] Latest numa/core release, v16 Mel Gorman
2012-11-19 19:13 ` Ingo Molnar
2012-11-19 21:18 ` Mel Gorman
2012-11-19 22:36 ` Ingo Molnar
2012-11-19 23:00 ` Mel Gorman
2012-11-20 0:41 ` Rik van Riel
2012-11-21 10:58 ` Mel Gorman
2012-11-20 1:02 ` Linus Torvalds
2012-11-20 7:17 ` Ingo Molnar
2012-11-20 7:37 ` David Rientjes
2012-11-20 7:48 ` Ingo Molnar
2012-11-20 8:01 ` Ingo Molnar
2012-11-20 8:11 ` David Rientjes
2012-11-21 11:14 ` Mel Gorman
2012-11-20 10:20 ` Mel Gorman
2012-11-20 10:47 ` Mel Gorman
2012-11-20 15:29 ` [PATCH] mm, numa: Turn 4K pte NUMA faults into effective hugepage ones Ingo Molnar
2012-11-20 16:09 ` [PATCH, v2] " Ingo Molnar
2012-11-20 16:31 ` Rik van Riel
2012-11-20 16:52 ` Ingo Molnar
2012-11-21 12:08 ` Mel Gorman
2012-11-21 8:12 ` Ingo Molnar
2012-11-21 2:41 ` David Rientjes
2012-11-21 9:34 ` Ingo Molnar
2012-11-21 11:40 ` Mel Gorman
2012-11-23 1:26 ` Alex Shi
2012-11-20 17:56 ` numa/core regressions fixed - more testers wanted Ingo Molnar
2012-11-21 1:54 ` Andrew Theurer
2012-11-21 3:22 ` Rik van Riel
2012-11-21 4:10 ` Hugh Dickins
2012-11-21 17:59 ` Andrew Theurer
2012-11-21 11:52 ` Mel Gorman
2012-11-21 22:15 ` Andrew Theurer
2012-11-21 3:33 ` David Rientjes
2012-11-21 9:38 ` Ingo Molnar
2012-11-21 11:06 ` Ingo Molnar
2012-11-21 8:39 ` Alex Shi
2012-11-22 1:21 ` Ingo Molnar
2012-11-23 13:31 ` Ingo Molnar
2012-11-23 15:23 ` Alex Shi
2012-11-26 2:11 ` Alex Shi
2012-11-28 14:21 ` Alex Shi
2012-11-20 10:40 ` [PATCH 00/27] Latest numa/core release, v16 Ingo Molnar
2012-11-20 11:40 ` Mel Gorman
2012-11-21 10:38 ` Mel Gorman
2012-11-21 19:37 ` Andrea Arcangeli
2012-11-21 19:56 ` Mel Gorman
2012-11-19 20:07 ` Ingo Molnar
2012-11-19 21:37 ` Mel Gorman [this message]
2012-11-20 0:50 ` David Rientjes
2012-11-20 1:05 ` David Rientjes
2012-11-20 6:00 ` Ingo Molnar
2012-11-20 6:20 ` David Rientjes
2012-11-20 7:44 ` Ingo Molnar
2012-11-20 7:48 ` Paul Turner
2012-11-20 8:20 ` David Rientjes
2012-11-20 9:06 ` Ingo Molnar
2012-11-20 9:41 ` [patch] x86/vsyscall: Add Kconfig option to use native vsyscalls, switch to it Ingo Molnar
2012-11-20 23:01 ` Andy Lutomirski
2012-11-21 0:43 ` David Rientjes
2012-11-20 12:02 ` [PATCH] x86/mm: Don't flush the TLB on #WP pmd fixups Ingo Molnar
2012-11-20 12:31 ` Ingo Molnar
2012-11-21 11:47 ` Mel Gorman
2012-11-21 1:22 ` David Rientjes
2012-11-21 17:02 ` [PATCH 00/27] Latest numa/core release, v16 Linus Torvalds
2012-11-21 17:10 ` Ingo Molnar
2012-11-21 17:20 ` Ingo Molnar
2012-11-22 4:31 ` David Rientjes
2012-11-21 17:40 ` Ingo Molnar
2012-11-21 22:04 ` Linus Torvalds
2012-11-21 22:46 ` Ingo Molnar
2012-11-21 17:45 ` Rik van Riel
2012-11-21 18:04 ` Ingo Molnar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20121119213708.GN8218@suse.de \
--to=mgorman@suse.de \
--cc=Lee.Schermerhorn@hp.com \
--cc=a.p.zijlstra@chello.nl \
--cc=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=cl@linux.com \
--cc=hannes@cmpxchg.org \
--cc=hughd@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mingo@kernel.org \
--cc=pjt@google.com \
--cc=riel@redhat.com \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).