From: Rik van Riel <riel@redhat.com>
To: Mel Gorman <mgorman@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Ingo Molnar <mingo@kernel.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Hugh Dickins <hughd@google.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linux-MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 19/19] mm: sched: numa: Implement slow start for working set sampling
Date: Tue, 06 Nov 2012 14:56:59 -0500
Message-ID: <50996B8B.30404@redhat.com>
In-Reply-To: <1352193295-26815-20-git-send-email-mgorman@suse.de>

On 11/06/2012 04:14 AM, Mel Gorman wrote:
> From: Peter Zijlstra <a.p.zijlstra@chello.nl>
>
> Add a 1 second delay before starting to scan the working set of
> a task and starting to balance it amongst nodes.
>
> [ note that before the constant per task WSS sampling rate patch
>    the initial scan would happen much later still, in effect that
>    patch caused this regression. ]
>
> The theory is that short-run tasks benefit very little from NUMA
> placement: they come and go, and they had better stick to the node
> they were started on. As tasks mature and rebalance to other CPUs
> and nodes, their NUMA placement has to change as well, and it
> starts to matter more and more.
>
> In practice this change fixes an observable kbuild regression:
>
>     # [ a perf stat --null --repeat 10 test of ten bzImage builds to /dev/shm ]
>
>     !NUMA:
>     45.291088843 seconds time elapsed                                          ( +-  0.40% )
>     45.154231752 seconds time elapsed                                          ( +-  0.36% )
>
>     +NUMA, no slow start:
>     46.172308123 seconds time elapsed                                          ( +-  0.30% )
>     46.343168745 seconds time elapsed                                          ( +-  0.25% )
>
>     +NUMA, 1 sec slow start:
>     45.224189155 seconds time elapsed                                          ( +-  0.25% )
>     45.160866532 seconds time elapsed                                          ( +-  0.17% )
>
> and it also fixes an observable perf bench (hackbench) regression:
>
>     # perf stat --null --repeat 10 perf bench sched messaging
>
>     -NUMA:                  0.246225691 seconds time elapsed                   ( +-  1.31% )
>     +NUMA no slow start:    0.252620063 seconds time elapsed                   ( +-  1.13% )
>     +NUMA 1sec delay:       0.248076230 seconds time elapsed                   ( +-  1.35% )
>
> The implementation is simple and straightforward; most of the patch
> deals with adding the /proc/sys/kernel/balance_numa_scan_delay_ms
> tunable.
>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Rik van Riel <riel@redhat.com>
> [ Wrote the changelog, ran measurements, tuned the default. ]
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> Signed-off-by: Mel Gorman <mgorman@suse.de>

Reviewed-by: Rik van Riel <riel@redhat.com>
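
For readers who want the idea in code form, the slow start described in
the changelog can be modelled in userspace roughly as below. This is only
a sketch under stated assumptions, not the patch itself: struct task_model,
task_model_init() and scan_due() are hypothetical names made up for
illustration, and scan_delay_ms merely stands in for the
balance_numa_scan_delay_ms tunable with the 1 second default the changelog
mentions.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the balance_numa_scan_delay_ms tunable.
 * 1000 ms matches the "1 second delay" in the changelog.
 */
static const uint64_t scan_delay_ms = 1000;

/* Hypothetical per-task state; not the kernel's task_struct. */
struct task_model {
	uint64_t start_ms;     /* when the task was created */
	uint64_t next_scan_ms; /* earliest time a scan may run */
};

/* At task creation, push the first scan out by the slow-start delay. */
static void task_model_init(struct task_model *t, uint64_t now_ms)
{
	t->start_ms = now_ms;
	t->next_scan_ms = now_ms + scan_delay_ms;
}

/* A task that exits before the delay expires is never scanned. */
static bool scan_due(const struct task_model *t, uint64_t now_ms)
{
	return now_ms >= t->next_scan_ms;
}
```

In this model a task created at t = 5000 ms is left alone until
t = 6000 ms, so short-lived tasks such as the compiler processes in a
kbuild run would typically finish without paying any scanning cost.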

