From: tip-bot for Rik van Riel <tipbot@zytor.com>
To: linux-tip-commits@vger.kernel.org
Cc: torvalds@linux-foundation.org, riel@redhat.com, hpa@zytor.com,
linux-kernel@vger.kernel.org, tglx@linutronix.de,
mingo@kernel.org, mgorman@suse.de, peterz@infradead.org
Subject: [tip:sched/core] sched/numa: Slow down scan rate if shared faults dominate
Date: Thu, 10 Aug 2017 05:07:30 -0700 [thread overview]
Message-ID: <tip-37ec97deb3a8c68a7adfab61beb261ffeab19d09@git.kernel.org> (raw)
In-Reply-To: <20170731192847.23050-2-riel@redhat.com>
Commit-ID: 37ec97deb3a8c68a7adfab61beb261ffeab19d09
Gitweb: http://git.kernel.org/tip/37ec97deb3a8c68a7adfab61beb261ffeab19d09
Author: Rik van Riel <riel@redhat.com>
AuthorDate: Mon, 31 Jul 2017 15:28:46 -0400
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 10 Aug 2017 12:18:16 +0200
sched/numa: Slow down scan rate if shared faults dominate
The comment above update_task_scan_period() says the scan period should
be increased (scanning slows down) if the majority of memory accesses
are on the local node, or if the majority of the page accesses are
shared with other tasks.
However, with the current code, all a high ratio of shared accesses
does is slow down the rate at which scanning is made faster.
This patch changes things so either lots of shared accesses or
lots of local accesses will slow down scanning, and numa scanning
is sped up only when there are lots of private faults on remote
memory pages.
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: jhladky@redhat.com
Cc: lvenanci@redhat.com
Link: http://lkml.kernel.org/r/20170731192847.23050-2-riel@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
kernel/sched/fair.c | 39 +++++++++++++++++++++++++--------------
1 file changed, 25 insertions(+), 14 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index ef5b66b..cb6b7c8 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1892,7 +1892,7 @@ static void update_task_scan_period(struct task_struct *p,
unsigned long shared, unsigned long private)
{
unsigned int period_slot;
- int ratio;
+ int lr_ratio, ps_ratio;
int diff;
unsigned long remote = p->numa_faults_locality[0];
@@ -1922,25 +1922,36 @@ static void update_task_scan_period(struct task_struct *p,
* >= NUMA_PERIOD_THRESHOLD scan period increases (scan slower)
*/
period_slot = DIV_ROUND_UP(p->numa_scan_period, NUMA_PERIOD_SLOTS);
- ratio = (local * NUMA_PERIOD_SLOTS) / (local + remote);
- if (ratio >= NUMA_PERIOD_THRESHOLD) {
- int slot = ratio - NUMA_PERIOD_THRESHOLD;
+ lr_ratio = (local * NUMA_PERIOD_SLOTS) / (local + remote);
+ ps_ratio = (private * NUMA_PERIOD_SLOTS) / (private + shared);
+
+ if (ps_ratio >= NUMA_PERIOD_THRESHOLD) {
+ /*
+ * Most memory accesses are local. There is no need to
+ * do fast NUMA scanning, since memory is already local.
+ */
+ int slot = ps_ratio - NUMA_PERIOD_THRESHOLD;
+ if (!slot)
+ slot = 1;
+ diff = slot * period_slot;
+ } else if (lr_ratio >= NUMA_PERIOD_THRESHOLD) {
+ /*
+ * Most memory accesses are shared with other tasks.
+ * There is no point in continuing fast NUMA scanning,
+ * since other tasks may just move the memory elsewhere.
+ */
+ int slot = lr_ratio - NUMA_PERIOD_THRESHOLD;
if (!slot)
slot = 1;
diff = slot * period_slot;
} else {
- diff = -(NUMA_PERIOD_THRESHOLD - ratio) * period_slot;
-
/*
- * Scale scan rate increases based on sharing. There is an
- * inverse relationship between the degree of sharing and
- * the adjustment made to the scanning period. Broadly
- * speaking the intent is that there is little point
- * scanning faster if shared accesses dominate as it may
- * simply bounce migrations uselessly
+ * Private memory faults exceed (SLOTS-THRESHOLD)/SLOTS,
+ * yet they are not on the local NUMA node. Speed up
+ * NUMA scanning to get the memory moved over.
*/
- ratio = DIV_ROUND_UP(private * NUMA_PERIOD_SLOTS, (private + shared + 1));
- diff = (diff * ratio) / NUMA_PERIOD_SLOTS;
+ int ratio = max(lr_ratio, ps_ratio);
+ diff = -(NUMA_PERIOD_THRESHOLD - ratio) * period_slot;
}
p->numa_scan_period = clamp(p->numa_scan_period + diff,
next prev parent reply other threads:[~2017-08-10 12:11 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-07-31 19:28 [PATCH 0/2] numa,sched: improve performance for multi-threaded workloads riel
2017-07-31 19:28 ` [RHEL-ALT-7.4 PATCH 1/2] numa,sched: slow down scan rate if shared faults dominate riel
2017-08-02 7:01 ` Mel Gorman
2017-08-10 12:07 ` tip-bot for Rik van Riel [this message]
2017-07-31 19:28 ` [RHEL-ALT-7.4 PATCH 2/2] sched,numa: scale scan period with tasks in group and shared/private riel
2017-08-02 10:22 ` Mel Gorman
2017-08-10 12:07 ` [tip:sched/core] sched/numa: Scale " tip-bot for Rik van Riel
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=tip-37ec97deb3a8c68a7adfab61beb261ffeab19d09@git.kernel.org \
--to=tipbot@zytor.com \
--cc=hpa@zytor.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-tip-commits@vger.kernel.org \
--cc=mgorman@suse.de \
--cc=mingo@kernel.org \
--cc=peterz@infradead.org \
--cc=riel@redhat.com \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox