From: Peter Zijlstra <peterz@infradead.org>
To: 王贇 <yun.wang@linux.alibaba.com>
Cc: hannes@cmpxchg.org, mhocko@kernel.org, vdavydov.dev@gmail.com,
Ingo Molnar <mingo@redhat.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
mcgrof@kernel.org, keescook@chromium.org,
linux-fsdevel@vger.kernel.org, cgroups@vger.kernel.org,
Mel Gorman <mgorman@suse.de>,
riel@surriel.com
Subject: Re: [PATCH 4/4] numa: introduce numa cling feature
Date: Thu, 11 Jul 2019 16:27:28 +0200
Message-ID: <20190711142728.GF3402@hirez.programming.kicks-ass.net>
In-Reply-To: <9a440936-1e5d-d3bb-c795-ef6f9839a021@linux.alibaba.com>

On Wed, Jul 03, 2019 at 11:34:16AM +0800, 王贇 wrote:
> Although we have put a lot of effort into settling a task down on a
> particular node, there are still ways for a task to leave its preferred
> node: wakeup, numa swap migrations, or load balance.
>
> When we use cpu cgroups in a shared way, all the workloads see all the
> cpus, so this can be really bad, especially when there are too many
> fast wakeups. Although we can now numa group the tasks, they won't
> really stay on the same node. For example, with numa groups ng_A,
> ng_B, ng_C and ng_D, a very likely result is:
>
> CPU Usage:
>	Node 0		Node 1
>	ng_A(600%)	ng_A(400%)
>	ng_B(400%)	ng_B(600%)
>	ng_C(400%)	ng_C(600%)
>	ng_D(600%)	ng_D(400%)
>
> Memory Ratio:
>	Node 0		Node 1
>	ng_A(60%)	ng_A(40%)
>	ng_B(40%)	ng_B(60%)
>	ng_C(40%)	ng_C(60%)
>	ng_D(60%)	ng_D(40%)
>
> Locality won't be too bad, but it is far from the best situation; we
> want a numa group to settle down thoroughly on a particular node, with
> everything balanced.
>
> Thus we introduce numa cling, which tries to prevent tasks from leaving
> the preferred node on the wakeup fast path.
> @@ -6195,6 +6447,13 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
>  	if ((unsigned)i < nr_cpumask_bits)
>  		return i;
>  
> +	/*
> +	 * Failed to find an idle cpu, wake affine may want to pull but
> +	 * try stay on prev-cpu when the task cling to it.
> +	 */
> +	if (task_numa_cling(p, cpu_to_node(prev), cpu_to_node(target)))
> +		return prev;
> +
>  	return target;
>  }

select_idle_sibling() should never cross node boundaries and is thus
entirely the wrong place to fix anything.