From: Rik van Riel <riel@redhat.com>
To: Andrea Arcangeli <aarcange@redhat.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
Hillf Danton <dhillf@gmail.com>, Dan Smith <danms@us.ibm.com>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Linus Torvalds <torvalds@linux-foundation.org>,
Andrew Morton <akpm@linux-foundation.org>,
Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@elte.hu>,
Paul Turner <pjt@google.com>,
Suresh Siddha <suresh.b.siddha@intel.com>,
Mike Galbraith <efault@gmx.de>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
Lai Jiangshan <laijs@cn.fujitsu.com>,
Bharata B Rao <bharata.rao@gmail.com>,
Lee Schermerhorn <Lee.Schermerhorn@hp.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>,
Christoph Lameter <cl@linux.com>, Alex Shi <alex.shi@intel.com>,
Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>,
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
Don Morris <don.morris@hp.com>,
Benjamin Herrenschmidt <benh@kernel.crashing.org>
Subject: Re: [PATCH 22/40] autonuma: teach CFS about autonuma affinity
Date: Sun, 01 Jul 2012 12:37:40 -0400
Message-ID: <4FF07CD4.1070101@redhat.com>
In-Reply-To: <1340888180-15355-23-git-send-email-aarcange@redhat.com>
On 06/28/2012 08:56 AM, Andrea Arcangeli wrote:
> @@ -2621,6 +2622,8 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
> load = weighted_cpuload(i);
>
> if (load < min_load || (load == min_load && i == this_cpu)) {
> + if (!task_autonuma_cpu(p, i))
> + continue;
> min_load = load;
> idlest = i;
> }
Is it right to only consider CPUs on the "right" NUMA
node, or do we want to harvest idle time elsewhere as
a last resort?
After your change the comment above find_idlest_cpu
no longer matches what the function does!
    if (load < min_load || (load == min_load && i == this_cpu)) {
        min_load = load;
        idlest = i;
    }
Would it make sense for task_autonuma_cpu(p, i) to be
part of the if ( ) condition itself, since that is what
you are trying to do?
if ((load < min_load || (load == min_load &&
i == this_cpu)) && task_autonuma_cpu(p, i)) {
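To make the suggestion concrete, here is a minimal user-space sketch of the combined condition. It is not kernel code: struct cpu_info, its fields and find_idlest_cpu_sketch() are hypothetical stand-ins for weighted_cpuload(i), task_autonuma_cpu(p, i) and the real function. As far as I can tell it selects the same CPU as the nested "continue" in the patch; the only difference is that the filter is visible in the condition.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU snapshot, for illustration only. */
struct cpu_info {
    unsigned long load; /* stand-in for weighted_cpuload(i) */
    bool numa_ok;       /* stand-in for task_autonuma_cpu(p, i) */
};

/*
 * Return the index of the least loaded CPU that also passes the
 * NUMA check, preferring this_cpu on a tie, or -1 if no CPU passes.
 */
static int find_idlest_cpu_sketch(const struct cpu_info *cpus, size_t n,
                                  int this_cpu)
{
    unsigned long min_load = ULONG_MAX;
    int idlest = -1;

    for (size_t i = 0; i < n; i++) {
        unsigned long load = cpus[i].load;

        /* Load comparison and NUMA filter in a single condition. */
        if ((load < min_load ||
             (load == min_load && (int)i == this_cpu)) &&
            cpus[i].numa_ok) {
            min_load = load;
            idlest = (int)i;
        }
    }
    return idlest;
}
```

Note that the sketch returns -1 when no CPU passes the filter, which is the "harvest idle time elsewhere" case the question above is about.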
> @@ -2639,24 +2642,27 @@ static int select_idle_sibling(struct task_struct *p, int target)
These bits make sense.
> /*
> * Otherwise, iterate the domains and find an elegible idle cpu.
> */
> + idle_target = false;
> sd = rcu_dereference(per_cpu(sd_llc, target));
> for_each_lower_domain(sd) {
> sg = sd->groups;
> @@ -2670,9 +2676,18 @@ static int select_idle_sibling(struct task_struct *p, int target)
> goto next;
> }
>
> - target = cpumask_first_and(sched_group_cpus(sg),
> - tsk_cpus_allowed(p));
> - goto done;
> + for_each_cpu_and(i, sched_group_cpus(sg),
> + tsk_cpus_allowed(p)) {
> + /* Find autonuma cpu only in idle group */
> + if (task_autonuma_cpu(p, i)) {
> + target = i;
> + goto done;
> + }
> + if (!idle_target) {
> + idle_target = true;
> + target = i;
> + }
> + }
There already is a for loop right above this:
for_each_cpu(i, sched_group_cpus(sg)) {
if (!idle_cpu(i))
goto next;
}
It appears to loop over all the CPUs in a sched group, but
not really. If the first CPU in the sched group is idle,
it will fall through.
If the first CPU in the sched group is not idle, we move
on to the next sched group, instead of looking at the
other CPUs in the sched group.
Peter, Ingo, what is the original code in select_idle_sibling
supposed to do?
That original for_each_cpu loop would make more sense if
it actually looped over each cpu in the group.
Then we could remember two targets. One idle target, and
one autonuma-compliant idle target.
If, after looping over the CPUs, we find no autonuma-compliant
target, we use the other idle target.
Does that make sense?
Am I overlooking something about the way select_idle_sibling
is supposed to work?
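A rough user-space sketch of the two-target idea, in case it helps the discussion. Everything here is hypothetical: struct cpu_state and pick_idle_target() only model idle_cpu(i), tsk_cpus_allowed(p) and task_autonuma_cpu(p, i). It also assumes the scan skips busy CPUs instead of rejecting the whole group, which may not be what the scheduler wants.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU state, for illustration only. */
struct cpu_state {
    bool idle;    /* stand-in for idle_cpu(i) */
    bool allowed; /* stand-in for membership in tsk_cpus_allowed(p) */
    bool numa_ok; /* stand-in for task_autonuma_cpu(p, i) */
};

/*
 * Scan every CPU in the group once, remembering the first idle CPU
 * and the first idle CPU that autonuma likes. Prefer the latter,
 * fall back to the former, return -1 if nothing is idle.
 */
static int pick_idle_target(const struct cpu_state *cpus, size_t n)
{
    int idle_target = -1;
    int numa_target = -1;

    for (size_t i = 0; i < n; i++) {
        if (!cpus[i].idle || !cpus[i].allowed)
            continue;
        if (idle_target < 0)
            idle_target = (int)i;
        if (cpus[i].numa_ok) {
            numa_target = (int)i;
            break; /* best possible choice, stop looking */
        }
    }
    return numa_target >= 0 ? numa_target : idle_target;
}
```

The cost over the current code is one extra int, and the fallback target needs no separate idle_target flag.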
> @@ -3195,6 +3217,8 @@ static int move_one_task(struct lb_env *env)
> {
> struct task_struct *p, *n;
>
> + env->flags |= LBF_NUMA;
> +numa_repeat:
> list_for_each_entry_safe(p, n, &env->src_rq->cfs_tasks, se.group_node) {
> if (throttled_lb_pair(task_group(p), env->src_rq->cpu, env->dst_cpu))
> continue;
> @@ -3209,8 +3233,14 @@ static int move_one_task(struct lb_env *env)
> * stats here rather than inside move_task().
> */
> schedstat_inc(env->sd, lb_gained[env->idle]);
> + env->flags &= ~LBF_NUMA;
> return 1;
> }
> + if (env->flags & LBF_NUMA) {
> + env->flags &= ~LBF_NUMA;
> + goto numa_repeat;
> + }
> +
> return 0;
> }
Would it make sense to remember the first non-autonuma-compliant
task that can be moved, and keep searching for one that fits
autonuma's criteria further down the line?
Then, if you fail to find a good autonuma task in the first
iteration, you do not have to loop over the list a second time.
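Something along these lines, as a user-space sketch only. struct task, its fields and pick_task_to_move() are hypothetical; the two booleans stand in for the existing migration eligibility checks and for whatever autonuma test LBF_NUMA enables. The real loop walks env->src_rq->cfs_tasks with list_for_each_entry_safe, which a plain singly linked list replaces here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical task entry on a run queue list, for illustration only. */
struct task {
    int id;
    bool can_migrate; /* stand-in for the migration eligibility checks */
    bool numa_ok;     /* stand-in for autonuma's placement check */
    struct task *next;
};

/*
 * Single pass: return the first movable task that autonuma likes.
 * If there is none, return the first movable task seen, or NULL if
 * nothing on the list can be moved.
 */
static struct task *pick_task_to_move(struct task *head)
{
    struct task *fallback = NULL;

    for (struct task *p = head; p; p = p->next) {
        if (!p->can_migrate)
            continue;
        if (p->numa_ok)
            return p;
        if (!fallback)
            fallback = p; /* remember it, but keep searching */
    }
    return fallback;
}
```

The fallback pointer is the only extra state, and the numa_repeat label and the second walk over the list go away.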
> @@ -3235,6 +3265,8 @@ static int move_tasks(struct lb_env *env)
> if (env->imbalance <= 0)
> return 0;
>
> + env->flags |= LBF_NUMA;
> +numa_repeat:
Same here. Loops are bad enough, and it looks like it would
only cost one pointer on the stack to avoid numa_repeat :)
--
All rights reversed