From: Andrea Arcangeli <aarcange@redhat.com>
To: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
Hillf Danton <dhillf@gmail.com>, Dan Smith <danms@us.ibm.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Andrew Morton <akpm@linux-foundation.org>,
Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@elte.hu>,
Paul Turner <pjt@google.com>,
Suresh Siddha <suresh.b.siddha@intel.com>,
Mike Galbraith <efault@gmx.de>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
Lai Jiangshan <laijs@cn.fujitsu.com>,
Bharata B Rao <bharata.rao@gmail.com>,
Lee Schermerhorn <Lee.Schermerhorn@hp.com>,
Rik van Riel <riel@redhat.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>,
Christoph Lameter <cl@linux.com>
Subject: Re: [PATCH 22/35] autonuma: sched_set_autonuma_need_balance
Date: Tue, 29 May 2012 19:33:47 +0200
Message-ID: <20120529173347.GJ21339@redhat.com>
In-Reply-To: <1338307942.26856.111.camel@twins>

On Tue, May 29, 2012 at 06:12:22PM +0200, Peter Zijlstra wrote:
> On Fri, 2012-05-25 at 19:02 +0200, Andrea Arcangeli wrote:
> > Invoke autonuma_balance only on the busy CPUs at the same frequency of
> > the CFS load balance.
> >
> > Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
> > ---
> > kernel/sched/fair.c | 3 +++
> > 1 files changed, 3 insertions(+), 0 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 99d1d33..1357938 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -4893,6 +4893,9 @@ static void run_rebalance_domains(struct softirq_action *h)
> >
> >  	rebalance_domains(this_cpu, idle);
> >
> > +	if (!this_rq->idle_balance)
> > +		sched_set_autonuma_need_balance();
> > +
>
> This just isn't enough.. the whole thing needs to move out of
> schedule(). The only time schedule() should ever look at another cpu is
> if its idle.
>
> As it stands load-balance actually takes too much time as it is to live
> in a softirq, -rt gets around that by pushing all softirqs into a thread
> and I was thinking of doing some of that for mainline too.

No worries, I didn't mean to leave it like this forever. I was
considering the stop_one_cpu_nowait() variant, but I didn't have
enough time to work out whether it would fit my case. I need to think
about that again.

I was wondering which thread to use for that, or whether to use the
stop_one_cpu _nowait variant that active balancing uses, but it wasn't
easy to change, and since from a practical standpoint it already
flies, I released it. It's already an improvement: the previous
approach was mostly a debugging aid to see whether autonuma_balance
would flood the debug log and fail to converge.

autonuma_balance isn't fundamentally different from load_balance: they
both look around at the other runqueues to see if some task should be
moved.
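
To make the shared shape concrete, here is a minimal userspace sketch,
not kernel code. struct rq_sim and pick_busiest() are hypothetical
names invented for this illustration, and the plain task-count
criterion is only a placeholder: the real load_balance and
autonuma_balance each apply their own, more involved criteria.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-CPU runqueue; not the kernel's struct rq. */
struct rq_sim {
	int cpu;
	unsigned int nr_running;	/* runnable tasks queued on this CPU */
};

/*
 * Scan every other runqueue and return the index of the one with the
 * most runnable tasks, or -1 if none is busier than this_cpu by more
 * than one task (in which case moving a task would not help).
 */
static int pick_busiest(const struct rq_sim *rqs, size_t nr_cpus,
			size_t this_cpu)
{
	int busiest = -1;
	unsigned int max_load;

	assert(this_cpu < nr_cpus);
	max_load = rqs[this_cpu].nr_running + 1;

	for (size_t i = 0; i < nr_cpus; i++) {
		if (i == this_cpu)
			continue;	/* only look at the other runqueues */
		if (rqs[i].nr_running > max_load) {
			max_load = rqs[i].nr_running;
			busiest = (int)i;
		}
	}
	return busiest;
}
```

The cost is the loop over all the other runqueues, which is why both
balancers are sensitive to where and how often they run.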

If you move load_balance to a kernel thread, I could move
autonuma_balance there too.

I just wasn't sure whether invoking a schedule() just to call
autonuma_balance() made any sense, so I thought running it from
softirq too, with the non-blocking _nowait variant (or keeping it in
schedule() to be able to call stop_one_cpu without _nowait), would be
more efficient.
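
A userspace sketch of the arrangement under discussion may help. It
assumes that the flag set by sched_set_autonuma_need_balance() in the
softirq is consumed on the next pass through schedule(). Everything
here (struct cpu_sim, run_rebalance_sim(), schedule_sim(), the
balance_runs counter) is hypothetical and only mimics the control
flow, not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_SIM 4

/* Hypothetical per-CPU state; field names are illustrative only. */
struct cpu_sim {
	bool idle_balance;		/* CPU was idle when the softirq ran */
	bool autonuma_need_balance;	/* set in softirq, consumed later */
	unsigned int balance_runs;	/* how often the expensive pass ran */
};

static struct cpu_sim cpus[NR_CPUS_SIM];

/* Softirq side: only mark busy CPUs, mirroring the check in the patch. */
static void run_rebalance_sim(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_SIM);
	if (!cpus[cpu].idle_balance)
		cpus[cpu].autonuma_need_balance = true;
}

/* schedule() side: run the expensive pass at most once per request. */
static void schedule_sim(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_SIM);
	if (!cpus[cpu].autonuma_need_balance)
		return;
	cpus[cpu].autonuma_need_balance = false;
	cpus[cpu].balance_runs++;	/* placeholder for autonuma_balance() */
}
```

With this shape the expensive pass runs at the softirq's frequency on
busy CPUs, however often schedule() itself is entered.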

The moment I gave up on the _nowait variant before releasing was when
I couldn't understand what tlb_migrate_finish is doing, and why it's
not present in the _nowait version in fair.c. Can you explain that to
me? Obviously it's only used by ia64, so I could just as well ignore
it, but it was still an additional annoyance that made me think I
needed a bit more time to think about it.

I'm glad you acknowledge load_balance already takes a large share of
the time, as it needs to find the busiest runqueue by checking other
CPU runqueues too... With autonuma14 there's no longer any measurable
difference in hackbench between autonuma=y, the noautonuma boot
parameter, and upstream without autonuma applied (not just
autonuma=n). So the cost on a 24-way SMP is 0.

Then I tried to measure it with lockdep and all lock/mutex
debugging/stats enabled too: there's a slightly measurable slowdown in
hackbench that may not be a measurement error, but it's barely
noticeable, and I expect that if I removed load_balance from the
softirq, the gain would be bigger than from removing autonuma_balance
(it goes from 70 to 80 sec on average IIRC, but the error is about
10 sec; just the average seems slightly higher). With lockdep and all
other debugging disabled it takes a fixed 6 sec for all configs, and
the difference is definitely not measurable (tested both thread and
process mode, not that it makes any difference for this).