From: Nick Piggin <piggin@cyberone.com.au>
To: Rusty Russell <rusty@rustcorp.com.au>
Cc: linux-kernel <linux-kernel@vger.kernel.org>,
Anton Blanchard <anton@samba.org>, Ingo Molnar <mingo@redhat.com>,
"Martin J. Bligh" <mbligh@aracnet.com>,
"Nakajima, Jun" <jun.nakajima@intel.com>,
Mark Wong <markw@osdl.org>
Subject: Re: [CFT][RFC] HT scheduler
Date: Fri, 12 Dec 2003 18:00:42 +1100
Message-ID: <3FD9679A.1020404@cyberone.com.au>
In-Reply-To: <20031212052812.E016B2C072@lists.samba.org>
Rusty Russell wrote:
>In message <3FD7F1B9.5080100@cyberone.com.au> you write:
>
>>http://www.kerneltrap.org/~npiggin/w26/
>>Against 2.6.0-test11
>>
>>This includes the SMT description for P4. Initial results shows comparable
>>performance to Ingo's shared runqueue's patch on a dual P4 Xeon.
>>
>
>I'm still not convinced. Sharing runqueues is simple, and in fact
>exactly what you want for HT: you want to balance *runqueues*, not
>CPUs. In fact, it can be done without a CONFIG_SCHED_SMT addition.
>
>Your patch is more general, more complex, but doesn't actually seem to
>buy anything. It puts a general domain structure inside the
>scheduler, without putting it anywhere else which wants it (eg. slab
>cache balancing). My opinion is either (1) produce a general NUMA
>topology which can then be used by the scheduler, or (2) do the
>minimal change in the scheduler which makes HT work well.
>
>Note: some of your changes I really like, it's just that I think this
>is overkill.
>
>I'll produce a patch so we can have something solid to talk about.
>
Thanks for having a look Rusty. I'll try to convince you :)
As you know, the domain classes are not just for HT: they can describe
multiple levels of NUMA, and they can be built by architecture-specific
code, which is good for Opteron, for example. They don't need
CONFIG_SCHED_SMT either, of course, or even CONFIG_NUMA: degenerate
domains can just be collapsed (the code to do that isn't there yet).
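
To make the idea of collapsing degenerate domains concrete, here is a minimal
sketch in C. It is not the w26 code: the type and function names
(cpumask_sketch_t, struct domain_sketch, collapse_degenerate) are hypothetical,
and it assumes a level counts as degenerate when it spans exactly the same CPUs
as the level below it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this is an illustration, not the w26 patch. */
typedef uint64_t cpumask_sketch_t;      /* one bit per CPU, up to 64 CPUs */

struct domain_sketch {
    const char *name;                   /* e.g. "SMT", "SMP", "NUMA" */
    cpumask_sketch_t span;              /* CPUs covered by this level */
    struct domain_sketch *parent;       /* next wider level, NULL at the top */
};

/* Assumption: a level is degenerate if it adds no CPUs over the level below. */
static int domain_is_degenerate(const struct domain_sketch *child,
                                const struct domain_sketch *d)
{
    return d->span == child->span;
}

/*
 * Unlink degenerate levels above 'base' and return the number of levels
 * left in the chain, counting 'base' itself (which is always kept).
 */
static int collapse_degenerate(struct domain_sketch *base)
{
    struct domain_sketch *d = base;
    int levels = 1;

    assert(base != NULL);
    while (d->parent) {
        if (domain_is_degenerate(d, d->parent)) {
            d->parent = d->parent->parent;  /* skip the redundant level */
        } else {
            d = d->parent;
            levels++;
        }
    }
    return levels;
}
```

On a two-CPU box with no HT siblings and a single node, a NUMA level spanning
the same two CPUs as the SMP level would be dropped by this sketch, leaving a
two-level chain.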
I find shared runqueues aren't so flexible. I think they perfectly describe
the P4 HT architecture, but what happens if (when) siblings get separate
L1 caches? What about SMT, CMP, SMP and NUMA levels in the POWER5?
The large SGI systems (and, I imagine, IBM's POWER5s) need things like
progressive balancing backoff and would probably benefit from a more
hierarchical balancing scheme, so that all the balancing operations don't
kill the system.
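
One possible reading of "progressive balancing backoff" is sketched below in C.
This is an assumption about the general technique, not a description of what
w26 does: each balancing level keeps its own interval, doubles it (up to a cap)
whenever a balance attempt finds nothing to move, and resets it once tasks are
moved. The struct and function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical per-domain balancing state; not taken from the w26 patch. */
struct balance_state_sketch {
    unsigned int min_interval;  /* ticks between attempts while busy */
    unsigned int max_interval;  /* upper bound on the backoff */
    unsigned int interval;      /* current ticks between attempts */
};

/*
 * Update the interval after one balance attempt and return the new value.
 * 'moved_tasks' is nonzero if the attempt actually migrated something.
 */
static unsigned int balance_update_interval(struct balance_state_sketch *s,
                                            int moved_tasks)
{
    assert(s->min_interval > 0 && s->min_interval <= s->max_interval);

    if (s->interval < s->min_interval)
        s->interval = s->min_interval;      /* sanitise a zeroed state */

    if (moved_tasks) {
        s->interval = s->min_interval;      /* imbalance seen: poll quickly */
    } else if (s->interval > s->max_interval / 2) {
        s->interval = s->max_interval;      /* clamp without overflowing */
    } else {
        s->interval *= 2;                   /* nothing to do: back off */
    }
    return s->interval;
}
```

With wider levels given larger max_interval values, the expensive cross-node
balancing passes would run rarely on an already balanced machine while the
cheap sibling-level passes stay frequent.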
w26 does ALL this, while sched.o is 3K smaller than Ingo's shared runqueue
patch on NUMA and SMP, and 1K smaller on UP (although sched.c is 90 lines
longer). kernbench system time is down nearly 10% on the NUMAQ, so it isn't
hurting performance either.
And finally, Linus also wanted the balancing code to be generalised to
handle SMT, and Ingo said he liked my patch from a first look.