From: Ingo Molnar <mingo@elte.hu>
To: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Mike Travis <travis@sgi.com>
Subject: Re: [git pull] cpus4096 fixes
Date: Mon, 28 Jul 2008 09:56:11 +0200
Message-ID: <20080728075611.GA16208@elte.hu>
In-Reply-To: <200807281634.43036.rusty@rustcorp.com.au>


* Rusty Russell <rusty@rustcorp.com.au> wrote:

> On Monday 28 July 2008 13:06:36 Andrew Morton wrote:
> > On Mon, 28 Jul 2008 10:42:12 +1000 Rusty Russell <rusty@rustcorp.com.au> wrote:
> > > The 4k CPU patches have been sliding in without review up until now.
> >
> > wot?
> 
> This surprises you? [...]

you should check many of the earliest iterations (it's all on lkml), and 
the bits we rejected in review/testing. You'll be surprised how much 
questionable and fragile stuff was filtered out.

But your intuition is right in a sense, this whole topic _feels_ ugly, 
and there's a good reason for it and i doubt you'll like it:

Much of it derives from the ugly fact that cpumasks were designed to be 
word-size-ish and are used as such in hundreds of places in the kernel, 
while with 4K CPUs they become half a _kilobyte_.
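
To make the arithmetic concrete: a cpumask holds one bit per possible CPU, so 4096 CPUs need 4096 / 8 = 512 bytes. Below is a minimal userspace sketch of that calculation. The macro, struct and function names are hypothetical stand-ins chosen for illustration, not the kernel's own definitions.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_BITS_TO_LONGS(n) \
	(((n) + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)

/* One bit per possible CPU, packed into an array of unsigned longs. */
struct mask_small { unsigned long bits[SKETCH_BITS_TO_LONGS(8)]; };
struct mask_4k    { unsigned long bits[SKETCH_BITS_TO_LONGS(4096)]; };

/* Bytes needed for a bitmap of nr_cpus bits, rounded up to whole longs. */
static size_t mask_bytes(size_t nr_cpus)
{
	return SKETCH_BITS_TO_LONGS(nr_cpus) * sizeof(unsigned long);
}

/* With 8-bit bytes, 4096 bits are half a kilobyte. */
static_assert(sizeof(struct mask_4k) == 512, "4K-CPU mask is 512 bytes");
```

With 8 CPUs the mask fits in a single machine word, which is the "word-size-ish" assumption. Every by-value copy or stack variable of the 4096-bit variant costs 512 bytes.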

That causes the basic conceptual friction. That fundamental unease is 
what caused me to split these patches off into a completely separate 
topic, so that they can be NAK-ed individually without blocking other 
subsystem changes. Mike will be able to tell you how many bits were 
rejected and rewritten - it's been one of the most iterated topics.

Unless you know some good way around that basic "0.5K cpumask" problem 
[besides the 'dont try to do it at all then, stupid' solution] Mike's 
painful year-long, multi-release, all-on-lkml effort to bootstrap a 4K 
CPUs kernel, to track down dozens of early boot crashes, to look at 
stack sizes in zillions of functions, to write a ton of patches to 
evolve the APIs to cope with it better (all of this was done out in the 
open on lkml for all to see) looks quite close to what _can_ be 
done.

128/256/512/1024 CPU support (which has been upstream for years and 
built into enterprise distros, etc.) already turned cpumasks into rather 
static objects in practice and their proliferation into hotpaths stopped 
- so maybe we could just turn them into non-stack objects from now on.

( with perhaps some nice wrappers that turn them into on-stack objects
  to not slow down the common case. Mike tried to do something like 
  that. )
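
One way such a wrapper could look, as a rough userspace sketch only: the handle type is a heap pointer when the mask is large and a one-element array on the caller's stack when it is small, so the calling code reads the same in both builds. Every name here (sketch_mask_var, sketch_mask_alloc and so on) and the 64-bit threshold are assumptions made for illustration; this is not claimed to be what Mike's patches do.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#ifndef SKETCH_NR_CPUS
#define SKETCH_NR_CPUS 4096	/* assumed build-time CPU limit */
#endif

#define SKETCH_LONG_BITS (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_MASK_LONGS \
	((SKETCH_NR_CPUS + SKETCH_LONG_BITS - 1) / SKETCH_LONG_BITS)

struct sketch_mask { unsigned long bits[SKETCH_MASK_LONGS]; };

#if SKETCH_NR_CPUS > 64
/* Large build: the handle is a pointer, storage comes from the heap. */
typedef struct sketch_mask *sketch_mask_var;

static bool sketch_mask_alloc(sketch_mask_var *m)
{
	*m = calloc(1, sizeof(**m));
	return *m != NULL;
}

static void sketch_mask_free(sketch_mask_var m)
{
	free(m);
}
#else
/* Small build: the handle is a one-element array on the caller's stack. */
typedef struct sketch_mask sketch_mask_var[1];

static bool sketch_mask_alloc(sketch_mask_var *m)
{
	memset(*m, 0, sizeof(**m));
	return true;	/* nothing to allocate, cannot fail */
}

static void sketch_mask_free(sketch_mask_var m)
{
	(void)m;	/* nothing to release */
}
#endif

/* In both builds a sketch_mask_var decays to struct sketch_mask *. */
static void sketch_mask_set(struct sketch_mask *m, unsigned int cpu)
{
	assert(cpu < SKETCH_NR_CPUS);
	m->bits[cpu / SKETCH_LONG_BITS] |= 1UL << (cpu % SKETCH_LONG_BITS);
}

static bool sketch_mask_test(const struct sketch_mask *m, unsigned int cpu)
{
	assert(cpu < SKETCH_NR_CPUS);
	return (m->bits[cpu / SKETCH_LONG_BITS] >> (cpu % SKETCH_LONG_BITS)) & 1UL;
}
```

A caller declares `sketch_mask_var m;`, checks `sketch_mask_alloc(&m)`, uses `m` wherever a `struct sketch_mask *` is expected, and finishes with `sketch_mask_free(m)`. The price is that the allocation can fail in the large build, so callers need an error path they did not need before.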

Help and more cleanup patches welcome. Mike & co did most of the hard 
work already, latest -git does boot with 4K cpus built into the kernel. 
We can iterate this stuff a _lot_ easier now. Turn on CONFIG_MAXSMP=y on 
x86 and you can boot it on your PC.

> [...]  I stumbled across the cpumask_of_cpu() bug because I happened 
> to want it for stop_machine and read the damned code.  But it led me 
> to the surrounding code, which is pretty questionable.  An 
> arch-specific map, rather than depending on NR_CPUS?  Adding 
> set_cpus_allowed_ptr() instead of changing set_cpus_allowed()? [...]

the set_cpus_allowed_ptr() change too was done due to review feedback, 
to reduce the friction with other trees and to make for a smoother 
migration. Breaking an existing API is far too rude a technique for a 
long-lived topic like this. (it's been going on for nearly a year)
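
The migration pattern described here can be sketched in userspace C: a new pointer-taking entry point does the work, and the old by-value entry point stays as a thin wrapper so existing callers keep compiling while they are converted one at a time. The demo_* names, the task structure and the "reject an empty mask" rule are all hypothetical, chosen only to illustrate the shape of the change.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define DEMO_NR_CPUS 4096	/* assumed CPU limit for the sketch */
#define DEMO_LONG_BITS (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_MASK_LONGS ((DEMO_NR_CPUS + DEMO_LONG_BITS - 1) / DEMO_LONG_BITS)

struct demo_mask { unsigned long bits[DEMO_MASK_LONGS]; };
struct demo_task { struct demo_mask allowed; };

/*
 * New-style entry point: the caller passes a pointer, so the 512-byte
 * mask is not copied just to make the call.
 */
static int demo_set_allowed_ptr(struct demo_task *t,
				const struct demo_mask *new_mask)
{
	size_t i;
	int any = 0;

	assert(t != NULL && new_mask != NULL);
	for (i = 0; i < DEMO_MASK_LONGS; i++)
		any |= (new_mask->bits[i] != 0);
	if (!any)
		return -1;	/* illustrative rule: refuse an empty mask */

	t->allowed = *new_mask;
	return 0;
}

/*
 * Old-style entry point, kept so unconverted callers still build.
 * It still pays for the by-value copy, which is the cost being removed.
 */
static inline int demo_set_allowed(struct demo_task *t,
				   struct demo_mask new_mask)
{
	return demo_set_allowed_ptr(t, &new_mask);
}
```

Converted callers switch to `demo_set_allowed_ptr(&task, &mask)`. Once none are left on the old name, the wrapper can be dropped without a flag-day change across trees.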

> [...] Macros which declare things and may or may not do an 
> allocation/free?  Finally a patch so horrifically ugly that it can't 
> be ignored any more gets all the way to Linus.

[ hey, is that your suggested solution you are talking about? ;-) ]

> Overall, it seems like an attempt to sneak in gradual workarounds for 
> cpumasks on the stack, rather than a coherent plan.  I understand the 
> temptation to avoid an "are we prepared to pay this price for large 
> NR_CPUS?" discussion, but we need it anyway.

sure. From a practical standpoint 4096 CPUs support looks pretty stable 
and functional. I boot a 4K cpus kernel every couple of minutes:

 config-Sun_Jul_27_09_15_47_CEST_2008.good:CONFIG_MAXSMP=y
 config-Sun_Jul_27_09_27_00_CEST_2008.good:CONFIG_MAXSMP=y
 config-Sun_Jul_27_09_29_39_CEST_2008.good:CONFIG_MAXSMP=y
 config-Sun_Jul_27_09_36_41_CEST_2008.good:CONFIG_MAXSMP=y
 config-Sun_Jul_27_09_40_22_CEST_2008.good:CONFIG_MAXSMP=y
 config-Sun_Jul_27_09_59_33_CEST_2008.good:CONFIG_MAXSMP=y

 config-Sun_Jul_27_22_14_47_CEST_2008.good:CONFIG_NR_CPUS=8
 config-Sun_Jul_27_22_20_09_CEST_2008.good:CONFIG_NR_CPUS=8
 config-Sun_Jul_27_22_25_32_CEST_2008.good:CONFIG_MAXSMP=y
 config-Sun_Jul_27_22_25_32_CEST_2008.good:CONFIG_NR_CPUS=4096
 config-Sun_Jul_27_22_36_52_CEST_2008.good:CONFIG_MAXSMP=y
 config-Sun_Jul_27_22_36_52_CEST_2008.good:CONFIG_NR_CPUS=4096
 config-Sun_Jul_27_22_42_19_CEST_2008.good:CONFIG_MAXSMP=y
 config-Sun_Jul_27_22_42_19_CEST_2008.good:CONFIG_NR_CPUS=4096
 config-Sun_Jul_27_22_47_28_CEST_2008.good:CONFIG_NR_CPUS=32
 config-Sun_Jul_27_22_52_47_CEST_2008.good:CONFIG_NR_CPUS=32
 config-Sun_Jul_27_22_57_59_CEST_2008.good:CONFIG_NR_CPUS=32

The last difficult regression was months ago. So this stuff is 
hackable in practice and you can try out the end result if you are 
interested in it.

	Ingo
