From: ebiederm@xmission.com (Eric W. Biederman)
To: Linus Torvalds <torvalds@transmeta.com>
Cc: Cort Dougan <cort@fsmlabs.com>,
Benjamin LaHaise <bcrl@redhat.com>,
Rusty Russell <rusty@rustcorp.com.au>,
Robert Love <rml@tech9.net>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: latest linus-2.5 BK broken
Date: 19 Jun 2002 21:57:35 -0600
Message-ID: <m1d6umtxe8.fsf@frodo.biederman.org>
In-Reply-To: <Pine.LNX.4.44.0206191018510.2053-100000@home.transmeta.com>
Linus Torvalds <torvalds@transmeta.com> writes:
> On 19 Jun 2002, Eric W. Biederman wrote:
> >
> > 10-20 years or someone finds a good way to implement a single system
> > image on linux clusters. They are already into the 1000s of nodes,
> > and dual processors per node category. And as things continue they
> > might even grow bigger.
>
> Oh, clusters are a separate issue. I'm absolutely 100% conviced that you
> don't want to have a "single kernel" for a cluster, you want to run
> independent kernels with good communication infrastructure between them
> (ie global filesystem, and try to make the networking look uniform).
>
> Trying to have a single kernel for thousands of nodes is just crazy. Even
> if the system were ccNuma and _could_ do it in theory.
I totally agree; mostly I was playing devil's advocate. The model
actually in my head is one where you have multiple kernels, but they talk
well enough that the applications don't have to care in areas where it
doesn't make a performance difference (there's got to be one of those).
> The NuMA work can probably take single-kernel to maybe 64+ nodes, before
> people just start turning stark raving mad. There's no way you'll have
> single-kernel for thousands of CPU's, and still stay sane and claim any
> reasonable performance under generic loads.
>
> So don't confuse the issue with clusters like that. The "set_affinity()"
> call simply doesn't have anything to do with them. If you want to move
> processes between nodes on such a cluster, you'll probably need user-level
> help, the kernel is unlikely to do it for you.
Agreed.
The compute cluster problem is an interesting one. The big items
I see on the todo list are:
- Scalable fast distributed file system (Lustre looks like a
possibility)
- Sub-application-level checkpointing.

Services like schedulers already exist.
Basically the job of a cluster scheduler gets much easier, and the
scheduler more powerful once it gets the ability to suspend jobs.
Checkpointing buys three things: the ability to preempt jobs, the
ability to migrate processes, and the ability to recover from failed
nodes (assuming the failed hardware didn't corrupt your job's
checkpoint).
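
As a rough illustration of the preempt-and-resume part, here is a
minimal user-space sketch in Python. It is not how any existing cluster
checkpointer works; the function names (save_checkpoint,
load_checkpoint, run_job), the JSON file format, and the toy summing job
are all assumptions made for the example. The job does a bounded amount
of work, saves its state, and a later run (possibly on another node
that can see the same file) picks up where it left off.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state to a temp file, then rename it over the old checkpoint.

    The rename is atomic on POSIX, so a crash mid-write leaves the
    previous checkpoint intact instead of a half-written one.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {"next": 0, "total": 0}


def run_job(path, limit, budget):
    """Hypothetical job: sum 0..limit-1, at most `budget` steps per run.

    Running out of budget stands in for the scheduler preempting the job.
    """
    state = load_checkpoint(path)
    steps = 0
    while state["next"] < limit and steps < budget:
        state["total"] += state["next"]
        state["next"] += 1
        steps += 1
        save_checkpoint(path, state)
    return state
```

Calling run_job twice with the same path resumes rather than restarts,
which is the property a scheduler needs in order to suspend a job. The
sketch sidesteps the hard parts: open files, sockets, and other
interprocess state are not captured at all.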
Once solutions to the cluster problems become well understood I
wouldn't be surprised if some of the supporting services started to
live in the kernel like nfsd. Parts of the distributed filesystem
certainly will.
I suspect process checkpointing and restoring will evolve something
like pthread support did, with some code in user space and some generic
helpers in the kernel, as clean pieces of the job can be broken off.
The only real challenge is how to save and restore interprocess
communication. Things like moving a TCP connection from one node to
another are interesting problems.
But I also suspect most of the hard problems that we need kernel help
with can have uses independent of checkpointing. Already we have web
server farms that spread connections to a single IP across nodes.
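
One simple way to picture that kind of spreading is hashing each
connection's address 4-tuple to a node, so every packet of a given
connection lands on the same machine. The Python sketch below is only
an illustration under that assumption; the pick_node function and the
node names are hypothetical, and real server farms may use quite
different schemes.

```python
import hashlib


def pick_node(src_ip, src_port, dst_ip, dst_port, nodes):
    """Map a connection 4-tuple to one backend node, deterministically."""
    key = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the digest as an integer bucket selector.
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


if __name__ == "__main__":
    farm = ["node-a", "node-b", "node-c"]  # hypothetical node names
    print(pick_node("10.0.0.7", 40312, "192.0.2.1", 80, farm))
```

The mapping is stateless, so any front end can compute it. The catch is
that changing the node list remaps most connections, which is one
reason moving a live connection between nodes remains a hard problem.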
Eric