From: Shailabh Nagar <nagar@watson.ibm.com>
To: Mark Hahn <hahn@physics.mcmaster.ca>
Cc: linux-kernel@vger.kernel.org
Subject: Re: 2.6.13-rc3-mm1 (ckrm)
Date: Thu, 21 Jul 2005 23:59:09 -0400
Message-ID: <42E06F0D.8020503@watson.ibm.com>
In-Reply-To: <Pine.LNX.4.44.0507171330030.4074-100000@coffee.psychology.mcmaster.ca>
Mark Hahn wrote:
>>I suspect that the main problem is that this patch is not a mainstream
>>kernel feature that will gain multiple uses, but rather provides
>>support for a specific vendor middleware product used by that
>>vendor and a few closely allied vendors. If it were smaller or
>>less intrusive, such as a driver, this would not be a big problem.
>>That's not the case.
>
>
> yes, that's the crux. CKRM is all about resolving conflicting resource
> demands in a multi-user, multi-server, multi-purpose machine. this is a
> huge undertaking, and I'd argue that it's completely inappropriate for
> *most* servers. that is, computers are generally so damn cheap that
> the clear trend is towards dedicating a machine to a specific purpose,
> rather than running eg, shell/MUA/MTA/FS/DB/etc all on a single machine.
The argument about scale-up vs. scale-out is nowhere close to being
resolved. To argue that any support for performance partitioning (which
CKRM provides) serves a lost cause is premature, to say the least.
> this is *directly* in conflict with certain prominent products, such as
> the Altix and various less-prominent Linux-based mainframes. they're all
> about partitioning/virtualization - the big-iron aesthetic of splitting up
> a single machine. note that it's not just about "big", since cluster-based
> approaches can clearly scale far past big-iron, and are in effect statically
> partitioned. yes, buying a hideously expensive single box, and then chopping
> it into little pieces is more than a little bizarre, and is mainly based
> on a couple assumptions:
>
> - that clusters are hard. really, they aren't. they are not
> necessarily higher-maintenance, can be far more robust, usually
> do cost less. just about the only bad thing about clusters is
> that they tend to be somewhat larger in size.
>
> - that partitioning actually makes sense. the appeal is that if
> you have a partition to yourself, you can only hurt yourself.
> but it also follows that burstiness in resource demand cannot be
> overlapped without either constantly tuning the partitions or
> infringing on the guarantee.
"constantly tuning the partitions" is effectively what workload
managers do. But our earlier presentations and papers have made the case
that this is not the only use for performance isolation: simple needs
like isolating one user from another on a general-purpose server also
cannot be met by any existing or proposed Linux kernel mechanism
today.
If partitioning made so little sense and the case for clusters was that
obvious, one would be hard put to explain why server consolidation is
being actively pursued by so many firms, why Solaris is bothering to
come up with Containers, and why Xen and VMware are getting all this
attention. I don't think the concept of partitioning can be dismissed
so easily.
Of course, it must be noted that CKRM provides only performance
isolation, not fault isolation. But there is a need for performance
isolation on its own. Whether Linux chooses to let this need influence
its design is another matter (which I hope we'll also discuss alongside
the implementation issues).
> CKRM is one of those things that could be done to Linux, and will benefit a
> few, but which will almost certainly hurt *most* of the community.
>
> let me say that the CKRM design is actually quite good. the issue is whether
> the extensive hooks it requires can be done (at all) in a way which does
> not disproportionately hurt maintainability or efficiency.
Suggestions on implementing this better are certainly very welcome.
>
> CKRM requires hooks into every resource-allocation decision fastpath:
> - if CKRM is not CONFIG, the only overhead is software maintenance.
> - if CKRM is CONFIG but not loaded, the overhead is a pointer check.
> - if CKRM is CONFIG and loaded, the overhead is a pointer check
> and a nontrivial callback.
>
> but really, this is only for CKRM-enforced limits. CKRM really wants to
> change behavior in a more "weighted" way, not just causing an
> allocation/fork/packet to fail. a really meaningful CKRM needs to
> be tightly integrated into each resource manager - affecting each scheduler
> (process, memory, IO, net). I don't really see how full-on CKRM can be
> compiled out, unless these schedulers are made fully pluggable.
This is a valid point for the CPU, memory and network controllers (I/O
can be made pluggable quite easily). In SuSE, the CKRM CPU controller
can be turned on and off dynamically at runtime. A similar option for
memory and network (incurring only a pointer check) could be explored.
Keeping the overhead close to zero for kernel users not interested in
CKRM is certainly one of our objectives.
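To make the three overhead cases concrete, here is a minimal userspace
sketch of the hook pattern being discussed. It is not code from the CKRM
patches: RC_CONFIG, rc_ops, rc_active, rc_charge, rc_register,
rc_unregister and demo_charge are all hypothetical names chosen for
illustration, and a real kernel version would also need proper
synchronization around registration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a Kconfig option; set to 0 to compile the hook out. */
#define RC_CONFIG 1

/* Hypothetical callback table that a resource controller would provide. */
struct rc_ops {
    /* Return 0 to allow the request, nonzero to refuse it. */
    int (*charge)(int class_id, size_t amount);
};

#if RC_CONFIG
/* NULL until a controller registers: "configured but not loaded". */
static const struct rc_ops *rc_active;

static inline int rc_charge(int class_id, size_t amount)
{
    const struct rc_ops *ops = rc_active;

    if (!ops)                               /* not loaded: one pointer check */
        return 0;
    return ops->charge(class_id, amount);   /* loaded: check plus callback */
}

/* Turn the controller on at runtime. */
static inline void rc_register(const struct rc_ops *ops)
{
    assert(ops != NULL && ops->charge != NULL);
    rc_active = ops;
}

/* Turn the controller off again; the fast path reverts to the bare check. */
static inline void rc_unregister(void)
{
    rc_active = NULL;
}
#else
/* Not configured: the hook is an empty inline the compiler can drop. */
static inline int rc_charge(int class_id, size_t amount)
{
    (void)class_id;
    (void)amount;
    return 0;
}
#endif

/* Example controller policy: refuse any single request above 4096 units. */
static int demo_charge(int class_id, size_t amount)
{
    (void)class_id;
    return amount > 4096;
}

static const struct rc_ops demo_ops = { .charge = demo_charge };
```

With no controller registered, rc_charge() allows every request after a
single NULL test; after rc_register(&demo_ops) the same call site also
pays for the indirect call, and rc_unregister() takes it back to the
cheap path.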
> finally, I observe that pluggable, class-based resource _limits_ could
> probably be done without callbacks and potentially with low overhead.
> but mere limits doesn't meet CKRM's goal of flexible, wide-spread resource
> partitioning within a large, shared machine.
True, but limits alone are not as useful for general workload management.
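The difference between the two approaches can be sketched as follows.
This is only an illustration under assumed names (rc_class,
limit_charge and weighted_share are hypothetical and not part of CKRM):
a limit is a fixed cap checked inline, while a weighted share has to be
derived from the weights of all classes competing for the resource.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-class accounting record. */
struct rc_class {
    unsigned long limit;    /* hard cap on usage */
    unsigned long usage;    /* current usage, kept <= limit */
    unsigned int  weight;   /* relative share, used only by the weighted policy */
};

/*
 * Limit-only policy: a compare and an add, no callback needed.
 * Returns 0 if the charge fits under the cap, -1 if it is refused.
 */
static int limit_charge(struct rc_class *c, unsigned long amount)
{
    assert(c != NULL && c->usage <= c->limit);
    if (amount > c->limit - c->usage)
        return -1;
    c->usage += amount;
    return 0;
}

/*
 * Weighted policy: class i is entitled to capacity * weight_i / sum(weights).
 * The result depends on every class, so it cannot be a purely local check.
 */
static unsigned long weighted_share(const struct rc_class *classes, size_t n,
                                    size_t i, unsigned long capacity)
{
    unsigned long long total = 0;
    size_t k;

    assert(classes != NULL && i < n);
    for (k = 0; k < n; k++)
        total += classes[k].weight;
    if (total == 0)
        return 0;
    return (unsigned long)((unsigned long long)capacity * classes[i].weight / total);
}
```

A limit simply makes a request fail once the cap is reached, whereas the
share computed by weighted_share() changes whenever classes are added or
reweighted, which is why it needs closer integration with each scheduler.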
> regards, mark hahn.