* Re: about modularization
2007-08-03 12:07 Scheduler Situation T. J. Brumfield
2007-08-03 13:00 ` debian developer
@ 2007-08-03 13:19 ` Ingo Molnar
[not found] ` <cdc89fe60708030651s54b5f0e0j938450632cf621c5@mail.gmail.com>
` (2 more replies)
1 sibling, 3 replies; 12+ messages in thread
From: Ingo Molnar @ 2007-08-03 13:19 UTC (permalink / raw)
To: T. J. Brumfield; +Cc: linux-kernel
* T. J. Brumfield <enderandrew@gmail.com> wrote:
> 1 - Can someone please explain why the kernel can be modular in every
> other aspect, including offering a choice of IO schedulers, but not
> kernel schedulers?
that's a fundamental misconception. If you boot into a distro kernel on
a typical PC, about half of the kernel code that the box runs in any
moment will be in modules, half of it is in the "kernel core". For
example, on a random laptop:
$ echo `lsmod | cut -c1-30 | cut -d' ' -f2-` | sed 's/Size //' |
sed 's/ /+/g' | bc
2513784
i.e. 2.5 MB of modules. The core kernel's size:
$ dmesg | grep 'kernel code'
Memory: 2053212k/2087808k available (2185k kernel code, 33240k reserved, 1174k data, 244k init, 1170304k highmem)
2.1 MB of kernel core code. (of course the total body of "possible
drivers" is 10 times larger than that of the core kernel - but the
fundamental 'variety' is not.)
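[ Illustrative sketch, not part of the original mail: the two measurements above can be repeated with somewhat more robust parsing. The helper names sum_module_bytes and kernel_code_kb are made up for this example. It assumes lsmod prints a one-line header followed by "name size used-by" rows, and that the boot log contains a "Memory: ... (NNNNk kernel code, ...)" line like the one shown above; the exact wording of that line varies between kernel versions. ]

```shell
# Sum the Size column (bytes) of lsmod-style output read from stdin,
# skipping the one-line header.
sum_module_bytes() {
    awk 'NR > 1 { total += $2 } END { printf "%d\n", total }'
}

# Extract the "kernel code" figure (in KB) from boot-time "Memory:" lines
# read from stdin. Accepts both a lowercase and an uppercase K suffix.
kernel_code_kb() {
    sed -n 's/.*(\([0-9][0-9]*\)[kK] kernel code.*/\1/p'
}

# On a live system (needs lsmod and a readable boot log):
#   lsmod | sum_module_bytes
#   dmesg | kernel_code_kb
```

Both helpers read from stdin, so they can be tried on saved output as well as on a running machine.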
most of the modules are for stuff where there is a significant physical
difference between the components they support. Drivers for different
pieces of hardware. Filesystem drivers for different on-disk physical
layouts. Network protocol drivers for different on-wire formats. The
sanest technological decision there is clearly to modularize.
And note that often it's not even about choice there: the user's system
has a particular piece of hardware, to which there is usually one
primary driver. The user does not have any real 'choice' over the
modularization here, it's largely a technological act to make the
kernel's footprint smaller.
But the kernel core, which does not depend as much on the physical
properties of the stuff it supports (it depends on the physics of the
machine of course, but those rules are mostly shared between all
machines of that architecture), and is fundamentally influenced by the
syscall API (which is not modular either) and by our OS design
decisions, has much less reason to be modularized.
The core kernel was always non-modular, and it depends on the technical
details whether we want to or _have to_ modularize something so that it
becomes modular to the user too. For example we don't have 'competing',
modular versions of the IPv4 stack. Neither of the VFS. Nor of timers,
futexes, nor of locking code or of the CPU scheduler. But we can switch
out any of those implementations from the core kernel, and did so
numerous times in the past and will do so in the future.
CPU schedulers are as core kernel code as it gets - you cannot even boot
without having a CPU scheduler. IO schedulers, although similar in name,
are quite different beasts from CPU schedulers, and they are somewhere
between the core kernel and drivers. They are not 'physical drivers' (an
IO scheduler can drive any disk), nor are they fully 'core kernel code'
in the sense of a kernel not even being able to boot without them. Also,
disks are physically different from CPUs, in a way which works _against_
the user-modularization of CPU schedulers. (there are also many other
differences which have been pointed out in the past)
In any case, the IO subsystem maintainers decided to modularize IO
schedulers, and that's their decision. One of the authors of the IO
scheduler code said it on lkml recently that while modularization of IO
scheduler had advantages too, in retrospect he wishes they would not
have made IO schedulers modular and now that decision cannot be undone.
So even that much different situation was far from a clear decision, and
some negative effects can be felt today too, in form of having two
primary IO schedulers but not having one IO scheduler that works well in
all cases. For CPU schedulers the circumstances point away from
user-selectable modularization even more strongly.
Ingo
* Re: about modularization
2007-08-03 13:52 ` Fwd: " T. J. Brumfield
@ 2007-08-03 15:02 ` Ingo Molnar
2007-08-03 15:13 ` Ingo Molnar
1 sibling, 0 replies; 12+ messages in thread
From: Ingo Molnar @ 2007-08-03 15:02 UTC (permalink / raw)
To: T. J. Brumfield; +Cc: linux-kernel
* T. J. Brumfield <enderandrew@gmail.com> wrote:
> On 8/3/07, Ingo Molnar <mingo@elte.hu> wrote:
> > snip...
>
> Except that a working prototype of plugsched exists and functions
> exactly as advertised. [...]
a prototype for dynamic syscalls exists too. A prototype for pluggable
network IPv4 stacks exists too. A working implementation existed for
STREAMS too. Existence of a patch still does not make any of them a good
idea for the core kernel.
[ If existence of a patch was the only criterion for upstream merging
then we'd have a much poorer quality Linux kernel today. Odds are that
in that case you would not even know what 'Linux' means, it would
still be an obscure, niche hacker's toy somewhere on the 'net ;-) ]
> [...] I understand that modules can be loaded and unloaded, where as
> other aspects of the core kernel can't just load/unload as the kernel
> is running, but the cpu scheduler could be selected at boot via a
> command line, or as a kconfig option when compiling like the io
> scheduler.
i replied to that in my previous mail:
But the kernel core, which does not depend as much on the physical
properties of the stuff it supports (it depends on the physics of the
machine of course, but those rules are mostly shared between all
machines of that architecture), and is fundamentally influenced by
the syscall API (which is not modular either) and by our OS design
decisions, has much less reason to be modularized.
But to put it in different words:
_certain core kernel code should not be pluggable_
and whether it should or should not be pluggable is an entirely
technical decision, up to the maintainers of that code.
> And while the io team may feel that it would be best to have one
> scheduler that worked well under all circumstances, often this can't
> simply be the case. [...]
you skipped the bit where i pointed out how different CPU scheduling
is from IO scheduling. Plus now you are arguing against the opinion of
an IO maintainer too? :-)
Ingo
* Re: about modularization
2007-08-03 13:52 ` Fwd: " T. J. Brumfield
2007-08-03 15:02 ` Ingo Molnar
@ 2007-08-03 15:13 ` Ingo Molnar
1 sibling, 0 replies; 12+ messages in thread
From: Ingo Molnar @ 2007-08-03 15:13 UTC (permalink / raw)
To: T. J. Brumfield; +Cc: linux-kernel
* T. J. Brumfield <enderandrew@gmail.com> wrote:
> CFS is apparently better in its simplicity, however others are
> reporting that SD still provides benefits for 3D gaming. [...]
even for 3D gaming the opposite of what you say seems to be the case:
http://people.redhat.com/mingo/misc/cfs-sd-ut2004-perf.jpg
http://people.redhat.com/mingo/misc/cfs-vs-sd-wine-quake.jpg
( more such measurements were done and reported, i stopped doing
graphs after the first two. )
but in any case, the main "target" of CFS was not even SD (although it
is certainly desirable to handle any load at least as well as SD) but
the _mainline_ scheduler. SD was the primary selection of a relatively
small subset of existing Linux users, and SD had known (and intentional)
tradeoffs over the mainline scheduler in certain areas. CFS tried to do
zero tradeoffs over the existing scheduler, to not introduce regressions
to the many users who found the existing scheduler just good enough.
That's been a success so far, at the moment there's no open CFS
"interactivity regression" relative to the 2.6.22 mainline scheduler.
Ingo
* Re: about modularization
2007-08-03 13:00 ` debian developer
@ 2007-08-03 15:28 ` Ingo Molnar
0 siblings, 0 replies; 12+ messages in thread
From: Ingo Molnar @ 2007-08-03 15:28 UTC (permalink / raw)
To: debian developer; +Cc: T. J. Brumfield, linux-kernel
* debian developer <debiandev@gmail.com> wrote:
> > 1 - Can someone please explain why the kernel can be modular in
> > every other aspect, including offering a choice of IO schedulers,
> > but not kernel schedulers?
>
> Good question. It has been answered in other threads. Linus doesn't like
> having separate kernel schedulers.
not just Linus: neither i nor Nick Piggin nor a ton of other kernel
hackers agree with that idea, for numerous technical reasons that
have been discussed to death already ;-)
and last but not least: although they might sound pretty
similar, there is quite a bit of difference between "IO schedulers" and
"CPU schedulers", just like there is quite a bit of difference between
"Paris Hilton" and "The Hilton, Paris" =B-)
Ingo
* Re: about modularization
2007-08-03 13:19 ` Ingo Molnar
[not found] ` <cdc89fe60708030651s54b5f0e0j938450632cf621c5@mail.gmail.com>
@ 2007-08-03 17:47 ` Rene Herman
2007-08-03 18:59 ` Ingo Molnar
2007-09-01 22:02 ` Oleg Verych
2 siblings, 1 reply; 12+ messages in thread
From: Rene Herman @ 2007-08-03 17:47 UTC (permalink / raw)
To: Ingo Molnar; +Cc: T. J. Brumfield, linux-kernel
On 08/03/2007 03:19 PM, Ingo Molnar wrote:
> One of the authors of the IO scheduler code said it on lkml recently that
> while modularization of IO scheduler had advantages too, in retrospect he
> wishes they would not have made IO schedulers modular and now that
> decision cannot be undone.
Just as a matter of interest -- why can't it? (a pointer to a list archive
if you have one, or a name so I can look for it myself if you don't, will do
as answer).
Rene.
* Re: about modularization
2007-08-03 17:47 ` Rene Herman
@ 2007-08-03 18:59 ` Ingo Molnar
0 siblings, 0 replies; 12+ messages in thread
From: Ingo Molnar @ 2007-08-03 18:59 UTC (permalink / raw)
To: Rene Herman; +Cc: linux-kernel
* Rene Herman <rene.herman@gmail.com> wrote:
> On 08/03/2007 03:19 PM, Ingo Molnar wrote:
>
> > One of the authors of the IO scheduler code said it on lkml recently
> > that while modularization of IO scheduler had advantages too, in
> > retrospect he wishes they would not have made IO schedulers modular
> > and now that decision cannot be undone.
>
> Just as a matter of interest -- why can't it? (a pointer to a list
> archive if you have one, or a name so I can look for it myself if you
> don't, will do as answer).
some apps depend on AS, some on CFQ, and once you expose something to
users it's _very_ hard to remove it, even if the technical arguments are
strong.
http://lists.openwall.net/linux-kernel/2007/04/16/23
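[ Illustrative sketch, not part of the original mail: the user-visible choice referred to here is exposed per block device through sysfs. The file /sys/block/<dev>/queue/scheduler lists the available IO schedulers and marks the active one with square brackets. The helper name active_io_scheduler and the device name sda are assumptions made for the example. ]

```shell
# Print the active IO scheduler, given the contents of a
# /sys/block/<dev>/queue/scheduler file on stdin.
# The active entry is the one enclosed in square brackets,
# e.g. "noop anticipatory deadline [cfq]" yields "cfq".
active_io_scheduler() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# On a live system (device name is an example):
#   active_io_scheduler < /sys/block/sda/queue/scheduler
# Switching at runtime, as root:
#   echo deadline > /sys/block/sda/queue/scheduler
```

Since applications and admin scripts can read and write this file, any scheduler name that appears in it effectively becomes part of the interface users rely on.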
Ingo
* Re: about modularization
@ 2007-08-06 20:20 Mitchell Erblich
2007-08-06 20:50 ` Rene Herman
0 siblings, 1 reply; 12+ messages in thread
From: Mitchell Erblich @ 2007-08-06 20:20 UTC (permalink / raw)
To: linux-kernel; +Cc: Ingo Molnar, "T. J. Brumfield"
Ingo Molnar and group,
If we just concentrate on CPU schedulers...
IMO, POSIX requirements almost guarantee
the support for modularization. The different
task scheds allow a set of task-class-specific
funcs to be generated. The question is whether
those modular schedulers will ALWAYS consume
kernel footprint space?
If the argument is that modularization is really
targeted at optional hardware and decreases the
kernel footprint, then here is an argument for
supporting only one default scheduler and
modularizing the rest of the sched code.
IMO, ONLY some envs REQUIRE RT
sched and some envs REQUIRE MP
(multi-core and multi-processor) scheduling.
I question whether the core kernel needs
this support.
This additional capability could be removed
from the growing kernel footprint, and
additional schedulers could be kept in the src
code base with increasingly minimal effort if full
modularization support were added.
Thus, a hybrid scheduler approach could be taken
that would default to a single uni-processor scheduler
(ex: CFS) and the other schedulers could be
modularized.
Mitchell Erblich
------------------------
Ingo Molnar wrote:
>
> * T. J. Brumfield <enderandrew@gmail.com> wrote:
>
> > 1 - Can someone please explain why the kernel can be modular in every
> > other aspect, including offering a choice of IO schedulers, but not
> > kernel schedulers?
>
> that's a fundamental misconception. If you boot into a distro kernel on
> a typical PC, about half of the kernel code that the box runs in any
> moment will be in modules, half of it is in the "kernel core". For
> example, on a random laptop:
>
> $ echo `lsmod | cut -c1-30 | cut -d' ' -f2-` | sed 's/Size //' |
> sed 's/ /+/g' | bc
> 2513784
>
> i.e. 2.5 MB of modules. The core kernel's size:
>
> $ dmesg | grep 'kernel code'
> Memory: 2053212k/2087808k available (2185k kernel code, 33240k reserved, 1174k data, 244k init, 1170304k highmem)
>
> 2.1 MB of kernel core code. (of course the total body of "possible
> drivers" is 10 times larger than that of the core kernel - but the
> fundamental 'variety' is not.)
>
> most of the modules are for stuff where there is a significant physical
> difference between the components they support. Drivers for different
> pieces of hardware. Filesystem drivers for different on-disk physical
> layouts. Network protocol drivers for different on-wire formats. The
> sanest technological decision there is clearly to modularize.
>
> And note that often it's not even about choice there: the user's system
> has a particular piece of hardware, to which there is usually one
> primary driver. The user does not have any real 'choice' over the
> modularization here, it's largely a technological act to make the
> kernel's footprint smaller.
>
> But the kernel core, which does not depend as much on the physical
> properties of the stuff it supports (it depends on the physics of the
> machine of course, but those rules are mostly shared between all
> machines of that architecture), and is fundamentally influenced by the
> syscall API (which is not modular either) and by our OS design
> decisions, has much less reason to be modularized.
>
> The core kernel was always non-modular, and it depends on the technical
> details whether we want to or _have to_ modularize something so that it
> becomes modular to the user too. For example we don't have 'competing',
> modular versions of the IPv4 stack. Neither of the VFS. Nor of timers,
> futexes, nor of locking code or of the CPU scheduler. But we can switch
> out any of those implementations from the core kernel, and did so
> numerous times in the past and will do so in the future.
>
> CPU schedulers are as core kernel code as it gets - you cannot even boot
> without having a CPU scheduler. IO schedulers, although similar in name,
> are quite different beasts from CPU schedulers, and they are somewhere
> between the core kernel and drivers. They are not 'physical drivers' (an
> IO scheduler can drive any disk), nor are they fully 'core kernel code'
> in the sense of a kernel not even being able to boot without them. Also,
> disks are physically different from CPUs, in a way which works _against_
> the user-modularization of CPU schedulers. (there are also many other
> differences which have been pointed out in the past)
>
> In any case, the IO subsystem maintainers decided to modularize IO
> schedulers, and that's their decision. One of the authors of the IO
> scheduler code said it on lkml recently that while modularization of IO
> scheduler had advantages too, in retrospect he wishes they would not
> have made IO schedulers modular and now that decision cannot be undone.
> So even that much different situation was far from a clear decision, and
> some negative effects can be felt today too, in form of having two
> primary IO schedulers but not having one IO scheduler that works well in
> all cases. For CPU schedulers the circumstances point away from
> user-selectable modularization even more strongly.
>
> Ingo
* Re: about modularization
2007-08-06 20:20 Mitchell Erblich
@ 2007-08-06 20:50 ` Rene Herman
0 siblings, 0 replies; 12+ messages in thread
From: Rene Herman @ 2007-08-06 20:50 UTC (permalink / raw)
To: Mitchell Erblich; +Cc: linux-kernel, Ingo Molnar, T. J. Brumfield
On 08/06/2007 10:20 PM, Mitchell Erblich wrote:
> Thus, a hybrid scheduler approach could be taken
> that would default to a single uni-processor scheduler
What a brilliant idea in a world where buying a non multi core CPU is
getting to be only somewhat easier than a non SMT one...
Rene.
* Re: about modularization
@ 2007-08-06 21:48 Mitchell Erblich
2007-08-06 23:35 ` Rene Herman
0 siblings, 1 reply; 12+ messages in thread
From: Mitchell Erblich @ 2007-08-06 21:48 UTC (permalink / raw)
To: Rene Herman; +Cc: linux-kernel, Ingo Molnar, "T. J. Brumfield"
Rene,
Of the uni-processor systems that can currently run Linux, I would not
doubt if 99.9999% are uni-cores. It will probably be
3-5 years minimum before multi-core processors make up any
decent percentage of systems.
And I am not suggesting not supporting them. I am only suggesting,
wrt the scheduler, bringing the system up with a default scheduler
and then loading additional functionality based on the hardware/software
requirements of the system.
Thus, the fallout MIGHT be a uni-processor CFS that would not migrate
tasks between multiple CPUs; as additional processors are brought
online, migration could be enabled, and gang-type scheduling or whatever
could then be used.
IMO, if there is a fault (because of heat, etc.) the user would rather
bring up the system in a degraded mode. The same reason applies to...
boot -s..
Mitchell Erblich
------------------------------
Rene Herman wrote:
>
> On 08/06/2007 10:20 PM, Mitchell Erblich wrote:
>
> > Thus, a hybrid scheduler approach could be taken
> > that would default to a single uni-processor scheduler
>
> What a brilliant idea in a world where buying a non multi core CPU is
> getting to be only somewhat easier than a non SMT one...
>
> Rene.
* Re: about modularization
2007-08-06 21:48 about modularization Mitchell Erblich
@ 2007-08-06 23:35 ` Rene Herman
2007-08-06 23:45 ` Rene Herman
0 siblings, 1 reply; 12+ messages in thread
From: Rene Herman @ 2007-08-06 23:35 UTC (permalink / raw)
To: Mitchell Erblich; +Cc: linux-kernel, Ingo Molnar, T. J. Brumfield
On 08/06/2007 11:48 PM, Mitchell Erblich wrote:
> Of the uni-processor systems currently that can run Linux, I would not
> doubt if 99.9999% are uni-cores.
s/can// and I would. s/uni-processor// additionally and I'd assure you it's
untrue. s/uni-cores/non-smt uni-cores/ and I'd do the same.
> It will be probably 3-5 years minimum before the multi-core processors
> will have any decent percentage of systems.
Which is also approximately the same timeframe in which one might consider
currently developed kernels obsolete for deployment by the way...
> And I am not suggesting not supporting them. I am only suggesting is wrt
> the scheduler, bring the system up with a default scheduler, and then
> load additional functionality based on the hardware/software requirements
> of the system.
But why? First, look at the number of #ifdef CONFIG_SMP in the scheduler
code -- the Linux kernel already has separate UP/SMP schedulers selected
through CONFIG_SMP. Embedded can certainly use its own !CONFIG_SMP kernels,
for Linux servers SMP is the norm today and for the desktop/home, SMP
probably already _also_ is the norm today, what with multi-core and HT
(which needs different things than real SMP does, but is also certainly not
UP). And if it isn't, it will be tomorrow and stay that way for the
foreseeable future.
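[ Illustrative sketch, not part of the original mail: one way to do the counting suggested above. The helper name count_smp_guards is made up, and the path kernel/sched.c is an assumption about where the scheduler lives in kernel trees of this era; adjust it for the tree at hand. ]

```shell
# Count preprocessor conditionals in a source file that mention CONFIG_SMP
# (#ifdef, #ifndef, or #if ... CONFIG_SMP). Prints the number of matching lines.
count_smp_guards() {
    grep -c -E '^[[:space:]]*#[[:space:]]*if.*CONFIG_SMP' "$1"
}

# In a kernel source tree (path is an assumption):
#   count_smp_guards kernel/sched.c
```

Each match marks a spot where the UP and SMP builds already compile different scheduler code, which is the compile-time selection described above.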
[ snip ]
> IMO, if there is a fault (because of heat, etc) the user would rather
> bring up the system in a degraded mode. Same reason applies to... boot
> -s..
To what? I don't understand this comment. You are optimizing for the case of
a dead CPU? Why would the user care if he'd be running the most optimal
scheduler for the situation when his box is limping along anyway?
Rene.
* Re: about modularization
2007-08-06 23:35 ` Rene Herman
@ 2007-08-06 23:45 ` Rene Herman
0 siblings, 0 replies; 12+ messages in thread
From: Rene Herman @ 2007-08-06 23:45 UTC (permalink / raw)
To: Mitchell Erblich; +Cc: linux-kernel, Ingo Molnar, T. J. Brumfield
On 08/07/2007 01:35 AM, Rene Herman wrote:
> On 08/06/2007 11:48 PM, Mitchell Erblich wrote:
>
>> Of the uni-processor systems currently that can run Linux, I would not
>> doubt if 99.9999% are uni-cores.
>
> s/can// and I would. s/uni-processor// additionally and I'd assure you
> it's untrue. s/uni-cores/non-smt uni-cores/ and I'd do the same.
(no, that's obviously wrong given embedded volumes, but as stated below,
embedded is fine running non-generic, !CONFIG_SMP kernels).
Rene.
* Re: about modularization
2007-08-03 13:19 ` Ingo Molnar
[not found] ` <cdc89fe60708030651s54b5f0e0j938450632cf621c5@mail.gmail.com>
2007-08-03 17:47 ` Rene Herman
@ 2007-09-01 22:02 ` Oleg Verych
2 siblings, 0 replies; 12+ messages in thread
From: Oleg Verych @ 2007-09-01 22:02 UTC (permalink / raw)
To: Ingo Molnar; +Cc: T. J. Brumfield, linux-kernel
* Date: Fri, 3 Aug 2007 15:19:00 +0200
> If you boot into a distro kernel on
> a typical PC, about half of the kernel code that the box runs in any
> moment will be in modules, half of it is in the "kernel core". For
> example, on a random laptop:
That was your laptop and distro.
> $ echo `lsmod | cut -c1-30 | cut -d' ' -f2-` | sed 's/Size //' |
> sed 's/ /+/g' | bc
> 2513784
>
> i.e. 2.5 MB of modules. The core kernel's size:
>
> $ dmesg | grep 'kernel code'
> Memory: 2053212k/2087808k available (2185k kernel code, 33240k reserved, 1174k data, 244k init, 1170304k highmem)
>
> 2.1 MB of kernel core code. (of course the total body of "possible
> drivers" is 10 times larger than that of the core kernel - but the
> fundamental 'variety' is not.)
Just for reference here's my 2+ years old Asus A4K, kernel is from
Debian Etch:
deen:/tmp# uname -a
Linux deen 2.6.18-4-amd64 #1 SMP Mon Mar 26 11:36:53 CEST 2007 x86_64 x86_64
deen:/tmp# lsmod | (read a; while read a b c; do S=$((b+${S=0})); done; echo $S)
1583684
deen:/tmp# lsmod | grep xfs
xfs 485192 3
deen:/tmp# dmesg | grep kernel\ code
Memory: 506676k/523520k available (1930k kernel code, 16456k reserved, 868k data, 176k init)
Apart from differences in hardware and implied designing/coding skills, the
decision was made
* after one "wrong" response plus illness from Con,
* brave core-duo by Ingo and Tomas, who got a bunch of students to test
the scheduler and reported success to Linus.
I don't know why, after all that variety of things (mostly drivers, but
recent *fd also) there's such a big resistance to anything that's useful
and used by ordinary people. A star sickness, pride? If yes, that's just
ridiculous, but who cares.
____