linux-kernel.vger.kernel.org archive mirror
* Crunch time -- the musical.  (2.5 merge candidate list 1.5)
@ 2002-10-23 21:26 Rob Landley
  2002-10-24 16:17 ` Michael Hohnbaum
  2002-10-25 14:46 ` Kevin Corry
  0 siblings, 2 replies; 17+ messages in thread
From: Rob Landley @ 2002-10-23 21:26 UTC (permalink / raw)
  To: linux-kernel

Kernel hooks is back with new links.  Also new versions of Linux Trace Toolkit
and sys_epoll.  And new stuff from the 2.5 status list, and new stuff is STILL
showing up on linux-kernel.  (Still no 2.5 patch for Alan's 32 bit dev_t,
though.)

Richard J. Moore has stepped up to defend "VM Large Page support",
which has become "hugetlb update".  I don't know if this counts as
a new feature or a bugfix, but it's back...

Due to numerous complaints (okay, one, but technically that's a number)
I tried to reformat a bit to have a slightly less eye-searingly hideous layout.
And reorganized the -mm stuff to be together in one clump.

And so:

----------

Linus returns from the Linux Lunacy Cruise after Sunday, October 27th.
(See "http://www.geekcruises.com/itinerary/ll2_itinerary.html".  He's
off to Jamaica, mon.)

The following features aim to be ready for submission to Linus by Monday,
October 28th, to be considered for inclusion (in 2.5.45) before the feature
freeze on Thursday, October 31 (halloween).  (L minus four days, and
counting...)

Note: if you want to submit a new entry to this list, PLEASE provide a URL
to where the patch can be found, and any descriptive announcement you think
useful (user space tools, etc).  This doesn't have to be a web page devoted
to the patch, if the patch has been posted to linux-kernel a URL to the post
on any linux-kernel archive site is fine.

If you don't know of one, a good site for looking at the threaded archive is:
http://lists.insecure.org/lists/linux-kernel/

A more searchable archive is available at:
http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&group=mlist.linux.kernel

This archive seems less likely to mangle your patch for cut and pasting
(especially if you click "raw download" at the top of the message),
although it's a real pain to actually try to read:
http://marc.theaimsgroup.com/?l=linux-kernel

This list is just pending features trying to get in before feature freeze.
It's primarily for features that need more testing, or might otherwise get
forgotten in the rush.  If you want to know what's already gone in, or what's
being worked on for the next development cycle, check out
"http://kernelnewbies.org/status".

You can get Andrew Morton's MM tree here, including a broken-out patches
directory and a description file:

http://www.zip.com.au/~akpm/linux/patches/2.5/2.5.44

Alan Cox's -ac tree comes from here:

http://www.kernel.org/pub/linux/kernel/people/alan/

Thanks to Rusty Russell and Guillaume Boissiere, whose respective 2.5 merge
candidate lists have been ruthlessly strip-mined in the process of
assembling this.  And to everybody who's emailed stuff.

And now, in no particular order:

============================ Pending features: =============================

1) New kernel configuration system (Roman Zippel)

Announcement:
http://lists.insecure.org/lists/linux-kernel/2002/Oct/6898.html

Code:
http://www.xs4all.nl/~zippel/lc/

Linus has actually looked fairly favorably on this one so far:
http://lists.insecure.org/lists/linux-kernel/2002/Oct/3250.html

----------------------------------------------------------------------------

2) ext2/ext3 extended attributes and access control lists (Ted Tso) (in -mm)

Announce:
http://lists.insecure.org/lists/linux-kernel/2002/Oct/6787.html

Code:
bk://extfs.bkbits.net/extfs-2.5-update
http://thunk.org/tytso/linux/extfs-2.5
(Or just grab it from the -mm tree.)

(Considering that EA/ACL infrastructure is already in, and supported by XFS
and JFS, this one's pretty close to a shoo-in.)

----------------------------------------------------------------------------

3) Page table sharing  (Daniel Phillips, Dave McCracken) (in -mm)

Announce:
http://www.geocrawler.com/mail/msg.php3?msg_id=7855063&list=35

Patch from the -mm tree:
http://www.zipworld.com.au/~akpm/linux/patches/2.5/2.5.44/2.5.44-mm3/broken-out/shpte-ng.patch

Ed Tomlinson seems to have a show-stopper bug for this one
(although he tells me in email he'd like to see it go in anyway):

http://lists.insecure.org/lists/linux-kernel/2002/Oct/7147.html

----------------------------------------------------------------------------

4) Improved Hugetlb support (Richard J. Moore) (in -mm tree)

(Dunno if this is exactly a feature, but giving it the benefit of the doubt...)

Description:
http://www.zipworld.com.au/~akpm/linux/patches/2.5/2.5.44/2.5.44-mm3/description

Patches (everything starting with "htlb" or "hugetlb"):
http://www.zipworld.com.au/~akpm/linux/patches/2.5/2.5.44/2.5.44-mm3/broken-out/

----------------------------------------------------------------------------

5) Generic Nonlinear Mappings (Ingo Molnar) (in -mm)

It's new, very close to deadline, needs testing and discussion.  I'm still a
touch vague on what it actually does, but there's a thread.

Announcement, patch, and start of thread:
http://marc.theaimsgroup.com/?l=linux-kernel&m=103530883511032&w=2

----------------------------------------------------------------------------

6) Linux Trace Toolkit (LTT) (Karim Yaghmour)

Announce:
http://lists.insecure.org/lists/linux-kernel/2002/Oct/7016.html

Patch:
http://opersys.com/ftp/pub/LTT/ExtraPatches/patch-ltt-linux-2.5.44-vanilla-021022-2.2.bz2

User tools:
http://opersys.com/ftp/pub/LTT/TraceToolkit-0.9.6pre2.tgz

----------------------------------------------------------------------------

7) Device mapper for Logical Volume Manager (LVM2)  (LVM2 team)  (in -ac)

Announce:
http://marc.theaimsgroup.com/?l=linux-kernel&m=103536883428443&w=2

Download:
http://people.sistina.com/~thornber/patches/2.5-stable/

Home page:
http://www.sistina.com/products_lvm.htm

----------------------------------------------------------------------------

8) EVMS (Enterprise Volume Management System) (EVMS team)

Home page:
http://sourceforge.net/projects/evms

----------------------------------------------------------------------------

9) Kernel Probes (IBM, contact: Vamsi Krishna S)

Kprobes announcement:
http://marc.theaimsgroup.com/?l=linux-kernel&m=103528410215211&w=2

Base Kprobes Patch:
http://marc.theaimsgroup.com/?l=linux-kernel&m=103528425615302&w=2

KProbes->DProbes patches:
http://marc.theaimsgroup.com/?l=linux-kernel&m=103528454215523&w=2
http://marc.theaimsgroup.com/?l=linux-kernel&m=103528454015520&w=2
http://marc.theaimsgroup.com/?l=linux-kernel&m=103528485415813&w=2

Official IBM download site for most recent versions (gzipped
tarballs):
http://www-124.ibm.com/linux/patches/?project_id=141

See also the DProbes Home Page:
http://oss.software.ibm.com/developerworks/opensource/linux/projects/dprobes

A good explanation of the difference between kprobes, dprobes,
and kernel hooks is here:

http://marc.theaimsgroup.com/?l=linux-kernel&m=103532874900445&w=2

And a clarification: just kprobes is being submitted for
2.5.45, not the whole of dprobes:

http://marc.theaimsgroup.com/?l=linux-kernel&m=103536827928012&w=2

----------------------------------------------------------------------------

10) High resolution timers (George Anzinger, etc.)

Home page:
http://high-res-timers.sourceforge.net/

Patch via evil sourceforge download auto-mirror thing:
http://prdownloads.sourceforge.net/high-res-timers/hrtimers-support-2.5.36-1.0.patch?download

Linus has unresolved concerns with this one, by the way:
http://lists.insecure.org/lists/linux-kernel/2002/Oct/3463.html

Note: The Google posix timer patch forwarded by Jim Houston is being
merged into this patch:
http://lists.insecure.org/lists/linux-kernel/2002/Oct/8068.html

----------------------------------------------------------------------------

11) Linux Kernel Crash Dumps (Matt Robinson, LKCD team)

Announce:
http://marc.theaimsgroup.com/?l=linux-kernel&m=103536576625905&w=2

Code:
http://lkcd.sourceforge.net/download/latest/

----------------------------------------------------------------------------

12) Rewrite of the console layer (James Simmons)

Home page:
http://linuxconsole.sourceforge.net/

Patch (Unknown version, but home page only has random CVS du jour link.):
http://phoenix.infradead.org/~jsimmons/fbdev.diff.gz

Bitkeeper tree:
http://linuxconsole.bkbits.net


----------------------------------------------------------------------------

13) Kexec, launch new linux kernel from Linux (Eric W. Biederman)

Announcement with links:
http://lists.insecure.org/lists/linux-kernel/2002/Oct/6584.html

And this thread is just too brazen not to include:
http://lists.insecure.org/lists/linux-kernel/2002/Oct/7952.html

----------------------------------------------------------------------------

14) USAGI IPv6 (Yoshifuji Hideaki)

README:
ftp://ftp.linux-ipv6.org/pub/usagi/patch/ipsec/README.IPSEC

Patch:
ftp://ftp.linux-ipv6.org/pub/usagi/patch/ipsec/ipsec-2.5.43-ALL-03.patch.gz

----------------------------------------------------------------------------

15) MMU-less processor support (Greg Ungerer)

Announcement with lots of links:
http://lists.insecure.org/lists/linux-kernel/2002/Oct/7027.html

----------------------------------------------------------------------------

16) sys_epoll (I.E. /dev/poll) (Davide Libenzi)

homepage:
http://www.xmailserver.org/linux-patches/nio-improve.html

patch:
http://www.xmailserver.org/linux-patches/sys_epoll-2.5.44-0.7.diff

Linus participated repeatedly in a thread on this one too, expressing
concerns which (hopefully) have been addressed.  See:
http://lists.insecure.org/lists/linux-kernel/2002/Oct/6428.html

----------------------------------------------------------------------------

17) CD Recording/sgio patches (Jens Axboe)

Announce:
http://lists.insecure.org/lists/linux-kernel/2002/Oct/8060.html

Patch:
http://www.kernel.org/pub/linux/kernel/people/axboe/patches/v2.5/2.5.44/sgio-14b.diff.bz2

----------------------------------------------------------------------------

18) In-kernel module loader (Rusty Russell.)

Announce:
http://lists.insecure.org/lists/linux-kernel/2002/Oct/6214.html

Patch:
http://www.kernel.org/pub/linux/kernel/people/rusty/patches/module-x86-18-10-2002.2.5.43.diff.gz

----------------------------------------------------------------------------

19) Unified Boot/Module parameter support (Rusty Russell)

Note: depends on in-kernel module loader.

Huge disorganized heap 'o patches with no explanation:
http://www.kernel.org/pub/linux/kernel/people/rusty/patches/Module/

----------------------------------------------------------------------------

20) Hotplug CPU Removal (Rusty Russell)

Even bigger, more disorganized Heap 'o patches:
http://www.kernel.org/pub/linux/kernel/people/rusty/patches/Hotplug/

----------------------------------------------------------------------------

21) Unlimited groups patch (Tim Hockin.)

Announce:
http://marc.theaimsgroup.com/?l=linux-kernel&m=103524761319825&w=2

Patch set:
http://marc.theaimsgroup.com/?l=linux-kernel&m=103524717119443&w=2
http://marc.theaimsgroup.com/?l=linux-kernel&m=103524761819834&w=2
http://marc.theaimsgroup.com/?l=linux-kernel&m=103524761619831&w=2
http://marc.theaimsgroup.com/?l=linux-kernel&m=103524761519829&w=2

----------------------------------------------------------------------------

22) Initramfs (Al Viro)

Way back when, Al said:
http://www.cs.helsinki.fi/linux/linux-kernel/2001-30/0110.html

I THINK this is the most recent patch:
ftp://ftp.math.psu.edu/pub/viro/N0-initramfs-C40

And Linus recently made happy noises about the idea:
http://lists.insecure.org/lists/linux-kernel/2002/Oct/1110.html

----------------------------------------------------------------------------

23) Kernel Hooks (IBM contact: Vamsi Krishna S.)

Website:
http://www-124.ibm.com/linux/projects/kernelhooks/

Download site:
http://www-124.ibm.com/linux/patches/?patch_id=595

Posted patch:
http://marc.theaimsgroup.com/?l=linux-kernel&m=103364774926440&w=2

----------------------------------------------------------------------------

24) NMI request/release interface (Corey Minyard)

He says:
> Add a request/release mechanism to the kernel (x86 only for now) for NMIs.
...
>I have modified the nmi watchdog to use this interface, and it
>seems to work ok.  Keith Owens is copied to see if he would be
>interested in converting kdb to use this, if it gets put into the kernel.

The latest patch so far:
http://marc.theaimsgroup.com/?l=linux-kernel&m=103540434409894&w=2

----------------------------------------------------------------------------

25) Digital Video Broadcasting Layer (LinuxTV team)

Home page:
http://www.linuxtv.org:81/dvb/

Download:
http://www.linuxtv.org:81/download/dvb/

----------------------------------------------------------------------------

26) NUMA aware scheduler extensions (Erich Focht, Michael Hohnbaum)

Home page:
http://home.arcor.de/efocht/sched/

Patch:
http://home.arcor.de/efocht/sched/Nod20_numa_sched-2.5.31.patch

----------------------------------------------------------------------------

27) DriverFS Topology (Matthew Dobson)

Announcement:
http://marc.theaimsgroup.com/?l=linux-kernel&m=103523702710396&w=2

Patches:
http://marc.theaimsgroup.com/?l=linux-kernel&m=103540707113401&w=2
http://marc.theaimsgroup.com/?l=linux-kernel&m=103540757613962&w=2
http://marc.theaimsgroup.com/?l=linux-kernel&m=103540758013984&w=2
http://marc.theaimsgroup.com/?l=linux-kernel&m=103540757513957&w=2
http://marc.theaimsgroup.com/?l=linux-kernel&m=103540757813966&w=2

----------------------------------------------------------------------------

28) Advanced TCA Disk Hotswap (Steven Dake)

At the last minute, Steven Dake submitted (and if he'd cc'd the list, I could
have linked to this message as the announcement, hint hint...):

> Please add to your 2.5.45 list:
>
> "Advanced TCA Disk Hotswap".
>
> This is a generic feature that provides good hotswap support for SCSI
> and FibreChannel disk devices.  The entire SCSI layer has been properly
> analyzed to provide correct locking and a complete RAMFS filesystem is
> available to control the kernel disk hotswap operations.
>
> Both Alan Cox and Greg KH have looked at the patch for 2.4 and suggested
> if I ported to 2.5 and made some changes (as I have in the latest port)
> this feature would be a good candidate for the 2.5 kernel.
>
> The sourceforge site for the latest patches is:
> https://sourceforge.net/projects/atca-hotswap/
>
> The lkml announcement for this latest port is:
> http://marc.theaimsgroup.com/?l=linux-kernel&m=103541572622729&w=2
>
> A thread discussing Advanced TCA hotswap (of which this patch is one
> part) can be found at:
> http://marc.theaimsgroup.com/?t=103462115700001&r=1&w=2
>
> Thanks!
> -steve


======================== Unresolved issues: =========================

1) hyperthread-aware scheduler
2) connection tracking optimizations.

No URLs to patch.  Anybody want to come out in favor of these
with an announcement and pointer to a version being suggested
for inclusion?

3) IPSEC (David Miller, Alexy)
4) New CryptoAPI (James Morris)

David S. Miller said:

> No URLs, being coded as I type this :-)
>
> Some of the ipv4 infrastructure is in 2.5.44

(Note, this may conflict with Yoshifuji Hideaki's ipv6 ipsec stuff.  If not,
I'd like to collate or clarify the entries.)  USAGI ipv6 is in the first
section and this isn't because I have a URL to an existing patch to
USAGI, and don't for this.  I have no idea how much overlap there is
between these projects, and whether they're considered parts of the
same project or submitted individually...

5) ReiserFS 4

Hans Reiser said:

> We will send Reiser4 out soon, probably around the 27th.
>
> Hans

See also http://www.namesys.com/v4/fast_reiser4.html

Hans and Jens Axboe are arguing about whether or not Reiser4 is a
potential post-freeze addition.  That thread starts here:

http://lists.insecure.org/lists/linux-kernel/2002/Oct/7140.html

6) 32bit dev_t

Alan Cox said:

> The big one missing is 32bit dev_t. That's the killer item we have left.

But did not provide a URL to a patch.  Presumably, it's in his tree and
is capable of being extracted out of it, so I guess it's already in
good hands?  (I dunno, ask him.)

He also mentioned:

> Oh other one I missed - DVB layer - digital tv etc. Pretty much
> essential now for europe, but again its basically all driver layer

But it's not clear this is an item that must go in before feature freeze
or not at all, which is what this list tries to focus on.

Then Dan Kegel pointed out:

> One possible page to quote for 32 bit dev_t:
> http://lwn.net/Articles/11583/

7) Online EXT3 resize support:

A thread over whether or not this is self-contained enough and low
enough impact to go in after the feature freeze starts here:

http://lists.insecure.org/lists/linux-kernel/2002/Oct/7680.html

I mention it just in case it isn't.  (We've had offline EXT3 resize for
a while, this is apparently twiddling a mounted partition without
unplugging it first, or even wearing rubber boots.)

-- 
http://penguicon.sf.net - Terry Pratchett, Eric Raymond, Pete Abrams, Illiad, 
CmdrTaco, liquid nitrogen ice cream, and caffeinated jello.  Well why not?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Crunch time -- the musical.  (2.5 merge candidate list 1.5)
  2002-10-23 21:26 Rob Landley
@ 2002-10-24 16:17 ` Michael Hohnbaum
       [not found]   ` <200210240750.09751.landley@trommello.org>
  2002-10-25 14:46 ` Kevin Corry
  1 sibling, 1 reply; 17+ messages in thread
From: Michael Hohnbaum @ 2002-10-24 16:17 UTC (permalink / raw)
  To: landley; +Cc: linux-kernel, Erich Focht

On Wed, 2002-10-23 at 14:26, Rob Landley wrote:

> 26) NUMA aware scheduler extensions (Erich Focht, Michael Hohnbaum)
> 
> Home page:
> http://home.arcor.de/efocht/sched/
> 
> Patch:
> http://home.arcor.de/efocht/sched/Nod20_numa_sched-2.5.31.patch

The simple NUMA scheduler patch, which is ready for inclusion is a 
separate project from Erich's NUMA scheduler extensions.  Information
on the simple NUMA scheduler is contained in this lkml posting:

http://marc.theaimsgroup.com/?l=linux-kernel&m=103351680614980&w=2
http://marc.theaimsgroup.com/?l=linux-kernel&m=103480772901235&w=2

The most recent version has been split into two patches for 2.5.44: 

http://marc.theaimsgroup.com/?l=linux-kernel&m=103539626130709&w=2
http://marc.theaimsgroup.com/?l=linux-kernel&m=103540481010560&w=2

-- 

Michael Hohnbaum                      503-578-5486
hohnbaum@us.ibm.com                   T/L 775-5486



* Re: Crunch time -- the musical.  (2.5 merge candidate list 1.5)
       [not found]   ` <200210240750.09751.landley@trommello.org>
@ 2002-10-24 19:01     ` Michael Hohnbaum
  2002-10-24 21:51       ` Erich Focht
  0 siblings, 1 reply; 17+ messages in thread
From: Michael Hohnbaum @ 2002-10-24 19:01 UTC (permalink / raw)
  To: landley; +Cc: linux-kernel, Erich Focht

On Thu, 2002-10-24 at 05:50, Rob Landley wrote:
> On Thursday 24 October 2002 11:17, Michael Hohnbaum wrote:
> > On Wed, 2002-10-23 at 14:26, Rob Landley wrote:
> > > 26) NUMA aware scheduler extensions (Erich Focht, Michael Hohnbaum)
> > >
> > > Home page:
> > > http://home.arcor.de/efocht/sched/
> > >
> > > Patch:
> > > http://home.arcor.de/efocht/sched/Nod20_numa_sched-2.5.31.patch
> >
> > The simple NUMA scheduler patch, which is ready for inclusion is a
> > separate project from Erich's NUMA scheduler extensions.  Information
> > on the simple NUMA scheduler is contained in this lkml posting:
> >
> > http://marc.theaimsgroup.com/?l=linux-kernel&m=103351680614980&w=2
> > http://marc.theaimsgroup.com/?l=linux-kernel&m=103480772901235&w=2
> >
> > The most recent version has been split into two patches for 2.5.44:
> >
> > http://marc.theaimsgroup.com/?l=linux-kernel&m=103539626130709&w=2
> > http://marc.theaimsgroup.com/?l=linux-kernel&m=103540481010560&w=2
> 
> Any relation to http://lse.sourceforge.net/numa/ which the 2.5 status list 
> says is "Alpha" state, two steps down from "Ready"?
> 
> Rob

Yes and no.  At one point I was working with Erich moving his NUMA 
scheduler to 2.5 and testing it on our NUMA hardware.  However, it
was not looking like his NUMA scheduler was going to be ready for 
2.5, so I went off on a separate effort to produce a much smaller,
simpler patch to provide rudimentary NUMA support within the scheduler.
This patch does not have all the functionality of Erich's, but does
provide definite performance improvements on NUMA machines with no
degradation on non-NUMA SMP.  It is much smaller and less intrusive,
and has been tested on multiple NUMA architectures (including by 
Erich on the NEC IA64 NUMA box).

The 2.5 status list has not been updated to reflect this separate 
effort, and I believe incorrectly lists this entry as "ready".  There
really are now two NUMA scheduler projects:

* Simple NUMA scheduler (Michael Hohnbaum)  - ready for inclusion
* Node affine NUMA scheduler (Erich Focht)  - Alpha (Beta?)

-- 

Michael Hohnbaum                      503-578-5486
hohnbaum@us.ibm.com                   T/L 775-5486



* Re: Crunch time -- the musical.  (2.5 merge candidate list 1.5)
  2002-10-24 19:01     ` Michael Hohnbaum
@ 2002-10-24 21:51       ` Erich Focht
  2002-10-24 22:38         ` Martin J. Bligh
  0 siblings, 1 reply; 17+ messages in thread
From: Erich Focht @ 2002-10-24 21:51 UTC (permalink / raw)
  To: Michael Hohnbaum, landley; +Cc: linux-kernel

Hi Rob and Michael,

I need to correct some inaccuracies and, of course, advertise my approach
:-)

On Thursday 24 October 2002 21:01, Michael Hohnbaum wrote:
> > > > 26) NUMA aware scheduler extensions (Erich Focht, Michael Hohnbaum)
> > > >
> > > > Home page:
> > > > http://home.arcor.de/efocht/sched/
> > > >
> > > > Patch:
> > > > http://home.arcor.de/efocht/sched/Nod20_numa_sched-2.5.31.patch

These are old. I posted the newer patches (split up in order to clearly
separate the functionality additions) to LKML:

http://marc.theaimsgroup.com/?l=linux-kernel&m=103459387719030&w=2
http://marc.theaimsgroup.com/?l=linux-kernel&m=103459387519026&w=2
http://marc.theaimsgroup.com/?l=linux-kernel&m=103459441119407&w=2
http://marc.theaimsgroup.com/?l=linux-kernel&m=103459441319411&w=2
http://marc.theaimsgroup.com/?l=linux-kernel&m=103459441419416&w=2

They should work for any NUMA platform by just adding a call to
build_pools() in smp_cpus_done(). They work for non-NUMA platforms
the same way as the O(1) scheduler (though the code looks different).
A test overview is in: http://lwn.net/Articles/12546/
This suggests that taking only patches 01+02 already gives you a VERY
good NUMA scheduler. They deliver the infrastructure for later
developments (patches 03+05) which we can further research and tune or
give only to special customers.
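
As a rough illustration of what "building pools" could mean here, the
sketch below groups CPUs by NUMA node from a CPU-to-node map.  It is a toy
model in Python, not code from the patches: the function name
build_node_pools and the dictionary layout are assumptions made for the
example, and the real build_pools() may organize its data quite differently.

```python
from collections import defaultdict


def build_node_pools(cpu_to_node):
    """Group CPU ids into per-node pools (hypothetical helper).

    cpu_to_node maps a CPU id to the id of the NUMA node it sits on.
    Returns a dict mapping each node id to a sorted list of its CPUs.
    """
    pools = defaultdict(list)
    # Walk CPUs in id order so each pool's list comes out sorted.
    for cpu, node in sorted(cpu_to_node.items()):
        pools[node].append(cpu)
    return dict(pools)


# Example topology (made up): four CPUs spread over two nodes.
example_pools = build_node_pools({0: 0, 1: 0, 2: 1, 3: 1})
```

A node-aware balancer could then look for work inside a CPU's own pool
before considering CPUs on other nodes.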

> The 2.5 status list has not been updated to reflect this separate
> effort, and I believe incorrectly lists this entry as "ready".  There
> really are now two NUMA scheduler projects:
>
> * Simple NUMA scheduler (Michael Hohnbaum)  - ready for inclusion
> * Node affine NUMA scheduler (Erich Focht)  - Alpha (Beta?)
This is not correct. We have had the node affine scheduler in production
for 6 months on top of 2.4 kernels and are happy with it. It is a lot
more than alpha or beta, it already makes customers happy.

The situation is really funny: Everybody seems to agree that the design
ideas in my NUMA approach are sane and exactly what we want to have on
a NUMA platform in the end. But instead of concentrating on tuning the
parameters for the many different NUMA platforms and reshaping this
approach to make it acceptable, IBM concentrates on a very much stripped
down approach. I understand that this project has been started to make
the inclusion of some NUMA scheduler easier. But in the end, the simple
NUMA scheduler will have to develop into a much more complex thing and in
some form or another replicate the design ideas of my node affine
scheduler. On machines with poor NUMA ratio like NUMAQ the simple NUMA
change helps. For machines with good NUMA ratio like NEC Azusa, NEC TX7
you need a little bit more. AMD Hammer-SMP and ppc64 are certainly in
the same class as the Azusa/TX7. And as soon as Hammer SMP systems are
around, the pressure for a full featured NUMA scheduler will be much
higher.

A NUMA scheduler extension of the 2.6 kernel fits very well with the
development effort done for better scalability and enterprise level
fitness of Linux. Check http://lwn.net/Articles/12546/ to see that it
makes a difference to have more than O(1) on NUMA machines! I'd
definitely prefer the inclusion of my 01+02 patches (I'd have to
maintain less code to keep the customers happy); on the other hand,
including Michael's patch would be better than not adding NUMA
scheduler support at all.

Best regards,
Erich




* Re: Crunch time -- the musical.  (2.5 merge candidate list 1.5)
  2002-10-24 21:51       ` Erich Focht
@ 2002-10-24 22:38         ` Martin J. Bligh
  2002-10-25  8:15           ` Erich Focht
  0 siblings, 1 reply; 17+ messages in thread
From: Martin J. Bligh @ 2002-10-24 22:38 UTC (permalink / raw)
  To: Erich Focht, Michael Hohnbaum, landley; +Cc: linux-kernel

> The situation is really funny: Everybody seems to agree that the design
> ideas in my NUMA approach are sane and exactly what we want to have on
> a NUMA platform in the end. But instead of concentrating on tuning the
> parameters for the many different NUMA platforms and reshaping this
> approach to make it acceptable, IBM concentrates on a very much stripped
> down approach.

From my point of view, the reason for focussing on this was that 
your scheduler degraded the performance on my machine, rather than
boosting it. Half of that was the more complex stuff you added on
top ... it's a lot easier to start with something simple that works 
and build on it, than fix something that's complex and doesn't work
well.

I still haven't been able to get your scheduler to boot for about 
the last month without crashing the system. Andrew says he has it 
booting somehow on 2.5.44-mm4, so I'll steal his kernel tomorrow and
see how it looks. If the numbers look good for doing boring things
like kernel compile, SDET, etc, I'm happy.

M.



* Re: Crunch time -- the musical.  (2.5 merge candidate list 1.5)
@ 2002-10-25  0:25 Jim Houston
  2002-10-25 17:58 ` george anzinger
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Jim Houston @ 2002-10-25  0:25 UTC (permalink / raw)
  To: landley, linux-kernel, george

Hi Rob,

The Posix timers entry in your list is confused.  I don't know how
my patch got the name Google.

I think Dan Kegel misunderstood George's answer to my previous
announcement.  George might be picking up some of my changes, but there
will still be two patches for Linus to choose from.  You included the URL
to George's answer which quoted my patch, rather than the URL I sent you.

Here is the URL for an archived copy of my latest patch:
     Jim Houston's  [PATCH] alternate Posix timer patch3
     http://marc.theaimsgroup.com/?l=linux-kernel&m=103549000027416&w=2

I would be happy to see either version go into 2.5.  

The URLs for George's patches are incomplete.  I believe this is the
most recent (it's from Oct 18).  The Sourceforge.net reference has the
user space library and test programs, but I did not see 2.5 kernel
patches.

  [PATCH ] POSIX clocks & timers take 3 (NOT HIGH RES)
     http://marc.theaimsgroup.com/?l=linux-kernel&m=103489669622397&w=2

Thanks
Jim Houston - Concurrent Computer Corp.


* Re: Crunch time -- the musical.  (2.5 merge candidate list 1.5)
  2002-10-24 22:38         ` Martin J. Bligh
@ 2002-10-25  8:15           ` Erich Focht
  2002-10-25 23:26             ` Martin J. Bligh
  2002-10-26 18:58             ` Martin J. Bligh
  0 siblings, 2 replies; 17+ messages in thread
From: Erich Focht @ 2002-10-25  8:15 UTC (permalink / raw)
  To: Martin J. Bligh, Michael Hohnbaum, landley; +Cc: linux-kernel

On Friday 25 October 2002 00:38, Martin J. Bligh wrote:
> > The situation is really funny: Everybody seems to agree that the design
> > ideas in my NUMA approach are sane and exactly what we want to have on
> > a NUMA platform in the end. But instead of concentrating on tuning the
> > parameters for the many different NUMA platforms and reshaping this
> > approach to make it acceptable, IBM concentrates on a very much stripped
> > down approach.
>
> From my point of view, the reason for focussing on this was that
> your scheduler degraded the performance on my machine, rather than
> boosting it. Half of that was the more complex stuff you added on
> top ... it's a lot easier to start with something simple that works
> and build on it, than fix something that's complex and doesn't work
> well.

You're talking about one of the first 2.5 versions of the patch. It
changed a lot since then, thanks to your feedback, too.

> I still haven't been able to get your scheduler to boot for about
> the last month without crashing the system. Andrew says he has it
> booting somehow on 2.5.44-mm4, so I'll steal his kernel tomorrow and
> see how it looks. If the numbers look good for doing boring things
> like kernel compile, SDET, etc, I'm happy.

I thought this problem was well understood! For some reason independent of
my patch you have to boot your machines with the "notsc" option. This
leaves the cache_decay_ticks variable initialized to zero which my patch
doesn't like. I'm trying to deal with this inside the patch but there is
still a small window when the variable is zero. In my opinion this needs
to be fixed somewhere in arch/i386/kernel/smpboot.c. Booting a machine
with cache_decay_ticks=0 is pure nonsense, as it switches off cache
affinity which you absolutely need! So even if "notsc" is a legal option,
it should be fixed such that it doesn't leave your machine without cache
affinity. That would anyway give you distorted behavior of the O(1)
scheduler.
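
To see why a zero value matters, here is a minimal sketch, assuming the
O(1) scheduler's load balancer only migrates a task once it has been off
the CPU for more than cache_decay_ticks ticks.  The function below is a
simplified Python model of that test, not the kernel code; its name and
parameters are made up for the example.

```python
def can_migrate_task(now, sleep_timestamp, cache_decay_ticks,
                     running=False, allowed_on_target=True):
    """Simplified model of a cache-affinity check before migration.

    now               -- current tick count (stands in for jiffies)
    sleep_timestamp   -- tick at which the task last stopped running
    cache_decay_ticks -- ticks after which the task's cache footprint
                         is assumed to be gone
    """
    # The task is "cache-cold" once it has been idle longer than the decay time.
    cache_cold = (now - sleep_timestamp) > cache_decay_ticks
    return bool(cache_cold and not running and allowed_on_target)


# A task that stopped running one tick ago:
hot_with_decay = can_migrate_task(now=1001, sleep_timestamp=1000,
                                  cache_decay_ticks=10)
hot_without_decay = can_migrate_task(now=1001, sleep_timestamp=1000,
                                     cache_decay_ticks=0)
```

With cache_decay_ticks at 0, a task that stopped running one tick ago
already counts as cache-cold, so nothing keeps the balancer from moving it.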

Erich




* Re: Crunch time -- the musical. (2.5 merge candidate list 1.5)
  2002-10-23 21:26 Rob Landley
  2002-10-24 16:17 ` Michael Hohnbaum
@ 2002-10-25 14:46 ` Kevin Corry
  1 sibling, 0 replies; 17+ messages in thread
From: Kevin Corry @ 2002-10-25 14:46 UTC (permalink / raw)
  To: Rob Landley; +Cc: linux-kernel

On Wednesday 23 October 2002 16:26, Rob Landley wrote:
> Due to numerous complaints (okay, one, but technically that's a number)
> tried to reformat a bit to have a slightly less eye-searingly hideous
> layout. And reorganized the -mm stuff to be together in one clump.
>
> And so:

> ......

> ---------------------------------------------------------------------------
>
> 8) EVMS (Enterprise Volume Management System) (EVMS team)
>
> Home page:
> http://sourceforge.net/projects/evms
>
> ---------------------------------------------------------------------------

Rob,

Can you please add the following links for the EVMS project:

Home page:
http://evms.sourceforge.net

Download:
http://evms.sourceforge.net/patches/

Some related discussions:
http://marc.theaimsgroup.com/?t=103359686900003&r=1&w=2
http://marc.theaimsgroup.com/?t=103439913000001&r=1&w=2
http://marc.theaimsgroup.com/?w=2&r=1&s=%5Bpatch%5D+evms+core&q=t

Thanks!
-- 
Kevin Corry
corryk@us.ibm.com
http://evms.sourceforge.net/


* Re: Crunch time -- the musical.  (2.5 merge candidate list 1.5)
  2002-10-25  0:25 Crunch time -- the musical. (2.5 merge candidate list 1.5) Jim Houston
@ 2002-10-25 17:58 ` george anzinger
  2002-10-25 18:53 ` highres timers question Rob Landley
  2002-10-25 19:58 ` Crunch time -- the musical. (2.5 merge candidate list 1.5) Rob Landley
  2 siblings, 0 replies; 17+ messages in thread
From: george anzinger @ 2002-10-25 17:58 UTC (permalink / raw)
  To: jim.houston; +Cc: landley, linux-kernel

Jim Houston wrote:
> 
> Hi Rob,
> 
> The Posix timers entry in your list is confused.  I don't know how
> my patch got the name Google.
> 
> I think Dan Kegel misunderstood George's answer to my previous
> announcement.  George might be picking up some of my changes, but there
> will still be two patches for Linus to choose from.  You included the URL to
> George's answer which quoted my patch, rather than the URL I sent you.
> 
> Here is the URL for an archived copy of my latest patch:
>      Jim Houston's  [PATCH] alternate Posix timer patch3
>      http://marc.theaimsgroup.com/?l=linux-kernel&m=103549000027416&w=2
> 
> I would be happy to see either version go into 2.5.
> 
> The URLs for George's patches are incomplete.  I believe this is the
> most recent (it's from Oct 18).  The Sourceforge.net reference has the
> user space library and test programs, but I did not see 2.5 kernel
> patches.
> 
>   [PATCH ] POSIX clocks & timers take 3 (NOT HIGH RES)
>      http://marc.theaimsgroup.com/?l=linux-kernel&m=103489669622397&w=2

I would be very careful picking up patches from the
digests.  Some of them have message size limits that cause
truncated patches.  I know mine was truncated on the marc
digest.  I will post the latest HRT patches on the project
sourceforge site.

-- 
George Anzinger   george@mvista.com
High-res-timers: 
http://sourceforge.net/projects/high-res-timers/
Preemption patch:
http://www.kernel.org/pub/linux/kernel/people/rml


* highres timers question...
  2002-10-25  0:25 Crunch time -- the musical. (2.5 merge candidate list 1.5) Jim Houston
  2002-10-25 17:58 ` george anzinger
@ 2002-10-25 18:53 ` Rob Landley
  2002-10-26  9:07   ` george anzinger
  2002-10-25 19:58 ` Crunch time -- the musical. (2.5 merge candidate list 1.5) Rob Landley
  2 siblings, 1 reply; 17+ messages in thread
From: Rob Landley @ 2002-10-25 18:53 UTC (permalink / raw)
  To: jim.houston, linux-kernel, george

I'm guessing that of the patches here:

http://sourceforge.net/projects/high-res-timers

The -posix one adds posix support on top of the base high-res timers patch?

(Did I guess right?)

Rob

-- 
http://penguicon.sf.net - Terry Pratchett, Eric Raymond, Pete Abrams, Illiad, 
CmdrTaco, liquid nitrogen ice cream, and caffeinated jello.  Well why not?


* Re: Crunch time -- the musical.  (2.5 merge candidate list 1.5)
  2002-10-25  0:25 Crunch time -- the musical. (2.5 merge candidate list 1.5) Jim Houston
  2002-10-25 17:58 ` george anzinger
  2002-10-25 18:53 ` highres timers question Rob Landley
@ 2002-10-25 19:58 ` Rob Landley
  2002-10-26  8:45   ` george anzinger
  2 siblings, 1 reply; 17+ messages in thread
From: Rob Landley @ 2002-10-25 19:58 UTC (permalink / raw)
  To: jim.houston, linux-kernel, george

On Thursday 24 October 2002 19:25, Jim Houston wrote:
> Hi Rob,
>
> The Posix timers entry in your list is confused.  I don't know how
> my patch got the name Google.

Sorry, misread "George's version" as "Google's version" at 5 am one morning.
Lot of late nights recently... :)

> I think Dan Kegel misunderstood George's answer to my previous
> announcement.  George might be picking up some of my changes, but there
> will still be two patches for Linus to choose from.  You included the URL to 
> George's answer which quoted my patch, rather than the URL I sent you.

Had it in, then took it out.  I'm trying to collate down the list wherever I 
can.

> Here is the URL for an archived copy of my latest patch:
>      Jim Houston's  [PATCH] alternate Posix timer patch3
>      http://marc.theaimsgroup.com/?l=linux-kernel&m=103549000027416&w=2

It's back now.

> I would be happy to see either version go into 2.5.

So what exactly is the difference between them?

> The URLs for George's patches are incomplete.  I believe this is the
> most recent (it's from Oct 18).  The Sourceforge.net reference has the
> user space library and test programs, but I did not see 2.5 kernel
> patches.
>
>   [PATCH ] POSIX clocks & timers take 3 (NOT HIGH RES)
>      http://marc.theaimsgroup.com/?l=linux-kernel&m=103489669622397&w=2

He's up to version 4 now.

> Thanks
> Jim Houston - Concurrent Computer Corp.

Rob

-- 
http://penguicon.sf.net - Terry Pratchett, Eric Raymond, Pete Abrams, Illiad, 
CmdrTaco, liquid nitrogen ice cream, and caffeinated jello.  Well why not?


* Re: Crunch time -- the musical.  (2.5 merge candidate list 1.5)
  2002-10-25  8:15           ` Erich Focht
@ 2002-10-25 23:26             ` Martin J. Bligh
  2002-10-25 23:45               ` Martin J. Bligh
  2002-10-26  0:02               ` Martin J. Bligh
  2002-10-26 18:58             ` Martin J. Bligh
  1 sibling, 2 replies; 17+ messages in thread
From: Martin J. Bligh @ 2002-10-25 23:26 UTC (permalink / raw)
  To: Erich Focht, Michael Hohnbaum; +Cc: linux-kernel

> You're talking about one of the first 2.5 versions of the patch. It
> changed a lot since then, thanks to your feedback, too.

Right. But I've been struggling to boot anything later than that ;-)
 
> I thought this problem is well understood! For some reasons independent of
> my patch you have to boot your machines with the "notsc" option. This
> leaves the cache_decay_ticks variable initialized to zero which my patch
> doesn't like. I'm trying to deal with this inside the patch but there is
> still a small window when the variable is zero. In my opinion this needs
> to be fixed somewhere in arch/i386/kernel/smpboot.c. Booting a machine
> with cache_decay_ticks=0 is pure nonsense, as it switches off cache
> affinity which you absolutely need! So even if "notsc" is a legal option,
> it should be fixed such that it doesn't leave your machine without cache
> affinity. That would anyway give you a falsified behavior of the O(1)
> scheduler.

OK, well we seem to have it working on one machine, but not on another.
Those should be identical, so I'm playing around with the differences. The
first major thing I noticed is that the working box has gcc 3.1, and the
non-working one has gcc 2.95.4 (debian woody). I suspect it's a subtle
timing thing, or something equally horrible.

Changing the non-working box to gcc 3.1 instead (which I *really* don't
want to do long term unless we prove there's a bug in 2.95 ... gcc 3.x
is disgustingly slow) got it a little further, but then it hit the
following oops ... does this provide any clues?

CPU 7 IS NOW UP!
Starting migration thread for cpu 7
Bringing up 8
CPU 8 IS NOW UP!
Starting migration thread for cpu 8
divide error: 0000
 
CPU:    4
EIP:    0060:[<c011ac38>]    Not tainted
EFLAGS: 00010002
EIP is at task_to_steal+0x118/0x260
eax: 00000001   ebx: f01c5040   ecx: 00000000   edx: 00000000
esi: 00000063   edi: f01c5020   ebp: f0197ee8   esp: f0197eac
ds: 0068   es: 0068   ss: 0068
Process swapper (pid: 0, threadinfo=f0196000 task=f01bf060)
Stack: 00000000 f01b4120 00000000 c02ec940 f0197ed4 00000004 00000000 c02ecd3c 
       c02ec93c 00000000 00000001 0000007d c02ec4a0 00000001 00000004 f0197f1c 
       c011829c c02ec4a0 00000004 00000004 00000001 00000000 c39376c0 00000000 
Call Trace:
 [<c011829c>] load_balance+0x8c/0x140
 [<c0118588>] scheduler_tick+0x238/0x360
 [<c0123347>] tasklet_hi_action+0x77/0xc0
 [<c0105420>] default_idle+0x0/0x50
 [<c0126bd5>] update_process_times+0x45/0x60
 [<c0113faa>] smp_apic_timer_interrupt+0x11a/0x120
 [<c0105420>] default_idle+0x0/0x50
 [<c010815e>] apic_timer_interrupt+0x1a/0x20
 [<c0105420>] default_idle+0x0/0x50
 [<c0105420>] default_idle+0x0/0x50
 [<c010544a>] default_idle+0x2a/0x50
 [<c01054ea>] cpu_idle+0x3a/0x50
 [<c011db20>] printk+0x140/0x180

Code: f7 75 cc 8b 55 c8 83 f8 64 0f 4c f0 39 4d ec 8d 46 64 0f 44 

This is 2.5.44-mm4 + your patches 1,2,3,5, I think.

M.



* Re: Crunch time -- the musical.  (2.5 merge candidate list 1.5)
  2002-10-25 23:26             ` Martin J. Bligh
@ 2002-10-25 23:45               ` Martin J. Bligh
  2002-10-26  0:02               ` Martin J. Bligh
  1 sibling, 0 replies; 17+ messages in thread
From: Martin J. Bligh @ 2002-10-25 23:45 UTC (permalink / raw)
  To: Erich Focht, Michael Hohnbaum; +Cc: linux-kernel

> divide error: 0000
>  
> CPU:    4
> EIP:    0060:[<c011ac38>]    Not tainted
> EFLAGS: 00010002
> EIP is at task_to_steal+0x118/0x260
> eax: 00000001   ebx: f01c5040   ecx: 00000000   edx: 00000000
> esi: 00000063   edi: f01c5020   ebp: f0197ee8   esp: f0197eac
> ds: 0068   es: 0068   ss: 0068
> Process swapper (pid: 0, threadinfo=f0196000 task=f01bf060)
> Stack: 00000000 f01b4120 00000000 c02ec940 f0197ed4 00000004 00000000 c02ecd3c 
>        c02ec93c 00000000 00000001 0000007d c02ec4a0 00000001 00000004 f0197f1c 
>        c011829c c02ec4a0 00000004 00000004 00000001 00000000 c39376c0 00000000 
> Call Trace:
>  [<c011829c>] load_balance+0x8c/0x140
>  [<c0118588>] scheduler_tick+0x238/0x360
>  [<c0123347>] tasklet_hi_action+0x77/0xc0
>  [<c0105420>] default_idle+0x0/0x50
>  [<c0126bd5>] update_process_times+0x45/0x60
>  [<c0113faa>] smp_apic_timer_interrupt+0x11a/0x120
>  [<c0105420>] default_idle+0x0/0x50
>  [<c010815e>] apic_timer_interrupt+0x1a/0x20
>  [<c0105420>] default_idle+0x0/0x50
>  [<c0105420>] default_idle+0x0/0x50
>  [<c010544a>] default_idle+0x2a/0x50
>  [<c01054ea>] cpu_idle+0x3a/0x50
>  [<c011db20>] printk+0x140/0x180
> 
> Code: f7 75 cc 8b 55 c8 83 f8 64 0f 4c f0 39 4d ec 8d 46 64 0f 44 


Dump of assembler code for function task_to_steal:
0xc011ab20 <task_to_steal>:     push   %ebp
0xc011ab21 <task_to_steal+1>:   mov    %esp,%ebp
0xc011ab23 <task_to_steal+3>:   push   %edi
0xc011ab24 <task_to_steal+4>:   push   %esi
0xc011ab25 <task_to_steal+5>:   push   %ebx
0xc011ab26 <task_to_steal+6>:   sub    $0x30,%esp
0xc011ab29 <task_to_steal+9>:   movl   $0x0,0xffffffdc(%ebp)
0xc011ab30 <task_to_steal+16>:  mov    0xc(%ebp),%eax
0xc011ab33 <task_to_steal+19>:  movl   $0x0,0xffffffe8(%ebp)
0xc011ab3a <task_to_steal+26>:  mov    0x8(%ebp),%edx
0xc011ab3d <task_to_steal+29>:  mov    0xc034afe0(,%eax,4),%eax
0xc011ab44 <task_to_steal+36>:  sar    $0x4,%eax
0xc011ab47 <task_to_steal+39>:  mov    %eax,0xffffffec(%ebp)
0xc011ab4a <task_to_steal+42>:  mov    0x20(%edx),%eax
0xc011ab4d <task_to_steal+45>:  mov    (%eax),%esi
0xc011ab4f <task_to_steal+47>:  test   %esi,%esi
0xc011ab51 <task_to_steal+49>:  je     0xc011ad6a <task_to_steal+586>
0xc011ab57 <task_to_steal+55>:  mov    %eax,0xffffffe4(%ebp)
0xc011ab5a <task_to_steal+58>:  movl   $0x0,0xfffffff0(%ebp)
0xc011ab61 <task_to_steal+65>:  mov    0xffffffe4(%ebp),%ebx
0xc011ab64 <task_to_steal+68>:  add    $0x4,%ebx
0xc011ab67 <task_to_steal+71>:  mov    %ebx,0xffffffd0(%ebp)
0xc011ab6a <task_to_steal+74>:  lea    0x0(%esi),%esi
0xc011ab70 <task_to_steal+80>:  mov    0xfffffff0(%ebp),%ebx
0xc011ab73 <task_to_steal+83>:  test   %ebx,%ebx
0xc011ab75 <task_to_steal+85>:  jne    0xc011acec <task_to_steal+460>
0xc011ab7b <task_to_steal+91>:  mov    0xffffffe4(%ebp),%edx
0xc011ab7e <task_to_steal+94>:  mov    0x4(%edx),%eax
0xc011ab81 <task_to_steal+97>:  test   %eax,%eax
0xc011ab83 <task_to_steal+99>:  jne    0xc011ace4 <task_to_steal+452>
0xc011ab89 <task_to_steal+105>: mov    0xffffffd0(%ebp),%ecx
0xc011ab8c <task_to_steal+108>: mov    0x4(%ecx),%eax
0xc011ab8f <task_to_steal+111>: test   %eax,%eax
0xc011ab91 <task_to_steal+113>: jne    0xc011acd9 <task_to_steal+441>
0xc011ab97 <task_to_steal+119>: mov    0xffffffd0(%ebp),%ebx
0xc011ab9a <task_to_steal+122>: mov    0x8(%ebx),%eax
0xc011ab9d <task_to_steal+125>: test   %eax,%eax
0xc011ab9f <task_to_steal+127>: jne    0xc011acce <task_to_steal+430>
0xc011aba5 <task_to_steal+133>: mov    0xffffffd0(%ebp),%edx
0xc011aba8 <task_to_steal+136>: mov    0xc(%edx),%eax
0xc011abab <task_to_steal+139>: test   %eax,%eax
0xc011abad <task_to_steal+141>: je     0xc011acbf <task_to_steal+415>
0xc011abb3 <task_to_steal+147>: bsf    %eax,%eax
0xc011abb6 <task_to_steal+150>: add    $0x60,%eax
0xc011abb9 <task_to_steal+153>: mov    %eax,0xfffffff0(%ebp)
0xc011abbc <task_to_steal+156>: cmpl   $0x8c,0xfffffff0(%ebp)
0xc011abc3 <task_to_steal+163>: je     0xc011ac9e <task_to_steal+382>
0xc011abc9 <task_to_steal+169>: mov    0xfffffff0(%ebp),%ebx
0xc011abcc <task_to_steal+172>: mov    0xffffffe4(%ebp),%eax
0xc011abcf <task_to_steal+175>: mov    0xc034b4e0,%edx
0xc011abd5 <task_to_steal+181>: lea    0x18(%eax,%ebx,8),%ebx
0xc011abd9 <task_to_steal+185>: mov    %ebx,0xffffffe0(%ebp)
0xc011abdc <task_to_steal+188>: mov    0x4(%ebx),%ebx
0xc011abdf <task_to_steal+191>: mov    %edx,0xffffffcc(%ebp)
0xc011abe2 <task_to_steal+194>: lea    0x0(%esi,1),%esi
0xc011abe9 <task_to_steal+201>: lea    0x0(%edi,1),%edi
0xc011abf0 <task_to_steal+208>: lea    0xffffffe0(%ebx),%edi
0xc011abf3 <task_to_steal+211>: mov    0xc0348e68,%eax
0xc011abf8 <task_to_steal+216>: mov    0x30(%edi),%edx
0xc011abfb <task_to_steal+219>: sub    %edx,%eax
0xc011abfd <task_to_steal+221>: cmp    0xffffffcc(%ebp),%eax
0xc011ac00 <task_to_steal+224>: jbe    0xc011ac70 <task_to_steal+336>
0xc011ac02 <task_to_steal+226>: mov    0x8(%ebp),%ecx
0xc011ac05 <task_to_steal+229>: mov    0x14(%ecx),%ecx
0xc011ac08 <task_to_steal+232>: cmp    %ecx,%edi
0xc011ac0a <task_to_steal+234>: mov    %ecx,0xffffffc8(%ebp)
0xc011ac0d <task_to_steal+237>: je     0xc011ac70 <task_to_steal+336>
0xc011ac0f <task_to_steal+239>: movzbl 0xc(%ebp),%ecx
0xc011ac13 <task_to_steal+243>: mov    0x38(%edi),%eax
0xc011ac16 <task_to_steal+246>: shr    %cl,%eax
0xc011ac18 <task_to_steal+248>: and    $0x1,%eax
0xc011ac1b <task_to_steal+251>: je     0xc011ac70 <task_to_steal+336>
0xc011ac1d <task_to_steal+253>: mov    0x48(%edi),%esi
0xc011ac20 <task_to_steal+256>: test   %esi,%esi
0xc011ac22 <task_to_steal+258>: jne    0xc011ac83 <task_to_steal+355>
0xc011ac24 <task_to_steal+260>: mov    0xc0348e68,%eax
0xc011ac29 <task_to_steal+265>: xor    %edx,%edx
0xc011ac2b <task_to_steal+267>: mov    $0x63,%esi
0xc011ac30 <task_to_steal+272>: mov    0x30(%edi),%ecx
0xc011ac33 <task_to_steal+275>: sub    %ecx,%eax
0xc011ac35 <task_to_steal+277>: mov    0x44(%edi),%ecx
0xc011ac38 <task_to_steal+280>: divl   0xffffffcc(%ebp)
0xc011ac3b <task_to_steal+283>: mov    0xffffffc8(%ebp),%edx
0xc011ac3e <task_to_steal+286>: cmp    $0x64,%eax
0xc011ac41 <task_to_steal+289>: cmovl  %eax,%esi
0xc011ac44 <task_to_steal+292>: cmp    %ecx,0xffffffec(%ebp)
0xc011ac47 <task_to_steal+295>: lea    0x64(%esi),%eax
0xc011ac4a <task_to_steal+298>: cmove  %eax,%esi
0xc011ac4d <task_to_steal+301>: mov    0x4(%edx),%eax
0xc011ac50 <task_to_steal+304>: lea    0xffffff9c(%esi),%edx
0xc011ac53 <task_to_steal+307>: mov    0xc(%eax),%eax
0xc011ac56 <task_to_steal+310>: mov    0xc034afe0(,%eax,4),%eax
0xc011ac5d <task_to_steal+317>: sar    $0x4,%eax
0xc011ac60 <task_to_steal+320>: cmp    %eax,%ecx
0xc011ac62 <task_to_steal+322>: cmove  %edx,%esi
0xc011ac65 <task_to_steal+325>: cmp    0xffffffdc(%ebp),%esi
0xc011ac68 <task_to_steal+328>: jle    0xc011ac70 <task_to_steal+336>
0xc011ac6a <task_to_steal+330>: mov    %esi,0xffffffdc(%ebp)
0xc011ac6d <task_to_steal+333>: mov    %edi,0xffffffe8(%ebp)
0xc011ac70 <task_to_steal+336>: mov    (%ebx),%ebx
0xc011ac72 <task_to_steal+338>: cmp    0xffffffe0(%ebp),%ebx
0xc011ac75 <task_to_steal+341>: jne    0xc011abf0 <task_to_steal+208>
0xc011ac7b <task_to_steal+347>: incl   0xfffffff0(%ebp)
0xc011ac7e <task_to_steal+350>: jmp    0xc011ab70 <task_to_steal+80>
0xc011ac83 <task_to_steal+355>: mov    %edi,(%esp,1)
0xc011ac86 <task_to_steal+358>: call   0xc0118070 <upd_node_mem>
0xc011ac8b <task_to_steal+363>: mov    0x8(%ebp),%edx
0xc011ac8e <task_to_steal+366>: mov    0xc034b4e0,%eax
0xc011ac93 <task_to_steal+371>: mov    %eax,0xffffffcc(%ebp)
0xc011ac96 <task_to_steal+374>: mov    0x14(%edx),%edx
0xc011ac99 <task_to_steal+377>: mov    %edx,0xffffffc8(%ebp)
0xc011ac9c <task_to_steal+380>: jmp    0xc011ac24 <task_to_steal+260>
0xc011ac9e <task_to_steal+382>: mov    0x8(%ebp),%eax
0xc011aca1 <task_to_steal+385>: mov    0xffffffe4(%ebp),%edx
0xc011aca4 <task_to_steal+388>: cmp    0x20(%eax),%edx
0xc011aca7 <task_to_steal+391>: jne    0xc011acb4 <task_to_steal+404>
0xc011aca9 <task_to_steal+393>: mov    0x1c(%eax),%ecx
0xc011acac <task_to_steal+396>: mov    %ecx,0xffffffe4(%ebp)
0xc011acaf <task_to_steal+399>: jmp    0xc011ab5a <task_to_steal+58>
0xc011acb4 <task_to_steal+404>: mov    0xffffffe8(%ebp),%eax
0xc011acb7 <task_to_steal+407>: add    $0x30,%esp
0xc011acba <task_to_steal+410>: pop    %ebx
0xc011acbb <task_to_steal+411>: pop    %esi
0xc011acbc <task_to_steal+412>: pop    %edi
0xc011acbd <task_to_steal+413>: pop    %ebp
0xc011acbe <task_to_steal+414>: ret    
0xc011acbf <task_to_steal+415>: mov    0xffffffd0(%ebp),%ecx
0xc011acc2 <task_to_steal+418>: bsf    0x10(%ecx),%eax
0xc011acc6 <task_to_steal+422>: sub    $0xffffff80,%eax
0xc011acc9 <task_to_steal+425>: jmp    0xc011abb9 <task_to_steal+153>
0xc011acce <task_to_steal+430>: bsf    %eax,%eax
0xc011acd1 <task_to_steal+433>: add    $0x40,%eax
0xc011acd4 <task_to_steal+436>: jmp    0xc011abb9 <task_to_steal+153>
0xc011acd9 <task_to_steal+441>: bsf    %eax,%eax
0xc011acdc <task_to_steal+444>: add    $0x20,%eax
0xc011acdf <task_to_steal+447>: jmp    0xc011abb9 <task_to_steal+153>
0xc011ace4 <task_to_steal+452>: bsf    %eax,%eax
0xc011ace7 <task_to_steal+455>: jmp    0xc011abb9 <task_to_steal+153>
0xc011acec <task_to_steal+460>: mov    0xfffffff0(%ebp),%eax
0xc011acef <task_to_steal+463>: xor    %esi,%esi
0xc011acf1 <task_to_steal+465>: mov    0xfffffff0(%ebp),%ecx
0xc011acf4 <task_to_steal+468>: mov    0xffffffd0(%ebp),%ebx
0xc011acf7 <task_to_steal+471>: sar    $0x5,%eax
0xc011acfa <task_to_steal+474>: and    $0x1f,%ecx
0xc011acfd <task_to_steal+477>: lea    (%ebx,%eax,4),%edi
0xc011ad00 <task_to_steal+480>: je     0xc011ad2b <task_to_steal+523>
0xc011ad02 <task_to_steal+482>: mov    (%edi),%eax
0xc011ad04 <task_to_steal+484>: shr    %cl,%eax
0xc011ad06 <task_to_steal+486>: bsf    %eax,%esi
0xc011ad09 <task_to_steal+489>: jne    0xc011ad10 <task_to_steal+496>
0xc011ad0b <task_to_steal+491>: mov    $0x20,%esi
0xc011ad10 <task_to_steal+496>: mov    $0x20,%eax
0xc011ad15 <task_to_steal+501>: sub    %ecx,%eax
0xc011ad17 <task_to_steal+503>: cmp    %eax,%esi
0xc011ad19 <task_to_steal+505>: jge    0xc011ad26 <task_to_steal+518>
0xc011ad1b <task_to_steal+507>: mov    0xfffffff0(%ebp),%edx
0xc011ad1e <task_to_steal+510>: lea    (%edx,%esi,1),%eax
0xc011ad21 <task_to_steal+513>: jmp    0xc011abb9 <task_to_steal+153>
0xc011ad26 <task_to_steal+518>: mov    %eax,%esi
0xc011ad28 <task_to_steal+520>: add    $0x4,%edi
0xc011ad2b <task_to_steal+523>: mov    0xffffffd0(%ebp),%ecx
0xc011ad2e <task_to_steal+526>: mov    %edi,%eax
0xc011ad30 <task_to_steal+528>: mov    $0x8c,%edx
0xc011ad35 <task_to_steal+533>: mov    %edi,%ebx
0xc011ad37 <task_to_steal+535>: sub    %ecx,%eax
0xc011ad39 <task_to_steal+537>: shl    $0x3,%eax
0xc011ad3c <task_to_steal+540>: sub    %eax,%edx
0xc011ad3e <task_to_steal+542>: add    $0x1f,%edx
0xc011ad41 <task_to_steal+545>: shr    $0x5,%edx
0xc011ad44 <task_to_steal+548>: mov    %edx,0xffffffd4(%ebp)
0xc011ad47 <task_to_steal+551>: mov    %edx,%ecx
0xc011ad49 <task_to_steal+553>: xor    %eax,%eax
0xc011ad4b <task_to_steal+555>: repz scas %es:(%edi),%eax
0xc011ad4d <task_to_steal+557>: je     0xc011ad55 <task_to_steal+565>
0xc011ad4f <task_to_steal+559>: lea    0xfffffffc(%edi),%edi
0xc011ad52 <task_to_steal+562>: bsf    (%edi),%eax
0xc011ad55 <task_to_steal+565>: sub    %ebx,%edi
0xc011ad57 <task_to_steal+567>: shl    $0x3,%edi
0xc011ad5a <task_to_steal+570>: add    %edi,%eax
0xc011ad5c <task_to_steal+572>: mov    %eax,%edx
0xc011ad5e <task_to_steal+574>: mov    0xfffffff0(%ebp),%eax
0xc011ad61 <task_to_steal+577>: add    %esi,%eax
0xc011ad63 <task_to_steal+579>: add    %edx,%eax
0xc011ad65 <task_to_steal+581>: jmp    0xc011abb9 <task_to_steal+153>
0xc011ad6a <task_to_steal+586>: mov    0x8(%ebp),%ecx
0xc011ad6d <task_to_steal+589>: mov    0x1c(%ecx),%ecx
0xc011ad70 <task_to_steal+592>: jmp    0xc011acac <task_to_steal+396>
0xc011ad75 <task_to_steal+597>: nop    
0xc011ad76 <task_to_steal+598>: lea    0x0(%esi),%esi
0xc011ad79 <task_to_steal+601>: lea    0x0(%edi,1),%edi
End of assembler dump.



* Re: Crunch time -- the musical.  (2.5 merge candidate list 1.5)
  2002-10-25 23:26             ` Martin J. Bligh
  2002-10-25 23:45               ` Martin J. Bligh
@ 2002-10-26  0:02               ` Martin J. Bligh
  1 sibling, 0 replies; 17+ messages in thread
From: Martin J. Bligh @ 2002-10-26  0:02 UTC (permalink / raw)
  To: Erich Focht, Michael Hohnbaum; +Cc: linux-kernel

>> I thought this problem is well understood! For some reasons independent of
>> my patch you have to boot your machines with the "notsc" option. This
>> leaves the cache_decay_ticks variable initialized to zero which my patch
>> doesn't like. I'm trying to deal with this inside the patch but there is
>> still a small window when the variable is zero. In my opinion this needs
>> to be fixed somewhere in arch/i386/kernel/smpboot.c. Booting a machine
>> with cache_decay_ticks=0 is pure nonsense, as it switches off cache
>> affinity which you absolutely need! So even if "notsc" is a legal option,
>> it should be fixed such that it doesn't leave your machine without cache
>> affinity. That would anyway give you a falsified behavior of the O(1)
>> scheduler.

> EIP is at task_to_steal+0x118/0x260

This turned out to be:

weight = (jiffies - tmp->sleep_timestamp)/cache_decay_ticks;

So I guess that window is still biting you. I'll see if I can fix it properly.
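
For illustration only, here is a minimal user-space sketch of one way to
close that window: clamp a zero divisor before dividing. The helper name
affinity_weight() and the clamp-to-1 policy are assumptions made for this
example; they are not taken from Erich's patch or from smpboot.c.

```c
#include <assert.h>

/*
 * Hypothetical helper (not from the patch): computes the weight from the
 * quoted line while tolerating a divisor that is still zero.
 */
static unsigned long affinity_weight(unsigned long now,
                                     unsigned long sleep_timestamp,
                                     unsigned long decay_ticks)
{
    /* The divisor may not have been set up yet; treat 0 as 1 tick. */
    if (decay_ticks == 0)
        decay_ticks = 1;

    /* Invariant from here on: the division below cannot trap. */
    assert(decay_ticks != 0);

    return (now - sleep_timestamp) / decay_ticks;
}
```

With a zero divisor the sketch degrades to the raw tick difference instead
of raising a divide error; whether that is the right fallback for the
scheduler is a separate question.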

M.



* Re: Crunch time -- the musical.  (2.5 merge candidate list 1.5)
  2002-10-25 19:58 ` Crunch time -- the musical. (2.5 merge candidate list 1.5) Rob Landley
@ 2002-10-26  8:45   ` george anzinger
  0 siblings, 0 replies; 17+ messages in thread
From: george anzinger @ 2002-10-26  8:45 UTC (permalink / raw)
  To: landley; +Cc: jim.houston, linux-kernel

Rob Landley wrote:
> 
> On Thursday 24 October 2002 19:25, Jim Houston wrote:
> > Hi Rob,
> >
> > The Posix timers entry in your list is confused.  I don't know how
> > my patch got the name Google.
> 
> Sorry, misread "George's version" as "Google's version" at 5 am one morning.
> Lot of late nights recently... :)
> 
> > I think Dan Kegel misunderstood George's answer to my previous
> > announcement.  George might be picking up some of my changes, but there
> > will still be two patches for Linus to choose from.  You included the URL to
> > George's answer which quoted my patch, rather than the URL I sent you.
> 
> Had it in, then took it out.  I'm trying to collate down the list wherever I
> can.
> 
> > Here is the URL for an archived copy of my latest patch:
> >      Jim Houston's  [PATCH] alternate Posix timer patch3
> >      http://marc.theaimsgroup.com/?l=linux-kernel&m=103549000027416&w=2
> 
> It's back now.
> 
> > I would be happy to see either version go into 2.5.
> 
> So what exactly is the difference between them?

First, to answer your question about the order of things in
my patches.  The 4 patches should be applied in this order:

First, the posix patch.  It introduces the POSIX clocks &
timers to the system.  It is not high res and stands alone. 
The rest of the patches are all about doing the high res
timers:

The 3 parts to the high res timers are:
 core           The core kernel (i.e. platform independent) changes
 i386           The high-res changes for the i386 (x86) platform
 hrposix        The changes to the POSIX clocks & timers patch to
                use high-res timers

This last is almost entirely contained to the one file
(.../kernel/posix_timers.c).  The "almost" is because it
adds a member to the posix timers structure which is defined
in sched.h.

Now, as to the differences between my patches and Jim's. 
Jim's patch is an alternate for the first or "posix" patch
only.  Since I picked up a variation on his id allocator,
thus removing the configuration option for the maximum
number of timers, the principal difference is that Jim keeps
the posix timers in a separate list, whereas my patch puts
them in the same list (i.e. the add_timer list) as all other
timers.  I assume (not having looked in detail at his latest
patch) that he uses the system's add_timer to drive the
timers in this list, and thus has a two stage expiry
algorithm (a. the add_timer pop which then, b. causes a
check of this new list).

Jim has also attempted to address the clock_nanosleep()
interaction with signals problem.  In short, the standard
says that signals that do not actually cause a handler in
the user code to run are NOT supposed to interrupt a sleep. 
The straightforward way to do this is to interrupt the
sleep on the signal, call do_signal() to deliver the signal
and check the return to see if it invoked a user handler (it
returns 1 in this case, else 0) and either continue the
sleep or return.  The problem is that do_signal() requires
&regs as a parameter and this is passed in different ways,
in the various platforms, to system calls.  ALL other system
calls that call do_signal() reside in platform dependent
code, most likely for this reason.
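
The control flow described above can be modelled in user space as a rough
sketch. Everything here is a stand-in: the wakeup enum, model_do_signal()
and model_nanosleep() are hypothetical names for this example, and the real
do_signal() takes &regs, which this model leaves out entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* What woke the sleeper. Hypothetical model type, not a kernel definition. */
enum wakeup {
    WAKE_TIMER_EXPIRED,     /* the requested time has elapsed */
    WAKE_SIGNAL_NO_HANDLER, /* signal delivered, no user handler ran */
    WAKE_SIGNAL_HANDLER     /* signal delivered, a user handler ran */
};

/* Stand-in for do_signal(): 1 if a user handler was invoked, else 0. */
static int model_do_signal(enum wakeup w)
{
    return w == WAKE_SIGNAL_HANDLER;
}

/*
 * Walk a scripted sequence of wakeups.  Returns 0 when the timer expires
 * and -EINTR when a signal actually ran a user handler.
 */
static int model_nanosleep(const enum wakeup *events, size_t n)
{
    assert(events != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (events[i] == WAKE_TIMER_EXPIRED)
            return 0;
        if (model_do_signal(events[i]))
            return -EINTR;
        /* Signal without a handler: continue the sleep. */
    }
    return 0; /* script ran out: treat as expiry */
}
```

A script of two handler-less signals followed by expiry returns 0, while a
script containing a handled signal returns -EINTR at that point.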

My solution for this problem is to provide a couple of
macros in linux/signal.h and linux/asm-i386/signal.h to
define the entry sequence for clock_nanosleep (and nanosleep
as it is now just a call to clock_nanosleep).  The macros in
linux/signal.h are general purpose and do NOT actually solve
the problem, but they do allow other platforms to work,
although, without the standard required signal handling. 
These are only defined if the asm/signal.h does not supply
an alternative.  This allows each platform to customize the
entry to clock_nanosleep() to pass in regs in whatever way
works for that platform.  I fully admit that this is a VERY
messy bit of code, BUT at the same time, it works.  I am
fully prepared to change to a cleaner solution should one
arise.
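
The "generic default unless the platform overrides it" arrangement can be
sketched with ordinary preprocessor guards. This is a single-file model
under stated assumptions: SLEEP_GET_REGS, MODEL_ARCH_HAS_REGS and
struct regs_model are invented for illustration and are not the macro
names used in the actual patch.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the platform register frame; hypothetical. */
struct regs_model {
    int dummy;
};

/* What a platform header might provide (disabled in this model). */
#ifdef MODEL_ARCH_HAS_REGS
#define SLEEP_GET_REGS(frame) (frame)
#endif

/* Generic fallback: used only when the platform supplied nothing. */
#ifndef SLEEP_GET_REGS
#define SLEEP_GET_REGS(frame) ((struct regs_model *)NULL)
#endif

/*
 * Returns 1 if the sleep code has a register frame to hand to the signal
 * delivery path, 0 if it is running on the generic fallback.
 */
static int can_check_handler(struct regs_model *frame)
{
    struct regs_model *regs;

    assert(frame != NULL);
    regs = SLEEP_GET_REGS(frame);
    return regs != NULL;
}
```

Built as shown, the fallback is selected and can_check_handler() returns 0,
which mirrors a platform that compiles and runs but without the
standard-required signal behaviour.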

Jim has NOT provided high res timers as yet, and thus does
not have any code to replace the 3 high res patches.  I
don't know if he is attempting to do this code.  I suspect
he is not, but he did indicate that he wants to expand his
posix timers to be high res.  If he does this, I suspect
that it would be his version of the "hrposix" patch.
> 
> > The URLs for George's patches are incomplete.  I believe this is the
> > most recent (it's from Oct 18).  The Sourceforge.net reference has the
> > user space library and test programs, but I did not see 2.5 kernel
> > patches.
> >
> >   [PATCH ] POSIX clocks & timers take 3 (NOT HIGH RES)
> >      http://marc.theaimsgroup.com/?l=linux-kernel&m=103489669622397&w=2
> 
> He's up to version 4 now.

As I said in another post, don't trust these archives; they
truncate long posts to less than what the lkml allows.  In
particular, they have truncated my patches.  The full set of
4 patches is available here:

 http://sourceforge.net/projects/high-res-timers/

or, to save a few clicks:

http://sourceforge.net/project/showfiles.php?group_id=20460&release_id=118345

Please do read the notes, they tell about the order of
application, which is fixed, i.e.:
hrtimers-posix  The posix clock/ timers interface, low res.
hrtimers-core   The core system high res patch.
hrtimers-i386   The high res code for the i386 platform.
hrtimers-hrposix The patch to move the low res posix patch
                 to high res.


-- 
George Anzinger   george@mvista.com
High-res-timers: 
http://sourceforge.net/projects/high-res-timers/
Preemption patch:
http://www.kernel.org/pub/linux/kernel/people/rml


* Re: highres timers question...
  2002-10-25 18:53 ` highres timers question Rob Landley
@ 2002-10-26  9:07   ` george anzinger
  0 siblings, 0 replies; 17+ messages in thread
From: george anzinger @ 2002-10-26  9:07 UTC (permalink / raw)
  To: landley; +Cc: jim.houston, linux-kernel

Rob Landley wrote:
> 
> I'm guessing that of the patches here:
> 
> http://sourceforge.net/projects/high-res-timers
> 
> The -posix one adds posix support on top of the base high-res timers patch?
> 
> (Did I guess right?)

Uh, no.  We made the command decision that even IF he does
not let in the high-res stuff we would like the POSIX API in
the kernel.  Thus the patches are structured to require the
POSIX patch first.  This can be changed if need be, but that
is the way it is now.
> 
> Rob
> 
> --
> http://penguicon.sf.net - Terry Pratchett, Eric Raymond, Pete Abrams, Illiad,
> CmdrTaco, liquid nitrogen ice cream, and caffeinated jello.  Well why not?

-- 
George Anzinger   george@mvista.com
High-res-timers: 
http://sourceforge.net/projects/high-res-timers/
Preemption patch:
http://www.kernel.org/pub/linux/kernel/people/rml


* Re: Crunch time -- the musical.  (2.5 merge candidate list 1.5)
  2002-10-25  8:15           ` Erich Focht
  2002-10-25 23:26             ` Martin J. Bligh
@ 2002-10-26 18:58             ` Martin J. Bligh
  1 sibling, 0 replies; 17+ messages in thread
From: Martin J. Bligh @ 2002-10-26 18:58 UTC (permalink / raw)
  To: Erich Focht, Michael Hohnbaum, landley; +Cc: linux-kernel

>> I still haven't been able to get your scheduler to boot for about
>> the last month without crashing the system. Andrew says he has it
>> booting somehow on 2.5.44-mm4, so I'll steal his kernel tomorrow and
>> see how it looks. If the numbers look good for doing boring things
>> like kernel compile, SDET, etc, I'm happy.
> 
> I thought this problem is well understood! For some reasons independent of
> my patch you have to boot your machines with the "notsc" option. This
> leaves the cache_decay_ticks variable initialized to zero which my patch
> doesn't like. I'm trying to deal with this inside the patch but there is
> still a small window when the variable is zero. In my opinion this needs
> to be fixed somewhere in arch/i386/kernel/smpboot.c. Booting a machine
> with cache_decay_ticks=0 is pure nonsense, as it switches off cache
> affinity which you absolutely need! So even if "notsc" is a legal option,
> it should be fixed such that it doesn't leave your machine without cache
> affinity. That would anyway give you a falsified behavior of the O(1)
> scheduler.

Oh, not sure if I ever replied to this or not. I don't *have* to boot
with notsc, I just usually do. And it crashed either way, so it's a
different problem (changing versions of gcc seems to perturb it too).
BUT ... your new patches 1 and 2 don't have this problem. See followup
email in a second.

M.


