public inbox for linux-kernel@vger.kernel.org
* Linux 2.6.23
@ 2007-10-09 20:54 Linus Torvalds
  2007-10-10  6:12 ` Nicholas Miell
  2007-10-10  7:44 ` René Rebe
  0 siblings, 2 replies; 20+ messages in thread
From: Linus Torvalds @ 2007-10-09 20:54 UTC (permalink / raw)
  To: Linux Kernel Mailing List


Finally.

Yeah, it got delayed, not because of any huge issues, but because of 
various bugfixes trickling in and causing me to reset my "release clock" 
all the time. But it's out there now, and hopefully better for the wait.

Not a whole lot of changes since -rc9, although there's a few updates to 
mips, sparc64 and blackfin in there.  Ignoring those arch updates, there's 
basically a number of mostly one-liners (mostly in drivers, but there's 
some networking fixes and some VFS/VM fixes there too).

Shortlog and diffstat appended (both relative to -rc9, of course - the 
full log from 2.6.22 is on kernel.org as usual).

I want this to be what people look at for a few days, but expect the x86 
merge to go ahead after that. So far, all indications are still that it's 
going to be all smooth sailing, but hey, those indicators seem to always 
say that, and only after the fact do people notice any problems ;)

		Linus

---
Akinobu Mita (1):
      [SPARC64]: check fork_idle() error

Al Viro (1):
      fix bogus reporting of signals by audit

Alexey Dobriyan (2):
      Move kasprintf.o to obj-y
      [ROSE]: Fix rose.ko oops on unload

Alexey Kuznetsov (1):
      [SFQ]: Remove artificial limitation for queue limit.

Andrew Morton (1):
      binfmt_flat: checkpatch fixing minimum support for the blackfin relocations

Anton Blanchard (2):
      [POWERPC] Fix xics set_affinity code
      Fix timer_stats printout of events/sec

Attila Kinali (1):
      Add manufacturer and card id of teltonica pcmcia modems

Ben Dooks (2):
      [ARM] 4597/2: OSIRIS: ensure CPLD0 is preserved after suspend
      [ARM] 4598/2: OSIRIS: Ensure we do not get nRSTOUT during suspend

Benjamin Herrenschmidt (1):
      Fix non-terminated PCI match table in PowerMac IDE

Bernd Schmidt (1):
      Binfmt_flat: Add minimum support for the Blackfin relocations

Brian Haley (1):
      [IPv6]: Fix ICMPv6 redirect handling with target multicast address

Bryan Wu (1):
      Blackfin arch: add some missing syscall

Dale Farnsworth (1):
      mv643xx_eth: Do not modify struct netdev tx_queue_len

David S. Miller (8):
      [SPARC]: Fix EBUS use of uninitialized variable.
      [SPARC64]: Fix put_user() calls in binfmt_aout32.c
      [SPARC64]: Fix missing load-twin usage in Niagara-1 memcpy.
      [SPARC64]: Don't use in/local regs for ldx/stx data in N1 memcpy.
      [SPARC64]: Fix domain-services port probing.
      [SPARC64]: VIO device addition log message level is too high.
      [SPARC64]: Temporary workaround for PCI-E slot on T1000.
      [SPARC64]: Fix 'niu' complex IRQ probing.

Dmitry Torokhov (1):
      Driver core: fix SYSF_DEPRECATED breakage for nested classdevs

Eric Dumazet (1):
      [TCP]: secure_tcp_sequence_number() should not use a too fast clock

FUJITA Tomonori (1):
      [SCSI] megaraid_old: fix READ_CAPACITY

Florian Fainelli (2):
      [MIPS] Alchemy: Fix USB initialization.
      [MIPS] Au1000: set the PCI controller IO base

Francois Romieu (1):
      r8169: revert part of 6dccd16b7c2703e8bbf8bca62b5cf248332afbe2

Giuseppe Sacco (2):
      [MIPS] IP32: Enable PCI bridges
      [MIPS] IP32: Fix fatal typo in address computation.

Hugh Dickins (1):
      Fix sys_remap_file_pages BUG at highmem.c:15!

Ilpo Järvinen (1):
      [TCP]: Fix fastpath_cnt_hint when GSO skb is partially ACKed

Ingo Molnar (1):
      sched: fix profile=sleep

Jeff Garzik (2):
      aic94xx: fix DMA data direction for SMP requests
      sata_mv: correct S/G table limits

Jeremy Fitzhardinge (1):
      xen: disable split pte locks for now

Jiri Slaby (1):
      Ata: pata_marvell, use ioread* for iomap-ped memory

Joe Perches (1):
      bcm43xx: Correct printk with PFX before KERN_

John W. Linville (1):
      [IEEE80211]: avoid integer underflow for runt rx frames

Karsten Keil (1):
      ISDN: Fix data access out of array bounds

Kyle McMartin (1):
      Revert "intel_agp: fix stolen mem range on G33"

Linus Torvalds (3):
      VT_WAITACTIVE: Avoid returning EINTR when not necessary
      Don't do load-average calculations at even 5-second intervals
      Linux 2.6.23

Maarten Bressers (1):
      Correct Makefile rule for generating custom keymap

Maciej W. Rozycki (1):
      [MIPS] pg-r4k.c: Fix a typo in an R4600 v2 erratum workaround

Michael Hennerich (2):
      Blackfin arch: gpio pinmux and resource allocation API required by BF537 on chip ethernet mac driver
      Blackfin arch: fix PORT_J BUG for BF537/6 EMAC driver reported by Kalle Pokki <kalle.pokki@iki.fi>

Olof Johansson (1):
      libata: fix for sata_mv >64KB DMA segments

Pavel Machek (1):
      sysrq docs: document sequence that actually works

Peter Korsgaard (1):
      dm9601: Fix receive MTU

Peter Zijlstra (2):
      lockstat: documentation
      mm: set_page_dirty_balance() vs ->page_mkwrite()

Rafal Bilski (1):
      Longhaul: add auto enabled "revid_errata" option

Ralf Baechle (2):
      [MIPS] Type proof reimplementation of cmpxchg.
      [MIPS] Terminally fix local_{dec,sub}_if_positive

Richard Knutsson (1):
      softmac: Fix compiler-warning

Ron Mercer (2):
      qla3xxx: bugfix: Add memory barrier before accessing rx completion.
      qla3xxx: bugfix: Fix VLAN rx completion handling.

Scott Thompson (1):
      drivers/ata/pata_ixp4xx_cf.c: ioremap return code check

Serge Belyshev (1):
      Remove unnecessary cast in prefetch()

Stefan Richter (1):
      firewire: point to migration document

Stephen Hemminger (2):
      sky2: jumbo frame regression fix
      [PKT_SCHED] cls_u32: error code isn't been propogated properly

Sunil Mushran (1):
      ocfs2: Unlock mutex in local alloc failure case

Tejun Heo (1):
      ata_piix: add another TECRA M3 entry to broken suspend list

Trond Myklebust (1):
      NLM: Fix a memory leak in nlmsvc_testlock

Yan Zheng (3):
      fix VM_CAN_NONLINEAR check in sys_remap_file_pages
      fix page release issue in filemap_fault
      AIO: fix cleanup in io_submit_one(...)

---
 Documentation/lockstat.txt                        |  120 +++++++
 Documentation/sysrq.txt                           |    2 +-
 Makefile                                          |    2 +-
 arch/arm/mach-s3c2440/mach-osiris.c               |   18 +
 arch/blackfin/kernel/bfin_gpio.c                  |  285 ++++++++++++++--
 arch/blackfin/mach-common/entry.S                 |   23 +-
 arch/i386/kernel/cpu/cpufreq/longhaul.c           |   60 ++++-
 arch/mips/au1000/common/pci.c                     |    1 +
 arch/mips/au1000/mtx-1/board_setup.c              |    4 +-
 arch/mips/au1000/pb1000/board_setup.c             |    6 +-
 arch/mips/au1000/pb1100/board_setup.c             |    4 +-
 arch/mips/au1000/pb1500/board_setup.c             |    6 +-
 arch/mips/mm/pg-r4k.c                             |    2 +-
 arch/mips/pci/ops-mace.c                          |   21 +-
 arch/powerpc/platforms/pseries/xics.c             |    2 +-
 arch/sparc/kernel/ebus.c                          |    2 +
 arch/sparc64/kernel/binfmt_aout32.c               |    4 +-
 arch/sparc64/kernel/ebus.c                        |    5 +-
 arch/sparc64/kernel/pci_common.c                  |    4 +-
 arch/sparc64/kernel/prom.c                        |    3 +-
 arch/sparc64/kernel/smp.c                         |    2 +
 arch/sparc64/kernel/vio.c                         |   29 ++-
 arch/sparc64/lib/NGcopy_from_user.S               |    8 +-
 arch/sparc64/lib/NGcopy_to_user.S                 |    8 +-
 arch/sparc64/lib/NGmemcpy.S                       |  371 ++++++++++++---------
 drivers/ata/ata_piix.c                            |    7 +
 drivers/ata/pata_ixp4xx_cf.c                      |    3 +
 drivers/ata/pata_marvell.c                        |    4 +-
 drivers/ata/sata_mv.c                             |   35 ++-
 drivers/base/core.c                               |   10 +-
 drivers/char/Makefile                             |    2 +-
 drivers/char/agp/intel-agp.c                      |    5 -
 drivers/char/random.c                             |   10 +-
 drivers/char/vt_ioctl.c                           |    4 +-
 drivers/firewire/Kconfig                          |    3 +-
 drivers/ide/ppc/pmac.c                            |    1 +
 drivers/isdn/i4l/isdn_common.c                    |    5 +-
 drivers/net/mv643xx_eth.c                         |    1 -
 drivers/net/qla3xxx.c                             |    7 +
 drivers/net/r8169.c                               |   16 +-
 drivers/net/sky2.c                                |    3 -
 drivers/net/usb/dm9601.c                          |    2 +-
 drivers/net/wireless/bcm43xx/bcm43xx_wx.c         |    2 +-
 drivers/scsi/aic94xx/aic94xx_task.c               |    4 +-
 drivers/scsi/megaraid.c                           |    8 +
 drivers/serial/serial_cs.c                        |    1 +
 fs/aio.c                                          |    2 +-
 fs/binfmt_flat.c                                  |    6 +-
 fs/lockd/svclock.c                                |    4 +-
 fs/ocfs2/localalloc.c                             |    4 +-
 include/asm-blackfin/mach-bf533/bfin_serial_5xx.h |   11 +-
 include/asm-blackfin/mach-bf537/bfin_serial_5xx.h |   23 +-
 include/asm-blackfin/mach-bf537/portmux.h         |   35 ++-
 include/asm-blackfin/mach-bf561/bfin_serial_5xx.h |   11 +-
 include/asm-blackfin/portmux.h                    |   55 +++
 include/asm-blackfin/unistd.h                     |   56 +++-
 include/asm-h8300/flat.h                          |    3 +-
 include/asm-m32r/flat.h                           |    3 +-
 include/asm-m68knommu/flat.h                      |    3 +-
 include/asm-mips/cmpxchg.h                        |  107 ++++++
 include/asm-mips/local.h                          |   69 +----
 include/asm-mips/system.h                         |  261 +---------------
 include/asm-sh/flat.h                             |    3 +-
 include/asm-v850/flat.h                           |    4 +-
 include/asm-x86_64/processor.h                    |    2 +-
 include/linux/sched.h                             |    2 +-
 include/linux/writeback.h                         |    2 +-
 include/net/rose.h                                |    2 +-
 kernel/sched_fair.c                               |   10 +
 kernel/signal.c                                   |   22 +-
 kernel/time/timer_stats.c                         |    5 +-
 lib/Kconfig.debug                                 |    2 +
 lib/Makefile                                      |    4 +-
 mm/Kconfig                                        |    1 +
 mm/filemap.c                                      |    1 +
 mm/fremap.c                                       |    2 +-
 mm/memory.c                                       |   23 +-
 mm/page-writeback.c                               |    4 +-
 net/ieee80211/ieee80211_rx.c                      |    6 +
 net/ieee80211/softmac/ieee80211softmac_wx.c       |    2 +-
 net/ipv4/tcp_input.c                              |    3 +
 net/ipv6/ndisc.c                                  |    9 +-
 net/rose/rose_loopback.c                          |    4 +-
 net/rose/rose_route.c                             |   15 +-
 net/sched/cls_u32.c                               |    2 +-
 net/sched/sch_sfq.c                               |   47 ++-
 86 files changed, 1254 insertions(+), 701 deletions(-)
 create mode 100644 Documentation/lockstat.txt
 create mode 100644 include/asm-mips/cmpxchg.h

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Linux 2.6.23
  2007-10-09 20:54 Linux 2.6.23 Linus Torvalds
@ 2007-10-10  6:12 ` Nicholas Miell
  2007-10-10 10:14   ` Ingo Molnar
  2007-10-10  7:44 ` René Rebe
  1 sibling, 1 reply; 20+ messages in thread
From: Nicholas Miell @ 2007-10-10  6:12 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux Kernel Mailing List, Ingo Molnar

On Tue, 2007-10-09 at 13:54 -0700, Linus Torvalds wrote:
> Finally.
> 
> Yeah, it got delayed, not because of any huge issues, but because of 
> various bugfixes trickling in and causing me to reset my "release clock" 
> all the time. But it's out there now, and hopefully better for the wait.
> 
> Not a whole lot of changes since -rc9, although there's a few updates to 
> mips, sparc64 and blackfin in there.  Ignoring those arch updates, there's 
> basically a number of mostly one-liners (mostly in drivers, but there's 
> some networking fixes and some VFS/VM fixes there too).
> 
> Shortlog and diffstat appended (both relative to -rc9, of course - the 
> full log from 2.6.22 is on kernel.org as usual).
> 
> I want this to be what people look at for a few days, but expect the x86 
> merge to go ahead after that. So far, all indications are still that it's 
> going to be all smooth sailing, but hey, those indicators seem to always 
> say that, and only after the fact do people notice any problems ;)
> 
> 		Linus

Does CFS still generate the following sysbench graphs with 2.6.23, or
did that get fixed?

http://people.freebsd.org/~kris/scaling/linux-pgsql.png
http://people.freebsd.org/~kris/scaling/linux-mysql.png

(There's also some interesting FreeBSD vs. Linux graphs in
http://people.freebsd.org/~kris/scaling/Scalability%20Update.pdf , but
AFAIK those comparisons are more indicative of glibc malloc performance
than Linux performance.)

-- 
Nicholas Miell <nmiell@comcast.net>



* Re: Linux 2.6.23
  2007-10-09 20:54 Linux 2.6.23 Linus Torvalds
  2007-10-10  6:12 ` Nicholas Miell
@ 2007-10-10  7:44 ` René Rebe
  2007-10-10  8:37   ` Alexey Dobriyan
  2007-10-10 19:14   ` Ingo Molnar
  1 sibling, 2 replies; 20+ messages in thread
From: René Rebe @ 2007-10-10  7:44 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux Kernel Mailing List

Hi Linus et al.,

2.6.23 does not build with my usual .config on x86_64 and gcc-4.2.1:

In file included from fs/drop_caches.c:8:
include/linux/mm.h:1210: warning: 'struct super_block' declared inside parameter list
include/linux/mm.h:1210: warning: its scope is only this definition or declaration, which is probably not what you want
fs/drop_caches.c:17: error: conflicting types for 'drop_pagecache_sb'
include/linux/mm.h:1210: error: previous declaration of 'drop_pagecache_sb' was here
fs/drop_caches.c:28: error: conflicting types for 'drop_pagecache_sb'
include/linux/mm.h:1210: error: previous declaration of 'drop_pagecache_sb' was here

A little forward declaration fixes this:

--- linux-2.6.23/include/linux/mm.h.vanilla	2007-10-10 09:28:33.000000000 +0200
+++ linux-2.6.23/include/linux/mm.h	2007-10-10 09:30:23.000000000 +0200
@@ -1207,6 +1207,7 @@
 					void __user *, size_t *, loff_t *);
 unsigned long shrink_slab(unsigned long scanned, gfp_t gfp_mask,
 			unsigned long lru_pages);
+struct super_block;
 extern void drop_pagecache_sb(struct super_block *);
 void drop_pagecache(void);
 void drop_slab(void);

You will probably end up fixing it some other way, but as I do not know this
file inside out I just wanted to drop a note.
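
[Editorial illustration, not part of the original message.] The warning
and the follow-on "conflicting types" errors quoted above are what a C
compiler reports when a struct tag first appears inside a prototype's
parameter list: the tag is then scoped to that prototype only, so a later
definition that uses the file-scope struct no longer matches. A minimal
standalone sketch of the forward-declaration fix follows. The function
sb_is_readonly(), the flag DEMO_RDONLY and the one-field struct body are
hypothetical stand-ins, not kernel code.

```c
#include <stddef.h>

/* Forward declaration: introduces the tag at file scope, so the
 * prototype below and the later definition refer to the same type. */
struct super_block;

/* The prototype only needs a pointer to the (still incomplete) type. */
int sb_is_readonly(const struct super_block *sb);

/* Hypothetical, heavily simplified stand-in for the real structure. */
struct super_block {
	unsigned long s_flags;
};

#define DEMO_RDONLY 1UL /* illustrative flag, not the kernel's MS_RDONLY */

/* Matches the prototype because both use the file-scope struct tag. */
int sb_is_readonly(const struct super_block *sb)
{
	return sb != NULL && (sb->s_flags & DEMO_RDONLY) != 0;
}
```

Removing the lone "struct super_block;" line and moving the struct
definition below the function is one way to see the same class of
diagnostics with gcc.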

Yours,
  René Rebe

On Tuesday 09 October 2007 22:54:30 Linus Torvalds wrote:

> Finally.
> 
> Yeah, it got delayed, not because of any huge issues, but because of 
> various bugfixes trickling in and causing me to reset my "release clock" 
> all the time. But it's out there now, and hopefully better for the wait.
> 
> Not a whole lot of changes since -rc9, although there's a few updates to 
> mips, sparc64 and blackfin in there.  Ignoring those arch updates, there's 
> basically a number of mostly one-liners (mostly in drivers, but there's 
> some networking fixes and some VFS/VM fixes there too).
> 
> Shortlog and diffstat appended (both relative to -rc9, of course - the 
> full log from 2.6.22 is on kernel.org as usual).
> 
> I want this to be what people look at for a few days, but expect the x86 
> merge to go ahead after that. So far, all indications are still that it's 
> going to be all smooth sailing, but hey, those indicators seem to always 
> say that, and only after the fact do people notice any problems ;)
> 
> 		Linus

-- 
  René Rebe - ExactCODE GmbH - Europe, Germany, Berlin
  http://exactcode.de | http://t2-project.org | http://rene.rebe.name


* Re: Linux 2.6.23
  2007-10-10  7:44 ` René Rebe
@ 2007-10-10  8:37   ` Alexey Dobriyan
  2007-10-10  9:12     ` Michael Tokarev
  2007-10-10 19:14   ` Ingo Molnar
  1 sibling, 1 reply; 20+ messages in thread
From: Alexey Dobriyan @ 2007-10-10  8:37 UTC (permalink / raw)
  To: René Rebe; +Cc: Linus Torvalds, Linux Kernel Mailing List

On 10/10/07, René Rebe <rene@exactcode.de> wrote:
> 2.6.23 does not build with my usual .config on x86_64 and gcc-4.2.1:
>
> In file included from fs/drop_caches.c:8:
> include/linux/mm.h:1210: warning: 'struct super_block' declared inside
> parameter list

> --- linux-2.6.23/include/linux/mm.h.vanilla
> +++ linux-2.6.23/include/linux/mm.h

> +struct super_block;
>  extern void drop_pagecache_sb(struct super_block *);
>  void drop_pagecache(void);
>  void drop_slab(void);
>
> You probably end up fixing it some other way, but as I do not know this
> file inside out I just wanted to drop a note.

You have some strange vanilla kernel. 2.6.23 doesn't have this prototype.


* Re: Linux 2.6.23
  2007-10-10  8:37   ` Alexey Dobriyan
@ 2007-10-10  9:12     ` Michael Tokarev
  2007-10-10 10:36       ` Alexey Dobriyan
  0 siblings, 1 reply; 20+ messages in thread
From: Michael Tokarev @ 2007-10-10  9:12 UTC (permalink / raw)
  To: Alexey Dobriyan; +Cc: René Rebe, Linus Torvalds, Linux Kernel Mailing List

Alexey Dobriyan wrote:
> On 10/10/07, René Rebe <rene@exactcode.de> wrote:
>> 2.6.23 does not build with my usual .config on x86_64 and gcc-4.2.1:
>>
>> In file included from fs/drop_caches.c:8:
>> include/linux/mm.h:1210: warning: 'struct super_block' declared inside
>> parameter list
> 
>> --- linux-2.6.23/include/linux/mm.h.vanilla
>> +++ linux-2.6.23/include/linux/mm.h
> 
>> +struct super_block;
>>  extern void drop_pagecache_sb(struct super_block *);
>>  void drop_pagecache(void);
>>  void drop_slab(void);
>>
>> You probably end up fixing it some other way, but as I do not know this
>> file inside out I just wanted to drop a note.
> 
> You have some strange vanilla kernel. 2.6.23 doesn't have this prototype.

The same happens here as well.

-rw-rw-r--  1 mjt mjt 45488158 Oct  9 20:48 linux-2.6.23.tar.bz2
2cc2fd4d521dc5d7cfce0d8a9d1b3472  linux-2.6.23.tar.bz2

(timestamp is in UTC) Downloaded yesterday, 3 hours after the announcement,
from http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.23.tar.bz2 .

/mjt


* Re: Linux 2.6.23
  2007-10-10  6:12 ` Nicholas Miell
@ 2007-10-10 10:14   ` Ingo Molnar
  2007-10-11  1:20     ` Nicholas Miell
                       ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Ingo Molnar @ 2007-10-10 10:14 UTC (permalink / raw)
  To: Nicholas Miell; +Cc: Linus Torvalds, Linux Kernel Mailing List


* Nicholas Miell <nmiell@comcast.net> wrote:

> Does CFS still generate the following sysbench graphs with 2.6.23, or 
> did that get fixed?
>
> http://people.freebsd.org/~kris/scaling/linux-pgsql.png 
> http://people.freebsd.org/~kris/scaling/linux-mysql.png

as far as my testsystem goes, v2.6.23 beats v2.6.22.9 in sysbench:

    http://redhat.com/~mingo/misc/sysbench.jpg

As you can see in the graph, v2.6.23 schedules much more consistently 
too. [ v2.6.22 has a small (but potentially statistically insignificant) 
edge at 4-6 clients, and CFS has a slightly better peak (which is 
statistically insignificant). ]

( Config is at http://redhat.com/~mingo/misc/config, system is Core2Duo
  1.83 GHz, mysql-5.0.45, glibc-2.6. Nothing fancy in either the config
  or the setup - everything is pretty close to the defaults. )

i'm aware of a 2.6.21 vs. 2.6.23 sysbench regression report, and it 
apparently got resolved after various changes to the test environment:

   http://jeffr-tech.livejournal.com/10103.html

 " [<CFS>] has virtually no dropoff and performs better under load than
   the default 2.6.21 scheduler. " (paraphrased)

(The new link you posted, just a few hours after the release of v2.6.23, 
has not been reported to lkml before AFAICS - when did you become aware 
of it? If you learned about it before v2.6.23 it might have been useful 
to report it to the v2.6.23 regression list.)

At a quick glance there are no .configs or other testing details at or 
around that URL that i could use to reproduce their result precisely, so 
at least a minimal bugreport would be nice.

In any case, here are a few general comments about sysbench numbers:

Sysbench is a pretty 'batched' workload: it benefits most from batchy 
scheduling: the client doing as much work as it can, then the server doing 
as much work as it can - and so on. The longer the client can work the 
more cache-efficient the workload is. Any round-trip to the server due 
to pesky preemption only blows up the cache footprint of the workload 
and gives lower throughput.

This kind of workload would probably run best on DOS or Windows 3.11, 
with no preemptive scheduling done at all. In other words: run both 
mysqld and the client as SCHED_FIFO to get the best performance out of 
it. So in that sense the workload is a bit similar to dbench.

The other thing is that mysqld does _tons_ of sys_time() calls, so GTOD 
differences between .22 and .23 might cause extra overhead - especially 
with 8 CPUs/cores. Does the sys_time() scalability patch below improve 
sysbench performance for you? (i'm not sure about psqld)

If it's indeed due to batched vs. well-spread-out scheduling behavior 
(which is possible), there are a few things you could do to make 
scheduling more batched:

1) start the DB daemon up as SCHED_BATCH:

     schedtool -B -e service mysqld restart

   (and do the same with the client-side commands as well)

   or:

       schedtool -B $$

   to mark the parent shell as SCHED_BATCH - then start up the DB and 
   start the client workload. (All other tasks not started from this 
   shell will still be SCHED_OTHER, so only your mysql workload will be 
   affected.) For example "beagled" already runs under SCHED_BATCH by 
   default.

   SCHED_BATCH will cause the scheduler to batch up the workload more. 
   You basically tell the scheduler: "this workload really wants
   throughput above all", and the scheduler takes that hint and acts 
   upon it. (it's still not as drastic as SCHED_FIFO, it's somewhere 
   between SCHED_OTHER and SCHED_FIFO, in terms of batching. Start up 
   your DB and your client as SCHED_FIFO via "schedtool -F -p 10 ..." to 
   establish the best-case batching win.)

2) check out the v22 CFS backport patch which has the latest & greatest 
   scheduler code, from http://people.redhat.com/mingo/cfs-scheduler/ . 
   Does performance go up for you with it? It's somewhat less
   preemption-eager, which might well make the crucial difference for
   sysbench.

3) if it's enabled, disable CONFIG_PREEMPT=y. CONFIG_PREEMPT can cause
   unwanted overscheduling and cache-trashing under overload.
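
[Editorial illustration, not part of the original message.] The
"schedtool -B" invocations in item 1 switch a task to the SCHED_BATCH
policy. A program can request the same policy for itself through the
sched_setscheduler(2) system call. The sketch below assumes Linux with
glibc; the helper name become_batch() is hypothetical, and the fallback
definition of SCHED_BATCH as 3 is an assumption that matches the Linux
value and only matters if the header did not expose the constant.

```c
#define _GNU_SOURCE /* exposes SCHED_BATCH in <sched.h> on glibc */
#include <sched.h>
#include <stdio.h>

#ifndef SCHED_BATCH
#define SCHED_BATCH 3 /* assumed fallback: the Linux value of this policy */
#endif

/* Ask the kernel to schedule the calling process as SCHED_BATCH.
 * Returns 0 on success, -1 on failure (errno is set by the call). */
int become_batch(void)
{
	/* SCHED_BATCH is a non-realtime policy: the priority must be 0. */
	struct sched_param param = { .sched_priority = 0 };

	if (sched_setscheduler(0, SCHED_BATCH, &param) == -1) {
		perror("sched_setscheduler");
		return -1;
	}
	return 0;
}
```

Calling become_batch() early in main(), before the worker threads or
child processes are created, lets them inherit the policy, which is
roughly the effect of marking the parent shell with "schedtool -B $$".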

hope this helps, and i'm definitely interested in more feedback about 
this,

	Ingo

Index: linux/kernel/time.c
===================================================================
--- linux.orig/kernel/time.c
+++ linux/kernel/time.c
@@ -57,11 +57,7 @@ EXPORT_SYMBOL(sys_tz);
  */
 asmlinkage long sys_time(time_t __user * tloc)
 {
-	time_t i;
-	struct timespec tv;
-
-	getnstimeofday(&tv);
-	i = tv.tv_sec;
+	time_t i = get_seconds();
 
 	if (tloc) {
 		if (put_user(i,tloc))
Index: linux/kernel/time/timekeeping.c
===================================================================
--- linux.orig/kernel/time/timekeeping.c
+++ linux/kernel/time/timekeeping.c
@@ -49,19 +49,12 @@ struct timespec wall_to_monotonic __attr
 static unsigned long total_sleep_time;		/* seconds */
 EXPORT_SYMBOL(xtime);
 
-
-#ifdef CONFIG_NO_HZ
 static struct timespec xtime_cache __attribute__ ((aligned (16)));
 static inline void update_xtime_cache(u64 nsec)
 {
 	xtime_cache = xtime;
 	timespec_add_ns(&xtime_cache, nsec);
 }
-#else
-#define xtime_cache xtime
-/* We do *not* want to evaluate the argument for this case */
-#define update_xtime_cache(n) do { } while (0)
-#endif
 
 static struct clocksource *clock; /* pointer to current clocksource */
 


* Re: Linux 2.6.23
  2007-10-10  9:12     ` Michael Tokarev
@ 2007-10-10 10:36       ` Alexey Dobriyan
  2007-10-10 10:53         ` Jan Engelhardt
  0 siblings, 1 reply; 20+ messages in thread
From: Alexey Dobriyan @ 2007-10-10 10:36 UTC (permalink / raw)
  To: Michael Tokarev; +Cc: René Rebe, Linus Torvalds, Linux Kernel Mailing List

On 10/10/07, Michael Tokarev <mjt@tls.msk.ru> wrote:
> Alexey Dobriyan wrote:
> > On 10/10/07, René Rebe <rene@exactcode.de> wrote:
> >> 2.6.23 does not build with my usual .config on x86_64 and gcc-4.2.1:
> >>
> >> In file included from fs/drop_caches.c:8:
> >> include/linux/mm.h:1210: warning: 'struct super_block' declared inside
> >> parameter list
> >
> >> --- linux-2.6.23/include/linux/mm.h.vanilla
> >> +++ linux-2.6.23/include/linux/mm.h
> >
> >> +struct super_block;
> >>  extern void drop_pagecache_sb(struct super_block *);
> >>  void drop_pagecache(void);
> >>  void drop_slab(void);
> >>
> >> You probably end up fixing it some other way, but as I do not know this
> >> file inside out I just wanted to drop a note.
> >
> > You have some strange vanilla kernel. 2.6.23 doesn't have this prototype.
>
> The same happens here as well.
>
> -rw-rw-r--  1 mjt mjt 45488158 Oct  9 20:48 linux-2.6.23.tar.bz2
> 2cc2fd4d521dc5d7cfce0d8a9d1b3472  linux-2.6.23.tar.bz2
>
> (timestamp is in UTC) Downloaded yesterday, 3 hours after an announce,
> from http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.23.tar.bz2 .

Strange. Same size, same md5, no super_block in mm.h, though


* Re: Linux 2.6.23
  2007-10-10 10:36       ` Alexey Dobriyan
@ 2007-10-10 10:53         ` Jan Engelhardt
  2007-10-10 11:13           ` Michael Tokarev
  0 siblings, 1 reply; 20+ messages in thread
From: Jan Engelhardt @ 2007-10-10 10:53 UTC (permalink / raw)
  To: Alexey Dobriyan
  Cc: Michael Tokarev, René Rebe, Linus Torvalds,
	Linux Kernel Mailing List


On Oct 10 2007 14:36, Alexey Dobriyan wrote:
>> >> --- linux-2.6.23/include/linux/mm.h.vanilla
>> >> +++ linux-2.6.23/include/linux/mm.h
>> >
>> >> +struct super_block;
>> >>  extern void drop_pagecache_sb(struct super_block *);
>> >>  void drop_pagecache(void);
>> >>  void drop_slab(void);
>> >>
>> >> You probably end up fixing it some other way, but as I do not know this
>> >> file inside out I just wanted to drop a note.
>> >
>> > You have some strange vanilla kernel. 2.6.23 doesn't have this prototype.
>>
>> The same happens here as well.
>>
>> -rw-rw-r--  1 mjt mjt 45488158 Oct  9 20:48 linux-2.6.23.tar.bz2
>> 2cc2fd4d521dc5d7cfce0d8a9d1b3472  linux-2.6.23.tar.bz2
>>
>> (timestamp is in UTC) Downloaded yesterday, 3 hours after an announce,
>> from http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.23.tar.bz2 .
>
>Strange. Same size, same md5, no super_block in mm.h, though

Does someone still have the broken tarball?

There has not been any drop_pagecache_sb anytime between 2.6.23-rc1
and 2.6.23. drop_pagecache_sb reminds me of reiser4, too.


* Re: Linux 2.6.23
  2007-10-10 10:53         ` Jan Engelhardt
@ 2007-10-10 11:13           ` Michael Tokarev
  0 siblings, 0 replies; 20+ messages in thread
From: Michael Tokarev @ 2007-10-10 11:13 UTC (permalink / raw)
  To: Jan Engelhardt
  Cc: Alexey Dobriyan, René Rebe, Linus Torvalds,
	Linux Kernel Mailing List

Jan Engelhardt wrote:
> On Oct 10 2007 14:36, Alexey Dobriyan wrote:
>>>>> --- linux-2.6.23/include/linux/mm.h.vanilla
>>>>> +++ linux-2.6.23/include/linux/mm.h
>>>>> +struct super_block;
>>>>>  extern void drop_pagecache_sb(struct super_block *);
>>>>>  void drop_pagecache(void);
>>>>>  void drop_slab(void);
>>>>>
>>>>> You probably end up fixing it some other way, but as I do not know this
>>>>> file inside out I just wanted to drop a note.
>>>> You have some strange vanilla kernel. 2.6.23 doesn't have this prototype.
>>> The same happens here as well.
>>>
>>> -rw-rw-r--  1 mjt mjt 45488158 Oct  9 20:48 linux-2.6.23.tar.bz2
>>> 2cc2fd4d521dc5d7cfce0d8a9d1b3472  linux-2.6.23.tar.bz2
>>>
>>> (timestamp is in UTC) Downloaded yesterday, 3 hours after an announce,
>>> from http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.23.tar.bz2 .
>> Strange. Same size, same md5, no super_block in mm.h, though
> 
> Does someone still have the broken tarball?
> 
> There has not been any drop_pagecache_sb anytime between 2.6.23-rc1
> and 2.6.23. drop_pagecache_sb reminds me of reiser4, too.

ghhrm.  That's nonsense.  I found where that struct super_block comes
from -- it's from unionfs patches for 2.6.22, which I forgot to
update for 2.6.23 (I just dropped the new kernel tarball into my
build directory together with other patches and ran the usual build
procedure).  It's definitely a false alarm - the tarball is
fine.

/mjt


* Re: Linux 2.6.23
  2007-10-10  7:44 ` René Rebe
  2007-10-10  8:37   ` Alexey Dobriyan
@ 2007-10-10 19:14   ` Ingo Molnar
  2007-10-10 19:26     ` Michael Tokarev
                       ` (2 more replies)
  1 sibling, 3 replies; 20+ messages in thread
From: Ingo Molnar @ 2007-10-10 19:14 UTC (permalink / raw)
  To: René Rebe; +Cc: Linus Torvalds, Linux Kernel Mailing List


* René Rebe <rene@exactcode.de> wrote:

> Hi Linus et al.,
> 
> 2.6.23 does not build with my usual .config on x86_64 and gcc-4.2.1:

i know about 4 (low-impact, cornercase) build breakages for 2.6.23-final 
on x86:

- an uncommon embedded config combination: if CONFIG_EMBEDDED=y and
  CONFIG_BLOCK is unset. (a normally useless combination)

- an uncommon V4L config combination: mixed-modular-built-in driver V4L
  config variation. (CONFIG_VIDEO_SAA7146=y and CONFIG_VIDEO_BUF=m)

- an uncommon MTD config combination (normal systems do not need
  CONFIG_MTD configured)

- an uncommon CONFIG_USB_NET_CDC_SUBSET config combination (normal 
  systems should never hit that)

[ furthermore there are a few driver-firmware build options that break 
  and which are not correctly made dependent on !PREVENT_FIRMWARE_BUILD. 
  Again, this is not something one would normally configure. ]

your superblock build failure would be a new and so far unknown build 
breakage variant - please send the .config you used, and double-check 
that it's indeed a vanilla 2.6.23 tree.

	Ingo


* Re: Linux 2.6.23
  2007-10-10 19:14   ` Ingo Molnar
@ 2007-10-10 19:26     ` Michael Tokarev
  2007-10-10 20:04     ` Andi Kleen
  2007-10-10 23:27     ` Krzysztof Halasa
  2 siblings, 0 replies; 20+ messages in thread
From: Michael Tokarev @ 2007-10-10 19:26 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: René Rebe, Linus Torvalds, Linux Kernel Mailing List

Ingo Molnar wrote:
> * René Rebe <rene@exactcode.de> wrote:
> 
>> Hi Linus et al.,
>>
>> 2.6.23 does not build with my usual .config on x86_64 and gcc-4.2.1:
[]
> your superblock build failure would be a new and so far unknown build 
> breakage variant - please send the .config you used, and double-check 
> that it's indeed a vanilla 2.6.23 tree.

It's not a vanilla 2.6.23.  In vanilla 2.6.23 the lines it complains
about do not exist (struct super_block isn't mentioned in mm.h at all).
It's some external patch that used to work with 2.6.22 but needs to be
updated for 2.6.23 - in my case it was unionfs.

/mjt

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Linux 2.6.23
  2007-10-10 19:14   ` Ingo Molnar
  2007-10-10 19:26     ` Michael Tokarev
@ 2007-10-10 20:04     ` Andi Kleen
  2007-10-10 23:27     ` Krzysztof Halasa
  2 siblings, 0 replies; 20+ messages in thread
From: Andi Kleen @ 2007-10-10 20:04 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: René Rebe, Linus Torvalds, Linux Kernel Mailing List

Ingo Molnar <mingo@elte.hu> writes:
 
> your superblock build failure would be a new and so far unknown build 
> breakage variant - please send the .config you used, and double-check 
> that it's indeed a vanilla 2.6.23 tree.

It is not -- my 2.6.23 tree doesn't have the prototype that broke
the build for him.

-Andi

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Linux 2.6.23
  2007-10-10 19:14   ` Ingo Molnar
  2007-10-10 19:26     ` Michael Tokarev
  2007-10-10 20:04     ` Andi Kleen
@ 2007-10-10 23:27     ` Krzysztof Halasa
  2 siblings, 0 replies; 20+ messages in thread
From: Krzysztof Halasa @ 2007-10-10 23:27 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: René Rebe, Linus Torvalds, Linux Kernel Mailing List

Ingo Molnar <mingo@elte.hu> writes:

> - an uncommon embedded config combination: if CONFIG_EMBEDDED=y and
>   CONFIG_BLOCK is unset. (a normally useless combination)

Uncommon but far from useless - may be pure initramfs-based.
-- 
Krzysztof Halasa

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Linux 2.6.23
  2007-10-10 10:14   ` Ingo Molnar
@ 2007-10-11  1:20     ` Nicholas Miell
  2007-10-11  2:34     ` Zhang, Yanmin
  2007-10-11  9:16     ` Nick Piggin
  2 siblings, 0 replies; 20+ messages in thread
From: Nicholas Miell @ 2007-10-11  1:20 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Linus Torvalds, Linux Kernel Mailing List

On Wed, 2007-10-10 at 12:14 +0200, Ingo Molnar wrote:
> * Nicholas Miell <nmiell@comcast.net> wrote:
> 
> > Does CFS still generate the following sysbench graphs with 2.6.23, or 
> > did that get fixed?
> >
> > http://people.freebsd.org/~kris/scaling/linux-pgsql.png 
> > http://people.freebsd.org/~kris/scaling/linux-mysql.png
> 
> as far as my testsystem goes, v2.6.23 beats v2.6.22.9 in sysbench:
> 
>     http://redhat.com/~mingo/misc/sysbench.jpg

That's nice to know. Note that I'm not actually involved in any of these
tests, just a somewhat interested bystander.

> 
> As you can see it in the graph, v2.6.23 schedules much more consistently 
> too. [ v2.6.22 has a small (but potentially statistically insignificant) 
> edge at 4-6 clients, and CFS has a slightly better peak (which is 
> statistically insignificant). ]
> 
> ( Config is at http://redhat.com/~mingo/misc/config, system is Core2Duo
>   1.83 GHz, mysql-5.0.45, glibc-2.6. Nothing fancy either in the config
>   nor in the setup - everything is pretty close to the defaults. )
> 
> i'm aware of a 2.6.21 vs. 2.6.23 sysbench regression report, and it 
> apparently got resolved after various changes to the test environment:
> 
>    http://jeffr-tech.livejournal.com/10103.html
> 
>  " [<CFS>] has virtually no dropoff and performs better under load than
>    the default 2.6.21 scheduler. " (paraphrased)
> 
> (The new link you posted, just a few hours after the release of v2.6.23, 
> has not been reported to lkml before AFAICS - when did you become aware 
> of it? If you learned about it before v2.6.23 it might have been useful 
> to report it to the v2.6.23 regression list.)

According to my IRC logs, Jeffr pasted the URL at Oct 09 22:53:56 PDT.
He says he tried to contact you early in CFS's development, but got no
reply.

> At a quick glance there are no .configs or other testing details at or 
> around that URL that i could use to reproduce their result precisely, so 
> at least a minimal bugreport would be nice.
> 

AFAICT, the configuration is described in
http://people.freebsd.org/~kris/scaling/mysql.html


-- 
Nicholas Miell <nmiell@comcast.net>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Linux 2.6.23
  2007-10-10 10:14   ` Ingo Molnar
  2007-10-11  1:20     ` Nicholas Miell
@ 2007-10-11  2:34     ` Zhang, Yanmin
  2007-10-11 13:32       ` Ingo Molnar
  2007-10-11  9:16     ` Nick Piggin
  2 siblings, 1 reply; 20+ messages in thread
From: Zhang, Yanmin @ 2007-10-11  2:34 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Nicholas Miell, Linus Torvalds, Linux Kernel Mailing List

On Wed, 2007-10-10 at 12:14 +0200, Ingo Molnar wrote:
> * Nicholas Miell <nmiell@comcast.net> wrote:
> 
> > Does CFS still generate the following sysbench graphs with 2.6.23, or 
> > did that get fixed?
> >
> > http://people.freebsd.org/~kris/scaling/linux-pgsql.png 
> > http://people.freebsd.org/~kris/scaling/linux-mysql.png
I also reproduced the same issue on a couple of machines.

> 
> as far as my testsystem goes, v2.6.23 beats v2.6.22.9 in sysbench:
> 
>     http://redhat.com/~mingo/misc/sysbench.jpg
> 
> As you can see it in the graph, v2.6.23 schedules much more consistently 
> too. [ v2.6.22 has a small (but potentially statistically insignificant) 
> edge at 4-6 clients, and CFS has a slightly better peak (which is 
> statistically insignificant). ]
> 
> ( Config is at http://redhat.com/~mingo/misc/config, system is Core2Duo
>   1.83 GHz, mysql-5.0.45, glibc-2.6. Nothing fancy either in the config
>   nor in the setup - everything is pretty close to the defaults. )
I used the Fedora Core 8 Test2 distribution, whose glibc-2.6.90-13 already
fixes the old malloc scalability issue. The machine has two physical
2.66 GHz quad-core processors, 8 cores in total. The regression is about 28%.


> 
> i'm aware of a 2.6.21 vs. 2.6.23 sysbench regression report, and it 
> apparently got resolved after various changes to the test environment:
> 
>    http://jeffr-tech.livejournal.com/10103.html
> 
>  " [<CFS>] has virtually no dropoff and performs better under load than
>    the default 2.6.21 scheduler. " (paraphrased)
> 
> (The new link you posted, just a few hours after the release of v2.6.23, 
> has not been reported to lkml before AFAICS - when did you become aware 
> of it? If you learned about it before v2.6.23 it might have been useful 
> to report it to the v2.6.23 regression list.)
I tested 2.6.22 and all 2.6.23-rc kernels. All 2.6.23-rc kernels show the
same regression, and the results are stable.

> At a quick glance there are no .configs or other testing details at or 
> around that URL that i could use to reproduce their result precisely, so 
> at least a minimal bugreport would be nice.
Command line used to run the test:
#sysbench --test=oltp --mysql-user=root --mysql-db=mysql --max-time=120
--max-requests=0 --oltp-read-only=on --num-threads=16 run
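
To get a scaling curve rather than a single data point, the same command
could be repeated over several thread counts. A minimal sketch, assuming
sysbench is on PATH and using only the flags from the command line above;
the thread counts and function names are made up for illustration:

```python
import subprocess

# Flags copied from the command line above; only --num-threads varies.
BASE_ARGS = [
    "sysbench", "--test=oltp", "--mysql-user=root", "--mysql-db=mysql",
    "--max-time=120", "--max-requests=0", "--oltp-read-only=on",
]


def sysbench_argv(num_threads):
    """Build the argument vector for one run with the given thread count."""
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")
    return BASE_ARGS + [f"--num-threads={num_threads}", "run"]


def run_sweep(thread_counts=(1, 2, 4, 8, 16, 32)):
    """Run each configuration in turn; return {threads: raw sysbench output}."""
    results = {}
    for n in thread_counts:
        proc = subprocess.run(sysbench_argv(n), capture_output=True,
                              text=True, check=True)
        results[n] = proc.stdout
    return results
```

The output is kept as raw text on purpose: the report format differs
between sysbench versions, so extracting the transactions/sec figure is
left to whoever runs it.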

> In any case, here are a few general comments about sysbench numbers:
> 
> Sysbench is a pretty 'batched' workload: it benefits most from batchy 
> scheduling: the client doing as much work as it can, then server doing 
> as much work as it can - and so on. The longer the client can work the 
> more cache-efficient the workload is. Any round-trip to the server due 
> to pesky preemption only blows up the cache footprint of the workload 
> and gives lower throughput.
> 
> This kind of workload would probably run best on DOS or Windows 3.11, 
> with no preemptive scheduling done at all. In other words: run both 
> mysqld and the client as SCHED_FIFO to get the best performance out of 
> it. So in that sense the workload is a bit similar to dbench.
> 
> The other thing is that mysqld does _tons_ of sys_time() calls, so GTOD 
> differences between .22 and .23 might cause extra overhead - especially 
> with 8 CPUs/cores. Does the sys_time() scalability patch below improve 
> sysbench performance for you? (i'm not sure about psqld)
> 
> If it's indeed due to batched vs. well-spread-out scheduling behavior 
> (which is possible), there are a few things you could do to make 
> scheduling more batched:
> 
> 1) start the DB daemon up as SCHED_BATCH:
> 
>      schedtool -B -e service mysqld restart
> 
>    (and do the same with the client-side commands as well)
> 
>    or:
> 
>        schedtool -B $$
> 
>    to mark the parent shell as SCHED_BATCH - then start up the DB and 
>    start the client workload. (All other tasks not started from this 
>    shell will still be SCHED_OTHER, so only your mysql workload will be 
>    affected.) For example "beagled" already runs under SCHED_BATCH by 
>    default.
> 
>    SCHED_BATCH will cause the scheduler to batch up the workload more. 
>    You basically tell the scheduler: "this workload really wants
>    throughput above all", and the scheduler takes that hint and acts 
>    upon it. (it's still not as drastic as SCHED_FIFO, it's somewhere 
>    between SCHED_OTHER and SCHED_FIFO, in terms of batching. Start up 
>    your DB and your client as SCHED_FIFO via "schedtool -F -p 10 ..." to 
>    establish the best-case batching win.)
> 
> 2) check out the v22 CFS backport patch which has the latest & greatest 
>    scheduler code, from http://people.redhat.com/mingo/cfs-scheduler/ . 
>    Does performance go up for you with it? It's somewhat less
>    preemption-eager, which might well make the crucial difference for
>    sysbench.
> 
> 3) if it's enabled, disable CONFIG_PREEMPT=y. CONFIG_PREEMPT can cause
>    unwanted overscheduling and cache-thrashing under overload.
Below are the PREEMPT settings from my kernel config file.

CONFIG_SCHED_SMT=y
CONFIG_SCHED_MC=y
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT is not set
CONFIG_PREEMPT_BKL=y
# CONFIG_NUMA is not set


-yanmin

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Linux 2.6.23
  2007-10-10 10:14   ` Ingo Molnar
  2007-10-11  1:20     ` Nicholas Miell
  2007-10-11  2:34     ` Zhang, Yanmin
@ 2007-10-11  9:16     ` Nick Piggin
  2007-10-12  5:46       ` Ingo Molnar
  2 siblings, 1 reply; 20+ messages in thread
From: Nick Piggin @ 2007-10-11  9:16 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Nicholas Miell, Linus Torvalds, Linux Kernel Mailing List

On Wednesday 10 October 2007 20:14, Ingo Molnar wrote:
> * Nicholas Miell <nmiell@comcast.net> wrote:
> > Does CFS still generate the following sysbench graphs with 2.6.23, or
> > did that get fixed?
> >
> > http://people.freebsd.org/~kris/scaling/linux-pgsql.png
> > http://people.freebsd.org/~kris/scaling/linux-mysql.png
>
> as far as my testsystem goes, v2.6.23 beats v2.6.22.9 in sysbench:
>
>     http://redhat.com/~mingo/misc/sysbench.jpg
>
> As you can see it in the graph, v2.6.23 schedules much more consistently
> too. [ v2.6.22 has a small (but potentially statistically insignificant)
> edge at 4-6 clients, and CFS has a slightly better peak (which is
> statistically insignificant). ]
>
> ( Config is at http://redhat.com/~mingo/misc/config, system is Core2Duo
>   1.83 GHz, mysql-5.0.45, glibc-2.6. Nothing fancy either in the config
>   nor in the setup - everything is pretty close to the defaults. )
>
> i'm aware of a 2.6.21 vs. 2.6.23 sysbench regression report, and it
> apparently got resolved after various changes to the test environment:
>
>    http://jeffr-tech.livejournal.com/10103.html
>
>  " [<CFS>] has virtually no dropoff and performs better under load than
>    the default 2.6.21 scheduler. " (paraphrased)

;) I think you snipped the important bit:

"the peak is terrible but it has virtually no dropoff and performs
better under load than the default 2.6.21 scheduler." (verbatim)

The dropoff under load was due to trivially avoided mmap_sem
contention in the kernel and glibc (and not-very-scalable mysql
heap locking), rather than specifically anything the scheduler
was doing wrong, I think (when the scheduler chose to start
preempting threads holding locks, then performance would tank.
Exactly when that point was reached, and what happens afterwards
was probably just luck.)

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Linux 2.6.23
  2007-10-11  2:34     ` Zhang, Yanmin
@ 2007-10-11 13:32       ` Ingo Molnar
  0 siblings, 0 replies; 20+ messages in thread
From: Ingo Molnar @ 2007-10-11 13:32 UTC (permalink / raw)
  To: Zhang, Yanmin; +Cc: Nicholas Miell, Linus Torvalds, Linux Kernel Mailing List


* Zhang, Yanmin <yanmin_zhang@linux.intel.com> wrote:

> > ( Config is at http://redhat.com/~mingo/misc/config, system is Core2Duo
> >   1.83 GHz, mysql-5.0.45, glibc-2.6. Nothing fancy either in the config
> >   nor in the setup - everything is pretty close to the defaults. )
>
> I used the Fedora Core 8 Test2 distribution, whose glibc-2.6.90-13
> already fixes the old malloc scalability issue. The machine has two
> physical 2.66 GHz quad-core processors, 8 cores in total. The
> regression is about 28%.

thanks for confirming this! I've updated glibc and mysql and now i can 
reproduce something similar. (I have a theory about the reason for this 
regression, and i'm working on a test-patch.)

	Ingo

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Linux 2.6.23
  2007-10-12  5:46       ` Ingo Molnar
@ 2007-10-11 14:15         ` Nick Piggin
  2007-10-12 12:21         ` Bill Davidsen
  1 sibling, 0 replies; 20+ messages in thread
From: Nick Piggin @ 2007-10-11 14:15 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Nicholas Miell, Linus Torvalds, Linux Kernel Mailing List

On Friday 12 October 2007 15:46, Ingo Molnar wrote:
> * Nick Piggin <nickpiggin@yahoo.com.au> wrote:
> > ;) I think you snipped the important bit:
> >
> > "the peak is terrible but it has virtually no dropoff and performs
> > better under load than the default 2.6.21 scheduler." (verbatim)
>
> hm, i understood that peak remark to be in reference to FreeBSD's
> scheduler (which the FreeBSD guys are primarily interested in
> obviously), not v2.6.21 - but i could be wrong.

I think the Linux peak has always been roughly as good as their
best FreeBSD ones (eg. http://people.freebsd.org/~jeff/sysbench.png).
Obviously in that graph, Linux sucks because of the malloc/mmap_sem
issue. It also shows what he is calling the terrible CFS peak, I
guess.

In my own tests, after that was fixed, Linux's peak got even a bit
higher, so that's the benchmark for performance.


> In any case, there is indeed a regression with sysbench and a low number
> of threads, and it's being fixed. The peak got improved visibly in
> sched-devel:
>
>   http://people.redhat.com/mingo/misc/sysbench-sched-devel.jpg
>
> but there is still some peak regression left, i'm testing a patch for
> that.

OK good. Once that's fixed, we'll hopefully be competitive with
FreeBSD again in this test :)

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Linux 2.6.23
  2007-10-11  9:16     ` Nick Piggin
@ 2007-10-12  5:46       ` Ingo Molnar
  2007-10-11 14:15         ` Nick Piggin
  2007-10-12 12:21         ` Bill Davidsen
  0 siblings, 2 replies; 20+ messages in thread
From: Ingo Molnar @ 2007-10-12  5:46 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Nicholas Miell, Linus Torvalds, Linux Kernel Mailing List


* Nick Piggin <nickpiggin@yahoo.com.au> wrote:

> ;) I think you snipped the important bit:
> 
> "the peak is terrible but it has virtually no dropoff and performs 
> better under load than the default 2.6.21 scheduler." (verbatim)

hm, i understood that peak remark to be in reference to FreeBSD's 
scheduler (which the FreeBSD guys are primarily interested in 
obviously), not v2.6.21 - but i could be wrong.

In any case, there is indeed a regression with sysbench and a low number 
of threads, and it's being fixed. The peak got improved visibly in 
sched-devel:

  http://people.redhat.com/mingo/misc/sysbench-sched-devel.jpg

but there is still some peak regression left, i'm testing a patch for 
that.

	Ingo

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Linux 2.6.23
  2007-10-12  5:46       ` Ingo Molnar
  2007-10-11 14:15         ` Nick Piggin
@ 2007-10-12 12:21         ` Bill Davidsen
  1 sibling, 0 replies; 20+ messages in thread
From: Bill Davidsen @ 2007-10-12 12:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Nick Piggin, Nicholas Miell, Linus Torvalds,
	Linux Kernel Mailing List

Ingo Molnar wrote:
> * Nick Piggin <nickpiggin@yahoo.com.au> wrote:
> 
>> ;) I think you snipped the important bit:
>>
>> "the peak is terrible but it has virtually no dropoff and performs 
>> better under load than the default 2.6.21 scheduler." (verbatim)
> 
> hm, i understood that peak remark to be in reference to FreeBSD's 
> scheduler (which the FreeBSD guys are primarily interested in 
> obviously), not v2.6.21 - but i could be wrong.
> 
> In any case, there is indeed a regression with sysbench and a low number 
> of threads, and it's being fixed. The peak got improved visibly in 
> sched-devel:
> 
>   http://people.redhat.com/mingo/misc/sysbench-sched-devel.jpg
> 
> but there is still some peak regression left, i'm testing a patch for 
> that.
> 
There's one important bit missing from that graph: the 
2.6.23-SCHED_BATCH values. Without them we can't tell how much of the 
improvement comes from sched-devel and how much from SCHED_BATCH. Clearly 
2.6.23 is better than 2.6.22.any in this test; the locking issues seem 
to dominate that difference to the point that nothing else would be 
informative.

This weekend I have to do some building of kernels for various machines, 
so I intend to run some builds SCHED_BATCH and some will just run. If I 
find anything interesting I'll report.
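
For anyone wanting to do the same comparison without schedtool, a child
process can be started under SCHED_BATCH directly. A minimal sketch,
assuming Linux and Python 3.3 or later (where os.sched_setscheduler and
os.SCHED_BATCH exist); run_batched is a hypothetical helper, and it
quietly falls back to the default policy if the call is unavailable or
refused:

```python
import os
import subprocess


def _enter_sched_batch():
    # Runs in the child between fork() and exec(), so only the child
    # (and whatever it spawns) changes scheduling policy.
    try:
        os.sched_setscheduler(0, os.SCHED_BATCH, os.sched_param(0))
    except (AttributeError, OSError):
        # Not Linux, or the kernel refused: keep the inherited policy.
        pass


def run_batched(argv):
    """Run argv with SCHED_BATCH requested; return the exit status."""
    return subprocess.run(argv, preexec_fn=_enter_sched_batch).returncode
```

Timing run_batched(["make", "bzImage"]) against a plain
subprocess.run(["make", "bzImage"]) on the same tree would give the two
data points; the make target here is only an example.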

-- 
Bill Davidsen <davidsen@tmr.com>
   "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot


^ permalink raw reply	[flat|nested] 20+ messages in thread

Thread overview: 20+ messages
2007-10-09 20:54 Linux 2.6.23 Linus Torvalds
2007-10-10  6:12 ` Nicholas Miell
2007-10-10 10:14   ` Ingo Molnar
2007-10-11  1:20     ` Nicholas Miell
2007-10-11  2:34     ` Zhang, Yanmin
2007-10-11 13:32       ` Ingo Molnar
2007-10-11  9:16     ` Nick Piggin
2007-10-12  5:46       ` Ingo Molnar
2007-10-11 14:15         ` Nick Piggin
2007-10-12 12:21         ` Bill Davidsen
2007-10-10  7:44 ` René Rebe
2007-10-10  8:37   ` Alexey Dobriyan
2007-10-10  9:12     ` Michael Tokarev
2007-10-10 10:36       ` Alexey Dobriyan
2007-10-10 10:53         ` Jan Engelhardt
2007-10-10 11:13           ` Michael Tokarev
2007-10-10 19:14   ` Ingo Molnar
2007-10-10 19:26     ` Michael Tokarev
2007-10-10 20:04     ` Andi Kleen
2007-10-10 23:27     ` Krzysztof Halasa
