public inbox for linux-kernel@vger.kernel.org
* Performance of -mm2 and -mm4
@ 2004-08-23 16:58 Martin J. Bligh
  2004-08-23 21:31 ` Jesse Barnes
  2004-08-24  3:23 ` Nick Piggin
  0 siblings, 2 replies; 7+ messages in thread
From: Martin J. Bligh @ 2004-08-23 16:58 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Nick Piggin


Kernbench: (make -j vmlinux, maximal tasks)
                              Elapsed      System        User         CPU
                  2.6.8.1       43.90       87.76      572.94     1505.67
              2.6.8.1-mm1       44.26       87.71      574.73     1496.33
              2.6.8.1-mm2       44.27       90.27      574.84     1502.33
              2.6.8.1-mm4       45.87       97.60      595.23     1510.00

mm2 seems to take slightly (but consistently) more systime than mm1, and
mm4 is significantly worse still ;-(
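
For scale, the system-time regression from -mm1 to -mm4 in the table works out to about 11% (computed directly from the figures above):

```shell
# Percent change in kernbench system time, 2.6.8.1-mm1 (87.71s) -> -mm4 (97.60s):
awk 'BEGIN { printf "%.1f%%\n", (97.60 - 87.71) / 87.71 * 100 }'   # → 11.3%
```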

diffprofile from mm1 to mm2:

      5469  32170.6% find_get_page
       785    13.0% __d_lookup
       476     0.4% total
       128    21.9% generic_file_open
        93    22.5% __alloc_pages
        62    26.1% file_ra_state_init
        58     0.0% put_page
        54     9.9% dput
...
       -51    -2.8% finish_task_switch
       -55    -4.3% __wake_up
       -67    -6.6% file_move
      -128    -1.7% __copy_to_user_ll
      -156    -1.1% do_anonymous_page
     -2189    -4.7% default_idle
     -3632  -100.0% find_trylock_page

and -mm1 to -mm4

      5841  34358.8% find_get_page
      5394     4.0% total
      1459    24.2% __d_lookup
       740     9.2% __copy_from_user_ll
       718     9.5% __copy_to_user_ll
       304    24.0% __wake_up
       253    20.2% free_hot_cold_page
       248    19.8% atomic_dec_and_lock
       229    42.2% dput
       228    13.7% path_lookup
       226    43.0% Letext
       202    34.6% generic_file_open
       197    23.6% pte_alloc_one
       194    77.0% pgd_ctor
       180    22.6% kmem_cache_free
       173     6.5% zap_pte_range
       170     9.2% buffered_rmqueue
       146    16.1% in_group_p
       123    22.3% __fput
...
       -56   -23.5% file_ra_state_init
       -72   -31.7% page_add_anon_rmap
      -104    -5.6% finish_task_switch
      -124    -0.9% do_anonymous_page
     -3633  -100.0% find_trylock_page
     -4636   -10.0% default_idle

The -mm4 looks more like sched stuff to me (copy_to/from_user, etc),
but the -mm2 stuff looks like something else. Buggered if I know what.
-mm3 didn't compile cleanly, so I didn't bother, but I prob can if you
like.
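
Note that the eye-popping find_get_page percentages come from a tiny base: the percentage column is just delta over the old tick count, so +5469 ticks at +32170.6% implies only about 17 ticks in the -mm1 profile (the same ~17-tick base falls out of the -mm4 figure):

```shell
# diffprofile-style percentage: delta ticks over the old tick count.
# 5469 new ticks against a base of ~17, inferred from the +32170.6% figure:
awk 'BEGIN { printf "+%.1f%%\n", 5469 / 17 * 100 }'   # → +32170.6%
```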

m.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Performance of -mm2 and -mm4
  2004-08-23 16:58 Performance of -mm2 and -mm4 Martin J. Bligh
@ 2004-08-23 21:31 ` Jesse Barnes
  2004-08-24  0:41   ` Nick Piggin
  2004-08-24  3:23 ` Nick Piggin
  1 sibling, 1 reply; 7+ messages in thread
From: Jesse Barnes @ 2004-08-23 21:31 UTC (permalink / raw)
  To: Martin J. Bligh; +Cc: Andrew Morton, linux-kernel, Nick Piggin

On Monday, August 23, 2004 9:58 am, Martin J. Bligh wrote:
> The -mm4 looks more like sched stuff to me (copy_to/from_user, etc),
> but the -mm2 stuff looks like something else. Buggered if I know what.
> -mm3 didn't compile cleanly, so I didn't bother, but I prob can if you
> like.

If you suspect the scheduler, you could try bumping SD_NODES_PER_DOMAIN in 
kernel/sched.c to a larger value (e.g. the number of nodes in your system).  
That'll make the scheduler balance more aggressively across the whole system.
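
Since SD_NODES_PER_DOMAIN is a compile-time constant, trying this means editing the define and rebuilding. A sketch of the edit using a stand-in file (the real file is kernel/sched.c; the starting value shown here is an assumption, not the actual -mm default):

```shell
# Stand-in for kernel/sched.c, just to illustrate the edit; the initial
# value is an assumption, not the real 2.6.8.1-mm default.
mkdir -p /tmp/ksrc
printf '#define SD_NODES_PER_DOMAIN 6\n' > /tmp/ksrc/sched.c

# Bump the constant to the machine's node count (16 used as an example);
# the kernel must then be rebuilt for the change to take effect.
sed -i 's/^#define SD_NODES_PER_DOMAIN.*/#define SD_NODES_PER_DOMAIN 16/' /tmp/ksrc/sched.c
grep 'SD_NODES_PER_DOMAIN' /tmp/ksrc/sched.c   # → #define SD_NODES_PER_DOMAIN 16
```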

Jesse


* Re: Performance of -mm2 and -mm4
  2004-08-23 21:31 ` Jesse Barnes
@ 2004-08-24  0:41   ` Nick Piggin
  2004-08-24  1:29     ` Con Kolivas
  0 siblings, 1 reply; 7+ messages in thread
From: Nick Piggin @ 2004-08-24  0:41 UTC (permalink / raw)
  To: Jesse Barnes; +Cc: Martin J. Bligh, Andrew Morton, linux-kernel



Jesse Barnes wrote:

>On Monday, August 23, 2004 9:58 am, Martin J. Bligh wrote:
>
>>The -mm4 looks more like sched stuff to me (copy_to/from_user, etc),
>>but the -mm2 stuff looks like something else. Buggered if I know what.
>>-mm3 didn't compile cleanly, so I didn't bother, but I prob can if you
>>like.
>>
>
>If you suspect the scheduler, you could try bumping SD_NODES_PER_DOMAIN in 
>kernel/sched.c to a larger value (e.g. the number of nodes in your system).  
>That'll make the scheduler balance more aggressively across the whole system.
>
>

Try increasing /proc/sys/kernel/base_timeslice as well.
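
For anyone following along: base_timeslice is a sysctl added by the nicksched patch in these -mm trees only, not mainline, so the tweak below assumes one of those kernels is booted (units and semantics are nicksched-specific):

```shell
# Runtime tunable from the nicksched patch in 2.6.8.1-mm; not in mainline.
cat /proc/sys/kernel/base_timeslice          # current value (64 in these runs)
echo 256 > /proc/sys/kernel/base_timeslice   # longer timeslices, less switching
```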



* Re: Performance of -mm2 and -mm4
  2004-08-24  0:41   ` Nick Piggin
@ 2004-08-24  1:29     ` Con Kolivas
  2004-08-25 22:29       ` Martin J. Bligh
  0 siblings, 1 reply; 7+ messages in thread
From: Con Kolivas @ 2004-08-24  1:29 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Jesse Barnes, Martin J. Bligh, Andrew Morton, linux-kernel

Nick Piggin writes:

> 
> 
> Jesse Barnes wrote:
> 
>>On Monday, August 23, 2004 9:58 am, Martin J. Bligh wrote:
>>
>>>The -mm4 looks more like sched stuff to me (copy_to/from_user, etc),
>>>but the -mm2 stuff looks like something else. Buggered if I know what.
>>>-mm3 didn't compile cleanly, so I didn't bother, but I prob can if you
>>>like.
>>>
>>
>>If you suspect the scheduler, you could try bumping SD_NODES_PER_DOMAIN in 
>>kernel/sched.c to a larger value (e.g. the number of nodes in your system).  
>>That'll make the scheduler balance more aggressively across the whole system.
>>
>>
> 
> Try increasing /proc/sys/kernel/base_timeslice as well.

Or back out nicksched.patch


* Re: Performance of -mm2 and -mm4
  2004-08-23 16:58 Performance of -mm2 and -mm4 Martin J. Bligh
  2004-08-23 21:31 ` Jesse Barnes
@ 2004-08-24  3:23 ` Nick Piggin
  2004-08-24  3:26   ` Martin J. Bligh
  1 sibling, 1 reply; 7+ messages in thread
From: Nick Piggin @ 2004-08-24  3:23 UTC (permalink / raw)
  To: Martin J. Bligh; +Cc: Andrew Morton, linux-kernel



Martin J. Bligh wrote:

>Kernbench: (make -j vmlinux, maximal tasks)
>                              Elapsed      System        User         CPU
>                  2.6.8.1       43.90       87.76      572.94     1505.67
>              2.6.8.1-mm1       44.26       87.71      574.73     1496.33
>              2.6.8.1-mm2       44.27       90.27      574.84     1502.33
>              2.6.8.1-mm4       45.87       97.60      595.23     1510.00
>
>mm2 seems to take slightly (but consistently) more systime than mm1, and
>mm4 is significantly worse still ;-(
>
>

Increasing base_timeslice here takes about 10s off the user time,
and maybe 1-2 off elapsed. You may see a better improvement because
the machine I'm testing on has very small caches; I assume you are
using a 32-way NUMAQ with 1-2MB caches?



* Re: Performance of -mm2 and -mm4
  2004-08-24  3:23 ` Nick Piggin
@ 2004-08-24  3:26   ` Martin J. Bligh
  0 siblings, 0 replies; 7+ messages in thread
From: Martin J. Bligh @ 2004-08-24  3:26 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Andrew Morton, linux-kernel

> Martin J. Bligh wrote:
> 
>> Kernbench: (make -j vmlinux, maximal tasks)
>>                              Elapsed      System        User         CPU
>>                  2.6.8.1       43.90       87.76      572.94     1505.67
>>              2.6.8.1-mm1       44.26       87.71      574.73     1496.33
>>              2.6.8.1-mm2       44.27       90.27      574.84     1502.33
>>              2.6.8.1-mm4       45.87       97.60      595.23     1510.00
>> 
>> mm2 seems to take slightly (but consistently) more systime than mm1, and
>> mm4 is significantly worse still ;-(
>> 
>> 
> 
> Increasing base_timeslice here takes about 10s off the user time,
> and maybe 1-2 off elapsed. You may see a better improvement because
> the machine I'm testing on has very small caches; I assume you are
> using a 32-way NUMAQ with 1-2MB caches?

16-way with 2MB caches. Doing 256 as opposed to 64 gives a little less
user time, more systime at the low end, and a wash with more tasks.
Not much affects elapsed though. I'll try 16, then backing out the 
sched patch, and what Jesse suggested as well.

M.



* Re: Performance of -mm2 and -mm4
  2004-08-24  1:29     ` Con Kolivas
@ 2004-08-25 22:29       ` Martin J. Bligh
  0 siblings, 0 replies; 7+ messages in thread
From: Martin J. Bligh @ 2004-08-25 22:29 UTC (permalink / raw)
  To: Con Kolivas, Nick Piggin; +Cc: Jesse Barnes, Andrew Morton, linux-kernel

>>>> The -mm4 looks more like sched stuff to me (copy_to/from_user, etc),
>>>> but the -mm2 stuff looks like something else. Buggered if I know what.
>>>> -mm3 didn't compile cleanly, so I didn't bother, but I prob can if you
>>>> like.
>>>> 
>>> 
>>> If you suspect the scheduler, you could try bumping SD_NODES_PER_DOMAIN in 
>>> kernel/sched.c to a larger value (e.g. the number of nodes in your system).  
>>> That'll make the scheduler balance more aggressively across the whole system.
>>> 
>>> 
>> 
>> Try increasing /proc/sys/kernel/base_timeslice as well.
> 
> Or back out nicksched.patch

Yeah, that mostly fixed it.

Kernbench: (make -j N vmlinux, where N = 16 x num_cpus)
                              Elapsed      System        User         CPU
                  2.6.8.1       44.82       97.19      574.55     1497.33
              2.6.8.1-mm4       46.82      107.47      594.15     1497.33
           2.6.8.1-mm4-nn       44.93       96.33      576.44     1496.33

Kernbench: (make -j vmlinux, maximal tasks)
                              Elapsed      System        User         CPU
                  2.6.8.1       43.90       87.76      572.94     1505.67
              2.6.8.1-mm4       45.87       97.60      595.23     1510.00
           2.6.8.1-mm4-nn       44.53       90.71      575.68     1495.67


