public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* 2.6.19-rc1: Slowdown in lmbench's fork
@ 2006-11-02 16:44 Tim Chen
  2006-11-02 18:33 ` Eric W. Biederman
  0 siblings, 1 reply; 11+ messages in thread
From: Tim Chen @ 2006-11-02 16:44 UTC (permalink / raw)
  To: linux-kernel; +Cc: ebiederm


After introduction of the following patch:

[PATCH] genirq: x86_64 irq: make vector_irq per cpu
http://kernel.org/git/?
p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=550f2299ac8ffaba943cf211380d3a8d3fa75301

we see the fork benchmark in lmbench-3.0-a7 slow down by
11.5% on a two-socket Woodcrest machine.  A similar change
is also seen on other SMP Xeon machines.

When running lmbench, we chose the lmbench option
to pin the parent and child on different processor cores.

The overhead of calling sched_setaffinity to place each process
on its processor is included in lmbench's fork time measurement.
The patch may play a role in increasing this.

The two follow-up patches to the original "make vector_irq per cpu"
patch did not affect the fork time:

[PATCH] x86_64 irq: Properly update vector_irq
[PATCH] x86-64: Only look at per_cpu data for online ...

Tim

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: 2.6.19-rc1: Slowdown in lmbench's fork
  2006-11-02 16:44 2.6.19-rc1: Slowdown in lmbench's fork Tim Chen
@ 2006-11-02 18:33 ` Eric W. Biederman
  2006-11-02 18:34   ` Tim Chen
  0 siblings, 1 reply; 11+ messages in thread
From: Eric W. Biederman @ 2006-11-02 18:33 UTC (permalink / raw)
  To: tim.c.chen; +Cc: linux-kernel

Tim Chen <tim.c.chen@linux.intel.com> writes:

> After introduction of the following patch:
>
> [PATCH] genirq: x86_64 irq: make vector_irq per cpu
> http://kernel.org/git/?
> p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=550f2299ac8ffaba943cf211380d3a8d3fa75301
>
> we see fork benchmark in lmbench-3.0-a7 slowed by 
> 11.5% on a 2 socket woodcrest machine.  Similar change
> is seen also on other SMP Xeon machines.
>
> When running lmbench, we have chosen the lmbench option
> to pin parent and child on different processor cores 
>   
> Overhead of calling sched_setaffinity to place the process 
> on processor is included in lmbench's fork time measurement. 
> The patch may play a role in increasing this.

The only thing I can think of is that because data structures
moved around we may be seeing some more cache misses.  If we
were talking about normal interrupts taking a little longer, I could
see my change having a direct correlation.  But I don't believe
anything I touched in that patch affects the IPI path.

I did add some things to the per cpu area and expanded it a little,
which may be what you are seeing.

That feels like a significant increase in fork times.  I will think
about it and holler if I can think of a productive direction to try.

My only partial guess is that it might be worth adding the per cpu
variables my patch adds, without any of the corresponding code changes,
and seeing whether adding variables to the per cpu area is what is
causing the change.

The two tests I can see along this line are:
- to add the percpu vector_irq variable.
- to increase NR_IRQs.

My suspicion is that one of those two changes alone will change
things enough that you see your lmbench slowdown.  If that is the
case then it is probably worth shuffling around the variables in the
per cpu area to get better cache line affinity.

Eric

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: 2.6.19-rc1: Slowdown in lmbench's fork
  2006-11-02 18:33 ` Eric W. Biederman
@ 2006-11-02 18:34   ` Tim Chen
  2006-11-03  2:11     ` 2.6.19-rc1: x86_64 slowdown " Adrian Bunk
  0 siblings, 1 reply; 11+ messages in thread
From: Tim Chen @ 2006-11-02 18:34 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: linux-kernel

On Thu, 2006-11-02 at 11:33 -0700, Eric W. Biederman wrote:

> 
> My only partial guess is that it might be worth adding the per cpu
> variables my patch adds without any of the corresponding code changes.
> And see if adding variables to the per cpu area is what is causing the
> change.
> 
> The two tests I can see in this line are:
> - to add the percpu vector_irq variable.
> - to increase NR_IRQs.

Increasing the NR_IRQs resulted in the regression.
Adding the percpu vector_irq variable did not cause any changes.  

Tim

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: 2.6.19-rc1: x86_64 slowdown in lmbench's fork
  2006-11-02 18:34   ` Tim Chen
@ 2006-11-03  2:11     ` Adrian Bunk
  2006-11-03  9:08       ` Eric W. Biederman
  2006-11-03 16:10       ` Tim Chen
  0 siblings, 2 replies; 11+ messages in thread
From: Adrian Bunk @ 2006-11-03  2:11 UTC (permalink / raw)
  To: Tim Chen; +Cc: Eric W. Biederman, linux-kernel, ak, discuss

On Thu, Nov 02, 2006 at 10:34:13AM -0800, Tim Chen wrote:
> On Thu, 2006-11-02 at 11:33 -0700, Eric W. Biederman wrote:
> 
> > My only partial guess is that it might be worth adding the per cpu
> > variables my patch adds without any of the corresponding code changes.
> > And see if adding variables to the per cpu area is what is causing the
> > change.
> > 
> > The two tests I can see in this line are:
> > - to add the percpu vector_irq variable.
> > - to increase NR_IRQs.
> 
> Increasing the NR_IRQs resulted in the regression.
>...

What's your CONFIG_NR_CPUS setting that you are seeing such a big
regression?

> Tim

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: 2.6.19-rc1: x86_64 slowdown in lmbench's fork
  2006-11-03  2:11     ` 2.6.19-rc1: x86_64 slowdown " Adrian Bunk
@ 2006-11-03  9:08       ` Eric W. Biederman
  2006-11-03 16:10       ` Tim Chen
  1 sibling, 0 replies; 11+ messages in thread
From: Eric W. Biederman @ 2006-11-03  9:08 UTC (permalink / raw)
  To: Adrian Bunk; +Cc: Tim Chen, linux-kernel, ak, discuss

Adrian Bunk <bunk@stusta.de> writes:

> On Thu, Nov 02, 2006 at 10:34:13AM -0800, Tim Chen wrote:
>> On Thu, 2006-11-02 at 11:33 -0700, Eric W. Biederman wrote:
>> 
>> > My only partial guess is that it might be worth adding the per cpu
>> > variables my patch adds without any of the corresponding code changes.
>> > And see if adding variables to the per cpu area is what is causing the
>> > change.
>> > 
>> > The two tests I can see in this line are:
>> > - to add the percpu vector_irq variable.
>> > - to increase NR_IRQs.
>> 
>> Increasing the NR_IRQs resulted in the regression.
>>...
>
> What's your CONFIG_NR_CPUS setting that you are seeing such a big
> regression?

Also, could we see the section of System.map that deals with
per cpu variables?

I believe there are some counters for processes and the like
just below kstat whose size increase is causing you real
problems.

Ugh.  I just looked at include/linux/kernel_stat.h.
kstat has the per cpu irq counters and all of the cpu process
time accounting, so it is quite likely that we are going to be
touching this structure, plus the run queues and the process counts,
during a fork.  All of which are now potentially much more spread out.

Also, has anyone else reproduced this problem yet?

I don't doubt that it exists but having a few more data points or
eyeballs on the problem couldn't hurt.

Eric

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: 2.6.19-rc1: x86_64 slowdown in lmbench's fork
  2006-11-03  2:11     ` 2.6.19-rc1: x86_64 slowdown " Adrian Bunk
  2006-11-03  9:08       ` Eric W. Biederman
@ 2006-11-03 16:10       ` Tim Chen
  2006-11-03 17:35         ` Eric W. Biederman
  1 sibling, 1 reply; 11+ messages in thread
From: Tim Chen @ 2006-11-03 16:10 UTC (permalink / raw)
  To: Adrian Bunk; +Cc: Eric W. Biederman, linux-kernel, ak, discuss

On Fri, 2006-11-03 at 03:11 +0100, Adrian Bunk wrote:

> 
> What's your CONFIG_NR_CPUS setting that you are seeing such a big
> regression?
> 

CONFIG_NR_CPUS is set to 8.  

Tim

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: 2.6.19-rc1: x86_64 slowdown in lmbench's fork
  2006-11-03 16:10       ` Tim Chen
@ 2006-11-03 17:35         ` Eric W. Biederman
  2006-11-03 17:47           ` Andi Kleen
  0 siblings, 1 reply; 11+ messages in thread
From: Eric W. Biederman @ 2006-11-03 17:35 UTC (permalink / raw)
  To: tim.c.chen; +Cc: Adrian Bunk, linux-kernel, ak, discuss

Tim Chen <tim.c.chen@linux.intel.com> writes:

> On Fri, 2006-11-03 at 03:11 +0100, Adrian Bunk wrote:
>
>> 
>> What's your CONFIG_NR_CPUS setting that you are seeing such a big
>> regression?
>> 
>
> CONFIG_NR_CPUS is set to 8.  

Ugh.  This simply changes NR_IRQS from 256 to 512, doubling
the size of that data from 1K to 2K.

So unless there is some other array sized by NR_IRQS in the
context switch path that could account for this in other ways,
it looks like you just got unlucky.

The only hypothesis I can come up with is that maybe you are
taking an extra TLB miss now that you didn't use to.
I think the per cpu area is covered by huge pages, but maybe not.

Eric

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: 2.6.19-rc1: x86_64 slowdown in lmbench's fork
  2006-11-03 17:35         ` Eric W. Biederman
@ 2006-11-03 17:47           ` Andi Kleen
  2006-11-03 18:18             ` Eric W. Biederman
  2006-11-03 20:25             ` Tim Chen
  0 siblings, 2 replies; 11+ messages in thread
From: Andi Kleen @ 2006-11-03 17:47 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: tim.c.chen, Adrian Bunk, linux-kernel, discuss


> So unless there is some other array that is sized by NR_IRQs
> in the context switch path which could account for this in
> other ways.  It looks like you just got unlucky.


TLB/cache profiling data might be useful?
My bet would be more on cache effects.
 
> The only hypothesis that I can seem to come up with is that maybe
> you are getting an extra tlb now that you didn't use to.  
> I think the per cpu area is covered by huge pages but maybe not.

It should be.

-Andi

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: 2.6.19-rc1: x86_64 slowdown in lmbench's fork
  2006-11-03 17:47           ` Andi Kleen
@ 2006-11-03 18:18             ` Eric W. Biederman
  2006-11-03 18:39               ` Andi Kleen
  2006-11-03 20:25             ` Tim Chen
  1 sibling, 1 reply; 11+ messages in thread
From: Eric W. Biederman @ 2006-11-03 18:18 UTC (permalink / raw)
  To: Andi Kleen; +Cc: tim.c.chen, Adrian Bunk, linux-kernel, discuss

Andi Kleen <ak@suse.de> writes:

>> So unless there is some other array that is sized by NR_IRQs
>> in the context switch path which could account for this in
>> other ways.  It looks like you just got unlucky.
>
>
> TLB/cache profiling data might be useful?
> My bet would be more on cache effects.

The only way I can see that being true is if some irq was keeping
the cache line warm for something in the process startup.

I have trouble seeing how adding 1K to an already 1K data structure
can cause a cache miss that wasn't happening already.
  
>> The only hypothesis that I can seem to come up with is that maybe
>> you are getting an extra tlb now that you didn't use to.  
>> I think the per cpu area is covered by huge pages but maybe not.
>
> It should be.

Which invalidates the TLB fault hypothesis, unless the per cpu area
happens to lie across a 2MB boundary.

Eric

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: 2.6.19-rc1: x86_64 slowdown in lmbench's fork
  2006-11-03 18:18             ` Eric W. Biederman
@ 2006-11-03 18:39               ` Andi Kleen
  0 siblings, 0 replies; 11+ messages in thread
From: Andi Kleen @ 2006-11-03 18:39 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: tim.c.chen, Adrian Bunk, linux-kernel, discuss


> Which invalidates the tlb fault hypothesis unless it happens to lie
> on the 2MB boundary.

It's in different pages than the code.

One drawback of using large TLB entries is that CPUs tend to have a lot
fewer of them than small ones, so they can actually thrash sooner if
you're unlucky.

Only profiling can tell I guess.

-Andi

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: 2.6.19-rc1: x86_64 slowdown in lmbench's fork
  2006-11-03 17:47           ` Andi Kleen
  2006-11-03 18:18             ` Eric W. Biederman
@ 2006-11-03 20:25             ` Tim Chen
  1 sibling, 0 replies; 11+ messages in thread
From: Tim Chen @ 2006-11-03 20:25 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Eric W. Biederman, Adrian Bunk, linux-kernel, discuss

On Fri, 2006-11-03 at 18:47 +0100, Andi Kleen wrote:
> > So unless there is some other array that is sized by NR_IRQs
> > in the context switch path which could account for this in
> > other ways.  It looks like you just got unlucky.
> 
> 
> TLB/cache profiling data might be useful?
> My bet would be more on cache effects.
>  

The TLB miss, cache miss and page walk profiles did not change when I
measured them.

I have a suspicion that the overhead of pinning the parent and child
processes to specific cpus has something to do with the
change in time observed.  Lmbench includes this overhead in
the fork time it reports.  I had chosen the lmbench option to
place the parent and child processes on specific cpus.

When I skip this by picking another lmbench option that lets the
scheduler pick the placement of the parent and child processes, I see
that the fork time stays unchanged.  I wonder if the increase in time
is in sched_setaffinity.

Tim

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2006-11-03 21:14 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-11-02 16:44 2.6.19-rc1: Slowdown in lmbench's fork Tim Chen
2006-11-02 18:33 ` Eric W. Biederman
2006-11-02 18:34   ` Tim Chen
2006-11-03  2:11     ` 2.6.19-rc1: x86_64 slowdown " Adrian Bunk
2006-11-03  9:08       ` Eric W. Biederman
2006-11-03 16:10       ` Tim Chen
2006-11-03 17:35         ` Eric W. Biederman
2006-11-03 17:47           ` Andi Kleen
2006-11-03 18:18             ` Eric W. Biederman
2006-11-03 18:39               ` Andi Kleen
2006-11-03 20:25             ` Tim Chen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox