* unusual scheduling performance
@ 2002-11-18 8:18 William Lee Irwin III
2002-11-18 16:34 ` Martin J. Bligh
2002-11-20 14:12 ` Ingo Molnar
0 siblings, 2 replies; 19+ messages in thread
From: William Lee Irwin III @ 2002-11-18 8:18 UTC (permalink / raw)
To: linux-kernel; +Cc: mingo, rml, riel, akpm
On 16x, 2.5.47 kernel compiles take about 26s when the machine is
otherwise idle.
On 32x, 2.5.47 kernel compiles take about 48s when the machine is
otherwise idle.
When a single-threaded task consumes an entire cpu, kernel compiles
take 36s on 32x when the machine is idle aside from the task consuming
that cpu and the kernel compile itself.
I suspect the scheduler, because cpu reporting in top(1) shows that
two or more cpu-intensive tasks are concentrated on the same cpu, and
some long-lived tasks appear to be "bouncing" across cpus. If someone
with knowledge and/or expertise with respect to scheduling semantics
could look into this, I would be much obliged. Resolving this would
likely address many SMP and/or NUMA scheduling performance issues.
Thanks,
Bill
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: unusual scheduling performance
2002-11-18 8:18 unusual scheduling performance William Lee Irwin III
@ 2002-11-18 16:34 ` Martin J. Bligh
2002-11-18 16:53 ` William Lee Irwin III
2002-11-20 14:12 ` Ingo Molnar
1 sibling, 1 reply; 19+ messages in thread
From: Martin J. Bligh @ 2002-11-18 16:34 UTC (permalink / raw)
To: William Lee Irwin III, linux-kernel; +Cc: mingo, rml, riel, akpm
> On 16x, 2.5.47 kernel compiles take about 26s when the machine is
> otherwise idle.
>
> On 32x, 2.5.47 kernel compiles take about 48s when the machine is
> otherwise idle.
>
> When a single-threaded task consumes an entire cpu, kernel compiles
> take 36s on 32x when the machine is idle aside from the task consuming
> that cpu and the kernel compile itself.
>
> I suspect the scheduler, because cpu reporting in top(1) shows that
> two or more cpu-intensive tasks are concentrated on the same cpu, and
> some long-lived tasks appear to be "bouncing" across cpus. If someone
> with knowledge and/or expertise with respect to scheduling semantics
> could look into this, I would be much obliged. Resolving this would
> likely address many SMP and/or NUMA scheduling performance issues.
1. make -j <what?>
2. profiles?
3. Can you try the latest set of NUMA sched patches posted by Eric Focht?
M.
* Re: unusual scheduling performance
2002-11-18 16:34 ` Martin J. Bligh
@ 2002-11-18 16:53 ` William Lee Irwin III
2002-11-18 17:53 ` Dave Hansen
0 siblings, 1 reply; 19+ messages in thread
From: William Lee Irwin III @ 2002-11-18 16:53 UTC (permalink / raw)
To: Martin J. Bligh; +Cc: linux-kernel, mingo, rml, riel, akpm
On Mon, Nov 18, 2002 at 08:34:34AM -0800, Martin J. Bligh wrote:
> 1. make -j <what?>
> 2. profiles?
> 3. Can you try the latest set of NUMA sched patches posted by Eric Focht?
(1) make -j64 bzImage
(2) doesn't sound useful for load balancing
(3) sure
Bill
* Re: unusual scheduling performance
2002-11-18 16:53 ` William Lee Irwin III
@ 2002-11-18 17:53 ` Dave Hansen
2002-11-18 18:16 ` Andrew Morton
2002-11-18 20:17 ` William Lee Irwin III
0 siblings, 2 replies; 19+ messages in thread
From: Dave Hansen @ 2002-11-18 17:53 UTC (permalink / raw)
To: William Lee Irwin III
Cc: Martin J. Bligh, linux-kernel, mingo, rml, riel, akpm
William Lee Irwin III wrote:
> On Mon, Nov 18, 2002 at 08:34:34AM -0800, Martin J. Bligh wrote:
>
>>1. make -j <what?>
>>2. profiles?
>>3. Can you try the latest set of NUMA sched patches posted by Eric Focht?
>
> (1) make -j64 bzImage
> (2) doesn't sound useful for load balancing
> (3) sure
I'm seeing the same thing. In my pagecache warmup test, I do 20 greps
to pull in a 10-gig fileset. Each grep works on 1/20th of the files.
For a long, long time, once the file set was warmed up, the test
took ~14 seconds:
Average Real: 14.0824
Average User: 0.94055
Average Sys: 5.20875
Full profile here:
http://www.sr71.net/prof/grep/run-grep-warm-2.5.47-11-15-2002-15.21.31/
As of 2.5.47, it looks like this:
Average Real: 18.9168
Average User: 1.0073
Average Sys: 4.9918
Full profile here:
http://www.sr71.net/prof/grep/run-grep-warm-2.5.47-11-15-2002-15.58.02/
readprofile ticks
------------------
fast slow diff
page_cache_readahead: 93 82 -11
__generic_file_aio_read: 73 83 10
file_move: 52 86 34
dget_locked: 57 87 30
proc_pid_stat: 149 88 -61
ep_notify_file_close: 59 89 30
get_pid_list: 21 91 70
update_atime: 100 93 -7
get_unused_fd: 23 105 82
fget: 120 113 -7
dput: 100 120 20
get_empty_filp: 105 121 16
system_call: 113 129 16
rwsem_down_write_failed: 133 133
vfs_follow_link: 116 164 48
file_read_actor: 198 227 29
__fput: 178 241 63
radix_tree_lookup: 324 293 -31
atomic_dec_and_lock: 229 307 78
.text.lock.dec_and_lock: 111 331 220
try_to_wake_up: 374 374
kmap_atomic: 346 398 52
kunmap_atomic: 379 409 30
vfs_read: 440 431 -9
.text.lock.namei: 149 482 333
__d_lookup: 456 518 62
link_path_walk: 533 710 177
schedule: 1 1060 1059
do_generic_mapping_read: 1880 1846 -34
poll_idle: 2059 33416 31357
__copy_to_user: 94208 87678 -6530
total: 104173 132206 28033
So, schedule() is being called a _lot_ more. But, for some reason,
the slower one wasn't caught doing __copy_to_user() as much.
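The per-symbol deltas in the table above can be computed mechanically from
two profile snapshots. A minimal sketch, assuming the readprofile output has
already been parsed into {symbol: ticks} dicts (the three sample values are
taken from the table above):

```python
def tick_delta(fast, slow):
    """Per-symbol tick difference between two readprofile runs.

    Each argument is a {symbol: ticks} dict; a symbol missing from one
    run counts as 0 there (schedule barely shows up in the fast run).
    Positive deltas mean the slow kernel spent more ticks in a symbol.
    """
    symbols = set(fast) | set(slow)
    return sorted(((s, slow.get(s, 0) - fast.get(s, 0)) for s in symbols),
                  key=lambda kv: kv[1])

# sample values taken from the table above
fast = {"schedule": 1, "poll_idle": 2059, "__copy_to_user": 94208}
slow = {"schedule": 1060, "poll_idle": 33416, "__copy_to_user": 87678}
```

Running `dict(tick_delta(fast, slow))` reproduces the schedule (+1059),
poll_idle (+31357) and __copy_to_user (-6530) entries of the diff column.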
Bill, does this look like what you're seeing?
--
Dave Hansen
haveblue@us.ibm.com
* Re: unusual scheduling performance
2002-11-18 17:53 ` Dave Hansen
@ 2002-11-18 18:16 ` Andrew Morton
2002-11-18 18:34 ` Davide Libenzi
2002-11-18 20:17 ` William Lee Irwin III
1 sibling, 1 reply; 19+ messages in thread
From: Andrew Morton @ 2002-11-18 18:16 UTC (permalink / raw)
To: Dave Hansen
Cc: William Lee Irwin III, Martin J. Bligh, linux-kernel, mingo, rml,
riel, akpm
Dave Hansen wrote:
>
> ...
> rwsem_down_write_failed: 133 133
Possible culprit.
Please stick a dump_stack() in rwsem_down_write_failed(), and add the below.
Suggest you stick with 2.5.47 to diagnose this. The loss of kksymoops
is a pain.
fs/eventpoll.c | 2 ++
1 files changed, 2 insertions(+)
--- 25/fs/eventpoll.c~hey Mon Nov 18 10:13:40 2002
+++ 25-akpm/fs/eventpoll.c Mon Nov 18 10:14:01 2002
@@ -328,6 +328,8 @@ void eventpoll_release(struct file *file
 	if (list_empty(lsthead))
 		return;

+	printk("hey!\n");
+
 	/*
 	 * We don't want to get "file->f_ep_lock" because it is not
 	 * necessary. It is not necessary because we're in the "struct file"
_
* Re: unusual scheduling performance
2002-11-18 18:16 ` Andrew Morton
@ 2002-11-18 18:34 ` Davide Libenzi
2002-11-18 18:52 ` Andrew Morton
2002-11-18 18:56 ` Dave Hansen
0 siblings, 2 replies; 19+ messages in thread
From: Davide Libenzi @ 2002-11-18 18:34 UTC (permalink / raw)
To: Andrew Morton
Cc: Dave Hansen, William Lee Irwin III, Martin J. Bligh,
Linux Kernel Mailing List, Ingo Molnar, Robert Love, riel, akpm
On Mon, 18 Nov 2002, Andrew Morton wrote:
> Dave Hansen wrote:
> >
> > ...
> > rwsem_down_write_failed: 133 133
>
> Possible culprit.
>
> Please stick a dump_stack() in rwsem_down_write_failed(), and add the below.
> Suggest you stick with 2.5.47 to diagnose this. The loss of kksymoops
> is a pain.
>
>
> fs/eventpoll.c | 2 ++
> 1 files changed, 2 insertions(+)
>
> --- 25/fs/eventpoll.c~hey Mon Nov 18 10:13:40 2002
> +++ 25-akpm/fs/eventpoll.c Mon Nov 18 10:14:01 2002
> @@ -328,6 +328,8 @@ void eventpoll_release(struct file *file
> if (list_empty(lsthead))
> return;
>
> + printk("hey!\n");
> +
Andrew, if you don't use epoll there's no way you get there. The function
eventpoll_file_init() initializes the list at each file* init in
fs/file_table.c.
If you're not using epoll and you get there, someone is screwing up the
data inside the struct file.
- Davide
* Re: unusual scheduling performance
2002-11-18 18:34 ` Davide Libenzi
@ 2002-11-18 18:52 ` Andrew Morton
2002-11-18 18:58 ` Davide Libenzi
2002-11-18 18:56 ` Dave Hansen
1 sibling, 1 reply; 19+ messages in thread
From: Andrew Morton @ 2002-11-18 18:52 UTC (permalink / raw)
To: Davide Libenzi
Cc: Dave Hansen, William Lee Irwin III, Martin J. Bligh,
Linux Kernel Mailing List, Ingo Molnar, Robert Love, riel
Davide Libenzi wrote:
>
> On Mon, 18 Nov 2002, Andrew Morton wrote:
>
> > Dave Hansen wrote:
> > >
> > > ...
> > > rwsem_down_write_failed: 133 133
> >
> > Possible culprit.
> >
> > Please stick a dump_stack() in rwsem_down_write_failed(), and add the below.
> > Suggest you stick with 2.5.47 to diagnose this. The loss of kksymoops
> > is a pain.
> >
> >
> > fs/eventpoll.c | 2 ++
> > 1 files changed, 2 insertions(+)
> >
> > --- 25/fs/eventpoll.c~hey Mon Nov 18 10:13:40 2002
> > +++ 25-akpm/fs/eventpoll.c Mon Nov 18 10:14:01 2002
> > @@ -328,6 +328,8 @@ void eventpoll_release(struct file *file
> > if (list_empty(lsthead))
> > return;
> >
> > + printk("hey!\n");
> > +
>
> Andrew, if you don't use epoll there's no way you get there.
Yup. That was a random stab based on recently-added down_write()
calls.
However, the down_write isn't there in 2.5.47, so that's a false
lead. We'll need that dump_stack() output.
Here's Dave's profile. ep_notify_file_close() makes a small appearance.
The change you made to 2.5.48 will wipe that out. Neat.
0.058% 78 locks_remove_flock
0.062% 82 page_cache_readahead
0.062% 83 __generic_file_aio_read
0.065% 86 file_move
0.065% 87 dget_locked
0.066% 88 proc_pid_stat
0.067% 89 ep_notify_file_close
0.068% 91 get_pid_list
0.070% 93 update_atime
0.079% 105 get_unused_fd
0.085% 113 fget
0.090% 120 dput
0.091% 121 get_empty_filp
0.097% 129 system_call
0.100% 133 rwsem_down_write_failed
0.124% 164 vfs_follow_link
0.171% 227 file_read_actor
0.182% 241 __fput
0.221% 293 radix_tree_lookup
0.232% 307 atomic_dec_and_lock
0.250% 331 .text.lock.dec_and_lock
0.282% 374 try_to_wake_up
0.301% 398 kmap_atomic
0.309% 409 kunmap_atomic
0.326% 431 vfs_read
0.364% 482 .text.lock.namei
0.391% 518 __d_lookup
0.537% 710 link_path_walk
0.801% 1060 schedule
1.396% 1846 do_generic_mapping_read
25.275% 33416 poll_idle
66.319% 87678 __copy_to_user
100.000% 132206 total
* Re: unusual scheduling performance
2002-11-18 18:34 ` Davide Libenzi
2002-11-18 18:52 ` Andrew Morton
@ 2002-11-18 18:56 ` Dave Hansen
2002-11-18 18:59 ` Davide Libenzi
1 sibling, 1 reply; 19+ messages in thread
From: Dave Hansen @ 2002-11-18 18:56 UTC (permalink / raw)
To: Davide Libenzi
Cc: Andrew Morton, William Lee Irwin III, Martin J. Bligh,
Linux Kernel Mailing List, Ingo Molnar, Robert Love, riel, akpm
Davide Libenzi wrote:
> On Mon, 18 Nov 2002, Andrew Morton wrote:
>> fs/eventpoll.c | 2 ++
>> 1 files changed, 2 insertions(+)
>>
>>--- 25/fs/eventpoll.c~hey Mon Nov 18 10:13:40 2002
>>+++ 25-akpm/fs/eventpoll.c Mon Nov 18 10:14:01 2002
>>@@ -328,6 +328,8 @@ void eventpoll_release(struct file *file
>> if (list_empty(lsthead))
>> return;
>>
>>+ printk("hey!\n");
>>+
>
> Andrew, if you don't use epoll there's no way you get there. The function
> eventpoll_file_init() initializes the list at each file* init in
> fs/file_table.c.
> If you're not using epoll and you get there, someone is screwing up the
> data inside the struct file.
That little tidbit isn't even in .47. Is that patch against one of
the 2.5.47-mm's?
--
Dave Hansen
haveblue@us.ibm.com
* Re: unusual scheduling performance
2002-11-18 18:52 ` Andrew Morton
@ 2002-11-18 18:58 ` Davide Libenzi
0 siblings, 0 replies; 19+ messages in thread
From: Davide Libenzi @ 2002-11-18 18:58 UTC (permalink / raw)
To: Andrew Morton
Cc: Dave Hansen, William Lee Irwin III, Martin J. Bligh,
Linux Kernel Mailing List, Ingo Molnar, Robert Love, riel
On Mon, 18 Nov 2002, Andrew Morton wrote:
> Here's Dave's profile. ep_notify_file_close() makes a small appearance.
> The change you made to 2.5.48 will wipe that out. Neat.
It was per Linus' suggestion actually :)
- Davide
* Re: unusual scheduling performance
2002-11-18 18:56 ` Dave Hansen
@ 2002-11-18 18:59 ` Davide Libenzi
0 siblings, 0 replies; 19+ messages in thread
From: Davide Libenzi @ 2002-11-18 18:59 UTC (permalink / raw)
To: Dave Hansen; +Cc: Linux Kernel Mailing List
On Mon, 18 Nov 2002, Dave Hansen wrote:
> > If you're not using epoll and you get there, someone is screwing up the
> > data inside the struct file
>
> That little tidbit isn't even in .47. Is that patch against one of
> the 2.5.47-mm's?
No, Andrew took it from 2.5.48 ...
- Davide
* Re: unusual scheduling performance
2002-11-18 17:53 ` Dave Hansen
2002-11-18 18:16 ` Andrew Morton
@ 2002-11-18 20:17 ` William Lee Irwin III
2002-11-18 22:51 ` Dave Hansen
1 sibling, 1 reply; 19+ messages in thread
From: William Lee Irwin III @ 2002-11-18 20:17 UTC (permalink / raw)
To: Dave Hansen; +Cc: Martin J. Bligh, linux-kernel, mingo, rml, riel, akpm
On Mon, Nov 18, 2002 at 09:53:24AM -0800, Dave Hansen wrote:
> I'm seeing the same thing. In my pagecache warmup test, I do 20 greps
> to pull in a 10-gig fileset. Each grep works on 1/20th of the files.
[...]
> So, schedule() is being called a _lot_ more. But, for some reason,
> the slower one wasn't caught doing __copy_to_user() as much.
> Bill, does this look like what you're seeing?
No, I'm seeing strange load balancing behavior. But you seem to have
tripped over a somewhat more severe anomaly.
Bill
* Re: unusual scheduling performance
2002-11-18 20:17 ` William Lee Irwin III
@ 2002-11-18 22:51 ` Dave Hansen
2002-11-18 23:09 ` Andrew Morton
2002-11-18 23:33 ` Davide Libenzi
0 siblings, 2 replies; 19+ messages in thread
From: Dave Hansen @ 2002-11-18 22:51 UTC (permalink / raw)
To: William Lee Irwin III
Cc: Martin J. Bligh, linux-kernel, mingo, rml, riel, akpm
As Andrew suggested, I put a dump_stack() in rwsem_down_write_failed().
This was actually in a 2.5.47 bk snapshot, so it has eventpoll in it.
kksymoops is broken, so:
dmesg | tail -20 | sort | uniq | ksymoops -m /boot/System.map
Trace; c01c5757 <rwsem_down_write_failed+27/170>
Trace; c01220c6 <update_wall_time+16/50>
Trace; c01223ee <do_timer+2e/c0>
Trace; c0166bd3 <.text.lock.eventpoll+6/f3>
Trace; c0146568 <__fput+18/c0>
Trace; c010ae9a <handle_IRQ_event+2a/60>
Trace; c0144a05 <filp_close+85/b0>
Trace; c0144a8d <sys_close+5d/70>
Trace; c0108fab <syscall_call+7/b>
Trace; c01c5757 <rwsem_down_write_failed+27/170>
Trace; c0166bd3 <.text.lock.eventpoll+6/f3>
Trace; c0146568 <__fput+18/c0>
Trace; c011e90b <do_softirq+6b/d0>
Trace; c0144a05 <filp_close+85/b0>
Trace; c0144a8d <sys_close+5d/70>
Trace; c0108fab <syscall_call+7/b>
Trace; c01c5757 <rwsem_down_write_failed+27/170>
Trace; c0166bd3 <.text.lock.eventpoll+6/f3>
Trace; c0146568 <__fput+18/c0>
Trace; c0144c2d <generic_file_llseek+2d/e0>
Trace; c0144a05 <filp_close+85/b0>
Trace; c0144a8d <sys_close+5d/70>
Trace; c0108fab <syscall_call+7/b>
Trace; c01c5757 <rwsem_down_write_failed+27/170>
Trace; c0166bd3 <.text.lock.eventpoll+6/f3>
Trace; c0146568 <__fput+18/c0>
Trace; c01553fa <sys_getdents64+4a/98>
Trace; c0144a05 <filp_close+85/b0>
Trace; c0144a8d <sys_close+5d/70>
Trace; c0108fab <syscall_call+7/b>
Mystery solved?
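For reference, the `symbol+offset/size` rendering that ksymoops produces from
System.map is just a nearest-preceding-symbol lookup. A sketch, where the
symbol table is a hypothetical excerpt with addresses chosen to be consistent
with the filp_close/sys_close entries in the traces above (the third symbol
name is made up purely to bound sys_close's size):

```python
import bisect

def resolve(addr, symtab):
    """Render addr as 'symbol+offset/size', the way ksymoops does,
    given a System.map-style table sorted by address."""
    addrs = [a for a, _ in symtab]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None  # address falls before the first known symbol
    base, name = symtab[i]
    # a symbol's "size" is the gap to the next symbol in the map
    size = symtab[i + 1][0] - base if i + 1 < len(symtab) else 0
    return "%s+%x/%x" % (name, addr - base, size)

# hypothetical System.map excerpt consistent with the traces above
symtab = [(0xc0144980, "filp_close"),
          (0xc0144a30, "sys_close"),
          (0xc0144aa0, "sys_vhangup")]
```

With this table, `resolve(0xc0144a05, symtab)` gives "filp_close+85/b0" and
`resolve(0xc0144a8d, symtab)` gives "sys_close+5d/70", matching the traces.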
--
Dave Hansen
haveblue@us.ibm.com
* Re: unusual scheduling performance
2002-11-18 22:51 ` Dave Hansen
@ 2002-11-18 23:09 ` Andrew Morton
2002-11-18 23:20 ` Davide Libenzi
2002-11-18 23:26 ` Dave Hansen
2002-11-18 23:33 ` Davide Libenzi
1 sibling, 2 replies; 19+ messages in thread
From: Andrew Morton @ 2002-11-18 23:09 UTC (permalink / raw)
To: Dave Hansen
Cc: William Lee Irwin III, Martin J. Bligh, linux-kernel, mingo, rml,
riel, Davide Libenzi
Dave Hansen wrote:
>
> As Andrew suggested, I put a dump_stack() in rwsem_down_write_failed().
>
> This was actually in a 2.5.47 bk snapshot, so it has eventpoll in it.
So printk("hey!\n") would have worked. Looks like it would have
talked to you, too...
> kksymoops is broken, so:
> dmesg | tail -20 | sort | uniq | ksymoops -m /boot/System.map
>
> Trace; c01c5757 <rwsem_down_write_failed+27/170>
> Trace; c01220c6 <update_wall_time+16/50>
> Trace; c01223ee <do_timer+2e/c0>
> Trace; c0166bd3 <.text.lock.eventpoll+6/f3>
> Trace; c0146568 <__fput+18/c0>
> Trace; c010ae9a <handle_IRQ_event+2a/60>
> Trace; c0144a05 <filp_close+85/b0>
> Trace; c0144a8d <sys_close+5d/70>
> Trace; c0108fab <syscall_call+7/b>
>
So it would appear that eventpoll_release() is the problem.
How odd. You're not actually _using_ epoll there, are you?
* Re: unusual scheduling performance
2002-11-18 23:09 ` Andrew Morton
@ 2002-11-18 23:20 ` Davide Libenzi
2002-11-18 23:26 ` Dave Hansen
1 sibling, 0 replies; 19+ messages in thread
From: Davide Libenzi @ 2002-11-18 23:20 UTC (permalink / raw)
To: Andrew Morton
Cc: Dave Hansen, William Lee Irwin III, Martin J. Bligh,
Linux Kernel Mailing List, Ingo Molnar, Robert Love, riel
On Mon, 18 Nov 2002, Andrew Morton wrote:
> Dave Hansen wrote:
> >
> > As Andrew suggested, I put a dump_stack() in rwsem_down_write_failed().
> >
> > This was actually in a 2.5.47 bk snapshot, so it has eventpoll in it.
>
> So printk("hey!\n") would have worked. Looks like it would have
> talked to you, too...
>
> > kksymoops is broken, so:
> > dmesg | tail -20 | sort | uniq | ksymoops -m /boot/System.map
> >
> > Trace; c01c5757 <rwsem_down_write_failed+27/170>
> > Trace; c01220c6 <update_wall_time+16/50>
> > Trace; c01223ee <do_timer+2e/c0>
> > Trace; c0166bd3 <.text.lock.eventpoll+6/f3>
> > Trace; c0146568 <__fput+18/c0>
> > Trace; c010ae9a <handle_IRQ_event+2a/60>
> > Trace; c0144a05 <filp_close+85/b0>
> > Trace; c0144a8d <sys_close+5d/70>
> > Trace; c0108fab <syscall_call+7/b>
> >
>
> So it would appear that eventpoll_release() is the problem.
> How odd. You're not actually _using_ epoll there, are you?
Could you pls use 2.5.48 ...
This is weird, the code is straightforward.
- Davide
* Re: unusual scheduling performance
2002-11-18 23:09 ` Andrew Morton
2002-11-18 23:20 ` Davide Libenzi
@ 2002-11-18 23:26 ` Dave Hansen
2002-11-18 23:30 ` Davide Libenzi
1 sibling, 1 reply; 19+ messages in thread
From: Dave Hansen @ 2002-11-18 23:26 UTC (permalink / raw)
To: Andrew Morton
Cc: William Lee Irwin III, Martin J. Bligh, linux-kernel, mingo, rml,
riel, Davide Libenzi
Andrew Morton wrote:
> Dave Hansen wrote:
>>kksymoops is broken, so:
>>dmesg | tail -20 | sort | uniq | ksymoops -m /boot/System.map
>>
>>Trace; c01c5757 <rwsem_down_write_failed+27/170>
>>Trace; c01220c6 <update_wall_time+16/50>
>>Trace; c01223ee <do_timer+2e/c0>
>>Trace; c0166bd3 <.text.lock.eventpoll+6/f3>
>>Trace; c0146568 <__fput+18/c0>
>>Trace; c010ae9a <handle_IRQ_event+2a/60>
>>Trace; c0144a05 <filp_close+85/b0>
>>Trace; c0144a8d <sys_close+5d/70>
>>Trace; c0108fab <syscall_call+7/b>
>>
>
> So it would appear that eventpoll_release() is the problem.
> How odd. You're not actually _using_ epoll there, are you?
Not unless grep uses epoll.
--
Dave Hansen
haveblue@us.ibm.com
* Re: unusual scheduling performance
2002-11-18 23:26 ` Dave Hansen
@ 2002-11-18 23:30 ` Davide Libenzi
0 siblings, 0 replies; 19+ messages in thread
From: Davide Libenzi @ 2002-11-18 23:30 UTC (permalink / raw)
To: Dave Hansen; +Cc: Linux Kernel Mailing List
On Mon, 18 Nov 2002, Dave Hansen wrote:
> > So it would appear that eventpoll_release() is the problem.
> > How odd. You're not actually _using_ epoll there, are you?
>
> Not unless grep uses epoll.
I'd be surprised if it would :)
- Davide
* Re: unusual scheduling performance
2002-11-18 22:51 ` Dave Hansen
2002-11-18 23:09 ` Andrew Morton
@ 2002-11-18 23:33 ` Davide Libenzi
1 sibling, 0 replies; 19+ messages in thread
From: Davide Libenzi @ 2002-11-18 23:33 UTC (permalink / raw)
To: Dave Hansen
Cc: William Lee Irwin III, Martin J. Bligh, linux-kernel, mingo, rml,
riel, akpm
On Mon, 18 Nov 2002, Dave Hansen wrote:
> As Andrew suggested, I put a dump_stack() in rwsem_down_write_failed().
>
> This was actually in a 2.5.47 bk snapshot, so it has eventpoll in it.
> kksymoops is broken, so:
> dmesg | tail -20 | sort | uniq | ksymoops -m /boot/System.map
>
> Trace; c01c5757 <rwsem_down_write_failed+27/170>
> Trace; c01220c6 <update_wall_time+16/50>
> Trace; c01223ee <do_timer+2e/c0>
> Trace; c0166bd3 <.text.lock.eventpoll+6/f3>
> Trace; c0146568 <__fput+18/c0>
> Trace; c010ae9a <handle_IRQ_event+2a/60>
> Trace; c0144a05 <filp_close+85/b0>
> Trace; c0144a8d <sys_close+5d/70>
> Trace; c0108fab <syscall_call+7/b>
>
> Trace; c01c5757 <rwsem_down_write_failed+27/170>
> Trace; c0166bd3 <.text.lock.eventpoll+6/f3>
> Trace; c0146568 <__fput+18/c0>
> Trace; c011e90b <do_softirq+6b/d0>
> Trace; c0144a05 <filp_close+85/b0>
> Trace; c0144a8d <sys_close+5d/70>
> Trace; c0108fab <syscall_call+7/b>
>
> Trace; c01c5757 <rwsem_down_write_failed+27/170>
> Trace; c0166bd3 <.text.lock.eventpoll+6/f3>
> Trace; c0146568 <__fput+18/c0>
> Trace; c0144c2d <generic_file_llseek+2d/e0>
> Trace; c0144a05 <filp_close+85/b0>
> Trace; c0144a8d <sys_close+5d/70>
> Trace; c0108fab <syscall_call+7/b>
>
> Trace; c01c5757 <rwsem_down_write_failed+27/170>
> Trace; c0166bd3 <.text.lock.eventpoll+6/f3>
> Trace; c0146568 <__fput+18/c0>
> Trace; c01553fa <sys_getdents64+4a/98>
> Trace; c0144a05 <filp_close+85/b0>
> Trace; c0144a8d <sys_close+5d/70>
> Trace; c0108fab <syscall_call+7/b>
>
> Mystery solved?
Could you pls put this in eventpoll_release() :

	if (list_empty(lsthead))
		return;

	printk("[%p] head=%p prev=%p next=%p\n", current, lsthead,
		lsthead->prev, lsthead->next);
- Davide
* Re: unusual scheduling performance
2002-11-18 8:18 unusual scheduling performance William Lee Irwin III
2002-11-18 16:34 ` Martin J. Bligh
@ 2002-11-20 14:12 ` Ingo Molnar
2002-11-20 22:19 ` William Lee Irwin III
1 sibling, 1 reply; 19+ messages in thread
From: Ingo Molnar @ 2002-11-20 14:12 UTC (permalink / raw)
To: William Lee Irwin III; +Cc: linux-kernel, Robert Love, riel, Andrew Morton
On Mon, 18 Nov 2002, William Lee Irwin III wrote:
> On 16x, 2.5.47 kernel compiles take about 26s when the machine is
> otherwise idle.
>
> On 32x, 2.5.47 kernel compiles take about 48s when the machine is
> otherwise idle.
one thing to note is that the kernel's compilation is not something that
parallelizes well to above 8 CPUs. Our make architecture creates many link
points which serialize 'threads of compilation'.
i'd try two things:
1) try Erich Focht's NUMA enhancements to the load balancer.
2) remove the -pipe flag from arch/i386/Makefile
the latter will reduce the number of processes and make compilation
more localized to a single CPU - which might (or might not) help NUMA
architectures.
Ingo
* Re: unusual scheduling performance
2002-11-20 14:12 ` Ingo Molnar
@ 2002-11-20 22:19 ` William Lee Irwin III
0 siblings, 0 replies; 19+ messages in thread
From: William Lee Irwin III @ 2002-11-20 22:19 UTC (permalink / raw)
To: Ingo Molnar; +Cc: linux-kernel, Robert Love, riel, Andrew Morton
On Mon, 18 Nov 2002, William Lee Irwin III wrote:
>> On 16x, 2.5.47 kernel compiles take about 26s when the machine is
>> otherwise idle.
>> On 32x, 2.5.47 kernel compiles take about 48s when the machine is
>> otherwise idle.
On Wed, Nov 20, 2002 at 03:12:57PM +0100, Ingo Molnar wrote:
> one thing to note is that the kernel's compilation is not something that
> parallelizes well to above 8 CPUs. Our make architecture creates many link
> points which serialize 'threads of compilation'.
Well, I was only running -j64. That's 2 processes per cpu... something
unusual seems to happen with low process/cpu density. Some fiddling around
with prior kernels seemed to show that both -j64 and -j256 were previously
near-equivalent sweet spots for 32x.
On Wed, Nov 20, 2002 at 03:12:57PM +0100, Ingo Molnar wrote:
> i'd try two things:
> 1) try Erich Focht's NUMA enhancements to the load balancer.
> 2) remove the -pipe flag from arch/i386/Makefile
> the latter will reduce the number of processes and make compilation
> more localized to a single CPU - which might (or might not) help NUMA
> architectures.
The unusual bit that neither of those can really address was that
eating a single cpu with something completely unrelated sped the whole
process up from 48s to 36s on 32x (this is all nicely repeatable). No
good explanations for this have surfaced yet. I'll have to get a good
way of logging what processes are chewing how much cpu and what cpus
they're running on before I can send comprehensible traces of this.
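A low-tech way to log which cpu each task is chewing is to sample
/proc/<pid>/stat periodically. A sketch, with field positions per proc(5)
(the helper name is made up; utime/stime are fields 14/15 and the last-run
processor is field 39, 1-based):

```python
import os

def task_cpu_and_ticks(pid):
    """Return (last-run cpu, utime+stime ticks) for a task.

    comm (field 2) is parenthesized and may contain spaces, so split
    on the closing paren first; after that, 1-based field N of the
    stat line lands at index N-3 of the remaining space-separated list.
    """
    with open("/proc/%d/stat" % pid) as f:
        data = f.read()
    rest = data.rsplit(")", 1)[1].split()
    utime, stime = int(rest[11]), int(rest[12])  # fields 14, 15
    cpu = int(rest[36])                          # field 39: processor
    return cpu, utime + stime
```

Sampling this for every pid in /proc at intervals, e.g.
`task_cpu_and_ticks(os.getpid())`, would show which tasks bounce between
cpus and which ones pile up on the same cpu.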
OTOH Focht's fork() and/or exec() -time load balancing should
significantly help the low process/cpu density case by creating an
opportunity for load balancing before the lifetime of short-lived
processes expires, with the added bonus of keeping things within
nodes most of the time.
Thanks,
Bill