sched_wakeup_new and sched_kthread_stop events cause great overload

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* sched_wakeup_new and sched_kthread_stop events cause great overload
@ 2010-03-25  9:36 Xiao Guangrong
  2010-03-25 13:35 ` Steven Rostedt
  0 siblings, 1 reply; 4+ messages in thread
From: Xiao Guangrong @ 2010-03-25  9:36 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Steven Rostedt, Frederic Weisbecker, Hitoshi Mitake,
	Peter Zijlstra, LKML

We have done sysbench test for ftrace's performance and it looks sched_wakeup_new
and sched_kthread_stop events can cause great overload.

When we only enable sched_wakeup_new and sched_kthread_stop events, sysbench.threads
shows the overload is 10%, sysbench.mutex shows the overload is 7.5%.

The more weird thing is that we found the sched_kthread_stop event is never called
in this test, the test steps as follow:

echo > debugfs/tracing/set_event
echo 1 > debugfs/tracing/events/sched/sched_wakeup_new/enable
echo 1 > debugfs/tracing/events/sched/sched_kthread_stop/enable

com_opt="--num-threads=5000 --max-requests=50000"
echo > debugfs/tracing/trace
sysbench $com_opt --test=threads --thread-yields=1000 --thread-locks=10000 run >& log
[or sysbench $com_opt --test=mutex --mutex-num=100 --mutex-locks=50000 --mutex-loops=10000 run for mutex]
echo > debugfs/tracing/set_event

For sysbench.threads:
cat debugfs/tracing/trace | grep "sched_wakeup_new" | wc -l
5001
cat debugfs/tracing/trace | grep "sched_kthread_stop" | wc -l
0

For sysbench.mutex:
cat debugfs/tracing/trace | grep "sched_wakeup_new" | wc -l
5001
cat debugfs/tracing/trace | grep "sched_kthread_stop" | wc -l
0

And, if only enable sched_kthread_stop event, the sysbench.threads's
overload is 5.90%, the sysbench.mutex's overload is 3.36%.

It hardly explain why sched_kthread_stop is never called but cause great overload.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: sched_wakeup_new and sched_kthread_stop events cause great overload
  2010-03-25  9:36 sched_wakeup_new and sched_kthread_stop events cause great overload Xiao Guangrong
@ 2010-03-25 13:35 ` Steven Rostedt
  2010-04-01  9:37   ` Xiao Guangrong
  0 siblings, 1 reply; 4+ messages in thread
From: Steven Rostedt @ 2010-03-25 13:35 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: Ingo Molnar, Frederic Weisbecker, Hitoshi Mitake, Peter Zijlstra,
	LKML

On Thu, 2010-03-25 at 17:36 +0800, Xiao Guangrong wrote:
> We have done sysbench test for ftrace's performance and it looks sched_wakeup_new
> and sched_kthread_stop events can cause great overload.
> 
> When we only enable sched_wakeup_new and sched_kthread_stop events, sysbench.threads
> shows the overload is 10%, sysbench.mutex shows the overload is 7.5%.
> 
> The more weird thing is that we found the sched_kthread_stop event is never called
> in this test, the test steps as follow:
> 
> echo > debugfs/tracing/set_event
> echo 1 > debugfs/tracing/events/sched/sched_wakeup_new/enable
> echo 1 > debugfs/tracing/events/sched/sched_kthread_stop/enable
> 
> com_opt="--num-threads=5000 --max-requests=50000"
> echo > debugfs/tracing/trace
> sysbench $com_opt --test=threads --thread-yields=1000 --thread-locks=10000 run >& log
> [or sysbench $com_opt --test=mutex --mutex-num=100 --mutex-locks=50000 --mutex-loops=10000 run for mutex]
> echo > debugfs/tracing/set_event
> 
> For sysbench.threads:
> cat debugfs/tracing/trace | grep "sched_wakeup_new" | wc -l
> 5001
> cat debugfs/tracing/trace | grep "sched_kthread_stop" | wc -l
> 0

Strange? So if you did:

cat debugfs/tracing/trace | wc -l

you should get 5005?

> 
> For sysbench.mutex:
> cat debugfs/tracing/trace | grep "sched_wakeup_new" | wc -l
> 5001
> cat debugfs/tracing/trace | grep "sched_kthread_stop" | wc -l
> 0
> 
> And, if only enable sched_kthread_stop event, the sysbench.threads's
> overload is 5.90%, the sysbench.mutex's overload is 3.36%.

Could be something that messes with the cache lines?

> 
> It hardly explain why sched_kthread_stop is never called but cause great overload.

Would be worth running perf against this test, to see where the issues
are occurring.

-- Steve



^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: sched_wakeup_new and sched_kthread_stop events cause great overload
  2010-03-25 13:35 ` Steven Rostedt
@ 2010-04-01  9:37   ` Xiao Guangrong
  2010-04-01 16:53     ` Frederic Weisbecker
  0 siblings, 1 reply; 4+ messages in thread
From: Xiao Guangrong @ 2010-04-01  9:37 UTC (permalink / raw)
  To: rostedt
  Cc: Ingo Molnar, Frederic Weisbecker, Hitoshi Mitake, Peter Zijlstra,
	LKML

Hi Steven,

Sorry for the delay reply since i'm busy during this time.

Steven Rostedt wrote:

>> For sysbench.threads:
>> cat debugfs/tracing/trace | grep "sched_wakeup_new" | wc -l
>> 5001
>> cat debugfs/tracing/trace | grep "sched_kthread_stop" | wc -l
>> 0
> 
> Strange? So if you did:
> 
> cat debugfs/tracing/trace | wc -l
> 
> you should get 5005?
> 

Yes, other 4 lines are tracing message's header

>> For sysbench.mutex:
>> cat debugfs/tracing/trace | grep "sched_wakeup_new" | wc -l
>> 5001
>> cat debugfs/tracing/trace | grep "sched_kthread_stop" | wc -l
>> 0
>>
>> And, if only enable sched_kthread_stop event, the sysbench.threads's
>> overload is 5.90%, the sysbench.mutex's overload is 3.36%.
> 
> Could be something that messes with the cache lines?

We don't know the real reason, but we can keep fighting for this issue, once
we get useful information, we will sent to you.

> 
>> It hardly explain why sched_kthread_stop is never called but cause great overload.
> 
> Would be worth running perf against this test, to see where the issues
> are occurring.
> 

We've used 'perf' to tracing it:
perf record -e sched:sched_wakeup_new -e sched:sched_kthread_stop -f -i -a -c 1 sysbench --num-threads=5000 --max-requests=50000 --test=threads --thread-yields=1000 --thread-locks=10000 run

The output is:
# Samples: 5001
#
# Overhead          Command           Shared Object  Symbol
# ........  ...............  ......................  ......
#
    99.98%         sysbench  /lib64/libc-2.10.90.so  [.] clone
     0.02%  perf.2.6.31-38.  /lib64/libc-2.10.90.so  [.] fork
#
# (For a higher level overview, try: perf report --sort comm,dso)
#

The result shows that the most time is costed on clone and fork, and no path to
call sched_kthread_stop.

Xiao

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: sched_wakeup_new and sched_kthread_stop events cause great overload
  2010-04-01  9:37   ` Xiao Guangrong
@ 2010-04-01 16:53     ` Frederic Weisbecker
  0 siblings, 0 replies; 4+ messages in thread
From: Frederic Weisbecker @ 2010-04-01 16:53 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: rostedt, Ingo Molnar, Hitoshi Mitake, Peter Zijlstra, LKML

On Thu, Apr 01, 2010 at 05:37:59PM +0800, Xiao Guangrong wrote:
> Hi Steven,
> 
> Sorry for the delay reply since i'm busy during this time.
> 
> Steven Rostedt wrote:
> 
> >> For sysbench.threads:
> >> cat debugfs/tracing/trace | grep "sched_wakeup_new" | wc -l
> >> 5001
> >> cat debugfs/tracing/trace | grep "sched_kthread_stop" | wc -l
> >> 0
> > 
> > Strange? So if you did:
> > 
> > cat debugfs/tracing/trace | wc -l
> > 
> > you should get 5005?
> > 
> 
> Yes, other 4 lines are tracing message's header
> 
> >> For sysbench.mutex:
> >> cat debugfs/tracing/trace | grep "sched_wakeup_new" | wc -l
> >> 5001
> >> cat debugfs/tracing/trace | grep "sched_kthread_stop" | wc -l
> >> 0
> >>
> >> And, if only enable sched_kthread_stop event, the sysbench.threads's
> >> overload is 5.90%, the sysbench.mutex's overload is 3.36%.
> > 
> > Could be something that messes with the cache lines?
> 
> We don't know the real reason, but we can keep fighting for this issue, once
> we get useful information, we will sent to you.
> 
> > 
> >> It hardly explain why sched_kthread_stop is never called but cause great overload.
> > 
> > Would be worth running perf against this test, to see where the issues
> > are occurring.
> > 
> 
> We've used 'perf' to tracing it:
> perf record -e sched:sched_wakeup_new -e sched:sched_kthread_stop -f -i -a -c 1 sysbench --num-threads=5000 --max-requests=50000 --test=threads --thread-yields=1000 --thread-locks=10000 run


Do you mean you've noticed this overhead with perf only, not
with ftrace, right?



 
> The output is:
> # Samples: 5001
> #
> # Overhead          Command           Shared Object  Symbol
> # ........  ...............  ......................  ......
> #
>     99.98%         sysbench  /lib64/libc-2.10.90.so  [.] clone
>      0.02%  perf.2.6.31-38.  /lib64/libc-2.10.90.so  [.] fork
> #
> # (For a higher level overview, try: perf report --sort comm,dso)
> #
> 
> The result shows that the most time is costed on clone and fork, and no path to
> call sched_kthread_stop.


Could you please add the -g option to you perf record. It would be interesting
to see the callgraphs.
(Redirection to a pipe pass through the newt GUI so you can report the ascii
result).

In perf, even if an event doesn't generate much "events", various things
happen internally due to the fact we have active perf events, especially
on sched out/in and clone and fork.

On clone and fork, the events get inherited from the task parent. With a
high amount of task childs created, this may perfhaps raise an overhead.
But this is not supposed to be the case here because you are using the -a
option, which means the events are not inherited as they are per cpu.


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2010-04-01 15:54 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-03-25  9:36 sched_wakeup_new and sched_kthread_stop events cause great overload Xiao Guangrong
2010-03-25 13:35 ` Steven Rostedt
2010-04-01  9:37   ` Xiao Guangrong
2010-04-01 16:53     ` Frederic Weisbecker

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox