public inbox for linux-kernel@vger.kernel.org
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Jens Axboe <jens.axboe@oracle.com>
Cc: Linux Kernel <linux-kernel@vger.kernel.org>, mingo@elte.hu
Subject: Re: find_busiest_group using lots of CPU
Date: Mon, 05 Oct 2009 14:31:38 +0200
Message-ID: <1254745898.26976.52.camel@twins>
In-Reply-To: <20090930081811.GP23126@kernel.dk>

On Wed, 2009-09-30 at 10:18 +0200, Jens Axboe wrote:
> Hi,
> 
> I stuffed a few more SSDs into my test box. Running a simple workload
> that just does streaming reads from 10 processes (throughput is around
> 2.2GB/sec), find_busiest_group() is using > 10% of the CPU time. This is
> a 64 thread box.
> 
> The top two profile entries are:
> 
>     10.86%      fio  [kernel]                [k] find_busiest_group
>                 |          
>                 |--99.91%-- thread_return
>                 |          io_schedule
>                 |          sys_io_getevents
>                 |          system_call_fastpath
>                 |          0x7f4b50b61604
>                 |          |          
>                 |           --100.00%-- td_io_getevents
>                 |                     io_u_queued_complete
>                 |                     thread_main
>                 |                     run_threads
>                 |                     main
>                 |                     __libc_start_main
>                  --0.09%-- [...]
> 
>      5.78%      fio  [kernel]                [k] cpumask_next_and
>                 |          
>                 |--67.21%-- thread_return
>                 |          io_schedule
>                 |          sys_io_getevents
>                 |          system_call_fastpath
>                 |          0x7f4b50b61604
>                 |          |          
>                 |           --100.00%-- td_io_getevents
>                 |                     io_u_queued_complete
>                 |                     thread_main
>                 |                     run_threads
>                 |                     main
>                 |                     __libc_start_main
>                 |          
>                  --32.79%-- find_busiest_group
>                            thread_return
>                            io_schedule
>                            sys_io_getevents
>                            system_call_fastpath
>                            0x7f4b50b61604
>                            |          
>                             --100.00%-- td_io_getevents
>                                       io_u_queued_complete
>                                       thread_main
>                                       run_threads
>                                       main
>                                       __libc_start_main
> 
> This is with SCHED_DEBUG=y and SCHEDSTATS=y enabled, I just tried with
> both disabled but that yields the same result (well actually worse, 22%
> spent in there. dunno if that's normal "fluctuation"). GROUP_SCHED is
> not set. This seems way excessive!

Going from io_schedule() straight into find_busiest_group() makes me
think this could be SD_BALANCE_NEWIDLE. Does something like:

# clear bit 0x02 in every sched domain's flags (needs root)
for i in /proc/sys/kernel/sched_domain/cpu*/domain*/flags;
do
	val=$(cat "$i"); echo $((val & ~0x02)) > "$i";
done

[ assuming SCHED_DEBUG=y ]

Cure things?

If so, then it's spending time looking for work, of which there might
not be any on your machine, since everything is waiting for IO or some such.
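
A minimal sketch for making the experiment above reversible: record the
current flags first, clear the bit, and restore afterwards. The function
names and the SCHED_DOMAIN_ROOT variable are hypothetical, introduced only
for illustration; the /proc path and the 0x02 mask are the ones from the
loop above.

```shell
#!/bin/sh
# Sketch only: SCHED_DOMAIN_ROOT and the function names are hypothetical.
# The default path is the one used in the loop above (needs SCHED_DEBUG=y).
SCHED_DOMAIN_ROOT=${SCHED_DOMAIN_ROOT:-/proc/sys/kernel/sched_domain}

# Print "path value" for every flags file so the values can be restored.
save_flags() {
	for f in "$SCHED_DOMAIN_ROOT"/cpu*/domain*/flags; do
		[ -f "$f" ] || continue
		echo "$f $(cat "$f")"
	done
}

# Clear the bits given in $1 (e.g. 0x02) in every flags file.
clear_flag() {
	mask=$1
	for f in "$SCHED_DOMAIN_ROOT"/cpu*/domain*/flags; do
		[ -f "$f" ] || continue
		val=$(cat "$f")
		echo $((val & ~$mask)) > "$f"
	done
}

# Write back the values recorded by save_flags (file name in $1).
restore_flags() {
	while read -r f val; do
		echo "$val" > "$f"
	done < "$1"
}
```

Possible usage, as root: `save_flags > /tmp/sched_flags.orig`, then
`clear_flag 0x02`, rerun the workload, and finally
`restore_flags /tmp/sched_flags.orig`.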

Not really sure what to do about it though. This is a quad-socket
Nehalem, right? We could possibly disable SD_BALANCE_NEWIDLE on the NODE
level, but that would again decrease throughput in things like kbuild.
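
To try the NODE-only variant at runtime before changing any kernel default,
something like the following might work. It is a sketch under assumptions:
it assumes each domain directory exposes a "name" file next to "flags" and
that the node-level domain is named NODE, neither of which is stated in
this thread. The function name is hypothetical.

```shell
#!/bin/sh
# Sketch only: clear a flag bit in domains whose "name" matches a level.
# Assumes a "name" file exists beside "flags" in each domain directory.
# Arguments: root directory, level name, bit mask.
clear_flag_at_level() {
	root=$1
	level=$2
	mask=$3
	for d in "$root"/cpu*/domain*; do
		# Skip anything that lacks either file.
		[ -f "$d/name" ] && [ -f "$d/flags" ] || continue
		# Leave domains at other levels untouched.
		[ "$(cat "$d/name")" = "$level" ] || continue
		val=$(cat "$d/flags")
		echo $((val & ~$mask)) > "$d/flags"
	done
}
```

Possible usage, as root:
`clear_flag_at_level /proc/sys/kernel/sched_domain NODE 0x02`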


