From: Andy Lutomirski <luto@mit.edu>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Suresh Jayaraman <sjayaraman@suse.de>,
LKML <linux-kernel@vger.kernel.org>, Ingo Molnar <mingo@elte.hu>
Subject: Re: High priority threads causing severe CPU load imbalances
Date: Wed, 07 Apr 2010 00:42:06 -0400
Message-ID: <4BBC0D1E.3030509@mit.edu>
In-Reply-To: <1270562890.1595.438.camel@laptop>
Peter Zijlstra wrote:
> On Tue, 2010-04-06 at 18:42 +0530, Suresh Jayaraman wrote:
>> I have a simple test program that accepts the number of threads (pthreads)
>> to be created as an input. Each of these threads invokes a function that
>> is just an infinite while loop. The main function, after creating those
>> threads, goes into an infinite loop itself.
>>
>> My test machine is a Dual Core AMD Opteron(tm) 860 with 8
>> sockets (non-HT). I run this test program with number of threads ==
>> number of CPUs:
>>
>> ./loadcpu -t 16
>>
>> I see 100% CPU utilization on almost all CPUs (via mpstat/htop/vmstat).
>>
>> When the above threads are running, if I introduce a few high priority
>> threads by doing:
>>
>> nice -n -13 ./loadcpu -t 3
>>
>> After a short while, I see a few CPUs becoming idle at ~0% utilization
>> (the number of CPUs becoming idle equals roughly the number of high
>> priority threads i.e. 3). When I stop the high priority threads, the CPU
>> utilization comes back to normal i.e. ~100%.
>>
>> This is reproducible on the 2.6.32.10 stable kernel with all the recent
>> SMT fixes (I hope) and I think it would be reproducible in current
>> upstream as well.
>
> Why bother using -stable for reporting bugs?
>
>> sched_mc_power_savings has always been set to 0.
>>
>> I spent a while staring at the load balancing and the thread migration
>> code, but could not figure out why this is happening. Would appreciate
>> any pointers.
>
> Right, except it's not a severe imbalance as the subject suggests. For
> some reason it seems to end up in a semi-stable state that is actually
> quite balanced.
>
> for ((i=0; i<8; i++)) do while :; do :; done & done
> for ((i=0; i<3; i++)) do while :; do :; done & renice -n -15 -p $! ;
> done
>
> gets me:
>
> Cpu0 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu1 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu2 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu3 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu4 : 99.0%us, 1.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu5 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu6 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu7 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Mem: 16440840k total, 1073672k used, 15367168k free, 105844k buffers
> Swap: 16777212k total, 0k used, 16777212k free, 296504k cached
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 4370 root 5 -15 105m 804 304 R 100.1 0.0 0:45.02 bash
> 4374 root 5 -15 105m 804 304 R 100.1 0.0 0:44.95 bash
> 4372 root 5 -15 105m 804 304 R 99.1 0.0 0:45.00 bash
> 4364 root 20 0 105m 804 304 R 51.0 0.0 0:33.06 bash
> 4362 root 20 0 105m 800 300 R 50.0 0.0 0:33.17 bash
> 4365 root 20 0 105m 804 304 R 50.0 0.0 0:33.75 bash
> 4368 root 20 0 105m 804 304 R 50.0 0.0 0:33.32 bash
> 4369 root 20 0 105m 804 304 R 50.0 0.0 0:33.38 bash
> 4363 root 20 0 105m 804 304 R 49.1 0.0 0:33.65 bash
> 4366 root 20 0 105m 804 304 R 49.1 0.0 0:33.29 bash
> 4367 root 20 0 105m 804 304 R 49.1 0.0 0:33.54 bash
>
> So we have the 3 -15 loops on a cpu each, and the 8 0 loops on 2 cpus
> each, and 1 cpu idle. That is actually quite balanced, 'better' would be
> if those 0 loops would rotate over the 5 available cpus, but that would
> also trash more caches I guess.
What's wrong with having the three -15 loops each get a CPU, having six
of the remaining 0 loops get half a CPU each, and letting the last two
get their own CPUs? That's less fair, but strictly better than the
current solution, and nothing bounces.
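To put rough numbers on the comparison, here is a minimal sketch that only adds up CPU shares. The helper name total_cpu_percent and its "<loops>x<percent>" argument form are made up for illustration, and the percentages are rounded from the top output quoted above (about 100% for each -15 loop, about 50% for each 0 loop). It models nothing about how the load balancer would actually reach either placement.

```shell
#!/usr/bin/env bash
# Sum the CPU time consumed under a given placement.
# Each argument is "<loops>x<percent>", e.g. 8x50 = eight loops at 50% each.
# (Hypothetical helper, for illustration only.)
total_cpu_percent() {
    local total=0 spec loops pct
    for spec in "$@"; do
        loops=${spec%x*}   # part before the "x"
        pct=${spec#*x}     # part after the "x"
        total=$(( total + loops * pct ))
    done
    echo "$total"
}

# Observed: three -15 loops at ~100%, eight 0 loops at ~50%, one CPU idle.
observed=$(total_cpu_percent 3x100 8x50)

# Proposed: three -15 loops at 100%, six 0 loops at 50%, two 0 loops at 100%.
proposed=$(total_cpu_percent 3x100 6x50 2x100)

echo "observed: ${observed}% of 800%"
echo "proposed: ${proposed}% of 800%"
```

Under these assumptions the observed placement uses 700% of the 800% available on 8 CPUs, while the proposed one uses all 800%. No loop gets less than it does now, and two of the 0 loops get twice as much.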
--Andy