linux-mm.kvack.org archive mirror
From: "Jim Schutt" <jaschut@sandia.gov>
To: Mel Gorman <mel@csn.ul.ie>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Rik van Riel <riel@redhat.com>, Minchan Kim <minchan@kernel.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	"ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>
Subject: Re: excessive CPU utilization by isolate_freepages?
Date: Thu, 28 Jun 2012 09:30:00 -0600	[thread overview]
Message-ID: <4FEC7878.6000906@sandia.gov> (raw)
In-Reply-To: <20120628113619.GR8103@csn.ul.ie>

On 06/28/2012 05:36 AM, Mel Gorman wrote:
> On Wed, Jun 27, 2012 at 03:59:19PM -0600, Jim Schutt wrote:
>> Hi,
>>
>> I'm running into trouble with systems going unresponsive,
>> and perf suggests it's excessive CPU usage by isolate_freepages().
>> I'm currently testing 3.5-rc4, but I think this problem may have
>> first shown up in 3.4.  I'm only just learning how to use perf,
>> so I only currently have results to report for 3.5-rc4.
>>
>
> Out of curiosity, why do you think it showed up in 3.4? It's not
> surprising as such if it did show up there but I'm wondering what you
> are basing it on.

If I remember correctly, it was while testing this workload on 3.4 that
I first saw hundreds of runnable threads reported by vmstat.
At that time I couldn't reproduce the problem as reliably, and I didn't
know how to get perf to give me call chains, so I didn't follow up :(

>
> It's not a surprise because it's also where reclaim/compaction stopped
> depending on lumpy reclaim. In the past we would have reclaimed more
> pages but now we rely on compaction more. It's plausible that with many
> parallel compactions there would be higher CPU usage now.
>
>> <SNIP>
>> 2012-06-27 14:00:03.219-06:00
>> vmstat -w 4 16
>> procs -------------------memory------------------ ---swap-- -----io---- --system-- -----cpu-------
>>   r  b       swpd       free       buff      cache   si   so    bi    bo   in   cs  us sy  id wa st
>> 75  1          0     566988        576   35664800    0    0     2  1355   21    3   1  4  95  0  0
>> 433  1          0     964052        576   35069112    0    0     7 456359 102256 20901   2 98   0  0  0
>> 547  3          0     820116        576   34893932    0    0    57 560507 114878 28115   3 96   0  0  0
>> 806  2          0     606992        576   34848180    0    0   339 309668 101230 21056   2 98   0  0  0
>> 708  1          0     529624        576   34708000    0    0   248 370886 101327 20062   2 97   0  0  0
>> 231  5          0     504772        576   34663880    0    0   305 334824 95045 20407   2 97   1  1  0
>> 158  6          0    1063088        576   33518536    0    0   531 847435 130696 47140   4 92   1  2  0
>> 193  0          0    1449156        576   33035572    0    0   363 371279 94470 18955   2 96   1  1  0
>> 266  6          0    1623512        576   32728164    0    0    77 241114 95730 15483   2 98   0  0  0
>> 243  8          0    1629504        576   32653080    0    0    81 471018 100223 20920   3 96   0  1  0
>> 70 11          0    1342140        576   33084020    0    0   100 925869 139876 56599   6 88   3  3  0
>> 211  7          0    1130316        576   33470432    0    0   290 1008984 150699 74320   6 83   6  5  0
>> 365  3          0     776736        576   34072772    0    0   182 747167 139436 67135   5 88   4  3  0
>> 29  1          0    1528412        576   34110640    0    0    50 612181 137403 77609   4 87   6  3  0
>> 266  5          0    1657688        576   34105696    0    0     3 258307 62879 38508   2 93   3  2  0
>> 1159  2          0    2002256        576   33775476    0    0    19 88554 42112 14230   1 98   0  0  0
>>
>
> ok, so System CPU usage through the roof.
>
>>
>> Right around 14:00 I was able to get a "perf record -a -g"; here's the
>> beginning of what "perf report --sort symbol --call-graph fractal,5"
>> had to say:
>>
>> #
>>      64.86%  [k] _raw_spin_lock_irqsave
>>              |
>>              |--97.94%-- isolate_freepages
>>              |          compaction_alloc
>>              |          unmap_and_move
>>              |          migrate_pages
>>              |          compact_zone
>>              |          |
>>              |          |--99.56%-- try_to_compact_pages
>>              |          |          __alloc_pages_direct_compact
>>              |          |          __alloc_pages_slowpath
>>              |          |          __alloc_pages_nodemask
>>              |          |          alloc_pages_vma
>>              |          |          do_huge_pmd_anonymous_page
>>              |          |          handle_mm_fault
>>              |          |          do_page_fault
>>              |          |          page_fault
>>              |          |          |
>>              |          |          |--53.53%-- skb_copy_datagram_iovec
>>              |          |          |          tcp_recvmsg
>>              |          |          |          inet_recvmsg
>>              |          |          |          sock_recvmsg
>>              |          |          |          sys_recvfrom
>>              |          |          |          system_call_fastpath
>>              |          |          |          __recv
>>              |          |          |          |
>>              |          |          |           --100.00%-- (nil)
>>              |          |          |
>>              |          |          |--27.80%-- __pthread_create_2_1
>>              |          |          |          (nil)
>>              |          |          |
>>              |          |           --18.67%-- memcpy
>>              |          |                     |
>>              |          |                     |--57.38%-- 0x50d000005
>>              |          |                     |
>>              |          |                     |--34.52%-- 0x3b300bf271940a35
>>              |          |                     |
>>              |          |                      --8.10%-- 0x1500000000000009
>>              |           --0.44%-- [...]
>>               --2.06%-- [...]
>>
>
> This looks like lock contention to me on zone->lock which
> isolate_freepages takes and releases frequently. You do not describe the
> exact memory layout but it's likely that there are two very large zones
> with 12 CPUs each. If they all were running compaction they would pound
> zone->lock pretty heavily.

Does this tell you what you need to know?
If not, let me know what you'd like to see.

# grep -H MemTotal /sys/devices/system/node/node*/meminfo
/sys/devices/system/node/node0/meminfo:Node 0 MemTotal:       25156644 kB
/sys/devices/system/node/node1/meminfo:Node 1 MemTotal:       25165824 kB
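
For anyone following along, here is my rough mental model of the loop
Mel is describing. This is a simplified sketch, not the actual
mm/compaction.c source, and the details surely differ by kernel version:

    #include <linux/mmzone.h>
    #include <linux/spinlock.h>

    /*
     * Sketch: the compaction free scanner walks pageblocks from the
     * end of the zone, taking zone->lock with IRQs disabled for each
     * block it scans.  With ~12 CPUs per zone all faulting THP pages
     * and compacting at once, every CPU hammers the same spinlock,
     * which would show up exactly as _raw_spin_lock_irqsave above.
     */
    static void free_scanner_sketch(struct zone *zone)
    {
            unsigned long end_pfn = zone->zone_start_pfn + zone->spanned_pages;
            unsigned long flags;
            unsigned long pfn;

            for (pfn = end_pfn; pfn > zone->zone_start_pfn;
                 pfn -= pageblock_nr_pages) {
                    spin_lock_irqsave(&zone->lock, flags);
                    /* ... isolate free pages in this pageblock ... */
                    spin_unlock_irqrestore(&zone->lock, flags);
            }
    }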

>
>> <SNIP>
>
> The other call traces also look like they are pounding zone->lock
> heavily.
>
> Rik's patch has the potential to reduce contention by virtue of the fact
> that less scanning is required. I'd be interested in hearing how much of
> an impact that patch has so please test that first.

Working on that now....
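
(As I understand Rik's patch, the idea is to remember where the free
scanner left off and resume there on the next compaction run, instead
of rescanning from the end of the zone every time.  Something like the
sketch below, where 'compact_cached_free_pfn' is my hypothetical name
for the cached position, not necessarily what the patch calls it:)

    /*
     * Sketch only: assumes a new cached-position field added to
     * struct zone; the real patch may store and reset this
     * differently.
     */
    static unsigned long free_scan_start_sketch(struct zone *zone)
    {
            unsigned long end_pfn = zone->zone_start_pfn + zone->spanned_pages;

            /* First run, or the cached position is stale: start over. */
            if (!zone->compact_cached_free_pfn ||
                zone->compact_cached_free_pfn > end_pfn)
                    zone->compact_cached_free_pfn = end_pfn;

            return zone->compact_cached_free_pfn;
    }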

>
> If that approach does not work I'll put together a patch that backs
> off compaction when zone->lock is contended.
>
>> I seem to be able to recreate this issue at will, so please
>> let me know what I can do to help learn what is going on.
>>
>
> Thanks very much for testing.
>

No problem.  Thanks to you all for jumping on this so quickly.
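
If a lock-contention backoff patch materializes, I'm happy to test that
as well.  Naively I'd picture something like the sketch below (purely
illustrative; compact_trylock_irqsave() is a made-up helper name, and
the real patch will no doubt differ):

    #include <linux/sched.h>
    #include <linux/spinlock.h>

    /*
     * Sketch: only take zone->lock if nobody else is spinning on it;
     * otherwise yield and let the caller abort this scan rather than
     * pile up behind the other CPUs.
     */
    static bool compact_trylock_irqsave(spinlock_t *lock, unsigned long *flags)
    {
            if (need_resched() || spin_is_contended(lock)) {
                    cond_resched();
                    return false;   /* caller backs off */
            }

            spin_lock_irqsave(lock, *flags);
            return true;
    }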

FWIW, perf rocks!  Thanks to all who've helped make it so great!

-- Jim


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <dont@kvack.org>

Thread overview: 12+ messages
2012-06-27 21:59 excessive CPU utilization by isolate_freepages? Jim Schutt
2012-06-27 22:35 ` Mark Nelson
2012-06-27 23:59 ` Minchan Kim
2012-06-28  0:28   ` Rik van Riel
2012-06-28  0:52     ` David Rientjes
2012-06-28  0:58       ` Minchan Kim
2012-06-28  1:06         ` David Rientjes
2012-06-28  1:18           ` Minchan Kim
2012-06-28  1:13       ` Rik van Riel
2012-06-28  0:55     ` Minchan Kim
2012-06-28 11:36 ` Mel Gorman
2012-06-28 15:30   ` Jim Schutt [this message]
