Re: Kswapd in 3.2.0-rc5 is a CPU hog

From: Dave Chinner <david@fromorbit.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: "Nikolay S." <nowhere@hakkenden.ath.cx>,
	Michal Hocko <mhocko@suse.cz>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: Kswapd in 3.2.0-rc5 is a CPU hog
Date: Thu, 29 Dec 2011 08:33:59 +1100
Message-ID: <20111228213359.GF12731@dastard>
In-Reply-To: <20111227134405.9902dcbb.kamezawa.hiroyu@jp.fujitsu.com>

On Tue, Dec 27, 2011 at 01:44:05PM +0900, KAMEZAWA Hiroyuki wrote:
> To me,  it seems kswapd does usual work...reclaim small memory until free
> gets enough. And it seems 'dd' allocates its memory from ZONE_DMA32 because
> of gfp_t fallbacks.
> 
> 
> Memo.
> 
> 1. why shrink_slab() should be called per zone, which is not zone aware.
>    Isn't it enough to call it per priority ?

It is intended that it should be zone aware, but the current
shrinkers only have global LRUs and hence cannot discriminate
between objects from different zones easily. And if only a single
node/zone is being scanned, then we still have to call shrink_slab()
to try to free objects in that zone/node, despite its current
global scope.

I have some prototype patches that make the major slab caches and
shrinkers zone/node aware - that is the eventual goal here - but
all the major slab cache LRUs need to be converted to be node
aware first. Then we can pass a nodemask into shrink_slab() and down
to the shrinkers so that those that have per-node LRUs can scan only
the appropriate nodes for objects to free. This is something that I'm
working on in my spare time, but I have very little of that at the
moment, unfortunately.
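
As a rough illustration of the nodemask idea only (this is not the
prototype patch set, and none of these names exist in the kernel:
struct pernode_lru, shrink_nodes() and MAX_NODES are made up for the
sketch), a shrinker with per-node LRUs could skip every node that is
not set in the mask it is handed:

```c
#define MAX_NODES 4	/* hypothetical node count, for illustration only */

/* Hypothetical per-node LRU: only tracks how many objects each node holds. */
struct pernode_lru {
	unsigned long nr_objects[MAX_NODES];
};

/*
 * Free up to nr_to_scan objects from each node whose bit is set in
 * nodemask (bit n selects node n). Nodes outside the mask are left
 * untouched. Returns the total number of objects freed.
 */
static unsigned long shrink_nodes(struct pernode_lru *lru,
				  unsigned long nodemask,
				  unsigned long nr_to_scan)
{
	unsigned long freed = 0;
	int node;

	for (node = 0; node < MAX_NODES; node++) {
		unsigned long n;

		if (!(nodemask & (1UL << node)))
			continue;	/* node not being reclaimed */

		n = lru->nr_objects[node];
		if (n > nr_to_scan)
			n = nr_to_scan;
		lru->nr_objects[node] -= n;
		freed += n;
	}
	return freed;
}
```

With a single global LRU there is nothing to skip, which is why the
current shrinkers end up doing global work for a per-zone request.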

> 2. what spinlock contention that perf showed ?
>    And if shrink_slab() doesn't consume cpu as trace shows, why perf 
>    says shrink_slab() is heavy..

There isn't any spin lock contention - it's just showing how
expensive locking superblocks is when it's being done every few
microseconds for no good reason.

> 3. because 8/9 of memory is in DMA32, calling shrink_slab() frequently
>    at scanning NORMAL seems to be time wasting.

Especially as the shrink_slab() calls are returning zero pages freed
every single time (i.e. the slab caches are empty). kswapd needs to
back off here, I think, or free more memory at a time. Only freeing
100 pages at a time is pretty inefficient, esp. as we have 4 orders
of magnitude more pages on the LRU and that is consuming >90% of
RAM...
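
One possible shape for such a back-off, purely as a sketch: skip an
exponentially growing number of shrinker calls while they keep
freeing nothing, and reset as soon as one makes progress. This is an
assumption about what "back off" could look like, not what kswapd
does; struct slab_backoff, backoff_should_scan(), backoff_record()
and BACKOFF_MAX are hypothetical names.

```c
#define BACKOFF_MAX 64	/* hypothetical cap on the skip window */

/* Hypothetical back-off state, not a kernel structure. */
struct slab_backoff {
	unsigned int skip;	/* shrinker calls still to be skipped */
	unsigned int next_skip;	/* window applied after the last empty scan */
};

/* Returns 1 if the caller should invoke the shrinker this round. */
static int backoff_should_scan(struct slab_backoff *b)
{
	if (b->skip > 0) {
		b->skip--;
		return 0;
	}
	return 1;
}

/*
 * Record the result of a scan: any progress resets the back-off,
 * an empty scan doubles the skip window up to BACKOFF_MAX.
 */
static void backoff_record(struct slab_backoff *b, unsigned long freed)
{
	if (freed > 0) {
		b->skip = 0;
		b->next_skip = 0;
		return;
	}
	b->next_skip = b->next_skip ? b->next_skip * 2 : 1;
	if (b->next_skip > BACKOFF_MAX)
		b->next_skip = BACKOFF_MAX;
	b->skip = b->next_skip;
}
```

The caller would check backoff_should_scan() before each shrinker
call and feed the freed count to backoff_record() afterwards.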

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
