From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753037Ab1GUQKK (ORCPT ); Thu, 21 Jul 2011 12:10:10 -0400
Received: from cantor2.suse.de ([195.135.220.15]:50277 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752762Ab1GUQKH (ORCPT ); Thu, 21 Jul 2011 12:10:07 -0400
Date: Thu, 21 Jul 2011 17:09:59 +0100
From: Mel Gorman
To: Minchan Kim
Cc: Andrew Morton, Pádraig Brady, James Bottomley, Colin King,
	Andrew Lutomirski, Rik van Riel, Johannes Weiner, linux-mm,
	linux-kernel
Subject: Re: [PATCH 0/4] Stop kswapd consuming 100% CPU when highest zone is small
Message-ID: <20110721160958.GT5349@suse.de>
References: <1308926697-22475-1-git-send-email-mgorman@suse.de>
	<20110721153722.GD1713@barrios-desktop>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-15
Content-Disposition: inline
In-Reply-To: <20110721153722.GD1713@barrios-desktop>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jul 22, 2011 at 12:37:22AM +0900, Minchan Kim wrote:
> On Fri, Jun 24, 2011 at 03:44:53PM +0100, Mel Gorman wrote:
> > (Built this time and passed a basic sniff-test.)
> >
> > During allocator-intensive workloads, kswapd will be woken frequently
> > causing free memory to oscillate between the high and min watermark.
> > This is expected behaviour. Unfortunately, if the highest zone is
> > small, a problem occurs.
> >
> > This seems to happen most with recent sandybridge laptops but it's
> > probably a co-incidence as some of these laptops just happen to have
> > a small Normal zone. The reproduction case is almost always during
> > copying large files that kswapd pegs at 100% CPU until the file is
> > deleted or cache is dropped.
> >
> > The problem is mostly down to sleeping_prematurely() keeping kswapd
> > awake when the highest zone is small and unreclaimable and compounded
> > by the fact we shrink slabs even when not shrinking zones causing a lot
> > of time to be spent in shrinkers and a lot of memory to be reclaimed.
> >
> > Patch 1 corrects sleeping_prematurely to check the zones matching
> > the classzone_idx instead of all zones.
> >
> > Patch 2 avoids shrinking slab when we are not shrinking a zone.
> >
> > Patch 3 notes that sleeping_prematurely is checking lower zones against
> > a high classzone which is not what allocators or balance_pgdat()
> > is doing leading to an artificial belief that kswapd should be
> > still awake.
> >
> > Patch 4 notes that when balance_pgdat() gives up on a high zone that the
> > decision is not communicated to sleeping_prematurely()
> >
> > This problem affects 2.6.38.8 for certain and is expected to affect
> > 2.6.39 and 3.0-rc4 as well. If accepted, they need to go to -stable
> > to be picked up by distros and this series is against 3.0-rc4. I've
> > cc'd people that reported similar problems recently to see if they
> > still suffer from the problem and if this fixes it.
> >
>
> Good!
> This patch solved the problem.
> But there is still a mystery.
>
> In the log, we could see excessive shrink_slab calls.

Yes, because shrink_slab() was called on each loop through
balance_pgdat() even if the zone was balanced.

> And as you know, we had merged a patch which adds cond_resched() at the
> end of shrink_slab(). So other tasks should get the CPU and we should
> not see 100% CPU of kswapd, I think.
>

cond_resched() is not a substitute for going to sleep.

-- 
Mel Gorman
SUSE Labs
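
To illustrate the last point outside the kernel, here is a minimal
user-space sketch in C. It is only an analogy and not kernel code:
sched_yield() stands in for cond_resched(), nanosleep() stands in for
kswapd going to sleep, and the names cpu_used_yielding() and
cpu_used_sleeping() are made up for this example. A loop that yields on
every pass lets other runnable tasks in, but when nothing else wants
the CPU it is picked again straight away and keeps accumulating CPU
time. A task that sleeps accumulates almost none.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>
#include <time.h>

/* Read a POSIX clock as seconds in a double. */
static double clock_seconds(clockid_t id)
{
	struct timespec ts;

	clock_gettime(id, &ts);
	return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/*
 * Hypothetical helper: spin for wall_secs of wall-clock time, yielding
 * the CPU on every iteration (user-space stand-in for a loop that only
 * calls cond_resched()). Returns the CPU time this thread consumed.
 */
double cpu_used_yielding(double wall_secs)
{
	assert(wall_secs >= 0.0);

	double wall_start = clock_seconds(CLOCK_MONOTONIC);
	double cpu_start = clock_seconds(CLOCK_THREAD_CPUTIME_ID);

	while (clock_seconds(CLOCK_MONOTONIC) - wall_start < wall_secs)
		sched_yield();	/* offer the CPU, but remain runnable */

	return clock_seconds(CLOCK_THREAD_CPUTIME_ID) - cpu_start;
}

/*
 * Hypothetical helper: block for wall_secs (stand-in for actually
 * going to sleep). Returns the CPU time this thread consumed.
 */
double cpu_used_sleeping(double wall_secs)
{
	assert(wall_secs >= 0.0);

	double cpu_start = clock_seconds(CLOCK_THREAD_CPUTIME_ID);
	struct timespec req;

	req.tv_sec = (time_t)wall_secs;
	req.tv_nsec = (long)((wall_secs - (double)req.tv_sec) * 1e9);
	nanosleep(&req, NULL);	/* not runnable until the timer fires */

	return clock_seconds(CLOCK_THREAD_CPUTIME_ID) - cpu_start;
}
```

Calling both helpers with the same interval, say 0.2 seconds, and
comparing the return values should show the yielding loop charged with
far more CPU time than the sleeping one, which is roughly what top
reports as a pegged kswapd.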