From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1755890Ab3AWOgV (ORCPT ); Wed, 23 Jan 2013 09:36:21 -0500
Received: from mx2.parallels.com ([64.131.90.16]:47957 "EHLO
        mx2.parallels.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1750845Ab3AWOgT (ORCPT );
        Wed, 23 Jan 2013 09:36:19 -0500
Message-ID: <50FFF571.8080506@parallels.com>
Date: Wed, 23 Jan 2013 18:36:33 +0400
From: Glauber Costa
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2
MIME-Version: 1.0
To: Dave Chinner
CC: , , , , Johannes Weiner
Subject: Re: [RFC, PATCH 00/19] Numa aware LRU lists and shrinkers
References: <1354058086-27937-1-git-send-email-david@fromorbit.com> <50FD6815.90900@parallels.com> <20130121232121.GG2498@dastard>
In-Reply-To: <20130121232121.GG2498@dastard>
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 01/22/2013 03:21 AM, Dave Chinner wrote:
> On Mon, Jan 21, 2013 at 08:08:53PM +0400, Glauber Costa wrote:
>> On 11/28/2012 03:14 AM, Dave Chinner wrote:
>>> [PATCH 09/19] list_lru: per-node list infrastructure
>>>
>>> This makes the generic LRU list much more scalable by changing it to
>>> a {list,lock,count} tuple per node. There are no external API
>>> changes to this changeover, so is transparent to current users.
>>>
>>> [PATCH 10/19] shrinker: add node awareness
>>> [PATCH 11/19] fs: convert inode and dentry shrinking to be node
>>>
>>> Adds a nodemask to the struct shrink_control for callers of
>>> shrink_slab to set appropriately for their reclaim context. This
>>> nodemask is then passed by the inode and dentry cache reclaim code
>>> to the generic LRU list code to implement node aware shrinking.
>>
>> I have a follow-up question that popped up from a discussion between me
>> and my very American friend Johnny Wheeler, also known as Johannes
>> Weiner (CC'd). I actually remember us discussing this, but don't fully
>> remember the outcome. And since I can't find it anywhere, it must have
>> been in a medium other than e-mail. So I thought it would do no harm in
>> at least documenting it...
>>
>> Why are we doing this per-node, instead of per-zone?
>>
>> It seems to me that the goal is to collapse all zones of a node into a
>> single list, but since the number of zones is not terribly larger than
>> the number of nodes, and zones is where the pressure comes from, what do
>> we really gain from this?
>
> The number is quite a bit higher - there are platforms with 5 zones
> to a node. The reality is, though, for most platforms slab
> allocations come from a single zone - they never come from ZONE_DMA,
> ZONE_HIGHMEM or ZONE_MOVABLE, so there is no good reason
> for having cache LRUs for these zones. So, two zones at most.
>

Yes, but one would expect that most of those special zones would be
present only in the first node, no? (correct me if I am wrong here).

Beyond that, things should be pretty much addressable.
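
For reference, here is a rough userspace sketch of the per-node
{list,lock,count} layout and the nodemask-driven shrink being discussed.
It is not the code from the patch set: MAX_NODES, struct node_lru and the
node_lru_* helpers are hypothetical names, a plain unsigned long stands in
for the kernel's nodemask_t, and a C11 atomic_flag spinlock stands in for
the kernel spinlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_NODES 4	/* hypothetical; stands in for MAX_NUMNODES */

struct lru_item {
	struct lru_item *prev, *next;
	int nid;		/* node the object's memory lives on */
};

/* One {list, lock, count} tuple per node. */
struct lru_node {
	struct lru_item head;	/* sentinel of a circular list */
	atomic_flag lock;	/* toy spinlock, illustration only */
	long nr_items;
};

struct node_lru {
	struct lru_node node[MAX_NODES];
};

static void node_lock(struct lru_node *n)
{
	while (atomic_flag_test_and_set(&n->lock))
		;		/* spin until the flag was clear */
}

static void node_unlock(struct lru_node *n)
{
	atomic_flag_clear(&n->lock);
}

static void node_lru_init(struct node_lru *lru)
{
	for (int i = 0; i < MAX_NODES; i++) {
		struct lru_node *n = &lru->node[i];

		n->head.prev = n->head.next = &n->head;
		atomic_flag_clear(&n->lock);
		n->nr_items = 0;
	}
}

/* Add an item to the tail of the list belonging to its own node. */
static void node_lru_add(struct node_lru *lru, struct lru_item *item)
{
	assert(item->nid >= 0 && item->nid < MAX_NODES);
	struct lru_node *n = &lru->node[item->nid];

	node_lock(n);
	item->prev = n->head.prev;
	item->next = &n->head;
	n->head.prev->next = item;
	n->head.prev = item;
	n->nr_items++;
	node_unlock(n);
}

/* Count items on the nodes selected by nodemask (bit i = node i). */
static long node_lru_count(struct node_lru *lru, unsigned long nodemask)
{
	long total = 0;

	for (int i = 0; i < MAX_NODES; i++) {
		if (!(nodemask & (1UL << i)))
			continue;
		node_lock(&lru->node[i]);
		total += lru->node[i].nr_items;
		node_unlock(&lru->node[i]);
	}
	return total;
}

/* Unlink up to nr_to_scan of the oldest items, only on selected nodes. */
static long node_lru_shrink(struct node_lru *lru, unsigned long nodemask,
			    long nr_to_scan)
{
	long freed = 0;

	for (int i = 0; i < MAX_NODES && freed < nr_to_scan; i++) {
		struct lru_node *n = &lru->node[i];

		if (!(nodemask & (1UL << i)))
			continue;
		node_lock(n);
		while (n->nr_items > 0 && freed < nr_to_scan) {
			struct lru_item *victim = n->head.next;

			victim->prev->next = victim->next;
			victim->next->prev = victim->prev;
			victim->prev = victim->next = victim;
			n->nr_items--;
			freed++;
		}
		node_unlock(n);
	}
	return freed;
}
```

In this sketch, shrinking with a mask that selects only node 1 leaves the
node 0 list and its lock untouched. A per-zone variant would index the same
tuple by (node, zone) instead of by node alone.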