From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from qmta09.emeryville.ca.mail.comcast.net (qmta09.emeryville.ca.mail.comcast.net [IPv6:2001:558:fe2d:43:76:96:30:96])
	by ozlabs.org (Postfix) with ESMTP id E4EDF2C01AE
	for ; Fri, 31 Jan 2014 03:26:56 +1100 (EST)
Date: Thu, 30 Jan 2014 10:26:51 -0600 (CST)
From: Christoph Lameter
To: Nishanth Aravamudan
Subject: Re: [PATCH] slub: Don't throw away partial remote slabs if there is no local memory
In-Reply-To: <20140129223640.GA10101@linux.vnet.ibm.com>
Message-ID:
References: <52e1da8f.86f7440a.120f.25f3SMTPIN_ADDED_BROKEN@mx.google.com>
	<20140124232902.GB30361@linux.vnet.ibm.com>
	<20140125001643.GA25344@linux.vnet.ibm.com>
	<20140125011041.GB25344@linux.vnet.ibm.com>
	<20140127055805.GA2471@lge.com>
	<20140128182947.GA1591@linux.vnet.ibm.com>
	<20140129223640.GA10101@linux.vnet.ibm.com>
Content-Type: TEXT/PLAIN; charset=US-ASCII
Cc: Han Pingtian , mpm@selenic.com, penberg@kernel.org, linux-mm@kvack.org,
	cody@linux.vnet.ibm.com, paulus@samba.org, Anton Blanchard ,
	David Rientjes , Joonsoo Kim , linuxppc-dev@lists.ozlabs.org,
	Wanpeng Li
List-Id: Linux on PowerPC Developers Mail List
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,

On Wed, 29 Jan 2014, Nishanth Aravamudan wrote:

> exactly what the caller intends.
>
> int searchnode = node;
> if (node == NUMA_NO_NODE)
> 	searchnode = numa_mem_id();
> if (!node_present_pages(node))
> 	searchnode = local_memory_node(node);
>
> The difference in semantics from the previous is that here, if we have a
> memoryless node, rather than using the CPU's nearest NUMA node, we use
> the NUMA node closest to the requested one?

The idea here is that the page allocator will do the fallback to other
nodes. This check for !node_present should not be necessary. SLUB needs to
accept the page from whatever node the page allocator returned and work
with that.

The problem is that the check for having a slab from the "right" node may
fail again after another attempt to allocate from the same node. SLUB will
then push the slab from the *wrong* node back to the partial lists and may
attempt another allocation that will again succeed but return memory from
another node. That way the partial lists for a particular node grow
uselessly.

One way to solve this may be to check whether memory was actually allocated
from the requested node and fall back to NUMA_NO_NODE (which will use the
last allocated slab) for future allocations if the page allocator returned
memory from a different node (unless __GFP_THISNODE is set, of course).
Otherwise we end up replicating the page allocator logic in slub like in
slab. That is what I wanted to avoid.
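
To make the proposal concrete, here is a rough userspace sketch of the check
described above. It is not SLUB code. The node table, toy_alloc_page_node()
and next_search_node() are hypothetical names invented for illustration, and
the "page allocator" is a stand-in that simply falls back to the next node
with memory. Only NUMA_NO_NODE and the idea of honoring a strict this-node
request come from the discussion.

```c
#include <assert.h>
#include <stdbool.h>

#define NUMA_NO_NODE (-1)
#define MAX_NODES 4

/* Hypothetical topology: nodes 0 and 2 are memoryless. */
static const bool node_has_memory[MAX_NODES] = { false, true, false, true };

/*
 * Toy stand-in for the page allocator. Returns the node the memory
 * actually came from, or NUMA_NO_NODE if nothing could be allocated.
 * With this_node_only set (modelling a strict this-node request) it
 * never falls back to another node.
 */
static int toy_alloc_page_node(int node, bool this_node_only)
{
	assert(node >= 0 && node < MAX_NODES);

	if (node_has_memory[node])
		return node;
	if (this_node_only)
		return NUMA_NO_NODE;

	/* Fallback: take the next node (in index order) that has memory. */
	for (int i = 1; i < MAX_NODES; i++) {
		int candidate = (node + i) % MAX_NODES;

		if (node_has_memory[candidate])
			return candidate;
	}
	return NUMA_NO_NODE;
}

/*
 * Pick the node to search on future allocations. If the allocator
 * handed back memory from a different node than requested, stop
 * insisting on the requested node and use NUMA_NO_NODE instead, so
 * the slab that was just obtained is not pushed back to a partial
 * list. A strict this-node request is left untouched.
 */
static int next_search_node(int requested, int actual, bool this_node_only)
{
	if (requested == NUMA_NO_NODE || this_node_only)
		return requested;
	if (actual != requested)
		return NUMA_NO_NODE;
	return requested;
}
```

In this model a request for memoryless node 0 is served from node 1, so
later requests search with NUMA_NO_NODE and accept whatever slab is current.
A request for node 1 is served locally and keeps its node preference.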