From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Wed, 23 Jan 2008 19:13:35 -0800 (PST)
From: Christoph Lameter
Subject: Re: [PATCH] Fix boot problem in situations where the boot CPU is
 running on a memoryless node
In-Reply-To: <20080123213637.GE3848@us.ibm.com>
References: <20080123125236.GA18876@aepfle.de>
 <20080123135513.GA14175@csn.ul.ie>
 <20080123155655.GB20156@csn.ul.ie>
 <20080123195220.GB3848@us.ibm.com>
 <84144f020801231302g2cafdda9kf7f916121dc56aa5@mail.gmail.com>
 <20080123213637.GE3848@us.ibm.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-linux-mm@kvack.org
To: Nishanth Aravamudan
Cc: Pekka Enberg, Mel Gorman, akpm@linux-foundation.org,
 linux-kernel@vger.kernel.org, linuxppc-dev@ozlabs.org,
 "Aneesh Kumar K.V", KAMEZAWA Hiroyuki, lee.schermerhorn@hp.com,
 Linux MM, Olaf Hering

On Wed, 23 Jan 2008, Nishanth Aravamudan wrote:

> Right, so it might have functioned before, but the correctness was
> wobbly at best... Certainly the memoryless patch series has tightened
> that up, but we missed these SLAB issues.
>
> I see that your patch fixed Olaf's machine, Pekka. Nice work on
> everyone's part tracking this stuff down.

Another important result is that I found that GFP_THISNODE is actually
required for proper SLAB operation and not only an optimization. Fallback
can lead to very bad results. I have two customer reported instances of
SLAB corruption here that can be explained now due to fallback to another
node. Foreign objects enter the per cpu queue. The wrong node lock is
taken during cache_flusharray(). Fields in the struct slab can become
corrupted. It typically hits the list field and the inuse field.
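
The failure mode described above can be illustrated with a small
user-space toy model. This is only a sketch under stated assumptions, not
kernel code: every name in it (toy_slab, toy_node, toy_alloc, toy_flush,
toy_run) is hypothetical, each node has a single slab, and a "lock" is
just a flag. It models only the claim in this mail: without a
THISNODE-style restriction, objects from another node land in the local
per cpu queue, and a flush that takes only the local node's lock then
updates a foreign slab's inuse count without the lock that is supposed to
protect it.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NODES 2
#define TOY_OBJS  4   /* objects per node in this toy model */

/* Hypothetical stand-in for struct slab: only the owner and inuse. */
struct toy_slab {
    int node;   /* node whose lock is supposed to protect this slab */
    int inuse;  /* objects currently handed out from this slab */
};

struct toy_node {
    int lock_held;         /* 1 while this node's lock is "taken" */
    int free;              /* objects still available on this node */
    struct toy_slab slab;  /* one slab per node keeps the model small */
};

/*
 * Allocate for a CPU sitting on node `local`. With thisnode != 0 only the
 * local node is tried; otherwise the search falls back to other nodes.
 * Returns the slab the object came from, or NULL on failure.
 */
static struct toy_slab *toy_alloc(struct toy_node *nodes, int local,
                                  int thisnode)
{
    for (int i = 0; i < TOY_NODES; i++) {
        int n = (local + i) % TOY_NODES;  /* local first, then fallback */

        if (nodes[n].free > 0) {
            nodes[n].free--;
            nodes[n].slab.inuse++;
            return &nodes[n].slab;
        }
        if (thisnode)
            break;  /* no fallback allowed */
    }
    return NULL;
}

/*
 * Flush a per cpu queue while holding only the local node's lock.
 * Returns how many inuse updates touched a slab whose own node lock
 * was not held, i.e. updates that could race in a real system.
 */
static int toy_flush(struct toy_node *nodes, int local,
                     struct toy_slab **queue, int count)
{
    int unprotected = 0;

    nodes[local].lock_held = 1;
    for (int i = 0; i < count; i++) {
        struct toy_slab *slab = queue[i];

        if (!nodes[slab->node].lock_held)
            unprotected++;  /* wrong lock for this slab */
        slab->inuse--;
    }
    nodes[local].lock_held = 0;
    return unprotected;
}

/* Three allocations from node 0, which has one free object, then a flush. */
int toy_run(int thisnode)
{
    struct toy_node nodes[TOY_NODES];
    struct toy_slab *queue[TOY_OBJS];
    int queued = 0;

    for (int n = 0; n < TOY_NODES; n++) {
        nodes[n].lock_held = 0;
        nodes[n].free = (n == 0) ? 1 : TOY_OBJS;  /* node 0 nearly empty */
        nodes[n].slab.node = n;
        nodes[n].slab.inuse = 0;
    }

    for (int i = 0; i < 3; i++) {
        struct toy_slab *slab = toy_alloc(nodes, 0, thisnode);

        if (slab != NULL)
            queue[queued++] = slab;  /* object goes to the per cpu queue */
    }
    assert(queued <= TOY_OBJS);

    return toy_flush(nodes, 0, queue, queued);
}
```

In this model toy_run(0) (fallback allowed) queues two objects from node 1
and flushes them under node 0's lock, while toy_run(1) (no fallback) never
queues a foreign object, so the flush touches only slabs covered by the
lock it holds. The real allocator handles an exhausted node differently;
the toy simply lets the allocation fail to keep the comparison short.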