From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754848AbbIHOGs (ORCPT <rfc822;w@1wt.eu>);
	Tue, 8 Sep 2015 10:06:48 -0400
Received: from mail-wi0-f181.google.com ([209.85.212.181]:34354 "EHLO
	mail-wi0-f181.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753551AbbIHOGp (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Tue, 8 Sep 2015 10:06:45 -0400
Subject: Re: Kernel 4.1.6 Panic due to slab corruption
To: Christoph Lameter <cl@linux.com>
References: <55ED4DAD.7080701@kyup.com> <55ED7186.7060503@kyup.com>
 <alpine.DEB.2.11.1509080857300.24606@east.gentwo.org>
Cc: "Linux-Kernel@Vger. Kernel. Org" <linux-kernel@vger.kernel.org>,
        Marian Marinov <mm@1h.com>,
        SiteGround Operations <operations@siteground.com>
From: Nikolay Borisov <kernel@kyup.com>
Message-ID: <55EEEB6F.2090302@kyup.com>
Date: Tue, 8 Sep 2015 17:06:39 +0300
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101
 Thunderbird/38.1.0
MIME-Version: 1.0
In-Reply-To: <alpine.DEB.2.11.1509080857300.24606@east.gentwo.org>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org


On 09/08/2015 04:58 PM, Christoph Lameter wrote:
> On Mon, 7 Sep 2015, Nikolay Borisov wrote:
> 
>> Did a bit more investigation and it turns out the
>> corruption is happening in slab_alloc_node, in the
>> 'else' branch when get_freepointer is being called:
> 
> Please reboot the system and specify
> 
> 	slub_debug
> 

Unfortunately I haven't found a way to reproduce it, so the only option
would be to do this on a live server. However, I believe the performance
impact there would be prohibitive :(. Alternatively, I could leave slab
merging on and enable debugging only for the kmalloc-32 slab cache. Do
you think this would provide enough information to help track down the
corruption when it happens, without a significant performance impact?
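
As a rough sketch of the per-cache variant, assuming the
"slub_debug=<Debug-Options>,<slab name>" form described in
Documentation/vm/slub.txt, the boot parameter would look something like:

	slub_debug=FZPU,kmalloc-32

The flag set here is only an illustration (F = sanity checks, Z = red
zoning, P = poisoning, U = alloc/free user tracking); a smaller set such
as "F" alone might be enough if the overhead turns out to be too high.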

> on the kernel command line. This will enable additional diagnostics which
> will allow tracking down the issue to the subsystem causing it.
> 
> 
> Or rebuild with
> 
> CONFIG_SLUB_DEBUG_ON
>