public inbox for linux-kernel@vger.kernel.org
* SLUB BUG: check_slab called with interrupts enabled
@ 2011-06-15 14:56 Rik van Riel
  2011-06-15 15:03 ` Christoph Lameter
  0 siblings, 1 reply; 8+ messages in thread
From: Rik van Riel @ 2011-06-15 14:56 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Linux kernel Mailing List, Kyle McMartin

Hi Christoph,

last night I got an interesting backtrace running 3.0-rc3
(Fedora Rawhide kernel package).  Unfortunately netconsole
seems to be incompatible with KVM at the moment, so I had
to capture the oops on my digital camera and will be
transcribing just the backtrace.

Essentially, kernel 3.0-rc3 hit this bug:

static int check_slab(struct kmem_cache *s, struct page *page)
{
         int maxobj;

         VM_BUG_ON(!irqs_disabled());

The call trace:

check_slab
alloc_debug_processing
__slab_alloc
kmem_cache_alloc
bvec_alloc_bs
bio_alloc_bioset
bio_alloc
mpage_alloc
do_mpage_readpage
... followed by ext4 and VFS code, obviously innocent


Is this a known issue, Christoph?

If not, anything I can do to help debug/fix this?

-- 
All rights reversed

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: SLUB BUG: check_slab called with interrupts enabled
  2011-06-15 14:56 SLUB BUG: check_slab called with interrupts enabled Rik van Riel
@ 2011-06-15 15:03 ` Christoph Lameter
  2011-06-15 15:16   ` Rik van Riel
  0 siblings, 1 reply; 8+ messages in thread
From: Christoph Lameter @ 2011-06-15 15:03 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Linux kernel Mailing List, Kyle McMartin

On Wed, 15 Jun 2011, Rik van Riel wrote:

> Hi Christoph,
>
> last night I got an interesting backtrace running 3.0-rc3
> (Fedora Rawhide kernel package).  Unfortunately netconsole
> seems to be incompatible with KVM at the moment, so I had
> to capture the oops on my digital camera and will be
> transcribing just the backtrace.
>
> Essentially, kernel 3.0-rc3 hit this bug:
>
> static int check_slab(struct kmem_cache *s, struct page *page)
> {
>         int maxobj;
>
>         VM_BUG_ON(!irqs_disabled());
>
> The call trace:
>
> check_slab
> alloc_debug_processing
> __slab_alloc
> kmem_cache_alloc
> bvec_alloc_bs
> bio_alloc_bioset
> bio_alloc
> mpage_alloc
> do_mpage_readpage
> ... followed by ext4 and VFS code, obviously innocent

__slab_alloc() disables interrupts so alloc_debug_processing() should not
run into this issue.

There are no additional special slub patches applied right? Because some
of the patches under discussion change the interrupt disable handling a
bit.



* Re: SLUB BUG: check_slab called with interrupts enabled
  2011-06-15 15:03 ` Christoph Lameter
@ 2011-06-15 15:16   ` Rik van Riel
  2011-06-15 15:45     ` Christoph Lameter
  0 siblings, 1 reply; 8+ messages in thread
From: Rik van Riel @ 2011-06-15 15:16 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Linux kernel Mailing List, Kyle McMartin

[-- Attachment #1: Type: text/plain, Size: 1192 bytes --]

On 06/15/2011 11:03 AM, Christoph Lameter wrote:
> On Wed, 15 Jun 2011, Rik van Riel wrote:
>
>> Hi Christoph,
>>
>> last night I got an interesting backtrace running 3.0-rc3
>> (Fedora Rawhide kernel package).  Unfortunately netconsole
>> seems to be incompatible with KVM at the moment, so I had
>> to capture the oops on my digital camera and will be
>> transcribing just the backtrace.
>>
>> Essentially, kernel 3.0-rc3 hit this bug:
>>
>> static int check_slab(struct kmem_cache *s, struct page *page)
>> {
>>          int maxobj;
>>
>>          VM_BUG_ON(!irqs_disabled());
>>
>> The call trace:
>>
>> check_slab
>> alloc_debug_processing
>> __slab_alloc
>> kmem_cache_alloc
>> bvec_alloc_bs
>> bio_alloc_bioset
>> bio_alloc
>> mpage_alloc
>> do_mpage_readpage
>> ... followed by ext4 and VFS code, obviously innocent
>
> __slab_alloc() disables interrupts so alloc_debug_processing() should not
> run into this issue.
>
> There are no additional special slub patches applied right? Because some
> of the patches under discussion change the interrupt disable handling a
> bit.

Just the two attached ones, which don't seem to touch the
code path in question...

-- 
All rights reversed

[-- Attachment #2: mm-slub-do-not-take-expensive-steps-for-slubs-speculative-high-order-allocations.patch --]
[-- Type: text/plain, Size: 3236 bytes --]

From linux-fsdevel-owner@vger.kernel.org Fri May 13 10:04:18 2011
From:	Mel Gorman <mgorman@suse.de>
To:	Andrew Morton <akpm@linux-foundation.org>
Cc:	James Bottomley <James.Bottomley@HansenPartnership.com>,
	Colin King <colin.king@canonical.com>,
	Raghavendra D Prabhu <raghu.prabhu13@gmail.com>,
	Jan Kara <jack@suse.cz>, Chris Mason <chris.mason@oracle.com>,
	Christoph Lameter <cl@linux.com>,
	Pekka Enberg <penberg@kernel.org>,
	Rik van Riel <riel@redhat.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	linux-ext4 <linux-ext4@vger.kernel.org>,
	Mel Gorman <mgorman@suse.de>
Subject: [PATCH 3/4] mm: slub: Do not take expensive steps for SLUBs speculative high-order allocations
Date:	Fri, 13 May 2011 15:03:23 +0100
Message-Id: <1305295404-12129-4-git-send-email-mgorman@suse.de>
X-Mailing-List:	linux-fsdevel@vger.kernel.org

To avoid locking and per-cpu overhead, SLUB optimistically uses
high-order allocations and falls back to lower allocations if they
fail. However, by simply trying to allocate, the caller can enter
compaction or reclaim - both of which are likely to cost more than the
benefit of using high-order pages in SLUB. On a desktop system, two
users report that the system is getting stalled with kswapd using large
amounts of CPU.

This patch prevents SLUB taking any expensive steps when trying to use
high-order allocations. Instead, it is expected to fall back to smaller
orders more aggressively. Testing was somewhat inconclusive on how much
this helped but it makes sense that falling back to order-0 allocations
is faster than entering compaction or direct reclaim.

Signed-off-by: Mel Gorman <mgorman@suse.de>
---
 mm/page_alloc.c |    3 ++-
 mm/slub.c       |    3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 9f8a97b..057f1e2 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1972,6 +1972,7 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
 {
 	int alloc_flags = ALLOC_WMARK_MIN | ALLOC_CPUSET;
 	const gfp_t wait = gfp_mask & __GFP_WAIT;
+	const gfp_t can_wake_kswapd = !(gfp_mask & __GFP_NO_KSWAPD);
 
 	/* __GFP_HIGH is assumed to be the same as ALLOC_HIGH to save a branch. */
 	BUILD_BUG_ON(__GFP_HIGH != (__force gfp_t) ALLOC_HIGH);
@@ -1984,7 +1985,7 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
 	 */
 	alloc_flags |= (__force int) (gfp_mask & __GFP_HIGH);
 
-	if (!wait) {
+	if (!wait && can_wake_kswapd) {
 		/*
 		 * Not worth trying to allocate harder for
 		 * __GFP_NOMEMALLOC even if it can't schedule.
diff --git a/mm/slub.c b/mm/slub.c
index 98c358d..c5797ab 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1170,7 +1170,8 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
 	 * Let the initial higher-order allocation fail under memory pressure
 	 * so we fall-back to the minimum order allocation.
 	 */
-	alloc_gfp = (flags | __GFP_NOWARN | __GFP_NORETRY | __GFP_NO_KSWAPD) & ~__GFP_NOFAIL;
+	alloc_gfp = (flags | __GFP_NOWARN | __GFP_NO_KSWAPD) &
+			~(__GFP_NOFAIL | __GFP_WAIT | __GFP_REPEAT);
 
 	page = alloc_slab_page(alloc_gfp, node, oo);
 	if (unlikely(!page)) {
-- 
1.7.3.4

[-- Attachment #3: mm-slub-do-not-wake-kswapd-for-slubs-speculative-high-order-allocations.patch --]
[-- Type: text/plain, Size: 2243 bytes --]

From linux-fsdevel-owner@vger.kernel.org Fri May 13 10:04:00 2011
From:	Mel Gorman <mgorman@suse.de>
To:	Andrew Morton <akpm@linux-foundation.org>
Cc:	James Bottomley <James.Bottomley@HansenPartnership.com>,
	Colin King <colin.king@canonical.com>,
	Raghavendra D Prabhu <raghu.prabhu13@gmail.com>,
	Jan Kara <jack@suse.cz>, Chris Mason <chris.mason@oracle.com>,
	Christoph Lameter <cl@linux.com>,
	Pekka Enberg <penberg@kernel.org>,
	Rik van Riel <riel@redhat.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	linux-ext4 <linux-ext4@vger.kernel.org>,
	Mel Gorman <mgorman@suse.de>
Subject: [PATCH 2/4] mm: slub: Do not wake kswapd for SLUBs speculative high-order allocations
Date:	Fri, 13 May 2011 15:03:22 +0100
Message-Id: <1305295404-12129-3-git-send-email-mgorman@suse.de>
X-Mailing-List:	linux-fsdevel@vger.kernel.org

To avoid locking and per-cpu overhead, SLUB optimistically uses
high-order allocations and falls back to lower allocations if they
fail.  However, by simply trying to allocate, kswapd is woken up to
start reclaiming at that order. On a desktop system, two users report
that the system is getting locked up with kswapd using large amounts
of CPU.  Using SLAB instead of SLUB made this problem go away.

This patch prevents kswapd being woken up for high-order allocations.
Testing indicated that with this patch applied, the system was much
harder to hang and even when it did, it eventually recovered.

Signed-off-by: Mel Gorman <mgorman@suse.de>
---
 mm/slub.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 9d2e5e4..98c358d 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1170,7 +1170,7 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
 	 * Let the initial higher-order allocation fail under memory pressure
 	 * so we fall-back to the minimum order allocation.
 	 */
-	alloc_gfp = (flags | __GFP_NOWARN | __GFP_NORETRY) & ~__GFP_NOFAIL;
+	alloc_gfp = (flags | __GFP_NOWARN | __GFP_NORETRY | __GFP_NO_KSWAPD) & ~__GFP_NOFAIL;
 
 	page = alloc_slab_page(alloc_gfp, node, oo);
 	if (unlikely(!page)) {
-- 
1.7.3.4


* Re: SLUB BUG: check_slab called with interrupts enabled
  2011-06-15 15:16   ` Rik van Riel
@ 2011-06-15 15:45     ` Christoph Lameter
  2011-06-15 16:24       ` Rik van Riel
  2011-06-16 12:34       ` Rik van Riel
  0 siblings, 2 replies; 8+ messages in thread
From: Christoph Lameter @ 2011-06-15 15:45 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Linux kernel Mailing List, Kyle McMartin

[-- Attachment #1: Type: TEXT/PLAIN, Size: 642 bytes --]

On Wed, 15 Jun 2011, Rik van Riel wrote:

> > There are no additional special slub patches applied right? Because some
> > of the patches under discussion change the interrupt disable handling a
> > bit.
>
> Just the two attached ones, which don't seem to touch the
> code path in question...

I also do not see how these could break something. But they are mucking
around with the __GFP_WAIT flag. __GFP_WAIT determines the re-enabling and
re-disabling of interrupts in __slab_alloc(). If some variable gets
corrupted then this could be the result.

Print out the value of gfpflags before and after the call to new_slab()
from __slab_alloc()?



* Re: SLUB BUG: check_slab called with interrupts enabled
  2011-06-15 15:45     ` Christoph Lameter
@ 2011-06-15 16:24       ` Rik van Riel
  2011-06-16 12:34       ` Rik van Riel
  1 sibling, 0 replies; 8+ messages in thread
From: Rik van Riel @ 2011-06-15 16:24 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Linux kernel Mailing List, Kyle McMartin

On 06/15/2011 11:45 AM, Christoph Lameter wrote:
> On Wed, 15 Jun 2011, Rik van Riel wrote:
>
>> > There are no additional special slub patches applied right? Because some
>> > of the patches under discussion change the interrupt disable handling a
>> > bit.
>>
>> Just the two attached ones, which don't seem to touch the
>> code path in question...
>
> I also do not see how these could break something. But they are mucking
> around with the __GFP_WAIT flag. __GFP_WAIT determines the re-enabling and
> re-disabling of interrupts in __slab_alloc(). If some variable gets
> corrupted then this could be the result.
>
> Print out the value of gfpflags before and after the call to new_slab()
> from __slab_alloc()?

Since I'm building a new kernel anyway (with Neil Horman's
patch to make netconsole work with tun/tap), I have commented
out these two slub patches.

Let's see what happens...

-- 
All rights reversed


* Re: SLUB BUG: check_slab called with interrupts enabled
  2011-06-15 15:45     ` Christoph Lameter
  2011-06-15 16:24       ` Rik van Riel
@ 2011-06-16 12:34       ` Rik van Riel
  2011-06-16 15:57         ` Christoph Lameter
  1 sibling, 1 reply; 8+ messages in thread
From: Rik van Riel @ 2011-06-16 12:34 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Linux kernel Mailing List, Kyle McMartin

On 06/15/2011 11:45 AM, Christoph Lameter wrote:
> On Wed, 15 Jun 2011, Rik van Riel wrote:
>
>> > There are no additional special slub patches applied right? Because some
>> > of the patches under discussion change the interrupt disable handling a
>> > bit.
>>
>> Just the two attached ones, which don't seem to touch the
>> code path in question...
>
> I also do not see how these could break something.

After backing them out, I got 18 hours of uptime so far.

This could just be dumb luck, or it could be some slub vs. kswapd
interaction.  Or maybe I simply am not triggering the original
bug any more because kswapd is now doing all the work, but may
still be able to trigger it under more memory pressure...

Either way, since this could still be dumb luck, I'll let you
guys know if/when I see a next crash :)

-- 
All rights reversed


* Re: SLUB BUG: check_slab called with interrupts enabled
  2011-06-16 12:34       ` Rik van Riel
@ 2011-06-16 15:57         ` Christoph Lameter
  2011-06-17  0:22           ` Rik van Riel
  0 siblings, 1 reply; 8+ messages in thread
From: Christoph Lameter @ 2011-06-16 15:57 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Linux kernel Mailing List, Kyle McMartin

On Thu, 16 Jun 2011, Rik van Riel wrote:

> After backing them out, I got 18 hours of uptime so far.
>
> This could just be dumb luck, or it could be some slub vs. kswapd
> interaction.  Or maybe I simply am not triggering the original
> bug any more because kswapd is now doing all the work, but may
> still be able to trigger it under more memory pressure...

Could be some memory issue or stack corruption. This is a machine with ECC
RAM, right?

> Either way, since this could still be dumb luck, I'll let you
> guys know if/when I see a next crash :)

OK.



* Re: SLUB BUG: check_slab called with interrupts enabled
  2011-06-16 15:57         ` Christoph Lameter
@ 2011-06-17  0:22           ` Rik van Riel
  0 siblings, 0 replies; 8+ messages in thread
From: Rik van Riel @ 2011-06-17  0:22 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Linux kernel Mailing List, Kyle McMartin

On 06/16/2011 11:57 AM, Christoph Lameter wrote:
> On Thu, 16 Jun 2011, Rik van Riel wrote:
>
>> After backing them out, I got 18 hours of uptime so far.
>>
>> This could just be dumb luck, or it could be some slub vs. kswapd
>> interaction.  Or maybe I simply am not triggering the original
>> bug any more because kswapd is now doing all the work, but may
>> still be able to trigger it under more memory pressure...
>
> Could be some memory issue or stack corruption. This is a machine with ECC
> ram right?

Yes it is, 12GB of ECC memory.

Running that much without ECC is probably a bad idea :)

>> Either way, since this could still be dumb luck, I'll let you
>> guys know if/when I see a next crash :)
>
> OK.

30 hours uptime already.

Just a little beyond the "dumb luck" threshold.

However, I suspect the bug may still be there and
the fact that kswapd is doing more work may simply
be hiding it.

-- 
All rights reversed


end of thread, other threads:[~2011-06-17  0:22 UTC | newest]

Thread overview: 8+ messages
2011-06-15 14:56 SLUB BUG: check_slab called with interrupts enabled Rik van Riel
2011-06-15 15:03 ` Christoph Lameter
2011-06-15 15:16   ` Rik van Riel
2011-06-15 15:45     ` Christoph Lameter
2011-06-15 16:24       ` Rik van Riel
2011-06-16 12:34       ` Rik van Riel
2011-06-16 15:57         ` Christoph Lameter
2011-06-17  0:22           ` Rik van Riel
