From: Rik van Riel <riel@redhat.com>
To: Christoph Lameter <cl@linux.com>
Cc: Linux kernel Mailing List <linux-kernel@vger.kernel.org>,
	Kyle McMartin <kyle@redhat.com>
Subject: Re: SLUB BUG: check_slab called with interrupts enabled
Date: Wed, 15 Jun 2011 11:16:49 -0400
Message-ID: <4DF8CCE1.80400@redhat.com>
In-Reply-To: <alpine.DEB.2.00.1106150959500.768@router.home>

[-- Attachment #1: Type: text/plain, Size: 1192 bytes --]

On 06/15/2011 11:03 AM, Christoph Lameter wrote:
> On Wed, 15 Jun 2011, Rik van Riel wrote:
>
>> Hi Christoph,
>>
>> last night I got an interesting backtrace running 3.0-rc3
>> (Fedora Rawhide kernel package).  Unfortunately netconsole
>> seems to be incompatible with KVM at the moment, so I had
>> to capture the oops on my digital camera and will be
>> transcribing just the backtrace.
>>
>> Essentially, kernel 3.0-rc3 hit this bug:
>>
>> static int check_slab(struct kmem_cache *s, struct page *page)
>> {
>>          int maxobj;
>>
>>          VM_BUG_ON(!irqs_disabled());
>>
>> The call trace:
>>
>> check_slab
>> alloc_debug_processing
>> __slab_alloc
>> kmem_cache_alloc
>> bvec_alloc_bs
>> bio_alloc_bioset
>> bio_alloc
>> mpage_alloc
>> do_mpage_readpage
>> ... followed by ext4 and VFS code, obviously innocent
>
> __slab_alloc() disables interrupts so alloc_debug_processing() should not
> run into this issue.
>
> There are no additional special SLUB patches applied, right? Because some
> of the patches under discussion change the interrupt disable handling a
> bit.

Just the two attached ones, which don't seem to touch the
code path in question...
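
For reference, the invariant being discussed can be modelled in
userspace. This is only a sketch: the model_* names are hypothetical
stand-ins, a plain bool replaces the CPU interrupt state, and assert()
plays the role of VM_BUG_ON(). It is not the SLUB code itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the CPU interrupt-enable state. */
static bool model_irqs_enabled = true;

/* Analogue of irqs_disabled(). */
static bool model_irqs_disabled(void)
{
	return !model_irqs_enabled;
}

/* Analogue of local_irq_save(): remember the old state, then disable. */
static bool model_irq_save(void)
{
	bool was_enabled = model_irqs_enabled;

	model_irqs_enabled = false;
	return was_enabled;
}

/* Analogue of local_irq_restore(): put back the saved state. */
static void model_irq_restore(bool was_enabled)
{
	model_irqs_enabled = was_enabled;
}

/* Stand-in for check_slab(): aborts if called with interrupts enabled. */
static void model_check_slab(void)
{
	assert(model_irqs_disabled());
}

/* Stand-in for the slow path: disable, run the debug check, restore. */
static void model_slab_alloc(void)
{
	bool saved = model_irq_save();

	model_check_slab();
	model_irq_restore(saved);
}
```

In this model the check can only fire if something between the save
and the check turns interrupts back on, or if the check is reached by
a path that never did the save.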

-- 
All rights reversed

[-- Attachment #2: mm-slub-do-not-take-expensive-steps-for-slubs-speculative-high-order-allocations.patch --]
[-- Type: text/plain, Size: 3236 bytes --]

From linux-fsdevel-owner@vger.kernel.org Fri May 13 10:04:18 2011
From:	Mel Gorman <mgorman@suse.de>
To:	Andrew Morton <akpm@linux-foundation.org>
Cc:	James Bottomley <James.Bottomley@HansenPartnership.com>,
	Colin King <colin.king@canonical.com>,
	Raghavendra D Prabhu <raghu.prabhu13@gmail.com>,
	Jan Kara <jack@suse.cz>, Chris Mason <chris.mason@oracle.com>,
	Christoph Lameter <cl@linux.com>,
	Pekka Enberg <penberg@kernel.org>,
	Rik van Riel <riel@redhat.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	linux-ext4 <linux-ext4@vger.kernel.org>,
	Mel Gorman <mgorman@suse.de>
Subject: [PATCH 3/4] mm: slub: Do not take expensive steps for SLUBs speculative high-order allocations
Date:	Fri, 13 May 2011 15:03:23 +0100
Message-Id: <1305295404-12129-4-git-send-email-mgorman@suse.de>
X-Mailing-List:	linux-fsdevel@vger.kernel.org

To avoid locking and per-cpu overhead, SLUB optimistically uses
high-order allocations and falls back to lower-order allocations if they
fail. However, by simply trying to allocate, the caller can enter
compaction or reclaim - both of which are likely to cost more than the
benefit of using high-order pages in SLUB. On a desktop system, two
users report that the system is getting stalled with kswapd using large
amounts of CPU.

This patch prevents SLUB taking any expensive steps when trying to use
high-order allocations. Instead, it is expected to fall back to smaller
orders more aggressively. Testing was somewhat inconclusive on how much
this helped but it makes sense that falling back to order-0 allocations
is faster than entering compaction or direct reclaim.

Signed-off-by: Mel Gorman <mgorman@suse.de>
---
 mm/page_alloc.c |    3 ++-
 mm/slub.c       |    3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 9f8a97b..057f1e2 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1972,6 +1972,7 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
 {
 	int alloc_flags = ALLOC_WMARK_MIN | ALLOC_CPUSET;
 	const gfp_t wait = gfp_mask & __GFP_WAIT;
+	const gfp_t can_wake_kswapd = !(gfp_mask & __GFP_NO_KSWAPD);
 
 	/* __GFP_HIGH is assumed to be the same as ALLOC_HIGH to save a branch. */
 	BUILD_BUG_ON(__GFP_HIGH != (__force gfp_t) ALLOC_HIGH);
@@ -1984,7 +1985,7 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
 	 */
 	alloc_flags |= (__force int) (gfp_mask & __GFP_HIGH);
 
-	if (!wait) {
+	if (!wait && can_wake_kswapd) {
 		/*
 		 * Not worth trying to allocate harder for
 		 * __GFP_NOMEMALLOC even if it can't schedule.
diff --git a/mm/slub.c b/mm/slub.c
index 98c358d..c5797ab 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1170,7 +1170,8 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
 	 * Let the initial higher-order allocation fail under memory pressure
 	 * so we fall-back to the minimum order allocation.
 	 */
-	alloc_gfp = (flags | __GFP_NOWARN | __GFP_NORETRY | __GFP_NO_KSWAPD) & ~__GFP_NOFAIL;
+	alloc_gfp = (flags | __GFP_NOWARN | __GFP_NO_KSWAPD) &
+			~(__GFP_NOFAIL | __GFP_WAIT | __GFP_REPEAT);
 
 	page = alloc_slab_page(alloc_gfp, node, oo);
 	if (unlikely(!page)) {
-- 
1.7.3.4
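
The mm/page_alloc.c hunk above can be read together with the slub.c
change: once SLUB clears __GFP_WAIT on its speculative attempt, that
attempt would otherwise look like an ordinary allocation that cannot
sleep. The sketch below models only the branch condition shown in the
hunk. The DEMO_* bit values and function names are assumptions made
for illustration; the real flags are defined in include/linux/gfp.h.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t demo_gfp_t;

/* Hypothetical bit values, chosen only for this sketch. */
#define DEMO_GFP_WAIT      (1u << 0)
#define DEMO_GFP_NO_KSWAPD (1u << 5)

/* Branch condition before the patch: any allocation that cannot sleep. */
static bool no_wait_branch_before(demo_gfp_t gfp_mask)
{
	const demo_gfp_t wait = gfp_mask & DEMO_GFP_WAIT;

	return !wait;
}

/* Branch condition after the patch: the allocation cannot sleep and
 * is still allowed to wake kswapd. */
static bool no_wait_branch_after(demo_gfp_t gfp_mask)
{
	const demo_gfp_t wait = gfp_mask & DEMO_GFP_WAIT;
	const bool can_wake_kswapd = !(gfp_mask & DEMO_GFP_NO_KSWAPD);

	return !wait && can_wake_kswapd;
}
```

With these definitions a mask carrying only DEMO_GFP_NO_KSWAPD takes
the branch before the patch and skips it afterwards, while a mask with
neither bit set takes it in both cases.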

[-- Attachment #3: mm-slub-do-not-wake-kswapd-for-slubs-speculative-high-order-allocations.patch --]
[-- Type: text/plain, Size: 2243 bytes --]

From linux-fsdevel-owner@vger.kernel.org Fri May 13 10:04:00 2011
From:	Mel Gorman <mgorman@suse.de>
To:	Andrew Morton <akpm@linux-foundation.org>
Cc:	James Bottomley <James.Bottomley@HansenPartnership.com>,
	Colin King <colin.king@canonical.com>,
	Raghavendra D Prabhu <raghu.prabhu13@gmail.com>,
	Jan Kara <jack@suse.cz>, Chris Mason <chris.mason@oracle.com>,
	Christoph Lameter <cl@linux.com>,
	Pekka Enberg <penberg@kernel.org>,
	Rik van Riel <riel@redhat.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	linux-ext4 <linux-ext4@vger.kernel.org>,
	Mel Gorman <mgorman@suse.de>
Subject: [PATCH 2/4] mm: slub: Do not wake kswapd for SLUBs speculative high-order allocations
Date:	Fri, 13 May 2011 15:03:22 +0100
Message-Id: <1305295404-12129-3-git-send-email-mgorman@suse.de>
X-Mailing-List:	linux-fsdevel@vger.kernel.org

To avoid locking and per-cpu overhead, SLUB optimistically uses
high-order allocations and falls back to lower-order allocations if they
fail.  However, by simply trying to allocate, kswapd is woken up to
start reclaiming at that order. On a desktop system, two users report
that the system is getting locked up with kswapd using large amounts
of CPU.  Using SLAB instead of SLUB made this problem go away.

This patch prevents kswapd being woken up for high-order allocations.
Testing indicated that with this patch applied, the system was much
harder to hang and even when it did, it eventually recovered.

Signed-off-by: Mel Gorman <mgorman@suse.de>
---
 mm/slub.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 9d2e5e4..98c358d 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1170,7 +1170,7 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
 	 * Let the initial higher-order allocation fail under memory pressure
 	 * so we fall-back to the minimum order allocation.
 	 */
-	alloc_gfp = (flags | __GFP_NOWARN | __GFP_NORETRY) & ~__GFP_NOFAIL;
+	alloc_gfp = (flags | __GFP_NOWARN | __GFP_NORETRY | __GFP_NO_KSWAPD) & ~__GFP_NOFAIL;
 
 	page = alloc_slab_page(alloc_gfp, node, oo);
 	if (unlikely(!page)) {
-- 
1.7.3.4
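
Taken in series order (2/4, then 3/4), the two patches change the mask
used for the speculative high-order attempt in allocate_slab() in two
steps. The sketch below replays the three expressions from the diffs
in userspace. The DEMO_* bit values and the mask_* function names are
assumptions for illustration only; the real flag values differ.

```c
#include <stdint.h>

typedef uint32_t demo_gfp_t;

/* Hypothetical bit values, chosen only for this sketch. */
#define DEMO_GFP_WAIT      (1u << 0)
#define DEMO_GFP_REPEAT    (1u << 1)
#define DEMO_GFP_NOFAIL    (1u << 2)
#define DEMO_GFP_NORETRY   (1u << 3)
#define DEMO_GFP_NOWARN    (1u << 4)
#define DEMO_GFP_NO_KSWAPD (1u << 5)

/* Mask before either patch. */
static demo_gfp_t mask_original(demo_gfp_t flags)
{
	return (flags | DEMO_GFP_NOWARN | DEMO_GFP_NORETRY) &
			~DEMO_GFP_NOFAIL;
}

/* Mask after patch 2/4: additionally ask not to wake kswapd. */
static demo_gfp_t mask_after_2_of_4(demo_gfp_t flags)
{
	return (flags | DEMO_GFP_NOWARN | DEMO_GFP_NORETRY |
		DEMO_GFP_NO_KSWAPD) & ~DEMO_GFP_NOFAIL;
}

/* Mask after patch 3/4: NORETRY is no longer set, and the WAIT and
 * REPEAT bits are cleared along with NOFAIL. */
static demo_gfp_t mask_after_3_of_4(demo_gfp_t flags)
{
	return (flags | DEMO_GFP_NOWARN | DEMO_GFP_NO_KSWAPD) &
			~(DEMO_GFP_NOFAIL | DEMO_GFP_WAIT | DEMO_GFP_REPEAT);
}
```

For a caller passing WAIT, REPEAT and NOFAIL together, the final mask
in this model keeps only NOWARN and NO_KSWAPD.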
