From: Peter Zijlstra <peterz@infradead.org>
To: Linus Torvalds <torvalds@linux-foundation.org>,
Nick Piggin <npiggin@suse.de>, Hugh Dickins <hugh@veritas.com>,
Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@elte.hu>,
Andrew Morton <akpm@linux-foundation.org>
Cc: L-K <linux-kernel@vger.kernel.org>, linux-mm <linux-mm@kvack.org>,
David Howells <dhowells@redhat.com>
Subject: [PATCH] x86,mm: fix pte_free()
Date: Fri, 23 Jan 2009 17:37:49 +0100 [thread overview]
Message-ID: <1232728669.4826.143.camel@laptop> (raw)
On -rt we were seeing spurious bad page states like:
Bad page state in process 'firefox'
page:c1bc2380 flags:0x40000000 mapping:c1bc2390 mapcount:0 count:0
Trying to fix it up, but a reboot is needed
Backtrace:
Pid: 503, comm: firefox Not tainted 2.6.26.8-rt13 #3
[<c043d0f3>] ? printk+0x14/0x19
[<c0272d4e>] bad_page+0x4e/0x79
[<c0273831>] free_hot_cold_page+0x5b/0x1d3
[<c02739f6>] free_hot_page+0xf/0x11
[<c0273a18>] __free_pages+0x20/0x2b
[<c027d170>] __pte_alloc+0x87/0x91
[<c027d25e>] handle_mm_fault+0xe4/0x733
[<c043f680>] ? rt_mutex_down_read_trylock+0x57/0x63
[<c043f680>] ? rt_mutex_down_read_trylock+0x57/0x63
[<c0218875>] do_page_fault+0x36f/0x88a
This is the case where a concurrent fault already installed the PTE and
we get to free the newly allocated one.
This is due to pgtable_page_ctor() doing the spin_lock_init(&page->ptl)
which is overlaid with the {private, mapping} struct.
union {
struct {
unsigned long private;
struct address_space *mapping;
};
#if NR_CPUS >= CONFIG_SPLIT_PTLOCK_CPUS
spinlock_t ptl;
#endif
struct kmem_cache *slab;
struct page *first_page;
};
Normally the spinlock is small enough to not stomp on page->mapping, but
PREEMPT_RT=y has huge 'spin'locks.
But lockdep kernels should also be able to trigger this splat, as the
lock tracking code grows the spinlock to cover page->mapping.
The obvious fix is calling pgtable_page_dtor() like the regular pte free
path __pte_free_tlb() does.
It seems all architectures except x86 and nm10300 already do this, and
nm10300 doesn't seem to use pgtable_page_ctor(), which suggests it
doesn't do SMP or simply doesnt do MMU at all or something.
Signed-off-by: Peter Zijlstra <a.p.zijlsta@chello.nl>
CC: stable@kernel.org
---
arch/x86/include/asm/pgalloc.h | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/arch/x86/include/asm/pgalloc.h b/arch/x86/include/asm/pgalloc.h
index cb7c151..b99023c 100644
--- a/arch/x86/include/asm/pgalloc.h
+++ b/arch/x86/include/asm/pgalloc.h
@@ -42,6 +42,7 @@ static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
static inline void pte_free(struct mm_struct *mm, struct page *pte)
{
+ pgtable_page_dtor();
__free_page(pte);
}
WARNING: multiple messages have this Message-ID (diff)
From: Peter Zijlstra <peterz@infradead.org>
To: Linus Torvalds <torvalds@linux-foundation.org>,
Nick Piggin <npiggin@suse.de>, Hugh Dickins <hugh@veritas.com>,
Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@elte.hu>,
Andrew Morton <akpm@linux-foundation.org>
Cc: L-K <linux-kernel@vger.kernel.org>, linux-mm <linux-mm@kvack.org>,
David Howells <dhowells@redhat.com>
Subject: [PATCH] x86,mm: fix pte_free()
Date: Fri, 23 Jan 2009 17:37:49 +0100 [thread overview]
Message-ID: <1232728669.4826.143.camel@laptop> (raw)
On -rt we were seeing spurious bad page states like:
Bad page state in process 'firefox'
page:c1bc2380 flags:0x40000000 mapping:c1bc2390 mapcount:0 count:0
Trying to fix it up, but a reboot is needed
Backtrace:
Pid: 503, comm: firefox Not tainted 2.6.26.8-rt13 #3
[<c043d0f3>] ? printk+0x14/0x19
[<c0272d4e>] bad_page+0x4e/0x79
[<c0273831>] free_hot_cold_page+0x5b/0x1d3
[<c02739f6>] free_hot_page+0xf/0x11
[<c0273a18>] __free_pages+0x20/0x2b
[<c027d170>] __pte_alloc+0x87/0x91
[<c027d25e>] handle_mm_fault+0xe4/0x733
[<c043f680>] ? rt_mutex_down_read_trylock+0x57/0x63
[<c043f680>] ? rt_mutex_down_read_trylock+0x57/0x63
[<c0218875>] do_page_fault+0x36f/0x88a
This is the case where a concurrent fault already installed the PTE and
we get to free the newly allocated one.
This is due to pgtable_page_ctor() doing the spin_lock_init(&page->ptl)
which is overlaid with the {private, mapping} struct.
union {
struct {
unsigned long private;
struct address_space *mapping;
};
#if NR_CPUS >= CONFIG_SPLIT_PTLOCK_CPUS
spinlock_t ptl;
#endif
struct kmem_cache *slab;
struct page *first_page;
};
Normally the spinlock is small enough to not stomp on page->mapping, but
PREEMPT_RT=y has huge 'spin'locks.
But lockdep kernels should also be able to trigger this splat, as the
lock tracking code grows the spinlock to cover page->mapping.
The obvious fix is calling pgtable_page_dtor() like the regular pte free
path __pte_free_tlb() does.
It seems all architectures except x86 and nm10300 already do this, and
nm10300 doesn't seem to use pgtable_page_ctor(), which suggests it
doesn't do SMP or simply doesnt do MMU at all or something.
Signed-off-by: Peter Zijlstra <a.p.zijlsta@chello.nl>
CC: stable@kernel.org
---
arch/x86/include/asm/pgalloc.h | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/arch/x86/include/asm/pgalloc.h b/arch/x86/include/asm/pgalloc.h
index cb7c151..b99023c 100644
--- a/arch/x86/include/asm/pgalloc.h
+++ b/arch/x86/include/asm/pgalloc.h
@@ -42,6 +42,7 @@ static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
static inline void pte_free(struct mm_struct *mm, struct page *pte)
{
+ pgtable_page_dtor();
__free_page(pte);
}
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next reply other threads:[~2009-01-23 16:38 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-01-23 16:37 Peter Zijlstra [this message]
2009-01-23 16:37 ` [PATCH] x86,mm: fix pte_free() Peter Zijlstra
2009-01-23 17:34 ` Ingo Molnar
2009-01-23 17:34 ` Ingo Molnar
2009-01-23 17:39 ` Peter Zijlstra
2009-01-23 17:39 ` Peter Zijlstra
2009-01-23 17:45 ` Ingo Molnar
2009-01-23 17:45 ` Ingo Molnar
2009-01-26 3:09 ` KOSAKI Motohiro
2009-01-26 3:09 ` KOSAKI Motohiro
2009-01-23 17:34 ` Peter Zijlstra
2009-01-23 17:34 ` Peter Zijlstra
2009-01-23 18:42 ` Hugh Dickins
2009-01-23 18:42 ` Hugh Dickins
2009-01-23 20:15 ` David Howells
2009-01-23 20:15 ` David Howells
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1232728669.4826.143.camel@laptop \
--to=peterz@infradead.org \
--cc=akpm@linux-foundation.org \
--cc=dhowells@redhat.com \
--cc=hugh@veritas.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mingo@elte.hu \
--cc=npiggin@suse.de \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.