From: Peter Zijlstra <peterz@infradead.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nick Piggin <npiggin@suse.de>, Hugh Dickins <hugh@veritas.com>,
Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@elte.hu>,
Andrew Morton <akpm@linux-foundation.org>,
L-K <linux-kernel@vger.kernel.org>, linux-mm <linux-mm@kvack.org>,
David Howells <dhowells@redhat.com>
Subject: Re: [PATCH] x86,mm: fix pte_free()
Date: Fri, 23 Jan 2009 18:34:28 +0100 [thread overview]
Message-ID: <1232732068.4850.0.camel@laptop> (raw)
In-Reply-To: <1232728669.4826.143.camel@laptop>
On Fri, 2009-01-23 at 17:37 +0100, Peter Zijlstra wrote:
> On -rt we were seeing spurious bad page states like:
>
> Bad page state in process 'firefox'
> page:c1bc2380 flags:0x40000000 mapping:c1bc2390 mapcount:0 count:0
> Trying to fix it up, but a reboot is needed
> Backtrace:
> Pid: 503, comm: firefox Not tainted 2.6.26.8-rt13 #3
> [<c043d0f3>] ? printk+0x14/0x19
> [<c0272d4e>] bad_page+0x4e/0x79
> [<c0273831>] free_hot_cold_page+0x5b/0x1d3
> [<c02739f6>] free_hot_page+0xf/0x11
> [<c0273a18>] __free_pages+0x20/0x2b
> [<c027d170>] __pte_alloc+0x87/0x91
> [<c027d25e>] handle_mm_fault+0xe4/0x733
> [<c043f680>] ? rt_mutex_down_read_trylock+0x57/0x63
> [<c043f680>] ? rt_mutex_down_read_trylock+0x57/0x63
> [<c0218875>] do_page_fault+0x36f/0x88a
>
> This is the case where a concurrent fault already installed the PTE
> and
> we get to free the newly allocated one.
>
> This is due to pgtable_page_ctor() doing the
> spin_lock_init(&page->ptl)
> which is overlaid with the {private, mapping} struct.
>
> union {
> struct {
> unsigned long private;
> struct address_space *mapping;
> };
> #if NR_CPUS >= CONFIG_SPLIT_PTLOCK_CPUS
> spinlock_t ptl;
> #endif
> struct kmem_cache *slab;
> struct page *first_page;
> };
>
> Normally the spinlock is small enough to not stomp on page->mapping,
> but
> PREEMPT_RT=y has huge 'spin'locks.
>
> But lockdep kernels should also be able to trigger this splat, as the
> lock tracking code grows the spinlock to cover page->mapping.
>
> The obvious fix is calling pgtable_page_dtor() like the regular pte
> free
> path __pte_free_tlb() does.
>
> It seems all architectures except x86 and nm10300 already do this, and
> nm10300 doesn't seem to use pgtable_page_ctor(), which suggests it
> doesn't do SMP or simply doesnt do MMU at all or something.
>
> Signed-off-by: Peter Zijlstra <a.p.zijlsta@chello.nl>
> CC: stable@kernel.org
Now one that's not obviously borken,..
---
arch/x86/include/asm/pgalloc.h | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/arch/x86/include/asm/pgalloc.h b/arch/x86/include/asm/pgalloc.h
index cb7c151..dd14c54 100644
--- a/arch/x86/include/asm/pgalloc.h
+++ b/arch/x86/include/asm/pgalloc.h
@@ -42,6 +42,7 @@ static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
static inline void pte_free(struct mm_struct *mm, struct page *pte)
{
+ pgtable_page_dtor(pte);
__free_page(pte);
}
WARNING: multiple messages have this Message-ID (diff)
From: Peter Zijlstra <peterz@infradead.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nick Piggin <npiggin@suse.de>, Hugh Dickins <hugh@veritas.com>,
Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@elte.hu>,
Andrew Morton <akpm@linux-foundation.org>,
L-K <linux-kernel@vger.kernel.org>, linux-mm <linux-mm@kvack.org>,
David Howells <dhowells@redhat.com>
Subject: Re: [PATCH] x86,mm: fix pte_free()
Date: Fri, 23 Jan 2009 18:34:28 +0100 [thread overview]
Message-ID: <1232732068.4850.0.camel@laptop> (raw)
In-Reply-To: <1232728669.4826.143.camel@laptop>
On Fri, 2009-01-23 at 17:37 +0100, Peter Zijlstra wrote:
> On -rt we were seeing spurious bad page states like:
>
> Bad page state in process 'firefox'
> page:c1bc2380 flags:0x40000000 mapping:c1bc2390 mapcount:0 count:0
> Trying to fix it up, but a reboot is needed
> Backtrace:
> Pid: 503, comm: firefox Not tainted 2.6.26.8-rt13 #3
> [<c043d0f3>] ? printk+0x14/0x19
> [<c0272d4e>] bad_page+0x4e/0x79
> [<c0273831>] free_hot_cold_page+0x5b/0x1d3
> [<c02739f6>] free_hot_page+0xf/0x11
> [<c0273a18>] __free_pages+0x20/0x2b
> [<c027d170>] __pte_alloc+0x87/0x91
> [<c027d25e>] handle_mm_fault+0xe4/0x733
> [<c043f680>] ? rt_mutex_down_read_trylock+0x57/0x63
> [<c043f680>] ? rt_mutex_down_read_trylock+0x57/0x63
> [<c0218875>] do_page_fault+0x36f/0x88a
>
> This is the case where a concurrent fault already installed the PTE
> and
> we get to free the newly allocated one.
>
> This is due to pgtable_page_ctor() doing the
> spin_lock_init(&page->ptl)
> which is overlaid with the {private, mapping} struct.
>
> union {
> struct {
> unsigned long private;
> struct address_space *mapping;
> };
> #if NR_CPUS >= CONFIG_SPLIT_PTLOCK_CPUS
> spinlock_t ptl;
> #endif
> struct kmem_cache *slab;
> struct page *first_page;
> };
>
> Normally the spinlock is small enough to not stomp on page->mapping,
> but
> PREEMPT_RT=y has huge 'spin'locks.
>
> But lockdep kernels should also be able to trigger this splat, as the
> lock tracking code grows the spinlock to cover page->mapping.
>
> The obvious fix is calling pgtable_page_dtor() like the regular pte
> free
> path __pte_free_tlb() does.
>
> It seems all architectures except x86 and nm10300 already do this, and
> nm10300 doesn't seem to use pgtable_page_ctor(), which suggests it
> doesn't do SMP or simply doesnt do MMU at all or something.
>
> Signed-off-by: Peter Zijlstra <a.p.zijlsta@chello.nl>
> CC: stable@kernel.org
Now one that's not obviously borken,..
---
arch/x86/include/asm/pgalloc.h | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/arch/x86/include/asm/pgalloc.h b/arch/x86/include/asm/pgalloc.h
index cb7c151..dd14c54 100644
--- a/arch/x86/include/asm/pgalloc.h
+++ b/arch/x86/include/asm/pgalloc.h
@@ -42,6 +42,7 @@ static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
static inline void pte_free(struct mm_struct *mm, struct page *pte)
{
+ pgtable_page_dtor(pte);
__free_page(pte);
}
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2009-01-23 17:35 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-01-23 16:37 [PATCH] x86,mm: fix pte_free() Peter Zijlstra
2009-01-23 16:37 ` Peter Zijlstra
2009-01-23 17:34 ` Ingo Molnar
2009-01-23 17:34 ` Ingo Molnar
2009-01-23 17:39 ` Peter Zijlstra
2009-01-23 17:39 ` Peter Zijlstra
2009-01-23 17:45 ` Ingo Molnar
2009-01-23 17:45 ` Ingo Molnar
2009-01-26 3:09 ` KOSAKI Motohiro
2009-01-26 3:09 ` KOSAKI Motohiro
2009-01-23 17:34 ` Peter Zijlstra [this message]
2009-01-23 17:34 ` Peter Zijlstra
2009-01-23 18:42 ` Hugh Dickins
2009-01-23 18:42 ` Hugh Dickins
2009-01-23 20:15 ` David Howells
2009-01-23 20:15 ` David Howells
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1232732068.4850.0.camel@laptop \
--to=peterz@infradead.org \
--cc=akpm@linux-foundation.org \
--cc=dhowells@redhat.com \
--cc=hugh@veritas.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mingo@elte.hu \
--cc=npiggin@suse.de \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.