From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Hugh Dickins <hughd@google.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>,
linuxppc-dev@lists.ozlabs.org, Nick Piggin <npiggin@kernel.dk>,
linux-kernel@vger.kernel.org
Subject: Re: PowerPC BUG: using smp_processor_id() in preemptible code
Date: Thu, 30 Dec 2010 21:45:10 +1100 [thread overview]
Message-ID: <1293705910.17779.60.camel@pasglop> (raw)
In-Reply-To: <alpine.LSU.2.00.1012291413360.22939@sister.anvils>
On Wed, 2010-12-29 at 14:54 -0800, Hugh Dickins wrote:
> With recent 2.6.37-rc, with CONFIG_PREEMPT=y CONFIG_DEBUG_PREEMPT=y
> on the PowerPC G5, I get spammed by BUG warnings each time I swapoff:
>
> BUG: using smp_processor_id() in preemptible [00000000] code: swapoff/3974
> caller is .hpte_need_flush+0x4c/0x2e8
> Call Trace:
> [c0000001b4a3f830] [c00000000000f3cc] .show_stack+0x6c/0x16c (unreliable)
> [c0000001b4a3f8e0] [c00000000023eda0] .debug_smp_processor_id+0xe4/0x11c
> [c0000001b4a3f970] [c00000000002f2f4] .hpte_need_flush+0x4c/0x2e8
> [c0000001b4a3fa30] [c0000000000e7ef8] .vunmap_pud_range+0x148/0x200
> [c0000001b4a3fb10] [c0000000000e8058] .vunmap_page_range+0xa8/0xd4
> [c0000001b4a3fbb0] [c0000000000e80a4] .free_unmap_vmap_area+0x20/0x38
> [c0000001b4a3fc40] [c0000000000e8138] .remove_vm_area+0x7c/0xb4
> [c0000001b4a3fcd0] [c0000000000e8308] .__vunmap+0x50/0x104
> [c0000001b4a3fd60] [c0000000000ef3fc] .SyS_swapoff+0x59c/0x6a8
> [c0000001b4a3fe30] [c0000000000075a8] syscall_exit+0x0/0x40
>
> I notice hpte_need_flush() itself acknowledges
> * Must be called from within some kind of spinlock/non-preempt region...
Yes, we assume that the PTE lock is always held when modifying page
tables...
> Though I didn't actually bisect, I believe this is since Jeremy's
> 64141da587241301ce8638cc945f8b67853156ec "vmalloc: eagerly clear ptes
> on vunmap", which moves a call to vunmap_page_range() from one place
> (which happened to be inside a spinlock) to another (where it's not).
>
> I guess my warnings would be easily silenced by moving that call to
> vunmap_page_range() down just inside the spinlock below it; but I'm
> dubious that that's the right fix - it looked as if there are other
> paths through vmalloc.c where vunmap_page_range() has been getting
> called without preemption disabled, long before Jeremy's change,
> just paths that I never happen to go down in my limited testing.
>
> For the moment I'm using the obvious patch below to keep it quiet;
> but I doubt that this is the right patch either. I'm hoping that
> ye who understand the importance of hpte_need_flush() will be best
> able to judge what to do. Or might there be other architectures
> expecting to be unpreemptible there?
Well, it looks like our kernel mappings tend to take some nasty
shortcuts with the PTE locking, which I suppose are legit but do break
some of my assumptions there. I need to have a closer look. Thanks for
the report !
Cheers,
Ben.
> Thanks,
> Hugh
>
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -37,11 +37,13 @@ static void vunmap_pte_range(pmd_t *pmd,
> {
> pte_t *pte;
>
> + preempt_disable(); /* Stop __vunmap() triggering smp_processor_id() in preemptible from hpte_need_flush() */
> pte = pte_offset_kernel(pmd, addr);
> do {
> pte_t ptent = ptep_get_and_clear(&init_mm, addr, pte);
> WARN_ON(!pte_none(ptent) && !pte_present(ptent));
> } while (pte++, addr += PAGE_SIZE, addr != end);
> + preempt_enable();
> }
>
> static void vunmap_pmd_range(pud_t *pud, unsigned long addr, unsigned long end)
WARNING: multiple messages have this Message-ID (diff)
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Hugh Dickins <hughd@google.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>,
Nick Piggin <npiggin@kernel.dk>,
linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org
Subject: Re: PowerPC BUG: using smp_processor_id() in preemptible code
Date: Thu, 30 Dec 2010 21:45:10 +1100 [thread overview]
Message-ID: <1293705910.17779.60.camel@pasglop> (raw)
In-Reply-To: <alpine.LSU.2.00.1012291413360.22939@sister.anvils>
On Wed, 2010-12-29 at 14:54 -0800, Hugh Dickins wrote:
> With recent 2.6.37-rc, with CONFIG_PREEMPT=y CONFIG_DEBUG_PREEMPT=y
> on the PowerPC G5, I get spammed by BUG warnings each time I swapoff:
>
> BUG: using smp_processor_id() in preemptible [00000000] code: swapoff/3974
> caller is .hpte_need_flush+0x4c/0x2e8
> Call Trace:
> [c0000001b4a3f830] [c00000000000f3cc] .show_stack+0x6c/0x16c (unreliable)
> [c0000001b4a3f8e0] [c00000000023eda0] .debug_smp_processor_id+0xe4/0x11c
> [c0000001b4a3f970] [c00000000002f2f4] .hpte_need_flush+0x4c/0x2e8
> [c0000001b4a3fa30] [c0000000000e7ef8] .vunmap_pud_range+0x148/0x200
> [c0000001b4a3fb10] [c0000000000e8058] .vunmap_page_range+0xa8/0xd4
> [c0000001b4a3fbb0] [c0000000000e80a4] .free_unmap_vmap_area+0x20/0x38
> [c0000001b4a3fc40] [c0000000000e8138] .remove_vm_area+0x7c/0xb4
> [c0000001b4a3fcd0] [c0000000000e8308] .__vunmap+0x50/0x104
> [c0000001b4a3fd60] [c0000000000ef3fc] .SyS_swapoff+0x59c/0x6a8
> [c0000001b4a3fe30] [c0000000000075a8] syscall_exit+0x0/0x40
>
> I notice hpte_need_flush() itself acknowledges
> * Must be called from within some kind of spinlock/non-preempt region...
Yes, we assume that the PTE lock is always held when modifying page
tables...
> Though I didn't actually bisect, I believe this is since Jeremy's
> 64141da587241301ce8638cc945f8b67853156ec "vmalloc: eagerly clear ptes
> on vunmap", which moves a call to vunmap_page_range() from one place
> (which happened to be inside a spinlock) to another (where it's not).
>
> I guess my warnings would be easily silenced by moving that call to
> vunmap_page_range() down just inside the spinlock below it; but I'm
> dubious that that's the right fix - it looked as if there are other
> paths through vmalloc.c where vunmap_page_range() has been getting
> called without preemption disabled, long before Jeremy's change,
> just paths that I never happen to go down in my limited testing.
>
> For the moment I'm using the obvious patch below to keep it quiet;
> but I doubt that this is the right patch either. I'm hoping that
> ye who understand the importance of hpte_need_flush() will be best
> able to judge what to do. Or might there be other architectures
> expecting to be unpreemptible there?
Well, it looks like our kernel mappings tend to take some nasty
shortcuts with the PTE locking, which I suppose are legit but do break
some of my assumptions there. I need to have a closer look. Thanks for
the report !
Cheers,
Ben.
> Thanks,
> Hugh
>
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -37,11 +37,13 @@ static void vunmap_pte_range(pmd_t *pmd,
> {
> pte_t *pte;
>
> + preempt_disable(); /* Stop __vunmap() triggering smp_processor_id() in preemptible from hpte_need_flush() */
> pte = pte_offset_kernel(pmd, addr);
> do {
> pte_t ptent = ptep_get_and_clear(&init_mm, addr, pte);
> WARN_ON(!pte_none(ptent) && !pte_present(ptent));
> } while (pte++, addr += PAGE_SIZE, addr != end);
> + preempt_enable();
> }
>
> static void vunmap_pmd_range(pud_t *pud, unsigned long addr, unsigned long end)
next prev parent reply other threads:[~2010-12-30 10:45 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-12-29 22:54 PowerPC BUG: using smp_processor_id() in preemptible code Hugh Dickins
2010-12-29 22:54 ` Hugh Dickins
2010-12-30 1:26 ` Jeremy Fitzhardinge
2010-12-30 1:26 ` Jeremy Fitzhardinge
2010-12-30 10:45 ` Benjamin Herrenschmidt [this message]
2010-12-30 10:45 ` Benjamin Herrenschmidt
2011-02-24 20:47 ` Hugh Dickins
2011-02-24 20:47 ` Hugh Dickins
2011-02-24 21:07 ` Peter Zijlstra
2011-02-24 21:07 ` Peter Zijlstra
2011-02-24 21:23 ` Benjamin Herrenschmidt
2011-02-24 21:23 ` Benjamin Herrenschmidt
2011-02-24 21:27 ` Peter Zijlstra
2011-02-24 21:27 ` Peter Zijlstra
2011-02-24 21:12 ` Benjamin Herrenschmidt
2011-02-24 21:12 ` Benjamin Herrenschmidt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1293705910.17779.60.camel@pasglop \
--to=benh@kernel.crashing.org \
--cc=hughd@google.com \
--cc=jeremy@goop.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=npiggin@kernel.dk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.